1、Andy Witkowski, Architect Thomas Kyte, VP Oracle Corporation,Oracle Database 10g SQL Model Clause,40166,Whats now in SQL for Modeling,Aggregation Enhancements Cube, Rollup, Grouping Sets New aggregates: Inverse Distribution, FIRST/LAST,etc Analytic Functions Window Functions: Rank, Moving, Cumulativ
2、e Statistical Functions: Correlation, Linear Regression,etc Old tools still have more modeling power than SQL Spreadsheets, MOLAP engines,SQL Model enhances SQL with modeling power,Case Study Modeling with Excel,Excel fits well at the personal scale UI and Formatting Calculations (build-in functions
3、, formulas) What-If analysis Excel fits poorly at corporate scale for modeling Cryptic row-column addressing No metadata, No standards, No mathematical model 100s of spreadsheets and consolidation by hand Does not scale (1000s formulas, TB of data) Perpetual data exchange: databases-Excel,Replace Ex
4、cel Modeling with SQL Modeling,Modeling with SQL Model,Language: Spreadsheet-like calculations in SQL Inter-row calculation. Treats relations as an N-Dim array Symbolic references to cells and their ranges Multiple Formulas over N-Dim arrays Automatic Formula Ordering Recursive Model Solving Model i
5、s a relation & can be processed further in SQL Multiple arrays with different dimensionality in one query Performance Parallel Processing in partitioning & formulas Multiple-self joins with one data access structure Multiple UNIONs with one data access structure Why Better? Automatic Consolidation (
6、models as views combine using SQL) Self Adjusting (as database changes no need to re-define) One version of truth (calc directly over data base, no exchange),SQL Model Concepts,Define Relation as Array,SELECT prod, time, s FROM sales,Relation,Array,1999,2000,2001,vcr,dvd,tv,pc,vcr 2001 9 dvd 2001 0,
7、prod,time,5 6 7 8,1 2 3 4,9 0 1 2,DIMENSION BY (prod, time) MEASURES (s),prod time s,Relation,DIMENSION BY (prod, time) MEASURES (s),Array,1999,2000,2001,vcr,dvd,tv,pc,prod,time,5 6 7 8,1 2 3 4,9 0 1 2,RULES UPSERT (sANY, 2000 = sCV(prod), CV(time) - 1 * 2,svcr, 2002 = svcr, 2001 + svcr, 2000,sdvd,
8、2002 =AVG(s) CV(prod), time2001 ),Sales in 2000 2x of previous year,Predict vcr sales in 2002,Predict dvd sales in 2002,vcr 2001 9 dvd 2001 0,prod time s,SELECT prod, time, s FROM sales,Define Business Rules,Relation,DIMENSION BY (prod, time) MEASURES (s),Array,1999,2000,2001,vcr,dvd,tv,pc,1 2 3 4,9
9、 0 1 2,RULES UPSERT (sANY, 2000 = sCV(prod), CV(time) - 1 * 2,svcr, 2002 = svcr, 2001 + svcr, 2000,sdvd, 2002 = AVG(s) CV(prod), time2001 ),Sales in 2000 2x of previous year,Predict vcr sales in 2002,Predict dvd sales in 2002,vcr 2001 9 dvd 2001 0,prod time s,SELECT prod, time, s FROM sales,2 4 6 8,
10、Evaluate Formulas 1st,Relation,DIMENSION BY (prod, time) MEASURES (s),1999,2000,2001,vcr,dvd,tv,pc,2002,2 4 6 8,1 2 3 4,9 0 1 2,11,vcr 2001 9 dvd 2001 0,prod time s,SELECT prod, time, s FROM sales,Sales in 2000 2x of previous year,Predict vcr sales in 2002,Predict dvd sales in 2002,Evaluate Formulas
11、 2nd,RULES UPSERT (sANY, 2000 = sCV(prod), CV(time) - 1 * 2,svcr, 2002 = svcr, 2001 + svcr, 2000,sdvd, 2002 = AVG(s) CV(prod), time2001 ),Relation,DIMENSION BY (prod, time) MEASURES (s),1999,2000,2001,vcr,dvd,tv,pc,2002,2 4 6 8,1 2 3 4,9 0 1 2,11 3,vcr 2001 9 dvd 2001 0,prod time s,SELECT prod, time
12、, s FROM sales,Sales in 2000 2x of previous year,Predict vcr sales in 2002,Predict dvd sales in 2002,RULES UPSERT (sANY, 2000 = sCV(prod), CV(time) - 1 * 2,svcr, 2002 = svcr, 2001 + svcr, 2000,sdvd, 2002 = AVG(s) CV(prod), time2001 ),Evaluate Formulas 3rd,Relation,DIMENSION BY (prod, time) MEASURES
13、(s),1999,2000,2001,vcr,dvd,tv,pc,2002,2 4 6 8,1 2 3 4,9 0 1 2,11 3,Relation again,vcr 2002 11 dvd 2002 3,vcr 2001 9 dvd 2001 0,Self-join.,join + UNION,join + UNION,vcr 2001 9 dvd 2001 0,prod time s,SELECT prod, time, s FROM sales,RULES UPSERT (sANY, 2000 = sCV(prod), CV(time) - 1 * 2,svcr, 2002 = sv
14、cr, 2001 + svcr, 2000,sdvd, 2002 = AVG(s) CV(prod), time2001 ),Return as Relation,Rows updated & inserted by the Model clause,Model Clause Components,Model clause,Key Concepts (1),New SQL Model Clause: Data as N-dim arrays with DIMENSIONS & MEASURES Data can be PARTITION-ed - creates an array per pa
15、rtition Formulas defined over the arrays express a (business) model Formulas within a Model: Use symbolic addressing using familiar array notation Can be ordered automatically based on dependency between cells Can be recursive with a convergence condition recursive models Can UPDATE or UPSERT cells
16、Support most SQL functions including aggregates,Key Concepts (2),Result of a SQL Model is a relation Can participate further in processing via joins, etc. Can define views containing Model computations SQL Model is the last query clause Executed after joins, aggregation, window functions Before ORDE
17、R BY Main Model and Reference Models Can relate models of different dimensionality,Formula Fundamentals (1),Formulas: SQL expressions over cells with aggs, functions, etc. Formula has a left and right side and represents assignment svcr, 2002 = svcr, 2001 + svcr, 2000 single ref svcr, 2002 = AVG(s)v
18、cr, t2002 multi ref on rightLeft side can qualify multiple cells sp IN (vcr,dvd), t2002 = 1000 multi ref on left sANY, t=2002 = 2 * sCV(p), CV(t)-1 left-right correlation sp IN (SELECT prod FROM prod_tb), 2000 = 1000 Formula can operate in update or upsert mode update svcr, 2002 = svcr, 2001 + svcr,
19、 2000 upsert svcr, 2002 = svcr, 2001 + svcr, 2000,Formula Fundamentals (2),Function CV(dimension) propagates values from left to the right side. In example, products in 2002 are sum of two previous years. sANY, 2002 = sCV(p), CV(t)-1 +sCV(p), CV(t) 2,Formula Fundamentals (2),Function CV(dimension) p
20、ropagates values from left to the right side. In example, products in 2002 are sum of two previous years. sANY, 2002 = sCV(p), CV(t) -1 + sCV(p), CV(t) 2Formula result can depend on processing order. Can specify order in each formula. E.g., shift by time: svcr, ANY ORDER BY t = svcr, CV(t) - 1,Formu
21、la Fundamentals (2),Function CV(dimension) propagates values from left to the right side. In example, products in 2002 are sum of two previous years. sANY, 2002 = sCV(p), CV(t) -1 + sCV(p), CV(t) 2Formula result can depend on processing order. Can specify order in each formula. E.g., shift by time:
22、svcr, ANY ORDER BY t = svcr, CV(t) - 1,vcr 2001 300.00 0 vcr 2002 350.00 300.00 vcr 2003 400.00 vcr 2004 450.00 vcr 2005 500.00,ORDER BY t,Formula Fundamentals (2),Function CV(dimension) propagates values from left to the right side. E.g, products in 2002 are sum of two previous years sANY, 2002 = s
23、CV(p), CV(t) -1 + sCV(p), CV(t) 2Formula result can depend on processing order. Can specify order in each formula. E.g., shift by time: svcr, ANY ORDER BY t = svcr, CV(t) - 1,vcr 2001 300.00 0 vcr 2002 350.00 300.00 vcr 2003 400.00 350.00 vcr 2004 450.00 vcr 2005 500.00,ORDER BY t,Formula Fundamenta
24、ls (2),Function CV(dimension) propagates values from left to the right side. E.g, products in 2002 are sum of two previous years sANY, 2002 = sCV(p), CV(t) -1 + sCV(p), CV(t) 2Formula result can depend on processing order. Can specify order in each formula. E.g., shift by time: svcr, ANY ORDER BY t
25、= svcr, CV(t) - 1,vcr 2001 300.00 0 vcr 2002 350.00 300.00 vcr 2003 400.00 350.00 vcr 2004 450.00 400.00 vcr 2005 500.00 450.00,ORDER BY t,Model Options Fundamentals,rule options,global options,NAV Options: Handling Sparse Data,West dvd 2001 300.00 West tv 2002 500.00 West vcr 2001 200.00 West vcr 2
26、002 400.00,West dvd 2001 300.00 West tv 2002 500.00 West dvd 2003 - West tv 2003 500.00 West vcr 2001 200.00 West vcr 2002 400.00,?,2001,keep nav,NAV Options: Handling Sparse Data,West dvd 2001 300.00 West tv 2002 500.00 West vcr 2001 200.00 West vcr 2002 400.00,West dvd 2001 300.00 West tv 2002 500
27、.00 West dvd 2003 300.00 West tv 2003 500.00 West vcr 2001 200.00 West vcr 2002 400.00,assume 0,ignore nav,NAV Options: Handling Sparse Data,West dvd 2001 300.00 West tv 2002 500.00 West vcr 2001 200.00 West vcr 2002 400.00,West dvd 2001 300.00 West tv 2002 500.00 West dvd 2003 - West tv 2003 500.00
28、 West vcr 2001 200.00 West vcr 2002 400.00,West dvd 2001 300.00 West tv 2002 500.00 West dvd 2003 300.00 West tv 2003 500.00 West vcr 2001 200.00 West vcr 2002 400.00,?,2001,assume 0,ignore nav,keep nav,Automatic Formula Ordering,Automatic Formula Ordering,Automatic Formula Ordering,UPDATE, UPSERT &
29、 Partitions,Region Product Time s East dvd 2001 100 East dvd 2002 150 East vcr 2002 100 West dvd 2001 200,UPDATE, UPSERT & Partitions,Region Product Time Old s New s East dvd 2001 100 100 East dvd 2002 150 100 East vcr 2002 100 120 West dvd 2001 200 200,updated,Region Product Time s East dvd 2001 10
30、0 East dvd 2002 150 East vcr 2002 100 West dvd 2001 200,UPDATE, UPSERT & Partitions,Region Product Time Old s New s East dvd 2001 100 100 East dvd 2002 150 100 East vcr 2002 100 120 East dvd 2003 - 250 West dvd 2001 200 200 West dvd 2003 - 200,updated,upserted,Region Product Time s East dvd 2001 100
31、 East dvd 2002 150 East vcr 2002 100 West dvd 2001 200,Different dimensions: Reference,c p t s USA dvd 2001 300.00 $ USA tv 2001 500.00 $ Poland vcr 2001 200.00 zl France vcr 2001 100.00 fr,c ratio USA 1 Poland 0.24 France 0.12,Sales Table,Conv table converts currency to $,Relate Models with differe
32、nt dimensions. Represent each as n-dimensional array: one main, others as reference or lookup arrays.,Different dimensions: Reference,Sales Table,Conv table converts currency to $,USA dvd 2001 300.00 $ USA tv 2001 500.00 $ Poland vcr 2001 48.00 $ France vcr 2001 12.00 $,Converted values,c p t s USA
33、dvd 2001 300.00 $ USA tv 2001 500.00 $ Poland vcr 2001 200.00 zl France vcr 2001 100.00 fr,c ratio USA 1 Poland 0.24 France 0.12,Recursive Model Solving,Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatic
34、ally detected, and an error reported. Use ITERATE clause to specify # of iterations or Use UNTIL clause to specify convergence conditions,Iteration 1 2 3 4 5 6 7 8 S value 1024 512 128 64 32 16 8 4,Recursive Model Solving,Iteration 1 2 3 4 5 6 7 8 S value 1024 512 128 64 32 16 8 4,Model can contain
35、cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Use ITERATE clause to specify # of iterations or Use UNTIL clause to specify convergence conditions,Recursive Model Solving,Iteration
36、 1 2 3 4 5 6 7 8 S value 1024 512 128 64 32 16 8 4,Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Use ITERATE clause to specify # of iterations or Use UNTIL claus
37、e to specify convergence conditions,Recursive Model Solving,Iteration 1 2 3 4 5 6 7 8 S value 1024 512 128 64 32 16 8 4,Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error repor
38、ted. Use ITERATE clause to specify # of iterations or Use UNTIL clause to specify convergence conditions,Recursive Model Solving with Until,Iteration 1 2 3 4 5 6 7 8 9 10 S value 1024 512 256 128 64 32 16 8 4 2,previous(s1),- s1 = 512,Model can contain cyclic (recursive) formulas. - If cyclic formul
39、as desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Use ITERATE clause to specify # of iterations or Use UNTIL clause to specify convergence conditions,Recursive Model Solving with Until,previous(s1),- s1 = 256,Iteration 1 2 3 4 5 6
40、 7 8 9 10 S value 1024 512 256 128 64 32 16 8 4 2,Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Use ITERATE clause to specify # of iterations or Use UNTIL clause
41、 to specify convergence conditions,Recursive Model Solving with Until,Model can contain cyclic (recursive) formulas. They are automatically detected, and error is reported. Unless cycles are intentional which is indicated with ITERATE option Use ITERATE clause to specify # of iterations or Use UNTIL
42、 to specify convergence conditions. Stop if true.,previous(s1),- s1 = 4,Iteration 1 2 3 4 5 6 7 8 9 10 S value 1024 512 256 128 64 32 16 8 4 2,SQL Model Business Examples,Time Series Calculation (1),Compute the ratio of current month sales of each product to sales one year ago, one quarter ago and one month ago.,