A Unified Relational Approach to Grid Information Services
GWD-GIS-012-1 (Informational)
Peter A. Dinda, Northwestern
Beth Plale, Georgia Tech
http://www.cs.nwu.edu/pdinda/relational-gis

Related Work
- Steve Fisher, RAL
  - Relational model for Grid Performance Working group
  - Interesting thoughts on how to provide distributed relational model
- Jennifer Schopf, “The Dictionary Project”

Claim
- Applications need common compositional queries over information of varying dynamicity
Approach
- Build down from an RDBMS world-view
- Relational = relational data model and queries
- Unified = tables and streams
Research Questions
- How “far down” must we go?
- What extensions are needed?

Outline
- Needs of Grid applications
- Why RDBMS?
- Our approach (and research)
- Existence proofs
- Call for participation

Needs of Grid Applications
- Compositional queries
  - Application-specific information aggregation
- Support for information of varying dynamicity
  - Varying update rates and freshness requirements
  - Seamless inclusion of streaming data
- A common data model and query language
  - Powerful, high level, declarative, easy-to-optimize

Some Examples
- Adaptive data parallel SOR
- Workflow
  - Dv scientific visualization
- Distributed laboratories
  - dQUOB
- RPS prediction system and Remos
  - RPSDB
- Grid schedulers
  - GridSearcher

Adaptive Data Parallel SOR
- Startup: “Find 4 hosts which all have the same architecture and have a combined memory of 0.5 to 1 GB”
  - Compositional Query Over Static Information
- Adaptation: “Tell me about instances in which the predicted load on any one of those 4 hosts exceeds the average of their predicted loads by 50%”
  - Compositional Query Over Dynamic Information

Our Approach
- Compositional queries as SQL queries
- Extensible type hierarchy
- Extensible schemas and indices
- Time-bounded non-deterministic queries
- Data streams as relations
- High update rates and freshness
- Friendly interfaces for non-experts
- Decentralized administration and data
Prototype Systems: RPSDB, dQUOB

Supporting Compositional Queries
- Set operations - Relational Algebra - RDBMS
- Relational data model
  - Tables with relationships
  - Indices separately created and managed
  - Can change to meet changing query demands
- ANSI SQL
  - Powerful, flexible, complete query language
  - Declarative nature (what, not how) enables optimization
  - Decouples app from specific RDBMS implementations
- Relational database manager
  - ACID (Atomicity, Consistency, Isolation, Durability)

Query Example (RPSDB)

Extensible Type Hierarchy
- Type identifiers
- Single inheritance tree
  - Is-a relationships
  - Type conversion requirement
- Set of base types that can be extended
- Single manager
  - Subtypes added by consensus

Extensible Type Hierarchy (RPSDB)
[Figure: type hierarchy diagram. Type names shown: unique, benchmark, hostbenchmark, hostspecificbenchmark, linkbenchmark, switchbenchmark, switchspecificbenchmark, pathbenchmark, networknode, host, switch, switchport, networklink, networkpath, module, endpoint, flowsource, moduleexec, linksource, nodesource, datasource]

Schemas and Indices
- Schemas encode types into tables and establish relationships between the tables
- Indices determine which relationships are fast with respect to queries

Schema (RPSDB)

Non-deterministic Time-bounded Queries
- Queries can be incredibly expensive
  - N-way joins
- Typically don't need “all the answers”
  - Example: “Find 4 hosts which all have the same architecture and have a combined memory of 0.5 to 1 GB”
  - Only one such group is needed
- Typically have time and resource constraints
Run until the deadline, returning a non-deterministic subset of the full query results

Example

Data Stream Support and Unification
- Extend SQL query model to streams
- Add dynamic types to hierarchy
  - RPS measurements and predictions, etc.
- Leverage dQUOB technology
  - Data stream is a set of relational tables
  - SQL-like queries on data stream
  - Stream optimizations enabled by relational model

[Figure: dQUOB Quoblet. An SQL query applied to a data stream, with user-defined actions; labels shown: C1, C2, C3, C4, MPEG compression, bounding box extraction, units conversion, violation notification]

Fast Updates and Freshness
- Dynamic objects will become the majority
- Update rate and freshness constraints
- Remote filtering and triggers
- Push updates to GIS and to consumers
- dQUOB-like technology
- RDBMS systems support frequent updates

Distributed Operation
- Centralized model
  - One administrative domain, fine-grain access control, centralized database
- Decentralized model
  - Multiple administrative domains, distributed database
Centralization seems to be a real disadvantage for RDBMS
- Can it be overcome?
- Should it be overcome?
- Is distributed operation really necessary?

Performance Evaluation
- Scalability of relational approach compared to the hierarchical approach
- Effectiveness of nondeterminism
- Achievable update rates and freshness
- Value of ACID properties

Tensions to explore
- RDBMS versus distributed data and decentralized administration and multiple security domains
- RDBMS versus expensive queries
- Expressibility versus usability (SQL)

Interaction with other GIS and Grid Performance Systems
[Figure: diagram with boxes labeled Monitors, Prediction, Non-relational GIS, Relational GIS, App, App, App]
Alternatives: MDS Index Nodes

Claim
- Applications need common compositional queries over information of varying dynamicity
Approach
- Build down from an RDBMS world-view
- Relational = relational data model and queries
- Unified = tables and streams
Research Questions
- How “far down” must we go?
- What extensions are needed?

Come Join Us
- Peter A. Dinda, Northwestern, pdinda@cs.nwu.edu
- Beth Plale, Georgia Tech, beth@cc.gatech.edu
- Relational Task Group, http://www.cs.nwu.edu/pdinda/relational-gis

Proposed Areas/Papers
- Use cases
  - Expand on the examples in our paper
- Type hierarchy and set of base types
  - Useful independent of data model
- The vision paper (Plale)
- Schema design / critique
- Reference implementations
- Interaction with Steve Fisher's work
AREAS RIPE FOR PARTICIPATION!

Implementation of Non-deterministic, Time-bounded Queries
- Current research
- Leverage work by Olken and Tan, et al.
- Query-rewriting approach
- Hopefully RDBMS-independent

Resource Prediction System
- Software Configuration Management: “For each of those hosts, find an RPS prediction stream corresponding to a measurement stream from a load sensor on the host”
  - Compositional Query Over Semistatic Information
- Performance Monitoring Streams: “Tell me about instances in which the predicted load on any one of those 4 hosts exceeds the average of their predicted loads by 50%”
  - Compositional Query Over Dynamic Streams

Dv (and traditional workflow)
- Startup: “Find a pool of five hosts, each of which has at least a GB of memory for interpolation, a second pool of five different hosts with at least 1 GFLOP/s performance for isosurface extraction, and a third pool of five different hosts with special scene synthesis hardware, where the inter-pool bandwidth is at least 10 MB/s.”
  - Compositional Query Over Static Information
- Adaptation: “What is the host within the isosurface extraction pool which is expected to have the minimum load over the next 10 seconds?”
  - Compositional Query Over Dynamic Streams

Dv as a Query
- “Show me the results of rendering the scene synthesized by combining the results of isosurface extraction and morphology reconstruction over regularly gridded data resulting from interpolation of this region of the simulation database”
  - Compositional Query Describing An Application
  - No Specific Query Plan is Implied

Grid Schedulers
- Similar needs, more flexibility
- But these abstractions are important
- GridSearcher
- Schopf
- Compositional Queries over MDS

Our Approach
- Compositional queries as SQL queries
- Type hierarchy
- Schema and indices (including example)
- Time-bounded non-deterministic queries
- Data stream support with dQUOB
- Fast updates and streaming
- Tensions and questions
Prototype Systems: RPSDB, dQUOB
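
Illustrative sketch (not from the slides)
The first and fourth items of the approach, compositional queries as SQL and time-bounded non-deterministic queries, can be made concrete with a small example. The sketch below writes the SOR startup query ("find 4 hosts which all have the same architecture and have a combined memory of 0.5 to 1 GB") as a four-way self-join and runs it against a deadline. Everything named here is an assumption made for illustration: the `hosts` table and its columns, the sample data, and the function names are hypothetical and are not the RPSDB schema. The deadline is enforced by aborting the query from SQLite's progress handler, which is a simple early-termination technique and not the query-rewriting approach the slides attribute to ongoing research.

```python
import sqlite3
import time

# Hypothetical schema: one row per host, memory in megabytes.
SCHEMA = "CREATE TABLE hosts (name TEXT PRIMARY KEY, arch TEXT, mem_mb INTEGER)"

# Four-way self-join: same architecture, four distinct hosts (ordered by
# name so each group appears once), combined memory of 0.5 GB to 1 GB.
GROUP_QUERY = """
SELECT h1.name, h2.name, h3.name, h4.name
FROM hosts AS h1
JOIN hosts AS h2 ON h2.arch = h1.arch AND h2.name > h1.name
JOIN hosts AS h3 ON h3.arch = h1.arch AND h3.name > h2.name
JOIN hosts AS h4 ON h4.arch = h1.arch AND h4.name > h3.name
WHERE h1.mem_mb + h2.mem_mb + h3.mem_mb + h4.mem_mb BETWEEN 512 AND 1024
"""


def demo_connection():
    """Build an in-memory database with made-up sample hosts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO hosts VALUES (?, ?, ?)",
        [
            ("a", "x86", 128),
            ("b", "x86", 128),
            ("c", "x86", 256),
            ("d", "x86", 256),
            ("e", "sparc", 512),
            ("f", "x86", 2048),
        ],
    )
    return conn


def find_host_groups(conn, deadline_s, max_groups=1):
    """Return up to max_groups matching groups found before the deadline.

    Which subset comes back depends on the query plan and on timing, so
    the result is a non-deterministic subset of the full answer.
    """
    stop_at = time.monotonic() + deadline_s
    timed_out = False

    def check_deadline():
        # A non-zero return value tells SQLite to abort the running query.
        nonlocal timed_out
        timed_out = time.monotonic() >= stop_at
        return 1 if timed_out else 0

    # Call check_deadline roughly every 1000 SQLite VM instructions.
    conn.set_progress_handler(check_deadline, 1000)
    groups = []
    try:
        for row in conn.execute(GROUP_QUERY):
            groups.append(row)
            if len(groups) >= max_groups:
                break  # enough answers; "all the answers" are not needed
    except sqlite3.OperationalError:
        if not timed_out:
            raise  # a real error, not our deadline abort
    finally:
        conn.set_progress_handler(None, 0)
    return groups
```

A caller would use it as `find_host_groups(demo_connection(), deadline_s=0.5)`. Groups found before the deadline are kept even if the query is then aborted, which matches the idea of running until the deadline and returning whatever subset is available.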