Appendix A: XML and XML Schema
Service-Oriented Computing: Semantics, Processes, Agents
Munindar P. Singh and Michael N. Huhns, Wiley, 2005

Highlights of this Chapter
- XML and Vocabularies
- Well-Formedness
- Namespaces and Qualified Names
- XML Extensions
- XML Schema
- XML Query Languages
- XPath
- XSLT
- Limitations

Brief Introduction to XML
- Basics
- Parsing
- Storage
- Transformations

Markup History
- None
- Ad hoc tags
- SGML (Standard Generalized Markup Language): complex, few reliable tools
- HTML (HyperText Markup Language): simple, unprincipled, mixes structure and display
- XML (eXtensible Markup Language): simple, yet extensible subset of SGML to capture new vocabularies
  - Machine processible
  - Comprehensible to people: easier debugging

XML Basics and Namespaces
- Optional text, also known as PCDATA

Parsing and Validating
- An XML document maps to a parse tree
  - Each tag ends once: nesting structure (one root)
  - Each attribute occurs at most once; its value is a quoted string
- Well-formed XML documents can be parsed
- Applications have an explicit or implicit syntax for their particular XML-based tags
  - If explicit, may be expressed in DTDs and XML Schemas
  - Best kept as definitions elsewhere and referred to from the document
  - XML Schemas, expressed in XML, are superior to DTDs
- When docs are produced by external components, they should be validated
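
The well-formedness rules above (one root, properly nested tags, each attribute at most once) can be checked with the JDK's built-in parser. The following is a minimal sketch; the class name WellFormedSketch and the method isWellFormed are hypothetical names chosen for illustration, not part of the original slides.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a string parses as well-formed XML.
final class WellFormedSketch {
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;                     // a parse tree could be built
        } catch (SAXException e) {
            return false;                    // violates a well-formedness rule
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b x='1'/></a>"));   // nested, one root
        System.out.println(isWellFormed("<a><b></a>"));          // b never ends
        System.out.println(isWellFormed("<a x='1' x='2'/>"));    // attribute repeated
    }
}
```

Note that this checks only well-formedness; validity against a DTD or XML Schema is a separate step.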

XML Schema
- A data definition language for XML: defines a notion of schema validity
- Same syntax as regular XML documents
- Local scoping of subelement names
- Incorporates namespaces
- Types
  - Primitive (built-in): string, integer, float, date, ...
  - Primitive (built-in): ID (key), IDREF (foreign key)
  - simpleType constructors: list, union
  - Restrictions: intervals, lengths, enumerations, regex patterns, ...
- Flexible ordering of elements
- Key and referential integrity constraints
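
To make the idea of a restriction concrete, here is a small sketch that validates instance documents against a schema whose only element is an integer limited to the interval 1 to 5. The element name rating, the class SimpleTypeSketch, and the method isValid are assumptions made for this illustration; the validation calls are the standard javax.xml.validation API.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical example: a simpleType restricted to an interval.
final class SimpleTypeSketch {
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='rating'>"
      + "<xsd:simpleType><xsd:restriction base='xsd:integer'>"
      + "<xsd:minInclusive value='1'/><xsd:maxInclusive value='5'/>"
      + "</xsd:restriction></xsd:simpleType>"
      + "</xsd:element></xsd:schema>";

    // Returns true if xml is schema-valid with respect to xsd.
    static boolean isValid(String xsd, String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
            // With no error handler set, validation errors are thrown.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(XSD, "<rating>3</rating>"));     // inside the interval
        System.out.println(isValid(XSD, "<rating>9</rating>"));     // outside the interval
        System.out.println(isValid(XSD, "<rating>high</rating>"));  // not an integer
    }
}
```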

XML Schema: complexType
- Specifies types of elements with structure:
  - Must use a compositor if there are one or more subelements
  - Subelements with types
  - Min and max occurrences (default 1) of subelements
- Elements with text content: not easy (ignore)
- EMPTY elements: easy. Example?

XML Schema: Compositors
- Sequence: ordered
  - Can occur within other compositors
  - Allows varying min and max occurrence
- All: unordered
  - Must occur directly below root element
  - Max occurrence of each element is 1
- Choice: exclusive or
  - Can occur within other compositors

XML Schema: Key Namespaces
- http://www.w3.org/2001/XMLSchema
  - Conventional prefix: xsd
  - Terms for defining schemas: schema, element, attribute, ...
  - The tag schema has an attribute targetNamespace
- http://www.w3.org/2001/XMLSchema-instance
  - Conventional prefix: xsi
  - Terms for use in instances: schemaLocation, nil (called null in early drafts)
- targetNamespace: user-defined
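
The compositors and occurrence bounds described above can be tried out with a small schema. This is a sketch under stated assumptions: the song vocabulary (song, title, group, artist, track), the class CompositorSketch, and the method isValid are invented for illustration, and the schema omits a targetNamespace to stay short.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical example: a sequence containing a nested choice.
final class CompositorSketch {
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='song'><xsd:complexType>"
      + "<xsd:sequence>"                                   // ordered
      + "<xsd:element name='title' type='xsd:string'/>"    // min = max = 1 by default
      + "<xsd:choice>"                                     // exclusive or
      + "<xsd:element name='group' type='xsd:string'/>"
      + "<xsd:element name='artist' type='xsd:string'/>"
      + "</xsd:choice>"
      + "<xsd:element name='track' type='xsd:string'"
      + " minOccurs='0' maxOccurs='unbounded'/>"
      + "</xsd:sequence>"
      + "<xsd:attribute name='genre' type='xsd:string'/>"
      + "</xsd:complexType></xsd:element></xsd:schema>";

    static boolean isValid(String xml) {
        try {
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)))
                .newValidator()
                .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;   // the instance violates the content model
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(
            "<song genre='jazz'><title>t</title><group>g</group></song>"));
        System.out.println(isValid(
            "<song><group>g</group><title>t</title></song>"));   // wrong order
    }
}
```

An instance that supplies both group and artist is rejected as well, because choice admits exactly one of its branches.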

XML Schema Instance Doc
- Null values are declared in the instance document using the xsi namespace (the xsi:nil attribute in the final recommendation)

Creating Schema Docs: 1
- An included schema goes into the same namespace as the including one

Creating Schema Docs: 2
- Use imports instead of include
- Specify namespaces from which schemas are to be imported
- Location of schemas not required and may be ignored if provided

Document Object Model (DOM)
- Basis for parsing XML, which provides a node-labeled tree in its API
- Conceptually simple: traverse by requesting tag, its attribute values, and its children
- Processing program reflects document structure
- Can edit documents
- Inefficient for large documents: parses them first entirely to build the tree even if a tiny part is needed

DOM Example [Simeoni 2003]

    import org.w3c.dom.*;

    // Assumes d is an org.w3c.dom.Document already produced by a DOM parser.
    Element s = d.getDocumentElement();
    NodeList l = s.getElementsByTagName("member");
    Element m = (Element) l.item(0);
    String code = m.getAttribute("code");            // getAttribute returns a String
    NodeList kids = m.getChildNodes();
    Node kid = kids.item(0);
    String tagName = ((Element) kid).getTagName();   // assumes the first child is an element

Simple API for XML (SAX)
- Parser generates a sequence of events: startElement, endElement, ...
- Programmer implements these as callbacks
- More control for the programmer
- Processing program does not reflect document structure
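
The contrast between the two parsing styles can be seen by extracting the same data both ways. The sketch below is self-contained; the staff document, the class DomVersusSaxSketch, and both method names are hypothetical. It matches on qName because the JDK's default SAX parser factory is not namespace-aware, in which case the local name may be empty.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: collect the code attribute of every member element.
final class DomVersusSaxSketch {
    static final String XML =
        "<staff><member code='a1'><project>P1</project></member>"
      + "<member code='b2'><project>P2</project></member></staff>";

    // DOM: build the whole tree first, then walk it.
    static List<String> codesViaDom(String xml) {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList members = d.getDocumentElement().getElementsByTagName("member");
            List<String> codes = new ArrayList<>();
            for (int i = 0; i < members.getLength(); i++) {
                codes.add(((Element) members.item(i)).getAttribute("code"));
            }
            return codes;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // SAX: react to events as the parser streams through the document.
    static List<String> codesViaSax(String xml) {
        List<String> codes = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if (qName.equals("member")) {
                    codes.add(attrs.getValue("code"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return codes;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(codesViaDom(XML));
        System.out.println(codesViaSax(XML));
    }
}
```

The DOM version mirrors the document structure in its code, while the SAX version never holds more than the current event, which is why it scales better to large inputs.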

SAX Example [Simeoni 2003]

    import java.io.CharArrayWriter;
    import org.xml.sax.Attributes;
    import org.xml.sax.helpers.DefaultHandler;

    class MemberProcess extends DefaultHandler {
        // Field declarations are not shown in the source; these types are assumptions.
        private String code;
        private boolean inProject;
        private final CharArrayWriter buffer = new CharArrayWriter();

        // n is the local name, which is populated only by a namespace-aware parser.
        public void startElement(String uri, String n, String qName, Attributes attrs) {
            if (n.equals("member")) code = attrs.getValue("code");
            if (n.equals("project")) { inProject = true; buffer.reset(); }
        }

        public void endElement(String uri, String n, String qName) {
            if (n.equals("project")) inProject = false;
            if (n.equals("member")) {
                // the rest of this branch is missing from the source
            }
        }
    }

Programming with XML
- Current approaches concentrate on structure but ignore meaning
  - Difficult to construct and maintain
  - Treat everything as a string
  - Inadequate type checking can hide errors
- Emerging approaches (e.g., JAXB) provide superior binding from XML to programming languages
  - Primitives such as unmarshal to materialize an object from XML

Uses of XML
- Exchanging information across software components
- Storing information in nonproprietary format
- XML documents represent structured descriptions:
  - Products, services, catalogs
  - Contracts
  - Queries, requests, invocations (as in SOAP)
- Data-centric versus document-centric (irregular, heterogeneous data, depend on entire doc for app-specific meaning) views

Data-Centric View
[Figure: table of values V11 ... V1n through Vm1 ... Vmn]
- Extract and store into DB via mapping to DB model
- Regular, homogeneous tags
- May be expensive if repeatedly parsed and instantiated

Document-Centric View
- Storing docs in DBs
  - Use character large objects (CLOBs) within DB
  - Store paths to external files containing docs
- Combine with some structured elements with search conditions for both structured elements and unstructured CLOBs or files
- Heterogeneity also complicates mappings to traditional typed OO programming languages

Directions
- Limitations of XML
  - Doesn't represent meaning
  - Enables multiple representations for the same information; transform if models known
- Trends: sophisticated approaches for
  - Querying and manipulating XML, e.g., XSLT
  - Binding to PLs and DBs
  - Semantics, e.g., RDF, DAML, OWL, ...

XML Query Languages
- XPath
- XPointer
- XSLT
- XQuery

XPath
- Model XML documents as trees with nodes
  - Elements
  - Attributes
  - Text (PCDATA)
  - Comments
  - Root node: above root of document

Achtung!
- Parent in XPath is like parent as traditionally in computer science
- Child in XPath is confusing:
  - An attribute is not the child of its parent
  - Makes a difference for certain kinds of recursion (e.g., apply-templates discussed in XSLT)
- Our terminology is based on the traditional terminology:
  - e-children, a-children, t-children
  - Sets via et- or ta-, etc.
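
The asymmetry between parent and child can be observed directly with the JDK's XPath engine. In this sketch the Song document, the class AttributeParentSketch, and the helper names are hypothetical; the XPath expressions themselves are standard XPath 1.0.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical example: an attribute has a parent but is nobody's child.
final class AttributeParentSketch {
    // One a-child (genre), one e-child (group), one t-child ("live").
    static final String XML = "<Song genre='jazz'><group>Trio</group>live</Song>";

    static double number(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Double) xpath.evaluate(
                expr, new InputSource(new StringReader(XML)), XPathConstants.NUMBER);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    static String string(String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // node() selects children: the element and the text, not the attribute.
        System.out.println(number("count(/Song/node())"));
        // The attribute is reachable only through the attribute axis.
        System.out.println(number("count(/Song/@*)"));
        // Yet walking up from the attribute arrives at Song.
        System.out.println(string("name(/Song/@genre/..)"));
    }
}
```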

XPath Paths
- Leading /: root
- /: indicates walking down a tree
- .: current node
- ..: parent node
- @attr: to access values for the given attribute
- text()
- comment()
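
Each path step listed above can be exercised against a small document. The following sketch assumes a made-up Catalog document; the class XPathPathsSketch and the method eval are hypothetical names, and each expression is evaluated to its string value.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical example: one expression per path construct.
final class XPathPathsSketch {
    static final String XML =
        "<Catalog><!--demo--><Song genre='jazz'>"
      + "<title>Blue</title><group>Trio</group></Song></Catalog>";

    // Evaluates expr from the root of XML and returns its string value.
    static String eval(String expr) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("/Catalog/Song/title"));          // walk down from the root
        System.out.println(eval("/Catalog/Song/@genre"));         // attribute value
        System.out.println(eval("/Catalog/Song/title/text()"));   // text node
        System.out.println(eval("/Catalog/comment()"));           // comment node
        System.out.println(eval("name(/Catalog/Song/title/..)")); // parent of title
        System.out.println(eval("name(/Catalog/Song/.)"));        // current node
    }
}
```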

XPath Navigation
- Select children according to position, e.g., [j], where j could be 1 ... last()
- Descendant-or-self operator, //
  - .//elem finds all elems under the current node
  - //elem finds all elems in the document
- Ancestors: not needed in this course
- Wildcard, *: collects e-children of the node where it is applied, but omits the t-children
- @*: finds all attribute values

XPath Queries
- Incorporate selection conditions in XPath
  - Attributes: //Song[@genre="jazz"]
  - Elements: //Song[starts-with(.//group, "Led")]
  - Existence of attribute: //Song[@genre]
  - Existence of subelement: //Song[group]
- Boolean operators: and, not, or
- Set operator: union (|); none others
- Arithmetic operators: +, -, ...
- String functions: contains(), concat(), string-length(), ...
- Aggregates: sum(), count()
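
The selection conditions above can be run as written. This sketch uses an invented three-song document; the class XPathQueriesSketch and the method titles are hypothetical, and the helper returns the text of each selected node in the order the engine reports them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example: predicates, positions, and boolean operators.
final class XPathQueriesSketch {
    static final String XML =
        "<Songs>"
      + "<Song genre='jazz'><title>So What</title><group>Miles Davis Quintet</group></Song>"
      + "<Song genre='rock'><title>Kashmir</title><group>Led Zeppelin</group></Song>"
      + "<Song><title>Untitled</title></Song>"
      + "</Songs>";

    // Returns the text content of every node selected by expr.
    static List<String> titles(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titles("//Song[@genre='jazz']/title"));               // attribute value
        System.out.println(titles("//Song[starts-with(.//group, 'Led')]/title")); // element content
        System.out.println(titles("//Song[@genre]/title"));                      // attribute exists
        System.out.println(titles("//Song[group]/title"));                       // subelement exists
        System.out.println(titles("//Song[last()]/title"));                      // position
        System.out.println(titles("//Song[not(@genre) or @genre='rock']/title")); // boolean operators
    }
}
```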

XPointer
- Combines XPath with URLs
- URL to get to a document; XPath to walk down the document
- Can be used to formulate queries, e.g., Song-URL#xpointer(//Song[@genre="jazz"])

XSLT
- A functional programming language
- A stylesheet specifies transformations on a document

XSLT Stylesheets
- Use the XSLT namespace, conventionally abbreviated as xsl
- Includes primitives: copy-of

XSLT Templates: 1
- A pattern to specify where a given transform should apply
- This match only works on the root: match="/"
- Only anonymous templates in this course

XSLT Templates: 2
- Can be applied recursively on the et-children via apply-templates
- By default, if no other template matches, recursively apply to et-children of current node (ignores attributes) and to root
- Can over-apply; to override the default, may need an empty template, e.g., one that matches all text()

XSLT Templates: 3
- Subtleties of XSLT matching are beyond our scope
- Discuss some examples
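
The effect of the default rules, and of overriding them with an empty template, can be seen by running the same stylesheet with and without a template for text(). This is a sketch under stated assumptions: the Songs document, the class XsltTemplatesSketch, and its method names are hypothetical, while the transformation calls are the standard javax.xml.transform API.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example: default recursion versus an empty text() template.
final class XsltTemplatesSketch {
    static final String XML =
        "<Songs><Song genre='jazz'><title>So What</title><group>Miles Davis</group></Song>"
      + "<Song><title>Kashmir</title><group>Led Zeppelin</group></Song></Songs>";

    // Builds a stylesheet; suppressText adds the empty template for text().
    static String stylesheet(boolean suppressText) {
        return "<xsl:stylesheet version='1.0'"
             + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
             + "<xsl:output method='text'/>"
             + "<xsl:template match='/'><xsl:apply-templates/></xsl:template>"
             + "<xsl:template match='title'>"
             + "<xsl:value-of select='.'/><xsl:text>;</xsl:text>"
             + "</xsl:template>"
             + (suppressText ? "<xsl:template match='text()'/>" : "")
             + "</xsl:stylesheet>";
    }

    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Default rules reach the group text as well: the output over-applies.
        System.out.println(transform(stylesheet(false), XML));
        // The empty template for text() leaves only what the title template emits.
        System.out.println(transform(stylesheet(true), XML));
    }
}
```

Neither run emits the genre attribute, which reflects the point that default recursion visits element and text children but not attributes.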

Appendix A Summary
- XML enables information sharing
- XML is well established
  - Several aspects are worked out
  - Lots of tools
  - Works with databases and programming languages
- XML provides a useful substrate for service-oriented computing