INCITS/ISO/IEC TR 19075-1:2011[2015]
(ISO/IEC TR 19075-1:2011, IDT)

Information technology - Database languages - SQL Technical Reports - Part 1: XQuery Regular Expression Support in SQL (Technical Report)

Registered by INCITS (InterNational Committee for Information Technology Standards) as an American National Standard. Date of Registration: 2/1/2015.

Published by American National Standards Institute, 25 West 43rd Street, New York, New York 10036. Copyright 2015 by Information Technology Industry Council (ITI). All rights reserved.

Reference number: ISO/IEC TR 19075-1:2011(E). First edition, 2011-07-15. Copyright ISO/IEC 2011. Published in Switzerland.

Contents

Foreword
Introduction
1 Scope
2 XQuery regular expressions
    2.1 Matching a specific character
    2.2 Metacharacters and escape sequences
    2.3 Dot
    2.4 Anchors
    2.5 Line terminators
    2.6 Bracket expressions
        2.6.1 Listing characters
        2.6.2 Matching a range
        2.6.3 Negation
        2.6.4 Character class subtraction
    2.7 Alternation
    2.8 Quantifiers
    2.9 Locating a match
    2.10 Capture and back-reference
    2.11 Precedence
    2.12 Modes
3 Operators using regular expressions
    3.1 LIKE_REGEX
    3.2 OCCURRENCES_REGEX
    3.3 POSITION_REGEX
    3.4 SUBSTRING_REGEX
    3.5 TRANSLATE_REGEX
Bibliography
Index
Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

In exceptional circumstances, when the joint technical committee has collected data of a different kind from that which is normally published as an International Standard (“state of the art”, for example), it may decide to publish a Technical Report. A Technical Report is entirely informative in nature and shall be subject to review every five years in the same manner as an International Standard.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

ISO/IEC TR 19075-1 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 32, Data management and interchange.

ISO/IEC TR 19075 consists of the following parts, under the general title Information technology - Database languages - SQL Technical Reports:

- Part 1: XQuery Regular Expression Support in SQL

Introduction

The organization of this part of ISO/IEC TR 19075 is as follows:

1) Clause 1, “Scope”, specifies the scope of this part of ISO/IEC TR 19075.
2) Clause 2, “XQuery regular expressions”, explains how XQuery regular expressions are formed.
3) Clause 3, “Operators using regular expressions”, explains how the SQL operators use regular expressions.

1 Scope

This Technical Report describes the regular expression support in SQL adopted from the regular expression syntax of XQuery F&O [...] for this, XQuery F&O; a variety of popular works contain detailed treatments of regular expressions).

2.1 Matching a specific character

Perhaps the most elementary pattern matching requirement is the ability to match a single character or string. For most characters, this is done by simply writing the character in the regular expression. For example, suppose you want to know if a string S contains the letters “xyz”. This could be done with the following predicate:

    S LIKE_REGEX 'xyz'

Note that the SQL LIKE predicate would require an exact match for “xyz”. However, the convention with regular expressions is that S need only contain a
substring that is “xyz”. For example, all of the following values of S would yield True for the predicate above:

    xyz
    abcxyz123
    1 xyz 2 xyz 3 xyz

Note that in the last example, there are actually three occurrences of the regular expression “xyz” within the tested value. The user may wish to know the number of occurrences of a match. This can be done with OCCURRENCES_REGEX. For example:

    OCCURRENCES_REGEX ( 'xyz' IN '1 xyz 2 xyz 3 xyz' ) = 3

The user might also wish to know the position of a specific match. This can be done using POSITION_REGEX. For example, to learn the starting character position of the second
occurrence,

    POSITION_REGEX ( 'xyz' IN '1 xyz 2 xyz 3 xyz' OCCURRENCE 2 ) = 9

It is also possible to ask for the character position of the first character after the match. For example:

    POSITION_REGEX ( AFTER 'xyz' IN '1 xyz 2 xyz 3 xyz' OCCURRENCE 2 ) = 12

If AFTER is used and the last character of the subject string is consumed, then the result is the length of the string plus 1 (one):

    POSITION_REGEX ( AFTER 'xyz' IN 'xyz' ) = 4

2.2 Metacharacters and escape sequences

As mentioned, most characters can be matched by simply writing the character in the regular expression. However, certain characters are reserved as metacharacters. The complete list of metacharacters is:

    . \ ? * + { } ( ) | [ ] ^ $

The use of each of these metacharacters will be explained later. If you want to match a metacharacter, then you need to use an escape sequence, consisting of a backslash (“\”) followed by the metacharacter. For example, to test whether a string contains a dollar sign, you could write

    S LIKE_REGEX '\$'

In particular, the escape sequence representing a backslash is two consecutive backslashes. There are various other defined escape sequences, matching either a single character, or any of a group of characters. The single character escape sequences are:

    \n    newline (U+000A)
    \r    return (U+000D)
    \t    tab (U+0009)
    \-    minus sign (-)

The so-called category escapes are exemplified by “\p{L}” or “\p{Lu}”. A category escape begins with “\p{”
followed by one uppercase letter, optionally a lowercase letter, and then the closing brace. In these examples, “\p{L}” matches any letter (as defined by Unicode) and “\p{Lu}” matches any uppercase letter. Some interesting category escapes are listed below:

    \p{L}     Any letter.
    \p{Lu}    Any uppercase letter.
    \p{Ll}    Any lowercase letter.
    \p{Nd}    Any decimal digit.
    \p{P}     Any punctuation mark.
    \p{Z}     Any separator (space, line, paragraph, etc.).

The complete list of category escapes is found in XML Schema: Datatypes, section F.1.1, “Character class escapes”.

There are also complementary category escapes, which are exemplified by “\P{L}” or “\P{Lu}”. A complementary category escape matches any character that would not be matched by the corresponding category escape. The difference is that the (positive) character escape is written with a lowercase “p” whereas the complementary character escape is written with an uppercase “P”.

The so-called block escapes match any character in a block of Unicode, that is, a predefined consecutive range of code points. For example, “\p{IsBasicLatin}” matches the ASCII character set. There are also complementary block esca