Syntax Analysis (Sections 2.2-2.3)
CSCI 431 Programming Languages, Fall 2003
A modification of slides developed by Felix Hernandez-Campos at UNC Chapel Hill

Review: Compilation/Interpretation
- Diagram labels: Compiler or Interpreter; Translation; Execution

Review: Syntax Analysis
- Diagram labels: Source Code; Compiler or Interpreter (Translation, Execution); Target Code
- Specifying the form of a programming language
  - Tokens: Regular Expressions (also F.A.s & Reg. Grammars)
  - Syntax: Context-Free Grammars (also P.D.A.s)

Phases of Compilation

Syntax Analysis
- Syntax: Webster's definition: 1 a : the way in which linguistic elements (as words) are put together to form constituents (as phrases or clauses)
- The syntax of a programming language
  - Describes its form
    - Organization of tokens
    - Context Free Grammars (CFGs)
  - Must be recognizable by compilers and interpreters
    - Parsing
    - LL and LR parsers

Context Free Grammars
- CFGs
  - Add recursion to regular expressions
    - Nested constructions
  - Notation
      expression → identifier | number | - expression | ( expression ) | expression operator expression
      operator → + | - | * | /
    - Terminal symbols: identifier, number, -, (, ), +, *, /
    - Non-terminal symbols: expression, operator
    - Production rule (i.e. substitution rule): non-terminal symbol → terminal and non-terminal symbols

Parsing
- Parsing an arbitrary Context Free Grammar
  - O(n^3)
  - Too slow for large programs
- Linear-time parsing
  - LL parsers (a Left-to-right, Left-most derivation)
    - Recognize LL grammars
    - Use a top-down strategy
  - LR parsers (a Left-to-right, Right-most derivation)
    - Recognize LR grammars
    - Use a bottom-up strategy

Parsing example
Example:
comma-separated list of identifiers
- CFG
    id_list → id id_list_tail
    id_list_tail → , id id_list_tail
    id_list_tail → ;
- Parsing
    A, B, C;

Top-down derivation of A, B, C;

Bottom-up parsing of A, B, C;
- Left-to-right, Right-most derivation
- LR parsing (a shift-reduce parser)

LR Parsing vs. LL Parsing
- LL
  - A top-down or predictive parser
    - Predicts needed productions based on the current left-most non-terminal in the tree and the current input token
    - The top-of-stack contains the left-most non-terminal
  - The stack contains a record of what the parser expects to see
- LR
  - A bottom-up or shift-reduce parser
    - Shifts tokens onto the stack until it recognizes a right-hand side, then reduces those tokens to their left-hand side
  - The stack contains a record of what the parser has already seen

An appropriate LR Grammar
    id_list → id_list_prefix ;
    id_list_prefix → id_list_prefix , id
    id_list_prefix → id
- This grammar can't be parsed top-down!
- Problems for LL grammars:
  - left recursion, example above
  - common prefixes, example:
      stmt → id := expr | id ( arg_list )

LL(1) Grammar for the Calculator Language

LR(1) Grammar for the Calculator Language

Hierarchy of Linear Parsers
- Basic containment relationship
  - Every LL grammar is also an LR grammar, so LR parsers accept everything LL parsers do
  - Only a subset of the CFGs can be parsed by LR parsers, and LL parsers handle a smaller subset still
- Diagram labels: LL parsing; LR parsing; CFGs

Bigger Picture
- Chomsky Hierarchy of Grammars
  - Regular Grammar
  - Context Free Grammar
  - Context Sensitive Grammar
  - Unrestricted Grammar

Implementation of an LL Parser
- Two options:
  - A recursive descent parser (section 2.2.3)
    - For LL grammars only
  - Parse table and a driver (section 2.2.5)
- LR parsers covered in section 2.2.6

Recursive Descent Parser Example
- LL(1) grammar
- Outline of recursive parser
  - This parser only verifies syntax
  - match is the scanner
- A tool that generates recursive descent parsers: JavaCC

Semantic Analysis
- Diagram labels: Source Code; Compiler or Interpreter (Translation, Execution); Target Code
- Specifying the meaning of a programming language
  - Attribute Grammars
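
Recursive Descent Parser Example (sketch)
The slides describe a recursive descent parser that only verifies syntax and relies on a match routine to consume tokens. The following is a minimal sketch of that idea in Python for the LL(1) id_list grammar given earlier, not the code from the original slides. The names tokenize, Parser, ParseError, accepts, and the "<eof>" end marker are all illustrative assumptions.

```python
import re

# Grammar being recognized (from the slides):
#   id_list      -> id id_list_tail
#   id_list_tail -> , id id_list_tail
#   id_list_tail -> ;

EOF = "<eof>"  # assumed end-of-input marker; the scanner never produces it from text
ID_PATTERN = r"[A-Za-z_]\w*"


class ParseError(Exception):
    """Raised when the input does not match the grammar."""


def tokenize(text):
    """Hypothetical scanner: identifiers, ',' and ';' (anything else is one stray token)."""
    return re.findall(ID_PATTERN + r"|[,;]|\S", text) + [EOF]


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def match(self, expected):
        """Consume the current token if it is what the grammar expects."""
        tok = self.peek()
        if expected == "id":
            ok = re.fullmatch(ID_PATTERN, tok) is not None
        else:
            ok = tok == expected
        if not ok:
            raise ParseError(f"expected {expected!r}, found {tok!r}")
        self.pos += 1

    # One routine per non-terminal; the next token predicts the production.
    def id_list(self):
        self.match("id")
        self.id_list_tail()

    def id_list_tail(self):
        if self.peek() == ",":
            self.match(",")
            self.match("id")
            self.id_list_tail()
        else:
            self.match(";")


def accepts(text):
    """Return True if text is a syntactically valid id_list, else False."""
    parser = Parser(text)
    try:
        parser.id_list()
        parser.match(EOF)  # no trailing tokens allowed
        return True
    except ParseError:
        return False
```

Calling accepts("A, B, C;") walks the same top-down derivation as in the parsing example: each routine corresponds to one non-terminal, and the single token of lookahead in id_list_tail selects between the two productions. Like the parser outlined in the slides, this sketch builds no parse tree; it only verifies syntax.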