Compiler Principles: An ANTLR Tutorial

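I. Installing and configuring ANTLR

Before installing and configuring ANTLR, make sure a Java environment is already installed.

1. Download ANTLR 4

Download page: https://www.antlr.org/download/

Under "Tool and Java runtime lib", download antlr-4.7.2-complete.jar.

2. Create the batch files

In the directory that contains antlr-4.7.2-complete.jar, create two .bat files: antlr4.bat and grun.bat. The directory then holds three files: antlr-4.7.2-complete.jar, antlr4.bat, and grun.bat.

In antlr4.bat, write: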

java org.antlr.v4.Tool %*

In grun.bat, write:

java org.antlr.v4.gui.TestRig %*

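3. Configure the environment variables

Steps (Windows 10): Settings -> System -> About -> Advanced system settings (top right) -> Environment Variables -> System variables.

Add the full path of antlr-4.7.2-complete.jar (the jar file itself, not only its folder) to the system variable CLASSPATH. Also keep "." (the current directory) in CLASSPATH, because grun loads the compiled lexer and parser classes from the directory it is run in. To call antlr4 and grun from any directory, the folder containing the two .bat files also has to be on PATH.

The ANTLR environment is now configured.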
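A quick way to confirm the configuration is to ask the JVM whether it can see the two ANTLR entry points used by the batch files. The following is a minimal sketch, not part of ANTLR itself: the class name AntlrSetupCheck and its helper methods are made up for this illustration. On Java 11 or later it can be run directly as a source file with `java AntlrSetupCheck.java`.

```java
import java.io.File;
import java.util.regex.Pattern;

// Hypothetical helper for this tutorial; not part of the ANTLR distribution.
class AntlrSetupCheck {

    /** Returns true if the named class can be found on the current classpath. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, AntlrSetupCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns true if "." (the current directory) is one of the classpath entries. */
    static boolean listsCurrentDir(String classpath, String separator) {
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (entry.equals(".")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path", "");
        // The two class names below are the ones invoked by antlr4.bat and grun.bat.
        System.out.println("org.antlr.v4.Tool loadable:        " + isLoadable("org.antlr.v4.Tool"));
        System.out.println("org.antlr.v4.gui.TestRig loadable: " + isLoadable("org.antlr.v4.gui.TestRig"));
        System.out.println("\".\" listed on classpath:           "
                + listsCurrentDir(classpath, File.pathSeparator));
    }
}
```

If either class is reported as not loadable, the CLASSPATH entry most likely points at the folder instead of the jar file, or the command prompt was opened before the variable was saved.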

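II. Using ANTLR

1. Write the .g4 file

The .g4 file is the input from which ANTLR generates the lexer and the parser; it is how the grammar of the language is written down. A complete grammar is the foundation of the whole compiler-principles lab.

Below is the C grammar file used in my lab, named MyCGrammer.g4.

It is adapted from an existing C grammar, whose license header is kept below: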

/*
 [The "BSD licence"]
 Copyright (c) 2013 Sam Harwell
 All rights reserved.
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 3. The name of the author may not be used to endorse or promote products
    derived from this software without specific prior written permission.
 THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
 IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
 INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

/** C 2011 grammar built from the C11 Spec */
grammar MyCGrammer;

primaryExpression
    :   tokenId//Identifier
    |   tokenConstant//Constant
    |   tokenStringLiteral//StringLiteral+
    |   '(' expression ')'
    |   genericSelection
    |   '__extension__'? '(' compoundStatement ')' // Blocks (GCC extension)
    |   '__builtin_va_arg' '(' unaryExpression ',' typeName ')'
    |   '__builtin_offsetof' '(' typeName ',' unaryExpression ')'
    ;
tokenId
    :   Identifier
    ;
tokenConstant
    :   Constant
    ;
tokenStringLiteral
    :   StringLiteral+
    ;

genericSelection
    :   '_Generic' '(' assignmentExpression ',' genericAssocList ')'
    ;

genericAssocList
    :   genericAssociation
    |   genericAssocList ',' genericAssociation
    ;

genericAssociation
    :   typeName ':' assignmentExpression
    |   'default' ':' assignmentExpression
    ;

postfixExpression
    :   primaryExpression                                   #postfixExpression_pass
    |   postfixExpression '[' expression ']'                #postfixExpression_arrayaccess
    |   postfixExpression '(' argumentExpressionList? ')'   #postfixExpression_funcall
    |   postfixExpression '.' Identifier                    #postfixExpression_member
    |   postfixExpression '->' Identifier                   #postfixExpression_point
    |   postfixExpression '++'                              #postfixExpression_
    |   postfixExpression '--'                              #postfixExpression_
    |   '(' typeName ')' '{' initializerList '}'            #postfixExpression_pass
    |   '(' typeName ')' '{' initializerList ',' '}'        #postfixExpression_pass
    |   '__extension__' '(' typeName ')' '{' initializerList '}'    #postfixExpression_pass
    |   '__extension__' '(' typeName ')' '{' initializerList ',' '}'    #postfixExpression_pass
    ;

argumentExpressionList
    :   assignmentExpression
    |   argumentExpressionList ',' assignmentExpression
    ;

unaryExpression
    :   postfixExpression       #unaryExpression_pass
    |   '++' unaryExpression    #unaryExpression_
    |   '--' unaryExpression    #unaryExpression_
    |   unaryOperator castExpression    #unaryExpression_
    |   'sizeof' unaryExpression    #unaryExpression_pass
    |   'sizeof' '(' typeName ')'   #unaryExpression_pass
    |   '_Alignof' '(' typeName ')' #unaryExpression_pass
    |   '&&' Identifier    #unaryExpression_pass
    ;

unaryOperator
    :   '&' | '*' | '+' | '-' | '~' | '!'
    ;

castExpression
    :   unaryExpression         #castExpression_pass
    |   '(' typeName ')' castExpression     #castExpression_
    |   '__extension__' '(' typeName ')' castExpression #castExpression_
    ;

multiplicativeExpression
    :   castExpression          #multiplicativeExpression_pass
    |   multiplicativeExpression '*' castExpression #multiplicativeExpression_
    |   multiplicativeExpression '/' castExpression #multiplicativeExpression_
    |   multiplicativeExpression '%' castExpression #multiplicativeExpression_
    ;

additiveExpression
    :   multiplicativeExpression    #additiveExpression_pass
    |   additiveExpression '+' multiplicativeExpression #additiveExpression_
    |   additiveExpression '-' multiplicativeExpression #additiveExpression_
    ;

shiftExpression
    :   additiveExpression  #shiftExpression_pass
    |   shiftExpression '<<' additiveExpression #shiftExpression_
    |   shiftExpression '>>' additiveExpression #shiftExpression_
    ;

relationalExpression
    :   shiftExpression     #relationalExpression_pass
    |   relationalExpression '<' shiftExpression    #relationalExpression_
    |   relationalExpression '>' shiftExpression    #relationalExpression_
    |   relationalExpression '<=' shiftExpression   #relationalExpression_
    |   relationalExpression '>=' shiftExpression   #relationalExpression_
    ;

equalityExpression
    :   relationalExpression    #equalityExpression_pass
    |   equalityExpression '==' relationalExpression    #equalityExpression_
    |   equalityExpression '!=' relationalExpression    #equalityExpression_
    ;

andExpression
    :   equalityExpression  #andExpression_pass
    |   andExpression '&' equalityExpression    #andExpression_
    ;

exclusiveOrExpression
    :   andExpression   #exclusiveOrExpression_pass
    |   exclusiveOrExpression '^' andExpression #exclusiveOrExpression_
    ;

inclusiveOrExpression
    :   exclusiveOrExpression   #inclusiveOrExpression_pass
    |   inclusiveOrExpression '|' exclusiveOrExpression #inclusiveOrExpression_
    ;

logicalAndExpression
    :   inclusiveOrExpression   #logicalAndExpression_pass
    |   logicalAndExpression '&&' inclusiveOrExpression #logicalAndExpression_
    ;

logicalOrExpression
    :   logicalAndExpression    #logicalOrExpression_pass
    |   logicalOrExpression '||' logicalAndExpression   #logicalOrExpression_
    ;

conditionalExpression
    :   logicalOrExpression ('?' expression ':' conditionalExpression)?
    ;

assignmentExpression
    :   conditionalExpression   #assignmentExpression_pass
    |   unaryExpression assignmentOperator assignmentExpression #assignmentExpression_
    ;

assignmentOperator
    :   '=' | '*=' | '/=' | '%=' | '+=' | '-=' | '<<=' | '>>=' | '&=' | '^=' | '|='
    ;

expression
    :   assignmentExpression    #expression_
    |   expression ',' assignmentExpression #expression_pass
    ;

constantExpression
    :   conditionalExpression
    ;

declaration
    :   declarationSpecifiers initDeclaratorList? ';'
    |   staticAssertDeclaration
    ;

declarationSpecifiers
    :   declarationSpecifier+
    ;

declarationSpecifiers2
    :   declarationSpecifier+
    ;

declarationSpecifier
    :   storageClassSpecifier
    |   typeSpecifier
    |   typeQualifier
    |   functionSpecifier
    |   alignmentSpecifier
    ;

initDeclaratorList
    :   initDeclarator
    |   initDeclaratorList ',' initDeclarator
    ;

initDeclarator
    :   declarator
    |   declarator '=' initializer
    ;

storageClassSpecifier
    :   'typedef'
    |   'extern'
    |   'static'
    |   '_Thread_local'
    |   'auto'
    |   'register'
    ;

typeSpecifier
    :   'void'     #typeSpecifier_
    |   'char'      #typeSpecifier_
    |   'short'     #typeSpecifier_
    |   'int'       #typeSpecifier_
    |   'long'      #typeSpecifier_
    |   'float'     #typeSpecifier_
    |   'double'    #typeSpecifier_
    |   'signed'    #typeSpecifier_
    |   'unsigned'  #typeSpecifier_
    ;

structOrUnionSpecifier
    :   structOrUnion Identifier? '{' structDeclarationList '}'
    |   structOrUnion Identifier
    ;

structOrUnion
    :   'struct'
    |   'union'
    ;

structDeclarationList
    :   structDeclaration
    |   structDeclarationList structDeclaration
    ;

structDeclaration
    :   specifierQualifierList structDeclaratorList? ';'
    |   staticAssertDeclaration
    ;

specifierQualifierList
    :   typeSpecifier specifierQualifierList?
    |   typeQualifier specifierQualifierList?
    ;

structDeclaratorList
    :   structDeclarator
    |   structDeclaratorList ',' structDeclarator
    ;

structDeclarator
    :   declarator
    |   declarator? ':' constantExpression
    ;

enumSpecifier
    :   'enum' Identifier? '{' enumeratorList '}'
    |   'enum' Identifier? '{' enumeratorList ',' '}'
    |   'enum' Identifier
    ;

enumeratorList
    :   enumerator
    |   enumeratorList ',' enumerator
    ;

enumerator
    :   enumerationConstant
    |   enumerationConstant '=' constantExpression
    ;

enumerationConstant
    :   Identifier
    ;

atomicTypeSpecifier
    :   '_Atomic' '(' typeName ')'
    ;

typeQualifier
    :   'const'
    |   'restrict'
    |   'volatile'
    |   '_Atomic'
    ;

functionSpecifier
    :   ('inline'
    |   '_Noreturn'
    |   '__inline__' // GCC extension
    |   '__stdcall')
    |   gccAttributeSpecifier
    |   '__declspec' '(' Identifier ')'
    ;

alignmentSpecifier
    :   '_Alignas' '(' typeName ')'
    |   '_Alignas' '(' constantExpression ')'
    ;

declarator
    :   pointer? directDeclarator gccDeclaratorExtension*
    ;

directDeclarator
    :   Identifier          #directDeclarator_pass
    |   '(' declarator ')'  #directDeclarator_pass
    |   directDeclarator '[' typeQualifierList? assignmentExpression? ']'   #directDeclarator_array
    |   directDeclarator '[' 'static' typeQualifierList? assignmentExpression ']'   #directDeclarator_array
    |   directDeclarator '[' typeQualifierList 'static' assignmentExpression ']'    #directDeclarator_array
    |   directDeclarator '[' typeQualifierList? '*' ']'     #directDeclarator_array
    |   directDeclarator '(' parameterTypeList ')'          #directDeclarator_func
    |   directDeclarator '(' identifierList? ')'            #directDeclarator_func
    ;

gccDeclaratorExtension
    :   '__asm' '(' StringLiteral+ ')'
    |   gccAttributeSpecifier
    ;

gccAttributeSpecifier
    :   '__attribute__' '(' '(' gccAttributeList ')' ')'
    ;

gccAttributeList
    :   gccAttribute (',' gccAttribute)*
    |   // empty
    ;

gccAttribute
    :   ~(',' | '(' | ')') // relaxed def for "identifier or reserved word"
        ('(' argumentExpressionList? ')')?
    |   // empty
    ;

nestedParenthesesBlock
    :   (   ~('(' | ')')
        |   '(' nestedParenthesesBlock ')'
        )*
    ;

pointer
    :   '*' typeQualifierList?
    |   '*' typeQualifierList? pointer
    |   '^' typeQualifierList? // Blocks language extension
    |   '^' typeQualifierList? pointer // Blocks language extension
    ;

typeQualifierList
    :   typeQualifier
    |   typeQualifierList typeQualifier
    ;

parameterTypeList
    :   parameterList
    |   parameterList ',' '...'
    ;

parameterList
    :   parameterDeclaration
    |   parameterList ',' parameterDeclaration
    ;

parameterDeclaration
    :   declarationSpecifiers declarator
    |   declarationSpecifiers2 abstractDeclarator?
    ;

identifierList
    :   Identifier
    |   identifierList ',' Identifier
    ;

typeName
    :   specifierQualifierList abstractDeclarator?
    ;

abstractDeclarator
    :   pointer
    |   pointer? directAbstractDeclarator gccDeclaratorExtension*
    ;

directAbstractDeclarator
    :   '(' abstractDeclarator ')' gccDeclaratorExtension*
    |   '[' typeQualifierList? assignmentExpression? ']'
    |   '[' 'static' typeQualifierList? assignmentExpression ']'
    |   '[' typeQualifierList 'static' assignmentExpression ']'
    |   '[' '*' ']'
    |   '(' parameterTypeList? ')' gccDeclaratorExtension*
    |   directAbstractDeclarator '[' typeQualifierList? assignmentExpression? ']'
    |   directAbstractDeclarator '[' 'static' typeQualifierList? assignmentExpression ']'
    |   directAbstractDeclarator '[' typeQualifierList 'static' assignmentExpression ']'
    |   directAbstractDeclarator '[' '*' ']'
    |   directAbstractDeclarator '(' parameterTypeList? ')' gccDeclaratorExtension*
    ;


initializer
    :   assignmentExpression
    |   '{' initializerList '}'
    |   '{' initializerList ',' '}'
    ;

initializerList
    :   designation? initializer
    |   initializerList ',' designation? initializer
    ;

designation
    :   designatorList '='
    ;

designatorList
    :   designator
    |   designatorList designator
    ;

designator
    :   '[' constantExpression ']'
    |   '.' Identifier
    ;

staticAssertDeclaration
    :   '_Static_assert' '(' constantExpression ',' StringLiteral+ ')' ';'
    ;

statement
    :   labeledStatement
    |   compoundStatement
    |   expressionStatement
    |   selectionStatement
    |   iterationStatement
    |   jumpStatement
    |   ('__asm' | '__asm__') ('volatile' | '__volatile__') '(' (logicalOrExpression (',' logicalOrExpression)*)? (':' (logicalOrExpression (',' logicalOrExpression)*)?)* ')' ';'
    ;

labeledStatement
    :   Identifier ':' statement                   
    |   'case' constantExpression ':' statement     
    |   'default' ':' statement
    ;

compoundStatement
    :   '{' blockItemList? '}'
    ;

blockItemList
    :   blockItem
    |   blockItemList blockItem
    ;

blockItem
    :   declaration
    |   statement
    ;

expressionStatement
    :   expression? ';'
    ;

selectionStatement
    :   'if' '(' expression ')' statement ('else' statement)?   #selectionStatement_if
    |   'switch' '(' expression ')' statement                   #selectionStatement_switch
    ;

iterationStatement
    :   'while' '(' expression ')' statement                    #iterationStatement_while
    |   'do' statement 'while' '(' expression ')' ';'           #iterationStatement_dowhile
    |   'for' '(' expression? ';' expression? ';' expression? ')' statement     #iterationStatement_for
    |   'for' '(' declaration expression? ';' expression? ')' statement         #iterationStatement_forDeclared
    ;

jumpStatement
    :   'goto' Identifier ';'           #jumpStatement_goto
    |   'continue' ';'                  #jumpStatement_continue
    |   'break' ';'                     #jumpStatement_break
    |   'return' expression? ';'        #jumpStatement_return
    |   'goto' unaryExpression ';'      #jumpStatement_        // GCC extension     
    ;

compilationUnit
    :   translationUnit? EOF
    ;

translationUnit
    :   externalDeclaration
    |   translationUnit externalDeclaration
    ;

externalDeclaration
    :   functionDefinition
    |   declaration
    |   ';' // stray ;
    ;

functionDefinition
    :   declarationSpecifiers? declarator declarationList? compoundStatement
    ;

declarationList
    :   declaration
    |   declarationList declaration
    ;

functionCall: tokenId '(' argumentExpressionList? ')'	#functionCall_	;

Auto : 'auto';
Break : 'break';
Case : 'case';
Char : 'char';
Const : 'const';
Continue : 'continue';
Default : 'default';
Do : 'do';
Double : 'double';
Else : 'else';
Enum : 'enum';
Extern : 'extern';
Float : 'float';
For : 'for';
Goto : 'goto';
If : 'if';
Inline : 'inline';
Int : 'int';
Long : 'long';
Register : 'register';
Restrict : 'restrict';
Return : 'return';
Short : 'short';
Signed : 'signed';
Sizeof : 'sizeof';
Static : 'static';
Struct : 'struct';
Switch : 'switch';
Typedef : 'typedef';
Union : 'union';
Unsigned : 'unsigned';
Void : 'void';
Volatile : 'volatile';
While : 'while';

Alignas : '_Alignas';
Alignof : '_Alignof';
Atomic : '_Atomic';
Bool : '_Bool';
Complex : '_Complex';
Generic : '_Generic';
Imaginary : '_Imaginary';
Noreturn : '_Noreturn';
StaticAssert : '_Static_assert';
ThreadLocal : '_Thread_local';

LeftParen : '(';
RightParen : ')';
LeftBracket : '[';
RightBracket : ']';
LeftBrace : '{';
RightBrace : '}';

Less : '<';
LessEqual : '<=';
Greater : '>';
GreaterEqual : '>=';
LeftShift : '<<';
RightShift : '>>';

Plus : '+';
PlusPlus : '++';
Minus : '-';
MinusMinus : '--';
Star : '*';
Div : '/';
Mod : '%';

And : '&';
Or : '|';
AndAnd : '&&';
OrOr : '||';
Caret : '^';
Not : '!';
Tilde : '~';

Question : '?';
Colon : ':';
Semi : ';';
Comma : ',';

Assign : '=';
// '*=' | '/=' | '%=' | '+=' | '-=' | '<<=' | '>>=' | '&=' | '^=' | '|='
StarAssign : '*=';
DivAssign : '/=';
ModAssign : '%=';
PlusAssign : '+=';
MinusAssign : '-=';
LeftShiftAssign : '<<=';
RightShiftAssign : '>>=';
AndAssign : '&=';
XorAssign : '^=';
OrAssign : '|=';

Equal : '==';
NotEqual : '!=';

Arrow : '->';
Dot : '.';
Ellipsis : '...';

Identifier
    :   IdentifierNondigit
        (   IdentifierNondigit
        |   Digit
        )*
    ;

fragment
IdentifierNondigit
    :   Nondigit
    |   UniversalCharacterName
    //|   // other implementation-defined characters...
    ;

fragment
Nondigit
    :   [a-zA-Z_]
    ;

fragment
Digit
    :   [0-9]
    ;

fragment
UniversalCharacterName
    :   '\\u' HexQuad
    |   '\\U' HexQuad HexQuad
    ;

fragment
HexQuad
    :   HexadecimalDigit HexadecimalDigit HexadecimalDigit HexadecimalDigit
    ;

Constant
    :   IntegerConstant
    |   FloatingConstant
    //|   EnumerationConstant
    |   CharacterConstant
    ;

fragment
IntegerConstant
    :   DecimalConstant IntegerSuffix?
    |   OctalConstant IntegerSuffix?
    |   HexadecimalConstant IntegerSuffix?
    |	BinaryConstant
    ;

fragment
BinaryConstant
	:	'0' [bB] [0-1]+
	;

fragment
DecimalConstant
    :   NonzeroDigit Digit*
    ;

fragment
OctalConstant
    :   '0' OctalDigit*
    ;

fragment
HexadecimalConstant
    :   HexadecimalPrefix HexadecimalDigit+
    ;

fragment
HexadecimalPrefix
    :   '0' [xX]
    ;

fragment
NonzeroDigit
    :   [1-9]
    ;

fragment
OctalDigit
    :   [0-7]
    ;

fragment
HexadecimalDigit
    :   [0-9a-fA-F]
    ;

fragment
IntegerSuffix
    :   UnsignedSuffix LongSuffix?
    |   UnsignedSuffix LongLongSuffix
    |   LongSuffix UnsignedSuffix?
    |   LongLongSuffix UnsignedSuffix?
    ;

fragment
UnsignedSuffix
    :   [uU]
    ;

fragment
LongSuffix
    :   [lL]
    ;

fragment
LongLongSuffix
    :   'll' | 'LL'
    ;

fragment
FloatingConstant
    :   DecimalFloatingConstant
    |   HexadecimalFloatingConstant
    ;

fragment
DecimalFloatingConstant
    :   FractionalConstant ExponentPart? FloatingSuffix?
    |   DigitSequence ExponentPart FloatingSuffix?
    ;

fragment
HexadecimalFloatingConstant
    :   HexadecimalPrefix HexadecimalFractionalConstant BinaryExponentPart FloatingSuffix?
    |   HexadecimalPrefix HexadecimalDigitSequence BinaryExponentPart FloatingSuffix?
    ;

fragment
FractionalConstant
    :   DigitSequence? '.' DigitSequence
    |   DigitSequence '.'
    ;

fragment
ExponentPart
    :   'e' Sign? DigitSequence
    |   'E' Sign? DigitSequence
    ;

fragment
Sign
    :   '+' | '-'
    ;

fragment
DigitSequence
    :   Digit+
    ;

fragment
HexadecimalFractionalConstant
    :   HexadecimalDigitSequence? '.' HexadecimalDigitSequence
    |   HexadecimalDigitSequence '.'
    ;

fragment
BinaryExponentPart
    :   'p' Sign? DigitSequence
    |   'P' Sign? DigitSequence
    ;

fragment
HexadecimalDigitSequence
    :   HexadecimalDigit+
    ;

fragment
FloatingSuffix
    :   'f' | 'l' | 'F' | 'L'
    ;

fragment
CharacterConstant
    :   '\'' CCharSequence '\''
    |   'L\'' CCharSequence '\''
    |   'u\'' CCharSequence '\''
    |   'U\'' CCharSequence '\''
    ;

fragment
CCharSequence
    :   CChar+
    ;

fragment
CChar
    :   ~['\\\r\n]
    |   EscapeSequence
    ;
fragment
EscapeSequence
    :   SimpleEscapeSequence
    |   OctalEscapeSequence
    |   HexadecimalEscapeSequence
    |   UniversalCharacterName
    ;
fragment
SimpleEscapeSequence
    :   '\\' ['"?abfnrtv\\]
    ;
fragment
OctalEscapeSequence
    :   '\\' OctalDigit
    |   '\\' OctalDigit OctalDigit
    |   '\\' OctalDigit OctalDigit OctalDigit
    ;
fragment
HexadecimalEscapeSequence
    :   '\\x' HexadecimalDigit+
    ;
StringLiteral
    :   EncodingPrefix? '"' SCharSequence? '"'
    ;
fragment
EncodingPrefix
    :   'u8'
    |   'u'
    |   'U'
    |   'L'
    ;
fragment
SCharSequence
    :   SChar+
    ;
fragment
SChar
    :   ~["\\\r\n]
    |   EscapeSequence
    |   '\\\n'   // Added line
    |   '\\\r\n' // Added line
    ;

ComplexDefine
    :   '#' Whitespace? 'define'  ~[#]*
        -> skip
    ;
         
// ignore the following asm blocks:
/*
    asm
    {
        mfspr x, 286;
    }
 */
AsmBlock
    :   'asm' ~'{'* '{' ~'}'* '}'
	-> skip
    ;
	
// ignore the lines generated by c preprocessor                                   
// sample line : '#line 1 "/home/dm/files/dk1.h" 1'                           
LineAfterPreprocessing
    :   '#line' Whitespace* ~[\r\n]*
        -> skip
    ;  

LineDirective
    :   '#' Whitespace? DecimalConstant Whitespace? StringLiteral ~[\r\n]*
        -> skip
    ;

PragmaDirective
    :   '#' Whitespace? 'pragma' Whitespace ~[\r\n]*
        -> skip
    ;

Whitespace
    :   [ \t]+
        -> skip
    ;

Newline
    :   (   '\r' '\n'?
        |   '\n'
        )
        -> skip
    ;

BlockComment
    :   '/*' .*? '*/'
        -> skip
    ;

LineComment
    :   '//' ~[\r\n]*
        -> skip
    ;

2. Generate the lexer and parser with ANTLR

Open a command prompt in the directory that contains MyCGrammer.g4.

Enter:

antlr4 MyCGrammer.g4 -visitor

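The -visitor option additionally generates the visitor interface and base class, which ANTLR does not produce by default (only the listener is generated by default). Listener and visitor are ANTLR's two ways of traversing the parse tree; for the steps in this tutorial it makes little difference whether the visitor is generated.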
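The difference between the two traversal styles can be seen on a toy tree. The sketch below deliberately does not use the ANTLR runtime: Node, Num, Add, Visitor, Listener, and walk are all hypothetical stand-ins invented for this illustration. In the visitor style the visitor itself decides when to descend into children and each method returns a value; in the listener style a walker drives the traversal and fires enter/exit callbacks that return nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; none of these types belong to the ANTLR runtime.
class TraversalStyles {

    /** A tiny expression tree: a number leaf or an addition of two subtrees. */
    interface Node {
        <T> T accept(Visitor<T> visitor);
    }

    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        public <T> T accept(Visitor<T> visitor) { return visitor.visitNum(this); }
    }

    static final class Add implements Node {
        final Node left;
        final Node right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        public <T> T accept(Visitor<T> visitor) { return visitor.visitAdd(this); }
    }

    /** Visitor style: methods return values and choose when to visit children. */
    interface Visitor<T> {
        T visitNum(Num num);
        T visitAdd(Add add);
    }

    /** Listener style: callbacks without return values, fired by a walker. */
    interface Listener {
        void enterAdd(Add add);
        void exitAdd(Add add);
        void visitNumLeaf(Num num);
    }

    /** The walker owns the depth-first traversal order. */
    static void walk(Node node, Listener listener) {
        if (node instanceof Add) {
            Add add = (Add) node;
            listener.enterAdd(add);
            walk(add.left, listener);
            walk(add.right, listener);
            listener.exitAdd(add);
        } else {
            listener.visitNumLeaf((Num) node);
        }
    }

    /** Evaluates the tree with a visitor that returns the sum of each subtree. */
    static int evaluate(Node tree) {
        return tree.accept(new Visitor<Integer>() {
            public Integer visitNum(Num num) { return num.value; }
            public Integer visitAdd(Add add) {
                return add.left.accept(this) + add.right.accept(this);
            }
        });
    }

    /** Records the sequence of listener events produced by the walker. */
    static List<String> trace(Node tree) {
        List<String> events = new ArrayList<>();
        walk(tree, new Listener() {
            public void enterAdd(Add add) { events.add("enter+"); }
            public void exitAdd(Add add) { events.add("exit+"); }
            public void visitNumLeaf(Num num) { events.add("num:" + num.value); }
        });
        return events;
    }

    public static void main(String[] args) {
        Node tree = new Add(new Num(1), new Add(new Num(2), new Num(3)));
        System.out.println(evaluate(tree)); // 6
        System.out.println(trace(tree));    // [enter+, num:1, enter+, num:2, num:3, exit+, exit+]
    }
}
```

A visitor tends to be convenient when each subtree should produce a result, such as a value or a piece of intermediate code, while a listener is convenient when it is enough to react to entering and leaving rules.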

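ANTLR then generates the following files in the directory: MyCGrammerLexer.java, MyCGrammerParser.java, MyCGrammerListener.java, MyCGrammerBaseListener.java, MyCGrammerVisitor.java, and MyCGrammerBaseVisitor.java, together with the auxiliary .tokens and .interp files.

Next, compile them. On the command line, enter: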

javac MyCGrammer*.java

The lexer and parser for C are now generated.

3. Test

On the command line, enter:

grun MyCGrammer compilationUnit -tokens

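Then type a piece of C code and press Ctrl+Z followed by Enter to end the input. The tokens produced by lexical analysis of that code are printed.

On the command line, enter: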

grun MyCGrammer compilationUnit -gui

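Again type a piece of C code and end the input with Ctrl+Z followed by Enter. The parse tree for that code is displayed.

The available options are:

-tokens: prints the token stream.

-tree: prints the parse tree in LISP format.

-gui: displays the parse tree visually in a dialog box.

-ps file.ps: generates a visual representation of the parse tree in PostScript and stores it in file.ps.

-encoding encodingname: specifies the encoding of the test rig's input file if the current locale would not read the input correctly.

-trace: prints the rule name and the current token on entering and leaving each rule.

-diagnostics: turns on diagnostic messages during parsing.

-SLL: uses a faster but slightly weaker parsing strategy.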
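The test rig also accepts one or more input file names after the options, which avoids typing the code and ending it with Ctrl+Z each time. The sketch below assembles such a command line from Java. The class GrunCommand and the input file name test.c are assumptions made for this illustration; the process is only started when the program is called with --run, and it relies on CLASSPATH being set as described in section I.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for this tutorial; it only wraps the command that grun.bat runs.
class GrunCommand {

    /**
     * Builds the equivalent of: grun <grammar> <startRule> [options] [inputFile].
     * inputFile may be null, in which case the test rig reads standard input.
     */
    static List<String> build(String grammar, String startRule,
                              List<String> options, String inputFile) {
        List<String> command = new ArrayList<>(
                Arrays.asList("java", "org.antlr.v4.gui.TestRig", grammar, startRule));
        command.addAll(options);
        if (inputFile != null) {
            command.add(inputFile);
        }
        return command;
    }

    public static void main(String[] args) throws Exception {
        // "test.c" is an assumed file name; replace it with a real C source file.
        List<String> command = build("MyCGrammer", "compilationUnit",
                Arrays.asList("-tokens", "-tree"), "test.c");
        System.out.println(String.join(" ", command));

        // Only launch the test rig when explicitly requested.
        if (args.length > 0 && args[0].equals("--run")) {
            int exitCode = new ProcessBuilder(command).inheritIO().start().waitFor();
            System.out.println("TestRig exited with code " + exitCode);
        }
    }
}
```

The same command can of course be typed by hand, for example `grun MyCGrammer compilationUnit -tokens test.c`, run from the directory that holds the compiled classes.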
 

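III. Packaging the lexer and parser

Everything so far was done on the command line. To use the lexer and parser inside a project, the generated files need to be packaged.

In the directory with the generated files, create two folders: lib and MyCGrammer.

Copy the downloaded antlr-4.7.2-complete.jar into the lib folder.

Open the directory in IDEA (I used IntelliJ IDEA Community Edition 2021.1; other versions should be similar, or look up the packaging steps for your IDE).

Open each generated .java file and add this line at the top: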

package MyCGrammer;

Then move these files into the MyCGrammer folder.
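Adding the package line by hand to every generated file is easy to get wrong, so it can be scripted. The following is a minimal sketch under the assumption that the generated files match MyCGrammer*.java; the class PackageHeader and its method are hypothetical names for this illustration. As an alternative, the ANTLR tool has a -package option (for example `antlr4 -package MyCGrammer MyCGrammer.g4 -visitor`) that writes the declaration at generation time.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for this tutorial; not part of ANTLR or IDEA.
class PackageHeader {

    /** Returns the source with a package declaration prepended, unless it already has one. */
    static String withPackage(String source, String packageName) {
        for (String line : source.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("package ")) {
                return source; // already declared, leave the file unchanged
            }
            if (trimmed.startsWith("import ") || trimmed.startsWith("public ")) {
                break; // past the place where a package declaration could appear
            }
        }
        return "package " + packageName + ";\n\n" + source;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java PackageHeader.java <directory-with-generated-files>");
            return;
        }
        Path directory = Paths.get(args[0]);
        // Rewrites every generated file in place; run it on a copy first if unsure.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(directory, "MyCGrammer*.java")) {
            for (Path file : files) {
                String source = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                String updated = withPackage(source, "MyCGrammer");
                Files.write(file, updated.getBytes(StandardCharsets.UTF_8));
                System.out.println("processed " + file.getFileName());
            }
        }
    }
}
```

Running it twice is harmless, because a file that already starts with a package declaration is written back unchanged.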

In the top-left menu, click File -> Project Structure -> Artifacts -> JAR -> From modules with dependencies -> copy to...

Then click Build -> Build Artifacts.

The jar is then generated in the corresponding out directory.

———————————————————————————————————————————

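That is the whole ANTLR tutorial. Later posts will build on it to perform lexical analysis, parsing, intermediate code generation, and target code generation for C.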


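Copyright notice: this is an original article by weixin_43877853, licensed under CC 4.0 BY-SA. When reposting, include a link to the original source and this notice.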