[Compiler Principles] BUPT Lexical Analysis Lab Report

Published 2022-10-20


Code repository

https://github.com/sinkers-lan/lexicalAnalyzer

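Lexical Analysis for the C Language: Lab Report

Lab topic

Design and implementation of a lexical analyzer

Lab requirements

  1. Choose a source language, such as C, Pascal, Python, or Java; any language is acceptable.
  2. Recognize every word in a source program written in that language, and output each word as a token.
  3. Recognize and skip comments in the source program.
  4. Count the number of statement lines, the number of words in each class, and the total number of characters, and output the statistics.
  5. Detect lexical errors in the source program and report where they occur.
  6. Recover appropriately from errors so that lexical analysis can continue, so that a single scan of the source program detects and reports all of its lexical errors.

The lexical analyzer is written by hand, with C/C++ as the implementation language.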

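Program design

Language description

The C language defines the following tokens and words:

  1. Identifiers: a C identifier consists only of letters, digits, and underscores, and its first character must be a letter or an underscore.
  2. Reserved words: a subset of the identifiers. This report uses the following 33 keywords (keywords added in C11 are not included):
    auto, break, case, char, const, continue, default, do, double, else, enum, extern, float, for, goto, if, inline, int, long, register, restrict, return, short, signed, sizeof, static, struct, switch, typedef, union, unsigned, void, volatile
    Note that this list omits while, as well as the C99 keywords _Bool, _Complex, and _Imaginary, so a lexer built on it would treat those words as ordinary identifiers.
  3. Integer constants:
    An integer constant can be decimal, octal, or hexadecimal. A prefix selects the base: 0x or 0X means hexadecimal, 0 means octal, and no prefix means decimal.
    An integer constant can also carry a suffix that combines U and L: U means unsigned and L means long. The suffix can be uppercase or lowercase, and U and L can appear in either order.
  4. Floating constants:
    A floating constant consists of an integer part, a decimal point, a fractional part, and an exponent part. It can be written in decimal form or in exponent form.
    In decimal form, it must contain an integer part, a fractional part, or both. In exponent form, it must contain a decimal point, an exponent, or both. The signed exponent is introduced by e or E.
    A floating constant can also carry a suffix, L or F: L denotes long double and F denotes float (a constant without a suffix has type double). The suffix can be uppercase or lowercase.
  5. Operators:
    1. Relational operators (6): <, <=, >, >=, ==, !=
    2. Arithmetic operators (7): +, -, *, /, %, ++, --
    3. Logical operators (3): &&, ||, !
    4. Bitwise operators (6): &, |, ~, ^, <<, >>
    5. Assignment operators (11): =, +=, -=, *=, /=, %=, &=, |=, ^=, >>=, <<=
    6. Conditional operator (1): ?:
    7. Comma operator (1): ,
    8. Pointer operators (2): *, &
    9. Special operators (4): (), [], ->, .
  6. Punctuation: {, }, :, ', ", ;, ,, #
  7. Comment markers:
    1. A comment that starts with /* ends with */;
    2. A comment that starts with // ends at the newline.
  8. Separator between words: the space character.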
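The integer-suffix rule in item 3 can be sketched as a small classifier. This is only an illustration, not the repository's code: the name classify_int_suffix and the enum values are hypothetical, and the sketch assumes the suffix characters have already been separated from the digits.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical kinds for illustration; the report itself uses the
 * I/L/LL/U/UL/ULL token macros. */
enum int_kind { K_INT, K_LONG, K_LLONG, K_UINT, K_ULONG, K_ULLONG, K_BAD };

/* Classify an integer-constant suffix such as "", "u", "LU" or "llu".
 * Assumption: `suffix` holds only the characters after the digits. */
enum int_kind classify_int_suffix(const char *suffix) {
    int u = 0; /* 1 once a U has been seen */
    int l = 0; /* 0 = none, 1 = L, 2 = LL */
    size_t n = strlen(suffix);
    for (size_t i = 0; i < n; i++) {
        char c = (char)tolower((unsigned char)suffix[i]);
        if (c == 'u') {
            if (u) return K_BAD;          /* at most one U */
            u = 1;
        } else if (c == 'l') {
            if (l) return K_BAD;          /* the L part may appear only once */
            if (i + 1 < n && suffix[i + 1] == suffix[i]) {
                l = 2;                    /* "ll" or "LL": same case, adjacent */
                i++;
            } else {
                l = 1;
            }
        } else {
            return K_BAD;                 /* any other character is invalid */
        }
    }
    if (u) return l == 0 ? K_UINT : (l == 1 ? K_ULONG : K_ULLONG);
    return l == 0 ? K_INT : (l == 1 ? K_LONG : K_LLONG);
}
```

With this sketch, "ul" and "LU" both classify as unsigned long, while a mixed-case "lL" is rejected.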

Regular grammars of the tokens

In the rules below, | separates alternatives and '|' denotes the literal bar character.

  1. Grammar of identifiers
    id → letter rid | _ rid
    rid → ε | letter rid | _ rid | digit rid
    letter → A | B | ... | Z | a | b | ... | z
    digit → 0 | 1 | 2 | ... | 9
  2. Grammar of integers
    digits → digit remainder
    remainder → ε | digit remainder
  3. Grammar of unsigned numbers
    num → digit num1
    num1 → digit num1 | . num2 | E num4 | ε
    num2 → digit num3
    num3 → digit num3 | E num4 | ε
    num4 → + digits | - digits | digit num5
    digits → digit num5
    num5 → digit num5 | ε
  4. Regular grammar of relational operators
    relop → < | < equal | = equal | > | > equal | ! equal
    greater → >
    equal → =
  5. Regular grammar of arithmetic operators
    ariop → + | - | * | / | % | + plus | - minus
    plus → +
    minus → -
  6. Regular grammar of logical operators
    logop → & and | '|' or | !
    and → &
    or → '|'
  7. Regular grammar of bitwise operators
    bitop → & | '|' | ~ | ^ | < left | > right
    left → <
    right → >
  8. Regular grammar of assignment operators
    assop → = | + equal | - equal | * equal | / equal | % equal | & equal | '|' equal | ^ equal | > right | < left
    right → > equal
    left → < equal
    equal → =
  9. Grammar of the conditional operator:
    conop → ?
  10. Grammar of the comma operator:
    comop → ,
  11. Grammar of pointer operators:
    ponop → * | &
  12. Grammar of special operators:
    speop → ( | ) | [ | ] | . | - right
    right → >
  13. Grammar of punctuation:
    symbol → { | } | : | ' | " | ; | , | #
  14. Grammar of comment openers:
    note → / star | / slash
    star → *
    slash → /
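The unsigned-number grammar (item 3 in the list above) maps directly onto a small deterministic automaton, with one state per nonterminal. The following is a minimal sketch under stated assumptions: is_unsigned_number and the state names are hypothetical, lowercase e is accepted alongside E, and the function checks a whole string rather than scanning a buffer as the real lexer does.

```c
#include <ctype.h>

/* One state per nonterminal of the unsigned-number grammar. */
enum { S_NUM, S_NUM1, S_NUM2, S_NUM3, S_NUM4, S_DIGITS, S_NUM5, S_FAIL };

/* Returns 1 if the whole string s can be derived from `num`, else 0. */
int is_unsigned_number(const char *s) {
    int st = S_NUM;
    for (; *s != '\0'; s++) {
        int d = isdigit((unsigned char)*s) != 0;
        int e = (*s == 'E' || *s == 'e');
        switch (st) {
        case S_NUM:    st = d ? S_NUM1 : S_FAIL; break;
        case S_NUM1:   st = d ? S_NUM1 : (*s == '.') ? S_NUM2 : e ? S_NUM4 : S_FAIL; break;
        case S_NUM2:   st = d ? S_NUM3 : S_FAIL; break;
        case S_NUM3:   st = d ? S_NUM3 : e ? S_NUM4 : S_FAIL; break;
        case S_NUM4:   st = d ? S_NUM5 : (*s == '+' || *s == '-') ? S_DIGITS : S_FAIL; break;
        case S_DIGITS: st = d ? S_NUM5 : S_FAIL; break;
        case S_NUM5:   st = d ? S_NUM5 : S_FAIL; break;
        default:       return 0; /* already failed on an earlier character */
        }
    }
    /* num1, num3 and num5 are the nonterminals that can derive ε. */
    return st == S_NUM1 || st == S_NUM3 || st == S_NUM5;
}
```

Under this grammar "12E" and "12E+" are rejected because the exponent has no digits. Forms such as "123." or ".5" are rejected as well, since the grammar requires digits on both sides of the point; the test program later in the report exercises those forms, so the actual lexer presumably accepts a wider set than this grammar.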

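State transition diagram

[Figure: state transition diagram of the lexical analyzer; the image is not preserved in this copy]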

Output format

Regular expression  Token  Attribute
auto KEY 0
break KEY 1
case KEY 2
char KEY 3
const KEY 4
continue KEY 5
default KEY 6
do KEY 7
double KEY 8
else KEY 9
enum KEY 10
extern KEY 11
float KEY 12
for KEY 13
goto KEY 14
if KEY 15
inline KEY 16
int KEY 17
long KEY 18
register KEY 19
restrict KEY 20
return KEY 21
short KEY 22
signed KEY 23
sizeof KEY 24
static KEY 25
struct KEY 26
switch KEY 27
typedef KEY 28
union KEY 29
unsigned KEY 30
void KEY 31
volatile KEY 32
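integer (int) I val
integer (long) L val
integer (long long) LL val
floating (float) F val
floating (double) D val
floating (long double) LD val
integer (unsigned int) U val
integer (unsigned long) UL val
integer (unsigned long long) ULL val
user-defined identifier (id) ID index of the word in the symbol table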
< REL 0
<= REL 1
> REL 2
>= REL 3
== REL 4
!= REL 5
= ASS 0
+= ASS 1
-= ASS 2
*= ASS 3
/= ASS 4
%= ASS 5
&= ASS 6
|= ASS 7
^= ASS 8
>>= ASS 9
<<= ASS 10
~ ASS 11
& BIT 0
| BIT 1
~ BIT 2
^ BIT 3
<< BIT 4
>> BIT 5
&& LOG 0
|| LOG 1
! LOG 2
+ ARI 0
- ARI 1
* ARI 2
/ ARI 3
% ARI 4
++ ARI 5
-- ARI 6
( SPE 0
) SPE 1
[ SPE 2
] SPE 3
-> SPE 4
. SPE 5
/* NOT 0
// NOT 1
{ SYM 0
} SYM 1
: SYM 2
' SYM 3
" SYM 4
; SYM 5
, SYM 6
# SYM 7
string constant (string) STR index of the word in the string table
character constant (char) CHR index of the word in the string table

Global variables and procedures

#include <stdio.h> // FILE

#define KEYSIZE 33
#define BUFFERSIZE 1000
#define IDCAPACITY 1000

#define KEY 0
#define ID 1
#define REL 4
#define ASS 5
#define BIT 6
#define LOG 7
#define ARI 8
#define SPE 9
#define NOT 10
#define SYM 11
#define SEL 12
#define STR 13
#define CHR 14
#define U 15
#define UL 16
#define ULL 17
#define I 18
#define L 19
#define LL 20
#define F 21
#define D 22
#define LD 23

int token_num[24]; // number of words seen for each token class

union attributes {
    int inum;
    long lnum;
    long long llnum;
    float fnum;
    double dnum;
    long double ldnum;
    unsigned int unum;
    unsigned long ulnum;
    unsigned long long ullnum;
}; // attribute part of a (token, attribute) pair

struct tokens {
    int token;
    union attributes attribute; 
};
struct tokens binary; // (token, attribute) pair of the current word

char* signTable; // symbol table holding user-defined identifiers
char* strTable; // table holding string constants
int signPos; // current end position of the symbol table
int str_signPos;  // current end position of the string constant table
char keyTable[KEYSIZE][10] = {"auto", "break", "case", "char", "const", "continue", "default", "do", "double", "else", "enum", "extern", "float", "for", "goto", "if", "inline", "int", "long", "register", "restrict", "return", "short", "signed", "sizeof", "static", "struct", "switch", "typedef", "union", "unsigned", "void", "volatile"}; // keyword strings
int state; // current state
char C; // character just read
int isKey; // -1 if the recognized word is a user-defined identifier; otherwise the token code of the keyword
char token[1000]; // text of the word currently being recognized
int pos; // end of the string in token
int forward; // forward pointer
int line; // current line number
int total; // character count excluding newlines and spaces
int total_ns; // character count including newlines and spaces
char buffer[BUFFERSIZE * 2]; // input buffer (two halves)
FILE* fp; // input file
int re_flag; // set when the forward pointer is retracted, so the buffer is not reloaded twice

void get_char(void); // read the character at forward from the input buffer into C, then advance forward to the next character
void get_nbc(void); // if C holds a space, call get_char repeatedly until C holds a non-blank character
void cat(void); // append the character in C to the string in token
void retract(void); // move forward back by one character and drop the last character of token
void combine(int tokens, long long attribute_i, long double attribute_f); // combine a token code and its attribute into a pair
void error(int log); // handle a detected error
int letter(void); // return 1 if C holds a letter, 0 otherwise
int digit(void); // return 1 if C holds a digit, 0 otherwise
int reserve(void); // look up the word in token in the keyword table; return the keyword's code if found, -1 otherwise
void table_insert(void); // insert the recognized user-defined identifier (the word in token) into the symbol table
void str_table_insert(void); // insert the recognized string (the word in token) into the string constant table
void token_print(void); // print the recognized token
void buffer_fill(int start); // fill one half of the buffer
void outcome_print(void); // print the statistics
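The body of reserve() is not shown in this report. As a rough sketch of the lookup its comment describes, a linear search over the keyword table could look like the following. The name reserve_sketch is hypothetical, and unlike the declared reserve(void), which reads the global token, this version takes the word as a parameter so that it is self-contained.

```c
#include <string.h>

#define KEYSIZE 33

/* Same 33 keywords, in the same order, as keyTable in the report. */
static const char *const key_table[KEYSIZE] = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "inline", "int", "long", "register", "restrict", "return", "short",
    "signed", "sizeof", "static", "struct", "switch", "typedef", "union",
    "unsigned", "void", "volatile"
};

/* Return the keyword's index (its KEY attribute) or -1 if word is not a keyword. */
int reserve_sketch(const char *word) {
    for (int i = 0; i < KEYSIZE; i++) {
        if (strcmp(word, key_table[i]) == 0) {
            return i;
        }
    }
    return -1;
}
```

The returned index matches the attribute column of the output table, for example 17 for int and 31 for void. Because the table has no entry for while, this lookup returns -1 for it.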

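Writing the lexical analyzer

This section briefly describes the main parts of the program; the complete program is in the attached source code.

get_char() fetches the next character from the buffer, advances the forward pointer, counts the program's lines, and refills the buffer when the forward pointer reaches a buffer boundary.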

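void get_char(void) {
    C = buffer[forward];
    if (C == '\n') {
        line++; // count lines as characters are consumed
    }
    forward++;
    // Refill the other half when a boundary is crossed, unless we are
    // re-reading a character after retract().
    if (forward == BUFFERSIZE && re_flag == 0) {
        buffer_fill(BUFFERSIZE);
    }
    else if (forward == BUFFERSIZE * 2 && re_flag == 0) {
        buffer_fill(0);
    }
    re_flag = 0;
    forward = (forward + BUFFERSIZE * 2) % (BUFFERSIZE * 2); // wrap around
}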

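retract() moves the forward pointer back by one character and sets the retract flag to 1 so that the buffer is not refilled twice. If the lookahead scan incremented the line counter, that increment is undone here as well.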

void retract(void) {
    re_flag = 1;
    forward = (forward + BUFFERSIZE * 2 - 1) % (BUFFERSIZE * 2);
    if (C == '\n') {
        line--; // the newline will be read (and counted) again
    }
}

combine() packs the token code and attribute of each word into a pair, then clears token in preparation for reading the next word.

void combine(int tokens, long long attribute_i, long double attribute_f) {
    binary.token = tokens;
    if (binary.token == F) {
        binary.attribute.fnum = (float)attribute_f;
    } else if (binary.token == D) {
        binary.attribute.dnum = (double)attribute_f;
    } else if (binary.token == LD) {
        binary.attribute.ldnum = attribute_f;
    } else if (binary.token == I) {
        binary.attribute.inum = (int)attribute_i;
    } else if (binary.token == L) {
        binary.attribute.lnum = (long)attribute_i;
    } else if (binary.token == LL) {
        binary.attribute.llnum = attribute_i;
    } else if (binary.token == U) {
        binary.attribute.unum = (unsigned int)attribute_i;
    } else if (binary.token == UL) {
        binary.attribute.ulnum = (unsigned long)attribute_i;
    } else if (binary.token == ULL) {
        binary.attribute.ullnum = (unsigned long long)attribute_i;
    } else {
        binary.attribute.inum = (int)attribute_i;
    }
    token_print();
    pos = 0;
}

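error() is the error handler. For each kind of error that can occur during lexical analysis, it prints a report together with the line on which the error occurred. Because the line counter is incremented inside get_char(), the value in line is wrong when the character that triggers the error happens to be a newline; the local variable eline (error line) corrects for this.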

void error(int log) {
    int eline; // line on which the error actually occurred
    if (C == '\n') {
        eline = line - 1; // get_char() has already counted this newline
    }
    else {
        eline = line;
    }
    switch (log) {
        case 1:
            printf("\t[Error] exponent has no digits. (line %d)\n", eline); break;
        case 2:
            printf("\t[Error] exponent has no digits. (line %d)\n", eline); break;
        case 3:
            printf("\t[Error] stray '%c' in program. (line %d)\n", C, eline); break;
        case 4:
            printf("\t[Error] invalid suffix \"x\" on integer constant. (line %d)\n", eline); break;
        case 5:
            printf("\t[Error] invalid suffix \"b\" on integer constant. (line %d)\n", eline); break;
    }
    pos = 0; // discard the partial token
}

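buffer_fill() refills one half of the buffer, stops reading as soon as the file ends, and counts the program's characters while reading. To simplify later checks, the end of the file is marked in the buffer with '\0'.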

void buffer_fill(int start) {
    int i = 0;
    int c; // int rather than char, so that EOF can be told apart from real characters
    while (i < BUFFERSIZE && (c = fgetc(fp)) != EOF) {
        buffer[start + i] = (char)c;
        i++;
        total_ns++;
        if (c != ' ' && c != '\n') {
            total++;
        }
    }
    if (c == EOF) {
        buffer[start + i] = '\0'; // mark the end of the input
    }
}

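The main function of the lexical analyzer has simple logic: it implements the state transition diagram above with a switch-case, where each state is a case and a state transition is a jump to another case. The function is too long to reproduce here; see the complete program.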
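To give a feel for the switch-case approach without reproducing the full function, here is a small sketch of the states that handle words beginning with <. It is not the repository's code: scan_less and struct op are hypothetical, and the sketch reads from a string instead of the double buffer. The token numbers and attributes follow the output-format table (< is REL 0, <= is REL 1, << is BIT 4, <<= is ASS 10).

```c
/* Token codes with the same numbers as the REL/ASS/BIT macros in the report. */
enum { T_REL = 4, T_ASS = 5, T_BIT = 6 };

struct op {
    int token; /* token class, or -1 if s does not start with '<' */
    int attr;  /* attribute from the output-format table */
    int len;   /* number of characters consumed */
};

/* Recognize <, <=, << and <<= at the start of s with an explicit state variable. */
struct op scan_less(const char *s) {
    struct op r = { -1, -1, 0 };
    int state = 0;
    for (;;) {
        char c = s[r.len];
        switch (state) {
        case 0: /* start state */
            if (c != '<') return r;
            r.len++;
            state = 1;
            break;
        case 1: /* seen "<" */
            if (c == '=') { r.len++; r.token = T_REL; r.attr = 1; return r; }
            if (c == '<') { r.len++; state = 2; break; }
            r.token = T_REL; r.attr = 0; /* plain "<": lookahead is not consumed */
            return r;
        case 2: /* seen "<<" */
            if (c == '=') { r.len++; r.token = T_ASS; r.attr = 10; return r; }
            r.token = T_BIT; r.attr = 4; /* plain "<<" */
            return r;
        }
    }
}
```

Leaving the lookahead character unconsumed in states 1 and 2 plays the role that retract() plays in the buffered implementation.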

Test report

Input program

#include <stdio.h>
#include <stdlib.h>
/**
 * @auther: sinkers
 * @date: 2022-10-9
 * 测试用例
 */
void print(void){
    printf("%s\n", "hello world");
}

int main(void){
    //整数与浮点数常量
    int _a, _, a_;
    int a_i_1 = 123;
    int a_i_2 = -123;
    int a_i_3 = 0x123abc;
    a_i_3 = 0x;  // 错误
    int a_i_4 = 0123;
    int a_i_5 = 0b1010;
    a_i_5 = 0B;  // 错误
    long int a_l = 123l;
    long int a_l_2 = 0x123abcl;
    long int a_l_3 = 0123l;
    long int a_l_4 = 0b1010l;
    long long int a_ll = 123ll;
    long long int a_ll_2 = 0x123abcll;
    long long int a_ll_3 = 0123ll;
    long long int a_ll_4 = 0b1010ll;
    float a_f_1 = 1.23f;
    float a_f_2 = 123.f;
    float a_f_3 = .123f;
    float a_f_4 = 1.23e4f;
    float a_f_5 = 123.e4f;
    float a_f_6 = .123e4f;
    float a_f_7 = 123e4f;
    float a_f_8 = 123e-4f;
    float a_f_9 = 12ef;  //错误
    float a_f = 12e+;  //错误
    double a_d_1 =  1.23;
    double a_d_2 = 0xep2;
    double a_d_3 = 0x1p-2;
    double a_d_4 = 0x1p;  //错误
    long double a_ld = 1.23l;
    unsigned int a_u_1 = 123u;
    unsigned int a_u_2 = 0x80000001u;
    unsigned long a_ul_1 = 123ul;
    unsigned long a_ul_2 = 123LU;
    unsigned long a_ul_3 = 0x123abcul;
    unsigned long a_ul_4 = 0123lu;
    unsigned long a_ul_5 = 0b1010lu;
    unsigned long long a_ull_1 = 123ull;
    unsigned long long a_ull_2 = 123LLu;
    unsigned long long a_uul_3 = 0x123abcull;
    unsigned long long a_uul_4 = 0123llu;
    unsigned long long a_uul_5 = 0b1010llu;
    //运算符
    int a, b, c;
    a < b;
    a <= b;
    a > b;
    a >= b;
    a == b;
    a != b;
    c = a + b;
    c = a - b;
    c = a * b;
    c = a / b;
    c = a % b;
    a++;
    a--;
    c = a && b;
    c = a || b;
    c = !a;
    c = a & 0b1;
    c = a | 0b1;
    c = ~ 0b1;
    c = a ^ 1;
    c = a << 1;
    c = a >> 1;
    a += b;
    a -= b;
    a *= b;
    a /= b;
    a %= b;
    a &= b;
    a |= b;
    a ^= b;
    a >>= 1;
    a <<= 1;
    c = 1>2?a:b;

    int i;
    for (i=0; i<5; i++) {
        print();    
    }

    //符号错误
    int j;
}

Test results

    <#, SYM, 7>
    <include, ID, 0>
    <<, REL, 0>
    <stdio, ID, 8>
    <., SPE, 5>
    <h, ID, 14>
    <>, REL, 2>
    <#, SYM, 7>
    <include, ID, 16>
    <<, REL, 0>
    <stdlib, ID, 24>
    <., SPE, 5>
    <h, ID, 31>
    <>, REL, 2>
    </*, NOT, 0>
    <void, KEY, 31>
    <print, ID, 33>
    <(, SPE, 0>
    <void, KEY, 31>
    <), SPE, 1>
    <{, SYM, 0>
    <printf, ID, 39>
    <(, SPE, 0>
    <", SYM, 4>
    <%s\n, STR, %s\n>
    <", SYM, 4>
    <,, SYM, 6>
    <", SYM, 4>
    <hello world, STR, hello world>
    <", SYM, 4>
    <), SPE, 1>
    <;, SYM, 5>
    <}, SYM, 1>
    <int, KEY, 17>
    <main, ID, 46>
    <(, SPE, 0>
    <void, KEY, 31>
    <), SPE, 1>
    <{, SYM, 0>
    <//, NOT, 1>
    <int, KEY, 17>
    <_a, ID, 51>
    <,, SYM, 6>
    <_, ID, 54>
    <,, SYM, 6>
    <a_, ID, 56>
    <;, SYM, 5>
    <int, KEY, 17>
    <a_i_1, ID, 59>
    <=, ASS, 0>
    <123, I, 123>
    <;, SYM, 5>
    <int, KEY, 17>
    <a_i_2, ID, 65>
    <=, ASS, 0>
    <-, ARI, 1>
    <123, I, 123>
    <;, SYM, 5>
    <int, KEY, 17>
    <a_i_3, ID, 71>
    <=, ASS, 0>
    <0x123abc, I, 1194684>
    <;, SYM, 5>
    <a_i_3, ID, 77>
    <=, ASS, 0>
    [Error] invalid suffix "x" on integer constant. (line 18)
    <;, SYM, 5>
    <//, NOT, 1>
    <int, KEY, 17>
    <a_i_4, ID, 83>
    <=, ASS, 0>
    <0123, I, 83>
    <;, SYM, 5>
    <int, KEY, 17>
    <a_i_5, ID, 89>
    <=, ASS, 0>
    <0b1010, I, 0>
    <;, SYM, 5>
    <a_i_5, ID, 95>
    <=, ASS, 0>
    [Error] invalid suffix "b" on integer constant. (line 21)
    <;, SYM, 5>
    <//, NOT, 1>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_l, ID, 101>
    <=, ASS, 0>
    <123l, L, 123>
    <;, SYM, 5>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_l_2, ID, 105>
    <=, ASS, 0>
    <0x123abcl, L, 1194684>
    <;, SYM, 5>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_l_3, ID, 111>
    <=, ASS, 0>
    <0123l, L, 83>
    <;, SYM, 5>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_l_4, ID, 117>
    <=, ASS, 0>
    <0b1010l, L, 0>
    <;, SYM, 5>
    <long, KEY, 18>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_ll, ID, 123>
    <=, ASS, 0>
    <123ll, LL, 123>
    <;, SYM, 5>
    <long, KEY, 18>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_ll_2, ID, 128>
    <=, ASS, 0>
    <0x123abcll, LL, 1194684>
    <;, SYM, 5>
    <long, KEY, 18>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_ll_3, ID, 135>
    <=, ASS, 0>
    <0123ll, LL, 83>
    <;, SYM, 5>
    <long, KEY, 18>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_ll_4, ID, 142>
    <=, ASS, 0>
    <0b1010ll, LL, 0>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_1, ID, 149>
    <=, ASS, 0>
    <1.23f, F, 1.230000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_2, ID, 155>
    <=, ASS, 0>
    <123.f, F, 123.000000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_3, ID, 161>
    <=, ASS, 0>
    <.123f, F, 0.123000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_4, ID, 167>
    <=, ASS, 0>
    <1.23e4f, F, 12300.000000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_5, ID, 173>
    <=, ASS, 0>
    <123.e4f, F, 1230000.000000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_6, ID, 179>
    <=, ASS, 0>
    <.123e4f, F, 1230.000000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_7, ID, 185>
    <=, ASS, 0>
    <123e4f, F, 1230000.000000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_8, ID, 191>
    <=, ASS, 0>
    <123e-4f, F, 0.012300>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_9, ID, 197>
    <=, ASS, 0>
    [Error] exponent has no digits. (line 38)
    <f, ID, 203>
    <;, SYM, 5>
    <//, NOT, 1>
    <float, KEY, 12>
    <a_f, ID, 205>
    <=, ASS, 0>
    [Error] exponent has no digits. (line 39)
    <;, SYM, 5>
    <//, NOT, 1>
    <double, KEY, 8>
    <a_d_1, ID, 209>
    <=, ASS, 0>
    <1.23, D, 1.230000>
    <;, SYM, 5>
    <double, KEY, 8>
    <a_d_2, ID, 215>
    <=, ASS, 0>
    <0xep2, D, 56.000000>
    <;, SYM, 5>
    <double, KEY, 8>
    <a_d_3, ID, 221>
    <=, ASS, 0>
    <0x1p-2, D, 0.250000>
    <;, SYM, 5>
    <double, KEY, 8>
    <a_d_4, ID, 227>
    <=, ASS, 0>
    [Error] exponent has no digits. (line 43)
    <;, SYM, 5>
    <//, NOT, 1>
    <long, KEY, 18>
    <double, KEY, 8>
    <a_ld, ID, 233>
    <=, ASS, 0>
    <1.23l, LD, 0.000000>
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <int, KEY, 17>
    <a_u_1, ID, 238>
    <=, ASS, 0>
    <123u, U, 123>
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <int, KEY, 17>
    <a_u_2, ID, 244>
    <=, ASS, 0>
    <0x80000001u, U, 2147483649>
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <a_ul_1, ID, 250>
    <=, ASS, 0>
    <123ul, UL, 123>
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <a_ul_2, ID, 257>
    <=, ASS, 0>
    <123LU, UL, 123>
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <a_ul_3, ID, 264>
    <=, ASS, 0>
    <0x123abcul, UL, 1194684>
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <a_ul_4, ID, 271>
    <=, ASS, 0>
    <0123lu, UL, 83>
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <a_ul_5, ID, 278>
    <=, ASS, 0>
    <0b1010lu, UL, 0>
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <long, KEY, 18>
    <a_ull_1, ID, 285>
    <=, ASS, 0>
    <123ull, ULL, 123>
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <long, KEY, 18>
    <a_ull_2, ID, 293>
    <=, ASS, 0>
    <123LLu, ULL, 123>
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <long, KEY, 18>
    <a_uul_3, ID, 301>
    <=, ASS, 0>
    <0x123abcull, ULL, 1194684>
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <long, KEY, 18>
    <a_uul_4, ID, 309>
    <=, ASS, 0>
    <0123llu, ULL, 83>
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <long, KEY, 18>
    <a_uul_5, ID, 317>
    <=, ASS, 0>
    <0b1010llu, ULL, 0>
    <;, SYM, 5>
    <//, NOT, 1>
    <int, KEY, 17>
    <a, ID, 325>
    <,, SYM, 6>
    <b, ID, 327>
    <,, SYM, 6>
    <c, ID, 329>
    <;, SYM, 5>
    <a, ID, 331>
    <<, REL, 0>
    <b, ID, 333>
    <;, SYM, 5>
    <a, ID, 335>
    <<=, REL, 1>
    <b, ID, 337>
    <;, SYM, 5>
    <a, ID, 339>
    <>, REL, 2>
    <b, ID, 341>
    <;, SYM, 5>
    <a, ID, 343>
    <>=, REL, 3>
    <b, ID, 345>
    <;, SYM, 5>
    <a, ID, 347>
    <==, REL, 4>
    <b, ID, 349>
    <;, SYM, 5>
    <a, ID, 351>
    <!=, REL, 4>
    <b, ID, 353>
    <;, SYM, 5>
    <c, ID, 355>
    <=, ASS, 0>
    <a, ID, 357>
    <+, ARI, 0>
    <b, ID, 359>
    <;, SYM, 5>
    <c, ID, 361>
    <=, ASS, 0>
    <a, ID, 363>
    <-, ARI, 1>
    <b, ID, 365>
    <;, SYM, 5>
    <c, ID, 367>
    <=, ASS, 0>
    <a, ID, 369>
    <*, ARI, 2>
    <b, ID, 371>
    <;, SYM, 5>
    <c, ID, 373>
    <=, ASS, 0>
    <a, ID, 375>
    </, ARI, 3>
    <b, ID, 377>
    <;, SYM, 5>
    <c, ID, 379>
    <=, ASS, 0>
    <a, ID, 381>
    <%, ARI, 4>
    <b, ID, 383>
    <;, SYM, 5>
    <a, ID, 385>
    <++, ARI, 5>
    <;, SYM, 5>
    <a, ID, 387>
    <--, ARI, 6>
    <;, SYM, 5>
    <c, ID, 389>
    <=, ASS, 0>
    <a, ID, 391>
    <&&, LOG, 0>
    <b, ID, 393>
    <;, SYM, 5>
    <c, ID, 395>
    <=, ASS, 0>
    <a, ID, 397>
    <||, LOG, 1>
    <b, ID, 399>
    <;, SYM, 5>
    <c, ID, 401>
    <=, ASS, 0>
    <!, ASS, 0>
    <a, ID, 403>
    <;, SYM, 5>
    <c, ID, 405>
    <=, ASS, 0>
    <a, ID, 407>
    <&, BIT, 0>
    <0b1, I, 0>
    <;, SYM, 5>
    <c, ID, 409>
    <=, ASS, 0>
    <a, ID, 411>
    <|, BIT, 1>
    <0b1, I, 0>
    <;, SYM, 5>
    <c, ID, 413>
    <=, ASS, 0>
    <~, ASS, 11>
    <0b1, I, 0>
    <;, SYM, 5>
    <c, ID, 415>
    <=, ASS, 0>
    <a, ID, 417>
    <^, BIT, 3>
    <1, I, 1>
    <;, SYM, 5>
    <c, ID, 419>
    <=, ASS, 0>
    <a, ID, 421>
    <<<, BIT, 4>
    <1, I, 1>
    <;, SYM, 5>
    <c, ID, 423>
    <=, ASS, 0>
    <a, ID, 425>
    <>>, BIT, 5>
    <1, I, 1>
    <;, SYM, 5>
    <a, ID, 427>
    <+=, ASS, 1>
    <b, ID, 429>
    <;, SYM, 5>
    <a, ID, 431>
    <-=, ASS, 2>
    <b, ID, 433>
    <;, SYM, 5>
    <a, ID, 435>
    <*=, ASS, 3>
    <b, ID, 437>
    <;, SYM, 5>
    <a, ID, 439>
    </=, ASS, 4>
    <b, ID, 441>
    <;, SYM, 5>
    <a, ID, 443>
    <%=, ASS, 5>
    <b, ID, 445>
    <;, SYM, 5>
    <a, ID, 447>
    <&=, ASS, 6>
    <b, ID, 449>
    <;, SYM, 5>
    <a, ID, 451>
    <|=, ASS, 7>
    <b, ID, 453>
    <;, SYM, 5>
    <a, ID, 455>
    <^=, ASS, 8>
    <b, ID, 457>
    <;, SYM, 5>
    <a, ID, 459>
    <>>=, ASS, 9>
    <1, I, 1>
    <;, SYM, 5>
    <a, ID, 461>
    <<<=, ASS, 10>
    <1, I, 1>
    <;, SYM, 5>
    <c, ID, 463>
    <=, ASS, 0>
    <1, I, 1>
    <>, REL, 2>
    <2, I, 2>
    <?, SEL, 0>
    <a, ID, 465>
    <:, SYM, 2>
    <b, ID, 467>
    <;, SYM, 5>
    <int, KEY, 17>
    <i, ID, 469>
    <;, SYM, 5>
    <for, KEY, 13>
    <(, SPE, 0>
    <i, ID, 471>
    <=, ASS, 0>
    <0, I, 0>
    <i, ID, 473>
    <<, REL, 0>
    <5, I, 5>
    <;, SYM, 5>
    <i, ID, 475>
    <++, ARI, 5>
    <), SPE, 1>
    <{, SYM, 0>
    <print, ID, 477>
    <(, SPE, 0>
    <), SPE, 1>
    <;, SYM, 5>
    <}, SYM, 1>
    <//, NOT, 1>
    <int, KEY, 17>
    <j, ID, 483>
    [Error] stray '�' in program. (line 99)
    [Error] stray '�' in program. (line 99)
    [Error] stray '�' in program. (line 99)
    <}, SYM, 1>


该程序共有 100 行
该程序的字符总数为 1463 / 1843
各种记号的的个数为:
    KEY:    79
    ID:     133
    REL:    12
    ASS:    70
    BIT:    5
    LOG:    2
    ARI:    9
    SPE:    12
    NOT:    9
    SYM:    99
    SEL:    1
    STR:    2
    CHR:    0
    U  :    2
    UL :    5
    ULL:    5
    I  :    17
    L  :    4
    LL :    4
    F  :    8
    D  :    3
    LD :    1
[Finished in 239ms]

Analysis

Input:

#include <stdio.h>
#include <stdlib.h>

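Output:

    <#, SYM, 7>
    <include, ID, 0>  // the symbol table starts at index 0; include has 7 characters, occupying indices 0 to 6, and index 7 holds '\0'
    <<, REL, 0>
    <stdio, ID, 8>  // the second word starts at index 8, which shows the symbol table is written correctly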
    <., SPE, 5>
    <h, ID, 14>
    <>, REL, 2>
    <#, SYM, 7>
    <include, ID, 16>
    <<, REL, 0>
    <stdlib, ID, 24>
    <., SPE, 5>
    <h, ID, 31>
    <>, REL, 2>

Analysis: see the comments in the output above.
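The index arithmetic in those comments can be reproduced with a small sketch of a symbol table that stores words back to back, each followed by '\0'. This is an assumption about the layout based on the indices in the output, not the repository's code: table_insert_sketch, sign_table and sign_pos are hypothetical names, and the function returns the start index instead of relying on globals. As the recorded output suggests (include appears at both 0 and 16), no duplicate check is made.

```c
#include <string.h>

#define IDCAPACITY 1000

static char sign_table[IDCAPACITY]; /* words stored back to back, each NUL-terminated */
static int sign_pos = 0;            /* index of the first free byte */

/* Append word to the table and return its start index, or -1 if it does not fit. */
int table_insert_sketch(const char *word) {
    size_t len = strlen(word);
    if ((size_t)sign_pos + len + 1 > IDCAPACITY) {
        return -1;
    }
    int start = sign_pos;
    memcpy(sign_table + start, word, len + 1); /* copies the trailing '\0' too */
    sign_pos += (int)len + 1;
    return start;
}
```

Inserting include, stdio, h, include, stdlib, h in that order yields the start indices 0, 8, 14, 16, 24 and 31, the same values that appear in the output above.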

Input:

/**
 * @auther: sinkers
 * @date: 2022-10-9
 * 测试用例
 */

Output:

    </*, NOT, 0>

Analysis: the comment is skipped, and a single NOT token is output for it.

Input:

void print(void){
    printf("%s\n", "hello world");
}

Output:

    <void, KEY, 31>
    <print, ID, 33>
    <(, SPE, 0>
    <void, KEY, 31>
    <), SPE, 1>
    <{, SYM, 0>
    <printf, ID, 39>
    <(, SPE, 0>
    <", SYM, 4>
    <%s\n, STR, %s\n>  // the text inside the double quotes is stored as a string constant
    <", SYM, 4>
    <,, SYM, 6>
    <", SYM, 4>
    <hello world, STR, hello world>
    <", SYM, 4>
    <), SPE, 1>
    <;, SYM, 5>
    <}, SYM, 1>

Analysis: see the comments in the output above.

Input:

int main(void){
    //整数与浮点数常量
    int _a, _, a_;
    int a_i_1 = 123;
    int a_i_2 = -123;
    int a_i_3 = 0x123abc;
    a_i_3 = 0x;  // 错误
    int a_i_4 = 0123;
    int a_i_5 = 0b1010;
    a_i_5 = 0B;  // 错误

Output:

    <int, KEY, 17>
    <main, ID, 46>
    <(, SPE, 0>
    <void, KEY, 31>
    <), SPE, 1>
    <{, SYM, 0>
    <//, NOT, 1>
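    <int, KEY, 17>  // the keyword int
    <_a, ID, 51>  // checks that a user-defined identifier is recognized correctly
    <,, SYM, 6>
    <_, ID, 54>  // checks that a user-defined identifier is recognized correctly
    <,, SYM, 6>
    <a_, ID, 56>  // checks that a user-defined identifier is recognized correctly
    <;, SYM, 5>
    <int, KEY, 17>
    <a_i_1, ID, 59>  // checks that a user-defined identifier is recognized correctly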
    <=, ASS, 0>
    <123, I, 123>
    <;, SYM, 5>
    <int, KEY, 17>
    <a_i_2, ID, 65>
    <=, ASS, 0>
    <-, ARI, 1>
    <123, I, 123>
    <;, SYM, 5>
    <int, KEY, 17>
    <a_i_3, ID, 71>
    <=, ASS, 0>
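    <0x123abc, I, 1194684>  // a hexadecimal constant is converted to its value and stored according to its type
    <;, SYM, 5>
    <a_i_3, ID, 77>
    <=, ASS, 0>
    [Error] invalid suffix "x" on integer constant. (line 18)  // no digits follow "0x", so an error is reported; the reported line number is correct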
    <;, SYM, 5>
    <//, NOT, 1>
    <int, KEY, 17>
    <a_i_4, ID, 83>
    <=, ASS, 0>
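    <0123, I, 83>  // an octal constant is converted to its value and stored according to its type
    <;, SYM, 5>
    <int, KEY, 17>
    <a_i_5, ID, 89>
    <=, ASS, 0>
    <0b1010, I, 0>  // a binary constant is recognized and typed, but the printed value is 0 rather than 10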
    <;, SYM, 5>
    <a_i_5, ID, 95>
    <=, ASS, 0>
    [Error] invalid suffix "b" on integer constant. (line 21)
    <;, SYM, 5>
    <//, NOT, 1>

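Analysis: the tokens and types are correct; see the comments in the output above. One exception is the value of the binary constant 0b1010, which is printed as 0 instead of 10.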
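The recorded output prints 0 as the value of 0b1010, which suggests the binary digits are not being converted. One possible way to compute the value by hand is sketched below. parse_binary is a hypothetical helper, not part of the repository, and it deliberately stops at the first character that is not a binary digit, so a suffix such as llu is left for the caller.

```c
/* Convert a constant of the form 0b<bits> or 0B<bits> to its value.
 * Returns 0 on success and stores the value in *out; returns -1 if the
 * prefix is missing or no binary digit follows it (as in "0B").
 * Assumption: the number of digits is small enough not to overflow. */
int parse_binary(const char *s, unsigned long long *out) {
    if (s[0] != '0' || (s[1] != 'b' && s[1] != 'B')) {
        return -1;
    }
    if (s[2] != '0' && s[2] != '1') {
        return -1; /* prefix with no digits */
    }
    unsigned long long v = 0;
    const char *p = s + 2;
    while (*p == '0' || *p == '1') {
        v = v * 2 + (unsigned long long)(*p - '0'); /* shift in one bit */
        p++;
    }
    *out = v;
    return 0;
}
```

For the test input, this gives 10 for 0b1010 and rejects the bare prefix 0B, which the lexer already reports as an error.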

Input:

    long int a_l = 123l;
    long int a_l_2 = 0x123abcl;
    long int a_l_3 = 0123l;
    long int a_l_4 = 0b1010l;
    long long int a_ll = 123ll;
    long long int a_ll_2 = 0x123abcll;
    long long int a_ll_3 = 0123ll;
    long long int a_ll_4 = 0b1010ll;

Output:

    <long, KEY, 18>
    <int, KEY, 17>
    <a_l, ID, 101>
    <=, ASS, 0>
    <123l, L, 123>  // a number with the suffix l is stored as a long
    <;, SYM, 5>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_l_2, ID, 105>
    <=, ASS, 0>
    <0x123abcl, L, 1194684>  // hexadecimal long
    <;, SYM, 5>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_l_3, ID, 111>
    <=, ASS, 0>
    <0123l, L, 83>    // octal long
    <;, SYM, 5>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_l_4, ID, 117>
    <=, ASS, 0>
    <0b1010l, L, 0>    // binary long
    <;, SYM, 5>
    <long, KEY, 18>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_ll, ID, 123>
    <=, ASS, 0>
    <123ll, LL, 123>  // decimal long long
    <;, SYM, 5>
    <long, KEY, 18>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_ll_2, ID, 128>
    <=, ASS, 0>
    <0x123abcll, LL, 1194684>  // hexadecimal long long
    <;, SYM, 5>
    <long, KEY, 18>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_ll_3, ID, 135>
    <=, ASS, 0>
    <0123ll, LL, 83>  // octal long long
    <;, SYM, 5>
    <long, KEY, 18>
    <long, KEY, 18>
    <int, KEY, 17>
    <a_ll_4, ID, 142>
    <=, ASS, 0>
    <0b1010ll, LL, 0>  // binary long long
    <;, SYM, 5>

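Analysis: the token types are correct; see the comments in the output above. As before, the binary constants are printed with the value 0.

Input: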

    float a_f_1 = 1.23f;
    float a_f_2 = 123.f;
    float a_f_3 = .123f;
    float a_f_4 = 1.23e4f;
    float a_f_5 = 123.e4f;
    float a_f_6 = .123e4f;
    float a_f_7 = 123e4f;
    float a_f_8 = 123e-4f;
    float a_f_9 = 12ef;  //错误
    float a_f = 12e+;  //错误

Output:

    <float, KEY, 12>
    <a_f_1, ID, 149>
    <=, ASS, 0>
    <1.23f, F, 1.230000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_2, ID, 155>
    <=, ASS, 0>
    <123.f, F, 123.000000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_3, ID, 161>
    <=, ASS, 0>
    <.123f, F, 0.123000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_4, ID, 167>
    <=, ASS, 0>
    <1.23e4f, F, 12300.000000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_5, ID, 173>
    <=, ASS, 0>
    <123.e4f, F, 1230000.000000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_6, ID, 179>
    <=, ASS, 0>
    <.123e4f, F, 1230.000000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_7, ID, 185>
    <=, ASS, 0>
    <123e4f, F, 1230000.000000>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_8, ID, 191>
    <=, ASS, 0>
    <123e-4f, F, 0.012300>
    <;, SYM, 5>
    <float, KEY, 12>
    <a_f_9, ID, 197>
    <=, ASS, 0>
    [Error] exponent has no digits. (line 38)
    <f, ID, 203>
    <;, SYM, 5>
    <//, NOT, 1>
    <float, KEY, 12>
    <a_f, ID, 205>
    <=, ASS, 0>
    [Error] exponent has no digits. (line 39)
    <;, SYM, 5>
    <//, NOT, 1>

Analysis: this block tests floating constants.

The first eight declarations, a_f_1 through a_f_8 (1.23f, 123.f, .123f, 1.23e4f, 123.e4f, .123e4f, 123e4f, 123e-4f), cover the various ways of writing a decimal floating constant.

The last two, a_f_9 = 12ef and a_f = 12e+, cover two malformed forms; both are reported as "exponent has no digits".

Input:

    double a_d_1 =  1.23;
    double a_d_2 = 0xep2;
    double a_d_3 = 0x1p-2;
    double a_d_4 = 0x1p;  //错误

Output:

    <double, KEY, 8>
    <a_d_1, ID, 209>
    <=, ASS, 0>
    <1.23, D, 1.230000>
    <;, SYM, 5>
    <double, KEY, 8>
    <a_d_2, ID, 215>
    <=, ASS, 0>
    <0xep2, D, 56.000000>
    <;, SYM, 5>
    <double, KEY, 8>
    <a_d_3, ID, 221>
    <=, ASS, 0>
    <0x1p-2, D, 0.250000>
    <;, SYM, 5>
    <double, KEY, 8>
    <a_d_4, ID, 227>
    <=, ASS, 0>
    [Error] exponent has no digits. (line 43)
    <;, SYM, 5>
    <//, NOT, 1>

Analysis: this block tests the hexadecimal floating form. The hexadecimal cases tested here contain no '.'.
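The two hexadecimal values in the output can be checked by hand: in a hexadecimal floating constant the digits before p are a hexadecimal significand, and the number after p is a power of two, so 0xep2 is 14 × 2^2 = 56 and 0x1p-2 is 1 × 2^-2 = 0.25. The sketch below does that arithmetic; scale_by_pow2 is a hypothetical helper written for this illustration.

```c
#include <stdlib.h>

/* Multiply significand m by 2^e using repeated doubling or halving.
 * For small exponents like the ones in the test input this is exact. */
double scale_by_pow2(double m, int e) {
    while (e > 0) { m *= 2.0; e--; }
    while (e < 0) { m /= 2.0; e++; }
    return m;
}

/* Cross-check helper: let the C library parse the same text.
 * strtod accepts hexadecimal floating input since C99. */
double parse_with_strtod(const char *text) {
    return strtod(text, NULL);
}
```

Both routes give 56 for 0xep2 and 0.25 for 0x1p-2, matching the values printed by the lexer.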

Input:

    long double a_ld = 1.23l;

Output:

    <long, KEY, 18>
    <double, KEY, 8>
    <a_ld, ID, 233>
    <=, ASS, 0>
    <1.23l, LD, 0.000000> 
    <;, SYM, 5>

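Analysis: the value is stored as a long double, but it is printed as 0.000000. A likely cause is the format specifier: in printf, %lf expects a double, whereas a long double argument requires %Lf, so passing a long double to %lf is undefined behavior. The following program, which uses %lf, also prints 0.000000 with my compiler.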

#include<stdio.h>
int main(void){
    long double a = 1.23;
    printf("%lf", a);
}
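For comparison, here is a minimal sketch that formats a long double with the L length modifier, which is what the C standard specifies for this type. format_long_double is a hypothetical helper introduced only for this illustration. Note that on some Windows toolchains (for example MinGW linked against the Microsoft C runtime) even %Lf is known to misprint long double; whether that applies to the compiler used in this report is not known.

```c
#include <stdio.h>

/* Write x into buf in fixed notation, using %Lf for the long double argument.
 * Returns the value returned by snprintf. */
int format_long_double(char *buf, size_t size, long double x) {
    return snprintf(buf, size, "%Lf", x);
}
```

On a conforming implementation, formatting 1.23L this way produces the text 1.230000.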

Input:

    unsigned int a_u_1 = 123u;
    unsigned int a_u_2 = 0x80000001u;
    unsigned long a_ul_1 = 123ul;
    unsigned long a_ul_2 = 123LU;
    unsigned long a_ul_3 = 0x123abcul;
    unsigned long a_ul_4 = 0123lu;
    unsigned long a_ul_5 = 0b1010lu;
    unsigned long long a_ull_1 = 123ull;
    unsigned long long a_ull_2 = 123LLu;
    unsigned long long a_uul_3 = 0x123abcull;
    unsigned long long a_uul_4 = 0123llu;
    unsigned long long a_uul_5 = 0b1010llu;

Output:

    <unsigned, KEY, 30>
    <int, KEY, 17>
    <a_u_1, ID, 238>
    <=, ASS, 0>
    <123u, U, 123>  // decimal unsigned int
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <int, KEY, 17>
    <a_u_2, ID, 244>
    <=, ASS, 0>
    <0x80000001u, U, 2147483649>  // hexadecimal unsigned int
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <a_ul_1, ID, 250>
    <=, ASS, 0>
    <123ul, UL, 123>  // decimal unsigned long
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <a_ul_2, ID, 257>
    <=, ASS, 0>
    <123LU, UL, 123>  // decimal unsigned long
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <a_ul_3, ID, 264>
    <=, ASS, 0>
    <0x123abcul, UL, 1194684>  // hexadecimal unsigned long
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <a_ul_4, ID, 271>
    <=, ASS, 0>
    <0123lu, UL, 83>  // octal unsigned long
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <a_ul_5, ID, 278>
    <=, ASS, 0>
    <0b1010lu, UL, 0>  // binary unsigned long
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <long, KEY, 18>
    <a_ull_1, ID, 285>
    <=, ASS, 0>
    <123ull, ULL, 123>  // decimal unsigned long long
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <long, KEY, 18>
    <a_ull_2, ID, 293>
    <=, ASS, 0>
    <123LLu, ULL, 123>  // decimal unsigned long long
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <long, KEY, 18>
    <a_uul_3, ID, 301>
    <=, ASS, 0>
    <0x123abcull, ULL, 1194684>  // hexadecimal unsigned long long
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <long, KEY, 18>
    <a_uul_4, ID, 309>
    <=, ASS, 0>
    <0123llu, ULL, 83>  // octal unsigned long long
    <;, SYM, 5>
    <unsigned, KEY, 30>
    <long, KEY, 18>
    <long, KEY, 18>
    <a_uul_5, ID, 317>
    <=, ASS, 0>
    <0b1010llu, ULL, 0>  // binary unsigned long long
    <;, SYM, 5>

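Analysis: see the comments in the output above. This block tests unsigned int, unsigned long, and unsigned long long constants in the various bases and suffix orders.

Input: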

    //运算符
    int a, b, c;
    a < b;
    a <= b;
    a > b;
    a >= b;
    a == b;
    a != b;
    c = a + b;
    c = a - b;
    c = a * b;
    c = a / b;
    c = a % b;
    a++;
    a--;
    c = a && b;
    c = a || b;
    c = !a;
    c = a & 0b1;
    c = a | 0b1;
    c = ~ 0b1;
    c = a ^ 1;
    c = a << 1;
    c = a >> 1;
    a += b;
    a -= b;
    a *= b;
    a /= b;
    a %= b;
    a &= b;
    a |= b;
    a ^= b;
    a >>= 1;
    a <<= 1;
    c = 1>2?a:b;

Output:

    <//, NOT, 1>
    <int, KEY, 17>
    <a, ID, 325>
    <,, SYM, 6>
    <b, ID, 327>
    <,, SYM, 6>
    <c, ID, 329>
    <;, SYM, 5>
    <a, ID, 331>
    <<, REL, 0>  //<
    <b, ID, 333>
    <;, SYM, 5>
    <a, ID, 335>
    <<=, REL, 1>  //<=
    <b, ID, 337>
    <;, SYM, 5>
    <a, ID, 339>
    <>, REL, 2>  //>
    <b, ID, 341>
    <;, SYM, 5>
    <a, ID, 343>
    <>=, REL, 3>  //>=
    <b, ID, 345>
    <;, SYM, 5>
    <a, ID, 347>
    <==, REL, 4>  //==
    <b, ID, 349>
    <;, SYM, 5>
    <a, ID, 351>
    <!=, REL, 4>  //!=
    <b, ID, 353>
    <;, SYM, 5>
    <c, ID, 355>
    <=, ASS, 0>  //=
    <a, ID, 357>
    <+, ARI, 0>  //+
    <b, ID, 359>
    <;, SYM, 5>
    <c, ID, 361>
    <=, ASS, 0>
    <a, ID, 363>
    <-, ARI, 1>  //-
    <b, ID, 365>
    <;, SYM, 5>
    <c, ID, 367>
    <=, ASS, 0>
    <a, ID, 369>
    <*, ARI, 2>  //*
    <b, ID, 371>
    <;, SYM, 5>
    <c, ID, 373>
    <=, ASS, 0>
    <a, ID, 375>
    </, ARI, 3>  // division
    <b, ID, 377>
    <;, SYM, 5>
    <c, ID, 379>
    <=, ASS, 0>
    <a, ID, 381>
    <%, ARI, 4>  //%
    <b, ID, 383>
    <;, SYM, 5>
    <a, ID, 385>
    <++, ARI, 5>  //++
    <;, SYM, 5>
    <a, ID, 387>
    <--, ARI, 6>  //--
    <;, SYM, 5>
    <c, ID, 389>
    <=, ASS, 0>
    <a, ID, 391>
    <&&, LOG, 0>  //&&
    <b, ID, 393>
    <;, SYM, 5>
    <c, ID, 395>
    <=, ASS, 0>
    <a, ID, 397>
    <||, LOG, 1>  //||
    <b, ID, 399>
    <;, SYM, 5>
    <c, ID, 401>
    <=, ASS, 0>
    <!, ASS, 0>  //!
    <a, ID, 403>
    <;, SYM, 5>
    <c, ID, 405>
    <=, ASS, 0>
    <a, ID, 407>
    <&, BIT, 0>  //&
    <0b1, I, 0>
    <;, SYM, 5>
    <c, ID, 409>
    <=, ASS, 0>
    <a, ID, 411>
    <|, BIT, 1>  //|
    <0b1, I, 0>
    <;, SYM, 5>
    <c, ID, 413>
    <=, ASS, 0>
    <~, ASS, 11>  //~
    <0b1, I, 0>
    <;, SYM, 5>
    <c, ID, 415>
    <=, ASS, 0>
    <a, ID, 417>
    <^, BIT, 3>  //^
    <1, I, 1>
    <;, SYM, 5>
    <c, ID, 419>
    <=, ASS, 0>
    <a, ID, 421>
    <<<, BIT, 4>  //<<
    <1, I, 1>
    <;, SYM, 5>
    <c, ID, 423>
    <=, ASS, 0>
    <a, ID, 425>
    <>>, BIT, 5>  //>>
    <1, I, 1>
    <;, SYM, 5>
    <a, ID, 427>
    <+=, ASS, 1>  //+=
    <b, ID, 429>
    <;, SYM, 5>
    <a, ID, 431>
    <-=, ASS, 2>  //-=
    <b, ID, 433>
    <;, SYM, 5>
    <a, ID, 435>
    <*=, ASS, 3>  //*=
    <b, ID, 437>
    <;, SYM, 5>
    <a, ID, 439>
    </=, ASS, 4>  // /=
    <b, ID, 441>
    <;, SYM, 5>
    <a, ID, 443>
    <%=, ASS, 5>  //%=
    <b, ID, 445>
    <;, SYM, 5>
    <a, ID, 447>
    <&=, ASS, 6>  //&=
    <b, ID, 449>
    <;, SYM, 5>
    <a, ID, 451>
    <|=, ASS, 7>  //|=
    <b, ID, 453>
    <;, SYM, 5>
    <a, ID, 455>
    <^=, ASS, 8>  //^=
    <b, ID, 457>
    <;, SYM, 5>
    <a, ID, 459>
    <>>=, ASS, 9>   //>>=
    <1, I, 1>
    <;, SYM, 5>
    <a, ID, 461>
    <<<=, ASS, 10>  //<<=
    <1, I, 1>
    <;, SYM, 5>
    <c, ID, 463>
    <=, ASS, 0>
    <1, I, 1>
    <>, REL, 2>
    <2, I, 2>
    <?, SEL, 0>  //?
    <a, ID, 465>
    <:, SYM, 2>  //:
    <b, ID, 467>
    <;, SYM, 5>

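Analysis: all operators were exercised, and every one is recognized as a single word. Most are classified as the output-format table specifies, but three entries in the recorded output disagree with it: != is printed as <!=, REL, 4> (the table assigns REL 5), ! is printed as <!, ASS, 0> (the table assigns LOG 2), and ~ is printed as <~, ASS, 11> (the table lists ~ under both ASS 11 and BIT 2).

Input: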

    int i;
    for (i=0; i<5; i++) {
        print();    
    }

Output:

    <int, KEY, 17>
    <i, ID, 469>
    <;, SYM, 5>
    <for, KEY, 13>
    <(, SPE, 0>
    <i, ID, 471>
    <=, ASS, 0>
    <0, I, 0>
    <i, ID, 473>
    <<, REL, 0>
    <5, I, 5>
    <;, SYM, 5>
    <i, ID, 475>
    <++, ARI, 5>
    <), SPE, 1>
    <{, SYM, 0>
    <print, ID, 477>
    <(, SPE, 0>
    <), SPE, 1>
    <;, SYM, 5>
    <}, SYM, 1>

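Analysis: a simple combined test. The words are recognized as expected, although the recorded output shows no <;, SYM, 5> between <0, I, 0> and the following <i, ID, 473>.

Input: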

    //符号错误
    int j;
}

Output:

    <//, NOT, 1>
    <int, KEY, 17>
    <j, ID, 483>
    [Error] stray '�' in program. (line 99)
    [Error] stray '�' in program. (line 99)
    [Error] stray '�' in program. (line 99)
    <}, SYM, 1>

Analysis: a full-width (Chinese) semicolon was entered. Because it occupies 3 bytes, 3 stray-character errors are reported.
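The count of three follows from the encoding, assuming the source file is saved as UTF-8: the full-width semicolon is U+FF1B, which UTF-8 encodes as the three bytes EF BC 9B, and a lexer that reads one byte at a time sees each of them as a separate stray character. The sketch below counts such bytes; count_non_ascii_bytes is a hypothetical helper, not part of the repository.

```c
#include <string.h>

/* UTF-8 encoding of U+FF1B FULLWIDTH SEMICOLON, written as explicit bytes
 * so that the example does not depend on the encoding of this file. */
static const char fullwidth_semicolon[] = "\xEF\xBC\x9B";

/* Count the bytes in s that lie outside 7-bit ASCII. A byte-oriented lexer
 * would report each of them as one stray character. */
int count_non_ascii_bytes(const char *s) {
    int n = 0;
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s >= 0x80) {
            n++;
        }
    }
    return n;
}
```

Applied to fullwidth_semicolon the count is 3, while an ASCII-only statement such as int j; gives 0.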

Statistics output:

该程序共有 100 行
该程序的字符总数为 1463 / 1843
各种记号的的个数为:
    KEY:    79
    ID:     133
    REL:    12
    ASS:    70
    BIT:    5
    LOG:    2
    ARI:    9
    SPE:    12
    NOT:    9
    SYM:    99
    SEL:    1
    STR:    2
    CHR:    0
    U  :    2
    UL :    5
    ULL:    5
    I  :    17
    L  :    4
    LL :    4
    F  :    8
    D  :    3
    LD :    1

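Analysis: the statistics are correct. The program reports 100 lines and a character total of 1463 / 1843, where the first number excludes spaces and newlines and the second includes them; the remaining lines give the number of words in each token class.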