Process C global variable declarations. This involves both installing the declarations into the symbol table and allocating memory in the assembly language output file. Also, after all declarations have been processed, you should dump the symbol table to stderr. Finally, you should detect any semantic errors and print appropriate error messages to stderr.
The controlling standard is the 1989 ANSI C standard.
Your compiler should read C source code from stdin and write Intel IA-32 assembly language output to stdout. Your compiler executable should be called scc (the slim C compiler).
To receive 70% of the credit: You must be able to process the following basic type specifiers: int, char, float, and double. You may limit the syntax so that only one type specifier may be given per declaration. You may also limit the syntax so that there is only one declarator per declaration. You must also be able to handle pointer and array type modifiers. Each declaration should include an identifier. If not, an error should be issued. A symbol table entry should be made for each identifier being declared. The entry should indicate the type of the declaration.
To receive 85% of the credit: In addition to obtaining the 70% level, you should also allow multiple type specifiers per declaration. You should handle the additional specifiers signed, unsigned, short, and long. You should add the necessary semantic checks and error messages to support multiple type specifiers (e.g. short short, unsigned double, etc. are illegal). In addition, your program should be capable of handling multiple declarators per declaration.
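One possible way to organize the specifier checks is to count how often each specifier appears in a declaration and then reject the combinations that C89 does not allow. The following is only a sketch: the enum spec values and the function specifiers_valid are hypothetical names, not part of the distributed base files, and your own representation of specifiers may look quite different.

```c
/* Hypothetical encoding of the type specifiers from the 70% and 85% levels. */
enum spec {
    S_INT, S_CHAR, S_FLOAT, S_DOUBLE,
    S_SIGNED, S_UNSIGNED, S_SHORT, S_LONG,
    S_COUNT
};

/* Returns 1 if the n specifiers form a legal C89 combination, 0 otherwise. */
int specifiers_valid(const enum spec *specs, int n)
{
    int count[S_COUNT] = {0};
    int i, base, sign, width;

    for (i = 0; i < n; i++)
        if (++count[specs[i]] > 1)
            return 0;                       /* e.g. short short */

    base  = count[S_INT] + count[S_CHAR] + count[S_FLOAT] + count[S_DOUBLE];
    sign  = count[S_SIGNED] + count[S_UNSIGNED];
    width = count[S_SHORT] + count[S_LONG];

    if (base > 1 || sign > 1 || width > 1)
        return 0;                           /* e.g. int char, signed unsigned */
    if (count[S_FLOAT] && (sign || width))
        return 0;                           /* float takes no other specifier */
    if (count[S_DOUBLE] && (sign || count[S_SHORT]))
        return 0;                           /* only long double is allowed */
    if (count[S_CHAR] && width)
        return 0;                           /* e.g. long char */
    return 1;
}
```

With this sketch, unsigned long int and long double are accepted, while short short and unsigned double are rejected. The error message itself would be issued by the caller, which knows the line number.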
To receive 100% of the credit: In addition to obtaining the 85% level, you should also support the function type modifier. A function return type can be declared for any type supported at the 70% and 85% levels. In addition, you should support the declaration of the void function return type. Parameters may be declared to be any type supported at the 70% and 85% levels (and can also use the function modifier). A 'void' parameter list should also be supported. You should add the necessary semantic checks and error messages to support function modifiers (e.g. it is illegal for a function to return a function, etc.).
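As an illustration of the function-modifier checks, the sketch below walks a chain of type modifiers and reports the first illegal adjacent pair. It assumes a hypothetical struct mod in which each modifier points to the type it modifies. Besides a function returning a function, it also flags a function returning an array and an array of functions, which C89 likewise forbids. The names here are invented for the example.

```c
#include <stddef.h>

/* Hypothetical modifier chain: outermost modifier first, base type last. */
enum mod_kind { M_BASE, M_POINTER, M_ARRAY, M_FUNCTION };

struct mod {
    enum mod_kind kind;
    const struct mod *inner;   /* what this modifier applies to; NULL for M_BASE */
};

/* Returns a description of the first illegal combination, or NULL if none. */
const char *check_modifiers(const struct mod *t)
{
    for (; t != NULL && t->inner != NULL; t = t->inner) {
        if (t->kind == M_FUNCTION && t->inner->kind == M_FUNCTION)
            return "function returning a function";
        if (t->kind == M_FUNCTION && t->inner->kind == M_ARRAY)
            return "function returning an array";
        if (t->kind == M_ARRAY && t->inner->kind == M_FUNCTION)
            return "array of functions";
    }
    return NULL;
}
```

Note that a function returning a pointer to a function passes this check, because the pointer sits between the two function modifiers.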
When dumping the symbol table, print both the name and the type of each declared identifier. Print one identifier per line. Print the type in quasi-English. Here are some examples:
The Intel IA-32 assembly code to be emitted for this assignment is relatively simple. Start your output file with:
.data
For each declared variable, emit:
.comm name, size, align
where name is the name of the variable, size is the size of the variable in bytes, and align is the same value as size, except for long double, where align is 16. Note that no code is generated for functions. Function declarations simply update the symbol table.
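The directive described above could be produced by a small helper such as the following sketch. The function name format_comm and its is_long_double flag are assumptions made for the example; in your compiler the flag would come from the symbol's type, and you would probably write straight to stdout rather than into a buffer.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes one ".comm name, size, align" line into buf.
 * Returns the value of snprintf (the number of characters needed). */
int format_comm(char *buf, size_t buflen, const char *name,
                size_t size, int is_long_double)
{
    /* align equals size, except for long double, where it is 16 */
    size_t align = is_long_double ? 16 : size;

    return snprintf(buf, buflen, ".comm %s, %lu, %lu\n",
                    name, (unsigned long)size, (unsigned long)align);
}
```

For a variable x of type int this yields ".comm x, 4, 4", and for a long double named ld it yields ".comm ld, 12, 16".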
You must calculate the size of a variable from its type structure. For base types: int and long int are 4 bytes; char is 1 byte; short is 2 bytes; float is 4 bytes; double is 8 bytes; and long double is 12 bytes. Pointers are always 4 bytes. The size of an array is equal to the size of an element times the dimension of the array.
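A minimal sketch of this size calculation follows. It assumes a hypothetical struct type in which each pointer or array modifier points to the type it modifies; the enum and field names are invented for the example and are not taken from the base files. Signed and unsigned variants are assumed to share the size of their base type.

```c
#include <stddef.h>

/* Hypothetical type structure: a chain of modifiers ending in a base type. */
enum type_kind {
    T_CHAR, T_SHORT, T_INT, T_LONG, T_FLOAT, T_DOUBLE, T_LONG_DOUBLE,
    T_POINTER, T_ARRAY
};

struct type {
    enum type_kind kind;
    size_t dimension;            /* number of elements; used only for T_ARRAY */
    const struct type *inner;    /* element or pointee type; NULL for base types */
};

/* Size in bytes, following the sizes listed in the assignment. */
size_t type_size(const struct type *t)
{
    switch (t->kind) {
    case T_CHAR:        return 1;
    case T_SHORT:       return 2;
    case T_INT:
    case T_LONG:
    case T_FLOAT:       return 4;
    case T_DOUBLE:      return 8;
    case T_LONG_DOUBLE: return 12;
    case T_POINTER:     return 4;   /* regardless of what it points to */
    case T_ARRAY:       return t->dimension * type_size(t->inner);
    }
    return 0;   /* not reached for the kinds above */
}
```

For example, an array of 10 double has size 10 * 8 = 80 bytes, while a pointer to that array is still 4 bytes.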
At all levels you are responsible for detecting duplicate declarations. Note that a declaration for a given identifier is duplicate only if there is a prior declaration for the identifier and the two declarations assign different types to the identifier.
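Detecting a duplicate therefore comes down to comparing two type structures. The sketch below assumes a hypothetical struct type chain and treats two types as the same only when every modifier and every array dimension matches; that strictness about dimensions is an assumption of the example. The names same_type and classify are invented here.

```c
#include <stddef.h>

/* Hypothetical type structure: a chain of modifiers ending in a base type. */
enum type_kind { T_CHAR, T_INT, T_FLOAT, T_DOUBLE, T_POINTER, T_ARRAY };

struct type {
    enum type_kind kind;
    size_t dimension;            /* used only for T_ARRAY */
    const struct type *inner;    /* NULL for base types */
};

/* Returns 1 when the two type chains are structurally identical. */
int same_type(const struct type *a, const struct type *b)
{
    while (a != NULL && b != NULL) {
        if (a->kind != b->kind)
            return 0;
        if (a->kind == T_ARRAY && a->dimension != b->dimension)
            return 0;
        a = a->inner;
        b = b->inner;
    }
    return a == NULL && b == NULL;
}

enum decl_status { DECL_NEW, DECL_REPEAT, DECL_DUPLICATE };

/* prior is the type already in the symbol table, or NULL if there is none. */
enum decl_status classify(const struct type *prior, const struct type *now)
{
    if (prior == NULL)
        return DECL_NEW;
    return same_type(prior, now) ? DECL_REPEAT : DECL_DUPLICATE;
}
```

Only DECL_DUPLICATE would trigger an error message; DECL_REPEAT corresponds to a harmless redeclaration with the same type.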
Your compiler should be capable of detecting multiple semantic errors in one file. You can make arbitrary decisions about how to proceed when errors occur (for instance, with a duplicate declaration you might decide to ignore the second declaration). The important point is to do something so you can proceed (without causing a later 'bus error', 'segmentation fault', etc.).
All semantic error messages should include an estimate of the line number where the error occurred. Support this requirement by adding the current line number (available from the scanner) to each AST node that you build. Because of the vagaries of how the parsing algorithm works, the line number available to you in the scanner is only an estimate and may be slightly wrong at times. It is good enough for our purposes.
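One way to carry the line number is to store it in the node at construction time, as in the sketch below. The struct ast_node layout, the make_node and semantic_error names, and the message format are all assumptions for illustration. In a parser action you would pass whatever line counter the distributed scanner exposes as the line argument.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical AST node that records the scanner's line number. */
struct ast_node {
    int kind;
    int line;                        /* estimate taken when the node is built */
    struct ast_node *left, *right;
};

struct ast_node *make_node(int kind, int line,
                           struct ast_node *left, struct ast_node *right)
{
    struct ast_node *n = malloc(sizeof *n);

    if (n == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    n->kind = kind;
    n->line = line;
    n->left = left;
    n->right = right;
    return n;
}

/* Reports a semantic error using the line stored in the offending node. */
void semantic_error(const struct ast_node *n, const char *msg)
{
    fprintf(stderr, "line %d: %s\n", n->line, msg);
}
```

Because the line lives in the node, the semantic pass can report errors long after the scanner has moved on.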
You may allow the compiler to stop processing with the first syntax error. A syntax error is defined with respect to the distributed C grammar.
In ~cs712/public/phase1 are the base files for the project. They include a flex scanner, a skeleton bison parser, and a sample Makefile. The flex scanner will strip out comments, but it does not handle any of the C preprocessor commands.
In ~cs712/public/phase1 are also the public test files. Remember there will be hidden test files as well, so do your own testing too.
You will need the 70% level functionality in order to do later parts of the project. So be sure you at least get that much of the assignment completed.
Remember: you get credit for features successfully implemented. You do not get credit for attempting to do something; you get credit for the things that you can successfully demonstrate work.
Also remember: as always you are expected to do your own work on this assignment.
Finally: you should adequately document and structure your program.
You must turn in all source files (including the scanner and parser) and the Makefile for your compiler. Please place all your files into a tar file called src.tar that un-tars into a directory called src. The default goal of the Makefile should be to build the compiler executable called scc in the src directory.
To turn in this assignment, type:
~cs712/bin/submit phase1 src.tar
Submissions can be checked by typing (also on agate.cs.unh.edu):
~cs712/bin/scheck phase1
Note: You must approach this assignment by first completing the 70% level, then the 85% level, and finally the 100% level. First, I believe this will most likely lead to the greatest success for most of you. You need to get a piece of the assignment working and then add to it as much as you can until you run out of time (or energy). (Do not try to sit down and do the whole assignment at once!) Second, this is how I will grade it. For example, my tests for the 85% level will assume you have the 70% level done.
The assignment is due on Sunday February 12. There is a grace period to 8am on Monday February 13 when no late penalty will be assigned. Submissions between 8am February 13 and 8am February 14 will have a late penalty of 15 points. Submissions between 8am February 14 and 8am February 15 will have a late penalty of 30 points. No program may be turned in after 8am on Wednesday February 15.
Remember: you are expected to do your own work on this assignment.
Comments and questions should be directed to hatcher@unh.edu