Add support to your T compiler for statements in the main block. You should support block, empty, expression, if-then-else, while, return, out, break and continue statements.
You only need to support integer types in expressions. Expressions can contain integer literals, parenthesized expressions, main-block variable names, unary minus operators, logical complement operators, multiplication operators, division operators, addition operators, subtraction operators, less-than operators, greater-than operators, equality operators and assignment operators.
For full credit, also add support for arrays of integers. This requires supporting array assignment, array creation expressions and array access expressions. You should support multidimensional arrays of integers.
You should extend your scanner to recognize the additional tokens required for this phase. The scanner should still be capable of running in stand-alone mode.
You should extend your parser to recognize the additional syntax required for this phase. Your parser should build an AST for the whole program. The compiler may stop at the first parse error.
The semantic analysis required by this phase is implemented by an additional traversal of the AST that is performed after the analysis of class declarations. Move the analysis of main-block variable declarations into this traversal while also analyzing the statements found within the main block. Remember that main-block variables have scope from their declaration points to the end of the main block.
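The declaration-point scoping rule can be illustrated with a minimal sketch. It assumes a flat table that is filled in as the traversal meets each declaration, so a use that comes before its declaration simply fails the lookup. The names VarTable, varDeclare and varLookup, and the fixed capacity, are hypothetical and are not part of the assignment.

```c
#include <string.h>

/* Hypothetical flat symbol table for main-block variables. */
#define MAX_VARS 256

typedef struct {
    const char *name;   /* variable name (not copied; caller keeps it alive) */
    int declLine;       /* line of the declaration, for error messages */
} VarEntry;

typedef struct {
    VarEntry entries[MAX_VARS];
    int count;
} VarTable;

void varTableInit(VarTable *t) { t->count = 0; }

/* Returns the index of name, or -1 if it has not been declared yet. */
int varLookup(const VarTable *t, const char *name) {
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].name, name) == 0) return i;
    return -1;
}

/* Called when the traversal reaches a declaration.
   Returns the new index, -1 for a duplicate name, -2 if the table is full. */
int varDeclare(VarTable *t, const char *name, int line) {
    if (varLookup(t, name) >= 0) return -1;
    if (t->count == MAX_VARS) return -2;
    t->entries[t->count].name = name;
    t->entries[t->count].declLine = line;
    return t->count++;
}
```

Because entries are added only when the traversal reaches a declaration, a variable is visible from its declaration point to the end of the main block without any extra bookkeeping.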
The semantic analysis should annotate and transform the AST to fully specify the semantics of the input program. Even though int is the only type that appears in expressions in this phase, you should still label every expression AST node with its type. (We will introduce class types in a later phase.) You should also introduce Dereference nodes where appropriate to distinguish the value of a variable from the address of a variable.
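As one possible illustration of the Dereference transformation, the sketch below treats a name node as an address and wraps it in a Dereference node wherever a value is required, while the left side of an assignment is left as an address. The node kinds, the Node layout and the function names are assumptions made for this example, and every variable is assumed to be of int type.

```c
#include <stdlib.h>

typedef enum { N_INT_LITERAL, N_NAME, N_DEREF, N_ADD, N_ASSIGN } NodeKind;
typedef enum { T_UNKNOWN, T_INT, T_ERROR } TypeTag;

typedef struct Node {
    NodeKind kind;
    TypeTag type;               /* filled in by analyze() */
    struct Node *left, *right;
} Node;

Node *newNode(NodeKind kind, Node *left, Node *right) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->kind = kind;
    n->type = T_UNKNOWN;
    n->left = left;
    n->right = right;
    return n;
}

/* A name denotes an address; wrap it when its value is needed. */
Node *asValue(Node *n) {
    if (n->kind != N_NAME) return n;
    Node *d = newNode(N_DEREF, n, NULL);
    d->type = n->type;
    return d;
}

/* Labels every node with a type and inserts Dereference nodes. */
Node *analyze(Node *n) {
    switch (n->kind) {
    case N_INT_LITERAL:
    case N_NAME:                /* assumption: all variables are int */
        n->type = T_INT;
        break;
    case N_DEREF:
        n->left = analyze(n->left);
        n->type = n->left->type;
        break;
    case N_ADD:                 /* both operands are used as values */
        n->left = asValue(analyze(n->left));
        n->right = asValue(analyze(n->right));
        n->type = T_INT;
        break;
    case N_ASSIGN:              /* left side stays an address */
        n->left = analyze(n->left);
        n->right = asValue(analyze(n->right));
        n->type = T_INT;
        break;
    }
    return n;
}
```

For an input such as x = y + 1, this leaves x as a bare name under the assignment and replaces y with a Dereference node whose child is y.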
You should try to identify all semantic errors in the input. When an error is found, report it with the appropriate line number and an appropriate error message. (Sometimes because of how flex and bison work, your line number may unavoidably be off by one. This is okay.) Error messages should be written to stderr. Try to avoid a cascade of error messages caused by a single error.
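One common way to avoid cascading messages is to give an erroneous expression a special error type and to stay silent whenever an operand already has that type. The sketch below assumes this technique; the type names, the message text and the message format are hypothetical and are not mandated by this assignment.

```c
#include <stdio.h>

typedef enum { TY_INT, TY_INT_ARRAY, TY_ERROR } Ty;

int errorCount = 0;   /* number of messages actually reported */

/* Reports one semantic error to stderr (hypothetical message format). */
void semanticError(int line, const char *msg) {
    fprintf(stderr, "%d: %s\n", line, msg);
    errorCount++;
}

/* Computes the result type of an arithmetic operator. */
Ty checkArithmetic(int line, Ty left, Ty right) {
    /* An operand of error type was already reported: say nothing more. */
    if (left == TY_ERROR || right == TY_ERROR) return TY_ERROR;
    if (left != TY_INT || right != TY_INT) {
        semanticError(line, "operands of arithmetic operator must be int");
        return TY_ERROR;
    }
    return TY_INT;
}
```

With this approach a single bad operand deep inside a large expression produces one message, and the error type simply propagates up to the enclosing nodes.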
You should continue to support the command-line options mandated by Phase 1. In addition, add a command-line option ("-after") that allows the user to dump the AST after semantic analysis is done. The dump should include the type for all expression AST nodes. The AST should be dumped in prefix form, one AST node per line. The dump should be printed to stderr. Print a blank line to stderr before starting the dump. It should be possible to dump an AST even if the input program contained semantic errors. You may have to make arbitrary decisions about how to set fields of the AST when there is an error, but it should still be possible to do the dump without something bad happening (like a segfault).
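A prefix dump amounts to a pre-order traversal: print the node, then its children. The sketch below is only an illustration; the DumpNode layout, the label strings and the exact text printed on each line are assumptions, since this specification does not fix them. Missing children are tolerated so that the dump still works on an AST built from erroneous input.

```c
#include <stdio.h>

typedef struct DumpNode {
    const char *label;      /* e.g. "Add" (hypothetical label text) */
    const char *typeName;   /* e.g. "int"; NULL for non-expression nodes */
    struct DumpNode *left, *right;
} DumpNode;

/* Pre-order traversal: the node first, then its children.
   Returns the number of nodes printed. */
int dumpPrefix(FILE *out, const DumpNode *n) {
    if (n == NULL) return 0;   /* tolerate missing children after errors */
    if (n->typeName != NULL)
        fprintf(out, "%s %s\n", n->label, n->typeName);
    else
        fprintf(out, "%s\n", n->label);
    return 1 + dumpPrefix(out, n->left) + dumpPrefix(out, n->right);
}

/* Prints a blank line to stderr, then the dump itself. */
int dumpAfter(const DumpNode *root) {
    fputc('\n', stderr);
    return dumpPrefix(stderr, root);
}
```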
If there are no semantic errors, then perform code generation by traversing the AST. Linux assembly code should be generated for the Intel IA-32 architecture. We will discuss in class all Intel instructions that will be required to complete this assignment. The output code should be written to stdout. You should provide a C source file that implements the run-time environment for the generated code. This file must be called "RTS.c" and it will be compiled and linked with the output of the code generator in order to execute a compiled program.
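As a rough illustration of code generation by AST traversal, the sketch below uses a simple stack-based strategy: each expression leaves its result on top of the run-time stack. This is one common approach and not necessarily the one discussed in class. The GenNode layout and the function name are hypothetical, and only integer literals, addition and subtraction are covered. The output uses AT&T syntax, as accepted by the GNU assembler.

```c
#include <stdio.h>

typedef enum { G_INT_LITERAL, G_ADD, G_SUB } GenKind;

typedef struct GenNode {
    GenKind kind;
    int value;                      /* used by G_INT_LITERAL only */
    struct GenNode *left, *right;
} GenNode;

/* Emits code that leaves the value of the expression on top of the stack.
   Returns the number of instructions emitted. */
int genExpr(FILE *out, const GenNode *n) {
    int count;
    switch (n->kind) {
    case G_INT_LITERAL:
        fprintf(out, "\tpushl\t$%d\n", n->value);
        return 1;
    case G_ADD:
    case G_SUB:
        count = genExpr(out, n->left) + genExpr(out, n->right);
        fprintf(out, "\tpopl\t%%ecx\n");    /* right operand */
        fprintf(out, "\tpopl\t%%eax\n");    /* left operand */
        /* AT&T operand order: eax = eax (op) ecx */
        fprintf(out, "\t%s\t%%ecx, %%eax\n",
                n->kind == G_ADD ? "addl" : "subl");
        fprintf(out, "\tpushl\t%%eax\n");
        return count + 4;
    }
    return 0;
}
```

Calling genExpr(stdout, root) writes the instructions to stdout, which is where this phase expects the generated code.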
You should again submit a Makefile for building both the stand-alone scanner ("lexdbg") and the whole compiler ("tc").
Provide a README file (called "README") that explains the current state of your compiler. Describe how well your compiler fulfills the requirements of this assignment.
The four required command-line options ("-before", "-after", "-classes" and "-main") can be given in any order if more than one is supplied.
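Order-independent option handling can be sketched as a single loop over argv that sets one flag per option. The Options structure, the function name and the treatment of unrecognized arguments below are assumptions made for illustration only.

```c
#include <string.h>

typedef struct {
    int before, after, classes, mainBlock;   /* 1 if the option was given */
} Options;

/* Scans all arguments, in whatever order they appear.
   Returns 0 on success, or the argv index of the first unrecognized
   argument (how to treat such arguments is an assumption here). */
int parseOptions(int argc, char *argv[], Options *opts) {
    memset(opts, 0, sizeof *opts);
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-before") == 0)       opts->before = 1;
        else if (strcmp(argv[i], "-after") == 0)   opts->after = 1;
        else if (strcmp(argv[i], "-classes") == 0) opts->classes = 1;
        else if (strcmp(argv[i], "-main") == 0)    opts->mainBlock = 1;
        else return i;
    }
    return 0;
}
```

Because each option only sets a flag, "tc -main -after" and "tc -after -main" produce the same Options value.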
Archive all your files in a tar file called "phase2.tar". This tar file should un-tar into a single directory called "phase2", which should contain the Makefile, the README file, the RTS.c file, and all the source files. All submitted files should be placed directly in the top level of this directory. (That is, please do not use subdirectories.) You should submit your tar archive from agate.cs.unh.edu using my "submit" script. Please note: the tar file you submit should not be compressed. Also, please do not include any executable or object files in the tar file.
To turn in this assignment, type:
~cs712/bin/submit phase2 phase2.tar
Submissions can be checked by typing (also on agate.cs.unh.edu):
~cs712/bin/scheck phase2
Please read this specification carefully and try to follow it exactly. I will use scripts to test your compiler and therefore it is important that you follow my directions for the file names, command-line options, Makefile goals, tar file name, etc. (Points may be deducted if you do not follow the directions.)
To obtain 60% of the points for this assignment, you should support integer literals, addition operators, subtraction operators, parenthesized expressions, unary minus operators, empty statements, return statements and out statements. This includes detecting semantic errors involving these language elements.
To obtain 70% of the points for this assignment, you should first complete the 60% requirements and then add support for main-block-variable names in expressions, assignment operators, multiplication operators and division operators. You may assume that all main-block variables will be of int type. This includes detecting semantic errors involving these language elements.
To obtain 80% of the points for this assignment, you should first complete the 70% requirements and then add support for logical complement operators, less-than operators, greater-than operators and equality operators. This includes detecting semantic errors involving these language elements.
To obtain 90% of the points for this assignment, you should first complete the 80% requirements and then add support for if-then-else statements, while statements, block statements, break statements and continue statements. This includes detecting semantic errors involving these language elements.
To obtain 100% of the points for this assignment, you should first complete the 90% requirements and then add support for main-block variables that are integer arrays, array-creation expressions, and array-index expressions. Also add support for detecting semantic errors involving arrays, including the misuse of array types in the other operators and statements.
Note: You must approach this assignment by first completing the 60% level, then the 70% level, then the 80% level, then the 90% level, and finally the 100% level. First, I believe this will most likely lead to the greatest success for most of you. You need to get a piece of the assignment working and then add to it as much as you can until you run out of time (or energy). (Do not try to sit down and do the whole assignment at once!) Second, this is how I will grade it. For example, my tests for the 80% level will assume you have the 70% level done.
There will be two submissions for this assignment. You should submit what you have prior to 8am on Saturday March 14. No late submissions will be accepted for this first submission. I will test this submission for the 60% and the 70% levels of this assignment. Your final grade for these levels will be the average of the grades for this initial submission and the final submission.
To receive full credit for the final submission of this assignment, you must turn in your files prior to 8am on Monday March 30. Late submissions will be accepted at a penalty of 2 points for one day late, 5 points for two days late, 10 points for three days late, 20 points for four days late, and 40 points for five days late. No program may be turned in more than 5 days late.
Remember: you are expected to do your own work on this assignment.
Comments and questions should be directed to hatcher@unh.edu