Homework 1
Purpose:
This first assignment is designed to put some of the ideas
discussed in class into practice. It is also a chance
to make an assessment of the PAG framework.
Problem:
You will implement an escape analysis for the While
programming language. Escape analyses attempt to calculate
the program region within which a heap allocated instance is
referenced. Most languages have scope rules which define
the program region that corresponds to the lifetime
of a variable, but the lifetime of heap allocated data is
not governed by scope rules.
The While language doesn't have heap allocated
data and it only has global variables, so it's not a good
choice for thinking about escape analysis. To make it
more appropriate, we will adopt the following conventions:
- All variables beginning with "L" will be considered local;
- Assume that locals are named uniquely across the program;
- All variables beginning with "A" will be considered allocators;
- Assume that allocators appear on the RHS of assignments at the top-level,
i.e., they are not embedded in an expression;
- Parameters should be considered as locals (and named starting
with "L"); and
- All other variables are globals.
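As a rough illustration of these naming conventions, the following Python sketch classifies a variable name by its first letter. The function name classify is hypothetical and is not part of PAG or the While front end; it only restates the rules listed above.

```python
def classify(name: str) -> str:
    """Classify a While variable name using the homework's naming conventions."""
    if name.startswith("L"):
        return "local"      # locals and parameters begin with "L"
    if name.startswith("A"):
        return "allocator"  # allocators begin with "A"
    return "global"         # every other variable is a global


for name in ["L1", "ALLOC1", "g"]:
    print(name, classify(name))
```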
Given these conventions it is possible to write a program like:
    program test
      proc a(L0) is
        L1 = ALLOC1;
        call b(L1);
        L1 = L0;
        g = L1;
      end
      proc b(L2) is
        L3 = L2;
      end
      proc c(L4) is
        L5 = ALLOC2;
        g = L5;
      end
    begin
      a(5);
      c(0);
    end
Your analysis should be able to determine that values
allocated at ALLOC1 are accessible within procedures a and b,
but not within c or the main program. Similarly,
your analysis should be able to determine that
values allocated at ALLOC2 are accessible
globally throughout the program.
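To make the expected result concrete, here is a minimal Python sketch that walks the example program and records where each allocator's values can be referenced. It is not a PAG analysis and not the solution to the assignment: it assumes straight-line, non-recursive code and simply follows calls in order, overwriting a variable's contents on each assignment. The program encoding, the "main" and "GLOBAL" labels, and all function names are assumptions made for illustration.

```python
# Hypothetical encoding of the example: name -> (parameter, statements).
# A statement is ("assign", lhs, rhs) or ("call", callee, argument).
PROGRAM = {
    "a": ("L0", [("assign", "L1", "ALLOC1"), ("call", "b", "L1"),
                 ("assign", "L1", "L0"), ("assign", "g", "L1")]),
    "b": ("L2", [("assign", "L3", "L2")]),
    "c": ("L4", [("assign", "L5", "ALLOC2"), ("assign", "g", "L5")]),
    "main": (None, [("call", "a", "5"), ("call", "c", "0")]),
}


def analyze(program):
    env = {}      # variable -> set of allocators it currently may hold
    regions = {}  # allocator -> set of procedure names and/or "GLOBAL"

    def value_of(operand):
        if operand.startswith("A"):           # an allocator names itself
            return {operand}
        return set(env.get(operand, set()))   # constants hold no allocator

    def record(allocators, region):
        for alloc in allocators:
            regions.setdefault(alloc, set()).add(region)

    def run(proc):
        _, body = program[proc]
        for stmt in body:
            if stmt[0] == "assign":
                _, lhs, rhs = stmt
                held = value_of(rhs)
                env[lhs] = held               # overwrite: old contents are lost
                record(held, proc if lhs.startswith("L") else "GLOBAL")
            else:
                _, callee, arg = stmt
                param, _ = program[callee]
                held = value_of(arg)
                env[param] = held             # bind the actual to the formal
                record(held, callee)
                run(callee)

    run("main")
    return regions


print(analyze(PROGRAM))
```

Under these assumptions ALLOC1 is recorded only for procedures a and b, because L1 is overwritten with L0 before it is copied to g, while ALLOC2 is recorded for c and for the global region. A real solution has to handle branches, loops, and recursion, which is what the data flow framework provides.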
Here is a collection of test files that should give you an
idea of what the analysis should do. This is the output from
my current implementation (I'm in the process of rewriting
it to clean up certain things and I'll post updated results
when they are ready; the answers themselves won't change, just
their encoding in the tuples). Look at the tuples at the
end node to extract the final escape information.
Getting Started:
As a general roadmap to solving this homework problem you should:
- Study a couple of PAG analyses, e.g., RD and LV, and understand
what they are doing;
- Copy one of the analysis directories to your home area as a
starting point;
- Think about how you want to represent the information about
allocators and the program regions their values flow to. In doing
this you will want to consider how you are approximating
the program behavior, e.g.,
heap allocated data is lumped into per-allocator equivalence
classes, program regions are defined by procedure bodies;
- Design your representation for data flow information using
DATLA, in the .set file;
- Decide how to configure the data flow framework in PAG,
e.g., direction, combining operator, etc.;
- Think about the effect of statements on the information
and code them as transfer functions in the .optla file; and
- Demonstrate that your analysis works by designing a variety
of different test cases and showing your analysis results.
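The representation, combining operator, and transfer function steps above can be sketched outside PAG as follows. This Python sketch assumes one possible design, namely facts stored as a set of (variable, allocator) pairs with set union as the combining operator. The names join and transfer_assign are hypothetical, and the code says nothing about DATLA or .optla syntax.

```python
def join(left, right):
    """Combining operator: a pair holds after a merge if it holds on either path."""
    return left | right


def transfer_assign(facts, lhs, rhs):
    """Effect of 'lhs = rhs' where rhs is an allocator or a single variable."""
    # Kill: lhs no longer holds whatever it held before the assignment.
    kept = {(var, alloc) for (var, alloc) in facts if var != lhs}
    if rhs.startswith("A"):
        # Gen: lhs now holds a value allocated at rhs.
        generated = {(lhs, rhs)}
    else:
        # Copy: lhs now holds everything rhs may hold.
        generated = {(lhs, alloc) for (var, alloc) in facts if var == rhs}
    return frozenset(kept | generated)


# Two branches assign different allocators to L1, then g = L1 after the merge.
entry = frozenset()
then_branch = transfer_assign(entry, "L1", "ALLOC1")
else_branch = transfer_assign(entry, "L1", "ALLOC2")
merged = join(then_branch, else_branch)
after = transfer_assign(merged, "g", "L1")
print(sorted(after))
```

Choosing union makes this a "may" analysis: after the merge, L1 is assumed to hold values from either allocator, so the copy into g makes both allocators reachable through a global.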
I realize that the PAG documentation is not perfect, but
after approximately 5 hours of study and experimentation
I was able to solve this problem. Don't give up when
you run into a problem; ask a question or try an experiment
with the tool.
Helpful Hints:
As you ask questions I'll update this list of hints. Here
are a few to start with:
- The SUPPORT code for the expression related
analyses is very useful (study it). In particular, you can easily
adapt it to determine whether the RHS of an assignment is
a single variable. Some of it is directly useful;
for example, expVar will convert an expression
that is a single variable reference to the
referenced Var.
- I have defined two analysis support functions in the C libraries
for While. To use them you need to include
the following declarations in your SUPPORT code:
isLocal :: Var -> bool;
isAllocator :: Var -> bool;
The first takes a variable and tells you if it is a local and
the second tells you if it is an allocator. FULA doesn't have
much in the way of string manipulation functions, so I had
to implement these in C.
The isLocal routine now considers variable names beginning
with # to indicate a local; this handles parameters.
- The set comprehension syntax allows for local
definitions. These are very useful for matching
set elements and replacing them. See the manual for
examples.
- There are multiple ways to formulate the analysis. One
approach is to use functions to represent data flow facts.
Take a look at
~santos/PAG/PAG*/EXAMPLES/CLAX/interval/...
for an example of an analysis that uses functions.
- It is possible to include print statements
in your transfer functions as an aid to debugging.
You simply say something like
print("lhs var is ".varString(lhs))...
to introduce the side effect of printing the string while
evaluating the expression ... that follows. I implemented
the varString and procString routines in the C libraries (as
described above) so you'll need to add
the following declarations in your SUPPORT code:
varString :: Var -> str;
procString :: Proc -> str;
if you want to use them.
- PAG currently prints data flow information using the
integer representations of the lowest level values,
e.g., variable names, labels, procedure names, by default.
This can be overridden by making modifications to the
main.c and makefile for your
analyzer. Note that the name of my analysis is ESCAPE
and the name of my data flow information type is Info;
you'll need to modify these files based on the names you choose.
- You can access the current procedure id by introducing a
NODE definition in your .optla file of
the form:
proc : Proc
This is analogous to label.
Note that the current procedure at a CALL node
is the calling procedure (you can access the called procedure
through the first parameter of the CALL node).
- It is possible to traverse the flow graph and manipulate
the calculated data. An example is available in the modified
constant propagation analysis CP2. You'll need
to look at interface.h to get
an idea of the flow graph API. The manual describes it
and the DFI_STORE API that can be used to
access the data flow facts for a given flow graph node.
Note that the CP2 example is a bit different
from what I presented in class. Now I simply iterate
over the result of calling KFG_NODE_LIST kfg_all_nodes(kfg).
- For those of you who are trying to keep your PAG installation
up to date with the one on the CIS machines, I'll post
updates of syntax_tree.c
that you will need to install in your local
share/pag/frontend/while directory.
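One of the hints above suggests representing data flow facts as functions. As a rough picture of what that could mean for this problem, the Python sketch below models a fact as a map from each allocator to the set of regions its values reach, combined pointwise. The names join and add_region and the use of a dictionary are assumptions for illustration; the sketch does not show how PAG or the interval example encode functions.

```python
def join(f, g):
    """Pointwise union: an allocator reaches a region if either input says so."""
    return {alloc: f.get(alloc, frozenset()) | g.get(alloc, frozenset())
            for alloc in f.keys() | g.keys()}


def add_region(f, allocator, region):
    """Return a copy of f in which values from allocator also reach region."""
    updated = dict(f)
    updated[allocator] = updated.get(allocator, frozenset()) | {region}
    return updated


left = add_region({}, "ALLOC1", "a")
right = add_region(add_region({}, "ALLOC1", "b"), "ALLOC2", "c")
print(join(left, right))
```

Compared with a flat set of tuples, the map form makes the question "which regions can see this allocator?" a single lookup, at the cost of a slightly more involved combining operator.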
Due date:
The assignment is due on Oct. 10, before class. We'll discuss
people's solutions in class.
Solutions
7 solutions have been installed here.