Homework 1

Purpose:

This first assignment is designed to put some of the ideas discussed in class into practice. It is also a chance to assess the PAG framework.

Problem:

You will implement an escape analysis for the While programming language. Escape analyses attempt to calculate the program region within which a heap allocated instance is referenced. Most languages have scope rules which define the program region that corresponds to the lifetime of a variable, but the lifetime of heap allocated data is not governed by scope rules.

The While language doesn't have heap allocated data and it only has global variables, so it's not a good choice for thinking about escape analysis. To make it more appropriate, we will adopt the following conventions:

  • All variables beginning with "L" will be considered local;
  • Assume that locals are named uniquely across the program;
  • All variables beginning with "A" will be considered allocators;
  • Assume that allocators appear on the RHS of assignments at the top-level, i.e., they are not embedded in an expression;
  • Parameters should be considered as locals (and named starting with "L"); and
  • All other variables are globals.
Given these conventions it is possible to write a program like:
program test
  proc a(L0) is
    L1 = ALLOC1;
    call b(L1);
    L1 = L0;
    g = L1;
  end

  proc b(L2) is
    L3 = L2;
  end 

  proc c(L4) is
    L5 = ALLOC2;
    g = L5;
  end

begin
  call a(5);
  call c(0);
end
Your analysis should be able to determine that values allocated at ALLOC1 are accessible within procedures a and b, but not within c or the main program. Similarly, your analysis should be able to determine that values allocated at ALLOC2 are accessible globally throughout the program.
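
To make the expected result concrete, here is a minimal Python sketch (not PAG or FULA code) that walks the example program and records, for each allocator, the procedures whose locals can hold its values. It assumes straight-line procedure bodies, no recursion, and a hypothetical tuple encoding of statements; the region name GLOBAL is an invented marker for values stored in a global variable.

```python
# Hypothetical encoding of the example program: each procedure has one
# parameter and a list of statements, either ("assign", lhs, rhs) or
# ("call", callee, argument).
PROCS = {
    "a": ("L0", [("assign", "L1", "ALLOC1"),
                 ("call", "b", "L1"),
                 ("assign", "L1", "L0"),
                 ("assign", "g", "L1")]),
    "b": ("L2", [("assign", "L3", "L2")]),
    "c": ("L4", [("assign", "L5", "ALLOC2"),
                 ("assign", "g", "L5")]),
}
MAIN = [("call", "a", 5), ("call", "c", 0)]


def is_allocator(x):
    return isinstance(x, str) and x.startswith("A")


def is_local(x):
    return isinstance(x, str) and x.startswith("L")


def analyze(procs, main):
    regions = {}      # allocator -> set of procedure names and/or "GLOBAL"
    globals_env = {}  # global variable -> allocators it may currently hold

    def value(expr, env):
        if is_allocator(expr):
            return {expr}
        if is_local(expr):
            return set(env.get(expr, set()))
        if isinstance(expr, str):
            return set(globals_env.get(expr, set()))
        return set()  # numeric constants carry no allocator

    def record(allocs, region):
        for a in allocs:
            regions.setdefault(a, set()).add(region)

    def run(stmts, env, region):
        for kind, target, operand in stmts:
            vals = value(operand, env)
            if kind == "call":
                param, body = procs[target]
                record(vals, target)             # argument reaches the callee
                run(body, {param: vals}, target)
            elif is_local(target):
                env[target] = vals               # overwrite: old contents are lost
                record(vals, region)
            else:
                globals_env[target] = vals
                record(vals, "GLOBAL")

    run(main, {}, "main")
    return regions
```

Under these assumptions, analyze(PROCS, MAIN) maps ALLOC1 to the regions a and b, and ALLOC2 to c and GLOBAL. The overwrite of L1 before g = L1 is what keeps ALLOC1 out of the global region, so statement order matters.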

Here is a collection of test files that should give you an idea of what the analysis should do. This is the output from my current implementation (I'm in the process of rewriting it to clean up certain things and I'll post updated results when they are ready; the answers themselves won't change, just their encoding in the tuples). Look at the tuples at the end node to extract the final escape information.

Getting Started:

As a general roadmap to solving this homework problem you should:
  • Study a couple of PAG analyses, e.g., RD and LV, and understand what they are doing;
  • Copy one of the analysis directories to your home area as a starting point;
  • Think about how you want to represent the information about allocators and the program regions their values flow to. In doing this you will want to consider how you are approximating the program behavior, e.g., heap allocated data is lumped into per-allocator equivalence classes, program regions are defined by procedure bodies;
  • Design your representation for data flow information using DATLA, in the .set file;
  • Decide how to configure the data flow framework in PAG, e.g., direction, combining operator, etc.;
  • Think about the effect of statements on the information and code them as transfer functions in the .optla file; and
  • Demonstrate that your analysis works by designing a variety of different test cases and show your analysis results.
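
The framework choices in this roadmap (direction, combining operator, transfer functions) can be tried out outside PAG first. The following is a minimal Python sketch of a forward analysis that combines facts with set union. It is not PAG's solver; the node names, the (variable, allocator) fact encoding, and the diamond-shaped flow graph are assumptions made purely for illustration.

```python
from collections import deque


def transfer(stmt, facts):
    """Effect of `lhs = rhs` on a set of (variable, allocator) pairs."""
    lhs, rhs = stmt
    kept = {(v, a) for (v, a) in facts if v != lhs}   # kill old facts about lhs
    if rhs.startswith("A"):                            # rhs is an allocator
        gen = {(lhs, rhs)}
    else:                                              # rhs is a variable
        gen = {(lhs, a) for (v, a) in facts if v == rhs}
    return frozenset(kept | gen)


def solve_forward(stmts, edges):
    """Worklist iteration: forward direction, union as combining operator."""
    preds = {n: [] for n in stmts}
    succs = {n: [] for n in stmts}
    for src, dst in edges:
        succs[src].append(dst)
        preds[dst].append(src)
    out = {n: frozenset() for n in stmts}
    work = deque(stmts)                 # start with every node on the worklist
    while work:
        n = work.popleft()
        # Combine the facts flowing in from all predecessors.
        in_facts = frozenset().union(*(out[p] for p in preds[n]))
        new_out = transfer(stmts[n], in_facts)
        if new_out != out[n]:
            out[n] = new_out
            work.extend(succs[n])       # successors must be revisited
    return out


# Hypothetical flow graph: n1 branches to n2 and n3, which join at n4.
STMTS = {
    "n1": ("L1", "ALLOC1"),
    "n2": ("L2", "L1"),
    "n3": ("L2", "ALLOC2"),
    "n4": ("g", "L2"),
}
EDGES = [("n1", "n2"), ("n1", "n3"), ("n2", "n4"), ("n3", "n4")]
```

Because the combining operator is union, the facts after n4 say that g may hold values from either ALLOC1 or ALLOC2. Choosing intersection instead would give a "must" analysis and lose both facts at the join.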
I realize that the PAG documentation is not perfect, but after approximately 5 hours of study and experimentation I was able to solve this problem. Don't give up when you run into a problem; ask a question or try an experiment with the tool.

Helpful Hints:

As you ask questions I'll update this list of hints. Here are a few to start with:
  • The SUPPORT code for the expression related analyses is very useful (study it). In particular, you can easily adapt it to determine whether the RHS of an assignment is a single variable. Some of it is directly useful; for example, expVar will convert an expression that is a single variable reference to the referenced Var.
  • I have defined two analysis support functions in the C libraries for While. To use them you need to include the following declarations in your SUPPORT code:
      isLocal :: Var -> bool;
      isAllocator :: Var -> bool;
    
    The first takes a variable and tells you if it is a local and the second tells you if it is an allocator. FULA doesn't have much in the way of string manipulation functions, so I had to implement these in C. The isLocal routine now also considers variable names beginning with # to indicate a local; this handles parameters.
  • The set comprehension syntax allows for local definitions. These are very useful for matching set elements and replacing them. See the manual for examples.
  • There are multiple ways to formulate the analysis. One approach is to use functions to represent data flow facts. Take a look at
       ~santos/PAG/PAG*/EXAMPLES/CLAX/interval/...
    for an example of an analysis that uses functions.
  • It is possible to include print statements in your transfer functions as an aid to debugging. You simply say something like print("lhs var is ".varString(lhs))... to introduce the side-effect of printing the string while evaluating the expression .... I implemented the varString and procString routines in the C libraries (as described above) so you'll need to add the following declarations in your SUPPORT code:
      varString :: Var -> str;
      procString :: Proc -> str;
    
    if you want to use them.
  • By default, PAG currently prints data flow information using the integer representations of the lowest level values, e.g., variable names, labels, procedure names. This can be overridden by making modifications to the main.c and makefile for your analyzer. Note that the name of my analysis is ESCAPE and the name of my dataflow information type is Info; you'll need to modify these files based on the names you choose.
  • You can access the current procedure id by introducing a NODE definition in your .optla file of the form:
      proc : Proc
    
    This is analogous to label. Note that the current procedure at a CALL node is the calling procedure (you can access the called procedure through the first parameter of the CALL node).
  • It is possible to traverse the flow graph and manipulate the calculated data. An example is available in the modified constant propagation analysis CP2. You'll need to look at interface.h to get an idea of the flow graph API. The manual describes it and the DFI_STORE API that can be used to access the data flow facts for a given flow graph node. Note that the CP2 example is a bit different from what I presented in class. Now I simply iterate over the result of calling KFG_NODE_LIST kfg_all_nodes(kfg).
  • For those of you who are trying to keep your PAG installation up to date with the one on the CIS machines, I'll post updates of syntax_tree.c that you will need to install in your local share/pag/frontend/while directory.
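
Several of the hints above can be mirrored in ordinary Python for experimentation before committing to FULA. This sketch is not FULA and does not use the PAG libraries: is_local and is_allocator only imitate the behaviour described for isLocal and isAllocator (including the # prefix), the dictionary stands in for a function-valued data flow fact, and the debug flag plays the role of a print inside a transfer function. All Python names here are hypothetical.

```python
def is_local(name):
    """Locals start with "L"; names starting with "#" are treated as locals too."""
    return name.startswith(("L", "#"))


def is_allocator(name):
    return name.startswith("A")


def assign(fact, lhs, rhs, debug=False):
    """Return a new fact (variable -> frozenset of allocators) after `lhs = rhs`."""
    if debug:
        print("lhs var is " + lhs)
    if is_allocator(rhs):
        new_value = frozenset({rhs})
    else:
        new_value = fact.get(rhs, frozenset())
    # Comprehension that matches the entry for lhs and replaces it,
    # leaving every other entry unchanged.
    updated = {v: (new_value if v == lhs else allocs) for v, allocs in fact.items()}
    updated.setdefault(lhs, new_value)   # lhs had no entry yet
    return updated


def escaped(fact):
    """Allocators held by at least one global (non-local) variable."""
    return {a for v, allocs in fact.items() if not is_local(v) for a in allocs}
```

For example, starting from an empty fact, assign(assign({}, "L1", "ALLOC1"), "g", "L1") yields a fact for which escaped returns the set containing ALLOC1. The input fact is never modified, which matches the side-effect-free style of transfer functions.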

Due date:

The assignment is due on Oct. 10, before class. We'll discuss people's solutions in class.

Solutions

7 solutions have been installed here.

Maintained by Matt Dwyer. Fri Aug 17 10:21:56 CDT 2001