Report 2010-1
Abstract parsing of string updates and user input
by Kyung-Goo Doh, Hyunha Kim, and David A. Schmidt
Abstract:
We extend our formulation of demand-driven, static-analysis-based,
abstract parsing of the strings generated by PHP scripts to include
strings that are generated from string-replacement operators and
user input. Our approach combines
LR(k)-parsing technology and data-flow analysis
to analyze, in advance of execution, the documents generated dynamically
by a script. String-replacement operations are computed statically
by composing the finite-state automaton defined by a string
replacement with the finite-state control of the LR(k)-parser,
and user input is predicted and processed by characterizing the input
by an LR(k)-grammar and analyzing the strings generated by the grammar.
Our work is implemented in Objective Caml.
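
The composition step can be illustrated with a much simplified sketch.
It is not the authors' Objective Caml implementation: it assumes a plain
deterministic finite automaton in place of the finite-state control of an
LR(k)-parser, and a replacement that maps single characters to strings.
All names (run, compose, accepts, DELTA, SUBST) are hypothetical. The
composed automaton reads the original string but moves as if it had read
the replaced string, so the effect of the replacement is decided without
executing it.

```python
def run(delta, state, text):
    """Extended transition function; returns None if the automaton gets stuck."""
    for ch in text:
        state = delta.get((state, ch))
        if state is None:
            return None
    return state


def compose(delta, states, alphabet, subst):
    """Build delta'(q, c) = delta*(q, subst(c)).

    Characters without an entry in subst are left unchanged.
    """
    composed = {}
    for q in states:
        for c in alphabet:
            target = run(delta, q, subst.get(c, c))
            if target is not None:
                composed[(q, c)] = target
    return composed


def accepts(delta, start, finals, text):
    return run(delta, start, text) in finals


# Assumed toy automaton: accepts strings that contain no raw '<'.
STATES = {"ok"}
ALPHABET = set("a<&lt;")
DELTA = {("ok", c): "ok" for c in "a&lt;"}

# Assumed toy replacement: escape '<' as '&lt;'.
SUBST = {"<": "&lt;"}

COMPOSED = compose(DELTA, STATES, ALPHABET, SUBST)

raw = "a<a"
rejected_raw = accepts(DELTA, "ok", {"ok"}, raw)
accepted_after_replacement = accepts(COMPOSED, "ok", {"ok"}, raw)
```

Running the composed automaton on the unescaped string gives the same
verdict as running the original automaton on the escaped string, which
is the property the composition is meant to provide.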
Report 2010-2
Modular, parsing-based, flow analysis of dictionary data structures in scripting languages
by David A. Schmidt
Abstract:
We design and implement a modular, constant-propagation-like
forwards flow analysis for a Python subset containing strings
and dictionaries (hash tables). The analysis infers types of
dictionaries and the functions and modules that use them.
Unlike records and class-based objects, dictionaries are wholly
dynamic, and we employ a domain of dictionary types that delineate
which fields a dictionary must have. We have deliberately omitted
unification-based inference and row variables to obtain the benefits
of a forwards analysis that matches a programmer's intuitions.
Nonetheless, to accommodate a modular analysis, the values of parameters
and free (global) variables are represented by tokens to which are
attached constraints. At link- and function-call-time, the constraints
are matched against the actual values of arguments and global variables.
Finally, programmers are encouraged to use a BNF-like syntax to define
the forms of data types employed in their scripts.
The analysis uses the programmer-written BNF rules to "abstractly parse"
program phrases and associate them with derivations possible from the
programmer-defined grammars.
A prototype of the system is under construction.
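
The idea of dictionary types that record which fields a dictionary must
have can be sketched as follows. This is an illustrative assumption, not
the prototype described above: a dictionary type is modelled as a
frozenset of field names, control-flow merges keep only the fields
present on every path, and a function's constraint on its parameter is
checked against the actual argument's type at call time. The names join,
add_field, missing_fields and REQUIRED_BY_F are hypothetical.

```python
def join(left, right):
    """Merge point of a forwards analysis: keep fields guaranteed on both paths."""
    return left & right


def add_field(dict_type, key):
    """Abstract effect of the assignment d[key] = value."""
    return dict_type | {key}


def missing_fields(required, actual):
    """Match a parameter's constraint against the actual argument's type."""
    return required - actual


# if cond: d = {"name": ..., "age": ...}
# else:    d = {"name": ...}
then_branch = frozenset({"name", "age"})
else_branch = frozenset({"name"})
merged = join(then_branch, else_branch)

# Assumed constraint attached to the parameter of some function f.
REQUIRED_BY_F = frozenset({"name", "age"})

missing_before = missing_fields(REQUIRED_BY_F, merged)

# d["age"] = ... executed on every path before the call repairs the type.
repaired = add_field(merged, "age")
missing_after = missing_fields(REQUIRED_BY_F, repaired)
```

Using set intersection at merge points keeps the analysis a simple
forwards pass: the type after a conditional promises only the fields
that every branch has supplied.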