We present a theory for slicing imperative probabilistic programs, containing random assignments and "observe" statements for conditioning. We represent such programs as probabilistic control-flow graphs (pCFGs) whose nodes modify probability distributions. This allows direct adaptation of standard machinery such as data dependence, postdominators, relevant variables, etc. to the probabilistic setting. We separate the specification of slicing from its implementation:

1. first we develop syntactic conditions that a slice must satisfy (they involve the existence of another disjoint slice such that the variables of the two slices are probabilistically independent of each other);
2. next we prove that any such slice is semantically correct;
3. finally, we give an algorithm to compute the least slice.
To generate smaller slices, we may in addition take advantage of knowledge that certain loops will terminate (almost) always.

Our results carry over to the slicing of structured imperative probabilistic programs, as handled in recent work by Hur et al. For such a program we can define a slice that has the same "normalized" semantics as the original program; the proof of this property is based on a result establishing the adequacy of the semantics of pCFGs wrt. the standard semantics of structured imperative probabilistic programs.
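As a toy illustration of the independence condition (our example, not the paper's pCFG machinery), consider slicing away an "observe" that conditions only on a variable probabilistically independent of the variable of interest. Enumerating the outcomes shows the normalized distribution of the remaining variable is unchanged; the program and weights below are hypothetical:

```python
from itertools import product

def normalized_dist_of_x(keep_observe):
    # x and y are independent fair coin flips; observe(y == 1) conditions on y only
    dist = {}
    for x, y in product([0, 1], [0, 1]):
        if keep_observe and y != 1:
            continue  # "observe" discards runs violating the condition
        dist[x] = dist.get(x, 0.0) + 0.25
    total = sum(dist.values())
    return {k: v / total for k, v in dist.items()}

# slicing away y and the observe preserves the normalized semantics of x
full   = normalized_dist_of_x(keep_observe=True)
sliced = normalized_dist_of_x(keep_observe=False)
```

Here the sliced program and its disjoint complement (the part involving y and the observe) have probabilistically independent variables, which is exactly the situation the syntactic conditions are designed to capture.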

We consider slicing extended finite state machines. Extended finite state machines (EFSMs) combine a finite state machine with a store and can model a range of computational phenomena, from high-level software to cyber-physical systems. EFSMs are essentially interactive, may be non-terminating or have multiple exit states, and may be nondeterministic; standard techniques for slicing, developed for control flow graphs of programs with a functional semantics, are therefore not immediately applicable.

This paper addresses the various aspects of correctness for slicing of EFSMs, and provides syntactic criteria that we prove are sufficient for our proposed notions of semantic correctness. The syntactic criteria are based on the "weak commitment" and "strong commitment" properties highlighted by Danicic et al. We provide polynomial-time algorithms to compute the least sets satisfying each of these two properties. We have conducted experiments using widely-studied benchmark and industrial EFSMs that compare our slicing algorithms with those using existing definitions of control dependence. We found that our algorithms produce the smallest average slice sizes: 21% of the original EFSM when "weak commitment" is sufficient and 58% when "strong commitment" is needed (to preserve termination properties).

We present a theory for slicing probabilistic imperative programs -- containing random assignment and "observe" statements -- represented as control flow graphs whose nodes transform probability distributions. We show that such a representation allows direct adaptation of standard machinery such as data and control dependence, postdominators, relevant variables, etc. to the probabilistic setting. We separate the specification of slicing from its implementation: first we develop syntactic conditions that a slice must satisfy; next we prove that any such slice is semantically correct; finally we give an algorithm to compute the least slice. A key feature of our syntactic conditions is that they involve two disjoint slices such that the variables of one slice are probabilistically independent of the variables of the other. This leads directly to a proof of correctness of probabilistic slicing.

In previous work, we have proposed a compositional framework for stating and automatically verifying complex conditional information flow policies using a relational Hoare logic. The framework allows developers and verifiers to work directly with the source code using source-level code contracts. In this work, we extend that approach so that the algorithm for verifying code compliance to an information flow contract emits formal certificates of correctness that are checked in the Coq proof assistant. This framework is implemented in the context of SPARK -- a subset of Ada that has been used in a number of industrial contexts for implementing certified safety and security critical systems.

Control dependence forms the basis for many program analyses, such as program slicing. Recent work on control dependence analysis has led to new definitions of dependence that accommodate reactive programs with their necessarily non-terminating computations. One important such definition is weak order dependence, which was introduced to generalize classical control dependence to a Control Flow Graph (CFG) without end nodes. In this paper we show that for a CFG where all nodes are reachable from each other, weak order dependence can be expressed in terms of traditional control dependence where one node has been converted into an end node.

Embedded information assurance applications that are critical to national and international infrastructures must often adhere to certification regimes that require information flow properties to be specified and verified. SPARK, a subset of Ada for engineering safety critical systems, is being used to develop multiple certified information assurance systems. While SPARK provides information flow annotations and associated automated checking mechanisms, industrial experience has revealed that these annotations are not precise enough to specify many desired information flow policies. One key problem is that arrays are treated as indivisible entities -- flows that involve only particular locations of an array have to be abstracted into flows on the whole array. This has substantial practical impact since SPARK does not allow dynamic allocation of memory, and hence makes heavy use of arrays to implement complex data structures.

In this paper, we present a Hoare logic for information flow that enables precise compositional specification of information flow in programs with arrays, and automated deduction algorithms for checking and inferring contracts in an enhanced SPARK information flow contract language. We demonstrate the expressiveness of the enhanced contracts and effectiveness of the automated verification algorithm on realistic embedded applications.

Information assurance applications built according to the MILS (Multiple Independent Levels of Security) architecture often contain information flow policies that are conditional in the sense that data is allowed to flow between system components only when the system satisfies certain state predicates. However, existing specification and verification environments, such as SPARK Ada, used to develop MILS applications can only capture unconditional information flows.

Motivated by the need to better formally specify and certify MILS applications in industrial contexts, we present an enhancement of the SPARK information flow annotation language that enables specification, inference, and compositional checking of conditional information flow contracts. We report on the use of this framework for a collection of SPARK examples.

Slicing is a program transformation technique with numerous applications, as it allows the user to focus on the parts of a program that are relevant for a given purpose. Ideally, the sliced program should have the same termination properties as the original program, but to keep the slices manageable, it might be preferable to slice away loops that do not affect the values of relevant variables. This paper provides the first theoretical foundation to reason about non-termination insensitive slicing without assuming the presence of a unique end node. A slice is required to be closed under data dependence and under a recently proposed variant of control dependence, called weak order dependence. This allows a simulation-based correctness proof for a correctness criterion stating that the observational behavior of the original program must be a prefix of the behavior of the sliced program.

Keywords: program slicing, control dependence, observable behavior, simulation techniques

We formulate an intraprocedural information flow analysis algorithm for sequential, heap manipulating programs. We prove correctness of the algorithm, and argue that it can be used to verify some naturally occurring examples in which information flow is conditional on some Hoare-like state predicates being satisfied. Because the correctness of information flow analysis is typically formulated in terms of noninterference of pairs of computations, the algorithm takes as input a program together with two-state assertions as postcondition, and generates two-state preconditions together with verification conditions. To process heap manipulations and while loops, the algorithm must additionally be supplied "object flow invariants" as well as "loop flow invariants" which are themselves two-state, and possibly conditional.

The Ambient Calculus was developed by Cardelli and Gordon as a formal framework to study issues of mobility and migrant code. Numerous analyses have been developed for numerous variants of that calculus. We take up the challenge of developing, in a type-based setting, a relatively precise "topology" analysis for the original version of the calculus. To compensate for the lack of "co-capabilities" (an otherwise increasingly popular extension), the analysis is flow-sensitive, with the actions of processes being summarized by "behaviors".

A subject reduction property guarantees that for a well-typed process, the location of any ambient is included in what is predicted by its type; additionally it ensures that communicating subprocesses agree on their "topic of conversation". Based on techniques borrowed from finite automata theory, type checking of type-annotated processes is decidable (though potentially exponential).

This paper specifies, via a Hoare-like logic, an interprocedural and flow sensitive (but termination insensitive) information flow analysis for object-oriented programs. Pointer aliasing is ubiquitous in such programs, and can potentially leak confidential information. Thus the logic employs independence assertions to describe the noninterference property that formalizes confidentiality, and employs region assertions to describe possible aliasing. Programmer assertions, in the style of JML, are also allowed, thereby permitting a more fine-grained specification of information flow policy.

The logic supports local reasoning about state in the style of separation logic. Small specifications are used; they mention only the variables and addresses relevant to a command. Specifications are combined using a frame rule. An algorithm for the computation of postconditions is described: under certain assumptions, there exists a strongest postcondition which the algorithm computes.

We specify an information flow analysis for a simple imperative language, using a Hoare-like logic. The logic facilitates static checking of a larger class of programs than can be checked by extant type-based approaches in which a program is deemed insecure when it contains an insecure subprogram. The logic is based on an abstract interpretation of a "prelude" semantics which makes independence between program variables explicit. Unlike other, more precise, approaches based on Hoare logics, our approach does not require a theorem prover to generate invariants. We demonstrate the modularity of our approach by showing that a frame rule holds in our logic. Finally, we show how our logic can be applied to a program transformation, namely, forward slicing: given a derivation of a program in the logic, with the information that variable l is independent of variable h, the slicing transformation systematically creates the forward l-slice of the program: the slice contains all the commands independent of h. We show that the slicing transformation is semantics preserving.
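As a sketch of the slicing application (a hypothetical mini-representation of programs, not the paper's logic), a forward l-slice can be approximated by keeping exactly the commands whose sources are independent of h, propagating h-dependence forward:

```python
# hypothetical representation: a program as a list of (target, sources) assignments
prog = [("l", {"l0"}),   # l  := l0       (independent of h)
        ("t", {"h"}),    # t  := h + 1    (depends on h)
        ("l2", {"l"})]   # l2 := l * 2    (independent of h)

def forward_l_slice(prog, secret):
    dep = {secret}           # variables (transitively) dependent on the secret
    kept = []
    for target, sources in prog:
        if sources & dep:
            dep.add(target)  # command depends on h: excluded from the l-slice
        else:
            kept.append((target, sources))
    return kept
```

In the paper the independence information comes from a derivation in the logic rather than from this syntactic taint propagation, but the resulting slice has the same shape: all commands independent of h are retained.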

The notion of control dependence underlies many program analysis and transformation techniques. Despite being widely used, existing definitions and approaches to calculating control dependence are difficult to apply directly to modern program structures because these make substantial use of exception processing and increasingly support reactive systems designed to run indefinitely.

This paper revisits foundational issues surrounding control dependence, and develops definitions and algorithms for computing several variations of control dependence that can be directly applied to modern program structures. To provide a foundation for slicing reactive systems, the paper proposes a notion of slicing correctness based on weak bisimulation, and proves that some of these new definitions of control dependence generate slices that conform to this notion of correctness. This new framework of control dependence definitions, with corresponding correctness results, is able to support even programs with irreducible control flow graphs. Finally, a variety of properties show that the new definitions conservatively extend classic definitions. These new definitions and algorithms form the basis of Indus Java Slicer -- a publicly available program slicer that has been implemented for full Java.

We specify an information flow analysis for a simple imperative language, using a Hoare logic. The logic facilitates static checking of a larger class of programs than can be checked by extant type-based approaches in which a program is deemed insecure when it contains an insecure subprogram. The logic is based on an abstract interpretation of program traces that makes independence between program variables explicit. Unlike other, more precise, approaches based on Hoare logic, our approach does not require a theorem prover to generate invariants. We demonstrate the modularity of our approach by showing that a frame rule holds in our logic. Moreover, given an insecure but terminating program, we show how strongest postconditions can be employed to statically generate failure explanations.

Previous type systems for mobility calculi (the original Mobile Ambients, its variants and descendants, e.g., Boxed Ambients and Safe Ambients, and other related systems) offer little support for generic mobile agents. Previous systems either do not handle communication at all or globally assign fixed communication types to ambient names that do not change as an ambient moves around or interacts with other ambients. This makes it hard to type examples such as a "messenger" ambient that uses communication primitives to collect a message of non-predetermined type and deliver it to a non-predetermined destination.

In contrast, we present our new type system PolyA. Instead of assigning communication types to ambient names, PolyA assigns a type to each process $P$ that gives upper bounds on (1) the possible ambient nesting shapes of any process $P'$ to which $P$ can evolve, (2) the values that may be communicated at each location, and (3) the capabilities that can be used at each location. Because PolyA can type generic mobile agents, we believe PolyA is the first type system for a mobility calculus that provides type polymorphism comparable in power to polymorphic type systems for the $\lambda$-calculus. PolyA is easily extended to ambient calculus variants. A restriction of PolyA has principal typings.

We introduce an annotated type system for a compiler intermediate language. The type system is designed to support inter-procedural register allocation and the representation of tuples and variants directly in the register file. We present an algorithm that generates constraints for assigning annotations, and prove its soundness with respect to the type system.

There are many calculi for reasoning about concurrent communicating processes which have locations and are mobile. Examples include the original Ambient Calculus and its many variants, the Seal Calculus, the MR-calculus, the M-calculus, etc. It is desirable to use such calculi to describe the behavior of mobile agents. It seems reasonable that mobile agents should be able to follow non-predetermined paths and to carry non-predetermined types of data from location to location, collecting and delivering this data using communication primitives. Previous type systems for ambient calculi make this difficult or impossible to express, because these systems (if they handle communication at all) have always globally mapped each ambient name to a type governing the type of values that can be communicated locally or with adjacent locations, and this type cannot depend on where the ambient has traveled.

We present a new type system where there are no global assignments of types to ambient names. Instead, the type of an ambient process $P$ not only indicates what can be locally communicated but also gives an upper bound on the possible ambient nesting shapes of any process $P'$ to which $P$ can evolve, as well as the possible capabilities and names that can be exhibited or communicated at each location. Because these shapes can depend on which capabilities and names are actually communicated, the types support this with explicit dependencies on communication. This system is thus the first type system for an ambient calculus which provides type polymorphism of the kind that is usually present in polymorphic type systems for the $\lambda$-calculus.

The Ambient Calculus was developed by Cardelli and Gordon as a formal framework to study issues of mobility and migrant code. We present a type system for the calculus, parameterized by a set of security constraints: static ones concerning where a given ambient may reside, and dynamic ones expressing where a given ambient may be dissolved. A subject reduction property then guarantees that a well-typed process never violates these constraints; additionally it ensures that communicating subprocesses agree on their "topic of conversation".

The type system employs a notion of causality in that processes are assigned "behaviors". We argue that this significantly increases the precision of the analysis and compensates for the lack of "co-capabilities" (an otherwise increasingly popular extension to the ambient calculus); also it allows (in contrast to other approaches) an ambient to hold multiple topics of conversation.

Based on techniques borrowed from finite automata theory, type checking of type-annotated processes is decidable. Under certain quite natural restrictions, type inference is also possible.

The Ambient Calculus (henceforth, AC) was developed by Cardelli and Gordon as a formal framework to study issues of mobility and migrant code. We present a type system for AC that allows the type of exchanged data within the same ambient to vary over time. Our type system assigns what we call behaviors to processes; a denotational semantics of behaviors is then proposed, here called trace semantics, underlying much of the remaining analysis. We state and prove a Subject Reduction property for our typed version of AC. Based on techniques borrowed from finite automata theory, type-checking of fully type-annotated processes is shown to be decidable. We show that our typed version of AC is a conservative extension of the typed version of AC originally proposed by Cardelli and Gordon.

The Ambient Calculus was developed by Cardelli and Gordon as a formal framework to study issues of mobility and migrant code. We consider an Ambient Calculus where ambients transport and exchange programs rather than just inert data. We propose different senses in which such a calculus can be said to be polymorphically typed, and design accordingly a polymorphic type system for it. Our type system assigns types to embedded programs and what we call behaviors to processes; a denotational semantics of behaviors is then proposed, here called trace semantics, underlying much of the remaining analysis. We state and prove a Subject Reduction property for our polymorphically-typed calculus. Based on techniques borrowed from finite automata theory, type-checking of fully type-annotated processes is shown to be decidable. Our polymorphically-typed calculus is a conservative extension of the typed Ambient Calculus originally proposed by Cardelli and Gordon.

We consider a naive, quadratic string matcher testing whether a pattern occurs in a text; we equip it with a cache mediating its access to the text; and we abstract the traversal policy of the pattern, the cache, and the text. We then specialize this abstracted program with respect to a pattern, using the off-the-shelf partial evaluator Similix.

Instantiating the abstracted program with a left-to-right traversal policy yields the linear-time behavior of Knuth, Morris and Pratt's string matcher. Instantiating it with a right-to-left policy yields the linear-time behavior of Boyer and Moore's string matcher.
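A minimal sketch of the starting point (omitting the cache, and with the traversal policy abstracted as a parameter; the names are ours):

```python
def match(pattern, text, order):
    # naive quadratic matcher: try every alignment of the pattern, comparing
    # positions in the order given by the abstracted traversal policy
    m = len(pattern)
    for i in range(len(text) - m + 1):
        if all(pattern[j] == text[i + j] for j in order(m)):
            return i
    return -1

def left_to_right(m):               # specialization w.r.t. the pattern ~ KMP-like
    return range(m)

def right_to_left(m):               # specialization w.r.t. the pattern ~ Boyer-Moore-like
    return range(m - 1, -1, -1)
```

In the paper it is the cache (elided here) that lets the partial evaluator remember which text positions have already been read, which is what allows specialization w.r.t. the pattern to reach linear-time behavior.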

Recent work has shown equivalences between various type systems and flow logics. Ideally, the translations upon which such equivalences are based should be faithful in the sense that information is not lost in round-trip translations from flows to types and back or from types to flows and back. Building on the work of Nielson & Nielson and of Palsberg & Pavlopoulou, we present the first faithful translations between a class of finitary polyvariant flow analyses and a type system supporting polymorphism in the form of intersection and union types. Additionally, our flow/type correspondence solves several open problems posed by Palsberg & Pavlopoulou: (1) it expresses call-string based polyvariance (such as k-CFA) as well as argument based polyvariance; (2) it enjoys a subject reduction property for flows as well as for types; and (3) it supports a flow-oriented perspective rather than a type-oriented one.

We report on a case study in the application of partial evaluation, initiated by the desire to speed up a constraint-based algorithm for control-flow analysis. We designed and implemented a dedicated partial evaluator, able to specialize the analysis wrt. a given constraint graph and thus remove the interpretive overhead, and measured it with Feeley's Scheme benchmarks. Even though the gain turned out to be rather limited, our investigation yielded valuable feedback in that it provided a better understanding of the analysis, leading us to (re)invent an incremental version. We believe this phenomenon to be a quite frequent spinoff from using partial evaluation, since the removal of interpretive overhead makes the flow of control more explicit and hence pinpoints sources of inefficiency. Finally, we observed that partial evaluation in our case yields such regular, low-level specialized programs that it begs for runtime code generation.

Preface

Program analysis and type systems have been recognised as important tools in software development. By analysing a program before running it one may catch a variety of errors in due time; this goes not only for obvious errors such as adding an integer to a boolean, but also for "higher-level" errors such as violating a given communication protocol. Even for programs that are already correct, analysis is useful for reasons of efficiency: the result of the analysis may guide the transformation into an equivalent program with better performance; or it may guide the run-time system to improve the program's execution profile, for example by giving directions for an optimal scheduling of processors.

We firmly believe that for a given program analysis to be reliable, a solid foundation must be established. This includes ensuring that the specification of the analysis is faithful to a formal semantics for the source language, and that it is correctly implemented. Demonstrating these properties involves stating and proving a bulk of auxiliary results; and as analyses get more complex, it becomes increasingly difficult to fit these bits together in a coherent way.

This book reports on a powerful type and effect analysis. Taking a type based approach enables a succinct representation of program properties and facilitates modular reasoning. The system meets the theoretical demands outlined above and is useful for practical purposes, as witnessed by a prototype implementation that is available on the world-wide web. Indeed, a main objective of our research has been to advance the state of the art of effect system technology, in particular by integrating polymorphism (essential for code reuse) with a notion of subtyping (essential for the sake of precision).

The role of effects, or rather behaviours so as to emphasise that temporal information is included, is to explicitly record the actions of interest that will be performed at run-time. A variety of analyses on the language of behaviours may be built on top of the system; one might even devise a tool (à la the Mobility Workbench) for testing certain properties of behaviours.

Chapter 1 puts our development in perspective; prior knowledge of type systems will be helpful but is not strictly assumed. Chapter 7 (which is based on our paper "Behaviour Analysis and Safety Conditions: a Case Study in CML") gives a case study illustrating the use of the system. Chapters 2--6 are more technical and require some amount of "mathematical maturity" on the part of the reader.

The research documented in this book grew out of the ESPRIT project LOMAPS (BRA 8130: Logical and Operational Methods in the Analysis of Programs and Systems). For this reason our development is conducted for a language (Concurrent ML) that integrates the functional and concurrent paradigms, but most of the ideas can be immediately transferred to other settings. Also, we have been partly supported by the DART project (funded by the Danish Science Research Council).

We would like to thank our collaborators in the above projects for fruitful discussions, in particular Pierre Jouvelot, Lone Leth, Jean-Pierre Talpin, Bent Thomsen, and Mads Tofte; we also thank Hans Rischel for providing us with the program used for the case study in Chapter 7.

Århus, October 1998

Keywords: type systems, effect systems, polymorphism, subtyping, behaviours, process algebras, Concurrent ML, Standard ML, type inference, algorithm W.

Abstract (of a preliminary version)

The integration of polymorphism (in the style of the ML let-construct), subtyping, and effects (modelling assignment or communication) into one common type system has proved remarkably difficult. This paper presents a type system for (a core subset of) Concurrent ML that extends the ML type system in a conservative way and that employs all these features; in addition, causality information has been incorporated into the effects (which may therefore be termed "behaviours").

The semantic soundness of the system is established via a subject reduction result. An inference algorithm is presented; it is proved sound and (in a certain sense) also complete. A prototype system based on this algorithm has been implemented and can be experienced on the WWW; thanks to a special post-processing phase it produces quite readable and informative output.

The communication patterns of concurrent programs can be expressed succinctly using behaviours; these can be viewed as a kind of causal constraints or as a kind of process algebra terms. We present a system that infers behaviours from a useful fragment of Concurrent ML programs; it is based on previously developed theoretical results and forms the core of a system available on the Internet. By means of a case study, used as a benchmark in the literature, we shall see that the system facilitates the validation of certain safety conditions for reactive programs.

We describe a case study where novel program analysis technology has been used to pinpoint a subtle bug in a formally developed control program for an embedded system. The main technology amounts to first defining a process algebra (called behaviours) suited to the programming language used (in our case CML) and secondly to devise an annotated type and effect system for extracting behaviours from programs in such a manner that an automatic inference algorithm can be developed. The case study is a control program developed for the "Karlsruhe Production Cell" and our analysis of the behaviours shows that one of the safety conditions fails to hold.

The integration of polymorphism (in the style of the ML let-construct), subtyping, and effects (modelling assignment or communication) into one common type system has proved remarkably difficult. One line of research has succeeded in integrating polymorphism and subtyping; adding effects in a straightforward way results in a semantically unsound system. Another line of research has succeeded in integrating polymorphism, effects, and subeffecting; adding subtyping in a straightforward way invalidates the construction of the inference algorithm. This paper integrates all of polymorphism, effects, and subtyping into an annotated type and effect system for Concurrent ML and shows that the resulting system is a conservative extension of the ML type system.

We study an annotated type and effect system that integrates let-polymorphism, effects, and subtyping for a fragment of Concurrent ML. First a small-step operational semantics is defined and next the annotated type and effect system is proved semantically sound. This provides insights into the rule for generalisation in the annotated type and effect system.

We study an annotated type and effect system that integrates let-polymorphism, effects, and subtyping for a fragment of Concurrent ML. First we define a type inference algorithm and then construct procedures for constraint normalisation and simplification. Next these algorithms are proved syntactically sound with respect to the annotated type and effect system.

In this paper we develop a sound and complete type and behaviour inference algorithm for a fragment of CML (Standard ML with primitives for concurrency). Behaviours resemble terms of a process algebra and yield a concise representation of the communications taking place during execution; types are mostly as usual except that function types and "delayed communication types" are labelled by behaviours expressing the communications that will take place if the function is applied or the delayed action is activated. The development of the present paper improves a previously published algorithm in achieving completeness as well as soundness; this is due to an alternative strategy for generalising over types and behaviours.

We convert, via a version that uses constraints, a type inference system for strictness analysis into an algorithm which given an expression finds the set of possible typings. Although this set in general does not possess a minimal element, it can be represented compactly by means of symbolic expressions in normal form: such expressions have the property that once values for the constraint variables with negative polarity have been supplied, it is straightforward to compute the minimal values for the constraint variables with positive polarity. The normalization process works on the fly, i.e. by a leaf-to-root traversal of the inference tree.

This report deals with strictness types, a way of recording whether a function needs its argument(s) or not. We shall present an inference system for assigning strictness types to expressions and subsequently we transform this system into an algorithm capable of annotating expressions with strictness types. We give an example of a transformation which can be optimized by means of these annotations, and finally we prove the correctness of the optimized transformation -- at the same time proving the correctness of the annotation.

Everything has been implemented; documentation can be found in the appendix.

By "thunkifying" the arguments to function applications and "dethunkifying" variables one can translate a lambda-expression e into a lambda-expression e', such that call-by-value evaluation of e' gives the same result as call-by-name evaluation of e. By using the result of a strictness analysis, some of these thunkifications can be avoided. In this paper we present a type system for strictness analysis; present a translation algorithm which exploits the strictness proof tree; and give a combined proof of the correctness of the analysis/translation.
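A minimal sketch of the translation, with Python standing in for the lambda-calculus: arguments are wrapped in parameterless thunks, variable uses force them, and a non-strict function then never evaluates an argument it ignores.

```python
forced = []

def expensive():
    forced.append(1)   # records whether the argument was ever evaluated
    return 42

# thunkified application: the argument becomes a parameterless function
def const_five(x_thunk):
    return 5           # x is never used, so the thunk is never dethunkified

result = const_five(lambda: expensive())

# a strict function "dethunkifies" (forces) its argument at each use
def succ(x_thunk):
    return x_thunk() + 1
```

If the strictness analysis shows a function to be strict in an argument, as `succ` is, the thunk for that argument can be omitted and the value passed directly; this is exactly the optimization the strictness annotations license.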

The main purpose of the thesis is to develop a model enabling one to reason about various techniques for program optimization, in particular wrt. speedup and correctness. Moreover, several methods to increase the amount of sharing are discussed, and a new one (to be termed ultimate sharing) is treated in detail.

Concerning speedup, a main point is to factor out the reasons why a program transformation may yield more than a constant speedup.

Concerning correctness, the interesting part is preservation of termination properties. This issue is mainly addressed for a logic language, where previous results from the literature are generalized; some (more limited) results are also presented for a functional language.

The unfold/fold framework constitutes the spine of many program transformation strategies. However, by unrestricted use of folding the target program may terminate less often than the source program. Several authors have investigated the problem of setting up conditions of a syntactic nature, i.e. not based on some well-founded ordering of the arguments, which guarantee preservation of termination properties. These conditions are typically formulated in a way which makes it hard to grasp the basic intuition why they work, and in a way which makes it hard to give elegant proofs of correctness. The aim of this paper is to give a more unified treatment by setting up a model which enables us to reason about termination preservation in a cleaner and more algebraic fashion. The model resembles a logic language and is parametrized with respect to evaluation order, but it should not be too difficult to transfer the ideas to other languages.

It is well known that the performance of a program can often be improved by means of program transformation. Several program transformation techniques, e.g. partial evaluation, work as follows: it is recognized that the original program often, when executed, enters states with common components. From these components alone it may be possible to do a lot of computations once and for all, which otherwise would have to be done again and again.

The evaluation of the common components mentioned above may itself benefit from identifying common components and evaluating them separately once and for all. Even this evaluation process may possess common components, etc. -- an arbitrarily high level of "nesting" can be achieved, at least in theory.

The purpose of this paper is threefold:

1. A multilevel transition semantics for a logic language will be set up, expressing the ideas above. When restricted to two levels (the number of levels employed by most program transformation systems) the semantics gives a framework general enough to incorporate many program transformation tactics. The framework also includes "run time" -- of the original program, of the transformed one, and of the transformation itself. So one can reason (in a limited way, of course) about efficiency improvements.
2. It has long been suspected that certain kinds of program transformations are able to speed up execution by (at most) a constant factor only: when an interpreter is partially evaluated a constant corresponding to the "interpreter overhead" disappears, when two loops are combined a factor 2 is typically saved, etc. On the other hand, it is easy to come up with examples where execution time is reduced by an order of magnitude. The reason for this can be identified as being either "strong" transformation techniques or a non-optimal execution strategy for the original program. Under certain conditions, reflecting the absence of these factors, it can be rigorously shown that at most a constant factor is achieved. As a simple corollary, it can be shown that by the use of (two-level) program transformation total execution time (i.e. transformation time plus execution of the transformed program) cannot be smaller than the square root of the execution time of the original program. More generally, by the use of $n$-level program transformation total execution time cannot be reduced to less than the $n$th root of the original execution time.
3. After a program transformation based on the unfold/fold framework has been performed, it may easily happen that the definition domain of the transformed program is strictly smaller than the definition domain of the original program. We will show that this - when a certain rather weak condition is met - cannot happen within the meta-level framework.
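The square-root corollary in point 2 can be sketched roughly as follows (a hedged reconstruction of the shape of the argument, not the paper's actual proof). Let $T$ be the run time of the original program, $t_1$ the transformation time and $t_2$ the run time of the transformed program. Suppose the constant-factor property forces the two phases to jointly account for the original work, in the sense that $t_1 \cdot t_2 \geq T/c$ for some constant $c$. Then by the AM-GM inequality,
$$ t_1 + t_2 \;\geq\; 2\sqrt{t_1 t_2} \;\geq\; 2\sqrt{T/c}, $$
so total time is bounded below by a constant times $\sqrt{T}$. Under the analogous assumption that the product of the $n$ phase times of an $n$-level transformation is at least $T/c$, the same inequality gives a lower bound of $n\,(T/c)^{1/n}$ on the total time, matching the $n$th-root claim.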

Cook demonstrated the possibility of simulating any given 2-way Deterministic Pushdown Automaton (2DPDA) in linear time in the length of the read-only input tape. The purpose of this note is to show how this result can be obtained by means of a generalization of the well-known concept of memoization. This clever evaluation strategy will be termed partial memoization. We present a straightforward simulator for two-way deterministic pushdown automata which -- if memoization is performed on only a subset of the simulator's input parameters -- can be shown to perform linearly. Hence the idea of partial memoization provides a new and surprisingly simple proof of Cook's theorem.
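The generic idea of partial memoization can be sketched independently of the 2DPDA simulator (a hypothetical illustration with invented names, not the note's simulator): results are cached keyed on only a chosen subset of the parameters, on the assumption that the remaining parameters are determined by them -- as the simulator's other inputs stay fixed throughout one simulation.

```python
# Partial memoization sketch: cache on a subset of the arguments only.

def memoize_on(key_indices):
    """Decorator caching results keyed on the arguments at key_indices."""
    def wrap(f):
        cache = {}
        def wrapped(*args):
            key = tuple(args[i] for i in key_indices)
            if key not in cache:
                cache[key] = f(*args)
            return cache[key]
        return wrapped
    return wrap

calls = []

@memoize_on([0])            # memoize on the first argument only
def step(state, tape):      # 'tape' is fixed during a whole simulation
    calls.append(state)
    return (state + 1) % len(tape)

tape = "abab"
for _ in range(3):
    step(2, tape)
print(calls)  # [2] -- the repeated configuration is computed only once
```

In the 2DPDA setting the cached subset corresponds to the surface configuration, and since there are only linearly many of these, each is computed at most once, which is the crux of the linear bound.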

This paper reports two experiments with implementations of constructions from theoretical computer science. The first one deals with Kleene's and Rogers' second recursion theorems and the second is an implementation of Cook's linear time simulation of two-way deterministic pushdown automata (2DPDAs). Both experiments involve the treatment of programs as data objects and their execution by means of interpreters.

For our implementations we have been using a small LISP-like language called Mixwell, originally devised for the partial evaluator MIX used in the second experiment. LISP-like languages are especially suitable since programs are data (S-expressions) so the tedious coding of programs as Gödel numbers so familiar from recursive function theory is completely avoided.

We programmed the constructions in the standard proofs of Kleene's and Rogers' recursion theorems and found (as expected) the program so constructed to be far too inefficient for practical use. We then designed and implemented a new programming language called Reflect in which Kleene and Rogers "fixed point" programs can be expressed elegantly and much more efficiently. We have programmed some examples in Reflect in an as yet incomplete attempt to find out for which sort of problems the second recursion theorems are useful program generating tools.

The second experiment concerns an automaton that can solve many non-trivial pattern matching problems. Cook has shown that any 2DPDA can be simulated in linear time by a clever memoization technique. We wrote a simple interpreter to execute 2DPDA programs and an interpreter using Cook's algorithm, and we observed that the latter was indeed much faster on certain language recognition problems. Both have, however, a high computational overhead, since they in effect work by interpretation rather than compilation. In order to alleviate this we applied the principle of partial evaluation to specialize each of the two interpreters on fixed 2DPDAs. The result was a substantial speedup.