Some history about programming-language design

You can do a better job on a task if you learn how others have handled similar jobs. Since this is a course on programming-language design, it's time for some history about language design:

1950s: Von Neumann machines, assembly language, Fortran

  • The first vacuum-tube (valve) computers simulated finite-capacity Turing machines. For efficiency, some storage cells were promoted to register status and wired tightly to the ALU. Programming consisted of physically rewiring the ALU --- a person would climb into the CPU and rearrange the ALU's wires to do that day's program.

  • John von Neumann realized that Kurt Goedel's numbering scheme could be used to encode wiring layouts. The number-coded wiring programs were saved in primary storage, giving stored programs. The number codes were called machine language --- binary code sequences for ''loads'' and ''stores'' and ''tests'' and ''gotos.''

    Assembly languages were merely textual names for the codes, and an assembler was a macro translator that expanded the textual names into machine language.

  • At IBM, John Backus was asked to develop an ``automatic programming language'' that would look like the equation sets physicists wrote when they mapped out computations. Backus developed Fortran (FORmula TRANslator), which mapped sequences of equations (assignments) plus conditionals plus for-loops into assembly code.

    Fortran was a huge undertaking: Backus had to develop a parser that would read the lines of equations and disassemble them into an internal format. He also wrote a translator that translated the internal format into assembly code. One person did this! Most of what we know as ``imperative programming'' comes from Backus.

  • Fortran was a massive success and made IBM the dominant computer vendor. Soon after, COBOL (COmmon Business-Oriented Language) was developed by a team of experts who wanted an ``automatic programming language'' for the business domain. COBOL was hugely successful and is used to this day.

  • At this point in time, parsing was ad-hoc, often based on searching for operator symbols, blanks, and new lines. Internal representations were ad-hoc as well.

    1960s: Grammars, block structure, systems languages, compilation theory, LL/LR grammars, operational semantics

  • IBM had a lock on Fortran. In the late 1950s, a group of primarily European academics met to design an improvement that used a newly discovered data structure --- the stack. The language they designed, Algol60, introduced block structure (nested declarations allocated on the stack), recursive procedures, and a formal grammar notation (BNF) for defining its syntax.

    Algol was an intellectual sensation, but no company owned it. Burroughs built machines with hardware stacks, but IBM outmuscled them in the marketplace.

  • There was still a need for assembly-like languages for programming low-level hardware, and a sequence of systems languages was developed: CPL (Combined Programming Language, due largely to Christopher Strachey at Cambridge and Oxford), BCPL, B, and then C. Most of C already existed in CPL.

  • The Algol60 design and documentation triggered massive research on parser and translator construction. Grammars were mathematical definitions that could be analyzed and parsers could be designed and proved correct. Translators were defined inductively --- the parser's output, a parse tree (trees were a new concept around 1960!), was the input to the translator, which traversed the tree to calculate its meaning. If the meaning was the program's execution and output, the translator was called an interpreter, because it interpreted (did) the program. If the program's meaning was assembly code or machine code, the translator was called a compiler. We use interpreters and compilers to this day.

    Subclasses of grammars quickly developed --- the most famous are the LL(k) grammars, which build parse trees top down, and the LR(k) grammars, which build parse trees bottom up. We thank Donald Knuth for the latter classification.
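
    The inductive translation scheme described above can be sketched in a few lines. The following is an illustrative toy, not any historical system: the parse tree is built from nested tuples, and an interpreter traverses the tree to compute its meaning.

```python
# A sketch of an inductively defined interpreter: the parser's output is a
# tree, and the translator traverses it to compute its meaning.  The toy
# expression language here is hypothetical.

def interpret(tree):
    """Compute the meaning (a number) of a parse tree for arithmetic."""
    if isinstance(tree, (int, float)):   # leaf node: a literal
        return tree
    op, left, right = tree               # interior node: (operator, subtree, subtree)
    if op == "+":
        return interpret(left) + interpret(right)
    if op == "*":
        return interpret(left) * interpret(right)
    raise ValueError("unknown operator: " + op)

# The parse tree for (1 + 2) * 3:
tree = ("*", ("+", 1, 2), 3)
print(interpret(tree))   # 9
```

    A compiler has the same inductive shape; it merely emits target code at each node instead of computing a value.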

  • Inductive definitions of semantics were first defined in terms of abstract machines; a famous abstract machine is the tree machine of the Vienna Definition Language, developed by IBM Labs, Vienna, by Dines Bjoerner and Cliff Jones. Abstract-machine semantics is also called operational semantics, because it shows how the language operates.

    Already, people were building compiler generators, which let a designer write a grammar, write an operational semantics definition, and use them to assemble a slow but operational prototype of a language. These tools were impractical, however.

  • At MIT, John McCarthy needed a programming language for natural-language processing; Fortran, Algol, etc., were impractical. Singlehandedly, McCarthy invented the notion of the dynamic list (via cons, car, cdr operations) and garbage collection of lists that were no longer active in a computation. The resulting language, LISP (LISt Processing language), required an abstract machine based on Alonzo Church's definition of the lambda-calculus, which itself was a study of naming devices in predicate logic. There were indeed some ``LISP machines'' manufactured, but they did not succeed in the marketplace. Mostly, LISP was implemented by slow interpreters coded to run on von Neumann machines. Nonetheless, LISP stimulated and even generated the field of artificial intelligence.
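
    The list primitives McCarthy introduced can be mimicked with ordinary pairs. This is only a sketch of the cons/car/cdr interface, not an account of any real LISP implementation:

```python
# A sketch of McCarthy-style dynamic lists: pairs stand in for LISP's
# cons cells, and a list is a chain of cells ending in NIL.

def cons(head, tail):   # build a list cell
    return (head, tail)

def car(cell):          # first element of the cell
    return cell[0]

def cdr(cell):          # rest of the list
    return cell[1]

NIL = None              # the empty list

# The list (1 2 3), built from cons cells:
xs = cons(1, cons(2, cons(3, NIL)))
print(car(xs))          # 1
print(car(cdr(xs)))     # 2
```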

  • Programs were being understood more and more as logical/mathematical entities --- as knowledge transformers. Robert Floyd at Carnegie Mellon and Tony Hoare at Belfast both seized upon the idea of associating ``levels of knowledge'' --- assertions --- with program execution, so that a program is an assertion transformer. This produced specification and verification techniques that are the basis of programming-by-contract, program verifiers, specifications, and axiomatic semantics. This level of semantics told the programmer to understand a program as a knowledge transformer. It did not explain how to design or implement the language.

    In contrast, operational semantics indicated how a specific machine executed program phrases. Alas, this was disconnected from the logical meaning of the program. Clearly, there was a middle ground that remained unexplored.
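
    The assertion-transformer view can be illustrated with a small, hypothetical example: each comment records the knowledge that holds at that program point, and the program transforms the precondition into the postcondition.

```python
# A sketch of Floyd-Hoare-style assertions: each assert/comment states the
# knowledge holding at that point, so the program reads as a transformer
# from precondition to postcondition.  The example is illustrative.

def absolute(x):
    # precondition: x is an integer
    assert isinstance(x, int)
    if x < 0:
        # here we know: x < 0
        y = -x
        # here we know: y == -x and y > 0
    else:
        # here we know: x >= 0
        y = x
        # here we know: y == x and y >= 0
    # postcondition: y >= 0 and y is x's absolute value
    assert y >= 0 and (y == x or y == -x)
    return y

print(absolute(-5))   # 5
```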

    1970s: virtual machines, SLR/LALR grammars, denotational semantics, functional languages

  • The problems of parser construction were solved essentially forever by Frank DeRemer's formulation of the SLR(k) grammars and their spinoff, LALR(k). Now, tools like Yacc and ANTLR generate parsers from grammars.

  • Several alumni of the Algol60 design committee carried forward their ideas. Most notable were Kristen Nygaard (Simula67) and Niklaus Wirth (Pascal and Modula). Simula was the first object-oriented language, because it allowed blocks of declarations (classes) to be instantiated as objects holding private data. Pascal pioneered systematic data typing, standardized the design of compilers, and targeted a virtual machine for which the compiler generated object code. A virtual machine could be ported to any CPU by writing an assembler that mapped virtual-machine code to the CPU's binary language. (This idea underlies Java and is the key to its success.)

  • At Oxford, Christopher Strachey, a founder of the C-line of languages, developed a keen interest in stating what a language meant independently of the choice of a target machine or virtual machine. Strachey decided that programs meant/represented/denoted mathematical functions that mapped inputs to outputs. Strachey adapted Church's lambda notation into a notation for defining mathematical functions, and he used inductive definitions to translate program phrases into mathematics. This technique is called denotational semantics. Dana Scott, a founder of automata theory and an expert on constructive mathematics, helped Strachey define lattice structures of the computational values that Strachey's functions computed upon.

    Denotational semantics generated a huge amount of research into language analysis, because proof techniques from mathematics could now be applied to prove correctness properties of programs and languages. But Strachey's use of lambda-calculus notation also triggered a line of work on using the lambda-calculus notation itself as a programming language. This led to a line of functional programming languages: ML, Caml, OCaml, and Haskell. These languages de-emphasize loops and updatable variables --- computational work is done with (recursive) function calls and parameter passing.
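
    The functional style described above replaces loops and updatable variables with recursion and parameter passing. A minimal, illustrative sketch:

```python
# A sketch of the functional style: no loop and no mutable counter;
# all work is done by a recursive call that passes an accumulator.

def total(xs, acc=0):
    """Sum a list using recursion in place of a loop."""
    if xs == []:
        return acc                       # base case: nothing left to add
    return total(xs[1:], acc + xs[0])    # recurse on the rest of the list

print(total([1, 2, 3, 4]))   # 10
```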

  • At the same time as Strachey, Knuth realized that meaning can be given to parse trees by attaching the meanings to the nodes of the tree. He called the meanings ``attributes'' and extended the usual grammar notation to define the combination of meanings from subtrees to supertrees. He called his result attribute grammars. The format is not tied to mathematical functions or target code or anything in particular --- it is best used, like Strachey's inductive-definition format, as a means of defining a systematic translation from source program to target program/meaning.
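
    A synthesized attribute in Knuth's sense can be sketched as follows. Here each node's attribute is a list of instructions for a hypothetical stack machine, computed from the attributes of its subtrees:

```python
# A sketch of a synthesized attribute: each tree node's attribute (a list
# of stack-machine instructions) is assembled from its children's
# attributes, bottom up.  The instruction set is hypothetical.

def code_attribute(tree):
    """Synthesize the 'code' attribute of a parse tree."""
    if isinstance(tree, int):            # leaf: push the literal
        return [("PUSH", tree)]
    op, left, right = tree
    # the parent's attribute combines the children's attributes:
    return code_attribute(left) + code_attribute(right) + [(op,)]

print(code_attribute(("+", 1, ("*", 2, 3))))
# [('PUSH', 1), ('PUSH', 2), ('PUSH', 3), ('*',), ('+',)]
```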

    1980s: component languages, actor and object models, logic=algorithmics, semantic calculi

  • Software became larger and more complex, especially because of the appearance of distributed systems and graphical devices. This motivated languages that supported components as their fundamental building unit --- Modula, Ada, Euler. When units are combined, there can be naming clashes or reuses of components in incompatible ways. This generated research on naming, importation, visibility and scope.

  • At the same time, research was conducted on components-as-little-programs that computed by exchanging messages with other components-as-little-programs. The semantic model of this was called ``actors,'' and the Smalltalk language was inspired by this concept. The ``actors'' in Smalltalk were called ``objects,'' and this is how object-oriented programming began.

    Smalltalk was also inspired by the pressing problem of developing a language that would naturally lend itself to multi-component GUI assembly via message passing and event handling.

  • The pressure to develop correct software that matched a specification written in logic led to intensive study of applying predicate logic (specifically, intuitionistic predicate logic) to program development. In this view, a specification is a data type, and a program is typed with the spec exactly when the program's behavior satisfies the spec. The new twist is that one can use intuitionistic logic to build ``proofs'' of specs where the proof is in fact the program code! The work required goes far beyond the usual hacking, but the payoff is huge. Many lines of work on this topic, especially in safety-critical and secure systems, continue to this day. A few relevant logics are Intuitionistic Type Theory and the logics underlying Nuprl and LCF. Related proof/program-development tools are Nuprl, LCF, HOL, Isabelle, and Coq.

  • Yet another application of logic to computing was literally to ``run the spec itself.'' In this case, computation is the search for a proof of the spec. This is the basis of the Prolog language, whose interpreter is an Alan-Robinson-style resolution theorem prover for Horn-clause logic. Because dynamic lists and struct constructors were added, Prolog can be used to solve classic problems as well as problems in AI and databases.

    1990s: domain-specific languages, software architecture, deduction-based semantics

  • More and more, languages were designed for limited problem areas: there were GUI-building languages, spreadsheet languages, web-page languages, networking languages, math languages, and so on. Such specialized languages are called domain-specific languages (DSLs). They remain to this day the ``growth area'' of programming languages.

    One DSL that failed was Java --- it was intended for programming Smalltalk-like objects on embedded chips. Its designer then extended Java so that it could be a web language, where objects (``applets'') are sent over the web and executed in a web browser. Because Java was designed to execute on a virtual machine, it was easy to port (and embed into web browsers) and became popular as a practical alternative to Smalltalk. It was also a good entry point to learning object-oriented programming. (The only alternative at that time was C++, which was a complex macroprocessor placed on top of C.)

  • Languages that did distributed and parallel computing were difficult to analyze with Scott-Strachey denotational semantics, and Gordon Plotkin at Edinburgh and Gilles Kahn at INRIA, Sophia-Antipolis, advocated its replacement with natural-deduction-style proof rules that compute in operational-semantics-like steps. Their systems are called small-step and big-step operational semantics. These systems only imply the existence of virtual machines, and serious work is needed to prove basic mathematical properties, yet they were indeed able to express semantics for distribution, nondeterminism, and parallelism.

    Other calculi, such as Milner's CCS, Abadi and Cardelli's object calculus, and process algebra, were proposed as replacements for the lambda-calculus as the ``metalanguage'' for semantics.

  • Picture languages for software architecture (most notably UML) became popular. The languages are somewhat semantics-free, and they require agreement on semantics to be useful in practice.

    2000s: static analysis, verification, security, safety-critical applications

  • New ``big languages'' (e.g., C#) and ``little languages'' (e.g., Ajax, XML) continue to appear.

  • Recent emphasis has been on the application of semantics methods to build tools that help programmers write correct code. Tools like the Java Modeling Language (JML), Spec#, and Coq are based either on predicate logic or on a theory of ``abstract interpretation,'' where a program is executed with approximate values (property tokens) rather than concrete, specific inputs. Study of this topic properly belongs to the CIS806 course, which follows this one.
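
    Abstract interpretation can be illustrated with the classic sign analysis: the program is ``executed'' on property tokens (neg, zero, pos) instead of concrete numbers. The sketch below is illustrative, not any particular tool's implementation:

```python
# A sketch of abstract interpretation via sign analysis: multiplication is
# reinterpreted over the property tokens "neg", "zero", "pos".

MULT_SIGNS = {
    ("pos", "pos"): "pos", ("neg", "neg"): "pos",
    ("pos", "neg"): "neg", ("neg", "pos"): "neg",
}

def abstract_mult(a, b):
    """Multiply two sign tokens; zero times anything is zero."""
    if a == "zero" or b == "zero":
        return "zero"
    return MULT_SIGNS[(a, b)]

# Without knowing x's concrete value, we learn that x * x is never negative:
for sign_of_x in ("neg", "zero", "pos"):
    print(sign_of_x, "->", abstract_mult(sign_of_x, sign_of_x))
```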

    What does this have to do with us?

    When we look closely at the inventions of languages and description methods, we see certain points that stand out:
    1. A new language is developed to satisfy a need. For example, Fortran was developed to help physicists program their equations, Simula was developed to do simulations, Pascal was developed for portability, the C-languages were developed for systems programming, and Java was developed for embedded systems. There are many, many more examples, especially domain-specific languages (e.g., JavaScript, MATLAB, HTML, CSS, SQL, Ajax, ...). When you work in an application area, it is inevitable that you will want a language customized to that area. You might end up inventing it yourself.

    2. A language is designed by a sole person or a small team with a common vision. The designer has a deep understanding of both the intended application area and its runtime (hardware) platform. The designer always has prior experience implementing languages.

    3. The language's design begins with the runtime platform (runtime machine, virtual machine) that the language will manipulate. Indeed,

      the purpose of a programming language is to manipulate a machine

    4. The language is prototyped (given a quick implementation) in terms of an interpreter coded in another language or in terms of a core subset of itself (``bootstrapping''). After case studies, efficient implementations are developed.

    5. Formal semantics techniques (operational, denotational, axiomatic) are used more for after-the-fact documentation and analysis and less for design and development.

    This course focuses on the design and development of ``small languages'' (domain-specific languages), which everyone uses and almost every experienced programmer invents. This prepares you for the possible big-language project that you might encounter some day. Because the emphasis here is on design, we focus on runtime-machine design and prototyping.

    There is a follow-up course (CIS806) that studies documentation, analysis, and proof by means of formal semantics definitions.

    References

    D. Gelernter and S. Jagannathan. Programming Linguistics. MIT Press, 1990.

    D. A. Schmidt. Induction, Domains, Calculi: Strachey's Contributions to Programming-Language Engineering. Higher-Order and Symbolic Computation 13(1/2) (2000) 89-102.

    M. Gabbrielli and S. Martini. Programming Languages: Principles and Paradigms. Springer, 2010.