Copyright © 2010 David Schmidt

Chapter 3:
The core language and its characteristic domains


3.1 Characteristic domains
3.2 Characteristic domains for imperative languages
3.3 Declarative languages: no storable values


The languages studied in the previous chapter are small but each already has a unique character. They are called core languages because they have a ``core'' of primitive operations on data and storage. We will learn how to add extensions (like procedures, modules, iterators) to a core language.

But first we must understand what constitutes a ''core language.'' The key question is, ``What concepts can you express and compute in the language?'' Some languages are general purpose because they give a starting point for modelling many problem areas. Examples are Fortran, Algol, Pascal, C, Scheme, Smalltalk, Prolog, ML, C++, Python, Ruby, Java, and C#.

Other languages are special purpose or domain specific, because they express concepts specific to a problem area, like word processing, or data mining, or avionics. (Think about what you can express and compute in HTML or SQL.)

In this chapter we learn how to desribe a core language in terms of its characteristic domains.


3.1 Characteristic domains

To understand or design a language, we must state what the language can talk about, name, and remember. Christopher Strachey, a professor at Oxford University in the 1960s and '70s, observed that key language notions are organized into characteristic domains:

expressible values: What can you write in an expression?
denotable values: What can you give a name to?
storable values: What can you assign into storage and update?
The way these questions are answered define the characteristic domains of the core language. Here are the details:
  1. expressible values: What values can the language express and compute upon? (What kinds of expressions can you write?) The expressible values for a typical general-purpose language include numbers and strings. Systems languages like C let us express locations (``pointers''). Scripting and artificial-intelligence languages can express dynamic data structures like lists, arrays, and hash tables.

    If the language is domain specific, the expressible values will reflect this. For example, a word-processing language will have characters, strings, and files as expressible values. A language for writing flight-control software will express fractional numbers (floats) as inputs and outputs for hardware sensors on an airplane. Perhaps the sensors themselves are expressed. A game-playing language uses expressibles like numbers, positions, players, and directions for movement. A graphics language uses numbers, positions, colors, and shapes.

    The expressible values will come with operations that compute on them, for example, addition and subtraction for integers and find-string, append-string, replace-string for strings. There can be operations on positions, colors, shapes, etc. The operations define the computational capabilities of the language.

    Expressible values are represented in the language's syntax as expressions, and they are represented in the language's implementation as temporary values (values that come-and-go, saved in registers or in a temporary-value stack).

  2. denotable values: What values can be named? This is important, because in spoken language, we give names to concepts that are worth remembering and repeating. A typical general purpose language lets you name locations in storage that hold integers, booleans, characters, etc. We call such names, ``variable names.'' It is also common that a language lets you name a code segment, like a procedure or a function.

    There are usually two operations for denotable values: definition (binding a name to its value) and lookup (retrieving the value associated with a name).

    Denotable values are usually represented in syntax by declarations and they are implemented within a namespace or environment (also called symbol table or activation record).

  3. storable (updatable) values: What are the values that can be archived and updated? (These are also called persistent values.) In an implementation, storable values are saved in persistent storage (memory or heap or disk) --- and locations label the cells/positions in storage where the values are stored. A key notion here is the ability to update --- change part or all of the value --- that was stored. That is, we can do this:
    int x = 2;
    x = x + 1
    
    which stores an int for x and then updates it; ints are storable values here. In contrast, denotable values are not necessarily placed in storage and updatable. (Consider a named procedure. The procedure code is saved in the computer, somewhere, but it might not be updatable.) Again, the key is the ability is to archive a value and update it later.

    Storable values are represented within commands --- for example, look at the right-hand side of an assignment command to see what you can store.

    The operations that accompany the storable values are to allocate a new cell in storage for storing a new value; store/update a value in a specified location; lookup a value in a specified location; and deallocate a cell that is no longer needed. The operations can apply to primary storage (RAM) or secondary storage (disk), depending on what forms of storage the language uses.

As noted above, each of these three characteristic domains must be represented in the language's syntax: As we saw in an earlier chapter, the language's syntax can be defined with BNF grammar rules, and for reasons of precision of understanding, there may be more grammar rules than just exactly these three. But within the heart of every language's syntax and semantics, one finds these domains.

The first thing you should do when learning or designing a new language is identify the expressible, denotable, and storable values.

The values and operations that represent the characteristic domains form the core of a programming language. A core is typically extended with some control structures, data structures, and component structures. We study these in detail in subsequent chapters.


3.2 Characteristic domains for imperative languages

An imperative language (''assignment language'') is distinguished by its domain of storable values --- what can you assign into storage.

If we recap the two imperative languages in Chapter 2, we have this list:

  1. C-like language:

  2. object language:
In the two examples, we see that the expressible values are the same as the storable values --- we can store anything we can express/compute. Some languages do not preserve this principle. For example, perhaps you have used a Fortran-like language, where you can express an entire array structure, but you cannot store the entire array --- you can only store the array's individual elements, one at a time.

On the other hand, it makes no sense to have a language that allows you to store a wider range of values than what you can express. So, the storable values form a subset of the expressible values.

The two imperative languages we have studied have minimal cores. (This made for compact syntax and semantics.) It is common for an imperative language to have a core whose expressible values includes characters, strings, booleans, arrays, etc., along with the operations for computing on these values.

Also, we studied a variant of the C-like language with data types. In this case, we can claim that the expressible and denotable values are type-tagged ints and type-tagged locations. (But the type tags were not stored in the memory cells, so the storable values remain the same.) The language's interpreter provides the insight we require to understand the core language.


3.3 Declarative languages: no storable values

What if there are no storable values? This means there is no way to update a binding of a name to a value. This means we don't have the usual assignment command --- at best, we have only initialization commands. (These are called final variables in some languages and single-assignment variables in others.)

The burden of computation is placed on the expressions and declarations, and their domains will be richer than what we have seen so far.

A declarative language is one that has no storable values --- no storage cells that can be updated once they are initialized. Let's consider such a language, with these characteristic domains:

expressibles: integers and lists (stacks) of integers
denotables: integers, lists of integers, and functions
storables: nothing
Because nothing can be updated (no updatable global variables, no arrays), we rely heavily on declarations. Here is the syntax; there are no commands, but the expression constructions take on some of the command constructs:
===================================================

E : Expression
D : Declaration
I : Identifier
N : Numeral

E ::=  N  |  (E1 + E2)  |  (E1 - E2) 
    |  [ E* ]  |   head E  |  tail E
            where  E*  means  zero or more expressions separated by commas
    |  if E1 == E2 then E2 else E3
    |  let D+ in E end  |  I  |  I(E) 
            where  D+  means one or more  D  phrases

D ::=  val I = E  |  fun I1(I2) = E

N ::=  numerals

I ::=  alphanumeric strings beginning with a letter

===================================================
Look at the expressions: the first line gives the ways of expressing integers; the second line gives the ways of building lists (sequences/stacks) of values; the third line is a conditional expression (like e1 ? e2 : e3 in C); the fourth line shows how to declare names and then use them.

If you are rusty with lists, please remember that a list looks like an array, e.g., [2,3,5,7,11] is a list. To look at the list's first element, use head, e.g., head [2,3,5,7,11] computes to 2. To make a new list that is one shorter than the list in hand, use tail, e.g., tail [2,3,5,7,11] computes to [3,5,7,11]. That's it --- there are no indexing operations here.

There are two forms of declarations: a value declaration (in Java and C#, this is called a ``final'' assignment) and function definition.

Here is a little example program that assembles a list of integers and then calls a function to compute the length of the list:

let val one = 1
    val mylist = [one, one, 3, (one + one)]
    fun lengthOf(x) =
          if x == []  then 0
                      else (1 + lengthOf(tail x))
in
    lengthOf(mylist)
This is one, huge expression! The name, one, is bound to 1. The name, mylist, is bound to a list of four ints: [1,1,3,2]. The name, lengthOf, is bound to expression code that needs a future argument to bind to name x. These name-denotable-value bindings are saved in the interpreter's namespace. (We study the interpreter in a future chapter.) The declarations are not assignment commands. You cannot say,
val one = 1
val one = one + 1
because the name one is given a meaning exactly once.

Let's compute the meaning of this massive expression. Since we do not have the interpreter coded, we use a classic shorthand that comes from algebra --- copy rule semantics. This means we will copy and recopy the expression, replacing references to names by their values, until we reach the final answer. (We used copy-rule semantics in the chapter on BNF, when we studied how the baby expression interpreter operated.)

Let's go:

===================================================

let val one = 1
    val mylist = [one, (one + one)]
    fun lengthOf(x) =
          if x == []  then 0
                      else (1 + lengthOf(tail x))
in lengthOf(mylist)

=>

let val one = 1
    val mylist = [1, (1 + 1)]   # copied  1  for all the  one 's
    fun lengthOf(x) =
          if x == []  then 0
                      else (1 + lengthOf(tail x))
in lengthOf(mylist)

=>

let val one = 1
    val mylist = [1, 2]  
    fun lengthOf(x) =
          if x == []  then 0
                      else (1 + lengthOf(tail x))
in lengthOf(mylist)

=>  lengthOf(mylist)

=>  lengthOf([1,2]) => val x = [1,2]         # bind  x  to its argument
                       if x == []  then 0    # copy function's code
                       else 1 + lengthOf(tail x)

                    => val x = [1,2]       
                       if [1,2] == []  then 0   
                       else 1 + lengthOf(tail x)

                    => val x = [1,2]
                       if False  then 0
                       else 1 + lengthOf(tail x)

                    => val x = [1,2]
                       1 + lengthOf(tail x)

                    => val x = [1,2]
                       1 + lengthOf(tail [1,2])

                    => val x = [1,2]
                       1 + lengthOf([2]) => val x = [2]
                                            if x == []  then 0
                                            else 1 + lengthOf(tail x)

                                         => val x = [2]
                                            if [2] == []  then 0
                                            else 1 + lengthOf(tail x)

                                         => val x = [2]
                                            1 + lengthOf(tail [2])

                                         => val x = [2]
                                            1 + lengthOf([]) => val x = []
                                                                if x == []  then 0
                                                                else 1 + lengthOf(tail x)

                                                             => val x = []
                                                                if True then 0
                                                                else 1 + lengthOf(tail x)

                                                             => 0  # return!

                                          => 1 + 0 => 1  # return!

                    => 1 + 1 => 2  # return!

=> 2   # finished

===================================================
The example shows that storable values are not a necessity. On the other hand, it would be foolish to use a declarative language to implement a database, because there is no way to update the database, which is a persistent value that accumulates knowledge over time. Indeed, declarative languages have no simple way of building, remembering, and updating data structures --- in a declarative language, data structures must be rebuilt as needed.

Exercise:

  1. Use the declarative language to write this function:
    fun reverse(inputList) = ...code for reversing and returning the elements
                                in the list,  inputList...
    
    For example, reverse([1,2,3,4]) computes and returns a new list, [4,3,2,1]. Also, reverse([1, [2, 3], [[4,5]]]) returns [ [[4,5]], [2,3], 1].

  2. Add this form of conditional to the language:
    if isList E then E2 else E3
    
    Use it to rewrite reverse so that it reverses all elements at all levels of a nested list, e.g., reverse([1, [2, 3], [[4,5]]]) returns [ [[5,4]], [3,2], 1].