Translation and interpretation
Suppose you are a translator whose task is to carry a speech from one language into another. Concretely, the speech corresponds to a computer program, the source language is any programming language, and the target language is assembly, the only language the computer understands. There are two very different contexts in which this translation can be performed, leading to two distinct linguistic trades.
Translation: the regular translator has the entire original text at his disposal, with as much time as necessary to translate it. He can therefore read the text multiple times to analyse the style and settle on a translation strategy that ensures coherence: how to translate names, say, or a recurring expression used in different contexts. After this analysis, he starts to actually translate the text, more or less linearly but sometimes in multiple passes, each pass bringing a batch of corrections to the text in the target language. The goal is to produce, free of tight time constraints, the best possible self-contained text in the target language.
Interpretation: the interpreter has to work in real time. The speech is being uttered, and the foreign audience is waiting impatiently to learn what is being said. The interpreter has a sliding-window view of the speech: his immediate memory lets him retain at most the last sentence, and he has no knowledge of what is going to be said. In this different state of mind, coherence can hardly be maintained beyond a single sentence. Each moment spent searching for an idiomatic expression adds delay with respect to the speech, which continues to be delivered. The interpreter's goal is to produce, as fast as possible, an understandable flow in the target language, reasonably faithful to the original speech.
Dynamic and static languages
A problem of types
We now have to talk about the delicate matter of type systems, on which opinions diverge. The discussion that follows is rather dry, but essential. The linguistic metaphor will again be of help. A typed program can be compared to a piece of text in which each word is annotated with its grammatical category; for instance:

The (article) cat (noun) eats (verb) slowly (adverb).
As we can see with this simple sentence, writing a typed program is more painful than writing an untyped one. What, then, is the advantage of imposing this rigor on ourselves? The answer is tightly bound to the execution mode of the program.
When a program is compiled, the compiler (our translator) uses these type annotations to prove a certain number of properties about the program's variables. Let us take an example: if the program says `a : int` (read "a is of type int"), `b : int`, and `print(a+b)`, then the compiler can prove that the addition `a+b` is valid (we can add integers), and that the function called to print the result is the one that prints an integer. The compiler can then produce a sequence of assembly instructions that performs what the program says. This sequence of instructions is then executed by the computer.
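The example above can be written directly in Python, whose annotation syntax happens to match the one used here; a static checker such as mypy can use these annotations to verify the addition before the program ever runs (the concrete values 2 and 3 are arbitrary, chosen for illustration):

```python
a: int = 2  # "a is of type int"
b: int = 3  # "b is of type int"

# A static checker can prove, before execution, that this addition
# is valid and that print receives a well-formed value.
print(a + b)
```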
Let us now interpret the same piece of code. Interpretation works on streams, which means that the interpreter receives the program symbol by symbol. Here are the interpretation steps:
- `a : int` and `b : int`: we keep these annotations in memory.
- `a+b`: it is an addition, so we must verify that the two arguments are integers. The interpreter therefore emits assembly instructions whose effect is to go into memory, read the types of `a` and `b`, and compare them to the expected type (integer). Once these checks have executed, the interpreter emits and executes the assembly instruction that performs the addition.
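These per-operation verifications can be sketched in Python; `checked_add` is a hypothetical helper standing in for what a real interpreter does inside its evaluation loop:

```python
def checked_add(x, y):
    # Step 1: read the runtime type of each operand, as the
    # interpreter does before every "real" instruction.
    if not isinstance(x, int) or not isinstance(y, int):
        raise TypeError("operands of + must be integers")
    # Step 2: the types match, so perform the actual addition.
    return x + y
```

`checked_add(2, 3)` returns 5, while `checked_add(2, "3")` raises a TypeError, mirroring the two possible outcomes of the interpreter's check.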
We can notice here that interpretation entails a substantial number of verifications (a few per "real" instruction) performed during the program's execution. When compiling, these verifications are done once and for all and do not appear in the generated assembly program. If the program is not valid, the compiler raises an error and generates no assembly program at all; the interpreter, on the other hand, only raises the error at runtime, when the invalid instruction is reached.
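This difference in when the error surfaces can be observed in Python, an interpreted language: an invalid operation inside a function raises nothing when the function is defined, only when the faulty line actually executes (the function name `broken` is a made-up example):

```python
def broken():
    # Adding a string and an integer is invalid, yet the interpreter
    # accepts this definition without complaint.
    return "text" + 42

# The error only appears at runtime, once the line is executed.
try:
    broken()
    outcome = "no error"
except TypeError:
    outcome = "runtime TypeError"
```

A compiler for a static language would instead reject the program outright and produce no executable.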
Two competing philosophies
After reading the previous paragraph, we may wonder: what is the advantage of dynamic languages, since they will always be handicapped performance-wise by their numerous runtime checks? First, let us remember that these languages are perfectly suited to the use cases discussed before. Beyond that, we observe that dynamic languages are used more and more in static contexts such as server-side code (see Django or Node.js).
It is easy to adopt a theoretician’s position, disgruntled to see that the common folk do not understand the mathematical superiority of the static languages. But actually, it is important to take into account psychological factors whose influence is significant. The following is not based on facts or data, but is a rather plausible personal interpretation.
Indeed, static languages have been developed for forty years, and the first generation has started to feel the passage of time. C and C++, which remain the industry standard, each have a type system, primitive for C and exuberant for C++, and both sabotage themselves by exposing a `void *` pointer type that can point to anything, leading to countless bugs. On the other hand, the new wave of strongly typed languages (OCaml, Haskell, Rust) relies on better-conceived theories, but these give programming a "proof" feeling, rather unappealing to the developer who is not well-versed in mathematics. In these strongly typed languages, the compiler has access to far more information and proves more properties about the program during its work, exposing more bugs and preventing runtime errors. Nevertheless, it is very frustrating to fight the compiler while debugging, because it feels as if we weren't even good enough to run the program. The developer's only hope is then the error messages, which are extremely difficult to make readable when the compiler is complex. All of this makes for a rather steep learning curve.
By contrast, a dynamic language gives the developer the impression of being in control of his program: whenever a runtime error occurs, he can locate precisely which instruction it stems from and start debugging from there.
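In Python, for instance, the traceback accompanying a runtime error names the exact function and line the error stems from (the `average` function here is a made-up example):

```python
import traceback

def average(values):
    # Fails with ZeroDivisionError when values is empty.
    return sum(values) / len(values)

try:
    average([])
except ZeroDivisionError:
    # The formatted traceback pinpoints average() and the faulty line.
    report = traceback.format_exc()
```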
While static languages would be the tool of choice in an idealized world for professional developers whose main concerns are performance and the absence of bugs, a dynamic language is a good starting point for a less experienced developer. Using a dynamic language in a static context is not a crime (as long as performance is not a must-have), and indeed it works. But how many debugging hours could have been spared by using types? Starting a prototype in Python is generally a good choice, but an internal alarm should go off once the project grows beyond a thousand lines of code: is my language still the right choice given the scope and execution context of my program? If the answer is no, it becomes profitable to invest some time in learning a typed (and static) language. The static/dynamic debate remains a dividing issue for the programming community, with even more subtle points that will be dealt with in a future article.
- Static Typing Where Possible, Dynamic Typing When Needed, a 2004 Microsoft paper that gives further details about the arguments raised here.
- The OCaml website, install and discover a strongly-typed static language.
- Countless online articles going over the issue with more or less relevant arguments.