We will also try to come up with our own reasonable definitions for a lot of the buzzwords surrounding types and compilation.
There is a huge amount of great stuff in the TypeScript project that we won’t be able to cover within the scope of this blog post. Please read the official documentation to learn more.
TypeScript is an amazingly powerful tool, and really quite easy to get started with.
Whenever we stray into talking about types, compilers, etc., things can get really confusing, really fast.
This article is designed as a “what you need to know” guide for a lot of these potentially confusing concepts, so that by the time you dive into the “Getting Started” style tutorials, you are feeling confident with the various themes and terminology that surround the topic.
Getting to grips with the buzzwords
There is something about running our code in a web browser that makes us feel differently about how it works. “It’s not compiled, right?”, “Well, I definitely know there aren’t any types…”
Things get even more interesting when we consider that those statements are both correct and incorrect at the same time, depending on the context and how you define some of these concepts.
As a first step, we are going to do exactly that!
Traditionally, developers will often think about a language being a “compiled language” when they are the ones responsible for compiling their own programs.
In basic terms, when we compile a program we are converting it from the form we wrote it in, to the form it actually gets run in.
In a language like Golang, for example, you have a command line tool called go build, which allows you to compile your .go file into a lower-level representation of the code, which can then be executed and run:
# We manually compile our .go file into something we can run
# using the command line tool "go build"
go build awesome-program.go
# ...then we execute it!
./awesome-program
With JavaScript, by contrast, we write some code, load it up in a browser using a <script> tag (or run it in a server-side environment such as Node.js), and it just runs.
An interpreted computer program is one that is executed like a human reads a book, starting at the top and working down line-by-line.
The classic examples of interpreted programs that we are already familiar with are bash scripts. The bash interpreter in our terminal reads our commands in line-by-line and executes them.
Take this code as an example:
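A minimal sketch of such a program, assuming a hypothetical hello function that simply returns a greeting (the function body is our own illustration):

```typescript
hello(); // line 1: we call the function before its declaration appears
function hello(): string { // line 2: the declaration itself
  return "Hello!"; // hypothetical body, chosen only for illustration
}
```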
Here we call the hello() function before we have even defined it! A simple line-by-line execution of this program would just not be possible, because hello() on line 1 does not have any meaning until we reach its declaration on line 2.
We will not dig any further into the subtleties of defining “compiled vs interpreted” here (there are a LOT).
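The key takeaway is that our JavaScript code does go through a compilation step before it is executed, even when all we have done is load it using a <script> tag in an HTML document.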
Run Time vs Compile Time
Now that we have properly introduced the idea that compiling a program and running a program are two distinct phases, the terms “Run Time” and “Compile Time” become a little easier to reason about.
When something happens at Compile Time, it is happening during the conversion of our code from what we wrote in our editor/IDE to some other form.
When something happens at Run Time, it is happening during the actual execution of our program. For example, our hello() function above is executed at “run time”.
The TypeScript Compiler
Now that we understand these key phases in the lifecycle of a program, we can introduce the TypeScript compiler.
Instead of writing our code and loading it straight into a browser using a <script> tag, for example, we will first pass it through the TypeScript compiler so that it can give us helpful hints on how we can improve our program before it runs.
There are many great posts about the different options for integrating the TypeScript compiler into your existing workflow, including the official documentation. It is beyond the scope of this article to go into those options here.
Dynamic vs Static Typing
Just like with “compiled vs interpreted” programs, the existing material on “dynamic vs static typing” can be incredibly confusing.
We have the following program:
How would we describe this code to somebody?
“We have declared a variable called name, which is assigned the string of ‘James’, and we have declared the variable sum, which is assigned the value we get when we add the number 1 to another number.”
Taken directly from the official spec:
An ECMAScript language type corresponds to values that are directly manipulated by an ECMAScript programmer using the ECMAScript language.
The ECMAScript language types are Undefined, Null, Boolean, String, Symbol, Number, and Object.
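To make the idea that types belong to values a little more concrete, here is a small sketch that asks the typeof operator about a handful of sample values. The sample values are our own, and only some of the listed types are covered.

```typescript
// A few sample values, each belonging to one of the language types quoted above.
const samples: unknown[] = [undefined, null, true, "James", 1, {}];

// typeof inspects the value itself at run time.
const reported: string[] = samples.map((value) => typeof value);

// Note: typeof null reports "object", a long-standing quirk of the language.
console.log(reported);
// → ["undefined", "object", "boolean", "string", "number", "object"]
```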
We could take our name variable, which is currently assigned the string ‘James’, and reassign it to the current value of our second variable sum, which is a number.
The value ‘James’ is always one type, a string, but the name variable can be assigned any value, and therefore any type. The exact same is true in the case of the sum assignment: the value 1 is always a number type, but the sum variable could be assigned any possible value.
For our purposes, this also just so happens to be the very definition of a “dynamically typed language”!
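The reassignment described above can be sketched as follows. Because all of our examples are written in TypeScript, we annotate the variable with any, which switches off type checking for it and lets it behave as it would in plain JavaScript. The wrapping function, its name, and the second operand of the addition are assumptions made for illustration.

```typescript
function reassignName(): string[] {
  // "any" opts this variable out of type checking, mimicking plain JavaScript.
  let name: any = "James";
  const sum = 1 + 2; // the second operand is an assumption

  const observed: string[] = [typeof name]; // the value is currently a string
  name = sum; // same variable, now holding a number
  observed.push(typeof name);
  return observed;
}

console.log(reassignName()); // → ["string", "number"]
```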
By contrast, we can think of a “statically typed language” as being one in which we can (and very likely have to) associate type information with a particular variable:
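A minimal sketch of such a declaration. The wrapping function is our own addition, so that name does not clash with the global name property that browsers already provide.

```typescript
function declareName(): string {
  // ": string" is the type annotation; it records that name should only hold strings.
  let name: string = "James";
  return name;
}
```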
In this code, we are better able to explicitly declare our intentions for the name variable: we want it to always be used as a string.
And guess what? We have just seen our first bit of TypeScript in action!
This improved clarity benefits not only the TypeScript compiler, but also our colleagues and future selves when they come to read and understand our code. Code is read far more than it is written.
The type annotation : string for our name variable is used by TypeScript at compile time (in other words, when we pass our code through the TypeScript compiler) to make sure that the rest of the code is true to our original intention.
Let’s take a look at our program again, and add another explicit annotation, this time for our sum variable:
If we let TypeScript take a look at this code for us, we will now get the error “Type 'number' is not assignable to type 'string'” for our name = sum assignment, and we are appropriately warned against shipping potentially problematic code to be executed by our users.
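A sketch of the program with both annotations in place. The wrapping function and the second operand of the addition are assumptions. The @ts-expect-error comment tells the compiler that we know the next line is an error, which keeps the sketch compilable; remove it to see the message quoted above.

```typescript
function assignSumToName(): string {
  let name: string = "James";
  let sum: number = 1 + 2; // the second operand is an assumption

  // @ts-expect-error Type 'number' is not assignable to type 'string'.
  name = sum;

  // At run time nothing stops the assignment: name now holds a number.
  return typeof name;
}

console.log(assignSumToName()); // → "number"
```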
The type annotations are all removed for us automatically, and we can now run our code.
NOTE: In this example, the TypeScript Compiler would have been able to offer us the exact same error even if we hadn’t provided the explicit type annotations.
TypeScript is very often able to just infer the type of a variable from the way we have used it!
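The same sketch without any annotations, again with an assumed wrapping function and second operand. The compiler infers the types from the initial values and still rejects the assignment.

```typescript
function inferTypes(): string {
  let name = "James"; // inferred as string from the initial value
  const sum = 1 + 2; // inferred as number

  // @ts-expect-error Type 'number' is not assignable to type 'string'.
  name = sum;

  return typeof name;
}
```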
Our source file is our document, TypeScript is our Spell Check
A great analogy for TypeScript’s relationship with our source code is Spell Check’s relationship with a document we are writing in, for example, Microsoft Word.
There are three key commonalities between the two examples:
- It can tell us when stuff we have written is objectively, flat-out wrong:
  - Spell Check: “we have written a word that does not exist in the dictionary”
  - TypeScript: “we have referenced a symbol (e.g. a variable), which is not declared in our program”
- It can suggest that what we have written might be wrong:
  - Spell Check: “the tool is not able to fully infer the meaning of a particular clause and suggests rewriting it”
  - TypeScript: “the tool is not able to fully infer the type of a particular variable and warns against using it as is”
- Our source can be used for its original purpose, regardless of whether there are errors from the tool:
  - Spell Check: “even if your document has lots of Spell Check errors, you can still print it out and ‘use’ it as a document”
  - TypeScript: “even if your source code has lots of TypeScript errors, you can still run the resulting JavaScript”
TypeScript is a tool which enables other tools
The TypeScript compiler is made up of a couple of different parts or phases. We are going to finish off this article by looking at how one of those parts - the Parser - offers us the chance to build additional developer tools on top of what TypeScript already does for us.
The result of the “parser step” of the compilation process is what is called an Abstract Syntax Tree, or AST for short.
What is an Abstract Syntax Tree (AST)?
We write our programs in a free text form, as this is a great way for us humans to interact with our computers to get them to do the stuff we want them to. We are not so great at manually composing complex data structures!
However, free text is actually a pretty tricky thing to work with within a compiler in any kind of reasonable way. It may contain things which are unnecessary for the program to function, such as whitespace, or there may be parts which are ambiguous.
For this reason, we ideally want to convert our programs into a data structure which maps out all of the so-called “tokens” we have used, and where they slot into our program.
This data structure is exactly what an AST is!
An AST could be represented in a number of different ways, but let’s take a look at a quick example using our old buddy JSON.
If we have this incredibly basic source code:
The (simplified) output of the TypeScript Compiler’s Parser phase will be the following AST:
The objects in our AST are called nodes.
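To give a feel for what a tree of nodes looks like, here is a hand-written sketch for a hypothetical one-line program. The kind names are borrowed from TypeScript’s own vocabulary, but the AstNode shape and the collectKinds helper are our own simplifications and not actual compiler output.

```typescript
// A deliberately simplified node shape (an assumption, not the compiler's real type).
interface AstNode {
  kind: string;
  text?: string;
  children?: AstNode[];
}

// Hand-written tree for the hypothetical source: const answer = 42;
const ast: AstNode = {
  kind: "SourceFile",
  children: [
    {
      kind: "VariableStatement",
      children: [
        {
          kind: "VariableDeclaration",
          children: [
            { kind: "Identifier", text: "answer" },
            { kind: "NumericLiteral", text: "42" },
          ],
        },
      ],
    },
  ],
};

// Visit every node depth-first and collect its kind.
function collectKinds(node: AstNode): string[] {
  const kinds: string[] = [node.kind];
  for (const child of node.children || []) {
    kinds.push(...collectKinds(child));
  }
  return kinds;
}

console.log(collectKinds(ast));
// → ["SourceFile", "VariableStatement", "VariableDeclaration", "Identifier", "NumericLiteral"]
```

Because every piece of the program has a labelled place in the tree, a tool can ask questions such as “where are all the identifiers?” without ever searching through raw text.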
Example: Renaming symbols in VS Code
Internally, the TypeScript Compiler will use the AST it has produced to power a couple of really important things such as the actual Type Checking that occurs when we compile our programs.
But it does not stop there!
We can use the AST to develop our own tooling on top of TypeScript, such as linters, formatters, and analysis tools.
One great example of a tool built on top of this AST generation is the Language Server.
It is beyond the scope of this article to dive into how the Language Server works, but one absolutely killer feature that it enables for us when we write our programs is that of “renaming symbols”.
Let’s say that we have the following source code:
After a thorough code review and appropriate bikeshedding, it is decided that we should switch our variable naming convention to use camel case instead of the snake case we are currently using.
In our code editors, we have long been able to select multiple occurrences of the same text and use multiple cursors to change all of them at once - awesome!
Ah! We have fallen into one of the classic traps that appear when we continue to treat our programs as pieces of text.
The word “name” in our comment, which we did not want to change, got caught up in our manual matching process. We can see how risky such a strategy would be for code changes in a real-world application!
As we learned above, when something like TypeScript generates an AST for our program behind the scenes, it no longer has to interact with our program as if it were free text - each token has its own place in the AST, and its usage is clearly mapped.
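The difference between the two approaches can be sketched with a toy example. The source text, the renameInCode helper, and its comment detection are all hypothetical. The helper simply splits each line at "//", which is a crude stand-in for the structural knowledge that a real AST provides, and it is not how the Language Server works.

```typescript
// Hypothetical source: the variable name also appears inside a comment.
const source: string = [
  "// Store the first_name of the current user",
  "const first_name = 'James';",
  "console.log(first_name);",
].join("\n");

// Text-based approach: every match is replaced, including the one in the comment.
const textRenamed: string = source.split("first_name").join("firstName");

// Toy structure-aware approach: only rename in the code part of each line.
function renameInCode(src: string, from: string, to: string): string {
  return src
    .split("\n")
    .map((line) => {
      const commentStart = line.indexOf("//");
      const code = commentStart === -1 ? line : line.slice(0, commentStart);
      const comment = commentStart === -1 ? "" : line.slice(commentStart);
      return code.split(from).join(to) + comment;
    })
    .join("\n");
}

const symbolRenamed: string = renameInCode(source, "first_name", "firstName");

console.log(textRenamed.split("\n")[0]); // → "// Store the firstName of the current user"
console.log(symbolRenamed.split("\n")[0]); // → "// Store the first_name of the current user"
```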
We can take advantage of this directly in VS Code using the “rename symbol” option when we right click on our first_name variable (TypeScript Language Server plugins are available for other editors).
Much better! Now our first_name variable is the only thing that will be changed, and this change will even happen across multiple files in our project if applicable (as with exported and imported values)!
Phew! We have covered a lot in this post.
We cut through all of the academic distractions to decide on practical definitions for a lot of the terminology that surrounds any discussion on compilers and types.
We looked at compiled vs interpreted languages, run time vs compile time, dynamic vs static typing, and how Abstract Syntax Trees give us a better way to build tooling for our programs.