programming languages (intro)

machine code and assembly

The CPU understands only machine code instructions, so the most direct way to write a program is to write it in machine code. Writing out raw binary codes is extremely tedious, however, so we make it a bit easier on ourselves by writing in an assembly language. Assembly is basically machine code in a friendlier text form, wherein each line of text gets translated (by a program called an assembler) into one machine instruction. There is no one particular ‘assembly language’ but rather many variants, each specific to a particular instruction set architecture (ISA).
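To make the one-line-to-one-instruction idea concrete, here is a minimal sketch of a toy assembler in Python. The instruction set is invented purely for illustration (a 4-bit opcode followed by a 4-bit operand) and does not correspond to any real CPU; the names OPCODES and assemble are likewise hypothetical.

```python
# Hypothetical 8-bit instruction format, invented for illustration only:
# the high 4 bits are the opcode, the low 4 bits are the operand.
OPCODES = {"LOAD": 0x1, "ADD": 0x2, "STORE": 0x3, "HALT": 0xF}


def assemble(source):
    """Translate each non-empty line of assembly text into one machine-code byte."""
    machine_code = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        mnemonic = parts[0]
        operand = int(parts[1]) if len(parts) > 1 else 0
        if mnemonic not in OPCODES:
            raise ValueError(f"unknown instruction: {mnemonic}")
        if not 0 <= operand <= 0xF:
            raise ValueError(f"operand out of range: {operand}")
        machine_code.append((OPCODES[mnemonic] << 4) | operand)
    return machine_code


program = """
LOAD 7
ADD 3
STORE 2
HALT
"""

# Print each instruction as the binary code the CPU would actually see.
for byte in assemble(program):
    print(f"{byte:08b}")
```

Each of the four lines of assembly text becomes exactly one byte of (made-up) machine code; a real assembler does the same kind of translation for a real instruction set.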

low-level vs. high-level languages

Languages are broadly classified into low- and high-level. Low-level languages directly reflect the workings of the machine whereas high-level languages enable the programmer to get more work done per line of code at the cost of direct control of the machine. Assembly languages are low-level while most other languages are high-level. A few languages, though, such as C, exist somewhere in between and so are sometimes called mid-level languages.

compilers and interpreters

Source code (the code written by the programmer as text) must somehow be translated into a running program. The programs which do this are broadly categorized as either compilers or interpreters*. Compilers translate the source into another form of code, usually machine code. Interpreters translate code directly into action: an interpreter does what the code says to do as it reads the code, line by line.
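The difference can be sketched with a toy stack language that has three made-up commands (push, add, print). The language and both functions below are hypothetical illustrations. The interpreter acts on each line as it reads it; the compiler only produces another form of code (here Python source text rather than machine code, to keep the sketch short), which must then be run separately.

```python
def interpret(source):
    """Act on each line as it is read; return the values the program printed."""
    stack, output = [], []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if parts[0] == "push":
            stack.append(int(parts[1]))
        elif parts[0] == "add":
            stack.append(stack.pop() + stack.pop())
        elif parts[0] == "print":
            output.append(stack[-1])
    return output


def compile_to_python(source):
    """Translate the whole program into Python source text without running it."""
    lines = ["stack, output = [], []"]
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if parts[0] == "push":
            lines.append(f"stack.append({int(parts[1])})")
        elif parts[0] == "add":
            lines.append("stack.append(stack.pop() + stack.pop())")
        elif parts[0] == "print":
            lines.append("output.append(stack[-1])")
    return "\n".join(lines)


program = """
push 2
push 3
add
print
"""

print(interpret(program))  # the interpreter runs the program directly

compiled = compile_to_python(program)
print(compiled)            # the compiler's result is just more code

namespace = {}
exec(compiled, namespace)  # running the translated code is a separate step
print(namespace["output"])
```

Both routes end with the same result; what differs is whether the translation produces action immediately or produces code to be run later.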

*It’s arguable whether assemblers are a kind of compiler or a distinct, third kind of translator.

dynamic vs. static typing

Languages are broadly categorized as either dynamically or statically typed. In a statically-typed language, the variables, function parameters, and function return values all must be given a fixed type in the source code; once given a fixed type, no other type may be used in that place, e.g. a variable of type X can only be assigned X values. In a dynamically-typed language, nothing is given a fixed type, so, say, any variable can be assigned any value. The trade-off here is that static typing is more restrictive but allows the language to detect certain type errors just by looking at the source (as opposed to detecting errors by actually running the code). Static typing also allows for optimizations that can be critical in high-performance code (so most graphically intensive games, for example, are written in static languages). Pigeon, JavaScript, and Python are all dynamically typed.
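A short Python sketch shows what dynamic typing looks like in practice. The names value and double are made up for the example. Note that the type error at the end is discovered only when the offending line actually runs.

```python
# The same variable can hold values of different types over its lifetime.
value = 42
value = "forty-two"


def double(x):
    # No parameter type is declared, so any value may be passed in.
    return x * 2


print(double(21))    # works on a number
print(double("ab"))  # also works on a string

try:
    double(None)     # nothing flags this until the line is executed
except TypeError as error:
    print("caught only at runtime:", error)
```

In a statically-typed language, the parameter of double would be declared with a fixed type, and a call with a value of the wrong type would typically be rejected before the program ever ran.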

weak vs. strong typing

Languages are also broadly categorized as either weakly or strongly typed. In a weakly-typed language, no mechanism in the language prevents the programmer from performing any operation on any piece of data, whether or not the operation makes sense for that type of data. For example, it makes no sense to multiply the bytes of one string by the bytes of another, but a weakly-typed language won’t stop us from doing so. We can basically treat any data as just a bunch of bytes. A strongly-typed language rejects such operations, which protects the programmer from silly mistakes, though at the cost of flexibility and, in some cases, efficiency. Virtually all high-level languages today, including Pigeon and JavaScript, are strongly typed. The primary examples of weakly-typed languages are C and assembly.
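As a rough illustration, Python behaves as a strongly-typed language in the sense described here: it refuses to multiply one string by another. The second half of the sketch uses the standard struct module to show what treating data as ‘just a bunch of bytes’ means; in Python this reinterpretation has to be requested explicitly, whereas a weakly-typed language lets it happen without any such ceremony. The variable names are made up for the example.

```python
import struct

first, second = "abc", "def"

try:
    first * second  # multiplying a string by a string makes no sense
    rejected = False
except TypeError as error:
    rejected = True
    print("rejected by the language:", error)

# Explicitly reinterpret the 4 bytes of a 32-bit float as an unsigned integer.
raw = struct.pack("<f", 1.0)          # the float 1.0 as 4 little-endian bytes
as_int = struct.unpack("<I", raw)[0]  # the same 4 bytes read as an integer
print(hex(as_int))                    # 0x3f800000
```

The integer printed is simply the bit pattern of the float 1.0; nothing about it is meaningful as a number, which is exactly the kind of mistake strong typing is meant to prevent.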

polymorphism

Polymorphism literally means ‘many shapes’. In programming, polymorphism refers to the ability of an operation to change its behavior depending upon the number and types of its inputs. For example, the + operator in JavaScript performs addition when its inputs are numbers, but with string inputs, + performs string concatenation. Polymorphism comes in two flavors: in static polymorphism, the decision of which operation to perform is made at compile-time, while in dynamic polymorphism, the decision is made at runtime.
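Python’s + operator is polymorphic in the same way as the JavaScript operator described above, so a small Python sketch can show the idea. The function name combine is made up for the example. Because Python is dynamically typed, this is dynamic polymorphism: which operation + performs is decided at runtime, from the types of the values actually passed in.

```python
def combine(a, b):
    # The same + performs a different operation depending on the input types.
    return a + b


print(combine(2, 3))      # numbers: addition, prints 5
print(combine("2", "3"))  # strings: concatenation, prints 23
print(combine([2], [3]))  # lists: concatenation, prints [2, 3]
```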

paradigms (imperative vs. functional, procedural vs. object-oriented)

A programming paradigm is a general style of approaching problems. The four most prominent paradigms are imperative, functional, procedural, and object-oriented. The imperative and functional paradigms are best understood as opposites: in imperative programming, the programmer mutates data freely, whereas in functional programming, the programmer attempts to mutate data as little as possible. The procedural and object-oriented paradigms are also best understood as opposites: in procedural programming, the programmer focuses on breaking the problem into units of action, whereas in object-oriented programming, the programmer focuses on breaking the problem into units of data.

Programming languages are typically designed to facilitate one or more paradigms at the cost of others (though it is usually possible to program in any paradigm in any language). The large majority of programming languages in use today favor the imperative paradigm over the functional. The split between procedural and object-oriented programming is about even.
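The two contrasts can be sketched in a few lines of Python, which happens to allow all four styles. Every name below is made up for the example, and real programs usually mix these styles rather than sticking to one.

```python
from functools import reduce

prices = [5, 10, 15]


# Imperative: a running total is mutated step by step.
def total_imperative(items):
    total = 0
    for item in items:
        total += item
    return total


# Functional: the result is built from expressions, with no variable reassigned.
def total_functional(items):
    return reduce(lambda accumulated, item: accumulated + item, items, 0)


# Procedural: the problem is split into units of action (functions)
# that operate on plain data passed in from outside.
def apply_discount(items, percent):
    return [item * (100 - percent) / 100 for item in items]


# Object-oriented: the problem is split into units of data (objects)
# that carry their own operations.
class Cart:
    def __init__(self, items):
        self.items = list(items)

    def total(self):
        return sum(self.items)


print(total_imperative(prices))    # 30
print(total_functional(prices))    # 30
print(apply_discount(prices, 10))  # [4.5, 9.0, 13.5]
print(Cart(prices).total())        # 30
```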

tools and popular languages

A tool in programming broadly refers to any program used in the development of software, such as assemblers, compilers, interpreters, and text editors (used to write our source code), among others.

A programming language can itself be considered a kind of tool. We’ll survey the most significant languages used today.

efficiency and portability

When we say code is efficient, we mean that it uses a minimum of resources of one kind or another, such as processor time, memory, or network bandwidth. When we say code is portable, we mean that it can be run on different platforms (different hardware and/or different operating systems) with minimal additional work by the programmer. We’ll look at the factors that make some languages tend to produce more efficient or more portable code than other languages.
