the C language (notes, part 1)

C was created around 1972 by Dennis Ritchie at AT&T’s Bell Labs, building on Ken Thompson’s earlier B language. Both men were working at the time on the Unix operating system.

Two other popular languages, C++ and Objective-C, both created in the 1980s, are essentially C with object-oriented features added.

C was standardized by ANSI in 1989 (C89). ISO adopted the same standard in 1990 (C90), and it has since been revised by C99 (1999), C11 (2011), and later editions.

Today, the most popular compilers for C are GNU’s GCC, LLVM’s Clang, and Microsoft’s Visual C++ compiler (MSVC).

§

C is imperative, procedural, statically typed, and weakly typed.

C differs from most other high-level languages in that it allows the programmer to explicitly manipulate the contents of memory, byte-by-byte. This allows the programmer to keep data compact, but it means they are responsible for manually allocating and deallocating memory.

The C calling convention makes it easy for C to invoke functions written in assembly and for code written in assembly to invoke functions written in C.

Because it allows control of memory and easy integration with assembly, C is often used as a “systems programming” language, meaning it is used to write operating systems and device drivers.

C is also appropriate for performance-critical code, such as code for real-time 3D graphics in a game (though the large majority of games for PC and consoles are written in C++).

§

The basic data types:

  • char      1-byte integer (whether plain char is signed depends on the compiler)
  • int       n-byte signed integer (n depends on the ISA we’re compiling for)
  • float     single-precision floating-point
  • double    double-precision floating-point

To create a variable, we declare it with a statement of this form:

type name;

Functions are defined with this syntax:

returnType name(parameters) {body}

…where parameters are declarations separated by commas.
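As a small illustration of the declaration and function forms above, here is a sketch with two functions. The names square and average are made up for this example and do not come from any library.

```c
/* Hypothetical example functions; the names are invented for illustration. */

/* Takes one int parameter and returns an int. */
int square(int n) {
    int result;              /* a declaration of the form: type name; */
    result = n * n;
    return result;
}

/* Takes two double parameters, separated by a comma. */
double average(double a, double b) {
    return (a + b) / 2.0;
}
```

A call such as square(7) evaluates to 49, and average(1.0, 2.0) evaluates to 1.5.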

A casting operation takes a value and returns its equivalent (or approximate equivalent) in another type:

(type) expression

For instance:

(int) 36.7           // yields 36: the fractional part is discarded, not rounded
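To make the behavior of a floating-point-to-int cast concrete, here is a minimal sketch. The helper name to_int is an assumption made for this example.

```c
/* Hypothetical helper: converts a double to an int with a cast. */
int to_int(double d) {
    return (int) d;          /* the fractional part is discarded */
}
```

With this helper, to_int(36.7) gives 36 and to_int(-36.7) gives -36, because the conversion truncates toward zero rather than rounding to the nearest integer.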

§

C originally had no boolean type (C99 added _Bool, with the names bool, true, and false available through <stdbool.h>). For the sake of conditions, the numeric value 0 (or 0.0) represents false while all other numeric values represent true. Logical operations return the int 0 for false and 1 for true.
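The rule for conditions can be sketched as follows. The function names is_truthy and both_positive are hypothetical, chosen only for this example.

```c
/* Hypothetical helper: reports how a condition treats a numeric value. */
int is_truthy(double value) {
    if (value) {             /* any nonzero value counts as true */
        return 1;
    }
    return 0;                /* 0 and 0.0 count as false */
}

/* Relational and logical operators themselves produce the int 0 or 1. */
int both_positive(int a, int b) {
    return (a > 0) && (b > 0);
}
```

Here is_truthy(-3.5) gives 1, is_truthy(0.0) gives 0, and both_positive(2, -5) gives 0.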

As in JavaScript, each function in C is its own scope, but in C, each {} block within a function is also a subscope. A variable declared within a subscope exists only within that subscope, not in the rest of the function. When a subscope declares a variable with the same name as a variable from an outer scope, the outer variable is effectively hidden within that subscope.
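The hiding of an outer variable can be sketched like this. The function name shadow_demo is invented for the example.

```c
/* Hypothetical function showing an inner block hiding an outer variable. */
int shadow_demo(void) {
    int x = 1;               /* outer x */
    int seen_inside;
    {
        int x = 2;           /* a new x that hides the outer one in this block */
        seen_inside = x;     /* refers to the inner x */
    }
    /* The inner x no longer exists here; x is the outer variable again. */
    return seen_inside * 10 + x;
}
```

The function returns 21: the inner block saw 2, while the outer x kept its value of 1.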

In a C program, execution begins with the function named main, which should return an int. (The value returned by main becomes the “exit code”, something discussed in a later unit.)

§

Variables in C are always ‘value variables’: they hold a value itself directly, not the address of that value elsewhere. However, C has data types called pointers, which are values representing addresses. So a pointer variable is a variable which holds an address.

There’s no one pointer type. Rather, for each type in the language, there is a corresponding pointer type. To declare a pointer, we use *:

int *foo;   // declare variable foo as a pointer to an int

The address-of operator (&), sometimes called the reference operator, returns a pointer to an ‘lvalue’ (something you can assign to, most commonly a variable):

int hippo;
int *eagle = &hippo;       // get the address of hippo and assign it to eagle

Note that & used on an int variable returns a pointer-to-int, so it can only be assigned to a pointer-to-int variable. However, any pointer can be cast into any other kind of pointer, e.g.:

int hippo;
char *toad = (char *) &hippo;

To get the value at the address pointed to by a pointer, use the dereference operator (*) as a unary operator:

*pointer

A dereference expression is a valid lvalue: when assigning to the dereference of a pointer, we are assigning a value at the address pointed to.

int hippo;
int *eagle = &hippo;
*eagle = 6;      // assign 6 to the int at the location pointed to by eagle
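One common use of writing through a pointer is letting a function change its caller's variables. The sketch below assumes two made-up function names, swap_ints and write_through_pointer.

```c
/* Hypothetical helper: swaps two ints through pointers to them. */
void swap_ints(int *a, int *b) {
    int tmp = *a;            /* read the int that a points to */
    *a = *b;                 /* write through a */
    *b = tmp;                /* write through b */
}

/* Hypothetical demo combining & and *. */
int write_through_pointer(void) {
    int hippo = 0;
    int *eagle = &hippo;     /* eagle holds the address of hippo */
    *eagle = 6;              /* changes hippo itself */
    return hippo;            /* 6 */
}
```

A caller with int variables a and b would write swap_ints(&a, &b); to exchange their values.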

§

Adding an integer to a pointer, or subtracting an integer from it, returns another pointer. Adding n to a pointer-to-x returns a pointer which is n x’s higher in memory, i.e. the address grows by n times the size of x. Subtracting produces a pointer which is lower in memory.

You can subtract one pointer-to-x from another to get the number of x’s that fit in the space between the two addresses. However, you can never add one pointer to another.
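Pointer arithmetic and pointer subtraction can be sketched as follows. The names ints_between and third_element are hypothetical. The sketch assumes both pointers point into the same array, which is the only case where the result is well defined.

```c
#include <stddef.h>          /* ptrdiff_t */

/* Hypothetical helper: number of ints between two pointers into one array. */
ptrdiff_t ints_between(const int *low, const int *high) {
    return high - low;       /* counted in ints, not in bytes */
}

/* Hypothetical demo: reach an element with pointer arithmetic. */
int third_element(const int *start) {
    const int *p = start + 2;    /* two ints past start */
    return *p;
}
```

Given an array of ints starting at address v, ints_between(v + 1, v + 4) is 3 regardless of how many bytes an int occupies.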

If you want a pointer to a specific fixed address, simply cast an integer literal into a pointer, e.g.:

char *p;
p = (char *) 0xB000FFFF;

(Note that the literal can be expressed in hex.)

Pointers can be compared with relational tests (==, >, <=, etc.); for example, == tests whether two pointers hold the same address.

It’s largely because of pointers that C is considered a weakly typed language. With pointer arithmetic, casting, and dereferencing, we can effectively write arbitrary bytes to any address, whatever type the data there is supposed to have.
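As a rough illustration of how casting sidesteps the type system, the sketch below overwrites an int one byte at a time through an unsigned char pointer. The function name fill_int_bytes is invented for the example.

```c
#include <stddef.h>          /* size_t */

/* Hypothetical helper: overwrites every byte of an int through a char pointer. */
int fill_int_bytes(unsigned char byte) {
    int target = 12345;
    unsigned char *p = (unsigned char *) &target;   /* view the int as raw bytes */
    size_t i;
    for (i = 0; i < sizeof target; i++) {
        *(p + i) = byte;     /* the compiler accepts this without complaint */
    }
    return target;
}
```

Filling with the byte 0 leaves the int equal to 0. On the usual two's-complement machines, filling with 0xFF leaves it equal to -1.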

§

The standard library contains the functions malloc and free (declared in <stdlib.h>) for allocating and deallocating memory, respectively. A call to malloc takes the number of bytes you wish to allocate and returns a pointer to the allocated block. When an allocation fails, malloc returns a null pointer (a pointer with the address 0). A call to free takes the pointer to the start of the allocated block you wish to free.
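A typical allocate, check, use, and free sequence might look like the sketch below. The helper name make_filled is an assumption for this example; malloc, free, and NULL are the real standard library names.

```c
#include <stdlib.h>          /* malloc, free, NULL, size_t */

/* Hypothetical helper: allocates a block of count ints and sets each to value.
   Returns a null pointer if the allocation fails. The caller must free the block. */
int *make_filled(size_t count, int value) {
    int *block = malloc(count * sizeof(int));
    size_t i;
    if (block == NULL) {
        return NULL;         /* allocation failed */
    }
    for (i = 0; i < count; i++) {
        *(block + i) = value;
    }
    return block;
}
```

A caller would check the result against NULL, use the block, and then pass the same pointer to free exactly once.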

The sizeof operator returns the size in bytes of a type or of an expression. The parentheses are required around a type name:

sizeof(type)
sizeof expression

This is useful for portability because some types differ in size when you compile for different ISAs.
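The two ways of using sizeof can be sketched as follows. The function names are hypothetical. Note that a type name given to sizeof must be wrapped in parentheses, while an expression need not be.

```c
#include <stddef.h>          /* size_t, the type that sizeof yields */

/* Hypothetical helper: bytes needed to hold count ints on this platform. */
size_t bytes_for_ints(size_t count) {
    return count * sizeof(int);      /* type name: parentheses required */
}

/* Hypothetical helper: size of a variable rather than of a type. */
size_t size_of_double_variable(void) {
    double d = 0.0;
    return sizeof d;                 /* expression: parentheses optional */
}
```

The only size the language fixes is sizeof(char), which is always 1. The others vary with the platform, which is why code should compute them instead of hard-coding them.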

§

An array, in C, is a contiguous block of memory holding elements of a single type. An array declared inside a function is allocated on the stack, and it is declared much like any other local variable:

type name[size];

For instance:

float sam[5];    // declare an array the size of 5 floats

In most expressions, the array name sam is converted to a pointer-to-float pointing to the start of the array. An array name is not a variable, however, so you can’t assign to it.

An array, like any other local variable, only exists for the duration of its scope. Therefore, when you want a block of memory that outlives the function call that creates it, you should use malloc instead of a local array.
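The way an array name turns into a pointer when passed to a function can be sketched like this. The names sum_floats and sum_local_array are invented for the example, and the initializer list and [] indexing are standard C features these notes have not yet covered.

```c
#include <stddef.h>          /* size_t */

/* Hypothetical helper: sums count floats starting at the given pointer. */
float sum_floats(const float *values, size_t count) {
    float total = 0.0f;
    size_t i;
    for (i = 0; i < count; i++) {
        total += values[i];          /* values[i] means *(values + i) */
    }
    return total;
}

/* Hypothetical demo: a local array passed by name to the helper above. */
float sum_local_array(void) {
    float sam[5] = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f};
    return sum_floats(sam, 5);       /* sam converts to a pointer to its first float */
}
```

The helper receives only a pointer, so the element count has to be passed separately. The array sam itself disappears when sum_local_array returns.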

§

C has no real string type. Instead, a string literal is actually a char pointer value: the string data is stored in a permanent place in the process (neither on the stack nor on the heap), and the char pointer points to the first byte. These “strings” are stored with a null byte (a byte of 0000_0000) to mark the end. The standard library functions that deal with strings always expect strings in this form—an array of chars ending in a null byte.
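To show what the terminating null byte is for, here is a sketch of a length function that walks a string until it finds that byte. The name count_chars is made up. The standard library's strlen (in <string.h>) reports the same count.

```c
#include <stddef.h>          /* size_t */

/* Hypothetical helper: counts the chars that come before the terminating null byte. */
size_t count_chars(const char *s) {
    size_t n = 0;
    while (*(s + n) != '\0') {       /* '\0' is the null byte */
        n++;
    }
    return n;
}
```

For the literal "cat", count_chars gives 3, although the literal occupies 4 bytes once the null byte is included.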

A structure—or struct—in C, is a data type formed as a compound of other types. A struct is declared:

struct typeName {members};

…where members is a list of declarations. For instance:

struct cat {
    char *name;
    int age;
};

Be clear that struct becomes part of the type name, so the above defines a type named “struct cat”. The members of a struct are accessed with the . (dot) operator:

struct cat mittens;  // declare a struct cat named mittens
mittens.name = "Mittens";
mittens.age = 5;

You can assign a struct to another struct of the same type, which effectively copies each member. Oddly, though, you can’t use == to test whether two structs share all the same values.
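Struct assignment, and a hand-written substitute for ==, can be sketched as follows. The function names cats_equal and copy_demo are hypothetical. strcmp is the real standard library function for comparing strings.

```c
#include <string.h>          /* strcmp */

struct cat {
    char *name;
    int age;
};

/* Hypothetical helper: compares member by member, since == is not allowed on structs. */
int cats_equal(struct cat a, struct cat b) {
    return a.age == b.age && strcmp(a.name, b.name) == 0;
}

/* Hypothetical demo: assignment copies every member into a separate struct. */
int copy_demo(void) {
    struct cat mittens;
    struct cat copy;
    mittens.name = "Mittens";
    mittens.age = 5;
    copy = mittens;          /* copies name (the pointer) and age */
    copy.age = 6;            /* changes only the copy */
    return mittens.age;      /* still 5 */
}
```

Note that the copy duplicates the name pointer, not the characters it points to, so both structs refer to the same string data.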

sizeof is especially useful when allocating memory for structs with malloc:

struct cat *cats = malloc(8 * sizeof(struct cat));

This allocates enough memory to hold 8 struct cats.
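A fuller sketch of allocating and initializing a block of structs is shown below. The helper name make_cats and the placeholder name "unnamed" are assumptions made for this example.

```c
#include <stdlib.h>          /* malloc, free, NULL, size_t */

struct cat {
    char *name;
    int age;
};

/* Hypothetical helper: allocates count cats and gives each a placeholder
   name and an age of 0. Returns a null pointer on failure. The caller frees. */
struct cat *make_cats(size_t count) {
    struct cat *cats = malloc(count * sizeof(struct cat));
    size_t i;
    if (cats == NULL) {
        return NULL;
    }
    for (i = 0; i < count; i++) {
        cats[i].name = "unnamed";    /* cats[i] is the i-th struct in the block */
        cats[i].age = 0;
    }
    return cats;
}
```

After struct cat *cats = make_cats(8); the caller can use cats[0] through cats[7], and should call free(cats) when done.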
