Python – continued

truth value and booleans

The boolean values are written True and False. Both are reserved words, so they cannot be reassigned: True always references the one boolean object representing true while False always references the one boolean object representing false.

For the purpose of truth tests, the following values are considered false:

  • None
  • False
  • zero (of any numeric type)

For other types, the truth value is determined by their __len__ or __bool__ methods:

  • If present, the __bool__ method determines the truth value by returning either True or False.
  • Collection types include a method __len__, which returns the length (number of items) of the collection. If an object has a __len__ method but no __bool__ method, Python invokes __len__ to determine its truth value. A length greater than zero is considered true; a length of zero is considered false. So empty strings, dictionaries, and lists, for example, are considered false.
  • Objects with neither a __len__ nor __bool__ method are simply considered true.

So for example, if I create a class with both a __len__ method and a __bool__ method, the truth value of an instance of the class is determined by the value returned from calling its __bool__ method.
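The rules above can be checked with a few throwaway classes. This is only a sketch: the class names Basket, AlwaysOpen, and Plain are made up for illustration and do not come from any library.

```python
class Basket:
    """Hypothetical container that defines only __len__."""
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class AlwaysOpen(Basket):
    """Hypothetical subclass that adds __bool__, which takes priority over __len__."""
    def __bool__(self):
        return True


class Plain:
    """Defines neither __len__ nor __bool__, so instances are always true."""


empty_basket_truth = bool(Basket([]))       # False: __len__ returns 0
full_basket_truth = bool(Basket([1, 2]))    # True: __len__ returns 2
override_truth = bool(AlwaysOpen([]))       # True: __bool__ wins over __len__
plain_truth = bool(Plain())                 # True: neither method is defined
```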

logical operators

The logical operators are written not with symbols like in Javascript but with the reserved words and, or, and not. Otherwise, they work just the same:

x and y             # returns x if x is false, returns y if x is true
x or y              # returns x if x is true, returns y if x is false
not x               # returns False if x is true, or True if x is false
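Because and and or return one of their operands rather than a strict boolean, and skip the right operand when the left one already decides the result, they are often used for defaults and guards. A small sketch (the function fallback and the variable names are invented for the example):

```python
calls = []

def fallback():
    # Records that it ran, so we can see whether "or" evaluated it.
    calls.append("fallback")
    return "default"

zero = 0
name = "" or "anonymous"      # "" is false, so the right operand is returned
count = zero and 1 / zero     # zero is false, so 1 / zero is never evaluated
kept = "x" or fallback()      # "x" is true, so fallback() is never called
inverted = not []             # not always returns a real boolean
```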

chained assignment

In Javascript, = is an operator which returns the value assigned, but in Python, an assignment is not considered a kind of expression. So whereas this is legal Javascript:

foo(x = 3)         # legal in Javascript

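…it does not mean the same thing in Python, where foo(x = 3) passes 3 as a keyword argument named x rather than assigning to a variable x. However, Python makes special allowance for chaining assignments: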

a = b = c = 3         # assign 3 to a, b, and c
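One thing worth seeing for yourself is that a chained assignment binds every name to the same object. That is harmless for numbers but matters for mutable values such as lists. A quick sketch, with variable names chosen only for illustration:

```python
a = b = c = 3
a = 10                  # rebinding a does not affect b or c

first = second = []     # both names refer to the same list object
first.append("x")       # so the change is visible through either name
```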

chained relational and equality operators

In most programming languages, including Javascript, (x < y < z) is not a test for whether x is less than y which is in turn less than z. Rather, (x < y < z) is evaluated as:

(x < y) < z

…the first < operation returns the boolean value true or false, so the outer operation doesn’t make any sense: z can’t be greater than the values true or false. The correct way to express (x < y < z) in Javascript is more verbosely as two separate relational operations and‘ed together: “x is less than y and y is less than z”:

(x < y) && (y < z)

However, Python has a feature it calls chaining, wherein Python bends the normal evaluation rules for adjacent relational and equivalence operations, e.g.:

w == x < y >= z

…is magically treated as:

(w == x) and (x < y) and (y >= z)

So Python—unlike C, Java, C#, and most other languages—treats:

x < y < z

…automatically the same as:

x < y and y < z
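One difference between the chained form and the spelled-out form is that a chained comparison evaluates each operand at most once. The sketch below uses a made-up function, middle, that records each call so the difference is visible:

```python
calls = []

def middle():
    # Records each call so we can count evaluations.
    calls.append("middle")
    return 5

chained = 1 < middle() < 10                  # middle() is called once
calls_after_chained = len(calls)

explicit = 1 < middle() and middle() < 10    # middle() is called twice
calls_after_explicit = len(calls)
```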

I really wish Python didn’t have this chaining syntax, as it’s too easy to forget about this special behavior coming from other languages. In fact, even though the more explicit syntax is more verbose, most Python code I’ve seen favors being verbose over using the chaining syntax. I suggest you do likewise.

the ==, !=, is, and is not operators

The == operator invokes the special method __eq__:

x == y    # x.__eq__(y)

For the built-in types (such as strings, numbers, lists, and dictionaries), the __eq__ method returns True when the values are equivalent and False when they are not. For classes of your own making, if you don’t include an __eq__ method yourself, the __eq__ method of the object class ends up getting called, which returns True when the objects are identical and False when they are not. If you want == to test for value equivalence of instances of your class, you must yourself define __eq__ on their class to do an equivalence test.

The != operator invokes the special method __ne__ (“not equals”). As you’d expect, the __ne__ methods defined in the built-in types return True or False based on value equivalence, returning True when the values are not equivalent and False when they are. In Python 3, the __ne__ defined in the object class calls __eq__ and inverts the result, so when you define an __eq__ method for a class, != follows it automatically and you rarely need to define __ne__ yourself. (Be clear that nothing forces you to make __ne__ return the inverse truth value of __eq__, but it would be odd if you didn’t.)

The is and is not operators always test for object identity:

x is y        # return True if x and y are identical, False otherwise
x is not y    # return True if x and y are not identical, False otherwise

Be clear that the is not operator is a single binary operator even though it is written as two separate reserved words. Unlike ==, !=, and most other operators, is and is not do not invoke any special methods.
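To see the difference between equality and identity, here is a sketch with two hypothetical classes: Point defines __eq__ for value comparison, while Opaque relies on the default inherited from object.

```python
class Point:
    """Hypothetical class with value-based equality."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented    # let Python try the other operand
        return (self.x, self.y) == (other.x, other.y)


class Opaque:
    """No __eq__ defined, so == falls back to identity."""


p, q = Point(1, 2), Point(1, 2)
same_value = p == q                    # True: __eq__ compares coordinates
differs = p != q                       # False: Python 3 derives != from __eq__
same_object = p is q                   # False: two distinct objects
opaque_equal = Opaque() == Opaque()    # False: default == tests identity
```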

reflected operator methods

Python gives every binary operator method (__add__, __sub__, __mul__, etc.) a corresponding ‘reflected’ method (__radd__, __rsub__, __rmul__, etc.). These methods only get called when the operand types differ and the left operand’s operator method either doesn’t exist or returns the value NotImplemented. So for example, given (3 + x) where x is an object of type Foo, Python first invokes (3).__add__(x), which will return NotImplemented (because that’s what __add__ of the int class does when the other argument is not a number); Python then invokes x.__radd__(3), throwing an exception if x has no such method.

The point of reflected methods is to allow us to use binary operators without having to think about which operand should go on the left and which the right. In general, then, a reflected method should do the same thing as its non-reflected counterpart, just with the operands swapped. Non-reflected methods also should generally return NotImplemented when the type of the other operand is not recognized.
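As a sketch of how this plays out, here is a hypothetical Meters class (not from any library) whose __radd__ reuses __add__ because addition is commutative for it:

```python
class Meters:
    """Hypothetical length type that can be added to plain numbers."""
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        return NotImplemented      # unrecognized type: let Python try the other operand

    def __radd__(self, other):
        # Addition is commutative here, so the reflected method reuses __add__.
        return self.__add__(other)


left = Meters(2) + 3     # calls Meters.__add__
right = 3 + Meters(2)    # int.__add__ returns NotImplemented, then Meters.__radd__ runs

try:
    Meters(2) + "3"
    failed = False
except TypeError:
    failed = True        # both operands declined, so Python raises TypeError
```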

literals

In Python, we can express numbers in decimal, hex, octal, binary, and scientific notation:

35                    # the value 35
2.3                   # the value 2.3
0x6AC2                # the value hex 6AC2
0o27                  # the value octal 27
0b11110110            # the value binary 11110110
662.25E-4             # the value 662.25 * 10⁻⁴

(The prefixes 0x, 0o, and 0b can alternatively be written 0X, 0O, and 0B.)
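If you want to confirm what these literals mean, the built-in functions hex, oct, and bin convert an integer back to the corresponding notation. A quick check using the literal values from the examples above:

```python
from_hex = 0x6AC2            # 27330 in decimal
from_octal = 0o27            # 23 in decimal
from_binary = 0b11110110     # 246 in decimal
scientific = 662.25E-4       # same float as 0.066225

as_hex = hex(from_hex)           # '0x6ac2'
as_octal = oct(from_octal)       # '0o27'
as_binary = bin(from_binary)     # '0b11110110'
```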

open delimiters

When a (, [, or { is left open by the end of its line, the normal rules of indentation are suspended until it gets closed. This is especially useful when calling a function with a lengthy argument list because it allows us to spread the arguments across multiple lines. For example, rather than writing:

foo(a, b, c, d, e, f, g) * 98
bar()

…we can write:

foo(a, b,
    c,
    d, e,
    f,
    g) * 98
bar()      # normal indentation rules apply

The additional lines can be indented any way we want, but the above is probably stylistically most sensible. After the line on which the delimiter is closed, normal indentation resumes.

multiple statements per line

Normally, we write one statement per line, but we can write more than one expression or assignment statement on a line by separating them with semi-colons:

foo(); bar(); x = 3; ack();

Most programmers avoid this style.

one-line blocks

When we introduce a block with a colon, we can write the block on the same line if it consists of just one line:

if x > 5: foo()

This line can consist of multiple expression statements separated by semi-colons:

if x > 5: foo(); bar()

Some people appreciate the compactness this affords for simple blocks, but I find the stylistic inconsistency introduced not worth it. Be clear that we can’t nest other blocks this way:

if x > 5: if y < 2: foo()    # syntax error

The simple rule to remember is that a line never has more than one colon that introduces a block.

multiple lines with \

If your line is getting too long, you can continue it onto the next line using \. Logically, Python treats the line after one ending in \ as if it is just a continuation of the previous line. For example:

x = 3 + 6 * \
  y \
 * 3

…is the same as writing:

x = 3 + 6 * y * 3

Generally, the better style is to use an open delimiter to get the same allowance:

x = (3 + 6 *
  y
  * 3)

pass statement

A block is always expected after a colon, but sometimes we don’t want to do anything in the block. In such cases, Python requires you make this explicit with a statement consisting of just the reserved word pass:

def foo(): pass         # a function that does nothing

 

number types and precision

In Javascript, all number values are represented in the IEEE 64-bit floating-point format, even integers. Consequently, Javascript numbers have a limited range: past a magnitude of about 10³⁰⁸, all values get rounded to positive or negative infinity. Moreover, not all values within that range can be represented exactly: for example, 10³⁰⁰ is within range, but (10³⁰⁰ + 1) cannot be distinguished from it because doing so would require a significand with more bits than Javascript uses to represent a number.

In Python, real-number values are represented by two separate types: int for integers and float for non-integers. The int type allows for arbitrary precision: rather than storing all integer values with the same fixed number of bits, Python allocates as much space as needed to store any value (limited only by available memory). So (10³⁰⁰ + 1), for example, is a possible value in Python. A float, on the other hand, is represented in the IEEE 64-bit floating-point format, so the usual concerns about precision and magnitude apply. Also recall that some non-integer values simply cannot be expressed exactly in binary floating-point no matter how many bits are used. 1/3, for example, is not expressible in binary floating-point using any number of digits, so it must be approximated, just as it must be approximated in decimal (0.3333333333333 is close but not exactly the same as 1/3).

Python also has another number type, complex, for complex numbers. A number literal suffixed with j or J denotes a complex number with just an imaginary part:

3j          # imaginary 3

(Why j instead of i? I believe simply for style: a j is more visible than an i in most fonts.) To get a complex with both a real and imaginary part, simply add an imaginary to a real:

2 + 3j                  # a complex with 2 for the real part and 3 for the imaginary part

Alternatively, invoke the complex class (complex in the builtins module):

complex(2, 3)           # a complex with 2 for the real part and 3 for the imaginary part

Lastly, the boolean type is really a subtype of int. As a number value, True is equal to 1 while False is equal to 0:

True + 3          # 4
False + 3         # 3
False - 1         # -1
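The precision differences between int and float described in this section are easy to check directly. The following is just a sketch with variable names of my own choosing:

```python
big = 10 ** 300

int_distinct = (big + 1) != big               # ints stay exact at any size
float_same = float(big) + 1 == float(big)     # the + 1 is lost to float rounding
tenth_sum_exact = (0.1 + 0.2) == 0.3          # binary floats only approximate 0.1 and 0.2

z = 2 + 3j
same_complex = z == complex(2, 3)             # both spellings build the same value
real_part, imag_part = z.real, z.imag         # each part is stored as a float

bool_is_int = issubclass(bool, int)           # bool is a subtype of int
```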

integer division operator

The normal division operator, /, behaves like you would most expect, always returning a float: either the exact answer or the closest float approximation of it. In contrast, the integer division operator, denoted as //, returns an integer when both operands are integers by discarding the remainder of division (the result is rounded down, toward negative infinity):

9 / 3        # 3.0
9 / 2        # 4.5
9 // 2       # 4 (remainder of 1 is discarded)
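A few more cases are worth trying, especially negative operands, where // rounds down toward negative infinity rather than toward zero. A small sketch:

```python
true_div = 9 / 3                      # 3.0: / gives a float even when the division is exact
floor_pos = 9 // 2                    # 4
floor_neg = -9 // 2                   # -5: rounded down, not toward zero
floor_float = 9.0 // 2                # 4.0: a float operand gives a float result
quotient, remainder = divmod(9, 2)    # (4, 1): quotient and remainder together
```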

compound assignment operators

Python has no ++ and -- operators, but it does have compound assignment operators (sometimes called “augmented” assignment operators):

foo += 3                # foo = foo + 3

Python translates the compound assignment operators into calls of the ‘in-place’ methods, such as __iadd__ for ‘in-place add’:

foo += x                # foo = foo.__iadd__(x)

If no ‘in-place’ method exists, Python falls back to the regular method. For example, the integer type has no __imul__ method, so __mul__ is used:

# assume foo is an integer
foo *= x                # foo = foo.__mul__(x)

The ‘in-place’ methods exist because, in some cases, it may be possible to make the ‘in-place’ version of an operation more efficient. Still, the in-place methods should do effectively the same thing as their regular counterparts. While it would be possible to define, say, the + and += operators to perform fundamentally different operations, that would be a strange thing to do. Only some types define ‘in-place’ methods; for the others, the compound assignment operators just call the ‘regular’ method.
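The practical consequence shows up with mutable types. A list defines __iadd__ and modifies itself, while an int does not and so gets replaced by a new object. A sketch with made-up variable names:

```python
numbers = [1, 2]
alias = numbers
numbers += [3]        # list defines __iadd__, so the existing list is extended
still_same_list = alias is numbers

count = 5
old = count
count += 1            # int has no __iadd__, so this is count = count.__add__(1)

list_has_iadd = hasattr(list, "__iadd__")
int_has_iadd = hasattr(int, "__iadd__")
```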

lambda function expressions

A function is most commonly defined by a def statement, but we can also define a function with an expression beginning with the keyword lambda followed by a list of parameters, a colon, and then the function body. The limitation of lambda, however, is that the function body can consist of only one expression, which implicitly becomes the return value of the function:

lambda parameters: expression

For example:

foo = (lambda x, y: x * y + 4)
foo(3, 2)               # 10

(Precedence allowing, a lambda need not always be surrounded in parens, but I’ve included them here for clarity. Also notice that the two parameters, x and y, are not surrounded in parens.) The above lambda expression returns a new function which returns the product of its two parameters added to 4. Recall that in Javascript, the body of a function expression can contain any number of statements; Python can’t allow such flexibility because its indentation-based syntax leaves no good way to embed a block of statements inside an expression, so in practice Python’s lambda is used far less often than Javascript’s function expressions. The choice of the reserved word “lambda” derives from ‘lambda calculus’, in which the lambda character is used to notate a function.
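For comparison, here is the same function written both ways, plus the kind of place a lambda typically shows up: as a short argument to another function. The names foo_def, words, and by_length are only for illustration.

```python
foo = lambda x, y: x * y + 4

def foo_def(x, y):
    # Equivalent def form: the expression becomes an explicit return.
    return x * y + 4

words = ["pear", "fig", "banana"]
by_length = sorted(words, key=lambda w: len(w))    # sort using a one-off function
```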

how import finds modules

Python expects a file of source code to end in the extension .py. Because parsing source code is quite time consuming, it’s best if Python only has to do this once for each file, so the Python interpreter will preserve the compiled bytecode of a once-loaded module as a file of the same name but ending .pyc. When importing a module, Python tries to find and load an up-to-date .pyc file with the corresponding name before resorting to compiling the .py file. So for example, when importing the module foo, Python tries to load foo.pyc, but if it is not found (or is out of date with respect to foo.py), Python loads foo.py and produces foo.pyc to save processing time for the next time foo needs to be loaded. (Since Python 3.2, these cached files are kept in a __pycache__ sub-directory under names such as foo.cpython-32.pyc rather than next to the source file.)

When searching for module files, Python searches through a list of directory paths. This list is accessible as path in the standard library module sys (short for ‘system’):

import sys
sys.path                # the list of directories to search when importing modules

The order of the list can be important because Python searches the list one-by-one in order. Python goes with the first matching file it finds, so if two or more modules of the same name exist in different directories, only the one in the directory searched earlier gets loaded.

By default, the list starts with the directory path of the main module (the one invoked when we started the Python interpreter). In interactive mode, the list starts with the current-working directory of the Python interpreter. The remaining directories contain standard library modules (such as sys).

On my system, this is the sys.path list when I run Python in interactive mode:

['', 'D:\\Windows\\system32\\python31.zip', 'D:\\Python31\\DLLs', 'D:\\Python31\\lib', 'D:\\Python31\\lib\\plat-win', 'D:\\Python31', 'D:\\Python31\\lib\\site-packages']

Notice that the first path is denoted as an empty string: this is a relative path denoting the current-working directory of the Python process. (We discuss relative paths, absolute paths, and the current-working directory in more detail in a later unit.) Also notice that the second item is a zip file path, not a directory. A zip file in sys.path is searched for modules just like a directory.

To add your own directory paths (or zip files) to the list, you can modify the list directly like any other list, but best practice is to set the PYTHONPATH environment variable[1] before launching Python. Upon launch, Python adds any directory paths listed in PYTHONPATH to the list, placing them after the first directory path but before the standard library module directory paths.

Another way to add more directory paths to the list is to use .pth (‘path’) files, which are simply text files containing a list of directory paths. Where exactly these .pth files should be placed depends upon the platform, but generally Python expects to find them in its root installation directory or in its site-packages sub-directory. (For details, consult the Python docs.)
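To watch the search list at work, the sketch below modifies sys.path directly (the quick-and-dirty option mentioned above, not PYTHONPATH). It writes a throwaway module into a temporary directory; the module name demo_mod_xyz is invented for the example.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway directory holding a hypothetical module named demo_mod_xyz.
extra_dir = tempfile.mkdtemp()
with open(os.path.join(extra_dir, "demo_mod_xyz.py"), "w") as f:
    f.write("GREETING = 'hello'\n")

sys.path.append(extra_dir)        # searched after every existing entry
importlib.invalidate_caches()     # make sure the newly written file is noticed

import demo_mod_xyz

greeting = demo_mod_xyz.GREETING
found_in = os.path.dirname(demo_mod_xyz.__file__)   # the directory the module came from
```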

packages

What Python calls a package is basically a directory of module files, all of which are presumably related to each other somehow. For example, in a program you might have one package for just the code related to the user interface, one for just code related to file processing, and so forth. A module is imported from a package by qualifying the package/directory name:

import package.module

Though this resembles an attribute lookup, it actually just tells Python to find the module in the directory of the given package name, e.g.:

import tiger.frog

The above imports a module frog from a directory named tiger (the directory tiger itself must be located in one of the directories listed in sys.path).

We can place packages inside packages:

import eagle.whale.tiger.frog

The above imports a module frog from a directory named tiger, itself inside a directory named whale, itself inside a directory named eagle (and the directory eagle itself must be located in one of the directories listed in sys.path).

Each package must contain an __init__.py file. (Since Python 3.3, a directory without an __init__.py can still be imported as a ‘namespace package’, but an ordinary package still uses one.) This file serves to effectively mark the directory as a package, but it also may contain code which is run the first time any module in the package is loaded. The module object created from __init__.py gets assigned to the name of the package itself, and the subpackage or module inside becomes an attribute of this module:

import tiger.frog
tiger                         # the module object of __init__.py in the package tiger
tiger.frog                    # the module object of frog in the package tiger
frog                          # exception: no such name

(Despite both having to do with some kind of initialization, __init__ files and __init__ methods in classes have no special relationship aside from sharing a name.)

Because a package itself also acts like a module, it’s possible to import a package directly:

import tiger                        # import the package (__init__.py becomes a module assigned to tiger)
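Here is a sketch that builds a tiny package on disk and imports from it, so you can see __init__.py and a module side by side. The package name tiger_demo and its contents are made up, and the temporary directory is added to sys.path only so the example is self-contained.

```python
import importlib
import os
import sys
import tempfile

# Lay out a hypothetical package: tiger_demo/__init__.py and tiger_demo/frog.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "tiger_demo")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("LOADED = True\n")
with open(os.path.join(pkg_dir, "frog.py"), "w") as f:
    f.write("def croak():\n    return 'ribbit'\n")

sys.path.insert(0, root)          # the package's parent directory must be on sys.path
importlib.invalidate_caches()

import tiger_demo.frog

init_attr = tiger_demo.LOADED             # set by the code in __init__.py
sound = tiger_demo.frog.croak()           # frog is an attribute of the package module
package_kind = type(tiger_demo).__name__  # the package itself is a module object
```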

importing with from-import and as

A few variations of the import statement exist for convenience.

Sometimes, you might wish to refer to a module by a name different than its usual name. You can achieve this effect by simply assigning the module object to another name:

import supercalifragilisticexpealadocious
sup = supercalifragilisticexpealadocious

Alternatively, you can use the import statement’s optional as clause. Using import-as not only accomplishes the same thing in one line instead of two, it has the benefit of assigning directly to the name you want without also assigning to the name you don’t:

import supercalifragilisticexpealadocious as sup
sup                                 # the module object
supercalifragilisticexpealadocious  # exception: name undefined

You sometimes might also find it more convenient to have a direct reference to an object from a module:

import lobster
chicken = lobster.chicken
chicken                       # same object as lobster.chicken

The from-import statement is a more compact way of expressing this, except the module object itself is not assigned to a variable:

from lobster import chicken   # imports lobster but assigns its attribute chicken to chicken
lobster                       # exception: lobster not found

This also works with packages: if lobster were a package and chicken a module, the above would import them but directly assign chicken to chicken.

from duck.hamster.goat import rat   # like import duck.hamster.goat.rat but only assigns rat to rat
duck                                # exception: duck not found
hamster                             # exception: hamster not found
goat                                # exception: goat not found

A single from-import statement can assign more than one of a module’s attributes (or a package’s modules) directly into the current scope:

from lobster import turtle, rabbit, jackolope

If you wish to assign an attribute (or module) to a different name, use as:

from lobster import turtle, rabbit as cheetah, jackolope # rabbit gets assigned to cheetah instead of rabbit

Using *, you can assign all of the attributes of a module directly into the namespace without having to list them all:

from lobster import *

(Used on a package, * assigns all of the attributes of that package’s __init__ into the namespace; it does not import the modules of the package.)

Using from-import * is generally considered bad style because it pollutes the namespace with names, many of which you probably won’t use. Still, it can be handy if you’re just experimenting in interactive mode.

Lastly, you can import multiple modules in a single import statement:

import chicken, cow, goose                   # import the modules chicken, cow, and goose
import alligator as roger, horse as kate     # import alligator (assigning it to roger) and horse (assigning it to kate)
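Since the animal modules above are imaginary, here are the same import variations applied to real standard library modules so they can actually be run. The aliases m, base, and Tally are arbitrary choices for the example.

```python
import math as m                                     # import-as on a module
from os.path import join, basename as base           # from-import with one renamed attribute
from collections import OrderedDict, Counter as Tally

root_of_four = m.sqrt(4)
file_name = base(join("some_dir", "file.txt"))
letter_counts = Tally("aab")

try:
    math                    # only the alias m was bound, not the name math
    math_is_bound = True
except NameError:
    math_is_bound = False
```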

relative imports

A . in from-import spares us from having to write out the name of the package that contains the current module:

# assume we’re inside a module of the package tiger.whale
from . import turtle                # from tiger.whale import turtle
from .gerbil.eel import newt        # from tiger.whale.gerbil.eel import newt

Using .. refers to the package above that of the current module:

# assume we’re inside a module of the package tiger.whale
from .. import turtle               # from tiger import turtle
from ..gerbil.eel import newt       # from tiger.gerbil.eel import newt

 

Every additional dot refers to the package one more up the chain:

# assume we’re inside a module of the package tiger.whale.vulture.pig.lemur
from ....gerbil.eel import newt         # from tiger.whale.gerbil.eel import newt

 

By using dots when importing modules of the same package, you make it easier to rename the package or move it to a different directory.
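Relative imports only work inside a package, so trying them out takes a little setup. The sketch below writes a small hypothetical package, zoo_demo, into a temporary directory; every module name in it is invented, and the helper function write exists only for this example.

```python
import importlib
import os
import sys
import tempfile

def write(path, text=""):
    # Helper for this example: create parent directories, then write the file.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)

root = tempfile.mkdtemp()
write(os.path.join(root, "zoo_demo", "__init__.py"))
write(os.path.join(root, "zoo_demo", "turtle.py"), "NAME = 'turtle'\n")
write(os.path.join(root, "zoo_demo", "whale", "__init__.py"))
write(os.path.join(root, "zoo_demo", "whale", "newt.py"), "NAME = 'newt'\n")
write(os.path.join(root, "zoo_demo", "whale", "user.py"),
      "from . import newt\n"                        # sibling: zoo_demo.whale.newt
      "from .. import turtle\n"                     # one package up: zoo_demo.turtle
      "from ..turtle import NAME as TURTLE_NAME\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

from zoo_demo.whale import user

sibling_name = user.newt.__name__      # full dotted name the single dot resolved to
parent_name = user.turtle.__name__     # full dotted name the double dot resolved to
```

Renaming the zoo_demo directory would require no changes inside user.py, which is the benefit described above.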

 
