Objects are Python's abstraction for data. All data in a Python program is represented by objects or by relations between objects. (In conformance to Von Neumann's model of a "stored program computer", code is also represented by objects.)
Every object has an identity, a type and a value. An object's identity never changes once it has been created; you may think of it as the object's address in memory. The `is' operator compares the identity of two objects; the `id()' function returns an integer representing its identity (currently implemented as its address). An object's type is also unchangeable. It determines the operations that an object supports (e.g. "does it have a length?") and also defines the possible values for objects of that type. The `type()' function returns an object's type (which is an object itself). The value of some objects can change. The `==' operator compares the value of two objects. Objects whose value can change are said to be mutable; objects whose value is unchangeable once they are created are called immutable. An object's (im)mutability is determined by its type; for instance, numbers, strings and tuples are immutable, while dictionaries and lists are mutable.
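The distinction between identity, type and value can be sketched as follows. This is a minimal illustration assuming a modern Python 3 interpreter; the variable names are made up for the example.

```python
# Two lists with equal values but distinct identities.
first = [1, 2, 3]
second = [1, 2, 3]
alias = first                      # a second name for the same object

same_value = first == second       # `==` compares values
same_identity = first is second    # `is` compares identities
alias_identity = alias is first

# id() returns an integer that differs for two objects alive at the same time.
ids_differ = id(first) != id(second)

# type() returns the object's type, which is itself an object.
list_type = type(first)

# Lists are mutable: the value changes while the identity stays the same.
identity_before = id(first)
first.append(4)
identity_after = id(first)
```

Because alias and first name the same object, the appended item is visible through both names.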
Objects are never explicitly destroyed; however, when they become unreachable they may be garbage-collected. An implementation is allowed to postpone garbage collection or omit it altogether; it is a matter of implementation quality how garbage collection is implemented, as long as no objects are collected that are still reachable. (Implementation note: the current implementation uses a reference-counting scheme which collects most objects as soon as they become unreachable, but never collects garbage containing circular references.)
Note that the use of the implementation's tracing or debugging facilities may keep objects alive that would normally be collectable. Also note that catching an exception with a `try...except' statement may keep objects alive.
Some objects contain references to "external" resources such as open files or windows. It is understood that these resources are freed when the object is garbage-collected, but since garbage collection is not guaranteed to happen, such objects also provide an explicit way to release the external resource, usually a close() method. Programs are strongly recommended to always explicitly close such objects. The `try...finally' statement provides a convenient way to do this.
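A minimal sketch of the explicit-close pattern, assuming a modern Python 3 interpreter. An in-memory io.StringIO object stands in for an open file so the example does not touch the file system; any object with a close() method would be handled the same way.

```python
import io

# Stand-in for an object that holds an external resource.
stream = io.StringIO("first line\nsecond line\n")
try:
    first_line = stream.readline()
finally:
    # Runs whether or not readline() raised an exception.
    stream.close()

was_closed = stream.closed
```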
Some objects contain references to other objects; these are called containers. Examples of containers are tuples, lists and dictionaries. The references are part of a container's value. In most cases, when we talk about the value of a container, we imply the values, not the identities of the contained objects; however, when we talk about the (im)mutability of a container, only the identities of the immediately contained objects are implied. So, if an immutable container (like a tuple) contains a reference to a mutable object, its value changes if that mutable object is changed.
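The tuple case can be sketched as follows (illustrative names, modern Python 3 assumed). The tuple keeps referring to the same list object throughout, yet its value as seen by `==' changes when that list is changed.

```python
inner = [1, 2]
container = (inner, "label")       # immutable container, mutable item
snapshot = ([1, 2], "label")       # an independent tuple with the same value

equal_before = container == snapshot
inner_id = id(container[0])

inner.append(3)                    # change the contained list in place

equal_after = container == snapshot
same_inner = id(container[0]) == inner_id   # the identity is unchanged

# The tuple itself cannot be made to refer to a different object.
try:
    container[0] = []
    rebinding_allowed = True
except TypeError:
    rebinding_allowed = False
```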
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g. after ``a = 1; b = 1'', a and b may or may not refer to the same object with the value one, depending on the implementation, but after ``c = []; d = []'', c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that ``c = d = []'' assigns the same object to both c and d.)
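The guarantees above can be checked directly. This sketch assumes a modern Python 3 interpreter; it deliberately makes no claim about whether a and b are the same object, since that is left to the implementation.

```python
c = []
d = []
distinct_lists = c is not d    # guaranteed: two newly created lists

e = f = []                     # one list, two names
shared_list = e is f
e.append("x")                  # the change is visible through f as well

a = 1
b = 1
equal_ints = a == b            # always true
# `a is b` may or may not be true; programs should not rely on either outcome.
```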
Below is a list of the types that are built into Python. Extension modules written in C can define additional types. Future versions of Python may add types to the type hierarchy (e.g. rational numbers, efficiently stored arrays of integers, etc.).
Some of the type descriptions below contain a paragraph listing `special attributes'. These are attributes that provide access to the implementation and are not intended for general use. Their definition may change in the future. There are also some `generic' special attributes, not listed with the individual objects: __methods__ is a list of the method names of a built-in object, if it has any; __members__ is a list of the data attribute names of a built-in object, if it has any.
None. It is used to signify the absence of a value in many situations, e.g. it is returned from functions that don't explicitly return anything. Its truth value is false.
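A short sketch of the None behavior described above, assuming a modern Python 3 interpreter. The function log_message and the list collected are hypothetical names used only for this illustration.

```python
collected = []

def log_message(text):
    """Hypothetical helper with no return statement."""
    collected.append(text)

result = log_message("hello")      # no explicit return, so the result is None

result_is_none = result is None
truth_value = bool(result)         # None is false in a truth test
```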
Ellipsis. It is used to indicate the presence of the ``...'' syntax in a slice. Its truth value is true.
OverflowError is raised. For the purpose of shift and mask operations, integers are assumed to have a binary, 2's complement notation using 32 or more bits, and hiding no bits from the user (i.e., all 4294967296 different bit patterns correspond to different values).
len() returns the number of items of a sequence. When the length of a sequence is n, the index set contains the numbers 0, 1, ..., n-1. Item i of sequence a is selected by a[i].
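The relation between len() and the index set can be sketched as follows, assuming a modern Python 3 interpreter. For a sequence of length n, the valid non-negative indices run from 0 up to n-1, and index n is out of range.

```python
a = ["x", "y", "z"]
n = len(a)

valid_indices = list(range(n))             # 0, 1, ..., n-1
items = [a[i] for i in valid_indices]

# Index n itself is not part of the index set.
try:
    a[n]
    index_n_valid = True
except IndexError:
    index_n_valid = False
```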
a[i:j] selects all items with index k such that i<=k<j. When used as an expression, a slice is a sequence of the same type; this implies that the index set is renumbered so that it starts at 0 again.
chr() and ord() convert between characters and nonnegative integers representing the byte values. Bytes with the values 0-127 usually represent the corresponding ASCII values, but the interpretation of values is up to the program. The string data type is also used to represent arrays of bytes, e.g. to hold data read from a file.
del (delete) statements.
a[k] selects the item indexed by k from the mapping a; this can be used in expressions and as the target of assignments or del statements. The built-in function len() returns the number of items in a mapping.
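The slicing, character-conversion and mapping rules above can be sketched together. This assumes a modern Python 3 interpreter, where chr() and ord() work on Unicode code points rather than byte values; for the ASCII range used here the results agree with the description.

```python
# Slicing: items with index k such that i <= k < j, renumbered from 0.
a = "abcdef"
piece = a[2:5]
piece_type = type(piece)           # a slice has the same type as the sequence
first_of_piece = piece[0]          # index 0 of the slice is a[2]

# chr() and ord() convert between characters and integers.
code = ord("A")
char = chr(65)

# Mapping subscription in expressions, assignments and del statements.
table = {"one": 1, "two": 2}
value = table["one"]
table["three"] = 3
del table["two"]
size = len(table)
```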
{...} notation. (See "Dictionary displays" on page 28.)
func_code is the code object representing the compiled function body; func_globals is (a reference to) the dictionary that holds the function's global variables; it defines the global name space of the module in which the function was defined. Additional information about a function's definition can be retrieved from its code object; see the description of internal types below.
im_self is the instance object; im_func is the function object; im_class is the class that defined the method (which may be a base class of the class of which im_self is an instance); __doc__ is the method's documentation (same as im_func.__doc__); __name__ is the method name (same as im_func.__name__).
len and math.sin (math is a standard built-in module). The number and type of the arguments are determined by the C function. Special read-only attributes: __doc__ is the function's documentation string, or None if unavailable; __name__ is the function's name; __self__ is set to None (but see the next paragraph).
list.append, assuming list is a list object. In this case, the special read-only attribute __self__ is set to the object denoted by list.
__init__ method if it has one. Any arguments are passed on to the __init__ method; if there is no __init__ method, the class must be called without arguments.
import statement. (See "The import statement" on page 43.) A module object has a name space implemented by a dictionary object (this is the dictionary referenced by the func_globals attribute of functions defined in the module). Attribute references are translated to lookups in this dictionary, e.g. m.x is equivalent to m.__dict__["x"]. A module object does not contain the code object used to initialize the module (since it isn't needed once the initialization is done).
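The equivalence between a module attribute reference and a lookup in the module's dictionary can be sketched with the standard math module (modern Python 3 assumed).

```python
import math

via_attribute = math.pi                # attribute reference m.x
via_dict = math.__dict__["pi"]         # lookup in the module's name space

module_name = math.__name__
```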
__dict__ is the dictionary object that is the module's name space.
__name__ is the module name; __doc__ is the module's documentation string, or None if unavailable; __file__ is the pathname of the file from which the module was loaded, if it was loaded from a file. The __file__ attribute is not present for C modules that are statically linked into the interpreter; for extension modules loaded dynamically from a shared library, it is the pathname of the shared library file.
__dict__ is the dictionary that is the class's name space; __name__ is the class name; __bases__ is a tuple (possibly empty or a singleton) containing the base classes, in the order of their occurrence in the base class list.
__dict__ yields the attribute dictionary; __class__ yields the instance's class.
open() built-in function, and also by posix.popen(), posix.fdopen() and the makefile method of socket objects. The objects sys.stdin, sys.stdout and sys.stderr are initialized to file objects corresponding to the interpreter's standard input, output and error streams. See the Python Library Reference for methods of file objects and other details.
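The special attributes of classes and class instances listed earlier in this section can be inspected as follows. This is a sketch assuming a modern Python 3 interpreter; Base and Derived are hypothetical classes, and an explicit base class is used because Python 3 adds an implicit one when the base class list is empty.

```python
class Base:
    pass

class Derived(Base):
    kind = "derived"               # stored in the class's name space

instance = Derived()
instance.size = 3                  # stored in the instance's attribute dictionary

bases = Derived.__bases__          # tuple of base classes, in order
class_name = Derived.__name__
class_has_kind = "kind" in Derived.__dict__

instance_attrs = instance.__dict__
instance_class = instance.__class__
```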
co_code is a string representing the sequence of bytecode instructions; co_consts is a tuple containing the literals used by the bytecode; co_names is a tuple containing the names used by the bytecode; co_filename is the filename from which the code was compiled; co_flags is an integer encoding a number of flags for the interpreter. The following flag bits are defined: bit 2 is set if the function uses the ``*arguments'' syntax to accept an arbitrary number of positional arguments; bit 3 is set if the function uses the ``**keywords'' syntax to accept arbitrary keyword arguments; other bits are used internally or reserved for future use. The first item in co_consts is the documentation string of the function, or None if undefined. To find out the first line number of a function, you have to disassemble the bytecode instructions; the standard library module codehack defines a function getlineno() that returns the first line number of a code object.
f_back points to the previous stack frame (towards the caller), or None if this is the bottom stack frame; f_code is the code object being executed in this frame; f_locals is the dictionary used to look up local variables; f_globals is used for global variables; f_builtins is used for built-in (intrinsic) names; f_restricted is a flag indicating whether the function is executing in restricted execution mode; f_owner is the class or module that defined the code, if any; f_lineno gives the current line number and f_lasti gives the precise instruction (this is an index into the instruction string of the code object).
try statement" on page 47.), the stack trace is made available to the program as sys.exc_traceback. When the program contains no suitable handler, the stack trace is written (nicely formatted) to the standard error stream; if the interpreter is interactive, it is also made available to the user as sys.last_traceback.
tb_next is the next level in the stack trace (towards the frame where the exception occurred), or None if there is no next level; tb_frame points to the execution frame of the current level; tb_lineno gives the line number where the exception occurred; tb_lasti indicates the precise instruction. The line number and last instruction in the traceback may differ from the line number of its frame object if the exception occurred in a try statement with no matching except clause or with a finally clause.
This section describes how user-defined classes can customize their behavior or emulate the behavior of other object types. In the following, if a class defines a particular method, any class derived from it is also understood to define that method (implicitly).
A class can implement certain operations that are invoked by special syntax (such as arithmetic operations or subscripting and slicing) by defining methods with special names. For instance, if a class defines a method named __getitem__, and x is an instance of this class, then x[i] is equivalent to x.__getitem__(i). (The reverse is not true; e.g. if x is a list object, x.__getitem__(i) is not equivalent to x[i].) Except where mentioned, attempts to execute an operation raise an exception when no appropriate method is defined.
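The __getitem__ example can be sketched as follows, assuming a modern Python 3 interpreter. Squares and NoItems are hypothetical classes; the exception type caught for the missing method is the one Python 3 raises and is not stated in the text above.

```python
class Squares:
    def __getitem__(self, i):
        # Invoked by the subscription syntax x[i].
        return i * i

x = Squares()
by_syntax = x[4]
by_method = x.__getitem__(4)

class NoItems:
    pass                           # defines no __getitem__

# Without an appropriate method, the operation raises an exception.
try:
    NoItems()[0]
    raised = False
except TypeError:
    raised = True
```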
__init__(self, [args...])
Called when the instance is created. The arguments are those that were passed to the class constructor expression. If a base class has an __init__ method, the derived class's __init__ method must explicitly call it to ensure proper initialization of the base class part of the instance, e.g. ``BaseClass.__init__(self, [args...])''.
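A minimal sketch of the explicit base class call (modern Python 3 assumed; Base, Derived and their attributes are hypothetical).

```python
class Base:
    def __init__(self, name):
        self.name = name

class Derived(Base):
    def __init__(self, name, size):
        # Initialize the base class part of the instance explicitly.
        Base.__init__(self, name)
        self.size = size

item = Derived("box", 3)
```

If the call to Base.__init__ were left out, item would have no name attribute.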
__del__(self)
Called when the instance is about to be destroyed. If a base class has a __del__ method, the derived class's __del__ method must explicitly call it to ensure proper deletion of the base class part of the instance, e.g. ``BaseClass.__del__(self)''. Note that it is possible (though not recommended!) for the __del__ method to postpone destruction of the instance by creating a new reference to it. It may then be called at a later time when this new reference is deleted. It is not guaranteed that __del__ methods are called for objects that still exist when the interpreter exits.
__repr__(self)
Called by the repr() built-in function and by string conversions (reverse quotes) to compute the "official" string representation of an object. This should normally look like a valid Python expression that can be used to recreate an object with the same value.
__str__(self)
Called by the str() built-in function and by the print statement to compute the ``informal'' string representation of an object. This differs from __repr__ in that it doesn't have to look like a valid Python expression: a more convenient or concise representation may be used instead.
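The difference between the two representations can be sketched with a hypothetical Point class (modern Python 3 assumed). The __repr__ result is written so that evaluating it recreates an object with the same value, while __str__ is free to be more concise.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # "Official" form: looks like an expression that rebuilds the object.
        return "Point(%r, %r)" % (self.x, self.y)

    def __str__(self):
        # "Informal" form: shorter, not a valid constructor expression.
        return "(%s, %s)" % (self.x, self.y)

p = Point(1, 2)
official = repr(p)
informal = str(p)
rebuilt = eval(official)           # recreates a Point with the same value
```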
__cmp__(self, other)
Called by all comparison operations. Should return a negative integer if self < other, zero if self == other, a positive integer if self > other. If no __cmp__ method is defined, class instances are compared by object identity ("address"). (Implementation note: due to limitations in the interpreter, exceptions raised by comparisons are ignored, and the outcome will be random in this case.)
__hash__(self)
Called for the key object for dictionary operations, and by the built-in function hash(). Should return a 32-bit integer usable as a hash value for dictionary operations. The only required property is that objects which compare equal have the same hash value; it is advised to somehow mix together (e.g. using exclusive or) the hash values for the components of the object that also play a part in comparison of objects. If no __hash__ method is defined, class instances are hashed by object identity (``address''). If a class does not define a __cmp__ method it should not define a __hash__ method either; if it defines __cmp__ but not __hash__ its instances will not be usable as dictionary keys. If a class defines mutable objects and implements a __cmp__ method it should not implement __hash__ since the dictionary implementation requires that a key's hash value is immutable (if the object's hash value changes, it will be in the wrong hash bucket).
__nonzero__(self)
Called to implement truth value testing; should return 0 or 1. When this method is not defined, __len__ is called, if it is defined (see below). If a class defines neither __len__ nor __nonzero__, all its instances are considered true.
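The fallback from truth testing to __len__ can be sketched as follows. The example assumes a modern Python 3 interpreter, which spells the truth-testing method differently from the text above, so it only uses the __len__ fallback and the default case. Queue and Marker are hypothetical classes.

```python
class Queue:
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # Consulted for truth testing because no truth-testing method is defined.
        return len(self.items)

class Marker:
    pass                           # defines neither method

empty_truth = bool(Queue([]))      # length zero, so considered false
full_truth = bool(Queue([1]))      # non-zero length, so considered true
marker_truth = bool(Marker())      # no relevant methods, so considered true
```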
The following methods can be defined to customize the meaning of attribute access (use of, assignment to, or deletion of x.name) for class instances. For performance reasons, these methods are cached in the class object at class definition time; therefore, they cannot be changed after the class definition is executed.
__getattr__(self, name)
Called when an attribute lookup has not found the attribute in the usual places (i.e. it is not an instance attribute nor is it found in the class tree for self). name is the attribute name. This method should return the (computed) attribute value or raise an AttributeError exception.
__getattr__ is not called. (This is an asymmetry between __getattr__ and __setattr__.) This is done both for efficiency reasons and because otherwise __setattr__ would have no way to access other attributes of the instance. Note that at least for instance variables, you can fake total control by not inserting any values in the instance attribute dictionary (but instead inserting them in another object).
__setattr__(self, name, value)
Called whenever an attribute assignment is attempted. This is called instead of the normal mechanism (i.e. instead of storing the value in the instance dictionary). name is the attribute name, value is the value to be assigned to it.
__delattr__(self, name)
Like __setattr__ but for attribute deletion instead of assignment.
__call__(self, [args...])
Called when the instance is "called" as a function; if this method is defined, x(arg1, arg2, ...) is a shorthand for x.__call__(arg1, arg2, ...).
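A minimal sketch of a callable instance (modern Python 3 assumed; Adder is a hypothetical class).

```python
class Adder:
    def __init__(self, increment):
        self.increment = increment

    def __call__(self, value):
        # Invoked when the instance is called like a function.
        return value + self.increment

add_five = Adder(5)
by_call = add_five(10)
by_method = add_five.__call__(10)
```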
The following methods can be defined to emulate sequence or mapping objects. The first set of methods is used either to emulate a sequence or to emulate a mapping; the difference is that for a sequence, the allowable keys should be the integers k for which 0 <= k < N where N is the length of the sequence, and the method __getslice__ (see below) should be defined. It is also recommended that mappings provide methods keys, values and items behaving similarly to those for Python's standard dictionary objects; mutable sequences should provide methods append, count, index, insert, sort, remove and reverse like Python standard list objects. Finally, sequence types should implement addition (meaning concatenation) and multiplication (meaning repetition) by defining the methods __add__, __radd__, __mul__ and __rmul__ described below; they should not define __coerce__ or other numerical operators.
__len__(self)
Called to implement the built-in function len(). Should return the length of the object, an integer >= 0. Also, an object that doesn't define a __nonzero__() method and whose __len__() method returns zero is considered to be false in a Boolean context.
__getitem__(self, key)
Called to implement evaluation of self[key]. Note that the special interpretation of negative keys (if the class wishes to emulate a sequence type) is up to the __getitem__ method.
__setitem__(self, key, value)
Called to implement assignment to self[key]. Same note as for __getitem__.
__delitem__(self, key)
Called to implement deletion of self[key]. Same note as for __getitem__.
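A read-only sequence built from __len__ and __getitem__ can be sketched as follows (modern Python 3 assumed). Countdown is a hypothetical class; note that the handling of negative keys is written out by hand in __getitem__, as the note above requires.

```python
class Countdown:
    """Hypothetical read-only sequence holding n-1, n-2, ..., 0."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, key):
        if key < 0:
            key = key + self.n     # interpreting negative keys is our job
        if not 0 <= key < self.n:
            raise IndexError(key)
        return self.n - 1 - key

seq = Countdown(3)
length = len(seq)
first = seq[0]
last = seq[-1]
as_list = list(seq)                # iterates via __getitem__ until IndexError
```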
The following methods can be defined to further emulate sequence objects. For immutable sequences, only __getslice__ should be defined; for mutable sequences, all three methods should be defined.
__getslice__(self, i, j)
Called to implement evaluation of self[i:j]. The returned object should be of the same type as self. Note that missing i or j in the slice expression are replaced by 0 or len(self), respectively, and len(self) has been added (once) to originally negative i or j by the time this function is called (unlike for __getitem__).
__setslice__(self, i, j, sequence)
Called to implement assignment to self[i:j]. The sequence argument can have any type. The return value should be None. Same notes for i and j as for __getslice__.
__delslice__(self, i, j)
Called to implement deletion of self[i:j]. Same notes for i and j as for __getslice__.
Notice that these methods are only invoked when a single slice with a single colon is used. For slice operations involving extended slice notation, __getitem__, __setitem__ or __delitem__ is called.
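How extended slice notation reaches __getitem__ can be sketched as follows. The example assumes a modern Python 3 interpreter, where the key arrives as a slice object or a tuple containing slice objects and Ellipsis; in that interpreter simple slices are routed to __getitem__ as well, which differs from the behavior described above. ShowKey is a hypothetical class that returns whatever key it receives.

```python
class ShowKey:
    def __getitem__(self, key):
        # Return the key unchanged so it can be inspected.
        return key

x = ShowKey()
extended = x[1:10:2]               # slice with a stride
multi = x[1:2, ...]                # several items, including the ... syntax
```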
The following methods can be defined to emulate numeric objects. Methods corresponding to operations that are not supported by the particular kind of number implemented (e.g., bitwise operations for non-integral numbers) should be left undefined.
__add__(self, right)
__sub__(self, right)
__mul__(self, right)
__div__(self, right)
__mod__(self, right)
__divmod__(self, right)
__pow__(self, right)
__lshift__(self, right)
__rshift__(self, right)
__and__(self, right)
__xor__(self, right)
__or__(self, right)
These functions are called to implement the binary arithmetic operations (+, -, *, /, %, divmod(), pow(), <<, >>, &, ^, |).
__radd__(self, left)
__rsub__(self, left)
__rmul__(self, left)
__rdiv__(self, left)
__rmod__(self, left)
__rdivmod__(self, left)
__rpow__(self, left)
__rlshift__(self, left)
__rrshift__(self, left)
__rand__(self, left)
__rxor__(self, left)
__ror__(self, left)
These functions are called to implement the binary arithmetic operations (+, -, *, /, %, divmod(), pow(), <<, >>, &, ^, |) with reversed operands. These functions are only called if the left operand does not support the corresponding operation (possibly after coercion). For instance: to evaluate the expression x+y, where x is an instance of a class that does not have an __add__ method, y.__radd__(x) is called. If the class defines a __coerce__ method that coerces its arguments to a common type, these methods will never be called and thus needn't be defined. They are useful for classes that implement semi-numerical data types (types that have some numerical behavior but don't adhere to all invariants usually assumed about numbers).
__neg__(self)
__pos__(self)
__abs__(self)
__invert__(self)
Called to implement the unary arithmetic operations (-, +, abs() and ~).
__int__(self)
__long__(self)
__float__(self)
Called to implement the built-in functions int(), long() and float(). Should return a value of the appropriate type.
__oct__(self)
__hex__(self)
Called to implement the built-in functions oct() and hex(). Should return a string value.
__coerce__(self, other)
Called to implement "mixed-mode" numeric arithmetic. Should either return a 2-tuple containing self and other converted to a common numeric type, or None if no conversion is possible. When the common type would be the type of other, it is sufficient to return None, since the interpreter will also ask the other object to attempt a coercion (but sometimes, if the implementation of the other type cannot be changed, it is useful to do the conversion to the other type here).
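The reversed-operand, unary and conversion methods described above can be sketched with a hypothetical Metres class. The example assumes a modern Python 3 interpreter, which performs no coercion, so no __coerce__ method is defined and mixed operands are handled inside __add__ and __radd__.

```python
class Metres:
    def __init__(self, value):
        self.value = value

    def __add__(self, right):
        # Used for self + right.
        if isinstance(right, Metres):
            right = right.value
        return Metres(self.value + right)

    def __radd__(self, left):
        # Used for left + self when left does not support the addition.
        return Metres(left + self.value)

    def __neg__(self):
        return Metres(-self.value)

    def __abs__(self):
        return Metres(abs(self.value))

    def __int__(self):
        return int(self.value)

    def __float__(self):
        return float(self.value)

total = Metres(2) + 3              # calls Metres.__add__
reversed_total = 3 + Metres(2)     # the integer cannot add a Metres: __radd__
negated = -Metres(2)
magnitude = abs(negated)
as_int = int(Metres(2.7))
as_float = float(Metres(2))
```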