Friday, December 14, 2012

A guide to Python Namespaces

A guide to Python Namespaces

This post is part of the Powerful Python series where I talk about features of the Python language that make the programmer’s job easier. The Powerful Python page contains links to more articles as well as a list of future articles.
Namespaces are a fundamental idea in Python and can be very helpful in structuring and organizing your code (especially if you have a large enough project). However, namespaces might be a somewhat difficult concept to grasp and get used to if you’re new to programming or even coming from another programming language (in my case, Java). Here’s my attempt to make namespaces just a little easier to understand.
What’s in a name?
Before starting off with namespaces, you have to understand what Python means by a name. A name in Python is roughly analogous to a variable in just about any other language, but with a few extras. First of, because of Python’s dynamic nature, you can apply a name to just about anything. You can of course give names to values.
1= 12
2= 'B'
3= [1234]
But you can also give names to things like functions:
1def func():
2    print 'This is a function'
4= func
Now whenever you want to use func(), you can use f() instead. You can also take a name and reuse it. For example, the following code is perfectly legal in Python:
1var = 12
2var = "This is a string now"
3var = [2468]
If you accessed the name var in between assignments, you’d get a number, a string and a list at different times. Names go hand in hand with Python’s object system, ie. everything in Python is an object. Numbers, strings, functions, classes are all object. The way to get to the objects is often through a name.
Modules and Namespaces go hand in hand
So much for names. A namespace, is obviously enough, a space that holds a bunch of names. The Python tutorial says that they are a mapping from names to objects. Think of it as a big list of all the names that you’ve defined, either explicitly or my importing from modules. It’s not something than you have to create, it’s created whenever necessary.
To understand namespaces, you also have to have some understanding of modules in Python. A module is simply a file containing Python code. This code can be in the form of Python classes, functions, or just a list of names. Each module gets it’s own global namespaces. So you can’t have two classes or two functions in the same module with the same name as they share the namespace of the module (unless they are nested, which we’ll come to later).
However each namespace is also completely isolated. So two modules can have the same names within them. You can have a module called Integer and a module called FloatingPoint and both could have a function named add(). Once you import the module into your script, you can access the names by prefixing them with the module name: FloatingPoint.add() and Integer.add().
Whenever you run a simple Python script, the interpreter treats it as module called __main__, which gets its own namespace. The builtin functions that you would use also live in a module called __builtin__ and have their own namespace.
Importing pitfalls
Of course, modules are useless unless you import them into your program. There are a number of ways to do imports, and each has a different effect on the namespace.
1. import SomeModule
This is the simplest way to do imports and generally recommended. You get access to the module’s namespace provided you use the module’s name as a prefix. This means that you can have names in your program which are the same as those in the module, but you’ll be able to use both of them. It’s also helpful when you’re importing a large number of modules as you see which module a particular name belongs to.
2. from SomeModule import SomeName (name in namespace)
This imports a name (or a few, separated by commas) from a module’s namespace directly into the program’s. To use the name you imported, you no longer have to use a prefix, just the name directly. This can be useful if you know for certain you’ll only need to use a few names. The downside is that you can’t use the name you imported for something else in your own program. For example, you could use add() instead of Integer.add(), but if your program has an add() function, you’ll lose access to the Integer’s add() function.
3. from SomeModule import *
This imports all the names from SomeModule directly into the module’s namespace. Generally not a good idea as it leads to ‘namespace pollution’. If you find yourself writing this in your code, you should be better off with the first type of import.
These imports apply to classes and other data just as much as functions. Imports can be confusing for the effect they have on the namespace, but exercising a little care can make things much cleaner.
Scoping LEG-B
Even though modules have their own global namespaces, this doesn’t mean that all names can be used from everywhere in the module. A scope refers to a region of a program from where a namespace can be accessed without a prefix. Scopes are important for the isolation they provide within a module. At any time there are a number of scopes in operation: the scope of the current function you’re in, the scope of the module and then the scope of the Python builtins. This nesting of scopes means that one function can’t access names inside another function.
Namespaces are also searched for names inside out. This means that if there is a certain name declared in the module’s global namespace, you can reuse the name inside a function while being certain that any other function will get the global name. Of course, you can force the function to use the global name by prefixing the name with the ‘global’ keyword (not OOP). But if you need to use this, then you might be better off using classes and objects.
Classes and namespaces have special interactions. The only way for a class’ methods to access it’s own variables or functions (as names) is to use a reference to itself. This means that the first argument of a method must be a ‘self’ parameter, if it to access other class attributes. You need to do this because that while the module has a global namespace, the class itself does not. You can define multiple classes in the same module (and hence the same namespace) and have them share some global data. While this is different from other object-oriented languages, you’ll quickly get used to it.
Hopefully this guide will help you avoid some of the pitfalls that can arise if you don’t understand namespaces. They can lead to unusual results if you don’t use them properly, but they can help you write clean, properly separated code if you use them well. A further source of information on namespaces and classes is the excellent Python tutorial.

No comments:

Post a Comment