LEGB? Meet ICPO, Python’s search strategy for attributes
When it comes to variables, Python has a well-known search strategy, known by the acronym “LEGB.” Whenever you mention a variable — and by “variable,” I mean a name that could be referencing data, a function, or a class — Python tries to find it in four different places: The local (function) scope, the enclosing function’s scope, the global scope, and finally in the “builtins” namespace.
Variable scoping seems both boring and annoying, but it actually explains a lot about Python’s design. It’s really worth learning about, and won’t take much of your time. Indeed, I have a free e-mail course on the subject; you’re welcome to subscribe.
But what about attributes? How does Python search for those?
(Quick aside: If you’re wondering what “attributes” are, then consider that when you say “a.b” in Python, the “a” is a variable and the “b” is the attribute “b” on “a”, not a variable. As a general rule, anything with a dot before its name is an attribute. And yes, there are exceptions to this rule, but it’s a good generalization.)
From my experience, this question seems like an odd one to many Python developers, including those who have been using the language for a while. What does it mean to “search for attributes”? Aren’t attributes attached to an object, in a (sort of) private dictionary?
The answer, of course, is both “yes” and “no.” Attributes do belong to a single object. But in many cases, when Python cannot find an attribute on one object, it’ll look on another object to find it.
Indeed, this search for attributes sits at the heart of the Python language, and explains many of the things that we’ve come to expect, such as method calls.
I’ll start from the end: Just as the acronym LEGB (local, enclosing, global, builtins) makes it easy (or easier) for us to understand and follow Python variable lookups, I use the acronym ICPO (instance, class, parent, object) to understand where Python searches for attributes. Keeping this search path in mind will help you to both read and write Python more easily.
I’ll expand the acronym here, and then go through a bunch of examples, so that you can understand it better:
- Instance: When we ask for a.b, Python first checks: Does the object “a” have an attribute “b”? If so, then “b” is retrieved from “a”, and the search ends.
- Class: If Python doesn’t find “b” on object “a”, then it looks for type(a).b. That is, it looks for an attribute “b” on a’s class. If it finds the attribute here, then it returns the value, and the search ends.
- Parents: If Python doesn’t find “b” on type(a), then it looks on the parents of type(a):
  - If the class inherits directly from “object”, then there aren’t really any parents from which to inherit, and this phase is skipped.
  - If the class inherits from one class, then we check there, then on that class’s parent, and so on up the chain.
  - If the class inherits from multiple classes, then we follow the MRO (method resolution order) of the class.
- Object: All classes in Python inherit, directly or indirectly, from “object”, the top of our class hierarchy. As a result, a search for an attribute that wasn’t found anywhere else always concludes on “object”. If the attribute isn’t on “object” either, then Python raises an AttributeError exception.
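To make the four steps concrete, here is a minimal sketch of the search written as an ordinary function. The name icpo_lookup is my own invention for illustration, not something Python provides, and the sketch deliberately ignores the finer points of real attribute lookup (descriptors, __getattr__, __slots__). It only shows the order in which places are checked.

```python
def icpo_lookup(obj, name):
    """Return (where_found, value) for the first place `name` turns up.

    Hypothetical helper: a simplified model of the search order only.
    It ignores descriptors, __getattr__ and __slots__.
    """
    # I: the instance's own attribute dictionary (if it has one)
    instance_attrs = getattr(obj, "__dict__", {})
    if name in instance_attrs:
        return "instance", instance_attrs[name]

    # C, P, O: the class, then its parents, then object, in MRO order
    for cls in type(obj).__mro__:
        if name in vars(cls):
            return cls.__name__, vars(cls)[name]

    # Nothing matched anywhere, not even on object
    raise AttributeError(
        f"{type(obj).__name__!r} object has no attribute {name!r}"
    )


class Parent:
    greeting = "hello"


class Child(Parent):
    pass


c = Child()
c.nickname = "kid"

print(icpo_lookup(c, "nickname")[0])   # instance
print(icpo_lookup(c, "greeting")[0])   # Parent
print(icpo_lookup(c, "__init__")[0])   # object
```

Notice that the class, its parents, and “object” are all handled by one loop over the MRO. The “C”, “P”, and “O” steps are really just successive stops along that one list.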
I is for “instance”
Let’s start with some simple code to understand what I mean by all of this:
    class Foo():
        def __init__(self, x):
            self.x = x

    f = Foo(10)
    print(f.x)
In the above code, we define a class, Foo, with a single method, __init__. (We’ll return to methods in a little bit.) When we create a new instance of Foo, aka “f”, we create a new attribute on “self”, the instance.
Take note of this: Whenever we add or update an attribute on “self”, we’re doing so on the instance. In this case, it’s an instance of Foo, not the Foo class. Just as there’s a difference between an auto factory and an individual car, so too is there a difference between the class Foo and the instance f. And in a method, “self” points to the individual instance.
Thus, when we ask (on the final line) to see the value of “f.x”, Python goes through its ICPO search path, first asking: Is “x” an attribute on the instance, which we call “f”? Happily, the answer is “yes,” the search ends, and we get our answer of 10 returned to us.
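If you want to see this for yourself, the built-in vars function shows the attributes stored directly on an object. A small sketch, using the same Foo class as above:

```python
class Foo():
    def __init__(self, x):
        self.x = x


f = Foo(10)

print(vars(f))            # {'x': 10}
print("x" in vars(f))     # True: "x" lives on the instance
print("x" in vars(Foo))   # False: the class itself has no "x"
```

The search for “f.x” never needs to go past the “I” in ICPO, because “x” is sitting right there on the instance.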
C is for “class”
The above is probably how most people think of attributes, and attribute lookups, in Python. But things are a bit more complex than that. Let’s make our class a tiny bit more interesting:
    class Foo():
        def __init__(self, x):
            self.x = x

        def x2(self):
            return self.x * 2

    f = Foo(10)
    print(f.x2())
Once again, we’ve defined a class Foo. And once again, the __init__ method will define an attribute named “x” on our new instance.
But this time around, we’re not asking for “f.x”. Rather, we’re going to execute the method “f.x2”. So Python starts off asking if the object “f” has an attribute named “x2”.
Except that the answer is “no,” because methods are class attributes. “x2” doesn’t exist on “f”; “x2” was defined on the class “Foo”. And thus, when Python cannot find “x2” on “f”, it goes to the next stop on its search, namely on f’s class — Foo.
Does Foo have an attribute “x2”? It does: the function we defined in the class body. Python hands it back as a method bound to “f”, and then the parentheses tell Python to execute it. That binding is Python’s magic switcheroo, turning “f.x2()” into “Foo.x2(f)”, thus passing “f” as the argument to the “self” parameter. And the method runs!
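Here is a short sketch that checks each of these claims: where “x2” is stored, and that the two ways of calling it give the same result. The __self__ and __func__ attributes are standard on bound methods; everything else is the same Foo class from above.

```python
class Foo():
    def __init__(self, x):
        self.x = x

    def x2(self):
        return self.x * 2


f = Foo(10)

print("x2" in vars(f))      # False: not on the instance
print("x2" in vars(Foo))    # True: defined on the class

print(f.x2())               # 20
print(Foo.x2(f))            # 20, the same call spelled out by hand

# The bound method remembers both the instance and the original function
print(f.x2.__self__ is f)        # True
print(f.x2.__func__ is Foo.x2)   # True
```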
What happens if I define a new attribute on “f” whose name is “x2”? By the ICPO rule, that attribute would have priority, and would effectively make calling the method impossible via the instance.
NOTE: This is not something you would normally want to do.
Let’s throw caution and sanity to the wind, and try it:
    class Foo():
        def __init__(self, x):
            self.x = x

        def x2(self):
            return self.x * 2

    f = Foo(10)
    f.x2 = lambda: 'Not the x2 you expected'
    print(f.x2())
When we run the above code, Python first checks on “f”, to see if it has an attribute “x2”. And the answer is “yes.” It stops searching, and returns the function that we defined with “lambda”. The function is then executed via the parentheses, and returns a string value.
Note that this hijacked version of “x2” is only available on “f”. If we were to define another, separate instance of “Foo”, on which we didn’t define an “x2” attribute, the ICPO rule would fail to find “x2” on the instance, which means it would then search on Foo (i.e., the class). Sure enough, there’s an “x2” attribute on the class “Foo”, which would be returned.
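A quick sketch of that claim, with a second instance “g” (a name I’ve added for illustration). It also shows that deleting the instance attribute lets the search fall through to the class again:

```python
class Foo():
    def __init__(self, x):
        self.x = x

    def x2(self):
        return self.x * 2


f = Foo(10)
g = Foo(20)   # a second, untouched instance
f.x2 = lambda: 'Not the x2 you expected'

print(f.x2())   # Not the x2 you expected
print(g.x2())   # 40: found on the class, as usual

# Remove the attribute from the instance; the class's method is still there
del f.x2
print(f.x2())   # 20
```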
P is for “parent”
What happens, though, if the attribute isn’t found either on the instance or on the class? Let’s look at an example:
    class Foo():
        def __init__(self, x):
            self.x = x

        def x2(self):
            return self.x * 2

    class Bar(Foo):
        pass

    b = Bar(10)
    print(b.x2())
When we create b, our new instance of Bar, Python looks for __init__ on it. But there is no attribute “__init__” defined on b. So it looks on b’s class, Bar. The attribute isn’t there, either. Python then falls back to its third ICPO possibility, the parent. Bar only inherits from a single class, Foo. Sure enough, Foo.__init__ does exist — so that method attribute is retrieved, and then executes.
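We can confirm where “__init__” actually lives by peeking into each class’s own attributes with vars. This is just a sketch for exploration, using the same Foo and Bar as above:

```python
class Foo():
    def __init__(self, x):
        self.x = x

    def x2(self):
        return self.x * 2


class Bar(Foo):
    pass


b = Bar(10)

print("__init__" in vars(Bar))        # False: Bar defines nothing itself
print("__init__" in vars(Foo))        # True
print(Bar.__init__ is Foo.__init__)   # True: the very same function

# The full search path for classes, after the instance has been checked
print([cls.__name__ for cls in Bar.__mro__])   # ['Bar', 'Foo', 'object']

print(b.x2())   # 20: "x2" is found on Foo, too
```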
In other words: The ICPO rule describes how inheritance functions in Python. If we don’t find a method in a class, we look in its parent class. But inheritance works on all attributes, not just methods; it’ll work for data, as well. That’s why, if you create a class attribute, it’s available via the instances:
    class Foo():
        y = 100

    f = Foo()
    print(Foo.y)
    print(f.y)
When we run the above code, we see “100” printed twice: The first time, because Python asks if “y” is an attribute on Foo, and the answer is “yes.”
Wait a second: Isn’t “Foo” a class? Why is it getting searched first?
Because in Python, classes are indeed special. But they’re also objects like everything else in the language. So yes, “f” is an instance of “Foo”. But “Foo” is an instance of “type”. If we ask for “Foo.y” and the attribute “y” doesn’t exist, then the ICPO rule tells Python to look at type(Foo), which is “type”. Fortunately, in this case, that doesn’t happen, and we get the value “100” back.
Then, we ask for “f.y”; this also returns “100” — because by the ICPO rule, Python looks on the instance “f”, fails to find attribute “y”, and then goes to the class “Foo”, where it finds the (class) attribute “y”.
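The same rule explains a classic surprise: assigning to “f.y” doesn’t change the class attribute. It creates a new attribute on the instance, which then wins the search for that one object only. A small sketch:

```python
class Foo():
    y = 100


f = Foo()

print(type(f) is Foo)      # True: f is an instance of Foo
print(type(Foo) is type)   # True: Foo is an instance of type

f.y = 200        # creates "y" on the instance; Foo.y is untouched

print(f.y)       # 200: found on the instance ("I")
print(Foo.y)     # 100: the class attribute is unchanged
print(Foo().y)   # 100: a fresh instance falls through to the class ("C")
```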
As always with ICPO, the first match that Python finds wins. This means that if a subclass and a parent class both have a method of the same name, the subclass’s method will execute:
    class Foo():
        def __init__(self, x):
            self.x = x

        def x2(self):
            return self.x * 2

    class Bar(Foo):
        def x2(self):
            return self.x * 22

    b = Bar(10)
    print(b.x2())   # prints 220
In the above example, we invoke “b.x2()”:
- Python looks for “b.x2”, and doesn’t find it.
- Python looks for “Bar.x2”, finds it, and executes it.
- Python does not continue searching, and thus “Foo.x2” is not executed. Which is precisely what we want, but is (in my experience) surprising to many newcomers to Python.
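To see which class “wins” for a given name, you can walk the MRO yourself. The defining_class function below is a hypothetical helper I wrote for this sketch, not part of Python. Note also that the parent’s method hasn’t gone anywhere: it’s still on Foo, and you can call it explicitly if you want it.

```python
class Foo():
    def __init__(self, x):
        self.x = x

    def x2(self):
        return self.x * 2


class Bar(Foo):
    def x2(self):
        return self.x * 22


def defining_class(obj, name):
    """Hypothetical helper: first class in the MRO that defines `name`."""
    for cls in type(obj).__mro__:
        if name in vars(cls):
            return cls
    return None


b = Bar(10)

print(b.x2())                                   # 220
print(defining_class(b, "x2").__name__)         # Bar
print(defining_class(b, "__init__").__name__)   # Foo

# Foo.x2 still exists; it's simply never reached via b.x2
print(Foo.x2(b))                                # 20
```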
Fine, but what about multiple inheritance? In such a case, the “P” stands for parents (plural) and not parent (singular). Python will search through each parent class, one at a time, according to the MRO (method resolution order). For example:
    class A():
        def __init__(self, x):
            self.x = x

        def x2(self):
            return self.x * 2

    class B():
        def __init__(self, y):
            self.y = y

        def y2(self):
            return self.y * 2

    class C(A, B):
        pass
In the above code, class C inherits from both A and B. If we ask for its MRO, we’ll find that it first looks on itself (as usual), then A, then B, and then (finally) object. This means that if we say:
    c = C(10)
    print(vars(c))
The above will show that c has a single attribute, “x”, whose value is 10. That’s because when we created “c”, Python looked for c.__init__. It kept following the ICPO rule until it found the first parent class, A, where __init__ was defined. That method ran, but B’s __init__ didn’t. This means that “c” doesn’t have the “y” attribute. And thus, if we write:
    print(c.y2())
We’ll get an AttributeError. It isn’t an indication that the method is missing, because the “y2” method is indeed found, by searching “c”, then the class “C”, then its parent “A”, and then finally its parent “B”. And indeed, we find the “y2” method there!
So what’s the problem? The “y2” method expects the object (self, aka our instance “c”) to have an attribute “y”. But because only A.__init__ ran, and not B.__init__, there is no “y” attribute, and we get an error.
The error is obviously not something we want, but it is the natural result of the ICPO rule.
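Here is the whole multiple-inheritance scenario as one runnable sketch, including the MRO and the error. I’ve wrapped the failing call in try/except only so that the script keeps running; the exact wording of the error message may vary between Python versions.

```python
class A():
    def __init__(self, x):
        self.x = x

    def x2(self):
        return self.x * 2


class B():
    def __init__(self, y):
        self.y = y

    def y2(self):
        return self.y * 2


class C(A, B):
    pass


# The search order for anything not found on the instance
print([cls.__name__ for cls in C.__mro__])   # ['C', 'A', 'B', 'object']

c = C(10)
print(vars(c))        # {'x': 10}: only A.__init__ ran
print(c.x2())         # 20

# The method itself is found just fine, on B...
print(C.y2 is B.y2)   # True

# ...but it fails when it asks for self.y, which was never set
try:
    c.y2()
except AttributeError as e:
    print(f"AttributeError: {e}")
```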
O is for “object”
The final location in which Python looks for an attribute is “object,” the top of our object hierarchy. In Python 2, you had to explicitly inherit from “object” to avoid having an old-style class, which worked just like modern classes… until it didn’t. Nowadays, we don’t have to worry about this; all classes automatically inherit from “object”, whether you state this expressly or not.
“object” doesn’t actually have a lot defined on it. There are some methods that are used as defaults, such as “__init__” (which does nothing, and fires if you don’t provide an “__init__” method on your class) and “__str__” (which ensures that all objects can be cast into strings). In many cases, you’ll want to implement — and thus override — these default methods, so that your objects can be initialized appropriately, as well as be cast as strings in the right way.
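A brief sketch of both situations: a class that relies entirely on the defaults from “object”, and one that overrides them. The class names Plain and Named are made up for this example.

```python
class Plain():
    pass


class Named():
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return f"Named({self.name})"


# Plain defines nothing, so the search ends on object
print(Plain.__init__ is object.__init__)   # True
print(Plain.__str__ is object.__str__)     # True
print(str(Plain()))   # the generic default, something like <... Plain object at 0x...>

# Named's own methods are found first, so object's are never reached
print(Named.__str__ is object.__str__)     # False
print(str(Named("a")))                     # Named(a)
```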
Why should I care?
You’ve read pretty far down in a long blog post — so I hope that you care! But if you got to this point and aren’t yet convinced of the importance of this rule, consider:
- Every time you invoke a method, Python uses the ICPO rule to find it. In other words: Inheritance in Python is a direct outgrowth of the ICPO rule.
- The fact that everything in Python, including classes, is an object means that everything follows the same rules. The ICPO rule applies to your instances and classes, but also to built-in instances and classes. Python is nothing if not consistent, and your objects fit into this consistent hierarchy.
- It’s common for people coming from other languages to talk about “instance variables” and “class variables.” Abandoning those terms in favor of “attributes,” along with the ICPO rule, will help you to understand how Python works, and why it is (or isn’t) finding the data you asked for.
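To illustrate the point about built-ins following the same rules, here is a tiny sketch using a plain string. Nothing here is special-cased: the method lives on the class, and the class is itself an instance of “type”.

```python
s = "abc"

print("upper" in vars(str))   # True: the method is an attribute of the class
print(s.upper())              # ABC
print(str.upper(s))           # ABC: the same switcheroo as with our own classes

print(type(s) is str)         # True
print(type(str) is type)      # True
print([cls.__name__ for cls in str.__mro__])   # ['str', 'object']
```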
At the end of the day, the entire object system in Python boils down to a few rules and systems. An important part of this system is the ICPO rule. Once you’ve internalized it, many things that previously seemed odd about Python will (I believe) be more straightforward and consistent.
So, let me know: Does this make things easier to understand? What does this not explain? Leave a comment here, and I’ll try to respond and/or update the article!