After some time programming in Javascript I have grown a little fond of the duality there between objects and associative arrays (dictionaries):
//Javascript
var stuff = { a: 17, b: 42 };
stuff.a; //direct access (good sugar for basic use)
stuff['a']; //key based access (good for flexibility and for foreach loops)
In Python there are basically two ways to do this kind of thing (as far as I know).
Dictionaries:
stuff = {'a': 17, 'b': 42}
# no direct access :(
stuff['a'] #key based access
or Objects:
# use a dummy class, since a bare object() instance does not let me set attributes
class O(object):
    pass

stuff = O()
stuff.a = 17
stuff.b = 42
stuff.a  # direct access :)
getattr(stuff, 'a')  # key based access
Edit: Some responses also mention namedtuples as a built-in way to create lightweight classes for immutable objects.
So my questions are:
- Are there any established best practices regarding whether I should use dicts or objects for storing simple, method-less key-value pairs?
- I can imagine there are many ways to create little helper classes to make the object approach less ugly (for example, something that receives a dict in the constructor and then overrides __getattribute__). Is it a good idea or am I over-thinking it?
- If this is a good thing to do, what would be the nicest approach? Also, are there any good Python projects using this approach that I might take inspiration from?
- There's also namedtuple – Alex L Commented Feb 4, 2012 at 15:52
4 Answers
Not sure about "established best practices", but what I do is:
- If the value types are homogeneous (i.e. all values in the mapping are numbers), use a dict.
- If the values are heterogeneous, and the mapping always has a more or less constant set of keys, use an object. (Preferably use an actual class, since this smells a lot like a data type.)
- If the values are heterogeneous but the keys in the mapping change, flip a coin. I'm not sure how often this pattern comes up in Python; dictionaries like this notably appear in JavaScript to "fake" functions with keyword arguments. Python already has those, and **kwargs is a dict, so I'd go with dicts.
Or to put it another way, represent instances of data types with objects. Represent ad-hoc or temporary mappings with dicts. Swallow having to use the ['key'] syntax; making Python feel like JavaScript just feels forced to me.
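The rule of thumb above could look something like this in practice. This is only a sketch: the Point class, the describe function and the sample values are made up for illustration and are not from the answer.

```python
class Point:
    """A small data type with a fixed, known set of fields (hypothetical)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def describe(**kwargs):
    # kwargs arrives as a plain dict, so an ad-hoc mapping needs no class.
    return ", ".join(f"{key}={value}" for key, value in sorted(kwargs.items()))


# Heterogeneous-but-fixed fields: an instance of a real class.
p = Point(17, 42)

# Homogeneous values keyed by arbitrary strings: a dict.
word_counts = {"spam": 3, "eggs": 1}

# Temporary, changing keys: keyword arguments, which are a dict anyway.
summary = describe(a=17, b=42)
```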
This is how I would decide between a dict and an object for storing simple, method-less key-value pairs:
1. Do I need to iterate over my key-value pairs?
   - Yes: use a dict
   - No: go to 2.
2. How many keys am I going to have?
   - A lot: use a dict
   - A few: go to 3.
3. Are the key names important?
   - No: use a dict
   - Yes: go to 4.
4. Do I wish to set these important key names in stone once and for all?
   - No: use a dict
   - Yes: use an object
It may also be interesting to take a look at the difference shown by dis:
>>> import dis
>>> def dictf(d):
...     d['apple'] = 'red'
...     return d['apple']
...
>>> def objf(ob):
...     ob.apple = 'red'
...     return ob.apple
...
>>> dis.dis(dictf)
  2           0 LOAD_CONST               1 ('red')
              3 LOAD_FAST                0 (d)
              6 LOAD_CONST               2 ('apple')
              9 STORE_SUBSCR

  3          10 LOAD_FAST                0 (d)
             13 LOAD_CONST               2 ('apple')
             16 BINARY_SUBSCR
             17 RETURN_VALUE
>>> dis.dis(objf)
  2           0 LOAD_CONST               1 ('red')
              3 LOAD_FAST                0 (ob)
              6 STORE_ATTR               0 (apple)

  3           9 LOAD_FAST                0 (ob)
             12 LOAD_ATTR                0 (apple)
             15 RETURN_VALUE
Well, if the keys are known ahead of time (or even if they only become known at runtime), you can use named tuples, which are basically easily-created objects with whatever fields you choose. The main constraint is that you have to know all of the keys at the time you create the tuple class, and that instances are immutable (but you can get an updated copy).
http://docs.python.org/library/collections.html#collections.namedtuple
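For example, a minimal sketch with collections.namedtuple (the Stuff name and the values are just placeholders borrowed from the question):

```python
from collections import namedtuple

# All field names must be known when the tuple class is created.
Stuff = namedtuple("Stuff", ["a", "b"])

stuff = Stuff(a=17, b=42)

direct = stuff.a                # attribute access
by_name = getattr(stuff, "b")   # name-based access
as_dict = stuff._asdict()       # mapping view of the fields

# Instances are immutable; _replace returns an updated copy instead.
updated = stuff._replace(a=18)
```

Trying to assign stuff.a = 18 directly raises AttributeError, which is the immutability constraint mentioned above.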
In addition, you could almost certainly create a class that allows you to create properties dynamically.
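One ready-made option along those lines, assuming Python 3.3 or later, is types.SimpleNamespace from the standard library. It is not mentioned in the original answer; this is just a sketch of how it could be used:

```python
from types import SimpleNamespace

# Attributes can be given up front as keyword arguments...
stuff = SimpleNamespace(a=17, b=42)

# ...and added later, either directly or by name.
stuff.c = "new"
setattr(stuff, "d", 3.14)

# vars() exposes the underlying attribute dict, which makes iteration easy.
as_dict = vars(stuff)
names = sorted(as_dict)
```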
Well, the two approaches are closely related! When you do stuff.a on a plain object, you're really accessing stuff.__dict__['a']. Similarly, you can subclass dict to make __getattr__ return the same as __getitem__, and so stuff.a will also work for your dict subclass.
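A rough sketch of both points follows. The names O and AttrDict are made up here, and this is one possible way to write such a subclass rather than a canonical recipe:

```python
class O:
    pass


plain = O()
plain.a = 17
# For a plain instance, the attribute lives in the instance __dict__.
same_value = plain.__dict__["a"]


class AttrDict(dict):
    """dict subclass where unknown attributes fall back to keys."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as .items() and .update() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Store attribute assignments as ordinary dict items.
        self[name] = value


stuff = AttrDict(a=17, b=42)
stuff.c = 99
```

One caveat with this design: a key that collides with a dict method name (for example "items") is only reachable through stuff["items"], because normal attribute lookup finds the method first.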
The object approach is often convenient and useful when you know that the keys in your mapping will all be simple strings that are valid Python identifiers. If you have more complex keys, then you need a "real" mapping.
You should of course also use objects when you need more than a simple mapping. This "more" would normally be extra state or extra computations on the returned values.
You should also consider how others will use your stuff objects. If they know it's a simple dict, then they also know that they can call stuff.update(other_stuff) etc. That's not so clear if you give them back an object. Basically: if you think they need to manipulate the keys and values of your stuff like a normal dict, then you should probably make it a dict.
As for the most "pythonic" way to do this, I can only say that I've seen libraries use both approaches:
- The BeautifulSoup library parses HTML and hands you back some very dynamic objects where both attribute and item access have special meanings. They could have chosen to give back dict objects instead, but there is a lot of extra state associated with each object, so it makes perfect sense to use a real class.
- There are of course also lots of libraries that simply give back normal dict objects; they are the bread and butter of many Python programs.