Quite some time ago I gave a little presentation on DictMixin at ChiPy. If you haven’t used DictMixin before, it’s a class that implements all the derivative methods of dictionaries so you only have to implement the most minimal set: __getitem__, __setitem__, __delitem__, and keys. It’s a lot better than subclassing dict directly: with dict you have to override a lot more methods, and dict implies a specific kind of storage. With DictMixin you can get the information from anywhere.
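To make the idea concrete, here is a minimal sketch. It is an illustration rather than anything from the talk, and it makes one loud assumption: DictMixin lived in Python 2’s UserDict module and is gone in Python 3, so the sketch uses collections.abc.MutableMapping, which plays the same role but asks for __iter__ and __len__ where DictMixin asked for keys(). LowerDict is a made-up class name.

```python
from collections.abc import MutableMapping


class LowerDict(MutableMapping):
    """Toy mapping that lower-cases every key (hypothetical example class)."""

    def __init__(self):
        self._data = {}

    # The minimal set: everything else is derived from these.
    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    # MutableMapping asks for these two where DictMixin asked for keys().
    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


d = LowerDict()
d.update({"Content-Type": "text/plain"})  # update() is inherited
d.setdefault("ACCEPT", "*/*")             # so is setdefault()
print(sorted(d.items()))                  # and items()
print(d.get("content-type"), "accept" in d, d.pop("Accept"))
```

None of update(), setdefault(), items(), get(), pop() or the `in` test is written in LowerDict; they all route through the handful of methods that are.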
I thought of a couple of examples and wrote some doctests for them; I thought satisfying the doctests would itself be the presentation. I’m not sure how well it worked; it was a fairly experienced crowd, but the switch from code to test can be disorienting.
One of the examples I used was a filesystem access layer. Representing a filesystem as a dictionary is nothing new, but the simplicity of the representation worked well. Here’s how it works:
- An FSDict represents one directory.
- The keys are the filenames in the directory.
- The values are the contents of the files (strings).
- When there is a subdirectory, it is another FSDict instance.
- When you assign a dictionary-like object to a key, it creates an FSDict from that object.
Dictionaries have lots of methods, like items(), update(), etc. But using DictMixin you just implement the four methods. First, the setup:
import os
from UserDict import DictMixin

class FSDict(DictMixin):
    def __init__(self, path):
        self.path = path
Creation of a dictionary is not part of the dictionary interface. This seems a little strange at first, but the dict class interface isn’t the same as the dictionary instance interface. So FSDict.__init__ doesn’t bear any particular relation to dict.__init__.
Now the other methods… in each case, strings and dictionaries (files and directories) are treated differently.
    def __getitem__(self, item):
        fn = os.path.join(self.path, item)
        if not os.path.exists(fn):
            raise KeyError("File %s does not exist" % fn)
        if os.path.isdir(fn):
            return self.__class__(fn)
        f = open(fn, 'rb')
        c = f.read()
        f.close()
        return c
Note the use of self.__class__(fn) instead of FSDict(fn). This makes the class subclassable if you retain the FSDict.__init__ signature. This way subclasses will create new instances using the subclass. Note also that KeyError is part of the dictionary interface (an important part!), so we can’t raise IOError.
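Both points can be seen without touching the disk. The following is a sketch under assumptions, not part of the original class: Tree, LoudTree and BadTree are hypothetical names, nested plain dicts stand in for directories, and collections.abc.Mapping (Python 3) stands in for DictMixin.

```python
from collections.abc import Mapping


class Tree(Mapping):
    """Read-only view of nested dicts (hypothetical stand-in for FSDict)."""

    def __init__(self, data):
        self.data = data

    def __getitem__(self, key):
        value = self.data[key]            # a missing key raises KeyError here
        if isinstance(value, dict):
            return self.__class__(value)  # not Tree(value)
        return value

    def __iter__(self):
        return iter(self.data)

    def __len__(self):
        return len(self.data)


class LoudTree(Tree):
    """Subclass adding a method; nested lookups should keep this type."""

    def shout(self):
        return sorted(k.upper() for k in self)


class BadTree(Tree):
    """Breaks the contract on purpose by raising something other than KeyError."""

    def __getitem__(self, key):
        raise OSError("no such file: %s" % key)


root = LoudTree({"docs": {"readme": "hi"}})
sub = root["docs"]
print(type(sub).__name__, sub.shout())

# get() and `in` swallow KeyError and nothing else.
print(root.get("missing", "default"), "missing" in root)

try:
    BadTree({}).get("missing", "default")
    leaked = False
except OSError:
    leaked = True   # the default was never returned
print(leaked)
```

With Tree(value) hard-coded, sub would come back as a plain Tree and sub.shout() would fail. With the wrong exception type, get() cannot supply its default.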
Now, assignment…
    def __setitem__(self, item, value):
        if item in self:
            del self[item]
        fn = os.path.join(self.path, item)
        if isinstance(value, str):
            f = open(fn, 'wb')
            f.write(value)
            f.close()
        else:
            # Assume it is a dictionary
            os.mkdir(fn)
            f = self[item]
            f.update(value)
Note that with subdirectories (represented as nested dictionaries) we let DictMixin.update do all the hard work, and just create an empty directory to be filled.
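The recursion is easy to miss, so here is an in-memory sketch of the same trick. The assumptions: Folder is a hypothetical class that keeps its entries in a plain dict instead of on disk, and it is built on Python 3’s collections.abc.MutableMapping rather than DictMixin. The inherited update() calls __setitem__ once per key, and __setitem__ calls update() again for each nested mapping.

```python
from collections.abc import MutableMapping


class Folder(MutableMapping):
    """In-memory stand-in for a directory (hypothetical; no disk access)."""

    def __init__(self):
        self.entries = {}

    def __getitem__(self, name):
        return self.entries[name]

    def __setitem__(self, name, value):
        if isinstance(value, str):
            self.entries[name] = value   # a "file"
        else:
            # Assume it is a dictionary: make an empty "directory" ...
            child = self.entries[name] = Folder()
            # ... and let the inherited update() fill it, one key at a time.
            child.update(value)

    def __delitem__(self, name):
        del self.entries[name]

    def __iter__(self):
        return iter(self.entries)

    def __len__(self):
        return len(self.entries)


root = Folder()
root["project"] = {
    "README": "hello",
    "src": {"main.py": "print('hi')", "pkg": {}},
}
print(sorted(root["project"]))
print(root["project"]["src"]["main.py"])
print(type(root["project"]["src"]["pkg"]).__name__)
```

No loop over the nested structure appears anywhere in Folder; the depth-first walk falls out of update() and __setitem__ calling each other.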
Deletion…
    def __delitem__(self, item):
        fn = os.path.join(self.path, item)
        if not os.path.exists(fn):
            raise KeyError("File %s does not exist" % fn)
        if os.path.isdir(fn):
            ## one way...
            self[item].clear()
            os.rmdir(fn)
            ## another way...
            #shutil.rmtree(fn)
        else:
            os.unlink(fn)
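The two routes in those comments end in the same place. As a standalone sketch (Python 3, with hypothetical helpers make_tree and remove_bottom_up and a throwaway temporary directory), the first function does by hand what clear() followed by os.rmdir amounts to, and shutil.rmtree does it in one call:

```python
import os
import shutil
import tempfile


def make_tree(root):
    """Create root/a.txt and root/sub/b.txt (hypothetical sample layout)."""
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(root, rel), "w") as f:
            f.write("data")


def remove_bottom_up(path):
    """Empty the directory child by child, then rmdir it (the clear() route)."""
    for name in os.listdir(path):
        child = os.path.join(path, name)
        if os.path.isdir(child):
            remove_bottom_up(child)
        else:
            os.unlink(child)
    os.rmdir(path)  # only succeeds once the directory is empty


base = tempfile.mkdtemp()
first = os.path.join(base, "first")
second = os.path.join(base, "second")
make_tree(first)
make_tree(second)

remove_bottom_up(first)  # one way
shutil.rmtree(second)    # another way

print(os.listdir(base))  # both trees are gone
os.rmdir(base)
```

One difference worth knowing: os.path.isdir follows symbolic links, so the hand-written route will descend into a symlinked directory, while shutil.rmtree does not follow links inside the tree.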
Enumeration…
    def keys(self):
        return os.listdir(self.path)
So, to recursively copy '/foo/bar' to '/dest/path/bar' you do:
FSDict('/dest/path')['bar'] = FSDict('/foo')['bar']
It doesn’t really matter if '/foo/bar' is a directory or file. There are a number of other clever things that come out of this. I think it’s an example of the power of a closed set: dictionaries are expressible from these four operations, and all the other methods can be derived from there. If you find this interesting, you might want to read the source for DictMixin; it’s only about 95 lines.
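For anyone trying this on Python 3, where DictMixin no longer exists, here is a sketch of the whole class in one piece. It is a port under assumptions, not the code from the talk: it builds on collections.abc.MutableMapping, so keys() becomes __iter__ plus __len__, and file contents are bytes rather than Python 2 str. Temporary paths stand in for '/foo' and '/dest/path'.

```python
import os
import shutil
import tempfile
from collections.abc import MutableMapping


class FSDict(MutableMapping):
    """Python 3 sketch: one directory presented as a mapping."""

    def __init__(self, path):
        self.path = path

    def __getitem__(self, item):
        fn = os.path.join(self.path, item)
        if not os.path.exists(fn):
            raise KeyError("File %s does not exist" % fn)
        if os.path.isdir(fn):
            return self.__class__(fn)
        with open(fn, "rb") as f:
            return f.read()

    def __setitem__(self, item, value):
        if item in self:
            del self[item]
        fn = os.path.join(self.path, item)
        if isinstance(value, bytes):  # assumption: files are bytes here
            with open(fn, "wb") as f:
                f.write(value)
        else:
            # Assume it is a dictionary
            os.mkdir(fn)
            self[item].update(value)

    def __delitem__(self, item):
        fn = os.path.join(self.path, item)
        if not os.path.exists(fn):
            raise KeyError("File %s does not exist" % fn)
        if os.path.isdir(fn):
            self[item].clear()
            os.rmdir(fn)
        else:
            os.unlink(fn)

    # DictMixin wanted keys(); MutableMapping wants these two instead.
    def __iter__(self):
        return iter(os.listdir(self.path))

    def __len__(self):
        return len(os.listdir(self.path))


base = tempfile.mkdtemp()
src = FSDict(base)
src["foo"] = {"bar": {"a.txt": b"alpha", "nested": {"b.txt": b"beta"}}}
src["dest"] = {}

# The recursive copy, with temporary paths in place of the real ones.
FSDict(os.path.join(base, "dest"))["bar"] = FSDict(os.path.join(base, "foo"))["bar"]

names = sorted(src["dest"]["bar"])
copied = src["dest"]["bar"]["nested"]["b.txt"]
print(names, copied)

shutil.rmtree(base)  # clean up the scratch directory
```

The `item in self` test reads the whole file just to learn that it exists, which is wasteful for large files; overriding __contains__ with os.path.exists would be a reasonable refinement.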
My article templating via dict wrappers has some other similar dict tricks.
I wrote a Ruby version of your FSDict class. It’s here: http://blog.ntecs.de/articles/2007/08/17/ian-bickings-fsdict-in-ruby
DictMixin is handy. But it’s an old-style class, and I ran into difficulty when using it to store dictionary contents in a database with Storm because of that. This led me to the #storm IRC channel, where I was told simply that what I was doing wasn’t a proper dictionary and therefore I should use an over-complicated delegation system instead. Ugh.
I can’t see the point of that. Yes, a class using DictMixin can do some clever stuff behind the scenes that you wouldn’t necessarily expect from a normal dictionary. But that doesn’t mean that it isn’t actually a dictionary, does it? It still stores key/value pairs, right?
I think adapting simple interfaces like the Dict interface to complex behaviour is one of the things that makes Python so useful. I don’t see the problem with “magic” like this, in fact I think it’s a great thing. What do you say to people who want to double your class-count for the sake of it? Telling them to go back to Java sounds too snotty, but their attitude isn’t great either.
I’m not exactly sure what you were doing. obj.__dict__ must be a real dictionary; I don’t even think a dictionary subclass is okay. But that’s right at the heart of Python’s objects, so it’s not too surprising: lots of low-level code pokes right into an instance’s dictionary structure directly, I imagine. I’m guessing something along those lines is what happened to you with Storm? There are also questions like “can this object change behind my back?”, which you can feel more confident about with a real dictionary. As an example, WSGI doesn’t allow dictionary-like objects for requests, only exactly dict. It’s like a contract in that case.
You can do
    class Foo(object, DictMixin):
if you want a new-style dict-like class; that works fine.
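The “only exactly dict” kind of contract comes down to a type check that is stricter than isinstance. A small sketch under assumptions: EnvLike and is_plain_dict are hypothetical names, not part of any WSGI library, and the mapping is built on Python 3’s collections.abc.MutableMapping rather than DictMixin.

```python
from collections.abc import MutableMapping


class EnvLike(MutableMapping):
    """Dictionary-like wrapper (hypothetical) around some other storage."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def is_plain_dict(obj):
    """Hypothetical helper: True only for the builtin dict, not look-alikes."""
    return type(obj) is dict


env = EnvLike({"REQUEST_METHOD": "GET"})
print(isinstance(env, MutableMapping), is_plain_dict(env))

plain = dict(env)  # copy into a real dict at the boundary
print(is_plain_dict(plain), plain)
```

Copying with dict(...) at the boundary lets the clever object do its work on one side while the strict consumer gets the plain dictionary it was promised.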