So I promised some more technical discussion of App Engine than my last two posts. Here it is:
Google App Engine uses a somewhat CGI-like model. That is, a script is run, and it uses stdin/stdout/environ to handle the requests. To avoid the overhead of CGI a process can be reused by defining __main__.main(). But while a process can be reused, it might not be, and of course it might get run on an entirely separate server. So in many ways it’s like the CGI model, with a small optimization so that, particularly under load, your requests can run with less latency.
This part is all well and good. I’ve already come to terms with servers going up and down without warning. But the environment itself has a number of other restrictions. It seems that App Engine is providing security in the language itself. The interpreter has been modified so that code is sandboxed, with no ability to write to the disk, open sockets, import C extensions, and see quite a few things in its environment. It’s these things that are a bit harder to come to terms with.
While they claim it supports any Python framework, these restrictions don’t actually make it easy. So for the last few days quite a few of us have been hacking various things to get stuff working.
The first thing people noticed is that Mako and Genshi didn’t work, because they use the ast (via the parser module) to handle the templating, and that module has been restricted. Apparently arbitrary bytecode is not safe in this environment, and so anything that can produce bytecode is considered dangerous. From what I understand Philip Jenvy has been working on Mako and the trunk is currently working. He’d already been doing work to get Mako working on Jython, which had similar issues. Genshi is also in progress and fairly close to working, though with some missing features. Genshi has the harder task as Mako was primarily reading the ast, while Genshi was writing it.
The first thing I noticed is that Setuptools doesn’t work. I’m flattered that one of the only 3 libraries included with App Engine is WebOb, but of course I am more enamored of a rich set of reusable libraries. Setuptools didn’t work because several modules and functions have been removed — this like os.open, os.uname, imp.acquire_lock, etc. Some of these are kind of reasonable, while others are not. The removal of many functions from imp doesn’t really make sense, for instance (I think the motivation was the difficulty of auditing the implementation of those functions, not that the functionality itself is dangerous). And while some functions can’t be used in the environment, the fact you can’t import those functions is more problematic. For instance, The Setuptools’ pkg_resources module has support to unzip eggs when they are imported. App Engine doesn’t support importing from zip files at all, and you certainly can’t unzip to a temporary location. But withoutthe necessary modules and objects pkg_resources won’t even import.
To work around this I started a new project: appengine-monkey, which adds several monkeypatches and replacement dummy modules to the environment to simulate a more typical environment. It’s just a small list so far (mostly in this module), but I expect as people experiment with other libraries the list will increase. For example, I would welcome implementations of things like httplib on top of urlfetch in this library. (Implementing httplib and stubbing out parts of socket would probably make urllib run.)
But the good news is that Pylons is pretty much working on App Engine, as is Setuptools and you can manage your libraries using virtualenv.
The instructions are all located in the appengine-monkey Pylons wiki page. Please leave comments if you have improvements or problems with that process. I also welcome contributors and developers to the project itself — this is a project for expediting App Engine development, it is not a project I care to champion or control. Or support to any large degree.
One ticket which is rather important is the apparent maximum number of files and blobs: 1000. Libraries involve lots of files, and the base Pylons install is only barely under this limit. Now I just wish I could use lxml, but that’s probably going to be a long time coming.
Update: As of April 2009 these issues were fixed; it took a year, but at least it’s done. The 1000 file limit has been relaxed (1000 code plus 1000 static) but still exists. lxml remains unlikely.