So I promised some more technical discussion of App Engine than my last two posts. Here it is:
Google App Engine uses a somewhat CGI-like model. That is, a script is run, and it uses stdin/stdout/environ to handle the requests. To avoid the overhead of CGI a process can be reused by defining __main__.main(). But while a process can be reused, it might not be, and of course it might get run on an entirely separate server. So in many ways it’s like the CGI model, with a small optimization so that, particularly under load, your requests can run with less latency.
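To make that model concrete, here is a minimal sketch of a CGI-style script with a reusable `main()`. It is not taken from App Engine's documentation; the `build_response` helper and the response text are my own assumptions for illustration.

```python
import os
import sys


def build_response(environ):
    """Build a CGI-style response (headers, blank line, body) as a string.

    Hypothetical helper: the request details arrive through the
    environment, as in plain CGI.
    """
    path = environ.get("PATH_INFO", "/")
    body = "Hello from %s\n" % path
    return (
        "Status: 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        "\r\n"
    ) + body


def main():
    # If the process is reused, main() is called again for the next
    # request, so nothing here may depend on state from an earlier call.
    sys.stdout.write(build_response(os.environ))


if __name__ == "__main__":
    # First request in a fresh process: the script runs top to bottom.
    main()
```

Since the process might not be reused, anything cached at module level can only be an optimization, never something the application relies on.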
This part is all well and good. I’ve already come to terms with servers going up and down without warning. But the environment itself has a number of other restrictions. It seems that App Engine is providing security in the language itself. The interpreter has been modified so that code is sandboxed, with no ability to write to the disk, open sockets, import C extensions, or see quite a few things in its environment. It’s these things that are a bit harder to come to terms with.
While they claim it supports any Python framework, these restrictions don’t actually make it easy. So for the last few days quite a few of us have been hacking various things to get stuff working.
The first thing people noticed is that Mako and Genshi didn’t work, because they use the ast (via the parser module) to handle the templating, and that module has been restricted. Apparently arbitrary bytecode is not safe in this environment, and so anything that can produce bytecode is considered dangerous. From what I understand Philip Jenvey has been working on Mako and the trunk is currently working. He’d already been doing work to get Mako working on Jython, which had similar issues. Genshi is also in progress and fairly close to working, though with some missing features. Genshi has the harder task as Mako was primarily reading the ast, while Genshi was writing it.
The first thing I noticed is that Setuptools doesn’t work. I’m flattered that one of the only 3 libraries included with App Engine is WebOb, but of course I am more enamored of a rich set of reusable libraries. Setuptools didn’t work because several modules and functions have been removed: things like os.open, os.uname, imp.acquire_lock, etc. Some of these removals are kind of reasonable, while others are not. The removal of many functions from imp doesn’t really make sense, for instance (I think the motivation was the difficulty of auditing the implementation of those functions, not that the functionality itself is dangerous). And while some functions can’t be used in the environment, the fact you can’t import those functions is more problematic. For instance, Setuptools’ pkg_resources module has support to unzip eggs when they are imported. App Engine doesn’t support importing from zip files at all, and you certainly can’t unzip to a temporary location. But without the necessary modules and objects pkg_resources won’t even import.
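The difference between "can’t call it" and "can’t import it" can be sketched like this. The `restricted_os` module below is a made-up stand-in for a sandboxed module with a function stripped out; it is not how App Engine actually implements its restrictions.

```python
import sys
import types

# Hypothetical stand-in for a sandboxed module: getcwd exists, uname does not.
restricted = types.ModuleType("restricted_os")
restricted.getcwd = lambda: "/"
sys.modules["restricted_os"] = restricted


def eager_import():
    # Naming a missing function in a from-import fails immediately,
    # even if the code would never have called uname().
    from restricted_os import getcwd, uname
    return getcwd


def lazy_import():
    # Importing the module and looking names up at call time only fails
    # if the missing function is actually used.
    import restricted_os
    return restricted_os.getcwd


try:
    eager_import()
    eager_failed = False
except ImportError:
    eager_failed = True
```

A library written in the first style is unusable as soon as one name is missing, which is the situation described above for pkg_resources.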
To work around this I started a new project: appengine-monkey, which adds several monkeypatches and replacement dummy modules to the environment to simulate a more typical environment. It’s just a small list so far (mostly in this module), but I expect as people experiment with other libraries the list will increase. For example, I would welcome implementations of things like httplib on top of urlfetch in this library. (Implementing httplib and stubbing out parts of socket would probably make urllib run.)
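As a rough illustration of the idea (this is not appengine-monkey’s actual code; `install_stub`, `fake_uname`, `noop` and the stand-in modules are all hypothetical names), a monkeypatch of this kind fills in a missing name only when the environment doesn’t already provide it:

```python
import types


def install_stub(module, name, replacement):
    """Attach replacement as module.name only if that name is missing.

    Returns True if the stub was installed, False if the real thing exists.
    """
    if hasattr(module, name):
        return False
    setattr(module, name, replacement)
    return True


def fake_uname():
    # Placeholder values (an assumption); a real stub would return
    # whatever the calling libraries can tolerate.
    return ("AppEngine", "localhost", "0", "0", "unknown")


def noop():
    # Dummy for functions like a lock acquire that have nothing to do here.
    return None


# Stand-ins for sandboxed modules that lack these names (hypothetical).
sandboxed_os = types.ModuleType("sandboxed_os")
sandboxed_imp = types.ModuleType("sandboxed_imp")

patched = [
    install_stub(sandboxed_os, "uname", fake_uname),
    install_stub(sandboxed_imp, "acquire_lock", noop),
    install_stub(sandboxed_os, "uname", noop),  # already present: left alone
]
```

Checking with `hasattr` first means the same patch module is harmless in a normal Python environment, where the real functions are left in place.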
But the good news is that Pylons is pretty much working on App Engine, as is Setuptools, and you can manage your libraries using virtualenv.
The instructions are all located in the appengine-monkey Pylons wiki page. Please leave comments if you have improvements or problems with that process. I also welcome contributors and developers to the project itself — this is a project for expediting App Engine development, it is not a project I care to champion or control. Or support to any large degree.
One ticket which is rather important is the apparent maximum number of files and blobs: 1000. Libraries involve lots of files, and the base Pylons install is only barely under this limit. Now I just wish I could use lxml, but that’s probably going to be a long time coming.
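If you want to see how close an application directory is to that limit, a quick count along these lines should do. This is a sketch under my own assumptions: `count_files` and `check_limit` are hypothetical helpers, and they count only files on disk, not datastore blobs.

```python
import os
import tempfile

FILE_LIMIT = 1000  # the limit described in this post


def count_files(root):
    """Count every regular file entry under root, recursively."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def check_limit(root, limit=FILE_LIMIT):
    """Return (file_count, within_limit) for the tree under root."""
    n = count_files(root)
    return n, n <= limit


# Small self-check on a throwaway directory tree.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "lib", "pkg"))
for name in ("app.yaml",
             os.path.join("lib", "a.py"),
             os.path.join("lib", "pkg", "b.py")):
    open(os.path.join(demo_root, name), "w").close()

print(check_limit(demo_root))
```

Pointing `check_limit` at the directory you upload gives a rough idea of how much headroom is left after installing libraries.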
Update: As of April 2009 these issues were fixed; it took a year, but at least it’s done. The 1000 file limit has been relaxed (1000 code plus 1000 static) but still exists. lxml remains unlikely.
I’ll also note that Ben Bangert has just released a new version of [Beaker](http://pypi.python.org/pypi/Beaker/0.9.4) that adds support for sessions and caches backed by [Google App Engine's Datastore API](http://code.google.com/appengine/docs/datastore/).
Am I missing something about this App Engine? People time is expensive, computers are cheap, right? Why would you go to all this effort to port to a free hosting platform when it’s virtually free anyway? I pay $30 a month for a dedicated Linux box from [server pronto](http://www.serverpronto.com/) and it’s great.
You’re missing a few things in your assumption. First and foremost is the ease of scaling. As your projects get larger, scaling becomes more and more of a problem, no matter how good you are. The amount of time and money needed to scale both vertically and horizontally can be a lot. The Google App Engine platform simplifies this process almost to the point where you don’t need to worry about it, so if you’re expecting large growth and don’t want the growing pains, it’s a great platform, so long as you know Python or Java :)
Well, the advantage of the Google thing isn’t necessarily the price. It’s the ability to scale an application until the sky turns plaid.
A good chunk of the effort is similar to the effort you’d have to put in to handle gobs of traffic. With the AppEngine, though, you don’t have to do anything but write your application intelligently. They handle all the repetitive annoying stuff.
The manageability is as important to me as the price. People time is expensive, so why waste it managing servers? (Personally the manageability of the environment is more attractive than the scalability, though the two are of course related)
I wonder why file system stuff isn’t implemented atop of Big Table. How hard is it to just dump the file inside the database, and then retrieve it with the file path as the key? It’s okay that GAE has restrictions, but they should replace all standard libraries with their own alternatives, perhaps even transparently where possible.
Obligatory “Need to construct more Pylons” comment.
IIRC, once you have to scale up, you start getting charged extra. So then you are paying just like you would somewhere else, except you are married to Google’s sandbox? If that is true, then no thanks.
Ken: no hosting is free. To start out with this is the cheapest hosting out there (free!) — we don’t know how the charges will look after that. But I would be surprised if it didn’t stay the cheapest.
Asbjørn: the Datastore doesn’t have locking, but it does have transactions. Filesystems don’t usually have transactions, but they do have locking. While it would kind of work, it seems awkward, and it would perform so differently from a real filesystem that I think it wouldn’t work very well.
Whether or not you would use it, getting platforms working on App Engine is going to have a huge trickle-down effect on Python web dev. Right now deployment is the biggest hurdle to wider adoption, and if Google is providing free deployment with a good tool set, you can bet that way more people will be trying Python out for web dev, resulting in community growth, and all the major hosting companies are going to have to start paying attention to Python or lose out. From a Python growth and marketing perspective, App Engine is great news and the work put into getting the major platforms working is well worth it!
Thanks!
There are a few other problems you run into when using GAE, like the ones mentioned here: http://bygsoft.wordpress.com/2010/01/09/cloudy-combo-google-app-engine-and-amazon-s3-combo-pack/
Yet another language to learn, though? Sure we can do it, but what’s the percentage of Python programmers vs. other languages out there?