Book Idea: Python Optimization
Every so often it seems like some Python figure thinks about writing
a book, but they aren't sure what to write about. So, for anyone
like that, here's an idea for the taking: Python optimization. It
seems like it has a fairly narrow scope, but I think there's
actually a lot of material there. Here are some things I can think of:
- Profiling, where all optimization should start.
- In-Python optimizations:
  - Caching (this is where I always start). Also pooling.
  - Factoring applications into long-running processes, when
    startup times are a problem. Also tips like lazy loading
    of code.
  - Other tricks to speed startup time. I understand things
    like the module search path can significantly affect some of
    this.
  - Streaming... I feel like there's a term I am missing
    here. Like the difference between SAX XML parsing and
    DOM, where SAX encourages the programmer to do all the
    work possible as the document is being parsed, while DOM
    typically does all the parsing up front. That's one
    example of a larger concept, and one that dramatically
    affects performance.
  - Note important modules that may be forgotten. E.g.,
    StringIO.
  - Lots of little tips, e.g.,
    lst.sort(); lst.reverse() instead of
    lst.sort(lambda a, b: cmp(b, a)).
- Numeric and related modules. There are a lot of novel ways of
  using these tools outside of their core domain.
- Pyrex.
- Extensions in C, thinking particularly about those inner loops.
  I think this subject could remain fairly tight as long as you
  are focused on optimization, rather than the entire subject of
  programming C Python extensions.
- Psyco. It's kind of magic, so it might be hard to talk about,
  but there is some important information about how it works.
- Microthreads. There are a lot of flavors and ideas out there, some
  built on generators, some not. Perhaps some talk of Stackless,
  or Greenlets (the mysterious new kid on the block).
- Asynchronous programming. This is a big topic, but at least
  there should be a discussion of the performance characteristics,
  and enough information to get a feel for it.
- XML parsing. It seems very specific, but there are a bunch of
  alternatives, and performance can be a significant issue.
- GUI programming, particularly techniques to make a GUI
  responsive. This might be difficult to address without covering
  GUI programming as a whole, so maybe this wouldn't work.
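To make the streaming item above a bit more concrete, here is a minimal sketch of the SAX-versus-DOM difference using only the standard library's xml.sax and xml.dom.minidom. The sample document, the TotalHandler class, and the two total_* functions are made up for illustration; they are assumptions, not taken from any real application.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document: five <order> elements with totals 1..5.
DOC = (
    "<orders>"
    + "".join('<order total="%d"/>' % n for n in range(1, 6))
    + "</orders>"
).encode("utf-8")


class TotalHandler(xml.sax.ContentHandler):
    """Adds up the 'total' attribute of each <order> as it is parsed."""

    def __init__(self):
        super().__init__()
        self.total = 0

    def startElement(self, name, attrs):
        # Called once per opening tag, while parsing is still in progress.
        if name == "order":
            self.total += int(attrs["total"])


def total_streaming(data):
    """SAX style: do the work during the parse, keep no tree around."""
    handler = TotalHandler()
    xml.sax.parseString(data, handler)
    return handler.total


def total_dom(data):
    """DOM style: build the whole tree first, then walk it."""
    doc = minidom.parseString(data)
    return sum(
        int(el.getAttribute("total"))
        for el in doc.getElementsByTagName("order")
    )


if __name__ == "__main__":
    print(total_streaming(DOC), total_dom(DOC))
```

Both functions compute the same sum. The streaming version only ever looks at one element at a time, while the DOM version holds the full tree in memory before the loop starts; on large documents that difference is roughly the larger concept the list item is pointing at.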
I'm sure there are things I'm not thinking of, and there's a lot of
research that would go into it. I think the result could be a really
good book, though, with something for every level. Like a Python
Cookbook, only more specific. There's a fair amount of competition among
generic Python books, so specific topics seem to have more potential. It
could also be quite popular; it's something that would catch the eye of
even non-Python programmers, as there are many people who want to use
Python, but are concerned about the performance. I think that concern is
often misplaced, but it's there nonetheless. Performance is also
exciting to people, in the way that gets people to buy books.
Anyway, there's my idea; maybe it'll be helpful to someone.
Created 17 Sep '04
Modified 14 Dec '04
Here's another idea: "Design and Interpretation of Runtime Systems".
Every computer science department on the planet offers a course on
compilers, but how many offer a course on the theory and practice of
what you have to do at runtime? Linking and loading, byte-code
interpreters, just-in-time compilation---there's a lot there, but
no one has ever written it all down in one place.
I think Python would be an ideal running example for such a book, if one
included Psyco, IronPython, etc.
An optimization book would be great. Many people who use Python have
rich programming backgrounds and can transfer universal optimization
concepts like caching and pooling to Python. However, I believe that
for a significant number of Python users, Python is a first or at most
second language. This group doesn't have the benefit of lessons
learned after years of programming in C, Java, or Perl to apply to
optimizing Python programs.
Rather than a dead-tree book, how about a Wikipedia-like public
repository, with many contributors?