I was reading this unfinished article on Jython (part of what looks to be a very good but also incomplete series of articles on JVM languages) and he had a few criticisms/suggestions of Python. I've mentioned some minor things that annoy me before, but this made me want to collect them all, and list all the things I wish Python did. Python 3K is already taken, but I'll up the ante -- Python 4K will be 33% better than even Python 3K. (Note: implementation of Python 4K is left as an exercise for the reader.)
This isn't all backward compatible, but that said I'm not inclined to suggest needless backward incompatibilities either. Some people seem allergic to any "bloat" in a language, and obsess over removing things from builtins and elsewhere. I find this silly. Some people need to learn how to ignore what they don't care about.
So, what's my list?
Strings would not be iterable (noted here).
The self in method signatures would become optional. self would be a keyword with special significance. I regularly have bugs because of this -- mostly because I accidentally add self when calling methods on self, not because I leave them out of the function definition. But still, I find this lame.
cls would similarly become a keyword -- or maybe even allow class -- for class methods. For symmetry as much as anything.
super would become a keyword with special significance. I find super really confusing to use.
Maybe class methods wouldn't be callable on instances. But class variables should stay visible through instances. That seems tricky at first, but I actually think it's just a matter of a slightly different classmethod implementation.
We'd get some kind of declarative/lazy language extension, maybe like this. That turns class into one of several possible constructor arguments (property being a notable second such addition). Interface declarations could also be done more cleanly this way. This means that:
    interface IFoo(Bar):
        a = 1
Means IFoo = interface('IFoo', (Bar,), {'a': 1}). You can kind of do that right now with metaclasses, but it feels very hacky, and Bar has to be designed for it. Something more conservative might also be just fine. It annoys me that attributes are unordered through these techniques. But maybe not worth fixing.
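As a rough illustration of the equivalence described above, today's `class` statement already reduces to a three-argument call. The sketch below uses the built-in `type` in place of the hypothetical `interface` constructor; `Bar`, `IFoo`, and `IFoo2` are placeholder names.

```python
class Bar(object):
    pass

# The class statement...
class IFoo(Bar):
    a = 1

# ...is roughly shorthand for calling the metaclass with
# (name, bases, namespace). Here the metaclass is plain `type`;
# the proposal would let other callables stand in its place.
IFoo2 = type('IFoo', (Bar,), {'a': 1})

print(IFoo2.__name__, IFoo2.__bases__ == IFoo.__bases__, IFoo2.a)
```

The proposed syntax would presumably make the callable in that last call selectable by keyword, rather than always being the class's metaclass.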
object grows a __classinit__ method, as described in a conservative metaclass.
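A minimal sketch of what such a hook could look like, written as a Python 3 metaclass. The name `ClassInitMeta` and the calling convention (a classmethod that receives the class namespace) are assumptions made for illustration, not necessarily what the referenced recipe does.

```python
class ClassInitMeta(type):
    """Hypothetical metaclass: calls __classinit__ after a class is created."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        hook = getattr(cls, '__classinit__', None)
        if hook is not None:
            hook(namespace)


class Base(metaclass=ClassInitMeta):
    registry = []

    @classmethod
    def __classinit__(cls, namespace):
        # Runs once for Base itself and once for every subclass.
        cls.registry.append(cls.__name__)


class Child(Base):
    pass


print(Base.registry)
```

Python 3.6 and later offer `__init_subclass__`, which covers a similar need, though it runs only for subclasses and not for the class that defines it.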
Something to handle the descriptor problem, where descriptors don't know about what they are attached to. Having tried several things related to this, handling subclassing is hard, but I think it's important enough to make happen even if there are some inevitable warts.
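For what it's worth, later Python versions (3.6 and up) call `__set_name__(owner, name)` on descriptors when the owning class is created, which addresses part of this. The sketch below uses that hook; the `Typed`, `Point`, and `Point3D` names are made up for the example, and it shows one of the subclassing warts: the descriptor only ever learns about the class it was defined in.

```python
class Typed:
    """Illustrative descriptor that records where it was attached."""

    def __init__(self, kind):
        self.kind = kind
        self.owner = None
        self.name = None

    def __set_name__(self, owner, name):
        # Called automatically when the owning class body finishes.
        self.owner = owner
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        if not isinstance(value, self.kind):
            raise TypeError('%s.%s must be %s' % (
                self.owner.__name__, self.name, self.kind.__name__))
        obj.__dict__[self.name] = value


class Point:
    x = Typed(int)


class Point3D(Point):
    pass


p = Point3D()
p.x = 3
# The descriptor still believes it belongs to Point, not Point3D.
print(Point.x.owner.__name__, Point.x.name, p.x)
```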
There would only be bytes and unicode, no str, none of these Unicode*Errors. I hate those so much. Everyone does. And none of this "unicode is just intrinsically hard" crap. That's what slackers say about hard things. Python 4K will be a stand-up language, not some slacker language!
I'll admit this is a big backward compatibility issue. But unicode has already introduced huge backward compatibility problems; certain people just aren't willing to admit it, and instead call the issues "bugs". Yeah, there's more bugs than non-bugs in the standard library, so I'm unimpressed.
Maybe something to clean up magic methods, like def *(other):. This wouldn't clean up all magic methods, but would clean up several. And right now, you actually can add a '*' method to objects. Maybe def '*'(other)? Not a big deal; magic methods are magic anyway, whatever name you give them.
Some query syntax for LINQ-like expressions -- that is, expressions which are introspectable and not run immediately. SQLObject's SQLBuilder (similar things seen elsewhere) is similar, but all hacky. Stuff like this and this would be easier and cleaner with it.
Some order-by syntax. Then you could do [value for key, value in some_dict.items() orderby value]. Right now you have to do [value for key, value in sorted(some_dict.items(), key=lambda v: v[1])] and that's obviously not as nice. Nice for the queries too.
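For reference, the current spelling from the paragraph above runs as follows. `some_dict` holds made-up sample data, and the `itemgetter` variant is just an alternative to the lambda.

```python
from operator import itemgetter

some_dict = {'a': 3, 'b': 1, 'c': 2}

# The spelling quoted above: sort the (key, value) pairs by value.
values = [value for key, value in
          sorted(some_dict.items(), key=lambda item: item[1])]

# Same thing with operator.itemgetter instead of a lambda.
values_ig = [value for key, value in
             sorted(some_dict.items(), key=itemgetter(1))]

# Keys ordered by their values, where the sort key is not the result.
keys_by_value = [key for key, value in
                 sorted(some_dict.items(), key=itemgetter(1))]

print(values, keys_by_value)
```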
Some string interpolation. Probably $"something ${expression}", ala PEP 215.
Some resolution to the anonymous function thing. I agree completely with Guido -- anonymous functions and function-expressions beyond lambda don't fit in Python. I couldn't care less if lambda goes away -- I just want something good and lambda isn't good (I'm not a minimalist). I haven't figured out exactly what I want. I would probably defer to the brainstorming of the Twisted guys and other people who frequently bang their heads on this.
Some way of dealing with reloading. Maybe just cleaning up reload() (making it overwrite classes instead of recreating them, for instance). If that fancier reloading is part of the language, then people might be more careful about working so they don't break it. There would have to be an option of module-level functions __before_reload__ and __after_reload__ or something, so that even tricky code could fix itself up.
Something for object initialization. I find this tedious:
    class Foo(object):
        def __init__(self, bar, baz):
            self.bar = bar
            self.baz = baz
But I'm not actually sure what to do. I mentioned that here.
I think double-underscore class-private variables (like self.__private) are super-lame. People think they are something they are not. They are not really-truly-private variables. They are I'm-going-to-piss-people-off private variables. They are I'm-too-lazy-to-use-name-mangling private variables. They are I'm-going-to-break-subclassing private variables. If you want class-private variables for a valid reason, explicit name mangling is in all ways better.
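A small demonstration of the mangling behaviour complained about above, and of how it trips up a subclass. The class and attribute names are illustrative only.

```python
class MyClass(object):
    def __init__(self):
        # Stored on the instance as _MyClass__really_private.
        self.__really_private = 1

    def get(self):
        # Compiled as self._MyClass__really_private.
        return self.__really_private


class Sub(MyClass):
    def peek(self):
        # Inside Sub this is mangled to _Sub__really_private,
        # which was never set, so it raises AttributeError.
        return self.__really_private


obj = Sub()
print(sorted(vars(obj)))

try:
    obj.peek()
except AttributeError as exc:
    print('subclass access failed:', exc)
```

Writing `self._MyClass__really_private` by hand has the identical effect, but makes it obvious what is actually going on.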
Delegation using __getattr__ and __setattr__ can be hard to work with, both creating delegates and consuming them. I'd like if there was more explicit support for this, with better failure conditions. Maybe this just means a nice built-in metaclass that solves this problem particularly well.
Something better for forward-references. I don't know exactly what to call these -- except it means for references to things that don't exist yet. Like a reference to a class that may or may not exist yet. Maybe this could be handled with a descriptor or something. Give it a specific implementation and pattern, so people can use it more easily than having to build the pattern up from what's there currently.
There's probably some interpreter things I'd like too...
Something like Pyrex but all polished and pretty. Maybe RPython can be this.
Microprocesses -- I haven't really used Erlang much, but I'm still drawn to it more than any of the other languages in that branch (the weird functional branch -- Haskell, ML, etc). They don't have microthreads (as you might see in Stackless), but they are basically the same except without shared state. Messages only. I think that's so much cooler than microthreads, at least if you can pull it off and still be fast and reliable.
Really fast interpreter startup. That might make the reloading thing much less important. I imagine frozen modules -- freezing the actual module in its full glory just as it looks immediately after import. With copy-on-write in some fashion, to facilitate really fast microprocess startup too (so microprocesses can share some set of modules so long as they aren't written to).
The first part (freezing) seems not-that-hard right now. In theory you just put pickle and marshal together and you got it, so long as the module doesn't do anything environment-dependent on startup (modules could opt in or opt out or something with respect to their environment dependencies). Copy-on-write is probably much much harder; though maybe it's an extension of garbage collection algorithms.
Really good cross-platform intra-Python interprocess communication. This should look just like inter-microprocess communication, for process portability.
Traceback extensions, so you can better annotate your code in ways that will show up in tracebacks. Similar in spirit to what Zope does with __traceback_info__ (and what Paste copies). But a built-in convention.
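To make the idea concrete, here is a rough sketch of a formatter that looks for a `__traceback_info__` local in each frame. It is only in the spirit of the convention mentioned above and is not how Zope or Paste actually implement it; `format_with_info` and `render` are hypothetical names.

```python
import sys
import traceback


def format_with_info(exc_info):
    """Format a traceback, adding any __traceback_info__ local per frame."""
    exc_type, exc_value, tb = exc_info
    lines = []
    while tb is not None:
        # Format just the entry for this traceback level.
        lines.extend(traceback.format_tb(tb, limit=1))
        frame_locals = tb.tb_frame.f_locals
        if '__traceback_info__' in frame_locals:
            lines.append('   - info: %r\n'
                         % (frame_locals['__traceback_info__'],))
        tb = tb.tb_next
    lines.extend(traceback.format_exception_only(exc_type, exc_value))
    return ''.join(lines)


def render(template_name):
    # The annotation: an ordinary local with a conventional name.
    __traceback_info__ = 'rendering template %s' % template_name
    raise ValueError('bad template')


try:
    render('index.html')
except ValueError:
    report = format_with_info(sys.exc_info())

print(report)
```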
Some improved exception messages. I hate signature errors, for instance; very unhelpful (especially with the implicit-self-related counting error). Python does pretty well -- lots of languages do a lot worse -- but I also think this is way up there in terms of usability and can always use improvement.
No sys.path crappiness. Think this stuff through, one right way, no "search paths" (implicit awfulness). Careful or explicit relative imports.
A shortcut for for item in seq: yield item, when doing generator pass-through. Maybe yield *seq. Saw this on python-dev already. As I use generators more I encounter this desire fairly frequently.
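Python 3.3 and later spell this pass-through as `yield from`, which differs from the `yield *seq` syntax floated above. A small comparison, with made-up generator names:

```python
def inner():
    yield 1
    yield 2


def passthrough_loop():
    # The explicit loop this item wants a shortcut for.
    for item in inner():
        yield item
    yield 3


def passthrough_delegating():
    # The same pass-through using generator delegation.
    yield from inner()
    yield 3


print(list(passthrough_loop()), list(passthrough_delegating()))
```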
Tuple-unpacking, like first, *rest = path.split('/'). This would be mighty useful. The guy who wrote the Jython review raves about Python packing; this would be another fine enhancement to a delightful feature. I desire this frequently.
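This exact form is valid in Python 3 (extended iterable unpacking). A short sketch next to the slice-based spelling it replaces; the sample path is made up.

```python
path = 'usr/local/lib'

# Slice-based spelling, which needs a temporary name.
parts = path.split('/')
first, rest = parts[0], parts[1:]

# Starred unpacking; the starred name always receives a list.
first2, *rest2 = path.split('/')

# The star can sit anywhere in the target list.
*dirs, leaf = path.split('/')

print(first2, rest2, dirs, leaf)
```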
Better signature unpacking, allowing def foo(*args, bar=None). I think I saw something recently about this in python-dev, and it might happen before 3.0.
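Python 3 accepts this signature directly (keyword-only arguments). The sketch below shows it beside the `**kwargs` workaround needed without it; `foo_old` is a made-up name for the workaround.

```python
def foo(*args, bar=None):
    # bar can only be passed by keyword.
    return args, bar


def foo_old(*args, **kwargs):
    # Workaround without keyword-only arguments: pull bar out by hand
    # and reject anything unexpected.
    bar = kwargs.pop('bar', None)
    if kwargs:
        raise TypeError('unexpected keyword arguments: %r' % sorted(kwargs))
    return args, bar


print(foo(1, 2, 3), foo(1, bar=2), foo_old(1, bar=2))
```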
Things I wouldn't do:
Put up with any significant-whitespace whining. Here 99% of the Python community is with me. I think there is a meme that Python people are close-minded to suggestions for changes in the language. I think there is significant truth to that. But sometimes everyone else is just completely wrong. I want nothing to do with any programmer who would mis-indent their code. If you want to mis-indent your code you are an idiot. If you want idiotic code to be an option you are being absurd.
One thing I noticed from some recent Ruby criticisms I've read is that its syntax errors can be hard to read -- and then I realized this is in part because you have to wait until the end of the file for the compiler to catch some syntax errors that Python can catch immediately because of whitespace.
Continuation-based web frameworks (like Seaside) seem lame to me. I'm so much more impressed by event-based programming, and continuations (used in that way) seem like a way to avoid explicit events. Events are the web. Continuations are people who want to make the web into an MVC GUI. GUIs are so twentieth century. Grow up!
Language-level continuations (as opposed to the explicit continuation style used by Twisted) are cool and all -- I don't think they hurt -- but I don't think they are actually very useful either. That you can implement other things in terms of them doesn't impress me either; I'm also pretty comfortable with a series of specific language features instead of one giant meta-feature, at least when it comes to control flow.
I like the idea of adding to the ways you can express data in Python (it's not bad now). But actual macros don't really interest me (in a general sense) as I think they work on the wrong level -- they deal with syntax transformations, as opposed to object-level transformations. A fairly conservative set of additions to the syntax (I note queries and the interface/etc extension above) are better than macros.
Global interpreter lock? I still don't care, sorry. I want better processes, not better threads.
Turning = into an expression. To generalize: dynamic typing adds some usually shallow errors in return for far more expressive code. Assignment as an expression adds some really obnoxious errors (ones that don't raise exceptions) in return for minor expressiveness. That = isn't an expression keeps me from making hard bugs almost every day (instead they are easy syntax error bugs).
I'd be fine with := as an assignment-expression, because I wouldn't misspell it. That would certainly be useful. The minimalists would hate this.
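As it happens, Python 3.8 and later have exactly this operator. A small sketch of the distinction drawn above: `:=` is allowed inside an expression, while a plain `=` in the same position is rejected at compile time. The header line and regex are made-up sample data.

```python
import re

line = 'Content-Length: 42'

# Assignment expression: bind and test in one step.
if (match := re.match(r'Content-Length: (\d+)', line)) is not None:
    length = int(match.group(1))
else:
    length = None

# A plain = where an expression is expected is still a syntax error,
# so mistyping == as = cannot silently assign.
plain_assignment_rejected = False
try:
    compile('if x = 1: pass', '<demo>', 'exec')
except SyntaxError:
    plain_assignment_rejected = True

print(length, plain_assignment_rejected)
```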
self in method bodies would not become optional. I.e., you'd still talk about self.instance_variable. That's a brevity that I don't want. Ruby's @ is fine (but doesn't fit Python's aesthetic). But the implicit this is horrible, and unworkable in Python anyway. People often clump signature-self whining together with body-self whining. One of those groups has a point, the other group doesn't. Anyway, I think the complaining would go away entirely if signatures and super were fixed, because people only complain about method bodies because they think it adds weight to their argument against signatures (even though I think the opposite is true).
I actually like ? and ! in function names, as in Scheme. But I don't think it is worth adding.
I also kind of like no-public-attributes (i.e., all outside access is explicitly identified). property() helps a lot, but isn't all I would like.
The lexical scoping problems people have don't bother me. I wouldn't want Javascript-like variable declarations to define scope. It would be nice if Python gave an error earlier for scoping mistakes, though, and a better error too. I think this is statically detectable. Maybe something in addition to just the global declaration would be useful. People ask for lexical, and I can certainly see the benefit. (Note: this is only an issue when assigning to variables in enclosing scopes)
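A short sketch of the scoping mistake in question, and of the `nonlocal` declaration that Python 3 added alongside `global`. Note that the broken version only fails when the inner function is called, not when it is defined, which is the "error earlier" wish above. Function names are illustrative.

```python
def make_counter_broken():
    count = 0

    def increment():
        # The assignment makes `count` local to increment(), so reading
        # it first raises UnboundLocalError, but only at call time.
        count += 1
        return count

    return increment


def make_counter():
    count = 0

    def increment():
        nonlocal count  # rebind the variable in the enclosing scope
        count += 1
        return count

    return increment


try:
    make_counter_broken()()
except UnboundLocalError as exc:
    print('late failure:', exc)

counter = make_counter()
print(counter(), counter())
```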
Well, that's all for now. Now you know everything I think is wrong with Python (that I can think of right now), and most of what I think would make it right. Looking over it again, I actually don't think it's that pie-in-the-sky, though some parts (like self) are indeed tricky, and involve more than just syntactic additions.
Update:
Well, I don't really need, or understand deeply, all the advanced/dynamic/metaclass features you're asking for, so I'll comment on the ones I care about.
Definite agreement:
- non-iterable strings
- Object initialization
- Microprocesses. This is the holy Python grail for me. They are so sweet in Erlang.
- Better tuple unpacking. Another win for Erlang that we could steal and improve. I like your examples.
- Same for signature unpacking.
- Explicit relative imports.
- orderby.
- Better reload().
- LINQ-like stuff.
- Unicode - my number 2 behind microprocesses.
Questions:
- You just say that __variables__ are lame, and mangling is better, without ever saying what you'd do about them. How does Python 4k handle private variables/name mangling properly?
- Why aren't "?" and "!" worth allowing in function names? I want these, personally. I feel like they would aid the descriptiveness of my code significantly. (How about "-"?)
I'd add (and I don't follow python-dev, so forgive me if they exist/are coming soon):
- Built-in currying
- local variable dump in exceptions, like in cgi module
- Microprocesses. Oh, did I mention that one already?
- ipython in the standard distribution
- Real, Good, Pythonic, Documented, Cross-Platform GUI library. We're talking 4k here right? Can we ever make this happen?
- By class-private variables I mean __private, not __magic__. Class-private variables are simply never needed. If you have self.__really_private you can just use self._MyClass_really_private for similar effect (actually self._MyClass__really_private is the identical effect)
- I think adding ? and ! to function names (or symbol names in general) changes the feel of the language too much. I'd add them if I was starting from zero, but at this point I wouldn't want to use those because it would be too jarring against the bulk of existing code. Dashes are, of course, completely out ;) Also, they conflict with outside naming conventions, making some Python symbols hard to access from other languages. I don't know if this is a big deal in practice, but I think there's something good about using a lowest-common-denominator in naming. Case sensitivity is useful for the same reason (though part of me would like to be case and underscore insensitive, to remove the source of needless style differences).
For the things you would add:
- There is a proposal for a partial function, which I think is what you mean by currying. Maybe in Python 2.5? I personally don't think it's very readable in practice. lambda is actually not half bad for this one case.
- I agree that a nicer interactive prompt -- roughly based on ipython I imagine -- would be very useful.
- Local variable dumping is possible currently, but not enabled in the default traceback. I don't know if it should be -- I think it's overwhelming except in interactive environments. It should be included in more interactive environments; maybe standard library support would help. Goes with the __traceback_info__ stuff.
- GUI library... seems too hard, even for 4K ;) If it hasn't happened outside of the standard library, how can it happen inside? Plus GUIs are the past, the web is the future ;) XUL will rock, though! I'll be all over that.
# Ian Bicking
Bill: Check out PEP 309 for the currying/partial functions thing. http://www.python.org/peps/pep-0309.html. According to the PEP, it'll be in Python 2.5.
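For comparison, this is how partial application ended up looking in `functools` (available since Python 2.5), next to the lambda spelling discussed earlier. The `power` function is a made-up example.

```python
from functools import partial


def power(base, exponent):
    return base ** exponent


# Partial application: fix exponent, leave base open.
square = partial(power, exponent=2)

# The lambda equivalent.
square_lambda = lambda base: power(base, 2)

print(square(7), square_lambda(7))
```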
PartialApplication is officially in for 2.5. I cannot resist pointing out my recipe here:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/440557
It's sort of PA++ ;)
I very much want to see 'locals' and 'globals' be sort of accessible keywords much like you suggest 'super' and 'self', and also 'class'. It'd make dealing with the various name-spaces a cinch. Or maybe 'module' instead of 'globals'.
That'd be nice.
# Brantley
Why Python 4k? Since most of this is not backward compatible, I would be very thankful if you began making PEPs for it all.
Personally, I don't understand all the stuff there, but I vote for:
- really fast interpreter startup
- turns class into one of several possible constructor arguments
- order by syntax
- microprocesses (haven't read about them yet, but they sound cool).
About your assertion that GUI is old-style, I'm not totally convinced, but I really think GUIs should be able to use bookmarks, protocols and REST-style programming. What I see as a limitation in web-style programming is the lack of "push" facilities.
Thanks for this wonderful post.
# Alex Garel
The GUI stuff was really my slightly facetious rejection of those people who complain about how inelegant web programming is, and how it should be more like GUI programming. I think they are just opening a can of worms for themselves, by trying to impose a style of programming they are familiar with on a domain where it is not appropriate. Of course, I also think GUI programming is a bit retro, all things considered -- I genuinely think that as a user, not just as a programmer. But that only directs my interest, it doesn't really affect the language much one way or the other.
Some of these aren't language extensions per se, but just hard implementation problems: microprocesses are an example of that, as is interpreter startup. I am hopeful that PyPy will open up this kind of development.
# Ian Bicking
Good Ideas but please let strings yield characters on iteration :-) I've much code using string iteration.
And self/cls shouldn't be keywords and should be the first parameter of a method. Really :)
String iteration just doesn't make sense. They don't yield characters... they just yield smaller strings. Strings that contain strings that contain strings. Strings just aren't containers in Python -- they are indexable, but to iterate something should be a container. It's only (IMHO) an accident of an earlier Python that didn't have the iterator protocol, that you can iterate over strings. Strings should grow a method like .chars() that yields all the characters. Of course, not backward compatible, but the errors should be obvious ("TypeError: strings are not iterable").
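The "strings that contain strings" point shows up in practice whenever code recurses into anything iterable. A sketch, with a hypothetical `flatten` helper, of the guard that string iteration forces on such code:

```python
# A one-character string iterates to itself, so there is no "bottom".
assert list('a') == ['a']
assert 'a'[0][0][0] == 'a'


def flatten(items):
    """Flatten nested iterables, treating strings as atoms."""
    for item in items:
        if isinstance(item, (str, bytes)):
            # Without this guard, 'a' -> 'a' -> 'a' ... recurses forever.
            yield item
            continue
        try:
            nested = iter(item)
        except TypeError:
            yield item  # not iterable: a leaf value
        else:
            yield from flatten(nested)


print(list(flatten([1, ['ab', [2, 'c']], 'd'])))
```

If strings were not iterable, the `isinstance` special case would be unnecessary.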
I've been thinking about the self thing more. It's one of the hardest to resolve issues (and I doubt it will be resolved). But I think it would be reasonable if functions defined in a class scope meant something different than a function defined outside of a class scope. So if you were injecting a function definition into a class, it would (reasonably) work just like it does now, with an explicit self parameter. But if you are doing it in a class scope, the metaclass (type) wraps the function in some fashion. It seems less intrusive than class-private variables, even (which are statically determined).
The self parameter also works badly with decorators, though the same might be true if that parameter was removed. Getting a decorator to act properly with a method and with a function is hard.
# Ian Bicking
I certainly agree with some of the changes, like something as basic as proper relative imports. But I think I hear too many "GUI is old style, the web is the future" comments from a lot of programmers that are more over-enthusiasm for new web technologies rather than a careful look at why we keep and discard old/new technologies in general. As others have pointed out, even new asynchronous web technologies are still unidirectional in one sense: the client makes a request, and only then does the server send a reply. Web servers don't initiate requests to web browsers, although traditional networked GUI applications do this all the time. In tech history, we've gone from assembly language (independent of hardware), to portable languages (independent of assembly language), and now to various styles of cross-platform (independent of OS or basic platform). It's not just the web, but any successful model of cross-platform apps that is the future. The web is the future only because the browser is ubiquitous, but in the end I still have to install Firefox, etc. to be able to use the latest web technology. With things like PyGTK and wxPython, I can have the same ubiquity for GUI apps too. It's all the same: I download the app through my browser when I visit a page, or I download the Python script and run it. (P.S. I do agree with a lot of Ian's points in general. Sorry if this sounds like trolling).
# Rob Walker
I don't personally find Python wishlists that interesting, especially the ones which propose language changes, but it's good to see that you've touched on issues with the runtime itself because that is where Python (the platform) should be innovating, or at least fixing itself a fair amount. The problem is that because the language syntax is the thing most people have direct experience with (and one can become superficially acquainted with Python in that way, provoking the "whitespace sucks" arguments offered by detractors/newcomers), people feel more inclined to suggest tweaks which frequently say more about their own habits (at many different levels) rather than to seriously think about some of the harder issues - things like concurrency, performance, predictability, memory usage and library support - and then provide improvements in those areas. And when the harder issues do manage to enter the scene, proposals like optional static typing emerge: the arguably lazy and inadequate approach to dealing with such issues.
Certainly, I don't see Python diminishing in (relative) popularity because of the lack of some language constructs or notation: it's more likely to happen because Python doesn't conveniently work well with or support other, popular technologies, isn't perceived to scale well (in a way that involves convenient programming techniques), or isn't perceived to have good enough documentation - the latter being another story entirely.
My personal feeling is that a multi-line lambda is a mistake, like Guido says. However, I agree with the functionally oriented people that having to use a local named function breaks code locality and tends to make it hard to follow the code in Python written with a more functional flavor.
I think that the right Pythonic answer is to introduce a syntax kind of like ruby blocks where a function call can optionally be followed by a colon, then arg list and a suite underneath and that the arguments and suite would be converted to a local function passed as the last argument to the function. Here is an example:
    def open_in( filename, block ):
        f = open( filename )
        try:
            return block( f )
        finally:
            f.close()

    max_char = open_in( "foo.txt" ): (f)
        x = 0
        for l in f:
            x = max(x, len(l))
        return x

Or something like that. There are a few warts:
- The arg list of the anonymous block is awkward, maybe another colon at the end would help.
- The block is a lexical closure, not in the same frame as the outside function, so you get the usual assignment-to-local problem, but it's worse because it looks like a suite in the local scope the same way for and while suites are. That makes it more confusing.
- In a similar vein you need a return or yield in the block to get a result out which makes simple things like reduce look painful (but list comprehension saves filters and maps).
To see what I mean about the return or yield thing, here is an example. First the new form of reduce:
    def reduce( l, v, f ):
        for x in l:
            v = f( v, x )
        return v
Now, the clunky way to write a sum:
    sum = reduce( x, 0 ): (a,b)
        return a+b
Vs, the more elegant (IMHO):
    sum = reduce( x, 0 ): (a,b)
        a+b
However, sum is a simple contrived example because of course sums, products, any, and all should be built in, and possibly the only real uses involve suites large enough that return or yield is a help in understanding, instead of a hindrance.
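For contrast, here is roughly how the `open_in` example above has to be written in current Python, with the block pulled out into a named local function. This is only a sketch of the status quo, not of the proposed syntax; the temporary file and the `longest_line` name are stand-ins so the example runs on its own.

```python
import os
import tempfile


def open_in(filename, block):
    f = open(filename)
    try:
        return block(f)
    finally:
        f.close()


# The "block" must be defined before the call that uses it,
# which is the loss of code locality complained about above.
def longest_line(f):
    x = 0
    for l in f:
        x = max(x, len(l))
    return x


# Stand-in for "foo.txt" so the sketch is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('ab\nabcd\n')

try:
    max_char = open_in(tmp.name, longest_line)
finally:
    os.remove(tmp.name)

print(max_char)
```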
<pre>
interface IFoo(Bar):
    a = 1
</pre>
Consider that this declaration means:
<pre>
def tmp():
    a = 1
IFoo = interface('IFoo', (Bar,), tmp)
</pre>
With this we can express user defined function definition. Interface or class declaration can be emulated by executing tmp and using its local scope as definitions list.
# irg
"I actually like ? and ! in function names, as in Scheme. But I don't think it is worth adding."
Scheme? I thought Ruby invented that! ;)
In any case, I hear you on the whitespace with one exception: I wonder if comments should be exempt from that rule. Sometimes it's just easier to scan down through a file when the important comments are left-aligned.
Sometimes it's just easier to scan down through a file when the important comments are left-aligned.
Can't you do that right now? I just tried it and it seems to work fine.
# Ian Bicking
Surely you meant to write "py-in-the-sky"?
I think making self.something spellable as .something would be nice.
Not a big fan of that myself. I guess I usually read code in my head on some level, and .something isn't readable in the same way self.something is. Also, that dot is really small. As a separator that size is fine -- good even -- but if it is an independent modifier that's bothersome. Plus I'm not so much looking to save a few keystrokes by taking it out of the signature, as much as bringing some symmetry in signature and use.
# Ian Bicking
"Turning = into an expression. To generalize: dynamic typing adds some usually shallow errors in return for far more expressive code. Assignment as an expression adds some really obnoxious errors (ones that don't raise exceptions) in return for minor expressiveness. That = isn't an expression catches me from making hard bugs almost every day (instead they are easy syntax error bugs).
I'd be fine with := as an assignment-expression, because I wouldn't misspell it. That would certainly be useful. The minimalists would hate this. "
I'm guessing you'll be using = for equality?
I'm guessing you'll be using = for equality?
No, nothing so dramatic; instead adding := as another way to do assignment, allowed in expressions. This is why minimalists would hate it (and not without justification). My real problem with making assignment into an expression is that it's really easy to spell == as =, with disastrous consequences. It's really hard to accidentally leave out an = and add in a :, so with := I don't expect that typos will be too much of a problem.
# Ian Bicking
I like the way you're handling self and cls. One thing to note here is that any method that contains a self. is an instance method, any method that contains a cls. is a class method if it doesn't contain a self. as well. If it contains neither, it's a static method. Both the classmethod and staticmethod functions go away.
Another minor point here is that allowing both self. and cls. in the same method gives you access to the class object without any special syntax.
# John Roth
I don't know... I'd consider that much too magical and opaque. »Explicit is better than implicit«, right?