(I'd really like feedback; if anyone has ideas or experience trying something like this, or even knows of the appropriate forum to talk about this sort of thing, please tell me.)
At work we have a problem with using many different kinds of user authentication. Zope has its own authentication, Apache has authentication (with many different kinds of backends, but only supporting HTTP authentication), and Webware and other systems use their own techniques.
So far I've found some ways around this. For instance, in Webware using LoginKit, I wrote ProxyUserManager, which lets me share authentication with Zope (or even Apache, I guess). In Zope I've used PluggableUserFolders to authenticate off external sources, which is painful but workable (the pain is Zope's fault, not PUF's). But really this is a dumb way to go about it -- I'm left creating ad hoc hooks between various systems, without any clear concept of who owns what and with lots of duplicated user databases and permission systems. In-application solutions just don't do much good; this really has to be orthogonal to all the kinds of applications I'm using, regardless of language or location.
So... my answer is to centralize around the only common component to all these systems -- Apache. I can't write a Python library, I can't even create some service that works in different environments, since I have to protect static content as well.
Apache authentication, unfortunately, sucks. It's tied to HTTP authentication, and the UI issues of HTTP authentication are simply insurmountable and unacceptable. Every authentication backend in Apache is different, and I don't want to install and maintain the configuration of N modules for N different kinds of authentication (e.g., database, file, and a variety of off-site systems).
But, getting away from all the pessimism, I have been working on this some. I used mod_python to add a header parser handler (PythonHeaderParserHandler), which gets called very early in the request cycle. This handler parses cookies and checks for a signed cookie that gives the username. If it finds such a cookie, it creates a fake Authorization header, and if it doesn't find a cookie it deletes any Authorization header that might be there. A later authentication handler checks the username/password, but accepts every username, knowing that it was set in that earlier handler. It's a little contorted, but it seems to work.
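To make the header rewriting concrete, here is a minimal sketch of that logic in plain Python, outside of mod_python. It is not the code from authen.py; the cookie name, the "username:signature" cookie format, the HMAC-SHA256 signature, and the function names are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
from http.cookies import CookieError, SimpleCookie

SECRET = b"server-side-secret"  # hypothetical shared secret
COOKIE_NAME = "auth_user"       # hypothetical cookie name


def verified_username(cookie_header, secret=SECRET):
    """Return the username from a validly signed cookie, or None."""
    cookies = SimpleCookie()
    try:
        cookies.load(cookie_header or "")
    except CookieError:
        return None
    morsel = cookies.get(COOKIE_NAME)
    if morsel is None:
        return None
    # Assumed cookie format: "<username>:<hex HMAC-SHA256 of username>"
    username, sep, signature = morsel.value.rpartition(":")
    if not sep or not username or ":" in username:
        return None
    expected = hmac.new(secret, username.encode("utf-8"),
                        hashlib.sha256).hexdigest()
    if hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii")):
        return username
    return None


def rewrite_headers(headers, secret=SECRET):
    """Mimic the early handler: only the signed cookie is trusted.

    Any Authorization header sent by the client is dropped; a fake one is
    added only when the cookie verifies.
    """
    out = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    user = verified_username(headers.get("Cookie"), secret)
    if user is not None:
        # The password is a placeholder; the later handler accepts anything.
        token = base64.b64encode(f"{user}:x".encode("utf-8")).decode("ascii")
        out["Authorization"] = "Basic " + token
    return out
```

A real version would presumably also put an expiry time inside the signed value, so that a captured cookie doesn't stay valid forever.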
This way authentication isn't controlled by anything in Apache; any script that knows how to sign a cookie (any script that knows the server-side secret) can authenticate a user, using whatever technique it cares to, and it doesn't otherwise have to interact nicely with any other piece of the system.
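As an illustration of how little the signing side needs, a login script in any framework could produce the cookie with something like the sketch below. The secret, the cookie name, and the "username:signature" format are assumptions for the example, not a description of the rough code linked later.

```python
import hashlib
import hmac
from http.cookies import SimpleCookie

SECRET = b"server-side-secret"  # hypothetical shared secret
COOKIE_NAME = "auth_user"       # hypothetical cookie name


def signed_cookie_header(username, secret=SECRET):
    """Build a Set-Cookie value vouching for `username`.

    How the caller decided the user is who they claim to be (password form,
    single signon, whatever) is entirely up to the caller.
    """
    signature = hmac.new(secret, username.encode("utf-8"),
                         hashlib.sha256).hexdigest()
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = f"{username}:{signature}"
    cookie[COOKIE_NAME]["path"] = "/"
    cookie[COOKIE_NAME]["httponly"] = True
    return cookie[COOKIE_NAME].OutputString()


if __name__ == "__main__":
    # A CGI-style login script would emit this after checking credentials.
    print("Set-Cookie: " + signed_cookie_header("alice"))
```

Note that no password appears anywhere in the cookie; only the username and a signature over it.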
Unfortunately there's a big part I haven't figured out -- how to signal that a login is required. Typically this would be done with a 401 response, but I have yet to figure out how to intercept such a response. ErrorDocument lets you give a page, like a login screen, but you can't replace the 401 response with a 200 response while you are doing that. The ErrorDocument documentation warns about using fully-qualified URLs, because doing so will erase the login information -- but sadly I cannot blissfully ignore that (desiring this thing they consider a bug) because Apache 2 ignores such an ErrorDocument directive entirely.
I feel like I'm close. Certainly the solution may need refactoring -- for instance, I'm not comfortable with the overhead of mod_python for such a small task, and I might want to rewrite it in C. But if I could solve that 401 problem in any way at all I'd feel confident about the direction.
I have some rough code at http://svn.colorstudy.com/home/ianb/apache/authen.py
Again -- critique or alternate solutions are very much welcome.
Update: Some (still incomplete) followup on this is here
Can you not simply redirect to a common login URL if a login is required? If the login is OK then it could redirect you back to the original page.
Sending the 401 HTTP response to me implies that you're using HTTP authentication, which you're clearly not wanting to use. I guess the desire you have is to send the correct HTTP header, but I'm thinking that 401 is only the correct header if you plan on using the rest of the HTTP headers for authentication.
I could, but that means Apache authentication wouldn't work right (since it always throws a 401). I added something to add another mod_python directive that indicates if a location is protected, and redirect if it is and the user isn't logged in; that does work, but doesn't feel right to me. Also, I would like for there to be a single protocol for requiring authentication, and 401 kind of is that protocol. Though I suppose I could add something like a special environmental variable, and make applications redirect to the URL given in that variable if they require login.
There might also be a handler that I'm missing that could rewrite the status of the response.
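For what it's worth, the redirect-and-return idea discussed above could be sketched roughly like this. The login URL and the "came_from" parameter name are made up for the example; in the environment-variable variant, the login URL would come from that variable instead of a constant.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

LOGIN_URL = "/login"  # hypothetical; could instead be read from an env variable


def login_redirect(original_url, login_url=LOGIN_URL):
    """Location header value for sending an anonymous user to the login page."""
    return f"{login_url}?{urlencode({'came_from': original_url})}"


def return_target(query_string, default="/"):
    """Where the login page should send the user after a successful login.

    Only same-site paths are accepted, so the parameter can't be abused to
    bounce users to another host.
    """
    target = parse_qs(query_string).get("came_from", [default])[0]
    parts = urlsplit(target)
    if parts.scheme or parts.netloc or not target.startswith("/"):
        return default
    return target
```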
You also touched on the fact that all your systems have to implement their own authorization (not to be confused with authentication) systems for permissions. I hate to say it, but it sounds like maybe LDAP is the way to go for that.
I haven't really seen many cases when LDAP seemed like a good solution -- though I've never tried it, so maybe I'm just talking myself out of learning something new. (I have tried to learn it, but it seems like a PITA)
However, keeping user information centrally managed really isn't a very good option. If I was doing intranet design for a single company I could make that kind of decision, but we're making websites for clients that already have IT departments with their own policies, and I don't want to compete with those departments. Nor do I want to take over their jobs ;) I need to find common protocols, not concrete data stores.
I think LDAP is definitely the correct solution for you here, although it is a pita if you haven't used it before. It is supported by pretty much everything however, and once you've got it set up it is very straightforward to administer.
It also handles one of the things that I don't think the other schemes you have discussed do, which is additional data associated with a user. Things like their real name, while not essential to authentication, are very useful to centralise, rather than repeating them ad infinitum in every new application that requires them.
I'd say it is worth the learning curve for LDAP - since I got the hang of it I have found it a useful tool in a number of situations where previously I would have chosen something else, but the LDAP fit is actually far better.
On your second point, that IT already have a central database of user information, I think LDAP is still the right answer. First, if they are a Microsoft shop then it is quite likely they use Active Directory, which supports LDAP. You can authenticate directly against that and bob's your uncle.
If they have some other user source (NIS perhaps) then you still have the issue of duplicating data somewhere, however you do this. At least with a central store that uses a rich and well-supported protocol you only have to handle reconciliation/import/merge or whatever policy you come up with at a single point.
Ian, have you considered using a standard single sign-on framework? This page has links to most of them: http://openportalguard.sourceforge.net/wiki/index.php/Links/HomePage , another one is http://www.openfusion.com.au/labs/mod_auth_tkt/# Ksenia
I've looked at a few of these, and I'm definitely interested, but I don't think it really applies on the smaller scale I'm thinking of. By centralizing the logins between applications it would allow us to use a single signon framework (without heroic effort), because it would only require updating one login method. But I don't think a single signon framework would be an expedient way to deal with the simpler problem of taking out authentication from individual web applications. I'm really looking for some minimal and logical language-independent authentication protocol -- which is why I'd like to use HTTP/CGI's conventions of $REMOTE_USER and the 401 response code. (And of course 403 Forbidden for permission denied -- unlike Zope with its wildly inappropriate use of 401 for permission denied.) Really the only reason I want it to be "logical" is because I feel that's a sign that the interface is going to continue to make sense in the future. And I hate implementing crufty workarounds, and I certainly hate modifying third-party products to add crufty workarounds, so it's psychological as well ;) No matter what this is going to require modifying other people's packages.
One thing to note with the technique I present is that it doesn't involve passwords anywhere, unlike many authentication systems. This property would be essential to using something like single signon, among other authentication methods, and I think it's a sign that it provides proper separation of responsibilities.
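From the application's side, the $REMOTE_USER / 401 / 403 convention described above might look something like this small WSGI sketch. The ADMINS set and the /admin path rule are invented for the example; the point is only that the application never sees a password and never draws its own login form.

```python
ADMINS = {"alice"}  # hypothetical permission data


def app(environ, start_response):
    """Tiny WSGI app that relies on the front end for authentication."""
    user = environ.get("REMOTE_USER")
    path = environ.get("PATH_INFO", "")
    if not user:
        # Not logged in: ask for a login. Strictly, a 401 is supposed to
        # carry a WWW-Authenticate header; it is left out here on the
        # assumption that the front end intercepts the response.
        status, body = "401 Unauthorized", b"Login required\n"
    elif path.startswith("/admin") and user not in ADMINS:
        # Logged in but not allowed: 403, not 401.
        status, body = "403 Forbidden", b"Permission denied\n"
    else:
        status, body = "200 OK", f"Hello, {user}\n".encode("utf-8")
    start_response(status, [("Content-Type", "text/plain"),
                            ("Content-Length", str(len(body)))])
    return [body]
```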
Have you looked at this already?
I have, but it requires username/password authentication, and that's part of what I want to avoid. It's also tied to HTTP authentication. I want to authenticate the request, not the username and password sent to me in one header. In various ways every mod_auth_* handler I found seemed deficient, mostly because of this. mod_auth_cookie (http://www.live-data.co.uk/development_folder/mod_auth_cookie_doc/view) kind of gets around this, but has other problems.
Have you looked at pubcookie? It splits things out a little further (in that there's a common authentication page that provides the signed cookies all the way out to the user's browser... I think it only makes sense when you have a large institution behind a common namespace, and have lots of diversely-managed applications that you want controlled by a single mechanism, but it also means that you can be arbitrarily sophisticated at the pubcookie login page, too, without changing any apps at all.) It isn't quite what you're trying to do, but might be worth a look.
Did you look into a WebISO solution like pubcookie or cosign? I'm not sure if these would work for your situation though.# anonymous
Oops I didn't notice any comments until after I submitted that comment. Guess I don't know exactly how this wiki works. Sorry.