I’m blogging about the development of a new product in Mozilla; look here for my other posts in this series.
I teeter between thinking big about PageShot and thinking small. The benefit of thinking small is: how can this tool provide value to people who wouldn’t know if it would provide any value? And: how do we get it done?
Still, I can’t help but think big too. The web gave us this incredible way to talk about how we experience the web: the URL. An incredible amount of stuff has been built on that: search and sharing and archiving, and ways to draw people into content and let people skim. Indexes, summaries, APIs, and everyone gets to mint their own URLs and accept anyone else’s URLs, pointing to anything.
But not everyone gets to mint URLs. Developers and site owners get to do that. If something doesn’t have a URL, you can’t point to it. And every URL is a pointer, a kind of promise that the site owner has to deliver on, and sometimes doesn’t choose to, or they lose interest.
I want PageShot to give a capability to users, the ability to address anything, because PageShot captures the state of any page at a moment, rather than an address from which someone else can try to recreate that page. The frozen page that PageShot saves is handy for things like capturing or highlighting parts of the page, which I think is the feature people will find attractive, but that’s just a subset of what you might want to do with a snapshot of web content. So I also hope it will be a building block. When you put content into PageShot, you will know it is well formed, you will know it is static and available, and you can point to exact locations and recover those locations later. And all via a tool that is accessible to anyone, not just developers. I think there are neat things to be built on that. (And if you do too, I’d be interested in hearing your thoughts.)
Comments
If I am understanding right, PageShot is a web clipper tool with a sharing feature built in? As Firefox is landing the reading list feature, maybe this is a good chance to integrate this tool with it by allowing users to insert page shots into the reading list and share them.
This sounds like a powerful tool to me as a teacher. Help me to better see this product/idea, please. Where is this PageShot stored? Is it an HTML file on one's file system (with info on both the whole page and the exact area selected)? What applications are able to view it? How does one share it? Additionally, would it be 'easy' to build meta tools to analyze the content of the PageShot (such as a word count/wordle tool? Maybe I'm asking about an API?)
The shot is uploaded to a server, at a hard-to-guess URL. That page on the server contains the clips you've made at the top, and then the full page below, with the ability to navigate to those clips inline. You need Firefox (and the addon) to create a shot, but you can view the shot on any browser.
Right now we don't have any access control, other than that you shouldn't share the link with anyone you don't want to see it. So you can give the link to any service and it can fetch it and get all the metadata. The page is an HTML document, but it is also well formed, and the analyzing software is sure to see the same thing the user sees. The clips are images or text, but in the case of images we also determine the text most likely contained in the image, so you could still do things like text search across those image clips.
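To make the text-search idea a little more concrete, here is a rough sketch. The `Clip` structure and the `search_clips` function are hypothetical names invented for illustration, not PageShot's actual data model; the only thing taken from the description above is that every clip, image or text, has some text associated with it.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """Hypothetical stand-in for a clip; not PageShot's real schema."""
    kind: str  # "image" or "text"
    text: str  # the clip's text, or the text recovered from an image clip


def search_clips(clips, query):
    """Return the clips whose text contains the query, ignoring case."""
    needle = query.casefold()
    return [clip for clip in clips if needle in clip.text.casefold()]
```

Because image clips carry their recovered text in the same field as text clips, one search function could cover both kinds.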
I've thought it would be nice if there was an API where you could ask that a service be informed about all or some of your shots. Then you could automatically send all that information on to someone else. To make it a bit more interesting, that service could then add its own annotations. E.g., a word count app could accept shots, do a word count, then add that count as an annotation. Right now each tidbit of information shows up as a clip or comment, but if other use cases emerge I'm sure we could find more appropriate ways to show other information. Maybe another example of a tool that could be used this way: https://github.com/mozilla-... – you can imagine reverse image search or other "detective" tools being applied to a web page.
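As a sketch of what the word count example might look like, assuming a shot arrives as a dictionary with a `clips` list whose entries each have a `text` field: that shape, the `word_count_annotation` name, and the fields of the returned annotation are all made up for illustration, since no such API exists yet.

```python
import re


def word_count_annotation(shot):
    """Build a comment-style annotation holding a rough word count.

    `shot` is assumed (hypothetically) to look like
    {"clips": [{"text": "..."}, ...]}.
    """
    texts = [clip.get("text", "") for clip in shot.get("clips", [])]
    # Rough count: every run of word characters is one word,
    # so a contraction like "don't" counts as two.
    count = sum(len(re.findall(r"\w+", text)) for text in texts)
    return {
        "type": "comment",  # assumed annotation type
        "source": "word-count",  # assumed name identifying the service
        "text": f"Word count: {count}",
    }
```

A service like this would receive each shot, compute the annotation, and send it back to be shown alongside the existing clips and comments.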
Of course it's annoying to set up third-party services to do little annotations, so I also wonder if there are opportunities to have community-provided analysis tools that can run locally or on our servers. I have to poke around more to find out about the state of the art in JavaScript sandboxing.