A... is for A New Beginning
Until recently, Alexa's website popularity rankings were based solely on data taken from users of its browser toolbar. The toolbar's install base was treated as a representative sample, and its users' habits were extrapolated to estimate the relative popularity of different websites.
But that all changed in April 2008 with the news that the company was casting its net wider to take other sources into account. The metrics you now see when you profile a website on Alexa.com are still taken largely from toolbar users, but Alexa now also "aggregates data from multiple sources" (which ones, nobody knows), including a vague and misty "global panel". How exciting, we say.
B... is for Bot for Hire
By 2005 Alexa's crawler was one of the Web's biggest, hitting 4 to 5 billion pages a month and archiving 1 terabyte of data a day (it’s now up to 1.6 terabytes per day, and a total of 4.5 billion pages from over 16 million sites). In a bold move designed to create a whole new market, it announced that it would open its crawler to requests from anyone willing to pay.
The Alexa Web Search Platform, which enabled programmatic access to Alexa's web search engine, briefly threatened to spark a search revolution... and was then sucked into Amazon Web Services.
C... is for Competitors
...in the website rankings space, like Compete and comScore. According to Compete's own measurements, it now has the upper hand on Alexa (below). More notable is the Quantcast graph, but the 'People Count' metric may be polluted somewhat by Quantcast's innovative model (feel free to fill us in below). Quantcast gets its figures for opted-in websites via direct measurement, and we covered the model in some depth earlier this month (and here). The rest use toolbars, panels, random irritating phone calls, data from ISPs and a variety of other methods to get their traffic figures.
Compete sold for 'up to' US$150 million earlier this year. Quantcast, the new kid on the block, has raised US$20 million in funding. comScore floated in 2007, got into a big row with Google about paid clicks, and has a current market cap of US$650 million.
D... is for Alexa Directory
Yes, they've got a directory too, which is in fact a direct copy of DMOZ, the Open Directory Project. Alexa has added the ability to "sort categories by Popularity and Avg. Review Rating, and you can access all of Alexa's helpful site information just by clicking on the title of any site." Find the Alexa Directory here.
E... is for Egypt
...which is where the Library of Alexandria stood, from which Alexa gets its name. One of the marvels of the ancient world, the Library was by far the greatest collection of recorded information in olden times, storing so many archives that, it is said, it would take the length of three SteveNotes speeches to read them all. The Library was eventually destroyed (no one is entirely sure how), so the conjecture will likely remain untested.
F... is for Flat File
CMSWire.com Useless Geek Fact of the Day #10009: Alexa uses flat files, not a SQL database, for analysis and storage. Apparently, for massive data sets, flat-file is the only show in town. More on this fascinating topic on O'Reilly Radar. Flat-file is an emotive topic; try not to step in the puddles of blood over there.
Meanwhile, if you really want to see what a flat-file bloodbath is all about, have a look in the comments section of this, for what is surely the greatest comments-war Content Management has ever seen.
G... is for Bruce Gilliat
Bruce is the co-founder of Alexa along with Brewster Kahle. He studied at MIT, Golden Gate and Berkeley, and worked in fibre-optics before teaming up with Kahle at WAIS Inc. in 1994 and jumping onto the roller-coaster that became Alexa (see V). He remained with Alexa after the 1999 takeover (see J), served as CEO, and left the subsidiary in 2007. Gilliat has sat on various boards and seems to keep pretty quiet.
H... is for Hornbaker, Ron
Hornbaker got into a legal tangle with Amazon Corp. in 2006 after starting a website called Alexaholic.com, on which he displayed Alexa traffic graphs and ‘…misappropriated the Alexa name’. Hornbaker disagreed with this assessment, and Alexa vs Hornbaker duly went to the lawyers. Hornbaker was ultimately forced to rejig his service and rebrand it as Statsaholic.
I... is for Internet Archive
IA is a non-profit, started by Alexa’s Brewster Kahle in 1996, that maintains an online library and archive of Web and multimedia resources.
Its explicit mission is to "help preserve [Web] artifacts and create an Internet library for researchers, historians, and scholars." The Wayback Machine is a service created by the Internet Archive that lets the public view snapshots of websites as they were on a given date. Wayback gets its data from Alexa’s crawler, and “As of 2006 … contained almost 2 petabytes of data and was growing at a rate of 20 terabytes per month, a two-thirds increase over the 12 terabytes/month growth rate reported in 2003.” (Wikipedia).
Check out the very first recorded edition of CMSWire, which ran off Movable Type 2.64.
Internet Archive is based just down the road from Alexa at the Presidio, SF (see P), in a strikingly beautiful old Colonial-era building.
Internet Archive HQ