I beg my readers’ indulgence for a moment, while I say for the record that I am honored to call TechCrunch correspondent Ron Miller a friend and colleague. He and I have worked together, and he is among the finest web journalists working today.
As my other friends and colleagues will attest, that’s what I generally say before I drop-kick one of their op-eds.
Technically, I cannot rebut Ron’s TechCrunch piece from last week, titled “The Internet is Failing the Website Preservation Test.” That’s because he’s right: “If the Internet is at its core is a system of record,” he wrote, “then it is failing to complete that mission.”
Ron goes on to say that, when it no longer suits a publisher’s purposes to maintain old content online, it disappears forever. Again, he’s right. I’ve written as much myself, in a 2012 op-ed that, in a moment of self-torture, Ron may have read.
It hurts that the legacy we online writers think we stand upon, when we talk about our experience, is composed mostly of memory, and not the electronic kind. We became writers because we wanted to contribute something to the world. Those who came before us, who contributed something to the airwaves, at least found a medium more permanent than we did.
Ron is right: If the Internet has an obligation to persist our words into the infinite archives of the future, then it has already failed.
The Internet never had that obligation. Not for one solitary day in its life.
The Internet is a medium of communication. It is a system of connections that facilitate the exchange of information, hopefully the good kind. It’s not a system for retaining information, but rather for distributing it.
To the extent that the Internet is an extension of our intelligence, that’s perhaps the scariest part of all: It reminds us, every day, of how much intelligence we must yet attain, for us to deserve whatever effort was expended in putting us here.
But maybe Ron really meant the web.
Many folks use the words “Internet” and “web” interchangeably. The web, as originally conceived, was a system of cross-referenced electronic documents. As an ideal, in people’s dreams (including my own), it became a lexical snapshot of “now,” the culmination of everything worth saying by anyone to everyone else in the rest of the world.
That the web must still be so in Ron’s mind is actually to his credit.
But it’s time for us to be practical. This is not what the web has become.
From a technical perspective, the web is the system of protocols with which servers exchange information over the Internet. This system is actually becoming quite good, frankly to my surprise. It facilitates the intercommunication between disparate processes all over the planet. It makes the cloud possible and the smartphone feasible.
This probably isn’t the web Ron meant, though.
He meant the medium for the publication of content like what you’re reading now, the “C” in “CMSWire”: a file, an electronic resource, a response to a database query. Characters on pages, digits in memory buffers, octets in messages.
Imagine having to archive every one of those things.
You think big data is big? Think of every snapshot of the web at every conceivable moment in time, over and over and over. As many of the big data vendors ask you today, how are you going to take control of that much data, all at once?
The answer, as much as it hurts us to admit it, and as politically incorrect as it may be at the most inopportune of moments, is to get rid of most of it.
A true archive of every moment of the web’s history from this time forward is not possible, if we continue to think of it as snapshots of static electrons representing the data that comprises content. It’s like a many-universes interpretation of physics; despite the fact that it explains otherwise inexplicable phenomena, it can’t be real.
Yet hypothetically, maybe, we could recreate the web for any moment in time, if we were to simplify and reproduce its underlying mechanisms. We wouldn’t have to snapshot every moment and replicate every bit, just the parts that changed, when they changed.
Think about how the MPEG format compresses movies, encoding only the movements and not the parts that hold still, and you might get the idea.
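For the technically inclined, the idea of keeping only the parts that changed, when they changed, can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the class name DeltaArchive, its methods, and the sample paths are hypothetical, pages are treated as plain strings keyed by path, and none of the web’s actual logic (paywalls, scripts, third-party servers) is modeled.

```python
from bisect import bisect_right


class DeltaArchive:
    """Hypothetical sketch: store only what changed, when it changed."""

    def __init__(self):
        self._times = []    # timestamps of recorded deltas, ascending
        self._deltas = []   # one {path: body-or-None} dict per timestamp
        self._current = {}  # latest observed state, used to compute deltas

    def record(self, timestamp, pages):
        """Take a full observation, but keep only the differences."""
        if self._times and timestamp <= self._times[-1]:
            raise ValueError("timestamps must be strictly increasing")
        # Pages that are new or whose body differs from the last observation.
        delta = {path: body for path, body in pages.items()
                 if self._current.get(path) != body}
        # None marks a page that has disappeared since the last observation.
        for path in self._current:
            if path not in pages:
                delta[path] = None
        if delta:
            self._times.append(timestamp)
            self._deltas.append(delta)
            self._current = dict(pages)

    def state_at(self, timestamp):
        """Rebuild the pages as they stood at the given moment."""
        state = {}
        upto = bisect_right(self._times, timestamp)
        for delta in self._deltas[:upto]:
            for path, body in delta.items():
                if body is None:
                    state.pop(path, None)
                else:
                    state[path] = body
        return state

    def stored_entries(self):
        """Count the page versions (and removals) actually kept."""
        return sum(len(delta) for delta in self._deltas)


archive = DeltaArchive()
archive.record(1, {"/a": "hello", "/b": "world"})
archive.record(2, {"/a": "hello", "/b": "world!"})  # only /b changed
archive.record(3, {"/a": "hello"})                  # /b disappeared
print(archive.state_at(2))
print(archive.stored_entries())
```

In this toy run, three full snapshots would have held five page copies, while the delta log holds four entries. The saving grows with the share of pages that hold still between observations.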
But this would require us to do something we can’t quite do today: comprehend the web’s logic.
Consider all the paywalls you’d have to reconstruct, the 404s you’d have to log and re-enact on demand, the fail-whales you’d have to redraw, the Flash videos whose multiple means of crashing you’d need to simulate, the malware and viruses you’d have to send and send and send again.
What’s more, imagine all the complaints you’d have to respond to, from people saying you got some almost imperceptible aspect of it wrong.
The web, you see, is beyond the scope of a content management system. It’s the complete dynamics of information exchange between every server in the process. Few web pages are the product of single servers on exclusive domains any more, certainly not this one.
Besides, folks don’t pay much attention to the infrastructure that supports the publication of these articles, even when “CMS” is in the name of the publication.
Surely that’s not the web Ron meant, though. He probably meant the good part, the part with his byline and, with a bit of luck, mine.
But even that scintilla of occasional characters, a few flashes of light over a million miles of optical cable and a continent filled with data centers, rests upon a mountain of logic and a bottomless pit of code.
Are you ready to make sense of this entire torrential tangle of data?
That’s what I thought. I can’t say I blame you.
Bylines and By-Products
“Nobody should experience what I did the other day,” wrote Ron Miller, “going to look for a page that’s only a couple of years old, and finding that it’s disappeared.”
I get it. The impermanence of this medium can be frustrating and depressing, especially for those of us who keep 30-year-old printed articles with our bylines in safekeeping.
Last year, I had the good fortune of being printed in a magazine again, after a 23-year absence from between soft covers. It felt so good to be able to hold the instrument of my propagation in my hand, to feel the gloss, to smell the still-drying chemicals, to hear the thunder of the press in my mind.
It felt real again. The web can be so imaginary by comparison.
I thought for a moment how my parents would have felt to see their son’s name on a piece of paper after so long. Mom would have thumb-tacked me to her easel. Dad might have framed me and hung me in his office. You see, I still think of my byline as “me;” we magazine veterans refer to our work in the first person.
And then this occurred to me: Besides my parents, how many dozens of coffee tables and recycle bins in the world have the glorious honor of supporting “me” — that is to say, my byline, coupled with a few thousand words about some technology no one will care about in 2017?
And when it comes time for the janitor to replace “me” with this month’s edition or with Vogue or People, is he duty-bound to donate “me” to some institute for archival preservation?
The illusion of print being more permanent than the web is given substance by romanticists like me. We tend to think information is only grounded in reality with ink. Yet so much of what we’ve set to paper has already been burned.
We can still print books, but if you think about it, it doesn’t matter how. We’ll be able to give e-books to our children and grandchildren, without imposing upon Amazon or Apple the duty of preserving master copies.
What history we do have is not the product or by-product of paper or wires or webs, but the effort of people. Information only exists in our minds; it does not inform the airwaves or an undersea cable.
Databases have no sworn duty to their data. However we encode what we know, or what we think we know, the responsibility for its perpetuity belongs to each of us individually, and cannot be automated.
The only lasting impact we make upon this world is the impression we leave in people’s minds. If we have done that much, and done it well, then we can trust our progeny to sustain us. That is to say, to sustain those words to which we affix our names.
The duty to preserve knowledge is ours and ours alone. Not the web’s.
If Ron needs to remember what he wrote two years from now, he can call me.
Title image by Glen Noble.