theconversation.com
151.101.2.110 

URL: http://theconversation.com/your-internet-data-is-rotting-115891
Submission: On May 16 via automatic, source hackernews

Form analysis 4 forms found in the DOM

/search

<form action="/search">
  <input type="text" name="q" placeholder="Search analysis, academics&#x2026;">
  <button type="submit">
    <span class="icon-search"></span>
  </button>
</form>

GET /uk/search

<form class="masthead-search" action="/uk/search" accept-charset="UTF-8" method="get">
  <input name="utf8" type="hidden" value="&#x2713;">
  <fieldset>
    <legend>Search</legend>
    <div class="row">
      <div class="input-wrapper">
        <label for="q"><i class="icon-search"></i></label>
        <input type="text" name="q" id="q" value="" placeholder="Search analysis, research, academics&#x2026;"> </div>
      <button type="submit" class="button" value="Search"></button>
    </div>
  </fieldset>
</form>

GET /experts/search

<form class="for formtastic expert_search_form" novalidate="novalidate" id="new_expert_search_form" action="/experts/search" accept-charset="UTF-8" method="get">
  <input name="utf8" type="hidden" value="&#x2713;">
  <fieldset class="inputs">
    <ol>
      <li class="string input required stringish" id="expert_search_form_term_input">
        <label for="expert_search_form_term" class="label">Find experts with
          <span class="nobr">knowledge in:</span>
          <abbr title="required">*</abbr>
        </label>
        <input placeholder="e.g. Cyber Security" id="expert_search_form_term" type="text" name="expert_search_form[term]"> </li>
    </ol>
  </fieldset>
  <fieldset class="actions">
    <ol>
      <li class="action input_action " id="expert_search_form_submit_action">
        <input type="submit" name="commit" value="Search" class="find-experts button large primary">
      </li>
    </ol>
  </fieldset>
</form>

POST /subscriptions

<form action="/subscriptions" method="post" class="subscription-newsletter-form">
  <input type="hidden" name="subscribe[location]" id="subscribe_location" value="footer">
  <input type="hidden" name="subscribe[newsletter_list_id]" id="subscribe_newsletter_list_id" value="2">
  <input type="submit" name="submit" value="Subscribe" class="subscribe button primary">
  <div class="field-wrapper">
    <label class="subscribe-email-label" for="subscribe_email"> Email address </label>
    <div class="subscription-newsletter-status"> <i class="success-icon">&#x2714;</i> <i class="icon-delete failure-icon"></i>
      <img class="spinner-icon" src="/assets/spinner-9643e2633c59d728d78b58f465f2fb9c.gif" alt="Spinner"> </div>
    <input type="email" name="subscribe[email]" id="subscribe_email" value="" placeholder="Your email address" class="email" spellcheck="false"> </div>
</form>

Text Content

Your internet data is rotting
May 15, 2019 11.47am BST
The internet is growing, but old information continues to disappear daily.
wk1003mike/shutterstock.com
Paul Royster, University of Nebraska-Lincoln
Author
Paul Royster
Coordinator of Scholarly Communications, University of Nebraska-Lincoln
Disclosure statement
Paul Royster does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.
Partners
University of Nebraska-Lincoln provides funding as a member of The Conversation US.
Many MySpace users were dismayed to discover earlier this year that the social media platform lost 50 million files uploaded between 2003 and 2015. 
The failure of MySpace to care for and preserve its users’ content should serve as a reminder that relying on free third-party services can be risky.
MySpace has probably preserved the users’ data; it just lost their content. The data was valuable to MySpace; the users’ content less so.
What happened to MySpace
MySpace is a social networking site where performers can upload music or other content for access and distribution to its user community. It has always been a free site, with revenues coming from ads and programming that targets users for specific products.
Formed in 2003 in imitation of the social gaming site Friendster, MySpace grew rapidly and was purchased by Rupert Murdoch’s News Corporation in 2005. By 2008, MySpace was the leading social networking site, valued at one time at US$12 billion. But it declined in popularity – thanks to an overprevalence of ads, concerns about exposure of minors to sexual content and other issues. In 2011, News Corporation sold MySpace to Specific Media, which sold it again in 2016 to Time Inc., which was in turn bought by the Meredith Corporation in 2018.
So the company went through three changes in ownership over a 12-year period, and saw revenues and membership drop precipitously over that time. One sale might be fine, but three sales over a short term suggest to me a troubled business that was not in a good position to watch over others’ intellectual property.
Anyone using MySpace as a storage service who did not have an alternate backup is simply out of luck. You left your intellectual property sitting beside the information superhighway, and when you came back 10 years later it was gone.
MySpace is not alone in encountering problems. Amazon cloud services, for example, also experienced a substantial outage in 2011 and another in 2017. Though temporary, and without actual loss of data, these outages left users without access to precious and important files for some time.
In a statement, Myspace said, ‘We apologize for the inconvenience.’
chrisdorney/shutterstock.com
A much bigger problem
Preserving content or intellectual property on the internet presents a conundrum. If it’s accessible, then it isn’t safe; if it’s safe, then it isn’t accessible. 
Accessible content is subject to tampering, theft or other sorts of bad actions. Only content that is inaccessible can be locked and protected from hacking.
The internet currently accesses about 15 zettabytes of data, and is growing at a rate of 70 terabytes per second. It is an admittedly leaky vessel, and content is constantly going offline to wind up lost forever. 
Massive and desperate efforts are underway to preserve whatever is worth preserving, but even sorting out what is and what is not is itself a formidable undertaking. What will be of value in 10 years – or 50 years? And how to preserve it? 
Acid-free paper can last 500 years; stone inscriptions even longer. But magnetic media like hard drives have a much shorter life, lasting only three to five years. They also need to be copied and verified on a very short life cycle to avoid data degradation at observed failure rates between 3% and 8% annually. 
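To make those failure rates concrete, here is a rough sketch in Python of what a 3% to 8% annual failure rate implies over a five-year span. It assumes, purely for illustration, that failures are independent and that the rate stays constant from year to year; real drives do not behave this neatly, and the function names are hypothetical.

```python
def survival_probability(annual_failure_rate: float, years: int) -> float:
    """Chance that one drive is still working after `years`.

    Assumption (not from the article): failures are independent from year
    to year and the annual rate stays constant.
    """
    return (1.0 - annual_failure_rate) ** years


def loss_probability(annual_failure_rate: float, years: int, copies: int = 1) -> float:
    """Chance that every one of `copies` drives has failed after `years`.

    Assumption: the copies fail independently and are never replaced.
    """
    single_drive_loss = 1.0 - survival_probability(annual_failure_rate, years)
    return single_drive_loss ** copies


if __name__ == "__main__":
    for rate in (0.03, 0.08):  # the observed annual failure range quoted above
        for copies in (1, 3):
            p = loss_probability(rate, years=5, copies=copies)
            print(f"rate={rate:.0%} copies={copies}: loss after 5 years = {p:.2%}")
```

Under these assumptions, a single unattended drive has roughly a 14% to 34% chance of failing within five years, while three independent copies bring the chance of losing everything down to about 4% or less.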
Then there is also a problem of software preservation: How can people today or in the future interpret those WordPerfect or WordStar files from the 1980s, when the original software companies have stopped supporting them or gone out of business?
A nonprofit startup called The Internet Archive is preserving snapshots of the web on an ongoing basis, but mostly this is for top-level public HTML webpages such as The New York Times website and Facebook, not for underlying content files. As of last fall, its Wayback Machine held over 450 billion pages in 25 petabytes of data. This would represent 0.0003% of the total internet.
Universities, governments and scientific societies are struggling to preserve scientific data in a hodgepodge of archives, such as the U.K.’s Digital Preservation Coalition, MetaArchive, or the now-disbanded collaborative Digital Preservation Network.
Preservation is hard and expensive in time, money and equipment. To be most useful, content has to be not only stored, but also hosted in a form that is accessible and available for future reuse.
Actual storage costs less than $0.05 per gigabyte, but storage is only a small percentage of the costs of preservation. Acquisition, networking, maintenance and administration all require substantial and costly human labor. 
Budgeting models suggest a 10-year preservation expense of around $2.50 per gigabyte, or $2,500 per terabyte, or $625,000 for the files MySpace failed to preserve.
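The arithmetic behind those budget figures can be checked in a few lines of Python. This is only a sketch: it assumes decimal units (1,000 gigabytes per terabyte, which is what the $2,500 figure implies), and the 250-terabyte total and the average file size are derived here from the numbers quoted, not reported by MySpace.

```python
COST_PER_GB_10_YEARS = 2.50      # dollars, the budgeting figure quoted above
GB_PER_TB = 1_000                # assumption: decimal units
MYSPACE_LOST_FILES = 50_000_000  # files MySpace reportedly lost


def preservation_cost(gigabytes: float) -> float:
    """Ten-year preservation cost in dollars for the given volume."""
    return gigabytes * COST_PER_GB_10_YEARS


def implied_volume_gb(total_cost: float) -> float:
    """Volume in gigabytes that a given ten-year budget would cover."""
    return total_cost / COST_PER_GB_10_YEARS


if __name__ == "__main__":
    per_terabyte = preservation_cost(GB_PER_TB)
    volume_gb = implied_volume_gb(625_000)
    average_file_mb = volume_gb * 1_000 / MYSPACE_LOST_FILES
    print(f"Cost per terabyte over 10 years: ${per_terabyte:,.0f}")
    print(f"Volume implied by $625,000: {volume_gb / GB_PER_TB:,.0f} TB")
    print(f"Implied average file size: {average_file_mb:.0f} MB")
```

In other words, the $625,000 estimate corresponds to about 250 terabytes, or an average of around 5 megabytes for each of the 50 million lost files, which is plausible for a collection made up largely of music uploads.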
Huge amounts of new content are uploaded to the internet every day.
Fingon ss/shutterstock.com
Considering your own data
So yes, the internet is rotting, but archivists and digital librarians like myself knew it was rotten already, as did anyone who ever got a “404 File Not Found” error. 
Where there is economic incentive to keep and use data – such as user information, profiles or browsing history – it may exist for quite a long time. It has been said by many that data is the new oil, and corporations are anxious to drill and exploit this resource. 
However, where content is less valuable to whoever owns the servers, there is less incentive to invest in preserving it. A survey of 10 million hits from 25 random sites in 2004 suggests that 404 errors occur at close to 3% of targeted URLs. The internet is growing much faster than it is rotting, but both things are happening at once. No giant internet company has your interests closer to its heart than its own.
One preservation network is known under the acronym LOCKSS – Lots of Copies Keeps Stuff Safe – and that’s a good rule of thumb. Always have a backup, and always have multiple backups. Guard your privacy and guard your content, at least that content you may wish to have preserved, like photos, email, that screenplay or novel, or video and music files. Copyright rules do not prohibit storing content you may have purchased, as long as you don’t put it out for public sharing. 
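Keeping several copies only helps if you occasionally check that they still match. The following Python sketch shows one simple way to do that with checksums from the standard library. It is not the LOCKSS protocol itself, just an illustration of the same idea; the function names are hypothetical, and it assumes that the checksum shared by most copies is the correct one.

```python
import hashlib
from collections import Counter
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, read in chunks to save memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_damaged_copies(copies: List[Path]) -> List[Path]:
    """Return the copies that are missing or disagree with the majority.

    Assumption: the checksum held by most copies is the good one. If there
    is a tie, the checksum seen first wins, so two copies alone cannot tell
    you which one is damaged.
    """
    checksums = {path: sha256_of(path) if path.is_file() else None for path in copies}
    present = [value for value in checksums.values() if value is not None]
    if not present:
        return list(copies)  # every copy is gone
    majority, _count = Counter(present).most_common(1)[0]
    return [path for path, value in checksums.items() if value != majority]
```

Run against, say, a laptop copy, an external drive and a cloud folder, the function returns whichever copies need to be replaced from a good one. Three or more copies are needed for the majority vote to be meaningful.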
Free storage is a great offer, but sometimes you only get what you pay for. The internet is neither secure nor permanent. It never promised to be, and users should not assume that it will become so. Parts are rotting and corroding and collapsing as I type this. Just hope and plan to not be resting on that platform when it falls.
Social media
Internet
Data
Archives
Digital preservation
MySpace
Online archives
Copyright © 2010–2019, The Conversation Trust (UK) Limited