And despite decades of writing programs, mainly for my personal research, my experience of computing is non-technical: traditional media business and corporate.

I don't understand, unless maybe I was gushing somewhat. So you can nudge even the least hungry prospective customers towards your sales team.

Can someone enlighten me to a better appreciation of the moderation of my above comment?

The need for refreshing the media is so overlooked by businesses that otherwise have great practices. Oh, sorry, I misled you; I have to add what just now came to my mind: anything that encourages big companies to dig into really old data archives comes just when they are ripe for selling new bulk capacity. I wrote above about the delightful ideas this prompted, which made me forget about dinner last night, so I will stop here.

But I cannot help but feel in my gut that this is a real and viable project that can snowball and be a huge thing. Well, those donors are the ones who need the reassurance of a more [...] vendor. [...] just who might aggressively retain data for their own reasons, compliance and security being the usual needs. Of course that's window dressing, but my aim is to get big businesses to contribute more old data.

I mention Hitachi only because they have anointed Lucene in a Tier 1 product: if I ever need to provide an audit that is relied upon to attest that any sensitive data has been purged from the cache, I'm sure it will be acceptable in the format of such a high-end system's report.

Thanks for all your help with suggestions, everyone! [...] both because of the abundance of people doing similar extraction from comparable data stores, and because Lucene is a part of the Hitachi data suites, which are my business choice.

Gwern has a good summary of the research on this:

[...] discovered that about one link out of every 200 disappeared each week from the Internet. McCown et al. (2005) discovered that half of the URLs cited in D-Lib Magazine articles were no longer accessible 10 years after publication, and other studies have shown link rot in academic literature to be even worse (Spinellis, 2003; Lawrence et al., 2001). Nelson and Allen (2002) examined link rot in digital libraries and found that about 3% of the objects were no longer accessible after one year.

Bruce Schneier remarks that one friend experienced 50% link rot in one of his pages over less than 9 years (not that the situation was any better in 1998), and that his own blog posts link to news articles that go dead in days; Vitorio checks bookmarks from 1997, finding that hand-checking indicates a total link rot of 91%, with only half of the dead links available in sources like the Internet Archive; the Internet Archive itself has estimated the average lifespan of a Web page at 100 days. A Science study looked at articles in prestigious journals: they didn't use many Internet links, but when they did, 2 years later ~13% were dead. The French company Linterweb studied external links on the French Wikipedia before setting up their cache of French external links, and found, back in 2008, that already 5% were dead. (The English Wikipedia has seen a 2010-2011 spike from a few thousand dead links to ~110,000 out of ~17.5m live links.) The dismal studies just go on and on and on (and on). Even in a highly stable, funded, curated environment, link rot happens anyway. For example, about 11% of Arab Spring-related tweets were gone within a year (even though Twitter is, currently, still around).
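To get a feel for what a figure like "one link out of every 200 disappeared each week" implies over longer periods, here is a minimal Python sketch. It assumes, purely for illustration, that the weekly loss rate stays constant and applies equally to every link; none of the studies quoted above claims that, and the function names are hypothetical.

```python
import math


def surviving_fraction(weekly_loss: float, weeks: float) -> float:
    """Fraction of links still alive after `weeks`.

    Assumption (not from the quoted studies): every link dies with the
    same constant probability `weekly_loss` each week.
    """
    return (1.0 - weekly_loss) ** weeks


def half_life_weeks(weekly_loss: float) -> float:
    """Weeks until half the links are dead, under the same assumption."""
    return math.log(0.5) / math.log(1.0 - weekly_loss)


# "one link out of every 200 disappeared each week"
WEEKLY_LOSS = 1 / 200

if __name__ == "__main__":
    print(f"alive after 1 year:   {surviving_fraction(WEEKLY_LOSS, 52):.1%}")
    print(f"alive after 10 years: {surviving_fraction(WEEKLY_LOSS, 520):.1%}")
    print(f"half-life: {half_life_weeks(WEEKLY_LOSS) / 52:.1f} years")
```

Under that simplification, a 0.5% weekly loss compounds to roughly a quarter of links gone within a year. The different studies measure different populations of links, so their figures should not be expected to agree with a single constant rate.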