
I try to stay out of politics on this website. This post is not mainly about politics. It’s a call to action. We’re trying to do something rather simple and clearly worthwhile. We’re trying to create backups of US government climate data.
The background is, of course, political. Many signs point to a dramatic change in US climate policy:
• Oliver Milman, Trump’s transition: sceptics guide every agency dealing with climate change, The Guardian, 12 December 2016.
So, scientists are now backing up large amounts of climate data, just in case the Trump administration tries to delete it after Trump takes office on January 20th:
• Brady Dennis, Scientists are frantically copying U.S. climate data, fearing it might vanish under Trump, Washington Post, 13 December 2016.
Of course, saving the data publicly available on US government sites is not nearly as good as keeping climate programs fully funded! New data is coming in all the time from satellites and other sources. We need it, and we need the experts who understand it.
Also, it’s possible that the Trump administration won’t go so far as trying to delete big climate science databases. Still, I think it can’t be a bad thing to have backups. Or as my mother always said: better safe than sorry!
Quoting the Washington Post article:
Alarmed that decades of crucial climate measurements could vanish under a hostile Trump administration, scientists have begun a feverish attempt to copy reams of government data onto independent servers in hopes of safeguarding it from any political interference.
The efforts include a “guerrilla archiving” event in Toronto, where experts will copy irreplaceable public data, meetings at the University of Pennsylvania focused on how to download as much federal data as possible in the coming weeks, and a collaboration of scientists and database experts who are compiling an online site to harbor scientific information.
“Something that seemed a little paranoid to me before all of a sudden seems potentially realistic, or at least something you’d want to hedge against,” said Nick Santos, an environmental researcher at the University of California at Davis, who over the weekend began copying government climate data onto a nongovernment server, where it will remain available to the public. “Doing this can only be a good thing. Hopefully they leave everything in place. But if not, we’re planning for that.”
[…]
“What are the most important .gov climate assets?” Eric Holthaus, a meteorologist and self-proclaimed “climate hawk,” tweeted from his Arizona home Saturday evening. “Scientists: Do you have a US .gov climate database that you don’t want to see disappear?”
Within hours, responses flooded in from around the country. Scientists added links to dozens of government databases to a Google spreadsheet. Investors offered to help fund efforts to copy and safeguard key climate data. Lawyers offered pro bono legal help. Database experts offered to help organize mountains of data and to house it with free server space. In California, Santos began building an online repository to “make sure these data sets remain freely and broadly accessible.”
In Philadelphia, researchers at the University of Pennsylvania, along with members of groups such as Open Data Philly and the software company Azavea, have been meeting to figure out ways to harvest and store important data sets.
At the University of Toronto this weekend, researchers are holding what they call a “guerrilla archiving” event to catalogue key federal environmental data ahead of Trump’s inauguration. The event “is focused on preserving information and data from the Environmental Protection Agency, which has programs and data at high risk of being removed from online public access or even deleted,” the organizers said. “This includes climate change, water, air, toxics programs.”
The event is part of a broader effort to help San Francisco-based Internet Archive with its End of Term 2016 project, an effort by university, government and nonprofit officials to find and archive valuable pages on federal websites. The project has existed through several presidential transitions.
I hope that small “guerrilla archiving” efforts will be dwarfed by more systematic work, because it’s crucial that databases be copied along with all relevant metadata, and with some sort of cryptographic certificate of authenticity, if possible. However, getting lots of people involved is bound to be a good thing, politically speaking.
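To make the point about authenticity a bit more concrete, here is a minimal sketch of one ingredient: a manifest of SHA-256 checksums recorded when a dataset is copied, which anyone can later use to check that a mirror still matches. The function names and the manifest layout are my own assumptions, not anything the projects mentioned here have specified. Note also that a checksum only detects changes. It is not a certificate of authenticity on its own; for that, the manifest itself would have to be signed or published by a trusted party.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (as a relative POSIX path) to its size and digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(root).as_posix()] = {
                "bytes": path.stat().st_size,
                "sha256": sha256_of_file(path),
            }
    return manifest


def verify_manifest(root, manifest):
    """Return the relative paths that are missing or no longer match the manifest."""
    root = Path(root)
    mismatched = []
    for rel_path, entry in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of_file(target) != entry["sha256"]:
            mismatched.append(rel_path)
    return mismatched
```

The idea is to run `build_manifest` on the freshly downloaded copy, store the result next to the data and its metadata, and run `verify_manifest` on any later mirror. An empty list means every recorded file is still present and unchanged.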
If you have good computer skills, a good understanding of databases, or lots of storage space, please get involved. Efforts are being coordinated by Bethany Wiggin and others at the Data Refuge Project:
• PPEHLab (Penn Program in the Environmental Humanities), DataRefuge.
You can contact them at DataRefuge@ppehlab.org. Nick Santos is also involved, and if you want to get “more plugged into the project” you can contact him here. They are trying to build a climate database mirror website here:
• Climate Mirror.
At the help form on this website you can nominate a dataset for rescue, claim a dataset to rescue, let them know about a data rescue event, or help in some other way (which you must specify).
PPEHLab and Penn Libraries are organizing a data rescue event this Thursday:
• PPEHLab, DataRefuge meeting, 14 December 2016.
At the American Geophysical Union meeting in San Francisco, where more than 20,000 earth and climate scientists gather from around the world, there was a public demonstration today starting at 1:30 PST:
• Rally to stand up for science, 13 December 2016.
And the “guerrilla archiving” hackathon in Toronto is this Saturday; see below. If you know people with good computer skills in Toronto, get them to check it out!
To follow progress, also read Eric Holthaus’s tweets and replies here:
• Eric Holthaus.
Guerrilla archiving in Toronto
Here are the details:
Guerrilla Archiving Hackathon
Date: 10am-4pm, December 17, 2016
Location: Bissell Building, 4th Floor, 140 St. George St., University of Toronto
RSVP and up-to-date information: Guerilla archiving: saving environmental data from Trump.
Bring: laptops, power bars, and snacks. Coffee and pizza provided.
This event collaborates with the Internet Archive’s End of Term 2016 project, which seeks to archive the federal online pages and data that are in danger of disappearing during the Trump administration. Our event is focused on preserving information and data from the Environmental Protection Agency, which has programs and data at high risk of being removed from online public access or even deleted. This includes climate change, water, air, toxics programs. This project is urgent because the Trump transition team has identified the EPA and other environmental programs as priorities for the chopping block.
The Internet Archive is a San Francisco-based nonprofit digital library which aims to preserve knowledge and make it universally accessible. Its End of Term web archive captures and saves U.S. Government websites that are at risk of changing or disappearing altogether during government transitions. The Internet Archive has asked volunteers to help select and organize information that will be preserved before the Trump transition.
End of Term web archive: http://eotarchive.cdlib.org/2016.html
New York Times article: “Harvesting Government History, One Web Page at a Time”
Activities:
Identifying endangered programs and data
Seeding the End of Term webcrawler with priority URLs
Identifying and mapping the location of inaccessible environmental databases
Hacking scripts to make hard-to-reach databases accessible to the webcrawler
Building a toolkit so that other groups can hold similar events
Skills needed: We need all kinds of people — and that means you!
People who can locate relevant webpages for the Internet Archive’s webcrawler
People who can identify data targeted for deletion by the Trump transition team and the organizations they work with
People with knowledge of government websites and information, including the EPA
People with library and archive skills
People who are good at navigating databases
People interested in mapping where inaccessible data is located at the EPA
Hackers to figure out how to extract data and URLs from databases (in a way that Internet Archive can use)
People with good organization and communication skills
People interested in creating a toolkit for reproducing similar events
Contacts: michelle.murphy@utoronto.ca, p.keilty@utoronto.ca
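For readers wondering what “seeding the webcrawler with priority URLs” might involve in practice, here is a rough sketch, not the event's actual tooling. It pulls the links out of a saved HTML page, makes them absolute, drops fragments and duplicates, and keeps only hosts ending in a given suffix. The names `LinkCollector` and `seed_urls` and the `.gov` filter are my own assumptions, and I don't know what format the End of Term project expects for nominations.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links, then drop any "#fragment" part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.append(absolute)


def seed_urls(html_text, base_url, allowed_suffix=".gov"):
    """Return a sorted, de-duplicated list of http(s) links on allowed hosts."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    seeds = set()
    for link in collector.links:
        parsed = urlparse(link)
        host = parsed.hostname or ""
        if parsed.scheme in ("http", "https") and host.endswith(allowed_suffix):
            seeds.add(link)
    return sorted(seeds)
```

Calling `seed_urls(page_html, "https://www.epa.gov/index.html")` on the text of a downloaded page gives a list that could be reviewed by hand before nominating anything. Databases that sit behind search forms won't show up this way, which is presumably why the organizers also ask for people who can write custom scripts.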