DataRescue RTP | National Humanities Center

DataRescue RTP

June 10, 2017

June 10–11, 2017 from 10:00 am to 2:00 pm at the National Humanities Center


DataRescue RTP, an event organized by DataRescue Chapel Hill and the National Humanities Center, aims to preserve online government data related to housing and education programs.

We are focusing on datasets identified as being at high risk for removal from online public access. While the Internet Archive has preserved copies of many government websites, it is unable to archive other types of information, particularly datasets. DataRescue events are a key piece in ensuring that these datasets are copied. The Internet Archive, DataRefuge and a consortium of research libraries hold these copies and keep them available for public access.

This event is open to volunteers from all backgrounds and technical abilities. Following a workflow developed by EDGI and the DataRefuge project, together we will archive trustworthy copies of government data.

Who should volunteer?

We’re looking for people who are knowledgeable about housing and education research, data scientists, hackers, archivists, librarians, writers, web designers, people with good communication skills, and anyone else who is eager to help.

How do I sign up?

If you are interested in helping to organize, coordinate and/or volunteer for the RTP DataRescue, please sign up here. Indicate on the form what role/team you would like to volunteer for at the event. If you have any further questions, email Sangeeta Desai at

What should I bring?

A laptop and a charger. Lunch will be provided by the National Humanities Center.

Where do I go?

The National Humanities Center is located at 7 T.W. Alexander Drive in Research Triangle Park. A map and directions are available here.

Role Descriptions

Roles requiring advanced technical and/or archival skills:


Researchers review URLs that the Seeders and Sorters mark as “Uncrawlable”. Consider this path if you have strong front-end web experience and enjoy digging for more information.


Harvesters figure out how to capture the “uncrawlable” data. Consider this path if you’re a hacker. This is a complex task that can require substantial technical expertise, and different datasets call for different techniques.


Checkers inspect a harvested dataset and make sure that it is complete. The main question the checkers need to answer is “will the bag make sense to a scientist?” Checkers need to have an in-depth understanding of harvesting goals and potential content variations for datasets. Previous work with scientific data is a plus.


Baggers perform a quality assurance check and package the data. Consider this path if you have data or web archiving experience, or have strong tech skills and attention to detail. As a Bagger, you will package the data into a BagIt file (or “bag”), which includes basic technical metadata, and upload it to the final DataRefuge destination.


Describers organize and fill out metadata for the datasets and URLs that have been downloaded. (This may include a few people from the Baggers path.) Consider this path if you have experience working with scientific data, particularly climate or environmental data, or with creating metadata. Trained librarians and scientists will be very helpful on this path. Describers will create a CKAN account to do this work, then link to the bags of data to make the data public and accessible.

Roles requiring basic or no technical skills:


Consider this path if you’re on social media (Facebook, Instagram, Twitter, whatever), if you can use Storify, if you have good listening and writing skills, and/or if you can make creative, engaging materials.


This is the widest path and requires a variety of skill levels. Consider this path if you are a coder or hacker, have front-end web experience, or simply have great attention to detail.


Consider this path if you’d like to build DataRefuge into the future. Future projects will also call attention to all the data that exists but can’t be captured in a single weekend, as well as data that doesn’t exist but should. Many kinds of skills are needed. Experience in public engagement projects and informal STEM education settings will be especially helpful.


Consider this path if you are interested in helping to organize the DataRescue and/or to keep the event running smoothly.