DDJ workflows: how to document your research and archive your data

Investigative and data journalism are based on working with data: finding open data sources, scraping what can’t be found in an open format, creating data yourself, analysing it and finally presenting it to the audience (read more about data journalism workflows here). Something that journalists often underestimate is structuring and archiving data so that it can always be found again. This is a quick and dirty look at a couple of ways to document your research.

One of the easiest ways to document your research is using social bookmarking services like Diigo or Delicious. A couple of months ago I personally switched from Delicious to another service that is a mixture of Delicious and Pinterest.

Apart from using bookmarking services, which are great for quickly finding useful information on the web, you should always back up your data, especially when you are working with large amounts of it.

The German journalist Sebastian Mondial, who worked on the offshore leaks story among others, presented some good ideas at the recent Netzwerk Recherche conference in Hamburg. Here are the most important takeaways.

#1 Archive all the information that can get lost on the web. To do that, you need a lot of discipline. Integrate archiving and documenting your research into your weekly routine.

#2 Use a log book for data archives so that you know what you changed and when. Make paper copies and categorize them.

#3 Always save important stuff in two formats: for example both in Microsoft Word and in rtf/txt.

#4 Know your formats and your freeware. Different data have different formats. For example, you can save emails in the .eml format. To do that, you will need Thunderbird or other freeware.

#5 Either save and document everything, or just the most important things. If you can’t tag and name all the files you are saving so that they are easy to find afterwards, you are better off deleting as much as possible. Otherwise you will have too much background noise.

#6 Combine hardware and virtualization. Buy a hard drive (SATA) and use it to back up all the information.

#7 Maintain your archive and adapt it to changing technologies. You should check your archive and clone it to a new hard drive each year.
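
Tips #2 and #7 can be partly automated. The following is only a rough sketch in Python, not something presented at the conference: it keeps a simple CSV log book of which files were added, changed or removed and when, and it compares an archive with its clone using checksums. The function names (`update_log_book`, `differing_files`), the CSV layout and the choice of SHA-256 are my own assumptions, and the log book file is assumed to live outside the archive folder so that it does not log itself.

```python
import csv
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(archive_dir: Path) -> dict[str, str]:
    """Map every file's path (relative to the archive) to its checksum."""
    return {
        p.relative_to(archive_dir).as_posix(): sha256_of(p)
        for p in sorted(archive_dir.rglob("*"))
        if p.is_file()
    }


def update_log_book(archive_dir: Path, log_book: Path) -> list[tuple[str, str]]:
    """Append one timestamped row per added, changed or removed file.

    Hypothetical row layout: timestamp, action, file name, checksum.
    Returns the (action, file name) pairs that were logged.
    """
    # Replay the existing log book to learn the last known state.
    known: dict[str, str] = {}
    if log_book.exists():
        with log_book.open(newline="", encoding="utf-8") as handle:
            for _when, action, name, checksum in csv.reader(handle):
                if action == "removed":
                    known.pop(name, None)
                else:
                    known[name] = checksum

    # Compare the last known state with what is on disk now.
    current = snapshot(archive_dir)
    changes = []
    for name, checksum in current.items():
        if name not in known:
            changes.append(("added", name, checksum))
        elif known[name] != checksum:
            changes.append(("changed", name, checksum))
    for name in sorted(known.keys() - current.keys()):
        changes.append(("removed", name, ""))

    # Write the new entries with a UTC timestamp.
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with log_book.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for action, name, checksum in changes:
            writer.writerow([now, action, name, checksum])
    return [(action, name) for action, name, _checksum in changes]


def differing_files(original: Path, clone: Path) -> list[str]:
    """List files that are missing or different in either copy."""
    a, b = snapshot(original), snapshot(clone)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

You could run `update_log_book(Path("archive"), Path("logbook.csv"))` as part of your weekly routine, and `differing_files(Path("archive"), Path("/path/to/clone"))` after the yearly clone; an empty list means both copies match. The folder names here are placeholders.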

How do you archive and back up your research?

