If These Crawls Could Talk: Studying and Documenting Web Archives Provenance

dc.contributor.author: Maemura, Emily
dc.contributor.author: Worby, Nicholas
dc.contributor.author: Milligan, Ian
dc.contributor.author: Becker, Christoph
dc.date.accessioned: 2018-03-22T20:50:10Z
dc.date.available: 2018-03-22T20:50:10Z
dc.date.issued: 2018
dc.description: This is an accepted manuscript of an article to be published in the Journal of the Association for Information Science and Technology [en_US]
dc.description.abstract: The increasing use and prominence of web archives raises the urgency of establishing mechanisms for transparency in the making of web archives to facilitate the process of evaluating a web archive’s provenance, scoping, and absences. Some choices and process events are captured automatically, but their interactions are not currently well understood or documented. This study examines the decision space of web archives and its role in shaping what is and what is not captured in the web archiving process. By comparing how three different web archives collections were created and documented, we investigate how curatorial decisions interact with technical and external factors and we compare commonalities and differences. The findings reveal the need to understand both the social and technical context that shapes those decisions and the ways in which these individual decisions interact. Based on the study, we propose a framework for documenting key dimensions of a collection that addresses the situated nature of the organizational context, technical specificities, and unique characteristics of web materials that are the focus of a collection. The framework enables future researchers to undertake empirical work studying the process of creating web archives collections in different contexts. [en_US]
dc.description.sponsorship: Part of this work was supported by the Natural Sciences and Engineering Research Council (NSERC) through RGPIN-2016-06640, and the Social Sciences and Humanities Research Council (SSHRC) through Insight Grant 435-2015-0011 and Canada Graduate Scholarship 767-2015-2217. Ian Milligan was also supported by the Marshall McLuhan Centenary Fellowship in Digital Sustainability at the University of Toronto iSchool Digital Curation Institute. [en_US]
dc.identifier.doi: 10.1002/asi.24048
dc.identifier.uri: http://hdl.handle.net/1807/82840
dc.language.iso: en_ca [en_US]
dc.publication.journal: Journal of the Association for Information Science and Technology [en_US]
dc.subject: web archives [en_US]
dc.subject: provenance [en_US]
dc.subject: digital curation [en_US]
dc.title: If These Crawls Could Talk: Studying and Documenting Web Archives Provenance [en_US]
dc.type: Article Post-Print [en_US]

Files

Original bundle

Name: JASIST_IfTheseCrawls-Preprint-20180122.pdf
Size: 705.03 KB
Format: Adobe Portable Document Format
Description:

License bundle

Name: license.txt
Size: 1.68 KB
Format: Item-specific license agreed upon to submission
Description: