If These Crawls Could Talk: Studying and Documenting Web Archives Provenance
| dc.contributor.author | Maemura, Emily | |
| dc.contributor.author | Worby, Nicholas | |
| dc.contributor.author | Milligan, Ian | |
| dc.contributor.author | Becker, Christoph | |
| dc.date.accessioned | 2018-03-22T20:50:10Z | |
| dc.date.available | 2018-03-22T20:50:10Z | |
| dc.date.issued | 2018 | |
| dc.description | This is an accepted manuscript of an article to be published in the Journal of the Association for Information Science and Technology | en_US |
| dc.description.abstract | The increasing use and prominence of web archives raises the urgency of establishing mechanisms for transparency in the making of web archives to facilitate the process of evaluating a web archive’s provenance, scoping, and absences. Some choices and process events are captured automatically, but their interactions are not currently well understood or documented. This study examines the decision space of web archives and its role in shaping what is and what is not captured in the web archiving process. By comparing how three different web archives collections were created and documented, we investigate how curatorial decisions interact with technical and external factors and we compare commonalities and differences. The findings reveal the need to understand both the social and technical context that shapes those decisions and the ways in which these individual decisions interact. Based on the study, we propose a framework for documenting key dimensions of a collection that addresses the situated nature of the organizational context, technical specificities, and unique characteristics of web materials that are the focus of a collection. The framework enables future researchers to undertake empirical work studying the process of creating web archives collections in different contexts. | en_US |
| dc.description.sponsorship | Part of this work was supported by the Natural Sciences and Engineering Research Council (NSERC) through RGPIN-2016-06640, and the Social Sciences and Humanities Research Council (SSHRC) through Insight Grant 435-2015-0011 and Canada Graduate Scholarship 767-2015-2217. Ian Milligan was also supported by the Marshall McLuhan Centenary Fellowship in Digital Sustainability at the University of Toronto iSchool Digital Curation Institute. | en_US |
| dc.identifier.doi | 10.1002/asi.24048 | |
| dc.identifier.uri | http://hdl.handle.net/1807/82840 | |
| dc.language.iso | en_ca | en_US |
| dc.publication.journal | Journal of the Association for Information Science and Technology | en_US |
| dc.subject | web archives | en_US |
| dc.subject | provenance | en_US |
| dc.subject | digital curation | en_US |
| dc.title | If These Crawls Could Talk: Studying and Documenting Web Archives Provenance | en_US |
| dc.type | Article Post-Print | en_US |
