New COPIM Scoping Report Published on Archiving and Preserving Open Access Monographs
Work Package 7 of the COPIM Project has released their Scoping Report, identifying and examining the key challenges associated with archiving and preserving open access monographs, particularly those published by small and scholar-led presses.
Work Package 7 of the Community-led Open Publication Infrastructures for Monographs project (COPIM) has released their new Scoping Report, identifying and examining the key challenges associated with archiving and preserving open access monographs in all their variation and complexity, particularly those published by small and scholar-led presses, and working towards developing new solutions.
The report begins with an introduction providing background context, followed by two main sections detailing current archiving and preservation practices in OA publishing. Section 2 (‘Current Practice in Preservation of OA Books and Journals’, pp. 2-7) draws on a series of semi-structured interviews, each lasting approximately 30 minutes, conducted via Microsoft Teams in September and October 2020. The interviewees were small publishers, university presses, and preservation specialists, all involved in publishing or preserving open access research outputs. This section presents some of the significant responses from interview participants, in the order the questions were asked.
The report’s findings are also informed by a workshop held on 16 September 2020, which brought together COPIM team members and experts in digital preservation. The workshop dialogue feeds into Section 3 (‘Discussion Points’, pp. 7-11), which synthesises the threads of discussion raised by participants in both the workshop and the interviews. This section also highlights key factors for future work.
Key findings in this report include particular resource challenges for small and scholar-led presses, as these presses are often not supported by a memory institution. ‘Memory institution’ is a broad term for any organisation maintaining a repository of public knowledge, but largely comprises libraries, archives, and heritage organisations. Universities are one primary example. Not belonging to one of these institutions means small and scholar-led presses lack the support of existing infrastructure and funds, and as such often have fewer staff members: not only staff at the press itself, but also staff at an affiliated institution who might provide knowledge, assistance, and access to technology, amongst other contributions.
Digital preservation practices amongst small and scholar-led OA monograph publishers currently follow two clear trends, with publishers either using an online library, such as DOAB or OAPEN, with which they have minimal interaction, or relying on their memory institution for local backups. File formats play a role in preservation choices and outcomes, with many small presses relying heavily on PDF and lacking the resources to convert to XML. XML is often a preferred format for digital preservation archives, and viewed as the better format for preservation, because XML can be normalised into other formats (such as BITS) and reconstructed as required. XML also allows for packaging of various data and file components together with metadata. While there are still some differing views in the sector, and PDF/A has also advanced some promising features in recent years, the traditional PDF format can still raise concerns for preservation – particularly with complex digital monographs.
Another finding is the lack of consistent practice in the archiving and preservation of embedded and linked material, with clear indications that existing processes are not sufficient or sustainable for complex open access monographs including multimedia, embedded content, or other supplementary materials.
A number of publishers raised concerns about what happens to archived material in a ‘trigger event’. Most digital preservation archives are ‘dark archives’, meaning that any preserved material from a publisher is inaccessible to the public until a ‘trigger event’ occurs, for instance if a publisher ceases to operate. As the majority of preserved content is that of proprietary publishers, protection of the content in this way is necessary. However, the OA presses wondered how discoverable their content would be after a ‘trigger event’ and where it would be located online.
Additionally, there are evident challenges across the OA publishing ecosystem in recognising the importance of digital preservation for OA monographs, with a widespread lack of understanding in this area. Outreach and further education are needed to support a significant culture shift among publishers and researchers alike.
Opportunities for future work are identified in the report. These include the need for a consensus on file formats; greater awareness and a culture shift to acknowledge and respond to the importance of digital preservation; increased support and guidance for small and scholar-led publishers to ensure equity in the publishing and preservation landscape; and a clear way forward regarding techniques to effectively preserve the components of complex digital monographs, including links and embedded content. Additionally, possible roles for libraries and repositories were identified, as well as the potential for emulation as a preservation approach for complex monographs with exceptional requirements. The key takeaway is that there will be no single solution or ‘silver bullet’, and that multiple approaches will be required.
COPIM’s Work Package 7 aims to progress possibilities for repository workflows and catalogue best practices during the current project, with ambitions to further address culture change via education and outreach, increase support for small and scholar-led presses with the provision of further tools, and to develop and expand repository workflow integrations in a future project bid.