
A growing network of open infrastructures and federated services with Thoth

Published on May 21, 2024

Over the last few months, Thoth has been working to establish links to an ever-increasing number of open infrastructures active in the field of Open Access book publishing, dissemination, and archiving. Below, we provide an overview of this work, which moves towards closer collaboration by establishing interoperable data exchange protocols that facilitate an open exchange of data across multiple platforms. This interoperability is precisely the benefit of open infrastructures, and is closely linked with the future direction of Thoth’s growing number of services.

Integrations with other systems

Public Knowledge Project

In the context of Thoth’s participation in the Open Book Futures project, Thoth has forged a collaboration with the Public Knowledge Project (PKP), which will support the development of a plugin to facilitate direct exchange of book metadata between PKP’s widely used Open Monograph Press (OMP) and Thoth. This will enable publishers using OMP to use the advanced features that the Thoth platform provides, including the creation and enrichment of book metadata and multi-format metadata exports.

An early exchange between the development leads of both organisations at the end of 2023 led to the drafting of key technical specifications. Development is soon to begin, and we will share updates on progress in the coming months.

Publishers such as Scottish Universities Press have already expressed interest in testing the OMP-Thoth integration. If you are a publisher using Open Monograph Press and would like to test-run the Thoth integration yourself, why not get in touch with our team via [email protected]?

COKI / OAPEN Book Analytics service

In a similar vein, Thoth has been collaborating with the Curtin Open Knowledge Initiative (COKI) to develop and enhance data exchange between Thoth and the OAPEN Book Analytics service and dashboard, which is being developed by COKI. The Book Analytics service uses so-called ‘telescopes’ to pull in data from a variety of sources. Thoth’s high-quality dataset, which, among other things, enables publishers to document the variety of locations to which they are disseminating their book content, can thus be very useful in the larger service stack, as it facilitates easy access to those multiple locations and the respective usage data.

A Thoth telescope that pulls up-to-date title-level metadata from Thoth’s Export API via an ONIX feed has already been implemented. The teams will continue on this path to integrate more data sources and to expand the existing service model, with the hope of improving affordability for small and scholar-led presses.
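
As a rough illustration of what consuming title-level metadata from an ONIX feed can look like, here is a minimal Python sketch. It is not the actual telescope code: the sample record, the function name `extract_titles`, and the choice of fields are assumptions made for illustration, and the feed is an inline string rather than a live call to the Export API.

```python
import xml.etree.ElementTree as ET

# Namespace used by ONIX 3.0 messages with reference tags.
NS = {"onix": "http://ns.editeur.org/onix/3.0/reference"}

# Hypothetical one-record feed; real ONIX records carry far more detail.
SAMPLE_FEED = """<ONIXMessage xmlns="http://ns.editeur.org/onix/3.0/reference" release="3.0">
  <Product>
    <RecordReference>urn:uuid:00000000-0000-0000-0000-000000000001</RecordReference>
    <ProductIdentifier>
      <ProductIDType>06</ProductIDType>
      <IDValue>10.00000/example.0001</IDValue>
    </ProductIdentifier>
    <DescriptiveDetail>
      <TitleDetail>
        <TitleType>01</TitleType>
        <TitleElement>
          <TitleElementLevel>01</TitleElementLevel>
          <TitleText>An Example Open Access Monograph</TitleText>
        </TitleElement>
      </TitleDetail>
    </DescriptiveDetail>
  </Product>
</ONIXMessage>
"""


def extract_titles(onix_xml):
    """Return one dict per <Product> with its record reference, DOI and title."""
    root = ET.fromstring(onix_xml)
    records = []
    for product in root.findall("onix:Product", NS):
        doi = None
        for ident in product.findall("onix:ProductIdentifier", NS):
            # In the ONIX product identifier code list, type 06 denotes a DOI.
            if ident.findtext("onix:ProductIDType", namespaces=NS) == "06":
                doi = ident.findtext("onix:IDValue", namespaces=NS)
        title = product.findtext(
            "onix:DescriptiveDetail/onix:TitleDetail/onix:TitleElement/onix:TitleText",
            namespaces=NS,
        )
        records.append(
            {
                "reference": product.findtext("onix:RecordReference", namespaces=NS),
                "doi": doi,
                "title": title,
            }
        )
    return records


for record in extract_titles(SAMPLE_FEED):
    print(record["doi"], "|", record["title"])
```

Keying each record on a persistent identifier such as the DOI is what would let a downstream service match the same title across several usage-data sources.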


ScienceOpen BookMetaHub

In 2022, Open Book Collective member publishers who use ScienceOpen’s BookMetaHub as one of their hosting platforms expressed a development need: easier data exchange between the two systems, reducing, if not eliminating, the work necessary to add data to both platforms. Between 2022 and 2023, ScienceOpen thus began development to let publishers with books on the BookMetaHub platform export data to Thoth, and then use Thoth to improve data quality and convert the data to a broader set of metadata formats and platform-specific implementations thereof. As ScienceOpen’s Nina Tscheke writes:

Thanks to this new collaboration with Thoth, publishers will now be able to create a record on BookMetaHub, send it to Thoth for further enhancement, and then gain access to additional output formats for future export. (Tscheke, 2023)

An early data exchange proof-of-concept has since been implemented, although further work would still be needed to make this a fully seamless workflow without manual import/export.

OAPEN & Directory of Open Access Books (DOAB)

Thoth and OAPEN signed a strategic partnership agreement in December 2023, formalising a closer collaboration between the two platforms. As part of that agreement, Thoth will act as a trusted intermediary to enable small and scholar-led publishers to participate in OAPEN. Further to that, a Memorandum of Understanding has been signed to formalise Thoth’s participation in the DOAB Trusted Platform Network.

During the first quarter of 2024, the two teams also agreed to scope whether the automated workflows previously developed in the context of the Thoth Open Archiving Network, which use the SWORD protocol to push book data into a DSpace repository, can be adapted. With OAPEN and DOAB also using DSpace, the teams hope to be able to establish automated workflows between the Thoth platform and OAPEN’s DSpace instance, which would allow for a more direct, (semi-)automated integration of metadata and content with OAPEN.
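
To give a sense of what a SWORD-based push involves, the following Python sketch builds (but does not send) a deposit request of the kind defined by the SWORD v2 profile. It is an illustration under stated assumptions rather than the workflow used by Thoth or OAPEN: the collection URL, credentials, file name, the choice of SWORD v2, and the SimpleZip packaging format are all hypothetical.

```python
import base64
import urllib.request

# Packaging identifier for a plain zip of files, as defined by the SWORD v2 profile.
SIMPLE_ZIP = "http://purl.org/net/sword/package/SimpleZip"


def build_sword_deposit(collection_url, package_bytes, filename, username, password):
    """Build an HTTP POST request that deposits a zip package into a collection."""
    credentials = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"attachment; filename={filename}",
        "Packaging": SIMPLE_ZIP,
        # "false" asks the repository to treat the deposit as complete.
        "In-Progress": "false",
        "Authorization": f"Basic {credentials}",
    }
    return urllib.request.Request(
        collection_url, data=package_bytes, headers=headers, method="POST"
    )


# Hypothetical values for illustration only; nothing is sent over the network.
deposit = build_sword_deposit(
    "https://repository.example.org/swordv2/collection/123456789/1",
    b"PK\x05\x06" + b"\x00" * 18,  # bytes of an empty zip archive
    "example-book.zip",
    "depositor",
    "secret",
)
print(deposit.get_method(), deposit.full_url)
```

Actually submitting the request would be a matter of passing it to `urllib.request.urlopen`; the repository's response would then describe the newly created item.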


Crossref

Since early 2024, Thoth has been proud to act as a dedicated Crossref Sponsor. Integrated within Thoth Plus, sponsorship provides significant benefits to individual publishers seeking Crossref membership, including membership management and waived annual fees, automatic DOI registration, and per-DOI costs covered by Thoth. Automated DOI registration for books and chapters ensures that all Crossref DOIs included in Thoth metadata records are automatically registered with Crossref, alleviating the administrative burden for publishers. Sponsored members can also sign up to additional Crossref services such as Crossmark and Similarity Check.

Crossref supports a wide range of metadata for registered content, and asks that as much as possible be included and that the data be accurate and clean: the more comprehensive the metadata is, the more likely the content is to be discovered. The Thoth Crossref-compliant XML output supports the required, recommended and optional metadata that Crossref suggests including in deposits. Once a DOI is automatically registered with Crossref via Thoth, any subsequent edit made within the Thoth metadata record will be automatically deposited to Crossref again with the changes included.
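
One way to picture the re-deposit behaviour described above is as a simple change check: a record is sent to Crossref again whenever its content differs from what was last deposited. The Python sketch below illustrates that idea only. It is not Thoth's implementation, and the record fields, the fingerprinting approach, and the function names are hypothetical.

```python
import hashlib
import json


def record_fingerprint(record):
    """Hash a metadata record in a way that ignores the order of its keys."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_redeposit(record, last_deposited_fingerprint):
    """Return True if a record with a DOI has changed since its last deposit."""
    if not record.get("doi"):
        # Without a DOI there is nothing registered that could need updating.
        return False
    return record_fingerprint(record) != last_deposited_fingerprint


# Hypothetical record for illustration.
book = {"doi": "10.00000/example.0001", "title": "An Example Monograph"}
deposited = record_fingerprint(book)

edited = dict(book, title="An Example Monograph (2nd edition)")
print(needs_redeposit(book, deposited))    # unchanged record
print(needs_redeposit(edited, deposited))  # edited record
```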

The agreement with Crossref highlights the underlying ethos of what we are trying to achieve with Thoth: we are putting in the legwork of making organisational and technical connections with distributors so that publishers don’t have to. This also includes ensuring that publishers do not have to sign up to services that will subsequently lock them into perpetual dependencies – including Thoth!

Agreements with Dissemination Platforms

Thoth Plus offers automated distribution solutions tailored to the needs of Open Access publishers, allowing them to focus on providing quality content while we handle the logistics of reaching a wider audience. A large part of the work to set up this distribution solution has gone into establishing connections, agreements and workflows with some of the major content platforms and ebook distributors that we are including in our Thoth Plus subscription packages. While we are still finalising the full list of platforms to which we can distribute content and metadata, we would like to share an update on current progress.

Thoth is currently in discussions with ebook aggregators Project MUSE and JSTOR about forming partnerships that would enable us to work more closely together, similar to what we have in place with OAPEN and Crossref. Under the proposals, Thoth would oversee membership of the platforms, submission of metadata and content, and management of associated hosting fees. We are really pleased to confirm that JSTOR will be waiving its $100 per-title hosting fee. Similarly, we are very happy to say that Project MUSE has agreed to a 50% reduction of its nominal $100 per-title hosting fee.
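
To make the effect of these fee arrangements concrete, the short Python sketch below works through the arithmetic for a hypothetical list of ten titles. The fee and discount figures are the ones quoted above; the list size and the function name are assumptions for illustration.

```python
from decimal import Decimal

# Nominal per-title hosting fee in USD, as quoted above for both platforms.
LIST_FEE = Decimal("100")

# Share of the nominal fee that is waived: JSTOR in full, Project MUSE by half.
DISCOUNTS = {"JSTOR": Decimal("1.00"), "Project MUSE": Decimal("0.50")}


def hosting_cost(platform, titles):
    """Hosting fees due for a number of titles after the agreed reduction."""
    return LIST_FEE * (1 - DISCOUNTS[platform]) * titles


# Hypothetical list of ten titles.
for platform in DISCOUNTS:
    print(platform, hosting_cost(platform, 10))
```

For ten titles, this comes to $0 on JSTOR and $500 on Project MUSE, compared with $1,000 per platform at the nominal rate.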

Workflows and agreements between Thoth and ebook aggregators EBSCO and ProQuest Ebook Central are currently being formalised. Connections and workflows for other knowledge bases, content platforms, and archiving platforms, including OCLC KB, Google Books, Internet Archive and Zenodo, are already established. The Thoth development team is now working to automate some of these workflow processes. Our future vision is to continue to grow the range of content and archiving platforms included in Thoth Plus, so that we can reach further into the vast OA landscape.

Seeding for an open ecosystem

As part of our collective outreach work within and beyond the Copim community, many of the initiatives involved in the community, including but not limited to the Open Book Collective, OAPEN, Jisc, and Thoth, have been in touch with a number of like-minded open infrastructures over the years. One idea has come up again and again in a variety of constellations: to draw on the power of a more closely-aligned collaboration, which is currently tentatively being labelled Open Infrastructure Alliance.

This collective endeavour would, through closer strategic alignment, be in a position to combine a variety of services into a larger collective offer that might eventually provide collectively managed solutions to libraries, funders, and publishers, thus providing a fully open source and open data pipeline for usage data, funding data, metadata, and digital publications.

For Thoth, the idea of seeding for an open ecosystem has been part and parcel of its work right from its inception. This is rooted in our collective belief that open access publishing will not prove to be viable in the long term unless the entire scholarly publishing network connecting authors with reviewers, publishers, funders, repositories, and institutions is managed via interconnected, collectively managed open protocols. In a nutshell, within the Copim community, we subsume this under the slogan “No open access without open infrastructure.”

Fig. 1: “No open access without open infrastructure.”

Taking up the wider metaphor of a mangrove forest to represent an ecosystem of open infrastructures (which we will explore in more detail in a forthcoming blog post), we apply this imagery to the case of Thoth to visualise how such an ecosystem could be conceived from the point of view of a single infrastructure organisation.

Thoth is very much rooted in principles of open source and open data. Through its platform and its export and dissemination capabilities, it enables publishers to create and manage metadata that can then be disseminated via a variety of branches (aka service areas) and leaves (transfers of individual metadata & content, knowledge).

Fig. 2: Thoth Open Metadata: an outline of existing (solid boxes) and in-development (dashed boxes) services.

And while this example focuses on one organisation as a singular tree, it is important to highlight the inherent interconnectedness with other systems — so to follow through with the mangrove metaphor, other open infrastructure organisations (aka. trees) that Thoth is disseminating data & content to would themselves be represented as mangrove trees, so branches would interweave with other branches, forming an overarching canopy of data exchange. Those smaller tendrils and branches reaching back to the ground then could also signify each individual stakeholder’s rootedness in a larger collective and that collective’s underlying set of values — values that guide and nourish the whole ecosystem.

Outlook: launching new services, expanding our work with publishers and libraries

For Thoth, a variety of exciting new areas have emerged out of our core focus on the facilitation of fully open metadata creation, management and dissemination for open access.

These key areas, which we hope to develop further in the coming months, include:

  • extending Thoth metadata provision to support publishers with legal deposit requirements and corresponding submission processes and workflows;

  • extending Thoth metadata provision to support publishers with registering ISBNs;

  • building on the PKP OMP-Thoth integration, to work with more open publishing platforms such as Pressbooks, PubPub, and Janeway, to establish metadata exchange and enrichment workflows and/or plugins;

  • rolling out the offer of Thoth Hosting for project comms-related open source solutions, incl. hosting of

    • Nextcloud for file-sharing and collaborative text editing environments;

    • Mattermost as an open-source alternative to Slack;

    • ReadTheDocs for Wiki-style public documentation;

    • Matomo as an open, self-hosted and privacy-respecting alternative to, for example, Google Analytics, to enable publishers and consortia to collect website usage statistics;

  • offering Thoth Hosting solutions for websites and catalogues to individual publishers and publisher consortia. This would be based on Strapi CMS and could include the hosting of publishers’ original book files under their own dedicated domain, thus empowering those publishers to remain in control over their open access outputs.

All of those services are being designed with publishers’ independence in mind. At Thoth, we firmly believe in providing publishers with the freedom to choose their own path. Our commitment to openness ensures that publishers are not locked into any specific platform or service. Thoth’s open architecture and APIs empower publishers to integrate seamlessly with other platforms and workflows, allowing for greater flexibility and adaptability. With all metadata available openly in multiple formats, Thoth enables publishers to retain full control over their data and operations.

Overall, leveraging the manifold benefits of fully open metadata, and working closely with other open infrastructures, Thoth now finds itself in an excellent position to establish open workflows for many of the key aspects of open access book publishing, including:

  1. provision of solutions to host a publisher’s own books under their own domain, enabling them to showcase their publications in a highly-flexible catalogue powered by the rich metadata available in Thoth;

  2. providing GDPR-compliant, privacy-respecting usage statistics;

  3. facilitating good metadata practice in the context of open access books by providing publishers with a means to implement persistent identifiers and controlled vocabularies relevant in the context of Open Science / Open Scholarship (cf. our recent Metadata Standards blog post);

  4. the minting of DOIs with no attached costs to the publisher via Thoth’s Crossref Sponsorship;

  5. improving discoverability by submitting metadata and content to key open infrastructures including OAPEN and DOAB, as well as many other stakeholders active in the larger book supply chain and ecosystem;

  6. facilitating longer-term archiving of OA books in a variety of open infrastructures and repositories.

As we continue to (further) develop and roll out these services in the coming months, the Thoth team will also increase its work with libraries to make an even stronger case for the inclusion of Thoth’s rich open metadata within library catalogues.

If you are a publisher who would like to learn more about the various solutions that Thoth Open Metadata has on offer, or a library or open access platform / provider that would like to work with us, do get in touch via [email protected] or visit

Photo by Kier in Sight Archives on Unsplash.
