
Case study: Macrocosmos

State of Open: The UK in 2024

Phase Four “AI Openness: End of Year Update 2024”

Elena Nesterova, Head of Delivery

Macrocosmos, an ‘open source AI’ research lab, stands at the forefront of democratising AI. By integrating AI and blockchain technology, Macrocosmos drives the development of distributed, decentralised AI solutions. Although headquartered in the UK, Macrocosmos has a global and diverse team of 30 employees, who manage a decentralised network of more than 1,000 community contributors. These contributors play a pivotal role in creating transparent, open, and collaborative innovation.

Open Source as a Catalyst for Democratisation

Open source is fundamental to Macrocosmos, both as a philosophy and as the basis of their products and services. It lowers barriers to development by encouraging the reuse and improvement of existing models, which reduces development time and cost.

Macrocosmos uses blockchain technology to support open source collaboration. This helps to ensure data integrity and decentralisation, while also incentivising contributors to coordinate their work on advancing AI capabilities and to be fairly rewarded for it.

For example, Macrocosmos’ community is now the largest supplier of fresh media datasets on Hugging Face, the biggest open source data repository. With an average daily contribution of 350 million rows, these datasets are instrumental in AI model training and product development.

Community-Driven Innovation

Macrocosmos’ decentralised approach extends to its operational structure. Although the team is small, their innovative management frameworks allow them to efficiently oversee a large global network of contributors. The organisation incentivises contributions through a reward system that promotes quality and efficiency. Their Pre-training Network encourages contributors to produce foundational LLMs of varying sizes to state-of-the-art quality standards. This system not only attracts top-tier talent into commodity production but also ensures that the community consistently delivers value to end users.

Technological Advancements & Challenges

Macrocosmos’ commitment to decentralisation drives their research into distributed AI training. Training large language models (LLMs) traditionally requires immense computational resources concentrated in centralised data centres, often controlled by a few tech giants. To counter this, Macrocosmos is exploring distributed training across multiple devices, reducing resource demands and environmental impact. Recent experiments have achieved a tenfold compression of model weights, significantly reducing training times while maintaining model quality. However, this ambitious work is not without its challenges. Regulatory uncertainty, particularly regarding data compliance and intellectual property, poses significant hurdles. To navigate risks associated with GDPR and continue innovating, Macrocosmos needs clarity on text and data mining in the UK.

Long-Term Sustainability

All tech startups face the risks of obsolescence, platform dependence, and incompatibility. For Macrocosmos, these risks can be mitigated in part by diversifying their range of on-demand, full-stack AI/ML services. One such service is ‘Gravity’, a product that enables users to collect large amounts of media data for sentiment analysis and insight gathering, opening new revenue streams. Macrocosmos also focuses on partnerships, collaborating with open source communities and industry bodies to expand their reach and impact.

Impact on the AI Ecosystem

Macrocosmos’ work demonstrates that there is an alternative to proprietary, centralised, and closed AI. If state-of-the-art intelligence can be assembled both at a lower cost and at a global scale, startups need no longer be at the mercy of Big Tech, which levels the playing field and unlocks previously unattainable use cases. Their model shows that openness and collaboration are not just ideals but practical strategies with the potential to drive AI innovation further.

First published by OpenUK in 2024 as part of State of Open: The UK in 2024 Phase Four “AI Openness: End of Year Update 2024”

© OpenUK 2024
