Case study: BBC Research & Development
State of Open: The UK in 2022
Phase One “The Open Source Journey”
Phil Tudor, Head of Applied Research for Infrastructure, BBC R&D
Rob Cooper, Producer, BBC R&D
Synopsis
BBC Research and Development (BBC R&D) has undergone a digital transformation, embracing a collaborative culture and incorporating Open Source Software (OSS) into its operations. This shift is driven by the increasingly software-driven nature of the broadcasting industry. Using an Infrastructure as a Service (IaaS) model built on OpenStack (referred to here as Open Infrastructure, after the Open Infrastructure Foundation that now stewards it), BBC R&D runs a substantial cloud-based infrastructure. Notably, the department actively contributes to OSS, ranking among the top contributors to Open Infrastructure. It has made substantial code commits and engages in code reviews, fostering collaboration in the OSS community. Furthermore, BBC R&D uses Kaldi, an open source toolkit, for speech-to-text applications, achieving impressive accuracy. This journey not only enhances outcomes but also builds internal skills and expands knowledge of OSS policies and management. The BBC is intentionally moving from mere OSS consumption to active contribution, fostering innovation and collaboration in the media sector.
BBC Research and Development (BBC R&D) supports the digitalisation efforts of the BBC engineers who are at the forefront of broadcast technology. It has forged the way in the media sector with innovative technology and collaborative ways of working. Based in research labs in the North and South of the UK, the department includes over 200 highly specialised research engineers, scientists, ethnographers, designers, producers and innovation professionals. They work across the whole of broadcasting, from audiences, production and distribution right through to making TV programmes.
Transformative Journey to a collaborative culture
BBC R&D has transformed traditional broadcasting infrastructure into cloud-based IT platform technologies, allowing it to share projects on an Open Source Software basis using distributed repositories, building communities and using open collaboration. Phil recognises that, “The shared nature of Open Source Software as a medium for collaboration is very powerful.”
Using Open Source Software is less a conscious choice for the BBC than an inevitability, as Open Source Software is deeply embedded in the software stacks it already uses. Phil notes, “The technology we use is deeply driven by software – our industry has been on a journey from broadcast equipment and hardware systems to being software and computing driven.” Looking ahead, the team expects that shift towards software to continue.
Open Infrastructure
“In the BBC nature programme Springwatch, there are lots of cameras filming animals out in the natural world. R&D has built machine vision pipelines that do a lot of the hard work of looking at hours of feed and finding the interesting bits – identifying when the animal walks in front of the camera and what kind of animal it is. The acquisition pipeline and storage are running on Open Infrastructure cloud.”
BBC R&D made the shift to an Infrastructure as a Service (IaaS) model five years ago to support internal research and projects. They chose OpenStack, now stewarded by the Open Infrastructure Foundation and referred to here as Open Infrastructure: a toolkit of many different technologies that together create a hybrid platform for their research projects. As Phil explains, “We’ve built and currently have 3000 CPU cores, five petabytes of storage, 10 terabytes of RAM, 64 GPUs. The resources are available as a service to the teams on demand – and we can scale things up and down as needed.”
BBC R&D is not just a consumer of Open Source Software. It also contributes, and ranks in the top 20 contributors over the last five releases of Open Infrastructure. The team has made 1,500 code commits and is in the top 10 for code reviews, with 6,500 reviews. The BBC team lead is actively engaged as a leader in the Open Infrastructure community, building a network of trust that helps Open Infrastructure deliver a new release every six months.
Contributing upstream matters for reasons beyond giving back to the community: “The way we are using the software is unique to our use cases, for example in a particular network architecture which scales for the kinds of media we’re using, we use the software in a certain configuration. And that’s often where you find a bug or something that’s not covered elsewhere, because other people aren’t using the same code or tools in that way. That drives our contribution upstream. The important thing is that those contributions we’ve made remain in the source code that everyone else is testing and building on. It stops us effectively diverging with our code from the upstream code and allows many eyes to peer review our work.”
Speech to text
BBC R&D also uses speech-to-text software built on Kaldi, a toolkit for speech recognition written in C++ and licensed under the Apache 2.0 Licence. “You’ve got things like music beds in the back of dramas, crowd noise in sports programmes, cross talk in discussion programmes, all sorts of things that speech to text really struggles with understanding,” Rob explains. The BBC shared its data with a group of academics, who then used the open source Kaldi toolkit. The results that came back were impressive: the systems converted spoken audio to text accurately despite distractions such as background noise.
The BBC chose Kaldi for speech recognition because commercial vendor tools were not seen as fit for research purposes. Because the model was trained on the broadcast data that BBC R&D supplied, the researchers were able to achieve higher-than-industry-standard accuracy in their subtitles.
Delivering outcomes at speed is critical in large research teams, and Open Source Software allows for rapid prototyping, experimentation and tweaking along the way. As Rob says, “just the chance Open Source Software offers to get something up and running is crucial for innovation in general.” The real improvements have come from continually optimising and adapting the model.
According to Rob, adopting this Open Source Software has allowed the BBC to embark on a crucial learning journey. Because Kaldi is a complex tool, using it successfully requires sufficient training and a specialised skill set. The BBC put one of its best developers on it for almost twelve months before the team really got to grips with the specifics of using and optimising the models. This collaboration offered an unprecedented opportunity to enhance the internal BBC skill set and expand team knowledge, particularly around internal Open Source Software policies, licence understanding and the management of Open Source Software projects.
Moving Forward
BBC R&D is creating a supportive ecosystem that allows it to contribute to the community, iterate, fix bugs, speed up delivery and enhance outcomes. The BBC is intentionally making the shift from pure consumption to active contribution within the relevant Open Source Software communities, and is building the internal processes and governance needed to do so effectively and with appropriate diligence.
Challenges and Benefits
The ranking of the benefits matches that observed in the total sample. The ranking of the challenges also follows the same pattern as the total sample, with one exception: lack of coding skills or technical knowledge does not feature as a big challenge here.