Knowledge-sharing projects

CERN openlab’s mission rests on three pillars: technological investigation, education, and dissemination. Collaboration with research communities and laboratories outside high-energy physics brings all three together.

As part of its fifth phase, CERN openlab is working closely with the Knowledge Transfer Group on a number of initiatives aimed at transferring tools, skills, and knowledge from the high-energy physics community to other research fields.

Find out more below about two projects related to biomedicine.

GeneROOT

The GeneROOT project is run by CERN openlab as part of its ongoing investigations into transferring computing knowledge and technologies to other research communities. The project aims to use ROOT, a data-processing framework created at CERN and widely used in the high-energy physics community, to analyse large genomics datasets and to share methods and results across a distributed community of scientists.

In collaboration with project partner King’s College London, GeneROOT is initially making use of sequences from the TwinsUK registry. The goal, however, is to extend the work to similar datasets hosted by facilities across the globe.

The project will investigate whether ROOT can be used to extend the capabilities of existing analysis and file-access tools. Greater efficiency is needed to handle the ever-increasing amounts of data produced by modern sequencing techniques. The field of genomics has very well-established working practices, so any new system will have to be highly performant to achieve widespread community adoption.

The project kicked off in May 2016. The 300 TB genomics dataset has been transferred to the CERN EOS storage system, both to ensure data redundancy and to give CERN experts local access. Initial genome sequence alignment jobs are being run to improve understanding of the computing requirements. The next step is to integrate the standard data format used in genomics into ROOT, so that analysis benchmarking can take place. Work will then start on adapting the ROOT analysis tools to the genomics community’s needs and ways of working.

BioDynaMo

The BioDynaMo project is part of CERN openlab’s collaboration with Intel on code modernisation.

The project is a collaboration between CERN, Newcastle University, Innopolis University, Kazan Federal University, and Intel to design and build a scalable and flexible cloud-based computing platform for rapid simulation of biological tissue development. It foresees three main phases: consolidating, optimising, and further extending biological simulation code so that it runs efficiently on modern multi-core and many-core platforms; deploying a platform that uses state-of-the-art cloud-based high-performance computing technologies; and creating a shared ecosystem of tools, datasets, processes, and human networking in the field of biological simulation.

The project focuses initially on brain tissue simulation, drawing inspiration from existing but low-performance software frameworks. In late 2015 and early 2016, algorithms originally written in Java were ported to C++. Once porting was completed, work was carried out to optimise the code for modern computer processors and co-processors, so as to make the best possible use of the many available cores. The optimisations will be tested over the first months of 2017, and support for additional cell types and behaviours will be added. The next step will then be to extend the system to run in a cloud-computing environment, making it possible to harness many thousands of computer processors to simulate very large biological structures.