This phase tackles ambitious challenges covering the most critical needs of ICT infrastructures. These needs were identified through a cross-community consultation exercise at the end of CERN openlab’s fourth phase, involving a series of technical meetings and in-depth discussions with representatives of diverse research organisations. This exercise resulted in the publication of the ‘CERN openlab Whitepaper on Future IT Challenges in Scientific Research’.
Data Acquisition (online)
Existing and emerging large-scale research projects are producing increasingly large amounts of data at ever faster rates. The quantity of data and the rate at which it is produced are expected to increase further as new technologies and instruments are deployed. Projects in different scientific disciplines use a wide variety of instruments, including sensors, detectors (such as the LHC experiments’ detectors at CERN), high-throughput genome sequencers, X-ray free-electron lasers, satellite imaging devices, and radio telescopes or antennas. While these instruments have specialised capabilities for their particular fields of application, they all share the need to reliably transform physical or chemical processes into digital data and to support complex chains of systems that filter, store, and analyse the data in real time.
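As a rough illustration of such a chain, the sketch below implements a toy filter stage in Python: raw events are read from a stream, a simple threshold decision stands in for a trigger, and only accepted events are passed on to storage. All names, the event structure, and the threshold are hypothetical and not drawn from any real acquisition system.

```python
# Minimal sketch of a real-time acquisition chain: read raw events,
# apply a threshold-based filter (a stand-in for a trigger decision),
# and pass only accepted events to storage. All names and values are
# hypothetical illustrations, not any experiment's actual system.
from dataclasses import dataclass
from typing import Iterable, Iterator

@dataclass
class RawEvent:
    event_id: int
    energy: float  # arbitrary units from a hypothetical detector channel

def filter_events(events: Iterable[RawEvent], threshold: float) -> Iterator[RawEvent]:
    """Keep only events whose measured energy exceeds the threshold."""
    for event in events:
        if event.energy > threshold:
            yield event

def store(event: RawEvent) -> None:
    """Placeholder for the storage stage of the chain."""
    print(f"storing event {event.event_id} (energy={event.energy:.1f})")

if __name__ == "__main__":
    # A toy stream of events standing in for the instrument's raw output.
    stream = (RawEvent(i, energy=float(i % 97)) for i in range(1000))
    for accepted in filter_events(stream, threshold=90.0):
        store(accepted)
```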
Innovation Management and Entrepreneurship
Discoveries in fundamental physics require constant technological innovation in such diverse fields as computing, electronics, materials science, and industrial sensing and control systems. CERN provides a stimulating environment where talented minds from multiple disciplines can meet to make such innovation happen. |
Computing Platforms (offline)
The success of existing and future scientific and experimental programmes depends, among other factors, on efficient exploitation of recent and future advances in computing technology. Computing power is needed for a number of critical tasks. The data produced by scientific detectors and instruments (possibly filtered by the data acquisition systems) needs to be processed and analysed to determine whether it represents a significant event or physical phenomenon, or to produce aggregated, consolidated, or derived information. The data must also be reprocessed when new algorithms are defined or the instruments are recalibrated, and simulations of the instruments’ behaviours and properties have to be performed to provide reference values against which the real data can be compared.
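The comparison between real data and simulation mentioned above can be illustrated with a minimal, hedged sketch: two samples are binned into histograms and a simple chi-square-style discrepancy is computed. The samples, binning, and distributions are invented purely for illustration.

```python
# Hedged sketch of the "compare real data against simulation" step:
# bin two samples into histograms and compute a chi-square-like
# discrepancy. The samples and binning here are invented for illustration.
import random

def histogram(values, bin_edges):
    """Count how many values fall into each [edge_i, edge_i+1) bin."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if bin_edges[i] <= v < bin_edges[i + 1]:
                counts[i] += 1
                break
    return counts

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over bins with non-zero expectation."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

if __name__ == "__main__":
    random.seed(0)
    bin_edges = [0, 2, 4, 6, 8, 10]
    real = [random.gauss(5, 2) for _ in range(10_000)]       # stand-in for measured data
    simulated = [random.gauss(5, 2) for _ in range(10_000)]  # stand-in for simulation output
    print("chi-square:", chi_square(histogram(real, bin_edges),
                                    histogram(simulated, bin_edges)))
```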
Data Storage Architectures |
The storage and management of LHC data is one of the most crucial and demanding activities in the LHC computing infrastructure, both at CERN and at the many collaborating sites within the Worldwide LHC Computing Grid (WLCG). Every year, the four large-scale LHC experiments produce tens of petabytes of data, which need to be reliably stored for analysis at the CERN Data Centre and at many partner sites in the WLCG. Today, most physics data is still stored using custom storage solutions developed for this purpose within the HEP community. As user demands grow in terms of data volume and aggregate speed of data access, CERN and its partner institutes are continuously investigating new technological solutions to provide their user communities with more scalable and efficient storage. At the same time, CERN closely follows the larger market trends on the commercial side and continuously evaluates new solutions for the physics use cases, so as to be ready to adopt them as soon as they have matured sufficiently for deployment at large scale.
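As a back-of-the-envelope illustration of the scale involved, the sketch below converts an assumed yearly volume of 50 PB (a round number in the ‘tens of petabytes’ range) into the average sustained ingest rate a storage system would need to absorb; the figure is illustrative, not an official CERN number.

```python
# Back-of-the-envelope sketch relating a yearly data volume to the sustained
# ingest rate a storage system must absorb. The 50 PB/year figure is an
# illustrative round number, not an official statistic.
PB = 10 ** 15                      # bytes in a petabyte (decimal convention)
yearly_volume_bytes = 50 * PB
seconds_per_year = 365 * 24 * 3600

average_rate = yearly_volume_bytes / seconds_per_year   # bytes per second
print(f"average sustained ingest: {average_rate / 10**9:.2f} GB/s")
# Peak rates during data taking are substantially higher than this average,
# which is one reason storage architectures are sized well beyond it.
```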
Compute Management and Provisioning |
European scientific research has benefited over the past several years from the increasing availability of computing and data infrastructures that provide unprecedented capabilities for large-scale distributed scientific initiatives. A number of major projects and endeavours, such as EGI, PRACE, WLCG, and OSG (in the USA), have been established to share the ever-growing amount of computational and storage resources. This collaborative effort has involved hundreds of participating research organisations, academic institutes, and commercial companies. The major outcome has been a number of active production infrastructures providing services to many research communities, including HEP, life sciences, materials science, astronomy, computational chemistry, environmental science, the humanities, and more.
Networks and Connectivity |
Most of the CERN infrastructure is controlled and managed over a pervasive IP network. Safety and access controls for the accelerator complex use communication channels over IP, with robots even being used to carry out remote inspections of dangerous areas and to give feedback over Wi-Fi and GSM IP networks. In total, over 50,000 devices are connected to the CERN network, which comprises roughly 5,000 km of optical fibre out of a total of 40,000 km for all networks on CERN sites. This whole network is operated and monitored by the CERN Network Operation Centre.
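As a purely illustrative sketch of the kind of reachability check a network operation centre might automate, the snippet below attempts a TCP connection to each device in a list and reports which ones respond. The hostnames and port are hypothetical; production monitoring would rely on SNMP or dedicated tooling rather than this simplistic approach.

```python
# Minimal reachability-monitoring sketch: attempt a TCP connection to each
# device on an assumed management port and report which ones respond.
# Hostnames and the port are hypothetical, chosen only for illustration.
import socket

DEVICES = ["device-01.example.org", "device-02.example.org"]  # hypothetical names
PORT = 22           # assumed management port for the illustration
TIMEOUT_S = 2.0

def is_reachable(host: str, port: int, timeout: float) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for device in DEVICES:
        status = "up" if is_reachable(device, PORT, TIMEOUT_S) else "unreachable"
        print(f"{device}: {status}")
```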
Data Analytics |
Over the past decades, CERN and other international research laboratories have been gathering not only enormous amounts of scientific data, but also very large quantities of systems-monitoring data from their instruments. Curating, enriching, and managing this data would enable its exploitation. Added value could be obtained in terms of increasing knowledge of the engineering systems, enabling better delivery of services to the scientific community, and helping appropriate decisions to be taken during the lifecycles of the systems and instruments.
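One simple, hedged example of exploiting such monitoring data is sketched below: readings are grouped per device, and a device is flagged when its latest reading deviates strongly from its own historical mean. The records and the three-sigma threshold are invented for illustration.

```python
# Hedged sketch of one basic use of systems-monitoring data: aggregate
# readings per device and flag devices whose latest reading deviates
# strongly from their own history. Data and threshold are invented.
import statistics
from collections import defaultdict

# (device, reading) pairs standing in for collected monitoring records.
records = [
    ("ups-1", 21.0), ("ups-1", 21.4), ("ups-1", 29.8),
    ("fan-7", 1200.0), ("fan-7", 1190.0), ("fan-7", 1210.0),
]

history = defaultdict(list)
for device, value in records:
    history[device].append(value)

for device, values in history.items():
    if len(values) < 3:
        continue                                   # not enough history to judge
    past, latest = values[:-1], values[-1]
    mean = statistics.mean(past)
    stdev = statistics.stdev(past)
    if stdev and abs(latest - mean) > 3 * stdev:   # simple three-sigma rule
        print(f"{device}: latest reading {latest} deviates from mean {mean:.1f}")
```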