The main objective of this set of projects is to support the research and development (R&D) activities regarding the implementation of the CERN Data Analytics as a Service infrastructure (DAaaS). This infrastructure aims to (1) integrate the existing analytics developments; (2) centralise and standardise the complex data analytics needs of CERN’s research and engineering community; (3) deliver real-time and batch data analytics and information-discovery capabilities; (4) offer storage for large data volumes, both structured and unstructured; and (5) provide transparent access and Extract, Transform and Load (ETL) mechanisms to the various mission-critical existing data repositories.
The control systems used by CERN’s technical infrastructures produce enormous amounts of data related to both the systems they control and their own internal state. This project focuses on ways to handle these large datasets and extract insights that can lead to improved operational efficiency. The work is arranged into two main areas:
In the area of WinCC OA, work is being carried out to develop a generic archiver into which different storage systems can be plugged. For the analysis of the stored data, work is being carried out to improve the detection of faulty sensor measurements, to better measure the performance of control processes, and to develop a new alarm system for flooding detection.
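As a rough illustration of the pluggable-archiver idea, the sketch below defines a minimal back-end interface and a trivial in-memory implementation; all class, method, and datapoint names here are assumptions for illustration only, not the actual WinCC OA archiver API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class ArchiverBackend(ABC):
    """Hypothetical interface a storage system would implement in order to
    plug into a generic archiver (illustrative only, not the WinCC OA API)."""

    @abstractmethod
    def write(self, datapoint: str, timestamp: float, value: Any) -> None:
        """Persist a single archived value."""

    @abstractmethod
    def query(self, datapoint: str, start: float, end: float) -> List[Dict]:
        """Return archived values for a datapoint within a time range."""


class InMemoryBackend(ArchiverBackend):
    """Trivial back-end, shown only to illustrate how implementations plug in."""

    def __init__(self) -> None:
        self._rows: List[Dict] = []

    def write(self, datapoint: str, timestamp: float, value: Any) -> None:
        self._rows.append({"dp": datapoint, "ts": timestamp, "value": value})

    def query(self, datapoint: str, start: float, end: float) -> List[Dict]:
        return [r for r in self._rows
                if r["dp"] == datapoint and start <= r["ts"] <= end]


class GenericArchiver:
    """Routes archive requests to whichever back-end has been plugged in."""

    def __init__(self, backend: ArchiverBackend) -> None:
        self.backend = backend

    def archive(self, datapoint: str, timestamp: float, value: Any) -> None:
        self.backend.write(datapoint, timestamp, value)


# Example usage with a made-up datapoint name.
archiver = GenericArchiver(InMemoryBackend())
archiver.archive("cooling/valve_12/position", 1456000000.0, 42.5)
print(archiver.backend.query("cooling/valve_12/position", 0, 2e9))
```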
The operation of the IT infrastructure at CERN relies on significant and continuous streams of monitoring and logging data, which are aggregated and stored in central repositories. The main repositories are based on a Hadoop cluster deployed on commodity hardware. This gives hardware experts, system administrators, and service managers a convenient framework for large-scale data processing using Apache Spark.
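By way of a hedged example, a typical large-scale processing task on such a cluster might roll up log volumes per host and per day with Spark; the PySpark sketch below assumes newline-delimited JSON log records with hypothetical hostname and timestamp fields, and the HDFS path is made up.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Minimal PySpark sketch: count monitoring/log records per host and per day.
# The HDFS path and the field names (hostname, timestamp) are assumptions.
spark = SparkSession.builder.appName("monitoring-log-rollup").getOrCreate()

logs = spark.read.json("hdfs:///monitoring/logs/*.json")

daily_counts = (
    logs
    .withColumn("day", F.to_date(F.col("timestamp")))   # assumes ISO-formatted timestamps
    .groupBy("hostname", "day")
    .count()
    .orderBy(F.desc("count"))
)

daily_counts.show(20)
spark.stop()
```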
The topic under study is whether an alternative Hadoop deployment on a cluster with a low-latency RapidIO interconnect can provide sufficient throughput for useful near-line processing of the accumulated IT monitoring and logging data.
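One way to frame that question is to time the same Spark job on both deployments and compare the record-processing rate; the sketch below is a minimal timing harness under that assumption, with a placeholder aggregation standing in for the real workload and a made-up input path.

```python
import time
from pyspark.sql import SparkSession


def measure_throughput(input_path: str) -> float:
    """Run a representative job and return records processed per second.
    The input path and the choice of job are placeholders for illustration."""
    spark = SparkSession.builder.appName("interconnect-throughput-probe").getOrCreate()
    start = time.time()
    df = spark.read.json(input_path)
    n_records = df.count()                     # forces a full scan of the input
    df.groupBy("hostname").count().collect()   # shuffle-heavy step that exercises the interconnect
    elapsed = time.time() - start
    spark.stop()
    return n_records / elapsed


# Run the same probe on the commodity-Ethernet and RapidIO clusters and compare the rates.
rate = measure_throughput("hdfs:///monitoring/logs/sample/*.json")
print(f"Processed {rate:.0f} records per second")
```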
Data collected by the LHCb experiment is stored in the form of multiple datasets (files) on tapes and disks in the LHCb data storage grid. The storage systems within this grid vary in cost, energy consumption, and access speed. The goal of this project is to design, develop, and deploy a ‘data popularity estimator service’ that would analyse the usage history of each dataset, predict future usage patterns, and provide an optimal scheme for data placement and movement.
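A simple baseline for such an estimator is an exponentially weighted count of past accesses per dataset, mapped to a suggested storage tier; the sketch below illustrates this idea, and the half-life, thresholds, tier names, and dataset identifiers are arbitrary assumptions rather than values used by LHCb.

```python
import math

HALF_LIFE_DAYS = 30.0                      # assumed horizon over which popularity decays
DECAY = math.log(2) / HALF_LIFE_DAYS


def popularity(access_days_ago):
    """Exponentially weighted access count: recent reads contribute more."""
    return sum(math.exp(-DECAY * d) for d in access_days_ago)


def suggest_tier(score):
    """Map a popularity score to a storage tier (thresholds are illustrative)."""
    if score >= 5.0:
        return "disk-replicated"   # hot: keep extra disk replicas
    if score >= 0.5:
        return "disk"              # warm: keep a single disk copy
    return "tape"                  # cold: archive to tape only


# Made-up usage histories: days elapsed since each recorded access of a dataset.
histories = {"dataset_A": [1, 2, 2, 7, 30],
             "dataset_B": [200, 420]}

for name, days in histories.items():
    score = popularity(days)
    print(f"{name}: score={score:.2f} -> {suggest_tier(score)}")
```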
Ensuring data quality is essential for the LHCb experiment. Checks are done in several steps, both offline and online. Monitoring is based on continuous comparison of histograms with references, which have to be regularly updated by experts. The aim of this project is to create a novel, autonomous data-collection monitoring service that is capable of identifying deviations from normal operational modes. It will also help the personnel responsible for data-quality monitoring to explore the underlying reasons for such deviations, thus reducing the amount of ‘spoiled data’ that may erroneously be stored for further analysis.
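As an illustration of the histogram-comparison step, the sketch below flags a monitored histogram that deviates significantly from its reference using a chi-square test from SciPy; the bin contents and the significance threshold are made-up assumptions, not LHCb values or the project’s actual method.

```python
import numpy as np
from scipy.stats import chisquare


def deviates_from_reference(observed, reference, p_threshold=0.01):
    """Flag a histogram whose bin contents differ significantly from the reference.
    The reference is rescaled to the observed total before the chi-square test."""
    observed = np.asarray(observed, dtype=float)
    reference = np.asarray(reference, dtype=float)
    expected = reference * observed.sum() / reference.sum()
    _, p_value = chisquare(f_obs=observed, f_exp=expected)
    return p_value < p_threshold


# Made-up bin contents for illustration only.
reference_hist = [100, 220, 340, 210, 95]
monitored_hist = [90, 230, 120, 205, 100]   # the third bin looks anomalous

if deviates_from_reference(monitored_hist, reference_hist):
    print("Histogram deviates from reference: alert the data-quality shifter")
```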