LHC Recorded Over 75 Petabytes Before Today's Shutdown
The Large Hadron Collider (LHC) has generated some 75 petabytes of physics data over the past three years, up to today's (February 14) maintenance shutdown. Computer engineers at CERN, home of the world's first web server, announced today that the CERN Data Centre has passed the 100-petabyte mark for data recorded from all experiments over the last 20 years.
One hundred petabytes equals 100,000 terabytes, or 100 million gigabytes, and storing that much data is a challenge. Most of it, about 88 petabytes, is archived on tape using the CERN Advanced Storage system (CASTOR); the remainder resides on the 13-petabyte EOS disk pool, a system optimized for fast analysis access by many concurrent users.
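The unit conversions and the tape/disk split quoted above can be checked with a few lines of Python. This is only an illustrative sketch: it assumes decimal (SI) units, where 1 PB = 1,000 TB and 1 TB = 1,000 GB, and the function and variable names are made up for this example.

```python
# Assumption: decimal (SI) storage units, matching the article's figures.
TB_PER_PB = 1_000
GB_PER_TB = 1_000


def pb_to_tb(petabytes):
    """Convert petabytes to terabytes."""
    return petabytes * TB_PER_PB


def pb_to_gb(petabytes):
    """Convert petabytes to gigabytes."""
    return petabytes * TB_PER_PB * GB_PER_TB


# Figures quoted in the article (both are rounded, so they sum to 101).
tape_pb = 88   # archived on tape via CASTOR
disk_pb = 13   # EOS disk pool

total_tb = pb_to_tb(100)
total_gb = pb_to_gb(100)
tape_share = tape_pb / (tape_pb + disk_pb)

print(f"100 PB = {total_tb:,} TB = {total_gb:,} GB")
print(f"Share of data on tape: {tape_share:.0%}")
```

The two quoted figures add up to 101 rather than exactly 100 petabytes because each is rounded; the tape share works out to a little under nine tenths of the total.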
"We have eight robotic tape libraries distributed over two buildings, and each tape library can contain up to 14,000 tape cartridges," says German Cancio Melia of the CERN IT department. "We currently have around 52,000 tape cartridges with a capacity ranging from one terabyte to 5.5 terabytes each. For the EOS system, the data are stored on over 17,000 disks attached to 800 disk servers."
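As a rough cross-check of the numbers in the quote, the sketch below works out the total cartridge slots and the lower and upper bounds on tape capacity. It assumes decimal units (1 PB = 1,000 TB); the article does not give the actual mix of cartridge sizes, so only the bounds can be computed, and all variable names are illustrative.

```python
# Figures taken from the quote above.
libraries = 8
slots_per_library = 14_000
cartridges = 52_000
min_tb_per_cartridge = 1.0
max_tb_per_cartridge = 5.5
disks = 17_000
disk_servers = 800

TB_PER_PB = 1_000  # assumption: decimal (SI) units

# Maximum number of cartridges the eight libraries could hold.
max_slots = libraries * slots_per_library

# Capacity bounds if every cartridge were the smallest or the largest size.
min_capacity_pb = cartridges * min_tb_per_cartridge / TB_PER_PB
max_capacity_pb = cartridges * max_tb_per_cartridge / TB_PER_PB

# Average number of disks attached to each EOS disk server.
disks_per_server = disks / disk_servers

print(f"Cartridge slots available: {max_slots:,}")
print(f"Tape capacity range: {min_capacity_pb:.0f} to {max_capacity_pb:.0f} PB")
print(f"Disks per server (average): {disks_per_server:.2f}")
```

The roughly 88 petabytes held on tape sits between these two bounds, which is consistent with a mix of older low-capacity and newer high-capacity cartridges.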
Not all the information was generated by LHC experiments. "CERN IT hosts the data of many other high-energy-physics experiments at CERN, past and current," says Dirk Duellmann of the IT department.
Tapes are checked regularly to make sure they stay in good condition and are accessible to users. To optimize storage space, the complete archive is regularly migrated to the newest high-capacity tapes.
The Data Centre will stay active and busy during the Long Shutdown of the whole accelerator complex that begins today, because scientific analysis of all the data recorded during the LHC's first three-year run continues. Upgrades are not limited to the accelerators, either: the Data Centre also needs to prepare for the higher data flows expected when the higher-energy accelerators and experiments start up again. An extension of the Centre and the use of a remote data centre in Hungary will further increase the Data Centre's capacity.
In addition to these internal extensions and the new remote data centre in Hungary, the CERN Data Centre is currently part of a pilot project to build a European research cloud computing network called Helix Nebula. The project was born out of a unifying vision, and over the longer term it is hoped that commercial cloud resources could become a useful addition to the very large data centres owned and managed by the scientific community in Europe.