Extreme-scale Scientific Data Stores

Experiments at the frontiers of particle physics are beginning to build scientific data stores at scales of hundreds of petabytes. Argonne is at the forefront of this effort, providing leadership in the design and development of the underlying input/output (I/O), data persistence, and metadata infrastructure needed to support science at such data scales.

The Large Hadron Collider at CERN, the European Organization for Nuclear Research, near Geneva and Lac Leman. In the foreground, the red ring indicates the location of the tunnel, which is 100 meters (328 feet) below ground. The ATLAS site is at the 2 o'clock position. The French Alps, with Mont Blanc, can be seen in the background. (Image credit: CERN 2001)

The Large Hadron Collider (LHC) is arguably the largest scientific instrument in the world. Located just outside Geneva, Switzerland, it collides high-energy proton beams traveling at nearly the speed of light in a ring that is 27 kilometers (16.8 miles) in circumference and 100 meters (328 feet) underground. By recreating on a small scale conditions near those of the Big Bang, the LHC is designed to illuminate some of the most important and intriguing questions in contemporary physics, including the origin of mass; physics beyond the Standard Model; the makeup of the early universe; supersymmetry and the nature of dark matter and dark energy; and the possibility of extra, "hidden" dimensions beyond those encountered in everyday experience.

ATLAS is one of two general-purpose detectors built to address this ambitious research agenda. The ATLAS detector is, in fact, a collection of almost 100 heterogeneous detectors organized into half a dozen subsystems. Together, they are the size of a five-story building and reflect a variety of designs and purposes, including tracking, particle identification, calorimetry, muon spectrometry, luminosity measurement, and more. More than 3,000 PhD physicists from 174 universities and research institutes in 38 countries around the globe are members of the ATLAS collaboration.

Installing the ATLAS calorimeter. The eight toroidal magnets can be seen with the calorimeter before it is moved into the middle of the detector. This calorimeter will measure the energies of particles produced when protons collide in the center of the detector. (Image credit: ATLAS Experiment ©2012 CERN)

For more than a decade, Argonne has held coordination responsibility within the international collaboration for the globally distributed, 100-petabyte ATLAS data store, I/O framework, and metadata infrastructure that support this endeavor. Argonne leads an international team of developers responsible for the design and delivery of software to ensure that ATLAS physicists can efficiently and robustly identify, select, navigate to, and access data of interest in a distributed store at such scales, and can also understand their data's provenance. Because the LHC will run for decades, Argonne researchers are working to ensure that the ATLAS collaboration has the flexibility to evolve its data model and infrastructure while maintaining backward compatibility. They are also adapting ATLAS software to take advantage of emerging software and hardware technologies, including new chip designs and increasingly multicore processors.

