ATLAS Experiment

David Cameron

European Middleware Initiative/University of Oslo (Oct 2011-)

In October 2011 I started to work for the European Middleware Initiative (EMI), which aims to develop and consolidate several middleware distributions into one product. My main involvement is leading the re-design of the ARC data staging system, which was badly in need of re-development. See the paper Adaptive data management in the ARC Grid middleware or the project's wiki page for more information. I am also part of the team working to merge together data libraries from ARC and gLite into one library, the EMI datalib.

I am still involved in the ATLAS experiment: tailoring ARC to fit its needs, taking shifts to monitor CERN data processing and data export on the Grid, and providing general data management support for ARC ATLAS sites.

Nordic DataGrid Facility/University of Oslo (March 2007 - Sept 2011)

In March 2007 I started to work at the University of Oslo for the Nordic DataGrid Facility (NDGF), which is a collaboration between Nordic countries to share resources using the ARC Grid middleware developed by NorduGrid. The ATLAS physics experiment currently running at CERN is the main user of NorduGrid resources, and so a large part of my work is tailoring ARC toward ATLAS requirements. I am also working on general improvements to ARC, such as the components which deal with movement of input and output data, data caching, and other additions to make ARC more robust to an ever-widening user base and increasing scale of requirements.

I spent 2 years in Oslo, 1 year in Ghana and have been based at CERN since April 2010.

An overview of ATLAS and NorduGrid can be found in Performance of an ARC-enabled computing grid for ATLAS/LHC physics analysis and Monte Carlo production under realistic conditions, presented at CHEP 2009, and in an older paper, Complete Distributed Computing Environment for a HEP Experiment: Experience with ARC-Connected Infrastructure for ATLAS.

Managing ATLAS Data on a Petabyte Scale with DQ2 shows the design and status of the ATLAS data management system.

Complete Distributed Computing Environment for a HEP Experiment: Experience with ARC-Connected Infrastructure for ATLAS gives a summary of the full chain of physics analysis, from simulated data production, to data management on NorduGrid, to physics analysis and results obtained using NorduGrid resources.

ATLAS Distributed Data Management/CERN (Feb 2005 - Jan 2007)

I worked for two years at CERN as a fellow designing and implementing the ATLAS distributed data management system, DQ2.

The vast amounts of data produced by the ATLAS experiment, and the analysis of that data, will be distributed worldwide, so a scalable system is required to move the data around efficiently and reliably. Grid software binds together distributed resources and allows, for example, physics analysis software to run anywhere in the world. However, there needs to be an application-specific layer between ATLAS physics software and the underlying Grid infrastructure, and a way to organise the data in an ATLAS-specific way. For ATLAS, data from many physics events (~1000) in the detector are grouped together by physics properties into data files, and in turn these files are grouped into datasets (~100 files per dataset). Datasets are the unit of data movement. We have developed a set of catalogs which hold information on ATLAS datasets, such as their content and where they are located, and distributed services which move datasets between different sites around the world. This software is called DQ2.
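The grouping described above (files collected into datasets, with separate catalogs for dataset content and dataset location) can be illustrated with a minimal Python sketch. This is not DQ2 code: the class names `Dataset`, `ContentCatalog` and `LocationCatalog`, the function `transfer_dataset`, and the site and file names are all hypothetical, chosen only to show the idea of a dataset as the unit of data movement.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    """A named group of files; the unit of data movement."""
    name: str
    files: List[str] = field(default_factory=list)


class ContentCatalog:
    """Hypothetical catalog mapping a dataset name to its files."""

    def __init__(self):
        self._datasets = {}

    def register(self, dataset):
        self._datasets[dataset.name] = dataset

    def files_in(self, name):
        return list(self._datasets[name].files)


class LocationCatalog:
    """Hypothetical catalog mapping a dataset name to the sites holding it."""

    def __init__(self):
        self._sites = {}

    def add_replica(self, name, site):
        self._sites.setdefault(name, set()).add(site)

    def sites_for(self, name):
        return set(self._sites.get(name, set()))


def transfer_dataset(name, source, destination, content, locations):
    """Move a whole dataset: look up its files, then record the new replica."""
    if source not in locations.sites_for(name):
        raise ValueError(f"{source} holds no replica of {name}")
    moved = content.files_in(name)  # a real service would copy each file here
    locations.add_replica(name, destination)
    return moved


# Example: one dataset of 100 files, replicated from SITE_A to SITE_B.
content = ContentCatalog()
locations = LocationCatalog()
ds = Dataset("example.dataset", [f"file_{i:03d}.root" for i in range(100)])
content.register(ds)
locations.add_replica(ds.name, "SITE_A")
moved = transfer_dataset(ds.name, "SITE_A", "SITE_B", content, locations)
```

Keeping content and location in separate catalogs means a transfer only has to update the location entry; the list of files in the dataset is unchanged by the move.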

The design of DQ2 is described in the CHEP 2007 paper Managing ATLAS Data on a Petabyte Scale with DQ2 and current information can be found on the DQ2 web site.

There is more information on the internal data movement within CERN and export to Tier 1 Grid sites in the HEP 2007 paper ATLAS computing system commissioning: real-time data processing and distribution tests.

PhD/University of Glasgow (Oct 2001 - Jan 2005)

Thesis "Replica Management and Optimisation for Data Grids" completed in January 2005. (ps) (pdf)

After I obtained an MSci degree in Physics and Astronomy from the University of Glasgow in 2001, I joined the Experimental Particle Physics group to study for a PhD. This group is involved in the ScotGrid project, which aims to create a Grid Tier 2 centre mainly for the analysis of physics data from experiments at the Large Hadron Collider at CERN.

Most of my work was in Grid middleware, as part of the data management work package of the European DataGrid, which deals with the management of large amounts of data over global scale networks. As part of the Optimisation task, I helped to design and test algorithms that could be used to control the management of replicas throughout the Grid. A novel replication strategy we tested involved using an economic model where the files are considered as goods in the Grid marketplace.

We built a Grid simulator, called OptorSim, to test various replica optimisation algorithms in a Grid environment before they are used on a real Grid. OptorSim takes a grid configuration and a replica optimisation algorithm as input and then runs a number of grid jobs using the given configuration. It also allows a user to visualise the performance of the algorithm. Some results can be found in the papers below.
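The general shape of such a simulation (a grid configuration and a replication strategy go in, jobs are run, and a performance figure comes out) can be sketched in a few lines of Python. This is a toy illustration only, not OptorSim itself: the function names, the grid and job formats, and the least-recently-used policy are assumptions made for this example, and the economic model mentioned above is not implemented here.

```python
from collections import OrderedDict


def never_replicate(cache, capacity, filename):
    """Baseline strategy: always read remotely, never keep a local copy."""
    return False


def lru_replicate(cache, capacity, filename):
    """Always replicate, evicting least recently used files when full.

    Assumes capacity is at least 1.
    """
    while len(cache) >= capacity:
        cache.popitem(last=False)  # drop the least recently used file
    return True


def simulate(grid, jobs, strategy):
    """Run jobs in order and count remote file reads.

    grid: {site name: storage capacity in files}
    jobs: list of (site name, [files the job needs])
    strategy: callable deciding whether a fetched file is stored locally
    """
    caches = {site: OrderedDict() for site in grid}
    remote_reads = 0
    for site, files in jobs:
        cache = caches[site]
        for f in files:
            if f in cache:
                cache.move_to_end(f)  # mark as recently used
                continue
            remote_reads += 1
            if strategy(cache, grid[site], f):
                cache[f] = True
    return remote_reads


# Hypothetical configuration: two sites, each able to store two files.
grid = {"site_a": 2, "site_b": 2}
jobs = [
    ("site_a", ["f1", "f2"]),
    ("site_a", ["f1", "f3"]),
    ("site_b", ["f1"]),
    ("site_a", ["f3", "f1"]),
]
baseline = simulate(grid, jobs, never_replicate)
with_lru = simulate(grid, jobs, lru_replicate)
```

Passing the strategy as a plain function keeps the job loop unchanged while different replication policies are compared on the same configuration.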

From August 2003 to March 2004 I worked at CERN on other components of the EDG Data Management software. This included testing the functionality and the performance of the Replica Manager, which controls data replication and cataloging, and the underlying services it uses such as the Replica Location Service, the Replica Metadata Catalog and the Replica Optimization Service.

I also looked after the ScotGrid web pages - www.scotgrid.ac.uk - which included developing monitoring tools for web pages to account for local users of the cluster.

 

Publications

Journal Papers

(pdf) Dynamic Data Replication in LCG 2008. C. Nicholson, D. G. Cameron, A. T. Doyle, A. P. Millar and K. Stockinger. Concurrency and Computation: Practice and Experience, 20(11), 1259-1271, 2008.

(pdf) (proof copy) Formal Analysis of an Agent-based Optimisation for Data Grids. D. G. Cameron, R. Carvajal-Schiaffino, C. Nicholson, K. Stockinger, F. Zini, A. P. Millar and L. Serafini. Multiagent and Grid Systems, 2(2) 149-162, 2006

(ps) (pdf) Analysis of Scheduling and Replica Optimisation Strategies for Data Grids Using OptorSim. D. G. Cameron, R. Carvajal-Schiaffino, A. P. Millar, C. Nicholson, K. Stockinger and F. Zini. Journal of Grid Computing, 2(1), March 2004.

(ps) (pdf) OptorSim - A Grid Simulator for Studying Dynamic Data Replication Strategies. W. H. Bell, D. G. Cameron, L. Capozza, A. P. Millar, K. Stockinger and F. Zini. International Journal of High Performance Computing Applications, 17(4), 2003.

Conference Proceedings

(pdf) Adaptive data management in the ARC Grid middleware. D. Cameron, A. Gholami, D. Karpenko and A. Konstantinov. In Computing in High Energy Physics 2010 (CHEP 2010) Taipei, Taiwan, October 2010.

(pdf) Performance of an ARC-enabled computing grid for ATLAS/LHC physics analysis and Monte Carlo production under realistic conditions. B. H. Samset, D. Cameron, M. Ellert, A. Filipcic, M. Gronager, J. Kleist, S. Maffioletti, F. Ould-Saada, K. Pajchel, A. L. Read, A. Taga and the ATLAS Collaboration. In Computing in High Energy Physics 2009 (CHEP 2009) Prague, Czech Republic, April 2009.

(pdf) The Advanced Resource Connector for Distributed LHC Computing. D. Cameron, A. Konstantinov, F. Ould-Saada, K. Pajchel, A. Read, B. Samset, A. Taga. In XII Advanced Computing and Analysis Techniques in Physics Research, Erice, Italy. November 2008.

(pdf) Managing ATLAS Data on a Petabyte Scale with DQ2. M. Lassnig, M. Branco, P. Salgado, V. Garonne, D. Cameron, R. Rocha, B. Gaidioz, B. Koblitz, T. Wenaus. In Computing in High Energy Physics 2007 (CHEP 2007) Victoria, Canada, September 2007.

(pdf) ATLAS DDM Integration in ARC. G. Behrmann, D. Cameron, M. Ellert, J. Kleist and A. Taga. In Computing in High Energy Physics 2007 (CHEP 2007) Victoria, Canada, September 2007.

(pdf) Complete Distributed Computing Environment for a HEP Experiment: Experience with ARC-Connected Infrastructure for ATLAS. A. Read, A. Taga, F. Ould-Saada, K. Pajchel, B. H. Samset, D. Cameron. In Computing in High Energy Physics 2007 (CHEP 2007) Victoria, Canada, September 2007.

(pdf) ATLAS computing system commissioning: real-time data processing and distribution tests. A. Nairz, L. Goossens, M. Branco, D. Cameron, P. Salgado, D. Barberis, K. Bos, G. Poulard. In The 2007 Europhysics Conference on High Energy Physics (HEP 2007), Manchester, UK, July 2007.

(pdf) Storage and Data Management in EGEE. G. Stewart, D. Cameron, G. A. Cowan and G. McCance. In 5th Australasian Symposium on Grid Computing and e-Research (AusGrid 2007), Ballarat, Australia, February 2007.

(ps) (pdf) Replica Management Services in the European DataGrid Project. David Cameron, James Casey, Leanne Guy, Peter Kunszt, Sophie Lemaitre, Gavin McCance, Heinz Stockinger, Kurt Stockinger, Giuseppe Andronico, William Bell, Itzhak Ben-Akiva, Diana Bosio, Radovan Chytracek, Andrea Domenici, Flavia Donno, Wolfgang Hoschek, Erwin Laure, Levi Lucio, Paul Millar, Livio Salconi, Ben Segal and Mika Silander. In e-Science All Hands Meeting, Nottingham, UK, September 2004.

(ps) (pdf) Evaluating Scheduling and Replica Optimisation Strategies in OptorSim. D. G. Cameron, R. Carvajal-Schiaffino, A. P. Millar, C. Nicholson, K. Stockinger and F. Zini. In 4th International Workshop on Grid Computing (Grid2003), Phoenix, USA, November 2003. IEEE Computer Society Press.

(ps) (pdf) UK Grid Simulation with OptorSim. D. G. Cameron, R. Carvajal-Schiaffino, A. P. Millar, C. Nicholson, K. Stockinger and F. Zini. In e-Science All Hands Meeting, Nottingham, UK, September 2003.

(ps) (pdf) Evaluation of an Economy-Based File Replication Strategy for a Data Grid. W. H. Bell, D. G. Cameron, R. Carvajal-Schiaffino, A. P. Millar, K. Stockinger, and F. Zini. In International Workshop on Agent based Cluster and Grid Computing at CCGrid 2003, Tokyo, Japan, May 2003. IEEE Computer Society Press.

(ps) (pdf) Simulation of Dynamic Grid Replication Strategies in OptorSim. W. H. Bell, D. G. Cameron, L. Capozza, A. P. Millar, K. Stockinger, F. Zini. 3rd International Workshop on Grid Computing, Baltimore, USA, November 2002.

Others

(ps) (pdf) Design of a Replica Optimisation Service, WP2 - Data Management EU DataGrid Project. W. H. Bell, D. G. Cameron, L. Capozza, P. Millar, K. Stockinger, F. Zini. Technical Report, CERN 2002.