Scalability and interoperability within glideinWMS
Physicists have access to thousands of CPUs in grid federations such as OSG and EGEE. With the start-up of the LHC, it is essential for individuals or groups of users to aggregate available resources from multiple sites across multiple grids under a higher user-controlled layer in order to provide a homogeneous pool of available resources. One such system is glideinWMS, which is based on the Condor batch system. A general discussion of glideinWMS can be found elsewhere. Here, we focus on recent advances in extending its reach: scalability and integration of heterogeneous compute elements. We demonstrate that the new developments exceed the design goal of 10,000 simultaneously running jobs under a single Condor schedd, using strong security protocols across global networks, and sustaining a steady-state job completion rate of a few Hz. We also show interoperability across heterogeneous compute elements achieved using client-side methods. We discuss this technique and the challenges in direct access to NorduGrid and CREAM compute elements, in addition to Globus-based systems.
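The client-side interoperability described in the abstract corresponds to the standard Condor-G grid-universe submission mechanism, in which the same submit-side code dispatches glidein pilots to Globus, CREAM, and NorduGrid compute elements by varying only the grid_resource specification. The sketch below is a minimal illustration using the present-day HTCondor Python bindings; the CE endpoints, queue names, and pilot script path are hypothetical placeholders, and the production glideinWMS factory uses its own submission machinery rather than this exact code.

```python
import htcondor

# Hypothetical CE endpoints: one entry per grid flavour mentioned in the paper.
# The value strings follow the standard Condor-G grid-universe grid_resource syntax.
ce_endpoints = {
    "gt2":       "ce.example.org/jobmanager-pbs",                                     # Globus GRAM gatekeeper
    "cream":     "https://cream.example.org:8443/ce-cream/services/CREAM2 pbs long",  # CREAM CE, batch system, queue
    "nordugrid": "arc.example.org",                                                   # NorduGrid ARC front-end
}

schedd = htcondor.Schedd()  # local schedd acting as the single submission point

for ce_type, resource in ce_endpoints.items():
    submit = htcondor.Submit({
        "universe":      "grid",
        "grid_resource": f"{ce_type} {resource}",
        "executable":    "glidein_startup.sh",   # pilot script that starts a Condor startd on the worker node
        "output":        f"glidein_{ce_type}.out",
        "error":         f"glidein_{ce_type}.err",
        "log":           f"glidein_{ce_type}.log",
    })
    schedd.submit(submit)   # identical client-side call regardless of CE flavour
```

Because the only per-CE difference is the grid_resource string, the heterogeneity of the underlying compute elements remains hidden from the rest of the Condor pool.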
- Research Organization: Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC02-07CH11359
- OSTI ID: 986995
- Report Number(s): FERMILAB-CONF-10-258-CD; TRN: US1006586
- Journal Information: J. Phys. Conf. Ser. 219:062036, 2010
- Conference: 17th International Conference on Computing in High Energy and Nuclear Physics (CHEP 09), Prague, Czech Republic, 21-27 Mar 2009
- Country of Publication: United States
- Language: English