Seminar: Grid Computing with ASG

summer term 2006

Prof. Dr. habil. Andreas Polze
Dipl. Inf. Peter Tröger
Dipl. Inf. Andreas Rasche

A Grid is a massively distributed system that enables resources to be offered, found, and used across multiple administrative domains. Grid Computing addresses questions of availability, authentication, billing, and quality of service in widely distributed systems. Resources from different organizations are combined into a decentralized virtual organization. Grid Computing can be seen as an analogy to the power grid, which delivers its resources to the end user at constant quality, regardless of the provider, through standardized and low-cost interfaces.

The seminar discusses major topics in distributed systems, especially Grid systems, and analyzes their evolution over the last 20 years. We will cover the historical foundations of distributed systems, their practical application in cluster environments, and their current application in grid toolkits.

Each participant must give a 45-minute presentation, followed by a discussion. Two weeks before the presentation, each student should contact their tutor for a preliminary discussion of the presentation material. The presentation must be given in English.

Grading is based not only on the presentation and discussion, but also on practical work (programming, installation, experiments) with a particular distributed system and on a project report (10 to 50 pages). The report must use the hpitr document template and may be written in German.

The presentation time slots are 7 June, 21 June, 28 June, and 12 July.

Remote deployment and execution in distributed systems (Martin Breest)

  • Historical approaches (rsh, ssh)
  • Cluster vs. Grid - problem analysis
  • Cluster systems (Sun Grid Engine, Condor, Cactus, PBS, Legion)
  • Cluster scheduling strategies (backfilling, advance reservation)
  • Monitoring and profiling (NWS, RMON, Ganglia, Paradyn)
  • Grid toolkits (Globus, Unicore, Legion, GridLab)
  • Service grids (WSRF, WS-Management, OGSA, JSDL, DRMAA, GAT)
  • Practical experiments with job submission and grid scheduling (Nimrod, Maui, Community Scheduler, Silver)
  • Simple example program (e.g. povray) applied to different environments, seen from the perspective of a potential user of such systems

Parallel and distributed programming (Francisco Marcano)

  • Roots of parallel and distributed programming (control parallelism, data parallelism, LogP, PRAM, MIMD vs. SIMD, ...)
  • Inter-process communication (RPC, message passing, peer-to-peer, group communication, tuple spaces)
  • RPC and distributed objects (DCE RPC, Sun RPC)
  • Message-oriented communication (PVM, MPI, message queues)
  • Programming languages for distributed systems (HPF, Occam, Parallel C, OpenMP)
  • Toolkits for grid programming (Cactus)
  • Example parallelization of a given algorithm in High Performance Fortran, executed in a cluster testbed (e.g. with MPI over TCP)

Security in distributed systems (Uwe Kylau)

  • Terminology (identification, authentication, authorization, audits, integrity, confidentiality, non-repudiation, imperative vs. role-based security)
  • Kerberos
  • Middleware security (DCE, DCOM, RMI, CORBA, Legion)
  • Grid security (Globus GSI, X.509, PKI, WS-Security)
  • Practical experiments with WS-Security, for usage in ASG / J2EE environment

Scalable and fault-tolerant data management in distributed systems (Nikolai Eipel)

  • Data replication in distributed systems (consistency models, replication strategies, indexing)
  • Historical approaches (Plan 9, Linda)
  • Distributed file systems (NFS, AFS, GFS, SMB, Coda)
  • Distributed data management (EU Data Grid, GridFTP, Grid File System, RLS)

Some literature:

Andreas Polze. Vorhersagbares Rechnen in Multicomputersystemen. Habilitationsschrift, Mathematisch-Naturwissenschaftliche Fakultät der Humboldt-Universität zu Berlin, 2001.

Andreas Savva and Ali Anjomshoaa and Fred Brisard and R Lee Cook and Donal K. Fellows and An Ly and Stephen McGough and Darren Pulsipher. Job Submission Description Language (JSDL) Specification.

Gregory R. Andrews. Paradigms for process interaction in distributed programs. ACM Comput. Surv., 23(1):49–90, 1991.

B. Dreier and M. Zahn and T. Ungerer. Rthreads—A Uniform Interface for Parallel and Distributed Programming. In Proc. of the 2nd Int’l Conference on Massively Parallel Computing Systems (MPCS’96), pages 530–534, 1996.

Henri E. Bal, Jennifer G. Steiner, and Andrew S. Tanenbaum. Programming languages for distributed computing systems. ACM Comput. Surv., 21(3):261–322, 1989.

Bertrand Meyer. Systematic concurrent object-oriented programming. Commun. ACM, 36(9):56–80, 1993.

C. A. R. Hoare. Communicating Sequential Processes. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1985.

C. Lee and S. Matsuoka and D. Talia and A. Sussman and M. Mueller and G. Allen and J. Saltz. A Grid Programming Primer. Advanced Programming Models Research Group, GWD-I (Informational), August 2001.

Chris Steketee. Process Migration and Load Balancing in Amoeba. In Proceedings of the Twenty Second Australasian Computer Science Conference, Auckland, New Zealand, January 18-21 1999. Springer-Verlag, Singapore.

D. Loveman. High Performance Fortran. IEEE Parallel & Distributed Technology, pages 25–42, February 1993.

David E. Culler and Richard M. Karp and David A. Patterson and Abhijit Sahay and Klaus E. Schauser and Eunice Santos and Ramesh Subramonian and Thorsten von Eicken. LogP: Towards a Realistic Model of Parallel Computation. In Principles and Practice of Parallel Programming, pages 1–12, 1993.

David Gelernter. Generative communication in Linda. ACM Trans. Program. Lang. Syst., 7(1):80–112, 1985.

Frank DeRemer and Hans Kron. Programming-in-the-Large Versus Programming-in-the-Small. June 1976.

G. Allen and K. Davis and K. Dolkas and N. Doulamis and T. Goodale and T. Kielmann and A. Merzky and J. Nabrzyski and J. Pukacki and T. Radke and M. Russell and E. Seidel and J. Shalf and I. Taylor. Enabling Applications on the Grid: A GridLab Overview, 2003.

Gabrielle Allen and Werner Benger and Thomas Dramlitsch and Tom Goodale and Hans-Christian Hege and Gerd Lanfermann and Andre Merzky and Thomas Radke and Edward Seidel and John Shalf. Cactus Tools for Grid Applications. Cluster Computing, 4(3):179–188, 2001.

Geoffrey Fox and Dennis Gannon and Mary Thomas. Overview of Grid Computing Environments. In Fran Berman and Geoffrey Fox and Tony Hey, editors, Grid Computing - Making the Global Infrastructure a Reality, pages 543–553. John Wiley & Sons Ltd, Chichester, England, 2003.

George Coulouris and Jean Dollimore and Tim Kindberg. Distributed Systems - Concepts and Design. Addison Wesley, Third edition, 2001.

Han-Ku Lee and Bryan Carpenter and Geoffrey Fox and Sang Boem Lim. HPJava: Programming Support for High-Performance Grid-Enabled Applications.

Henri E. Bal. A Comparative Study of Five Parallel Programming Languages. In EurOpen Spring Conference on Open Distributed Systems, pages 209–228, Tromso, May 1991.

Henri E. Bal and Andrew S. Tanenbaum. Distributed programming with shared data. Comput. Lang., 16(2):129–146, 1991.

High-Performance Fortran: http://www.hpfpc.org/fhpf-E.html , http://www.hpfpc.org/index-E.html

Hrabri Rajic and Roger Brobst and Waiman Chan and Fritz Ferstl and Jeff Gardiner and Andreas Haas and Bill Nitzberg and John Tollefsrud. Distributed Resource Management Application API Specification 1.0. http://forge.ggf.org/projects/drmaa-wg/, 2004.

I. Foster and C. Kesselman. Globus: A Metacomputing Infrastructure Toolkit. The International Journal of Supercomputer Applications and High Performance Computing, 11(2):115–128, Summer 1997.

I. Foster and C. Kesselman and J. Nick and S. Tuecke. The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration, 2002.

Ian Foster. What is the Grid? A Three Point Checklist. Argonne National Laboratory & University of Chicago, July 2002.

Ian Foster and Carl Kesselman, editors. The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, San Francisco, CA, 2004.

Ian Foster and Carl Kesselman and Steven Tuecke. The Anatomy of the Grid: Enabling Scalable Virtual Organizations. Lecture Notes in Computer Science, 2150, 2001.

Ian Foster and Jeffrey Frey and Steve Graham and Steve Tuecke and Karl Czajkowski and Don Ferguson and Frank Leymann and Martin Nally and Igor Sedukhin and David Snelling and Tony Storey and William Vambenepe and Sanjiva Weerawarana. Modeling Stateful Resources with Web Services. IBM DeveloperWorks Whitepaper, March 2004.

Ian Foster. Designing and Building Parallel Programs. Online version of the book

Jarek Nabrzyski and Jennifer M. Schopf and Jan Weglarz. Grid Resource Management. Kluwer Academic Publishers, 2004.

Jason Garman. Kerberos: The Definitive Guide. O'Reilly and Associates, 2003.

K. Seymour and H. Nakada and S. Matsuoka and J. Dongarra and C. Lee and H. Casanova. An Overview of GridRPC: A Remote Procedure Call API for Grid Computing. In 3rd International Workshop on Grid Computing, November 2002.

K.A. Hawick and H.A. James and L.H. Pritchard. Tuple-Space Based Middleware for Distributed Computing. Technical Report 128, Distributed and High-Performance Computing Group, University of Adelaide, Adelaide, Australia, October 2002.

Karl Czajkowski and Donald F. Ferguson and Ian Foster and Jeffrey Frey and Steve Graham and Igor Sedukhin and David Snelling and Steve Tuecke and William Vambenepe. The WS-Resource Framework. http://globus.org/wsrf/specs/ws-wsrf.pdf, May 2004.

Konstantin Berlin and Jun Huan and Mary Jacob and Garima Kochhar and Jan Prins and Bill Pugh and P. Sadayappan and Jaime Spacco and Chau-Wen Tseng. Evaluating the Impact of Programming Language Features on the Performance of Parallel Applications on Cluster Architectures. Lecture Notes in Computer Science, 2958:194–208, January 2004.

Leanne Guy and Peter Kunszt and Erwin Laure and Heinz Stockinger and Kurt Stockinger. Replica Management in Data Grids - GGF5 Working Draft. Technical report, Global Grid Forum, July 2002.

Barbara Liskov, Maurice Herlihy, and Lucy Gilbert. Limitations of synchronous communication with static process structure in languages for distributed computing. In POPL ’86: Proceedings of the 13th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, pages 150–159, New York, NY, USA, 1986. ACM Press.

M. Ben-Ari. Principles of Concurrent Programming. Prentice Hall Professional Technical Reference, 1982.

M. F. Kilian. Can O-O Aid Massively Parallel Programming? In D. B. Johnson and F. Makedon and P. Metaxas, editors, Proceedings of the Dartmouth Institute for Advanced Graduate Study in Parallel Computation Symposium, pages 246–256, 1992.

M. Satyanarayanan and James J. Kistler and Puneet Kumar and Maria E. Okasaki and Ellen H. Siegel and David C. Steere. Coda: A Highly Available File System for a Distributed Workstation Environment. IEEE Transactions on Computers, 39(4):447–459, 1990.

Marty Humphrey. From Legion to Legion-G to OGSI.NET: Object-Based Computing for Grids. In Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003). IEEE Computer Society, April 2003.

M.J. Flynn. Some computer organizations and their effectiveness. IEEE Transactions on Computers, 21:948–960, 1972.

M.J. Litzkow and M. Livny and M.W. Mutka. Condor - A Hunter of Idle Workstations. In Proceedings of the Eighth International Conference on Distributed Computing Systems, pages 104–111, 1988.

M.P. Papazoglou and D. Georgakopoulos. Service-Oriented Computing. Communications of the ACM, 46(10):25–28, October 2003.

MPI Forum. MPI-2: Extensions to the Message-Passing Interface. Technical report, University of Tennessee, Knoxville, Tennessee, July 1997.

Nicholas Carriero and David Gelernter. How to write parallel programs: a guide to the perplexed. ACM Comput. Surv., 21(3):323–357, 1989.

Pankaj Kumar. J2EE Security. Prentice Hall PTR. 2004. ISBN 0-13-140264-1

Percy Mett and David Crowe and Peter Strain-Clark. The specification and design of concurrent systems. McGraw-Hill International Series in Software Engineering. McGraw-Hill Book Company Europe, Berkshire, England, 1994.

R.E. Schantz and D.C. Schmidt. Middleware for Distributed Systems: Evolving the Common Structure for Network-centric Applications. Encyclopedia of Software Engineering, New York, Wiley & Sons, pages 801–813, 2001.

Rich Wolski and Neil T. Spring and Jim Hayes. The network weather service: a distributed resource performance forecasting service for metacomputing. Future Generation Computer Systems, 15(5–6):757–768, 1999.

Rohit Chandra and Ramesh Menon and Leo Dagum and David Kohr and Dror Maydan and Jeff McDonald. Parallel Programming in OpenMP. Morgan Kaufmann, October 2000.

Patricia Gomes Soares. On remote procedure call. In CASCON ’92: Proceedings of the 1992 conference of the Centre for Advanced Studies on Collaborative research, pages 215–267. IBM Press, 1992.

Tom Goodale and Keith Jackson and Stephen Pickles. Simple API for Grid Applications (SAGA) Working Group. Global Grid Forum Recommendation Document Draft.