NetLogger: A toolkit for distributed system performance tuning and debugging
Developers and users of high-performance distributed systems often observe performance problems such as unexpectedly low throughput or high latency. Determining the source of the performance problems requires detailed end-to-end instrumentation of all components, including the applications, operating systems, hosts, and networks. In this paper we describe a methodology that enables the real-time diagnosis of performance problems in complex high-performance distributed systems. The methodology includes tools for generating timestamped event logs that can be used to provide detailed end-to-end application and system-level monitoring, and tools for visualizing the log data and real-time state of the distributed system. This methodology, called NetLogger, has proven invaluable for diagnosing problems in networks and in distributed systems code. This approach is novel in that it combines network, host, and application-level monitoring, providing a complete view of the entire system. NetLogger is designed to be extremely lightweight, and includes a mechanism for reliably collecting monitoring events from multiple distributed locations. This technical report summarizes the most important points of several previous papers on NetLogger, and is meant to be used as a general overview.
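The idea of timestamped event logs described in the abstract can be illustrated with a minimal sketch. This is not the NetLogger API or its log format: the function names (`make_event`, `format_event`, `stage_latencies`), the `key=value` line layout, and the `block_id` field are all hypothetical, chosen only to show how events emitted at several points along a request's path might be correlated to find where time is spent. It also assumes the clocks on the participating hosts are synchronized closely enough for cross-host differences to be meaningful.

```python
import time


def make_event(name, host, prog, clock=time.time, **fields):
    """Return one timestamped event as a flat dict (hypothetical layout)."""
    event = {"ts": clock(), "host": host, "prog": prog, "event": name}
    event.update(fields)  # extra fields, e.g. an identifier to correlate on
    return event


def format_event(event):
    """Serialize an event as one line of space-separated key=value pairs."""
    return " ".join(f"{key}={value}" for key, value in event.items())


def stage_latencies(events, key="block_id"):
    """Group events by `key` and report the time between consecutive events."""
    by_id = {}
    for ev in sorted(events, key=lambda e: e["ts"]):
        by_id.setdefault(ev[key], []).append(ev)
    return {
        ident: [
            (a["event"], b["event"], b["ts"] - a["ts"])
            for a, b in zip(evs, evs[1:])
        ]
        for ident, evs in by_id.items()
    }


if __name__ == "__main__":
    # Events that would normally be logged by different components.
    log = [
        make_event("client.send", "hostA", "client", block_id=1),
        make_event("server.recv", "hostB", "server", block_id=1),
        make_event("server.reply", "hostB", "server", block_id=1),
    ]
    for ev in log:
        print(format_event(ev))
    print(stage_latencies(log))
```

Sorting by timestamp and pairing consecutive events that share an identifier gives a per-stage latency breakdown, which is the kind of end-to-end view the abstract attributes to combining application, host, and network monitoring.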
- Research Organization:
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- DE-AC02-05CH11231
- OSTI ID:
- 924785
- Report Number(s):
- LBNL-51276; R&D Project: UNKNOWN; BnR: KJ0102000; TRN: US200811%%180
- Country of Publication:
- United States
- Language:
- English
Similar Records
The NetLogger Toolkit V2.0
Final Scientific/Technical Report - Sensor Indoor Location Network for Smart Airport Terminal Management