
- IEEE TRANSACTIONS ON COMPUTERS: Adaptive Fault Management of Parallel
- Building a Fault-Aware Computing Environment
- System Log Pre-processing to Improve Failure Prediction Ziming Zheng, Zhiling Lan
- IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS: Toward Automated Anomaly Identification in
- IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS: Fault-Aware Runtime Strategies for High
- Proc. of the 38th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)
- Enhancing Application Robustness through Adaptive Fault Tolerance Zhiling Lan, Yawei Li, Ziming Zheng, and Prashasta Gujrati
- Proc. of International Conference on Parallel Processing (ICPP'07) A Meta-Learning Failure Predictor for Blue Gene/L Systems
- In the Proc. of International Conference on Parallel Processing (ICPP'07) Fault-Driven Re-Scheduling For Improving System-level Fault Resilience
- Failure-Aware Resource Selection for Grid Computing Zhiling Lan and Yawei Li
- A Novel Workload Migration Scheme for Heterogeneous Distributed Yawei Li and Zhiling Lan
- FENCE: Fault awareness ENabled Computing Environment
- Adaptive Fault Management for High Performance Computing
- Building a Fault-Aware Computing Environment for High End Computing
- Analyzing and Adjusting User Runtime Estimates to Improve Job Scheduling on the Blue Gene/P
- Fault-Aware, Utility-Based Job Scheduling on Blue Gene/P Systems
- Reducing Fragmentation on Torus-Connected Supercomputers Wei Tang, Zhiling Lan, Narayan Desai, Daniel Buettner, Yongen Yu
- A Study of Dynamic Meta-Learning for Failure Prediction in Large-Scale Systems
- Exploit Failure Prediction for Adaptive Fault-Tolerance in Cluster Yawei Li and Zhiling Lan
- Evaluating Performance and Scalability of Advanced Accelerator Simulations Jungmin Lee
- Anomaly Localization in Large-Scale Clusters Ziming Zheng, Yawei Li and Zhiling Lan
- Performance under Failures of DAG-based Parallel Computing
- A Practical Failure Prediction with Location and Lead Time for Blue Gene/P Ziming Zheng
- Performance Emulation of Cell-based AMR Cosmology Simulations, Roberto E. Gonzalez
- Co-analysis of RAS Log and Job Log on Blue Gene/P Ziming Zheng, Li Yu, Wei Tang, Zhiling Lan
- Practical Online Failure Prediction for Blue Gene/P: Period-based vs Event-driven