Treffer: Utilizing Runtime Information for Accurate Root Cause Identification in Performance Diagnosis
Weitere Informationen
This dissertation highlights that existing performance diagnostic tools often become less effective due to their inherent inaccuracies in modern software. To overcome these inaccuracies and effectively identify the root causes of performance issues, it is necessary to incorporate supplementary runtime information into these tools. Within this context, the dissertation integrates specific runtime information into two typical performance diagnostic tools: profilers and causal tracing tools. The integration yields a substantial enhancement in the effectiveness of performance diagnosis. Among these tools, gprof stands out as a representative profiler for performance diagnosis. Nonetheless, its effectiveness diminishes as the time cost calculated based on CPU sampling fails to accurately and adequately pinpoint the root causes of performance issues in complex software. To tackle this challenge, the dissertation introduces an innovative methodology called value-assisted cost profiling (vProf). This approach incorporates variable values observed during runtime into the profiling process. By continuously sampling variable values from both normal and problematic executions, vProf refines function cost estimates, identifies anomalies in value distributions, and highlights potentially problematic code areas that could be the actual sources of performance is- sues. The effectiveness of vProf is validated through the diagnosis of 18 real-world performance is- sues in four widely-used applications. Remarkably, vProf outperforms other state-of-the-art tools, successfully diagnosing all issues, including three that had remained unresolved for over four years. Causal tracing tools reveal the root causes of performance issues in complex software by generating tracing graphs. However, these graphs often suffer from inherent inaccuracies, characterized by superfluous (over-connected) and missed (under-connected) edges. These inaccuracies arise from the diversity of programming paradigms. To mitigate the inaccuracies, the ...