The incubator contains papers that are in the review phase prior to acceptance by the journal.
Anyone can access the article and the Git repository containing the reproducible workflow and provide feedback.
Please see the guidelines for reviewers for details on the process.
Articles in the Incubator
The following articles are currently in the incubator:
- performance analysis
- time series
- job analysis
One goal of support staff at a data center is to identify inefficient jobs and to improve their efficiency. Therefore, a data center deploys monitoring systems that capture the behavior of the executed jobs. While it is easy to use statistics to rank jobs by their utilization of computing, storage, and network resources, it is difficult to find patterns in 100,000 jobs, e.g., to determine whether there is a class of jobs that is not performing well. Similarly, when support staff investigate a specific job in detail, e.g., because it is inefficient or highly efficient, it is relevant to identify jobs related to such a blueprint. This allows staff to better understand the usage of the exhibited behavior and to assess the optimization potential.
In this paper, we describe a methodology to identify jobs related to a reference job based on their temporal I/O similarity. Practically, we apply several previously developed time series algorithms and also utilize the Kolmogorov-Smirnov test to compare the distributions of the metrics. A study is conducted to explore the effectiveness of the approach by investigating related jobs for three reference jobs. The data stems from DKRZ’s supercomputer Mistral and includes more than 500,000 jobs that were executed during more than 6 months of operation. Our analysis shows that the strategy and algorithms are effective at identifying similar jobs and reveals interesting patterns in the data. It also shows the need for the community to jointly define the semantics of similarity depending on the analysis purpose.
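To illustrate the distribution-based comparison mentioned in the abstract, the following is a minimal sketch of ranking jobs against a reference job with the two-sample Kolmogorov-Smirnov statistic. It is not the authors' implementation: the function names, the job identifiers, the sample values, and the choice of defining similarity as 1 minus the KS statistic are all assumptions made for illustration.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    empirical CDFs of samples a and b (0 = identical, 1 = disjoint)."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    sa, sb = sorted(a), sorted(b)
    # The gap can only change at observed values, so checking those suffices.
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )


def rank_by_similarity(reference, candidates):
    """Return (job_id, similarity) pairs, most similar first.

    Similarity is defined here as 1 - KS statistic (an assumption of this
    sketch, not a definition taken from the paper).
    """
    scored = [
        (job_id, 1.0 - ks_statistic(reference, samples))
        for job_id, samples in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical per-interval values of one I/O metric for a reference job
# and two candidate jobs.
reference = [0, 0, 5, 50, 55, 5, 0, 0]
candidates = {
    "job_a": [0, 5, 0, 55, 50, 0, 5, 0],        # same values, other order
    "job_b": [20, 22, 19, 21, 20, 23, 20, 21],  # steady I/O
}
ranking = rank_by_similarity(reference, candidates)
```

In this sketch, job_a ranks first because the KS statistic compares only the distribution of values and ignores their order in time. A purely distribution-based comparison therefore cannot distinguish jobs that differ only in when their I/O occurs, which is one reason to complement it with time series algorithms.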