Monk: opportunistic scheduling to delay horizontal scaling
In modern server computing, efficient CPU resource usage is often traded for latency. Garbage collection is a key aspect of memory management in programming languages like Java, but it often competes with application threads for CPU time, leading to delays in processing requests and consequent increases in latency. This work explores if opportunistic scheduling in ZGC, a fully concurrent garbage collector (GC), can reduce application latency on middle-range CPU utilization, a topical deployment, and potentially delay horizontal scaling.
We implemented an opportunistic scheduling that schedules GC threads during periods when CPU resources would otherwise be idle. This method prioritizes application threads over GC workers when it matters most, allowing the system to handle higher workloads without increasing latency.
Our findings show that this technique can significantly improve performance in server applications. For example, in tests using the SPECjbb2015 benchmark, we observed up to a 15% increase in the number of requests processed within the target 25ms latency. Additionally, applications like Hazelcast showed a mean latency reduction of up to 40% compared to ZGC without opportunistic scheduling.
The feasibility and effectiveness of this approach were validated through empirical testing on two widely used benchmarks, showing that the method consistently improves performance under various workloads.
This work is significant because it addresses a common bottleneck in server performance—how to manage GC without degrading application responsiveness. By improving how GC threads are scheduled, this research offers a pathway to more efficient resource usage, enabling higher performance and better scalability in server applications.