Container clusters play an increasingly important
role in cloud computing for processing dynamic workloads.
The resource manager (i.e., orchestrator) of the cluster automates
the scheduling of dynamic requests and effectively manages
resource utilization across distributed infrastructure.
For many applications, requests to the cluster come
with strict deadlines. Scheduling a container cluster
is challenging, especially when the cluster is large and the
request load changes dynamically. Machine learning-based
approaches such as reinforcement learning (RL) have attracted
considerable research attention in recent years. However, these
approaches suffer from low robustness when the requests in the
operational environment change and differ from the
training data. This paper investigates this problem by quantifying
the robustness and proposing meta-gradient reinforcement
learning to improve the robustness of classical
RL-based approaches. The proposed approach leads
to better deadline guarantees and faster adaptation for time-critical
task scheduling in dynamic environments. We then
empirically test the benefits of our method using both real-world
and synthetic data sets. The evaluation results show that the
proposed method outperforms the baseline RL methods in
both scheduling performance and robustness.
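To make the core idea concrete, below is a minimal, hypothetical sketch of meta-gradient reinforcement learning in PyTorch, following the common formulation of Xu et al. (2018) in which the discount factor is treated as a differentiable meta-parameter. The network sizes, step sizes, and synthetic rollout are illustrative assumptions, not the paper's actual implementation.

# Hypothetical sketch: adapt the discount factor gamma by backpropagating
# a meta-objective through one inner policy-gradient step.
import torch
from torch.func import functional_call

torch.manual_seed(0)
obs_dim, n_actions, inner_lr = 4, 2, 1e-2

policy = torch.nn.Sequential(
    torch.nn.Linear(obs_dim, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, n_actions),
)
gamma = torch.tensor(0.95, requires_grad=True)   # meta-parameter
meta_opt = torch.optim.Adam([gamma], lr=1e-3)

def discounted_returns(rewards, g):
    # Differentiable discounted returns so gradients can flow into gamma.
    acc, out = torch.tensor(0.0), []
    for r in reversed(rewards):
        acc = r + g * acc
        out.append(acc)
    return torch.stack(out[::-1])

def pg_loss(logits, actions, rewards, g):
    logp = torch.distributions.Categorical(logits=logits).log_prob(actions)
    return -(logp * discounted_returns(rewards, g)).mean()

# Fake rollout standing in for scheduler/cluster interaction (assumption).
states = torch.randn(16, obs_dim)
actions = torch.randint(0, n_actions, (16,))
rewards = torch.randn(16)

# Inner step: one differentiable policy-gradient update under current gamma.
params = dict(policy.named_parameters())
inner = pg_loss(policy(states), actions, rewards, gamma)
grads = torch.autograd.grad(inner, list(params.values()), create_graph=True)
updated = {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}

# Outer step: evaluate the updated policy under a fixed reference discount
# and backpropagate through the inner update to adapt gamma itself.
meta = pg_loss(functional_call(policy, updated, (states,)),
               actions, rewards, torch.tensor(0.99))
meta_opt.zero_grad()
meta.backward()
meta_opt.step()
print(f"adapted gamma: {gamma.item():.4f}")

In a full training loop the inner update would also be committed to the policy and the rollout would come from the scheduling environment; only the meta-step that adapts gamma is shown here.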