However, I have some concerns about the queue management algorithm. (1) As I understand it, a node notifies the RM as soon as a task completes rather than waiting for the next heartbeat message. Given that latency within a datacenter is negligible, such a design is unlikely to yield much benefit. (2) The queue length is predefined by the master. I think a dynamic queue length would be better, because it can cope with latency spikes and temporary master failures. (3) Likewise, to account for latency variation, different nodes should have different queue lengths. In our work, Sol, we solved a similar problem using dynamic and variable queue lengths.
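
To make points (2) and (3) concrete, here is a minimal sketch of the kind of per-node, dynamic sizing I have in mind. It is an illustration only: it is neither the algorithm in the submission nor the one used in Sol. The class `NodeQueueSizer`, its fields, and the smoothing factor are all hypothetical. The assumption is that each node should hold enough queued tasks to keep its slots busy for one round trip to the master.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeQueueSizer:
    """Hypothetical per-node queue-length estimator (illustrative only)."""
    slots: int             # concurrent task slots on this node
    alpha: float = 0.2     # EWMA smoothing factor (assumed value)
    max_len: int = 64      # upper bound on the queue length (assumed value)
    rtt_s: float = 0.0     # smoothed node-to-master round-trip time, seconds
    task_s: float = 1.0    # smoothed task duration, seconds

    def observe(self, rtt_s: float, task_s: float) -> None:
        # Fold one new measurement into the exponentially weighted averages.
        self.rtt_s = (1 - self.alpha) * self.rtt_s + self.alpha * rtt_s
        self.task_s = (1 - self.alpha) * self.task_s + self.alpha * task_s

    def queue_length(self) -> int:
        # Tasks the node can finish while one request to the master is in flight.
        needed = math.ceil(self.slots * self.rtt_s / max(self.task_s, 1e-9))
        return min(self.max_len, needed)
```

Under this sketch, a node with negligible round-trip time is assigned a queue of length zero or one, which matches concern (1), while a node behind a slow or temporarily unreachable master is assigned a longer queue as its measured round-trip time grows.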