Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning
These are my reading notes on the OSDI 2021 best paper "Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning".
Design and Architecture
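- PolluxAgent: tunes each job's batch size and learning rate so that the job makes full use of the resources already allocated to it (job-level granularity)
- PolluxSched: adjusts the resources allocated to each job to improve fairness and shorten job completion time (cluster-wide granularity)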
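
To make the job-level side concrete, here is a minimal sketch of how an agent might pick a batch size for a given GPU allocation. It assumes goodput is throughput multiplied by statistical efficiency, and that efficiency has the form (phi + M0) / (phi + M), which is my reading of the paper. The throughput model, the `JobModel` fields, and every function name are hypothetical and only illustrate the idea; this is not Pollux's actual code, and learning rate adaptation is left out.

```python
from dataclasses import dataclass


@dataclass
class JobModel:
    """Hypothetical per-job parameters (illustrative, not Pollux's API)."""
    base_batch_size: int    # M0: the batch size the user originally chose
    noise_scale: float      # phi: assumed estimate of the gradient noise scale
    step_overhead_s: float  # fixed cost per iteration, e.g. gradient sync
    per_sample_s: float     # compute time per sample on one GPU


def throughput(job: JobModel, num_gpus: int, batch_size: int) -> float:
    """Samples per second under a toy model: fixed overhead plus split compute."""
    step_time = job.step_overhead_s + job.per_sample_s * batch_size / num_gpus
    return batch_size / step_time


def efficiency(job: JobModel, batch_size: int) -> float:
    """Progress per sample relative to the base batch size (1.0 at M0)."""
    return (job.noise_scale + job.base_batch_size) / (job.noise_scale + batch_size)


def goodput(job: JobModel, num_gpus: int, batch_size: int) -> float:
    """Useful training progress per second: throughput scaled by efficiency."""
    return throughput(job, num_gpus, batch_size) * efficiency(job, batch_size)


def best_batch_size(job: JobModel, num_gpus: int, candidates: list[int]) -> int:
    """Job-level decision: the candidate batch size with the highest goodput."""
    return max(candidates, key=lambda m: goodput(job, num_gpus, m))


job = JobModel(base_batch_size=128, noise_scale=1000.0,
               step_overhead_s=0.1, per_sample_s=0.001)
candidates = [128, 256, 512, 1024, 2048]
for gpus in (1, 8):
    m = best_batch_size(job, gpus, candidates)
    print(gpus, m, round(goodput(job, gpus, m), 1))
```

In this toy model the preferred batch size grows as more GPUs are allocated, which is why the two levels have to adapt together: a cluster-level scheduler can compare the best achievable goodput of each job under different allocations, while the agent re-tunes the job whenever its allocation changes.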