Review of "Dominant Resource Fairness: Fair Allocation of Multiple Resource Types"

29 Sep 2015

Review of "Dominant Resource Fairness: Fair Allocation of Multiple Resource Types"

Dominant Resource Fairness (DRF) is a generalization of max-min fairness to multiple heterogeneous resources.

What is min-max fair sharing? For example, in a 3 user system, if a user wants no more than 33%, say 20%, give it to him, and evenly distribute the other 80% to the other two users.

While previous work on weighted max-min fairness has focused on single resources, the advent of cloud computing and multi-core systems has increased the need for allocation policies for environments with multiple resources and heterogeneous user demands. Existing fair schedulers for clusters such as Quincy and Hadoop Fair Scheduler, ignore the heterogeneity of user demands and allocate resources at the granularity of slots, where a slot is a fixed fraction of a node. As a consequence, the allocation often result in under-utilization and over-utilization, both of which is not what user expects.

DRF takes a different approach. It propose that the allocation of a user should be determined by the user's dominant share, which is maximum share the user has been allocated of any resources. The strength of DRF is that it satisfies the following properties. First, it provides incentives for users to share resources by guaranteeing that no user is better off in a system where everything is statically partitioned. Every user can get at most 1/N of he dominant resources, N is the total number of users. Second, it's strategy-proof in which a user cannot get a better allocation by lying about the resource demands. Third, it's Pareto-efficient as it allocates all available resources subject to satisfying the other properties, and without preempting existing allocations. Finally, it's envy-free as no user will prefer the allocation of another user.

In practice, DRF is implemented in Mesos resource manager, and brings fairer allocation of resources and higher utilization than existing solutions that allocates identical resource slices to all tasks. Will this paper be influential in 10 years? Yes, because it provides a solid proof that DRF brings better resource utilization in the more and more relevant cloud computing environment.


From Wikipedia: Pareto efficiency or Pareto optimality, is a state of allocation of resources in which it is impossible to make any one individual better off without making at least one individual worse off.