Review of "Apache Hadoop YARN: Yet Another Resource Negotiator"

26 Sep 2015

Review of "Apache Hadoop YARN: Yet Another Resource Negotiator"

YARN is the next generation of Hadoop compute platform. It departs from the original monolithic architecture by separating resource management functions from the programming model, and delegates many scheduling-related functions to per-job components.

To address some of the multi-tenency issues, Yahoo! developed Hadoop on Demand (HOD), which used Torque and Maui to allocate Hadoop clusters on a shared pool of hardware, and works together with JobTracker and TaskTracker. But it suffers from middling resource utilization because it has too little information to make intelligent decisions about resource allocation. Also JobTracker did a poor job on fault tolerance/recovery, which also makes the development of YARN necessary. Several notable targets of YARN also include secure and auditable operation, multiple diverse framework support similar to Mesos, Flexible Resource Model ,and Backward compatibility to avoid "second version syndrome".

Main components of YARN are one Resource Manager (RM), a set of Node Managers (NM), one on each node, and a set of framework specific Application Managers (AM) which negotiate resources with the RM. RM runs as a daemon on a dedicated machine and acts as the central authority arbitrating resources among various competing applications in the cluster. This enables fairness, capacity, and locality across tenants. To obtain containers, which is a logical bundle of resources like <2GB RAM, 1 CPU>, AM issues resource requests to the RM, this request also include the number of required containers, specification of locality preferences and priority of requests within the application. RM will attempt to satisfy the resource requests and generate a lease for the resource, which will be pulled by a subsequent AM heartbeat. The AM will then present the token to NM to obtain the container. All containers (including AMs) in YARN is described by a container launch context (CLC). Which includes a map of env variables dependencies stored in remotely accessible storage, security tokens, payloads for NM services and the command necessary to create the process. This serves as a descriptor for NM to start a container. Similar to Mesos, YARN didn't do much on fault tolerance. It mainly delegates it to application specific AMs.

Will this paper be influential in 10 years? Maybe, considering the large market share of Hadoop, and YARN does solve a lot of issues identified in Hadoop 1.0.