Review of "IOFlow: A Software-Defined Storage Architecture"
In data centers, the IO path to storage is usually long and complex, which makes it hard to enforce end-to-end policies that dictate performance and routing. Such policies require differentiating IOs along the path and global visibility at the control plane. IOFlow is an architecture that enables this kind of policy. It does so by decoupling the control of IO flows from the data plane and introducing programmable data-plane queues that allow flexible service and routing properties. The design borrows SDN's separation of control and data planes.
In an IOFlow system, a logically centralized controller discovers and interacts with the stages in servers across the data center to maintain a topology graph. Stages implement traffic differentiation through queues. Queuing rules, created by the controller, are used by stages to map IO requests to queues. Each queue has its own configuration, such as a bandwidth limit or a next hop, which is how end-to-end performance and routing policies are enforced.
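To make the stage/queue/rule relationship concrete, here is a minimal sketch of how I picture a stage. It is not the paper's actual API: the class names, the field names (`vm`, `share`), the `"*"` wildcard, and the token-bucket rate limiter are all my own assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IORequest:
    vm: str      # source VM (hypothetical field name)
    share: str   # destination storage share (hypothetical field name)
    size: int    # request size in bytes


@dataclass
class Queue:
    name: str
    bytes_per_sec: Optional[int] = None  # None means no bandwidth limit
    next_hop: Optional[str] = None       # None means default routing
    tokens: float = 0.0                  # token-bucket balance in bytes

    def refill(self, elapsed_sec: float) -> None:
        """Add tokens for the elapsed time, capped at one second of burst."""
        if self.bytes_per_sec is not None:
            self.tokens = min(self.bytes_per_sec,
                              self.tokens + elapsed_sec * self.bytes_per_sec)

    def admit(self, io: IORequest) -> bool:
        """Return True if the request may be served now under the limit."""
        if self.bytes_per_sec is None:
            return True
        if io.size <= self.tokens:
            self.tokens -= io.size
            return True
        return False


@dataclass(frozen=True)
class QueuingRule:
    vm: str     # "*" matches any VM (assumed wildcard syntax)
    share: str  # "*" matches any share
    queue: str  # name of the queue that matching IOs are mapped to

    def matches(self, io: IORequest) -> bool:
        return self.vm in ("*", io.vm) and self.share in ("*", io.share)


class Stage:
    """A data-plane stage: the controller installs queues and rules on it."""

    def __init__(self) -> None:
        self.queues = {"default": Queue("default")}
        self.rules = []

    def create_queue(self, queue: Queue) -> None:
        self.queues[queue.name] = queue

    def add_rule(self, rule: QueuingRule) -> None:
        self.rules.append(rule)

    def classify(self, io: IORequest) -> Queue:
        """Map an IO request to the queue of the first matching rule."""
        for rule in self.rules:
            if rule.matches(io):
                return self.queues[rule.queue]
        return self.queues["default"]
```

In this sketch the controller would call `create_queue` and `add_rule` on every stage along a flow's path, and each stage would call `classify` and then `admit` per request. A routing policy would read `next_hop` from the chosen queue instead of checking the rate limit.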
One interesting aspect of this architecture is the single controller, which could become a single point of failure. I guess not having to think about fault tolerance here makes the implementation much easier. IOFlow's answer to a controller failure is that the control programs fall back to a reasonable default handling of IO requests. For example, a malware-scanning application falls back to scanning all traffic instead of scanning selectively.
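The fallback idea can be sketched as follows. This is only my reading of it: the lease-based staleness check, the class name `MalwareScanPolicy`, and its parameters are assumptions, since the review above does not go into how a controller failure is detected.

```python
class MalwareScanPolicy:
    """Selective scanning that degrades to scan-everything without a controller."""

    def __init__(self, lease_sec: float = 5.0) -> None:
        self.lease_sec = lease_sec    # how long a controller update stays valid (assumed)
        self.untrusted_vms = set()    # VMs the controller wants scanned
        self.last_update = None       # time of the last controller update

    def update_from_controller(self, untrusted_vms, now: float) -> None:
        """Install the controller's current selection and renew the lease."""
        self.untrusted_vms = set(untrusted_vms)
        self.last_update = now

    def should_scan(self, vm: str, now: float) -> bool:
        """Scan selectively while the lease is fresh, otherwise scan all traffic."""
        stale = self.last_update is None or now - self.last_update > self.lease_sec
        if stale:
            return True  # safe fallback: controller presumed down
        return vm in self.untrusted_vms
```

The design choice worth noting is that the fallback is the conservative one: losing the controller costs performance (everything gets scanned) but not safety.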
IOFlow is implemented on top of the Windows IO stack with two kernel drivers that intercept storage IO traffic, each serving as an IOFlow stage: one storage driver on top of the SMBc driver and one below the SMBs driver. This lets IOFlow benefit applications without any modification to application or VM code.
Will this paper be influential in 10 years? Maybe. The separation of control and data planes borrowed from SDN is really necessary given the complexity of current data centers. In the era of cloud computing, the need for end-to-end policy enforcement will certainly grow. But the paper defers evaluation on large data centers to future work, so let's stay tuned.