Programming Model and Protocols for Reconfigurable Distributed Systems

Arad, Cosmin (2013) Programming Model and Protocols for Reconfigurable Distributed Systems. Doctoral thesis, KTH Royal Institute of Technology.

PDF - Published Version
Available under License Creative Commons Attribution.

Archive (TGZ)


Distributed systems are everywhere. From large datacenters to mobile devices, an ever richer assortment of applications and services relies on distributed systems, infrastructure, and protocols. Despite their ubiquity, testing and debugging distributed systems remains notoriously hard. Moreover, aside from inherent design challenges posed by partial failure, concurrency, or asynchrony, there remain significant challenges in the implementation of distributed systems. These programming challenges stem from the increasing complexity of the concurrent activities and reactive behaviors in a distributed system on the one hand, and the need to effectively leverage the parallelism offered by modern multi-core hardware, on the other hand.

This thesis contributes Kompics, a programming model designed to alleviate some of these challenges. Kompics is a component model and programming framework for building distributed systems by composing message-passing concurrent components. Systems built with Kompics leverage multi-core machines out of the box, and they can be dynamically reconfigured to support hot software upgrades. A simulation framework enables deterministic execution replay for debugging, testing, and reproducible behavior evaluation for large-scale Kompics distributed systems. The same system code is used for both simulation and production deployment, greatly simplifying the system development, testing, and debugging cycle.

We highlight the architectural patterns and abstractions facilitated by Kompics through a case study of a non-trivial distributed key-value storage system. CATS is a scalable, fault-tolerant, elastic, and self-managing key-value store which trades off service availability for guarantees of atomic data consistency and tolerance to network partitions. We present the composition architecture for the numerous protocols employed by the CATS system, as well as our methodology for testing the correctness of key CATS algorithms using the Kompics simulation framework.

Results from a comprehensive performance evaluation attest that CATS achieves its claimed properties and delivers a level of performance competitive with similar systems which provide only weaker consistency guarantees. More importantly, this testifies that Kompics admits efficient system implementations. Its use as a teaching framework as well as its use for rapid prototyping, development, and evaluation of a myriad of scalable distributed systems, both within and outside our research group, confirm the practicality of Kompics.
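
The idea of deterministic execution replay for message-passing components can be illustrated with a minimal sketch. This is not the Kompics API (Kompics is a Java framework, and the thesis should be consulted for its actual interfaces). All names here (`Simulator`, `Pinger`, `Ponger`, `run_once`) are hypothetical, and the sketch assumes a single-threaded scheduler in which every source of nondeterminism, here only the message delay, is drawn from one seeded random generator. It does not model the multi-core execution mode.

```python
import heapq
import random


class Simulator:
    """Hypothetical single-threaded discrete-event scheduler with a seeded RNG."""

    def __init__(self, seed):
        self.rng = random.Random(seed)  # all randomness flows from this one seed
        self.now = 0.0                  # simulated time, not wall-clock time
        self._queue = []                # heap of (time, seq, handler, message)
        self._seq = 0                   # unique tie-breaker keeps ordering deterministic
        self.trace = []                 # record of every handled event

    def send(self, handler, message):
        """Schedule delivery of message to handler after a pseudo-random delay."""
        delay = self.rng.uniform(1.0, 10.0)
        heapq.heappush(self._queue, (self.now + delay, self._seq, handler, message))
        self._seq += 1

    def run(self):
        """Deliver events in timestamp order until none remain."""
        while self._queue:
            self.now, _, handler, message = heapq.heappop(self._queue)
            handler(message)
        return self.trace


class Pinger:
    """Component that reacts to pong messages by sending the next ping."""

    def __init__(self, sim, rounds):
        self.sim, self.rounds, self.peer = sim, rounds, None

    def start(self):
        self.sim.send(self.peer.on_ping, 0)

    def on_pong(self, n):
        self.sim.trace.append(("pong", n, self.sim.now))
        if n + 1 < self.rounds:
            self.sim.send(self.peer.on_ping, n + 1)


class Ponger:
    """Component that answers every ping with a pong."""

    def __init__(self, sim):
        self.sim, self.peer = sim, None

    def on_ping(self, n):
        self.sim.trace.append(("ping", n, self.sim.now))
        self.sim.send(self.peer.on_pong, n)


def run_once(seed, rounds=3):
    """Wire the two components together and execute one simulated run."""
    sim = Simulator(seed)
    pinger, ponger = Pinger(sim, rounds), Ponger(sim)
    pinger.peer, ponger.peer = ponger, pinger
    pinger.start()
    return sim.run()
```

Calling `run_once` twice with the same seed yields identical traces, including the simulated timestamps, which is the property that makes a failing run reproducible for debugging. The components only ever interact through `send`, so in this sketch the same component code could in principle be driven by a different scheduler without modification.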

Item Type: Thesis (Doctoral)
Uncontrolled Keywords: distributed systems, programming model, message-passing concurrency, nested hierarchical composition, reactive components, software architecture, dynamic reconfiguration, multi-core, discrete-event simulation, peer-to-peer, testing, debugging, distributed key-value stores, data replication, consistency, linearizability, network partition tolerance, consistent hashing, self-organization, scalability, elasticity, fault tolerance, consistent quorums
ID Code: 5526
Deposited By: Dr. Cosmin Arad
Deposited On: 20 Jun 2013 11:00
Last Modified: 29 Aug 2016 16:09
