Cloud-based Services in CNs

In this section we present the approach followed to provide cloud-based services in Community-Lab.

The Cloudy Distribution

To facilitate the automatic deployment of applications and services in the Community Network, our approach is to provide a customised distribution for the community, that is, an operating system image prepared to be installed either as the OS of community network hosts or in virtual machines. We use a Debian-based distribution called Cloudy, which has been equipped with a set of basic platform applications.

Figure 1 shows some of the already integrated or foreseen services of the Cloudy Community Network distribution:

  • The service publishing and discovery mechanism: Based on Avahi (http://avahi.org). This service is particularly important for the system since it allows services to be located and, with additional software, even services outside of the LAN. This way, services can be seen from any node and, through the browsing capabilities of Avahi, the user experiences a uniform integration of services within the network.
  • Storage application: Using ownCloud (https://owncloud.org), popular open source software which in some features resembles the commercial Dropbox. ownCloud allows users to select among different storage backends. Combined with Tahoe-LAFS and XtreemFS, the ownCloud storage application offers an attractive end-user application, with a web-based GUI through ownCloud, and security and performance through Tahoe-LAFS and XtreemFS.
  • Peer to peer media streaming: Using the PeerStreamer (http://peerstreamer.org) software, developed by a research team from the University of Trento.

Figure 1: Cloudy distribution

The figure also shows that the community services provided by the Cloudy distribution aim to run on different cloud resources provided by cloud management systems such as OpenStack or OpenNebula, and on other low-resource devices such as the Research Devices provided by CONFINE’s Community-Lab testbed.

Services Included in Cloudy

Avahi

The service publishing and discovery mechanism included in the Cloudy distribution is based on Avahi, a free Zero-configuration networking implementation. A series of scripts, executed periodically on a per-service basis, builds the list of available services, which is then published using the Avahi daemon broadcast mechanism. The rest of the nodes (which have the Avahi daemon running too) receive the information and store it locally for a while (until the next update is received, or a timeout is reached). This information can then be used internally by the node for operational purposes or be shown to the end user on the node’s web interface.
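The receive-and-store behaviour described above can be sketched as a small cache whose entries expire unless refreshed. This is only an illustration, not the Cloudy scripts or the Avahi API: the class names, the `ttl_seconds` value and the record fields are all assumptions made for the example.

```python
import time
from dataclasses import dataclass


@dataclass
class ServiceRecord:
    name: str          # illustrative service name, e.g. "owncloud"
    host: str          # address of the announcing node
    port: int
    expires_at: float  # time after which the record counts as stale


class ServiceCache:
    """Hypothetical local store of announcements received from other nodes."""

    def __init__(self, ttl_seconds=120.0, clock=time.monotonic):
        self.ttl = ttl_seconds  # assumed timeout, not a Cloudy default
        self.clock = clock      # injectable so the cache can be tested
        self._records = {}

    def on_announcement(self, name, host, port):
        # A new announcement replaces the old record and resets its timeout.
        expires_at = self.clock() + self.ttl
        self._records[(name, host)] = ServiceRecord(name, host, port, expires_at)

    def active(self):
        # Drop every record whose timeout passed without an update.
        now = self.clock()
        self._records = {
            key: rec for key, rec in self._records.items() if rec.expires_at > now
        }
        return sorted(self._records.values(), key=lambda r: (r.name, r.host))
```

A node's web interface could then list `cache.active()` to show only the services that were announced recently.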

Avahi works as a service publishing and discovery system for the broadcast domain of the nodes (that is, a network segment of the data link layer where all the nodes are connected). However, in community networks devices are most commonly spread over different nodes that belong to different broadcast domains, so Avahi packets cannot be directly exchanged from one node to another. To overcome this limitation, a virtual network connecting all the cloud devices is created. Using this layer-2 VPN network (red line in Figure 1), all the devices appear to be in the same broadcast domain, and Avahi packets can be exchanged between different, distant nodes.

The software chosen to create this virtual network in all the Cloudy devices is Tinc (http://www.tinc-vpn.org), a Virtual Private Network (VPN) daemon that uses tunnelling and encryption to create a secure private network between hosts on the Internet. Tinc is automatically installed and configured on every Cloudy device, ready to be activated (for privacy reasons, this option is left to the user’s choice). Right after activation, a VPN is started to reach the rest of the Cloudy nodes via a layer-2 network, and Avahi can communicate transparently with other nodes.

Tahoe-LAFS

Tahoe-LAFS is a decentralised storage system with provider-independent security. This feature means that the user is the only one who can view or modify the stored data. Thanks to standard cryptographic techniques, the storage service provider never has the ability to read or modify the data. The general idea is that the client stores files on the Tahoe-LAFS cluster in encrypted form. The clients maintain the cryptographic keys needed to access the files. These keys are embedded in read/write/verify "capability strings". Without these keys no entity is able to learn any information about the files in the storage cluster. The data and metadata in the cluster are distributed among servers using erasure coding and cryptography. The erasure coding parameters determine how many servers are used to store each file, denoted as N, and how many of them are necessary for the files to be available, denoted as K. The default parameters used in Tahoe-LAFS are K=3 and N=10 (3-of-10), which means that each file is spread across 10 different servers, and the correct functioning of any 3 of those servers is sufficient to retrieve the file. This makes Tahoe-LAFS tolerant of multiple storage server failures and attacks.
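The consequences of the K-of-N parameters can be worked out with a little arithmetic. The sketch below is not part of Tahoe-LAFS; the function names are made up for illustration, and the availability estimate assumes that each share sits on a distinct server and that servers fail independently with the same probability, which real deployments may not satisfy.

```python
from math import comb


def expansion_factor(k, n):
    # Stored bytes per byte of original data: N shares, each 1/K of the file.
    return n / k


def tolerated_failures(k, n):
    # Servers that may be lost while K shares still remain.
    return n - k


def file_availability(k, n, p):
    """Probability that at least k of n servers are up.

    Assumes each server is up independently with probability p.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    k, n = 3, 10  # the Tahoe-LAFS defaults quoted above
    print("expansion factor:", expansion_factor(k, n))
    print("tolerated server failures:", tolerated_failures(k, n))
    print("availability at p=0.5:", file_availability(k, n, 0.5))
```

With the default 3-of-10 setting, a file occupies roughly 3.3 times its original size and remains retrievable after the loss of up to 7 servers.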

The Tahoe-LAFS cluster consists of a set of storage nodes, client nodes and a single coordinator node called the Introducer. The main responsibility of the Introducer is to act as a kind of publish-subscribe hub. The storage nodes connect to the Introducer and announce their presence, while the client nodes connect to the Introducer to get the list of all connected storage nodes. The Introducer does not transfer data between clients and storage nodes; the transfer is done directly between them. The Introducer is a single point of failure for new clients or new storage peers, since they need it to join the storage network. It is important to note that, for a production environment, the Introducer must be deployed on a stable server of the Community Network.
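The publish-subscribe role of the Introducer can be illustrated with a toy model. This is a sketch of the idea only, not the Tahoe-LAFS implementation or its wire protocol; the class names, method names and addresses are hypothetical.

```python
class Introducer:
    """Toy hub: tracks storage nodes and notifies clients, carries no file data."""

    def __init__(self):
        self._storage_nodes = {}  # node id -> address
        self._subscribers = []    # client callbacks

    def announce(self, node_id, address):
        # Called by a storage node to make its presence known.
        self._storage_nodes[node_id] = address
        for callback in self._subscribers:
            callback(node_id, address)

    def subscribe(self, callback):
        # Called by a client: replay the known nodes, then push later ones.
        for node_id, address in self._storage_nodes.items():
            callback(node_id, address)
        self._subscribers.append(callback)


class Client:
    """Toy client that only remembers which storage nodes it has learned about."""

    def __init__(self):
        self.known_storage = {}

    def on_storage_node(self, node_id, address):
        # Transfers would later go straight to this address, not via the hub.
        self.known_storage[node_id] = address


if __name__ == "__main__":
    hub = Introducer()
    hub.announce("storage-1", "10.0.0.1:3457")  # made-up address
    client = Client()
    hub.subscribe(client.on_storage_node)
    hub.announce("storage-2", "10.0.0.2:3457")
    print(client.known_storage)
```

In this model a client that has already subscribed keeps its list of storage nodes even if the hub disappears, whereas a new client cannot learn about any node, which mirrors the single point of failure described above.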

When the client uploads a file to the storage cluster, a unique public/private key pair is generated for that file, and the file is encrypted, erasure coded and distributed across storage nodes that have enough storage space. To download a file, the client asks all known storage nodes to list the shares of that file they hold; in the subsequent round, the client chooses which shares to request based on various heuristics such as latency and node load.
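The second round of a download can be pictured as picking K distinct shares from the servers that answered. The sketch below uses latency as the only heuristic and is not the selection logic of Tahoe-LAFS itself; the function name and the shape of `responses` are assumptions made for the example.

```python
def choose_shares(responses, k):
    """Pick k distinct shares, preferring the servers with the lowest latency.

    responses: list of (server, latency_ms, set_of_share_numbers_held),
    a hypothetical summary of the answers from the first round.
    Returns a dict mapping share number -> server to request it from.
    """
    chosen = {}
    # Visit servers from fastest to slowest.
    for server, _latency_ms, held in sorted(responses, key=lambda r: r[1]):
        for share in sorted(held):
            if share not in chosen:
                chosen[share] = server
                if len(chosen) == k:
                    return chosen
    raise ValueError("fewer than k distinct shares are reachable")


if __name__ == "__main__":
    answers = [
        ("node-a", 80, {0, 1}),
        ("node-b", 20, {1, 2}),
        ("node-c", 50, {3}),
    ]
    print(choose_shares(answers, k=3))
```

A fuller heuristic would also weigh node load and might spread requests over more servers; this version simply takes whatever the fastest servers can supply.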

XtreemFS

XtreemFS is an open source object-based distributed file system for grid and cloud infrastructures. The file system replicates objects for fault tolerance and caches data and metadata to improve performance over high-latency links. As an object-based file system, XtreemFS stores the directory tree on the Metadata and Replica Catalog (MRC) and file content on Object Storage Devices (OSDs). The MRC uses an LSM-tree based database which can handle volumes that are larger than the main memory. OSDs can be added to the system as needed without any data re-balancing. Empty OSDs are automatically used for newly created files and replicas.

In addition to regular file replication, XtreemFS provides read-only replication. This replication mode works on immutable files and supports a large number of replicas. Read-only replication helps to quickly build a caching infrastructure on top of XtreemFS in order to reduce latency and bandwidth consumption between data centers. In contrast, read-write replication allows files to be modified and is fully POSIX compatible.

ownCloud

ownCloud is an open source cloud Storage as a Service (SaaS) application, which has seen rapid development in recent years. ownCloud (version 6) provides a number of features similar to those of cloud-based storage solutions, including web-based file access (view/upload/download) and a "desktop sync" client for Windows, OS X and Linux, which keeps automatically synchronized copies of data on both the client and the cloud server. ownCloud also allows users to mount their own external storage, such as Dropbox, Google Drive, OpenStack Swift, Tahoe-LAFS or Amazon S3, and use it as a storage backend. External storage facilities are provided by an application known as "External storage support", which is available on the apps dashboard. With its attractive web-based user interface, ownCloud seems to be a suitable option for Community Networks.

PeerStreamer

PeerStreamer is open source software developed by a team of programmers from the University of Trento which allows users to stream video in a peer-to-peer overlay network. We think it is extremely useful to have software that allows video sharing among different users in a Community Network. PeerStreamer has some interesting features:

  • Video streaming on demand.
  • Live streaming (from almost any device, such as webcams or antennas).
  • Streaming from VLC.
  • Web GUI.

BitTorrent

BitTorrent is an open-source file-sharing application effective for distributing software and media files. Each file is divided into many small pieces, and a client downloads these pieces from a swarm of other peers which hold the required file, so the available sources grow as the number of users increases. There are several open-source implementations of the components of the BitTorrent system available. While the tracker should run on a stable node, the swarm of BitTorrent peers can grow and shrink, a behaviour that is also typical of many client nodes in a Community Network. File-sharing within Community Networks would reduce the traffic on the Internet gateways.
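The division of a file into pieces can be sketched as follows. The example relies on the fact that the original BitTorrent protocol identifies each piece by its SHA-1 hash, which lets a peer check a piece received from any member of the swarm; the function names and the piece length are illustrative assumptions, not taken from any particular client.

```python
import hashlib


def piece_hashes(data, piece_length):
    # Split the content into fixed-size pieces (the last may be shorter)
    # and record the SHA-1 digest of each one.
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


def verify_piece(index, piece, hashes):
    # A downloaded piece is accepted only if its digest matches the expected one.
    return hashlib.sha1(piece).digest() == hashes[index]


if __name__ == "__main__":
    content = bytes(range(256)) * 10            # 2560 bytes of sample data
    hashes = piece_hashes(content, 1024)        # assumed piece length
    print(len(hashes), "pieces")
    print(verify_piece(2, content[2048:], hashes))
```

Because every piece can be verified on its own, a client can fetch different pieces from different peers in any order, which suits a swarm whose membership keeps changing.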

services/cloudy.txt · Last modified: 2016/12/19 11:06 by ivilata