Monitoring the CONFINE testbed presents specific challenges that the designed system should address.
General-purpose monitoring systems do not meet these special-purpose requirements of CONFINE, because they are designed for different workloads and properties. Slice-specific and sliver-specific information (LXC monitoring) cannot be obtained directly from any of the existing monitoring systems. Nagios, Zenoss, ntop, Ganglia and Cacti use RRDtool for storing data.
RRDtool (Round Robin Database tool) is well suited to storing time-series data and aggregating information, but it is quite inflexible, so a compromise between flexibility and efficiency becomes necessary. Adding new metrics would require updating the RRD database file and its Round Robin Archives (RRAs). Once an RRA is created, it is possible to change existing values and add new data sources, but it is not possible to add or remove metrics or change their properties. If the data model is not designed carefully, the database files must be updated every time new slivers are created on a node. Slice-specific data comes from different nodes and would result in a dynamic list of RRD files, which in turn would need additional scripts to fetch, aggregate and display the data. For instance, in CoMon (the monitoring system of PlanetLab), the data model is carefully chosen, but old database files are still deleted when the format changes. In many cases (depending on configuration), if an update is made to an RRD series but is not soon followed by another update, the original update will be lost. This makes RRDtool less suitable for recording data such as operational metrics. There is also no way to back-fill data in an RRD series and, depending on the data model, a single RRD receiving data from multiple sources can be affected by this. Given the large scale, the varying resource consumption, and the dynamic nature of CONFINE, flexibility is a key requirement.
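The two constraints discussed above, a schema fixed at creation time and the rejection of out-of-order (back-filled) samples, can be illustrated with a small toy model. This is only a sketch: `ToyRoundRobinSeries` and `try_update` are hypothetical names introduced here, and the class imitates the behaviour described in the text rather than RRDtool's actual API or file format.

```python
class ToyRoundRobinSeries:
    """Toy stand-in for a round-robin series (hypothetical, not RRDtool's API)."""

    def __init__(self, metrics, slots):
        self.metrics = tuple(metrics)  # schema is frozen at creation time
        self.slots = slots             # fixed number of rows; oldest rows are dropped
        self.rows = []                 # (timestamp, values) tuples in time order
        self.last_update = None

    def update(self, timestamp, sample):
        # A sample must carry exactly the metrics declared at creation.
        if set(sample) != set(self.metrics):
            raise KeyError("schema is fixed; recreate the series to change metrics")
        # Timestamps must strictly increase, so back-filling is impossible.
        if self.last_update is not None and timestamp <= self.last_update:
            raise ValueError("timestamp is not newer than the last update")
        self.rows.append((timestamp, tuple(sample[m] for m in self.metrics)))
        self.rows = self.rows[-self.slots:]  # keep only the newest rows
        self.last_update = timestamp


def try_update(series, timestamp, sample):
    """Return 'ok', or the name of the exception raised by the update."""
    try:
        series.update(timestamp, sample)
        return "ok"
    except (KeyError, ValueError) as exc:
        return type(exc).__name__


if __name__ == "__main__":
    node = ToyRoundRobinSeries(["cpu", "mem"], slots=3)
    print(try_update(node, 100, {"cpu": 0.2, "mem": 512}))  # accepted
    print(try_update(node, 90, {"cpu": 0.1, "mem": 500}))   # rejected: back-fill
    # A new sliver appears and brings an extra metric: rejected by the fixed schema.
    print(try_update(node, 110, {"cpu": 0.3, "mem": 520, "sliver_cpu": 0.1}))
```

In this toy model, every newly created sliver that contributes its own metric would force the series to be recreated, which is the kind of churn the paragraph above describes for a dynamic testbed.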
Apart from that, sliver-centric information is not easily integrated into the node-centric data provided by off-the-shelf monitoring systems. The need to gather this kind of data is also part of the motivation for developing a separate monitoring system that meets the specific needs of CONFINE.
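To make the gap between node-centric and sliver-centric views concrete, the following sketch regroups per-node reports into per-slice totals. The report layout and all field names (`node`, `slivers`, `slice`, `cpu_percent`, `memory_mb`) are assumptions made for illustration only; they are not the data model of CONFINE or of any existing monitoring system.

```python
from collections import defaultdict


def aggregate_by_slice(node_reports):
    """Sum sliver resource usage per slice across all node reports."""
    totals = defaultdict(
        lambda: {"slivers": 0, "cpu_percent": 0.0, "memory_mb": 0.0}
    )
    for report in node_reports:
        # Each node reports zero or more slivers; the list may change over time.
        for sliver in report.get("slivers", []):
            entry = totals[sliver["slice"]]
            entry["slivers"] += 1
            entry["cpu_percent"] += sliver["cpu_percent"]
            entry["memory_mb"] += sliver["memory_mb"]
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical reports from two nodes hosting slivers of two slices.
    reports = [
        {"node": "node-a", "slivers": [
            {"slice": "slice-1", "cpu_percent": 10.0, "memory_mb": 64.0},
            {"slice": "slice-2", "cpu_percent": 20.0, "memory_mb": 128.0},
        ]},
        {"node": "node-b", "slivers": [
            {"slice": "slice-1", "cpu_percent": 5.0, "memory_mb": 32.0},
        ]},
    ]
    for slice_name, usage in sorted(aggregate_by_slice(reports).items()):
        print(slice_name, usage)
```

Because the reports are schemaless dictionaries, a node can list any number of slivers without a storage format having to change, which is the flexibility the text argues for.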