Monitor Application

Periodic monitoring of nodes and slivers, with the aim of:

  • Providing status to researchers
  • Sending alerts about misbehaviour to Admins
  • Sending alerts of offline nodes to Owners

Parameters to monitor:

  • Ping, uptime, reliability
  • Load
  • Memory
  • Disk
  • Bandwidth
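
As a rough illustration of the parameters listed above, the following sketch samples load, memory and disk on the local machine using only the Python standard library. The function names `sample_local_metrics` and `read_meminfo` and the dictionary layout are hypothetical, not part of the controller. Ping, uptime, reliability and bandwidth are left out because they need network access or a longer observation window.

```python
import os
import shutil
import time


def read_meminfo(meminfo_path="/proc/meminfo"):
    """Parse MemTotal and MemAvailable (in kB) on Linux; empty dict elsewhere."""
    wanted = ("MemTotal", "MemAvailable")
    values = {}
    try:
        with open(meminfo_path) as handle:
            for line in handle:
                key, _, rest = line.partition(":")
                if key in wanted:
                    values[key] = int(rest.split()[0])
    except OSError:
        # File missing or unreadable (for example on non-Linux systems).
        pass
    return values


def sample_local_metrics(path="/"):
    """Return one sample of load, memory and disk usage (hypothetical layout)."""
    load1, load5, load15 = os.getloadavg()  # available on Unix only
    disk = shutil.disk_usage(path)
    return {
        "timestamp": time.time(),
        "load": {"1min": load1, "5min": load5, "15min": load15},
        "disk": {"total": disk.total, "used": disk.used, "free": disk.free},
        "memory": read_meminfo(),
    }
```

Whether nodes and slivers would report such a sample themselves or be polled for it is left open by this page.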

Architecture:

  • Do Nodes and Slivers expose this data via an API?

Settings

  • MONITOR_ALERT_EXPIRATION: time that a problem should persist after an alert has been sent to the admins (by mail); 1 day by default.
  • MONITOR_ALERT_LOCK: concurrency control lock file, '/dev/shm/.controller.monitor.lock' by default.
  • MONITOR_EXPIRE_SECONDS: duration of the monitored time series, 300 seconds by default.
  • MONITOR_MONITORS: list of system monitors.
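
A minimal sketch of how two of these settings could be used, assuming the lock is taken with `fcntl.flock` and that samples are stored as `(timestamp, value)` pairs. Neither detail is confirmed by this page, and the helper names `monitor_lock` and `prune_expired` are hypothetical.

```python
import fcntl
import time
from contextlib import contextmanager

# Defaults as documented in the settings list.
MONITOR_ALERT_LOCK = "/dev/shm/.controller.monitor.lock"
MONITOR_EXPIRE_SECONDS = 300


@contextmanager
def monitor_lock(path=MONITOR_ALERT_LOCK):
    """Hold an exclusive, non-blocking lock on `path` (assumed mechanism).

    Raises BlockingIOError if another process already holds the lock.
    """
    with open(path, "w") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            yield handle
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def prune_expired(samples, now=None, max_age=MONITOR_EXPIRE_SECONDS):
    """Keep only (timestamp, value) samples no older than `max_age` seconds."""
    now = time.time() if now is None else now
    return [(ts, value) for ts, value in samples if now - ts <= max_age]
```

With `now=400` and the default of 300 seconds, a sample stamped at 0 is dropped while samples stamped at 100 and 400 are kept.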

Celery workers monitor

The Celery daemon configuration is generated by the setupceleryd management command. This command accepts a --processes parameter (5 by default), which defines the number of threads for the celery_w1 worker. When overriding the default CELERY_W1 setting, keep in mind that the number is calculated as #processes + 1 (the extra process acts as coordinator).
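
The "#processes + 1" arithmetic can be written down as a small helper. The function name `celery_w1_size` is hypothetical, and how the resulting number is embedded in the actual CELERY_W1 value is not specified on this page.

```python
def celery_w1_size(processes=5):
    """Number to use for celery_w1: worker processes plus one coordinator."""
    if processes < 1:
        raise ValueError("processes must be at least 1")
    return processes + 1
```

With the default of 5 processes this gives 6; with 8 processes it gives 9.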

Local Monitor (server)

Periodic monitoring of the confine-controller itself. There is a command that can be added as a cron task and configured to send an email to the server admins when it detects a problem.
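
The kind of check such a cron-driven command might perform can be sketched as two pure functions that compare measurements against thresholds and build the text of an alert. The names `find_problems` and `build_alert`, the metric names and the threshold values are all hypothetical, since the real command's checks are not documented here. In a Django project the resulting subject and body could then be handed to `django.core.mail.mail_admins`.

```python
# Hypothetical thresholds, chosen only for illustration.
THRESHOLDS = {
    "load_1min": 4.0,
    "disk_used_percent": 90.0,
    "memory_used_percent": 95.0,
}


def find_problems(measurements, thresholds=THRESHOLDS):
    """Return human-readable problems ordered by metric name; empty if none."""
    problems = []
    for name, limit in sorted(thresholds.items()):
        value = measurements.get(name)
        # Metrics that were not measured are skipped rather than reported.
        if value is not None and value > limit:
            problems.append(
                f"{name} is {value:.1f}, above the limit of {limit:.1f}"
            )
    return problems


def build_alert(problems):
    """Build (subject, body) for an email, or None if there is nothing to report."""
    if not problems:
        return None
    subject = f"Monitor: {len(problems)} problem(s) detected"
    return subject, "\n".join(problems)
```

Keeping the check free of side effects means the same functions can be run by hand or from cron, with the email step added only where it is wanted.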

Server admins can be configured with the ADMINS setting.
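
For reference, a settings fragment in the form Django used at the time this page was written, where ADMINS is a sequence of (full name, email address) pairs. The names and addresses below are placeholders, and newer Django releases may expect a different format.

```python
# settings.py fragment; names and addresses are placeholders.
ADMINS = (
    ("Alice Admin", "alice@example.org"),
    ("Bob Operator", "bob@example.org"),
)
```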

soft/server-apps-monitor.txt · Last modified: 2014/05/29 17:44 by santiago