Since the architecture discussion does not seem closed yet, I (Marc) want to contribute my share by providing an overview of four different asynchronous client-server architectures I can think of:
Tasks are procedures created by the controller, stored on a message queue (e.g. RabbitMQ) and executed by workers. The result of a task execution can be stored in a database, so the server can know whether tasks succeeded or failed (keeping state of the nodes).
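To make the pattern concrete, here is a minimal sketch of the controller/queue/worker flow. It uses stdlib stand-ins (queue.Queue instead of RabbitMQ, a dict instead of the result database); all names are illustrative, not part of any proposed API.

```python
import queue
import threading

# Stand-ins for the real components: queue.Queue instead of RabbitMQ,
# a plain dict instead of the result database.
task_queue = queue.Queue()
result_db = {}

def controller_submit(task_id, node, action):
    """The controller creates a task and stores it on the message queue."""
    task_queue.put({"id": task_id, "node": node, "action": action})

def worker():
    """A worker pulls tasks off the queue, executes them and records the result."""
    while True:
        task = task_queue.get()
        if task is None:          # sentinel: shut the worker down
            break
        try:
            # Execute the task here (e.g. push config to task["node"]) ...
            result_db[task["id"]] = "success"
        except Exception as exc:
            result_db[task["id"]] = "failure: %s" % exc
        finally:
            task_queue.task_done()

controller_submit(1, "node-a", "update_config")
t = threading.Thread(target=worker)
t.start()
task_queue.join()                 # wait until the task has been processed
task_queue.put(None)              # stop the worker
t.join()
print(result_db)                  # {1: 'success'}
```

Because results are written to `result_db`, the server can later query task state instead of waiting synchronously on the node.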
A worker is a process that can run on the local system or on a networked machine. Based on where these workers run, we can distinguish two architectures: 1) task push and 2) task pull.
Celery can be used to implement task-oriented architectures (it is really well integrated with Django).
Workers run on the same machine as the server. All existing workers pull tasks off the queue. Workers can also be distributed over multiple servers, improving scalability.
If a node is down, the task should be retried.
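The retry behaviour could look like the sketch below. It is a hand-written loop for illustration only; with Celery one would normally use the task's built-in retry support instead. The `push_config` task and the retry parameters are hypothetical.

```python
import time

def run_with_retries(task, max_retries=3, delay=0.01):
    """Retry a task until it succeeds or the retry budget is exhausted."""
    for attempt in range(1, max_retries + 1):
        try:
            return task()
        except ConnectionError:
            if attempt == max_retries:
                raise               # give up, mark the task as failed
            time.sleep(delay)       # back off before retrying the node

# Hypothetical task that fails twice (node down) and then succeeds.
attempts = {"n": 0}
def push_config():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("node unreachable")
    return "config pushed"

print(run_with_retries(push_config))  # config pushed
```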
Each node has a dedicated worker running on it. The workers poll the server queue in order to retrieve their assigned tasks. A routing mechanism must be used on the queue to ensure that each task is executed on the correct worker (node).
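A minimal sketch of the routing idea, assuming one queue per node keyed by node id (a broker like RabbitMQ would implement this with routing keys and dedicated queues; the node names here are invented):

```python
import queue

# One queue per node; the routing key is the node id.
node_queues = {"node-a": queue.Queue(), "node-b": queue.Queue()}

def route_task(task):
    """Place the task on the queue of the node it is assigned to."""
    node_queues[task["node"]].put(task)

route_task({"node": "node-a", "action": "create_sliver"})
route_task({"node": "node-b", "action": "delete_sliver"})

# Each node's worker polls only its own queue, so a task can never
# be executed on the wrong node.
print(node_queues["node-a"].get()["action"])  # create_sliver
```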
When something changes on the server, an event is published on a feed (e.g. Atom). Nodes subscribe to the feed so they can pull new events.
Advantages: failure recovery (events are ordered in time), scalability and availability (event feeds can easily be cached using common web technologies).
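The failure-recovery property follows from the ordering: a node that was down simply replays the events it missed. A minimal sketch, assuming integer sequence numbers stand in for Atom entry ids (event names are invented):

```python
events = []          # the server-side feed, ordered in time

def publish(event):
    """The server appends a new entry to the feed."""
    events.append({"seq": len(events) + 1, "event": event})

def pull_since(last_seen):
    """A node pulls every event newer than the last one it processed."""
    return [e for e in events if e["seq"] > last_seen]

publish("sliver created")
publish("sliver updated")
publish("sliver deleted")

# A node that last saw event 1 recovers by replaying the newer events.
missed = pull_since(1)
print([e["event"] for e in missed])  # ['sliver updated', 'sliver deleted']
```

Since `pull_since` is a plain read over immutable, ordered entries, the feed can sit behind any HTTP cache.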
Each node periodically retrieves its configuration (via a cron task) from an API exposed by the server.
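One periodic run on the node could look like this sketch. The in-process `api_get_config` function stands in for an HTTP GET against the server API, and the configuration fields are invented:

```python
import json

# Stand-in for the server side; in the real system the node would
# perform an HTTP GET against the server's API from a cron entry.
SERVER_STATE = {"node-a": {"slivers": 2, "version": 7}}

def api_get_config(node_id):
    return json.dumps(SERVER_STATE[node_id])

def cron_tick(node_id, local_config):
    """One periodic run: fetch the config and apply it if it changed."""
    remote = json.loads(api_get_config(node_id))
    if remote != local_config:
        local_config = remote     # apply the new configuration
    return local_config

config = cron_tick("node-a", {})
print(config)  # {'slivers': 2, 'version': 7}
```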
Atomicity of the work (unit of work) should somehow be guaranteed, in order to prevent nodes from retrieving an incomplete configuration (e.g. a sliver creation by the user that is still in progress). Possible solution: take advantage of database transaction read isolation.
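The read-isolation idea can be demonstrated with SQLite, where a reader on a separate connection only sees committed data. This is a sketch, not the project's schema; the `sliver` table is illustrative:

```python
import os
import sqlite3
import tempfile

# Two connections to the same on-disk database simulate the server
# (writer) and a node (reader).
path = os.path.join(tempfile.mkdtemp(), "config.db")
server = sqlite3.connect(path)
server.execute("CREATE TABLE sliver (id INTEGER, ready INTEGER)")
server.execute("INSERT INTO sliver VALUES (1, 1)")
server.commit()

# The user starts creating a second sliver: the row is written inside
# a still-open (uncommitted) transaction.
server.execute("INSERT INTO sliver VALUES (2, 0)")

# The node reads through its own connection and only sees committed
# rows, so it never retrieves the half-finished configuration.
node = sqlite3.connect(path)
visible = node.execute("SELECT id FROM sliver").fetchall()
print(visible)  # [(1,)]

server.commit()   # sliver creation finished
visible = node.execute("SELECT id FROM sliver").fetchall()
print(visible)  # [(1,), (2,)]
```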