[Pydra] Master-Node-Worker relationship refactor

Peter Krenesky peter at osuosl.org
Tue Aug 18 04:44:12 UTC 2009


Hi all,

I've started refactoring Master, Node, and Worker to change the way in
which they relate to each other.  When this refactor is complete, Master
will communicate only with Nodes, and Node will be the only component
that interacts with Workers.  Workers will be spawned per TaskInstance.


== Why? ==
  - Workers need to be chrooted (sandboxed) per TaskInstance to ensure
no task can affect other users.  Even importing a task file to read the
task name and description puts the cluster at risk.
 
  - Some libraries, Django especially, can only be configured once per
runtime.  This means changing datasources is not possible under the
current system.

  - Less network overhead from TCP connections.

  - Simpler networking logic.
 


== How? ==

Master
     - Remove the WorkerConnectionManager module.
     - Change add_node() so that, instead of adding workers to the
checker, it emits WORKER_CONNECTED signals with a special proxy object
that mimics a WorkerAvatar but is really the remote from the Node.  This
allows all other logic in Master to remain the same.
     - Change the node disconnection logic to disconnect that node's
workers as well.
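
As a rough illustration of the Master changes listed above, here is a
minimal sketch in plain Python.  It is not the actual Pydra code:
NodeRemoteStub, WorkerProxy, the call_remote() method name, the
"node:index" worker keys, and the fixed worker_count argument are all
assumptions made for the example.  The point is only that a proxy
exposing the same interface as a WorkerAvatar lets the rest of Master
stay unchanged, and that dropping a node also drops its workers.

```python
class NodeRemoteStub:
    """Hypothetical stand-in for the remote reference Master holds to a Node."""

    def __init__(self):
        self.calls = []

    def call_remote(self, name, *args):
        # Record the call instead of sending it over the network.
        self.calls.append((name, args))
        return "ok"


class WorkerProxy:
    """Looks like a WorkerAvatar to Master, but forwards every call to the Node."""

    def __init__(self, worker_key, node_remote):
        self.name = worker_key
        self._node_remote = node_remote

    def call_remote(self, name, *args):
        # Prepend the worker key so the Node can route the call (assumed convention).
        return self._node_remote.call_remote(name, self.name, *args)


class Master:
    def __init__(self):
        self.workers = {}        # worker_key -> avatar-like object
        self.node_workers = {}   # node_key -> list of worker keys on that node

    def add_node(self, node_key, node_remote, worker_count):
        keys = []
        for i in range(worker_count):
            key = f"{node_key}:{i}"
            # Stands in for emitting a WORKER_CONNECTED signal with the proxy.
            self.worker_connected(WorkerProxy(key, node_remote))
            keys.append(key)
        self.node_workers[node_key] = keys

    def worker_connected(self, avatar):
        # Existing Master logic only needs something avatar-shaped.
        self.workers[avatar.name] = avatar

    def node_disconnected(self, node_key):
        # Losing a node now means losing every worker reached through it.
        for key in self.node_workers.pop(node_key, []):
            self.workers.pop(key, None)
```

Because the proxy keeps the avatar's interface, code in Master that
already calls workers does not need to know that the call is routed
through the Node.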
   

Node
     - Add a WorkerConnectionManager module; Master's version of this
can be reused.
     - Add a mechanism for tracking running workers.
     - Add task_run, which manages passing work to workers and starting
new workers.
     - Add a callback system to task_run to handle the asynchronous
nature of waiting for a worker to start before passing on a task_run.
     - Add remotes that proxy all other functions in
worker_task_controls to worker avatars.
     - Add remotes that proxy master functions to MasterAvatar.
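
The task_run and callback items above could look roughly like the
following sketch.  It uses plain Python callbacks to stay
self-contained; the Node class layout, the spawn_worker hook, the
worker_connected()/worker_stopped() method names, and the worker's
run_task() method are hypothetical and chosen only for illustration.
The idea is that a task_run arriving before its worker has connected is
queued, and is passed on once the worker shows up.

```python
class Node:
    def __init__(self, spawn_worker):
        # spawn_worker: hypothetical hook that starts a worker process for a key.
        self._spawn_worker = spawn_worker
        self.workers = {}    # worker_key -> connected worker avatar
        self._pending = {}   # worker_key -> callbacks waiting for that worker

    def task_run(self, worker_key, task_key, args):
        def send(worker):
            # The actual hand-off of work to the worker.
            worker.run_task(task_key, args)

        worker = self.workers.get(worker_key)
        if worker is not None:
            # Worker already running: pass the work on immediately.
            send(worker)
            return

        # Worker not connected yet: queue the hand-off, and start the
        # worker only if nobody has requested it already.
        first_request = worker_key not in self._pending
        self._pending.setdefault(worker_key, []).append(send)
        if first_request:
            self._spawn_worker(worker_key)

    def worker_connected(self, worker_key, worker):
        # Track the running worker, then flush anything that was waiting on it.
        self.workers[worker_key] = worker
        for callback in self._pending.pop(worker_key, []):
            callback(worker)

    def worker_stopped(self, worker_key):
        # Stop tracking a worker once its TaskInstance is finished.
        self.workers.pop(worker_key, None)
```

Queuing the callback before calling the spawn hook means the sketch
also behaves correctly if the worker happens to connect synchronously.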


Worker
     - Modify WorkerConnectionManager to connect locally only and use
the Node's key for auth.



== Status ==

Much of the above code is in place, but it is not yet tested.  I'll
likely have it complete within the next few days.

