[Pydra] Updates on task packaging and task sync

Yin QIU qiuyin at gmail.com
Tue Sep 1 09:08:07 UTC 2009

On Sun, Aug 30, 2009 at 12:15 AM, Peter Krenesky<peter at osuosl.org> wrote:
> Yin QIU wrote:
>> On Sat, Aug 29, 2009 at 2:38 PM, Peter Krenesky<peter at osuosl.org> wrote:
>>> The TaskSync code has been merged into master.  I haven't thoroughly
>>> tested it yet, but the basic functionality works.  I only encountered
>>> minor issues:
>>>     * it was including .pyc files in the hashes so there were
>>> mismatches when loading compiled code
>> I intentionally avoided examining only .py files when computing the
>> hash, because I thought not all the files in a task package were .py
>> files: there might be dynamic libraries, configuration files, etc. Of
>> course, we can explicitly exclude .pyc files.
> And that's the right thing to do. There shouldn't be any dynamic
> libraries within the package anyway. We can't avoid .pyc files because
> that is where the Python runtime puts them.
> We could avoid this in a more generic way by just using the directory
> name as the hash instead of computing it every time. If users edit the
> internal cache manually, it's likely it will break anyway. The only
> thing we lose is the ability to determine whether the user edited it
> locally and tell them not to do it again.

If we hash directory names only, we won't be able to detect code
modifications in a task package when they change neither the directory
names nor the directory structure. So I think it would
be better to keep computing hashes from the contents while excluding
certain types of files, which can be configured flexibly, e.g., in

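A rough sketch of what I mean by hashing contents while skipping
configurable file types. This is only an illustration, not the code in
the repo: the names hash_package and HASH_EXCLUDE_PATTERNS are made up
here, and the pattern list is an assumption.

```python
import fnmatch
import hashlib
import os

# Hypothetical setting name; where the exclusion list would actually
# live is not decided here.
HASH_EXCLUDE_PATTERNS = ["*.pyc", "*.pyo"]


def hash_package(pkg_dir, exclude=HASH_EXCLUDE_PATTERNS):
    """Return a SHA1 hex digest computed over a task package's files."""
    sha = hashlib.sha1()
    for root, dirs, files in os.walk(pkg_dir):
        dirs.sort()  # visit subdirectories in a stable order
        for name in sorted(files):
            # Skip files matching any excluded pattern (e.g. compiled code).
            if any(fnmatch.fnmatch(name, pat) for pat in exclude):
                continue
            path = os.path.join(root, name)
            # Mix in the relative path so renames and moves change the hash.
            rel = os.path.relpath(path, pkg_dir).replace(os.sep, "/")
            sha.update(rel.encode("utf-8"))
            sha.update(b"\0")
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    sha.update(chunk)
    return sha.hexdigest()
```

With something like this, a .pyc file written by the runtime leaves the
hash unchanged, while an edit to any .py file still produces a new hash.
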
> I'm also considering whether we even need task_cache on Nodes, or just
> task_cache_internal.  Really you shouldn't be manually placing files
> there when there is no way to disable task synchronization.  We have the
> issue of large packages that might make someone want to avoid the sync
> client, but that should eventually be solved by using the
> consumer/producer elements of Twisted to transfer the data more efficiently.

Currently the TaskManagers on the master and on the nodes are in fact
identical. But the situation now is that we don't use the master's
TaskManager to request synchronization, and we don't use the nodes'
TaskManagers to handle sync requests. That is, we have effectively
made a distinction between the two kinds of TaskManager. So task_cache
on nodes seems to be unnecessary.

We can of course implement a mechanism to disable task
synchronization. Alternatively, we can leave this as it is, which
keeps open the option of a new feature: P2P-style synchronization.
For example, in a large cluster, we could update the task package on
an arbitrary node and expect the other nodes, perhaps including the
master, to sync with that node. I think this would greatly ease
cluster maintenance. But this feature is certainly far from our
current project goals, so it is just an idea for now :-)

>>>     * run_task and _run_task signatures needed to be merged by hand to
>>> match the changes I had made.
>>> - Peter
>>> Yin QIU wrote:
>>>> Hi,
>>>> I just pushed some changes to my public repo. I managed to add
>>>> preliminary support for keeping multiple versions of a task package.
>>>> There are now two folders holding the task code, namely tasks_cache
>>>> and tasks_cache_internal. The former is publicly known and is for
>>>> deployment usage; the latter is used by TaskManager internally and is
>>>> thus hidden from the outside world.
>>>> tasks_cache always contains the latest code. We can either drop files
>>>> into this folder or put contents into it through an API (not available
>>>> yet). TaskManager keeps monitoring tasks_cache, and if it notices
>>>> updates, copies the latest task code into tasks_cache_internal, where
>>>> it places the code in a subdirectory with the SHA1 hash of the code as
>>>> the directory's name.
>>>> I've performed a simple test of this new feature. I added a
>>>> modified task package while an older version of the package was running.
>>>> This resulted in two different task packages in tasks_cache_internal.
>>>> There is currently no cleanup mechanism. That is, once a task
>>>> package is created in tasks_cache_internal, there is no automatic way
>>>> to remove it after it expires. This issue will be resolved after we
>>>> let the scheduler emit TASK_STARTED and TASK_STOPPED signals and
>>>> handle these signals in TaskManager.
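
The cleanup mentioned in the quoted message could look roughly like the
sketch below. Only the TASK_STARTED and TASK_STOPPED signal names come
from the message; the class, its methods, the
tasks_cache_internal/<package>/<hash> layout, and the rule that a
version "expires" once it is superseded and has no running tasks are
all assumptions made for illustration. How the signals get wired to
these handlers is left out.

```python
import os
import shutil
from collections import defaultdict


class VersionCleaner:
    """Counts running tasks per package version and removes stale versions.

    Hypothetical helper: none of these names exist in Pydra.
    """

    def __init__(self, internal_dir):
        self.internal_dir = internal_dir
        self.running = defaultdict(int)  # (package, hash) -> running tasks
        self.latest = {}                 # package -> hash of newest version

    def set_latest(self, pkg, version):
        """Record a new latest version; drop the old one if it is idle."""
        old = self.latest.get(pkg)
        self.latest[pkg] = version
        if old is not None and old != version:
            self._maybe_remove(pkg, old)

    def on_task_started(self, pkg, version):
        """Handler for a TASK_STARTED signal."""
        self.running[(pkg, version)] += 1

    def on_task_stopped(self, pkg, version):
        """Handler for a TASK_STOPPED signal."""
        key = (pkg, version)
        self.running[key] = max(0, self.running[key] - 1)
        self._maybe_remove(pkg, version)

    def _maybe_remove(self, pkg, version):
        # Keep the latest version and any version that still has tasks.
        if self.latest.get(pkg) == version or self.running[(pkg, version)] > 0:
            return
        path = os.path.join(self.internal_dir, pkg, version)
        shutil.rmtree(path, ignore_errors=True)
        self.running.pop((pkg, version), None)
```

The point of counting per version is that an old package directory stays
in place for as long as a task started from it is still running.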

Nanjing University, China
