retain logs and data of failed jobs in a location accessible to the operator #3

@pymonger

All job work dirs are exposed via WebDAV on each worker instance.

The work directory of a failed job is left intact, but it is not guaranteed to persist if subsequent jobs need the disk space to run. Shipping failed work directories to a less volatile location would need new development. Alternatively, as we did in OCO-2, the PGE can catch its own exceptions so that verdi never sees a failure, and then ship those work dirs out to external storage for review.
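
A rough sketch of the second option, for discussion only. It assumes the PGE is a plain Python callable that takes its work dir, and that "external storage" is a mounted directory; `run_guarded`, `archive_work_dir`, `archive_root` and the `_failure.txt` file name are all hypothetical and not part of verdi or any existing PGE.

```python
import shutil
import time
import traceback
from pathlib import Path


def archive_work_dir(work_dir: Path, archive_root: Path) -> Path:
    """Pack work_dir into a timestamped .tar.gz under archive_root."""
    archive_root.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    base = archive_root / f"{work_dir.name}-{stamp}"
    # make_archive appends ".tar.gz" and returns the archive's path.
    archive = shutil.make_archive(
        str(base), "gztar",
        root_dir=str(work_dir.parent), base_dir=work_dir.name,
    )
    return Path(archive)


def run_guarded(pge, work_dir: Path, archive_root: Path) -> bool:
    """Run pge(work_dir); on any exception, save the traceback and
    archive the work dir instead of letting the failure propagate."""
    try:
        pge(work_dir)
        return True
    except Exception:
        # Keep the traceback next to the job's own logs.
        (work_dir / "_failure.txt").write_text(traceback.format_exc())
        archive_work_dir(work_dir, archive_root)
        return False
```

Because the exception is swallowed, the job would be reported as successful to the worker, so the caller has to use the returned flag (or the presence of the archive) to tell the operator something went wrong. The archive location must sit outside the work dir and on storage that outlives the worker's scratch disk.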
