We typically run Batch workflows on AWS spot instances to take advantage of cost savings when possible.
However, when a redun task is interrupted because its host was terminated, the scheduler halts, which can mean a lot of lost work.
It would be helpful for redun to detect this case and re-submit the task without halting, up to the configured maximum number of re-submits.
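
As a rough illustration of the requested behavior, here is a minimal sketch in plain Python. It is not redun's actual scheduler or executor code: `JobResult`, `is_host_terminated`, and `run_with_resubmits` are hypothetical names, and the check assumes that AWS Batch reports a reclaimed host with a status reason beginning with `Host EC2`, which should be verified against real job output.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class JobResult:
    """Hypothetical summary of one finished Batch job attempt."""
    succeeded: bool
    status_reason: Optional[str] = None


def is_host_terminated(result: JobResult) -> bool:
    # Assumption: a job whose host was reclaimed fails with a status
    # reason that starts with "Host EC2".
    reason = result.status_reason or ""
    return (not result.succeeded) and reason.startswith("Host EC2")


def run_with_resubmits(
    submit: Callable[[], JobResult], max_resubmits: int
) -> JobResult:
    """Submit a job, re-submitting only when its host was terminated.

    Any other failure, or running out of re-submits, returns the last
    result so the caller can handle it as it does today.
    """
    resubmits = 0
    while True:
        result = submit()
        if result.succeeded or not is_host_terminated(result):
            return result
        if resubmits >= max_resubmits:
            return result
        resubmits += 1
```

The key design point is that only host-termination failures consume the re-submit budget; ordinary task errors still surface immediately instead of being retried.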