dk.sdu.cloud.app.orchestrator.api.Job

Job

A Job in UCloud is the core abstraction used to describe a unit of computation.

data class Job(
    val id: String,
    val owner: ResourceOwner,
    val updates: List<JobUpdate>,
    val specification: JobSpecification,
    val status: JobStatus,
    val createdAt: Long,
    val output: JobOutput?,
    val permissions: ResourcePermissions?,
    val providerGeneratedId: String?,
)

Jobs give users a way to run their computations through a workflow similar to that of their own workstations, while scaling to much larger and more numerous machines. In a simplified view, a Job describes the following information:

A Job is started by a user request containing the specification of the Job. The UCloud orchestrator verifies this information and passes it to the provider referenced by the Job itself. Assuming that the provider accepts the information, the Job is placed in its initial state, IN_QUEUE. You can read more about the requirements of the compute environment and how to launch the software correctly here.
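The submission flow described above can be sketched roughly as follows. This is a minimal illustration, not the UCloud implementation: every type here (`SketchSpecification`, `SketchProvider`, `SketchOrchestrator`, and so on) is hypothetical, and the verification step is reduced to a single placeholder check.

```kotlin
// Hypothetical, simplified types for illustration only; these are not the UCloud API.
enum class SketchState { IN_QUEUE, RUNNING, SUCCESS, FAILURE }

data class SketchSpecification(val application: String, val providerId: String)

data class SketchJob(
    val id: String,
    val specification: SketchSpecification,
    val state: SketchState,
)

// Stand-in for a provider that may accept or reject a specification.
fun interface SketchProvider {
    fun accepts(specification: SketchSpecification): Boolean
}

class SketchOrchestrator(private val providers: Map<String, SketchProvider>) {
    private var nextId = 0

    // Returns the new job in IN_QUEUE, or null if verification fails
    // or the referenced provider rejects the specification.
    fun submit(specification: SketchSpecification): SketchJob? {
        // Step 1: verify the request (assumed here: only a non-blank application name).
        if (specification.application.isBlank()) return null
        // Step 2: pass the specification to the provider it references.
        val provider = providers[specification.providerId] ?: return null
        if (!provider.accepts(specification)) return null
        // Step 3: the accepted job starts in its initial state.
        nextId += 1
        return SketchJob("job-$nextId", specification, SketchState.IN_QUEUE)
    }
}
```

In this sketch, a rejected or unverifiable request never produces a job at all, so the only state a freshly created job can have is IN_QUEUE.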

At this point, the provider has acted on this information by placing the Job in its own equivalent of a job queue. Once the provider detects that the Job is running, it contacts UCloud and places the Job in the RUNNING state. This indicates to UCloud that log files can be retrieved and that interactive interfaces (VNC/web) are available.

Once the Application terminates at the provider, the provider will update the state to SUCCESS. A Job has terminated successfully if no internal error occurred in UCloud or in the provider. This means that a Job whose software exits with a non-zero exit code is still considered successful. A Job might, for example, be placed in FAILURE if the Application crashed due to a hardware or scheduler failure. Both SUCCESS and FAILURE are terminal states. A Job in a terminal state can no longer receive updates or change its state.
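The two rules in this paragraph (terminal states are final, and the exit code alone does not decide between SUCCESS and FAILURE) can be illustrated with a small sketch. The names `LifecycleState`, `transition` and `terminalStateFor` are hypothetical and only cover the four states mentioned in this text; the actual set of states may be larger.

```kotlin
// Hypothetical model of the states named in the text; not the UCloud API.
enum class LifecycleState(val isTerminal: Boolean) {
    IN_QUEUE(false),
    RUNNING(false),
    SUCCESS(true),
    FAILURE(true),
}

// Returns the state after a requested transition.
// Throws IllegalStateException if the job is already in a terminal state.
fun transition(current: LifecycleState, next: LifecycleState): LifecycleState {
    check(!current.isTerminal) { "A job in terminal state $current cannot change state" }
    return next
}

// Maps the outcome of a run to a terminal state. The exit code is deliberately
// ignored: only an internal error (in the orchestrator or the provider)
// leads to FAILURE.
fun terminalStateFor(exitCode: Int, internalError: Boolean): LifecycleState =
    if (internalError) LifecycleState.FAILURE else LifecycleState.SUCCESS
```

Under these assumptions, `terminalStateFor(exitCode = 1, internalError = false)` yields SUCCESS, while any attempt to call `transition` on a job that is already in SUCCESS or FAILURE is rejected.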

At any point after the user submits the Job, they may request cancellation of the Job. This will stop the Job, delete any ephemeral resources and release any bound resources.
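As a rough sketch of the cancellation steps, the following hypothetical class tracks a job's ephemeral and bound resources as plain strings and records what a cancellation would do. The class name, the resource representation, and the choice to make a repeated cancellation a no-op are all assumptions made for illustration; the text does not specify them.

```kotlin
// Hypothetical bookkeeping for a submitted job; names are illustrative only.
class CancellableJobSketch(
    ephemeralResources: List<String>,
    boundResources: List<String>,
) {
    var stopped = false
        private set

    private val ephemeral = ephemeralResources.toMutableList()
    private val bound = boundResources.toMutableList()

    val remainingEphemeral: List<String> get() = ephemeral
    val remainingBound: List<String> get() = bound

    // Stops the job, deletes ephemeral resources, then releases bound ones.
    // Returns the actions taken. Assumption: cancelling twice does nothing more.
    fun cancel(): List<String> {
        if (stopped) return emptyList()
        val actions = mutableListOf("stop job")
        stopped = true
        ephemeral.forEach { actions += "delete $it" }
        ephemeral.clear()
        bound.forEach { actions += "release $it" }
        bound.clear()
        return actions
    }
}
```

For a job created with one ephemeral resource `"scratch"` and one bound resource `"ip-1"`, the first call to `cancel()` returns the actions `stop job`, `delete scratch`, `release ip-1` in that order.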

Properties