Workbench: Background and Workbench Jobs - Introduction

Workbench can execute jobs that run in the background, alongside the main R session. There are two different types of jobs:

Background jobs
Workbench jobs

While both sound similar, they differ in how and where the code is executed. Background jobs run on the same machine as the current R session, while workbench jobs may run on a different machine. The “may” here is intentional, as it depends on whether your infrastructure and Workbench installation provide the option to access other nodes.

Workbench - On-Premise

When running Workbench on premises, i.e., on a single static machine within your environment, you can look at background and workbench jobs as if they were the same. All jobs sent to either of them will be executed on the same machine on which your current R session is running. If your Workbench instance is running containerized (which is the case when cynkra deployed it), the jobs will run in the same container.
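As a rough illustration of the background-job side, the sketch below starts a script as a background job from the R console using `jobRunScript()` from {rstudioapi}. The script path, its contents, and the job name are placeholders chosen for this example, not part of any cynkra setup. The submission is guarded so that the snippet does nothing harmful when run outside RStudio or Workbench.

```r
# Placeholder script for illustration only: written to a temporary directory
script <- file.path(tempdir(), "test.R")
writeLines('message("Hello from a background job")', script)

# Only submit when the code actually runs inside RStudio / Workbench
in_ide <- requireNamespace("rstudioapi", quietly = TRUE) &&
  rstudioapi::isAvailable()

if (in_ide) {
  # Runs the script as a background job on the same machine as the session
  rstudioapi::jobRunScript(path = script, name = "test-background-job")
} else {
  message("Not running inside RStudio/Workbench; skipping job submission.")
}
```

The job should then show up in the background jobs pane of the IDE, next to the console.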

Workbench - Kubernetes

When running on Kubernetes, the Workbench launcher can launch jobs in isolated pods (roughly equivalent to containers), which may run on a different machine. In any case, these jobs run fully independently of your main R session and of the Workbench installation itself. The big advantage of this approach is that if a job crashes or leaks memory, it will not affect your or other people’s R sessions.

On a Workbench Kubernetes installation, you should always use workbench jobs instead of background jobs.

Programmatic Job Submission

Jobs can be submitted via the built-in GUI in Workbench or programmatically via the {rstudioapi} R package. The following example submits a job programmatically. It runs a file named test.R in the user’s home directory and requests 1 CPU and 500 MB of memory.

library(rstudioapi)

launcherSubmitJob(
  "test-job",
  cluster = "Kubernetes",
  # using arguments 'exe' and 'args' in `launcherSubmitJob()` somehow errors,
  # so the whole call is passed as a single shell command
  command = "/bin/bash -c 'cd ~ && R --slave --no-save --no-restore -f ~/test.R'",
  resourceLimits = list(
    launcherResourceLimit("cpuCount", "1"),  # 1 CPU
    launcherResourceLimit("memory", "500")   # 500 MB
  ),
  container = launcherContainer("rocker/r-ver:4.2.1")
)

You can also request a fraction of a CPU for a job in Kubernetes. This is quite common, as most nodes only come with one or two CPUs. If your job does not rely on heavy computation, you might want to try a cpuCount value between zero and one.
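A fractional request could look like the following sketch, which asks for half a CPU. The helper `cpu_request()` and the job name are hypothetical and exist only for this example. Passing the value as the string "0.5" is an assumption that mirrors how "1" and "500" are passed in the example above. The submission itself is guarded so that it only runs inside a Workbench session with the launcher available.

```r
# Hypothetical helper: validate a CPU request and convert it to a string,
# mirroring the string values used for the resource limits above
cpu_request <- function(cpus) {
  stopifnot(is.numeric(cpus), length(cpus) == 1, cpus > 0)
  as.character(cpus)
}

cpu <- cpu_request(0.5)  # request half a CPU

# Only submit when running inside Workbench with the launcher available
can_submit <- requireNamespace("rstudioapi", quietly = TRUE) &&
  rstudioapi::isAvailable() &&
  rstudioapi::launcherAvailable()

if (can_submit) {
  rstudioapi::launcherSubmitJob(
    "test-job-half-cpu",
    cluster = "Kubernetes",
    command = "/bin/bash -c 'cd ~ && R --slave --no-save --no-restore -f ~/test.R'",
    resourceLimits = list(
      rstudioapi::launcherResourceLimit("cpuCount", cpu),
      rstudioapi::launcherResourceLimit("memory", "500")
    ),
    container = rstudioapi::launcherContainer("rocker/r-ver:4.2.1")
  )
} else {
  message("Launcher not available; skipping job submission.")
}
```

Whether a given fraction is accepted depends on your cluster configuration, so it is worth testing with a small job first.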