Job configuration reference

You set these fields when you create a job, either in the web interface or with `cosmicac jobs create`. The job type determines which fields apply. In non-interactive mode, set each field with the flag in the CLI flag column. For the create flow, see Create a GPU Container Job and Create a Managed Inference Job.

Common fields

These fields apply to every job type.

Field	Required	CLI flag	Description
Job type	Yes	`--type`	The kind of job to create: GPU Container or Managed Inference.
Job name	Yes	`--name`	A name to identify the job.
Tags	Yes	`--tags`	One or more labels for the job. The CLI accepts a comma-separated list.
Location	Yes	`--location`	Where the job runs, for example `IN`. The CLI lists the locations your racks report.
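
As a rough illustration of non-interactive mode, the sketch below assembles the common flags into an argument list for `cosmicac jobs create`. The helper name `common_args`, the job name, the tags, and in particular the `--type` value `gpu-container` are assumptions made for the example; this reference does not state the exact value format the CLI expects for `--type`.

```python
import shlex


def common_args(job_type, name, tags, location):
    """Build the common-field flags for `cosmicac jobs create` (hypothetical helper)."""
    if not tags:
        # Tags is a required field, so reject an empty list early.
        raise ValueError("at least one tag is required")
    return [
        "cosmicac", "jobs", "create",
        "--type", job_type,          # value format is an assumption
        "--name", name,
        "--tags", ",".join(tags),    # the CLI accepts a comma-separated list
        "--location", location,
    ]


args = common_args("gpu-container", "train-run-1", ["team-a", "experiment"], "IN")
print(shlex.join(args))  # shell-quoted command line, ready to review or paste
```

Building the command as a list rather than a single string keeps each value as one argument, so names or tags with spaces do not need manual quoting.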

GPU configuration

These fields select the job's hardware.

Field	Required	CLI flag	Description
GPU	Yes	`--gpu-type`	The GPU to use, for example `GH100_H100_SXM5_80GB`. The CLI lists the GPU types your racks report.
GPU count	Yes	`--gpu-count`	Number of GPUs.
CUDA / driver	Yes	`--driver`	GPU driver version. Only `CUDA 12.9` is supported.

To set the GPU type and count in one flag, use `--gpu TYPE=COUNT`, for example `--gpu H100=2`. This flag replaces `--gpu-type` and `--gpu-count`.
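
The two ways of selecting hardware can be sketched as follows. The helper `gpu_args` is hypothetical, the check that the count is at least 1 is an assumption, and passing the driver as the literal string `CUDA 12.9` is a guess at the value format rather than something this reference confirms.

```python
def gpu_args(gpu_type, gpu_count, driver="CUDA 12.9", combined=True):
    """Build the GPU configuration flags (hypothetical helper)."""
    if gpu_count < 1:
        # Assumption: a job needs at least one GPU.
        raise ValueError("gpu_count must be at least 1")
    if combined:
        # Single-flag form: --gpu TYPE=COUNT
        hardware = ["--gpu", f"{gpu_type}={gpu_count}"]
    else:
        # Two-flag form: --gpu-type and --gpu-count
        hardware = ["--gpu-type", gpu_type, "--gpu-count", str(gpu_count)]
    return hardware + ["--driver", driver]


print(gpu_args("GH100_H100_SXM5_80GB", 2))
print(gpu_args("GH100_H100_SXM5_80GB", 2, combined=False))
```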

GPU Container parameters

These fields apply to a GPU Container Job.

Field	Required	CLI flag	Description
Base OS image	Yes	`--base-image`	Base OS image for the container. Only `Ubuntu22.04/CUDA12.9` is supported.
Disk (GB)	Yes	`--root-disk-size-gb`	Root disk size in GB. One of `250`, `500`, or `1000`.
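
Because both container fields accept only a fixed set of values, it can help to check them before calling the CLI. The sketch below does that with the values listed in the table; the helper `container_args` and its error messages are illustrative and not part of the CLI.

```python
ALLOWED_DISK_GB = (250, 500, 1000)           # values listed for Disk (GB)
SUPPORTED_BASE_IMAGE = "Ubuntu22.04/CUDA12.9"  # the only supported base image


def container_args(root_disk_size_gb, base_image=SUPPORTED_BASE_IMAGE):
    """Validate and build the GPU Container flags (hypothetical helper)."""
    if base_image != SUPPORTED_BASE_IMAGE:
        raise ValueError(f"unsupported base image: {base_image}")
    if root_disk_size_gb not in ALLOWED_DISK_GB:
        raise ValueError(f"disk size must be one of {ALLOWED_DISK_GB}")
    return [
        "--base-image", base_image,
        "--root-disk-size-gb", str(root_disk_size_gb),
    ]


print(container_args(500))
```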

Managed Inference (vLLM) parameters

These fields apply to a vLLM Managed Inference Job.

Field	Required	CLI flag	Description
Model	Yes	`--model`	Hugging Face model ID to serve (`Qwen/Qwen3-32B`).
Runtime image (CUDA)	Yes	`--runtime-image`	Serving runtime image (`vllm-openai-0.8.5`).
Data type	Yes	`--data-type`	Numeric precision the model runs at (`BF16` or `Auto`).
Quantisation	Yes	`--quantisation`	Quantization scheme (`FP8` or `INT8`).
Tensor parallel	Yes	`--tensor-parallel`	Number of GPUs to split the model across.
GPU memory utilization	Yes	`--gpu-memory-utilization`	Fraction of GPU memory to use, between `0` and `1`.
Max concurrent sequences	Yes	`--max-concurrent-sequences`	Maximum requests handled at once.
Max model length	Yes	`--max-model-length`	Maximum model context length.
Reasoning parser	Yes	`--reasoning-parser`	Parser for the model's reasoning output.
Video & image input	Yes	`--multimodal`	Whether the model accepts multimodal input. `true` or `false`.
Root disk size	Yes	`--root-disk-size-gb`	VM root disk size in GB. One of `250`, `500`, or `1000`.
Environment variables	No	`--env`	Environment variables passed to the inference service.
Endpoint name	Yes	`--endpoint-name`	Name of the inference endpoint. Must be unique across active inference jobs.
Replicas	Yes	`--replica`	Number of endpoint replicas.
Require Authorization header	Yes	`--require-auth-header` / `--no-auth-header`	Whether callers must send an authorization header. `true` or `false`.
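
A minimal sketch of how the endpoint-related vLLM fields might be turned into flags. It rests on several assumptions not confirmed by this reference: that `--require-auth-header` and `--no-auth-header` are bare switches that take no value, that the `0` to `1` range for GPU memory utilization includes both endpoints, and that at least one replica is required. The helper name `vllm_endpoint_args` and the endpoint name are made up for the example.

```python
def vllm_endpoint_args(endpoint_name, replicas, gpu_memory_utilization, require_auth=True):
    """Build a subset of the vLLM Managed Inference flags (hypothetical helper)."""
    if not 0 <= gpu_memory_utilization <= 1:
        # The table gives the range as "between 0 and 1"; inclusive bounds are assumed.
        raise ValueError("gpu_memory_utilization must be between 0 and 1")
    if replicas < 1:
        # Assumption: an endpoint needs at least one replica.
        raise ValueError("replicas must be at least 1")
    # Assumption: the two auth flags are mutually exclusive switches.
    auth_flag = "--require-auth-header" if require_auth else "--no-auth-header"
    return [
        "--endpoint-name", endpoint_name,
        "--replica", str(replicas),
        "--gpu-memory-utilization", str(gpu_memory_utilization),
        auth_flag,
    ]


print(vllm_endpoint_args("qwen3-chat", 2, 0.9))
```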

Managed Inference (Parakeet) parameters

These fields apply to a Parakeet Managed Inference Job.

Field	Required	CLI flag	Description
Model	Yes	`--model`	Parakeet model to serve, `nvidia/parakeet-tdt-0.6b-v3`.
Endpoint name	Yes	`--endpoint-name`	Name of the transcription endpoint.
Chunk duration	Yes	`--chunk-duration`	Audio chunk length in seconds (`60`).
Chunk overlap	Yes	`--chunk-overlap`	Overlap between chunks in seconds (`10`).
Max file size (MB)	Yes	`--max-file-size-mb`	Maximum upload size in MB (`2048`).
Require Authorization header	Yes	`--require-auth-header` / `--no-auth-header`	Whether callers must send an authorization header. `true` or `false`.
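
To show how chunk duration and chunk overlap relate, the sketch below computes the time windows that result if consecutive chunks each start `chunk duration - chunk overlap` seconds after the previous one. This is one common way to implement overlapping chunks and is only an assumption here; this reference does not describe how the service actually splits audio. The function `chunk_windows` is hypothetical.

```python
def chunk_windows(total_seconds, chunk_duration=60, chunk_overlap=10):
    """Return (start, end) windows in seconds under an assumed sliding-window scheme."""
    if chunk_duration <= 0:
        raise ValueError("chunk_duration must be positive")
    if not 0 <= chunk_overlap < chunk_duration:
        # Overlap must be smaller than the chunk, or the window never advances.
        raise ValueError("chunk_overlap must be >= 0 and less than chunk_duration")
    if total_seconds <= 0:
        return []
    step = chunk_duration - chunk_overlap
    windows = []
    start = 0
    while True:
        end = min(start + chunk_duration, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break
        start += step
    return windows


print(chunk_windows(130))  # [(0, 60), (50, 110), (100, 130)]
```

With the values shown in the table (`60` and `10`), each window under this scheme begins 50 seconds after the previous one, so neighbouring windows share 10 seconds of audio.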