IC GPU Terraform Provider

Manage IC GPU Service infrastructure as code: 8 resources, 4 data sources, and full lifecycle management.

Prerequisites

  • Terraform 1.5 or later
  • An IC GPU Service account with an API key (sk-ic-...)

Getting Started

1. Configure the provider

main.tf
terraform {
  required_providers {
    icgpu = {
      source  = "registry.terraform.io/ic-gpu/icgpu"
      version = "~> 1.0"
    }
  }
}

provider "icgpu" {
  # API key can also be set via IC_GPU_API_KEY env var
  api_key = var.ic_gpu_api_key
  api_url = "https://api.gpu.local"  # optional, this is the default
}

variable "ic_gpu_api_key" {
  type      = string
  sensitive = true
}
2. Set your API key

Terminal
export IC_GPU_API_KEY="sk-ic-your-api-key-here"
3. Initialise and apply

Terminal
terraform init
terraform plan
terraform apply

Provider Configuration

Attribute   Type     Env Var           Description
api_key     string   IC_GPU_API_KEY    API key (sk-ic-...)
token       string   IC_GPU_TOKEN      OIDC bearer token
api_url     string   IC_GPU_API_URL    API base URL (default: https://api.gpu.local)

Resources

icgpu_instance

Manages a GPU workspace instance with SSH access and optional inference engine.

instance.tf
resource "icgpu_instance" "workspace" {
  name             = "ml-workspace"
  tier             = "timesliced"     # "timesliced" (no memory isolation), "dedicated", or "mig"
  inference_engine = "vllm"           # optional: "vllm", "ollama", "sglang"
}

output "ssh_command" {
  value = "ssh gpuuser@${icgpu_instance.workspace.ssh_host} -p ${icgpu_instance.workspace.ssh_port}"
}

Computed attributes:

id, status, ssh_host, ssh_port, created_at

icgpu_vm

Manages a KubeVirt virtual machine from a template.

vm.tf
resource "icgpu_vm" "dev" {
  name        = "dev-server"
  template_id = "tpl-ubuntu-24"
  ssh_key_id  = icgpu_ssh_key.laptop.id   # optional
}

output "vm_status" {
  value = icgpu_vm.dev.status
}

Computed attributes:

id, status, ip_address, created_at

icgpu_cluster

Manages a vCluster Kubernetes cluster with automatic kubeconfig generation.

cluster.tf
resource "icgpu_cluster" "ml" {
  name = "ml-cluster"
}

# Write kubeconfig to a local file
resource "local_file" "kubeconfig" {
  content  = icgpu_cluster.ml.kubeconfig
  filename = "${path.module}/kubeconfig.yaml"
}

Computed attributes:

id, status, kubeconfig (sensitive), created_at
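Once applied, the generated kubeconfig can be used directly with kubectl (assuming kubectl is installed; the filename matches the local_file resource above):

```shell
# Point kubectl at the kubeconfig written by the local_file resource
export KUBECONFIG="$(pwd)/kubeconfig.yaml"
kubectl get nodes
kubectl get namespaces
```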

icgpu_model_deployment

Deploys an LLM model from HuggingFace with auto-scaling.

model.tf
resource "icgpu_model_deployment" "llama" {
  model_name       = "my-llama"
  huggingface_repo = "meta-llama/Meta-Llama-3-8B-Instruct"
  engine           = "vllm"        # default
  gpu_count        = 1             # default
  min_replicas     = 1             # default
  max_replicas     = 3
}

Computed attributes:

id, status, created_at

icgpu_api_key

Creates a scoped API key. The full key is only available at creation.

api_key.tf
resource "icgpu_api_key" "ci" {
  name        = "ci-pipeline"
  scopes      = ["instances", "models"]
  permissions = ["read", "write"]
}

output "api_key_prefix" {
  value = icgpu_api_key.ci.prefix
}

output "api_key_full" {
  value     = icgpu_api_key.ci.key
  sensitive = true
}
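Because the full-key output is marked sensitive, the default `terraform output` listing redacts it; the underlying value can be read explicitly with `-raw`:

```shell
# Non-sensitive outputs print normally
terraform output api_key_prefix

# Sensitive outputs show "(sensitive)" in the listing;
# -raw prints the underlying value for a single output
terraform output -raw api_key_full
```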

icgpu_ssh_key

Uploads an SSH public key. SSH keys are immutable — changes force recreation.

ssh_key.tf
resource "icgpu_ssh_key" "laptop" {
  name       = "laptop"
  public_key = file("~/.ssh/id_ed25519.pub")
}

output "fingerprint" {
  value = icgpu_ssh_key.laptop.fingerprint
}

icgpu_webhook

Creates an HMAC-signed webhook endpoint for event notifications.

webhook.tf
resource "icgpu_webhook" "alerts" {
  url    = "https://example.com/hooks/gpu-events"
  events = ["instance.created", "instance.terminated", "balance.low"]
}

output "webhook_secret" {
  value     = icgpu_webhook.alerts.secret
  sensitive = true
}
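Incoming deliveries can be verified by recomputing the HMAC over the raw request body with the webhook secret. A minimal sketch using openssl, assuming HMAC-SHA256 and a hex-encoded signature (check the service docs for the exact signature header name and encoding):

```shell
SECRET="whsec_example"              # value of icgpu_webhook.alerts.secret (placeholder)
PAYLOAD='{"event":"balance.low"}'   # raw, unmodified request body

# Recompute the HMAC-SHA256 of the body; compare the result to the
# signature sent in the webhook's signature header
printf '%s' "$PAYLOAD" | openssl dgst -sha256 -hmac "$SECRET" | awk '{print $NF}'
```

Always hash the raw body bytes as received; re-serialising the JSON first can change whitespace or key order and break the comparison.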

icgpu_spending_alert

Creates a threshold-based spending alert.

alert.tf
resource "icgpu_spending_alert" "low_balance" {
  name       = "low-balance-warning"
  threshold  = 5000.0
  notify_via = "dashboard"     # or "webhook"
}

Data Sources

data_sources.tf
# List available GPU tiers and pricing
data "icgpu_gpu_tiers" "available" {}

output "gpu_tiers" {
  value = [for t in data.icgpu_gpu_tiers.available.tiers : {
    name  = t.name
    price = t.price_per_hour
  }]
}

# List token packages
data "icgpu_token_packages" "available" {}

output "cheapest_package" {
  # [0] is the cheapest package only if the API returns packages
  # sorted by price (cheapest first)
  value = data.icgpu_token_packages.available.packages[0].name
}

# Browse pre-tested model catalogue
data "icgpu_model_catalogue" "models" {}

output "featured_models" {
  value = [for m in data.icgpu_model_catalogue.models.models : m.display_name if m.featured]
}

# List VM templates
data "icgpu_vm_templates" "available" {}

output "templates" {
  value = [for t in data.icgpu_vm_templates.available.templates : t.name]
}

Complete Infrastructure Example

A realistic setup with a GPU instance, model deployment, SSH key, webhook, and spending alert.

infrastructure.tf
terraform {
  required_providers {
    icgpu = {
      source  = "registry.terraform.io/ic-gpu/icgpu"
      version = "~> 1.0"
    }
  }
}

provider "icgpu" {}  # Uses IC_GPU_API_KEY env var

# --- SSH Key ---
resource "icgpu_ssh_key" "deploy" {
  name       = "deploy-key"
  public_key = file("~/.ssh/id_ed25519.pub")
}

# --- GPU Instance ---
resource "icgpu_instance" "workspace" {
  name             = "ml-workspace"
  tier             = "dedicated"
  inference_engine = "vllm"
}

# --- Model Deployment ---
resource "icgpu_model_deployment" "llama" {
  model_name       = "llama-3-8b"
  huggingface_repo = "meta-llama/Meta-Llama-3-8B-Instruct"
  engine           = "vllm"
  gpu_count        = 1
  min_replicas     = 1
  max_replicas     = 3
}

# --- API Key (read-only for monitoring) ---
resource "icgpu_api_key" "monitoring" {
  name        = "monitoring"
  scopes      = ["instances", "models"]
  permissions = ["read"]
}

# --- Webhook ---
resource "icgpu_webhook" "slack" {
  url    = "https://hooks.slack.com/services/T00/B00/xxx"
  events = [
    "instance.created",
    "instance.terminated",
    "balance.low",
  ]
}

# --- Spending Alert ---
resource "icgpu_spending_alert" "warning" {
  name       = "balance-warning"
  threshold  = 10000.0
  notify_via = "webhook"
}

# --- Outputs ---
output "ssh_command" {
  value = "ssh gpuuser@${icgpu_instance.workspace.ssh_host} -p ${icgpu_instance.workspace.ssh_port}"
}

output "model_status" {
  value = icgpu_model_deployment.llama.status
}

output "api_key" {
  value     = icgpu_api_key.monitoring.key
  sensitive = true
}

Importing Existing Resources

Import resources created outside Terraform into your state file.

Terminal
# Import by resource ID
terraform import icgpu_instance.workspace inst-abc123
terraform import icgpu_vm.dev vm-def456
terraform import icgpu_cluster.ml clust-ghi789
terraform import icgpu_model_deployment.llama my-llama
terraform import icgpu_api_key.ci key-jkl012
terraform import icgpu_ssh_key.laptop ssh-mno345
terraform import icgpu_webhook.alerts wh-pqr678
terraform import icgpu_spending_alert.warning alert-stu901
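After importing, it is worth confirming that state matches reality: `terraform state list` shows the imported addresses, and a follow-up plan should report no changes once the configuration matches the live resource:

```shell
# Confirm the resources are now tracked in state
terraform state list

# A clean import reports "No changes" here; any diff means the
# configuration doesn't yet match the live resource's settings
terraform plan
```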