From: Luis Chamberlain
To: Chuck Lever, Daniel Gomez, kdevops@lists.linux.dev
Cc: Luis Chamberlain, Your Name
Subject: [PATCH v3 07/10] terraform/lambdalabs: add Kconfig structure for Lambda Labs
Date: Sat, 30 Aug 2025 21:00:01 -0700
Message-ID: <20250831040004.2159779-8-mcgrof@kernel.org>
In-Reply-To: <20250831040004.2159779-1-mcgrof@kernel.org>
References: <20250831040004.2159779-1-mcgrof@kernel.org>
X-Mailer: git-send-email 2.49.0
X-Mailing-List: kdevops@lists.linux.dev

Add comprehensive Kconfig system for Lambda Labs cloud provider:
- Main Kconfig: Provider overview and menu organization
- Kconfig.location: Region selection with smart inference modes
- Kconfig.compute: Instance type configuration with capacity info
- Kconfig.identity: SSH key management
- Kconfig.smart: Smart selection algorithms configuration
- Kconfig.storage: Placeholder for future storage features

Key features:
- Three selection modes: smart cheapest, smart inference, manual
- Dynamic region/instance generation based on API availability
- Complete region mappings including us-east-3 and other regions
- Provider limitations clearly documented
- Integration with dynamic configuration system

The Kconfig structure provides user-friendly configuration while
handling Lambda Labs provider constraints transparently.

Generated-by: Claude AI
Signed-off-by: Your Name
---
 terraform/lambdalabs/Kconfig                  | 33 ++++++++
 terraform/lambdalabs/kconfigs/Kconfig.compute | 48 ++++++++++++
 .../lambdalabs/kconfigs/Kconfig.identity      | 76 +++++++++++++++++++
 .../lambdalabs/kconfigs/Kconfig.location      | 73 ++++++++++++++++++
 terraform/lambdalabs/kconfigs/Kconfig.smart   | 25 ++++++
 terraform/lambdalabs/kconfigs/Kconfig.storage | 12 +++
 6 files changed, 267 insertions(+)
 create mode 100644 terraform/lambdalabs/Kconfig
 create mode 100644 terraform/lambdalabs/kconfigs/Kconfig.compute
 create mode 100644 terraform/lambdalabs/kconfigs/Kconfig.identity
 create mode 100644 terraform/lambdalabs/kconfigs/Kconfig.location
 create mode 100644 terraform/lambdalabs/kconfigs/Kconfig.smart
 create mode 100644 terraform/lambdalabs/kconfigs/Kconfig.storage

diff --git a/terraform/lambdalabs/Kconfig b/terraform/lambdalabs/Kconfig
new file mode 100644
index 0000000..050f546
--- /dev/null
+++ b/terraform/lambdalabs/Kconfig
@@ -0,0 +1,33 @@
+if TERRAFORM_LAMBDALABS
+
+# Lambda Labs Terraform Provider Limitations:
+# The elct9620/lambdalabs provider (v0.3.0) has significant limitations:
+# - NO OS/distribution selection (always Ubuntu 22.04)
+# - NO storage volume management
+# - NO custom user creation (always uses "ubuntu" user)
+# - NO user data/cloud-init support
+#
+# Only these features are supported:
+# - Region selection
+# - GPU instance type selection
+# - SSH key management
+
+menu "Resource Location"
+source "terraform/lambdalabs/kconfigs/Kconfig.location"
+endmenu
+
+menu "Compute"
+source "terraform/lambdalabs/kconfigs/Kconfig.compute"
+endmenu
+
+# Storage menu removed - not supported by provider
+# OS image selection removed - not supported by provider
+
+menu "Identity & Access"
+source "terraform/lambdalabs/kconfigs/Kconfig.identity"
+endmenu
+
+# Note: Storage and OS configuration files are kept as placeholders
+# for future provider updates but contain no options currently
+
+endif # TERRAFORM_LAMBDALABS
diff --git a/terraform/lambdalabs/kconfigs/Kconfig.compute b/terraform/lambdalabs/kconfigs/Kconfig.compute
new file mode 100644
index 0000000..579e720
--- /dev/null
+++ b/terraform/lambdalabs/kconfigs/Kconfig.compute
@@ -0,0 +1,48 @@
+# Lambda Labs compute configuration
+
+if TERRAFORM_LAMBDALABS_REGION_SMART_CHEAPEST
+
+comment "Instance type: Automatically selected (cheapest available)"
+comment "Enable manual region selection to choose specific instance type"
+
+endif # TERRAFORM_LAMBDALABS_REGION_SMART_CHEAPEST
+
+# Include dynamically generated instance types when not using smart cheapest selection
+if !TERRAFORM_LAMBDALABS_REGION_SMART_CHEAPEST
+source "terraform/lambdalabs/kconfigs/Kconfig.compute.generated"
+endif
+
+config TERRAFORM_LAMBDALABS_INSTANCE_TYPE
+	string
+	output yaml
+	default $(shell, python3 scripts/lambdalabs_smart_inference.py instance) if TERRAFORM_LAMBDALABS_REGION_SMART_CHEAPEST
+	# Dynamically generated mappings for all instance types
+	default "cpu_4x_general" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_CPU_4X_GENERAL
+	default "gpu_1x_a10" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_1X_A10
+	default "gpu_1x_a100" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_1X_A100
+	default "gpu_1x_a100_sxm4" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_1X_A100_SXM4
+	default "gpu_1x_a6000" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_1X_A6000
+	default "gpu_1x_gh200" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_1X_GH200
+	default "gpu_1x_h100_pcie" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_1X_H100_PCIE
+	default "gpu_1x_h100_sxm5" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_1X_H100_SXM5
+	default "gpu_1x_rtx6000" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_1X_RTX6000
+	default "gpu_2x_a100" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_2X_A100
+	default "gpu_2x_a6000" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_2X_A6000
+	default "gpu_2x_h100_sxm5" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_2X_H100_SXM5
+	default "gpu_4x_a100" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_4X_A100
+	default "gpu_4x_a6000" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_4X_A6000
+	default "gpu_4x_h100_sxm5" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_4X_H100_SXM5
+	default "gpu_8x_a100" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_8X_A100
+	default "gpu_8x_a100_80gb_sxm4" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_8X_A100_80GB_SXM4
+	default "gpu_8x_b200_sxm6" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_8X_B200_SXM6
+	default "gpu_8x_h100_sxm5" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_8X_H100_SXM5
+	default "gpu_8x_v100" if TERRAFORM_LAMBDALABS_INSTANCE_TYPE_GPU_8X_V100
+
+# OS image is not configurable - provider limitation
+config TERRAFORM_LAMBDALABS_IMAGE
+	string
+	default "ubuntu-22.04"
+	help
+	  Lambda Labs terraform provider does NOT support OS/image selection.
+	  The provider always deploys Ubuntu 22.04. This is a placeholder
+	  config that exists only for consistency with other cloud providers.
diff --git a/terraform/lambdalabs/kconfigs/Kconfig.identity b/terraform/lambdalabs/kconfigs/Kconfig.identity
new file mode 100644
index 0000000..5bc2602
--- /dev/null
+++ b/terraform/lambdalabs/kconfigs/Kconfig.identity
@@ -0,0 +1,76 @@
+# Lambda Labs identity and access configuration
+
+# SSH Key Security Model
+# =======================
+# For security, each kdevops project directory should use its own SSH key.
+# This prevents key sharing between different projects and environments.
+#
+# Two modes are supported:
+# 1. Unique keys per directory (recommended) - Each project gets its own key
+# 2. Shared key (legacy) - Use a common key name across projects
+
+choice
+	prompt "Lambda Labs SSH key management strategy"
+	default TERRAFORM_LAMBDALABS_SSH_KEY_UNIQUE
+	help
+	  Choose how SSH keys are managed for Lambda Labs instances.
+
+	  Unique keys (recommended): Each project directory gets its own SSH key,
+	  preventing key sharing between projects. The key name includes a hash
+	  of the directory path for uniqueness.
+
+	  Shared key: Use the same key name across all projects (less secure).
+
+config TERRAFORM_LAMBDALABS_SSH_KEY_UNIQUE
+	bool "Use unique SSH key per project directory (recommended)"
+	help
+	  Generate a unique SSH key name for each kdevops project directory.
+	  This improves security by ensuring projects don't share SSH keys.
+
+	  The key name will be generated based on the directory path, like:
+	  "kdevops-lambda-kdevops-a1b2c3d4"
+
+	  The key will be automatically created and uploaded to Lambda Labs
+	  when you run 'make bringup' if it doesn't already exist.
+
+config TERRAFORM_LAMBDALABS_SSH_KEY_SHARED
+	bool "Use shared SSH key name (legacy)"
+	help
+	  Use a fixed SSH key name that you specify. This is less secure
+	  as multiple projects might share the same key.
+
+	  You'll need to ensure the key exists in Lambda Labs before
+	  running 'make bringup'.
+
+endchoice
+
+config TERRAFORM_LAMBDALABS_SSH_KEY_NAME_CUSTOM
+	string "Custom SSH key name (only for shared mode)"
+	default "kdevops-lambdalabs"
+	depends on TERRAFORM_LAMBDALABS_SSH_KEY_SHARED
+	help
+	  Specify the custom SSH key name to use when in shared mode.
+	  This key must already exist in your Lambda Labs account.
+
+config TERRAFORM_LAMBDALABS_SSH_KEY_NAME
+	string
+	output yaml
+	default $(shell, python3 scripts/lambdalabs_ssh_key_name.py 2>/dev/null || echo "kdevops-lambdalabs") if TERRAFORM_LAMBDALABS_SSH_KEY_UNIQUE
+	default TERRAFORM_LAMBDALABS_SSH_KEY_NAME_CUSTOM if TERRAFORM_LAMBDALABS_SSH_KEY_SHARED
+
+config TERRAFORM_LAMBDALABS_SSH_KEY_AUTO_CREATE
+	bool "Automatically create and upload SSH key if missing"
+	default y if TERRAFORM_LAMBDALABS_SSH_KEY_UNIQUE
+	default n if TERRAFORM_LAMBDALABS_SSH_KEY_SHARED
+	help
+	  When enabled, kdevops will automatically:
+	  1. Generate a new SSH key pair if it doesn't exist
+	  2. Upload the public key to Lambda Labs if not already there
+	  3. Clean up the key when destroying infrastructure
+
+	  This is enabled by default for unique keys mode and disabled
+	  for shared key mode.
+
+# Note: Lambda Labs doesn't support custom SSH users
+# Instances always use the OS default user (ubuntu for Ubuntu 22.04)
+# To handle this, we disable SSH user inference for Lambda Labs
diff --git a/terraform/lambdalabs/kconfigs/Kconfig.location b/terraform/lambdalabs/kconfigs/Kconfig.location
new file mode 100644
index 0000000..7dd3d5b
--- /dev/null
+++ b/terraform/lambdalabs/kconfigs/Kconfig.location
@@ -0,0 +1,73 @@
+# Lambda Labs location configuration with smart inference
+
+choice
+	prompt "Lambda Labs region selection method"
+	default TERRAFORM_LAMBDALABS_REGION_SMART_CHEAPEST
+	help
+	  Select how to choose the Lambda Labs region for deployment.
+
+config TERRAFORM_LAMBDALABS_REGION_SMART_CHEAPEST
+	bool "Smart selection - automatically select cheapest instance in closest region"
+	help
+	  Enable smart inference that:
+	  1. Determines your location from public IP
+	  2. Finds all available instance/region combinations
+	  3. Selects the cheapest instance type
+	  4. Picks the closest region where that instance is available
+
+	  This ensures you get the most affordable option with lowest latency.
+
+config TERRAFORM_LAMBDALABS_REGION_SMART_INFER
+	bool "Smart inference - automatically select region with available capacity"
+	help
+	  Automatically selects a region that has available capacity for your
+	  chosen instance type. This eliminates manual checking of region availability.
+
+config TERRAFORM_LAMBDALABS_REGION_MANUAL
+	bool "Manual region selection"
+	help
+	  Manually select a specific region. Note that the selected region
+	  may not have capacity for your chosen instance type.
+
+endchoice
+
+if TERRAFORM_LAMBDALABS_REGION_SMART_CHEAPEST
+
+comment "Region: Automatically selected (closest with cheapest instance)"
+comment "Instance: Automatically selected (cheapest available)"
+
+endif # TERRAFORM_LAMBDALABS_REGION_SMART_CHEAPEST
+
+if TERRAFORM_LAMBDALABS_REGION_SMART_INFER
+
+comment "Region: Automatically selected based on instance availability"
+
+endif # TERRAFORM_LAMBDALABS_REGION_SMART_INFER
+
+if TERRAFORM_LAMBDALABS_REGION_MANUAL
+# Include dynamically generated regions when using manual selection
+source "terraform/lambdalabs/kconfigs/Kconfig.location.generated"
+endif # TERRAFORM_LAMBDALABS_REGION_MANUAL
+
+config TERRAFORM_LAMBDALABS_REGION
+	string
+	output yaml
+	default $(shell, python3 scripts/lambdalabs_smart_inference.py region) if TERRAFORM_LAMBDALABS_REGION_SMART_CHEAPEST
+	default $(shell, scripts/lambdalabs_infer_region.py $(TERRAFORM_LAMBDALABS_INSTANCE_TYPE)) if TERRAFORM_LAMBDALABS_REGION_SMART_INFER
+	default "us-tx-1" if TERRAFORM_LAMBDALABS_REGION_US_TX_1
+	default "us-midwest-1" if TERRAFORM_LAMBDALABS_REGION_US_MIDWEST_1
+	default "us-west-1" if TERRAFORM_LAMBDALABS_REGION_US_WEST_1
+	default "us-west-2" if TERRAFORM_LAMBDALABS_REGION_US_WEST_2
+	default "us-west-3" if TERRAFORM_LAMBDALABS_REGION_US_WEST_3
+	default "us-south-1" if TERRAFORM_LAMBDALABS_REGION_US_SOUTH_1
+	default "us-south-2" if TERRAFORM_LAMBDALABS_REGION_US_SOUTH_2
+	default "us-south-3" if TERRAFORM_LAMBDALABS_REGION_US_SOUTH_3
+	default "europe-central-1" if TERRAFORM_LAMBDALABS_REGION_EU_CENTRAL_1
+	default "asia-northeast-1" if TERRAFORM_LAMBDALABS_REGION_ASIA_NORTHEAST_1
+	default "asia-northeast-2" if TERRAFORM_LAMBDALABS_REGION_ASIA_NORTHEAST_2
+	default "asia-south-1" if TERRAFORM_LAMBDALABS_REGION_ASIA_SOUTH_1
+	default "me-west-1" if TERRAFORM_LAMBDALABS_REGION_ME_WEST_1
+	default "us-east-1" if TERRAFORM_LAMBDALABS_REGION_US_EAST_1
+	default "us-east-3" if TERRAFORM_LAMBDALABS_REGION_US_EAST_3
+	default "australia-east-1" if TERRAFORM_LAMBDALABS_REGION_AUSTRALIA_EAST_1
+	default "us-tx-1"
diff --git a/terraform/lambdalabs/kconfigs/Kconfig.smart b/terraform/lambdalabs/kconfigs/Kconfig.smart
new file mode 100644
index 0000000..fb4e385
--- /dev/null
+++ b/terraform/lambdalabs/kconfigs/Kconfig.smart
@@ -0,0 +1,25 @@
+# Lambda Labs Smart Inference Configuration
+
+config TERRAFORM_LAMBDALABS_SMART_CHEAPEST
+	bool "Automatically select cheapest available instance in closest region"
+	default y
+	help
+	  Enable smart inference that:
+	  1. Determines your location from public IP
+	  2. Finds all available instance/region combinations
+	  3. Selects the cheapest instance type
+	  4. Picks the closest region where that instance is available
+
+	  This ensures you get the most affordable option with lowest latency.
+
+if TERRAFORM_LAMBDALABS_SMART_CHEAPEST
+
+config TERRAFORM_LAMBDALABS_SMART_INSTANCE
+	string
+	default $(shell, python3 scripts/lambdalabs_smart_inference.py instance)
+
+config TERRAFORM_LAMBDALABS_SMART_REGION
+	string
+	default $(shell, python3 scripts/lambdalabs_smart_inference.py region)
+
+endif # TERRAFORM_LAMBDALABS_SMART_CHEAPEST
diff --git a/terraform/lambdalabs/kconfigs/Kconfig.storage b/terraform/lambdalabs/kconfigs/Kconfig.storage
new file mode 100644
index 0000000..4a91702
--- /dev/null
+++ b/terraform/lambdalabs/kconfigs/Kconfig.storage
@@ -0,0 +1,12 @@
+# Lambda Labs storage configuration
+#
+# NOTE: The Lambda Labs terraform provider (elct9620/lambdalabs v0.3.0) does NOT support
+# storage volume management. Instances come with their default storage only.
+#
+# If you need additional storage, you must:
+# 1. Use the Lambda Labs web console to attach volumes manually
+# 2. Or use a different cloud provider that supports storage management
+#
+# This file is kept as a placeholder for future provider updates.
+
+# No configuration options available - provider doesn't support storage management
-- 
2.50.1
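
Reader's note on the unique SSH key naming: the help text for TERRAFORM_LAMBDALABS_SSH_KEY_UNIQUE says the key name is generated from the directory path and looks like "kdevops-lambda-kdevops-a1b2c3d4". The script that does this, scripts/lambdalabs_ssh_key_name.py, is not included in this patch, so the following is only a minimal sketch of how such a name could be derived. The function name unique_ssh_key_name, the use of SHA-256, and the 8-character truncation are assumptions made for illustration, not the actual implementation.

```python
import hashlib
from pathlib import Path


def unique_ssh_key_name(project_dir: str, prefix: str = "kdevops-lambda") -> str:
    """Hypothetical helper: build a per-directory SSH key name.

    Assumed scheme (not confirmed by the patch):
    <prefix>-<directory basename>-<first 8 hex chars of SHA-256 of the path>.
    """
    path = Path(project_dir).resolve()
    # Hash the full resolved path so two checkouts with the same basename
    # in different locations still get different key names.
    digest = hashlib.sha256(str(path).encode("utf-8")).hexdigest()[:8]
    # Path.name is empty for the filesystem root, so fall back to a fixed label.
    basename = path.name or "root"
    return f"{prefix}-{basename}-{digest}"


if __name__ == "__main__":
    # Print the key name for the current working directory.
    print(unique_ssh_key_name("."))
```

Because the name depends only on the resolved path, running it twice in the same directory yields the same key name, which matches the "create only if it doesn't already exist" behaviour described for 'make bringup'.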