From: Luis Chamberlain
To: Chuck Lever, Daniel Gomez, kdevops@lists.linux.dev
Cc: Luis Chamberlain
Subject: [PATCH v2 05/10] kconfig: add dynamic cloud provider configuration infrastructure
Date: Wed, 27 Aug 2025 14:28:56 -0700
Message-ID: <20250827212902.4021990-6-mcgrof@kernel.org>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250827212902.4021990-1-mcgrof@kernel.org>
References: <20250827212902.4021990-1-mcgrof@kernel.org>
X-Mailing-List: kdevops@lists.linux.dev
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Introduce a new dynamic configuration system that queries cloud provider
APIs to generate real-time Kconfig options. This allows configuration
menus to show current availability and capacity directly from the cloud.

The system adds:
- Python script to query APIs and generate Kconfig files
- New 'make cloud-config' target for cloud-specific updates
- Integration with existing 'make dynconfig' infrastructure
- CLOUD_INITIALIZED marker to detect cloud setup state
- Automatic fallback to static defaults when API unavailable

This sets up the framework for API-driven configuration but doesn't
enable any specific providers yet. The architecture supports future
extension to AWS, Azure, GCE, and other cloud providers.

Generated-by: Claude AI
Signed-off-by: Luis Chamberlain
---
 kconfigs/Kconfig.bringup               |   5 +
 scripts/dynamic-cloud-kconfig.Makefile |  44 +++++
 scripts/dynamic-kconfig.Makefile       |   2 +
 scripts/generate_cloud_configs.py      | 223 +++++++++++++++++++++++++
 scripts/lambdalabs_ssh_keys.py         |   3 +-
 5 files changed, 276 insertions(+), 1 deletion(-)
 create mode 100644 scripts/dynamic-cloud-kconfig.Makefile
 create mode 100755 scripts/generate_cloud_configs.py

diff --git a/kconfigs/Kconfig.bringup b/kconfigs/Kconfig.bringup
index 8caf07b..b64ba50 100644
--- a/kconfigs/Kconfig.bringup
+++ b/kconfigs/Kconfig.bringup
@@ -9,8 +9,13 @@ config KDEVOPS_ENABLE_NIXOS
 	bool
 	output yaml
 
+config CLOUD_INITIALIZED
+	bool
+	default $(shell, test -f .cloud.initialized && echo y || echo n) = "y"
+
 choice
 	prompt "Node bring up method"
+	default TERRAFORM if CLOUD_INITIALIZED
 	default GUESTFS
 
 config GUESTFS
diff --git a/scripts/dynamic-cloud-kconfig.Makefile b/scripts/dynamic-cloud-kconfig.Makefile
new file mode 100644
index 0000000..cc0a6b8
--- /dev/null
+++ b/scripts/dynamic-cloud-kconfig.Makefile
@@ -0,0 +1,44 @@
+# SPDX-License-Identifier: copyleft-next-0.3.1
+# Dynamic cloud provider Kconfig generation
+
+DYNAMIC_CLOUD_KCONFIG :=
+DYNAMIC_CLOUD_KCONFIG_ARGS :=
+
+# Lambda Labs dynamic configuration
+LAMBDALABS_KCONFIG_DIR := terraform/lambdalabs/kconfigs
+LAMBDALABS_KCONFIG_COMPUTE := $(LAMBDALABS_KCONFIG_DIR)/Kconfig.compute.generated
+LAMBDALABS_KCONFIG_LOCATION := $(LAMBDALABS_KCONFIG_DIR)/Kconfig.location.generated
+LAMBDALABS_KCONFIG_IMAGES := $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.generated
+
+LAMBDALABS_KCONFIGS := $(LAMBDALABS_KCONFIG_COMPUTE) $(LAMBDALABS_KCONFIG_LOCATION) $(LAMBDALABS_KCONFIG_IMAGES)
+
+# Individual Lambda Labs targets are now handled by generate_cloud_configs.py
+cloud-config-lambdalabs:
+	$(Q)python3 scripts/generate_cloud_configs.py
+
+# Clean Lambda Labs generated files
+clean-cloud-config-lambdalabs:
+	$(Q)rm -f $(LAMBDALABS_KCONFIGS)
+
+DYNAMIC_CLOUD_KCONFIG += cloud-config-lambdalabs
+
+cloud-config-help:
+	@echo "Cloud-specific dynamic kconfig targets:"
+	@echo "cloud-config - generates all cloud provider dynamic kconfig content"
+	@echo "cloud-config-lambdalabs - generates Lambda Labs dynamic kconfig content"
+	@echo "clean-cloud-config - removes all generated cloud kconfig files"
+	@echo "cloud-list-all - list all cloud instances for configured provider"
+
+HELP_TARGETS += cloud-config-help
+
+cloud-config:
+	$(Q)python3 scripts/generate_cloud_configs.py
+
+clean-cloud-config: clean-cloud-config-lambdalabs
+	$(Q)echo "Cleaned all cloud provider dynamic Kconfig files."
+
+cloud-list-all:
+	$(Q)chmod +x scripts/cloud_list_all.sh
+	$(Q)scripts/cloud_list_all.sh
+
+PHONY += cloud-config cloud-config-lambdalabs clean-cloud-config clean-cloud-config-lambdalabs cloud-config-help cloud-list-all
diff --git a/scripts/dynamic-kconfig.Makefile b/scripts/dynamic-kconfig.Makefile
index b6c0e43..bab83e3 100644
--- a/scripts/dynamic-kconfig.Makefile
+++ b/scripts/dynamic-kconfig.Makefile
@@ -6,6 +6,7 @@ DYNAMIC_KCONFIG_PCIE_ARGS :=
 HELP_TARGETS += dynamic-kconfig-help
 
 include $(TOPDIR)/scripts/dynamic-pci-kconfig.Makefile
+include $(TOPDIR)/scripts/dynamic-cloud-kconfig.Makefile
 
 ANSIBLE_EXTRA_ARGS += $(DYNAMIC_KCONFIG_PCIE_ARGS)
 
@@ -19,5 +20,6 @@ PHONY += dynamic-kconfig-help
 
 dynconfig:
 	$(Q)$(MAKE) dynconfig-pci
+	$(Q)$(MAKE) cloud-config
 
 PHONY += dynconfig
diff --git a/scripts/generate_cloud_configs.py b/scripts/generate_cloud_configs.py
new file mode 100755
index 0000000..294a1d9
--- /dev/null
+++ b/scripts/generate_cloud_configs.py
@@ -0,0 +1,223 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: copyleft-next-0.3.1
+
+"""
+Generate dynamic cloud configurations for all supported providers.
+Provides a summary of available options and pricing.
+"""
+
+import os
+import sys
+import subprocess
+import json
+import urllib.request
+import urllib.error
+from typing import Dict, List, Optional, Tuple
+
+# Import our credentials module
+sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+from lambdalabs_credentials import get_api_key as get_api_key_from_credentials
+
+
+def get_lambdalabs_summary() -> Tuple[bool, str]:
+    """
+    Get a summary of Lambda Labs configurations.
+    Returns (success, summary_string)
+    """
+    api_key = get_api_key_from_credentials()
+    if not api_key:
+        return False, "Lambda Labs: API key not set - using defaults"
+
+    try:
+        # Get instance types with capacity
+        headers = {"Authorization": f"Bearer {api_key}", "User-Agent": "kdevops/1.0"}
+        req = urllib.request.Request(
+            "https://cloud.lambdalabs.com/api/v1/instance-types", headers=headers
+        )
+
+        with urllib.request.urlopen(req) as response:
+            data = json.loads(response.read().decode())
+
+        if "data" not in data:
+            return False, "Lambda Labs: Invalid API response"
+
+        # Get pricing data
+        pricing = {
+            "gpu_1x_gh200": 1.49,
+            "gpu_1x_h100_sxm": 3.29,
+            "gpu_1x_h100_pcie": 2.49,
+            "gpu_1x_a100": 1.29,
+            "gpu_1x_a100_sxm": 1.29,
+            "gpu_1x_a100_pcie": 1.29,
+            "gpu_1x_a10": 0.75,
+            "gpu_1x_a6000": 0.80,
+            "gpu_1x_rtx6000": 0.50,
+            "gpu_1x_quadro_rtx_6000": 0.50,
+            "gpu_2x_h100_sxm": 6.38,
+            "gpu_2x_a100": 2.58,
+            "gpu_2x_a100_pcie": 2.58,
+            "gpu_2x_a6000": 1.60,
+            "gpu_4x_h100_sxm": 12.36,
+            "gpu_4x_a100": 5.16,
+            "gpu_4x_a100_pcie": 5.16,
+            "gpu_4x_a6000": 3.20,
+            "gpu_8x_b200_sxm": 39.92,
+            "gpu_8x_h100_sxm": 23.92,
+            "gpu_8x_a100_80gb": 14.32,
+            "gpu_8x_a100_80gb_sxm": 14.32,
+            "gpu_8x_a100": 10.32,
+            "gpu_8x_a100_40gb": 10.32,
+            "gpu_8x_v100": 4.40,
+        }
+
+        # Count available instances and get price range
+        available_count = 0
+        total_count = len(data["data"])
+        available_prices = []
+        all_regions = set()
+
+        for instance_type, info in data["data"].items():
+            regions = info.get("regions_with_capacity_available", [])
+            if regions:
+                available_count += 1
+                if instance_type in pricing:
+                    available_prices.append(pricing[instance_type])
+                for r in regions:
+                    all_regions.add(r["name"])
+
+        # Format summary
+        if available_prices:
+            min_price = min(available_prices)
+            max_price = max(available_prices)
+            price_range = f"${min_price:.2f}-${max_price:.2f}/hr"
+        else:
+            price_range = "pricing varies"
+
+        region_count = len(all_regions)
+
+        return (
+            True,
+            f"Lambda Labs: {available_count}/{total_count} GPU types available, {region_count} regions, {price_range}",
+        )
+
+    except urllib.error.HTTPError as e:
+        if e.code == 403:
+            return False, "Lambda Labs: API key invalid - using defaults"
+        else:
+            return False, f"Lambda Labs: API error {e.code}"
+    except Exception as e:
+        return False, f"Lambda Labs: Error - {str(e)}"
+
+
+def generate_lambdalabs_configs(output_dir: str) -> bool:
+    """Generate Lambda Labs Kconfig files."""
+    try:
+        # Run the lambdalabs_api.py script
+        result = subprocess.run(
+            [sys.executable, "scripts/lambdalabs_api.py", "all", output_dir],
+            capture_output=True,
+            text=True,
+        )
+
+        if result.returncode != 0:
+            print(
+                f" ⚠ Error generating Lambda Labs configs: {result.stderr}",
+                file=sys.stderr,
+            )
+            return False
+
+        return True
+    except Exception as e:
+        print(f" ⚠ Error: {e}", file=sys.stderr)
+        return False
+
+
+def generate_aws_configs(output_dir: str) -> bool:
+    """
+    Generate AWS Kconfig files (placeholder for future implementation).
+    """
+    # For now, just return True as AWS uses static configs
+    return True
+
+
+def generate_azure_configs(output_dir: str) -> bool:
+    """
+    Generate Azure Kconfig files (placeholder for future implementation).
+    """
+    # For now, just return True as Azure uses static configs
+    return True
+
+
+def generate_gce_configs(output_dir: str) -> bool:
+    """
+    Generate GCE Kconfig files (placeholder for future implementation).
+    """
+    # For now, just return True as GCE uses static configs
+    return True
+
+
+def main():
+    """Main function to generate all cloud configurations."""
+    print("Generating dynamic cloud configurations based on latest data...")
+    print()
+
+    # Create .cloud.initialized marker file to signal cloud support is configured
+    # This will be used by Kconfig to set intelligent defaults
+    try:
+        with open(".cloud.initialized", "w") as f:
+            f.write("# This file indicates cloud support has been initialized\n")
+            f.write("# Created by 'make cloud-config'\n")
+            f.write("# Kconfig will use this to set cloud-related defaults\n")
+    except Exception as e:
+        print(f" ⚠ Warning: Could not create .cloud.initialized: {e}", file=sys.stderr)
+
+    # Get summaries for each provider
+    providers = []
+
+    # Lambda Labs
+    success, summary = get_lambdalabs_summary()
+    providers.append(("Lambda Labs", success, summary))
+
+    # Future providers (placeholders for when we add dynamic support)
+    # When these providers get dynamic config support, they would show:
+    # providers.append(("AWS", True, "AWS: 100+ instance types, 26 regions, $0.01-$40.00/hr"))
+    # providers.append(("Azure", True, "Azure: 200+ VM sizes, 60+ regions, $0.01-$50.00/hr"))
+    # providers.append(("GCE", True, "GCE: 50+ machine types, 35 regions, $0.01-$30.00/hr"))
+
+    # Print summaries
+    for provider, success, summary in providers:
+        if success:
+            print(f" ✓ {summary}")
+        else:
+            print(f" ⚠ {summary}")
+
+    print()
+
+    # Generate configurations for each provider
+    configs_generated = []
+
+    # Lambda Labs
+    print(" • Generating Lambda Labs configurations...")
+    if generate_lambdalabs_configs("terraform/lambdalabs/kconfigs"):
+        configs_generated.append("Lambda Labs")
+        print(" ✓ Instance types, regions, and capacity information updated")
+    else:
+        print(" ⚠ Using default configurations")
+
+    # Future providers would go here
+    # print(" • AWS configurations (static)...")
+    # configs_generated.append("AWS")
+
+    print()
+
+    if configs_generated:
+        print(f"✓ Cloud configurations ready for: {', '.join(configs_generated)}")
+        print(" Run 'make menuconfig' to select your cloud provider and options")
+    else:
+        print("⚠ No dynamic configurations were generated, using defaults")
+
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/scripts/lambdalabs_ssh_keys.py b/scripts/lambdalabs_ssh_keys.py
index d4caede..2fa9880 100755
--- a/scripts/lambdalabs_ssh_keys.py
+++ b/scripts/lambdalabs_ssh_keys.py
@@ -270,7 +270,8 @@ def main():
         print("Error: Lambda Labs API key not found", file=sys.stderr)
         print("Please configure your API key:", file=sys.stderr)
         print(
-            " python3 scripts/lambdalabs_credentials.py set 'your-api-key'", file=sys.stderr
+            " python3 scripts/lambdalabs_credentials.py set 'your-api-key'",
+            file=sys.stderr,
         )
         sys.exit(1)
 
-- 
2.50.1
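
Note (not part of the patch): the availability summary computed in get_lambdalabs_summary() can be exercised offline. The sketch below restates that counting and price-range logic as a standalone function so it can be run against a canned response instead of the live API. The function name summarize(), the sample payload, and the region names are hypothetical and chosen only for illustration; the payload shape ("regions_with_capacity_available" entries carrying a "name" key) is assumed from how the patch reads the response, not from the provider's documentation.

```python
from typing import Dict, List


def summarize(data: Dict[str, dict], pricing: Dict[str, float]) -> str:
    """Build the one-line summary from a decoded 'data' mapping.

    Hypothetical helper: mirrors the loop in get_lambdalabs_summary(),
    but takes the response and price table as arguments.
    """
    available = 0
    prices: List[float] = []
    regions = set()

    for instance_type, info in data.items():
        with_capacity = info.get("regions_with_capacity_available", [])
        if not with_capacity:
            continue  # no capacity anywhere: counts toward the total only
        available += 1
        if instance_type in pricing:
            prices.append(pricing[instance_type])
        regions.update(r["name"] for r in with_capacity)

    if prices:
        price_range = f"${min(prices):.2f}-${max(prices):.2f}/hr"
    else:
        price_range = "pricing varies"

    return (
        f"Lambda Labs: {available}/{len(data)} GPU types available, "
        f"{len(regions)} regions, {price_range}"
    )


# Canned response for illustration; region names are made up.
sample = {
    "gpu_1x_a10": {"regions_with_capacity_available": [{"name": "us-west-1"}]},
    "gpu_8x_h100_sxm": {
        "regions_with_capacity_available": [
            {"name": "us-west-1"},
            {"name": "us-east-1"},
        ]
    },
    "gpu_1x_a100": {"regions_with_capacity_available": []},
}
pricing = {"gpu_1x_a10": 0.75, "gpu_8x_h100_sxm": 23.92, "gpu_1x_a100": 1.29}

summary = summarize(sample, pricing)
print(summary)
# Lambda Labs: 2/3 GPU types available, 2 regions, $0.75-$23.92/hr
```

Keeping the computation separate from the HTTP request in this way would let the fallback path ("pricing varies", zero regions) be checked without an API key.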