From: Luis Chamberlain
To: Chuck Lever, Daniel Gomez, kdevops@lists.linux.dev
Cc: Luis Chamberlain
Subject: [PATCH 1/3] aws: add dynamic cloud configuration support using AWS CLI
Date: Thu, 4 Sep 2025 02:00:27 -0700
Message-ID: <20250904090030.2481840-2-mcgrof@kernel.org>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250904090030.2481840-1-mcgrof@kernel.org>
References: <20250904090030.2481840-1-mcgrof@kernel.org>
Precedence: bulk
X-Mailing-List: kdevops@lists.linux.dev
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add support for dynamically generating AWS instance types and regions
configuration using the AWS CLI, similar to the Lambda Labs implementation.

This allows users to:
- Query real-time AWS instance availability
- Generate Kconfig files with current instance families and regions
- Choose between dynamic and static configuration modes
- See pricing estimates and resource summaries

Key components:
- scripts/aws-cli: AWS CLI wrapper tool for kdevops
- scripts/aws_api.py: Low-level AWS API functions
- Updated generate_cloud_configs.py to support AWS
- Makefile integration for AWS Kconfig generation
- Option to use dynamic or static AWS configuration

Usage: Run 'make cloud-config' to generate dynamic configuration.

This also parallelizes cloud provider operations to significantly speed up
generation.
$ time make cloud-config
Cloud Provider Configuration Summary
============================================================

✓ Lambda Labs: 14/20 instances available, 14 regions, $0.50-$10.32/hr
  Kconfig files generated successfully
✓ AWS: 979 instance types available, 17 regions, ~$0.05-$3.60/hr
  Kconfig files generated successfully
⚠ Azure: Dynamic configuration not yet implemented
⚠ GCE: Dynamic configuration not yet implemented

Note: Dynamic configurations query real-time availability
Run 'make menuconfig' to configure your cloud provider

real 6m51.859s
user 37m16.347s
sys 3m8.130s

This also adds support for GPU AMIs:
- AWS Deep Learning AMI with pre-installed NVIDIA drivers, CUDA, and ML
  frameworks
- NVIDIA Deep Learning AMI option for NGC containers
- Custom GPU AMI support for specialized images
- Automatic detection of GPU instance types
- Conditional display of GPU AMI options only for GPU instances

We automatically detect when you select a GPU instance family (like G6E)
and provide appropriate GPU-optimized AMI options, including the AWS Deep
Learning AMI with all necessary drivers and frameworks pre-installed.
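The GPU family detection described above can be sketched roughly as
follows. This is a minimal sketch under stated assumptions, not the code in
scripts/aws_api.py: the helper name gpu_families() is hypothetical, and it
assumes each record has the shape returned by
'aws ec2 describe-instance-types', where GPU-equipped types carry a
GpuInfo.Gpus list.

```python
from typing import Any, Dict, Iterable, Set


def gpu_families(instance_types: Iterable[Dict[str, Any]]) -> Set[str]:
    """Return families (e.g. "g6e") that contain at least one GPU instance type.

    Hypothetical helper for illustration; assumes describe-instance-types
    style records with an optional "GpuInfo" -> "Gpus" list.
    """
    families: Set[str] = set()
    for record in instance_types:
        name = record.get("InstanceType", "")
        gpus = record.get("GpuInfo", {}).get("Gpus", [])
        # A family prefix is everything before the first dot: "g6e.2xlarge" -> "g6e"
        if "." in name and gpus:
            families.add(name.split(".")[0])
    return families


# Hand-written sample records, not real API output
sample = [
    {"InstanceType": "g6e.2xlarge", "GpuInfo": {"Gpus": [{"Name": "L40S", "Count": 1}]}},
    {"InstanceType": "m5.large"},
]
print(sorted(gpu_families(sample)))  # prints ['g6e']
```

A Kconfig generator could then emit the GPU AMI options only under the
families this returns, which matches the conditional display listed above.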
Generated-by: Claude AI
Signed-off-by: Luis Chamberlain
---
 .gitignore                                    |    3 +
 defconfigs/aws-gpu-g6e-ai                     |   53 +
 .../templates/aws/terraform.tfvars.j2         |    5 +
 scripts/aws-cli                               |  436 +++++++
 scripts/aws_api.py                            | 1135 +++++++++++++++++
 scripts/dynamic-cloud-kconfig.Makefile        |   88 +-
 scripts/generate_cloud_configs.py             |  198 ++-
 terraform/aws/kconfigs/Kconfig.compute        |  109 +-
 8 files changed, 1937 insertions(+), 90 deletions(-)
 create mode 100644 defconfigs/aws-gpu-g6e-ai
 create mode 100755 scripts/aws-cli
 create mode 100755 scripts/aws_api.py

diff --git a/.gitignore b/.gitignore
index 09d2ae33..30337add 100644
--- a/.gitignore
+++ b/.gitignore
@@ -115,3 +115,6 @@ terraform/lambdalabs/.terraform_api_key
 .cloud.initialized
 
 scripts/__pycache__/
+.aws_cloud_config_generated
+terraform/aws/kconfigs/*.generated
+terraform/aws/kconfigs/instance-types/*.generated
diff --git a/defconfigs/aws-gpu-g6e-ai b/defconfigs/aws-gpu-g6e-ai
new file mode 100644
index 00000000..affc7a98
--- /dev/null
+++ b/defconfigs/aws-gpu-g6e-ai
@@ -0,0 +1,53 @@
+# AWS G6e.2xlarge GPU instance with Deep Learning AMI for AI/ML workloads
+# This configuration sets up an AWS G6e.2xlarge instance with NVIDIA L40S GPU
+# optimized for machine learning, AI inference, and GPU-accelerated workloads
+
+# Cloud provider configuration
+CONFIG_KDEVOPS_ENABLE_TERRAFORM=y
+CONFIG_TERRAFORM=y
+CONFIG_TERRAFORM_AWS=y
+
+# AWS Dynamic configuration (required for G6E instance family and GPU AMIs)
+CONFIG_TERRAFORM_AWS_USE_DYNAMIC_CONFIG=y
+
+# AWS Instance configuration - G6E family with NVIDIA L40S GPU
+# G6E.2XLARGE specifications:
+# - 8 vCPUs (3rd Gen AMD EPYC processors)
+# - 32 GB system RAM
+# - 1x NVIDIA L40S Tensor Core GPU
+# - 48 GB GPU memory
+# - Up to 15 Gbps network performance
+# - Up to 10 Gbps EBS bandwidth
+CONFIG_TERRAFORM_AWS_INSTANCE_TYPE_G6E=y
+CONFIG_TERRAFORM_AWS_INSTANCE_G6E_2XLARGE=y
+
+# AWS Region - US East (N. Virginia) - primary availability for G6E
+CONFIG_TERRAFORM_AWS_REGION_US_EAST_1=y
+
+# GPU-optimized Deep Learning AMI
+# Includes: NVIDIA drivers 535+, CUDA 12.x, cuDNN, TensorFlow, PyTorch, MXNet
+CONFIG_TERRAFORM_AWS_USE_GPU_AMI=y
+CONFIG_TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING=y
+CONFIG_TERRAFORM_AWS_GPU_AMI_NAME="Deep Learning AMI GPU TensorFlow*"
+CONFIG_TERRAFORM_AWS_GPU_AMI_OWNER="amazon"
+
+# Storage configuration optimized for ML workloads
+# 200 GB for datasets, models, and experiment artifacts
+CONFIG_TERRAFORM_AWS_DATA_VOLUME_SIZE=200
+
+# Basic workflow configuration for kernel development
+CONFIG_WORKFLOWS=y
+CONFIG_WORKFLOW_LINUX_CUSTOM=y
+CONFIG_BOOTLINUX=y
+
+# Skip testing workflows for pure AI/ML setup
+CONFIG_WORKFLOWS_TESTS=n
+
+# Enable systemd journal remote for debugging
+CONFIG_DEVCONFIG_ENABLE_SYSTEMD_JOURNAL_REMOTE=y
+
+# Note: After provisioning, the instance will have:
+# - Jupyter notebook server ready for ML experiments
+# - Pre-installed deep learning frameworks
+# - NVIDIA GPU drivers and CUDA toolkit
+# - Docker with NVIDIA Container Toolkit for containerized ML workloads
\ No newline at end of file
diff --git a/playbooks/roles/gen_tfvars/templates/aws/terraform.tfvars.j2 b/playbooks/roles/gen_tfvars/templates/aws/terraform.tfvars.j2
index d880254b..f8f4c842 100644
--- a/playbooks/roles/gen_tfvars/templates/aws/terraform.tfvars.j2
+++ b/playbooks/roles/gen_tfvars/templates/aws/terraform.tfvars.j2
@@ -1,8 +1,13 @@
 aws_profile = "{{ terraform_aws_profile }}"
 aws_region = "{{ terraform_aws_region }}"
 aws_availability_zone = "{{ terraform_aws_av_zone }}"
+{% if terraform_aws_use_gpu_ami is defined and terraform_aws_use_gpu_ami %}
+aws_name_search = "{{ terraform_aws_gpu_ami_name }}"
+aws_ami_owner = "{{ terraform_aws_gpu_ami_owner }}"
+{% else %}
 aws_name_search = "{{ terraform_aws_ns }}"
 aws_ami_owner = "{{ terraform_aws_ami_owner }}"
+{% endif %}
 aws_instance_type = "{{ terraform_aws_instance_type }}"
 aws_ebs_volumes_per_instance = "{{ terraform_aws_ebs_volumes_per_instance }}"
 aws_ebs_volume_size = {{ terraform_aws_ebs_volume_size }}
diff --git a/scripts/aws-cli b/scripts/aws-cli
new file mode 100755
index 00000000..6cacce8b
--- /dev/null
+++ b/scripts/aws-cli
@@ -0,0 +1,436 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: MIT
+"""
+AWS CLI tool for kdevops
+
+A structured CLI tool that wraps AWS CLI commands and provides access to
+AWS cloud provider functionality for dynamic configuration generation
+and resource management.
+"""
+
+import argparse
+import json
+import sys
+import os
+from typing import Dict, List, Any, Optional, Tuple
+from pathlib import Path
+
+# Import the AWS API functions
+try:
+    from aws_api import (
+        check_aws_cli,
+        get_instance_types,
+        get_regions,
+        get_availability_zones,
+        get_pricing_info,
+        generate_instance_types_kconfig,
+        generate_regions_kconfig,
+        generate_instance_families_kconfig,
+        generate_gpu_amis_kconfig,
+    )
+except ImportError:
+    # Try to import from scripts directory if not in path
+    sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+    from aws_api import (
+        check_aws_cli,
+        get_instance_types,
+        get_regions,
+        get_availability_zones,
+        get_pricing_info,
+        generate_instance_types_kconfig,
+        generate_regions_kconfig,
+        generate_instance_families_kconfig,
+        generate_gpu_amis_kconfig,
+    )
+
+
+class AWSCLI:
+    """AWS CLI interface for kdevops"""
+
+    def __init__(self, output_format: str = "json"):
+        """
+        Initialize the CLI with specified output format
+
+        Args:
+            output_format: 'json' or 'text' for output formatting
+        """
+        self.output_format = output_format
+        self.aws_available = check_aws_cli()
+
+    def output(self, data: Any, headers: Optional[List[str]] = None):
+        """
+        Output data in the specified format
+
+        Args:
+            data: Data to output (dict, list, or primitive)
+            headers: Column headers for text format (optional)
+        """
+        if self.output_format == "json":
+            print(json.dumps(data, indent=2))
+        else:
+            # Human-readable text
format + if isinstance(data, list): + if data and isinstance(data[0], dict): + # Table format for list of dicts + if not headers: + headers = list(data[0].keys()) if data else [] + + if headers: + # Calculate column widths + widths = {h: len(h) for h in headers} + for item in data: + for h in headers: + val = str(item.get(h, "")) + widths[h] = max(widths[h], len(val)) + + # Print header + header_line = " | ".join(h.ljust(widths[h]) for h in headers) + print(header_line) + print("-" * len(header_line)) + + # Print rows + for item in data: + row = " | ".join( + str(item.get(h, "")).ljust(widths[h]) for h in headers + ) + print(row) + else: + # Simple list + for item in data: + print(item) + elif isinstance(data, dict): + # Key-value format + max_key_len = max(len(k) for k in data.keys()) if data else 0 + for key, value in data.items(): + print(f"{key.ljust(max_key_len)} : {value}") + else: + # Simple value + print(data) + + def list_instance_types( + self, + family: Optional[str] = None, + region: Optional[str] = None, + max_results: int = 100, + ) -> List[Dict[str, Any]]: + """ + List instance types + + Args: + family: Filter by instance family (e.g., 'm5', 't3') + region: AWS region to query + max_results: Maximum number of results to return + + Returns: + List of instance type information + """ + if not self.aws_available: + return [ + { + "error": "AWS CLI not found. Please install AWS CLI and configure credentials." 
+ } + ] + + instances = get_instance_types( + family=family, region=region, max_results=max_results + ) + + # Format the results + result = [] + for instance in instances: + item = { + "name": instance.get("InstanceType", ""), + "vcpu": instance.get("VCpuInfo", {}).get("DefaultVCpus", 0), + "memory_gb": instance.get("MemoryInfo", {}).get("SizeInMiB", 0) / 1024, + "instance_storage": instance.get("InstanceStorageSupported", False), + "network_performance": instance.get("NetworkInfo", {}).get( + "NetworkPerformance", "" + ), + "architecture": ", ".join( + instance.get("ProcessorInfo", {}).get("SupportedArchitectures", []) + ), + } + result.append(item) + + # Sort by name + result.sort(key=lambda x: x["name"]) + + return result + + def list_regions(self, include_zones: bool = False) -> List[Dict[str, Any]]: + """ + List regions + + Args: + include_zones: Include availability zones for each region + + Returns: + List of region information + """ + if not self.aws_available: + return [ + { + "error": "AWS CLI not found. Please install AWS CLI and configure credentials." 
+ } + ] + + regions = get_regions() + + result = [] + for region in regions: + item = { + "name": region.get("RegionName", ""), + "endpoint": region.get("Endpoint", ""), + "opt_in_status": region.get("OptInStatus", ""), + } + + if include_zones: + # Get availability zones for this region + zones = get_availability_zones(region["RegionName"]) + item["zones"] = len(zones) + item["zone_names"] = ", ".join([z["ZoneName"] for z in zones]) + + result.append(item) + + return result + + def get_cheapest_instance( + self, + region: Optional[str] = None, + family: Optional[str] = None, + min_vcpus: int = 2, + ) -> Dict[str, Any]: + """ + Get the cheapest instance meeting criteria + + Args: + region: AWS region + family: Instance family filter + min_vcpus: Minimum number of vCPUs required + + Returns: + Dictionary with instance information + """ + if not self.aws_available: + return {"error": "AWS CLI not available"} + + instances = get_instance_types(family=family, region=region) + + # Filter by minimum vCPUs + eligible = [] + for instance in instances: + vcpus = instance.get("VCpuInfo", {}).get("DefaultVCpus", 0) + if vcpus >= min_vcpus: + eligible.append(instance) + + if not eligible: + return {"error": "No instances found matching criteria"} + + # Get pricing for eligible instances + pricing = get_pricing_info(region=region or "us-east-1") + + # Find cheapest + cheapest = None + cheapest_price = float("inf") + + for instance in eligible: + instance_type = instance.get("InstanceType") + price = pricing.get(instance_type, {}).get("on_demand", float("inf")) + if price < cheapest_price: + cheapest_price = price + cheapest = instance + + if cheapest: + return { + "instance_type": cheapest.get("InstanceType"), + "vcpus": cheapest.get("VCpuInfo", {}).get("DefaultVCpus", 0), + "memory_gb": cheapest.get("MemoryInfo", {}).get("SizeInMiB", 0) / 1024, + "price_per_hour": f"${cheapest_price:.3f}", + } + + return {"error": "Could not determine cheapest instance"} + + def 
generate_kconfig(self) -> bool: + """ + Generate Kconfig files for AWS + + Returns: + True on success, False on failure + """ + if not self.aws_available: + print("AWS CLI not available, cannot generate Kconfig", file=sys.stderr) + return False + + output_dir = Path("terraform/aws/kconfigs") + + # Create directory if it doesn't exist + output_dir.mkdir(parents=True, exist_ok=True) + + try: + from concurrent.futures import ThreadPoolExecutor, as_completed + + # Generate files in parallel + instance_types_dir = output_dir / "instance-types" + instance_types_dir.mkdir(exist_ok=True) + + def generate_family_file(family): + """Generate Kconfig for a single family.""" + types_kconfig = generate_instance_types_kconfig(family) + if types_kconfig: + types_file = instance_types_dir / f"Kconfig.{family}.generated" + types_file.write_text(types_kconfig) + return f"Generated {types_file}" + return None + + with ThreadPoolExecutor(max_workers=10) as executor: + # Submit all generation tasks + futures = [] + + # Generate instance families Kconfig + futures.append(executor.submit(generate_instance_families_kconfig)) + + # Generate regions Kconfig + futures.append(executor.submit(generate_regions_kconfig)) + + # Generate GPU AMIs Kconfig + futures.append(executor.submit(generate_gpu_amis_kconfig)) + + # Generate instance types for each family + # Get all families dynamically from AWS + from aws_api import get_generated_instance_families + + families = get_generated_instance_families() + + family_futures = [] + for family in sorted(families): + family_futures.append(executor.submit(generate_family_file, family)) + + # Process main config results + families_kconfig = futures[0].result() + regions_kconfig = futures[1].result() + gpu_amis_kconfig = futures[2].result() + + # Write main configs + families_file = output_dir / "Kconfig.compute.generated" + families_file.write_text(families_kconfig) + print(f"Generated {families_file}") + + regions_file = output_dir / 
"Kconfig.location.generated" + regions_file.write_text(regions_kconfig) + print(f"Generated {regions_file}") + + gpu_amis_file = output_dir / "Kconfig.gpu-amis.generated" + gpu_amis_file.write_text(gpu_amis_kconfig) + print(f"Generated {gpu_amis_file}") + + # Process family results + for future in family_futures: + result = future.result() + if result: + print(result) + + return True + + except Exception as e: + print(f"Error generating Kconfig: {e}", file=sys.stderr) + return False + + +def main(): + """Main entry point""" + parser = argparse.ArgumentParser( + description="AWS CLI tool for kdevops", + formatter_class=argparse.RawDescriptionHelpFormatter, + ) + + parser.add_argument( + "--output", + choices=["json", "text"], + default="json", + help="Output format (default: json)", + ) + + subparsers = parser.add_subparsers(dest="command", help="Available commands") + + # Generate Kconfig command + kconfig_parser = subparsers.add_parser( + "generate-kconfig", help="Generate Kconfig files for AWS" + ) + + # Instance types command + instances_parser = subparsers.add_parser( + "instance-types", help="Manage instance types" + ) + instances_subparsers = instances_parser.add_subparsers( + dest="subcommand", help="Instance type operations" + ) + + # Instance types list + list_instances = instances_subparsers.add_parser("list", help="List instance types") + list_instances.add_argument("--family", help="Filter by instance family") + list_instances.add_argument("--region", help="AWS region") + list_instances.add_argument( + "--max-results", type=int, default=100, help="Maximum results (default: 100)" + ) + + # Regions command + regions_parser = subparsers.add_parser("regions", help="Manage regions") + regions_subparsers = regions_parser.add_subparsers( + dest="subcommand", help="Region operations" + ) + + # Regions list + list_regions = regions_subparsers.add_parser("list", help="List regions") + list_regions.add_argument( + "--include-zones", + action="store_true", + 
help="Include availability zones", + ) + + # Cheapest instance command + cheapest_parser = subparsers.add_parser( + "cheapest", help="Find cheapest instance meeting criteria" + ) + cheapest_parser.add_argument("--region", help="AWS region") + cheapest_parser.add_argument("--family", help="Instance family") + cheapest_parser.add_argument( + "--min-vcpus", type=int, default=2, help="Minimum vCPUs (default: 2)" + ) + + args = parser.parse_args() + + cli = AWSCLI(output_format=args.output) + + if args.command == "generate-kconfig": + success = cli.generate_kconfig() + sys.exit(0 if success else 1) + + elif args.command == "instance-types": + if args.subcommand == "list": + instances = cli.list_instance_types( + family=args.family, + region=args.region, + max_results=args.max_results, + ) + cli.output(instances) + + elif args.command == "regions": + if args.subcommand == "list": + regions = cli.list_regions(include_zones=args.include_zones) + cli.output(regions) + + elif args.command == "cheapest": + result = cli.get_cheapest_instance( + region=args.region, + family=args.family, + min_vcpus=args.min_vcpus, + ) + cli.output(result) + + else: + parser.print_help() + sys.exit(1) + + +if __name__ == "__main__": + main() diff --git a/scripts/aws_api.py b/scripts/aws_api.py new file mode 100755 index 00000000..1cf42f39 --- /dev/null +++ b/scripts/aws_api.py @@ -0,0 +1,1135 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: MIT +""" +AWS API library for kdevops. + +Provides AWS CLI wrapper functions for dynamic configuration generation. +Used by aws-cli and other kdevops components. 
+"""
+
+import json
+import os
+import re
+import subprocess
+import sys
+from typing import Dict, List, Optional, Any
+
+
+def check_aws_cli() -> bool:
+    """Check if AWS CLI is installed and configured."""
+    try:
+        # Check if AWS CLI is installed
+        result = subprocess.run(
+            ["aws", "--version"],
+            capture_output=True,
+            text=True,
+            check=False,
+        )
+        if result.returncode != 0:
+            return False
+
+        # Check if credentials are configured
+        result = subprocess.run(
+            ["aws", "sts", "get-caller-identity"],
+            capture_output=True,
+            text=True,
+            check=False,
+        )
+        return result.returncode == 0
+    except FileNotFoundError:
+        return False
+
+
+def get_default_region() -> str:
+    """Get the default AWS region from configuration or environment."""
+    # Try to get from environment
+    region = os.environ.get("AWS_DEFAULT_REGION")
+    if region:
+        return region
+
+    # Try to get from AWS config
+    try:
+        result = subprocess.run(
+            ["aws", "configure", "get", "region"],
+            capture_output=True,
+            text=True,
+            check=False,
+        )
+        if result.returncode == 0 and result.stdout.strip():
+            return result.stdout.strip()
+    except (OSError, subprocess.SubprocessError):
+        pass
+
+    # Default to us-east-1
+    return "us-east-1"
+
+
+def run_aws_command(command: List[str], region: Optional[str] = None) -> Optional[Dict]:
+    """
+    Run an AWS CLI command and return the JSON output.
+ + Args: + command: AWS CLI command as a list + region: Optional AWS region + + Returns: + Parsed JSON output or None on error + """ + cmd = ["aws"] + command + ["--output", "json"] + + # Always specify a region (use default if not provided) + if not region: + region = get_default_region() + cmd.extend(["--region", region]) + + try: + result = subprocess.run( + cmd, + capture_output=True, + text=True, + check=False, + ) + if result.returncode == 0: + return json.loads(result.stdout) if result.stdout else {} + else: + print(f"AWS command failed: {result.stderr}", file=sys.stderr) + return None + except (subprocess.SubprocessError, json.JSONDecodeError) as e: + print(f"Error running AWS command: {e}", file=sys.stderr) + return None + + +def get_regions() -> List[Dict[str, Any]]: + """Get available AWS regions.""" + response = run_aws_command(["ec2", "describe-regions"]) + if response and "Regions" in response: + return response["Regions"] + return [] + + +def get_availability_zones(region: str) -> List[Dict[str, Any]]: + """Get availability zones for a specific region.""" + response = run_aws_command( + ["ec2", "describe-availability-zones"], + region=region, + ) + if response and "AvailabilityZones" in response: + return response["AvailabilityZones"] + return [] + + +def get_instance_types( + family: Optional[str] = None, + region: Optional[str] = None, + max_results: int = 100, + fetch_all: bool = True, +) -> List[Dict[str, Any]]: + """ + Get available instance types. 
+ + Args: + family: Instance family filter (e.g., 'm5', 't3') + region: AWS region + max_results: Maximum number of results per API call (max 100) + fetch_all: If True, fetch all pages using NextToken pagination + + Returns: + List of instance type information + """ + all_instances = [] + next_token = None + page_count = 0 + + # Ensure max_results doesn't exceed AWS limit + max_results = min(max_results, 100) + + while True: + cmd = ["ec2", "describe-instance-types"] + + filters = [] + if family: + # Filter by instance type pattern + filters.append(f"Name=instance-type,Values={family}*") + + if filters: + cmd.append("--filters") + cmd.extend(filters) + + cmd.extend(["--max-results", str(max_results)]) + + if next_token: + cmd.extend(["--next-token", next_token]) + + response = run_aws_command(cmd, region=region) + if response and "InstanceTypes" in response: + batch_size = len(response["InstanceTypes"]) + all_instances.extend(response["InstanceTypes"]) + page_count += 1 + + if fetch_all and not family: + # Only show progress for full fetches (not family-specific) + print( + f" Fetched page {page_count}: {batch_size} instance types (total: {len(all_instances)})", + file=sys.stderr, + ) + + # Check if there are more results + if fetch_all and "NextToken" in response: + next_token = response["NextToken"] + else: + break + else: + break + + if fetch_all and page_count > 1: + filter_desc = f" for family '{family}'" if family else "" + print( + f" Total: {len(all_instances)} instance types fetched{filter_desc}", + file=sys.stderr, + ) + + return all_instances + + +def get_pricing_info(region: str = "us-east-1") -> Dict[str, Dict[str, float]]: + """ + Get pricing information for instance types. + + Note: AWS Pricing API requires us-east-1 region. + Returns a simplified pricing structure. 
+ + Args: + region: AWS region for pricing + + Returns: + Dictionary mapping instance types to pricing info + """ + # For simplicity, we'll use hardcoded common instance prices + # In production, you'd query the AWS Pricing API + pricing = { + # T3 family (burstable) + "t3.nano": {"on_demand": 0.0052}, + "t3.micro": {"on_demand": 0.0104}, + "t3.small": {"on_demand": 0.0208}, + "t3.medium": {"on_demand": 0.0416}, + "t3.large": {"on_demand": 0.0832}, + "t3.xlarge": {"on_demand": 0.1664}, + "t3.2xlarge": {"on_demand": 0.3328}, + # T3a family (AMD) + "t3a.nano": {"on_demand": 0.0047}, + "t3a.micro": {"on_demand": 0.0094}, + "t3a.small": {"on_demand": 0.0188}, + "t3a.medium": {"on_demand": 0.0376}, + "t3a.large": {"on_demand": 0.0752}, + "t3a.xlarge": {"on_demand": 0.1504}, + "t3a.2xlarge": {"on_demand": 0.3008}, + # M5 family (general purpose Intel) + "m5.large": {"on_demand": 0.096}, + "m5.xlarge": {"on_demand": 0.192}, + "m5.2xlarge": {"on_demand": 0.384}, + "m5.4xlarge": {"on_demand": 0.768}, + "m5.8xlarge": {"on_demand": 1.536}, + "m5.12xlarge": {"on_demand": 2.304}, + "m5.16xlarge": {"on_demand": 3.072}, + "m5.24xlarge": {"on_demand": 4.608}, + # M7a family (general purpose AMD) + "m7a.medium": {"on_demand": 0.0464}, + "m7a.large": {"on_demand": 0.0928}, + "m7a.xlarge": {"on_demand": 0.1856}, + "m7a.2xlarge": {"on_demand": 0.3712}, + "m7a.4xlarge": {"on_demand": 0.7424}, + "m7a.8xlarge": {"on_demand": 1.4848}, + "m7a.12xlarge": {"on_demand": 2.2272}, + "m7a.16xlarge": {"on_demand": 2.9696}, + "m7a.24xlarge": {"on_demand": 4.4544}, + "m7a.32xlarge": {"on_demand": 5.9392}, + "m7a.48xlarge": {"on_demand": 8.9088}, + # C5 family (compute optimized) + "c5.large": {"on_demand": 0.085}, + "c5.xlarge": {"on_demand": 0.17}, + "c5.2xlarge": {"on_demand": 0.34}, + "c5.4xlarge": {"on_demand": 0.68}, + "c5.9xlarge": {"on_demand": 1.53}, + "c5.12xlarge": {"on_demand": 2.04}, + "c5.18xlarge": {"on_demand": 3.06}, + "c5.24xlarge": {"on_demand": 4.08}, + # C7a family (compute 
optimized AMD) + "c7a.medium": {"on_demand": 0.0387}, + "c7a.large": {"on_demand": 0.0774}, + "c7a.xlarge": {"on_demand": 0.1548}, + "c7a.2xlarge": {"on_demand": 0.3096}, + "c7a.4xlarge": {"on_demand": 0.6192}, + "c7a.8xlarge": {"on_demand": 1.2384}, + "c7a.12xlarge": {"on_demand": 1.8576}, + "c7a.16xlarge": {"on_demand": 2.4768}, + "c7a.24xlarge": {"on_demand": 3.7152}, + "c7a.32xlarge": {"on_demand": 4.9536}, + "c7a.48xlarge": {"on_demand": 7.4304}, + # I4i family (storage optimized) + "i4i.large": {"on_demand": 0.117}, + "i4i.xlarge": {"on_demand": 0.234}, + "i4i.2xlarge": {"on_demand": 0.468}, + "i4i.4xlarge": {"on_demand": 0.936}, + "i4i.8xlarge": {"on_demand": 1.872}, + "i4i.16xlarge": {"on_demand": 3.744}, + "i4i.32xlarge": {"on_demand": 7.488}, + } + + # Adjust pricing based on region (simplified) + # Some regions are more expensive than others + region_multipliers = { + "us-east-1": 1.0, + "us-east-2": 1.0, + "us-west-1": 1.08, + "us-west-2": 1.0, + "eu-west-1": 1.1, + "eu-central-1": 1.15, + "ap-southeast-1": 1.2, + "ap-northeast-1": 1.25, + } + + multiplier = region_multipliers.get(region, 1.1) + if multiplier != 1.0: + adjusted_pricing = {} + for instance_type, prices in pricing.items(): + adjusted_pricing[instance_type] = { + "on_demand": prices["on_demand"] * multiplier + } + return adjusted_pricing + + return pricing + + +def sanitize_kconfig_name(name: str) -> str: + """Convert a name to a valid Kconfig symbol.""" + # Replace special characters with underscores + name = name.replace("-", "_").replace(".", "_").replace(" ", "_") + # Convert to uppercase + name = name.upper() + # Remove any non-alphanumeric characters (except underscore) + name = "".join(c for c in name if c.isalnum() or c == "_") + # Ensure it doesn't start with a number + if name and name[0].isdigit(): + name = "_" + name + return name + + +# Cache for instance families to avoid redundant API calls +_cached_families = None + + +def get_generated_instance_families() -> set: + """Get 
the set of instance families that will have generated Kconfig files."""
+    global _cached_families
+
+    # Return cached result if available
+    if _cached_families is not None:
+        return _cached_families
+
+    # Return all families - we'll generate Kconfig files for all of them
+    # This function will be called by the aws-cli tool to determine which files to generate
+    if not check_aws_cli():
+        # Return a minimal set if AWS CLI is not available
+        _cached_families = {"m5", "t3", "c5"}
+        return _cached_families
+
+    # Get all available instance types
+    print(" Discovering available instance families...", file=sys.stderr)
+    instance_types = get_instance_types(fetch_all=True)
+
+    # Extract unique families
+    families = set()
+    for instance_type in instance_types:
+        type_name = instance_type.get("InstanceType", "")
+        # Extract family prefix (e.g., "m5" from "m5.large")
+        if "." in type_name:
+            family = type_name.split(".")[0]
+            families.add(family)
+
+    print(f" Found {len(families)} instance families", file=sys.stderr)
+    _cached_families = families
+    return families
+
+
+def generate_instance_families_kconfig() -> str:
+    """Generate Kconfig content for AWS instance families."""
+    # Check if AWS CLI is available
+    if not check_aws_cli():
+        return generate_default_instance_families_kconfig()
+
+    # Get all available instance types (with pagination)
+    instance_types = get_instance_types(fetch_all=True)
+
+    # Extract unique families
+    families = set()
+    family_info = {}
+    for instance in instance_types:
+        instance_type = instance.get("InstanceType", "")
+        if "." in instance_type:
+            family = instance_type.split(".")[0]
+            families.add(family)
+            if family not in family_info:
+                family_info[family] = {
+                    "architectures": set(),
+                    "count": 0,
+                }
+            family_info[family]["count"] += 1
+            for arch in instance.get("ProcessorInfo", {}).get(
+                "SupportedArchitectures", []
+            ):
+                family_info[family]["architectures"].add(arch)
+
+    if not families:
+        return generate_default_instance_families_kconfig()
+
+    # Group families by category - multi-letter prefixes are matched first
+    def categorize_family(family_name):
+        """Categorize a family based on its prefix."""
+        if family_name.startswith(("mac", "hpc")):
+            return "specialized"
+        elif family_name.startswith(("p", "g", "dl", "trn", "inf", "vt", "f")):
+            return "accelerated"
+        elif family_name.startswith(("m", "t")):
+            return "general_purpose"
+        elif family_name.startswith("c"):
+            return "compute_optimized"
+        elif family_name.startswith(("r", "x", "z")):
+            return "memory_optimized"
+        elif family_name.startswith(("i", "d", "h")):
+            return "storage_optimized"
+        else:
+            return "other"
+
+    # Organize families by category
+    categorized_families = {
+        "general_purpose": [],
+        "compute_optimized": [],
+        "memory_optimized": [],
+        "storage_optimized": [],
+        "accelerated": [],
+        "specialized": [],
+        "other": [],
+    }
+
+    for family in sorted(families):
+        category = categorize_family(family)
+        categorized_families[category].append(family)
+
+    kconfig = """# AWS instance families (dynamically generated)
+# Generated by aws-cli from live AWS data
+
+choice
+	prompt "AWS instance family"
+	default TERRAFORM_AWS_INSTANCE_TYPE_M5
+	help
+	  Select the AWS instance family for your deployment.
+	  Different families are optimized for different workloads.
+ +""" + + # Category headers + category_headers = { + "general_purpose": "# General Purpose - balanced compute, memory, and networking\n", + "compute_optimized": "# Compute Optimized - ideal for CPU-intensive applications\n", + "memory_optimized": "# Memory Optimized - for memory-intensive applications\n", + "storage_optimized": "# Storage Optimized - for high sequential read/write workloads\n", + "accelerated": "# Accelerated Computing - GPU and other accelerators\n", + "specialized": "# Specialized - for specific use cases\n", + "other": "# Other instance families\n", + } + + # Add each category of families + for category in [ + "general_purpose", + "compute_optimized", + "memory_optimized", + "storage_optimized", + "accelerated", + "specialized", + "other", + ]: + if categorized_families[category]: + kconfig += category_headers[category] + for family in categorized_families[category]: + kconfig += generate_family_config(family, family_info.get(family, {})) + if category != "other": # Don't add extra newline after the last category + kconfig += "\n" + + kconfig += "\nendchoice\n" + + # Add instance type source includes for each family + # Only include families that we actually generate files for + generated_families = get_generated_instance_families() + kconfig += "\n# Include instance-specific configurations\n" + for family in sorted(families): + # Only add source statement if we generate a file for this family + if family in generated_families: + safe_name = sanitize_kconfig_name(family) + kconfig += f"""if TERRAFORM_AWS_INSTANCE_TYPE_{safe_name} +source "terraform/aws/kconfigs/instance-types/Kconfig.{family}.generated" +endif + +""" + + return kconfig + + +def generate_family_config(family: str, info: Dict) -> str: + """Generate Kconfig entry for an instance family.""" + safe_name = sanitize_kconfig_name(family) + + # Determine architecture dependencies + architectures = info.get("architectures", set()) + depends_line = "" + if architectures: + if "x86_64" in 
architectures and "arm64" not in architectures: + depends_line = "\n\tdepends on TARGET_ARCH_X86_64" + elif "arm64" in architectures and "x86_64" not in architectures: + depends_line = "\n\tdepends on TARGET_ARCH_ARM64" + + # Family descriptions + descriptions = { + "t3": "Burstable performance instances powered by Intel processors", + "t3a": "Burstable performance instances powered by AMD processors", + "m5": "General purpose instances powered by Intel Xeon Platinum processors", + "m7a": "Latest generation general purpose instances powered by AMD EPYC processors", + "c5": "Compute optimized instances powered by Intel Xeon Platinum processors", + "c7a": "Latest generation compute optimized instances powered by AMD EPYC processors", + "i4i": "Storage optimized instances with NVMe SSD storage", + "is4gen": "Storage optimized ARM instances powered by AWS Graviton2", + "im4gn": "Storage optimized ARM instances with NVMe storage", + "r5": "Memory optimized instances powered by Intel Xeon Platinum processors", + "p3": "GPU instances for machine learning and HPC", + "g4dn": "GPU instances for graphics-intensive applications", + } + + description = descriptions.get(family, f"AWS {family.upper()} instance family") + count = info.get("count", 0) + + config = f"""config TERRAFORM_AWS_INSTANCE_TYPE_{safe_name} +\tbool "{family.upper()}" +{depends_line} +\thelp +\t {description} +\t Available instance types: {count} + +""" + return config + + +def generate_default_instance_families_kconfig() -> str: + """Generate default Kconfig content when AWS CLI is not available.""" + return """# AWS instance families (default - AWS CLI not available) + +choice + prompt "AWS instance family" + default TERRAFORM_AWS_INSTANCE_TYPE_M5 + help + Select the AWS instance family for your deployment. + Note: AWS CLI is not available, showing default options. 
+ +config TERRAFORM_AWS_INSTANCE_TYPE_M5 + bool "M5" + depends on TARGET_ARCH_X86_64 + help + General purpose instances powered by Intel Xeon Platinum processors. + +config TERRAFORM_AWS_INSTANCE_TYPE_M7A + bool "M7a" + depends on TARGET_ARCH_X86_64 + help + Latest generation general purpose instances powered by AMD EPYC processors. + +config TERRAFORM_AWS_INSTANCE_TYPE_T3 + bool "T3" + depends on TARGET_ARCH_X86_64 + help + Burstable performance instances powered by Intel processors. + +config TERRAFORM_AWS_INSTANCE_TYPE_C5 + bool "C5" + depends on TARGET_ARCH_X86_64 + help + Compute optimized instances powered by Intel Xeon Platinum processors. + +config TERRAFORM_AWS_INSTANCE_TYPE_I4I + bool "I4i" + depends on TARGET_ARCH_X86_64 + help + Storage optimized instances with NVMe SSD storage. + +endchoice + +# Include instance-specific configurations +if TERRAFORM_AWS_INSTANCE_TYPE_M5 +source "terraform/aws/kconfigs/instance-types/Kconfig.m5" +endif + +if TERRAFORM_AWS_INSTANCE_TYPE_M7A +source "terraform/aws/kconfigs/instance-types/Kconfig.m7a" +endif + +if TERRAFORM_AWS_INSTANCE_TYPE_T3 +source "terraform/aws/kconfigs/instance-types/Kconfig.t3.generated" +endif + +if TERRAFORM_AWS_INSTANCE_TYPE_C5 +source "terraform/aws/kconfigs/instance-types/Kconfig.c5.generated" +endif + +if TERRAFORM_AWS_INSTANCE_TYPE_I4I +source "terraform/aws/kconfigs/instance-types/Kconfig.i4i" +endif + +""" + + +def generate_instance_types_kconfig(family: str) -> str: + """Generate Kconfig content for specific instance types within a family.""" + if not check_aws_cli(): + return "" + + instance_types = get_instance_types(family=family, fetch_all=True) + if not instance_types: + return "" + + # Filter to only exact family matches (e.g., c5a but not c5ad) + filtered_instances = [] + for instance in instance_types: + instance_type = instance.get("InstanceType", "") + if "." 
in instance_type: + inst_family = instance_type.split(".")[0] + if inst_family == family: + filtered_instances.append(instance) + + instance_types = filtered_instances + if not instance_types: + return "" + + pricing = get_pricing_info() + + # Sort by vCPU count and memory + instance_types.sort( + key=lambda x: ( + x.get("VCpuInfo", {}).get("DefaultVCpus", 0), + x.get("MemoryInfo", {}).get("SizeInMiB", 0), + ) + ) + + safe_family = sanitize_kconfig_name(family) + + # Get the first instance type to use as default + default_instance_name = f"{safe_family}_LARGE" # Fallback + if instance_types: + first_instance_type = instance_types[0].get("InstanceType", "") + if "." in first_instance_type: + first_full_name = first_instance_type.replace(".", "_") + default_instance_name = sanitize_kconfig_name(first_full_name) + + kconfig = f"""# AWS {family.upper()} instance sizes (dynamically generated) + +choice +\tprompt "Instance size for {family.upper()} family" +\tdefault TERRAFORM_AWS_INSTANCE_{default_instance_name} +\thelp +\t Select the specific instance size within the {family.upper()} family. + +""" + + seen_configs = set() + for instance in instance_types: + instance_type = instance.get("InstanceType", "") + if "." 
not in instance_type: + continue + + # Get the full instance type name to make unique config names + full_name = instance_type.replace(".", "_") + safe_full_name = sanitize_kconfig_name(full_name) + + # Skip if we've already seen this config name + if safe_full_name in seen_configs: + continue + seen_configs.add(safe_full_name) + + size = instance_type.split(".")[1] + + vcpus = instance.get("VCpuInfo", {}).get("DefaultVCpus", 0) + memory_mib = instance.get("MemoryInfo", {}).get("SizeInMiB", 0) + memory_gb = memory_mib / 1024 + + # Get pricing + price = pricing.get(instance_type, {}).get("on_demand", 0.0) + price_str = f"${price:.3f}/hour" if price > 0 else "pricing varies" + + # Network performance + network = instance.get("NetworkInfo", {}).get("NetworkPerformance", "varies") + + # Storage + storage_info = "" + if instance.get("InstanceStorageSupported"): + storage = instance.get("InstanceStorageInfo", {}) + total_size = storage.get("TotalSizeInGB", 0) + if total_size > 0: + storage_info = f"\n\t Instance storage: {total_size} GB" + + kconfig += f"""config TERRAFORM_AWS_INSTANCE_{safe_full_name} +\tbool "{instance_type}" +\thelp +\t vCPUs: {vcpus} +\t Memory: {memory_gb:.1f} GB +\t Network: {network} +\t Price: {price_str}{storage_info} + +""" + + kconfig += "endchoice\n" + + # Add the actual instance type string config with full instance names + kconfig += f""" +config TERRAFORM_AWS_{safe_family}_SIZE +\tstring +""" + + # Generate default mappings for each seen instance type + for instance in instance_types: + instance_type = instance.get("InstanceType", "") + if "." 
not in instance_type: + continue + + full_name = instance_type.replace(".", "_") + safe_full_name = sanitize_kconfig_name(full_name) + + kconfig += ( + f'\tdefault "{instance_type}" if TERRAFORM_AWS_INSTANCE_{safe_full_name}\n' + ) + + # Use the first instance type as the final fallback default + final_default = f"{family}.large" + if instance_types: + first_instance_type = instance_types[0].get("InstanceType", "") + if first_instance_type: + final_default = first_instance_type + + kconfig += f'\tdefault "{final_default}"\n\n' + + return kconfig + + +def generate_regions_kconfig() -> str: + """Generate Kconfig content for AWS regions.""" + if not check_aws_cli(): + return generate_default_regions_kconfig() + + regions = get_regions() + if not regions: + return generate_default_regions_kconfig() + + kconfig = """# AWS regions (dynamically generated) + +choice + prompt "AWS region" + default TERRAFORM_AWS_REGION_USEAST1 + help + Select the AWS region for your deployment. + Note: Not all instance types are available in all regions. 
+ +""" + + # Group regions by geographic area + us_regions = [] + eu_regions = [] + ap_regions = [] + other_regions = [] + + for region in regions: + region_name = region.get("RegionName", "") + if region_name.startswith("us-"): + us_regions.append(region) + elif region_name.startswith("eu-"): + eu_regions.append(region) + elif region_name.startswith("ap-"): + ap_regions.append(region) + else: + other_regions.append(region) + + # Add US regions + if us_regions: + kconfig += "# US Regions\n" + for region in sorted(us_regions, key=lambda x: x.get("RegionName", "")): + kconfig += generate_region_config(region) + kconfig += "\n" + + # Add EU regions + if eu_regions: + kconfig += "# Europe Regions\n" + for region in sorted(eu_regions, key=lambda x: x.get("RegionName", "")): + kconfig += generate_region_config(region) + kconfig += "\n" + + # Add Asia Pacific regions + if ap_regions: + kconfig += "# Asia Pacific Regions\n" + for region in sorted(ap_regions, key=lambda x: x.get("RegionName", "")): + kconfig += generate_region_config(region) + kconfig += "\n" + + # Add other regions + if other_regions: + kconfig += "# Other Regions\n" + for region in sorted(other_regions, key=lambda x: x.get("RegionName", "")): + kconfig += generate_region_config(region) + + kconfig += "\nendchoice\n" + + # Add the actual region string config + kconfig += """ +config TERRAFORM_AWS_REGION + string +""" + + for region in regions: + region_name = region.get("RegionName", "") + safe_name = sanitize_kconfig_name(region_name) + kconfig += f'\tdefault "{region_name}" if TERRAFORM_AWS_REGION_{safe_name}\n' + + kconfig += '\tdefault "us-east-1"\n' + + return kconfig + + +def generate_region_config(region: Dict) -> str: + """Generate Kconfig entry for a region.""" + region_name = region.get("RegionName", "") + safe_name = sanitize_kconfig_name(region_name) + opt_in_status = region.get("OptInStatus", "") + + # Region display names + display_names = { + "us-east-1": "US East (N. 
Virginia)", + "us-east-2": "US East (Ohio)", + "us-west-1": "US West (N. California)", + "us-west-2": "US West (Oregon)", + "eu-west-1": "Europe (Ireland)", + "eu-west-2": "Europe (London)", + "eu-west-3": "Europe (Paris)", + "eu-central-1": "Europe (Frankfurt)", + "eu-north-1": "Europe (Stockholm)", + "ap-southeast-1": "Asia Pacific (Singapore)", + "ap-southeast-2": "Asia Pacific (Sydney)", + "ap-northeast-1": "Asia Pacific (Tokyo)", + "ap-northeast-2": "Asia Pacific (Seoul)", + "ap-south-1": "Asia Pacific (Mumbai)", + "ca-central-1": "Canada (Central)", + "sa-east-1": "South America (São Paulo)", + } + + display_name = display_names.get(region_name, region_name.replace("-", " ").title()) + + help_text = f"\t Region: {display_name}" + if opt_in_status and opt_in_status != "opt-in-not-required": + help_text += f"\n\t Status: {opt_in_status}" + + config = f"""config TERRAFORM_AWS_REGION_{safe_name} +\tbool "{display_name}" +\thelp +{help_text} + +""" + return config + + +def get_gpu_amis(region: str = None) -> List[Dict[str, Any]]: + """ + Get available GPU-optimized AMIs including Deep Learning AMIs. 
+ + Args: + region: AWS region + + Returns: + List of AMI information + """ + # Query for Deep Learning AMIs from AWS + cmd = ["ec2", "describe-images"] + filters = [ + "Name=owner-alias,Values=amazon", + "Name=name,Values=Deep Learning AMI GPU*", + "Name=state,Values=available", + "Name=architecture,Values=x86_64", + ] + cmd.append("--filters") + cmd.extend(filters) + cmd.extend(["--query", "Images[?contains(Name, '2024') || contains(Name, '2025')]"]) + + response = run_aws_command(cmd, region=region) + + if response: + # Sort by creation date to get the most recent + response.sort(key=lambda x: x.get("CreationDate", ""), reverse=True) + return response[:10] # Return top 10 most recent + return [] + + +def generate_gpu_amis_kconfig() -> str: + """Generate Kconfig content for GPU AMIs.""" + # Check if AWS CLI is available + if not check_aws_cli(): + return generate_default_gpu_amis_kconfig() + + # Get available GPU AMIs + amis = get_gpu_amis() + + if not amis: + return generate_default_gpu_amis_kconfig() + + kconfig = """# GPU-optimized AMIs (dynamically generated) + +# GPU AMI Override - only shown for GPU instances +config TERRAFORM_AWS_USE_GPU_AMI + bool "Use GPU-optimized AMI instead of standard distribution" + depends on TERRAFORM_AWS_IS_GPU_INSTANCE + output yaml + default n + help + Enable this to use a GPU-optimized AMI with pre-installed NVIDIA drivers, + CUDA, and ML frameworks instead of the standard distribution AMI. + + When disabled, the standard distribution AMI will be used and you'll need + to install GPU drivers manually. + +if TERRAFORM_AWS_USE_GPU_AMI + +choice + prompt "GPU-optimized AMI selection" + default TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING + depends on TERRAFORM_AWS_IS_GPU_INSTANCE + help + Select which GPU-optimized AMI to use for your GPU instance. + +config TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING + bool "AWS Deep Learning AMI (Ubuntu 22.04)" + help + AWS Deep Learning AMI with NVIDIA drivers, CUDA, cuDNN, and popular ML frameworks. 
+ Optimized for machine learning workloads on GPU instances. + Includes: TensorFlow, PyTorch, MXNet, and Jupyter. + +config TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING_NVIDIA + bool "NVIDIA Deep Learning AMI" + help + NVIDIA optimized Deep Learning AMI with latest GPU drivers. + Includes NVIDIA GPU Cloud (NGC) containers and frameworks. + +config TERRAFORM_AWS_GPU_AMI_CUSTOM + bool "Custom GPU AMI" + help + Specify a custom AMI ID for GPU instances. + +endchoice + +if TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING + +config TERRAFORM_AWS_GPU_AMI_NAME + string + output yaml + default "Deep Learning AMI GPU TensorFlow*" + help + AMI name pattern for AWS Deep Learning AMI. + +config TERRAFORM_AWS_GPU_AMI_OWNER + string + output yaml + default "amazon" + +endif # TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING + +if TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING_NVIDIA + +config TERRAFORM_AWS_GPU_AMI_NAME + string + output yaml + default "NVIDIA Deep Learning AMI*" + help + AMI name pattern for NVIDIA Deep Learning AMI. + +config TERRAFORM_AWS_GPU_AMI_OWNER + string + output yaml + default "amazon" + +endif # TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING_NVIDIA + +if TERRAFORM_AWS_GPU_AMI_CUSTOM + +config TERRAFORM_AWS_GPU_AMI_ID + string "Custom GPU AMI ID" + output yaml + help + Specify the AMI ID for your custom GPU image. 
+ Example: ami-0123456789abcdef0 + +endif # TERRAFORM_AWS_GPU_AMI_CUSTOM + +endif # TERRAFORM_AWS_USE_GPU_AMI + +# GPU instance detection +config TERRAFORM_AWS_IS_GPU_INSTANCE + bool + output yaml + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6E + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6 + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5 + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5G + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4DN + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4AD + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5 + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5EN + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4D + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4DE + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3 + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3DN + default n + help + Automatically detected based on selected instance type. + This indicates whether the selected instance has GPU support. + +""" + + return kconfig + + +def generate_default_gpu_amis_kconfig() -> str: + """Generate default GPU AMI Kconfig when AWS CLI is not available.""" + return """# GPU-optimized AMIs (default - AWS CLI not available) + +# GPU AMI Override - only shown for GPU instances +config TERRAFORM_AWS_USE_GPU_AMI + bool "Use GPU-optimized AMI instead of standard distribution" + depends on TERRAFORM_AWS_IS_GPU_INSTANCE + output yaml + default n + help + Enable this to use a GPU-optimized AMI with pre-installed NVIDIA drivers, + CUDA, and ML frameworks instead of the standard distribution AMI. + Note: AWS CLI is not available, showing default options. + +if TERRAFORM_AWS_USE_GPU_AMI + +choice + prompt "GPU-optimized AMI selection" + default TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING + depends on TERRAFORM_AWS_IS_GPU_INSTANCE + help + Select which GPU-optimized AMI to use for your GPU instance. + +config TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING + bool "AWS Deep Learning AMI (Ubuntu 22.04)" + help + Pre-configured with NVIDIA drivers, CUDA, and ML frameworks. 
+ +config TERRAFORM_AWS_GPU_AMI_CUSTOM + bool "Custom GPU AMI" + +endchoice + +if TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING + +config TERRAFORM_AWS_GPU_AMI_NAME + string + output yaml + default "Deep Learning AMI GPU TensorFlow*" + +config TERRAFORM_AWS_GPU_AMI_OWNER + string + output yaml + default "amazon" + +endif # TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING + +if TERRAFORM_AWS_GPU_AMI_CUSTOM + +config TERRAFORM_AWS_GPU_AMI_ID + string "Custom GPU AMI ID" + output yaml + help + Specify the AMI ID for your custom GPU image. + +endif # TERRAFORM_AWS_GPU_AMI_CUSTOM + +endif # TERRAFORM_AWS_USE_GPU_AMI + +# GPU instance detection (static) +config TERRAFORM_AWS_IS_GPU_INSTANCE + bool + output yaml + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6E + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6 + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5 + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5G + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4DN + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4AD + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5 + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5EN + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4D + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4DE + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3 + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3DN + default n + help + Automatically detected based on selected instance type. + This indicates whether the selected instance has GPU support. + +""" + + +def generate_default_regions_kconfig() -> str: + """Generate default Kconfig content when AWS CLI is not available.""" + return """# AWS regions (default - AWS CLI not available) + +choice + prompt "AWS region" + default TERRAFORM_AWS_REGION_USEAST1 + help + Select the AWS region for your deployment. + Note: AWS CLI is not available, showing default options. + +# US Regions +config TERRAFORM_AWS_REGION_USEAST1 + bool "US East (N. Virginia)" + +config TERRAFORM_AWS_REGION_USEAST2 + bool "US East (Ohio)" + +config TERRAFORM_AWS_REGION_USWEST1 + bool "US West (N. 
California)" + +config TERRAFORM_AWS_REGION_USWEST2 + bool "US West (Oregon)" + +# Europe Regions +config TERRAFORM_AWS_REGION_EUWEST1 + bool "Europe (Ireland)" + +config TERRAFORM_AWS_REGION_EUCENTRAL1 + bool "Europe (Frankfurt)" + +# Asia Pacific Regions +config TERRAFORM_AWS_REGION_APSOUTHEAST1 + bool "Asia Pacific (Singapore)" + +config TERRAFORM_AWS_REGION_APNORTHEAST1 + bool "Asia Pacific (Tokyo)" + +endchoice + +config TERRAFORM_AWS_REGION + string + default "us-east-1" if TERRAFORM_AWS_REGION_USEAST1 + default "us-east-2" if TERRAFORM_AWS_REGION_USEAST2 + default "us-west-1" if TERRAFORM_AWS_REGION_USWEST1 + default "us-west-2" if TERRAFORM_AWS_REGION_USWEST2 + default "eu-west-1" if TERRAFORM_AWS_REGION_EUWEST1 + default "eu-central-1" if TERRAFORM_AWS_REGION_EUCENTRAL1 + default "ap-southeast-1" if TERRAFORM_AWS_REGION_APSOUTHEAST1 + default "ap-northeast-1" if TERRAFORM_AWS_REGION_APNORTHEAST1 + default "us-east-1" + +""" diff --git a/scripts/dynamic-cloud-kconfig.Makefile b/scripts/dynamic-cloud-kconfig.Makefile index e15651ab..4105e706 100644 --- a/scripts/dynamic-cloud-kconfig.Makefile +++ b/scripts/dynamic-cloud-kconfig.Makefile @@ -12,9 +12,24 @@ LAMBDALABS_KCONFIG_IMAGES := $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.generated LAMBDALABS_KCONFIGS := $(LAMBDALABS_KCONFIG_COMPUTE) $(LAMBDALABS_KCONFIG_LOCATION) $(LAMBDALABS_KCONFIG_IMAGES) +# AWS dynamic configuration +AWS_KCONFIG_DIR := terraform/aws/kconfigs +AWS_KCONFIG_COMPUTE := $(AWS_KCONFIG_DIR)/Kconfig.compute.generated +AWS_KCONFIG_LOCATION := $(AWS_KCONFIG_DIR)/Kconfig.location.generated +AWS_INSTANCE_TYPES_DIR := $(AWS_KCONFIG_DIR)/instance-types + +# List of AWS instance type family files that will be generated +AWS_INSTANCE_TYPE_FAMILIES := m5 m7a t3 t3a c5 c7a i4i is4gen im4gn +AWS_INSTANCE_TYPE_KCONFIGS := $(foreach family,$(AWS_INSTANCE_TYPE_FAMILIES),$(AWS_INSTANCE_TYPES_DIR)/Kconfig.$(family).generated) + +AWS_KCONFIGS := $(AWS_KCONFIG_COMPUTE) $(AWS_KCONFIG_LOCATION) 
$(AWS_INSTANCE_TYPE_KCONFIGS) + # Add Lambda Labs generated files to mrproper clean list KDEVOPS_MRPROPER += $(LAMBDALABS_KCONFIGS) +# Add AWS generated files to mrproper clean list +KDEVOPS_MRPROPER += $(AWS_KCONFIGS) + # Touch Lambda Labs generated files so Kconfig can source them # This ensures the files exist (even if empty) before Kconfig runs dynamic_lambdalabs_kconfig_touch: @@ -22,20 +37,43 @@ dynamic_lambdalabs_kconfig_touch: DYNAMIC_KCONFIG += dynamic_lambdalabs_kconfig_touch +# Touch AWS generated files so Kconfig can source them +# This ensures the files exist (even if empty) before Kconfig runs +dynamic_aws_kconfig_touch: + $(Q)mkdir -p $(AWS_INSTANCE_TYPES_DIR) + $(Q)touch $(AWS_KCONFIG_COMPUTE) $(AWS_KCONFIG_LOCATION) + $(Q)touch $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.generated + $(Q)for family in $(AWS_INSTANCE_TYPE_FAMILIES); do \ + touch $(AWS_INSTANCE_TYPES_DIR)/Kconfig.$$family.generated; \ + done + +DYNAMIC_KCONFIG += dynamic_aws_kconfig_touch + # Individual Lambda Labs targets are now handled by generate_cloud_configs.py cloud-config-lambdalabs: $(Q)python3 scripts/generate_cloud_configs.py +# Individual AWS targets are now handled by generate_cloud_configs.py +cloud-config-aws: + $(Q)python3 scripts/generate_cloud_configs.py + # Clean Lambda Labs generated files clean-cloud-config-lambdalabs: $(Q)rm -f $(LAMBDALABS_KCONFIGS) -DYNAMIC_CLOUD_KCONFIG += cloud-config-lambdalabs +# Clean AWS generated files +clean-cloud-config-aws: + $(Q)rm -f $(AWS_KCONFIGS) + $(Q)rm -f .aws_cloud_config_generated + +DYNAMIC_CLOUD_KCONFIG += cloud-config-lambdalabs cloud-config-aws cloud-config-help: @echo "Cloud-specific dynamic kconfig targets:" @echo "cloud-config - generates all cloud provider dynamic kconfig content" @echo "cloud-config-lambdalabs - generates Lambda Labs dynamic kconfig content" + @echo "cloud-config-aws - generates AWS dynamic kconfig content" + @echo "cloud-update - converts generated cloud configs to static (for committing)" @echo 
"clean-cloud-config - removes all generated cloud kconfig files" @echo "cloud-list-all - list all cloud instances for configured provider" @@ -44,11 +82,55 @@ HELP_TARGETS += cloud-config-help cloud-config: $(Q)python3 scripts/generate_cloud_configs.py -clean-cloud-config: clean-cloud-config-lambdalabs +clean-cloud-config: clean-cloud-config-lambdalabs clean-cloud-config-aws + $(Q)rm -f .cloud.initialized $(Q)echo "Cleaned all cloud provider dynamic Kconfig files." cloud-list-all: $(Q)chmod +x scripts/cloud_list_all.sh $(Q)scripts/cloud_list_all.sh -PHONY += cloud-config cloud-config-lambdalabs clean-cloud-config clean-cloud-config-lambdalabs cloud-config-help cloud-list-all +# Convert dynamically generated cloud configs to static versions for git commits +# This allows admins to generate configs once and commit them for regular users +cloud-update: + @echo "Converting generated cloud configs to static versions..." + # AWS configs + $(Q)if [ -f $(AWS_KCONFIG_COMPUTE) ]; then \ + cp $(AWS_KCONFIG_COMPUTE) $(AWS_KCONFIG_DIR)/Kconfig.compute.static; \ + sed -i 's/Kconfig\.\([^.]*\)\.generated/Kconfig.\1.static/g' $(AWS_KCONFIG_DIR)/Kconfig.compute.static; \ + echo " Created $(AWS_KCONFIG_DIR)/Kconfig.compute.static"; \ + fi + $(Q)if [ -f $(AWS_KCONFIG_LOCATION) ]; then \ + cp $(AWS_KCONFIG_LOCATION) $(AWS_KCONFIG_DIR)/Kconfig.location.static; \ + sed -i 's/Kconfig\.\([^.]*\)\.generated/Kconfig.\1.static/g' $(AWS_KCONFIG_DIR)/Kconfig.location.static; \ + echo " Created $(AWS_KCONFIG_DIR)/Kconfig.location.static"; \ + fi + $(Q)if [ -f $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.generated ]; then \ + cp $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.generated $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.static; \ + sed -i 's/Kconfig\.\([^.]*\)\.generated/Kconfig.\1.static/g' $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.static; \ + echo " Created $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.static"; \ + fi + # AWS instance type families + $(Q)for file in $(AWS_INSTANCE_TYPES_DIR)/Kconfig.*.generated; do \ + if [ -f 
"$$file" ]; then \ + static_file=$$(echo "$$file" | sed 's/\.generated$$/\.static/'); \ + cp "$$file" "$$static_file"; \ + echo " Created $$static_file"; \ + fi; \ + done + # Lambda Labs configs + $(Q)if [ -f $(LAMBDALABS_KCONFIG_COMPUTE) ]; then \ + cp $(LAMBDALABS_KCONFIG_COMPUTE) $(LAMBDALABS_KCONFIG_DIR)/Kconfig.compute.static; \ + echo " Created $(LAMBDALABS_KCONFIG_DIR)/Kconfig.compute.static"; \ + fi + $(Q)if [ -f $(LAMBDALABS_KCONFIG_LOCATION) ]; then \ + cp $(LAMBDALABS_KCONFIG_LOCATION) $(LAMBDALABS_KCONFIG_DIR)/Kconfig.location.static; \ + echo " Created $(LAMBDALABS_KCONFIG_DIR)/Kconfig.location.static"; \ + fi + $(Q)if [ -f $(LAMBDALABS_KCONFIG_IMAGES) ]; then \ + cp $(LAMBDALABS_KCONFIG_IMAGES) $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.static; \ + echo " Created $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.static"; \ + fi + @echo "Static cloud configs created. You can now commit these .static files to git." + +PHONY += cloud-config cloud-config-lambdalabs cloud-config-aws clean-cloud-config clean-cloud-config-lambdalabs clean-cloud-config-aws cloud-config-help cloud-list-all cloud-update diff --git a/scripts/generate_cloud_configs.py b/scripts/generate_cloud_configs.py index b16294dd..332cebe7 100755 --- a/scripts/generate_cloud_configs.py +++ b/scripts/generate_cloud_configs.py @@ -10,6 +10,9 @@ import os import sys import subprocess import json +from concurrent.futures import ThreadPoolExecutor, as_completed +from pathlib import Path +from typing import Tuple def generate_lambdalabs_kconfig() -> bool: @@ -100,29 +103,194 @@ def get_lambdalabs_summary() -> tuple[bool, str]: return False, "Lambda Labs: Error querying API - using defaults" +def generate_aws_kconfig() -> bool: + """ + Generate AWS Kconfig files. + Returns True on success, False on failure. 
+ """ + script_dir = os.path.dirname(os.path.abspath(__file__)) + cli_path = os.path.join(script_dir, "aws-cli") + + # Generate the Kconfig files + result = subprocess.run( + [cli_path, "generate-kconfig"], + capture_output=True, + text=True, + check=False, + ) + + return result.returncode == 0 + + +def get_aws_summary() -> tuple[bool, str]: + """ + Get a summary of AWS configurations using aws-cli. + Returns (success, summary_string) + """ + script_dir = os.path.dirname(os.path.abspath(__file__)) + cli_path = os.path.join(script_dir, "aws-cli") + + try: + # Check if AWS CLI is available + result = subprocess.run( + ["aws", "--version"], + capture_output=True, + text=True, + check=False, + ) + + if result.returncode != 0: + return False, "AWS: AWS CLI not installed - using defaults" + + # Check if credentials are configured + result = subprocess.run( + ["aws", "sts", "get-caller-identity"], + capture_output=True, + text=True, + check=False, + ) + + if result.returncode != 0: + return False, "AWS: Credentials not configured - using defaults" + + # Get instance types count + result = subprocess.run( + [ + cli_path, + "--output", + "json", + "instance-types", + "list", + "--max-results", + "100", + ], + capture_output=True, + text=True, + check=False, + ) + + if result.returncode != 0: + return False, "AWS: Error querying API - using defaults" + + instances = json.loads(result.stdout) + instance_count = len(instances) + + # Get regions + result = subprocess.run( + [cli_path, "--output", "json", "regions", "list"], + capture_output=True, + text=True, + check=False, + ) + + if result.returncode == 0: + regions = json.loads(result.stdout) + region_count = len(regions) + else: + region_count = 0 + + # Get price range from a sample of instances + prices = [] + for instance in instances[:20]: # Sample first 20 for speed + if "error" not in instance: + # Extract price if available (would need pricing API) + # For now, we'll use placeholder + vcpus = instance.get("vcpu", 0) + 
if vcpus > 0: + # Rough estimate: $0.05 per vCPU/hour + estimated_price = vcpus * 0.05 + prices.append(estimated_price) + + # Format summary + if prices: + min_price = min(prices) + max_price = max(prices) + price_range = f"~${min_price:.2f}-${max_price:.2f}/hr" + else: + price_range = "pricing varies by region" + + return ( + True, + f"AWS: {instance_count} instance types available, " + f"{region_count} regions, {price_range}", + ) + + except (subprocess.SubprocessError, json.JSONDecodeError, KeyError): + return False, "AWS: Error querying API - using defaults" + + +def process_lambdalabs() -> Tuple[bool, bool, str]: + """Process Lambda Labs configuration generation and summary. + Returns (kconfig_generated, summary_success, summary_text) + """ + kconfig_generated = generate_lambdalabs_kconfig() + success, summary = get_lambdalabs_summary() + return kconfig_generated, success, summary + + +def process_aws() -> Tuple[bool, bool, str]: + """Process AWS configuration generation and summary. 
+    Returns (kconfig_generated, summary_success, summary_text)
+    """
+    kconfig_generated = generate_aws_kconfig()
+    success, summary = get_aws_summary()
+
+    # Create marker file to indicate dynamic AWS config is available
+    if kconfig_generated:
+        marker_file = Path(".aws_cloud_config_generated")
+        marker_file.touch()
+
+    return kconfig_generated, success, summary
+
+
 def main():
     """Main function to generate cloud configurations."""
     print("Cloud Provider Configuration Summary")
     print("=" * 60)
     print()

-    # Lambda Labs - Generate Kconfig files first
-    kconfig_generated = generate_lambdalabs_kconfig()
+    # Run cloud provider operations in parallel
+    results = {}
+    any_success = False

-    # Lambda Labs - Get summary
-    success, summary = get_lambdalabs_summary()
-    if success:
-        print(f"✓ {summary}")
-        if kconfig_generated:
-            print(" Kconfig files generated successfully")
-        else:
-            print(" Warning: Failed to generate Kconfig files")
-    else:
-        print(f"⚠ {summary}")
-    print()
+    with ThreadPoolExecutor(max_workers=4) as executor:
+        # Submit all tasks
+        futures = {
+            executor.submit(process_lambdalabs): "lambdalabs",
+            executor.submit(process_aws): "aws",
+        }
+
+        # Process results as they complete
+        for future in as_completed(futures):
+            provider = futures[future]
+            try:
+                results[provider] = future.result()
+            except Exception as e:
+                results[provider] = (
+                    False,
+                    False,
+                    f"{provider.upper()}: Error - {str(e)}",
+                )
+
+    # Display results in consistent order
+    for provider in ["lambdalabs", "aws"]:
+        if provider in results:
+            kconfig_gen, success, summary = results[provider]
+            if success and kconfig_gen:
+                any_success = True
+            if success:
+                print(f"✓ {summary}")
+                if kconfig_gen:
+                    print(" Kconfig files generated successfully")
+                else:
+                    print(" Warning: Failed to generate Kconfig files")
+            else:
+                print(f"⚠ {summary}")
+            print()

-    # AWS (placeholder - not implemented)
-    print("⚠ AWS: Dynamic configuration not yet implemented")
+    # Create .cloud.initialized if any provider succeeded
+    if any_success:
+        Path(".cloud.initialized").touch()

     # Azure (placeholder - not implemented)
     print("⚠ Azure: Dynamic configuration not yet implemented")
diff --git a/terraform/aws/kconfigs/Kconfig.compute b/terraform/aws/kconfigs/Kconfig.compute
index bae0ea1c..6b5ff900 100644
--- a/terraform/aws/kconfigs/Kconfig.compute
+++ b/terraform/aws/kconfigs/Kconfig.compute
@@ -1,94 +1,54 @@
-choice
-	prompt "AWS instance types"
-	help
-	  Instance types comprise varying combinations of hardware
-	  platform, CPU count, memory size, storage, and networking
-	  capacity. Select the type that provides an appropriate mix
-	  of resources for your preferred workflows.
-
-	  Some instance types are region- and capacity-limited.
-
-	  See https://aws.amazon.com/ec2/instance-types/ for
-	  details.
+# AWS compute configuration

-config TERRAFORM_AWS_INSTANCE_TYPE_M5
-	bool "M5"
-	depends on TARGET_ARCH_X86_64
+config TERRAFORM_AWS_USE_DYNAMIC_CONFIG
+	bool "Use dynamically generated instance types"
+	default $(shell, test -f .aws_cloud_config_generated && echo y || echo n)
 	help
-	  This is a general purpose type powered by Intel Xeon®
-	  Platinum 8175M or 8259CL processors (Skylake or Cascade
-	  Lake).
+	  Enable this to use dynamically generated instance types from AWS CLI.
+	  Run 'make cloud-config' to query AWS and generate available options.
+	  When disabled, uses static predefined instance types.

-	  See https://aws.amazon.com/ec2/instance-types/m5/ for
-	  details.
+	  This is automatically enabled when you run 'make cloud-config'.

-config TERRAFORM_AWS_INSTANCE_TYPE_M7A
-	bool "M7a"
-	depends on TARGET_ARCH_X86_64
-	help
-	  This is a general purpose type powered by 4th Generation
-	  AMD EPYC processors.
-
-	  See https://aws.amazon.com/ec2/instance-types/m7a/ for
-	  details.
-
-config TERRAFORM_AWS_INSTANCE_TYPE_I4I
-	bool "I4i"
-	depends on TARGET_ARCH_X86_64
-	help
-	  This is a storage-optimized type powered by 3rd generation
-	  Intel Xeon Scalable processors (Ice Lake) and use AWS Nitro
-	  NVMe SSDs.
+if TERRAFORM_AWS_USE_DYNAMIC_CONFIG
+# Include cloud-generated or static instance families
+# Try static first (pre-generated by admins for faster loading)
+# Fall back to generated files (requires AWS CLI)
+source "terraform/aws/kconfigs/Kconfig.compute.static"
+endif

-	  See https://aws.amazon.com/ec2/instance-types/i4i/ for
-	  details.
-
-config TERRAFORM_AWS_INSTANCE_TYPE_IS4GEN
-	bool "Is4gen"
-	depends on TARGET_ARCH_ARM64
-	help
-	  This is a Storage-optimized type powered by AWS Graviton2
-	  processors.
-
-	  See https://aws.amazon.com/ec2/instance-types/i4g/ for
-	  details.
-
-config TERRAFORM_AWS_INSTANCE_TYPE_IM4GN
-	bool "Im4gn"
-	depends on TARGET_ARCH_ARM64
+if !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
+# Static instance types when not using dynamic config
+choice
+	prompt "AWS instance types"
 	help
-	  This is a storage-optimized type powered by AWS Graviton2
-	  processors.
+	  Instance types comprise varying combinations of hardware
+	  platform, CPU count, memory size, storage, and networking
+	  capacity. Select the type that provides an appropriate mix
+	  of resources for your preferred workflows.

-	  See https://aws.amazon.com/ec2/instance-types/i4g/ for
-	  details.
+	  Some instance types are region- and capacity-limited.

-config TERRAFORM_AWS_INSTANCE_TYPE_C7A
-	depends on TARGET_ARCH_X86_64
-	bool "c7a"
-	help
-	  This is a compute-optimized type powered by 4th generation
-	  AMD EPYC processors.
+	  See https://aws.amazon.com/ec2/instance-types/ for
+	  details.

-	  See https://aws.amazon.com/ec2/instance-types/c7a/ for
-	  details.

 endchoice
+endif # !TERRAFORM_AWS_USE_DYNAMIC_CONFIG

+if !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
+# Use static instance type definitions when not using dynamic config
 source "terraform/aws/kconfigs/instance-types/Kconfig.m5"
 source "terraform/aws/kconfigs/instance-types/Kconfig.m7a"
-source "terraform/aws/kconfigs/instance-types/Kconfig.i4i"
-source "terraform/aws/kconfigs/instance-types/Kconfig.is4gen"
-source "terraform/aws/kconfigs/instance-types/Kconfig.im4gn"
-source "terraform/aws/kconfigs/instance-types/Kconfig.c7a"
+endif # !TERRAFORM_AWS_USE_DYNAMIC_CONFIG

 choice
 	prompt "Linux distribution"
 	default TERRAFORM_AWS_DISTRO_DEBIAN
 	help
-	  Select a popular Linux distribution to install on your
-	  instances, or use the "Custom AMI image" selection to
-	  choose an image that is off the beaten path.
+	  Select a popular Linux distribution to install on your
+	  instances, or use the "Custom AMI image" selection to
+	  choose an image that is off the beaten path.

 config TERRAFORM_AWS_DISTRO_AMAZON
 	bool "Amazon Linux"
@@ -120,3 +80,8 @@ source "terraform/aws/kconfigs/distros/Kconfig.oracle"
 source "terraform/aws/kconfigs/distros/Kconfig.rhel"
 source "terraform/aws/kconfigs/distros/Kconfig.sles"
 source "terraform/aws/kconfigs/distros/Kconfig.custom"
+
+# Include GPU AMI configuration if available (generated by cloud-config)
+if TERRAFORM_AWS_USE_DYNAMIC_CONFIG
+source "terraform/aws/kconfigs/Kconfig.gpu-amis.static"
+endif
-- 
2.50.1
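
Note (illustrative, not part of the patch): main() above fans the provider work out to a thread pool, collects results as they complete, and then prints them in a fixed order. The standalone sketch below isolates that pattern so it can be tried without cloud credentials. The names run_providers and summarize are hypothetical and do not exist in the patch. The (kconfig_generated, summary_success, summary_text) tuple shape and the error-message format are taken from the diff.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Tuple

# (kconfig_generated, summary_success, summary_text), as in the patch
Result = Tuple[bool, bool, str]


def run_providers(tasks: Dict[str, Callable[[], Result]]) -> Dict[str, Result]:
    """Run each provider callable in a worker thread (hypothetical helper).

    A task that raises is recorded as a failed result instead of
    aborting the whole run.
    """
    results: Dict[str, Result] = {}
    with ThreadPoolExecutor(max_workers=4) as executor:
        futures = {executor.submit(fn): name for name, fn in tasks.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                results[name] = (False, False, f"{name.upper()}: Error - {exc}")
    return results


def summarize(results: Dict[str, Result], order: List[str]) -> Tuple[List[str], bool]:
    """Build display lines in a fixed order, independent of completion order."""
    lines: List[str] = []
    any_success = False
    for name in order:
        if name not in results:
            continue
        kconfig_gen, success, summary = results[name]
        if success and kconfig_gen:
            any_success = True
        lines.append(("✓ " if success else "⚠ ") + summary)
    return lines, any_success
```

Because the display order comes from the explicit list rather than from as_completed(), the output stays stable even when one provider answers much faster than the other.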