From: Chuck Lever <cel@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>,
Daniel Gomez <da.gomez@kruces.com>,
kdevops@lists.linux.dev
Subject: Re: [PATCH 1/3] aws: add dynamic cloud configuration support using AWS CLI
Date: Thu, 4 Sep 2025 09:55:46 -0400 [thread overview]
Message-ID: <d38f8fc3-3a44-4de7-b463-b258cbcd66c4@kernel.org> (raw)
In-Reply-To: <20250904090030.2481840-2-mcgrof@kernel.org>
On 9/4/25 5:00 AM, Luis Chamberlain wrote:
> Add support for dynamically generating AWS instance types and regions
> configuration using the AWS CLI, similar to the Lambda Labs implementation.
>
> This allows users to:
> - Query real-time AWS instance availability
> - Generate Kconfig files with current instance families and regions
> - Choose between dynamic and static configuration modes
> - See pricing estimates and resource summaries
>
> Key components:
> - scripts/aws-cli: AWS CLI wrapper tool for kdevops
> - scripts/aws_api.py: Low-level AWS API functions
> - Updated generate_cloud_configs.py to support AWS
> - Makefile integration for AWS Kconfig generation
> - Option to use dynamic or static AWS configuration
>
> Usage: Run 'make cloud-config' to generate dynamic configuration.
I'd like to see more documentation for this make target. Is this a
target to be run as part of every workflow, or is it one that developers
run every once in a while? (I think the latter, based on the next patch
in this series, but it would be nice to put that in docs somewhere).
For example, ISTR a docs file that describes "make refs-default".
> This also parallelizes cloud provider operations to significantly
> speed up generation.
>
> $ time make cloud-config
> Cloud Provider Configuration Summary
> ============================================================
>
> ✓ Lambda Labs: 14/20 instances available, 14 regions, $0.50-$10.32/hr
> Kconfig files generated successfully
>
> ✓ AWS: 979 instance types available, 17 regions, ~$0.05-$3.60/hr
> Kconfig files generated successfully
>
> ⚠ Azure: Dynamic configuration not yet implemented
> ⚠ GCE: Dynamic configuration not yet implemented
>
> Note: Dynamic configurations query real-time availability
> Run 'make menuconfig' to configure your cloud provider
>
> real 6m51.859s
> user 37m16.347s
> sys 3m8.130s
I spent a little time yesterday adding new instance families by hand,
after asking the AWS instance type assistant to recommend appropriate
families for CI/CD. I added: t3, t3a, m7i-flex, and the two g4 families.
Other families looked too expensive or might be more than is needed for
CI/CD (like who needs 16 GPUs, 192 vCPUs, and 8 200GbE adapters for a
development system? ;-)
What I'd like to do is prevent an overload of choices in the instance
menus -- perhaps we can maintain a list somewhere of the families we'd
like to add to the menu and let the scripting consult that list as it
constructs the menu.
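A minimal sketch of the kind of filter I mean (the allowlist path, file
format, and function name here are hypothetical, just to illustrate):

```python
from pathlib import Path


def filter_families(discovered: set, allowlist: Path) -> set:
    """Keep only instance families named in a maintainer-curated
    allowlist file (one family per line, '#' starts a comment).
    Fall back to everything discovered if no allowlist exists."""
    if not allowlist.exists():
        return discovered
    wanted = {
        line.split("#", 1)[0].strip()
        for line in allowlist.read_text().splitlines()
    }
    wanted.discard("")
    return discovered & wanted
```

Then get_generated_instance_families() could intersect its discovered
set with the allowlist before any Kconfig menus are generated, and the
curated list lives in one obvious place under review like any other
change.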
(And, I can drop my by-hand patches... I didn't realize you were ready
to post your automation that does the same thing).
> This also adds support for GPU AMIs:
IMHO the new GPU AMI support is a good idea, but should be split into a
separate patch in this series.
> - AWS Deep Learning AMI with pre-installed NVIDIA drivers, CUDA, and
> ML frameworks
> - NVIDIA Deep Learning AMI option for NGC containers
> - Custom GPU AMI support for specialized images
> - Automatic detection of GPU instance types
> - We conditionally display GPU AMI options only for GPU instances
(I think OCI also provides distinct OS images with GPU user space tools,
fwiw).
> We automatically detect when you select a GPU instance family (like G6E)
> and provide appropriate GPU-optimized AMI options, including the AWS Deep
> Learning AMI with all necessary drivers and frameworks pre-installed.
>
> Generated-by: Claude AI
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
> .gitignore | 3 +
> defconfigs/aws-gpu-g6e-ai | 53 +
> .../templates/aws/terraform.tfvars.j2 | 5 +
> scripts/aws-cli | 436 +++++++
> scripts/aws_api.py | 1135 +++++++++++++++++
Note there is also an Ansible collection that provides the AWS API
to playbooks: amazon.aws. This is not a recommendation to
reimplement this patch, just pointing out there are other ways to
skin the cat.
> scripts/dynamic-cloud-kconfig.Makefile | 88 +-
> scripts/generate_cloud_configs.py | 198 ++-
> terraform/aws/kconfigs/Kconfig.compute | 109 +-
> 8 files changed, 1937 insertions(+), 90 deletions(-)
> create mode 100644 defconfigs/aws-gpu-g6e-ai
> create mode 100755 scripts/aws-cli
> create mode 100755 scripts/aws_api.py
>
> diff --git a/.gitignore b/.gitignore
> index 09d2ae33..30337add 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -115,3 +115,6 @@ terraform/lambdalabs/.terraform_api_key
> .cloud.initialized
>
> scripts/__pycache__/
> +.aws_cloud_config_generated
> +terraform/aws/kconfigs/*.generated
> +terraform/aws/kconfigs/instance-types/*.generated
> diff --git a/defconfigs/aws-gpu-g6e-ai b/defconfigs/aws-gpu-g6e-ai
> new file mode 100644
> index 00000000..affc7a98
> --- /dev/null
> +++ b/defconfigs/aws-gpu-g6e-ai
> @@ -0,0 +1,53 @@
> +# AWS G6e.2xlarge GPU instance with Deep Learning AMI for AI/ML workloads
> +# This configuration sets up an AWS G6e.2xlarge instance with NVIDIA L40S GPU
> +# optimized for machine learning, AI inference, and GPU-accelerated workloads
> +
> +# Cloud provider configuration
> +CONFIG_KDEVOPS_ENABLE_TERRAFORM=y
> +CONFIG_TERRAFORM=y
> +CONFIG_TERRAFORM_AWS=y
> +
> +# AWS Dynamic configuration (required for G6E instance family and GPU AMIs)
> +CONFIG_TERRAFORM_AWS_USE_DYNAMIC_CONFIG=y
> +
> +# AWS Instance configuration - G6E family with NVIDIA L40S GPU
> +# G6E.2XLARGE specifications:
> +# - 8 vCPUs (3rd Gen AMD EPYC processors)
> +# - 32 GB system RAM
> +# - 1x NVIDIA L40S Tensor Core GPU
> +# - 48 GB GPU memory
> +# - Up to 15 Gbps network performance
> +# - Up to 10 Gbps EBS bandwidth
> +CONFIG_TERRAFORM_AWS_INSTANCE_TYPE_G6E=y
> +CONFIG_TERRAFORM_AWS_INSTANCE_G6E_2XLARGE=y
> +
> +# AWS Region - US East (N. Virginia) - primary availability for G6E
> +CONFIG_TERRAFORM_AWS_REGION_US_EAST_1=y
> +
> +# GPU-optimized Deep Learning AMI
> +# Includes: NVIDIA drivers 535+, CUDA 12.x, cuDNN, TensorFlow, PyTorch, MXNet
> +CONFIG_TERRAFORM_AWS_USE_GPU_AMI=y
> +CONFIG_TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING=y
> +CONFIG_TERRAFORM_AWS_GPU_AMI_NAME="Deep Learning AMI GPU TensorFlow*"
> +CONFIG_TERRAFORM_AWS_GPU_AMI_OWNER="amazon"
> +
> +# Storage configuration optimized for ML workloads
> +# 200 GB for datasets, models, and experiment artifacts
> +CONFIG_TERRAFORM_AWS_DATA_VOLUME_SIZE=200
> +
> +# Basic workflow configuration for kernel development
> +CONFIG_WORKFLOWS=y
> +CONFIG_WORKFLOW_LINUX_CUSTOM=y
> +CONFIG_BOOTLINUX=y
> +
> +# Skip testing workflows for pure AI/ML setup
> +CONFIG_WORKFLOWS_TESTS=n
> +
> +# Enable systemd journal remote for debugging
> +CONFIG_DEVCONFIG_ENABLE_SYSTEMD_JOURNAL_REMOTE=y
> +
> +# Note: After provisioning, the instance will have:
> +# - Jupyter notebook server ready for ML experiments
> +# - Pre-installed deep learning frameworks
> +# - NVIDIA GPU drivers and CUDA toolkit
> +# - Docker with NVIDIA Container Toolkit for containerized ML workloads
> \ No newline at end of file
> diff --git a/playbooks/roles/gen_tfvars/templates/aws/terraform.tfvars.j2 b/playbooks/roles/gen_tfvars/templates/aws/terraform.tfvars.j2
> index d880254b..f8f4c842 100644
> --- a/playbooks/roles/gen_tfvars/templates/aws/terraform.tfvars.j2
> +++ b/playbooks/roles/gen_tfvars/templates/aws/terraform.tfvars.j2
> @@ -1,8 +1,13 @@
> aws_profile = "{{ terraform_aws_profile }}"
> aws_region = "{{ terraform_aws_region }}"
> aws_availability_zone = "{{ terraform_aws_av_zone }}"
> +{% if terraform_aws_use_gpu_ami is defined and terraform_aws_use_gpu_ami %}
> +aws_name_search = "{{ terraform_aws_gpu_ami_name }}"
> +aws_ami_owner = "{{ terraform_aws_gpu_ami_owner }}"
> +{% else %}
> aws_name_search = "{{ terraform_aws_ns }}"
> aws_ami_owner = "{{ terraform_aws_ami_owner }}"
> +{% endif %}
> aws_instance_type = "{{ terraform_aws_instance_type }}"
> aws_ebs_volumes_per_instance = "{{ terraform_aws_ebs_volumes_per_instance }}"
> aws_ebs_volume_size = {{ terraform_aws_ebs_volume_size }}
> diff --git a/scripts/aws-cli b/scripts/aws-cli
> new file mode 100755
> index 00000000..6cacce8b
> --- /dev/null
> +++ b/scripts/aws-cli
> @@ -0,0 +1,436 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: MIT
> +"""
> +AWS CLI tool for kdevops
> +
> +A structured CLI tool that wraps AWS CLI commands and provides access to
> +AWS cloud provider functionality for dynamic configuration generation
> +and resource management.
> +"""
> +
> +import argparse
> +import json
> +import sys
> +import os
> +from typing import Dict, List, Any, Optional, Tuple
> +from pathlib import Path
> +
> +# Import the AWS API functions
> +try:
> + from aws_api import (
> + check_aws_cli,
> + get_instance_types,
> + get_regions,
> + get_availability_zones,
> + get_pricing_info,
> + generate_instance_types_kconfig,
> + generate_regions_kconfig,
> + generate_instance_families_kconfig,
> + generate_gpu_amis_kconfig,
> + )
> +except ImportError:
> + # Try to import from scripts directory if not in path
> + sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
> + from aws_api import (
> + check_aws_cli,
> + get_instance_types,
> + get_regions,
> + get_availability_zones,
> + get_pricing_info,
> + generate_instance_types_kconfig,
> + generate_regions_kconfig,
> + generate_instance_families_kconfig,
> + generate_gpu_amis_kconfig,
> + )
> +
> +
> +class AWSCLI:
> + """AWS CLI interface for kdevops"""
> +
> + def __init__(self, output_format: str = "json"):
> + """
> + Initialize the CLI with specified output format
> +
> + Args:
> + output_format: 'json' or 'text' for output formatting
> + """
> + self.output_format = output_format
> + self.aws_available = check_aws_cli()
> +
> + def output(self, data: Any, headers: Optional[List[str]] = None):
> + """
> + Output data in the specified format
> +
> + Args:
> + data: Data to output (dict, list, or primitive)
> + headers: Column headers for text format (optional)
> + """
> + if self.output_format == "json":
> + print(json.dumps(data, indent=2))
> + else:
> + # Human-readable text format
> + if isinstance(data, list):
> + if data and isinstance(data[0], dict):
> + # Table format for list of dicts
> + if not headers:
> + headers = list(data[0].keys()) if data else []
> +
> + if headers:
> + # Calculate column widths
> + widths = {h: len(h) for h in headers}
> + for item in data:
> + for h in headers:
> + val = str(item.get(h, ""))
> + widths[h] = max(widths[h], len(val))
> +
> + # Print header
> + header_line = " | ".join(h.ljust(widths[h]) for h in headers)
> + print(header_line)
> + print("-" * len(header_line))
> +
> + # Print rows
> + for item in data:
> + row = " | ".join(
> + str(item.get(h, "")).ljust(widths[h]) for h in headers
> + )
> + print(row)
> + else:
> + # Simple list
> + for item in data:
> + print(item)
> + elif isinstance(data, dict):
> + # Key-value format
> + max_key_len = max(len(k) for k in data.keys()) if data else 0
> + for key, value in data.items():
> + print(f"{key.ljust(max_key_len)} : {value}")
> + else:
> + # Simple value
> + print(data)
> +
> + def list_instance_types(
> + self,
> + family: Optional[str] = None,
> + region: Optional[str] = None,
> + max_results: int = 100,
> + ) -> List[Dict[str, Any]]:
> + """
> + List instance types
> +
> + Args:
> + family: Filter by instance family (e.g., 'm5', 't3')
> + region: AWS region to query
> + max_results: Maximum number of results to return
> +
> + Returns:
> + List of instance type information
> + """
> + if not self.aws_available:
> + return [
> + {
> + "error": "AWS CLI not found. Please install AWS CLI and configure credentials."
> + }
> + ]
> +
> + instances = get_instance_types(
> + family=family, region=region, max_results=max_results
> + )
> +
> + # Format the results
> + result = []
> + for instance in instances:
> + item = {
> + "name": instance.get("InstanceType", ""),
> + "vcpu": instance.get("VCpuInfo", {}).get("DefaultVCpus", 0),
> + "memory_gb": instance.get("MemoryInfo", {}).get("SizeInMiB", 0) / 1024,
> + "instance_storage": instance.get("InstanceStorageSupported", False),
> + "network_performance": instance.get("NetworkInfo", {}).get(
> + "NetworkPerformance", ""
> + ),
> + "architecture": ", ".join(
> + instance.get("ProcessorInfo", {}).get("SupportedArchitectures", [])
> + ),
> + }
> + result.append(item)
> +
> + # Sort by name
> + result.sort(key=lambda x: x["name"])
> +
> + return result
> +
> + def list_regions(self, include_zones: bool = False) -> List[Dict[str, Any]]:
> + """
> + List regions
> +
> + Args:
> + include_zones: Include availability zones for each region
> +
> + Returns:
> + List of region information
> + """
> + if not self.aws_available:
> + return [
> + {
> + "error": "AWS CLI not found. Please install AWS CLI and configure credentials."
> + }
> + ]
> +
> + regions = get_regions()
> +
> + result = []
> + for region in regions:
> + item = {
> + "name": region.get("RegionName", ""),
> + "endpoint": region.get("Endpoint", ""),
> + "opt_in_status": region.get("OptInStatus", ""),
> + }
> +
> + if include_zones:
> + # Get availability zones for this region
> + zones = get_availability_zones(region["RegionName"])
> + item["zones"] = len(zones)
> + item["zone_names"] = ", ".join([z["ZoneName"] for z in zones])
> +
> + result.append(item)
> +
> + return result
> +
> + def get_cheapest_instance(
> + self,
> + region: Optional[str] = None,
> + family: Optional[str] = None,
> + min_vcpus: int = 2,
> + ) -> Dict[str, Any]:
> + """
> + Get the cheapest instance meeting criteria
> +
> + Args:
> + region: AWS region
> + family: Instance family filter
> + min_vcpus: Minimum number of vCPUs required
> +
> + Returns:
> + Dictionary with instance information
> + """
> + if not self.aws_available:
> + return {"error": "AWS CLI not available"}
> +
> + instances = get_instance_types(family=family, region=region)
> +
> + # Filter by minimum vCPUs
> + eligible = []
> + for instance in instances:
> + vcpus = instance.get("VCpuInfo", {}).get("DefaultVCpus", 0)
> + if vcpus >= min_vcpus:
> + eligible.append(instance)
> +
> + if not eligible:
> + return {"error": "No instances found matching criteria"}
> +
> + # Get pricing for eligible instances
> + pricing = get_pricing_info(region=region or "us-east-1")
> +
> + # Find cheapest
> + cheapest = None
> + cheapest_price = float("inf")
> +
> + for instance in eligible:
> + instance_type = instance.get("InstanceType")
> + price = pricing.get(instance_type, {}).get("on_demand", float("inf"))
> + if price < cheapest_price:
> + cheapest_price = price
> + cheapest = instance
> +
> + if cheapest:
> + return {
> + "instance_type": cheapest.get("InstanceType"),
> + "vcpus": cheapest.get("VCpuInfo", {}).get("DefaultVCpus", 0),
> + "memory_gb": cheapest.get("MemoryInfo", {}).get("SizeInMiB", 0) / 1024,
> + "price_per_hour": f"${cheapest_price:.3f}",
> + }
> +
> + return {"error": "Could not determine cheapest instance"}
> +
> + def generate_kconfig(self) -> bool:
> + """
> + Generate Kconfig files for AWS
> +
> + Returns:
> + True on success, False on failure
> + """
> + if not self.aws_available:
> + print("AWS CLI not available, cannot generate Kconfig", file=sys.stderr)
> + return False
> +
> + output_dir = Path("terraform/aws/kconfigs")
> +
> + # Create directory if it doesn't exist
> + output_dir.mkdir(parents=True, exist_ok=True)
> +
> + try:
> + from concurrent.futures import ThreadPoolExecutor, as_completed
> +
> + # Generate files in parallel
> + instance_types_dir = output_dir / "instance-types"
> + instance_types_dir.mkdir(exist_ok=True)
> +
> + def generate_family_file(family):
> + """Generate Kconfig for a single family."""
> + types_kconfig = generate_instance_types_kconfig(family)
> + if types_kconfig:
> + types_file = instance_types_dir / f"Kconfig.{family}.generated"
> + types_file.write_text(types_kconfig)
> + return f"Generated {types_file}"
> + return None
> +
> + with ThreadPoolExecutor(max_workers=10) as executor:
> + # Submit all generation tasks
> + futures = []
> +
> + # Generate instance families Kconfig
> + futures.append(executor.submit(generate_instance_families_kconfig))
> +
> + # Generate regions Kconfig
> + futures.append(executor.submit(generate_regions_kconfig))
> +
> + # Generate GPU AMIs Kconfig
> + futures.append(executor.submit(generate_gpu_amis_kconfig))
> +
> + # Generate instance types for each family
> + # Get all families dynamically from AWS
> + from aws_api import get_generated_instance_families
> +
> + families = get_generated_instance_families()
> +
> + family_futures = []
> + for family in sorted(families):
> + family_futures.append(executor.submit(generate_family_file, family))
> +
> + # Process main config results
> + families_kconfig = futures[0].result()
> + regions_kconfig = futures[1].result()
> + gpu_amis_kconfig = futures[2].result()
> +
> + # Write main configs
> + families_file = output_dir / "Kconfig.compute.generated"
> + families_file.write_text(families_kconfig)
> + print(f"Generated {families_file}")
> +
> + regions_file = output_dir / "Kconfig.location.generated"
> + regions_file.write_text(regions_kconfig)
> + print(f"Generated {regions_file}")
> +
> + gpu_amis_file = output_dir / "Kconfig.gpu-amis.generated"
> + gpu_amis_file.write_text(gpu_amis_kconfig)
> + print(f"Generated {gpu_amis_file}")
> +
> + # Process family results
> + for future in family_futures:
> + result = future.result()
> + if result:
> + print(result)
> +
> + return True
> +
> + except Exception as e:
> + print(f"Error generating Kconfig: {e}", file=sys.stderr)
> + return False
> +
> +
> +def main():
> + """Main entry point"""
> + parser = argparse.ArgumentParser(
> + description="AWS CLI tool for kdevops",
> + formatter_class=argparse.RawDescriptionHelpFormatter,
> + )
> +
> + parser.add_argument(
> + "--output",
> + choices=["json", "text"],
> + default="json",
> + help="Output format (default: json)",
> + )
> +
> + subparsers = parser.add_subparsers(dest="command", help="Available commands")
> +
> + # Generate Kconfig command
> + kconfig_parser = subparsers.add_parser(
> + "generate-kconfig", help="Generate Kconfig files for AWS"
> + )
> +
> + # Instance types command
> + instances_parser = subparsers.add_parser(
> + "instance-types", help="Manage instance types"
> + )
> + instances_subparsers = instances_parser.add_subparsers(
> + dest="subcommand", help="Instance type operations"
> + )
> +
> + # Instance types list
> + list_instances = instances_subparsers.add_parser("list", help="List instance types")
> + list_instances.add_argument("--family", help="Filter by instance family")
> + list_instances.add_argument("--region", help="AWS region")
> + list_instances.add_argument(
> + "--max-results", type=int, default=100, help="Maximum results (default: 100)"
> + )
> +
> + # Regions command
> + regions_parser = subparsers.add_parser("regions", help="Manage regions")
> + regions_subparsers = regions_parser.add_subparsers(
> + dest="subcommand", help="Region operations"
> + )
> +
> + # Regions list
> + list_regions = regions_subparsers.add_parser("list", help="List regions")
> + list_regions.add_argument(
> + "--include-zones",
> + action="store_true",
> + help="Include availability zones",
> + )
> +
> + # Cheapest instance command
> + cheapest_parser = subparsers.add_parser(
> + "cheapest", help="Find cheapest instance meeting criteria"
> + )
> + cheapest_parser.add_argument("--region", help="AWS region")
> + cheapest_parser.add_argument("--family", help="Instance family")
> + cheapest_parser.add_argument(
> + "--min-vcpus", type=int, default=2, help="Minimum vCPUs (default: 2)"
> + )
> +
> + args = parser.parse_args()
> +
> + cli = AWSCLI(output_format=args.output)
> +
> + if args.command == "generate-kconfig":
> + success = cli.generate_kconfig()
> + sys.exit(0 if success else 1)
> +
> + elif args.command == "instance-types":
> + if args.subcommand == "list":
> + instances = cli.list_instance_types(
> + family=args.family,
> + region=args.region,
> + max_results=args.max_results,
> + )
> + cli.output(instances)
> +
> + elif args.command == "regions":
> + if args.subcommand == "list":
> + regions = cli.list_regions(include_zones=args.include_zones)
> + cli.output(regions)
> +
> + elif args.command == "cheapest":
> + result = cli.get_cheapest_instance(
> + region=args.region,
> + family=args.family,
> + min_vcpus=args.min_vcpus,
> + )
> + cli.output(result)
> +
> + else:
> + parser.print_help()
> + sys.exit(1)
> +
> +
> +if __name__ == "__main__":
> + main()
> diff --git a/scripts/aws_api.py b/scripts/aws_api.py
> new file mode 100755
> index 00000000..1cf42f39
> --- /dev/null
> +++ b/scripts/aws_api.py
> @@ -0,0 +1,1135 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: MIT
> +"""
> +AWS API library for kdevops.
> +
> +Provides AWS CLI wrapper functions for dynamic configuration generation.
> +Used by aws-cli and other kdevops components.
> +"""
> +
> +import json
> +import os
> +import re
> +import subprocess
> +import sys
> +from typing import Dict, List, Optional, Any
> +
> +
> +def check_aws_cli() -> bool:
> + """Check if AWS CLI is installed and configured."""
> + try:
> + # Check if AWS CLI is installed
> + result = subprocess.run(
> + ["aws", "--version"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> + if result.returncode != 0:
> + return False
> +
> + # Check if credentials are configured
> + result = subprocess.run(
> + ["aws", "sts", "get-caller-identity"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> + return result.returncode == 0
> + except FileNotFoundError:
> + return False
> +
> +
> +def get_default_region() -> str:
> + """Get the default AWS region from configuration or environment."""
> + # Try to get from environment
> + region = os.environ.get("AWS_DEFAULT_REGION")
> + if region:
> + return region
> +
> + # Try to get from AWS config
> + try:
> + result = subprocess.run(
> + ["aws", "configure", "get", "region"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> + if result.returncode == 0 and result.stdout.strip():
> + return result.stdout.strip()
> +    except (subprocess.SubprocessError, OSError):
> + pass
> +
> + # Default to us-east-1
> + return "us-east-1"
> +
> +
> +def run_aws_command(command: List[str], region: Optional[str] = None) -> Optional[Dict]:
> + """
> + Run an AWS CLI command and return the JSON output.
> +
> + Args:
> + command: AWS CLI command as a list
> + region: Optional AWS region
> +
> + Returns:
> + Parsed JSON output or None on error
> + """
> + cmd = ["aws"] + command + ["--output", "json"]
> +
> + # Always specify a region (use default if not provided)
> + if not region:
> + region = get_default_region()
> + cmd.extend(["--region", region])
> +
> + try:
> + result = subprocess.run(
> + cmd,
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> + if result.returncode == 0:
> + return json.loads(result.stdout) if result.stdout else {}
> + else:
> + print(f"AWS command failed: {result.stderr}", file=sys.stderr)
> + return None
> + except (subprocess.SubprocessError, json.JSONDecodeError) as e:
> + print(f"Error running AWS command: {e}", file=sys.stderr)
> + return None
> +
> +
> +def get_regions() -> List[Dict[str, Any]]:
> + """Get available AWS regions."""
> + response = run_aws_command(["ec2", "describe-regions"])
> + if response and "Regions" in response:
> + return response["Regions"]
> + return []
> +
> +
> +def get_availability_zones(region: str) -> List[Dict[str, Any]]:
> + """Get availability zones for a specific region."""
> + response = run_aws_command(
> + ["ec2", "describe-availability-zones"],
> + region=region,
> + )
> + if response and "AvailabilityZones" in response:
> + return response["AvailabilityZones"]
> + return []
> +
> +
> +def get_instance_types(
> + family: Optional[str] = None,
> + region: Optional[str] = None,
> + max_results: int = 100,
> + fetch_all: bool = True,
> +) -> List[Dict[str, Any]]:
> + """
> + Get available instance types.
> +
> + Args:
> + family: Instance family filter (e.g., 'm5', 't3')
> + region: AWS region
> + max_results: Maximum number of results per API call (max 100)
> + fetch_all: If True, fetch all pages using NextToken pagination
> +
> + Returns:
> + List of instance type information
> + """
> + all_instances = []
> + next_token = None
> + page_count = 0
> +
> + # Ensure max_results doesn't exceed AWS limit
> + max_results = min(max_results, 100)
> +
> + while True:
> + cmd = ["ec2", "describe-instance-types"]
> +
> + filters = []
> + if family:
> + # Filter by instance type pattern
> + filters.append(f"Name=instance-type,Values={family}*")
> +
> + if filters:
> + cmd.append("--filters")
> + cmd.extend(filters)
> +
> + cmd.extend(["--max-results", str(max_results)])
> +
> + if next_token:
> + cmd.extend(["--next-token", next_token])
> +
> + response = run_aws_command(cmd, region=region)
> + if response and "InstanceTypes" in response:
> + batch_size = len(response["InstanceTypes"])
> + all_instances.extend(response["InstanceTypes"])
> + page_count += 1
> +
> + if fetch_all and not family:
> + # Only show progress for full fetches (not family-specific)
> + print(
> + f" Fetched page {page_count}: {batch_size} instance types (total: {len(all_instances)})",
> + file=sys.stderr,
> + )
> +
> + # Check if there are more results
> + if fetch_all and "NextToken" in response:
> + next_token = response["NextToken"]
> + else:
> + break
> + else:
> + break
> +
> + if fetch_all and page_count > 1:
> + filter_desc = f" for family '{family}'" if family else ""
> + print(
> + f" Total: {len(all_instances)} instance types fetched{filter_desc}",
> + file=sys.stderr,
> + )
> +
> + return all_instances
> +
> +
> +def get_pricing_info(region: str = "us-east-1") -> Dict[str, Dict[str, float]]:
> + """
> + Get pricing information for instance types.
> +
> + Note: AWS Pricing API requires us-east-1 region.
> + Returns a simplified pricing structure.
> +
> + Args:
> + region: AWS region for pricing
> +
> + Returns:
> + Dictionary mapping instance types to pricing info
> + """
> + # For simplicity, we'll use hardcoded common instance prices
> + # In production, you'd query the AWS Pricing API
> + pricing = {
> + # T3 family (burstable)
> + "t3.nano": {"on_demand": 0.0052},
> + "t3.micro": {"on_demand": 0.0104},
> + "t3.small": {"on_demand": 0.0208},
> + "t3.medium": {"on_demand": 0.0416},
> + "t3.large": {"on_demand": 0.0832},
> + "t3.xlarge": {"on_demand": 0.1664},
> + "t3.2xlarge": {"on_demand": 0.3328},
> + # T3a family (AMD)
> + "t3a.nano": {"on_demand": 0.0047},
> + "t3a.micro": {"on_demand": 0.0094},
> + "t3a.small": {"on_demand": 0.0188},
> + "t3a.medium": {"on_demand": 0.0376},
> + "t3a.large": {"on_demand": 0.0752},
> + "t3a.xlarge": {"on_demand": 0.1504},
> + "t3a.2xlarge": {"on_demand": 0.3008},
> + # M5 family (general purpose Intel)
> + "m5.large": {"on_demand": 0.096},
> + "m5.xlarge": {"on_demand": 0.192},
> + "m5.2xlarge": {"on_demand": 0.384},
> + "m5.4xlarge": {"on_demand": 0.768},
> + "m5.8xlarge": {"on_demand": 1.536},
> + "m5.12xlarge": {"on_demand": 2.304},
> + "m5.16xlarge": {"on_demand": 3.072},
> + "m5.24xlarge": {"on_demand": 4.608},
> + # M7a family (general purpose AMD)
> + "m7a.medium": {"on_demand": 0.0464},
> + "m7a.large": {"on_demand": 0.0928},
> + "m7a.xlarge": {"on_demand": 0.1856},
> + "m7a.2xlarge": {"on_demand": 0.3712},
> + "m7a.4xlarge": {"on_demand": 0.7424},
> + "m7a.8xlarge": {"on_demand": 1.4848},
> + "m7a.12xlarge": {"on_demand": 2.2272},
> + "m7a.16xlarge": {"on_demand": 2.9696},
> + "m7a.24xlarge": {"on_demand": 4.4544},
> + "m7a.32xlarge": {"on_demand": 5.9392},
> + "m7a.48xlarge": {"on_demand": 8.9088},
> + # C5 family (compute optimized)
> + "c5.large": {"on_demand": 0.085},
> + "c5.xlarge": {"on_demand": 0.17},
> + "c5.2xlarge": {"on_demand": 0.34},
> + "c5.4xlarge": {"on_demand": 0.68},
> + "c5.9xlarge": {"on_demand": 1.53},
> + "c5.12xlarge": {"on_demand": 2.04},
> + "c5.18xlarge": {"on_demand": 3.06},
> + "c5.24xlarge": {"on_demand": 4.08},
> + # C7a family (compute optimized AMD)
> + "c7a.medium": {"on_demand": 0.0387},
> + "c7a.large": {"on_demand": 0.0774},
> + "c7a.xlarge": {"on_demand": 0.1548},
> + "c7a.2xlarge": {"on_demand": 0.3096},
> + "c7a.4xlarge": {"on_demand": 0.6192},
> + "c7a.8xlarge": {"on_demand": 1.2384},
> + "c7a.12xlarge": {"on_demand": 1.8576},
> + "c7a.16xlarge": {"on_demand": 2.4768},
> + "c7a.24xlarge": {"on_demand": 3.7152},
> + "c7a.32xlarge": {"on_demand": 4.9536},
> + "c7a.48xlarge": {"on_demand": 7.4304},
> + # I4i family (storage optimized)
> + "i4i.large": {"on_demand": 0.117},
> + "i4i.xlarge": {"on_demand": 0.234},
> + "i4i.2xlarge": {"on_demand": 0.468},
> + "i4i.4xlarge": {"on_demand": 0.936},
> + "i4i.8xlarge": {"on_demand": 1.872},
> + "i4i.16xlarge": {"on_demand": 3.744},
> + "i4i.32xlarge": {"on_demand": 7.488},
> + }
> +
> + # Adjust pricing based on region (simplified)
> + # Some regions are more expensive than others
> + region_multipliers = {
> + "us-east-1": 1.0,
> + "us-east-2": 1.0,
> + "us-west-1": 1.08,
> + "us-west-2": 1.0,
> + "eu-west-1": 1.1,
> + "eu-central-1": 1.15,
> + "ap-southeast-1": 1.2,
> + "ap-northeast-1": 1.25,
> + }
> +
> + multiplier = region_multipliers.get(region, 1.1)
> + if multiplier != 1.0:
> + adjusted_pricing = {}
> + for instance_type, prices in pricing.items():
> + adjusted_pricing[instance_type] = {
> + "on_demand": prices["on_demand"] * multiplier
> + }
> + return adjusted_pricing
> +
> + return pricing
> +
> +
> +def sanitize_kconfig_name(name: str) -> str:
> + """Convert a name to a valid Kconfig symbol."""
> + # Replace special characters with underscores
> + name = name.replace("-", "_").replace(".", "_").replace(" ", "_")
> + # Convert to uppercase
> + name = name.upper()
> + # Remove any non-alphanumeric characters (except underscore)
> + name = "".join(c for c in name if c.isalnum() or c == "_")
> + # Ensure it doesn't start with a number
> + if name and name[0].isdigit():
> + name = "_" + name
> + return name
> +
> +
> +# Cache for instance families to avoid redundant API calls
> +_cached_families = None
> +
> +
> +def get_generated_instance_families() -> set:
> + """Get the set of instance families that will have generated Kconfig files."""
> + global _cached_families
> +
> + # Return cached result if available
> + if _cached_families is not None:
> + return _cached_families
> +
> + # Return all families - we'll generate Kconfig files for all of them
> + # This function will be called by the aws-cli tool to determine which files to generate
> + if not check_aws_cli():
> + # Return a minimal set if AWS CLI is not available
> + _cached_families = {"m5", "t3", "c5"}
> + return _cached_families
> +
> + # Get all available instance types
> + print(" Discovering available instance families...", file=sys.stderr)
> + instance_types = get_instance_types(fetch_all=True)
> +
> + # Extract unique families
> + families = set()
> + for instance_type in instance_types:
> + type_name = instance_type.get("InstanceType", "")
> + # Extract family prefix (e.g., "m5" from "m5.large")
> + if "." in type_name:
> + family = type_name.split(".")[0]
> + families.add(family)
> +
> + print(f" Found {len(families)} instance families", file=sys.stderr)
> + _cached_families = families
> + return families
> +
> +
> +def generate_instance_families_kconfig() -> str:
> + """Generate Kconfig content for AWS instance families."""
> + # Check if AWS CLI is available
> + if not check_aws_cli():
> + return generate_default_instance_families_kconfig()
> +
> + # Get all available instance types (with pagination)
> + instance_types = get_instance_types(fetch_all=True)
> +
> + # Extract unique families
> + families = set()
> + family_info = {}
> + for instance in instance_types:
> + instance_type = instance.get("InstanceType", "")
> + if "." in instance_type:
> + family = instance_type.split(".")[0]
> + families.add(family)
> + if family not in family_info:
> + family_info[family] = {
> + "architectures": set(),
> + "count": 0,
> + }
> + family_info[family]["count"] += 1
> + for arch in instance.get("ProcessorInfo", {}).get(
> + "SupportedArchitectures", []
> + ):
> + family_info[family]["architectures"].add(arch)
> +
> + if not families:
> + return generate_default_instance_families_kconfig()
> +
> + # Group families by category - use prefix patterns to catch all variants
> + def categorize_family(family_name):
> + """Categorize a family based on its prefix."""
> + if family_name.startswith(("m", "t")):
> + return "general_purpose"
> + elif family_name.startswith("c"):
> + return "compute_optimized"
> + elif family_name.startswith(("r", "x", "z")):
> + return "memory_optimized"
> + elif family_name.startswith(("i", "d", "h")):
> + return "storage_optimized"
> + elif family_name.startswith(("p", "g", "dl", "trn", "inf", "vt", "f")):
> + return "accelerated"
> + elif family_name.startswith(("mac", "hpc")):
> + return "specialized"
> + else:
> + return "other"
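
The branch order here looks buggy: startswith(("m", "t")) also matches
"mac*" and "trn*", and startswith(("i", "d", "h")) swallows "inf*", "dl*"
and "hpc*", so the accelerated and specialized arms are unreachable for
most of the families they name. Checking the longer, more specific
prefixes first fixes that; rough sketch:

```python
# Multi-character prefixes must be tested before single-character ones;
# otherwise "mac" matches "m", "trn" matches "t", "inf"/"dl"/"hpc" match
# the storage prefixes, and those arms never fire.
_CATEGORY_PREFIXES = [
    ("specialized", ("mac", "hpc")),
    ("accelerated", ("dl", "trn", "inf", "vt", "p", "g", "f")),
    ("storage_optimized", ("i", "d", "h")),
    ("memory_optimized", ("r", "x", "z")),
    ("general_purpose", ("m", "t")),
    ("compute_optimized", ("c",)),
]

def categorize_family(family_name: str) -> str:
    """Categorize a family, most specific prefix first."""
    for category, prefixes in _CATEGORY_PREFIXES:
        if family_name.startswith(prefixes):
            return category
    return "other"
```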
> +
> + # Organize families by category
> + categorized_families = {
> + "general_purpose": [],
> + "compute_optimized": [],
> + "memory_optimized": [],
> + "storage_optimized": [],
> + "accelerated": [],
> + "specialized": [],
> + "other": [],
> + }
> +
> + for family in sorted(families):
> + category = categorize_family(family)
> + categorized_families[category].append(family)
> +
> + kconfig = """# AWS instance families (dynamically generated)
> +# Generated by aws-cli from live AWS data
> +
> +choice
> + prompt "AWS instance family"
> + default TERRAFORM_AWS_INSTANCE_TYPE_M5
> + help
> + Select the AWS instance family for your deployment.
> + Different families are optimized for different workloads.
> +
> +"""
> +
> + # Category headers
> + category_headers = {
> + "general_purpose": "# General Purpose - balanced compute, memory, and networking\n",
> + "compute_optimized": "# Compute Optimized - ideal for CPU-intensive applications\n",
> + "memory_optimized": "# Memory Optimized - for memory-intensive applications\n",
> + "storage_optimized": "# Storage Optimized - for high sequential read/write workloads\n",
> + "accelerated": "# Accelerated Computing - GPU and other accelerators\n",
> + "specialized": "# Specialized - for specific use cases\n",
> + "other": "# Other instance families\n",
> + }
> +
> + # Add each category of families
> + for category in [
> + "general_purpose",
> + "compute_optimized",
> + "memory_optimized",
> + "storage_optimized",
> + "accelerated",
> + "specialized",
> + "other",
> + ]:
> + if categorized_families[category]:
> + kconfig += category_headers[category]
> + for family in categorized_families[category]:
> + kconfig += generate_family_config(family, family_info.get(family, {}))
> + if category != "other": # Don't add extra newline after the last category
> + kconfig += "\n"
> +
> + kconfig += "\nendchoice\n"
> +
> + # Add instance type source includes for each family
> + # Only include families that we actually generate files for
> + generated_families = get_generated_instance_families()
> + kconfig += "\n# Include instance-specific configurations\n"
> + for family in sorted(families):
> + # Only add source statement if we generate a file for this family
> + if family in generated_families:
> + safe_name = sanitize_kconfig_name(family)
> + kconfig += f"""if TERRAFORM_AWS_INSTANCE_TYPE_{safe_name}
> +source "terraform/aws/kconfigs/instance-types/Kconfig.{family}.generated"
> +endif
> +
> +"""
> +
> + return kconfig
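
One thing I'm not sure about: get_generated_instance_families() can return
every family the API reports (979 types per the cover letter), but
dynamic_aws_kconfig_touch only pre-creates the nine files listed in
AWS_INSTANCE_TYPE_FAMILIES. If the two ever disagree, the generated
"source" lines point at files that don't exist and Kconfig bails out
before the generator has a chance to run. Intersecting with the
Makefile's list would make that failure impossible; hypothetical sketch
(MAKEFILE_FAMILIES mirrors the Makefile variable, sourceable_families is
a helper I made up):

```python
# Families the Makefile pre-touches (AWS_INSTANCE_TYPE_FAMILIES); only
# these are guaranteed to exist when Kconfig parses the generated file.
MAKEFILE_FAMILIES = {"m5", "m7a", "t3", "t3a", "c5", "c7a",
                     "i4i", "is4gen", "im4gn"}

def sourceable_families(discovered: set) -> list:
    """Restrict 'source' statements to families whose files must exist."""
    return sorted(discovered & MAKEFILE_FAMILIES)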
> +
> +
> +def generate_family_config(family: str, info: Dict) -> str:
> + """Generate Kconfig entry for an instance family."""
> + safe_name = sanitize_kconfig_name(family)
> +
> + # Determine architecture dependencies
> + architectures = info.get("architectures", set())
> + depends_line = ""
> + if architectures:
> + if "x86_64" in architectures and "arm64" not in architectures:
> + depends_line = "\n\tdepends on TARGET_ARCH_X86_64"
> + elif "arm64" in architectures and "x86_64" not in architectures:
> + depends_line = "\n\tdepends on TARGET_ARCH_ARM64"
> +
> + # Family descriptions
> + descriptions = {
> + "t3": "Burstable performance instances powered by Intel processors",
> + "t3a": "Burstable performance instances powered by AMD processors",
> + "m5": "General purpose instances powered by Intel Xeon Platinum processors",
> + "m7a": "Latest generation general purpose instances powered by AMD EPYC processors",
> + "c5": "Compute optimized instances powered by Intel Xeon Platinum processors",
> + "c7a": "Latest generation compute optimized instances powered by AMD EPYC processors",
> + "i4i": "Storage optimized instances with NVMe SSD storage",
> + "is4gen": "Storage optimized ARM instances powered by AWS Graviton2",
> + "im4gn": "Storage optimized ARM instances with NVMe storage",
> + "r5": "Memory optimized instances powered by Intel Xeon Platinum processors",
> + "p3": "GPU instances for machine learning and HPC",
> + "g4dn": "GPU instances for graphics-intensive applications",
> + }
> +
> + description = descriptions.get(family, f"AWS {family.upper()} instance family")
> + count = info.get("count", 0)
> +
> + config = f"""config TERRAFORM_AWS_INSTANCE_TYPE_{safe_name}
> +\tbool "{family.upper()}"
> +{depends_line}
> +\thelp
> +\t {description}
> +\t Available instance types: {count}
> +
> +"""
> + return config
> +
> +
> +def generate_default_instance_families_kconfig() -> str:
> + """Generate default Kconfig content when AWS CLI is not available."""
> + return """# AWS instance families (default - AWS CLI not available)
> +
> +choice
> + prompt "AWS instance family"
> + default TERRAFORM_AWS_INSTANCE_TYPE_M5
> + help
> + Select the AWS instance family for your deployment.
> + Note: AWS CLI is not available, showing default options.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_M5
> + bool "M5"
> + depends on TARGET_ARCH_X86_64
> + help
> + General purpose instances powered by Intel Xeon Platinum processors.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_M7A
> + bool "M7a"
> + depends on TARGET_ARCH_X86_64
> + help
> + Latest generation general purpose instances powered by AMD EPYC processors.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_T3
> + bool "T3"
> + depends on TARGET_ARCH_X86_64
> + help
> + Burstable performance instances powered by Intel processors.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_C5
> + bool "C5"
> + depends on TARGET_ARCH_X86_64
> + help
> + Compute optimized instances powered by Intel Xeon Platinum processors.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_I4I
> + bool "I4i"
> + depends on TARGET_ARCH_X86_64
> + help
> + Storage optimized instances with NVMe SSD storage.
> +
> +endchoice
> +
> +# Include instance-specific configurations
> +if TERRAFORM_AWS_INSTANCE_TYPE_M5
> +source "terraform/aws/kconfigs/instance-types/Kconfig.m5.generated"
> +endif
> +
> +if TERRAFORM_AWS_INSTANCE_TYPE_M7A
> +source "terraform/aws/kconfigs/instance-types/Kconfig.m7a.generated"
> +endif
> +
> +if TERRAFORM_AWS_INSTANCE_TYPE_T3
> +source "terraform/aws/kconfigs/instance-types/Kconfig.t3.generated"
> +endif
> +
> +if TERRAFORM_AWS_INSTANCE_TYPE_C5
> +source "terraform/aws/kconfigs/instance-types/Kconfig.c5.generated"
> +endif
> +
> +if TERRAFORM_AWS_INSTANCE_TYPE_I4I
> +source "terraform/aws/kconfigs/instance-types/Kconfig.i4i.generated"
> +endif
> +
> +"""
> +
> +
> +def generate_instance_types_kconfig(family: str) -> str:
> + """Generate Kconfig content for specific instance types within a family."""
> + if not check_aws_cli():
> + return ""
> +
> + instance_types = get_instance_types(family=family, fetch_all=True)
> + if not instance_types:
> + return ""
> +
> + # Filter to only exact family matches (e.g., c5a but not c5ad)
> + filtered_instances = []
> + for instance in instance_types:
> + instance_type = instance.get("InstanceType", "")
> + if "." in instance_type:
> + inst_family = instance_type.split(".")[0]
> + if inst_family == family:
> + filtered_instances.append(instance)
> +
> + instance_types = filtered_instances
> + if not instance_types:
> + return ""
> +
> + pricing = get_pricing_info()
> +
> + # Sort by vCPU count and memory
> + instance_types.sort(
> + key=lambda x: (
> + x.get("VCpuInfo", {}).get("DefaultVCpus", 0),
> + x.get("MemoryInfo", {}).get("SizeInMiB", 0),
> + )
> + )
> +
> + safe_family = sanitize_kconfig_name(family)
> +
> + # Get the first instance type to use as default
> + default_instance_name = f"{safe_family}_LARGE" # Fallback
> + if instance_types:
> + first_instance_type = instance_types[0].get("InstanceType", "")
> + if "." in first_instance_type:
> + first_full_name = first_instance_type.replace(".", "_")
> + default_instance_name = sanitize_kconfig_name(first_full_name)
> +
> + kconfig = f"""# AWS {family.upper()} instance sizes (dynamically generated)
> +
> +choice
> +\tprompt "Instance size for {family.upper()} family"
> +\tdefault TERRAFORM_AWS_INSTANCE_{default_instance_name}
> +\thelp
> +\t Select the specific instance size within the {family.upper()} family.
> +
> +"""
> +
> + seen_configs = set()
> + for instance in instance_types:
> + instance_type = instance.get("InstanceType", "")
> + if "." not in instance_type:
> + continue
> +
> + # Get the full instance type name to make unique config names
> + full_name = instance_type.replace(".", "_")
> + safe_full_name = sanitize_kconfig_name(full_name)
> +
> + # Skip if we've already seen this config name
> + if safe_full_name in seen_configs:
> + continue
> + seen_configs.add(safe_full_name)
> +
> + size = instance_type.split(".")[1]
> +
> + vcpus = instance.get("VCpuInfo", {}).get("DefaultVCpus", 0)
> + memory_mib = instance.get("MemoryInfo", {}).get("SizeInMiB", 0)
> + memory_gb = memory_mib / 1024
> +
> + # Get pricing
> + price = pricing.get(instance_type, {}).get("on_demand", 0.0)
> + price_str = f"${price:.3f}/hour" if price > 0 else "pricing varies"
> +
> + # Network performance
> + network = instance.get("NetworkInfo", {}).get("NetworkPerformance", "varies")
> +
> + # Storage
> + storage_info = ""
> + if instance.get("InstanceStorageSupported"):
> + storage = instance.get("InstanceStorageInfo", {})
> + total_size = storage.get("TotalSizeInGB", 0)
> + if total_size > 0:
> + storage_info = f"\n\t Instance storage: {total_size} GB"
> +
> + kconfig += f"""config TERRAFORM_AWS_INSTANCE_{safe_full_name}
> +\tbool "{instance_type}"
> +\thelp
> +\t vCPUs: {vcpus}
> +\t Memory: {memory_gb:.1f} GB
> +\t Network: {network}
> +\t Price: {price_str}{storage_info}
> +
> +"""
> +
> + kconfig += "endchoice\n"
> +
> + # Add the actual instance type string config with full instance names
> + kconfig += f"""
> +config TERRAFORM_AWS_{safe_family}_SIZE
> +\tstring
> +"""
> +
> + # Generate default mappings for each seen instance type
> + for instance in instance_types:
> + instance_type = instance.get("InstanceType", "")
> + if "." not in instance_type:
> + continue
> +
> + full_name = instance_type.replace(".", "_")
> + safe_full_name = sanitize_kconfig_name(full_name)
> +
> + kconfig += (
> + f'\tdefault "{instance_type}" if TERRAFORM_AWS_INSTANCE_{safe_full_name}\n'
> + )
> +
> + # Use the first instance type as the final fallback default
> + final_default = f"{family}.large"
> + if instance_types:
> + first_instance_type = instance_types[0].get("InstanceType", "")
> + if first_instance_type:
> + final_default = first_instance_type
> +
> + kconfig += f'\tdefault "{final_default}"\n\n'
> +
> + return kconfig
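
Minor: the choice entries above dedupe via seen_configs, but this second
loop emitting the "default" mappings doesn't, so a duplicated type name
would produce two identical "default" lines. Reusing the same guard
closes that; sketch (default_lines is a made-up helper, and the .upper()
is a stand-in for sanitize_kconfig_name()):

```python
def default_lines(instance_types):
    """One 'default ... if ...' line per unique sanitized config name."""
    seen, lines = set(), []
    for inst in instance_types:
        name = inst.get("InstanceType", "")
        if "." not in name:
            continue
        # Stand-in for sanitize_kconfig_name(name.replace(".", "_"))
        safe = name.replace(".", "_").upper()
        if safe in seen:
            continue  # same guard the choice entries already apply
        seen.add(safe)
        lines.append(f'\tdefault "{name}" if TERRAFORM_AWS_INSTANCE_{safe}\n')
    return lines
```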
> +
> +
> +def generate_regions_kconfig() -> str:
> + """Generate Kconfig content for AWS regions."""
> + if not check_aws_cli():
> + return generate_default_regions_kconfig()
> +
> + regions = get_regions()
> + if not regions:
> + return generate_default_regions_kconfig()
> +
> + kconfig = """# AWS regions (dynamically generated)
> +
> +choice
> + prompt "AWS region"
> + default TERRAFORM_AWS_REGION_USEAST1
> + help
> + Select the AWS region for your deployment.
> + Note: Not all instance types are available in all regions.
> +
> +"""
> +
> + # Group regions by geographic area
> + us_regions = []
> + eu_regions = []
> + ap_regions = []
> + other_regions = []
> +
> + for region in regions:
> + region_name = region.get("RegionName", "")
> + if region_name.startswith("us-"):
> + us_regions.append(region)
> + elif region_name.startswith("eu-"):
> + eu_regions.append(region)
> + elif region_name.startswith("ap-"):
> + ap_regions.append(region)
> + else:
> + other_regions.append(region)
> +
> + # Add US regions
> + if us_regions:
> + kconfig += "# US Regions\n"
> + for region in sorted(us_regions, key=lambda x: x.get("RegionName", "")):
> + kconfig += generate_region_config(region)
> + kconfig += "\n"
> +
> + # Add EU regions
> + if eu_regions:
> + kconfig += "# Europe Regions\n"
> + for region in sorted(eu_regions, key=lambda x: x.get("RegionName", "")):
> + kconfig += generate_region_config(region)
> + kconfig += "\n"
> +
> + # Add Asia Pacific regions
> + if ap_regions:
> + kconfig += "# Asia Pacific Regions\n"
> + for region in sorted(ap_regions, key=lambda x: x.get("RegionName", "")):
> + kconfig += generate_region_config(region)
> + kconfig += "\n"
> +
> + # Add other regions
> + if other_regions:
> + kconfig += "# Other Regions\n"
> + for region in sorted(other_regions, key=lambda x: x.get("RegionName", "")):
> + kconfig += generate_region_config(region)
> +
> + kconfig += "\nendchoice\n"
> +
> + # Add the actual region string config
> + kconfig += """
> +config TERRAFORM_AWS_REGION
> + string
> +"""
> +
> + for region in regions:
> + region_name = region.get("RegionName", "")
> + safe_name = sanitize_kconfig_name(region_name)
> + kconfig += f'\tdefault "{region_name}" if TERRAFORM_AWS_REGION_{safe_name}\n'
> +
> + kconfig += '\tdefault "us-east-1"\n'
> +
> + return kconfig
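
Style nit, take it or leave it: the four near-identical grouping blocks
above could collapse into one loop over (header, prefix) pairs, so adding
a fifth geography later is a one-line change. Sketch (grouped_regions is
a hypothetical helper):

```python
REGION_GROUPS = [
    ("# US Regions\n", "us-"),
    ("# Europe Regions\n", "eu-"),
    ("# Asia Pacific Regions\n", "ap-"),
]

def grouped_regions(regions):
    """Yield (header, sorted regions) pairs, with a trailing catch-all."""
    leftovers = list(regions)
    for header, prefix in REGION_GROUPS:
        matched = [r for r in leftovers
                   if r.get("RegionName", "").startswith(prefix)]
        leftovers = [r for r in leftovers if r not in matched]
        if matched:
            yield header, sorted(matched, key=lambda r: r.get("RegionName", ""))
    if leftovers:
        yield "# Other Regions\n", sorted(
            leftovers, key=lambda r: r.get("RegionName", ""))
```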
> +
> +
> +def generate_region_config(region: Dict) -> str:
> + """Generate Kconfig entry for a region."""
> + region_name = region.get("RegionName", "")
> + safe_name = sanitize_kconfig_name(region_name)
> + opt_in_status = region.get("OptInStatus", "")
> +
> + # Region display names
> + display_names = {
> + "us-east-1": "US East (N. Virginia)",
> + "us-east-2": "US East (Ohio)",
> + "us-west-1": "US West (N. California)",
> + "us-west-2": "US West (Oregon)",
> + "eu-west-1": "Europe (Ireland)",
> + "eu-west-2": "Europe (London)",
> + "eu-west-3": "Europe (Paris)",
> + "eu-central-1": "Europe (Frankfurt)",
> + "eu-north-1": "Europe (Stockholm)",
> + "ap-southeast-1": "Asia Pacific (Singapore)",
> + "ap-southeast-2": "Asia Pacific (Sydney)",
> + "ap-northeast-1": "Asia Pacific (Tokyo)",
> + "ap-northeast-2": "Asia Pacific (Seoul)",
> + "ap-south-1": "Asia Pacific (Mumbai)",
> + "ca-central-1": "Canada (Central)",
> + "sa-east-1": "South America (São Paulo)",
> + }
> +
> + display_name = display_names.get(region_name, region_name.replace("-", " ").title())
> +
> + help_text = f"\t Region: {display_name}"
> + if opt_in_status and opt_in_status != "opt-in-not-required":
> + help_text += f"\n\t Status: {opt_in_status}"
> +
> + config = f"""config TERRAFORM_AWS_REGION_{safe_name}
> +\tbool "{display_name}"
> +\thelp
> +{help_text}
> +
> +"""
> + return config
> +
> +
> +def get_gpu_amis(region: str = None) -> List[Dict[str, Any]]:
> + """
> + Get available GPU-optimized AMIs including Deep Learning AMIs.
> +
> + Args:
> + region: AWS region
> +
> + Returns:
> + List of AMI information
> + """
> + # Query for Deep Learning AMIs from AWS
> + cmd = ["ec2", "describe-images"]
> + filters = [
> + "Name=owner-alias,Values=amazon",
> + "Name=name,Values=Deep Learning AMI GPU*",
> + "Name=state,Values=available",
> + "Name=architecture,Values=x86_64",
> + ]
> + cmd.append("--filters")
> + cmd.extend(filters)
> + cmd.extend(["--query", "Images[?contains(Name, '2024') || contains(Name, '2025')]"])
> +
> + response = run_aws_command(cmd, region=region)
> +
> + if response:
> + # Sort by creation date to get the most recent
> + response.sort(key=lambda x: x.get("CreationDate", ""), reverse=True)
> + return response[:10] # Return top 10 most recent
> + return []
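
The JMESPath filter hard-codes '2024' and '2025', so this query silently
returns nothing once those names age out. Filtering client-side on
CreationDate with a sliding window avoids the time bomb; sketch
(recent_amis is a hypothetical helper, window length is arbitrary):

```python
from datetime import datetime, timedelta, timezone

def recent_amis(images, max_age_days=365):
    """Keep AMIs created within max_age_days instead of matching year strings."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    recent = []
    for img in images:
        created = img.get("CreationDate", "")  # e.g. "2025-01-15T03:04:05.000Z"
        try:
            when = datetime.fromisoformat(created.replace("Z", "+00:00"))
        except ValueError:
            continue  # missing or malformed date: skip the image
        if when >= cutoff:
            recent.append(img)
    recent.sort(key=lambda i: i.get("CreationDate", ""), reverse=True)
    return recent[:10]
```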
> +
> +
> +def generate_gpu_amis_kconfig() -> str:
> + """Generate Kconfig content for GPU AMIs."""
> + # Check if AWS CLI is available
> + if not check_aws_cli():
> + return generate_default_gpu_amis_kconfig()
> +
> + # Get available GPU AMIs
> + amis = get_gpu_amis()
> +
> + if not amis:
> + return generate_default_gpu_amis_kconfig()
> +
> + kconfig = """# GPU-optimized AMIs (dynamically generated)
> +
> +# GPU AMI Override - only shown for GPU instances
> +config TERRAFORM_AWS_USE_GPU_AMI
> + bool "Use GPU-optimized AMI instead of standard distribution"
> + depends on TERRAFORM_AWS_IS_GPU_INSTANCE
> + output yaml
> + default n
> + help
> + Enable this to use a GPU-optimized AMI with pre-installed NVIDIA drivers,
> + CUDA, and ML frameworks instead of the standard distribution AMI.
> +
> + When disabled, the standard distribution AMI will be used and you'll need
> + to install GPU drivers manually.
> +
> +if TERRAFORM_AWS_USE_GPU_AMI
> +
> +choice
> + prompt "GPU-optimized AMI selection"
> + default TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> + depends on TERRAFORM_AWS_IS_GPU_INSTANCE
> + help
> + Select which GPU-optimized AMI to use for your GPU instance.
> +
> +config TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> + bool "AWS Deep Learning AMI (Ubuntu 22.04)"
> + help
> + AWS Deep Learning AMI with NVIDIA drivers, CUDA, cuDNN, and popular ML frameworks.
> + Optimized for machine learning workloads on GPU instances.
> + Includes: TensorFlow, PyTorch, MXNet, and Jupyter.
> +
> +config TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING_NVIDIA
> + bool "NVIDIA Deep Learning AMI"
> + help
> + NVIDIA optimized Deep Learning AMI with latest GPU drivers.
> + Includes NVIDIA GPU Cloud (NGC) containers and frameworks.
> +
> +config TERRAFORM_AWS_GPU_AMI_CUSTOM
> + bool "Custom GPU AMI"
> + help
> + Specify a custom AMI ID for GPU instances.
> +
> +endchoice
> +
> +if TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> +
> +config TERRAFORM_AWS_GPU_AMI_NAME
> + string
> + output yaml
> + default "Deep Learning AMI GPU TensorFlow*"
> + help
> + AMI name pattern for AWS Deep Learning AMI.
> +
> +config TERRAFORM_AWS_GPU_AMI_OWNER
> + string
> + output yaml
> + default "amazon"
> +
> +endif # TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> +
> +if TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING_NVIDIA
> +
> +config TERRAFORM_AWS_GPU_AMI_NAME
> + string
> + output yaml
> + default "NVIDIA Deep Learning AMI*"
> + help
> + AMI name pattern for NVIDIA Deep Learning AMI.
> +
> +config TERRAFORM_AWS_GPU_AMI_OWNER
> + string
> + output yaml
> + default "amazon"
> +
> +endif # TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING_NVIDIA
> +
> +if TERRAFORM_AWS_GPU_AMI_CUSTOM
> +
> +config TERRAFORM_AWS_GPU_AMI_ID
> + string "Custom GPU AMI ID"
> + output yaml
> + help
> + Specify the AMI ID for your custom GPU image.
> + Example: ami-0123456789abcdef0
> +
> +endif # TERRAFORM_AWS_GPU_AMI_CUSTOM
> +
> +endif # TERRAFORM_AWS_USE_GPU_AMI
> +
> +# GPU instance detection
> +config TERRAFORM_AWS_IS_GPU_INSTANCE
> + bool
> + output yaml
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6E
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5G
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4DN
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4AD
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5EN
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4D
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4DE
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3DN
> + default n
> + help
> + Automatically detected based on selected instance type.
> + This indicates whether the selected instance has GPU support.
> +
> +"""
> +
> + return kconfig
> +
> +
> +def generate_default_gpu_amis_kconfig() -> str:
> + """Generate default GPU AMI Kconfig when AWS CLI is not available."""
> + return """# GPU-optimized AMIs (default - AWS CLI not available)
> +
> +# GPU AMI Override - only shown for GPU instances
> +config TERRAFORM_AWS_USE_GPU_AMI
> + bool "Use GPU-optimized AMI instead of standard distribution"
> + depends on TERRAFORM_AWS_IS_GPU_INSTANCE
> + output yaml
> + default n
> + help
> + Enable this to use a GPU-optimized AMI with pre-installed NVIDIA drivers,
> + CUDA, and ML frameworks instead of the standard distribution AMI.
> + Note: AWS CLI is not available, showing default options.
> +
> +if TERRAFORM_AWS_USE_GPU_AMI
> +
> +choice
> + prompt "GPU-optimized AMI selection"
> + default TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> + depends on TERRAFORM_AWS_IS_GPU_INSTANCE
> + help
> + Select which GPU-optimized AMI to use for your GPU instance.
> +
> +config TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> + bool "AWS Deep Learning AMI (Ubuntu 22.04)"
> + help
> + Pre-configured with NVIDIA drivers, CUDA, and ML frameworks.
> +
> +config TERRAFORM_AWS_GPU_AMI_CUSTOM
> + bool "Custom GPU AMI"
> +
> +endchoice
> +
> +if TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> +
> +config TERRAFORM_AWS_GPU_AMI_NAME
> + string
> + output yaml
> + default "Deep Learning AMI GPU TensorFlow*"
> +
> +config TERRAFORM_AWS_GPU_AMI_OWNER
> + string
> + output yaml
> + default "amazon"
> +
> +endif # TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> +
> +if TERRAFORM_AWS_GPU_AMI_CUSTOM
> +
> +config TERRAFORM_AWS_GPU_AMI_ID
> + string "Custom GPU AMI ID"
> + output yaml
> + help
> + Specify the AMI ID for your custom GPU image.
> +
> +endif # TERRAFORM_AWS_GPU_AMI_CUSTOM
> +
> +endif # TERRAFORM_AWS_USE_GPU_AMI
> +
> +# GPU instance detection (static)
> +config TERRAFORM_AWS_IS_GPU_INSTANCE
> + bool
> + output yaml
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6E
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5G
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4DN
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4AD
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5EN
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4D
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4DE
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3DN
> + default n
> + help
> + Automatically detected based on selected instance type.
> + This indicates whether the selected instance has GPU support.
> +
> +"""
> +
> +
> +def generate_default_regions_kconfig() -> str:
> + """Generate default Kconfig content when AWS CLI is not available."""
> + return """# AWS regions (default - AWS CLI not available)
> +
> +choice
> + prompt "AWS region"
> + default TERRAFORM_AWS_REGION_USEAST1
> + help
> + Select the AWS region for your deployment.
> + Note: AWS CLI is not available, showing default options.
> +
> +# US Regions
> +config TERRAFORM_AWS_REGION_USEAST1
> + bool "US East (N. Virginia)"
> +
> +config TERRAFORM_AWS_REGION_USEAST2
> + bool "US East (Ohio)"
> +
> +config TERRAFORM_AWS_REGION_USWEST1
> + bool "US West (N. California)"
> +
> +config TERRAFORM_AWS_REGION_USWEST2
> + bool "US West (Oregon)"
> +
> +# Europe Regions
> +config TERRAFORM_AWS_REGION_EUWEST1
> + bool "Europe (Ireland)"
> +
> +config TERRAFORM_AWS_REGION_EUCENTRAL1
> + bool "Europe (Frankfurt)"
> +
> +# Asia Pacific Regions
> +config TERRAFORM_AWS_REGION_APSOUTHEAST1
> + bool "Asia Pacific (Singapore)"
> +
> +config TERRAFORM_AWS_REGION_APNORTHEAST1
> + bool "Asia Pacific (Tokyo)"
> +
> +endchoice
> +
> +config TERRAFORM_AWS_REGION
> + string
> + default "us-east-1" if TERRAFORM_AWS_REGION_USEAST1
> + default "us-east-2" if TERRAFORM_AWS_REGION_USEAST2
> + default "us-west-1" if TERRAFORM_AWS_REGION_USWEST1
> + default "us-west-2" if TERRAFORM_AWS_REGION_USWEST2
> + default "eu-west-1" if TERRAFORM_AWS_REGION_EUWEST1
> + default "eu-central-1" if TERRAFORM_AWS_REGION_EUCENTRAL1
> + default "ap-southeast-1" if TERRAFORM_AWS_REGION_APSOUTHEAST1
> + default "ap-northeast-1" if TERRAFORM_AWS_REGION_APNORTHEAST1
> + default "us-east-1"
> +
> +"""
> diff --git a/scripts/dynamic-cloud-kconfig.Makefile b/scripts/dynamic-cloud-kconfig.Makefile
> index e15651ab..4105e706 100644
> --- a/scripts/dynamic-cloud-kconfig.Makefile
> +++ b/scripts/dynamic-cloud-kconfig.Makefile
> @@ -12,9 +12,24 @@ LAMBDALABS_KCONFIG_IMAGES := $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.generated
>
> LAMBDALABS_KCONFIGS := $(LAMBDALABS_KCONFIG_COMPUTE) $(LAMBDALABS_KCONFIG_LOCATION) $(LAMBDALABS_KCONFIG_IMAGES)
>
> +# AWS dynamic configuration
> +AWS_KCONFIG_DIR := terraform/aws/kconfigs
> +AWS_KCONFIG_COMPUTE := $(AWS_KCONFIG_DIR)/Kconfig.compute.generated
> +AWS_KCONFIG_LOCATION := $(AWS_KCONFIG_DIR)/Kconfig.location.generated
> +AWS_INSTANCE_TYPES_DIR := $(AWS_KCONFIG_DIR)/instance-types
> +
> +# List of AWS instance type family files that will be generated
> +AWS_INSTANCE_TYPE_FAMILIES := m5 m7a t3 t3a c5 c7a i4i is4gen im4gn
> +AWS_INSTANCE_TYPE_KCONFIGS := $(foreach family,$(AWS_INSTANCE_TYPE_FAMILIES),$(AWS_INSTANCE_TYPES_DIR)/Kconfig.$(family).generated)
> +
> +AWS_KCONFIGS := $(AWS_KCONFIG_COMPUTE) $(AWS_KCONFIG_LOCATION) $(AWS_INSTANCE_TYPE_KCONFIGS)
> +
> # Add Lambda Labs generated files to mrproper clean list
> KDEVOPS_MRPROPER += $(LAMBDALABS_KCONFIGS)
>
> +# Add AWS generated files to mrproper clean list
> +KDEVOPS_MRPROPER += $(AWS_KCONFIGS)
> +
> # Touch Lambda Labs generated files so Kconfig can source them
> # This ensures the files exist (even if empty) before Kconfig runs
> dynamic_lambdalabs_kconfig_touch:
> @@ -22,20 +37,43 @@ dynamic_lambdalabs_kconfig_touch:
>
> DYNAMIC_KCONFIG += dynamic_lambdalabs_kconfig_touch
>
> +# Touch AWS generated files so Kconfig can source them
> +# This ensures the files exist (even if empty) before Kconfig runs
> +dynamic_aws_kconfig_touch:
> + $(Q)mkdir -p $(AWS_INSTANCE_TYPES_DIR)
> + $(Q)touch $(AWS_KCONFIG_COMPUTE) $(AWS_KCONFIG_LOCATION)
> + $(Q)touch $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.generated
> + $(Q)for family in $(AWS_INSTANCE_TYPE_FAMILIES); do \
> + touch $(AWS_INSTANCE_TYPES_DIR)/Kconfig.$$family.generated; \
> + done
> +
> +DYNAMIC_KCONFIG += dynamic_aws_kconfig_touch
> +
> # Individual Lambda Labs targets are now handled by generate_cloud_configs.py
> cloud-config-lambdalabs:
> $(Q)python3 scripts/generate_cloud_configs.py
>
> +# Individual AWS targets are now handled by generate_cloud_configs.py
> +cloud-config-aws:
> + $(Q)python3 scripts/generate_cloud_configs.py
> +
> # Clean Lambda Labs generated files
> clean-cloud-config-lambdalabs:
> $(Q)rm -f $(LAMBDALABS_KCONFIGS)
>
> -DYNAMIC_CLOUD_KCONFIG += cloud-config-lambdalabs
> +# Clean AWS generated files
> +clean-cloud-config-aws:
> + $(Q)rm -f $(AWS_KCONFIGS)
> + $(Q)rm -f .aws_cloud_config_generated
> +
> +DYNAMIC_CLOUD_KCONFIG += cloud-config-lambdalabs cloud-config-aws
>
> cloud-config-help:
> @echo "Cloud-specific dynamic kconfig targets:"
> @echo "cloud-config - generates all cloud provider dynamic kconfig content"
> @echo "cloud-config-lambdalabs - generates Lambda Labs dynamic kconfig content"
> + @echo "cloud-config-aws - generates AWS dynamic kconfig content"
> + @echo "cloud-update - converts generated cloud configs to static (for committing)"
> @echo "clean-cloud-config - removes all generated cloud kconfig files"
> @echo "cloud-list-all - list all cloud instances for configured provider"
>
> @@ -44,11 +82,55 @@ HELP_TARGETS += cloud-config-help
> cloud-config:
> $(Q)python3 scripts/generate_cloud_configs.py
>
> -clean-cloud-config: clean-cloud-config-lambdalabs
> +clean-cloud-config: clean-cloud-config-lambdalabs clean-cloud-config-aws
> + $(Q)rm -f .cloud.initialized
> $(Q)echo "Cleaned all cloud provider dynamic Kconfig files."
>
> cloud-list-all:
> $(Q)chmod +x scripts/cloud_list_all.sh
> $(Q)scripts/cloud_list_all.sh
>
> -PHONY += cloud-config cloud-config-lambdalabs clean-cloud-config clean-cloud-config-lambdalabs cloud-config-help cloud-list-all
> +# Convert dynamically generated cloud configs to static versions for git commits
> +# This allows admins to generate configs once and commit them for regular users
> +cloud-update:
> + @echo "Converting generated cloud configs to static versions..."
> + # AWS configs
> + $(Q)if [ -f $(AWS_KCONFIG_COMPUTE) ]; then \
> + cp $(AWS_KCONFIG_COMPUTE) $(AWS_KCONFIG_DIR)/Kconfig.compute.static; \
> + sed -i 's/Kconfig\.\([^.]*\)\.generated/Kconfig.\1.static/g' $(AWS_KCONFIG_DIR)/Kconfig.compute.static; \
> + echo " Created $(AWS_KCONFIG_DIR)/Kconfig.compute.static"; \
> + fi
> + $(Q)if [ -f $(AWS_KCONFIG_LOCATION) ]; then \
> + cp $(AWS_KCONFIG_LOCATION) $(AWS_KCONFIG_DIR)/Kconfig.location.static; \
> + sed -i 's/Kconfig\.\([^.]*\)\.generated/Kconfig.\1.static/g' $(AWS_KCONFIG_DIR)/Kconfig.location.static; \
> + echo " Created $(AWS_KCONFIG_DIR)/Kconfig.location.static"; \
> + fi
> + $(Q)if [ -f $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.generated ]; then \
> + cp $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.generated $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.static; \
> + sed -i 's/Kconfig\.\([^.]*\)\.generated/Kconfig.\1.static/g' $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.static; \
> + echo " Created $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.static"; \
> + fi
> + # AWS instance type families
> + $(Q)for file in $(AWS_INSTANCE_TYPES_DIR)/Kconfig.*.generated; do \
> + if [ -f "$$file" ]; then \
> + static_file=$$(echo "$$file" | sed 's/\.generated$$/\.static/'); \
> + cp "$$file" "$$static_file"; \
> + echo " Created $$static_file"; \
> + fi; \
> + done
> + # Lambda Labs configs
> + $(Q)if [ -f $(LAMBDALABS_KCONFIG_COMPUTE) ]; then \
> + cp $(LAMBDALABS_KCONFIG_COMPUTE) $(LAMBDALABS_KCONFIG_DIR)/Kconfig.compute.static; \
> + echo " Created $(LAMBDALABS_KCONFIG_DIR)/Kconfig.compute.static"; \
> + fi
> + $(Q)if [ -f $(LAMBDALABS_KCONFIG_LOCATION) ]; then \
> + cp $(LAMBDALABS_KCONFIG_LOCATION) $(LAMBDALABS_KCONFIG_DIR)/Kconfig.location.static; \
> + echo " Created $(LAMBDALABS_KCONFIG_DIR)/Kconfig.location.static"; \
> + fi
> + $(Q)if [ -f $(LAMBDALABS_KCONFIG_IMAGES) ]; then \
> + cp $(LAMBDALABS_KCONFIG_IMAGES) $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.static; \
> + echo " Created $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.static"; \
> + fi
> + @echo "Static cloud configs created. You can now commit these .static files to git."
> +
> +PHONY += cloud-config cloud-config-lambdalabs cloud-config-aws clean-cloud-config clean-cloud-config-lambdalabs clean-cloud-config-aws cloud-config-help cloud-list-all cloud-update
> diff --git a/scripts/generate_cloud_configs.py b/scripts/generate_cloud_configs.py
> index b16294dd..332cebe7 100755
> --- a/scripts/generate_cloud_configs.py
> +++ b/scripts/generate_cloud_configs.py
> @@ -10,6 +10,9 @@ import os
> import sys
> import subprocess
> import json
> +from concurrent.futures import ThreadPoolExecutor, as_completed
> +from pathlib import Path
> +from typing import Tuple
>
>
> def generate_lambdalabs_kconfig() -> bool:
> @@ -100,29 +103,194 @@ def get_lambdalabs_summary() -> tuple[bool, str]:
> return False, "Lambda Labs: Error querying API - using defaults"
>
>
> +def generate_aws_kconfig() -> bool:
> + """
> + Generate AWS Kconfig files.
> + Returns True on success, False on failure.
> + """
> + script_dir = os.path.dirname(os.path.abspath(__file__))
> + cli_path = os.path.join(script_dir, "aws-cli")
> +
> + # Generate the Kconfig files
> + result = subprocess.run(
> + [cli_path, "generate-kconfig"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + return result.returncode == 0
> +
> +
> +def get_aws_summary() -> tuple[bool, str]:
> + """
> + Get a summary of AWS configurations using aws-cli.
> + Returns (success, summary_string)
> + """
> + script_dir = os.path.dirname(os.path.abspath(__file__))
> + cli_path = os.path.join(script_dir, "aws-cli")
> +
> + try:
> + # Check if AWS CLI is available
> + result = subprocess.run(
> + ["aws", "--version"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + if result.returncode != 0:
> + return False, "AWS: AWS CLI not installed - using defaults"
> +
> + # Check if credentials are configured
> + result = subprocess.run(
> + ["aws", "sts", "get-caller-identity"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + if result.returncode != 0:
> + return False, "AWS: Credentials not configured - using defaults"
> +
> + # Get instance types count
> + result = subprocess.run(
> + [
> + cli_path,
> + "--output",
> + "json",
> + "instance-types",
> + "list",
> + "--max-results",
> + "100",
> + ],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + if result.returncode != 0:
> + return False, "AWS: Error querying API - using defaults"
> +
> + instances = json.loads(result.stdout)
> + instance_count = len(instances)
> +
> + # Get regions
> + result = subprocess.run(
> + [cli_path, "--output", "json", "regions", "list"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + if result.returncode == 0:
> + regions = json.loads(result.stdout)
> + region_count = len(regions)
> + else:
> + region_count = 0
> +
> + # Get price range from a sample of instances
> + prices = []
> + for instance in instances[:20]: # Sample first 20 for speed
> + if "error" not in instance:
> + # Extract price if available (would need pricing API)
> + # For now, we'll use a placeholder
> + vcpus = instance.get("vcpu", 0)
> + if vcpus > 0:
> + # Rough estimate: $0.05 per vCPU/hour
> + estimated_price = vcpus * 0.05
> + prices.append(estimated_price)
> +
> + # Format summary
> + if prices:
> + min_price = min(prices)
> + max_price = max(prices)
> + price_range = f"~${min_price:.2f}-${max_price:.2f}/hr"
> + else:
> + price_range = "pricing varies by region"
> +
> + return (
> + True,
> + f"AWS: {instance_count} instance types available, "
> + f"{region_count} regions, {price_range}",
> + )
> +
> + except (subprocess.SubprocessError, json.JSONDecodeError, KeyError):
> + return False, "AWS: Error querying API - using defaults"
> +
> +
> +def process_lambdalabs() -> Tuple[bool, bool, str]:
> + """Process Lambda Labs configuration generation and summary.
> + Returns (kconfig_generated, summary_success, summary_text)
> + """
> + kconfig_generated = generate_lambdalabs_kconfig()
> + success, summary = get_lambdalabs_summary()
> + return kconfig_generated, success, summary
> +
> +
> +def process_aws() -> Tuple[bool, bool, str]:
> + """Process AWS configuration generation and summary.
> + Returns (kconfig_generated, summary_success, summary_text)
> + """
> + kconfig_generated = generate_aws_kconfig()
> + success, summary = get_aws_summary()
> +
> + # Create marker file to indicate dynamic AWS config is available
> + if kconfig_generated:
> + marker_file = Path(".aws_cloud_config_generated")
> + marker_file.touch()
> +
> + return kconfig_generated, success, summary
> +
> +
> def main():
> """Main function to generate cloud configurations."""
> print("Cloud Provider Configuration Summary")
> print("=" * 60)
> print()
>
> - # Lambda Labs - Generate Kconfig files first
> - kconfig_generated = generate_lambdalabs_kconfig()
> + # Run cloud provider operations in parallel
> + results = {}
> + any_success = False
>
> - # Lambda Labs - Get summary
> - success, summary = get_lambdalabs_summary()
> - if success:
> - print(f"✓ {summary}")
> - if kconfig_generated:
> - print(" Kconfig files generated successfully")
> - else:
> - print(" Warning: Failed to generate Kconfig files")
> - else:
> - print(f"⚠ {summary}")
> - print()
> + with ThreadPoolExecutor(max_workers=4) as executor:
> + # Submit all tasks
> + futures = {
> + executor.submit(process_lambdalabs): "lambdalabs",
> + executor.submit(process_aws): "aws",
> + }
> +
> + # Process results as they complete
> + for future in as_completed(futures):
> + provider = futures[future]
> + try:
> + results[provider] = future.result()
> + except Exception as e:
> + results[provider] = (
> + False,
> + False,
> + f"{provider.upper()}: Error - {str(e)}",
> + )
> +
> + # Display results in consistent order
> + for provider in ["lambdalabs", "aws"]:
> + if provider in results:
> + kconfig_gen, success, summary = results[provider]
> + if success and kconfig_gen:
> + any_success = True
> + if success:
> + print(f"✓ {summary}")
> + if kconfig_gen:
> + print(" Kconfig files generated successfully")
> + else:
> + print(" Warning: Failed to generate Kconfig files")
> + else:
> + print(f"⚠ {summary}")
> + print()
>
> - # AWS (placeholder - not implemented)
> - print("⚠ AWS: Dynamic configuration not yet implemented")
> + # Create .cloud.initialized if any provider succeeded
> + if any_success:
> + Path(".cloud.initialized").touch()
>
> # Azure (placeholder - not implemented)
> print("⚠ Azure: Dynamic configuration not yet implemented")
> diff --git a/terraform/aws/kconfigs/Kconfig.compute b/terraform/aws/kconfigs/Kconfig.compute
> index bae0ea1c..6b5ff900 100644
> --- a/terraform/aws/kconfigs/Kconfig.compute
> +++ b/terraform/aws/kconfigs/Kconfig.compute
> @@ -1,94 +1,54 @@
> -choice
> - prompt "AWS instance types"
> - help
> - Instance types comprise varying combinations of hardware
> - platform, CPU count, memory size, storage, and networking
> - capacity. Select the type that provides an appropriate mix
> - of resources for your preferred workflows.
> -
> - Some instance types are region- and capacity-limited.
> -
> - See https://aws.amazon.com/ec2/instance-types/ for
> - details.
> +# AWS compute configuration
>
> -config TERRAFORM_AWS_INSTANCE_TYPE_M5
> - bool "M5"
> - depends on TARGET_ARCH_X86_64
> +config TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> + bool "Use dynamically generated instance types"
> + default $(shell, test -f .aws_cloud_config_generated && echo y || echo n)
> help
> - This is a general purpose type powered by Intel Xeon®
> - Platinum 8175M or 8259CL processors (Skylake or Cascade
> - Lake).
> + Enable this to use dynamically generated instance types from AWS CLI.
> + Run 'make cloud-config' to query AWS and generate available options.
> + When disabled, uses static predefined instance types.
>
> - See https://aws.amazon.com/ec2/instance-types/m5/ for
> - details.
> + This is automatically enabled when you run 'make cloud-config'.
>
> -config TERRAFORM_AWS_INSTANCE_TYPE_M7A
> - bool "M7a"
> - depends on TARGET_ARCH_X86_64
> - help
> - This is a general purpose type powered by 4th Generation
> - AMD EPYC processors.
> -
> - See https://aws.amazon.com/ec2/instance-types/m7a/ for
> - details.
> -
> -config TERRAFORM_AWS_INSTANCE_TYPE_I4I
> - bool "I4i"
> - depends on TARGET_ARCH_X86_64
> - help
> - This is a storage-optimized type powered by 3rd generation
> - Intel Xeon Scalable processors (Ice Lake) and use AWS Nitro
> - NVMe SSDs.
> +if TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> +# Include cloud-generated or static instance families
> +# Try static first (pre-generated by admins for faster loading)
> +# Fall back to generated files (requires AWS CLI)
> +source "terraform/aws/kconfigs/Kconfig.compute.static"
> +endif
>
> - See https://aws.amazon.com/ec2/instance-types/i4i/ for
> - details.
> -
> -config TERRAFORM_AWS_INSTANCE_TYPE_IS4GEN
> - bool "Is4gen"
> - depends on TARGET_ARCH_ARM64
> - help
> - This is a Storage-optimized type powered by AWS Graviton2
> - processors.
> -
> - See https://aws.amazon.com/ec2/instance-types/i4g/ for
> - details.
> -
> -config TERRAFORM_AWS_INSTANCE_TYPE_IM4GN
> - bool "Im4gn"
> - depends on TARGET_ARCH_ARM64
> +if !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> +# Static instance types when not using dynamic config
> +choice
> + prompt "AWS instance types"
> help
> - This is a storage-optimized type powered by AWS Graviton2
> - processors.
> + Instance types comprise varying combinations of hardware
> + platform, CPU count, memory size, storage, and networking
> + capacity. Select the type that provides an appropriate mix
> + of resources for your preferred workflows.
>
> - See https://aws.amazon.com/ec2/instance-types/i4g/ for
> - details.
> + Some instance types are region- and capacity-limited.
>
> -config TERRAFORM_AWS_INSTANCE_TYPE_C7A
> - depends on TARGET_ARCH_X86_64
> - bool "c7a"
> - help
> - This is a compute-optimized type powered by 4th generation
> - AMD EPYC processors.
> + See https://aws.amazon.com/ec2/instance-types/ for
> + details.
>
> - See https://aws.amazon.com/ec2/instance-types/c7a/ for
> - details.
>
> endchoice
> +endif # !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
>
> +if !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> +# Use static instance type definitions when not using dynamic config
> source "terraform/aws/kconfigs/instance-types/Kconfig.m5"
> source "terraform/aws/kconfigs/instance-types/Kconfig.m7a"
> -source "terraform/aws/kconfigs/instance-types/Kconfig.i4i"
> -source "terraform/aws/kconfigs/instance-types/Kconfig.is4gen"
> -source "terraform/aws/kconfigs/instance-types/Kconfig.im4gn"
> -source "terraform/aws/kconfigs/instance-types/Kconfig.c7a"
> +endif # !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
>
> choice
> prompt "Linux distribution"
> default TERRAFORM_AWS_DISTRO_DEBIAN
> help
> - Select a popular Linux distribution to install on your
> - instances, or use the "Custom AMI image" selection to
> - choose an image that is off the beaten path.
> + Select a popular Linux distribution to install on your
> + instances, or use the "Custom AMI image" selection to
> + choose an image that is off the beaten path.
>
> config TERRAFORM_AWS_DISTRO_AMAZON
> bool "Amazon Linux"
> @@ -120,3 +80,8 @@ source "terraform/aws/kconfigs/distros/Kconfig.oracle"
> source "terraform/aws/kconfigs/distros/Kconfig.rhel"
> source "terraform/aws/kconfigs/distros/Kconfig.sles"
> source "terraform/aws/kconfigs/distros/Kconfig.custom"
> +
> +# Include GPU AMI configuration if available (generated by cloud-config)
> +if TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> +source "terraform/aws/kconfigs/Kconfig.gpu-amis.static"
> +endif
--
Chuck Lever
2025-09-04 9:00 [PATCH 0/3] aws: add dynamic kconfig support Luis Chamberlain
2025-09-04 9:00 ` [PATCH 1/3] aws: add dynamic cloud configuration support using AWS CLI Luis Chamberlain
2025-09-04 13:55 ` Chuck Lever [this message]
2025-09-04 17:12 ` Luis Chamberlain
2025-09-04 9:00 ` [PATCH 2/3] aws: run make cloud-update Luis Chamberlain
2025-09-04 9:00 ` [PATCH 3/3] lambda: " Luis Chamberlain