From: Chuck Lever <cel@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>,
Daniel Gomez <da.gomez@kruces.com>,
kdevops@lists.linux.dev
Subject: Re: [PATCH 1/3] aws: add dynamic cloud configuration support using AWS CLI
Date: Thu, 4 Sep 2025 09:55:46 -0400 [thread overview]
Message-ID: <d38f8fc3-3a44-4de7-b463-b258cbcd66c4@kernel.org> (raw)
In-Reply-To: <20250904090030.2481840-2-mcgrof@kernel.org>
On 9/4/25 5:00 AM, Luis Chamberlain wrote:
> Add support for dynamically generating AWS instance types and regions
> configuration using the AWS CLI, similar to the Lambda Labs implementation.
>
> This allows users to:
> - Query real-time AWS instance availability
> - Generate Kconfig files with current instance families and regions
> - Choose between dynamic and static configuration modes
> - See pricing estimates and resource summaries
>
> Key components:
> - scripts/aws-cli: AWS CLI wrapper tool for kdevops
> - scripts/aws_api.py: Low-level AWS API functions
> - Updated generate_cloud_configs.py to support AWS
> - Makefile integration for AWS Kconfig generation
> - Option to use dynamic or static AWS configuration
>
> Usage: Run 'make cloud-config' to generate dynamic configuration.
I'd like to see more documentation for this make target. Is this a
target to be run as part of every workflow, or is it one that developers
run every once in a while? (I think the latter, based on the next patch
in this series, but it would be nice to put that in docs somewhere).
For example, ISTR a docs file that describes "make refs-default".
> This also parallelizes cloud provider operations to significantly
> speed up generation.
>
> $ time make cloud-config
> Cloud Provider Configuration Summary
> ============================================================
>
> ✓ Lambda Labs: 14/20 instances available, 14 regions, $0.50-$10.32/hr
> Kconfig files generated successfully
>
> ✓ AWS: 979 instance types available, 17 regions, ~$0.05-$3.60/hr
> Kconfig files generated successfully
>
> ⚠ Azure: Dynamic configuration not yet implemented
> ⚠ GCE: Dynamic configuration not yet implemented
>
> Note: Dynamic configurations query real-time availability
> Run 'make menuconfig' to configure your cloud provider
>
> real 6m51.859s
> user 37m16.347s
> sys 3m8.130s
I spent a little time yesterday adding new instance families by hand,
after asking the AWS instance type assistant to recommend appropriate
families for CI/CD. I added: t3, t3a, m7i-flex, and the two g4 families.
Other families looked too expensive or might be more than is needed for
CI/CD (like who needs 16 GPUs, 192 vCPUs, and 8 200GbE adapters for a
development system? ;-)
What I'd like to do is prevent an overload of choices in the instance
menus -- perhaps we can maintain a list somewhere of the families we'd
like to add to the menu and let the scripting consult that list as it
constructs the menu.
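A minimal sketch of the kind of filter I mean (the allowlist path, file
format, and function name here are hypothetical, just to illustrate):

```python
from pathlib import Path


def filter_families(discovered: set, allowlist: Path) -> set:
    """Keep only instance families named in a maintainer-curated
    allowlist file (one family per line, '#' starts a comment).
    Fall back to everything discovered if no allowlist exists."""
    if not allowlist.exists():
        return discovered
    wanted = {
        line.split("#", 1)[0].strip()
        for line in allowlist.read_text().splitlines()
    }
    wanted.discard("")
    return discovered & wanted
```

Then get_generated_instance_families() could intersect its discovered
set with the allowlist before any Kconfig menus are generated, and the
curated list lives in one obvious place under review like any other
change.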
(And, I can drop my by-hand patches... I didn't realize you were ready
to post your automation that does the same thing).
> This also adds support for GPU AMIs:
IMHO the new GPU AMI support is a good idea, but should be split into a
separate patch in this series.
> - AWS Deep Learning AMI with pre-installed NVIDIA drivers, CUDA, and
> ML frameworks
> - NVIDIA Deep Learning AMI option for NGC containers
> - Custom GPU AMI support for specialized images
> - Automatic detection of GPU instance types
> - We conditionally display GPU AMI options only for GPU instances
(I think OCI also provides distinct OS images with GPU user space tools,
fwiw).
> We automatically detect when you select a GPU instance family (like G6E)
> and provide appropriate GPU-optimized AMI options, including the AWS Deep
> Learning AMI with all necessary drivers and frameworks pre-installed.
>
> Generated-by: Claude AI
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
> .gitignore | 3 +
> defconfigs/aws-gpu-g6e-ai | 53 +
> .../templates/aws/terraform.tfvars.j2 | 5 +
> scripts/aws-cli | 436 +++++++
> scripts/aws_api.py | 1135 +++++++++++++++++
Note there is also an Ansible collection that provides the AWS API
to playbooks: amazon.aws. This is not a recommendation to
reimplement this patch, just pointing out there are other ways to
skin the cat.
> scripts/dynamic-cloud-kconfig.Makefile | 88 +-
> scripts/generate_cloud_configs.py | 198 ++-
> terraform/aws/kconfigs/Kconfig.compute | 109 +-
> 8 files changed, 1937 insertions(+), 90 deletions(-)
> create mode 100644 defconfigs/aws-gpu-g6e-ai
> create mode 100755 scripts/aws-cli
> create mode 100755 scripts/aws_api.py
>
> diff --git a/.gitignore b/.gitignore
> index 09d2ae33..30337add 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -115,3 +115,6 @@ terraform/lambdalabs/.terraform_api_key
> .cloud.initialized
>
> scripts/__pycache__/
> +.aws_cloud_config_generated
> +terraform/aws/kconfigs/*.generated
> +terraform/aws/kconfigs/instance-types/*.generated
> diff --git a/defconfigs/aws-gpu-g6e-ai b/defconfigs/aws-gpu-g6e-ai
> new file mode 100644
> index 00000000..affc7a98
> --- /dev/null
> +++ b/defconfigs/aws-gpu-g6e-ai
> @@ -0,0 +1,53 @@
> +# AWS G6e.2xlarge GPU instance with Deep Learning AMI for AI/ML workloads
> +# This configuration sets up an AWS G6e.2xlarge instance with NVIDIA L40S GPU
> +# optimized for machine learning, AI inference, and GPU-accelerated workloads
> +
> +# Cloud provider configuration
> +CONFIG_KDEVOPS_ENABLE_TERRAFORM=y
> +CONFIG_TERRAFORM=y
> +CONFIG_TERRAFORM_AWS=y
> +
> +# AWS Dynamic configuration (required for G6E instance family and GPU AMIs)
> +CONFIG_TERRAFORM_AWS_USE_DYNAMIC_CONFIG=y
> +
> +# AWS Instance configuration - G6E family with NVIDIA L40S GPU
> +# G6E.2XLARGE specifications:
> +# - 8 vCPUs (3rd Gen AMD EPYC processors)
> +# - 32 GB system RAM
> +# - 1x NVIDIA L40S Tensor Core GPU
> +# - 48 GB GPU memory
> +# - Up to 15 Gbps network performance
> +# - Up to 10 Gbps EBS bandwidth
> +CONFIG_TERRAFORM_AWS_INSTANCE_TYPE_G6E=y
> +CONFIG_TERRAFORM_AWS_INSTANCE_G6E_2XLARGE=y
> +
> +# AWS Region - US East (N. Virginia) - primary availability for G6E
> +CONFIG_TERRAFORM_AWS_REGION_US_EAST_1=y
> +
> +# GPU-optimized Deep Learning AMI
> +# Includes: NVIDIA drivers 535+, CUDA 12.x, cuDNN, TensorFlow, PyTorch, MXNet
> +CONFIG_TERRAFORM_AWS_USE_GPU_AMI=y
> +CONFIG_TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING=y
> +CONFIG_TERRAFORM_AWS_GPU_AMI_NAME="Deep Learning AMI GPU TensorFlow*"
> +CONFIG_TERRAFORM_AWS_GPU_AMI_OWNER="amazon"
> +
> +# Storage configuration optimized for ML workloads
> +# 200 GB for datasets, models, and experiment artifacts
> +CONFIG_TERRAFORM_AWS_DATA_VOLUME_SIZE=200
> +
> +# Basic workflow configuration for kernel development
> +CONFIG_WORKFLOWS=y
> +CONFIG_WORKFLOW_LINUX_CUSTOM=y
> +CONFIG_BOOTLINUX=y
> +
> +# Skip testing workflows for pure AI/ML setup
> +CONFIG_WORKFLOWS_TESTS=n
> +
> +# Enable systemd journal remote for debugging
> +CONFIG_DEVCONFIG_ENABLE_SYSTEMD_JOURNAL_REMOTE=y
> +
> +# Note: After provisioning, the instance will have:
> +# - Jupyter notebook server ready for ML experiments
> +# - Pre-installed deep learning frameworks
> +# - NVIDIA GPU drivers and CUDA toolkit
> +# - Docker with NVIDIA Container Toolkit for containerized ML workloads
> \ No newline at end of file
> diff --git a/playbooks/roles/gen_tfvars/templates/aws/terraform.tfvars.j2 b/playbooks/roles/gen_tfvars/templates/aws/terraform.tfvars.j2
> index d880254b..f8f4c842 100644
> --- a/playbooks/roles/gen_tfvars/templates/aws/terraform.tfvars.j2
> +++ b/playbooks/roles/gen_tfvars/templates/aws/terraform.tfvars.j2
> @@ -1,8 +1,13 @@
> aws_profile = "{{ terraform_aws_profile }}"
> aws_region = "{{ terraform_aws_region }}"
> aws_availability_zone = "{{ terraform_aws_av_zone }}"
> +{% if terraform_aws_use_gpu_ami is defined and terraform_aws_use_gpu_ami %}
> +aws_name_search = "{{ terraform_aws_gpu_ami_name }}"
> +aws_ami_owner = "{{ terraform_aws_gpu_ami_owner }}"
> +{% else %}
> aws_name_search = "{{ terraform_aws_ns }}"
> aws_ami_owner = "{{ terraform_aws_ami_owner }}"
> +{% endif %}
> aws_instance_type = "{{ terraform_aws_instance_type }}"
> aws_ebs_volumes_per_instance = "{{ terraform_aws_ebs_volumes_per_instance }}"
> aws_ebs_volume_size = {{ terraform_aws_ebs_volume_size }}
> diff --git a/scripts/aws-cli b/scripts/aws-cli
> new file mode 100755
> index 00000000..6cacce8b
> --- /dev/null
> +++ b/scripts/aws-cli
> @@ -0,0 +1,436 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: MIT
> +"""
> +AWS CLI tool for kdevops
> +
> +A structured CLI tool that wraps AWS CLI commands and provides access to
> +AWS cloud provider functionality for dynamic configuration generation
> +and resource management.
> +"""
> +
> +import argparse
> +import json
> +import sys
> +import os
> +from typing import Dict, List, Any, Optional, Tuple
> +from pathlib import Path
> +
> +# Import the AWS API functions
> +try:
> + from aws_api import (
> + check_aws_cli,
> + get_instance_types,
> + get_regions,
> + get_availability_zones,
> + get_pricing_info,
> + generate_instance_types_kconfig,
> + generate_regions_kconfig,
> + generate_instance_families_kconfig,
> + generate_gpu_amis_kconfig,
> + )
> +except ImportError:
> + # Try to import from scripts directory if not in path
> + sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
> + from aws_api import (
> + check_aws_cli,
> + get_instance_types,
> + get_regions,
> + get_availability_zones,
> + get_pricing_info,
> + generate_instance_types_kconfig,
> + generate_regions_kconfig,
> + generate_instance_families_kconfig,
> + generate_gpu_amis_kconfig,
> + )
> +
> +
> +class AWSCLI:
> + """AWS CLI interface for kdevops"""
> +
> + def __init__(self, output_format: str = "json"):
> + """
> + Initialize the CLI with specified output format
> +
> + Args:
> + output_format: 'json' or 'text' for output formatting
> + """
> + self.output_format = output_format
> + self.aws_available = check_aws_cli()
> +
> + def output(self, data: Any, headers: Optional[List[str]] = None):
> + """
> + Output data in the specified format
> +
> + Args:
> + data: Data to output (dict, list, or primitive)
> + headers: Column headers for text format (optional)
> + """
> + if self.output_format == "json":
> + print(json.dumps(data, indent=2))
> + else:
> + # Human-readable text format
> + if isinstance(data, list):
> + if data and isinstance(data[0], dict):
> + # Table format for list of dicts
> + if not headers:
> + headers = list(data[0].keys()) if data else []
> +
> + if headers:
> + # Calculate column widths
> + widths = {h: len(h) for h in headers}
> + for item in data:
> + for h in headers:
> + val = str(item.get(h, ""))
> + widths[h] = max(widths[h], len(val))
> +
> + # Print header
> + header_line = " | ".join(h.ljust(widths[h]) for h in headers)
> + print(header_line)
> + print("-" * len(header_line))
> +
> + # Print rows
> + for item in data:
> + row = " | ".join(
> + str(item.get(h, "")).ljust(widths[h]) for h in headers
> + )
> + print(row)
> + else:
> + # Simple list
> + for item in data:
> + print(item)
> + elif isinstance(data, dict):
> + # Key-value format
> + max_key_len = max(len(k) for k in data.keys()) if data else 0
> + for key, value in data.items():
> + print(f"{key.ljust(max_key_len)} : {value}")
> + else:
> + # Simple value
> + print(data)
> +
> + def list_instance_types(
> + self,
> + family: Optional[str] = None,
> + region: Optional[str] = None,
> + max_results: int = 100,
> + ) -> List[Dict[str, Any]]:
> + """
> + List instance types
> +
> + Args:
> + family: Filter by instance family (e.g., 'm5', 't3')
> + region: AWS region to query
> + max_results: Maximum number of results to return
> +
> + Returns:
> + List of instance type information
> + """
> + if not self.aws_available:
> + return [
> + {
> + "error": "AWS CLI not found. Please install AWS CLI and configure credentials."
> + }
> + ]
> +
> + instances = get_instance_types(
> + family=family, region=region, max_results=max_results
> + )
> +
> + # Format the results
> + result = []
> + for instance in instances:
> + item = {
> + "name": instance.get("InstanceType", ""),
> + "vcpu": instance.get("VCpuInfo", {}).get("DefaultVCpus", 0),
> + "memory_gb": instance.get("MemoryInfo", {}).get("SizeInMiB", 0) / 1024,
> + "instance_storage": instance.get("InstanceStorageSupported", False),
> + "network_performance": instance.get("NetworkInfo", {}).get(
> + "NetworkPerformance", ""
> + ),
> + "architecture": ", ".join(
> + instance.get("ProcessorInfo", {}).get("SupportedArchitectures", [])
> + ),
> + }
> + result.append(item)
> +
> + # Sort by name
> + result.sort(key=lambda x: x["name"])
> +
> + return result
> +
> + def list_regions(self, include_zones: bool = False) -> List[Dict[str, Any]]:
> + """
> + List regions
> +
> + Args:
> + include_zones: Include availability zones for each region
> +
> + Returns:
> + List of region information
> + """
> + if not self.aws_available:
> + return [
> + {
> + "error": "AWS CLI not found. Please install AWS CLI and configure credentials."
> + }
> + ]
> +
> + regions = get_regions()
> +
> + result = []
> + for region in regions:
> + item = {
> + "name": region.get("RegionName", ""),
> + "endpoint": region.get("Endpoint", ""),
> + "opt_in_status": region.get("OptInStatus", ""),
> + }
> +
> + if include_zones:
> + # Get availability zones for this region
> + zones = get_availability_zones(region["RegionName"])
> + item["zones"] = len(zones)
> + item["zone_names"] = ", ".join([z["ZoneName"] for z in zones])
> +
> + result.append(item)
> +
> + return result
> +
> + def get_cheapest_instance(
> + self,
> + region: Optional[str] = None,
> + family: Optional[str] = None,
> + min_vcpus: int = 2,
> + ) -> Dict[str, Any]:
> + """
> + Get the cheapest instance meeting criteria
> +
> + Args:
> + region: AWS region
> + family: Instance family filter
> + min_vcpus: Minimum number of vCPUs required
> +
> + Returns:
> + Dictionary with instance information
> + """
> + if not self.aws_available:
> + return {"error": "AWS CLI not available"}
> +
> + instances = get_instance_types(family=family, region=region)
> +
> + # Filter by minimum vCPUs
> + eligible = []
> + for instance in instances:
> + vcpus = instance.get("VCpuInfo", {}).get("DefaultVCpus", 0)
> + if vcpus >= min_vcpus:
> + eligible.append(instance)
> +
> + if not eligible:
> + return {"error": "No instances found matching criteria"}
> +
> + # Get pricing for eligible instances
> + pricing = get_pricing_info(region=region or "us-east-1")
> +
> + # Find cheapest
> + cheapest = None
> + cheapest_price = float("inf")
> +
> + for instance in eligible:
> + instance_type = instance.get("InstanceType")
> + price = pricing.get(instance_type, {}).get("on_demand", float("inf"))
> + if price < cheapest_price:
> + cheapest_price = price
> + cheapest = instance
> +
> + if cheapest:
> + return {
> + "instance_type": cheapest.get("InstanceType"),
> + "vcpus": cheapest.get("VCpuInfo", {}).get("DefaultVCpus", 0),
> + "memory_gb": cheapest.get("MemoryInfo", {}).get("SizeInMiB", 0) / 1024,
> + "price_per_hour": f"${cheapest_price:.3f}",
> + }
> +
> + return {"error": "Could not determine cheapest instance"}
> +
> + def generate_kconfig(self) -> bool:
> + """
> + Generate Kconfig files for AWS
> +
> + Returns:
> + True on success, False on failure
> + """
> + if not self.aws_available:
> + print("AWS CLI not available, cannot generate Kconfig", file=sys.stderr)
> + return False
> +
> + output_dir = Path("terraform/aws/kconfigs")
> +
> + # Create directory if it doesn't exist
> + output_dir.mkdir(parents=True, exist_ok=True)
> +
> + try:
> + from concurrent.futures import ThreadPoolExecutor, as_completed
> +
> + # Generate files in parallel
> + instance_types_dir = output_dir / "instance-types"
> + instance_types_dir.mkdir(exist_ok=True)
> +
> + def generate_family_file(family):
> + """Generate Kconfig for a single family."""
> + types_kconfig = generate_instance_types_kconfig(family)
> + if types_kconfig:
> + types_file = instance_types_dir / f"Kconfig.{family}.generated"
> + types_file.write_text(types_kconfig)
> + return f"Generated {types_file}"
> + return None
> +
> + with ThreadPoolExecutor(max_workers=10) as executor:
> + # Submit all generation tasks
> + futures = []
> +
> + # Generate instance families Kconfig
> + futures.append(executor.submit(generate_instance_families_kconfig))
> +
> + # Generate regions Kconfig
> + futures.append(executor.submit(generate_regions_kconfig))
> +
> + # Generate GPU AMIs Kconfig
> + futures.append(executor.submit(generate_gpu_amis_kconfig))
> +
> + # Generate instance types for each family
> + # Get all families dynamically from AWS
> + from aws_api import get_generated_instance_families
> +
> + families = get_generated_instance_families()
> +
> + family_futures = []
> + for family in sorted(families):
> + family_futures.append(executor.submit(generate_family_file, family))
> +
> + # Process main config results
> + families_kconfig = futures[0].result()
> + regions_kconfig = futures[1].result()
> + gpu_amis_kconfig = futures[2].result()
> +
> + # Write main configs
> + families_file = output_dir / "Kconfig.compute.generated"
> + families_file.write_text(families_kconfig)
> + print(f"Generated {families_file}")
> +
> + regions_file = output_dir / "Kconfig.location.generated"
> + regions_file.write_text(regions_kconfig)
> + print(f"Generated {regions_file}")
> +
> + gpu_amis_file = output_dir / "Kconfig.gpu-amis.generated"
> + gpu_amis_file.write_text(gpu_amis_kconfig)
> + print(f"Generated {gpu_amis_file}")
> +
> + # Process family results
> + for future in family_futures:
> + result = future.result()
> + if result:
> + print(result)
> +
> + return True
> +
> + except Exception as e:
> + print(f"Error generating Kconfig: {e}", file=sys.stderr)
> + return False
> +
> +
> +def main():
> + """Main entry point"""
> + parser = argparse.ArgumentParser(
> + description="AWS CLI tool for kdevops",
> + formatter_class=argparse.RawDescriptionHelpFormatter,
> + )
> +
> + parser.add_argument(
> + "--output",
> + choices=["json", "text"],
> + default="json",
> + help="Output format (default: json)",
> + )
> +
> + subparsers = parser.add_subparsers(dest="command", help="Available commands")
> +
> + # Generate Kconfig command
> + kconfig_parser = subparsers.add_parser(
> + "generate-kconfig", help="Generate Kconfig files for AWS"
> + )
> +
> + # Instance types command
> + instances_parser = subparsers.add_parser(
> + "instance-types", help="Manage instance types"
> + )
> + instances_subparsers = instances_parser.add_subparsers(
> + dest="subcommand", help="Instance type operations"
> + )
> +
> + # Instance types list
> + list_instances = instances_subparsers.add_parser("list", help="List instance types")
> + list_instances.add_argument("--family", help="Filter by instance family")
> + list_instances.add_argument("--region", help="AWS region")
> + list_instances.add_argument(
> + "--max-results", type=int, default=100, help="Maximum results (default: 100)"
> + )
> +
> + # Regions command
> + regions_parser = subparsers.add_parser("regions", help="Manage regions")
> + regions_subparsers = regions_parser.add_subparsers(
> + dest="subcommand", help="Region operations"
> + )
> +
> + # Regions list
> + list_regions = regions_subparsers.add_parser("list", help="List regions")
> + list_regions.add_argument(
> + "--include-zones",
> + action="store_true",
> + help="Include availability zones",
> + )
> +
> + # Cheapest instance command
> + cheapest_parser = subparsers.add_parser(
> + "cheapest", help="Find cheapest instance meeting criteria"
> + )
> + cheapest_parser.add_argument("--region", help="AWS region")
> + cheapest_parser.add_argument("--family", help="Instance family")
> + cheapest_parser.add_argument(
> + "--min-vcpus", type=int, default=2, help="Minimum vCPUs (default: 2)"
> + )
> +
> + args = parser.parse_args()
> +
> + cli = AWSCLI(output_format=args.output)
> +
> + if args.command == "generate-kconfig":
> + success = cli.generate_kconfig()
> + sys.exit(0 if success else 1)
> +
> + elif args.command == "instance-types":
> + if args.subcommand == "list":
> + instances = cli.list_instance_types(
> + family=args.family,
> + region=args.region,
> + max_results=args.max_results,
> + )
> + cli.output(instances)
> +
> + elif args.command == "regions":
> + if args.subcommand == "list":
> + regions = cli.list_regions(include_zones=args.include_zones)
> + cli.output(regions)
> +
> + elif args.command == "cheapest":
> + result = cli.get_cheapest_instance(
> + region=args.region,
> + family=args.family,
> + min_vcpus=args.min_vcpus,
> + )
> + cli.output(result)
> +
> + else:
> + parser.print_help()
> + sys.exit(1)
> +
> +
> +if __name__ == "__main__":
> + main()
> diff --git a/scripts/aws_api.py b/scripts/aws_api.py
> new file mode 100755
> index 00000000..1cf42f39
> --- /dev/null
> +++ b/scripts/aws_api.py
> @@ -0,0 +1,1135 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: MIT
> +"""
> +AWS API library for kdevops.
> +
> +Provides AWS CLI wrapper functions for dynamic configuration generation.
> +Used by aws-cli and other kdevops components.
> +"""
> +
> +import json
> +import os
> +import re
> +import subprocess
> +import sys
> +from typing import Dict, List, Optional, Any
> +
> +
> +def check_aws_cli() -> bool:
> + """Check if AWS CLI is installed and configured."""
> + try:
> + # Check if AWS CLI is installed
> + result = subprocess.run(
> + ["aws", "--version"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> + if result.returncode != 0:
> + return False
> +
> + # Check if credentials are configured
> + result = subprocess.run(
> + ["aws", "sts", "get-caller-identity"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> + return result.returncode == 0
> + except FileNotFoundError:
> + return False
> +
> +
> +def get_default_region() -> str:
> + """Get the default AWS region from configuration or environment."""
> + # Try to get from environment
> + region = os.environ.get("AWS_DEFAULT_REGION")
> + if region:
> + return region
> +
> + # Try to get from AWS config
> + try:
> + result = subprocess.run(
> + ["aws", "configure", "get", "region"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> + if result.returncode == 0 and result.stdout.strip():
> + return result.stdout.strip()
> +    except (subprocess.SubprocessError, OSError):
> + pass
> +
> + # Default to us-east-1
> + return "us-east-1"
> +
> +
> +def run_aws_command(command: List[str], region: Optional[str] = None) -> Optional[Dict]:
> + """
> + Run an AWS CLI command and return the JSON output.
> +
> + Args:
> + command: AWS CLI command as a list
> + region: Optional AWS region
> +
> + Returns:
> + Parsed JSON output or None on error
> + """
> + cmd = ["aws"] + command + ["--output", "json"]
> +
> + # Always specify a region (use default if not provided)
> + if not region:
> + region = get_default_region()
> + cmd.extend(["--region", region])
> +
> + try:
> + result = subprocess.run(
> + cmd,
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> + if result.returncode == 0:
> + return json.loads(result.stdout) if result.stdout else {}
> + else:
> + print(f"AWS command failed: {result.stderr}", file=sys.stderr)
> + return None
> + except (subprocess.SubprocessError, json.JSONDecodeError) as e:
> + print(f"Error running AWS command: {e}", file=sys.stderr)
> + return None
> +
> +
> +def get_regions() -> List[Dict[str, Any]]:
> + """Get available AWS regions."""
> + response = run_aws_command(["ec2", "describe-regions"])
> + if response and "Regions" in response:
> + return response["Regions"]
> + return []
> +
> +
> +def get_availability_zones(region: str) -> List[Dict[str, Any]]:
> + """Get availability zones for a specific region."""
> + response = run_aws_command(
> + ["ec2", "describe-availability-zones"],
> + region=region,
> + )
> + if response and "AvailabilityZones" in response:
> + return response["AvailabilityZones"]
> + return []
> +
> +
> +def get_instance_types(
> + family: Optional[str] = None,
> + region: Optional[str] = None,
> + max_results: int = 100,
> + fetch_all: bool = True,
> +) -> List[Dict[str, Any]]:
> + """
> + Get available instance types.
> +
> + Args:
> + family: Instance family filter (e.g., 'm5', 't3')
> + region: AWS region
> + max_results: Maximum number of results per API call (max 100)
> + fetch_all: If True, fetch all pages using NextToken pagination
> +
> + Returns:
> + List of instance type information
> + """
> + all_instances = []
> + next_token = None
> + page_count = 0
> +
> + # Ensure max_results doesn't exceed AWS limit
> + max_results = min(max_results, 100)
> +
> + while True:
> + cmd = ["ec2", "describe-instance-types"]
> +
> + filters = []
> + if family:
> + # Filter by instance type pattern
> + filters.append(f"Name=instance-type,Values={family}*")
> +
> + if filters:
> + cmd.append("--filters")
> + cmd.extend(filters)
> +
> + cmd.extend(["--max-results", str(max_results)])
> +
> + if next_token:
> + cmd.extend(["--next-token", next_token])
> +
> + response = run_aws_command(cmd, region=region)
> + if response and "InstanceTypes" in response:
> + batch_size = len(response["InstanceTypes"])
> + all_instances.extend(response["InstanceTypes"])
> + page_count += 1
> +
> + if fetch_all and not family:
> + # Only show progress for full fetches (not family-specific)
> + print(
> + f" Fetched page {page_count}: {batch_size} instance types (total: {len(all_instances)})",
> + file=sys.stderr,
> + )
> +
> + # Check if there are more results
> + if fetch_all and "NextToken" in response:
> + next_token = response["NextToken"]
> + else:
> + break
> + else:
> + break
> +
> + if fetch_all and page_count > 1:
> + filter_desc = f" for family '{family}'" if family else ""
> + print(
> + f" Total: {len(all_instances)} instance types fetched{filter_desc}",
> + file=sys.stderr,
> + )
> +
> + return all_instances
> +
> +
> +def get_pricing_info(region: str = "us-east-1") -> Dict[str, Dict[str, float]]:
> + """
> + Get pricing information for instance types.
> +
> + Note: AWS Pricing API requires us-east-1 region.
> + Returns a simplified pricing structure.
> +
> + Args:
> + region: AWS region for pricing
> +
> + Returns:
> + Dictionary mapping instance types to pricing info
> + """
> + # For simplicity, we'll use hardcoded common instance prices
> + # In production, you'd query the AWS Pricing API
> + pricing = {
> + # T3 family (burstable)
> + "t3.nano": {"on_demand": 0.0052},
> + "t3.micro": {"on_demand": 0.0104},
> + "t3.small": {"on_demand": 0.0208},
> + "t3.medium": {"on_demand": 0.0416},
> + "t3.large": {"on_demand": 0.0832},
> + "t3.xlarge": {"on_demand": 0.1664},
> + "t3.2xlarge": {"on_demand": 0.3328},
> + # T3a family (AMD)
> + "t3a.nano": {"on_demand": 0.0047},
> + "t3a.micro": {"on_demand": 0.0094},
> + "t3a.small": {"on_demand": 0.0188},
> + "t3a.medium": {"on_demand": 0.0376},
> + "t3a.large": {"on_demand": 0.0752},
> + "t3a.xlarge": {"on_demand": 0.1504},
> + "t3a.2xlarge": {"on_demand": 0.3008},
> + # M5 family (general purpose Intel)
> + "m5.large": {"on_demand": 0.096},
> + "m5.xlarge": {"on_demand": 0.192},
> + "m5.2xlarge": {"on_demand": 0.384},
> + "m5.4xlarge": {"on_demand": 0.768},
> + "m5.8xlarge": {"on_demand": 1.536},
> + "m5.12xlarge": {"on_demand": 2.304},
> + "m5.16xlarge": {"on_demand": 3.072},
> + "m5.24xlarge": {"on_demand": 4.608},
> + # M7a family (general purpose AMD)
> + "m7a.medium": {"on_demand": 0.0464},
> + "m7a.large": {"on_demand": 0.0928},
> + "m7a.xlarge": {"on_demand": 0.1856},
> + "m7a.2xlarge": {"on_demand": 0.3712},
> + "m7a.4xlarge": {"on_demand": 0.7424},
> + "m7a.8xlarge": {"on_demand": 1.4848},
> + "m7a.12xlarge": {"on_demand": 2.2272},
> + "m7a.16xlarge": {"on_demand": 2.9696},
> + "m7a.24xlarge": {"on_demand": 4.4544},
> + "m7a.32xlarge": {"on_demand": 5.9392},
> + "m7a.48xlarge": {"on_demand": 8.9088},
> + # C5 family (compute optimized)
> + "c5.large": {"on_demand": 0.085},
> + "c5.xlarge": {"on_demand": 0.17},
> + "c5.2xlarge": {"on_demand": 0.34},
> + "c5.4xlarge": {"on_demand": 0.68},
> + "c5.9xlarge": {"on_demand": 1.53},
> + "c5.12xlarge": {"on_demand": 2.04},
> + "c5.18xlarge": {"on_demand": 3.06},
> + "c5.24xlarge": {"on_demand": 4.08},
> + # C7a family (compute optimized AMD)
> + "c7a.medium": {"on_demand": 0.0387},
> + "c7a.large": {"on_demand": 0.0774},
> + "c7a.xlarge": {"on_demand": 0.1548},
> + "c7a.2xlarge": {"on_demand": 0.3096},
> + "c7a.4xlarge": {"on_demand": 0.6192},
> + "c7a.8xlarge": {"on_demand": 1.2384},
> + "c7a.12xlarge": {"on_demand": 1.8576},
> + "c7a.16xlarge": {"on_demand": 2.4768},
> + "c7a.24xlarge": {"on_demand": 3.7152},
> + "c7a.32xlarge": {"on_demand": 4.9536},
> + "c7a.48xlarge": {"on_demand": 7.4304},
> + # I4i family (storage optimized)
> + "i4i.large": {"on_demand": 0.117},
> + "i4i.xlarge": {"on_demand": 0.234},
> + "i4i.2xlarge": {"on_demand": 0.468},
> + "i4i.4xlarge": {"on_demand": 0.936},
> + "i4i.8xlarge": {"on_demand": 1.872},
> + "i4i.16xlarge": {"on_demand": 3.744},
> + "i4i.32xlarge": {"on_demand": 7.488},
> + }
> +
> + # Adjust pricing based on region (simplified)
> + # Some regions are more expensive than others
> + region_multipliers = {
> + "us-east-1": 1.0,
> + "us-east-2": 1.0,
> + "us-west-1": 1.08,
> + "us-west-2": 1.0,
> + "eu-west-1": 1.1,
> + "eu-central-1": 1.15,
> + "ap-southeast-1": 1.2,
> + "ap-northeast-1": 1.25,
> + }
> +
> + multiplier = region_multipliers.get(region, 1.1)
> + if multiplier != 1.0:
> + adjusted_pricing = {}
> + for instance_type, prices in pricing.items():
> + adjusted_pricing[instance_type] = {
> + "on_demand": prices["on_demand"] * multiplier
> + }
> + return adjusted_pricing
> +
> + return pricing
> +
> +
> +def sanitize_kconfig_name(name: str) -> str:
> + """Convert a name to a valid Kconfig symbol."""
> + # Replace special characters with underscores
> + name = name.replace("-", "_").replace(".", "_").replace(" ", "_")
> + # Convert to uppercase
> + name = name.upper()
> + # Remove any non-alphanumeric characters (except underscore)
> + name = "".join(c for c in name if c.isalnum() or c == "_")
> + # Ensure it doesn't start with a number
> + if name and name[0].isdigit():
> + name = "_" + name
> + return name
> +
> +
> +# Cache for instance families to avoid redundant API calls
> +_cached_families = None
> +
> +
> +def get_generated_instance_families() -> set:
> + """Get the set of instance families that will have generated Kconfig files."""
> + global _cached_families
> +
> + # Return cached result if available
> + if _cached_families is not None:
> + return _cached_families
> +
> + # Return all families - we'll generate Kconfig files for all of them
> + # This function will be called by the aws-cli tool to determine which files to generate
> + if not check_aws_cli():
> + # Return a minimal set if AWS CLI is not available
> + _cached_families = {"m5", "t3", "c5"}
> + return _cached_families
> +
> + # Get all available instance types
> + print(" Discovering available instance families...", file=sys.stderr)
> + instance_types = get_instance_types(fetch_all=True)
> +
> + # Extract unique families
> + families = set()
> + for instance_type in instance_types:
> + type_name = instance_type.get("InstanceType", "")
> + # Extract family prefix (e.g., "m5" from "m5.large")
> + if "." in type_name:
> + family = type_name.split(".")[0]
> + families.add(family)
> +
> + print(f" Found {len(families)} instance families", file=sys.stderr)
> + _cached_families = families
> + return families
> +
> +
> +def generate_instance_families_kconfig() -> str:
> + """Generate Kconfig content for AWS instance families."""
> + # Check if AWS CLI is available
> + if not check_aws_cli():
> + return generate_default_instance_families_kconfig()
> +
> + # Get all available instance types (with pagination)
> + instance_types = get_instance_types(fetch_all=True)
> +
> + # Extract unique families
> + families = set()
> + family_info = {}
> + for instance in instance_types:
> + instance_type = instance.get("InstanceType", "")
> + if "." in instance_type:
> + family = instance_type.split(".")[0]
> + families.add(family)
> + if family not in family_info:
> + family_info[family] = {
> + "architectures": set(),
> + "count": 0,
> + }
> + family_info[family]["count"] += 1
> + for arch in instance.get("ProcessorInfo", {}).get(
> + "SupportedArchitectures", []
> + ):
> + family_info[family]["architectures"].add(arch)
> +
> + if not families:
> + return generate_default_instance_families_kconfig()
> +
> + # Group families by category - use prefix patterns to catch all variants
> + def categorize_family(family_name):
> + """Categorize a family based on its prefix."""
> + if family_name.startswith(("m", "t")):
> + return "general_purpose"
> + elif family_name.startswith("c"):
> + return "compute_optimized"
> + elif family_name.startswith(("r", "x", "z")):
> + return "memory_optimized"
> + elif family_name.startswith(("i", "d", "h")):
> + return "storage_optimized"
> + elif family_name.startswith(("p", "g", "dl", "trn", "inf", "vt", "f")):
> + return "accelerated"
> + elif family_name.startswith(("mac", "hpc")):
> + return "specialized"
> + else:
> + return "other"
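
The branch order here looks buggy: startswith(("m", "t")) also matches
"mac*" and "trn*", and startswith(("i", "d", "h")) swallows "inf*", "dl*"
and "hpc*", so the accelerated and specialized arms are unreachable for
most of the families they name. Checking the longer, more specific
prefixes first fixes that; rough sketch:

```python
# Multi-character prefixes must be tested before single-character ones;
# otherwise "mac" matches "m", "trn" matches "t", "inf"/"dl"/"hpc" match
# the storage prefixes, and those arms never fire.
_CATEGORY_PREFIXES = [
    ("specialized", ("mac", "hpc")),
    ("accelerated", ("dl", "trn", "inf", "vt", "p", "g", "f")),
    ("storage_optimized", ("i", "d", "h")),
    ("memory_optimized", ("r", "x", "z")),
    ("general_purpose", ("m", "t")),
    ("compute_optimized", ("c",)),
]

def categorize_family(family_name: str) -> str:
    """Categorize a family, most specific prefix first."""
    for category, prefixes in _CATEGORY_PREFIXES:
        if family_name.startswith(prefixes):
            return category
    return "other"
```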
> +
> + # Organize families by category
> + categorized_families = {
> + "general_purpose": [],
> + "compute_optimized": [],
> + "memory_optimized": [],
> + "storage_optimized": [],
> + "accelerated": [],
> + "specialized": [],
> + "other": [],
> + }
> +
> + for family in sorted(families):
> + category = categorize_family(family)
> + categorized_families[category].append(family)
> +
> + kconfig = """# AWS instance families (dynamically generated)
> +# Generated by aws-cli from live AWS data
> +
> +choice
> + prompt "AWS instance family"
> + default TERRAFORM_AWS_INSTANCE_TYPE_M5
> + help
> + Select the AWS instance family for your deployment.
> + Different families are optimized for different workloads.
> +
> +"""
> +
> + # Category headers
> + category_headers = {
> + "general_purpose": "# General Purpose - balanced compute, memory, and networking\n",
> + "compute_optimized": "# Compute Optimized - ideal for CPU-intensive applications\n",
> + "memory_optimized": "# Memory Optimized - for memory-intensive applications\n",
> + "storage_optimized": "# Storage Optimized - for high sequential read/write workloads\n",
> + "accelerated": "# Accelerated Computing - GPU and other accelerators\n",
> + "specialized": "# Specialized - for specific use cases\n",
> + "other": "# Other instance families\n",
> + }
> +
> + # Add each category of families
> + for category in [
> + "general_purpose",
> + "compute_optimized",
> + "memory_optimized",
> + "storage_optimized",
> + "accelerated",
> + "specialized",
> + "other",
> + ]:
> + if categorized_families[category]:
> + kconfig += category_headers[category]
> + for family in categorized_families[category]:
> + kconfig += generate_family_config(family, family_info.get(family, {}))
> + if category != "other": # Don't add extra newline after the last category
> + kconfig += "\n"
> +
> + kconfig += "\nendchoice\n"
> +
> + # Add instance type source includes for each family
> + # Only include families that we actually generate files for
> + generated_families = get_generated_instance_families()
> + kconfig += "\n# Include instance-specific configurations\n"
> + for family in sorted(families):
> + # Only add source statement if we generate a file for this family
> + if family in generated_families:
> + safe_name = sanitize_kconfig_name(family)
> + kconfig += f"""if TERRAFORM_AWS_INSTANCE_TYPE_{safe_name}
> +source "terraform/aws/kconfigs/instance-types/Kconfig.{family}.generated"
> +endif
> +
> +"""
> +
> + return kconfig
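
One thing I'm not sure about: get_generated_instance_families() can return
every family the API reports (979 types per the cover letter), but
dynamic_aws_kconfig_touch only pre-creates the nine files listed in
AWS_INSTANCE_TYPE_FAMILIES. If the two ever disagree, the generated
"source" lines point at files that don't exist and Kconfig bails out
before the generator has a chance to run. Intersecting with the
Makefile's list would make that failure impossible; hypothetical sketch
(MAKEFILE_FAMILIES mirrors the Makefile variable, sourceable_families is
a helper I made up):

```python
# Families the Makefile pre-touches (AWS_INSTANCE_TYPE_FAMILIES); only
# these are guaranteed to exist when Kconfig parses the generated file.
MAKEFILE_FAMILIES = {"m5", "m7a", "t3", "t3a", "c5", "c7a",
                     "i4i", "is4gen", "im4gn"}

def sourceable_families(discovered: set) -> list:
    """Restrict 'source' statements to families whose files must exist."""
    return sorted(discovered & MAKEFILE_FAMILIES)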
> +
> +
> +def generate_family_config(family: str, info: Dict) -> str:
> + """Generate Kconfig entry for an instance family."""
> + safe_name = sanitize_kconfig_name(family)
> +
> + # Determine architecture dependencies
> + architectures = info.get("architectures", set())
> + depends_line = ""
> + if architectures:
> + if "x86_64" in architectures and "arm64" not in architectures:
> + depends_line = "\n\tdepends on TARGET_ARCH_X86_64"
> + elif "arm64" in architectures and "x86_64" not in architectures:
> + depends_line = "\n\tdepends on TARGET_ARCH_ARM64"
> +
> + # Family descriptions
> + descriptions = {
> + "t3": "Burstable performance instances powered by Intel processors",
> + "t3a": "Burstable performance instances powered by AMD processors",
> + "m5": "General purpose instances powered by Intel Xeon Platinum processors",
> + "m7a": "Latest generation general purpose instances powered by AMD EPYC processors",
> + "c5": "Compute optimized instances powered by Intel Xeon Platinum processors",
> + "c7a": "Latest generation compute optimized instances powered by AMD EPYC processors",
> + "i4i": "Storage optimized instances with NVMe SSD storage",
> + "is4gen": "Storage optimized ARM instances powered by AWS Graviton2",
> + "im4gn": "Storage optimized ARM instances with NVMe storage",
> + "r5": "Memory optimized instances powered by Intel Xeon Platinum processors",
> + "p3": "GPU instances for machine learning and HPC",
> + "g4dn": "GPU instances for graphics-intensive applications",
> + }
> +
> + description = descriptions.get(family, f"AWS {family.upper()} instance family")
> + count = info.get("count", 0)
> +
> + config = f"""config TERRAFORM_AWS_INSTANCE_TYPE_{safe_name}
> +\tbool "{family.upper()}"
> +{depends_line}
> +\thelp
> +\t {description}
> +\t Available instance types: {count}
> +
> +"""
> + return config
> +
> +
> +def generate_default_instance_families_kconfig() -> str:
> + """Generate default Kconfig content when AWS CLI is not available."""
> + return """# AWS instance families (default - AWS CLI not available)
> +
> +choice
> + prompt "AWS instance family"
> + default TERRAFORM_AWS_INSTANCE_TYPE_M5
> + help
> + Select the AWS instance family for your deployment.
> + Note: AWS CLI is not available, showing default options.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_M5
> + bool "M5"
> + depends on TARGET_ARCH_X86_64
> + help
> + General purpose instances powered by Intel Xeon Platinum processors.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_M7A
> + bool "M7a"
> + depends on TARGET_ARCH_X86_64
> + help
> + Latest generation general purpose instances powered by AMD EPYC processors.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_T3
> + bool "T3"
> + depends on TARGET_ARCH_X86_64
> + help
> + Burstable performance instances powered by Intel processors.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_C5
> + bool "C5"
> + depends on TARGET_ARCH_X86_64
> + help
> + Compute optimized instances powered by Intel Xeon Platinum processors.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_I4I
> + bool "I4i"
> + depends on TARGET_ARCH_X86_64
> + help
> + Storage optimized instances with NVMe SSD storage.
> +
> +endchoice
> +
> +# Include instance-specific configurations
> +if TERRAFORM_AWS_INSTANCE_TYPE_M5
> +source "terraform/aws/kconfigs/instance-types/Kconfig.m5.generated"
> +endif
> +
> +if TERRAFORM_AWS_INSTANCE_TYPE_M7A
> +source "terraform/aws/kconfigs/instance-types/Kconfig.m7a.generated"
> +endif
> +
> +if TERRAFORM_AWS_INSTANCE_TYPE_T3
> +source "terraform/aws/kconfigs/instance-types/Kconfig.t3.generated"
> +endif
> +
> +if TERRAFORM_AWS_INSTANCE_TYPE_C5
> +source "terraform/aws/kconfigs/instance-types/Kconfig.c5.generated"
> +endif
> +
> +if TERRAFORM_AWS_INSTANCE_TYPE_I4I
> +source "terraform/aws/kconfigs/instance-types/Kconfig.i4i.generated"
> +endif
> +
> +"""
> +
> +
> +def generate_instance_types_kconfig(family: str) -> str:
> + """Generate Kconfig content for specific instance types within a family."""
> + if not check_aws_cli():
> + return ""
> +
> + instance_types = get_instance_types(family=family, fetch_all=True)
> + if not instance_types:
> + return ""
> +
> + # Filter to only exact family matches (e.g., c5a but not c5ad)
> + filtered_instances = []
> + for instance in instance_types:
> + instance_type = instance.get("InstanceType", "")
> + if "." in instance_type:
> + inst_family = instance_type.split(".")[0]
> + if inst_family == family:
> + filtered_instances.append(instance)
> +
> + instance_types = filtered_instances
> + if not instance_types:
> + return ""
> +
> + pricing = get_pricing_info()
> +
> + # Sort by vCPU count and memory
> + instance_types.sort(
> + key=lambda x: (
> + x.get("VCpuInfo", {}).get("DefaultVCpus", 0),
> + x.get("MemoryInfo", {}).get("SizeInMiB", 0),
> + )
> + )
> +
> + safe_family = sanitize_kconfig_name(family)
> +
> + # Get the first instance type to use as default
> + default_instance_name = f"{safe_family}_LARGE" # Fallback
> + if instance_types:
> + first_instance_type = instance_types[0].get("InstanceType", "")
> + if "." in first_instance_type:
> + first_full_name = first_instance_type.replace(".", "_")
> + default_instance_name = sanitize_kconfig_name(first_full_name)
> +
> + kconfig = f"""# AWS {family.upper()} instance sizes (dynamically generated)
> +
> +choice
> +\tprompt "Instance size for {family.upper()} family"
> +\tdefault TERRAFORM_AWS_INSTANCE_{default_instance_name}
> +\thelp
> +\t Select the specific instance size within the {family.upper()} family.
> +
> +"""
> +
> + seen_configs = set()
> + for instance in instance_types:
> + instance_type = instance.get("InstanceType", "")
> + if "." not in instance_type:
> + continue
> +
> + # Get the full instance type name to make unique config names
> + full_name = instance_type.replace(".", "_")
> + safe_full_name = sanitize_kconfig_name(full_name)
> +
> + # Skip if we've already seen this config name
> + if safe_full_name in seen_configs:
> + continue
> + seen_configs.add(safe_full_name)
> +
> + size = instance_type.split(".")[1]
> +
> + vcpus = instance.get("VCpuInfo", {}).get("DefaultVCpus", 0)
> + memory_mib = instance.get("MemoryInfo", {}).get("SizeInMiB", 0)
> + memory_gb = memory_mib / 1024
> +
> + # Get pricing
> + price = pricing.get(instance_type, {}).get("on_demand", 0.0)
> + price_str = f"${price:.3f}/hour" if price > 0 else "pricing varies"
> +
> + # Network performance
> + network = instance.get("NetworkInfo", {}).get("NetworkPerformance", "varies")
> +
> + # Storage
> + storage_info = ""
> + if instance.get("InstanceStorageSupported"):
> + storage = instance.get("InstanceStorageInfo", {})
> + total_size = storage.get("TotalSizeInGB", 0)
> + if total_size > 0:
> + storage_info = f"\n\t Instance storage: {total_size} GB"
> +
> + kconfig += f"""config TERRAFORM_AWS_INSTANCE_{safe_full_name}
> +\tbool "{instance_type}"
> +\thelp
> +\t vCPUs: {vcpus}
> +\t Memory: {memory_gb:.1f} GB
> +\t Network: {network}
> +\t Price: {price_str}{storage_info}
> +
> +"""
> +
> + kconfig += "endchoice\n"
> +
> + # Add the actual instance type string config with full instance names
> + kconfig += f"""
> +config TERRAFORM_AWS_{safe_family}_SIZE
> +\tstring
> +"""
> +
> + # Generate default mappings for each seen instance type
> + for instance in instance_types:
> + instance_type = instance.get("InstanceType", "")
> + if "." not in instance_type:
> + continue
> +
> + full_name = instance_type.replace(".", "_")
> + safe_full_name = sanitize_kconfig_name(full_name)
> +
> + kconfig += (
> + f'\tdefault "{instance_type}" if TERRAFORM_AWS_INSTANCE_{safe_full_name}\n'
> + )
> +
> + # Use the first instance type as the final fallback default
> + final_default = f"{family}.large"
> + if instance_types:
> + first_instance_type = instance_types[0].get("InstanceType", "")
> + if first_instance_type:
> + final_default = first_instance_type
> +
> + kconfig += f'\tdefault "{final_default}"\n\n'
> +
> + return kconfig
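
Minor: the choice entries above dedupe via seen_configs, but this second
loop emitting the "default" mappings doesn't, so a duplicated type name
would produce two identical "default" lines. Reusing the same guard
closes that; sketch (default_lines is a made-up helper, and the .upper()
is a stand-in for sanitize_kconfig_name()):

```python
def default_lines(instance_types):
    """One 'default ... if ...' line per unique sanitized config name."""
    seen, lines = set(), []
    for inst in instance_types:
        name = inst.get("InstanceType", "")
        if "." not in name:
            continue
        # Stand-in for sanitize_kconfig_name(name.replace(".", "_"))
        safe = name.replace(".", "_").upper()
        if safe in seen:
            continue  # same guard the choice entries already apply
        seen.add(safe)
        lines.append(f'\tdefault "{name}" if TERRAFORM_AWS_INSTANCE_{safe}\n')
    return lines
```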
> +
> +
> +def generate_regions_kconfig() -> str:
> + """Generate Kconfig content for AWS regions."""
> + if not check_aws_cli():
> + return generate_default_regions_kconfig()
> +
> + regions = get_regions()
> + if not regions:
> + return generate_default_regions_kconfig()
> +
> + kconfig = """# AWS regions (dynamically generated)
> +
> +choice
> + prompt "AWS region"
> + default TERRAFORM_AWS_REGION_USEAST1
> + help
> + Select the AWS region for your deployment.
> + Note: Not all instance types are available in all regions.
> +
> +"""
> +
> + # Group regions by geographic area
> + us_regions = []
> + eu_regions = []
> + ap_regions = []
> + other_regions = []
> +
> + for region in regions:
> + region_name = region.get("RegionName", "")
> + if region_name.startswith("us-"):
> + us_regions.append(region)
> + elif region_name.startswith("eu-"):
> + eu_regions.append(region)
> + elif region_name.startswith("ap-"):
> + ap_regions.append(region)
> + else:
> + other_regions.append(region)
> +
> + # Add US regions
> + if us_regions:
> + kconfig += "# US Regions\n"
> + for region in sorted(us_regions, key=lambda x: x.get("RegionName", "")):
> + kconfig += generate_region_config(region)
> + kconfig += "\n"
> +
> + # Add EU regions
> + if eu_regions:
> + kconfig += "# Europe Regions\n"
> + for region in sorted(eu_regions, key=lambda x: x.get("RegionName", "")):
> + kconfig += generate_region_config(region)
> + kconfig += "\n"
> +
> + # Add Asia Pacific regions
> + if ap_regions:
> + kconfig += "# Asia Pacific Regions\n"
> + for region in sorted(ap_regions, key=lambda x: x.get("RegionName", "")):
> + kconfig += generate_region_config(region)
> + kconfig += "\n"
> +
> + # Add other regions
> + if other_regions:
> + kconfig += "# Other Regions\n"
> + for region in sorted(other_regions, key=lambda x: x.get("RegionName", "")):
> + kconfig += generate_region_config(region)
> +
> + kconfig += "\nendchoice\n"
> +
> + # Add the actual region string config
> + kconfig += """
> +config TERRAFORM_AWS_REGION
> + string
> +"""
> +
> + for region in regions:
> + region_name = region.get("RegionName", "")
> + safe_name = sanitize_kconfig_name(region_name)
> + kconfig += f'\tdefault "{region_name}" if TERRAFORM_AWS_REGION_{safe_name}\n'
> +
> + kconfig += '\tdefault "us-east-1"\n'
> +
> + return kconfig
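
Style nit, take it or leave it: the four near-identical grouping blocks
above could collapse into one loop over (header, prefix) pairs, so adding
a fifth geography later is a one-line change. Sketch (grouped_regions is
a hypothetical helper):

```python
REGION_GROUPS = [
    ("# US Regions\n", "us-"),
    ("# Europe Regions\n", "eu-"),
    ("# Asia Pacific Regions\n", "ap-"),
]

def grouped_regions(regions):
    """Yield (header, sorted regions) pairs, with a trailing catch-all."""
    leftovers = list(regions)
    for header, prefix in REGION_GROUPS:
        matched = [r for r in leftovers
                   if r.get("RegionName", "").startswith(prefix)]
        leftovers = [r for r in leftovers if r not in matched]
        if matched:
            yield header, sorted(matched, key=lambda r: r.get("RegionName", ""))
    if leftovers:
        yield "# Other Regions\n", sorted(
            leftovers, key=lambda r: r.get("RegionName", ""))
```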
> +
> +
> +def generate_region_config(region: Dict) -> str:
> + """Generate Kconfig entry for a region."""
> + region_name = region.get("RegionName", "")
> + safe_name = sanitize_kconfig_name(region_name)
> + opt_in_status = region.get("OptInStatus", "")
> +
> + # Region display names
> + display_names = {
> + "us-east-1": "US East (N. Virginia)",
> + "us-east-2": "US East (Ohio)",
> + "us-west-1": "US West (N. California)",
> + "us-west-2": "US West (Oregon)",
> + "eu-west-1": "Europe (Ireland)",
> + "eu-west-2": "Europe (London)",
> + "eu-west-3": "Europe (Paris)",
> + "eu-central-1": "Europe (Frankfurt)",
> + "eu-north-1": "Europe (Stockholm)",
> + "ap-southeast-1": "Asia Pacific (Singapore)",
> + "ap-southeast-2": "Asia Pacific (Sydney)",
> + "ap-northeast-1": "Asia Pacific (Tokyo)",
> + "ap-northeast-2": "Asia Pacific (Seoul)",
> + "ap-south-1": "Asia Pacific (Mumbai)",
> + "ca-central-1": "Canada (Central)",
> + "sa-east-1": "South America (São Paulo)",
> + }
> +
> + display_name = display_names.get(region_name, region_name.replace("-", " ").title())
> +
> + help_text = f"\t Region: {display_name}"
> + if opt_in_status and opt_in_status != "opt-in-not-required":
> + help_text += f"\n\t Status: {opt_in_status}"
> +
> + config = f"""config TERRAFORM_AWS_REGION_{safe_name}
> +\tbool "{display_name}"
> +\thelp
> +{help_text}
> +
> +"""
> + return config
> +
> +
> +def get_gpu_amis(region: str = None) -> List[Dict[str, Any]]:
> + """
> + Get available GPU-optimized AMIs including Deep Learning AMIs.
> +
> + Args:
> + region: AWS region
> +
> + Returns:
> + List of AMI information
> + """
> + # Query for Deep Learning AMIs from AWS
> + cmd = ["ec2", "describe-images"]
> + filters = [
> + "Name=owner-alias,Values=amazon",
> + "Name=name,Values=Deep Learning AMI GPU*",
> + "Name=state,Values=available",
> + "Name=architecture,Values=x86_64",
> + ]
> + cmd.append("--filters")
> + cmd.extend(filters)
> + cmd.extend(["--query", "Images[?contains(Name, '2024') || contains(Name, '2025')]"])
> +
> + response = run_aws_command(cmd, region=region)
> +
> + if response:
> + # Sort by creation date to get the most recent
> + response.sort(key=lambda x: x.get("CreationDate", ""), reverse=True)
> + return response[:10] # Return top 10 most recent
> + return []
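
The JMESPath filter hard-codes '2024' and '2025', so this query silently
returns nothing once those names age out. Filtering client-side on
CreationDate with a sliding window avoids the time bomb; sketch
(recent_amis is a hypothetical helper, window length is arbitrary):

```python
from datetime import datetime, timedelta, timezone

def recent_amis(images, max_age_days=365):
    """Keep AMIs created within max_age_days instead of matching year strings."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    recent = []
    for img in images:
        created = img.get("CreationDate", "")  # e.g. "2025-01-15T03:04:05.000Z"
        try:
            when = datetime.fromisoformat(created.replace("Z", "+00:00"))
        except ValueError:
            continue  # missing or malformed date: skip the image
        if when >= cutoff:
            recent.append(img)
    recent.sort(key=lambda i: i.get("CreationDate", ""), reverse=True)
    return recent[:10]
```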
> +
> +
> +def generate_gpu_amis_kconfig() -> str:
> + """Generate Kconfig content for GPU AMIs."""
> + # Check if AWS CLI is available
> + if not check_aws_cli():
> + return generate_default_gpu_amis_kconfig()
> +
> + # Get available GPU AMIs
> + amis = get_gpu_amis()
> +
> + if not amis:
> + return generate_default_gpu_amis_kconfig()
> +
> + kconfig = """# GPU-optimized AMIs (dynamically generated)
> +
> +# GPU AMI Override - only shown for GPU instances
> +config TERRAFORM_AWS_USE_GPU_AMI
> + bool "Use GPU-optimized AMI instead of standard distribution"
> + depends on TERRAFORM_AWS_IS_GPU_INSTANCE
> + output yaml
> + default n
> + help
> + Enable this to use a GPU-optimized AMI with pre-installed NVIDIA drivers,
> + CUDA, and ML frameworks instead of the standard distribution AMI.
> +
> + When disabled, the standard distribution AMI will be used and you'll need
> + to install GPU drivers manually.
> +
> +if TERRAFORM_AWS_USE_GPU_AMI
> +
> +choice
> + prompt "GPU-optimized AMI selection"
> + default TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> + depends on TERRAFORM_AWS_IS_GPU_INSTANCE
> + help
> + Select which GPU-optimized AMI to use for your GPU instance.
> +
> +config TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> + bool "AWS Deep Learning AMI (Ubuntu 22.04)"
> + help
> + AWS Deep Learning AMI with NVIDIA drivers, CUDA, cuDNN, and popular ML frameworks.
> + Optimized for machine learning workloads on GPU instances.
> + Includes: TensorFlow, PyTorch, MXNet, and Jupyter.
> +
> +config TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING_NVIDIA
> + bool "NVIDIA Deep Learning AMI"
> + help
> + NVIDIA optimized Deep Learning AMI with latest GPU drivers.
> + Includes NVIDIA GPU Cloud (NGC) containers and frameworks.
> +
> +config TERRAFORM_AWS_GPU_AMI_CUSTOM
> + bool "Custom GPU AMI"
> + help
> + Specify a custom AMI ID for GPU instances.
> +
> +endchoice
> +
> +if TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> +
> +config TERRAFORM_AWS_GPU_AMI_NAME
> + string
> + output yaml
> + default "Deep Learning AMI GPU TensorFlow*"
> + help
> + AMI name pattern for AWS Deep Learning AMI.
> +
> +config TERRAFORM_AWS_GPU_AMI_OWNER
> + string
> + output yaml
> + default "amazon"
> +
> +endif # TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> +
> +if TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING_NVIDIA
> +
> +config TERRAFORM_AWS_GPU_AMI_NAME
> + string
> + output yaml
> + default "NVIDIA Deep Learning AMI*"
> + help
> + AMI name pattern for NVIDIA Deep Learning AMI.
> +
> +config TERRAFORM_AWS_GPU_AMI_OWNER
> + string
> + output yaml
> + default "amazon"
> +
> +endif # TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING_NVIDIA
> +
> +if TERRAFORM_AWS_GPU_AMI_CUSTOM
> +
> +config TERRAFORM_AWS_GPU_AMI_ID
> + string "Custom GPU AMI ID"
> + output yaml
> + help
> + Specify the AMI ID for your custom GPU image.
> + Example: ami-0123456789abcdef0
> +
> +endif # TERRAFORM_AWS_GPU_AMI_CUSTOM
> +
> +endif # TERRAFORM_AWS_USE_GPU_AMI
> +
> +# GPU instance detection
> +config TERRAFORM_AWS_IS_GPU_INSTANCE
> + bool
> + output yaml
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6E
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5G
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4DN
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4AD
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5EN
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4D
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4DE
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3DN
> + default n
> + help
> + Automatically detected based on selected instance type.
> + This indicates whether the selected instance has GPU support.
> +
> +"""
> +
> + return kconfig
> +
> +
> +def generate_default_gpu_amis_kconfig() -> str:
> + """Generate default GPU AMI Kconfig when AWS CLI is not available."""
> + return """# GPU-optimized AMIs (default - AWS CLI not available)
> +
> +# GPU AMI Override - only shown for GPU instances
> +config TERRAFORM_AWS_USE_GPU_AMI
> + bool "Use GPU-optimized AMI instead of standard distribution"
> + depends on TERRAFORM_AWS_IS_GPU_INSTANCE
> + output yaml
> + default n
> + help
> + Enable this to use a GPU-optimized AMI with pre-installed NVIDIA drivers,
> + CUDA, and ML frameworks instead of the standard distribution AMI.
> + Note: AWS CLI is not available, showing default options.
> +
> +if TERRAFORM_AWS_USE_GPU_AMI
> +
> +choice
> + prompt "GPU-optimized AMI selection"
> + default TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> + depends on TERRAFORM_AWS_IS_GPU_INSTANCE
> + help
> + Select which GPU-optimized AMI to use for your GPU instance.
> +
> +config TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> + bool "AWS Deep Learning AMI (Ubuntu 22.04)"
> + help
> + Pre-configured with NVIDIA drivers, CUDA, and ML frameworks.
> +
> +config TERRAFORM_AWS_GPU_AMI_CUSTOM
> + bool "Custom GPU AMI"
> +
> +endchoice
> +
> +if TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> +
> +config TERRAFORM_AWS_GPU_AMI_NAME
> + string
> + output yaml
> + default "Deep Learning AMI GPU TensorFlow*"
> +
> +config TERRAFORM_AWS_GPU_AMI_OWNER
> + string
> + output yaml
> + default "amazon"
> +
> +endif # TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> +
> +if TERRAFORM_AWS_GPU_AMI_CUSTOM
> +
> +config TERRAFORM_AWS_GPU_AMI_ID
> + string "Custom GPU AMI ID"
> + output yaml
> + help
> + Specify the AMI ID for your custom GPU image.
> +
> +endif # TERRAFORM_AWS_GPU_AMI_CUSTOM
> +
> +endif # TERRAFORM_AWS_USE_GPU_AMI
> +
> +# GPU instance detection (static)
> +config TERRAFORM_AWS_IS_GPU_INSTANCE
> + bool
> + output yaml
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6E
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5G
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4DN
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4AD
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5EN
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4D
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4DE
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3DN
> + default n
> + help
> + Automatically detected based on selected instance type.
> + This indicates whether the selected instance has GPU support.
> +
> +"""
> +
> +
> +def generate_default_regions_kconfig() -> str:
> + """Generate default Kconfig content when AWS CLI is not available."""
> + return """# AWS regions (default - AWS CLI not available)
> +
> +choice
> + prompt "AWS region"
> + default TERRAFORM_AWS_REGION_USEAST1
> + help
> + Select the AWS region for your deployment.
> + Note: AWS CLI is not available, showing default options.
> +
> +# US Regions
> +config TERRAFORM_AWS_REGION_USEAST1
> + bool "US East (N. Virginia)"
> +
> +config TERRAFORM_AWS_REGION_USEAST2
> + bool "US East (Ohio)"
> +
> +config TERRAFORM_AWS_REGION_USWEST1
> + bool "US West (N. California)"
> +
> +config TERRAFORM_AWS_REGION_USWEST2
> + bool "US West (Oregon)"
> +
> +# Europe Regions
> +config TERRAFORM_AWS_REGION_EUWEST1
> + bool "Europe (Ireland)"
> +
> +config TERRAFORM_AWS_REGION_EUCENTRAL1
> + bool "Europe (Frankfurt)"
> +
> +# Asia Pacific Regions
> +config TERRAFORM_AWS_REGION_APSOUTHEAST1
> + bool "Asia Pacific (Singapore)"
> +
> +config TERRAFORM_AWS_REGION_APNORTHEAST1
> + bool "Asia Pacific (Tokyo)"
> +
> +endchoice
> +
> +config TERRAFORM_AWS_REGION
> + string
> + default "us-east-1" if TERRAFORM_AWS_REGION_USEAST1
> + default "us-east-2" if TERRAFORM_AWS_REGION_USEAST2
> + default "us-west-1" if TERRAFORM_AWS_REGION_USWEST1
> + default "us-west-2" if TERRAFORM_AWS_REGION_USWEST2
> + default "eu-west-1" if TERRAFORM_AWS_REGION_EUWEST1
> + default "eu-central-1" if TERRAFORM_AWS_REGION_EUCENTRAL1
> + default "ap-southeast-1" if TERRAFORM_AWS_REGION_APSOUTHEAST1
> + default "ap-northeast-1" if TERRAFORM_AWS_REGION_APNORTHEAST1
> + default "us-east-1"
> +
> +"""
> diff --git a/scripts/dynamic-cloud-kconfig.Makefile b/scripts/dynamic-cloud-kconfig.Makefile
> index e15651ab..4105e706 100644
> --- a/scripts/dynamic-cloud-kconfig.Makefile
> +++ b/scripts/dynamic-cloud-kconfig.Makefile
> @@ -12,9 +12,24 @@ LAMBDALABS_KCONFIG_IMAGES := $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.generated
>
> LAMBDALABS_KCONFIGS := $(LAMBDALABS_KCONFIG_COMPUTE) $(LAMBDALABS_KCONFIG_LOCATION) $(LAMBDALABS_KCONFIG_IMAGES)
>
> +# AWS dynamic configuration
> +AWS_KCONFIG_DIR := terraform/aws/kconfigs
> +AWS_KCONFIG_COMPUTE := $(AWS_KCONFIG_DIR)/Kconfig.compute.generated
> +AWS_KCONFIG_LOCATION := $(AWS_KCONFIG_DIR)/Kconfig.location.generated
> +AWS_INSTANCE_TYPES_DIR := $(AWS_KCONFIG_DIR)/instance-types
> +
> +# List of AWS instance type family files that will be generated
> +AWS_INSTANCE_TYPE_FAMILIES := m5 m7a t3 t3a c5 c7a i4i is4gen im4gn
> +AWS_INSTANCE_TYPE_KCONFIGS := $(foreach family,$(AWS_INSTANCE_TYPE_FAMILIES),$(AWS_INSTANCE_TYPES_DIR)/Kconfig.$(family).generated)
> +
> +AWS_KCONFIGS := $(AWS_KCONFIG_COMPUTE) $(AWS_KCONFIG_LOCATION) $(AWS_INSTANCE_TYPE_KCONFIGS)
> +
> # Add Lambda Labs generated files to mrproper clean list
> KDEVOPS_MRPROPER += $(LAMBDALABS_KCONFIGS)
>
> +# Add AWS generated files to mrproper clean list
> +KDEVOPS_MRPROPER += $(AWS_KCONFIGS)
> +
> # Touch Lambda Labs generated files so Kconfig can source them
> # This ensures the files exist (even if empty) before Kconfig runs
> dynamic_lambdalabs_kconfig_touch:
> @@ -22,20 +37,43 @@ dynamic_lambdalabs_kconfig_touch:
>
> DYNAMIC_KCONFIG += dynamic_lambdalabs_kconfig_touch
>
> +# Touch AWS generated files so Kconfig can source them
> +# This ensures the files exist (even if empty) before Kconfig runs
> +dynamic_aws_kconfig_touch:
> + $(Q)mkdir -p $(AWS_INSTANCE_TYPES_DIR)
> + $(Q)touch $(AWS_KCONFIG_COMPUTE) $(AWS_KCONFIG_LOCATION)
> + $(Q)touch $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.generated
> + $(Q)for family in $(AWS_INSTANCE_TYPE_FAMILIES); do \
> + touch $(AWS_INSTANCE_TYPES_DIR)/Kconfig.$$family.generated; \
> + done
> +
> +DYNAMIC_KCONFIG += dynamic_aws_kconfig_touch
> +
> # Individual Lambda Labs targets are now handled by generate_cloud_configs.py
> cloud-config-lambdalabs:
> $(Q)python3 scripts/generate_cloud_configs.py
>
> +# Individual AWS targets are now handled by generate_cloud_configs.py
> +cloud-config-aws:
> + $(Q)python3 scripts/generate_cloud_configs.py
> +
> # Clean Lambda Labs generated files
> clean-cloud-config-lambdalabs:
> $(Q)rm -f $(LAMBDALABS_KCONFIGS)
>
> -DYNAMIC_CLOUD_KCONFIG += cloud-config-lambdalabs
> +# Clean AWS generated files
> +clean-cloud-config-aws:
> + $(Q)rm -f $(AWS_KCONFIGS)
> + $(Q)rm -f .aws_cloud_config_generated
> +
> +DYNAMIC_CLOUD_KCONFIG += cloud-config-lambdalabs cloud-config-aws
>
> cloud-config-help:
> @echo "Cloud-specific dynamic kconfig targets:"
> @echo "cloud-config - generates all cloud provider dynamic kconfig content"
> @echo "cloud-config-lambdalabs - generates Lambda Labs dynamic kconfig content"
> + @echo "cloud-config-aws - generates AWS dynamic kconfig content"
> + @echo "cloud-update - converts generated cloud configs to static (for committing)"
> @echo "clean-cloud-config - removes all generated cloud kconfig files"
> @echo "cloud-list-all - list all cloud instances for configured provider"
>
> @@ -44,11 +82,55 @@ HELP_TARGETS += cloud-config-help
> cloud-config:
> $(Q)python3 scripts/generate_cloud_configs.py
>
> -clean-cloud-config: clean-cloud-config-lambdalabs
> +clean-cloud-config: clean-cloud-config-lambdalabs clean-cloud-config-aws
> + $(Q)rm -f .cloud.initialized
> $(Q)echo "Cleaned all cloud provider dynamic Kconfig files."
>
> cloud-list-all:
> $(Q)chmod +x scripts/cloud_list_all.sh
> $(Q)scripts/cloud_list_all.sh
>
> -PHONY += cloud-config cloud-config-lambdalabs clean-cloud-config clean-cloud-config-lambdalabs cloud-config-help cloud-list-all
> +# Convert dynamically generated cloud configs to static versions for git commits
> +# This allows admins to generate configs once and commit them for regular users
> +cloud-update:
> + @echo "Converting generated cloud configs to static versions..."
> + # AWS configs
> + $(Q)if [ -f $(AWS_KCONFIG_COMPUTE) ]; then \
> + cp $(AWS_KCONFIG_COMPUTE) $(AWS_KCONFIG_DIR)/Kconfig.compute.static; \
> + sed -i 's/Kconfig\.\([^.]*\)\.generated/Kconfig.\1.static/g' $(AWS_KCONFIG_DIR)/Kconfig.compute.static; \
> + echo " Created $(AWS_KCONFIG_DIR)/Kconfig.compute.static"; \
> + fi
> + $(Q)if [ -f $(AWS_KCONFIG_LOCATION) ]; then \
> + cp $(AWS_KCONFIG_LOCATION) $(AWS_KCONFIG_DIR)/Kconfig.location.static; \
> + sed -i 's/Kconfig\.\([^.]*\)\.generated/Kconfig.\1.static/g' $(AWS_KCONFIG_DIR)/Kconfig.location.static; \
> + echo " Created $(AWS_KCONFIG_DIR)/Kconfig.location.static"; \
> + fi
> + $(Q)if [ -f $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.generated ]; then \
> + cp $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.generated $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.static; \
> + sed -i 's/Kconfig\.\([^.]*\)\.generated/Kconfig.\1.static/g' $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.static; \
> + echo " Created $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.static"; \
> + fi
> + # AWS instance type families
> + $(Q)for file in $(AWS_INSTANCE_TYPES_DIR)/Kconfig.*.generated; do \
> + if [ -f "$$file" ]; then \
> + static_file=$$(echo "$$file" | sed 's/\.generated$$/\.static/'); \
> + cp "$$file" "$$static_file"; \
> + echo " Created $$static_file"; \
> + fi; \
> + done
> + # Lambda Labs configs
> + $(Q)if [ -f $(LAMBDALABS_KCONFIG_COMPUTE) ]; then \
> + cp $(LAMBDALABS_KCONFIG_COMPUTE) $(LAMBDALABS_KCONFIG_DIR)/Kconfig.compute.static; \
> + echo " Created $(LAMBDALABS_KCONFIG_DIR)/Kconfig.compute.static"; \
> + fi
> + $(Q)if [ -f $(LAMBDALABS_KCONFIG_LOCATION) ]; then \
> + cp $(LAMBDALABS_KCONFIG_LOCATION) $(LAMBDALABS_KCONFIG_DIR)/Kconfig.location.static; \
> + echo " Created $(LAMBDALABS_KCONFIG_DIR)/Kconfig.location.static"; \
> + fi
> + $(Q)if [ -f $(LAMBDALABS_KCONFIG_IMAGES) ]; then \
> + cp $(LAMBDALABS_KCONFIG_IMAGES) $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.static; \
> + echo " Created $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.static"; \
> + fi
> + @echo "Static cloud configs created. You can now commit these .static files to git."
> +
> +PHONY += cloud-config cloud-config-lambdalabs cloud-config-aws clean-cloud-config clean-cloud-config-lambdalabs clean-cloud-config-aws cloud-config-help cloud-list-all cloud-update
> diff --git a/scripts/generate_cloud_configs.py b/scripts/generate_cloud_configs.py
> index b16294dd..332cebe7 100755
> --- a/scripts/generate_cloud_configs.py
> +++ b/scripts/generate_cloud_configs.py
> @@ -10,6 +10,9 @@ import os
> import sys
> import subprocess
> import json
> +from concurrent.futures import ThreadPoolExecutor, as_completed
> +from pathlib import Path
> +from typing import Tuple
>
>
> def generate_lambdalabs_kconfig() -> bool:
> @@ -100,29 +103,194 @@ def get_lambdalabs_summary() -> tuple[bool, str]:
> return False, "Lambda Labs: Error querying API - using defaults"
>
>
> +def generate_aws_kconfig() -> bool:
> + """
> + Generate AWS Kconfig files.
> + Returns True on success, False on failure.
> + """
> + script_dir = os.path.dirname(os.path.abspath(__file__))
> + cli_path = os.path.join(script_dir, "aws-cli")
> +
> + # Generate the Kconfig files
> + result = subprocess.run(
> + [cli_path, "generate-kconfig"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + return result.returncode == 0
> +
> +
> +def get_aws_summary() -> tuple[bool, str]:
> + """
> + Get a summary of AWS configurations using aws-cli.
> + Returns (success, summary_string)
> + """
> + script_dir = os.path.dirname(os.path.abspath(__file__))
> + cli_path = os.path.join(script_dir, "aws-cli")
> +
> + try:
> + # Check if AWS CLI is available
> + result = subprocess.run(
> + ["aws", "--version"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + if result.returncode != 0:
> + return False, "AWS: AWS CLI not installed - using defaults"
> +
> + # Check if credentials are configured
> + result = subprocess.run(
> + ["aws", "sts", "get-caller-identity"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + if result.returncode != 0:
> + return False, "AWS: Credentials not configured - using defaults"
> +
> + # Get instance types count
> + result = subprocess.run(
> + [
> + cli_path,
> + "--output",
> + "json",
> + "instance-types",
> + "list",
> + "--max-results",
> + "100",
> + ],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + if result.returncode != 0:
> + return False, "AWS: Error querying API - using defaults"
> +
> + instances = json.loads(result.stdout)
> + instance_count = len(instances)
> +
> + # Get regions
> + result = subprocess.run(
> + [cli_path, "--output", "json", "regions", "list"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + if result.returncode == 0:
> + regions = json.loads(result.stdout)
> + region_count = len(regions)
> + else:
> + region_count = 0
> +
> + # Get price range from a sample of instances
> + prices = []
> + for instance in instances[:20]: # Sample first 20 for speed
> + if "error" not in instance:
> + # Extract price if available (would need pricing API)
> + # For now, we'll use a placeholder
> + vcpus = instance.get("vcpu", 0)
> + if vcpus > 0:
> + # Rough estimate: $0.05 per vCPU/hour
> + estimated_price = vcpus * 0.05
> + prices.append(estimated_price)
> +
> + # Format summary
> + if prices:
> + min_price = min(prices)
> + max_price = max(prices)
> + price_range = f"~${min_price:.2f}-${max_price:.2f}/hr"
> + else:
> + price_range = "pricing varies by region"
> +
> + return (
> + True,
> + f"AWS: {instance_count} instance types available, "
> + f"{region_count} regions, {price_range}",
> + )
> +
> + except (subprocess.SubprocessError, json.JSONDecodeError, KeyError):
> + return False, "AWS: Error querying API - using defaults"
> +
> +
> +def process_lambdalabs() -> Tuple[bool, bool, str]:
> + """Process Lambda Labs configuration generation and summary.
> + Returns (kconfig_generated, summary_success, summary_text)
> + """
> + kconfig_generated = generate_lambdalabs_kconfig()
> + success, summary = get_lambdalabs_summary()
> + return kconfig_generated, success, summary
> +
> +
> +def process_aws() -> Tuple[bool, bool, str]:
> + """Process AWS configuration generation and summary.
> + Returns (kconfig_generated, summary_success, summary_text)
> + """
> + kconfig_generated = generate_aws_kconfig()
> + success, summary = get_aws_summary()
> +
> + # Create marker file to indicate dynamic AWS config is available
> + if kconfig_generated:
> + marker_file = Path(".aws_cloud_config_generated")
> + marker_file.touch()
> +
> + return kconfig_generated, success, summary
> +
> +
> def main():
> """Main function to generate cloud configurations."""
> print("Cloud Provider Configuration Summary")
> print("=" * 60)
> print()
>
> - # Lambda Labs - Generate Kconfig files first
> - kconfig_generated = generate_lambdalabs_kconfig()
> + # Run cloud provider operations in parallel
> + results = {}
> + any_success = False
>
> - # Lambda Labs - Get summary
> - success, summary = get_lambdalabs_summary()
> - if success:
> - print(f"✓ {summary}")
> - if kconfig_generated:
> - print(" Kconfig files generated successfully")
> - else:
> - print(" Warning: Failed to generate Kconfig files")
> - else:
> - print(f"⚠ {summary}")
> - print()
> + with ThreadPoolExecutor(max_workers=4) as executor:
> + # Submit all tasks
> + futures = {
> + executor.submit(process_lambdalabs): "lambdalabs",
> + executor.submit(process_aws): "aws",
> + }
> +
> + # Process results as they complete
> + for future in as_completed(futures):
> + provider = futures[future]
> + try:
> + results[provider] = future.result()
> + except Exception as e:
> + results[provider] = (
> + False,
> + False,
> + f"{provider.upper()}: Error - {str(e)}",
> + )
> +
> + # Display results in consistent order
> + for provider in ["lambdalabs", "aws"]:
> + if provider in results:
> + kconfig_gen, success, summary = results[provider]
> + if success and kconfig_gen:
> + any_success = True
> + if success:
> + print(f"✓ {summary}")
> + if kconfig_gen:
> + print(" Kconfig files generated successfully")
> + else:
> + print(" Warning: Failed to generate Kconfig files")
> + else:
> + print(f"⚠ {summary}")
> + print()
>
> - # AWS (placeholder - not implemented)
> - print("⚠ AWS: Dynamic configuration not yet implemented")
> + # Create .cloud.initialized if any provider succeeded
> + if any_success:
> + Path(".cloud.initialized").touch()
>
> # Azure (placeholder - not implemented)
> print("⚠ Azure: Dynamic configuration not yet implemented")
> diff --git a/terraform/aws/kconfigs/Kconfig.compute b/terraform/aws/kconfigs/Kconfig.compute
> index bae0ea1c..6b5ff900 100644
> --- a/terraform/aws/kconfigs/Kconfig.compute
> +++ b/terraform/aws/kconfigs/Kconfig.compute
> @@ -1,94 +1,54 @@
> -choice
> - prompt "AWS instance types"
> - help
> - Instance types comprise varying combinations of hardware
> - platform, CPU count, memory size, storage, and networking
> - capacity. Select the type that provides an appropriate mix
> - of resources for your preferred workflows.
> -
> - Some instance types are region- and capacity-limited.
> -
> - See https://aws.amazon.com/ec2/instance-types/ for
> - details.
> +# AWS compute configuration
>
> -config TERRAFORM_AWS_INSTANCE_TYPE_M5
> - bool "M5"
> - depends on TARGET_ARCH_X86_64
> +config TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> + bool "Use dynamically generated instance types"
> + default $(shell, test -f .aws_cloud_config_generated && echo y || echo n)
> help
> - This is a general purpose type powered by Intel Xeon®
> - Platinum 8175M or 8259CL processors (Skylake or Cascade
> - Lake).
> + Enable this to use dynamically generated instance types from AWS CLI.
> + Run 'make cloud-config' to query AWS and generate available options.
> + When disabled, uses static predefined instance types.
>
> - See https://aws.amazon.com/ec2/instance-types/m5/ for
> - details.
> + This is automatically enabled when you run 'make cloud-config'.
>
> -config TERRAFORM_AWS_INSTANCE_TYPE_M7A
> - bool "M7a"
> - depends on TARGET_ARCH_X86_64
> - help
> - This is a general purpose type powered by 4th Generation
> - AMD EPYC processors.
> -
> - See https://aws.amazon.com/ec2/instance-types/m7a/ for
> - details.
> -
> -config TERRAFORM_AWS_INSTANCE_TYPE_I4I
> - bool "I4i"
> - depends on TARGET_ARCH_X86_64
> - help
> - This is a storage-optimized type powered by 3rd generation
> - Intel Xeon Scalable processors (Ice Lake) and use AWS Nitro
> - NVMe SSDs.
> +if TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> +# Include cloud-generated or static instance families
> +# Try static first (pre-generated by admins for faster loading)
> +# Fall back to generated files (requires AWS CLI)
> +source "terraform/aws/kconfigs/Kconfig.compute.static"
> +endif
>
> - See https://aws.amazon.com/ec2/instance-types/i4i/ for
> - details.
> -
> -config TERRAFORM_AWS_INSTANCE_TYPE_IS4GEN
> - bool "Is4gen"
> - depends on TARGET_ARCH_ARM64
> - help
> - This is a Storage-optimized type powered by AWS Graviton2
> - processors.
> -
> - See https://aws.amazon.com/ec2/instance-types/i4g/ for
> - details.
> -
> -config TERRAFORM_AWS_INSTANCE_TYPE_IM4GN
> - bool "Im4gn"
> - depends on TARGET_ARCH_ARM64
> +if !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> +# Static instance types when not using dynamic config
> +choice
> + prompt "AWS instance types"
> help
> - This is a storage-optimized type powered by AWS Graviton2
> - processors.
> + Instance types comprise varying combinations of hardware
> + platform, CPU count, memory size, storage, and networking
> + capacity. Select the type that provides an appropriate mix
> + of resources for your preferred workflows.
>
> - See https://aws.amazon.com/ec2/instance-types/i4g/ for
> - details.
> + Some instance types are region- and capacity-limited.
>
> -config TERRAFORM_AWS_INSTANCE_TYPE_C7A
> - depends on TARGET_ARCH_X86_64
> - bool "c7a"
> - help
> - This is a compute-optimized type powered by 4th generation
> - AMD EPYC processors.
> + See https://aws.amazon.com/ec2/instance-types/ for
> + details.
>
> - See https://aws.amazon.com/ec2/instance-types/c7a/ for
> - details.
>
> endchoice
> +endif # !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
>
> +if !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> +# Use static instance type definitions when not using dynamic config
> source "terraform/aws/kconfigs/instance-types/Kconfig.m5"
> source "terraform/aws/kconfigs/instance-types/Kconfig.m7a"
> -source "terraform/aws/kconfigs/instance-types/Kconfig.i4i"
> -source "terraform/aws/kconfigs/instance-types/Kconfig.is4gen"
> -source "terraform/aws/kconfigs/instance-types/Kconfig.im4gn"
> -source "terraform/aws/kconfigs/instance-types/Kconfig.c7a"
> +endif # !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
>
> choice
> prompt "Linux distribution"
> default TERRAFORM_AWS_DISTRO_DEBIAN
> help
> - Select a popular Linux distribution to install on your
> - instances, or use the "Custom AMI image" selection to
> - choose an image that is off the beaten path.
> + Select a popular Linux distribution to install on your
> + instances, or use the "Custom AMI image" selection to
> + choose an image that is off the beaten path.
>
> config TERRAFORM_AWS_DISTRO_AMAZON
> bool "Amazon Linux"
> @@ -120,3 +80,8 @@ source "terraform/aws/kconfigs/distros/Kconfig.oracle"
> source "terraform/aws/kconfigs/distros/Kconfig.rhel"
> source "terraform/aws/kconfigs/distros/Kconfig.sles"
> source "terraform/aws/kconfigs/distros/Kconfig.custom"
> +
> +# Include GPU AMI configuration if available (generated by cloud-config)
> +if TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> +source "terraform/aws/kconfigs/Kconfig.gpu-amis.static"
> +endif
--
Chuck Lever
2025-09-04 9:00 [PATCH 0/3] aws: add dynamic kconfig support Luis Chamberlain
2025-09-04 9:00 ` [PATCH 1/3] aws: add dynamic cloud configuration support using AWS CLI Luis Chamberlain
2025-09-04 13:55 ` Chuck Lever [this message]
2025-09-04 17:12 ` Luis Chamberlain
2025-09-04 9:00 ` [PATCH 2/3] aws: run make cloud-update Luis Chamberlain
2025-09-04 9:00 ` [PATCH 3/3] lambda: " Luis Chamberlain