From: Chuck Lever <cel@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>,
Daniel Gomez <da.gomez@kruces.com>,
kdevops@lists.linux.dev
Subject: Re: [PATCH 1/2] aws: add dynamic cloud configuration support using AWS CLI
Date: Sun, 7 Sep 2025 13:24:43 -0400
Message-ID: <d6ae74bb-3769-4de8-9e29-2c079f3335ed@kernel.org>
In-Reply-To: <20250907042325.2228868-2-mcgrof@kernel.org>
On 9/7/25 12:23 AM, Luis Chamberlain wrote:
> Add support for dynamically generating AWS instance types and regions
> configuration using the AWS CLI, similar to the Lambda Labs implementation.
>
> This allows users to:
> - Query real-time AWS instance availability
> - Generate Kconfig files with current instance families and regions
> - Choose between dynamic and static configuration modes
> - See pricing estimates and resource summaries
>
> Key components:
> - scripts/aws-cli: AWS CLI wrapper tool for kdevops
> - scripts/aws_api.py: Low-level AWS API functions (includes GPU AMI query functions)
> - Updated generate_cloud_configs.py to support AWS
> - Makefile integration for AWS Kconfig generation
> - Option to use dynamic or static AWS configuration
> - Documentation for cloud configuration management
>
> Usage: Run 'make cloud-config' to generate dynamic configuration.
>
> This parallelizes cloud provider operations to significantly improve
> generation time. The cloud-update target allows administrators to
> convert generated configs to static files for regular users to avoid
> the ~6 minute generation time.
>
> Generated-by: Claude AI
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
> .gitignore | 3 +
> docs/cloud-configuration.md | 268 ++++++
> scripts/aws-cli | 436 +++++++++
> scripts/aws_api.py | 1161 ++++++++++++++++++++++++
> scripts/dynamic-cloud-kconfig.Makefile | 95 +-
> scripts/generate_cloud_configs.py | 198 +++-
> terraform/aws/kconfigs/Kconfig.compute | 104 +--
> 7 files changed, 2175 insertions(+), 90 deletions(-)
> create mode 100644 docs/cloud-configuration.md
> create mode 100755 scripts/aws-cli
> create mode 100755 scripts/aws_api.py
>
> diff --git a/.gitignore b/.gitignore
> index 09d2ae33..30337add 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -115,3 +115,6 @@ terraform/lambdalabs/.terraform_api_key
> .cloud.initialized
>
> scripts/__pycache__/
> +.aws_cloud_config_generated
> +terraform/aws/kconfigs/*.generated
> +terraform/aws/kconfigs/instance-types/*.generated
> diff --git a/docs/cloud-configuration.md b/docs/cloud-configuration.md
> new file mode 100644
> index 00000000..e8386c82
> --- /dev/null
> +++ b/docs/cloud-configuration.md
> @@ -0,0 +1,268 @@
> +# Cloud Configuration Management in kdevops
> +
> +kdevops supports dynamic cloud provider configuration, allowing administrators to generate up-to-date instance types, locations, and AMI options directly from cloud provider APIs. Since generating these configurations can take several minutes (approximately 6 minutes for AWS), kdevops implements a two-tier system to optimize the user experience.
> +
> +## Overview
> +
> +The cloud configuration system follows a pattern similar to Linux kernel refs management (`make refs-default`), where administrators generate fresh configurations that are then committed to the repository as static files for regular users. This approach provides:
> +
> +- **Fast configuration loading** for regular users (using pre-generated static files)
> +- **Fresh, up-to-date options** when administrators regenerate configurations
> +- **No dependency on cloud CLI tools** for regular users
> +- **Reduced API calls** to cloud providers
> +
> +## Configuration Generation Flow
> +
> +```
> +Cloud Provider API → Generated Files → Static Files → Git Repository
> + ↑ ↑ ↑
> + make cloud-config (automatic) make cloud-update
> +```
> +
> +## Available Targets
> +
> +### `make cloud-config`
> +
> +Generates dynamic cloud configurations by querying cloud provider APIs.
> +
> +**Purpose**: Fetches current instance types, regions, availability zones, and AMI options from cloud providers.
> +
> +**Usage**:
> +```bash
> +make cloud-config
> +```
> +
> +**What it does**:
> +- Queries AWS EC2 API for all available instance types and their specifications
> +- Fetches current regions and availability zones
> +- Discovers available AMIs including GPU-optimized images
> +- Generates Kconfig files with all discovered options
> +- Creates `.generated` files in provider-specific directories
> +- Sets a marker file (`.aws_cloud_config_generated`) to enable dynamic config
> +
> +**Time required**: Approximately 6 minutes for AWS (similar for other providers)
> +
> +**Generated files**:
> +- `terraform/aws/kconfigs/Kconfig.compute.generated`
> +- `terraform/aws/kconfigs/Kconfig.location.generated`
> +- `terraform/aws/kconfigs/Kconfig.gpu-amis.generated`
> +- `terraform/aws/kconfigs/instance-types/Kconfig.*.generated`
> +- Similar files for other cloud providers
> +
> +### `make cloud-update`
> +
> +Converts dynamically generated configurations to static files for committing to git.
> +
> +**Purpose**: Creates static copies of generated configurations that load instantly without requiring cloud API access.
> +
> +**Usage**:
> +```bash
> +make cloud-update
> +```
> +
> +**What it does**:
> +- Copies all `.generated` files to `.static` equivalents
> +- Updates internal references from `.generated` to `.static`
> +- Prepares files for git commit
> +- Allows regular users to benefit from pre-generated configurations
> +
> +**Static files created**:
> +- All `.generated` files get `.static` counterparts
> +- References within files are updated to use `.static` versions
> +
> +### `make clean-cloud-config`
> +
> +Removes all generated cloud configuration files.
> +
> +**Usage**:
> +```bash
> +make clean-cloud-config
> +```
> +
> +**What it does**:
> +- Removes all `.generated` files
> +- Removes cloud initialization marker files
> +- Forces regeneration on next `make cloud-config`
> +
> +## Usage Workflow
> +
> +### For Cloud Administrators/Maintainers
> +
> +Cloud administrators are responsible for keeping the static configurations up-to-date:
> +
> +1. **Generate fresh configurations**:
> + ```bash
> + make cloud-config # Wait ~6 minutes for API queries
> + ```
> +
> +2. **Convert to static files**:
> + ```bash
> + make cloud-update # Instant - just copies files
> + ```
> +
> +3. **Commit the static files**:
> + ```bash
> + git add terraform/*/kconfigs/*.static
> + git add terraform/*/kconfigs/instance-types/*.static
> + git commit -m "cloud: update static configurations for AWS/Azure/GCE
> +
> + Update instance types, regions, and AMI options to current offerings.
> +
> + Generated with AWS CLI version X.Y.Z on YYYY-MM-DD."
> + git push
> + ```
Thanks, this is very helpful.

I want to pull this some time this week and try it out. Is it in a
public branch?

A few more comments below. Quite possibly you could merge this and
we can just start polishing once it is merged.
> +### For Regular Users
> +
> +Regular users benefit from pre-generated static configurations:
> +
> +1. **Clone or pull the repository**:
> + ```bash
> + git clone https://github.com/linux-kdevops/kdevops
> + cd kdevops
> + ```
> +
> +2. **Use cloud configurations immediately**:
> + ```bash
> + make menuconfig # Cloud options load instantly from static files
> + make defconfig-aws-large
> + make
> + ```
> +
> +No cloud CLI tools or API access required - everything loads from committed static files.
I expect that a CLI tool or cloud console access /is/ needed to generate
authentication tokens, so this claim ought to be more specific.
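To make the distinction concrete: browsing the committed menus needs
no AWS tooling, but provisioning still needs credentials from
somewhere. A minimal sketch of such a check, assuming the standard AWS
environment variables and the default shared credentials file.
aws_credentials_present() is a hypothetical helper, not part of this
patch, and it does not cover SSO or instance-role setups:

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def aws_credentials_present(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> bool:
    """Heuristic: True if static AWS credentials appear to be configured."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    # An explicit key pair in the environment is enough.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return True

    # Otherwise look for the shared credentials file, honoring the
    # standard override variable if it is set.
    default_file = home / ".aws" / "credentials"
    cred_file = Path(env.get("AWS_SHARED_CREDENTIALS_FILE", str(default_file)))
    return cred_file.is_file()
```

The documentation could then say that menu selection works without
credentials, while bring-up requires them.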
> +
> +## How It Works
> +
> +### Dynamic Configuration Detection
> +
> +kdevops automatically detects whether to use dynamic or static configurations:
> +
> +```kconfig
> +config TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> + bool "Use dynamically generated instance types"
> + default $(shell, test -f .aws_cloud_config_generated && echo y || echo n)
> +```
> +
> +- If `.aws_cloud_config_generated` exists, dynamic configs are used
> +- Otherwise, static configs are used (default for most users)
> +
> +### File Precedence
> +
> +The Kconfig system sources files in this order:
> +
> +1. **Static files** (`.static`) - Pre-generated by administrators
> +2. **Generated files** (`.generated`) - Created by `make cloud-config`
> +
> +Static files take precedence and are preferred for faster loading.
> +
> +### Instance Type Organization
> +
> +Instance types are organized by family for better navigation:
> +
> +```
> +terraform/aws/kconfigs/instance-types/
> +├── Kconfig.m5.static # M5 family instances
> +├── Kconfig.m7a.static # M7a family instances
> +├── Kconfig.g6e.static # G6E GPU instances
> +└── ... # Other families
> +```
> +
> +## Supported Cloud Providers
> +
> +### AWS
> +- **Instance types**: All EC2 instance families and sizes
> +- **Regions**: All AWS regions and availability zones
> +- **AMIs**: Standard distributions and GPU-optimized Deep Learning AMIs
> +- **Time to generate**: ~6 minutes
> +
> +### Azure
> +- **Instance types**: All Azure VM sizes
> +- **Regions**: All Azure regions
> +- **Images**: Standard and specialized images
> +- **Time to generate**: ~5-7 minutes
> +
> +### Google Cloud (GCE)
> +- **Instance types**: All GCE machine types
> +- **Regions**: All GCE regions and zones
> +- **Images**: Public and custom images
> +- **Time to generate**: ~5-7 minutes
I don't see the Azure or Google Cloud pieces in this patch. Should
the above mentions be removed for the moment?
> +
> +### Lambda Labs
> +- **Instance types**: GPU-optimized instances
> +- **Regions**: Available data centers
> +- **Images**: ML-optimized images
> +- **Time to generate**: ~1-2 minutes
> +
> +## Benefits
> +
> +### For Regular Users
> +- **Instant configuration** - No waiting for API queries
> +- **No cloud CLI required** - Works without AWS CLI, gcloud, or Azure CLI
> +- **Consistent experience** - Same options for all users
> +- **Offline capable** - Works without internet access
> +
> +### For Administrators
> +- **Centralized updates** - Update once for all users
> +- **Version control** - Track configuration changes over time
> +- **Reduced API calls** - Query once, use many times
> +- **Flexibility** - Can still generate fresh configs when needed
> +
> +## Best Practices
> +
> +1. **Update regularly**: Cloud administrators should regenerate configurations monthly or when significant changes occur
> +
> +2. **Document updates**: Include cloud CLI version and date in commit messages
> +
> +3. **Test before committing**: Verify generated configurations work correctly:
> + ```bash
> + make cloud-config
> + make cloud-update
> + make menuconfig # Test that options appear correctly
> + ```
> +
> +4. **Use defconfigs**: Create defconfigs for common cloud configurations:
> + ```bash
> + make savedefconfig
> + cp defconfig defconfigs/aws-gpu-large
> + ```
> +
> +5. **Handle errors gracefully**: If cloud-config fails, static files still work
> +
> +## Troubleshooting
> +
> +### Configuration not appearing in menuconfig
> +
> +Check if dynamic config is enabled:
> +```bash
> +ls -la .aws_cloud_config_generated
> +grep USE_DYNAMIC_CONFIG .config
> +```
In terms of usability, why does the kdevops user need to configure or
enable dynamic menu building? Can we just replace the menu files I wrote
with the generated menus, wholesale, with this patch? Seems like there
is sensible default behavior for users after a simple "git clone
kdevops" -- the same set of make targets will work the same way.

Or to put it another way, for me the merge criterion for this patch is
that it can generate a set of working AWS menus that are a superset of
what is already in the tree now. I'm not seeing a need to turn this
facility on or off. If there is a need for disabling it, can you add it
to the patch description or Kconfig help text?
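The superset criterion can be checked mechanically. A minimal sketch,
assuming that comparing the sets of "config" symbol names is a good
enough proxy for "superset". The symbol names in the example are made
up, and missing_symbols() is a hypothetical helper, not something in
this patch:

```python
import re
from typing import Set

# Matches "config FOO" and "menuconfig FOO" entry lines.
CONFIG_RE = re.compile(
    r"^[ \t]*(?:menu)?config[ \t]+([A-Za-z0-9_]+)[ \t]*$", re.MULTILINE
)


def kconfig_symbols(text: str) -> Set[str]:
    """Return the set of symbol names defined in a Kconfig fragment."""
    return set(CONFIG_RE.findall(text))


def missing_symbols(current: str, generated: str) -> Set[str]:
    """Symbols defined in the current menu but absent from the generated one."""
    return kconfig_symbols(current) - kconfig_symbols(generated)


# Hypothetical fragments, for illustration only.
current_menu = 'config EXAMPLE_M5_LARGE\n\tbool "m5.large"\n'
generated_menu = (
    'config EXAMPLE_M5_LARGE\n\tbool "m5.large"\n\n'
    'config EXAMPLE_M7A_XLARGE\n\tbool "m7a.xlarge"\n'
)
lost = missing_symbols(current_menu, generated_menu)
```

An empty result for every existing file under terraform/aws/kconfigs/
would mean that nothing a user can select today disappears.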
> +
> +### Generated files have wrong references
> +
> +Run `make cloud-update` to fix references from `.generated` to `.static`.
Again, I'm missing the difference between .generated and .static. It
might be simpler overall if we just moved forward with all generated
Kconfig menus.
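As far as I can tell from the documentation above, the two sets differ
only in file suffix and in the references between files. A minimal
sketch of "make cloud-update" under that reading; this is my
interpretation of the doc text, not the patch's actual Makefile rule,
and promote_generated() is a hypothetical name:

```python
from pathlib import Path
from typing import List


def promote_generated(kconfig_dir: Path) -> List[Path]:
    """Copy every *.generated file to *.static, rewriting references."""
    written = []
    for src in sorted(kconfig_dir.rglob("*.generated")):
        # Kconfig.compute.generated -> Kconfig.compute.static
        dst = src.with_suffix(".static")
        # Naive textual rewrite of "source ... .generated" references.
        dst.write_text(src.read_text().replace(".generated", ".static"))
        written.append(dst)
    return written
```

If that is all the step does, committing the generated menus directly
would give the same result with one less file type to explain.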
> +
> +### Old instance types appearing
> +
> +Regenerate configurations:
> +```bash
> +make clean-cloud-config
> +make cloud-config
> +make cloud-update
> +```
> +
> +## Implementation Details
> +
> +The cloud configuration system is implemented in:
> +
> +- `scripts/dynamic-cloud-kconfig.Makefile` - Make targets and build rules
> +- `scripts/aws_api.py` - AWS configuration generator
> +- `scripts/generate_cloud_configs.py` - Main configuration generator
> +- `terraform/*/kconfigs/` - Provider-specific Kconfig files
> +
> +## See Also
> +
> +- [AWS Instance Types](https://aws.amazon.com/ec2/instance-types/)
> +- [Azure VM Sizes](https://docs.microsoft.com/en-us/azure/virtual-machines/sizes)
> +- [GCE Machine Types](https://cloud.google.com/compute/docs/machine-types)
> +- [kdevops Terraform Documentation](terraform.md)
> diff --git a/scripts/aws-cli b/scripts/aws-cli
> new file mode 100755
> index 00000000..6cacce8b
> --- /dev/null
> +++ b/scripts/aws-cli
> @@ -0,0 +1,436 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: MIT
> +"""
> +AWS CLI tool for kdevops
> +
> +A structured CLI tool that wraps AWS CLI commands and provides access to
> +AWS cloud provider functionality for dynamic configuration generation
> +and resource management.
> +"""
> +
> +import argparse
> +import json
> +import sys
> +import os
> +from typing import Dict, List, Any, Optional, Tuple
> +from pathlib import Path
> +
> +# Import the AWS API functions
> +try:
> + from aws_api import (
> + check_aws_cli,
> + get_instance_types,
> + get_regions,
> + get_availability_zones,
> + get_pricing_info,
> + generate_instance_types_kconfig,
> + generate_regions_kconfig,
> + generate_instance_families_kconfig,
> + generate_gpu_amis_kconfig,
> + )
> +except ImportError:
> + # Try to import from scripts directory if not in path
> + sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
> + from aws_api import (
> + check_aws_cli,
> + get_instance_types,
> + get_regions,
> + get_availability_zones,
> + get_pricing_info,
> + generate_instance_types_kconfig,
> + generate_regions_kconfig,
> + generate_instance_families_kconfig,
> + generate_gpu_amis_kconfig,
> + )
> +
> +
> +class AWSCLI:
> + """AWS CLI interface for kdevops"""
> +
> + def __init__(self, output_format: str = "json"):
> + """
> + Initialize the CLI with specified output format
> +
> + Args:
> + output_format: 'json' or 'text' for output formatting
> + """
> + self.output_format = output_format
> + self.aws_available = check_aws_cli()
> +
> + def output(self, data: Any, headers: Optional[List[str]] = None):
> + """
> + Output data in the specified format
> +
> + Args:
> + data: Data to output (dict, list, or primitive)
> + headers: Column headers for text format (optional)
> + """
> + if self.output_format == "json":
> + print(json.dumps(data, indent=2))
> + else:
> + # Human-readable text format
> + if isinstance(data, list):
> + if data and isinstance(data[0], dict):
> + # Table format for list of dicts
> + if not headers:
> + headers = list(data[0].keys()) if data else []
> +
> + if headers:
> + # Calculate column widths
> + widths = {h: len(h) for h in headers}
> + for item in data:
> + for h in headers:
> + val = str(item.get(h, ""))
> + widths[h] = max(widths[h], len(val))
> +
> + # Print header
> + header_line = " | ".join(h.ljust(widths[h]) for h in headers)
> + print(header_line)
> + print("-" * len(header_line))
> +
> + # Print rows
> + for item in data:
> + row = " | ".join(
> + str(item.get(h, "")).ljust(widths[h]) for h in headers
> + )
> + print(row)
> + else:
> + # Simple list
> + for item in data:
> + print(item)
> + elif isinstance(data, dict):
> + # Key-value format
> + max_key_len = max(len(k) for k in data.keys()) if data else 0
> + for key, value in data.items():
> + print(f"{key.ljust(max_key_len)} : {value}")
> + else:
> + # Simple value
> + print(data)
> +
> + def list_instance_types(
> + self,
> + family: Optional[str] = None,
> + region: Optional[str] = None,
> + max_results: int = 100,
> + ) -> List[Dict[str, Any]]:
> + """
> + List instance types
> +
> + Args:
> + family: Filter by instance family (e.g., 'm5', 't3')
> + region: AWS region to query
> + max_results: Maximum number of results to return
> +
> + Returns:
> + List of instance type information
> + """
> + if not self.aws_available:
> + return [
> + {
> + "error": "AWS CLI not found. Please install AWS CLI and configure credentials."
> + }
> + ]
> +
> + instances = get_instance_types(
> + family=family, region=region, max_results=max_results
> + )
> +
> + # Format the results
> + result = []
> + for instance in instances:
> + item = {
> + "name": instance.get("InstanceType", ""),
> + "vcpu": instance.get("VCpuInfo", {}).get("DefaultVCpus", 0),
> + "memory_gb": instance.get("MemoryInfo", {}).get("SizeInMiB", 0) / 1024,
> + "instance_storage": instance.get("InstanceStorageSupported", False),
> + "network_performance": instance.get("NetworkInfo", {}).get(
> + "NetworkPerformance", ""
> + ),
> + "architecture": ", ".join(
> + instance.get("ProcessorInfo", {}).get("SupportedArchitectures", [])
> + ),
> + }
> + result.append(item)
> +
> + # Sort by name
> + result.sort(key=lambda x: x["name"])
> +
> + return result
> +
> + def list_regions(self, include_zones: bool = False) -> List[Dict[str, Any]]:
> + """
> + List regions
> +
> + Args:
> + include_zones: Include availability zones for each region
> +
> + Returns:
> + List of region information
> + """
> + if not self.aws_available:
> + return [
> + {
> + "error": "AWS CLI not found. Please install AWS CLI and configure credentials."
> + }
> + ]
> +
> + regions = get_regions()
> +
> + result = []
> + for region in regions:
> + item = {
> + "name": region.get("RegionName", ""),
> + "endpoint": region.get("Endpoint", ""),
> + "opt_in_status": region.get("OptInStatus", ""),
> + }
> +
> + if include_zones:
> + # Get availability zones for this region
> + zones = get_availability_zones(region["RegionName"])
> + item["zones"] = len(zones)
> + item["zone_names"] = ", ".join([z["ZoneName"] for z in zones])
> +
> + result.append(item)
> +
> + return result
> +
> + def get_cheapest_instance(
> + self,
> + region: Optional[str] = None,
> + family: Optional[str] = None,
> + min_vcpus: int = 2,
> + ) -> Dict[str, Any]:
> + """
> + Get the cheapest instance meeting criteria
> +
> + Args:
> + region: AWS region
> + family: Instance family filter
> + min_vcpus: Minimum number of vCPUs required
> +
> + Returns:
> + Dictionary with instance information
> + """
> + if not self.aws_available:
> + return {"error": "AWS CLI not available"}
> +
> + instances = get_instance_types(family=family, region=region)
> +
> + # Filter by minimum vCPUs
> + eligible = []
> + for instance in instances:
> + vcpus = instance.get("VCpuInfo", {}).get("DefaultVCpus", 0)
> + if vcpus >= min_vcpus:
> + eligible.append(instance)
> +
> + if not eligible:
> + return {"error": "No instances found matching criteria"}
> +
> + # Get pricing for eligible instances
> + pricing = get_pricing_info(region=region or "us-east-1")
> +
> + # Find cheapest
> + cheapest = None
> + cheapest_price = float("inf")
> +
> + for instance in eligible:
> + instance_type = instance.get("InstanceType")
> + price = pricing.get(instance_type, {}).get("on_demand", float("inf"))
> + if price < cheapest_price:
> + cheapest_price = price
> + cheapest = instance
> +
> + if cheapest:
> + return {
> + "instance_type": cheapest.get("InstanceType"),
> + "vcpus": cheapest.get("VCpuInfo", {}).get("DefaultVCpus", 0),
> + "memory_gb": cheapest.get("MemoryInfo", {}).get("SizeInMiB", 0) / 1024,
> + "price_per_hour": f"${cheapest_price:.3f}",
> + }
> +
> + return {"error": "Could not determine cheapest instance"}
> +
> + def generate_kconfig(self) -> bool:
> + """
> + Generate Kconfig files for AWS
> +
> + Returns:
> + True on success, False on failure
> + """
> + if not self.aws_available:
> + print("AWS CLI not available, cannot generate Kconfig", file=sys.stderr)
> + return False
> +
> + output_dir = Path("terraform/aws/kconfigs")
> +
> + # Create directory if it doesn't exist
> + output_dir.mkdir(parents=True, exist_ok=True)
> +
> + try:
> + from concurrent.futures import ThreadPoolExecutor, as_completed
> +
> + # Generate files in parallel
> + instance_types_dir = output_dir / "instance-types"
> + instance_types_dir.mkdir(exist_ok=True)
> +
> + def generate_family_file(family):
> + """Generate Kconfig for a single family."""
> + types_kconfig = generate_instance_types_kconfig(family)
> + if types_kconfig:
> + types_file = instance_types_dir / f"Kconfig.{family}.generated"
> + types_file.write_text(types_kconfig)
> + return f"Generated {types_file}"
> + return None
> +
> + with ThreadPoolExecutor(max_workers=10) as executor:
> + # Submit all generation tasks
> + futures = []
> +
> + # Generate instance families Kconfig
> + futures.append(executor.submit(generate_instance_families_kconfig))
> +
> + # Generate regions Kconfig
> + futures.append(executor.submit(generate_regions_kconfig))
> +
> + # Generate GPU AMIs Kconfig
> + futures.append(executor.submit(generate_gpu_amis_kconfig))
> +
> + # Generate instance types for each family
> + # Get all families dynamically from AWS
> + from aws_api import get_generated_instance_families
> +
> + families = get_generated_instance_families()
> +
> + family_futures = []
> + for family in sorted(families):
> + family_futures.append(executor.submit(generate_family_file, family))
> +
> + # Process main config results
> + families_kconfig = futures[0].result()
> + regions_kconfig = futures[1].result()
> + gpu_amis_kconfig = futures[2].result()
> +
> + # Write main configs
> + families_file = output_dir / "Kconfig.compute.generated"
> + families_file.write_text(families_kconfig)
> + print(f"Generated {families_file}")
> +
> + regions_file = output_dir / "Kconfig.location.generated"
> + regions_file.write_text(regions_kconfig)
> + print(f"Generated {regions_file}")
> +
> + gpu_amis_file = output_dir / "Kconfig.gpu-amis.generated"
> + gpu_amis_file.write_text(gpu_amis_kconfig)
> + print(f"Generated {gpu_amis_file}")
> +
> + # Process family results
> + for future in family_futures:
> + result = future.result()
> + if result:
> + print(result)
> +
> + return True
> +
> + except Exception as e:
> + print(f"Error generating Kconfig: {e}", file=sys.stderr)
> + return False
> +
> +
> +def main():
> + """Main entry point"""
> + parser = argparse.ArgumentParser(
> + description="AWS CLI tool for kdevops",
> + formatter_class=argparse.RawDescriptionHelpFormatter,
> + )
> +
> + parser.add_argument(
> + "--output",
> + choices=["json", "text"],
> + default="json",
> + help="Output format (default: json)",
> + )
> +
> + subparsers = parser.add_subparsers(dest="command", help="Available commands")
> +
> + # Generate Kconfig command
> + kconfig_parser = subparsers.add_parser(
> + "generate-kconfig", help="Generate Kconfig files for AWS"
> + )
> +
> + # Instance types command
> + instances_parser = subparsers.add_parser(
> + "instance-types", help="Manage instance types"
> + )
> + instances_subparsers = instances_parser.add_subparsers(
> + dest="subcommand", help="Instance type operations"
> + )
> +
> + # Instance types list
> + list_instances = instances_subparsers.add_parser("list", help="List instance types")
> + list_instances.add_argument("--family", help="Filter by instance family")
> + list_instances.add_argument("--region", help="AWS region")
> + list_instances.add_argument(
> + "--max-results", type=int, default=100, help="Maximum results (default: 100)"
> + )
> +
> + # Regions command
> + regions_parser = subparsers.add_parser("regions", help="Manage regions")
> + regions_subparsers = regions_parser.add_subparsers(
> + dest="subcommand", help="Region operations"
> + )
> +
> + # Regions list
> + list_regions = regions_subparsers.add_parser("list", help="List regions")
> + list_regions.add_argument(
> + "--include-zones",
> + action="store_true",
> + help="Include availability zones",
> + )
> +
> + # Cheapest instance command
> + cheapest_parser = subparsers.add_parser(
> + "cheapest", help="Find cheapest instance meeting criteria"
> + )
> + cheapest_parser.add_argument("--region", help="AWS region")
> + cheapest_parser.add_argument("--family", help="Instance family")
> + cheapest_parser.add_argument(
> + "--min-vcpus", type=int, default=2, help="Minimum vCPUs (default: 2)"
> + )
> +
> + args = parser.parse_args()
> +
> + cli = AWSCLI(output_format=args.output)
> +
> + if args.command == "generate-kconfig":
> + success = cli.generate_kconfig()
> + sys.exit(0 if success else 1)
> +
> + elif args.command == "instance-types":
> + if args.subcommand == "list":
> + instances = cli.list_instance_types(
> + family=args.family,
> + region=args.region,
> + max_results=args.max_results,
> + )
> + cli.output(instances)
> +
> + elif args.command == "regions":
> + if args.subcommand == "list":
> + regions = cli.list_regions(include_zones=args.include_zones)
> + cli.output(regions)
> +
> + elif args.command == "cheapest":
> + result = cli.get_cheapest_instance(
> + region=args.region,
> + family=args.family,
> + min_vcpus=args.min_vcpus,
> + )
> + cli.output(result)
> +
> + else:
> + parser.print_help()
> + sys.exit(1)
> +
> +
> +if __name__ == "__main__":
> + main()
> diff --git a/scripts/aws_api.py b/scripts/aws_api.py
> new file mode 100755
> index 00000000..e23acaa9
> --- /dev/null
> +++ b/scripts/aws_api.py
> @@ -0,0 +1,1161 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: MIT
> +"""
> +AWS API library for kdevops.
> +
> +Provides AWS CLI wrapper functions for dynamic configuration generation.
> +Used by aws-cli and other kdevops components.
> +"""
> +
> +import json
> +import os
> +import re
> +import subprocess
> +import sys
> +from typing import Dict, List, Optional, Any
> +
> +
> +def check_aws_cli() -> bool:
> + """Check if AWS CLI is installed and configured."""
> + try:
> + # Check if AWS CLI is installed
> + result = subprocess.run(
> + ["aws", "--version"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> + if result.returncode != 0:
> + return False
> +
> + # Check if credentials are configured
> + result = subprocess.run(
> + ["aws", "sts", "get-caller-identity"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> + return result.returncode == 0
> + except FileNotFoundError:
> + return False
> +
> +
> +def get_default_region() -> str:
> + """Get the default AWS region from configuration or environment."""
> + # Try to get from environment
> + region = os.environ.get("AWS_DEFAULT_REGION")
> + if region:
> + return region
> +
> + # Try to get from AWS config
> + try:
> + result = subprocess.run(
> + ["aws", "configure", "get", "region"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> + if result.returncode == 0 and result.stdout.strip():
> + return result.stdout.strip()
> + except:
> + pass
> +
> + # Default to us-east-1
> + return "us-east-1"
> +
> +
> +def run_aws_command(command: List[str], region: Optional[str] = None) -> Optional[Dict]:
> + """
> + Run an AWS CLI command and return the JSON output.
> +
> + Args:
> + command: AWS CLI command as a list
> + region: Optional AWS region
> +
> + Returns:
> + Parsed JSON output or None on error
> + """
> + cmd = ["aws"] + command + ["--output", "json"]
> +
> + # Always specify a region (use default if not provided)
> + if not region:
> + region = get_default_region()
> + cmd.extend(["--region", region])
> +
> + try:
> + result = subprocess.run(
> + cmd,
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> + if result.returncode == 0:
> + return json.loads(result.stdout) if result.stdout else {}
> + else:
> + print(f"AWS command failed: {result.stderr}", file=sys.stderr)
> + return None
> + except (subprocess.SubprocessError, json.JSONDecodeError) as e:
> + print(f"Error running AWS command: {e}", file=sys.stderr)
> + return None
> +
> +
> +def get_regions() -> List[Dict[str, Any]]:
> + """Get available AWS regions."""
> + response = run_aws_command(["ec2", "describe-regions"])
> + if response and "Regions" in response:
> + return response["Regions"]
> + return []
> +
> +
> +def get_availability_zones(region: str) -> List[Dict[str, Any]]:
> + """Get availability zones for a specific region."""
> + response = run_aws_command(
> + ["ec2", "describe-availability-zones"],
> + region=region,
> + )
> + if response and "AvailabilityZones" in response:
> + return response["AvailabilityZones"]
> + return []
> +
> +
> +def get_instance_types(
> + family: Optional[str] = None,
> + region: Optional[str] = None,
> + max_results: int = 100,
> + fetch_all: bool = True,
> +) -> List[Dict[str, Any]]:
> + """
> + Get available instance types.
> +
> + Args:
> + family: Instance family filter (e.g., 'm5', 't3')
> + region: AWS region
> + max_results: Maximum number of results per API call (max 100)
> + fetch_all: If True, fetch all pages using NextToken pagination
> +
> + Returns:
> + List of instance type information
> + """
> + all_instances = []
> + next_token = None
> + page_count = 0
> +
> + # Ensure max_results doesn't exceed AWS limit
> + max_results = min(max_results, 100)
> +
> + while True:
> + cmd = ["ec2", "describe-instance-types"]
> +
> + filters = []
> + if family:
> + # Filter by instance type pattern
> + filters.append(f"Name=instance-type,Values={family}*")
> +
> + if filters:
> + cmd.append("--filters")
> + cmd.extend(filters)
> +
> + cmd.extend(["--max-results", str(max_results)])
> +
> + if next_token:
> + cmd.extend(["--next-token", next_token])
> +
> + response = run_aws_command(cmd, region=region)
> + if response and "InstanceTypes" in response:
> + batch_size = len(response["InstanceTypes"])
> + all_instances.extend(response["InstanceTypes"])
> + page_count += 1
> +
> + if fetch_all and not family:
> + # Only show progress for full fetches (not family-specific)
> + print(
> + f" Fetched page {page_count}: {batch_size} instance types (total: {len(all_instances)})",
> + file=sys.stderr,
> + )
> +
> + # Check if there are more results
> + if fetch_all and "NextToken" in response:
> + next_token = response["NextToken"]
> + else:
> + break
> + else:
> + break
> +
> + if fetch_all and page_count > 1:
> + filter_desc = f" for family '{family}'" if family else ""
> + print(
> + f" Total: {len(all_instances)} instance types fetched{filter_desc}",
> + file=sys.stderr,
> + )
> +
> + return all_instances
> +
> +
> +def get_pricing_info(region: str = "us-east-1") -> Dict[str, Dict[str, float]]:
> + """
> + Get pricing information for instance types.
> +
> + Note: AWS Pricing API requires us-east-1 region.
> + Returns a simplified pricing structure.
> +
> + Args:
> + region: AWS region for pricing
> +
> + Returns:
> + Dictionary mapping instance types to pricing info
> + """
> + # For simplicity, we'll use hardcoded common instance prices
> + # In production, you'd query the AWS Pricing API
Not clear to me... is this script simply returning a constant blob
of JSON, or is there a real API query going on? The comments suggest
there is more to be done here. (Just an observation).
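For comparison, a real query would go through "aws pricing
get-products", whose PriceList entries are JSON documents with the rate
nested under terms/OnDemand. A minimal sketch of the parsing side only,
assuming that layout. on_demand_usd() is a hypothetical helper, the
command is shown but not executed, and a usable query would need more
filters (operating system, tenancy, location) to narrow the result to
one product:

```python
import json
from typing import Any, Dict, Optional, Union

# Command that would produce the input; listed for reference, not run here.
PRICING_CMD = [
    "aws", "pricing", "get-products",
    "--service-code", "AmazonEC2",
    "--filters", "Type=TERM_MATCH,Field=instanceType,Value=m5.large",
    "--region", "us-east-1",
    "--output", "json",
]


def on_demand_usd(entry: Union[str, Dict[str, Any]]) -> Optional[float]:
    """Return the first positive hourly USD rate in one PriceList entry."""
    product = json.loads(entry) if isinstance(entry, str) else entry
    for term in product.get("terms", {}).get("OnDemand", {}).values():
        for dim in term.get("priceDimensions", {}).values():
            usd = dim.get("pricePerUnit", {}).get("USD")
            if usd is None:
                continue
            try:
                price = float(usd)
            except ValueError:
                continue
            if price > 0:
                return price
    return None
```

Returning None when no rate is found lets the caller omit the price
rather than fall back to a table that goes stale.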
Also, see below: I'm not sure why we need to keep a lot of default
information around in this script. Either the menu regeneration
worked and replaces the previous one (and can be backed out via
a normal revert) or regeneration doesn't work, in which case the
menus shouldn't change.

We always have the safety net of git to quickly get back to a working
configuration: something like "git reset --hard".
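The all-or-nothing behavior could look roughly like this. A minimal
sketch, assuming the generator returns a mapping of relative file name
to menu text; regenerate_menus() and that calling convention are
hypothetical, not what the patch does today:

```python
import os
import sys
import tempfile
from pathlib import Path
from typing import Callable, Dict


def regenerate_menus(output_dir: Path, generate: Callable[[], Dict[str, str]]) -> bool:
    """Replace the menus only if generation fully succeeded."""
    try:
        menus = generate()  # all API queries happen before any file is touched
    except Exception as exc:
        print(f"Regeneration failed, menus unchanged: {exc}", file=sys.stderr)
        return False

    # Treat empty output as a failure rather than writing blank menus.
    if not menus or any(not text.strip() for text in menus.values()):
        print("Regeneration produced empty output, menus unchanged", file=sys.stderr)
        return False

    for name, text in menus.items():
        target = output_dir / name
        target.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary file in the same directory, then rename
        # over the target so a reader never sees a partial file.
        fd, tmp = tempfile.mkstemp(dir=target.parent, prefix=f".{target.name}.")
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        os.chmod(tmp, 0o644)  # mkstemp creates the file with mode 0600
        os.replace(tmp, target)
    return True
```

Each file is replaced atomically, but the set as a whole is not; an
interruption between renames is where "git reset --hard" comes in.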
> + pricing = {
> + # T3 family (burstable)
> + "t3.nano": {"on_demand": 0.0052},
> + "t3.micro": {"on_demand": 0.0104},
> + "t3.small": {"on_demand": 0.0208},
> + "t3.medium": {"on_demand": 0.0416},
> + "t3.large": {"on_demand": 0.0832},
> + "t3.xlarge": {"on_demand": 0.1664},
> + "t3.2xlarge": {"on_demand": 0.3328},
> + # T3a family (AMD)
> + "t3a.nano": {"on_demand": 0.0047},
> + "t3a.micro": {"on_demand": 0.0094},
> + "t3a.small": {"on_demand": 0.0188},
> + "t3a.medium": {"on_demand": 0.0376},
> + "t3a.large": {"on_demand": 0.0752},
> + "t3a.xlarge": {"on_demand": 0.1504},
> + "t3a.2xlarge": {"on_demand": 0.3008},
> + # M5 family (general purpose Intel)
> + "m5.large": {"on_demand": 0.096},
> + "m5.xlarge": {"on_demand": 0.192},
> + "m5.2xlarge": {"on_demand": 0.384},
> + "m5.4xlarge": {"on_demand": 0.768},
> + "m5.8xlarge": {"on_demand": 1.536},
> + "m5.12xlarge": {"on_demand": 2.304},
> + "m5.16xlarge": {"on_demand": 3.072},
> + "m5.24xlarge": {"on_demand": 4.608},
> + # M7a family (general purpose AMD)
> + "m7a.medium": {"on_demand": 0.0464},
> + "m7a.large": {"on_demand": 0.0928},
> + "m7a.xlarge": {"on_demand": 0.1856},
> + "m7a.2xlarge": {"on_demand": 0.3712},
> + "m7a.4xlarge": {"on_demand": 0.7424},
> + "m7a.8xlarge": {"on_demand": 1.4848},
> + "m7a.12xlarge": {"on_demand": 2.2272},
> + "m7a.16xlarge": {"on_demand": 2.9696},
> + "m7a.24xlarge": {"on_demand": 4.4544},
> + "m7a.32xlarge": {"on_demand": 5.9392},
> + "m7a.48xlarge": {"on_demand": 8.9088},
> + # C5 family (compute optimized)
> + "c5.large": {"on_demand": 0.085},
> + "c5.xlarge": {"on_demand": 0.17},
> + "c5.2xlarge": {"on_demand": 0.34},
> + "c5.4xlarge": {"on_demand": 0.68},
> + "c5.9xlarge": {"on_demand": 1.53},
> + "c5.12xlarge": {"on_demand": 2.04},
> + "c5.18xlarge": {"on_demand": 3.06},
> + "c5.24xlarge": {"on_demand": 4.08},
> + # C7a family (compute optimized AMD)
> + "c7a.medium": {"on_demand": 0.0387},
> + "c7a.large": {"on_demand": 0.0774},
> + "c7a.xlarge": {"on_demand": 0.1548},
> + "c7a.2xlarge": {"on_demand": 0.3096},
> + "c7a.4xlarge": {"on_demand": 0.6192},
> + "c7a.8xlarge": {"on_demand": 1.2384},
> + "c7a.12xlarge": {"on_demand": 1.8576},
> + "c7a.16xlarge": {"on_demand": 2.4768},
> + "c7a.24xlarge": {"on_demand": 3.7152},
> + "c7a.32xlarge": {"on_demand": 4.9536},
> + "c7a.48xlarge": {"on_demand": 7.4304},
> + # I4i family (storage optimized)
> + "i4i.large": {"on_demand": 0.117},
> + "i4i.xlarge": {"on_demand": 0.234},
> + "i4i.2xlarge": {"on_demand": 0.468},
> + "i4i.4xlarge": {"on_demand": 0.936},
> + "i4i.8xlarge": {"on_demand": 1.872},
> + "i4i.16xlarge": {"on_demand": 3.744},
> + "i4i.32xlarge": {"on_demand": 7.488},
> + }
> +
> + # Adjust pricing based on region (simplified)
> + # Some regions are more expensive than others
> + region_multipliers = {
> + "us-east-1": 1.0,
> + "us-east-2": 1.0,
> + "us-west-1": 1.08,
> + "us-west-2": 1.0,
> + "eu-west-1": 1.1,
> + "eu-central-1": 1.15,
> + "ap-southeast-1": 1.2,
> + "ap-northeast-1": 1.25,
> + }
> +
> + multiplier = region_multipliers.get(region, 1.1)
> + if multiplier != 1.0:
> + adjusted_pricing = {}
> + for instance_type, prices in pricing.items():
> + adjusted_pricing[instance_type] = {
> + "on_demand": prices["on_demand"] * multiplier
> + }
> + return adjusted_pricing
> +
> + return pricing
> +
> +
> +def sanitize_kconfig_name(name: str) -> str:
> + """Convert a name to a valid Kconfig symbol."""
> + # Replace special characters with underscores
> + name = name.replace("-", "_").replace(".", "_").replace(" ", "_")
> + # Convert to uppercase
> + name = name.upper()
> + # Remove any non-alphanumeric characters (except underscore)
> + name = "".join(c for c in name if c.isalnum() or c == "_")
> + # Ensure it doesn't start with a number
> + if name and name[0].isdigit():
> + name = "_" + name
> + return name
> +
> +
> +# Cache for instance families to avoid redundant API calls
> +_cached_families = None
> +
> +
> +def get_generated_instance_families() -> set:
> + """Get the set of instance families that will have generated Kconfig files."""
> + global _cached_families
> +
> + # Return cached result if available
> + if _cached_families is not None:
> + return _cached_families
> +
> + # Return all families - we'll generate Kconfig files for all of them
> + # This function will be called by the aws-cli tool to determine which files to generate
> + if not check_aws_cli():
> + # Return a minimal set if AWS CLI is not available
> + _cached_families = {"m5", "t3", "c5"}
> + return _cached_families
> +
> + # Get all available instance types
> + print(" Discovering available instance families...", file=sys.stderr)
> + instance_types = get_instance_types(fetch_all=True)
> +
> + # Extract unique families
> + families = set()
> + for instance_type in instance_types:
> + type_name = instance_type.get("InstanceType", "")
> + # Extract family prefix (e.g., "m5" from "m5.large")
> + if "." in type_name:
> + family = type_name.split(".")[0]
> + families.add(family)
> +
> + print(f" Found {len(families)} instance families", file=sys.stderr)
> + _cached_families = families
> + return families
> +
> +
> +def generate_instance_families_kconfig() -> str:
> + """Generate Kconfig content for AWS instance families."""
> + # Check if AWS CLI is available
> + if not check_aws_cli():
> + return generate_default_instance_families_kconfig()
> +
> + # Get all available instance types (with pagination)
> + instance_types = get_instance_types(fetch_all=True)
> +
> + # Extract unique families
> + families = set()
> + family_info = {}
> + for instance in instance_types:
> + instance_type = instance.get("InstanceType", "")
> + if "." in instance_type:
> + family = instance_type.split(".")[0]
> + families.add(family)
> + if family not in family_info:
> + family_info[family] = {
> + "architectures": set(),
> + "count": 0,
> + }
> + family_info[family]["count"] += 1
> + for arch in instance.get("ProcessorInfo", {}).get(
> + "SupportedArchitectures", []
> + ):
> + family_info[family]["architectures"].add(arch)
> +
> + if not families:
> + return generate_default_instance_families_kconfig()
> +
> + # Group families by category - use prefix patterns to catch all variants
> + def categorize_family(family_name):
> + """Categorize a family based on its prefix."""
> + if family_name.startswith(("m", "t")):
> + return "general_purpose"
> + elif family_name.startswith("c"):
> + return "compute_optimized"
> + elif family_name.startswith(("r", "x", "z")):
> + return "memory_optimized"
> + elif family_name.startswith(("i", "d", "h")):
> + return "storage_optimized"
> + elif family_name.startswith(("p", "g", "dl", "trn", "inf", "vt", "f")):
> + return "accelerated"
> + elif family_name.startswith(("mac", "hpc")):
> + return "specialized"
> + else:
> + return "other"
> +
> + # Organize families by category
> + categorized_families = {
> + "general_purpose": [],
> + "compute_optimized": [],
> + "memory_optimized": [],
> + "storage_optimized": [],
> + "accelerated": [],
> + "specialized": [],
> + "other": [],
> + }
> +
> + for family in sorted(families):
> + category = categorize_family(family)
> + categorized_families[category].append(family)
> +
> + kconfig = """# AWS instance families (dynamically generated)
> +# Generated by aws-cli from live AWS data
> +
> +choice
> + prompt "AWS instance family"
> + default TERRAFORM_AWS_INSTANCE_TYPE_M5
> + help
> + Select the AWS instance family for your deployment.
> + Different families are optimized for different workloads.
> +
> +"""
> +
> + # Category headers
> + category_headers = {
> + "general_purpose": "# General Purpose - balanced compute, memory, and networking\n",
> + "compute_optimized": "# Compute Optimized - ideal for CPU-intensive applications\n",
> + "memory_optimized": "# Memory Optimized - for memory-intensive applications\n",
> + "storage_optimized": "# Storage Optimized - for high sequential read/write workloads\n",
> + "accelerated": "# Accelerated Computing - GPU and other accelerators\n",
> + "specialized": "# Specialized - for specific use cases\n",
> + "other": "# Other instance families\n",
> + }
> +
> + # Add each category of families
> + for category in [
> + "general_purpose",
> + "compute_optimized",
> + "memory_optimized",
> + "storage_optimized",
> + "accelerated",
> + "specialized",
> + "other",
> + ]:
> + if categorized_families[category]:
> + kconfig += category_headers[category]
> + for family in categorized_families[category]:
> + kconfig += generate_family_config(family, family_info.get(family, {}))
> + if category != "other": # Don't add extra newline after the last category
> + kconfig += "\n"
> +
> + kconfig += "\nendchoice\n"
> +
> + # Add instance type source includes for each family
> + # Only include families that we actually generate files for
> + generated_families = get_generated_instance_families()
> + kconfig += "\n# Include instance-specific configurations\n"
> + for family in sorted(families):
> + # Only add source statement if we generate a file for this family
> + if family in generated_families:
> + safe_name = sanitize_kconfig_name(family)
> + kconfig += f"""if TERRAFORM_AWS_INSTANCE_TYPE_{safe_name}
> +source "terraform/aws/kconfigs/instance-types/Kconfig.{family}.generated"
> +endif
> +
> +"""
> +
> + # Add the TERRAFORM_AWS_INSTANCE_TYPE configuration that maps to the actual instance type
> + kconfig += """# Final instance type configuration
> +config TERRAFORM_AWS_INSTANCE_TYPE
> + string
> + output yaml
> +"""
> +
> + # Add default for each family that maps to its size variable
> + for family in sorted(families):
> + safe_name = sanitize_kconfig_name(family)
> + kconfig += f"\tdefault TERRAFORM_AWS_{safe_name}_SIZE if TERRAFORM_AWS_INSTANCE_TYPE_{safe_name}\n"
> +
> + # Add a final fallback default
> + kconfig += '\tdefault "t3.micro"\n\n'
> +
> + return kconfig
> +
> +
> +def generate_family_config(family: str, info: Dict) -> str:
> + """Generate Kconfig entry for an instance family."""
> + safe_name = sanitize_kconfig_name(family)
> +
> + # Determine architecture dependencies
> + architectures = info.get("architectures", set())
> + depends_line = ""
> + if architectures:
> + if "x86_64" in architectures and "arm64" not in architectures:
> + depends_line = "\n\tdepends on TARGET_ARCH_X86_64"
> + elif "arm64" in architectures and "x86_64" not in architectures:
> + depends_line = "\n\tdepends on TARGET_ARCH_ARM64"
> +
> + # Family descriptions
> + descriptions = {
> + "t3": "Burstable performance instances powered by Intel processors",
> + "t3a": "Burstable performance instances powered by AMD processors",
> + "m5": "General purpose instances powered by Intel Xeon Platinum processors",
> + "m7a": "Latest generation general purpose instances powered by AMD EPYC processors",
> + "c5": "Compute optimized instances powered by Intel Xeon Platinum processors",
> + "c7a": "Latest generation compute optimized instances powered by AMD EPYC processors",
> + "i4i": "Storage optimized instances with NVMe SSD storage",
> + "is4gen": "Storage optimized ARM instances powered by AWS Graviton2",
> + "im4gn": "Storage optimized ARM instances with NVMe storage",
> + "r5": "Memory optimized instances powered by Intel Xeon Platinum processors",
> + "p3": "GPU instances for machine learning and HPC",
> + "g4dn": "GPU instances for graphics-intensive applications",
> + }
> +
> + description = descriptions.get(family, f"AWS {family.upper()} instance family")
> + count = info.get("count", 0)
> +
> + config = f"""config TERRAFORM_AWS_INSTANCE_TYPE_{safe_name}
> +\tbool "{family.upper()}"
> +{depends_line}
> +\thelp
> +\t {description}
> +\t Available instance types: {count}
> +
> +"""
> + return config
> +
> +
> +def generate_default_instance_families_kconfig() -> str:
> + """Generate default Kconfig content when AWS CLI is not available."""
> + return """# AWS instance families (default - AWS CLI not available)
> +
> +choice
> + prompt "AWS instance family"
> + default TERRAFORM_AWS_INSTANCE_TYPE_M5
> + help
> + Select the AWS instance family for your deployment.
> + Note: AWS CLI is not available, showing default options.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_M5
> + bool "M5"
> + depends on TARGET_ARCH_X86_64
> + help
> + General purpose instances powered by Intel Xeon Platinum processors.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_M7A
> + bool "M7a"
> + depends on TARGET_ARCH_X86_64
> + help
> + Latest generation general purpose instances powered by AMD EPYC processors.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_T3
> + bool "T3"
> + depends on TARGET_ARCH_X86_64
> + help
> + Burstable performance instances powered by Intel processors.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_C5
> + bool "C5"
> + depends on TARGET_ARCH_X86_64
> + help
> + Compute optimized instances powered by Intel Xeon Platinum processors.
> +
> +config TERRAFORM_AWS_INSTANCE_TYPE_I4I
> + bool "I4i"
> + depends on TARGET_ARCH_X86_64
> + help
> + Storage optimized instances with NVMe SSD storage.
> +
> +endchoice
> +
> +# Include instance-specific configurations
> +if TERRAFORM_AWS_INSTANCE_TYPE_M5
> +source "terraform/aws/kconfigs/instance-types/Kconfig.m5"
> +endif
> +
> +if TERRAFORM_AWS_INSTANCE_TYPE_M7A
> +source "terraform/aws/kconfigs/instance-types/Kconfig.m7a"
> +endif
> +
> +if TERRAFORM_AWS_INSTANCE_TYPE_T3
> +source "terraform/aws/kconfigs/instance-types/Kconfig.t3.generated"
> +endif
> +
> +if TERRAFORM_AWS_INSTANCE_TYPE_C5
> +source "terraform/aws/kconfigs/instance-types/Kconfig.c5.generated"
> +endif
> +
> +if TERRAFORM_AWS_INSTANCE_TYPE_I4I
> +source "terraform/aws/kconfigs/instance-types/Kconfig.i4i"
> +endif
> +
> +# Final instance type configuration
> +config TERRAFORM_AWS_INSTANCE_TYPE
> + string
> + output yaml
> + default TERRAFORM_AWS_M5_SIZE if TERRAFORM_AWS_INSTANCE_TYPE_M5
> + default TERRAFORM_AWS_M7A_SIZE if TERRAFORM_AWS_INSTANCE_TYPE_M7A
> + default TERRAFORM_AWS_T3_SIZE if TERRAFORM_AWS_INSTANCE_TYPE_T3
> + default TERRAFORM_AWS_C5_SIZE if TERRAFORM_AWS_INSTANCE_TYPE_C5
> + default TERRAFORM_AWS_I4I_SIZE if TERRAFORM_AWS_INSTANCE_TYPE_I4I
> + default "t3.micro"
> +
> +"""
> +
> +
> +def generate_instance_types_kconfig(family: str) -> str:
> + """Generate Kconfig content for specific instance types within a family."""
> + if not check_aws_cli():
> + return ""
> +
> + instance_types = get_instance_types(family=family, fetch_all=True)
> + if not instance_types:
> + return ""
> +
> + # Filter to only exact family matches (e.g., c5a but not c5ad)
> + filtered_instances = []
> + for instance in instance_types:
> + instance_type = instance.get("InstanceType", "")
> + if "." in instance_type:
> + inst_family = instance_type.split(".")[0]
> + if inst_family == family:
> + filtered_instances.append(instance)
> +
> + instance_types = filtered_instances
> + if not instance_types:
> + return ""
> +
> + pricing = get_pricing_info()
> +
> + # Sort by vCPU count and memory
> + instance_types.sort(
> + key=lambda x: (
> + x.get("VCpuInfo", {}).get("DefaultVCpus", 0),
> + x.get("MemoryInfo", {}).get("SizeInMiB", 0),
> + )
> + )
> +
> + safe_family = sanitize_kconfig_name(family)
> +
> + # Get the first instance type to use as default
> + default_instance_name = f"{safe_family}_LARGE" # Fallback
> + if instance_types:
> + first_instance_type = instance_types[0].get("InstanceType", "")
> + if "." in first_instance_type:
> + first_full_name = first_instance_type.replace(".", "_")
> + default_instance_name = sanitize_kconfig_name(first_full_name)
> +
> + kconfig = f"""# AWS {family.upper()} instance sizes (dynamically generated)
> +
> +choice
> +\tprompt "Instance size for {family.upper()} family"
> +\tdefault TERRAFORM_AWS_INSTANCE_{default_instance_name}
> +\thelp
> +\t Select the specific instance size within the {family.upper()} family.
> +
> +"""
> +
> + seen_configs = set()
> + for instance in instance_types:
> + instance_type = instance.get("InstanceType", "")
> + if "." not in instance_type:
> + continue
> +
> + # Get the full instance type name to make unique config names
> + full_name = instance_type.replace(".", "_")
> + safe_full_name = sanitize_kconfig_name(full_name)
> +
> + # Skip if we've already seen this config name
> + if safe_full_name in seen_configs:
> + continue
> + seen_configs.add(safe_full_name)
> +
> + size = instance_type.split(".")[1]
> +
> + vcpus = instance.get("VCpuInfo", {}).get("DefaultVCpus", 0)
> + memory_mib = instance.get("MemoryInfo", {}).get("SizeInMiB", 0)
> + memory_gb = memory_mib / 1024
> +
> + # Get pricing
> + price = pricing.get(instance_type, {}).get("on_demand", 0.0)
> + price_str = f"${price:.3f}/hour" if price > 0 else "pricing varies"
> +
> + # Network performance
> + network = instance.get("NetworkInfo", {}).get("NetworkPerformance", "varies")
> +
> + # Storage
> + storage_info = ""
> + if instance.get("InstanceStorageSupported"):
> + storage = instance.get("InstanceStorageInfo", {})
> + total_size = storage.get("TotalSizeInGB", 0)
> + if total_size > 0:
> + storage_info = f"\n\t Instance storage: {total_size} GB"
> +
> + kconfig += f"""config TERRAFORM_AWS_INSTANCE_{safe_full_name}
> +\tbool "{instance_type}"
> +\thelp
> +\t vCPUs: {vcpus}
> +\t Memory: {memory_gb:.1f} GB
> +\t Network: {network}
> +\t Price: {price_str}{storage_info}
> +
> +"""
> +
> + kconfig += "endchoice\n"
> +
> + # Add the actual instance type string config with full instance names
> + kconfig += f"""
> +config TERRAFORM_AWS_{safe_family}_SIZE
> +\tstring
> +"""
> +
> + # Generate default mappings for each seen instance type
> + for instance in instance_types:
> + instance_type = instance.get("InstanceType", "")
> + if "." not in instance_type:
> + continue
> +
> + full_name = instance_type.replace(".", "_")
> + safe_full_name = sanitize_kconfig_name(full_name)
> +
> + kconfig += (
> + f'\tdefault "{instance_type}" if TERRAFORM_AWS_INSTANCE_{safe_full_name}\n'
> + )
> +
> + # Use the first instance type as the final fallback default
> + final_default = f"{family}.large"
> + if instance_types:
> + first_instance_type = instance_types[0].get("InstanceType", "")
> + if first_instance_type:
> + final_default = first_instance_type
> +
> + kconfig += f'\tdefault "{final_default}"\n\n'
> +
> + return kconfig
> +
> +
> +def generate_regions_kconfig() -> str:
> + """Generate Kconfig content for AWS regions."""
> + if not check_aws_cli():
> + return generate_default_regions_kconfig()
> +
> + regions = get_regions()
> + if not regions:
> + return generate_default_regions_kconfig()
> +
> + kconfig = """# AWS regions (dynamically generated)
> +
> +choice
> + prompt "AWS region"
> + default TERRAFORM_AWS_REGION_USEAST1
> + help
> + Select the AWS region for your deployment.
> + Note: Not all instance types are available in all regions.
> +
> +"""
> +
> + # Group regions by geographic area
> + us_regions = []
> + eu_regions = []
> + ap_regions = []
> + other_regions = []
> +
> + for region in regions:
> + region_name = region.get("RegionName", "")
> + if region_name.startswith("us-"):
> + us_regions.append(region)
> + elif region_name.startswith("eu-"):
> + eu_regions.append(region)
> + elif region_name.startswith("ap-"):
> + ap_regions.append(region)
> + else:
> + other_regions.append(region)
> +
> + # Add US regions
> + if us_regions:
> + kconfig += "# US Regions\n"
> + for region in sorted(us_regions, key=lambda x: x.get("RegionName", "")):
> + kconfig += generate_region_config(region)
> + kconfig += "\n"
> +
> + # Add EU regions
> + if eu_regions:
> + kconfig += "# Europe Regions\n"
> + for region in sorted(eu_regions, key=lambda x: x.get("RegionName", "")):
> + kconfig += generate_region_config(region)
> + kconfig += "\n"
> +
> + # Add Asia Pacific regions
> + if ap_regions:
> + kconfig += "# Asia Pacific Regions\n"
> + for region in sorted(ap_regions, key=lambda x: x.get("RegionName", "")):
> + kconfig += generate_region_config(region)
> + kconfig += "\n"
> +
> + # Add other regions
> + if other_regions:
> + kconfig += "# Other Regions\n"
> + for region in sorted(other_regions, key=lambda x: x.get("RegionName", "")):
> + kconfig += generate_region_config(region)
> +
> + kconfig += "\nendchoice\n"
> +
> + # Add the actual region string config
> + kconfig += """
> +config TERRAFORM_AWS_REGION
> + string
> +"""
> +
> + for region in regions:
> + region_name = region.get("RegionName", "")
> + safe_name = sanitize_kconfig_name(region_name)
> + kconfig += f'\tdefault "{region_name}" if TERRAFORM_AWS_REGION_{safe_name}\n'
> +
> + kconfig += '\tdefault "us-east-1"\n'
> +
> + return kconfig
> +
> +
> +def generate_region_config(region: Dict) -> str:
> + """Generate Kconfig entry for a region."""
> + region_name = region.get("RegionName", "")
> + safe_name = sanitize_kconfig_name(region_name)
> + opt_in_status = region.get("OptInStatus", "")
> +
> + # Region display names
> + display_names = {
> + "us-east-1": "US East (N. Virginia)",
> + "us-east-2": "US East (Ohio)",
> + "us-west-1": "US West (N. California)",
> + "us-west-2": "US West (Oregon)",
> + "eu-west-1": "Europe (Ireland)",
> + "eu-west-2": "Europe (London)",
> + "eu-west-3": "Europe (Paris)",
> + "eu-central-1": "Europe (Frankfurt)",
> + "eu-north-1": "Europe (Stockholm)",
> + "ap-southeast-1": "Asia Pacific (Singapore)",
> + "ap-southeast-2": "Asia Pacific (Sydney)",
> + "ap-northeast-1": "Asia Pacific (Tokyo)",
> + "ap-northeast-2": "Asia Pacific (Seoul)",
> + "ap-south-1": "Asia Pacific (Mumbai)",
> + "ca-central-1": "Canada (Central)",
> + "sa-east-1": "South America (São Paulo)",
> + }
> +
> + display_name = display_names.get(region_name, region_name.replace("-", " ").title())
> +
> + help_text = f"\t Region: {display_name}"
> + if opt_in_status and opt_in_status != "opt-in-not-required":
> + help_text += f"\n\t Status: {opt_in_status}"
> +
> + config = f"""config TERRAFORM_AWS_REGION_{safe_name}
> +\tbool "{display_name}"
> +\thelp
> +{help_text}
> +
> +"""
> + return config
> +
> +
> +def get_gpu_amis(region: str = None) -> List[Dict[str, Any]]:
> + """
> + Get available GPU-optimized AMIs including Deep Learning AMIs.
> +
> + Args:
> + region: AWS region
> +
> + Returns:
> + List of AMI information
> + """
> + # Query for Deep Learning AMIs from AWS
> + cmd = ["ec2", "describe-images"]
> + filters = [
> + "Name=owner-alias,Values=amazon",
> + "Name=name,Values=Deep Learning AMI GPU*",
> + "Name=state,Values=available",
> + "Name=architecture,Values=x86_64",
> + ]
> + cmd.append("--filters")
> + cmd.extend(filters)
> + cmd.extend(["--query", "Images[?contains(Name, '2024') || contains(Name, '2025')]"])
> +
> + response = run_aws_command(cmd, region=region)
> +
> + if response:
> + # Sort by creation date to get the most recent
> + response.sort(key=lambda x: x.get("CreationDate", ""), reverse=True)
> + return response[:10] # Return top 10 most recent
> + return []
> +
> +
> +def generate_gpu_amis_kconfig() -> str:
> + """Generate Kconfig content for GPU AMIs."""
> + # Check if AWS CLI is available
> + if not check_aws_cli():
> + return generate_default_gpu_amis_kconfig()
> +
> + # Get available GPU AMIs
> + amis = get_gpu_amis()
> +
> + if not amis:
> + return generate_default_gpu_amis_kconfig()
> +
> + kconfig = """# GPU-optimized AMIs (dynamically generated)
> +
> +# GPU AMI Override - only shown for GPU instances
> +config TERRAFORM_AWS_USE_GPU_AMI
> + bool "Use GPU-optimized AMI instead of standard distribution"
> + depends on TERRAFORM_AWS_IS_GPU_INSTANCE
> + output yaml
> + default n
> + help
> + Enable this to use a GPU-optimized AMI with pre-installed NVIDIA drivers,
> + CUDA, and ML frameworks instead of the standard distribution AMI.
> +
> + When disabled, the standard distribution AMI will be used and you'll need
> + to install GPU drivers manually.
> +
> +if TERRAFORM_AWS_USE_GPU_AMI
> +
> +choice
> + prompt "GPU-optimized AMI selection"
> + default TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> + depends on TERRAFORM_AWS_IS_GPU_INSTANCE
> + help
> + Select which GPU-optimized AMI to use for your GPU instance.
> +
> +config TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> + bool "AWS Deep Learning AMI (Ubuntu 22.04)"
> + help
> + AWS Deep Learning AMI with NVIDIA drivers, CUDA, cuDNN, and popular ML frameworks.
> + Optimized for machine learning workloads on GPU instances.
> + Includes: TensorFlow, PyTorch, MXNet, and Jupyter.
> +
> +config TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING_NVIDIA
> + bool "NVIDIA Deep Learning AMI"
> + help
> + NVIDIA optimized Deep Learning AMI with latest GPU drivers.
> + Includes NVIDIA GPU Cloud (NGC) containers and frameworks.
> +
> +config TERRAFORM_AWS_GPU_AMI_CUSTOM
> + bool "Custom GPU AMI"
> + help
> + Specify a custom AMI ID for GPU instances.
> +
> +endchoice
> +
> +if TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> +
> +config TERRAFORM_AWS_GPU_AMI_NAME
> + string
> + output yaml
> + default "Deep Learning AMI GPU TensorFlow*"
> + help
> + AMI name pattern for AWS Deep Learning AMI.
> +
> +config TERRAFORM_AWS_GPU_AMI_OWNER
> + string
> + output yaml
> + default "amazon"
> +
> +endif # TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> +
> +if TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING_NVIDIA
> +
> +config TERRAFORM_AWS_GPU_AMI_NAME
> + string
> + output yaml
> + default "NVIDIA Deep Learning AMI*"
> + help
> + AMI name pattern for NVIDIA Deep Learning AMI.
> +
> +config TERRAFORM_AWS_GPU_AMI_OWNER
> + string
> + output yaml
> + default "amazon"
> +
> +endif # TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING_NVIDIA
> +
> +if TERRAFORM_AWS_GPU_AMI_CUSTOM
> +
> +config TERRAFORM_AWS_GPU_AMI_ID
> + string "Custom GPU AMI ID"
> + output yaml
> + help
> + Specify the AMI ID for your custom GPU image.
> + Example: ami-0123456789abcdef0
> +
> +endif # TERRAFORM_AWS_GPU_AMI_CUSTOM
> +
> +endif # TERRAFORM_AWS_USE_GPU_AMI
> +
> +# GPU instance detection
> +config TERRAFORM_AWS_IS_GPU_INSTANCE
> + bool
> + output yaml
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6E
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5G
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4DN
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4AD
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5EN
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4D
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4DE
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3DN
> + default n
> + help
> + Automatically detected based on selected instance type.
> + This indicates whether the selected instance has GPU support.
> +
> +"""
> +
> + return kconfig
> +
> +
> +def generate_default_gpu_amis_kconfig() -> str:
> + """Generate default GPU AMI Kconfig when AWS CLI is not available."""
> + return """# GPU-optimized AMIs (default - AWS CLI not available)
> +
> +# GPU AMI Override - only shown for GPU instances
> +config TERRAFORM_AWS_USE_GPU_AMI
> + bool "Use GPU-optimized AMI instead of standard distribution"
> + depends on TERRAFORM_AWS_IS_GPU_INSTANCE
> + output yaml
> + default n
> + help
> + Enable this to use a GPU-optimized AMI with pre-installed NVIDIA drivers,
> + CUDA, and ML frameworks instead of the standard distribution AMI.
> + Note: AWS CLI is not available, showing default options.
> +
> +if TERRAFORM_AWS_USE_GPU_AMI
> +
> +choice
> + prompt "GPU-optimized AMI selection"
> + default TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> + depends on TERRAFORM_AWS_IS_GPU_INSTANCE
> + help
> + Select which GPU-optimized AMI to use for your GPU instance.
> +
> +config TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> + bool "AWS Deep Learning AMI (Ubuntu 22.04)"
> + help
> + Pre-configured with NVIDIA drivers, CUDA, and ML frameworks.
> +
> +config TERRAFORM_AWS_GPU_AMI_CUSTOM
> + bool "Custom GPU AMI"
> +
> +endchoice
> +
> +if TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> +
> +config TERRAFORM_AWS_GPU_AMI_NAME
> + string
> + output yaml
> + default "Deep Learning AMI GPU TensorFlow*"
> +
> +config TERRAFORM_AWS_GPU_AMI_OWNER
> + string
> + output yaml
> + default "amazon"
> +
> +endif # TERRAFORM_AWS_GPU_AMI_DEEP_LEARNING
> +
> +if TERRAFORM_AWS_GPU_AMI_CUSTOM
> +
> +config TERRAFORM_AWS_GPU_AMI_ID
> + string "Custom GPU AMI ID"
> + output yaml
> + help
> + Specify the AMI ID for your custom GPU image.
> +
> +endif # TERRAFORM_AWS_GPU_AMI_CUSTOM
> +
> +endif # TERRAFORM_AWS_USE_GPU_AMI
> +
> +# GPU instance detection (static)
> +config TERRAFORM_AWS_IS_GPU_INSTANCE
> + bool
> + output yaml
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6E
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G6
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G5G
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4DN
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_G4AD
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P5EN
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4D
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P4DE
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3
> + default y if TERRAFORM_AWS_INSTANCE_TYPE_P3DN
> + default n
> + help
> + Automatically detected based on selected instance type.
> + This indicates whether the selected instance has GPU support.
> +
> +"""
> +
> +
> +def generate_default_regions_kconfig() -> str:
> + """Generate default Kconfig content when AWS CLI is not available."""
> + return """# AWS regions (default - AWS CLI not available)
> +
> +choice
> + prompt "AWS region"
> + default TERRAFORM_AWS_REGION_USEAST1
> + help
> + Select the AWS region for your deployment.
> + Note: AWS CLI is not available, showing default options.
> +
> +# US Regions
> +config TERRAFORM_AWS_REGION_USEAST1
> + bool "US East (N. Virginia)"
> +
> +config TERRAFORM_AWS_REGION_USEAST2
> + bool "US East (Ohio)"
> +
> +config TERRAFORM_AWS_REGION_USWEST1
> + bool "US West (N. California)"
> +
> +config TERRAFORM_AWS_REGION_USWEST2
> + bool "US West (Oregon)"
> +
> +# Europe Regions
> +config TERRAFORM_AWS_REGION_EUWEST1
> + bool "Europe (Ireland)"
> +
> +config TERRAFORM_AWS_REGION_EUCENTRAL1
> + bool "Europe (Frankfurt)"
> +
> +# Asia Pacific Regions
> +config TERRAFORM_AWS_REGION_APSOUTHEAST1
> + bool "Asia Pacific (Singapore)"
> +
> +config TERRAFORM_AWS_REGION_APNORTHEAST1
> + bool "Asia Pacific (Tokyo)"
> +
> +endchoice
> +
> +config TERRAFORM_AWS_REGION
> + string
> + default "us-east-1" if TERRAFORM_AWS_REGION_USEAST1
> + default "us-east-2" if TERRAFORM_AWS_REGION_USEAST2
> + default "us-west-1" if TERRAFORM_AWS_REGION_USWEST1
> + default "us-west-2" if TERRAFORM_AWS_REGION_USWEST2
> + default "eu-west-1" if TERRAFORM_AWS_REGION_EUWEST1
> + default "eu-central-1" if TERRAFORM_AWS_REGION_EUCENTRAL1
> + default "ap-southeast-1" if TERRAFORM_AWS_REGION_APSOUTHEAST1
> + default "ap-northeast-1" if TERRAFORM_AWS_REGION_APNORTHEAST1
> + default "us-east-1"
> +
> +"""
> diff --git a/scripts/dynamic-cloud-kconfig.Makefile b/scripts/dynamic-cloud-kconfig.Makefile
> index e15651ab..fffa5446 100644
> --- a/scripts/dynamic-cloud-kconfig.Makefile
> +++ b/scripts/dynamic-cloud-kconfig.Makefile
> @@ -12,9 +12,24 @@ LAMBDALABS_KCONFIG_IMAGES := $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.generated
>
> LAMBDALABS_KCONFIGS := $(LAMBDALABS_KCONFIG_COMPUTE) $(LAMBDALABS_KCONFIG_LOCATION) $(LAMBDALABS_KCONFIG_IMAGES)
>
> +# AWS dynamic configuration
> +AWS_KCONFIG_DIR := terraform/aws/kconfigs
> +AWS_KCONFIG_COMPUTE := $(AWS_KCONFIG_DIR)/Kconfig.compute.generated
> +AWS_KCONFIG_LOCATION := $(AWS_KCONFIG_DIR)/Kconfig.location.generated
> +AWS_INSTANCE_TYPES_DIR := $(AWS_KCONFIG_DIR)/instance-types
> +
> +# List of AWS instance type family files that will be generated
> +AWS_INSTANCE_TYPE_FAMILIES := m5 m7a t3 t3a c5 c7a i4i is4gen im4gn
> +AWS_INSTANCE_TYPE_KCONFIGS := $(foreach family,$(AWS_INSTANCE_TYPE_FAMILIES),$(AWS_INSTANCE_TYPES_DIR)/Kconfig.$(family).generated)
> +
> +AWS_KCONFIGS := $(AWS_KCONFIG_COMPUTE) $(AWS_KCONFIG_LOCATION) $(AWS_INSTANCE_TYPE_KCONFIGS)
> +
> # Add Lambda Labs generated files to mrproper clean list
> KDEVOPS_MRPROPER += $(LAMBDALABS_KCONFIGS)
>
> +# Add AWS generated files to mrproper clean list
> +KDEVOPS_MRPROPER += $(AWS_KCONFIGS)
> +
> # Touch Lambda Labs generated files so Kconfig can source them
> # This ensures the files exist (even if empty) before Kconfig runs
> dynamic_lambdalabs_kconfig_touch:
> @@ -22,20 +37,55 @@ dynamic_lambdalabs_kconfig_touch:
>
> DYNAMIC_KCONFIG += dynamic_lambdalabs_kconfig_touch
>
> +# Touch AWS generated and static files so Kconfig can source them
> +# This ensures the files exist (even if empty) before Kconfig runs
> +dynamic_aws_kconfig_touch:
> + $(Q)mkdir -p $(AWS_INSTANCE_TYPES_DIR)
> + $(Q)touch $(AWS_KCONFIG_COMPUTE) $(AWS_KCONFIG_LOCATION)
> + $(Q)touch $(AWS_KCONFIG_DIR)/Kconfig.gpu-amis.generated
> + $(Q)touch $(AWS_KCONFIG_DIR)/Kconfig.compute.static
> + $(Q)touch $(AWS_KCONFIG_DIR)/Kconfig.location.static
> + $(Q)for family in $(AWS_INSTANCE_TYPE_FAMILIES); do \
> + touch $(AWS_INSTANCE_TYPES_DIR)/Kconfig.$$family.generated; \
> + touch $(AWS_INSTANCE_TYPES_DIR)/Kconfig.$$family.static; \
> + done
> + # Touch all existing generated files' static counterparts
> + $(Q)for file in $(AWS_INSTANCE_TYPES_DIR)/Kconfig.*.generated; do \
> + if [ -f "$$file" ]; then \
> + static_file=$$(echo "$$file" | sed 's/\.generated$$/\.static/'); \
> + touch "$$static_file"; \
> + fi; \
> + done
> + # Also touch G6E specifically since it's needed for GPU instances
> + $(Q)touch $(AWS_INSTANCE_TYPES_DIR)/Kconfig.g6e.static
> +
> +DYNAMIC_KCONFIG += dynamic_aws_kconfig_touch
> +
> # Individual Lambda Labs targets are now handled by generate_cloud_configs.py
> cloud-config-lambdalabs:
> $(Q)python3 scripts/generate_cloud_configs.py
>
> +# Individual AWS targets are now handled by generate_cloud_configs.py
> +cloud-config-aws:
> + $(Q)python3 scripts/generate_cloud_configs.py
> +
> # Clean Lambda Labs generated files
> clean-cloud-config-lambdalabs:
> $(Q)rm -f $(LAMBDALABS_KCONFIGS)
>
> -DYNAMIC_CLOUD_KCONFIG += cloud-config-lambdalabs
> +# Clean AWS generated files
> +clean-cloud-config-aws:
> + $(Q)rm -f $(AWS_KCONFIGS)
> + $(Q)rm -f .aws_cloud_config_generated
> +
> +DYNAMIC_CLOUD_KCONFIG += cloud-config-lambdalabs cloud-config-aws
>
> cloud-config-help:
> @echo "Cloud-specific dynamic kconfig targets:"
> @echo "cloud-config - generates all cloud provider dynamic kconfig content"
> @echo "cloud-config-lambdalabs - generates Lambda Labs dynamic kconfig content"
> + @echo "cloud-config-aws - generates AWS dynamic kconfig content"
> + @echo "cloud-update - converts generated cloud configs to static (for committing)"
> @echo "clean-cloud-config - removes all generated cloud kconfig files"
> @echo "cloud-list-all - list all cloud instances for configured provider"
>
> @@ -44,11 +94,50 @@ HELP_TARGETS += cloud-config-help
> cloud-config:
> $(Q)python3 scripts/generate_cloud_configs.py
>
> -clean-cloud-config: clean-cloud-config-lambdalabs
> +clean-cloud-config: clean-cloud-config-lambdalabs clean-cloud-config-aws
> + $(Q)rm -f .cloud.initialized
> $(Q)echo "Cleaned all cloud provider dynamic Kconfig files."
>
> cloud-list-all:
> $(Q)chmod +x scripts/cloud_list_all.sh
> $(Q)scripts/cloud_list_all.sh
>
> -PHONY += cloud-config cloud-config-lambdalabs clean-cloud-config clean-cloud-config-lambdalabs cloud-config-help cloud-list-all
> +# Convert dynamically generated cloud configs to static versions for git commits
> +# This allows admins to generate configs once and commit them for regular users
> +cloud-update:
> + @echo "Converting generated cloud configs to static versions..."
> + # AWS configs
> + $(Q)if [ -f $(AWS_KCONFIG_COMPUTE) ]; then \
> + cp $(AWS_KCONFIG_COMPUTE) $(AWS_KCONFIG_DIR)/Kconfig.compute.static; \
> + sed -i 's/Kconfig\.\([^.]*\)\.generated/Kconfig.\1.static/g' $(AWS_KCONFIG_DIR)/Kconfig.compute.static; \
> + echo " Created $(AWS_KCONFIG_DIR)/Kconfig.compute.static"; \
> + fi
> + $(Q)if [ -f $(AWS_KCONFIG_LOCATION) ]; then \
> + cp $(AWS_KCONFIG_LOCATION) $(AWS_KCONFIG_DIR)/Kconfig.location.static; \
> + sed -i 's/Kconfig\.\([^.]*\)\.generated/Kconfig.\1.static/g' $(AWS_KCONFIG_DIR)/Kconfig.location.static; \
> + echo " Created $(AWS_KCONFIG_DIR)/Kconfig.location.static"; \
> + fi
> + # AWS instance type families
> + $(Q)for file in $(AWS_INSTANCE_TYPES_DIR)/Kconfig.*.generated; do \
> + if [ -f "$$file" ]; then \
> + static_file=$$(echo "$$file" | sed 's/\.generated$$/\.static/'); \
> + cp "$$file" "$$static_file"; \
> + echo " Created $$static_file"; \
> + fi; \
> + done
> + # Lambda Labs configs
> + $(Q)if [ -f $(LAMBDALABS_KCONFIG_COMPUTE) ]; then \
> + cp $(LAMBDALABS_KCONFIG_COMPUTE) $(LAMBDALABS_KCONFIG_DIR)/Kconfig.compute.static; \
> + echo " Created $(LAMBDALABS_KCONFIG_DIR)/Kconfig.compute.static"; \
> + fi
> + $(Q)if [ -f $(LAMBDALABS_KCONFIG_LOCATION) ]; then \
> + cp $(LAMBDALABS_KCONFIG_LOCATION) $(LAMBDALABS_KCONFIG_DIR)/Kconfig.location.static; \
> + echo " Created $(LAMBDALABS_KCONFIG_DIR)/Kconfig.location.static"; \
> + fi
> + $(Q)if [ -f $(LAMBDALABS_KCONFIG_IMAGES) ]; then \
> + cp $(LAMBDALABS_KCONFIG_IMAGES) $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.static; \
> + echo " Created $(LAMBDALABS_KCONFIG_DIR)/Kconfig.images.static"; \
> + fi
> + @echo "Static cloud configs created. You can now commit these .static files to git."
> +
> +PHONY += cloud-config cloud-config-lambdalabs cloud-config-aws clean-cloud-config clean-cloud-config-lambdalabs clean-cloud-config-aws cloud-config-help cloud-list-all cloud-update
> diff --git a/scripts/generate_cloud_configs.py b/scripts/generate_cloud_configs.py
> index b16294dd..332cebe7 100755
> --- a/scripts/generate_cloud_configs.py
> +++ b/scripts/generate_cloud_configs.py
> @@ -10,6 +10,9 @@ import os
> import sys
> import subprocess
> import json
> +from concurrent.futures import ThreadPoolExecutor, as_completed
> +from pathlib import Path
> +from typing import Tuple
>
>
> def generate_lambdalabs_kconfig() -> bool:
> @@ -100,29 +103,194 @@ def get_lambdalabs_summary() -> tuple[bool, str]:
> return False, "Lambda Labs: Error querying API - using defaults"
>
>
> +def generate_aws_kconfig() -> bool:
> + """
> + Generate AWS Kconfig files.
> + Returns True on success, False on failure.
> + """
> + script_dir = os.path.dirname(os.path.abspath(__file__))
> + cli_path = os.path.join(script_dir, "aws-cli")
> +
> + # Generate the Kconfig files
> + result = subprocess.run(
> + [cli_path, "generate-kconfig"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + return result.returncode == 0
> +
> +
> +def get_aws_summary() -> tuple[bool, str]:
> + """
> + Get a summary of AWS configurations using aws-cli.
> + Returns (success, summary_string)
> + """
> + script_dir = os.path.dirname(os.path.abspath(__file__))
> + cli_path = os.path.join(script_dir, "aws-cli")
> +
> + try:
> + # Check if AWS CLI is available
> + result = subprocess.run(
> + ["aws", "--version"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + if result.returncode != 0:
> + return False, "AWS: AWS CLI not installed - using defaults"
> +
> + # Check if credentials are configured
> + result = subprocess.run(
> + ["aws", "sts", "get-caller-identity"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + if result.returncode != 0:
> + return False, "AWS: Credentials not configured - using defaults"
> +
> + # Get instance types count
> + result = subprocess.run(
> + [
> + cli_path,
> + "--output",
> + "json",
> + "instance-types",
> + "list",
> + "--max-results",
> + "100",
> + ],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + if result.returncode != 0:
> + return False, "AWS: Error querying API - using defaults"

It seems like the process should simply fail here. A fresh "git clone"
of kdevops should already provide reasonable defaults, and those would
restore you to a working configuration. If menu regeneration fails, why
not simply keep using what is already in place?

Again, it's quite possible that I've misread something.
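
To illustrate the "fail and keep what you have" behavior described above,
here is a minimal sketch. It is not taken from the patch: the helper name
regenerate_menu() is hypothetical, and it assumes the generator prints the
menu on stdout, which may not match how scripts/aws-cli actually writes
its files.

```python
import os
import subprocess
import tempfile


def regenerate_menu(cmd, dest):
    """Run cmd and install its stdout as dest only if cmd succeeds.

    Hypothetical helper: on failure, dest is left untouched and an
    exception is raised instead of reporting "using defaults".
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        # Fail loudly; whatever is already in place keeps working.
        raise RuntimeError(
            f"{cmd[0]} failed (exit {result.returncode}); keeping existing {dest}"
        )

    # Write to a temporary file in the same directory, then rename it
    # over dest, so a partial write never replaces a working menu.
    dest_dir = os.path.dirname(os.path.abspath(dest))
    fd, tmp = tempfile.mkstemp(dir=dest_dir, prefix=".kconfig-")
    with os.fdopen(fd, "w") as f:
        f.write(result.stdout)
    os.replace(tmp, dest)
```

With something along these lines, a failed query would surface as an error
from "make cloud-config", and the checked-in or previously generated
Kconfig files would remain as they were.
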
> +
> + instances = json.loads(result.stdout)
> + instance_count = len(instances)
> +
> + # Get regions
> + result = subprocess.run(
> + [cli_path, "--output", "json", "regions", "list"],
> + capture_output=True,
> + text=True,
> + check=False,
> + )
> +
> + if result.returncode == 0:
> + regions = json.loads(result.stdout)
> + region_count = len(regions)
> + else:
> + region_count = 0
> +
> + # Get price range from a sample of instances
> + prices = []
> + for instance in instances[:20]: # Sample first 20 for speed
> + if "error" not in instance:
> + # Extract price if available (would need pricing API)
> + # For now, we'll use placeholder
> + vcpus = instance.get("vcpu", 0)
> + if vcpus > 0:
> + # Rough estimate: $0.05 per vCPU/hour
> + estimated_price = vcpus * 0.05
> + prices.append(estimated_price)
> +
> + # Format summary
> + if prices:
> + min_price = min(prices)
> + max_price = max(prices)
> + price_range = f"~${min_price:.2f}-${max_price:.2f}/hr"
> + else:
> + price_range = "pricing varies by region"
> +
> + return (
> + True,
> + f"AWS: {instance_count} instance types available, "
> + f"{region_count} regions, {price_range}",
> + )
> +
> + except (subprocess.SubprocessError, json.JSONDecodeError, KeyError):
> + return False, "AWS: Error querying API - using defaults"
> +
> +
> +def process_lambdalabs() -> Tuple[bool, bool, str]:
> + """Process Lambda Labs configuration generation and summary.
> + Returns (kconfig_generated, summary_success, summary_text)
> + """
> + kconfig_generated = generate_lambdalabs_kconfig()
> + success, summary = get_lambdalabs_summary()
> + return kconfig_generated, success, summary
> +
> +
> +def process_aws() -> Tuple[bool, bool, str]:
> + """Process AWS configuration generation and summary.
> + Returns (kconfig_generated, summary_success, summary_text)
> + """
> + kconfig_generated = generate_aws_kconfig()
> + success, summary = get_aws_summary()
> +
> + # Create marker file to indicate dynamic AWS config is available
> + if kconfig_generated:
> + marker_file = Path(".aws_cloud_config_generated")
> + marker_file.touch()
> +
> + return kconfig_generated, success, summary
> +
> +
> def main():
> """Main function to generate cloud configurations."""
> print("Cloud Provider Configuration Summary")
> print("=" * 60)
> print()
>
> - # Lambda Labs - Generate Kconfig files first
> - kconfig_generated = generate_lambdalabs_kconfig()
> + # Run cloud provider operations in parallel
> + results = {}
> + any_success = False
>
> - # Lambda Labs - Get summary
> - success, summary = get_lambdalabs_summary()
> - if success:
> - print(f"✓ {summary}")
> - if kconfig_generated:
> - print(" Kconfig files generated successfully")
> - else:
> - print(" Warning: Failed to generate Kconfig files")
> - else:
> - print(f"⚠ {summary}")
> - print()
> + with ThreadPoolExecutor(max_workers=4) as executor:
> + # Submit all tasks
> + futures = {
> + executor.submit(process_lambdalabs): "lambdalabs",
> + executor.submit(process_aws): "aws",
> + }
> +
> + # Process results as they complete
> + for future in as_completed(futures):
> + provider = futures[future]
> + try:
> + results[provider] = future.result()
> + except Exception as e:
> + results[provider] = (
> + False,
> + False,
> + f"{provider.upper()}: Error - {str(e)}",
> + )
> +
> + # Display results in consistent order
> + for provider in ["lambdalabs", "aws"]:
> + if provider in results:
> + kconfig_gen, success, summary = results[provider]
> + if success and kconfig_gen:
> + any_success = True
> + if success:
> + print(f"✓ {summary}")
> + if kconfig_gen:
> + print(" Kconfig files generated successfully")
> + else:
> + print(" Warning: Failed to generate Kconfig files")
> + else:
> + print(f"⚠ {summary}")
> + print()
>
> - # AWS (placeholder - not implemented)
> - print("⚠ AWS: Dynamic configuration not yet implemented")
> + # Create .cloud.initialized if any provider succeeded
> + if any_success:
> + Path(".cloud.initialized").touch()
>
> # Azure (placeholder - not implemented)
> print("⚠ Azure: Dynamic configuration not yet implemented")
> diff --git a/terraform/aws/kconfigs/Kconfig.compute b/terraform/aws/kconfigs/Kconfig.compute
> index bae0ea1c..12083d1a 100644
> --- a/terraform/aws/kconfigs/Kconfig.compute
> +++ b/terraform/aws/kconfigs/Kconfig.compute
> @@ -1,94 +1,54 @@
> -choice
> - prompt "AWS instance types"
> - help
> - Instance types comprise varying combinations of hardware
> - platform, CPU count, memory size, storage, and networking
> - capacity. Select the type that provides an appropriate mix
> - of resources for your preferred workflows.
> -
> - Some instance types are region- and capacity-limited.
> -
> - See https://aws.amazon.com/ec2/instance-types/ for
> - details.
> -
> -config TERRAFORM_AWS_INSTANCE_TYPE_M5
> - bool "M5"
> - depends on TARGET_ARCH_X86_64
> - help
> - This is a general purpose type powered by Intel Xeon®
> - Platinum 8175M or 8259CL processors (Skylake or Cascade
> - Lake).
> -
> - See https://aws.amazon.com/ec2/instance-types/m5/ for
> - details.
> +# AWS compute configuration
>
> -config TERRAFORM_AWS_INSTANCE_TYPE_M7A
> - bool "M7a"
> - depends on TARGET_ARCH_X86_64
> +config TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> + bool "Use dynamically generated instance types"
> + default $(shell, test -f .aws_cloud_config_generated && echo y || echo n)
> help
> - This is a general purpose type powered by 4th Generation
> - AMD EPYC processors.
> + Enable this to use dynamically generated instance types from AWS CLI.
> + Run 'make cloud-config' to query AWS and generate available options.
> + When disabled, uses static predefined instance types.
>
> - See https://aws.amazon.com/ec2/instance-types/m7a/ for
> - details.
> + This is automatically enabled when you run 'make cloud-config'.
>
> -config TERRAFORM_AWS_INSTANCE_TYPE_I4I
> - bool "I4i"
> - depends on TARGET_ARCH_X86_64
> - help
> - This is a storage-optimized type powered by 3rd generation
> - Intel Xeon Scalable processors (Ice Lake) and use AWS Nitro
> - NVMe SSDs.
> -
> - See https://aws.amazon.com/ec2/instance-types/i4i/ for
> - details.
> -
> -config TERRAFORM_AWS_INSTANCE_TYPE_IS4GEN
> - bool "Is4gen"
> - depends on TARGET_ARCH_ARM64
> - help
> - This is a Storage-optimized type powered by AWS Graviton2
> - processors.
> +if TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> +# Include cloud-generated or static instance families
> +# Try static first (pre-generated by admins for faster loading)
> +# Fall back to generated files (requires AWS CLI)
> +source "terraform/aws/kconfigs/Kconfig.compute.static"
> +endif
>
> - See https://aws.amazon.com/ec2/instance-types/i4g/ for
> - details.
> -
> -config TERRAFORM_AWS_INSTANCE_TYPE_IM4GN
> - bool "Im4gn"
> - depends on TARGET_ARCH_ARM64
> +if !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> +# Static instance types when not using dynamic config
> +choice
> + prompt "AWS instance types"
> help
> - This is a storage-optimized type powered by AWS Graviton2
> - processors.
> + Instance types comprise varying combinations of hardware
> + platform, CPU count, memory size, storage, and networking
> + capacity. Select the type that provides an appropriate mix
> + of resources for your preferred workflows.
>
> - See https://aws.amazon.com/ec2/instance-types/i4g/ for
> - details.
> + Some instance types are region- and capacity-limited.
>
> -config TERRAFORM_AWS_INSTANCE_TYPE_C7A
> - depends on TARGET_ARCH_X86_64
> - bool "c7a"
> - help
> - This is a compute-optimized type powered by 4th generation
> - AMD EPYC processors.
> + See https://aws.amazon.com/ec2/instance-types/ for
> + details.
>
> - See https://aws.amazon.com/ec2/instance-types/c7a/ for
> - details.
>
> endchoice
> +endif # !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
>
> +if !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
> +# Use static instance type definitions when not using dynamic config
> source "terraform/aws/kconfigs/instance-types/Kconfig.m5"
> source "terraform/aws/kconfigs/instance-types/Kconfig.m7a"
> -source "terraform/aws/kconfigs/instance-types/Kconfig.i4i"
> -source "terraform/aws/kconfigs/instance-types/Kconfig.is4gen"
> -source "terraform/aws/kconfigs/instance-types/Kconfig.im4gn"
> -source "terraform/aws/kconfigs/instance-types/Kconfig.c7a"
> +endif # !TERRAFORM_AWS_USE_DYNAMIC_CONFIG
>
> choice
> prompt "Linux distribution"
> default TERRAFORM_AWS_DISTRO_DEBIAN
> help
> - Select a popular Linux distribution to install on your
> - instances, or use the "Custom AMI image" selection to
> - choose an image that is off the beaten path.
> + Select a popular Linux distribution to install on your
> + instances, or use the "Custom AMI image" selection to
> + choose an image that is off the beaten path.
>
> config TERRAFORM_AWS_DISTRO_AMAZON
> bool "Amazon Linux"
--
Chuck Lever