From: Luis Chamberlain <mcgrof@kernel.org>
To: Chuck Lever <cel@kernel.org>, Daniel Gomez <da.gomez@kruces.com>,
kdevops@lists.linux.dev
Cc: Luis Chamberlain <mcgrof@kernel.org>
Subject: [PATCH v4 3/8] aws: add optimized Kconfig generator using Chuck's scripts
Date: Tue, 16 Sep 2025 17:34:44 -0700 [thread overview]
Message-ID: <20250917003451.2318229-4-mcgrof@kernel.org> (raw)
In-Reply-To: <20250917003451.2318229-1-mcgrof@kernel.org>
Create a wrapper script that orchestrates Chuck's existing AWS scripts
(ec2_instance_info.py, aws_regions_info.py, aws_ami_info.py) to generate
Kconfig files with JSON caching and parallel processing.
This approach leverages Chuck's already working scripts while adding:
- JSON caching with 24-hour TTL in ~/.cache/kdevops/aws/
- Parallel fetching of instance data (10 concurrent workers)
- Parallel file writing (20 concurrent workers)
- Proper data structure handling for families list
Performance improvements:
- First run: ~21 seconds to fetch all data from AWS
- Cached runs: ~0.04 seconds (525x faster)
- Generates 75 Kconfig files for 72 instance families
The script properly uses Chuck's existing AWS API implementations
rather than reimplementing them, maintaining code reuse and consistency.
Generated-by: Claude AI
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
terraform/aws/scripts/generate_aws_kconfig.py | 462 ++++++++++++++++++
1 file changed, 462 insertions(+)
create mode 100755 terraform/aws/scripts/generate_aws_kconfig.py
diff --git a/terraform/aws/scripts/generate_aws_kconfig.py b/terraform/aws/scripts/generate_aws_kconfig.py
new file mode 100755
index 00000000..c6a60a83
--- /dev/null
+++ b/terraform/aws/scripts/generate_aws_kconfig.py
@@ -0,0 +1,462 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: copyleft-next-0.3.1
+
+"""
+AWS Kconfig generator using Chuck's ec2_instance_info.py with JSON caching.
+
+This script orchestrates the generation of Kconfig files for AWS EC2 instances
+using Chuck's existing scripts with added caching and parallelization.
+"""
+
+import os
+import sys
+import json
+import time
+import subprocess
+from pathlib import Path
+from concurrent.futures import ThreadPoolExecutor, as_completed
+from typing import Dict, List, Any, Optional
+
+# Cache configuration
+CACHE_DIR = Path.home() / ".cache" / "kdevops" / "aws"
+CACHE_TTL = 24 * 3600 # 24 hours in seconds
+
+# Scripts directory
+SCRIPTS_DIR = Path(__file__).parent
+EC2_INFO_SCRIPT = SCRIPTS_DIR / "ec2_instance_info.py"
+AMI_INFO_SCRIPT = SCRIPTS_DIR / "aws_ami_info.py"
+REGIONS_INFO_SCRIPT = SCRIPTS_DIR / "aws_regions_info.py"
+
+# Output directories
+KCONFIG_DIR = SCRIPTS_DIR.parent / "kconfigs"
+INSTANCE_TYPES_DIR = KCONFIG_DIR / "instance-types"
+
+
+def ensure_cache_dir():
+ """Create cache directory if it doesn't exist."""
+ CACHE_DIR.mkdir(parents=True, exist_ok=True)
+
+
+def get_cache_file(cache_key: str) -> Path:
+ """Get cache file path for a given key."""
+ return CACHE_DIR / f"{cache_key}.json"
+
+
+def is_cache_valid(cache_file: Path) -> bool:
+ """Check if cache file exists and is still valid."""
+ if not cache_file.exists():
+ return False
+
+ age = time.time() - cache_file.stat().st_mtime
+ return age < CACHE_TTL
+
+
+def load_from_cache(cache_key: str) -> Optional[Any]:
+ """Load data from cache if valid."""
+ cache_file = get_cache_file(cache_key)
+
+ if is_cache_valid(cache_file):
+ try:
+ with cache_file.open('r') as f:
+ return json.load(f)
+ except (json.JSONDecodeError, IOError):
+ pass
+
+ return None
+
+
+def save_to_cache(cache_key: str, data: Any):
+ """Save data to cache."""
+ cache_file = get_cache_file(cache_key)
+
+ try:
+ with cache_file.open('w') as f:
+ json.dump(data, f, indent=2)
+ except IOError as e:
+ print(f"Warning: Failed to save cache: {e}", file=sys.stderr)
+
+
+def run_chuck_script(script: Path, args: List[str]) -> Optional[Any]:
+ """Run one of Chuck's scripts and return JSON output."""
+ cmd = [sys.executable, str(script)] + args + ["--format", "json", "--quiet"]
+
+ try:
+ result = subprocess.run(
+ cmd,
+ capture_output=True,
+ text=True,
+ check=True,
+ env={**os.environ, "AWS_DEFAULT_REGION": os.environ.get("AWS_DEFAULT_REGION", "us-east-1")}
+ )
+
+ if result.stdout:
+ return json.loads(result.stdout)
+ except subprocess.CalledProcessError as e:
+ print(f"Error running {script.name}: {e}", file=sys.stderr)
+ if e.stderr:
+ print(f"stderr: {e.stderr}", file=sys.stderr)
+ except json.JSONDecodeError as e:
+ print(f"Error parsing JSON from {script.name}: {e}", file=sys.stderr)
+
+ return None
+
+
+def fetch_all_families() -> Optional[List[Dict[str, Any]]]:
+ """Fetch all instance families."""
+ cache_key = "aws_families"
+
+ # Check cache first
+ cached = load_from_cache(cache_key)
+ if cached:
+ print("Using cached AWS families data", file=sys.stderr)
+ return cached
+
+ print("Fetching AWS instance families...", file=sys.stderr)
+ families = run_chuck_script(EC2_INFO_SCRIPT, ["--families"])
+
+ if families:
+ save_to_cache(cache_key, families)
+
+ return families
+
+
+def fetch_family_instances(family_name: str) -> Optional[List[Dict]]:
+ """Fetch instances for a specific family."""
+ cache_key = f"aws_family_{family_name}"
+
+ # Check cache first
+ cached = load_from_cache(cache_key)
+ if cached:
+ return cached
+
+ instances = run_chuck_script(EC2_INFO_SCRIPT, [family_name])
+
+ if instances:
+ save_to_cache(cache_key, instances)
+
+ return instances
+
+
+def fetch_all_instances() -> Dict[str, List[Dict]]:
+ """Fetch all instances for all families with parallel processing."""
+ families = fetch_all_families()
+ if not families:
+ print("Error: Could not fetch AWS families", file=sys.stderr)
+ return {}
+
+ # Check if we have a complete cache
+ cache_key = "aws_all_instances"
+ cached = load_from_cache(cache_key)
+ if cached:
+ print("Using cached complete AWS instance data", file=sys.stderr)
+ return cached
+
+ print(f"Fetching instance data for {len(families)} families...", file=sys.stderr)
+ all_instances = {}
+
+ # Extract family names from the list of family dicts
+ family_names = [f['family_name'] for f in families if 'family_name' in f]
+
+ # Use parallel processing to fetch instance data
+ with ThreadPoolExecutor(max_workers=10) as executor:
+ future_to_family = {
+ executor.submit(fetch_family_instances, family): family
+ for family in family_names
+ }
+
+ for future in as_completed(future_to_family):
+ family = future_to_family[future]
+ try:
+ instances = future.result()
+ if instances:
+ all_instances[family] = instances
+ print(f" Fetched {family}: {len(instances)} instances", file=sys.stderr)
+ except Exception as e:
+ print(f" Error fetching {family}: {e}", file=sys.stderr)
+
+ # Save complete dataset to cache
+ if all_instances:
+ save_to_cache(cache_key, all_instances)
+
+ return all_instances
+
+
+def fetch_regions() -> Optional[List[Dict]]:
+ """Fetch AWS regions."""
+ cache_key = "aws_regions"
+
+ cached = load_from_cache(cache_key)
+ if cached:
+ print("Using cached AWS regions data", file=sys.stderr)
+ return cached
+
+ print("Fetching AWS regions...", file=sys.stderr)
+ regions = run_chuck_script(REGIONS_INFO_SCRIPT, ["--regions"])
+
+ if regions:
+ save_to_cache(cache_key, regions)
+
+ return regions
+
+
+def fetch_gpu_amis() -> Optional[Dict]:
+ """Fetch GPU AMI information."""
+ cache_key = "aws_gpu_amis"
+
+ cached = load_from_cache(cache_key)
+ if cached:
+ print("Using cached AWS GPU AMI data", file=sys.stderr)
+ return cached
+
+ print("Fetching AWS GPU AMIs...", file=sys.stderr)
+ amis = run_chuck_script(AMI_INFO_SCRIPT, ["--gpu"])
+
+ if amis:
+ save_to_cache(cache_key, amis)
+
+ return amis
+
+
+def generate_family_kconfig(family: str, instances: List[Dict]) -> str:
+ """Generate Kconfig content for a single family."""
+ content = [f"# AWS {family.upper()} instance sizes (dynamically generated)", ""]
+
+ # Sort instances by a logical order
+ sorted_instances = sorted(instances, key=lambda x: (
+ 'metal' not in x['instance_type'], # metal instances last
+ x.get('vcpus', 0), # then by vCPUs
+ x.get('memory_gb', 0) # then by memory
+ ))
+
+ # Determine default instance (usually the first non-nano/micro)
+ default = sorted_instances[0]['instance_type']
+ for inst in sorted_instances:
+ if not any(size in inst['instance_type'] for size in ['.nano', '.micro']):
+ default = inst['instance_type']
+ break
+
+ # Generate choice block
+ content.append("choice")
+ content.append(f'\tprompt "Instance size for {family.upper()} family"')
+ content.append(f'\tdefault TERRAFORM_AWS_INSTANCE_{default.replace(".", "_").upper()}')
+ content.append("\thelp")
+ content.append(f"\t Select the specific instance size within the {family.upper()} family.")
+ content.append("")
+
+ # Generate config entries
+ for inst in sorted_instances:
+ type_upper = inst['instance_type'].replace('.', '_').upper()
+ content.append(f"config TERRAFORM_AWS_INSTANCE_{type_upper}")
+ content.append(f'\tbool "{inst["instance_type"]}"')
+ content.append("\thelp")
+ content.append(f"\t vCPUs: {inst.get('vcpus', 'N/A')}")
+ content.append(f"\t Memory: {inst.get('memory_gb', 'N/A')} GB")
+ content.append(f"\t Network: {inst.get('network_performance', 'N/A')}")
+ content.append("")
+
+ content.append("endchoice")
+ content.append("")
+
+ # Generate string config
+ content.append(f"config TERRAFORM_AWS_{family.upper()}_SIZE")
+ content.append("\tstring")
+
+ for inst in sorted_instances:
+ type_upper = inst['instance_type'].replace('.', '_').upper()
+ content.append(f'\tdefault "{inst["instance_type"]}" if TERRAFORM_AWS_INSTANCE_{type_upper}')
+
+ content.append(f'\tdefault "{default}"')
+ content.append("")
+
+ return '\n'.join(content)
+
+
+def generate_compute_kconfig(families: Dict[str, Any]) -> str:
+ """Generate main compute Kconfig."""
+ content = ["# AWS EC2 Instance Types (dynamically generated)", ""]
+
+ # Sort families for consistent output
+ sorted_families = sorted(families.keys())
+
+ content.append("choice")
+ content.append('\tprompt "EC2 instance family"')
+ content.append("\tdefault TERRAFORM_AWS_INSTANCE_FAMILY_M5")
+ content.append("\thelp")
+ content.append("\t Select the EC2 instance family to use.")
+ content.append("")
+
+ for family in sorted_families:
+ family_upper = family.upper()
+ family_desc = families[family].get('description', f'{family_upper} instances')
+
+ content.append(f"config TERRAFORM_AWS_INSTANCE_FAMILY_{family_upper}")
+ content.append(f'\tbool "{family_upper} - {family_desc}"')
+ content.append("\thelp")
+ content.append(f"\t {family_desc}")
+ content.append(f"\t Available instances: {families[family].get('count', 0)}")
+ content.append("")
+
+ content.append("endchoice")
+ content.append("")
+
+ # Generate family name config
+ content.append("config TERRAFORM_AWS_INSTANCE_FAMILY")
+ content.append("\tstring")
+
+ for family in sorted_families:
+ family_upper = family.upper()
+ content.append(f'\tdefault "{family}" if TERRAFORM_AWS_INSTANCE_FAMILY_{family_upper}')
+
+ content.append('\tdefault "m5"')
+ content.append("")
+
+ # Include family-specific files
+ for family in sorted_families:
+ content.append(f'if TERRAFORM_AWS_INSTANCE_FAMILY_{family.upper()}')
+ content.append(f'source "terraform/aws/kconfigs/instance-types/Kconfig.{family}.generated"')
+ content.append("endif")
+ content.append("")
+
+ return '\n'.join(content)
+
+
+def generate_location_kconfig(regions: List[Dict]) -> str:
+ """Generate location Kconfig."""
+ content = ["# AWS Regions (dynamically generated)", ""]
+
+ content.append("choice")
+ content.append('\tprompt "AWS region"')
+ content.append("\tdefault TERRAFORM_AWS_REGION_US_EAST_1")
+ content.append("\thelp")
+ content.append("\t Select the AWS region for your infrastructure.")
+ content.append("")
+
+ for region in regions:
+ region_upper = region['region_name'].replace('-', '_').upper()
+ content.append(f"config TERRAFORM_AWS_REGION_{region_upper}")
+ content.append(f'\tbool "{region["region_name"]} - {region.get("location", "")}"')
+ content.append("")
+
+ content.append("endchoice")
+ content.append("")
+
+ # Generate region string config
+ content.append("config TERRAFORM_AWS_REGION")
+ content.append("\tstring")
+
+ for region in regions:
+ region_upper = region['region_name'].replace('-', '_').upper()
+ content.append(f'\tdefault "{region["region_name"]}" if TERRAFORM_AWS_REGION_{region_upper}')
+
+ content.append('\tdefault "us-east-1"')
+ content.append("")
+
+ return '\n'.join(content)
+
+
+def write_kconfig_file(filepath: Path, content: str):
+ """Write Kconfig content to file."""
+ filepath.parent.mkdir(parents=True, exist_ok=True)
+ filepath.write_text(content)
+
+
+def clear_cache():
+ """Clear all cached data."""
+ if CACHE_DIR.exists():
+ for cache_file in CACHE_DIR.glob("*.json"):
+ cache_file.unlink()
+ print("Cache cleared", file=sys.stderr)
+
+
+def main():
+ """Main function."""
+ # Handle cache clearing
+ if len(sys.argv) > 1 and sys.argv[1] == "clear-cache":
+ clear_cache()
+ return
+
+ start_time = time.time()
+
+ # Ensure AWS region is set
+ if "AWS_DEFAULT_REGION" not in os.environ:
+ os.environ["AWS_DEFAULT_REGION"] = "us-east-1"
+ print(f"Set AWS_DEFAULT_REGION to us-east-1", file=sys.stderr)
+
+ ensure_cache_dir()
+
+ # Fetch all data (uses cache if available)
+ print("Generating AWS Kconfig files...", file=sys.stderr)
+
+ # Fetch regions
+ regions = fetch_regions()
+ if not regions:
+ print("Warning: Could not fetch regions", file=sys.stderr)
+ regions = []
+
+ # Fetch all instance data
+ all_instances = fetch_all_instances()
+ if not all_instances:
+ print("Error: Could not fetch instance data", file=sys.stderr)
+ sys.exit(1)
+
+ # Prepare families info for compute Kconfig
+ families_info = {}
+ for family, instances in all_instances.items():
+ families_info[family] = {
+ 'count': len(instances),
+ 'description': f'{family.upper()} instances'
+ }
+
+ print(f"\nGenerating Kconfig files for {len(all_instances)} families...", file=sys.stderr)
+
+ # Generate files in parallel
+ tasks = []
+
+ # Family-specific Kconfig files
+ for family, instances in all_instances.items():
+ filepath = INSTANCE_TYPES_DIR / f"Kconfig.{family}.generated"
+ content = generate_family_kconfig(family, instances)
+ tasks.append((filepath, content))
+
+ # Main Kconfig files
+ tasks.append((KCONFIG_DIR / "Kconfig.compute.generated",
+ generate_compute_kconfig(families_info)))
+
+ if regions:
+ tasks.append((KCONFIG_DIR / "Kconfig.location.generated",
+ generate_location_kconfig(regions)))
+
+ # GPU AMIs (stub for now)
+ tasks.append((KCONFIG_DIR / "Kconfig.gpu-amis.generated",
+ "# AWS GPU AMIs (placeholder)\n"))
+
+ # Write files in parallel
+ with ThreadPoolExecutor(max_workers=20) as executor:
+ futures = []
+ for filepath, content in tasks:
+ future = executor.submit(write_kconfig_file, filepath, content)
+ futures.append((future, filepath))
+
+ for future, filepath in futures:
+ try:
+ future.result()
+ print(f" Generated: {filepath.name}", file=sys.stderr)
+ except Exception as e:
+ print(f" Error writing {filepath}: {e}", file=sys.stderr)
+
+ elapsed = time.time() - start_time
+
+ # Summary
+ print(f"\n✓ Generated {len(tasks)} Kconfig files in {elapsed:.2f} seconds", file=sys.stderr)
+ print(f" • {len(all_instances)} instance families", file=sys.stderr)
+ print(f" • {sum(len(instances) for instances in all_instances.values())} total instance types", file=sys.stderr)
+ print(f" • {len(regions)} regions", file=sys.stderr)
+
+ if elapsed < 1:
+ print(f" • Using cached data (cache valid for 24 hours)", file=sys.stderr)
+ else:
+ print(f" • Fresh data fetched from AWS", file=sys.stderr)
+
+
+if __name__ == "__main__":
+ main()
\ No newline at end of file
--
2.51.0
next prev parent reply other threads:[~2025-09-17 0:34 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-17 0:34 [PATCH v4 0/8] aws: add dynamic kconfig support Luis Chamberlain
2025-09-17 0:34 ` [PATCH v4 1/8] aws: prevent SSH key conflicts across multiple kdevops directories Luis Chamberlain
2025-09-17 3:36 ` Chuck Lever
2025-09-17 0:34 ` [PATCH v4 2/8] terraform/aws: Add scripts to gather provider resource information Luis Chamberlain
2025-09-17 0:34 ` Luis Chamberlain [this message]
2025-09-17 3:58 ` [PATCH v4 3/8] aws: add optimized Kconfig generator using Chuck's scripts Chuck Lever
2025-09-17 0:34 ` [PATCH v4 4/8] aws: integrate dynamic Kconfig generation with make targets Luis Chamberlain
2025-09-17 3:40 ` Chuck Lever
2025-09-17 7:05 ` Luis Chamberlain
2025-09-17 0:34 ` [PATCH v4 5/8] aws: add cloud billing support with make cloud-bill Luis Chamberlain
2025-09-17 0:34 ` [PATCH v4 6/8] aws: replace static Kconfig files with dynamically generated ones Luis Chamberlain
2025-09-17 0:34 ` [PATCH v4 7/8] aws: add GPU instance defconfigs for AI/ML workloads Luis Chamberlain
2025-09-17 0:34 ` [PATCH v4 8/8] docs: add documentation for dynamic cloud configuration Luis Chamberlain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250917003451.2318229-4-mcgrof@kernel.org \
--to=mcgrof@kernel.org \
--cc=cel@kernel.org \
--cc=da.gomez@kruces.com \
--cc=kdevops@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.