public inbox for kdevops@lists.linux.dev
* [PATCH 0/8] neoclouds: add new datacrunch / verda support
@ 2025-12-06 16:56 Luis Chamberlain
From: Luis Chamberlain @ 2025-12-06 16:56 UTC
  To: Chuck Lever, Daniel Gomez, kdevops; +Cc: Luis Chamberlain

I've been exploring neocloud solutions for GPU usage, as they are
dirt cheap compared to the larger cloud providers. They are good enough
and simple enough to get started with GPU workflows. This adds yet
another neocloud, DataCrunch, which has now been spun off as Verda.

I've taken the old Terraform provider, modified it to work with the
latest cloud solution, and forked it under kdevops. I registered the
provider under OpenTofu as Terraform doesn't let you publish new
providers.

I've been using this for about a month now without issues.

Luis Chamberlain (8):
  terraform: Use directory checksum in SSH key filenames
  devconfig: Add tmux.conf copying to target systems
  terraform: Enable fact gathering for localhost
  terraform: Add DataCrunch GPU cloud provider integration
  kconfig: Add support for merging defconfig fragments
  terraform: Add tier-based GPU selection for Lambda Labs
  terraform: Document tier-based GPU selection for Lambda Labs
  docs: Organize cloud providers with Neoclouds section

 .gitignore                                    |   1 +
 Makefile                                      |   9 +
 README.md                                     |   2 +-
 defconfigs/datacrunch-4x-b200                 |  11 +
 defconfigs/datacrunch-4x-b300                 |  11 +
 defconfigs/datacrunch-4x-h100-pytorch         |  11 +
 defconfigs/datacrunch-a100                    |  11 +
 defconfigs/datacrunch-a100-40-or-less         |  13 +
 defconfigs/datacrunch-a100-80-or-less         |  13 +
 defconfigs/datacrunch-b200-or-less            |  13 +
 defconfigs/datacrunch-b300                    |  11 +
 defconfigs/datacrunch-b300-or-less            |  13 +
 defconfigs/datacrunch-h100-pytorch            |  12 +
 defconfigs/datacrunch-h100-pytorch-or-less    |  13 +
 defconfigs/datacrunch-v100                    |  12 +
 defconfigs/lambdalabs-8x-b200-or-less         |  14 +
 defconfigs/lambdalabs-8x-h100-or-less         |  14 +
 defconfigs/lambdalabs-a100-or-less            |  14 +
 defconfigs/lambdalabs-gh200-or-less           |  14 +
 defconfigs/lambdalabs-h100-or-less            |  14 +
 docs/datacrunch.md                            | 583 +++++++++++++++++
 docs/kdevops-terraform.md                     | 213 +++++++
 playbooks/datacrunch_volume_cache.yml         |  83 +++
 playbooks/roles/devconfig/defaults/main.yml   |   4 +
 .../roles/devconfig/tasks/datacrunch_ml.yml   |  96 +++
 playbooks/roles/devconfig/tasks/main.yml      |  17 +
 playbooks/roles/gen_tfvars/defaults/main.yml  |   6 +
 .../templates/datacrunch/terraform.tfvars.j2  |  19 +
 playbooks/roles/terraform/tasks/main.yml      | 593 +++++++++++++++++-
 playbooks/terraform.yml                       |   2 +-
 requirements.txt                              |  18 +
 scripts/append-makefile-vars.sh               |  12 +
 scripts/datacrunch_api.py                     | 340 ++++++++++
 scripts/datacrunch_check_capacity.py          | 307 +++++++++
 scripts/datacrunch_credentials.py             | 372 +++++++++++
 scripts/datacrunch_select_tier.py             | 373 +++++++++++
 scripts/datacrunch_ssh_key_name.py            |  43 ++
 scripts/datacrunch_ssh_keys.py                | 346 ++++++++++
 scripts/generate_cloud_configs.py             |  36 +-
 scripts/generate_datacrunch_kconfig.py        | 336 ++++++++++
 scripts/kconfig/kconfig.Makefile              |  20 +-
 scripts/lambdalabs_check_capacity.py          | 124 ++++
 scripts/lambdalabs_select_tier.py             | 313 +++++++++
 scripts/terraform.Makefile                    |   7 +
 terraform/Kconfig.providers                   |  10 +
 terraform/Kconfig.ssh                         |  21 +-
 terraform/datacrunch/Kconfig                  |  45 ++
 terraform/datacrunch/LOCAL_PROVIDER.md        |  92 +++
 terraform/datacrunch/README.md                | 454 ++++++++++++++
 terraform/datacrunch/STATUS.md                | 106 ++++
 .../datacrunch/ansible_provision_cmd.tpl      |   1 +
 terraform/datacrunch/extract_api_key.py       |  63 ++
 terraform/datacrunch/kconfigs/Kconfig.compute |   7 +
 .../kconfigs/Kconfig.compute.generated        | 209 ++++++
 .../datacrunch/kconfigs/Kconfig.identity      |  82 +++
 terraform/datacrunch/kconfigs/Kconfig.images  |   7 +
 .../kconfigs/Kconfig.images.generated         |  90 +++
 .../datacrunch/kconfigs/Kconfig.location      |   7 +
 .../kconfigs/Kconfig.location.generated       |  49 ++
 terraform/datacrunch/main.tf                  |  36 ++
 terraform/datacrunch/output.tf                |  35 ++
 terraform/datacrunch/provider.tf              |  25 +
 terraform/datacrunch/scripts/apply_wrapper.sh |  79 +++
 .../datacrunch/scripts/destroy_wrapper.sh     |  95 +++
 terraform/datacrunch/scripts/volume_cache.py  | 165 +++++
 terraform/datacrunch/shared.tf                |   1 +
 terraform/datacrunch/vars.tf                  |  52 ++
 terraform/lambdalabs/README.md                | 103 +++
 terraform/lambdalabs/kconfigs/Kconfig.compute |  68 ++
 terraform/oci/scripts/gen_kconfig_image       |   6 +
 terraform/oci/scripts/gen_kconfig_location    |   6 +
 terraform/oci/scripts/gen_kconfig_shape       |   6 +
 72 files changed, 6393 insertions(+), 16 deletions(-)
 create mode 100644 defconfigs/datacrunch-4x-b200
 create mode 100644 defconfigs/datacrunch-4x-b300
 create mode 100644 defconfigs/datacrunch-4x-h100-pytorch
 create mode 100644 defconfigs/datacrunch-a100
 create mode 100644 defconfigs/datacrunch-a100-40-or-less
 create mode 100644 defconfigs/datacrunch-a100-80-or-less
 create mode 100644 defconfigs/datacrunch-b200-or-less
 create mode 100644 defconfigs/datacrunch-b300
 create mode 100644 defconfigs/datacrunch-b300-or-less
 create mode 100644 defconfigs/datacrunch-h100-pytorch
 create mode 100644 defconfigs/datacrunch-h100-pytorch-or-less
 create mode 100644 defconfigs/datacrunch-v100
 create mode 100644 defconfigs/lambdalabs-8x-b200-or-less
 create mode 100644 defconfigs/lambdalabs-8x-h100-or-less
 create mode 100644 defconfigs/lambdalabs-a100-or-less
 create mode 100644 defconfigs/lambdalabs-gh200-or-less
 create mode 100644 defconfigs/lambdalabs-h100-or-less
 create mode 100644 docs/datacrunch.md
 create mode 100644 playbooks/datacrunch_volume_cache.yml
 create mode 100644 playbooks/roles/devconfig/tasks/datacrunch_ml.yml
 create mode 100644 playbooks/roles/gen_tfvars/templates/datacrunch/terraform.tfvars.j2
 create mode 100644 requirements.txt
 create mode 100755 scripts/datacrunch_api.py
 create mode 100755 scripts/datacrunch_check_capacity.py
 create mode 100755 scripts/datacrunch_credentials.py
 create mode 100755 scripts/datacrunch_select_tier.py
 create mode 100755 scripts/datacrunch_ssh_key_name.py
 create mode 100755 scripts/datacrunch_ssh_keys.py
 create mode 100755 scripts/generate_datacrunch_kconfig.py
 create mode 100755 scripts/lambdalabs_check_capacity.py
 create mode 100755 scripts/lambdalabs_select_tier.py
 create mode 100644 terraform/datacrunch/Kconfig
 create mode 100644 terraform/datacrunch/LOCAL_PROVIDER.md
 create mode 100644 terraform/datacrunch/README.md
 create mode 100644 terraform/datacrunch/STATUS.md
 create mode 120000 terraform/datacrunch/ansible_provision_cmd.tpl
 create mode 100755 terraform/datacrunch/extract_api_key.py
 create mode 100644 terraform/datacrunch/kconfigs/Kconfig.compute
 create mode 100644 terraform/datacrunch/kconfigs/Kconfig.compute.generated
 create mode 100644 terraform/datacrunch/kconfigs/Kconfig.identity
 create mode 100644 terraform/datacrunch/kconfigs/Kconfig.images
 create mode 100644 terraform/datacrunch/kconfigs/Kconfig.images.generated
 create mode 100644 terraform/datacrunch/kconfigs/Kconfig.location
 create mode 100644 terraform/datacrunch/kconfigs/Kconfig.location.generated
 create mode 100644 terraform/datacrunch/main.tf
 create mode 100644 terraform/datacrunch/output.tf
 create mode 100644 terraform/datacrunch/provider.tf
 create mode 100755 terraform/datacrunch/scripts/apply_wrapper.sh
 create mode 100755 terraform/datacrunch/scripts/destroy_wrapper.sh
 create mode 100755 terraform/datacrunch/scripts/volume_cache.py
 create mode 120000 terraform/datacrunch/shared.tf
 create mode 100644 terraform/datacrunch/vars.tf

-- 
2.51.0



Thread overview: 18+ messages
2025-12-06 16:56 [PATCH 0/8] neoclouds: add new datacrunch / verda support Luis Chamberlain
2025-12-06 16:56 ` [PATCH 1/8] terraform: Use directory checksum in SSH key filenames Luis Chamberlain
2025-12-06 22:28   ` Chuck Lever
2025-12-12 19:14     ` Chuck Lever
2025-12-15 15:41       ` Chuck Lever
2025-12-06 16:56 ` [PATCH 2/8] devconfig: Add tmux.conf copying to target systems Luis Chamberlain
2025-12-06 16:56 ` [PATCH 3/8] terraform: Enable fact gathering for localhost Luis Chamberlain
2025-12-07 16:23   ` Chuck Lever
2025-12-06 16:56 ` [PATCH 4/8] terraform: Add DataCrunch GPU cloud provider integration Luis Chamberlain
2025-12-16 16:12   ` Chuck Lever
2025-12-06 16:56 ` [PATCH 5/8] kconfig: Add support for merging defconfig fragments Luis Chamberlain
2025-12-07 16:25   ` Chuck Lever
2025-12-07 20:37   ` Daniel Gomez
2025-12-06 16:56 ` [PATCH 6/8] terraform: Add tier-based GPU selection for Lambda Labs Luis Chamberlain
2025-12-16 18:05   ` Chuck Lever
2025-12-06 16:56 ` [PATCH 7/8] terraform: Document " Luis Chamberlain
2025-12-16 19:30   ` Chuck Lever
2025-12-06 16:56 ` [PATCH 8/8] docs: Organize cloud providers with Neoclouds section Luis Chamberlain
