From: Daniel Gomez <da.gomez@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>,
Chuck Lever <cel@kernel.org>, Daniel Gomez <da.gomez@kruces.com>,
hui81.qi@samsung.com, kundan.kumar@samsung.com,
kdevops@lists.linux.dev
Subject: Re: [PATCH 2/2] ai: add multi-filesystem testing support for Milvus benchmarks
Date: Mon, 1 Sep 2025 22:11:36 +0200
Message-ID: <7f74c11c-743c-4eb5-b544-07c604f41344@kernel.org>
In-Reply-To: <20250827093202.3539990-3-mcgrof@kernel.org>

On 27/08/2025 11.32, Luis Chamberlain wrote:
> Extend the AI workflow to support testing Milvus across multiple
> filesystem configurations simultaneously. This enables comprehensive
> performance comparisons between different filesystems and their
> configuration options.
>
> Key features:
> - Dynamic node generation based on enabled filesystem configurations
> - Support for XFS, EXT4, and BTRFS with various mount options
> - Per-filesystem result collection and analysis
> - A/B testing across all filesystem configurations
> - Automated comparison graphs between filesystems
>
> Filesystem configurations:
> - XFS: default, nocrc, bigtime with various block sizes (512, 1k, 2k, 4k)
> - EXT4: default, nojournal, bigalloc configurations
> - BTRFS: default, zlib, lzo, zstd compression options
>
> Defconfigs:
> - ai-milvus-multifs: Test 7 filesystem configs with A/B testing
> - ai-milvus-multifs-distro: Test with distribution kernels
> - ai-milvus-multifs-extended: Extended configs (14 filesystems total)
>
> Node generation:
> The system dynamically generates nodes based on enabled filesystem
> configurations. With A/B testing enabled, this creates baseline and
> dev nodes for each filesystem (e.g., debian13-ai-xfs-4k and
> debian13-ai-xfs-4k-dev).
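
To check my reading of the naming scheme described above, here is a minimal
Python sketch. The function name and arguments are hypothetical; only the
"<prefix>-ai-<config>" pattern and the "-dev" suffix come from the commit
message.

```python
from itertools import product


def node_names(prefix, configs, baseline_and_dev):
    """Build one node name per filesystem config, plus a -dev twin for A/B."""
    base = [f"{prefix}-ai-{config}" for config in configs]
    if not baseline_and_dev:
        return base
    # Pair every baseline node with the empty suffix and the "-dev" suffix.
    return [name + suffix for name, suffix in product(base, ["", "-dev"])]


if __name__ == "__main__":
    for name in node_names("debian13", ["xfs-4k", "ext4-default"], True):
        print(name)
```

With A/B testing enabled, each config yields two nodes, so the 7-config
defconfig would presumably bring up 14 guests.
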
>
> Usage:
> make defconfig-ai-milvus-multifs
> make bringup # Creates nodes for each filesystem
> make ai # Setup infrastructure on all nodes
> make ai-tests # Run benchmarks on all filesystems
> make ai-results # Collect and compare results
>
> This enables systematic evaluation of how different filesystems and
> their configurations affect vector database performance.
>
> Generated-by: Claude AI
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
...
> diff --git a/playbooks/roles/gen_hosts/tasks/main.yml b/playbooks/roles/gen_hosts/tasks/main.yml
> index 4b35d9f6..d36790b0 100644
> --- a/playbooks/roles/gen_hosts/tasks/main.yml
> +++ b/playbooks/roles/gen_hosts/tasks/main.yml
> @@ -381,6 +381,25 @@
> - workflows_reboot_limit
> - ansible_hosts_template.stat.exists
>
> +- name: Load AI nodes configuration for multi-filesystem setup
> + include_vars:
> + file: "{{ topdir_path }}/{{ kdevops_nodes }}"
> + name: guestfs_nodes
> + when:
> + - kdevops_workflows_dedicated_workflow
> + - kdevops_workflow_enable_ai
> + - ai_enable_multifs_testing|default(false)|bool
> + - ansible_hosts_template.stat.exists
> +
> +- name: Extract AI node names for multi-filesystem setup
> + set_fact:
> + all_generic_nodes: "{{ guestfs_nodes.guestfs_nodes | map(attribute='name') | list }}"
> + when:
> + - kdevops_workflows_dedicated_workflow
> + - kdevops_workflow_enable_ai
> + - ai_enable_multifs_testing|default(false)|bool
> + - guestfs_nodes is defined
> +
> - name: Generate the Ansible hosts file for a dedicated AI setup
> tags: ['hosts']
> ansible.builtin.template:
> diff --git a/playbooks/roles/gen_hosts/templates/fstests.j2 b/playbooks/roles/gen_hosts/templates/fstests.j2
> index ac086c6e..32d90abf 100644
> --- a/playbooks/roles/gen_hosts/templates/fstests.j2
> +++ b/playbooks/roles/gen_hosts/templates/fstests.j2
> @@ -70,6 +70,7 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> [krb5:vars]
> ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> {% endif %}
> +{% if kdevops_enable_iscsi or kdevops_nfsd_enable or kdevops_smbd_enable or kdevops_krb5_enable %}
> [service]
> {% if kdevops_enable_iscsi %}
> {{ kdevops_hosts_prefix }}-iscsi
> @@ -85,3 +86,4 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> {% endif %}
> [service:vars]
> ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> +{% endif %}
> diff --git a/playbooks/roles/gen_hosts/templates/gitr.j2 b/playbooks/roles/gen_hosts/templates/gitr.j2
> index 7f9094d4..3f30a5fb 100644
> --- a/playbooks/roles/gen_hosts/templates/gitr.j2
> +++ b/playbooks/roles/gen_hosts/templates/gitr.j2
> @@ -38,6 +38,7 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> [nfsd:vars]
> ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> {% endif %}
> +{% if kdevops_enable_iscsi or kdevops_nfsd_enable %}
> [service]
> {% if kdevops_enable_iscsi %}
> {{ kdevops_hosts_prefix }}-iscsi
> @@ -47,3 +48,4 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> {% endif %}
> [service:vars]
> ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> +{% endif %}
> diff --git a/playbooks/roles/gen_hosts/templates/hosts.j2 b/playbooks/roles/gen_hosts/templates/hosts.j2
> index cdcd1883..e9441605 100644
> --- a/playbooks/roles/gen_hosts/templates/hosts.j2
> +++ b/playbooks/roles/gen_hosts/templates/hosts.j2
> @@ -119,39 +119,30 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> [ai:vars]
> ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
>
> -{% set fs_configs = [] %}
> +{# Individual section groups for multi-filesystem testing #}
> +{% set section_names = [] %}
> {% for node in all_generic_nodes %}
> -{% set node_parts = node.split('-') %}
> -{% if node_parts|length >= 3 %}
> -{% set fs_type = node_parts[2] %}
> -{% set fs_config = node_parts[3:] | select('ne', 'dev') | join('_') %}
> -{% set fs_group = fs_type + '_' + fs_config if fs_config else fs_type %}
> -{% if fs_group not in fs_configs %}
> -{% set _ = fs_configs.append(fs_group) %}
> +{% if not node.endswith('-dev') %}
> +{% set section = node.replace(kdevops_host_prefix + '-ai-', '') %}
> +{% if section != kdevops_host_prefix + '-ai' %}
> +{% if section_names.append(section) %}{% endif %}
> {% endif %}
> {% endif %}
> {% endfor %}
>
> -{% for fs_group in fs_configs %}
> -[ai_{{ fs_group }}]
> -{% for node in all_generic_nodes %}
> -{% set node_parts = node.split('-') %}
> -{% if node_parts|length >= 3 %}
> -{% set fs_type = node_parts[2] %}
> -{% set fs_config = node_parts[3:] | select('ne', 'dev') | join('_') %}
> -{% set node_fs_group = fs_type + '_' + fs_config if fs_config else fs_type %}
> -{% if node_fs_group == fs_group %}
> -{{ node }}
> -{% endif %}
> +{% for section in section_names %}
> +[ai_{{ section | replace('-', '_') }}]
> +{{ kdevops_host_prefix }}-ai-{{ section }}
> +{% if kdevops_baseline_and_dev %}
> +{{ kdevops_host_prefix }}-ai-{{ section }}-dev
> {% endif %}
> -{% endfor %}
>
> -[ai_{{ fs_group }}:vars]
> +[ai_{{ section | replace('-', '_') }}:vars]
> ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
>
> {% endfor %}
> {% else %}
> -{# Single-node AI hosts #}
> +{# Single filesystem hosts (original behavior) #}
> [all]
> localhost ansible_connection=local
> {{ kdevops_host_prefix }}-ai
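
As I read the new hosts.j2 loop, it skips "-dev" nodes, strips the
"<prefix>-ai-" part to get a section name, and emits one group per section
with "-" mapped to "_". A rough Python equivalent, to make sure I follow it
(the function and argument names are mine, not from the patch; using a dict
also de-duplicates sections, which the template's list does not):

```python
def ai_groups(nodes, prefix, baseline_and_dev):
    """Map each per-filesystem group name to the hosts listed under it."""
    groups = {}
    for node in nodes:
        if node.endswith("-dev"):
            continue  # dev hosts are added next to their baseline below
        section = node.replace(f"{prefix}-ai-", "")
        if section == f"{prefix}-ai":
            continue  # plain "<prefix>-ai" node, no filesystem section
        members = [f"{prefix}-ai-{section}"]
        if baseline_and_dev:
            members.append(f"{prefix}-ai-{section}-dev")
        # Group names use underscores, as in the template's replace filter.
        groups["ai_" + section.replace("-", "_")] = members
    return groups


if __name__ == "__main__":
    nodes = ["debian13-ai-xfs-4k", "debian13-ai-xfs-4k-dev"]
    print(ai_groups(nodes, "debian13", True))
```
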
> diff --git a/playbooks/roles/gen_hosts/templates/nfstest.j2 b/playbooks/roles/gen_hosts/templates/nfstest.j2
> index e427ac34..709d871d 100644
> --- a/playbooks/roles/gen_hosts/templates/nfstest.j2
> +++ b/playbooks/roles/gen_hosts/templates/nfstest.j2
> @@ -38,6 +38,7 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> [nfsd:vars]
> ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> {% endif %}
> +{% if kdevops_enable_iscsi or kdevops_nfsd_enable %}
> [service]
> {% if kdevops_enable_iscsi %}
> {{ kdevops_hosts_prefix }}-iscsi
> @@ -47,3 +48,4 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> {% endif %}
> [service:vars]
> ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> +{% endif %}
> diff --git a/playbooks/roles/gen_hosts/templates/pynfs.j2 b/playbooks/roles/gen_hosts/templates/pynfs.j2
> index 85c87dae..55add4d1 100644
> --- a/playbooks/roles/gen_hosts/templates/pynfs.j2
> +++ b/playbooks/roles/gen_hosts/templates/pynfs.j2
> @@ -23,6 +23,7 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> {{ kdevops_hosts_prefix }}-nfsd
> [nfsd:vars]
> ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> +{% if true %}
> [service]
> {% if kdevops_enable_iscsi %}
> {{ kdevops_hosts_prefix }}-iscsi
> @@ -30,3 +31,4 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> {{ kdevops_hosts_prefix }}-nfsd
> [service:vars]
> ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
> +{% endif %}
> diff --git a/playbooks/roles/gen_nodes/tasks/main.yml b/playbooks/roles/gen_nodes/tasks/main.yml
> index d54977be..b294d294 100644
> --- a/playbooks/roles/gen_nodes/tasks/main.yml
> +++ b/playbooks/roles/gen_nodes/tasks/main.yml
> @@ -658,6 +658,7 @@
> - kdevops_workflow_enable_ai
> - ansible_nodes_template.stat.exists
> - not kdevops_baseline_and_dev
> + - not ai_enable_multifs_testing|default(false)|bool
>
> - name: Generate the AI kdevops nodes file with dev hosts using {{ kdevops_nodes_template }} as jinja2 source template
> tags: ['hosts']
> @@ -675,6 +676,95 @@
> - kdevops_workflow_enable_ai
> - ansible_nodes_template.stat.exists
> - kdevops_baseline_and_dev
> + - not ai_enable_multifs_testing|default(false)|bool
> +
> +- name: Infer enabled AI multi-filesystem configurations
> + vars:
> + kdevops_config_data: "{{ lookup('file', topdir_path + '/.config') }}"
> + # Find all enabled AI multifs configurations
> + xfs_configs: >-
> + {{
> + kdevops_config_data | regex_findall('^CONFIG_AI_MULTIFS_XFS_(.*)=y$', multiline=True)
> + | map('lower')
> + | map('regex_replace', '_', '-')
> + | map('regex_replace', '^', 'xfs-')
> + | list
> + if kdevops_config_data | regex_search('^CONFIG_AI_MULTIFS_TEST_XFS=y$', multiline=True)
> + else []
> + }}
> + ext4_configs: >-
> + {{
> + kdevops_config_data | regex_findall('^CONFIG_AI_MULTIFS_EXT4_(.*)=y$', multiline=True)
> + | map('lower')
> + | map('regex_replace', '_', '-')
> + | map('regex_replace', '^', 'ext4-')
> + | list
> + if kdevops_config_data | regex_search('^CONFIG_AI_MULTIFS_TEST_EXT4=y$', multiline=True)
> + else []
> + }}
> + btrfs_configs: >-
> + {{
> + kdevops_config_data | regex_findall('^CONFIG_AI_MULTIFS_BTRFS_(.*)=y$', multiline=True)
> + | map('lower')
> + | map('regex_replace', '_', '-')
> + | map('regex_replace', '^', 'btrfs-')
> + | list
> + if kdevops_config_data | regex_search('^CONFIG_AI_MULTIFS_TEST_BTRFS=y$', multiline=True)
> + else []
> + }}
> + set_fact:
> + ai_multifs_enabled_configs: "{{ (xfs_configs + ext4_configs + btrfs_configs) | unique }}"
> + when:
> + - kdevops_workflows_dedicated_workflow
> + - kdevops_workflow_enable_ai
> + - ai_enable_multifs_testing|default(false)|bool
> + - ansible_nodes_template.stat.exists
> +
> +- name: Create AI nodes for each filesystem configuration (no dev)
> + vars:
> + filesystem_nodes: "{{ [kdevops_host_prefix + '-ai-'] | product(ai_multifs_enabled_configs | default([])) | map('join') | list }}"
> + set_fact:
> + ai_enabled_section_types: "{{ filesystem_nodes }}"
> + when:
> + - kdevops_workflows_dedicated_workflow
> + - kdevops_workflow_enable_ai
> + - ai_enable_multifs_testing|default(false)|bool
> + - ansible_nodes_template.stat.exists
> + - not kdevops_baseline_and_dev
> + - ai_multifs_enabled_configs is defined
> + - ai_multifs_enabled_configs | length > 0
> +
> +- name: Create AI nodes for each filesystem configuration with dev hosts
> + vars:
> + filesystem_nodes: "{{ [kdevops_host_prefix + '-ai-'] | product(ai_multifs_enabled_configs | default([])) | map('join') | list }}"
> + set_fact:
> + ai_enabled_section_types: "{{ filesystem_nodes | product(['', '-dev']) | map('join') | list }}"
> + when:
> + - kdevops_workflows_dedicated_workflow
> + - kdevops_workflow_enable_ai
> + - ai_enable_multifs_testing|default(false)|bool
> + - ansible_nodes_template.stat.exists
> + - kdevops_baseline_and_dev
> + - ai_multifs_enabled_configs is defined
> + - ai_multifs_enabled_configs | length > 0
> +
> +- name: Generate the AI multi-filesystem kdevops nodes file using {{ kdevops_nodes_template }} as jinja2 source template
> + tags: [ 'hosts' ]
> + vars:
> + node_template: "{{ kdevops_nodes_template | basename }}"
> + nodes: "{{ ai_enabled_section_types | regex_replace('\\[') | regex_replace('\\]') | replace(\"'\", '') | split(', ') }}"
> + all_generic_nodes: "{{ ai_enabled_section_types }}"
> + template:
> + src: "{{ node_template }}"
> + dest: "{{ topdir_path }}/{{ kdevops_nodes }}"
> + force: yes
> + when:
> + - kdevops_workflows_dedicated_workflow
> + - kdevops_workflow_enable_ai
> + - ai_enable_multifs_testing|default(false)|bool
> + - ansible_nodes_template.stat.exists
> + - ai_enabled_section_types is defined
> + - ai_enabled_section_types | length > 0
>
> - name: Get the control host's timezone
> ansible.builtin.command: "timedatectl show -p Timezone --value"
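
The "Infer enabled AI multi-filesystem configurations" task above derives the
config names by scanning .config with multiline regexes. A small Python sketch
of the same idea, as I understand it. The Kconfig symbol suffixes in the sample
(4K, NOCRC_512, DEFAULT) are made up for illustration; only the
CONFIG_AI_MULTIFS_<FS>_ and CONFIG_AI_MULTIFS_TEST_<FS> prefixes come from the
patch.

```python
import re


def enabled_configs(config_text, fs):
    """Return e.g. ["xfs-4k"] for each CONFIG_AI_MULTIFS_XFS_*=y line."""
    fs_upper = fs.upper()
    # The per-filesystem master switch must be enabled first.
    gate = rf"^CONFIG_AI_MULTIFS_TEST_{fs_upper}=y$"
    if not re.search(gate, config_text, re.MULTILINE):
        return []
    pattern = rf"^CONFIG_AI_MULTIFS_{fs_upper}_(.*)=y$"
    suffixes = re.findall(pattern, config_text, re.MULTILINE)
    # Lowercase, swap "_" for "-", and prefix with the filesystem name.
    return [f"{fs}-" + s.lower().replace("_", "-") for s in suffixes]


if __name__ == "__main__":
    sample = (
        "CONFIG_AI_MULTIFS_TEST_XFS=y\n"
        "CONFIG_AI_MULTIFS_XFS_4K=y\n"
        "CONFIG_AI_MULTIFS_XFS_NOCRC_512=y\n"
        "# CONFIG_AI_MULTIFS_TEST_EXT4 is not set\n"
        "CONFIG_AI_MULTIFS_EXT4_DEFAULT=y\n"
    )
    print(enabled_configs(sample, "xfs"))
    print(enabled_configs(sample, "ext4"))
```

Note that any symbol matching the prefix ends up as a node name, so a future
non-node option under CONFIG_AI_MULTIFS_XFS_ would also be picked up.
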
> diff --git a/playbooks/roles/guestfs/tasks/bringup/main.yml b/playbooks/roles/guestfs/tasks/bringup/main.yml
> index c131de25..bd9f5260 100644
> --- a/playbooks/roles/guestfs/tasks/bringup/main.yml
> +++ b/playbooks/roles/guestfs/tasks/bringup/main.yml
> @@ -1,11 +1,16 @@
> ---
> - name: List defined libvirt guests
> run_once: true
> + delegate_to: localhost
> community.libvirt.virt:
> command: list_vms
> uri: "{{ libvirt_uri }}"
> register: defined_vms
>
> +- name: Debug defined VMs
> + debug:
> + msg: "Hostname: {{ inventory_hostname }}, Defined VMs: {{ hostvars['localhost']['defined_vms']['list_vms'] | default([]) }}, Check: {{ inventory_hostname not in (hostvars['localhost']['defined_vms']['list_vms'] | default([])) }}"
> +
> - name: Provision each target node
> when:
> - "inventory_hostname not in defined_vms.list_vms"
> @@ -25,10 +30,13 @@
> path: "{{ ssh_key_dir }}"
> state: directory
> mode: "u=rwx"
> + delegate_to: localhost
>
> - name: Generate fresh keys for each target node
> ansible.builtin.command:
> cmd: 'ssh-keygen -q -t ed25519 -f {{ ssh_key }} -N ""'
> + creates: "{{ ssh_key }}"
> + delegate_to: localhost
>
> - name: Set the pathname of the root disk image for each target node
> ansible.builtin.set_fact:
> @@ -38,15 +46,18 @@
> ansible.builtin.file:
> path: "{{ storagedir }}/{{ inventory_hostname }}"
> state: directory
> + delegate_to: localhost
>
> - name: Duplicate the root disk image for each target node
> ansible.builtin.command:
> cmd: "cp --reflink=auto {{ base_image }} {{ root_image }}"
> + delegate_to: localhost
>
> - name: Get the timezone of the control host
> ansible.builtin.command:
> cmd: "timedatectl show -p Timezone --value"
> register: host_timezone
> + delegate_to: localhost
>
> - name: Build the root image for each target node (as root)
> become: true
> @@ -103,6 +114,7 @@
> name: "{{ inventory_hostname }}"
> xml: "{{ lookup('file', xml_file) }}"
> uri: "{{ libvirt_uri }}"
> + delegate_to: localhost
>
> - name: Find PCIe passthrough devices
> ansible.builtin.find:
> @@ -110,6 +122,7 @@
> file_type: file
> patterns: "pcie_passthrough_*.xml"
> register: passthrough_devices
> + delegate_to: localhost
>
> - name: Attach PCIe passthrough devices to each target node
> environment:
> @@ -124,6 +137,7 @@
> loop: "{{ passthrough_devices.files }}"
> loop_control:
> label: "Doing PCI-E passthrough for device {{ item }}"
> + delegate_to: localhost
> when:
> - passthrough_devices.matched > 0
>
> @@ -142,3 +156,4 @@
> name: "{{ inventory_hostname }}"
> uri: "{{ libvirt_uri }}"
> state: running
> + delegate_to: localhost
> diff --git a/scripts/guestfs.Makefile b/scripts/guestfs.Makefile
> index bd03f58c..f6c350a4 100644
> --- a/scripts/guestfs.Makefile
> +++ b/scripts/guestfs.Makefile
> @@ -79,7 +79,7 @@ bringup_guestfs: $(GUESTFS_BRINGUP_DEPS)
> --extra-vars=@./extra_vars.yaml \
> --tags network,pool,base_image
> $(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
> - --limit 'baseline:dev:service' \
> + --limit 'baseline:dev:service:ai' \

I'm not sure I understand the need for this new ai group. Can you clarify?
Why aren't the baseline or dev groups sufficient for this AI workload?
What is the role of the ai group?

Note: I kept the hunks above to make it easier to reference the parts I
believe are most relevant to my questions (hopefully).
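
For context on the question: in an Ansible --limit pattern, ':' separates
groups that are OR-ed together, so adding ai only changes anything if some
nodes are in ai but in none of baseline, dev, or service. A minimal Python
sketch of that union follows. The inventory contents are hypothetical (I have
not checked which groups the multi-fs nodes land in), and the sketch ignores
the ':&' and ':!' operators.

```python
def limit_hosts(inventory, pattern):
    """Return the union of hosts in the ':'-separated groups, in order."""
    selected = []
    for group in pattern.split(":"):
        for host in inventory.get(group, []):
            if host not in selected:
                selected.append(host)
    return selected


if __name__ == "__main__":
    # Hypothetical inventory: the multi-fs nodes are only listed under "ai".
    inventory = {
        "baseline": [],
        "dev": [],
        "service": [],
        "ai": ["debian13-ai-xfs-4k", "debian13-ai-xfs-4k-dev"],
    }
    print(limit_hosts(inventory, "baseline:dev:service"))
    print(limit_hosts(inventory, "baseline:dev:service:ai"))
```

If the generated hosts file does put these nodes in baseline and dev as well,
the extra group would be redundant, hence the question.
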