Re: [PATCH 2/2] ai: add multi-filesystem testing support for Milvus benchmarks

From: Daniel Gomez <da.gomez@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>,
	Chuck Lever <cel@kernel.org>, Daniel Gomez <da.gomez@kruces.com>,
	hui81.qi@samsung.com, kundan.kumar@samsung.com,
	kdevops@lists.linux.dev
Subject: Re: [PATCH 2/2] ai: add multi-filesystem testing support for Milvus benchmarks
Date: Mon, 1 Sep 2025 22:11:36 +0200
Message-ID: <7f74c11c-743c-4eb5-b544-07c604f41344@kernel.org>
In-Reply-To: <20250827093202.3539990-3-mcgrof@kernel.org>

On 27/08/2025 11.32, Luis Chamberlain wrote:
> Extend the AI workflow to support testing Milvus across multiple
> filesystem configurations simultaneously. This enables comprehensive
> performance comparisons between different filesystems and their
> configuration options.
> 
> Key features:
> - Dynamic node generation based on enabled filesystem configurations
> - Support for XFS, EXT4, and BTRFS with various mount options
> - Per-filesystem result collection and analysis
> - A/B testing across all filesystem configurations
> - Automated comparison graphs between filesystems
> 
> Filesystem configurations:
> - XFS: default, nocrc, bigtime with various block sizes (512, 1k, 2k, 4k)
> - EXT4: default, nojournal, bigalloc configurations
> - BTRFS: default, zlib, lzo, zstd compression options
> 
> Defconfigs:
> - ai-milvus-multifs: Test 7 filesystem configs with A/B testing
> - ai-milvus-multifs-distro: Test with distribution kernels
> - ai-milvus-multifs-extended: Extended configs (14 filesystems total)
> 
> Node generation:
> The system dynamically generates nodes based on enabled filesystem
> configurations. With A/B testing enabled, this creates baseline and
> dev nodes for each filesystem (e.g., debian13-ai-xfs-4k and
> debian13-ai-xfs-4k-dev).
> 
> Usage:
>   make defconfig-ai-milvus-multifs
>   make bringup    # Creates nodes for each filesystem
>   make ai         # Setup infrastructure on all nodes
>   make ai-tests   # Run benchmarks on all filesystems
>   make ai-results # Collect and compare results
> 
> This enables systematic evaluation of how different filesystems and
> their configurations affect vector database performance.
> 
> Generated-by: Claude AI
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
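
For reference while reading the hunks below, here is a minimal sketch of the
node naming described in the commit message. It is not code from the patch:
the helper name ai_node_names and the "ext4-default" config string are
hypothetical, and only the debian13-ai-xfs-4k / debian13-ai-xfs-4k-dev pair
comes from the commit message itself.

```python
from itertools import product


def ai_node_names(host_prefix, fs_configs, baseline_and_dev):
    """Build one node name per filesystem config, plus a -dev twin for A/B runs.

    Hypothetical helper: it mirrors the prefix + '-ai-' + config naming from
    the commit message and is not taken from kdevops.
    """
    base = [f"{host_prefix}-ai-{cfg}" for cfg in fs_configs]
    if not baseline_and_dev:
        return base
    # Pair every baseline node with the '' and '-dev' suffixes, in that order.
    return [name + suffix for name, suffix in product(base, ["", "-dev"])]


if __name__ == "__main__":
    # "ext4-default" is an assumed config name used only for illustration.
    for node in ai_node_names("debian13", ["xfs-4k", "ext4-default"], True):
        print(node)
```

With A/B testing disabled the same call returns only the baseline names, so
the node count doubles exactly when baseline_and_dev is set.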

...

> diff --git a/playbooks/roles/gen_hosts/tasks/main.yml b/playbooks/roles/gen_hosts/tasks/main.yml
> index 4b35d9f6..d36790b0 100644
> --- a/playbooks/roles/gen_hosts/tasks/main.yml
> +++ b/playbooks/roles/gen_hosts/tasks/main.yml
> @@ -381,6 +381,25 @@
>      - workflows_reboot_limit
>      - ansible_hosts_template.stat.exists
>  
> +- name: Load AI nodes configuration for multi-filesystem setup
> +  include_vars:
> +    file: "{{ topdir_path }}/{{ kdevops_nodes }}"
> +    name: guestfs_nodes
> +  when:
> +    - kdevops_workflows_dedicated_workflow
> +    - kdevops_workflow_enable_ai
> +    - ai_enable_multifs_testing|default(false)|bool
> +    - ansible_hosts_template.stat.exists
> +
> +- name: Extract AI node names for multi-filesystem setup
> +  set_fact:
> +    all_generic_nodes: "{{ guestfs_nodes.guestfs_nodes | map(attribute='name') | list }}"
> +  when:
> +    - kdevops_workflows_dedicated_workflow
> +    - kdevops_workflow_enable_ai
> +    - ai_enable_multifs_testing|default(false)|bool
> +    - guestfs_nodes is defined
> +
>  - name: Generate the Ansible hosts file for a dedicated AI setup
>    tags: ['hosts']
>    ansible.builtin.template:
> diff --git a/playbooks/roles/gen_hosts/templates/fstests.j2 b/playbooks/roles/gen_hosts/templates/fstests.j2
> index ac086c6e..32d90abf 100644
> --- a/playbooks/roles/gen_hosts/templates/fstests.j2
> +++ b/playbooks/roles/gen_hosts/templates/fstests.j2
> @@ -70,6 +70,7 @@ ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
>  [krb5:vars]
>  ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
>  {% endif %}
> +{% if kdevops_enable_iscsi or kdevops_nfsd_enable or kdevops_smbd_enable or kdevops_krb5_enable %}
>  [service]
>  {% if kdevops_enable_iscsi %}
>  {{ kdevops_hosts_prefix }}-iscsi
> @@ -85,3 +86,4 @@ ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
>  {% endif %}
>  [service:vars]
>  ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
> +{% endif %}
> diff --git a/playbooks/roles/gen_hosts/templates/gitr.j2 b/playbooks/roles/gen_hosts/templates/gitr.j2
> index 7f9094d4..3f30a5fb 100644
> --- a/playbooks/roles/gen_hosts/templates/gitr.j2
> +++ b/playbooks/roles/gen_hosts/templates/gitr.j2
> @@ -38,6 +38,7 @@ ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
>  [nfsd:vars]
>  ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
>  {% endif %}
> +{% if kdevops_enable_iscsi or kdevops_nfsd_enable %}
>  [service]
>  {% if kdevops_enable_iscsi %}
>  {{ kdevops_hosts_prefix }}-iscsi
> @@ -47,3 +48,4 @@ ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
>  {% endif %}
>  [service:vars]
>  ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
> +{% endif %}
> diff --git a/playbooks/roles/gen_hosts/templates/hosts.j2 b/playbooks/roles/gen_hosts/templates/hosts.j2
> index cdcd1883..e9441605 100644
> --- a/playbooks/roles/gen_hosts/templates/hosts.j2
> +++ b/playbooks/roles/gen_hosts/templates/hosts.j2
> @@ -119,39 +119,30 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
>  [ai:vars]
>  ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
>  
> -{% set fs_configs = [] %}
> +{# Individual section groups for multi-filesystem testing #}
> +{% set section_names = [] %}
>  {% for node in all_generic_nodes %}
> -{% set node_parts = node.split('-') %}
> -{% if node_parts|length >= 3 %}
> -{% set fs_type = node_parts[2] %}
> -{% set fs_config = node_parts[3:] | select('ne', 'dev') | join('_') %}
> -{% set fs_group = fs_type + '_' + fs_config if fs_config else fs_type %}
> -{% if fs_group not in fs_configs %}
> -{% set _ = fs_configs.append(fs_group) %}
> +{% if not node.endswith('-dev') %}
> +{% set section = node.replace(kdevops_host_prefix + '-ai-', '') %}
> +{% if section != kdevops_host_prefix + '-ai' %}
> +{% if section_names.append(section) %}{% endif %}
>  {% endif %}
>  {% endif %}
>  {% endfor %}
>  
> -{% for fs_group in fs_configs %}
> -[ai_{{ fs_group }}]
> -{% for node in all_generic_nodes %}
> -{% set node_parts = node.split('-') %}
> -{% if node_parts|length >= 3 %}
> -{% set fs_type = node_parts[2] %}
> -{% set fs_config = node_parts[3:] | select('ne', 'dev') | join('_') %}
> -{% set node_fs_group = fs_type + '_' + fs_config if fs_config else fs_type %}
> -{% if node_fs_group == fs_group %}
> -{{ node }}
> -{% endif %}
> +{% for section in section_names %}
> +[ai_{{ section | replace('-', '_') }}]
> +{{ kdevops_host_prefix }}-ai-{{ section }}
> +{% if kdevops_baseline_and_dev %}
> +{{ kdevops_host_prefix }}-ai-{{ section }}-dev
>  {% endif %}
> -{% endfor %}
>  
> -[ai_{{ fs_group }}:vars]
> +[ai_{{ section | replace('-', '_') }}:vars]
>  ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
>  
>  {% endfor %}
>  {% else %}
> -{# Single-node AI hosts #}
> +{# Single filesystem hosts (original behavior) #}
>  [all]
>  localhost ansible_connection=local
>  {{ kdevops_host_prefix }}-ai
> diff --git a/playbooks/roles/gen_hosts/templates/nfstest.j2 b/playbooks/roles/gen_hosts/templates/nfstest.j2
> index e427ac34..709d871d 100644
> --- a/playbooks/roles/gen_hosts/templates/nfstest.j2
> +++ b/playbooks/roles/gen_hosts/templates/nfstest.j2
> @@ -38,6 +38,7 @@ ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
>  [nfsd:vars]
>  ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
>  {% endif %}
> +{% if kdevops_enable_iscsi or kdevops_nfsd_enable %}
>  [service]
>  {% if kdevops_enable_iscsi %}
>  {{ kdevops_hosts_prefix }}-iscsi
> @@ -47,3 +48,4 @@ ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
>  {% endif %}
>  [service:vars]
>  ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
> +{% endif %}
> diff --git a/playbooks/roles/gen_hosts/templates/pynfs.j2 b/playbooks/roles/gen_hosts/templates/pynfs.j2
> index 85c87dae..55add4d1 100644
> --- a/playbooks/roles/gen_hosts/templates/pynfs.j2
> +++ b/playbooks/roles/gen_hosts/templates/pynfs.j2
> @@ -23,6 +23,7 @@ ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
>  {{ kdevops_hosts_prefix }}-nfsd
>  [nfsd:vars]
>  ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
> +{% if true %}
>  [service]
>  {% if kdevops_enable_iscsi %}
>  {{ kdevops_hosts_prefix }}-iscsi
> @@ -30,3 +31,4 @@ ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
>  {{ kdevops_hosts_prefix }}-nfsd
>  [service:vars]
>  ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
> +{% endif %}
> diff --git a/playbooks/roles/gen_nodes/tasks/main.yml b/playbooks/roles/gen_nodes/tasks/main.yml
> index d54977be..b294d294 100644
> --- a/playbooks/roles/gen_nodes/tasks/main.yml
> +++ b/playbooks/roles/gen_nodes/tasks/main.yml
> @@ -658,6 +658,7 @@
>      - kdevops_workflow_enable_ai
>      - ansible_nodes_template.stat.exists
>      - not kdevops_baseline_and_dev
> +    - not ai_enable_multifs_testing|default(false)|bool
>  
>  - name: Generate the AI kdevops nodes file with dev hosts using {{ kdevops_nodes_template }} as jinja2 source template
>    tags: ['hosts']
> @@ -675,6 +676,95 @@
>      - kdevops_workflow_enable_ai
>      - ansible_nodes_template.stat.exists
>      - kdevops_baseline_and_dev
> +    - not ai_enable_multifs_testing|default(false)|bool
> +
> +- name: Infer enabled AI multi-filesystem configurations
> +  vars:
> +    kdevops_config_data: "{{ lookup('file', topdir_path + '/.config') }}"
> +    # Find all enabled AI multifs configurations
> +    xfs_configs: >-
> +      {{
> +        kdevops_config_data | regex_findall('^CONFIG_AI_MULTIFS_XFS_(.*)=y$', multiline=True)
> +        | map('lower')
> +        | map('regex_replace', '_', '-')
> +        | map('regex_replace', '^', 'xfs-')
> +        | list
> +        if kdevops_config_data | regex_search('^CONFIG_AI_MULTIFS_TEST_XFS=y$', multiline=True)
> +        else []
> +      }}
> +    ext4_configs: >-
> +      {{
> +        kdevops_config_data | regex_findall('^CONFIG_AI_MULTIFS_EXT4_(.*)=y$', multiline=True)
> +        | map('lower')
> +        | map('regex_replace', '_', '-')
> +        | map('regex_replace', '^', 'ext4-')
> +        | list
> +        if kdevops_config_data | regex_search('^CONFIG_AI_MULTIFS_TEST_EXT4=y$', multiline=True)
> +        else []
> +      }}
> +    btrfs_configs: >-
> +      {{
> +        kdevops_config_data | regex_findall('^CONFIG_AI_MULTIFS_BTRFS_(.*)=y$', multiline=True)
> +        | map('lower')
> +        | map('regex_replace', '_', '-')
> +        | map('regex_replace', '^', 'btrfs-')
> +        | list
> +        if kdevops_config_data | regex_search('^CONFIG_AI_MULTIFS_TEST_BTRFS=y$', multiline=True)
> +        else []
> +      }}
> +  set_fact:
> +    ai_multifs_enabled_configs: "{{ (xfs_configs + ext4_configs + btrfs_configs) | unique }}"
> +  when:
> +    - kdevops_workflows_dedicated_workflow
> +    - kdevops_workflow_enable_ai
> +    - ai_enable_multifs_testing|default(false)|bool
> +    - ansible_nodes_template.stat.exists
> +
> +- name: Create AI nodes for each filesystem configuration (no dev)
> +  vars:
> +    filesystem_nodes: "{{ [kdevops_host_prefix + '-ai-'] | product(ai_multifs_enabled_configs | default([])) | map('join') | list }}"
> +  set_fact:
> +    ai_enabled_section_types: "{{ filesystem_nodes }}"
> +  when:
> +    - kdevops_workflows_dedicated_workflow
> +    - kdevops_workflow_enable_ai
> +    - ai_enable_multifs_testing|default(false)|bool
> +    - ansible_nodes_template.stat.exists
> +    - not kdevops_baseline_and_dev
> +    - ai_multifs_enabled_configs is defined
> +    - ai_multifs_enabled_configs | length > 0
> +
> +- name: Create AI nodes for each filesystem configuration with dev hosts
> +  vars:
> +    filesystem_nodes: "{{ [kdevops_host_prefix + '-ai-'] | product(ai_multifs_enabled_configs | default([])) | map('join') | list }}"
> +  set_fact:
> +    ai_enabled_section_types: "{{ filesystem_nodes | product(['', '-dev']) | map('join') | list }}"
> +  when:
> +    - kdevops_workflows_dedicated_workflow
> +    - kdevops_workflow_enable_ai
> +    - ai_enable_multifs_testing|default(false)|bool
> +    - ansible_nodes_template.stat.exists
> +    - kdevops_baseline_and_dev
> +    - ai_multifs_enabled_configs is defined
> +    - ai_multifs_enabled_configs | length > 0
> +
> +- name: Generate the AI multi-filesystem kdevops nodes file using {{ kdevops_nodes_template }} as jinja2 source template
> +  tags: [ 'hosts' ]
> +  vars:
> +    node_template: "{{ kdevops_nodes_template | basename }}"
> +    nodes: "{{ ai_enabled_section_types | regex_replace('\\[') | regex_replace('\\]') | replace(\"'\", '') | split(', ') }}"
> +    all_generic_nodes: "{{ ai_enabled_section_types }}"
> +  template:
> +    src: "{{ node_template }}"
> +    dest: "{{ topdir_path }}/{{ kdevops_nodes }}"
> +    force: yes
> +  when:
> +    - kdevops_workflows_dedicated_workflow
> +    - kdevops_workflow_enable_ai
> +    - ai_enable_multifs_testing|default(false)|bool
> +    - ansible_nodes_template.stat.exists
> +    - ai_enabled_section_types is defined
> +    - ai_enabled_section_types | length > 0
>  
>  - name: Get the control host's timezone
>    ansible.builtin.command: "timedatectl show -p Timezone --value"
> diff --git a/playbooks/roles/guestfs/tasks/bringup/main.yml b/playbooks/roles/guestfs/tasks/bringup/main.yml
> index c131de25..bd9f5260 100644
> --- a/playbooks/roles/guestfs/tasks/bringup/main.yml
> +++ b/playbooks/roles/guestfs/tasks/bringup/main.yml
> @@ -1,11 +1,16 @@
>  ---
>  - name: List defined libvirt guests
>    run_once: true
> +  delegate_to: localhost
>    community.libvirt.virt:
>      command: list_vms
>      uri: "{{ libvirt_uri }}"
>    register: defined_vms
>  
> +- name: Debug defined VMs
> +  debug:
> +    msg: "Hostname: {{ inventory_hostname }}, Defined VMs: {{ hostvars['localhost']['defined_vms']['list_vms'] | default([]) }}, Check: {{ inventory_hostname not in (hostvars['localhost']['defined_vms']['list_vms'] | default([])) }}"
> +
>  - name: Provision each target node
>    when:
>      - "inventory_hostname not in defined_vms.list_vms"
> @@ -25,10 +30,13 @@
>              path: "{{ ssh_key_dir }}"
>              state: directory
>              mode: "u=rwx"
> +          delegate_to: localhost
>  
>          - name: Generate fresh keys for each target node
>            ansible.builtin.command:
>              cmd: 'ssh-keygen -q -t ed25519 -f {{ ssh_key }} -N ""'
> +            creates: "{{ ssh_key }}"
> +          delegate_to: localhost
>  
>      - name: Set the pathname of the root disk image for each target node
>        ansible.builtin.set_fact:
> @@ -38,15 +46,18 @@
>        ansible.builtin.file:
>          path: "{{ storagedir }}/{{ inventory_hostname }}"
>          state: directory
> +      delegate_to: localhost
>  
>      - name: Duplicate the root disk image for each target node
>        ansible.builtin.command:
>          cmd: "cp --reflink=auto {{ base_image }} {{ root_image }}"
> +      delegate_to: localhost
>  
>      - name: Get the timezone of the control host
>        ansible.builtin.command:
>          cmd: "timedatectl show -p Timezone --value"
>        register: host_timezone
> +      delegate_to: localhost
>  
>      - name: Build the root image for each target node (as root)
>        become: true
> @@ -103,6 +114,7 @@
>          name: "{{ inventory_hostname }}"
>          xml: "{{ lookup('file', xml_file) }}"
>          uri: "{{ libvirt_uri }}"
> +      delegate_to: localhost
>  
>      - name: Find PCIe passthrough devices
>        ansible.builtin.find:
> @@ -110,6 +122,7 @@
>          file_type: file
>          patterns: "pcie_passthrough_*.xml"
>        register: passthrough_devices
> +      delegate_to: localhost
>  
>      - name: Attach PCIe passthrough devices to each target node
>        environment:
> @@ -124,6 +137,7 @@
>        loop: "{{ passthrough_devices.files }}"
>        loop_control:
>          label: "Doing PCI-E passthrough for device {{ item }}"
> +      delegate_to: localhost
>        when:
>          - passthrough_devices.matched > 0
>  
> @@ -142,3 +156,4 @@
>      name: "{{ inventory_hostname }}"
>      uri: "{{ libvirt_uri }}"
>      state: running
> +  delegate_to: localhost
> diff --git a/scripts/guestfs.Makefile b/scripts/guestfs.Makefile
> index bd03f58c..f6c350a4 100644
> --- a/scripts/guestfs.Makefile
> +++ b/scripts/guestfs.Makefile
> @@ -79,7 +79,7 @@ bringup_guestfs: $(GUESTFS_BRINGUP_DEPS)
>  		--extra-vars=@./extra_vars.yaml \
>  		--tags network,pool,base_image
>  	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
> -		--limit 'baseline:dev:service' \
> +		--limit 'baseline:dev:service:ai' \

I'm not sure I understand the need for this new ai group. Can you clarify?

Why aren't the baseline and dev groups sufficient for this AI workload?
What is the role of the ai group?

Note: I kept the hunks above to make it easier to reference the part I believe
is most relevant to my questions (hopefully).
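
To make the question concrete: a colon-separated --limit pattern selects the
union of the named groups, so appending ai only changes the host set if some
host is in ai but in none of baseline, dev, or service. Below is a minimal
sketch of that check. The inventory dictionary is made up for illustration
(the quoted hunks do not show which groups the multi-filesystem nodes end up
in), limit_union is a hypothetical helper, and it handles only plain group
unions, not the other pattern operators Ansible supports.

```python
def limit_union(inventory, pattern):
    """Return hosts matched by a colon-separated union of group names.

    Simplified model of an Ansible --limit pattern: unions only, no
    wildcards, intersections, or exclusions.
    """
    selected = []
    for group in pattern.split(":"):
        for host in inventory.get(group, []):
            if host not in selected:
                selected.append(host)
    return selected


# Assumed inventory, for illustration only.
inventory = {
    "baseline": ["debian13-ai-xfs-4k"],
    "dev": ["debian13-ai-xfs-4k-dev"],
    "service": [],
    "ai": ["debian13-ai-xfs-4k", "debian13-ai-xfs-4k-dev"],
}

old_hosts = limit_union(inventory, "baseline:dev:service")
new_hosts = limit_union(inventory, "baseline:dev:service:ai")

# Hosts that only the extended pattern reaches.
only_via_ai = [h for h in new_hosts if h not in old_hosts]

if __name__ == "__main__":
    print("added by ai:", only_via_ai)
```

In this assumed inventory every ai host is already in baseline or dev, so
only_via_ai is empty and the extra group would be redundant. If the generated
hosts file lists the multi-filesystem nodes only under ai, the list would be
non-empty, which would explain the change.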
