From: Luis Chamberlain <mcgrof@kernel.org>
To: Chuck Lever <cel@kernel.org>, Daniel Gomez <da.gomez@kruces.com>,
	hui81.qi@samsung.com, kundan.kumar@samsung.com,
	kdevops@lists.linux.dev
Cc: Luis Chamberlain <mcgrof@kernel.org>
Subject: [PATCH 2/2] ai: add multi-filesystem testing support for Milvus benchmarks
Date: Wed, 27 Aug 2025 02:32:01 -0700
Message-ID: <20250827093202.3539990-3-mcgrof@kernel.org>
In-Reply-To: <20250827093202.3539990-1-mcgrof@kernel.org>
Extend the AI workflow to support testing Milvus across multiple
filesystem configurations simultaneously. This enables comprehensive
performance comparisons between different filesystems and their
configuration options.

Key features:
- Dynamic node generation based on enabled filesystem configurations
- Support for XFS, EXT4, and BTRFS with various mount options
- Per-filesystem result collection and analysis
- A/B testing across all filesystem configurations
- Automated comparison graphs between filesystems

Filesystem configurations:
- XFS: default, nocrc, bigtime with various block sizes (512, 1k, 2k, 4k)
- EXT4: default, nojournal, bigalloc configurations
- BTRFS: default, zlib, lzo, zstd compression options

Defconfigs:
- ai-milvus-multifs: Test 7 filesystem configs with A/B testing
- ai-milvus-multifs-distro: Test with distribution kernels
- ai-milvus-multifs-extended: Extended configs (14 filesystems total)

Node generation:
The system dynamically generates nodes based on enabled filesystem
configurations. With A/B testing enabled, this creates baseline and
dev nodes for each filesystem (e.g., debian13-ai-xfs-4k and
debian13-ai-xfs-4k-dev).
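
The naming scheme can be illustrated with a minimal sketch. The helper
name and the list of configurations are hypothetical and not part of
this patch; only the debian13-ai-xfs-4k and debian13-ai-xfs-4k-dev
names are taken from the example above.

```python
def generate_node_names(prefix, fs_configs, ab_testing=True):
    """Return one baseline node per filesystem config, plus a '-dev'
    twin for each when A/B testing is enabled.

    Hypothetical helper for illustration only; the real generation is
    done by the gen_nodes role.
    """
    names = []
    for cfg in fs_configs:
        base = f"{prefix}-ai-{cfg}"
        names.append(base)
        if ab_testing:
            names.append(f"{base}-dev")
    return names


# Assumed config names; "xfs-4k" matches the example in this message.
nodes = generate_node_names("debian13", ["xfs-4k", "ext4-4k", "btrfs-default"])
for name in nodes:
    print(name)
```

With A/B testing disabled the same call yields only the baseline names,
which is the shape the distro defconfig would be expected to produce.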

Usage:
  make defconfig-ai-milvus-multifs
  make bringup      # Creates nodes for each filesystem
  make ai           # Set up infrastructure on all nodes
  make ai-tests     # Run benchmarks on all filesystems
  make ai-results   # Collect and compare results

This enables systematic evaluation of how different filesystems and
their configurations affect vector database performance.

Generated-by: Claude AI
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
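
Note on result grouping: analyze_results.py groups results per node and
per filesystem configuration, falling back to the result filename when
system_info carries no hostname. The standalone sketch below mirrors
that logic from _extract_node_info() and _extract_filesystem_config().
The function names and the sample filenames are assumptions made for
illustration, and the JSON fallback for unknown filesystems is left out.

```python
def node_from_filename(filename):
    """Return (hostname, is_dev) derived from a result filename."""
    # Strip the "results_" prefix and ".json" suffix.
    hostname = filename.replace("results_", "").replace(".json", "")
    # Drop a trailing iteration number such as "_1" or "_2".
    parts = hostname.split("_")
    if len(parts) > 1 and parts[-1].isdigit():
        hostname = "_".join(parts[:-1])
    return hostname, hostname.endswith("-dev")


def fs_config_key(filename):
    """Return a key such as 'xfs-16k' or 'btrfs' from a result filename."""
    if "xfs" in filename:
        # Check larger sizes first: "4k-" is a substring of "64k-".
        for size in ("64k", "32k", "16k", "4k"):
            if f"{size}-" in filename:
                return f"xfs-{size}"
        return "xfs"
    if "ext4" in filename:
        for size in ("16k", "4k"):
            if size in filename:
                return f"ext4-{size}"
        return "ext4"
    if "btrfs" in filename:
        return "btrfs"
    # The real code falls back to the "filesystem" field of the JSON data.
    return "unknown"


# Hypothetical filenames following the results_<hostname>_<iteration>.json shape.
sample = "results_debian13-ai-xfs-64k-4ks-dev_2.json"
print(node_from_filename(sample), fs_config_key(sample))
```

Checking the larger block sizes first matters because a name containing
"64k-" also contains "4k-", so the reverse order would misclassify it.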
.github/workflows/docker-tests.yml | 6 +
Makefile | 2 +-
defconfigs/ai-milvus-multifs | 67 +
defconfigs/ai-milvus-multifs-distro | 109 ++
defconfigs/ai-milvus-multifs-extended | 108 ++
docs/ai/vector-databases/README.md | 1 -
playbooks/ai_install.yml | 6 +
playbooks/ai_multifs.yml | 24 +
.../host_vars/debian13-ai-xfs-4k-4ks.yml | 10 -
.../files/analyze_results.py | 1132 +++++++++++---
.../files/generate_better_graphs.py | 16 +-
.../files/generate_graphs.py | 888 ++++-------
.../files/generate_html_report.py | 263 +++-
.../roles/ai_collect_results/tasks/main.yml | 42 +-
.../templates/analysis_config.json.j2 | 2 +-
.../roles/ai_milvus_storage/tasks/main.yml | 161 ++
.../tasks/generate_comparison.yml | 279 ++++
playbooks/roles/ai_multifs_run/tasks/main.yml | 23 +
.../tasks/run_single_filesystem.yml | 104 ++
.../templates/milvus_config.json.j2 | 42 +
.../roles/ai_multifs_setup/defaults/main.yml | 49 +
.../roles/ai_multifs_setup/tasks/main.yml | 70 +
.../files/milvus_benchmark.py | 164 +-
playbooks/roles/gen_hosts/tasks/main.yml | 19 +
.../roles/gen_hosts/templates/fstests.j2 | 2 +
playbooks/roles/gen_hosts/templates/gitr.j2 | 2 +
playbooks/roles/gen_hosts/templates/hosts.j2 | 35 +-
.../roles/gen_hosts/templates/nfstest.j2 | 2 +
playbooks/roles/gen_hosts/templates/pynfs.j2 | 2 +
playbooks/roles/gen_nodes/tasks/main.yml | 90 ++
.../roles/guestfs/tasks/bringup/main.yml | 15 +
scripts/guestfs.Makefile | 2 +-
workflows/ai/Kconfig | 13 +
workflows/ai/Kconfig.fs | 118 ++
workflows/ai/Kconfig.multifs | 184 +++
workflows/ai/scripts/analysis_config.json | 2 +-
workflows/ai/scripts/analyze_results.py | 1132 +++++++++++---
workflows/ai/scripts/generate_graphs.py | 1372 ++++-------------
workflows/ai/scripts/generate_html_report.py | 94 +-
39 files changed, 4356 insertions(+), 2296 deletions(-)
create mode 100644 defconfigs/ai-milvus-multifs
create mode 100644 defconfigs/ai-milvus-multifs-distro
create mode 100644 defconfigs/ai-milvus-multifs-extended
create mode 100644 playbooks/ai_multifs.yml
delete mode 100644 playbooks/host_vars/debian13-ai-xfs-4k-4ks.yml
create mode 100644 playbooks/roles/ai_milvus_storage/tasks/main.yml
create mode 100644 playbooks/roles/ai_multifs_run/tasks/generate_comparison.yml
create mode 100644 playbooks/roles/ai_multifs_run/tasks/main.yml
create mode 100644 playbooks/roles/ai_multifs_run/tasks/run_single_filesystem.yml
create mode 100644 playbooks/roles/ai_multifs_run/templates/milvus_config.json.j2
create mode 100644 playbooks/roles/ai_multifs_setup/defaults/main.yml
create mode 100644 playbooks/roles/ai_multifs_setup/tasks/main.yml
create mode 100644 workflows/ai/Kconfig.fs
create mode 100644 workflows/ai/Kconfig.multifs
diff --git a/.github/workflows/docker-tests.yml b/.github/workflows/docker-tests.yml
index c0e0d03d..adea1182 100644
--- a/.github/workflows/docker-tests.yml
+++ b/.github/workflows/docker-tests.yml
@@ -53,3 +53,9 @@ jobs:
echo "Running simple make targets on ${{ matrix.distro_container }} environment"
make mrproper
+ - name: Test fio-tests defconfig
+ run: |
+ echo "Testing fio-tests CI configuration"
+ make defconfig-fio-tests-ci
+ make
+ echo "Configuration test passed for fio-tests"
diff --git a/Makefile b/Makefile
index 8755577e..83c67340 100644
--- a/Makefile
+++ b/Makefile
@@ -226,7 +226,7 @@ include scripts/bringup.Makefile
endif
DEFAULT_DEPS += $(ANSIBLE_INVENTORY_FILE)
-$(ANSIBLE_INVENTORY_FILE): .config $(ANSIBLE_CFG_FILE) $(KDEVOPS_HOSTS_TEMPLATE)
+$(ANSIBLE_INVENTORY_FILE): .config $(ANSIBLE_CFG_FILE) $(KDEVOPS_HOSTS_TEMPLATE) $(KDEVOPS_NODES)
$(Q)ANSIBLE_LOCALHOST_WARNING=False ANSIBLE_INVENTORY_UNPARSED_WARNING=False \
ansible-playbook $(ANSIBLE_VERBOSE) \
$(KDEVOPS_PLAYBOOKS_DIR)/gen_hosts.yml \
diff --git a/defconfigs/ai-milvus-multifs b/defconfigs/ai-milvus-multifs
new file mode 100644
index 00000000..7e5ad971
--- /dev/null
+++ b/defconfigs/ai-milvus-multifs
@@ -0,0 +1,67 @@
+CONFIG_GUESTFS=y
+CONFIG_LIBVIRT=y
+
+# Disable mirror features for CI/testing
+# CONFIG_ENABLE_LOCAL_LINUX_MIRROR is not set
+# CONFIG_USE_LOCAL_LINUX_MIRROR is not set
+# CONFIG_INSTALL_ONLY_GIT_DAEMON is not set
+# CONFIG_MIRROR_INSTALL is not set
+
+CONFIG_WORKFLOWS=y
+CONFIG_WORKFLOW_LINUX_CUSTOM=y
+
+CONFIG_BOOTLINUX=y
+CONFIG_BOOTLINUX_9P=y
+
+# Enable A/B testing with different kernel references
+CONFIG_KDEVOPS_BASELINE_AND_DEV=y
+CONFIG_BOOTLINUX_AB_DIFFERENT_REF=y
+
+# AI workflow configuration
+CONFIG_WORKFLOWS_TESTS=y
+CONFIG_WORKFLOWS_LINUX_TESTS=y
+CONFIG_WORKFLOWS_DEDICATED_WORKFLOW=y
+CONFIG_KDEVOPS_WORKFLOW_DEDICATE_AI=y
+
+# Vector database configuration
+CONFIG_AI_TESTS_VECTOR_DATABASE=y
+CONFIG_AI_VECTOR_DB_MILVUS=y
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER=y
+
+# Enable multi-filesystem testing
+CONFIG_AI_MULTIFS_ENABLE=y
+CONFIG_AI_ENABLE_MULTIFS_TESTING=y
+
+# Enable dedicated Milvus storage with node-based filesystem
+CONFIG_AI_MILVUS_STORAGE_ENABLE=y
+CONFIG_AI_MILVUS_USE_NODE_FS=y
+
+# Test XFS with different block sizes
+CONFIG_AI_MULTIFS_TEST_XFS=y
+CONFIG_AI_MULTIFS_XFS_4K_4KS=y
+CONFIG_AI_MULTIFS_XFS_16K_4KS=y
+CONFIG_AI_MULTIFS_XFS_32K_4KS=y
+CONFIG_AI_MULTIFS_XFS_64K_4KS=y
+
+# Test EXT4 configurations
+CONFIG_AI_MULTIFS_TEST_EXT4=y
+CONFIG_AI_MULTIFS_EXT4_4K=y
+CONFIG_AI_MULTIFS_EXT4_16K_BIGALLOC=y
+
+# Test BTRFS
+CONFIG_AI_MULTIFS_TEST_BTRFS=y
+CONFIG_AI_MULTIFS_BTRFS_DEFAULT=y
+
+# Performance settings
+CONFIG_AI_BENCHMARK_ENABLE_GRAPHING=y
+CONFIG_AI_BENCHMARK_ITERATIONS=5
+
+# Dataset configuration for benchmarking
+CONFIG_AI_VECTOR_DB_MILVUS_DATASET_SIZE=100000
+CONFIG_AI_VECTOR_DB_MILVUS_BATCH_SIZE=10000
+CONFIG_AI_VECTOR_DB_MILVUS_NUM_QUERIES=10000
+
+# Container configuration
+CONFIG_AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_2_5=y
+CONFIG_AI_VECTOR_DB_MILVUS_MEMORY_LIMIT="8g"
+CONFIG_AI_VECTOR_DB_MILVUS_CPU_LIMIT="4.0"
\ No newline at end of file
diff --git a/defconfigs/ai-milvus-multifs-distro b/defconfigs/ai-milvus-multifs-distro
new file mode 100644
index 00000000..fb71f2b5
--- /dev/null
+++ b/defconfigs/ai-milvus-multifs-distro
@@ -0,0 +1,109 @@
+# AI Multi-Filesystem Performance Testing Configuration (Distro Kernel)
+# This configuration enables testing AI workloads across multiple filesystem
+# configurations including XFS (4k, 16k, 32k, and 64k block sizes), ext4 (4k and 16k bigalloc),
+# and btrfs (default profile) using the distribution kernel without A/B testing.
+
+# Base virtualization setup
+CONFIG_LIBVIRT=y
+CONFIG_LIBVIRT_MACHINE_TYPE_Q35=y
+CONFIG_LIBVIRT_STORAGE_POOL_PATH="/opt/kdevops/libvirt"
+CONFIG_LIBVIRT_ENABLE_LARGEIO=y
+CONFIG_LIBVIRT_EXTRA_STORAGE_DRIVE_NVME=y
+CONFIG_LIBVIRT_EXTRA_STORAGE_DRIVE_SIZE="50GiB"
+
+# Network configuration
+CONFIG_LIBVIRT_ENABLE_BRIDGED_NETWORKING=y
+CONFIG_LIBVIRT_NET_NAME="kdevops"
+
+# Host configuration
+CONFIG_KDEVOPS_HOSTS_TEMPLATE="hosts.j2"
+CONFIG_VAGRANT_NVME_DISK_SIZE="50GiB"
+
+# Base system requirements
+CONFIG_WORKFLOWS=y
+CONFIG_WORKFLOWS_TESTS=y
+CONFIG_WORKFLOWS_LINUX_TESTS=y
+CONFIG_WORKFLOWS_DEDICATED_WORKFLOW=y
+CONFIG_KDEVOPS_WORKFLOW_DEDICATE_AI=y
+
+# AI Workflow Configuration
+CONFIG_KDEVOPS_WORKFLOW_ENABLE_AI=y
+CONFIG_AI_TESTS_VECTOR_DATABASE=y
+CONFIG_AI_MILVUS_DOCKER=y
+CONFIG_AI_VECTOR_DB_TYPE_MILVUS=y
+
+# Milvus Configuration
+CONFIG_AI_MILVUS_HOST="localhost"
+CONFIG_AI_MILVUS_PORT=19530
+CONFIG_AI_MILVUS_DATABASE_NAME="ai_benchmark"
+
+# Test Parameters (optimized for multi-fs testing)
+CONFIG_AI_BENCHMARK_ITERATIONS=3
+CONFIG_AI_DATASET_1M=y
+CONFIG_AI_VECTOR_DIM_128=y
+CONFIG_AI_BENCHMARK_RUNTIME="180"
+CONFIG_AI_BENCHMARK_WARMUP_TIME="30"
+
+# Query patterns
+CONFIG_AI_BENCHMARK_QUERY_TOPK_1=y
+CONFIG_AI_BENCHMARK_QUERY_TOPK_10=y
+
+# Batch sizes
+CONFIG_AI_BENCHMARK_BATCH_1=y
+CONFIG_AI_BENCHMARK_BATCH_10=y
+
+# Index configuration
+CONFIG_AI_INDEX_HNSW=y
+CONFIG_AI_INDEX_TYPE="HNSW"
+CONFIG_AI_INDEX_HNSW_M=16
+CONFIG_AI_INDEX_HNSW_EF_CONSTRUCTION=200
+CONFIG_AI_INDEX_HNSW_EF=64
+
+# Results and visualization
+CONFIG_AI_BENCHMARK_RESULTS_DIR="/data/ai-benchmark"
+CONFIG_AI_BENCHMARK_ENABLE_GRAPHING=y
+CONFIG_AI_BENCHMARK_GRAPH_FORMAT="png"
+CONFIG_AI_BENCHMARK_GRAPH_DPI=300
+CONFIG_AI_BENCHMARK_GRAPH_THEME="default"
+
+# Multi-filesystem testing configuration
+CONFIG_AI_ENABLE_MULTIFS_TESTING=y
+CONFIG_AI_MULTIFS_RESULTS_DIR="/data/ai-multifs-benchmark"
+
+# Enable dedicated Milvus storage with node-based filesystem
+CONFIG_AI_MILVUS_STORAGE_ENABLE=y
+CONFIG_AI_MILVUS_USE_NODE_FS=y
+CONFIG_AI_MILVUS_DEVICE="/dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_kdevops3"
+CONFIG_AI_MILVUS_MOUNT_POINT="/data/milvus"
+
+# XFS configurations
+CONFIG_AI_MULTIFS_TEST_XFS=y
+CONFIG_AI_MULTIFS_XFS_4K_4KS=y
+CONFIG_AI_MULTIFS_XFS_16K_4KS=y
+CONFIG_AI_MULTIFS_XFS_32K_4KS=y
+CONFIG_AI_MULTIFS_XFS_64K_4KS=y
+
+# ext4 configurations
+CONFIG_AI_MULTIFS_TEST_EXT4=y
+CONFIG_AI_MULTIFS_EXT4_4K=y
+CONFIG_AI_MULTIFS_EXT4_16K_BIGALLOC=y
+
+# btrfs configurations
+CONFIG_AI_MULTIFS_TEST_BTRFS=y
+CONFIG_AI_MULTIFS_BTRFS_DEFAULT=y
+
+# Standard filesystem configuration (for comparison)
+CONFIG_AI_FILESYSTEM_XFS=y
+CONFIG_AI_FILESYSTEM="xfs"
+CONFIG_AI_FSTYPE="xfs"
+CONFIG_AI_XFS_MKFS_OPTS="-f -s size=4096"
+CONFIG_AI_XFS_MOUNT_OPTS="rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota"
+
+# Use distribution kernel (no kernel building)
+# CONFIG_BOOTLINUX is not set
+
+# Memory configuration
+CONFIG_LIBVIRT_MEM_MB=16384
+
+# Disable A/B testing to use single baseline configuration
+# CONFIG_KDEVOPS_BASELINE_AND_DEV is not set
diff --git a/defconfigs/ai-milvus-multifs-extended b/defconfigs/ai-milvus-multifs-extended
new file mode 100644
index 00000000..7886c8c4
--- /dev/null
+++ b/defconfigs/ai-milvus-multifs-extended
@@ -0,0 +1,108 @@
+# AI Extended Multi-Filesystem Performance Testing Configuration (Distro Kernel)
+# This configuration enables testing AI workloads across multiple filesystem
+# configurations including XFS (4k, 16k, 32k, 64k block sizes), ext4 (4k and 16k bigalloc),
+# and btrfs (default profile) using the distribution kernel with A/B (baseline/dev) testing.
+
+# Base virtualization setup
+CONFIG_LIBVIRT=y
+CONFIG_LIBVIRT_MACHINE_TYPE_Q35=y
+CONFIG_LIBVIRT_STORAGE_POOL_PATH="/opt/kdevops/libvirt"
+CONFIG_LIBVIRT_ENABLE_LARGEIO=y
+CONFIG_LIBVIRT_EXTRA_STORAGE_DRIVE_NVME=y
+CONFIG_LIBVIRT_EXTRA_STORAGE_DRIVE_SIZE="50GiB"
+
+# Network configuration
+CONFIG_LIBVIRT_ENABLE_BRIDGED_NETWORKING=y
+CONFIG_LIBVIRT_NET_NAME="kdevops"
+
+# Host configuration
+CONFIG_KDEVOPS_HOSTS_TEMPLATE="hosts.j2"
+CONFIG_VAGRANT_NVME_DISK_SIZE="50GiB"
+
+# Base system requirements
+CONFIG_WORKFLOWS=y
+CONFIG_WORKFLOWS_TESTS=y
+CONFIG_WORKFLOWS_LINUX_TESTS=y
+CONFIG_WORKFLOWS_DEDICATED_WORKFLOW=y
+CONFIG_KDEVOPS_WORKFLOW_DEDICATE_AI=y
+
+# AI Workflow Configuration
+CONFIG_KDEVOPS_WORKFLOW_ENABLE_AI=y
+CONFIG_AI_TESTS_VECTOR_DATABASE=y
+CONFIG_AI_VECTOR_DB_MILVUS=y
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER=y
+
+# Test Parameters (optimized for multi-fs testing)
+CONFIG_AI_BENCHMARK_ITERATIONS=3
+CONFIG_AI_DATASET_1M=y
+CONFIG_AI_VECTOR_DIM_128=y
+CONFIG_AI_BENCHMARK_RUNTIME="180"
+CONFIG_AI_BENCHMARK_WARMUP_TIME="30"
+
+# Query patterns
+CONFIG_AI_BENCHMARK_QUERY_TOPK_1=y
+CONFIG_AI_BENCHMARK_QUERY_TOPK_10=y
+
+# Batch sizes
+CONFIG_AI_BENCHMARK_BATCH_1=y
+CONFIG_AI_BENCHMARK_BATCH_10=y
+
+# Index configuration
+CONFIG_AI_INDEX_HNSW=y
+CONFIG_AI_INDEX_TYPE="HNSW"
+CONFIG_AI_INDEX_HNSW_M=16
+CONFIG_AI_INDEX_HNSW_EF_CONSTRUCTION=200
+CONFIG_AI_INDEX_HNSW_EF=64
+
+# Results and visualization
+CONFIG_AI_BENCHMARK_RESULTS_DIR="/data/ai-benchmark"
+CONFIG_AI_BENCHMARK_ENABLE_GRAPHING=y
+CONFIG_AI_BENCHMARK_GRAPH_FORMAT="png"
+CONFIG_AI_BENCHMARK_GRAPH_DPI=300
+CONFIG_AI_BENCHMARK_GRAPH_THEME="default"
+
+# Multi-filesystem testing configuration
+CONFIG_AI_MULTIFS_ENABLE=y
+CONFIG_AI_ENABLE_MULTIFS_TESTING=y
+CONFIG_AI_MULTIFS_RESULTS_DIR="/data/ai-multifs-benchmark"
+
+# Enable dedicated Milvus storage with node-based filesystem
+CONFIG_AI_MILVUS_STORAGE_ENABLE=y
+CONFIG_AI_MILVUS_USE_NODE_FS=y
+CONFIG_AI_MILVUS_DEVICE="/dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_kdevops3"
+CONFIG_AI_MILVUS_MOUNT_POINT="/data/milvus"
+
+# Extended XFS configurations (4k, 16k, 32k, 64k block sizes)
+CONFIG_AI_MULTIFS_TEST_XFS=y
+CONFIG_AI_MULTIFS_XFS_4K_4KS=y
+CONFIG_AI_MULTIFS_XFS_16K_4KS=y
+CONFIG_AI_MULTIFS_XFS_32K_4KS=y
+CONFIG_AI_MULTIFS_XFS_64K_4KS=y
+
+# ext4 configurations
+CONFIG_AI_MULTIFS_TEST_EXT4=y
+CONFIG_AI_MULTIFS_EXT4_4K=y
+CONFIG_AI_MULTIFS_EXT4_16K_BIGALLOC=y
+
+# btrfs configurations
+CONFIG_AI_MULTIFS_TEST_BTRFS=y
+CONFIG_AI_MULTIFS_BTRFS_DEFAULT=y
+
+# Standard filesystem configuration (for comparison)
+CONFIG_AI_FILESYSTEM_XFS=y
+CONFIG_AI_FILESYSTEM="xfs"
+CONFIG_AI_FSTYPE="xfs"
+CONFIG_AI_XFS_MKFS_OPTS="-f -s size=4096"
+CONFIG_AI_XFS_MOUNT_OPTS="rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota"
+
+# Use distribution kernel (no kernel building)
+# CONFIG_BOOTLINUX is not set
+
+# Memory configuration
+CONFIG_LIBVIRT_MEM_MB=16384
+
+# Baseline/dev testing setup
+CONFIG_KDEVOPS_BASELINE_AND_DEV=y
+# Build Linux
+CONFIG_WORKFLOW_LINUX_CUSTOM=y
+CONFIG_BOOTLINUX_AB_DIFFERENT_REF=y
diff --git a/docs/ai/vector-databases/README.md b/docs/ai/vector-databases/README.md
index 2a3955d7..0fdd204b 100644
--- a/docs/ai/vector-databases/README.md
+++ b/docs/ai/vector-databases/README.md
@@ -52,7 +52,6 @@ Vector databases heavily depend on storage performance. The workflow tests acros
- **XFS**: Default for many production deployments
- **ext4**: Traditional Linux filesystem
- **btrfs**: Copy-on-write with compression support
-- **ZFS**: Advanced features for data integrity
## Configuration Dimensions
diff --git a/playbooks/ai_install.yml b/playbooks/ai_install.yml
index 70b734e4..38e6671c 100644
--- a/playbooks/ai_install.yml
+++ b/playbooks/ai_install.yml
@@ -4,5 +4,11 @@
become: true
become_user: root
roles:
+ - role: ai_docker_storage
+ when: ai_docker_storage_enable | default(true)
+ tags: ['ai', 'docker', 'storage']
+ - role: ai_milvus_storage
+ when: ai_milvus_storage_enable | default(false)
+ tags: ['ai', 'milvus', 'storage']
- role: milvus
tags: ['ai', 'vector_db', 'milvus', 'install']
diff --git a/playbooks/ai_multifs.yml b/playbooks/ai_multifs.yml
new file mode 100644
index 00000000..637f11f4
--- /dev/null
+++ b/playbooks/ai_multifs.yml
@@ -0,0 +1,24 @@
+---
+- hosts: baseline
+ become: yes
+ gather_facts: yes
+ vars:
+ ai_benchmark_results_dir: "{{ ai_multifs_results_dir | default('/data/ai-multifs-benchmark') }}"
+ roles:
+ - role: ai_multifs_setup
+ - role: ai_multifs_run
+ tasks:
+ - name: Final multi-filesystem testing summary
+ debug:
+ msg: |
+ Multi-filesystem AI benchmark testing completed!
+
+ Results directory: {{ ai_multifs_results_dir }}
+ Comparison report: {{ ai_multifs_results_dir }}/comparison/multi_filesystem_comparison.html
+
+ Individual filesystem results:
+ {% for config in ai_multifs_configurations %}
+ {% if config.enabled %}
+ - {{ config.name }}: {{ ai_multifs_results_dir }}/{{ config.name }}/
+ {% endif %}
+ {% endfor %}
diff --git a/playbooks/host_vars/debian13-ai-xfs-4k-4ks.yml b/playbooks/host_vars/debian13-ai-xfs-4k-4ks.yml
deleted file mode 100644
index ffe9eb28..00000000
--- a/playbooks/host_vars/debian13-ai-xfs-4k-4ks.yml
+++ /dev/null
@@ -1,10 +0,0 @@
----
-# XFS 4k block, 4k sector configuration
-ai_docker_fstype: "xfs"
-ai_docker_xfs_blocksize: 4096
-ai_docker_xfs_sectorsize: 4096
-ai_docker_xfs_mkfs_opts: ""
-filesystem_type: "xfs"
-filesystem_block_size: "4k-4ks"
-ai_filesystem: "xfs"
-ai_data_device_path: "/var/lib/docker"
\ No newline at end of file
diff --git a/playbooks/roles/ai_collect_results/files/analyze_results.py b/playbooks/roles/ai_collect_results/files/analyze_results.py
index 3d11fb11..2dc4a1d6 100755
--- a/playbooks/roles/ai_collect_results/files/analyze_results.py
+++ b/playbooks/roles/ai_collect_results/files/analyze_results.py
@@ -226,6 +226,68 @@ class ResultsAnalyzer:
return fs_info
+ def _extract_filesystem_config(
+ self, result: Dict[str, Any]
+ ) -> tuple[str, str, str]:
+ """Extract filesystem type and block size from result data.
+ Returns (fs_type, block_size, config_key)"""
+ filename = result.get("_file", "")
+
+ # Primary: Extract filesystem type from filename (more reliable than JSON)
+ fs_type = "unknown"
+ block_size = "default"
+
+ if "xfs" in filename:
+ fs_type = "xfs"
+ # Check larger sizes first to avoid substring matches
+ if "64k" in filename and "64k-" in filename:
+ block_size = "64k"
+ elif "32k" in filename and "32k-" in filename:
+ block_size = "32k"
+ elif "16k" in filename and "16k-" in filename:
+ block_size = "16k"
+ elif "4k" in filename and "4k-" in filename:
+ block_size = "4k"
+ elif "ext4" in filename:
+ fs_type = "ext4"
+ if "16k" in filename:
+ block_size = "16k"
+ elif "4k" in filename:
+ block_size = "4k"
+ elif "btrfs" in filename:
+ fs_type = "btrfs"
+ block_size = "default"
+ else:
+ # Fallback to JSON data if filename parsing fails
+ fs_type = result.get("filesystem", "unknown")
+ self.logger.warning(
+ f"Could not determine filesystem from filename {filename}, using JSON data: {fs_type}"
+ )
+
+ config_key = f"{fs_type}-{block_size}" if block_size != "default" else fs_type
+ return fs_type, block_size, config_key
+
+ def _extract_node_info(self, result: Dict[str, Any]) -> tuple[str, bool]:
+ """Extract node hostname and determine if it's a dev node.
+ Returns (hostname, is_dev_node)"""
+ # Get hostname from system_info (preferred) or fall back to filename
+ system_info = result.get("system_info", {})
+ hostname = system_info.get("hostname", "")
+
+ # If no hostname in system_info, try extracting from filename
+ if not hostname:
+ filename = result.get("_file", "")
+ # Remove results_ prefix and .json suffix
+ hostname = filename.replace("results_", "").replace(".json", "")
+ # Remove iteration number if present (_1, _2, etc.)
+ if "_" in hostname and hostname.split("_")[-1].isdigit():
+ hostname = "_".join(hostname.split("_")[:-1])
+
+ # Determine if this is a dev node
+ is_dev = hostname.endswith("-dev")
+
+ return hostname, is_dev
+
def load_results(self) -> bool:
"""Load all result files from the results directory"""
try:
@@ -391,6 +453,8 @@ class ResultsAnalyzer:
html.append(
" .highlight { background-color: #fff3cd; padding: 10px; border-radius: 3px; }"
)
+ html.append(" .baseline-row { background-color: #e8f5e9; }")
+ html.append(" .dev-row { background-color: #e3f2fd; }")
html.append(" </style>")
html.append("</head>")
html.append("<body>")
@@ -486,26 +550,69 @@ class ResultsAnalyzer:
else:
html.append(" <p>No storage device information available.</p>")
- # Filesystem section
- html.append(" <h3>🗂️ Filesystem Configuration</h3>")
- fs_info = self.system_info.get("filesystem_info", {})
- html.append(" <table class='config-table'>")
- html.append(
- " <tr><td>Filesystem Type</td><td>"
- + str(fs_info.get("filesystem_type", "Unknown"))
- + "</td></tr>"
- )
- html.append(
- " <tr><td>Mount Point</td><td>"
- + str(fs_info.get("mount_point", "Unknown"))
- + "</td></tr>"
- )
- html.append(
- " <tr><td>Mount Options</td><td>"
- + str(fs_info.get("mount_options", "Unknown"))
- + "</td></tr>"
- )
- html.append(" </table>")
+ # Node Configuration section - Extract from actual benchmark results
+ html.append(" <h3>🗂️ Node Configuration</h3>")
+
+ # Collect node and filesystem information from benchmark results
+ node_configs = {}
+ for result in self.results_data:
+ # Extract node information
+ hostname, is_dev = self._extract_node_info(result)
+ fs_type, block_size, config_key = self._extract_filesystem_config(
+ result
+ )
+
+ system_info = result.get("system_info", {})
+ data_path = system_info.get("data_path", "/data/milvus")
+ mount_point = system_info.get("mount_point", "/data")
+ kernel_version = system_info.get("kernel_version", "unknown")
+
+ if hostname not in node_configs:
+ node_configs[hostname] = {
+ "hostname": hostname,
+ "node_type": "Development" if is_dev else "Baseline",
+ "filesystem": fs_type,
+ "block_size": block_size,
+ "data_path": data_path,
+ "mount_point": mount_point,
+ "kernel": kernel_version,
+ "test_count": 0,
+ }
+ node_configs[hostname]["test_count"] += 1
+
+ if node_configs:
+ html.append(" <table class='config-table'>")
+ html.append(
+ " <tr><th>Node</th><th>Type</th><th>Filesystem</th><th>Block Size</th><th>Data Path</th><th>Mount Point</th><th>Kernel</th><th>Tests</th></tr>"
+ )
+ # Sort nodes with baseline first, then dev
+ sorted_nodes = sorted(
+ node_configs.items(),
+ key=lambda x: (x[1]["node_type"] != "Baseline", x[0]),
+ )
+ for hostname, config_info in sorted_nodes:
+ row_class = (
+ "dev-row"
+ if config_info["node_type"] == "Development"
+ else "baseline-row"
+ )
+ html.append(f" <tr class='{row_class}'>")
+ html.append(f" <td><strong>{hostname}</strong></td>")
+ html.append(f" <td>{config_info['node_type']}</td>")
+ html.append(f" <td>{config_info['filesystem']}</td>")
+ html.append(f" <td>{config_info['block_size']}</td>")
+ html.append(f" <td>{config_info['data_path']}</td>")
+ html.append(
+ f" <td>{config_info['mount_point']}</td>"
+ )
+ html.append(f" <td>{config_info['kernel']}</td>")
+ html.append(f" <td>{config_info['test_count']}</td>")
+ html.append(f" </tr>")
+ html.append(" </table>")
+ else:
+ html.append(
+ " <p>No node configuration data found in results.</p>"
+ )
html.append(" </div>")
# Test Configuration Section
@@ -551,92 +658,192 @@ class ResultsAnalyzer:
html.append(" </table>")
html.append(" </div>")
- # Performance Results Section
+ # Performance Results Section - Per Node
html.append(" <div class='section'>")
- html.append(" <h2>📊 Performance Results Summary</h2>")
+ html.append(" <h2>📊 Performance Results by Node</h2>")
if self.results_data:
- # Insert performance
- insert_times = [
- r.get("insert_performance", {}).get("total_time_seconds", 0)
- for r in self.results_data
- ]
- insert_rates = [
- r.get("insert_performance", {}).get("vectors_per_second", 0)
- for r in self.results_data
- ]
-
- if insert_times and any(t > 0 for t in insert_times):
- html.append(" <h3>📈 Vector Insert Performance</h3>")
- html.append(" <table class='metric-table'>")
- html.append(
- f" <tr><td>Average Insert Time</td><td>{np.mean(insert_times):.2f} seconds</td></tr>"
- )
- html.append(
- f" <tr><td>Average Insert Rate</td><td>{np.mean(insert_rates):.2f} vectors/sec</td></tr>"
+ # Group results by node
+ node_performance = {}
+
+ for result in self.results_data:
+ # Use node hostname as the grouping key
+ hostname, is_dev = self._extract_node_info(result)
+ fs_type, block_size, config_key = self._extract_filesystem_config(
+ result
)
- html.append(
- f" <tr><td>Insert Rate Range</td><td>{np.min(insert_rates):.2f} - {np.max(insert_rates):.2f} vectors/sec</td></tr>"
- )
- html.append(" </table>")
- # Index performance
- index_times = [
- r.get("index_performance", {}).get("creation_time_seconds", 0)
- for r in self.results_data
- ]
- if index_times and any(t > 0 for t in index_times):
- html.append(" <h3>🔗 Index Creation Performance</h3>")
- html.append(" <table class='metric-table'>")
- html.append(
- f" <tr><td>Average Index Creation Time</td><td>{np.mean(index_times):.2f} seconds</td></tr>"
+ if hostname not in node_performance:
+ node_performance[hostname] = {
+ "hostname": hostname,
+ "node_type": "Development" if is_dev else "Baseline",
+ "insert_rates": [],
+ "insert_times": [],
+ "index_times": [],
+ "query_performance": {},
+ "filesystem": fs_type,
+ "block_size": block_size,
+ }
+
+ # Add insert performance
+ insert_perf = result.get("insert_performance", {})
+ if insert_perf:
+ rate = insert_perf.get("vectors_per_second", 0)
+ time = insert_perf.get("total_time_seconds", 0)
+ if rate > 0:
+ node_performance[hostname]["insert_rates"].append(rate)
+ if time > 0:
+ node_performance[hostname]["insert_times"].append(time)
+
+ # Add index performance
+ index_perf = result.get("index_performance", {})
+ if index_perf:
+ time = index_perf.get("creation_time_seconds", 0)
+ if time > 0:
+ node_performance[hostname]["index_times"].append(time)
+
+ # Collect query performance (use first result for each node)
+ query_perf = result.get("query_performance", {})
+ if (
+ query_perf
+ and not node_performance[hostname]["query_performance"]
+ ):
+ node_performance[hostname]["query_performance"] = query_perf
+
+ # Display results for each node, sorted with baseline first
+ sorted_nodes = sorted(
+ node_performance.items(),
+ key=lambda x: (x[1]["node_type"] != "Baseline", x[0]),
+ )
+ for hostname, perf_data in sorted_nodes:
+ node_type_badge = (
+ "🔵" if perf_data["node_type"] == "Development" else "🟢"
)
html.append(
- f" <tr><td>Index Time Range</td><td>{np.min(index_times):.2f} - {np.max(index_times):.2f} seconds</td></tr>"
+ f" <h3>{node_type_badge} {hostname} ({perf_data['node_type']})</h3>"
)
- html.append(" </table>")
-
- # Query performance
- html.append(" <h3>🔍 Query Performance</h3>")
- first_query_perf = self.results_data[0].get("query_performance", {})
- if first_query_perf:
- html.append(" <table>")
html.append(
- " <tr><th>Query Type</th><th>Batch Size</th><th>QPS</th><th>Avg Latency (ms)</th></tr>"
+ f" <p>Filesystem: {perf_data['filesystem']}, Block Size: {perf_data['block_size']}</p>"
)
- for topk, topk_data in first_query_perf.items():
- for batch, batch_data in topk_data.items():
- qps = batch_data.get("queries_per_second", 0)
- avg_time = batch_data.get("average_time_seconds", 0) * 1000
-
- # Color coding for performance
- qps_class = ""
- if qps > 1000:
- qps_class = "performance-good"
- elif qps > 100:
- qps_class = "performance-warning"
- else:
- qps_class = "performance-poor"
-
- html.append(f" <tr>")
- html.append(
- f" <td>{topk.replace('topk_', 'Top-')}</td>"
- )
- html.append(
- f" <td>{batch.replace('batch_', 'Batch ')}</td>"
- )
- html.append(
- f" <td class='{qps_class}'>{qps:.2f}</td>"
- )
- html.append(f" <td>{avg_time:.2f}</td>")
- html.append(f" </tr>")
+ # Insert performance
+ insert_rates = perf_data["insert_rates"]
+ if insert_rates:
+ html.append(" <h4>📈 Vector Insert Performance</h4>")
+ html.append(" <table class='metric-table'>")
+ html.append(
+ f" <tr><td>Average Insert Rate</td><td>{np.mean(insert_rates):.2f} vectors/sec</td></tr>"
+ )
+ html.append(
+ f" <tr><td>Insert Rate Range</td><td>{np.min(insert_rates):.2f} - {np.max(insert_rates):.2f} vectors/sec</td></tr>"
+ )
+ html.append(
+ f" <tr><td>Test Iterations</td><td>{len(insert_rates)}</td></tr>"
+ )
+ html.append(" </table>")
+
+ # Index performance
+ index_times = perf_data["index_times"]
+ if index_times:
+ html.append(" <h4>🔗 Index Creation Performance</h4>")
+ html.append(" <table class='metric-table'>")
+ html.append(
+ f" <tr><td>Average Index Creation Time</td><td>{np.mean(index_times):.3f} seconds</td></tr>"
+ )
+ html.append(
+ f" <tr><td>Index Time Range</td><td>{np.min(index_times):.3f} - {np.max(index_times):.3f} seconds</td></tr>"
+ )
+ html.append(" </table>")
+
+ # Query performance
+ query_perf = perf_data["query_performance"]
+ if query_perf:
+ html.append(" <h4>🔍 Query Performance</h4>")
+ html.append(" <table>")
+ html.append(
+ " <tr><th>Query Type</th><th>Batch Size</th><th>QPS</th><th>Avg Latency (ms)</th></tr>"
+ )
- html.append(" </table>")
+ for topk, topk_data in query_perf.items():
+ for batch, batch_data in topk_data.items():
+ qps = batch_data.get("queries_per_second", 0)
+ avg_time = (
+ batch_data.get("average_time_seconds", 0) * 1000
+ )
+
+ # Color coding for performance
+ qps_class = ""
+ if qps > 1000:
+ qps_class = "performance-good"
+ elif qps > 100:
+ qps_class = "performance-warning"
+ else:
+ qps_class = "performance-poor"
+
+ html.append(f" <tr>")
+ html.append(
+ f" <td>{topk.replace('topk_', 'Top-')}</td>"
+ )
+ html.append(
+ f" <td>{batch.replace('batch_', 'Batch ')}</td>"
+ )
+ html.append(
+ f" <td class='{qps_class}'>{qps:.2f}</td>"
+ )
+ html.append(f" <td>{avg_time:.2f}</td>")
+ html.append(f" </tr>")
+ html.append(" </table>")
+
+ html.append(" <br>") # Add spacing between configurations
- html.append(" </div>")
+ html.append(" </div>")
# Footer
+ # Performance Graphs Section
+ html.append(" <div class='section'>")
+ html.append(" <h2>📈 Performance Visualizations</h2>")
+ html.append(
+ " <p>The following graphs provide visual analysis of the benchmark results across all tested filesystem configurations:</p>"
+ )
+ html.append(" <ul>")
+ html.append(
+ " <li><strong>Insert Performance:</strong> Shows vector insertion rates and times for each filesystem configuration</li>"
+ )
+ html.append(
+ " <li><strong>Query Performance:</strong> Displays query performance heatmaps for different Top-K and batch sizes</li>"
+ )
+ html.append(
+ " <li><strong>Index Performance:</strong> Compares index creation times across filesystems</li>"
+ )
+ html.append(
+ " <li><strong>Performance Matrix:</strong> Comprehensive comparison matrix of all metrics</li>"
+ )
+ html.append(
+ " <li><strong>Filesystem Comparison:</strong> Side-by-side comparison of filesystem performance</li>"
+ )
+ html.append(" </ul>")
+ html.append(
+ " <p><em>Note: Graphs are generated as separate PNG files in the same directory as this report.</em></p>"
+ )
+ html.append(" <div style='margin-top: 20px;'>")
+ html.append(
+ " <img src='insert_performance.png' alt='Insert Performance' style='max-width: 100%; height: auto; margin-bottom: 20px;'>"
+ )
+ html.append(
+ " <img src='query_performance.png' alt='Query Performance' style='max-width: 100%; height: auto; margin-bottom: 20px;'>"
+ )
+ html.append(
+ " <img src='index_performance.png' alt='Index Performance' style='max-width: 100%; height: auto; margin-bottom: 20px;'>"
+ )
+ html.append(
+ " <img src='performance_matrix.png' alt='Performance Matrix' style='max-width: 100%; height: auto; margin-bottom: 20px;'>"
+ )
+ html.append(
+ " <img src='filesystem_comparison.png' alt='Filesystem Comparison' style='max-width: 100%; height: auto; margin-bottom: 20px;'>"
+ )
+ html.append(" </div>")
+ html.append(" </div>")
+
html.append(" <div class='section'>")
html.append(" <h2>📝 Notes</h2>")
html.append(" <ul>")
@@ -661,10 +868,11 @@ class ResultsAnalyzer:
return "\n".join(html)
except Exception as e:
- self.logger.error(f"Error generating HTML report: {e}")
- return (
- f"<html><body><h1>Error generating HTML report: {e}</h1></body></html>"
- )
+ import traceback
+ from html import escape
+ tb = traceback.format_exc()
+ self.logger.error(f"Error generating HTML report: {e}\n{tb}")
+ return f"<html><body><h1>Error generating HTML report: {escape(str(e))}</h1><pre>{escape(tb)}</pre></body></html>"
def generate_graphs(self) -> bool:
"""Generate performance visualization graphs"""
@@ -691,6 +899,9 @@ class ResultsAnalyzer:
# Graph 4: Performance Comparison Matrix
self._plot_performance_matrix()
+ # Graph 5: Multi-filesystem Comparison (if applicable)
+ self._plot_filesystem_comparison()
+
self.logger.info("Graphs generated successfully")
return True
@@ -699,34 +910,188 @@ class ResultsAnalyzer:
return False
def _plot_insert_performance(self):
- """Plot insert performance metrics"""
- fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+ """Plot insert performance metrics with node differentiation"""
+ # Group data by node
+ node_performance = {}
- # Extract insert data
- iterations = []
- insert_rates = []
- insert_times = []
+ for result in self.results_data:
+ hostname, is_dev = self._extract_node_info(result)
+
+ if hostname not in node_performance:
+ node_performance[hostname] = {
+ "insert_rates": [],
+ "insert_times": [],
+ "iterations": [],
+ "is_dev": is_dev,
+ }
- for i, result in enumerate(self.results_data):
insert_perf = result.get("insert_performance", {})
if insert_perf:
- iterations.append(i + 1)
- insert_rates.append(insert_perf.get("vectors_per_second", 0))
- insert_times.append(insert_perf.get("total_time_seconds", 0))
-
- # Plot insert rate
- ax1.plot(iterations, insert_rates, "b-o", linewidth=2, markersize=6)
- ax1.set_xlabel("Iteration")
- ax1.set_ylabel("Vectors/Second")
- ax1.set_title("Vector Insert Rate Performance")
- ax1.grid(True, alpha=0.3)
-
- # Plot insert time
- ax2.plot(iterations, insert_times, "r-o", linewidth=2, markersize=6)
- ax2.set_xlabel("Iteration")
- ax2.set_ylabel("Total Time (seconds)")
- ax2.set_title("Vector Insert Time Performance")
- ax2.grid(True, alpha=0.3)
+ node_performance[hostname]["insert_rates"].append(
+ insert_perf.get("vectors_per_second", 0)
+ )
+ node_performance[hostname]["insert_times"].append(
+ insert_perf.get("total_time_seconds", 0)
+ )
+ node_performance[hostname]["iterations"].append(
+ len(node_performance[hostname]["insert_rates"])
+ )
+
+ # Check if we have multiple nodes
+ if len(node_performance) > 1:
+ # Multi-node mode: separate lines for each node
+ fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(18, 7))
+
+ # Sort nodes with baseline first, then dev
+ sorted_nodes = sorted(
+ node_performance.items(), key=lambda x: (x[1]["is_dev"], x[0])
+ )
+
+ # Create color palettes for baseline and dev nodes
+ baseline_colors = [
+ "#2E7D32",
+ "#43A047",
+ "#66BB6A",
+ "#81C784",
+ "#A5D6A7",
+ "#C8E6C9",
+ ] # Greens
+ dev_colors = [
+ "#0D47A1",
+ "#1565C0",
+ "#1976D2",
+ "#1E88E5",
+ "#2196F3",
+ "#42A5F5",
+ "#64B5F6",
+ ] # Blues
+
+ # Additional colors if needed
+ extra_colors = [
+ "#E65100",
+ "#F57C00",
+ "#FF9800",
+ "#FFB300",
+ "#FFC107",
+ "#FFCA28",
+ ] # Oranges
+
+ # Line styles to cycle through
+ line_styles = ["-", "--", "-.", ":"]
+ markers = ["o", "s", "^", "v", "D", "p", "*", "h"]
+
+ baseline_idx = 0
+ dev_idx = 0
+
+ # Use different colors and styles for each node
+ for idx, (hostname, perf_data) in enumerate(sorted_nodes):
+ if not perf_data["insert_rates"]:
+ continue
+
+ # Choose color and style based on node type and index
+ if perf_data["is_dev"]:
+ # Development nodes - blues
+ color = dev_colors[dev_idx % len(dev_colors)]
+ linestyle = line_styles[
+ (dev_idx // len(dev_colors)) % len(line_styles)
+ ]
+ marker = markers[4 + (dev_idx % 4)] # Use markers 4-7 for dev
+ label = f"{hostname} (Dev)"
+ dev_idx += 1
+ else:
+ # Baseline nodes - greens
+ color = baseline_colors[baseline_idx % len(baseline_colors)]
+ linestyle = line_styles[
+ (baseline_idx // len(baseline_colors)) % len(line_styles)
+ ]
+ marker = markers[
+ baseline_idx % 4
+ ] # Use first 4 markers for baseline
+ label = f"{hostname} (Baseline)"
+ baseline_idx += 1
+
+ iterations = list(range(1, len(perf_data["insert_rates"]) + 1))
+
+ # Plot insert rate with alpha for better visibility
+ ax1.plot(
+ iterations,
+ perf_data["insert_rates"],
+ color=color,
+ linestyle=linestyle,
+ marker=marker,
+ linewidth=1.5,
+ markersize=5,
+ label=label,
+ alpha=0.8,
+ )
+
+ # Plot insert time
+ ax2.plot(
+ iterations,
+ perf_data["insert_times"],
+ color=color,
+ linestyle=linestyle,
+ marker=marker,
+ linewidth=1.5,
+ markersize=5,
+ label=label,
+ alpha=0.8,
+ )
+
+ ax1.set_xlabel("Iteration")
+ ax1.set_ylabel("Vectors/Second")
+ ax1.set_title("Milvus Insert Rate by Node")
+ ax1.grid(True, alpha=0.3)
+ # Position legend outside plot area for better visibility with many nodes
+ ax1.legend(bbox_to_anchor=(1.05, 1), loc="upper left", fontsize=7, ncol=1)
+
+ ax2.set_xlabel("Iteration")
+ ax2.set_ylabel("Total Time (seconds)")
+ ax2.set_title("Milvus Insert Time by Node")
+ ax2.grid(True, alpha=0.3)
+ # Position legend outside plot area for better visibility with many nodes
+ ax2.legend(bbox_to_anchor=(1.05, 1), loc="upper left", fontsize=7, ncol=1)
+
+ plt.suptitle(
+ "Insert Performance Analysis: Baseline vs Development",
+ fontsize=14,
+ y=1.02,
+ )
+ else:
+ # Single node mode: original behavior
+ fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+
+ # Extract insert data from single node
+ hostname = list(node_performance.keys())[0] if node_performance else None
+ if hostname:
+ perf_data = node_performance[hostname]
+ iterations = list(range(1, len(perf_data["insert_rates"]) + 1))
+
+ # Plot insert rate
+ ax1.plot(
+ iterations,
+ perf_data["insert_rates"],
+ "b-o",
+ linewidth=2,
+ markersize=6,
+ )
+ ax1.set_xlabel("Iteration")
+ ax1.set_ylabel("Vectors/Second")
+ ax1.set_title(f"Vector Insert Rate Performance - {hostname}")
+ ax1.grid(True, alpha=0.3)
+
+ # Plot insert time
+ ax2.plot(
+ iterations,
+ perf_data["insert_times"],
+ "r-o",
+ linewidth=2,
+ markersize=6,
+ )
+ ax2.set_xlabel("Iteration")
+ ax2.set_ylabel("Total Time (seconds)")
+ ax2.set_title(f"Vector Insert Time Performance - {hostname}")
+ ax2.grid(True, alpha=0.3)
plt.tight_layout()
output_file = os.path.join(
@@ -739,52 +1104,110 @@ class ResultsAnalyzer:
plt.close()
def _plot_query_performance(self):
- """Plot query performance metrics"""
+ """Plot query performance metrics comparing baseline vs dev nodes"""
if not self.results_data:
return
- # Collect query performance data
- query_data = []
+ # Group data by filesystem configuration
+ fs_groups = {}
for result in self.results_data:
+ hostname, is_dev = self._extract_node_info(result)
+ fs_type, block_size, config_key = self._extract_filesystem_config(result)
+
+ if config_key not in fs_groups:
+ fs_groups[config_key] = {"baseline": [], "dev": []}
+
query_perf = result.get("query_performance", {})
- for topk, topk_data in query_perf.items():
- for batch, batch_data in topk_data.items():
- query_data.append(
- {
- "topk": topk.replace("topk_", ""),
- "batch": batch.replace("batch_", ""),
- "qps": batch_data.get("queries_per_second", 0),
- "avg_time": batch_data.get("average_time_seconds", 0)
- * 1000, # Convert to ms
- }
- )
+ if query_perf:
+ node_type = "dev" if is_dev else "baseline"
+ for topk, topk_data in query_perf.items():
+ for batch, batch_data in topk_data.items():
+ fs_groups[config_key][node_type].append(
+ {
+ "hostname": hostname,
+ "topk": topk.replace("topk_", ""),
+ "batch": batch.replace("batch_", ""),
+ "qps": batch_data.get("queries_per_second", 0),
+ "avg_time": batch_data.get("average_time_seconds", 0)
+ * 1000,
+ }
+ )
- if not query_data:
+ if not fs_groups:
return
- df = pd.DataFrame(query_data)
+ # Create subplots for each filesystem config
+ n_configs = len(fs_groups)
+ fig_height = max(8, 4 * n_configs)
+ fig, axes = plt.subplots(n_configs, 2, figsize=(16, fig_height))
- # Create subplots
- fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+ if n_configs == 1:
+ axes = axes.reshape(1, -1)
- # QPS heatmap
- qps_pivot = df.pivot_table(
- values="qps", index="topk", columns="batch", aggfunc="mean"
- )
- sns.heatmap(qps_pivot, annot=True, fmt=".1f", ax=ax1, cmap="YlOrRd")
- ax1.set_title("Queries Per Second (QPS)")
- ax1.set_xlabel("Batch Size")
- ax1.set_ylabel("Top-K")
-
- # Latency heatmap
- latency_pivot = df.pivot_table(
- values="avg_time", index="topk", columns="batch", aggfunc="mean"
- )
- sns.heatmap(latency_pivot, annot=True, fmt=".1f", ax=ax2, cmap="YlOrRd")
- ax2.set_title("Average Query Latency (ms)")
- ax2.set_xlabel("Batch Size")
- ax2.set_ylabel("Top-K")
+ for idx, (config_key, data) in enumerate(sorted(fs_groups.items())):
+ # Create DataFrames for baseline and dev
+ baseline_df = (
+ pd.DataFrame(data["baseline"]) if data["baseline"] else pd.DataFrame()
+ )
+ dev_df = pd.DataFrame(data["dev"]) if data["dev"] else pd.DataFrame()
+
+ # Baseline QPS heatmap
+ ax_base = axes[idx][0]
+ if not baseline_df.empty:
+ baseline_pivot = baseline_df.pivot_table(
+ values="qps", index="topk", columns="batch", aggfunc="mean"
+ )
+ sns.heatmap(
+ baseline_pivot,
+ annot=True,
+ fmt=".1f",
+ ax=ax_base,
+ cmap="Greens",
+ cbar_kws={"label": "QPS"},
+ )
+ ax_base.set_title(f"{config_key.upper()} - Baseline QPS")
+ ax_base.set_xlabel("Batch Size")
+ ax_base.set_ylabel("Top-K")
+ else:
+ ax_base.text(
+ 0.5,
+ 0.5,
+ f"No baseline data for {config_key}",
+ ha="center",
+ va="center",
+ transform=ax_base.transAxes,
+ )
+ ax_base.set_title(f"{config_key.upper()} - Baseline QPS")
+ # Dev QPS heatmap
+ ax_dev = axes[idx][1]
+ if not dev_df.empty:
+ dev_pivot = dev_df.pivot_table(
+ values="qps", index="topk", columns="batch", aggfunc="mean"
+ )
+ sns.heatmap(
+ dev_pivot,
+ annot=True,
+ fmt=".1f",
+ ax=ax_dev,
+ cmap="Blues",
+ cbar_kws={"label": "QPS"},
+ )
+ ax_dev.set_title(f"{config_key.upper()} - Development QPS")
+ ax_dev.set_xlabel("Batch Size")
+ ax_dev.set_ylabel("Top-K")
+ else:
+ ax_dev.text(
+ 0.5,
+ 0.5,
+ f"No dev data for {config_key}",
+ ha="center",
+ va="center",
+ transform=ax_dev.transAxes,
+ )
+ ax_dev.set_title(f"{config_key.upper()} - Development QPS")
+
+ plt.suptitle("Query Performance: Baseline vs Development", fontsize=16, y=1.02)
plt.tight_layout()
output_file = os.path.join(
self.output_dir,
@@ -796,32 +1219,101 @@ class ResultsAnalyzer:
plt.close()
def _plot_index_performance(self):
- """Plot index creation performance"""
- iterations = []
- index_times = []
+ """Plot index creation performance comparing baseline vs dev"""
+ # Group by filesystem configuration
+ fs_groups = {}
+
+ for result in self.results_data:
+ hostname, is_dev = self._extract_node_info(result)
+ fs_type, block_size, config_key = self._extract_filesystem_config(result)
+
+ if config_key not in fs_groups:
+ fs_groups[config_key] = {"baseline": [], "dev": []}
- for i, result in enumerate(self.results_data):
index_perf = result.get("index_performance", {})
if index_perf:
- iterations.append(i + 1)
- index_times.append(index_perf.get("creation_time_seconds", 0))
+ creation_time = index_perf.get("creation_time_seconds", 0)
+ if creation_time > 0:
+ node_type = "dev" if is_dev else "baseline"
+ fs_groups[config_key][node_type].append(creation_time)
- if not index_times:
+ if not fs_groups:
return
- plt.figure(figsize=(10, 6))
- plt.bar(iterations, index_times, alpha=0.7, color="green")
- plt.xlabel("Iteration")
- plt.ylabel("Index Creation Time (seconds)")
- plt.title("Index Creation Performance")
- plt.grid(True, alpha=0.3)
-
- # Add average line
- avg_time = np.mean(index_times)
- plt.axhline(
- y=avg_time, color="red", linestyle="--", label=f"Average: {avg_time:.2f}s"
+ # Create comparison bar chart
+ fig, ax = plt.subplots(figsize=(14, 8))
+
+ configs = sorted(fs_groups.keys())
+ x = np.arange(len(configs))
+ width = 0.35
+
+ # Calculate averages for each config
+ baseline_avgs = []
+ dev_avgs = []
+ baseline_stds = []
+ dev_stds = []
+
+ for config in configs:
+ baseline_times = fs_groups[config]["baseline"]
+ dev_times = fs_groups[config]["dev"]
+
+ baseline_avgs.append(np.mean(baseline_times) if baseline_times else 0)
+ dev_avgs.append(np.mean(dev_times) if dev_times else 0)
+ baseline_stds.append(np.std(baseline_times) if baseline_times else 0)
+ dev_stds.append(np.std(dev_times) if dev_times else 0)
+
+ # Create bars
+ bars1 = ax.bar(
+ x - width / 2,
+ baseline_avgs,
+ width,
+ yerr=baseline_stds,
+ label="Baseline",
+ color="#4CAF50",
+ capsize=5,
+ )
+ bars2 = ax.bar(
+ x + width / 2,
+ dev_avgs,
+ width,
+ yerr=dev_stds,
+ label="Development",
+ color="#2196F3",
+ capsize=5,
)
- plt.legend()
+
+ # Add value labels on bars
+ for bar, val in zip(bars1, baseline_avgs):
+ if val > 0:
+ height = bar.get_height()
+ ax.text(
+ bar.get_x() + bar.get_width() / 2.0,
+ height,
+ f"{val:.3f}s",
+ ha="center",
+ va="bottom",
+ fontsize=9,
+ )
+
+ for bar, val in zip(bars2, dev_avgs):
+ if val > 0:
+ height = bar.get_height()
+ ax.text(
+ bar.get_x() + bar.get_width() / 2.0,
+ height,
+ f"{val:.3f}s",
+ ha="center",
+ va="bottom",
+ fontsize=9,
+ )
+
+ ax.set_xlabel("Filesystem Configuration", fontsize=12)
+ ax.set_ylabel("Index Creation Time (seconds)", fontsize=12)
+ ax.set_title("Index Creation Performance: Baseline vs Development", fontsize=14)
+ ax.set_xticks(x)
+ ax.set_xticklabels([c.upper() for c in configs], rotation=45, ha="right")
+ ax.legend(loc="upper right")
+ ax.grid(True, alpha=0.3, axis="y")
output_file = os.path.join(
self.output_dir,
@@ -833,61 +1325,148 @@ class ResultsAnalyzer:
plt.close()
def _plot_performance_matrix(self):
- """Plot comprehensive performance comparison matrix"""
+ """Plot performance comparison matrix for each filesystem config"""
if len(self.results_data) < 2:
return
- # Extract key metrics for comparison
- metrics = []
- for i, result in enumerate(self.results_data):
+ # Group by filesystem configuration
+ fs_metrics = {}
+
+ for result in self.results_data:
+ hostname, is_dev = self._extract_node_info(result)
+ fs_type, block_size, config_key = self._extract_filesystem_config(result)
+
+ if config_key not in fs_metrics:
+ fs_metrics[config_key] = {"baseline": [], "dev": []}
+
+ # Collect metrics
insert_perf = result.get("insert_performance", {})
index_perf = result.get("index_performance", {})
+ query_perf = result.get("query_performance", {})
metric = {
- "iteration": i + 1,
+ "hostname": hostname,
"insert_rate": insert_perf.get("vectors_per_second", 0),
"index_time": index_perf.get("creation_time_seconds", 0),
}
- # Add query metrics
- query_perf = result.get("query_performance", {})
+ # Get representative query performance (topk_10, batch_1)
if "topk_10" in query_perf and "batch_1" in query_perf["topk_10"]:
metric["query_qps"] = query_perf["topk_10"]["batch_1"].get(
"queries_per_second", 0
)
+ else:
+ metric["query_qps"] = 0
- metrics.append(metric)
+ node_type = "dev" if is_dev else "baseline"
+ fs_metrics[config_key][node_type].append(metric)
- df = pd.DataFrame(metrics)
+ if not fs_metrics:
+ return
- # Normalize metrics for comparison
- numeric_cols = ["insert_rate", "index_time", "query_qps"]
- for col in numeric_cols:
- if col in df.columns:
- df[f"{col}_norm"] = (df[col] - df[col].min()) / (
- df[col].max() - df[col].min() + 1e-6
- )
+ # Create subplots for each filesystem
+ n_configs = len(fs_metrics)
+ n_cols = min(3, n_configs)
+ n_rows = (n_configs + n_cols - 1) // n_cols
+
+ fig, axes = plt.subplots(n_rows, n_cols, figsize=(n_cols * 6, n_rows * 5))
+ if n_rows == 1 and n_cols == 1:
+ axes = [[axes]]
+ elif n_rows == 1:
+ axes = [axes]
+ elif n_cols == 1:
+ axes = [[ax] for ax in axes]
+
+ for idx, (config_key, data) in enumerate(sorted(fs_metrics.items())):
+ row = idx // n_cols
+ col = idx % n_cols
+ ax = axes[row][col]
+
+ # Calculate averages
+ baseline_metrics = data["baseline"]
+ dev_metrics = data["dev"]
+
+ if baseline_metrics and dev_metrics:
+ categories = ["Insert Rate\n(vec/s)", "Index Time\n(s)", "Query QPS"]
+
+ baseline_avg = [
+ np.mean([m["insert_rate"] for m in baseline_metrics]),
+ np.mean([m["index_time"] for m in baseline_metrics]),
+ np.mean([m["query_qps"] for m in baseline_metrics]),
+ ]
- # Create radar chart
- fig, ax = plt.subplots(figsize=(10, 8), subplot_kw=dict(projection="polar"))
+ dev_avg = [
+ np.mean([m["insert_rate"] for m in dev_metrics]),
+ np.mean([m["index_time"] for m in dev_metrics]),
+ np.mean([m["query_qps"] for m in dev_metrics]),
+ ]
- angles = np.linspace(0, 2 * np.pi, len(numeric_cols), endpoint=False).tolist()
- angles += angles[:1] # Complete the circle
+ x = np.arange(len(categories))
+ width = 0.35
- for i, row in df.iterrows():
- values = [row.get(f"{col}_norm", 0) for col in numeric_cols]
- values += values[:1] # Complete the circle
+ bars1 = ax.bar(
+ x - width / 2,
+ baseline_avg,
+ width,
+ label="Baseline",
+ color="#4CAF50",
+ )
+ bars2 = ax.bar(
+ x + width / 2, dev_avg, width, label="Development", color="#2196F3"
+ )
- ax.plot(
- angles, values, "o-", linewidth=2, label=f'Iteration {row["iteration"]}'
- )
- ax.fill(angles, values, alpha=0.25)
+ # Add value labels
+ for bar, val in zip(bars1, baseline_avg):
+ height = bar.get_height()
+ ax.text(
+ bar.get_x() + bar.get_width() / 2.0,
+ height,
+ f"{val:.0f}" if val > 100 else f"{val:.2f}",
+ ha="center",
+ va="bottom",
+ fontsize=8,
+ )
- ax.set_xticks(angles[:-1])
- ax.set_xticklabels(["Insert Rate", "Index Time (inv)", "Query QPS"])
- ax.set_ylim(0, 1)
- ax.set_title("Performance Comparison Matrix (Normalized)", y=1.08)
- ax.legend(loc="upper right", bbox_to_anchor=(1.3, 1.0))
+ for bar, val in zip(bars2, dev_avg):
+ height = bar.get_height()
+ ax.text(
+ bar.get_x() + bar.get_width() / 2.0,
+ height,
+ f"{val:.0f}" if val > 100 else f"{val:.2f}",
+ ha="center",
+ va="bottom",
+ fontsize=8,
+ )
+
+ ax.set_xlabel("Metrics")
+ ax.set_ylabel("Value")
+ ax.set_title(f"{config_key.upper()}")
+ ax.set_xticks(x)
+ ax.set_xticklabels(categories)
+ ax.legend(loc="upper right", fontsize=8)
+ ax.grid(True, alpha=0.3, axis="y")
+ else:
+ ax.text(
+ 0.5,
+ 0.5,
+ f"Insufficient data\nfor {config_key}",
+ ha="center",
+ va="center",
+ transform=ax.transAxes,
+ )
+ ax.set_title(f"{config_key.upper()}")
+
+ # Hide unused subplots
+ for idx in range(n_configs, n_rows * n_cols):
+ row = idx // n_cols
+ col = idx % n_cols
+ axes[row][col].set_visible(False)
+
+ plt.suptitle(
+ "Performance Comparison Matrix: Baseline vs Development",
+ fontsize=14,
+ y=1.02,
+ )
output_file = os.path.join(
self.output_dir,
@@ -898,6 +1477,149 @@ class ResultsAnalyzer:
)
plt.close()
+ def _plot_filesystem_comparison(self):
+ """Plot node performance comparison chart"""
+ if len(self.results_data) < 2:
+ return
+
+ # Group results by node
+ node_performance = {}
+
+ for result in self.results_data:
+ hostname, is_dev = self._extract_node_info(result)
+
+ if hostname not in node_performance:
+ node_performance[hostname] = {
+ "insert_rates": [],
+ "index_times": [],
+ "query_qps": [],
+ "is_dev": is_dev,
+ }
+
+ # Collect metrics
+ insert_perf = result.get("insert_performance", {})
+ if insert_perf:
+ node_performance[hostname]["insert_rates"].append(
+ insert_perf.get("vectors_per_second", 0)
+ )
+
+ index_perf = result.get("index_performance", {})
+ if index_perf:
+ node_performance[hostname]["index_times"].append(
+ index_perf.get("creation_time_seconds", 0)
+ )
+
+ # Get top-10 batch-1 query performance as representative
+ query_perf = result.get("query_performance", {})
+ if "topk_10" in query_perf and "batch_1" in query_perf["topk_10"]:
+ qps = query_perf["topk_10"]["batch_1"].get("queries_per_second", 0)
+ node_performance[hostname]["query_qps"].append(qps)
+
+ # Only create comparison if we have multiple nodes
+ if len(node_performance) > 1:
+ # Calculate averages
+ node_metrics = {}
+ for hostname, perf_data in node_performance.items():
+ node_metrics[hostname] = {
+ "avg_insert_rate": (
+ np.mean(perf_data["insert_rates"])
+ if perf_data["insert_rates"]
+ else 0
+ ),
+ "avg_index_time": (
+ np.mean(perf_data["index_times"])
+ if perf_data["index_times"]
+ else 0
+ ),
+ "avg_query_qps": (
+ np.mean(perf_data["query_qps"]) if perf_data["query_qps"] else 0
+ ),
+ "is_dev": perf_data["is_dev"],
+ }
+
+ # Create comparison bar chart with more space
+ fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(24, 8))
+
+ # Sort nodes with baseline first
+ sorted_nodes = sorted(
+ node_metrics.items(), key=lambda x: (x[1]["is_dev"], x[0])
+ )
+ node_names = [hostname for hostname, _ in sorted_nodes]
+
+ # Use different colors for baseline vs dev
+ colors = [
+ "#4CAF50" if not node_metrics[hostname]["is_dev"] else "#2196F3"
+ for hostname in node_names
+ ]
+
+ # Add labels for clarity
+ labels = [
+ f"{hostname}\n({'Dev' if node_metrics[hostname]['is_dev'] else 'Baseline'})"
+ for hostname in node_names
+ ]
+
+ # Insert rate comparison
+ insert_rates = [
+ node_metrics[hostname]["avg_insert_rate"] for hostname in node_names
+ ]
+ bars1 = ax1.bar(labels, insert_rates, color=colors)
+ ax1.set_title("Average Milvus Insert Rate by Node")
+ ax1.set_ylabel("Vectors/Second")
+ # Rotate labels for better readability
+ ax1.set_xticklabels(labels, rotation=45, ha="right", fontsize=8)
+
+ # Index time comparison (lower is better)
+ index_times = [
+ node_metrics[hostname]["avg_index_time"] for hostname in node_names
+ ]
+ bars2 = ax2.bar(labels, index_times, color=colors)
+ ax2.set_title("Average Milvus Index Time by Node")
+ ax2.set_ylabel("Seconds (Lower is Better)")
+ ax2.set_xticklabels(labels, rotation=45, ha="right", fontsize=8)
+
+ # Query QPS comparison
+ query_qps = [
+ node_metrics[hostname]["avg_query_qps"] for hostname in node_names
+ ]
+ bars3 = ax3.bar(labels, query_qps, color=colors)
+ ax3.set_title("Average Milvus Query QPS by Node")
+ ax3.set_ylabel("Queries/Second")
+ ax3.set_xticklabels(labels, rotation=45, ha="right", fontsize=8)
+
+ # Add value labels on bars
+ for bars, values in [
+ (bars1, insert_rates),
+ (bars2, index_times),
+ (bars3, query_qps),
+ ]:
+ for bar, value in zip(bars, values):
+ height = bar.get_height()
+ ax = bar.axes
+ ax.text(
+ bar.get_x() + bar.get_width() / 2.0,
+ height + height * 0.01,
+ f"{value:.1f}",
+ ha="center",
+ va="bottom",
+ fontsize=10,
+ )
+
+ plt.suptitle(
+ "Milvus Performance Comparison: Baseline vs Development Nodes",
+ fontsize=16,
+ y=1.02,
+ )
+ plt.tight_layout()
+
+ output_file = os.path.join(
+ self.output_dir,
+ f"filesystem_comparison.{self.config.get('graph_format', 'png')}",
+ )
+ plt.savefig(
+ output_file, dpi=self.config.get("graph_dpi", 300), bbox_inches="tight"
+ )
+ plt.close()
+
def analyze(self) -> bool:
"""Run complete analysis"""
self.logger.info("Starting results analysis...")
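The baseline/dev split used by the plotting helpers above can be sketched in isolation. This is a minimal illustration and not code from the patch: `split_node` and `average_insert_rates` are hypothetical names, the sample values are invented, and pairing a dev node with its baseline by stripping a trailing `-dev` is an assumption based on the node naming in the commit message (debian13-ai-xfs-4k and debian13-ai-xfs-4k-dev).

```python
from collections import defaultdict
from statistics import mean


def split_node(hostname):
    """Return (base_name, is_dev); a trailing "-dev" marks a development node."""
    if hostname.endswith("-dev"):
        return hostname[: -len("-dev")], True
    return hostname, False


def average_insert_rates(results):
    """Average vectors_per_second per node pair, split into baseline and dev."""
    groups = defaultdict(lambda: {"baseline": [], "dev": []})
    for result in results:
        hostname = result.get("system_info", {}).get("hostname", "")
        base, is_dev = split_node(hostname)
        rate = result.get("insert_performance", {}).get("vectors_per_second")
        if rate is None:
            continue  # skip runs without insert data instead of averaging in 0
        groups[base]["dev" if is_dev else "baseline"].append(rate)

    # None (not 0) marks a side with no samples, so it cannot be mistaken
    # for a measured rate of zero.
    return {
        base: {kind: (mean(vals) if vals else None) for kind, vals in data.items()}
        for base, data in sorted(groups.items())
    }


# Invented sample data, shaped like the result dictionaries used above.
sample = [
    {"system_info": {"hostname": "debian13-ai-xfs-4k"},
     "insert_performance": {"vectors_per_second": 1000.0}},
    {"system_info": {"hostname": "debian13-ai-xfs-4k"},
     "insert_performance": {"vectors_per_second": 1200.0}},
    {"system_info": {"hostname": "debian13-ai-xfs-4k-dev"},
     "insert_performance": {"vectors_per_second": 1500.0}},
]
averages = average_insert_rates(sample)
```

Returning None for an empty side is a design choice of this sketch; the plotting code in the patch substitutes 0 so that a bar can still be drawn.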
diff --git a/playbooks/roles/ai_collect_results/files/generate_better_graphs.py b/playbooks/roles/ai_collect_results/files/generate_better_graphs.py
index 645bac9e..b3681ff9 100755
--- a/playbooks/roles/ai_collect_results/files/generate_better_graphs.py
+++ b/playbooks/roles/ai_collect_results/files/generate_better_graphs.py
@@ -29,17 +29,18 @@ def extract_filesystem_from_filename(filename):
if "_" in node_name:
parts = node_name.split("_")
node_name = "_".join(parts[:-1]) # Remove last part (iteration)
-
+
# Extract filesystem type from node name
if "-xfs-" in node_name:
return "xfs"
elif "-ext4-" in node_name:
- return "ext4"
+ return "ext4"
elif "-btrfs-" in node_name:
return "btrfs"
-
+
return "unknown"
+
def extract_node_config_from_filename(filename):
"""Extract detailed node configuration from filename"""
# Expected format: results_debian13-ai-xfs-4k-4ks_1.json
@@ -50,14 +51,15 @@ def extract_node_config_from_filename(filename):
if "_" in node_name:
parts = node_name.split("_")
node_name = "_".join(parts[:-1]) # Remove last part (iteration)
-
+
# Remove -dev suffix if present
node_name = node_name.replace("-dev", "")
-
+
return node_name.replace("debian13-ai-", "")
-
+
return "unknown"
+
def detect_filesystem():
"""Detect the filesystem type of /data on test nodes"""
# This is now a fallback - we primarily use filename-based detection
@@ -104,7 +106,7 @@ def load_results(results_dir):
# Extract node type from filename
filename = os.path.basename(json_file)
data["filename"] = filename
-
+
# Extract filesystem type and config from filename
data["filesystem"] = extract_filesystem_from_filename(filename)
data["node_config"] = extract_node_config_from_filename(filename)
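The filename-based detection in the hunks above can be illustrated with a single parser. This is a sketch under assumptions, not the patch's implementation: `parse_result_filename` and `FILENAME_RE` are hypothetical names, and the layout `results_<node>[_<iteration>].json` is inferred from the "Expected format" comment (results_debian13-ai-xfs-4k-4ks_1.json). Unlike the substring checks in the patch, it also accepts a node name that ends directly in the filesystem type.

```python
import re

# Assumed layout: results_<node>[_<iteration>].json
FILENAME_RE = re.compile(r"^results_(?P<node>.+?)(?:_(?P<iteration>\d+))?\.json$")

KNOWN_FILESYSTEMS = ("xfs", "ext4", "btrfs")


def parse_result_filename(filename):
    """Return fs type, node config, dev flag and iteration, or None if no match."""
    match = FILENAME_RE.match(filename)
    if not match:
        return None

    node = match.group("node")
    is_dev = node.endswith("-dev")
    if is_dev:
        node = node[: -len("-dev")]

    # Append "-" so a name ending in the filesystem type ("...-btrfs") matches too.
    fs = "unknown"
    for candidate in KNOWN_FILESYSTEMS:
        if f"-{candidate}-" in f"{node}-":
            fs = candidate
            break

    # Drop the distro prefix, e.g. "debian13-ai-xfs-4k-4ks" -> "xfs-4k-4ks".
    config = node.split("-ai-", 1)[-1]

    iteration = match.group("iteration")
    return {
        "fs": fs,
        "config": config,
        "is_dev": is_dev,
        "iteration": int(iteration) if iteration is not None else None,
    }


parsed = parse_result_filename("results_debian13-ai-xfs-4k-4ks_1.json")
```

Files that do not follow the results_ prefix, such as a consolidated JSON summary, yield None and can be skipped by the caller.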
diff --git a/playbooks/roles/ai_collect_results/files/generate_graphs.py b/playbooks/roles/ai_collect_results/files/generate_graphs.py
index 53a835e2..fafc62bf 100755
--- a/playbooks/roles/ai_collect_results/files/generate_graphs.py
+++ b/playbooks/roles/ai_collect_results/files/generate_graphs.py
@@ -9,7 +9,6 @@ import sys
import glob
import numpy as np
import matplotlib
-
matplotlib.use("Agg") # Use non-interactive backend
import matplotlib.pyplot as plt
from datetime import datetime
@@ -17,68 +16,78 @@ from pathlib import Path
from collections import defaultdict
+def _extract_filesystem_config(result):
+ """Extract filesystem type and block size from result data.
+ Returns (fs_type, block_size, config_key)"""
+ filename = result.get("_file", "")
+
+ # Primary: Extract filesystem type from filename (more reliable than JSON)
+ fs_type = "unknown"
+ block_size = "default"
+
+ if "xfs" in filename:
+ fs_type = "xfs"
+ # Check larger sizes first: "4k-" is also a substring of "64k-"
+ if "64k-" in filename:
+ block_size = "64k"
+ elif "32k-" in filename:
+ block_size = "32k"
+ elif "16k-" in filename:
+ block_size = "16k"
+ elif "4k-" in filename:
+ block_size = "4k"
+ elif "ext4" in filename:
+ fs_type = "ext4"
+ if "4k-" in filename:
+ block_size = "4k"
+ elif "16k-" in filename:
+ block_size = "16k"
+ elif "btrfs" in filename:
+ fs_type = "btrfs"
+
+ # Fallback: Check JSON data if filename parsing failed
+ if fs_type == "unknown":
+ fs_type = result.get("filesystem", "unknown")
+
+ # Create descriptive config key
+ config_key = f"{fs_type}-{block_size}" if block_size != "default" else fs_type
+ return fs_type, block_size, config_key
+
+
+def _extract_node_info(result):
+ """Extract node hostname and determine if it's a dev node.
+ Returns (hostname, is_dev_node)"""
+ # Get hostname from system_info (preferred) or fall back to filename
+ system_info = result.get("system_info", {})
+ hostname = system_info.get("hostname", "")
+
+ # If no hostname in system_info, try extracting from filename
+ if not hostname:
+ filename = result.get("_file", "")
+ # Remove results_ prefix and .json suffix
+ hostname = filename.replace("results_", "").replace(".json", "")
+ # Remove iteration number if present (_1, _2, etc.)
+ if "_" in hostname and hostname.split("_")[-1].isdigit():
+ hostname = "_".join(hostname.split("_")[:-1])
+
+ # Determine if this is a dev node
+ is_dev = hostname.endswith("-dev")
+
+ return hostname, is_dev
+
+
def load_results(results_dir):
"""Load all JSON result files from the directory"""
results = []
- json_files = glob.glob(os.path.join(results_dir, "*.json"))
+ # Only load results_*.json files, not consolidated or other JSON files
+ json_files = glob.glob(os.path.join(results_dir, "results_*.json"))
for json_file in json_files:
try:
with open(json_file, "r") as f:
data = json.load(f)
- # Extract filesystem info - prefer from JSON data over filename
- filename = os.path.basename(json_file)
-
- # First, try to get filesystem from the JSON data itself
- fs_type = data.get("filesystem", None)
-
- # If not in JSON, try to parse from filename (backwards compatibility)
- if not fs_type:
- parts = filename.replace("results_", "").replace(".json", "").split("-")
-
- # Parse host info
- if "debian13-ai-" in filename:
- host_parts = (
- filename.replace("results_debian13-ai-", "")
- .replace("_1.json", "")
- .replace("_2.json", "")
- .replace("_3.json", "")
- .split("-")
- )
- if "xfs" in host_parts[0]:
- fs_type = "xfs"
- # Extract block size (e.g., "4k", "16k", etc.)
- block_size = host_parts[1] if len(host_parts) > 1 else "unknown"
- elif "ext4" in host_parts[0]:
- fs_type = "ext4"
- block_size = host_parts[1] if len(host_parts) > 1 else "4k"
- elif "btrfs" in host_parts[0]:
- fs_type = "btrfs"
- block_size = "default"
- else:
- fs_type = "unknown"
- block_size = "unknown"
- else:
- fs_type = "unknown"
- block_size = "unknown"
- else:
- # If filesystem came from JSON, set appropriate block size
- if fs_type == "btrfs":
- block_size = "default"
- elif fs_type in ["ext4", "xfs"]:
- block_size = data.get("block_size", "4k")
- else:
- block_size = data.get("block_size", "default")
-
- is_dev = "dev" in filename
-
- # Use filesystem from JSON if available, otherwise use parsed value
- if "filesystem" not in data:
- data["filesystem"] = fs_type
- data["block_size"] = block_size
- data["is_dev"] = is_dev
- data["filename"] = filename
-
+ # Add filename for filesystem detection
+ data["_file"] = os.path.basename(json_file)
results.append(data)
except Exception as e:
print(f"Error loading {json_file}: {e}")
@@ -86,554 +95,243 @@ def load_results(results_dir):
return results
-def create_filesystem_comparison_chart(results, output_dir):
- """Create a bar chart comparing performance across filesystems"""
- # Group by filesystem and baseline/dev
- fs_data = defaultdict(lambda: {"baseline": [], "dev": []})
-
- for result in results:
- fs = result.get("filesystem", "unknown")
- category = "dev" if result.get("is_dev", False) else "baseline"
-
- # Extract actual performance data from results
- if "insert_performance" in result:
- insert_qps = result["insert_performance"].get("vectors_per_second", 0)
- else:
- insert_qps = 0
- fs_data[fs][category].append(insert_qps)
-
- # Prepare data for plotting
- filesystems = list(fs_data.keys())
- baseline_means = [
- np.mean(fs_data[fs]["baseline"]) if fs_data[fs]["baseline"] else 0
- for fs in filesystems
- ]
- dev_means = [
- np.mean(fs_data[fs]["dev"]) if fs_data[fs]["dev"] else 0 for fs in filesystems
- ]
-
- x = np.arange(len(filesystems))
- width = 0.35
-
- fig, ax = plt.subplots(figsize=(10, 6))
- baseline_bars = ax.bar(
- x - width / 2, baseline_means, width, label="Baseline", color="#1f77b4"
- )
- dev_bars = ax.bar(
- x + width / 2, dev_means, width, label="Development", color="#ff7f0e"
- )
-
- ax.set_xlabel("Filesystem")
- ax.set_ylabel("Insert QPS")
- ax.set_title("Vector Database Performance by Filesystem")
- ax.set_xticks(x)
- ax.set_xticklabels(filesystems)
- ax.legend()
- ax.grid(True, alpha=0.3)
-
- # Add value labels on bars
- for bars in [baseline_bars, dev_bars]:
- for bar in bars:
- height = bar.get_height()
- if height > 0:
- ax.annotate(
- f"{height:.0f}",
- xy=(bar.get_x() + bar.get_width() / 2, height),
- xytext=(0, 3),
- textcoords="offset points",
- ha="center",
- va="bottom",
- )
-
- plt.tight_layout()
- plt.savefig(os.path.join(output_dir, "filesystem_comparison.png"), dpi=150)
- plt.close()
-
-
-def create_block_size_analysis(results, output_dir):
- """Create analysis for different block sizes (XFS specific)"""
- # Filter XFS results
- xfs_results = [r for r in results if r.get("filesystem") == "xfs"]
-
- if not xfs_results:
+def create_simple_performance_trends(results, output_dir):
+ """Create multi-node performance trends chart"""
+ if not results:
return
- # Group by block size
- block_size_data = defaultdict(lambda: {"baseline": [], "dev": []})
-
- for result in xfs_results:
- block_size = result.get("block_size", "unknown")
- category = "dev" if result.get("is_dev", False) else "baseline"
- if "insert_performance" in result:
- insert_qps = result["insert_performance"].get("vectors_per_second", 0)
- else:
- insert_qps = 0
- block_size_data[block_size][category].append(insert_qps)
-
- # Sort block sizes
- block_sizes = sorted(
- block_size_data.keys(),
- key=lambda x: (
- int(x.replace("k", "").replace("s", ""))
- if x not in ["unknown", "default"]
- else 0
- ),
- )
-
- # Create grouped bar chart
- baseline_means = [
- (
- np.mean(block_size_data[bs]["baseline"])
- if block_size_data[bs]["baseline"]
- else 0
- )
- for bs in block_sizes
- ]
- dev_means = [
- np.mean(block_size_data[bs]["dev"]) if block_size_data[bs]["dev"] else 0
- for bs in block_sizes
- ]
-
- x = np.arange(len(block_sizes))
- width = 0.35
-
- fig, ax = plt.subplots(figsize=(12, 6))
- baseline_bars = ax.bar(
- x - width / 2, baseline_means, width, label="Baseline", color="#2ca02c"
- )
- dev_bars = ax.bar(
- x + width / 2, dev_means, width, label="Development", color="#d62728"
- )
-
- ax.set_xlabel("Block Size")
- ax.set_ylabel("Insert QPS")
- ax.set_title("XFS Performance by Block Size")
- ax.set_xticks(x)
- ax.set_xticklabels(block_sizes)
- ax.legend()
- ax.grid(True, alpha=0.3)
-
- # Add value labels
- for bars in [baseline_bars, dev_bars]:
- for bar in bars:
- height = bar.get_height()
- if height > 0:
- ax.annotate(
- f"{height:.0f}",
- xy=(bar.get_x() + bar.get_width() / 2, height),
- xytext=(0, 3),
- textcoords="offset points",
- ha="center",
- va="bottom",
- )
-
- plt.tight_layout()
- plt.savefig(os.path.join(output_dir, "xfs_block_size_analysis.png"), dpi=150)
- plt.close()
-
-
-def create_heatmap_analysis(results, output_dir):
- """Create a heatmap showing performance across all configurations"""
- # Group data by configuration and version
- config_data = defaultdict(
- lambda: {
- "baseline": {"insert": 0, "query": 0},
- "dev": {"insert": 0, "query": 0},
- }
- )
+ # Group results by node
+ node_performance = defaultdict(lambda: {
+ "insert_rates": [],
+ "insert_times": [],
+ "iterations": [],
+ "is_dev": False,
+ })
for result in results:
- fs = result.get("filesystem", "unknown")
- block_size = result.get("block_size", "default")
- config = f"{fs}-{block_size}"
- version = "dev" if result.get("is_dev", False) else "baseline"
-
- # Get actual insert performance
- if "insert_performance" in result:
- insert_qps = result["insert_performance"].get("vectors_per_second", 0)
- else:
- insert_qps = 0
-
- # Calculate average query QPS
- query_qps = 0
- if "query_performance" in result:
- qp = result["query_performance"]
- total_qps = 0
- count = 0
- for topk_key in ["topk_1", "topk_10", "topk_100"]:
- if topk_key in qp:
- for batch_key in ["batch_1", "batch_10", "batch_100"]:
- if batch_key in qp[topk_key]:
- total_qps += qp[topk_key][batch_key].get(
- "queries_per_second", 0
- )
- count += 1
- if count > 0:
- query_qps = total_qps / count
-
- config_data[config][version]["insert"] = insert_qps
- config_data[config][version]["query"] = query_qps
-
- # Sort configurations
- configs = sorted(config_data.keys())
-
- # Prepare data for heatmap
- insert_baseline = [config_data[c]["baseline"]["insert"] for c in configs]
- insert_dev = [config_data[c]["dev"]["insert"] for c in configs]
- query_baseline = [config_data[c]["baseline"]["query"] for c in configs]
- query_dev = [config_data[c]["dev"]["query"] for c in configs]
-
- # Create figure with custom heatmap
- fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 8))
-
- # Create data matrices
- insert_data = np.array([insert_baseline, insert_dev]).T
- query_data = np.array([query_baseline, query_dev]).T
-
- # Insert QPS heatmap
- im1 = ax1.imshow(insert_data, cmap="YlOrRd", aspect="auto")
- ax1.set_xticks([0, 1])
- ax1.set_xticklabels(["Baseline", "Development"])
- ax1.set_yticks(range(len(configs)))
- ax1.set_yticklabels(configs)
- ax1.set_title("Insert Performance Heatmap")
- ax1.set_ylabel("Configuration")
-
- # Add text annotations
- for i in range(len(configs)):
- for j in range(2):
- text = ax1.text(
- j,
- i,
- f"{int(insert_data[i, j])}",
- ha="center",
- va="center",
- color="black",
- )
+ hostname, is_dev = _extract_node_info(result)
+
+ if hostname not in node_performance:
+ node_performance[hostname] = {
+ "insert_rates": [],
+ "insert_times": [],
+ "iterations": [],
+ "is_dev": is_dev,
+ }
- # Add colorbar
- cbar1 = plt.colorbar(im1, ax=ax1)
- cbar1.set_label("Insert QPS")
-
- # Query QPS heatmap
- im2 = ax2.imshow(query_data, cmap="YlGnBu", aspect="auto")
- ax2.set_xticks([0, 1])
- ax2.set_xticklabels(["Baseline", "Development"])
- ax2.set_yticks(range(len(configs)))
- ax2.set_yticklabels(configs)
- ax2.set_title("Query Performance Heatmap")
-
- # Add text annotations
- for i in range(len(configs)):
- for j in range(2):
- text = ax2.text(
- j,
- i,
- f"{int(query_data[i, j])}",
- ha="center",
- va="center",
- color="black",
+ insert_perf = result.get("insert_performance", {})
+ if insert_perf:
+ node_performance[hostname]["insert_rates"].append(
+ insert_perf.get("vectors_per_second", 0)
+ )
+ node_performance[hostname]["insert_times"].append(
+ insert_perf.get("total_time_seconds", 0)
+ )
+ node_performance[hostname]["iterations"].append(
+ len(node_performance[hostname]["insert_rates"])
)
- # Add colorbar
- cbar2 = plt.colorbar(im2, ax=ax2)
- cbar2.set_label("Query QPS")
-
- plt.tight_layout()
- plt.savefig(os.path.join(output_dir, "performance_heatmap.png"), dpi=150)
- plt.close()
-
-
-def create_performance_trends(results, output_dir):
- """Create line charts showing performance trends"""
- # Group by filesystem type
- fs_types = defaultdict(
- lambda: {
- "configs": [],
- "baseline_insert": [],
- "dev_insert": [],
- "baseline_query": [],
- "dev_query": [],
- }
- )
-
- for result in results:
- fs = result.get("filesystem", "unknown")
- block_size = result.get("block_size", "default")
- config = f"{block_size}"
-
- if config not in fs_types[fs]["configs"]:
- fs_types[fs]["configs"].append(config)
- fs_types[fs]["baseline_insert"].append(0)
- fs_types[fs]["dev_insert"].append(0)
- fs_types[fs]["baseline_query"].append(0)
- fs_types[fs]["dev_query"].append(0)
-
- idx = fs_types[fs]["configs"].index(config)
-
- # Calculate average query QPS from all test configurations
- query_qps = 0
- if "query_performance" in result:
- qp = result["query_performance"]
- total_qps = 0
- count = 0
- for topk_key in ["topk_1", "topk_10", "topk_100"]:
- if topk_key in qp:
- for batch_key in ["batch_1", "batch_10", "batch_100"]:
- if batch_key in qp[topk_key]:
- total_qps += qp[topk_key][batch_key].get(
- "queries_per_second", 0
- )
- count += 1
- if count > 0:
- query_qps = total_qps / count
-
- if result.get("is_dev", False):
- if "insert_performance" in result:
- fs_types[fs]["dev_insert"][idx] = result["insert_performance"].get(
- "vectors_per_second", 0
- )
- fs_types[fs]["dev_query"][idx] = query_qps
- else:
- if "insert_performance" in result:
- fs_types[fs]["baseline_insert"][idx] = result["insert_performance"].get(
- "vectors_per_second", 0
- )
- fs_types[fs]["baseline_query"][idx] = query_qps
-
- # Create separate plots for each filesystem
- for fs, data in fs_types.items():
- if not data["configs"]:
- continue
-
- fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 10))
-
- x = range(len(data["configs"]))
-
- # Insert performance
- ax1.plot(
- x,
- data["baseline_insert"],
- "o-",
- label="Baseline",
- linewidth=2,
- markersize=8,
- )
- ax1.plot(
- x, data["dev_insert"], "s-", label="Development", linewidth=2, markersize=8
- )
- ax1.set_xlabel("Configuration")
- ax1.set_ylabel("Insert QPS")
- ax1.set_title(f"{fs.upper()} Insert Performance")
- ax1.set_xticks(x)
- ax1.set_xticklabels(data["configs"])
- ax1.legend()
+ # Check if we have multi-filesystem data
+ if len(node_performance) > 1:
+ # Multi-filesystem mode: separate lines for each filesystem
+ fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+
+ colors = ["b", "r", "g", "m", "c", "y", "k"]
+ color_idx = 0
+
+ for config_key, perf_data in node_performance.items():
+ if not perf_data["insert_rates"]:
+ continue
+
+ color = colors[color_idx % len(colors)]
+ iterations = list(range(1, len(perf_data["insert_rates"]) + 1))
+
+ # Plot insert rate
+ ax1.plot(
+ iterations,
+ perf_data["insert_rates"],
+ f"{color}-o",
+ linewidth=2,
+ markersize=6,
+ label=config_key.upper(),
+ )
+
+ # Plot insert time
+ ax2.plot(
+ iterations,
+ perf_data["insert_times"],
+ f"{color}-o",
+ linewidth=2,
+ markersize=6,
+ label=config_key.upper(),
+ )
+
+ color_idx += 1
+
+ ax1.set_xlabel("Iteration")
+ ax1.set_ylabel("Vectors/Second")
+ ax1.set_title("Milvus Insert Rate by Storage Filesystem")
ax1.grid(True, alpha=0.3)
-
- # Query performance
- ax2.plot(
- x, data["baseline_query"], "o-", label="Baseline", linewidth=2, markersize=8
- )
- ax2.plot(
- x, data["dev_query"], "s-", label="Development", linewidth=2, markersize=8
- )
- ax2.set_xlabel("Configuration")
- ax2.set_ylabel("Query QPS")
- ax2.set_title(f"{fs.upper()} Query Performance")
- ax2.set_xticks(x)
- ax2.set_xticklabels(data["configs"])
- ax2.legend()
+ ax1.legend()
+
+ ax2.set_xlabel("Iteration")
+ ax2.set_ylabel("Total Time (seconds)")
+ ax2.set_title("Milvus Insert Time by Storage Filesystem")
ax2.grid(True, alpha=0.3)
-
- plt.tight_layout()
- plt.savefig(os.path.join(output_dir, f"{fs}_performance_trends.png"), dpi=150)
- plt.close()
+ ax2.legend()
+ else:
+ # Single filesystem mode: original behavior
+ fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+
+ # Extract insert data from single filesystem
+ config_key = list(node_performance.keys())[0] if node_performance else None
+ if config_key:
+ perf_data = node_performance[config_key]
+ iterations = list(range(1, len(perf_data["insert_rates"]) + 1))
+
+ # Plot insert rate
+ ax1.plot(
+ iterations,
+ perf_data["insert_rates"],
+ "b-o",
+ linewidth=2,
+ markersize=6,
+ )
+ ax1.set_xlabel("Iteration")
+ ax1.set_ylabel("Vectors/Second")
+ ax1.set_title("Vector Insert Rate Performance")
+ ax1.grid(True, alpha=0.3)
+
+ # Plot insert time
+ ax2.plot(
+ iterations,
+ perf_data["insert_times"],
+ "r-o",
+ linewidth=2,
+ markersize=6,
+ )
+ ax2.set_xlabel("Iteration")
+ ax2.set_ylabel("Total Time (seconds)")
+ ax2.set_title("Vector Insert Time Performance")
+ ax2.grid(True, alpha=0.3)
+
+ plt.tight_layout()
+ plt.savefig(os.path.join(output_dir, "performance_trends.png"), dpi=150)
+ plt.close()
-def create_simple_performance_trends(results, output_dir):
- """Create a simple performance trends chart for basic Milvus testing"""
+def create_heatmap_analysis(results, output_dir):
+ """Create multi-filesystem heatmap showing query performance"""
if not results:
return
-
- # Separate baseline and dev results
- baseline_results = [r for r in results if not r.get("is_dev", False)]
- dev_results = [r for r in results if r.get("is_dev", False)]
-
- if not baseline_results and not dev_results:
- return
-
- fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 10))
-
- # Prepare data
- baseline_insert = []
- baseline_query = []
- dev_insert = []
- dev_query = []
- labels = []
-
- # Process baseline results
- for i, result in enumerate(baseline_results):
- if "insert_performance" in result:
- baseline_insert.append(result["insert_performance"].get("vectors_per_second", 0))
- else:
- baseline_insert.append(0)
+
+ # Group data by filesystem configuration
+ fs_performance = defaultdict(lambda: {
+ "query_data": [],
+ "config_key": "",
+ })
+
+ for result in results:
+ fs_type, block_size, config_key = _extract_filesystem_config(result)
- # Calculate average query QPS
- query_qps = 0
- if "query_performance" in result:
- qp = result["query_performance"]
- total_qps = 0
- count = 0
- for topk_key in ["topk_1", "topk_10", "topk_100"]:
- if topk_key in qp:
- for batch_key in ["batch_1", "batch_10", "batch_100"]:
- if batch_key in qp[topk_key]:
- total_qps += qp[topk_key][batch_key].get("queries_per_second", 0)
- count += 1
- if count > 0:
- query_qps = total_qps / count
- baseline_query.append(query_qps)
- labels.append(f"Run {i+1}")
-
- # Process dev results
- for result in dev_results:
- if "insert_performance" in result:
- dev_insert.append(result["insert_performance"].get("vectors_per_second", 0))
- else:
- dev_insert.append(0)
+ query_perf = result.get("query_performance", {})
+ for topk, topk_data in query_perf.items():
+ for batch, batch_data in topk_data.items():
+ qps = batch_data.get("queries_per_second", 0)
+ fs_performance[config_key]["query_data"].append({
+ "topk": topk,
+ "batch": batch,
+ "qps": qps,
+ })
+ fs_performance[config_key]["config_key"] = config_key
+
+ # Check if we have multi-filesystem data
+ if len(fs_performance) > 1:
+ # Multi-filesystem mode: separate heatmaps for each filesystem
+ num_fs = len(fs_performance)
+ fig, axes = plt.subplots(1, num_fs, figsize=(5*num_fs, 6))
+ if num_fs == 1:
+ axes = [axes]
- query_qps = 0
- if "query_performance" in result:
- qp = result["query_performance"]
- total_qps = 0
- count = 0
- for topk_key in ["topk_1", "topk_10", "topk_100"]:
- if topk_key in qp:
- for batch_key in ["batch_1", "batch_10", "batch_100"]:
- if batch_key in qp[topk_key]:
- total_qps += qp[topk_key][batch_key].get("queries_per_second", 0)
- count += 1
- if count > 0:
- query_qps = total_qps / count
- dev_query.append(query_qps)
-
- x = range(len(baseline_results) if baseline_results else len(dev_results))
-
- # Insert performance
- if baseline_insert:
- ax1.plot(x, baseline_insert, "o-", label="Baseline", linewidth=2, markersize=8)
- if dev_insert:
- ax1.plot(x[:len(dev_insert)], dev_insert, "s-", label="Development", linewidth=2, markersize=8)
- ax1.set_xlabel("Test Run")
- ax1.set_ylabel("Insert QPS")
- ax1.set_title("Milvus Insert Performance")
- ax1.set_xticks(x)
- ax1.set_xticklabels(labels if labels else [f"Run {i+1}" for i in x])
- ax1.legend()
- ax1.grid(True, alpha=0.3)
-
- # Query performance
- if baseline_query:
- ax2.plot(x, baseline_query, "o-", label="Baseline", linewidth=2, markersize=8)
- if dev_query:
- ax2.plot(x[:len(dev_query)], dev_query, "s-", label="Development", linewidth=2, markersize=8)
- ax2.set_xlabel("Test Run")
- ax2.set_ylabel("Query QPS")
- ax2.set_title("Milvus Query Performance")
- ax2.set_xticks(x)
- ax2.set_xticklabels(labels if labels else [f"Run {i+1}" for i in x])
- ax2.legend()
- ax2.grid(True, alpha=0.3)
+ # Define common structure for consistency
+ topk_order = ["topk_1", "topk_10", "topk_100"]
+ batch_order = ["batch_1", "batch_10", "batch_100"]
+
+ for idx, (config_key, perf_data) in enumerate(fs_performance.items()):
+ # Create matrix for this filesystem
+ matrix = np.zeros((len(topk_order), len(batch_order)))
+
+ # Fill matrix with data
+ query_dict = {}
+ for item in perf_data["query_data"]:
+ query_dict[(item["topk"], item["batch"])] = item["qps"]
+
+ for i, topk in enumerate(topk_order):
+ for j, batch in enumerate(batch_order):
+ matrix[i, j] = query_dict.get((topk, batch), 0)
+
+ # Plot heatmap
+ im = axes[idx].imshow(matrix, cmap='viridis', aspect='auto')
+ axes[idx].set_title(f"{config_key.upper()} Query Performance")
+ axes[idx].set_xticks(range(len(batch_order)))
+ axes[idx].set_xticklabels([b.replace("batch_", "Batch ") for b in batch_order])
+ axes[idx].set_yticks(range(len(topk_order)))
+ axes[idx].set_yticklabels([t.replace("topk_", "Top-") for t in topk_order])
+
+ # Add text annotations
+ for i in range(len(topk_order)):
+ for j in range(len(batch_order)):
+ axes[idx].text(j, i, f'{matrix[i, j]:.0f}',
+ ha="center", va="center", color="white", fontweight="bold")
+
+ # Add colorbar
+ cbar = plt.colorbar(im, ax=axes[idx])
+ cbar.set_label('Queries Per Second (QPS)')
+ else:
+ # Single filesystem mode
+ fig, ax = plt.subplots(1, 1, figsize=(8, 6))
+
+ if fs_performance:
+ config_key = list(fs_performance.keys())[0]
+ perf_data = fs_performance[config_key]
+
+ # Create matrix
+ topk_order = ["topk_1", "topk_10", "topk_100"]
+ batch_order = ["batch_1", "batch_10", "batch_100"]
+ matrix = np.zeros((len(topk_order), len(batch_order)))
+
+ # Fill matrix with data
+ query_dict = {}
+ for item in perf_data["query_data"]:
+ query_dict[(item["topk"], item["batch"])] = item["qps"]
+
+ for i, topk in enumerate(topk_order):
+ for j, batch in enumerate(batch_order):
+ matrix[i, j] = query_dict.get((topk, batch), 0)
+
+ # Plot heatmap
+ im = ax.imshow(matrix, cmap='viridis', aspect='auto')
+ ax.set_title("Milvus Query Performance Heatmap")
+ ax.set_xticks(range(len(batch_order)))
+ ax.set_xticklabels([b.replace("batch_", "Batch ") for b in batch_order])
+ ax.set_yticks(range(len(topk_order)))
+ ax.set_yticklabels([t.replace("topk_", "Top-") for t in topk_order])
+
+ # Add text annotations
+ for i in range(len(topk_order)):
+ for j in range(len(batch_order)):
+ ax.text(j, i, f'{matrix[i, j]:.0f}',
+ ha="center", va="center", color="white", fontweight="bold")
+
+ # Add colorbar
+ cbar = plt.colorbar(im, ax=ax)
+ cbar.set_label('Queries Per Second (QPS)')
plt.tight_layout()
- plt.savefig(os.path.join(output_dir, "performance_trends.png"), dpi=150)
+ plt.savefig(os.path.join(output_dir, "performance_heatmap.png"), dpi=150, bbox_inches="tight")
plt.close()
-def generate_summary_statistics(results, output_dir):
- """Generate summary statistics and save to JSON"""
- summary = {
- "total_tests": len(results),
- "filesystems_tested": list(
- set(r.get("filesystem", "unknown") for r in results)
- ),
- "configurations": {},
- "performance_summary": {
- "best_insert_qps": {"value": 0, "config": ""},
- "best_query_qps": {"value": 0, "config": ""},
- "average_insert_qps": 0,
- "average_query_qps": 0,
- },
- }
-
- # Calculate statistics
- all_insert_qps = []
- all_query_qps = []
-
- for result in results:
- fs = result.get("filesystem", "unknown")
- block_size = result.get("block_size", "default")
- is_dev = "dev" if result.get("is_dev", False) else "baseline"
- config_name = f"{fs}-{block_size}-{is_dev}"
-
- # Get actual performance metrics
- if "insert_performance" in result:
- insert_qps = result["insert_performance"].get("vectors_per_second", 0)
- else:
- insert_qps = 0
-
- # Calculate average query QPS
- query_qps = 0
- if "query_performance" in result:
- qp = result["query_performance"]
- total_qps = 0
- count = 0
- for topk_key in ["topk_1", "topk_10", "topk_100"]:
- if topk_key in qp:
- for batch_key in ["batch_1", "batch_10", "batch_100"]:
- if batch_key in qp[topk_key]:
- total_qps += qp[topk_key][batch_key].get(
- "queries_per_second", 0
- )
- count += 1
- if count > 0:
- query_qps = total_qps / count
-
- all_insert_qps.append(insert_qps)
- all_query_qps.append(query_qps)
-
- summary["configurations"][config_name] = {
- "insert_qps": insert_qps,
- "query_qps": query_qps,
- "host": result.get("host", "unknown"),
- }
-
- if insert_qps > summary["performance_summary"]["best_insert_qps"]["value"]:
- summary["performance_summary"]["best_insert_qps"] = {
- "value": insert_qps,
- "config": config_name,
- }
-
- if query_qps > summary["performance_summary"]["best_query_qps"]["value"]:
- summary["performance_summary"]["best_query_qps"] = {
- "value": query_qps,
- "config": config_name,
- }
-
- summary["performance_summary"]["average_insert_qps"] = (
- np.mean(all_insert_qps) if all_insert_qps else 0
- )
- summary["performance_summary"]["average_query_qps"] = (
- np.mean(all_query_qps) if all_query_qps else 0
- )
-
- # Save summary
- with open(os.path.join(output_dir, "summary.json"), "w") as f:
- json.dump(summary, f, indent=2)
-
- return summary
-
-
def main():
if len(sys.argv) < 3:
print("Usage: generate_graphs.py <results_dir> <output_dir>")
@@ -642,37 +340,23 @@ def main():
results_dir = sys.argv[1]
output_dir = sys.argv[2]
- # Create output directory
+ # Ensure output directory exists
os.makedirs(output_dir, exist_ok=True)
# Load results
results = load_results(results_dir)
-
if not results:
- print("No results found to analyze")
+ print(f"No valid results found in {results_dir}")
sys.exit(1)
print(f"Loaded {len(results)} result files")
# Generate graphs
- print("Generating performance heatmap...")
- create_heatmap_analysis(results, output_dir)
-
- print("Generating performance trends...")
create_simple_performance_trends(results, output_dir)
+ create_heatmap_analysis(results, output_dir)
- print("Generating summary statistics...")
- summary = generate_summary_statistics(results, output_dir)
-
- print(f"\nAnalysis complete! Graphs saved to {output_dir}")
- print(f"Total configurations tested: {summary['total_tests']}")
- print(
- f"Best insert QPS: {summary['performance_summary']['best_insert_qps']['value']} ({summary['performance_summary']['best_insert_qps']['config']})"
- )
- print(
- f"Best query QPS: {summary['performance_summary']['best_query_qps']['value']} ({summary['performance_summary']['best_query_qps']['config']})"
- )
+ print(f"Graphs generated in {output_dir}")
if __name__ == "__main__":
- main()
+ main()
\ No newline at end of file
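
Note (illustration only, not part of this patch): both branches of
create_heatmap_analysis() above build the same top-k by batch-size QPS
matrix. A minimal sketch of that step as a standalone helper follows. It
assumes the result layout used in the diff (query_performance -> topk_N ->
batch_N -> queries_per_second). The name build_qps_matrix is hypothetical,
and it returns plain nested lists rather than a NumPy array so it has no
dependencies.

```python
# Hypothetical helper, not in the patch: builds the grid that the heatmap
# code fills by hand in both the multi-fs and single-fs branches.
TOPK_ORDER = ["topk_1", "topk_10", "topk_100"]
BATCH_ORDER = ["batch_1", "batch_10", "batch_100"]


def build_qps_matrix(query_perf):
    """Return a 3x3 list of lists: rows follow TOPK_ORDER, columns BATCH_ORDER.

    Missing top-k/batch combinations are reported as 0, matching the
    query_dict.get((topk, batch), 0) fallback used in the diff.
    """
    matrix = []
    for topk in TOPK_ORDER:
        row = []
        for batch in BATCH_ORDER:
            cell = query_perf.get(topk, {}).get(batch, {})
            row.append(cell.get("queries_per_second", 0))
        matrix.append(row)
    return matrix
```

The nested list can be handed to imshow() directly, since matplotlib
accepts array-like input, or wrapped in np.array() first if the
matrix[i, j] indexing used in the diff is needed.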
diff --git a/playbooks/roles/ai_collect_results/files/generate_html_report.py b/playbooks/roles/ai_collect_results/files/generate_html_report.py
index a205577c..01ec734c 100755
--- a/playbooks/roles/ai_collect_results/files/generate_html_report.py
+++ b/playbooks/roles/ai_collect_results/files/generate_html_report.py
@@ -69,6 +69,24 @@ HTML_TEMPLATE = """
color: #7f8c8d;
font-size: 0.9em;
}}
+ .config-box {{
+ background: #f8f9fa;
+ border-left: 4px solid #3498db;
+ padding: 15px;
+ margin: 20px 0;
+ border-radius: 4px;
+ }}
+ .config-box h3 {{
+ margin-top: 0;
+ color: #2c3e50;
+ }}
+ .config-box ul {{
+ margin: 10px 0;
+ padding-left: 20px;
+ }}
+ .config-box li {{
+ margin: 5px 0;
+ }}
.section {{
background: white;
padding: 30px;
@@ -162,15 +180,16 @@ HTML_TEMPLATE = """
</head>
<body>
<div class="header">
- <h1>AI Vector Database Benchmark Results</h1>
+ <h1>Milvus Vector Database Benchmark Results</h1>
<div class="subtitle">Generated on {timestamp}</div>
</div>
<nav class="navigation">
<ul>
<li><a href="#summary">Summary</a></li>
+ {filesystem_nav_items}
<li><a href="#performance-metrics">Performance Metrics</a></li>
- <li><a href="#performance-trends">Performance Trends</a></li>
+ <li><a href="#performance-heatmap">Performance Heatmap</a></li>
<li><a href="#detailed-results">Detailed Results</a></li>
</ul>
</nav>
@@ -192,34 +211,40 @@ HTML_TEMPLATE = """
<div class="label">{best_query_config}</div>
</div>
<div class="card">
- <h3>Test Runs</h3>
- <div class="value">{total_tests}</div>
- <div class="label">Benchmark Executions</div>
+ <h3>{fourth_card_title}</h3>
+ <div class="value">{fourth_card_value}</div>
+ <div class="label">{fourth_card_label}</div>
</div>
</div>
- <div id="performance-metrics" class="section">
- <h2>Performance Metrics</h2>
- <p>Key performance indicators for Milvus vector database operations.</p>
+ {filesystem_comparison_section}
+
+ {block_size_analysis_section}
+
+ <div id="performance-heatmap" class="section">
+ <h2>Performance Heatmap</h2>
+ <p>Heatmap visualization showing performance metrics across all tested configurations.</p>
<div class="graph-container">
- <img src="graphs/performance_heatmap.png" alt="Performance Metrics">
+ <img src="graphs/performance_heatmap.png" alt="Performance Heatmap">
</div>
</div>
- <div id="performance-trends" class="section">
- <h2>Performance Trends</h2>
- <p>Performance comparison between baseline and development configurations.</p>
- <div class="graph-container">
- <img src="graphs/performance_trends.png" alt="Performance Trends">
+ <div id="performance-metrics" class="section">
+ <h2>Performance Metrics</h2>
+ {config_summary}
+ <div class="graph-grid">
+ {performance_trend_graphs}
</div>
</div>
<div id="detailed-results" class="section">
- <h2>Detailed Results Table</h2>
+ <h2>Milvus Performance by Storage Filesystem</h2>
+ <p>This table shows how Milvus vector database performs when its data is stored on different filesystem types and configurations.</p>
<table class="results-table">
<thead>
<tr>
- <th>Host</th>
+ <th>Filesystem</th>
+ <th>Configuration</th>
<th>Type</th>
<th>Insert QPS</th>
<th>Query QPS</th>
+ <th>Node</th>
@@ -260,51 +285,77 @@ def load_results(results_dir):
data = json.load(f)
# Get filesystem from JSON data first, then fallback to filename parsing
filename = os.path.basename(json_file)
-
+
# Skip results without valid performance data
insert_perf = data.get("insert_performance", {})
query_perf = data.get("query_performance", {})
if not insert_perf or not query_perf:
continue
-
+
# Get filesystem from JSON data
fs_type = data.get("filesystem", None)
-
- # If not in JSON, try to parse from filename (backwards compatibility)
- if not fs_type and "debian13-ai" in filename:
- host_parts = (
- filename.replace("results_debian13-ai-", "")
- .replace("_1.json", "")
+
+ # Always try to parse from filename first since JSON data might be wrong
+ if "-ai-" in filename:
+ # Handle both debian13-ai- and prod-ai- prefixes
+ cleaned_filename = filename.replace("results_", "")
+
+ # Extract the part after -ai-
+ if "debian13-ai-" in cleaned_filename:
+ host_part = cleaned_filename.replace("debian13-ai-", "")
+ elif "prod-ai-" in cleaned_filename:
+ host_part = cleaned_filename.replace("prod-ai-", "")
+ else:
+ # Generic extraction
+ ai_index = cleaned_filename.find("-ai-")
+ if ai_index != -1:
+ host_part = cleaned_filename[ai_index + 4 :] # Skip "-ai-"
+ else:
+ host_part = cleaned_filename
+
+ # Remove file extensions and dev suffix
+ host_part = (
+ host_part.replace("_1.json", "")
.replace("_2.json", "")
.replace("_3.json", "")
- .split("-")
+ .replace("-dev", "")
)
- if "xfs" in host_parts[0]:
+
+ # Parse filesystem type and block size
+ if host_part.startswith("xfs-"):
fs_type = "xfs"
- block_size = host_parts[1] if len(host_parts) > 1 else "4k"
- elif "ext4" in host_parts[0]:
+ # Extract block size: xfs-4k-4ks -> 4k
+ parts = host_part.split("-")
+ if len(parts) >= 2:
+ block_size = parts[1] # 4k, 16k, 32k, 64k
+ else:
+ block_size = "4k"
+ elif host_part.startswith("ext4-"):
fs_type = "ext4"
- block_size = host_parts[1] if len(host_parts) > 1 else "4k"
- elif "btrfs" in host_parts[0]:
+ parts = host_part.split("-")
+ block_size = parts[1] if len(parts) > 1 else "4k"
+ elif host_part.startswith("btrfs"):
fs_type = "btrfs"
block_size = "default"
else:
- fs_type = "unknown"
- block_size = "unknown"
+ # Fallback to JSON data if available
+ if not fs_type:
+ fs_type = "unknown"
+ block_size = "unknown"
else:
# Set appropriate block size based on filesystem
if fs_type == "btrfs":
block_size = "default"
else:
block_size = data.get("block_size", "default")
-
+
# Default to unknown if still not found
if not fs_type:
fs_type = "unknown"
block_size = "unknown"
-
+
is_dev = "dev" in filename
-
+
# Calculate average QPS from query performance data
query_qps = 0
query_count = 0
@@ -316,7 +367,7 @@ def load_results(results_dir):
query_count += 1
if query_count > 0:
query_qps = query_qps / query_count
-
+
results.append(
{
"host": filename.replace("results_", "").replace(".json", ""),
@@ -348,12 +399,36 @@ def generate_table_rows(results, best_configs):
if config_key in best_configs:
row_class += " best-config"
+ # Generate descriptive labels showing Milvus is running on this filesystem
+ if result["filesystem"] == "xfs" and result["block_size"] != "default":
+ storage_label = f"XFS {result['block_size'].upper()}"
+ config_details = f"Block size: {result['block_size']}, Milvus data on XFS"
+ elif result["filesystem"] == "ext4":
+ storage_label = "EXT4"
+ if "bigalloc" in result.get("host", "").lower():
+ config_details = "EXT4 with bigalloc, Milvus data on ext4"
+ else:
+ config_details = (
+ f"Block size: {result['block_size']}, Milvus data on ext4"
+ )
+ elif result["filesystem"] == "btrfs":
+ storage_label = "BTRFS"
+ config_details = "Default Btrfs settings, Milvus data on Btrfs"
+ else:
+ storage_label = result["filesystem"].upper()
+ config_details = f"Milvus data on {result['filesystem']}"
+
+ # Extract clean node identifier from hostname
+ node_name = result["host"].replace("results_", "").replace(".json", "")
+
row = f"""
<tr class="{row_class}">
- <td>{result['host']}</td>
+ <td><strong>{storage_label}</strong></td>
+ <td>{config_details}</td>
<td>{result['type']}</td>
<td>{result['insert_qps']:,}</td>
<td>{result['query_qps']:,}</td>
+ <td><code>{node_name}</code></td>
<td>{result['timestamp']}</td>
</tr>
"""
@@ -362,10 +437,66 @@ def generate_table_rows(results, best_configs):
return "\n".join(rows)
+def generate_config_summary(results_dir):
+ """Generate configuration summary HTML from results"""
+ # Try to load first result file to get configuration
+ result_files = glob.glob(os.path.join(results_dir, "results_*.json"))
+ if not result_files:
+ return ""
+
+ try:
+ with open(result_files[0], "r") as f:
+ data = json.load(f)
+ config = data.get("config", {})
+
+ # Format configuration details
+ config_html = """
+ <div class="config-box">
+ <h3>Test Configuration</h3>
+ <ul>
+ <li><strong>Vector Dataset Size:</strong> {:,} vectors</li>
+ <li><strong>Vector Dimensions:</strong> {}</li>
+ <li><strong>Index Type:</strong> {} (M={}, ef_construction={}, ef={})</li>
+ <li><strong>Benchmark Runtime:</strong> {} seconds</li>
+ <li><strong>Batch Size:</strong> {:,}</li>
+ <li><strong>Test Iterations:</strong> {} runs with identical configuration</li>
+ </ul>
+ </div>
+ """.format(
+ config.get("vector_dataset_size", "N/A"),
+ config.get("vector_dimensions", "N/A"),
+ config.get("index_type", "N/A"),
+ config.get("index_hnsw_m", "N/A"),
+ config.get("index_hnsw_ef_construction", "N/A"),
+ config.get("index_hnsw_ef", "N/A"),
+ config.get("benchmark_runtime", "N/A"),
+ config.get("batch_size", "N/A"),
+ len(result_files),
+ )
+ return config_html
+ except Exception as e:
+ print(f"Warning: Could not generate config summary: {e}")
+ return ""
+
+
def find_performance_trend_graphs(graphs_dir):
- """Find performance trend graph"""
- # Not used in basic implementation since we embed the graph directly
- return ""
+ """Find performance trend graphs"""
+ graphs = []
+ # Look for filesystem-specific graphs in multi-fs mode
+ for fs in ["xfs", "ext4", "btrfs"]:
+ graph_path = f"{fs}_performance_trends.png"
+ if os.path.exists(os.path.join(graphs_dir, graph_path)):
+ graphs.append(
+ f'<div class="graph-container"><img src="graphs/{graph_path}" alt="{fs.upper()} Performance Trends"></div>'
+ )
+ # Fallback to simple performance trends for single mode
+ if not graphs and os.path.exists(
+ os.path.join(graphs_dir, "performance_trends.png")
+ ):
+ graphs.append(
+ '<div class="graph-container"><img src="graphs/performance_trends.png" alt="Performance Trends"></div>'
+ )
+ return "\n".join(graphs)
def generate_html_report(results_dir, graphs_dir, output_path):
@@ -393,6 +524,50 @@ def generate_html_report(results_dir, graphs_dir, output_path):
if summary["performance_summary"]["best_query_qps"]["config"]:
best_configs.add(summary["performance_summary"]["best_query_qps"]["config"])
+ # Check if multi-filesystem testing is enabled (more than one filesystem)
+ filesystems_tested = summary.get("filesystems_tested", [])
+ is_multifs_enabled = len(filesystems_tested) > 1
+
+ # Generate conditional sections based on multi-fs status
+ if is_multifs_enabled:
+ filesystem_nav_items = """
+ <li><a href="#filesystem-comparison">Filesystem Comparison</a></li>
+ <li><a href="#block-size-analysis">Block Size Analysis</a></li>"""
+
+ filesystem_comparison_section = """<div id="filesystem-comparison" class="section">
+ <h2>Milvus Storage Filesystem Comparison</h2>
+ <p>Comparison of Milvus vector database performance when its data is stored on different filesystem types (XFS, ext4, Btrfs) with various configurations.</p>
+ <div class="graph-container">
+ <img src="graphs/filesystem_comparison.png" alt="Filesystem Comparison">
+ </div>
+ </div>"""
+
+ block_size_analysis_section = """<div id="block-size-analysis" class="section">
+ <h2>XFS Block Size Analysis</h2>
+ <p>Performance analysis of XFS filesystem with different block sizes (4K, 16K, 32K, 64K).</p>
+ <div class="graph-container">
+ <img src="graphs/xfs_block_size_analysis.png" alt="XFS Block Size Analysis">
+ </div>
+ </div>"""
+
+ # Multi-fs mode: show filesystem info
+ fourth_card_title = "Storage Filesystems"
+ fourth_card_value = str(len(filesystems_tested))
+ fourth_card_label = ", ".join(filesystems_tested).upper() + " for Milvus Data"
+ else:
+ # Single filesystem mode - hide multi-fs sections
+ filesystem_nav_items = ""
+ filesystem_comparison_section = ""
+ block_size_analysis_section = ""
+
+ # Single mode: show test iterations
+ fourth_card_title = "Test Iterations"
+ fourth_card_value = str(summary["total_tests"])
+ fourth_card_label = "Identical Configuration Runs"
+
+ # Generate configuration summary
+ config_summary = generate_config_summary(results_dir)
+
# Generate HTML
html_content = HTML_TEMPLATE.format(
timestamp=datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
@@ -401,6 +576,14 @@ def generate_html_report(results_dir, graphs_dir, output_path):
best_insert_config=summary["performance_summary"]["best_insert_qps"]["config"],
best_query_qps=f"{summary['performance_summary']['best_query_qps']['value']:,}",
best_query_config=summary["performance_summary"]["best_query_qps"]["config"],
+ fourth_card_title=fourth_card_title,
+ fourth_card_value=fourth_card_value,
+ fourth_card_label=fourth_card_label,
+ filesystem_nav_items=filesystem_nav_items,
+ filesystem_comparison_section=filesystem_comparison_section,
+ block_size_analysis_section=block_size_analysis_section,
+ config_summary=config_summary,
+ performance_trend_graphs=find_performance_trend_graphs(graphs_dir),
table_rows=generate_table_rows(results, best_configs),
)
diff --git a/playbooks/roles/ai_collect_results/tasks/main.yml b/playbooks/roles/ai_collect_results/tasks/main.yml
index 6a15d89c..9586890a 100644
--- a/playbooks/roles/ai_collect_results/tasks/main.yml
+++ b/playbooks/roles/ai_collect_results/tasks/main.yml
@@ -134,13 +134,22 @@
ansible.builtin.command: >
python3 {{ local_scripts_dir }}/analyze_results.py
--results-dir {{ local_results_dir }}
- --output-dir {{ local_results_dir }}
+ --output-dir {{ local_results_dir }}/graphs
{% if ai_benchmark_enable_graphing | bool %}--config {{ local_scripts_dir }}/analysis_config.json{% endif %}
register: analysis_result
run_once: true
delegate_to: localhost
when: collected_results.files is defined and collected_results.files | length > 0
tags: ['results', 'analysis']
+ failed_when: analysis_result.rc != 0
+
+- name: Display analysis script output
+ ansible.builtin.debug:
+ var: analysis_result
+ run_once: true
+ delegate_to: localhost
+ when: collected_results.files is defined and collected_results.files | length > 0
+ tags: ['results', 'analysis']
- name: Create graphs directory
@@ -155,35 +164,8 @@
- collected_results.files | length > 0
tags: ['results', 'graphs']
-- name: Generate performance graphs
- ansible.builtin.command: >
- python3 {{ local_scripts_dir }}/generate_better_graphs.py
- {{ local_results_dir }}
- {{ local_results_dir }}/graphs
- register: graph_generation_result
- failed_when: false
- run_once: true
- delegate_to: localhost
- when:
- - collected_results.files is defined
- - collected_results.files | length > 0
- - ai_benchmark_enable_graphing|bool
- tags: ['results', 'graphs']
-
-- name: Fallback to basic graphs if better graphs fail
- ansible.builtin.command: >
- python3 {{ local_scripts_dir }}/generate_graphs.py
- {{ local_results_dir }}
- {{ local_results_dir }}/graphs
- run_once: true
- delegate_to: localhost
- when:
- - collected_results.files is defined
- - collected_results.files | length > 0
- - ai_benchmark_enable_graphing|bool
- - graph_generation_result is defined
- - graph_generation_result.rc != 0
- tags: ['results', 'graphs']
+# Graph generation is now handled by analyze_results.py above
+# No separate graph generation step needed
- name: Generate HTML report
ansible.builtin.command: >
diff --git a/playbooks/roles/ai_collect_results/templates/analysis_config.json.j2 b/playbooks/roles/ai_collect_results/templates/analysis_config.json.j2
index 5a879649..459cd602 100644
--- a/playbooks/roles/ai_collect_results/templates/analysis_config.json.j2
+++ b/playbooks/roles/ai_collect_results/templates/analysis_config.json.j2
@@ -2,5 +2,5 @@
"enable_graphing": {{ ai_benchmark_enable_graphing|default(true)|lower }},
"graph_format": "{{ ai_benchmark_graph_format|default('png') }}",
"graph_dpi": {{ ai_benchmark_graph_dpi|default(150) }},
- "graph_theme": "{{ ai_benchmark_graph_theme|default('seaborn') }}"
+ "graph_theme": "{{ ai_benchmark_graph_theme|default('default') }}"
}
diff --git a/playbooks/roles/ai_milvus_storage/tasks/main.yml b/playbooks/roles/ai_milvus_storage/tasks/main.yml
new file mode 100644
index 00000000..f8e4ea63
--- /dev/null
+++ b/playbooks/roles/ai_milvus_storage/tasks/main.yml
@@ -0,0 +1,161 @@
+---
+- name: Import optional extra_args file
+ include_vars: "{{ item }}"
+ ignore_errors: yes
+ with_items:
+ - "../extra_vars.yaml"
+ tags: vars
+
+- name: Milvus storage setup
+ when: ai_milvus_storage_enable|bool
+ block:
+ - name: Install filesystem utilities
+ package:
+ name:
+ - xfsprogs
+ - e2fsprogs
+ - btrfs-progs
+ state: present
+ become: yes
+ become_method: sudo
+
+ - name: Check if device exists
+ stat:
+ path: "{{ ai_milvus_device }}"
+ register: milvus_device_stat
+ failed_when: not milvus_device_stat.stat.exists
+
+ - name: Check if Milvus storage is already mounted
+ command: mountpoint -q {{ ai_milvus_mount_point }}
+ register: milvus_mount_check
+ changed_when: false
+ failed_when: false
+
+ - name: Setup Milvus storage filesystem
+ when: milvus_mount_check.rc != 0
+ block:
+ - name: Create Milvus mount point directory
+ file:
+ path: "{{ ai_milvus_mount_point }}"
+ state: directory
+ mode: '0755'
+ become: yes
+ become_method: sudo
+
+ - name: Detect filesystem type from node name
+ set_fact:
+ detected_fstype: >-
+ {%- if 'xfs' in inventory_hostname -%}
+ xfs
+ {%- elif 'ext4' in inventory_hostname -%}
+ ext4
+ {%- elif 'btrfs' in inventory_hostname -%}
+ btrfs
+ {%- else -%}
+ {{ ai_milvus_fstype | default('xfs') }}
+ {%- endif -%}
+ when: ai_milvus_use_node_fs | default(false) | bool
+
+ - name: Detect XFS parameters from node name
+ set_fact:
+ milvus_xfs_blocksize: >-
+ {%- if '64k' in inventory_hostname -%}
+ 65536
+ {%- elif '32k' in inventory_hostname -%}
+ 32768
+ {%- elif '16k' in inventory_hostname -%}
+ 16384
+ {%- else -%}
+ {{ ai_milvus_xfs_blocksize | default(4096) }}
+ {%- endif -%}
+ milvus_xfs_sectorsize: >-
+ {%- if '4ks' in inventory_hostname -%}
+ 4096
+ {%- elif '512s' in inventory_hostname -%}
+ 512
+ {%- else -%}
+ {{ ai_milvus_xfs_sectorsize | default(4096) }}
+ {%- endif -%}
+ when:
+ - ai_milvus_use_node_fs | default(false) | bool
+ - detected_fstype | default(ai_milvus_fstype) == 'xfs'
+
+ - name: Detect ext4 parameters from node name
+ set_fact:
+ milvus_ext4_opts: >-
+ {%- if '16k' in inventory_hostname and 'bigalloc' in inventory_hostname -%}
+ -F -b 4096 -C 16384 -O bigalloc
+ {%- elif '4k' in inventory_hostname -%}
+ -F -b 4096
+ {%- else -%}
+ {{ ai_milvus_ext4_mkfs_opts | default('-F') }}
+ {%- endif -%}
+ when:
+ - ai_milvus_use_node_fs | default(false) | bool
+ - detected_fstype | default(ai_milvus_fstype) == 'ext4'
+
+ - name: Set final filesystem type
+ set_fact:
+ milvus_fstype: "{{ detected_fstype | default(ai_milvus_fstype | default('xfs')) }}"
+
+ - name: Format device with XFS
+ command: >
+ mkfs.xfs -f
+ -b size={{ milvus_xfs_blocksize | default(ai_milvus_xfs_blocksize | default(4096)) }}
+ -s size={{ milvus_xfs_sectorsize | default(ai_milvus_xfs_sectorsize | default(4096)) }}
+ {{ ai_milvus_xfs_mkfs_opts | default('') }}
+ {{ ai_milvus_device }}
+ when: milvus_fstype == "xfs"
+ become: yes
+ become_method: sudo
+
+ - name: Format device with Btrfs
+ command: mkfs.btrfs {{ ai_milvus_btrfs_mkfs_opts | default('-f') }} {{ ai_milvus_device }}
+ when: milvus_fstype == "btrfs"
+ become: yes
+ become_method: sudo
+
+ - name: Format device with ext4
+ command: mkfs.ext4 {{ milvus_ext4_opts | default(ai_milvus_ext4_mkfs_opts | default('-F')) }} {{ ai_milvus_device }}
+ when: milvus_fstype == "ext4"
+ become: yes
+ become_method: sudo
+
+ - name: Mount Milvus storage filesystem
+ mount:
+ path: "{{ ai_milvus_mount_point }}"
+ src: "{{ ai_milvus_device }}"
+ fstype: "{{ milvus_fstype }}"
+ opts: defaults,noatime
+ state: mounted
+ become: yes
+ become_method: sudo
+
+ - name: Add Milvus storage mount to fstab
+ mount:
+ path: "{{ ai_milvus_mount_point }}"
+ src: "{{ ai_milvus_device }}"
+ fstype: "{{ milvus_fstype }}"
+ opts: defaults,noatime
+ state: present
+ become: yes
+ become_method: sudo
+
+ - name: Ensure Milvus directories exist with proper permissions
+ file:
+ path: "{{ item }}"
+ state: directory
+ mode: '0755'
+ owner: root
+ group: root
+ become: yes
+ become_method: sudo
+ loop:
+ - "{{ ai_milvus_mount_point }}"
+ - "{{ ai_milvus_mount_point }}/data"
+ - "{{ ai_milvus_mount_point }}/etcd"
+ - "{{ ai_milvus_mount_point }}/minio"
+
+ - name: Display Milvus storage setup complete
+ debug:
+ msg: "Milvus storage has been prepared at: {{ ai_milvus_mount_point }} with filesystem: {{ milvus_fstype | default(ai_milvus_fstype | default('xfs')) }}"
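The node-name detection in the role above relies on plain substring checks. A minimal Python sketch of the same decision order may make the precedence easier to review. The function name and the example hostnames are hypothetical and not part of this patch; only the tags and sizes are taken from the tasks above.

```python
def detect_fs_params(hostname, default_fstype="xfs",
                     default_blocksize=4096, default_sectorsize=4096):
    """Mirror the role's substring checks on the node name (illustrative only)."""
    # Same order as the if/elif chain in the role: xfs, ext4, btrfs.
    fstype = default_fstype
    for candidate in ("xfs", "ext4", "btrfs"):
        if candidate in hostname:
            fstype = candidate
            break

    params = {"fstype": fstype}
    if fstype == "xfs":
        # Larger block sizes are tested first, then the default applies.
        blocksize = default_blocksize
        for tag, size in (("64k", 65536), ("32k", 32768), ("16k", 16384)):
            if tag in hostname:
                blocksize = size
                break
        sectorsize = default_sectorsize
        for tag, size in (("4ks", 4096), ("512s", 512)):
            if tag in hostname:
                sectorsize = size
                break
        params.update(blocksize=blocksize, sectorsize=sectorsize)
    elif fstype == "ext4":
        if "16k" in hostname and "bigalloc" in hostname:
            params["mkfs_opts"] = "-F -b 4096 -C 16384 -O bigalloc"
        elif "4k" in hostname:
            params["mkfs_opts"] = "-F -b 4096"
        else:
            params["mkfs_opts"] = "-F"
    return params


# Hypothetical node names, chosen only to exercise each branch.
print(detect_fs_params("debian13-ai-xfs-16k-4ks"))
print(detect_fs_params("debian13-ai-ext4-16k-bigalloc-dev"))
print(detect_fs_params("ai-ext4-64k"))
```

Because the checks are substring matches, a name such as "ai-ext4-64k" would take the "4k" branch, so node names need to avoid tags that contain another tag.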
diff --git a/playbooks/roles/ai_multifs_run/tasks/generate_comparison.yml b/playbooks/roles/ai_multifs_run/tasks/generate_comparison.yml
new file mode 100644
index 00000000..b4453b81
--- /dev/null
+++ b/playbooks/roles/ai_multifs_run/tasks/generate_comparison.yml
@@ -0,0 +1,279 @@
+---
+- name: Create multi-filesystem comparison script
+ copy:
+ content: |
+ #!/usr/bin/env python3
+ """
+ Multi-Filesystem AI Benchmark Comparison Report Generator
+
+ This script analyzes AI benchmark results across different filesystem
+ configurations and generates a comprehensive comparison report.
+ """
+
+ import json
+ import glob
+ import os
+ import sys
+ from datetime import datetime
+ from typing import Dict, List, Any
+
+ def load_filesystem_results(results_dir: str) -> Dict[str, Any]:
+ """Load results from all filesystem configurations"""
+ fs_results = {}
+
+ # Find all filesystem configuration directories
+ fs_dirs = [d for d in os.listdir(results_dir)
+ if os.path.isdir(os.path.join(results_dir, d)) and d != 'comparison']
+
+ for fs_name in fs_dirs:
+ fs_path = os.path.join(results_dir, fs_name)
+
+ # Load configuration
+ config_file = os.path.join(fs_path, 'filesystem_config.txt')
+ config_info = {}
+ if os.path.exists(config_file):
+ with open(config_file, 'r') as f:
+ config_info['config_text'] = f.read()
+
+ # Load benchmark results
+ result_files = glob.glob(os.path.join(fs_path, 'results_*.json'))
+ benchmark_results = []
+
+ for result_file in result_files:
+ try:
+ with open(result_file, 'r') as f:
+ data = json.load(f)
+ benchmark_results.append(data)
+ except Exception as e:
+ print(f"Error loading {result_file}: {e}")
+
+ fs_results[fs_name] = {
+ 'config': config_info,
+ 'results': benchmark_results,
+ 'path': fs_path
+ }
+
+ return fs_results
+
+ def generate_comparison_report(fs_results: Dict[str, Any], output_dir: str):
+ """Generate HTML comparison report"""
+ html = []
+
+ # HTML header
+ html.append("<!DOCTYPE html>")
+ html.append("<html lang='en'>")
+ html.append("<head>")
+ html.append(" <meta charset='UTF-8'>")
+ html.append(" <title>AI Multi-Filesystem Benchmark Comparison</title>")
+ html.append(" <style>")
+ html.append(" body { font-family: Arial, sans-serif; margin: 20px; }")
+ html.append(" .header { background-color: #f0f8ff; padding: 20px; border-radius: 5px; margin-bottom: 20px; }")
+ html.append(" .fs-section { margin-bottom: 30px; border: 1px solid #ddd; padding: 15px; border-radius: 5px; }")
+ html.append(" .comparison-table { width: 100%; border-collapse: collapse; margin: 20px 0; }")
+ html.append(" .comparison-table th, .comparison-table td { border: 1px solid #ddd; padding: 8px; text-align: left; }")
+ html.append(" .comparison-table th { background-color: #f2f2f2; }")
+ html.append(" .metric-best { background-color: #d4edda; font-weight: bold; }")
+ html.append(" .metric-worst { background-color: #f8d7da; }")
+ html.append(" .chart-container { margin: 20px 0; padding: 15px; background-color: #f9f9f9; border-radius: 5px; }")
+ html.append(" </style>")
+ html.append("</head>")
+ html.append("<body>")
+
+ # Report header
+ html.append(" <div class='header'>")
+ html.append(" <h1>🗂️ AI Multi-Filesystem Benchmark Comparison</h1>")
+ html.append(f" <p><strong>Generated:</strong> {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}</p>")
+ html.append(f" <p><strong>Filesystem Configurations Tested:</strong> {len(fs_results)}</p>")
+ html.append(" </div>")
+
+ # Performance comparison table
+ html.append(" <h2>📊 Performance Comparison Summary</h2>")
+ html.append(" <table class='comparison-table'>")
+ html.append(" <tr>")
+ html.append(" <th>Filesystem</th>")
+ html.append(" <th>Avg Insert Rate (vectors/sec)</th>")
+ html.append(" <th>Avg Index Time (sec)</th>")
+ html.append(" <th>Avg Query QPS (Top-10, Batch-1)</th>")
+ html.append(" <th>Avg Query Latency (ms)</th>")
+ html.append(" </tr>")
+
+ # Calculate metrics for comparison
+ fs_metrics = {}
+ for fs_name, fs_data in fs_results.items():
+ if not fs_data['results']:
+ continue
+
+ # Calculate averages across all iterations
+ insert_rates = []
+ index_times = []
+ query_qps = []
+ query_latencies = []
+
+ for result in fs_data['results']:
+ if 'insert_performance' in result:
+ insert_rates.append(result['insert_performance'].get('vectors_per_second', 0))
+
+ if 'index_performance' in result:
+ index_times.append(result['index_performance'].get('creation_time_seconds', 0))
+
+ if 'query_performance' in result:
+ qp = result['query_performance']
+ if 'topk_10' in qp and 'batch_1' in qp['topk_10']:
+ batch_data = qp['topk_10']['batch_1']
+ query_qps.append(batch_data.get('queries_per_second', 0))
+ query_latencies.append(batch_data.get('average_time_seconds', 0) * 1000)
+
+ fs_metrics[fs_name] = {
+ 'insert_rate': sum(insert_rates) / len(insert_rates) if insert_rates else 0,
+ 'index_time': sum(index_times) / len(index_times) if index_times else 0,
+ 'query_qps': sum(query_qps) / len(query_qps) if query_qps else 0,
+ 'query_latency': sum(query_latencies) / len(query_latencies) if query_latencies else 0
+ }
+
+ # Find best/worst for highlighting
+ if fs_metrics:
+ best_insert = max(fs_metrics.keys(), key=lambda x: fs_metrics[x]['insert_rate'])
+ best_index = min(fs_metrics.keys(), key=lambda x: fs_metrics[x]['index_time'])
+ best_qps = max(fs_metrics.keys(), key=lambda x: fs_metrics[x]['query_qps'])
+ best_latency = min(fs_metrics.keys(), key=lambda x: fs_metrics[x]['query_latency'])
+
+ worst_insert = min(fs_metrics.keys(), key=lambda x: fs_metrics[x]['insert_rate'])
+ worst_index = max(fs_metrics.keys(), key=lambda x: fs_metrics[x]['index_time'])
+ worst_qps = min(fs_metrics.keys(), key=lambda x: fs_metrics[x]['query_qps'])
+ worst_latency = max(fs_metrics.keys(), key=lambda x: fs_metrics[x]['query_latency'])
+
+ # Generate comparison rows
+ for fs_name, metrics in fs_metrics.items():
+ html.append(" <tr>")
+ html.append(f" <td><strong>{fs_name}</strong></td>")
+
+ # Insert rate
+ cell_class = ""
+ if fs_name == best_insert:
+ cell_class = "metric-best"
+ elif fs_name == worst_insert:
+ cell_class = "metric-worst"
+ html.append(f" <td class='{cell_class}'>{metrics['insert_rate']:.2f}</td>")
+
+ # Index time
+ cell_class = ""
+ if fs_name == best_index:
+ cell_class = "metric-best"
+ elif fs_name == worst_index:
+ cell_class = "metric-worst"
+ html.append(f" <td class='{cell_class}'>{metrics['index_time']:.2f}</td>")
+
+ # Query QPS
+ cell_class = ""
+ if fs_name == best_qps:
+ cell_class = "metric-best"
+ elif fs_name == worst_qps:
+ cell_class = "metric-worst"
+ html.append(f" <td class='{cell_class}'>{metrics['query_qps']:.2f}</td>")
+
+ # Query latency
+ cell_class = ""
+ if fs_name == best_latency:
+ cell_class = "metric-best"
+ elif fs_name == worst_latency:
+ cell_class = "metric-worst"
+ html.append(f" <td class='{cell_class}'>{metrics['query_latency']:.2f}</td>")
+
+ html.append(" </tr>")
+
+ html.append(" </table>")
+
+ # Individual filesystem details
+ html.append(" <h2>📁 Individual Filesystem Details</h2>")
+ for fs_name, fs_data in fs_results.items():
+ html.append(f" <div class='fs-section'>")
+ html.append(f" <h3>{fs_name}</h3>")
+
+ if 'config_text' in fs_data['config']:
+ html.append(" <h4>Configuration:</h4>")
+ html.append(" <pre>" + fs_data['config']['config_text'][:500].replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;') + "</pre>")
+
+ html.append(f" <p><strong>Benchmark Iterations:</strong> {len(fs_data['results'])}</p>")
+
+ if fs_name in fs_metrics:
+ metrics = fs_metrics[fs_name]
+ html.append(" <table class='comparison-table'>")
+ html.append(" <tr><th>Metric</th><th>Value</th></tr>")
+ html.append(f" <tr><td>Average Insert Rate</td><td>{metrics['insert_rate']:.2f} vectors/sec</td></tr>")
+ html.append(f" <tr><td>Average Index Time</td><td>{metrics['index_time']:.2f} seconds</td></tr>")
+ html.append(f" <tr><td>Average Query QPS</td><td>{metrics['query_qps']:.2f}</td></tr>")
+ html.append(f" <tr><td>Average Query Latency</td><td>{metrics['query_latency']:.2f} ms</td></tr>")
+ html.append(" </table>")
+
+ html.append(" </div>")
+
+ # Footer
+ html.append(" <div style='margin-top: 40px; padding: 20px; background-color: #f8f9fa; border-radius: 5px;'>")
+ html.append(" <h3>📝 Analysis Notes</h3>")
+ html.append(" <ul>")
+ html.append(" <li>Green highlighting indicates the best performing filesystem for each metric</li>")
+ html.append(" <li>Red highlighting indicates the worst performing filesystem for each metric</li>")
+ html.append(" <li>Results are averaged across all benchmark iterations for each filesystem</li>")
+ html.append(" <li>Performance can vary based on hardware, kernel version, and workload characteristics</li>")
+ html.append(" </ul>")
+ html.append(" </div>")
+
+ html.append("</body>")
+ html.append("</html>")
+
+ # Write HTML report
+ report_file = os.path.join(output_dir, "multi_filesystem_comparison.html")
+ with open(report_file, 'w') as f:
+ f.write("\n".join(html))
+
+ print(f"Multi-filesystem comparison report generated: {report_file}")
+
+ # Generate JSON summary
+ summary_data = {
+ 'generation_time': datetime.now().isoformat(),
+ 'filesystem_count': len(fs_results),
+ 'metrics_summary': fs_metrics,
+ 'raw_results': {fs: data['results'] for fs, data in fs_results.items()}
+ }
+
+ summary_file = os.path.join(output_dir, "multi_filesystem_summary.json")
+ with open(summary_file, 'w') as f:
+ json.dump(summary_data, f, indent=2)
+
+ print(f"Multi-filesystem summary data: {summary_file}")
+
+ def main():
+ results_dir = "{{ ai_multifs_results_dir }}"
+ comparison_dir = os.path.join(results_dir, "comparison")
+ os.makedirs(comparison_dir, exist_ok=True)
+
+ print("Loading filesystem results...")
+ fs_results = load_filesystem_results(results_dir)
+
+ if not fs_results:
+ print("No filesystem results found!")
+ return 1
+
+ print(f"Found results for {len(fs_results)} filesystem configurations")
+ print("Generating comparison report...")
+
+ generate_comparison_report(fs_results, comparison_dir)
+
+ print("Multi-filesystem comparison completed!")
+ return 0
+
+ if __name__ == "__main__":
+ sys.exit(main())
+ dest: "{{ ai_multifs_results_dir }}/generate_comparison.py"
+ mode: '0755'
+
+- name: Run multi-filesystem comparison analysis
+ command: python3 {{ ai_multifs_results_dir }}/generate_comparison.py
+ register: comparison_result
+
+- name: Display comparison completion message
+ debug:
+ msg: |
+ Multi-filesystem comparison completed!
+ Comparison report: {{ ai_multifs_results_dir }}/comparison/multi_filesystem_comparison.html
+ Summary data: {{ ai_multifs_results_dir }}/comparison/multi_filesystem_summary.json
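The best/worst highlighting in the comparison script depends on the direction of each metric: higher is better for insert rate and query QPS, lower is better for index time and latency. A compact sketch of that selection follows, assuming the same four metric keys as fs_metrics above. The helper name and the sample numbers are made up for illustration and are not measured results.

```python
# Metric name -> True when a larger value is better.
HIGHER_IS_BETTER = {
    "insert_rate": True,
    "index_time": False,
    "query_qps": True,
    "query_latency": False,
}


def rank_filesystems(fs_metrics):
    """Return {metric: (best_fs, worst_fs)} from per-filesystem averages."""
    if not fs_metrics:
        return {}
    ranking = {}
    for metric, higher in HIGHER_IS_BETTER.items():
        # Sort filesystem names by the metric value, ascending.
        ordered = sorted(fs_metrics, key=lambda name: fs_metrics[name][metric])
        lowest, highest = ordered[0], ordered[-1]
        ranking[metric] = (highest, lowest) if higher else (lowest, highest)
    return ranking


# Made-up averages, only to show the shape of the input.
sample = {
    "xfs_4k_4ks": {"insert_rate": 1200.0, "index_time": 30.0,
                   "query_qps": 450.0, "query_latency": 2.2},
    "ext4_4k": {"insert_rate": 1100.0, "index_time": 28.0,
                "query_qps": 470.0, "query_latency": 2.1},
    "btrfs_default": {"insert_rate": 900.0, "index_time": 35.0,
                      "query_qps": 400.0, "query_latency": 2.5},
}

for metric, (best, worst) in rank_filesystems(sample).items():
    print(f"{metric}: best={best} worst={worst}")
```

With a single filesystem the best and worst entries are the same name, which is why the report generator checks the best case first.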
diff --git a/playbooks/roles/ai_multifs_run/tasks/main.yml b/playbooks/roles/ai_multifs_run/tasks/main.yml
new file mode 100644
index 00000000..38dbba12
--- /dev/null
+++ b/playbooks/roles/ai_multifs_run/tasks/main.yml
@@ -0,0 +1,23 @@
+---
+- name: Import optional extra_args file
+ include_vars: "{{ item }}"
+ ignore_errors: yes
+ with_items:
+ - "../extra_vars.yaml"
+ tags: vars
+
+- name: Filter enabled filesystem configurations
+ set_fact:
+ enabled_fs_configs: "{{ ai_multifs_configurations | selectattr('enabled', 'equalto', true) | list }}"
+
+- name: Run AI benchmarks on each filesystem configuration
+ include_tasks: run_single_filesystem.yml
+ loop: "{{ enabled_fs_configs }}"
+ loop_control:
+ loop_var: fs_config
+ index_var: fs_index
+ when: enabled_fs_configs | length > 0
+
+- name: Generate multi-filesystem comparison report
+ include_tasks: generate_comparison.yml
+ when: enabled_fs_configs | length > 1
diff --git a/playbooks/roles/ai_multifs_run/tasks/run_single_filesystem.yml b/playbooks/roles/ai_multifs_run/tasks/run_single_filesystem.yml
new file mode 100644
index 00000000..fd194550
--- /dev/null
+++ b/playbooks/roles/ai_multifs_run/tasks/run_single_filesystem.yml
@@ -0,0 +1,104 @@
+---
+- name: Display current filesystem configuration
+ debug:
+ msg: "Testing filesystem configuration {{ fs_index + 1 }}/{{ enabled_fs_configs | length }}: {{ fs_config.name }}"
+
+- name: Unmount filesystem if mounted
+ mount:
+ path: "{{ ai_multifs_mount_point }}"
+ state: unmounted
+ ignore_errors: yes
+
+- name: Create filesystem with specific configuration
+ shell: "{{ fs_config.mkfs_cmd }} {{ ai_multifs_device }}"
+ register: mkfs_result
+
+- name: Display mkfs output
+ debug:
+ msg: "mkfs output: {{ mkfs_result.stdout }}"
+ when: mkfs_result.stdout != ""
+
+- name: Mount filesystem with specific options
+ mount:
+ path: "{{ ai_multifs_mount_point }}"
+ src: "{{ ai_multifs_device }}"
+ fstype: "{{ fs_config.filesystem }}"
+ opts: "{{ fs_config.mount_opts }}"
+ state: mounted
+
+- name: Create filesystem-specific results directory
+ file:
+ path: "{{ ai_multifs_results_dir }}/{{ fs_config.name }}"
+ state: directory
+ mode: '0755'
+
+- name: Update AI benchmark configuration for current filesystem
+ set_fact:
+ current_fs_benchmark_dir: "{{ ai_multifs_mount_point }}/ai-benchmark-data"
+ current_fs_results_dir: "{{ ai_multifs_results_dir }}/{{ fs_config.name }}"
+
+- name: Create AI benchmark data directory on current filesystem
+ file:
+ path: "{{ current_fs_benchmark_dir }}"
+ state: directory
+ mode: '0755'
+
+- name: Generate AI benchmark configuration for current filesystem
+ template:
+ src: milvus_config.json.j2
+ dest: "{{ current_fs_results_dir }}/milvus_config.json"
+ mode: '0644'
+
+- name: Run AI benchmark on current filesystem
+ shell: |
+ cd {{ current_fs_benchmark_dir }}
+ python3 {{ playbook_dir }}/roles/ai_run_benchmarks/files/milvus_benchmark.py \
+ --config {{ current_fs_results_dir }}/milvus_config.json \
+ --output {{ current_fs_results_dir }}/results_{{ fs_config.name }}_$(date +%Y%m%d_%H%M%S).json
+ register: benchmark_result
+ async: 7200 # 2 hour timeout
+ poll: 30
+
+- name: Display benchmark completion
+ debug:
+ msg: "Benchmark completed for {{ fs_config.name }}: {{ benchmark_result.stdout_lines[-5:] | default(['No output']) }}"
+
+- name: Record filesystem configuration metadata
+ copy:
+ content: |
+ # Filesystem Configuration: {{ fs_config.name }}
+ Filesystem Type: {{ fs_config.filesystem }}
+ mkfs Command: {{ fs_config.mkfs_cmd }}
+ Mount Options: {{ fs_config.mount_opts }}
+ Device: {{ ai_multifs_device }}
+ Mount Point: {{ ai_multifs_mount_point }}
+ Data Directory: {{ current_fs_benchmark_dir }}
+ Results Directory: {{ current_fs_results_dir }}
+ Test Start Time: {{ ansible_date_time.iso8601 }}
+
+ mkfs Output:
+ {{ mkfs_result.stdout }}
+ {{ mkfs_result.stderr }}
+ dest: "{{ current_fs_results_dir }}/filesystem_config.txt"
+ mode: '0644'
+
+- name: Capture filesystem statistics after benchmark
+ shell: |
+ echo "=== Filesystem Usage ===" > {{ current_fs_results_dir }}/filesystem_stats.txt
+ df -h {{ ai_multifs_mount_point }} >> {{ current_fs_results_dir }}/filesystem_stats.txt
+ echo "" >> {{ current_fs_results_dir }}/filesystem_stats.txt
+ echo "=== Filesystem Info ===" >> {{ current_fs_results_dir }}/filesystem_stats.txt
+ {% if fs_config.filesystem == 'xfs' %}
+ xfs_info {{ ai_multifs_mount_point }} >> {{ current_fs_results_dir }}/filesystem_stats.txt 2>&1 || true
+ {% elif fs_config.filesystem == 'ext4' %}
+ tune2fs -l {{ ai_multifs_device }} >> {{ current_fs_results_dir }}/filesystem_stats.txt 2>&1 || true
+ {% elif fs_config.filesystem == 'btrfs' %}
+ btrfs filesystem show {{ ai_multifs_mount_point }} >> {{ current_fs_results_dir }}/filesystem_stats.txt 2>&1 || true
+ btrfs filesystem usage {{ ai_multifs_mount_point }} >> {{ current_fs_results_dir }}/filesystem_stats.txt 2>&1 || true
+ {% endif %}
+ ignore_errors: yes
+
+- name: Unmount filesystem after benchmark
+ mount:
+ path: "{{ ai_multifs_mount_point }}"
+ state: unmounted
diff --git a/playbooks/roles/ai_multifs_run/templates/milvus_config.json.j2 b/playbooks/roles/ai_multifs_run/templates/milvus_config.json.j2
new file mode 100644
index 00000000..6216bf46
--- /dev/null
+++ b/playbooks/roles/ai_multifs_run/templates/milvus_config.json.j2
@@ -0,0 +1,42 @@
+{
+ "milvus": {
+ "host": "{{ ai_milvus_host }}",
+ "port": {{ ai_milvus_port }},
+ "database_name": "{{ ai_milvus_database_name }}_{{ fs_config.name }}"
+ },
+ "benchmark": {
+ "vector_dataset_size": {{ ai_vector_dataset_size }},
+ "vector_dimensions": {{ ai_vector_dimensions }},
+ "index_type": "{{ ai_index_type }}",
+ "iterations": {{ ai_benchmark_iterations }},
+ "runtime_seconds": {{ ai_benchmark_runtime }},
+ "warmup_seconds": {{ ai_benchmark_warmup_time }},
+ "query_patterns": {
+ "topk_1": {{ ai_benchmark_query_topk_1 | lower }},
+ "topk_10": {{ ai_benchmark_query_topk_10 | lower }},
+ "topk_100": {{ ai_benchmark_query_topk_100 | lower }}
+ },
+ "batch_sizes": {
+ "batch_1": {{ ai_benchmark_batch_1 | lower }},
+ "batch_10": {{ ai_benchmark_batch_10 | lower }},
+ "batch_100": {{ ai_benchmark_batch_100 | lower }}
+ }
+ },
+ "index_params": {
+{% if ai_index_type == "HNSW" %}
+ "M": {{ ai_index_hnsw_m }},
+ "efConstruction": {{ ai_index_hnsw_ef_construction }},
+ "ef": {{ ai_index_hnsw_ef }}
+{% elif ai_index_type == "IVF_FLAT" %}
+ "nlist": {{ ai_index_ivf_nlist }},
+ "nprobe": {{ ai_index_ivf_nprobe }}
+{% endif %}
+ },
+ "filesystem": {
+ "name": "{{ fs_config.name }}",
+ "type": "{{ fs_config.filesystem }}",
+ "mkfs_cmd": "{{ fs_config.mkfs_cmd }}",
+ "mount_opts": "{{ fs_config.mount_opts }}",
+ "data_directory": "{{ current_fs_benchmark_dir }}"
+ }
+}
diff --git a/playbooks/roles/ai_multifs_setup/defaults/main.yml b/playbooks/roles/ai_multifs_setup/defaults/main.yml
new file mode 100644
index 00000000..c35d179f
--- /dev/null
+++ b/playbooks/roles/ai_multifs_setup/defaults/main.yml
@@ -0,0 +1,49 @@
+---
+# Default values for AI multi-filesystem testing
+ai_multifs_results_dir: "/data/ai-multifs-benchmark"
+ai_multifs_device: "/dev/vdb"
+ai_multifs_mount_point: "/mnt/ai-multifs-test"
+
+# Filesystem configurations to test
+ai_multifs_configurations:
+ - name: "xfs_4k_4ks"
+ filesystem: "xfs"
+ mkfs_cmd: "mkfs.xfs -f -s size=4096 -b size=4096"
+ mount_opts: "rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota"
+ enabled: "{{ ai_multifs_test_xfs and ai_multifs_xfs_4k_4ks }}"
+
+ - name: "xfs_16k_4ks"
+ filesystem: "xfs"
+ mkfs_cmd: "mkfs.xfs -f -s size=4096 -b size=16384"
+ mount_opts: "rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota"
+ enabled: "{{ ai_multifs_test_xfs and ai_multifs_xfs_16k_4ks }}"
+
+ - name: "xfs_32k_4ks"
+ filesystem: "xfs"
+ mkfs_cmd: "mkfs.xfs -f -s size=4096 -b size=32768"
+ mount_opts: "rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota"
+ enabled: "{{ ai_multifs_test_xfs and ai_multifs_xfs_32k_4ks }}"
+
+ - name: "xfs_64k_4ks"
+ filesystem: "xfs"
+ mkfs_cmd: "mkfs.xfs -f -s size=4096 -b size=65536"
+ mount_opts: "rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota"
+ enabled: "{{ ai_multifs_test_xfs and ai_multifs_xfs_64k_4ks }}"
+
+ - name: "ext4_4k"
+ filesystem: "ext4"
+ mkfs_cmd: "mkfs.ext4 -F -b 4096"
+ mount_opts: "rw,relatime,data=ordered"
+ enabled: "{{ ai_multifs_test_ext4 and ai_multifs_ext4_4k }}"
+
+ - name: "ext4_16k_bigalloc"
+ filesystem: "ext4"
+ mkfs_cmd: "mkfs.ext4 -F -b 4096 -C 16384 -O bigalloc"
+ mount_opts: "rw,relatime,data=ordered"
+ enabled: "{{ ai_multifs_test_ext4 and ai_multifs_ext4_16k_bigalloc }}"
+
+ - name: "btrfs_default"
+ filesystem: "btrfs"
+ mkfs_cmd: "mkfs.btrfs -f"
+ mount_opts: "rw,relatime,space_cache=v2,discard=async"
+ enabled: "{{ ai_multifs_test_btrfs and ai_multifs_btrfs_default }}"
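The roles select configurations with selectattr('enabled', 'equalto', true), and Jinja2's equalto test is an equality comparison. The sketch below shows the same filter in plain Python, to illustrate that an entry is kept only when its enabled value is a real boolean and not the string "True". Whether the templated enabled expressions above arrive as booleans depends on how Ansible evaluates them, which this sketch does not model. The function name and sample entries are hypothetical.

```python
def enabled_configs(configs):
    """Keep entries whose 'enabled' value compares equal to True."""
    # Deliberate '== True' to mirror an equality test; a missing key is skipped.
    return [c for c in configs if c.get("enabled") == True]  # noqa: E712


# Hypothetical entries shaped like ai_multifs_configurations.
configs = [
    {"name": "xfs_4k_4ks", "enabled": True},
    {"name": "ext4_4k", "enabled": False},
    {"name": "btrfs_default", "enabled": "True"},  # string, not a boolean
]

print([c["name"] for c in enabled_configs(configs)])
```

If a run ever reports zero enabled configurations while the Kconfig options are set, checking the type of the enabled values would be a reasonable first step.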
diff --git a/playbooks/roles/ai_multifs_setup/tasks/main.yml b/playbooks/roles/ai_multifs_setup/tasks/main.yml
new file mode 100644
index 00000000..28f3ec40
--- /dev/null
+++ b/playbooks/roles/ai_multifs_setup/tasks/main.yml
@@ -0,0 +1,70 @@
+---
+- name: Import optional extra_args file
+ include_vars: "{{ item }}"
+ ignore_errors: yes
+ with_items:
+ - "../extra_vars.yaml"
+ tags: vars
+
+- name: Create multi-filesystem results directory
+ file:
+ path: "{{ ai_multifs_results_dir }}"
+ state: directory
+ mode: '0755'
+
+- name: Create mount point directory
+ file:
+ path: "{{ ai_multifs_mount_point }}"
+ state: directory
+ mode: '0755'
+
+- name: Unmount any existing filesystem on mount point
+ mount:
+ path: "{{ ai_multifs_mount_point }}"
+ state: unmounted
+ ignore_errors: yes
+
+- name: Install required filesystem utilities
+ package:
+ name:
+ - xfsprogs
+ - e2fsprogs
+ - btrfs-progs
+ state: present
+
+- name: Filter enabled filesystem configurations
+ set_fact:
+ enabled_fs_configs: "{{ ai_multifs_configurations | selectattr('enabled', 'equalto', true) | list }}"
+
+- name: Display enabled filesystem configurations
+ debug:
+ msg: "Will test {{ enabled_fs_configs | length }} filesystem configurations: {{ enabled_fs_configs | map(attribute='name') | list }}"
+
+- name: Validate that device exists
+ stat:
+ path: "{{ ai_multifs_device }}"
+ register: device_stat
+ failed_when: not device_stat.stat.exists
+
+- name: Display device information
+ debug:
+ msg: "Using device {{ ai_multifs_device }} for multi-filesystem testing"
+
+- name: Create filesystem configuration summary
+ copy:
+ content: |
+ # AI Multi-Filesystem Testing Configuration
+ Generated: {{ ansible_date_time.iso8601 }}
+ Device: {{ ai_multifs_device }}
+ Mount Point: {{ ai_multifs_mount_point }}
+ Results Directory: {{ ai_multifs_results_dir }}
+
+ Enabled Filesystem Configurations:
+ {% for config in enabled_fs_configs %}
+ - {{ config.name }}:
+ Filesystem: {{ config.filesystem }}
+ mkfs command: {{ config.mkfs_cmd }}
+ Mount options: {{ config.mount_opts }}
+ {% endfor %}
+ dest: "{{ ai_multifs_results_dir }}/test_configuration.txt"
+ mode: '0644'
diff --git a/playbooks/roles/ai_run_benchmarks/files/milvus_benchmark.py b/playbooks/roles/ai_run_benchmarks/files/milvus_benchmark.py
index 4ce14fb7..2aaa54ba 100644
--- a/playbooks/roles/ai_run_benchmarks/files/milvus_benchmark.py
+++ b/playbooks/roles/ai_run_benchmarks/files/milvus_benchmark.py
@@ -54,67 +54,83 @@ class MilvusBenchmark:
)
self.logger = logging.getLogger(__name__)
- def get_filesystem_info(self, path: str = "/data") -> Dict[str, str]:
+ def get_filesystem_info(self, path: str = "/data/milvus") -> Dict[str, str]:
"""Detect filesystem type for the given path"""
- try:
- # Use df -T to get filesystem type
- result = subprocess.run(
- ["df", "-T", path], capture_output=True, text=True, check=True
- )
-
- lines = result.stdout.strip().split("\n")
- if len(lines) >= 2:
- # Second line contains the filesystem info
- # Format: Filesystem Type 1K-blocks Used Available Use% Mounted on
- parts = lines[1].split()
- if len(parts) >= 2:
- filesystem_type = parts[1]
- mount_point = parts[-1] if len(parts) >= 7 else path
+ # Try primary path first, fallback to /data for backwards compatibility
+ paths_to_try = [path]
+ if path != "/data" and not os.path.exists(path):
+ paths_to_try.append("/data")
+
+ for check_path in paths_to_try:
+ try:
+ # Use df -T to get filesystem type
+ result = subprocess.run(
+ ["df", "-T", check_path], capture_output=True, text=True, check=True
+ )
+
+ lines = result.stdout.strip().split("\n")
+ if len(lines) >= 2:
+ # Second line contains the filesystem info
+ # Format: Filesystem Type 1K-blocks Used Available Use% Mounted on
+ parts = lines[1].split()
+ if len(parts) >= 2:
+ filesystem_type = parts[1]
+ mount_point = parts[-1] if len(parts) >= 7 else check_path
+
+ return {
+ "filesystem": filesystem_type,
+ "mount_point": mount_point,
+ "data_path": check_path,
+ }
+ except subprocess.CalledProcessError as e:
+ self.logger.warning(
+ f"Failed to detect filesystem for {check_path}: {e}"
+ )
+ continue
+ except Exception as e:
+ self.logger.warning(f"Error detecting filesystem for {check_path}: {e}")
+ continue
+ # Fallback: try to detect from /proc/mounts
+ for check_path in paths_to_try:
+ try:
+ with open("/proc/mounts", "r") as f:
+ mounts = f.readlines()
+
+ # Find the mount that contains our path
+ best_match = ""
+ best_fs = "unknown"
+
+ for line in mounts:
+ parts = line.strip().split()
+ if len(parts) >= 3:
+ mount_point = parts[1]
+ fs_type = parts[2]
+
+ # Check if this mount point is a prefix of our path
+ if check_path.startswith(mount_point) and len(
+ mount_point
+ ) > len(best_match):
+ best_match = mount_point
+ best_fs = fs_type
+
+ if best_fs != "unknown":
return {
- "filesystem": filesystem_type,
- "mount_point": mount_point,
- "data_path": path,
+ "filesystem": best_fs,
+ "mount_point": best_match,
+ "data_path": check_path,
}
- except subprocess.CalledProcessError as e:
- self.logger.warning(f"Failed to detect filesystem for {path}: {e}")
- except Exception as e:
- self.logger.warning(f"Error detecting filesystem for {path}: {e}")
- # Fallback: try to detect from /proc/mounts
- try:
- with open("/proc/mounts", "r") as f:
- mounts = f.readlines()
-
- # Find the mount that contains our path
- best_match = ""
- best_fs = "unknown"
-
- for line in mounts:
- parts = line.strip().split()
- if len(parts) >= 3:
- mount_point = parts[1]
- fs_type = parts[2]
-
- # Check if this mount point is a prefix of our path
- if path.startswith(mount_point) and len(mount_point) > len(
- best_match
- ):
- best_match = mount_point
- best_fs = fs_type
-
- if best_fs != "unknown":
- return {
- "filesystem": best_fs,
- "mount_point": best_match,
- "data_path": path,
- }
-
- except Exception as e:
- self.logger.warning(f"Error reading /proc/mounts: {e}")
+ except Exception as e:
+ self.logger.warning(f"Error reading /proc/mounts for {check_path}: {e}")
+ continue
# Final fallback
- return {"filesystem": "unknown", "mount_point": "/", "data_path": path}
+ return {
+ "filesystem": "unknown",
+ "mount_point": "/",
+ "data_path": paths_to_try[0],
+ }
def connect_to_milvus(self) -> bool:
"""Connect to Milvus server"""
@@ -440,13 +456,47 @@ class MilvusBenchmark:
"""Run complete benchmark suite"""
self.logger.info("Starting Milvus benchmark suite...")
- # Detect filesystem information
- fs_info = self.get_filesystem_info("/data")
+ # Detect filesystem information - Milvus data path first
+ milvus_data_path = "/data/milvus"
+ if os.path.exists(milvus_data_path):
+ # Multi-fs mode: Milvus data is on dedicated filesystem
+ fs_info = self.get_filesystem_info(milvus_data_path)
+ self.logger.info(
+ f"Multi-filesystem mode: Using {milvus_data_path} for filesystem detection"
+ )
+ else:
+ # Single-fs mode: fall back to /data
+ fs_info = self.get_filesystem_info("/data")
+ self.logger.info(
+ "Single-filesystem mode: Using /data for filesystem detection"
+ )
+
self.results["system_info"] = fs_info
+
+ # Add kernel version and hostname to system info
+ try:
+ import socket
+
+ # Get hostname
+ self.results["system_info"]["hostname"] = socket.gethostname()
+
+ # Get kernel version using uname -r
+ kernel_result = subprocess.run(['uname', '-r'], capture_output=True, text=True, check=True)
+ self.results["system_info"]["kernel_version"] = kernel_result.stdout.strip()
+
+ self.logger.info(
+ f"System info: hostname={self.results['system_info']['hostname']}, "
+ f"kernel={self.results['system_info']['kernel_version']}"
+ )
+ except Exception as e:
+ self.logger.warning(f"Could not collect kernel info: {e}")
+ self.results["system_info"]["kernel_version"] = "unknown"
+ self.results["system_info"]["hostname"] = "unknown"
+
# Also add filesystem at top level for compatibility with existing graphs
self.results["filesystem"] = fs_info["filesystem"]
self.logger.info(
- f"Detected filesystem: {fs_info['filesystem']} at {fs_info['mount_point']}"
+ f"Detected filesystem: {fs_info['filesystem']} at {fs_info['mount_point']} (data path: {fs_info['data_path']})"
)
if not self.connect_to_milvus():
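Reviewer note: the /proc/mounts fallback in get_filesystem_info() selects the longest mount point that is a prefix of the data path. The sketch below illustrates that idea in isolation. The find_mount() helper and the sample mount table are hypothetical and not part of this patch. As an assumption of this sketch, it matches on whole path components, which is stricter than the plain startswith() check used in the hunk above.

```python
from typing import Dict, Iterable


def find_mount(path: str, mounts_lines: Iterable[str]) -> Dict[str, str]:
    """Return fs type and mount point of the longest mount containing path."""
    best_match = ""
    best_fs = "unknown"
    for line in mounts_lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        mount_point, fs_type = parts[1], parts[2]
        # Compare whole path components so /data does not claim /database.
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        if inside and len(mount_point) > len(best_match):
            best_match, best_fs = mount_point, fs_type
    return {
        "filesystem": best_fs,
        "mount_point": best_match or "/",
        "data_path": path,
    }


# Hypothetical mount table in /proc/mounts format.
sample = [
    "/dev/vda1 / ext4 rw,relatime 0 0",
    "/dev/nvme1n1 /data xfs rw,relatime 0 0",
    "/dev/nvme3n1 /data/milvus btrfs rw,relatime 0 0",
]

print(find_mount("/data/milvus/etcd", sample))
print(find_mount("/database", sample))
```

With a plain startswith() test, a path such as /database would be attributed to the /data mount. Whether that case can occur on the generated nodes is not something this patch states.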
diff --git a/playbooks/roles/gen_hosts/tasks/main.yml b/playbooks/roles/gen_hosts/tasks/main.yml
index 4b35d9f6..d36790b0 100644
--- a/playbooks/roles/gen_hosts/tasks/main.yml
+++ b/playbooks/roles/gen_hosts/tasks/main.yml
@@ -381,6 +381,25 @@
- workflows_reboot_limit
- ansible_hosts_template.stat.exists
+- name: Load AI nodes configuration for multi-filesystem setup
+ include_vars:
+ file: "{{ topdir_path }}/{{ kdevops_nodes }}"
+ name: guestfs_nodes
+ when:
+ - kdevops_workflows_dedicated_workflow
+ - kdevops_workflow_enable_ai
+ - ai_enable_multifs_testing|default(false)|bool
+ - ansible_hosts_template.stat.exists
+
+- name: Extract AI node names for multi-filesystem setup
+ set_fact:
+ all_generic_nodes: "{{ guestfs_nodes.guestfs_nodes | map(attribute='name') | list }}"
+ when:
+ - kdevops_workflows_dedicated_workflow
+ - kdevops_workflow_enable_ai
+ - ai_enable_multifs_testing|default(false)|bool
+ - guestfs_nodes is defined
+
- name: Generate the Ansible hosts file for a dedicated AI setup
tags: ['hosts']
ansible.builtin.template:
diff --git a/playbooks/roles/gen_hosts/templates/fstests.j2 b/playbooks/roles/gen_hosts/templates/fstests.j2
index ac086c6e..32d90abf 100644
--- a/playbooks/roles/gen_hosts/templates/fstests.j2
+++ b/playbooks/roles/gen_hosts/templates/fstests.j2
@@ -70,6 +70,7 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
[krb5:vars]
ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
{% endif %}
+{% if kdevops_enable_iscsi or kdevops_nfsd_enable or kdevops_smbd_enable or kdevops_krb5_enable %}
[service]
{% if kdevops_enable_iscsi %}
{{ kdevops_hosts_prefix }}-iscsi
@@ -85,3 +86,4 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
{% endif %}
[service:vars]
ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+{% endif %}
diff --git a/playbooks/roles/gen_hosts/templates/gitr.j2 b/playbooks/roles/gen_hosts/templates/gitr.j2
index 7f9094d4..3f30a5fb 100644
--- a/playbooks/roles/gen_hosts/templates/gitr.j2
+++ b/playbooks/roles/gen_hosts/templates/gitr.j2
@@ -38,6 +38,7 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
[nfsd:vars]
ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
{% endif %}
+{% if kdevops_enable_iscsi or kdevops_nfsd_enable %}
[service]
{% if kdevops_enable_iscsi %}
{{ kdevops_hosts_prefix }}-iscsi
@@ -47,3 +48,4 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
{% endif %}
[service:vars]
ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+{% endif %}
diff --git a/playbooks/roles/gen_hosts/templates/hosts.j2 b/playbooks/roles/gen_hosts/templates/hosts.j2
index cdcd1883..e9441605 100644
--- a/playbooks/roles/gen_hosts/templates/hosts.j2
+++ b/playbooks/roles/gen_hosts/templates/hosts.j2
@@ -119,39 +119,30 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
[ai:vars]
ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
-{% set fs_configs = [] %}
+{# Individual section groups for multi-filesystem testing #}
+{% set section_names = [] %}
{% for node in all_generic_nodes %}
-{% set node_parts = node.split('-') %}
-{% if node_parts|length >= 3 %}
-{% set fs_type = node_parts[2] %}
-{% set fs_config = node_parts[3:] | select('ne', 'dev') | join('_') %}
-{% set fs_group = fs_type + '_' + fs_config if fs_config else fs_type %}
-{% if fs_group not in fs_configs %}
-{% set _ = fs_configs.append(fs_group) %}
+{% if not node.endswith('-dev') %}
+{% set section = node.replace(kdevops_host_prefix + '-ai-', '') %}
+{% if section != kdevops_host_prefix + '-ai' %}
+{% if section_names.append(section) %}{% endif %}
{% endif %}
{% endif %}
{% endfor %}
-{% for fs_group in fs_configs %}
-[ai_{{ fs_group }}]
-{% for node in all_generic_nodes %}
-{% set node_parts = node.split('-') %}
-{% if node_parts|length >= 3 %}
-{% set fs_type = node_parts[2] %}
-{% set fs_config = node_parts[3:] | select('ne', 'dev') | join('_') %}
-{% set node_fs_group = fs_type + '_' + fs_config if fs_config else fs_type %}
-{% if node_fs_group == fs_group %}
-{{ node }}
-{% endif %}
+{% for section in section_names %}
+[ai_{{ section | replace('-', '_') }}]
+{{ kdevops_host_prefix }}-ai-{{ section }}
+{% if kdevops_baseline_and_dev %}
+{{ kdevops_host_prefix }}-ai-{{ section }}-dev
{% endif %}
-{% endfor %}
-[ai_{{ fs_group }}:vars]
+[ai_{{ section | replace('-', '_') }}:vars]
ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
{% endfor %}
{% else %}
-{# Single-node AI hosts #}
+{# Single filesystem hosts (original behavior) #}
[all]
localhost ansible_connection=local
{{ kdevops_host_prefix }}-ai
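Reviewer note: the new hosts.j2 loop builds one inventory group per baseline node and adds the matching -dev node when A/B testing is enabled. The `{% if section_names.append(section) %}{% endif %}` construct is there only for its side effect: append() returns None, and mutating the list is how the value survives the Jinja2 for-loop scope. A rough Python equivalent of the grouping, under the assumption that node names follow the <prefix>-ai-<section>[-dev] pattern (inventory_groups() is a hypothetical name, not code from this patch):

```python
from typing import Dict, List


def inventory_groups(
    nodes: List[str], prefix: str, with_dev: bool
) -> Dict[str, List[str]]:
    """Map node names to ai_<section> inventory groups."""
    groups: Dict[str, List[str]] = {}
    for node in nodes:
        # Dev nodes are listed under their baseline's group, not their own.
        if node.endswith("-dev"):
            continue
        section = node.replace(f"{prefix}-ai-", "")
        # A bare "<prefix>-ai" node has no section suffix, so skip it.
        if section == f"{prefix}-ai":
            continue
        members = [f"{prefix}-ai-{section}"]
        if with_dev:
            members.append(f"{prefix}-ai-{section}-dev")
        # Group names use underscores where node names use dashes.
        groups["ai_" + section.replace("-", "_")] = members
    return groups


nodes = [
    "debian13-ai-xfs-4k-4ks",
    "debian13-ai-xfs-4k-4ks-dev",
    "debian13-ai-ext4-4k",
    "debian13-ai-ext4-4k-dev",
]
for group, members in inventory_groups(nodes, "debian13", True).items():
    print(f"[{group}]")
    print("\n".join(members))
```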
diff --git a/playbooks/roles/gen_hosts/templates/nfstest.j2 b/playbooks/roles/gen_hosts/templates/nfstest.j2
index e427ac34..709d871d 100644
--- a/playbooks/roles/gen_hosts/templates/nfstest.j2
+++ b/playbooks/roles/gen_hosts/templates/nfstest.j2
@@ -38,6 +38,7 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
[nfsd:vars]
ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
{% endif %}
+{% if kdevops_enable_iscsi or kdevops_nfsd_enable %}
[service]
{% if kdevops_enable_iscsi %}
{{ kdevops_hosts_prefix }}-iscsi
@@ -47,3 +48,4 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
{% endif %}
[service:vars]
ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+{% endif %}
diff --git a/playbooks/roles/gen_hosts/templates/pynfs.j2 b/playbooks/roles/gen_hosts/templates/pynfs.j2
index 85c87dae..55add4d1 100644
--- a/playbooks/roles/gen_hosts/templates/pynfs.j2
+++ b/playbooks/roles/gen_hosts/templates/pynfs.j2
@@ -23,6 +23,7 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
{{ kdevops_hosts_prefix }}-nfsd
[nfsd:vars]
ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+{% if true %}
[service]
{% if kdevops_enable_iscsi %}
{{ kdevops_hosts_prefix }}-iscsi
@@ -30,3 +31,4 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
{{ kdevops_hosts_prefix }}-nfsd
[service:vars]
ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+{% endif %}
diff --git a/playbooks/roles/gen_nodes/tasks/main.yml b/playbooks/roles/gen_nodes/tasks/main.yml
index d54977be..b294d294 100644
--- a/playbooks/roles/gen_nodes/tasks/main.yml
+++ b/playbooks/roles/gen_nodes/tasks/main.yml
@@ -658,6 +658,7 @@
- kdevops_workflow_enable_ai
- ansible_nodes_template.stat.exists
- not kdevops_baseline_and_dev
+ - not ai_enable_multifs_testing|default(false)|bool
- name: Generate the AI kdevops nodes file with dev hosts using {{ kdevops_nodes_template }} as jinja2 source template
tags: ['hosts']
@@ -675,6 +676,95 @@
- kdevops_workflow_enable_ai
- ansible_nodes_template.stat.exists
- kdevops_baseline_and_dev
+ - not ai_enable_multifs_testing|default(false)|bool
+
+- name: Infer enabled AI multi-filesystem configurations
+ vars:
+ kdevops_config_data: "{{ lookup('file', topdir_path + '/.config') }}"
+ # Find all enabled AI multifs configurations
+ xfs_configs: >-
+ {{
+ kdevops_config_data | regex_findall('^CONFIG_AI_MULTIFS_XFS_(.*)=y$', multiline=True)
+ | map('lower')
+ | map('regex_replace', '_', '-')
+ | map('regex_replace', '^', 'xfs-')
+ | list
+ if kdevops_config_data | regex_search('^CONFIG_AI_MULTIFS_TEST_XFS=y$', multiline=True)
+ else []
+ }}
+ ext4_configs: >-
+ {{
+ kdevops_config_data | regex_findall('^CONFIG_AI_MULTIFS_EXT4_(.*)=y$', multiline=True)
+ | map('lower')
+ | map('regex_replace', '_', '-')
+ | map('regex_replace', '^', 'ext4-')
+ | list
+ if kdevops_config_data | regex_search('^CONFIG_AI_MULTIFS_TEST_EXT4=y$', multiline=True)
+ else []
+ }}
+ btrfs_configs: >-
+ {{
+ kdevops_config_data | regex_findall('^CONFIG_AI_MULTIFS_BTRFS_(.*)=y$', multiline=True)
+ | map('lower')
+ | map('regex_replace', '_', '-')
+ | map('regex_replace', '^', 'btrfs-')
+ | list
+ if kdevops_config_data | regex_search('^CONFIG_AI_MULTIFS_TEST_BTRFS=y$', multiline=True)
+ else []
+ }}
+ set_fact:
+ ai_multifs_enabled_configs: "{{ (xfs_configs + ext4_configs + btrfs_configs) | unique }}"
+ when:
+ - kdevops_workflows_dedicated_workflow
+ - kdevops_workflow_enable_ai
+ - ai_enable_multifs_testing|default(false)|bool
+ - ansible_nodes_template.stat.exists
+
+- name: Create AI nodes for each filesystem configuration (no dev)
+ vars:
+ filesystem_nodes: "{{ [kdevops_host_prefix + '-ai-'] | product(ai_multifs_enabled_configs | default([])) | map('join') | list }}"
+ set_fact:
+ ai_enabled_section_types: "{{ filesystem_nodes }}"
+ when:
+ - kdevops_workflows_dedicated_workflow
+ - kdevops_workflow_enable_ai
+ - ai_enable_multifs_testing|default(false)|bool
+ - ansible_nodes_template.stat.exists
+ - not kdevops_baseline_and_dev
+ - ai_multifs_enabled_configs is defined
+ - ai_multifs_enabled_configs | length > 0
+
+- name: Create AI nodes for each filesystem configuration with dev hosts
+ vars:
+ filesystem_nodes: "{{ [kdevops_host_prefix + '-ai-'] | product(ai_multifs_enabled_configs | default([])) | map('join') | list }}"
+ set_fact:
+ ai_enabled_section_types: "{{ filesystem_nodes | product(['', '-dev']) | map('join') | list }}"
+ when:
+ - kdevops_workflows_dedicated_workflow
+ - kdevops_workflow_enable_ai
+ - ai_enable_multifs_testing|default(false)|bool
+ - ansible_nodes_template.stat.exists
+ - kdevops_baseline_and_dev
+ - ai_multifs_enabled_configs is defined
+ - ai_multifs_enabled_configs | length > 0
+
+- name: Generate the AI multi-filesystem kdevops nodes file using {{ kdevops_nodes_template }} as jinja2 source template
+ tags: [ 'hosts' ]
+ vars:
+ node_template: "{{ kdevops_nodes_template | basename }}"
+ nodes: "{{ ai_enabled_section_types | regex_replace('\\[') | regex_replace('\\]') | replace(\"'\", '') | split(', ') }}"
+ all_generic_nodes: "{{ ai_enabled_section_types }}"
+ template:
+ src: "{{ node_template }}"
+ dest: "{{ topdir_path }}/{{ kdevops_nodes }}"
+ force: yes
+ when:
+ - kdevops_workflows_dedicated_workflow
+ - kdevops_workflow_enable_ai
+ - ai_enable_multifs_testing|default(false)|bool
+ - ansible_nodes_template.stat.exists
+ - ai_enabled_section_types is defined
+ - ai_enabled_section_types | length > 0
- name: Get the control host's timezone
ansible.builtin.command: "timedatectl show -p Timezone --value"
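Reviewer note: the gen_nodes tasks above turn enabled CONFIG_AI_MULTIFS_* symbols from .config into node names. The Python sketch below mirrors the regex_findall / lower / regex_replace filter chain and the product() with ['', '-dev']. The function names and the sample .config text are assumptions made for illustration. The final BTRFS line would not normally appear while its gate is unset and is included only to show the gating.

```python
import re
from itertools import product
from typing import List


def enabled_configs(config_text: str, fs: str) -> List[str]:
    """Mimic the xfs_configs / ext4_configs / btrfs_configs expressions."""
    upper = fs.upper()
    # The per-filesystem gate must be set, otherwise the list is empty.
    if not re.search(rf"^CONFIG_AI_MULTIFS_TEST_{upper}=y$", config_text, re.M):
        return []
    found = re.findall(rf"^CONFIG_AI_MULTIFS_{upper}_(.*)=y$", config_text, re.M)
    # 4K_4KS -> 4k-4ks -> xfs-4k-4ks
    return [f"{fs}-{name.lower().replace('_', '-')}" for name in found]


def node_names(prefix: str, configs: List[str], with_dev: bool) -> List[str]:
    """Build baseline node names and, optionally, their -dev twins."""
    base = [f"{prefix}-ai-{cfg}" for cfg in configs]
    if not with_dev:
        return base
    return [name + suffix for name, suffix in product(base, ["", "-dev"])]


SAMPLE_CONFIG = """
CONFIG_AI_MULTIFS_TEST_XFS=y
CONFIG_AI_MULTIFS_XFS_4K_4KS=y
CONFIG_AI_MULTIFS_XFS_16K_4KS=y
CONFIG_AI_MULTIFS_TEST_EXT4=y
CONFIG_AI_MULTIFS_EXT4_16K_BIGALLOC=y
# CONFIG_AI_MULTIFS_TEST_BTRFS is not set
CONFIG_AI_MULTIFS_BTRFS_DEFAULT=y
"""

configs = []
for fs in ("xfs", "ext4", "btrfs"):
    configs += enabled_configs(SAMPLE_CONFIG, fs)

for name in node_names("debian13", configs, with_dev=True):
    print(name)
```

Note that node names produced this way carry the full profile suffix (for example xfs-4k-4ks), since the suffix is derived directly from the Kconfig symbol name.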
diff --git a/playbooks/roles/guestfs/tasks/bringup/main.yml b/playbooks/roles/guestfs/tasks/bringup/main.yml
index c131de25..bd9f5260 100644
--- a/playbooks/roles/guestfs/tasks/bringup/main.yml
+++ b/playbooks/roles/guestfs/tasks/bringup/main.yml
@@ -1,11 +1,16 @@
---
- name: List defined libvirt guests
run_once: true
+ delegate_to: localhost
community.libvirt.virt:
command: list_vms
uri: "{{ libvirt_uri }}"
register: defined_vms
+- name: Debug defined VMs
+ debug:
+ msg: "Hostname: {{ inventory_hostname }}, Defined VMs: {{ hostvars['localhost']['defined_vms']['list_vms'] | default([]) }}, Check: {{ inventory_hostname not in (hostvars['localhost']['defined_vms']['list_vms'] | default([])) }}"
+
- name: Provision each target node
when:
- "inventory_hostname not in defined_vms.list_vms"
@@ -25,10 +30,13 @@
path: "{{ ssh_key_dir }}"
state: directory
mode: "u=rwx"
+ delegate_to: localhost
- name: Generate fresh keys for each target node
ansible.builtin.command:
cmd: 'ssh-keygen -q -t ed25519 -f {{ ssh_key }} -N ""'
+ creates: "{{ ssh_key }}"
+ delegate_to: localhost
- name: Set the pathname of the root disk image for each target node
ansible.builtin.set_fact:
@@ -38,15 +46,18 @@
ansible.builtin.file:
path: "{{ storagedir }}/{{ inventory_hostname }}"
state: directory
+ delegate_to: localhost
- name: Duplicate the root disk image for each target node
ansible.builtin.command:
cmd: "cp --reflink=auto {{ base_image }} {{ root_image }}"
+ delegate_to: localhost
- name: Get the timezone of the control host
ansible.builtin.command:
cmd: "timedatectl show -p Timezone --value"
register: host_timezone
+ delegate_to: localhost
- name: Build the root image for each target node (as root)
become: true
@@ -103,6 +114,7 @@
name: "{{ inventory_hostname }}"
xml: "{{ lookup('file', xml_file) }}"
uri: "{{ libvirt_uri }}"
+ delegate_to: localhost
- name: Find PCIe passthrough devices
ansible.builtin.find:
@@ -110,6 +122,7 @@
file_type: file
patterns: "pcie_passthrough_*.xml"
register: passthrough_devices
+ delegate_to: localhost
- name: Attach PCIe passthrough devices to each target node
environment:
@@ -124,6 +137,7 @@
loop: "{{ passthrough_devices.files }}"
loop_control:
label: "Doing PCI-E passthrough for device {{ item }}"
+ delegate_to: localhost
when:
- passthrough_devices.matched > 0
@@ -142,3 +156,4 @@
name: "{{ inventory_hostname }}"
uri: "{{ libvirt_uri }}"
state: running
+ delegate_to: localhost
diff --git a/scripts/guestfs.Makefile b/scripts/guestfs.Makefile
index bd03f58c..f6c350a4 100644
--- a/scripts/guestfs.Makefile
+++ b/scripts/guestfs.Makefile
@@ -79,7 +79,7 @@ bringup_guestfs: $(GUESTFS_BRINGUP_DEPS)
--extra-vars=@./extra_vars.yaml \
--tags network,pool,base_image
$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
- --limit 'baseline:dev:service' \
+ --limit 'baseline:dev:service:ai' \
playbooks/guestfs.yml \
--extra-vars=@./extra_vars.yaml \
--tags bringup
diff --git a/workflows/ai/Kconfig b/workflows/ai/Kconfig
index 2ffc6b65..d04570d8 100644
--- a/workflows/ai/Kconfig
+++ b/workflows/ai/Kconfig
@@ -161,4 +161,17 @@ config AI_BENCHMARK_ITERATIONS
# Docker storage configuration
source "workflows/ai/Kconfig.docker-storage"
+# Multi-filesystem configuration
+config AI_MULTIFS_ENABLE
+ bool "Enable multi-filesystem benchmarking"
+ output yaml
+ default n
+ help
+ Run AI benchmarks across multiple filesystem configurations
+ to compare performance characteristics.
+
+if AI_MULTIFS_ENABLE
+source "workflows/ai/Kconfig.multifs"
+endif
+
endif # KDEVOPS_WORKFLOW_ENABLE_AI
diff --git a/workflows/ai/Kconfig.fs b/workflows/ai/Kconfig.fs
new file mode 100644
index 00000000..a95d02c6
--- /dev/null
+++ b/workflows/ai/Kconfig.fs
@@ -0,0 +1,118 @@
+menu "Target filesystem to use"
+
+choice
+ prompt "Target filesystem"
+ default AI_FILESYSTEM_XFS
+
+config AI_FILESYSTEM_XFS
+ bool "xfs"
+ select HAVE_SUPPORTS_PURE_IOMAP if BOOTLINUX_TREE_LINUS || BOOTLINUX_TREE_STABLE
+ help
+ This will target testing AI workloads on top of XFS.
+ XFS provides excellent performance for large datasets
+ and is commonly used in high-performance computing.
+
+config AI_FILESYSTEM_BTRFS
+ bool "btrfs"
+ help
+ This will target testing AI workloads on top of btrfs.
+ Btrfs provides features like snapshots and compression
+ which can be useful for AI dataset management.
+
+config AI_FILESYSTEM_EXT4
+ bool "ext4"
+ help
+ This will target testing AI workloads on top of ext4.
+ Ext4 is widely supported and provides reliable performance
+ for AI workloads.
+
+endchoice
+
+config AI_FILESYSTEM
+ string
+ output yaml
+ default "xfs" if AI_FILESYSTEM_XFS
+ default "btrfs" if AI_FILESYSTEM_BTRFS
+ default "ext4" if AI_FILESYSTEM_EXT4
+
+config AI_FSTYPE
+ string
+ output yaml
+ default "xfs" if AI_FILESYSTEM_XFS
+ default "btrfs" if AI_FILESYSTEM_BTRFS
+ default "ext4" if AI_FILESYSTEM_EXT4
+
+if AI_FILESYSTEM_XFS
+
+menu "XFS configuration"
+
+config AI_XFS_MKFS_OPTS
+ string "mkfs.xfs options"
+ output yaml
+ default "-f -s size=4096"
+ help
+ Additional options to pass to mkfs.xfs when creating
+ the filesystem for AI workloads.
+
+config AI_XFS_MOUNT_OPTS
+ string "XFS mount options"
+ output yaml
+ default "rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota"
+ help
+ Mount options for XFS filesystem. These options are
+ optimized for AI workloads with large sequential I/O.
+
+endmenu
+
+endif # AI_FILESYSTEM_XFS
+
+if AI_FILESYSTEM_BTRFS
+
+menu "Btrfs configuration"
+
+config AI_BTRFS_MKFS_OPTS
+ string "mkfs.btrfs options"
+ output yaml
+ default "-f"
+ help
+ Additional options to pass to mkfs.btrfs when creating
+ the filesystem for AI workloads.
+
+config AI_BTRFS_MOUNT_OPTS
+ string "Btrfs mount options"
+ output yaml
+ default "rw,relatime,compress=zstd,space_cache=v2"
+ help
+ Mount options for Btrfs filesystem. Zstd compression
+ can help with AI datasets while maintaining performance.
+
+endmenu
+
+endif # AI_FILESYSTEM_BTRFS
+
+if AI_FILESYSTEM_EXT4
+
+menu "Ext4 configuration"
+
+config AI_EXT4_MKFS_OPTS
+ string "mkfs.ext4 options"
+ output yaml
+ default "-F"
+ help
+ Additional options to pass to mkfs.ext4 when creating
+ the filesystem for AI workloads.
+
+config AI_EXT4_MOUNT_OPTS
+ string "Ext4 mount options"
+ output yaml
+ default "rw,relatime,data=ordered"
+ help
+ Mount options for Ext4 filesystem optimized for
+ AI workload patterns.
+
+endmenu
+
+endif # AI_FILESYSTEM_EXT4
+
+
+endmenu
diff --git a/workflows/ai/Kconfig.multifs b/workflows/ai/Kconfig.multifs
new file mode 100644
index 00000000..2b72dd6c
--- /dev/null
+++ b/workflows/ai/Kconfig.multifs
@@ -0,0 +1,184 @@
+menu "Multi-filesystem testing configuration"
+
+config AI_ENABLE_MULTIFS_TESTING
+ bool "Enable multi-filesystem testing"
+ default n
+ output yaml
+ help
+ Enable testing the same AI workload across multiple filesystem
+ configurations. This allows comparing performance characteristics
+ between different filesystems and their configurations.
+
+ When enabled, the AI benchmark will run sequentially across all
+ selected filesystem configurations, allowing for detailed
+ performance analysis across different storage backends.
+
+if AI_ENABLE_MULTIFS_TESTING
+
+config AI_MULTIFS_TEST_XFS
+ bool "Test XFS configurations"
+ default y
+ output yaml
+ help
+ Enable testing AI workloads on XFS filesystem with different
+ block size configurations.
+
+if AI_MULTIFS_TEST_XFS
+
+menu "XFS configuration profiles"
+
+config AI_MULTIFS_XFS_4K_4KS
+ bool "XFS 4k block size - 4k sector size"
+ default y
+ output yaml
+ help
+ Test AI workloads on XFS with 4k filesystem block size
+ and 4k sector size. This is the most common configuration
+ and provides good performance for most workloads.
+
+config AI_MULTIFS_XFS_16K_4KS
+ bool "XFS 16k block size - 4k sector size"
+ default y
+ output yaml
+ help
+ Test AI workloads on XFS with 16k filesystem block size
+ and 4k sector size. Larger block sizes can improve performance
+ for sequential I/O patterns common in AI workloads.
+
+config AI_MULTIFS_XFS_32K_4KS
+ bool "XFS 32k block size - 4k sector size"
+ default y
+ output yaml
+ help
+ Test AI workloads on XFS with 32k filesystem block size
+ and 4k sector size. Even larger block sizes can provide
+ benefits for large sequential I/O operations typical in
+ AI vector database workloads.
+
+config AI_MULTIFS_XFS_64K_4KS
+ bool "XFS 64k block size - 4k sector size"
+ default y
+ output yaml
+ help
+ Test AI workloads on XFS with 64k filesystem block size
+ and 4k sector size. Maximum supported block size for XFS,
+ optimized for very large file operations and high-throughput
+ AI workloads with substantial data transfers.
+
+endmenu
+
+endif # AI_MULTIFS_TEST_XFS
+
+config AI_MULTIFS_TEST_EXT4
+ bool "Test ext4 configurations"
+ default y
+ output yaml
+ help
+ Enable testing AI workloads on ext4 filesystem with different
+ configurations including bigalloc options.
+
+if AI_MULTIFS_TEST_EXT4
+
+menu "ext4 configuration profiles"
+
+config AI_MULTIFS_EXT4_4K
+ bool "ext4 4k block size"
+ default y
+ output yaml
+ help
+ Test AI workloads on ext4 with standard 4k block size.
+ This is the default ext4 configuration.
+
+config AI_MULTIFS_EXT4_16K_BIGALLOC
+ bool "ext4 16k bigalloc"
+ default y
+ output yaml
+ help
+ Test AI workloads on ext4 with 16k bigalloc enabled.
+ Bigalloc reduces metadata overhead and can improve
+ performance for large file workloads.
+
+endmenu
+
+endif # AI_MULTIFS_TEST_EXT4
+
+config AI_MULTIFS_TEST_BTRFS
+ bool "Test btrfs configurations"
+ default y
+ output yaml
+ help
+ Enable testing AI workloads on the btrfs filesystem with
+ the common default configuration profile.
+
+if AI_MULTIFS_TEST_BTRFS
+
+menu "btrfs configuration profiles"
+
+config AI_MULTIFS_BTRFS_DEFAULT
+ bool "btrfs default profile"
+ default y
+ output yaml
+ help
+ Test AI workloads on btrfs with default configuration.
+ This includes modern defaults with free-space-tree and
+ no-holes features enabled.
+
+endmenu
+
+endif # AI_MULTIFS_TEST_BTRFS
+
+config AI_MULTIFS_RESULTS_DIR
+ string "Multi-filesystem results directory"
+ output yaml
+ default "/data/ai-multifs-benchmark"
+ help
+ Directory where multi-filesystem test results and logs will be stored.
+ Each filesystem configuration will have its own subdirectory.
+
+config AI_MILVUS_STORAGE_ENABLE
+ bool "Enable dedicated Milvus storage with filesystem matching node profile"
+ default y
+ output yaml
+ help
+ Configure a dedicated storage device for Milvus data including
+ vector data (MinIO), metadata (etcd), and local cache. The filesystem
+ type will automatically match the node's configuration profile.
+
+config AI_MILVUS_DEVICE
+ string "Device to use for Milvus storage"
+ output yaml
+ default "/dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_kdevops3" if LIBVIRT && LIBVIRT_EXTRA_STORAGE_DRIVE_NVME
+ default "/dev/disk/by-id/virtio-kdevops3" if LIBVIRT && LIBVIRT_EXTRA_STORAGE_DRIVE_VIRTIO
+ default "/dev/disk/by-id/ata-QEMU_HARDDISK_kdevops3" if LIBVIRT && LIBVIRT_EXTRA_STORAGE_DRIVE_IDE
+ default "/dev/nvme3n1" if TERRAFORM_AWS_INSTANCE_M5AD_2XLARGE
+ default "/dev/nvme3n1" if TERRAFORM_AWS_INSTANCE_M5AD_4XLARGE
+ default "/dev/nvme3n1" if TERRAFORM_GCE
+ default "/dev/sde" if TERRAFORM_AZURE
+ default TERRAFORM_OCI_SPARSE_VOLUME_DEVICE_FILE_NAME if TERRAFORM_OCI
+ help
+ The device to use for Milvus storage. This device will be
+ formatted with the filesystem type matching the node's profile
+ and mounted at /data/milvus.
+
+config AI_MILVUS_MOUNT_POINT
+ string "Mount point for Milvus storage"
+ output yaml
+ default "/data/milvus"
+ help
+ The path where the Milvus storage filesystem will be mounted.
+ All Milvus data directories (data/, etcd/, minio/) will be
+ created under this mount point.
+
+config AI_MILVUS_USE_NODE_FS
+ bool "Automatically detect filesystem type from node name"
+ default y
+ output yaml
+ help
+ When enabled, the filesystem type for Milvus storage will be
+ automatically determined based on the node's configuration name.
+ For example, nodes named *-xfs-* will use XFS, *-ext4-* will
+ use ext4, and *-btrfs-* will use Btrfs.
+
+endif # AI_ENABLE_MULTIFS_TESTING
+
+endmenu
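Reviewer note: AI_MILVUS_USE_NODE_FS describes deriving the filesystem type from the node name (*-xfs-*, *-ext4-*, *-btrfs-*). The role code that does this is not shown in this part of the patch, so the following is only a guess at the shape of that lookup. fs_from_node_name() is a hypothetical helper and the node names are examples.

```python
import re
from typing import Optional

# Filesystem tokens named in the AI_MILVUS_USE_NODE_FS help text.
_FS_PATTERN = re.compile(r"-(xfs|ext4|btrfs)(?:-|$)")


def fs_from_node_name(node: str) -> Optional[str]:
    """Return the filesystem token embedded in a node name, if any."""
    match = _FS_PATTERN.search(node)
    return match.group(1) if match else None


for node in (
    "debian13-ai-xfs-16k-4ks",
    "debian13-ai-ext4-16k-bigalloc-dev",
    "debian13-ai-btrfs-default",
    "debian13-ai",
):
    print(node, "->", fs_from_node_name(node))
```

Anchoring the token between dashes avoids matching a hostname prefix that merely contains one of the filesystem names as a substring.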
diff --git a/workflows/ai/scripts/analysis_config.json b/workflows/ai/scripts/analysis_config.json
index 2f90f4d5..5f0a9328 100644
--- a/workflows/ai/scripts/analysis_config.json
+++ b/workflows/ai/scripts/analysis_config.json
@@ -2,5 +2,5 @@
"enable_graphing": true,
"graph_format": "png",
"graph_dpi": 150,
- "graph_theme": "seaborn"
+ "graph_theme": "default"
}
diff --git a/workflows/ai/scripts/analyze_results.py b/workflows/ai/scripts/analyze_results.py
index 3d11fb11..2dc4a1d6 100755
--- a/workflows/ai/scripts/analyze_results.py
+++ b/workflows/ai/scripts/analyze_results.py
@@ -226,6 +226,68 @@ class ResultsAnalyzer:
return fs_info
+ def _extract_filesystem_config(
+ self, result: Dict[str, Any]
+ ) -> tuple[str, str, str]:
+ """Extract filesystem type and block size from result data.
+ Returns (fs_type, block_size, config_key)"""
+ filename = result.get("_file", "")
+
+ # Primary: Extract filesystem type from filename (more reliable than JSON)
+ fs_type = "unknown"
+ block_size = "default"
+
+ if "xfs" in filename:
+ fs_type = "xfs"
+ # Check larger sizes first: "4k-" is also a substring of "64k-"
+ if "64k-" in filename:
+ block_size = "64k"
+ elif "32k-" in filename:
+ block_size = "32k"
+ elif "16k-" in filename:
+ block_size = "16k"
+ elif "4k-" in filename:
+ block_size = "4k"
+ elif "ext4" in filename:
+ fs_type = "ext4"
+ if "16k" in filename:
+ block_size = "16k"
+ elif "4k" in filename:
+ block_size = "4k"
+ elif "btrfs" in filename:
+ fs_type = "btrfs"
+ block_size = "default"
+ else:
+ # Fallback to JSON data if filename parsing fails
+ fs_type = result.get("filesystem", "unknown")
+ self.logger.warning(
+ f"Could not determine filesystem from filename {filename}, using JSON data: {fs_type}"
+ )
+
+ config_key = f"{fs_type}-{block_size}" if block_size != "default" else fs_type
+ return fs_type, block_size, config_key
+
+ def _extract_node_info(self, result: Dict[str, Any]) -> tuple[str, bool]:
+ """Extract node hostname and determine if it's a dev node.
+ Returns (hostname, is_dev_node)"""
+ # Get hostname from system_info (preferred) or fall back to filename
+ system_info = result.get("system_info", {})
+ hostname = system_info.get("hostname", "")
+
+ # If no hostname in system_info, try extracting from filename
+ if not hostname:
+ filename = result.get("_file", "")
+ # Remove results_ prefix and .json suffix
+ hostname = filename.replace("results_", "").replace(".json", "")
+ # Remove iteration number if present (_1, _2, etc.)
+ if "_" in hostname and hostname.split("_")[-1].isdigit():
+ hostname = "_".join(hostname.split("_")[:-1])
+
+ # Determine if this is a dev node
+ is_dev = hostname.endswith("-dev")
+
+ return hostname, is_dev
+
def load_results(self) -> bool:
"""Load all result files from the results directory"""
try:
@@ -391,6 +453,8 @@ class ResultsAnalyzer:
html.append(
" .highlight { background-color: #fff3cd; padding: 10px; border-radius: 3px; }"
)
+ html.append(" .baseline-row { background-color: #e8f5e9; }")
+ html.append(" .dev-row { background-color: #e3f2fd; }")
html.append(" </style>")
html.append("</head>")
html.append("<body>")
@@ -486,26 +550,69 @@ class ResultsAnalyzer:
else:
html.append(" <p>No storage device information available.</p>")
- # Filesystem section
- html.append(" <h3>🗂️ Filesystem Configuration</h3>")
- fs_info = self.system_info.get("filesystem_info", {})
- html.append(" <table class='config-table'>")
- html.append(
- " <tr><td>Filesystem Type</td><td>"
- + str(fs_info.get("filesystem_type", "Unknown"))
- + "</td></tr>"
- )
- html.append(
- " <tr><td>Mount Point</td><td>"
- + str(fs_info.get("mount_point", "Unknown"))
- + "</td></tr>"
- )
- html.append(
- " <tr><td>Mount Options</td><td>"
- + str(fs_info.get("mount_options", "Unknown"))
- + "</td></tr>"
- )
- html.append(" </table>")
+ # Node Configuration section - Extract from actual benchmark results
+ html.append(" <h3>🗂️ Node Configuration</h3>")
+
+ # Collect node and filesystem information from benchmark results
+ node_configs = {}
+ for result in self.results_data:
+ # Extract node information
+ hostname, is_dev = self._extract_node_info(result)
+ fs_type, block_size, config_key = self._extract_filesystem_config(
+ result
+ )
+
+ system_info = result.get("system_info", {})
+ data_path = system_info.get("data_path", "/data/milvus")
+ mount_point = system_info.get("mount_point", "/data")
+ kernel_version = system_info.get("kernel_version", "unknown")
+
+ if hostname not in node_configs:
+ node_configs[hostname] = {
+ "hostname": hostname,
+ "node_type": "Development" if is_dev else "Baseline",
+ "filesystem": fs_type,
+ "block_size": block_size,
+ "data_path": data_path,
+ "mount_point": mount_point,
+ "kernel": kernel_version,
+ "test_count": 0,
+ }
+ node_configs[hostname]["test_count"] += 1
+
+ if node_configs:
+ html.append(" <table class='config-table'>")
+ html.append(
+ " <tr><th>Node</th><th>Type</th><th>Filesystem</th><th>Block Size</th><th>Data Path</th><th>Mount Point</th><th>Kernel</th><th>Tests</th></tr>"
+ )
+ # Sort nodes with baseline first, then dev
+ sorted_nodes = sorted(
+ node_configs.items(),
+ key=lambda x: (x[1]["node_type"] != "Baseline", x[0]),
+ )
+ for hostname, config_info in sorted_nodes:
+ row_class = (
+ "dev-row"
+ if config_info["node_type"] == "Development"
+ else "baseline-row"
+ )
+ html.append(f" <tr class='{row_class}'>")
+ html.append(f" <td><strong>{hostname}</strong></td>")
+ html.append(f" <td>{config_info['node_type']}</td>")
+ html.append(f" <td>{config_info['filesystem']}</td>")
+ html.append(f" <td>{config_info['block_size']}</td>")
+ html.append(f" <td>{config_info['data_path']}</td>")
+ html.append(
+ f" <td>{config_info['mount_point']}</td>"
+ )
+ html.append(f" <td>{config_info['kernel']}</td>")
+ html.append(f" <td>{config_info['test_count']}</td>")
+ html.append(" </tr>")
+ html.append(" </table>")
+ else:
+ html.append(
+ " <p>No node configuration data found in results.</p>"
+ )
html.append(" </div>")
# Test Configuration Section
@@ -551,92 +658,192 @@ class ResultsAnalyzer:
html.append(" </table>")
html.append(" </div>")
- # Performance Results Section
+ # Performance Results Section - Per Node
html.append(" <div class='section'>")
- html.append(" <h2>📊 Performance Results Summary</h2>")
+ html.append(" <h2>📊 Performance Results by Node</h2>")
if self.results_data:
- # Insert performance
- insert_times = [
- r.get("insert_performance", {}).get("total_time_seconds", 0)
- for r in self.results_data
- ]
- insert_rates = [
- r.get("insert_performance", {}).get("vectors_per_second", 0)
- for r in self.results_data
- ]
-
- if insert_times and any(t > 0 for t in insert_times):
- html.append(" <h3>📈 Vector Insert Performance</h3>")
- html.append(" <table class='metric-table'>")
- html.append(
- f" <tr><td>Average Insert Time</td><td>{np.mean(insert_times):.2f} seconds</td></tr>"
- )
- html.append(
- f" <tr><td>Average Insert Rate</td><td>{np.mean(insert_rates):.2f} vectors/sec</td></tr>"
+ # Group results by node
+ node_performance = {}
+
+ for result in self.results_data:
+ # Use node hostname as the grouping key
+ hostname, is_dev = self._extract_node_info(result)
+ fs_type, block_size, config_key = self._extract_filesystem_config(
+ result
)
- html.append(
- f" <tr><td>Insert Rate Range</td><td>{np.min(insert_rates):.2f} - {np.max(insert_rates):.2f} vectors/sec</td></tr>"
- )
- html.append(" </table>")
- # Index performance
- index_times = [
- r.get("index_performance", {}).get("creation_time_seconds", 0)
- for r in self.results_data
- ]
- if index_times and any(t > 0 for t in index_times):
- html.append(" <h3>🔗 Index Creation Performance</h3>")
- html.append(" <table class='metric-table'>")
- html.append(
- f" <tr><td>Average Index Creation Time</td><td>{np.mean(index_times):.2f} seconds</td></tr>"
+ if hostname not in node_performance:
+ node_performance[hostname] = {
+ "hostname": hostname,
+ "node_type": "Development" if is_dev else "Baseline",
+ "insert_rates": [],
+ "insert_times": [],
+ "index_times": [],
+ "query_performance": {},
+ "filesystem": fs_type,
+ "block_size": block_size,
+ }
+
+ # Add insert performance
+ insert_perf = result.get("insert_performance", {})
+ if insert_perf:
+ rate = insert_perf.get("vectors_per_second", 0)
+ time = insert_perf.get("total_time_seconds", 0)
+ if rate > 0:
+ node_performance[hostname]["insert_rates"].append(rate)
+ if time > 0:
+ node_performance[hostname]["insert_times"].append(time)
+
+ # Add index performance
+ index_perf = result.get("index_performance", {})
+ if index_perf:
+ time = index_perf.get("creation_time_seconds", 0)
+ if time > 0:
+ node_performance[hostname]["index_times"].append(time)
+
+ # Collect query performance (use first result for each node)
+ query_perf = result.get("query_performance", {})
+ if (
+ query_perf
+ and not node_performance[hostname]["query_performance"]
+ ):
+ node_performance[hostname]["query_performance"] = query_perf
+
+ # Display results for each node, sorted with baseline first
+ sorted_nodes = sorted(
+ node_performance.items(),
+ key=lambda x: (x[1]["node_type"] != "Baseline", x[0]),
+ )
+ for hostname, perf_data in sorted_nodes:
+ node_type_badge = (
+ "🔵" if perf_data["node_type"] == "Development" else "🟢"
)
html.append(
- f" <tr><td>Index Time Range</td><td>{np.min(index_times):.2f} - {np.max(index_times):.2f} seconds</td></tr>"
+ f" <h3>{node_type_badge} {hostname} ({perf_data['node_type']})</h3>"
)
- html.append(" </table>")
-
- # Query performance
- html.append(" <h3>🔍 Query Performance</h3>")
- first_query_perf = self.results_data[0].get("query_performance", {})
- if first_query_perf:
- html.append(" <table>")
html.append(
- " <tr><th>Query Type</th><th>Batch Size</th><th>QPS</th><th>Avg Latency (ms)</th></tr>"
+ f" <p>Filesystem: {perf_data['filesystem']}, Block Size: {perf_data['block_size']}</p>"
)
- for topk, topk_data in first_query_perf.items():
- for batch, batch_data in topk_data.items():
- qps = batch_data.get("queries_per_second", 0)
- avg_time = batch_data.get("average_time_seconds", 0) * 1000
-
- # Color coding for performance
- qps_class = ""
- if qps > 1000:
- qps_class = "performance-good"
- elif qps > 100:
- qps_class = "performance-warning"
- else:
- qps_class = "performance-poor"
-
- html.append(f" <tr>")
- html.append(
- f" <td>{topk.replace('topk_', 'Top-')}</td>"
- )
- html.append(
- f" <td>{batch.replace('batch_', 'Batch ')}</td>"
- )
- html.append(
- f" <td class='{qps_class}'>{qps:.2f}</td>"
- )
- html.append(f" <td>{avg_time:.2f}</td>")
- html.append(f" </tr>")
+ # Insert performance
+ insert_rates = perf_data["insert_rates"]
+ if insert_rates:
+ html.append(" <h4>📈 Vector Insert Performance</h4>")
+ html.append(" <table class='metric-table'>")
+ html.append(
+ f" <tr><td>Average Insert Rate</td><td>{np.mean(insert_rates):.2f} vectors/sec</td></tr>"
+ )
+ html.append(
+ f" <tr><td>Insert Rate Range</td><td>{np.min(insert_rates):.2f} - {np.max(insert_rates):.2f} vectors/sec</td></tr>"
+ )
+ html.append(
+ f" <tr><td>Test Iterations</td><td>{len(insert_rates)}</td></tr>"
+ )
+ html.append(" </table>")
+
+ # Index performance
+ index_times = perf_data["index_times"]
+ if index_times:
+ html.append(" <h4>🔗 Index Creation Performance</h4>")
+ html.append(" <table class='metric-table'>")
+ html.append(
+ f" <tr><td>Average Index Creation Time</td><td>{np.mean(index_times):.3f} seconds</td></tr>"
+ )
+ html.append(
+ f" <tr><td>Index Time Range</td><td>{np.min(index_times):.3f} - {np.max(index_times):.3f} seconds</td></tr>"
+ )
+ html.append(" </table>")
+
+ # Query performance
+ query_perf = perf_data["query_performance"]
+ if query_perf:
+ html.append(" <h4>🔍 Query Performance</h4>")
+ html.append(" <table>")
+ html.append(
+ " <tr><th>Query Type</th><th>Batch Size</th><th>QPS</th><th>Avg Latency (ms)</th></tr>"
+ )
- html.append(" </table>")
+ for topk, topk_data in query_perf.items():
+ for batch, batch_data in topk_data.items():
+ qps = batch_data.get("queries_per_second", 0)
+ avg_time = (
+ batch_data.get("average_time_seconds", 0) * 1000
+ )
+
+ # Color coding for performance
+ qps_class = ""
+ if qps > 1000:
+ qps_class = "performance-good"
+ elif qps > 100:
+ qps_class = "performance-warning"
+ else:
+ qps_class = "performance-poor"
+
+ html.append(" <tr>")
+ html.append(
+ f" <td>{topk.replace('topk_', 'Top-')}</td>"
+ )
+ html.append(
+ f" <td>{batch.replace('batch_', 'Batch ')}</td>"
+ )
+ html.append(
+ f" <td class='{qps_class}'>{qps:.2f}</td>"
+ )
+ html.append(f" <td>{avg_time:.2f}</td>")
+ html.append(" </tr>")
+ html.append(" </table>")
+
+ html.append(" <br>") # Add spacing between configurations
- html.append(" </div>")
+ html.append(" </div>")
# Footer
+ # Performance Graphs Section
+ html.append(" <div class='section'>")
+ html.append(" <h2>📈 Performance Visualizations</h2>")
+ html.append(
+ " <p>The following graphs provide visual analysis of the benchmark results across all tested filesystem configurations:</p>"
+ )
+ html.append(" <ul>")
+ html.append(
+ " <li><strong>Insert Performance:</strong> Shows vector insertion rates and times for each filesystem configuration</li>"
+ )
+ html.append(
+ " <li><strong>Query Performance:</strong> Displays query performance heatmaps for different Top-K and batch sizes</li>"
+ )
+ html.append(
+ " <li><strong>Index Performance:</strong> Compares index creation times across filesystems</li>"
+ )
+ html.append(
+ " <li><strong>Performance Matrix:</strong> Comprehensive comparison matrix of all metrics</li>"
+ )
+ html.append(
+ " <li><strong>Filesystem Comparison:</strong> Side-by-side comparison of filesystem performance</li>"
+ )
+ html.append(" </ul>")
+ html.append(
+ " <p><em>Note: Graphs are generated as separate PNG files in the same directory as this report.</em></p>"
+ )
+ html.append(" <div style='margin-top: 20px;'>")
+ html.append(
+ " <img src='insert_performance.png' alt='Insert Performance' style='max-width: 100%; height: auto; margin-bottom: 20px;'>"
+ )
+ html.append(
+ " <img src='query_performance.png' alt='Query Performance' style='max-width: 100%; height: auto; margin-bottom: 20px;'>"
+ )
+ html.append(
+ " <img src='index_performance.png' alt='Index Performance' style='max-width: 100%; height: auto; margin-bottom: 20px;'>"
+ )
+ html.append(
+ " <img src='performance_matrix.png' alt='Performance Matrix' style='max-width: 100%; height: auto; margin-bottom: 20px;'>"
+ )
+ html.append(
+ " <img src='filesystem_comparison.png' alt='Filesystem Comparison' style='max-width: 100%; height: auto; margin-bottom: 20px;'>"
+ )
+ html.append(" </div>")
+ html.append(" </div>")
+
html.append(" <div class='section'>")
html.append(" <h2>📝 Notes</h2>")
html.append(" <ul>")
@@ -661,10 +868,11 @@ class ResultsAnalyzer:
return "\n".join(html)
except Exception as e:
- self.logger.error(f"Error generating HTML report: {e}")
- return (
- f"<html><body><h1>Error generating HTML report: {e}</h1></body></html>"
- )
+ import traceback
+ from html import escape # traceback text contains "<module>" and similar
+ tb = traceback.format_exc()
+ self.logger.error(f"Error generating HTML report: {e}\n{tb}")
+ return f"<html><body><h1>Error generating HTML report: {escape(str(e))}</h1><pre>{escape(tb)}</pre></body></html>"
def generate_graphs(self) -> bool:
"""Generate performance visualization graphs"""
@@ -691,6 +899,9 @@ class ResultsAnalyzer:
# Graph 4: Performance Comparison Matrix
self._plot_performance_matrix()
+ # Graph 5: Multi-filesystem Comparison (if applicable)
+ self._plot_filesystem_comparison()
+
self.logger.info("Graphs generated successfully")
return True
@@ -699,34 +910,188 @@ class ResultsAnalyzer:
return False
def _plot_insert_performance(self):
- """Plot insert performance metrics"""
- fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+ """Plot insert performance metrics with node differentiation"""
+ # Group data by node
+ node_performance = {}
- # Extract insert data
- iterations = []
- insert_rates = []
- insert_times = []
+ for result in self.results_data:
+ hostname, is_dev = self._extract_node_info(result)
+
+ if hostname not in node_performance:
+ node_performance[hostname] = {
+ "insert_rates": [],
+ "insert_times": [],
+ "iterations": [],
+ "is_dev": is_dev,
+ }
- for i, result in enumerate(self.results_data):
insert_perf = result.get("insert_performance", {})
if insert_perf:
- iterations.append(i + 1)
- insert_rates.append(insert_perf.get("vectors_per_second", 0))
- insert_times.append(insert_perf.get("total_time_seconds", 0))
-
- # Plot insert rate
- ax1.plot(iterations, insert_rates, "b-o", linewidth=2, markersize=6)
- ax1.set_xlabel("Iteration")
- ax1.set_ylabel("Vectors/Second")
- ax1.set_title("Vector Insert Rate Performance")
- ax1.grid(True, alpha=0.3)
-
- # Plot insert time
- ax2.plot(iterations, insert_times, "r-o", linewidth=2, markersize=6)
- ax2.set_xlabel("Iteration")
- ax2.set_ylabel("Total Time (seconds)")
- ax2.set_title("Vector Insert Time Performance")
- ax2.grid(True, alpha=0.3)
+ node_performance[hostname]["insert_rates"].append(
+ insert_perf.get("vectors_per_second", 0)
+ )
+ node_performance[hostname]["insert_times"].append(
+ insert_perf.get("total_time_seconds", 0)
+ )
+ node_performance[hostname]["iterations"].append(
+ len(node_performance[hostname]["insert_rates"])
+ )
+
+ # Check if we have multiple nodes
+ if len(node_performance) > 1:
+ # Multi-node mode: separate lines for each node
+ fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(18, 7))
+
+ # Sort nodes with baseline first, then dev
+ sorted_nodes = sorted(
+ node_performance.items(), key=lambda x: (x[1]["is_dev"], x[0])
+ )
+
+ # Create color palettes for baseline and dev nodes
+ baseline_colors = [
+ "#2E7D32",
+ "#43A047",
+ "#66BB6A",
+ "#81C784",
+ "#A5D6A7",
+ "#C8E6C9",
+ ] # Greens
+ dev_colors = [
+ "#0D47A1",
+ "#1565C0",
+ "#1976D2",
+ "#1E88E5",
+ "#2196F3",
+ "#42A5F5",
+ "#64B5F6",
+ ] # Blues
+
+ # Additional colors if needed
+ extra_colors = [
+ "#E65100",
+ "#F57C00",
+ "#FF9800",
+ "#FFB300",
+ "#FFC107",
+ "#FFCA28",
+ ] # Oranges
+
+ # Line styles to cycle through
+ line_styles = ["-", "--", "-.", ":"]
+ markers = ["o", "s", "^", "v", "D", "p", "*", "h"]
+
+ baseline_idx = 0
+ dev_idx = 0
+
+ # Use different colors and styles for each node
+ for idx, (hostname, perf_data) in enumerate(sorted_nodes):
+ if not perf_data["insert_rates"]:
+ continue
+
+ # Choose color and style based on node type and index
+ if perf_data["is_dev"]:
+ # Development nodes - blues
+ color = dev_colors[dev_idx % len(dev_colors)]
+ linestyle = line_styles[
+ (dev_idx // len(dev_colors)) % len(line_styles)
+ ]
+ marker = markers[4 + (dev_idx % 4)] # Use markers 4-7 for dev
+ label = f"{hostname} (Dev)"
+ dev_idx += 1
+ else:
+ # Baseline nodes - greens
+ color = baseline_colors[baseline_idx % len(baseline_colors)]
+ linestyle = line_styles[
+ (baseline_idx // len(baseline_colors)) % len(line_styles)
+ ]
+ marker = markers[
+ baseline_idx % 4
+ ] # Use first 4 markers for baseline
+ label = f"{hostname} (Baseline)"
+ baseline_idx += 1
+
+ iterations = list(range(1, len(perf_data["insert_rates"]) + 1))
+
+ # Plot insert rate with alpha for better visibility
+ ax1.plot(
+ iterations,
+ perf_data["insert_rates"],
+ color=color,
+ linestyle=linestyle,
+ marker=marker,
+ linewidth=1.5,
+ markersize=5,
+ label=label,
+ alpha=0.8,
+ )
+
+ # Plot insert time
+ ax2.plot(
+ iterations,
+ perf_data["insert_times"],
+ color=color,
+ linestyle=linestyle,
+ marker=marker,
+ linewidth=1.5,
+ markersize=5,
+ label=label,
+ alpha=0.8,
+ )
+
+ ax1.set_xlabel("Iteration")
+ ax1.set_ylabel("Vectors/Second")
+ ax1.set_title("Milvus Insert Rate by Node")
+ ax1.grid(True, alpha=0.3)
+ # Position legend outside plot area for better visibility with many nodes
+ ax1.legend(bbox_to_anchor=(1.05, 1), loc="upper left", fontsize=7, ncol=1)
+
+ ax2.set_xlabel("Iteration")
+ ax2.set_ylabel("Total Time (seconds)")
+ ax2.set_title("Milvus Insert Time by Node")
+ ax2.grid(True, alpha=0.3)
+ # Position legend outside plot area for better visibility with many nodes
+ ax2.legend(bbox_to_anchor=(1.05, 1), loc="upper left", fontsize=7, ncol=1)
+
+ plt.suptitle(
+ "Insert Performance Analysis: Baseline vs Development",
+ fontsize=14,
+ y=1.02,
+ )
+ else:
+ # Single node mode: original behavior
+ fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+
+ # Extract insert data from single node
+ hostname = list(node_performance.keys())[0] if node_performance else None
+ if hostname:
+ perf_data = node_performance[hostname]
+ iterations = list(range(1, len(perf_data["insert_rates"]) + 1))
+
+ # Plot insert rate
+ ax1.plot(
+ iterations,
+ perf_data["insert_rates"],
+ "b-o",
+ linewidth=2,
+ markersize=6,
+ )
+ ax1.set_xlabel("Iteration")
+ ax1.set_ylabel("Vectors/Second")
+ ax1.set_title(f"Vector Insert Rate Performance - {hostname}")
+ ax1.grid(True, alpha=0.3)
+
+ # Plot insert time
+ ax2.plot(
+ iterations,
+ perf_data["insert_times"],
+ "r-o",
+ linewidth=2,
+ markersize=6,
+ )
+ ax2.set_xlabel("Iteration")
+ ax2.set_ylabel("Total Time (seconds)")
+ ax2.set_title(f"Vector Insert Time Performance - {hostname}")
+ ax2.grid(True, alpha=0.3)
plt.tight_layout()
output_file = os.path.join(
@@ -739,52 +1104,110 @@ class ResultsAnalyzer:
plt.close()
def _plot_query_performance(self):
- """Plot query performance metrics"""
+ """Plot query performance metrics comparing baseline vs dev nodes"""
if not self.results_data:
return
- # Collect query performance data
- query_data = []
+ # Group data by filesystem configuration
+ fs_groups = {}
for result in self.results_data:
+ hostname, is_dev = self._extract_node_info(result)
+ fs_type, block_size, config_key = self._extract_filesystem_config(result)
+
+ if config_key not in fs_groups:
+ fs_groups[config_key] = {"baseline": [], "dev": []}
+
query_perf = result.get("query_performance", {})
- for topk, topk_data in query_perf.items():
- for batch, batch_data in topk_data.items():
- query_data.append(
- {
- "topk": topk.replace("topk_", ""),
- "batch": batch.replace("batch_", ""),
- "qps": batch_data.get("queries_per_second", 0),
- "avg_time": batch_data.get("average_time_seconds", 0)
- * 1000, # Convert to ms
- }
- )
+ if query_perf:
+ node_type = "dev" if is_dev else "baseline"
+ for topk, topk_data in query_perf.items():
+ for batch, batch_data in topk_data.items():
+ fs_groups[config_key][node_type].append(
+ {
+ "hostname": hostname,
+ "topk": topk.replace("topk_", ""),
+ "batch": batch.replace("batch_", ""),
+ "qps": batch_data.get("queries_per_second", 0),
+ "avg_time": batch_data.get("average_time_seconds", 0)
+ * 1000,
+ }
+ )
- if not query_data:
+ if not fs_groups:
return
- df = pd.DataFrame(query_data)
+ # Create subplots for each filesystem config
+ n_configs = len(fs_groups)
+ fig_height = max(8, 4 * n_configs)
+ fig, axes = plt.subplots(n_configs, 2, figsize=(16, fig_height))
- # Create subplots
- fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+ if n_configs == 1:
+ axes = axes.reshape(1, -1)
- # QPS heatmap
- qps_pivot = df.pivot_table(
- values="qps", index="topk", columns="batch", aggfunc="mean"
- )
- sns.heatmap(qps_pivot, annot=True, fmt=".1f", ax=ax1, cmap="YlOrRd")
- ax1.set_title("Queries Per Second (QPS)")
- ax1.set_xlabel("Batch Size")
- ax1.set_ylabel("Top-K")
-
- # Latency heatmap
- latency_pivot = df.pivot_table(
- values="avg_time", index="topk", columns="batch", aggfunc="mean"
- )
- sns.heatmap(latency_pivot, annot=True, fmt=".1f", ax=ax2, cmap="YlOrRd")
- ax2.set_title("Average Query Latency (ms)")
- ax2.set_xlabel("Batch Size")
- ax2.set_ylabel("Top-K")
+ for idx, (config_key, data) in enumerate(sorted(fs_groups.items())):
+ # Create DataFrames for baseline and dev
+ baseline_df = (
+ pd.DataFrame(data["baseline"]) if data["baseline"] else pd.DataFrame()
+ )
+ dev_df = pd.DataFrame(data["dev"]) if data["dev"] else pd.DataFrame()
+
+ # Baseline QPS heatmap
+ ax_base = axes[idx][0]
+ if not baseline_df.empty:
+ baseline_pivot = baseline_df.pivot_table(
+ values="qps", index="topk", columns="batch", aggfunc="mean"
+ )
+ sns.heatmap(
+ baseline_pivot,
+ annot=True,
+ fmt=".1f",
+ ax=ax_base,
+ cmap="Greens",
+ cbar_kws={"label": "QPS"},
+ )
+ ax_base.set_title(f"{config_key.upper()} - Baseline QPS")
+ ax_base.set_xlabel("Batch Size")
+ ax_base.set_ylabel("Top-K")
+ else:
+ ax_base.text(
+ 0.5,
+ 0.5,
+ f"No baseline data for {config_key}",
+ ha="center",
+ va="center",
+ transform=ax_base.transAxes,
+ )
+ ax_base.set_title(f"{config_key.upper()} - Baseline QPS")
+ # Dev QPS heatmap
+ ax_dev = axes[idx][1]
+ if not dev_df.empty:
+ dev_pivot = dev_df.pivot_table(
+ values="qps", index="topk", columns="batch", aggfunc="mean"
+ )
+ sns.heatmap(
+ dev_pivot,
+ annot=True,
+ fmt=".1f",
+ ax=ax_dev,
+ cmap="Blues",
+ cbar_kws={"label": "QPS"},
+ )
+ ax_dev.set_title(f"{config_key.upper()} - Development QPS")
+ ax_dev.set_xlabel("Batch Size")
+ ax_dev.set_ylabel("Top-K")
+ else:
+ ax_dev.text(
+ 0.5,
+ 0.5,
+ f"No dev data for {config_key}",
+ ha="center",
+ va="center",
+ transform=ax_dev.transAxes,
+ )
+ ax_dev.set_title(f"{config_key.upper()} - Development QPS")
+
+ plt.suptitle("Query Performance: Baseline vs Development", fontsize=16, y=1.02)
plt.tight_layout()
output_file = os.path.join(
self.output_dir,
@@ -796,32 +1219,101 @@ class ResultsAnalyzer:
plt.close()
def _plot_index_performance(self):
- """Plot index creation performance"""
- iterations = []
- index_times = []
+ """Plot index creation performance comparing baseline vs dev"""
+ # Group by filesystem configuration
+ fs_groups = {}
+
+ for result in self.results_data:
+ hostname, is_dev = self._extract_node_info(result)
+ fs_type, block_size, config_key = self._extract_filesystem_config(result)
+
+ if config_key not in fs_groups:
+ fs_groups[config_key] = {"baseline": [], "dev": []}
- for i, result in enumerate(self.results_data):
index_perf = result.get("index_performance", {})
if index_perf:
- iterations.append(i + 1)
- index_times.append(index_perf.get("creation_time_seconds", 0))
+ time = index_perf.get("creation_time_seconds", 0)
+ if time > 0:
+ node_type = "dev" if is_dev else "baseline"
+ fs_groups[config_key][node_type].append(time)
- if not index_times:
+ if not fs_groups:
return
- plt.figure(figsize=(10, 6))
- plt.bar(iterations, index_times, alpha=0.7, color="green")
- plt.xlabel("Iteration")
- plt.ylabel("Index Creation Time (seconds)")
- plt.title("Index Creation Performance")
- plt.grid(True, alpha=0.3)
-
- # Add average line
- avg_time = np.mean(index_times)
- plt.axhline(
- y=avg_time, color="red", linestyle="--", label=f"Average: {avg_time:.2f}s"
+ # Create comparison bar chart
+ fig, ax = plt.subplots(figsize=(14, 8))
+
+ configs = sorted(fs_groups.keys())
+ x = np.arange(len(configs))
+ width = 0.35
+
+ # Calculate averages for each config
+ baseline_avgs = []
+ dev_avgs = []
+ baseline_stds = []
+ dev_stds = []
+
+ for config in configs:
+ baseline_times = fs_groups[config]["baseline"]
+ dev_times = fs_groups[config]["dev"]
+
+ baseline_avgs.append(np.mean(baseline_times) if baseline_times else 0)
+ dev_avgs.append(np.mean(dev_times) if dev_times else 0)
+ baseline_stds.append(np.std(baseline_times) if baseline_times else 0)
+ dev_stds.append(np.std(dev_times) if dev_times else 0)
+
+ # Create bars
+ bars1 = ax.bar(
+ x - width / 2,
+ baseline_avgs,
+ width,
+ yerr=baseline_stds,
+ label="Baseline",
+ color="#4CAF50",
+ capsize=5,
+ )
+ bars2 = ax.bar(
+ x + width / 2,
+ dev_avgs,
+ width,
+ yerr=dev_stds,
+ label="Development",
+ color="#2196F3",
+ capsize=5,
)
- plt.legend()
+
+ # Add value labels on bars
+ for bar, val in zip(bars1, baseline_avgs):
+ if val > 0:
+ height = bar.get_height()
+ ax.text(
+ bar.get_x() + bar.get_width() / 2.0,
+ height,
+ f"{val:.3f}s",
+ ha="center",
+ va="bottom",
+ fontsize=9,
+ )
+
+ for bar, val in zip(bars2, dev_avgs):
+ if val > 0:
+ height = bar.get_height()
+ ax.text(
+ bar.get_x() + bar.get_width() / 2.0,
+ height,
+ f"{val:.3f}s",
+ ha="center",
+ va="bottom",
+ fontsize=9,
+ )
+
+ ax.set_xlabel("Filesystem Configuration", fontsize=12)
+ ax.set_ylabel("Index Creation Time (seconds)", fontsize=12)
+ ax.set_title("Index Creation Performance: Baseline vs Development", fontsize=14)
+ ax.set_xticks(x)
+ ax.set_xticklabels([c.upper() for c in configs], rotation=45, ha="right")
+ ax.legend(loc="upper right")
+ ax.grid(True, alpha=0.3, axis="y")
output_file = os.path.join(
self.output_dir,
@@ -833,61 +1325,148 @@ class ResultsAnalyzer:
plt.close()
def _plot_performance_matrix(self):
- """Plot comprehensive performance comparison matrix"""
+ """Plot performance comparison matrix for each filesystem config"""
if len(self.results_data) < 2:
return
- # Extract key metrics for comparison
- metrics = []
- for i, result in enumerate(self.results_data):
+ # Group by filesystem configuration
+ fs_metrics = {}
+
+ for result in self.results_data:
+ hostname, is_dev = self._extract_node_info(result)
+ fs_type, block_size, config_key = self._extract_filesystem_config(result)
+
+ if config_key not in fs_metrics:
+ fs_metrics[config_key] = {"baseline": [], "dev": []}
+
+ # Collect metrics
insert_perf = result.get("insert_performance", {})
index_perf = result.get("index_performance", {})
+ query_perf = result.get("query_performance", {})
metric = {
- "iteration": i + 1,
+ "hostname": hostname,
"insert_rate": insert_perf.get("vectors_per_second", 0),
"index_time": index_perf.get("creation_time_seconds", 0),
}
- # Add query metrics
- query_perf = result.get("query_performance", {})
+ # Get representative query performance (topk_10, batch_1)
if "topk_10" in query_perf and "batch_1" in query_perf["topk_10"]:
metric["query_qps"] = query_perf["topk_10"]["batch_1"].get(
"queries_per_second", 0
)
+ else:
+ metric["query_qps"] = 0
- metrics.append(metric)
+ node_type = "dev" if is_dev else "baseline"
+ fs_metrics[config_key][node_type].append(metric)
- df = pd.DataFrame(metrics)
+ if not fs_metrics:
+ return
- # Normalize metrics for comparison
- numeric_cols = ["insert_rate", "index_time", "query_qps"]
- for col in numeric_cols:
- if col in df.columns:
- df[f"{col}_norm"] = (df[col] - df[col].min()) / (
- df[col].max() - df[col].min() + 1e-6
- )
+ # Create subplots for each filesystem
+ n_configs = len(fs_metrics)
+ n_cols = min(3, n_configs)
+ n_rows = (n_configs + n_cols - 1) // n_cols
+
+ fig, axes = plt.subplots(n_rows, n_cols, figsize=(n_cols * 6, n_rows * 5))
+ if n_rows == 1 and n_cols == 1:
+ axes = [[axes]]
+ elif n_rows == 1:
+ axes = [axes]
+ elif n_cols == 1:
+ axes = [[ax] for ax in axes]
+
+ for idx, (config_key, data) in enumerate(sorted(fs_metrics.items())):
+ row = idx // n_cols
+ col = idx % n_cols
+ ax = axes[row][col]
+
+ # Calculate averages
+ baseline_metrics = data["baseline"]
+ dev_metrics = data["dev"]
+
+ if baseline_metrics and dev_metrics:
+ categories = ["Insert Rate\n(vec/s)", "Index Time\n(s)", "Query QPS"]
+
+ baseline_avg = [
+ np.mean([m["insert_rate"] for m in baseline_metrics]),
+ np.mean([m["index_time"] for m in baseline_metrics]),
+ np.mean([m["query_qps"] for m in baseline_metrics]),
+ ]
- # Create radar chart
- fig, ax = plt.subplots(figsize=(10, 8), subplot_kw=dict(projection="polar"))
+ dev_avg = [
+ np.mean([m["insert_rate"] for m in dev_metrics]),
+ np.mean([m["index_time"] for m in dev_metrics]),
+ np.mean([m["query_qps"] for m in dev_metrics]),
+ ]
- angles = np.linspace(0, 2 * np.pi, len(numeric_cols), endpoint=False).tolist()
- angles += angles[:1] # Complete the circle
+ x = np.arange(len(categories))
+ width = 0.35
- for i, row in df.iterrows():
- values = [row.get(f"{col}_norm", 0) for col in numeric_cols]
- values += values[:1] # Complete the circle
+ bars1 = ax.bar(
+ x - width / 2,
+ baseline_avg,
+ width,
+ label="Baseline",
+ color="#4CAF50",
+ )
+ bars2 = ax.bar(
+ x + width / 2, dev_avg, width, label="Development", color="#2196F3"
+ )
- ax.plot(
- angles, values, "o-", linewidth=2, label=f'Iteration {row["iteration"]}'
- )
- ax.fill(angles, values, alpha=0.25)
+ # Add value labels
+ for bar, val in zip(bars1, baseline_avg):
+ height = bar.get_height()
+ ax.text(
+ bar.get_x() + bar.get_width() / 2.0,
+ height,
+ f"{val:.0f}" if val > 100 else f"{val:.2f}",
+ ha="center",
+ va="bottom",
+ fontsize=8,
+ )
- ax.set_xticks(angles[:-1])
- ax.set_xticklabels(["Insert Rate", "Index Time (inv)", "Query QPS"])
- ax.set_ylim(0, 1)
- ax.set_title("Performance Comparison Matrix (Normalized)", y=1.08)
- ax.legend(loc="upper right", bbox_to_anchor=(1.3, 1.0))
+ for bar, val in zip(bars2, dev_avg):
+ height = bar.get_height()
+ ax.text(
+ bar.get_x() + bar.get_width() / 2.0,
+ height,
+ f"{val:.0f}" if val > 100 else f"{val:.2f}",
+ ha="center",
+ va="bottom",
+ fontsize=8,
+ )
+
+ ax.set_xlabel("Metrics")
+ ax.set_ylabel("Value")
+ ax.set_title(f"{config_key.upper()}")
+ ax.set_xticks(x)
+ ax.set_xticklabels(categories)
+ ax.legend(loc="upper right", fontsize=8)
+ ax.grid(True, alpha=0.3, axis="y")
+ else:
+ ax.text(
+ 0.5,
+ 0.5,
+ f"Insufficient data\nfor {config_key}",
+ ha="center",
+ va="center",
+ transform=ax.transAxes,
+ )
+ ax.set_title(f"{config_key.upper()}")
+
+ # Hide unused subplots
+ for idx in range(n_configs, n_rows * n_cols):
+ row = idx // n_cols
+ col = idx % n_cols
+ axes[row][col].set_visible(False)
+
+ plt.suptitle(
+ "Performance Comparison Matrix: Baseline vs Development",
+ fontsize=14,
+ y=1.02,
+ )
output_file = os.path.join(
self.output_dir,
@@ -898,6 +1477,149 @@ class ResultsAnalyzer:
)
plt.close()
+ def _plot_filesystem_comparison(self):
+ """Plot node performance comparison chart"""
+ if len(self.results_data) < 2:
+ return
+
+ # Group results by node
+ node_performance = {}
+
+ for result in self.results_data:
+ hostname, is_dev = self._extract_node_info(result)
+
+ if hostname not in node_performance:
+ node_performance[hostname] = {
+ "insert_rates": [],
+ "index_times": [],
+ "query_qps": [],
+ "is_dev": is_dev,
+ }
+
+ # Collect metrics
+ insert_perf = result.get("insert_performance", {})
+ if insert_perf:
+ node_performance[hostname]["insert_rates"].append(
+ insert_perf.get("vectors_per_second", 0)
+ )
+
+ index_perf = result.get("index_performance", {})
+ if index_perf:
+ node_performance[hostname]["index_times"].append(
+ index_perf.get("creation_time_seconds", 0)
+ )
+
+ # Get top-10 batch-1 query performance as representative
+ query_perf = result.get("query_performance", {})
+ if "topk_10" in query_perf and "batch_1" in query_perf["topk_10"]:
+ qps = query_perf["topk_10"]["batch_1"].get("queries_per_second", 0)
+ node_performance[hostname]["query_qps"].append(qps)
+
+ # Only create comparison if we have multiple nodes
+ if len(node_performance) > 1:
+ # Calculate averages
+ node_metrics = {}
+ for hostname, perf_data in node_performance.items():
+ node_metrics[hostname] = {
+ "avg_insert_rate": (
+ np.mean(perf_data["insert_rates"])
+ if perf_data["insert_rates"]
+ else 0
+ ),
+ "avg_index_time": (
+ np.mean(perf_data["index_times"])
+ if perf_data["index_times"]
+ else 0
+ ),
+ "avg_query_qps": (
+ np.mean(perf_data["query_qps"]) if perf_data["query_qps"] else 0
+ ),
+ "is_dev": perf_data["is_dev"],
+ }
+
+ # Create comparison bar chart with more space
+ fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(24, 8))
+
+ # Sort nodes with baseline first
+ sorted_nodes = sorted(
+ node_metrics.items(), key=lambda x: (x[1]["is_dev"], x[0])
+ )
+ node_names = [hostname for hostname, _ in sorted_nodes]
+
+ # Use different colors for baseline vs dev
+ colors = [
+ "#4CAF50" if not node_metrics[hostname]["is_dev"] else "#2196F3"
+ for hostname in node_names
+ ]
+
+ # Add labels for clarity
+ labels = [
+ f"{hostname}\n({'Dev' if node_metrics[hostname]['is_dev'] else 'Baseline'})"
+ for hostname in node_names
+ ]
+
+ # Insert rate comparison
+ insert_rates = [
+ node_metrics[hostname]["avg_insert_rate"] for hostname in node_names
+ ]
+ bars1 = ax1.bar(labels, insert_rates, color=colors)
+ ax1.set_title("Average Milvus Insert Rate by Node")
+ ax1.set_ylabel("Vectors/Second")
+ # Rotate labels for better readability
+ ax1.set_xticklabels(labels, rotation=45, ha="right", fontsize=8)
+
+ # Index time comparison (lower is better)
+ index_times = [
+ node_metrics[hostname]["avg_index_time"] for hostname in node_names
+ ]
+ bars2 = ax2.bar(labels, index_times, color=colors)
+ ax2.set_title("Average Milvus Index Time by Node")
+ ax2.set_ylabel("Seconds (Lower is Better)")
+ ax2.set_xticklabels(labels, rotation=45, ha="right", fontsize=8)
+
+ # Query QPS comparison
+ query_qps = [
+ node_metrics[hostname]["avg_query_qps"] for hostname in node_names
+ ]
+ bars3 = ax3.bar(labels, query_qps, color=colors)
+ ax3.set_title("Average Milvus Query QPS by Node")
+ ax3.set_ylabel("Queries/Second")
+ ax3.set_xticklabels(labels, rotation=45, ha="right", fontsize=8)
+
+ # Add value labels on bars
+ for bars, values in [
+ (bars1, insert_rates),
+ (bars2, index_times),
+ (bars3, query_qps),
+ ]:
+ for bar, value in zip(bars, values):
+ height = bar.get_height()
+ ax = bar.axes
+ ax.text(
+ bar.get_x() + bar.get_width() / 2.0,
+ height + height * 0.01,
+ f"{value:.1f}",
+ ha="center",
+ va="bottom",
+ fontsize=10,
+ )
+
+ plt.suptitle(
+ "Milvus Performance Comparison: Baseline vs Development Nodes",
+ fontsize=16,
+ y=1.02,
+ )
+ plt.tight_layout()
+
+ output_file = os.path.join(
+ self.output_dir,
+ f"filesystem_comparison.{self.config.get('graph_format', 'png')}",
+ )
+ plt.savefig(
+ output_file, dpi=self.config.get("graph_dpi", 300), bbox_inches="tight"
+ )
+ plt.close()
+
def analyze(self) -> bool:
"""Run complete analysis"""
self.logger.info("Starting results analysis...")
diff --git a/workflows/ai/scripts/generate_graphs.py b/workflows/ai/scripts/generate_graphs.py
index 2e183e86..fafc62bf 100755
--- a/workflows/ai/scripts/generate_graphs.py
+++ b/workflows/ai/scripts/generate_graphs.py
@@ -9,7 +9,6 @@ import sys
import glob
import numpy as np
import matplotlib
-
matplotlib.use("Agg") # Use non-interactive backend
import matplotlib.pyplot as plt
from datetime import datetime
@@ -17,6 +16,66 @@ from pathlib import Path
from collections import defaultdict
+def _extract_filesystem_config(result):
+ """Extract filesystem type and block size from result data.
+ Returns (fs_type, block_size, config_key)"""
+ filename = result.get("_file", "")
+
+ # Primary: Extract filesystem type from filename (more reliable than JSON)
+ fs_type = "unknown"
+ block_size = "default"
+
+ if "xfs" in filename:
+ fs_type = "xfs"
+ # Check larger sizes first: "4k-" is also a substring of "64k-"
+ if "64k-" in filename:
+ block_size = "64k"
+ elif "32k-" in filename:
+ block_size = "32k"
+ elif "16k-" in filename:
+ block_size = "16k"
+ elif "4k-" in filename:
+ block_size = "4k"
+ elif "ext4" in filename:
+ fs_type = "ext4"
+ if "4k-" in filename:
+ block_size = "4k"
+ elif "16k-" in filename:
+ block_size = "16k"
+ elif "btrfs" in filename:
+ fs_type = "btrfs"
+
+ # Fallback: Check JSON data if filename parsing failed
+ if fs_type == "unknown":
+ fs_type = result.get("filesystem", "unknown")
+
+ # Create descriptive config key
+ config_key = f"{fs_type}-{block_size}" if block_size != "default" else fs_type
+ return fs_type, block_size, config_key
+
+
+def _extract_node_info(result):
+ """Extract node hostname and determine if it's a dev node.
+ Returns (hostname, is_dev_node)"""
+ # Get hostname from system_info (preferred) or fall back to filename
+ system_info = result.get("system_info", {})
+ hostname = system_info.get("hostname", "")
+
+ # If no hostname in system_info, try extracting from filename
+ if not hostname:
+ filename = result.get("_file", "")
+ # Remove results_ prefix and .json suffix
+ hostname = filename.replace("results_", "").replace(".json", "")
+ # Remove iteration number if present (_1, _2, etc.)
+ if "_" in hostname and hostname.split("_")[-1].isdigit():
+ hostname = "_".join(hostname.split("_")[:-1])
+
+ # Determine if this is a dev node
+ is_dev = hostname.endswith("-dev")
+
+ return hostname, is_dev
+
+
def load_results(results_dir):
"""Load all JSON result files from the directory"""
results = []
@@ -27,63 +86,8 @@ def load_results(results_dir):
try:
with open(json_file, "r") as f:
data = json.load(f)
- # Extract filesystem info - prefer from JSON data over filename
- filename = os.path.basename(json_file)
-
- # First, try to get filesystem from the JSON data itself
- fs_type = data.get("filesystem", None)
-
- # If not in JSON, try to parse from filename (backwards compatibility)
- if not fs_type:
- parts = (
- filename.replace("results_", "").replace(".json", "").split("-")
- )
-
- # Parse host info
- if "debian13-ai-" in filename:
- host_parts = (
- filename.replace("results_debian13-ai-", "")
- .replace("_1.json", "")
- .replace("_2.json", "")
- .replace("_3.json", "")
- .split("-")
- )
- if "xfs" in host_parts[0]:
- fs_type = "xfs"
- # Extract block size (e.g., "4k", "16k", etc.)
- block_size = (
- host_parts[1] if len(host_parts) > 1 else "unknown"
- )
- elif "ext4" in host_parts[0]:
- fs_type = "ext4"
- block_size = host_parts[1] if len(host_parts) > 1 else "4k"
- elif "btrfs" in host_parts[0]:
- fs_type = "btrfs"
- block_size = "default"
- else:
- fs_type = "unknown"
- block_size = "unknown"
- else:
- fs_type = "unknown"
- block_size = "unknown"
- else:
- # If filesystem came from JSON, set appropriate block size
- if fs_type == "btrfs":
- block_size = "default"
- elif fs_type in ["ext4", "xfs"]:
- block_size = data.get("block_size", "4k")
- else:
- block_size = data.get("block_size", "default")
-
- is_dev = "dev" in filename
-
- # Use filesystem from JSON if available, otherwise use parsed value
- if "filesystem" not in data:
- data["filesystem"] = fs_type
- data["block_size"] = block_size
- data["is_dev"] = is_dev
- data["filename"] = filename
-
+ # Add filename for filesystem detection
+ data["_file"] = os.path.basename(json_file)
results.append(data)
except Exception as e:
print(f"Error loading {json_file}: {e}")
@@ -91,1023 +95,240 @@ def load_results(results_dir):
return results
-def create_filesystem_comparison_chart(results, output_dir):
- """Create a bar chart comparing performance across filesystems"""
- # Group by filesystem and baseline/dev
- fs_data = defaultdict(lambda: {"baseline": [], "dev": []})
-
- for result in results:
- fs = result.get("filesystem", "unknown")
- category = "dev" if result.get("is_dev", False) else "baseline"
-
- # Extract actual performance data from results
- if "insert_performance" in result:
- insert_qps = result["insert_performance"].get("vectors_per_second", 0)
- else:
- insert_qps = 0
- fs_data[fs][category].append(insert_qps)
-
- # Prepare data for plotting
- filesystems = list(fs_data.keys())
- baseline_means = [
- np.mean(fs_data[fs]["baseline"]) if fs_data[fs]["baseline"] else 0
- for fs in filesystems
- ]
- dev_means = [
- np.mean(fs_data[fs]["dev"]) if fs_data[fs]["dev"] else 0 for fs in filesystems
- ]
-
- x = np.arange(len(filesystems))
- width = 0.35
-
- fig, ax = plt.subplots(figsize=(10, 6))
- baseline_bars = ax.bar(
- x - width / 2, baseline_means, width, label="Baseline", color="#1f77b4"
- )
- dev_bars = ax.bar(
- x + width / 2, dev_means, width, label="Development", color="#ff7f0e"
- )
-
- ax.set_xlabel("Filesystem")
- ax.set_ylabel("Insert QPS")
- ax.set_title("Vector Database Performance by Filesystem")
- ax.set_xticks(x)
- ax.set_xticklabels(filesystems)
- ax.legend()
- ax.grid(True, alpha=0.3)
-
- # Add value labels on bars
- for bars in [baseline_bars, dev_bars]:
- for bar in bars:
- height = bar.get_height()
- if height > 0:
- ax.annotate(
- f"{height:.0f}",
- xy=(bar.get_x() + bar.get_width() / 2, height),
- xytext=(0, 3),
- textcoords="offset points",
- ha="center",
- va="bottom",
- )
-
- plt.tight_layout()
- plt.savefig(os.path.join(output_dir, "filesystem_comparison.png"), dpi=150)
- plt.close()
-
-
-def create_block_size_analysis(results, output_dir):
- """Create analysis for different block sizes (XFS specific)"""
- # Filter XFS results
- xfs_results = [r for r in results if r.get("filesystem") == "xfs"]
-
- if not xfs_results:
+def create_simple_performance_trends(results, output_dir):
+ """Create multi-node performance trends chart"""
+ if not results:
return
- # Group by block size
- block_size_data = defaultdict(lambda: {"baseline": [], "dev": []})
-
- for result in xfs_results:
- block_size = result.get("block_size", "unknown")
- category = "dev" if result.get("is_dev", False) else "baseline"
- if "insert_performance" in result:
- insert_qps = result["insert_performance"].get("vectors_per_second", 0)
- else:
- insert_qps = 0
- block_size_data[block_size][category].append(insert_qps)
-
- # Sort block sizes
- block_sizes = sorted(
- block_size_data.keys(),
- key=lambda x: (
- int(x.replace("k", "").replace("s", ""))
- if x not in ["unknown", "default"]
- else 0
- ),
- )
-
- # Create grouped bar chart
- baseline_means = [
- (
- np.mean(block_size_data[bs]["baseline"])
- if block_size_data[bs]["baseline"]
- else 0
- )
- for bs in block_sizes
- ]
- dev_means = [
- np.mean(block_size_data[bs]["dev"]) if block_size_data[bs]["dev"] else 0
- for bs in block_sizes
- ]
-
- x = np.arange(len(block_sizes))
- width = 0.35
-
- fig, ax = plt.subplots(figsize=(12, 6))
- baseline_bars = ax.bar(
- x - width / 2, baseline_means, width, label="Baseline", color="#2ca02c"
- )
- dev_bars = ax.bar(
- x + width / 2, dev_means, width, label="Development", color="#d62728"
- )
-
- ax.set_xlabel("Block Size")
- ax.set_ylabel("Insert QPS")
- ax.set_title("XFS Performance by Block Size")
- ax.set_xticks(x)
- ax.set_xticklabels(block_sizes)
- ax.legend()
- ax.grid(True, alpha=0.3)
-
- # Add value labels
- for bars in [baseline_bars, dev_bars]:
- for bar in bars:
- height = bar.get_height()
- if height > 0:
- ax.annotate(
- f"{height:.0f}",
- xy=(bar.get_x() + bar.get_width() / 2, height),
- xytext=(0, 3),
- textcoords="offset points",
- ha="center",
- va="bottom",
- )
-
- plt.tight_layout()
- plt.savefig(os.path.join(output_dir, "xfs_block_size_analysis.png"), dpi=150)
- plt.close()
-
-
-def create_heatmap_analysis(results, output_dir):
- """Create a heatmap showing AVERAGE performance across all test iterations"""
- # Group data by configuration and version, collecting ALL values for averaging
- config_data = defaultdict(
- lambda: {
- "baseline": {"insert": [], "query": [], "count": 0},
- "dev": {"insert": [], "query": [], "count": 0},
- }
- )
+ # Group results by node
+ node_performance = defaultdict(lambda: {
+ "insert_rates": [],
+ "insert_times": [],
+ "iterations": [],
+ "is_dev": False,
+ })
for result in results:
- fs = result.get("filesystem", "unknown")
- block_size = result.get("block_size", "default")
- config = f"{fs}-{block_size}"
- version = "dev" if result.get("is_dev", False) else "baseline"
-
- # Get actual insert performance
- if "insert_performance" in result:
- insert_qps = result["insert_performance"].get("vectors_per_second", 0)
- else:
- insert_qps = 0
-
- # Calculate average query QPS
- query_qps = 0
- if "query_performance" in result:
- qp = result["query_performance"]
- total_qps = 0
- count = 0
- for topk_key in ["topk_1", "topk_10", "topk_100"]:
- if topk_key in qp:
- for batch_key in ["batch_1", "batch_10", "batch_100"]:
- if batch_key in qp[topk_key]:
- total_qps += qp[topk_key][batch_key].get(
- "queries_per_second", 0
- )
- count += 1
- if count > 0:
- query_qps = total_qps / count
-
- # Collect all values for averaging
- config_data[config][version]["insert"].append(insert_qps)
- config_data[config][version]["query"].append(query_qps)
- config_data[config][version]["count"] += 1
-
- # Sort configurations
- configs = sorted(config_data.keys())
-
- # Calculate averages for heatmap
- insert_baseline = []
- insert_dev = []
- query_baseline = []
- query_dev = []
- iteration_counts = {"baseline": 0, "dev": 0}
-
- for c in configs:
- # Calculate average insert QPS
- baseline_insert_vals = config_data[c]["baseline"]["insert"]
- insert_baseline.append(
- np.mean(baseline_insert_vals) if baseline_insert_vals else 0
- )
-
- dev_insert_vals = config_data[c]["dev"]["insert"]
- insert_dev.append(np.mean(dev_insert_vals) if dev_insert_vals else 0)
-
- # Calculate average query QPS
- baseline_query_vals = config_data[c]["baseline"]["query"]
- query_baseline.append(
- np.mean(baseline_query_vals) if baseline_query_vals else 0
- )
-
- dev_query_vals = config_data[c]["dev"]["query"]
- query_dev.append(np.mean(dev_query_vals) if dev_query_vals else 0)
-
- # Track iteration counts
- iteration_counts["baseline"] = max(
- iteration_counts["baseline"], len(baseline_insert_vals)
- )
- iteration_counts["dev"] = max(iteration_counts["dev"], len(dev_insert_vals))
-
- # Create figure with custom heatmap
- fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 8))
-
- # Create data matrices
- insert_data = np.array([insert_baseline, insert_dev]).T
- query_data = np.array([query_baseline, query_dev]).T
-
- # Insert QPS heatmap
- im1 = ax1.imshow(insert_data, cmap="YlOrRd", aspect="auto")
- ax1.set_xticks([0, 1])
- ax1.set_xticklabels(["Baseline", "Development"])
- ax1.set_yticks(range(len(configs)))
- ax1.set_yticklabels(configs)
- ax1.set_title(
- f"Insert Performance - AVERAGE across {iteration_counts['baseline']} iterations\n(1M vectors, 128 dims, HNSW index)"
- )
- ax1.set_ylabel("Configuration")
-
- # Add text annotations with dynamic color based on background
- # Get the colormap to determine actual colors
- cmap1 = plt.cm.YlOrRd
- norm1 = plt.Normalize(vmin=insert_data.min(), vmax=insert_data.max())
-
- for i in range(len(configs)):
- for j in range(2):
- # Get the actual color from the colormap
- val = insert_data[i, j]
- rgba = cmap1(norm1(val))
- # Calculate luminance using standard formula
- # Perceived luminance: 0.299*R + 0.587*G + 0.114*B
- luminance = 0.299 * rgba[0] + 0.587 * rgba[1] + 0.114 * rgba[2]
- # Use white text on dark backgrounds (low luminance)
- text_color = "white" if luminance < 0.5 else "black"
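+ hostname, is_dev = _extract_node_info(result)
+
+ # Series are keyed per node. The accumulation and plotting code
+ # below uses the names config_key and fs_performance, so bind
+ # them to the node hostname and the per-node mapping here.
+ config_key = hostname
+ fs_performance = node_performance
+ # defaultdict creates the entry on first access
+ fs_performance[config_key]["is_dev"] = is_dev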
- # Show average value with indicator
- text = ax1.text(
- j,
- i,
- f"{int(insert_data[i, j])}\n(avg)",
- ha="center",
- va="center",
- color=text_color,
- fontweight="bold",
- fontsize=9,
+ insert_perf = result.get("insert_performance", {})
+ if insert_perf:
+ node_performance[hostname]["insert_rates"].append(
+ insert_perf.get("vectors_per_second", 0)
)
-
- # Add colorbar
- cbar1 = plt.colorbar(im1, ax=ax1)
- cbar1.set_label("Insert QPS")
-
- # Query QPS heatmap
- im2 = ax2.imshow(query_data, cmap="YlGnBu", aspect="auto")
- ax2.set_xticks([0, 1])
- ax2.set_xticklabels(["Baseline", "Development"])
- ax2.set_yticks(range(len(configs)))
- ax2.set_yticklabels(configs)
- ax2.set_title(
- f"Query Performance - AVERAGE across {iteration_counts['dev']} iterations\n(1M vectors, 128 dims, HNSW index)"
- )
-
- # Add text annotations with dynamic color based on background
- # Get the colormap to determine actual colors
- cmap2 = plt.cm.YlGnBu
- norm2 = plt.Normalize(vmin=query_data.min(), vmax=query_data.max())
-
- for i in range(len(configs)):
- for j in range(2):
- # Get the actual color from the colormap
- val = query_data[i, j]
- rgba = cmap2(norm2(val))
- # Calculate luminance using standard formula
- # Perceived luminance: 0.299*R + 0.587*G + 0.114*B
- luminance = 0.299 * rgba[0] + 0.587 * rgba[1] + 0.114 * rgba[2]
- # Use white text on dark backgrounds (low luminance)
- text_color = "white" if luminance < 0.5 else "black"
-
- # Show average value with indicator
- text = ax2.text(
- j,
- i,
- f"{int(query_data[i, j])}\n(avg)",
- ha="center",
- va="center",
- color=text_color,
- fontweight="bold",
- fontsize=9,
+ fs_performance[config_key]["insert_times"].append(
+ insert_perf.get("total_time_seconds", 0)
+ )
+ fs_performance[config_key]["iterations"].append(
+ len(fs_performance[config_key]["insert_rates"])
)
- # Add colorbar
- cbar2 = plt.colorbar(im2, ax=ax2)
- cbar2.set_label("Query QPS")
-
- # Add overall figure title
- fig.suptitle(
- "Performance Heatmap - Showing AVERAGES across Multiple Test Iterations",
- fontsize=14,
- fontweight="bold",
- y=1.02,
- )
-
- plt.tight_layout()
- plt.savefig(
- os.path.join(output_dir, "performance_heatmap.png"),
- dpi=150,
- bbox_inches="tight",
- )
- plt.close()
-
-
-def create_performance_trends(results, output_dir):
- """Create line charts showing performance trends"""
- # Group by filesystem type
- fs_types = defaultdict(
- lambda: {
- "configs": [],
- "baseline_insert": [],
- "dev_insert": [],
- "baseline_query": [],
- "dev_query": [],
- }
- )
-
- for result in results:
- fs = result.get("filesystem", "unknown")
- block_size = result.get("block_size", "default")
- config = f"{block_size}"
-
- if config not in fs_types[fs]["configs"]:
- fs_types[fs]["configs"].append(config)
- fs_types[fs]["baseline_insert"].append(0)
- fs_types[fs]["dev_insert"].append(0)
- fs_types[fs]["baseline_query"].append(0)
- fs_types[fs]["dev_query"].append(0)
-
- idx = fs_types[fs]["configs"].index(config)
-
- # Calculate average query QPS from all test configurations
- query_qps = 0
- if "query_performance" in result:
- qp = result["query_performance"]
- total_qps = 0
- count = 0
- for topk_key in ["topk_1", "topk_10", "topk_100"]:
- if topk_key in qp:
- for batch_key in ["batch_1", "batch_10", "batch_100"]:
- if batch_key in qp[topk_key]:
- total_qps += qp[topk_key][batch_key].get(
- "queries_per_second", 0
- )
- count += 1
- if count > 0:
- query_qps = total_qps / count
-
- if result.get("is_dev", False):
- if "insert_performance" in result:
- fs_types[fs]["dev_insert"][idx] = result["insert_performance"].get(
- "vectors_per_second", 0
- )
- fs_types[fs]["dev_query"][idx] = query_qps
- else:
- if "insert_performance" in result:
- fs_types[fs]["baseline_insert"][idx] = result["insert_performance"].get(
- "vectors_per_second", 0
- )
- fs_types[fs]["baseline_query"][idx] = query_qps
-
- # Create separate plots for each filesystem
- for fs, data in fs_types.items():
- if not data["configs"]:
- continue
-
- fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 10))
-
- x = range(len(data["configs"]))
-
- # Insert performance
- ax1.plot(
- x,
- data["baseline_insert"],
- "o-",
- label="Baseline",
- linewidth=2,
- markersize=8,
- )
- ax1.plot(
- x, data["dev_insert"], "s-", label="Development", linewidth=2, markersize=8
- )
- ax1.set_xlabel("Configuration")
- ax1.set_ylabel("Insert QPS")
- ax1.set_title(f"{fs.upper()} Insert Performance")
- ax1.set_xticks(x)
- ax1.set_xticklabels(data["configs"])
- ax1.legend()
+ # Check if we have multi-filesystem data
+ if len(fs_performance) > 1:
+ # Multi-filesystem mode: separate lines for each filesystem
+ fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+
+ colors = ["b", "r", "g", "m", "c", "y", "k"]
+ color_idx = 0
+
+ for config_key, perf_data in fs_performance.items():
+ if not perf_data["insert_rates"]:
+ continue
+
+ color = colors[color_idx % len(colors)]
+ iterations = list(range(1, len(perf_data["insert_rates"]) + 1))
+
+ # Plot insert rate
+ ax1.plot(
+ iterations,
+ perf_data["insert_rates"],
+ f"{color}-o",
+ linewidth=2,
+ markersize=6,
+ label=config_key.upper(),
+ )
+
+ # Plot insert time
+ ax2.plot(
+ iterations,
+ perf_data["insert_times"],
+ f"{color}-o",
+ linewidth=2,
+ markersize=6,
+ label=config_key.upper(),
+ )
+
+ color_idx += 1
+
+ ax1.set_xlabel("Iteration")
+ ax1.set_ylabel("Vectors/Second")
+ ax1.set_title("Milvus Insert Rate by Storage Filesystem")
ax1.grid(True, alpha=0.3)
-
- # Query performance
- ax2.plot(
- x, data["baseline_query"], "o-", label="Baseline", linewidth=2, markersize=8
- )
- ax2.plot(
- x, data["dev_query"], "s-", label="Development", linewidth=2, markersize=8
- )
- ax2.set_xlabel("Configuration")
- ax2.set_ylabel("Query QPS")
- ax2.set_title(f"{fs.upper()} Query Performance")
- ax2.set_xticks(x)
- ax2.set_xticklabels(data["configs"])
- ax2.legend()
+ ax1.legend()
+
+ ax2.set_xlabel("Iteration")
+ ax2.set_ylabel("Total Time (seconds)")
+ ax2.set_title("Milvus Insert Time by Storage Filesystem")
ax2.grid(True, alpha=0.3)
-
- plt.tight_layout()
- plt.savefig(os.path.join(output_dir, f"{fs}_performance_trends.png"), dpi=150)
- plt.close()
-
-
-def create_simple_performance_trends(results, output_dir):
- """Create a simple performance trends chart for basic Milvus testing"""
- if not results:
- return
-
- # Extract configuration from first result for display
- config_text = ""
- if results:
- first_result = results[0]
- if "config" in first_result:
- cfg = first_result["config"]
- config_text = (
- f"Test Config:\n"
- f"• {cfg.get('vector_dataset_size', 'N/A'):,} vectors/iteration\n"
- f"• {cfg.get('vector_dimensions', 'N/A')} dimensions\n"
- f"• {cfg.get('index_type', 'N/A')} index"
+ ax2.legend()
+ else:
+ # Single filesystem mode: original behavior
+ fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+
+ # Extract insert data from single filesystem
+ config_key = next(iter(fs_performance), None)
+ if config_key:
+ perf_data = fs_performance[config_key]
+ iterations = list(range(1, len(perf_data["insert_rates"]) + 1))
+
+ # Plot insert rate
+ ax1.plot(
+ iterations,
+ perf_data["insert_rates"],
+ "b-o",
+ linewidth=2,
+ markersize=6,
)
-
- # Separate baseline and dev results
- baseline_results = [r for r in results if not r.get("is_dev", False)]
- dev_results = [r for r in results if r.get("is_dev", False)]
-
- if not baseline_results and not dev_results:
- return
-
- fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 10))
-
- # Prepare data
- baseline_insert = []
- baseline_query = []
- dev_insert = []
- dev_query = []
- labels = []
-
- # Process baseline results
- for i, result in enumerate(baseline_results):
- if "insert_performance" in result:
- baseline_insert.append(
- result["insert_performance"].get("vectors_per_second", 0)
+ ax1.set_xlabel("Iteration")
+ ax1.set_ylabel("Vectors/Second")
+ ax1.set_title("Vector Insert Rate Performance")
+ ax1.grid(True, alpha=0.3)
+
+ # Plot insert time
+ ax2.plot(
+ iterations,
+ perf_data["insert_times"],
+ "r-o",
+ linewidth=2,
+ markersize=6,
)
- else:
- baseline_insert.append(0)
-
- # Calculate average query QPS
- query_qps = 0
- if "query_performance" in result:
- qp = result["query_performance"]
- total_qps = 0
- count = 0
- for topk_key in ["topk_1", "topk_10", "topk_100"]:
- if topk_key in qp:
- for batch_key in ["batch_1", "batch_10", "batch_100"]:
- if batch_key in qp[topk_key]:
- total_qps += qp[topk_key][batch_key].get(
- "queries_per_second", 0
- )
- count += 1
- if count > 0:
- query_qps = total_qps / count
- baseline_query.append(query_qps)
- labels.append(f"Iteration {i+1}")
-
- # Process dev results
- for result in dev_results:
- if "insert_performance" in result:
- dev_insert.append(result["insert_performance"].get("vectors_per_second", 0))
- else:
- dev_insert.append(0)
-
- query_qps = 0
- if "query_performance" in result:
- qp = result["query_performance"]
- total_qps = 0
- count = 0
- for topk_key in ["topk_1", "topk_10", "topk_100"]:
- if topk_key in qp:
- for batch_key in ["batch_1", "batch_10", "batch_100"]:
- if batch_key in qp[topk_key]:
- total_qps += qp[topk_key][batch_key].get(
- "queries_per_second", 0
- )
- count += 1
- if count > 0:
- query_qps = total_qps / count
- dev_query.append(query_qps)
-
- x = range(len(baseline_results) if baseline_results else len(dev_results))
-
- # Insert performance - with visible markers for all points
- if baseline_insert:
- # Line plot with smaller markers
- ax1.plot(
- x,
- baseline_insert,
- "-",
- label="Baseline",
- linewidth=1.5,
- color="blue",
- alpha=0.6,
- )
- # Add distinct markers for each point
- ax1.scatter(
- x,
- baseline_insert,
- s=30,
- color="blue",
- alpha=0.8,
- edgecolors="darkblue",
- linewidth=0.5,
- zorder=5,
- )
- if dev_insert:
- # Line plot with smaller markers
- ax1.plot(
- x[: len(dev_insert)],
- dev_insert,
- "-",
- label="Development",
- linewidth=1.5,
- color="red",
- alpha=0.6,
- )
- # Add distinct markers for each point
- ax1.scatter(
- x[: len(dev_insert)],
- dev_insert,
- s=30,
- color="red",
- alpha=0.8,
- edgecolors="darkred",
- linewidth=0.5,
- marker="s",
- zorder=5,
- )
- ax1.set_xlabel("Test Iteration (same configuration, repeated for reliability)")
- ax1.set_ylabel("Insert QPS (vectors/second)")
- ax1.set_title("Milvus Insert Performance")
-
- # Handle x-axis labels to prevent overlap
- num_points = len(x)
- if num_points > 20:
- # Show every 5th label for many iterations
- step = 5
- tick_positions = list(range(0, num_points, step))
- tick_labels = [
- labels[i] if labels else f"Iteration {i+1}" for i in tick_positions
- ]
- ax1.set_xticks(tick_positions)
- ax1.set_xticklabels(tick_labels, rotation=45, ha="right")
- elif num_points > 10:
- # Show every 2nd label for moderate iterations
- step = 2
- tick_positions = list(range(0, num_points, step))
- tick_labels = [
- labels[i] if labels else f"Iteration {i+1}" for i in tick_positions
- ]
- ax1.set_xticks(tick_positions)
- ax1.set_xticklabels(tick_labels, rotation=45, ha="right")
- else:
- # Show all labels for few iterations
- ax1.set_xticks(x)
- ax1.set_xticklabels(labels if labels else [f"Iteration {i+1}" for i in x])
-
- ax1.legend()
- ax1.grid(True, alpha=0.3)
-
- # Add configuration text box - compact
- if config_text:
- ax1.text(
- 0.02,
- 0.98,
- config_text,
- transform=ax1.transAxes,
- fontsize=6,
- verticalalignment="top",
- bbox=dict(boxstyle="round,pad=0.3", facecolor="wheat", alpha=0.85),
- )
-
- # Query performance - with visible markers for all points
- if baseline_query:
- # Line plot
- ax2.plot(
- x,
- baseline_query,
- "-",
- label="Baseline",
- linewidth=1.5,
- color="blue",
- alpha=0.6,
- )
- # Add distinct markers for each point
- ax2.scatter(
- x,
- baseline_query,
- s=30,
- color="blue",
- alpha=0.8,
- edgecolors="darkblue",
- linewidth=0.5,
- zorder=5,
- )
- if dev_query:
- # Line plot
- ax2.plot(
- x[: len(dev_query)],
- dev_query,
- "-",
- label="Development",
- linewidth=1.5,
- color="red",
- alpha=0.6,
- )
- # Add distinct markers for each point
- ax2.scatter(
- x[: len(dev_query)],
- dev_query,
- s=30,
- color="red",
- alpha=0.8,
- edgecolors="darkred",
- linewidth=0.5,
- marker="s",
- zorder=5,
- )
- ax2.set_xlabel("Test Iteration (same configuration, repeated for reliability)")
- ax2.set_ylabel("Query QPS (queries/second)")
- ax2.set_title("Milvus Query Performance")
-
- # Handle x-axis labels to prevent overlap
- num_points = len(x)
- if num_points > 20:
- # Show every 5th label for many iterations
- step = 5
- tick_positions = list(range(0, num_points, step))
- tick_labels = [
- labels[i] if labels else f"Iteration {i+1}" for i in tick_positions
- ]
- ax2.set_xticks(tick_positions)
- ax2.set_xticklabels(tick_labels, rotation=45, ha="right")
- elif num_points > 10:
- # Show every 2nd label for moderate iterations
- step = 2
- tick_positions = list(range(0, num_points, step))
- tick_labels = [
- labels[i] if labels else f"Iteration {i+1}" for i in tick_positions
- ]
- ax2.set_xticks(tick_positions)
- ax2.set_xticklabels(tick_labels, rotation=45, ha="right")
- else:
- # Show all labels for few iterations
- ax2.set_xticks(x)
- ax2.set_xticklabels(labels if labels else [f"Iteration {i+1}" for i in x])
-
- ax2.legend()
- ax2.grid(True, alpha=0.3)
-
- # Add configuration text box - compact
- if config_text:
- ax2.text(
- 0.02,
- 0.98,
- config_text,
- transform=ax2.transAxes,
- fontsize=6,
- verticalalignment="top",
- bbox=dict(boxstyle="round,pad=0.3", facecolor="wheat", alpha=0.85),
- )
-
+ ax2.set_xlabel("Iteration")
+ ax2.set_ylabel("Total Time (seconds)")
+ ax2.set_title("Vector Insert Time Performance")
+ ax2.grid(True, alpha=0.3)
+
plt.tight_layout()
plt.savefig(os.path.join(output_dir, "performance_trends.png"), dpi=150)
plt.close()
-def generate_summary_statistics(results, output_dir):
- """Generate summary statistics and save to JSON"""
- # Get unique filesystems, excluding "unknown"
- filesystems = set()
- for r in results:
- fs = r.get("filesystem", "unknown")
- if fs != "unknown":
- filesystems.add(fs)
-
- summary = {
- "total_tests": len(results),
- "filesystems_tested": sorted(list(filesystems)),
- "configurations": {},
- "performance_summary": {
- "best_insert_qps": {"value": 0, "config": ""},
- "best_query_qps": {"value": 0, "config": ""},
- "average_insert_qps": 0,
- "average_query_qps": 0,
- },
- }
-
- # Calculate statistics
- all_insert_qps = []
- all_query_qps = []
-
- for result in results:
- fs = result.get("filesystem", "unknown")
- block_size = result.get("block_size", "default")
- is_dev = "dev" if result.get("is_dev", False) else "baseline"
- config_name = f"{fs}-{block_size}-{is_dev}"
-
- # Get actual performance metrics
- if "insert_performance" in result:
- insert_qps = result["insert_performance"].get("vectors_per_second", 0)
- else:
- insert_qps = 0
-
- # Calculate average query QPS
- query_qps = 0
- if "query_performance" in result:
- qp = result["query_performance"]
- total_qps = 0
- count = 0
- for topk_key in ["topk_1", "topk_10", "topk_100"]:
- if topk_key in qp:
- for batch_key in ["batch_1", "batch_10", "batch_100"]:
- if batch_key in qp[topk_key]:
- total_qps += qp[topk_key][batch_key].get(
- "queries_per_second", 0
- )
- count += 1
- if count > 0:
- query_qps = total_qps / count
-
- all_insert_qps.append(insert_qps)
- all_query_qps.append(query_qps)
-
- summary["configurations"][config_name] = {
- "insert_qps": insert_qps,
- "query_qps": query_qps,
- "host": result.get("host", "unknown"),
- }
-
- if insert_qps > summary["performance_summary"]["best_insert_qps"]["value"]:
- summary["performance_summary"]["best_insert_qps"] = {
- "value": insert_qps,
- "config": config_name,
- }
-
- if query_qps > summary["performance_summary"]["best_query_qps"]["value"]:
- summary["performance_summary"]["best_query_qps"] = {
- "value": query_qps,
- "config": config_name,
- }
-
- summary["performance_summary"]["average_insert_qps"] = (
- np.mean(all_insert_qps) if all_insert_qps else 0
- )
- summary["performance_summary"]["average_query_qps"] = (
- np.mean(all_query_qps) if all_query_qps else 0
- )
-
- # Save summary
- with open(os.path.join(output_dir, "summary.json"), "w") as f:
- json.dump(summary, f, indent=2)
-
- return summary
-
-
-def create_comprehensive_fs_comparison(results, output_dir):
- """Create comprehensive filesystem performance comparison including all configurations"""
- import matplotlib.pyplot as plt
- import numpy as np
- from collections import defaultdict
-
- # Collect data for all filesystem configurations
- config_data = defaultdict(lambda: {"baseline": [], "dev": []})
-
- for result in results:
- fs = result.get("filesystem", "unknown")
- block_size = result.get("block_size", "")
-
- # Create configuration label
- if block_size and block_size != "default":
- config_label = f"{fs}-{block_size}"
- else:
- config_label = fs
-
- category = "dev" if result.get("is_dev", False) else "baseline"
-
- # Extract performance metrics
- if "insert_performance" in result:
- insert_qps = result["insert_performance"].get("vectors_per_second", 0)
- else:
- insert_qps = 0
-
- config_data[config_label][category].append(insert_qps)
-
- # Sort configurations for consistent display
- configs = sorted(config_data.keys())
-
- # Calculate means and standard deviations
- baseline_means = []
- baseline_stds = []
- dev_means = []
- dev_stds = []
-
- for config in configs:
- baseline_vals = config_data[config]["baseline"]
- dev_vals = config_data[config]["dev"]
-
- baseline_means.append(np.mean(baseline_vals) if baseline_vals else 0)
- baseline_stds.append(np.std(baseline_vals) if baseline_vals else 0)
- dev_means.append(np.mean(dev_vals) if dev_vals else 0)
- dev_stds.append(np.std(dev_vals) if dev_vals else 0)
-
- # Create the plot
- fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 10))
-
- x = np.arange(len(configs))
- width = 0.35
-
- # Top plot: Absolute performance
- baseline_bars = ax1.bar(
- x - width / 2,
- baseline_means,
- width,
- yerr=baseline_stds,
- label="Baseline",
- color="#1f77b4",
- capsize=5,
- )
- dev_bars = ax1.bar(
- x + width / 2,
- dev_means,
- width,
- yerr=dev_stds,
- label="Development",
- color="#ff7f0e",
- capsize=5,
- )
-
- ax1.set_ylabel("Insert QPS")
- ax1.set_title("Vector Database Performance Across Filesystem Configurations")
- ax1.set_xticks(x)
- ax1.set_xticklabels(configs, rotation=45, ha="right")
- ax1.legend()
- ax1.grid(True, alpha=0.3)
-
- # Add value labels on bars
- for bars in [baseline_bars, dev_bars]:
- for bar in bars:
- height = bar.get_height()
- if height > 0:
- ax1.annotate(
- f"{height:.0f}",
- xy=(bar.get_x() + bar.get_width() / 2, height),
- xytext=(0, 3),
- textcoords="offset points",
- ha="center",
- va="bottom",
- fontsize=8,
- )
-
- # Bottom plot: Percentage improvement (dev vs baseline)
- improvements = []
- for i in range(len(configs)):
- if baseline_means[i] > 0:
- improvement = ((dev_means[i] - baseline_means[i]) / baseline_means[i]) * 100
- else:
- improvement = 0
- improvements.append(improvement)
-
- colors = ["green" if x > 0 else "red" for x in improvements]
- improvement_bars = ax2.bar(x, improvements, color=colors, alpha=0.7)
-
- ax2.set_ylabel("Performance Change (%)")
- ax2.set_title("Development vs Baseline Performance Change")
- ax2.set_xticks(x)
- ax2.set_xticklabels(configs, rotation=45, ha="right")
- ax2.axhline(y=0, color="black", linestyle="-", linewidth=0.5)
- ax2.grid(True, alpha=0.3)
-
- # Add percentage labels
- for bar, val in zip(improvement_bars, improvements):
- ax2.annotate(
- f"{val:.1f}%",
- xy=(bar.get_x() + bar.get_width() / 2, val),
- xytext=(0, 3 if val > 0 else -15),
- textcoords="offset points",
- ha="center",
- va="bottom" if val > 0 else "top",
- fontsize=8,
- )
-
- plt.tight_layout()
- plt.savefig(os.path.join(output_dir, "comprehensive_fs_comparison.png"), dpi=150)
- plt.close()
-
-
-def create_fs_latency_comparison(results, output_dir):
- """Create latency comparison across filesystems"""
- import matplotlib.pyplot as plt
- import numpy as np
- from collections import defaultdict
-
- # Collect latency data
- config_latency = defaultdict(lambda: {"baseline": [], "dev": []})
-
- for result in results:
- fs = result.get("filesystem", "unknown")
- block_size = result.get("block_size", "")
-
- if block_size and block_size != "default":
- config_label = f"{fs}-{block_size}"
- else:
- config_label = fs
-
- category = "dev" if result.get("is_dev", False) else "baseline"
-
- # Extract latency metrics
- if "query_performance" in result:
- latency_p99 = result["query_performance"].get("latency_p99_ms", 0)
- else:
- latency_p99 = 0
-
- if latency_p99 > 0:
- config_latency[config_label][category].append(latency_p99)
-
- if not config_latency:
+def create_heatmap_analysis(results, output_dir):
+ """Create multi-filesystem heatmap showing query performance"""
+ if not results:
return
- # Sort configurations
- configs = sorted(config_latency.keys())
-
- # Calculate statistics
- baseline_p99 = []
- dev_p99 = []
-
- for config in configs:
- baseline_vals = config_latency[config]["baseline"]
- dev_vals = config_latency[config]["dev"]
-
- baseline_p99.append(np.mean(baseline_vals) if baseline_vals else 0)
- dev_p99.append(np.mean(dev_vals) if dev_vals else 0)
-
- # Create plot
- fig, ax = plt.subplots(figsize=(12, 6))
-
- x = np.arange(len(configs))
- width = 0.35
-
- baseline_bars = ax.bar(
- x - width / 2, baseline_p99, width, label="Baseline P99", color="#9467bd"
- )
- dev_bars = ax.bar(
- x + width / 2, dev_p99, width, label="Development P99", color="#e377c2"
- )
-
- ax.set_xlabel("Filesystem Configuration")
- ax.set_ylabel("Latency P99 (ms)")
- ax.set_title("Query Latency (P99) Comparison Across Filesystems")
- ax.set_xticks(x)
- ax.set_xticklabels(configs, rotation=45, ha="right")
- ax.legend()
- ax.grid(True, alpha=0.3)
-
- # Add value labels
- for bars in [baseline_bars, dev_bars]:
- for bar in bars:
- height = bar.get_height()
- if height > 0:
- ax.annotate(
- f"{height:.1f}",
- xy=(bar.get_x() + bar.get_width() / 2, height),
- xytext=(0, 3),
- textcoords="offset points",
- ha="center",
- va="bottom",
- fontsize=8,
- )
+ # Group data by filesystem configuration
+ fs_performance = defaultdict(lambda: {
+ "query_data": [],
+ "config_key": "",
+ })
+ for result in results:
+ fs_type, block_size, config_key = _extract_filesystem_config(result)
+
+ query_perf = result.get("query_performance", {})
+ for topk, topk_data in query_perf.items():
+ for batch, batch_data in topk_data.items():
+ qps = batch_data.get("queries_per_second", 0)
+ fs_performance[config_key]["query_data"].append({
+ "topk": topk,
+ "batch": batch,
+ "qps": qps,
+ })
+ fs_performance[config_key]["config_key"] = config_key
+
+ # Check if we have multi-filesystem data
+ if len(fs_performance) > 1:
+ # Multi-filesystem mode: separate heatmaps for each filesystem
+ num_fs = len(fs_performance)
+ fig, axes = plt.subplots(1, num_fs, figsize=(5*num_fs, 6))
+ if num_fs == 1:
+ axes = [axes]
+
+ # Define common structure for consistency
+ topk_order = ["topk_1", "topk_10", "topk_100"]
+ batch_order = ["batch_1", "batch_10", "batch_100"]
+
+ for idx, (config_key, perf_data) in enumerate(fs_performance.items()):
+ # Create matrix for this filesystem
+ matrix = np.zeros((len(topk_order), len(batch_order)))
+
+ # Fill matrix with data
+ query_dict = {}
+ for item in perf_data["query_data"]:
+ query_dict[(item["topk"], item["batch"])] = item["qps"]
+
+ for i, topk in enumerate(topk_order):
+ for j, batch in enumerate(batch_order):
+ matrix[i, j] = query_dict.get((topk, batch), 0)
+
+ # Plot heatmap
+ im = axes[idx].imshow(matrix, cmap='viridis', aspect='auto')
+ axes[idx].set_title(f"{config_key.upper()} Query Performance")
+ axes[idx].set_xticks(range(len(batch_order)))
+ axes[idx].set_xticklabels([b.replace("batch_", "Batch ") for b in batch_order])
+ axes[idx].set_yticks(range(len(topk_order)))
+ axes[idx].set_yticklabels([t.replace("topk_", "Top-") for t in topk_order])
+
+ # Add text annotations
+ for i in range(len(topk_order)):
+ for j in range(len(batch_order)):
+ axes[idx].text(j, i, f'{matrix[i, j]:.0f}',
+ ha="center", va="center", color="white", fontweight="bold")
+
+ # Add colorbar
+ cbar = plt.colorbar(im, ax=axes[idx])
+ cbar.set_label('Queries Per Second (QPS)')
+ else:
+ # Single filesystem mode
+ fig, ax = plt.subplots(1, 1, figsize=(8, 6))
+
+ if fs_performance:
+ config_key = list(fs_performance.keys())[0]
+ perf_data = fs_performance[config_key]
+
+ # Create matrix
+ topk_order = ["topk_1", "topk_10", "topk_100"]
+ batch_order = ["batch_1", "batch_10", "batch_100"]
+ matrix = np.zeros((len(topk_order), len(batch_order)))
+
+ # Fill matrix with data
+ query_dict = {}
+ for item in perf_data["query_data"]:
+ query_dict[(item["topk"], item["batch"])] = item["qps"]
+
+ for i, topk in enumerate(topk_order):
+ for j, batch in enumerate(batch_order):
+ matrix[i, j] = query_dict.get((topk, batch), 0)
+
+ # Plot heatmap
+ im = ax.imshow(matrix, cmap='viridis', aspect='auto')
+ ax.set_title("Milvus Query Performance Heatmap")
+ ax.set_xticks(range(len(batch_order)))
+ ax.set_xticklabels([b.replace("batch_", "Batch ") for b in batch_order])
+ ax.set_yticks(range(len(topk_order)))
+ ax.set_yticklabels([t.replace("topk_", "Top-") for t in topk_order])
+
+ # Add text annotations
+ for i in range(len(topk_order)):
+ for j in range(len(batch_order)):
+ ax.text(j, i, f'{matrix[i, j]:.0f}',
+ ha="center", va="center", color="white", fontweight="bold")
+
+ # Add colorbar
+ cbar = plt.colorbar(im, ax=ax)
+ cbar.set_label('Queries Per Second (QPS)')
+
plt.tight_layout()
- plt.savefig(os.path.join(output_dir, "filesystem_latency_comparison.png"), dpi=150)
+ plt.savefig(os.path.join(output_dir, "performance_heatmap.png"), dpi=150, bbox_inches="tight")
plt.close()
@@ -1119,56 +340,23 @@ def main():
results_dir = sys.argv[1]
output_dir = sys.argv[2]
- # Create output directory
+ # Ensure output directory exists
os.makedirs(output_dir, exist_ok=True)
# Load results
results = load_results(results_dir)
-
if not results:
- print("No results found to analyze")
+ print(f"No valid results found in {results_dir}")
sys.exit(1)
print(f"Loaded {len(results)} result files")
# Generate graphs
- print("Generating performance heatmap...")
- create_heatmap_analysis(results, output_dir)
-
- print("Generating performance trends...")
create_simple_performance_trends(results, output_dir)
+ create_heatmap_analysis(results, output_dir)
- print("Generating summary statistics...")
- summary = generate_summary_statistics(results, output_dir)
-
- # Check if we have multiple filesystems to compare
- filesystems = set(r.get("filesystem", "unknown") for r in results)
- if len(filesystems) > 1:
- print("Generating filesystem comparison chart...")
- create_filesystem_comparison_chart(results, output_dir)
-
- print("Generating comprehensive filesystem comparison...")
- create_comprehensive_fs_comparison(results, output_dir)
-
- print("Generating filesystem latency comparison...")
- create_fs_latency_comparison(results, output_dir)
-
- # Check if we have XFS results with different block sizes
- xfs_results = [r for r in results if r.get("filesystem") == "xfs"]
- block_sizes = set(r.get("block_size", "unknown") for r in xfs_results)
- if len(block_sizes) > 1:
- print("Generating XFS block size analysis...")
- create_block_size_analysis(results, output_dir)
-
- print(f"\nAnalysis complete! Graphs saved to {output_dir}")
- print(f"Total configurations tested: {summary['total_tests']}")
- print(
- f"Best insert QPS: {summary['performance_summary']['best_insert_qps']['value']} ({summary['performance_summary']['best_insert_qps']['config']})"
- )
- print(
- f"Best query QPS: {summary['performance_summary']['best_query_qps']['value']} ({summary['performance_summary']['best_query_qps']['config']})"
- )
+ print(f"Graphs generated in {output_dir}")
if __name__ == "__main__":
- main()
+ main()
\ No newline at end of file
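
Review note on the create_heatmap_analysis() hunk above: the multi-filesystem and single-filesystem branches both build the same 3x3 top-k by batch grid inline. What follows is only a minimal sketch of how that shared step might be factored out; it is not part of this patch. The helper name build_qps_matrix and the nested-list return type are assumptions made for illustration (the patch fills a NumPy array directly).

```python
# Hypothetical helper, NOT part of the patch: builds the top-k x batch QPS
# grid that create_heatmap_analysis() fills inline in both of its branches.
TOPK_ORDER = ["topk_1", "topk_10", "topk_100"]
BATCH_ORDER = ["batch_1", "batch_10", "batch_100"]


def build_qps_matrix(query_data, topk_order=TOPK_ORDER, batch_order=BATCH_ORDER):
    """Return QPS values as rows indexed [topk][batch]; missing cells are 0."""
    # Later entries for the same (topk, batch) pair overwrite earlier ones,
    # which mirrors the dict-based fill used in the patch.
    lookup = {(item["topk"], item["batch"]): item["qps"] for item in query_data}
    return [
        [lookup.get((topk, batch), 0) for batch in batch_order]
        for topk in topk_order
    ]


if __name__ == "__main__":
    # Illustrative values only, not measured results.
    sample = [
        {"topk": "topk_1", "batch": "batch_1", "qps": 1200.0},
        {"topk": "topk_10", "batch": "batch_100", "qps": 310.5},
    ]
    for row in build_qps_matrix(sample):
        print(row)
```

The nested list can be converted with np.array() before being handed to imshow(), so both branches would only differ in which axes object they draw on.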
diff --git a/workflows/ai/scripts/generate_html_report.py b/workflows/ai/scripts/generate_html_report.py
index 3aa8342f..01ec734c 100755
--- a/workflows/ai/scripts/generate_html_report.py
+++ b/workflows/ai/scripts/generate_html_report.py
@@ -180,7 +180,7 @@ HTML_TEMPLATE = """
</head>
<body>
<div class="header">
- <h1>AI Vector Database Benchmark Results</h1>
+ <h1>Milvus Vector Database Benchmark Results</h1>
<div class="subtitle">Generated on {timestamp}</div>
</div>
@@ -238,11 +238,13 @@ HTML_TEMPLATE = """
</div>
<div id="detailed-results" class="section">
- <h2>Detailed Results Table</h2>
+ <h2>Milvus Performance by Storage Filesystem</h2>
+ <p>This table shows how the Milvus vector database performs when its data is stored on different filesystem types and configurations.</p>
<table class="results-table">
<thead>
<tr>
- <th>Host</th>
+ <th>Filesystem</th>
+ <th>Configuration</th>
<th>Type</th>
<th>Insert QPS</th>
<th>Query QPS</th>
@@ -293,27 +295,53 @@ def load_results(results_dir):
# Get filesystem from JSON data
fs_type = data.get("filesystem", None)
- # If not in JSON, try to parse from filename (backwards compatibility)
- if not fs_type and "debian13-ai" in filename:
- host_parts = (
- filename.replace("results_debian13-ai-", "")
- .replace("_1.json", "")
+ # Always try to parse from filename first since JSON data might be wrong
+ if "-ai-" in filename:
+ # Handle both debian13-ai- and prod-ai- prefixes
+ cleaned_filename = filename.replace("results_", "")
+
+ # Extract the part after -ai-
+ if "debian13-ai-" in cleaned_filename:
+ host_part = cleaned_filename.replace("debian13-ai-", "")
+ elif "prod-ai-" in cleaned_filename:
+ host_part = cleaned_filename.replace("prod-ai-", "")
+ else:
+ # Generic extraction
+ ai_index = cleaned_filename.find("-ai-")
+ if ai_index != -1:
+ host_part = cleaned_filename[ai_index + 4 :] # Skip "-ai-"
+ else:
+ host_part = cleaned_filename
+
+ # Remove file extensions and dev suffix
+ host_part = (
+ host_part.replace("_1.json", "")
.replace("_2.json", "")
.replace("_3.json", "")
- .split("-")
+ .replace("-dev", "")
)
- if "xfs" in host_parts[0]:
+
+ # Parse filesystem type and block size
+ if host_part.startswith("xfs-"):
fs_type = "xfs"
- block_size = host_parts[1] if len(host_parts) > 1 else "4k"
- elif "ext4" in host_parts[0]:
+ # Extract block size: xfs-4k-4ks -> 4k
+ parts = host_part.split("-")
+ if len(parts) >= 2:
+ block_size = parts[1] # 4k, 16k, 32k, 64k
+ else:
+ block_size = "4k"
+ elif host_part.startswith("ext4-"):
fs_type = "ext4"
- block_size = host_parts[1] if len(host_parts) > 1 else "4k"
- elif "btrfs" in host_parts[0]:
+ parts = host_part.split("-")
+ block_size = parts[1] if len(parts) > 1 else "4k"
+ elif host_part.startswith("btrfs"):
fs_type = "btrfs"
block_size = "default"
else:
- fs_type = "unknown"
- block_size = "unknown"
+ # Fallback to JSON data if available
+ if not fs_type:
+ fs_type = "unknown"
+ block_size = "unknown"
else:
# Set appropriate block size based on filesystem
if fs_type == "btrfs":
@@ -371,12 +399,36 @@ def generate_table_rows(results, best_configs):
if config_key in best_configs:
row_class += " best-config"
+ # Generate descriptive labels showing Milvus is running on this filesystem
+ if result["filesystem"] == "xfs" and result["block_size"] != "default":
+ storage_label = f"XFS {result['block_size'].upper()}"
+ config_details = f"Block size: {result['block_size']}, Milvus data on XFS"
+ elif result["filesystem"] == "ext4":
+ storage_label = "EXT4"
+ if "bigalloc" in result.get("host", "").lower():
+ config_details = "EXT4 with bigalloc, Milvus data on ext4"
+ else:
+ config_details = (
+ f"Block size: {result['block_size']}, Milvus data on ext4"
+ )
+ elif result["filesystem"] == "btrfs":
+ storage_label = "BTRFS"
+ config_details = "Default Btrfs settings, Milvus data on Btrfs"
+ else:
+ storage_label = result["filesystem"].upper()
+ config_details = f"Milvus data on {result['filesystem']}"
+
+ # Extract clean node identifier from hostname
+ node_name = result["host"].replace("results_", "").replace(".json", "")
+
row = f"""
<tr class="{row_class}">
- <td>{result['host']}</td>
+ <td><strong>{storage_label}</strong></td>
+ <td>{config_details}</td>
<td>{result['type']}</td>
<td>{result['insert_qps']:,}</td>
<td>{result['query_qps']:,}</td>
+ <td><code>{node_name}</code></td>
<td>{result['timestamp']}</td>
</tr>
"""
@@ -483,8 +535,8 @@ def generate_html_report(results_dir, graphs_dir, output_path):
<li><a href="#block-size-analysis">Block Size Analysis</a></li>"""
filesystem_comparison_section = """<div id="filesystem-comparison" class="section">
- <h2>Filesystem Performance Comparison</h2>
- <p>Comparison of vector database performance across different filesystems, showing both baseline and development kernel results.</p>
+ <h2>Milvus Storage Filesystem Comparison</h2>
+ <p>Comparison of Milvus vector database performance when its data is stored on different filesystem types (XFS, ext4, Btrfs) with various configurations.</p>
<div class="graph-container">
<img src="graphs/filesystem_comparison.png" alt="Filesystem Comparison">
</div>
@@ -499,9 +551,9 @@ def generate_html_report(results_dir, graphs_dir, output_path):
</div>"""
# Multi-fs mode: show filesystem info
- fourth_card_title = "Filesystems Tested"
+ fourth_card_title = "Storage Filesystems"
fourth_card_value = str(len(filesystems_tested))
- fourth_card_label = ", ".join(filesystems_tested).upper()
+ fourth_card_label = ", ".join(filesystems_tested).upper() + " for Milvus Data"
else:
# Single filesystem mode - hide multi-fs sections
filesystem_nav_items = ""
--
2.50.1
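
Review note on the load_results() hunk in generate_html_report.py: the filename parsing strips a fixed set of suffixes ("_1.json", "_2.json", "_3.json", "-dev") and then branches on the filesystem prefix. A minimal standalone sketch of that logic is shown below so it can be exercised in isolation. It is not part of this patch: the function name parse_fs_from_filename, the regular expression (which accepts any run number rather than only 1 to 3), and the example filenames are assumptions for illustration.

```python
import re

# Hypothetical helper, NOT part of the patch. Strips an optional "-dev"
# suffix and an optional "_<run>" counter ahead of the ".json" extension.
_SUFFIX_RE = re.compile(r"(-dev)?(_\d+)?\.json$")


def parse_fs_from_filename(filename):
    """Return (fs_type, block_size) guessed from a results filename."""
    name = filename[len("results_"):] if filename.startswith("results_") else filename

    # Everything after the first "-ai-" describes the filesystem setup.
    marker = name.find("-ai-")
    if marker == -1:
        return "unknown", "unknown"
    host_part = _SUFFIX_RE.sub("", name[marker + len("-ai-"):])

    parts = host_part.split("-")
    fs_type = parts[0]
    if fs_type in ("xfs", "ext4"):
        # e.g. "xfs-4k-4ks" -> block size "4k"; default to "4k" if absent.
        return fs_type, parts[1] if len(parts) > 1 else "4k"
    if fs_type == "btrfs":
        return "btrfs", "default"
    return "unknown", "unknown"


if __name__ == "__main__":
    # Example filenames are illustrative assumptions.
    for example in (
        "results_debian13-ai-xfs-4k-4ks_1.json",
        "results_debian13-ai-xfs-16k-4ks-dev_2.json",
        "results_prod-ai-btrfs-zstd_1.json",
    ):
        print(example, parse_fs_from_filename(example))
```

Unlike the patch, this sketch returns "unknown" instead of falling back to the JSON "filesystem" field, so a caller would still need to apply that fallback and make sure block_size is always assigned.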