public inbox for kdevops@lists.linux.dev
From: Luis Chamberlain <mcgrof@kernel.org>
To: Chuck Lever <cel@kernel.org>, Daniel Gomez <da.gomez@kruces.com>,
	hui81.qi@samsung.com, kundan.kumar@samsung.com,
	kdevops@lists.linux.dev
Cc: Luis Chamberlain <mcgrof@kernel.org>
Subject: [PATCH 1/2] ai: add Milvus vector database benchmarking support
Date: Wed, 27 Aug 2025 02:32:00 -0700
Message-ID: <20250827093202.3539990-2-mcgrof@kernel.org>
In-Reply-To: <20250827093202.3539990-1-mcgrof@kernel.org>

Add initial AI/ML workflow infrastructure starting with Milvus
vector database benchmarking. This provides a foundation for testing
AI systems with the same rigor as existing kernel testing workflows
(fstests, blktests).

Key features:
- Docker-based Milvus deployment with etcd and MinIO
  - Supports using a dedicated drive for Docker's /var/lib/docker/,
    including custom filesystem configurations
- Python virtual environment management for benchmark dependencies
- Comprehensive benchmarking of vector operations (insert, search, delete)
- A/B testing support for baseline vs development comparisons
- Performance visualization focusing on key metrics (QPS, latency)
- Result collection and analysis infrastructure

Performance Metrics:
The benchmarks focus on two critical vector database metrics:
- QPS (Queries Per Second): Throughput measurement for search operations
- Latency: Response time percentiles (p50, p95, p99) for operations
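
The percentile computation can be sketched as follows (a minimal
illustration using the nearest-rank method, not the actual code in
milvus_benchmark.py):

```python
import math

def latency_percentiles(samples_ms):
    """Nearest-rank p50/p95/p99 from per-query latencies in ms."""
    s = sorted(samples_ms)

    def pct(p):
        # nearest-rank: the ceil(p/100 * N)-th smallest sample
        return s[min(len(s) - 1, math.ceil(p / 100 * len(s)) - 1)]

    return {"p50": pct(50), "p95": pct(95), "p99": pct(99)}
```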

Recall measurement is challenging without ground truth data: the
correct answers must be known beforehand to score search accuracy.
Since the benchmarks generate random vectors, establishing meaningful
ground truth would require exact nearest-neighbor calculations that
essentially duplicate the work being tested.
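
To illustrate the cost: recall@k needs the exact top-k neighbors for
every query, which means a brute-force scan of the whole dataset. A
hypothetical sketch (function names are illustrative, not part of the
patch):

```python
def exact_top_k(dataset, query, k):
    """Brute-force ground truth: squared-L2 distance to every vector."""
    dists = [(sum((a - b) ** 2 for a, b in zip(vec, query)), i)
             for i, vec in enumerate(dataset)]
    return [i for _, i in sorted(dists)[:k]]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k the approximate search returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)
```

For random data that brute-force pass costs as much as the search path
under test, which is why recall is left out of the reported metrics.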

Defconfigs:
- ai-milvus-docker: Standard Docker-based Milvus deployment
- ai-milvus-docker-ci: CI-optimized with minimal dataset (1000 vectors)

Workflow integration follows kdevops patterns:
  make defconfig-ai-milvus-docker
  make bringup
  make ai         # Setup infrastructure
  make ai-tests   # Run benchmarks
  make ai-results # View results

The implementation handles proper cleanup, lock file management, and
comprehensive error handling to ensure reliable benchmark execution.
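
The lock file handling follows the usual flock(1) pattern; a sketch
under assumed names (the lock path below is illustrative, not the path
the roles actually use):

```shell
#!/bin/bash
# Guard against concurrent benchmark runs with an advisory lock.
LOCK=/tmp/ai-benchmark.lock    # illustrative path
exec 9>"$LOCK"
if ! flock -n 9; then
    echo "another ai benchmark run holds the lock" >&2
    exit 1
fi
trap 'rm -f "$LOCK"' EXIT      # release on any exit, including errors
echo "lock acquired"
```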

Generated-by: Claude AI
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 .gitignore                                    |    3 +-
 README.md                                     |   18 +
 defconfigs/ai-milvus-docker                   |  113 ++
 defconfigs/ai-milvus-docker-ci                |   51 +
 docs/ai/README.md                             |  108 ++
 docs/ai/vector-databases/README.md            |   76 ++
 docs/ai/vector-databases/milvus.md            |  264 ++++
 kconfigs/workflows/Kconfig                    |   27 +
 playbooks/ai.yml                              |   11 +
 playbooks/ai_benchmark.yml                    |    8 +
 playbooks/ai_destroy.yml                      |   24 +
 playbooks/ai_install.yml                      |    8 +
 playbooks/ai_results.yml                      |    6 +
 playbooks/ai_setup.yml                        |    6 +
 playbooks/ai_tests.yml                        |   31 +
 playbooks/ai_uninstall.yml                    |    6 +
 .../debian13-ai-btrfs-default-dev.yml         |    8 +
 .../host_vars/debian13-ai-btrfs-default.yml   |    8 +
 .../debian13-ai-ext4-16k-bigalloc-dev.yml     |    8 +
 .../debian13-ai-ext4-16k-bigalloc.yml         |    8 +
 .../host_vars/debian13-ai-ext4-4k-dev.yml     |    8 +
 playbooks/host_vars/debian13-ai-ext4-4k.yml   |    8 +
 .../host_vars/debian13-ai-xfs-16k-4ks-dev.yml |   10 +
 .../host_vars/debian13-ai-xfs-16k-4ks.yml     |   10 +
 .../host_vars/debian13-ai-xfs-32k-4ks-dev.yml |   10 +
 .../host_vars/debian13-ai-xfs-32k-4ks.yml     |   10 +
 .../host_vars/debian13-ai-xfs-4k-4ks-dev.yml  |   10 +
 .../host_vars/debian13-ai-xfs-4k-4ks.yml      |   10 +
 .../host_vars/debian13-ai-xfs-64k-4ks-dev.yml |   10 +
 .../host_vars/debian13-ai-xfs-64k-4ks.yml     |   10 +
 .../files/analyze_results.py                  |  979 ++++++++++++++
 .../files/generate_better_graphs.py           |  548 ++++++++
 .../files/generate_graphs.py                  |  678 ++++++++++
 .../files/generate_html_report.py             |  427 ++++++
 .../roles/ai_collect_results/tasks/main.yml   |  220 +++
 .../templates/analysis_config.json.j2         |    6 +
 playbooks/roles/ai_destroy/tasks/main.yml     |   63 +
 .../roles/ai_docker_storage/tasks/main.yml    |  123 ++
 playbooks/roles/ai_install/tasks/main.yml     |   90 ++
 playbooks/roles/ai_results/tasks/main.yml     |   22 +
 .../files/milvus_benchmark.py                 |  506 +++++++
 .../roles/ai_run_benchmarks/tasks/main.yml    |  181 +++
 .../templates/benchmark_config.json.j2        |   24 +
 playbooks/roles/ai_setup/tasks/main.yml       |  115 ++
 playbooks/roles/ai_uninstall/tasks/main.yml   |   62 +
 playbooks/roles/gen_hosts/tasks/main.yml      |   14 +
 playbooks/roles/gen_hosts/templates/hosts.j2  |  108 ++
 playbooks/roles/gen_nodes/tasks/main.yml      |   34 +
 playbooks/roles/milvus/README.md              |  181 +++
 playbooks/roles/milvus/defaults/main.yml      |   74 ++
 .../roles/milvus/files/milvus_benchmark.py    |  348 +++++
 playbooks/roles/milvus/files/milvus_utils.py  |  134 ++
 playbooks/roles/milvus/meta/main.yml          |   30 +
 playbooks/roles/milvus/tasks/benchmark.yml    |   61 +
 .../roles/milvus/tasks/benchmark_setup.yml    |   58 +
 .../roles/milvus/tasks/install_docker.yml     |   97 ++
 playbooks/roles/milvus/tasks/main.yml         |   52 +
 playbooks/roles/milvus/tasks/setup.yml        |  107 ++
 .../milvus/templates/benchmark_config.json.j2 |   25 +
 .../templates/docker-compose.override.yml.j2  |   24 +
 .../milvus/templates/docker-compose.yml.j2    |   64 +
 .../roles/milvus/templates/milvus.yaml.j2     |   30 +
 .../milvus/templates/test_connection.py.j2    |   25 +
 workflows/Makefile                            |    4 +
 workflows/ai/Kconfig                          |  164 +++
 workflows/ai/Kconfig.docker                   |  172 +++
 workflows/ai/Kconfig.docker-storage           |  201 +++
 workflows/ai/Kconfig.native                   |  184 +++
 workflows/ai/Makefile                         |  160 +++
 workflows/ai/scripts/analysis_config.json     |    6 +
 workflows/ai/scripts/analyze_results.py       |  979 ++++++++++++++
 workflows/ai/scripts/generate_graphs.py       | 1174 +++++++++++++++++
 workflows/ai/scripts/generate_html_report.py  |  558 ++++++++
 73 files changed, 9999 insertions(+), 1 deletion(-)
 create mode 100644 defconfigs/ai-milvus-docker
 create mode 100644 defconfigs/ai-milvus-docker-ci
 create mode 100644 docs/ai/README.md
 create mode 100644 docs/ai/vector-databases/README.md
 create mode 100644 docs/ai/vector-databases/milvus.md
 create mode 100644 playbooks/ai.yml
 create mode 100644 playbooks/ai_benchmark.yml
 create mode 100644 playbooks/ai_destroy.yml
 create mode 100644 playbooks/ai_install.yml
 create mode 100644 playbooks/ai_results.yml
 create mode 100644 playbooks/ai_setup.yml
 create mode 100644 playbooks/ai_tests.yml
 create mode 100644 playbooks/ai_uninstall.yml
 create mode 100644 playbooks/host_vars/debian13-ai-btrfs-default-dev.yml
 create mode 100644 playbooks/host_vars/debian13-ai-btrfs-default.yml
 create mode 100644 playbooks/host_vars/debian13-ai-ext4-16k-bigalloc-dev.yml
 create mode 100644 playbooks/host_vars/debian13-ai-ext4-16k-bigalloc.yml
 create mode 100644 playbooks/host_vars/debian13-ai-ext4-4k-dev.yml
 create mode 100644 playbooks/host_vars/debian13-ai-ext4-4k.yml
 create mode 100644 playbooks/host_vars/debian13-ai-xfs-16k-4ks-dev.yml
 create mode 100644 playbooks/host_vars/debian13-ai-xfs-16k-4ks.yml
 create mode 100644 playbooks/host_vars/debian13-ai-xfs-32k-4ks-dev.yml
 create mode 100644 playbooks/host_vars/debian13-ai-xfs-32k-4ks.yml
 create mode 100644 playbooks/host_vars/debian13-ai-xfs-4k-4ks-dev.yml
 create mode 100644 playbooks/host_vars/debian13-ai-xfs-4k-4ks.yml
 create mode 100644 playbooks/host_vars/debian13-ai-xfs-64k-4ks-dev.yml
 create mode 100644 playbooks/host_vars/debian13-ai-xfs-64k-4ks.yml
 create mode 100755 playbooks/roles/ai_collect_results/files/analyze_results.py
 create mode 100755 playbooks/roles/ai_collect_results/files/generate_better_graphs.py
 create mode 100755 playbooks/roles/ai_collect_results/files/generate_graphs.py
 create mode 100755 playbooks/roles/ai_collect_results/files/generate_html_report.py
 create mode 100644 playbooks/roles/ai_collect_results/tasks/main.yml
 create mode 100644 playbooks/roles/ai_collect_results/templates/analysis_config.json.j2
 create mode 100644 playbooks/roles/ai_destroy/tasks/main.yml
 create mode 100644 playbooks/roles/ai_docker_storage/tasks/main.yml
 create mode 100644 playbooks/roles/ai_install/tasks/main.yml
 create mode 100644 playbooks/roles/ai_results/tasks/main.yml
 create mode 100644 playbooks/roles/ai_run_benchmarks/files/milvus_benchmark.py
 create mode 100644 playbooks/roles/ai_run_benchmarks/tasks/main.yml
 create mode 100644 playbooks/roles/ai_run_benchmarks/templates/benchmark_config.json.j2
 create mode 100644 playbooks/roles/ai_setup/tasks/main.yml
 create mode 100644 playbooks/roles/ai_uninstall/tasks/main.yml
 create mode 100644 playbooks/roles/milvus/README.md
 create mode 100644 playbooks/roles/milvus/defaults/main.yml
 create mode 100644 playbooks/roles/milvus/files/milvus_benchmark.py
 create mode 100644 playbooks/roles/milvus/files/milvus_utils.py
 create mode 100644 playbooks/roles/milvus/meta/main.yml
 create mode 100644 playbooks/roles/milvus/tasks/benchmark.yml
 create mode 100644 playbooks/roles/milvus/tasks/benchmark_setup.yml
 create mode 100644 playbooks/roles/milvus/tasks/install_docker.yml
 create mode 100644 playbooks/roles/milvus/tasks/main.yml
 create mode 100644 playbooks/roles/milvus/tasks/setup.yml
 create mode 100644 playbooks/roles/milvus/templates/benchmark_config.json.j2
 create mode 100644 playbooks/roles/milvus/templates/docker-compose.override.yml.j2
 create mode 100644 playbooks/roles/milvus/templates/docker-compose.yml.j2
 create mode 100644 playbooks/roles/milvus/templates/milvus.yaml.j2
 create mode 100644 playbooks/roles/milvus/templates/test_connection.py.j2
 create mode 100644 workflows/ai/Kconfig
 create mode 100644 workflows/ai/Kconfig.docker
 create mode 100644 workflows/ai/Kconfig.docker-storage
 create mode 100644 workflows/ai/Kconfig.native
 create mode 100644 workflows/ai/Makefile
 create mode 100644 workflows/ai/scripts/analysis_config.json
 create mode 100755 workflows/ai/scripts/analyze_results.py
 create mode 100755 workflows/ai/scripts/generate_graphs.py
 create mode 100755 workflows/ai/scripts/generate_html_report.py

diff --git a/.gitignore b/.gitignore
index e5a13676..75e4712d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -32,7 +32,6 @@ scripts/workflows/fstests/lib/__pycache__/
 scripts/workflows/blktests/lib/__pycache__/
 scripts/workflows/lib/__pycache__/
 
-
 include/
 
 # You can override role specific stuff on these
@@ -48,7 +47,9 @@ playbooks/secret.yml
 playbooks/python/workflows/fstests/__pycache__/
 playbooks/python/workflows/fstests/lib/__pycache__/
 playbooks/python/workflows/fstests/gen_results_summary.pyc
+playbooks/roles/ai_run_benchmarks/files/__pycache__/
 
+workflows/ai/results/
 workflows/pynfs/results/
 
 workflows/fstests/new_expunge_files.txt
diff --git a/README.md b/README.md
index 0c30762a..cb5fbc1f 100644
--- a/README.md
+++ b/README.md
@@ -14,6 +14,7 @@ Table of Contents
       * [reboot-limit](#reboot-limit)
       * [sysbench](#sysbench)
       * [fio-tests](#fio-tests)
+      * [AI workflow](#ai-workflow)
    * [kdevops chats](#kdevops-chats)
    * [kdevops on discord](#kdevops-on-discord)
       * [kdevops IRC](#kdevops-irc)
@@ -273,6 +274,22 @@ A/B testing capabilities, and advanced graphing and visualization support. For
 detailed configuration and usage information, refer to the
 [kdevops fio-tests documentation](docs/fio-tests.md).
 
+### AI workflow
+
+kdevops now supports AI/ML system benchmarking, starting with vector databases
+like Milvus. Similar to fstests, you can quickly set up and benchmark AI
+infrastructure with just a few commands:
+
+```bash
+make defconfig-ai-milvus-docker
+make bringup
+make ai
+```
+
+The AI workflow supports A/B testing, filesystem performance impact analysis,
+and comprehensive benchmarking of vector similarity search workloads. For
+details, see the [kdevops AI workflow documentation](docs/ai/README.md).
+
 ## kdevops chats
 
 We use discord and IRC. Right now we have more folks on discord than on IRC.
@@ -324,6 +341,7 @@ want to just use the kernel that comes with your Linux distribution.
   * [kdevops NFS docs](docs/nfs.md)
   * [kdevops selftests docs](docs/selftests.md)
   * [kdevops reboot-limit docs](docs/reboot-limit.md)
+  * [kdevops AI workflow docs](docs/ai/README.md)
 
 # kdevops general documentation
 
diff --git a/defconfigs/ai-milvus-docker b/defconfigs/ai-milvus-docker
new file mode 100644
index 00000000..ef5aa029
--- /dev/null
+++ b/defconfigs/ai-milvus-docker
@@ -0,0 +1,113 @@
+# AI benchmarking configuration for Milvus vector database testing
+CONFIG_KDEVOPS_FIRST_RUN=n
+CONFIG_LIBVIRT=y
+CONFIG_LIBVIRT_URI="qemu:///system"
+CONFIG_LIBVIRT_HOST_PASSTHROUGH=y
+CONFIG_LIBVIRT_MACHINE_TYPE_DEFAULT=y
+CONFIG_LIBVIRT_CPU_MODEL_PASSTHROUGH=y
+CONFIG_LIBVIRT_VCPUS=4
+CONFIG_LIBVIRT_RAM=8192
+CONFIG_LIBVIRT_OS_VARIANT="generic"
+CONFIG_LIBVIRT_STORAGE_POOL_PATH_CUSTOM=n
+CONFIG_LIBVIRT_STORAGE_POOL_CREATE=y
+CONFIG_LIBVIRT_EXTRA_STORAGE_DRIVE_NVME=y
+CONFIG_LIBVIRT_EXTRA_STORAGE_DRIVE_SIZE="100"
+
+# Network configuration
+CONFIG_KDEVOPS_NETWORK_TYPE_NATUAL_BRIDGE=y
+
+# Workflow configuration
+CONFIG_WORKFLOWS=y
+CONFIG_WORKFLOWS_TESTS=y
+CONFIG_WORKFLOWS_LINUX_TESTS=y
+CONFIG_WORKFLOWS_DEDICATED_WORKFLOW=y
+CONFIG_KDEVOPS_WORKFLOW_DEDICATE_AI=y
+
+# AI workflow configuration
+CONFIG_AI_TESTS_VECTOR_DATABASE=y
+CONFIG_AI_VECTOR_DB_MILVUS=y
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER=y
+
+# Milvus Docker configuration
+CONFIG_AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_2_5=y
+CONFIG_AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_STRING="milvusdb/milvus:v2.5.10"
+CONFIG_AI_VECTOR_DB_MILVUS_CONTAINER_NAME="milvus-ai-benchmark"
+CONFIG_AI_VECTOR_DB_MILVUS_ETCD_CONTAINER_IMAGE_STRING="quay.io/coreos/etcd:v3.5.18"
+CONFIG_AI_VECTOR_DB_MILVUS_ETCD_CONTAINER_NAME="milvus-etcd"
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_CONTAINER_IMAGE_STRING="minio/minio:RELEASE.2023-03-20T20-16-18Z"
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_CONTAINER_NAME="milvus-minio"
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_ACCESS_KEY="minioadmin"
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_SECRET_KEY="minioadmin"
+
+# Docker storage configuration
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER_DATA_PATH="/data/milvus-data"
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER_ETCD_DATA_PATH="/data/milvus-etcd"
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER_MINIO_DATA_PATH="/data/milvus-minio"
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER_NETWORK_NAME="milvus-network"
+
+# Docker ports
+CONFIG_AI_VECTOR_DB_MILVUS_PORT=19530
+CONFIG_AI_VECTOR_DB_MILVUS_WEB_UI_PORT=9091
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_API_PORT=9000
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_CONSOLE_PORT=9001
+CONFIG_AI_VECTOR_DB_MILVUS_ETCD_CLIENT_PORT=2379
+CONFIG_AI_VECTOR_DB_MILVUS_ETCD_PEER_PORT=2380
+
+# Docker resource limits
+CONFIG_AI_VECTOR_DB_MILVUS_MEMORY_LIMIT="8g"
+CONFIG_AI_VECTOR_DB_MILVUS_CPU_LIMIT="4.0"
+CONFIG_AI_VECTOR_DB_MILVUS_ETCD_MEMORY_LIMIT="1g"
+CONFIG_AI_VECTOR_DB_MILVUS_MINIO_MEMORY_LIMIT="2g"
+
+# Milvus connection configuration
+CONFIG_AI_VECTOR_DB_MILVUS_COLLECTION_NAME="benchmark_collection"
+CONFIG_AI_VECTOR_DB_MILVUS_DIMENSION=768
+CONFIG_AI_VECTOR_DB_MILVUS_DATASET_SIZE=1000000
+CONFIG_AI_VECTOR_DB_MILVUS_BATCH_SIZE=10000
+CONFIG_AI_VECTOR_DB_MILVUS_NUM_QUERIES=10000
+
+# Benchmark configuration
+CONFIG_AI_BENCHMARK_ITERATIONS=3
+# Vector dataset configuration
+CONFIG_AI_VECTOR_DB_MILVUS_DIMENSION=768
+
+# Test runtime configuration
+CONFIG_AI_BENCHMARK_RUNTIME="180"
+CONFIG_AI_BENCHMARK_WARMUP_TIME="30"
+
+# Query patterns
+CONFIG_AI_BENCHMARK_QUERY_TOPK_1=y
+CONFIG_AI_BENCHMARK_QUERY_TOPK_10=y
+CONFIG_AI_BENCHMARK_QUERY_TOPK_100=n
+
+# Batch size configuration
+CONFIG_AI_BENCHMARK_BATCH_1=y
+CONFIG_AI_BENCHMARK_BATCH_10=y
+CONFIG_AI_BENCHMARK_BATCH_100=n
+
+# Index configuration
+CONFIG_AI_INDEX_HNSW=y
+CONFIG_AI_INDEX_TYPE="HNSW"
+CONFIG_AI_INDEX_HNSW_M=16
+CONFIG_AI_INDEX_HNSW_EF_CONSTRUCTION=200
+CONFIG_AI_INDEX_HNSW_EF=64
+
+# Results and graphing
+CONFIG_AI_BENCHMARK_RESULTS_DIR="/data/ai-benchmark"
+CONFIG_AI_BENCHMARK_ENABLE_GRAPHING=y
+CONFIG_AI_BENCHMARK_GRAPH_FORMAT="png"
+CONFIG_AI_BENCHMARK_GRAPH_DPI=300
+CONFIG_AI_BENCHMARK_GRAPH_THEME="default"
+
+# Filesystem configuration
+CONFIG_AI_FILESYSTEM_XFS=y
+CONFIG_AI_FILESYSTEM="xfs"
+CONFIG_AI_FSTYPE="xfs"
+CONFIG_AI_XFS_MKFS_OPTS="-f -s size=4096"
+CONFIG_AI_XFS_MOUNT_OPTS="rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota"
+
+# Baseline/dev testing setup
+CONFIG_KDEVOPS_BASELINE_AND_DEV=y
+# Build Linux
+CONFIG_WORKFLOW_LINUX_CUSTOM=y
+CONFIG_BOOTLINUX_AB_DIFFERENT_REF=y
diff --git a/defconfigs/ai-milvus-docker-ci b/defconfigs/ai-milvus-docker-ci
new file mode 100644
index 00000000..144a6490
--- /dev/null
+++ b/defconfigs/ai-milvus-docker-ci
@@ -0,0 +1,51 @@
+# SPDX-License-Identifier: copyleft-next-0.3.1
+#
+# AI vector database benchmarking for CI testing
+# Uses minimal dataset size and short runtime for quick verification
+
+CONFIG_KDEVOPS_FIRST_RUN=y
+CONFIG_GUESTFS=y
+CONFIG_GUESTFS_DEBIAN=y
+CONFIG_GUESTFS_DEBIAN_TRIXIE=y
+
+# Enable AI workflow
+CONFIG_WORKFLOWS_TESTS=y
+CONFIG_WORKFLOWS_LINUX_TESTS=y
+CONFIG_WORKFLOWS_DEDICATED_WORKFLOW=y
+CONFIG_KDEVOPS_WORKFLOW_DEDICATE_AI=y
+CONFIG_AI_TESTS_VECTOR_DATABASE=y
+
+# Docker deployment
+CONFIG_AI_VECTOR_DB_MILVUS=y
+CONFIG_AI_VECTOR_DB_MILVUS_DOCKER=y
+
+# CI-optimized: Use custom small dataset
+CONFIG_AI_DATASET_CUSTOM=y
+
+# Small vector dimensions for faster processing
+CONFIG_AI_VECTOR_DIM_128=y
+
+# Minimal query configurations
+CONFIG_AI_BENCHMARK_QUERY_TOPK_1=y
+CONFIG_AI_BENCHMARK_BATCH_1=y
+
+# Fast HNSW indexing
+CONFIG_AI_INDEX_HNSW=y
+
+# Short runtime for CI
+# These will be overridden by environment variables in CI:
+# AI_VECTOR_DATASET_SIZE=1000
+# AI_BENCHMARK_RUNTIME=30
+
+# Reduced resource limits for CI
+CONFIG_AI_VECTOR_DB_MILVUS_MEMORY_LIMIT="2g"
+CONFIG_AI_VECTOR_DB_MILVUS_CPU_LIMIT="2.0"
+
+# Enable graphing for result verification
+CONFIG_AI_BENCHMARK_ENABLE_GRAPHING=y
+
+# XFS filesystem (fastest for AI workloads)
+CONFIG_AI_FILESYSTEM_XFS=y
+
+# A/B testing enabled for baseline/dev comparison
+CONFIG_KDEVOPS_BASELINE_AND_DEV=y
diff --git a/docs/ai/README.md b/docs/ai/README.md
new file mode 100644
index 00000000..94f9f6c0
--- /dev/null
+++ b/docs/ai/README.md
@@ -0,0 +1,108 @@
+# AI Workflow Documentation
+
+The kdevops AI workflow provides infrastructure for benchmarking and testing AI/ML systems, with initial support for vector databases.
+
+## Quick Start
+
+Just like other kdevops workflows (fstests, blktests), the AI workflow follows the same pattern:
+
+```bash
+make defconfig-ai-milvus-docker # Configure for AI vector database testing
+make bringup # Bring up the test environment
+make ai # Run the AI benchmarks
+make ai-baseline # Establish baseline results
+make ai-results # View results
+```
+
+## Supported Components
+
+### Vector Databases
+- [Milvus](vector-databases/milvus.md) - High-performance vector database for AI applications
+
+### Future Components (Planned)
+- Language Models (LLMs)
+- Embedding Services
+- Training Infrastructure
+- Inference Servers
+
+## Configuration Options
+
+The AI workflow can be configured through `make menuconfig`:
+
+1. **Vector Database Selection**
+   - Milvus (Docker or Native deployment)
+   - Future: Weaviate, Qdrant, Pinecone
+
+2. **Dataset Configuration**
+   - Dataset size (number of vectors)
+   - Vector dimensions
+   - Batch sizes
+
+3. **Benchmark Parameters**
+   - Query patterns
+   - Concurrency levels
+   - Runtime duration
+
+4. **Filesystem Testing**
+   - Test on different filesystems (XFS, ext4, btrfs)
+   - Compare performance across storage configurations
+
+## Pre-built Configurations
+
+Quick configurations for common use cases:
+
+- `defconfig-ai-milvus-docker` - Docker-based Milvus deployment
+- `defconfig-ai-milvus-docker-ci` - CI-optimized with minimal dataset
+- `defconfig-ai-milvus-native` - Native Milvus installation from source
+- `defconfig-ai-milvus-multifs` - Multi-filesystem performance comparison
+
+## A/B Testing Support
+
+Like other kdevops workflows, AI supports baseline/dev comparisons:
+
+```bash
+# Configure with A/B testing
+make menuconfig  # Enable CONFIG_KDEVOPS_BASELINE_AND_DEV
+make ai-baseline # Run on baseline
+make ai-dev # Run on dev
+make ai-results # Compare results
+```
+
+## Results and Analysis
+
+The AI workflow generates comprehensive performance metrics:
+
+- Throughput (operations/second)
+- Latency percentiles (p50, p95, p99)
+- Resource utilization
+- Performance graphs and trends
+
+Results are stored in the configured results directory (default: `/data/ai-results/`).
+
+## Integration with CI/CD
+
+The workflow includes CI-optimized configurations that use:
+- Minimal datasets for quick validation
+- `/dev/null` storage for I/O testing without disk requirements
+- Environment variable overrides for runtime configuration
+
+Example CI usage:
+```bash
+AI_VECTOR_DATASET_SIZE=1000 AI_BENCHMARK_RUNTIME=30 make defconfig-ai-milvus-docker-ci
+make bringup
+make ai
+```
+
+## Workflow Architecture
+
+The AI workflow follows kdevops patterns:
+
+1. **Configuration** - Kconfig-based configuration system
+2. **Provisioning** - Ansible-based infrastructure setup
+3. **Execution** - Standardized test execution
+4. **Collection** - Automated result collection and analysis
+5. **Reporting** - Performance visualization and comparison
+
+For detailed usage of specific components, see:
+- [Vector Databases Overview](vector-databases/README.md)
+- [Milvus Usage Guide](vector-databases/milvus.md)
diff --git a/docs/ai/vector-databases/README.md b/docs/ai/vector-databases/README.md
new file mode 100644
index 00000000..2a3955d7
--- /dev/null
+++ b/docs/ai/vector-databases/README.md
@@ -0,0 +1,76 @@
+# Vector Database Testing
+
+Vector databases are specialized systems designed to store and search high-dimensional vectors, essential for modern AI applications like semantic search, recommendation systems, and RAG (Retrieval-Augmented Generation).
+
+## Overview
+
+The kdevops AI workflow supports comprehensive benchmarking of vector databases to evaluate:
+
+- **Ingestion Performance**: How fast vectors can be indexed
+- **Query Performance**: Search latency and throughput
+- **Scalability**: Performance under different dataset sizes
+- **Storage Efficiency**: Filesystem and storage backend impact
+- **Resource Utilization**: CPU, memory, and I/O patterns
+
+## Supported Vector Databases
+
+### Currently Implemented
+- **[Milvus](milvus.md)** - Industry-leading vector database with comprehensive feature set
+
+### Planned Support
+- **Weaviate** - GraphQL-based vector search engine
+- **Qdrant** - High-performance vector similarity search
+- **Pinecone** - Cloud-native vector database
+- **ChromaDB** - Embedded vector database
+
+## Common Benchmark Patterns
+
+All vector database benchmarks follow similar patterns:
+
+1. **Data Ingestion**
+   - Generate or load vector datasets
+   - Create collections/indexes
+   - Insert vectors in batches
+   - Measure indexing performance
+
+2. **Query Workloads**
+   - Single vector searches
+   - Batch query processing
+   - Filtered searches
+   - Range queries
+
+3. **Performance Metrics**
+   - Queries per second (QPS)
+   - Latency percentiles
+   - Recall accuracy
+   - Resource consumption
+
+## Filesystem Impact
+
+Vector databases heavily depend on storage performance. The workflow tests across:
+
+- **XFS**: Default for many production deployments
+- **ext4**: Traditional Linux filesystem
+- **btrfs**: Copy-on-write with compression support
+- **ZFS**: Advanced features for data integrity
+
+## Configuration Dimensions
+
+Vector database testing explores multiple dimensions:
+
+- **Vector Dimensions**: 128, 256, 512, 768, 1536
+- **Dataset Sizes**: 100K to 100M+ vectors
+- **Index Types**: HNSW, IVF, Flat, Annoy
+- **Distance Metrics**: L2, Cosine, IP
+- **Batch Sizes**: Impact on ingestion/query performance
+
+## Quick Start Example
+
+```bash
+make defconfig-ai-milvus-docker # Configure for Milvus testing
+make bringup # Start the environment
+make ai # Run benchmarks
+make ai-results # Check results
+```
+
+See individual database guides for detailed configuration and usage instructions.
diff --git a/docs/ai/vector-databases/milvus.md b/docs/ai/vector-databases/milvus.md
new file mode 100644
index 00000000..11172774
--- /dev/null
+++ b/docs/ai/vector-databases/milvus.md
@@ -0,0 +1,264 @@
+# Milvus Vector Database Testing
+
+Milvus is a high-performance, cloud-native vector database designed for billion-scale vector similarity search. This guide explains how to benchmark Milvus using the kdevops AI workflow.
+
+## Quick Start
+
+### Basic Workflow
+
+Just like fstests or blktests, the Milvus workflow follows the standard kdevops pattern:
+
+```bash
+make defconfig-ai-milvus-docker # 1. Configure for Milvus testing
+make bringup # 2. Provision the test environment
+make ai # 3. Run the Milvus benchmarks
+make ai-baseline # 4. Establish baseline performance
+make ai-results # 5. View results
+```
+
+That's it! The workflow handles all the complexity of setting up Milvus, generating test data, and running comprehensive benchmarks.
+
+## Deployment Options
+
+### Docker Deployment (Recommended)
+
+The easiest way to test Milvus:
+
+```bash
+make defconfig-ai-milvus-docker
+make bringup
+make ai
+```
+
+This deploys Milvus using Docker Compose with:
+- Milvus standalone server
+- etcd for metadata storage
+- MinIO for object storage
+- Automatic service orchestration
+
+### Native Deployment
+
+For testing Milvus performance without containerization overhead:
+
+```bash
+make defconfig-ai-milvus-native
+make bringup
+make ai
+```
+
+Builds Milvus from source and runs directly on the VM.
+
+### CI/Quick Test Mode
+
+For rapid validation in CI pipelines:
+
+```bash
+# Uses minimal dataset (1000 vectors) and short runtime (30s)
+make defconfig-ai-milvus-docker-ci
+make bringup
+make ai
+```
+
+Or with environment overrides:
+```bash
+AI_VECTOR_DATASET_SIZE=5000 AI_BENCHMARK_RUNTIME=60 make ai
+```
+
+## What Actually Happens
+
+When you run `make ai`, the workflow:
+
+1. **Deploys Milvus** - Starts all required services
+2. **Generates Test Data** - Creates random vectors of configured dimensions
+3. **Creates Collection** - Sets up Milvus collection with appropriate schema
+4. **Ingests Data** - Inserts vectors in batches, measuring throughput
+5. **Builds Index** - Creates HNSW/IVF index on vectors
+6. **Runs Queries** - Executes search workload with various patterns
+7. **Collects Metrics** - Gathers performance data and system metrics
+8. **Generates Reports** - Creates graphs and summary statistics
+
+## Configuration Options
+
+### Via menuconfig
+
+```bash
+make menuconfig
+# Navigate to: Workflows → AI → Vector Databases → Milvus
+```
+
+Key configuration options:
+
+- **Deployment Type**: Docker vs Native
+- **Dataset Size**: 100K to 100M+ vectors (default: 1M)
+- **Vector Dimensions**: 128, 256, 512, 768, 1536 (default: 768)
+- **Batch Size**: Vectors per insert batch (default: 10K)
+- **Index Type**: HNSW, IVF_FLAT, IVF_SQ8
+- **Query Count**: Number of search queries to run
+
+### Via Environment Variables
+
+Override configurations at runtime:
+
+```bash
+# Quick test with small dataset
+AI_VECTOR_DATASET_SIZE=10000 make ai
+
+# Extended benchmark
+AI_BENCHMARK_RUNTIME=3600 make ai
+
+# Custom vector dimensions
+AI_VECTOR_DIMENSIONS=1536 make ai
+```
+
+## Filesystem Testing
+
+Test Milvus performance on different filesystems:
+
+```bash
+# Test on multiple filesystems
+make defconfig-ai-milvus-multifs
+make bringup
+make ai
+
+# Creates separate VMs for each filesystem:
+# - XFS with various configurations
+# - ext4 with bigalloc
+# - btrfs with compression
+```
+
+## A/B Testing
+
+Compare baseline vs development configurations:
+
+```bash
+# Enable A/B testing in menuconfig
+make menuconfig  # Enable CONFIG_KDEVOPS_BASELINE_AND_DEV
+make ai-baseline # Run baseline
+# Make changes (kernel, filesystem, Milvus config)
+make ai-dev   # Run on dev
+make ai-results # Compare results
+```
+
+## Understanding Results
+
+Results are stored in `/data/ai-results/` (configurable) with:
+
+### Performance Metrics
+- **Ingestion Rate**: Vectors indexed per second
+- **Query Latency**: p50, p95, p99 latencies
+- **Query Throughput**: Queries per second (QPS)
+- **Index Build Time**: Time to build vector index
+- **Resource Usage**: CPU, memory, disk I/O
+
+### Output Files
+```
+/data/ai-results/
+├── milvus_benchmark_results.json    # Raw benchmark data
+├── performance_summary.txt          # Human-readable summary
+├── graphs/
+│   ├── ingestion_throughput.png
+│   ├── query_latency_percentiles.png
+│   └── qps_over_time.png
+└── system_metrics/                  # iostat, vmstat data
+```
+
+## Common Tasks
+
+### View Current Milvus Status
+```bash
+ansible all -m shell -a "docker ps | grep milvus"
+```
+
+### Check Milvus Logs
+```bash
+ansible all -m shell -a "docker logs milvus-standalone"
+```
+
+### Reset and Re-run
+```bash
+make ai-destroy  # Clean up Milvus
+make ai         # Fresh run
+```
+
+### Run Specific Phases
+```bash
+make ai-vector-db-milvus-install    # Just install Milvus
+make ai-vector-db-milvus-benchmark  # Just run benchmarks
+make ai-vector-db-milvus-destroy    # Clean up
+```
+
+## Advanced Configuration
+
+### Custom Index Parameters
+
+Edit Milvus collection configuration in menuconfig:
+- HNSW: M (connections), efConstruction
+- IVF: nlist (clusters), nprobe
+- Metric Type: L2, IP, Cosine
+
+### Resource Limits
+
+For Docker deployment:
+```
+CONFIG_AI_VECTOR_DB_MILVUS_MEMORY_LIMIT="8g"
+CONFIG_AI_VECTOR_DB_MILVUS_CPU_LIMIT="4.0"
+```
+
+### Multi-Node Testing
+
+Future support for distributed Milvus cluster testing across multiple nodes.
+
+## Troubleshooting
+
+### Common Issues
+
+1. **Out of Memory**: Reduce dataset size or increase VM memory
+2. **Slow Ingestion**: Check disk I/O, consider faster storage
+3. **Docker Issues**: Ensure Docker service is running on VMs
+
+### Debug Commands
+
+```bash
+# Check Milvus health
+ansible all -m uri -a "url=http://localhost:9091/health"
+
+# View resource usage
+ansible all -m shell -a "docker stats --no-stream"
+
+# Check disk space
+ansible all -m shell -a "df -h /data"
+```
+
+## Performance Tuning Tips
+
+1. **Storage**: Use NVMe/SSD for best performance
+2. **Memory**: Ensure sufficient RAM for dataset + indexes
+3. **CPU**: More cores help with parallel ingestion
+4. **Filesystem**: XFS often performs best for Milvus workloads
+5. **Batch Size**: Larger batches improve ingestion throughput
+
+## Integration with CI/CD
+
+Example GitHub Actions workflow:
+
+```yaml
+- name: Run Milvus CI benchmark
+  run: |
+    AI_VECTOR_DATASET_SIZE=1000 \
+    AI_BENCHMARK_RUNTIME=30 \
+    make defconfig-ai-milvus-docker-ci
+    make bringup
+    make ai
+    make ai-results
+```
+
+## Summary
+
+The Milvus workflow in kdevops makes it simple to:
+- Quickly deploy and benchmark Milvus
+- Compare performance across configurations
+- Test filesystem and kernel impacts
+- Generate reproducible results
+- Scale from quick CI tests to comprehensive benchmarks
+
+Just like running `make fstests`, you can now run `make ai` to benchmark vector databases!
diff --git a/kconfigs/workflows/Kconfig b/kconfigs/workflows/Kconfig
index 6b2a3769..70898a1a 100644
--- a/kconfigs/workflows/Kconfig
+++ b/kconfigs/workflows/Kconfig
@@ -214,6 +214,13 @@ config KDEVOPS_WORKFLOW_DEDICATE_FIO_TESTS
 	  This will dedicate your configuration to running only the
 	  fio-tests workflow for comprehensive storage performance testing.
 
+config KDEVOPS_WORKFLOW_DEDICATE_AI
+	bool "ai"
+	select KDEVOPS_WORKFLOW_ENABLE_AI
+	help
+	  This will dedicate your configuration to running only the
+	  AI workflow for vector database performance testing.
+
 endchoice
 
 config KDEVOPS_WORKFLOW_NAME
@@ -229,6 +236,7 @@ config KDEVOPS_WORKFLOW_NAME
 	default "sysbench" if KDEVOPS_WORKFLOW_DEDICATE_SYSBENCH
 	default "mmtests" if KDEVOPS_WORKFLOW_DEDICATE_MMTESTS
 	default "fio-tests" if KDEVOPS_WORKFLOW_DEDICATE_FIO_TESTS
+	default "ai" if KDEVOPS_WORKFLOW_DEDICATE_AI
 
 endif
 
@@ -338,6 +346,14 @@ config KDEVOPS_WORKFLOW_NOT_DEDICATED_ENABLE_FIO_TESTS
 	  Select this option if you want to provision fio-tests on a
 	  single target node for by-hand testing.
 
+config KDEVOPS_WORKFLOW_NOT_DEDICATED_ENABLE_AI
+	bool "ai"
+	select KDEVOPS_WORKFLOW_ENABLE_AI
+	depends on LIBVIRT || TERRAFORM_PRIVATE_NET
+	help
+	  Select this option if you want to provision AI benchmarks on a
+	  single target node for by-hand testing.
+
 endif # !WORKFLOWS_DEDICATED_WORKFLOW
 
 config KDEVOPS_WORKFLOW_ENABLE_FSTESTS
@@ -462,6 +478,17 @@ source "workflows/fio-tests/Kconfig"
 endmenu
 endif # KDEVOPS_WORKFLOW_ENABLE_FIO_TESTS
 
+config KDEVOPS_WORKFLOW_ENABLE_AI
+	bool
+	output yaml
+	default y if KDEVOPS_WORKFLOW_NOT_DEDICATED_ENABLE_AI || KDEVOPS_WORKFLOW_DEDICATE_AI
+
+if KDEVOPS_WORKFLOW_ENABLE_AI
+menu "Configure and run AI benchmarks"
+source "workflows/ai/Kconfig"
+endmenu
+endif # KDEVOPS_WORKFLOW_ENABLE_AI
+
 config KDEVOPS_WORKFLOW_ENABLE_SSD_STEADY_STATE
        bool "Attain SSD steady state prior to tests"
        output yaml
diff --git a/playbooks/ai.yml b/playbooks/ai.yml
new file mode 100644
index 00000000..b1613309
--- /dev/null
+++ b/playbooks/ai.yml
@@ -0,0 +1,11 @@
+---
+# Main AI workflow orchestration playbook
+# This demonstrates the scalable structure for AI workflows
+
+- name: AI Workflow - Vector Database Setup
+  ansible.builtin.import_playbook: ai_install.yml
+  when: ai_workflow_vector_db | default(true) | bool
+  tags: ['ai', 'setup']
+
+# Benchmarks are run separately via make ai-tests targets
+# They should not run during the setup phase (make ai)
diff --git a/playbooks/ai_benchmark.yml b/playbooks/ai_benchmark.yml
new file mode 100644
index 00000000..85fc117c
--- /dev/null
+++ b/playbooks/ai_benchmark.yml
@@ -0,0 +1,8 @@
+---
+- name: Run Milvus Vector Database Benchmarks
+  hosts: ai
+  vars:
+    ai_vector_db_milvus_benchmark_enable: true
+  roles:
+    - role: milvus
+      tags: ['ai', 'vector_db', 'milvus', 'benchmark']
diff --git a/playbooks/ai_destroy.yml b/playbooks/ai_destroy.yml
new file mode 100644
index 00000000..eef07b2a
--- /dev/null
+++ b/playbooks/ai_destroy.yml
@@ -0,0 +1,24 @@
+---
+- name: Destroy Milvus Vector Database
+  hosts: ai
+  become: true
+  tasks:
+    - name: Stop Milvus containers
+      community.docker.docker_compose:
+        project_src: "{{ ai_vector_db_milvus_config_dir }}"
+        state: absent
+      when: ai_vector_db_milvus_docker | bool
+      # Containers may already be gone during teardown; do not fail the play
+      failed_when: false
+
+    - name: Remove Milvus data directories
+      ansible.builtin.file:
+        path: "{{ item }}"
+        state: absent
+      loop:
+        - "{{ ai_vector_db_milvus_data_dir }}"
+        - "{{ ai_vector_db_milvus_config_dir }}"
+        - "{{ ai_vector_db_milvus_log_dir }}"
+      when: ai_vector_db_force_destroy | default(false) | bool
+
+  tags: ['ai', 'vector_db', 'milvus', 'destroy']
diff --git a/playbooks/ai_install.yml b/playbooks/ai_install.yml
new file mode 100644
index 00000000..70b734e4
--- /dev/null
+++ b/playbooks/ai_install.yml
@@ -0,0 +1,8 @@
+---
+- name: Install Milvus Vector Database
+  hosts: ai
+  become: true
+  become_user: root
+  roles:
+    - role: milvus
+      tags: ['ai', 'vector_db', 'milvus', 'install']
diff --git a/playbooks/ai_results.yml b/playbooks/ai_results.yml
new file mode 100644
index 00000000..881295eb
--- /dev/null
+++ b/playbooks/ai_results.yml
@@ -0,0 +1,6 @@
+---
+- name: Collect and analyze AI benchmark results
+  hosts: ai
+  roles:
+    - ai_collect_results
+  tags: ['ai', 'ai_results']
diff --git a/playbooks/ai_setup.yml b/playbooks/ai_setup.yml
new file mode 100644
index 00000000..f0007ee2
--- /dev/null
+++ b/playbooks/ai_setup.yml
@@ -0,0 +1,6 @@
+---
+- name: Setup AI benchmark environment
+  hosts: ai
+  roles:
+    - ai_setup
+  tags: ['ai', 'ai_setup']
diff --git a/playbooks/ai_tests.yml b/playbooks/ai_tests.yml
new file mode 100644
index 00000000..1a5638fc
--- /dev/null
+++ b/playbooks/ai_tests.yml
@@ -0,0 +1,31 @@
+---
+# AI Tests/Benchmarks playbook
+# This ensures AI infrastructure is set up before running benchmarks
+
+- name: AI Tests - Ensure Milvus is installed
+  hosts: ai
+  become: true
+  become_user: root
+  roles:
+    - role: milvus
+      when: ai_vector_db_milvus | default(false) | bool
+      tags: ['ai', 'milvus', 'setup']
+
+- name: AI Tests - Vector Database Benchmarks
+  hosts: ai
+  become: true
+  vars:
+    # Skip infrastructure setup when running tests
+    ai_skip_setup: true
+  roles:
+    - role: ai_run_benchmarks
+      when: ai_vector_db_milvus | default(false) | bool
+      tags: ['ai', 'benchmark']
+
+- name: AI Tests - Results Collection
+  hosts: ai
+  become: true
+  roles:
+    - role: ai_collect_results
+      when: ai_collect_results | default(true) | bool
+      tags: ['ai', 'results']
diff --git a/playbooks/ai_uninstall.yml b/playbooks/ai_uninstall.yml
new file mode 100644
index 00000000..fb537664
--- /dev/null
+++ b/playbooks/ai_uninstall.yml
@@ -0,0 +1,6 @@
+---
+- name: Uninstall AI benchmark components
+  hosts: ai
+  roles:
+    - ai_uninstall
+  tags: ['ai', 'ai_uninstall']
diff --git a/playbooks/host_vars/debian13-ai-btrfs-default-dev.yml b/playbooks/host_vars/debian13-ai-btrfs-default-dev.yml
new file mode 100644
index 00000000..85b95a52
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-btrfs-default-dev.yml
@@ -0,0 +1,8 @@
+---
+# btrfs default configuration (dev)
+ai_docker_fstype: "btrfs"
+ai_docker_btrfs_mkfs_opts: "-f"
+filesystem_type: "btrfs"
+filesystem_block_size: "default"
+ai_filesystem: "btrfs"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/host_vars/debian13-ai-btrfs-default.yml b/playbooks/host_vars/debian13-ai-btrfs-default.yml
new file mode 100644
index 00000000..f4f18b9e
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-btrfs-default.yml
@@ -0,0 +1,8 @@
+---
+# btrfs default configuration
+ai_docker_fstype: "btrfs"
+ai_docker_btrfs_mkfs_opts: "-f"
+filesystem_type: "btrfs"
+filesystem_block_size: "default"
+ai_filesystem: "btrfs"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/host_vars/debian13-ai-ext4-16k-bigalloc-dev.yml b/playbooks/host_vars/debian13-ai-ext4-16k-bigalloc-dev.yml
new file mode 100644
index 00000000..e4b1a9da
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-ext4-16k-bigalloc-dev.yml
@@ -0,0 +1,8 @@
+---
+# ext4 16k bigalloc configuration (dev)
+ai_docker_fstype: "ext4"
+ai_docker_ext4_mkfs_opts: "-b 4096 -C 16384 -O bigalloc"
+filesystem_type: "ext4"
+filesystem_block_size: "16k-bigalloc"
+ai_filesystem: "ext4"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/host_vars/debian13-ai-ext4-16k-bigalloc.yml b/playbooks/host_vars/debian13-ai-ext4-16k-bigalloc.yml
new file mode 100644
index 00000000..a5624440
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-ext4-16k-bigalloc.yml
@@ -0,0 +1,8 @@
+---
+# ext4 16k bigalloc configuration
+ai_docker_fstype: "ext4"
+ai_docker_ext4_mkfs_opts: "-b 4096 -C 16384 -O bigalloc"
+filesystem_type: "ext4"
+filesystem_block_size: "16k-bigalloc"
+ai_filesystem: "ext4"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/host_vars/debian13-ai-ext4-4k-dev.yml b/playbooks/host_vars/debian13-ai-ext4-4k-dev.yml
new file mode 100644
index 00000000..6ca5fec5
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-ext4-4k-dev.yml
@@ -0,0 +1,8 @@
+---
+# ext4 4k block configuration (dev)
+ai_docker_fstype: "ext4"
+ai_docker_ext4_mkfs_opts: "-b 4096"
+filesystem_type: "ext4"
+filesystem_block_size: "4k"
+ai_filesystem: "ext4"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/host_vars/debian13-ai-ext4-4k.yml b/playbooks/host_vars/debian13-ai-ext4-4k.yml
new file mode 100644
index 00000000..f2840faa
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-ext4-4k.yml
@@ -0,0 +1,8 @@
+---
+# ext4 4k block configuration
+ai_docker_fstype: "ext4"
+ai_docker_ext4_mkfs_opts: "-b 4096"
+filesystem_type: "ext4"
+filesystem_block_size: "4k"
+ai_filesystem: "ext4"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/host_vars/debian13-ai-xfs-16k-4ks-dev.yml b/playbooks/host_vars/debian13-ai-xfs-16k-4ks-dev.yml
new file mode 100644
index 00000000..429e6461
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-xfs-16k-4ks-dev.yml
@@ -0,0 +1,10 @@
+---
+# XFS 16k block, 4k sector configuration (dev)
+ai_docker_fstype: "xfs"
+ai_docker_xfs_blocksize: 16384
+ai_docker_xfs_sectorsize: 4096
+ai_docker_xfs_mkfs_opts: ""
+filesystem_type: "xfs"
+filesystem_block_size: "16k-4ks"
+ai_filesystem: "xfs"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/host_vars/debian13-ai-xfs-16k-4ks.yml b/playbooks/host_vars/debian13-ai-xfs-16k-4ks.yml
new file mode 100644
index 00000000..15200810
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-xfs-16k-4ks.yml
@@ -0,0 +1,10 @@
+---
+# XFS 16k block, 4k sector configuration
+ai_docker_fstype: "xfs"
+ai_docker_xfs_blocksize: 16384
+ai_docker_xfs_sectorsize: 4096
+ai_docker_xfs_mkfs_opts: ""
+filesystem_type: "xfs"
+filesystem_block_size: "16k-4ks"
+ai_filesystem: "xfs"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/host_vars/debian13-ai-xfs-32k-4ks-dev.yml b/playbooks/host_vars/debian13-ai-xfs-32k-4ks-dev.yml
new file mode 100644
index 00000000..6f30a053
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-xfs-32k-4ks-dev.yml
@@ -0,0 +1,10 @@
+---
+# XFS 32k block, 4k sector configuration (dev)
+ai_docker_fstype: "xfs"
+ai_docker_xfs_blocksize: 32768
+ai_docker_xfs_sectorsize: 4096
+ai_docker_xfs_mkfs_opts: ""
+filesystem_type: "xfs"
+filesystem_block_size: "32k-4ks"
+ai_filesystem: "xfs"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/host_vars/debian13-ai-xfs-32k-4ks.yml b/playbooks/host_vars/debian13-ai-xfs-32k-4ks.yml
new file mode 100644
index 00000000..4c78e9a4
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-xfs-32k-4ks.yml
@@ -0,0 +1,10 @@
+---
+# XFS 32k block, 4k sector configuration
+ai_docker_fstype: "xfs"
+ai_docker_xfs_blocksize: 32768
+ai_docker_xfs_sectorsize: 4096
+ai_docker_xfs_mkfs_opts: ""
+filesystem_type: "xfs"
+filesystem_block_size: "32k-4ks"
+ai_filesystem: "xfs"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/host_vars/debian13-ai-xfs-4k-4ks-dev.yml b/playbooks/host_vars/debian13-ai-xfs-4k-4ks-dev.yml
new file mode 100644
index 00000000..f8b8c55b
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-xfs-4k-4ks-dev.yml
@@ -0,0 +1,10 @@
+---
+# XFS 4k block, 4k sector configuration (dev)
+ai_docker_fstype: "xfs"
+ai_docker_xfs_blocksize: 4096
+ai_docker_xfs_sectorsize: 4096
+ai_docker_xfs_mkfs_opts: ""
+filesystem_type: "xfs"
+filesystem_block_size: "4k-4ks"
+ai_filesystem: "xfs"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/host_vars/debian13-ai-xfs-4k-4ks.yml b/playbooks/host_vars/debian13-ai-xfs-4k-4ks.yml
new file mode 100644
index 00000000..ffe9eb28
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-xfs-4k-4ks.yml
@@ -0,0 +1,10 @@
+---
+# XFS 4k block, 4k sector configuration
+ai_docker_fstype: "xfs"
+ai_docker_xfs_blocksize: 4096
+ai_docker_xfs_sectorsize: 4096
+ai_docker_xfs_mkfs_opts: ""
+filesystem_type: "xfs"
+filesystem_block_size: "4k-4ks"
+ai_filesystem: "xfs"
+ai_data_device_path: "/var/lib/docker"
\ No newline at end of file
diff --git a/playbooks/host_vars/debian13-ai-xfs-64k-4ks-dev.yml b/playbooks/host_vars/debian13-ai-xfs-64k-4ks-dev.yml
new file mode 100644
index 00000000..1590f154
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-xfs-64k-4ks-dev.yml
@@ -0,0 +1,10 @@
+---
+# XFS 64k block, 4k sector configuration (dev)
+ai_docker_fstype: "xfs"
+ai_docker_xfs_blocksize: 65536
+ai_docker_xfs_sectorsize: 4096
+ai_docker_xfs_mkfs_opts: ""
+filesystem_type: "xfs"
+filesystem_block_size: "64k-4ks"
+ai_filesystem: "xfs"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/host_vars/debian13-ai-xfs-64k-4ks.yml b/playbooks/host_vars/debian13-ai-xfs-64k-4ks.yml
new file mode 100644
index 00000000..482835c4
--- /dev/null
+++ b/playbooks/host_vars/debian13-ai-xfs-64k-4ks.yml
@@ -0,0 +1,10 @@
+---
+# XFS 64k block, 4k sector configuration
+ai_docker_fstype: "xfs"
+ai_docker_xfs_blocksize: 65536
+ai_docker_xfs_sectorsize: 4096
+ai_docker_xfs_mkfs_opts: ""
+filesystem_type: "xfs"
+filesystem_block_size: "64k-4ks"
+ai_filesystem: "xfs"
+ai_data_device_path: "/var/lib/docker"
diff --git a/playbooks/roles/ai_collect_results/files/analyze_results.py b/playbooks/roles/ai_collect_results/files/analyze_results.py
new file mode 100755
index 00000000..3d11fb11
--- /dev/null
+++ b/playbooks/roles/ai_collect_results/files/analyze_results.py
@@ -0,0 +1,979 @@
+#!/usr/bin/env python3
+"""
+AI Benchmark Results Analysis and Visualization
+
+This script analyzes benchmark results and generates comprehensive graphs
+showing performance characteristics of the AI workload testing.
+"""
+
+import json
+import glob
+import os
+import sys
+import argparse
+import subprocess
+import platform
+from typing import List, Dict, Any
+import logging
+from datetime import datetime
+
+# Optional imports with graceful fallback
+GRAPHING_AVAILABLE = True
+try:
+    import pandas as pd
+    import matplotlib.pyplot as plt
+    import seaborn as sns
+    import numpy as np
+except ImportError as e:
+    GRAPHING_AVAILABLE = False
+    print(f"Warning: Graphing libraries not available: {e}")
+    print("Install with: pip install pandas matplotlib seaborn numpy")
+
+
+class ResultsAnalyzer:
+    def __init__(self, results_dir: str, output_dir: str, config: Dict[str, Any]):
+        self.results_dir = results_dir
+        self.output_dir = output_dir
+        self.config = config
+        self.results_data = []
+
+        # Setup logging
+        logging.basicConfig(
+            level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"
+        )
+        self.logger = logging.getLogger(__name__)
+
+        # Create output directory
+        os.makedirs(output_dir, exist_ok=True)
+
+        # Collect system information for DUT details
+        self.system_info = self._collect_system_info()
+
+    def _collect_system_info(self) -> Dict[str, Any]:
+        """Collect system information for DUT details in HTML report"""
+        info = {}
+
+        try:
+            # Basic system information
+            info["hostname"] = platform.node()
+            info["platform"] = platform.platform()
+            info["architecture"] = platform.architecture()[0]
+            info["processor"] = platform.processor()
+
+            # Memory information
+            try:
+                with open("/proc/meminfo", "r") as f:
+                    meminfo = f.read()
+                    for line in meminfo.split("\n"):
+                        if "MemTotal:" in line:
+                            info["total_memory"] = line.split()[1] + " kB"
+                            break
+            except OSError:
+                info["total_memory"] = "Unknown"
+
+            # CPU information
+            try:
+                with open("/proc/cpuinfo", "r") as f:
+                    cpuinfo = f.read()
+                    cpu_count = cpuinfo.count("processor")
+                    info["cpu_count"] = cpu_count
+
+                    # Extract CPU model
+                    for line in cpuinfo.split("\n"):
+                        if "model name" in line:
+                            info["cpu_model"] = line.split(":", 1)[1].strip()
+                            break
+            except OSError:
+                info["cpu_count"] = "Unknown"
+                info["cpu_model"] = "Unknown"
+
+            # Storage information
+            info["storage_devices"] = self._get_storage_info()
+
+            # Virtualization detection
+            info["is_vm"] = self._detect_virtualization()
+
+            # Filesystem information for AI data directory
+            info["filesystem_info"] = self._get_filesystem_info()
+
+        except Exception as e:
+            self.logger.warning(f"Error collecting system information: {e}")
+
+        return info
+
+    def _get_storage_info(self) -> List[Dict[str, str]]:
+        """Get storage device information including NVMe details"""
+        devices = []
+
+        try:
+            # Get block devices
+            result = subprocess.run(
+                ["lsblk", "-J", "-o", "NAME,SIZE,TYPE,MOUNTPOINT,FSTYPE"],
+                capture_output=True,
+                text=True,
+            )
+            if result.returncode == 0:
+                lsblk_data = json.loads(result.stdout)
+                for device in lsblk_data.get("blockdevices", []):
+                    if device.get("type") == "disk":
+                        dev_info = {
+                            "name": device.get("name", ""),
+                            "size": device.get("size", ""),
+                            "type": "disk",
+                        }
+
+                        # Check if it's NVMe and get additional details
+                        if device.get("name", "").startswith("nvme"):
+                            nvme_info = self._get_nvme_info(device.get("name", ""))
+                            dev_info.update(nvme_info)
+
+                        devices.append(dev_info)
+        except Exception as e:
+            self.logger.warning(f"Error getting storage info: {e}")
+
+        return devices
+
+    def _get_nvme_info(self, device_name: str) -> Dict[str, str]:
+        """Get detailed NVMe device information"""
+        nvme_info = {}
+
+        try:
+            # Get NVMe identify info
+            result = subprocess.run(
+                ["nvme", "id-ctrl", f"/dev/{device_name}"],
+                capture_output=True,
+                text=True,
+            )
+            if result.returncode == 0:
+                output = result.stdout
+                for line in output.split("\n"):
+                    if "mn :" in line:
+                        nvme_info["model"] = line.split(":", 1)[1].strip()
+                    elif "fr :" in line:
+                        nvme_info["firmware"] = line.split(":", 1)[1].strip()
+                    elif "sn :" in line:
+                        nvme_info["serial"] = line.split(":", 1)[1].strip()
+        except Exception as e:
+            self.logger.debug(f"Could not get NVMe info for {device_name}: {e}")
+
+        return nvme_info
+
+    def _detect_virtualization(self) -> str:
+        """Detect if running in a virtual environment"""
+        try:
+            # Check systemd-detect-virt
+            result = subprocess.run(
+                ["systemd-detect-virt"], capture_output=True, text=True
+            )
+            if result.returncode == 0:
+                virt_type = result.stdout.strip()
+                return virt_type if virt_type != "none" else "Physical"
+        except OSError:
+            pass
+
+        try:
+            # Check dmesg for virtualization hints
+            result = subprocess.run(["dmesg"], capture_output=True, text=True)
+            if result.returncode == 0:
+                dmesg_output = result.stdout.lower()
+                if "kvm" in dmesg_output:
+                    return "KVM"
+                elif "vmware" in dmesg_output:
+                    return "VMware"
+                elif "virtualbox" in dmesg_output:
+                    return "VirtualBox"
+                elif "xen" in dmesg_output:
+                    return "Xen"
+        except OSError:
+            pass
+
+        return "Unknown"
+
+    def _get_filesystem_info(self) -> Dict[str, str]:
+        """Get filesystem information for the AI benchmark directory"""
+        fs_info = {}
+
+        try:
+            # Get filesystem info for the results directory
+            result = subprocess.run(
+                ["df", "-T", self.results_dir], capture_output=True, text=True
+            )
+            if result.returncode == 0:
+                lines = result.stdout.strip().split("\n")
+                if len(lines) > 1:
+                    fields = lines[1].split()
+                    if len(fields) >= 2:
+                        fs_info["filesystem_type"] = fields[1]
+                        fs_info["mount_point"] = (
+                            fields[6] if len(fields) > 6 else "Unknown"
+                        )
+
+            # Get mount options
+            try:
+                with open("/proc/mounts", "r") as f:
+                    for line in f:
+                        parts = line.split()
+                        # Compare exactly; a substring check would match
+                        # every line whenever mount_point was unset.
+                        if (
+                            len(parts) >= 4
+                            and parts[1] == fs_info.get("mount_point")
+                        ):
+                            fs_info["mount_options"] = parts[3]
+                            break
+            except OSError:
+                pass
+        except Exception as e:
+            self.logger.warning(f"Error getting filesystem info: {e}")
+
+        return fs_info
+
+    def load_results(self) -> bool:
+        """Load all result files from the results directory"""
+        try:
+            pattern = os.path.join(self.results_dir, "results_*.json")
+            result_files = glob.glob(pattern)
+
+            if not result_files:
+                self.logger.warning(f"No result files found in {self.results_dir}")
+                return False
+
+            self.logger.info(f"Found {len(result_files)} result files")
+
+            for file_path in result_files:
+                try:
+                    with open(file_path, "r") as f:
+                        data = json.load(f)
+                        data["_file"] = os.path.basename(file_path)
+                        self.results_data.append(data)
+                except Exception as e:
+                    self.logger.error(f"Error loading {file_path}: {e}")
+
+            self.logger.info(
+                f"Successfully loaded {len(self.results_data)} result sets"
+            )
+            return len(self.results_data) > 0
+
+        except Exception as e:
+            self.logger.error(f"Error loading results: {e}")
+            return False
+
+    def generate_summary_report(self) -> str:
+        """Generate a text summary report"""
+        try:
+            report = []
+            report.append("=" * 80)
+            report.append("AI BENCHMARK RESULTS SUMMARY")
+            report.append("=" * 80)
+            report.append(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+            report.append(f"Total result sets: {len(self.results_data)}")
+            report.append("")
+
+            if not self.results_data:
+                report.append("No results to analyze.")
+                return "\n".join(report)
+
+            # Configuration summary
+            first_result = self.results_data[0]
+            config = first_result.get("config", {})
+
+            report.append("CONFIGURATION:")
+            dataset_size = config.get("vector_dataset_size")
+            report.append(
+                f"  Vector dataset size: {dataset_size:,}"
+                if isinstance(dataset_size, int)
+                else "  Vector dataset size: N/A"
+            )
+            report.append(
+                f"  Vector dimensions: {config.get('vector_dimensions', 'N/A')}"
+            )
+            report.append(f"  Index type: {config.get('index_type', 'N/A')}")
+            report.append(f"  Benchmark iterations: {len(self.results_data)}")
+            report.append("")
+
+            # Insert performance summary
+            insert_times = []
+            insert_rates = []
+            for result in self.results_data:
+                insert_perf = result.get("insert_performance", {})
+                if insert_perf:
+                    insert_times.append(insert_perf.get("total_time_seconds", 0))
+                    insert_rates.append(insert_perf.get("vectors_per_second", 0))
+
+            if insert_times:
+                report.append("INSERT PERFORMANCE:")
+                # Builtins instead of numpy so the text summary also works
+                # when the optional graphing libraries are missing.
+                report.append(
+                    f"  Average insert time: {sum(insert_times) / len(insert_times):.2f} seconds"
+                )
+                report.append(
+                    f"  Average insert rate: {sum(insert_rates) / len(insert_rates):.2f} vectors/sec"
+                )
+                report.append(
+                    f"  Insert rate range: {min(insert_rates):.2f} - {max(insert_rates):.2f} vectors/sec"
+                )
+                report.append("")
+
+            # Index performance summary
+            index_times = []
+            for result in self.results_data:
+                index_perf = result.get("index_performance", {})
+                if index_perf:
+                    index_times.append(index_perf.get("creation_time_seconds", 0))
+
+            if index_times:
+                report.append("INDEX PERFORMANCE:")
+                report.append(
+                    f"  Average index creation time: {sum(index_times) / len(index_times):.2f} seconds"
+                )
+                report.append(
+                    f"  Index time range: {min(index_times):.2f} - {max(index_times):.2f} seconds"
+                )
+                report.append("")
+
+            # Query performance summary
+            report.append("QUERY PERFORMANCE:")
+            for result in self.results_data:
+                query_perf = result.get("query_performance", {})
+                if query_perf:
+                    for topk, topk_data in query_perf.items():
+                        report.append(f"  {topk.upper()}:")
+                        for batch, batch_data in topk_data.items():
+                            qps = batch_data.get("queries_per_second", 0)
+                            avg_time = batch_data.get("average_time_seconds", 0)
+                            report.append(
+                                f"    {batch}: {qps:.2f} QPS, {avg_time*1000:.2f}ms avg"
+                            )
+                    break  # Only show first result for summary
+
+            return "\n".join(report)
+
+        except Exception as e:
+            self.logger.error(f"Error generating summary report: {e}")
+            return f"Error generating summary: {e}"
+
+    def generate_html_report(self) -> str:
+        """Generate comprehensive HTML report with DUT details and test configuration"""
+        try:
+            html = []
+
+            # HTML header
+            html.append("<!DOCTYPE html>")
+            html.append("<html lang='en'>")
+            html.append("<head>")
+            html.append("    <meta charset='UTF-8'>")
+            html.append(
+                "    <meta name='viewport' content='width=device-width, initial-scale=1.0'>"
+            )
+            html.append("    <title>AI Benchmark Results Report</title>")
+            html.append("    <style>")
+            html.append(
+                "        body { font-family: Arial, sans-serif; margin: 20px; line-height: 1.6; }"
+            )
+            html.append(
+                "        .header { background-color: #f4f4f4; padding: 20px; border-radius: 5px; margin-bottom: 20px; }"
+            )
+            html.append("        .section { margin-bottom: 30px; }")
+            html.append(
+                "        .section h2 { color: #333; border-bottom: 2px solid #007acc; padding-bottom: 5px; }"
+            )
+            html.append("        .section h3 { color: #555; }")
+            html.append(
+                "        table { border-collapse: collapse; width: 100%; margin-bottom: 20px; }"
+            )
+            html.append(
+                "        th, td { border: 1px solid #ddd; padding: 8px; text-align: left; }"
+            )
+            html.append("        th { background-color: #f2f2f2; font-weight: bold; }")
+            html.append(
+                "        .metric-table td:first-child { font-weight: bold; width: 30%; }"
+            )
+            html.append(
+                "        .config-table td:first-child { font-weight: bold; width: 40%; }"
+            )
+            html.append("        .performance-good { color: #27ae60; }")
+            html.append("        .performance-warning { color: #f39c12; }")
+            html.append("        .performance-poor { color: #e74c3c; }")
+            html.append(
+                "        .highlight { background-color: #fff3cd; padding: 10px; border-radius: 3px; }"
+            )
+            html.append("    </style>")
+            html.append("</head>")
+            html.append("<body>")
+
+            # Report header
+            html.append("    <div class='header'>")
+            html.append("        <h1>AI Benchmark Results Report</h1>")
+            html.append(
+                f"        <p><strong>Generated:</strong> {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}</p>"
+            )
+            html.append(
+                f"        <p><strong>Test Results:</strong> {len(self.results_data)} benchmark iterations</p>"
+            )
+
+            # Test type identification
+            html.append("        <div class='highlight'>")
+            html.append("            <h3>🤖 AI Workflow Test Type</h3>")
+            html.append(
+                "            <p><strong>Vector Database Performance Testing</strong> using <strong>Milvus Vector Database</strong></p>"
+            )
+            html.append(
+                "            <p>This test evaluates AI workload performance including vector insertion, indexing, and similarity search operations.</p>"
+            )
+            html.append("        </div>")
+            html.append("    </div>")
+
+            # Device Under Test (DUT) Section
+            html.append("    <div class='section'>")
+            html.append("        <h2>📋 Device Under Test (DUT) Details</h2>")
+            html.append("        <table class='config-table'>")
+            html.append(
+                "            <tr><td>Hostname</td><td>"
+                + str(self.system_info.get("hostname", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>System Type</td><td>"
+                + str(self.system_info.get("is_vm", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>Platform</td><td>"
+                + str(self.system_info.get("platform", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>Architecture</td><td>"
+                + str(self.system_info.get("architecture", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>CPU Model</td><td>"
+                + str(self.system_info.get("cpu_model", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>CPU Count</td><td>"
+                + str(self.system_info.get("cpu_count", "Unknown"))
+                + " cores</td></tr>"
+            )
+            html.append(
+                "            <tr><td>Total Memory</td><td>"
+                + str(self.system_info.get("total_memory", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append("        </table>")
+
+            # Storage devices section
+            html.append("        <h3>💾 Storage Configuration</h3>")
+            storage_devices = self.system_info.get("storage_devices", [])
+            if storage_devices:
+                html.append("        <table>")
+                html.append(
+                    "            <tr><th>Device</th><th>Size</th><th>Type</th><th>Model</th><th>Firmware</th></tr>"
+                )
+                for device in storage_devices:
+                    model = device.get("model", "N/A")
+                    firmware = device.get("firmware", "N/A")
+                    html.append("            <tr>")
+                    html.append(
+                        f"                <td>{device.get('name', 'Unknown')}</td>"
+                    )
+                    html.append(
+                        f"                <td>{device.get('size', 'Unknown')}</td>"
+                    )
+                    html.append(
+                        f"                <td>{device.get('type', 'Unknown')}</td>"
+                    )
+                    html.append(f"                <td>{model}</td>")
+                    html.append(f"                <td>{firmware}</td>")
+                    html.append("            </tr>")
+                html.append("        </table>")
+            else:
+                html.append("        <p>No storage device information available.</p>")
+
+            # Filesystem section
+            html.append("        <h3>🗂️ Filesystem Configuration</h3>")
+            fs_info = self.system_info.get("filesystem_info", {})
+            html.append("        <table class='config-table'>")
+            html.append(
+                "            <tr><td>Filesystem Type</td><td>"
+                + str(fs_info.get("filesystem_type", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>Mount Point</td><td>"
+                + str(fs_info.get("mount_point", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>Mount Options</td><td>"
+                + str(fs_info.get("mount_options", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append("        </table>")
+            html.append("    </div>")
+
+            # Test Configuration Section
+            if self.results_data:
+                first_result = self.results_data[0]
+                config = first_result.get("config", {})
+
+                html.append("    <div class='section'>")
+                html.append("        <h2>⚙️ AI Test Configuration</h2>")
+                html.append("        <table class='config-table'>")
+                dataset_size = config.get("vector_dataset_size")
+                dataset_size_str = (
+                    f"{dataset_size:,}" if isinstance(dataset_size, int) else "N/A"
+                )
+                html.append(
+                    f"            <tr><td>Vector Dataset Size</td><td>{dataset_size_str} vectors</td></tr>"
+                )
+                html.append(
+                    f"            <tr><td>Vector Dimensions</td><td>{config.get('vector_dimensions', 'N/A')}</td></tr>"
+                )
+                html.append(
+                    f"            <tr><td>Index Type</td><td>{config.get('index_type', 'N/A')}</td></tr>"
+                )
+                html.append(
+                    f"            <tr><td>Benchmark Iterations</td><td>{len(self.results_data)}</td></tr>"
+                )
+
+                # Add index-specific parameters
+                if config.get("index_type") == "HNSW":
+                    html.append(
+                        f"            <tr><td>HNSW M Parameter</td><td>{config.get('hnsw_m', 'N/A')}</td></tr>"
+                    )
+                    html.append(
+                        f"            <tr><td>HNSW ef Construction</td><td>{config.get('hnsw_ef_construction', 'N/A')}</td></tr>"
+                    )
+                    html.append(
+                        f"            <tr><td>HNSW ef Search</td><td>{config.get('hnsw_ef', 'N/A')}</td></tr>"
+                    )
+                elif config.get("index_type") == "IVF_FLAT":
+                    html.append(
+                        f"            <tr><td>IVF nlist</td><td>{config.get('ivf_nlist', 'N/A')}</td></tr>"
+                    )
+                    html.append(
+                        f"            <tr><td>IVF nprobe</td><td>{config.get('ivf_nprobe', 'N/A')}</td></tr>"
+                    )
+
+                html.append("        </table>")
+                html.append("    </div>")
+
+            # Performance Results Section
+            html.append("    <div class='section'>")
+            html.append("        <h2>📊 Performance Results Summary</h2>")
+
+            if self.results_data:
+                # Insert performance
+                insert_times = [
+                    r.get("insert_performance", {}).get("total_time_seconds", 0)
+                    for r in self.results_data
+                ]
+                insert_rates = [
+                    r.get("insert_performance", {}).get("vectors_per_second", 0)
+                    for r in self.results_data
+                ]
+
+                if insert_times and any(t > 0 for t in insert_times):
+                    html.append("        <h3>📈 Vector Insert Performance</h3>")
+                    html.append("        <table class='metric-table'>")
+                    html.append(
+                        f"            <tr><td>Average Insert Time</td><td>{np.mean(insert_times):.2f} seconds</td></tr>"
+                    )
+                    html.append(
+                        f"            <tr><td>Average Insert Rate</td><td>{np.mean(insert_rates):.2f} vectors/sec</td></tr>"
+                    )
+                    html.append(
+                        f"            <tr><td>Insert Rate Range</td><td>{np.min(insert_rates):.2f} - {np.max(insert_rates):.2f} vectors/sec</td></tr>"
+                    )
+                    html.append("        </table>")
+
+                # Index performance
+                index_times = [
+                    r.get("index_performance", {}).get("creation_time_seconds", 0)
+                    for r in self.results_data
+                ]
+                if index_times and any(t > 0 for t in index_times):
+                    html.append("        <h3>🔗 Index Creation Performance</h3>")
+                    html.append("        <table class='metric-table'>")
+                    html.append(
+                        f"            <tr><td>Average Index Creation Time</td><td>{np.mean(index_times):.2f} seconds</td></tr>"
+                    )
+                    html.append(
+                        f"            <tr><td>Index Time Range</td><td>{np.min(index_times):.2f} - {np.max(index_times):.2f} seconds</td></tr>"
+                    )
+                    html.append("        </table>")
+
+                # Query performance
+                html.append("        <h3>🔍 Query Performance</h3>")
+                first_query_perf = self.results_data[0].get("query_performance", {})
+                if first_query_perf:
+                    html.append("        <table>")
+                    html.append(
+                        "            <tr><th>Query Type</th><th>Batch Size</th><th>QPS</th><th>Avg Latency (ms)</th></tr>"
+                    )
+
+                    for topk, topk_data in first_query_perf.items():
+                        for batch, batch_data in topk_data.items():
+                            qps = batch_data.get("queries_per_second", 0)
+                            avg_time = batch_data.get("average_time_seconds", 0) * 1000
+
+                            # Color coding for performance
+                            qps_class = ""
+                            if qps > 1000:
+                                qps_class = "performance-good"
+                            elif qps > 100:
+                                qps_class = "performance-warning"
+                            else:
+                                qps_class = "performance-poor"
+
+                            html.append("            <tr>")
+                            html.append(
+                                f"                <td>{topk.replace('topk_', 'Top-')}</td>"
+                            )
+                            html.append(
+                                f"                <td>{batch.replace('batch_', 'Batch ')}</td>"
+                            )
+                            html.append(
+                                f"                <td class='{qps_class}'>{qps:.2f}</td>"
+                            )
+                            html.append(f"                <td>{avg_time:.2f}</td>")
+                            html.append("            </tr>")
+
+                    html.append("        </table>")
+
+                html.append("    </div>")
+
+            # Footer
+            html.append("    <div class='section'>")
+            html.append("        <h2>📝 Notes</h2>")
+            html.append("        <ul>")
+            html.append(
+                "            <li>This report was generated automatically by the AI benchmark analysis tool</li>"
+            )
+            html.append(
+                "            <li>Performance metrics are averaged across all benchmark iterations</li>"
+            )
+            html.append(
+                "            <li>QPS (Queries Per Second) values are color-coded: <span class='performance-good'>Green (>1000)</span>, <span class='performance-warning'>Orange (100-1000)</span>, <span class='performance-poor'>Red (<100)</span></li>"
+            )
+            html.append(
+                "            <li>Storage device information may require root privileges to display NVMe details</li>"
+            )
+            html.append("        </ul>")
+            html.append("    </div>")
+
+            html.append("</body>")
+            html.append("</html>")
+
+            return "\n".join(html)
+
+        except Exception as e:
+            self.logger.error(f"Error generating HTML report: {e}")
+            return (
+                f"<html><body><h1>Error generating HTML report: {e}</h1></body></html>"
+            )
+
+    def generate_graphs(self) -> bool:
+        """Generate performance visualization graphs"""
+        if not GRAPHING_AVAILABLE:
+            self.logger.warning(
+                "Graphing libraries not available, skipping graph generation"
+            )
+            return False
+
+        try:
+            # Set matplotlib style
+            if self.config.get("graph_theme", "default") != "default":
+                plt.style.use(self.config["graph_theme"])
+
+            # Graph 1: Insert Performance
+            self._plot_insert_performance()
+
+            # Graph 2: Query Performance by Top-K
+            self._plot_query_performance()
+
+            # Graph 3: Index Creation Time
+            self._plot_index_performance()
+
+            # Graph 4: Performance Comparison Matrix
+            self._plot_performance_matrix()
+
+            self.logger.info("Graphs generated successfully")
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Error generating graphs: {e}")
+            return False
+
+    def _plot_insert_performance(self):
+        """Plot insert performance metrics"""
+        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+
+        # Extract insert data
+        iterations = []
+        insert_rates = []
+        insert_times = []
+
+        for i, result in enumerate(self.results_data):
+            insert_perf = result.get("insert_performance", {})
+            if insert_perf:
+                iterations.append(i + 1)
+                insert_rates.append(insert_perf.get("vectors_per_second", 0))
+                insert_times.append(insert_perf.get("total_time_seconds", 0))
+
+        # Plot insert rate
+        ax1.plot(iterations, insert_rates, "b-o", linewidth=2, markersize=6)
+        ax1.set_xlabel("Iteration")
+        ax1.set_ylabel("Vectors/Second")
+        ax1.set_title("Vector Insert Rate Performance")
+        ax1.grid(True, alpha=0.3)
+
+        # Plot insert time
+        ax2.plot(iterations, insert_times, "r-o", linewidth=2, markersize=6)
+        ax2.set_xlabel("Iteration")
+        ax2.set_ylabel("Total Time (seconds)")
+        ax2.set_title("Vector Insert Time Performance")
+        ax2.grid(True, alpha=0.3)
+
+        plt.tight_layout()
+        output_file = os.path.join(
+            self.output_dir,
+            f"insert_performance.{self.config.get('graph_format', 'png')}",
+        )
+        plt.savefig(
+            output_file, dpi=self.config.get("graph_dpi", 300), bbox_inches="tight"
+        )
+        plt.close()
+
+    def _plot_query_performance(self):
+        """Plot query performance metrics"""
+        if not self.results_data:
+            return
+
+        # Collect query performance data
+        query_data = []
+        for result in self.results_data:
+            query_perf = result.get("query_performance", {})
+            for topk, topk_data in query_perf.items():
+                for batch, batch_data in topk_data.items():
+                    query_data.append(
+                        {
+                            "topk": topk.replace("topk_", ""),
+                            "batch": batch.replace("batch_", ""),
+                            "qps": batch_data.get("queries_per_second", 0),
+                            "avg_time": batch_data.get("average_time_seconds", 0)
+                            * 1000,  # Convert to ms
+                        }
+                    )
+
+        if not query_data:
+            return
+
+        df = pd.DataFrame(query_data)
+
+        # Create subplots
+        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+
+        # QPS heatmap
+        qps_pivot = df.pivot_table(
+            values="qps", index="topk", columns="batch", aggfunc="mean"
+        )
+        sns.heatmap(qps_pivot, annot=True, fmt=".1f", ax=ax1, cmap="YlOrRd")
+        ax1.set_title("Queries Per Second (QPS)")
+        ax1.set_xlabel("Batch Size")
+        ax1.set_ylabel("Top-K")
+
+        # Latency heatmap
+        latency_pivot = df.pivot_table(
+            values="avg_time", index="topk", columns="batch", aggfunc="mean"
+        )
+        sns.heatmap(latency_pivot, annot=True, fmt=".1f", ax=ax2, cmap="YlOrRd")
+        ax2.set_title("Average Query Latency (ms)")
+        ax2.set_xlabel("Batch Size")
+        ax2.set_ylabel("Top-K")
+
+        plt.tight_layout()
+        output_file = os.path.join(
+            self.output_dir,
+            f"query_performance.{self.config.get('graph_format', 'png')}",
+        )
+        plt.savefig(
+            output_file, dpi=self.config.get("graph_dpi", 300), bbox_inches="tight"
+        )
+        plt.close()
+
+    def _plot_index_performance(self):
+        """Plot index creation performance"""
+        iterations = []
+        index_times = []
+
+        for i, result in enumerate(self.results_data):
+            index_perf = result.get("index_performance", {})
+            if index_perf:
+                iterations.append(i + 1)
+                index_times.append(index_perf.get("creation_time_seconds", 0))
+
+        if not index_times:
+            return
+
+        plt.figure(figsize=(10, 6))
+        plt.bar(iterations, index_times, alpha=0.7, color="green")
+        plt.xlabel("Iteration")
+        plt.ylabel("Index Creation Time (seconds)")
+        plt.title("Index Creation Performance")
+        plt.grid(True, alpha=0.3)
+
+        # Add average line
+        avg_time = np.mean(index_times)
+        plt.axhline(
+            y=avg_time, color="red", linestyle="--", label=f"Average: {avg_time:.2f}s"
+        )
+        plt.legend()
+
+        output_file = os.path.join(
+            self.output_dir,
+            f"index_performance.{self.config.get('graph_format', 'png')}",
+        )
+        plt.savefig(
+            output_file, dpi=self.config.get("graph_dpi", 300), bbox_inches="tight"
+        )
+        plt.close()
+
+    def _plot_performance_matrix(self):
+        """Plot comprehensive performance comparison matrix"""
+        if len(self.results_data) < 2:
+            return
+
+        # Extract key metrics for comparison
+        metrics = []
+        for i, result in enumerate(self.results_data):
+            insert_perf = result.get("insert_performance", {})
+            index_perf = result.get("index_performance", {})
+
+            metric = {
+                "iteration": i + 1,
+                "insert_rate": insert_perf.get("vectors_per_second", 0),
+                "index_time": index_perf.get("creation_time_seconds", 0),
+            }
+
+            # Add query metrics
+            query_perf = result.get("query_performance", {})
+            if "topk_10" in query_perf and "batch_1" in query_perf["topk_10"]:
+                metric["query_qps"] = query_perf["topk_10"]["batch_1"].get(
+                    "queries_per_second", 0
+                )
+
+            metrics.append(metric)
+
+        df = pd.DataFrame(metrics)
+
+        # Normalize metrics to [0, 1]; index_time is inverted so that a
+        # higher value is better on every axis, matching the
+        # "Index Time (inv)" axis label below
+        numeric_cols = ["insert_rate", "index_time", "query_qps"]
+        for col in numeric_cols:
+            if col in df.columns:
+                norm = (df[col] - df[col].min()) / (
+                    df[col].max() - df[col].min() + 1e-6
+                )
+                df[f"{col}_norm"] = 1 - norm if col == "index_time" else norm
+
+        # Create radar chart
+        fig, ax = plt.subplots(figsize=(10, 8), subplot_kw=dict(projection="polar"))
+
+        angles = np.linspace(0, 2 * np.pi, len(numeric_cols), endpoint=False).tolist()
+        angles += angles[:1]  # Complete the circle
+
+        for i, row in df.iterrows():
+            values = [row.get(f"{col}_norm", 0) for col in numeric_cols]
+            values += values[:1]  # Complete the circle
+
+            ax.plot(
+                angles, values, "o-", linewidth=2, label=f'Iteration {row["iteration"]}'
+            )
+            ax.fill(angles, values, alpha=0.25)
+
+        ax.set_xticks(angles[:-1])
+        ax.set_xticklabels(["Insert Rate", "Index Time (inv)", "Query QPS"])
+        ax.set_ylim(0, 1)
+        ax.set_title("Performance Comparison Matrix (Normalized)", y=1.08)
+        ax.legend(loc="upper right", bbox_to_anchor=(1.3, 1.0))
+
+        output_file = os.path.join(
+            self.output_dir,
+            f"performance_matrix.{self.config.get('graph_format', 'png')}",
+        )
+        plt.savefig(
+            output_file, dpi=self.config.get("graph_dpi", 300), bbox_inches="tight"
+        )
+        plt.close()
+
+    def analyze(self) -> bool:
+        """Run complete analysis"""
+        self.logger.info("Starting results analysis...")
+
+        if not self.load_results():
+            return False
+
+        # Generate summary report
+        summary = self.generate_summary_report()
+        summary_file = os.path.join(self.output_dir, "benchmark_summary.txt")
+        with open(summary_file, "w") as f:
+            f.write(summary)
+        self.logger.info(f"Summary report saved to {summary_file}")
+
+        # Generate HTML report
+        html_report = self.generate_html_report()
+        html_file = os.path.join(self.output_dir, "benchmark_report.html")
+        with open(html_file, "w") as f:
+            f.write(html_report)
+        self.logger.info(f"HTML report saved to {html_file}")
+
+        # Generate graphs if enabled
+        if self.config.get("enable_graphing", True):
+            self.generate_graphs()
+
+        # Create consolidated JSON report
+        consolidated_file = os.path.join(self.output_dir, "consolidated_results.json")
+        with open(consolidated_file, "w") as f:
+            json.dump(
+                {
+                    "summary": summary.split("\n"),
+                    "raw_results": self.results_data,
+                    "analysis_timestamp": datetime.now().isoformat(),
+                    "system_info": self.system_info,
+                },
+                f,
+                indent=2,
+            )
+
+        self.logger.info("Analysis completed successfully")
+        return True
+
+
+def main():
+    parser = argparse.ArgumentParser(description="Analyze AI benchmark results")
+    parser.add_argument(
+        "--results-dir", required=True, help="Directory containing result files"
+    )
+    parser.add_argument(
+        "--output-dir", required=True, help="Directory for analysis output"
+    )
+    parser.add_argument("--config", help="Analysis configuration file (JSON)")
+
+    args = parser.parse_args()
+
+    # Load configuration
+    config = {
+        "enable_graphing": True,
+        "graph_format": "png",
+        "graph_dpi": 300,
+        "graph_theme": "default",
+    }
+
+    if args.config:
+        try:
+            with open(args.config, "r") as f:
+                config.update(json.load(f))
+        except Exception as e:
+            print(f"Error loading config file: {e}")
+
+    # Run analysis
+    analyzer = ResultsAnalyzer(args.results_dir, args.output_dir, config)
+    success = analyzer.analyze()
+
+    return 0 if success else 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/playbooks/roles/ai_collect_results/files/generate_better_graphs.py b/playbooks/roles/ai_collect_results/files/generate_better_graphs.py
new file mode 100755
index 00000000..645bac9e
--- /dev/null
+++ b/playbooks/roles/ai_collect_results/files/generate_better_graphs.py
@@ -0,0 +1,548 @@
+#!/usr/bin/env python3
+"""
+Generate meaningful graphs for AI benchmark results
+Focus on QPS and Latency metrics that matter
+"""
+
+import json
+import os
+import sys
+import glob
+import numpy as np
+import matplotlib
+
+matplotlib.use("Agg")  # Use non-interactive backend
+import matplotlib.pyplot as plt
+from datetime import datetime
+from pathlib import Path
+from collections import defaultdict
+import subprocess
+
+
+def extract_filesystem_from_filename(filename):
+    """Extract filesystem type from result filename"""
+    # Expected format: results_debian13-ai-xfs-4k-4ks_1.json or results_debian13-ai-ext4-4k_1.json
+    if "debian13-ai-" in filename:
+        # Remove the "results_" prefix and ".json" suffix
+        node_name = filename.replace("results_", "").replace(".json", "")
+        # Remove the iteration number at the end
+        if "_" in node_name:
+            parts = node_name.split("_")
+            node_name = "_".join(parts[:-1])  # Remove last part (iteration)
+
+        # Extract filesystem type from node name
+        if "-xfs-" in node_name:
+            return "xfs"
+        elif "-ext4-" in node_name:
+            return "ext4"
+        elif "-btrfs-" in node_name:
+            return "btrfs"
+
+    return "unknown"
+
+def extract_node_config_from_filename(filename):
+    """Extract detailed node configuration from filename"""
+    # Expected format: results_debian13-ai-xfs-4k-4ks_1.json
+    if "debian13-ai-" in filename:
+        # Remove the "results_" prefix and ".json" suffix
+        node_name = filename.replace("results_", "").replace(".json", "")
+        # Remove the iteration number at the end
+        if "_" in node_name:
+            parts = node_name.split("_")
+            node_name = "_".join(parts[:-1])  # Remove last part (iteration)
+
+        # Remove -dev suffix if present
+        node_name = node_name.replace("-dev", "")
+
+        return node_name.replace("debian13-ai-", "")
+
+    return "unknown"
+
+def detect_filesystem():
+    """Detect the filesystem type of /data on test nodes"""
+    # This is now a fallback - we primarily use filename-based detection
+    try:
+        # Try to get filesystem info from a test node
+        result = subprocess.run(
+            ["ssh", "debian13-ai", "df -T /data | tail -1"],
+            capture_output=True,
+            text=True,
+            timeout=5,
+        )
+        if result.returncode == 0:
+            parts = result.stdout.strip().split()
+            if len(parts) >= 2:
+                return parts[1]  # filesystem type is second column
+    except (subprocess.SubprocessError, OSError):
+        pass
+
+    # Fallback to local filesystem check
+    try:
+        result = subprocess.run(["df", "-T", "."], capture_output=True, text=True)
+        if result.returncode == 0:
+            lines = result.stdout.strip().split("\n")
+            if len(lines) > 1:
+                parts = lines[1].split()
+                if len(parts) >= 2:
+                    return parts[1]
+    except (subprocess.SubprocessError, OSError):
+        pass
+
+    return "unknown"
+
+
+def load_results(results_dir):
+    """Load all JSON result files from the directory"""
+    results = []
+    json_files = glob.glob(os.path.join(results_dir, "results_*.json"))
+
+    for json_file in json_files:
+        try:
+            with open(json_file, "r") as f:
+                data = json.load(f)
+
+                # Extract node type from filename
+                filename = os.path.basename(json_file)
+                data["filename"] = filename
+                
+                # Extract filesystem type and config from filename
+                data["filesystem"] = extract_filesystem_from_filename(filename)
+                data["node_config"] = extract_node_config_from_filename(filename)
+
+                # Determine if it's baseline or dev
+                if "-dev_" in filename or "-dev." in filename:
+                    data["node_type"] = "dev"
+                    data["is_dev"] = True
+                else:
+                    data["node_type"] = "baseline"
+                    data["is_dev"] = False
+
+                # Extract iteration number
+                if "_" in filename:
+                    parts = filename.split("_")
+                    iteration = parts[-1].replace(".json", "")
+                    data["iteration"] = int(iteration) if iteration.isdigit() else 1
+                else:
+                    data["iteration"] = 1
+
+                results.append(data)
+        except Exception as e:
+            print(f"Error loading {json_file}: {e}")
+
+    return results
+
+
+def create_qps_comparison_chart(results, output_dir):
+    """Create a clear QPS comparison chart between baseline and dev"""
+
+    # Organize data by node type and test configuration
+    baseline_data = defaultdict(list)
+    dev_data = defaultdict(list)
+
+    for result in results:
+        if "query_performance" not in result:
+            continue
+
+        qp = result["query_performance"]
+        node_type = result.get("node_type", "unknown")
+
+        # Extract QPS for different configurations
+        for topk in ["topk_1", "topk_10", "topk_100"]:
+            if topk not in qp:
+                continue
+            for batch in ["batch_1", "batch_10", "batch_100"]:
+                if batch not in qp[topk]:
+                    continue
+
+                config_name = f"{topk}_{batch}"
+                qps = qp[topk][batch].get("queries_per_second", 0)
+
+                if node_type == "dev":
+                    dev_data[config_name].append(qps)
+                else:
+                    baseline_data[config_name].append(qps)
+
+    # Calculate averages
+    configs = sorted(set(baseline_data.keys()) | set(dev_data.keys()))
+    baseline_avg = [
+        np.mean(baseline_data[c]) if baseline_data[c] else 0 for c in configs
+    ]
+    dev_avg = [np.mean(dev_data[c]) if dev_data[c] else 0 for c in configs]
+
+    # Create the plot
+    fig, ax = plt.subplots(figsize=(14, 8))
+
+    x = np.arange(len(configs))
+    width = 0.35
+
+    baseline_bars = ax.bar(
+        x - width / 2, baseline_avg, width, label="Baseline", color="#2E86AB"
+    )
+    dev_bars = ax.bar(
+        x + width / 2, dev_avg, width, label="Development", color="#A23B72"
+    )
+
+    # Customize the plot
+    ax.set_xlabel("Query Configuration", fontsize=12)
+    ax.set_ylabel("Queries Per Second (QPS)", fontsize=12)
+    fs_type = results[0].get("filesystem", "unknown") if results else "unknown"
+    ax.set_title(
+        f"Milvus Query Performance Comparison\nFilesystem: {fs_type.upper()}",
+        fontsize=14,
+        fontweight="bold",
+    )
+    ax.set_xticks(x)
+    ax.set_xticklabels([c.replace("_", "\n") for c in configs], rotation=45, ha="right")
+    ax.legend(fontsize=11)
+    ax.grid(True, alpha=0.3, axis="y")
+
+    # Add value labels on bars
+    for bars in [baseline_bars, dev_bars]:
+        for bar in bars:
+            height = bar.get_height()
+            if height > 0:
+                ax.annotate(
+                    f"{height:.0f}",
+                    xy=(bar.get_x() + bar.get_width() / 2, height),
+                    xytext=(0, 3),
+                    textcoords="offset points",
+                    ha="center",
+                    va="bottom",
+                    fontsize=9,
+                )
+
+    plt.tight_layout()
+    plt.savefig(
+        os.path.join(output_dir, "qps_comparison.png"), dpi=150, bbox_inches="tight"
+    )
+    plt.close()
+
+    print("Generated QPS comparison chart")
+
+
+def create_latency_comparison_chart(results, output_dir):
+    """Create latency comparison chart (lower is better)"""
+
+    # Organize data by node type and test configuration
+    baseline_latency = defaultdict(list)
+    dev_latency = defaultdict(list)
+
+    for result in results:
+        if "query_performance" not in result:
+            continue
+
+        qp = result["query_performance"]
+        node_type = result.get("node_type", "unknown")
+
+        # Extract latency for different configurations
+        for topk in ["topk_1", "topk_10", "topk_100"]:
+            if topk not in qp:
+                continue
+            for batch in ["batch_1", "batch_10", "batch_100"]:
+                if batch not in qp[topk]:
+                    continue
+
+                config_name = f"{topk}_{batch}"
+                # Convert to milliseconds for readability
+                latency_ms = qp[topk][batch].get("average_time_seconds", 0) * 1000
+
+                if node_type == "dev":
+                    dev_latency[config_name].append(latency_ms)
+                else:
+                    baseline_latency[config_name].append(latency_ms)
+
+    # Calculate averages
+    configs = sorted(set(baseline_latency.keys()) | set(dev_latency.keys()))
+    baseline_avg = [
+        np.mean(baseline_latency[c]) if baseline_latency[c] else 0 for c in configs
+    ]
+    dev_avg = [np.mean(dev_latency[c]) if dev_latency[c] else 0 for c in configs]
+
+    # Create the plot
+    fig, ax = plt.subplots(figsize=(14, 8))
+
+    x = np.arange(len(configs))
+    width = 0.35
+
+    baseline_bars = ax.bar(
+        x - width / 2, baseline_avg, width, label="Baseline", color="#2E86AB"
+    )
+    dev_bars = ax.bar(
+        x + width / 2, dev_avg, width, label="Development", color="#A23B72"
+    )
+
+    # Customize the plot
+    ax.set_xlabel("Query Configuration", fontsize=12)
+    ax.set_ylabel("Average Latency (milliseconds)", fontsize=12)
+    fs_type = results[0].get("filesystem", "unknown") if results else "unknown"
+    ax.set_title(
+        f"Milvus Query Latency Comparison (Lower is Better)\nFilesystem: {fs_type.upper()}",
+        fontsize=14,
+        fontweight="bold",
+    )
+    ax.set_xticks(x)
+    ax.set_xticklabels([c.replace("_", "\n") for c in configs], rotation=45, ha="right")
+    ax.legend(fontsize=11)
+    ax.grid(True, alpha=0.3, axis="y")
+
+    # Add value labels on bars
+    for bars in [baseline_bars, dev_bars]:
+        for bar in bars:
+            height = bar.get_height()
+            if height > 0:
+                ax.annotate(
+                    f"{height:.1f}ms",
+                    xy=(bar.get_x() + bar.get_width() / 2, height),
+                    xytext=(0, 3),
+                    textcoords="offset points",
+                    ha="center",
+                    va="bottom",
+                    fontsize=9,
+                )
+
+    plt.tight_layout()
+    plt.savefig(
+        os.path.join(output_dir, "latency_comparison.png"), dpi=150, bbox_inches="tight"
+    )
+    plt.close()
+
+    print(f"Generated latency comparison chart")
+
+
+def create_insert_performance_chart(results, output_dir):
+    """Create insert performance comparison"""
+
+    baseline_insert = []
+    dev_insert = []
+
+    for result in results:
+        if "insert_performance" not in result:
+            continue
+
+        vectors_per_sec = result["insert_performance"].get("vectors_per_second", 0)
+        node_type = result.get("node_type", "unknown")
+
+        if node_type == "dev":
+            dev_insert.append(vectors_per_sec)
+        else:
+            baseline_insert.append(vectors_per_sec)
+
+    if not baseline_insert and not dev_insert:
+        return
+
+    # Create box plot for insert performance
+    fig, ax = plt.subplots(figsize=(10, 6))
+
+    data_to_plot = []
+    labels = []
+
+    if baseline_insert:
+        data_to_plot.append(baseline_insert)
+        labels.append("Baseline")
+    if dev_insert:
+        data_to_plot.append(dev_insert)
+        labels.append("Development")
+
+    bp = ax.boxplot(data_to_plot, labels=labels, patch_artist=True)
+
+    # Color the boxes
+    colors = ["#2E86AB", "#A23B72"]
+    for patch, color in zip(bp["boxes"], colors[: len(bp["boxes"])]):
+        patch.set_facecolor(color)
+        patch.set_alpha(0.7)
+
+    # Add individual points
+    for i, data in enumerate(data_to_plot, 1):
+        x = np.random.normal(i, 0.04, size=len(data))
+        ax.scatter(x, data, alpha=0.4, s=30, color="black")
+
+    ax.set_ylabel("Vectors per Second", fontsize=12)
+    fs_type = results[0].get("filesystem", "unknown") if results else "unknown"
+    ax.set_title(
+        f"Insert Performance Distribution\nFilesystem: {fs_type.upper()}",
+        fontsize=14,
+        fontweight="bold",
+    )
+    ax.grid(True, alpha=0.3, axis="y")
+
+    # Add mean values
+    for i, data in enumerate(data_to_plot, 1):
+        mean_val = np.mean(data)
+        ax.text(
+            i,
+            mean_val,
+            f"μ={mean_val:.0f}",
+            ha="center",
+            va="bottom",
+            fontweight="bold",
+        )
+
+    plt.tight_layout()
+    plt.savefig(
+        os.path.join(output_dir, "insert_performance.png"), dpi=150, bbox_inches="tight"
+    )
+    plt.close()
+
+    print(f"Generated insert performance chart")
+
+
+def create_performance_summary_table(results, output_dir):
+    """Create a performance summary table as an image"""
+
+    # Calculate summary statistics
+    summary_data = {"Metric": [], "Baseline": [], "Development": [], "Improvement": []}
+
+    # Insert performance
+    baseline_insert = []
+    dev_insert = []
+
+    for result in results:
+        if "insert_performance" in result:
+            vectors_per_sec = result["insert_performance"].get("vectors_per_second", 0)
+            if result.get("node_type") == "dev":
+                dev_insert.append(vectors_per_sec)
+            else:
+                baseline_insert.append(vectors_per_sec)
+
+    if baseline_insert and dev_insert:
+        baseline_avg = np.mean(baseline_insert)
+        dev_avg = np.mean(dev_insert)
+        improvement = ((dev_avg - baseline_avg) / baseline_avg) * 100
+
+        summary_data["Metric"].append("Insert Rate (vec/s)")
+        summary_data["Baseline"].append(f"{baseline_avg:.0f}")
+        summary_data["Development"].append(f"{dev_avg:.0f}")
+        summary_data["Improvement"].append(f"{improvement:+.1f}%")
+
+    # Query performance (best case)
+    baseline_best_qps = 0
+    dev_best_qps = 0
+
+    for result in results:
+        if "query_performance" in result:
+            qp = result["query_performance"]
+            for topk in qp.values():
+                for batch in topk.values():
+                    qps = batch.get("queries_per_second", 0)
+                    if result.get("node_type") == "dev":
+                        dev_best_qps = max(dev_best_qps, qps)
+                    else:
+                        baseline_best_qps = max(baseline_best_qps, qps)
+
+    if baseline_best_qps and dev_best_qps:
+        improvement = ((dev_best_qps - baseline_best_qps) / baseline_best_qps) * 100
+        summary_data["Metric"].append("Best Query QPS")
+        summary_data["Baseline"].append(f"{baseline_best_qps:.0f}")
+        summary_data["Development"].append(f"{dev_best_qps:.0f}")
+        summary_data["Improvement"].append(f"{improvement:+.1f}%")
+
+    # Create table plot
+    # Check if we have data to create a table
+    if not summary_data["Metric"]:
+        print("No comparison data available for performance summary table")
+        return
+
+    fig, ax = plt.subplots(figsize=(10, 3))
+    ax.axis("tight")
+    ax.axis("off")
+
+    table_data = []
+    for i in range(len(summary_data["Metric"])):
+        table_data.append(
+            [
+                summary_data["Metric"][i],
+                summary_data["Baseline"][i],
+                summary_data["Development"][i],
+                summary_data["Improvement"][i],
+            ]
+        )
+
+    table = ax.table(
+        cellText=table_data,
+        colLabels=["Metric", "Baseline", "Development", "Change"],
+        cellLoc="center",
+        loc="center",
+        colWidths=[0.3, 0.2, 0.2, 0.2],
+    )
+
+    table.auto_set_font_size(False)
+    table.set_fontsize(11)
+    table.scale(1.2, 1.5)
+
+    # Style the header
+    for i in range(4):
+        table[(0, i)].set_facecolor("#2E86AB")
+        table[(0, i)].set_text_props(weight="bold", color="white")
+
+    # Color improvement cells
+    for i in range(1, len(table_data) + 1):
+        if "+" in table_data[i - 1][3]:
+            table[(i, 3)].set_facecolor("#90EE90")
+        elif "-" in table_data[i - 1][3]:
+            table[(i, 3)].set_facecolor("#FFB6C1")
+
+    fs_type = results[0].get("filesystem", "unknown") if results else "unknown"
+    plt.title(
+        f"Performance Summary - Filesystem: {fs_type.upper()}",
+        fontsize=14,
+        fontweight="bold",
+        pad=20,
+    )
+
+    plt.savefig(
+        os.path.join(output_dir, "performance_summary.png"),
+        dpi=150,
+        bbox_inches="tight",
+    )
+    plt.close()
+
+    print(f"Generated performance summary table")
+
+
+def main():
+    if len(sys.argv) < 3:
+        print("Usage: generate_better_graphs.py <results_dir> <output_dir>")
+        sys.exit(1)
+
+    results_dir = sys.argv[1]
+    output_dir = sys.argv[2]
+
+    # Create output directory if it doesn't exist
+    os.makedirs(output_dir, exist_ok=True)
+
+    # Load results
+    results = load_results(results_dir)
+    print(f"Loaded {len(results)} result files")
+
+    if not results:
+        print("No results found!")
+        sys.exit(1)
+
+    # Generate graphs
+    print("Generating QPS comparison chart...")
+    create_qps_comparison_chart(results, output_dir)
+
+    print("Generating latency comparison chart...")
+    create_latency_comparison_chart(results, output_dir)
+
+    print("Generating insert performance chart...")
+    create_insert_performance_chart(results, output_dir)
+
+    print("Generating performance summary table...")
+    create_performance_summary_table(results, output_dir)
+
+    print(f"\nAnalysis complete! Graphs saved to {output_dir}")
+
+    # Print summary
+    fs_type = results[0].get("filesystem", "unknown")
+    print(f"Filesystem detected: {fs_type}")
+    print(f"Total tests analyzed: {len(results)}")
+
+    baseline_count = sum(1 for r in results if r.get("node_type") == "baseline")
+    dev_count = sum(1 for r in results if r.get("node_type") == "dev")
+    print(f"Baseline tests: {baseline_count}")
+    print(f"Development tests: {dev_count}")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/playbooks/roles/ai_collect_results/files/generate_graphs.py b/playbooks/roles/ai_collect_results/files/generate_graphs.py
new file mode 100755
index 00000000..53a835e2
--- /dev/null
+++ b/playbooks/roles/ai_collect_results/files/generate_graphs.py
@@ -0,0 +1,678 @@
+#!/usr/bin/env python3
+"""
+Generate graphs and analysis for AI benchmark results
+"""
+
+import json
+import os
+import sys
+import glob
+import numpy as np
+import matplotlib
+
+matplotlib.use("Agg")  # Use non-interactive backend
+import matplotlib.pyplot as plt
+from collections import defaultdict
+
+
+def load_results(results_dir):
+    """Load all JSON result files from the directory"""
+    results = []
+    json_files = glob.glob(os.path.join(results_dir, "*.json"))
+
+    for json_file in json_files:
+        try:
+            with open(json_file, "r") as f:
+                data = json.load(f)
+                # Extract filesystem info - prefer from JSON data over filename
+                filename = os.path.basename(json_file)
+
+                # First, try to get filesystem from the JSON data itself
+                fs_type = data.get("filesystem", None)
+
+                # If not in JSON, try to parse from filename (backwards compatibility)
+                if not fs_type:
+
+                    # Parse host info
+                    if "debian13-ai-" in filename:
+                        host_parts = (
+                            filename.replace("results_debian13-ai-", "")
+                            .replace("_1.json", "")
+                            .replace("_2.json", "")
+                            .replace("_3.json", "")
+                            .split("-")
+                        )
+                        if "xfs" in host_parts[0]:
+                            fs_type = "xfs"
+                            # Extract block size (e.g., "4k", "16k", etc.)
+                            block_size = host_parts[1] if len(host_parts) > 1 else "unknown"
+                        elif "ext4" in host_parts[0]:
+                            fs_type = "ext4"
+                            block_size = host_parts[1] if len(host_parts) > 1 else "4k"
+                        elif "btrfs" in host_parts[0]:
+                            fs_type = "btrfs"
+                            block_size = "default"
+                        else:
+                            fs_type = "unknown"
+                            block_size = "unknown"
+                    else:
+                        fs_type = "unknown"
+                        block_size = "unknown"
+                else:
+                    # If filesystem came from JSON, set appropriate block size
+                    if fs_type == "btrfs":
+                        block_size = "default"
+                    elif fs_type in ["ext4", "xfs"]:
+                        block_size = data.get("block_size", "4k")
+                    else:
+                        block_size = data.get("block_size", "default")
+
+                is_dev = "dev" in filename
+
+                # Use filesystem from JSON if available, otherwise use parsed value
+                if "filesystem" not in data:
+                    data["filesystem"] = fs_type
+                data["block_size"] = block_size
+                data["is_dev"] = is_dev
+                data["filename"] = filename
+
+                results.append(data)
+        except Exception as e:
+            print(f"Error loading {json_file}: {e}")
+
+    return results
+
+
+def create_filesystem_comparison_chart(results, output_dir):
+    """Create a bar chart comparing performance across filesystems"""
+    # Group by filesystem and baseline/dev
+    fs_data = defaultdict(lambda: {"baseline": [], "dev": []})
+
+    for result in results:
+        fs = result.get("filesystem", "unknown")
+        category = "dev" if result.get("is_dev", False) else "baseline"
+
+        # Extract actual performance data from results
+        if "insert_performance" in result:
+            insert_qps = result["insert_performance"].get("vectors_per_second", 0)
+        else:
+            insert_qps = 0
+        fs_data[fs][category].append(insert_qps)
+
+    # Prepare data for plotting
+    filesystems = list(fs_data.keys())
+    baseline_means = [
+        np.mean(fs_data[fs]["baseline"]) if fs_data[fs]["baseline"] else 0
+        for fs in filesystems
+    ]
+    dev_means = [
+        np.mean(fs_data[fs]["dev"]) if fs_data[fs]["dev"] else 0 for fs in filesystems
+    ]
+
+    x = np.arange(len(filesystems))
+    width = 0.35
+
+    fig, ax = plt.subplots(figsize=(10, 6))
+    baseline_bars = ax.bar(
+        x - width / 2, baseline_means, width, label="Baseline", color="#1f77b4"
+    )
+    dev_bars = ax.bar(
+        x + width / 2, dev_means, width, label="Development", color="#ff7f0e"
+    )
+
+    ax.set_xlabel("Filesystem")
+    ax.set_ylabel("Insert QPS")
+    ax.set_title("Vector Database Performance by Filesystem")
+    ax.set_xticks(x)
+    ax.set_xticklabels(filesystems)
+    ax.legend()
+    ax.grid(True, alpha=0.3)
+
+    # Add value labels on bars
+    for bars in [baseline_bars, dev_bars]:
+        for bar in bars:
+            height = bar.get_height()
+            if height > 0:
+                ax.annotate(
+                    f"{height:.0f}",
+                    xy=(bar.get_x() + bar.get_width() / 2, height),
+                    xytext=(0, 3),
+                    textcoords="offset points",
+                    ha="center",
+                    va="bottom",
+                )
+
+    plt.tight_layout()
+    plt.savefig(os.path.join(output_dir, "filesystem_comparison.png"), dpi=150)
+    plt.close()
+
+
+def create_block_size_analysis(results, output_dir):
+    """Create analysis for different block sizes (XFS specific)"""
+    # Filter XFS results
+    xfs_results = [r for r in results if r.get("filesystem") == "xfs"]
+
+    if not xfs_results:
+        return
+
+    # Group by block size
+    block_size_data = defaultdict(lambda: {"baseline": [], "dev": []})
+
+    for result in xfs_results:
+        block_size = result.get("block_size", "unknown")
+        category = "dev" if result.get("is_dev", False) else "baseline"
+        if "insert_performance" in result:
+            insert_qps = result["insert_performance"].get("vectors_per_second", 0)
+        else:
+            insert_qps = 0
+        block_size_data[block_size][category].append(insert_qps)
+
+    # Sort block sizes
+    block_sizes = sorted(
+        block_size_data.keys(),
+        key=lambda x: (
+            int(x.replace("k", "").replace("s", ""))
+            if x not in ["unknown", "default"]
+            else 0
+        ),
+    )
+
+    # Create grouped bar chart
+    baseline_means = [
+        (
+            np.mean(block_size_data[bs]["baseline"])
+            if block_size_data[bs]["baseline"]
+            else 0
+        )
+        for bs in block_sizes
+    ]
+    dev_means = [
+        np.mean(block_size_data[bs]["dev"]) if block_size_data[bs]["dev"] else 0
+        for bs in block_sizes
+    ]
+
+    x = np.arange(len(block_sizes))
+    width = 0.35
+
+    fig, ax = plt.subplots(figsize=(12, 6))
+    baseline_bars = ax.bar(
+        x - width / 2, baseline_means, width, label="Baseline", color="#2ca02c"
+    )
+    dev_bars = ax.bar(
+        x + width / 2, dev_means, width, label="Development", color="#d62728"
+    )
+
+    ax.set_xlabel("Block Size")
+    ax.set_ylabel("Insert QPS")
+    ax.set_title("XFS Performance by Block Size")
+    ax.set_xticks(x)
+    ax.set_xticklabels(block_sizes)
+    ax.legend()
+    ax.grid(True, alpha=0.3)
+
+    # Add value labels
+    for bars in [baseline_bars, dev_bars]:
+        for bar in bars:
+            height = bar.get_height()
+            if height > 0:
+                ax.annotate(
+                    f"{height:.0f}",
+                    xy=(bar.get_x() + bar.get_width() / 2, height),
+                    xytext=(0, 3),
+                    textcoords="offset points",
+                    ha="center",
+                    va="bottom",
+                )
+
+    plt.tight_layout()
+    plt.savefig(os.path.join(output_dir, "xfs_block_size_analysis.png"), dpi=150)
+    plt.close()
+
+
+def create_heatmap_analysis(results, output_dir):
+    """Create a heatmap showing performance across all configurations"""
+    # Group data by configuration and version
+    config_data = defaultdict(
+        lambda: {
+            "baseline": {"insert": 0, "query": 0},
+            "dev": {"insert": 0, "query": 0},
+        }
+    )
+
+    for result in results:
+        fs = result.get("filesystem", "unknown")
+        block_size = result.get("block_size", "default")
+        config = f"{fs}-{block_size}"
+        version = "dev" if result.get("is_dev", False) else "baseline"
+
+        # Get actual insert performance
+        if "insert_performance" in result:
+            insert_qps = result["insert_performance"].get("vectors_per_second", 0)
+        else:
+            insert_qps = 0
+
+        # Calculate average query QPS
+        query_qps = 0
+        if "query_performance" in result:
+            qp = result["query_performance"]
+            total_qps = 0
+            count = 0
+            for topk_key in ["topk_1", "topk_10", "topk_100"]:
+                if topk_key in qp:
+                    for batch_key in ["batch_1", "batch_10", "batch_100"]:
+                        if batch_key in qp[topk_key]:
+                            total_qps += qp[topk_key][batch_key].get(
+                                "queries_per_second", 0
+                            )
+                            count += 1
+            if count > 0:
+                query_qps = total_qps / count
+
+        config_data[config][version]["insert"] = insert_qps
+        config_data[config][version]["query"] = query_qps
+
+    # Sort configurations
+    configs = sorted(config_data.keys())
+
+    # Prepare data for heatmap
+    insert_baseline = [config_data[c]["baseline"]["insert"] for c in configs]
+    insert_dev = [config_data[c]["dev"]["insert"] for c in configs]
+    query_baseline = [config_data[c]["baseline"]["query"] for c in configs]
+    query_dev = [config_data[c]["dev"]["query"] for c in configs]
+
+    # Create figure with custom heatmap
+    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 8))
+
+    # Create data matrices
+    insert_data = np.array([insert_baseline, insert_dev]).T
+    query_data = np.array([query_baseline, query_dev]).T
+
+    # Insert QPS heatmap
+    im1 = ax1.imshow(insert_data, cmap="YlOrRd", aspect="auto")
+    ax1.set_xticks([0, 1])
+    ax1.set_xticklabels(["Baseline", "Development"])
+    ax1.set_yticks(range(len(configs)))
+    ax1.set_yticklabels(configs)
+    ax1.set_title("Insert Performance Heatmap")
+    ax1.set_ylabel("Configuration")
+
+    # Add text annotations
+    for i in range(len(configs)):
+        for j in range(2):
+            text = ax1.text(
+                j,
+                i,
+                f"{int(insert_data[i, j])}",
+                ha="center",
+                va="center",
+                color="black",
+            )
+
+    # Add colorbar
+    cbar1 = plt.colorbar(im1, ax=ax1)
+    cbar1.set_label("Insert QPS")
+
+    # Query QPS heatmap
+    im2 = ax2.imshow(query_data, cmap="YlGnBu", aspect="auto")
+    ax2.set_xticks([0, 1])
+    ax2.set_xticklabels(["Baseline", "Development"])
+    ax2.set_yticks(range(len(configs)))
+    ax2.set_yticklabels(configs)
+    ax2.set_title("Query Performance Heatmap")
+
+    # Add text annotations
+    for i in range(len(configs)):
+        for j in range(2):
+            text = ax2.text(
+                j,
+                i,
+                f"{int(query_data[i, j])}",
+                ha="center",
+                va="center",
+                color="black",
+            )
+
+    # Add colorbar
+    cbar2 = plt.colorbar(im2, ax=ax2)
+    cbar2.set_label("Query QPS")
+
+    plt.tight_layout()
+    plt.savefig(os.path.join(output_dir, "performance_heatmap.png"), dpi=150)
+    plt.close()
+
+
+def create_performance_trends(results, output_dir):
+    """Create line charts showing performance trends"""
+    # Group by filesystem type
+    fs_types = defaultdict(
+        lambda: {
+            "configs": [],
+            "baseline_insert": [],
+            "dev_insert": [],
+            "baseline_query": [],
+            "dev_query": [],
+        }
+    )
+
+    for result in results:
+        fs = result.get("filesystem", "unknown")
+        block_size = result.get("block_size", "default")
+        config = f"{block_size}"
+
+        if config not in fs_types[fs]["configs"]:
+            fs_types[fs]["configs"].append(config)
+            fs_types[fs]["baseline_insert"].append(0)
+            fs_types[fs]["dev_insert"].append(0)
+            fs_types[fs]["baseline_query"].append(0)
+            fs_types[fs]["dev_query"].append(0)
+
+        idx = fs_types[fs]["configs"].index(config)
+
+        # Calculate average query QPS from all test configurations
+        query_qps = 0
+        if "query_performance" in result:
+            qp = result["query_performance"]
+            total_qps = 0
+            count = 0
+            for topk_key in ["topk_1", "topk_10", "topk_100"]:
+                if topk_key in qp:
+                    for batch_key in ["batch_1", "batch_10", "batch_100"]:
+                        if batch_key in qp[topk_key]:
+                            total_qps += qp[topk_key][batch_key].get(
+                                "queries_per_second", 0
+                            )
+                            count += 1
+            if count > 0:
+                query_qps = total_qps / count
+
+        if result.get("is_dev", False):
+            if "insert_performance" in result:
+                fs_types[fs]["dev_insert"][idx] = result["insert_performance"].get(
+                    "vectors_per_second", 0
+                )
+            fs_types[fs]["dev_query"][idx] = query_qps
+        else:
+            if "insert_performance" in result:
+                fs_types[fs]["baseline_insert"][idx] = result["insert_performance"].get(
+                    "vectors_per_second", 0
+                )
+            fs_types[fs]["baseline_query"][idx] = query_qps
+
+    # Create separate plots for each filesystem
+    for fs, data in fs_types.items():
+        if not data["configs"]:
+            continue
+
+        fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 10))
+
+        x = range(len(data["configs"]))
+
+        # Insert performance
+        ax1.plot(
+            x,
+            data["baseline_insert"],
+            "o-",
+            label="Baseline",
+            linewidth=2,
+            markersize=8,
+        )
+        ax1.plot(
+            x, data["dev_insert"], "s-", label="Development", linewidth=2, markersize=8
+        )
+        ax1.set_xlabel("Configuration")
+        ax1.set_ylabel("Insert QPS")
+        ax1.set_title(f"{fs.upper()} Insert Performance")
+        ax1.set_xticks(x)
+        ax1.set_xticklabels(data["configs"])
+        ax1.legend()
+        ax1.grid(True, alpha=0.3)
+
+        # Query performance
+        ax2.plot(
+            x, data["baseline_query"], "o-", label="Baseline", linewidth=2, markersize=8
+        )
+        ax2.plot(
+            x, data["dev_query"], "s-", label="Development", linewidth=2, markersize=8
+        )
+        ax2.set_xlabel("Configuration")
+        ax2.set_ylabel("Query QPS")
+        ax2.set_title(f"{fs.upper()} Query Performance")
+        ax2.set_xticks(x)
+        ax2.set_xticklabels(data["configs"])
+        ax2.legend()
+        ax2.grid(True, alpha=0.3)
+
+        plt.tight_layout()
+        plt.savefig(os.path.join(output_dir, f"{fs}_performance_trends.png"), dpi=150)
+        plt.close()
+
+
+def create_simple_performance_trends(results, output_dir):
+    """Create a simple performance trends chart for basic Milvus testing"""
+    if not results:
+        return
+
+    # Separate baseline and dev results
+    baseline_results = [r for r in results if not r.get("is_dev", False)]
+    dev_results = [r for r in results if r.get("is_dev", False)]
+
+    if not baseline_results and not dev_results:
+        return
+
+    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 10))
+
+    # Prepare data
+    baseline_insert = []
+    baseline_query = []
+    dev_insert = []
+    dev_query = []
+    labels = []
+
+    # Process baseline results
+    for i, result in enumerate(baseline_results):
+        if "insert_performance" in result:
+            baseline_insert.append(result["insert_performance"].get("vectors_per_second", 0))
+        else:
+            baseline_insert.append(0)
+
+        # Calculate average query QPS
+        query_qps = 0
+        if "query_performance" in result:
+            qp = result["query_performance"]
+            total_qps = 0
+            count = 0
+            for topk_key in ["topk_1", "topk_10", "topk_100"]:
+                if topk_key in qp:
+                    for batch_key in ["batch_1", "batch_10", "batch_100"]:
+                        if batch_key in qp[topk_key]:
+                            total_qps += qp[topk_key][batch_key].get("queries_per_second", 0)
+                            count += 1
+            if count > 0:
+                query_qps = total_qps / count
+        baseline_query.append(query_qps)
+        labels.append(f"Run {i+1}")
+
+    # Process dev results
+    for result in dev_results:
+        if "insert_performance" in result:
+            dev_insert.append(result["insert_performance"].get("vectors_per_second", 0))
+        else:
+            dev_insert.append(0)
+
+        query_qps = 0
+        if "query_performance" in result:
+            qp = result["query_performance"]
+            total_qps = 0
+            count = 0
+            for topk_key in ["topk_1", "topk_10", "topk_100"]:
+                if topk_key in qp:
+                    for batch_key in ["batch_1", "batch_10", "batch_100"]:
+                        if batch_key in qp[topk_key]:
+                            total_qps += qp[topk_key][batch_key].get("queries_per_second", 0)
+                            count += 1
+            if count > 0:
+                query_qps = total_qps / count
+        dev_query.append(query_qps)
+
+    # Size the x axis to the longer of the two series so neither gets truncated
+    x = range(max(len(baseline_results), len(dev_results)))
+
+    # Insert performance
+    if baseline_insert:
+        ax1.plot(x, baseline_insert, "o-", label="Baseline", linewidth=2, markersize=8)
+    if dev_insert:
+        ax1.plot(x[:len(dev_insert)], dev_insert, "s-", label="Development", linewidth=2, markersize=8)
+    ax1.set_xlabel("Test Run")
+    ax1.set_ylabel("Insert QPS")
+    ax1.set_title("Milvus Insert Performance")
+    ax1.set_xticks(x)
+    ax1.set_xticklabels([f"Run {i+1}" for i in x])
+    ax1.legend()
+    ax1.grid(True, alpha=0.3)
+
+    # Query performance
+    if baseline_query:
+        ax2.plot(x, baseline_query, "o-", label="Baseline", linewidth=2, markersize=8)
+    if dev_query:
+        ax2.plot(x[:len(dev_query)], dev_query, "s-", label="Development", linewidth=2, markersize=8)
+    ax2.set_xlabel("Test Run")
+    ax2.set_ylabel("Query QPS")
+    ax2.set_title("Milvus Query Performance")
+    ax2.set_xticks(x)
+    ax2.set_xticklabels([f"Run {i+1}" for i in x])
+    ax2.legend()
+    ax2.grid(True, alpha=0.3)
+
+    plt.tight_layout()
+    plt.savefig(os.path.join(output_dir, "performance_trends.png"), dpi=150)
+    plt.close()
+
+
+def generate_summary_statistics(results, output_dir):
+    """Generate summary statistics and save to JSON"""
+    summary = {
+        "total_tests": len(results),
+        "filesystems_tested": list(
+            set(r.get("filesystem", "unknown") for r in results)
+        ),
+        "configurations": {},
+        "performance_summary": {
+            "best_insert_qps": {"value": 0, "config": ""},
+            "best_query_qps": {"value": 0, "config": ""},
+            "average_insert_qps": 0,
+            "average_query_qps": 0,
+        },
+    }
+
+    # Calculate statistics
+    all_insert_qps = []
+    all_query_qps = []
+
+    for result in results:
+        fs = result.get("filesystem", "unknown")
+        block_size = result.get("block_size", "default")
+        is_dev = "dev" if result.get("is_dev", False) else "baseline"
+        config_name = f"{fs}-{block_size}-{is_dev}"
+
+        # Get actual performance metrics
+        if "insert_performance" in result:
+            insert_qps = result["insert_performance"].get("vectors_per_second", 0)
+        else:
+            insert_qps = 0
+
+        # Calculate average query QPS
+        query_qps = 0
+        if "query_performance" in result:
+            qp = result["query_performance"]
+            total_qps = 0
+            count = 0
+            for topk_key in ["topk_1", "topk_10", "topk_100"]:
+                if topk_key in qp:
+                    for batch_key in ["batch_1", "batch_10", "batch_100"]:
+                        if batch_key in qp[topk_key]:
+                            total_qps += qp[topk_key][batch_key].get(
+                                "queries_per_second", 0
+                            )
+                            count += 1
+            if count > 0:
+                query_qps = total_qps / count
+
+        all_insert_qps.append(insert_qps)
+        all_query_qps.append(query_qps)
+
+        summary["configurations"][config_name] = {
+            "insert_qps": insert_qps,
+            "query_qps": query_qps,
+            "host": result.get("host", "unknown"),
+        }
+
+        if insert_qps > summary["performance_summary"]["best_insert_qps"]["value"]:
+            summary["performance_summary"]["best_insert_qps"] = {
+                "value": insert_qps,
+                "config": config_name,
+            }
+
+        if query_qps > summary["performance_summary"]["best_query_qps"]["value"]:
+            summary["performance_summary"]["best_query_qps"] = {
+                "value": query_qps,
+                "config": config_name,
+            }
+
+    summary["performance_summary"]["average_insert_qps"] = (
+        np.mean(all_insert_qps) if all_insert_qps else 0
+    )
+    summary["performance_summary"]["average_query_qps"] = (
+        np.mean(all_query_qps) if all_query_qps else 0
+    )
+
+    # Save summary
+    with open(os.path.join(output_dir, "summary.json"), "w") as f:
+        json.dump(summary, f, indent=2)
+
+    return summary
+
+
+def main():
+    if len(sys.argv) < 3:
+        print("Usage: generate_graphs.py <results_dir> <output_dir>")
+        sys.exit(1)
+
+    results_dir = sys.argv[1]
+    output_dir = sys.argv[2]
+
+    # Create output directory
+    os.makedirs(output_dir, exist_ok=True)
+
+    # Load results
+    results = load_results(results_dir)
+
+    if not results:
+        print("No results found to analyze")
+        sys.exit(1)
+
+    print(f"Loaded {len(results)} result files")
+
+    # Generate graphs
+    print("Generating performance heatmap...")
+    create_heatmap_analysis(results, output_dir)
+
+    print("Generating performance trends...")
+    create_simple_performance_trends(results, output_dir)
+
+    print("Generating summary statistics...")
+    summary = generate_summary_statistics(results, output_dir)
+
+    print(f"\nAnalysis complete! Graphs saved to {output_dir}")
+    print(f"Total configurations tested: {summary['total_tests']}")
+    print(
+        f"Best insert QPS: {summary['performance_summary']['best_insert_qps']['value']:.0f} ({summary['performance_summary']['best_insert_qps']['config']})"
+    )
+    print(
+        f"Best query QPS: {summary['performance_summary']['best_query_qps']['value']:.0f} ({summary['performance_summary']['best_query_qps']['config']})"
+    )
+
+
+if __name__ == "__main__":
+    main()
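For reference, the best/average bookkeeping in generate_summary_statistics above reduces to a max and a mean over per-config QPS values. A minimal standalone sketch with hypothetical data (illustrative only, not part of the patch):

```python
# Hypothetical per-config results: (config_name, insert_qps, query_qps)
runs = [
    ("xfs-4k", 12000.0, 850.0),
    ("ext4-4k", 9500.0, 910.0),
    ("btrfs-default", 11000.0, 700.0),
]

best_insert = {"value": 0, "config": None}
best_query = {"value": 0, "config": None}
for config, insert_qps, query_qps in runs:
    if insert_qps > best_insert["value"]:
        best_insert = {"value": insert_qps, "config": config}
    if query_qps > best_query["value"]:
        best_query = {"value": query_qps, "config": config}

# The patch guards np.mean with `if all_insert_qps`; a plain mean suffices here
average_insert = sum(r[1] for r in runs) / len(runs)
print(best_insert["config"], best_query["config"])  # xfs-4k ext4-4k
```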
diff --git a/playbooks/roles/ai_collect_results/files/generate_html_report.py b/playbooks/roles/ai_collect_results/files/generate_html_report.py
new file mode 100755
index 00000000..a205577c
--- /dev/null
+++ b/playbooks/roles/ai_collect_results/files/generate_html_report.py
@@ -0,0 +1,427 @@
+#!/usr/bin/env python3
+"""
+Generate HTML report for AI benchmark results
+"""
+
+import json
+import os
+import sys
+import glob
+from datetime import datetime
+from pathlib import Path
+
+HTML_TEMPLATE = """
+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>AI Benchmark Results - {timestamp}</title>
+    <style>
+        body {{
+            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif;
+            line-height: 1.6;
+            color: #333;
+            max-width: 1400px;
+            margin: 0 auto;
+            padding: 20px;
+            background-color: #f5f5f5;
+        }}
+        .header {{
+            background-color: #2c3e50;
+            color: white;
+            padding: 30px;
+            border-radius: 8px;
+            margin-bottom: 30px;
+            text-align: center;
+        }}
+        h1 {{
+            margin: 0;
+            font-size: 2.5em;
+        }}
+        .subtitle {{
+            margin-top: 10px;
+            opacity: 0.9;
+        }}
+        .summary-cards {{
+            display: grid;
+            grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
+            gap: 20px;
+            margin-bottom: 40px;
+        }}
+        .card {{
+            background: white;
+            padding: 20px;
+            border-radius: 8px;
+            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+            text-align: center;
+        }}
+        .card h3 {{
+            margin: 0 0 10px 0;
+            color: #2c3e50;
+        }}
+        .card .value {{
+            font-size: 2em;
+            font-weight: bold;
+            color: #3498db;
+        }}
+        .card .label {{
+            color: #7f8c8d;
+            font-size: 0.9em;
+        }}
+        .section {{
+            background: white;
+            padding: 30px;
+            border-radius: 8px;
+            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+            margin-bottom: 30px;
+        }}
+        .section h2 {{
+            color: #2c3e50;
+            border-bottom: 2px solid #3498db;
+            padding-bottom: 10px;
+            margin-bottom: 20px;
+        }}
+        .graph-container {{
+            text-align: center;
+            margin: 20px 0;
+        }}
+        .graph-container img {{
+            max-width: 100%;
+            height: auto;
+            border-radius: 4px;
+            box-shadow: 0 2px 8px rgba(0,0,0,0.1);
+        }}
+        .results-table {{
+            width: 100%;
+            border-collapse: collapse;
+            margin-top: 20px;
+        }}
+        .results-table th, .results-table td {{
+            padding: 12px;
+            text-align: left;
+            border-bottom: 1px solid #ddd;
+        }}
+        .results-table th {{
+            background-color: #f8f9fa;
+            font-weight: 600;
+            color: #2c3e50;
+        }}
+        .results-table tr:hover {{
+            background-color: #f8f9fa;
+        }}
+        .baseline {{
+            background-color: #e8f4fd;
+        }}
+        .dev {{
+            background-color: #fff3cd;
+        }}
+        .footer {{
+            text-align: center;
+            padding: 20px;
+            color: #7f8c8d;
+            font-size: 0.9em;
+        }}
+        .graph-grid {{
+            display: grid;
+            grid-template-columns: repeat(auto-fit, minmax(500px, 1fr));
+            gap: 20px;
+            margin: 20px 0;
+        }}
+        .best-config {{
+            background-color: #d4edda;
+            font-weight: bold;
+        }}
+        .navigation {{
+            position: sticky;
+            top: 20px;
+            background: white;
+            padding: 20px;
+            border-radius: 8px;
+            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+            margin-bottom: 30px;
+        }}
+        .navigation ul {{
+            list-style: none;
+            padding: 0;
+            margin: 0;
+        }}
+        .navigation li {{
+            display: inline-block;
+            margin-right: 20px;
+        }}
+        .navigation a {{
+            color: #3498db;
+            text-decoration: none;
+            font-weight: 500;
+        }}
+        .navigation a:hover {{
+            text-decoration: underline;
+        }}
+    </style>
+</head>
+<body>
+    <div class="header">
+        <h1>AI Vector Database Benchmark Results</h1>
+        <div class="subtitle">Generated on {timestamp}</div>
+    </div>
+
+    <nav class="navigation">
+        <ul>
+            <li><a href="#summary">Summary</a></li>
+            <li><a href="#performance-metrics">Performance Metrics</a></li>
+            <li><a href="#performance-trends">Performance Trends</a></li>
+            <li><a href="#detailed-results">Detailed Results</a></li>
+        </ul>
+    </nav>
+
+    <div id="summary" class="summary-cards">
+        <div class="card">
+            <h3>Total Tests</h3>
+            <div class="value">{total_tests}</div>
+            <div class="label">Configurations</div>
+        </div>
+        <div class="card">
+            <h3>Best Insert QPS</h3>
+            <div class="value">{best_insert_qps}</div>
+            <div class="label">{best_insert_config}</div>
+        </div>
+        <div class="card">
+            <h3>Best Query QPS</h3>
+            <div class="value">{best_query_qps}</div>
+            <div class="label">{best_query_config}</div>
+        </div>
+        <div class="card">
+            <h3>Test Runs</h3>
+            <div class="value">{total_tests}</div>
+            <div class="label">Benchmark Executions</div>
+        </div>
+    </div>
+
+    <div id="performance-metrics" class="section">
+        <h2>Performance Metrics</h2>
+        <p>Key performance indicators for Milvus vector database operations.</p>
+        <div class="graph-container">
+            <img src="graphs/performance_heatmap.png" alt="Performance Metrics">
+        </div>
+    </div>
+
+    <div id="performance-trends" class="section">
+        <h2>Performance Trends</h2>
+        <p>Performance comparison between baseline and development configurations.</p>
+        <div class="graph-container">
+            <img src="graphs/performance_trends.png" alt="Performance Trends">
+        </div>
+    </div>
+
+    <div id="detailed-results" class="section">
+        <h2>Detailed Results Table</h2>
+        <table class="results-table">
+            <thead>
+                <tr>
+                    <th>Host</th>
+                    <th>Type</th>
+                    <th>Insert QPS</th>
+                    <th>Query QPS</th>
+                    <th>Timestamp</th>
+                </tr>
+            </thead>
+            <tbody>
+                {table_rows}
+            </tbody>
+        </table>
+    </div>
+
+    <div class="footer">
+        <p>Generated by kdevops AI Benchmark Suite | <a href="https://github.com/linux-kdevops/kdevops">GitHub</a></p>
+    </div>
+</body>
+</html>
+"""
+
+
+def load_summary(graphs_dir):
+    """Load the summary.json file"""
+    summary_path = os.path.join(graphs_dir, "summary.json")
+    if os.path.exists(summary_path):
+        with open(summary_path, "r") as f:
+            return json.load(f)
+    return None
+
+
+def load_results(results_dir):
+    """Load all result files for detailed table"""
+    results = []
+    json_files = glob.glob(os.path.join(results_dir, "*.json"))
+
+    for json_file in json_files:
+        try:
+            with open(json_file, "r") as f:
+                data = json.load(f)
+                # Get filesystem from JSON data first, then fallback to filename parsing
+                filename = os.path.basename(json_file)
+
+                # Skip results without valid performance data
+                insert_perf = data.get("insert_performance", {})
+                query_perf = data.get("query_performance", {})
+                if not insert_perf or not query_perf:
+                    continue
+
+                # Get filesystem from JSON data
+                fs_type = data.get("filesystem", None)
+
+                # If not in JSON, try to parse from filename (backwards compatibility)
+                if not fs_type and "debian13-ai" in filename:
+                    host_parts = (
+                        filename.replace("results_debian13-ai-", "")
+                        .replace("_1.json", "")
+                        .replace("_2.json", "")
+                        .replace("_3.json", "")
+                        .split("-")
+                    )
+                    if "xfs" in host_parts[0]:
+                        fs_type = "xfs"
+                        block_size = host_parts[1] if len(host_parts) > 1 else "4k"
+                    elif "ext4" in host_parts[0]:
+                        fs_type = "ext4"
+                        block_size = host_parts[1] if len(host_parts) > 1 else "4k"
+                    elif "btrfs" in host_parts[0]:
+                        fs_type = "btrfs"
+                        block_size = "default"
+                    else:
+                        fs_type = "unknown"
+                        block_size = "unknown"
+                else:
+                    # Set appropriate block size based on filesystem
+                    if fs_type == "btrfs":
+                        block_size = "default"
+                    else:
+                        block_size = data.get("block_size", "default")
+
+                # Default to unknown if still not found
+                if not fs_type:
+                    fs_type = "unknown"
+                    block_size = "unknown"
+
+                is_dev = "dev" in filename
+
+                # Calculate average QPS from query performance data
+                query_qps = 0
+                query_count = 0
+                for topk_data in query_perf.values():
+                    for batch_data in topk_data.values():
+                        qps = batch_data.get("queries_per_second", 0)
+                        if qps > 0:
+                            query_qps += qps
+                            query_count += 1
+                if query_count > 0:
+                    query_qps = query_qps / query_count
+
+                results.append(
+                    {
+                        "host": filename.replace("results_", "").replace(".json", ""),
+                        "filesystem": fs_type,
+                        "block_size": block_size,
+                        "type": "Development" if is_dev else "Baseline",
+                        "insert_qps": insert_perf.get("vectors_per_second", 0),
+                        "query_qps": query_qps,
+                        "timestamp": data.get("timestamp", "N/A"),
+                        "is_dev": is_dev,
+                    }
+                )
+        except Exception as e:
+            print(f"Error loading {json_file}: {e}")
+
+    # Sort by filesystem, block size, then type
+    results.sort(key=lambda x: (x["filesystem"], x["block_size"], x["type"]))
+    return results
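The query_qps averaging above walks query_performance keyed by top-k and then batch size, skipping zero entries. A standalone sketch of that walk over hypothetical data (the nesting shape is inferred from the loop, not taken verbatim from recorded results):

```python
# Hypothetical shape: query_performance[top_k][batch_size]["queries_per_second"]
query_perf = {
    "10": {"1": {"queries_per_second": 100.0}, "8": {"queries_per_second": 60.0}},
    "100": {"1": {"queries_per_second": 20.0}, "8": {"queries_per_second": 0}},
}

total, count = 0.0, 0
for topk_data in query_perf.values():
    for batch_data in topk_data.values():
        qps = batch_data.get("queries_per_second", 0)
        if qps > 0:  # zero entries (failed or skipped runs) are excluded
            total += qps
            count += 1
avg_qps = total / count if count else 0
print(avg_qps)  # 60.0
```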
+
+
+def generate_table_rows(results, best_configs):
+    """Generate HTML table rows"""
+    rows = []
+    for result in results:
+        config_key = f"{result['filesystem']}-{result['block_size']}-{'dev' if result['is_dev'] else 'baseline'}"
+        row_class = "dev" if result["is_dev"] else "baseline"
+
+        # Check if this is a best configuration
+        if config_key in best_configs:
+            row_class += " best-config"
+
+        row = f"""
+        <tr class="{row_class}">
+            <td>{result['host']}</td>
+            <td>{result['type']}</td>
+            <td>{result['insert_qps']:,}</td>
+            <td>{result['query_qps']:,}</td>
+            <td>{result['timestamp']}</td>
+        </tr>
+        """
+        rows.append(row)
+
+    return "\n".join(rows)
+
+
+def find_performance_trend_graphs(graphs_dir):
+    """Find performance trend graph"""
+    # Not used in basic implementation since we embed the graph directly
+    return ""
+
+
+def generate_html_report(results_dir, graphs_dir, output_path):
+    """Generate the HTML report"""
+    # Load summary
+    summary = load_summary(graphs_dir)
+    if not summary:
+        print("Warning: No summary.json found")
+        summary = {
+            "total_tests": 0,
+            "filesystems_tested": [],
+            "performance_summary": {
+                "best_insert_qps": {"value": 0, "config": "N/A"},
+                "best_query_qps": {"value": 0, "config": "N/A"},
+            },
+        }
+
+    # Load detailed results
+    results = load_results(results_dir)
+
+    # Find best configurations
+    best_configs = set()
+    if summary["performance_summary"]["best_insert_qps"]["config"]:
+        best_configs.add(summary["performance_summary"]["best_insert_qps"]["config"])
+    if summary["performance_summary"]["best_query_qps"]["config"]:
+        best_configs.add(summary["performance_summary"]["best_query_qps"]["config"])
+
+    # Generate HTML
+    html_content = HTML_TEMPLATE.format(
+        timestamp=datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
+        total_tests=summary["total_tests"],
+        best_insert_qps=f"{summary['performance_summary']['best_insert_qps']['value']:,}",
+        best_insert_config=summary["performance_summary"]["best_insert_qps"]["config"],
+        best_query_qps=f"{summary['performance_summary']['best_query_qps']['value']:,}",
+        best_query_config=summary["performance_summary"]["best_query_qps"]["config"],
+        table_rows=generate_table_rows(results, best_configs),
+    )
+
+    # Write HTML file
+    with open(output_path, "w") as f:
+        f.write(html_content)
+
+    print(f"HTML report generated: {output_path}")
+
+
+def main():
+    if len(sys.argv) < 4:
+        print("Usage: generate_html_report.py <results_dir> <graphs_dir> <output_html>")
+        sys.exit(1)
+
+    results_dir = sys.argv[1]
+    graphs_dir = sys.argv[2]
+    output_html = sys.argv[3]
+
+    generate_html_report(results_dir, graphs_dir, output_html)
+
+
+if __name__ == "__main__":
+    main()
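One detail worth noting in HTML_TEMPLATE above: every literal CSS brace is doubled (`{{` and `}}`) so that str.format() substitutes only the named placeholders. A minimal demonstration of the same escaping technique:

```python
# Doubled braces survive str.format(); single braces mark substitution points
TEMPLATE = "body {{ color: #333; }} <h1>{title}</h1>"
print(TEMPLATE.format(title="AI Benchmark Results"))
# body { color: #333; } <h1>AI Benchmark Results</h1>
```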
diff --git a/playbooks/roles/ai_collect_results/tasks/main.yml b/playbooks/roles/ai_collect_results/tasks/main.yml
new file mode 100644
index 00000000..6a15d89c
--- /dev/null
+++ b/playbooks/roles/ai_collect_results/tasks/main.yml
@@ -0,0 +1,220 @@
+---
+- name: Import optional extra_args file
+  ansible.builtin.include_vars: "{{ item }}"
+  failed_when: false  # the extra_vars file is optional
+  with_items:
+    - "../extra_vars.yaml"
+  tags: vars
+
+- name: Set local directories
+  ansible.builtin.set_fact:
+    local_results_dir: "{{ topdir_path }}/workflows/ai/results"
+    local_scripts_dir: "{{ topdir_path }}/workflows/ai/scripts"
+  run_once: true
+  delegate_to: localhost
+
+- name: Create local directories if they don't exist
+  ansible.builtin.file:
+    path: "{{ item }}"
+    state: directory
+    mode: '0755'
+  loop:
+    - "{{ local_results_dir }}"
+    - "{{ local_scripts_dir }}"
+  run_once: true
+  delegate_to: localhost
+
+- name: Create analysis directory
+  ansible.builtin.file:
+    path: "{{ ai_benchmark_results_dir }}/analysis"
+    state: directory
+    mode: '0755'
+  become: true
+
+- name: Copy analysis scripts to scripts directory
+  ansible.builtin.copy:
+    src: "{{ item }}"
+    dest: "{{ local_scripts_dir }}/{{ item }}"
+    mode: '0755'
+    force: true
+  loop:
+    - analyze_results.py
+    - generate_graphs.py
+    - generate_html_report.py
+  run_once: true
+  delegate_to: localhost
+  become: true
+
+- name: Generate analysis configuration
+  ansible.builtin.template:
+    src: analysis_config.json.j2
+    dest: "{{ local_scripts_dir }}/analysis_config.json"
+    mode: '0644'
+  run_once: true
+  delegate_to: localhost
+  when: ai_benchmark_enable_graphing | bool
+
+- name: Check if benchmark results exist
+  ansible.builtin.stat:
+    path: "{{ ai_benchmark_results_dir }}"
+  register: results_dir_check
+
+- name: Find benchmark result files on remote host
+  ansible.builtin.find:
+    paths: "{{ ai_benchmark_results_dir }}"
+    patterns: "results_*.json"
+  register: remote_results
+  when: results_dir_check.stat.exists
+
+- name: Clean up entire local results directory before collection
+  ansible.builtin.file:
+    path: "{{ local_results_dir }}"
+    state: absent
+  run_once: true
+  delegate_to: localhost
+  become: true
+  when:
+    - results_dir_check.stat.exists
+    - remote_results.files is defined
+
+- name: Recreate local results directory with correct permissions
+  ansible.builtin.file:
+    path: "{{ local_results_dir }}"
+    state: directory
+    mode: '0755'
+  run_once: true
+  delegate_to: localhost
+  become: false
+  when:
+    - results_dir_check.stat.exists
+    - remote_results.files is defined
+
+- name: Collect result files from all hosts
+  ansible.builtin.fetch:
+    src: "{{ item.path }}"
+    dest: "{{ local_results_dir }}/{{ item.path | basename }}"
+    flat: true
+  loop: "{{ remote_results.files | default([]) }}"
+  when:
+    - results_dir_check.stat.exists
+    - remote_results.files is defined
+
+- name: Check if any results were collected
+  ansible.builtin.find:
+    paths: "{{ local_results_dir }}"
+    patterns: "*results_*.json"
+  register: collected_results
+  run_once: true
+  delegate_to: localhost
+
+- name: Display message if no results found
+  ansible.builtin.debug:
+    msg: |
+      No benchmark results found to analyze.
+      Please run 'make ai-tests' first to generate benchmark results.
+  when: collected_results.files is not defined or collected_results.files | length == 0
+  run_once: true
+  delegate_to: localhost
+
+- name: Ensure results directory has correct permissions
+  ansible.builtin.file:
+    path: "{{ local_results_dir }}"
+    owner: "{{ lookup('env', 'USER') }}"
+    group: "{{ lookup('env', 'USER') }}"
+    mode: '0755'
+    recurse: true
+  run_once: true
+  delegate_to: localhost
+  become: true
+  tags: ['results', 'analysis']
+
+- name: Run results analysis
+  ansible.builtin.command: >
+    python3 {{ local_scripts_dir }}/analyze_results.py
+    --results-dir {{ local_results_dir }}
+    --output-dir {{ local_results_dir }}
+    {% if ai_benchmark_enable_graphing | bool %}--config {{ local_scripts_dir }}/analysis_config.json{% endif %}
+  register: analysis_result
+  run_once: true
+  delegate_to: localhost
+  when: collected_results.files is defined and collected_results.files | length > 0
+  tags: ['results', 'analysis']
+
+- name: Create graphs directory
+  ansible.builtin.file:
+    path: "{{ local_results_dir }}/graphs"
+    state: directory
+    mode: '0755'
+  run_once: true
+  delegate_to: localhost
+  when:
+    - collected_results.files is defined
+    - collected_results.files | length > 0
+  tags: ['results', 'graphs']
+
+- name: Generate performance graphs
+  ansible.builtin.command: >
+    python3 {{ local_scripts_dir }}/generate_better_graphs.py
+    {{ local_results_dir }}
+    {{ local_results_dir }}/graphs
+  register: graph_generation_result
+  failed_when: false
+  run_once: true
+  delegate_to: localhost
+  when:
+    - collected_results.files is defined
+    - collected_results.files | length > 0
+    - ai_benchmark_enable_graphing|bool
+  tags: ['results', 'graphs']
+
+- name: Fallback to basic graphs if better graphs fail
+  ansible.builtin.command: >
+    python3 {{ local_scripts_dir }}/generate_graphs.py
+    {{ local_results_dir }}
+    {{ local_results_dir }}/graphs
+  run_once: true
+  delegate_to: localhost
+  when:
+    - collected_results.files is defined
+    - collected_results.files | length > 0
+    - ai_benchmark_enable_graphing|bool
+    - graph_generation_result is defined
+    - graph_generation_result.rc != 0
+  tags: ['results', 'graphs']
+
+- name: Generate HTML report
+  ansible.builtin.command: >
+    python3 {{ local_scripts_dir }}/generate_html_report.py
+    {{ local_results_dir }}
+    {{ local_results_dir }}/graphs
+    {{ local_results_dir }}/benchmark_report.html
+  register: html_generation_result
+  run_once: true
+  delegate_to: localhost
+  when:
+    - collected_results.files is defined
+    - collected_results.files | length > 0
+
+- name: Display analysis completion message
+  ansible.builtin.debug:
+    msg: |
+      Benchmark analysis completed!
+      Results available in: {{ local_results_dir }}/
+      Summary report: {{ local_results_dir }}/benchmark_summary.txt
+      HTML report: {{ local_results_dir }}/benchmark_report.html
+      {% if ai_benchmark_enable_graphing | bool %}
+      Graphs generated in: {{ local_results_dir }}/graphs/
+      {% endif %}
+
+      To view the HTML report:
+      - Open {{ local_results_dir }}/benchmark_report.html in a web browser
+  run_once: true
+  delegate_to: localhost
+  when:
+    - collected_results.files is defined
+    - collected_results.files | length > 0
+    - analysis_result is defined
+    - analysis_result.rc == 0
diff --git a/playbooks/roles/ai_collect_results/templates/analysis_config.json.j2 b/playbooks/roles/ai_collect_results/templates/analysis_config.json.j2
new file mode 100644
index 00000000..5a879649
--- /dev/null
+++ b/playbooks/roles/ai_collect_results/templates/analysis_config.json.j2
@@ -0,0 +1,6 @@
+{
+  "enable_graphing": {{ ai_benchmark_enable_graphing|default(true)|lower }},
+  "graph_format": "{{ ai_benchmark_graph_format|default('png') }}",
+  "graph_dpi": {{ ai_benchmark_graph_dpi|default(150) }},
+  "graph_theme": "{{ ai_benchmark_graph_theme|default('seaborn') }}"
+}
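With all defaults, the template above should render to the JSON below; note that the `|lower` filter turns Jinja's `True` into valid JSON `true`. A quick parse check against that assumed rendering:

```python
import json

# Assumed output of analysis_config.json.j2 with every variable at its default
rendered = """{
  "enable_graphing": true,
  "graph_format": "png",
  "graph_dpi": 150,
  "graph_theme": "seaborn"
}"""
config = json.loads(rendered)
print(config["graph_dpi"], config["enable_graphing"])  # 150 True
```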
diff --git a/playbooks/roles/ai_destroy/tasks/main.yml b/playbooks/roles/ai_destroy/tasks/main.yml
new file mode 100644
index 00000000..29406b37
--- /dev/null
+++ b/playbooks/roles/ai_destroy/tasks/main.yml
@@ -0,0 +1,63 @@
+---
+- name: Import optional extra_args file
+  ansible.builtin.include_vars: "{{ item }}"
+  failed_when: false  # the extra_vars file is optional
+  with_items:
+    - "../extra_vars.yaml"
+  tags: vars
+
+- name: Stop and remove all AI containers
+  community.docker.docker_container:
+    name: "{{ item }}"
+    state: absent
+  loop:
+    - "{{ ai_milvus_container_name }}"
+    - "{{ ai_minio_container_name }}"
+    - "{{ ai_etcd_container_name }}"
+  when: ai_milvus_docker | bool
+  failed_when: false  # containers may already be absent
+
+- name: Remove Docker network
+  community.docker.docker_network:
+    name: "{{ ai_docker_network_name }}"
+    state: absent
+  when: ai_milvus_docker | bool
+  failed_when: false  # network may already be absent
+
+- name: Remove Docker storage directories
+  ansible.builtin.file:
+    path: "{{ item }}"
+    state: absent
+  loop:
+    - "{{ ai_docker_data_path }}"
+    - "{{ ai_docker_etcd_data_path }}"
+    - "{{ ai_docker_minio_data_path }}"
+  when: ai_milvus_docker | bool
+  become: true
+
+- name: Remove benchmark results directory
+  ansible.builtin.file:
+    path: "{{ ai_benchmark_results_dir }}"
+    state: absent
+  become: true
+
+- name: Remove Docker images (optional)
+  community.docker.docker_image:
+    name: "{{ item }}"
+    state: absent
+  loop:
+    - "{{ ai_milvus_container_image_string }}"
+    - "{{ ai_etcd_container_image_string }}"
+    - "{{ ai_minio_container_image_string }}"
+  when: ai_milvus_docker | bool
+  failed_when: false  # images may be shared or already absent
+
+- name: Display destroy completion message
+  ansible.builtin.debug:
+    msg: |
+      AI benchmark environment completely destroyed.
+      All data, containers, and results have been removed.
diff --git a/playbooks/roles/ai_docker_storage/tasks/main.yml b/playbooks/roles/ai_docker_storage/tasks/main.yml
new file mode 100644
index 00000000..612df3cb
--- /dev/null
+++ b/playbooks/roles/ai_docker_storage/tasks/main.yml
@@ -0,0 +1,123 @@
+---
+- name: Import optional extra_args file
+  ansible.builtin.include_vars: "{{ item }}"
+  failed_when: false  # the extra_vars file is optional
+  with_items:
+    - "../extra_vars.yaml"
+  tags: vars
+
+- name: Docker storage setup
+  when: ai_docker_storage_enable | bool
+  block:
+    - name: Install filesystem utilities
+      ansible.builtin.package:
+        name:
+          - xfsprogs
+          - e2fsprogs
+          - btrfs-progs
+          - rsync
+        state: present
+      become: true
+
+    - name: Check if device exists
+      ansible.builtin.stat:
+        path: "{{ ai_docker_device }}"
+      register: docker_device_stat
+      failed_when: not docker_device_stat.stat.exists
+
+    - name: Check if Docker storage is already mounted
+      ansible.builtin.command: mountpoint -q {{ ai_docker_mount_point }}
+      register: docker_mount_check
+      changed_when: false
+      failed_when: false
+
+    - name: Setup Docker storage filesystem
+      when: docker_mount_check.rc != 0
+      block:
+        - name: Create Docker mount point directory
+          ansible.builtin.file:
+            path: "{{ ai_docker_mount_point }}"
+            state: directory
+            mode: '0755'
+          become: true
+
+        - name: Format device with XFS
+          ansible.builtin.command: >
+            mkfs.xfs -f
+            -b size={{ ai_docker_xfs_blocksize | default(4096) }}
+            -s size={{ ai_docker_xfs_sectorsize | default(4096) }}
+            {{ ai_docker_xfs_mkfs_opts | default('') }}
+            {{ ai_docker_device }}
+          when: ai_docker_fstype == "xfs"
+          become: true
+
+        - name: Format device with Btrfs
+          ansible.builtin.command: mkfs.btrfs {{ ai_docker_btrfs_mkfs_opts }} {{ ai_docker_device }}
+          when: ai_docker_fstype == "btrfs"
+          become: true
+
+        - name: Format device with ext4
+          ansible.builtin.command: mkfs.ext4 {{ ai_docker_ext4_mkfs_opts }} {{ ai_docker_device }}
+          when: ai_docker_fstype == "ext4"
+          become: true
+
+        - name: Mount Docker storage filesystem
+          ansible.posix.mount:
+            path: "{{ ai_docker_mount_point }}"
+            src: "{{ ai_docker_device }}"
+            fstype: "{{ ai_docker_fstype }}"
+            opts: defaults,noatime
+            state: mounted
+          become: true
+
+        - name: Add Docker storage mount to fstab
+          ansible.posix.mount:
+            path: "{{ ai_docker_mount_point }}"
+            src: "{{ ai_docker_device }}"
+            fstype: "{{ ai_docker_fstype }}"
+            opts: defaults,noatime
+            state: present
+          become: true
+
+    - name: Check if Docker service exists
+      ansible.builtin.systemd:
+        name: docker
+      register: docker_service_status
+      failed_when: false
+      changed_when: false
+
+    - name: Stop Docker service if running
+      ansible.builtin.systemd:
+        name: docker
+        state: stopped
+      become: true
+      when: docker_service_status.status is defined and docker_service_status.status.ActiveState == 'active'
+      ignore_errors: true
+
+    # Note: When ai_docker_storage_enable is true, we mount directly to /var/lib/docker
+    # No need to move data or create symlinks as the storage is already in the right place
+
+    - name: Ensure Docker directory has proper permissions
+      ansible.builtin.file:
+        path: "{{ ai_docker_mount_point }}"
+        state: directory
+        mode: '0711'
+        owner: root
+        group: root
+      become: true
+      when: ai_docker_mount_point == '/var/lib/docker'
+
+    # Docker will be installed and started later by the ai role
+    # We only prepare the storage here
+    - name: Display Docker storage setup complete
+      ansible.builtin.debug:
+        msg: "Docker storage has been prepared at: {{ ai_docker_mount_point }}"
diff --git a/playbooks/roles/ai_install/tasks/main.yml b/playbooks/roles/ai_install/tasks/main.yml
new file mode 100644
index 00000000..820e0f64
--- /dev/null
+++ b/playbooks/roles/ai_install/tasks/main.yml
@@ -0,0 +1,90 @@
+---
+- name: Include role create_data_partition
+  include_role:
+    name: create_data_partition
+  tags: ['setup', 'data_partition']
+
+- name: Include role common
+  include_role:
+    name: common
+  when:
+    - infer_uid_and_group|bool
+
+- name: Ensure data_dir has correct ownership
+  tags: ['setup']
+  become: true
+  ansible.builtin.file:
+    path: "{{ data_path }}"
+    owner: "{{ data_user }}"
+    group: "{{ data_group }}"
+    recurse: true
+    state: directory
+    mode: '0755'
+
+- name: Import optional extra_args file
+  ansible.builtin.include_vars: "{{ item }}"
+  failed_when: false  # the extra_vars file is optional
+  with_items:
+    - "../extra_vars.yaml"
+  tags: vars
+
+- name: Install Docker if using Docker deployment
+  ansible.builtin.apt:
+    name:
+      - docker.io
+      - docker-compose
+    state: present
+    update_cache: true
+  when: ai_milvus_docker | bool
+  become: true
+
+- name: Add user to docker group
+  ansible.builtin.user:
+    name: "{{ data_user | default(ansible_user_id) }}"
+    groups: docker
+    append: true
+  when: ai_milvus_docker | bool
+  become: true
+
+- name: Install Python dependencies for AI benchmarks
+  ansible.builtin.pip:
+    name:
+      - pymilvus>=2.3.0
+      - numpy
+      - scikit-learn
+      - pandas
+      - tqdm
+    state: present
+
+- name: Install additional Python dependencies for graphing
+  ansible.builtin.pip:
+    name:
+      - matplotlib
+      - seaborn
+      - plotly
+    state: present
+  when: ai_benchmark_enable_graphing | bool
+
+- name: Install filesystem utilities for XFS
+  ansible.builtin.apt:
+    name: xfsprogs
+    state: present
+  when: ai_filesystem == "xfs"
+  become: true
+
+- name: Install filesystem utilities for Btrfs
+  ansible.builtin.apt:
+    name: btrfs-progs
+    state: present
+  when: ai_filesystem == "btrfs"
+  become: true
+
+- name: Create benchmark results directory
+  ansible.builtin.file:
+    path: "{{ ai_benchmark_results_dir }}"
+    state: directory
+    mode: '0755'
+  become: true
diff --git a/playbooks/roles/ai_results/tasks/main.yml b/playbooks/roles/ai_results/tasks/main.yml
new file mode 100644
index 00000000..094a9025
--- /dev/null
+++ b/playbooks/roles/ai_results/tasks/main.yml
@@ -0,0 +1,22 @@
+---
+# AI Results collection role
+# This role collects and aggregates benchmark results from various AI components
+
+- name: Create central results directory
+  ansible.builtin.file:
+    path: "{{ ai_benchmark_results_dir }}"
+    state: directory
+    mode: '0755'
+
+- name: Find all benchmark result files
+  ansible.builtin.find:
+    paths: "{{ ai_benchmark_results_dir }}"
+    patterns: "*.json"
+    recurse: true
+  register: result_files
+
+- name: Display found result files
+  ansible.builtin.debug:
+    msg: "Found {{ result_files.files | length }} result files"
+
+# Future: Add result aggregation, analysis, and reporting tasks here
diff --git a/playbooks/roles/ai_run_benchmarks/files/milvus_benchmark.py b/playbooks/roles/ai_run_benchmarks/files/milvus_benchmark.py
new file mode 100644
index 00000000..4ce14fb7
--- /dev/null
+++ b/playbooks/roles/ai_run_benchmarks/files/milvus_benchmark.py
@@ -0,0 +1,506 @@
+#!/usr/bin/env python3
+"""
+Milvus Vector Database Benchmark Script
+
+This script benchmarks a Milvus vector database end to end: vector
+insertion, index creation, and query (search) performance.
+
+Usage:
+    milvus_benchmark.py --config <config.json> --output <results.json>
+"""
+
+import json
+import numpy as np
+import time
+import argparse
+import sys
+import subprocess
+import os
+from datetime import datetime
+from typing import List, Dict, Any, Tuple
+import logging
+
+try:
+    from pymilvus import (
+        connections,
+        Collection,
+        CollectionSchema,
+        FieldSchema,
+        DataType,
+        utility,
+    )
+    from pymilvus.client.types import LoadState
+except ImportError as e:
+    print(f"Error importing pymilvus: {e}")
+    print(f"Python executable: {sys.executable}")
+    print(f"Python path: {sys.path}")
+    print("Please ensure pymilvus is installed in the virtual environment")
+    sys.exit(1)
+
+
+class MilvusBenchmark:
+    def __init__(self, config: Dict[str, Any]):
+        self.config = config
+        self.collection = None
+        self.results = {
+            "config": config,
+            "timestamp": datetime.now().isoformat(),
+            "insert_performance": {},
+            "index_performance": {},
+            "query_performance": {},
+            "system_info": {},
+        }
+
+        # Setup logging
+        logging.basicConfig(
+            level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"
+        )
+        self.logger = logging.getLogger(__name__)
+
+    def get_filesystem_info(self, path: str = "/data") -> Dict[str, str]:
+        """Detect filesystem type for the given path"""
+        try:
+            # Use df -T to get filesystem type
+            result = subprocess.run(
+                ["df", "-T", path], capture_output=True, text=True, check=True
+            )
+
+            lines = result.stdout.strip().split("\n")
+            if len(lines) >= 2:
+                # Second line contains the filesystem info
+                # Format: Filesystem Type 1K-blocks Used Available Use% Mounted on
+                parts = lines[1].split()
+                if len(parts) >= 2:
+                    filesystem_type = parts[1]
+                    mount_point = parts[-1] if len(parts) >= 7 else path
+
+                    return {
+                        "filesystem": filesystem_type,
+                        "mount_point": mount_point,
+                        "data_path": path,
+                    }
+        except subprocess.CalledProcessError as e:
+            self.logger.warning(f"Failed to detect filesystem for {path}: {e}")
+        except Exception as e:
+            self.logger.warning(f"Error detecting filesystem for {path}: {e}")
+
+        # Fallback: try to detect from /proc/mounts
+        try:
+            with open("/proc/mounts", "r") as f:
+                mounts = f.readlines()
+
+            # Find the mount that contains our path
+            best_match = ""
+            best_fs = "unknown"
+
+            for line in mounts:
+                parts = line.strip().split()
+                if len(parts) >= 3:
+                    mount_point = parts[1]
+                    fs_type = parts[2]
+
+                    # Check if this mount point is a prefix of our path
+                    if path.startswith(mount_point) and len(mount_point) > len(
+                        best_match
+                    ):
+                        best_match = mount_point
+                        best_fs = fs_type
+
+            if best_fs != "unknown":
+                return {
+                    "filesystem": best_fs,
+                    "mount_point": best_match,
+                    "data_path": path,
+                }
+
+        except Exception as e:
+            self.logger.warning(f"Error reading /proc/mounts: {e}")
+
+        # Final fallback
+        return {"filesystem": "unknown", "mount_point": "/", "data_path": path}
+
+    def connect_to_milvus(self) -> bool:
+        """Connect to Milvus server"""
+        try:
+            connections.connect(
+                alias="default", host=self.config["host"], port=self.config["port"]
+            )
+            self.logger.info(
+                f"Connected to Milvus at {self.config['host']}:{self.config['port']}"
+            )
+            return True
+        except Exception as e:
+            self.logger.error(f"Failed to connect to Milvus: {e}")
+            return False
+
+    def create_collection(self) -> bool:
+        """Create benchmark collection"""
+        try:
+            collection_name = self.config.get(
+                "collection_name", self.config["database_name"]
+            )
+
+            # Drop collection if exists
+            if utility.has_collection(collection_name):
+                utility.drop_collection(collection_name)
+                self.logger.info(f"Dropped existing collection: {collection_name}")
+
+            # Define schema
+            fields = [
+                FieldSchema(
+                    name="id", dtype=DataType.INT64, is_primary=True, auto_id=False
+                ),
+                FieldSchema(
+                    name="vector",
+                    dtype=DataType.FLOAT_VECTOR,
+                    dim=self.config["vector_dimensions"],
+                ),
+            ]
+            schema = CollectionSchema(
+                fields,
+                f"Benchmark collection with {self.config['vector_dimensions']}D vectors",
+            )
+
+            # Create collection
+            self.collection = Collection(collection_name, schema)
+            self.logger.info(f"Created collection: {collection_name}")
+            return True
+        except Exception as e:
+            self.logger.error(f"Failed to create collection: {e}")
+            return False
+
+    def generate_vectors(self, count: int) -> Tuple[List[int], List[List[float]]]:
+        """Generate random vectors for benchmarking"""
+        ids = list(range(count))
+        vectors = (
+            np.random.random((count, self.config["vector_dimensions"]))
+            .astype(np.float32)
+            .tolist()
+        )
+        return ids, vectors
+
+    def benchmark_insert(self) -> bool:
+        """Benchmark vector insertion performance"""
+        try:
+            self.logger.info("Starting insert benchmark...")
+
+            batch_size = 1000
+            total_vectors = self.config["vector_dataset_size"]
+
+            insert_times = []
+
+            for i in range(0, total_vectors, batch_size):
+                current_batch_size = min(batch_size, total_vectors - i)
+
+                # Generate batch data
+                ids, vectors = self.generate_vectors(current_batch_size)
+                ids = [vec_id + i for vec_id in ids]  # Offset IDs so they stay unique across batches
+
+                # Insert batch
+                start_time = time.time()
+                self.collection.insert([ids, vectors])
+                insert_time = time.time() - start_time
+                insert_times.append(insert_time)
+
+                if (i // batch_size) % 100 == 0:
+                    self.logger.info(
+                        f"Inserted {i + current_batch_size}/{total_vectors} vectors"
+                    )
+
+            # Flush to ensure data is persisted
+            self.logger.info("Flushing collection...")
+            flush_start = time.time()
+            self.collection.flush()
+            flush_time = time.time() - flush_start
+
+            # Calculate statistics
+            total_insert_time = sum(insert_times)
+            avg_insert_time = total_insert_time / len(insert_times)
+            vectors_per_second = total_vectors / total_insert_time
+
+            self.results["insert_performance"] = {
+                "total_vectors": total_vectors,
+                "total_time_seconds": total_insert_time,
+                "flush_time_seconds": flush_time,
+                "average_batch_time_seconds": avg_insert_time,
+                "vectors_per_second": vectors_per_second,
+                "batch_size": batch_size,
+            }
+
+            self.logger.info(
+                f"Insert benchmark completed: {vectors_per_second:.2f} vectors/sec"
+            )
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Insert benchmark failed: {e}")
+            return False
+
+    def benchmark_index_creation(self) -> bool:
+        """Benchmark index creation performance"""
+        try:
+            self.logger.info("Starting index creation benchmark...")
+
+            index_params = {
+                "metric_type": "L2",
+                "index_type": self.config["index_type"],
+                "params": {},
+            }
+
+            if self.config["index_type"] == "HNSW":
+                index_params["params"] = {
+                    "M": self.config.get("index_hnsw_m", 16),
+                    "efConstruction": self.config.get(
+                        "index_hnsw_ef_construction", 200
+                    ),
+                }
+            elif self.config["index_type"] == "IVF_FLAT":
+                index_params["params"] = {
+                    "nlist": self.config.get("index_ivf_nlist", 1024)
+                }
+
+            start_time = time.time()
+            self.collection.create_index("vector", index_params)
+            index_time = time.time() - start_time
+
+            self.results["index_performance"] = {
+                "index_type": self.config["index_type"],
+                "index_params": index_params,
+                "creation_time_seconds": index_time,
+            }
+
+            self.logger.info(f"Index creation completed in {index_time:.2f} seconds")
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Index creation failed: {e}")
+            return False
+
+    def benchmark_queries(self) -> bool:
+        """Benchmark query performance"""
+        try:
+            self.logger.info("Starting query benchmark...")
+
+            # Load collection with timeout and retry logic
+            self.logger.info("Loading collection into memory...")
+            max_retries = 3
+            retry_count = 0
+            load_success = False
+
+            while retry_count < max_retries and not load_success:
+                try:
+                    # First, ensure the collection is released if previously loaded
+                    if utility.load_state(self.collection.name) != LoadState.NotLoad:
+                        self.logger.info("Releasing existing collection load...")
+                        self.collection.release()
+                        time.sleep(5)  # Wait for release to complete
+
+                    # Now load the collection with explicit timeout
+                    # For large collections, we may need to adjust replica number
+                    self.logger.info(
+                        f"Loading collection (attempt {retry_count + 1}/{max_retries})..."
+                    )
+                    # Check collection size first
+                    collection_stats = self.collection.num_entities
+                    self.logger.info(f"Collection has {collection_stats} entities")
+
+                    # For very large collections, load with specific parameters
+                    if collection_stats > 500000:
+                        self.logger.info(
+                            "Large collection detected, using optimized loading parameters"
+                        )
+                        self.collection.load(
+                            replica_number=1, timeout=1200
+                        )  # 20 minute timeout for large collections
+                        max_wait_time = (
+                            1800  # 30 minutes max wait for large collections
+                        )
+                    else:
+                        self.collection.load(timeout=300)  # 5 minute timeout
+                        max_wait_time = 600  # 10 minutes max wait
+
+                    # Wait for the collection to be fully loaded
+                    start_wait = time.time()
+
+                    while time.time() - start_wait < max_wait_time:
+                        load_state = utility.load_state(self.collection.name)
+                        if load_state == LoadState.Loaded:
+                            self.logger.info(
+                                "Collection successfully loaded into memory"
+                            )
+                            load_success = True
+                            break
+                        elif load_state == LoadState.Loading:
+                            try:
+                                progress = utility.loading_progress(
+                                    self.collection.name
+                                )
+                                self.logger.info(f"Loading progress: {progress}%")
+                            except Exception as e:
+                                self.logger.warning(
+                                    f"Could not get loading progress: {e}"
+                                )
+                            time.sleep(10)  # Check every 10 seconds
+                        else:
+                            self.logger.warning(f"Unexpected load state: {load_state}")
+                            break
+
+                    if not load_success:
+                        self.logger.warning(
+                            f"Collection loading timed out after {max_wait_time} seconds"
+                        )
+                        retry_count += 1
+                        if retry_count < max_retries:
+                            self.logger.info("Retrying collection load...")
+                            time.sleep(30)  # Wait before retry
+
+                except Exception as e:
+                    self.logger.error(f"Error loading collection: {e}")
+                    retry_count += 1
+                    if retry_count < max_retries:
+                        self.logger.info("Retrying after error...")
+                        time.sleep(30)
+                    else:
+                        raise
+
+            if not load_success:
+                self.logger.error("Failed to load collection after all retries")
+                return False
+
+            # Generate query vectors
+            query_count = 1000
+            _, query_vectors = self.generate_vectors(query_count)
+
+            query_results = {}
+
+            # Test different top-k values
+            topk_values = []
+            if self.config.get("benchmark_query_topk_1", False):
+                topk_values.append(1)
+            if self.config.get("benchmark_query_topk_10", False):
+                topk_values.append(10)
+            if self.config.get("benchmark_query_topk_100", False):
+                topk_values.append(100)
+
+            # Test different batch sizes
+            batch_sizes = []
+            if self.config.get("benchmark_batch_1", False):
+                batch_sizes.append(1)
+            if self.config.get("benchmark_batch_10", False):
+                batch_sizes.append(10)
+            if self.config.get("benchmark_batch_100", False):
+                batch_sizes.append(100)
+
+            for topk in topk_values:
+                query_results[f"topk_{topk}"] = {}
+
+                search_params = {"metric_type": "L2", "params": {}}
+                if self.config["index_type"] == "HNSW":
+                    # For HNSW, ef must be at least as large as topk
+                    default_ef = self.config.get("index_hnsw_ef", 64)
+                    search_params["params"]["ef"] = max(default_ef, topk)
+                elif self.config["index_type"] == "IVF_FLAT":
+                    search_params["params"]["nprobe"] = self.config.get(
+                        "index_ivf_nprobe", 16
+                    )
+
+                for batch_size in batch_sizes:
+                    self.logger.info(f"Testing topk={topk}, batch_size={batch_size}")
+
+                    times = []
+                    for i in range(
+                        0, min(query_count, 100), batch_size
+                    ):  # Limit to 100 queries for speed
+                        batch_vectors = query_vectors[i : i + batch_size]
+
+                        start_time = time.time()
+                        results = self.collection.search(
+                            batch_vectors,
+                            "vector",
+                            search_params,
+                            limit=topk,
+                            output_fields=["id"],
+                        )
+                        query_time = time.time() - start_time
+                        times.append(query_time)
+
+                    avg_time = sum(times) / len(times)
+                    qps = batch_size / avg_time
+
+                    query_results[f"topk_{topk}"][f"batch_{batch_size}"] = {
+                        "average_time_seconds": avg_time,
+                        "queries_per_second": qps,
+                        "total_queries": len(times) * batch_size,
+                    }
+
+            self.results["query_performance"] = query_results
+            self.logger.info("Query benchmark completed")
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Query benchmark failed: {e}")
+            return False
+
+    def run_benchmark(self) -> bool:
+        """Run complete benchmark suite"""
+        self.logger.info("Starting Milvus benchmark suite...")
+
+        # Detect filesystem information
+        fs_info = self.get_filesystem_info("/data")
+        self.results["system_info"] = fs_info
+        # Also add filesystem at top level for compatibility with existing graphs
+        self.results["filesystem"] = fs_info["filesystem"]
+        self.logger.info(
+            f"Detected filesystem: {fs_info['filesystem']} at {fs_info['mount_point']}"
+        )
+
+        if not self.connect_to_milvus():
+            return False
+
+        if not self.create_collection():
+            return False
+
+        if not self.benchmark_insert():
+            return False
+
+        if not self.benchmark_index_creation():
+            return False
+
+        if not self.benchmark_queries():
+            return False
+
+        self.logger.info("Benchmark suite completed successfully")
+        return True
+
+    def save_results(self, output_file: str):
+        """Save benchmark results to file"""
+        try:
+            with open(output_file, "w") as f:
+                json.dump(self.results, f, indent=2)
+            self.logger.info(f"Results saved to {output_file}")
+        except Exception as e:
+            self.logger.error(f"Failed to save results: {e}")
+
+
+def main():
+    parser = argparse.ArgumentParser(description="Milvus Vector Database Benchmark")
+    parser.add_argument("--config", required=True, help="JSON configuration file")
+    parser.add_argument("--output", required=True, help="Output results file")
+
+    args = parser.parse_args()
+
+    # Load configuration
+    try:
+        with open(args.config, "r") as f:
+            config = json.load(f)
+    except Exception as e:
+        print(f"Error loading config file: {e}")
+        return 1
+
+    # Run benchmark
+    benchmark = MilvusBenchmark(config)
+    success = benchmark.run_benchmark()
+
+    # Save results
+    benchmark.save_results(args.output)
+
+    return 0 if success else 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/playbooks/roles/ai_run_benchmarks/tasks/main.yml b/playbooks/roles/ai_run_benchmarks/tasks/main.yml
new file mode 100644
index 00000000..81fd5a87
--- /dev/null
+++ b/playbooks/roles/ai_run_benchmarks/tasks/main.yml
@@ -0,0 +1,181 @@
+---
+- name: Import optional extra_args file
+  ansible.builtin.include_vars: "{{ item }}"
+  failed_when: false  # extra_vars.yaml is optional
+  with_items:
+    - "../extra_vars.yaml"
+  tags: vars
+
+- name: Clean up any stale lock files from previous runs (force mode)
+  ansible.builtin.file:
+    path: "{{ ai_benchmark_results_dir }}/.benchmark.lock"
+    state: absent
+  failed_when: false
+  when: ai_benchmark_force_unlock | default(false) | bool
+  tags: cleanup
+
+- name: Check for ongoing benchmark processes
+  ansible.builtin.shell: |
+    pgrep -fc "python.*milvus_benchmark\.py" || true
+  register: benchmark_check
+  changed_when: false
+  failed_when: false
+
+- name: Fail if benchmark is already running
+  ansible.builtin.fail:
+    msg: |
+      ERROR: A benchmark is already running on this system!
+      Number of benchmark processes: {{ benchmark_check.stdout }}
+      Please wait for the current benchmark to complete or terminate it before starting a new one.
+  when: benchmark_check.stdout | int > 0
+
+- name: Ensure benchmark results directory exists
+  ansible.builtin.file:
+    path: "{{ ai_benchmark_results_dir }}"
+    state: directory
+    mode: '0755'
+
+- name: Check for benchmark lock file
+  ansible.builtin.stat:
+    path: "{{ ai_benchmark_results_dir }}/.benchmark.lock"
+  register: lock_file
+
+- name: Check if lock file is stale (older than 5 minutes)
+  ansible.builtin.set_fact:
+    lock_is_stale: "{{ (ansible_date_time.epoch | int - lock_file.stat.mtime | default(0) | int) > 300 }}"
+  when: lock_file.stat.exists
+
+- name: Remove stale lock file
+  ansible.builtin.file:
+    path: "{{ ai_benchmark_results_dir }}/.benchmark.lock"
+    state: absent
+  when:
+    - lock_file.stat.exists
+    - lock_is_stale|default(false)|bool
+
+- name: Fail if recent benchmark lock exists
+  ansible.builtin.fail:
+    msg: |
+      ERROR: Benchmark lock file exists at {{ ai_benchmark_results_dir }}/.benchmark.lock
+      This indicates a benchmark may be in progress or was terminated abnormally.
+      Lock file age: {{ (ansible_date_time.epoch | int - lock_file.stat.mtime | default(0) | int) }} seconds
+      If you're sure no benchmark is running, remove the lock file manually.
+  when:
+    - lock_file.stat.exists
+    - not lock_is_stale|default(false)|bool
+
+- name: Run benchmark with lock management
+  block:
+    - name: Create benchmark lock file
+      ansible.builtin.file:
+        path: "{{ ai_benchmark_results_dir }}/.benchmark.lock"
+        state: touch
+        mode: '0644'
+      register: lock_created
+
+    - name: Create benchmark working directory
+      ansible.builtin.file:
+        path: "{{ ai_benchmark_results_dir }}/workdir"
+        state: directory
+        mode: '0755'
+
+    - name: Copy benchmark script
+      ansible.builtin.copy:
+        src: milvus_benchmark.py
+        dest: "{{ ai_benchmark_results_dir }}/workdir/milvus_benchmark.py"
+        mode: '0755'
+
+    - name: Ensure Python venv package is installed
+      ansible.builtin.package:
+        name:
+          - python3-venv
+          - python3-pip
+          - python3-dev
+        state: present
+      become: true
+
+    - name: Clean up any globally installed packages (if accidentally installed)
+      ansible.builtin.shell: |
+        pip3 uninstall -y pymilvus numpy 2>/dev/null || true
+      become: true
+      changed_when: false
+      failed_when: false
+
+    - name: Check if virtual environment exists
+      ansible.builtin.stat:
+        path: "{{ ai_benchmark_results_dir }}/venv/bin/python"
+      register: venv_exists
+
+    - name: Verify virtual environment has required packages
+      block:
+        - name: Check if pymilvus is installed in virtual environment
+          ansible.builtin.command: "{{ ai_benchmark_results_dir }}/venv/bin/python -c 'import pymilvus; print(pymilvus.__version__)'"
+          register: pymilvus_check
+          changed_when: false
+          failed_when: false
+
+        - name: Display current pymilvus version
+          ansible.builtin.debug:
+            msg: "Current pymilvus version: {{ pymilvus_check.stdout }}"
+          when: pymilvus_check.rc == 0
+
+        - name: Virtual environment is not properly configured
+          ansible.builtin.debug:
+            msg: "Virtual environment at {{ ai_benchmark_results_dir }}/venv is missing or incomplete. Please run 'make ai' first to set up the environment."
+          when: not venv_exists.stat.exists or pymilvus_check.rc != 0
+
+        - name: Fail if virtual environment is not ready
+          ansible.builtin.fail:
+            msg: "Virtual environment is not properly configured. Please run 'make ai' to set up the environment first."
+          when: not venv_exists.stat.exists or pymilvus_check.rc != 0
+
+    - name: List installed packages in virtual environment for verification
+      ansible.builtin.command: "{{ ai_benchmark_results_dir }}/venv/bin/pip list"
+      register: pip_list
+      changed_when: false
+
+    - name: Display installed packages
+      ansible.builtin.debug:
+        msg: "Installed packages in venv: {{ pip_list.stdout }}"
+
+    - name: Generate benchmark configuration
+      ansible.builtin.template:
+        src: benchmark_config.json.j2
+        dest: "{{ ai_benchmark_results_dir }}/workdir/benchmark_config.json"
+        mode: '0644'
+
+    - name: Wait for Milvus to be ready
+      ansible.builtin.wait_for:
+        host: "localhost"
+        port: "{{ ai_vector_db_milvus_port }}"
+        delay: 10
+        timeout: 300
+
+    - name: Run Milvus benchmark iterations
+      ansible.builtin.command: >
+        {{ ai_benchmark_results_dir }}/venv/bin/python
+        {{ ai_benchmark_results_dir }}/workdir/milvus_benchmark.py
+        --config {{ ai_benchmark_results_dir }}/workdir/benchmark_config.json
+        --output {{ ai_benchmark_results_dir }}/results_{{ ansible_hostname }}_{{ item }}.json
+      register: benchmark_result
+      with_sequence: start=1 end={{ ai_benchmark_iterations }}
+      tags: run_benchmark
+
+    - name: Display benchmark results
+      ansible.builtin.debug:
+        var: benchmark_result
+      when: benchmark_result is defined
+
+  always:
+    - name: Remove benchmark lock file
+      ansible.builtin.file:
+        path: "{{ ai_benchmark_results_dir }}/.benchmark.lock"
+        state: absent
+      failed_when: false
+      when: lock_created is defined and lock_created.changed
+
+    - name: Ensure lock file is removed (fallback)
+      ansible.builtin.shell: rm -f {{ ai_benchmark_results_dir }}/.benchmark.lock
+      failed_when: false
+      when: lock_created is defined
diff --git a/playbooks/roles/ai_run_benchmarks/templates/benchmark_config.json.j2 b/playbooks/roles/ai_run_benchmarks/templates/benchmark_config.json.j2
new file mode 100644
index 00000000..9983fc16
--- /dev/null
+++ b/playbooks/roles/ai_run_benchmarks/templates/benchmark_config.json.j2
@@ -0,0 +1,24 @@
+{
+  "host": "localhost",
+  "port": {{ ai_vector_db_milvus_port }},
+  "database_name": "default",
+  "collection_name": "{{ ai_vector_db_milvus_collection_name }}",
+  "vector_dataset_size": {{ ai_vector_db_milvus_dataset_size }},
+  "vector_dimensions": {{ ai_vector_db_milvus_dimension }},
+  "benchmark_runtime": {{ ai_benchmark_runtime|default(60) }},
+  "benchmark_warmup_time": {{ ai_benchmark_warmup_time|default(10) }},
+  "benchmark_query_topk_1": {{ ai_benchmark_query_topk_1|default(true)|lower }},
+  "benchmark_query_topk_10": {{ ai_benchmark_query_topk_10|default(true)|lower }},
+  "benchmark_query_topk_100": {{ ai_benchmark_query_topk_100|default(true)|lower }},
+  "benchmark_batch_1": {{ ai_benchmark_batch_1|default(true)|lower }},
+  "benchmark_batch_10": {{ ai_benchmark_batch_10|default(true)|lower }},
+  "benchmark_batch_100": {{ ai_benchmark_batch_100|default(true)|lower }},
+  "batch_size": {{ ai_vector_db_milvus_batch_size }},
+  "num_queries": {{ ai_vector_db_milvus_num_queries }},
+  "index_type": "{{ ai_index_type|default('HNSW') }}",
+  "index_hnsw_m": {{ ai_index_hnsw_m|default(16) }},
+  "index_hnsw_ef_construction": {{ ai_index_hnsw_ef_construction|default(200) }},
+  "index_hnsw_ef": {{ ai_index_hnsw_ef|default(64) }}{% if ai_index_type|default('HNSW') == "IVF_FLAT" %},
+  "index_ivf_nlist": {{ ai_index_ivf_nlist|default(1024) }},
+  "index_ivf_nprobe": {{ ai_index_ivf_nprobe|default(16) }}{% endif %}
+}
diff --git a/playbooks/roles/ai_setup/tasks/main.yml b/playbooks/roles/ai_setup/tasks/main.yml
new file mode 100644
index 00000000..b894c964
--- /dev/null
+++ b/playbooks/roles/ai_setup/tasks/main.yml
@@ -0,0 +1,115 @@
+---
+- name: Import optional extra_args file
+  ansible.builtin.include_vars: "{{ item }}"
+  failed_when: false  # extra_vars.yaml is optional
+  with_items:
+    - "../extra_vars.yaml"
+  tags: vars
+
+- name: Create Docker storage directories
+  ansible.builtin.file:
+    path: "{{ item }}"
+    state: directory
+    mode: '0755'
+  loop:
+    - "{{ ai_docker_data_path }}"
+    - "{{ ai_docker_etcd_data_path }}"
+    - "{{ ai_docker_minio_data_path }}"
+  when: ai_milvus_docker | bool
+  become: true
+
+- name: Create Docker network for Milvus
+  community.docker.docker_network:
+    name: "{{ ai_docker_network_name }}"
+    state: present
+  when: ai_milvus_docker | bool
+
+- name: Start etcd container
+  community.docker.docker_container:
+    name: "{{ ai_etcd_container_name }}"
+    image: "{{ ai_etcd_container_image_string }}"
+    state: started
+    restart_policy: unless-stopped
+    networks:
+      - name: "{{ ai_docker_network_name }}"
+    ports:
+      - "{{ ai_etcd_client_port }}:2379"
+      - "{{ ai_etcd_peer_port }}:2380"
+    env:
+      ETCD_AUTO_COMPACTION_MODE: revision
+      ETCD_AUTO_COMPACTION_RETENTION: "1000"
+      ETCD_QUOTA_BACKEND_BYTES: "4294967296"
+      ETCD_SNAPSHOT_COUNT: "50000"
+    command: >
+      etcd -advertise-client-urls=http://127.0.0.1:2379
+      -listen-client-urls http://0.0.0.0:2379
+      --data-dir /etcd
+    volumes:
+      - "{{ ai_docker_etcd_data_path }}:/etcd"
+    memory: "{{ ai_etcd_memory_limit }}"
+  when: ai_milvus_docker | bool
+
+- name: Start MinIO container
+  community.docker.docker_container:
+    name: "{{ ai_minio_container_name }}"
+    image: "{{ ai_minio_container_image_string }}"
+    state: started
+    restart_policy: unless-stopped
+    networks:
+      - name: "{{ ai_docker_network_name }}"
+    ports:
+      - "{{ ai_minio_api_port }}:9000"
+      - "{{ ai_minio_console_port }}:9001"
+    env:
+      MINIO_ACCESS_KEY: "{{ ai_minio_access_key }}"
+      MINIO_SECRET_KEY: "{{ ai_minio_secret_key }}"
+    command: server /minio_data --console-address ":9001"
+    volumes:
+      - "{{ ai_docker_minio_data_path }}:/minio_data"
+    memory: "{{ ai_minio_memory_limit }}"
+  when: ai_milvus_docker | bool
+
+- name: Wait for etcd to be ready
+  ansible.builtin.wait_for:
+    host: localhost
+    port: "{{ ai_etcd_client_port }}"
+    timeout: 60
+  when: ai_milvus_docker | bool
+
+- name: Wait for MinIO to be ready
+  ansible.builtin.wait_for:
+    host: localhost
+    port: "{{ ai_minio_api_port }}"
+    timeout: 60
+  when: ai_milvus_docker | bool
+
+- name: Start Milvus container
+  community.docker.docker_container:
+    name: "{{ ai_milvus_container_name }}"
+    image: "{{ ai_milvus_container_image_string }}"
+    state: started
+    restart_policy: unless-stopped
+    networks:
+      - name: "{{ ai_docker_network_name }}"
+    ports:
+      - "{{ ai_milvus_port }}:19530"
+      - "{{ ai_milvus_web_ui_port }}:9091"
+    env:
+      ETCD_ENDPOINTS: "{{ ai_etcd_container_name }}:2379"
+      MINIO_ADDRESS: "{{ ai_minio_container_name }}:9000"
+      MINIO_ACCESS_KEY: "{{ ai_minio_access_key }}"
+      MINIO_SECRET_KEY: "{{ ai_minio_secret_key }}"
+    volumes:
+      - "{{ ai_docker_data_path }}:/var/lib/milvus"
+    memory: "{{ ai_milvus_memory_limit }}"
+    cpus: "{{ ai_milvus_cpu_limit }}"
+    command: milvus run standalone
+  when: ai_milvus_docker | bool
+
+- name: Wait for Milvus to be ready
+  ansible.builtin.wait_for:
+    host: localhost
+    port: "{{ ai_milvus_port }}"
+    timeout: 120
+  when: ai_milvus_docker | bool
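For reference, the three readiness waits above reduce to polling a TCP port
until it accepts a connection. A minimal Python sketch of that logic (the
function name and retry timings are illustrative, not part of the patch):

```python
# Minimal sketch of what the wait_for tasks above do: poll a TCP port
# until it accepts a connection or the timeout elapses.
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            time.sleep(0.5)
    return False
```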
diff --git a/playbooks/roles/ai_uninstall/tasks/main.yml b/playbooks/roles/ai_uninstall/tasks/main.yml
new file mode 100644
index 00000000..4d35465b
--- /dev/null
+++ b/playbooks/roles/ai_uninstall/tasks/main.yml
@@ -0,0 +1,62 @@
+---
+- name: Import optional extra_args file
+  ansible.builtin.include_vars: "{{ item }}"
+  # The extra vars file is optional; ignore a missing file.
+  failed_when: false
+  with_items:
+    - "../extra_vars.yaml"
+  tags: vars
+
+- name: Stop and remove Milvus container
+  community.docker.docker_container:
+    name: "{{ ai_milvus_container_name }}"
+    state: absent
+  when: ai_milvus_docker | bool
+  # Best-effort cleanup: the container may already be gone.
+  failed_when: false
+
+- name: Stop and remove MinIO container
+  community.docker.docker_container:
+    name: "{{ ai_minio_container_name }}"
+    state: absent
+  when: ai_milvus_docker | bool
+  # Best-effort cleanup: the container may already be gone.
+  failed_when: false
+
+- name: Stop and remove etcd container
+  community.docker.docker_container:
+    name: "{{ ai_etcd_container_name }}"
+    state: absent
+  when: ai_milvus_docker | bool
+  # Best-effort cleanup: the container may already be gone.
+  failed_when: false
+
+- name: Remove Docker network
+  community.docker.docker_network:
+    name: "{{ ai_docker_network_name }}"
+    state: absent
+  when: ai_milvus_docker | bool
+  # Best-effort cleanup: the network may already be gone.
+  failed_when: false
+
+- name: Clean up Python packages (optional)
+  ansible.builtin.pip:
+    name:
+      - pymilvus
+      - matplotlib
+      - seaborn
+      - plotly
+    state: absent
+  when: ai_benchmark_enable_graphing | bool
+  # Best-effort cleanup: packages may already be absent.
+  failed_when: false
+
+- name: Display uninstall completion message
+  ansible.builtin.debug:
+    msg: |
+      AI benchmark components uninstalled successfully.
+      Data directories preserved:
+      - {{ ai_docker_data_path }}
+      - {{ ai_benchmark_results_dir }}
+
+      To completely remove all data, run the ai-destroy target.
diff --git a/playbooks/roles/gen_hosts/tasks/main.yml b/playbooks/roles/gen_hosts/tasks/main.yml
index ec11d039..4b35d9f6 100644
--- a/playbooks/roles/gen_hosts/tasks/main.yml
+++ b/playbooks/roles/gen_hosts/tasks/main.yml
@@ -381,6 +381,20 @@
     - workflows_reboot_limit
     - ansible_hosts_template.stat.exists
 
+- name: Generate the Ansible hosts file for a dedicated AI setup
+  tags: ['hosts']
+  ansible.builtin.template:
+    src: "{{ kdevops_hosts_template }}"
+    dest: "{{ ansible_cfg_inventory }}"
+    force: true
+    trim_blocks: true
+    lstrip_blocks: true
+    mode: '0644'
+  when:
+    - kdevops_workflows_dedicated_workflow
+    - kdevops_workflow_enable_ai
+    - ansible_hosts_template.stat.exists
+
 - name: Verify if final host file exists
   ansible.builtin.stat:
     path: "{{ ansible_cfg_inventory }}"
diff --git a/playbooks/roles/gen_hosts/templates/hosts.j2 b/playbooks/roles/gen_hosts/templates/hosts.j2
index 6d83191d..cdcd1883 100644
--- a/playbooks/roles/gen_hosts/templates/hosts.j2
+++ b/playbooks/roles/gen_hosts/templates/hosts.j2
@@ -77,6 +77,114 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
 
 [service:vars]
 ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+{% elif kdevops_workflow_enable_ai %}
+{% if ai_enable_multifs_testing|default(false)|bool %}
+{# Multi-filesystem section-based hosts #}
+[all]
+localhost ansible_connection=local
+{% for node in all_generic_nodes %}
+{{ node }}
+{% endfor %}
+
+[all:vars]
+ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+
+[baseline]
+{% for node in all_generic_nodes %}
+{% if not node.endswith('-dev') %}
+{{ node }}
+{% endif %}
+{% endfor %}
+
+[baseline:vars]
+ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+
+{% if kdevops_baseline_and_dev %}
+[dev]
+{% for node in all_generic_nodes %}
+{% if node.endswith('-dev') %}
+{{ node }}
+{% endif %}
+{% endfor %}
+
+[dev:vars]
+ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+
+{% endif %}
+[ai]
+{% for node in all_generic_nodes %}
+{{ node }}
+{% endfor %}
+
+[ai:vars]
+ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+
+{% set fs_configs = [] %}
+{% for node in all_generic_nodes %}
+{% set node_parts = node.split('-') %}
+{% if node_parts|length >= 3 %}
+{% set fs_type = node_parts[2] %}
+{% set fs_config = node_parts[3:] | select('ne', 'dev') | join('_') %}
+{% set fs_group = fs_type + '_' + fs_config if fs_config else fs_type %}
+{% if fs_group not in fs_configs %}
+{% set _ = fs_configs.append(fs_group) %}
+{% endif %}
+{% endif %}
+{% endfor %}
+
+{% for fs_group in fs_configs %}
+[ai_{{ fs_group }}]
+{% for node in all_generic_nodes %}
+{% set node_parts = node.split('-') %}
+{% if node_parts|length >= 3 %}
+{% set fs_type = node_parts[2] %}
+{% set fs_config = node_parts[3:] | select('ne', 'dev') | join('_') %}
+{% set node_fs_group = fs_type + '_' + fs_config if fs_config else fs_type %}
+{% if node_fs_group == fs_group %}
+{{ node }}
+{% endif %}
+{% endif %}
+{% endfor %}
+
+[ai_{{ fs_group }}:vars]
+ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+
+{% endfor %}
+{% else %}
+{# Single-node AI hosts #}
+[all]
+localhost ansible_connection=local
+{{ kdevops_host_prefix }}-ai
+{% if kdevops_baseline_and_dev %}
+{{ kdevops_host_prefix }}-ai-dev
+{% endif %}
+
+[all:vars]
+ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+
+[baseline]
+{{ kdevops_host_prefix }}-ai
+
+[baseline:vars]
+ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+
+{% if kdevops_baseline_and_dev %}
+[dev]
+{{ kdevops_host_prefix }}-ai-dev
+
+[dev:vars]
+ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+
+{% endif %}
+[ai]
+{{ kdevops_host_prefix }}-ai
+{% if kdevops_baseline_and_dev %}
+{{ kdevops_host_prefix }}-ai-dev
+{% endif %}
+
+[ai:vars]
+ansible_python_interpreter = "{{ kdevops_python_interpreter }}"
+{% endif %}
 {% else %}
 [all]
 localhost ansible_connection=local
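The per-filesystem grouping in the template above derives a section name from
each node name. A small Python sketch of the same parsing, mirroring the Jinja2
logic (the helper name and example prefixes are illustrative):

```python
# Sketch of the node-name parsing hosts.j2 performs for multi-filesystem
# testing: "<prefix>-ai-<fs>[-<config>...][-dev]" maps to "<fs>_<config>",
# with any "-dev" suffix ignored so baseline and dev nodes share a group.
def fs_group(node: str):
    parts = node.split("-")
    if len(parts) < 3:
        return None  # not a multi-filesystem AI node name
    fs_type = parts[2]
    fs_config = "_".join(p for p in parts[3:] if p != "dev")
    return fs_type + "_" + fs_config if fs_config else fs_type
```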
diff --git a/playbooks/roles/gen_nodes/tasks/main.yml b/playbooks/roles/gen_nodes/tasks/main.yml
index a8598481..d54977be 100644
--- a/playbooks/roles/gen_nodes/tasks/main.yml
+++ b/playbooks/roles/gen_nodes/tasks/main.yml
@@ -642,6 +642,40 @@
     - ansible_nodes_template.stat.exists
 
 
+- name: Generate the AI kdevops nodes file using {{ kdevops_nodes_template }} as jinja2 source template
+  tags: ['hosts']
+  vars:
+    node_template: "{{ kdevops_nodes_template | basename }}"
+    nodes: "{{ [kdevops_host_prefix + '-ai'] }}"
+    all_generic_nodes: "{{ [kdevops_host_prefix + '-ai'] }}"
+  ansible.builtin.template:
+    src: "{{ node_template }}"
+    dest: "{{ topdir_path }}/{{ kdevops_nodes }}"
+    force: true
+    mode: '0644'
+  when:
+    - kdevops_workflows_dedicated_workflow
+    - kdevops_workflow_enable_ai
+    - ansible_nodes_template.stat.exists
+    - not kdevops_baseline_and_dev
+
+- name: Generate the AI kdevops nodes file with dev hosts using {{ kdevops_nodes_template }} as jinja2 source template
+  tags: ['hosts']
+  vars:
+    node_template: "{{ kdevops_nodes_template | basename }}"
+    nodes: "{{ [kdevops_host_prefix + '-ai', kdevops_host_prefix + '-ai-dev'] }}"
+    all_generic_nodes: "{{ [kdevops_host_prefix + '-ai', kdevops_host_prefix + '-ai-dev'] }}"
+  ansible.builtin.template:
+    src: "{{ node_template }}"
+    dest: "{{ topdir_path }}/{{ kdevops_nodes }}"
+    force: true
+    mode: '0644'
+  when:
+    - kdevops_workflows_dedicated_workflow
+    - kdevops_workflow_enable_ai
+    - ansible_nodes_template.stat.exists
+    - kdevops_baseline_and_dev
+
 - name: Get the control host's timezone
   ansible.builtin.command: "timedatectl show -p Timezone --value"
   register: kdevops_host_timezone
diff --git a/playbooks/roles/milvus/README.md b/playbooks/roles/milvus/README.md
new file mode 100644
index 00000000..e6571167
--- /dev/null
+++ b/playbooks/roles/milvus/README.md
@@ -0,0 +1,181 @@
+# Milvus Vector Database Role
+
+This Ansible role manages the Milvus vector database for AI benchmarking in kdevops.
+
+## Overview
+
+Milvus is an open-source vector database designed for embedding similarity search
+and AI applications. This role provides:
+
+- Docker-based deployment with etcd and MinIO
+- Comprehensive performance benchmarking
+- Scalable testing from small to large datasets
+- Multiple index type support (HNSW, IVF_FLAT, etc.)
+
+## Role Variables
+
+### Required Variables
+
+- `ai_vector_db_milvus_enable`: Enable/disable Milvus deployment
+- `ai_vector_db_milvus_dimension`: Vector dimension size (default: 768)
+- `ai_vector_db_milvus_dataset_size`: Number of vectors to test (default: 1000000)
+
+### Docker Configuration
+
+- `ai_vector_db_milvus_container_name`: Milvus container name
+- `ai_vector_db_milvus_port`: Milvus service port (default: 19530)
+- `ai_vector_db_milvus_memory_limit`: Container memory limit
+- `ai_vector_db_milvus_cpu_limit`: Container CPU limit
+
+### Benchmark Configuration
+
+- `ai_vector_db_milvus_batch_size`: Insertion batch size
+- `ai_vector_db_milvus_num_queries`: Number of search queries
+- `ai_benchmark_iterations`: Number of benchmark iterations
+- `ai_benchmark_results_dir`: Directory for storing results
+
+## Dependencies
+
+For Docker deployment:
+- Docker Engine
+- Python Docker SDK (the docker package, used by the community.docker modules)
+
+For benchmarking:
+- Python 3.8+
+- pymilvus
+- numpy
+
+## Directory Structure
+
+```
+milvus/
+├── defaults/
+│   └── main.yml                   # Default variables
+├── tasks/
+│   ├── main.yml                   # Task router based on action
+│   ├── install_docker.yml         # Docker installation tasks
+│   ├── setup.yml                  # Environment setup
+│   ├── benchmark.yml              # Benchmark execution
+│   └── destroy.yml                # Cleanup tasks
+├── templates/
+│   ├── docker-compose.yml.j2      # Docker compose configuration
+│   ├── benchmark_config.json.j2   # Benchmark parameters
+│   └── test_connection.py.j2      # Connection test script
+├── files/
+│   ├── milvus_benchmark.py        # Main benchmark script
+│   └── milvus_utils.py            # Utility functions
+└── meta/
+    └── main.yml                   # Role metadata
+```
+
+## Usage Examples
+
+### Basic Installation
+
+```yaml
+- name: Install Milvus
+  hosts: ai
+  roles:
+    - role: milvus
+      vars:
+        action: install
+```
+
+### Run Benchmarks
+
+```yaml
+- name: Benchmark Milvus
+  hosts: ai
+  roles:
+    - role: milvus
+      vars:
+        action: benchmark
+        ai_vector_db_milvus_dataset_size: 1000000
+        ai_vector_db_milvus_dimension: 768
+```
+
+### Cleanup
+
+```yaml
+- name: Destroy Milvus
+  hosts: ai
+  roles:
+    - role: milvus
+      vars:
+        action: destroy
+```
+
+## Benchmark Metrics
+
+The benchmark collects the following metrics:
+
+1. **Insertion Performance**
+   - Total insertion time
+   - Average throughput (vectors/second)
+   - Batch-level statistics
+
+2. **Search Performance**
+   - Query latency (ms)
+   - Queries per second (QPS)
+   - Top-K result counts (recall needs ground-truth data)
+
+3. **Index Performance**
+   - Index build time
+   - Index memory usage
+   - Search performance by index type
+
+## Results
+
+Benchmark results are stored in JSON format:
+
+```json
+{
+  "timestamp": "2024-01-20T10:30:00",
+  "configuration": {
+    "dataset_size": 1000000,
+    "dimension": 768,
+    "index_type": "HNSW"
+  },
+  "insertion": {
+    "total_time": 120.5,
+    "throughput": 8298.75
+  },
+  "search": {
+    "avg_latency": 2.3,
+    "qps": 434.78
+  }
+}
+```
+
+## Troubleshooting
+
+### Container Issues
+
+Check container status:
+```bash
+docker ps -a | grep milvus
+docker logs milvus-ai-benchmark
+```
+
+### Connection Issues
+
+Test connectivity:
+```bash
+python3 /tmp/test_milvus_connection.py
+```
+
+### Performance Issues
+
+For large datasets:
+- Increase memory limits in Kconfig
+- Use SSD storage for better performance
+- Adjust batch sizes based on available memory
+
+## Contributing
+
+When modifying this role:
+
+1. Follow Ansible best practices
+2. Update documentation for new features
+3. Test with both small and large datasets
+4. Ensure idempotency of all tasks
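The search metrics the README describes (latency percentiles and QPS) can be
derived from per-query timings. A standalone sketch using nearest-rank
percentiles (function and key names are illustrative, not the role's code):

```python
# Sketch: deriving p50/p95/p99 latency (ms) and QPS from a list of
# per-query wall-clock timings in seconds.
def latency_stats(times_s):
    """Compute nearest-rank latency percentiles (ms) and aggregate QPS."""
    if not times_s:
        raise ValueError("no samples")
    ordered = sorted(times_s)

    def pct(p):
        # Nearest-rank percentile on the sorted sample.
        idx = min(len(ordered) - 1, max(0, round(p / 100 * len(ordered)) - 1))
        return ordered[idx] * 1000.0

    return {
        "p50_ms": pct(50),
        "p95_ms": pct(95),
        "p99_ms": pct(99),
        "qps": len(times_s) / sum(times_s),
    }
```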
diff --git a/playbooks/roles/milvus/defaults/main.yml b/playbooks/roles/milvus/defaults/main.yml
new file mode 100644
index 00000000..a002196c
--- /dev/null
+++ b/playbooks/roles/milvus/defaults/main.yml
@@ -0,0 +1,74 @@
+---
+# Milvus vector database defaults
+ai_vector_db_milvus_version: "2.3.0"
+ai_vector_db_milvus_docker: true
+ai_vector_db_milvus_compose_version: "v2.3.0"
+
+# Deployment options
+ai_vector_db_milvus_data_dir: "/data/milvus"
+ai_vector_db_milvus_config_dir: "/etc/milvus"
+ai_vector_db_milvus_log_dir: "/var/log/milvus"
+
+# Network configuration
+ai_vector_db_milvus_port: 19530
+ai_vector_db_milvus_grpc_port: 19530
+ai_vector_db_milvus_metrics_port: 9091
+ai_vector_db_milvus_web_ui_port: 9091
+ai_vector_db_milvus_etcd_client_port: 2379
+ai_vector_db_milvus_minio_api_port: 9000
+ai_vector_db_milvus_minio_console_port: 9001
+
+# Resource limits
+ai_vector_db_milvus_memory_limit: "8Gi"
+ai_vector_db_milvus_cpu_limit: "4"
+
+# Storage backend
+ai_vector_db_milvus_storage_type: "local"  # local, s3, minio
+ai_vector_db_milvus_storage_path: "{{ ai_vector_db_milvus_data_dir }}/storage"
+
+# Index configuration
+ai_vector_db_milvus_index_type: "IVF_FLAT"
+ai_vector_db_milvus_metric_type: "L2"
+ai_vector_db_milvus_nlist: 1024
+
+# Collection defaults
+ai_vector_db_milvus_default_collection: "benchmark_collection"
+ai_vector_db_milvus_default_dim: 768
+ai_vector_db_milvus_default_shards: 2
+
+# Benchmark configuration
+ai_vector_db_milvus_benchmark_enable: true
+ai_vector_db_milvus_benchmark_datasets:
+  - sift1m
+  - gist1m
+ai_vector_db_milvus_benchmark_batch_size: 10000
+ai_vector_db_milvus_benchmark_num_queries: 10000
+
+# Results and filesystem configuration
+ai_benchmark_results_dir: "/data/benchmark-results"
+ai_filesystem: "{{ kdevops_filesystem | default('xfs') }}"
+ai_data_device_path: "/data"
+ai_mkfs_opts: ""
+ai_mount_opts: "defaults"
+
+# Docker container configuration
+ai_vector_db_milvus_container_name: "milvus-standalone"
+ai_vector_db_milvus_etcd_container_name: "milvus-etcd"
+ai_vector_db_milvus_minio_container_name: "milvus-minio"
+
+# Docker image configuration
+ai_vector_db_milvus_container_image_string: "milvusdb/milvus:{{ ai_vector_db_milvus_version }}"
+ai_vector_db_milvus_etcd_container_image_string: "quay.io/coreos/etcd:v3.5.5"
+ai_vector_db_milvus_minio_container_image_string: "minio/minio:RELEASE.2023-03-20T20-16-18Z"
+
+# Docker volume paths
+ai_vector_db_milvus_docker_data_path: "{{ ai_vector_db_milvus_data_dir }}/volumes/milvus"
+ai_vector_db_milvus_docker_etcd_data_path: "{{ ai_vector_db_milvus_data_dir }}/volumes/etcd"
+ai_vector_db_milvus_docker_minio_data_path: "{{ ai_vector_db_milvus_data_dir }}/volumes/minio"
+
+# MinIO configuration
+ai_vector_db_milvus_minio_access_key: "minioadmin"
+ai_vector_db_milvus_minio_secret_key: "minioadmin"
+
+# Docker network
+ai_vector_db_milvus_docker_network_name: "milvus"
diff --git a/playbooks/roles/milvus/files/milvus_benchmark.py b/playbooks/roles/milvus/files/milvus_benchmark.py
new file mode 100644
index 00000000..bd7d5ead
--- /dev/null
+++ b/playbooks/roles/milvus/files/milvus_benchmark.py
@@ -0,0 +1,348 @@
+#!/usr/bin/env python3
+"""
+Milvus Vector Database Benchmark Script
+
+This script performs comprehensive benchmarking of Milvus vector database
+including vector insertion, index creation, and query performance testing.
+"""
+
+import json
+import numpy as np
+import time
+import argparse
+import sys
+from datetime import datetime
+from typing import List, Dict, Any, Tuple
+import logging
+
+try:
+    from pymilvus import (
+        connections,
+        Collection,
+        CollectionSchema,
+        FieldSchema,
+        DataType,
+        utility,
+    )
+except ImportError:
+    print("Error: pymilvus not installed. Please install with: pip install pymilvus")
+    sys.exit(1)
+
+
+class MilvusBenchmark:
+    def __init__(self, config: Dict[str, Any]):
+        self.config = config
+        self.collection = None
+        self.results = {
+            "config": config,
+            "timestamp": datetime.now().isoformat(),
+            "insert_performance": {},
+            "index_performance": {},
+            "query_performance": {},
+            "system_info": {},
+        }
+
+        # Setup logging
+        logging.basicConfig(
+            level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"
+        )
+        self.logger = logging.getLogger(__name__)
+
+    def connect_to_milvus(self) -> bool:
+        """Connect to Milvus server"""
+        try:
+            connections.connect(
+                alias="default",
+                host=self.config["milvus"]["host"],
+                port=self.config["milvus"]["port"],
+            )
+            self.logger.info(
+                f"Connected to Milvus at {self.config['milvus']['host']}:{self.config['milvus']['port']}"
+            )
+            return True
+        except Exception as e:
+            self.logger.error(f"Failed to connect to Milvus: {e}")
+            return False
+
+    def create_collection(self) -> bool:
+        """Create benchmark collection"""
+        try:
+            collection_name = self.config["milvus"]["collection_name"]
+
+            # Drop collection if exists
+            if utility.has_collection(collection_name):
+                utility.drop_collection(collection_name)
+                self.logger.info(f"Dropped existing collection: {collection_name}")
+
+            # Define schema
+            fields = [
+                FieldSchema(
+                    name="id", dtype=DataType.INT64, is_primary=True, auto_id=False
+                ),
+                FieldSchema(
+                    name="vector",
+                    dtype=DataType.FLOAT_VECTOR,
+                    dim=self.config["milvus"]["dimension"],
+                ),
+            ]
+            schema = CollectionSchema(
+                fields,
+                f"Benchmark collection with {self.config['milvus']['dimension']}D vectors",
+            )
+
+            # Create collection
+            self.collection = Collection(collection_name, schema)
+            self.logger.info(f"Created collection: {collection_name}")
+            return True
+        except Exception as e:
+            self.logger.error(f"Failed to create collection: {e}")
+            return False
+
+    def generate_vectors(self, count: int) -> Tuple[List[int], List[List[float]]]:
+        """Generate random vectors for benchmarking"""
+        ids = list(range(count))
+        vectors = (
+            np.random.random((count, self.config["milvus"]["dimension"]))
+            .astype(np.float32)
+            .tolist()
+        )
+        return ids, vectors
+
+    def benchmark_insert(self) -> bool:
+        """Benchmark vector insertion performance"""
+        try:
+            self.logger.info("Starting insert benchmark...")
+
+            batch_size = self.config["benchmark"]["batch_size"]
+            total_vectors = self.config["benchmark"][
+                "num_queries"
+            ]  # Use num_queries as dataset size
+
+            insert_times = []
+
+            for i in range(0, total_vectors, batch_size):
+                current_batch_size = min(batch_size, total_vectors - i)
+
+                # Generate batch data
+                ids, vectors = self.generate_vectors(current_batch_size)
+                ids = [vec_id + i for vec_id in ids]  # offset keeps IDs unique across batches
+
+                # Insert batch
+                start_time = time.time()
+                self.collection.insert([ids, vectors])
+                insert_time = time.time() - start_time
+                insert_times.append(insert_time)
+
+                if (i // batch_size) % 100 == 0:
+                    self.logger.info(
+                        f"Inserted {i + current_batch_size}/{total_vectors} vectors"
+                    )
+
+            # Flush to ensure data is persisted
+            self.logger.info("Flushing collection...")
+            flush_start = time.time()
+            self.collection.flush()
+            flush_time = time.time() - flush_start
+
+            # Calculate statistics
+            total_insert_time = sum(insert_times)
+            avg_insert_time = total_insert_time / len(insert_times)
+            vectors_per_second = total_vectors / total_insert_time
+
+            self.results["insert_performance"] = {
+                "total_vectors": total_vectors,
+                "total_time_seconds": total_insert_time,
+                "flush_time_seconds": flush_time,
+                "average_batch_time_seconds": avg_insert_time,
+                "vectors_per_second": vectors_per_second,
+                "batch_size": batch_size,
+            }
+
+            self.logger.info(
+                f"Insert benchmark completed: {vectors_per_second:.2f} vectors/sec"
+            )
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Insert benchmark failed: {e}")
+            return False
+
+    def benchmark_index_creation(self) -> bool:
+        """Benchmark index creation performance"""
+        try:
+            self.logger.info("Starting index creation benchmark...")
+
+            index_params = {
+                "metric_type": "L2",
+                "index_type": self.config["milvus"]["index_type"],
+                "params": {},
+            }
+
+            if self.config["milvus"]["index_type"] == "HNSW":
+                index_params["params"] = {
+                    "M": self.config.get("index_hnsw_m", 16),
+                    "efConstruction": self.config.get(
+                        "index_hnsw_ef_construction", 200
+                    ),
+                }
+            elif self.config["milvus"]["index_type"] == "IVF_FLAT":
+                index_params["params"] = {
+                    "nlist": self.config.get("index_ivf_nlist", 1024)
+                }
+
+            start_time = time.time()
+            self.collection.create_index("vector", index_params)
+            index_time = time.time() - start_time
+
+            self.results["index_performance"] = {
+                "index_type": self.config["milvus"]["index_type"],
+                "index_params": index_params,
+                "creation_time_seconds": index_time,
+            }
+
+            self.logger.info(f"Index creation completed in {index_time:.2f} seconds")
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Index creation failed: {e}")
+            return False
+
+    def benchmark_queries(self) -> bool:
+        """Benchmark query performance"""
+        try:
+            self.logger.info("Starting query benchmark...")
+
+            # Load collection
+            self.collection.load()
+
+            # Generate query vectors
+            query_count = 1000
+            _, query_vectors = self.generate_vectors(query_count)
+
+            query_results = {}
+
+            # Test different top-k values
+            topk_values = []
+            if self.config.get("benchmark_query_topk_1", False):
+                topk_values.append(1)
+            if self.config.get("benchmark_query_topk_10", False):
+                topk_values.append(10)
+            if self.config.get("benchmark_query_topk_100", False):
+                topk_values.append(100)
+
+            # Test different batch sizes
+            batch_sizes = []
+            if self.config.get("benchmark_batch_1", False):
+                batch_sizes.append(1)
+            if self.config.get("benchmark_batch_10", False):
+                batch_sizes.append(10)
+            if self.config.get("benchmark_batch_100", False):
+                batch_sizes.append(100)
+
+            for topk in topk_values:
+                query_results[f"topk_{topk}"] = {}
+
+                search_params = {"metric_type": "L2", "params": {}}
+                if self.config["milvus"]["index_type"] == "HNSW":
+                    search_params["params"]["ef"] = self.config.get("index_hnsw_ef", 64)
+                elif self.config["milvus"]["index_type"] == "IVF_FLAT":
+                    search_params["params"]["nprobe"] = self.config.get(
+                        "index_ivf_nprobe", 16
+                    )
+
+                for batch_size in batch_sizes:
+                    self.logger.info(f"Testing topk={topk}, batch_size={batch_size}")
+
+                    times = []
+                    for i in range(
+                        0, min(query_count, 100), batch_size
+                    ):  # Limit to 100 queries for speed
+                        batch_vectors = query_vectors[i : i + batch_size]
+
+                        start_time = time.time()
+                        self.collection.search(  # results unused; we only time the call
+                            batch_vectors,
+                            "vector",
+                            search_params,
+                            limit=topk,
+                            output_fields=["id"],
+                        )
+                        query_time = time.time() - start_time
+                        times.append(query_time)
+
+                    avg_time = sum(times) / len(times)
+                    qps = batch_size / avg_time
+
+                    query_results[f"topk_{topk}"][f"batch_{batch_size}"] = {
+                        "average_time_seconds": avg_time,
+                        "queries_per_second": qps,
+                        "total_queries": len(times) * batch_size,
+                    }
+
+            self.results["query_performance"] = query_results
+            self.logger.info("Query benchmark completed")
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Query benchmark failed: {e}")
+            return False
+
+    def run_benchmark(self) -> bool:
+        """Run complete benchmark suite"""
+        self.logger.info("Starting Milvus benchmark suite...")
+
+        if not self.connect_to_milvus():
+            return False
+
+        if not self.create_collection():
+            return False
+
+        if not self.benchmark_insert():
+            return False
+
+        if not self.benchmark_index_creation():
+            return False
+
+        if not self.benchmark_queries():
+            return False
+
+        self.logger.info("Benchmark suite completed successfully")
+        return True
+
+    def save_results(self, output_file: str):
+        """Save benchmark results to file"""
+        try:
+            with open(output_file, "w") as f:
+                json.dump(self.results, f, indent=2)
+            self.logger.info(f"Results saved to {output_file}")
+        except Exception as e:
+            self.logger.error(f"Failed to save results: {e}")
+
+
+def main():
+    parser = argparse.ArgumentParser(description="Milvus Vector Database Benchmark")
+    parser.add_argument("--config", required=True, help="JSON configuration file")
+    parser.add_argument("--output", required=True, help="Output results file")
+
+    args = parser.parse_args()
+
+    # Load configuration
+    try:
+        with open(args.config, "r") as f:
+            config = json.load(f)
+    except Exception as e:
+        print(f"Error loading config file: {e}")
+        return 1
+
+    # Run benchmark
+    benchmark = MilvusBenchmark(config)
+    success = benchmark.run_benchmark()
+
+    # Save results
+    benchmark.save_results(args.output)
+
+    return 0 if success else 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())
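For reference, a file passed via --config needs at least the keys the script
reads above. A sketch of one such config, generated from Python (all values
are illustrative defaults, not mandated by the patch):

```python
# Sketch of a minimal JSON config accepted by milvus_benchmark.py, based on
# the keys the script reads; the optional top-k/batch toggles and index
# parameters are consumed via config.get(...).
import json

config = {
    "milvus": {
        "host": "localhost",
        "port": 19530,
        "collection_name": "benchmark_collection",
        "dimension": 768,
        "index_type": "IVF_FLAT",
    },
    "benchmark": {
        "batch_size": 10000,
        "num_queries": 10000,  # also used as the dataset size
    },
    # Optional query toggles; at least one top-k and one batch size
    # must be enabled for the query benchmark to run any searches.
    "benchmark_query_topk_10": True,
    "benchmark_batch_10": True,
    # Index parameters for IVF_FLAT.
    "index_ivf_nlist": 1024,
    "index_ivf_nprobe": 16,
}

print(json.dumps(config, indent=2))
```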
diff --git a/playbooks/roles/milvus/files/milvus_utils.py b/playbooks/roles/milvus/files/milvus_utils.py
new file mode 100644
index 00000000..15b8af4f
--- /dev/null
+++ b/playbooks/roles/milvus/files/milvus_utils.py
@@ -0,0 +1,134 @@
+#!/usr/bin/env python3
+"""
+Utility functions for Milvus benchmarking
+"""
+
+import numpy as np
+import time
+from typing import Dict, Any
+from pymilvus import Collection, utility
+
+
+def generate_random_vectors(dim: int, count: int) -> np.ndarray:
+    """Generate random vectors for testing"""
+    return np.random.random((count, dim)).astype("float32")
+
+
+def create_collection(name: str, dim: int, metric_type: str = "L2") -> Collection:
+    """Create a Milvus collection with specified parameters"""
+    from pymilvus import CollectionSchema, FieldSchema, DataType
+
+    fields = [
+        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
+        FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=dim),
+    ]
+
+    schema = CollectionSchema(
+        fields=fields, description=f"Benchmark collection dim={dim}"
+    )
+    collection = Collection(name=name, schema=schema)
+
+    return collection
+
+
+def create_index(
+    collection: Collection, index_type: str = "IVF_FLAT", nlist: int = 1024
+):
+    """Create an index on the collection"""
+    index_params = {
+        "metric_type": "L2",
+        "index_type": index_type,
+        "params": {"nlist": nlist},
+    }
+
+    collection.create_index(field_name="embedding", index_params=index_params)
+    collection.load()
+
+
+def benchmark_insert(
+    collection: Collection, vectors: np.ndarray, batch_size: int = 10000
+) -> Dict[str, Any]:
+    """Benchmark vector insertion"""
+    total_vectors = len(vectors)
+    results = {
+        "total_vectors": total_vectors,
+        "batch_size": batch_size,
+        "batches": [],
+        "total_time": 0,
+    }
+
+    start_time = time.time()
+
+    for i in range(0, total_vectors, batch_size):
+        batch_start = time.time()
+        batch_vectors = vectors[i : i + batch_size].tolist()
+
+        collection.insert([batch_vectors])
+
+        batch_time = time.time() - batch_start
+        results["batches"].append(
+            {
+                "batch_idx": i // batch_size,
+                "vectors": len(batch_vectors),
+                "time": batch_time,
+                "throughput": len(batch_vectors) / batch_time,
+            }
+        )
+
+    collection.flush()
+
+    results["total_time"] = time.time() - start_time
+    results["avg_throughput"] = total_vectors / results["total_time"]
+
+    return results
+
+
+def benchmark_search(
+    collection: Collection, query_vectors: np.ndarray, top_k: int = 10, nprobe: int = 10
+) -> Dict[str, Any]:
+    """Benchmark vector search"""
+    search_params = {"metric_type": "L2", "params": {"nprobe": nprobe}}
+
+    results = {
+        "num_queries": len(query_vectors),
+        "top_k": top_k,
+        "nprobe": nprobe,
+        "queries": [],
+        "total_time": 0,
+    }
+
+    start_time = time.time()
+
+    for i, query in enumerate(query_vectors):
+        query_start = time.time()
+
+        search_results = collection.search(
+            data=[query.tolist()],
+            anns_field="embedding",
+            param=search_params,
+            limit=top_k,
+        )
+
+        query_time = time.time() - query_start
+        results["queries"].append(
+            {"query_idx": i, "time": query_time, "num_results": len(search_results[0])}
+        )
+
+    results["total_time"] = time.time() - start_time
+    results["avg_latency"] = results["total_time"] / len(query_vectors)
+    results["qps"] = len(query_vectors) / results["total_time"]
+
+    return results
+
+
+def get_collection_stats(collection: Collection) -> Dict[str, Any]:
+    """Get collection statistics"""
+    collection.flush()
+    num_entities = collection.num_entities
+
+    return {
+        "name": collection.name,
+        "num_entities": num_entities,
+        "loaded": utility.load_state(collection.name).name,
+        "index": collection.indexes,
+    }
diff --git a/playbooks/roles/milvus/meta/main.yml b/playbooks/roles/milvus/meta/main.yml
new file mode 100644
index 00000000..6af514b7
--- /dev/null
+++ b/playbooks/roles/milvus/meta/main.yml
@@ -0,0 +1,30 @@
+---
+galaxy_info:
+  author: kdevops AI team
+  description: Milvus vector database installation and setup for AI workflows
+  company: kdevops
+  license: copyleft-next-0.3.1
+  min_ansible_version: 2.9
+  platforms:
+    - name: Debian
+      versions:
+        - bookworm
+        - bullseye
+    - name: Ubuntu
+      versions:
+        - jammy
+        - focal
+    - name: Fedora
+      versions:
+        - all
+    - name: EL
+      versions:
+        - 8
+        - 9
+  galaxy_tags:
+    - ai
+    - vector_database
+    - milvus
+    - machine_learning
+
+dependencies: []
diff --git a/playbooks/roles/milvus/tasks/benchmark.yml b/playbooks/roles/milvus/tasks/benchmark.yml
new file mode 100644
index 00000000..222a00e9
--- /dev/null
+++ b/playbooks/roles/milvus/tasks/benchmark.yml
@@ -0,0 +1,61 @@
+---
+# Check if Milvus is actually running before attempting benchmarks
+- name: Check if Milvus is accessible
+  ansible.builtin.wait_for:
+    port: "{{ ai_vector_db_milvus_port }}"
+    host: localhost
+    timeout: 5
+    state: started
+  register: milvus_running
+  ignore_errors: true
+
+- name: Set Milvus availability flag
+  ansible.builtin.set_fact:
+    milvus_is_available: "{{ milvus_running is succeeded }}"
+
+- name: Debug Milvus check result
+  ansible.builtin.debug:
+    msg: |
+      Milvus check result: {{ milvus_running }}
+      Is succeeded: {{ milvus_running is succeeded }}
+      Is failed: {{ milvus_running is failed }}
+      Milvus is available: {{ milvus_is_available }}
+
+- name: Skip benchmarks if Milvus is not running
+  ansible.builtin.debug:
+    msg: |
+      Milvus is not running on port {{ ai_vector_db_milvus_port }}.
+      In native mode, Milvus server is not available.
+      Skipping benchmarks. Use Docker mode for full functionality.
+  when: not milvus_is_available
+
+- name: Run benchmark tasks only if Milvus is available
+  block:
+    - name: Create benchmark results directory
+      ansible.builtin.file:
+        path: "{{ ai_benchmark_results_dir }}/milvus"
+        state: directory
+        mode: '0755'
+
+    - name: Generate benchmark configuration
+      ansible.builtin.template:
+        src: benchmark_config.json.j2
+        dest: "{{ ai_vector_db_milvus_data_dir }}/scripts/benchmark_config.json"
+        mode: '0644'
+
+    - name: Run Milvus benchmarks
+      ansible.builtin.command: >
+        python3 {{ ai_vector_db_milvus_data_dir }}/scripts/milvus_benchmark.py
+        --config {{ ai_vector_db_milvus_data_dir }}/scripts/benchmark_config.json
+        --output {{ ai_benchmark_results_dir }}/milvus/results_{{ ansible_date_time.epoch }}.json
+      register: benchmark_result
+      when: ai_vector_db_milvus_benchmark_enable | bool
+
+    - name: Display benchmark summary
+      ansible.builtin.debug:
+        msg: "{{ benchmark_result.stdout_lines[-20:] }}"
+      when:
+        - ai_vector_db_milvus_benchmark_enable|bool
+        - benchmark_result is defined
+  when: milvus_is_available
diff --git a/playbooks/roles/milvus/tasks/benchmark_setup.yml b/playbooks/roles/milvus/tasks/benchmark_setup.yml
new file mode 100644
index 00000000..68ce18e4
--- /dev/null
+++ b/playbooks/roles/milvus/tasks/benchmark_setup.yml
@@ -0,0 +1,58 @@
+---
+# Setup benchmark scripts and directories only
+# This is used when running benchmarks on already-setup infrastructure
+
+- name: Ensure Python dependencies are installed
+  ansible.builtin.package:
+    name:
+      - python3-numpy
+      - python3-pandas
+      - python3-tqdm
+      - python3-pip
+    state: present
+  become: true
+
+- name: Check if pymilvus is installed
+  ansible.builtin.command: python3 -c "import pymilvus; print(pymilvus.__version__)"
+  register: pymilvus_check
+  changed_when: false
+  failed_when: false
+
+- name: Install Python Milvus client with pip
+  ansible.builtin.pip:
+    name:
+      - pymilvus>={{ ai_vector_db_milvus_version }}
+    state: present
+    extra_args: --break-system-packages
+  become: true
+  when: pymilvus_check.rc != 0 or pymilvus_check.stdout is version(ai_vector_db_milvus_version, '<')
+
+- name: Create benchmark scripts directory
+  ansible.builtin.file:
+    path: "{{ ai_vector_db_milvus_data_dir }}/scripts"
+    state: directory
+    mode: '0755'
+  register: scripts_dir_result
+
+- name: Check if benchmark scripts exist
+  ansible.builtin.stat:
+    path: "{{ ai_vector_db_milvus_data_dir }}/scripts/{{ item }}"
+  loop:
+    - milvus_benchmark.py
+    - milvus_utils.py
+  register: benchmark_scripts_check
+
+- name: Copy benchmark scripts
+  ansible.builtin.copy:
+    src: "{{ item.item }}"
+    dest: "{{ ai_vector_db_milvus_data_dir }}/scripts/"
+    mode: '0755'
+  loop: "{{ benchmark_scripts_check.results }}"
+  when: not item.stat.exists or scripts_dir_result is changed
+
+- name: Create initial connection test script
+  ansible.builtin.template:
+    src: test_connection.py.j2
+    dest: "{{ ai_vector_db_milvus_data_dir }}/scripts/test_connection.py"
+    mode: '0755'
diff --git a/playbooks/roles/milvus/tasks/install_docker.yml b/playbooks/roles/milvus/tasks/install_docker.yml
new file mode 100644
index 00000000..e1e1d911
--- /dev/null
+++ b/playbooks/roles/milvus/tasks/install_docker.yml
@@ -0,0 +1,97 @@
+---
+- name: Check if Docker packages are installed (Debian)
+  ansible.builtin.command: dpkg -l docker.io docker-compose
+  register: docker_packages_check
+  changed_when: false
+  failed_when: false
+  when: ansible_os_family == "Debian"
+
+- name: Install Docker and Python dependencies
+  ansible.builtin.package:
+    name:
+      - docker.io
+      - docker-compose
+      - python3-pip
+      - python3-setuptools
+      - python3-packaging
+    state: present
+  become: true
+  when:
+    - ansible_os_family == "Debian"
+    - docker_packages_check.rc != 0
+
+- name: Check if Docker packages are installed (RedHat)
+  ansible.builtin.command: rpm -q docker docker-compose
+  register: docker_packages_check_rh
+  changed_when: false
+  failed_when: false
+  when: ansible_os_family == "RedHat"
+
+- name: Install Docker and Python dependencies (RedHat)
+  ansible.builtin.package:
+    name:
+      - docker
+      - docker-compose
+      - python3-pip
+      - python3-setuptools
+    state: present
+  become: true
+  when:
+    - ansible_os_family == "RedHat"
+    - docker_packages_check_rh.rc != 0
+
+- name: Check if user is in docker group
+  ansible.builtin.shell: groups {{ data_user | default(ansible_user_id) }} | grep -q docker
+  register: user_docker_group_check
+  changed_when: false
+  failed_when: false
+
+- name: Add user to docker group
+  ansible.builtin.user:
+    name: "{{ data_user | default(ansible_user_id) }}"
+    groups: docker
+    append: true
+  become: true
+  when: user_docker_group_check.rc != 0
+
+- name: Ensure Docker service is started
+  ansible.builtin.systemd:
+    name: docker
+    state: started
+    enabled: true
+  become: true
+
+- name: Create Milvus directories
+  ansible.builtin.file:
+    path: "{{ item }}"
+    state: directory
+    mode: '0755'
+    owner: "{{ data_user | default(ansible_user_id) }}"
+  become: true
+  loop:
+    - "{{ ai_vector_db_milvus_data_dir }}"
+    - "{{ ai_vector_db_milvus_config_dir }}"
+    - "{{ ai_vector_db_milvus_log_dir }}"
+    - "{{ ai_vector_db_milvus_docker_data_path }}"
+    - "{{ ai_vector_db_milvus_docker_etcd_data_path }}"
+    - "{{ ai_vector_db_milvus_docker_minio_data_path }}"
+
+- name: Check if docker-compose.yml exists
+  ansible.builtin.stat:
+    path: "{{ ai_vector_db_milvus_config_dir }}/docker-compose.yml"
+  register: docker_compose_exists
+
+- name: Remove old docker-compose override file if exists
+  ansible.builtin.file:
+    path: "{{ ai_vector_db_milvus_config_dir }}/docker-compose.override.yml"
+    state: absent
+  become: true
+  when: not docker_compose_exists.stat.exists
+
+- name: Create Milvus docker-compose file
+  ansible.builtin.template:
+    src: docker-compose.yml.j2
+    dest: "{{ ai_vector_db_milvus_config_dir }}/docker-compose.yml"
+    mode: '0644'
+  become: true
diff --git a/playbooks/roles/milvus/tasks/main.yml b/playbooks/roles/milvus/tasks/main.yml
new file mode 100644
index 00000000..4088cb47
--- /dev/null
+++ b/playbooks/roles/milvus/tasks/main.yml
@@ -0,0 +1,52 @@
+---
+- name: Include role create_data_partition
+  ansible.builtin.include_role:
+    name: create_data_partition
+  tags: ['setup', 'data_partition']
+
+- name: Include role common
+  ansible.builtin.include_role:
+    name: common
+  when:
+    - infer_uid_and_group|bool
+
+- name: Ensure data_dir has correct ownership
+  tags: ['setup']
+  become: true
+  ansible.builtin.file:
+    path: "{{ data_path }}"
+    owner: "{{ data_user }}"
+    group: "{{ data_group }}"
+    recurse: false
+    state: directory
+    mode: '0755'
+
+- name: Ensure Milvus-specific subdirectories have correct ownership
+  tags: ['setup']
+  become: true
+  ansible.builtin.file:
+    path: "{{ item }}"
+    owner: "{{ data_user }}"
+    group: "{{ data_group }}"
+    recurse: true
+    state: directory
+    mode: '0755'
+  loop:
+    - "{{ data_path }}/milvus"
+    - "{{ ai_vector_db_milvus_docker_data_path | default(data_path + '/milvus/data') }}"
+    - "{{ ai_vector_db_milvus_docker_etcd_data_path | default(data_path + '/milvus/etcd') }}"
+    - "{{ ai_vector_db_milvus_docker_minio_data_path | default(data_path + '/milvus/minio') }}"
+    - "{{ data_path }}/ai-benchmark"
+  ignore_errors: true
+
+- name: Include Docker installation tasks
+  ansible.builtin.include_tasks: install_docker.yml
+
+- name: Include setup tasks
+  ansible.builtin.include_tasks: setup.yml
+
+# Benchmarks are included via separate playbook call with proper tags
+# They are not run during the initial setup phase
diff --git a/playbooks/roles/milvus/tasks/setup.yml b/playbooks/roles/milvus/tasks/setup.yml
new file mode 100644
index 00000000..e9b8b6d5
--- /dev/null
+++ b/playbooks/roles/milvus/tasks/setup.yml
@@ -0,0 +1,107 @@
+---
+- name: Install Python virtual environment support
+  ansible.builtin.package:
+    name:
+      - python3-venv
+      - python3-pip
+    state: present
+  become: true
+
+- name: Check if virtual environment exists
+  ansible.builtin.stat:
+    path: "{{ data_path }}/ai-benchmark/venv"
+  register: venv_stat
+
+- name: Create Python virtual environment for AI benchmarks
+  ansible.builtin.command: python3 -m venv {{ data_path }}/ai-benchmark/venv
+  when: not venv_stat.stat.exists
+
+- name: Upgrade pip in virtual environment
+  ansible.builtin.command: "{{ data_path }}/ai-benchmark/venv/bin/python -m pip install --upgrade pip"
+  register: pip_upgrade
+  changed_when: "'Successfully installed' in pip_upgrade.stdout"
+
+- name: Install required Python packages in virtual environment
+  ansible.builtin.pip:
+    name:
+      - "pymilvus>={{ ai_vector_db_milvus_version }}"
+      - numpy
+      - pandas
+      - tqdm
+    virtualenv: "{{ data_path }}/ai-benchmark/venv"
+    state: present
+
+- name: Verify pymilvus is installed in virtual environment
+  ansible.builtin.command: "{{ data_path }}/ai-benchmark/venv/bin/python -c 'import pymilvus; print(pymilvus.__version__)'"
+  register: pymilvus_version
+  changed_when: false
+  failed_when: false
+
+- name: Display pymilvus version
+  ansible.builtin.debug:
+    msg: "pymilvus version: {{ pymilvus_version.stdout }}"
+  when: pymilvus_version.rc == 0
+
+- name: Check Docker Compose services status
+  ansible.builtin.shell: |
+    cd {{ ai_vector_db_milvus_config_dir }}
+    docker-compose ps --format json
+  when: ai_vector_db_milvus_docker | bool
+  become: true
+  register: docker_status_check
+  changed_when: false
+  failed_when: false
+
+- name: Start Milvus with Docker Compose
+  ansible.builtin.shell: |
+    cd {{ ai_vector_db_milvus_config_dir }}
+    docker-compose up -d
+  when:
+    - ai_vector_db_milvus_docker | bool
+    - docker_status_check.rc | default(1) != 0 or "running" not in docker_status_check.stdout | default("")
+  become: true
+  register: docker_compose_result
+  changed_when: "'Started' in docker_compose_result.stderr or 'Created' in docker_compose_result.stderr"
+
+- name: Wait for Milvus to be ready
+  ansible.builtin.wait_for:
+    port: "{{ ai_vector_db_milvus_port }}"
+    host: localhost
+    delay: 60
+    timeout: 300
+
+- name: Create benchmark scripts directory
+  ansible.builtin.file:
+    path: "{{ ai_vector_db_milvus_data_dir }}/scripts"
+    state: directory
+    mode: '0755'
+  register: scripts_dir_result
+
+- name: Check if benchmark scripts exist
+  ansible.builtin.stat:
+    path: "{{ ai_vector_db_milvus_data_dir }}/scripts/{{ item }}"
+  loop:
+    - milvus_benchmark.py
+    - milvus_utils.py
+  register: benchmark_scripts_check
+
+- name: Copy benchmark scripts
+  ansible.builtin.copy:
+    src: "{{ item.item }}"
+    dest: "{{ ai_vector_db_milvus_data_dir }}/scripts/"
+    mode: '0755'
+  loop: "{{ benchmark_scripts_check.results }}"
+  when: not item.stat.exists or scripts_dir_result is changed
+
+- name: Create initial connection test script
+  ansible.builtin.template:
+    src: test_connection.py.j2
+    dest: "{{ ai_vector_db_milvus_data_dir }}/scripts/test_connection.py"
+    mode: '0755'
+
+- name: Test Milvus connection
+  ansible.builtin.command: "{{ data_path }}/ai-benchmark/venv/bin/python {{ ai_vector_db_milvus_data_dir }}/scripts/test_connection.py"
+  register: connection_test
+  changed_when: false
+
+- name: Display connection test result
+  ansible.builtin.debug:
+    msg: "{{ connection_test.stdout }}"
diff --git a/playbooks/roles/milvus/templates/benchmark_config.json.j2 b/playbooks/roles/milvus/templates/benchmark_config.json.j2
new file mode 100644
index 00000000..f3ed04a0
--- /dev/null
+++ b/playbooks/roles/milvus/templates/benchmark_config.json.j2
@@ -0,0 +1,25 @@
+{
+    "milvus": {
+        "host": "localhost",
+        "port": {{ ai_vector_db_milvus_port }},
+        "collection_name": "{{ ai_vector_db_milvus_default_collection }}",
+        "dimension": {{ ai_vector_db_milvus_default_dim }},
+        "index_type": "{{ ai_vector_db_milvus_index_type }}",
+        "metric_type": "{{ ai_vector_db_milvus_metric_type }}",
+        "nlist": {{ ai_vector_db_milvus_nlist }},
+        "num_shards": {{ ai_vector_db_milvus_default_shards }}
+    },
+    "benchmark": {
+        "datasets": {{ ai_vector_db_milvus_benchmark_datasets | to_json }},
+        "batch_size": {{ ai_vector_db_milvus_benchmark_batch_size }},
+        "num_queries": {{ ai_vector_db_milvus_benchmark_num_queries }},
+        "top_k": [1, 10, 100],
+        "nprobe": [1, 10, 50, 100]
+    },
+    "filesystem": {
+        "type": "{{ ai_filesystem }}",
+        "mount_point": "{{ ai_data_device_path }}",
+        "mkfs_opts": "{{ ai_mkfs_opts | default('') }}",
+        "mount_opts": "{{ ai_mount_opts | default('defaults') }}"
+    }
+}
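The top_k and nprobe arrays above define a benchmark sweep: each combination
is exercised as a separate search benchmark. A hypothetical sketch of how
such a rendered config expands into runs (the expansion helper is assumed,
not taken from milvus_benchmark.py):

```python
import itertools
import json

# A rendered benchmark_config.json fragment (values from the defaults)
config_text = """
{
    "benchmark": {
        "batch_size": 10000,
        "num_queries": 1000,
        "top_k": [1, 10, 100],
        "nprobe": [1, 10, 50, 100]
    }
}
"""

bench = json.loads(config_text)["benchmark"]
# Cartesian product: 3 top_k values x 4 nprobe values = 12 search runs
runs = [
    {"top_k": k, "nprobe": n}
    for k, n in itertools.product(bench["top_k"], bench["nprobe"])
]
```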
diff --git a/playbooks/roles/milvus/templates/docker-compose.override.yml.j2 b/playbooks/roles/milvus/templates/docker-compose.override.yml.j2
new file mode 100644
index 00000000..b4f96a44
--- /dev/null
+++ b/playbooks/roles/milvus/templates/docker-compose.override.yml.j2
@@ -0,0 +1,24 @@
+services:
+  milvus-standalone:
+    environment:
+      MILVUS_DATA_DIR: /var/lib/milvus
+      MILVUS_LOG_DIR: /var/log/milvus
+    volumes:
+      - {{ ai_vector_db_milvus_data_dir }}/volumes/milvus:/var/lib/milvus
+      - {{ ai_vector_db_milvus_log_dir }}:/var/log/milvus
+    ports:
+      - "{{ ai_vector_db_milvus_port }}:19530"
+      - "{{ ai_vector_db_milvus_metrics_port }}:9091"
+    deploy:
+      resources:
+        limits:
+          memory: {{ ai_vector_db_milvus_memory_limit }}
+          cpus: '{{ ai_vector_db_milvus_cpu_limit }}'
+
+  etcd:
+    volumes:
+      - {{ ai_vector_db_milvus_data_dir }}/volumes/etcd:/etcd
+
+  minio:
+    volumes:
+      - {{ ai_vector_db_milvus_data_dir }}/volumes/minio:/minio_data
diff --git a/playbooks/roles/milvus/templates/docker-compose.yml.j2 b/playbooks/roles/milvus/templates/docker-compose.yml.j2
new file mode 100644
index 00000000..6a611c51
--- /dev/null
+++ b/playbooks/roles/milvus/templates/docker-compose.yml.j2
@@ -0,0 +1,64 @@
+services:
+  etcd:
+    container_name: {{ ai_vector_db_milvus_etcd_container_name }}
+    image: {{ ai_vector_db_milvus_etcd_container_image_string }}
+    environment:
+      - ETCD_AUTO_COMPACTION_MODE=revision
+      - ETCD_AUTO_COMPACTION_RETENTION=1000
+      - ETCD_QUOTA_BACKEND_BYTES=4294967296
+      - ETCD_SNAPSHOT_COUNT=50000
+    volumes:
+      - {{ ai_vector_db_milvus_docker_etcd_data_path }}:/etcd
+    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
+    # Health check disabled - etcd container doesn't have curl or etcdctl in PATH
+    # healthcheck:
+    #   test: ["CMD", "curl", "-f", "http://localhost:2379/health"]
+    #   interval: 30s
+    #   timeout: 20s
+    #   retries: 3
+    restart: unless-stopped
+
+  minio:
+    container_name: {{ ai_vector_db_milvus_minio_container_name }}
+    image: {{ ai_vector_db_milvus_minio_container_image_string }}
+    environment:
+      MINIO_ACCESS_KEY: {{ ai_vector_db_milvus_minio_access_key }}
+      MINIO_SECRET_KEY: {{ ai_vector_db_milvus_minio_secret_key }}
+    volumes:
+      - {{ ai_vector_db_milvus_docker_minio_data_path }}:/minio_data
+    command: minio server /minio_data --console-address ":{{ ai_vector_db_milvus_minio_console_port }}"
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:{{ ai_vector_db_milvus_minio_api_port }}/minio/health/live"]
+      interval: 30s
+      timeout: 20s
+      retries: 3
+    restart: unless-stopped
+    ports:
+      - "{{ ai_vector_db_milvus_minio_api_port }}:{{ ai_vector_db_milvus_minio_api_port }}"
+      - "{{ ai_vector_db_milvus_minio_console_port }}:{{ ai_vector_db_milvus_minio_console_port }}"
+
+  milvus:
+    container_name: {{ ai_vector_db_milvus_container_name }}
+    image: {{ ai_vector_db_milvus_container_image_string }}
+    command: ["milvus", "run", "standalone"]
+    environment:
+      ETCD_ENDPOINTS: etcd:{{ ai_vector_db_milvus_etcd_client_port }}
+      MINIO_ADDRESS: minio:{{ ai_vector_db_milvus_minio_api_port }}
+    volumes:
+      - {{ ai_vector_db_milvus_docker_data_path }}:/var/lib/milvus
+    depends_on:
+      - etcd
+      - minio
+    ports:
+      - "{{ ai_vector_db_milvus_port }}:19530"
+      - "{{ ai_vector_db_milvus_web_ui_port }}:9091"
+    restart: unless-stopped
+    deploy:
+      resources:
+        limits:
+          memory: {{ ai_vector_db_milvus_memory_limit }}
+          cpus: '{{ ai_vector_db_milvus_cpu_limit }}'
+
+networks:
+  default:
+    name: {{ ai_vector_db_milvus_docker_network_name }}
diff --git a/playbooks/roles/milvus/templates/milvus.yaml.j2 b/playbooks/roles/milvus/templates/milvus.yaml.j2
new file mode 100644
index 00000000..f843ec4b
--- /dev/null
+++ b/playbooks/roles/milvus/templates/milvus.yaml.j2
@@ -0,0 +1,30 @@
+# Milvus configuration file
+etcd:
+  endpoints:
+    - {{ ai_vector_db_milvus_etcd_native_client_url }}
+  rootPath: milvus
+
+minio:
+  address: localhost
+  port: 9000
+  accessKeyID: {{ ai_vector_db_milvus_minio_native_access_key }}
+  secretAccessKey: {{ ai_vector_db_milvus_minio_native_secret_key }}
+  bucketName: milvus-bucket
+  useSSL: false
+
+proxy:
+  port: {{ ai_vector_db_milvus_port }}
+
+log:
+  level: info
+  path: {{ ai_vector_db_milvus_native_log_path }}
+
+dataNode:
+  dataPath: {{ ai_vector_db_milvus_native_data_path }}
+
+indexNode:
+  enableDisk: true
+
+common:
+  security:
+    authorizationEnabled: false
diff --git a/playbooks/roles/milvus/templates/test_connection.py.j2 b/playbooks/roles/milvus/templates/test_connection.py.j2
new file mode 100644
index 00000000..d85423ba
--- /dev/null
+++ b/playbooks/roles/milvus/templates/test_connection.py.j2
@@ -0,0 +1,25 @@
+#!{{ data_path }}/ai-benchmark/venv/bin/python
+"""Test Milvus connection"""
+
+from pymilvus import connections, utility
+
+try:
+    # Connect to Milvus
+    connections.connect(
+        alias="default",
+        host="localhost",
+        port="{{ ai_vector_db_milvus_port }}"
+    )
+
+    # Listing collections forces a round trip, confirming the connection
+    collections = utility.list_collections()
+    print("✓ Successfully connected to Milvus")
+    print(f"  Server version: {utility.get_server_version()}")
+    if collections:
+        print(f"  Collections: {collections}")
+    else:
+        print("  No collections yet")
+
+except Exception as e:
+    print(f"✗ Failed to connect to Milvus: {e}")
+    exit(1)
diff --git a/workflows/Makefile b/workflows/Makefile
index b5f54ff5..fe35707b 100644
--- a/workflows/Makefile
+++ b/workflows/Makefile
@@ -66,6 +66,10 @@ ifeq (y,$(CONFIG_KDEVOPS_WORKFLOW_ENABLE_FIO_TESTS))
 include workflows/fio-tests/Makefile
 endif # CONFIG_KDEVOPS_WORKFLOW_ENABLE_FIO_TESTS == y
 
+ifeq (y,$(CONFIG_KDEVOPS_WORKFLOW_ENABLE_AI))
+include workflows/ai/Makefile
+endif # CONFIG_KDEVOPS_WORKFLOW_ENABLE_AI == y
+
 ANSIBLE_EXTRA_ARGS += $(WORKFLOW_ARGS)
 ANSIBLE_EXTRA_ARGS_SEPARATED += $(WORKFLOW_ARGS_SEPARATED)
 ANSIBLE_EXTRA_ARGS_DIRECT += $(WORKFLOW_ARGS_DIRECT)
diff --git a/workflows/ai/Kconfig b/workflows/ai/Kconfig
new file mode 100644
index 00000000..2ffc6b65
--- /dev/null
+++ b/workflows/ai/Kconfig
@@ -0,0 +1,164 @@
+if KDEVOPS_WORKFLOW_ENABLE_AI
+
+choice
+	prompt "What type of AI testing do you want to run?"
+	default AI_TESTS_VECTOR_DATABASE
+
+config AI_TESTS_VECTOR_DATABASE
+	bool "Vector database performance tests"
+	select KDEVOPS_BASELINE_AND_DEV
+	output yaml
+	help
+	  Run vector database performance analysis tests.
+	  This includes testing various vector dimensions, batch sizes,
+	  and query patterns to generate performance profiles for AI workloads.
+
+	  A/B testing is enabled to compare performance across different
+	  configurations using baseline and development nodes.
+
+endchoice
+
+# Vector Database Configuration
+if AI_TESTS_VECTOR_DATABASE
+
+choice
+	prompt "Select vector database system"
+	default AI_VECTOR_DB_MILVUS
+
+config AI_VECTOR_DB_MILVUS
+	bool "Milvus - Open-source vector database"
+	output yaml
+	help
+	  Milvus is a cloud-native vector database built for scalable
+	  similarity search and AI applications. It provides high
+	  performance vector indexing and querying capabilities.
+
+endchoice
+
+# Milvus-specific configuration
+if AI_VECTOR_DB_MILVUS
+
+# CLI override support for CI testing
+config AI_VECTOR_DB_MILVUS_QUICK_TEST_SET_BY_CLI
+	bool
+	output yaml
+	default $(shell, scripts/check-cli-set-var.sh AI_VECTOR_DB_MILVUS_QUICK_TEST)
+
+config AI_VECTOR_DB_MILVUS_QUICK_TEST
+	bool "Enable quick test mode for CI/demo"
+	default y if AI_VECTOR_DB_MILVUS_QUICK_TEST_SET_BY_CLI
+	output yaml
+	help
+	  Quick test mode reduces dataset sizes and runtime for rapid validation.
+	  This is useful for CI pipelines and demonstrations.
+
+# Milvus runs in Docker containers only
+config AI_VECTOR_DB_MILVUS_DOCKER
+	bool
+	output yaml
+	default y
+	help
+	  Milvus runs inside Docker containers with embedded etcd and MinIO storage.
+	  Native installation is not supported due to complex build requirements.
+
+config AI_VECTOR_DB_MILVUS_VERSION
+	string "Milvus version"
+	output yaml
+	default "2.3.0"
+	help
+	  The version of Milvus to install and use.
+
+config AI_VECTOR_DB_MILVUS_PORT
+	int "Milvus server port"
+	output yaml
+	default 19530
+	help
+	  The port number where Milvus server is listening.
+	  Default is 19530 for standard Milvus deployment.
+
+config AI_VECTOR_DB_MILVUS_COLLECTION_NAME
+	string "Default collection name"
+	output yaml
+	default "benchmark_collection"
+	help
+	  The default collection name to use for benchmarking tests.
+
+config AI_VECTOR_DB_MILVUS_DIMENSION
+	int "Vector dimension"
+	output yaml
+	default 768
+	range 1 4096
+	help
+	  The dimension of vectors to use in benchmarks.
+	  Common dimensions: 128, 384, 768, 1536
+
+config AI_VECTOR_DB_MILVUS_DATASET_SIZE
+	int "Dataset size (number of vectors)"
+	output yaml
+	default 100000 if AI_VECTOR_DB_MILVUS_QUICK_TEST
+	default 1000000 if !AI_VECTOR_DB_MILVUS_QUICK_TEST
+	help
+	  The number of vectors to insert for benchmarking.
+	  Quick test mode uses smaller dataset for faster execution.
+
+config AI_VECTOR_DB_MILVUS_BATCH_SIZE
+	int "Batch size for insertions"
+	output yaml
+	default 10000
+	help
+	  The batch size to use when inserting vectors.
+
+config AI_VECTOR_DB_MILVUS_NUM_QUERIES
+	int "Number of search queries"
+	output yaml
+	default 1000 if AI_VECTOR_DB_MILVUS_QUICK_TEST
+	default 10000
+	help
+	  The number of search queries to execute during benchmarking.
+
+if AI_VECTOR_DB_MILVUS_DOCKER
+source "workflows/ai/Kconfig.docker"
+endif # AI_VECTOR_DB_MILVUS_DOCKER
+
+if AI_VECTOR_DB_MILVUS_NATIVE
+source "workflows/ai/Kconfig.native"
+endif # AI_VECTOR_DB_MILVUS_NATIVE
+
+endif # AI_VECTOR_DB_MILVUS
+
+endif # AI_TESTS_VECTOR_DATABASE
+
+# Common AI Benchmark Configuration
+config AI_BENCHMARK_RESULTS_DIR
+	string "Benchmark results directory"
+	output yaml
+	default "/data/ai-benchmark"
+	help
+	  Directory where benchmark results will be stored.
+
+config AI_BENCHMARK_ENABLE_GRAPHING
+	bool "Enable performance graphing"
+	output yaml
+	default y
+	help
+	  Generate performance graphs and visualizations from benchmark results.
+
+config AI_BENCHMARK_ITERATIONS
+	int "Number of benchmark iterations"
+	output yaml
+	default 3 if AI_VECTOR_DB_MILVUS_QUICK_TEST
+	default 40 if !AI_VECTOR_DB_MILVUS_QUICK_TEST
+	range 1 100
+	help
+	  The number of iterations to run for each benchmark configuration.
+	  Multiple iterations help ensure consistent results. The default
+	  of 40 is used, that will use up about 100 GiB of storage space
+	  if you use 1,000,000 vectors. This will work for existing defaults
+	  on kdevops taret nodes, as our min drive use is 100 GiB per extra
+	  drive. This should take about 1 full day of testing. If you want
+	  more than 40, be sure to account for increasing your storage drive.
+
+# Docker storage configuration
+source "workflows/ai/Kconfig.docker-storage"
+
+endif # KDEVOPS_WORKFLOW_ENABLE_AI
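The ~100 GiB figure in the AI_BENCHMARK_ITERATIONS help text can be
sanity-checked from the other defaults. A rough back-of-envelope (float32
vector data only, ignoring index and storage-layer overhead):

```python
vectors = 1_000_000      # AI_VECTOR_DB_MILVUS_DATASET_SIZE default
dim = 768                # AI_VECTOR_DB_MILVUS_DIMENSION default
bytes_per_element = 4    # FLOAT_VECTOR fields store float32

raw_gib = vectors * dim * bytes_per_element / 2**30
total_raw_gib = raw_gib * 40   # AI_BENCHMARK_ITERATIONS default
# ~2.86 GiB of raw vector data per iteration, ~114 GiB across 40
# iterations -- the same ballpark as the ~100 GiB estimate.
```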
diff --git a/workflows/ai/Kconfig.docker b/workflows/ai/Kconfig.docker
new file mode 100644
index 00000000..012fc0b9
--- /dev/null
+++ b/workflows/ai/Kconfig.docker
@@ -0,0 +1,172 @@
+choice
+	prompt "Which Milvus container image to use?"
+	default AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_2_5
+
+config AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_2_5
+	bool "milvusdb/milvus:v2.5.10"
+	output yaml
+	help
+	  Use the latest stable Milvus 2.5.x release with enhanced
+	  performance and stability features.
+
+config AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_2_4
+	bool "milvusdb/milvus:v2.4.17"
+	output yaml
+	help
+	  Use Milvus 2.4.x for compatibility with existing workloads
+	  or when specific 2.4 features are required.
+
+endchoice
+
+config AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_STRING
+	string
+	output yaml
+	default "milvusdb/milvus:v2.5.10" if AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_2_5
+	default "milvusdb/milvus:v2.4.17" if AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_2_4
+
+config AI_VECTOR_DB_MILVUS_CONTAINER_NAME
+	string "The local Milvus container name"
+	default "milvus-ai-benchmark"
+	output yaml
+	help
+	  Set the name for the Milvus Docker container.
+
+config AI_VECTOR_DB_MILVUS_ETCD_CONTAINER_IMAGE_STRING
+	string "etcd container image"
+	output yaml
+	default "quay.io/coreos/etcd:v3.5.18" if AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_2_5
+	default "quay.io/coreos/etcd:v3.5.5" if AI_VECTOR_DB_MILVUS_CONTAINER_IMAGE_2_4
+	help
+	  The etcd container image to use for Milvus metadata storage.
+
+config AI_VECTOR_DB_MILVUS_ETCD_CONTAINER_NAME
+	string "The local etcd container name"
+	default "milvus-etcd"
+	output yaml
+	help
+	  Set the name for the etcd Docker container.
+
+config AI_VECTOR_DB_MILVUS_MINIO_CONTAINER_IMAGE_STRING
+	string "MinIO container image"
+	output yaml
+	default "minio/minio:RELEASE.2023-03-20T20-16-18Z"
+	help
+	  The MinIO container image to use for Milvus object storage.
+
+config AI_VECTOR_DB_MILVUS_MINIO_CONTAINER_NAME
+	string "The local MinIO container name"
+	default "milvus-minio"
+	output yaml
+	help
+	  Set the name for the MinIO Docker container.
+
+config AI_VECTOR_DB_MILVUS_MINIO_ACCESS_KEY
+	string "MinIO access key"
+	output yaml
+	default "minioadmin"
+	help
+	  Access key for MinIO object storage.
+
+config AI_VECTOR_DB_MILVUS_MINIO_SECRET_KEY
+	string "MinIO secret key"
+	output yaml
+	default "minioadmin"
+	help
+	  Secret key for MinIO object storage.
+
+config AI_VECTOR_DB_MILVUS_DOCKER_DATA_PATH
+	string "Host path for persistent data storage"
+	output yaml
+	default "/data/milvus/data"
+	help
+	  Directory on the host where Milvus data will be persisted.
+	  This includes vector data, metadata, and logs.
+
+config AI_VECTOR_DB_MILVUS_DOCKER_ETCD_DATA_PATH
+	string "Host path for etcd data storage"
+	output yaml
+	default "/data/milvus/etcd"
+	help
+	  Directory on the host where etcd data will be persisted.
+
+config AI_VECTOR_DB_MILVUS_DOCKER_MINIO_DATA_PATH
+	string "Host path for MinIO data storage"
+	output yaml
+	default "/data/milvus/minio"
+	help
+	  Directory on the host where MinIO data will be persisted.
+
+config AI_VECTOR_DB_MILVUS_DOCKER_NETWORK_NAME
+	string "Docker network name"
+	output yaml
+	default "milvus-network"
+	help
+	  Name of the Docker network to create for Milvus containers.
+
+config AI_VECTOR_DB_MILVUS_WEB_UI_PORT
+	int "Milvus web UI port"
+	output yaml
+	default 9091
+	help
+	  Port for accessing the Milvus web UI interface.
+
+config AI_VECTOR_DB_MILVUS_MINIO_API_PORT
+	int "MinIO API port"
+	output yaml
+	default 9000
+	help
+	  Port for MinIO API access.
+
+config AI_VECTOR_DB_MILVUS_MINIO_CONSOLE_PORT
+	int "MinIO console port"
+	output yaml
+	default 9001
+	help
+	  Port for MinIO web console access.
+
+config AI_VECTOR_DB_MILVUS_ETCD_CLIENT_PORT
+	int "etcd client port"
+	output yaml
+	default 2379
+	help
+	  Port for etcd client connections.
+
+config AI_VECTOR_DB_MILVUS_ETCD_PEER_PORT
+	int "etcd peer port"
+	output yaml
+	default 2380
+	help
+	  Port for etcd peer connections.
+
+menu "Docker resource limits"
+
+config AI_VECTOR_DB_MILVUS_MEMORY_LIMIT
+	string "Milvus container memory limit"
+	output yaml
+	default "8g"
+	help
+	  Memory limit for the Milvus container. Adjust based on
+	  your system resources and dataset size.
+
+config AI_VECTOR_DB_MILVUS_CPU_LIMIT
+	string "Milvus container CPU limit"
+	output yaml
+	default "4.0"
+	help
+	  CPU limit for the Milvus container (number of CPUs).
+
+config AI_VECTOR_DB_MILVUS_ETCD_MEMORY_LIMIT
+	string "etcd container memory limit"
+	output yaml
+	default "1g"
+	help
+	  Memory limit for the etcd container.
+
+config AI_VECTOR_DB_MILVUS_MINIO_MEMORY_LIMIT
+	string "MinIO container memory limit"
+	output yaml
+	default "2g"
+	help
+	  Memory limit for the MinIO container.
+
+endmenu
diff --git a/workflows/ai/Kconfig.docker-storage b/workflows/ai/Kconfig.docker-storage
new file mode 100644
index 00000000..33efce4f
--- /dev/null
+++ b/workflows/ai/Kconfig.docker-storage
@@ -0,0 +1,201 @@
+menu "Docker Storage Configuration for AI Workloads"
+
+config AI_DOCKER_STORAGE_ENABLE
+	bool "Enable dedicated Docker storage for AI workloads"
+	default y
+	output yaml
+	help
+	  Configure a dedicated storage device for Docker containers
+	  and images used in AI workloads. This prevents Docker from
+	  filling up the root filesystem and provides better performance
+	  isolation for container operations.
+
+	  When enabled, Docker data will be stored on a dedicated device
+	  and filesystem optimized for container workloads.
+
+if AI_DOCKER_STORAGE_ENABLE
+
+config AI_DOCKER_DEVICE
+	string "Device to use for Docker storage"
+	output yaml
+	default "/dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_kdevops1" if LIBVIRT && LIBVIRT_EXTRA_STORAGE_DRIVE_NVME
+	default "/dev/disk/by-id/virtio-kdevops1" if LIBVIRT && LIBVIRT_EXTRA_STORAGE_DRIVE_VIRTIO
+	default "/dev/disk/by-id/ata-QEMU_HARDDISK_kdevops1" if LIBVIRT && LIBVIRT_EXTRA_STORAGE_DRIVE_IDE
+	default "/dev/nvme2n1" if TERRAFORM_AWS_INSTANCE_M5AD_2XLARGE
+	default "/dev/nvme2n1" if TERRAFORM_AWS_INSTANCE_M5AD_4XLARGE
+	default "/dev/nvme1n1" if TERRAFORM_GCE
+	default "/dev/sdd" if TERRAFORM_AZURE
+	default TERRAFORM_OCI_SPARSE_VOLUME_DEVICE_FILE_NAME if TERRAFORM_OCI
+	help
+	  The device to use for Docker storage. This device will be
+	  formatted and mounted to store Docker containers, images,
+	  and volumes for AI workloads.
+
+config AI_DOCKER_MOUNT_POINT
+	string "Mount point for Docker storage"
+	output yaml
+	default "/var/lib/docker/"
+	help
+	  The path where the Docker storage filesystem will be mounted.
+	  With the default of /var/lib/docker/ the filesystem is mounted
+	  directly on Docker's data root; for any other path, Docker is
+	  pointed at it via a symlink from /var/lib/docker.
+
+choice
+	prompt "Docker storage filesystem"
+	default AI_DOCKER_FSTYPE_XFS
+
+config AI_DOCKER_FSTYPE_XFS
+	bool "XFS"
+	help
+	  Use XFS filesystem for Docker storage. XFS provides excellent
+	  performance for large files and is recommended for production
+	  Docker deployments. Supports various block sizes for testing
+	  large block size (LBS) configurations.
+
+config AI_DOCKER_FSTYPE_BTRFS
+	bool "Btrfs"
+	help
+	  Use Btrfs filesystem for Docker storage. Btrfs provides
+	  advanced features like snapshots and compression, which can
+	  be beneficial for Docker layer management.
+
+config AI_DOCKER_FSTYPE_EXT4
+	bool "ext4"
+	help
+	  Use ext4 filesystem for Docker storage. Ext4 is a mature
+	  and reliable filesystem with good all-around performance.
+
+endchoice
+
+config AI_DOCKER_FSTYPE
+	string
+	output yaml
+	default "xfs" if AI_DOCKER_FSTYPE_XFS
+	default "btrfs" if AI_DOCKER_FSTYPE_BTRFS
+	default "ext4" if AI_DOCKER_FSTYPE_EXT4
+
+if AI_DOCKER_FSTYPE_XFS
+
+choice
+	prompt "XFS block size configuration"
+	default AI_DOCKER_XFS_BLOCKSIZE_4K
+
+config AI_DOCKER_XFS_BLOCKSIZE_4K
+	bool "4K block size (default)"
+	help
+	  Use 4K (4096 bytes) block size. This is the default and most
+	  compatible configuration.
+
+config AI_DOCKER_XFS_BLOCKSIZE_8K
+	bool "8K block size"
+	help
+	  Use 8K (8192 bytes) block size for improved performance with
+	  larger I/O operations.
+
+config AI_DOCKER_XFS_BLOCKSIZE_16K
+	bool "16K block size (LBS)"
+	help
+	  Use 16K (16384 bytes) block size. This is a large block size
+	  configuration that may require kernel LBS support.
+
+config AI_DOCKER_XFS_BLOCKSIZE_32K
+	bool "32K block size (LBS)"
+	help
+	  Use 32K (32768 bytes) block size. This is a large block size
+	  configuration that requires kernel LBS support.
+
+config AI_DOCKER_XFS_BLOCKSIZE_64K
+	bool "64K block size (LBS)"
+	help
+	  Use 64K (65536 bytes) block size. This is the maximum XFS block
+	  size and requires kernel LBS support.
+
+endchoice
+
+config AI_DOCKER_XFS_BLOCKSIZE
+	int
+	output yaml
+	default 4096 if AI_DOCKER_XFS_BLOCKSIZE_4K
+	default 8192 if AI_DOCKER_XFS_BLOCKSIZE_8K
+	default 16384 if AI_DOCKER_XFS_BLOCKSIZE_16K
+	default 32768 if AI_DOCKER_XFS_BLOCKSIZE_32K
+	default 65536 if AI_DOCKER_XFS_BLOCKSIZE_64K
+
+choice
+	prompt "XFS sector size"
+	default AI_DOCKER_XFS_SECTORSIZE_4K
+
+config AI_DOCKER_XFS_SECTORSIZE_4K
+	bool "4K sector size (default)"
+	help
+	  Use 4K (4096 bytes) sector size. This is the standard
+	  configuration for most modern drives.
+
+config AI_DOCKER_XFS_SECTORSIZE_512
+	bool "512 byte sector size"
+	depends on AI_DOCKER_XFS_BLOCKSIZE_4K
+	help
+	  Use legacy 512 byte sector size. Only available with 4K block size.
+
+config AI_DOCKER_XFS_SECTORSIZE_8K
+	bool "8K sector size"
+	depends on AI_DOCKER_XFS_BLOCKSIZE_8K || AI_DOCKER_XFS_BLOCKSIZE_16K || AI_DOCKER_XFS_BLOCKSIZE_32K || AI_DOCKER_XFS_BLOCKSIZE_64K
+	help
+	  Use 8K (8192 bytes) sector size. Requires block size >= 8K.
+
+config AI_DOCKER_XFS_SECTORSIZE_16K
+	bool "16K sector size (LBS)"
+	depends on AI_DOCKER_XFS_BLOCKSIZE_16K || AI_DOCKER_XFS_BLOCKSIZE_32K || AI_DOCKER_XFS_BLOCKSIZE_64K
+	help
+	  Use 16K (16384 bytes) sector size. Requires block size >= 16K
+	  and kernel LBS support.
+
+config AI_DOCKER_XFS_SECTORSIZE_32K
+	bool "32K sector size (LBS)"
+	depends on AI_DOCKER_XFS_BLOCKSIZE_32K || AI_DOCKER_XFS_BLOCKSIZE_64K
+	help
+	  Use 32K (32768 bytes) sector size. Requires block size >= 32K
+	  and kernel LBS support.
+
+endchoice
+
+config AI_DOCKER_XFS_SECTORSIZE
+	int
+	output yaml
+	default 512 if AI_DOCKER_XFS_SECTORSIZE_512
+	default 4096 if AI_DOCKER_XFS_SECTORSIZE_4K
+	default 8192 if AI_DOCKER_XFS_SECTORSIZE_8K
+	default 16384 if AI_DOCKER_XFS_SECTORSIZE_16K
+	default 32768 if AI_DOCKER_XFS_SECTORSIZE_32K
+
+config AI_DOCKER_XFS_MKFS_OPTS
+	string "Additional XFS mkfs options for Docker storage"
+	output yaml
+	default ""
+	help
+	  Additional options to pass to mkfs.xfs when creating the Docker
+	  storage filesystem. Block and sector sizes are configured above.
+
+endif # AI_DOCKER_FSTYPE_XFS
+
+config AI_DOCKER_BTRFS_MKFS_OPTS
+	string "Btrfs mkfs options for Docker storage"
+	output yaml
+	default "-f"
+	depends on AI_DOCKER_FSTYPE_BTRFS
+	help
+	  Options to pass to mkfs.btrfs when creating the Docker storage
+	  filesystem.
+
+config AI_DOCKER_EXT4_MKFS_OPTS
+	string "ext4 mkfs options for Docker storage"
+	output yaml
+	default "-F"
+	depends on AI_DOCKER_FSTYPE_EXT4
+	help
+	  Options to pass to mkfs.ext4 when creating the Docker storage
+	  filesystem.
+
+endif # AI_DOCKER_STORAGE_ENABLE
+
+endmenu
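
To make the mapping above concrete, here is a small sketch (Python, to match the analyzer scripts later in this series) of how the block size, sector size, and device selections would combine into a mkfs.xfs invocation. The real command is composed by the Ansible role, so everything beyond the standard mkfs.xfs `-b size=`/`-s size=` options is an assumption:

```python
# Sketch only: how AI_DOCKER_XFS_BLOCKSIZE, AI_DOCKER_XFS_SECTORSIZE,
# AI_DOCKER_DEVICE and AI_DOCKER_XFS_MKFS_OPTS would plausibly combine.
# The actual invocation is built by the Ansible role, not this snippet.
blocksize = 16384   # AI_DOCKER_XFS_BLOCKSIZE (16K LBS example)
sectorsize = 4096   # AI_DOCKER_XFS_SECTORSIZE (default)
device = "/dev/disk/by-id/virtio-kdevops1"  # AI_DOCKER_DEVICE (virtio default)
extra_opts = ""     # AI_DOCKER_XFS_MKFS_OPTS (empty by default)

cmd = f"mkfs.xfs -f -b size={blocksize} -s size={sectorsize} {extra_opts} {device}"
cmd = " ".join(cmd.split())  # collapse the empty extra_opts slot
print(cmd)
```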
diff --git a/workflows/ai/Kconfig.native b/workflows/ai/Kconfig.native
new file mode 100644
index 00000000..ef9768c3
--- /dev/null
+++ b/workflows/ai/Kconfig.native
@@ -0,0 +1,184 @@
+choice
+	prompt "Native Milvus installation method"
+	default AI_VECTOR_DB_MILVUS_NATIVE_BINARY
+
+config AI_VECTOR_DB_MILVUS_NATIVE_BINARY
+	bool "Install from pre-built binaries"
+	output yaml
+	help
+	  Install Milvus from official pre-built binaries. This is
+	  the recommended approach for production deployments and
+	  provides optimal performance.
+
+config AI_VECTOR_DB_MILVUS_NATIVE_SOURCE
+	bool "Build from source"
+	output yaml
+	help
+	  Build Milvus from source code. This allows for custom
+	  optimizations but requires longer build times and more
+	  dependencies.
+
+endchoice
+
+config AI_VECTOR_DB_MILVUS_NATIVE_VERSION
+	string "Milvus version to install"
+	output yaml
+	default "v2.5.10" if AI_VECTOR_DB_MILVUS_NATIVE_BINARY
+	default "master" if AI_VECTOR_DB_MILVUS_NATIVE_SOURCE
+	help
+	  The Milvus version to install. For binary installation,
+	  use release tags like v2.5.10. For source builds, you
+	  can use branch names or commit hashes.
+
+config AI_VECTOR_DB_MILVUS_NATIVE_INSTALL_PATH
+	string "Installation directory"
+	output yaml
+	default "/opt/milvus"
+	help
+	  Directory where Milvus will be installed.
+
+config AI_VECTOR_DB_MILVUS_NATIVE_DATA_PATH
+	string "Data storage directory"
+	output yaml
+	default "/data/milvus"
+	help
+	  Directory where Milvus will store vector data and metadata.
+
+config AI_VECTOR_DB_MILVUS_NATIVE_LOG_PATH
+	string "Log directory"
+	output yaml
+	default "/var/log/milvus"
+	help
+	  Directory where Milvus will write log files.
+
+menu "Native dependencies configuration"
+
+config AI_VECTOR_DB_MILVUS_ETCD_NATIVE_INSTALL
+	bool "Install etcd natively"
+	output yaml
+	default y
+	help
+	  Install etcd as a native service for Milvus metadata storage.
+
+if AI_VECTOR_DB_MILVUS_ETCD_NATIVE_INSTALL
+
+config AI_VECTOR_DB_MILVUS_ETCD_NATIVE_VERSION
+	string "etcd version"
+	output yaml
+	default "v3.5.18"
+	help
+	  Version of etcd to install.
+
+config AI_VECTOR_DB_MILVUS_ETCD_NATIVE_DATA_DIR
+	string "etcd data directory"
+	output yaml
+	default "/data/etcd"
+	help
+	  Directory where etcd will store its data.
+
+config AI_VECTOR_DB_MILVUS_ETCD_NATIVE_CLIENT_URL
+	string "etcd client URL"
+	output yaml
+	default "http://localhost:2379"
+	help
+	  URL for etcd client connections.
+
+endif # AI_VECTOR_DB_MILVUS_ETCD_NATIVE_INSTALL
+
+config AI_VECTOR_DB_MILVUS_MINIO_NATIVE_INSTALL
+	bool "Install MinIO natively"
+	output yaml
+	default y
+	help
+	  Install MinIO as a native service for Milvus object storage.
+
+if AI_VECTOR_DB_MILVUS_MINIO_NATIVE_INSTALL
+
+config AI_VECTOR_DB_MILVUS_MINIO_NATIVE_VERSION
+	string "MinIO version"
+	output yaml
+	default "RELEASE.2023-03-20T20-16-18Z"
+	help
+	  Version of MinIO to install.
+
+config AI_VECTOR_DB_MILVUS_MINIO_NATIVE_DATA_DIR
+	string "MinIO data directory"
+	output yaml
+	default "/data/minio"
+	help
+	  Directory where MinIO will store object data.
+
+config AI_VECTOR_DB_MILVUS_MINIO_NATIVE_ACCESS_KEY
+	string "MinIO access key"
+	output yaml
+	default "minioadmin"
+	help
+	  Access key for MinIO authentication.
+
+config AI_VECTOR_DB_MILVUS_MINIO_NATIVE_SECRET_KEY
+	string "MinIO secret key"
+	output yaml
+	default "minioadmin"
+	help
+	  Secret key for MinIO authentication.
+
+endif # AI_VECTOR_DB_MILVUS_MINIO_NATIVE_INSTALL
+
+endmenu
+
+menu "Native service configuration"
+
+config AI_VECTOR_DB_MILVUS_NATIVE_USER
+	string "Milvus service user"
+	output yaml
+	default "milvus"
+	help
+	  System user to run the Milvus service.
+
+config AI_VECTOR_DB_MILVUS_NATIVE_GROUP
+	string "Milvus service group"
+	output yaml
+	default "milvus"
+	help
+	  System group for the Milvus service.
+
+config AI_VECTOR_DB_MILVUS_NATIVE_ENABLE_SYSTEMD
+	bool "Create systemd service files"
+	output yaml
+	default y
+	help
+	  Create systemd service files for automatic startup and
+	  service management.
+
+endmenu
+
+if AI_VECTOR_DB_MILVUS_NATIVE_SOURCE
+
+menu "Source build configuration"
+
+config AI_VECTOR_DB_MILVUS_BUILD_DEPENDENCIES
+	bool "Install build dependencies"
+	output yaml
+	default y
+	help
+	  Automatically install required build dependencies including
+	  Go compiler, CMake, and other development tools.
+
+config AI_VECTOR_DB_MILVUS_BUILD_JOBS
+	int "Number of parallel build jobs"
+	output yaml
+	default 0
+	help
+	  Number of parallel jobs for building Milvus. Set to 0
+	  to use all available CPU cores.
+
+config AI_VECTOR_DB_MILVUS_BUILD_TYPE
+	string "Build type"
+	output yaml
+	default "Release"
+	help
+	  CMake build type. Options: Release, Debug, RelWithDebInfo.
+
+endmenu
+
+endif # AI_VECTOR_DB_MILVUS_NATIVE_SOURCE
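
Options marked `output yaml` are emitted into kdevops' generated extra vars. Assuming the usual lower-casing of Kconfig symbol names, the native-install defaults above would land roughly as the following fragment (key names inferred, not taken from the generated file):

```yaml
# Hypothetical extra_vars fragment; exact key names depend on kdevops'
# Kconfig-to-YAML generation.
ai_vector_db_milvus_native_version: "v2.5.10"
ai_vector_db_milvus_native_install_path: "/opt/milvus"
ai_vector_db_milvus_native_data_path: "/data/milvus"
ai_vector_db_milvus_etcd_native_client_url: "http://localhost:2379"
```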
diff --git a/workflows/ai/Makefile b/workflows/ai/Makefile
new file mode 100644
index 00000000..1c297edd
--- /dev/null
+++ b/workflows/ai/Makefile
@@ -0,0 +1,160 @@
+PHONY += ai ai-baseline ai-dev ai-results ai-results-baseline ai-results-dev
+PHONY += ai-setup ai-uninstall ai-destroy ai-help-menu
+PHONY += ai-tests ai-tests-baseline ai-tests-dev
+PHONY += ai-tests-results
+
+ifeq (y,$(CONFIG_WORKFLOWS_DEDICATED_WORKFLOW))
+export KDEVOPS_HOSTS_TEMPLATE := hosts.j2
+endif
+
+export AI_DATA_TARGET := $(subst ",,$(CONFIG_AI_BENCHMARK_RESULTS_DIR))
+export AI_ARGS :=
+
+AI_ARGS += ai_benchmark_results_dir='$(AI_DATA_TARGET)'
+
+# Vector Database Configuration
+ifeq (y,$(CONFIG_AI_TESTS_VECTOR_DATABASE))
+AI_ARGS += ai_tests_vector_database=True
+else
+AI_ARGS += ai_tests_vector_database=False
+endif
+
+# Milvus-specific Configuration
+ifeq (y,$(CONFIG_AI_VECTOR_DB_MILVUS))
+AI_ARGS += ai_vector_db_milvus_enable=True
+AI_ARGS += ai_vector_db_milvus_docker=True
+else
+AI_ARGS += ai_vector_db_milvus_enable=False
+endif
+
+AI_MANUAL_ARGS :=
+
+export AI_ARGS_SEPARATED := $(subst $(space),$(comma),$(AI_ARGS))
+
+# Main AI workflow targets
+ai: $(KDEVOPS_NODES) $(ANSIBLE_INVENTORY_FILE)
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		-i hosts \
+		playbooks/ai.yml \
+		-f 10 \
+		--extra-vars=@$(KDEVOPS_EXTRA_VARS) \
+		--extra-vars="$(AI_ARGS) $(AI_MANUAL_ARGS)" \
+		$(LIMIT_HOSTS)
+
+ai-baseline:
+	$(Q)$(MAKE) ai HOSTS="baseline"
+
+ai-dev:
+	$(Q)$(MAKE) ai HOSTS="dev"
+
+# AI Testing/Benchmark targets
+ai-tests: $(KDEVOPS_NODES) $(ANSIBLE_INVENTORY_FILE)
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		-i hosts \
+		playbooks/ai_tests.yml \
+		-f 10 \
+		--extra-vars=@$(KDEVOPS_EXTRA_VARS) \
+		--extra-vars="$(AI_ARGS) $(AI_MANUAL_ARGS)" \
+		$(LIMIT_HOSTS)
+	$(Q)$(MAKE) ai-results
+
+ai-tests-baseline: $(KDEVOPS_NODES) $(ANSIBLE_INVENTORY_FILE)
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		-l baseline \
+		-i hosts \
+		playbooks/ai_tests.yml \
+		-f 10 \
+		--extra-vars=@$(KDEVOPS_EXTRA_VARS) \
+		--extra-vars="$(AI_ARGS) $(AI_MANUAL_ARGS)"
+	$(Q)$(MAKE) ai-results-baseline
+
+ai-tests-dev: $(KDEVOPS_NODES) $(ANSIBLE_INVENTORY_FILE)
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		-l dev \
+		-i hosts \
+		playbooks/ai_tests.yml \
+		-f 10 \
+		--extra-vars=@$(KDEVOPS_EXTRA_VARS) \
+		--extra-vars="$(AI_ARGS) $(AI_MANUAL_ARGS)"
+	$(Q)$(MAKE) ai-results-dev
+
+# Target to only run results analysis and graph generation
+ai-tests-results:
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		-i hosts \
+		playbooks/ai_tests.yml \
+		-f 10 \
+		--extra-vars=@$(KDEVOPS_EXTRA_VARS) \
+		--extra-vars="$(AI_ARGS) $(AI_MANUAL_ARGS)" \
+		--tags="results" \
+		$(LIMIT_HOSTS)
+
+# Results collection targets
+ai-results:
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		-i hosts \
+		playbooks/ai_results.yml \
+		--extra-vars=@$(KDEVOPS_EXTRA_VARS) \
+		--extra-vars="$(AI_ARGS) $(AI_MANUAL_ARGS)" \
+		$(LIMIT_HOSTS)
+
+ai-results-baseline:
+	$(Q)$(MAKE) ai-results HOSTS="baseline"
+
+ai-results-dev:
+	$(Q)$(MAKE) ai-results HOSTS="dev"
+
+ai-setup:
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		-i hosts \
+		playbooks/ai_setup.yml \
+		--extra-vars=@$(KDEVOPS_EXTRA_VARS) \
+		--extra-vars="$(AI_ARGS) $(AI_MANUAL_ARGS)" \
+		$(LIMIT_HOSTS)
+
+ai-uninstall:
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		-i hosts \
+		playbooks/ai_uninstall.yml \
+		--extra-vars=@$(KDEVOPS_EXTRA_VARS) \
+		--extra-vars="$(AI_ARGS) $(AI_MANUAL_ARGS)" \
+		$(LIMIT_HOSTS)
+
+ai-destroy:
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		-i hosts \
+		playbooks/ai_destroy.yml \
+		--extra-vars=@$(KDEVOPS_EXTRA_VARS) \
+		--extra-vars="$(AI_ARGS) $(AI_MANUAL_ARGS)" \
+		$(LIMIT_HOSTS)
+
+ai-help-menu:
+	@echo "AI workflow targets:"
+	@echo ""
+	@echo "Setup targets:"
+	@echo "  ai                     - Setup AI infrastructure (installs and starts services)"
+	@echo "  ai-baseline            - Setup AI infrastructure on baseline nodes only"
+	@echo "  ai-dev                 - Setup AI infrastructure on dev nodes only"
+	@echo ""
+	@echo "Testing/Benchmark targets:"
+	@echo "  ai-tests               - Run AI benchmarks on all nodes"
+	@echo "  ai-tests-baseline      - Run AI benchmarks on baseline nodes only"
+	@echo "  ai-tests-dev           - Run AI benchmarks on dev nodes only"
+	@echo "  ai-tests-results       - Only run results analysis and graph generation"
+	@echo ""
+	@echo "Results collection:"
+	@echo "  ai-results             - Collect and analyze AI benchmark results"
+	@echo "  ai-results-baseline    - Collect results from baseline nodes only"
+	@echo "  ai-results-dev         - Collect results from dev nodes only"
+	@echo ""
+	@echo "Other targets:"
+	@echo "  ai-setup               - Legacy target (use 'make ai' instead)"
+	@echo "  ai-uninstall           - Uninstall AI benchmark components"
+	@echo "  ai-destroy             - Destroy AI benchmark environment"
+	@echo ""
+
+HELP_TARGETS += ai-help-menu
+
+EXTRA_VAR_INPUTS += AI_ARGS_SEPARATED
+
+.PHONY: $(PHONY)
diff --git a/workflows/ai/scripts/analysis_config.json b/workflows/ai/scripts/analysis_config.json
new file mode 100644
index 00000000..2f90f4d5
--- /dev/null
+++ b/workflows/ai/scripts/analysis_config.json
@@ -0,0 +1,6 @@
+{
+  "enable_graphing": true,
+  "graph_format": "png",
+  "graph_dpi": 150,
+  "graph_theme": "seaborn"
+}
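
The analyzer below (`analyze_results.py`) reads `results_*.json` files and aggregates insert, index, and query metrics. A minimal sketch of the expected result shape and the insert-rate aggregation, with key names inferred from the script's `.get()` calls and all values made up:

```python
import json

# Minimal result set shaped like what analyze_results.py expects; the key
# names are inferred from the script, the numbers are illustrative only.
result = {
    "config": {
        "vector_dataset_size": 1000,
        "vector_dimensions": 128,
        "index_type": "IVF_FLAT",
    },
    "insert_performance": {"total_time_seconds": 2.5, "vectors_per_second": 400.0},
    "query_performance": {
        "topk_10": {
            "batch_1": {"queries_per_second": 850.0, "average_time_seconds": 0.00118}
        }
    },
}

# Round-trip through JSON as the loader does, then aggregate insert
# throughput the same way the summary report does.
results_data = [json.loads(json.dumps(result))]
rates = [r["insert_performance"]["vectors_per_second"] for r in results_data]
avg_rate = sum(rates) / len(rates)
print(f"Average insert rate: {avg_rate:.2f} vectors/sec")
```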
diff --git a/workflows/ai/scripts/analyze_results.py b/workflows/ai/scripts/analyze_results.py
new file mode 100755
index 00000000..3d11fb11
--- /dev/null
+++ b/workflows/ai/scripts/analyze_results.py
@@ -0,0 +1,979 @@
+#!/usr/bin/env python3
+"""
+AI Benchmark Results Analysis and Visualization
+
+This script analyzes benchmark results and generates comprehensive graphs
+showing performance characteristics of the AI workload testing.
+"""
+
+import json
+import glob
+import os
+import sys
+import argparse
+import subprocess
+import platform
+from typing import List, Dict, Any
+import logging
+from datetime import datetime
+
+# Optional imports with graceful fallback
+GRAPHING_AVAILABLE = True
+try:
+    import pandas as pd
+    import matplotlib.pyplot as plt
+    import seaborn as sns
+    import numpy as np
+except ImportError as e:
+    GRAPHING_AVAILABLE = False
+    print(f"Warning: Graphing libraries not available: {e}")
+    print("Install with: pip install pandas matplotlib seaborn numpy")
+
+
+class ResultsAnalyzer:
+    def __init__(self, results_dir: str, output_dir: str, config: Dict[str, Any]):
+        self.results_dir = results_dir
+        self.output_dir = output_dir
+        self.config = config
+        self.results_data = []
+
+        # Setup logging
+        logging.basicConfig(
+            level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"
+        )
+        self.logger = logging.getLogger(__name__)
+
+        # Create output directory
+        os.makedirs(output_dir, exist_ok=True)
+
+        # Collect system information for DUT details
+        self.system_info = self._collect_system_info()
+
+    def _collect_system_info(self) -> Dict[str, Any]:
+        """Collect system information for DUT details in HTML report"""
+        info = {}
+
+        try:
+            # Basic system information
+            info["hostname"] = platform.node()
+            info["platform"] = platform.platform()
+            info["architecture"] = platform.architecture()[0]
+            info["processor"] = platform.processor()
+
+            # Memory information
+            try:
+                with open("/proc/meminfo", "r") as f:
+                    meminfo = f.read()
+                    for line in meminfo.split("\n"):
+                        if "MemTotal:" in line:
+                            info["total_memory"] = line.split()[1] + " kB"
+                            break
+            except Exception:
+                info["total_memory"] = "Unknown"
+
+            # CPU information
+            try:
+                with open("/proc/cpuinfo", "r") as f:
+                    cpuinfo = f.read()
+                    cpu_count = cpuinfo.count("processor")
+                    info["cpu_count"] = cpu_count
+
+                    # Extract CPU model
+                    for line in cpuinfo.split("\n"):
+                        if "model name" in line:
+                            info["cpu_model"] = line.split(":", 1)[1].strip()
+                            break
+            except Exception:
+                info["cpu_count"] = "Unknown"
+                info["cpu_model"] = "Unknown"
+
+            # Storage information
+            info["storage_devices"] = self._get_storage_info()
+
+            # Virtualization detection
+            info["is_vm"] = self._detect_virtualization()
+
+            # Filesystem information for AI data directory
+            info["filesystem_info"] = self._get_filesystem_info()
+
+        except Exception as e:
+            self.logger.warning(f"Error collecting system information: {e}")
+
+        return info
+
+    def _get_storage_info(self) -> List[Dict[str, str]]:
+        """Get storage device information including NVMe details"""
+        devices = []
+
+        try:
+            # Get block devices
+            result = subprocess.run(
+                ["lsblk", "-J", "-o", "NAME,SIZE,TYPE,MOUNTPOINT,FSTYPE"],
+                capture_output=True,
+                text=True,
+            )
+            if result.returncode == 0:
+                lsblk_data = json.loads(result.stdout)
+                for device in lsblk_data.get("blockdevices", []):
+                    if device.get("type") == "disk":
+                        dev_info = {
+                            "name": device.get("name", ""),
+                            "size": device.get("size", ""),
+                            "type": "disk",
+                        }
+
+                        # Check if it's NVMe and get additional details
+                        if device.get("name", "").startswith("nvme"):
+                            nvme_info = self._get_nvme_info(device.get("name", ""))
+                            dev_info.update(nvme_info)
+
+                        devices.append(dev_info)
+        except Exception as e:
+            self.logger.warning(f"Error getting storage info: {e}")
+
+        return devices
+
+    def _get_nvme_info(self, device_name: str) -> Dict[str, str]:
+        """Get detailed NVMe device information"""
+        nvme_info = {}
+
+        try:
+            # Get NVMe identify info
+            result = subprocess.run(
+                ["nvme", "id-ctrl", f"/dev/{device_name}"],
+                capture_output=True,
+                text=True,
+            )
+            if result.returncode == 0:
+                output = result.stdout
+                for line in output.split("\n"):
+                    if "mn :" in line:
+                        nvme_info["model"] = line.split(":", 1)[1].strip()
+                    elif "fr :" in line:
+                        nvme_info["firmware"] = line.split(":", 1)[1].strip()
+                    elif "sn :" in line:
+                        nvme_info["serial"] = line.split(":", 1)[1].strip()
+        except Exception as e:
+            self.logger.debug(f"Could not get NVMe info for {device_name}: {e}")
+
+        return nvme_info
+
+    def _detect_virtualization(self) -> str:
+        """Detect if running in a virtual environment"""
+        try:
+            # Check systemd-detect-virt
+            result = subprocess.run(
+                ["systemd-detect-virt"], capture_output=True, text=True
+            )
+            if result.returncode == 0:
+                virt_type = result.stdout.strip()
+                return virt_type if virt_type != "none" else "Physical"
+        except Exception:
+            pass
+
+        try:
+            # Check dmesg for virtualization hints
+            result = subprocess.run(["dmesg"], capture_output=True, text=True)
+            if result.returncode == 0:
+                dmesg_output = result.stdout.lower()
+                if "kvm" in dmesg_output:
+                    return "KVM"
+                elif "vmware" in dmesg_output:
+                    return "VMware"
+                elif "virtualbox" in dmesg_output:
+                    return "VirtualBox"
+                elif "xen" in dmesg_output:
+                    return "Xen"
+        except Exception:
+            pass
+
+        return "Unknown"
+
+    def _get_filesystem_info(self) -> Dict[str, str]:
+        """Get filesystem information for the AI benchmark directory"""
+        fs_info = {}
+
+        try:
+            # Get filesystem info for the results directory
+            result = subprocess.run(
+                ["df", "-T", self.results_dir], capture_output=True, text=True
+            )
+            if result.returncode == 0:
+                lines = result.stdout.strip().split("\n")
+                if len(lines) > 1:
+                    fields = lines[1].split()
+                    if len(fields) >= 2:
+                        fs_info["filesystem_type"] = fields[1]
+                        fs_info["mount_point"] = (
+                            fields[6] if len(fields) > 6 else "Unknown"
+                        )
+
+            # Get mount options
+            try:
+                with open("/proc/mounts", "r") as f:
+                    for line in f:
+                        parts = line.split()
+                        if (
+                            len(parts) >= 4
+                            and parts[1] == fs_info.get("mount_point")
+                        ):
+                            fs_info["mount_options"] = parts[3]
+                            break
+            except Exception:
+                pass
+        except Exception as e:
+            self.logger.warning(f"Error getting filesystem info: {e}")
+
+        return fs_info
+
+    def load_results(self) -> bool:
+        """Load all result files from the results directory"""
+        try:
+            pattern = os.path.join(self.results_dir, "results_*.json")
+            result_files = glob.glob(pattern)
+
+            if not result_files:
+                self.logger.warning(f"No result files found in {self.results_dir}")
+                return False
+
+            self.logger.info(f"Found {len(result_files)} result files")
+
+            for file_path in result_files:
+                try:
+                    with open(file_path, "r") as f:
+                        data = json.load(f)
+                        data["_file"] = os.path.basename(file_path)
+                        self.results_data.append(data)
+                except Exception as e:
+                    self.logger.error(f"Error loading {file_path}: {e}")
+
+            self.logger.info(
+                f"Successfully loaded {len(self.results_data)} result sets"
+            )
+            return len(self.results_data) > 0
+
+        except Exception as e:
+            self.logger.error(f"Error loading results: {e}")
+            return False
+
+    def generate_summary_report(self) -> str:
+        """Generate a text summary report"""
+        try:
+            report = []
+            report.append("=" * 80)
+            report.append("AI BENCHMARK RESULTS SUMMARY")
+            report.append("=" * 80)
+            report.append(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+            report.append(f"Total result sets: {len(self.results_data)}")
+            report.append("")
+
+            if not self.results_data:
+                report.append("No results to analyze.")
+                return "\n".join(report)
+
+            # Configuration summary
+            first_result = self.results_data[0]
+            config = first_result.get("config", {})
+
+            report.append("CONFIGURATION:")
+            dataset_size = config.get("vector_dataset_size")
+            if isinstance(dataset_size, int):
+                report.append(f"  Vector dataset size: {dataset_size:,}")
+            else:
+                report.append("  Vector dataset size: N/A")
+            report.append(
+                f"  Vector dimensions: {config.get('vector_dimensions', 'N/A')}"
+            )
+            report.append(f"  Index type: {config.get('index_type', 'N/A')}")
+            report.append(f"  Benchmark iterations: {len(self.results_data)}")
+            report.append("")
+
+            # Insert performance summary
+            insert_times = []
+            insert_rates = []
+            for result in self.results_data:
+                insert_perf = result.get("insert_performance", {})
+                if insert_perf:
+                    insert_times.append(insert_perf.get("total_time_seconds", 0))
+                    insert_rates.append(insert_perf.get("vectors_per_second", 0))
+
+            if insert_times:
+                report.append("INSERT PERFORMANCE:")
+                # Use builtins: numpy is an optional, graphing-only dependency
+                avg_time = sum(insert_times) / len(insert_times)
+                avg_rate = sum(insert_rates) / len(insert_rates)
+                report.append(f"  Average insert time: {avg_time:.2f} seconds")
+                report.append(f"  Average insert rate: {avg_rate:.2f} vectors/sec")
+                report.append(
+                    f"  Insert rate range: {min(insert_rates):.2f} - {max(insert_rates):.2f} vectors/sec"
+                )
+                report.append("")
+
+            # Index performance summary
+            index_times = []
+            for result in self.results_data:
+                index_perf = result.get("index_performance", {})
+                if index_perf:
+                    index_times.append(index_perf.get("creation_time_seconds", 0))
+
+            if index_times:
+                report.append("INDEX PERFORMANCE:")
+                # Use builtins: numpy is an optional, graphing-only dependency
+                avg_index = sum(index_times) / len(index_times)
+                report.append(
+                    f"  Average index creation time: {avg_index:.2f} seconds"
+                )
+                report.append(
+                    f"  Index time range: {min(index_times):.2f} - {max(index_times):.2f} seconds"
+                )
+                report.append("")
+
+            # Query performance summary
+            report.append("QUERY PERFORMANCE:")
+            for result in self.results_data:
+                query_perf = result.get("query_performance", {})
+                if query_perf:
+                    for topk, topk_data in query_perf.items():
+                        report.append(f"  {topk.upper()}:")
+                        for batch, batch_data in topk_data.items():
+                            qps = batch_data.get("queries_per_second", 0)
+                            avg_time = batch_data.get("average_time_seconds", 0)
+                            report.append(
+                                f"    {batch}: {qps:.2f} QPS, {avg_time*1000:.2f}ms avg"
+                            )
+                    break  # Only show first result for summary
+
+            return "\n".join(report)
+
+        except Exception as e:
+            self.logger.error(f"Error generating summary report: {e}")
+            return f"Error generating summary: {e}"
+
+    def generate_html_report(self) -> str:
+        """Generate comprehensive HTML report with DUT details and test configuration"""
+        try:
+            html = []
+
+            # HTML header
+            html.append("<!DOCTYPE html>")
+            html.append("<html lang='en'>")
+            html.append("<head>")
+            html.append("    <meta charset='UTF-8'>")
+            html.append(
+                "    <meta name='viewport' content='width=device-width, initial-scale=1.0'>"
+            )
+            html.append("    <title>AI Benchmark Results Report</title>")
+            html.append("    <style>")
+            html.append(
+                "        body { font-family: Arial, sans-serif; margin: 20px; line-height: 1.6; }"
+            )
+            html.append(
+                "        .header { background-color: #f4f4f4; padding: 20px; border-radius: 5px; margin-bottom: 20px; }"
+            )
+            html.append("        .section { margin-bottom: 30px; }")
+            html.append(
+                "        .section h2 { color: #333; border-bottom: 2px solid #007acc; padding-bottom: 5px; }"
+            )
+            html.append("        .section h3 { color: #555; }")
+            html.append(
+                "        table { border-collapse: collapse; width: 100%; margin-bottom: 20px; }"
+            )
+            html.append(
+                "        th, td { border: 1px solid #ddd; padding: 8px; text-align: left; }"
+            )
+            html.append("        th { background-color: #f2f2f2; font-weight: bold; }")
+            html.append(
+                "        .metric-table td:first-child { font-weight: bold; width: 30%; }"
+            )
+            html.append(
+                "        .config-table td:first-child { font-weight: bold; width: 40%; }"
+            )
+            html.append("        .performance-good { color: #27ae60; }")
+            html.append("        .performance-warning { color: #f39c12; }")
+            html.append("        .performance-poor { color: #e74c3c; }")
+            html.append(
+                "        .highlight { background-color: #fff3cd; padding: 10px; border-radius: 3px; }"
+            )
+            html.append("    </style>")
+            html.append("</head>")
+            html.append("<body>")
+
+            # Report header
+            html.append("    <div class='header'>")
+            html.append("        <h1>AI Benchmark Results Report</h1>")
+            html.append(
+                f"        <p><strong>Generated:</strong> {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}</p>"
+            )
+            html.append(
+                f"        <p><strong>Test Results:</strong> {len(self.results_data)} benchmark iterations</p>"
+            )
+
+            # Test type identification
+            html.append("        <div class='highlight'>")
+            html.append("            <h3>🤖 AI Workflow Test Type</h3>")
+            html.append(
+                "            <p><strong>Vector Database Performance Testing</strong> using <strong>Milvus Vector Database</strong></p>"
+            )
+            html.append(
+                "            <p>This test evaluates AI workload performance including vector insertion, indexing, and similarity search operations.</p>"
+            )
+            html.append("        </div>")
+            html.append("    </div>")
+
+            # Device Under Test (DUT) Section
+            html.append("    <div class='section'>")
+            html.append("        <h2>📋 Device Under Test (DUT) Details</h2>")
+            html.append("        <table class='config-table'>")
+            html.append(
+                "            <tr><td>Hostname</td><td>"
+                + str(self.system_info.get("hostname", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>System Type</td><td>"
+                + str(self.system_info.get("is_vm", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>Platform</td><td>"
+                + str(self.system_info.get("platform", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>Architecture</td><td>"
+                + str(self.system_info.get("architecture", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>CPU Model</td><td>"
+                + str(self.system_info.get("cpu_model", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>CPU Count</td><td>"
+                + str(self.system_info.get("cpu_count", "Unknown"))
+                + " cores</td></tr>"
+            )
+            html.append(
+                "            <tr><td>Total Memory</td><td>"
+                + str(self.system_info.get("total_memory", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append("        </table>")
+
+            # Storage devices section
+            html.append("        <h3>💾 Storage Configuration</h3>")
+            storage_devices = self.system_info.get("storage_devices", [])
+            if storage_devices:
+                html.append("        <table>")
+                html.append(
+                    "            <tr><th>Device</th><th>Size</th><th>Type</th><th>Model</th><th>Firmware</th></tr>"
+                )
+                for device in storage_devices:
+                    model = device.get("model", "N/A")
+                    firmware = device.get("firmware", "N/A")
+            html.append("            <tr>")
+                    html.append(
+                        f"                <td>{device.get('name', 'Unknown')}</td>"
+                    )
+                    html.append(
+                        f"                <td>{device.get('size', 'Unknown')}</td>"
+                    )
+                    html.append(
+                        f"                <td>{device.get('type', 'Unknown')}</td>"
+                    )
+                    html.append(f"                <td>{model}</td>")
+                    html.append(f"                <td>{firmware}</td>")
+                html.append("            </tr>")
+                html.append("        </table>")
+            else:
+                html.append("        <p>No storage device information available.</p>")
+
+            # Filesystem section
+            html.append("        <h3>🗂️ Filesystem Configuration</h3>")
+            fs_info = self.system_info.get("filesystem_info", {})
+            html.append("        <table class='config-table'>")
+            html.append(
+                "            <tr><td>Filesystem Type</td><td>"
+                + str(fs_info.get("filesystem_type", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>Mount Point</td><td>"
+                + str(fs_info.get("mount_point", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append(
+                "            <tr><td>Mount Options</td><td>"
+                + str(fs_info.get("mount_options", "Unknown"))
+                + "</td></tr>"
+            )
+            html.append("        </table>")
+            html.append("    </div>")
+
+            # Test Configuration Section
+            if self.results_data:
+                first_result = self.results_data[0]
+                config = first_result.get("config", {})
+
+                html.append("    <div class='section'>")
+                html.append("        <h2>⚙️ AI Test Configuration</h2>")
+                html.append("        <table class='config-table'>")
+                dataset_size = config.get("vector_dataset_size")
+                dataset_size_str = (
+                    f"{dataset_size:,}" if isinstance(dataset_size, int) else "N/A"
+                )
+                html.append(
+                    f"            <tr><td>Vector Dataset Size</td><td>{dataset_size_str} vectors</td></tr>"
+                )
+                html.append(
+                    f"            <tr><td>Vector Dimensions</td><td>{config.get('vector_dimensions', 'N/A')}</td></tr>"
+                )
+                html.append(
+                    f"            <tr><td>Index Type</td><td>{config.get('index_type', 'N/A')}</td></tr>"
+                )
+                html.append(
+                    f"            <tr><td>Benchmark Iterations</td><td>{len(self.results_data)}</td></tr>"
+                )
+
+                # Add index-specific parameters
+                if config.get("index_type") == "HNSW":
+                    html.append(
+                        f"            <tr><td>HNSW M Parameter</td><td>{config.get('hnsw_m', 'N/A')}</td></tr>"
+                    )
+                    html.append(
+                        f"            <tr><td>HNSW ef Construction</td><td>{config.get('hnsw_ef_construction', 'N/A')}</td></tr>"
+                    )
+                    html.append(
+                        f"            <tr><td>HNSW ef Search</td><td>{config.get('hnsw_ef', 'N/A')}</td></tr>"
+                    )
+                elif config.get("index_type") == "IVF_FLAT":
+                    html.append(
+                        f"            <tr><td>IVF nlist</td><td>{config.get('ivf_nlist', 'N/A')}</td></tr>"
+                    )
+                    html.append(
+                        f"            <tr><td>IVF nprobe</td><td>{config.get('ivf_nprobe', 'N/A')}</td></tr>"
+                    )
+
+                html.append("        </table>")
+                html.append("    </div>")
+
+            # Performance Results Section
+            html.append("    <div class='section'>")
+            html.append("        <h2>📊 Performance Results Summary</h2>")
+
+            if self.results_data:
+                # Insert performance
+                insert_times = [
+                    r.get("insert_performance", {}).get("total_time_seconds", 0)
+                    for r in self.results_data
+                ]
+                insert_rates = [
+                    r.get("insert_performance", {}).get("vectors_per_second", 0)
+                    for r in self.results_data
+                ]
+
+                if insert_times and any(t > 0 for t in insert_times):
+                    html.append("        <h3>📈 Vector Insert Performance</h3>")
+                    html.append("        <table class='metric-table'>")
+                    html.append(
+                        f"            <tr><td>Average Insert Time</td><td>{np.mean(insert_times):.2f} seconds</td></tr>"
+                    )
+                    html.append(
+                        f"            <tr><td>Average Insert Rate</td><td>{np.mean(insert_rates):.2f} vectors/sec</td></tr>"
+                    )
+                    html.append(
+                        f"            <tr><td>Insert Rate Range</td><td>{np.min(insert_rates):.2f} - {np.max(insert_rates):.2f} vectors/sec</td></tr>"
+                    )
+                    html.append("        </table>")
+
+                # Index performance
+                index_times = [
+                    r.get("index_performance", {}).get("creation_time_seconds", 0)
+                    for r in self.results_data
+                ]
+                if index_times and any(t > 0 for t in index_times):
+                    html.append("        <h3>🔗 Index Creation Performance</h3>")
+                    html.append("        <table class='metric-table'>")
+                    html.append(
+                        f"            <tr><td>Average Index Creation Time</td><td>{np.mean(index_times):.2f} seconds</td></tr>"
+                    )
+                    html.append(
+                        f"            <tr><td>Index Time Range</td><td>{np.min(index_times):.2f} - {np.max(index_times):.2f} seconds</td></tr>"
+                    )
+                    html.append("        </table>")
+
+                # Query performance
+                html.append("        <h3>🔍 Query Performance (first iteration)</h3>")
+                first_query_perf = self.results_data[0].get("query_performance", {})
+                if first_query_perf:
+                    html.append("        <table>")
+                    html.append(
+                        "            <tr><th>Query Type</th><th>Batch Size</th><th>QPS</th><th>Avg Latency (ms)</th></tr>"
+                    )
+
+                    for topk, topk_data in first_query_perf.items():
+                        for batch, batch_data in topk_data.items():
+                            qps = batch_data.get("queries_per_second", 0)
+                            avg_time = batch_data.get("average_time_seconds", 0) * 1000
+
+                            # Color coding for performance
+                            qps_class = ""
+                            if qps > 1000:
+                                qps_class = "performance-good"
+                            elif qps > 100:
+                                qps_class = "performance-warning"
+                            else:
+                                qps_class = "performance-poor"
+
+                            html.append("            <tr>")
+                            html.append(
+                                f"                <td>{topk.replace('topk_', 'Top-')}</td>"
+                            )
+                            html.append(
+                                f"                <td>{batch.replace('batch_', 'Batch ')}</td>"
+                            )
+                            html.append(
+                                f"                <td class='{qps_class}'>{qps:.2f}</td>"
+                            )
+                            html.append(f"                <td>{avg_time:.2f}</td>")
+                            html.append("            </tr>")
+
+                    html.append("        </table>")
+
+                html.append("    </div>")
+
+            # Footer
+            html.append("    <div class='section'>")
+            html.append("        <h2>📝 Notes</h2>")
+            html.append("        <ul>")
+            html.append(
+                "            <li>This report was generated automatically by the AI benchmark analysis tool</li>"
+            )
+            html.append(
+                "            <li>Performance metrics are averaged across all benchmark iterations</li>"
+            )
+            html.append(
+                "            <li>QPS (Queries Per Second) values are color-coded: <span class='performance-good'>Green (&gt;1000)</span>, <span class='performance-warning'>Orange (100-1000)</span>, <span class='performance-poor'>Red (&lt;100)</span></li>"
+            )
+            html.append(
+                "            <li>Storage device information may require root privileges to display NVMe details</li>"
+            )
+            html.append("        </ul>")
+            html.append("    </div>")
+
+            html.append("</body>")
+            html.append("</html>")
+
+            return "\n".join(html)
+
+        except Exception as e:
+            self.logger.error(f"Error generating HTML report: {e}")
+            return (
+                f"<html><body><h1>Error generating HTML report: {e}</h1></body></html>"
+            )
+
+    def generate_graphs(self) -> bool:
+        """Generate performance visualization graphs"""
+        if not GRAPHING_AVAILABLE:
+            self.logger.warning(
+                "Graphing libraries not available, skipping graph generation"
+            )
+            return False
+
+        try:
+            # Set matplotlib style
+            if self.config.get("graph_theme", "default") != "default":
+                plt.style.use(self.config["graph_theme"])
+
+            # Graph 1: Insert Performance
+            self._plot_insert_performance()
+
+            # Graph 2: Query Performance by Top-K
+            self._plot_query_performance()
+
+            # Graph 3: Index Creation Time
+            self._plot_index_performance()
+
+            # Graph 4: Performance Comparison Matrix
+            self._plot_performance_matrix()
+
+            self.logger.info("Graphs generated successfully")
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Error generating graphs: {e}")
+            return False
+
+    def _plot_insert_performance(self):
+        """Plot insert performance metrics"""
+        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+
+        # Extract insert data
+        iterations = []
+        insert_rates = []
+        insert_times = []
+
+        for i, result in enumerate(self.results_data):
+            insert_perf = result.get("insert_performance", {})
+            if insert_perf:
+                iterations.append(i + 1)
+                insert_rates.append(insert_perf.get("vectors_per_second", 0))
+                insert_times.append(insert_perf.get("total_time_seconds", 0))
+
+        # Plot insert rate
+        ax1.plot(iterations, insert_rates, "b-o", linewidth=2, markersize=6)
+        ax1.set_xlabel("Iteration")
+        ax1.set_ylabel("Vectors/Second")
+        ax1.set_title("Vector Insert Rate Performance")
+        ax1.grid(True, alpha=0.3)
+
+        # Plot insert time
+        ax2.plot(iterations, insert_times, "r-o", linewidth=2, markersize=6)
+        ax2.set_xlabel("Iteration")
+        ax2.set_ylabel("Total Time (seconds)")
+        ax2.set_title("Vector Insert Time Performance")
+        ax2.grid(True, alpha=0.3)
+
+        plt.tight_layout()
+        output_file = os.path.join(
+            self.output_dir,
+            f"insert_performance.{self.config.get('graph_format', 'png')}",
+        )
+        plt.savefig(
+            output_file, dpi=self.config.get("graph_dpi", 300), bbox_inches="tight"
+        )
+        plt.close()
+
+    def _plot_query_performance(self):
+        """Plot query performance metrics"""
+        if not self.results_data:
+            return
+
+        # Collect query performance data
+        query_data = []
+        for result in self.results_data:
+            query_perf = result.get("query_performance", {})
+            for topk, topk_data in query_perf.items():
+                for batch, batch_data in topk_data.items():
+                    query_data.append(
+                        {
+                            "topk": topk.replace("topk_", ""),
+                            "batch": batch.replace("batch_", ""),
+                            "qps": batch_data.get("queries_per_second", 0),
+                            "avg_time": batch_data.get("average_time_seconds", 0)
+                            * 1000,  # Convert to ms
+                        }
+                    )
+
+        if not query_data:
+            return
+
+        df = pd.DataFrame(query_data)
+
+        # Create subplots
+        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
+
+        # QPS heatmap
+        qps_pivot = df.pivot_table(
+            values="qps", index="topk", columns="batch", aggfunc="mean"
+        )
+        sns.heatmap(qps_pivot, annot=True, fmt=".1f", ax=ax1, cmap="YlOrRd")
+        ax1.set_title("Queries Per Second (QPS)")
+        ax1.set_xlabel("Batch Size")
+        ax1.set_ylabel("Top-K")
+
+        # Latency heatmap
+        latency_pivot = df.pivot_table(
+            values="avg_time", index="topk", columns="batch", aggfunc="mean"
+        )
+        sns.heatmap(latency_pivot, annot=True, fmt=".1f", ax=ax2, cmap="YlOrRd")
+        ax2.set_title("Average Query Latency (ms)")
+        ax2.set_xlabel("Batch Size")
+        ax2.set_ylabel("Top-K")
+
+        plt.tight_layout()
+        output_file = os.path.join(
+            self.output_dir,
+            f"query_performance.{self.config.get('graph_format', 'png')}",
+        )
+        plt.savefig(
+            output_file, dpi=self.config.get("graph_dpi", 300), bbox_inches="tight"
+        )
+        plt.close()
+
+    def _plot_index_performance(self):
+        """Plot index creation performance"""
+        iterations = []
+        index_times = []
+
+        for i, result in enumerate(self.results_data):
+            index_perf = result.get("index_performance", {})
+            if index_perf:
+                iterations.append(i + 1)
+                index_times.append(index_perf.get("creation_time_seconds", 0))
+
+        if not index_times:
+            return
+
+        plt.figure(figsize=(10, 6))
+        plt.bar(iterations, index_times, alpha=0.7, color="green")
+        plt.xlabel("Iteration")
+        plt.ylabel("Index Creation Time (seconds)")
+        plt.title("Index Creation Performance")
+        plt.grid(True, alpha=0.3)
+
+        # Add average line
+        avg_time = np.mean(index_times)
+        plt.axhline(
+            y=avg_time, color="red", linestyle="--", label=f"Average: {avg_time:.2f}s"
+        )
+        plt.legend()
+
+        output_file = os.path.join(
+            self.output_dir,
+            f"index_performance.{self.config.get('graph_format', 'png')}",
+        )
+        plt.savefig(
+            output_file, dpi=self.config.get("graph_dpi", 300), bbox_inches="tight"
+        )
+        plt.close()
+
+    def _plot_performance_matrix(self):
+        """Plot comprehensive performance comparison matrix"""
+        if len(self.results_data) < 2:
+            return
+
+        # Extract key metrics for comparison
+        metrics = []
+        for i, result in enumerate(self.results_data):
+            insert_perf = result.get("insert_performance", {})
+            index_perf = result.get("index_performance", {})
+
+            metric = {
+                "iteration": i + 1,
+                "insert_rate": insert_perf.get("vectors_per_second", 0),
+                "index_time": index_perf.get("creation_time_seconds", 0),
+            }
+
+            # Add query metrics
+            query_perf = result.get("query_performance", {})
+            if "topk_10" in query_perf and "batch_1" in query_perf["topk_10"]:
+                metric["query_qps"] = query_perf["topk_10"]["batch_1"].get(
+                    "queries_per_second", 0
+                )
+
+            metrics.append(metric)
+
+        df = pd.DataFrame(metrics)
+
+        # Normalize metrics to [0, 1] for comparison, then invert index
+        # creation time so that higher is better on every axis, matching
+        # the "Index Time (inv)" label on the radar chart.
+        numeric_cols = ["insert_rate", "index_time", "query_qps"]
+        for col in numeric_cols:
+            if col in df.columns:
+                df[f"{col}_norm"] = (df[col] - df[col].min()) / (
+                    df[col].max() - df[col].min() + 1e-6
+                )
+        if "index_time_norm" in df.columns:
+            df["index_time_norm"] = 1.0 - df["index_time_norm"]
+
+        # Create radar chart
+        fig, ax = plt.subplots(figsize=(10, 8), subplot_kw=dict(projection="polar"))
+
+        angles = np.linspace(0, 2 * np.pi, len(numeric_cols), endpoint=False).tolist()
+        angles += angles[:1]  # Complete the circle
+
+        for _, row in df.iterrows():
+            values = [row.get(f"{col}_norm", 0) for col in numeric_cols]
+            values += values[:1]  # Complete the circle
+
+            ax.plot(
+                angles, values, "o-", linewidth=2, label=f'Iteration {row["iteration"]}'
+            )
+            ax.fill(angles, values, alpha=0.25)
+
+        ax.set_xticks(angles[:-1])
+        ax.set_xticklabels(["Insert Rate", "Index Time (inv)", "Query QPS"])
+        ax.set_ylim(0, 1)
+        ax.set_title("Performance Comparison Matrix (Normalized)", y=1.08)
+        ax.legend(loc="upper right", bbox_to_anchor=(1.3, 1.0))
+
+        output_file = os.path.join(
+            self.output_dir,
+            f"performance_matrix.{self.config.get('graph_format', 'png')}",
+        )
+        plt.savefig(
+            output_file, dpi=self.config.get("graph_dpi", 300), bbox_inches="tight"
+        )
+        plt.close()
+
+    def analyze(self) -> bool:
+        """Run complete analysis"""
+        self.logger.info("Starting results analysis...")
+
+        if not self.load_results():
+            return False
+
+        # Generate summary report
+        summary = self.generate_summary_report()
+        summary_file = os.path.join(self.output_dir, "benchmark_summary.txt")
+        with open(summary_file, "w") as f:
+            f.write(summary)
+        self.logger.info(f"Summary report saved to {summary_file}")
+
+        # Generate HTML report
+        html_report = self.generate_html_report()
+        html_file = os.path.join(self.output_dir, "benchmark_report.html")
+        with open(html_file, "w") as f:
+            f.write(html_report)
+        self.logger.info(f"HTML report saved to {html_file}")
+
+        # Generate graphs if enabled
+        if self.config.get("enable_graphing", True):
+            self.generate_graphs()
+
+        # Create consolidated JSON report
+        consolidated_file = os.path.join(self.output_dir, "consolidated_results.json")
+        with open(consolidated_file, "w") as f:
+            json.dump(
+                {
+                    "summary": summary.split("\n"),
+                    "raw_results": self.results_data,
+                    "analysis_timestamp": datetime.now().isoformat(),
+                    "system_info": self.system_info,
+                },
+                f,
+                indent=2,
+            )
+
+        self.logger.info("Analysis completed successfully")
+        return True
+
+
+def main():
+    parser = argparse.ArgumentParser(description="Analyze AI benchmark results")
+    parser.add_argument(
+        "--results-dir", required=True, help="Directory containing result files"
+    )
+    parser.add_argument(
+        "--output-dir", required=True, help="Directory for analysis output"
+    )
+    parser.add_argument("--config", help="Analysis configuration file (JSON)")
+
+    args = parser.parse_args()
+
+    # Load configuration
+    config = {
+        "enable_graphing": True,
+        "graph_format": "png",
+        "graph_dpi": 300,
+        "graph_theme": "default",
+    }
+
+    if args.config:
+        try:
+            with open(args.config, "r") as f:
+                config.update(json.load(f))
+        except Exception as e:
+            print(f"Error loading config file: {e}", file=sys.stderr)
+
+    # Run analysis
+    analyzer = ResultsAnalyzer(args.results_dir, args.output_dir, config)
+    success = analyzer.analyze()
+
+    return 0 if success else 1
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/workflows/ai/scripts/generate_graphs.py b/workflows/ai/scripts/generate_graphs.py
new file mode 100755
index 00000000..2e183e86
--- /dev/null
+++ b/workflows/ai/scripts/generate_graphs.py
@@ -0,0 +1,1174 @@
+#!/usr/bin/env python3
+"""
+Generate graphs and analysis for AI benchmark results
+"""
+
+import json
+import os
+import sys
+import glob
+import numpy as np
+import matplotlib
+
+matplotlib.use("Agg")  # Use non-interactive backend
+import matplotlib.pyplot as plt
+from datetime import datetime
+from pathlib import Path
+from collections import defaultdict
+
+
+def load_results(results_dir):
+    """Load all JSON result files from the directory"""
+    results = []
+    # Only load results_*.json files, not consolidated or other JSON files
+    json_files = glob.glob(os.path.join(results_dir, "results_*.json"))
+
+    for json_file in json_files:
+        try:
+            with open(json_file, "r") as f:
+                data = json.load(f)
+                # Extract filesystem info - prefer from JSON data over filename
+                filename = os.path.basename(json_file)
+
+                # First, try to get filesystem from the JSON data itself
+                fs_type = data.get("filesystem", None)
+
+                # If not in JSON, try to parse from filename (backwards compatibility)
+                if not fs_type:
+                    # Parse host info
+                    if "debian13-ai-" in filename:
+                        host_parts = (
+                            filename.replace("results_debian13-ai-", "")
+                            .replace("_1.json", "")
+                            .replace("_2.json", "")
+                            .replace("_3.json", "")
+                            .split("-")
+                        )
+                        if "xfs" in host_parts[0]:
+                            fs_type = "xfs"
+                            # Extract block size (e.g., "4k", "16k", etc.)
+                            block_size = (
+                                host_parts[1] if len(host_parts) > 1 else "unknown"
+                            )
+                        elif "ext4" in host_parts[0]:
+                            fs_type = "ext4"
+                            block_size = host_parts[1] if len(host_parts) > 1 else "4k"
+                        elif "btrfs" in host_parts[0]:
+                            fs_type = "btrfs"
+                            block_size = "default"
+                        else:
+                            fs_type = "unknown"
+                            block_size = "unknown"
+                    else:
+                        fs_type = "unknown"
+                        block_size = "unknown"
+                else:
+                    # If filesystem came from JSON, set appropriate block size
+                    if fs_type == "btrfs":
+                        block_size = "default"
+                    elif fs_type in ["ext4", "xfs"]:
+                        block_size = data.get("block_size", "4k")
+                    else:
+                        block_size = data.get("block_size", "default")
+
+                is_dev = "dev" in filename
+
+                # Use filesystem from JSON if available, otherwise use parsed value
+                if "filesystem" not in data:
+                    data["filesystem"] = fs_type
+                data["block_size"] = block_size
+                data["is_dev"] = is_dev
+                data["filename"] = filename
+
+                results.append(data)
+        except Exception as e:
+            print(f"Error loading {json_file}: {e}", file=sys.stderr)
+
+    return results
+
+
+def create_filesystem_comparison_chart(results, output_dir):
+    """Create a bar chart comparing performance across filesystems"""
+    # Group by filesystem and baseline/dev
+    fs_data = defaultdict(lambda: {"baseline": [], "dev": []})
+
+    for result in results:
+        fs = result.get("filesystem", "unknown")
+        category = "dev" if result.get("is_dev", False) else "baseline"
+
+        # Extract actual performance data from results
+        if "insert_performance" in result:
+            insert_qps = result["insert_performance"].get("vectors_per_second", 0)
+        else:
+            insert_qps = 0
+        fs_data[fs][category].append(insert_qps)
+
+    # Prepare data for plotting
+    filesystems = list(fs_data.keys())
+    baseline_means = [
+        np.mean(fs_data[fs]["baseline"]) if fs_data[fs]["baseline"] else 0
+        for fs in filesystems
+    ]
+    dev_means = [
+        np.mean(fs_data[fs]["dev"]) if fs_data[fs]["dev"] else 0 for fs in filesystems
+    ]
+
+    x = np.arange(len(filesystems))
+    width = 0.35
+
+    fig, ax = plt.subplots(figsize=(10, 6))
+    baseline_bars = ax.bar(
+        x - width / 2, baseline_means, width, label="Baseline", color="#1f77b4"
+    )
+    dev_bars = ax.bar(
+        x + width / 2, dev_means, width, label="Development", color="#ff7f0e"
+    )
+
+    ax.set_xlabel("Filesystem")
+    ax.set_ylabel("Insert QPS")
+    ax.set_title("Vector Database Performance by Filesystem")
+    ax.set_xticks(x)
+    ax.set_xticklabels(filesystems)
+    ax.legend()
+    ax.grid(True, alpha=0.3)
+
+    # Add value labels on bars
+    for bars in [baseline_bars, dev_bars]:
+        for bar in bars:
+            height = bar.get_height()
+            if height > 0:
+                ax.annotate(
+                    f"{height:.0f}",
+                    xy=(bar.get_x() + bar.get_width() / 2, height),
+                    xytext=(0, 3),
+                    textcoords="offset points",
+                    ha="center",
+                    va="bottom",
+                )
+
+    plt.tight_layout()
+    plt.savefig(os.path.join(output_dir, "filesystem_comparison.png"), dpi=150)
+    plt.close()
+
+
+def create_block_size_analysis(results, output_dir):
+    """Create analysis for different block sizes (XFS specific)"""
+    # Filter XFS results
+    xfs_results = [r for r in results if r.get("filesystem") == "xfs"]
+
+    if not xfs_results:
+        return
+
+    # Group by block size
+    block_size_data = defaultdict(lambda: {"baseline": [], "dev": []})
+
+    for result in xfs_results:
+        block_size = result.get("block_size", "unknown")
+        category = "dev" if result.get("is_dev", False) else "baseline"
+        if "insert_performance" in result:
+            insert_qps = result["insert_performance"].get("vectors_per_second", 0)
+        else:
+            insert_qps = 0
+        block_size_data[block_size][category].append(insert_qps)
+
+    # Sort block sizes
+    block_sizes = sorted(
+        block_size_data.keys(),
+        key=lambda x: (
+            int(x.replace("k", "").replace("s", ""))
+            if x not in ["unknown", "default"]
+            else 0
+        ),
+    )
+
+    # Create grouped bar chart
+    baseline_means = [
+        (
+            np.mean(block_size_data[bs]["baseline"])
+            if block_size_data[bs]["baseline"]
+            else 0
+        )
+        for bs in block_sizes
+    ]
+    dev_means = [
+        np.mean(block_size_data[bs]["dev"]) if block_size_data[bs]["dev"] else 0
+        for bs in block_sizes
+    ]
+
+    x = np.arange(len(block_sizes))
+    width = 0.35
+
+    fig, ax = plt.subplots(figsize=(12, 6))
+    baseline_bars = ax.bar(
+        x - width / 2, baseline_means, width, label="Baseline", color="#2ca02c"
+    )
+    dev_bars = ax.bar(
+        x + width / 2, dev_means, width, label="Development", color="#d62728"
+    )
+
+    ax.set_xlabel("Block Size")
+    ax.set_ylabel("Insert QPS")
+    ax.set_title("XFS Performance by Block Size")
+    ax.set_xticks(x)
+    ax.set_xticklabels(block_sizes)
+    ax.legend()
+    ax.grid(True, alpha=0.3)
+
+    # Add value labels
+    for bars in [baseline_bars, dev_bars]:
+        for bar in bars:
+            height = bar.get_height()
+            if height > 0:
+                ax.annotate(
+                    f"{height:.0f}",
+                    xy=(bar.get_x() + bar.get_width() / 2, height),
+                    xytext=(0, 3),
+                    textcoords="offset points",
+                    ha="center",
+                    va="bottom",
+                )
+
+    plt.tight_layout()
+    plt.savefig(os.path.join(output_dir, "xfs_block_size_analysis.png"), dpi=150)
+    plt.close()
+
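The block-size sort key above strips the unit suffix and falls back to 0 for non-numeric labels. A standalone sketch of that ordering (the helper name `block_size_key` is ours, not part of the patch):

```python
def block_size_key(bs):
    # Numeric sort for labels like "4k"; "unknown"/"default" sort first
    if bs in ("unknown", "default"):
        return 0
    return int(bs.replace("k", "").replace("s", ""))

print(sorted(["16k", "4k", "64k", "default"], key=block_size_key))
```

This keeps "4k" before "16k", which a plain string sort would not.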
+
+def create_heatmap_analysis(results, output_dir):
+    """Create a heatmap showing AVERAGE performance across all test iterations"""
+    # Group data by configuration and version, collecting ALL values for averaging
+    config_data = defaultdict(
+        lambda: {
+            "baseline": {"insert": [], "query": [], "count": 0},
+            "dev": {"insert": [], "query": [], "count": 0},
+        }
+    )
+
+    for result in results:
+        fs = result.get("filesystem", "unknown")
+        block_size = result.get("block_size", "default")
+        config = f"{fs}-{block_size}"
+        version = "dev" if result.get("is_dev", False) else "baseline"
+
+        # Get actual insert performance
+        if "insert_performance" in result:
+            insert_qps = result["insert_performance"].get("vectors_per_second", 0)
+        else:
+            insert_qps = 0
+
+        # Calculate average query QPS
+        query_qps = 0
+        if "query_performance" in result:
+            qp = result["query_performance"]
+            total_qps = 0
+            count = 0
+            for topk_key in ["topk_1", "topk_10", "topk_100"]:
+                if topk_key in qp:
+                    for batch_key in ["batch_1", "batch_10", "batch_100"]:
+                        if batch_key in qp[topk_key]:
+                            total_qps += qp[topk_key][batch_key].get(
+                                "queries_per_second", 0
+                            )
+                            count += 1
+            if count > 0:
+                query_qps = total_qps / count
+
+        # Collect all values for averaging
+        config_data[config][version]["insert"].append(insert_qps)
+        config_data[config][version]["query"].append(query_qps)
+        config_data[config][version]["count"] += 1
+
+    # Sort configurations
+    configs = sorted(config_data.keys())
+
+    # Calculate averages for heatmap
+    insert_baseline = []
+    insert_dev = []
+    query_baseline = []
+    query_dev = []
+    iteration_counts = {"baseline": 0, "dev": 0}
+
+    for c in configs:
+        # Calculate average insert QPS
+        baseline_insert_vals = config_data[c]["baseline"]["insert"]
+        insert_baseline.append(
+            np.mean(baseline_insert_vals) if baseline_insert_vals else 0
+        )
+
+        dev_insert_vals = config_data[c]["dev"]["insert"]
+        insert_dev.append(np.mean(dev_insert_vals) if dev_insert_vals else 0)
+
+        # Calculate average query QPS
+        baseline_query_vals = config_data[c]["baseline"]["query"]
+        query_baseline.append(
+            np.mean(baseline_query_vals) if baseline_query_vals else 0
+        )
+
+        dev_query_vals = config_data[c]["dev"]["query"]
+        query_dev.append(np.mean(dev_query_vals) if dev_query_vals else 0)
+
+        # Track iteration counts
+        iteration_counts["baseline"] = max(
+            iteration_counts["baseline"], len(baseline_insert_vals)
+        )
+        iteration_counts["dev"] = max(iteration_counts["dev"], len(dev_insert_vals))
+
+    # Create figure with custom heatmap
+    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 8))
+
+    # Create data matrices
+    insert_data = np.array([insert_baseline, insert_dev]).T
+    query_data = np.array([query_baseline, query_dev]).T
+
+    # Insert QPS heatmap
+    im1 = ax1.imshow(insert_data, cmap="YlOrRd", aspect="auto")
+    ax1.set_xticks([0, 1])
+    ax1.set_xticklabels(["Baseline", "Development"])
+    ax1.set_yticks(range(len(configs)))
+    ax1.set_yticklabels(configs)
+    ax1.set_title(
+        f"Insert Performance - AVERAGE across {iteration_counts['baseline']} iterations\n(1M vectors, 128 dims, HNSW index)"
+    )
+    ax1.set_ylabel("Configuration")
+
+    # Add text annotations with dynamic color based on background
+    # Get the colormap to determine actual colors
+    cmap1 = plt.cm.YlOrRd
+    norm1 = plt.Normalize(vmin=insert_data.min(), vmax=insert_data.max())
+
+    for i in range(len(configs)):
+        for j in range(2):
+            # Get the actual color from the colormap
+            val = insert_data[i, j]
+            rgba = cmap1(norm1(val))
+            # Calculate luminance using standard formula
+            # Perceived luminance: 0.299*R + 0.587*G + 0.114*B
+            luminance = 0.299 * rgba[0] + 0.587 * rgba[1] + 0.114 * rgba[2]
+            # Use white text on dark backgrounds (low luminance)
+            text_color = "white" if luminance < 0.5 else "black"
+
+            # Show average value with indicator
+            ax1.text(
+                j,
+                i,
+                f"{int(insert_data[i, j])}\n(avg)",
+                ha="center",
+                va="center",
+                color=text_color,
+                fontweight="bold",
+                fontsize=9,
+            )
+
+    # Add colorbar
+    cbar1 = plt.colorbar(im1, ax=ax1)
+    cbar1.set_label("Insert QPS")
+
+    # Query QPS heatmap
+    im2 = ax2.imshow(query_data, cmap="YlGnBu", aspect="auto")
+    ax2.set_xticks([0, 1])
+    ax2.set_xticklabels(["Baseline", "Development"])
+    ax2.set_yticks(range(len(configs)))
+    ax2.set_yticklabels(configs)
+    ax2.set_title(
+        f"Query Performance - AVERAGE across {iteration_counts['dev']} iterations\n(1M vectors, 128 dims, HNSW index)"
+    )
+
+    # Add text annotations with dynamic color based on background
+    # Get the colormap to determine actual colors
+    cmap2 = plt.cm.YlGnBu
+    norm2 = plt.Normalize(vmin=query_data.min(), vmax=query_data.max())
+
+    for i in range(len(configs)):
+        for j in range(2):
+            # Get the actual color from the colormap
+            val = query_data[i, j]
+            rgba = cmap2(norm2(val))
+            # Calculate luminance using standard formula
+            # Perceived luminance: 0.299*R + 0.587*G + 0.114*B
+            luminance = 0.299 * rgba[0] + 0.587 * rgba[1] + 0.114 * rgba[2]
+            # Use white text on dark backgrounds (low luminance)
+            text_color = "white" if luminance < 0.5 else "black"
+
+            # Show average value with indicator
+            ax2.text(
+                j,
+                i,
+                f"{int(query_data[i, j])}\n(avg)",
+                ha="center",
+                va="center",
+                color=text_color,
+                fontweight="bold",
+                fontsize=9,
+            )
+
+    # Add colorbar
+    cbar2 = plt.colorbar(im2, ax=ax2)
+    cbar2.set_label("Query QPS")
+
+    # Add overall figure title
+    fig.suptitle(
+        "Performance Heatmap - Showing AVERAGES across Multiple Test Iterations",
+        fontsize=14,
+        fontweight="bold",
+        y=1.02,
+    )
+
+    plt.tight_layout()
+    plt.savefig(
+        os.path.join(output_dir, "performance_heatmap.png"),
+        dpi=150,
+        bbox_inches="tight",
+    )
+    plt.close()
+
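The label-color logic in the heatmap above can be exercised on its own; a minimal sketch using the same Rec. 601 luma formula (the helper name `label_color` is ours, not part of the patch):

```python
def label_color(rgba):
    # Perceived luminance (Rec. 601): 0.299*R + 0.587*G + 0.114*B;
    # white text on dark cells, black text on light cells
    luminance = 0.299 * rgba[0] + 0.587 * rgba[1] + 0.114 * rgba[2]
    return "white" if luminance < 0.5 else "black"

print(label_color((1.0, 1.0, 1.0, 1.0)))  # black (white cell)
print(label_color((0.1, 0.1, 0.1, 1.0)))  # white (dark cell)
```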
+
+def create_performance_trends(results, output_dir):
+    """Create line charts showing performance trends"""
+    # Group by filesystem type
+    fs_types = defaultdict(
+        lambda: {
+            "configs": [],
+            "baseline_insert": [],
+            "dev_insert": [],
+            "baseline_query": [],
+            "dev_query": [],
+        }
+    )
+
+    for result in results:
+        fs = result.get("filesystem", "unknown")
+        block_size = result.get("block_size", "default")
+        config = f"{block_size}"
+
+        if config not in fs_types[fs]["configs"]:
+            fs_types[fs]["configs"].append(config)
+            fs_types[fs]["baseline_insert"].append(0)
+            fs_types[fs]["dev_insert"].append(0)
+            fs_types[fs]["baseline_query"].append(0)
+            fs_types[fs]["dev_query"].append(0)
+
+        idx = fs_types[fs]["configs"].index(config)
+
+        # Calculate average query QPS from all test configurations
+        query_qps = 0
+        if "query_performance" in result:
+            qp = result["query_performance"]
+            total_qps = 0
+            count = 0
+            for topk_key in ["topk_1", "topk_10", "topk_100"]:
+                if topk_key in qp:
+                    for batch_key in ["batch_1", "batch_10", "batch_100"]:
+                        if batch_key in qp[topk_key]:
+                            total_qps += qp[topk_key][batch_key].get(
+                                "queries_per_second", 0
+                            )
+                            count += 1
+            if count > 0:
+                query_qps = total_qps / count
+
+        if result.get("is_dev", False):
+            if "insert_performance" in result:
+                fs_types[fs]["dev_insert"][idx] = result["insert_performance"].get(
+                    "vectors_per_second", 0
+                )
+            fs_types[fs]["dev_query"][idx] = query_qps
+        else:
+            if "insert_performance" in result:
+                fs_types[fs]["baseline_insert"][idx] = result["insert_performance"].get(
+                    "vectors_per_second", 0
+                )
+            fs_types[fs]["baseline_query"][idx] = query_qps
+
+    # Create separate plots for each filesystem
+    for fs, data in fs_types.items():
+        if not data["configs"]:
+            continue
+
+        fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 10))
+
+        x = range(len(data["configs"]))
+
+        # Insert performance
+        ax1.plot(
+            x,
+            data["baseline_insert"],
+            "o-",
+            label="Baseline",
+            linewidth=2,
+            markersize=8,
+        )
+        ax1.plot(
+            x, data["dev_insert"], "s-", label="Development", linewidth=2, markersize=8
+        )
+        ax1.set_xlabel("Configuration")
+        ax1.set_ylabel("Insert QPS")
+        ax1.set_title(f"{fs.upper()} Insert Performance")
+        ax1.set_xticks(x)
+        ax1.set_xticklabels(data["configs"])
+        ax1.legend()
+        ax1.grid(True, alpha=0.3)
+
+        # Query performance
+        ax2.plot(
+            x, data["baseline_query"], "o-", label="Baseline", linewidth=2, markersize=8
+        )
+        ax2.plot(
+            x, data["dev_query"], "s-", label="Development", linewidth=2, markersize=8
+        )
+        ax2.set_xlabel("Configuration")
+        ax2.set_ylabel("Query QPS")
+        ax2.set_title(f"{fs.upper()} Query Performance")
+        ax2.set_xticks(x)
+        ax2.set_xticklabels(data["configs"])
+        ax2.legend()
+        ax2.grid(True, alpha=0.3)
+
+        plt.tight_layout()
+        plt.savefig(os.path.join(output_dir, f"{fs}_performance_trends.png"), dpi=150)
+        plt.close()
+
+
+def create_simple_performance_trends(results, output_dir):
+    """Create a simple performance trends chart for basic Milvus testing"""
+    if not results:
+        return
+
+    # Extract configuration from the first result for display
+    # (results is known to be non-empty here)
+    config_text = ""
+    first_result = results[0]
+    if "config" in first_result:
+        cfg = first_result["config"]
+        config_text = (
+            f"Test Config:\n"
+            f"• {cfg.get('vector_dataset_size', 'N/A'):,} vectors/iteration\n"
+            f"• {cfg.get('vector_dimensions', 'N/A')} dimensions\n"
+            f"• {cfg.get('index_type', 'N/A')} index"
+        )
+
+    # Separate baseline and dev results
+    baseline_results = [r for r in results if not r.get("is_dev", False)]
+    dev_results = [r for r in results if r.get("is_dev", False)]
+
+    if not baseline_results and not dev_results:
+        return
+
+    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 10))
+
+    # Prepare data
+    baseline_insert = []
+    baseline_query = []
+    dev_insert = []
+    dev_query = []
+    labels = []
+
+    # Process baseline results
+    for i, result in enumerate(baseline_results):
+        if "insert_performance" in result:
+            baseline_insert.append(
+                result["insert_performance"].get("vectors_per_second", 0)
+            )
+        else:
+            baseline_insert.append(0)
+
+        # Calculate average query QPS
+        query_qps = 0
+        if "query_performance" in result:
+            qp = result["query_performance"]
+            total_qps = 0
+            count = 0
+            for topk_key in ["topk_1", "topk_10", "topk_100"]:
+                if topk_key in qp:
+                    for batch_key in ["batch_1", "batch_10", "batch_100"]:
+                        if batch_key in qp[topk_key]:
+                            total_qps += qp[topk_key][batch_key].get(
+                                "queries_per_second", 0
+                            )
+                            count += 1
+            if count > 0:
+                query_qps = total_qps / count
+        baseline_query.append(query_qps)
+        labels.append(f"Iteration {i+1}")
+
+    # Process dev results
+    for result in dev_results:
+        if "insert_performance" in result:
+            dev_insert.append(result["insert_performance"].get("vectors_per_second", 0))
+        else:
+            dev_insert.append(0)
+
+        query_qps = 0
+        if "query_performance" in result:
+            qp = result["query_performance"]
+            total_qps = 0
+            count = 0
+            for topk_key in ["topk_1", "topk_10", "topk_100"]:
+                if topk_key in qp:
+                    for batch_key in ["batch_1", "batch_10", "batch_100"]:
+                        if batch_key in qp[topk_key]:
+                            total_qps += qp[topk_key][batch_key].get(
+                                "queries_per_second", 0
+                            )
+                            count += 1
+            if count > 0:
+                query_qps = total_qps / count
+        dev_query.append(query_qps)
+
+    # Assumes baseline and dev runs use matched iteration counts; index off
+    # whichever set is populated
+    x = range(len(baseline_results) if baseline_results else len(dev_results))
+
+    # Insert performance - with visible markers for all points
+    if baseline_insert:
+        # Line only; per-point markers are added by the scatter below
+        ax1.plot(
+            x,
+            baseline_insert,
+            "-",
+            label="Baseline",
+            linewidth=1.5,
+            color="blue",
+            alpha=0.6,
+        )
+        # Add distinct markers for each point
+        ax1.scatter(
+            x,
+            baseline_insert,
+            s=30,
+            color="blue",
+            alpha=0.8,
+            edgecolors="darkblue",
+            linewidth=0.5,
+            zorder=5,
+        )
+    if dev_insert:
+        # Line only; per-point markers are added by the scatter below
+        ax1.plot(
+            x[: len(dev_insert)],
+            dev_insert,
+            "-",
+            label="Development",
+            linewidth=1.5,
+            color="red",
+            alpha=0.6,
+        )
+        # Add distinct markers for each point
+        ax1.scatter(
+            x[: len(dev_insert)],
+            dev_insert,
+            s=30,
+            color="red",
+            alpha=0.8,
+            edgecolors="darkred",
+            linewidth=0.5,
+            marker="s",
+            zorder=5,
+        )
+    ax1.set_xlabel("Test Iteration (same configuration, repeated for reliability)")
+    ax1.set_ylabel("Insert QPS (vectors/second)")
+    ax1.set_title("Milvus Insert Performance")
+
+    # Handle x-axis labels to prevent overlap
+    num_points = len(x)
+    if num_points > 20:
+        # Show every 5th label for many iterations
+        step = 5
+        tick_positions = list(range(0, num_points, step))
+        tick_labels = [
+            labels[i] if labels else f"Iteration {i+1}" for i in tick_positions
+        ]
+        ax1.set_xticks(tick_positions)
+        ax1.set_xticklabels(tick_labels, rotation=45, ha="right")
+    elif num_points > 10:
+        # Show every 2nd label for moderate iterations
+        step = 2
+        tick_positions = list(range(0, num_points, step))
+        tick_labels = [
+            labels[i] if labels else f"Iteration {i+1}" for i in tick_positions
+        ]
+        ax1.set_xticks(tick_positions)
+        ax1.set_xticklabels(tick_labels, rotation=45, ha="right")
+    else:
+        # Show all labels for few iterations
+        ax1.set_xticks(x)
+        ax1.set_xticklabels(labels if labels else [f"Iteration {i+1}" for i in x])
+
+    ax1.legend()
+    ax1.grid(True, alpha=0.3)
+
+    # Add configuration text box - compact
+    if config_text:
+        ax1.text(
+            0.02,
+            0.98,
+            config_text,
+            transform=ax1.transAxes,
+            fontsize=6,
+            verticalalignment="top",
+            bbox=dict(boxstyle="round,pad=0.3", facecolor="wheat", alpha=0.85),
+        )
+
+    # Query performance - with visible markers for all points
+    if baseline_query:
+        # Line plot
+        ax2.plot(
+            x,
+            baseline_query,
+            "-",
+            label="Baseline",
+            linewidth=1.5,
+            color="blue",
+            alpha=0.6,
+        )
+        # Add distinct markers for each point
+        ax2.scatter(
+            x,
+            baseline_query,
+            s=30,
+            color="blue",
+            alpha=0.8,
+            edgecolors="darkblue",
+            linewidth=0.5,
+            zorder=5,
+        )
+    if dev_query:
+        # Line plot
+        ax2.plot(
+            x[: len(dev_query)],
+            dev_query,
+            "-",
+            label="Development",
+            linewidth=1.5,
+            color="red",
+            alpha=0.6,
+        )
+        # Add distinct markers for each point
+        ax2.scatter(
+            x[: len(dev_query)],
+            dev_query,
+            s=30,
+            color="red",
+            alpha=0.8,
+            edgecolors="darkred",
+            linewidth=0.5,
+            marker="s",
+            zorder=5,
+        )
+    ax2.set_xlabel("Test Iteration (same configuration, repeated for reliability)")
+    ax2.set_ylabel("Query QPS (queries/second)")
+    ax2.set_title("Milvus Query Performance")
+
+    # Handle x-axis labels to prevent overlap
+    num_points = len(x)
+    if num_points > 20:
+        # Show every 5th label for many iterations
+        step = 5
+        tick_positions = list(range(0, num_points, step))
+        tick_labels = [
+            labels[i] if labels else f"Iteration {i+1}" for i in tick_positions
+        ]
+        ax2.set_xticks(tick_positions)
+        ax2.set_xticklabels(tick_labels, rotation=45, ha="right")
+    elif num_points > 10:
+        # Show every 2nd label for moderate iterations
+        step = 2
+        tick_positions = list(range(0, num_points, step))
+        tick_labels = [
+            labels[i] if labels else f"Iteration {i+1}" for i in tick_positions
+        ]
+        ax2.set_xticks(tick_positions)
+        ax2.set_xticklabels(tick_labels, rotation=45, ha="right")
+    else:
+        # Show all labels for few iterations
+        ax2.set_xticks(x)
+        ax2.set_xticklabels(labels if labels else [f"Iteration {i+1}" for i in x])
+
+    ax2.legend()
+    ax2.grid(True, alpha=0.3)
+
+    # Add configuration text box - compact
+    if config_text:
+        ax2.text(
+            0.02,
+            0.98,
+            config_text,
+            transform=ax2.transAxes,
+            fontsize=6,
+            verticalalignment="top",
+            bbox=dict(boxstyle="round,pad=0.3", facecolor="wheat", alpha=0.85),
+        )
+
+    plt.tight_layout()
+    plt.savefig(os.path.join(output_dir, "performance_trends.png"), dpi=150)
+    plt.close()
+
+
+def generate_summary_statistics(results, output_dir):
+    """Generate summary statistics and save to JSON"""
+    # Get unique filesystems, excluding "unknown"
+    filesystems = set()
+    for r in results:
+        fs = r.get("filesystem", "unknown")
+        if fs != "unknown":
+            filesystems.add(fs)
+
+    summary = {
+        "total_tests": len(results),
+        "filesystems_tested": sorted(list(filesystems)),
+        "configurations": {},
+        "performance_summary": {
+            "best_insert_qps": {"value": 0, "config": ""},
+            "best_query_qps": {"value": 0, "config": ""},
+            "average_insert_qps": 0,
+            "average_query_qps": 0,
+        },
+    }
+
+    # Calculate statistics
+    all_insert_qps = []
+    all_query_qps = []
+
+    for result in results:
+        fs = result.get("filesystem", "unknown")
+        block_size = result.get("block_size", "default")
+        is_dev = "dev" if result.get("is_dev", False) else "baseline"
+        config_name = f"{fs}-{block_size}-{is_dev}"
+
+        # Get actual performance metrics
+        if "insert_performance" in result:
+            insert_qps = result["insert_performance"].get("vectors_per_second", 0)
+        else:
+            insert_qps = 0
+
+        # Calculate average query QPS
+        query_qps = 0
+        if "query_performance" in result:
+            qp = result["query_performance"]
+            total_qps = 0
+            count = 0
+            for topk_key in ["topk_1", "topk_10", "topk_100"]:
+                if topk_key in qp:
+                    for batch_key in ["batch_1", "batch_10", "batch_100"]:
+                        if batch_key in qp[topk_key]:
+                            total_qps += qp[topk_key][batch_key].get(
+                                "queries_per_second", 0
+                            )
+                            count += 1
+            if count > 0:
+                query_qps = total_qps / count
+
+        all_insert_qps.append(insert_qps)
+        all_query_qps.append(query_qps)
+
+        summary["configurations"][config_name] = {
+            "insert_qps": insert_qps,
+            "query_qps": query_qps,
+            "host": result.get("host", "unknown"),
+        }
+
+        if insert_qps > summary["performance_summary"]["best_insert_qps"]["value"]:
+            summary["performance_summary"]["best_insert_qps"] = {
+                "value": insert_qps,
+                "config": config_name,
+            }
+
+        if query_qps > summary["performance_summary"]["best_query_qps"]["value"]:
+            summary["performance_summary"]["best_query_qps"] = {
+                "value": query_qps,
+                "config": config_name,
+            }
+
+    summary["performance_summary"]["average_insert_qps"] = (
+        np.mean(all_insert_qps) if all_insert_qps else 0
+    )
+    summary["performance_summary"]["average_query_qps"] = (
+        np.mean(all_query_qps) if all_query_qps else 0
+    )
+
+    # Save summary
+    with open(os.path.join(output_dir, "summary.json"), "w") as f:
+        json.dump(summary, f, indent=2)
+
+    return summary
+
+
+def create_comprehensive_fs_comparison(results, output_dir):
+    """Create comprehensive filesystem performance comparison including all configurations"""
+
+    # Collect data for all filesystem configurations
+    config_data = defaultdict(lambda: {"baseline": [], "dev": []})
+
+    for result in results:
+        fs = result.get("filesystem", "unknown")
+        block_size = result.get("block_size", "")
+
+        # Create configuration label
+        if block_size and block_size != "default":
+            config_label = f"{fs}-{block_size}"
+        else:
+            config_label = fs
+
+        category = "dev" if result.get("is_dev", False) else "baseline"
+
+        # Extract performance metrics
+        if "insert_performance" in result:
+            insert_qps = result["insert_performance"].get("vectors_per_second", 0)
+        else:
+            insert_qps = 0
+
+        config_data[config_label][category].append(insert_qps)
+
+    # Sort configurations for consistent display
+    configs = sorted(config_data.keys())
+
+    # Calculate means and standard deviations
+    baseline_means = []
+    baseline_stds = []
+    dev_means = []
+    dev_stds = []
+
+    for config in configs:
+        baseline_vals = config_data[config]["baseline"]
+        dev_vals = config_data[config]["dev"]
+
+        baseline_means.append(np.mean(baseline_vals) if baseline_vals else 0)
+        baseline_stds.append(np.std(baseline_vals) if baseline_vals else 0)
+        dev_means.append(np.mean(dev_vals) if dev_vals else 0)
+        dev_stds.append(np.std(dev_vals) if dev_vals else 0)
+
+    # Create the plot
+    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 10))
+
+    x = np.arange(len(configs))
+    width = 0.35
+
+    # Top plot: Absolute performance
+    baseline_bars = ax1.bar(
+        x - width / 2,
+        baseline_means,
+        width,
+        yerr=baseline_stds,
+        label="Baseline",
+        color="#1f77b4",
+        capsize=5,
+    )
+    dev_bars = ax1.bar(
+        x + width / 2,
+        dev_means,
+        width,
+        yerr=dev_stds,
+        label="Development",
+        color="#ff7f0e",
+        capsize=5,
+    )
+
+    ax1.set_ylabel("Insert QPS")
+    ax1.set_title("Vector Database Performance Across Filesystem Configurations")
+    ax1.set_xticks(x)
+    ax1.set_xticklabels(configs, rotation=45, ha="right")
+    ax1.legend()
+    ax1.grid(True, alpha=0.3)
+
+    # Add value labels on bars
+    for bars in [baseline_bars, dev_bars]:
+        for bar in bars:
+            height = bar.get_height()
+            if height > 0:
+                ax1.annotate(
+                    f"{height:.0f}",
+                    xy=(bar.get_x() + bar.get_width() / 2, height),
+                    xytext=(0, 3),
+                    textcoords="offset points",
+                    ha="center",
+                    va="bottom",
+                    fontsize=8,
+                )
+
+    # Bottom plot: Percentage improvement (dev vs baseline)
+    improvements = []
+    for i in range(len(configs)):
+        if baseline_means[i] > 0:
+            improvement = ((dev_means[i] - baseline_means[i]) / baseline_means[i]) * 100
+        else:
+            improvement = 0
+        improvements.append(improvement)
+
+    colors = ["green" if v > 0 else "red" for v in improvements]
+    improvement_bars = ax2.bar(x, improvements, color=colors, alpha=0.7)
+
+    ax2.set_ylabel("Performance Change (%)")
+    ax2.set_title("Development vs Baseline Performance Change")
+    ax2.set_xticks(x)
+    ax2.set_xticklabels(configs, rotation=45, ha="right")
+    ax2.axhline(y=0, color="black", linestyle="-", linewidth=0.5)
+    ax2.grid(True, alpha=0.3)
+
+    # Add percentage labels
+    for bar, val in zip(improvement_bars, improvements):
+        ax2.annotate(
+            f"{val:.1f}%",
+            xy=(bar.get_x() + bar.get_width() / 2, val),
+            xytext=(0, 3 if val > 0 else -15),
+            textcoords="offset points",
+            ha="center",
+            va="bottom" if val > 0 else "top",
+            fontsize=8,
+        )
+
+    plt.tight_layout()
+    plt.savefig(os.path.join(output_dir, "comprehensive_fs_comparison.png"), dpi=150)
+    plt.close()
+
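The dev-vs-baseline percentage computed in the bottom plot above reduces to one expression; a sketch (the helper name `pct_change` is ours, not part of the patch):

```python
def pct_change(dev_mean, baseline_mean):
    # Relative change of dev vs baseline, in percent; 0 when the
    # baseline mean is missing or zero (matching the plot's guard)
    if baseline_mean > 0:
        return ((dev_mean - baseline_mean) / baseline_mean) * 100
    return 0

print(pct_change(110, 100))  # 10.0
```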
+
+def create_fs_latency_comparison(results, output_dir):
+    """Create latency comparison across filesystems"""
+
+    # Collect latency data
+    config_latency = defaultdict(lambda: {"baseline": [], "dev": []})
+
+    for result in results:
+        fs = result.get("filesystem", "unknown")
+        block_size = result.get("block_size", "")
+
+        if block_size and block_size != "default":
+            config_label = f"{fs}-{block_size}"
+        else:
+            config_label = fs
+
+        category = "dev" if result.get("is_dev", False) else "baseline"
+
+        # Extract latency metrics
+        if "query_performance" in result:
+            latency_p99 = result["query_performance"].get("latency_p99_ms", 0)
+        else:
+            latency_p99 = 0
+
+        if latency_p99 > 0:
+            config_latency[config_label][category].append(latency_p99)
+
+    if not config_latency:
+        return
+
+    # Sort configurations
+    configs = sorted(config_latency.keys())
+
+    # Calculate statistics
+    baseline_p99 = []
+    dev_p99 = []
+
+    for config in configs:
+        baseline_vals = config_latency[config]["baseline"]
+        dev_vals = config_latency[config]["dev"]
+
+        baseline_p99.append(np.mean(baseline_vals) if baseline_vals else 0)
+        dev_p99.append(np.mean(dev_vals) if dev_vals else 0)
+
+    # Create plot
+    fig, ax = plt.subplots(figsize=(12, 6))
+
+    x = np.arange(len(configs))
+    width = 0.35
+
+    baseline_bars = ax.bar(
+        x - width / 2, baseline_p99, width, label="Baseline P99", color="#9467bd"
+    )
+    dev_bars = ax.bar(
+        x + width / 2, dev_p99, width, label="Development P99", color="#e377c2"
+    )
+
+    ax.set_xlabel("Filesystem Configuration")
+    ax.set_ylabel("Latency P99 (ms)")
+    ax.set_title("Query Latency (P99) Comparison Across Filesystems")
+    ax.set_xticks(x)
+    ax.set_xticklabels(configs, rotation=45, ha="right")
+    ax.legend()
+    ax.grid(True, alpha=0.3)
+
+    # Add value labels
+    for bars in [baseline_bars, dev_bars]:
+        for bar in bars:
+            height = bar.get_height()
+            if height > 0:
+                ax.annotate(
+                    f"{height:.1f}",
+                    xy=(bar.get_x() + bar.get_width() / 2, height),
+                    xytext=(0, 3),
+                    textcoords="offset points",
+                    ha="center",
+                    va="bottom",
+                    fontsize=8,
+                )
+
+    plt.tight_layout()
+    plt.savefig(os.path.join(output_dir, "filesystem_latency_comparison.png"), dpi=150)
+    plt.close()
+
+
+def main():
+    if len(sys.argv) < 3:
+        print("Usage: generate_graphs.py <results_dir> <output_dir>")
+        sys.exit(1)
+
+    results_dir = sys.argv[1]
+    output_dir = sys.argv[2]
+
+    # Create output directory
+    os.makedirs(output_dir, exist_ok=True)
+
+    # Load results
+    results = load_results(results_dir)
+
+    if not results:
+        print("No results found to analyze")
+        sys.exit(1)
+
+    print(f"Loaded {len(results)} result files")
+
+    # Generate graphs
+    print("Generating performance heatmap...")
+    create_heatmap_analysis(results, output_dir)
+
+    print("Generating performance trends...")
+    create_simple_performance_trends(results, output_dir)
+
+    print("Generating summary statistics...")
+    summary = generate_summary_statistics(results, output_dir)
+
+    # Check if we have multiple filesystems to compare
+    filesystems = set(r.get("filesystem", "unknown") for r in results)
+    if len(filesystems) > 1:
+        print("Generating filesystem comparison chart...")
+        create_filesystem_comparison_chart(results, output_dir)
+
+        print("Generating comprehensive filesystem comparison...")
+        create_comprehensive_fs_comparison(results, output_dir)
+
+        print("Generating filesystem latency comparison...")
+        create_fs_latency_comparison(results, output_dir)
+
+        # Check if we have XFS results with different block sizes
+        xfs_results = [r for r in results if r.get("filesystem") == "xfs"]
+        block_sizes = set(r.get("block_size", "unknown") for r in xfs_results)
+        if len(block_sizes) > 1:
+            print("Generating XFS block size analysis...")
+            create_block_size_analysis(results, output_dir)
+
+    print(f"\nAnalysis complete! Graphs saved to {output_dir}")
+    print(f"Total configurations tested: {summary['total_tests']}")
+    print(
+        f"Best insert QPS: {summary['performance_summary']['best_insert_qps']['value']} ({summary['performance_summary']['best_insert_qps']['config']})"
+    )
+    print(
+        f"Best query QPS: {summary['performance_summary']['best_query_qps']['value']} ({summary['performance_summary']['best_query_qps']['config']})"
+    )
+
+
+if __name__ == "__main__":
+    main()
diff --git a/workflows/ai/scripts/generate_html_report.py b/workflows/ai/scripts/generate_html_report.py
new file mode 100755
index 00000000..3aa8342f
--- /dev/null
+++ b/workflows/ai/scripts/generate_html_report.py
@@ -0,0 +1,558 @@
+#!/usr/bin/env python3
+"""
+Generate HTML report for AI benchmark results
+"""
+
+import json
+import os
+import sys
+import glob
+from datetime import datetime
+from pathlib import Path
+
+HTML_TEMPLATE = """
+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>AI Benchmark Results - {timestamp}</title>
+    <style>
+        body {{
+            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif;
+            line-height: 1.6;
+            color: #333;
+            max-width: 1400px;
+            margin: 0 auto;
+            padding: 20px;
+            background-color: #f5f5f5;
+        }}
+        .header {{
+            background-color: #2c3e50;
+            color: white;
+            padding: 30px;
+            border-radius: 8px;
+            margin-bottom: 30px;
+            text-align: center;
+        }}
+        h1 {{
+            margin: 0;
+            font-size: 2.5em;
+        }}
+        .subtitle {{
+            margin-top: 10px;
+            opacity: 0.9;
+        }}
+        .summary-cards {{
+            display: grid;
+            grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
+            gap: 20px;
+            margin-bottom: 40px;
+        }}
+        .card {{
+            background: white;
+            padding: 20px;
+            border-radius: 8px;
+            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+            text-align: center;
+        }}
+        .card h3 {{
+            margin: 0 0 10px 0;
+            color: #2c3e50;
+        }}
+        .card .value {{
+            font-size: 2em;
+            font-weight: bold;
+            color: #3498db;
+        }}
+        .card .label {{
+            color: #7f8c8d;
+            font-size: 0.9em;
+        }}
+        .config-box {{
+            background: #f8f9fa;
+            border-left: 4px solid #3498db;
+            padding: 15px;
+            margin: 20px 0;
+            border-radius: 4px;
+        }}
+        .config-box h3 {{
+            margin-top: 0;
+            color: #2c3e50;
+        }}
+        .config-box ul {{
+            margin: 10px 0;
+            padding-left: 20px;
+        }}
+        .config-box li {{
+            margin: 5px 0;
+        }}
+        .section {{
+            background: white;
+            padding: 30px;
+            border-radius: 8px;
+            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+            margin-bottom: 30px;
+        }}
+        .section h2 {{
+            color: #2c3e50;
+            border-bottom: 2px solid #3498db;
+            padding-bottom: 10px;
+            margin-bottom: 20px;
+        }}
+        .graph-container {{
+            text-align: center;
+            margin: 20px 0;
+        }}
+        .graph-container img {{
+            max-width: 100%;
+            height: auto;
+            border-radius: 4px;
+            box-shadow: 0 2px 8px rgba(0,0,0,0.1);
+        }}
+        .results-table {{
+            width: 100%;
+            border-collapse: collapse;
+            margin-top: 20px;
+        }}
+        .results-table th, .results-table td {{
+            padding: 12px;
+            text-align: left;
+            border-bottom: 1px solid #ddd;
+        }}
+        .results-table th {{
+            background-color: #f8f9fa;
+            font-weight: 600;
+            color: #2c3e50;
+        }}
+        .results-table tr:hover {{
+            background-color: #f8f9fa;
+        }}
+        .baseline {{
+            background-color: #e8f4fd;
+        }}
+        .dev {{
+            background-color: #fff3cd;
+        }}
+        .footer {{
+            text-align: center;
+            padding: 20px;
+            color: #7f8c8d;
+            font-size: 0.9em;
+        }}
+        .graph-grid {{
+            display: grid;
+            grid-template-columns: repeat(auto-fit, minmax(500px, 1fr));
+            gap: 20px;
+            margin: 20px 0;
+        }}
+        .best-config {{
+            background-color: #d4edda;
+            font-weight: bold;
+        }}
+        .navigation {{
+            position: sticky;
+            top: 20px;
+            background: white;
+            padding: 20px;
+            border-radius: 8px;
+            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+            margin-bottom: 30px;
+        }}
+        .navigation ul {{
+            list-style: none;
+            padding: 0;
+            margin: 0;
+        }}
+        .navigation li {{
+            display: inline-block;
+            margin-right: 20px;
+        }}
+        .navigation a {{
+            color: #3498db;
+            text-decoration: none;
+            font-weight: 500;
+        }}
+        .navigation a:hover {{
+            text-decoration: underline;
+        }}
+    </style>
+</head>
+<body>
+    <div class="header">
+        <h1>AI Vector Database Benchmark Results</h1>
+        <div class="subtitle">Generated on {timestamp}</div>
+    </div>
+
+    <nav class="navigation">
+        <ul>
+            <li><a href="#summary">Summary</a></li>
+            {filesystem_nav_items}
+            <li><a href="#performance-heatmap">Performance Heatmap</a></li>
+            <li><a href="#performance-metrics">Performance Metrics</a></li>
+            <li><a href="#detailed-results">Detailed Results</a></li>
+        </ul>
+    </nav>
+
+    <div id="summary" class="summary-cards">
+        <div class="card">
+            <h3>Total Tests</h3>
+            <div class="value">{total_tests}</div>
+            <div class="label">Configurations</div>
+        </div>
+        <div class="card">
+            <h3>Best Insert QPS</h3>
+            <div class="value">{best_insert_qps}</div>
+            <div class="label">{best_insert_config}</div>
+        </div>
+        <div class="card">
+            <h3>Best Query QPS</h3>
+            <div class="value">{best_query_qps}</div>
+            <div class="label">{best_query_config}</div>
+        </div>
+        <div class="card">
+            <h3>{fourth_card_title}</h3>
+            <div class="value">{fourth_card_value}</div>
+            <div class="label">{fourth_card_label}</div>
+        </div>
+    </div>
+
+    {filesystem_comparison_section}
+
+    {block_size_analysis_section}
+
+    <div id="performance-heatmap" class="section">
+        <h2>Performance Heatmap</h2>
+        <p>Heatmap visualization showing performance metrics across all tested configurations.</p>
+        <div class="graph-container">
+            <img src="graphs/performance_heatmap.png" alt="Performance Heatmap">
+        </div>
+    </div>
+
+    <div id="performance-metrics" class="section">
+        <h2>Performance Metrics</h2>
+        {config_summary}
+        <div class="graph-grid">
+            {performance_trend_graphs}
+        </div>
+    </div>
+
+    <div id="detailed-results" class="section">
+        <h2>Detailed Results Table</h2>
+        <table class="results-table">
+            <thead>
+                <tr>
+                    <th>Host</th>
+                    <th>Type</th>
+                    <th>Insert QPS</th>
+                    <th>Query QPS</th>
+                    <th>Timestamp</th>
+                </tr>
+            </thead>
+            <tbody>
+                {table_rows}
+            </tbody>
+        </table>
+    </div>
+
+    <div class="footer">
+        <p>Generated by kdevops AI Benchmark Suite | <a href="https://github.com/linux-kdevops/kdevops">GitHub</a></p>
+    </div>
+</body>
+</html>
+"""
+
+
+def load_summary(graphs_dir):
+    """Load the summary.json file"""
+    summary_path = os.path.join(graphs_dir, "summary.json")
+    if os.path.exists(summary_path):
+        with open(summary_path, "r") as f:
+            return json.load(f)
+    return None
+
+
+def load_results(results_dir):
+    """Load all result files for detailed table"""
+    results = []
+    json_files = glob.glob(os.path.join(results_dir, "*.json"))
+
+    for json_file in json_files:
+        try:
+            with open(json_file, "r") as f:
+                data = json.load(f)
+                # Get filesystem from JSON data first, then fall back to filename parsing
+                filename = os.path.basename(json_file)
+
+                # Skip results without valid performance data
+                insert_perf = data.get("insert_performance", {})
+                query_perf = data.get("query_performance", {})
+                if not insert_perf or not query_perf:
+                    continue
+
+                # Get filesystem from JSON data
+                fs_type = data.get("filesystem", None)
+
+                # If not in JSON, try to parse from filename (backwards compatibility)
+                if not fs_type and "debian13-ai" in filename:
+                    host_parts = (
+                        filename.replace("results_debian13-ai-", "")
+                        .replace("_1.json", "")
+                        .replace("_2.json", "")
+                        .replace("_3.json", "")
+                        .split("-")
+                    )
+                    if "xfs" in host_parts[0]:
+                        fs_type = "xfs"
+                        block_size = host_parts[1] if len(host_parts) > 1 else "4k"
+                    elif "ext4" in host_parts[0]:
+                        fs_type = "ext4"
+                        block_size = host_parts[1] if len(host_parts) > 1 else "4k"
+                    elif "btrfs" in host_parts[0]:
+                        fs_type = "btrfs"
+                        block_size = "default"
+                    else:
+                        fs_type = "unknown"
+                        block_size = "unknown"
+                else:
+                    # Set appropriate block size based on filesystem
+                    if fs_type == "btrfs":
+                        block_size = "default"
+                    else:
+                        block_size = data.get("block_size", "default")
+
+                # Default to unknown if still not found
+                if not fs_type:
+                    fs_type = "unknown"
+                    block_size = "unknown"
+
+                is_dev = "dev" in filename
+
+                # Calculate average QPS from query performance data
+                query_qps = 0
+                query_count = 0
+                for topk_data in query_perf.values():
+                    for batch_data in topk_data.values():
+                        qps = batch_data.get("queries_per_second", 0)
+                        if qps > 0:
+                            query_qps += qps
+                            query_count += 1
+                if query_count > 0:
+                    query_qps = query_qps / query_count
+
+                results.append(
+                    {
+                        "host": filename.replace("results_", "").replace(".json", ""),
+                        "filesystem": fs_type,
+                        "block_size": block_size,
+                        "type": "Development" if is_dev else "Baseline",
+                        "insert_qps": insert_perf.get("vectors_per_second", 0),
+                        "query_qps": query_qps,
+                        "timestamp": data.get("timestamp", "N/A"),
+                        "is_dev": is_dev,
+                    }
+                )
+        except Exception as e:
+            print(f"Error loading {json_file}: {e}")
+
+    # Sort by filesystem, block size, then type
+    results.sort(key=lambda x: (x["filesystem"], x["block_size"], x["type"]))
+    return results
+
+
+def generate_table_rows(results, best_configs):
+    """Generate HTML table rows"""
+    rows = []
+    for result in results:
+        config_key = f"{result['filesystem']}-{result['block_size']}-{'dev' if result['is_dev'] else 'baseline'}"
+        row_class = "dev" if result["is_dev"] else "baseline"
+
+        # Check if this is a best configuration
+        if config_key in best_configs:
+            row_class += " best-config"
+
+        row = f"""
+        <tr class="{row_class}">
+            <td>{result['host']}</td>
+            <td>{result['type']}</td>
+            <td>{result['insert_qps']:,.0f}</td>
+            <td>{result['query_qps']:,.0f}</td>
+            <td>{result['timestamp']}</td>
+        </tr>
+        """
+        rows.append(row)
+
+    return "\n".join(rows)
+
+
+def generate_config_summary(results_dir):
+    """Generate configuration summary HTML from results"""
+    # Try to load first result file to get configuration
+    result_files = glob.glob(os.path.join(results_dir, "results_*.json"))
+    if not result_files:
+        return ""
+
+    try:
+        with open(result_files[0], "r") as f:
+            data = json.load(f)
+            config = data.get("config", {})
+
+            # Format configuration details
+            config_html = """
+        <div class="config-box">
+            <h3>Test Configuration</h3>
+            <ul>
+                <li><strong>Vector Dataset Size:</strong> {:,} vectors</li>
+                <li><strong>Vector Dimensions:</strong> {}</li>
+                <li><strong>Index Type:</strong> {} (M={}, ef_construction={}, ef={})</li>
+                <li><strong>Benchmark Runtime:</strong> {} seconds</li>
+                <li><strong>Batch Size:</strong> {:,}</li>
+                <li><strong>Test Iterations:</strong> {} runs with identical configuration</li>
+            </ul>
+        </div>
+            """.format(
+                config.get("vector_dataset_size", "N/A"),
+                config.get("vector_dimensions", "N/A"),
+                config.get("index_type", "N/A"),
+                config.get("index_hnsw_m", "N/A"),
+                config.get("index_hnsw_ef_construction", "N/A"),
+                config.get("index_hnsw_ef", "N/A"),
+                config.get("benchmark_runtime", "N/A"),
+                config.get("batch_size", "N/A"),
+                len(result_files),
+            )
+            return config_html
+    except Exception as e:
+        print(f"Warning: Could not generate config summary: {e}")
+        return ""
+
+
+def find_performance_trend_graphs(graphs_dir):
+    """Find performance trend graphs"""
+    graphs = []
+    # Look for filesystem-specific graphs in multi-fs mode
+    for fs in ["xfs", "ext4", "btrfs"]:
+        graph_path = f"{fs}_performance_trends.png"
+        if os.path.exists(os.path.join(graphs_dir, graph_path)):
+            graphs.append(
+                f'<div class="graph-container"><img src="graphs/{graph_path}" alt="{fs.upper()} Performance Trends"></div>'
+            )
+    # Fall back to simple performance trends for single mode
+    if not graphs and os.path.exists(
+        os.path.join(graphs_dir, "performance_trends.png")
+    ):
+        graphs.append(
+            '<div class="graph-container"><img src="graphs/performance_trends.png" alt="Performance Trends"></div>'
+        )
+    return "\n".join(graphs)
+
+
+def generate_html_report(results_dir, graphs_dir, output_path):
+    """Generate the HTML report"""
+    # Load summary
+    summary = load_summary(graphs_dir)
+    if not summary:
+        print("Warning: No summary.json found")
+        summary = {
+            "total_tests": 0,
+            "filesystems_tested": [],
+            "performance_summary": {
+                "best_insert_qps": {"value": 0, "config": "N/A"},
+                "best_query_qps": {"value": 0, "config": "N/A"},
+            },
+        }
+
+    # Load detailed results
+    results = load_results(results_dir)
+
+    # Find best configurations
+    best_configs = set()
+    if summary["performance_summary"]["best_insert_qps"]["config"]:
+        best_configs.add(summary["performance_summary"]["best_insert_qps"]["config"])
+    if summary["performance_summary"]["best_query_qps"]["config"]:
+        best_configs.add(summary["performance_summary"]["best_query_qps"]["config"])
+
+    # Check if multi-filesystem testing is enabled (more than one filesystem)
+    filesystems_tested = summary.get("filesystems_tested", [])
+    is_multifs_enabled = len(filesystems_tested) > 1
+
+    # Generate conditional sections based on multi-fs status
+    if is_multifs_enabled:
+        filesystem_nav_items = """
+            <li><a href="#filesystem-comparison">Filesystem Comparison</a></li>
+            <li><a href="#block-size-analysis">Block Size Analysis</a></li>"""
+
+        filesystem_comparison_section = """<div id="filesystem-comparison" class="section">
+        <h2>Filesystem Performance Comparison</h2>
+        <p>Comparison of vector database performance across different filesystems, showing both baseline and development kernel results.</p>
+        <div class="graph-container">
+            <img src="graphs/filesystem_comparison.png" alt="Filesystem Comparison">
+        </div>
+    </div>"""
+
+        block_size_analysis_section = """<div id="block-size-analysis" class="section">
+        <h2>XFS Block Size Analysis</h2>
+        <p>Performance analysis of XFS filesystem with different block sizes (4K, 16K, 32K, 64K).</p>
+        <div class="graph-container">
+            <img src="graphs/xfs_block_size_analysis.png" alt="XFS Block Size Analysis">
+        </div>
+    </div>"""
+
+        # Multi-fs mode: show filesystem info
+        fourth_card_title = "Filesystems Tested"
+        fourth_card_value = str(len(filesystems_tested))
+        fourth_card_label = ", ".join(filesystems_tested).upper()
+    else:
+        # Single filesystem mode - hide multi-fs sections
+        filesystem_nav_items = ""
+        filesystem_comparison_section = ""
+        block_size_analysis_section = ""
+
+        # Single mode: show test iterations
+        fourth_card_title = "Test Iterations"
+        fourth_card_value = str(summary["total_tests"])
+        fourth_card_label = "Identical Configuration Runs"
+
+    # Generate configuration summary
+    config_summary = generate_config_summary(results_dir)
+
+    # Generate HTML
+    html_content = HTML_TEMPLATE.format(
+        timestamp=datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
+        total_tests=summary["total_tests"],
+        best_insert_qps=f"{summary['performance_summary']['best_insert_qps']['value']:,.0f}",
+        best_insert_config=summary["performance_summary"]["best_insert_qps"]["config"],
+        best_query_qps=f"{summary['performance_summary']['best_query_qps']['value']:,.0f}",
+        best_query_config=summary["performance_summary"]["best_query_qps"]["config"],
+        fourth_card_title=fourth_card_title,
+        fourth_card_value=fourth_card_value,
+        fourth_card_label=fourth_card_label,
+        filesystem_nav_items=filesystem_nav_items,
+        filesystem_comparison_section=filesystem_comparison_section,
+        block_size_analysis_section=block_size_analysis_section,
+        config_summary=config_summary,
+        performance_trend_graphs=find_performance_trend_graphs(graphs_dir),
+        table_rows=generate_table_rows(results, best_configs),
+    )
+
+    # Write HTML file
+    with open(output_path, "w") as f:
+        f.write(html_content)
+
+    print(f"HTML report generated: {output_path}")
+
+
+def main():
+    if len(sys.argv) < 4:
+        print("Usage: generate_html_report.py <results_dir> <graphs_dir> <output_html>")
+        sys.exit(1)
+
+    results_dir = sys.argv[1]
+    graphs_dir = sys.argv[2]
+    output_html = sys.argv[3]
+
+    generate_html_report(results_dir, graphs_dir, output_html)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.50.1


  reply	other threads:[~2025-08-27  9:32 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-27  9:31 [PATCH 0/2] kdevops: add milvus with minio support Luis Chamberlain
2025-08-27  9:32 ` Luis Chamberlain [this message]
2025-08-27  9:32 ` [PATCH 2/2] ai: add multi-filesystem testing support for Milvus benchmarks Luis Chamberlain
2025-08-27 14:47   ` Chuck Lever
2025-08-27 19:24     ` Luis Chamberlain
2025-09-01 20:11   ` Daniel Gomez
2025-09-01 20:27     ` Luis Chamberlain
2025-08-29  2:05 ` [PATCH 0/2] kdevops: add milvus with minio support Luis Chamberlain
