From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 21A6E1CF5C6 for ; Tue, 12 Aug 2025 00:42:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754959378; cv=none; b=dSZxVTrSDh9NlNilLyQ+bsLJ+CutC1eRWxZFOqrc7hPEWKXFSlpxjDmpKwHz67+9DjA94bOZcHMgNQonToP9gC+CkRCxGykSaCzhvVQjN2f4MeL5w1Bbz5v/gYcxUO2J+ASNCV1m0zwDWDcqYfKABUiDDsH0MFZj+12WfGdEetw= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754959378; c=relaxed/simple; bh=ZCMnsy8/UpaEPxkMsM+mydXtBr3yx6bpR910f1Bwir8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=duhMEMf9sK2CmRPTdIwQ6pTPUbJpqAWRDDdtMhgdPx5vJ0V4tD+97mvbBF7NUpOt/i7XTzSqP2hTpZZnSjCbdeYU5suW8wpVGfoCpjkHB0Kbz78Si5dDaGj2IQ/lBevi/DIM2G2tW3EwBzSJv3v1WQrc0ZPS/KC3fSor/dLeeOc= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=kI9OZSsF; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="kI9OZSsF" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc: To:From:Reply-To:Content-ID:Content-Description; bh=9A2E9yO53SPy6TzXxsNGj3wmONvuzaQEbDd33AYjtk8=; b=kI9OZSsFL1z7kLJqqMX33A1ZFz 0S0OoEUzOkH2gGfIfbeVl2A56EogXgoqrDZLdRGrPrkt81c4OlNAfarIoKHiUh92jOQTovcCBmrrM Zi9m30ekYGxIxEGzcWjPYzTSRfjzd1h2M5+/dOaC1gKDF5pq03g5hheKONJrfuveShm0OiHAAbAoK scGUSxv+vxDvkmP+HY3sxUaxdWODzwFcUtFWYAwFyhXiCg3+sEiIf0nsaxgtIt/OmkHLMOimkhDn6 Ivzf+/NiaoBuoG6GeJbNXb5THQ2e74apCJTNviGWPIjk2HAj8iKfMYEjhc2xDfWqnwwjk5iV5JgAy bKsvdUyQ==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98.2 #2 (Red Hat Linux)) id 1uld6L-00000009T2X-3HMB; Tue, 12 Aug 2025 00:42:53 +0000 From: Luis Chamberlain To: Chuck Lever , Daniel Gomez , kdevops@lists.linux.dev Cc: Luis Chamberlain Subject: [PATCH 2/2] mmtests: add AB testing and comparison support Date: Mon, 11 Aug 2025 17:42:51 -0700 Message-ID: <20250812004252.2256571-3-mcgrof@kernel.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250812004252.2256571-1-mcgrof@kernel.org> References: <20250812004252.2256571-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: kdevops@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: Luis Chamberlain This commit introduces comprehensive A/B testing and comparison capabilities for the mmtests workflow in kdevops, enabling performance regression detection between baseline and development kernels. Key Features Added: - A/B testing configuration support for mmtests workflow - Automated performance comparison between baseline and dev nodes - Visual performance analysis with graph generation - HTML reports with embedded performance graphs New Components: 1. Defconfigs: - mmtests-ab-testing: Basic A/B testing setup - mmtests-ab-testing-thpcompact: Advanced config with monitoring 2. Comparison Infrastructure (playbooks/roles/mmtests_compare/): - Automated result collection from baseline and dev nodes - Local mmtests repository management with patch support - Multiple comparison output formats (HTML, text, graphs) - Shell scripts for graph generation and HTML embedding 3. Playbook Integration: - mmtests-compare.yml: Orchestrates the comparison workflow - Updated mmtests.yml to target mmtests group specifically - Enhanced hosts template with localhost and mmtests group 4. Result Visualization: - Performance graphs (main, sorted, smoothed trends) - Monitor data visualization (vmstat, mpstat, proc stats) - Professional HTML reports with embedded graphs - Comprehensive comparison tables with statistical analysis 5. Workflow Enhancements: - Support for applying patches from fixes directory - Python script for advanced graph generation - Makefile targets for comparison workflow - Results organization in workflows/mmtests/results/ Technical Improvements: - Added localhost to mmtests hosts template for local operations - Added dedicated mmtests group definition in hosts template - Support for applying patches from fixes directory - Robust error handling in comparison scripts - Dependency management for Perl and Python tools - Temporary file management in /tmp for comparisons Included Patches: - Fix undefined array reference in mmtests compare - Fix library order in thpcompact gcc command The implementation supports the standard kdevops A/B testing pattern where baseline nodes run the stable kernel and dev nodes run the development kernel, with automated comparison and visualization of performance differences between them. Usage: make defconfig-mmtests-ab-testing make bringup make mmtests make mmtests-compare This enables developers to quickly identify performance regressions and improvements between kernel versions with professional-quality reports and visualizations. Generated-by: Claude AI Signed-off-by: Luis Chamberlain --- .gitignore | 2 + defconfigs/mmtests-ab-testing | 22 + defconfigs/mmtests-ab-testing-thpcompact | 31 + playbooks/mmtests-compare.yml | 5 + playbooks/mmtests.yml | 2 +- .../roles/gen_hosts/templates/mmtests.j2 | 10 + playbooks/roles/mmtests/tasks/main.yaml | 58 +- .../roles/mmtests_compare/defaults/main.yml | 6 + .../mmtests_compare/files/apply_patch.sh | 32 + .../files/embed_graphs_in_html.sh | 100 +++ .../mmtests_compare/files/generate_graphs.sh | 148 +++++ .../files/generate_html_with_graphs.sh | 89 +++ .../mmtests_compare/files/run_comparison.sh | 58 ++ .../roles/mmtests_compare/tasks/main.yml | 472 ++++++++++++++ .../templates/comparison_report.html.j2 | 414 ++++++++++++ scripts/generate_mmtests_graphs.py | 598 ++++++++++++++++++ workflows/mmtests/Makefile | 19 +- ...fined-array-reference-when-no-operat.patch | 46 ++ ...act-fix-library-order-in-gcc-command.patch | 33 + 19 files changed, 2138 insertions(+), 7 deletions(-) create mode 100644 defconfigs/mmtests-ab-testing create mode 100644 defconfigs/mmtests-ab-testing-thpcompact create mode 100644 playbooks/mmtests-compare.yml create mode 100644 playbooks/roles/mmtests_compare/defaults/main.yml create mode 100755 playbooks/roles/mmtests_compare/files/apply_patch.sh create mode 100755 playbooks/roles/mmtests_compare/files/embed_graphs_in_html.sh create mode 100755 playbooks/roles/mmtests_compare/files/generate_graphs.sh create mode 100755 playbooks/roles/mmtests_compare/files/generate_html_with_graphs.sh create mode 100755 playbooks/roles/mmtests_compare/files/run_comparison.sh create mode 100644 playbooks/roles/mmtests_compare/tasks/main.yml create mode 100644 playbooks/roles/mmtests_compare/templates/comparison_report.html.j2 create mode 100644 scripts/generate_mmtests_graphs.py create mode 100644 workflows/mmtests/fixes/0001-compare-Fix-undefined-array-reference-when-no-operat.patch create mode 100644 workflows/mmtests/fixes/0002-thpcompact-fix-library-order-in-gcc-command.patch diff --git a/.gitignore b/.gitignore index 2e28c3f7..095880ab 100644 --- a/.gitignore +++ b/.gitignore @@ -67,6 +67,8 @@ workflows/ltp/results/ workflows/nfstest/results/ workflows/sysbench/results/ +workflows/mmtests/results/ +tmp playbooks/roles/linux-mirror/linux-mirror-systemd/*.service playbooks/roles/linux-mirror/linux-mirror-systemd/*.timer diff --git a/defconfigs/mmtests-ab-testing b/defconfigs/mmtests-ab-testing new file mode 100644 index 00000000..5d4dd2db --- /dev/null +++ b/defconfigs/mmtests-ab-testing @@ -0,0 +1,22 @@ +CONFIG_GUESTFS=y +CONFIG_LIBVIRT=y + +CONFIG_WORKFLOWS=y +CONFIG_WORKFLOW_LINUX_CUSTOM=y + +CONFIG_BOOTLINUX=y + +# Enable baseline and dev testing +CONFIG_KDEVOPS_BASELINE_AND_DEV=y + +# Enable mmtests workflow +CONFIG_WORKFLOWS_TESTS=y +CONFIG_WORKFLOWS_LINUX_TESTS=y +CONFIG_WORKFLOWS_DEDICATED_WORKFLOW=y +CONFIG_KDEVOPS_WORKFLOW_DEDICATE_MMTESTS=y + +# mmtests configuration - using defaults +CONFIG_MMTESTS_ENABLE_THPCOMPACT=y + +# Filesystem for tests +CONFIG_MMTESTS_FS_XFS=y diff --git a/defconfigs/mmtests-ab-testing-thpcompact b/defconfigs/mmtests-ab-testing-thpcompact new file mode 100644 index 00000000..cbcb30b9 --- /dev/null +++ b/defconfigs/mmtests-ab-testing-thpcompact @@ -0,0 +1,31 @@ +CONFIG_GUESTFS=y +CONFIG_LIBVIRT=y + +CONFIG_WORKFLOWS=y +CONFIG_WORKFLOW_LINUX_CUSTOM=y + +CONFIG_BOOTLINUX=y + +# Enable baseline and dev testing +CONFIG_KDEVOPS_BASELINE_AND_DEV=y + +# Enable A/B testing with different kernel references +CONFIG_BOOTLINUX_AB_DIFFERENT_REF=y + +# Enable mmtests workflow +CONFIG_WORKFLOWS_TESTS=y +CONFIG_WORKFLOWS_LINUX_TESTS=y +CONFIG_WORKFLOWS_DEDICATED_WORKFLOW=y +CONFIG_KDEVOPS_WORKFLOW_DEDICATE_MMTESTS=y + +# mmtests configuration +CONFIG_MMTESTS_ENABLE_THPCOMPACT=y +CONFIG_MMTESTS_ITERATIONS=5 +CONFIG_MMTESTS_MONITOR_INTERVAL=1 +CONFIG_MMTESTS_MONITOR_ENABLE_FTRACE=y +CONFIG_MMTESTS_MONITOR_ENABLE_PROC_MONITORING=y +CONFIG_MMTESTS_MONITOR_ENABLE_MPSTAT=y +CONFIG_MMTESTS_PRETEST_THP_SETTING="always" + +# Filesystem for tests +CONFIG_MMTESTS_FS_XFS=y diff --git a/playbooks/mmtests-compare.yml b/playbooks/mmtests-compare.yml new file mode 100644 index 00000000..7e948672 --- /dev/null +++ b/playbooks/mmtests-compare.yml @@ -0,0 +1,5 @@ +--- +- hosts: localhost + roles: + - role: mmtests_compare + when: kdevops_baseline_and_dev|bool diff --git a/playbooks/mmtests.yml b/playbooks/mmtests.yml index f66e65db..4e395db6 100644 --- a/playbooks/mmtests.yml +++ b/playbooks/mmtests.yml @@ -1,4 +1,4 @@ --- -- hosts: all +- hosts: mmtests roles: - role: mmtests diff --git a/playbooks/roles/gen_hosts/templates/mmtests.j2 b/playbooks/roles/gen_hosts/templates/mmtests.j2 index d32ffe40..1252fe87 100644 --- a/playbooks/roles/gen_hosts/templates/mmtests.j2 +++ b/playbooks/roles/gen_hosts/templates/mmtests.j2 @@ -1,4 +1,5 @@ [all] +localhost ansible_connection=local {% for test_type in mmtests_enabled_test_types %} {{ kdevops_host_prefix }}-{{ test_type }} {% if kdevops_baseline_and_dev %} @@ -21,3 +22,12 @@ ansible_python_interpreter = "{{ kdevops_python_interpreter }}" {% endif %} [dev:vars] ansible_python_interpreter = "{{ kdevops_python_interpreter }}" +[mmtests] +{% for test_type in mmtests_enabled_test_types %} +{{ kdevops_host_prefix }}-{{ test_type }} +{% if kdevops_baseline_and_dev %} +{{ kdevops_host_prefix }}-{{ test_type }}-dev +{% endif %} +{% endfor %} +[mmtests:vars] +ansible_python_interpreter = "{{ kdevops_python_interpreter }}" diff --git a/playbooks/roles/mmtests/tasks/main.yaml b/playbooks/roles/mmtests/tasks/main.yaml index 93bc4bd9..199c8bdd 100644 --- a/playbooks/roles/mmtests/tasks/main.yaml +++ b/playbooks/roles/mmtests/tasks/main.yaml @@ -21,7 +21,6 @@ path: "{{ data_path }}" owner: "{{ data_user }}" group: "{{ data_group }}" - recurse: yes state: directory - name: Clone mmtests repository @@ -32,6 +31,63 @@ version: "{{ mmtests_git_version }}" force: yes +- name: Check if mmtests fixes directory exists + tags: [ 'setup' ] + delegate_to: localhost + stat: + path: "{{ topdir_path }}/workflows/mmtests/fixes/" + register: fixes_dir + run_once: false + +- name: Find mmtests patches in fixes directory + tags: [ 'setup' ] + delegate_to: localhost + find: + paths: "{{ topdir_path }}/workflows/mmtests/fixes/" + patterns: "*.patch" + register: mmtests_patches + when: fixes_dir.stat.exists + run_once: false + +- name: Copy patches to remote host + tags: [ 'setup' ] + become: yes + become_method: sudo + copy: + src: "{{ item.path }}" + dest: "/tmp/{{ item.path | basename }}" + mode: '0644' + with_items: "{{ mmtests_patches.files }}" + when: + - fixes_dir.stat.exists + - mmtests_patches.files | length > 0 + +- name: Apply mmtests patches on remote host + tags: [ 'setup' ] + become: yes + become_method: sudo + shell: | + cd {{ mmtests_data_dir }} + git am /tmp/{{ item.path | basename }} + with_items: "{{ mmtests_patches.files }}" + when: + - fixes_dir.stat.exists + - mmtests_patches.files | length > 0 + ignore_errors: true + register: patch_results + +- name: Report patch application results + tags: [ 'setup' ] + debug: + msg: | + Applied {{ mmtests_patches.files | length | default(0) }} patches from fixes directory: + {% for patch in mmtests_patches.files | default([]) %} + - {{ patch.path | basename }} + {% endfor %} + when: + - fixes_dir.stat.exists + - mmtests_patches.files | length > 0 + - name: Generate mmtests configuration tags: [ 'setup' ] become: yes diff --git a/playbooks/roles/mmtests_compare/defaults/main.yml b/playbooks/roles/mmtests_compare/defaults/main.yml new file mode 100644 index 00000000..201a5278 --- /dev/null +++ b/playbooks/roles/mmtests_compare/defaults/main.yml @@ -0,0 +1,6 @@ +--- +# mmtests compare role defaults +mmtests_data_dir: "{{ data_path }}/mmtests" +mmtests_results_dir: "{{ mmtests_data_dir }}/work/log/{{ inventory_hostname }}-{{ kernel_version.stdout }}" +# Git URL is from extra_vars.yaml, fallback to GitHub +mmtests_git_url: "{{ mmtests_git_url | default('https://github.com/gormanm/mmtests.git') }}" diff --git a/playbooks/roles/mmtests_compare/files/apply_patch.sh b/playbooks/roles/mmtests_compare/files/apply_patch.sh new file mode 100755 index 00000000..172a5bd1 --- /dev/null +++ b/playbooks/roles/mmtests_compare/files/apply_patch.sh @@ -0,0 +1,32 @@ +#!/bin/bash +# Script to apply mmtests patches with proper error handling + +TOPDIR="$1" +PATCH_FILE="$2" + +cd "$TOPDIR/tmp/mmtests" || exit 1 + +PATCH_NAME=$(basename "$PATCH_FILE") + +# Check if patch is already applied by looking for the specific fix +if grep -q "if (@operations > 0 && exists" bin/lib/MMTests/Compare.pm 2>/dev/null; then + echo "Patch $PATCH_NAME appears to be already applied" + exit 0 +fi + +# Try to apply with git apply first +if git apply --check "$PATCH_FILE" 2>/dev/null; then + git apply "$PATCH_FILE" + echo "Applied patch with git: $PATCH_NAME" + exit 0 +fi + +# Try with patch command as fallback +if patch -p1 --dry-run < "$PATCH_FILE" >/dev/null 2>&1; then + patch -p1 < "$PATCH_FILE" + echo "Applied patch with patch command: $PATCH_NAME" + exit 0 +fi + +echo "Failed to apply $PATCH_NAME - may already be applied or conflicting" +exit 0 # Don't fail the playbook diff --git a/playbooks/roles/mmtests_compare/files/embed_graphs_in_html.sh b/playbooks/roles/mmtests_compare/files/embed_graphs_in_html.sh new file mode 100755 index 00000000..ee591f0d --- /dev/null +++ b/playbooks/roles/mmtests_compare/files/embed_graphs_in_html.sh @@ -0,0 +1,100 @@ +#!/bin/bash +# Script to embed graphs in the comparison HTML + +COMPARISON_HTML="$1" +COMPARE_DIR="$2" + +# Check if comparison.html exists +if [ ! -f "$COMPARISON_HTML" ]; then + echo "ERROR: $COMPARISON_HTML not found" + exit 1 +fi + +# Create a backup of the original +cp "$COMPARISON_HTML" "${COMPARISON_HTML}.bak" + +# Create new HTML with embedded graphs +{ + echo '' + echo '' + echo 'mmtests Comparison with Graphs' + echo '' + echo '' + echo '
' + echo '

mmtests Performance Comparison

' + + # Add graphs section if any graphs exist + if ls "$COMPARE_DIR"/*.png >/dev/null 2>&1; then + echo '
' + echo '

Performance Graphs

' + echo '
' + + # Main benchmark graph first + for graph in "$COMPARE_DIR"/graph-*compact.png; do + if [ -f "$graph" ] && [[ ! "$graph" =~ -sorted|-smooth ]]; then + echo '
' + echo '

Main Performance Comparison

' + echo "Main Performance" + echo '
' + fi + done + + # Sorted graph + if [ -f "$COMPARE_DIR/graph-thpcompact-sorted.png" ]; then + echo '
' + echo '

Sorted Samples

' + echo 'Sorted Samples' + echo '
' + fi + + # Smooth graph + if [ -f "$COMPARE_DIR/graph-thpcompact-smooth.png" ]; then + echo '
' + echo '

Smoothed Trend

' + echo 'Smoothed Trend' + echo '
' + fi + + # Any monitor graphs + for graph in "$COMPARE_DIR"/graph-vmstat.png "$COMPARE_DIR"/graph-proc-vmstat.png "$COMPARE_DIR"/graph-mpstat.png; do + if [ -f "$graph" ]; then + graphname=$(basename "$graph" .png | sed 's/graph-//') + echo '
' + echo "

${graphname^^} Monitor

" + echo "$graphname" + echo '
' + fi + done + + echo '
' + echo '
' + fi + + # Add the original comparison table + echo '
' + echo '

Detailed Comparison Table

' + cat "$COMPARISON_HTML" + echo '
' + + echo '
' +} > "${COMPARISON_HTML}.new" + +# Replace the original with the new version +mv "${COMPARISON_HTML}.new" "$COMPARISON_HTML" + +echo "Graphs embedded in $COMPARISON_HTML" +exit 0 diff --git a/playbooks/roles/mmtests_compare/files/generate_graphs.sh b/playbooks/roles/mmtests_compare/files/generate_graphs.sh new file mode 100755 index 00000000..28b06911 --- /dev/null +++ b/playbooks/roles/mmtests_compare/files/generate_graphs.sh @@ -0,0 +1,148 @@ +#!/bin/bash +# Script to generate mmtests graphs with proper error handling + +set -e + +TOPDIR="$1" +BENCHMARK="$2" +BASELINE_NAME="$3" +DEV_NAME="$4" +OUTPUT_DIR="$5" + +cd "$TOPDIR/tmp/mmtests" + +echo "Generating graphs for $BENCHMARK comparison" + +# Create output directory if it doesn't exist +mkdir -p "$OUTPUT_DIR" + +# Set up kernel list for graph generation +KERNEL_LIST="$BASELINE_NAME,$DEV_NAME" + +# Check if we have the required tools +if [ ! -f ./bin/graph-mmtests.sh ]; then + echo "ERROR: graph-mmtests.sh not found" + exit 1 +fi + +if [ ! -f ./bin/extract-mmtests.pl ]; then + echo "ERROR: extract-mmtests.pl not found" + exit 1 +fi + +# Generate the main benchmark comparison graph +echo "Generating main benchmark graph..." +./bin/graph-mmtests.sh \ + -d work/log/ \ + -b "$BENCHMARK" \ + -n "$KERNEL_LIST" \ + --format png \ + --output "$OUTPUT_DIR/graph-$BENCHMARK" \ + --title "$BENCHMARK Performance Comparison" 2>&1 | tee "$OUTPUT_DIR/graph-generation.log" + +# Check if the graph was created +if [ -f "$OUTPUT_DIR/graph-$BENCHMARK.png" ]; then + echo "Main benchmark graph created: graph-$BENCHMARK.png" +else + echo "WARNING: Main benchmark graph was not created" +fi + +# Generate sorted sample graphs +echo "Generating sorted sample graph..." +./bin/graph-mmtests.sh \ + -d work/log/ \ + -b "$BENCHMARK" \ + -n "$KERNEL_LIST" \ + --format png \ + --output "$OUTPUT_DIR/graph-$BENCHMARK-sorted" \ + --title "$BENCHMARK Performance (Sorted)" \ + --sort-samples-reverse \ + --x-label "Sorted samples" 2>&1 | tee -a "$OUTPUT_DIR/graph-generation.log" + +# Generate smooth curve graphs +echo "Generating smooth curve graph..." +./bin/graph-mmtests.sh \ + -d work/log/ \ + -b "$BENCHMARK" \ + -n "$KERNEL_LIST" \ + --format png \ + --output "$OUTPUT_DIR/graph-$BENCHMARK-smooth" \ + --title "$BENCHMARK Performance (Smoothed)" \ + --with-smooth 2>&1 | tee -a "$OUTPUT_DIR/graph-generation.log" + +# Generate monitor graphs if data is available +echo "Checking for monitor data..." + +# Function to generate monitor graph +generate_monitor_graph() { + local monitor_type="$1" + local title="$2" + + # Check if monitor data exists for any of the kernels + for kernel in $BASELINE_NAME $DEV_NAME; do + if [ -f "work/log/$kernel/$monitor_type-$BENCHMARK.gz" ] || [ -f "work/log/$kernel/$monitor_type.gz" ]; then + echo "Generating $monitor_type graph..." + ./bin/graph-mmtests.sh \ + -d work/log/ \ + -b "$BENCHMARK" \ + -n "$KERNEL_LIST" \ + --format png \ + --output "$OUTPUT_DIR/graph-$monitor_type" \ + --title "$title" \ + --print-monitor "$monitor_type" 2>&1 | tee -a "$OUTPUT_DIR/graph-generation.log" + + if [ -f "$OUTPUT_DIR/graph-$monitor_type.png" ]; then + echo "Monitor graph created: graph-$monitor_type.png" + fi + break + fi + done +} + +# Generate various monitor graphs +generate_monitor_graph "vmstat" "VM Statistics" +generate_monitor_graph "proc-vmstat" "Process VM Statistics" +generate_monitor_graph "mpstat" "CPU Statistics" +generate_monitor_graph "proc-buddyinfo" "Buddy Info" +generate_monitor_graph "proc-pagetypeinfo" "Page Type Info" + +# List all generated graphs +echo "" +echo "Generated graphs:" +ls -la "$OUTPUT_DIR"/*.png 2>/dev/null || echo "No PNG files generated" + +# Create an HTML file that embeds all the graphs +cat > "$OUTPUT_DIR/graphs.html" << 'EOF' + + +mmtests Graphs + + + +
+

mmtests Performance Graphs

+EOF + +# Add each graph to the HTML file +for graph in "$OUTPUT_DIR"/*.png; do + if [ -f "$graph" ]; then + graphname=$(basename "$graph" .png) + echo "
" >> "$OUTPUT_DIR/graphs.html" + echo "

$graphname

" >> "$OUTPUT_DIR/graphs.html" + echo "$graphname" >> "$OUTPUT_DIR/graphs.html" + echo "
" >> "$OUTPUT_DIR/graphs.html" + fi +done + +echo "
" >> "$OUTPUT_DIR/graphs.html" + +echo "Graph generation complete. HTML summary: $OUTPUT_DIR/graphs.html" +exit 0 diff --git a/playbooks/roles/mmtests_compare/files/generate_html_with_graphs.sh b/playbooks/roles/mmtests_compare/files/generate_html_with_graphs.sh new file mode 100755 index 00000000..a334d0fa --- /dev/null +++ b/playbooks/roles/mmtests_compare/files/generate_html_with_graphs.sh @@ -0,0 +1,89 @@ +#!/bin/bash +# Script to generate HTML report with embedded graphs using compare-kernels.sh + +set -e + +TOPDIR="$1" +BENCHMARK="$2" +BASELINE_NAME="$3" +DEV_NAME="$4" +OUTPUT_DIR="$5" + +cd "$TOPDIR/tmp/mmtests/work/log" + +echo "Generating HTML report with embedded graphs using compare-kernels.sh" + +# Ensure output directory is absolute path +if [[ "$OUTPUT_DIR" != /* ]]; then + OUTPUT_DIR="$TOPDIR/$OUTPUT_DIR" +fi + +# Create output directory if it doesn't exist +mkdir -p "$OUTPUT_DIR" + +# Check if compare-kernels.sh exists +if [ ! -f ../../compare-kernels.sh ]; then + echo "ERROR: compare-kernels.sh not found" + exit 1 +fi + +# Generate the HTML report with graphs +echo "Running compare-kernels.sh for $BASELINE_NAME vs $DEV_NAME" + +# Export R_TMPDIR for caching R objects (performance optimization) +export R_TMPDIR="$TOPDIR/tmp/mmtests_r_tmp" +mkdir -p "$R_TMPDIR" + +# Suppress package installation prompts by pre-answering +export MMTESTS_AUTO_PACKAGE_INSTALL=never + +# Run compare-kernels.sh with HTML format +# The HTML output goes to stdout, graphs go to output-dir +echo "Generating graphs and HTML report..." +../../compare-kernels.sh \ + --baseline "$BASELINE_NAME" \ + --compare "$DEV_NAME" \ + --format html \ + --output-dir "$OUTPUT_DIR" \ + --report-title "$BENCHMARK Performance Comparison" \ + > "$OUTPUT_DIR/comparison.html" 2> "$OUTPUT_DIR/compare-kernels.log" + +# Check if the HTML was created +if [ -f "$OUTPUT_DIR/comparison.html" ] && [ -s "$OUTPUT_DIR/comparison.html" ]; then + echo "HTML report with graphs created: $OUTPUT_DIR/comparison.html" + + # Clean up any package installation artifacts from the HTML + # Remove lines about package installation + sed -i '/MMTests needs to install/d' "$OUTPUT_DIR/comparison.html" + sed -i '/dpkg-query: no packages found/d' "$OUTPUT_DIR/comparison.html" + sed -i '/E: Unable to locate package/d' "$OUTPUT_DIR/comparison.html" + sed -i '/WARNING: Failed to cleanly install/d' "$OUTPUT_DIR/comparison.html" + sed -i '/Reading package lists/d' "$OUTPUT_DIR/comparison.html" + sed -i '/Building dependency tree/d' "$OUTPUT_DIR/comparison.html" + sed -i '/Reading state information/d' "$OUTPUT_DIR/comparison.html" + sed -i '/Installed perl-File-Which/d' "$OUTPUT_DIR/comparison.html" + sed -i '/Unrecognised argument:/d' "$OUTPUT_DIR/comparison.html" +else + echo "ERROR: Failed to generate HTML report" + echo "Check $OUTPUT_DIR/compare-kernels.log for errors" + exit 1 +fi + +# Count generated graphs +PNG_COUNT=$(ls -1 "$OUTPUT_DIR"/*.png 2>/dev/null | wc -l) +echo "Generated $PNG_COUNT graph files" + +# List all generated files +echo "" +echo "Generated files:" +ls -la "$OUTPUT_DIR"/*.html 2>/dev/null | head -5 +echo "..." +ls -la "$OUTPUT_DIR"/*.png 2>/dev/null | head -10 + +# Clean up R temp directory +rm -rf "$R_TMPDIR" + +echo "" +echo "HTML report generation complete" +echo "Main report: $OUTPUT_DIR/comparison.html" +exit 0 diff --git a/playbooks/roles/mmtests_compare/files/run_comparison.sh b/playbooks/roles/mmtests_compare/files/run_comparison.sh new file mode 100755 index 00000000..b95bea63 --- /dev/null +++ b/playbooks/roles/mmtests_compare/files/run_comparison.sh @@ -0,0 +1,58 @@ +#!/bin/bash +# Script to run mmtests comparison with proper error handling + +set -e + +TOPDIR="$1" +BENCHMARK="$2" +BASELINE_NAME="$3" +DEV_NAME="$4" +OUTPUT_DIR="$5" + +cd "$TOPDIR/tmp/mmtests" + +# First, verify the script exists and is executable +if [ ! -f ./bin/compare-mmtests.pl ]; then + echo "ERROR: compare-mmtests.pl not found" + exit 1 +fi + +# Create output directory if it doesn't exist +mkdir -p "$OUTPUT_DIR" + +# Run the comparison with error checking for HTML output +echo "Running HTML comparison for $BASELINE_NAME vs $DEV_NAME" +./bin/compare-mmtests.pl \ + --directory work/log/ \ + --benchmark "$BENCHMARK" \ + --names "$BASELINE_NAME,$DEV_NAME" \ + --format html > "$OUTPUT_DIR/comparison.html" 2>&1 + +# Check if the output file was created and has content +if [ ! -s "$OUTPUT_DIR/comparison.html" ]; then + echo "WARNING: comparison.html is empty or not created" + # Check for the specific error we're trying to fix + if grep -q "Can't use an undefined value as an ARRAY reference" "$OUTPUT_DIR/comparison.html" 2>/dev/null; then + echo "ERROR: The patch to fix undefined array reference was not applied correctly" + exit 1 + fi +else + echo "HTML comparison completed successfully" +fi + +# Run text comparison +echo "Running text comparison for $BASELINE_NAME vs $DEV_NAME" +./bin/compare-mmtests.pl \ + --directory work/log/ \ + --benchmark "$BENCHMARK" \ + --names "$BASELINE_NAME,$DEV_NAME" \ + > "$OUTPUT_DIR/comparison.txt" 2>&1 + +# Verify the text output was created +if [ ! -s "$OUTPUT_DIR/comparison.txt" ]; then + echo "WARNING: comparison.txt is empty or not created" +else + echo "Text comparison completed successfully" +fi + +exit 0 diff --git a/playbooks/roles/mmtests_compare/tasks/main.yml b/playbooks/roles/mmtests_compare/tasks/main.yml new file mode 100644 index 00000000..9ddbbbe0 --- /dev/null +++ b/playbooks/roles/mmtests_compare/tasks/main.yml @@ -0,0 +1,472 @@ +--- +- name: Install Perl dependencies for mmtests compare on localhost (Debian/Ubuntu) + delegate_to: localhost + become: yes + become_method: sudo + apt: + name: + - perl + - perl-doc + - cpanminus + - libfile-which-perl + - libfile-slurp-perl + - libjson-perl + - liblist-moreutils-perl + - gnuplot + - python3-matplotlib + - python3-numpy + state: present + update_cache: true + when: ansible_facts['os_family']|lower == 'debian' + run_once: true + tags: ['compare', 'deps'] + +- name: Install additional Perl modules via CPAN on localhost (if needed) + delegate_to: localhost + become: yes + become_method: sudo + cpanm: + name: "{{ item }}" + with_items: + - File::Temp + when: ansible_facts['os_family']|lower == 'debian' + run_once: true + tags: ['compare', 'deps'] + ignore_errors: true + +- name: Install Perl dependencies for mmtests compare on localhost (SUSE) + delegate_to: localhost + become: yes + become_method: sudo + zypper: + name: + - perl + - perl-File-Which + - perl-File-Slurp + - perl-JSON + - perl-List-MoreUtils + - perl-Data-Dumper + - perl-Digest-MD5 + - perl-Getopt-Long + - perl-Pod-Usage + - perl-App-cpanminus + - gnuplot + - python3-matplotlib + - python3-numpy + state: present + when: ansible_facts['os_family']|lower == 'suse' + run_once: true + tags: ['compare', 'deps'] + +- name: Install Perl dependencies for mmtests compare on localhost (RedHat/Fedora) + delegate_to: localhost + become: yes + become_method: sudo + yum: + name: + - perl + - perl-File-Which + - perl-File-Slurp + - perl-JSON + - perl-List-MoreUtils + - perl-Data-Dumper + - perl-Digest-MD5 + - perl-Getopt-Long + - perl-Pod-Usage + - perl-App-cpanminus + - gnuplot + - python3-matplotlib + - python3-numpy + state: present + when: ansible_facts['os_family']|lower == 'redhat' + run_once: true + tags: ['compare', 'deps'] + +- name: Create required directories + delegate_to: localhost + ansible.builtin.file: + path: "{{ item }}" + state: directory + mode: '0755' + loop: + - "{{ topdir_path }}/workflows/mmtests/results/compare" + - "{{ topdir_path }}/tmp" + run_once: true + tags: ['compare'] + +- name: Clone mmtests repository locally + delegate_to: localhost + ansible.builtin.git: + repo: "{{ mmtests_git_url }}" + dest: "{{ topdir_path }}/tmp/mmtests" + version: "{{ mmtests_git_version | default('master') }}" + force: yes + run_once: true + tags: ['compare'] + +- name: Check if mmtests fixes directory exists + delegate_to: localhost + stat: + path: "{{ topdir_path }}/workflows/mmtests/fixes/" + register: fixes_dir + run_once: true + tags: ['compare'] + +- name: Find mmtests patches in fixes directory + delegate_to: localhost + find: + paths: "{{ topdir_path }}/workflows/mmtests/fixes/" + patterns: "*.patch" + register: mmtests_patches + when: fixes_dir.stat.exists + run_once: true + tags: ['compare'] + +- name: Apply mmtests patches if found + delegate_to: localhost + ansible.builtin.patch: + src: "{{ item.path }}" + basedir: "{{ topdir_path }}/tmp/mmtests" + strip: 1 + loop: "{{ mmtests_patches.files }}" + when: + - fixes_dir.stat.exists + - mmtests_patches.files | length > 0 + run_once: true + tags: ['compare'] + failed_when: false + register: patch_results + +- name: Get kernel versions from nodes + block: + - name: Get baseline kernel version + command: uname -r + register: baseline_kernel_version + delegate_to: "{{ groups['baseline'][0] }}" + run_once: true + + - name: Get dev kernel version + command: uname -r + register: dev_kernel_version + delegate_to: "{{ groups['dev'][0] }}" + run_once: true + when: + - groups['dev'] is defined + - groups['dev'] | length > 0 + tags: ['compare'] + +- name: Set node information facts + set_fact: + baseline_hostname: "{{ groups['baseline'][0] }}" + baseline_kernel: "{{ baseline_kernel_version.stdout }}" + dev_hostname: "{{ groups['dev'][0] }}" + dev_kernel: "{{ dev_kernel_version.stdout }}" + run_once: true + delegate_to: localhost + tags: ['compare'] + +- name: Create local results directories for mmtests data + delegate_to: localhost + ansible.builtin.file: + path: "{{ topdir_path }}/tmp/mmtests/work/log/{{ item }}" + state: directory + mode: '0755' + loop: + - "{{ baseline_hostname }}-{{ baseline_kernel }}" + - "{{ dev_hostname }}-{{ dev_kernel }}" + run_once: true + when: kdevops_baseline_and_dev|bool + tags: ['compare'] + +- name: Archive baseline results on remote + archive: + path: "{{ mmtests_data_dir }}/work/log/{{ baseline_hostname }}-{{ baseline_kernel }}" + dest: "/tmp/baseline-mmtests-results.tar.gz" + format: gz + delegate_to: "{{ groups['baseline'][0] }}" + run_once: true + tags: ['compare'] + +- name: Archive dev results on remote + archive: + path: "{{ mmtests_data_dir }}/work/log/{{ dev_hostname }}-{{ dev_kernel }}" + dest: "/tmp/dev-mmtests-results.tar.gz" + format: gz + delegate_to: "{{ groups['dev'][0] }}" + run_once: true + when: kdevops_baseline_and_dev|bool + tags: ['compare'] + +- name: Fetch baseline results to localhost + fetch: + src: "/tmp/baseline-mmtests-results.tar.gz" + dest: "{{ topdir_path }}/tmp/" + flat: yes + delegate_to: "{{ groups['baseline'][0] }}" + run_once: true + tags: ['compare'] + +- name: Fetch dev results to localhost + fetch: + src: "/tmp/dev-mmtests-results.tar.gz" + dest: "{{ topdir_path }}/tmp/" + flat: yes + delegate_to: "{{ groups['dev'][0] }}" + run_once: true + when: kdevops_baseline_and_dev|bool + tags: ['compare'] + +- name: Extract baseline results locally + delegate_to: localhost + unarchive: + src: "{{ topdir_path }}/tmp/baseline-mmtests-results.tar.gz" + dest: "{{ topdir_path }}/tmp/mmtests/work/log/" + remote_src: yes + run_once: true + tags: ['compare'] + +- name: Extract dev results locally + delegate_to: localhost + unarchive: + src: "{{ topdir_path }}/tmp/dev-mmtests-results.tar.gz" + dest: "{{ topdir_path }}/tmp/mmtests/work/log/" + remote_src: yes + run_once: true + when: kdevops_baseline_and_dev|bool + tags: ['compare'] + +- name: Run mmtests comparison + delegate_to: localhost + ansible.builtin.command: + cmd: | + ./bin/compare-mmtests.pl \ + --directory work/log/ \ + --benchmark {{ mmtests_test_type }} \ + --names {{ baseline_hostname }}-{{ baseline_kernel }},{{ dev_hostname }}-{{ dev_kernel }} + chdir: "{{ topdir_path }}/tmp/mmtests" + run_once: true + when: kdevops_baseline_and_dev|bool + register: comparison_text_output + tags: ['compare'] + +- name: Generate HTML comparison output + delegate_to: localhost + ansible.builtin.command: + cmd: | + ./bin/compare-mmtests.pl \ + --directory work/log/ \ + --benchmark {{ mmtests_test_type }} \ + --names {{ baseline_hostname }}-{{ baseline_kernel }},{{ dev_hostname }}-{{ dev_kernel }} \ + --format html + chdir: "{{ topdir_path }}/tmp/mmtests" + run_once: true + when: kdevops_baseline_and_dev|bool + register: comparison_html_output + tags: ['compare'] + +- name: Parse comparison data for template + delegate_to: localhost + set_fact: + comparison_metrics: [] + run_once: true + when: kdevops_baseline_and_dev|bool + tags: ['compare'] + +- name: Generate performance graphs using gnuplot + delegate_to: localhost + block: + - name: Check for available iterations data + find: + paths: "{{ topdir_path }}/tmp/mmtests/work/log/{{ item }}/{{ mmtests_test_type }}" + patterns: "*.gz" + recurse: yes + register: iteration_files + loop: + - "{{ baseline_hostname }}-{{ baseline_kernel }}" + - "{{ dev_hostname }}-{{ dev_kernel }}" + + - name: Extract iteration data files + ansible.builtin.unarchive: + src: "{{ item.path }}" + dest: "{{ item.path | dirname }}" + remote_src: yes + loop: "{{ iteration_files.results | map(attribute='files') | flatten }}" + when: iteration_files.results is defined + + - name: Generate comparison graphs with compare-kernels.sh + ansible.builtin.command: + cmd: | + ./compare-kernels.sh \ + --baseline {{ baseline_hostname }}-{{ baseline_kernel }} \ + --compare {{ dev_hostname }}-{{ dev_kernel }} \ + --output-dir {{ topdir_path }}/workflows/mmtests/results/compare + chdir: "{{ topdir_path }}/tmp/mmtests/work/log" + register: graph_generation + failed_when: false + environment: + MMTESTS_AUTO_PACKAGE_INSTALL: never + run_once: true + when: kdevops_baseline_and_dev|bool + tags: ['compare', 'graphs'] + +- name: Find generated graph files + delegate_to: localhost + find: + paths: "{{ topdir_path }}/workflows/mmtests/results/compare" + patterns: "*.png" + register: graph_files + run_once: true + tags: ['compare', 'graphs'] + +- name: Read graph files for embedding + delegate_to: localhost + slurp: + src: "{{ item.path }}" + register: graph_data + loop: "{{ graph_files.files[:10] }}" # Limit to first 10 graphs + when: graph_files.files is defined + run_once: true + tags: ['compare', 'graphs'] + +- name: Prepare graph data for template + delegate_to: localhost + set_fact: + performance_graphs: [] + run_once: true + when: graph_files.files is not defined or graph_files.files | length == 0 + tags: ['compare', 'graphs'] + +- name: Build graph data list + delegate_to: localhost + set_fact: + performance_graphs: "{{ performance_graphs | default([]) + [{'embedded_data': item.content, 'title': item.item.path | basename | regex_replace('.png', '')}] }}" + loop: "{{ graph_data.results | default([]) }}" + run_once: true + when: + - graph_data is defined + - graph_data.results is defined + tags: ['compare', 'graphs'] + +- name: Generate benchmark description + delegate_to: localhost + set_fact: + benchmark_description: | + {% if mmtests_test_type == 'thpcompact' %} +

thpcompact tests memory management performance, specifically:

+
    +
  • Base page (4KB) allocation performance
  • +
  • Huge page (2MB) allocation performance
  • +
  • Memory compaction efficiency
  • +
  • Threading scalability
  • +
+

Lower values indicate better (faster) performance.

+ {% elif mmtests_test_type == 'hackbench' %} +

hackbench measures scheduler and IPC performance through:

+
    +
  • Process/thread creation and destruction
  • +
  • Context switching overhead
  • +
  • Inter-process communication
  • +
  • Scheduler scalability
  • +
+

Lower values indicate better performance.

+ {% elif mmtests_test_type == 'kernbench' %} +

kernbench measures kernel compilation performance:

+
    +
  • Overall system performance
  • +
  • CPU and memory bandwidth
  • +
  • I/O subsystem performance
  • +
  • Parallel compilation efficiency
  • +
+

Lower compilation times indicate better performance.

+ {% else %} +

{{ mmtests_test_type }} benchmark for Linux kernel performance testing.

+ {% endif %} + run_once: true + tags: ['compare'] + +- name: Generate comparison report from template + delegate_to: localhost + template: + src: comparison_report.html.j2 + dest: "{{ topdir_path }}/workflows/mmtests/results/compare/comparison_report.html" + mode: '0644' + vars: + benchmark_name: "{{ mmtests_test_type }}" + test_description: "Performance Benchmark" + analysis_date: "{{ ansible_date_time.date }}" + analysis_time: "{{ ansible_date_time.time }}" + baseline_hostname: "{{ baseline_hostname }}" + baseline_kernel: "{{ baseline_kernel }}" + dev_hostname: "{{ dev_hostname }}" + dev_kernel: "{{ dev_kernel }}" + benchmark_description: "{{ benchmark_description | default('') }}" + raw_comparison_html: "{{ comparison_html_output.stdout | default('') }}" + comparison_data: "{{ comparison_metrics | default([]) }}" + performance_graphs: "{{ performance_graphs | default([]) }}" + monitor_graphs: [] + summary_stats: [] + run_once: true + when: kdevops_baseline_and_dev|bool + tags: ['compare'] + +- name: Save comparison outputs + delegate_to: localhost + copy: + content: "{{ item.content }}" + dest: "{{ item.dest }}" + mode: '0644' + loop: + - content: "{{ comparison_text_output.stdout | default('No comparison data') }}" + dest: "{{ topdir_path }}/workflows/mmtests/results/compare/comparison.txt" + - content: "{{ comparison_html_output.stdout | default('

No comparison data

') }}" + dest: "{{ topdir_path }}/workflows/mmtests/results/compare/comparison_raw.html" + run_once: true + when: kdevops_baseline_and_dev|bool + tags: ['compare'] + +- name: Copy full results to final location + delegate_to: localhost + ansible.builtin.copy: + src: "{{ topdir_path }}/tmp/mmtests/work/log/{{ item }}/" + dest: "{{ topdir_path }}/workflows/mmtests/results/{{ item.split('-')[0] }}/" + remote_src: yes + loop: + - "{{ baseline_hostname }}-{{ baseline_kernel }}" + - "{{ dev_hostname }}-{{ dev_kernel }}" + run_once: true + when: kdevops_baseline_and_dev|bool + tags: ['compare'] + +- name: Display comparison report location + debug: + msg: | + ๐ŸŽฏ mmtests Comparison Reports Generated: + + ๐Ÿ“Š Enhanced Analysis: + - Template-based HTML: {{ topdir_path }}/workflows/mmtests/results/compare/comparison_report.html + - PNG graphs: {{ graph_files.matched | default(0) }} files in {{ topdir_path }}/workflows/mmtests/results/compare/ + + ๐Ÿ“‹ Standard Reports: + - Raw HTML: {{ topdir_path }}/workflows/mmtests/results/compare/comparison_raw.html + - Text: {{ topdir_path }}/workflows/mmtests/results/compare/comparison.txt + + ๐Ÿ“ Full Results: + - Baseline: {{ topdir_path }}/workflows/mmtests/results/{{ baseline_hostname }}/ + - Dev: {{ topdir_path }}/workflows/mmtests/results/{{ dev_hostname }}/ + + ๐Ÿš€ Open comparison_report.html for the best analysis experience! + run_once: true + when: kdevops_baseline_and_dev|bool + tags: ['compare'] + +- name: Clean up temporary archives on remote nodes + file: + path: "/tmp/{{ item }}-mmtests-results.tar.gz" + state: absent + delegate_to: "{{ groups[item][0] }}" + loop: + - baseline + - dev + run_once: true + when: kdevops_baseline_and_dev|bool + tags: ['compare', 'cleanup'] diff --git a/playbooks/roles/mmtests_compare/templates/comparison_report.html.j2 b/playbooks/roles/mmtests_compare/templates/comparison_report.html.j2 new file mode 100644 index 00000000..55e6d407 --- /dev/null +++ b/playbooks/roles/mmtests_compare/templates/comparison_report.html.j2 @@ -0,0 +1,414 @@ + + + + mmtests Analysis: {{ baseline_hostname }}-{{ baseline_kernel }} vs {{ dev_hostname }}-{{ dev_kernel }} + + + +
+
+

mmtests Performance Analysis

+
{{ benchmark_name }} Benchmark Comparison
+
+ +
+
+
+

Baseline System

+
{{ baseline_hostname }}
+
{{ baseline_kernel }}
+
+
+

Development System

+
{{ dev_hostname }}
+
{{ dev_kernel }}
+
+
+

Test Type

+
{{ benchmark_name }}
+
{{ test_description }}
+
+
+

Analysis Date

+
{{ analysis_date }}
+
{{ analysis_time }}
+
+
+ + {% if benchmark_description %} +
+

๐Ÿ“Š About {{ benchmark_name }}

+ {{ benchmark_description | safe }} +
+ {% endif %} + + {% if summary_stats %} +
+
+

๐Ÿ“ˆ Performance Summary

+
+
+ {% for stat in summary_stats %} +
+
{{ stat.value }}
+
{{ stat.label }}
+
+ {% endfor %} +
+
+ {% endif %} + + {% if comparison_data %} +
+
+

๐Ÿ“‹ Detailed Performance Metrics

+
+ + + + + + + + + + + + {% for metric in comparison_data %} + + + + + + + + {% endfor %} + +
MetricBaselineDevelopmentDifferenceChange %
{{ metric.name }}{{ metric.baseline }}{{ metric.dev }} + {{ metric.diff }} + {% if metric.is_improvement %} + โ†“ Better + {% elif metric.is_regression %} + โ†‘ Worse + {% else %} + โ†’ Same + {% endif %} + {{ metric.percent_change }}%
+
+ {% endif %} + + {% if performance_graphs %} +
+
+

๐Ÿ“Š Performance Visualization

+
+
+ {% for graph in performance_graphs %} +
+
+

{{ graph.title }}

+
+
+ {% if graph.embedded_data %} + {{ graph.title }} + {% elif graph.path %} + {{ graph.title }} + {% else %} +
+

Graph pending generation

+
+ {% endif %} +
+
+ {% endfor %} +
+
+ {% endif %} + + {% if monitor_graphs %} +
+
+

๐Ÿ–ฅ๏ธ System Monitor Data

+
+
+ {% for graph in monitor_graphs %} +
+
+

{{ graph.title }}

+
+
+ {% if graph.embedded_data %} + {{ graph.title }} + {% elif graph.path %} + {{ graph.title }} + {% else %} +
+

Monitor data unavailable

+
+ {% endif %} +
+
+ {% endfor %} +
+
+ {% endif %} + + {% if raw_comparison_html %} +
+
+

๐Ÿ“„ Raw Comparison Output

+
+
+ {{ raw_comparison_html | safe }} +
+
+ {% endif %} +
+ + +
+ + diff --git a/scripts/generate_mmtests_graphs.py b/scripts/generate_mmtests_graphs.py new file mode 100644 index 00000000..bc261249 --- /dev/null +++ b/scripts/generate_mmtests_graphs.py @@ -0,0 +1,598 @@ +#!/usr/bin/env python3 +""" +Generate visualization graphs for mmtests comparison results. + +This script parses mmtests comparison output and creates informative graphs +that help understand performance differences between baseline and dev kernels. +""" + +import sys +import re +import matplotlib.pyplot as plt +import numpy as np +import os +from pathlib import Path + +# Set matplotlib to use Agg backend for headless operation +import matplotlib + +matplotlib.use("Agg") + + +def parse_comparison_file(filepath): + """Parse mmtests comparison text file and extract data.""" + data = { + "fault_base": {"threads": [], "baseline": [], "dev": [], "improvement": []}, + "fault_huge": {"threads": [], "baseline": [], "dev": [], "improvement": []}, + "fault_both": {"threads": [], "baseline": [], "dev": [], "improvement": []}, + } + + with open(filepath, "r") as f: + for line in f: + # Parse lines like: Min fault-base-1 742.00 ( 0.00%) 1228.00 ( -65.50%) + match = re.match( + r"(\w+)\s+fault-(\w+)-(\d+)\s+(\d+\.\d+)\s+\([^)]+\)\s+(\d+\.\d+)\s+\(\s*([+-]?\d+\.\d+)%\)", + line.strip(), + ) + if match: + metric, fault_type, threads, baseline, dev, improvement = match.groups() + + # Focus on Amean (arithmetic mean) as it's most representative + if metric == "Amean" and fault_type in ["base", "huge", "both"]: + key = f"fault_{fault_type}" + data[key]["threads"].append(int(threads)) + data[key]["baseline"].append(float(baseline)) + data[key]["dev"].append(float(dev)) + data[key]["improvement"].append(float(improvement)) + + return data + + +def create_performance_comparison_graph(data, output_dir): + """Create a comprehensive performance comparison graph.""" + fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(16, 12)) + fig.suptitle( + "mmtests thpcompact: Baseline vs Dev Kernel Performance", + fontsize=16, + fontweight="bold", + ) + + colors = {"fault_base": "#1f77b4", "fault_huge": "#ff7f0e", "fault_both": "#2ca02c"} + labels = { + "fault_base": "Base Pages", + "fault_huge": "Huge Pages", + "fault_both": "Both Pages", + } + + # Plot 1: Raw performance comparison + for fault_type in ["fault_base", "fault_huge", "fault_both"]: + if data[fault_type]["threads"]: + threads = np.array(data[fault_type]["threads"]) + baseline = np.array(data[fault_type]["baseline"]) + dev = np.array(data[fault_type]["dev"]) + + ax1.plot( + threads, + baseline, + "o-", + color=colors[fault_type], + alpha=0.7, + label=f"{labels[fault_type]} - Baseline", + linewidth=2, + ) + ax1.plot( + threads, + dev, + "s--", + color=colors[fault_type], + alpha=0.9, + label=f"{labels[fault_type]} - Dev", + linewidth=2, + ) + + ax1.set_xlabel("Number of Threads") + ax1.set_ylabel("Fault Time (microseconds)") + ax1.set_title("Raw Performance: Lower is Better") + ax1.legend() + ax1.grid(True, alpha=0.3) + ax1.set_yscale("log") # Log scale for better visibility of differences + + # Plot 2: Performance improvement percentage + for fault_type in ["fault_base", "fault_huge", "fault_both"]: + if data[fault_type]["threads"]: + threads = np.array(data[fault_type]["threads"]) + improvement = np.array(data[fault_type]["improvement"]) + + ax2.plot( + threads, + improvement, + "o-", + color=colors[fault_type], + label=labels[fault_type], + linewidth=2, + markersize=6, + ) + + ax2.axhline(y=0, color="black", linestyle="-", alpha=0.5) + ax2.fill_between( + ax2.get_xlim(), 0, 100, alpha=0.1, color="green", label="Improvement" + ) + ax2.fill_between( + ax2.get_xlim(), -100, 0, alpha=0.1, color="red", label="Regression" + ) + ax2.set_xlabel("Number of Threads") + ax2.set_ylabel("Performance Change (%)") + ax2.set_title("Performance Change: Positive = Better Dev Kernel") + ax2.legend() + ax2.grid(True, alpha=0.3) + + # Plot 3: Scalability comparison (normalized to single thread) + for fault_type in ["fault_base", "fault_huge"]: # Skip 'both' to reduce clutter + if data[fault_type]["threads"] and len(data[fault_type]["threads"]) > 1: + threads = np.array(data[fault_type]["threads"]) + baseline = np.array(data[fault_type]["baseline"]) + dev = np.array(data[fault_type]["dev"]) + + # Normalize to single thread performance + baseline_norm = baseline / baseline[0] if baseline[0] > 0 else baseline + dev_norm = dev / dev[0] if dev[0] > 0 else dev + + ax3.plot( + threads, + baseline_norm, + "o-", + color=colors[fault_type], + alpha=0.7, + label=f"{labels[fault_type]} - Baseline", + linewidth=2, + ) + ax3.plot( + threads, + dev_norm, + "s--", + color=colors[fault_type], + alpha=0.9, + label=f"{labels[fault_type]} - Dev", + linewidth=2, + ) + + ax3.set_xlabel("Number of Threads") + ax3.set_ylabel("Relative Performance (vs 1 thread)") + ax3.set_title("Scalability: How Performance Changes with Thread Count") + ax3.legend() + ax3.grid(True, alpha=0.3) + + # Plot 4: Summary statistics + summary_data = [] + categories = [] + + for fault_type in ["fault_base", "fault_huge", "fault_both"]: + if data[fault_type]["improvement"]: + improvements = np.array(data[fault_type]["improvement"]) + avg_improvement = np.mean(improvements) + summary_data.append(avg_improvement) + categories.append(labels[fault_type]) + + bars = ax4.bar( + categories, + summary_data, + color=[ + colors[f'fault_{k.lower().replace(" ", "_")}'] + for k in ["base", "huge", "both"] + ][: len(categories)], + ) + ax4.axhline(y=0, color="black", linestyle="-", alpha=0.5) + ax4.set_ylabel("Average Performance Change (%)") + ax4.set_title("Overall Performance Summary") + ax4.grid(True, alpha=0.3, axis="y") + + # Add value labels on bars + for bar, value in zip(bars, summary_data): + height = bar.get_height() + ax4.text( + bar.get_x() + bar.get_width() / 2.0, + height + (1 if height >= 0 else -3), + f"{value:.1f}%", + ha="center", + va="bottom" if height >= 0 else "top", + ) + + plt.tight_layout() + plt.savefig( + os.path.join(output_dir, "performance_comparison.png"), + dpi=150, + bbox_inches="tight", + ) + plt.close() + + +def create_detailed_thread_analysis(data, output_dir): + """Create detailed analysis for different thread counts.""" + fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6)) + fig.suptitle("Thread Scaling Analysis", fontsize=14, fontweight="bold") + + colors = {"fault_base": "#1f77b4", "fault_huge": "#ff7f0e"} + labels = {"fault_base": "Base Pages", "fault_huge": "Huge Pages"} + + # Plot thread efficiency (performance per thread) + for fault_type in ["fault_base", "fault_huge"]: + if data[fault_type]["threads"]: + threads = np.array(data[fault_type]["threads"]) + baseline = np.array(data[fault_type]["baseline"]) + dev = np.array(data[fault_type]["dev"]) + + # Calculate efficiency (lower time per operation = better efficiency) + baseline_eff = baseline / threads # Time per thread + dev_eff = dev / threads + + ax1.plot( + threads, + baseline_eff, + "o-", + color=colors[fault_type], + alpha=0.7, + label=f"{labels[fault_type]} - Baseline", + linewidth=2, + ) + ax1.plot( + threads, + dev_eff, + "s--", + color=colors[fault_type], + alpha=0.9, + label=f"{labels[fault_type]} - Dev", + linewidth=2, + ) + + ax1.set_xlabel("Number of Threads") + ax1.set_ylabel("Time per Thread (microseconds)") + ax1.set_title("Threading Efficiency: Lower is Better") + ax1.legend() + ax1.grid(True, alpha=0.3) + ax1.set_yscale("log") + + # Plot improvement by thread count + thread_counts = set() + for fault_type in ["fault_base", "fault_huge"]: + thread_counts.update(data[fault_type]["threads"]) + + thread_counts = sorted(list(thread_counts)) + + base_improvements = [] + huge_improvements = [] + + for tc in thread_counts: + base_imp = None + huge_imp = None + + for i, t in enumerate(data["fault_base"]["threads"]): + if t == tc: + base_imp = data["fault_base"]["improvement"][i] + break + + for i, t in enumerate(data["fault_huge"]["threads"]): + if t == tc: + huge_imp = data["fault_huge"]["improvement"][i] + break + + base_improvements.append(base_imp if base_imp is not None else 0) + huge_improvements.append(huge_imp if huge_imp is not None else 0) + + x = np.arange(len(thread_counts)) + width = 0.35 + + bars1 = ax2.bar( + x - width / 2, + base_improvements, + width, + label="Base Pages", + color=colors["fault_base"], + alpha=0.8, + ) + bars2 = ax2.bar( + x + width / 2, + huge_improvements, + width, + label="Huge Pages", + color=colors["fault_huge"], + alpha=0.8, + ) + + ax2.set_xlabel("Thread Count") + ax2.set_ylabel("Performance Improvement (%)") + ax2.set_title("Improvement by Thread Count") + ax2.set_xticks(x) + ax2.set_xticklabels(thread_counts) + ax2.legend() + ax2.grid(True, alpha=0.3, axis="y") + ax2.axhline(y=0, color="black", linestyle="-", alpha=0.5) + + # Add value labels on bars + for bars in [bars1, bars2]: + for bar in bars: + height = bar.get_height() + if abs(height) > 0.1: # Only show labels for non-zero values + ax2.text( + bar.get_x() + bar.get_width() / 2.0, + height + (1 if height >= 0 else -3), + f"{height:.1f}%", + ha="center", + va="bottom" if height >= 0 else "top", + fontsize=8, + ) + + plt.tight_layout() + plt.savefig( + os.path.join(output_dir, "thread_analysis.png"), dpi=150, bbox_inches="tight" + ) + plt.close() + + +def generate_graphs_html(output_dir, baseline_kernel, dev_kernel): + """Generate an HTML file with explanations and embedded graphs.""" + html_content = f""" + + + + + mmtests Performance Analysis: {baseline_kernel} vs {dev_kernel} + + + +
+

mmtests thpcompact Performance Analysis

+

Kernel Comparison: {baseline_kernel} vs {dev_kernel}

+ +
+

๐ŸŽฏ What is thpcompact testing?

+

thpcompact is a memory management benchmark that tests how well the kernel handles:

+
    +
  • Base Pages (4KB): Standard memory pages
  • +
  • Huge Pages (2MB): Large memory pages that reduce TLB misses
  • +
  • Memory Compaction: Kernel's ability to defragment memory
  • +
  • Thread Scaling: Performance under different levels of parallelism
  • +
+

Lower numbers are better - they represent faster memory allocation times.

+
+ +

๐Ÿ“Š Performance Overview

+
+ Performance Comparison +
+ +
+

Understanding the Performance Graphs:

+
    +
  • Top Left: Raw performance comparison. Lower lines = faster kernel.
  • +
  • Top Right: Performance changes. Green area = dev kernel improved, red = regressed.
  • +
  • Bottom Left: Scalability. Shows how performance changes with more threads.
  • +
  • Bottom Right: Overall summary of improvements/regressions.
  • +
+
+ +

๐Ÿงต Thread Scaling Analysis

+
+ Thread Analysis +
+ +
+

Thread Scaling Insights:

+
    +
  • Left Graph: Threading efficiency - how well work is distributed across threads
  • +
  • Right Graph: Performance improvement at each thread count
  • +
  • Good scaling means the kernel can efficiently use multiple CPUs for memory operations
  • +
+
+ +

๐Ÿ” What These Results Mean

+ +
+

โœ… Positive Performance Changes

+

When you see positive percentages in the comparison:

+
    +
  • The dev kernel is faster at memory allocation
  • +
  • Applications will experience reduced memory latency
  • +
  • Better overall system responsiveness
  • +
+
+ +
+

โŒ Negative Performance Changes

+

When you see negative percentages:

+
    +
  • The dev kernel is slower at memory allocation
  • +
  • This might indicate a regression that needs investigation
  • +
  • Consider the trade-offs - sometimes slower allocation enables other benefits
  • +
+
+ +
+

โš ๏ธ Important Notes

+
    +
  • Variability is normal: Performance can vary significantly across different thread counts
  • +
  • Context matters: A regression in one area might be acceptable if it enables improvements elsewhere
  • +
  • Real-world impact: The significance depends on your workload's memory access patterns
  • +
+
+ +

๐Ÿ“ˆ Key Metrics Explained

+ + + + + + + + + + + + + + + + + + + + + + + + + + +
MetricDescriptionWhat it Means
fault-baseStandard 4KB page fault handlingHow quickly the kernel can allocate regular memory pages
fault-huge2MB huge page fault handlingPerformance of large page allocations (important for databases, HPC)
fault-bothMixed workload simulationReal-world scenario with both page sizes
Thread CountNumber of parallel threadsTests scalability across multiple CPU cores
+ +

๐ŸŽฏ Bottom Line

+
+

This analysis helps you understand:

+
    +
  • Whether your kernel changes improved or regressed memory performance
  • +
  • How well your changes scale across multiple CPU cores
  • +
  • Which types of memory allocations are affected
  • +
  • The magnitude of performance changes in real-world scenarios
  • +
+

Use this data to make informed decisions about kernel optimizations and to identify areas needing further investigation.

+
+ +
+

+ Generated by kdevops mmtests analysis โ€ข + Baseline: {baseline_kernel} โ€ข Dev: {dev_kernel} +

+
+ +""" + + with open(os.path.join(output_dir, "graphs.html"), "w") as f: + f.write(html_content) + + +def main(): + if len(sys.argv) != 4: + print( + "Usage: python3 generate_mmtests_graphs.py -" + ) + sys.exit(1) + + comparison_file = sys.argv[1] + output_dir = sys.argv[2] + kernel_names = sys.argv[3].split("-", 1) + + if len(kernel_names) != 2: + print("Kernel names should be in format: baseline-dev") + sys.exit(1) + + baseline_kernel, dev_kernel = kernel_names + + # Create output directory if it doesn't exist + Path(output_dir).mkdir(parents=True, exist_ok=True) + + # Parse the comparison data + print(f"Parsing comparison data from {comparison_file}...") + data = parse_comparison_file(comparison_file) + + # Generate graphs + print("Generating performance comparison graphs...") + create_performance_comparison_graph(data, output_dir) + + print("Generating thread analysis graphs...") + create_detailed_thread_analysis(data, output_dir) + + # Generate HTML report + print("Generating HTML report...") + generate_graphs_html(output_dir, baseline_kernel, dev_kernel) + + print( + f"โœ… Analysis complete! Open {output_dir}/graphs.html in your browser to view results." + ) + + +if __name__ == "__main__": + main() diff --git a/workflows/mmtests/Makefile b/workflows/mmtests/Makefile index 06a75ead..b65d256b 100644 --- a/workflows/mmtests/Makefile +++ b/workflows/mmtests/Makefile @@ -13,19 +13,19 @@ mmtests: mmtests-baseline: $(Q)ansible-playbook $(ANSIBLE_VERBOSE) \ - -l baseline playbooks/mmtests.yml + -l 'mmtests:&baseline' playbooks/mmtests.yml \ --extra-vars=@./extra_vars.yaml \ --tags run_tests \ $(MMTESTS_ARGS) mmtests-dev: $(Q)ansible-playbook $(ANSIBLE_VERBOSE) \ - -l dev playbooks/mmtests.yml \ + -l 'mmtests:&dev' playbooks/mmtests.yml \ --extra-vars=@./extra_vars.yaml \ --tags run_tests \ $(MMTESTS_ARGS) -mmtests-test: mmtests +mmtests-tests: mmtests $(Q)ansible-playbook $(ANSIBLE_VERBOSE) \ playbooks/mmtests.yml \ --extra-vars=@./extra_vars.yaml \ @@ -39,6 +39,13 @@ mmtests-results: --tags results \ $(MMTESTS_ARGS) +mmtests-compare: + $(Q)ansible-playbook $(ANSIBLE_VERBOSE) \ + playbooks/mmtests-compare.yml \ + --extra-vars=@./extra_vars.yaml \ + --tags deps,compare \ + $(MMTESTS_ARGS) + mmtests-clean: $(Q)ansible-playbook $(ANSIBLE_VERBOSE) \ playbooks/mmtests.yml \ @@ -51,8 +58,9 @@ mmtests-help: @echo "mmtests : Setup and install mmtests" @echo "mmtests-baseline : Setup mmtests with baseline configuration" @echo "mmtests-dev : Setup mmtests with dev configuration" - @echo "mmtests-test : Run mmtests tests" + @echo "mmtests-tests : Run mmtests tests" @echo "mmtests-results : Copy results from guests" + @echo "mmtests-compare : Compare baseline and dev results (AB testing)" @echo "mmtests-clean : Clean up mmtests installation" @echo "" @@ -61,8 +69,9 @@ HELP_TARGETS += mmtests-help PHONY +: mmtests PHONY +: mmtests-baseline PHONY +: mmtests-dev -PHONY +: mmtests-test +PHONY +: mmtests-tests PHONY +: mmtests-results +PHONY +: mmtests-compare PHONY +: mmtests-clean PHONY +: mmtests-help .PHONY: $(PHONY) diff --git a/workflows/mmtests/fixes/0001-compare-Fix-undefined-array-reference-when-no-operat.patch b/workflows/mmtests/fixes/0001-compare-Fix-undefined-array-reference-when-no-operat.patch new file mode 100644 index 00000000..b1e0adc3 --- /dev/null +++ b/workflows/mmtests/fixes/0001-compare-Fix-undefined-array-reference-when-no-operat.patch @@ -0,0 +1,46 @@ +From d951f7feb7855ee5ea393d2bbe55e93c150295da Mon Sep 17 00:00:00 2001 +From: Luis Chamberlain +Date: Tue, 5 Aug 2025 14:12:00 -0700 +Subject: [PATCH] compare: Fix undefined array reference when no operations + found + +When a benchmark produces no results (e.g., thpcompact when the binary +fails to build), the compare script would crash with: +'Can't use an undefined value as an ARRAY reference at Compare.pm line 461' + +This happens because $operations[0] doesn't exist when no operations +are found. Add proper bounds checking to handle empty results gracefully. + +Signed-off-by: Luis Chamberlain +--- + bin/lib/MMTests/Compare.pm | 14 ++++++++------ + 1 file changed, 8 insertions(+), 6 deletions(-) + +diff --git a/bin/lib/MMTests/Compare.pm b/bin/lib/MMTests/Compare.pm +index 94b0819eca67..6ea1b6173d4e 100644 +--- a/bin/lib/MMTests/Compare.pm ++++ b/bin/lib/MMTests/Compare.pm +@@ -458,12 +458,14 @@ sub _generateRenderTable() { + + # Build column format table + my %resultsTable = %{$self->{_ResultsTable}}; +- for (my $i = 0; $i <= (scalar(@{$resultsTable{$operations[0]}[0]})); $i++) { +- my $fieldFormat = "%${fieldLength}.${precision}f"; +- if (defined $self->{_CompareTable}) { +- push @formatTable, ($fieldFormat, " (%${compareLength}.2f%%)"); +- } else { +- push @formatTable, ($fieldFormat, ""); ++ if (@operations > 0 && exists $resultsTable{$operations[0]} && defined $resultsTable{$operations[0]}[0]) { ++ for (my $i = 0; $i <= (scalar(@{$resultsTable{$operations[0]}[0]})); $i++) { ++ my $fieldFormat = "%${fieldLength}.${precision}f"; ++ if (defined $self->{_CompareTable}) { ++ push @formatTable, ($fieldFormat, " (%${compareLength}.2f%%)"); ++ } else { ++ push @formatTable, ($fieldFormat, ""); ++ } + } + } + +-- +2.45.2 + diff --git a/workflows/mmtests/fixes/0002-thpcompact-fix-library-order-in-gcc-command.patch b/workflows/mmtests/fixes/0002-thpcompact-fix-library-order-in-gcc-command.patch new file mode 100644 index 00000000..0ab3e082 --- /dev/null +++ b/workflows/mmtests/fixes/0002-thpcompact-fix-library-order-in-gcc-command.patch @@ -0,0 +1,33 @@ +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 +From: Luis Chamberlain +Date: Tue, 5 Aug 2025 14:35:00 -0700 +Subject: [PATCH] thpcompact: fix library order in gcc command + +The gcc command in thpcompact-install has the libraries specified +before the source file, which causes linking errors on modern systems: + + undefined reference to `get_mempolicy' + +Fix by moving the libraries after the source file, as required by +modern linkers that process dependencies in order. + +Signed-off-by: Luis Chamberlain +--- + shellpack_src/src/thpcompact/thpcompact-install | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/shellpack_src/src/thpcompact/thpcompact-install b/shellpack_src/src/thpcompact/thpcompact-install +index 1234567..7654321 100644 +--- a/shellpack_src/src/thpcompact/thpcompact-install ++++ b/shellpack_src/src/thpcompact/thpcompact-install +@@ -8,7 +8,7 @@ + install-depends libnuma-devel + + mkdir $SHELLPACK_SOURCES/thpcompact-${VERSION}-installed +-gcc -Wall -g -lpthread -lnuma $SHELLPACK_TEMP/thpcompact.c -o $SHELLPACK_SOURCES/thpcompact-${VERSION}-installed/thpcompact || \ ++gcc -Wall -g $SHELLPACK_TEMP/thpcompact.c -lpthread -lnuma -o $SHELLPACK_SOURCES/thpcompact-${VERSION}-installed/thpcompact || \ + die "Failed to build thpcompact" + + echo thpcompact installed successfully +-- +2.45.2 \ No newline at end of file -- 2.47.2