public inbox for kdevops@lists.linux.dev
* [PATCH 0/5] add memory fragmentation automation testing
@ 2025-09-04  9:13 Luis Chamberlain
  2025-09-04  9:13 ` [PATCH 1/5] monitoring: add memory fragmentation eBPF monitoring support Luis Chamberlain
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Luis Chamberlain @ 2025-09-04  9:13 UTC (permalink / raw)
  To: Chuck Lever, Daniel Gomez, kdevops; +Cc: Luis Chamberlain

This extends monitoring support in kdevops to leverage eBPF tracepoint
analysis for automated memory fragmentation testing.

Luis Chamberlain (5):
  monitoring: add memory fragmentation eBPF monitoring support
  mmtests: add monitoring framework integration
  sysbench: add monitoring framework integration
  ai milvus: add monitoring support
  minio: add monitoring support

 kconfigs/monitors/Kconfig                     |   53 +
 playbooks/ai_benchmark.yml                    |   14 +
 playbooks/minio.yml                           |   15 +
 .../tasks/install-deps/debian/main.yml        |    1 +
 .../tasks/install-deps/redhat/main.yml        |    1 +
 .../fstests/tasks/install-deps/suse/main.yml  |    1 +
 .../roles/milvus/tasks/install_docker.yml     |    2 +
 playbooks/roles/minio_install/tasks/main.yml  |   24 +-
 .../tasks/install-deps/debian/main.yml        |    1 +
 .../tasks/install-deps/redhat/main.yml        |    1 +
 .../mmtests/tasks/install-deps/suse/main.yml  |    1 +
 playbooks/roles/mmtests/tasks/main.yaml       |   12 +
 .../monitoring/files/fragmentation_tracker.py |  533 ++++++++
 .../files/fragmentation_visualizer.py         | 1161 +++++++++++++++++
 .../monitoring/tasks/monitor_collect.yml      |  145 +-
 .../roles/monitoring/tasks/monitor_run.yml    |  123 ++
 .../tasks/install-deps/debian/main.yml        |    2 +
 .../tasks/install-deps/redhat/main.yml        |    1 +
 .../sysbench/tasks/install-deps/suse/main.yml |    1 +
 playbooks/roles/sysbench/tasks/main.yaml      |   12 +
 workflows/ai/Makefile                         |    5 +
 workflows/minio/Makefile                      |    9 +-
 workflows/mmtests/Makefile                    |    8 +
 workflows/sysbench/Makefile                   |    8 +-
 24 files changed, 2130 insertions(+), 4 deletions(-)
 create mode 100644 playbooks/roles/monitoring/files/fragmentation_tracker.py
 create mode 100644 playbooks/roles/monitoring/files/fragmentation_visualizer.py

-- 
2.45.2



* [PATCH 1/5] monitoring: add memory fragmentation eBPF monitoring support
  2025-09-04  9:13 [PATCH 0/5] add memory fragmentation automation testing Luis Chamberlain
@ 2025-09-04  9:13 ` Luis Chamberlain
  2025-09-04  9:13 ` [PATCH 2/5] mmtests: add monitoring framework integration Luis Chamberlain
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Luis Chamberlain @ 2025-09-04  9:13 UTC (permalink / raw)
  To: Chuck Lever, Daniel Gomez, kdevops; +Cc: Luis Chamberlain

Add support for memory fragmentation monitoring using eBPF-based
tracking, leveraging the plot-fragmentation effort [0]. This provides
real-time tracking of memory allocation events and fragmentation
indices, with matplotlib visualization.

Features:
- eBPF tracepoint-based fragmentation tracking
- Real-time fragmentation index monitoring
- Automatic plot generation with fragmentation_visualizer.py
- Configurable monitoring duration and output directory
- Integration with existing monitoring framework
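
The tracker writes its results as JSON with "metadata", "events" and
"statistics" sections, so the data can also be post-processed offline
outside the bundled visualizer. A minimal sketch (the summarize()
helper below is illustrative only, not part of this series):

```python
#!/usr/bin/env python3
# Sketch: post-process a JSON file written by fragmentation_tracker.py.
# Assumes the layout produced by save_data(): top-level "metadata",
# "events" and "statistics" keys, each event carrying "event_type"
# and "order" fields.
import json
from collections import Counter


def summarize(path):
    with open(path) as f:
        data = json.load(f)
    # Keep only external-fragmentation events; compaction events are
    # reported separately by the tracker.
    extfrag = [e for e in data["events"] if e["event_type"] == "extfrag"]
    by_order = Counter(e["order"] for e in extfrag)
    print(f"Duration: {data['metadata']['duration']:.1f}s, "
          f"extfrag events: {len(extfrag)}")
    for order, count in sorted(by_order.items()):
        print(f"  order {order}: {count}")
    return by_order
```

This makes it easy to diff two runs (say, with and without Large Block
Size support) without regenerating plots.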

For simplicity, the scripts are included directly in kdevops rather
than cloned from an external repository.

Generated-by: Claude AI
Link: https://github.com/mcgrof/plot-fragmentation # [0]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 kconfigs/monitors/Kconfig                     |   53 +
 .../tasks/install-deps/debian/main.yml        |    1 +
 .../tasks/install-deps/redhat/main.yml        |    1 +
 .../fstests/tasks/install-deps/suse/main.yml  |    1 +
 .../monitoring/files/fragmentation_tracker.py |  533 ++++++++
 .../files/fragmentation_visualizer.py         | 1161 +++++++++++++++++
 .../monitoring/tasks/monitor_collect.yml      |  126 ++
 .../roles/monitoring/tasks/monitor_run.yml    |  123 ++
 8 files changed, 1999 insertions(+)
 create mode 100644 playbooks/roles/monitoring/files/fragmentation_tracker.py
 create mode 100644 playbooks/roles/monitoring/files/fragmentation_visualizer.py

diff --git a/kconfigs/monitors/Kconfig b/kconfigs/monitors/Kconfig
index 6dc1ddbdd2e9..bd4dc81fa11d 100644
--- a/kconfigs/monitors/Kconfig
+++ b/kconfigs/monitors/Kconfig
@@ -61,6 +61,59 @@ config MONITOR_FOLIO_MIGRATION_INTERVAL
 	  performance. Higher values reduce overhead but may miss
 	  short-lived migration events.
 
+config MONITOR_MEMORY_FRAGMENTATION
+	bool "Monitor memory fragmentation with eBPF"
+	output yaml
+	default n
+	help
+	  Enable monitoring of memory fragmentation using eBPF-based tracking.
+	  This provides advanced memory fragmentation visualization using
+	  eBPF tracepoints and matplotlib.
+
+	  This tool tracks memory allocation events and fragmentation indices
+	  in real-time, providing insights that traditional methods like
+	  /proc/pagetypeinfo cannot fully capture.
+
+	  Features:
+	  - eBPF-based tracepoint tracking
+	  - Real-time fragmentation index monitoring
+	  - Page mobility tracking
+	  - Matplotlib visualization of fragmentation data
+
+	  Requirements:
+	  - Python 3 with python3-bpfcc
+	  - Kernel with required tracepoint support
+	  - Root privileges for eBPF attachment
+
+	  The tool is particularly useful for investigating whether Large Block
+	  Size support in the kernel creates worse fragmentation.
+
+config MONITOR_FRAGMENTATION_DURATION
+	int "Fragmentation monitoring duration (seconds)"
+	output yaml
+	default 0
+	depends on MONITOR_MEMORY_FRAGMENTATION
+	help
+	  Duration to run fragmentation monitoring in seconds.
+	  Set to 0 for continuous monitoring until workflow completion.
+
+	  The monitoring will automatically stop when the workflow
+	  finishes or when this duration expires, whichever comes first.
+
+config MONITOR_FRAGMENTATION_OUTPUT_DIR
+	string "Fragmentation monitoring output directory"
+	output yaml
+	default "/root/monitoring/fragmentation"
+	depends on MONITOR_MEMORY_FRAGMENTATION
+	help
+	  Directory where fragmentation monitoring data and plots will be stored.
+	  This directory will be created if it doesn't exist.
+
+	  The collected data includes:
+	  - Raw eBPF trace data
+	  - Generated matplotlib plots
+	  - JSON formatted fragmentation metrics
+
 endif # MONITOR_DEVELOPMENTAL_STATS
 
 # Future monitoring options can be added here
diff --git a/playbooks/roles/fstests/tasks/install-deps/debian/main.yml b/playbooks/roles/fstests/tasks/install-deps/debian/main.yml
index cbcb3788d2bd..cc4a5a6b10af 100644
--- a/playbooks/roles/fstests/tasks/install-deps/debian/main.yml
+++ b/playbooks/roles/fstests/tasks/install-deps/debian/main.yml
@@ -73,6 +73,7 @@
       - xfsdump
       - cifs-utils
       - duperemove
+      - python3-bpfcc
     state: present
     update_cache: true
   tags: ["fstests", "deps"]
diff --git a/playbooks/roles/fstests/tasks/install-deps/redhat/main.yml b/playbooks/roles/fstests/tasks/install-deps/redhat/main.yml
index c1bd7f82f0aa..3c681c1a06fb 100644
--- a/playbooks/roles/fstests/tasks/install-deps/redhat/main.yml
+++ b/playbooks/roles/fstests/tasks/install-deps/redhat/main.yml
@@ -72,6 +72,7 @@
       - gettext
       - ncurses
       - ncurses-devel
+      - python3-bcc
 
 - name: Install xfsprogs-xfs_scrub
   become: true
diff --git a/playbooks/roles/fstests/tasks/install-deps/suse/main.yml b/playbooks/roles/fstests/tasks/install-deps/suse/main.yml
index 3247567a8cfa..54de4a7ad3d5 100644
--- a/playbooks/roles/fstests/tasks/install-deps/suse/main.yml
+++ b/playbooks/roles/fstests/tasks/install-deps/suse/main.yml
@@ -124,6 +124,7 @@
       - libcap-progs
       - fio
       - parted
+      - python3-bcc
     state: present
   when:
     - repos_present|bool
diff --git a/playbooks/roles/monitoring/files/fragmentation_tracker.py b/playbooks/roles/monitoring/files/fragmentation_tracker.py
new file mode 100644
index 000000000000..7b66f3232960
--- /dev/null
+++ b/playbooks/roles/monitoring/files/fragmentation_tracker.py
@@ -0,0 +1,533 @@
+#!/usr/bin/env python3
+"""
+Enhanced eBPF-based memory fragmentation tracker.
+Primary focus on mm_page_alloc_extfrag events with optional compaction tracking.
+"""
+
+from bcc import BPF
+import time
+import signal
+import sys
+import os
+import json
+import argparse
+from collections import defaultdict
+from datetime import datetime
+
+# eBPF program to trace fragmentation events
+bpf_program = """
+#include <uapi/linux/ptrace.h>
+#include <linux/mm.h>
+#include <linux/mmzone.h>
+
+// Event types
+#define EVENT_COMPACTION_SUCCESS 1
+#define EVENT_COMPACTION_FAILURE 2
+#define EVENT_EXTFRAG 3
+
+struct fragmentation_event {
+    u64 timestamp;
+    u32 pid;
+    u32 tid;
+    u8 event_type;  // 1=compact_success, 2=compact_fail, 3=extfrag
+
+    // Common fields
+    u32 order;
+    int fragmentation_index;
+    int zone_idx;
+    int node_id;
+
+    // ExtFrag specific fields
+    int fallback_order;         // Order of the fallback allocation
+    int migrate_from;           // Original migrate type
+    int migrate_to;             // Fallback migrate type
+    int fallback_blocks;        // Number of pageblocks involved
+    int is_steal;               // Whether this is a steal vs claim
+
+    // Process info
+    char comm[16];
+};
+
+BPF_PERF_OUTPUT(events);
+
+// Statistics tracking
+BPF_HASH(extfrag_stats, u32, u64);  // Key: order, Value: count
+BPF_HASH(compact_stats, u32, u64);  // Key: order|success<<16, Value: count
+
+// Helper to get current fragmentation state (simplified)
+static inline int get_fragmentation_estimate(int order) {
+    // This is a simplified estimate
+    // In real implementation, we'd need to walk buddy lists
+    // For now, return a placeholder that indicates we need fragmentation data
+    if (order <= 3) return 100;  // Low order usually OK
+    if (order <= 6) return 400;  // Medium order moderate frag
+    return 700;  // High order typically fragmented
+}
+
+// Trace external fragmentation events (page steal/claim from different migratetype)
+TRACEPOINT_PROBE(kmem, mm_page_alloc_extfrag) {
+    struct fragmentation_event event = {};
+
+    event.timestamp = bpf_ktime_get_ns();
+    event.pid = bpf_get_current_pid_tgid() >> 32;
+    event.tid = bpf_get_current_pid_tgid() & 0xFFFFFFFF;
+    event.event_type = EVENT_EXTFRAG;
+
+    // Extract tracepoint arguments
+    // Note: Field names may vary by kernel version
+    // Typical fields: alloc_order, fallback_order,
+    //                alloc_migratetype, fallback_migratetype, change_ownership
+
+    event.order = args->alloc_order;
+    event.fallback_order = args->fallback_order;
+    event.migrate_from = args->fallback_migratetype;
+    event.migrate_to = args->alloc_migratetype;
+
+    // change_ownership indicates if the whole pageblock was claimed
+    // 0 = steal (partial), 1 = claim (whole block)
+    event.is_steal = args->change_ownership ? 0 : 1;
+
+    // Node ID - set to -1 as page struct access is kernel-specific
+    // Could be enhanced with kernel version detection
+    event.node_id = -1;
+    event.zone_idx = -1;
+
+    // Estimate fragmentation at this point
+    event.fragmentation_index = get_fragmentation_estimate(event.order);
+
+    // Get process name
+    bpf_get_current_comm(&event.comm, sizeof(event.comm));
+
+    events.perf_submit(args, &event, sizeof(event));
+
+    // Update statistics
+    u64 *count = extfrag_stats.lookup(&event.order);
+    if (count) {
+        (*count)++;
+    } else {
+        u64 initial = 1;
+        extfrag_stats.update(&event.order, &initial);
+    }
+
+    return 0;
+}
+
+// Optional: Trace compaction success (if tracepoint exists)
+#ifdef TRACE_COMPACTION
+TRACEPOINT_PROBE(page_alloc, mm_compaction_success) {
+    struct fragmentation_event event = {};
+
+    event.timestamp = bpf_ktime_get_ns();
+    event.pid = bpf_get_current_pid_tgid() >> 32;
+    event.tid = bpf_get_current_pid_tgid() & 0xFFFFFFFF;
+    event.event_type = EVENT_COMPACTION_SUCCESS;
+
+    event.order = args->order;
+    event.fragmentation_index = args->ret;
+    event.zone_idx = args->idx;
+    event.node_id = args->nid;
+
+    bpf_get_current_comm(&event.comm, sizeof(event.comm));
+
+    events.perf_submit(args, &event, sizeof(event));
+
+    u32 key = (event.order) | (1 << 16);  // Set success bit
+    u64 *count = compact_stats.lookup(&key);
+    if (count) {
+        (*count)++;
+    } else {
+        u64 initial = 1;
+        compact_stats.update(&key, &initial);
+    }
+
+    return 0;
+}
+
+TRACEPOINT_PROBE(page_alloc, mm_compaction_failure) {
+    struct fragmentation_event event = {};
+
+    event.timestamp = bpf_ktime_get_ns();
+    event.pid = bpf_get_current_pid_tgid() >> 32;
+    event.tid = bpf_get_current_pid_tgid() & 0xFFFFFFFF;
+    event.event_type = EVENT_COMPACTION_FAILURE;
+
+    event.order = args->order;
+    event.fragmentation_index = -1;
+    event.zone_idx = -1;
+    event.node_id = -1;
+
+    bpf_get_current_comm(&event.comm, sizeof(event.comm));
+
+    events.perf_submit(args, &event, sizeof(event));
+
+    u32 key = event.order;  // No success bit
+    u64 *count = compact_stats.lookup(&key);
+    if (count) {
+        (*count)++;
+    } else {
+        u64 initial = 1;
+        compact_stats.update(&key, &initial);
+    }
+
+    return 0;
+}
+#endif
+"""
+
+# Migrate type names for better readability
+MIGRATE_TYPES = {
+    0: "UNMOVABLE",
+    1: "MOVABLE",
+    2: "RECLAIMABLE",
+    3: "PCPTYPES",
+    4: "HIGHATOMIC",
+    5: "CMA",
+    6: "ISOLATE",
+}
+
+
+class FragmentationTracker:
+    def __init__(self, verbose=True, output_file=None):
+        self.start_time = time.time()
+        self.events_data = []
+        self.extfrag_stats = defaultdict(int)
+        self.compact_stats = defaultdict(lambda: {"success": 0, "failure": 0})
+        self.zone_names = ["DMA", "DMA32", "Normal", "Movable", "Device"]
+        self.verbose = verbose
+        self.output_file = output_file
+        self.event_count = 0
+        self.interrupted = False
+
+    def process_event(self, cpu, data, size):
+        """Process a fragmentation event from eBPF."""
+        event = self.b["events"].event(data)
+
+        # Calculate relative time from start
+        rel_time = (event.timestamp - self.start_ns) / 1e9
+
+        # Decode process name
+        try:
+            comm = event.comm.decode("utf-8", "replace")
+        except Exception:
+            comm = "unknown"
+
+        # Determine event type and format output
+        if event.event_type == 3:  # EXTFRAG event
+            event_name = "EXTFRAG"
+            color = "\033[93m"  # Yellow
+
+            # Get migrate type names
+            from_type = MIGRATE_TYPES.get(
+                event.migrate_from, f"TYPE_{event.migrate_from}"
+            )
+            to_type = MIGRATE_TYPES.get(event.migrate_to, f"TYPE_{event.migrate_to}")
+
+            # Store event data
+            event_dict = {
+                "timestamp": rel_time,
+                "absolute_time": datetime.now().isoformat(),
+                "event_type": "extfrag",
+                "pid": event.pid,
+                "tid": event.tid,
+                "comm": comm,
+                "order": event.order,
+                "fallback_order": event.fallback_order,
+                "migrate_from": from_type,
+                "migrate_to": to_type,
+                "is_steal": bool(event.is_steal),
+                "node": event.node_id,
+                "fragmentation_index": event.fragmentation_index,
+            }
+
+            self.extfrag_stats[event.order] += 1
+
+            if self.verbose:
+                action = "steal" if event.is_steal else "claim"
+                print(
+                    f"{color}[{rel_time:8.3f}s] {event_name:10s}\033[0m "
+                    f"Order={event.order:2d} FallbackOrder={event.fallback_order:2d} "
+                    f"{from_type:10s}->{to_type:10s} ({action}) "
+                    f"Process={comm:12s} PID={event.pid:6d}"
+                )
+
+        elif event.event_type == 1:  # COMPACTION_SUCCESS
+            event_name = "COMPACT_OK"
+            color = "\033[92m"  # Green
+
+            zone_name = (
+                self.zone_names[event.zone_idx]
+                if 0 <= event.zone_idx < len(self.zone_names)
+                else "Unknown"
+            )
+
+            event_dict = {
+                "timestamp": rel_time,
+                "absolute_time": datetime.now().isoformat(),
+                "event_type": "compaction_success",
+                "pid": event.pid,
+                "comm": comm,
+                "order": event.order,
+                "fragmentation_index": event.fragmentation_index,
+                "zone": zone_name,
+                "node": event.node_id,
+            }
+
+            self.compact_stats[event.order]["success"] += 1
+
+            if self.verbose:
+                print(
+                    f"{color}[{rel_time:8.3f}s] {event_name:10s}\033[0m "
+                    f"Order={event.order:2d} FragIdx={event.fragmentation_index:5d} "
+                    f"Zone={zone_name:8s} Node={event.node_id:2d} "
+                    f"Process={comm:12s} PID={event.pid:6d}"
+                )
+
+        else:  # COMPACTION_FAILURE
+            event_name = "COMPACT_FAIL"
+            color = "\033[91m"  # Red
+
+            event_dict = {
+                "timestamp": rel_time,
+                "absolute_time": datetime.now().isoformat(),
+                "event_type": "compaction_failure",
+                "pid": event.pid,
+                "comm": comm,
+                "order": event.order,
+                "fragmentation_index": -1,
+            }
+
+            self.compact_stats[event.order]["failure"] += 1
+
+            if self.verbose:
+                print(
+                    f"{color}[{rel_time:8.3f}s] {event_name:10s}\033[0m "
+                    f"Order={event.order:2d} "
+                    f"Process={comm:12s} PID={event.pid:6d}"
+                )
+
+        self.events_data.append(event_dict)
+        self.event_count += 1
+
+    def print_summary(self):
+        """Print summary statistics."""
+        print("\n" + "=" * 80)
+        print("FRAGMENTATION TRACKING SUMMARY")
+        print("=" * 80)
+
+        total_events = len(self.events_data)
+        print(f"\nTotal events captured: {total_events}")
+
+        if total_events > 0:
+            # Count by type
+            extfrag_count = sum(
+                1 for e in self.events_data if e["event_type"] == "extfrag"
+            )
+            compact_success = sum(
+                1 for e in self.events_data if e["event_type"] == "compaction_success"
+            )
+            compact_fail = sum(
+                1 for e in self.events_data if e["event_type"] == "compaction_failure"
+            )
+
+            print(f"\nEvent breakdown:")
+            print(f"  External Fragmentation: {extfrag_count}")
+            print(f"  Compaction Success: {compact_success}")
+            print(f"  Compaction Failure: {compact_fail}")
+
+            # ExtFrag analysis
+            if extfrag_count > 0:
+                print("\nExternal Fragmentation Events by Order:")
+                print("-" * 40)
+                print(f"{'Order':<8} {'Count':<10} {'Percentage':<10}")
+                print("-" * 40)
+
+                for order in sorted(self.extfrag_stats.keys()):
+                    count = self.extfrag_stats[order]
+                    pct = (count / extfrag_count) * 100
+                    print(f"{order:<8} {count:<10} {pct:<10.1f}%")
+
+                # Analyze migrate type patterns
+                extfrag_events = [
+                    e for e in self.events_data if e["event_type"] == "extfrag"
+                ]
+                migrate_patterns = defaultdict(int)
+                steal_vs_claim = {"steal": 0, "claim": 0}
+
+                for e in extfrag_events:
+                    pattern = f"{e['migrate_from']}->{e['migrate_to']}"
+                    migrate_patterns[pattern] += 1
+                    if e["is_steal"]:
+                        steal_vs_claim["steal"] += 1
+                    else:
+                        steal_vs_claim["claim"] += 1
+
+                print("\nMigrate Type Patterns:")
+                print("-" * 40)
+                for pattern, count in sorted(
+                    migrate_patterns.items(), key=lambda x: x[1], reverse=True
+                )[:5]:
+                    print(
+                        f"  {pattern:<30} {count:5d} ({count/extfrag_count*100:5.1f}%)"
+                    )
+
+                print(f"\nSteal vs Claim:")
+                print(
+                    f"  Steal (partial): {steal_vs_claim['steal']} ({steal_vs_claim['steal']/extfrag_count*100:.1f}%)"
+                )
+                print(
+                    f"  Claim (whole):   {steal_vs_claim['claim']} ({steal_vs_claim['claim']/extfrag_count*100:.1f}%)"
+                )
+
+            # Compaction analysis
+            if self.compact_stats:
+                print("\nCompaction Events by Order:")
+                print("-" * 40)
+                print(
+                    f"{'Order':<8} {'Success':<10} {'Failure':<10} {'Total':<10} {'Success%':<10}"
+                )
+                print("-" * 40)
+
+                for order in sorted(self.compact_stats.keys()):
+                    stats = self.compact_stats[order]
+                    total = stats["success"] + stats["failure"]
+                    success_pct = (stats["success"] / total * 100) if total > 0 else 0
+                    print(
+                        f"{order:<8} {stats['success']:<10} {stats['failure']:<10} "
+                        f"{total:<10} {success_pct:<10.1f}"
+                    )
+
+    def save_data(self, filename=None):
+        """Save captured data to JSON file for visualization."""
+        if filename is None and self.output_file:
+            filename = self.output_file
+
+        if filename is None:
+            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+            filename = f"fragmentation_data_{timestamp}.json"
+
+        # Prepare statistics
+        stats = {}
+
+        # ExtFrag stats
+        stats["extfrag"] = dict(self.extfrag_stats)
+
+        # Compaction stats
+        stats["compaction"] = {}
+        for order, counts in self.compact_stats.items():
+            stats["compaction"][str(order)] = counts
+
+        output = {
+            "metadata": {
+                "start_time": self.start_time,
+                "end_time": time.time(),
+                "duration": time.time() - self.start_time,
+                "total_events": len(self.events_data),
+                "kernel_version": os.uname().release,
+            },
+            "events": self.events_data,
+            "statistics": stats,
+        }
+
+        with open(filename, "w") as f:
+            json.dump(output, f, indent=2)
+        print(f"\nData saved to {filename}")
+        return filename
+
+    def run(self):
+        """Main execution loop."""
+        print("Compiling eBPF program...")
+
+        # Check if compaction tracepoints are available
+        has_compaction = os.path.exists(
+            "/sys/kernel/debug/tracing/events/page_alloc/mm_compaction_success"
+        )
+
+        # Modify BPF program based on available tracepoints
+        program = bpf_program
+        if has_compaction:
+            program = program.replace("#ifdef TRACE_COMPACTION", "#if 1")
+            print("  Compaction tracepoints: AVAILABLE")
+        else:
+            program = program.replace("#ifdef TRACE_COMPACTION", "#if 0")
+            print("  Compaction tracepoints: NOT AVAILABLE (will track extfrag only)")
+
+        self.b = BPF(text=program)
+        self.start_ns = time.perf_counter_ns()
+
+        # Setup event handler
+        self.b["events"].open_perf_buffer(self.process_event)
+
+        # Determine output filename upfront
+        if self.output_file:
+            save_file = self.output_file
+        else:
+            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+            save_file = f"fragmentation_data_{timestamp}.json"
+
+        print("\nStarting fragmentation event tracking...")
+        print(f"Primary focus: mm_page_alloc_extfrag events")
+        print(f"Data will be saved to: {save_file}")
+        print("Press Ctrl+C to stop and see summary\n")
+        print("-" * 80)
+        print(f"{'Time':>10s} {'Event':>12s} {'Details'}")
+        print("-" * 80)
+
+        try:
+            while not self.interrupted:
+                self.b.perf_buffer_poll()
+        except KeyboardInterrupt:
+            self.interrupted = True
+        finally:
+            # Always save data on exit
+            self.print_summary()
+            self.save_data()
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Track memory fragmentation events using eBPF"
+    )
+    parser.add_argument("-o", "--output", help="Output JSON file")
+    parser.add_argument(
+        "-t", "--time", type=int, help="Run for specified seconds then exit"
+    )
+    parser.add_argument(
+        "-q",
+        "--quiet",
+        action="store_true",
+        help="Suppress event output (summary only)",
+    )
+
+    args = parser.parse_args()
+
+    # Check for root privileges
+    if os.geteuid() != 0:
+        print("This script must be run as root (uses eBPF)")
+        sys.exit(1)
+
+    # Create tracker instance
+    tracker = FragmentationTracker(verbose=not args.quiet, output_file=args.output)
+
+    # Set up signal handler
+    def signal_handler_with_tracker(sig, frame):
+        tracker.interrupted = True
+
+    signal.signal(signal.SIGINT, signal_handler_with_tracker)
+
+    if args.time:
+        # Run for specified time
+        import threading
+
+        def timeout_handler():
+            time.sleep(args.time)
+            tracker.interrupted = True
+
+        timer = threading.Thread(target=timeout_handler)
+        timer.daemon = True
+        timer.start()
+
+    tracker.run()
+
+
+if __name__ == "__main__":
+    main()
diff --git a/playbooks/roles/monitoring/files/fragmentation_visualizer.py b/playbooks/roles/monitoring/files/fragmentation_visualizer.py
new file mode 100644
index 000000000000..f3891de61e35
--- /dev/null
+++ b/playbooks/roles/monitoring/files/fragmentation_visualizer.py
@@ -0,0 +1,1161 @@
+#!/usr/bin/env python3
+"""
+Enhanced fragmentation A/B comparison with overlaid visualizations.
+Combines datasets on the same graphs using different visual markers.
+
+Usage:
+  python fragmentation_visualizer.py fragmentation_data_A.json --compare fragmentation_data_B.json -o comparison.png
+"""
+import json
+import sys
+import numpy as np
+import matplotlib.pyplot as plt
+import matplotlib.gridspec as gridspec
+import matplotlib.patches as mpatches
+from matplotlib.patches import Rectangle
+from datetime import datetime
+import argparse
+from collections import defaultdict
+
+
+def load_data(filename):
+    with open(filename, "r") as f:
+        return json.load(f)
+
+
+def get_dot_size(order: int) -> float:
+    base_size = 1
+    return base_size + (order * 2)
+
+
+def build_counts(events, bin_size):
+    if not events:
+        return np.array([]), np.array([])
+    times = np.array([e["timestamp"] for e in events], dtype=float)
+    tmin, tmax = times.min(), times.max()
+    if tmax == tmin:
+        tmax = tmin + bin_size
+    bins = np.arange(tmin, tmax + bin_size, bin_size)
+    counts, edges = np.histogram(times, bins=bins)
+    centers = (edges[:-1] + edges[1:]) / 2.0
+    return centers, counts
+
+
+def get_migrate_type_color(mtype):
+    """Get consistent colors for migrate types"""
+    colors = {
+        "UNMOVABLE": "#e74c3c",  # Red
+        "MOVABLE": "#2ecc71",  # Green
+        "RECLAIMABLE": "#f39c12",  # Orange
+        "PCPTYPES": "#9b59b6",  # Purple
+        "HIGHATOMIC": "#e91e63",  # Pink
+        "ISOLATE": "#607d8b",  # Blue-grey
+        "CMA": "#00bcd4",  # Cyan
+    }
+    return colors.get(mtype, "#95a5a6")
+
+
+def get_migration_severity(from_type, to_type):
+    """Determine migration severity score"""
+    if to_type == "UNMOVABLE":
+        return -2  # Very bad
+    elif from_type == "UNMOVABLE":
+        return 1  # Good
+    elif to_type == "MOVABLE":
+        return 1  # Good
+    elif from_type == "MOVABLE" and to_type == "RECLAIMABLE":
+        return -1  # Somewhat bad
+    elif to_type == "RECLAIMABLE":
+        return -0.5  # Slightly bad
+    return 0
+
+
+def get_severity_color(severity_score):
+    """Get color based on severity score"""
+    if severity_score <= -2:
+        return "#8b0000"  # Dark red
+    elif severity_score <= -1:
+        return "#ff6b6b"  # Light red
+    elif severity_score >= 1:
+        return "#51cf66"  # Green
+    else:
+        return "#ffd43b"  # Yellow
+
+
+def create_overlaid_compaction_graph(ax, data_a, data_b, labels):
+    """Create overlaid compaction events graph"""
+
+    # Process dataset A
+    events_a = data_a.get("events", [])
+    compact_a = [
+        e
+        for e in events_a
+        if e["event_type"] in ["compaction_success", "compaction_failure"]
+    ]
+    success_a = [e for e in compact_a if e["event_type"] == "compaction_success"]
+    failure_a = [e for e in compact_a if e["event_type"] == "compaction_failure"]
+
+    # Process dataset B
+    events_b = data_b.get("events", [])
+    compact_b = [
+        e
+        for e in events_b
+        if e["event_type"] in ["compaction_success", "compaction_failure"]
+    ]
+    success_b = [e for e in compact_b if e["event_type"] == "compaction_success"]
+    failure_b = [e for e in compact_b if e["event_type"] == "compaction_failure"]
+
+    # Plot A with circles
+    for e in success_a:
+        ax.scatter(
+            e["timestamp"],
+            e.get("fragmentation_index", 0),
+            s=get_dot_size(e["order"]),
+            c="#2ecc71",
+            alpha=0.3,
+            edgecolors="none",
+            marker="o",
+            label=None,
+        )
+
+    for i, e in enumerate(failure_a):
+        y_pos = -50 - (e["order"] * 10)
+        ax.scatter(
+            e["timestamp"],
+            y_pos,
+            s=get_dot_size(e["order"]),
+            c="#e74c3c",
+            alpha=0.3,
+            edgecolors="none",
+            marker="o",
+            label=None,
+        )
+
+    # Plot B with triangles
+    for e in success_b:
+        ax.scatter(
+            e["timestamp"],
+            e.get("fragmentation_index", 0),
+            s=get_dot_size(e["order"]) * 1.2,
+            c="#27ae60",
+            alpha=0.4,
+            edgecolors="black",
+            linewidths=0.5,
+            marker="^",
+            label=None,
+        )
+
+    for i, e in enumerate(failure_b):
+        y_pos = -55 - (e["order"] * 10)  # Slightly offset from A
+        ax.scatter(
+            e["timestamp"],
+            y_pos,
+            s=get_dot_size(e["order"]) * 1.2,
+            c="#c0392b",
+            alpha=0.4,
+            edgecolors="black",
+            linewidths=0.5,
+            marker="^",
+            label=None,
+        )
+
+    # Set y-axis limits - cap at 1000, ignore data above
+    all_y_values = []
+    if success_a:
+        all_y_values.extend(
+            [
+                e.get("fragmentation_index", 0)
+                for e in success_a
+                if e.get("fragmentation_index", 0) <= 1000
+            ]
+        )
+    if success_b:
+        all_y_values.extend(
+            [
+                e.get("fragmentation_index", 0)
+                for e in success_b
+                if e.get("fragmentation_index", 0) <= 1000
+            ]
+        )
+
+    max_y = max(all_y_values) if all_y_values else 1000
+    min_y = -200  # Fixed minimum for failure lanes
+    ax.set_ylim(min_y, min(max_y + 100, 1000))
+
+    # Create legend - position above the data
+    from matplotlib.lines import Line2D
+
+    legend_elements = [
+        Line2D(
+            [0],
+            [0],
+            marker="o",
+            color="w",
+            markerfacecolor="#2ecc71",
+            markersize=8,
+            alpha=0.6,
+            label=f"{labels[0]} Success",
+        ),
+        Line2D(
+            [0],
+            [0],
+            marker="o",
+            color="w",
+            markerfacecolor="#e74c3c",
+            markersize=8,
+            alpha=0.6,
+            label=f"{labels[0]} Failure",
+        ),
+        Line2D(
+            [0],
+            [0],
+            marker="^",
+            color="w",
+            markerfacecolor="#27ae60",
+            markersize=8,
+            alpha=0.6,
+            label=f"{labels[1]} Success",
+        ),
+        Line2D(
+            [0],
+            [0],
+            marker="^",
+            color="w",
+            markerfacecolor="#c0392b",
+            markersize=8,
+            alpha=0.6,
+            label=f"{labels[1]} Failure",
+        ),
+    ]
+    # Position legend at y=0 on the left side where there's no data
+    ax.legend(
+        handles=legend_elements,
+        loc="center left",
+        bbox_to_anchor=(0.02, 0.5),
+        ncol=1,
+        fontsize=8,
+        frameon=True,
+        fancybox=True,
+    )
+
+    # Styling
+    ax.axhline(y=0, color="#34495e", linestyle="-", linewidth=1.5, alpha=0.8)
+    ax.grid(True, alpha=0.08, linestyle=":", linewidth=0.5)
+    ax.set_xlabel("Time (seconds)", fontsize=11)
+    ax.set_ylabel("Fragmentation Index", fontsize=11)
+    ax.set_title(
+        "Compaction Events Comparison (○ = A, △ = B)",
+        fontsize=13,
+        fontweight="bold",
+        pad=20,
+    )
+
+
+def create_overlaid_extfrag_timeline(ax, data_a, data_b, labels, bin_size=0.5):
+    """Create overlaid ExtFrag timeline"""
+
+    events_a = [e for e in data_a.get("events", []) if e["event_type"] == "extfrag"]
+    events_b = [e for e in data_b.get("events", []) if e["event_type"] == "extfrag"]
+
+    # Dataset A - solid lines
+    steal_a = [e for e in events_a if e.get("is_steal")]
+    claim_a = [e for e in events_a if not e.get("is_steal")]
+
+    steal_times_a, steal_counts_a = build_counts(steal_a, bin_size)
+    claim_times_a, claim_counts_a = build_counts(claim_a, bin_size)
+
+    if steal_times_a.size > 0:
+        ax.plot(
+            steal_times_a,
+            steal_counts_a,
+            linewidth=2,
+            color="#3498db",
+            alpha=0.8,
+            label=f"{labels[0]} Steal",
+            linestyle="-",
+        )
+        ax.fill_between(steal_times_a, 0, steal_counts_a, alpha=0.15, color="#3498db")
+
+    if claim_times_a.size > 0:
+        ax.plot(
+            claim_times_a,
+            claim_counts_a,
+            linewidth=2,
+            color="#e67e22",
+            alpha=0.8,
+            label=f"{labels[0]} Claim",
+            linestyle="-",
+        )
+        ax.fill_between(claim_times_a, 0, claim_counts_a, alpha=0.15, color="#e67e22")
+
+    # Dataset B - dashed lines
+    steal_b = [e for e in events_b if e.get("is_steal")]
+    claim_b = [e for e in events_b if not e.get("is_steal")]
+
+    steal_times_b, steal_counts_b = build_counts(steal_b, bin_size)
+    claim_times_b, claim_counts_b = build_counts(claim_b, bin_size)
+
+    if steal_times_b.size > 0:
+        ax.plot(
+            steal_times_b,
+            steal_counts_b,
+            linewidth=2,
+            color="#2980b9",
+            alpha=0.8,
+            label=f"{labels[1]} Steal",
+            linestyle="--",
+        )
+
+    if claim_times_b.size > 0:
+        ax.plot(
+            claim_times_b,
+            claim_counts_b,
+            linewidth=2,
+            color="#d35400",
+            alpha=0.8,
+            label=f"{labels[1]} Claim",
+            linestyle="--",
+        )
+
+    ax.legend(loc="upper right", frameon=True, fontsize=9, ncol=2)
+    ax.set_xlabel("Time (seconds)", fontsize=11)
+    ax.set_ylabel(f"Events per {bin_size}s", fontsize=11)
+    ax.set_title(
+        "ExtFrag Events Timeline (Solid = A, Dashed = B)",
+        fontsize=12,
+        fontweight="semibold",
+    )
+    ax.grid(True, alpha=0.06, linestyle=":", linewidth=0.5)
+
+
+def create_combined_migration_heatmap(ax, data_a, data_b, labels):
+    """Create combined migration pattern heatmap"""
+
+    events_a = [e for e in data_a.get("events", []) if e["event_type"] == "extfrag"]
+    events_b = [e for e in data_b.get("events", []) if e["event_type"] == "extfrag"]
+
+    if not events_a and not events_b:
+        ax.text(
+            0.5,
+            0.5,
+            "No external fragmentation events",
+            ha="center",
+            va="center",
+            fontsize=12,
+        )
+        ax.axis("off")
+        return
+
+    # Combine all events to get unified time range and patterns
+    all_events = events_a + events_b
+    times = [e["timestamp"] for e in all_events]
+    min_time, max_time = min(times), max(times)
+
+    # Create time bins
+    n_bins = min(25, max(15, int((max_time - min_time) / 10)))
+    time_bins = np.linspace(min_time, max_time, n_bins + 1)
+    time_centers = (time_bins[:-1] + time_bins[1:]) / 2
+
+    # Get all unique patterns from both datasets
+    all_patterns = set()
+    for e in all_events:
+        all_patterns.add(f"{e['migrate_from']}→{e['migrate_to']}")
+
+    # Calculate pattern severities and sort
+    pattern_severities = {}
+    for pattern in all_patterns:
+        from_type, to_type = pattern.split("→")
+        pattern_severities[pattern] = get_migration_severity(from_type, to_type)
+
+    sorted_patterns = sorted(all_patterns, key=lambda p: (pattern_severities[p], p))
+
+    # Create separate heatmaps for A and B
+    heatmap_a = np.zeros((len(sorted_patterns), len(time_centers)))
+    heatmap_b = np.zeros((len(sorted_patterns), len(time_centers)))
+
+    # Fill heatmap A
+    for e in events_a:
+        pattern = f"{e['migrate_from']}→{e['migrate_to']}"
+        pattern_idx = sorted_patterns.index(pattern)
+        # Clamp so the event at exactly max_time lands in the last bin
+        # instead of being silently dropped (np.digitize indexes past it)
+        bin_idx = min(
+            np.digitize(e["timestamp"], time_bins) - 1, len(time_centers) - 1
+        )
+        if 0 <= bin_idx < len(time_centers):
+            heatmap_a[pattern_idx, bin_idx] += 1
+
+    # Fill heatmap B
+    for e in events_b:
+        pattern = f"{e['migrate_from']}→{e['migrate_to']}"
+        pattern_idx = sorted_patterns.index(pattern)
+        # Clamp so the event at exactly max_time lands in the last bin
+        bin_idx = min(
+            np.digitize(e["timestamp"], time_bins) - 1, len(time_centers) - 1
+        )
+        if 0 <= bin_idx < len(time_centers):
+            heatmap_b[pattern_idx, bin_idx] += 1
+
+    # Combine heatmaps: A in upper half of cell, B in lower half
+
+    # Plot base grid
+    for i in range(len(sorted_patterns)):
+        for j in range(len(time_centers)):
+            # Draw cell background based on severity
+            severity = pattern_severities[sorted_patterns[i]]
+            base_color = get_severity_color(severity)
+            rect = Rectangle(
+                (j - 0.5, i - 0.5),
+                1,
+                1,
+                facecolor=base_color,
+                alpha=0.1,
+                edgecolor="gray",
+                linewidth=0.5,
+            )
+            ax.add_patch(rect)
+
+            # Add counts for A (upper half)
+            if heatmap_a[i, j] > 0:
+                rect_a = Rectangle(
+                    (j - 0.4, i), 0.8, 0.4, facecolor="#3498db", alpha=0.6
+                )
+                ax.add_patch(rect_a)
+                ax.text(
+                    j,
+                    i + 0.2,
+                    str(int(heatmap_a[i, j])),
+                    ha="center",
+                    va="center",
+                    fontsize=6,
+                    color="white",
+                    fontweight="bold",
+                )
+
+            # Add counts for B (lower half)
+            if heatmap_b[i, j] > 0:
+                rect_b = Rectangle(
+                    (j - 0.4, i - 0.4), 0.8, 0.4, facecolor="#e67e22", alpha=0.6
+                )
+                ax.add_patch(rect_b)
+                ax.text(
+                    j,
+                    i - 0.2,
+                    str(int(heatmap_b[i, j])),
+                    ha="center",
+                    va="center",
+                    fontsize=6,
+                    color="white",
+                    fontweight="bold",
+                )
+
+    # Set axes - extend left margin for severity indicators
+    ax.set_xlim(-2.5, len(time_centers) - 0.5)
+    ax.set_ylim(-0.5, len(sorted_patterns) - 0.5)
+
+    # Set x-axis (time)
+    ax.set_xticks(np.arange(len(time_centers)))
+    ax.set_xticklabels(
+        [f"{t:.0f}s" for t in time_centers], rotation=45, ha="right", fontsize=8
+    )
+
+    # Set y-axis with severity indicators
+    ax.set_yticks(np.arange(len(sorted_patterns)))
+    y_labels = []
+
+    for i, pattern in enumerate(sorted_patterns):
+        severity = pattern_severities[pattern]
+
+        # Add colored severity indicator on the left side (in data coordinates)
+        severity_color = get_severity_color(severity)
+        rect = Rectangle(
+            (-1.8, i - 0.4),
+            1.0,
+            0.8,
+            facecolor=severity_color,
+            alpha=0.8,
+            edgecolor="black",
+            linewidth=0.5,
+            clip_on=False,
+        )
+        ax.add_patch(rect)
+
+        # Add severity symbol
+        if severity <= -2:
+            symbol = "!!"
+        elif severity <= -1:
+            symbol = "!"
+        elif severity >= 1:
+            symbol = "+"
+        else:
+            symbol = "="
+
+        ax.text(
+            -1.3,
+            i,
+            symbol,
+            ha="center",
+            va="center",
+            fontsize=8,
+            fontweight="bold",
+            color="white" if abs(severity) > 0 else "black",
+        )
+
+        y_labels.append(pattern)
+
+    ax.set_yticklabels(y_labels, fontsize=8)
+
+    # Add legend
+    legend_elements = [
+        mpatches.Patch(color="#3498db", alpha=0.6, label=f"{labels[0]} (upper)"),
+        mpatches.Patch(color="#e67e22", alpha=0.6, label=f"{labels[1]} (lower)"),
+        mpatches.Patch(color="#8b0000", alpha=0.8, label="Bad migration"),
+        mpatches.Patch(color="#51cf66", alpha=0.8, label="Good migration"),
+    ]
+    ax.legend(
+        handles=legend_elements,
+        loc="upper right",
+        bbox_to_anchor=(1.15, 1.0),
+        fontsize=8,
+        frameon=True,
+    )
+
+    # Styling
+    ax.set_xlabel("Time", fontsize=11)
+    ax.set_ylabel("Migration Pattern", fontsize=11)
+    ax.set_title(
+        "Migration Patterns Comparison (Blue = A, Orange = B)",
+        fontsize=12,
+        fontweight="semibold",
+    )
+    ax.grid(False)
+
+
+def create_comparison_statistics_table(ax, data_a, data_b, labels):
+    """Create comparison statistics table"""
+    ax.axis("off")
+
+    # Calculate metrics
+    def calculate_metrics(data):
+        events = data.get("events", [])
+        compact = [
+            e
+            for e in events
+            if e["event_type"] in ["compaction_success", "compaction_failure"]
+        ]
+        extfrag = [e for e in events if e["event_type"] == "extfrag"]
+
+        compact_success = sum(
+            1 for e in compact if e["event_type"] == "compaction_success"
+        )
+        success_rate = (compact_success / len(compact) * 100) if compact else 0
+
+        bad = sum(
+            1
+            for e in extfrag
+            if get_migration_severity(e["migrate_from"], e["migrate_to"]) < 0
+        )
+        good = sum(
+            1
+            for e in extfrag
+            if get_migration_severity(e["migrate_from"], e["migrate_to"]) > 0
+        )
+
+        steal = sum(1 for e in extfrag if e.get("is_steal"))
+        claim = len(extfrag) - steal if extfrag else 0
+
+        return {
+            "total": len(events),
+            "compact_success_rate": success_rate,
+            "extfrag": len(extfrag),
+            "bad_migrations": bad,
+            "good_migrations": good,
+            "steal": steal,
+            "claim": claim,
+        }
+
+    metrics_a = calculate_metrics(data_a)
+    metrics_b = calculate_metrics(data_b)
+
+    # Create table data
+    headers = ["Metric", labels[0], labels[1], "Better"]
+    rows = [
+        [
+            "Total Events",
+            metrics_a["total"],
+            metrics_b["total"],
+            "=" if metrics_a["total"] == metrics_b["total"] else "",
+        ],
+        [
+            "Compaction Success Rate",
+            f"{metrics_a['compact_success_rate']:.1f}%",
+            f"{metrics_b['compact_success_rate']:.1f}%",
+            (
+                labels[0]
+                if metrics_a["compact_success_rate"] > metrics_b["compact_success_rate"]
+                else (
+                    labels[1]
+                    if metrics_b["compact_success_rate"]
+                    > metrics_a["compact_success_rate"]
+                    else "="
+                )
+            ),
+        ],
+        [
+            "ExtFrag Events",
+            metrics_a["extfrag"],
+            metrics_b["extfrag"],
+            (
+                labels[0]
+                if metrics_a["extfrag"] < metrics_b["extfrag"]
+                else labels[1] if metrics_b["extfrag"] < metrics_a["extfrag"] else "="
+            ),
+        ],
+        [
+            "Bad Migrations",
+            metrics_a["bad_migrations"],
+            metrics_b["bad_migrations"],
+            (
+                labels[0]
+                if metrics_a["bad_migrations"] < metrics_b["bad_migrations"]
+                else (
+                    labels[1]
+                    if metrics_b["bad_migrations"] < metrics_a["bad_migrations"]
+                    else "="
+                )
+            ),
+        ],
+        [
+            "Good Migrations",
+            metrics_a["good_migrations"],
+            metrics_b["good_migrations"],
+            (
+                labels[0]
+                if metrics_a["good_migrations"] > metrics_b["good_migrations"]
+                else (
+                    labels[1]
+                    if metrics_b["good_migrations"] > metrics_a["good_migrations"]
+                    else "="
+                )
+            ),
+        ],
+        ["Steal Events", metrics_a["steal"], metrics_b["steal"], ""],
+        ["Claim Events", metrics_a["claim"], metrics_b["claim"], ""],
+    ]
+
+    # Create table - position closer to title
+    table = ax.table(
+        cellText=rows,
+        colLabels=headers,
+        cellLoc="center",
+        loc="center",
+        colWidths=[0.35, 0.25, 0.25, 0.15],
+        bbox=[0.1, 0.15, 0.8, 0.65],
+    )  # Center table with margins
+
+    table.auto_set_font_size(False)
+    table.set_fontsize(10)
+    table.scale(1, 2.2)  # Make cells taller for better readability
+
+    # Add padding to cells for better spacing
+    for key, cell in table.get_celld().items():
+        cell.set_height(0.08)  # Increase cell height
+        cell.PAD = 0.05  # Add internal padding
+
+    # Color cells based on which is better
+    for i in range(1, len(rows) + 1):
+        row = rows[i - 1]
+        if row[3] == labels[0]:
+            table[(i, 1)].set_facecolor("#d4edda")
+            table[(i, 2)].set_facecolor("#f8d7da")
+        elif row[3] == labels[1]:
+            table[(i, 1)].set_facecolor("#f8d7da")
+            table[(i, 2)].set_facecolor("#d4edda")
+
+    # Position title with more space from previous graph
+    ax.set_title(
+        "\n\nStatistical Comparison (Green = Better, Red = Worse)",
+        fontsize=13,
+        fontweight="bold",
+        pad=5,
+        y=1.0,
+    )
+
+
+def create_single_dashboard(data, output_file=None, bin_size=0.5):
+    """Create single dataset analysis dashboard with severity indicators"""
+
+    # Create figure
+    fig = plt.figure(figsize=(20, 16), constrained_layout=False)
+    fig.patch.set_facecolor("#f8f9fa")
+
+    # Create grid layout - 3 rows for single analysis
+    gs = gridspec.GridSpec(3, 1, height_ratios=[2.5, 2, 3], hspace=0.3, figure=fig)
+
+    # Create subplots
+    ax_compact = fig.add_subplot(gs[0])
+    ax_extfrag = fig.add_subplot(gs[1])
+    ax_migration = fig.add_subplot(gs[2])
+
+    # Process events
+    events = data.get("events", [])
+    compact_events = [
+        e
+        for e in events
+        if e["event_type"] in ["compaction_success", "compaction_failure"]
+    ]
+    extfrag_events = [e for e in events if e["event_type"] == "extfrag"]
+
+    success_events = [
+        e for e in compact_events if e["event_type"] == "compaction_success"
+    ]
+    failure_events = [
+        e for e in compact_events if e["event_type"] == "compaction_failure"
+    ]
+
+    # === COMPACTION GRAPH ===
+    if compact_events:
+        for e in success_events:
+            if e.get("fragmentation_index", 0) <= 1000:  # Cap at 1000
+                ax_compact.scatter(
+                    e["timestamp"],
+                    e.get("fragmentation_index", 0),
+                    s=get_dot_size(e["order"]),
+                    c="#2ecc71",
+                    alpha=0.3,
+                    edgecolors="none",
+                )
+
+        for e in failure_events:
+            y_pos = -50 - (e["order"] * 10)
+            ax_compact.scatter(
+                e["timestamp"],
+                y_pos,
+                s=get_dot_size(e["order"]),
+                c="#e74c3c",
+                alpha=0.3,
+                edgecolors="none",
+            )
+
+        ax_compact.axhline(
+            y=0, color="#34495e", linestyle="-", linewidth=1.5, alpha=0.8
+        )
+        ax_compact.grid(True, alpha=0.08, linestyle=":", linewidth=0.5)
+        ax_compact.set_ylim(-200, 1000)
+
+        # Add statistics
+        success_rate = (
+            len(success_events) / len(compact_events) * 100 if compact_events else 0
+        )
+        stats_text = f"Success: {len(success_events)}/{len(compact_events)} ({success_rate:.1f}%)"
+        ax_compact.text(
+            0.02,
+            0.98,
+            stats_text,
+            transform=ax_compact.transAxes,
+            fontsize=10,
+            verticalalignment="top",
+            bbox=dict(boxstyle="round,pad=0.5", facecolor="white", alpha=0.9),
+        )
+
+    ax_compact.set_xlabel("Time (seconds)", fontsize=11)
+    ax_compact.set_ylabel("Fragmentation Index", fontsize=11)
+    ax_compact.set_title("Compaction Events Over Time", fontsize=13, fontweight="bold")
+
+    # === EXTFRAG TIMELINE ===
+    if extfrag_events:
+        steal_events = [e for e in extfrag_events if e.get("is_steal")]
+        claim_events = [e for e in extfrag_events if not e.get("is_steal")]
+
+        steal_times, steal_counts = build_counts(steal_events, bin_size)
+        claim_times, claim_counts = build_counts(claim_events, bin_size)
+
+        if steal_times.size > 0:
+            ax_extfrag.fill_between(
+                steal_times, 0, steal_counts, alpha=0.3, color="#3498db"
+            )
+            ax_extfrag.plot(
+                steal_times,
+                steal_counts,
+                linewidth=2,
+                color="#2980b9",
+                alpha=0.8,
+                label=f"Steal ({len(steal_events)})",
+            )
+
+        if claim_times.size > 0:
+            ax_extfrag.fill_between(
+                claim_times, 0, claim_counts, alpha=0.3, color="#e67e22"
+            )
+            ax_extfrag.plot(
+                claim_times,
+                claim_counts,
+                linewidth=2,
+                color="#d35400",
+                alpha=0.8,
+                label=f"Claim ({len(claim_events)})",
+            )
+
+        ax_extfrag.legend(loc="upper right", frameon=True, fontsize=9)
+
+        # Add bad/good migration counts
+        bad_migrations = sum(
+            1
+            for e in extfrag_events
+            if get_migration_severity(e["migrate_from"], e["migrate_to"]) < 0
+        )
+        good_migrations = sum(
+            1
+            for e in extfrag_events
+            if get_migration_severity(e["migrate_from"], e["migrate_to"]) > 0
+        )
+
+        migration_text = f"Bad: {bad_migrations} | Good: {good_migrations}"
+        ax_extfrag.text(
+            0.02,
+            0.98,
+            migration_text,
+            transform=ax_extfrag.transAxes,
+            fontsize=10,
+            verticalalignment="top",
+            bbox=dict(boxstyle="round,pad=0.5", facecolor="white", alpha=0.9),
+        )
+
+    ax_extfrag.set_xlabel("Time (seconds)", fontsize=11)
+    ax_extfrag.set_ylabel(f"Events per {bin_size}s", fontsize=11)
+    ax_extfrag.set_title(
+        "External Fragmentation Events Timeline", fontsize=12, fontweight="semibold"
+    )
+    ax_extfrag.grid(True, alpha=0.06, linestyle=":", linewidth=0.5)
+
+    # === MIGRATION HEATMAP WITH SEVERITY ===
+    create_single_migration_heatmap(ax_migration, extfrag_events)
+
+    # Super title
+    fig.suptitle(
+        "Memory Fragmentation Analysis", fontsize=18, fontweight="bold", y=0.98
+    )
+
+    # Footer
+    timestamp_text = f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"
+    fig.text(
+        0.98,
+        0.01,
+        timestamp_text,
+        ha="right",
+        fontsize=9,
+        style="italic",
+        color="#7f8c8d",
+    )
+
+    # Adjust layout
+    plt.subplots_adjust(left=0.08, right=0.95, top=0.94, bottom=0.03)
+
+    # Save
+    if output_file is None:
+        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+        output_file = f"fragmentation_analysis_{timestamp}.png"
+
+    plt.savefig(output_file, dpi=120, bbox_inches="tight", facecolor="#f8f9fa")
+    plt.close(fig)
+    return output_file
+
+
+def create_single_migration_heatmap(ax, extfrag_events):
+    """Create migration heatmap for single dataset with severity indicators"""
+    if not extfrag_events:
+        ax.text(
+            0.5,
+            0.5,
+            "No external fragmentation events",
+            ha="center",
+            va="center",
+            fontsize=12,
+        )
+        ax.axis("off")
+        return
+
+    # Get time range and create bins
+    times = [e["timestamp"] for e in extfrag_events]
+    min_time, max_time = min(times), max(times)
+
+    n_bins = min(25, max(15, int((max_time - min_time) / 10)))
+    time_bins = np.linspace(min_time, max_time, n_bins + 1)
+    time_centers = (time_bins[:-1] + time_bins[1:]) / 2
+
+    # Get patterns and calculate severities
+    patterns = {}
+    pattern_events = defaultdict(list)
+
+    for e in extfrag_events:
+        pattern = f"{e['migrate_from']}→{e['migrate_to']}"
+        pattern_events[pattern].append(e)
+        if pattern not in patterns:
+            patterns[pattern] = {
+                "from": e["migrate_from"],
+                "to": e["migrate_to"],
+                "total": 0,
+                "steal": 0,
+                "claim": 0,
+                "severity": get_migration_severity(e["migrate_from"], e["migrate_to"]),
+            }
+        patterns[pattern]["total"] += 1
+        if e.get("is_steal"):
+            patterns[pattern]["steal"] += 1
+        else:
+            patterns[pattern]["claim"] += 1
+
+    # Sort by severity then count
+    sorted_patterns = sorted(
+        patterns.keys(), key=lambda p: (patterns[p]["severity"], -patterns[p]["total"])
+    )
+
+    # Create heatmap data
+    heatmap_data = np.zeros((len(sorted_patterns), len(time_centers)))
+
+    for i, pattern in enumerate(sorted_patterns):
+        for e in pattern_events[pattern]:
+            # Clamp so the event at exactly max_time lands in the last bin
+            bin_idx = min(
+                np.digitize(e["timestamp"], time_bins) - 1, len(time_centers) - 1
+            )
+            if 0 <= bin_idx < len(time_centers):
+                heatmap_data[i, bin_idx] += 1
+
+    # Plot heatmap
+    from matplotlib.colors import LinearSegmentedColormap
+
+    colors = ["#ffffff", "#ffeb3b", "#ff9800", "#f44336"]  # white → yellow → orange → red
+    cmap = LinearSegmentedColormap.from_list("intensity", colors, N=256)
+
+    im = ax.imshow(heatmap_data, aspect="auto", cmap=cmap, vmin=0, alpha=0.8)
+
+    # Overlay counts
+    for i in range(len(sorted_patterns)):
+        for j in range(len(time_centers)):
+            if heatmap_data[i, j] > 0:
+                count = int(heatmap_data[i, j])
+                color = (
+                    "white" if heatmap_data[i, j] > heatmap_data.max() / 2 else "black"
+                )
+                ax.text(
+                    j,
+                    i,
+                    str(count),
+                    ha="center",
+                    va="center",
+                    fontsize=6,
+                    fontweight="bold",
+                    color=color,
+                )
+
+    # Set axes
+    ax.set_xlim(-2.5, len(time_centers) - 0.5)
+    ax.set_ylim(-0.5, len(sorted_patterns) - 0.5)
+
+    # Set x-axis
+    ax.set_xticks(np.arange(len(time_centers)))
+    ax.set_xticklabels(
+        [f"{t:.0f}s" for t in time_centers], rotation=45, ha="right", fontsize=8
+    )
+
+    # Set y-axis with severity indicators
+    ax.set_yticks(np.arange(len(sorted_patterns)))
+    y_labels = []
+
+    for i, pattern in enumerate(sorted_patterns):
+        severity = patterns[pattern]["severity"]
+        severity_color = get_severity_color(severity)
+
+        # Add severity indicator
+        rect = Rectangle(
+            (-2.3, i - 0.4),
+            1.5,
+            0.8,
+            facecolor=severity_color,
+            alpha=0.8,
+            edgecolor="black",
+            linewidth=0.5,
+            clip_on=False,
+        )
+        ax.add_patch(rect)
+
+        # Add symbol
+        if severity <= -2:
+            symbol = "!!"
+        elif severity <= -1:
+            symbol = "!"
+        elif severity >= 1:
+            symbol = "+"
+        else:
+            symbol = "="
+
+        ax.text(
+            -1.55,
+            i,
+            symbol,
+            ha="center",
+            va="center",
+            fontsize=8,
+            fontweight="bold",
+            color="white" if abs(severity) > 0 else "black",
+        )
+
+        # Format label
+        total = patterns[pattern]["total"]
+        steal = patterns[pattern]["steal"]
+        claim = patterns[pattern]["claim"]
+        label = f"{pattern} ({total}: {steal}s/{claim}c)"
+        y_labels.append(label)
+
+    ax.set_yticklabels(y_labels, fontsize=8)
+
+    # Add colorbar
+    cbar = plt.colorbar(im, ax=ax, orientation="vertical", pad=0.02, aspect=30)
+    cbar.set_label("Event Intensity", fontsize=9)
+    cbar.ax.tick_params(labelsize=8)
+
+    # Add severity legend
+    bad_patch = mpatches.Patch(color="#8b0000", label="Bad (→UNMOVABLE)", alpha=0.8)
+    good_patch = mpatches.Patch(color="#51cf66", label="Good (→MOVABLE)", alpha=0.8)
+    neutral_patch = mpatches.Patch(color="#ffd43b", label="Neutral", alpha=0.8)
+
+    ax.legend(
+        handles=[bad_patch, neutral_patch, good_patch],
+        loc="upper right",
+        bbox_to_anchor=(1.15, 1.0),
+        title="Migration Impact",
+        fontsize=8,
+        title_fontsize=9,
+    )
+
+    # Styling
+    ax.set_xlabel("Time", fontsize=11)
+    ax.set_ylabel("Migration Pattern", fontsize=11)
+    ax.set_title(
+        "Migration Patterns Timeline with Severity Indicators",
+        fontsize=12,
+        fontweight="semibold",
+    )
+    ax.grid(False)
+
+    # Add grid lines
+    for i in range(len(sorted_patterns) + 1):
+        ax.axhline(i - 0.5, color="gray", linewidth=0.5, alpha=0.3)
+    for j in range(len(time_centers) + 1):
+        ax.axvline(j - 0.5, color="gray", linewidth=0.5, alpha=0.3)
+
+    # Summary
+    total_events = len(extfrag_events)
+    bad_events = sum(
+        patterns[p]["total"] for p in patterns if patterns[p]["severity"] < 0
+    )
+    good_events = sum(
+        patterns[p]["total"] for p in patterns if patterns[p]["severity"] > 0
+    )
+
+    summary = f"Total: {total_events} | Bad: {bad_events} | Good: {good_events}"
+    ax.text(
+        0.5,
+        -0.12,
+        summary,
+        transform=ax.transAxes,
+        ha="center",
+        fontsize=9,
+        style="italic",
+        color="#7f8c8d",
+    )
+
+
+def create_comparison_dashboard(data_a, data_b, labels, output_file=None):
+    """Create comprehensive comparison dashboard"""
+
+    # Create figure
+    fig = plt.figure(figsize=(20, 18), constrained_layout=False)
+    fig.patch.set_facecolor("#f8f9fa")
+
+    # Create grid layout - 4 rows, single column with more space for stats
+    gs = gridspec.GridSpec(
+        4, 1, height_ratios=[2.5, 2, 2.5, 1.5], hspace=0.45, figure=fig
+    )
+
+    # Create subplots
+    ax_compact = fig.add_subplot(gs[0])
+    ax_extfrag = fig.add_subplot(gs[1])
+    ax_migration = fig.add_subplot(gs[2])
+    ax_stats = fig.add_subplot(gs[3])
+
+    # Create visualizations
+    create_overlaid_compaction_graph(ax_compact, data_a, data_b, labels)
+    create_overlaid_extfrag_timeline(ax_extfrag, data_a, data_b, labels)
+    create_combined_migration_heatmap(ax_migration, data_a, data_b, labels)
+    create_comparison_statistics_table(ax_stats, data_a, data_b, labels)
+
+    # Super title
+    fig.suptitle(
+        "Memory Fragmentation A/B Comparison Analysis",
+        fontsize=18,
+        fontweight="bold",
+        y=0.98,
+    )
+
+    # Footer
+    timestamp_text = f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"
+    fig.text(
+        0.98,
+        0.01,
+        timestamp_text,
+        ha="right",
+        fontsize=9,
+        style="italic",
+        color="#7f8c8d",
+    )
+
+    # Adjust layout with more bottom margin for stats table
+    plt.subplots_adjust(left=0.08, right=0.95, top=0.94, bottom=0.05)
+
+    # Save
+    if output_file is None:
+        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+        output_file = f"fragmentation_comparison_{timestamp}.png"
+
+    plt.savefig(output_file, dpi=120, bbox_inches="tight", facecolor="#f8f9fa")
+    plt.close(fig)
+    return output_file
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Fragmentation analysis with optional comparison"
+    )
+    parser.add_argument("input_file", help="Primary JSON file")
+    parser.add_argument(
+        "--compare", help="Secondary JSON file for A/B comparison (optional)"
+    )
+    parser.add_argument("-o", "--output", help="Output filename")
+    parser.add_argument(
+        "--labels",
+        nargs=2,
+        default=["Light Load", "Heavy Load"],
+        help="Labels for the two datasets in comparison mode",
+    )
+    parser.add_argument(
+        "--bin", type=float, default=0.5, help="Bin size in seconds for event counts"
+    )
+    args = parser.parse_args()
+
+    try:
+        data_a = load_data(args.input_file)
+    except Exception as e:
+        print(f"Error loading primary data: {e}")
+        sys.exit(1)
+
+    if args.compare:
+        # Comparison mode
+        try:
+            data_b = load_data(args.compare)
+        except Exception as e:
+            print(f"Error loading comparison data: {e}")
+            sys.exit(1)
+
+        out = create_comparison_dashboard(data_a, data_b, args.labels, args.output)
+        print(f"Comparison saved: {out}")
+    else:
+        # Single file mode
+        out = create_single_dashboard(data_a, args.output, args.bin)
+        print(f"Analysis saved: {out}")
+
+
+if __name__ == "__main__":
+    main()
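The heatmap helpers above all share one binning scheme: fixed-width time bins built with np.linspace, with each event mapped to a bin via np.digitize. A minimal standalone sketch (the event list is made up for illustration, not taken from the patch) shows the scheme, including clamping the event at exactly max_time into the last bin, which a bare `np.digitize(...) - 1` would otherwise index out of range:

```python
import numpy as np

# Hypothetical events; only "timestamp" matters for binning.
events = [{"timestamp": t} for t in [0.0, 1.2, 1.3, 4.9, 5.0]]
times = [e["timestamp"] for e in events]
min_time, max_time = min(times), max(times)

n_bins = 5
time_bins = np.linspace(min_time, max_time, n_bins + 1)  # 6 edges -> 5 bins

counts = np.zeros(n_bins)
for e in events:
    # np.digitize is 1-based; the event at exactly max_time maps one past
    # the last bin, so clamp it back instead of dropping it.
    bin_idx = min(np.digitize(e["timestamp"], time_bins) - 1, n_bins - 1)
    counts[bin_idx] += 1

print(counts.tolist())  # [1.0, 2.0, 0.0, 0.0, 2.0]
```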
diff --git a/playbooks/roles/monitoring/tasks/monitor_collect.yml b/playbooks/roles/monitoring/tasks/monitor_collect.yml
index 5432fc879bd0..f57a4e9d8106 100644
--- a/playbooks/roles/monitoring/tasks/monitor_collect.yml
+++ b/playbooks/roles/monitoring/tasks/monitor_collect.yml
@@ -206,3 +206,129 @@
     - monitor_developmental_stats|default(false)|bool
     - monitor_folio_migration|default(false)|bool
     - folio_migration_data_file.stat.exists|default(false)
+
+# Plot-fragmentation collection tasks
+- name: Check if fragmentation monitoring was started
+  become: true
+  become_method: sudo
+  ansible.builtin.stat:
+    path: "{{ monitor_fragmentation_output_dir|default('/root/monitoring/fragmentation') }}/fragmentation_tracker.pid"
+  register: fragmentation_pid_file
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+
+- name: Stop fragmentation monitoring
+  become: true
+  become_method: sudo
+  ansible.builtin.shell: |
+    output_dir="{{ monitor_fragmentation_output_dir|default('/root/monitoring/fragmentation') }}"
+    if [ -f "${output_dir}/fragmentation_tracker.pid" ]; then
+      pid=$(cat "${output_dir}/fragmentation_tracker.pid")
+      if ps -p $pid > /dev/null 2>&1; then
+        kill -SIGINT $pid  # Use SIGINT to allow graceful shutdown
+        sleep 2  # Give it time to save data
+        if ps -p $pid > /dev/null 2>&1; then
+          kill -SIGTERM $pid  # Escalate to SIGTERM if still running
+        fi
+        echo "Stopped fragmentation monitoring process $pid"
+      else
+        echo "Fragmentation monitoring process $pid was not running"
+      fi
+      rm -f "${output_dir}/fragmentation_tracker.pid"
+    fi
+
+    # Save the end time
+    date +"%Y-%m-%d %H:%M:%S" > "${output_dir}/end_time.txt"
+  register: stop_fragmentation_monitor
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+    - fragmentation_pid_file.stat.exists|default(false)
+
+- name: Display stop fragmentation monitoring status
+  ansible.builtin.debug:
+    msg: "{{ stop_fragmentation_monitor.stdout }}"
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+    - stop_fragmentation_monitor is defined
+    - stop_fragmentation_monitor.changed|default(false)
+
+- name: Generate fragmentation visualization
+  become: true
+  become_method: sudo
+  ansible.builtin.shell: |
+    cd /opt/fragmentation
+    output_dir="{{ monitor_fragmentation_output_dir|default('/root/monitoring/fragmentation') }}"
+
+    # Run the visualizer if data exists
+    if [ -f "${output_dir}/fragmentation_data.json" ]; then
+      python3 fragmentation_visualizer.py \
+        "${output_dir}/fragmentation_data.json" \
+        --output "${output_dir}/fragmentation_plot.png" 2>&1 | tee "${output_dir}/visualizer.log"
+      echo "Generated fragmentation visualization"
+    else
+      echo "No fragmentation data found to visualize"
+    fi
+  register: generate_fragmentation_plot
+  ignore_errors: true
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+
+- name: List fragmentation monitoring output files
+  become: true
+  become_method: sudo
+  ansible.builtin.find:
+    paths: "{{ monitor_fragmentation_output_dir|default('/root/monitoring/fragmentation') }}"
+    patterns: "*"
+    file_type: file
+  register: fragmentation_output_files
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+
+- name: Create local fragmentation results directory
+  ansible.builtin.file:
+    path: "{{ monitoring_results_path }}/fragmentation"
+    state: directory
+  delegate_to: localhost
+  run_once: true
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+    - fragmentation_output_files.files is defined
+    - fragmentation_output_files.files | length > 0
+
+- name: Copy fragmentation monitoring data to localhost
+  become: true
+  become_method: sudo
+  ansible.builtin.fetch:
+    src: "{{ item.path }}"
+    dest: "{{ monitoring_results_path }}/fragmentation/{{ ansible_hostname }}_{{ item.path | basename }}"
+    flat: true
+    validate_checksum: false
+  loop: "{{ fragmentation_output_files.files | default([]) }}"
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+    - fragmentation_output_files.files is defined
+
+- name: Display fragmentation monitoring collection summary
+  ansible.builtin.debug:
+    msg: |
+      Fragmentation monitoring collection complete.
+      {% if fragmentation_output_files.files is defined and fragmentation_output_files.files | length > 0 %}
+      Collected {{ fragmentation_output_files.files | length }} files
+      Data saved to: {{ monitoring_results_path }}/fragmentation/
+      Files collected:
+      {% for file in fragmentation_output_files.files %}
+        - {{ ansible_hostname }}_{{ file.path | basename }}
+      {% endfor %}
+      {% else %}
+      No fragmentation data was collected.
+      {% endif %}
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
diff --git a/playbooks/roles/monitoring/tasks/monitor_run.yml b/playbooks/roles/monitoring/tasks/monitor_run.yml
index f56d06e4facf..c563b38bc0b5 100644
--- a/playbooks/roles/monitoring/tasks/monitor_run.yml
+++ b/playbooks/roles/monitoring/tasks/monitor_run.yml
@@ -81,3 +81,126 @@
     - monitor_folio_migration|default(false)|bool
     - folio_migration_stats_file.stat.exists|default(false)
     - monitor_status is defined
+
+# Plot-fragmentation monitoring tasks
+- name: Install python3-bpfcc for fragmentation monitoring
+  become: true
+  become_method: sudo
+  ansible.builtin.package:
+    name: python3-bpfcc
+    state: present
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+
+- name: Install matplotlib for fragmentation visualization
+  become: true
+  become_method: sudo
+  ansible.builtin.pip:
+    name: matplotlib
+    state: present
+    executable: pip3
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+
+- name: Create fragmentation scripts directory
+  become: true
+  become_method: sudo
+  ansible.builtin.file:
+    path: /opt/fragmentation
+    state: directory
+    mode: "0755"
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+
+- name: Copy fragmentation monitoring scripts to target
+  become: true
+  become_method: sudo
+  ansible.builtin.copy:
+    src: "{{ item }}"
+    dest: "/opt/fragmentation/{{ item | basename }}"
+    mode: "0755"
+  loop:
+    - "{{ playbook_dir }}/roles/monitoring/files/fragmentation_tracker.py"
+    - "{{ playbook_dir }}/roles/monitoring/files/fragmentation_visualizer.py"
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+
+- name: Create fragmentation monitoring output directory
+  become: true
+  become_method: sudo
+  ansible.builtin.file:
+    path: "{{ monitor_fragmentation_output_dir|default('/root/monitoring/fragmentation') }}"
+    state: directory
+    mode: "0755"
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+
+- name: Start fragmentation monitoring in background
+  become: true
+  become_method: sudo
+  ansible.builtin.shell: |
+    cd /opt/fragmentation
+    duration="{{ monitor_fragmentation_duration|default(0) }}"
+    output_dir="{{ monitor_fragmentation_output_dir|default('/root/monitoring/fragmentation') }}"
+
+    # Start the fragmentation tracker
+    if [ "$duration" -eq "0" ]; then
+      # Run continuously until killed
+      nohup python3 fragmentation_tracker.py > "${output_dir}/fragmentation_tracker.log" 2>&1 &
+    else
+      # Run for specified duration
+      nohup timeout ${duration} python3 fragmentation_tracker.py > "${output_dir}/fragmentation_tracker.log" 2>&1 &
+    fi
+    echo $! > "${output_dir}/fragmentation_tracker.pid"
+
+    # Also save the start time for reference
+    date +"%Y-%m-%d %H:%M:%S" > "${output_dir}/start_time.txt"
+  async: 86400 # Run for up to 24 hours
+  poll: 0
+  register: fragmentation_monitor
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+
+- name: Save fragmentation monitor async job ID
+  ansible.builtin.set_fact:
+    fragmentation_monitor_job: "{{ fragmentation_monitor.ansible_job_id }}"
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+    - fragmentation_monitor is defined
+
+- name: Verify fragmentation monitoring started successfully
+  become: true
+  become_method: sudo
+  ansible.builtin.shell: |
+    output_dir="{{ monitor_fragmentation_output_dir|default('/root/monitoring/fragmentation') }}"
+    if [ -f "${output_dir}/fragmentation_tracker.pid" ]; then
+      pid=$(cat "${output_dir}/fragmentation_tracker.pid")
+      if ps -p $pid > /dev/null 2>&1; then
+        echo "Fragmentation monitoring process $pid is running"
+      else
+        echo "ERROR: Fragmentation monitoring process $pid is not running" >&2
+        exit 1
+      fi
+    else
+      echo "ERROR: PID file not found" >&2
+      exit 1
+    fi
+  register: fragmentation_monitor_status
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+
+- name: Display fragmentation monitoring status
+  ansible.builtin.debug:
+    msg: "{{ fragmentation_monitor_status.stdout }}"
+  when:
+    - monitor_developmental_stats|default(false)|bool
+    - monitor_memory_fragmentation|default(false)|bool
+    - fragmentation_monitor_status is defined
-- 
2.45.2
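[Editor's note: the SIGINT-then-SIGTERM shutdown sequence in monitor_collect.yml above assumes fragmentation_tracker.py installs a SIGINT handler that flushes collected events to JSON before exiting. A minimal sketch of that pattern follows; the file name and data structure are illustrative assumptions, not the actual tracker code.]

```python
import json
import signal
import sys

events = []  # tracepoint events accumulated while monitoring (illustrative)

def flush_and_exit(signum, frame):
    # Persist collected data so the collection task can fetch
    # fragmentation_data.json after the process exits.
    with open("fragmentation_data.json", "w") as f:
        json.dump({"events": events}, f)
    sys.exit(0)

# SIGINT triggers a graceful save; an unhandled SIGTERM still kills the
# process, which is why the playbook escalates to it only as a fallback.
signal.signal(signal.SIGINT, flush_and_exit)
```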


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/5] mmtests: add monitoring framework integration
  2025-09-04  9:13 [PATCH 0/5] add memory fragmentation automation testing Luis Chamberlain
  2025-09-04  9:13 ` [PATCH 1/5] monitoring: add memory fragmentation eBPF monitoring support Luis Chamberlain
@ 2025-09-04  9:13 ` Luis Chamberlain
  2025-09-04  9:13 ` [PATCH 3/5] sysbench: " Luis Chamberlain
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Luis Chamberlain @ 2025-09-04  9:13 UTC (permalink / raw)
  To: Chuck Lever, Daniel Gomez, kdevops; +Cc: Luis Chamberlain

Add support for the monitoring framework to the mmtests workflow,
matching the implementation in fstests. This lets mmtests leverage all
monitoring capabilities, including the new eBPF fragmentation monitoring.

Changes:
- Import monitor_run tasks before test execution
- Import monitor_collect tasks after test completion
- Add monitor-results make target for interim data collection
- Make monitoring results path workflow-aware (fstests vs mmtests)

With these changes, users can simply enable monitoring options in
menuconfig and mmtests will automatically start/stop monitoring
during test execution.
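
The integration pattern is symmetric around the test run; a sketch of the
two imports as they appear in a workflow role's task list (paths relative
to the role, matching the kdevops layout):

```yaml
# Start monitoring before the workflow's tests run...
- ansible.builtin.import_tasks: ../../monitoring/tasks/monitor_run.yml
  when:
    - enable_monitoring|default(false)|bool
  tags: ["run_tests", "monitoring", "monitor_run"]

# ... the workflow's own test tasks go here ...

# ...and collect (and stop) monitoring once the tests complete.
- ansible.builtin.import_tasks: ../../monitoring/tasks/monitor_collect.yml
  when:
    - enable_monitoring|default(false)|bool
  tags: ["run_tests", "monitoring", "monitor_collect"]
```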

Generated-by: Claude AI
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 .../roles/mmtests/tasks/install-deps/debian/main.yml |  1 +
 .../roles/mmtests/tasks/install-deps/redhat/main.yml |  1 +
 .../roles/mmtests/tasks/install-deps/suse/main.yml   |  1 +
 playbooks/roles/mmtests/tasks/main.yaml              | 12 ++++++++++++
 playbooks/roles/monitoring/tasks/monitor_collect.yml | 11 ++++++++++-
 workflows/mmtests/Makefile                           |  8 ++++++++
 6 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/playbooks/roles/mmtests/tasks/install-deps/debian/main.yml b/playbooks/roles/mmtests/tasks/install-deps/debian/main.yml
index 62de1e2a0c3e..7195ea527256 100644
--- a/playbooks/roles/mmtests/tasks/install-deps/debian/main.yml
+++ b/playbooks/roles/mmtests/tasks/install-deps/debian/main.yml
@@ -60,6 +60,7 @@
     name:
       - trace-cmd
       - perf-tools-unstable
+      - python3-bpfcc
     state: present
     update_cache: true
   tags: ["deps"]
diff --git a/playbooks/roles/mmtests/tasks/install-deps/redhat/main.yml b/playbooks/roles/mmtests/tasks/install-deps/redhat/main.yml
index debfb577c906..c62ff09fdde5 100644
--- a/playbooks/roles/mmtests/tasks/install-deps/redhat/main.yml
+++ b/playbooks/roles/mmtests/tasks/install-deps/redhat/main.yml
@@ -52,6 +52,7 @@
       - kernel-tools
       - trace-cmd
       - perf
+      - python3-bcc
     state: present
   tags: ["deps"]
   ignore_errors: true
diff --git a/playbooks/roles/mmtests/tasks/install-deps/suse/main.yml b/playbooks/roles/mmtests/tasks/install-deps/suse/main.yml
index 5a6c2b54c332..e357232d11a2 100644
--- a/playbooks/roles/mmtests/tasks/install-deps/suse/main.yml
+++ b/playbooks/roles/mmtests/tasks/install-deps/suse/main.yml
@@ -52,6 +52,7 @@
       - kernel-default-devel
       - trace-cmd
       - perf
+      - python3-bcc
     state: present
   tags: ["deps"]
   ignore_errors: true
diff --git a/playbooks/roles/mmtests/tasks/main.yaml b/playbooks/roles/mmtests/tasks/main.yaml
index 6e9593662a6b..fd05843f8b2c 100644
--- a/playbooks/roles/mmtests/tasks/main.yaml
+++ b/playbooks/roles/mmtests/tasks/main.yaml
@@ -217,6 +217,12 @@
   ansible.builtin.debug:
     msg: "Kernel version on {{ inventory_hostname }} : {{ kernel_version.stdout }}"
 
+# Start monitoring services before running tests
+- ansible.builtin.import_tasks: ../../monitoring/tasks/monitor_run.yml
+  when:
+    - enable_monitoring|default(false)|bool
+  tags: ["run_tests", "monitoring", "monitor_run"]
+
 - name: Run mmtests in background
   tags: ["run_tests"]
   become: true
@@ -239,6 +245,12 @@
   retries: 1440 # 12 hours
   delay: 60 # check every 60 seconds
 
+# Collect monitoring data after tests complete
+- ansible.builtin.import_tasks: ../../monitoring/tasks/monitor_collect.yml
+  when:
+    - enable_monitoring|default(false)|bool
+  tags: ["run_tests", "monitoring", "monitor_collect"]
+
 - name: Create local results directory
   delegate_to: localhost
   ansible.builtin.file:
diff --git a/playbooks/roles/monitoring/tasks/monitor_collect.yml b/playbooks/roles/monitoring/tasks/monitor_collect.yml
index f57a4e9d8106..967526b40428 100644
--- a/playbooks/roles/monitoring/tasks/monitor_collect.yml
+++ b/playbooks/roles/monitoring/tasks/monitor_collect.yml
@@ -110,7 +110,16 @@
 
 - name: Set monitoring results path
   ansible.builtin.set_fact:
-    monitoring_results_path: "{{ monitoring_results_base_path | default(topdir_path + '/workflows/fstests/results/monitoring') }}"
+    monitoring_results_path: >-
+      {%- if monitoring_results_base_path is defined -%}
+        {{ monitoring_results_base_path }}
+      {%- elif kdevops_run_fstests|default(false)|bool -%}
+        {{ topdir_path }}/workflows/fstests/results/monitoring
+      {%- elif kdevops_workflow_enable_mmtests|default(false)|bool -%}
+        {{ topdir_path }}/workflows/mmtests/results/monitoring
+      {%- else -%}
+        {{ topdir_path }}/results/monitoring
+      {%- endif -%}
 
 - name: Create local monitoring results directory
   ansible.builtin.file:
diff --git a/workflows/mmtests/Makefile b/workflows/mmtests/Makefile
index 69db9505284d..c6248b95c05c 100644
--- a/workflows/mmtests/Makefile
+++ b/workflows/mmtests/Makefile
@@ -43,6 +43,12 @@ mmtests-compare:
 		--tags deps,compare \
 		$(MMTESTS_ARGS)
 
+monitor-results: $(KDEVOPS_EXTRA_VARS)
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		playbooks/monitor-results.yml \
+		--extra-vars=@./extra_vars.yaml \
+		$(MMTESTS_ARGS)
+
 mmtests-clean:
 	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
 		playbooks/mmtests.yml \
@@ -58,6 +64,7 @@ mmtests-help:
 	@echo "mmtests-tests                : Run mmtests tests"
 	@echo "mmtests-results              : Copy results from guests"
 	@echo "mmtests-compare              : Compare baseline and dev results (AB testing)"
+	@echo "monitor-results              : Collect interim monitoring data without stopping monitoring"
 	@echo "mmtests-clean                : Clean up mmtests installation"
 	@echo ""
 
@@ -69,6 +76,7 @@ PHONY +: mmtests-dev
 PHONY +: mmtests-tests
 PHONY +: mmtests-results
 PHONY +: mmtests-compare
+PHONY +: monitor-results
 PHONY +: mmtests-clean
 PHONY +: mmtests-help
 .PHONY: $(PHONY)
-- 
2.45.2



* [PATCH 3/5] sysbench: add monitoring framework integration
  2025-09-04  9:13 [PATCH 0/5] add memory fragmentation automation testing Luis Chamberlain
  2025-09-04  9:13 ` [PATCH 1/5] monitoring: add memory fragmentation eBPF monitoring support Luis Chamberlain
  2025-09-04  9:13 ` [PATCH 2/5] mmtests: add monitoring framework integration Luis Chamberlain
@ 2025-09-04  9:13 ` Luis Chamberlain
  2025-09-04  9:13 ` [PATCH 4/5] ai milvus: add monitoring support Luis Chamberlain
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Luis Chamberlain @ 2025-09-04  9:13 UTC (permalink / raw)
  To: Chuck Lever, Daniel Gomez, kdevops; +Cc: Luis Chamberlain

Add support for the monitoring framework to the sysbench workflow,
matching the implementation in other workflows. This lets sysbench
leverage all monitoring capabilities, including the new eBPF
fragmentation monitoring.

Changes:
- Import monitor_run tasks before test execution
- Import monitor_collect tasks after test completion
- Add monitor-results make target for interim data collection
- Add sysbench to monitoring results path detection
- Install python3-bpfcc/python3-bcc dependency across all distributions

With these changes, users can simply enable monitoring options in
menuconfig and sysbench will automatically start/stop monitoring
during test execution.

Generated-by: Claude AI
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 playbooks/roles/monitoring/tasks/monitor_collect.yml |  2 ++
 .../sysbench/tasks/install-deps/debian/main.yml      |  2 ++
 .../sysbench/tasks/install-deps/redhat/main.yml      |  1 +
 .../roles/sysbench/tasks/install-deps/suse/main.yml  |  1 +
 playbooks/roles/sysbench/tasks/main.yaml             | 12 ++++++++++++
 workflows/sysbench/Makefile                          |  8 +++++++-
 6 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/playbooks/roles/monitoring/tasks/monitor_collect.yml b/playbooks/roles/monitoring/tasks/monitor_collect.yml
index 967526b40428..9662f827301f 100644
--- a/playbooks/roles/monitoring/tasks/monitor_collect.yml
+++ b/playbooks/roles/monitoring/tasks/monitor_collect.yml
@@ -117,6 +117,8 @@
         {{ topdir_path }}/workflows/fstests/results/monitoring
       {%- elif kdevops_workflow_enable_mmtests|default(false)|bool -%}
         {{ topdir_path }}/workflows/mmtests/results/monitoring
+      {%- elif kdevops_workflow_enable_sysbench|default(false)|bool -%}
+        {{ topdir_path }}/workflows/sysbench/results/monitoring
       {%- else -%}
         {{ topdir_path }}/results/monitoring
       {%- endif -%}
diff --git a/playbooks/roles/sysbench/tasks/install-deps/debian/main.yml b/playbooks/roles/sysbench/tasks/install-deps/debian/main.yml
index bd091389edf8..0a7943640797 100644
--- a/playbooks/roles/sysbench/tasks/install-deps/debian/main.yml
+++ b/playbooks/roles/sysbench/tasks/install-deps/debian/main.yml
@@ -25,6 +25,7 @@
       - docker.io
       - locales
       - rsync
+      - python3-bpfcc
     state: present
     update_cache: true
   tags: ["deps"]
@@ -72,6 +73,7 @@
     name:
       - locales
       - rsync
+      - python3-bpfcc
     state: present
     update_cache: true
   when: "sysbench_type_postgresql_native|bool"
diff --git a/playbooks/roles/sysbench/tasks/install-deps/redhat/main.yml b/playbooks/roles/sysbench/tasks/install-deps/redhat/main.yml
index 9b780db2dd13..04e67434fb5b 100644
--- a/playbooks/roles/sysbench/tasks/install-deps/redhat/main.yml
+++ b/playbooks/roles/sysbench/tasks/install-deps/redhat/main.yml
@@ -22,3 +22,4 @@
   vars:
     packages:
       - docker
+      - python3-bcc
diff --git a/playbooks/roles/sysbench/tasks/install-deps/suse/main.yml b/playbooks/roles/sysbench/tasks/install-deps/suse/main.yml
index f9e78f3881ad..0dfb2f651670 100644
--- a/playbooks/roles/sysbench/tasks/install-deps/suse/main.yml
+++ b/playbooks/roles/sysbench/tasks/install-deps/suse/main.yml
@@ -62,6 +62,7 @@
   ansible.builtin.package:
     name:
       - docker
+      - python3-bcc
     state: present
   when:
     - repos_present|bool
diff --git a/playbooks/roles/sysbench/tasks/main.yaml b/playbooks/roles/sysbench/tasks/main.yaml
index 77c57d6a9dee..bd550428d8c7 100644
--- a/playbooks/roles/sysbench/tasks/main.yaml
+++ b/playbooks/roles/sysbench/tasks/main.yaml
@@ -30,6 +30,12 @@
     name: create_data_partition
   tags: ["mkfs"]
 
+# Start monitoring services before running tests
+- ansible.builtin.import_tasks: ../../monitoring/tasks/monitor_run.yml
+  when:
+    - enable_monitoring|default(false)|bool
+  tags: ["run_sysbench", "monitoring", "monitor_run"]
+
 - name: MySQL Docker
   ansible.builtin.import_tasks: mysql-docker/main.yaml
   when: sysbench_type_mysql_docker | bool
@@ -37,3 +43,9 @@
 - name: PostgreSQL Native
   ansible.builtin.import_tasks: postgresql-native/main.yaml
   when: sysbench_type_postgresql_native | bool
+
+# Collect monitoring data after tests complete
+- ansible.builtin.import_tasks: ../../monitoring/tasks/monitor_collect.yml
+  when:
+    - enable_monitoring|default(false)|bool
+  tags: ["run_sysbench", "monitoring", "monitor_collect"]
diff --git a/workflows/sysbench/Makefile b/workflows/sysbench/Makefile
index 66e594d34841..45347d79ed1f 100644
--- a/workflows/sysbench/Makefile
+++ b/workflows/sysbench/Makefile
@@ -1,4 +1,4 @@
-PHONY += sysbench sysbench-test sysbench-telemetry sysbench-help-menu
+PHONY += sysbench sysbench-test sysbench-telemetry sysbench-help-menu monitor-results
 
 
 TAGS_SYSBENCH_RUN := db_start
@@ -47,6 +47,11 @@ sysbench-results:
 		playbooks/sysbench.yml \
 		--tags $(subst $(space),$(comma),$(TAGS_SYSBENCH_RESULTS))
 
+monitor-results: $(KDEVOPS_EXTRA_VARS)
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		playbooks/monitor-results.yml \
+		--extra-vars=@./extra_vars.yaml
+
 sysbench-clean:
 	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
 		playbooks/sysbench.yml \
@@ -65,6 +70,7 @@ sysbench-help-menu:
 	@echo "sysbench-test                     - Run sysbench tests and collect results (with telemetry)"
 	@echo "sysbench-telemetry                - Gather sysbench telemetry data on each node"
 	@echo "sysbench-results                  - Collect all sysbench results onto local host"
+	@echo "monitor-results                   - Collect interim monitoring data without stopping monitoring"
 	@echo "sysbench-clean                    - Remove any previous results on node and host"
 	@echo ""
 
-- 
2.45.2



* [PATCH 4/5] ai milvus: add monitoring support
  2025-09-04  9:13 [PATCH 0/5] add memory fragmentation automation testing Luis Chamberlain
                   ` (2 preceding siblings ...)
  2025-09-04  9:13 ` [PATCH 3/5] sysbench: " Luis Chamberlain
@ 2025-09-04  9:13 ` Luis Chamberlain
  2025-09-04  9:13 ` [PATCH 5/5] minio: " Luis Chamberlain
  2025-09-19  3:49 ` [PATCH 0/5] add memory fragmentation automation testing Luis Chamberlain
  5 siblings, 0 replies; 7+ messages in thread
From: Luis Chamberlain @ 2025-09-04  9:13 UTC (permalink / raw)
  To: Chuck Lever, Daniel Gomez, kdevops; +Cc: Luis Chamberlain

Add monitoring framework integration to the AI Milvus workflow to
track performance during vector database benchmarks.

Generated-by: Claude AI
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 playbooks/ai_benchmark.yml                         | 14 ++++++++++++++
 playbooks/roles/milvus/tasks/install_docker.yml    |  2 ++
 .../roles/monitoring/tasks/monitor_collect.yml     |  2 ++
 workflows/ai/Makefile                              |  5 +++++
 4 files changed, 23 insertions(+)

diff --git a/playbooks/ai_benchmark.yml b/playbooks/ai_benchmark.yml
index 85fc117c8ba7..7e681328e2bc 100644
--- a/playbooks/ai_benchmark.yml
+++ b/playbooks/ai_benchmark.yml
@@ -3,6 +3,20 @@
   hosts: ai
   vars:
     ai_vector_db_milvus_benchmark_enable: true
+  pre_tasks:
+    # Start monitoring services before running benchmarks
+    - ansible.builtin.import_tasks: roles/monitoring/tasks/monitor_run.yml
+      when:
+        - enable_monitoring|default(false)|bool
+      tags: ["monitoring", "monitor_run"]
+
   roles:
     - role: milvus
       tags: ['ai', 'vector_db', 'milvus', 'benchmark']
+
+  post_tasks:
+    # Collect monitoring data after benchmarks complete
+    - ansible.builtin.import_tasks: roles/monitoring/tasks/monitor_collect.yml
+      when:
+        - enable_monitoring|default(false)|bool
+      tags: ["monitoring", "monitor_collect"]
diff --git a/playbooks/roles/milvus/tasks/install_docker.yml b/playbooks/roles/milvus/tasks/install_docker.yml
index e1e1d911aee0..8c3a80430cd7 100644
--- a/playbooks/roles/milvus/tasks/install_docker.yml
+++ b/playbooks/roles/milvus/tasks/install_docker.yml
@@ -14,6 +14,7 @@
       - python3-pip
       - python3-setuptools
       - python3-packaging
+      - python3-bpfcc
     state: present
   become: true
   when:
@@ -35,6 +36,7 @@
       - docker-compose
       - python3-pip
       - python3-setuptools
+      - python3-bcc
     state: present
   become: true
   when:
diff --git a/playbooks/roles/monitoring/tasks/monitor_collect.yml b/playbooks/roles/monitoring/tasks/monitor_collect.yml
index 9662f827301f..83193ed459b6 100644
--- a/playbooks/roles/monitoring/tasks/monitor_collect.yml
+++ b/playbooks/roles/monitoring/tasks/monitor_collect.yml
@@ -119,6 +119,8 @@
         {{ topdir_path }}/workflows/mmtests/results/monitoring
       {%- elif kdevops_workflow_enable_sysbench|default(false)|bool -%}
         {{ topdir_path }}/workflows/sysbench/results/monitoring
+      {%- elif kdevops_workflow_enable_ai|default(false)|bool -%}
+        {{ topdir_path }}/workflows/ai/results/monitoring
       {%- else -%}
         {{ topdir_path }}/results/monitoring
       {%- endif -%}
diff --git a/workflows/ai/Makefile b/workflows/ai/Makefile
index 7e9b8af236b7..2ffa64c727fa 100644
--- a/workflows/ai/Makefile
+++ b/workflows/ai/Makefile
@@ -101,6 +101,11 @@ ai-results-baseline:
 ai-results-dev:
 	$(Q)$(MAKE) ai-results HOSTS="dev"
 
+monitor-results: $(KDEVOPS_EXTRA_VARS)
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		playbooks/monitor-results.yml \
+		--extra-vars=@./extra_vars.yaml
+
 ai-setup:
 	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
 		-i hosts \
-- 
2.45.2



* [PATCH 5/5] minio: add monitoring support
  2025-09-04  9:13 [PATCH 0/5] add memory fragmentation automation testing Luis Chamberlain
                   ` (3 preceding siblings ...)
  2025-09-04  9:13 ` [PATCH 4/5] ai milvus: add monitoring support Luis Chamberlain
@ 2025-09-04  9:13 ` Luis Chamberlain
  2025-09-19  3:49 ` [PATCH 0/5] add memory fragmentation automation testing Luis Chamberlain
  5 siblings, 0 replies; 7+ messages in thread
From: Luis Chamberlain @ 2025-09-04  9:13 UTC (permalink / raw)
  To: Chuck Lever, Daniel Gomez, kdevops; +Cc: Luis Chamberlain

Add monitoring framework integration to the MinIO Warp workflow to
track performance during S3 benchmarking tests.

Generated-by: Claude AI
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 playbooks/minio.yml                           | 15 ++++++++++++
 playbooks/roles/minio_install/tasks/main.yml  | 24 ++++++++++++++++++-
 .../monitoring/tasks/monitor_collect.yml      |  4 ++++
 workflows/minio/Makefile                      |  9 ++++++-
 4 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/playbooks/minio.yml b/playbooks/minio.yml
index bf80bbf4eaa0..a3b2be26ed91 100644
--- a/playbooks/minio.yml
+++ b/playbooks/minio.yml
@@ -25,9 +25,24 @@
   become: true
   become_user: root
   tags: ['minio_warp']
+
+  pre_tasks:
+    # Start monitoring services before running benchmarks
+    - ansible.builtin.import_tasks: roles/monitoring/tasks/monitor_run.yml
+      when:
+        - enable_monitoring|default(false)|bool
+      tags: ["monitoring", "monitor_run"]
+
   roles:
     - role: minio_warp_run
 
+  post_tasks:
+    # Collect monitoring data after benchmarks complete
+    - ansible.builtin.import_tasks: roles/monitoring/tasks/monitor_collect.yml
+      when:
+        - enable_monitoring|default(false)|bool
+      tags: ["monitoring", "monitor_collect"]
+
 - name: Uninstall MinIO
   hosts: minio
   become: true
diff --git a/playbooks/roles/minio_install/tasks/main.yml b/playbooks/roles/minio_install/tasks/main.yml
index 9ea3d758adcb..e8ec3067b8b3 100644
--- a/playbooks/roles/minio_install/tasks/main.yml
+++ b/playbooks/roles/minio_install/tasks/main.yml
@@ -6,13 +6,35 @@
     - "../extra_vars.yaml"
   tags: vars
 
-- name: Install Docker
+- name: Install Docker and monitoring dependencies
   package:
     name:
       - docker.io
       - python3-docker
+      - python3-bpfcc
     state: present
   become: yes
+  when: ansible_os_family == "Debian"
+
+- name: Install Docker and monitoring dependencies (RedHat)
+  package:
+    name:
+      - docker
+      - python3-docker
+      - python3-bcc
+    state: present
+  become: yes
+  when: ansible_os_family == "RedHat"
+
+- name: Install Docker and monitoring dependencies (SUSE)
+  package:
+    name:
+      - docker
+      - python3-docker
+      - python3-bcc
+    state: present
+  become: yes
+  when: ansible_os_family == "SUSE"
 
 - name: Ensure Docker service is running
   systemd:
diff --git a/playbooks/roles/monitoring/tasks/monitor_collect.yml b/playbooks/roles/monitoring/tasks/monitor_collect.yml
index 83193ed459b6..e59e4ce7a386 100644
--- a/playbooks/roles/monitoring/tasks/monitor_collect.yml
+++ b/playbooks/roles/monitoring/tasks/monitor_collect.yml
@@ -121,6 +121,10 @@
         {{ topdir_path }}/workflows/sysbench/results/monitoring
       {%- elif kdevops_workflow_enable_ai|default(false)|bool -%}
         {{ topdir_path }}/workflows/ai/results/monitoring
+      {%- elif kdevops_workflow_enable_minio|default(false)|bool -%}
+        {{ topdir_path }}/workflows/minio/results/monitoring
+      {%- elif kdevops_workflow_enable_build_linux|default(false)|bool -%}
+        {{ topdir_path }}/workflows/build-linux/results/monitoring
       {%- else -%}
         {{ topdir_path }}/results/monitoring
       {%- endif -%}
diff --git a/workflows/minio/Makefile b/workflows/minio/Makefile
index c543ed3b26ea..ef2ecb973429 100644
--- a/workflows/minio/Makefile
+++ b/workflows/minio/Makefile
@@ -4,6 +4,7 @@ MINIO_DATA_TARGET_UNINSTALL		:= minio-uninstall
 MINIO_DATA_TARGET_DESTROY		:= minio-destroy
 MINIO_DATA_TARGET_RUN			:= minio-warp
 MINIO_DATA_TARGET_RESULTS		:= minio-results
+MINIO_DATA_TARGET_MONITOR		:= monitor-results
 
 MINIO_PLAYBOOK		:= playbooks/minio.yml
 
@@ -49,6 +50,11 @@ $(MINIO_DATA_TARGET_RESULTS):
 		echo "No results directory found. Run 'make minio-warp' first."; \
 	fi
 
+$(MINIO_DATA_TARGET_MONITOR): $(KDEVOPS_EXTRA_VARS)
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		playbooks/monitor-results.yml \
+		--extra-vars=@./extra_vars.yaml
+
 minio-help:
 	@echo "MinIO Warp S3 benchmarking targets:"
 	@echo ""
@@ -58,6 +64,7 @@ minio-help:
 	@echo "minio-destroy           - Remove MinIO containers and clean up data"
 	@echo "minio-warp              - Run MinIO Warp benchmarks"
 	@echo "minio-results           - Collect and analyze benchmark results"
+	@echo "monitor-results         - Collect monitoring data"
 	@echo ""
 	@echo "Example usage:"
 	@echo "  make defconfig-minio-warp    # Configure for Warp benchmarking"
@@ -73,4 +80,4 @@ minio-help:
 
 .PHONY: $(MINIO_DATA_TARGET) $(MINIO_DATA_TARGET_INSTALL) $(MINIO_DATA_TARGET_UNINSTALL)
 .PHONY: $(MINIO_DATA_TARGET_DESTROY) $(MINIO_DATA_TARGET_RUN) $(MINIO_DATA_TARGET_RESULTS)
-.PHONY: minio-help
+.PHONY: $(MINIO_DATA_TARGET_MONITOR) minio-help
-- 
2.45.2


* Re: [PATCH 0/5] add memory fragmentation automation testing
  2025-09-04  9:13 [PATCH 0/5] add memory fragmentation automation testing Luis Chamberlain
                   ` (4 preceding siblings ...)
  2025-09-04  9:13 ` [PATCH 5/5] minio: " Luis Chamberlain
@ 2025-09-19  3:49 ` Luis Chamberlain
  5 siblings, 0 replies; 7+ messages in thread
From: Luis Chamberlain @ 2025-09-19  3:49 UTC (permalink / raw)
  To: Chuck Lever, Daniel Gomez, kdevops

On Thu, Sep 04, 2025 at 02:13:16AM -0700, Luis Chamberlain wrote:
> This extends monitoring support on kdevops to leverage tracepoint
> analysis for automatic memory fragmentation analysis.
> 
> Luis Chamberlain (5):
>   monitoring: add memory fragmentation eBPF monitoring support
>   mmtests: add monitoring framework integration
>   sysbench: add monitoring framework integration
>   ai milvus: add monitoring support
>   minio: add monitoring support

Rebased, merged and pushed.

  Luis
