Linux virtualization list
* [RFC PATCH 0/2] virtio-balloon: extended stats and push mode
@ 2026-05-13 16:50 Gregory Price
  2026-05-13 16:50 ` [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data Gregory Price
  2026-05-13 16:50 ` [RFC PATCH 2/2] virtio-balloon: add stats push mode Gregory Price
  0 siblings, 2 replies; 10+ messages in thread
From: Gregory Price @ 2026-05-13 16:50 UTC
  To: virtualization
  Cc: linux-kernel, kernel-team, mst, david, jasowang, xuanzhuo,
	eperezma, hannes, surenb, peterz, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, kprateek.nayak

This series extends the virtio-balloon stats virtqueue with new tags
for memory composition and pressure data, and adds an optional push
mode for guest-initiated stat reporting.

Patch 1 adds 11 new stat tags to the existing stats VQ.

  These use the same tag/value format and require no feature negotiation.
  Old hosts ignore unknown tags.

  The new tags provide data needed for better balloon sizing:

  a) dirty/writeback pages (more accurate reclaimable estimates)
  b) workingset refault counters (detect balloon overshoot before PSI spikes)
  c) PSI pressure stats

  Also exports psi_system for module builds.

Patch 2 adds VIRTIO_BALLOON_F_STATS_PUSH.

  Stats push is a new feature that lets the host configure the guest
  to push stats on a timer. The host sets stats_push_interval_ms in
  the balloon config space; when non-zero, the guest pushes stats at
  that cadence without waiting for the host to return the buffer.

  This serves two purposes:
    a) latency-sensitive consumers get stats at a guaranteed cadence
    b) absence of expected stats provides guest liveness detection

  STATS_PUSH requires STATS_VQ and suppresses the pull callback when
  active to avoid racing on buffer submission.

Gregory Price (2):
  virtio-balloon: extend stats with memory composition and pressure data
  virtio-balloon: add stats push mode

 drivers/virtio/virtio_balloon.c     | 104 ++++++++++++++++++++++++++++
 include/uapi/linux/virtio_balloon.h |  33 ++++++++-
 kernel/sched/psi.c                  |   1 +
 3 files changed, 136 insertions(+), 2 deletions(-)

-- 
2.54.0



* [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data
  2026-05-13 16:50 [RFC PATCH 0/2] virtio-balloon: extended stats and push mode Gregory Price
@ 2026-05-13 16:50 ` Gregory Price
  2026-06-16 12:30   ` David Hildenbrand (Arm)
  2026-05-13 16:50 ` [RFC PATCH 2/2] virtio-balloon: add stats push mode Gregory Price
  1 sibling, 1 reply; 10+ messages in thread
From: Gregory Price @ 2026-05-13 16:50 UTC
  To: virtualization
  Cc: linux-kernel, kernel-team, mst, david, jasowang, xuanzhuo,
	eperezma, hannes, surenb, peterz, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, kprateek.nayak

When doing proactive ballooning to reduce the size of a guest, some
additional vmstat information is useful in deciding how much pressure
to exert on the VM and when the VM starts experiencing undesirable
behavior (memory and io pressure).

Add 11 new statistics tags to the existing balloon stats virtqueue
for improved balloon sizing decisions. Old hosts ignore unknown tags,
so no feature negotiation is required.
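
To illustrate why unknown tags are harmless, here is a rough host-side
sketch. Each stats entry is a 16-bit tag followed by a 64-bit value (the
packed struct virtio_balloon_stat layout), so a consumer can walk the
buffer and skip tags it does not recognise. The sketch assumes a
little-endian (VIRTIO 1.x) device; parse_balloon_stats(), HOST_KNOWN_TAGS
and the out/seen arrays are hypothetical names used only for illustration
and are not part of this series or of any existing VMM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STAT_ENTRY_SIZE 10u	/* 2-byte tag + 8-byte value, packed */
#define HOST_KNOWN_TAGS 16u	/* hypothetical "old" host: knows tags 0..15 only */

/* Read little-endian fields byte by byte: no alignment or host-endian assumptions. */
static uint16_t read_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint64_t read_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Walk a stats buffer of len bytes. Values for known tags are stored in
 * out[tag] and flagged in seen[tag]. Returns the number of entries that
 * were skipped because the tag was unknown to this host.
 */
size_t parse_balloon_stats(const uint8_t *buf, size_t len,
			   uint64_t out[HOST_KNOWN_TAGS],
			   uint8_t seen[HOST_KNOWN_TAGS])
{
	size_t skipped = 0;

	assert(buf != NULL || len == 0);

	for (size_t off = 0; off + STAT_ENTRY_SIZE <= len; off += STAT_ENTRY_SIZE) {
		uint16_t tag = read_le16(buf + off);

		if (tag >= HOST_KNOWN_TAGS) {
			skipped++;	/* newer tag: ignore it, keep parsing */
			continue;
		}
		out[tag] = read_le64(buf + off + 2);
		seen[tag] = 1;
	}
	return skipped;
}
```

A host built this way keeps working when the guest starts sending tags
16..26; it simply reports them as skipped.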

New memory composition stats (bytes):
  S_DIRTY:         dirty pages awaiting writeback
  S_WRITEBACK:     pages under active writeback
  S_ANON:          anonymous pages (for balloon ceiling calculation)
  S_INACTIVE_FILE: inactive file LRU (safely reclaimable subset of cache)
  S_SLAB_RECLAIM:  reclaimable slab memory

New workingset stats (counts):
  S_WS_REFAULT_A:  anon workingset refaults
  S_WS_REFAULT_F:  file workingset refaults

New PSI stats (microseconds, cumulative):
  S_PSI_MEM_SOME:  memory pressure (some stalled)
  S_PSI_MEM_FULL:  memory pressure (all stalled)
  S_PSI_IO_SOME:   IO pressure (some stalled)
  S_PSI_IO_FULL:   IO pressure (all stalled)
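
Because the PSI values are cumulative totals, a consumer has to
difference two samples to get pressure over a window. A minimal sketch
of that arithmetic follows; psi_stall_bp() is a hypothetical helper, and
treating a decreasing counter as a reset is an assumption of this sketch
rather than behaviour defined by the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Share of the sampling window spent stalled, in basis points
 * (0..10000), computed from two cumulative totals in microseconds.
 */
uint32_t psi_stall_bp(uint64_t prev_total_us, uint64_t cur_total_us,
		      uint64_t window_us)
{
	uint64_t delta;

	assert(window_us > 0);

	/* Assumption: a counter that went backwards means the guest rebooted. */
	if (cur_total_us < prev_total_us)
		return 0;

	delta = cur_total_us - prev_total_us;
	if (delta > window_us)		/* clamp sampling jitter */
		delta = window_us;

	/* delta <= window_us, so this cannot overflow for realistic windows. */
	return (uint32_t)(delta * 10000u / window_us);
}
```

For example, totals of 1000000 us and 1250000 us sampled one second
apart give 2500 basis points, i.e. 25% of the window stalled.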

Export psi_system for module builds (CONFIG_VIRTIO_BALLOON=m with
CONFIG_PSI=y).

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Gregory Price <gourry@gourry.net>
---
 drivers/virtio/virtio_balloon.c     | 33 +++++++++++++++++++++++++++++
 include/uapi/linux/virtio_balloon.h | 26 +++++++++++++++++++++--
 kernel/sched/psi.c                  |  1 +
 3 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f6c2dff33f8a..8fa33aec4ce7 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -18,6 +18,8 @@
 #include <linux/wait.h>
 #include <linux/mm.h>
 #include <linux/page_reporting.h>
+#include <linux/vmstat.h>
+#include <linux/psi.h>
 
 /*
  * Balloon device works in 4K page units.  So each page is pointed to by
@@ -414,6 +416,37 @@ static unsigned int update_balloon_stats(struct virtio_balloon *vb)
 	update_stat(vb, idx++, VIRTIO_BALLOON_S_CACHES,
 				pages_to_bytes(caches));
 
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_DIRTY,
+		    pages_to_bytes(global_node_page_state(NR_FILE_DIRTY)));
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_WRITEBACK,
+		    pages_to_bytes(global_node_page_state(NR_WRITEBACK)));
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_ANON,
+		    pages_to_bytes(global_node_page_state(NR_ANON_MAPPED)));
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_INACTIVE_FILE,
+		    pages_to_bytes(global_node_page_state(NR_INACTIVE_FILE)));
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_SLAB_RECLAIM,
+		    pages_to_bytes(
+			global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B)));
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_WS_REFAULT_A,
+		    global_node_page_state(WORKINGSET_REFAULT_ANON));
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_WS_REFAULT_F,
+		    global_node_page_state(WORKINGSET_REFAULT_FILE));
+
+#ifdef CONFIG_PSI
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_MEM_SOME,
+		    div_u64(psi_system.total[PSI_AVGS][PSI_MEM_SOME],
+			    NSEC_PER_USEC));
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_MEM_FULL,
+		    div_u64(psi_system.total[PSI_AVGS][PSI_MEM_FULL],
+			    NSEC_PER_USEC));
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_IO_SOME,
+		    div_u64(psi_system.total[PSI_AVGS][PSI_IO_SOME],
+			    NSEC_PER_USEC));
+	update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_IO_FULL,
+		    div_u64(psi_system.total[PSI_AVGS][PSI_IO_FULL],
+			    NSEC_PER_USEC));
+#endif
+
 	return idx;
 }
 
diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index ee35a372805d..37ec8a8466c4 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -77,7 +77,18 @@ struct virtio_balloon_config {
 #define VIRTIO_BALLOON_S_DIRECT_SCAN   13 /* Amount of memory scanned directly */
 #define VIRTIO_BALLOON_S_ASYNC_RECLAIM 14 /* Amount of memory reclaimed asynchronously */
 #define VIRTIO_BALLOON_S_DIRECT_RECLAIM 15 /* Amount of memory reclaimed directly */
-#define VIRTIO_BALLOON_S_NR       16
+#define VIRTIO_BALLOON_S_DIRTY	       16 /* Dirty pages (bytes) */
+#define VIRTIO_BALLOON_S_WRITEBACK     17 /* Pages under writeback (bytes) */
+#define VIRTIO_BALLOON_S_ANON	       18 /* Anonymous pages (bytes) */
+#define VIRTIO_BALLOON_S_INACTIVE_FILE 19 /* Inactive file LRU pages (bytes) */
+#define VIRTIO_BALLOON_S_SLAB_RECLAIM  20 /* Reclaimable slab (bytes) */
+#define VIRTIO_BALLOON_S_WS_REFAULT_A  21 /* Workingset refaults anon (count) */
+#define VIRTIO_BALLOON_S_WS_REFAULT_F  22 /* Workingset refaults file (count) */
+#define VIRTIO_BALLOON_S_PSI_MEM_SOME  23 /* PSI memory some total (us) */
+#define VIRTIO_BALLOON_S_PSI_MEM_FULL  24 /* PSI memory full total (us) */
+#define VIRTIO_BALLOON_S_PSI_IO_SOME   25 /* PSI IO some total (us) */
+#define VIRTIO_BALLOON_S_PSI_IO_FULL   26 /* PSI IO full total (us) */
+#define VIRTIO_BALLOON_S_NR	       27
 
 #define VIRTIO_BALLOON_S_NAMES_WITH_PREFIX(VIRTIO_BALLOON_S_NAMES_prefix) { \
 	VIRTIO_BALLOON_S_NAMES_prefix "swap-in", \
@@ -95,7 +106,18 @@ struct virtio_balloon_config {
 	VIRTIO_BALLOON_S_NAMES_prefix "async-scans", \
 	VIRTIO_BALLOON_S_NAMES_prefix "direct-scans", \
 	VIRTIO_BALLOON_S_NAMES_prefix "async-reclaims", \
-	VIRTIO_BALLOON_S_NAMES_prefix "direct-reclaims" \
+	VIRTIO_BALLOON_S_NAMES_prefix "direct-reclaims", \
+	VIRTIO_BALLOON_S_NAMES_prefix "dirty", \
+	VIRTIO_BALLOON_S_NAMES_prefix "writeback", \
+	VIRTIO_BALLOON_S_NAMES_prefix "anon-pages", \
+	VIRTIO_BALLOON_S_NAMES_prefix "inactive-file", \
+	VIRTIO_BALLOON_S_NAMES_prefix "slab-reclaimable", \
+	VIRTIO_BALLOON_S_NAMES_prefix "ws-refault-anon", \
+	VIRTIO_BALLOON_S_NAMES_prefix "ws-refault-file", \
+	VIRTIO_BALLOON_S_NAMES_prefix "psi-mem-some-us", \
+	VIRTIO_BALLOON_S_NAMES_prefix "psi-mem-full-us", \
+	VIRTIO_BALLOON_S_NAMES_prefix "psi-io-some-us", \
+	VIRTIO_BALLOON_S_NAMES_prefix "psi-io-full-us" \
 }
 
 #define VIRTIO_BALLOON_S_NAMES VIRTIO_BALLOON_S_NAMES_WITH_PREFIX("")
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index d9c9d9480a45..8ab3aa1c4ef5 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -175,6 +175,7 @@ static DEFINE_PER_CPU(struct psi_group_cpu, system_group_pcpu);
 struct psi_group psi_system = {
 	.pcpu = &system_group_pcpu,
 };
+EXPORT_SYMBOL_GPL(psi_system);
 
 static DEFINE_PER_CPU(seqcount_t, psi_seq) = SEQCNT_ZERO(psi_seq);
 
-- 
2.54.0



* [RFC PATCH 2/2] virtio-balloon: add stats push mode
  2026-05-13 16:50 [RFC PATCH 0/2] virtio-balloon: extended stats and push mode Gregory Price
  2026-05-13 16:50 ` [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data Gregory Price
@ 2026-05-13 16:50 ` Gregory Price
  2026-06-16 12:33   ` David Hildenbrand (Arm)
  1 sibling, 1 reply; 10+ messages in thread
From: Gregory Price @ 2026-05-13 16:50 UTC
  To: virtualization
  Cc: linux-kernel, kernel-team, mst, david, jasowang, xuanzhuo,
	eperezma, hannes, surenb, peterz, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, kprateek.nayak

When doing aggressive overcommit of VMs on a single host, a pull
model of stat retrieval is problematic if a guest becomes
unresponsive in some form.  In particular, it's difficult to tell a
hung guest from a slow guest, and to tell why the guest is in that
state.

Add the VIRTIO_BALLOON_F_STATS_PUSH feature, which allows the host to
configure the guest to push stats on a timer instead of the default
pull model.

The host sets stats_push_interval_ms in the balloon config space:
  0     = disabled (pull-only, default)
  N > 0 = guest pushes stats every N milliseconds

The push mode reuses the existing stats VQ, same buffer format,
same tags. The host can change the interval at runtime by updating
the config field.

Push mode provides two advantages over pull:
  1. Guest liveness detection: in pull mode, the host cannot
     distinguish a slow guest from a hung guest without implementing
     its own timeout tracking. In push mode, the absence of expected
     stats buffers is an implicit liveness signal; if the guest
     fails to push within the expected interval, the host can
     conclude it is unresponsive.
  2. Latency-sensitive consumers (e.g., memory pressure response
     loops) receive fresh stats at a guaranteed cadence without
     the host needing to poll.
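
As a rough host-side illustration of the liveness signal in point 1, a
consumer could classify a guest from the age of its last pushed buffer.
This is only a sketch: classify_guest(), the enum, and the missed_limit
threshold are hypothetical, and the patch itself does not define how
many missed intervals count as "unresponsive".

```c
#include <assert.h>
#include <stdint.h>

enum guest_liveness {
	GUEST_PUSH_DISABLED,	/* interval is 0: pull-only mode */
	GUEST_ALIVE,		/* last push is within one interval */
	GUEST_LATE,		/* overdue, but under the missed limit */
	GUEST_UNRESPONSIVE,	/* missed_limit or more intervals elapsed */
};

enum guest_liveness classify_guest(uint64_t now_ms, uint64_t last_push_ms,
				   uint32_t interval_ms, uint32_t missed_limit)
{
	uint64_t age_ms;

	assert(missed_limit > 0);

	if (interval_ms == 0)
		return GUEST_PUSH_DISABLED;

	/* Guard against clock skew: never compute a negative age. */
	age_ms = now_ms >= last_push_ms ? now_ms - last_push_ms : 0;

	if (age_ms <= interval_ms)
		return GUEST_ALIVE;
	if (age_ms / interval_ms < missed_limit)
		return GUEST_LATE;
	return GUEST_UNRESPONSIVE;
}
```

With a 1000 ms interval and a limit of 3, a buffer that is 1500 ms old
is "late" and one that is 3000 ms old is "unresponsive".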

STATS_PUSH requires STATS_VQ; the driver clears STATS_PUSH during
feature validation if STATS_VQ is absent. When push mode is active,
the pull callback is suppressed to avoid racing on buffer submission.

The pull model remains available and is the default.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Gregory Price <gourry@gourry.net>
---
 drivers/virtio/virtio_balloon.c     | 71 +++++++++++++++++++++++++++++
 include/uapi/linux/virtio_balloon.h |  7 +++
 2 files changed, 78 insertions(+)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 8fa33aec4ce7..47bde1d2b388 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -112,6 +112,10 @@ struct virtio_balloon {
 	/* Memory statistics */
 	struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
 
+	/* Stats push mode */
+	struct delayed_work stats_push_work;
+	uint32_t stats_push_interval_ms;
+
 	/* Shrinker to return free pages - VIRTIO_BALLOON_F_FREE_PAGE_HINT */
 	struct shrinker *shrinker;
 
@@ -463,6 +467,13 @@ static void stats_request(struct virtqueue *vq)
 {
 	struct virtio_balloon *vb = vq->vdev->priv;
 
+	/*
+	 * In push mode, the push timer owns the VQ. Ignore pull
+	 * requests to avoid racing on buffer submission.
+	 */
+	if (vb->stats_push_interval_ms)
+		return;
+
 	spin_lock(&vb->stop_update_lock);
 	if (!vb->stop_update) {
 		start_wakeup_event(vb, VIRTIO_BALLOON_WAKEUP_SIGNAL_STATS);
@@ -558,6 +569,20 @@ static void virtballoon_changed(struct virtio_device *vdev)
 		virtio_balloon_queue_free_page_work(vb);
 	}
 	spin_unlock_irqrestore(&vb->stop_update_lock, flags);
+
+	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_STATS_PUSH)) {
+		uint32_t interval;
+
+		virtio_cread_le(vdev, struct virtio_balloon_config,
+				stats_push_interval_ms, &interval);
+		if (interval != vb->stats_push_interval_ms) {
+			vb->stats_push_interval_ms = interval;
+			cancel_delayed_work(&vb->stats_push_work);
+			if (interval)
+				schedule_delayed_work(&vb->stats_push_work,
+					msecs_to_jiffies(interval));
+		}
+	}
 }
 
 static void update_balloon_size(struct virtio_balloon *vb)
@@ -581,6 +606,32 @@ static void update_balloon_stats_func(struct work_struct *work)
 	finish_wakeup_event(vb);
 }
 
+static void stats_push_func(struct work_struct *work)
+{
+	struct virtio_balloon *vb = container_of(work, struct virtio_balloon,
+						 stats_push_work.work);
+	struct virtqueue *vq;
+	struct scatterlist sg;
+	unsigned int num_stats, len;
+
+	if (!vb->stats_push_interval_ms)
+		return;
+
+	vq = vb->stats_vq;
+
+	/* Reclaim previous buffer */
+	while (virtqueue_get_buf(vq, &len))
+		;
+
+	num_stats = update_balloon_stats(vb);
+	sg_init_one(&sg, vb->stats, sizeof(vb->stats[0]) * num_stats);
+	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
+	virtqueue_kick(vq);
+
+	schedule_delayed_work(&vb->stats_push_work,
+			      msecs_to_jiffies(vb->stats_push_interval_ms));
+}
+
 static void update_balloon_size_func(struct work_struct *work)
 {
 	struct virtio_balloon *vb;
@@ -967,6 +1018,7 @@ static int virtballoon_probe(struct virtio_device *vdev)
 	}
 
 	INIT_WORK(&vb->update_balloon_stats_work, update_balloon_stats_func);
+	INIT_DELAYED_WORK(&vb->stats_push_work, stats_push_func);
 	INIT_WORK(&vb->update_balloon_size_work, update_balloon_size_func);
 	spin_lock_init(&vb->stop_update_lock);
 	mutex_init(&vb->balloon_lock);
@@ -1094,6 +1146,19 @@ static int virtballoon_probe(struct virtio_device *vdev)
 
 	if (towards_target(vb))
 		virtballoon_changed(vdev);
+
+	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_STATS_PUSH)) {
+		uint32_t interval;
+
+		virtio_cread_le(vdev, struct virtio_balloon_config,
+				stats_push_interval_ms, &interval);
+		if (interval) {
+			vb->stats_push_interval_ms = interval;
+			schedule_delayed_work(&vb->stats_push_work,
+					      msecs_to_jiffies(interval));
+		}
+	}
+
 	return 0;
 
 out_unregister_oom:
@@ -1145,6 +1210,7 @@ static void virtballoon_remove(struct virtio_device *vdev)
 	spin_unlock_irq(&vb->stop_update_lock);
 	cancel_work_sync(&vb->update_balloon_size_work);
 	cancel_work_sync(&vb->update_balloon_stats_work);
+	cancel_delayed_work_sync(&vb->stats_push_work);
 
 	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
 		cancel_work_sync(&vb->report_free_page_work);
@@ -1199,6 +1265,10 @@ static int virtballoon_validate(struct virtio_device *vdev)
 	else if (!virtio_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON))
 		__virtio_clear_bit(vdev, VIRTIO_BALLOON_F_REPORTING);
 
+	if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_STATS_PUSH) &&
+	    !virtio_has_feature(vdev, VIRTIO_BALLOON_F_STATS_VQ))
+		__virtio_clear_bit(vdev, VIRTIO_BALLOON_F_STATS_PUSH);
+
 	__virtio_clear_bit(vdev, VIRTIO_F_ACCESS_PLATFORM);
 	return 0;
 }
@@ -1210,6 +1280,7 @@ static unsigned int features[] = {
 	VIRTIO_BALLOON_F_FREE_PAGE_HINT,
 	VIRTIO_BALLOON_F_PAGE_POISON,
 	VIRTIO_BALLOON_F_REPORTING,
+	VIRTIO_BALLOON_F_STATS_PUSH,
 };
 
 static struct virtio_driver virtio_balloon_driver = {
diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index 37ec8a8466c4..90e9b5247e5e 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -37,6 +37,7 @@
 #define VIRTIO_BALLOON_F_FREE_PAGE_HINT	3 /* VQ to report free pages */
 #define VIRTIO_BALLOON_F_PAGE_POISON	4 /* Guest is using page poisoning */
 #define VIRTIO_BALLOON_F_REPORTING	5 /* Page reporting virtqueue */
+#define VIRTIO_BALLOON_F_STATS_PUSH	6 /* Guest pushes stats on a timer */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
@@ -59,6 +60,12 @@ struct virtio_balloon_config {
 	};
 	/* Stores PAGE_POISON if page poisoning is in use */
 	__le32 poison_val;
+	/*
+	 * Stats push interval in milliseconds. 0 = disabled (pull only).
+	 * Valid with VIRTIO_BALLOON_F_STATS_PUSH. Host-writable, can change
+	 * at runtime via config updates.
+	 */
+	__le32 stats_push_interval_ms;
 };
 
 #define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
-- 
2.54.0



* Re: [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data
  2026-05-13 16:50 ` [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data Gregory Price
@ 2026-06-16 12:30   ` David Hildenbrand (Arm)
  2026-06-16 13:49     ` Gregory Price
  0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-16 12:30 UTC
  To: Gregory Price, virtualization
  Cc: linux-kernel, kernel-team, mst, jasowang, xuanzhuo, eperezma,
	hannes, surenb, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
	kprateek.nayak

On 5/13/26 18:50, Gregory Price wrote:
> When doing proactive ballooning to reduce the size of a guest, some
> additional vmstat information is useful in deciding how much pressure
> to exert on the VM and when the VM starts experiencing undesirable
> behavior (memory and io pressure).
> 
> Add 11 new statistics tags to the existing balloon stats virtqueue
> for improved balloon sizing decisions. Old hosts ignore unknown tags,
> so no feature negotiation is required.
> 
> New memory composition stats (bytes):
>   S_DIRTY:         dirty pages awaiting writeback
>   S_WRITEBACK:     pages under active writeback
>   S_ANON:          anonymous pages (for balloon ceiling calculation)
>   S_INACTIVE_FILE: inactive file LRU (safely reclaimable subset of cache)
>   S_SLAB_RECLAIM:  reclaimable slab memory
> 
> New workingset stats (counts):
>   S_WS_REFAULT_A:  anon workingset refaults
>   S_WS_REFAULT_F:  file workingset refaults
> 
> New PSI stats (microseconds, cumulative):
>   S_PSI_MEM_SOME:  memory pressure (some stalled)
>   S_PSI_MEM_FULL:  memory pressure (all stalled)
>   S_PSI_IO_SOME:   IO pressure (some stalled)
>   S_PSI_IO_FULL:   IO pressure (all stalled)
> 
> Export psi_system for module builds (CONFIG_VIRTIO_BALLOON=m with
> CONFIG_PSI=y).
> 
> Assisted-by: Claude:claude-opus-4-6
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
>  drivers/virtio/virtio_balloon.c     | 33 +++++++++++++++++++++++++++++
>  include/uapi/linux/virtio_balloon.h | 26 +++++++++++++++++++++--
>  kernel/sched/psi.c                  |  1 +
>  3 files changed, 58 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index f6c2dff33f8a..8fa33aec4ce7 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -18,6 +18,8 @@
>  #include <linux/wait.h>
>  #include <linux/mm.h>
>  #include <linux/page_reporting.h>
> +#include <linux/vmstat.h>
> +#include <linux/psi.h>
>  
>  /*
>   * Balloon device works in 4K page units.  So each page is pointed to by
> @@ -414,6 +416,37 @@ static unsigned int update_balloon_stats(struct virtio_balloon *vb)
>  	update_stat(vb, idx++, VIRTIO_BALLOON_S_CACHES,
>  				pages_to_bytes(caches));
>  
> +	update_stat(vb, idx++, VIRTIO_BALLOON_S_DIRTY,
> +		    pages_to_bytes(global_node_page_state(NR_FILE_DIRTY)));
> +	update_stat(vb, idx++, VIRTIO_BALLOON_S_WRITEBACK,
> +		    pages_to_bytes(global_node_page_state(NR_WRITEBACK)));
> +	update_stat(vb, idx++, VIRTIO_BALLOON_S_ANON,
> +		    pages_to_bytes(global_node_page_state(NR_ANON_MAPPED)));
> +	update_stat(vb, idx++, VIRTIO_BALLOON_S_INACTIVE_FILE,
> +		    pages_to_bytes(global_node_page_state(NR_INACTIVE_FILE)));
> +	update_stat(vb, idx++, VIRTIO_BALLOON_S_SLAB_RECLAIM,
> +		    pages_to_bytes(
> +			global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B)));
> +	update_stat(vb, idx++, VIRTIO_BALLOON_S_WS_REFAULT_A,
> +		    global_node_page_state(WORKINGSET_REFAULT_ANON));
> +	update_stat(vb, idx++, VIRTIO_BALLOON_S_WS_REFAULT_F,
> +		    global_node_page_state(WORKINGSET_REFAULT_FILE));
> +
> +#ifdef CONFIG_PSI
> +	update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_MEM_SOME,
> +		    div_u64(psi_system.total[PSI_AVGS][PSI_MEM_SOME],
> +			    NSEC_PER_USEC));
> +	update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_MEM_FULL,
> +		    div_u64(psi_system.total[PSI_AVGS][PSI_MEM_FULL],
> +			    NSEC_PER_USEC));
> +	update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_IO_SOME,
> +		    div_u64(psi_system.total[PSI_AVGS][PSI_IO_SOME],
> +			    NSEC_PER_USEC));
> +	update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_IO_FULL,
> +		    div_u64(psi_system.total[PSI_AVGS][PSI_IO_FULL],
> +			    NSEC_PER_USEC));
> +#endif
> +
>  	return idx;
>  }
>  
> diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
> index ee35a372805d..37ec8a8466c4 100644
> --- a/include/uapi/linux/virtio_balloon.h
> +++ b/include/uapi/linux/virtio_balloon.h
> @@ -77,7 +77,18 @@ struct virtio_balloon_config {
>  #define VIRTIO_BALLOON_S_DIRECT_SCAN   13 /* Amount of memory scanned directly */
>  #define VIRTIO_BALLOON_S_ASYNC_RECLAIM 14 /* Amount of memory reclaimed asynchronously */
>  #define VIRTIO_BALLOON_S_DIRECT_RECLAIM 15 /* Amount of memory reclaimed directly */
> -#define VIRTIO_BALLOON_S_NR       16
> +#define VIRTIO_BALLOON_S_DIRTY	       16 /* Dirty pages (bytes) */
> +#define VIRTIO_BALLOON_S_WRITEBACK     17 /* Pages under writeback (bytes) */
> +#define VIRTIO_BALLOON_S_ANON	       18 /* Anonymous pages (bytes) */
> +#define VIRTIO_BALLOON_S_INACTIVE_FILE 19 /* Inactive file LRU pages (bytes) */
> +#define VIRTIO_BALLOON_S_SLAB_RECLAIM  20 /* Reclaimable slab (bytes) */
> +#define VIRTIO_BALLOON_S_WS_REFAULT_A  21 /* Workingset refaults anon (count) */
> +#define VIRTIO_BALLOON_S_WS_REFAULT_F  22 /* Workingset refaults file (count) */
> +#define VIRTIO_BALLOON_S_PSI_MEM_SOME  23 /* PSI memory some total (us) */
> +#define VIRTIO_BALLOON_S_PSI_MEM_FULL  24 /* PSI memory full total (us) */
> +#define VIRTIO_BALLOON_S_PSI_IO_SOME   25 /* PSI IO some total (us) */
> +#define VIRTIO_BALLOON_S_PSI_IO_FULL   26 /* PSI IO full total (us) */
> +#define VIRTIO_BALLOON_S_NR	       27
>  
>  #define VIRTIO_BALLOON_S_NAMES_WITH_PREFIX(VIRTIO_BALLOON_S_NAMES_prefix) { \
>  	VIRTIO_BALLOON_S_NAMES_prefix "swap-in", \
> @@ -95,7 +106,18 @@ struct virtio_balloon_config {
>  	VIRTIO_BALLOON_S_NAMES_prefix "async-scans", \
>  	VIRTIO_BALLOON_S_NAMES_prefix "direct-scans", \
>  	VIRTIO_BALLOON_S_NAMES_prefix "async-reclaims", \
> -	VIRTIO_BALLOON_S_NAMES_prefix "direct-reclaims" \
> +	VIRTIO_BALLOON_S_NAMES_prefix "direct-reclaims", \
> +	VIRTIO_BALLOON_S_NAMES_prefix "dirty", \
> +	VIRTIO_BALLOON_S_NAMES_prefix "writeback", \
> +	VIRTIO_BALLOON_S_NAMES_prefix "anon-pages", \
> +	VIRTIO_BALLOON_S_NAMES_prefix "inactive-file", \
> +	VIRTIO_BALLOON_S_NAMES_prefix "slab-reclaimable", \
> +	VIRTIO_BALLOON_S_NAMES_prefix "ws-refault-anon", \
> +	VIRTIO_BALLOON_S_NAMES_prefix "ws-refault-file", \
> +	VIRTIO_BALLOON_S_NAMES_prefix "psi-mem-some-us", \
> +	VIRTIO_BALLOON_S_NAMES_prefix "psi-mem-full-us", \
> +	VIRTIO_BALLOON_S_NAMES_prefix "psi-io-some-us", \
> +	VIRTIO_BALLOON_S_NAMES_prefix "psi-io-full-us" \
>  }
>  
>  #define VIRTIO_BALLOON_S_NAMES VIRTIO_BALLOON_S_NAMES_WITH_PREFIX("")
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index d9c9d9480a45..8ab3aa1c4ef5 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -175,6 +175,7 @@ static DEFINE_PER_CPU(struct psi_group_cpu, system_group_pcpu);
>  struct psi_group psi_system = {
>  	.pcpu = &system_group_pcpu,
>  };
> +EXPORT_SYMBOL_GPL(psi_system);

Nothing too crazy here. However, the question is for which of these values
we actually want to guarantee unchanged semantics in the future ... I guess
anything we already expose to user space is alright (because it effectively
must already remain mostly unchanged, I assume).

-- 
Cheers,

David


* Re: [RFC PATCH 2/2] virtio-balloon: add stats push mode
  2026-05-13 16:50 ` [RFC PATCH 2/2] virtio-balloon: add stats push mode Gregory Price
@ 2026-06-16 12:33   ` David Hildenbrand (Arm)
  2026-06-16 13:57     ` Gregory Price
  0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-16 12:33 UTC
  To: Gregory Price, virtualization
  Cc: linux-kernel, kernel-team, mst, jasowang, xuanzhuo, eperezma,
	hannes, surenb, peterz, mingo, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
	kprateek.nayak

On 5/13/26 18:50, Gregory Price wrote:
> When doing aggressive overcommit of VMs on a single host, a pull
> model of stat retrieval is problematic if a guest becomes
> unresponsive in some form.  In particular, it's difficult to tell a
> hung guest from a slow guest, and to tell why the guest is in that
> state.
> 
> Add the VIRTIO_BALLOON_F_STATS_PUSH feature, which allows the host to
> configure the guest to push stats on a timer instead of the default
> pull model.
> 
> The host sets stats_push_interval_ms in the balloon config space:
>   0     = disabled (pull-only, default)
>   N > 0 = guest pushes stats every N milliseconds
> 
> The push mode reuses the existing stats VQ, same buffer format,
> same tags. The host can change the interval at runtime by updating
> the config field.
> 
> Push mode provides two advantages over pull:
>   1. Guest liveness detection: in pull mode, the host cannot
>      distinguish a slow guest from a hung guest without implementing
>      its own timeout tracking. In push mode, the absence of expected
>      stats buffers is an implicit liveness signal; if the guest
>      fails to push within the expected interval, the host can
>      conclude it is unresponsive.
>   2. Latency-sensitive consumers (e.g., memory pressure response
>      loops) receive fresh stats at a guaranteed cadence without
>      the host needing to poll.
> 
> STATS_PUSH requires STATS_VQ; the driver clears STATS_PUSH during
> feature validation if STATS_VQ is absent. When push mode is active,
> the pull callback is suppressed to avoid racing on buffer submission.
> 
> The pull model remains available and is the default.

I don't quite see the big benefit here, really: either it's a timer in the
hypervisor or a timer in the VM. A slow VM will, in either model, delay the
update of stats.

If you need some "liveness detection", are virtio-balloon stats updates really
the right mechanism?

I don't quite understand the "Latency-sensitive consumers" problem. If the VM is
slow, it is slow and will mess with latency-sensitive consumers either way?

-- 
Cheers,

David


* Re: [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data
  2026-06-16 12:30   ` David Hildenbrand (Arm)
@ 2026-06-16 13:49     ` Gregory Price
  2026-06-16 14:19       ` David Hildenbrand (Arm)
  0 siblings, 1 reply; 10+ messages in thread
From: Gregory Price @ 2026-06-16 13:49 UTC
  To: David Hildenbrand (Arm)
  Cc: virtualization, linux-kernel, kernel-team, mst, jasowang,
	xuanzhuo, eperezma, hannes, surenb, peterz, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, kprateek.nayak

On Tue, Jun 16, 2026 at 02:30:34PM +0200, David Hildenbrand (Arm) wrote:
> 
> Nothing too crazy here. However, the question is for which of these values
> we actually want to guarantee unchanged semantics in the future ... I guess
> anything we already expose to user space is alright (because it effectively
> must already remain mostly unchanged, I assume).
>

I suppose there is a risk of them going away entirely?  That would be
fixable, but it seems unlikely.

~Gregory


* Re: [RFC PATCH 2/2] virtio-balloon: add stats push mode
  2026-06-16 12:33   ` David Hildenbrand (Arm)
@ 2026-06-16 13:57     ` Gregory Price
  2026-06-16 14:32       ` David Hildenbrand (Arm)
  0 siblings, 1 reply; 10+ messages in thread
From: Gregory Price @ 2026-06-16 13:57 UTC
  To: David Hildenbrand (Arm)
  Cc: virtualization, linux-kernel, kernel-team, mst, jasowang,
	xuanzhuo, eperezma, hannes, surenb, peterz, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, kprateek.nayak

On Tue, Jun 16, 2026 at 02:33:43PM +0200, David Hildenbrand (Arm) wrote:
> On 5/13/26 18:50, Gregory Price wrote:
> > 
> > The pull model remains available and is the default.
> 
> I don't quite see the big benefit here, really: either it's a timer in the
> hypervisor or a timer in the VM. A slow VM will, in either model, delay the
> update of stats.
> 
> If you need some "liveness detection", are virtio-balloon stats updates really
> the right mechanism?
> 
> I don't quite understand the "Latency-sensitive consumers" problem. If the VM is
> slow, it is slow and will mess with latency-sensitive consumers either way?
>

"Latency-sensitive" here should probably be defined as "does not like
blocking operations".  This was prototyped in the context of
cloud-hypervisor [1] and an orchestrator trying to poll 1000 VMs on a
single machine for stats.

The poller couldn't tell the difference between "guest is slow" and
"guest is hung", and so had to block on the operation (I didn't see how
to solve this asynchronously).

Similarly, having a single thread just round-robin poll the VMs is
plainly inefficient and provides poor guarantees about the freshness
of the stats (a couple of slow guests can cause other guests' stats to
become stale for tens of seconds).
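
To put rough numbers on the staleness point, here is a small sketch of
the arithmetic for a single blocking round-robin poller. The helper name
and the per-guest response times are illustrative assumptions, not
measurements from the prototype.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case age of any one guest's stats when a single thread polls
 * every guest in turn and blocks on each: the time for one full pass.
 */
uint64_t round_robin_staleness_ms(uint32_t fast_guests, uint32_t fast_ms,
				  uint32_t slow_guests, uint32_t slow_ms)
{
	assert(fast_ms <= slow_ms);	/* "slow" guests respond no faster than "fast" ones */

	return (uint64_t)fast_guests * fast_ms +
	       (uint64_t)slow_guests * slow_ms;
}
```

With 998 guests answering in 1 ms and 2 guests taking 10 s each, one
pass takes 998 + 20000 = 20998 ms, so every guest's stats can be about
21 seconds old.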

Definitely an RFC here because I'm not sure if I was missing something
that might help me solve the problem.

~Gregory

[1] https://github.com/cloud-hypervisor/cloud-hypervisor


* Re: [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data
  2026-06-16 13:49     ` Gregory Price
@ 2026-06-16 14:19       ` David Hildenbrand (Arm)
  0 siblings, 0 replies; 10+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-16 14:19 UTC
  To: Gregory Price
  Cc: virtualization, linux-kernel, kernel-team, mst, jasowang,
	xuanzhuo, eperezma, hannes, surenb, peterz, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, kprateek.nayak

On 6/16/26 15:49, Gregory Price wrote:
> On Tue, Jun 16, 2026 at 02:30:34PM +0200, David Hildenbrand (Arm) wrote:
>>
>> Nothing too crazy here. However, the question is for which of these values
>> we actually want to guarantee unchanged semantics in the future ... I guess
>> anything we already expose to user space is alright (because it effectively
>> must already remain mostly unchanged, I assume).
>>
> 
> I suppose there's a risk of them going away entirely?  That would be
> fixable, but it seems unlikely.

Yeah, the first bunch is all exported through /proc/vmstat AFAIKS.

The others are exported through /proc/pressure/.

So I assume this is just fine.

-- 
Cheers,

David


* Re: [RFC PATCH 2/2] virtio-balloon: add stats push mode
  2026-06-16 13:57     ` Gregory Price
@ 2026-06-16 14:32       ` David Hildenbrand (Arm)
  2026-06-16 14:44         ` Gregory Price
  0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-16 14:32 UTC (permalink / raw)
  To: Gregory Price
  Cc: virtualization, linux-kernel, kernel-team, mst, jasowang,
	xuanzhuo, eperezma, hannes, surenb, peterz, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, kprateek.nayak

On 6/16/26 15:57, Gregory Price wrote:
> On Tue, Jun 16, 2026 at 02:33:43PM +0200, David Hildenbrand (Arm) wrote:
>> On 5/13/26 18:50, Gregory Price wrote:
>>>
>>> The pull model remains available and is the default.
>>
>> I don't quite see the big benefit here, really: either it's a timer in the
>> hypervisor or a timer in the VM. A slow VM will, in either model, delay the
>> update of stats.
>>
>> If you need some "liveness detection", is virtio-balloon stats updates really
>> the right mechanism?
>>
>> I don't quite understand the "Latency-sensitive consumers" problem. If the VM is
>> slow, it is slow and will mess with latency-sensitive consumers in either way?
>>
> 
> Latency-sensitive here should probably be defined as "does not like
> blocking operations".  This was prototyped in the context of
> cloud-hypervisor [1] and an orchestrator trying to poll 1000 VMs on a
> single machine for stats.
> 
> The poller couldn't tell the difference between "guest is slow" and
> "guest is hung", so it had to block on the operation (I didn't see how
> to solve this asynchronously).
> 
> Similarly, having a single thread round-robin poll the VMs is plainly
> inefficient and provides poor guarantees about the freshness of the
> stats (a couple of slow guests can cause other guests' stats to become
> stale for tens of seconds).
> 
> Definitely an RFC here because I'm not sure if I was missing something
> that might help me solve the problem.

Well, in QEMU we just run a timer internally that does the polling.

Then, upper layers in the stack can ask QEMU for the latest stats.

There, you just get the stats along with a "last-update" timestamp.
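
A minimal sketch of that pattern, assuming a per-guest cache that an
internal timer path fills and that upper layers read without touching
the guest.  The class, method, and stat names are hypothetical; this is
not QEMU's code or API.

```python
import time
from dataclasses import dataclass, field


@dataclass
class BalloonStatsCache:
    """Latest stats for one guest plus the time they were last refreshed."""
    stats: dict = field(default_factory=dict)
    last_update: float = 0.0  # 0.0 means "never updated"

    def store(self, stats, now=None):
        # Called from the internal timer path once the guest has
        # returned the stats buffer.
        self.stats = dict(stats)
        self.last_update = time.time() if now is None else now

    def query(self):
        # Called by upper layers.  Returns cached data only, so it
        # never waits on the guest.
        return {"stats": dict(self.stats), "last-update": self.last_update}


def is_stale(reply, max_age_s, now=None):
    """True if the cached stats are older than max_age_s seconds."""
    now = time.time() if now is None else now
    return now - reply["last-update"] > max_age_s


cache = BalloonStatsCache()
cache.store({"free-memory": 123}, now=100.0)
reply = cache.query()
print(is_stale(reply, max_age_s=10, now=105.0))  # fresh
print(is_stale(reply, max_age_s=10, now=120.0))  # stale
```

The caller decides what "too old" means from the timestamp, so a slow or
hung guest shows up as an aging "last-update" rather than a blocked
query.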

-- 
Cheers,

David


* Re: [RFC PATCH 2/2] virtio-balloon: add stats push mode
  2026-06-16 14:32       ` David Hildenbrand (Arm)
@ 2026-06-16 14:44         ` Gregory Price
  0 siblings, 0 replies; 10+ messages in thread
From: Gregory Price @ 2026-06-16 14:44 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: virtualization, linux-kernel, kernel-team, mst, jasowang,
	xuanzhuo, eperezma, hannes, surenb, peterz, mingo, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, kprateek.nayak

On Tue, Jun 16, 2026 at 04:32:46PM +0200, David Hildenbrand (Arm) wrote:
> On 6/16/26 15:57, Gregory Price wrote:
> > 
> > Definitely an RFC here because I'm not sure if I was missing something
> > that might help me solve the problem.
> 
> Well, in QEMU we just run a timer internally that does the polling.
> 
> Then, upper layers in the stack can ask QEMU for the latest stats.
> 
> There, you just get the stats along with a "last-update" timestamp.
>

That makes sense, although don't you just push the blocking operation
into yet another thread on the host?

Assuming it's not cancelable, there's a blocked thread there that you
have to reap.  Compare that with the guest being unresponsive and not
sending updates, where you just reap the whole guest or take some other
corrective action.
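
As a rough sketch of the push-side alternative, assuming the host
records the arrival time of each guest's last pushed stats: a guest that
has missed several push intervals is flagged for corrective action, and
no host thread waits on it.  `interval_ms` stands in for the
`stats_push_interval_ms` config field from the series; the function
name, the `missed_intervals` threshold, and the guest IDs are
hypothetical.

```python
def overdue_guests(last_push_s, interval_ms, now_s, missed_intervals=3):
    """Return the guests whose last push is older than the allowed gap.

    last_push_s maps guest id -> arrival time (seconds) of its last push.
    A guest is overdue once more than missed_intervals push intervals
    have elapsed without a new push.
    """
    deadline_s = missed_intervals * interval_ms / 1000.0
    return sorted(g for g, t in last_push_s.items() if now_s - t > deadline_s)


# Hypothetical state: vm-a pushed 0.5 s ago, vm-b pushed 10 s ago,
# with a 1000 ms push interval and a 3-interval allowance.
last_push_s = {"vm-a": 99.5, "vm-b": 90.0}
print(overdue_guests(last_push_s, interval_ms=1000, now_s=100.0))
```

This check is a scan over timestamps, so its cost does not depend on how
slow any individual guest is.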

I suppose to the orchestrator, reaping QEMU and reaping the guest look
the same.  The difference is just where the thread lives, hmm.

I'll make a note to inspect QEMU's solution to see if that's handled
or if it would be subject to the same issue.

Thanks!
~Gregory


end of thread, other threads:[~2026-06-16 14:44 UTC | newest]

Thread overview: 10+ messages
2026-05-13 16:50 [RFC PATCH 0/2] virtio-balloon: extended stats and push mode Gregory Price
2026-05-13 16:50 ` [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data Gregory Price
2026-06-16 12:30   ` David Hildenbrand (Arm)
2026-06-16 13:49     ` Gregory Price
2026-06-16 14:19       ` David Hildenbrand (Arm)
2026-05-13 16:50 ` [RFC PATCH 2/2] virtio-balloon: add stats push mode Gregory Price
2026-06-16 12:33   ` David Hildenbrand (Arm)
2026-06-16 13:57     ` Gregory Price
2026-06-16 14:32       ` David Hildenbrand (Arm)
2026-06-16 14:44         ` Gregory Price
