* [RFC PATCH 0/2] virtio-balloon: extended stats and push mode
@ 2026-05-13 16:50 Gregory Price
2026-05-13 16:50 ` [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data Gregory Price
2026-05-13 16:50 ` [RFC PATCH 2/2] virtio-balloon: add stats push mode Gregory Price
0 siblings, 2 replies; 10+ messages in thread
From: Gregory Price @ 2026-05-13 16:50 UTC (permalink / raw)
To: virtualization
Cc: linux-kernel, kernel-team, mst, david, jasowang, xuanzhuo,
eperezma, hannes, surenb, peterz, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, kprateek.nayak
This series extends the virtio-balloon stats virtqueue with new tags
for memory composition and pressure data, and adds an optional push
mode for guest-initiated stat reporting.
Patch 1 adds 11 new stat tags to the existing stats VQ.
These use the same tag/value format and require no feature negotiation.
Old hosts ignore unknown tags.
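
For illustration, the "ignore unknown tags" behaviour on the host side could
look roughly like the sketch below. This is not code from any particular
VMM: parse_balloon_stats(), HOST_MAX_KNOWN_TAG and the load_le*() helpers are
hypothetical names, and it assumes a modern (VIRTIO 1.0+) device where each
entry is a packed little-endian 16-bit tag followed by a 64-bit value.

```c
#include <stddef.h>
#include <stdint.h>

/* Highest tag this (hypothetical) older host understands: S_DIRECT_RECLAIM. */
#define HOST_MAX_KNOWN_TAG 15u
/* Packed entry: 16-bit tag + 64-bit value. */
#define STAT_ENTRY_SIZE 10u

/* Decode little-endian fields byte by byte so host endianness is irrelevant. */
static uint16_t load_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Walk a stats buffer, store values for known tags in vals[tag] and
 * silently skip tags the host does not know. Returns the number of
 * entries that were accepted.
 */
static size_t parse_balloon_stats(const uint8_t *buf, size_t len,
				  uint64_t vals[HOST_MAX_KNOWN_TAG + 1])
{
	size_t accepted = 0;

	for (size_t off = 0; off + STAT_ENTRY_SIZE <= len;
	     off += STAT_ENTRY_SIZE) {
		uint16_t tag = load_le16(buf + off);

		if (tag > HOST_MAX_KNOWN_TAG)
			continue;	/* newer tag: ignore it */
		vals[tag] = load_le64(buf + off + 2);
		accepted++;
	}
	return accepted;
}
```

A host written this way keeps working when the guest starts sending tags
16..26; a host that instead rejects the whole buffer on an unknown tag would
need to be fixed before the new tags can be used.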
The new tags provide data needed for better balloon sizing:
a) dirty/writeback pages (more accurate reclaimable estimates)
b) workingset refault counters (balloon overshoot before PSI spikes)
c) PSI pressure stats
Also exports psi_system for module builds.
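
Because the PSI values are cumulative stall times in microseconds, a host
consumer would normally look at the delta between two samples rather than at
the raw counter. A minimal sketch, assuming the host records the wall-clock
window between samples itself (psi_stall_percent() is a hypothetical helper,
not part of this series):

```c
#include <stdint.h>

/*
 * Share of the sampling window spent stalled, in percent, computed from
 * two cumulative PSI totals (microseconds). Returns 0 for an empty
 * window or if the counter went backwards (e.g. the guest rebooted).
 */
static double psi_stall_percent(uint64_t prev_total_us,
				uint64_t curr_total_us,
				uint64_t window_us)
{
	if (window_us == 0 || curr_total_us < prev_total_us)
		return 0.0;
	return 100.0 * (double)(curr_total_us - prev_total_us) /
	       (double)window_us;
}
```

For example, a memory "some" total that advances by 250000 us over a
1000000 us window corresponds to 25% of the window with at least one task
stalled on memory.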
Patch 2 adds VIRTIO_BALLOON_F_STATS_PUSH.
Stats push is a new feature that lets the host configure the guest
to push stats on a timer. The host sets stats_push_interval_ms in
the balloon config space; when non-zero, the guest pushes stats at
that cadence without waiting for the host to return the buffer.
This serves two purposes:
a) latency-sensitive consumers get stats at a guaranteed cadence
b) absence of expected stats provides guest liveness detection.
STATS_PUSH requires STATS_VQ and suppresses the pull callback when
active to avoid racing on buffer submission.
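
On the host side, the liveness use case could be sketched as below. This is
only an illustration under assumptions: guest_push_overdue() and the grace
factor MISSED_INTERVALS_BEFORE_SUSPECT are hypothetical, and the host is
assumed to timestamp each received stats buffer with a monotonic clock.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy: tolerate this many missed intervals before flagging. */
#define MISSED_INTERVALS_BEFORE_SUSPECT 3u

/*
 * Returns true if the guest has not pushed stats for longer than the
 * configured interval times the grace factor. interval_ms == 0 means
 * push mode is disabled, so missing stats carry no liveness information.
 */
static bool guest_push_overdue(uint64_t now_ms, uint64_t last_push_ms,
			       uint32_t interval_ms)
{
	if (interval_ms == 0)
		return false;
	if (now_ms < last_push_ms)
		return false;	/* clock anomaly: do not flag the guest */
	return now_ms - last_push_ms >
	       (uint64_t)interval_ms * MISSED_INTERVALS_BEFORE_SUSPECT;
}
```

The check is a pure comparison on host-side state, so an orchestrator can run
it for many guests without issuing any request to the guest.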
Gregory Price (2):
virtio-balloon: extend stats with memory composition and pressure data
virtio-balloon: add stats push mode
drivers/virtio/virtio_balloon.c | 104 ++++++++++++++++++++++++++++
include/uapi/linux/virtio_balloon.h | 33 ++++++++-
kernel/sched/psi.c | 1 +
3 files changed, 136 insertions(+), 2 deletions(-)
--
2.54.0
* [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data
2026-05-13 16:50 [RFC PATCH 0/2] virtio-balloon: extended stats and push mode Gregory Price
@ 2026-05-13 16:50 ` Gregory Price
2026-06-16 12:30 ` David Hildenbrand (Arm)
2026-05-13 16:50 ` [RFC PATCH 2/2] virtio-balloon: add stats push mode Gregory Price
1 sibling, 1 reply; 10+ messages in thread
From: Gregory Price @ 2026-05-13 16:50 UTC (permalink / raw)
To: virtualization
Cc: linux-kernel, kernel-team, mst, david, jasowang, xuanzhuo,
eperezma, hannes, surenb, peterz, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, kprateek.nayak
When doing proactive ballooning to reduce the size of a guest, some
additional vmstat information is useful in deciding how much pressure
to exert on the VM and when the VM starts experiencing undesirable
behavior (memory and io pressure).
Add 11 new statistics tags to the existing balloon stats virtqueue
for improved balloon sizing decisions. Old hosts ignore unknown tags,
so no feature negotiation is required.
New memory composition stats (bytes):
S_DIRTY: dirty pages awaiting writeback
S_WRITEBACK: pages under active writeback
S_ANON: anonymous pages (for balloon ceiling calculation)
S_INACTIVE_FILE: inactive file LRU (safely reclaimable subset of cache)
S_SLAB_RECLAIM: reclaimable slab memory
New workingset stats (counts):
S_WS_REFAULT_A: anon workingset refaults
S_WS_REFAULT_F: file workingset refaults
New PSI stats (microseconds, cumulative):
S_PSI_MEM_SOME: memory pressure (some stalled)
S_PSI_MEM_FULL: memory pressure (all stalled)
S_PSI_IO_SOME: IO pressure (some stalled)
S_PSI_IO_FULL: IO pressure (all stalled)
Export psi_system for module builds (CONFIG_VIRTIO_BALLOON=m with
CONFIG_PSI=y).
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Gregory Price <gourry@gourry.net>
---
drivers/virtio/virtio_balloon.c | 33 +++++++++++++++++++++++++++++
include/uapi/linux/virtio_balloon.h | 26 +++++++++++++++++++++--
kernel/sched/psi.c | 1 +
3 files changed, 58 insertions(+), 2 deletions(-)
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f6c2dff33f8a..8fa33aec4ce7 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -18,6 +18,8 @@
#include <linux/wait.h>
#include <linux/mm.h>
#include <linux/page_reporting.h>
+#include <linux/vmstat.h>
+#include <linux/psi.h>
/*
* Balloon device works in 4K page units. So each page is pointed to by
@@ -414,6 +416,37 @@ static unsigned int update_balloon_stats(struct virtio_balloon *vb)
update_stat(vb, idx++, VIRTIO_BALLOON_S_CACHES,
pages_to_bytes(caches));
+ update_stat(vb, idx++, VIRTIO_BALLOON_S_DIRTY,
+ pages_to_bytes(global_node_page_state(NR_FILE_DIRTY)));
+ update_stat(vb, idx++, VIRTIO_BALLOON_S_WRITEBACK,
+ pages_to_bytes(global_node_page_state(NR_WRITEBACK)));
+ update_stat(vb, idx++, VIRTIO_BALLOON_S_ANON,
+ pages_to_bytes(global_node_page_state(NR_ANON_MAPPED)));
+ update_stat(vb, idx++, VIRTIO_BALLOON_S_INACTIVE_FILE,
+ pages_to_bytes(global_node_page_state(NR_INACTIVE_FILE)));
+ update_stat(vb, idx++, VIRTIO_BALLOON_S_SLAB_RECLAIM,
+ pages_to_bytes(
+ global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B)));
+ update_stat(vb, idx++, VIRTIO_BALLOON_S_WS_REFAULT_A,
+ global_node_page_state(WORKINGSET_REFAULT_ANON));
+ update_stat(vb, idx++, VIRTIO_BALLOON_S_WS_REFAULT_F,
+ global_node_page_state(WORKINGSET_REFAULT_FILE));
+
+#ifdef CONFIG_PSI
+ update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_MEM_SOME,
+ div_u64(psi_system.total[PSI_AVGS][PSI_MEM_SOME],
+ NSEC_PER_USEC));
+ update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_MEM_FULL,
+ div_u64(psi_system.total[PSI_AVGS][PSI_MEM_FULL],
+ NSEC_PER_USEC));
+ update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_IO_SOME,
+ div_u64(psi_system.total[PSI_AVGS][PSI_IO_SOME],
+ NSEC_PER_USEC));
+ update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_IO_FULL,
+ div_u64(psi_system.total[PSI_AVGS][PSI_IO_FULL],
+ NSEC_PER_USEC));
+#endif
+
return idx;
}
diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index ee35a372805d..37ec8a8466c4 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -77,7 +77,18 @@ struct virtio_balloon_config {
#define VIRTIO_BALLOON_S_DIRECT_SCAN 13 /* Amount of memory scanned directly */
#define VIRTIO_BALLOON_S_ASYNC_RECLAIM 14 /* Amount of memory reclaimed asynchronously */
#define VIRTIO_BALLOON_S_DIRECT_RECLAIM 15 /* Amount of memory reclaimed directly */
-#define VIRTIO_BALLOON_S_NR 16
+#define VIRTIO_BALLOON_S_DIRTY 16 /* Dirty pages (bytes) */
+#define VIRTIO_BALLOON_S_WRITEBACK 17 /* Pages under writeback (bytes) */
+#define VIRTIO_BALLOON_S_ANON 18 /* Anonymous pages (bytes) */
+#define VIRTIO_BALLOON_S_INACTIVE_FILE 19 /* Inactive file LRU pages (bytes) */
+#define VIRTIO_BALLOON_S_SLAB_RECLAIM 20 /* Reclaimable slab (bytes) */
+#define VIRTIO_BALLOON_S_WS_REFAULT_A 21 /* Workingset refaults anon (count) */
+#define VIRTIO_BALLOON_S_WS_REFAULT_F 22 /* Workingset refaults file (count) */
+#define VIRTIO_BALLOON_S_PSI_MEM_SOME 23 /* PSI memory some total (us) */
+#define VIRTIO_BALLOON_S_PSI_MEM_FULL 24 /* PSI memory full total (us) */
+#define VIRTIO_BALLOON_S_PSI_IO_SOME 25 /* PSI IO some total (us) */
+#define VIRTIO_BALLOON_S_PSI_IO_FULL 26 /* PSI IO full total (us) */
+#define VIRTIO_BALLOON_S_NR 27
#define VIRTIO_BALLOON_S_NAMES_WITH_PREFIX(VIRTIO_BALLOON_S_NAMES_prefix) { \
VIRTIO_BALLOON_S_NAMES_prefix "swap-in", \
@@ -95,7 +106,18 @@ struct virtio_balloon_config {
VIRTIO_BALLOON_S_NAMES_prefix "async-scans", \
VIRTIO_BALLOON_S_NAMES_prefix "direct-scans", \
VIRTIO_BALLOON_S_NAMES_prefix "async-reclaims", \
- VIRTIO_BALLOON_S_NAMES_prefix "direct-reclaims" \
+ VIRTIO_BALLOON_S_NAMES_prefix "direct-reclaims", \
+ VIRTIO_BALLOON_S_NAMES_prefix "dirty", \
+ VIRTIO_BALLOON_S_NAMES_prefix "writeback", \
+ VIRTIO_BALLOON_S_NAMES_prefix "anon-pages", \
+ VIRTIO_BALLOON_S_NAMES_prefix "inactive-file", \
+ VIRTIO_BALLOON_S_NAMES_prefix "slab-reclaimable", \
+ VIRTIO_BALLOON_S_NAMES_prefix "ws-refault-anon", \
+ VIRTIO_BALLOON_S_NAMES_prefix "ws-refault-file", \
+ VIRTIO_BALLOON_S_NAMES_prefix "psi-mem-some-us", \
+ VIRTIO_BALLOON_S_NAMES_prefix "psi-mem-full-us", \
+ VIRTIO_BALLOON_S_NAMES_prefix "psi-io-some-us", \
+ VIRTIO_BALLOON_S_NAMES_prefix "psi-io-full-us" \
}
#define VIRTIO_BALLOON_S_NAMES VIRTIO_BALLOON_S_NAMES_WITH_PREFIX("")
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index d9c9d9480a45..8ab3aa1c4ef5 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -175,6 +175,7 @@ static DEFINE_PER_CPU(struct psi_group_cpu, system_group_pcpu);
struct psi_group psi_system = {
.pcpu = &system_group_pcpu,
};
+EXPORT_SYMBOL_GPL(psi_system);
static DEFINE_PER_CPU(seqcount_t, psi_seq) = SEQCNT_ZERO(psi_seq);
--
2.54.0
* [RFC PATCH 2/2] virtio-balloon: add stats push mode
2026-05-13 16:50 [RFC PATCH 0/2] virtio-balloon: extended stats and push mode Gregory Price
2026-05-13 16:50 ` [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data Gregory Price
@ 2026-05-13 16:50 ` Gregory Price
2026-06-16 12:33 ` David Hildenbrand (Arm)
1 sibling, 1 reply; 10+ messages in thread
From: Gregory Price @ 2026-05-13 16:50 UTC (permalink / raw)
To: virtualization
Cc: linux-kernel, kernel-team, mst, david, jasowang, xuanzhuo,
eperezma, hannes, surenb, peterz, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, kprateek.nayak
When doing aggressive overcommit of VMs on a single host, a pull
model of stat retrieval is problematic if a guest becomes some form
of unresponsive. In particular, it's difficult to discern the
difference between a hung guest and a slow guest - and why the
guest is experiencing that.
Add VIRTIO_BALLOON_F_STATS_PUSH feature that allows the host to
configure the guest to push stats on a timer instead of the default
pull model.
The host sets stats_push_interval_ms in the balloon config space:
0 = disabled (pull-only, default)
N > 0 = guest pushes stats every N milliseconds
The push mode reuses the existing stats VQ, same buffer format,
same tags. The host can change the interval at runtime by updating
the config field.
Push mode provides two advantages over pull:
1. Guest liveness detection: in pull mode, the host cannot
distinguish a slow guest from a hung guest without implementing
its own timeout tracking. In push mode, the absence of expected
stats buffers is an implicit liveness signal; if the guest
fails to push within the expected interval, the host can
conclude it is unresponsive.
2. Latency-sensitive consumers (e.g., memory pressure response
loops) receive fresh stats at a guaranteed cadence without
the host needing to poll.
STATS_PUSH requires STATS_VQ; the driver clears STATS_PUSH during
feature validation if STATS_VQ is absent. When push mode is active,
the pull callback is suppressed to avoid racing on buffer submission.
The pull model remains available and is the default.
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Gregory Price <gourry@gourry.net>
---
drivers/virtio/virtio_balloon.c | 71 +++++++++++++++++++++++++++++
include/uapi/linux/virtio_balloon.h | 7 +++
2 files changed, 78 insertions(+)
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 8fa33aec4ce7..47bde1d2b388 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -112,6 +112,10 @@ struct virtio_balloon {
/* Memory statistics */
struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
+ /* Stats push mode */
+ struct delayed_work stats_push_work;
+ uint32_t stats_push_interval_ms;
+
/* Shrinker to return free pages - VIRTIO_BALLOON_F_FREE_PAGE_HINT */
struct shrinker *shrinker;
@@ -463,6 +467,13 @@ static void stats_request(struct virtqueue *vq)
{
struct virtio_balloon *vb = vq->vdev->priv;
+ /*
+ * In push mode, the push timer owns the VQ. Ignore pull
+ * requests to avoid racing on buffer submission.
+ */
+ if (vb->stats_push_interval_ms)
+ return;
+
spin_lock(&vb->stop_update_lock);
if (!vb->stop_update) {
start_wakeup_event(vb, VIRTIO_BALLOON_WAKEUP_SIGNAL_STATS);
@@ -558,6 +569,20 @@ static void virtballoon_changed(struct virtio_device *vdev)
virtio_balloon_queue_free_page_work(vb);
}
spin_unlock_irqrestore(&vb->stop_update_lock, flags);
+
+ if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_STATS_PUSH)) {
+ uint32_t interval;
+
+ virtio_cread_le(vdev, struct virtio_balloon_config,
+ stats_push_interval_ms, &interval);
+ if (interval != vb->stats_push_interval_ms) {
+ vb->stats_push_interval_ms = interval;
+ cancel_delayed_work(&vb->stats_push_work);
+ if (interval)
+ schedule_delayed_work(&vb->stats_push_work,
+ msecs_to_jiffies(interval));
+ }
+ }
}
static void update_balloon_size(struct virtio_balloon *vb)
@@ -581,6 +606,32 @@ static void update_balloon_stats_func(struct work_struct *work)
finish_wakeup_event(vb);
}
+static void stats_push_func(struct work_struct *work)
+{
+ struct virtio_balloon *vb = container_of(work, struct virtio_balloon,
+ stats_push_work.work);
+ struct virtqueue *vq;
+ struct scatterlist sg;
+ unsigned int num_stats, len;
+
+ if (!vb->stats_push_interval_ms)
+ return;
+
+ vq = vb->stats_vq;
+
+ /* Reclaim previous buffer */
+ while (virtqueue_get_buf(vq, &len))
+ ;
+
+ num_stats = update_balloon_stats(vb);
+ sg_init_one(&sg, vb->stats, sizeof(vb->stats[0]) * num_stats);
+ virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
+ virtqueue_kick(vq);
+
+ schedule_delayed_work(&vb->stats_push_work,
+ msecs_to_jiffies(vb->stats_push_interval_ms));
+}
+
static void update_balloon_size_func(struct work_struct *work)
{
struct virtio_balloon *vb;
@@ -967,6 +1018,7 @@ static int virtballoon_probe(struct virtio_device *vdev)
}
INIT_WORK(&vb->update_balloon_stats_work, update_balloon_stats_func);
+ INIT_DELAYED_WORK(&vb->stats_push_work, stats_push_func);
INIT_WORK(&vb->update_balloon_size_work, update_balloon_size_func);
spin_lock_init(&vb->stop_update_lock);
mutex_init(&vb->balloon_lock);
@@ -1094,6 +1146,19 @@ static int virtballoon_probe(struct virtio_device *vdev)
if (towards_target(vb))
virtballoon_changed(vdev);
+
+ if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_STATS_PUSH)) {
+ uint32_t interval;
+
+ virtio_cread_le(vdev, struct virtio_balloon_config,
+ stats_push_interval_ms, &interval);
+ if (interval) {
+ vb->stats_push_interval_ms = interval;
+ schedule_delayed_work(&vb->stats_push_work,
+ msecs_to_jiffies(interval));
+ }
+ }
+
return 0;
out_unregister_oom:
@@ -1145,6 +1210,7 @@ static void virtballoon_remove(struct virtio_device *vdev)
spin_unlock_irq(&vb->stop_update_lock);
cancel_work_sync(&vb->update_balloon_size_work);
cancel_work_sync(&vb->update_balloon_stats_work);
+ cancel_delayed_work_sync(&vb->stats_push_work);
if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
cancel_work_sync(&vb->report_free_page_work);
@@ -1199,6 +1265,10 @@ static int virtballoon_validate(struct virtio_device *vdev)
else if (!virtio_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON))
__virtio_clear_bit(vdev, VIRTIO_BALLOON_F_REPORTING);
+ if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_STATS_PUSH) &&
+ !virtio_has_feature(vdev, VIRTIO_BALLOON_F_STATS_VQ))
+ __virtio_clear_bit(vdev, VIRTIO_BALLOON_F_STATS_PUSH);
+
__virtio_clear_bit(vdev, VIRTIO_F_ACCESS_PLATFORM);
return 0;
}
@@ -1210,6 +1280,7 @@ static unsigned int features[] = {
VIRTIO_BALLOON_F_FREE_PAGE_HINT,
VIRTIO_BALLOON_F_PAGE_POISON,
VIRTIO_BALLOON_F_REPORTING,
+ VIRTIO_BALLOON_F_STATS_PUSH,
};
static struct virtio_driver virtio_balloon_driver = {
diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index 37ec8a8466c4..90e9b5247e5e 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -37,6 +37,7 @@
#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
#define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */
#define VIRTIO_BALLOON_F_REPORTING 5 /* Page reporting virtqueue */
+#define VIRTIO_BALLOON_F_STATS_PUSH 6 /* Guest pushes stats on a timer */
/* Size of a PFN in the balloon interface. */
#define VIRTIO_BALLOON_PFN_SHIFT 12
@@ -59,6 +60,12 @@ struct virtio_balloon_config {
};
/* Stores PAGE_POISON if page poisoning is in use */
__le32 poison_val;
+ /*
+ * Stats push interval in milliseconds. 0 = disabled (pull only).
+ * Valid with VIRTIO_BALLOON_F_STATS_PUSH. Host-writable, can change
+ * at runtime via config updates.
+ */
+ __le32 stats_push_interval_ms;
};
#define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in */
--
2.54.0
* Re: [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data
2026-05-13 16:50 ` [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data Gregory Price
@ 2026-06-16 12:30 ` David Hildenbrand (Arm)
2026-06-16 13:49 ` Gregory Price
0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-16 12:30 UTC (permalink / raw)
To: Gregory Price, virtualization
Cc: linux-kernel, kernel-team, mst, jasowang, xuanzhuo, eperezma,
hannes, surenb, peterz, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
kprateek.nayak
On 5/13/26 18:50, Gregory Price wrote:
> When doing proactive ballooning to reduce the size of a guest, some
> additional vmstat information is useful in deciding how much pressure
> to exert on the VM and when the VM starts experiencing undesirable
> behavior (memory and io pressure).
>
> Add 11 new statistics tags to the existing balloon stats virtqueue
> for improved balloon sizing decisions. Old hosts ignore unknown tags,
> so no feature negotiation is required.
>
> New memory composition stats (bytes):
> S_DIRTY: dirty pages awaiting writeback
> S_WRITEBACK: pages under active writeback
> S_ANON: anonymous pages (for balloon ceiling calculation)
> S_INACTIVE_FILE: inactive file LRU (safely reclaimable subset of cache)
> S_SLAB_RECLAIM: reclaimable slab memory
>
> New workingset stats (counts):
> S_WS_REFAULT_A: anon workingset refaults
> S_WS_REFAULT_F: file workingset refaults
>
> New PSI stats (microseconds, cumulative):
> S_PSI_MEM_SOME: memory pressure (some stalled)
> S_PSI_MEM_FULL: memory pressure (all stalled)
> S_PSI_IO_SOME: IO pressure (some stalled)
> S_PSI_IO_FULL: IO pressure (all stalled)
>
> Export psi_system for module builds (CONFIG_VIRTIO_BALLOON=m with
> CONFIG_PSI=y).
>
> Assisted-by: Claude:claude-opus-4-6
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
> drivers/virtio/virtio_balloon.c | 33 +++++++++++++++++++++++++++++
> include/uapi/linux/virtio_balloon.h | 26 +++++++++++++++++++++--
> kernel/sched/psi.c | 1 +
> 3 files changed, 58 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index f6c2dff33f8a..8fa33aec4ce7 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -18,6 +18,8 @@
> #include <linux/wait.h>
> #include <linux/mm.h>
> #include <linux/page_reporting.h>
> +#include <linux/vmstat.h>
> +#include <linux/psi.h>
>
> /*
> * Balloon device works in 4K page units. So each page is pointed to by
> @@ -414,6 +416,37 @@ static unsigned int update_balloon_stats(struct virtio_balloon *vb)
> update_stat(vb, idx++, VIRTIO_BALLOON_S_CACHES,
> pages_to_bytes(caches));
>
> + update_stat(vb, idx++, VIRTIO_BALLOON_S_DIRTY,
> + pages_to_bytes(global_node_page_state(NR_FILE_DIRTY)));
> + update_stat(vb, idx++, VIRTIO_BALLOON_S_WRITEBACK,
> + pages_to_bytes(global_node_page_state(NR_WRITEBACK)));
> + update_stat(vb, idx++, VIRTIO_BALLOON_S_ANON,
> + pages_to_bytes(global_node_page_state(NR_ANON_MAPPED)));
> + update_stat(vb, idx++, VIRTIO_BALLOON_S_INACTIVE_FILE,
> + pages_to_bytes(global_node_page_state(NR_INACTIVE_FILE)));
> + update_stat(vb, idx++, VIRTIO_BALLOON_S_SLAB_RECLAIM,
> + pages_to_bytes(
> + global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B)));
> + update_stat(vb, idx++, VIRTIO_BALLOON_S_WS_REFAULT_A,
> + global_node_page_state(WORKINGSET_REFAULT_ANON));
> + update_stat(vb, idx++, VIRTIO_BALLOON_S_WS_REFAULT_F,
> + global_node_page_state(WORKINGSET_REFAULT_FILE));
> +
> +#ifdef CONFIG_PSI
> + update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_MEM_SOME,
> + div_u64(psi_system.total[PSI_AVGS][PSI_MEM_SOME],
> + NSEC_PER_USEC));
> + update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_MEM_FULL,
> + div_u64(psi_system.total[PSI_AVGS][PSI_MEM_FULL],
> + NSEC_PER_USEC));
> + update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_IO_SOME,
> + div_u64(psi_system.total[PSI_AVGS][PSI_IO_SOME],
> + NSEC_PER_USEC));
> + update_stat(vb, idx++, VIRTIO_BALLOON_S_PSI_IO_FULL,
> + div_u64(psi_system.total[PSI_AVGS][PSI_IO_FULL],
> + NSEC_PER_USEC));
> +#endif
> +
> return idx;
> }
>
> diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
> index ee35a372805d..37ec8a8466c4 100644
> --- a/include/uapi/linux/virtio_balloon.h
> +++ b/include/uapi/linux/virtio_balloon.h
> @@ -77,7 +77,18 @@ struct virtio_balloon_config {
> #define VIRTIO_BALLOON_S_DIRECT_SCAN 13 /* Amount of memory scanned directly */
> #define VIRTIO_BALLOON_S_ASYNC_RECLAIM 14 /* Amount of memory reclaimed asynchronously */
> #define VIRTIO_BALLOON_S_DIRECT_RECLAIM 15 /* Amount of memory reclaimed directly */
> -#define VIRTIO_BALLOON_S_NR 16
> +#define VIRTIO_BALLOON_S_DIRTY 16 /* Dirty pages (bytes) */
> +#define VIRTIO_BALLOON_S_WRITEBACK 17 /* Pages under writeback (bytes) */
> +#define VIRTIO_BALLOON_S_ANON 18 /* Anonymous pages (bytes) */
> +#define VIRTIO_BALLOON_S_INACTIVE_FILE 19 /* Inactive file LRU pages (bytes) */
> +#define VIRTIO_BALLOON_S_SLAB_RECLAIM 20 /* Reclaimable slab (bytes) */
> +#define VIRTIO_BALLOON_S_WS_REFAULT_A 21 /* Workingset refaults anon (count) */
> +#define VIRTIO_BALLOON_S_WS_REFAULT_F 22 /* Workingset refaults file (count) */
> +#define VIRTIO_BALLOON_S_PSI_MEM_SOME 23 /* PSI memory some total (us) */
> +#define VIRTIO_BALLOON_S_PSI_MEM_FULL 24 /* PSI memory full total (us) */
> +#define VIRTIO_BALLOON_S_PSI_IO_SOME 25 /* PSI IO some total (us) */
> +#define VIRTIO_BALLOON_S_PSI_IO_FULL 26 /* PSI IO full total (us) */
> +#define VIRTIO_BALLOON_S_NR 27
>
> #define VIRTIO_BALLOON_S_NAMES_WITH_PREFIX(VIRTIO_BALLOON_S_NAMES_prefix) { \
> VIRTIO_BALLOON_S_NAMES_prefix "swap-in", \
> @@ -95,7 +106,18 @@ struct virtio_balloon_config {
> VIRTIO_BALLOON_S_NAMES_prefix "async-scans", \
> VIRTIO_BALLOON_S_NAMES_prefix "direct-scans", \
> VIRTIO_BALLOON_S_NAMES_prefix "async-reclaims", \
> - VIRTIO_BALLOON_S_NAMES_prefix "direct-reclaims" \
> + VIRTIO_BALLOON_S_NAMES_prefix "direct-reclaims", \
> + VIRTIO_BALLOON_S_NAMES_prefix "dirty", \
> + VIRTIO_BALLOON_S_NAMES_prefix "writeback", \
> + VIRTIO_BALLOON_S_NAMES_prefix "anon-pages", \
> + VIRTIO_BALLOON_S_NAMES_prefix "inactive-file", \
> + VIRTIO_BALLOON_S_NAMES_prefix "slab-reclaimable", \
> + VIRTIO_BALLOON_S_NAMES_prefix "ws-refault-anon", \
> + VIRTIO_BALLOON_S_NAMES_prefix "ws-refault-file", \
> + VIRTIO_BALLOON_S_NAMES_prefix "psi-mem-some-us", \
> + VIRTIO_BALLOON_S_NAMES_prefix "psi-mem-full-us", \
> + VIRTIO_BALLOON_S_NAMES_prefix "psi-io-some-us", \
> + VIRTIO_BALLOON_S_NAMES_prefix "psi-io-full-us" \
> }
>
> #define VIRTIO_BALLOON_S_NAMES VIRTIO_BALLOON_S_NAMES_WITH_PREFIX("")
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index d9c9d9480a45..8ab3aa1c4ef5 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -175,6 +175,7 @@ static DEFINE_PER_CPU(struct psi_group_cpu, system_group_pcpu);
> struct psi_group psi_system = {
> .pcpu = &system_group_pcpu,
> };
> +EXPORT_SYMBOL_GPL(psi_system);
Nothing too crazy here, however, the question is which of these values we
actually want to guarantee that we will provide them with unchanged semantics in
the future ... I guess anything we already expose to user space is alright
(because it effectively already must remain mostly unchanged I assume).
--
Cheers,
David
* Re: [RFC PATCH 2/2] virtio-balloon: add stats push mode
2026-05-13 16:50 ` [RFC PATCH 2/2] virtio-balloon: add stats push mode Gregory Price
@ 2026-06-16 12:33 ` David Hildenbrand (Arm)
2026-06-16 13:57 ` Gregory Price
0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-16 12:33 UTC (permalink / raw)
To: Gregory Price, virtualization
Cc: linux-kernel, kernel-team, mst, jasowang, xuanzhuo, eperezma,
hannes, surenb, peterz, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
kprateek.nayak
On 5/13/26 18:50, Gregory Price wrote:
> When doing aggressive overcommit of VMs on a single host, a pull
> model of stat retrieval is problematic if a guest becomes some form
> of unresponsive. In particular, it's difficult to discern the
> difference between a hung guest and a slow guest - and why the
> guest is experiencing that.
>
> Add VIRTIO_BALLOON_F_STATS_PUSH feature that allows the host to
> configure the guest to push stats on a timer instead of the default
> pull model.
>
> The host sets stats_push_interval_ms in the balloon config space:
> 0 = disabled (pull-only, default)
> N > 0 = guest pushes stats every N milliseconds
>
> The push mode reuses the existing stats VQ, same buffer format,
> same tags. The host can change the interval at runtime by updating
> the config field.
>
> Push mode provides two advantages over pull:
> 1. Guest liveness detection: in pull mode, the host cannot
> distinguish a slow guest from a hung guest without implementing
> its own timeout tracking. In push mode, the absence of expected
> stats buffers is an implicit liveness signal; if the guest
> fails to push within the expected interval, the host can
> conclude it is unresponsive.
> 2. Latency-sensitive consumers (e.g., memory pressure response
> loops) receive fresh stats at a guaranteed cadence without
> the host needing to poll.
>
> STATS_PUSH requires STATS_VQ; the driver clears STATS_PUSH during
> feature validation if STATS_VQ is absent. When push mode is active,
> the pull callback is suppressed to avoid racing on buffer submission.
>
> The pull model remains available and is the default.
I don't quite see the big benefit here, really: either it's a timer in the
hypervisor or a timer in the VM. A slow VM will, in either model, delay the
update of stats.
If you need some "liveness detection", is virtio-balloon stats updates really
the right mechanism?
I don't quite understand the "Latency-sensitive consumers" problem. If the VM is
slow, it is slow and will mess with latency-sensitive consumers either way?
--
Cheers,
David
* Re: [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data
2026-06-16 12:30 ` David Hildenbrand (Arm)
@ 2026-06-16 13:49 ` Gregory Price
2026-06-16 14:19 ` David Hildenbrand (Arm)
0 siblings, 1 reply; 10+ messages in thread
From: Gregory Price @ 2026-06-16 13:49 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: virtualization, linux-kernel, kernel-team, mst, jasowang,
xuanzhuo, eperezma, hannes, surenb, peterz, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, kprateek.nayak
On Tue, Jun 16, 2026 at 02:30:34PM +0200, David Hildenbrand (Arm) wrote:
>
> Nothing too crazy here, however, the question is which of these values we
> actually want to guarantee that we will provide them with unchanged semantics in
> the future ... I guess anything we already expose to user space is alright
> (because it effectively already must remain mostly unchanged I assume).
>
I suppose there's a risk of them going away entirely? That would be
fixable, but it seems unlikely.
~Gregory
* Re: [RFC PATCH 2/2] virtio-balloon: add stats push mode
2026-06-16 12:33 ` David Hildenbrand (Arm)
@ 2026-06-16 13:57 ` Gregory Price
2026-06-16 14:32 ` David Hildenbrand (Arm)
0 siblings, 1 reply; 10+ messages in thread
From: Gregory Price @ 2026-06-16 13:57 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: virtualization, linux-kernel, kernel-team, mst, jasowang,
xuanzhuo, eperezma, hannes, surenb, peterz, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, kprateek.nayak
On Tue, Jun 16, 2026 at 02:33:43PM +0200, David Hildenbrand (Arm) wrote:
> On 5/13/26 18:50, Gregory Price wrote:
> >
> > The pull model remains available and is the default.
>
> I don't quite see the big benefit here, really: either it's a timer in the
> hypervisor or a timer in the VM. A slow VM will, in either model, delay the
> update of stats.
>
> If you need some "liveness detection", is virtio-balloon stats updates really
> the right mechanism?
>
> I don't quite understand the "Latency-sensitive consumers" problem. If the VM is
> slow, it is slow and will mess with latency-sensitive consumers either way?
>
Latency sensitive here should probably be defined as "Does not like
blocking operations". This was prototyped in the context of
cloud-hypervisor [1] and an orchestrator trying to poll 1000 VMs on a
single machine for stats.
The poller couldn't determine the difference between "guest is slow" and
"guest is hung" and so had to block on the operation (I didn't see how
to solve this async).
Similarly, having a single thread just round-robin poll the VMs is
bluntly inefficient and provides poor guarantees about the freshness
of the stats (a couple of slow guests can cause other guests' stats to
become stale for tens of seconds).
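
To make the staleness argument concrete, here is a back-of-the-envelope
sketch. The numbers and the helper worst_case_cycle_ms() are purely
illustrative assumptions, not measurements from the prototype: with a single
blocking poller, one full pass over all guests costs the sum of the per-guest
waits.

```c
#include <stdint.h>

/*
 * Worst-case duration of one round-robin pass when n_slow of the n_vms
 * guests each hold the poller for slow_timeout_ms and the rest answer
 * in fast_ms.
 */
static uint64_t worst_case_cycle_ms(uint32_t n_vms, uint32_t n_slow,
				    uint32_t fast_ms,
				    uint32_t slow_timeout_ms)
{
	if (n_slow > n_vms)
		n_slow = n_vms;
	return (uint64_t)(n_vms - n_slow) * fast_ms +
	       (uint64_t)n_slow * slow_timeout_ms;
}
```

With 1000 guests, 1 ms per responsive guest and two guests that each block
the poller for a hypothetical 5000 ms timeout, one pass takes
998 + 10000 = 10998 ms, so every other guest's stats are roughly 11 seconds
old by the time they are refreshed.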
Definitely an RFC here because I'm not sure if I was missing something
that might help me solve the problem.
~Gregory
[1] https://github.com/cloud-hypervisor/cloud-hypervisor
* Re: [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data
2026-06-16 13:49 ` Gregory Price
@ 2026-06-16 14:19 ` David Hildenbrand (Arm)
0 siblings, 0 replies; 10+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-16 14:19 UTC (permalink / raw)
To: Gregory Price
Cc: virtualization, linux-kernel, kernel-team, mst, jasowang,
xuanzhuo, eperezma, hannes, surenb, peterz, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, kprateek.nayak
On 6/16/26 15:49, Gregory Price wrote:
> On Tue, Jun 16, 2026 at 02:30:34PM +0200, David Hildenbrand (Arm) wrote:
>>
>> Nothing too crazy here, however, the question is which of these values we
>> actually want to guarantee that we will provide them with unchanged semantics in
>> the future ... I guess anything we already expose to user space is alright
>> (because it effectively already must remain mostly unchanged I assume).
>>
>
> I suppose there's a risk of them going away entirely? That would be
> fixable, but it seems unlikely.
Yeah, the first bunch is all exported through /proc/vmstat AFAIKS.
The others are exported through /proc/pressure/.
So I assume this is just fine.
--
Cheers,
David
* Re: [RFC PATCH 2/2] virtio-balloon: add stats push mode
2026-06-16 13:57 ` Gregory Price
@ 2026-06-16 14:32 ` David Hildenbrand (Arm)
2026-06-16 14:44 ` Gregory Price
0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-16 14:32 UTC (permalink / raw)
To: Gregory Price
Cc: virtualization, linux-kernel, kernel-team, mst, jasowang,
xuanzhuo, eperezma, hannes, surenb, peterz, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, kprateek.nayak
On 6/16/26 15:57, Gregory Price wrote:
> On Tue, Jun 16, 2026 at 02:33:43PM +0200, David Hildenbrand (Arm) wrote:
>> On 5/13/26 18:50, Gregory Price wrote:
>>>
>>> The pull model remains available and is the default.
>>
>> I don't quite see the big benefit here, really: either it's a timer in the
>> hypervisor or a timer in the VM. A slow VM will, in either model, delay the
>> update of stats.
>>
>> If you need some "liveness detection", is virtio-balloon stats updates really
>> the right mechanism?
>>
>> I don't quite understand the "Latency-sensitive consumers" problem. If the VM is
>> slow, it is slow and will mess with latency-sensitive consumers in either way?
>>
>
> Latency sensitive here should probably be defined as "Does not like
> blocking operations". This was prototyped in the context of
> cloud-hypervisor [1] and an orchestrator trying to poll 1000 VMs on a
> single machine for stats.
>
> The poller couldn't determine the difference between "guest is slow" and
> "guest is hung" and so had to block on the operation (I didn't see how
> to solve this async).
>
> Similarly, having a single thread just round-robin poll the VMs is
> bluntly inefficient and provides poor guarantees about the liveliness
> of the stats (a couple slow guests can cause other guests' stats to
> become stale for 10s of seconds).
>
> Definitely an RFC here because I'm not sure if I was missing something
> that might help me solve the problem.
Well, in QEMU we just run a timer internally that does the polling.
Then, upper layers in the stack can ask QEMU for the latest stats.
There, you just get the stats along with a "last-update" timestamp.
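The pattern described above (an internal timer polls, upper layers read the cached result plus a "last-update" timestamp) can be sketched roughly as follows. This is not QEMU's code; StatsCache, poll_once and the fetch callback are hypothetical names used only to show the shape of the idea.

```python
import threading
import time


class StatsCache:
    """Poll on an internal timer; readers only ever see the cached copy."""

    def __init__(self, fetch, interval_s):
        self._fetch = fetch            # hypothetical callback that asks the guest for stats
        self._interval_s = interval_s
        self._lock = threading.Lock()
        self._stats = None
        self._last_update = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def poll_once(self):
        stats = self._fetch()          # may take a long time if the guest is slow
        with self._lock:
            self._stats = stats
            self._last_update = time.time()

    def _run(self):
        while not self._stop.is_set():
            self.poll_once()
            self._stop.wait(self._interval_s)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def latest(self):
        """Non-blocking read: returns (stats, last_update) as of the last poll."""
        with self._lock:
            return self._stats, self._last_update


def fake_fetch():
    # Made-up stand-in for a real stats request.
    return {"free_memory": 1024}


cache = StatsCache(fake_fetch, interval_s=1.0)
cache.poll_once()                      # a real user would call cache.start() instead
stats, last_update = cache.latest()
```

The reader never blocks on the guest; it only learns from last_update how old the data is. A fetch that never returns still ties up the polling thread, though.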
--
Cheers,
David
* Re: [RFC PATCH 2/2] virtio-balloon: add stats push mode
2026-06-16 14:32 ` David Hildenbrand (Arm)
@ 2026-06-16 14:44 ` Gregory Price
0 siblings, 0 replies; 10+ messages in thread
From: Gregory Price @ 2026-06-16 14:44 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: virtualization, linux-kernel, kernel-team, mst, jasowang,
xuanzhuo, eperezma, hannes, surenb, peterz, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, kprateek.nayak
On Tue, Jun 16, 2026 at 04:32:46PM +0200, David Hildenbrand (Arm) wrote:
> On 6/16/26 15:57, Gregory Price wrote:
> >
> > Definitely an RFC here because I'm not sure if I was missing something
> > that might help me solve the problem.
>
> Well, in QEMU we just run a timer internally that does the polling.
>
> Then, upper layers in the stack can ask QEMU for the latest stats.
>
> There, you just get the stats along with a "last-update" timestamp.
>
That makes sense, although don't you just push the blocking operation
into yet another thread on the host?
Assuming it's not cancel-able, there's a blocked thread there you have
to reap. Vs the guest being unresponsive and not sending updates, you
just reap the whole guest or take some other corrective action.
I suppose to the orchestrator reaping QEMU and reaping the guest looks
the same. The difference is just where the thread lives, hmmmm
I'll make a note to inspect QEMU's solution to see if that's handled
or if it would be subject to the same issue.
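For comparison, the host-side check in the push model could be as small as the sketch below: record when each guest last pushed, and flag any guest that has missed a few intervals. The function name, the missed_allowed threshold and the sample timestamps are all hypothetical; only stats_push_interval_ms comes from the series.

```python
def stale_guests(last_seen, now, stats_push_interval_ms, missed_allowed=3):
    """Return guests whose last push is older than missed_allowed intervals.

    last_seen maps a guest name to the time (in seconds) of its last push.
    """
    deadline_s = (stats_push_interval_ms / 1000.0) * missed_allowed
    return sorted(g for g, t in last_seen.items() if now - t > deadline_s)


# Made-up example: three guests pushing every 1000 ms, checked at t=100.5s.
last_seen = {"vm-a": 100.0, "vm-b": 91.0, "vm-c": 99.5}
overdue = stale_guests(last_seen, now=100.5, stats_push_interval_ms=1000)
```

Nothing here blocks on a guest, so a single thread can scan every guest on the machine in one pass.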
Thanks!
~Gregory
Thread overview: 10+ messages
2026-05-13 16:50 [RFC PATCH 0/2] virtio-balloon: extended stats and push mode Gregory Price
2026-05-13 16:50 ` [RFC PATCH 1/2] virtio-balloon: extend stats with memory composition and pressure data Gregory Price
2026-06-16 12:30 ` David Hildenbrand (Arm)
2026-06-16 13:49 ` Gregory Price
2026-06-16 14:19 ` David Hildenbrand (Arm)
2026-05-13 16:50 ` [RFC PATCH 2/2] virtio-balloon: add stats push mode Gregory Price
2026-06-16 12:33 ` David Hildenbrand (Arm)
2026-06-16 13:57 ` Gregory Price
2026-06-16 14:32 ` David Hildenbrand (Arm)
2026-06-16 14:44 ` Gregory Price