From: Peter Xu <peterx@redhat.com>
To: marcandre.lureau@redhat.com
Cc: qemu-devel@nongnu.org, "Ben Chaney" <bchaney@akamai.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Cédric Le Goater" <clg@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Alex Williamson" <alex@shazbot.org>,
"Fabiano Rosas" <farosas@suse.de>,
"David Hildenbrand" <david@kernel.org>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
kvm@vger.kernel.org, "Mark Kanda" <mark.kanda@oracle.com>
Subject: Re: [PATCH v3 12/15] system/memory: implement RamDiscardManager multi-source aggregation
Date: Tue, 3 Mar 2026 16:16:00 -0500 [thread overview]
Message-ID: <aadPkO26sp1NQRo-@x1.local> (raw)
In-Reply-To: <20260226140001.3622334-13-marcandre.lureau@redhat.com>
On Thu, Feb 26, 2026 at 02:59:57PM +0100, marcandre.lureau@redhat.com wrote:
> From: Marc-André Lureau <marcandre.lureau@redhat.com>
>
> Refactor RamDiscardManager to aggregate multiple RamDiscardSource
> instances. This enables scenarios where multiple components (e.g.,
> virtio-mem and RamBlockAttributes) can coordinate memory discard
> state for the same memory region.
>
> The aggregation uses:
> - Populated: ALL sources populated
> - Discarded: ANY source discarded
>
> When a source is added with existing listeners, they are notified
> about regions that become discarded. When a source is removed,
> listeners are notified about regions that become populated.
>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Looks nice to me in general, feel free to take:
Reviewed-by: Peter Xu <peterx@redhat.com>
Only some nitpicks and pure comments below.
> ---
> include/system/ram-discard-manager.h | 143 +++++++--
> hw/virtio/virtio-mem.c | 8 +-
> system/memory.c | 15 +-
> system/ram-block-attributes.c | 6 +-
> system/ram-discard-manager.c | 427 ++++++++++++++++++++++++---
> 5 files changed, 515 insertions(+), 84 deletions(-)
>
> diff --git a/include/system/ram-discard-manager.h b/include/system/ram-discard-manager.h
> index b5dbcb4a82d..9d650ee4d7b 100644
> --- a/include/system/ram-discard-manager.h
> +++ b/include/system/ram-discard-manager.h
> @@ -170,30 +170,96 @@ struct RamDiscardSourceClass {
> * becoming discarded in a different granularity than it was populated and the
> * other way around.
> */
> +
> +typedef struct RamDiscardSourceEntry RamDiscardSourceEntry;
> +
> +struct RamDiscardSourceEntry {
> + RamDiscardSource *rds;
> + QLIST_ENTRY(RamDiscardSourceEntry) next;
> +};
> +
> struct RamDiscardManager {
> Object parent;
>
> - RamDiscardSource *rds;
> - MemoryRegion *mr;
> + struct MemoryRegion *mr;
s/struct//
> + QLIST_HEAD(, RamDiscardSourceEntry) source_list;
> + uint64_t min_granularity;
> QLIST_HEAD(, RamDiscardListener) rdl_list;
> };
>
> -RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr,
> - RamDiscardSource *rds);
> +RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr);
> +
> +/**
> + * ram_discard_manager_add_source:
> + *
> + * Register a #RamDiscardSource with the #RamDiscardManager. The manager
> + * aggregates state from all registered sources using AND semantics: a region
> + * is considered populated only if ALL sources report it as populated.
> + *
> + * If listeners are already registered, they will be notified about any
> + * regions that become discarded due to adding this source. Specifically,
> + * for each region that the new source reports as discarded, if all other
> + * sources reported it as populated, listeners receive a discard notification.
> + *
> + * If any listener rejects the notification (returns an error), previously
> + * notified listeners are rolled back with populate notifications and the
> + * source is not added.
> + *
> + * @rdm: the #RamDiscardManager
> + * @source: the #RamDiscardSource to add
> + *
> + * Returns: 0 on success, -EBUSY if @source is already registered, or a
> + * negative error code if a listener rejected the state change.
> + */
> +int ram_discard_manager_add_source(RamDiscardManager *rdm,
> + RamDiscardSource *source);
> +
> +/**
> + * ram_discard_manager_del_source:
> + *
> + * Unregister a #RamDiscardSource from the #RamDiscardManager.
> + *
> + * If listeners are already registered, they will be notified about any
> + * regions that become populated due to removing this source. Specifically,
> + * for each region that the removed source reported as discarded, if all
> + * remaining sources report it as populated, listeners receive a populate
> + * notification.
> + *
> + * If any listener rejects the notification (returns an error), previously
> + * notified listeners are rolled back with discard notifications and the
> + * source is not removed.
> + *
> + * @rdm: the #RamDiscardManager
> + * @source: the #RamDiscardSource to remove
> + *
> + * Returns: 0 on success, -ENOENT if @source is not registered, or a
> + * negative error code if a listener rejected the state change.
> + */
> +int ram_discard_manager_del_source(RamDiscardManager *rdm,
> + RamDiscardSource *source);
> +
>
> uint64_t ram_discard_manager_get_min_granularity(const RamDiscardManager *rdm,
> const MemoryRegion *mr);
>
> +/**
> + * ram_discard_manager_is_populated:
> + *
> + * Check if the given memory region section is populated.
> + * If the manager has no sources, it is considered populated.
> + *
> + * @rdm: the #RamDiscardManager
> + * @section: the #MemoryRegionSection to check
> + *
> + * Returns: true if the section is populated, false otherwise.
> + */
> bool ram_discard_manager_is_populated(const RamDiscardManager *rdm,
> const MemoryRegionSection *section);
>
> /**
> * ram_discard_manager_replay_populated:
> *
> - * Iterate the given #MemoryRegionSection at minimum granularity, calling
> - * #RamDiscardSourceClass.is_populated for each chunk, and invoke @replay_fn
> - * for each contiguous populated range. In case any call fails, no further
> - * calls are made.
> + * Call @replay_fn on regions that are populated in all sources.
> *
> * @rdm: the #RamDiscardManager
> * @section: the #MemoryRegionSection
> @@ -210,10 +276,7 @@ int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
> /**
> * ram_discard_manager_replay_discarded:
> *
> - * Iterate the given #MemoryRegionSection at minimum granularity, calling
> - * #RamDiscardSourceClass.is_populated for each chunk, and invoke @replay_fn
> - * for each contiguous discarded range. In case any call fails, no further
> - * calls are made.
> + * Call @replay_fn on regions that are discarded in any sources.
> *
> * @rdm: the #RamDiscardManager
> * @section: the #MemoryRegionSection
> @@ -234,31 +297,61 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
> void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
> RamDiscardListener *rdl);
>
> -/*
> - * Note: later refactoring should take the source into account and the manager
> - * should be able to aggregate multiple sources.
> +/**
> + * ram_discard_manager_notify_populate:
> + *
> + * Notify listeners that a region is about to be populated by a source.
> + * For multi-source aggregation, only notifies when all sources agree
> + * the region is populated (intersection).
> + *
> + * @rdm: the #RamDiscardManager
> + * @source: the #RamDiscardSource that is populating
> + * @offset: offset within the memory region
> + * @size: size of the region being populated
> + *
> + * Returns 0 on success, or a negative error if any listener rejects.
> */
> int ram_discard_manager_notify_populate(RamDiscardManager *rdm,
> + RamDiscardSource *source,
> uint64_t offset, uint64_t size);
>
> -/*
> - * Note: later refactoring should take the source into account and the manager
> - * should be able to aggregate multiple sources.
> +/**
> + * ram_discard_manager_notify_discard:
> + *
> + * Notify listeners that a region has been discarded by a source.
> + * For multi-source aggregation, always notifies immediately
> + * (union semantics - any source discarding makes region discarded).
> + *
> + * @rdm: the #RamDiscardManager
> + * @source: the #RamDiscardSource that is discarding
> + * @offset: offset within the memory region
> + * @size: size of the region being discarded
> */
> void ram_discard_manager_notify_discard(RamDiscardManager *rdm,
> + RamDiscardSource *source,
> uint64_t offset, uint64_t size);
>
> -/*
> - * Note: later refactoring should take the source into account and the manager
> - * should be able to aggregate multiple sources.
> +/**
> + * ram_discard_manager_notify_discard_all:
> + *
> + * Notify listeners that all regions have been discarded by a source.
> + *
> + * @rdm: the #RamDiscardManager
> + * @source: the #RamDiscardSource that is discarding
> */
> -void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm);
> +void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm,
> + RamDiscardSource *source);
>
> -/*
> +/**
> + * ram_discard_manager_replay_populated_to_listeners:
> + *
> * Replay populated sections to all registered listeners.
> + * For multi-source aggregation, only replays regions where all sources
> + * are populated (intersection).
> *
> - * Note: later refactoring should take the source into account and the manager
> - * should be able to aggregate multiple sources.
> + * @rdm: the #RamDiscardManager
> + *
> + * Returns 0 on success, or a negative error if any notification failed.
> */
> int ram_discard_manager_replay_populated_to_listeners(RamDiscardManager *rdm);
>
> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index 2b67b2882d2..35e03ed7599 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -264,7 +264,8 @@ static void virtio_mem_notify_unplug(VirtIOMEM *vmem, uint64_t offset,
> {
> RamDiscardManager *rdm = memory_region_get_ram_discard_manager(&vmem->memdev->mr);
>
> - ram_discard_manager_notify_discard(rdm, offset, size);
> + ram_discard_manager_notify_discard(rdm, RAM_DISCARD_SOURCE(vmem),
> + offset, size);
> }
>
> static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
> @@ -272,7 +273,8 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
> {
> RamDiscardManager *rdm = memory_region_get_ram_discard_manager(&vmem->memdev->mr);
>
> - return ram_discard_manager_notify_populate(rdm, offset, size);
> + return ram_discard_manager_notify_populate(rdm, RAM_DISCARD_SOURCE(vmem),
> + offset, size);
> }
>
> static void virtio_mem_notify_unplug_all(VirtIOMEM *vmem)
> @@ -283,7 +285,7 @@ static void virtio_mem_notify_unplug_all(VirtIOMEM *vmem)
> return;
> }
>
> - ram_discard_manager_notify_discard_all(rdm);
> + ram_discard_manager_notify_discard_all(rdm, RAM_DISCARD_SOURCE(vmem));
> }
>
> static bool virtio_mem_is_range_plugged(const VirtIOMEM *vmem,
> diff --git a/system/memory.c b/system/memory.c
> index 8b46cb87838..8a4cb7b59ac 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -2109,21 +2109,22 @@ int memory_region_add_ram_discard_source(MemoryRegion *mr,
> RamDiscardSource *source)
> {
> g_assert(memory_region_is_ram(mr));
> - if (mr->rdm) {
> - return -EBUSY;
> +
> + if (!mr->rdm) {
> + mr->rdm = ram_discard_manager_new(mr);
> }
>
> - mr->rdm = ram_discard_manager_new(mr, RAM_DISCARD_SOURCE(source));
> - return 0;
> + return ram_discard_manager_add_source(mr->rdm, source);
> }
>
> void memory_region_del_ram_discard_source(MemoryRegion *mr,
> RamDiscardSource *source)
> {
> - g_assert(mr->rdm->rds == source);
> + g_assert(mr->rdm);
> +
> + ram_discard_manager_del_source(mr->rdm, source);
>
> - object_unref(mr->rdm);
> - mr->rdm = NULL;
> + /* if there is no source and no listener left, we could free rdm */
Maybe squash patch 14 directly into this one would be better?
> }
>
> /* Called with rcu_read_lock held. */
> diff --git a/system/ram-block-attributes.c b/system/ram-block-attributes.c
> index 718c7075cec..59ec7a28eb0 100644
> --- a/system/ram-block-attributes.c
> +++ b/system/ram-block-attributes.c
> @@ -90,7 +90,8 @@ ram_block_attributes_notify_discard(RamBlockAttributes *attr,
> {
> RamDiscardManager *rdm = memory_region_get_ram_discard_manager(attr->ram_block->mr);
>
> - ram_discard_manager_notify_discard(rdm, offset, size);
> + ram_discard_manager_notify_discard(rdm, RAM_DISCARD_SOURCE(attr),
> + offset, size);
> }
>
> static int
> @@ -99,7 +100,8 @@ ram_block_attributes_notify_populate(RamBlockAttributes *attr,
> {
> RamDiscardManager *rdm = memory_region_get_ram_discard_manager(attr->ram_block->mr);
>
> - return ram_discard_manager_notify_populate(rdm, offset, size);
> + return ram_discard_manager_notify_populate(rdm, RAM_DISCARD_SOURCE(attr),
> + offset, size);
> }
>
> int ram_block_attributes_state_change(RamBlockAttributes *attr,
> diff --git a/system/ram-discard-manager.c b/system/ram-discard-manager.c
> index 25beb052a1e..5592bfd3486 100644
> --- a/system/ram-discard-manager.c
> +++ b/system/ram-discard-manager.c
> @@ -7,6 +7,7 @@
>
> #include "qemu/osdep.h"
> #include "qemu/error-report.h"
> +#include "qemu/queue.h"
> #include "system/memory.h"
>
> static uint64_t ram_discard_source_get_min_granularity(const RamDiscardSource *rds,
> @@ -28,20 +29,21 @@ static bool ram_discard_source_is_populated(const RamDiscardSource *rds,
> }
>
> /*
> - * Iterate the section at source granularity, aggregating consecutive chunks
> - * with matching populated state, and call replay_fn for each run.
> + * Iterate a single source's populated or discarded regions and call
> + * replay_fn for each contiguous run.
> */
> -static int replay_by_populated_state(const RamDiscardManager *rdm,
> - const MemoryRegionSection *section,
> - bool replay_populated,
> - ReplayRamDiscardState replay_fn,
> - void *opaque)
> +static int replay_source_by_state(const RamDiscardSource *source,
> + const MemoryRegion *mr,
> + const MemoryRegionSection *section,
> + bool replay_populated,
> + ReplayRamDiscardState replay_fn,
> + void *opaque)
> {
> uint64_t granularity, offset, size, end, pos, run_start;
> bool in_run = false;
> int ret = 0;
>
> - granularity = ram_discard_source_get_min_granularity(rdm->rds, rdm->mr);
> + granularity = ram_discard_source_get_min_granularity(source, mr);
> offset = section->offset_within_region;
> size = int128_get64(section->size);
> end = offset + size;
> @@ -55,7 +57,7 @@ static int replay_by_populated_state(const RamDiscardManager *rdm,
> .offset_within_region = pos,
> .size = int128_make64(granularity),
> };
> - bool populated = ram_discard_source_is_populated(rdm->rds, &chunk);
> + bool populated = ram_discard_source_is_populated(source, &chunk);
>
> if (populated == replay_populated) {
> if (!in_run) {
> @@ -88,28 +90,338 @@ static int replay_by_populated_state(const RamDiscardManager *rdm,
> return ret;
> }
>
> -RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr,
> - RamDiscardSource *rds)
> +RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr)
> {
> RamDiscardManager *rdm;
>
> rdm = RAM_DISCARD_MANAGER(object_new(TYPE_RAM_DISCARD_MANAGER));
> - rdm->rds = rds;
> rdm->mr = mr;
> - QLIST_INIT(&rdm->rdl_list);
> return rdm;
> }
>
> +static void ram_discard_manager_update_granularity(RamDiscardManager *rdm)
> +{
> + RamDiscardSourceEntry *entry;
> + uint64_t granularity = 0;
> +
> + QLIST_FOREACH(entry, &rdm->source_list, next) {
> + uint64_t src_granularity;
> +
> + src_granularity =
> + ram_discard_source_get_min_granularity(entry->rds, rdm->mr);
> + g_assert(src_granularity != 0);
> + if (granularity == 0) {
> + granularity = src_granularity;
> + } else {
> + granularity = MIN(granularity, src_granularity);
> + }
> + }
> + rdm->min_granularity = granularity;
> +}
> +
> +static RamDiscardSourceEntry *
> +ram_discard_manager_find_source(RamDiscardManager *rdm, RamDiscardSource *rds)
> +{
> + RamDiscardSourceEntry *entry;
> +
> + QLIST_FOREACH(entry, &rdm->source_list, next) {
> + if (entry->rds == rds) {
> + return entry;
> + }
> + }
> + return NULL;
> +}
> +
> +static int rdl_populate_cb(const MemoryRegionSection *section, void *opaque)
> +{
> + RamDiscardListener *rdl = opaque;
> + MemoryRegionSection tmp = *rdl->section;
> +
> + g_assert(section->mr == rdl->section->mr);
> +
> + if (!memory_region_section_intersect_range(&tmp,
> + section->offset_within_region,
> + int128_get64(section->size))) {
> + return 0;
> + }
> +
> + return rdl->notify_populate(rdl, &tmp);
> +}
> +
> +static int rdl_discard_cb(const MemoryRegionSection *section, void *opaque)
> +{
> + RamDiscardListener *rdl = opaque;
> + MemoryRegionSection tmp = *rdl->section;
> +
> + g_assert(section->mr == rdl->section->mr);
> +
> + if (!memory_region_section_intersect_range(&tmp,
> + section->offset_within_region,
> + int128_get64(section->size))) {
> + return 0;
> + }
> +
> + rdl->notify_discard(rdl, &tmp);
> + return 0;
> +}
> +
> +static bool rdm_is_all_populated_skip(const RamDiscardManager *rdm,
> + const MemoryRegionSection *section,
> + const RamDiscardSource *skip_source)
> +{
> + RamDiscardSourceEntry *entry;
> +
> + QLIST_FOREACH(entry, &rdm->source_list, next) {
> + if (skip_source && entry->rds == skip_source) {
> + continue;
> + }
> + if (!ram_discard_source_is_populated(entry->rds, section)) {
> + return false;
> + }
> + }
> + return true;
> +}
> +
> +typedef struct SourceNotifyCtx {
> + RamDiscardManager *rdm;
> + RamDiscardListener *rdl;
> + RamDiscardSource *source; /* added or removed */
> +} SourceNotifyCtx;
> +
> +/*
> + * Unified helper to replay regions based on populated state.
> + * If replay_populated is true: replay regions where ALL sources are populated.
> + * If replay_populated is false: replay regions where ANY source is discarded.
> + */
> +static int replay_by_populated_state(const RamDiscardManager *rdm,
> + const MemoryRegionSection *section,
> + const RamDiscardSource *skip_source,
Indeed we still need skip_source to handle the cases where one source
updated the populated status / bitmap to notify the change..
IIUC if we can cache the global status into a per-manager merged bitmap,
then we could drop skip_source. That'll also help in the case you
mentioned in the previous patch because then a replay can walk that merged
bitmap, but as you also mentioned we don't yet know if the perf is an issue
yet. I also don't know if a merged bitmap in the manager would help much
if there're only two sources: I believe it'll help more if we have a lot
more sources, but unlikely.. So just a quick thought.
Thanks,
> + bool replay_populated,
> + ReplayRamDiscardState replay_fn,
> + void *user_opaque)
> +{
> + uint64_t granularity = rdm->min_granularity;
> + uint64_t offset, end_offset;
> + uint64_t run_start = 0;
> + bool in_run = false;
> + int ret = 0;
> +
> + if (QLIST_EMPTY(&rdm->source_list)) {
> + if (replay_populated) {
> + return replay_fn(section, user_opaque);
> + }
> + return 0;
> + }
> +
> + g_assert(granularity != 0);
> +
> + offset = section->offset_within_region;
> + end_offset = offset + int128_get64(section->size);
> +
> + while (offset < end_offset) {
> + MemoryRegionSection subsection = {
> + .mr = section->mr,
> + .offset_within_region = offset,
> + .size = int128_make64(MIN(granularity, end_offset - offset)),
> + };
> + bool all_populated;
> + bool included;
> +
> + all_populated = rdm_is_all_populated_skip(rdm, &subsection,
> + skip_source);
> + included = replay_populated ? all_populated : !all_populated;
> +
> + if (included) {
> + if (!in_run) {
> + run_start = offset;
> + in_run = true;
> + }
> + } else {
> + if (in_run) {
> + MemoryRegionSection run_section = {
> + .mr = section->mr,
> + .offset_within_region = run_start,
> + .size = int128_make64(offset - run_start),
> + };
> + ret = replay_fn(&run_section, user_opaque);
> + if (ret) {
> + return ret;
> + }
> + in_run = false;
> + }
> + }
> + if (granularity > end_offset - offset) {
> + break;
> + }
> + offset += granularity;
> + }
> +
> + if (in_run) {
> + MemoryRegionSection run_section = {
> + .mr = section->mr,
> + .offset_within_region = run_start,
> + .size = int128_make64(end_offset - run_start),
> + };
> + ret = replay_fn(&run_section, user_opaque);
> + }
> +
> + return ret;
> +}
> +
> +static int add_source_check_discard_cb(const MemoryRegionSection *section,
> + void *opaque)
> +{
> + SourceNotifyCtx *ctx = opaque;
> +
> + return replay_by_populated_state(ctx->rdm, section, ctx->source, true,
> + rdl_discard_cb, ctx->rdl);
> +}
> +
> +static int del_source_check_populate_cb(const MemoryRegionSection *section,
> + void *opaque)
> +{
> + SourceNotifyCtx *ctx = opaque;
> +
> + return replay_by_populated_state(ctx->rdm, section, ctx->source, true,
> + rdl_populate_cb, ctx->rdl);
> +}
> +
> +int ram_discard_manager_add_source(RamDiscardManager *rdm,
> + RamDiscardSource *source)
> +{
> + RamDiscardSourceEntry *entry;
> + RamDiscardListener *rdl, *rdl2;
> + int ret = 0;
> +
> + if (ram_discard_manager_find_source(rdm, source)) {
> + return -EBUSY;
> + }
> +
> + /*
> + * If there are existing listeners, notify them about regions that
> + * become discarded due to adding this source. Only notify for regions
> + * that were previously populated (all other sources agreed).
> + */
> + QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
> + SourceNotifyCtx ctx = {
> + .rdm = rdm,
> + .rdl = rdl,
> + /* no need to set source */
> + };
> + ret = replay_source_by_state(source, rdm->mr, rdl->section,
> + false,
> + add_source_check_discard_cb, &ctx);
> + if (ret) {
> + break;
> + }
> + }
> + if (ret) {
> + QLIST_FOREACH(rdl2, &rdm->rdl_list, next) {
> + SourceNotifyCtx ctx = {
> + .rdm = rdm,
> + .rdl = rdl2,
> + };
> + replay_source_by_state(source, rdm->mr, rdl2->section,
> + false,
> + del_source_check_populate_cb,
> + &ctx);
> + if (rdl == rdl2) {
> + break;
> + }
> + }
> +
> + return ret;
> + }
> +
> + entry = g_new0(RamDiscardSourceEntry, 1);
> + entry->rds = source;
> + QLIST_INSERT_HEAD(&rdm->source_list, entry, next);
> +
> + ram_discard_manager_update_granularity(rdm);
> +
> + return ret;
> +}
> +
> +int ram_discard_manager_del_source(RamDiscardManager *rdm,
> + RamDiscardSource *source)
> +{
> + RamDiscardSourceEntry *entry;
> + RamDiscardListener *rdl, *rdl2;
> + int ret = 0;
> +
> + entry = ram_discard_manager_find_source(rdm, source);
> + if (!entry) {
> + return -ENOENT;
> + }
> +
> + /*
> + * If there are existing listeners, check if any regions become
> + * populated due to removing this source.
> + */
> + QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
> + SourceNotifyCtx ctx = {
> + .rdm = rdm,
> + .rdl = rdl,
> + .source = source,
> + };
> + /*
> + * From the previously discarded regions, check if any
> + * regions become populated.
> + */
> + ret = replay_source_by_state(source, rdm->mr, rdl->section,
> + false,
> + del_source_check_populate_cb,
> + &ctx);
> + if (ret) {
> + break;
> + }
> + }
> + if (ret) {
> + QLIST_FOREACH(rdl2, &rdm->rdl_list, next) {
> + SourceNotifyCtx ctx = {
> + .rdm = rdm,
> + .rdl = rdl2,
> + .source = source,
> + };
> + replay_source_by_state(source, rdm->mr, rdl2->section,
> + false,
> + add_source_check_discard_cb,
> + &ctx);
> + if (rdl == rdl2) {
> + break;
> + }
> + }
> +
> + return ret;
> + }
> +
> + QLIST_REMOVE(entry, next);
> + g_free(entry);
> + ram_discard_manager_update_granularity(rdm);
> + return ret;
> +}
> +
> uint64_t ram_discard_manager_get_min_granularity(const RamDiscardManager *rdm,
> const MemoryRegion *mr)
> {
> - return ram_discard_source_get_min_granularity(rdm->rds, mr);
> + g_assert(mr == rdm->mr);
> + return rdm->min_granularity;
> }
>
> +/*
> + * Aggregated query: returns true only if ALL sources report populated (AND).
> + */
> bool ram_discard_manager_is_populated(const RamDiscardManager *rdm,
> const MemoryRegionSection *section)
> {
> - return ram_discard_source_is_populated(rdm->rds, section);
> + RamDiscardSourceEntry *entry;
> +
> + QLIST_FOREACH(entry, &rdm->source_list, next) {
> + if (!ram_discard_source_is_populated(entry->rds, section)) {
> + return false;
> + }
> + }
> + return true;
> }
>
> int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
> @@ -117,7 +429,8 @@ int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
> ReplayRamDiscardState replay_fn,
> void *opaque)
> {
> - return replay_by_populated_state(rdm, section, true, replay_fn, opaque);
> + return replay_by_populated_state(rdm, section, NULL, true,
> + replay_fn, opaque);
> }
>
> int ram_discard_manager_replay_discarded(const RamDiscardManager *rdm,
> @@ -125,14 +438,17 @@ int ram_discard_manager_replay_discarded(const RamDiscardManager *rdm,
> ReplayRamDiscardState replay_fn,
> void *opaque)
> {
> - return replay_by_populated_state(rdm, section, false, replay_fn, opaque);
> + return replay_by_populated_state(rdm, section, NULL, false,
> + replay_fn, opaque);
> }
>
> static void ram_discard_manager_initfn(Object *obj)
> {
> RamDiscardManager *rdm = RAM_DISCARD_MANAGER(obj);
>
> + QLIST_INIT(&rdm->source_list);
> QLIST_INIT(&rdm->rdl_list);
> + rdm->min_granularity = 0;
> }
>
> static void ram_discard_manager_finalize(Object *obj)
> @@ -140,74 +456,91 @@ static void ram_discard_manager_finalize(Object *obj)
> RamDiscardManager *rdm = RAM_DISCARD_MANAGER(obj);
>
> g_assert(QLIST_EMPTY(&rdm->rdl_list));
> + g_assert(QLIST_EMPTY(&rdm->source_list));
> }
>
> int ram_discard_manager_notify_populate(RamDiscardManager *rdm,
> + RamDiscardSource *source,
> uint64_t offset, uint64_t size)
> {
> RamDiscardListener *rdl, *rdl2;
> + MemoryRegionSection section = {
> + .mr = rdm->mr,
> + .offset_within_region = offset,
> + .size = int128_make64(size),
> + };
> int ret = 0;
>
> - QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
> - MemoryRegionSection tmp = *rdl->section;
> + g_assert(ram_discard_manager_find_source(rdm, source));
>
> - if (!memory_region_section_intersect_range(&tmp, offset, size)) {
> - continue;
> - }
> - ret = rdl->notify_populate(rdl, &tmp);
> + /*
> + * Only notify about regions that are populated in ALL sources.
> + * replay_by_populated_state checks all sources including the one that
> + * just populated.
> + */
> + QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
> + ret = replay_by_populated_state(rdm, §ion, NULL, true,
> + rdl_populate_cb, rdl);
> if (ret) {
> break;
> }
> }
>
> if (ret) {
> - /* Notify all already-notified listeners about discard. */
> + /*
> + * Rollback: notify discard for listeners we already notified,
> + * including the failing listener which may have been partially
> + * notified. Listeners must handle discard notifications for
> + * regions they didn't receive populate notifications for.
> + */
> QLIST_FOREACH(rdl2, &rdm->rdl_list, next) {
> - MemoryRegionSection tmp = *rdl2->section;
> -
> + replay_by_populated_state(rdm, §ion, NULL, true,
> + rdl_discard_cb, rdl2);
> if (rdl2 == rdl) {
> break;
> }
> - if (!memory_region_section_intersect_range(&tmp, offset, size)) {
> - continue;
> - }
> - rdl2->notify_discard(rdl2, &tmp);
> }
> }
> return ret;
> }
>
> void ram_discard_manager_notify_discard(RamDiscardManager *rdm,
> + RamDiscardSource *source,
> uint64_t offset, uint64_t size)
> {
> RamDiscardListener *rdl;
> -
> + MemoryRegionSection section = {
> + .mr = rdm->mr,
> + .offset_within_region = offset,
> + .size = int128_make64(size),
> + };
> +
> + g_assert(ram_discard_manager_find_source(rdm, source));
> +
> + /*
> + * Only notify about ranges that were aggregately populated before this
> + * source's discard. Since the source has already updated its state,
> + * we use replay_by_populated_state with this source skipped - it will
> + * replay only the ranges where all OTHER sources are populated.
> + */
> QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
> - MemoryRegionSection tmp = *rdl->section;
> -
> - if (!memory_region_section_intersect_range(&tmp, offset, size)) {
> - continue;
> - }
> - rdl->notify_discard(rdl, &tmp);
> + replay_by_populated_state(rdm, §ion, source, true,
> + rdl_discard_cb, rdl);
> }
> }
>
> -void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm)
> +void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm,
> + RamDiscardSource *source)
> {
> RamDiscardListener *rdl;
>
> + g_assert(ram_discard_manager_find_source(rdm, source));
> +
> QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
> rdl->notify_discard(rdl, rdl->section);
> }
> }
>
> -static int rdm_populate_cb(const MemoryRegionSection *section, void *opaque)
> -{
> - RamDiscardListener *rdl = opaque;
> -
> - return rdl->notify_populate(rdl, section);
> -}
> -
> void ram_discard_manager_register_listener(RamDiscardManager *rdm,
> RamDiscardListener *rdl,
> MemoryRegionSection *section)
> @@ -220,7 +553,7 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
> QLIST_INSERT_HEAD(&rdm->rdl_list, rdl, next);
>
> ret = ram_discard_manager_replay_populated(rdm, rdl->section,
> - rdm_populate_cb, rdl);
> + rdl_populate_cb, rdl);
> if (ret) {
> error_report("%s: Replaying populated ranges failed: %s", __func__,
> strerror(-ret));
> @@ -246,7 +579,7 @@ int ram_discard_manager_replay_populated_to_listeners(RamDiscardManager *rdm)
>
> QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
> ret = ram_discard_manager_replay_populated(rdm, rdl->section,
> - rdm_populate_cb, rdl);
> + rdl_populate_cb, rdl);
> if (ret) {
> break;
> }
> --
> 2.53.0
>
--
Peter Xu
next prev parent reply other threads:[~2026-03-03 21:16 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-26 13:59 [PATCH v3 00/15] Make RamDiscardManager work with multiple sources marcandre.lureau
2026-02-26 13:59 ` [PATCH v3 01/15] system/rba: use DIV_ROUND_UP marcandre.lureau
2026-03-03 18:57 ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 02/15] memory: drop RamDiscardListener::double_discard_supported marcandre.lureau
2026-02-26 15:58 ` David Hildenbrand
2026-03-03 18:57 ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 03/15] virtio-mem: use warn_report_err_once() marcandre.lureau
2026-02-26 15:57 ` David Hildenbrand
2026-03-03 18:58 ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 04/15] system/memory: minor doc fix marcandre.lureau
2026-02-26 15:58 ` David Hildenbrand
2026-03-03 18:59 ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 05/15] kvm: replace RamDicardManager by the RamBlockAttribute marcandre.lureau
2026-03-03 18:59 ` Peter Xu
2026-03-04 9:09 ` Chenyi Qiang
2026-02-26 13:59 ` [PATCH v3 06/15] system/memory: split RamDiscardManager into source and manager marcandre.lureau
2026-03-03 19:45 ` Peter Xu
2026-03-04 9:51 ` Chenyi Qiang
2026-02-26 13:59 ` [PATCH v3 07/15] system/memory: move RamDiscardManager to separate compilation unit marcandre.lureau
2026-03-03 19:47 ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 08/15] system/memory: constify section arguments marcandre.lureau
2026-03-03 19:48 ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 09/15] system/ram-discard-manager: implement replay via is_populated iteration marcandre.lureau
2026-02-26 16:01 ` David Hildenbrand
2026-03-03 10:47 ` Marc-André Lureau
2026-03-03 20:27 ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 10/15] virtio-mem: remove replay_populated/replay_discarded implementation marcandre.lureau
2026-02-26 13:59 ` [PATCH v3 11/15] system/ram-discard-manager: drop replay from source interface marcandre.lureau
2026-02-26 13:59 ` [PATCH v3 12/15] system/memory: implement RamDiscardManager multi-source aggregation marcandre.lureau
2026-03-03 21:16 ` Peter Xu [this message]
2026-02-26 13:59 ` [PATCH v3 13/15] system/physmem: destroy ram block attributes before RCU-deferred reclaim marcandre.lureau
2026-03-03 21:31 ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 14/15] system/memory: add RamDiscardManager reference counting and cleanup marcandre.lureau
2026-03-03 21:33 ` Peter Xu
2026-02-26 14:00 ` [PATCH v3 15/15] tests: add unit tests for RamDiscardManager multi-source aggregation marcandre.lureau
2026-03-03 21:28 ` [PATCH v3 00/15] Make RamDiscardManager work with multiple sources Peter Xu
2026-03-03 22:33 ` Marc-André Lureau
2026-03-09 18:25 ` Peter Xu
2026-03-10 2:35 ` Chenyi Qiang
2026-03-10 16:09 ` Marc-André Lureau
2026-03-11 7:17 ` Chenyi Qiang
2026-03-11 14:12 ` Marc-André Lureau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aadPkO26sp1NQRo-@x1.local \
--to=peterx@redhat.com \
--cc=alex@shazbot.org \
--cc=bchaney@akamai.com \
--cc=clg@redhat.com \
--cc=david@kernel.org \
--cc=farosas@suse.de \
--cc=kvm@vger.kernel.org \
--cc=marcandre.lureau@redhat.com \
--cc=mark.kanda@oracle.com \
--cc=mst@redhat.com \
--cc=pbonzini@redhat.com \
--cc=philmd@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox