Re: [PATCH v3 12/15] system/memory: implement RamDiscardManager multi-source aggregation

public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed

From: Peter Xu <peterx@redhat.com>
To: marcandre.lureau@redhat.com
Cc: qemu-devel@nongnu.org, "Ben Chaney" <bchaney@akamai.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Cédric Le Goater" <clg@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Alex Williamson" <alex@shazbot.org>,
	"Fabiano Rosas" <farosas@suse.de>,
	"David Hildenbrand" <david@kernel.org>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	kvm@vger.kernel.org, "Mark Kanda" <mark.kanda@oracle.com>
Subject: Re: [PATCH v3 12/15] system/memory: implement RamDiscardManager multi-source aggregation
Date: Tue, 3 Mar 2026 16:16:00 -0500	[thread overview]
Message-ID: <aadPkO26sp1NQRo-@x1.local> (raw)
In-Reply-To: <20260226140001.3622334-13-marcandre.lureau@redhat.com>

On Thu, Feb 26, 2026 at 02:59:57PM +0100, marcandre.lureau@redhat.com wrote:
> From: Marc-André Lureau <marcandre.lureau@redhat.com>
> 
> Refactor RamDiscardManager to aggregate multiple RamDiscardSource
> instances. This enables scenarios where multiple components (e.g.,
> virtio-mem and RamBlockAttributes) can coordinate memory discard
> state for the same memory region.
> 
> The aggregation uses:
> - Populated: ALL sources populated
> - Discarded: ANY source discarded
> 
> When a source is added with existing listeners, they are notified
> about regions that become discarded. When a source is removed,
> listeners are notified about regions that become populated.
> 
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>

Looks nice to me in general, feel free to take:

Reviewed-by: Peter Xu <peterx@redhat.com>

Only some nitpicks and pure comments below.

> ---
>  include/system/ram-discard-manager.h | 143 +++++++--
>  hw/virtio/virtio-mem.c               |   8 +-
>  system/memory.c                      |  15 +-
>  system/ram-block-attributes.c        |   6 +-
>  system/ram-discard-manager.c         | 427 ++++++++++++++++++++++++---
>  5 files changed, 515 insertions(+), 84 deletions(-)
> 
> diff --git a/include/system/ram-discard-manager.h b/include/system/ram-discard-manager.h
> index b5dbcb4a82d..9d650ee4d7b 100644
> --- a/include/system/ram-discard-manager.h
> +++ b/include/system/ram-discard-manager.h
> @@ -170,30 +170,96 @@ struct RamDiscardSourceClass {
>   * becoming discarded in a different granularity than it was populated and the
>   * other way around.
>   */
> +
> +typedef struct RamDiscardSourceEntry RamDiscardSourceEntry;
> +
> +struct RamDiscardSourceEntry {
> +    RamDiscardSource *rds;
> +    QLIST_ENTRY(RamDiscardSourceEntry) next;
> +};
> +
>  struct RamDiscardManager {
>      Object parent;
>  
> -    RamDiscardSource *rds;
> -    MemoryRegion *mr;
> +    struct MemoryRegion *mr;

s/struct//

> +    QLIST_HEAD(, RamDiscardSourceEntry) source_list;
> +    uint64_t min_granularity;
>      QLIST_HEAD(, RamDiscardListener) rdl_list;
>  };
>  
> -RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr,
> -                                           RamDiscardSource *rds);
> +RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr);
> +
> +/**
> + * ram_discard_manager_add_source:
> + *
> + * Register a #RamDiscardSource with the #RamDiscardManager. The manager
> + * aggregates state from all registered sources using AND semantics: a region
> + * is considered populated only if ALL sources report it as populated.
> + *
> + * If listeners are already registered, they will be notified about any
> + * regions that become discarded due to adding this source. Specifically,
> + * for each region that the new source reports as discarded, if all other
> + * sources reported it as populated, listeners receive a discard notification.
> + *
> + * If any listener rejects the notification (returns an error), previously
> + * notified listeners are rolled back with populate notifications and the
> + * source is not added.
> + *
> + * @rdm: the #RamDiscardManager
> + * @source: the #RamDiscardSource to add
> + *
> + * Returns: 0 on success, -EBUSY if @source is already registered, or a
> + *          negative error code if a listener rejected the state change.
> + */
> +int ram_discard_manager_add_source(RamDiscardManager *rdm,
> +                                   RamDiscardSource *source);
> +
> +/**
> + * ram_discard_manager_del_source:
> + *
> + * Unregister a #RamDiscardSource from the #RamDiscardManager.
> + *
> + * If listeners are already registered, they will be notified about any
> + * regions that become populated due to removing this source. Specifically,
> + * for each region that the removed source reported as discarded, if all
> + * remaining sources report it as populated, listeners receive a populate
> + * notification.
> + *
> + * If any listener rejects the notification (returns an error), previously
> + * notified listeners are rolled back with discard notifications and the
> + * source is not removed.
> + *
> + * @rdm: the #RamDiscardManager
> + * @source: the #RamDiscardSource to remove
> + *
> + * Returns: 0 on success, -ENOENT if @source is not registered, or a
> + *          negative error code if a listener rejected the state change.
> + */
> +int ram_discard_manager_del_source(RamDiscardManager *rdm,
> +                                   RamDiscardSource *source);
> +
>  
>  uint64_t ram_discard_manager_get_min_granularity(const RamDiscardManager *rdm,
>                                                   const MemoryRegion *mr);
>  
> +/**
> + * ram_discard_manager_is_populated:
> + *
> + * Check if the given memory region section is populated.
> + * If the manager has no sources, it is considered populated.
> + *
> + * @rdm: the #RamDiscardManager
> + * @section: the #MemoryRegionSection to check
> + *
> + * Returns: true if the section is populated, false otherwise.
> + */
>  bool ram_discard_manager_is_populated(const RamDiscardManager *rdm,
>                                        const MemoryRegionSection *section);
>  
>  /**
>   * ram_discard_manager_replay_populated:
>   *
> - * Iterate the given #MemoryRegionSection at minimum granularity, calling
> - * #RamDiscardSourceClass.is_populated for each chunk, and invoke @replay_fn
> - * for each contiguous populated range. In case any call fails, no further
> - * calls are made.
> + * Call @replay_fn on regions that are populated in all sources.
>   *
>   * @rdm: the #RamDiscardManager
>   * @section: the #MemoryRegionSection
> @@ -210,10 +276,7 @@ int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
>  /**
>   * ram_discard_manager_replay_discarded:
>   *
> - * Iterate the given #MemoryRegionSection at minimum granularity, calling
> - * #RamDiscardSourceClass.is_populated for each chunk, and invoke @replay_fn
> - * for each contiguous discarded range. In case any call fails, no further
> - * calls are made.
> + * Call @replay_fn on regions that are discarded in any sources.
>   *
>   * @rdm: the #RamDiscardManager
>   * @section: the #MemoryRegionSection
> @@ -234,31 +297,61 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
>  void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
>                                               RamDiscardListener *rdl);
>  
> -/*
> - * Note: later refactoring should take the source into account and the manager
> - *       should be able to aggregate multiple sources.
> +/**
> + * ram_discard_manager_notify_populate:
> + *
> + * Notify listeners that a region is about to be populated by a source.
> + * For multi-source aggregation, only notifies when all sources agree
> + * the region is populated (intersection).
> + *
> + * @rdm: the #RamDiscardManager
> + * @source: the #RamDiscardSource that is populating
> + * @offset: offset within the memory region
> + * @size: size of the region being populated
> + *
> + * Returns 0 on success, or a negative error if any listener rejects.
>   */
>  int ram_discard_manager_notify_populate(RamDiscardManager *rdm,
> +                                        RamDiscardSource *source,
>                                          uint64_t offset, uint64_t size);
>  
> -/*
> - * Note: later refactoring should take the source into account and the manager
> - *       should be able to aggregate multiple sources.
> +/**
> + * ram_discard_manager_notify_discard:
> + *
> + * Notify listeners that a region has been discarded by a source.
> + * For multi-source aggregation, always notifies immediately
> + * (union semantics - any source discarding makes region discarded).
> + *
> + * @rdm: the #RamDiscardManager
> + * @source: the #RamDiscardSource that is discarding
> + * @offset: offset within the memory region
> + * @size: size of the region being discarded
>   */
>  void ram_discard_manager_notify_discard(RamDiscardManager *rdm,
> +                                        RamDiscardSource *source,
>                                          uint64_t offset, uint64_t size);
>  
> -/*
> - * Note: later refactoring should take the source into account and the manager
> - *       should be able to aggregate multiple sources.
> +/**
> + * ram_discard_manager_notify_discard_all:
> + *
> + * Notify listeners that all regions have been discarded by a source.
> + *
> + * @rdm: the #RamDiscardManager
> + * @source: the #RamDiscardSource that is discarding
>   */
> -void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm);
> +void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm,
> +                                            RamDiscardSource *source);
>  
> -/*
> +/**
> + * ram_discard_manager_replay_populated_to_listeners:
> + *
>   * Replay populated sections to all registered listeners.
> + * For multi-source aggregation, only replays regions where all sources
> + * are populated (intersection).
>   *
> - * Note: later refactoring should take the source into account and the manager
> - *       should be able to aggregate multiple sources.
> + * @rdm: the #RamDiscardManager
> + *
> + * Returns 0 on success, or a negative error if any notification failed.
>   */
>  int ram_discard_manager_replay_populated_to_listeners(RamDiscardManager *rdm);
>  
> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index 2b67b2882d2..35e03ed7599 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -264,7 +264,8 @@ static void virtio_mem_notify_unplug(VirtIOMEM *vmem, uint64_t offset,
>  {
>      RamDiscardManager *rdm = memory_region_get_ram_discard_manager(&vmem->memdev->mr);
>  
> -    ram_discard_manager_notify_discard(rdm, offset, size);
> +    ram_discard_manager_notify_discard(rdm, RAM_DISCARD_SOURCE(vmem),
> +                                       offset, size);
>  }
>  
>  static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
> @@ -272,7 +273,8 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
>  {
>      RamDiscardManager *rdm = memory_region_get_ram_discard_manager(&vmem->memdev->mr);
>  
> -    return ram_discard_manager_notify_populate(rdm, offset, size);
> +    return ram_discard_manager_notify_populate(rdm, RAM_DISCARD_SOURCE(vmem),
> +                                               offset, size);
>  }
>  
>  static void virtio_mem_notify_unplug_all(VirtIOMEM *vmem)
> @@ -283,7 +285,7 @@ static void virtio_mem_notify_unplug_all(VirtIOMEM *vmem)
>          return;
>      }
>  
> -    ram_discard_manager_notify_discard_all(rdm);
> +    ram_discard_manager_notify_discard_all(rdm, RAM_DISCARD_SOURCE(vmem));
>  }
>  
>  static bool virtio_mem_is_range_plugged(const VirtIOMEM *vmem,
> diff --git a/system/memory.c b/system/memory.c
> index 8b46cb87838..8a4cb7b59ac 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -2109,21 +2109,22 @@ int memory_region_add_ram_discard_source(MemoryRegion *mr,
>                                           RamDiscardSource *source)
>  {
>      g_assert(memory_region_is_ram(mr));
> -    if (mr->rdm) {
> -        return -EBUSY;
> +
> +    if (!mr->rdm) {
> +        mr->rdm = ram_discard_manager_new(mr);
>      }
>  
> -    mr->rdm = ram_discard_manager_new(mr, RAM_DISCARD_SOURCE(source));
> -    return 0;
> +    return ram_discard_manager_add_source(mr->rdm, source);
>  }
>  
>  void memory_region_del_ram_discard_source(MemoryRegion *mr,
>                                            RamDiscardSource *source)
>  {
> -    g_assert(mr->rdm->rds == source);
> +    g_assert(mr->rdm);
> +
> +    ram_discard_manager_del_source(mr->rdm, source);
>  
> -    object_unref(mr->rdm);
> -    mr->rdm = NULL;
> +    /* if there is no source and no listener left, we could free rdm */

Maybe squash patch 14 directly into this one would be better?

>  }
>  
>  /* Called with rcu_read_lock held.  */
> diff --git a/system/ram-block-attributes.c b/system/ram-block-attributes.c
> index 718c7075cec..59ec7a28eb0 100644
> --- a/system/ram-block-attributes.c
> +++ b/system/ram-block-attributes.c
> @@ -90,7 +90,8 @@ ram_block_attributes_notify_discard(RamBlockAttributes *attr,
>  {
>      RamDiscardManager *rdm = memory_region_get_ram_discard_manager(attr->ram_block->mr);
>  
> -    ram_discard_manager_notify_discard(rdm, offset, size);
> +    ram_discard_manager_notify_discard(rdm, RAM_DISCARD_SOURCE(attr),
> +                                       offset, size);
>  }
>  
>  static int
> @@ -99,7 +100,8 @@ ram_block_attributes_notify_populate(RamBlockAttributes *attr,
>  {
>      RamDiscardManager *rdm = memory_region_get_ram_discard_manager(attr->ram_block->mr);
>  
> -    return ram_discard_manager_notify_populate(rdm, offset, size);
> +    return ram_discard_manager_notify_populate(rdm, RAM_DISCARD_SOURCE(attr),
> +                                               offset, size);
>  }
>  
>  int ram_block_attributes_state_change(RamBlockAttributes *attr,
> diff --git a/system/ram-discard-manager.c b/system/ram-discard-manager.c
> index 25beb052a1e..5592bfd3486 100644
> --- a/system/ram-discard-manager.c
> +++ b/system/ram-discard-manager.c
> @@ -7,6 +7,7 @@
>  
>  #include "qemu/osdep.h"
>  #include "qemu/error-report.h"
> +#include "qemu/queue.h"
>  #include "system/memory.h"
>  
>  static uint64_t ram_discard_source_get_min_granularity(const RamDiscardSource *rds,
> @@ -28,20 +29,21 @@ static bool ram_discard_source_is_populated(const RamDiscardSource *rds,
>  }
>  
>  /*
> - * Iterate the section at source granularity, aggregating consecutive chunks
> - * with matching populated state, and call replay_fn for each run.
> + * Iterate a single source's populated or discarded regions and call
> + * replay_fn for each contiguous run.
>   */
> -static int replay_by_populated_state(const RamDiscardManager *rdm,
> -                                     const MemoryRegionSection *section,
> -                                     bool replay_populated,
> -                                     ReplayRamDiscardState replay_fn,
> -                                     void *opaque)
> +static int replay_source_by_state(const RamDiscardSource *source,
> +                                  const MemoryRegion *mr,
> +                                  const MemoryRegionSection *section,
> +                                  bool replay_populated,
> +                                  ReplayRamDiscardState replay_fn,
> +                                  void *opaque)
>  {
>      uint64_t granularity, offset, size, end, pos, run_start;
>      bool in_run = false;
>      int ret = 0;
>  
> -    granularity = ram_discard_source_get_min_granularity(rdm->rds, rdm->mr);
> +    granularity = ram_discard_source_get_min_granularity(source, mr);
>      offset = section->offset_within_region;
>      size = int128_get64(section->size);
>      end = offset + size;
> @@ -55,7 +57,7 @@ static int replay_by_populated_state(const RamDiscardManager *rdm,
>              .offset_within_region = pos,
>              .size = int128_make64(granularity),
>          };
> -        bool populated = ram_discard_source_is_populated(rdm->rds, &chunk);
> +        bool populated = ram_discard_source_is_populated(source, &chunk);
>  
>          if (populated == replay_populated) {
>              if (!in_run) {
> @@ -88,28 +90,338 @@ static int replay_by_populated_state(const RamDiscardManager *rdm,
>      return ret;
>  }
>  
> -RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr,
> -                                           RamDiscardSource *rds)
> +RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr)
>  {
>      RamDiscardManager *rdm;
>  
>      rdm = RAM_DISCARD_MANAGER(object_new(TYPE_RAM_DISCARD_MANAGER));
> -    rdm->rds = rds;
>      rdm->mr = mr;
> -    QLIST_INIT(&rdm->rdl_list);
>      return rdm;
>  }
>  
> +static void ram_discard_manager_update_granularity(RamDiscardManager *rdm)
> +{
> +    RamDiscardSourceEntry *entry;
> +    uint64_t granularity = 0;
> +
> +    QLIST_FOREACH(entry, &rdm->source_list, next) {
> +        uint64_t src_granularity;
> +
> +        src_granularity =
> +            ram_discard_source_get_min_granularity(entry->rds, rdm->mr);
> +        g_assert(src_granularity != 0);
> +        if (granularity == 0) {
> +            granularity = src_granularity;
> +        } else {
> +            granularity = MIN(granularity, src_granularity);
> +        }
> +    }
> +    rdm->min_granularity = granularity;
> +}
> +
> +static RamDiscardSourceEntry *
> +ram_discard_manager_find_source(RamDiscardManager *rdm, RamDiscardSource *rds)
> +{
> +    RamDiscardSourceEntry *entry;
> +
> +    QLIST_FOREACH(entry, &rdm->source_list, next) {
> +        if (entry->rds == rds) {
> +            return entry;
> +        }
> +    }
> +    return NULL;
> +}
> +
> +static int rdl_populate_cb(const MemoryRegionSection *section, void *opaque)
> +{
> +    RamDiscardListener *rdl = opaque;
> +    MemoryRegionSection tmp = *rdl->section;
> +
> +    g_assert(section->mr == rdl->section->mr);
> +
> +    if (!memory_region_section_intersect_range(&tmp,
> +                                               section->offset_within_region,
> +                                               int128_get64(section->size))) {
> +        return 0;
> +    }
> +
> +    return rdl->notify_populate(rdl, &tmp);
> +}
> +
> +static int rdl_discard_cb(const MemoryRegionSection *section, void *opaque)
> +{
> +    RamDiscardListener *rdl = opaque;
> +    MemoryRegionSection tmp = *rdl->section;
> +
> +    g_assert(section->mr == rdl->section->mr);
> +
> +    if (!memory_region_section_intersect_range(&tmp,
> +                                               section->offset_within_region,
> +                                               int128_get64(section->size))) {
> +        return 0;
> +    }
> +
> +    rdl->notify_discard(rdl, &tmp);
> +    return 0;
> +}
> +
> +static bool rdm_is_all_populated_skip(const RamDiscardManager *rdm,
> +                                      const MemoryRegionSection *section,
> +                                      const RamDiscardSource *skip_source)
> +{
> +    RamDiscardSourceEntry *entry;
> +
> +    QLIST_FOREACH(entry, &rdm->source_list, next) {
> +        if (skip_source && entry->rds == skip_source) {
> +            continue;
> +        }
> +        if (!ram_discard_source_is_populated(entry->rds, section)) {
> +            return false;
> +        }
> +    }
> +    return true;
> +}
> +
> +typedef struct SourceNotifyCtx {
> +    RamDiscardManager *rdm;
> +    RamDiscardListener *rdl;
> +    RamDiscardSource *source; /* added or removed */
> +} SourceNotifyCtx;
> +
> +/*
> + * Unified helper to replay regions based on populated state.
> + * If replay_populated is true: replay regions where ALL sources are populated.
> + * If replay_populated is false: replay regions where ANY source is discarded.
> + */
> +static int replay_by_populated_state(const RamDiscardManager *rdm,
> +                                     const MemoryRegionSection *section,
> +                                     const RamDiscardSource *skip_source,

Indeed we still need skip_source to handle the cases where one source
updated the populated status / bitmap to notify the change..

IIUC if we can cache the global status into a per-manager merged bitmap,
then we could drop skip_source.  That'll also help in the case you
mentioned in the previous patch because then a replay can walk that merged
bitmap, but as you also mentioned we don't yet know if the perf is an issue
yet.  I also don't know if a merged bitmap in the manager would help much
if there're only two sources: I believe it'll help more if we have a lot
more sources, but unlikely..  So just a quick thought.

Thanks,

> +                                     bool replay_populated,
> +                                     ReplayRamDiscardState replay_fn,
> +                                     void *user_opaque)
> +{
> +    uint64_t granularity = rdm->min_granularity;
> +    uint64_t offset, end_offset;
> +    uint64_t run_start = 0;
> +    bool in_run = false;
> +    int ret = 0;
> +
> +    if (QLIST_EMPTY(&rdm->source_list)) {
> +        if (replay_populated) {
> +            return replay_fn(section, user_opaque);
> +        }
> +        return 0;
> +    }
> +
> +    g_assert(granularity != 0);
> +
> +    offset = section->offset_within_region;
> +    end_offset = offset + int128_get64(section->size);
> +
> +    while (offset < end_offset) {
> +        MemoryRegionSection subsection = {
> +            .mr = section->mr,
> +            .offset_within_region = offset,
> +            .size = int128_make64(MIN(granularity, end_offset - offset)),
> +        };
> +        bool all_populated;
> +        bool included;
> +
> +        all_populated = rdm_is_all_populated_skip(rdm, &subsection,
> +                                                     skip_source);
> +        included = replay_populated ? all_populated : !all_populated;
> +
> +        if (included) {
> +            if (!in_run) {
> +                run_start = offset;
> +                in_run = true;
> +            }
> +        } else {
> +            if (in_run) {
> +                MemoryRegionSection run_section = {
> +                    .mr = section->mr,
> +                    .offset_within_region = run_start,
> +                    .size = int128_make64(offset - run_start),
> +                };
> +                ret = replay_fn(&run_section, user_opaque);
> +                if (ret) {
> +                    return ret;
> +                }
> +                in_run = false;
> +            }
> +        }
> +        if (granularity > end_offset - offset) {
> +            break;
> +        }
> +        offset += granularity;
> +    }
> +
> +    if (in_run) {
> +        MemoryRegionSection run_section = {
> +            .mr = section->mr,
> +            .offset_within_region = run_start,
> +            .size = int128_make64(end_offset - run_start),
> +        };
> +        ret = replay_fn(&run_section, user_opaque);
> +    }
> +
> +    return ret;
> +}
> +
> +static int add_source_check_discard_cb(const MemoryRegionSection *section,
> +                                       void *opaque)
> +{
> +    SourceNotifyCtx *ctx = opaque;
> +
> +    return replay_by_populated_state(ctx->rdm, section, ctx->source, true,
> +                                     rdl_discard_cb, ctx->rdl);
> +}
> +
> +static int del_source_check_populate_cb(const MemoryRegionSection *section,
> +                                        void *opaque)
> +{
> +    SourceNotifyCtx *ctx = opaque;
> +
> +    return replay_by_populated_state(ctx->rdm, section, ctx->source, true,
> +                                     rdl_populate_cb, ctx->rdl);
> +}
> +
> +int ram_discard_manager_add_source(RamDiscardManager *rdm,
> +                                   RamDiscardSource *source)
> +{
> +    RamDiscardSourceEntry *entry;
> +    RamDiscardListener *rdl, *rdl2;
> +    int ret = 0;
> +
> +    if (ram_discard_manager_find_source(rdm, source)) {
> +        return -EBUSY;
> +    }
> +
> +    /*
> +     * If there are existing listeners, notify them about regions that
> +     * become discarded due to adding this source. Only notify for regions
> +     * that were previously populated (all other sources agreed).
> +     */
> +    QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
> +        SourceNotifyCtx ctx = {
> +            .rdm = rdm,
> +            .rdl = rdl,
> +            /* no need to set source */
> +        };
> +        ret = replay_source_by_state(source, rdm->mr, rdl->section,
> +                                     false,
> +                                     add_source_check_discard_cb, &ctx);
> +        if (ret) {
> +            break;
> +        }
> +    }
> +    if (ret) {
> +        QLIST_FOREACH(rdl2, &rdm->rdl_list, next) {
> +            SourceNotifyCtx ctx = {
> +                .rdm = rdm,
> +                .rdl = rdl2,
> +            };
> +            replay_source_by_state(source, rdm->mr, rdl2->section,
> +                                   false,
> +                                   del_source_check_populate_cb,
> +                                   &ctx);
> +            if (rdl == rdl2) {
> +                break;
> +            }
> +        }
> +
> +        return ret;
> +    }
> +
> +    entry = g_new0(RamDiscardSourceEntry, 1);
> +    entry->rds = source;
> +    QLIST_INSERT_HEAD(&rdm->source_list, entry, next);
> +
> +    ram_discard_manager_update_granularity(rdm);
> +
> +    return ret;
> +}
> +
> +int ram_discard_manager_del_source(RamDiscardManager *rdm,
> +                                   RamDiscardSource *source)
> +{
> +    RamDiscardSourceEntry *entry;
> +    RamDiscardListener *rdl, *rdl2;
> +    int ret = 0;
> +
> +    entry = ram_discard_manager_find_source(rdm, source);
> +    if (!entry) {
> +        return -ENOENT;
> +    }
> +
> +    /*
> +     * If there are existing listeners, check if any regions become
> +     * populated due to removing this source.
> +     */
> +    QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
> +        SourceNotifyCtx ctx = {
> +            .rdm = rdm,
> +            .rdl = rdl,
> +            .source = source,
> +        };
> +        /*
> +         * From the previously discarded regions, check if any
> +         * regions become populated.
> +         */
> +        ret = replay_source_by_state(source, rdm->mr, rdl->section,
> +                                     false,
> +                                     del_source_check_populate_cb,
> +                                     &ctx);
> +        if (ret) {
> +            break;
> +        }
> +    }
> +    if (ret) {
> +        QLIST_FOREACH(rdl2, &rdm->rdl_list, next) {
> +            SourceNotifyCtx ctx = {
> +                .rdm = rdm,
> +                .rdl = rdl2,
> +                .source = source,
> +            };
> +            replay_source_by_state(source, rdm->mr, rdl2->section,
> +                                   false,
> +                                   add_source_check_discard_cb,
> +                                   &ctx);
> +            if (rdl == rdl2) {
> +                break;
> +            }
> +        }
> +
> +        return ret;
> +    }
> +
> +    QLIST_REMOVE(entry, next);
> +    g_free(entry);
> +    ram_discard_manager_update_granularity(rdm);
> +    return ret;
> +}
> +
>  uint64_t ram_discard_manager_get_min_granularity(const RamDiscardManager *rdm,
>                                                   const MemoryRegion *mr)
>  {
> -    return ram_discard_source_get_min_granularity(rdm->rds, mr);
> +    g_assert(mr == rdm->mr);
> +    return rdm->min_granularity;
>  }
>  
> +/*
> + * Aggregated query: returns true only if ALL sources report populated (AND).
> + */
>  bool ram_discard_manager_is_populated(const RamDiscardManager *rdm,
>                                        const MemoryRegionSection *section)
>  {
> -    return ram_discard_source_is_populated(rdm->rds, section);
> +    RamDiscardSourceEntry *entry;
> +
> +    QLIST_FOREACH(entry, &rdm->source_list, next) {
> +        if (!ram_discard_source_is_populated(entry->rds, section)) {
> +            return false;
> +        }
> +    }
> +    return true;
>  }
>  
>  int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
> @@ -117,7 +429,8 @@ int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
>                                           ReplayRamDiscardState replay_fn,
>                                           void *opaque)
>  {
> -    return replay_by_populated_state(rdm, section, true, replay_fn, opaque);
> +    return replay_by_populated_state(rdm, section, NULL, true,
> +                                     replay_fn, opaque);
>  }
>  
>  int ram_discard_manager_replay_discarded(const RamDiscardManager *rdm,
> @@ -125,14 +438,17 @@ int ram_discard_manager_replay_discarded(const RamDiscardManager *rdm,
>                                           ReplayRamDiscardState replay_fn,
>                                           void *opaque)
>  {
> -    return replay_by_populated_state(rdm, section, false, replay_fn, opaque);
> +    return replay_by_populated_state(rdm, section, NULL, false,
> +                                     replay_fn, opaque);
>  }
>  
>  static void ram_discard_manager_initfn(Object *obj)
>  {
>      RamDiscardManager *rdm = RAM_DISCARD_MANAGER(obj);
>  
> +    QLIST_INIT(&rdm->source_list);
>      QLIST_INIT(&rdm->rdl_list);
> +    rdm->min_granularity = 0;
>  }
>  
>  static void ram_discard_manager_finalize(Object *obj)
> @@ -140,74 +456,91 @@ static void ram_discard_manager_finalize(Object *obj)
>      RamDiscardManager *rdm = RAM_DISCARD_MANAGER(obj);
>  
>      g_assert(QLIST_EMPTY(&rdm->rdl_list));
> +    g_assert(QLIST_EMPTY(&rdm->source_list));
>  }
>  
>  int ram_discard_manager_notify_populate(RamDiscardManager *rdm,
> +                                        RamDiscardSource *source,
>                                          uint64_t offset, uint64_t size)
>  {
>      RamDiscardListener *rdl, *rdl2;
> +    MemoryRegionSection section = {
> +        .mr = rdm->mr,
> +        .offset_within_region = offset,
> +        .size = int128_make64(size),
> +    };
>      int ret = 0;
>  
> -    QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
> -        MemoryRegionSection tmp = *rdl->section;
> +    g_assert(ram_discard_manager_find_source(rdm, source));
>  
> -        if (!memory_region_section_intersect_range(&tmp, offset, size)) {
> -            continue;
> -        }
> -        ret = rdl->notify_populate(rdl, &tmp);
> +    /*
> +     * Only notify about regions that are populated in ALL sources.
> +     * replay_by_populated_state checks all sources including the one that
> +     * just populated.
> +     */
> +    QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
> +        ret = replay_by_populated_state(rdm, &section, NULL, true,
> +                                        rdl_populate_cb, rdl);
>          if (ret) {
>              break;
>          }
>      }
>  
>      if (ret) {
> -        /* Notify all already-notified listeners about discard. */
> +        /*
> +         * Rollback: notify discard for listeners we already notified,
> +         * including the failing listener which may have been partially
> +         * notified. Listeners must handle discard notifications for
> +         * regions they didn't receive populate notifications for.
> +         */
>          QLIST_FOREACH(rdl2, &rdm->rdl_list, next) {
> -            MemoryRegionSection tmp = *rdl2->section;
> -
> +            replay_by_populated_state(rdm, &section, NULL, true,
> +                                      rdl_discard_cb, rdl2);
>              if (rdl2 == rdl) {
>                  break;
>              }
> -            if (!memory_region_section_intersect_range(&tmp, offset, size)) {
> -                continue;
> -            }
> -            rdl2->notify_discard(rdl2, &tmp);
>          }
>      }
>      return ret;
>  }
>  
>  void ram_discard_manager_notify_discard(RamDiscardManager *rdm,
> +                                        RamDiscardSource *source,
>                                          uint64_t offset, uint64_t size)
>  {
>      RamDiscardListener *rdl;
> -
> +    MemoryRegionSection section = {
> +        .mr = rdm->mr,
> +        .offset_within_region = offset,
> +        .size = int128_make64(size),
> +    };
> +
> +    g_assert(ram_discard_manager_find_source(rdm, source));
> +
> +    /*
> +     * Only notify about ranges that were aggregately populated before this
> +     * source's discard. Since the source has already updated its state,
> +     * we use replay_by_populated_state with this source skipped - it will
> +     * replay only the ranges where all OTHER sources are populated.
> +     */
>      QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
> -        MemoryRegionSection tmp = *rdl->section;
> -
> -        if (!memory_region_section_intersect_range(&tmp, offset, size)) {
> -            continue;
> -        }
> -        rdl->notify_discard(rdl, &tmp);
> +        replay_by_populated_state(rdm, &section, source, true,
> +                                  rdl_discard_cb, rdl);
>      }
>  }
>  
> -void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm)
> +void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm,
> +                                            RamDiscardSource *source)
>  {
>      RamDiscardListener *rdl;
>  
> +    g_assert(ram_discard_manager_find_source(rdm, source));
> +
>      QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
>          rdl->notify_discard(rdl, rdl->section);
>      }
>  }
>  
> -static int rdm_populate_cb(const MemoryRegionSection *section, void *opaque)
> -{
> -    RamDiscardListener *rdl = opaque;
> -
> -    return rdl->notify_populate(rdl, section);
> -}
> -
>  void ram_discard_manager_register_listener(RamDiscardManager *rdm,
>                                             RamDiscardListener *rdl,
>                                             MemoryRegionSection *section)
> @@ -220,7 +553,7 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
>      QLIST_INSERT_HEAD(&rdm->rdl_list, rdl, next);
>  
>      ret = ram_discard_manager_replay_populated(rdm, rdl->section,
> -                                               rdm_populate_cb, rdl);
> +                                               rdl_populate_cb, rdl);
>      if (ret) {
>          error_report("%s: Replaying populated ranges failed: %s", __func__,
>                       strerror(-ret));
> @@ -246,7 +579,7 @@ int ram_discard_manager_replay_populated_to_listeners(RamDiscardManager *rdm)
>  
>      QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
>          ret = ram_discard_manager_replay_populated(rdm, rdl->section,
> -                                                   rdm_populate_cb, rdl);
> +                                                   rdl_populate_cb, rdl);
>          if (ret) {
>              break;
>          }
> -- 
> 2.53.0
> 

-- 
Peter Xu

next prev parent reply	other threads:[~2026-03-03 21:16 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-26 13:59 [PATCH v3 00/15] Make RamDiscardManager work with multiple sources marcandre.lureau
2026-02-26 13:59 ` [PATCH v3 01/15] system/rba: use DIV_ROUND_UP marcandre.lureau
2026-03-03 18:57   ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 02/15] memory: drop RamDiscardListener::double_discard_supported marcandre.lureau
2026-02-26 15:58   ` David Hildenbrand
2026-03-03 18:57   ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 03/15] virtio-mem: use warn_report_err_once() marcandre.lureau
2026-02-26 15:57   ` David Hildenbrand
2026-03-03 18:58   ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 04/15] system/memory: minor doc fix marcandre.lureau
2026-02-26 15:58   ` David Hildenbrand
2026-03-03 18:59   ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 05/15] kvm: replace RamDicardManager by the RamBlockAttribute marcandre.lureau
2026-03-03 18:59   ` Peter Xu
2026-03-04  9:09   ` Chenyi Qiang
2026-02-26 13:59 ` [PATCH v3 06/15] system/memory: split RamDiscardManager into source and manager marcandre.lureau
2026-03-03 19:45   ` Peter Xu
2026-03-04  9:51   ` Chenyi Qiang
2026-02-26 13:59 ` [PATCH v3 07/15] system/memory: move RamDiscardManager to separate compilation unit marcandre.lureau
2026-03-03 19:47   ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 08/15] system/memory: constify section arguments marcandre.lureau
2026-03-03 19:48   ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 09/15] system/ram-discard-manager: implement replay via is_populated iteration marcandre.lureau
2026-02-26 16:01   ` David Hildenbrand
2026-03-03 10:47     ` Marc-André Lureau
2026-03-03 20:27       ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 10/15] virtio-mem: remove replay_populated/replay_discarded implementation marcandre.lureau
2026-02-26 13:59 ` [PATCH v3 11/15] system/ram-discard-manager: drop replay from source interface marcandre.lureau
2026-02-26 13:59 ` [PATCH v3 12/15] system/memory: implement RamDiscardManager multi-source aggregation marcandre.lureau
2026-03-03 21:16   ` Peter Xu [this message]
2026-02-26 13:59 ` [PATCH v3 13/15] system/physmem: destroy ram block attributes before RCU-deferred reclaim marcandre.lureau
2026-03-03 21:31   ` Peter Xu
2026-02-26 13:59 ` [PATCH v3 14/15] system/memory: add RamDiscardManager reference counting and cleanup marcandre.lureau
2026-03-03 21:33   ` Peter Xu
2026-02-26 14:00 ` [PATCH v3 15/15] tests: add unit tests for RamDiscardManager multi-source aggregation marcandre.lureau
2026-03-03 21:28 ` [PATCH v3 00/15] Make RamDiscardManager work with multiple sources Peter Xu
2026-03-03 22:33   ` Marc-André Lureau
2026-03-09 18:25 ` Peter Xu
2026-03-10  2:35 ` Chenyi Qiang
2026-03-10 16:09   ` Marc-André Lureau
2026-03-11  7:17     ` Chenyi Qiang
2026-03-11 14:12       ` Marc-André Lureau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aadPkO26sp1NQRo-@x1.local \
    --to=peterx@redhat.com \
    --cc=alex@shazbot.org \
    --cc=bchaney@akamai.com \
    --cc=clg@redhat.com \
    --cc=david@kernel.org \
    --cc=farosas@suse.de \
    --cc=kvm@vger.kernel.org \
    --cc=marcandre.lureau@redhat.com \
    --cc=mark.kanda@oracle.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox