qemu-devel.nongnu.org archive mirror
From: Chenyi Qiang <chenyi.qiang@intel.com>
To: "Alexey Kardashevskiy" <aik@amd.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Michael Roth" <michael.roth@amd.com>
Cc: <qemu-devel@nongnu.org>, <kvm@vger.kernel.org>,
	Williams Dan J <dan.j.williams@intel.com>,
	Peng Chao P <chao.p.peng@intel.com>,
	Gao Chao <chao.gao@intel.com>, Xu Yilun <yilun.xu@intel.com>,
	Li Xiaoyao <xiaoyao.li@intel.com>
Subject: Re: [PATCH v2 4/6] memory-attribute-manager: Introduce a callback to notify the shared/private state change
Date: Wed, 19 Feb 2025 09:50:59 +0800
Message-ID: <7131b4a3-a836-4efd-bcfc-982a0112ef05@intel.com>
In-Reply-To: <9a8fe1a7-528d-466a-a72d-89ceb88f47fb@amd.com>



On 2/18/2025 5:19 PM, Alexey Kardashevskiy wrote:
> 
> 
> On 17/2/25 19:18, Chenyi Qiang wrote:
>> Introduce a new state_change() callback in MemoryAttributeManagerClass to
>> efficiently notify all registered RamDiscardListeners, including VFIO
>> listeners about the memory conversion events in guest_memfd. The
>> existing VFIO listener can dynamically DMA map/unmap the shared pages
>> based on conversion types:
>> - For conversions from shared to private, the VFIO system ensures the
>>    discarding of shared mapping from the IOMMU.
>> - For conversions from private to shared, it triggers the population of
>>    the shared mapping into the IOMMU.
>>
>> Additionally, there could be some special conversion requests:
>> - When a conversion request is made for a page already in the desired
>>    state, the helper simply returns success.
>> - For requests involving a range partially in the desired state, only
>>    the necessary segments are converted, ensuring the entire range
>>    complies with the request efficiently.
>> - In scenarios where a conversion request is declined by other systems,
>>    such as a failure from VFIO during notify_populate(), the helper will
>>    roll back the request, maintaining consistency.
>>
>> Opportunistically introduce a helper to trigger the state_change()
>> callback of the class.
>>
>> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
>> ---
>> Changes in v2:
>>      - Do the alignment changes due to the rename to
>> MemoryAttributeManager
>>      - Move the state_change() helper definition in this patch.
>> ---
>>   include/system/memory-attribute-manager.h |  20 +++
>>   system/memory-attribute-manager.c         | 148 ++++++++++++++++++++++
>>   2 files changed, 168 insertions(+)
>>
>> diff --git a/include/system/memory-attribute-manager.h b/include/
>> system/memory-attribute-manager.h
>> index 72adc0028e..c3dab4e47b 100644
>> --- a/include/system/memory-attribute-manager.h
>> +++ b/include/system/memory-attribute-manager.h
>> @@ -34,8 +34,28 @@ struct MemoryAttributeManager {
>>     struct MemoryAttributeManagerClass {
>>       ObjectClass parent_class;
>> +
>> +    int (*state_change)(MemoryAttributeManager *mgr, uint64_t offset,
>> uint64_t size,
>> +                        bool shared_to_private);
>>   };
>>   +static inline int
>> memory_attribute_manager_state_change(MemoryAttributeManager *mgr,
>> uint64_t offset,
>> +                                                        uint64_t
>> size, bool shared_to_private)
>> +{
>> +    MemoryAttributeManagerClass *klass;
>> +
>> +    if (mgr == NULL) {
>> +        return 0;
>> +    }
>> +
>> +    klass = MEMORY_ATTRIBUTE_MANAGER_GET_CLASS(mgr);
>> +    if (klass->state_change) {
>> +        return klass->state_change(mgr, offset, size,
>> shared_to_private);
>> +    }
>> +
>> +    return 0;
> 
> 
> nit: MemoryAttributeManagerClass without this only callback defined
> should produce some error imho. Or assert.

Nice catch. Will return error if !klass->state_change.
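
For illustration only, a minimal stand-alone sketch of that behaviour.
The Manager/ManagerClass types and demo_state_change() are hypothetical
stand-ins, not the real QOM definitions, and -ENOTSUP is an assumed
error value; the thread does not say which error will be returned.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the QOM object and class types. */
typedef struct Manager Manager;

typedef struct ManagerClass {
    int (*state_change)(Manager *mgr, uint64_t offset, uint64_t size,
                        bool to_private);
} ManagerClass;

struct Manager {
    const ManagerClass *klass;
};

/* Hypothetical callback used only to exercise the dispatch helper. */
static int demo_state_change(Manager *mgr, uint64_t offset, uint64_t size,
                             bool to_private)
{
    (void)mgr;
    (void)offset;
    (void)size;
    (void)to_private;
    return 0;
}

static inline int manager_state_change(Manager *mgr, uint64_t offset,
                                       uint64_t size, bool to_private)
{
    if (mgr == NULL) {
        return 0; /* no manager attached: nothing to do, as in the patch */
    }

    if (mgr->klass == NULL || mgr->klass->state_change == NULL) {
        /* Assumed error code: a class without the callback is an error. */
        return -ENOTSUP;
    }

    return mgr->klass->state_change(mgr, offset, size, to_private);
}
```

With this shape a missing callback is reported to the caller instead of
being silently treated as success.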

> 
>> +}
>> +
>>   int memory_attribute_manager_realize(MemoryAttributeManager *mgr,
>> MemoryRegion *mr);
>>   void memory_attribute_manager_unrealize(MemoryAttributeManager *mgr);

[..]

>> +
>> +static bool
>> memory_attribute_is_range_discarded(MemoryAttributeManager *mgr,
>> +                                                uint64_t offset,
>> uint64_t size)
>> +{
>> +    int block_size = memory_attribute_manager_get_block_size(mgr);
>> +    const unsigned long first_bit = offset / block_size;
>> +    const unsigned long last_bit = first_bit + (size / block_size) - 1;
>> +    unsigned long found_bit;
>> +
>> +    /* We fake a shorter bitmap to avoid searching too far. */
> 
> Weird comment imho, why is it a "fake"? You check if all pages within
> [offset, offset+size) are discarded. You do not want to search beyond
> the end of this range anyway, right?

Yes. I think "fake" is meant to describe the mismatch with the
definition of find_next_bit(): its second argument is documented as
"The bitmap size in bits", but "last_bit + 1" is not the size of
shared_bitmap.
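
A small self-contained sketch of the bounded search, for illustration
only. next_set_bit() is a simplified, hypothetical stand-in for
find_next_bit() that assumes the same convention (it returns the limit
when no set bit is found); range_is_discarded() is likewise a made-up
name that takes bit indices directly instead of offset/size.

```c
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Simplified stand-in for find_next_bit(): return the index of the first
 * set bit in [start, limit), or limit if there is none.
 */
static unsigned long next_set_bit(const unsigned long *map,
                                  unsigned long limit, unsigned long start)
{
    for (unsigned long i = start; i < limit; i++) {
        if (map[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG))) {
            return i;
        }
    }
    return limit;
}

/*
 * A range is fully discarded (private) when no shared bit is set in
 * [first_bit, first_bit + nbits). Passing last_bit + 1 as the limit stops
 * the search at the end of the range rather than at the end of the bitmap.
 */
static bool range_is_discarded(const unsigned long *shared_bitmap,
                               unsigned long first_bit, unsigned long nbits)
{
    unsigned long last_bit = first_bit + nbits - 1;

    return next_set_bit(shared_bitmap, last_bit + 1, first_bit) > last_bit;
}
```

The limit passed to the search is the end of the range being checked,
not the length of the whole bitmap, which is what the "fake" in the
comment refers to.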

> 
>> +    found_bit = find_next_bit(mgr->shared_bitmap, last_bit + 1,
>> first_bit);
>> +    return found_bit > last_bit;
>> +}
>> +
>> +static int memory_attribute_state_change(MemoryAttributeManager *mgr,
>> uint64_t offset,
>> +                                         uint64_t size, bool
>> shared_to_private)
> 
> Elsewhere it is just "to_private".

I'm OK with changing it to "to_private" for consistency.

> 
>> +{
>> +    int block_size = memory_attribute_manager_get_block_size(mgr);
>> +    int ret = 0;
>> +
>> +    if (!memory_attribute_is_valid_range(mgr, offset, size)) {
>> +        error_report("%s, invalid range: offset 0x%lx, size 0x%lx",
>> +                     __func__, offset, size);
>> +        return -1;
>> +    }
>> +
>> +    if ((shared_to_private &&
>> memory_attribute_is_range_discarded(mgr, offset, size)) ||
>> +        (!shared_to_private &&
>> memory_attribute_is_range_populated(mgr, offset, size))) {
>> +        return 0;
>> +    }
>> +
>> +    if (shared_to_private) {
>> +        memory_attribute_notify_discard(mgr, offset, size);
>> +    } else {
>> +        ret = memory_attribute_notify_populate(mgr, offset, size);
>> +    }
>> +
>> +    if (!ret) {
>> +        unsigned long first_bit = offset / block_size;
>> +        unsigned long nbits = size / block_size;
>> +
>> +        g_assert((first_bit + nbits) <= mgr->bitmap_size);
>> +
>> +        if (shared_to_private) {
>> +            bitmap_clear(mgr->shared_bitmap, first_bit, nbits);
>> +        } else {
>> +            bitmap_set(mgr->shared_bitmap, first_bit, nbits);
>> +        }
>> +
>> +        return 0;
> 
> Do not need this return. Thanks,

Removed. Thanks!
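
A toy, stand-alone sketch of the resulting control flow with a single
exit point, for illustration only. ToyManager, its fields and the
toy_*() helpers are hypothetical; the bool array stands in for
shared_bitmap and the notifier results are faked.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NBLOCKS 8

/* Hypothetical toy state: one flag per block, true means shared. */
typedef struct ToyManager {
    bool shared[TOY_NBLOCKS];
    int populate_result; /* value the fake populate notifier returns */
} ToyManager;

static int toy_notify_populate(ToyManager *m)
{
    return m->populate_result;
}

static void toy_notify_discard(ToyManager *m)
{
    (void)m; /* discard notifications cannot fail in this sketch */
}

static int toy_state_change(ToyManager *m, size_t first, size_t n,
                            bool to_private)
{
    int ret = 0;

    if (first > TOY_NBLOCKS || n > TOY_NBLOCKS - first) {
        return -1; /* invalid range */
    }

    if (to_private) {
        toy_notify_discard(m);
    } else {
        ret = toy_notify_populate(m);
    }

    /* Only update the tracked state when the listeners accepted it. */
    if (!ret) {
        for (size_t i = first; i < first + n; i++) {
            m->shared[i] = !to_private;
        }
    }

    return ret; /* single exit: 0 on success, listener error otherwise */
}
```

Because ret is already 0 on the success path, the final "return ret"
covers both outcomes and the inner "return 0" is redundant.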

> 
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>>   int memory_attribute_manager_realize(MemoryAttributeManager *mgr,
>> MemoryRegion *mr)
>>   {
>>       uint64_t bitmap_size;
>> @@ -281,8 +426,11 @@ static void
>> memory_attribute_manager_finalize(Object *obj)
>>     static void memory_attribute_manager_class_init(ObjectClass *oc,
>> void *data)
>>   {
>> +    MemoryAttributeManagerClass *mamc =
>> MEMORY_ATTRIBUTE_MANAGER_CLASS(oc);
>>       RamDiscardManagerClass *rdmc = RAM_DISCARD_MANAGER_CLASS(oc);
>>   +    mamc->state_change = memory_attribute_state_change;
>> +
>>       rdmc->get_min_granularity =
>> memory_attribute_rdm_get_min_granularity;
>>       rdmc->register_listener = memory_attribute_rdm_register_listener;
>>       rdmc->unregister_listener =
>> memory_attribute_rdm_unregister_listener;
> 



