kvm.vger.kernel.org archive mirror
From: Alexey Kardashevskiy <aik@amd.com>
To: "Chenyi Qiang" <chenyi.qiang@intel.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	"Gupta Pankaj" <pankaj.gupta@amd.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Michael Roth" <michael.roth@amd.com>
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org,
	Williams Dan J <dan.j.williams@intel.com>,
	Zhao Liu <zhao1.liu@intel.com>,
	Baolu Lu <baolu.lu@linux.intel.com>,
	Gao Chao <chao.gao@intel.com>, Xu Yilun <yilun.xu@intel.com>,
	Li Xiaoyao <xiaoyao.li@intel.com>
Subject: Re: [PATCH v5 04/10] ram-block-attribute: Introduce RamBlockAttribute to manage RAMBlock with guest_memfd
Date: Tue, 27 May 2025 16:06:31 +1000
Message-ID: <047a0649-4a8c-4f08-a5c8-4168f975d5a3@amd.com>
In-Reply-To: <cc727bbc-fe37-4be5-9949-3f62d8734215@intel.com>



On 27/5/25 13:14, Chenyi Qiang wrote:
> 
> 
> On 5/27/2025 9:20 AM, Alexey Kardashevskiy wrote:
>>
>>
>> On 27/5/25 11:15, Chenyi Qiang wrote:
>>>
>>>
>>> On 5/26/2025 7:16 PM, Alexey Kardashevskiy wrote:
>>>>
>>>>
>>>> On 26/5/25 19:28, Chenyi Qiang wrote:
>>>>>
>>>>>
>>>>> On 5/26/2025 5:01 PM, David Hildenbrand wrote:
>>>>>> On 20.05.25 12:28, Chenyi Qiang wrote:
>>>>>>> Commit 852f0048f3 ("RAMBlock: make guest_memfd require uncoordinated
>>>>>>> discard") highlighted that subsystems like VFIO may disable RAM block
>>>>>>> discard. However, guest_memfd relies on discard operations for page
>>>>>>> conversion between private and shared memory, potentially leading to
>>>>>>> stale IOMMU mapping issue when assigning hardware devices to
>>>>>>> confidential VMs via shared memory. To address this and allow shared
>>>>>>> device assignment, it is crucial to ensure VFIO system refresh its
>>>>>>> IOMMU mappings.
>>>>>>>
>>>>>>> RamDiscardManager is an existing interface (used by virtio-mem) to
>>>>>>> adjust VFIO mappings in relation to VM page assignment. Effectively
>>>>>>> page
>>>>>>> conversion is similar to hot-removing a page in one mode and
>>>>>>> adding it
>>>>>>> back in the other. Therefore, similar actions are required for page
>>>>>>> conversion events. Introduce the RamDiscardManager to guest_memfd to
>>>>>>> facilitate this process.
>>>>>>>
>>>>>>> Since guest_memfd is not an object, it cannot directly implement the
>>>>>>> RamDiscardManager interface. Implementing it in HostMemoryBackend is
>>>>>>> not appropriate because guest_memfd is per RAMBlock, and some
>>>>>>> RAMBlocks
>>>>>>> have a memory backend while others do not. Notably, virtual BIOS
>>>>>>> RAMBlocks using memory_region_init_ram_guest_memfd() do not have a
>>>>>>> backend.
>>>>>>>
>>>>>>> To manage RAMBlocks with guest_memfd, define a new object named
>>>>>>> RamBlockAttribute to implement the RamDiscardManager interface. This
>>>>>>> object can store the guest_memfd information such as bitmap for
>>>>>>> shared
>>>>>>> memory, and handles page conversion notification. In the context of
>>>>>>> RamDiscardManager, shared state is analogous to populated and private
>>>>>>> state is treated as discard. The memory state is tracked at the host
>>>>>>> page size granularity, as minimum memory conversion size can be one
>>>>>>> page
>>>>>>> per request. Additionally, VFIO expects the DMA mapping for a
>>>>>>> specific
>>>>>>> iova to be mapped and unmapped with the same granularity.
>>>>>>> Confidential
>>>>>>> VMs may perform partial conversions, such as conversions on small
>>>>>>> regions within larger regions. To prevent such invalid cases and
>>>>>>> until
>>>>>>> cut_mapping operation support is available, all operations are
>>>>>>> performed
>>>>>>> with 4K granularity.
>>>>>>>
>>>>>>> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
>>>>>>> ---
>>>>>>> Changes in v5:
>>>>>>>         - Revert to use RamDiscardManager interface instead of
>>>>>>> introducing
>>>>>>>           new hierarchy of class to manage private/shared state,
>>>>>>> and keep
>>>>>>>           using the new name of RamBlockAttribute compared with the
>>>>>>>           MemoryAttributeManager in v3.
>>>>>>>         - Use *simple* version of object_define and object_declare
>>>>>>> since the
>>>>>>>           state_change() function is changed as an exported function
>>>>>>> instead
>>>>>>>           of a virtual function in later patch.
>>>>>>>         - Move the introduction of RamBlockAttribute field to this
>>>>>>> patch and
>>>>>>>           rename it to ram_shared. (Alexey)
>>>>>>>         - call the exit() when register/unregister failed. (Zhao)
>>>>>>>         - Add the ram-block-attribute.c to Memory API related part in
>>>>>>>           MAINTAINERS.
>>>>>>>
>>>>>>> Changes in v4:
>>>>>>>         - Change the name from memory-attribute-manager to
>>>>>>>           ram-block-attribute.
>>>>>>>         - Implement the newly-introduced PrivateSharedManager
>>>>>>> instead of
>>>>>>>           RamDiscardManager and change related commit message.
>>>>>>>         - Define the new object in ramblock.h instead of adding a new
>>>>>>> file.
>>>>>>>
>>>>>>> Changes in v3:
>>>>>>>         - Some rename (bitmap_size->shared_bitmap_size,
>>>>>>>           first_one/zero_bit->first_bit, etc.)
>>>>>>>         - Change shared_bitmap_size from uint32_t to unsigned
>>>>>>>         - Return mgr->mr->ram_block->page_size in get_block_size()
>>>>>>>         - Move set_ram_discard_manager() up to avoid a g_free() in
>>>>>>> failure
>>>>>>>           case.
>>>>>>>         - Add const for the memory_attribute_manager_get_block_size()
>>>>>>>         - Unify the ReplayRamPopulate and ReplayRamDiscard and related
>>>>>>>           callback.
>>>>>>>
>>>>>>> Changes in v2:
>>>>>>>         - Rename the object name to MemoryAttributeManager
>>>>>>>         - Rename the bitmap to shared_bitmap to make it more clear.
>>>>>>>         - Remove block_size field and get it from a helper. In
>>>>>>> future, we
>>>>>>>           can get the page_size from RAMBlock if necessary.
>>>>>>>         - Remove the unnecessary "struct" before GuestMemfdReplayData
>>>>>>>         - Remove the unnecessary g_free() for the bitmap
>>>>>>>         - Add some error report when the callback failure for
>>>>>>>           populated/discarded section.
>>>>>>>         - Move the realize()/unrealize() definition to this patch.
>>>>>>> ---
>>>>>>>      MAINTAINERS                  |   1 +
>>>>>>>      include/system/ramblock.h    |  20 +++
>>>>>>>      system/meson.build           |   1 +
>>>>>>>      system/ram-block-attribute.c | 311 ++++++++++++++++++++++++++++++
>>>>>>> +++++
>>>>>>>      4 files changed, 333 insertions(+)
>>>>>>>      create mode 100644 system/ram-block-attribute.c
>>>>>>>
>>>>>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>>>>>> index 6dacd6d004..3b4947dc74 100644
>>>>>>> --- a/MAINTAINERS
>>>>>>> +++ b/MAINTAINERS
>>>>>>> @@ -3149,6 +3149,7 @@ F: system/memory.c
>>>>>>>      F: system/memory_mapping.c
>>>>>>>      F: system/physmem.c
>>>>>>>      F: system/memory-internal.h
>>>>>>> +F: system/ram-block-attribute.c
>>>>>>>      F: scripts/coccinelle/memory-region-housekeeping.cocci
>>>>>>>        Memory devices
>>>>>>> diff --git a/include/system/ramblock.h b/include/system/ramblock.h
>>>>>>> index d8a116ba99..09255e8495 100644
>>>>>>> --- a/include/system/ramblock.h
>>>>>>> +++ b/include/system/ramblock.h
>>>>>>> @@ -22,6 +22,10 @@
>>>>>>>      #include "exec/cpu-common.h"
>>>>>>>      #include "qemu/rcu.h"
>>>>>>>      #include "exec/ramlist.h"
>>>>>>> +#include "system/hostmem.h"
>>>>>>> +
>>>>>>> +#define TYPE_RAM_BLOCK_ATTRIBUTE "ram-block-attribute"
>>>>>>> +OBJECT_DECLARE_SIMPLE_TYPE(RamBlockAttribute, RAM_BLOCK_ATTRIBUTE)
>>>>>>>        struct RAMBlock {
>>>>>>>          struct rcu_head rcu;
>>>>>>> @@ -42,6 +46,8 @@ struct RAMBlock {
>>>>>>>          int fd;
>>>>>>>          uint64_t fd_offset;
>>>>>>>          int guest_memfd;
>>>>>>> +    /* 1-setting of the bitmap in ram_shared represents ram is
>>>>>>> shared */
>>>>>>
>>>>>> That comment looks misplaced, and the variable misnamed.
>>>>>>
>>>>>> The comment should go into RamBlockAttribute and the variable should
>>>>>> likely be named "attributes".
>>>>>>
>>>>>> Also, "ram_shared" is not used at all in this patch, it should be
>>>>>> moved
>>>>>> into the corresponding patch.
>>>>>
>>>>> I thought we only manage the private and shared attribute, so name
>>>>> it as
>>>>> ram_shared. And in the future if managing other attributes, then rename
>>>>> it to attributes. It seems I overcomplicated things.
>>>>
>>>>
>>>> We manage populated vs discarded. Right now populated==shared but the
>>>> very next thing I will try doing is flipping this to populated==private.
>>>> Thanks,
>>>
>>> Can you elaborate your case why need to do the flip? populated and
>>> discarded are two states represented in the bitmap, is it workable to
>>> just call the related handler based on the bitmap?
>>
>>
>> Due to lack of inplace memory conversion in upstream linux, this is the
>> way to allow DMA for TDISP devices. So I'll need to make
>> populated==private opposite to the current populated==shared (+change
>> the kernel too, of course). Not sure I'm going to push real hard though,
>> depending on the inplace private/shared memory conversion work. Thanks,
> 
> Do you mean to operate only on private mapping? This is workable if you
> don't want to manipulate shared mapping. But if you want both,

But I do not want both at the moment, as I only have a big knob to make all DMA traffic either private or shared but not both (well, I can split the guest RAM into 2 halves by some bar address, but that's it).
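
To illustrate the "big knob", here is a minimal sketch (not QEMU code; Polarity, SharedBitmap and page_is_populated() are hypothetical names made up for this example). It assumes the bitmap keeps tracking shared pages as in this patch, and that one global setting decides whether a shared page is presented to the DMA side as populated or as discarded:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: which attribute the DMA side treats as "populated". */
typedef enum {
    POPULATED_IS_SHARED,   /* what this series does today */
    POPULATED_IS_PRIVATE,  /* the flip discussed above */
} Polarity;

#define NPAGES 64

/* Hypothetical stand-in for the per-RAMBlock bitmap: bit set == page shared. */
typedef struct {
    uint64_t shared_bits;  /* one bit per page, at most NPAGES pages */
} SharedBitmap;

static void set_shared(SharedBitmap *b, size_t page, bool shared)
{
    assert(page < NPAGES);
    if (shared) {
        b->shared_bits |= UINT64_C(1) << page;
    } else {
        b->shared_bits &= ~(UINT64_C(1) << page);
    }
}

static bool page_is_shared(const SharedBitmap *b, size_t page)
{
    assert(page < NPAGES);
    return (b->shared_bits >> page) & 1;
}

/* The bitmap itself never changes meaning; only its interpretation flips. */
static bool page_is_populated(const SharedBitmap *b, size_t page, Polarity pol)
{
    bool shared = page_is_shared(b, page);

    return pol == POPULATED_IS_SHARED ? shared : !shared;
}
```

With a single global polarity, a given page is populated under exactly one of the two settings, which is the "either private or shared but not both" limitation.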

> for
> example, to_private conversion needs to discard shared mapping and
> populate private mapping in IOMMU, it may be possible to pass in a
> parameter to indicate the current operation, allowing the listener
> callback to decide how to proceed. Or other mechanisms to extend it.

True. Thanks,
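
A rough sketch of the "pass in a parameter" idea, under the assumption that the notifier simply forwards the conversion direction and each listener decides what to do with it. None of these names (ConvertOp, ConvertListener, notify_all()) exist in QEMU; they are invented for illustration, and the counters stand in for real IOMMU map/unmap calls:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: direction of the conversion being notified. */
typedef enum { CONVERT_TO_SHARED, CONVERT_TO_PRIVATE } ConvertOp;

typedef struct ConvertListener ConvertListener;
struct ConvertListener {
    /* Returns 0 on success; the listener interprets 'op' itself. */
    int (*notify)(ConvertListener *l, ConvertOp op,
                  uint64_t offset, uint64_t size);
    int maps;    /* map requests this listener would have issued */
    int unmaps;  /* unmap requests this listener would have issued */
};

/* Listener that maintains shared mappings only. */
static int shared_only_notify(ConvertListener *l, ConvertOp op,
                              uint64_t offset, uint64_t size)
{
    (void)offset;
    (void)size;
    if (op == CONVERT_TO_SHARED) {
        l->maps++;
    } else {
        l->unmaps++;
    }
    return 0;
}

/* Listener that maintains private mappings only: the opposite reaction. */
static int private_only_notify(ConvertListener *l, ConvertOp op,
                               uint64_t offset, uint64_t size)
{
    (void)offset;
    (void)size;
    if (op == CONVERT_TO_PRIVATE) {
        l->maps++;
    } else {
        l->unmaps++;
    }
    return 0;
}

/* Forward one conversion to every listener, stopping at the first error. */
static int notify_all(ConvertListener **ls, size_t n, ConvertOp op,
                      uint64_t offset, uint64_t size)
{
    for (size_t i = 0; i < n; i++) {
        int ret = ls[i]->notify(ls[i], op, offset, size);

        if (ret) {
            return ret;
        }
    }
    return 0;
}
```

Here a to_private conversion makes the shared-only listener drop its mapping while the private-only listener creates one, without the notifier knowing which kind of listener it is talking to.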

> 
>>
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>>> +    RamBlockAttribute *ram_shared;
>>>>>>>          size_t page_size;
>>>>>>>          /* dirty bitmap used during migration */
>>>>>>>          unsigned long *bmap;
>>>>>>> @@ -91,4 +97,18 @@ struct RAMBlock {
>>>>>>>          ram_addr_t postcopy_length;
>>>>>>>      };
>>>>>>>      +struct RamBlockAttribute {
>>>>>>
>>>>>> Should this actually be "RamBlockAttributes" ?
>>>>>
>>>>> Yes. To match with variable name "attributes", it can be renamed as
>>>>> RamBlockAttributes.
>>>>>
>>>>>>
>>>>>>> +    Object parent;
>>>>>>> +
>>>>>>> +    MemoryRegion *mr;
>>>>>>
>>>>>>
>>>>>> Should we link to the parent RAMBlock instead, and lookup the MR from
>>>>>> there?
>>>>>
>>>>> Good suggestion! It can also help to reduce the long arrow operation in
>>>>> ram_block_attribute_get_block_size().
>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 

-- 
Alexey

