From: Chenyi Qiang <chenyi.qiang@intel.com>
To: "David Hildenbrand" <david@redhat.com>,
"Alexey Kardashevskiy" <aik@amd.com>,
"Peter Xu" <peterx@redhat.com>,
"Gupta Pankaj" <pankaj.gupta@amd.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Michael Roth" <michael.roth@amd.com>
Cc: <qemu-devel@nongnu.org>, <kvm@vger.kernel.org>,
Williams Dan J <dan.j.williams@intel.com>,
Zhao Liu <zhao1.liu@intel.com>,
Baolu Lu <baolu.lu@linux.intel.com>,
Gao Chao <chao.gao@intel.com>, Xu Yilun <yilun.xu@intel.com>,
Li Xiaoyao <xiaoyao.li@intel.com>
Subject: Re: [PATCH v5 10/10] ram-block-attribute: Add more error handling during state changes
Date: Mon, 26 May 2025 18:19:38 +0800 [thread overview]
Message-ID: <1674a16c-e4fb-4702-a21d-9c4923528b2f@intel.com> (raw)
In-Reply-To: <6b5957fa-8036-40b6-b79d-db5babb5f7b9@redhat.com>
On 5/26/2025 5:17 PM, David Hildenbrand wrote:
> On 20.05.25 12:28, Chenyi Qiang wrote:
>> The current error handling is simple with the following assumption:
>> - QEMU will quit instead of resuming the guest if kvm_convert_memory()
>> fails, thus no need to do rollback.
>> - The convert range is required to be in the desired state. It is not
>> allowed to handle the mixture case.
>> - The conversion from shared to private is a non-failure operation.
>>
>> This is sufficient for now as complext error handling is not required.
>> For future extension, add some potential error handling.
>> - For private to shared conversion, do the rollback operation if
>> ram_block_attribute_notify_to_populated() fails.
>> - For shared to private conversion, still assert it as a non-failure
>> operation for now. It could be an easy fail path with in-place
>> conversion, which will likely have to retry the conversion until it
>> works in the future.
>> - For mixture case, process individual blocks for ease of rollback.
>>
>> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
>> ---
>> system/ram-block-attribute.c | 116 +++++++++++++++++++++++++++--------
>> 1 file changed, 90 insertions(+), 26 deletions(-)
>>
>> diff --git a/system/ram-block-attribute.c b/system/ram-block-attribute.c
>> index 387501b569..0af3396aa4 100644
>> --- a/system/ram-block-attribute.c
>> +++ b/system/ram-block-attribute.c
>> @@ -289,7 +289,12 @@ static int
>> ram_block_attribute_notify_to_discard(RamBlockAttribute *attr,
>> }
>> ret = rdl->notify_discard(rdl, &tmp);
>> if (ret) {
>> - break;
>> + /*
>> + * The current to_private listeners (VFIO dma_unmap and
>> + * KVM set_attribute_private) are non-failing operations.
>> + * TODO: add rollback operations if it is allowed to fail.
>> + */
>> + g_assert(ret);
>> }
>> }
>>
>
> If it's not allowed to fail for now, then patch #8 does not make sense
> and should be dropped :)
It was intended for future extension as in-place conversion to_private
allows it to fail. So I add the patch #8.
But as you mentioned, since the conversion path is changing, and maybe
it is easier to handle from KVM code directly. Let me drop patch #8 and
wait for the in-place conversion to mature.
>
> The implementations (vfio) should likely exit() instead on unexpected
> errors when discarding.
After drop patch #8, maybe keep vfio discard handling as it was. Adding
an additional exit() is also OK to me since it's non-fail case.
>
>
>
> Why not squash all the below into the corresponding patch? Looks mostly
> like handling partial conversions correctly (as discussed previously)?
I extract these two handling 1) mixture conversion; 2) operation
rollback into this individual patch because they are not the practical
cases and are untested.
For 1), I still don't see any real case which will convert a range with
mixture attributes.
For 2), current failure of memory conversion (as seen in kvm_cpu_exec()
->kvm_convert_memory()) will cause the QEMU to quit instead of resuming
guest. Doing the rollback seems useless at present.
>
>> @@ -300,7 +305,7 @@ static int
>> ram_block_attribute_notify_to_populated(RamBlockAttribute *attr,
>> uint64_t offset, uint64_t size)
>> {
>> - RamDiscardListener *rdl;
>> + RamDiscardListener *rdl, *rdl2;
>> int ret = 0;
>> QLIST_FOREACH(rdl, &attr->rdl_list, next) {
>> @@ -315,6 +320,20 @@
>> ram_block_attribute_notify_to_populated(RamBlockAttribute *attr,
>> }
>> }
>> + if (ret) {
>> + /* Notify all already-notified listeners. */
>> + QLIST_FOREACH(rdl2, &attr->rdl_list, next) {
>> + MemoryRegionSection tmp = *rdl2->section;
>> +
>> + if (rdl == rdl2) {
>> + break;
>> + }
>> + if (!memory_region_section_intersect_range(&tmp, offset,
>> size)) {
>> + continue;
>> + }
>> + rdl2->notify_discard(rdl2, &tmp);
>> + }
>> + }
>> return ret;
>> }
>> @@ -353,6 +372,9 @@ int
>> ram_block_attribute_state_change(RamBlockAttribute *attr, uint64_t
>> offset,
>> const int block_size = ram_block_attribute_get_block_size(attr);
>> const unsigned long first_bit = offset / block_size;
>> const unsigned long nbits = size / block_size;
>> + const uint64_t end = offset + size;
>> + unsigned long bit;
>> + uint64_t cur;
>> int ret = 0;
>> if (!ram_block_attribute_is_valid_range(attr, offset, size)) {
>> @@ -361,32 +383,74 @@ int
>> ram_block_attribute_state_change(RamBlockAttribute *attr, uint64_t
>> offset,
>> return -1;
>> }
>> - /* Already discard/populated */
>> - if ((ram_block_attribute_is_range_discard(attr, offset, size) &&
>> - to_private) ||
>> - (ram_block_attribute_is_range_populated(attr, offset, size) &&
>> - !to_private)) {
>> - return 0;
>> - }
>> -
>> - /* Unexpected mixture */
>> - if ((!ram_block_attribute_is_range_populated(attr, offset, size) &&
>> - to_private) ||
>> - (!ram_block_attribute_is_range_discard(attr, offset, size) &&
>> - !to_private)) {
>> - error_report("%s, the range is not all in the desired state: "
>> - "(offset 0x%lx, size 0x%lx), %s",
>> - __func__, offset, size,
>> - to_private ? "private" : "shared");
>> - return -1;
>> - }
>> -
>> if (to_private) {
>> - bitmap_clear(attr->bitmap, first_bit, nbits);
>> - ret = ram_block_attribute_notify_to_discard(attr, offset, size);
>> + if (ram_block_attribute_is_range_discard(attr, offset, size)) {
>> + /* Already private */
>> + } else if (!ram_block_attribute_is_range_populated(attr, offset,
>> + size)) {
>> + /* Unexpected mixture: process individual blocks */
>> + for (cur = offset; cur < end; cur += block_size) {
>> + bit = cur / block_size;
>> + if (!test_bit(bit, attr->bitmap)) {
>> + continue;
>> + }
>> + clear_bit(bit, attr->bitmap);
>> + ram_block_attribute_notify_to_discard(attr, cur,
>> block_size);
>> + }
>> + } else {
>> + /* Completely shared */
>> + bitmap_clear(attr->bitmap, first_bit, nbits);
>> + ram_block_attribute_notify_to_discard(attr, offset, size);
>> + }
>> } else {
>> - bitmap_set(attr->bitmap, first_bit, nbits);
>> - ret = ram_block_attribute_notify_to_populated(attr, offset,
>> size);
>> + if (ram_block_attribute_is_range_populated(attr, offset,
>> size)) {
>> + /* Already shared */
>> + } else if (!ram_block_attribute_is_range_discard(attr,
>> offset, size)) {
>> + /* Unexpected mixture: process individual blocks */
>> + unsigned long *modified_bitmap = bitmap_new(nbits);
>> +
>> + for (cur = offset; cur < end; cur += block_size) {
>> + bit = cur / block_size;
>> + if (test_bit(bit, attr->bitmap)) {
>> + continue;
>> + }
>> + set_bit(bit, attr->bitmap);
>> + ret = ram_block_attribute_notify_to_populated(attr, cur,
>> + block_size);
>> + if (!ret) {
>> + set_bit(bit - first_bit, modified_bitmap);
>> + continue;
>> + }
>> + clear_bit(bit, attr->bitmap);
>> + break;
>> + }
>> +
>> + if (ret) {
>> + /*
>> + * Very unexpected: something went wrong. Revert to
>> the old
>> + * state, marking only the blocks as private that we
>> converted
>> + * to shared.
>> + */
>> + for (cur = offset; cur < end; cur += block_size) {
>> + bit = cur / block_size;
>> + if (!test_bit(bit - first_bit, modified_bitmap)) {
>> + continue;
>> + }
>> + assert(test_bit(bit, attr->bitmap));
>> + clear_bit(bit, attr->bitmap);
>> + ram_block_attribute_notify_to_discard(attr, cur,
>> + block_size);
>> + }
>> + }
>> + g_free(modified_bitmap);
>> + } else {
>> + /* Complete private */
>> + bitmap_set(attr->bitmap, first_bit, nbits);
>> + ret = ram_block_attribute_notify_to_populated(attr,
>> offset, size);
>> + if (ret) {
>> + bitmap_clear(attr->bitmap, first_bit, nbits);
>> + }
>> + }
>> }
>> return ret;
>
>
next prev parent reply other threads:[~2025-05-26 10:20 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-20 10:28 [PATCH v5 00/10] Enable shared device assignment Chenyi Qiang
2025-05-20 10:28 ` [PATCH v5 01/10] memory: Export a helper to get intersection of a MemoryRegionSection with a given range Chenyi Qiang
2025-05-20 10:28 ` [PATCH v5 02/10] memory: Change memory_region_set_ram_discard_manager() to return the result Chenyi Qiang
2025-05-26 8:40 ` David Hildenbrand
2025-05-27 6:56 ` Alexey Kardashevskiy
2025-05-20 10:28 ` [PATCH v5 03/10] memory: Unify the definiton of ReplayRamPopulate() and ReplayRamDiscard() Chenyi Qiang
2025-05-26 8:42 ` David Hildenbrand
2025-05-26 9:35 ` Philippe Mathieu-Daudé
2025-05-26 10:21 ` Chenyi Qiang
2025-05-27 6:56 ` Alexey Kardashevskiy
2025-05-20 10:28 ` [PATCH v5 04/10] ram-block-attribute: Introduce RamBlockAttribute to manage RAMBlock with guest_memfd Chenyi Qiang
2025-05-26 9:01 ` David Hildenbrand
2025-05-26 9:28 ` Chenyi Qiang
2025-05-26 11:16 ` Alexey Kardashevskiy
2025-05-27 1:15 ` Chenyi Qiang
2025-05-27 1:20 ` Alexey Kardashevskiy
2025-05-27 3:14 ` Chenyi Qiang
2025-05-27 6:06 ` Alexey Kardashevskiy
2025-05-20 10:28 ` [PATCH v5 05/10] ram-block-attribute: Introduce a helper to notify shared/private state changes Chenyi Qiang
2025-05-26 9:02 ` David Hildenbrand
2025-05-27 7:35 ` Alexey Kardashevskiy
2025-05-27 9:06 ` Chenyi Qiang
2025-05-27 9:19 ` Alexey Kardashevskiy
2025-05-20 10:28 ` [PATCH v5 06/10] memory: Attach RamBlockAttribute to guest_memfd-backed RAMBlocks Chenyi Qiang
2025-05-26 9:06 ` David Hildenbrand
2025-05-26 9:46 ` Chenyi Qiang
2025-05-20 10:28 ` [PATCH v5 07/10] RAMBlock: Make guest_memfd require coordinate discard Chenyi Qiang
2025-05-26 9:08 ` David Hildenbrand
2025-05-27 5:47 ` Chenyi Qiang
2025-05-27 7:42 ` Alexey Kardashevskiy
2025-05-27 8:12 ` Chenyi Qiang
2025-05-27 11:20 ` David Hildenbrand
2025-05-28 1:57 ` Chenyi Qiang
2025-05-20 10:28 ` [PATCH v5 08/10] memory: Change NotifyRamDiscard() definition to return the result Chenyi Qiang
2025-05-26 9:31 ` Philippe Mathieu-Daudé
2025-05-26 10:36 ` Cédric Le Goater
2025-05-26 12:44 ` Cédric Le Goater
2025-05-27 5:29 ` Chenyi Qiang
2025-05-20 10:28 ` [PATCH v5 09/10] KVM: Introduce RamDiscardListener for attribute changes during memory conversions Chenyi Qiang
2025-05-26 9:22 ` David Hildenbrand
2025-05-27 8:01 ` Alexey Kardashevskiy
2025-05-20 10:28 ` [PATCH v5 10/10] ram-block-attribute: Add more error handling during state changes Chenyi Qiang
2025-05-26 9:17 ` David Hildenbrand
2025-05-26 10:19 ` Chenyi Qiang [this message]
2025-05-26 12:10 ` David Hildenbrand
2025-05-26 12:39 ` Chenyi Qiang
2025-05-27 9:11 ` Alexey Kardashevskiy
2025-05-27 10:18 ` Chenyi Qiang
2025-05-27 11:21 ` David Hildenbrand
2025-05-26 11:37 ` [PATCH v5 00/10] Enable shared device assignment Cédric Le Goater
2025-05-26 12:16 ` Chenyi Qiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1674a16c-e4fb-4702-a21d-9c4923528b2f@intel.com \
--to=chenyi.qiang@intel.com \
--cc=aik@amd.com \
--cc=baolu.lu@linux.intel.com \
--cc=chao.gao@intel.com \
--cc=dan.j.williams@intel.com \
--cc=david@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=michael.roth@amd.com \
--cc=pankaj.gupta@amd.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=philmd@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=xiaoyao.li@intel.com \
--cc=yilun.xu@intel.com \
--cc=zhao1.liu@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).