qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: "Duan, Zhenzhong" <zhenzhong.duan@intel.com>
Cc: Steven Sistare <steven.sistare@oracle.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"clg@redhat.com" <clg@redhat.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	Markus Armbruster <armbru@redhat.com>
Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute "query-balloon" after CPR transfer
Date: Fri, 10 Oct 2025 09:39:03 +0100	[thread overview]
Message-ID: <aOjGEPHsSCc0ULma@redhat.com> (raw)
In-Reply-To: <IA3PR11MB9136738C2CE19566F4543DFE92EFA@IA3PR11MB9136.namprd11.prod.outlook.com>

On Fri, Oct 10, 2025 at 03:54:42AM +0000, Duan, Zhenzhong wrote:
> 
> 
> >-----Original Message-----
> >From: Daniel P. Berrangé <berrange@redhat.com>
> >Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute
> >"query-balloon" after CPR transfer
> >
> >On Thu, Oct 09, 2025 at 10:32:34AM -0400, Steven Sistare wrote:
> >> On 10/9/2025 10:19 AM, Daniel P. Berrangé wrote:
> >> > On Thu, Oct 09, 2025 at 10:11:17AM -0400, Steven Sistare wrote:
> >> > > On 10/8/2025 6:22 AM, Duan, Zhenzhong wrote:
> >> > > >
> >> > > >
> >> > > > > -----Original Message-----
> >> > > > > From: Steven Sistare <steven.sistare@oracle.com>
> >> > > > > Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute
> >> > > > > "query-balloon" after CPR transfer
> >> > > > >
> >> > > > > On 9/30/2025 2:00 AM, Duan, Zhenzhong wrote:
> >> > > > > > > -----Original Message-----
> >> > > > > > > From: Steven Sistare <steven.sistare@oracle.com>
> >> > > > > > > Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when
> >execute
> >> > > > > > > "query-balloon" after CPR transfer
> >> > > > > > >
> >> > > > > > > On 9/28/2025 4:54 AM, Zhenzhong Duan wrote:
> >> > > > > > > > After CPR transfer, source QEMU closes kvm fd and sets
> >kvm_state to
> >> > > > > > > NULL,
> >> > > > > > > > "query-balloon" will check kvm_state->sync_mmu and trigger
> >NULL
> >> > > > > pointer
> >> > > > > > > > reference.
> >> > > > > > > >
> >> > > > > > > > We don't need to NULL kvm_state as all states in kvm_state
> >aren't
> >> > > > > released
> >> > > > > > > > actually. Just closing kvm fd is enough so we could still query
> >states
> >> > > > > > > > through "query_*" qmp command.
> >> > > > > > >
> >> > > > > > > IMO this does not make sense.  Much of the state in kvm_state
> >was
> >> > > > > derived
> >> > > > > > >from ioctl's on the descriptors, and closing them invalidates it.
> >Asking
> >> > > > > > > historical questions about what used to be makes no sense.
> >> > > > > >
> >> > > > > > You also have your valid point.
> >> > > > > >
> >> > > > > > >
> >> > > > > > > Clearing kvm_state and setting kvm_allowed=false would be a
> >safer fix.
> >> > > > >
> >> > > > > Setting kvm_allowed=false causes kvm_enabled() to return false
> >which should
> >> > > > > prevent kvm_state from being dereferenced anywhere:
> >> > > > >
> >> > > > >       #define kvm_enabled()           (kvm_allowed)
> >> > > > >
> >> > > > >     Eg for the balloon:
> >> > > > >
> >> > > > >       static bool have_balloon(Error **errp)
> >> > > > >       {
> >> > > > >           if (kvm_enabled() && !kvm_has_sync_mmu()) {
> >> > > >
> >> > > > OK, will do, clearing kvm_allowed is a good idea.
> >> > > >
> >> > > > Currently there are two qmp commands "query-balloon" and
> >"query-cpu-definitions"
> >> > > > causing SIGSEGV after CPR-transfer, clearing kvm_allowed helps for
> >both.
> >> > > > But clearing both kvm_allowed and kvm_state cause SIGSEGV on
> >"query-cpu-definitions".
> >> > > >
> >> > > > I'll send a patch to clearing only kvm_allowed if you are fine with it.
> >> > >
> >> > > I still don't love the idea.  kvm_state is no longer valid.
> >> > >
> >> > > It sounds like "query-cpu-definitions" is missing a check for
> >kvm_enabled().
> >> >
> >> > This patch  / bug feels like it is side-stepping a general conceptual
> >> > question.  After cpr-transfer is complete what QMP commands are still
> >> > valid to call in general ? This thread mentions two which have caused
> >> > bugs, but beyond that it feels like a large subset of QMP comamnds
> >> > are conceptually invalid to use.
> >>
> >> Agreed. I don't see why these commands should be issued to the dead
> >instance.
> >> How about migration status, quit, and nothing else?
> >> Half serious.
> >
> >I was hoping you'd suggest such a short list :-)
> >
> >Essentially a case of calling qmp_for_each_command(), and in the callback
> >run qmp_disable_command() for everything not in the allow-list.
> 
> Thanks for your suggestion, I agree this is a perfect scheme, but is also heavy on this corner case😊
> Just curious if we need to do same for live migration. E.g., send qmp command on source qemu after live migration.

At the end of a normal migration, there is no functional reason to prevent
use of any QMP command.  Indeed we need to have the ability to carry on
using the orignal source QEMU, in case the final steps on migration on
the target fail and we need to restart the source.

The cpr-transfer is unusual in the restrictions it has after completion.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



  reply	other threads:[~2025-10-10  8:39 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-28  8:54 [PATCH v2 0/6] VFIO: cpr-transfer fixes Zhenzhong Duan
2025-09-28  8:54 ` [PATCH v2 1/6] vfio/container: Remap only populated parts in a section Zhenzhong Duan
2025-09-28  8:54 ` [PATCH v2 2/6] vfio/cpr-legacy: drop an erroneous assert Zhenzhong Duan
2025-09-28  8:54 ` [PATCH v2 3/6] vfio/iommufd: Set cpr.ioas_id on source side for CPR transfer Zhenzhong Duan
2025-09-28  8:54 ` [PATCH v2 4/6] vfio/iommufd: Restore vbasedev's reference to hwpt after " Zhenzhong Duan
2025-09-28  8:54 ` [PATCH v2 5/6] accel/kvm: Fix an erroneous check on coalesced_mmio_ring Zhenzhong Duan
2025-09-29 13:34   ` Steven Sistare
2025-09-28  8:54 ` [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute "query-balloon" after CPR transfer Zhenzhong Duan
2025-09-29 13:54   ` Steven Sistare
2025-09-30  6:00     ` Duan, Zhenzhong
2025-09-30 12:52       ` Steven Sistare
2025-10-08 10:22         ` Duan, Zhenzhong
2025-10-09 14:11           ` Steven Sistare
2025-10-09 14:19             ` Daniel P. Berrangé
2025-10-09 14:32               ` Steven Sistare
2025-10-09 14:36                 ` Daniel P. Berrangé
2025-10-10  3:54                   ` Duan, Zhenzhong
2025-10-10  8:39                     ` Daniel P. Berrangé [this message]
2025-10-10  3:45                 ` Duan, Zhenzhong
2025-10-10  3:39             ` Duan, Zhenzhong
2025-10-07 15:39 ` [PATCH v2 0/6] VFIO: cpr-transfer fixes Cédric Le Goater

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aOjGEPHsSCc0ULma@redhat.com \
    --to=berrange@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=armbru@redhat.com \
    --cc=clg@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=steven.sistare@oracle.com \
    --cc=zhenzhong.duan@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).