From: Peter Xu <peterx@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: "Michael Tokarev" <mjt@tls.msk.ru>,
	"Fiona Ebner" <f.ebner@proxmox.com>,
	"Leonardo Bras" <leobras@redhat.com>,
	"Eduardo Habkost" <eduardo@habkost.net>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Yanan Wang" <wangyanan55@huawei.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH v1 1/1] hw/pci: Disable PCI_ERR_UNCOR_MASK register for machine type < 8.0
Date: Thu, 18 May 2023 09:27:01 -0400
Message-ID: <ZGYnpQmc+5Sut3x8@x1n>
In-Reply-To: <87wn15dgbs.fsf@secure.mitica>

On Thu, May 18, 2023 at 01:33:43PM +0200, Juan Quintela wrote:
> Michael Tokarev <mjt@tls.msk.ru> wrote:
> > 11.05.2023 11:40, Juan Quintela wrote:
> >> Fiona Ebner <f.ebner@proxmox.com> wrote:
> > ...
> >>> Closes: https://gitlab.com/qemu-project/qemu/-/issues/1576
> >>>
> >>> AFAICT, this breaks (forward) migration from 8.0 to 8.0 + this patch
> >>> when using machine type <= 7.2. That is because after this patch, when
> >>> using machine type <= 7.2, the wmask for the register is not set and
> >>> when 8.0 sends a nonzero value for the register, the error condition in
> >>> get_pci_config_device() will trigger again.
> >> I think that works correctly.
> >> See https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg02733.html
> >> What we have (before this patch), using abbrevs as in the doc before:
> >> Current state:
> >> (1) qemu-8.0 -M pc-8.0 -> qemu-8.0 -M pc-8.0 works
> >>      not affected by the patch
> >> (2) qemu-7.2 -M pc-7.2 -> qemu-8.0 -M pc-8.0 works
> >>      works well because 7.2 doesn't change that field
> >> (3) qemu-8.0 -M pc-7.2 -> qemu-7.2 -M pc-7.2 fails
> >> With the patch we fixed 3, so once it is in stable, 1 and 2 continue
> >> as usual and for (3) we will have:
> >> (3) qemu-8.0.1 -M pc-7.2 -> qemu-7.2 -M pc-7.2 works
> >> If what you mean is that:
> >> (3) qemu-8.0 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2 works
> >> Will fail, that is true, but I can think a "sane" way to fix this.
> 
> Hi
> 
> > That's a great summary indeed.
> 
> Thanks.
> 
> >>> Is it necessary to also handle that? Maybe by special casing the error
> >>> condition in get_pci_config_device() to be prepared to accept such a
> >>> stream from 8.0?
> >> Well, we can do that, but it is up to the pci maintainers to decide
> >> if that is "sane".
> >
> > So, can we go from here somewhere? I'd love this fix to be in 8.0.1,
> > either with or without the (un)sane part of the (3) variant above which
> > might fail.  Or else we'll have the same situation in 8.0.1 as we now
> > have in 8.0.0 (the deadline is May-27).
> >
> > We did break x.y.0 => x.y.1 migration before already like this, such as
> > with 7.2.0=>7.2.1. I'm not saying it's a nice thing to do, just stating
> > a fact. Yes, it is better to avoid such breakage, but.. meh..
> 
> See patch for documentation:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03288.html
> 
> Basically, the best we can do is:
> - get the patch posted.  Fixes everything except:
>   (3) qemu-8.0 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2 works
> 
> And for that, we can document somewhere that we need to launch
> qemu-8.0.1 as:
> 
> $ qemu-8.0.1 -M pc-7.2 -device blah,x-pci-err-unc-mask=on
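
To make the failing case concrete, here is a rough model of the per-byte
config check as I understand it from the discussion above. It is a sketch,
not the QEMU code: the function name, the one-byte config space and the
values are made up for illustration, and the exact mask expression is my
assumption about what get_pci_config_device() does.

```python
def bad_config_bytes(incoming, local, cmask, wmask, w1cmask):
    """Return the offsets where the incoming config space would be rejected.

    Simplified model (assumption): a differing bit is an error only if it
    is compared (cmask) and is neither guest-writable (wmask) nor
    write-1-to-clear (w1cmask).
    """
    bad = []
    for i, (new, cur) in enumerate(zip(incoming, local)):
        if (new ^ cur) & cmask[i] & ~wmask[i] & ~w1cmask[i] & 0xFF:
            bad.append(i)
    return bad


# One byte standing in for part of PCI_ERR_UNCOR_MASK (hypothetical values).
incoming = [0x40]  # nonzero value sent by the 8.0 source
local = [0x00]     # freshly initialised device on the destination
cmask = [0xFF]

# Destination where the register is not writable: the stream is rejected.
rejected = bad_config_bytes(incoming, local, cmask, wmask=[0x00], w1cmask=[0x00])

# Destination started with x-pci-err-unc-mask=on, modelled here as the
# register being writable: the same stream is accepted.
accepted = bad_config_bytes(incoming, local, cmask, wmask=[0xFF], w1cmask=[0x00])

print(rejected, accepted)
```

In this model the only difference between the two destinations is the
wmask byte, which is why starting the destination with the property set
would let the 8.0 stream load.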

One thing we can also do to avoid this in the future is to have someone run
this check around each softfreeze, rather than after the release (which is
what I did this time, and that was too late), and try to cover as many
devices as possible.  Maintainers would also need to be careful about
merging anything risky after softfreeze.  I don't know whether there's a
way to always cover all devices.

I'll volunteer for that, as long as I remember.  Juan, please also check,
or remind me if I forget. :)

I am not sure whether I mentioned it before, but it might help to have some
way to check migrating each vmsd from old-qemu to new-qemu (and also
new->old), covering all devices.  It doesn't need to be a real migration:
just generate the per-device stream and try loading it on the other binary.
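
Very roughly, the loop I have in mind looks like the sketch below.  The
dump_stream/load_stream hooks are hypothetical (I'm not aware of any such
interface in QEMU today), and the fake_* helpers and device names exist
only so that the example runs on its own.

```python
from itertools import permutations


def check_matrix(devices, binaries, dump_stream, load_stream):
    """Try every device across every ordered (src, dst) pair of binaries.

    dump_stream(binary, device) -> bytes and
    load_stream(binary, device, stream) -> bool are hypothetical hooks.
    Returns a list of (device, src, dst) tuples that failed to load.
    """
    failures = []
    for device in devices:
        for src, dst in permutations(binaries, 2):
            stream = dump_stream(src, device)
            if not load_stream(dst, device, stream):
                failures.append((device, src, dst))
    return failures


# Toy stand-ins; a real harness would drive the two QEMU binaries instead.
def fake_dump(binary, device):
    return f"{device}@{binary}".encode()


def fake_load(binary, device, stream):
    # Pretend that "old" cannot load the stream "new" generates for dev-b.
    return not (device == "dev-b" and binary == "old"
                and stream.endswith(b"@new"))


failures = check_matrix(["dev-a", "dev-b"], ["old", "new"],
                        fake_dump, fake_load)
print(failures)
```

Using ordered pairs covers both old->new and new->old for each device; the
hard part is of course the two hooks, not the loop.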

It might be overkill to run this in CI for each commit, but if there's some
way to do the check then at least we can also run it after softfreeze.  I
don't know whether it'll be easy to achieve at all, but I'll think more
about it and follow up if I find something useful.

> 
> And mark somewhere that this machine is tainted and can only be migrated
> to qemu's >= qemu-8.0.1.  And that we should reboot it at the user's
> convenience. (reboot here means poweroff qemu and poweron it back
> without x-pci-err-unc-mask=on).
> 
> Later, Juan.
> 
> 
> 
> 

-- 
Peter Xu


