From: Halil Pasic <pasic@linux.ibm.com>
To: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, pmorel@linux.ibm.com,
alex.williamson@redhat.com, cohuck@redhat.com,
kwankhede@nvidia.com, borntraeger@de.ibm.com
Subject: Re: [PATCH] s390/vfio-ap: fix unregister GISC when KVM is already gone results in OOPS
Date: Sat, 26 Sep 2020 02:56:01 +0200
Message-ID: <20200926025601.2ad52b77.pasic@linux.ibm.com>
In-Reply-To: <3795bc75-9d5e-2098-fd18-f1cbaef9c290@linux.ibm.com>
On Fri, 25 Sep 2020 18:29:16 -0400
Tony Krowiak <akrowiak@linux.ibm.com> wrote:
>
>
> On 9/21/20 11:45 AM, Halil Pasic wrote:
> > On Fri, 18 Sep 2020 13:02:34 -0400
> > Tony Krowiak <akrowiak@linux.ibm.com> wrote:
> >
> >> Attempting to unregister Guest Interruption Subclass (GISC) when the
> >> link between the matrix mdev and KVM has been removed results in the
> >> following:
> >>
> >> "Kernel panic - not syncing: Fatal exception: panic_on_oops"
> >>
> >> This patch fixes this bug by verifying the matrix mdev and KVM are still
> >> linked prior to unregistering the GISC.
> >
> > I read from your commit message that this happens when the link between
> > the KVM and the matrix mdev was established and then got severed.
> >
> > I assume the interrupts were previously enabled, and have not been
> > disabled or cleaned up because q->saved_isc != VFIO_AP_ISC_INVALID.
> >
> > That means the guest enabled interrupts and then for whatever
> > reason got destroyed, and this happens on mdev cleanup.
> >
> > Does it happen all the time or is it some sort of a race?
>
> This is a race condition that happens when a guest is terminated and the
> mdev is removed in rapid succession. I came across it with one of my
> hades test cases on cleanup of the resources after the test case
> completes. There is a bug in the vfio_ap_mdev_release function because
> it tries to reset the APQNs after the bits are cleared from the
> matrix_mdev.matrix, so the resets never happen.
>
That sounds very strange. I couldn't find the place where we clear the
bits in matrix_mdev.matrix except for unassign. Currently, unassign
is supposed to be enabled only after we have no guest and we have
cleaned up the queues (which should restore VFIO_AP_ISC_INVALID). Does
your test do any unassign operations? (I'm not sure that we always do
what we are supposed to.)

Now, if we did not clear the bits from matrix_mdev.matrix, then this
could be a use-after-free scenario (where we interpret already
re-purposed memory as matrix_mdev.matrix).

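To make the ordering you describe concrete, here is a minimal user-space
sketch in C. It is only a model under stated assumptions: every name in it
(model_mdev, model_release, MODEL_ISC_INVALID, and so on) is hypothetical and
none of it is the vfio_ap driver code. It just shows that a reset loop driven
by the assignment bits does nothing once those bits are gone, so the saved
ISC is never restored to the invalid value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NQUEUES     4     /* hypothetical, tiny queue count */
#define MODEL_ISC_INVALID 0xff  /* stands in for VFIO_AP_ISC_INVALID */

/* Hypothetical model: one assignment bit and one saved ISC per queue. */
struct model_mdev {
	bool assigned[MODEL_NQUEUES];          /* stands in for matrix_mdev.matrix */
	unsigned char saved_isc[MODEL_NQUEUES];
};

/* Reset only the queues that are still marked as assigned. */
static int model_reset_queues(struct model_mdev *m)
{
	int nreset = 0;

	for (size_t i = 0; i < MODEL_NQUEUES; i++) {
		if (!m->assigned[i])
			continue;
		m->saved_isc[i] = MODEL_ISC_INVALID;
		nreset++;
	}
	return nreset;
}

/* Drop all assignment bits. */
static void model_clear_matrix(struct model_mdev *m)
{
	for (size_t i = 0; i < MODEL_NQUEUES; i++)
		m->assigned[i] = false;
}

/* Count queues whose saved ISC was never restored to the invalid value. */
static int model_stale_queues(const struct model_mdev *m)
{
	int nstale = 0;

	for (size_t i = 0; i < MODEL_NQUEUES; i++)
		if (m->saved_isc[i] != MODEL_ISC_INVALID)
			nstale++;
	return nstale;
}

/* Release in either order and report how many queues are left stale. */
static int model_release(struct model_mdev *m, bool clear_first)
{
	assert(m != NULL);

	if (clear_first) {
		model_clear_matrix(m);
		model_reset_queues(m);   /* finds no assigned queue */
	} else {
		model_reset_queues(m);
		model_clear_matrix(m);
	}
	return model_stale_queues(m);
}
```

In this model, clearing first leaves every previously enabled queue stale,
while resetting first leaves none. Whether the driver really clears the bits
on that path is exactly what I could not confirm from the code.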
> Fixing that, however, does not resolve the issue, so I'm in the process
> of doing a bunch of tracing to see the flow of the resets etc. during
> the lifecycle of the mdev during this hades test. I should have a
> better answer next week.
>
My takeaway is that we don't understand what exactly is going wrong, and
so this patch is at best a mitigation (not a real fix). Does that sound
about correct?

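For reference, this is roughly how I picture the check the patch adds,
written as a small user-space model in C. All types and names (guard_kvm,
guard_mdev, guard_queue, guard_free_aqic) are hypothetical stand-ins, and
the gisc_refs counter merely imitates a registration; it is not the actual
patch or the KVM GISC API.

```c
#include <stdbool.h>
#include <stddef.h>

#define GUARD_ISC_INVALID 0xff  /* stands in for VFIO_AP_ISC_INVALID */

/* Hypothetical stand-ins for the KVM, matrix mdev and queue objects. */
struct guard_kvm {
	int gisc_refs;              /* imitates a registered GISC reference */
};

struct guard_mdev {
	struct guard_kvm *kvm;      /* NULL once the link to KVM is severed */
};

struct guard_queue {
	struct guard_mdev *matrix_mdev;
	unsigned char saved_isc;
};

/*
 * Unregister the GISC only while the mdev is still linked to a KVM.
 * Returns true if an unregistration was actually performed.
 */
static bool guard_free_aqic(struct guard_queue *q)
{
	bool unregistered = false;

	if (q->saved_isc != GUARD_ISC_INVALID &&
	    q->matrix_mdev && q->matrix_mdev->kvm) {
		q->matrix_mdev->kvm->gisc_refs--;
		unregistered = true;
	}

	/* The saved ISC is invalidated whether or not KVM was still there. */
	q->saved_isc = GUARD_ISC_INVALID;
	return unregistered;
}
```

A guard like this avoids dereferencing a KVM pointer that is already gone,
but it says nothing about why the saved ISC was still valid at that point,
which is why I would call it a mitigation.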
Regards,
Halil
[..]