Re: [PATCH v3 2/2] s390/vfio-ap: control access to PQAP(AQIC) interception handler


From: Jason Gunthorpe <jgg@nvidia.com>
To: Halil Pasic <pasic@linux.ibm.com>
Cc: Tony Krowiak <akrowiak@linux.ibm.com>,
	linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
	borntraeger@de.ibm.com, cohuck@redhat.com,
	pasic@linux.vnet.ibm.com, jjherne@linux.ibm.com,
	alex.williamson@redhat.com, kwankhede@nvidia.com
Subject: Re: [PATCH v3 2/2] s390/vfio-ap: control access to PQAP(AQIC) interception handler
Date: Thu, 20 May 2021 09:26:26 -0300
Message-ID: <20210520122626.GW1002214@nvidia.com>
In-Reply-To: <20210520104857.65d75858.pasic@linux.ibm.com>

On Thu, May 20, 2021 at 10:48:57AM +0200, Halil Pasic wrote:
> On Wed, 19 May 2021 21:08:15 -0400
> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
> 
> > >
> > > This is nonsense too:
> > >
> > > 	if (vcpu->kvm->arch.crypto.pqap_hook) {
> > > 		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
> > > 			return -EOPNOTSUPP;
> > > 		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
> > >
> > > It should have a lock around it of some kind, not a
> > > try_module_get. module_get is not a lock.
> > 
> > As I said earlier, I don't know why the author did this. 
> 
> Please have a look at these links from the archive to get some
> perspective:
> https://lkml.org/lkml/2020/12/4/994
> https://lkml.org/lkml/2020/12/3/987
> https://www.lkml.org/lkml/2019/3/1/260
> 
> We can ask the original author, but I don't think we have to. BTW the
> patch that introduced it has your r-b.
> 
> > My best guess
> > is that he wanted to ensure that the module was still loaded; otherwise,
> > the data structures contained therein - for example, the pqap_hook
> > and matrix_mdev that contains it - would be gonzo.
> 
> More precisely prevent the module from unloading while we execute code
> from it. As I've pointed out in a previous email the module may be gone
> by the time we call try_module_get().

No, this is a common misconception.

A module_get prevents an unload of the module from even being
attempted. Code should acquire one if it has done something that would
cause the module's remove function to hang indefinitely, such as a
design that waits for a user FD to close.

This provides a good user experience but should generally not be
required for correctness.
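
A minimal userspace sketch of that behaviour, assuming a simplified
model of the module reference count. The names fake_module,
fake_try_get, fake_put and fake_try_unload are hypothetical and are not
kernel APIs. It shows that a held reference makes an unload request
fail immediately instead of waiting, and that a get fails once unload
has begun. Nothing in it says whether a hook pointer or its data is
still valid, which is why it is not a lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a module's reference count; not kernel code. */
struct fake_module {
	atomic_int refcnt;	/* >= 0: live, that many refs; -1: unloading */
};

/* Analogue of try_module_get(): fails once unload has begun. */
static bool fake_try_get(struct fake_module *m)
{
	int old = atomic_load(&m->refcnt);

	while (old >= 0) {
		/* On failure 'old' is reloaded and the state is rechecked. */
		if (atomic_compare_exchange_weak(&m->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Analogue of module_put(): drops one reference. */
static void fake_put(struct fake_module *m)
{
	int old = atomic_fetch_sub(&m->refcnt, 1);

	assert(old > 0);	/* a put without a matching get is a bug */
}

/*
 * Analogue of an unload request: refused outright while references are
 * held. It never waits for callers to finish, so it fences nothing.
 */
static bool fake_try_unload(struct fake_module *m)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&m->refcnt, &expected, -1);
}
```

With one reference held, fake_try_unload() returns false; after
fake_put() it succeeds, and any later fake_try_get() fails.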

All code passing function pointers across subsystems should always
fully fence those function pointers during removal. This means the
removal interacts with some kind of locking that guarantees nothing is
currently calling those function pointers and nothing ever will call
them again.

This is not just to protect the function pointer code itself; the
lock should also protect the data that the function pointer almost
always accesses. This is the bug here: ap is accessing the matrix_dev
data from a function pointer without any locking or serialization
against kfree(matrix_dev). Fencing to guarantee the hook isn't running
and won't run again also serves as a strong enough serialization to
allow the kfree().

The basic logic is that a module removal cannot complete until all
its function pointers have been removed from everywhere and all the
locking that protects those removals is satisfied.

Jason
