public inbox for linux-pci@vger.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Leon Romanovsky <leonro@nvidia.com>
Cc: "Kuppuswamy Sathyanarayanan"
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	"Jim Harris" <jim.harris@samsung.com>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Jason Gunthorpe" <jgg@nvidia.com>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Pierre Crégut" <pierre.cregut@orange.com>
Subject: Re: [PATCH v2 1/2] PCI/IOV: Revert "PCI/IOV: Serialize sysfs sriov_numvfs reads vs writes"
Date: Mon, 12 Feb 2024 14:27:14 -0600	[thread overview]
Message-ID: <20240212202714.GA1142983@bhelgaas> (raw)
In-Reply-To: <20240211084844.GA805332@unreal>

On Sun, Feb 11, 2024 at 10:48:44AM +0200, Leon Romanovsky wrote:
> On Fri, Feb 09, 2024 at 07:20:28PM -0800, Kuppuswamy Sathyanarayanan wrote:
> > On 2/9/24 3:52 PM, Jim Harris wrote:
> > > If an SR-IOV enabled device is held by vfio, and the device is removed,
> > > vfio will hold device lock and notify userspace of the removal. If
> > > userspace reads the sriov_numvfs sysfs entry, that thread will be blocked
> > > since sriov_numvfs_show() also tries to acquire the device lock. If that
> > > same thread is responsible for releasing the device to vfio, it results in
> > > a deadlock.
> > >
> > > The proper way to detect a change to the num_VFs value is to listen for a
> > > sysfs event, not to add a device_lock() on the attribute _show() in the
> > > kernel.

The lock was not about detecting a change; Pierre did this:

  ip monitor dev ${DEVICE} | grep --line-buffered "^${id}:" | while read line; do \
    cat ${path}/device/sriov_numvfs; \

which I assume works by listening for sysfs events.  The problem was
that after the event occurred, the sriov_numvfs read got a stale value
(see https://bugzilla.kernel.org/show_bug.cgi?id=202991).

So I would drop this sentence because I don't think it accurately
reflects the reason for 35ff867b7657.

> > Since you are reverting a commit that synchronizes SysFS read
> > /write, please add some comments about why it is not an
> > issue anymore.
> 
> It was never an issue, the idea that sysfs read and write should be
> serialized by kernel is not correct by definition. 

I think it *was* an issue.  The behavior Pierre observed at was
clearly wrong, and we added 35ff867b7657 ("PCI/IOV: Serialize sysfs
sriov_numvfs reads vs writes") to resolve it.

We should try to avoid reintroducing the problem, so I think we should
probably squash these two patches and describe it as a deadlock fix
instead of dismissing 35ff867b7657 as being based on false premises.

It would be awesome if you had time to verify that these patches also
resolve the problem you saw, Pierre.

I think we should also add:

  Fixes: 35ff867b7657 ("PCI/IOV: Serialize sysfs sriov_numvfs reads vs writes")

as a trigger for backporting this to kernels that include
35ff867b7657.

Bjorn

> > > This reverts commit 35ff867b76576e32f34c698ccd11343f7d616204.
> > > Revert had a small conflict, the sprintf() is now changed to sysfs_emit().
> > >
> > > Link: https://lore.kernel.org/linux-pci/ZXJI5+f8bUelVXqu@ubuntu/
> > > Suggested-by: Leon Romanovsky <leonro@nvidia.com>
> > > Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> > > Signed-off-by: Jim Harris <jim.harris@samsung.com>
> > > ---
> > >  drivers/pci/iov.c |    8 +-------
> > >  1 file changed, 1 insertion(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> > > index aaa33e8dc4c9..0ca20cd518d5 100644
> > > --- a/drivers/pci/iov.c
> > > +++ b/drivers/pci/iov.c
> > > @@ -395,14 +395,8 @@ static ssize_t sriov_numvfs_show(struct device *dev,
> > >  				 char *buf)
> > >  {
> > >  	struct pci_dev *pdev = to_pci_dev(dev);
> > > -	u16 num_vfs;
> > > -
> > > -	/* Serialize vs sriov_numvfs_store() so readers see valid num_VFs */
> > > -	device_lock(&pdev->dev);
> > > -	num_vfs = pdev->sriov->num_VFs;
> > > -	device_unlock(&pdev->dev);
> > >  
> > > -	return sysfs_emit(buf, "%u\n", num_vfs);
> > > +	return sysfs_emit(buf, "%u\n", pdev->sriov->num_VFs);
> > >  }
> > >  
> > >  /*
> > >
> > -- 
> > Sathyanarayanan Kuppuswamy
> > Linux Kernel Developer
> > 

  parent reply	other threads:[~2024-02-12 20:27 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20240209235208uscas1p26c658c64cc85711cd3aa6312224164fc@uscas1p2.samsung.com>
2024-02-09 23:52 ` [PATCH v2 0/2] PCI/IOV: sriov_numvfs bug fixes Jim Harris
2024-02-09 23:52   ` [PATCH v2 1/2] PCI/IOV: Revert "PCI/IOV: Serialize sysfs sriov_numvfs reads vs writes" Jim Harris
2024-02-10  3:20     ` Kuppuswamy Sathyanarayanan
2024-02-11  8:48       ` Leon Romanovsky
2024-02-11 19:15         ` Kuppuswamy Sathyanarayanan
2024-02-12  9:31           ` Leon Romanovsky
2024-02-12 20:27         ` Bjorn Helgaas [this message]
2024-02-12 22:59           ` Jim Harris
2024-02-13  7:37             ` Leon Romanovsky
2024-02-13  9:40             ` pierre.cregut
2024-02-13 14:59             ` Jason Gunthorpe
2024-02-13  7:34           ` Leon Romanovsky
2024-02-13 15:59             ` Bjorn Helgaas
2024-02-13 17:46               ` Leon Romanovsky
2024-02-13 18:00                 ` Kuppuswamy Sathyanarayanan
2024-02-13 19:45                 ` Bjorn Helgaas
2024-02-14  7:16                   ` Leon Romanovsky
2024-02-14 17:04                     ` Jim Harris
2024-02-14 17:50                       ` Bjorn Helgaas
2024-02-14 22:55                         ` Jim Harris
2024-02-09 23:52   ` [PATCH v2 2/2] pci/iov: fix kobject_uevent() ordering in sriov_enable() Jim Harris
2024-02-10  3:22     ` Kuppuswamy Sathyanarayanan
2024-02-12 15:17       ` Keith Busch
2024-02-12 15:29         ` Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240212202714.GA1142983@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=jgg@nvidia.com \
    --cc=jim.harris@samsung.com \
    --cc=leonro@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=pierre.cregut@orange.com \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox