Linux PCI subsystem development
From: Shay Drori <shayd@nvidia.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: <bhelgaas@google.com>, <linux-pci@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, Keith Busch <kbusch@kernel.org>,
	"Leon Romanovsky" <leonro@nvidia.com>
Subject: Re: [PATCH] PCI: Fix NULL dereference in SR-IOV VF creation error path
Date: Sun, 2 Mar 2025 10:22:26 +0200
Message-ID: <f3a98fe9-ced9-4d1c-b77b-2c1d65ff9e23@nvidia.com>
In-Reply-To: <20250227224547.GA22604@bhelgaas>



On 28/02/2025 0:45, Bjorn Helgaas wrote:
> On Sun, Feb 16, 2025 at 10:32:54AM +0200, Shay Drory wrote:
>> Add proper cleanup when virtfn setup fails to prevent NULL pointer
>> dereference during device removal. The kernel oops[1] occurred due to
>> incorrect error handling flow when pci_setup_device() fails.
>>
>> Fix it by properly cleaning up virtfn resources when pci_setup_device()
>> fails, instead of invoking pci_stop_and_remove_bus_device().
>> This prevents accessing partially initialized virtfn devices during
>> removal.
> 
>> Fixes: e3f30d563a38 ("PCI: Make pci_destroy_dev() concurrent safe")
> 
> It's not obvious to me how e3f30d563a38 is related.  Can you elucidate
> the connection?
> 

The NULL pointer oops comes from device_del() inside pci_destroy_dev().

Before e3f30d563a38, pci_destroy_dev() checked the device's kobj parent.
pci_setup_device() doesn't set the device's kobj parent, which means that
in our case pci_destroy_dev(), which is called from
pci_stop_and_remove_bus_device(), was a no-op.

After e3f30d563a38, that is no longer true: pci_destroy_dev() actually
runs and calls device_del()...
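
To make the difference concrete, here is a rough userspace model of the
two guards. It is not kernel code: the struct, its fields and the
function names are made up for illustration, and the "removed" flag is
only my assumption about the shape of the guard after e3f30d563a38. A
return value of false stands in for the oops.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device; not the kernel's structures. */
struct model_dev {
	void *kobj_parent;	/* set only once the device is registered */
	bool removed;		/* assumed "already removed" marker */
	bool registered;	/* true only after registration completed */
};

/* Models device_del(): only safe on a fully registered device. */
static bool model_device_del(struct model_dev *d)
{
	return d->registered;	/* false stands in for the NULL deref */
}

/* Old-style guard: devices that were never registered are skipped. */
static bool destroy_old(struct model_dev *d)
{
	if (!d->kobj_parent)
		return true;	/* no-op, nothing to tear down */
	return model_device_del(d);
}

/* New-style guard: only devices already marked removed are skipped. */
static bool destroy_new(struct model_dev *d)
{
	if (d->removed)
		return true;
	d->removed = true;
	return model_device_del(d);
}
```

With a device that went through setup only (no parent, not registered),
destroy_old() returns early while destroy_new() reaches the modelled
device_del(), which is the situation hit in the failed virtfn path.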

>> CC: Keith Busch <kbusch@kernel.org>
>> Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
>> Signed-off-by: Shay Drory <shayd@nvidia.com>
>> ---
>>   drivers/pci/iov.c | 10 +++++++---
>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>> index 9e4770cdd4d5..3dfcbf10e127 100644
>> --- a/drivers/pci/iov.c
>> +++ b/drivers/pci/iov.c
>> @@ -314,8 +314,11 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id)
>>                pci_read_vf_config_common(virtfn);
>>
>>        rc = pci_setup_device(virtfn);
>> -     if (rc)
>> +     if (rc) {
>> +             pci_bus_put(virtfn->bus);
>> +             kfree(virtfn);
>>                goto failed1;
>> +     }
> 
> Thanks for the fix.  The mix of error recovery styles (cleanup here at
> the point of failure vs. goto different cleanup steps at the end) makes
> this kind of hard to understand.
> 
> I see that this cleanup is similar to what's done in
> pci_scan_device(), which does help.  Did you consider making a helper
> here with structure similar to pci_scan_device(), e.g., a
> pci_iov_scan_device()?  I wonder if that could make the error handling
> here simpler?

Seems like a good idea, I will do it in v2.

Just to be clear, the outcome will be something like:

bus = virtfn_add_bus(dev->bus, pci_iov_virtfn_bus(dev, id));
if (!bus)
         goto failed;

rc = pci_iov_scan_device();
if (rc)
         goto failed1;

virtfn->dev.parent = dev->dev.parent;
virtfn->multifunction = 0;
<...>
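
The point of the helper is that it undoes its own partial work, so the
caller only needs its existing goto labels. Below is a userspace sketch
of that shape, under stated assumptions: every name is a stand-in, and
the helper returns a pointer (NULL on failure) in the style of
pci_scan_device() rather than the rc shown above. The real helper may
well look different.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are kernel APIs. */
struct model_bus { int refcount; };
struct model_vf { struct model_bus *bus; int configured; };

/* Stand-in for the setup step; fails when fail_setup is nonzero. */
static int model_setup(struct model_vf *vf, int fail_setup)
{
	if (fail_setup)
		return -1;
	vf->configured = 1;
	return 0;
}

/*
 * Allocate and set up one VF. On any failure the helper releases what
 * it took and returns NULL, so the caller never sees a half-built
 * device and never has to run a "remove" path on it.
 */
static struct model_vf *model_scan_device(struct model_bus *bus,
					  int fail_setup)
{
	struct model_vf *vf = calloc(1, sizeof(*vf));

	if (!vf)
		return NULL;

	vf->bus = bus;
	bus->refcount++;		/* models taking a bus reference */

	if (model_setup(vf, fail_setup)) {
		bus->refcount--;	/* models dropping that reference */
		free(vf);
		return NULL;
	}
	return vf;
}
```

A caller would then check the result once and jump to its bus cleanup
label on NULL, with no per-failure cleanup inline.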

> 
>>        virtfn->dev.parent = dev->dev.parent;
>>        virtfn->multifunction = 0;
>> @@ -336,14 +339,15 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id)
>>        pci_device_add(virtfn, virtfn->bus);
>>        rc = pci_iov_sysfs_link(dev, virtfn, id);
>>        if (rc)
>> -             goto failed1;
>> +             goto failed2;
>>
>>        pci_bus_add_device(virtfn);
>>
>>        return 0;
>>
>> -failed1:
>> +failed2:
>>        pci_stop_and_remove_bus_device(virtfn);
>> +failed1:
>>        pci_dev_put(dev);
>>   failed0:
>>        virtfn_remove_bus(dev->bus, bus);
>> --
>> 2.38.1
>>


Thread overview: 3+ messages
2025-02-16  8:32 [PATCH] PCI: Fix NULL dereference in SR-IOV VF creation error path Shay Drory
2025-02-27 22:45 ` Bjorn Helgaas
2025-03-02  8:22   ` Shay Drori [this message]
