public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Damien Le Moal <dlemoal@kernel.org>
To: Philipp Stanner <pstanner@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: "Bjorn Helgaas" <bhelgaas@google.com>,
	"Krzysztof Wilczyński" <kwilczynski@kernel.org>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] PCI: Fix devres regression in pci_intx()
Date: Fri, 6 Sep 2024 09:37:11 +0900	[thread overview]
Message-ID: <2072aac8-cdab-40e3-806c-399d38e683f9@kernel.org> (raw)
In-Reply-To: <6a17c02077543f98b72662a7189407d0452e6d47.camel@redhat.com>

On 9/5/24 16:13, Philipp Stanner wrote:
> On Thu, 2024-09-05 at 09:33 +0900, Damien Le Moal wrote:
>> On 2024/09/05 6:10, Alex Williamson wrote:
>>> On Wed, 4 Sep 2024 23:24:53 +0300
>>> Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
>>>
>>>> Wed, Sep 04, 2024 at 12:07:21PM -0600, Alex Williamson kirjoitti:
>>>>> On Wed, 04 Sep 2024 15:37:25 +0200
>>>>> Philipp Stanner <pstanner@redhat.com> wrote:  
>>>>>> On Wed, 2024-09-04 at 17:25 +0900, Damien Le Moal wrote:  
>>>>
>>>> ...
>>>>
>>>>>> If vfio-pci can get rid of pci_intx() alltogether, that might
>>>>>> be a good
>>>>>> thing. As far as I understood Andy Shevchenko, pci_intx() is
>>>>>> outdated.
>>>>>> There's only a hand full of users anyways.  
>>>>>
>>>>> What's the alternative?  
>>>>
>>>> From API perspective the pci_alloc_irq_vectors() & Co should be
>>>> used.
>>>
>>> We can't replace a device level INTx control with a vector
>>> allocation
>>> function.
>>>  
>>>>> vfio-pci has a potentially unique requirement
>>>>> here, we don't know how to handle the device interrupt, we only
>>>>> forward
>>>>> it to the userspace driver.  As a level triggered interrupt,
>>>>> INTx will
>>>>> continue to assert until that userspace driver handles the
>>>>> device.
>>>>> That's obviously unacceptable from a host perspective, so INTx
>>>>> is
>>>>> masked at the device via pci_intx() where available, or at the
>>>>> interrupt controller otherwise.  The API with the userspace
>>>>> driver
>>>>> requires that driver to unmask the interrupt, again resulting
>>>>> in a call
>>>>> to pci_intx() or unmasking the interrupt controller, in order
>>>>> to receive
>>>>> further interrupts from the device.  Thanks,  
>>>>
>>>> I briefly read the discussion and if I understand it correctly
>>>> the problem here
>>>> is in the flow: when the above mentioned API is being called.
>>>> Hence it's design
>>>> (or architectural) level of issue and changing call from foo() to
>>>> bar() won't
>>>> magically make problem go away. But I might be mistaken.
>>>
>>> Certainly from a vector allocation standpoint we can change to
>>> whatever
>>> is preferred, but the direct INTx manipulation functions are a
>>> different thing entirely and afaik there's nothing else that can
>>> replace them at a low level, nor can we just get rid of our calls
>>> to
>>> pci_intx().  Thanks,
>>
>> But can these calls be moved out of the spinlock context ? If not,
>> then we need
>> to clarify that pci_intx() can be called from any context, which will
>> require
>> changing to a GFP_ATOMIC for the resource allocation, even if the use
>> case
>> cannot trigger the allocation. This is needed to ensure the
>> correctness of the
>> pci_intx() function use.
> 
> We could do that I guess. As I keep saying, it's not intended to have
> pci_intx() allocate _permanently_. This is a temporary situation.
> pci_intx() should have neither devres nor allocation.
> 
>> Frankly, I am surprised that the might sleep splat you
>> got was not already reported before (fuzzying, static analyzers might
>> eventually
>> catch that though).
> 
> It's a super rare situation:
>  * pci_intx() has very few callers
>  * It only allocates if pcim_enable_device() instead of
>    pci_enable_device() ran.
>  * It only allocates when it's called for the FIRST TIME
>  * All of the above is only a problem while you hold a lock
> 
>>
>> The other solution would be a version of pci_intx() that has a gfp
>> flags
>> argument to allow callers to use the right gfp flags for the call
>> context.
> 
> I don't think that's a good idea. As I said, I want to clean up all
> that in the mid term.
> 
> As a matter of fact, there is already __pcim_intx() in pci/devres.c as
> a pure unmanaged pci_intx() as a means to split and then cleanup the
> APIs.

Yeah. That naming is in fact confusing. __pcim_intx() should really be named
pci_intx()...

> One path towards getting the hybrid behavior out of pci_intx() could be
> to rename __pcim_intx() to pci_intx_unmanaged() and port everyone who
> uses pci_enable_device() + pci_intx() to that version. That would be
> better than to have a third version with a gfp_t argument.

Sounds good. But ideally, all users that rely on the managed variant should be
converted to use pcim_intx() and pci_intx() changed to not call in devres. But
that may be too much code churn (I have not checked).


-- 
Damien Le Moal
Western Digital Research


  reply	other threads:[~2024-09-06  0:37 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-25 12:07 [PATCH] PCI: Fix devres regression in pci_intx() Philipp Stanner
2024-07-25 14:26 ` Christoph Hellwig
2024-07-25 15:21   ` Philipp Stanner
2024-07-25 15:47     ` Christoph Hellwig
2024-07-25 21:00 ` Bjorn Helgaas
2024-07-26  0:19 ` Damien Le Moal
2024-07-26 18:43   ` pstanner
2024-07-26 18:59     ` Bjorn Helgaas
2024-07-29 11:29     ` Damien Le Moal
2024-07-29 15:45       ` Philipp Stanner
2024-09-03 15:44 ` Alex Williamson
2024-09-04  7:06   ` Philipp Stanner
2024-09-04  8:25     ` Damien Le Moal
2024-09-04 13:37       ` Philipp Stanner
2024-09-04 18:07         ` Alex Williamson
2024-09-04 20:24           ` Andy Shevchenko
2024-09-04 21:10             ` Alex Williamson
2024-09-05  0:33               ` Damien Le Moal
2024-09-05  1:56                 ` Alex Williamson
2024-09-05  7:13                 ` Philipp Stanner
2024-09-06  0:37                   ` Damien Le Moal [this message]
2024-09-06  6:45                     ` Philipp Stanner
2024-09-04 12:57     ` Alex Williamson
2024-09-04 13:29       ` Philipp Stanner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2072aac8-cdab-40e3-806c-399d38e683f9@kernel.org \
    --to=dlemoal@kernel.org \
    --cc=alex.williamson@redhat.com \
    --cc=andy.shevchenko@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=kwilczynski@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=pstanner@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox