From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
To: Marc Zyngier <maz@kernel.org>
Cc: tglx@linutronix.de, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: MSIs not freed in GICv3 ITS driver
Date: Sun, 21 Jul 2024 14:20:32 +0530 [thread overview]
Message-ID: <20240721085032.GL1908@thinkpad> (raw)
In-Reply-To: <877cdupdvu.wl-maz@kernel.org>
On Tue, Jul 09, 2024 at 08:24:37PM +0100, Marc Zyngier wrote:
> On Tue, 09 Jul 2024 18:37:08 +0100,
> Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> >
> > On Mon, Jul 08, 2024 at 06:31:57PM +0100, Marc Zyngier wrote:
> > > Mani,
> > >
> > > On Mon, 08 Jul 2024 16:39:33 +0100,
> > > Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> wrote:
> > > >
> > > > Hi Marc, Thomas,
> > > >
> > > > I'm seeing a weird behavior with GICv3 ITS driver while allocating MSIs from
> > > > PCIe devices. When the PCIe driver (I'm using virtio_pci_common.c) tries to
> > > > allocate non power of 2 MSIs (like 3), then the GICv3 MSI driver always rounds
> > > > the MSI count to power of 2 to find the order. In this case, the order becomes 2
> > > > in its_alloc_device_irq().
> > >
> > > That's because we can only allocate EventIDs as a number of ID
> > > bits. So you can't have *1* MSI, nor 3. You can have 2, 4, 8, or
> > > 2^24. This is a power-of-two architecture.
> > >
> >
> > Ah okay.
> >
> > > > So 4 entries are allocated by bitmap_find_free_region().
> > >
> > > Assuming you're calling about its_alloc_device_irq(), it looks like a
> > > bug. Or rather, some laziness on my part. The thing is, this bitmap is
> > > only dealing with sub-allocation in the pool that has been given to
> > > the endpoint. So the power-of-two crap doesn't really matter unless
> > > you are dealing with Multi-MSI, which has actual alignment
> > > requirements.
> > >
> >
> > Okay.
> >
> > > >
> > > > But since the PCIe driver has only requested 3 MSIs, its_irq_domain_alloc()
> > > > will only allocate 3 MSIs, leaving one bitmap entry unused.
> > > >
> > > > And when the driver frees the MSIs using pci_free_irq_vectors(), only 3
> > > > allocated MSIs were freed and their bitmap entries were also released. But the
> > > > entry for the additional bitmap was never released. Due to this,
> > > > its_free_device() was also never called, resulting in the ITS device not getting
> > > > freed.
> > > >
> > > > So when the PCIe driver tries to request the MSIs again (PCIe device being
> > > > removed and inserted back), because the ITS device was not freed previously,
> > > > MSIs were again requested for the same ITS device. And due to the stale bitmap
> > > > entry, the ITS driver refuses to allocate 4 MSIs as only 3 bitmap entries were
> > > > available. This forces the PCIe driver to reduce the MSI count, which is sub
> > > > optimal.
> > > >
> > > > This behavior might be applicable to other irqchip drivers handling MSI as well.
> > > > I want to know if this behavior is already known with MSI and irqchip drivers?
> > > >
> > > > For fixing this issue, the PCIe drivers could always request MSIs of power of 2,
> > > > and use a dummy MSI handler for the extra number of MSIs allocated. This could
> > > > also be done in the generic MSI driver itself to avoid changes in the PCIe
> > > > drivers. But I wouldn't say it is the best possible fix.
> > >
> > > No, that's terrible. This is just papering over a design mistake, and
> > > I refuse to go down that road.
> > >
> >
> > Agree. But what about other MSI drivers? And because of the MSI design, they
> > also round the requested MSI count to power of 2, leading to unused vectors and
> > those also wouldn't get freed.
>
> This has absolutely nothing to do with the "design" of MSIs. It has
> everything to do with not special-casing Multi-MSI.
>
> > I think this power of 2 limitation should be
> > imposed at the API level or in the MSI driver instead of silently keeping unused
> > vectors in irqchip drivers.
>
> You really have the wrong end of the stick. The MSi API has *zero*
> control over the payload allocation. How could it? The whole point of
> having an MSI driver is to insulate the core code from such stuff.
>
Right, but because of the way most of the MSI drivers (not all?) use bitmap to
allocate MSIs, this issue is also present in all of them. So I think the fix is
not just applicable for the gic-its-v3 driver alone.
> >
> > > >
> > > > Is there any other way to address this issue? Or am I missing something
> > > > completely?
> > >
> > > Well, since each endpoint handled by an ITS has its allocation tracked
> > > by a bitmap, it makes more sense to precisely track the allocation.
> > >
> > > Here's a quick hack that managed to survive a VM boot. It may even
> > > work. The only problem with it is that it probably breaks a Multi-MSi
> > > device sitting behind a non-transparent bridge that would get its MSIs
> > > allocated after another device. In this case, we wouldn't honor the
> > > required alignment and things would break.
> > >
> > > So take this as a proof of concept. If that works, I'll think of how
> > > to deal with this crap in a more suitable way...
> > >
> >
> > This works fine. Now the ITS driver allocates requested number of MSIs, thanks!
>
> Well, as I said, this breaks tons of other things so I'm not going to
> merge this any time soon. Certainly not before Thomas gets his MSI
> rework upstream. And then I need to work out how to deal with
> Multi-MSI in the correct way.
>
> So don't hold your breath.
>
Sure thing. Thanks for getting it around quickly. I'll wait for a proper fix.
- Mani
--
மணிவண்ணன் சதாசிவம்
next prev parent reply other threads:[~2024-07-21 8:51 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-08 15:39 MSIs not freed in GICv3 ITS driver Manivannan Sadhasivam
2024-07-08 17:31 ` Marc Zyngier
2024-07-09 17:37 ` Manivannan Sadhasivam
2024-07-09 19:24 ` Marc Zyngier
2024-07-21 8:50 ` Manivannan Sadhasivam [this message]
2026-01-16 15:03 ` Manivannan Sadhasivam
2026-02-19 16:54 ` Marc Zyngier
2026-02-25 9:34 ` Qiang Yu
2026-02-26 13:39 ` Marc Zyngier
2026-03-03 5:22 ` Qiang Yu
2026-03-03 9:26 ` Manivannan Sadhasivam
2026-03-30 8:17 ` Marc Zyngier
2026-04-01 7:59 ` Manivannan Sadhasivam
2026-04-01 8:15 ` Marc Zyngier
2026-04-01 12:01 ` Manivannan Sadhasivam
2026-04-01 16:08 ` Marc Zyngier
2026-04-01 17:37 ` Manivannan Sadhasivam
2026-04-07 10:45 ` Thomas Gleixner
2026-04-07 12:18 ` Manivannan Sadhasivam
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240721085032.GL1908@thinkpad \
--to=manivannan.sadhasivam@linaro.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maz@kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.