From: Bjorn Helgaas <bhelgaas@google.com>
To: Alexander Gordeev <agordeev@redhat.com>
Cc: linux-kernel@vger.kernel.org,
Michael Ellerman <michael@ellerman.id.au>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Tejun Heo <tj@kernel.org>,
Ben Hutchings <bhutchings@solarflare.com>,
David Laight <David.Laight@ACULAB.COM>,
Mark Lord <kernel@start.ca>, "H. Peter Anvin" <hpa@zytor.com>,
linux-pci@vger.kernel.org
Subject: Re: [PATCH v3 04/11] PCI/MSI/pSeries: Make quota traversing and requesting race-safe
Date: Tue, 10 Dec 2013 15:30:20 -0700 [thread overview]
Message-ID: <20131210223020.GF4699@google.com> (raw)
In-Reply-To: <97749db3e9562eb05fb3ee22a0be0ed7a93a0615.1385399393.git.agordeev@redhat.com>
On Tue, Nov 26, 2013 at 10:09:53AM +0100, Alexander Gordeev wrote:
> The current MSI quota handling is not race-safe and might lead
> to incoherent number of MSIs allocated between the firmware and
> Linux MSI data structures. I.e. if the following chain is called
> from concurrently loading drivers: rtas_setup_msi_irqs() ->
> msi_quota_for_device() -> traverse_pci_devices() a driver might
> get a stalled value of MSI limit for its device or possibly even
> crash.
Can you outline the race and the scenario that leads to incorrect results
or a crash? I looked through rtas_setup_msi_irqs() (briefly) and I didn't
see the way that concurrent calls for different devices could interfere
with each other.
I was looking for some place that modifies state, where concurrent calls
might trample on each other, but it looks like msi_quota_for_device() is
pretty safe: it traverses a tree, but everything it computes is on the
stack and it doesn't seem to save results anywhere. Maybe I'm barking up
the wrong tree?
Bjorn
> This update introduces "rtas_quota_mutex" and serializes all
> accesses to msi_quota_for_device() function. As result, no driver
> could eat into other device's MSI limit.
>
> Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
> ---
> arch/powerpc/platforms/pseries/msi.c | 24 ++++++++++++++++++++++--
> 1 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
> index 009ec73..0e1d288 100644
> --- a/arch/powerpc/platforms/pseries/msi.c
> +++ b/arch/powerpc/platforms/pseries/msi.c
> @@ -26,6 +26,8 @@ static int query_token, change_token;
> #define RTAS_CHANGE_MSIX_FN 4
> #define RTAS_CHANGE_32MSI_FN 5
>
> +static DEFINE_MUTEX(rtas_quota_mutex);
> +
> /* RTAS Helpers */
>
> static int rtas_change_msi(struct pci_dn *pdn, u32 func, u32 num_irqs)
> @@ -345,7 +347,9 @@ static int rtas_msi_check_device(struct pci_dev *pdev, int nvec, int type)
> if (rc)
> return rc;
>
> + mutex_lock(&rtas_quota_mutex);
> quota = msi_quota_for_device(pdev, nvec);
> + mutex_unlock(&rtas_quota_mutex);
>
> if (quota && quota < nvec)
> return quota;
> @@ -399,6 +403,7 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
> struct msi_msg msg;
> int nvec = nvec_in;
> int use_32bit_msi_hack = 0;
> + int quota;
>
> pdn = pci_get_pdn(pdev);
> if (!pdn)
> @@ -407,13 +412,21 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
> if (type == PCI_CAP_ID_MSIX && check_msix_entries(pdev))
> return -EINVAL;
>
> + mutex_lock(&rtas_quota_mutex);
> +
> + quota = msi_quota_for_device(pdev, nvec);
> + if (quota && quota < nvec) {
> + mutex_unlock(&rtas_quota_mutex);
> + return quota;
> + }
> +
> /*
> * Firmware currently refuse any non power of two allocation
> * so we round up if the quota will allow it.
> */
> if (type == PCI_CAP_ID_MSIX) {
> int m = roundup_pow_of_two(nvec);
> - int quota = msi_quota_for_device(pdev, m);
> + quota = msi_quota_for_device(pdev, m);
>
> if (quota >= m)
> nvec = m;
> @@ -433,8 +446,11 @@ again:
> * We only want to run the 32 bit MSI hack below if
> * the max bus speed is Gen2 speed
> */
> - if (pdev->bus->max_bus_speed != PCIE_SPEED_5_0GT)
> + if (pdev->bus->max_bus_speed !=
> + PCIE_SPEED_5_0GT) {
> + mutex_unlock(&rtas_quota_mutex);
> return rc;
> + }
>
> use_32bit_msi_hack = 1;
> }
> @@ -459,6 +475,7 @@ again:
> nvec = nvec_in;
> goto again;
> }
> + mutex_unlock(&rtas_quota_mutex);
> pr_debug("rtas_msi: rtas_change_msi() failed\n");
> return rc;
> }
> @@ -467,6 +484,7 @@ again:
> list_for_each_entry(entry, &pdev->msi_list, list) {
> hwirq = rtas_query_irq_number(pdn, i++);
> if (hwirq < 0) {
> + mutex_unlock(&rtas_quota_mutex);
> pr_debug("rtas_msi: error (%d) getting hwirq\n", hwirq);
> return hwirq;
> }
> @@ -474,6 +492,7 @@ again:
> virq = irq_create_mapping(NULL, hwirq);
>
> if (virq == NO_IRQ) {
> + mutex_unlock(&rtas_quota_mutex);
> pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
> return -ENOSPC;
> }
> @@ -486,6 +505,7 @@ again:
> entry->msg = msg;
> }
>
> + mutex_unlock(&rtas_quota_mutex);
> return 0;
> }
>
> --
> 1.7.7.6
>
next prev parent reply other threads:[~2013-12-10 22:30 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-11-26 9:09 [PATCH v3 00/11] Introduce pcim_enable_msi*() family helpers Alexander Gordeev
2013-11-26 9:09 ` [PATCH v3 01/11] PCI/MSI/s390: Fix single MSI only check Alexander Gordeev
2013-11-26 9:09 ` [PATCH v3 02/11] PCI/MSI/s390: Remove superfluous check of MSI type Alexander Gordeev
2013-11-26 9:09 ` [PATCH v3 03/11] PCI/MSI/pSeries: Fix wrong error code reporting Alexander Gordeev
2013-11-26 9:09 ` [PATCH v3 04/11] PCI/MSI/pSeries: Make quota traversing and requesting race-safe Alexander Gordeev
2013-12-10 22:30 ` Bjorn Helgaas [this message]
2013-12-13 10:29 ` Alexander Gordeev
2013-11-26 9:09 ` [PATCH v3 05/11] PCI/MSI: Fix return value when populate_msi_sysfs() failed Alexander Gordeev
2013-11-26 9:09 ` [PATCH v3 06/11] PCI/MSI: Return -ENOSYS for unimplemented interfaces, not -1 Alexander Gordeev
2013-11-26 9:09 ` [PATCH v3 07/11] PCI/MSI: Make pci_enable_msi/msix() 'nvec' argument type as int Alexander Gordeev
2013-11-26 20:58 ` Tejun Heo
2013-11-26 9:09 ` [PATCH v3 08/11] PCI/MSI: Factor out pci_get_msi_cap() interface Alexander Gordeev
2013-11-26 9:09 ` [PATCH v3 09/11] PCI/MSI: Get rid of pci_enable_msi_block_auto() interface Alexander Gordeev
2013-11-26 9:09 ` [PATCH v3 10/11] PCI/MSI: Convert pci_msix_table_size() to a public interface Alexander Gordeev
2013-12-10 23:08 ` Bjorn Helgaas
2013-12-12 16:06 ` Alexander Gordeev
2013-12-12 21:16 ` Bjorn Helgaas
2013-11-26 9:10 ` [PATCH v3 11/11] PCI/MSI: Introduce pcim_enable_msi*() family helpers Alexander Gordeev
2013-12-10 23:16 ` Bjorn Helgaas
2013-12-12 16:11 ` Alexander Gordeev
2013-12-12 16:11 ` Tejun Heo
2013-12-12 21:16 ` Bjorn Helgaas
2013-11-26 21:00 ` [PATCH v3 00/11] " Tejun Heo
2013-12-06 8:39 ` Alexander Gordeev
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131210223020.GF4699@google.com \
--to=bhelgaas@google.com \
--cc=David.Laight@ACULAB.COM \
--cc=agordeev@redhat.com \
--cc=benh@kernel.crashing.org \
--cc=bhutchings@solarflare.com \
--cc=hpa@zytor.com \
--cc=kernel@start.ca \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=michael@ellerman.id.au \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).