public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Roland Dreier <roland@topspin.com>
To: Zwane Mwaikambo <zwane@linuxpower.ca>
Cc: "Nguyen, Tom L" <tom.l.nguyen@intel.com>,
	long <tlnguyen@snoqualmie.dp.intel.com>,
	ak@muc.de, akpm@osdl.org, greg@kroah.com, jgarzik@pobox.com,
	linux-kernel@vger.kernel.org, eli@mellanox.co.il
Subject: Re: [PATCH]2.6.7 MSI-X Update
Date: Thu, 24 Jun 2004 00:27:52 -0700	[thread overview]
Message-ID: <52zn6tbekn.fsf@topspin.com> (raw)
In-Reply-To: <Pine.LNX.4.58.0406240216170.3273@montezuma.fsmlabs.com> (Zwane Mwaikambo's message of "Thu, 24 Jun 2004 02:37:29 -0400 (EDT)")

Hi, I think you may not have read Long's patch/API carefully, since
you seem to be misunderstanding my objection.  In any case...

    Roland> I could imagine hardware where the driver does not know
    Roland> exactly how many vectors it will use until it starts up.
    Roland> As a hypothetical example, imagine some storage networking
    Roland> host adapter that supports an interrupt vector per storage
    Roland> target.  The driver does not know how many vectors it will
    Roland> actually use until it has logged into the storage fabric;
    Roland> in fact, the driver may want to keep some vectors "in
    Roland> reserve" in case a new target is added to the fabric
    Roland> later.

    Roland> I think it would be better to preserve maximum flexibility
    Roland> for devices and drivers, and not mandate that every
    Roland> allocated MSI-X vector is always used.

    Zwane> The MSI subsystem should at most reserve and the driver
    Zwane> make a request.  There may be a limit per PCI device as
    Zwane> specified by the MSI subsystem for some reason or
    Zwane> other. Isn't this what we're all saying?

No, Long is actually saying that a driver must actually call
request_irq() on all the vectors that it is allocated.  I am saying
that this requirement is too stringent, since there may be devices and
drivers that cannot predict exactly how many MSI-X vectors they will
use during driver initialization.

    Roland> It seems in the code right now you are able to tell if any
    Roland> MSI-X vectors are hooked, since you wait for the last
    Roland> vector to be unhooked to disable MSI-X.  I would just have
    Roland> it be a WARN_ON() (or maybe BUG_ON()) if a driver calls
    Roland> pci_disable_msix() without calling free_irq for all its
    Roland> MSI-X vectors.

    Roland> Right now there is an issue if a driver is unloaded
    Roland> without freeing all its IRQs -- the device will be left in
    Roland> MSI-X mode and can not be recovered without rebooting.

    Zwane> This sounds like a case of bad driver bug generally the
    Zwane> kernel would oops when the ISR text gets unloaded. What
    Zwane> kind of behaviour do you expect here?

Yes, I agree, it is a bad driver bug if the driver is unloaded without
doing free_irq() on all the vectors it has done request_irq() on.
However, with Long's API, there is a problem if for example a device
driver does pci_enable_msix() and is allocated 2 vectors, then
correctly does request_irq()/free_irq() on one vector and doesn't
touch the second vector, and then is unloaded.  The device will be
left with MSI-X enabled and leak its vectors.

In the proposed API, since there is no pci_disable_msix() call, the
only way the driver can free its MSI-X vector is to actually do
request_irq()/free_irq() on it.

    Roland> Similarly, with the API as it stands in your patch, a
    Roland> driver must be very careful not to take any action that
    Roland> may fail in between calling pci_enable_msix() and actually
    Roland> calling request_irq(), or otherwise the only way to avoid
    Roland> leaking MSI-X resources is to take the very risky step of
    Roland> calling request_irq() on an error path.  This doesn't fit
    Roland> very well with the structure of lots of device drivers,
    Roland> for example Intel's very own e1000 driver, which wait
    Roland> until the device is actually opened to call request_irq().

    Zwane> Could you elaborate further here? Won't a matched
    Zwane> pci_disable_msix() free the necessary resources on failure?

Yes, a matched pci_disable_msix() would be exactly what is needed.
However, look at Long's patch -- there is no such function in the API
he is proposing.

    Roland> For your second point, I would have pci_disable_msix()
    Roland> always free all MSI-X vectors that have been
    Roland> allocated... the only parameter that I expect it would
    Roland> take is a struct pci_dev *.

    Zwane> If the driver is doing this, then we won't have to bother
    Zwane> about pci_disable_msix() doing the vector free surely?

I think the PCI core needs to know which vectors are in use and which
are free (and ready to assign to PCI devices that request them).

I believe the correct API/semantics for a device driver are:

	pci_enable_msix(dev, &entries, num_entries);
          /* On success, driver now has full use of the num_entries
             interrupt vectors returned through entries.  MSI-X enable
             bit is set in PCI header. */
          /* ... */
          /* driver freely does request_irq()/free_irq() on some or all
             vectors in entries while running. */
          /* ... */
        pci_disable_msix(dev);
          /* All handlers attached to MSI-X vectors must be removed with
             free_irq() before pci_disable_msi() call. */
          /* MSI-X enable bit is now cleared from PCI header, and all
             interrupt vectors are returned to the core for possible
             reallocation. */

The major change from Long's proposal is the addition of the
pci_disable_msix() function.

 - Roland

  reply	other threads:[~2004-06-24  7:28 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-06-23 22:03 [PATCH]2.6.7 MSI-X Update Nguyen, Tom L
2004-06-24  1:42 ` Roland Dreier
2004-06-24  6:37   ` Zwane Mwaikambo
2004-06-24  7:27     ` Roland Dreier [this message]
  -- strict thread matches above, loose matches on Subject: below --
2004-07-19 15:36 Nguyen, Tom L
2004-07-15 15:46 Nguyen, Tom L
2004-07-13 22:02 long
2004-07-15  2:14 ` [PATCH]2.6.7 " Roland Dreier
2004-07-18  1:25 ` Roland Dreier
2004-06-26  1:21 long
2004-06-26  0:30 ` [PATCH]2.6.7 " Roland Dreier
2004-06-26  1:38 ` Roland Dreier
2004-06-26  8:27   ` Christoph Hellwig
2004-06-26 17:30     ` Roland Dreier
2004-06-24 16:29 Nguyen, Tom L
2004-06-23 16:58 Nguyen, Tom L
2004-06-23 16:49 Nguyen, Tom L
2004-06-22 23:06 Nguyen, Tom L
2004-06-22 21:48 long
2004-06-22 23:57 ` Roland Dreier
2004-06-23  0:26   ` Jeff Garzik
2004-06-23  1:18     ` Roland Dreier
2004-06-23  3:45 ` Roland Dreier
2004-06-23 16:50 ` Roland Dreier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52zn6tbekn.fsf@topspin.com \
    --to=roland@topspin.com \
    --cc=ak@muc.de \
    --cc=akpm@osdl.org \
    --cc=eli@mellanox.co.il \
    --cc=greg@kroah.com \
    --cc=jgarzik@pobox.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tlnguyen@snoqualmie.dp.intel.com \
    --cc=tom.l.nguyen@intel.com \
    --cc=zwane@linuxpower.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox