All of lore.kernel.org
 help / color / mirror / Atom feed
From: Roland Dreier <roland@topspin.com>
To: Zwane Mwaikambo <zwane@linuxpower.ca>
Cc: "Nguyen, Tom L" <tom.l.nguyen@intel.com>,
	long <tlnguyen@snoqualmie.dp.intel.com>,
	ak@muc.de, akpm@osdl.org, greg@kroah.com, jgarzik@pobox.com,
	linux-kernel@vger.kernel.org, eli@mellanox.co.il
Subject: Re: [PATCH]2.6.7 MSI-X Update
Date: Thu, 24 Jun 2004 00:27:52 -0700	[thread overview]
Message-ID: <52zn6tbekn.fsf@topspin.com> (raw)
In-Reply-To: <Pine.LNX.4.58.0406240216170.3273@montezuma.fsmlabs.com> (Zwane Mwaikambo's message of "Thu, 24 Jun 2004 02:37:29 -0400 (EDT)")

Hi, I think you may not have read Long's patch/API carefully, since
you seem to be misunderstanding my objection.  In any case...

    Roland> I could imagine hardware where the driver does not know
    Roland> exactly how many vectors it will use until it starts up.
    Roland> As a hypothetical example, imagine some storage networking
    Roland> host adapter that supports an interrupt vector per storage
    Roland> target.  The driver does not know how many vectors it will
    Roland> actually use until it has logged into the storage fabric;
    Roland> in fact, the driver may want to keep some vectors "in
    Roland> reserve" in case a new target is added to the fabric
    Roland> later.

    Roland> I think it would be better to preserve maximum flexibility
    Roland> for devices and drivers, and not mandate that every
    Roland> allocated MSI-X vector is always used.

    Zwane> The MSI subsystem should at most reserve and the driver
    Zwane> make a request.  There may be a limit per PCI device as
    Zwane> specified by the MSI subsystem for some reason or
    Zwane> other. Isn't this what we're all saying?

No, Long is actually saying that a driver must actually call
request_irq() on all the vectors that it is allocated.  I am saying
that this requirement is too stringent, since there may be devices and
drivers that cannot predict exactly how many MSI-X vectors they will
use during driver initialization.

    Roland> It seems in the code right now you are able to tell if any
    Roland> MSI-X vectors are hooked, since you wait for the last
    Roland> vector to be unhooked to disable MSI-X.  I would just have
    Roland> it be a WARN_ON() (or maybe BUG_ON()) if a driver calls
    Roland> pci_disable_msix() without calling free_irq for all its
    Roland> MSI-X vectors.

    Roland> Right now there is an issue if a driver is unloaded
    Roland> without freeing all its IRQs -- the device will be left in
    Roland> MSI-X mode and can not be recovered without rebooting.

    Zwane> This sounds like a case of bad driver bug generally the
    Zwane> kernel would oops when the ISR text gets unloaded. What
    Zwane> kind of behaviour do you expect here?

Yes, I agree, it is a bad driver bug if the driver is unloaded without
doing free_irq() on all the vectors it has done request_irq() on.
However, with Long's API, there is a problem if for example a device
driver does pci_enable_msix() and is allocated 2 vectors, then
correctly does request_irq()/free_irq() on one vector and doesn't
touch the second vector, and then is unloaded.  The device will be
left with MSI-X enabled and leak its vectors.

In the proposed API, since there is no pci_disable_msix() call, the
only way the driver can free its MSI-X vector is to actually do
request_irq()/free_irq() on it.

    Roland> Similarly, with the API as it stands in your patch, a
    Roland> driver must be very careful not to take any action that
    Roland> may fail in between calling pci_enable_msix() and actually
    Roland> calling request_irq(), or otherwise the only way to avoid
    Roland> leaking MSI-X resources is to take the very risky step of
    Roland> calling request_irq() on an error path.  This doesn't fit
    Roland> very well with the structure of lots of device drivers,
    Roland> for example Intel's very own e1000 driver, which wait
    Roland> until the device is actually opened to call request_irq().

    Zwane> Could you elaborate further here? Won't a matched
    Zwane> pci_disable_msix() free the necessary resources on failure?

Yes, a matched pci_disable_msix() would be exactly what is needed.
However, look at Long's patch -- there is no such function in the API
he is proposing.

    Roland> For your second point, I would have pci_disable_msix()
    Roland> always free all MSI-X vectors that have been
    Roland> allocated... the only parameter that I expect it would
    Roland> take is a struct pci_dev *.

    Zwane> If the driver is doing this, then we won't have to bother
    Zwane> about pci_disable_msix() doing the vector free surely?

I think the PCI core needs to know which vectors are in use and which
are free (and ready to assign to PCI devices that request them).

I believe the correct API/semantics for a device driver are:

	pci_enable_msix(dev, &entries, num_entries);
          /* On success, driver now has full use of the num_entries
             interrupt vectors returned through entries.  MSI-X enable
             bit is set in PCI header. */
          /* ... */
          /* driver freely does request_irq()/free_irq() on some or all
             vectors in entries while running. */
          /* ... */
        pci_disable_msix(dev);
          /* All handlers attached to MSI-X vectors must be removed with
             free_irq() before pci_disable_msi() call. */
          /* MSI-X enable bit is now cleared from PCI header, and all
             interrupt vectors are returned to the core for possible
             reallocation. */

The major change from Long's proposal is the addition of the
pci_disable_msix() function.

 - Roland

  reply	other threads:[~2004-06-24  7:28 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-06-23 22:03 [PATCH]2.6.7 MSI-X Update Nguyen, Tom L
2004-06-24  1:42 ` Roland Dreier
2004-06-24  6:37   ` Zwane Mwaikambo
2004-06-24  7:27     ` Roland Dreier [this message]
  -- strict thread matches above, loose matches on Subject: below --
2004-07-19 15:36 Nguyen, Tom L
2004-07-15 15:46 Nguyen, Tom L
2004-07-13 22:02 long
2004-07-15  2:14 ` [PATCH]2.6.7 " Roland Dreier
2004-07-18  1:25 ` Roland Dreier
2004-06-26  1:21 long
2004-06-26  0:30 ` [PATCH]2.6.7 " Roland Dreier
2004-06-26  1:38 ` Roland Dreier
2004-06-26  8:27   ` Christoph Hellwig
2004-06-26 17:30     ` Roland Dreier
2004-06-24 16:29 Nguyen, Tom L
2004-06-23 16:58 Nguyen, Tom L
2004-06-23 16:49 Nguyen, Tom L
2004-06-22 23:06 Nguyen, Tom L
2004-06-22 21:48 long
2004-06-22 23:57 ` Roland Dreier
2004-06-23  0:26   ` Jeff Garzik
2004-06-23  1:18     ` Roland Dreier
2004-06-23  3:45 ` Roland Dreier
2004-06-23 16:50 ` Roland Dreier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52zn6tbekn.fsf@topspin.com \
    --to=roland@topspin.com \
    --cc=ak@muc.de \
    --cc=akpm@osdl.org \
    --cc=eli@mellanox.co.il \
    --cc=greg@kroah.com \
    --cc=jgarzik@pobox.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tlnguyen@snoqualmie.dp.intel.com \
    --cc=tom.l.nguyen@intel.com \
    --cc=zwane@linuxpower.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.