Re: [PATCH] natsemi: netpoll fixes

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Sergei Shtylyov <sshtylyov@ru.mvista.com>
To: Mark Huth <mhuth@mvista.com>
Cc: jgarzik@pobox.com, netdev@vger.kernel.org
Subject: Re: [PATCH] natsemi: netpoll fixes
Date: Sat, 10 Mar 2007 23:25:05 +0300	[thread overview]
Message-ID: <45F31421.4010908@ru.mvista.com> (raw)
In-Reply-To: <45ECE9AC.3090804@mvista.com>

Hello.

Mark Huth wrote:

>>>  #ifdef CONFIG_NET_POLL_CONTROLLER
>>>  static void natsemi_poll_controller(struct net_device *dev)
>>>  {
>>> +     struct netdev_private *np = netdev_priv(dev);
>>> +
>>>       disable_irq(dev->irq);
>>> -     intr_handler(dev->irq, dev);
>>> +
>>> +     /*
>>> +      * A real interrupt might have already reached us at this point
>>> +      * but NAPI might still haven't called us back.  As the
>>> interrupt
>>> +      * status register is cleared by reading, we should prevent an
>>> +      * interrupt loss in this case...
>>> +      */
>>> +     if (!np->intr_status)
>>> +             intr_handler(dev->irq, dev);
>>> +
>>>       enable_irq(dev->irq);

    Oops, I was going to recast the patch but my attention switched elsewhere 
for couple of days, and it "slipped" into mainline. I'm now preparing a better 
patch to also protect...

>> Is it possible for this to run at the same time as the NAPI poll?  If so
>> then it is possible for the netpoll poll to run between np->intr_status
>> being cleared and netif_rx_complete() being called.  If the hardware
>> asserts an interrupt at the wrong moment then this could cause the

> Well, there is a whole task of analyzing the netpoll conditions under 
> smp.  There appears to me to be a race with netpoll and NAPI on another 
> processor, given that netpoll can be called with virtually any system 
> condition on a debug breakpoint or crash dump initiation.  I'm spending 
> some time looking into it, but don't have a smoking gun immediately.  
> Regardless, if such a condition does exist, it is shared across many or 
> all of the potential netpolled devices.  Since that is exactly the 
> condition  the suggested patch purports to solve, it is pointless if the 
> whole NAPI/netpoll race exists.  Such a race would lead to various and 
> imaginative failures in the system.  So don't fix that problem in a 
> particular driver.  If it exists, fix it generally in the netpoll/NAPI 
> infrastructure.

    Any takers? :-)

>> In any case, this is a problem independently of netpoll if the chip
>> shares an interrupt with anything so the interrupt handler should be
>> fixed to cope with this situation instead.

> Yes, that would appear so.  If an interrupt line is shared with this 
> device, then the interrupt handler can be called again, even though the 
> device's interrupts are disabled on the interface.  So, in the actual 
> interrupt handler, check the dev->state __LINK_STATE_SCHED flag - if 
> it's set, leave immediately, it can't be our interrupt. If it's clear, 
> read the irq enable hardware register.  If enabled, do the rest of the 
> interrupt handler.

    It seems that there's no need to read it, as it gets set to 0 
"synchronously" with setting the 'hands_off' flag (except in NAPI callback)...

> Since the isr is disabled only by the interrupt 
> handler, and enabled only by the poll routine, the race on the interrupt 
> cause register is prevented.  And, as a byproduct, the netpoll race is 
> also prevented.  You could just always read the isr enable hardware 
> register, but that means you always do an operation to the chip, which 
> can be painfully slow.

    Yeah, it seems currently unjustified.  However IntrEnable would have been 
an ultimate criterion on taking or ignoring an interrupt otherwise...

> I guess the tradeoff depends on the probability 
> of getting the isr called when NAPI is active for the device.

    Didn't get it... :-/

> If this results in netpoll not getting a packet right away, that's okay, 
> since the netpoll users should try again.

    Well, in certain stuations (like KGDBoE), netpoll callback being called 
*while* NAPI callback is being executed would mean a deadlock, I think (as 
NAPI callback will never complete)...
    BTW, it seems I've found another interrupt lossage path in the driver:

netdev_poll() -> netdev_rx() -> reset_rx()

If the netdev_rx() detects an oversized packet, it will call reset_rx() which 
will spin in a loop "collecting" interrupt status until it sees RxResetDone 
there.  The issue is 'intr_status' field will get overwritten and interrupt 
status lost after netdev_rx() returns to netdev_poll().  How do you think, is 
this corner case worth fixing (by moving netdev_rx() call to the top of a 
do/while loop)?

> Mark Huth

WBR, Sergei

next prev parent reply	other threads:[~2007-03-10 20:25 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-05 20:10 [PATCH] natsemi: netpoll fixes Sergei Shtylyov
2007-03-05 22:41 ` Mark Brown
2007-03-05 22:43 ` Mark Brown
2007-03-06  4:10   ` Mark Huth
2007-03-10 20:25     ` Sergei Shtylyov [this message]
2007-03-11 12:16       ` Mark Brown
2007-03-12 13:05         ` Sergei Shtylyov
2007-03-12 19:11           ` Mark Brown
2007-03-13 13:53             ` Sergei Shtylyov
2007-03-13 19:31               ` Mark Brown
2007-03-12 19:12           ` Sergei Shtylyov
2007-03-12 19:05         ` Mark Huth
2007-03-13  0:14           ` Mark Brown
2007-03-13 13:45             ` Sergei Shtylyov
2007-03-06 11:10 ` Jeff Garzik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=45F31421.4010908@ru.mvista.com \
    --to=sshtylyov@ru.mvista.com \
    --cc=jgarzik@pobox.com \
    --cc=mhuth@mvista.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).