netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sergei Shtylyov <sshtylyov@ru.mvista.com>
To: Mark Brown <broonie@sirena.org.uk>
Cc: Mark Huth <mhuth@mvista.com>, jgarzik@pobox.com, netdev@vger.kernel.org
Subject: Re: [PATCH] natsemi: netpoll fixes
Date: Mon, 12 Mar 2007 16:05:48 +0300	[thread overview]
Message-ID: <45F5502C.1000905@ru.mvista.com> (raw)
In-Reply-To: <20070311121615.GF8133@sirena.org.uk>

Hello.

Mark Brown wrote:

>>   Oops, I was going to recast the patch but my attention switched 
>>   elsewhere for couple of days, and it "slipped" into mainline. I'm now 
>>preparing a better patch to also protect...

> Ah, I was also looking at it.  I enclose my current patch which appears
> to work although I have not finished testing it yet.

>>>interrupt handler, check the dev->state __LINK_STATE_SCHED flag - if 
>>>it's set, leave immediately, it can't be our interrupt. If it's clear, 
>>>read the irq enable hardware register.  If enabled, do the rest of the 
>>>interrupt handler.

>>   It seems that there's no need to read it, as it gets set to 0 
>>"synchronously" with setting the 'hands_off' flag (except in NAPI 
>>callback)...

> hands_off is stronger than that - it's used for sync with some of the
> other code paths like suspend/resume and means "don't touch the chip".
> I've added a new driver local flag instead.

    I'm not sure it was worth it -- we already had __LINK_STATE_RX_SCHED...

>>   Yeah, it seems currently unjustified.  However IntrEnable would have 
>>   been an ultimate criterion on taking or ignoring an interrupt otherwise...

>>>I guess the tradeoff depends on the probability 
>>>of getting the isr called when NAPI is active for the device.

>>   Didn't get it... :-/

    I mean I didn't understand why there's tradeoff and how it depends on the 
probability...

> Some systems can provoke this fairly easily - Sokeris have some boards
> where at least three natsemis share a single interrupt line, for example
> (the model I'm looking at has three, they look to have another
> configuration where 5 would end up on the line).

    PCI means IRQ sharing, generally. So, it should have been fixed anyway. :-)

>>   BTW, it seems I've found another interrupt lossage path in the driver:

>>netdev_poll() -> netdev_rx() -> reset_rx()

>>If the netdev_rx() detects an oversized packet, it will call reset_rx() 
>>which will spin in a loop "collecting" interrupt status until it sees 
>>RxResetDone there.  The issue is 'intr_status' field will get overwritten 
>>and interrupt status lost after netdev_rx() returns to netdev_poll().  How 
>>do you think, is this corner case worth fixing (by moving netdev_rx() call
>>to the top of a do/while loop)?

> Moving netdev_rx() would fix that one but there's some others too -
> there's one in the timer routine if the chip crashes.  In the case you

    Erm, sorry, I'm not seeing it -- could you "point with finger" please? :-)

> describe above the consequences shouldn't be too bad since it tends to
> only occur at high volume so further traffic will tend to occur and
> cause things to recover - all the testing of that patch was done with
> the bug present and no ill effects.

    Oversized packets occur only at high volume? Is it some errata?

> Subject: natsemi: Fix NAPI for interrupt sharing
> To: Jeff Garzik <jeff@garzik.org>
> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>, Simon Blake <simon@citylink.co.nz>, John Philips <johnphilips42@yahoo.com>, netdev@vger.kernel.org

> The interrupt status register for the natsemi chips is clear on read and
> was read unconditionally from both the interrupt and from the NAPI poll
> routine, meaning that if the interrupt service routine was called (for 
> example, due to a shared interrupt) while a NAPI poll was scheduled
> interrupts could be missed.  This patch fixes that by ensuring that the
> interrupt status register is only read when there is no poll scheduled.

> It also reverts a workaround for this problem from the netpoll hook.

> Thanks to Sergei Shtylyov <sshtylyov@ru.mvista.com> for spotting the
> issue and Simon Blake <simon@citylink.co.nz> for testing resources.

    Thanks for the patch!
    (If I only knew somebody else was working on that issue, it could have 
saved my cycles, sigh... but well, I should have said  that I was going to 
recast the patch. :-)

> Signed-Off-By: Mark Brown <broonie@sirena.org.uk>

> Index: linux-2.6/drivers/net/natsemi.c
> ===================================================================
> --- linux-2.6.orig/drivers/net/natsemi.c	2007-03-11 02:32:43.000000000 +0000
> +++ linux-2.6/drivers/net/natsemi.c	2007-03-11 12:09:14.000000000 +0000
> @@ -571,6 +571,8 @@
>  	int oom;
>  	/* Interrupt status */
>  	u32 intr_status;
> +	int poll_active;
> +	spinlock_t intr_lock;
>  	/* Do not touch the nic registers */
>  	int hands_off;
>  	/* Don't pay attention to the reported link state. */
> @@ -812,9 +814,11 @@
>  	pci_set_drvdata(pdev, dev);
>  	np->iosize = iosize;
>  	spin_lock_init(&np->lock);
> +	spin_lock_init(&np->intr_lock);
>  	np->msg_enable = (debug >= 0) ? (1<<debug)-1 : NATSEMI_DEF_MSG;
>  	np->hands_off = 0;
>  	np->intr_status = 0;
> +	np->poll_active = 0;
>  	np->eeprom_size = natsemi_pci_info[chip_idx].eeprom_size;
>  	if (natsemi_pci_info[chip_idx].flags & NATSEMI_FLAG_IGNORE_PHY)
>  		np->ignore_phy = 1;
> @@ -1406,6 +1410,8 @@
>  	writel(rfcr, ioaddr + RxFilterAddr);
>  }
>  
> +/* MUST be called so that both NAPI poll and ISR are excluded due to
> + * use of intr_status. */
>  static void reset_rx(struct net_device *dev)
>  {
>  	int i;
> @@ -2118,30 +2124,45 @@
>  	struct net_device *dev = dev_instance;
>  	struct netdev_private *np = netdev_priv(dev);
>  	void __iomem * ioaddr = ns_ioaddr(dev);
> +	unsigned long flags;
> +	irqreturn_t status = IRQ_NONE;
>  
>  	if (np->hands_off)
>  		return IRQ_NONE;
>  
> -	/* Reading automatically acknowledges. */
> -	np->intr_status = readl(ioaddr + IntrStatus);
> -
> -	if (netif_msg_intr(np))
> -		printk(KERN_DEBUG
> -		       "%s: Interrupt, status %#08x, mask %#08x.\n",
> -		       dev->name, np->intr_status,
> -		       readl(ioaddr + IntrMask));
> +	spin_lock_irqsave(&np->intr_lock, flags);

    Yeah, I've suspected that we need to grab np->lock here... but does that 
separate spinlock actually protect us from anything?

> -	if (!np->intr_status)
> -		return IRQ_NONE;
> +	/* Reading IntrStatus automatically acknowledges so don't do
> +	 * that while a poll is scheduled.  */
> +	if (!np->poll_active) {
> +		np->intr_status = readl(ioaddr + IntrStatus);
>  
> -	prefetch(&np->rx_skbuff[np->cur_rx % RX_RING_SIZE]);
> +		if (netif_msg_intr(np))
> +			printk(KERN_DEBUG
> +			       "%s: Interrupt, status %#08x, mask %#08x.\n",
> +			       dev->name, np->intr_status,
> +			       readl(ioaddr + IntrMask));
> +
> +		if (np->intr_status) {
> +			prefetch(&np->rx_skbuff[np->cur_rx % RX_RING_SIZE]);
> +
> +			/* Disable interrupts and register for poll */
> +			if (netif_rx_schedule_prep(dev)) {
> +				natsemi_irq_disable(dev);
> +				__netif_rx_schedule(dev);
> +				np->poll_active = 1;
> +			} else
> +				printk(KERN_WARNING
> +			       	       "%s: Ignoring interrupt, status %#08x, mask %#08x.\n",
> +				       dev->name, np->intr_status,
> +				       readl(ioaddr + IntrMask));
>  
> -	if (netif_rx_schedule_prep(dev)) {
> -		/* Disable interrupts and register for poll */
> -		natsemi_irq_disable(dev);
> -		__netif_rx_schedule(dev);
> +			status = IRQ_HANDLED;
> +		}
>  	}
> -	return IRQ_HANDLED;
> +
> +	spin_unlock_irqrestore(&np->intr_lock, flags);
> +	return status;
>  }
>  
>  /* This is the NAPI poll routine.  As well as the standard RX handling
> @@ -2154,8 +2175,15 @@
>  
>  	int work_to_do = min(*budget, dev->quota);
>  	int work_done = 0;
> +	unsigned long flags;
>  
>  	do {
> +		if (netif_msg_intr(np))
> +			printk(KERN_DEBUG
> +			       "%s: Poll, status %#08x, mask %#08x.\n",
> +			       dev->name, np->intr_status,
> +			       readl(ioaddr + IntrMask));
> +
>  		if (np->intr_status &
>  		    (IntrTxDone | IntrTxIntr | IntrTxIdle | IntrTxErr)) {
>  			spin_lock(&np->lock);
> @@ -2182,14 +2210,19 @@
>  		np->intr_status = readl(ioaddr + IntrStatus);
>  	} while (np->intr_status);
>  
> +	/* We need to ensure that the ISR doesn't run between telling
> +	 * NAPI we're done and enabling the interrupt. */

    Why? :-O

> +	spin_lock_irqsave(&np->intr_lock, flags);
> +
>  	netif_rx_complete(dev);
> +	np->poll_active = 0;
>  
>  	/* Reenable interrupts providing nothing is trying to shut
>  	 * the chip down. */
> -	spin_lock(&np->lock);
> -	if (!np->hands_off && netif_running(dev))
> +	if (!np->hands_off)
>  		natsemi_irq_enable(dev);
> -	spin_unlock(&np->lock);
> +
> +	spin_unlock_irqrestore(&np->intr_lock, flags);
>  
>  	return 0;
>  }

WBR, Sergei

  reply	other threads:[~2007-03-12 13:06 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-05 20:10 [PATCH] natsemi: netpoll fixes Sergei Shtylyov
2007-03-05 22:41 ` Mark Brown
2007-03-05 22:43 ` Mark Brown
2007-03-06  4:10   ` Mark Huth
2007-03-10 20:25     ` Sergei Shtylyov
2007-03-11 12:16       ` Mark Brown
2007-03-12 13:05         ` Sergei Shtylyov [this message]
2007-03-12 19:11           ` Mark Brown
2007-03-13 13:53             ` Sergei Shtylyov
2007-03-13 19:31               ` Mark Brown
2007-03-12 19:12           ` Sergei Shtylyov
2007-03-12 19:05         ` Mark Huth
2007-03-13  0:14           ` Mark Brown
2007-03-13 13:45             ` Sergei Shtylyov
2007-03-06 11:10 ` Jeff Garzik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=45F5502C.1000905@ru.mvista.com \
    --to=sshtylyov@ru.mvista.com \
    --cc=broonie@sirena.org.uk \
    --cc=jgarzik@pobox.com \
    --cc=mhuth@mvista.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).