All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matt Mackall <mpm@selenic.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>, netdev@oss.sgi.com
Subject: Re: serious netpoll bug w/NAPI
Date: Tue, 15 Feb 2005 21:07:22 -0800	[thread overview]
Message-ID: <20050216050722.GC3358@waste.org> (raw)
In-Reply-To: <16914.31886.665975.522710@segfault.boston.redhat.com>

On Tue, Feb 15, 2005 at 05:49:50PM -0500, Jeff Moyer wrote:
> ==> Regarding Re: serious netpoll bug w/NAPI; Matt Mackall <mpm@selenic.com> adds:
> 
> Sorry, Matt, I'm just now getting to this.
> 
> mpm> On Wed, Feb 09, 2005 at 04:46:58PM -0800, David S. Miller wrote:
> >> On Wed, 9 Feb 2005 10:32:19 -0800 Matt Mackall <mpm@selenic.com> wrote:
> >> 
> >> > On closer inspection, there's a couple other related failure cases >
> >> with the new ->poll logic in netpoll. I'm afraid it looks like >
> >> CONFIG_NETPOLL will need to guard ->poll() with a per-device spinlock >
> >> on netpoll-enabled devices.
> >> > 
> >> > This will mean putting a pointer to struct netpoll in struct >
> >> net_device (which I should have done in the first place) and will take >
> >> a few patches to sort out.
> >> 
> >> Will this ->poll() guarding lock be acquired only in the netpoll code or
> >> system-wide?  If the latter, this introduced an incredible performance
> >> regression for devices using the LLTX locking scheme (ie. the most
> >> important high-performance gigabit drivers in the tree use this).
> 
> mpm> The lock will only be taken in net_rx_action iff netpoll is enabled
> mpm> for the given device. Lock contention should be basically nil.
> 
> mpm> Here's my current patch (on top of -mm), which I'm struggling to find
> mpm> an appropriate test box for (my dual box with NAPI got pressed into
> mpm> service as a web server a couple weeks ago). I've attached the other
> mpm> two patches it relies on as well.
> 
> mpm> --------------
> 
> mpm> Introduce a per-client poll lock and flag. The lock assures we never
> mpm> have more than one caller in dev->poll(). The flag provides recursion
> mpm> avoidance on UP where the lock disappears.
> 
> ,----
> |  /*
> | - * Check whether delayed processing was scheduled for our current CPU,
> | - * and then manually invoke NAPI polling to pump data off the card.
> | + * Check whether delayed processing was scheduled for our NIC. If so,
> | + * we attempt to grab the poll lock and use ->poll() to pump the card.
> | + * If this fails, either we've recursed in ->poll() or it's already
> | + * running on another CPU.
> | + *
> | + * Note: we don't mask interrupts with this lock because we're using
> | + * trylock here and interrupts are already disabled in the softirq
> | + * case. Further, we test the poll_flag to avoid recursion on UP
> | + * systems where the lock doesn't exist.
> |   *
> |   * In cases where there is bi-directional communications, reading only
> |   * one message at a time can lead to packets being dropped by the
> | @@ -74,13 +80,9 @@
> |  static void poll_napi(struct netpoll *np)
> |  {
> |  	int budget = 16;
> | -	unsigned long flags;
> | -	struct softnet_data *queue;
> |  
> | -	spin_lock_irqsave(&netpoll_poll_lock, flags);
> | -	queue = &__get_cpu_var(softnet_data);
> |  	if (test_bit(__LINK_STATE_RX_SCHED, &np->dev->state) &&
> | -	    !list_empty(&queue->poll_list)) {
> | +	    !np->poll_flag && spin_trylock(&np->poll_lock)) {
> |  		np->rx_flags |= NETPOLL_RX_DROP;
> |  		atomic_inc(&trapped);
> |  
> | @@ -88,8 +90,8 @@
> |  
> |  		atomic_dec(&trapped);
> |  		np->rx_flags &= ~NETPOLL_RX_DROP;
> | +		spin_unlock(&np->poll_lock);
> |  	}
> | -	spin_unlock_irqrestore(&netpoll_poll_lock, flags);
> |  }
> 
> Okay, I've only taken a quick glance at this, but I don't quite understand
> why it's safe to take out the check for the per-cpu queue.  Realize that an
> interrupt may have been serviced on another processor, and a NAPI poll may
> have been scheduled there.

Because dev->np->poll_lock now serializes all access to ->poll (when
netpoll is enabled on said device).

> Also, could you use the -p flag to diff when you generate your next patch?
> It makes it *much* easier to review.

Hmm, somehow my QUILT_DIFF_OPTS got lost, thanks.

I've just now recovered my SMP+NAPI box, so I can take a stab at
actually testing this shortly.

-- 
Mathematics is the supreme nostalgia of our time.

  reply	other threads:[~2005-02-16  5:07 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-02-09  4:16 serious netpoll bug w/NAPI David S. Miller
2005-02-09 18:32 ` Matt Mackall
2005-02-10  0:46   ` David S. Miller
2005-02-10  1:11     ` Matt Mackall
2005-02-10  9:16       ` Martin Josefsson
2005-02-10 17:14         ` Matt Mackall
2005-02-15 22:49       ` Jeff Moyer
2005-02-16  5:07         ` Matt Mackall [this message]
2005-02-16 19:26           ` Jeff Moyer
2005-02-16 22:07           ` Jeff Moyer
2005-02-16 23:02           ` David S. Miller
2005-02-16 23:44             ` Matt Mackall
2005-02-16 23:54               ` David S. Miller
2005-02-17  0:15                 ` Matt Mackall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20050216050722.GC3358@waste.org \
    --to=mpm@selenic.com \
    --cc=davem@davemloft.net \
    --cc=jmoyer@redhat.com \
    --cc=netdev@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.