netdev.vger.kernel.org archive mirror
From: Sabrina Dubroca <sd@queasysnail.net>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, peterz@infradead.org
Subject: Re: [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller
Date: Thu, 5 Feb 2015 01:20:59 +0100
Message-ID: <20150205002059.GA28282@kria>
In-Reply-To: <20150105143126.GA22122@kria>

Thomas, ping?

The thread is here, if you need it:
https://marc.info/?l=linux-netdev&m=141833435000554&w=2

2015-01-05, 15:31:26 +0100, Sabrina Dubroca wrote:
> 2014-12-12, 23:01:28 +0100, Thomas Gleixner wrote:
> > On Thu, 11 Dec 2014, Sabrina Dubroca wrote:
> > > 2014-12-09, 21:44:33 -0500, David Miller wrote:
> > > > 
> > > > Adding a new spinlock to every interrupt service routine is
> > > > simply a non-starter.
> > > > 
> > > > You will certainly have to find a way to fix this in a way
> > > > that doesn't involve adding any new overhead to the normal
> > > > operational paths of these drivers.
> > > 
> > > Okay. Here is another idea.
> > > 
> > > Since the issue is with the wait_event() part of synchronize_irq(),
> > > and it only takes care of threaded handlers, maybe we could try not
> > > waiting for threaded handlers.
> > > 
> > > Introduce disable_irq_nosleep() that returns true if it successfully
> > > synchronized against all handlers (there was no threaded handler
> > > running), false if it left some threads running.  And in
> > > ->ndo_poll_controller, only call the interrupt handler if
> > > synchronization was successful.
> > > 
> > > Both users of the poll controllers retry their action (alloc/xmit an
> > > skb) several times, with calls to the device's poll controller between
> > > attempts.  And hopefully, if the first attempt fails, we will still
> > > manage to get through?
> > 
> > Hopefully is not a good starting point. Is the poll controller
> > definitely retrying? Otherwise you might end up with the following:
> > 
> > Interrupt line is shared between your network device and a
> > device which requested a threaded interrupt handler.
> > 
> >   CPU0                              CPU1
> >   interrupt()
> >     your_device_handler()
> >       return NONE;
> >     shared_device_handler()
> >       return WAKE_THREAD;
> >       --> atomic_inc(threads_active);
> >                                     poll()
> >                                       disable_irq_nosleep()
> >                                         sync_hardirq()
> >                                         return atomic_read(threads_active);
> > 
> > So if you do not have a reliable retry, you might end up in a
> > stale state. This can happen if the interrupt is edge-triggered,
> > because we do not disable the interrupt when we wake up the thread,
> > for obvious reasons.
> 
> We do have loops retrying to run the netpoll controller, and trying to
> do the work even if the controller doesn't help.  And by hopefully I
> mean: even if we fail, we tried our best and netpoll isn't 100%
> reliable.
> 
> 
> static struct sk_buff *find_skb(struct netpoll *np, int len, int reserve)
> {
> 	...
> repeat:
> 
> 	skb = alloc_skb(len, GFP_ATOMIC);
> 	if (!skb)
> 		skb = skb_dequeue(&skb_pool);
> 
> 	if (!skb) {
> 		if (++count < 10) {
> 			netpoll_poll_dev(np->dev);
> 			goto repeat;
> 		}
> 		return NULL;
> 	}
> 
> 	...
> }
> 
> void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
> 			     struct net_device *dev)
> {
> 	...
> 
> 		/* try until next clock tick */
> 		for (tries = jiffies_to_usecs(1)/USEC_PER_POLL;
> 		     tries > 0; --tries) {
> 			if (HARD_TX_TRYLOCK(dev, txq)) {
> 				if (!netif_xmit_stopped(txq))
> 					status = netpoll_start_xmit(skb, dev, txq);
> 
> 				HARD_TX_UNLOCK(dev, txq);
> 
> 				if (status == NETDEV_TX_OK)
> 					break;
> 
> 			}
> 
> 			/* tickle device maybe there is some cleanup */
> 			netpoll_poll_dev(np->dev);
> 
> 			udelay(USEC_PER_POLL);
> 		}
> 
> 	...
> }
> 
> 
> 
> > Aside from that, I think that something like this is a reasonable
> > approach to the problem.
> > 
> > The only other nitpicks I have are:
> > 
> >     - The name of the function sucks, though my tired brain can't
> >       come up with something reasonable right now
> 
> I couldn't think of anything better.  Maybe 'disable_irq_trysync' or
> 'disable_irq_hardsync'?
> 
> Or maybe you prefer something that works like spin_trylock, and
> reenables the irq before returning if we can't sync?  Maybe the risk
> of abuse would be a bit lower this way?
> 
> I made synchronize_irq_nosleep static, but maybe it should be
> EXPORT_SYMBOL'ed as well.  I didn't need that for e1000, but that
> would be more consistent.
> 
> 
> >     - The lack of extensive documentation on how this interface is
> >       supposed to be used and the pitfalls of abuse, both in the
> >       function documentation and the changelog.
> > 
> >       Merely copying the existing documentation of the other
> >       interface is not sufficient.
> 
> 
> Yes, my email wasn't really a changelog, just a description and RFC.
> 
> 
> Modified documentation:
> 
> -----
> disable_irq_nosleep - disable an irq and wait for completion of hard IRQ handlers
> @irq: Interrupt to disable
> 
> Disable the selected interrupt line.  Enables and Disables are
> nested.
> This function does not sleep, and is safe to call in atomic context.
> 
> This function waits for any pending hard IRQ handlers for this
> interrupt to complete before returning. If you use this
> function while holding a resource the IRQ handler may need, you
> will deadlock.
> 
> This function does not wait for threaded IRQ handlers.
> Returns true if synchronized, false if there are threaded
> handlers pending.
> 
> If false is returned, the caller must assume that synchronization
> didn't occur, and that it is NOT safe to proceed.
> The caller MUST reenable the interrupt by calling enable_irq in all
> cases.
> 
> This function may be called - with care - from IRQ context.
> -----
> 
> 
> Thanks.
> 
> --
> Sabrina
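
The contract in the modified documentation above (disable, report whether
synchronization succeeded, always re-enable) can be sketched as a small
userspace model. This is only an illustration under stated assumptions, not
the patch under discussion: `struct irq_model` and every `*_model` function
are hypothetical names, plain counters stand in for the real IRQ descriptor
state, and the busy-wait for running hard IRQ handlers is omitted.

```c
/* Userspace model of the proposed disable_irq_nosleep() contract.
 * Nothing here is kernel code: struct irq_model and the *_model
 * functions are hypothetical stand-ins for illustration only. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct irq_model {
	int depth;                  /* nested disable count; 0 means enabled */
	atomic_int threads_active;  /* threaded handlers woken but not finished */
	int handler_calls;          /* times the poll path ran the handler */
};

static void irq_model_init(struct irq_model *irq)
{
	irq->depth = 0;
	atomic_init(&irq->threads_active, 0);
	irq->handler_calls = 0;
}

/* Hard IRQ path on a shared line: some handler asked for its thread. */
static void hardirq_wakes_thread_model(struct irq_model *irq)
{
	atomic_fetch_add(&irq->threads_active, 1);
}

/* The threaded handler has finished. */
static void irq_thread_done_model(struct irq_model *irq)
{
	atomic_fetch_sub(&irq->threads_active, 1);
}

/* Disable the line without sleeping. Returns true only if no threaded
 * handler is pending. The line ends up disabled in both cases. */
static bool disable_irq_nosleep_model(struct irq_model *irq)
{
	irq->depth++;
	/* A real implementation would first wait for hard IRQ handlers. */
	return atomic_load(&irq->threads_active) == 0;
}

/* Undo one disable; an unbalanced enable is a caller bug. */
static void enable_irq_model(struct irq_model *irq)
{
	assert(irq->depth > 0);
	irq->depth--;
}

/* Poll controller: run the handler only when synchronization succeeded,
 * and re-enable the line in all cases. Returns whether the handler ran. */
static bool poll_controller_model(struct irq_model *irq)
{
	bool synced = disable_irq_nosleep_model(irq);

	if (synced)
		irq->handler_calls++;  /* stands in for the driver's ISR */
	enable_irq_model(irq);
	return synced;
}
```

In the shared-line scenario from the diagram, calling
hardirq_wakes_thread_model() before poll_controller_model() makes the poll
return false without running the handler, while the disable depth still
drops back to zero. A retrying caller, such as the loops quoted earlier,
would get through once irq_thread_done_model() has run.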
