netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Eric Dumazet <dada1@cosmosbay.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Paul Mackerras <paulus@samba.org>,
	paulmck@linux.vnet.ibm.com, Evgeniy Polyakov <zbr@ioremap.net>,
	David Miller <davem@davemloft.net>,
	kaber@trash.net, jeff.chua.linux@gmail.com, laijs@cn.fujitsu.com,
	jengelh@medozas.de, r000n@r000n.net,
	linux-kernel@vger.kernel.org, netfilter-devel@vger.kernel.org,
	netdev@vger.kernel.org, benh@kernel.crashing.org,
	mathieu.desnoyers@polymtl.ca
Subject: Re: [PATCH] netfilter: use per-cpu recursive lock (v11)
Date: Wed, 22 Apr 2009 13:18:17 +0200	[thread overview]
Message-ID: <20090422111817.GA21190@elte.hu> (raw)
In-Reply-To: <49EEDAF0.2010507@cosmosbay.com>


* Eric Dumazet <dada1@cosmosbay.com> wrote:

> netfilter in 2.6.2[0-9] used :
> 
> CPU 1
> 
> sofirq handles one packet from a NIC
> 
> ipt_do_table() /* to handle INPUT table for example, or FORWARD */
> read_lock_bh(&a_central_and_damn_rwlock)
> ... parse rules
>     -> calling some netfilter sub-function
>        re-entering network IP stack to send some packet (say a RST packet)
>        ...
>        ipt_do_table() /* to handle OUPUT table rules for example */
>        read_lock_bh() ; /* WE RECURSE here, but once in a while (if ever) */
> 
> This is one of the case, but other can happens with virtual 
> networks, tunnels, ... and so on. (Stephen had some cases with KVM 
> if I remember well)
> 
> If this could be done without recursion, I am pretty sure 
> netfilter and network guys would have done it. I found Linus 
> reaction quite shocking IMHO, considering hard work done by all 
> people on this.

Again ... i dont know this code well, but you yourself describe it 
as "a_central_and_damn_rwlock" above.

"Central and damn" locks of any type, anywhere tend to cause such 
trouble.

Often they are mini-BKL's in the making. Here's some historic 
patterns:

 1- First there's a convenient lock around a popular and useful 
    piece of data and code.

 2- As popularity (and reach of code) grows, and some non-trivial 
    interaction ensues, a little bit of self-recursion is added to 
    the mix (this is often easier to do than to fix the root cause 
    of the problem) - making critical sections even easier to grow 
    in size.

 3- Then attempts are made to make it scale all (it's a popular 
    piece of code), it's extended along a per-cpu axis, but 
    complexity of locking explode by taking a ton of locks all at 
    once. The solution: yell at the lock validator for not allowing
    this (or for exploding due to the sheer mathematically 
    large complexity of the locking rules). Frequent requests for
    an exclusion from those pesky validations are made, and various
    hacks are done to work it around. It's all the fault of the lock
    validator, of course.

 4- At this point it's rarely a clean, tidy data lock anymore - it 
    tends to grow into a code lock nobody really knows how it works, 
    except that it better be taken more often than not, badness may
    ensue otherwise. Nobody really knows when to take it, only that 
    it should be taken widely enough, that it should be recursive 
    enough to call even _more_ code from under it. Efforts to reduce 
    critical section length are rebuffed with: "this adds unlocking
    and relocking overhead". And it should then all scale as well.

 5- In the end it's a lock everyone curses but nobody is able to fix 
    anymore. "If it was easy to fix we'd have fixed it long ago 
    already" kind of fatalist thinking becomes widespread.

We saw many examples of this in the past: beyond the BKL (which is a 
bit special as its more of an UP legacy - but which hurt us the 
most), we had the tasklist_lock which was problematic for a long 
time, then there's also the reiser3 locking which was extremely 
monolithic. [ i'm only naming safe examples here, where nobody will 
flame me =B-) ]

Whether this particular case applies is up to you. I see certain 
matches up to phase 3 or so - but i also see certain dissimilarities 
as well. Nor do i claim that it's easy to fix or improve.

	Ingo

  parent reply	other threads:[~2009-04-22 11:19 UTC|newest]

Thread overview: 215+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <Pine.LNX.4.64.0904101656190.2093@boston.corp.fedex.com>
     [not found] ` <20090410095246.4fdccb56@s6510>
2009-04-11  1:25   ` iptables very slow after commit784544739a25c30637397ace5489eeb6e15d7d49 David Miller
2009-04-11  1:39     ` iptables very slow after commit 784544739a25c30637397ace5489eeb6e15d7d49 Linus Torvalds
2009-04-11  4:15       ` Paul E. McKenney
2009-04-11  5:14         ` Jan Engelhardt
2009-04-11  5:42           ` Paul E. McKenney
2009-04-11  6:00           ` David Miller
2009-04-11 18:12             ` Kyle Moffett
2009-04-11 18:32               ` Arkadiusz Miskiewicz
2009-04-12  0:54               ` david
2009-04-12  5:05                 ` Kyle Moffett
2009-04-12 12:30                 ` Harald Welte
2009-04-12 16:38             ` Jan Engelhardt
2009-04-11 15:07           ` Stephen Hemminger
2009-04-11 16:05             ` Jeff Chua
2009-04-11 17:51           ` Linus Torvalds
2009-04-11  7:08         ` Ingo Molnar
2009-04-11 15:05           ` Stephen Hemminger
2009-04-11 17:48           ` Paul E. McKenney
2009-04-12 10:54             ` Ingo Molnar
2009-04-12 11:34             ` Paul Mackerras
2009-04-12 17:31               ` Paul E. McKenney
2009-04-13  1:13                 ` David Miller
2009-04-13  4:04                   ` Paul E. McKenney
2009-04-13 16:53                     ` [PATCH] netfilter: use per-cpu spinlock rather than RCU Stephen Hemminger
2009-04-13 17:40                       ` Eric Dumazet
2009-04-13 18:11                         ` Stephen Hemminger
2009-04-13 19:06                       ` Martin Josefsson
2009-04-13 19:17                         ` Linus Torvalds
2009-04-13 22:24                       ` Andrew Morton
2009-04-13 23:20                         ` Stephen Hemminger
2009-04-13 23:26                           ` Andrew Morton
2009-04-13 23:37                             ` Linus Torvalds
2009-04-13 23:52                               ` Ingo Molnar
2009-04-14 12:27                       ` Patrick McHardy
2009-04-14 14:23                         ` Eric Dumazet
2009-04-14 14:45                           ` Stephen Hemminger
2009-04-14 15:49                             ` Eric Dumazet
2009-04-14 16:51                               ` Jeff Chua
2009-04-14 18:17                                 ` [PATCH] netfilter: use per-cpu spinlock rather than RCU (v2) Stephen Hemminger
2009-04-14 19:28                                   ` Eric Dumazet
2009-04-14 21:11                                     ` Stephen Hemminger
2009-04-14 21:13                                     ` [PATCH] netfilter: use per-cpu spinlock rather than RCU (v3) Stephen Hemminger
2009-04-14 21:40                                       ` Eric Dumazet
2009-04-15 10:59                                         ` Patrick McHardy
2009-04-15 16:31                                           ` Stephen Hemminger
2009-04-15 20:55                                           ` Stephen Hemminger
2009-04-15 21:07                                             ` Eric Dumazet
2009-04-15 21:55                                               ` Jan Engelhardt
2009-04-16 12:12                                                 ` Patrick McHardy
2009-04-16 12:24                                                   ` Jan Engelhardt
2009-04-16 12:31                                                     ` Patrick McHardy
2009-04-15 21:57                                               ` [PATCH] netfilter: use per-cpu rwlock rather than RCU (v4) Stephen Hemminger
2009-04-15 23:48                                               ` [PATCH] netfilter: use per-cpu spinlock rather than RCU (v3) David Miller
2009-04-16  0:01                                                 ` Stephen Hemminger
2009-04-16  0:05                                                   ` David Miller
2009-04-16 12:28                                                     ` Patrick McHardy
2009-04-16  0:10                                                   ` Linus Torvalds
2009-04-16  0:45                                                     ` [PATCH] netfilter: use per-cpu spinlock and RCU (v5) Stephen Hemminger
2009-04-16  5:01                                                       ` Eric Dumazet
2009-04-16 13:53                                                         ` Patrick McHardy
2009-04-16 14:47                                                           ` Paul E. McKenney
2009-04-16 16:10                                                             ` [PATCH] netfilter: use per-cpu recursive spinlock (v6) Eric Dumazet
2009-04-16 16:20                                                               ` Eric Dumazet
2009-04-16 16:37                                                               ` Linus Torvalds
2009-04-16 16:59                                                                 ` Patrick McHardy
2009-04-16 17:58                                                               ` Paul E. McKenney
2009-04-16 18:41                                                                 ` Eric Dumazet
2009-04-16 20:49                                                                   ` [PATCH[] netfilter: use per-cpu reader-writer lock (v0.7) Stephen Hemminger
2009-04-16 21:02                                                                     ` Linus Torvalds
2009-04-16 23:04                                                                       ` Ingo Molnar
2009-04-17  0:13                                                                   ` [PATCH] netfilter: use per-cpu recursive spinlock (v6) Paul E. McKenney
2009-04-16 13:11                                                     ` [PATCH] netfilter: use per-cpu spinlock rather than RCU (v3) Patrick McHardy
2009-04-16 22:33                                                       ` David Miller
2009-04-16 23:49                                                         ` Paul E. McKenney
2009-04-16 23:52                                                           ` [PATCH] netfilter: per-cpu spin-lock with recursion (v0.8) Stephen Hemminger
2009-04-17  0:15                                                             ` Jeff Chua
2009-04-17  5:55                                                             ` Peter Zijlstra
2009-04-17  6:03                                                             ` Eric Dumazet
2009-04-17  6:14                                                               ` Eric Dumazet
2009-04-17 17:08                                                                 ` Peter Zijlstra
2009-04-17 11:17                                                               ` Patrick McHardy
2009-04-17  1:28                                                           ` [PATCH] netfilter: use per-cpu spinlock rather than RCU (v3) Paul E. McKenney
2009-04-17  2:19                                                             ` Mathieu Desnoyers
2009-04-17  5:05                                                               ` Paul E. McKenney
2009-04-17  5:44                                                                 ` Mathieu Desnoyers
2009-04-17 14:51                                                                   ` Paul E. McKenney
2009-04-17  4:50                                                             ` Stephen Hemminger
2009-04-17  5:08                                                               ` Paul E. McKenney
2009-04-17  5:16                                                               ` Eric Dumazet
2009-04-17  5:40                                                                 ` Paul E. McKenney
2009-04-17  8:07                                                                   ` David Miller
2009-04-17 15:00                                                                     ` Paul E. McKenney
2009-04-17 17:22                                                                     ` Peter Zijlstra
2009-04-17 17:32                                                                       ` Linus Torvalds
2009-04-17  6:12                                                             ` Peter Zijlstra
2009-04-17 16:33                                                               ` Paul E. McKenney
2009-04-17 16:51                                                                 ` Peter Zijlstra
2009-04-17 21:29                                                                   ` Paul E. McKenney
2009-04-18  9:40                                                             ` Evgeniy Polyakov
2009-04-18 14:14                                                               ` Paul E. McKenney
2009-04-20 17:34                                                                 ` [PATCH] netfilter: use per-cpu recursive lock (v10) Stephen Hemminger
2009-04-20 18:21                                                                   ` Paul E. McKenney
2009-04-20 18:25                                                                   ` Eric Dumazet
2009-04-20 20:32                                                                     ` Stephen Hemminger
2009-04-20 20:42                                                                     ` Stephen Hemminger
2009-04-20 21:05                                                                       ` Paul E. McKenney
2009-04-20 21:23                                                                     ` Paul Mackerras
2009-04-20 21:58                                                                       ` Paul E. McKenney
2009-04-20 22:41                                                                         ` Paul Mackerras
2009-04-20 23:01                                                                           ` [PATCH] netfilter: use per-cpu recursive lock (v11) Stephen Hemminger
2009-04-21  3:41                                                                             ` Lai Jiangshan
2009-04-21  3:56                                                                               ` Eric Dumazet
2009-04-21  4:15                                                                                 ` Stephen Hemminger
2009-04-21  5:22                                                                                 ` Lai Jiangshan
2009-04-21  5:45                                                                                   ` Stephen Hemminger
2009-04-21  6:52                                                                                     ` Lai Jiangshan
2009-04-21  8:16                                                                                       ` Evgeniy Polyakov
2009-04-21  8:42                                                                                         ` Lai Jiangshan
2009-04-21  8:49                                                                                           ` David Miller
2009-04-21  8:55                                                                                         ` Eric Dumazet
2009-04-21  9:22                                                                                           ` Evgeniy Polyakov
2009-04-21  9:34                                                                                           ` Lai Jiangshan
2009-04-21  5:34                                                                                 ` Lai Jiangshan
2009-04-21  4:59                                                                             ` Eric Dumazet
2009-04-21 16:37                                                                               ` Paul E. McKenney
2009-04-21  5:46                                                                             ` Lai Jiangshan
2009-04-21 16:13                                                                             ` Linus Torvalds
2009-04-21 16:43                                                                               ` Stephen Hemminger
2009-04-21 16:50                                                                                 ` Linus Torvalds
2009-04-21 18:02                                                                               ` Ingo Molnar
2009-04-21 18:15                                                                               ` Stephen Hemminger
2009-04-21 19:10                                                                                 ` Ingo Molnar
2009-04-21 19:46                                                                                   ` Eric Dumazet
2009-04-22  7:35                                                                                     ` Ingo Molnar
2009-04-22  8:53                                                                                       ` Eric Dumazet
2009-04-22 10:13                                                                                         ` Jarek Poplawski
2009-04-22 11:26                                                                                           ` Ingo Molnar
2009-04-22 11:39                                                                                             ` Jarek Poplawski
2009-04-22 11:18                                                                                         ` Ingo Molnar [this message]
2009-04-22 15:19                                                                                         ` Linus Torvalds
2009-04-22 16:57                                                                                           ` Eric Dumazet
2009-04-22 17:18                                                                                             ` Linus Torvalds
2009-04-22 20:46                                                                                               ` Jarek Poplawski
2009-04-22 17:48                                                                                         ` Ingo Molnar
2009-04-21 21:04                                                                                   ` Stephen Hemminger
2009-04-22  8:00                                                                                     ` Ingo Molnar
2009-04-21 19:39                                                                                 ` Ingo Molnar
2009-04-21 21:39                                                                                   ` [PATCH] netfilter: use per-cpu recursive lock (v13) Stephen Hemminger
2009-04-22  4:17                                                                                     ` Paul E. McKenney
2009-04-22 14:57                                                                                     ` Eric Dumazet
2009-04-22 15:32                                                                                     ` Linus Torvalds
2009-04-24  4:09                                                                                       ` [PATCH] netfilter: use per-CPU recursive lock {XIV} Stephen Hemminger
2009-04-24  4:58                                                                                         ` Eric Dumazet
2009-04-24 15:33                                                                                           ` Patrick McHardy
2009-04-24 16:18                                                                                           ` Stephen Hemminger
2009-04-24 20:43                                                                                             ` Jarek Poplawski
2009-04-25 20:30                                                                                               ` [PATCH] netfilter: iptables no lockdep is needed Stephen Hemminger
2009-04-26  8:18                                                                                                 ` Jarek Poplawski
2009-04-26 18:24                                                                                                 ` [PATCH] netfilter: use per-CPU recursive lock {XV} Eric Dumazet
2009-04-26 18:56                                                                                                   ` Mathieu Desnoyers
2009-04-26 21:57                                                                                                     ` Stephen Hemminger
2009-04-26 22:32                                                                                                       ` Mathieu Desnoyers
2009-04-27 17:44                                                                                                       ` Peter Zijlstra
2009-04-27 18:30                                                                                                         ` [PATCH] netfilter: use per-CPU r**ursive " Stephen Hemminger
2009-04-27 18:54                                                                                                           ` Ingo Molnar
2009-04-27 19:06                                                                                                             ` Stephen Hemminger
2009-04-27 19:46                                                                                                               ` Linus Torvalds
2009-04-27 19:48                                                                                                                 ` Linus Torvalds
2009-04-27 20:36                                                                                                                 ` Evgeniy Polyakov
2009-04-27 20:58                                                                                                                   ` Linus Torvalds
2009-04-27 21:40                                                                                                                     ` Stephen Hemminger
2009-04-27 22:24                                                                                                                       ` Linus Torvalds
2009-04-27 23:01                                                                                                                         ` Linus Torvalds
2009-04-27 23:03                                                                                                                           ` Linus Torvalds
2009-04-28  6:58                                                                                                                             ` Eric Dumazet
2009-04-28 11:53                                                                                                                               ` David Miller
2009-04-28 12:40                                                                                                                                 ` Ingo Molnar
2009-04-28 13:43                                                                                                                                   ` David Miller
2009-04-28 13:52                                                                                                                                     ` Mathieu Desnoyers
2009-04-28 14:37                                                                                                                                       ` David Miller
2009-04-28 14:49                                                                                                                                         ` Mathieu Desnoyers
2009-04-28 15:00                                                                                                                                           ` David Miller
2009-04-28 16:24                                                                                                                                             ` [PATCH] netfilter: revised locking for x_tables Stephen Hemminger
2009-04-28 16:50                                                                                                                                               ` Linus Torvalds
2009-04-28 16:55                                                                                                                                                 ` Linus Torvalds
2009-04-29  5:37                                                                                                                                                   ` David Miller
     [not found]                                                                                                                                                     ` <20090428.223708.168741998.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
2009-04-30  3:26                                                                                                                                                       ` Jeff Chua
     [not found]                                                                                                                                                         ` <b6a2187b0904292026k7d6107a7vcdc761d4149f40aa-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-04-30  3:31                                                                                                                                                           ` David Miller
2009-05-01  8:38                                                                                                                                                     ` [PATCH] netfilter: use likely() in xt_info_rdlock_bh() Eric Dumazet
2009-05-01 16:10                                                                                                                                                       ` David Miller
2009-04-28 15:42                                                                                                                                     ` [PATCH] netfilter: use per-CPU r**ursive lock {XV} Paul E. McKenney
2009-04-28 17:35                                                                                                                                       ` Christoph Lameter
2009-04-28 15:09                                                                                                                               ` Linus Torvalds
2009-04-27 23:32                                                                                                                           ` Linus Torvalds
2009-04-28  7:41                                                                                                                             ` Peter Zijlstra
2009-04-28 14:22                                                                                                                               ` Paul E. McKenney
2009-04-28  7:42                                                                                                                 ` Jan Engelhardt
2009-04-26 19:31                                                                                                   ` [PATCH] netfilter: use per-CPU recursive " Mathieu Desnoyers
2009-04-26 20:55                                                                                                     ` Eric Dumazet
2009-04-26 21:39                                                                                                       ` Mathieu Desnoyers
2009-04-21 18:34                                                                               ` [PATCH] netfilter: use per-cpu recursive lock (v11) Paul E. McKenney
2009-04-21 20:14                                                                                 ` Linus Torvalds
2009-04-20 23:44                                                                           ` [PATCH] netfilter: use per-cpu recursive lock (v10) Paul E. McKenney
2009-04-16  0:02                                                 ` [PATCH] netfilter: use per-cpu spinlock rather than RCU (v3) Linus Torvalds
2009-04-16  6:26                                                 ` Eric Dumazet
2009-04-16 14:33                                                   ` Paul E. McKenney
2009-04-15  3:23                                       ` David Miller
2009-04-14 17:19                               ` [PATCH] netfilter: use per-cpu spinlock rather than RCU Stephen Hemminger
2009-04-11 15:50         ` iptables very slow after commit 784544739a25c30637397ace5489eeb6e15d7d49 Stephen Hemminger
2009-04-11 17:43           ` Paul E. McKenney
2009-04-11 18:57         ` Linus Torvalds
2009-04-12  0:34           ` Paul E. McKenney
2009-04-12  7:23             ` Evgeniy Polyakov
2009-04-12 16:06             ` Stephen Hemminger
2009-04-12 17:30               ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090422111817.GA21190@elte.hu \
    --to=mingo@elte.hu \
    --cc=a.p.zijlstra@chello.nl \
    --cc=benh@kernel.crashing.org \
    --cc=dada1@cosmosbay.com \
    --cc=davem@davemloft.net \
    --cc=jeff.chua.linux@gmail.com \
    --cc=jengelh@medozas.de \
    --cc=kaber@trash.net \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@polymtl.ca \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=paulus@samba.org \
    --cc=r000n@r000n.net \
    --cc=shemminger@vyatta.com \
    --cc=torvalds@linux-foundation.org \
    --cc=zbr@ioremap.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).