From: Eric Dumazet <eric.dumazet@gmail.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>,
Florian Westphal <fw@strlen.de>,
netdev@vger.kernel.org, Thomas Graf <tgraf@suug.ch>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Cong Wang <amwang@redhat.com>,
Herbert Xu <herbert@gondor.hengli.com.au>
Subject: Re: [net-next PATCH V3-evictor] net: frag evictor, avoid killing warm frag queues
Date: Tue, 04 Dec 2012 06:47:27 -0800
Message-ID: <1354632447.1388.150.camel@edumazet-glaptop>
In-Reply-To: <20121204133007.20215.52566.stgit@dragon>

On Tue, 2012-12-04 at 14:30 +0100, Jesper Dangaard Brouer wrote:
> The fragmentation evictor system have a very unfortunate eviction
> system for killing fragment, when the system is put under pressure.
>
> If packets are coming in too fast, the evictor code kills "warm"
> fragments too quickly. Resulting in close to zero throughput, as
> fragments are killed before they have a chance to complete
>
> This is related to the bad interaction with the LRU (Least Recently
> Used) list. Under load the LRU list sort-of changes meaning/behavior.
> When the LRU head is very new/warm, then the head is most likely the
> one with most fragments and the tail (latest used or added element)
> with least.
>
> Solved by, introducing a creation "jiffie" timestamp (creation_ts).
> If the element is tried evicted in same jiffie, then perform tail drop
> on the LRU list instead.
>
> Signed-off-by: Jesper Dangaard Brouer <jbrouer@redhat.com>

This would only 'work' if a packet can be fully reassembled within one
jiffy.

For 64KB packets, this means a 100Mb link won't be able to deliver a
reassembled packet under IP frags load if HZ=1000 (one jiffy is then
1 ms).
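
To put rough numbers on this, here is a small sketch of the arithmetic.
It is only an estimate: the 65535-byte datagram size, the idle link, and
the absence of any fragment-header or Ethernet framing overhead are
simplifying assumptions, and the function names are made up for the
example.

```python
# Rough wire-time estimate for one maximum-size IP datagram.
# Assumptions (not from the patch): 65535-byte datagram, idle link,
# no per-fragment header or Ethernet framing overhead.

def wire_time_ms(datagram_bytes: int, link_bits_per_sec: float) -> float:
    """Time to serialize datagram_bytes onto the link, in milliseconds."""
    return datagram_bytes * 8 / link_bits_per_sec * 1000.0


def jiffy_ms(hz: int) -> float:
    """Duration of one jiffy in milliseconds for a given HZ."""
    return 1000.0 / hz


MAX_DATAGRAM = 65535   # largest IPv4 datagram, in bytes
LINK_100MBIT = 100e6   # 100 Mbit/s

t = wire_time_ms(MAX_DATAGRAM, LINK_100MBIT)
j = jiffy_ms(1000)
print(f"wire time {t:.2f} ms, jiffy {j:.2f} ms")
```

Even before any overhead, the datagram needs a bit more than 5 ms on the
wire, so every one of its frag queues is older than one jiffy by the
time the last fragment can arrive.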

The LRU's goal is to be able to select the oldest inet_frag_queue,
because in typical networks packet losses really do happen, and this is
why some packets won't complete their reassembly. Those queues will
naturally be found at the LRU head, and they are probably very fat (for
example, only a single packet was lost for that inet_frag_queue).

Choosing the most recent inet_frag_queue is exactly the opposite
strategy. We pay the huge cost of maintaining a central LRU, and then
we misuse it.

As long as an inet_frag_queue receives new fragments and is moved to the
LRU tail, it's a candidate for being kept, not a candidate for being
evicted.

Only when an inet_frag_queue is the oldest one does it become a
candidate for eviction.
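
The policy described above can be modelled in a few lines. This is a toy
sketch, not the kernel code: the FragLRU class, its byte accounting and
the tiny thresholds are all hypothetical, chosen only to show that a
stalled queue drifts to the head while an active one stays at the tail.

```python
from collections import OrderedDict


class FragLRU:
    """Toy model of the LRU policy: head = oldest, tail = most recent."""

    def __init__(self, high_thresh: int, low_thresh: int):
        self.high_thresh = high_thresh
        self.low_thresh = low_thresh
        self.mem = 0
        self.queues = OrderedDict()  # key -> bytes held by that queue

    def add_fragment(self, key, size: int):
        # A queue that receives a fragment moves to the tail: keep it.
        self.queues[key] = self.queues.get(key, 0) + size
        self.queues.move_to_end(key)
        self.mem += size
        if self.mem > self.high_thresh:
            return self.evict()
        return []

    def evict(self):
        # Drop from the head (oldest queue) until mem <= low_thresh.
        evicted = []
        while self.mem > self.low_thresh and self.queues:
            key, size = self.queues.popitem(last=False)
            self.mem -= size
            evicted.append(key)
        return evicted


lru = FragLRU(high_thresh=10, low_thresh=8)
lru.add_fragment("stalled", 4)        # lost a fragment, never touched again
lru.add_fragment("active", 3)
lru.add_fragment("active", 3)         # still receiving, moves to the tail
evicted = lru.add_fragment("new", 2)  # pushes mem over high_thresh
print(evicted, list(lru.queues))
```

In this run only the "stalled" queue is dropped; "active" and "new" both
survive because they were touched more recently.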

I think you are trying to solve a configuration/tuning problem by
changing a valid strategy.

What's wrong with admitting that the high_thresh/low_thresh default
values should be updated, now that some people apparently want to use IP
fragments in production?

Let's say we allow 1% of memory to be used for frags, instead of the
current 256 KB limit, which was chosen decades ago.
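
For a sense of scale, a back-of-the-envelope sketch follows. The 8 GiB
host is a hypothetical example, and counting only 64 KiB of payload per
datagram is an assumption: real accounting also charges per-fragment
buffer overhead, so the true numbers are lower.

```python
# How many worst-case datagrams fit under a given frag memory budget?
# Assumptions: 64 KiB of payload per datagram, no per-buffer overhead,
# and a hypothetical host with 8 GiB of RAM for the 1% case.

KIB = 1024
GIB = 1024 ** 3
DATAGRAM = 64 * KIB


def datagrams_in_flight(budget_bytes: int, datagram_bytes: int = DATAGRAM) -> int:
    """Number of full-size datagrams the budget can hold at once."""
    return budget_bytes // datagram_bytes


current_limit = 256 * KIB
one_percent = (8 * GIB) // 100

print(datagrams_in_flight(current_limit))  # 4
print(datagrams_in_flight(one_percent))    # 1310
```

With the 256 KB limit, four concurrent 64KB reassemblies are already
enough to reach the threshold and trigger the evictor.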

Only in very severe DoS attacks would the age of the LRU head (now minus
'creation_ts') possibly be <= 1ms. And under severe DoS attacks, I am
afraid there is nothing we can do.

(We could eventually avoid the LRU hassle and choose a random drop
strategy instead.)
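
What a random drop could look like, as a sketch only: the evict_random
helper and the plain dict of queues are hypothetical, and nothing like
this is part of the patch under discussion. The point is that no ordered
list has to be maintained on every fragment.

```python
import random


def evict_random(queues: dict, mem: int, low_thresh: int, rng: random.Random):
    """Drop randomly chosen queues until mem <= low_thresh.

    queues maps a queue key to the bytes it holds and is modified in
    place. Returns (evicted_keys, new_mem).
    """
    evicted = []
    keys = list(queues)
    rng.shuffle(keys)  # random victim order, no LRU bookkeeping
    for key in keys:
        if mem <= low_thresh:
            break
        mem -= queues.pop(key)
        evicted.append(key)
    return evicted, mem


queues = {"a": 4, "b": 3, "c": 5}
evicted, mem = evict_random(queues, mem=12, low_thresh=6, rng=random.Random(0))
print(evicted, mem, queues)
```

The trade-off is that a random victim may be a queue that was about to
complete, which the LRU head choice is meant to avoid.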

high_thresh/low_thresh should be changed from 'int' to 'long' as well,
so that a 64-bit host could use more than 2GB for frag storage.
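
A quick check of where a 1% budget stops fitting in a 32-bit signed
'int'. The RAM sizes are arbitrary examples and fits_in_int is a name
made up for this sketch.

```python
# Where does 1% of RAM overflow a 32-bit signed int threshold?
# The RAM sizes below are arbitrary examples.

INT_MAX = 2**31 - 1  # largest value of a 32-bit signed int
GIB = 1024**3


def one_percent(ram_bytes: int) -> int:
    return ram_bytes // 100


def fits_in_int(value: int) -> bool:
    return value <= INT_MAX


for ram_gib in (8, 64, 256):
    budget = one_percent(ram_gib * GIB)
    print(ram_gib, budget, fits_in_int(budget))
```

With these examples the 8 GiB and 64 GiB budgets still fit, while 1% of
256 GiB does not.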