From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>,
"David S. Miller" <davem@davemloft.net>,
Florian Westphal <fw@strlen.de>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>,
netdev@vger.kernel.org, Thomas Graf <tgraf@suug.ch>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Cong Wang <amwang@redhat.com>,
Herbert Xu <herbert@gondor.hengli.com.au>
Subject: [net-next PATCH V3-evictor] net: frag evictor, avoid killing warm frag queues
Date: Tue, 04 Dec 2012 14:30:46 +0100 [thread overview]
Message-ID: <20121204133007.20215.52566.stgit@dragon> (raw)
In-Reply-To: <1354319937.20109.285.camel@edumazet-glaptop>
The fragmentation evictor system have a very unfortunate eviction
system for killing fragment, when the system is put under pressure.
If packets are coming in too fast, the evictor code kills "warm"
fragments too quickly. Resulting in close to zero throughput, as
fragments are killed before they have a chance to complete
This is related to the bad interaction with the LRU (Least Recently
Used) list. Under load the LRU list sort-of changes meaning/behavior.
When the LRU head is very new/warm, then the head is most likely the
one with most fragments and the tail (latest used or added element)
with least.
Solved by, introducing a creation "jiffie" timestamp (creation_ts).
If the element is tried evicted in same jiffie, then perform tail drop
on the LRU list instead.
Signed-off-by: Jesper Dangaard Brouer <jbrouer@redhat.com>
---
V2:
- Drop the INET_FRAG_FIRST_IN idea for detecting dropped "head" packets
V3:
- Move the tail drop, from inet_frag_alloc() to inet_frag_evictor()
This will be close to the same semantics, but at a higher cost.
include/net/inet_frag.h | 1 +
net/ipv4/inet_fragment.c | 12 ++++++++++++
2 files changed, 13 insertions(+), 0 deletions(-)
diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
index 32786a0..7b897b2 100644
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -24,6 +24,7 @@ struct inet_frag_queue {
ktime_t stamp;
int len; /* total length of orig datagram */
int meat;
+ u32 creation_ts;/* jiffies when queue was created*/
__u8 last_in; /* first/last segment arrived? */
#define INET_FRAG_COMPLETE 4
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index 4750d2b..d8bf59b 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -178,6 +178,16 @@ int inet_frag_evictor(struct netns_frags *nf, struct inet_frags *f, bool force)
q = list_first_entry(&nf->lru_list,
struct inet_frag_queue, lru_list);
+
+ /* When head of LRU is very new/warm, then the head is
+ * most likely the one with most fragments and the
+ * tail with least, thus drop tail
+ */
+ if (!force && q->creation_ts == (u32) jiffies) {
+ q = list_entry(&nf->lru_list.prev,
+ struct inet_frag_queue, lru_list);
+ }
+
atomic_inc(&q->refcnt);
read_unlock(&f->lock);
@@ -243,11 +253,13 @@ static struct inet_frag_queue *inet_frag_alloc(struct netns_frags *nf,
struct inet_frags *f, void *arg)
{
struct inet_frag_queue *q;
+ // Note: We could also perform the tail drop here
q = kzalloc(f->qsize, GFP_ATOMIC);
if (q == NULL)
return NULL;
+ q->creation_ts = (u32) jiffies;
q->net = nf;
f->constructor(q, arg);
atomic_add(f->qsize, &nf->mem);
next prev parent reply other threads:[~2012-12-04 13:31 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-11-29 16:10 [net-next PATCH V2 0/9] net: fragmentation performance scalability on NUMA/SMP systems Jesper Dangaard Brouer
2012-11-29 16:11 ` [net-next PATCH V2 1/9] net: frag evictor, avoid killing warm frag queues Jesper Dangaard Brouer
2012-11-29 17:44 ` David Miller
2012-11-29 22:17 ` Jesper Dangaard Brouer
2012-11-29 23:01 ` Eric Dumazet
2012-11-30 10:04 ` Jesper Dangaard Brouer
2012-11-30 14:52 ` Eric Dumazet
2012-11-30 15:45 ` Jesper Dangaard Brouer
2012-11-30 16:37 ` Eric Dumazet
2012-11-30 21:37 ` Jesper Dangaard Brouer
2012-11-30 22:25 ` Eric Dumazet
2012-11-30 23:23 ` Jesper Dangaard Brouer
2012-11-30 23:47 ` Stephen Hemminger
2012-12-01 0:03 ` Eric Dumazet
2012-12-01 0:13 ` Stephen Hemminger
2012-11-30 23:58 ` Eric Dumazet
2012-12-04 13:30 ` Jesper Dangaard Brouer [this message]
2012-12-04 14:32 ` [net-next PATCH V3-evictor] net: frag evictor,avoid " David Laight
2012-12-04 14:47 ` [net-next PATCH V3-evictor] net: frag evictor, avoid " Eric Dumazet
2012-12-04 17:51 ` Jesper Dangaard Brouer
2012-12-05 9:24 ` Jesper Dangaard Brouer
2012-12-06 12:26 ` Jesper Dangaard Brouer
2012-12-06 12:32 ` Florian Westphal
2012-12-06 13:29 ` David Laight
2012-12-06 21:38 ` David Miller
2012-12-06 13:55 ` Jesper Dangaard Brouer
2012-12-06 14:47 ` Eric Dumazet
2012-12-06 15:23 ` Jesper Dangaard Brouer
2012-11-29 23:32 ` [net-next PATCH V2 1/9] " Eric Dumazet
2012-11-30 12:01 ` Jesper Dangaard Brouer
2012-11-30 14:57 ` Eric Dumazet
2012-11-29 16:11 ` [net-next PATCH V2 2/9] net: frag cache line adjust inet_frag_queue.net Jesper Dangaard Brouer
2012-11-29 16:12 ` [net-next PATCH V2 3/9] net: frag, move LRU list maintenance outside of rwlock Jesper Dangaard Brouer
2012-11-29 17:43 ` Eric Dumazet
2012-11-29 17:48 ` David Miller
2012-11-29 17:54 ` Eric Dumazet
2012-11-29 18:05 ` David Miller
2012-11-29 18:24 ` Eric Dumazet
2012-11-29 18:31 ` David Miller
2012-11-29 18:33 ` Eric Dumazet
2012-11-29 18:36 ` David Miller
2012-11-29 22:33 ` Jesper Dangaard Brouer
2012-11-29 16:12 ` [net-next PATCH V2 4/9] net: frag helper functions for mem limit tracking Jesper Dangaard Brouer
2012-11-29 16:13 ` [net-next PATCH V2 5/9] net: frag, per CPU resource, mem limit and LRU list accounting Jesper Dangaard Brouer
2012-11-29 17:06 ` Eric Dumazet
2012-11-29 17:31 ` David Miller
2012-12-03 14:02 ` Jesper Dangaard Brouer
2012-12-03 17:25 ` David Miller
2012-11-29 16:14 ` [net-next PATCH V2 6/9] net: frag, implement dynamic percpu alloc of frag_cpu_limit Jesper Dangaard Brouer
2012-11-29 16:15 ` [net-next PATCH V2 7/9] net: frag, move nqueues counter under LRU lock protection Jesper Dangaard Brouer
2012-11-29 16:15 ` [net-next PATCH V2 8/9] net: frag queue locking per hash bucket Jesper Dangaard Brouer
2012-11-29 17:08 ` Eric Dumazet
2012-11-30 12:55 ` Jesper Dangaard Brouer
2012-11-29 16:16 ` [net-next PATCH V2 9/9] net: increase frag queue hash size and cache-line Jesper Dangaard Brouer
2012-11-29 16:39 ` [net-next PATCH V2 9/9] net: increase frag queue hash size andcache-line David Laight
2012-11-29 16:55 ` [net-next PATCH V2 9/9] net: increase frag queue hash size and cache-line Eric Dumazet
2012-11-29 20:53 ` Jesper Dangaard Brouer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20121204133007.20215.52566.stgit@dragon \
--to=brouer@redhat.com \
--cc=amwang@redhat.com \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=fw@strlen.de \
--cc=herbert@gondor.hengli.com.au \
--cc=netdev@vger.kernel.org \
--cc=paulmck@linux.vnet.ibm.com \
--cc=tgraf@suug.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).