From: Eric Dumazet <dada1@cosmosbay.com>
To: Linus Torvalds <torvalds@osdl.org>
Cc: linux-kernel@vger.kernel.org,
"David S. Miller" <davem@davemloft.net>,
Dipankar Sarma <dipankar@in.ibm.com>,
"Paul E. McKenney" <paulmck@us.ibm.com>,
Manfred Spraul <manfred@colorfullife.com>,
netdev@vger.kernel.org
Subject: [PATCH, RFC] RCU : OOM avoidance and lower latency
Date: Fri, 06 Jan 2006 11:17:26 +0100
Message-ID: <43BE43B6.3010105@cosmosbay.com>
In-Reply-To: <Pine.LNX.4.64.0601051727070.3169@g5.osdl.org>
[-- Attachment #1: Type: text/plain, Size: 1192 bytes --]
To avoid some OOMs triggered by a flood of call_rcu() calls, we increased
maxbatch from 10 to 10000 in Linux 2.6.14, and made call_rcu() conditionally
call set_need_resched().
This solution does not solve all the problems and has drawbacks:
1) Using a big maxbatch has a bad impact on latency.
2) A flood of call_rcu_bh() calls can still cause an OOM.
I have some servers that once in a while crash when the IP route cache is
flushed. After raising /proc/sys/net/ipv4/route/secret_interval (so that *no*
flush is done), I got better uptime for these servers. But in some cases I
think the network stack can flood call_rcu_bh(), and a fatal OOM occurs.
In this patch I suggest:
1) Lowering maxbatch to a more reasonable value (as far as latency is
concerned).
2) Guarding each per-CPU RCU queue with a maximum count (10000 for
example). If this limit is reached, free the oldest entry of this queue.
I assume that if a CPU has queued 10000 items in its RCU queue, then the
oldest entry cannot still be in use by another CPU. This might sound like a
violation of RCU rules (I'm not an RCU expert), but it seems quite reasonable.
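As a rough illustration of proposal 2, here is a minimal user-space sketch of a capped callback queue. It is not kernel code: struct cb_head, struct cb_queue, cb_queue_init(), cb_enqueue() and count_free() are hypothetical names made up for this example, and the model keeps a single list, whereas the patch pops the victim from rdp->donelist.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; not the kernel's struct rcu_head/rcu_data. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

struct cb_queue {
	struct cb_head *oldest;	/* first entry that was queued */
	struct cb_head **tail;	/* where the next entry gets linked in */
	long count;		/* entries currently queued */
	long limit;		/* cap, e.g. 10000 in the proposal */
};

static void cb_queue_init(struct cb_queue *q, long limit)
{
	q->oldest = NULL;
	q->tail = &q->oldest;
	q->count = 0;
	q->limit = limit;
}

/*
 * Queue a callback. If the queue now holds more than 'limit' entries,
 * unlink the oldest entry and run its callback immediately.
 * Returns 1 if an entry was evicted, 0 otherwise.
 */
static int cb_enqueue(struct cb_queue *q, struct cb_head *head,
		      void (*func)(struct cb_head *head))
{
	struct cb_head *victim;

	head->func = func;
	head->next = NULL;
	*q->tail = head;
	q->tail = &head->next;

	if (++q->count <= q->limit)
		return 0;

	victim = q->oldest;
	assert(victim != NULL);	/* we just queued an entry, so the list is not empty */
	q->oldest = victim->next;
	if (q->oldest == NULL)
		q->tail = &q->oldest;	/* list became empty again */
	q->count--;
	victim->func(victim);
	return 1;
}

/* Example callback for experiments: only counts invocations. */
static int freed_count;

static void count_free(struct cb_head *head)
{
	(void)head;
	freed_count++;
}
```

With a limit of 2, the third cb_enqueue() call evicts the first entry and the queue length stays at 2. The sketch only models the bookkeeping; it says nothing about whether the evicted entry is still referenced by a reader.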
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
[-- Attachment #2: RCU_OOM.patch --]
[-- Type: text/plain, Size: 2078 bytes --]
--- linux-2.6.15/kernel/rcupdate.c 2006-01-03 04:21:10.000000000 +0100
+++ linux-2.6.15-edum/kernel/rcupdate.c 2006-01-06 11:10:45.000000000 +0100
@@ -71,14 +71,14 @@
/* Fake initialization required by compiler */
static DEFINE_PER_CPU(struct tasklet_struct, rcu_tasklet) = {NULL};
-static int maxbatch = 10000;
+static int maxbatch = 100;
#ifndef __HAVE_ARCH_CMPXCHG
/*
* We use an array of spinlocks for the rcurefs -- similar to ones in sparc
* 32 bit atomic_t implementations, and a hash function similar to that
* for our refcounting needs.
- * Can't help multiprocessors which donot have cmpxchg :(
+ * Can't help multiprocessors which dont have cmpxchg :(
*/
spinlock_t __rcuref_hash[RCUREF_HASH_SIZE] = {
@@ -110,9 +110,17 @@
*rdp->nxttail = head;
rdp->nxttail = &head->next;
- if (unlikely(++rdp->count > 10000))
- set_need_resched();
-
+/*
+ * OOM avoidance : If we queued too many items in this queue,
+ * free the oldest entry
+ */
+ if (unlikely(++rdp->count > 10000)) {
+ rdp->count--;
+ head = rdp->donelist;
+ rdp->donelist = head->next;
+ local_irq_restore(flags);
+ return head->func(head);
+ }
local_irq_restore(flags);
}
@@ -148,12 +156,17 @@
rdp = &__get_cpu_var(rcu_bh_data);
*rdp->nxttail = head;
rdp->nxttail = &head->next;
- rdp->count++;
/*
- * Should we directly call rcu_do_batch() here ?
- * if (unlikely(rdp->count > 10000))
- * rcu_do_batch(rdp);
+ * OOM avoidance : If we queued too many items in this queue,
+ * free the oldest entry
*/
+ if (unlikely(++rdp->count > 10000)) {
+ rdp->count--;
+ head = rdp->donelist;
+ rdp->donelist = head->next;
+ local_irq_restore(flags);
+ return head->func(head);
+ }
local_irq_restore(flags);
}
@@ -209,7 +222,7 @@
static void rcu_do_batch(struct rcu_data *rdp)
{
struct rcu_head *next, *list;
- int count = 0;
+ int count = maxbatch;
list = rdp->donelist;
while (list) {
@@ -217,7 +230,7 @@
list->func(list);
list = next;
rdp->count--;
- if (++count >= maxbatch)
+ if (--count <= 0)
break;
}
if (!rdp->donelist)
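The countdown in the last hunk can be tried outside the kernel. Below is a minimal user-space sketch of a batch-limited drain, assuming a simplified queue: struct done_head, struct done_queue, drain_batch() and note_done() are hypothetical names for this example only, and no locking or interrupt handling is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a list of callbacks that are ready to run. */
struct done_head {
	struct done_head *next;
	void (*func)(struct done_head *head);
};

struct done_queue {
	struct done_head *donelist;	/* callbacks ready to be invoked */
	long count;			/* number of entries on donelist */
};

/*
 * Invoke callbacks from the front of donelist, counting a budget down
 * from maxbatch as the patched rcu_do_batch() does. Entries beyond the
 * budget stay queued for a later pass. Returns the number invoked.
 * Note: as in the patch, at least one callback runs if the list is
 * non-empty, even when maxbatch is 0.
 */
static int drain_batch(struct done_queue *q, int maxbatch)
{
	struct done_head *list = q->donelist;
	struct done_head *next;
	int budget = maxbatch;
	int ran = 0;

	while (list) {
		/* Read the link first: the callback may free the entry. */
		next = list->next;
		q->donelist = next;
		list->func(list);
		list = next;
		assert(q->count > 0);	/* count must match the list length */
		q->count--;
		ran++;
		if (--budget <= 0)
			break;
	}
	return ran;
}

/* Example callback for experiments: only counts invocations. */
static int drained;

static void note_done(struct done_head *head)
{
	(void)head;
	drained++;
}
```

With five queued entries and a maxbatch of 2, the first call runs two callbacks and leaves three queued. A smaller maxbatch therefore bounds the work done per pass, at the cost of needing more passes to empty the list.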