From: Eric Dumazet <dada1@cosmosbay.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [RFC,PATCH] loopback: calls netif_receive_skb() instead of netif_rx()
Date: Tue, 01 Apr 2008 11:19:06 +0200
Message-ID: <47F1FE0A.9090501@cosmosbay.com>
In-Reply-To: <20080331101237.GA12324@elte.hu>
[-- Attachment #1: Type: text/plain, Size: 2186 bytes --]
Ingo Molnar wrote:
> * Eric Dumazet <dada1@cosmosbay.com> wrote:
>
>
>> Problem is to check available space :
>>
>> It depends on stack growing UP or DOWN, and depends on caller running
>> on process stack, or softirq stack, or even hardirq stack.
>>
>
> ok - i wish such threads were on lkml so that everyone not just the
> netdev kabal can read it. It's quite ugly, but if we want to check stack
> free space i'd suggest for you to put a stack_can_recurse() call into
> arch/x86/kernel/process.c and offer a default __weak implementation in
> kernel/fork.c that always returns 0.
>
> the rule on x86 should be something like this: on 4K stacks and 64-bit
> [which have irqstacks] free stack space can go as low as 25%. On 8K
> stacks [which doesnt have irqstacks but nests irqs] it should not go
> below 50% before falling back to the explicitly queued packet branch.
>
> this way other pieces of kernel code can choose between on-stack
> fast recursion and explicit iterators. Although i'm not sure i like the
> whole concept to begin with ...
>
>
Hi Ingo
I took the time to prepare a patch to implement
arch_stack_can_recurse() as you suggested.
Thank you
[PATCH] x86 : arch_stack_can_recurse() introduction
Some paths in the kernel would like to choose between on-stack fast recursion
and explicit iterators.
One identified spot is in the net loopback driver, where we can avoid
netif_rx() and its slowdown if sufficient stack space is available.
We introduce a generic arch_stack_can_recurse(), which defaults to a weak
function returning 0.
On x86, we implement the following logic (recursion is allowed while at
least this fraction of the stack is still free) :
32 bits and 4K stacks (separate irq stacks) : at least 25% of stack free
64 bits, 8K stacks (separate irq stacks) : at least 25% of stack free
32 bits and 8K stacks (no irq stacks) : at least 50% of stack free
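The rules above can be checked by hand with a small user-space model. This is only a sketch under stated assumptions: stack_can_recurse_model() and its parameters are hypothetical names, the thread_info size is passed in as an assumed value, and the stack pointer is an argument rather than the live register the kernel code reads. It assumes a power-of-two sized, size-aligned stack that grows down with thread_info at its base, as on x86.

```c
#include <stdbool.h>

/*
 * User-space model of the free-space check (hypothetical helper, not
 * kernel code).
 *   sp          : stack pointer value to test
 *   thread_size : stack size, must be a power of two (e.g. 4096 or 8192)
 *   ti_size     : assumed sizeof(struct thread_info) at the stack base
 *   divisor     : 4 for the 25% rule, 2 for the 50% rule
 */
static bool stack_can_recurse_model(unsigned long sp,
                                    unsigned long thread_size,
                                    unsigned long ti_size,
                                    unsigned long divisor)
{
    /* Offset of sp above the base of the size-aligned stack area. */
    unsigned long offset = sp & (thread_size - 1);

    /* Avoid unsigned underflow if sp points inside thread_info. */
    if (offset < ti_size)
        return false;

    /* The stack grows down: bytes between thread_info and sp are free. */
    return (offset - ti_size) >= thread_size / divisor;
}
```

For example, with an 8K stack based at 0x10000 and an assumed 64-byte thread_info, a stack pointer 4096 bytes above the base passes the 25% rule (4032 >= 2048), while one 1024 bytes above the base does not (960 < 2048).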
Example of use in drivers/net/loopback.c, function loopback_xmit() :

	if (arch_stack_can_recurse())
		netif_receive_skb(skb); /* immediate delivery to stack */
	else
		netif_rx(skb); /* defer to softirq handling */
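The choice between the two branches can be modeled outside the kernel. The sketch below is a hypothetical user-space analogue, not the loopback driver: struct model_dev, model_receive(), model_defer(), model_xmit() and model_softirq() are invented names, the can_recurse flag stands in for arch_stack_can_recurse(), and a fixed-size array stands in for the softirq backlog.

```c
#include <stdbool.h>
#include <stddef.h>

#define BACKLOG_MAX 16

struct model_dev {
    bool   can_recurse;          /* stands in for arch_stack_can_recurse() */
    int    backlog[BACKLOG_MAX]; /* stands in for the softirq backlog */
    size_t backlog_len;
    int    delivered;            /* packets handed to the "stack" so far */
};

/* Immediate delivery, analogous to netif_receive_skb(). */
static void model_receive(struct model_dev *dev, int pkt)
{
    (void)pkt;
    dev->delivered++;
}

/* Deferred delivery, analogous to netif_rx(); false means dropped. */
static bool model_defer(struct model_dev *dev, int pkt)
{
    if (dev->backlog_len == BACKLOG_MAX)
        return false;
    dev->backlog[dev->backlog_len++] = pkt;
    return true;
}

/* Fast path when recursion is safe, explicit queue otherwise. */
static bool model_xmit(struct model_dev *dev, int pkt)
{
    if (dev->can_recurse) {
        model_receive(dev, pkt);
        return true;
    }
    return model_defer(dev, pkt);
}

/* Drain the backlog later, as the softirq handler would. */
static void model_softirq(struct model_dev *dev)
{
    for (size_t i = 0; i < dev->backlog_len; i++)
        model_receive(dev, dev->backlog[i]);
    dev->backlog_len = 0;
}
```

The deferred branch trades latency for a bounded stack depth: the packet is only counted as delivered once model_softirq() runs.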
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
[-- Attachment #2: can_recurse.patch --]
[-- Type: text/plain, Size: 2282 bytes --]
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 0e613e7..6edc1d3 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -43,3 +43,25 @@ void arch_task_cache_init(void)
__alignof__(union thread_xstate),
SLAB_PANIC, NULL);
}
+
+
+/*
+ * Used to check if we can recurse without risking stack overflow
+ * Recursion is allowed while at least this much of the stack is free :
+ * 32 bits and 4K stacks (separate irq stacks) : 25% of stack
+ * 64 bits, 8K stacks (separate irq stacks) : 25% of stack
+ * 32 bits and 8K stacks (no irq stacks) : 50% of stack
+ */
+#if defined(CONFIG_4KSTACKS) || defined(CONFIG_X86_64)
+# define STACK_RECURSE_LIMIT (THREAD_SIZE/4)
+#else
+# define STACK_RECURSE_LIMIT (THREAD_SIZE/2)
+#endif
+
+int arch_stack_can_recurse(void)
+{
+	unsigned long offset_stack = current_stack_pointer & (THREAD_SIZE - 1);
+	unsigned long avail_stack = offset_stack - sizeof(struct thread_info);
+
+	return avail_stack >= STACK_RECURSE_LIMIT;
+}
diff --git a/include/asm-x86/thread_info_64.h b/include/asm-x86/thread_info_64.h
index f23fefc..9a913c4 100644
--- a/include/asm-x86/thread_info_64.h
+++ b/include/asm-x86/thread_info_64.h
@@ -60,6 +60,9 @@ struct thread_info {
#define init_thread_info (init_thread_union.thread_info)
#define init_stack (init_thread_union.stack)
+/* how to get the current stack pointer from C */
+register unsigned long current_stack_pointer asm("rsp") __used;
+
static inline struct thread_info *current_thread_info(void)
{
struct thread_info *ti;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ca720f0..445b8da 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1600,6 +1600,8 @@ union thread_union {
unsigned long stack[THREAD_SIZE/sizeof(long)];
};
+extern int arch_stack_can_recurse(void);
+
#ifndef __HAVE_ARCH_KSTACK_END
static inline int kstack_end(void *addr)
{
diff --git a/kernel/fork.c b/kernel/fork.c
index a19df75..cd5d1e1 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -136,6 +136,11 @@ void __attribute__((weak)) arch_task_cache_init(void)
{
}
+int __attribute__((weak)) arch_stack_can_recurse(void)
+{
+	return 0;
+}
+
void __init fork_init(unsigned long mempages)
{
#ifndef __HAVE_ARCH_TASK_STRUCT_ALLOCATOR