netdev.vger.kernel.org archive mirror
From: Eric Dumazet <dada1@cosmosbay.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [RFC,PATCH] loopback: calls netif_receive_skb() instead of netif_rx()
Date: Tue, 01 Apr 2008 11:19:06 +0200
Message-ID: <47F1FE0A.9090501@cosmosbay.com>
In-Reply-To: <20080331101237.GA12324@elte.hu>

[-- Attachment #1: Type: text/plain, Size: 2186 bytes --]

Ingo Molnar wrote:
> * Eric Dumazet <dada1@cosmosbay.com> wrote:
>
>   
>> Problem is to check available space :
>>
>> It depends on stack growing UP or DOWN, and depends on caller running 
>> on process stack, or softirq stack, or even hardirq stack.
>>     
>
> ok - i wish such threads were on lkml so that everyone, not just the 
> netdev cabal, can read it. It's quite ugly, but if we want to check stack 
> free space i'd suggest for you to put a stack_can_recurse() call into 
> arch/x86/kernel/process.c and offer a default __weak implementation in 
> kernel/fork.c that always returns 0.
>
> the rule on x86 should be something like this: on 4K stacks and 64-bit 
> [which have irqstacks] free stack space can go as low as 25%. On 8K 
> stacks [which don't have irqstacks but nest irqs] it should not go 
> below 50% before falling back to the explicitly queued packet branch.
>
> this way other pieces of kernel code can choose between on-stack 
> fast recursion and explicit iterators. Although i'm not sure i like the 
> whole concept to begin with ...
>
>   

Hi Ingo

I took the time to prepare a patch to implement
arch_stack_can_recurse() as you suggested.

Thank you

[PATCH] x86 : arch_stack_can_recurse() introduction

Some kernel paths would like to choose between fast on-stack recursion
and explicit iterators.

One identified spot is the net loopback driver, where we can avoid
netif_rx() and its slowdown if sufficient stack space is available.

We introduce a generic arch_stack_can_recurse(), which defaults to a weak
function returning 0.
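
To illustrate the weak-default idea outside the kernel, here is a minimal
user-space sketch, assuming a GCC or Clang toolchain that supports
__attribute__((weak)). The helper choose_delivery_path() is hypothetical
and only returns the name of the function a caller would pick; it is not
part of the patch.

```c
/*
 * User-space stand-in for the generic default: a weak definition that
 * always answers "no". An architecture that knows its stack layout can
 * provide a strong definition with the same name in another object
 * file, and the linker will prefer it over this one.
 */
__attribute__((weak)) int arch_stack_can_recurse(void)
{
	return 0;
}

/*
 * Hypothetical caller (illustration only): returns the name of the
 * delivery function that would be chosen.
 */
const char *choose_delivery_path(void)
{
	if (arch_stack_can_recurse())
		return "netif_receive_skb";	/* immediate delivery */
	return "netif_rx";			/* deferred to softirq */
}
```

With only the weak definition linked in, callers always take the queued
netif_rx() branch, so architectures that do not implement the check keep
today's behaviour.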

On x86, we implement the following logic. Recursion is allowed only while
at least this much of the stack is still free:

   32 bits and 4K stacks (separate irq stacks) : at least 25% of stack free
   64 bits, 8K stacks (separate irq stacks)    : at least 25% of stack free
   32 bits and 8K stacks (no irq stacks)       : at least 50% of stack free
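
The arithmetic behind these thresholds can be tried in user space. The
sketch below is only an illustration under stated assumptions: an 8K
stack area aligned on its size, the stack growing down towards a
thread_info block at the bottom, and a made-up thread_info size of 64
bytes. All SKETCH_* names and sketch_can_recurse() are hypothetical. The
underflow guard is an addition of the sketch and is not in the patch.

```c
#include <stdbool.h>

#define SKETCH_THREAD_SIZE	8192UL	/* assumed: 8K stack area */
#define SKETCH_THREAD_INFO_SIZE	64UL	/* assumed: made-up size  */

/*
 * sp    : a stack pointer value inside a THREAD_SIZE-aligned stack area
 * limit : minimum number of free bytes required to allow recursion
 */
bool sketch_can_recurse(unsigned long sp, unsigned long limit)
{
	/* Offset of sp from the bottom of the aligned stack area. */
	unsigned long offset = sp & (SKETCH_THREAD_SIZE - 1);

	/* Guard (sketch only): avoid unsigned underflow below. */
	if (offset < SKETCH_THREAD_INFO_SIZE)
		return false;

	/* Free bytes between the end of thread_info and sp. */
	return offset - SKETCH_THREAD_INFO_SIZE >= limit;
}
```

For example, with sp halfway into the area (offset 4096), 4032 bytes are
free: this passes a 25% limit (2048 bytes) but fails a 50% limit (4096
bytes).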

Example of use in loopback_xmit(), in drivers/net/loopback.c:

if (arch_stack_can_recurse())
    netif_receive_skb(skb); /* immediate delivery to stack */
else
    netif_rx(skb); /* defer to softirq handling */

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>



[-- Attachment #2: can_recurse.patch --]
[-- Type: text/plain, Size: 2282 bytes --]

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 0e613e7..6edc1d3 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -43,3 +43,25 @@ void arch_task_cache_init(void)
 				  __alignof__(union thread_xstate),
 				  SLAB_PANIC, NULL);
 }
+
+
+/*
+ * Used to check if we can recurse without risking stack overflow.
+ * Recursion is allowed while this much of the stack is still free:
+ *   32 bits and 4K stacks (separate irq stacks) : at least 25% of stack
+ *   64 bits, 8K stacks (separate irq stacks)    : at least 25% of stack
+ *   32 bits and 8K stacks (no irq stacks)       : at least 50% of stack
+ */
+#if defined(CONFIG_4KSTACKS) || defined(CONFIG_X86_64)
+# define STACK_RECURSE_LIMIT (THREAD_SIZE/4)
+#else
+# define STACK_RECURSE_LIMIT (THREAD_SIZE/2)
+#endif
+
+int arch_stack_can_recurse(void)
+{
+	unsigned long offset_stack = current_stack_pointer & (THREAD_SIZE - 1);
+	unsigned long avail_stack = offset_stack - sizeof(struct thread_info);
+
+	return avail_stack >= STACK_RECURSE_LIMIT;
+}
diff --git a/include/asm-x86/thread_info_64.h b/include/asm-x86/thread_info_64.h
index f23fefc..9a913c4 100644
--- a/include/asm-x86/thread_info_64.h
+++ b/include/asm-x86/thread_info_64.h
@@ -60,6 +60,9 @@ struct thread_info {
 #define init_thread_info	(init_thread_union.thread_info)
 #define init_stack		(init_thread_union.stack)
 
+/* how to get the current stack pointer from C */
+register unsigned long current_stack_pointer asm("rsp") __used;
+
 static inline struct thread_info *current_thread_info(void)
 {
 	struct thread_info *ti;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ca720f0..445b8da 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1600,6 +1600,8 @@ union thread_union {
 	unsigned long stack[THREAD_SIZE/sizeof(long)];
 };
 
+extern int arch_stack_can_recurse(void);
+
 #ifndef __HAVE_ARCH_KSTACK_END
 static inline int kstack_end(void *addr)
 {
diff --git a/kernel/fork.c b/kernel/fork.c
index a19df75..cd5d1e1 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -136,6 +136,11 @@ void __attribute__((weak)) arch_task_cache_init(void)
 {
 }
 
+int __attribute__((weak)) arch_stack_can_recurse(void)
+{
+	return 0;
+}
+
 void __init fork_init(unsigned long mempages)
 {
 #ifndef __HAVE_ARCH_TASK_STRUCT_ALLOCATOR
