public inbox for netdev@vger.kernel.org
* [RFC PATCH] net, rps: bypass enqueue_to_backlog()
@ 2013-12-18 13:03 Zhi Yong Wu
  2013-12-18 13:56 ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Zhi Yong Wu @ 2013-12-18 13:03 UTC (permalink / raw)
  To: therbert; +Cc: netdev, Zhi Yong Wu

From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>

When the local CPU is also the target CPU that will handle the network
soft irq, the packet should be injected directly into the network stack,
bypassing enqueue_to_backlog(). This can speed up packet processing.

Hi guys,

I checked the first several versions of the RPS patch, which seemed to
have this condition check, but why was it removed later? Am I missing
anything? If so, please correct me, thanks.

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
---
 net/core/dev.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index c482fe8..d29d61f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3693,7 +3693,7 @@ int netif_receive_skb(struct sk_buff *skb)
 
 		cpu = get_rps_cpu(skb->dev, skb, &rflow);
 
-		if (cpu >= 0) {
+		if ((cpu >= 0) && (cpu != raw_smp_processor_id())) {
 			ret = enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
 			rcu_read_unlock();
 			return ret;
-- 
1.7.6.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH] net, rps: bypass enqueue_to_backlog()
  2013-12-18 13:03 [RFC PATCH] net, rps: bypass enqueue_to_backlog() Zhi Yong Wu
@ 2013-12-18 13:56 ` Eric Dumazet
  2013-12-18 14:04   ` Eric Dumazet
  2013-12-18 14:20   ` Zhi Yong Wu
  0 siblings, 2 replies; 6+ messages in thread
From: Eric Dumazet @ 2013-12-18 13:56 UTC (permalink / raw)
  To: Zhi Yong Wu; +Cc: therbert, netdev, Zhi Yong Wu

On Wed, 2013-12-18 at 21:03 +0800, Zhi Yong Wu wrote:
> From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> 
> When the local CPU is also the target CPU that will handle the network
> soft irq, the packet should be injected directly into the network stack,
> bypassing enqueue_to_backlog(). This can speed up packet processing.
> 
> Hi guys,
> 
> I checked the first several versions of the RPS patch, which seemed to
> have this condition check, but why was it removed later? Am I missing
> anything? If so, please correct me, thanks.
> 
> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> ---

Hmm... Could you elaborate ?

At which point do you think this condition was tested or removed ?

I think the idea was to drain NIC RX queues as fast as possible, then:

- Send the IPI to remote cpus.
- Process our queue in parallel with the other cpus processing their own
queues.

If we process our packets through the whole stack first, packets for
other cpus will see a fair amount of extra latency.

That's a tradeoff, I suppose.


* Re: [RFC PATCH] net, rps: bypass enqueue_to_backlog()
  2013-12-18 13:56 ` Eric Dumazet
@ 2013-12-18 14:04   ` Eric Dumazet
  2013-12-18 14:11     ` Zhi Yong Wu
  2013-12-18 14:20   ` Zhi Yong Wu
  1 sibling, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2013-12-18 14:04 UTC (permalink / raw)
  To: Zhi Yong Wu; +Cc: therbert, netdev, Zhi Yong Wu

On Wed, 2013-12-18 at 05:56 -0800, Eric Dumazet wrote:
> On Wed, 2013-12-18 at 21:03 +0800, Zhi Yong Wu wrote:
> > From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> > 
> > When the local CPU is also the target CPU that will handle the network
> > soft irq, the packet should be injected directly into the network stack,
> > bypassing enqueue_to_backlog(). This can speed up packet processing.
> > 
> > Hi guys,
> > 
> > I checked the first several versions of the RPS patch, which seemed to
> > have this condition check, but why was it removed later? Am I missing
> > anything? If so, please correct me, thanks.
> > 
> > Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
> > ---
> 
> Hmm... Could you elaborate ?
> 
> At which point do you think this condition was tested or removed ?
> 
> I think the idea was to drain NIC RX queues as fast as possible, then:
> 
> - Send the IPI to remote cpus.
> - Process our queue in parallel with the other cpus processing their own
> queues.
> 
> If we process our packets through the whole stack first, packets for
> other cpus will see a fair amount of extra latency.
> 
> That's a tradeoff, I suppose.
> 

Also note that going through the backlog permits things like

99bbc70741903c0  ("rps: selective flow shedding during softnet
overflow")


* Re: [RFC PATCH] net, rps: bypass enqueue_to_backlog()
  2013-12-18 14:04   ` Eric Dumazet
@ 2013-12-18 14:11     ` Zhi Yong Wu
  2013-12-18 14:26       ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Zhi Yong Wu @ 2013-12-18 14:11 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Tom Herbert, Linux Netdev List, Zhi Yong Wu

On Wed, Dec 18, 2013 at 10:04 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wed, 2013-12-18 at 05:56 -0800, Eric Dumazet wrote:
>> On Wed, 2013-12-18 at 21:03 +0800, Zhi Yong Wu wrote:
>> > From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
>> >
>> > When the local CPU is also the target CPU that will handle the network
>> > soft irq, the packet should be injected directly into the network stack,
>> > bypassing enqueue_to_backlog(). This can speed up packet processing.
>> >
>> > Hi guys,
>> >
>> > I checked the first several versions of the RPS patch, which seemed to
>> > have this condition check, but why was it removed later? Am I missing
>> > anything? If so, please correct me, thanks.
>> >
>> > Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
>> > ---
>>
>> Hmm... Could you elaborate ?
>>
>> At which point do you think this condition was tested or removed ?
>>
>> I think the idea was to drain NIC RX queues as fast as possible, then:
>>
>> - Send the IPI to remote cpus.
>> - Process our queue in parallel with the other cpus processing their own
>> queues.
>>
>> If we process our packets through the whole stack first, packets for
>> other cpus will see a fair amount of extra latency.
>>
>> That's a tradeoff, I suppose.
>>
>
> Also note that going through the backlog permits things like
>
> 99bbc70741903c0  ("rps: selective flow shedding during softnet
> overflow")
It makes sense, but if cpu < 0, some packets seem to bypass the flow
shedding logic. That is, the flow limit will not be so accurate, right?




-- 
Regards,

Zhi Yong Wu


* Re: [RFC PATCH] net, rps: bypass enqueue_to_backlog()
  2013-12-18 13:56 ` Eric Dumazet
  2013-12-18 14:04   ` Eric Dumazet
@ 2013-12-18 14:20   ` Zhi Yong Wu
  1 sibling, 0 replies; 6+ messages in thread
From: Zhi Yong Wu @ 2013-12-18 14:20 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Tom Herbert, Linux Netdev List, Zhi Yong Wu

Thanks for your explanation.

By the way, please ignore this patch, thanks.

On Wed, Dec 18, 2013 at 9:56 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wed, 2013-12-18 at 21:03 +0800, Zhi Yong Wu wrote:
>> From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
>>
>> When the local CPU is also the target CPU that will handle the network
>> soft irq, the packet should be injected directly into the network stack,
>> bypassing enqueue_to_backlog(). This can speed up packet processing.
>>
>> Hi guys,
>>
>> I checked the first several versions of the RPS patch, which seemed to
>> have this condition check, but why was it removed later? Am I missing
>> anything? If so, please correct me, thanks.
>>
>> Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
>> ---
>
> Hmm... Could you elaborate ?
>
> At which point do you think this condition was tested or removed ?
>
> I think the idea was to drain NIC RX queues as fast as possible, then:
>
> - Send the IPI to remote cpus.
> - Process our queue in parallel with the other cpus processing their own
> queues.
>
> If we process our packets through the whole stack first, packets for
> other cpus will see a fair amount of extra latency.
>
> That's a tradeoff, I suppose.



-- 
Regards,

Zhi Yong Wu


* Re: [RFC PATCH] net, rps: bypass enqueue_to_backlog()
  2013-12-18 14:11     ` Zhi Yong Wu
@ 2013-12-18 14:26       ` Eric Dumazet
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2013-12-18 14:26 UTC (permalink / raw)
  To: Zhi Yong Wu; +Cc: Tom Herbert, Linux Netdev List, Zhi Yong Wu

On Wed, 2013-12-18 at 22:11 +0800, Zhi Yong Wu wrote:

> It makes sense, but if cpu < 0, some packets seem to bypass the flow
> shedding logic. That is, the flow limit will not be so accurate, right?

Not sure what you mean.

cpu == -1 only happens on very rare occasions, like cpu hot unplug.

In netif_receive_skb(), we then just process the packet normally.


end of thread, other threads:[~2013-12-18 14:26 UTC | newest]

