public inbox for linux-kernel@vger.kernel.org
* Re: [Lse-tech] Re: (RFC): SKB Initialization
@ 2002-08-25 16:17 jamal
  2002-08-25 22:51 ` David S. Miller
  0 siblings, 1 reply; 26+ messages in thread
From: jamal @ 2002-08-25 16:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: Mala Anand, netdev, Robert Olsson



Mala,

Could you please at least cc netdev on networking-related issues?
It says so in the kernel FAQ.
I swore back around '95 to join lk only when Linux gets an IDE maintainer
who is not insane. Hasn't happened yet.

Can you repeat your tests with the hotlist turned off (i.e. set to 0)?
Also if you would be doing tests on NAPI please either copy us or netdev;
it is not nice to read weeks after you post.

Also, Robert and I did a few tests and we found that skb recycling (based
on a patch from Robert a few years back) was in fact giving performance
improvements of up to 15% over regular slab.
Did you test with that patch for the e1000 he pointed you at?
I repeated the tests (around June/July) with the tulip at input rates of
a few 100K packets/sec and noticed an improvement over regular NAPI of
about 10%. There's one bug on the tulip which we are chasing that
might be related to the tulip's alignment requirements.

Freeing an skb on the same CPU that allocated it comes for free with
the e1000 NAPI driver style, but not with the tulip NAPI driver, where a
transmit interrupt might happen on a different CPU. The skb recycler patch
only recycles if allocation and freeing happen on the same CPU;
otherwise we let the slab take the hit. On the tulip this happens about
50% of the time.

cheers,
jamal
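
The recycling rule described above (reuse an skb only when it is freed on
the CPU that allocated it, otherwise fall back to the slab) can be sketched
as a toy model. This is a minimal Python simulation, not kernel code and not
Robert's patch: the `RecyclePool` class, the pool limit of 64 and the free
patterns are assumptions made up for illustration. The alternating pattern
stands in for the roughly 50% cross-CPU frees reported for the tulip.

```python
from collections import deque


class RecyclePool:
    """Toy per-CPU recycle pool (hypothetical model, not kernel code)."""

    def __init__(self, ncpus, limit):
        self.pools = [deque() for _ in range(ncpus)]
        self.limit = limit      # assumed cap on buffers kept per CPU
        self.reused = 0         # allocations served from a recycle pool
        self.slab_allocs = 0    # allocations that fell through to the "slab"
        self.recycled = 0       # frees that went back into a recycle pool
        self.slab_frees = 0     # frees that fell through to the "slab"

    def alloc(self, cpu):
        pool = self.pools[cpu]
        if pool:
            buf = pool.pop()
            self.reused += 1
        else:
            buf = {}
            self.slab_allocs += 1
        buf["alloc_cpu"] = cpu
        return buf

    def free(self, buf, cpu):
        pool = self.pools[cpu]
        # Recycle only when the freeing CPU is the allocating CPU.
        if buf["alloc_cpu"] == cpu and len(pool) < self.limit:
            pool.append(buf)
            self.recycled += 1
        else:
            self.slab_frees += 1


def run(free_cpu_pattern, npackets=1000):
    """Allocate every buffer on CPU 0 and free it on the CPU given by the pattern."""
    pool = RecyclePool(ncpus=2, limit=64)
    for i in range(npackets):
        buf = pool.alloc(cpu=0)
        pool.free(buf, cpu=free_cpu_pattern[i % len(free_cpu_pattern)])
    return pool


same_cpu = run([0])      # every free on the allocating CPU (e1000 NAPI style)
mixed = run([0, 1])      # every other free on a different CPU (tulip-like)

for name, p in (("same_cpu", same_cpu), ("mixed", mixed)):
    print(name, "recycled:", p.recycled, "slab frees:", p.slab_frees,
          "slab allocs:", p.slab_allocs)
```

In this model the same-CPU pattern recycles every buffer, while the
alternating pattern sends half of the frees, and therefore half of the
allocations, back to the slab.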


* Re: [Lse-tech] Re: (RFC): SKB Initialization
@ 2002-09-03  3:47 Mala Anand
  0 siblings, 0 replies; 26+ messages in thread
From: Mala Anand @ 2002-09-03  3:47 UTC (permalink / raw)
  To: jamal; +Cc: Bill Hartner, davem, linux-kernel, Mala Anand, netdev,
	Robert Olsson



>On Tue, 27 Aug 2002, Mala Anand wrote:

>> SPECweb99 profile shows that __kfree_skb is in the top 5 hot routines.
>> We will test the skb recycle patch on SPECweb99 and add skbinit patch
>> to that and see how it helps.  What I understand is that the skb recycle
>> patch does not attempt to recycle if the skbs are allocated on one CPU
>> and freed on another CPU. Is that right? If so, skbinit patch will help
>> those cases.

>Yes it will. Not significantly, is my current thinking; i.e. I wouldn't
>write my mother to tell her about it.

I have not looked at Robert's recycle skb patch yet. I couldn't
find it in the link he sent me so I don't know how it works. However,
I thought about it a little more and realized that even when you
recycle the skbs, they need to be initialized (cleaned up). I don't
understand how the recycle skb patch can avoid calling constructors and
destructors for the skb.  The skbs are given back to the driver instead
of being freed to the skb hot list or to the slab. That does not eliminate
the part of kfree_skb which releases the dst, initializes part of the skb
and executes the destructor. Tell me if I am wrong, but wouldn't skipping
that break the code? So I do think that the recycle skb patch will not
diminish the benefits of the skb init patch.


Regards,
    Mala


   Mala Anand
   IBM Linux Technology Center - Kernel Performance
   E-mail:manand@us.ibm.com
   http://www-124.ibm.com/developerworks/opensource/linuxperf
   http://www-124.ibm.com/developerworks/projects/linuxperf
   Phone:838-8088; Tie-line:678-8088






* Re: [Lse-tech] Re: (RFC): SKB Initialization
@ 2002-08-27 13:18 Mala Anand
  2002-08-27 15:49 ` jamal
  0 siblings, 1 reply; 26+ messages in thread
From: Mala Anand @ 2002-08-27 13:18 UTC (permalink / raw)
  To: jamal; +Cc: Bill Hartner, davem, linux-kernel, netdev, Robert Olsson



Jamal wrote ..
>On Mon, 26 Aug 2002, Mala Anand wrote:

>> Troy Wilson (who works with me) posted SPECweb99 results using my
>> skbinit patch to lkml on Friday:
>>  http://www.uwsg.iu.edu/hypermail/linux/kernel/0208.2/1470.html
>> I know you don't subscribe to lkml. Have you seen these results?
>> On a NUMA machine it showed around 3% improvement using SPECweb99.

>The posting you pointed to says 1% - not that it matters. It becomes more
>insignificant when skb recycling comes into play, mostly because the alloc
>and freeing of skbs doesn't really show up as a hotlist item within
>the profile.
>I am not saying it is totally useless -- anything that will save a few
>cycles is good;

SPECweb99 profile shows that __kfree_skb is in the top 5 hot routines. We
will test the skb recycle patch on SPECweb99 and add skbinit patch
to that and see how it helps.  What I understand is that the skb recycle
patch does not attempt to recycle if the skbs are allocated on one CPU
and freed on another CPU. Is that right? If so, skbinit patch will help
those cases. I think this patch is pretty safe and I anticipate greater
gains on NUMA systems.

BTW the 3% gain on NUMA that I reported earlier was measured at another
IBM site, and it turned out that the machine is not a NUMA machine. It is
also an 8-way SMP machine; however, those are non-compliant SPECweb99
runs, so I won't be able to use those results.

The alloc and free routines are not hot in netperf3 profiles. However,
I am seeing some gains there as well, though not significant ones. I will
post netperf3 results with the skbinit patch later.



Regards,
    Mala


* Re: [Lse-tech] Re: (RFC): SKB Initialization
@ 2002-08-27  2:53 Mala Anand
  0 siblings, 0 replies; 26+ messages in thread
From: Mala Anand @ 2002-08-27  2:53 UTC (permalink / raw)
  To: Robert Olsson
  Cc: Bill Hartner, davem, jamal, linux-kernel, Mala Anand, netdev,
	Robert Olsson


Robert Olsson wrote..
 >In slab terms you moved part of the destructor to the constructor
 >but the main problem is still there. The skb entered the "wrong" CPU
 >so to be "reused from the slab again" the work has to be done regardless
 >if it's in the constructor or destructor.
That is true on a uniprocessor, but on SMP it matters on which CPU the
initialization happens, because doing it on a different CPU affects
performance due to cache effects.

The problem of object (skb) allocation, usage and deallocation occurring
on multiple CPUs needs to be addressed separately. This patch does not
attempt to address that.

 >Eventually if we accept some cache misses a skb could possibly be
 >re-routed to the proper slab/CPU for this we would need some skb coloring.
You still can do this. I don't see the skbinit patch hindering this.

Regards,
    Mala


* Re: [Lse-tech] Re: (RFC): SKB Initialization
@ 2002-08-26 13:04 Mala Anand
  2002-08-26 19:28 ` Robert Olsson
  2002-08-27 10:17 ` jamal
  0 siblings, 2 replies; 26+ messages in thread
From: Mala Anand @ 2002-08-26 13:04 UTC (permalink / raw)
  To: jamal, davem; +Cc: netdev, Robert Olsson, Bill Hartner, linux-kernel

>>I think the skbinit patch and recycling skbs are mutually exclusive.

>I would say they are more orthogonal than mutually exclusive.
>Although you still need to prove that relocating the code actually helps in
>real life. On paper it looks good.
Troy Wilson (who works with me) posted SPECweb99 results using my
skbinit patch to lkml on Friday:
 http://www.uwsg.iu.edu/hypermail/linux/kernel/0208.2/1470.html
I know you don't subscribe to lkml. Have you seen these results?
On a NUMA machine it showed around 3% improvement using SPECweb99.


Regards,
    Mala


* Re: [Lse-tech] Re: (RFC): SKB Initialization
@ 2002-08-25 20:12 Mala Anand
  2002-08-26  1:02 ` jamal
  0 siblings, 1 reply; 26+ messages in thread
From: Mala Anand @ 2002-08-25 20:12 UTC (permalink / raw)
  To: jamal; +Cc: linux-kernel, Mala Anand, netdev, Robert Olsson


Jamal wrote ..

>Could you please at least cc netdev on networking-related issues?
>It says so in the kernel FAQ.
>I swore back around '95 to join lk only when Linux gets an IDE maintainer
>who is not insane. Hasn't happened yet.

 Yes I will; it was a mistake on my part.

>Can you repeat your tests with the hotlist turned off (i.e. set to 0)?

 Even if I turn the hot list off, the slab allocator has a per-CPU array
of objects. In this case it keeps 60 objects by default and the hot list
keeps 126 objects. So there may be a difference. I will try this.

 This skb init work is the result of my probing into the slab cache work.
Read my posting on slab cache:
http://marc.theaimsgroup.com/?l=linux-kernel&m=102773718023056&w=2
This work triggered the skb init patch. To quantify the effect of bouncing
the objects between CPUs, I chose the skb as the object to measure. And it
turns out that the limited per-CPU array is not the culprit in this case;
what causes the bouncing of objects between CPUs is that the objects are
allocated on one CPU and freed on another.

>Also if you would be doing tests on NAPI please either copy us or netdev;
>it is not nice to read weeks after you post.

 Yes I will.

>Also, Robert and I did a few tests and we found that skb recycling (based
>on a patch from Robert a few years back) was in fact giving performance
>improvements of up to 15% over regular slab.
>Did you test with that patch for the e1000 he pointed you at?
>I repeated the tests (around June/July) with the tulip at input rates of
>a few 100K packets/sec and noticed an improvement over regular NAPI of
>about 10%. There's one bug on the tulip which we are chasing that
>might be related to the tulip's alignment requirements.

Yes, I got the patch from Robert and I am planning on testing it. My
understanding is that skbs are recycled in other operating systems as well
to improve performance. It particularly helps on architectures where PCI
mapping is expensive, because when skbs are recycled, remapping is
eliminated. I think the skbinit patch and recycling skbs are mutually
exclusive. Recycling skbs will reduce the number of times we hit alloc_skb
and __kfree_skb.

>Freeing an skb on the same CPU that allocated it comes for free with
>the e1000 NAPI driver style, but not with the tulip NAPI driver, where a
>transmit interrupt might happen on a different CPU. The skb recycler patch
>only recycles if allocation and freeing happen on the same CPU;
>otherwise we let the slab take the hit. On the tulip this happens about
>50% of the time.
 So the skbinit patch will help in the other case.

Regards,
    Mala


* Re: [Lse-tech] Re: (RFC): SKB Initialization
@ 2002-08-23 23:38 Mala Anand
  2002-08-23 23:55 ` David S. Miller
  0 siblings, 1 reply; 26+ messages in thread
From: Mala Anand @ 2002-08-23 23:38 UTC (permalink / raw)
  To: David S. Miller
  Cc: alan, bcrl, Bill Hartner, haveblue, linux-kernel, lse-tech,
	lse-tech-admin


From: Dave Hansen <haveblue@us.ibm.com>
   Date: Fri, 23 Aug 2002 09:39:13 -0700

   Where are interrupts disabled?   I just went through a set of kernprof
   data and traced up the call graph.  In the most common __kfree_skb
   case, I do not believe that it has interrupts disabled.  I could be
   wrong, but I didn't see it.

>That's completely right.  interrupts should never be disabled when
>__kfree_skb is executed.  It used to be possible when we allowed
>it to be invoked from interrupt handlers, but that is illegal and
>we have kfree_skb_irq which just reschedules the actual __kfree_skb
>to a software interrupt.

>So I agree with you, Mala's claims seem totally bogus and not well
>founded at all.
To name a few places: interrupts are disabled when skbs are put back on
the hot_list and when the cache list is accessed in the slab allocator.
Am I missing something? Please help me to understand.


Regards,
    Mala


* Re: [Lse-tech] Re: (RFC): SKB Initialization
@ 2002-08-23 23:14 Mala Anand
  0 siblings, 0 replies; 26+ messages in thread
From: Mala Anand @ 2002-08-23 23:14 UTC (permalink / raw)
  To: nevdull; +Cc: Bill Hartner, haveblue, linux-kernel, lse-tech


Rick wrote..
>Note that Mala said "I measured the cycles for only the
>initialization code in alloc_skb and __kfree_skb" which could mean that
>even other parts of alloc_skb() or __kfree_skb() may have gotten worse
>and you would not have known.
 Please look at my reply to Ben LaHaise, which has the cycles for
 alloc_skb() and __kfree_skb(). You don't have to guess that.

> Later she admits, "As the scope of the
>code measured widens the percentage improvement comes down" and finally
>observes "We measured it in a web serving workload and found that we
>get 0.7% improvement"  which is practically in the noise.
Those were initial results, which included more than the posted patch. We
are still working on getting numbers.

>Dave's
>observation was that it was slightly worse (0.35%).
Are you basing this 0.35% degradation on your profile? According to
Dave's SPECweb99 results there is a 2.97% improvement in simultaneous
connections with my patch. Is that right, Dave?




Regards,
    Mala


* Re: [Lse-tech] Re: (RFC): SKB Initialization
@ 2002-08-23 14:44 Mala Anand
  2002-08-23 16:39 ` Dave Hansen
  0 siblings, 1 reply; 26+ messages in thread
From: Mala Anand @ 2002-08-23 14:44 UTC (permalink / raw)
  To: haveblue
  Cc: Benjamin LaHaise, alan, Bill Hartner, davem, linux-kernel,
	lse-tech, lse-tech-admin


Dave Hansen wrote..
>Mala Anand wrote:
>> The third scope would be measuring this patch in a workload environment.
>> We measured it in a web serving workload and found that we get 0.7%
>> improvement.

>First of all, the patch doesn't apply at all against the current
>bitkeeper tree.  I can post the exact one I used if you like.

>I tried this under our SPECweb99 setup.  Here's a snippet of
>readprofile with, then without the patch:

>alloc:free ratio: 1.226
>(__kfree_skb+alloc_skb)/total = 3.14%


>alloc:free ratio: 0.348
>(__kfree_skb+alloc_skb)/total = 2.79%

>You can see the entire readprofile here:
>http://www.sr71.net/~specweb99/run-specweb-100sec-2400-2.5.31-bk+4-kmap-08-22-2002-11.20.17/

>http://www.sr71.net/~specweb99/run-specweb-100sec-2400-2.5.31-bk+4-kmap-mala-08-22-2002-11.44.25/

>No, I don't know why I have so much idle time.

Readprofile ticks are not as accurate as the cycles I measured.
Moreover, readprofile can give misleading information because it samples
on timer interrupts. alloc_skb and __kfree_skb call memory management
routines, and interrupts are disabled in many parts of that code, so the
timer cannot sample there. So I don't trust the readprofile data.



Regards,
    Mala
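
The sampling concern raised above can be illustrated with a small
simulation. This is only a sketch of the general mechanism (a timer-driven
profiler cannot take a sample while interrupts are off, so the sample is
charged to whatever runs once they are enabled again). It says nothing about
how much of alloc_skb or __kfree_skb really runs with interrupts disabled,
which is disputed elsewhere in this thread. The segment names, durations and
tick period are hypothetical, and the period is chosen so that at most one
tick falls inside any interrupts-off window.

```python
def profile(timeline, period):
    """Return (true_time, samples) per label for a timer-sampled profile.

    timeline: list of (label, duration, irqs_off) segments run back to back.
    A tick that lands in an irqs_off segment is deferred and attributed to
    the next segment that runs with interrupts enabled.
    """
    segments = []
    true_time = {}
    samples = {}
    start = 0
    for label, duration, irqs_off in timeline:
        segments.append((start, start + duration, label, irqs_off))
        true_time[label] = true_time.get(label, 0) + duration
        samples.setdefault(label, 0)
        start += duration
    total = start

    for t in range(0, total, period):
        # Find the segment that is running when the timer fires.
        idx = next(i for i, s in enumerate(segments) if s[0] <= t < s[1])
        # Skip forward while interrupts are disabled.
        while idx < len(segments) and segments[idx][3]:
            idx += 1
        if idx < len(segments):
            samples[segments[idx][2]] += 1
    return true_time, samples


# Hypothetical workload: 30 time units with interrupts off, then 70 with
# interrupts on, repeated 100 times, sampled every 40 units.
timeline = [("irqs_off_path", 30, True), ("irqs_on_path", 70, False)] * 100
true_time, samples = profile(timeline, period=40)

print("true time:", true_time)
print("samples:  ", samples)
```

In this model the interrupts-off path accounts for 30% of the elapsed time
but receives no samples at all, because every tick that lands in it is
charged to the code that follows.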


* Re: [Lse-tech] Re: (RFC): SKB Initialization
@ 2002-08-22 17:22 Mala Anand
  2002-08-22 18:32 ` Benjamin LaHaise
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Mala Anand @ 2002-08-22 17:22 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: alan, Bill Hartner, davem, linux-kernel, lse-tech, lse-tech-admin


>On Wed, Aug 21, 2002 at 01:07:09PM -0500, Mala Anand wrote:
>>
>> >On Wed, Aug 21, 2002 at 11:59:44AM -0500, Mala Anand wrote:
>> >> The patch reduces the number of cycles by 25%
>>
>> >The data you are reporting is flawed: where are the average cycle
>> >times spent in __kfree_skb with the patch?
>>
>> I measured the cycles for only the initialization code in alloc_skb
>> and __kfree_skb. Since the init code is removed from __kfree_skb,
>> no cycles are spent there.

>Then the testing technique is flawed.  You should include all of the
>operations included in an alloc_skb/kfree_skb pair in order to see
>the overall effect of the change, otherwise your change could have a
>net negative effect which would not be noticed.

Cycles for the whole routines alloc_skb and __kfree_skb are as follows:

Baseline 2.5.25
---------------
       alloc/free average cycles
       -------------------------
Runs:      1st            2nd            3rd

CPU0:    337/1163       336/1132       304/1100
CPU1:    318/1164       309/1153       311/1127


2.5.25 + skbinit patch
----------------------
       alloc/free average cycles
       -------------------------
Runs:      1st            2nd            3rd

CPU0:    447/1015       580/846        402/905
CPU1:    419/1003       383/915        547/856

The above figures indicate that the cycles spent in alloc_skb and
__kfree_skb have improved by about 5% with the patch.  However, if you
take the absolute cycles and average them over the three runs, the saving
comes to around 145 cycles, which is close to what I posted earlier
by measuring just the changed code. As the scope of the code measured
widens, the percentage improvement comes down.

So the results for the first two scopes (1. measuring the cycles spent in
the changed code, 2. measuring the cycles spent in alloc_skb and
__kfree_skb) are consistent.

The third scope would be measuring this patch in a workload environment.
We measured it in a web serving workload and found that we get 0.7%
improvement.

I would like to stress again that this patch helps only when the
allocations and frees occur on two different CPUs.  I measured it on a
UNI system and did not see any impact.

Regards,
    Mala
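
The arithmetic behind the 5% and 145-cycle figures can be reproduced from
the two tables above. The sketch below is plain Python over the posted
numbers. The helper names are made up for illustration, and reading the
145-cycle saving as the per-pair saving summed over both CPUs is an
assumption; the message does not say how that figure was derived.

```python
# (alloc, free) average cycles per run, copied from the tables above.
baseline = {
    "CPU0": [(337, 1163), (336, 1132), (304, 1100)],
    "CPU1": [(318, 1164), (309, 1153), (311, 1127)],
}
patched = {
    "CPU0": [(447, 1015), (580, 846), (402, 905)],
    "CPU1": [(419, 1003), (383, 915), (547, 856)],
}


def mean_component(table, index):
    """Mean of the alloc (index 0) or free (index 1) column over all runs."""
    values = [run[index] for runs in table.values() for run in runs]
    return sum(values) / len(values)


def mean_pair_cost(table):
    """Mean cost of one alloc + free pair over all CPUs and runs."""
    return mean_component(table, 0) + mean_component(table, 1)


base = mean_pair_cost(baseline)
new = mean_pair_cost(patched)
saving = base - new                  # cycles saved per alloc/free pair
percent = 100 * saving / base
saving_both_cpus = 2 * saving        # assumed reading of the 145-cycle figure

print(f"alloc: {mean_component(baseline, 0):.1f} -> {mean_component(patched, 0):.1f}")
print(f"free:  {mean_component(baseline, 1):.1f} -> {mean_component(patched, 1):.1f}")
print(f"pair:  {base:.1f} -> {new:.1f} ({percent:.1f}% fewer cycles)")
print(f"saving summed over both CPUs: {saving_both_cpus:.1f} cycles")
```

On these numbers the average alloc gets more expensive and the average free
gets cheaper, which matches moving the initialization from __kfree_skb to
alloc_skb; the net effect is a pair cost of roughly 1386 cycles against a
baseline of 1459.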



Thread overview: 26+ messages
2002-08-25 16:17 [Lse-tech] Re: (RFC): SKB Initialization jamal
2002-08-25 22:51 ` David S. Miller
  -- strict thread matches above, loose matches on Subject: below --
2002-09-03  3:47 Mala Anand
2002-08-27 13:18 Mala Anand
2002-08-27 15:49 ` jamal
2002-08-27  2:53 Mala Anand
2002-08-26 13:04 Mala Anand
2002-08-26 19:28 ` Robert Olsson
2002-08-27 10:17 ` jamal
2002-08-25 20:12 Mala Anand
2002-08-26  1:02 ` jamal
2002-08-23 23:38 Mala Anand
2002-08-23 23:55 ` David S. Miller
2002-08-23 23:14 Mala Anand
2002-08-23 14:44 Mala Anand
2002-08-23 16:39 ` Dave Hansen
2002-08-23 20:12   ` Bill Hartner
2002-08-23 20:30     ` Dave Hansen
2002-08-23 23:36       ` Troy Wilson
2002-08-23 20:51     ` Rick Lindsley
2002-08-23 22:41   ` David S. Miller
2002-08-22 17:22 Mala Anand
2002-08-22 18:32 ` Benjamin LaHaise
2002-08-22 19:02 ` Dave Hansen
2002-08-22 22:05   ` William Lee Irwin III
2002-08-23 19:09 ` Bill Hartner
