netdev.vger.kernel.org archive mirror
* latest -stable breaks Squid
@ 2006-05-03 21:19 Dave Jones
  2006-05-04  0:43 ` Dave Jones
  2006-05-04  1:10 ` Herbert Xu
  0 siblings, 2 replies; 12+ messages in thread
From: Dave Jones @ 2006-05-03 21:19 UTC (permalink / raw)
  To: netdev; +Cc: stable, Andreas M. Kirchwitz, Vassilios Kotoulas

[-- Attachment #1: Type: text/plain, Size: 290 bytes --]

So I pushed out an update for Fedora Core 5 users yesterday
that moved the kernel from 2.6.16.9 to 2.6.16.13.
I've since heard "My network performance is awful", and worse
yet, some apps seem broken as in the report below.

Anyone have any ideas ?

		Dave

-- 
http://www.codemonkey.org.uk

[-- Attachment #2: Type: message/rfc822, Size: 4270 bytes --]

From: "Andreas M. Kirchwitz" <fedora-list@list.zikzak.de>
To: fedora-list@redhat.com
Subject: Kernel 2.6.16-1.2107_FC5 breaks Squid
Date: Wed, 3 May 2006 12:53:04 +0000 (UTC)
Message-ID: <slrne5h9tg.or2.amk@nautilus.zikzak.de>

Hi folks!

Just installed the new kernel 2.6.16-1.2107_FC5 (32-bit, i686),
and everything seems to work fine -- except Squid (no matter
if I use the one that ships with FC5 or my own).

No problem with small data objects (a few kilobytes).

   telnet localhost 3128
   GET http://www.kirchwitz.de/test.html HTTP/1.0

Large data objects won't work. According to ethereal, the data
is loaded successfully. And I also find the complete objects
in /var/spool/squid, but the data is not output to the application:

   telnet localhost 3128
   GET http://www.heise.de/ HTTP/1.0

After a few kilobytes, the stream suddenly stops. Sometimes,
Squid doesn't do any output at all, although the object is in
the cache.

With the previous kernel 2.6.16-1.2096_FC5 all is well.

What has changed in the kernel that makes Squid break so badly?
Maybe other applications are affected as well. Don't know yet.
The error with squid was simply very obvious. ;-)

	Greetings, Andreas
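
The telnet reproduction above can be scripted so that the "small object works, large object stalls" symptom is measured rather than observed by eye. The following is only a sketch: the helper names `build_request` and `fetch_via_proxy` are hypothetical, and the proxy address (localhost:3128) and the 10-second stall timeout are assumptions taken from, or added to, the report.

```python
import socket


def build_request(url: str) -> bytes:
    """Return the HTTP/1.0 proxy request that the report types into telnet."""
    return f"GET {url} HTTP/1.0\r\n\r\n".encode("ascii")


def fetch_via_proxy(url, host="localhost", port=3128, timeout=10.0):
    """Send the request to the proxy and count bytes until EOF or a stall.

    Returns (bytes_received, stalled). `stalled` is True when no data
    arrived for `timeout` seconds before the proxy closed the connection.
    """
    received = 0
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(url))
        try:
            while True:
                chunk = conn.recv(65536)
                if not chunk:          # proxy closed the connection: complete
                    return received, False
                received += len(chunk)
        except socket.timeout:         # stream stopped without EOF
            return received, True
```

Calling `fetch_via_proxy("http://www.kirchwitz.de/test.html")` and `fetch_via_proxy("http://www.heise.de/")` against a running Squid would then give a byte count and a stall flag for the small and the large object respectively.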



* Re: latest -stable breaks Squid
  2006-05-03 21:19 latest -stable breaks Squid Dave Jones
@ 2006-05-04  0:43 ` Dave Jones
  2006-05-04  1:10 ` Herbert Xu
  1 sibling, 0 replies; 12+ messages in thread
From: Dave Jones @ 2006-05-04  0:43 UTC (permalink / raw)
  To: netdev; +Cc: stable, Andreas M. Kirchwitz, Vassilios Kotoulas

On Wed, May 03, 2006 at 05:19:15PM -0400, Dave Jones wrote:
 > So I pushed out an update for Fedora Core 5 users yesterday
 > that moved the kernel from 2.6.16.9 to 2.6.16.13.
 > I've since heard "My network performance is awful", and worse
 > yet, some apps seem broken as in the report below.

Further problems found (not all network related, but there does
seem to be a pattern amongst some of them) can be seen at
http://tinyurl.com/o7uqj

		Dave

-- 
http://www.codemonkey.org.uk


* Re: latest -stable breaks Squid
  2006-05-03 21:19 latest -stable breaks Squid Dave Jones
  2006-05-04  0:43 ` Dave Jones
@ 2006-05-04  1:10 ` Herbert Xu
  2006-05-04  1:22   ` Ben Greear
  1 sibling, 1 reply; 12+ messages in thread
From: Herbert Xu @ 2006-05-04  1:10 UTC (permalink / raw)
  To: Dave Jones; +Cc: netdev, stable, fedora-list, fedoralist

Dave Jones <davej@redhat.com> wrote:
> 
> So I pushed out an update for Fedora Core 5 users yesterday
> that moved the kernel from 2.6.16.9 to 2.6.16.13.
> I've since heard "My network performance is awful", and worse
> yet, some apps seem broken as in the report below.
> 
> Anyone have any ideas ?

Try reverting the e1000 truesize patch.  Although the fix is 100%
correct, it might have a negative impact on user-space apps with
particularly small rcvbuf settings.  Prior to the fix, the
incorrect accounting was essentially enlarging rcvbuf by as much
as 10 times.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
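
To see why truesize accounting matters for applications with small rcvbuf settings, it helps to look at how many packets fit into a receive buffer when each packet is charged its full allocation size rather than a fraction of it. The sketch below is illustrative only: `effective_rcvbuf` and `packets_that_fit` are hypothetical helper names, and the 2048-byte truesize is an assumed figure, not a value taken from the e1000 driver.

```python
import socket


def effective_rcvbuf(requested=None):
    """Return the SO_RCVBUF value the kernel reports for a fresh TCP socket.

    If `requested` is given, ask for that size first. On Linux the kernel
    adjusts the requested value (see socket(7)), so the result may differ.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        if requested is not None:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


def packets_that_fit(rcvbuf, charged_per_packet):
    """Number of queued packets that fit if each is charged this many bytes."""
    return rcvbuf // charged_per_packet


rcvbuf = 65536            # assumed application setting
truesize = 2048           # assumed real memory cost per packet (hypothetical)
correct = packets_that_fit(rcvbuf, truesize)
# Under-charging each packet by a factor of ten lets roughly ten times as
# many packets queue up, which is the "enlarged rcvbuf" effect described above.
undercharged = packets_that_fit(rcvbuf, truesize // 10)
print(correct, undercharged)
```

An application that worked only because of the under-charging would need a correspondingly larger SO_RCVBUF once the accounting is fixed.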


* Re: latest -stable breaks Squid
  2006-05-04  1:10 ` Herbert Xu
@ 2006-05-04  1:22   ` Ben Greear
  2006-05-04  1:59     ` Ian McDonald
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Greear @ 2006-05-04  1:22 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Dave Jones, netdev, stable, fedora-list, fedoralist

Herbert Xu wrote:
> Dave Jones <davej@redhat.com> wrote:
> 
>>So I pushed out an update for Fedora Core 5 users yesterday
>>that moved the kernel from 2.6.16.9 to 2.6.16.13.
>>I've since heard "My network performance is awful", and worse
>>yet, some apps seem broken as in the report below.
>>
>>Anyone have any ideas ?
> 
> 
> Try reverting the e1000 truesize patch.  Although the fix is 100%
> correct, it might have a negative impact on user-space apps with
> particularly small rcvbuf settings.  Prior to the fix, due to the
> incorrect accounting we are essentially enlarging rcvbuf by as much
> as 10 times.

At least one of the reports shows problems with non e1000 NICs, so it's
probably not just the e1000 change.

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=190620

Ben

> 
> Cheers,


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: latest -stable breaks Squid
  2006-05-04  1:22   ` Ben Greear
@ 2006-05-04  1:59     ` Ian McDonald
  2006-05-04 23:25       ` David S. Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Ian McDonald @ 2006-05-04  1:59 UTC (permalink / raw)
  To: Ben Greear
  Cc: Herbert Xu, Dave Jones, netdev, stable, fedora-list, fedoralist

On 5/4/06, Ben Greear <greearb@candelatech.com> wrote:
> Herbert Xu wrote:
> > Dave Jones <davej@redhat.com> wrote:
> >
> >>So I pushed out an update for Fedora Core 5 users yesterday
> >>that moved the kernel from 2.6.16.9 to 2.6.16.13.
> >>I've since heard "My network performance is awful", and worse
> >>yet, some apps seem broken as in the report below.
> >>
> >>Anyone have any ideas ?
> >
> >
> > Try reverting the e1000 truesize patch.  Although the fix is 100%
> > correct, it might have a negative impact on user-space apps with
> > particularly small rcvbuf settings.  Prior to the fix, due to the
> > incorrect accounting we are essentially enlarging rcvbuf by as much
> > as 10 times.
>
> At least one of the reports shows problems with non e1000 NICs, so it's
> probably not just the e1000 change.
>
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=190620
>
> Ben
>
Wouldn't commit 5d0b6f2bdaf7e016e750cd24164a241512d968a3 be more likely,
as it touches net/ipv4/tcp_output.c and is also in the same general area?
--
Ian McDonald
Web: http://wand.net.nz/~iam4
Blog: http://imcdnzl.blogspot.com
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand


* Re: latest -stable breaks Squid
  2006-05-04  1:59     ` Ian McDonald
@ 2006-05-04 23:25       ` David S. Miller
  2006-05-04 23:30         ` Herbert Xu
                           ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: David S. Miller @ 2006-05-04 23:25 UTC (permalink / raw)
  To: imcdnzl; +Cc: greearb, herbert, davej, netdev, stable, fedora-list, fedoralist

From: "Ian McDonald" <imcdnzl@gmail.com>
Date: Thu, 4 May 2006 13:59:04 +1200

> Wouldn't it be more likely commit 5d0b6f2bdaf7e016e750cd24164a241512d968a3
> 
> as this touches net/ipv4/tcp_output.c and is also in same general area?

This commit makes us account transmit memory properly.  Previously we
were underaccounting which is a serious error and in fact could result
in assertion failures due to sk->sk_forward_alloc going negative if
things were just right.

If this change is what makes an application go slower, then the
problem is likely that the socket send buffer limits are not being set
large enough.

That being said, the first thing that should be tried is reverting
the above-mentioned change and seeing whether the problem goes away.  If
so, then we need to investigate what the bandwidth delay product is
for the connection, and whether the socket send buffer is set large
enough for that size of pipe.
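
The check suggested above, whether the socket send buffer is large enough for the pipe, comes down to comparing SO_SNDBUF with the bandwidth delay product. A minimal sketch follows; the helper names are hypothetical and the 100 Mbit/s link with a 50 ms round-trip time is an assumed example, not a figure from the reports.

```python
import socket


def bandwidth_delay_product(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to keep the pipe full.

    BDP = bandwidth * RTT; divide by 8 for bytes and by 1000 for milliseconds.
    """
    return bandwidth_bits_per_s * rtt_ms // 8000


def current_sndbuf() -> int:
    """Return the default SO_SNDBUF the kernel reports for a new TCP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


def sndbuf_covers_pipe(sndbuf: int, bdp: int) -> bool:
    """True if the send buffer can hold a full pipe's worth of data."""
    return sndbuf >= bdp


bdp = bandwidth_delay_product(100_000_000, 50)   # assumed link: 100 Mbit/s, 50 ms
sndbuf = current_sndbuf()
print(f"BDP={bdp} bytes, SO_SNDBUF={sndbuf} bytes, "
      f"covers pipe: {sndbuf_covers_pipe(sndbuf, bdp)}")
```

If the buffer is smaller than the bandwidth delay product, the sender will stall waiting for acknowledgements before the pipe is full, which would show up as reduced throughput once transmit memory is accounted for correctly.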


* Re: latest -stable breaks Squid
  2006-05-04 23:25       ` David S. Miller
@ 2006-05-04 23:30         ` Herbert Xu
  2006-05-04 23:47         ` Dave Jones
  2006-05-05  8:49         ` [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb() Eric Dumazet
  2 siblings, 0 replies; 12+ messages in thread
From: Herbert Xu @ 2006-05-04 23:30 UTC (permalink / raw)
  To: David S. Miller
  Cc: imcdnzl, greearb, davej, netdev, stable, fedora-list, fedoralist

On Thu, May 04, 2006 at 04:25:46PM -0700, David S. Miller wrote:
> 
> That being said, the first thing that should be tried is reverting
> the above mentioned change and see if the problem goes away.  If
> so, then we need to investigate what the bandwidth delay product is
> for the connection, and whether the socket send buffer is set large
> enough for that size of pipe.

I've tracked the networking breakage down to a broken Xen patch.
So 2.6.16.13 is all OK.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: latest -stable breaks Squid
  2006-05-04 23:25       ` David S. Miller
  2006-05-04 23:30         ` Herbert Xu
@ 2006-05-04 23:47         ` Dave Jones
  2006-05-05  8:49         ` [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb() Eric Dumazet
  2 siblings, 0 replies; 12+ messages in thread
From: Dave Jones @ 2006-05-04 23:47 UTC (permalink / raw)
  To: David S. Miller
  Cc: imcdnzl, greearb, herbert, netdev, stable, fedora-list,
	fedoralist

On Thu, May 04, 2006 at 04:25:46PM -0700, David S. Miller wrote:

 > That being said, the first thing that should be tried is reverting
 > the above mentioned change and see if the problem goes away.  If
 > so, then we need to investigate what the bandwidth delay product is
 > for the connection, and whether the socket send buffer is set large
 > enough for that size of pipe.

It's now believed (after some detective work from Herbert) that
this round of problems wasn't caused by the -stable patch, but
by a bogus update to Xen which we carry in the Fedora kernel
that sneaked in without a changelog (which is why I didn't even
suspect that thing, given it worked fine previously).

Until today I had no idea just how much that thing poked into net/

Ugh.  Apologies for the false alarm.

		Dave

-- 
http://www.codemonkey.org.uk


* [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb()
  2006-05-04 23:25       ` David S. Miller
  2006-05-04 23:30         ` Herbert Xu
  2006-05-04 23:47         ` Dave Jones
@ 2006-05-05  8:49         ` Eric Dumazet
  2006-05-05 10:06           ` Herbert Xu
                             ` (2 more replies)
  2 siblings, 3 replies; 12+ messages in thread
From: Eric Dumazet @ 2006-05-05  8:49 UTC (permalink / raw)
  To: David S. Miller, Andi Kleen; +Cc: netdev

On a dual Opteron box, I noticed high oprofile numbers in net/core/dst.c,
function dst_destroy(struct dst_entry *dst).

It appears the smp_rmb() done at the beginning of dst_destroy() is the
killer (this is an lfence machine instruction, which apparently does
a *lot* of things... maybe I/O related...): it is responsible for 80%
of the CPU time used by the whole function.

I don't understand the whole variety of available barriers very well, or
why this smp_rmb() is used in dst_destroy().
I missed the corresponding wmb that should be done somewhere in the dst 
code.

Do we have an alternative to smp_rmb() in the dst_destroy()/ kfree_skb()
context ?

Documentation/memory-barriers.txt mentions several 'advanced barrier 
functions' but I'm really lost.




ffffffff803b5f80 <dst_destroy>: /* dst_destroy total: 237528  0.5635 */
   163 3.9e-04 :ffffffff803b5f80:       push   %r12
  3483  0.0083 :ffffffff803b5f82:       push   %rbp
               :ffffffff803b5f83:       mov    %rdi,%rbp
     7 1.7e-05 :ffffffff803b5f86:       push   %rbx
   201 4.8e-04 :ffffffff803b5f87:       lfence
192133  0.4558 :ffffffff803b5f8a:       data16
               :ffffffff803b5f8b:       data16
               :ffffffff803b5f8c:       nop
     4 9.5e-06 :ffffffff803b5f8d:       data16
               :ffffffff803b5f8e:       data16
               :ffffffff803b5f8f:       nop
               :ffffffff803b5f90:       mov    0x90(%rbp),%rdi



ffffffff803ae8a0 <kfree_skb>: /* kfree_skb total: 145240  0.3446 */
  1873  0.0044 :ffffffff803ae8a0:       test   %rdi,%rdi
  2127  0.0050 :ffffffff803ae8a3:       je     ffffffff803ae8c7 <kfree_skb+0x27>
    81 1.9e-04 :ffffffff803ae8a5:       mov    0xbc(%rdi),%eax
     1 2.4e-06 :ffffffff803ae8ab:       dec    %eax
  2303  0.0055 :ffffffff803ae8ad:       jne    ffffffff803ae8b4 <kfree_skb+0x14>
   221 5.2e-04 :ffffffff803ae8af:       lfence
137609  0.3265 :ffffffff803ae8b2:       jmp    ffffffff803ae8c2 <kfree_skb+0x22>
               :ffffffff803ae8b4:       lock decl 0xbc(%rdi)
    38 9.0e-05 :ffffffff803ae8bb:       sete   %al
    86 2.0e-04 :ffffffff803ae8be:       test   %al,%al
               :ffffffff803ae8c0:       je     ffffffff803ae8c7 <kfree_skb+0x27>
   806  0.0019 :ffffffff803ae8c2:       jmpq   ffffffff803ae7d0 <__kfree_skb>
    95 2.3e-04 :ffffffff803ae8c7:       repz retq

Thank you

Eric
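
The "80%" figure can be checked directly from the annotated listing: take the sample count of the hottest instruction and divide it by the function total. The sketch below does that for the dst_destroy excerpt; the parser and its regular expression are hypothetical helpers written for this listing format, not part of oprofile. Note that the samples land on the instruction following lfence, which is commonly how the cost of a stalling instruction shows up in sampled profiles.

```python
import re

# One annotated line: "<samples> <fraction> :<address>: <instruction>"
LINE = re.compile(r"^\s*(\d+)\s+\S+\s+:([0-9a-f]+):\s+(.*)$")

DST_DESTROY_TOTAL = 237528   # from the "dst_destroy total" header line
DST_DESTROY = """\
   163 3.9e-04 :ffffffff803b5f80:       push   %r12
  3483  0.0083 :ffffffff803b5f82:       push   %rbp
               :ffffffff803b5f83:       mov    %rdi,%rbp
     7 1.7e-05 :ffffffff803b5f86:       push   %rbx
   201 4.8e-04 :ffffffff803b5f87:       lfence
192133  0.4558 :ffffffff803b5f8a:       data16
     4 9.5e-06 :ffffffff803b5f8d:       data16
"""


def parse(listing):
    """Return (samples, address, instruction) for every line with samples."""
    rows = []
    for line in listing.splitlines():
        match = LINE.match(line)
        if match:   # lines without a sample count are skipped
            rows.append((int(match.group(1)), match.group(2),
                         match.group(3).strip()))
    return rows


hottest = max(parse(DST_DESTROY))            # tuple with the most samples
share = 100.0 * hottest[0] / DST_DESTROY_TOTAL
print(f"{hottest[1]} ({hottest[2]}): {share:.1f}% of dst_destroy samples")

# Same arithmetic for kfree_skb, using the counts from its listing.
kfree_skb_share = 100.0 * 137609 / 145240
print(f"kfree_skb: {kfree_skb_share:.1f}% on the instruction after lfence")
```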






* Re: [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb()
  2006-05-05  8:49         ` [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb() Eric Dumazet
@ 2006-05-05 10:06           ` Herbert Xu
  2006-05-05 16:13           ` Very long list of struct dst_entry in dst_garbage_list Eric Dumazet
  2006-05-05 17:05           ` [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb() Andi Kleen
  2 siblings, 0 replies; 12+ messages in thread
From: Herbert Xu @ 2006-05-05 10:06 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, ak, netdev

Eric Dumazet <dada1@cosmosbay.com> wrote:
>
> I missed the corresponding wmb that should be done somewhere in the dst 
> code.

The wmb for dst's is in dst_release while skb's have an implicit barrier
through atomic_dec_and_test in kfree_skb.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Very long list of struct dst_entry in dst_garbage_list
  2006-05-05  8:49         ` [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb() Eric Dumazet
  2006-05-05 10:06           ` Herbert Xu
@ 2006-05-05 16:13           ` Eric Dumazet
  2006-05-05 17:05           ` [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb() Andi Kleen
  2 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2006-05-05 16:13 UTC (permalink / raw)
  To: netdev

I noticed that after a 'ip route flush cache' (manual or timer 
triggered) on a busy server, XXXXX entries are added to dst_garbage_list.
(XXXXX depends on the number of established sockets)

Every 1/10th of a second (DST_GC_MIN), net/core/dst.c::dst_run_gc() is
fired and tries to free some entries from the list, but many entries have
a non-zero refcnt and stay in the list for the next run.

Linux version is 2.6.17-rc3.

Do you think a rework of the dst_run_gc() function is necessary (using a
batch mode to limit the number of entries examined at each run), or is
this a "should not happen, something is broken" situation?

Eric
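
The "batch mode" idea can be modelled outside the kernel to show what it would change: each run examines at most a fixed number of entries instead of walking the whole garbage list. This is a toy model only; `Entry`, `run_gc_batch` and the batch size are hypothetical, and nothing here reflects the actual dst_run_gc() implementation.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Entry:
    """Stand-in for a struct dst_entry waiting on the garbage list."""
    name: str
    refcnt: int


def run_gc_batch(garbage: deque, batch_size: int) -> list:
    """Examine at most `batch_size` entries from the head of the list.

    Entries with refcnt == 0 are freed (returned); entries still referenced
    are moved to the tail so the next run starts with different entries.
    """
    freed = []
    for _ in range(min(batch_size, len(garbage))):
        entry = garbage.popleft()
        if entry.refcnt == 0:
            freed.append(entry)
        else:
            garbage.append(entry)    # still in use: retry on a later run
    return freed


garbage = deque([Entry("a", 0), Entry("b", 2), Entry("c", 0), Entry("d", 0)])
first = run_gc_batch(garbage, batch_size=2)    # examines "a" and "b" only
second = run_gc_batch(garbage, batch_size=2)   # examines "c" and "d"
print([e.name for e in first], [e.name for e in second],
      [e.name for e in garbage])
```

The trade-off in such a scheme is that each run has bounded cost, while entries that stay referenced for a long time are revisited less often.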





* Re: [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb()
  2006-05-05  8:49         ` [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb() Eric Dumazet
  2006-05-05 10:06           ` Herbert Xu
  2006-05-05 16:13           ` Very long list of struct dst_entry in dst_garbage_list Eric Dumazet
@ 2006-05-05 17:05           ` Andi Kleen
  2 siblings, 0 replies; 12+ messages in thread
From: Andi Kleen @ 2006-05-05 17:05 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David S. Miller, netdev

On Friday 05 May 2006 10:49, Eric Dumazet wrote:
> On a dual Opteron box, I noticed high oprofile numbers in net/core/dst.c,
> function dst_destroy(struct dst_entry *dst).
> 
> It appears the smp_rmb() done at the beginning of dst_destroy() is the
> killer (this is an lfence machine instruction, which apparently does
> a *lot* of things... maybe I/O related...): it is responsible for 80%
> of the CPU time used by the whole function.
> 
> I don't understand the whole variety of available barriers very well, or
> why this smp_rmb() is used in dst_destroy().
> I missed the corresponding wmb that should be done somewhere in the dst 
> code.
> 
> Do we have an alternative to smp_rmb() in the dst_destroy()/ kfree_skb() 
> context ?

Eliminating it probably wouldn't help very much - it just flushes the 
loads already in flight. If it didn't do that the next smp_rmb() would.
I'm surprised there are that many though. Normally kernel code is spaghetti
enough that the CPU cannot speculate too many loads ahead.

But are you 100% sure the cost is not in the lock decl ? That would make
more sense. Perhaps profile for cache misses too and double check?

-Andi


end of thread, other threads:[~2006-05-05 17:29 UTC | newest]

Thread overview: 12+ messages
2006-05-03 21:19 latest -stable breaks Squid Dave Jones
2006-05-04  0:43 ` Dave Jones
2006-05-04  1:10 ` Herbert Xu
2006-05-04  1:22   ` Ben Greear
2006-05-04  1:59     ` Ian McDonald
2006-05-04 23:25       ` David S. Miller
2006-05-04 23:30         ` Herbert Xu
2006-05-04 23:47         ` Dave Jones
2006-05-05  8:49         ` [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb() Eric Dumazet
2006-05-05 10:06           ` Herbert Xu
2006-05-05 16:13           ` Very long list of struct dst_entry in dst_garbage_list Eric Dumazet
2006-05-05 17:05           ` [x86_64, NET] smp_rmb() in dst_destroy() seems very expensive, ditto in kfree_skb() Andi Kleen
