netdev.vger.kernel.org archive mirror
* Bug-hunting
@ 2005-02-22 15:07 Christian Schmid
  2005-02-22 18:44 ` Bug-hunting David S. Miller
  0 siblings, 1 reply; 13+ messages in thread
From: Christian Schmid @ 2005-02-22 15:07 UTC (permalink / raw)
  To: netdev

Hello.

I am still trying to hunt down the slowdown-on-many-sockets bug. Is there any way I can see
how much TCP memory is used right now? Why did you change the behaviour? In 2.6.10-rc2 I was able to
see the amount by looking in slabinfo, but now the buffers are gone. And where did you introduce the
buffer limit? It seems it's now globally limited to xxx MB. I want to disable this in order to check
whether that's the reason.

Chris


* Re: Bug-hunting
  2005-02-22 15:07 Bug-hunting Christian Schmid
@ 2005-02-22 18:44 ` David S. Miller
  2005-02-22 19:17   ` Bug-hunting Christian Schmid
  0 siblings, 1 reply; 13+ messages in thread
From: David S. Miller @ 2005-02-22 18:44 UTC (permalink / raw)
  To: Christian Schmid; +Cc: netdev

On Tue, 22 Feb 2005 16:07:35 +0100
Christian Schmid <webmaster@rapidforum.com> wrote:

> I am still trying to hunt down the slowdown-on-many-sockets bug. Is there any way I can see
> how much TCP memory is used right now? Why did you change the behaviour? In 2.6.10-rc2 I was able to
> see the amount by looking in slabinfo, but now the buffers are gone. And where did you introduce the
> buffer limit? It seems it's now globally limited to xxx MB. I want to disable this in order to check
> whether that's the reason.

The global TCP memory limit, controllable by sysctl()'s, has been there for
at least 3 years.


* Re: Bug-hunting
  2005-02-22 18:44 ` Bug-hunting David S. Miller
@ 2005-02-22 19:17   ` Christian Schmid
  2005-02-22 19:51     ` Bug-hunting David S. Miller
  0 siblings, 1 reply; 13+ messages in thread
From: Christian Schmid @ 2005-02-22 19:17 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

No. This is a new thing and it wasn't there before. In 2.6.10-rc2 the kernel aborted programs with
"Out of memory" when too many buffers were allocated and low memory was full. NOW it just shrinks the
buffers dynamically. I don't want that. I have a 2/2 system and I want 1600 MB for buffers, but you
only allow around 700 MB for buffers. This is definitely NEW.



* Re: Bug-hunting
  2005-02-22 19:17   ` Bug-hunting Christian Schmid
@ 2005-02-22 19:51     ` David S. Miller
  2005-02-22 23:21       ` Bug-hunting Christian Schmid
  0 siblings, 1 reply; 13+ messages in thread
From: David S. Miller @ 2005-02-22 19:51 UTC (permalink / raw)
  To: Christian Schmid; +Cc: netdev

On Tue, 22 Feb 2005 20:17:55 +0100
Christian Schmid <webmaster@rapidforum.com> wrote:

> No. This is a new thing and it wasn't there before. In 2.6.10-rc2 the kernel aborted programs with
> "Out of memory" when too many buffers were allocated and low memory was full. NOW it just shrinks the
> buffers dynamically. I don't want that. I have a 2/2 system and I want 1600 MB for buffers, but you
> only allow around 700 MB for buffers. This is definitely NEW.

It always shrinks the TCP socket buffer usage when the total socket
memory used by TCP is over the threshold.  This code has been there
for 3 years at least.

If the behavior has changed, that's interesting and probably due to
some other change.  It could even be a MM layer change that is causing
things to behave differently now for you, and because things are
being tweaked in the memory management all the time (particularly
the handling of lowmem vs. highmem) this would not surprise me at
all.

But the basic framework for shrinking socket buffers when total TCP
memory usage crosses some threshold has been there and has been
enabled for a long time.

Anyways, we're stalled on figuring out exactly what is wrong due to
lack of information and difficulty in reproducing.  The ball is still
in your court.

Why don't you put together a very simple test case that others can
use to analyze and reproduce your bug?  I bet we'll fix it or figure
out what is wrong with your app within 24 hours once you do that. :-)


* Re: Bug-hunting
  2005-02-22 19:51     ` Bug-hunting David S. Miller
@ 2005-02-22 23:21       ` Christian Schmid
  2005-02-23  3:30         ` Bug-hunting David S. Miller
  0 siblings, 1 reply; 13+ messages in thread
From: Christian Schmid @ 2005-02-22 23:21 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

Hm, can you tell me how to change the limit? It seems the limit is good for 3/1 systems but not for
2/2 systems.

To reproduce this, you would have to create a server which is sending data to 4000 non-blocking
sockets. It doesn't matter what you send. You should notice a slowdown the more sockets you use.
You need Gigabit for this to appear.
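
A rough sketch of the kind of test program being asked for in this thread might look like the following. It is not Christian's server: it uses TCP connections over loopback instead of Gigabit Ethernet, a handful of sockets instead of 4000, and the names `make_tcp_pairs` and `pump` are made up for illustration. It only shows the shape of a non-blocking send loop that counts how often `send()` would have blocked.

```python
import socket


def make_tcp_pairs(n):
    """Open n TCP connections over loopback; return (sender, receiver) pairs."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(n)
    pairs = []
    for _ in range(n):
        receiver = socket.create_connection(listener.getsockname())
        sender, _addr = listener.accept()
        pairs.append((sender, receiver))
    listener.close()
    return pairs


def pump(n_sockets=8, rounds=50, chunk=64 * 1024):
    """Send to n_sockets non-blocking sockets; return (bytes_sent, would_block)."""
    pairs = make_tcp_pairs(n_sockets)
    payload = b"x" * chunk
    sent = would_block = 0
    try:
        for sender, receiver in pairs:
            sender.setblocking(False)
            receiver.setblocking(False)
        for _ in range(rounds):
            for sender, receiver in pairs:
                try:
                    sent += sender.send(payload)  # may be a partial send
                except BlockingIOError:
                    would_block += 1  # send buffer is full right now
                try:
                    receiver.recv(chunk)  # drain so the sender can progress
                except BlockingIOError:
                    pass  # nothing has arrived yet
    finally:
        for sender, receiver in pairs:
            sender.close()
            receiver.close()
    return sent, would_block


if __name__ == "__main__":
    total, blocked = pump()
    print(f"sent {total} bytes, send() would have blocked {blocked} times")
```

To approach the reported scenario, the receivers would have to be remote clients on a Gigabit link, and throughput would be recorded while the socket count is raised step by step.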



* Re: Bug-hunting
  2005-02-22 23:21       ` Bug-hunting Christian Schmid
@ 2005-02-23  3:30         ` David S. Miller
  2005-02-23 10:28           ` Bug-hunting Christian Schmid
  0 siblings, 1 reply; 13+ messages in thread
From: David S. Miller @ 2005-02-23  3:30 UTC (permalink / raw)
  To: Christian Schmid; +Cc: netdev

On Wed, 23 Feb 2005 00:21:31 +0100
Christian Schmid <webmaster@rapidforum.com> wrote:

> To reproduce this, you would have to create a server which is sending data to 4000 non-blocking
> sockets. It doesn't matter what you send. You should notice a slowdown the more sockets you use.
> You need Gigabit for this to appear.

Please send us the exact test programs to run, not some general
description of what "one should do" to reproduce the problem.

This is exactly what is stalling this bug getting diagnosed properly
and fixed.


* Re: Bug-hunting
  2005-02-23  3:30         ` Bug-hunting David S. Miller
@ 2005-02-23 10:28           ` Christian Schmid
  2005-02-23 19:29             ` Bug-hunting David S. Miller
  0 siblings, 1 reply; 13+ messages in thread
From: Christian Schmid @ 2005-02-23 10:28 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

I would have to write a test program first. That takes time. Before I spend hours coding this,
first just tell me how I can raise the global limit from around 600-700 MB to 1500 MB.



* Re: Bug-hunting
  2005-02-23 10:28           ` Bug-hunting Christian Schmid
@ 2005-02-23 19:29             ` David S. Miller
  2005-02-23 19:41               ` Bug-hunting Christian Schmid
  2005-02-23 19:46               ` Bug-hunting Christian Schmid
  0 siblings, 2 replies; 13+ messages in thread
From: David S. Miller @ 2005-02-23 19:29 UTC (permalink / raw)
  To: Christian Schmid; +Cc: netdev

On Wed, 23 Feb 2005 11:28:24 +0100
Christian Schmid <webmaster@rapidforum.com> wrote:

> I would have to write a test program first. That takes time. Before I spend hours coding this,
> first just tell me how I can raise the global limit from around 600-700 MB to 1500 MB.

The limit is in the tcp_mem sysctl; I believe we told you this
before.


* Re: Bug-hunting
  2005-02-23 19:29             ` Bug-hunting David S. Miller
@ 2005-02-23 19:41               ` Christian Schmid
  2005-02-23 19:53                 ` Bug-hunting David S. Miller
  2005-02-23 19:46               ` Bug-hunting Christian Schmid
  1 sibling, 1 reply; 13+ messages in thread
From: Christian Schmid @ 2005-02-23 19:41 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

/proc/sys/net/ipv4/tcp_mem?

It's set to 98304   131072  196608
I have changed this to 1024000 1025000 1026000

It didn't change anything. The send buffer still keeps being shrunk.
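
For reference when reading these numbers: the kernel's ip-sysctl documentation gives the three `tcp_mem` values (min, pressure, max) in pages, not bytes. The sketch below converts them to MiB, assuming 4 KiB pages as is typical on x86; the function name `tcp_mem_to_mib` is made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, typical on x86


def tcp_mem_to_mib(line, page_size=PAGE_SIZE):
    """Convert a 'min pressure max' tcp_mem line from pages to MiB."""
    low, pressure, high = (int(field) for field in line.split())

    def to_mib(pages):
        return pages * page_size / (1024 * 1024)

    return {"min": to_mib(low), "pressure": to_mib(pressure), "max": to_mib(high)}


print(tcp_mem_to_mib("98304   131072  196608"))
# {'min': 384.0, 'pressure': 512.0, 'max': 768.0}
print(tcp_mem_to_mib("1024000 1025000 1026000"))
# {'min': 4000.0, 'pressure': 4003.90625, 'max': 4007.8125}
```

Under that page-size assumption, the original max of 196608 pages is 768 MiB, which is in the range of the roughly 700 MB ceiling reported earlier, and the new values are all close to 4000 MiB.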



* Re: Bug-hunting
  2005-02-23 19:29             ` Bug-hunting David S. Miller
  2005-02-23 19:41               ` Bug-hunting Christian Schmid
@ 2005-02-23 19:46               ` Christian Schmid
  2005-02-23 20:00                 ` Bug-hunting John Heffner
  1 sibling, 1 reply; 13+ messages in thread
From: Christian Schmid @ 2005-02-23 19:46 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev

Can I see somewhere how much memory TCP is using right now, to see whether those changes have any effect?
Maybe the error is somewhere else, so I need to check whether memory usage rises when I change the values.
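
One place that may answer this question, assuming the running kernel exposes it in the usual format, is the `TCP:` line of /proc/net/sockstat, whose `mem` field counts pages currently held by TCP. The sketch below parses that line; the sample text and its numbers are invented for illustration, and the 4 KiB page size is an assumption.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

# Invented sample in the /proc/net/sockstat layout; not taken from the thread.
SAMPLE = """\
sockets: used 4123
TCP: inuse 4005 orphan 12 tw 30 alloc 4010 mem 150000
UDP: inuse 3
"""


def tcp_pages_in_use(sockstat_text):
    """Return the 'mem' field (in pages) from the TCP line of sockstat."""
    for line in sockstat_text.splitlines():
        if line.startswith("TCP:"):
            fields = line.split()[1:]  # alternating name, value, name, value
            stats = dict(zip(fields[0::2], (int(v) for v in fields[1::2])))
            return stats["mem"]
    raise ValueError("no TCP line found")


def pages_to_mib(pages, page_size=PAGE_SIZE):
    return pages * page_size / (1024 * 1024)


pages = tcp_pages_in_use(SAMPLE)
print(pages, "pages =", pages_to_mib(pages), "MiB")
```

On a live system the text would come from reading /proc/net/sockstat, and the result can be compared against the three `tcp_mem` thresholds, which use the same unit.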



* Re: Bug-hunting
  2005-02-23 19:41               ` Bug-hunting Christian Schmid
@ 2005-02-23 19:53                 ` David S. Miller
  0 siblings, 0 replies; 13+ messages in thread
From: David S. Miller @ 2005-02-23 19:53 UTC (permalink / raw)
  To: Christian Schmid; +Cc: netdev

On Wed, 23 Feb 2005 20:41:39 +0100
Christian Schmid <webmaster@rapidforum.com> wrote:

> /proc/sys/net/ipv4/tcp_mem?
> 
> It's set to 98304   131072  196608
> I have changed this to 1024000 1025000 1026000
> 
> It didn't change anything. The send buffer still keeps being shrunk.

That's why I felt silly telling you again where the limit
is configured.

Please write the test program already.


* Re: Bug-hunting
  2005-02-23 19:46               ` Bug-hunting Christian Schmid
@ 2005-02-23 20:00                 ` John Heffner
  2005-02-24 12:29                   ` Bug-hunting Christian Schmid
  0 siblings, 1 reply; 13+ messages in thread
From: John Heffner @ 2005-02-23 20:00 UTC (permalink / raw)
  To: Christian Schmid; +Cc: netdev

Christian,

Try looking at /proc/net/netstat while running.  If TCPMemoryPressures is
increasing, you are running out of TCP memory, and you would expect to see
your socket buffers shrinking.

  -John
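
The check John describes can be sketched as follows. /proc/net/netstat stores each group as a header line of counter names followed by a line of values, both prefixed with the group name. The sample snapshots below are shortened and their values invented; the helper names `read_tcpext` and `pressure_events` are made up for illustration.

```python
def read_tcpext(netstat_text):
    """Parse the TcpExt name line and value line of /proc/net/netstat."""
    rows = [line.split()[1:] for line in netstat_text.splitlines()
            if line.startswith("TcpExt:")]
    if len(rows) < 2:
        raise ValueError("expected a TcpExt header line and a value line")
    names, values = rows[0], rows[1]
    return dict(zip(names, (int(v) for v in values)))


def pressure_events(before_text, after_text):
    """How many times TCP entered memory pressure between two snapshots."""
    key = "TCPMemoryPressures"
    return read_tcpext(after_text)[key] - read_tcpext(before_text)[key]


# Shortened, invented snapshots; a real file has many more counters.
BEFORE = ("TcpExt: SyncookiesSent TCPMemoryPressures TCPAbortOnMemory\n"
          "TcpExt: 0 3 0\n")
AFTER = ("TcpExt: SyncookiesSent TCPMemoryPressures TCPAbortOnMemory\n"
         "TcpExt: 0 7 0\n")

print(pressure_events(BEFORE, AFTER))  # 4
```

A result of zero across a slow period would suggest that the TCP memory limit is not what is shrinking the buffers.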


* Re: Bug-hunting
  2005-02-23 20:00                 ` Bug-hunting John Heffner
@ 2005-02-24 12:29                   ` Christian Schmid
  0 siblings, 0 replies; 13+ messages in thread
From: Christian Schmid @ 2005-02-24 12:29 UTC (permalink / raw)
  To: John Heffner; +Cc: netdev

More track-downs:

I have watched /proc/net/netstat and it didn't change. I experimented more with the VM parameters, as I
have understood that Linux is complex enough that bugs aren't obvious enough to simply say it's the
net code just because a non-blocking socket suddenly blocks.

I found out that changing /proc/sys/vm/min_free_kbytes makes the most drastic difference. I changed
it to 1024000 and it sped up drastically. I changed it back to around 16000 and it slowed down
immediately. Changed back to 1024000 and it sped up immediately. Does someone know
more about this? What does this parameter do? I watched /proc/meminfo but nothing really changed.
MemFree is always around 10 MB:

MemTotal:      8314392 kB
MemFree:         10284 kB
Buffers:         25644 kB
Cached:        7724180 kB
SwapCached:          0 kB
Active:         991312 kB
Inactive:      6977476 kB
HighTotal:     6421952 kB
HighFree:          640 kB
LowTotal:      1892440 kB
LowFree:          9644 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:          116272 kB
Writeback:           0 kB
Mapped:         223940 kB
Slab:           322652 kB
CommitLimit:   4157196 kB
Committed_AS:   794200 kB
PageTables:       1788 kB
VmallocTotal:   114680 kB
VmallocUsed:      1200 kB
VmallocChunk:   113392 kB
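
To put the dump above in perspective, the sketch below parses a subset of it and reports how much low memory is free. The kernel's vm sysctl documentation describes min_free_kbytes as the amount of memory the VM tries to keep free, used to compute per-zone watermarks; the code does not model that, it only does the arithmetic on the posted numbers. The helper name `parse_meminfo` is made up for illustration.

```python
# Subset of the /proc/meminfo output posted above.
MEMINFO = """\
MemTotal:      8314392 kB
MemFree:         10284 kB
HighTotal:     6421952 kB
HighFree:          640 kB
LowTotal:      1892440 kB
LowFree:          9644 kB
"""


def parse_meminfo(text):
    """Return a dict of meminfo field name -> value in kB."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            info[key.strip()] = int(rest.split()[0])
    return info


info = parse_meminfo(MEMINFO)
low_free_pct = 100 * info["LowFree"] / info["LowTotal"]
print(f"{low_free_pct:.2f}% of low memory free")  # 0.51% of low memory free
print(info["LowFree"] + info["HighFree"] == info["MemFree"])  # True
```

With about 9.6 MB of 1.8 GB low memory free, the system was running close to its free-memory floor, which may be why raising min_free_kbytes had such a visible effect; the thread itself does not confirm the cause.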


