netdev.vger.kernel.org archive mirror
* Bug in 2.6.10
@ 2005-01-28 19:27 Christian Schmid
  2005-01-28 19:42 ` Stephen Hemminger
  0 siblings, 1 reply; 5+ messages in thread
From: Christian Schmid @ 2005-01-28 19:27 UTC (permalink / raw)
  To: netdev

Hello.

In 2.6.10 a "bug" has been introduced. You may also call it a feature, but it's a crappy
feature for big servers. The kernel seems to be dynamically adjusting the buffer space available
to sockets. Even if the send buffer has been set to 1024 KB, the kernel blocks at less than that
if enough sockets are in use. If you have 10 sockets with 1024 KB each, they do not block at all
and use the full 1024 KB. If you have 4000 sockets, they only get about 200 KB each, so the total
seems to be capped at around 800 MB. This is good if you have a 1/3 split, because otherwise the
kernel would run out of low memory. But I have a 2/2 split and I need that memory for buffers.
So what can I do? Where can I adjust the "pool"?

Best regards,
Christian Schmid - RapidTec


* Re: Bug in 2.6.10
  2005-01-28 19:27 Bug in 2.6.10 Christian Schmid
@ 2005-01-28 19:42 ` Stephen Hemminger
  2005-01-28 19:46   ` David S. Miller
  0 siblings, 1 reply; 5+ messages in thread
From: Stephen Hemminger @ 2005-01-28 19:42 UTC (permalink / raw)
  To: Christian Schmid; +Cc: netdev

On Fri, 28 Jan 2005 20:27:53 +0100
Christian Schmid <webmaster@rapidforum.com> wrote:

> Hello.
> 
> In 2.6.10 a "bug" has been introduced. You may also call it a feature, but it's a crappy
> feature for big servers. The kernel seems to be dynamically adjusting the buffer space available
> to sockets. Even if the send buffer has been set to 1024 KB, the kernel blocks at less than that
> if enough sockets are in use. If you have 10 sockets with 1024 KB each, they do not block at all
> and use the full 1024 KB. If you have 4000 sockets, they only get about 200 KB each, so the total
> seems to be capped at around 800 MB. This is good if you have a 1/3 split, because otherwise the
> kernel would run out of low memory. But I have a 2/2 split and I need that memory for buffers.
> So what can I do? Where can I adjust the "pool"?

You can set the upper bound by setting tcp_wmem. There are three values,
all documented in Documentation/networking/ip-sysctl.txt:

tcp_wmem - vector of 3 INTEGERs: min, default, max
	min: Amount of memory reserved for send buffers for TCP socket.
	Each TCP socket has rights to use it due to fact of its birth.
	Default: 4K

	default: Amount of memory allowed for send buffers for TCP socket
	by default. This value overrides net.core.wmem_default used
	by other protocols, it is usually lower than net.core.wmem_default.
	Default: 16K

	max: Maximal amount of memory allowed for automatically selected
	send buffers for TCP socket. This value does not override
	net.core.wmem_max, "static" selection via SO_SNDBUF does not use this.
	Default: 128K
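
For comparison, a minimal sketch of that "static" selection path, assuming a single TCP
socket and an illustrative 1024 KB request (the socket type and the figure are examples
only); a buffer fixed this way is clamped by net.core.wmem_max rather than by the tcp_wmem
maximum, and send-buffer auto-tuning is turned off for that socket:

	#include <stdio.h>
	#include <sys/socket.h>
	#include <netinet/in.h>

	int main(void)
	{
		int fd = socket(AF_INET, SOCK_STREAM, 0);
		int requested = 1024 * 1024;    /* 1024 KB, example value only */
		int actual;
		socklen_t len = sizeof(actual);

		/* "Static" selection: a fixed send buffer.  The kernel clamps
		 * the request against net.core.wmem_max and stops auto-tuning
		 * the send buffer for this socket. */
		if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
			       &requested, sizeof(requested)) < 0)
			perror("setsockopt(SO_SNDBUF)");

		/* The value read back is roughly double the request, since
		 * the kernel reserves room for bookkeeping overhead. */
		if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &actual, &len) == 0)
			printf("effective send buffer: %d bytes\n", actual);
		return 0;
	}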

If you want performance on big servers, you are going to need lots of memory; that is
just a fact of the bandwidth-delay product times the number of connections.
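
For example, with purely illustrative numbers -- a 100 Mbit/s path, a 100 ms round-trip
time and 4000 connections:

	bandwidth-delay product = 100 Mbit/s * 0.1 s = ~1.25 MB per connection
	4000 connections * 1.25 MB                   = ~5 GB of send-buffer space

which is already far more than a 32-bit box has in low memory.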

-- 
Stephen Hemminger	<shemminger@osdl.org>


* Re: Bug in 2.6.10
  2005-01-28 19:42 ` Stephen Hemminger
@ 2005-01-28 19:46   ` David S. Miller
  2005-01-28 20:03     ` Christian Schmid
  2005-01-28 20:04     ` Ben Greear
  0 siblings, 2 replies; 5+ messages in thread
From: David S. Miller @ 2005-01-28 19:46 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: webmaster, netdev

On Fri, 28 Jan 2005 11:42:51 -0800
Stephen Hemminger <shemminger@osdl.org> wrote:

> On Fri, 28 Jan 2005 20:27:53 +0100
> Christian Schmid <webmaster@rapidforum.com> wrote:
> 
> > Hello.
> > 
> > In 2.6.10 a "bug" has been introduced. You may also call it a feature, but it's a crappy
> > feature for big servers. The kernel seems to be dynamically adjusting the buffer space available
> > to sockets. Even if the send buffer has been set to 1024 KB, the kernel blocks at less than that
> > if enough sockets are in use. If you have 10 sockets with 1024 KB each, they do not block at all
> > and use the full 1024 KB. If you have 4000 sockets, they only get about 200 KB each, so the total
> > seems to be capped at around 800 MB. This is good if you have a 1/3 split, because otherwise the
> > kernel would run out of low memory. But I have a 2/2 split and I need that memory for buffers.
> > So what can I do? Where can I adjust the "pool"?
> 
> You can set the upper bound by setting tcp_wmem. There are three values
> all documented in Documentation/networking/ip-sysctl.txt

This feature is also meant to prevent remote denial-of-service
attacks.  It limits the amount of system memory TCP can
consume on your system.

Before this feature went in, it was really easy to remotely make a
system consume 90% of its memory just with socket buffers.
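
If I read the code right, the global pool behind this limit is the net.ipv4.tcp_mem
sysctl (three values -- low, pressure, high -- counted in pages and also described in
ip-sysctl.txt). A rough sketch of inspecting and raising it; the thresholds below are
example numbers only, writing needs root, and the box must really have that much low
memory to spare:

	#include <stdio.h>

	int main(void)
	{
		char cur[128];
		FILE *f = fopen("/proc/sys/net/ipv4/tcp_mem", "r");

		/* Current limits: low, pressure, high -- counted in pages
		 * (4 KB on i386), not bytes. */
		if (f && fgets(cur, sizeof(cur), f))
			printf("tcp_mem (pages): %s", cur);
		if (f)
			fclose(f);

		/* Raise the ceiling: 393216 pages * 4 KB = 1536 MB.
		 * Example values only. */
		f = fopen("/proc/sys/net/ipv4/tcp_mem", "w");
		if (f) {
			fprintf(f, "196608 262144 393216\n");
			fclose(f);
		} else {
			perror("tcp_mem");
		}
		return 0;
	}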


* Re: Bug in 2.6.10
  2005-01-28 19:46   ` David S. Miller
@ 2005-01-28 20:03     ` Christian Schmid
  2005-01-28 20:04     ` Ben Greear
  1 sibling, 0 replies; 5+ messages in thread
From: Christian Schmid @ 2005-01-28 20:03 UTC (permalink / raw)
  To: David S. Miller; +Cc: Stephen Hemminger, netdev

>>>Hello.
>>>
>>>In 2.6.10 a "bug" has been introduced. You may also call it a feature, but it's a crappy 
>>>feature for big servers. The kernel seems to be dynamically adjusting the buffer space available 
>>>to sockets. Even if the send buffer has been set to 1024 KB, the kernel blocks at less than that 
>>>if enough sockets are in use. If you have 10 sockets with 1024 KB each, they do not block at all 
>>>and use the full 1024 KB. If you have 4000 sockets, they only get about 200 KB each, so the total 
>>>seems to be capped at around 800 MB. This is good if you have a 1/3 split, because otherwise the 
>>>kernel would run out of low memory. But I have a 2/2 split and I need that memory for buffers. 
>>>So what can I do? Where can I adjust the "pool"?
>>
>>You can set the upper bound by setting tcp_wmem. There are three values
>>all documented in Documentation/networking/ip-sysctl.txt
> 
> 
> This feature is also meant to prevent remote denial-of-service
> attacks.  It limits the amount of system memory TCP can
> consume on your system.
> 
> Before this feature went in, it was really easy to remotely make a
> system consume 90% of its memory just with socket buffers.

Thank you for your reply.

Yes, that makes sense for most situations. But isn't there a way to disable it, or at least
adjust the level? As I said, I have a 2/2 split and I would like to set the limit to 1500 MB;
this is important for my big server.

Oh, and my two cents on that DoS concern: it would be better if socket buffers were sized
dynamically, so that memory is only allocated when a queue actually needs it. Most of the
allocated memory goes unused, because the queues are mostly empty.

Please help me.

Thank you in advance,
Chris


* Re: Bug in 2.6.10
  2005-01-28 19:46   ` David S. Miller
  2005-01-28 20:03     ` Christian Schmid
@ 2005-01-28 20:04     ` Ben Greear
  1 sibling, 0 replies; 5+ messages in thread
From: Ben Greear @ 2005-01-28 20:04 UTC (permalink / raw)
  To: David S. Miller; +Cc: Stephen Hemminger, webmaster, netdev

David S. Miller wrote:
> On Fri, 28 Jan 2005 11:42:51 -0800
> Stephen Hemminger <shemminger@osdl.org> wrote:
> 
> 
>>On Fri, 28 Jan 2005 20:27:53 +0100
>>Christian Schmid <webmaster@rapidforum.com> wrote:
>>
>>
>>>Hello.
>>>
>>>In 2.6.10 a "bug" has been introduced. You may also call it a feature, but it's a crappy 
>>>feature for big servers. The kernel seems to be dynamically adjusting the buffer space available 
>>>to sockets. Even if the send buffer has been set to 1024 KB, the kernel blocks at less than that 
>>>if enough sockets are in use. If you have 10 sockets with 1024 KB each, they do not block at all 
>>>and use the full 1024 KB. If you have 4000 sockets, they only get about 200 KB each, so the total 
>>>seems to be capped at around 800 MB. This is good if you have a 1/3 split, because otherwise the 
>>>kernel would run out of low memory. But I have a 2/2 split and I need that memory for buffers. 
>>>So what can I do? Where can I adjust the "pool"?
>>
>>You can set the upper bound by setting tcp_wmem. There are three values
>>all documented in Documentation/networking/ip-sysctl.txt
> 
> 
> This feature is also meant to prevent remote denial-of-service
> attacks.  It limits the amount of system memory TCP can
> consume on your system.
> 
> Before this feature went in, it was really easy to remotely make a
> system consume 90% of its memory just with socket buffers.

Could you cause this attack without the local machine having explicitly
set its wmem buffers higher?

With the latest code, if you set the tcp_[rw]mem MAX to some really large value,
as it appears Mr Schmid was doing, does the kernel just ignore the larger value
beyond 800 MB?  I agree that by default the system should protect itself from OOM
attacks, but at the same time, if a user really wants to use up to 2 GB of RAM for
buffers, I don't think we should stop them.

In addition to Mr Hemminger's suggestion, I think the more important knob
would be /proc/sys/net/core/netdev_max_backlog, which bounds the total
number of received packets queued in the system, correct?  Is there a similar
knob for the total maximum number of buffers waiting in tx queues?
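
For reference, a trivial sketch that just reads the current value of that knob (nothing
here is specific to 2.6.10):

	#include <stdio.h>

	int main(void)
	{
		char buf[64];
		FILE *f = fopen("/proc/sys/net/core/netdev_max_backlog", "r");

		/* Upper bound on packets queued between the driver and the
		 * protocol stack before new ones are dropped -- separate
		 * from the per-socket buffer limits discussed above. */
		if (f && fgets(buf, sizeof(buf), f))
			printf("netdev_max_backlog: %s", buf);
		if (f)
			fclose(f);
		return 0;
	}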

Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

