From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Rik van Riel <riel@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, Daniel Phillips <phillips@google.com>
Subject: Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD
Date: Sat, 12 Aug 2006 12:18:07 +0200
Message-ID: <1155377887.13508.27.camel@lappy>
In-Reply-To: <20060812093706.GA13554@2ka.mipt.ru>

On Sat, 2006-08-12 at 13:37 +0400, Evgeniy Polyakov wrote:
> On Sat, Aug 12, 2006 at 11:19:49AM +0200, Peter Zijlstra (a.p.zijlstra@chello.nl) wrote:
> > > As you described above, memory for each packet must be allocated (either
> > > from SLAB or from reserve), so network needs special allocator in OOM
> > > condition, and that allocator should be separated from SLAB's one which 
> > > got OOM, so my purpose is just to use that different allocator (with
> > > additional features) for networking always. Since every piece of
> > > networking is limited (socket queues, socket numbers, hardware queues,
> > > hardware wire speeds and so on) there is always a maximum amount of
> > > memory it can consume and can never exceed, so if network allocator will
> > > get that amount of memory at the beginning, it will never meet OOM,
> > > so it will _always_ work and thus can allow to make slow progress for 
> > > OOM-capable things like block devices and swap issues. 
> > > There are no special reserve and no need to switch to/from it and 
> > > no possibility to have OOM by design.
> > 
> > I'm not sure the network stack is bounded as you say; for instance,
> > imagine receiving a lot of packets for blocked user-space processes:
> > they will just accumulate in the network stack and go nowhere. In that
> > case memory usage is very much unbounded.
> 
> No it is not. There are socket queues and they are limited. Things like
> TCP behave even better.
> 
> > Even if blocked sockets would only accept a limited number of packets,
> > the total would then become a function of the number of open sockets,
> > which is again unbounded.
> 
> Does it? I thought it is possible to have only 64k working sockets per
> device in TCP.

65535 sockets * 128 packets * 16384 bytes/packet ~=
2^16 * 2^7 * 2^14 = 2^(16+7+14) = 2^37 = 128G of memory per IP

And systems with a lot of IP numbers are not unthinkable.

I wonder what kind of system you have to feel that that is not a
problem. (I'm not sure about the 128 packets per socket, and the 16k per
packet assumes jumbo frames without scatter-gather receive.)
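
For anyone who wants to redo the estimate above with other numbers, here is
a minimal sketch in Python. The function name and its defaults are
hypothetical; they only encode the assumptions stated in this mail (about
64k sockets per IP, 128 queued packets per socket, 16k per packet) and are
not measured kernel limits.

```python
def worst_case_rx_bytes(sockets=2**16, packets_per_socket=2**7,
                        bytes_per_packet=2**14):
    """Upper bound on queued receive memory for a single IP address.

    All three defaults are the rough guesses from the mail, rounded to
    powers of two so the exponents can simply be added.
    """
    return sockets * packets_per_socket * bytes_per_packet


total = worst_case_rx_bytes()
print(total == 2**37)          # True: 16 + 7 + 14 = 37
print(total // 2**30, "GiB")   # 128 GiB
```

Plugging in a smaller per-packet size (for example 2**11 for standard
frames) shows how strongly the bound depends on the packet-size guess.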

> If system is limited enough to provide enough memory for network tree
> allocator, it is possible to create its own drop condition inside NTA,
> but it must be separated from the weakest chain element in those
> conditions - SLAB OOM.

Hence the alternative allocator to use on tight memory conditions.

