From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Andrew Morton <akpm@osdl.org>
Cc: Daniel Phillips <phillips@google.com>,
David Miller <davem@davemloft.net>,
riel@redhat.com, tgraf@suug.ch, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
Mike Christie <michaelc@cs.wisc.edu>
Subject: Re: [RFC][PATCH 2/9] deadlock prevention core
Date: Mon, 14 Aug 2006 06:40:53 +0200 [thread overview]
Message-ID: <1155530453.5696.98.camel@twins> (raw)
In-Reply-To: <20060813185309.928472f9.akpm@osdl.org>
On Sun, 2006-08-13 at 18:53 -0700, Andrew Morton wrote:
> On Sun, 13 Aug 2006 18:31:14 -0700
> Daniel Phillips <phillips@google.com> wrote:
>
> > But to solve the whole problem
>
> What problem? Has anyone come up with a testcase which others can
> reproduce?
Problem:
Networked Block devices (NBD, iSCSI, AoE) can deadlock in the following
manner:
deplete normal memory because of memory pressure; deplete reserves by
writeout over network (pageout happens under PF_MEMALLOC), little to no
memory left for receiving those now crucial ACK packets.
A few packets could still fit in memory, but are quickly gobbled up by
non-crucial sockets and are left waiting on blocked user-space
processes. All memory is depleted and progress stalled forever.
(This affects swap and shared mmap)
Our Solution:
Mark some sockets with SOCK_MEMALLOC; which is essentially a promise to
never block. When under memory pressure only deliver packets to these
sockets, memory will still be used but never lost waiting on a blocked
user space process.
Also make sure the reserve is large enough so that writeout will never
be able to completely deplete it.
(It is here I still do not see Evgeniy's Network Tree Allocator work;
where is the guarantee that you do not end up with all memory lost
waiting on blocked sockets?)
Testcase:
Mount an NBD device as sole swap device and mmap > physical RAM, then
loop through touching pages only once.
My normal test setup is a p3-550 with 192M of ram with a 100Mbit card
and remote machine with a regular 7200 RPM pata drive.
I'm sure there is an iSCSI equivalent scenario, playing with iSCSI is
next on my list of things.
WARNING: multiple messages have this Message-ID (diff)
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Andrew Morton <akpm@osdl.org>
Cc: Daniel Phillips <phillips@google.com>,
David Miller <davem@davemloft.net>,
riel@redhat.com, tgraf@suug.ch, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
Mike Christie <michaelc@cs.wisc.edu>
Subject: Re: [RFC][PATCH 2/9] deadlock prevention core
Date: Mon, 14 Aug 2006 06:40:53 +0200 [thread overview]
Message-ID: <1155530453.5696.98.camel@twins> (raw)
In-Reply-To: <20060813185309.928472f9.akpm@osdl.org>
On Sun, 2006-08-13 at 18:53 -0700, Andrew Morton wrote:
> On Sun, 13 Aug 2006 18:31:14 -0700
> Daniel Phillips <phillips@google.com> wrote:
>
> > But to solve the whole problem
>
> What problem? Has anyone come up with a testcase which others can
> reproduce?
Problem:
Networked Block devices (NBD, iSCSI, AoE) can deadlock in the following
manner:
deplete normal memory because of memory pressure; deplete reserves by
writeout over network (pageout happens under PF_MEMALLOC), little to no
memory left for receiving those now crucial ACK packets.
A few packets could still fit in memory, but are quickly gobbled up by
non-crucial sockets and are left waiting on blocked user-space
processes. All memory is depleted and progress stalled forever.
(This affects swap and shared mmap)
Our Solution:
Mark some sockets with SOCK_MEMALLOC; which is essentially a promise to
never block. When under memory pressure only deliver packets to these
sockets, memory will still be used but never lost waiting on a blocked
user space process.
Also make sure the reserve is large enough so that writeout will never
be able to completely deplete it.
(It is here I still do not see Evgeniy's Network Tree Allocator work;
where is the guarantee that you do not end up with all memory lost
waiting on blocked sockets?)
Testcase:
Mount an NBD device as sole swap device and mmap > physical RAM, then
loop through touching pages only once.
My normal test setup is a p3-550 with 192M of ram with a 100Mbit card
and remote machine with a regular 7200 RPM pata drive.
I'm sure there is an iSCSI equivalent scenario, playing with iSCSI is
next on my list of things.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2006-08-14 4:42 UTC|newest]
Thread overview: 280+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-08-08 19:33 [RFC][PATCH 0/9] Network receive deadlock prevention for NBD Peter Zijlstra
2006-08-08 19:33 ` Peter Zijlstra
2006-08-08 19:33 ` [RFC][PATCH 1/9] pfn_to_kaddr() for UML Peter Zijlstra
2006-08-08 19:33 ` Peter Zijlstra
2006-08-08 19:33 ` [RFC][PATCH 2/9] deadlock prevention core Peter Zijlstra
2006-08-08 19:33 ` Peter Zijlstra
2006-08-08 20:57 ` Stephen Hemminger
2006-08-08 20:57 ` Stephen Hemminger
2006-08-08 21:05 ` Peter Zijlstra
2006-08-08 21:05 ` Peter Zijlstra
2006-08-09 1:33 ` Daniel Phillips
2006-08-09 1:33 ` Daniel Phillips
2006-08-09 1:38 ` David Miller
2006-08-09 1:38 ` David Miller, Daniel Phillips
2006-08-08 21:17 ` Thomas Graf
2006-08-08 21:17 ` Thomas Graf
2006-08-09 1:34 ` Daniel Phillips
2006-08-09 1:34 ` Daniel Phillips
2006-08-09 1:39 ` David Miller
2006-08-09 1:39 ` David Miller, Daniel Phillips
2006-08-09 5:47 ` Daniel Phillips
2006-08-09 5:47 ` Daniel Phillips
2006-08-09 13:19 ` Thomas Graf
2006-08-09 13:19 ` Thomas Graf
2006-08-09 14:07 ` Peter Zijlstra
2006-08-09 14:07 ` Peter Zijlstra
2006-08-09 16:18 ` Thomas Graf
2006-08-09 16:18 ` Thomas Graf
2006-08-09 16:19 ` Peter Zijlstra
2006-08-09 16:19 ` Peter Zijlstra
2006-08-10 0:01 ` David Miller
2006-08-10 0:01 ` David Miller, Peter Zijlstra
2006-08-09 23:58 ` David Miller
2006-08-09 23:58 ` David Miller, Peter Zijlstra
2006-08-10 6:25 ` Peter Zijlstra
2006-08-10 6:25 ` Peter Zijlstra
2006-08-11 4:24 ` Stephen Hemminger
2006-08-11 4:24 ` Stephen Hemminger
2006-08-13 21:22 ` Daniel Phillips
2006-08-13 21:22 ` Daniel Phillips
2006-08-13 23:49 ` David Miller
2006-08-13 23:49 ` David Miller, Daniel Phillips
2006-08-14 1:15 ` Daniel Phillips
2006-08-14 1:15 ` Daniel Phillips
2006-08-11 2:37 ` Rik van Riel
2006-08-11 2:37 ` Rik van Riel
2006-08-13 22:05 ` Daniel Phillips
2006-08-13 22:05 ` Daniel Phillips
2006-08-13 23:55 ` David Miller
2006-08-13 23:55 ` David Miller, Daniel Phillips
2006-08-14 1:31 ` Daniel Phillips
2006-08-14 1:31 ` Daniel Phillips
2006-08-14 1:53 ` Andrew Morton
2006-08-14 1:53 ` Andrew Morton
2006-08-14 4:40 ` Peter Zijlstra [this message]
2006-08-14 4:40 ` Peter Zijlstra
2006-08-14 4:58 ` Andrew Morton
2006-08-14 4:58 ` Andrew Morton
2006-08-14 5:03 ` Peter Zijlstra
2006-08-14 5:03 ` Peter Zijlstra
2006-08-14 5:22 ` Andrew Morton
2006-08-14 5:22 ` Andrew Morton
2006-08-14 6:45 ` Peter Zijlstra
2006-08-14 6:45 ` Peter Zijlstra
2006-08-14 7:07 ` Andrew Morton
2006-08-14 7:07 ` Andrew Morton
2006-08-14 8:15 ` Peter Zijlstra
2006-08-14 8:15 ` Peter Zijlstra
2006-08-14 8:25 ` Evgeniy Polyakov
2006-08-14 8:25 ` Evgeniy Polyakov
2006-08-14 8:35 ` Peter Zijlstra
2006-08-14 8:35 ` Peter Zijlstra
2006-08-14 8:33 ` David Miller
2006-08-14 8:33 ` David Miller, Andrew Morton
2006-08-17 4:27 ` Daniel Phillips
2006-08-17 4:27 ` Daniel Phillips
2006-08-14 7:17 ` Neil Brown
2006-08-14 7:17 ` Neil Brown
2006-08-14 7:31 ` Evgeniy Polyakov
2006-08-14 7:31 ` Evgeniy Polyakov
2006-08-17 3:58 ` Daniel Phillips
2006-08-17 3:58 ` Daniel Phillips
2006-08-17 5:57 ` Andrew Morton
2006-08-17 5:57 ` Andrew Morton
2006-08-17 23:53 ` Daniel Phillips
2006-08-17 23:53 ` Daniel Phillips
2006-08-18 0:24 ` Rik van Riel
2006-08-18 0:24 ` Rik van Riel
2006-08-18 0:35 ` Daniel Phillips
2006-08-18 0:35 ` Daniel Phillips
2006-08-18 1:14 ` Neil Brown
2006-08-18 1:14 ` Neil Brown
2006-08-18 6:05 ` Andrew Morton
2006-08-18 6:05 ` Andrew Morton
2006-08-18 21:22 ` Daniel Phillips
2006-08-18 21:22 ` Daniel Phillips
2006-08-18 22:34 ` Andrew Morton
2006-08-18 22:34 ` Andrew Morton
2006-08-18 23:44 ` Daniel Phillips
2006-08-18 23:44 ` Daniel Phillips
2006-08-19 2:44 ` Andrew Morton
2006-08-19 2:44 ` Andrew Morton
2006-08-19 4:14 ` Network receive stall avoidance (was [PATCH 2/9] deadlock prevention core) Daniel Phillips
2006-08-19 4:14 ` Daniel Phillips
2006-08-19 7:28 ` Andrew Morton
2006-08-19 7:28 ` Andrew Morton
2006-08-19 15:06 ` [RFC][PATCH 2/9] deadlock prevention core Rik van Riel
2006-08-19 15:06 ` Rik van Riel
2006-08-20 1:33 ` Andre Tomt
2006-08-20 1:33 ` Andre Tomt
2006-08-19 16:53 ` Ray Lee
2006-08-19 16:53 ` Ray Lee
2006-08-21 13:27 ` Philip R. Auld
2006-08-21 13:27 ` Philip R. Auld
2006-08-25 10:47 ` Pavel Machek
2006-08-25 10:47 ` Pavel Machek
2006-08-21 13:38 ` Jens Axboe
2006-08-21 13:38 ` Jens Axboe
2006-08-08 22:10 ` David Miller
2006-08-08 22:10 ` David Miller
2006-08-09 1:35 ` Daniel Phillips
2006-08-09 1:35 ` Daniel Phillips
2006-08-09 1:41 ` David Miller
2006-08-09 1:41 ` David Miller, Daniel Phillips
2006-08-09 5:44 ` Daniel Phillips
2006-08-09 5:44 ` Daniel Phillips
2006-08-09 7:00 ` Peter Zijlstra
2006-08-09 7:00 ` Peter Zijlstra
[not found] ` <42414.81.207.0.53.1155080443.squirrel@81.207.0.53>
2006-08-09 0:25 ` Daniel Phillips
2006-08-09 0:25 ` Daniel Phillips
2006-08-09 12:02 ` Indan Zupancic
2006-08-09 12:02 ` Indan Zupancic
2006-08-09 12:54 ` Peter Zijlstra
2006-08-09 12:54 ` Peter Zijlstra
2006-08-09 13:48 ` Indan Zupancic
2006-08-09 13:48 ` Indan Zupancic
2006-08-09 14:00 ` Peter Zijlstra
2006-08-09 14:00 ` Peter Zijlstra
2006-08-09 18:34 ` Indan Zupancic
2006-08-09 18:34 ` Indan Zupancic
2006-08-09 19:45 ` Peter Zijlstra
2006-08-09 19:45 ` Peter Zijlstra
2006-08-09 20:19 ` Peter Zijlstra
2006-08-09 20:19 ` Peter Zijlstra
2006-08-10 1:21 ` Indan Zupancic
2006-08-10 1:21 ` Indan Zupancic
2006-08-09 16:05 ` -v2 " Peter Zijlstra
2006-08-09 16:05 ` Peter Zijlstra
2006-08-08 19:33 ` [RFC][PATCH 3/9] e1000 driver conversion Peter Zijlstra
2006-08-08 19:33 ` Peter Zijlstra
2006-08-08 20:50 ` Auke Kok
2006-08-08 20:50 ` Auke Kok
2006-08-08 20:59 ` Peter Zijlstra
2006-08-08 20:59 ` Peter Zijlstra
2006-08-08 22:32 ` David Miller
2006-08-08 22:32 ` David Miller, Auke Kok
2006-08-08 22:42 ` Auke Kok
2006-08-08 22:42 ` Auke Kok
2006-08-08 19:34 ` [RFC][PATCH 4/9] e100 " Peter Zijlstra
2006-08-08 19:34 ` Peter Zijlstra
2006-08-08 20:13 ` Auke Kok
2006-08-08 20:13 ` Auke Kok
2006-08-08 20:18 ` Peter Zijlstra
2006-08-08 20:18 ` Peter Zijlstra
2006-08-08 19:34 ` [RFC][PATCH 5/9] r8169 " Peter Zijlstra
2006-08-08 19:34 ` Peter Zijlstra
2006-08-08 19:34 ` [RFC][PATCH 6/9] tg3 " Peter Zijlstra
2006-08-08 19:34 ` Peter Zijlstra
2006-08-08 19:34 ` [RFC][PATCH 7/9] UML eth " Peter Zijlstra
2006-08-08 19:34 ` Peter Zijlstra
2006-08-08 19:34 ` [RFC][PATCH 8/9] 3c59x " Peter Zijlstra
2006-08-08 19:34 ` Peter Zijlstra
2006-08-08 23:07 ` Jeff Garzik
2006-08-08 23:07 ` Jeff Garzik
2006-08-09 5:51 ` Daniel Phillips
2006-08-09 5:51 ` Daniel Phillips
2006-08-09 5:55 ` David Miller
2006-08-09 5:55 ` David Miller, Daniel Phillips
2006-08-09 6:30 ` Jeff Garzik
2006-08-09 6:30 ` Jeff Garzik
2006-08-09 7:03 ` Peter Zijlstra
2006-08-09 7:03 ` Peter Zijlstra
2006-08-09 7:20 ` Jeff Garzik
2006-08-09 7:20 ` Jeff Garzik
2006-08-13 19:38 ` Daniel Phillips
2006-08-13 19:38 ` Daniel Phillips
2006-08-13 19:53 ` Jeff Garzik
2006-08-13 19:53 ` Jeff Garzik
2006-08-08 19:34 ` [RFC][PATCH 9/9] deadlock prevention for NBD Peter Zijlstra
2006-08-08 19:34 ` Peter Zijlstra
2006-08-09 5:46 ` [RFC][PATCH 0/9] Network receive " Evgeniy Polyakov
2006-08-09 5:46 ` Evgeniy Polyakov
2006-08-09 5:52 ` Daniel Phillips
2006-08-09 5:52 ` Daniel Phillips
2006-08-09 5:56 ` David Miller
2006-08-09 5:56 ` David Miller, Daniel Phillips
2006-08-09 5:53 ` David Miller
2006-08-09 5:53 ` David Miller, Evgeniy Polyakov
2006-08-09 5:55 ` Evgeniy Polyakov
2006-08-09 5:55 ` Evgeniy Polyakov
2006-08-09 12:37 ` Peter Zijlstra
2006-08-09 12:37 ` Peter Zijlstra
2006-08-09 13:07 ` Evgeniy Polyakov
2006-08-09 13:07 ` Evgeniy Polyakov
2006-08-09 13:32 ` Peter Zijlstra
2006-08-09 13:32 ` Peter Zijlstra
2006-08-09 19:29 ` Evgeniy Polyakov
2006-08-09 19:29 ` Evgeniy Polyakov
2006-08-09 23:54 ` David Miller
2006-08-09 23:54 ` David Miller, Peter Zijlstra
2006-08-10 6:06 ` Peter Zijlstra
2006-08-10 6:06 ` Peter Zijlstra
2006-08-13 20:16 ` Daniel Phillips
2006-08-13 20:16 ` Daniel Phillips
2006-08-14 5:13 ` Evgeniy Polyakov
2006-08-14 5:13 ` Evgeniy Polyakov
2006-08-14 6:45 ` Peter Zijlstra
2006-08-14 6:45 ` Peter Zijlstra
2006-08-14 6:54 ` Evgeniy Polyakov
2006-08-14 6:54 ` Evgeniy Polyakov
2006-08-17 4:49 ` Daniel Phillips
2006-08-17 4:49 ` Daniel Phillips
2006-08-17 4:48 ` Daniel Phillips
2006-08-17 4:48 ` Daniel Phillips
2006-08-17 5:36 ` Evgeniy Polyakov
2006-08-17 5:36 ` Evgeniy Polyakov
2006-08-17 18:01 ` Daniel Phillips
2006-08-17 18:01 ` Daniel Phillips
2006-08-17 18:42 ` Evgeniy Polyakov
2006-08-17 18:42 ` Evgeniy Polyakov
2006-08-17 19:15 ` Peter Zijlstra
2006-08-17 19:15 ` Peter Zijlstra
2006-08-17 19:48 ` Evgeniy Polyakov
2006-08-17 19:48 ` Evgeniy Polyakov
2006-08-17 23:24 ` Daniel Phillips
2006-08-17 23:24 ` Daniel Phillips
2006-08-18 7:16 ` Evgeniy Polyakov
2006-08-18 7:16 ` Evgeniy Polyakov
2006-08-12 3:42 ` Rik van Riel
2006-08-12 3:42 ` Rik van Riel
2006-08-12 8:47 ` Evgeniy Polyakov
2006-08-12 8:47 ` Evgeniy Polyakov
2006-08-12 9:19 ` Peter Zijlstra
2006-08-12 9:19 ` Peter Zijlstra
2006-08-12 9:37 ` Evgeniy Polyakov
2006-08-12 9:37 ` Evgeniy Polyakov
2006-08-12 10:18 ` Peter Zijlstra
2006-08-12 10:18 ` Peter Zijlstra
2006-08-12 10:42 ` Evgeniy Polyakov
2006-08-12 10:42 ` Evgeniy Polyakov
2006-08-12 10:51 ` Evgeniy Polyakov
2006-08-12 10:51 ` Evgeniy Polyakov
2006-08-12 11:40 ` Peter Zijlstra
2006-08-12 11:40 ` Peter Zijlstra
2006-08-12 11:53 ` Evgeniy Polyakov
2006-08-12 11:53 ` Evgeniy Polyakov
2006-08-13 0:46 ` David Miller
2006-08-13 0:46 ` David Miller, Peter Zijlstra
2006-08-13 1:11 ` Rik van Riel
2006-08-13 1:11 ` Rik van Riel
2006-08-12 14:40 ` Rik van Riel
2006-08-12 14:40 ` Rik van Riel
2006-08-12 14:49 ` Evgeniy Polyakov
2006-08-12 14:49 ` Evgeniy Polyakov
2006-08-12 14:56 ` Rik van Riel
2006-08-12 14:56 ` Rik van Riel
2006-08-12 15:08 ` Evgeniy Polyakov
2006-08-12 15:08 ` Evgeniy Polyakov
2006-08-12 15:22 ` Peter Zijlstra
2006-08-12 15:22 ` Peter Zijlstra
2006-08-14 0:56 ` Daniel Phillips
2006-08-14 0:56 ` Daniel Phillips
2006-08-13 0:46 ` David Miller
2006-08-13 0:46 ` David Miller, Evgeniy Polyakov
2006-08-13 9:06 ` Evgeniy Polyakov
2006-08-13 9:06 ` Evgeniy Polyakov
2006-08-13 9:52 ` Evgeniy Polyakov
2006-08-13 9:52 ` Evgeniy Polyakov
2006-08-15 19:17 ` Pavel Machek
2006-08-15 19:17 ` Pavel Machek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1155530453.5696.98.camel@twins \
--to=a.p.zijlstra@chello.nl \
--cc=akpm@osdl.org \
--cc=davem@davemloft.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=michaelc@cs.wisc.edu \
--cc=netdev@vger.kernel.org \
--cc=phillips@google.com \
--cc=riel@redhat.com \
--cc=tgraf@suug.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.