* [PATCH] reserved buffers only for PF_MEMALLOC
@ 2004-08-10 17:20 Rik van Riel
2004-08-10 19:04 ` Marcelo Tosatti
2004-08-10 20:16 ` Marcelo Tosatti
0 siblings, 2 replies; 7+ messages in thread
From: Rik van Riel @ 2004-08-10 17:20 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: linux-kernel, Matt Domsch, Ernie Petrides
The buffer allocation path in 2.4 has a long standing bug,
where non-PF_MEMALLOC tasks can dig into the reserved pool
in get_unused_buffer_head(). The following patch makes the
reserved pool only accessible to PF_MEMALLOC tasks.
Other processes will loop in create_buffers() - the only
function that calls get_unused_buffer_head() - and will call
try_to_free_pages(GFP_NOIO), freeing any buffer heads that
have become freeable due to IO completion.
Note that PF_MEMALLOC tasks will NOT do anything inside
try_to_free_pages(), so it is needed that they are able to
dig into the reserved buffer heads while other tasks are
not.
Signed-off-by: Rik van Riel <riel@redhat.com>
--- linux/fs/buffer.c.deadlock 2004-08-10 11:33:08.000000000 -0400
+++ linux/fs/buffer.c 2004-08-10 11:34:54.000000000 -0400
@@ -1260,8 +1260,9 @@ struct buffer_head * get_unused_buffer_h
/*
* If we need an async buffer, use the reserved buffer heads.
+ * Non-PF_MEMALLOC tasks can just loop in create_buffers().
*/
- if (async) {
+ if (async && (current->flags & PF_MEMALLOC)) {
spin_lock(&unused_list_lock);
if (unused_list) {
bh = unused_list;
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] reserved buffers only for PF_MEMALLOC
2004-08-10 17:20 [PATCH] reserved buffers only for PF_MEMALLOC Rik van Riel
@ 2004-08-10 19:04 ` Marcelo Tosatti
2004-08-10 20:42 ` Matt Domsch
2004-08-10 20:16 ` Marcelo Tosatti
1 sibling, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2004-08-10 19:04 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-kernel, Matt Domsch, Ernie Petrides
On Tue, Aug 10, 2004 at 01:20:24PM -0400, Rik van Riel wrote:
>
> The buffer allocation path in 2.4 has a long standing bug,
> where non-PF_MEMALLOC tasks can dig into the reserved pool
> in get_unused_buffer_head(). The following patch makes the
> reserved pool only accessible to PF_MEMALLOC tasks.
>
> Other processes will loop in create_buffers() - the only
> function that calls get_unused_buffer_head() - and will call
> try_to_free_pages(GFP_NOIO), freeing any buffer heads that
> have become freeable due to IO completion.
>
> Note that PF_MEMALLOC tasks will NOT do anything inside
> try_to_free_pages(), so it is needed that they are able to
> dig into the reserved buffer heads while other tasks are
> not.
Sounds the correct thing to do, thanks.
Out of curiosity: Do you actually seen any practical problem due to
get_unused_buffer_head() calls eating into the reserved pool?
Or have any testcase which would trigger a problem (OOM) due to it?
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
>
> --- linux/fs/buffer.c.deadlock 2004-08-10 11:33:08.000000000 -0400
> +++ linux/fs/buffer.c 2004-08-10 11:34:54.000000000 -0400
> @@ -1260,8 +1260,9 @@ struct buffer_head * get_unused_buffer_h
>
> /*
> * If we need an async buffer, use the reserved buffer heads.
> + * Non-PF_MEMALLOC tasks can just loop in create_buffers().
> */
> - if (async) {
> + if (async && (current->flags & PF_MEMALLOC)) {
> spin_lock(&unused_list_lock);
> if (unused_list) {
> bh = unused_list;
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] reserved buffers only for PF_MEMALLOC
2004-08-10 19:04 ` Marcelo Tosatti
@ 2004-08-10 20:42 ` Matt Domsch
2004-08-10 19:50 ` Marcelo Tosatti
0 siblings, 1 reply; 7+ messages in thread
From: Matt Domsch @ 2004-08-10 20:42 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Rik van Riel, linux-kernel, Ernie Petrides
On Tue, Aug 10, 2004 at 04:04:55PM -0300, Marcelo Tosatti wrote:
> On Tue, Aug 10, 2004 at 01:20:24PM -0400, Rik van Riel wrote:
> >
> > The buffer allocation path in 2.4 has a long standing bug,
> > where non-PF_MEMALLOC tasks can dig into the reserved pool
> > in get_unused_buffer_head(). The following patch makes the
> > reserved pool only accessible to PF_MEMALLOC tasks.
>
> Out of curiosity: Do you actually seen any practical problem due to
> get_unused_buffer_head() calls eating into the reserved pool?
>
> Or have any testcase which would trigger a problem (OOM) due to it?
My team has seen an application which mallocs as much memory as
possible, up to 95% of system RAM, using multiple processes as
necessary, and then has threads which touch all the malloc'd bytes and
threads which touch all the malloc'd pages. It keeps kswapd pretty
busy, such that you can get down to zero free and inactive clean
ZONE_NORMAL pages from which to allocate additional buffer_heads for
swapout, deadlocking the system.
We believe that by limiting the use of the reserved buffer_head pool
to PF_MEMALLOC tasks like kswapd, kswapd can make forward progress
even in extremely low memory situations.
Thanks,
Matt
--
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] reserved buffers only for PF_MEMALLOC
2004-08-10 20:42 ` Matt Domsch
@ 2004-08-10 19:50 ` Marcelo Tosatti
2004-08-10 20:59 ` Rik van Riel
0 siblings, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2004-08-10 19:50 UTC (permalink / raw)
To: Matt Domsch; +Cc: Rik van Riel, linux-kernel, Ernie Petrides
On Tue, Aug 10, 2004 at 03:42:32PM -0500, Matt Domsch wrote:
> On Tue, Aug 10, 2004 at 04:04:55PM -0300, Marcelo Tosatti wrote:
> > On Tue, Aug 10, 2004 at 01:20:24PM -0400, Rik van Riel wrote:
> > >
> > > The buffer allocation path in 2.4 has a long standing bug,
> > > where non-PF_MEMALLOC tasks can dig into the reserved pool
> > > in get_unused_buffer_head(). The following patch makes the
> > > reserved pool only accessible to PF_MEMALLOC tasks.
> >
> > Out of curiosity: Do you actually seen any practical problem due to
> > get_unused_buffer_head() calls eating into the reserved pool?
> >
> > Or have any testcase which would trigger a problem (OOM) due to it?
>
> My team has seen an application which mallocs as much memory as
> possible, up to 95% of system RAM, using multiple processes as
> necessary, and then has threads which touch all the malloc'd bytes and
> threads which touch all the malloc'd pages. It keeps kswapd pretty
> busy, such that you can get down to zero free and inactive clean
> ZONE_NORMAL pages from which to allocate additional buffer_heads for
> swapout, deadlocking the system.
>
> We believe that by limiting the use of the reserved buffer_head pool
> to PF_MEMALLOC tasks like kswapd, kswapd can make forward progress
> even in extremely low memory situations.
OK, makes sense. I assume Rik's patch fixes the deadlock you are seeing?
Have you tested it?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] reserved buffers only for PF_MEMALLOC
2004-08-10 19:50 ` Marcelo Tosatti
@ 2004-08-10 20:59 ` Rik van Riel
0 siblings, 0 replies; 7+ messages in thread
From: Rik van Riel @ 2004-08-10 20:59 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Matt Domsch, linux-kernel, Ernie Petrides
On Tue, 10 Aug 2004, Marcelo Tosatti wrote:
> OK, makes sense. I assume Rik's patch fixes the deadlock you are seeing?
>
> Have you tested it?
It's being tested on RHEL3, where we also ran into another
(trivial) bug that blocks this issue. It's just one of a
few trivial bugs we ran into and I believe this one needs
fixing ;)
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] reserved buffers only for PF_MEMALLOC
2004-08-10 17:20 [PATCH] reserved buffers only for PF_MEMALLOC Rik van Riel
2004-08-10 19:04 ` Marcelo Tosatti
@ 2004-08-10 20:16 ` Marcelo Tosatti
2004-10-20 14:02 ` Matt Domsch
1 sibling, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2004-08-10 20:16 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-kernel, Matt Domsch, Ernie Petrides
On Tue, Aug 10, 2004 at 01:20:24PM -0400, Rik van Riel wrote:
>
> The buffer allocation path in 2.4 has a long standing bug,
> where non-PF_MEMALLOC tasks can dig into the reserved pool
> in get_unused_buffer_head(). The following patch makes the
> reserved pool only accessible to PF_MEMALLOC tasks.
>
> Other processes will loop in create_buffers() - the only
> function that calls get_unused_buffer_head() - and will call
> try_to_free_pages(GFP_NOIO), freeing any buffer heads that
> have become freeable due to IO completion.
>
> Note that PF_MEMALLOC tasks will NOT do anything inside
> try_to_free_pages(), so it is needed that they are able to
> dig into the reserved buffer heads while other tasks are
> not.
Applied.
Matt, it would be really nice if you could run the workload
which triggers the deadlock on 2.4.27+Rik's patch.
Thanks guys!
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] reserved buffers only for PF_MEMALLOC
2004-08-10 20:16 ` Marcelo Tosatti
@ 2004-10-20 14:02 ` Matt Domsch
0 siblings, 0 replies; 7+ messages in thread
From: Matt Domsch @ 2004-10-20 14:02 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Rik van Riel, linux-kernel, Ernie Petrides
On Tue, Aug 10, 2004 at 05:16:52PM -0300, Marcelo Tosatti wrote:
> On Tue, Aug 10, 2004 at 01:20:24PM -0400, Rik van Riel wrote:
> >
> > The buffer allocation path in 2.4 has a long standing bug,
> > where non-PF_MEMALLOC tasks can dig into the reserved pool
> > in get_unused_buffer_head(). The following patch makes the
> > reserved pool only accessible to PF_MEMALLOC tasks.
> >
> > Other processes will loop in create_buffers() - the only
> > function that calls get_unused_buffer_head() - and will call
> > try_to_free_pages(GFP_NOIO), freeing any buffer heads that
> > have become freeable due to IO completion.
> >
> > Note that PF_MEMALLOC tasks will NOT do anything inside
> > try_to_free_pages(), so it is needed that they are able to
> > dig into the reserved buffer heads while other tasks are
> > not.
>
> Applied.
>
> Matt, it would be really nice if you could run the workload
> which triggers the deadlock on 2.4.27+Rik's patch.
Sorry for the long delay. kernel 2.4.28-pre3 has been running under
load for 12 days, where otherwise we would have expected a failure in
less than 1 day. I'm satisfied this is the right thing to have done.
Thanks,
Matt
--
Matt Domsch
Sr. Software Engineer, Lead Engineer
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2004-10-20 14:25 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-08-10 17:20 [PATCH] reserved buffers only for PF_MEMALLOC Rik van Riel
2004-08-10 19:04 ` Marcelo Tosatti
2004-08-10 20:42 ` Matt Domsch
2004-08-10 19:50 ` Marcelo Tosatti
2004-08-10 20:59 ` Rik van Riel
2004-08-10 20:16 ` Marcelo Tosatti
2004-10-20 14:02 ` Matt Domsch
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox