From: Andrea Arcangeli <andrea@suse.de>
To: Andrew Morton <akpm@zip.com.au>
Cc: Marcelo Tosatti <marcelo@conectiva.com.br>,
Momchil Velikov <velco@fadata.bg>,
lkml <linux-kernel@vger.kernel.org>
Subject: Re: Copying to loop device hangs up everything
Date: Wed, 19 Dec 2001 14:42:13 +0100 [thread overview]
Message-ID: <20011219144213.A1395@athlon.random> (raw)
In-Reply-To: <87bsgwi6zz.fsf@fadata.bg> <Pine.LNX.4.21.0112181757460.4821-100000@freak.distro.conectiva> <3C1FC254.525B9108@zip.com.au> <3C1FCB96.83E49ECB@zip.com.au> <3C204C4F.C989AD71@zip.com.au>
In-Reply-To: <3C204C4F.C989AD71@zip.com.au>; from akpm@zip.com.au on Wed, Dec 19, 2001 at 12:14:07AM -0800
On Wed, Dec 19, 2001 at 12:14:07AM -0800, Andrew Morton wrote:
> Andrew Morton wrote:
> >
> > Andrew Morton wrote:
> > >
> > > I want to know how the loop thread ever hit sync_page_buffers.
> >
> > __block_prepare_write
> > ->get_unused_buffer_head
> > ->kmem_cache_alloc(SLAB_NOFS)
> >
> > Shouldn't we be using the address space's GFP flags for bufferhead
> > allocation, rather than cooking up a new one?
> >
>
> Um. That won't work. There are in fact many ways in which loopback
> can deadlock, and propagating gfp flags through all the fs code
> paths won't cut it.
>
> Here's one such deadlock:
>
> __wait_on_buffer
> sync_page_buffers
> try_to_free_buffers
> try_to_release_page
> shrink_cache
> shrink_caches
> try_to_free_pages
> balance_classzone
> __alloc_pages
> _alloc_pages
> find_or_create_page
> grow_dev_page
> grow_buffers
> getblk
> bread
> ext2_get_branch
> ext2_get_block
> __block_prepare_write
> block_prepare_write
> ext2_prepare_write
> lo_send
> do_bh_filebacked
> loop_thread
> kernel_thread
>
> I was able to get a multithread deadlock where the loop thread was waiting
> on an ext2 buffer which was sitting in the loop thread's input queue,
> waiting to be written by the loop thread. Ugly.
>
> The thing I don't like about the Andrea+Momchil approach is that it
> exposes the risk of flooding the machine with dirty data. A scheme
It doesn't: balance_dirty() has to work only at the high level.
sync_page_buffers is also no problem; we'll simply retry later in those
GFP_NOIO allocations. Furthermore, you don't even address the writepage
from the loop thread on the loop queue.

The final fix should be in rc2aa1, which I will release in a jiffy. It
now takes care of both the VM and balance_dirty().

This is the incremental fix against rc1aa1:
diff -urN loop-ref/fs/buffer.c loop/fs/buffer.c
--- loop-ref/fs/buffer.c Wed Dec 19 04:17:30 2001
+++ loop/fs/buffer.c Wed Dec 19 03:43:24 2001
@@ -2547,6 +2547,7 @@
 	/* Uhhuh, start writeback so that we don't end up with all dirty pages */
 	write_unlock(&hash_table_lock);
 	spin_unlock(&lru_list_lock);
+	gfp_mask = pf_gfp_mask(gfp_mask);
 	if (gfp_mask & __GFP_IO && !(current->flags & PF_ATOMICALLOC)) {
 		if ((gfp_mask & __GFP_HIGHIO) || !PageHighMem(page)) {
 			if (sync_page_buffers(bh)) {
diff -urN loop-ref/include/linux/mm.h loop/include/linux/mm.h
--- loop-ref/include/linux/mm.h Wed Dec 19 04:17:30 2001
+++ loop/include/linux/mm.h Wed Dec 19 04:15:52 2001
@@ -562,6 +562,15 @@
 #define GFP_DMA		__GFP_DMA
 
+static inline unsigned int pf_gfp_mask(unsigned int gfp_mask)
+{
+	/* avoid all memory balancing I/O methods if this task cannot block on I/O */
+	if (current->flags & PF_NOIO)
+		gfp_mask &= ~(__GFP_IO | __GFP_HIGHIO | __GFP_FS);
+
+	return gfp_mask;
+}
+
 extern int heap_stack_gap;
 
 /*
diff -urN loop-ref/mm/vmscan.c loop/mm/vmscan.c
--- loop-ref/mm/vmscan.c Wed Dec 19 04:17:30 2001
+++ loop/mm/vmscan.c Wed Dec 19 03:43:24 2001
@@ -611,6 +611,8 @@
 int try_to_free_pages(zone_t *classzone, unsigned int gfp_mask, unsigned int order)
 {
+	gfp_mask = pf_gfp_mask(gfp_mask);
+
 	for (;;) {
 		int tries = vm_scan_ratio << 2;
 		int failed_swapout = !(gfp_mask & __GFP_IO);
try_to_free_pages needs the explicit wrapping because it can be called
from places other than the VM.
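The intended behaviour of the helper can be sketched in userspace C.
Note this is only an illustration: the flag values below are
placeholders, not the real kernel constants, and current->flags is
mocked by a plain variable. A task flagged PF_NOIO (e.g. the loop
thread) gets every gfp bit stripped that could trigger memory-balancing
I/O, so reclaim done on its behalf can never re-enter the loop device
and deadlock:

```c
#include <assert.h>

/* Illustrative placeholder values -- the real kernel constants differ. */
#define __GFP_IO	0x40u
#define __GFP_HIGHIO	0x80u
#define __GFP_FS	0x100u
#define PF_NOIO		0x2000u

/* Stand-in for current->flags in this userspace sketch. */
static unsigned int current_flags;

/* Mirrors the patch's pf_gfp_mask(): if the calling task cannot block
 * on I/O, clear __GFP_IO, __GFP_HIGHIO and __GFP_FS from its allocation
 * mask; all other bits pass through untouched. */
static unsigned int pf_gfp_mask(unsigned int gfp_mask)
{
	if (current_flags & PF_NOIO)
		gfp_mask &= ~(__GFP_IO | __GFP_HIGHIO | __GFP_FS);

	return gfp_mask;
}
```

For a normal task the mask is returned unchanged; only a PF_NOIO task
sees the I/O-capable bits removed, which is why try_to_free_pages can
apply the wrapping unconditionally on entry.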
Andrea