From: Andrew Morton <akpm@linux-foundation.org>
To: Nick Piggin <npiggin@suse.de>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux Filesystems <linux-fsdevel@vger.kernel.org>,
	Linux Memory Management <linux-mm@kvack.org>
Subject: Re: [patch 9/9] mm: fix pagecache write deadlocks
Date: Sun, 4 Feb 2007 02:30:55 -0800
Message-ID: <20070204023055.2583fd65.akpm@linux-foundation.org>
In-Reply-To: <20070204101529.GA22004@wotan.suse.de>

On Sun, 4 Feb 2007 11:15:29 +0100 Nick Piggin <npiggin@suse.de> wrote:

> On Sun, Feb 04, 2007 at 01:44:45AM -0800, Andrew Morton wrote:
> > On Sun,  4 Feb 2007 09:51:07 +0100 (CET) Nick Piggin <npiggin@suse.de> wrote:
> > 
> > > 2.  If we find the destination page is non uptodate, unlock it (this could be
> > >     made slightly more optimal), then find and pin the source page with
> > >     get_user_pages. Relock the destination page and continue with the copy.
> > >     However, instead of a usercopy (which might take a fault), copy the data
> > >     via the kernel address space.
> > 
> > argh.  We just can't go adding all this gunk into the write() path. 
> > 
> > mmap_sem, a full pte-walk, taking of pte-page locks, etc.  For every page. 
> > Even single-process write() will suffer, let alone multithreaded stuff,
> > where mmap_sem contention may be the bigger problem.
> 
> The write path is broken. I prefer my kernels slow, than buggy.

That won't fly.

> > There's a build error in filemap_xip.c btw.

?

> > 
> > We need to think different.
> > 
> > What happened to the idea of doing an atomic copy into the non-uptodate
> > page and handling it somehow?
> 
> That was my second idea.

Coulda sworn it was mine ;) I thought you ended up deciding it wasn't
practical because of the games we needed to play with ->commit_write.

> I didn't get any feedback on that patchset
> except to try this method, so I assume everyone hated it.
> 
> I actually liked it, because it didn't have to do the writev
> segment-at-a-time for !uptodate pages like this one does. Considering
> this code gets called from mm-less contexts, maybe I'll have to go back
> to this approach.

OK.

> > Another option might be to effectively pin the whole mm during the copy:
> > 
> > 	down_read(&current->mm->unpaging_lock);
> > 	get_user(addr);		/* Fault the page in */
> > 	...
> > 	copy_from_user()
> > 	up_read(&current->mm->unpaging_lock);
> > 
> > then, anyone who wants to unmap pages from this mm requires
> > write_lock(unpaging_lock).  So we know the results of that get_user()
> > cannot be undone.
> 
> Fugly.

I invited you to think different - don't just fixate on one random
tossed-out-there suggestion.

> but you introduce the theoretical memory deadlock
> where a task cannot reclaim its own memory.

Nah, that'll never happen - both pages are already allocated.

It's better than taking mmap_sem and walking pagetables...
