* [PATCH 0/2] Make task doing heavy writing killable @ 2011-11-14 11:10 Jan Kara 2011-11-14 11:10 ` [PATCH 1/2] mm: Make task in balance_dirty_pages() killable Jan Kara ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: Jan Kara @ 2011-11-14 11:10 UTC (permalink / raw) To: linux-fsdevel Cc: Wu Fengguang, Al Viro, k-mio, Andrew Morton, Christoph Hellwig Hello, these two patches aim at making task waiting in balance_dirty_pages() killable. This is desirable because otherwise if filesystem stops accepting writes (e.g. if device has been removed or other serious error condidion) we have a task stuck in D state forever. I'm not sure who should merge these two patches... Al, Fengguang? Honza ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 1/2] mm: Make task in balance_dirty_pages() killable 2011-11-14 11:10 [PATCH 0/2] Make task doing heavy writing killable Jan Kara @ 2011-11-14 11:10 ` Jan Kara 2011-11-14 12:12 ` Wu Fengguang 2011-11-14 11:10 ` [PATCH 2/2] fs: Make write(2) interruptible by a signal Jan Kara 2011-11-14 11:59 ` [PATCH 0/2] Make task doing heavy writing killable Wu Fengguang 2 siblings, 1 reply; 21+ messages in thread From: Jan Kara @ 2011-11-14 11:10 UTC (permalink / raw) To: linux-fsdevel Cc: Wu Fengguang, Al Viro, k-mio, Andrew Morton, Christoph Hellwig, Jan Kara There is no reason why task in balance_dirty_pages() shouldn't be killable and it helps in recovering from some error conditions (like when filesystem goes in error state and cannot accept writeback anymore but we still want to kill processes using it to be able to unmount it). Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com> Tested-by: Kazuya Mio <k-mio@sx.jp.nec.com> Signed-off-by: Jan Kara <jack@suse.cz> --- mm/page-writeback.c | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/mm/page-writeback.c b/mm/page-writeback.c index 0360d1b..e83c286 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -1133,8 +1133,10 @@ pause: pages_dirtied, pause, start_time); - __set_current_state(TASK_UNINTERRUPTIBLE); + __set_current_state(TASK_KILLABLE); io_schedule_timeout(pause); + if (fatal_signal_pending(current)) + break; dirty_thresh = hard_dirty_limit(dirty_thresh); /* -- 1.7.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 1/2] mm: Make task in balance_dirty_pages() killable 2011-11-14 11:10 ` [PATCH 1/2] mm: Make task in balance_dirty_pages() killable Jan Kara @ 2011-11-14 12:12 ` Wu Fengguang 2011-11-14 12:37 ` Jan Kara 0 siblings, 1 reply; 21+ messages in thread From: Wu Fengguang @ 2011-11-14 12:12 UTC (permalink / raw) To: Jan Kara Cc: linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton, Christoph Hellwig On Mon, Nov 14, 2011 at 07:10:29PM +0800, Jan Kara wrote: > There is no reason why task in balance_dirty_pages() shouldn't be killable > and it helps in recovering from some error conditions (like when filesystem > goes in error state and cannot accept writeback anymore but we still want to > kill processes using it to be able to unmount it). > > Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com> > Tested-by: Kazuya Mio <k-mio@sx.jp.nec.com> > Signed-off-by: Jan Kara <jack@suse.cz> > --- > mm/page-writeback.c | 4 +++- > 1 files changed, 3 insertions(+), 1 deletions(-) > > diff --git a/mm/page-writeback.c b/mm/page-writeback.c > index 0360d1b..e83c286 100644 > --- a/mm/page-writeback.c > +++ b/mm/page-writeback.c > @@ -1133,8 +1133,10 @@ pause: > pages_dirtied, > pause, > start_time); > - __set_current_state(TASK_UNINTERRUPTIBLE); > + __set_current_state(TASK_KILLABLE); Dumb question: will the task continue to show up as 'D' in ps? > io_schedule_timeout(pause); > + if (fatal_signal_pending(current)) > + break; I'd like to move that several lines below, to the end of the loop, so that the condition is not tested at all in normal executions. > dirty_thresh = hard_dirty_limit(dirty_thresh); > /* > -- > 1.7.1 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/2] mm: Make task in balance_dirty_pages() killable 2011-11-14 12:12 ` Wu Fengguang @ 2011-11-14 12:37 ` Jan Kara 0 siblings, 0 replies; 21+ messages in thread From: Jan Kara @ 2011-11-14 12:37 UTC (permalink / raw) To: Wu Fengguang Cc: Jan Kara, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton, Christoph Hellwig On Mon 14-11-11 20:12:45, Wu Fengguang wrote: > On Mon, Nov 14, 2011 at 07:10:29PM +0800, Jan Kara wrote: > > There is no reason why task in balance_dirty_pages() shouldn't be killable > > and it helps in recovering from some error conditions (like when filesystem > > goes in error state and cannot accept writeback anymore but we still want to > > kill processes using it to be able to unmount it). > > > > Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com> > > Tested-by: Kazuya Mio <k-mio@sx.jp.nec.com> > > Signed-off-by: Jan Kara <jack@suse.cz> > > --- > > mm/page-writeback.c | 4 +++- > > 1 files changed, 3 insertions(+), 1 deletions(-) > > > > diff --git a/mm/page-writeback.c b/mm/page-writeback.c > > index 0360d1b..e83c286 100644 > > --- a/mm/page-writeback.c > > +++ b/mm/page-writeback.c > > @@ -1133,8 +1133,10 @@ pause: > > pages_dirtied, > > pause, > > start_time); > > - __set_current_state(TASK_UNINTERRUPTIBLE); > > + __set_current_state(TASK_KILLABLE); > > Dumb question: will the task continue to show up as 'D' in ps? Yes, it will. The only difference is that the task will be woken when SIGKILL arrives. > > io_schedule_timeout(pause); > > + if (fatal_signal_pending(current)) > > + break; > > I'd like to move that several lines below, to the end of the loop, > so that the condition is not tested at all in normal executions. OK, will do that. > > dirty_thresh = hard_dirty_limit(dirty_thresh); > > /* Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH 2/2] fs: Make write(2) interruptible by a signal 2011-11-14 11:10 [PATCH 0/2] Make task doing heavy writing killable Jan Kara 2011-11-14 11:10 ` [PATCH 1/2] mm: Make task in balance_dirty_pages() killable Jan Kara @ 2011-11-14 11:10 ` Jan Kara 2011-11-14 12:12 ` Matthew Wilcox 2011-11-14 12:15 ` Wu Fengguang 2011-11-14 11:59 ` [PATCH 0/2] Make task doing heavy writing killable Wu Fengguang 2 siblings, 2 replies; 21+ messages in thread From: Jan Kara @ 2011-11-14 11:10 UTC (permalink / raw) To: linux-fsdevel Cc: Wu Fengguang, Al Viro, k-mio, Andrew Morton, Christoph Hellwig, Jan Kara Currently write(2) to a file is not interruptible by a signal. Sometimes this is desirable (e.g. when you want to quickly kill a process hogging your disk or when some process gets blocked in balance_dirty_pages() indefinitely due to a filesystem being in an error condition). Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com> Tested-by: Kazuya Mio <k-mio@sx.jp.nec.com> Signed-off-by: Jan Kara <jack@suse.cz> --- mm/filemap.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index c0018f2..6b01d2f 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2407,6 +2407,10 @@ static ssize_t generic_perform_write(struct file *file, iov_iter_count(i)); again: + if (signal_pending(current)) { + status = -EINTR; + break; + } /* * Bring in the user page that we will copy from _first_. -- 1.7.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] fs: Make write(2) interruptible by a signal 2011-11-14 11:10 ` [PATCH 2/2] fs: Make write(2) interruptible by a signal Jan Kara @ 2011-11-14 12:12 ` Matthew Wilcox 2011-11-14 12:15 ` Wu Fengguang 1 sibling, 0 replies; 21+ messages in thread From: Matthew Wilcox @ 2011-11-14 12:12 UTC (permalink / raw) To: Jan Kara Cc: linux-fsdevel, Wu Fengguang, Al Viro, k-mio, Andrew Morton, Christoph Hellwig On Mon, Nov 14, 2011 at 12:10:30PM +0100, Jan Kara wrote: > again: > + if (signal_pending(current)) { Uhh ... surely 'fatal_signal_pending', no? > + status = -EINTR; > + break; > + } > > /* > * Bring in the user page that we will copy from _first_. > -- > 1.7.1 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] fs: Make write(2) interruptible by a signal 2011-11-14 11:10 ` [PATCH 2/2] fs: Make write(2) interruptible by a signal Jan Kara 2011-11-14 12:12 ` Matthew Wilcox @ 2011-11-14 12:15 ` Wu Fengguang 2011-11-14 12:34 ` Jan Kara 1 sibling, 1 reply; 21+ messages in thread From: Wu Fengguang @ 2011-11-14 12:15 UTC (permalink / raw) To: Jan Kara Cc: linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton, Christoph Hellwig > @@ -2407,6 +2407,10 @@ static ssize_t generic_perform_write(struct file *file, > iov_iter_count(i)); > > again: > + if (signal_pending(current)) { signal_pending looks more useful than fatal_signal_pending in that it covers normal signals too. However it's exactly the broader coverage that makes it an interface change -- will this possibly break casually written applications? > + status = -EINTR; > + break; > + } ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] fs: Make write(2) interruptible by a signal 2011-11-14 12:15 ` Wu Fengguang @ 2011-11-14 12:34 ` Jan Kara 2011-11-14 14:16 ` Matthew Wilcox 0 siblings, 1 reply; 21+ messages in thread From: Jan Kara @ 2011-11-14 12:34 UTC (permalink / raw) To: Wu Fengguang Cc: Jan Kara, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton, Christoph Hellwig On Mon 14-11-11 20:15:56, Wu Fengguang wrote: > > @@ -2407,6 +2407,10 @@ static ssize_t generic_perform_write(struct file *file, > > iov_iter_count(i)); > > > > again: > > + if (signal_pending(current)) { > > signal_pending looks more useful than fatal_signal_pending in that it > covers normal signals too. However it's exactly the broader coverage > that makes it an interface change -- will this possibly break casually > written applications? Yeah, this is upto discussion. Historically, write() (or any other system call) could have returned EINTR. In fact, write() to a socket can return EINTR even now. But you are right that we didn't return EINTR from write() to a regular file. So if you prefer to never return EINTR from a write to a regular file, I can change the check since I'm also slightly worried that some badly written app can notice. Honza > > > + status = -EINTR; > > + break; > > + } -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] fs: Make write(2) interruptible by a signal 2011-11-14 12:34 ` Jan Kara @ 2011-11-14 14:16 ` Matthew Wilcox 2011-11-14 15:30 ` Jan Kara 0 siblings, 1 reply; 21+ messages in thread From: Matthew Wilcox @ 2011-11-14 14:16 UTC (permalink / raw) To: Jan Kara Cc: Wu Fengguang, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton, Christoph Hellwig On Mon, Nov 14, 2011 at 01:34:46PM +0100, Jan Kara wrote: > On Mon 14-11-11 20:15:56, Wu Fengguang wrote: > > > @@ -2407,6 +2407,10 @@ static ssize_t generic_perform_write(struct file *file, > > > iov_iter_count(i)); > > > > > > again: > > > + if (signal_pending(current)) { > > > > signal_pending looks more useful than fatal_signal_pending in that it > > covers normal signals too. However it's exactly the broader coverage > > that makes it an interface change -- will this possibly break casually > > written applications? > Yeah, this is upto discussion. Historically, write() (or any other system > call) could have returned EINTR. In fact, write() to a socket can return > EINTR even now. But you are right that we didn't return EINTR from write() > to a regular file. So if you prefer to never return EINTR from a write to a > regular file, I can change the check since I'm also slightly worried that > some badly written app can notice. No, this is not up for discussion. You can't return short writes (or reads). This is why the 'fatal_signal_pending' API exists -- if the signal is fatal, the task is never returned to, so its bug (not checking the return from read/write) is not exposed. -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] fs: Make write(2) interruptible by a signal 2011-11-14 14:16 ` Matthew Wilcox @ 2011-11-14 15:30 ` Jan Kara 2011-11-14 18:44 ` Jeremy Allison 0 siblings, 1 reply; 21+ messages in thread From: Jan Kara @ 2011-11-14 15:30 UTC (permalink / raw) To: Matthew Wilcox Cc: Jan Kara, Wu Fengguang, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton, Christoph Hellwig On Mon 14-11-11 07:16:26, Matthew Wilcox wrote: > On Mon, Nov 14, 2011 at 01:34:46PM +0100, Jan Kara wrote: > > On Mon 14-11-11 20:15:56, Wu Fengguang wrote: > > > > @@ -2407,6 +2407,10 @@ static ssize_t generic_perform_write(struct file *file, > > > > iov_iter_count(i)); > > > > > > > > again: > > > > + if (signal_pending(current)) { > > > > > > signal_pending looks more useful than fatal_signal_pending in that it > > > covers normal signals too. However it's exactly the broader coverage > > > that makes it an interface change -- will this possibly break casually > > > written applications? > > Yeah, this is upto discussion. Historically, write() (or any other system > > call) could have returned EINTR. In fact, write() to a socket can return > > EINTR even now. But you are right that we didn't return EINTR from write() > > to a regular file. So if you prefer to never return EINTR from a write to a > > regular file, I can change the check since I'm also slightly worried that > > some badly written app can notice. > > No, this is not up for discussion. You can't return short writes (or > reads). This is why the 'fatal_signal_pending' API exists -- if the > signal is fatal, the task is never returned to, so its bug (not checking > the return from read/write) is not exposed. By "can't return" you mean userspace need not be expecting it so we shouldn't break it or is there some standard which forbids it? Just curious... Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/2] fs: Make write(2) interruptible by a signal 2011-11-14 15:30 ` Jan Kara @ 2011-11-14 18:44 ` Jeremy Allison 0 siblings, 0 replies; 21+ messages in thread From: Jeremy Allison @ 2011-11-14 18:44 UTC (permalink / raw) To: Jan Kara Cc: Matthew Wilcox, Wu Fengguang, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton, Christoph Hellwig On Mon, Nov 14, 2011 at 04:30:47PM +0100, Jan Kara wrote: > On Mon 14-11-11 07:16:26, Matthew Wilcox wrote: > > On Mon, Nov 14, 2011 at 01:34:46PM +0100, Jan Kara wrote: > > > On Mon 14-11-11 20:15:56, Wu Fengguang wrote: > > > > > @@ -2407,6 +2407,10 @@ static ssize_t generic_perform_write(struct file *file, > > > > > iov_iter_count(i)); > > > > > > > > > > again: > > > > > + if (signal_pending(current)) { > > > > > > > > signal_pending looks more useful than fatal_signal_pending in that it > > > > covers normal signals too. However it's exactly the broader coverage > > > > that makes it an interface change -- will this possibly break casually > > > > written applications? > > > Yeah, this is upto discussion. Historically, write() (or any other system > > > call) could have returned EINTR. In fact, write() to a socket can return > > > EINTR even now. But you are right that we didn't return EINTR from write() > > > to a regular file. So if you prefer to never return EINTR from a write to a > > > regular file, I can change the check since I'm also slightly worried that > > > some badly written app can notice. > > > > No, this is not up for discussion. You can't return short writes (or > > reads). This is why the 'fatal_signal_pending' API exists -- if the > > signal is fatal, the task is never returned to, so its bug (not checking > > the return from read/write) is not exposed. > By "can't return" you mean userspace need not be expecting it so we > shouldn't break it or is there some standard which forbids it? Just > curious... This *WILL* break userspace code if you return short writes on a filesystem fd. Guarenteed. I originally wrote code in Samba to take care of it back before I learned the difference between "slow" and "fast" interruptable system calls (see W.R.Stevens for details). Don't return short writes or reads on a filesystem fd. Jeremy. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Make task doing heavy writing killable 2011-11-14 11:10 [PATCH 0/2] Make task doing heavy writing killable Jan Kara 2011-11-14 11:10 ` [PATCH 1/2] mm: Make task in balance_dirty_pages() killable Jan Kara 2011-11-14 11:10 ` [PATCH 2/2] fs: Make write(2) interruptible by a signal Jan Kara @ 2011-11-14 11:59 ` Wu Fengguang 2011-11-14 12:05 ` Christoph Hellwig 2011-11-14 12:12 ` Jan Kara 2 siblings, 2 replies; 21+ messages in thread From: Wu Fengguang @ 2011-11-14 11:59 UTC (permalink / raw) To: Jan Kara Cc: linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton, Christoph Hellwig [-- Attachment #1: Type: text/plain, Size: 955 bytes --] On Mon, Nov 14, 2011 at 07:10:28PM +0800, Jan Kara wrote: > > Hello, > > these two patches aim at making task waiting in balance_dirty_pages() > killable. This is desirable because otherwise if filesystem stops accepting > writes (e.g. if device has been removed or other serious error condidion) we > have a task stuck in D state forever. Agreed totally. I myself has run into such conditions and get very annoyed not being able to kill the hard throttled tasks -- they just stuck there for ever if the error condition does not change. > I'm not sure who should merge these two patches... Al, Fengguang? I'd like to do it -- otherwise there will obviously be merge conflicts. Actually I also queued a patch to do this (attached). Your patches do better on TASK_KILLABLE and the use of signal_pending() in write routines, while mine goes further to add the break to various filesystems. How about combining them together? Thanks, Fengguang [-- Attachment #2: writeback-break-on-signal-pending.patch --] [-- Type: text/x-diff, Size: 4638 bytes --] Subject: writeback: quit throttling when fatal signal pending Date: Wed Sep 08 17:40:22 CST 2010 This allows quick response to Ctrl-C etc. for impatient users. It's necessary to abort the generic_perform_write() and other filesystem write loops too, to avoid large write + SIGKILL combination exceeding the dirty limit and possibly strange OOM. It mainly helps the rare bdi/global dirty exceeded cases. In the normal case of not exceeded, it will quit the loop anyway. if (fatal_signal_pending(current) && nr_reclaimable + nr_writeback <= dirty_thresh) break; Reviewed-by: Neil Brown <neilb@suse.de> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> cc: Chris Mason <chris.mason@oracle.com> cc: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> --- fs/btrfs/file.c | 4 ++++ fs/btrfs/ioctl.c | 4 ++++ fs/btrfs/relocation.c | 4 ++++ fs/buffer.c | 4 ++++ fs/ntfs/attrib.c | 2 ++ fs/ntfs/file.c | 6 ++++++ mm/filemap.c | 3 ++- mm/page-writeback.c | 2 ++ 8 files changed, 28 insertions(+), 1 deletion(-) --- linux-next.orig/mm/page-writeback.c 2011-11-13 20:37:11.000000000 +0800 +++ linux-next/mm/page-writeback.c 2011-11-13 20:37:57.000000000 +0800 @@ -1185,6 +1185,8 @@ pause: */ if (task_ratelimit) break; + if (fatal_signal_pending(current)) + break; } if (!dirty_exceeded && bdi->dirty_exceeded) --- linux-next.orig/mm/filemap.c 2011-11-13 20:03:23.000000000 +0800 +++ linux-next/mm/filemap.c 2011-11-13 20:37:17.000000000 +0800 @@ -2463,7 +2463,8 @@ again: written += copied; balance_dirty_pages_ratelimited(mapping); - + if (fatal_signal_pending(current)) + break; } while (iov_iter_count(i)); return written ? written : status; --- linux-next.orig/fs/btrfs/file.c 2011-11-13 20:37:11.000000000 +0800 +++ linux-next/fs/btrfs/file.c 2011-11-13 20:37:17.000000000 +0800 @@ -1274,6 +1274,10 @@ static noinline ssize_t __btrfs_buffered dirty_pages); if (dirty_pages < (root->leafsize >> PAGE_CACHE_SHIFT) + 1) btrfs_btree_balance_dirty(root, 1); + if (fatal_signal_pending(current)) { + ret = -EINTR; + break; + } btrfs_throttle(root); pos += copied; --- linux-next.orig/fs/btrfs/ioctl.c 2011-11-13 20:03:23.000000000 +0800 +++ linux-next/fs/btrfs/ioctl.c 2011-11-13 20:37:17.000000000 +0800 @@ -1115,6 +1115,10 @@ int btrfs_defrag_file(struct inode *inod defrag_count += ret; balance_dirty_pages_ratelimited_nr(inode->i_mapping, ret); + if (fatal_signal_pending(current)) { + ret = -EINTR; + goto out_ra; + } if (newer_than) { if (newer_off == (u64)-1) --- linux-next.orig/fs/btrfs/relocation.c 2011-11-13 20:03:23.000000000 +0800 +++ linux-next/fs/btrfs/relocation.c 2011-11-13 20:37:17.000000000 +0800 @@ -3009,6 +3009,10 @@ static int relocate_file_extent_cluster( index++; balance_dirty_pages_ratelimited(inode->i_mapping); + if (fatal_signal_pending(current)) { + ret = -EINTR; + break; + } btrfs_throttle(BTRFS_I(inode)->root); } WARN_ON(nr != cluster->nr); --- linux-next.orig/fs/buffer.c 2011-11-13 20:03:23.000000000 +0800 +++ linux-next/fs/buffer.c 2011-11-13 20:37:17.000000000 +0800 @@ -2255,6 +2255,10 @@ static int cont_expand_zero(struct file err = 0; balance_dirty_pages_ratelimited(mapping); + if (fatal_signal_pending(current)) { + err = -EINTR; + goto out; + } } /* page covers the boundary, find the boundary offset */ --- linux-next.orig/fs/ntfs/attrib.c 2011-11-13 20:03:23.000000000 +0800 +++ linux-next/fs/ntfs/attrib.c 2011-11-13 20:37:17.000000000 +0800 @@ -2588,6 +2588,8 @@ int ntfs_attr_set(ntfs_inode *ni, const unlock_page(page); page_cache_release(page); balance_dirty_pages_ratelimited(mapping); + if (fatal_signal_pending(current)) + return -EINTR; cond_resched(); } /* If there is a last partial page, need to do it the slow way. */ --- linux-next.orig/fs/ntfs/file.c 2011-11-13 20:03:23.000000000 +0800 +++ linux-next/fs/ntfs/file.c 2011-11-13 20:37:17.000000000 +0800 @@ -278,6 +278,10 @@ do_non_resident_extend: * files. */ balance_dirty_pages_ratelimited(mapping); + if (fatal_signal_pending(current)) { + err = -EINTR; + goto init_err_out; + } cond_resched(); } while (++index < end_index); read_lock_irqsave(&ni->size_lock, flags); @@ -2054,6 +2058,8 @@ static ssize_t ntfs_file_buffered_write( if (unlikely(status)) break; balance_dirty_pages_ratelimited(mapping); + if (fatal_signal_pending(current)) + break; cond_resched(); } while (count); err_out: ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Make task doing heavy writing killable 2011-11-14 11:59 ` [PATCH 0/2] Make task doing heavy writing killable Wu Fengguang @ 2011-11-14 12:05 ` Christoph Hellwig 2011-11-14 12:24 ` Jan Kara 2011-11-14 12:29 ` Wu Fengguang 2011-11-14 12:12 ` Jan Kara 1 sibling, 2 replies; 21+ messages in thread From: Christoph Hellwig @ 2011-11-14 12:05 UTC (permalink / raw) To: Wu Fengguang Cc: Jan Kara, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton, Christoph Hellwig On Mon, Nov 14, 2011 at 07:59:12PM +0800, Wu Fengguang wrote: > On Mon, Nov 14, 2011 at 07:10:28PM +0800, Jan Kara wrote: > > > > Hello, > > > > these two patches aim at making task waiting in balance_dirty_pages() > > killable. This is desirable because otherwise if filesystem stops accepting > > writes (e.g. if device has been removed or other serious error condidion) we > > have a task stuck in D state forever. > > Agreed totally. I myself has run into such conditions and get very > annoyed not being able to kill the hard throttled tasks -- they just > stuck there for ever if the error condition does not change. > > > I'm not sure who should merge these two patches... Al, Fengguang? > > I'd like to do it -- otherwise there will obviously be merge conflicts. > > Actually I also queued a patch to do this (attached). Your patches do > better on TASK_KILLABLE and the use of signal_pending() in write > routines, while mine goes further to add the break to various > filesystems. How about combining them together? Can you make balance_dirty_pages(_ratelimited) return an error instead of opencoding the fatal signal check everywhere? That would make the interface a bit more obvious. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Make task doing heavy writing killable 2011-11-14 12:05 ` Christoph Hellwig @ 2011-11-14 12:24 ` Jan Kara 2011-11-14 12:29 ` Wu Fengguang 1 sibling, 0 replies; 21+ messages in thread From: Jan Kara @ 2011-11-14 12:24 UTC (permalink / raw) To: Christoph Hellwig Cc: Wu Fengguang, Jan Kara, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton On Mon 14-11-11 07:05:46, Christoph Hellwig wrote: > On Mon, Nov 14, 2011 at 07:59:12PM +0800, Wu Fengguang wrote: > > On Mon, Nov 14, 2011 at 07:10:28PM +0800, Jan Kara wrote: > > > > > > Hello, > > > > > > these two patches aim at making task waiting in balance_dirty_pages() > > > killable. This is desirable because otherwise if filesystem stops accepting > > > writes (e.g. if device has been removed or other serious error condidion) we > > > have a task stuck in D state forever. > > > > Agreed totally. I myself has run into such conditions and get very > > annoyed not being able to kill the hard throttled tasks -- they just > > stuck there for ever if the error condition does not change. > > > > > I'm not sure who should merge these two patches... Al, Fengguang? > > > > I'd like to do it -- otherwise there will obviously be merge conflicts. > > > > Actually I also queued a patch to do this (attached). Your patches do > > better on TASK_KILLABLE and the use of signal_pending() in write > > routines, while mine goes further to add the break to various > > filesystems. How about combining them together? > > Can you make balance_dirty_pages(_ratelimited) return an error instead > of opencoding the fatal signal check everywhere? That would make the > interface a bit more obvious. We can do this. It's just that signal_pending() check e.g. in generic_perform_write() has a sense even when balance_dirty_pages() will be returning error because it also catches other cases... Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Make task doing heavy writing killable 2011-11-14 12:05 ` Christoph Hellwig 2011-11-14 12:24 ` Jan Kara @ 2011-11-14 12:29 ` Wu Fengguang 2011-11-14 12:41 ` Christoph Hellwig 1 sibling, 1 reply; 21+ messages in thread From: Wu Fengguang @ 2011-11-14 12:29 UTC (permalink / raw) To: Christoph Hellwig Cc: Jan Kara, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton > Can you make balance_dirty_pages(_ratelimited) return an error instead > of opencoding the fatal signal che Hmm that means a bigger change to the prototype of the _exported_ balance_dirty_pages_ratelimited_nr() which impacts out of tree FS... ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Make task doing heavy writing killable 2011-11-14 12:29 ` Wu Fengguang @ 2011-11-14 12:41 ` Christoph Hellwig 2011-11-14 13:01 ` Wu Fengguang 0 siblings, 1 reply; 21+ messages in thread From: Christoph Hellwig @ 2011-11-14 12:41 UTC (permalink / raw) To: Wu Fengguang Cc: Christoph Hellwig, Jan Kara, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton On Mon, Nov 14, 2011 at 08:29:32PM +0800, Wu Fengguang wrote: > > Can you make balance_dirty_pages(_ratelimited) return an error instead > > of opencoding the fatal signal che > > Hmm that means a bigger change to the prototype of the _exported_ > balance_dirty_pages_ratelimited_nr() which impacts out of tree FS... So what? out of tree fses have always been entirely supported on their own, and not a matter for keeping progress back. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Make task doing heavy writing killable 2011-11-14 12:41 ` Christoph Hellwig @ 2011-11-14 13:01 ` Wu Fengguang 2011-11-14 15:28 ` Jan Kara 0 siblings, 1 reply; 21+ messages in thread From: Wu Fengguang @ 2011-11-14 13:01 UTC (permalink / raw) To: Christoph Hellwig Cc: Jan Kara, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton On Mon, Nov 14, 2011 at 08:41:45PM +0800, Christoph Hellwig wrote: > On Mon, Nov 14, 2011 at 08:29:32PM +0800, Wu Fengguang wrote: > > > Can you make balance_dirty_pages(_ratelimited) return an error instead > > > of opencoding the fatal signal che > > > > Hmm that means a bigger change to the prototype of the _exported_ > > balance_dirty_pages_ratelimited_nr() which impacts out of tree FS... > > So what? out of tree fses have always been entirely supported on their > own, and not a matter for keeping progress back. OK. Jan, would you do the mm/*.c part in v2? I'll do the various FS parts. Thanks, Fengguang ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Make task doing heavy writing killable 2011-11-14 13:01 ` Wu Fengguang @ 2011-11-14 15:28 ` Jan Kara 2011-11-14 15:32 ` Christoph Hellwig 0 siblings, 1 reply; 21+ messages in thread From: Jan Kara @ 2011-11-14 15:28 UTC (permalink / raw) To: Wu Fengguang Cc: Christoph Hellwig, Jan Kara, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton On Mon 14-11-11 21:01:53, Wu Fengguang wrote: > On Mon, Nov 14, 2011 at 08:41:45PM +0800, Christoph Hellwig wrote: > > On Mon, Nov 14, 2011 at 08:29:32PM +0800, Wu Fengguang wrote: > > > > Can you make balance_dirty_pages(_ratelimited) return an error instead > > > > of opencoding the fatal signal che > > > > > > Hmm that means a bigger change to the prototype of the _exported_ > > > balance_dirty_pages_ratelimited_nr() which impacts out of tree FS... > > > > So what? out of tree fses have always been entirely supported on their > > own, and not a matter for keeping progress back. > > OK. Jan, would you do the mm/*.c part in v2? I'll do the various FS > parts. Sure, I'll update my patch to balance_dirty_pages(). Should I also update my patch to generic_perform_write() or does that fall under "various FS parts" cathegory? Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Make task doing heavy writing killable 2011-11-14 15:28 ` Jan Kara @ 2011-11-14 15:32 ` Christoph Hellwig 2011-11-14 16:19 ` Jan Kara 0 siblings, 1 reply; 21+ messages in thread From: Christoph Hellwig @ 2011-11-14 15:32 UTC (permalink / raw) To: Jan Kara Cc: Wu Fengguang, Christoph Hellwig, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton On Mon, Nov 14, 2011 at 04:28:32PM +0100, Jan Kara wrote: > On Mon 14-11-11 21:01:53, Wu Fengguang wrote: > > On Mon, Nov 14, 2011 at 08:41:45PM +0800, Christoph Hellwig wrote: > > > On Mon, Nov 14, 2011 at 08:29:32PM +0800, Wu Fengguang wrote: > > > > > Can you make balance_dirty_pages(_ratelimited) return an error instead > > > > > of opencoding the fatal signal che > > > > > > > > Hmm that means a bigger change to the prototype of the _exported_ > > > > balance_dirty_pages_ratelimited_nr() which impacts out of tree FS... > > > > > > So what? out of tree fses have always been entirely supported on their > > > own, and not a matter for keeping progress back. > > > > OK. Jan, would you do the mm/*.c part in v2? I'll do the various FS > > parts. > Sure, I'll update my patch to balance_dirty_pages(). Should I also update > my patch to generic_perform_write() or does that fall under "various FS > parts" cathegory? I'd be much happier if all bits go in together. IMHO this is a change worthwhile for 3.2 still, so we should probably try to get it in ASAP. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Make task doing heavy writing killable 2011-11-14 15:32 ` Christoph Hellwig @ 2011-11-14 16:19 ` Jan Kara 0 siblings, 0 replies; 21+ messages in thread From: Jan Kara @ 2011-11-14 16:19 UTC (permalink / raw) To: Christoph Hellwig Cc: Jan Kara, Wu Fengguang, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton On Mon 14-11-11 10:32:26, Christoph Hellwig wrote: > On Mon, Nov 14, 2011 at 04:28:32PM +0100, Jan Kara wrote: > > On Mon 14-11-11 21:01:53, Wu Fengguang wrote: > > > On Mon, Nov 14, 2011 at 08:41:45PM +0800, Christoph Hellwig wrote: > > > > On Mon, Nov 14, 2011 at 08:29:32PM +0800, Wu Fengguang wrote: > > > > > > Can you make balance_dirty_pages(_ratelimited) return an error instead > > > > > > of opencoding the fatal signal che > > > > > > > > > > Hmm that means a bigger change to the prototype of the _exported_ > > > > > balance_dirty_pages_ratelimited_nr() which impacts out of tree FS... > > > > > > > > So what? out of tree fses have always been entirely supported on their > > > > own, and not a matter for keeping progress back. > > > > > > OK. Jan, would you do the mm/*.c part in v2? I'll do the various FS > > > parts. > > Sure, I'll update my patch to balance_dirty_pages(). Should I also update > > my patch to generic_perform_write() or does that fall under "various FS > > parts" cathegory? > > I'd be much happier if all bits go in together. IMHO this is a change > worthwhile for 3.2 still, so we should probably try to get it in ASAP. OK, I've resent my patches with requested changes. I think those two patches make sense on it's own and fix the biggest single pain point. Fengguang can add other fs bits to these two patches (although personally I'd be happier to have respective fs maintainers acks on them before pushing). Honza -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 0/2] Make task doing heavy writing killable 2011-11-14 11:59 ` [PATCH 0/2] Make task doing heavy writing killable Wu Fengguang 2011-11-14 12:05 ` Christoph Hellwig @ 2011-11-14 12:12 ` Jan Kara 1 sibling, 0 replies; 21+ messages in thread From: Jan Kara @ 2011-11-14 12:12 UTC (permalink / raw) To: Wu Fengguang Cc: Jan Kara, linux-fsdevel@vger.kernel.org, Al Viro, k-mio@sx.jp.nec.com, Andrew Morton, Christoph Hellwig On Mon 14-11-11 19:59:12, Wu Fengguang wrote: > On Mon, Nov 14, 2011 at 07:10:28PM +0800, Jan Kara wrote: > > > > Hello, > > > > these two patches aim at making task waiting in balance_dirty_pages() > > killable. This is desirable because otherwise if filesystem stops accepting > > writes (e.g. if device has been removed or other serious error condidion) we > > have a task stuck in D state forever. > > Agreed totally. I myself has run into such conditions and get very > annoyed not being able to kill the hard throttled tasks -- they just > stuck there for ever if the error condition does not change. > > > I'm not sure who should merge these two patches... Al, Fengguang? > > I'd like to do it -- otherwise there will obviously be merge conflicts. Good. > Actually I also queued a patch to do this (attached). Your patches do > better on TASK_KILLABLE and the use of signal_pending() in write > routines, while mine goes further to add the break to various > filesystems. How about combining them together? > > Subject: writeback: quit throttling when fatal signal pending > Date: Wed Sep 08 17:40:22 CST 2010 > > This allows quick response to Ctrl-C etc. for impatient users. > > It's necessary to abort the generic_perform_write() and other filesystem > write loops too, to avoid large write + SIGKILL combination exceeding > the dirty limit and possibly strange OOM. > > It mainly helps the rare bdi/global dirty exceeded cases. > In the normal case of not exceeded, it will quit the loop anyway. > > if (fatal_signal_pending(current) && > nr_reclaimable + nr_writeback <= dirty_thresh) > break; > > Reviewed-by: Neil Brown <neilb@suse.de> > Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> > cc: Chris Mason <chris.mason@oracle.com> > cc: Anton Altaparmakov <aia21@cantab.net> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> > --- > fs/btrfs/file.c | 4 ++++ > fs/btrfs/ioctl.c | 4 ++++ > fs/btrfs/relocation.c | 4 ++++ > fs/buffer.c | 4 ++++ > fs/ntfs/attrib.c | 2 ++ > fs/ntfs/file.c | 6 ++++++ > mm/filemap.c | 3 ++- > mm/page-writeback.c | 2 ++ > 8 files changed, 28 insertions(+), 1 deletion(-) > > --- linux-next.orig/mm/page-writeback.c 2011-11-13 20:37:11.000000000 +0800 > +++ linux-next/mm/page-writeback.c 2011-11-13 20:37:57.000000000 +0800 > @@ -1185,6 +1185,8 @@ pause: > */ > if (task_ratelimit) > break; > + if (fatal_signal_pending(current)) > + break; > } > > if (!dirty_exceeded && bdi->dirty_exceeded) > --- linux-next.orig/mm/filemap.c 2011-11-13 20:03:23.000000000 +0800 > +++ linux-next/mm/filemap.c 2011-11-13 20:37:17.000000000 +0800 > @@ -2463,7 +2463,8 @@ again: > written += copied; > > balance_dirty_pages_ratelimited(mapping); > - > + if (fatal_signal_pending(current)) > + break; > } while (iov_iter_count(i)); > > return written ? written : status; Above two hunks are already part of my patches (effectively). The hunks below look good but I guess they should be split to per-filesystem patches and sent to filesystem maintainers for merging. The changes are independent so there's no reason to do them in one big patch or skip fs maintainers... Honza > --- linux-next.orig/fs/btrfs/file.c 2011-11-13 20:37:11.000000000 +0800 > +++ linux-next/fs/btrfs/file.c 2011-11-13 20:37:17.000000000 +0800 > @@ -1274,6 +1274,10 @@ static noinline ssize_t __btrfs_buffered > dirty_pages); > if (dirty_pages < (root->leafsize >> PAGE_CACHE_SHIFT) + 1) > btrfs_btree_balance_dirty(root, 1); > + if (fatal_signal_pending(current)) { > + ret = -EINTR; > + break; > + } > btrfs_throttle(root); > > pos += copied; > --- linux-next.orig/fs/btrfs/ioctl.c 2011-11-13 20:03:23.000000000 +0800 > +++ linux-next/fs/btrfs/ioctl.c 2011-11-13 20:37:17.000000000 +0800 > @@ -1115,6 +1115,10 @@ int btrfs_defrag_file(struct inode *inod > > defrag_count += ret; > balance_dirty_pages_ratelimited_nr(inode->i_mapping, ret); > + if (fatal_signal_pending(current)) { > + ret = -EINTR; > + goto out_ra; > + } > > if (newer_than) { > if (newer_off == (u64)-1) > --- linux-next.orig/fs/btrfs/relocation.c 2011-11-13 20:03:23.000000000 +0800 > +++ linux-next/fs/btrfs/relocation.c 2011-11-13 20:37:17.000000000 +0800 > @@ -3009,6 +3009,10 @@ static int relocate_file_extent_cluster( > > index++; > balance_dirty_pages_ratelimited(inode->i_mapping); > + if (fatal_signal_pending(current)) { > + ret = -EINTR; > + break; > + } > btrfs_throttle(BTRFS_I(inode)->root); > } > WARN_ON(nr != cluster->nr); > --- linux-next.orig/fs/buffer.c 2011-11-13 20:03:23.000000000 +0800 > +++ linux-next/fs/buffer.c 2011-11-13 20:37:17.000000000 +0800 > @@ -2255,6 +2255,10 @@ static int cont_expand_zero(struct file > err = 0; > > balance_dirty_pages_ratelimited(mapping); > + if (fatal_signal_pending(current)) { > + err = -EINTR; > + goto out; > + } > } > > /* page covers the boundary, find the boundary offset */ > --- linux-next.orig/fs/ntfs/attrib.c 2011-11-13 20:03:23.000000000 +0800 > +++ linux-next/fs/ntfs/attrib.c 2011-11-13 20:37:17.000000000 +0800 > @@ -2588,6 +2588,8 @@ int ntfs_attr_set(ntfs_inode *ni, const > unlock_page(page); > page_cache_release(page); > balance_dirty_pages_ratelimited(mapping); > + if (fatal_signal_pending(current)) > + return -EINTR; > cond_resched(); > } > /* If there is a last partial page, need to do it the slow way. */ > --- linux-next.orig/fs/ntfs/file.c 2011-11-13 20:03:23.000000000 +0800 > +++ linux-next/fs/ntfs/file.c 2011-11-13 20:37:17.000000000 +0800 > @@ -278,6 +278,10 @@ do_non_resident_extend: > * files. > */ > balance_dirty_pages_ratelimited(mapping); > + if (fatal_signal_pending(current)) { > + err = -EINTR; > + goto init_err_out; > + } > cond_resched(); > } while (++index < end_index); > read_lock_irqsave(&ni->size_lock, flags); > @@ -2054,6 +2058,8 @@ static ssize_t ntfs_file_buffered_write( > if (unlikely(status)) > break; > balance_dirty_pages_ratelimited(mapping); > + if (fatal_signal_pending(current)) > + break; > cond_resched(); > } while (count); > err_out: -- Jan Kara <jack@suse.cz> SUSE Labs, CR ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2011-11-14 18:44 UTC | newest] Thread overview: 21+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-11-14 11:10 [PATCH 0/2] Make task doing heavy writing killable Jan Kara 2011-11-14 11:10 ` [PATCH 1/2] mm: Make task in balance_dirty_pages() killable Jan Kara 2011-11-14 12:12 ` Wu Fengguang 2011-11-14 12:37 ` Jan Kara 2011-11-14 11:10 ` [PATCH 2/2] fs: Make write(2) interruptible by a signal Jan Kara 2011-11-14 12:12 ` Matthew Wilcox 2011-11-14 12:15 ` Wu Fengguang 2011-11-14 12:34 ` Jan Kara 2011-11-14 14:16 ` Matthew Wilcox 2011-11-14 15:30 ` Jan Kara 2011-11-14 18:44 ` Jeremy Allison 2011-11-14 11:59 ` [PATCH 0/2] Make task doing heavy writing killable Wu Fengguang 2011-11-14 12:05 ` Christoph Hellwig 2011-11-14 12:24 ` Jan Kara 2011-11-14 12:29 ` Wu Fengguang 2011-11-14 12:41 ` Christoph Hellwig 2011-11-14 13:01 ` Wu Fengguang 2011-11-14 15:28 ` Jan Kara 2011-11-14 15:32 ` Christoph Hellwig 2011-11-14 16:19 ` Jan Kara 2011-11-14 12:12 ` Jan Kara
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).