[BUG] aborted ext4 leads to infinite loop in balance_dirty_pages

linux-ext4.vger.kernel.org archive mirror

* [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
@ 2011-10-25 12:04 Kazuya Mio
  2011-10-25 13:40 ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: Kazuya Mio @ 2011-10-25 12:04 UTC (permalink / raw)
  To: ext4; +Cc: Theodore Tso, Andreas Dilger

The write system call calls balance_dirty_pages() for direct reclaim.
However, if ext4 has been aborted because of a journal abort, ext4_da_writepages()
cannot reduce the number of dirty pages because EXT4_MF_FS_ABORTED is set in
s_mount_flags. balance_dirty_pages() has a busy loop, and we can leave this loop
only when the number of dirty pages drops below the threshold. So this function
loops forever.
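
A minimal userspace sketch of the loop described above, under stated
assumptions: struct fs_model, model_writeback() and
model_balance_dirty_pages() are hypothetical names, not kernel APIs, and the
iteration bound exists only so that the model terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct fs_model {
    bool aborted;   /* stands in for EXT4_MF_FS_ABORTED being set */
    long nr_dirty;  /* dirty pages belonging to this filesystem */
};

/* Models writeback: an aborted filesystem cleans nothing. */
static long model_writeback(struct fs_model *fs, long nr_to_write)
{
    long written;

    if (fs->aborted)
        return 0;
    written = nr_to_write < fs->nr_dirty ? nr_to_write : fs->nr_dirty;
    fs->nr_dirty -= written;
    return written;
}

/*
 * Models the throttling loop: keep writing back until the dirty count is
 * below the threshold. max_iters is an addition of this model; the loop
 * in the report has no such bound.
 * Returns the number of iterations used, or -1 if the bound was hit.
 */
static long model_balance_dirty_pages(struct fs_model *fs, long thresh,
                                      long max_iters)
{
    long i;

    assert(thresh >= 0 && max_iters > 0);
    for (i = 0; i < max_iters; i++) {
        if (fs->nr_dirty < thresh)
            return i;               /* the only way out of the loop */
        model_writeback(fs, 16);
    }
    return -1;                      /* never got below the threshold */
}
```

With aborted set to false the dirty count falls and the loop exits; with
aborted set to true the count never changes and only the artificial bound
stops the model.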

The problem occurred when a write system call and kjournald ran at the same
time and disk corruption happened. The kernel version was 3.1-rc9.
I corrupted the disk on purpose by using the dmsetup command.

process1 (write)                  process2 (kjournald)

generic_perform_write
  ext4_da_write_begin
  ext4_da_write_end

-------------- detect disk corruption --------------

                                  jbd2_journal_commit_transaction
                                     journal_submit_data_buffers
                                     jbd2_journal_abort

  balance_dirty_pages
    writeback_inodes_wb
      ...
        ext4_da_writepages           <- do nothing if EXT4_MF_FS_ABORTED is set
          ext4_journal_start
            ext4_journal_start_sb    <- detect journal abort
              ext4_abort             <- set EXT4_MF_FS_ABORTED
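
The ordering in the trace above can be sketched in userspace as follows.
This is only a model under stated assumptions: struct sb_model,
MODEL_MF_FS_ABORTED (including its value), model_journal_start() and
model_da_writepages() are hypothetical stand-ins, and the model returns a
page count rather than whatever the kernel functions really return.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MF_FS_ABORTED 0x0002U /* hypothetical value, illustration only */

struct sb_model {
    unsigned int mount_flags;  /* stands in for s_mount_flags */
    bool journal_aborted;      /* set by the commit thread after an I/O error */
    long nr_dirty;             /* dirty pages still waiting for writeback */
};

/*
 * Models the journal-start step: the first caller that sees the aborted
 * journal marks the whole filesystem as aborted.
 */
static bool model_journal_start(struct sb_model *sb)
{
    if (sb->journal_aborted) {
        sb->mount_flags |= MODEL_MF_FS_ABORTED;
        return false;
    }
    return true;
}

/* Models the writepages step: returns the number of pages cleaned. */
static long model_da_writepages(struct sb_model *sb, long nr_to_write)
{
    long written;

    assert(nr_to_write >= 0);
    if (sb->mount_flags & MODEL_MF_FS_ABORTED)
        return 0;                   /* do nothing once the flag is set */
    if (!model_journal_start(sb))
        return 0;                   /* abort detected during this call */
    written = nr_to_write < sb->nr_dirty ? nr_to_write : sb->nr_dirty;
    sb->nr_dirty -= written;
    return written;
}
```

After the journal abort, the first call sets the flag and every later call
returns without cleaning anything, so the dirty count stays where it was.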

One possible fix is for ext4_da_writepages() to invalidate the dirty pages
if the filesystem has been aborted.
Do you have any ideas?

Regards,
Kazuya Mio

* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-10-25 12:04 [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages Kazuya Mio
@ 2011-10-25 13:40 ` Jan Kara
  2011-10-28  5:34   ` Kazuya Mio
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2011-10-25 13:40 UTC (permalink / raw)
  To: Kazuya Mio; +Cc: ext4, Theodore Tso, Andreas Dilger

On Tue 25-10-11 21:04:53, Kazuya Mio wrote:
> Write systemcall calls balance_dirty_pages() for direct reclaim.
> However, if ext4 is aborted because of the journal abort, ext4_da_writepages()
> cannot reduce the number of dirty pages because EXT4_MF_FS_ABORTED is set to
> s_mount_flag. banalce_dirty_pages() has a busy loop, and we can pass this loop
> only if the number of dirty pages is less than the threshold. So this function
> loops infinity.
> 
> When write systemcall and kjournald ran at the same time and the disk
> corruption happened, the problem occurred. The kernel version was 3.1-rc9.
> I corrupted the disk on purpose by using dmsetup command.
> 
> 
> process1 (write)                  process2 (kjournald)
> 
> generic_perform_write
>   ext4_da_write_begin
>   ext4_da_write_end
> 
> -------------- detect disk corruption --------------
> 
>                                   jbd2_journal_commit_transaction
>                                      journal_submit_data_buffers
>                                      jbd2_journal_abort
> 
>   balance_dirty_pages
>     writeback_inodes_wb
>       ...
>         ext4_da_writepages           <- do nothing if EXT4_MF_FS_ABORTED is set
>           ext4_journal_start
>             ext4_journal_start_sb    <- detect journal abort
>               ext4_abort             <- set EXT4_MF_FS_ABORTED
  Thanks for the report!

> One possible idea to fix this problem is that ext4_da_writepages()
> invalidates the dirty pages if the filesystem has been aborted.
  Please no. Generally this boils down to what we do with dirty data
when there's an error writing it out. Currently we just throw it away
(e.g. in the media error case) but I don't think that's generally a good
thing because e.g. the admin may want to copy the data to other working
storage. So I think we should rather keep the data and provide a mechanism
for userspace to ask the kernel to get rid of it (so that we don't
eventually go OOM).

> Do you have any ideas?
  So the question is what you would like to achieve. If you just want to
unblock a thread, then a solution would be to make a thread waiting in
balance_dirty_pages() killable. If you generally want to get rid of dirty
memory, then I don't have a really good answer, but throwing dirty data away
seems like a bad answer to me.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-10-25 13:40 ` Jan Kara
@ 2011-10-28  5:34   ` Kazuya Mio
  2011-11-01 23:13     ` Jan Kara
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Kazuya Mio @ 2011-10-28  5:34 UTC (permalink / raw)
  To: Jan Kara; +Cc: ext4, Theodore Tso, Andreas Dilger

2011/10/25 22:40, Jan Kara wrote:
>   Please no. Generally this boils down to what do we do with dirty data
> when there's error in writing them out. Currently we just throw them away
> (e.g. in media error case) but I don't think that's a generally good thing
> because e.g. admin may want to copy the data to other working storage or
> so. So I think we should rather keep the data and provide a mechanism for
> userspace to ask kernel to get rid of the data (so that we don't eventually
> run OOM).

I see. I agree with you.

>> Do you have any ideas?
>   So the question is what would you like to achieve. If you just want to
> unblock a thread then a solution would be to make a thread at
> balance_dirty_pages() killable. If generally you want to get rid of dirty
> memory, then I don't have a really good answer but throwing dirty data away
> seems like a bad answer to me.

The problem is that we cannot unmount the corrupted filesystem because of
the unkillable dd process. We have to bring the system down in order to
resume service with no dirty pages. I think that, for service continuity, it
is important to be able to kill a thread stuck in balance_dirty_pages().

Regards,
Kazuya Mio


* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-10-28  5:34   ` Kazuya Mio
@ 2011-11-01 23:13     ` Jan Kara
  2011-11-02  5:24       ` Kazuya Mio
  2011-11-07  8:00     ` Dmitry Monakhov
  2011-11-08  0:03     ` Jan Kara
  2 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2011-11-01 23:13 UTC (permalink / raw)
  To: Kazuya Mio; +Cc: Jan Kara, ext4, Theodore Tso, Andreas Dilger

On Fri 28-10-11 14:34:31, Kazuya Mio wrote:
> 2011/10/25 22:40, Jan Kara wrote:
> >  Please no. Generally this boils down to what do we do with dirty data
> >when there's error in writing them out. Currently we just throw them away
> >(e.g. in media error case) but I don't think that's a generally good thing
> >because e.g. admin may want to copy the data to other working storage or
> >so. So I think we should rather keep the data and provide a mechanism for
> >userspace to ask kernel to get rid of the data (so that we don't eventually
> >run OOM).
> 
> I see. I agree with you.
> 
> >>Do you have any ideas?
> >  So the question is what would you like to achieve. If you just want to
> >unblock a thread then a solution would be to make a thread at
> >balance_dirty_pages() killable. If generally you want to get rid of dirty
> >memory, then I don't have a really good answer but throwing dirty data away
> >seems like a bad answer to me.
> 
> The problem is that we cannot unmount the corrupted filesystem due to
> un-killable dd process. We must bring down the system to resume the service
> with no dirty pages. I think it is important for the service continuity
> to be able to kill the thread handling in balance_dirty_pages().
  Sure. Then allowing a process to be killed while waiting in
balance_dirty_pages() would solve your problem. That can be done relatively
easily. I can write the patch, but right now that code is being rewritten by
the IO-less dirty throttling patches, so I'll wait a while for it to settle
down.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-11-01 23:13     ` Jan Kara
@ 2011-11-02  5:24       ` Kazuya Mio
  0 siblings, 0 replies; 14+ messages in thread
From: Kazuya Mio @ 2011-11-02  5:24 UTC (permalink / raw)
  To: Jan Kara; +Cc: ext4, Theodore Tso, Andreas Dilger

2011/11/02 8:13, Jan Kara wrote:
>> The problem is that we cannot unmount the corrupted filesystem due to
>> un-killable dd process. We must bring down the system to resume the service
>> with no dirty pages. I think it is important for the service continuity
>> to be able to kill the thread handling in balance_dirty_pages().
>    Sure. Then allowing a process to be killed while waiting in
> balance_dirty_pages() would solve your problem. That can be done relatively
> easily. I can write the patch, just now the code is under rewrite from
> IO-less dirty throttling patches so I'll wait for a while for it to settle
> down.

Thanks for working on this.

Regards,
Kazuya Mio

* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-10-28  5:34   ` Kazuya Mio
  2011-11-01 23:13     ` Jan Kara
@ 2011-11-07  8:00     ` Dmitry Monakhov
  2011-11-07 17:29       ` Jan Kara
  2011-11-08  0:03     ` Jan Kara
  2 siblings, 1 reply; 14+ messages in thread
From: Dmitry Monakhov @ 2011-11-07  8:00 UTC (permalink / raw)
  To: Kazuya Mio, Jan Kara; +Cc: ext4, Theodore Tso, Andreas Dilger

On Fri, 28 Oct 2011 14:34:31 +0900, Kazuya Mio <k-mio@sx.jp.nec.com> wrote:
> 2011/10/25 22:40, Jan Kara wrote:
> >   Please no. Generally this boils down to what do we do with dirty data
> > when there's error in writing them out. Currently we just throw them away
> > (e.g. in media error case) but I don't think that's a generally good thing
> > because e.g. admin may want to copy the data to other working storage or
> > so. So I think we should rather keep the data and provide a mechanism for
> > userspace to ask kernel to get rid of the data (so that we don't eventually
> > run OOM).
> 
> I see. I agree with you.
> 
> >> Do you have any ideas?
> >   So the question is what would you like to achieve. If you just want to
> > unblock a thread then a solution would be to make a thread at
> > balance_dirty_pages() killable. If generally you want to get rid of dirty
> > memory, then I don't have a really good answer but throwing dirty data away
> > seems like a bad answer to me.
> 
> The problem is that we cannot unmount the corrupted filesystem due to
> un-killable dd process. We must bring down the system to resume the service
> with no dirty pages. I think it is important for the service continuity
> to be able to kill the thread handling in balance_dirty_pages().
In fact you are very lucky that dd is just deadlocked; in many cases a
journal abort results in a BUG_ON triggering (if the IO load is high enough).
This is because the transaction abort check is racy. Right now I have no good
fix with reasonable performance. My latest idea is to protect the
transaction abort check with SRCU.
> 
> Regards,
> Kazuya Mio
> 

* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-11-07  8:00     ` Dmitry Monakhov
@ 2011-11-07 17:29       ` Jan Kara
  2011-11-07 17:45         ` Dmitry Monakhov
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2011-11-07 17:29 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: Kazuya Mio, Jan Kara, ext4, Theodore Tso, Andreas Dilger

On Mon 07-11-11 12:00:41, Dmitry Monakhov wrote:
> On Fri, 28 Oct 2011 14:34:31 +0900, Kazuya Mio <k-mio@sx.jp.nec.com> wrote:
> > 2011/10/25 22:40, Jan Kara wrote:
> > >   Please no. Generally this boils down to what do we do with dirty data
> > > when there's error in writing them out. Currently we just throw them away
> > > (e.g. in media error case) but I don't think that's a generally good thing
> > > because e.g. admin may want to copy the data to other working storage or
> > > so. So I think we should rather keep the data and provide a mechanism for
> > > userspace to ask kernel to get rid of the data (so that we don't eventually
> > > run OOM).
> > 
> > I see. I agree with you.
> > 
> > >> Do you have any ideas?
> > >   So the question is what would you like to achieve. If you just want to
> > > unblock a thread then a solution would be to make a thread at
> > > balance_dirty_pages() killable. If generally you want to get rid of dirty
> > > memory, then I don't have a really good answer but throwing dirty data away
> > > seems like a bad answer to me.
> > 
> > The problem is that we cannot unmount the corrupted filesystem due to
> > un-killable dd process. We must bring down the system to resume the service
> > with no dirty pages. I think it is important for the service continuity
> > to be able to kill the thread handling in balance_dirty_pages().
> In fact you are very lucky because dd is just deadlocked, in many cases
> journal abort result in BUG_ON triggering(if IO load is high enough).
  Can you provide the exact kernel message? I'd be interested...

> This is because transaction abort check is racy. Right now i've no good
> fix which has reasonable performance. My latest idea is to protect
> transaction abort check via SRCU.
  Yeah, the code does not seem to care about races too much but I don't see
which BUG_ON would be triggered...

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-11-07 17:29       ` Jan Kara
@ 2011-11-07 17:45         ` Dmitry Monakhov
  2011-11-07 21:23           ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: Dmitry Monakhov @ 2011-11-07 17:45 UTC (permalink / raw)
  To: Jan Kara; +Cc: Kazuya Mio, Jan Kara, ext4, Theodore Tso, Andreas Dilger

On Mon, 7 Nov 2011 18:29:39 +0100, Jan Kara <jack@suse.cz> wrote:
> On Mon 07-11-11 12:00:41, Dmitry Monakhov wrote:
> > On Fri, 28 Oct 2011 14:34:31 +0900, Kazuya Mio <k-mio@sx.jp.nec.com> wrote:
> > > 2011/10/25 22:40, Jan Kara wrote:
> > > >   Please no. Generally this boils down to what do we do with dirty data
> > > > when there's error in writing them out. Currently we just throw them away
> > > > (e.g. in media error case) but I don't think that's a generally good thing
> > > > because e.g. admin may want to copy the data to other working storage or
> > > > so. So I think we should rather keep the data and provide a mechanism for
> > > > userspace to ask kernel to get rid of the data (so that we don't eventually
> > > > run OOM).
> > > 
> > > I see. I agree with you.
> > > 
> > > >> Do you have any ideas?
> > > >   So the question is what would you like to achieve. If you just want to
> > > > unblock a thread then a solution would be to make a thread at
> > > > balance_dirty_pages() killable. If generally you want to get rid of dirty
> > > > memory, then I don't have a really good answer but throwing dirty data away
> > > > seems like a bad answer to me.
> > > 
> > > The problem is that we cannot unmount the corrupted filesystem due to
> > > un-killable dd process. We must bring down the system to resume the service
> > > with no dirty pages. I think it is important for the service continuity
> > > to be able to kill the thread handling in balance_dirty_pages().
> > In fact you are very lucky because dd is just deadlocked, in many cases
> > journal abort result in BUG_ON triggering(if IO load is high enough).
>   Can you provide the exact kernel message? I'd be interested...
Several times I've hit a failure in jbd2_journal_stop() here:
int jbd2_journal_stop(handle_t *handle)
{
        transaction_t *transaction = handle->h_transaction;
        journal_t *journal = transaction->t_journal;
        int err, wait_for_commit = 0;
        tid_t tid;
        pid_t pid;

        J_ASSERT(journal_current_handle() == handle);

        if (is_handle_aborted(handle))
                err = -EIO;
        else {
                J_ASSERT(atomic_read(&transaction->t_updates) > 0);
##FAILED HERE ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                err = 0;
        }


> 
> > This is because transaction abort check is racy. Right now i've no good
> > fix which has reasonable performance. My latest idea is to protect
> > transaction abort check via SRCU.
>   Yeah, the code does not seem to care about races too much but I don't see
> which BUG_ON would be triggered...
> 
> 								Honza
> -- 
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR

* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-11-07 17:45         ` Dmitry Monakhov
@ 2011-11-07 21:23           ` Jan Kara
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2011-11-07 21:23 UTC (permalink / raw)
  To: Dmitry Monakhov; +Cc: Jan Kara, Kazuya Mio, ext4, Theodore Tso, Andreas Dilger

On Mon 07-11-11 21:45:31, Dmitry Monakhov wrote:
> On Mon, 7 Nov 2011 18:29:39 +0100, Jan Kara <jack@suse.cz> wrote:
> > On Mon 07-11-11 12:00:41, Dmitry Monakhov wrote:
> > > On Fri, 28 Oct 2011 14:34:31 +0900, Kazuya Mio <k-mio@sx.jp.nec.com> wrote:
> > > > 2011/10/25 22:40, Jan Kara wrote:
> > > > >   Please no. Generally this boils down to what do we do with dirty data
> > > > > when there's error in writing them out. Currently we just throw them away
> > > > > (e.g. in media error case) but I don't think that's a generally good thing
> > > > > because e.g. admin may want to copy the data to other working storage or
> > > > > so. So I think we should rather keep the data and provide a mechanism for
> > > > > userspace to ask kernel to get rid of the data (so that we don't eventually
> > > > > run OOM).
> > > > 
> > > > I see. I agree with you.
> > > > 
> > > > >> Do you have any ideas?
> > > > >   So the question is what would you like to achieve. If you just want to
> > > > > unblock a thread then a solution would be to make a thread at
> > > > > balance_dirty_pages() killable. If generally you want to get rid of dirty
> > > > > memory, then I don't have a really good answer but throwing dirty data away
> > > > > seems like a bad answer to me.
> > > > 
> > > > The problem is that we cannot unmount the corrupted filesystem due to
> > > > un-killable dd process. We must bring down the system to resume the service
> > > > with no dirty pages. I think it is important for the service continuity
> > > > to be able to kill the thread handling in balance_dirty_pages().
> > > In fact you are very lucky because dd is just deadlocked, in many cases
> > > journal abort result in BUG_ON triggering(if IO load is high enough).
> >   Can you provide the exact kernel message? I'd be interested...
> Several times i've failed in journal_stop() here:
> int jbd2_journal_stop(handle_t *handle)
> {
>         transaction_t *transaction = handle->h_transaction;
>         journal_t *journal = transaction->t_journal;
>         int err, wait_for_commit = 0;
>         tid_t tid;
>         pid_t pid;
> 
>         J_ASSERT(journal_current_handle() == handle);
> 
>         if (is_handle_aborted(handle))
>                 err = -EIO;
>         else {
>                 J_ASSERT(atomic_read(&transaction->t_updates) > 0);
> ##FAILED HERE ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>                 err = 0;
>         }
  Hum, interesting. The logic wrt t_updates looks correct to me. Whenever
we create a new handle in a transaction, we increase t_updates. Whenever we
remove the handle, we decrease t_updates. Whether the journal / handle is
aborted or not does not play any role here. So I fail to see how the
assertion can be triggered, unless we tried to release a handle twice or
something like that...
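
To make the accounting argument above concrete, here is a small userspace
model. It is only a sketch under stated assumptions: struct txn_model,
model_handle_start() and model_handle_stop() are hypothetical names, a plain
int stands in for the kernel's atomic counter, and the model is
single-threaded, so it says nothing about races.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a transaction; not the jbd2 structures. */
struct txn_model {
    int t_updates;  /* number of live handles (atomic in the kernel) */
    bool aborted;   /* deliberately ignored by the accounting below */
};

/* Creating a handle always takes one reference on the transaction. */
static void model_handle_start(struct txn_model *t)
{
    assert(t->t_updates >= 0);
    t->t_updates++;
}

/*
 * Releasing a handle always drops one reference. Returns 0 on success and
 * -1 in the situation where the J_ASSERT quoted above would fire, i.e.
 * when there is no reference left to drop.
 */
static int model_handle_stop(struct txn_model *t)
{
    if (t->t_updates <= 0)
        return -1;
    t->t_updates--;
    return 0;
}
```

In this model the counter can only reach the failing state through an
unbalanced release, whatever the value of aborted.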

								Honza

* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-10-28  5:34   ` Kazuya Mio
  2011-11-01 23:13     ` Jan Kara
  2011-11-07  8:00     ` Dmitry Monakhov
@ 2011-11-08  0:03     ` Jan Kara
  2011-11-09  8:28       ` Kazuya Mio
  2011-11-14 10:06       ` Kazuya Mio
  2 siblings, 2 replies; 14+ messages in thread
From: Jan Kara @ 2011-11-08  0:03 UTC (permalink / raw)
  To: Kazuya Mio; +Cc: Jan Kara, ext4, Theodore Tso, Andreas Dilger

[-- Attachment #1: Type: text/plain, Size: 1414 bytes --]

On Fri 28-10-11 14:34:31, Kazuya Mio wrote:
> 2011/10/25 22:40, Jan Kara wrote:
> >  Please no. Generally this boils down to what do we do with dirty data
> >when there's error in writing them out. Currently we just throw them away
> >(e.g. in media error case) but I don't think that's a generally good thing
> >because e.g. admin may want to copy the data to other working storage or
> >so. So I think we should rather keep the data and provide a mechanism for
> >userspace to ask kernel to get rid of the data (so that we don't eventually
> >run OOM).
> 
> I see. I agree with you.
> 
> >>Do you have any ideas?
> >  So the question is what would you like to achieve. If you just want to
> >unblock a thread then a solution would be to make a thread at
> >balance_dirty_pages() killable. If generally you want to get rid of dirty
> >memory, then I don't have a really good answer but throwing dirty data away
> >seems like a bad answer to me.
> 
> The problem is that we cannot unmount the corrupted filesystem due to
> un-killable dd process. We must bring down the system to resume the service
> with no dirty pages. I think it is important for the service continuity
> to be able to kill the thread handling in balance_dirty_pages().
  OK, attached are two patches based on latest Linus's tree that should
make your task killable. Can you test them?

									Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

[-- Attachment #2: 0001-mm-Make-task-in-balance_dirty_pages-killable.patch --]
[-- Type: text/x-patch, Size: 1076 bytes --]

From 62d9916059c0441b3f545158f723c7006bcdc1e8 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Mon, 7 Nov 2011 18:41:05 +0100
Subject: [PATCH 1/2] mm: Make task in balance_dirty_pages() killable

There is no reason why task in balance_dirty_pages() shouldn't be killable
and it helps in recovering from some error conditions (like when filesystem
goes in error state and cannot accept writeback anymore but we still want to
kill processes using it to be able to unmount it).

Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/page-writeback.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0360d1b..e83c286 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1133,8 +1133,10 @@ pause:
 					  pages_dirtied,
 					  pause,
 					  start_time);
-		__set_current_state(TASK_UNINTERRUPTIBLE);
+		__set_current_state(TASK_KILLABLE);
 		io_schedule_timeout(pause);
+		if (fatal_signal_pending(current))
+			break;
 
 		dirty_thresh = hard_dirty_limit(dirty_thresh);
 		/*
-- 
1.7.1
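
The effect of the patch above can be sketched with a userspace model. This
is an illustration under stated assumptions, not the kernel code:
enum throttle_result and model_throttle() are hypothetical, the fatal signal
is simulated by an iteration number, and nothing is ever cleaned because the
model assumes an aborted filesystem.

```c
#include <assert.h>

enum throttle_result {
    THROTTLE_DONE,    /* dirty count is below the threshold */
    THROTTLE_KILLED,  /* left the loop because of a fatal signal */
    THROTTLE_STUCK    /* hit the model's artificial iteration bound */
};

/*
 * kill_after < 0 means that no fatal signal ever arrives; otherwise the
 * signal is treated as pending from iteration kill_after onwards.
 */
static enum throttle_result model_throttle(long nr_dirty, long thresh,
                                           long kill_after, long max_iters)
{
    long i;

    assert(max_iters > 0);
    for (i = 0; i < max_iters; i++) {
        if (nr_dirty < thresh)
            return THROTTLE_DONE;
        /* The task would sleep here; nothing is cleaned in this model. */
        if (kill_after >= 0 && i >= kill_after)
            return THROTTLE_KILLED; /* models the fatal-signal check */
    }
    return THROTTLE_STUCK;
}
```

Without a signal the model stays stuck for as long as it is allowed to run;
with one, it leaves the loop even though the dirty count is unchanged.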


[-- Attachment #3: 0002-fs-Make-write-2-interruptible-by-a-signal.patch --]
[-- Type: text/x-patch, Size: 979 bytes --]

From 6eefa10d92cc35b66a8166cc26472d383b572b0d Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Mon, 7 Nov 2011 18:46:39 +0100
Subject: [PATCH 2/2] fs: Make write(2) interruptible by a signal

Currently write(2) to a file is not interruptible by a signal. Sometimes this
is desirable (e.g. when you want to quickly kill a process hogging your disk or
when some process gets blocked in balance_dirty_pages() indefinitely due to a
filesystem being in an error condition).

Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/filemap.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index c0018f2..6b01d2f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2407,6 +2407,10 @@ static ssize_t generic_perform_write(struct file *file,
 						iov_iter_count(i));
 
 again:
+		if (signal_pending(current)) {
+			status = -EINTR;
+			break;
+		}
 
 		/*
 		 * Bring in the user page that we will copy from _first_.
-- 
1.7.1
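
If write(2) can be interrupted by a signal, a userspace caller that installs
signal handlers may see a short write or EINTR and has to cope with both.
The following is a minimal sketch of the usual retry loop; write_all() is a
hypothetical helper name, and whether a given kernel returns a short count
or EINTR in a particular situation is not something this sketch asserts.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write exactly len bytes to fd, retrying after short writes and after
 * EINTR. Returns len on success, or -1 with errno set on any other error.
 */
static ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t done = 0;

    assert(buf != NULL || len == 0);
    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any byte was written */
            return -1;
        }
        done += (size_t)n;  /* a short write just advances the offset */
    }
    return (ssize_t)done;
}
```

A process that receives a fatal signal never gets as far as retrying, which
is the case this thread is about.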


* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-11-08  0:03     ` Jan Kara
@ 2011-11-09  8:28       ` Kazuya Mio
  2011-11-09 11:15         ` Jan Kara
  2011-11-14 10:06       ` Kazuya Mio
  1 sibling, 1 reply; 14+ messages in thread
From: Kazuya Mio @ 2011-11-09  8:28 UTC (permalink / raw)
  To: Jan Kara; +Cc: ext4, Theodore Tso, Andreas Dilger

2011/11/08 9:03, Jan Kara wrote:
> On Fri 28-10-11 14:34:31, Kazuya Mio wrote:
>> 2011/10/25 22:40, Jan Kara wrote:
>>>   Please no. Generally this boils down to what do we do with dirty data
>>> when there's error in writing them out. Currently we just throw them away
>>> (e.g. in media error case) but I don't think that's a generally good thing
>>> because e.g. admin may want to copy the data to other working storage or
>>> so. So I think we should rather keep the data and provide a mechanism for
>>> userspace to ask kernel to get rid of the data (so that we don't eventually
>>> run OOM).
>>
>> I see. I agree with you.
>>
>>>> Do you have any ideas?
>>>   So the question is what would you like to achieve. If you just want to
>>> unblock a thread then a solution would be to make a thread at
>>> balance_dirty_pages() killable. If generally you want to get rid of dirty
>>> memory, then I don't have a really good answer but throwing dirty data away
>>> seems like a bad answer to me.
>>
>> The problem is that we cannot unmount the corrupted filesystem due to
>> un-killable dd process. We must bring down the system to resume the service
>> with no dirty pages. I think it is important for the service continuity
>> to be able to kill the thread handling in balance_dirty_pages().
>    OK, attached are two patches based on latest Linus's tree that should
> make your task killable. Can you test them?

I'm trying to reproduce now, but it's hard. Could you wait a few days?

* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-11-09  8:28       ` Kazuya Mio
@ 2011-11-09 11:15         ` Jan Kara
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2011-11-09 11:15 UTC (permalink / raw)
  To: Kazuya Mio; +Cc: Jan Kara, ext4, Theodore Tso, Andreas Dilger

On Wed 09-11-11 17:28:20, Kazuya Mio wrote:
> 2011/11/08 9:03, Jan Kara wrote:
> > On Fri 28-10-11 14:34:31, Kazuya Mio wrote:
> >> 2011/10/25 22:40, Jan Kara wrote:
> >>>   Please no. Generally this boils down to what do we do with dirty data
> >>> when there's error in writing them out. Currently we just throw them away
> >>> (e.g. in media error case) but I don't think that's a generally good thing
> >>> because e.g. admin may want to copy the data to other working storage or
> >>> so. So I think we should rather keep the data and provide a mechanism for
> >>> userspace to ask kernel to get rid of the data (so that we don't eventually
> >>> run OOM).
> >>
> >> I see. I agree with you.
> >>
> >>>> Do you have any ideas?
> >>>   So the question is what would you like to achieve. If you just want to
> >>> unblock a thread then a solution would be to make a thread at
> >>> balance_dirty_pages() killable. If generally you want to get rid of dirty
> >>> memory, then I don't have a really good answer but throwing dirty data away
> >>> seems like a bad answer to me.
> >>
> >> The problem is that we cannot unmount the corrupted filesystem due to
> >> un-killable dd process. We must bring down the system to resume the service
> >> with no dirty pages. I think it is important for the service continuity
> >> to be able to kill the thread handling in balance_dirty_pages().
> >    OK, attached are two patches based on latest Linus's tree that should
> > make your task killable. Can you test them?
> 
> I'm trying to reproduce now, but it's hard. Could you wait a few days?
  Sure, take as much time as you need.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-11-08  0:03     ` Jan Kara
  2011-11-09  8:28       ` Kazuya Mio
@ 2011-11-14 10:06       ` Kazuya Mio
  2011-11-14 11:11         ` Jan Kara
  1 sibling, 1 reply; 14+ messages in thread
From: Kazuya Mio @ 2011-11-14 10:06 UTC (permalink / raw)
  To: Jan Kara; +Cc: ext4, Theodore Tso, Andreas Dilger

2011/11/08 9:03, Jan Kara wrote:
> On Fri 28-10-11 14:34:31, Kazuya Mio wrote:
>> 2011/10/25 22:40, Jan Kara wrote:
>>>   Please no. Generally this boils down to what do we do with dirty data
>>> when there's error in writing them out. Currently we just throw them away
>>> (e.g. in media error case) but I don't think that's a generally good thing
>>> because e.g. admin may want to copy the data to other working storage or
>>> so. So I think we should rather keep the data and provide a mechanism for
>>> userspace to ask kernel to get rid of the data (so that we don't eventually
>>> run OOM).
>>
>> I see. I agree with you.
>>
>>>> Do you have any ideas?
>>>   So the question is what would you like to achieve. If you just want to
>>> unblock a thread then a solution would be to make a thread at
>>> balance_dirty_pages() killable. If generally you want to get rid of dirty
>>> memory, then I don't have a really good answer but throwing dirty data away
>>> seems like a bad answer to me.
>>
>> The problem is that we cannot unmount the corrupted filesystem due to
>> un-killable dd process. We must bring down the system to resume the service
>> with no dirty pages. I think it is important for the service continuity
>> to be able to kill the thread handling in balance_dirty_pages().
>    OK, attached are two patches based on latest Linus's tree that should
> make your task killable. Can you test them?

Sorry for the late reply.
I confirmed that these patches fix the problem.

Reported-and-tested-by: Kazuya Mio <k-mio@sx.jp.nec.com>

Regards,
Kazuya Mio

* Re: [BUG] aborted ext4 leads to inifinity loop in balance_dirty_pages
  2011-11-14 10:06       ` Kazuya Mio
@ 2011-11-14 11:11         ` Jan Kara
  0 siblings, 0 replies; 14+ messages in thread
From: Jan Kara @ 2011-11-14 11:11 UTC (permalink / raw)
  To: Kazuya Mio; +Cc: Jan Kara, ext4, Theodore Tso, Andreas Dilger

On Mon 14-11-11 19:06:31, Kazuya Mio wrote:
> 2011/11/08 9:03, Jan Kara wrote:
> > On Fri 28-10-11 14:34:31, Kazuya Mio wrote:
> >> 2011/10/25 22:40, Jan Kara wrote:
> >>>   Please no. Generally this boils down to what do we do with dirty data
> >>> when there's error in writing them out. Currently we just throw them away
> >>> (e.g. in media error case) but I don't think that's a generally good thing
> >>> because e.g. admin may want to copy the data to other working storage or
> >>> so. So I think we should rather keep the data and provide a mechanism for
> >>> userspace to ask kernel to get rid of the data (so that we don't eventually
> >>> run OOM).
> >>
> >> I see. I agree with you.
> >>
> >>>> Do you have any ideas?
> >>>   So the question is what would you like to achieve. If you just want to
> >>> unblock a thread then a solution would be to make a thread at
> >>> balance_dirty_pages() killable. If generally you want to get rid of dirty
> >>> memory, then I don't have a really good answer but throwing dirty data away
> >>> seems like a bad answer to me.
> >>
> >> The problem is that we cannot unmount the corrupted filesystem due to
> >> un-killable dd process. We must bring down the system to resume the service
> >> with no dirty pages. I think it is important for the service continuity
> >> to be able to kill the thread handling in balance_dirty_pages().
> >    OK, attached are two patches based on latest Linus's tree that should
> > make your task killable. Can you test them?
> 
> Sorry for the late reply.
> I confirmed that these patches fix the problem.
> 
> Reported-and-tested-by: Kazuya Mio <k-mio@sx.jp.nec.com>
  Thanks for testing! I've sent patches for inclusion...

									Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
