All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <matthew@wil.cx>
To: Bob Bell <b_lkml@thebellsplace.com>
Cc: Andrew Morton <akpm@osdl.org>, Linus Torvalds <torvalds@osdl.org>,
	trond@netapp.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] TASK_KILLABLE version 2
Date: Thu, 6 Dec 2007 18:49:20 -0700	[thread overview]
Message-ID: <20071207014920.GB28595@parisc-linux.org> (raw)
In-Reply-To: <20070924201648.GA6850@newbie.thebellsplace.net>

On Mon, Sep 24, 2007 at 04:16:48PM -0400, Bob Bell wrote:
> I've been testing this patch on my systems.  It's working for me when
> I read() a file.  Asynchronous write()s seem okay, too.  However,
> synchronous writes (caused by either calling fsync() or fcntl() to
> release a lock) prevent the process from being killed when the NFS
> server goes down.
> 
> When the process is sent SIGKILL, it's waiting with the following call
> tree:
>    do_fsync
>    nfs_fsync
>    nfs_wb_all
>    nfs_sync_mapping_wait
>    nfs_wait_on_requests_locks (I believe)
>    nfs_wait_on_request
>    out_of_line_wait_on_bit
>    __wait_on_bit
>    nfs_wait_bit_interruptible
>    schedule
> 
> When the process is later viewed after being deemed "stuck", it's
> waiting with the following call tree:
>    do_fsync
>    filemap_fdatawait
>    wait_on_page_writeback_range
>    wait_on_page_writeback
>    wait_on_page_bit
>    __wait_on_bit
>    sync_page
>    io_schedule
>    schedule
> 
> If I hazard a guess as to what might be wrong here, I believe that when
> the processes catches SIGKILL, nfs_wait_bit_interruptible is returning
> -ERESTARTSYS.  That error bubbles back up to nfs_fsync.  However,
> nfs_fsync returns ctx->error, not -ERESTARTSYS, and ctx->error is 0.
> do_fsync proceeds to call filemap_fdatawait.  I question whether 
> nfs_sync should return an error, and if do_fsync should skip 
> filemap_fdatawait if the fsync op returned an error.
> 
> I did try replacing the call to sync_page in __wait_on_bit with 
> sync_page_killable and replacing TASK_UNINTERRUPTIBLE with 
> TASK_KILLABLE.  That seemed to work once, but then really screwed things 
> up on subsequent attempts.

Hi Bob,

The patch I posted earlier today (a somewhat cleaned-up version of a
patch originally from Liam Howlett) converts NFS to use TASK_KILLABLE.
Could you have another shot at breaking it?  You can find it here:

http://marc.info/?l=linux-fsdevel&m=119698206729969&w=2

(It has some prerequisites that are in the git tree, so probably the
easiest thing is just to pull that git tree).

Liam notes that there's still problems with programs that call *stat*(),
but I haven't looked into that issue yet.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

      parent reply	other threads:[~2007-12-07  1:49 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-09-02  2:43 [PATCH] TASK_KILLABLE version 2 Matthew Wilcox
2007-09-02  2:46 ` [PATCH 1/5] Use wake_up_locked() in eventpoll Matthew Wilcox
2007-09-02  2:46 ` [PATCH 2/5] Use macros instead of TASK_ flags Matthew Wilcox
2007-09-02  2:54   ` Matthew Wilcox
2007-09-02  3:35   ` Daniel Walker
2007-09-02  4:05     ` Matthew Wilcox
2007-09-03 21:03   ` Matthew Wilcox
2007-09-02  2:46 ` [PATCH 3/5] Add TASK_WAKEKILL Matthew Wilcox
2007-09-02  2:46 ` [PATCH 4/5] Add lock_page_killable Matthew Wilcox
2007-09-02  2:46 ` [PATCH 5/5] Make wait_on_retry_sync_kiocb killable Matthew Wilcox
2007-09-24 20:16 ` [PATCH] TASK_KILLABLE version 2 Bob Bell
2007-09-26 11:57   ` Ric Wheeler
2007-09-27 21:47     ` Nick Piggin
2007-12-07  1:49   ` Matthew Wilcox [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20071207014920.GB28595@parisc-linux.org \
    --to=matthew@wil.cx \
    --cc=akpm@osdl.org \
    --cc=b_lkml@thebellsplace.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@osdl.org \
    --cc=trond@netapp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.