From: NeilBrown <neilb@suse.de>
To: Milosz Tanski <milosz@adfin.com>
Cc: ceph-devel <ceph-devel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-cachefs@redhat.com" <linux-cachefs@redhat.com>,
	David Howells <dhowells@redhat.com>
Subject: Re: fscache recursive hang -- similar to loopback NFS issues
Date: Mon, 21 Jul 2014 16:40:44 +1000
Message-ID: <20140721164044.2845f3fd@notabene.brown>
In-Reply-To: <CANP1eJE7tU9touhSq+Utt=MLE4w5D_C4pT1TAFAiFNBh8ee_mA@mail.gmail.com>
On Sat, 19 Jul 2014 16:20:01 -0400 Milosz Tanski <milosz@adfin.com> wrote:
> Neil,
>
> I saw your recent patchset for improving the wait_on_bit interface
> (in particular: "SCHED: allow wait_on_bit_action functions to support a
> timeout"). I'm looking for some guidance on leveraging that work to
> solve another recursive lock hang in fscache.
>
> I've run into issues similar to the ones you're trying to solve with
> loopback NFS, but in the fscache code. This happens under heavy VM
> pressure when the kernel is aggressively trying to trim the page cache.
>
> The hang is caused by this series of events:
> 1. cachefiles_write_page - cachefiles (the fscache backend, sitting on
>    ext4) tries to write a page to disk
> 2. ext4 tries to allocate a page in writeback (without GFP_NOFS and
>    with the wait flag)
> 3. due to VM pressure the kernel tries to free up pages
> 4. this causes the page-release path in ceph to be called
> 5. the selected page is a cached page in the process of being written
>    out (from step #1)
> 6. fscache_wait_on_page_write hangs forever
>
> Do you have a solution for NFS, in another patch that implements the
> timeout, that I can use as a template? I'm not familiar
> with that piece of the code base.

It looks like the comment in __fscache_maybe_release_page

	/* We will wait here if we're allowed to, but that could deadlock the
	 * allocator as the work threads writing to the cache may all end up
	 * sleeping on memory allocation, so we may need to impose a timeout
	 * too. */

is correct when it says "we may need to impose a timeout".
The __fscache_wait_on_page_write() call that follows it needs to time out.
However, that doesn't use wait_on_bit(); it just has a simple wait_event().

So something like this should fix it (or should at least move the problem
along a bit).

NeilBrown
diff --git a/fs/fscache/page.c b/fs/fscache/page.c
index ed70714503fa..58035024c5cf 100644
--- a/fs/fscache/page.c
+++ b/fs/fscache/page.c
@@ -43,6 +43,13 @@ void __fscache_wait_on_page_write(struct fscache_cookie *cookie, struct page *pa
 }
 EXPORT_SYMBOL(__fscache_wait_on_page_write);
 
+void __fscache_wait_on_page_write_timeout(struct fscache_cookie *cookie, struct page *page, unsigned long timeout)
+{
+	wait_queue_head_t *wq = bit_waitqueue(&cookie->flags, 0);
+
+	wait_event_timeout(*wq, !__fscache_check_page_write(cookie, page), timeout);
+}
+
 /*
  * decide whether a page can be released, possibly by cancelling a store to it
  * - we're allowed to sleep if __GFP_WAIT is flagged
@@ -115,7 +122,7 @@ page_busy:
 	}
 
 	fscache_stat(&fscache_n_store_vmscan_wait);
-	__fscache_wait_on_page_write(cookie, page);
+	__fscache_wait_on_page_write_timeout(cookie, page, HZ);
 	gfp &= ~__GFP_WAIT;
 	goto try_again;
 }
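
For readers who want to experiment with the control flow outside the kernel,
here is a rough userspace analogy of what the patch does: wait for the write
with a bounded timeout, then retry exactly once more without being allowed to
sleep. It is only a sketch. pthreads stand in for the kernel wait queue, and
struct page_state, wait_on_page_write_timeout(), maybe_release_page() and the
50 ms timeout are all made up for illustration; none of them are fscache
interfaces.

```c
/* Userspace analogy only; none of these names exist in fscache. */
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for "a page with a store in flight". */
struct page_state {
	pthread_mutex_t lock;
	pthread_cond_t  done;     /* signalled when the write completes */
	bool            writing;  /* true while the write is in flight */
};

/*
 * Wait until the write finishes or timeout_ms elapses.
 * Returns true if the write completed, false if we timed out.
 */
static bool wait_on_page_write_timeout(struct page_state *ps, long timeout_ms)
{
	struct timespec deadline;
	bool completed;
	int rc = 0;

	/* pthread_cond_timedwait() takes an absolute deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec  += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&ps->lock);
	while (ps->writing && rc != ETIMEDOUT)
		rc = pthread_cond_timedwait(&ps->done, &ps->lock, &deadline);
	completed = !ps->writing;
	pthread_mutex_unlock(&ps->lock);
	return completed;
}

/*
 * Shape of the patched release path: if the page is busy and we may
 * sleep, wait with a timeout, then drop permission to sleep and retry.
 * Returns true if the page can be released, false if it is still busy.
 */
static bool maybe_release_page(struct page_state *ps, bool may_wait)
{
	bool busy;

try_again:
	pthread_mutex_lock(&ps->lock);
	busy = ps->writing;
	pthread_mutex_unlock(&ps->lock);

	if (!busy)
		return true;
	if (!may_wait)
		return false;

	wait_on_page_write_timeout(ps, 50);  /* arbitrary 50 ms for the demo */
	may_wait = false;                    /* analogous to gfp &= ~__GFP_WAIT */
	goto try_again;
}
```

The point of the structure is that a write which never completes now makes
the release attempt fail after the timeout instead of blocking the caller
forever.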