From: Chris Mason <chris.mason@oracle.com>
To: linux-mm@kvack.org, Peter Zijlstra <a.p.zijlstra@chello.nl>,
Matthew Wilcox <matthew@wil.cx>,
"chuck.lever" <chuck.lever@oracle.com>
Subject: [PATCH RFC] Lost wakeups from lock_page_killable()
Date: Wed, 14 Jan 2009 15:23:52 -0500 [thread overview]
Message-ID: <1231964632.8269.47.camel@think.oraclecorp.com> (raw)
Chuck has been debugging a problem with NFS mounts where procs are
getting stuck waiting on the page lock. He did a bunch of work around
tracking the last person to lock the page and then printing out the page
state when it found stuck procs.
The end result showed that procs were waiting to lock pages that were
not actually locked. The workload involved a well known commercial
database that was being shutdown via sql shutdown abort
After trying to blame NFS for a really long time I think
lock_page_killable may be the cause.
lock_page and lock_page_killable both call __wait_on_bit_lock, and so
both end up using prepare_to_wait_exclusive(). This means that when
someone does finally unlock the page, only one process is going to get
woken up.
So, procA holding the page lock, procB and procC are waiting on the
lock.
procA: lock_page() // success
procB: lock_page_killable(), sync_page_killable(), io_schedule()
procC: lock_page_killable(), sync_page_killable(), io_schedule()
procA: unlock, wake_up_page(page, PG_locked)
procA: wake up procB
happy admin: kill procB
procB: wakes into sync_page_killable(), notices the signal and returns
-EINTR
procB: __wait_on_bit_lock sees the action() func returns < 0 and does
not take the page lock
procB: lock_page_killable() returns < 0 and exits happily.
procC: sleeping in io_schedule() forever unless someone else locks the
page.
The patch below is entirely untested but may do a better job of
explaining what I think the bug is. I'm hoping I can trigger it locally
with a few dd commands mixed with a lot of kill commands.
-chris
diff --git a/mm/filemap.c b/mm/filemap.c
index ceba0bd..e1184fa 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -623,9 +623,20 @@ EXPORT_SYMBOL(__lock_page);
int __lock_page_killable(struct page *page)
{
DEFINE_WAIT_BIT(wait, &page->flags, PG_locked);
+ int ret;
- return __wait_on_bit_lock(page_waitqueue(page), &wait,
+ ret = __wait_on_bit_lock(page_waitqueue(page), &wait,
sync_page_killable, TASK_KILLABLE);
+ /*
+ * wait_on_bit_lock uses prepare_to_wait_exclusive, so if multiple
+ * procs were waiting on this page, we were the only proc woken up.
+ *
+ * if ret != 0, we didn't actually get the lock. We need to
+ * make sure any other waiters don't sleep forever.
+ */
+ if (ret)
+ wake_up_page(page, PG_locked);
+ return ret;
}
/**
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next reply other threads:[~2009-01-14 20:24 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-14 20:23 Chris Mason [this message]
2009-01-14 21:55 ` [PATCH RFC] Lost wakeups from lock_page_killable() Chris Mason
2009-01-14 22:30 ` Matthew Wilcox
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1231964632.8269.47.camel@think.oraclecorp.com \
--to=chris.mason@oracle.com \
--cc=a.p.zijlstra@chello.nl \
--cc=chuck.lever@oracle.com \
--cc=linux-mm@kvack.org \
--cc=matthew@wil.cx \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).