From: Jan Kara <jack@suse.cz>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>,
linux-mm@kvack.org, Martin Schwidefsky <schwidefsky@de.ibm.com>,
Mel Gorman <mgorman@suse.de>,
linux-s390@vger.kernel.org, Hugh Dickins <hughd@google.com>
Subject: Re: [PATCH] mm: Fix XFS oops due to dirty pages without buffers on s390
Date: Tue, 23 Oct 2012 12:21:53 +0200 [thread overview]
Message-ID: <20121023102153.GD3064@quack.suse.cz> (raw)
In-Reply-To: <20121022123852.a4bd5f2a.akpm@linux-foundation.org>
On Mon 22-10-12 12:38:52, Andrew Morton wrote:
> On Mon, 22 Oct 2012 17:06:46 +0200
> Jan Kara <jack@suse.cz> wrote:
>
> > On s390 any write to a page (even from kernel itself) sets architecture
> > specific page dirty bit. Thus when a page is written to via buffered write, HW
> > dirty bit gets set and when we later map and unmap the page, page_remove_rmap()
> > finds the dirty bit and calls set_page_dirty().
> >
> > Dirtying of a page which shouldn't be dirty can cause all sorts of problems to
> > filesystems. The bug we observed in practice is that buffers from the page get
> > freed, so when the page gets later marked as dirty and writeback writes it, XFS
> > crashes due to an assertion BUG_ON(!PagePrivate(page)) in page_buffers() called
> > from xfs_count_page_state().
> >
> > Similar problem can also happen when zero_user_segment() call from
> > xfs_vm_writepage() (or block_write_full_page() for that matter) set the
> > hardware dirty bit during writeback, later buffers get freed, and then page
> > unmapped.
> >
> > Fix the issue by ignoring s390 HW dirty bit for page cache pages of mappings
> > with mapping_cap_account_dirty(). This is safe because for such mappings when a
> > page gets marked as writeable in PTE it is also marked dirty in do_wp_page() or
> > do_page_fault(). When the dirty bit is cleared by clear_page_dirty_for_io(),
> > the page gets writeprotected in page_mkclean(). So pagecache page is writeable
> > if and only if it is dirty.
> >
> > Thanks to Hugh Dickins <hughd@google.com> for pointing out mapping has to have
> > mapping_cap_account_dirty() for things to work and proposing a cleaned up
> > variant of the patch.
> >
> > The patch has survived about two hours of running fsx-linux on tmpfs while
> > heavily swapping and several days of running on out build machines where the
> > original problem was triggered.
>
> That seems a fairly serious problem. To which kernel version(s) should
> we apply the fix?
Well, XFS will crash starting from 2.6.36 kernel where the assertion was
added. Previously XFS just silently added buffers (as other filesystems do
it) and wrote / redirtied the page (unnecessarily). So looking into
maintained -stable branches I think pushing the patch to -stable from 3.0
on should be enough.
> > diff --git a/mm/rmap.c b/mm/rmap.c
>
> It's a bit surprising that none of the added comments mention the s390
> pte-dirtying oddity. I don't see an obvious place to mention this, but
> I for one didn't know about this and it would be good if we could
> capture the info _somewhere_?
As Hugh says, the comment before page_test_and_clear_dirty() is somewhat
updated. But do you mean recording somewhere the catch that s390 HW dirty
bit gets set also whenever we write to a page from kernel? I guess we could
add that also to the comment before page_test_and_clear_dirty() in
page_remove_rmap() and also before definition of
page_test_and_clear_dirty(). So most people that will add / remove these
calls will be warned. OK?
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2012-10-23 10:21 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-10-22 15:06 [PATCH] mm: Fix XFS oops due to dirty pages without buffers on s390 Jan Kara
2012-10-22 19:38 ` Andrew Morton
2012-10-23 4:40 ` Hugh Dickins
2012-10-23 10:21 ` Jan Kara [this message]
2012-10-23 21:56 ` Andrew Morton
2012-10-24 8:30 ` Martin Schwidefsky
2012-10-25 20:01 ` Jan Kara
2012-12-14 8:45 ` Martin Schwidefsky
2012-12-17 23:31 ` Hugh Dickins
2012-12-18 7:30 ` Martin Schwidefsky
-- strict thread matches above, loose matches on Subject: below --
2012-10-01 16:26 Jan Kara
2012-10-01 16:26 ` Jan Kara
2012-10-01 16:26 ` Jan Kara
2012-10-08 14:28 ` Mel Gorman
2012-10-08 14:28 ` Mel Gorman
2012-10-08 14:28 ` Mel Gorman
2012-10-09 4:24 ` Hugh Dickins
2012-10-09 4:24 ` Hugh Dickins
2012-10-09 4:24 ` Hugh Dickins
2012-10-09 8:18 ` Martin Schwidefsky
2012-10-09 8:18 ` Martin Schwidefsky
2012-10-09 8:18 ` Martin Schwidefsky
2012-10-09 23:21 ` Hugh Dickins
2012-10-09 23:21 ` Hugh Dickins
2012-10-09 23:21 ` Hugh Dickins
2012-10-10 21:57 ` Hugh Dickins
2012-10-10 21:57 ` Hugh Dickins
2012-10-10 21:57 ` Hugh Dickins
2012-10-19 14:38 ` Martin Schwidefsky
2012-10-19 14:38 ` Martin Schwidefsky
2012-10-19 14:38 ` Martin Schwidefsky
2012-10-09 9:32 ` Mel Gorman
2012-10-09 9:32 ` Mel Gorman
2012-10-09 9:32 ` Mel Gorman
2012-10-09 23:00 ` Hugh Dickins
2012-10-09 23:00 ` Hugh Dickins
2012-10-09 23:00 ` Hugh Dickins
2012-10-09 16:21 ` Jan Kara
2012-10-09 16:21 ` Jan Kara
2012-10-09 16:21 ` Jan Kara
2012-10-10 2:19 ` Hugh Dickins
2012-10-10 2:19 ` Hugh Dickins
2012-10-10 2:19 ` Hugh Dickins
2012-10-10 8:55 ` Jan Kara
2012-10-10 8:55 ` Jan Kara
2012-10-10 8:55 ` Jan Kara
2012-10-10 21:28 ` Hugh Dickins
2012-10-10 21:28 ` Hugh Dickins
2012-10-10 21:28 ` Hugh Dickins
2012-10-11 7:42 ` Martin Schwidefsky
2012-10-11 7:42 ` Martin Schwidefsky
2012-10-11 7:42 ` Martin Schwidefsky
2012-10-10 21:56 ` Dave Chinner
2012-10-10 21:56 ` Dave Chinner
2012-10-10 21:56 ` Dave Chinner
2012-10-11 7:44 ` Martin Schwidefsky
2012-10-11 7:44 ` Martin Schwidefsky
2012-10-11 7:44 ` Martin Schwidefsky
2012-10-17 0:43 ` Jan Kara
2012-10-17 0:43 ` Jan Kara
2012-10-17 0:43 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20121023102153.GD3064@quack.suse.cz \
--to=jack@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=hughd@google.com \
--cc=linux-mm@kvack.org \
--cc=linux-s390@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=schwidefsky@de.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.