From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
mgorman@suse.de, linux-mm@kvack.org
Subject: Re: Ext4 stack trace with savedwrite patches
Date: Wed, 1 Mar 2017 15:53:52 +0530
Message-ID: <d6569967-fecd-2708-9e18-cf0964c362bd@linux.vnet.ibm.com>
In-Reply-To: <20170301094913.GB20512@quack2.suse.cz>
On Wednesday 01 March 2017 03:19 PM, Jan Kara wrote:
> Hi,
>
> On Fri 24-02-17 19:23:52, Aneesh Kumar K.V wrote:
>> I am hitting this while running stress test with the saved write patch
>> series. I guess we are missing a set page dirty some where. I will
>> continue to debug this, but if you have any suggestion let me know.
> <snip>
>
> So this warning can happen when page got dirtied but ->page_mkwrite() was
> not called. I don't know details of how autonuma works but a quick look
> suggests that autonuma can also do numa hinting faults for file pages.
> So the following seems to be possible:
>
> Autonuma decides to check for accesses to a mapped shared file page that is
> dirty. pte_present gets cleared, pte_write stays set (due to logic
> introduced in commit b191f9b106 "mm: numa: preserve PTE write permissions
> across a NUMA hinting fault"). Then page writeback happens, page_mkclean()
> is called to write-protect the page. However page_check_address() returns
> NULL for the PTE (__page_check_address() returns NULL for !pte_present
> PTEs) so we don't clear pte_write bit in page_mkclean_one().

Even though we cleared _PAGE_PRESENT, a pte_present() check returns true
for a NUMA fault pte. The problem with the savedwrite patch series that I
quoted in the original mail was that pte_write() was checking only
_PAGE_WRITE, whereas the NUMA fault path stashed the write bit as the
savedwrite bit. Hence page_mkclean() was skipping those ptes.

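For illustration only, here is a toy user-space model of that sequence. It is a sketch under stated assumptions, not kernel code: the `toy_` names and the two-bit layout are hypothetical, the present bit is not modelled, and the real powerpc encoding of the savedwrite bit differs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, chosen for the example; not the real pte bits. */
#define TOY_PAGE_WRITE      (1u << 0)
#define TOY_PAGE_SAVEDWRITE (1u << 1)

typedef uint32_t toy_pte_t;

/* pte_write() as in the problematic series: looks at the write bit only. */
bool toy_pte_write(toy_pte_t pte)
{
	return (pte & TOY_PAGE_WRITE) != 0;
}

/* NUMA hinting: move the write permission into the savedwrite bit. */
toy_pte_t toy_make_numa_hint(toy_pte_t pte)
{
	if (pte & TOY_PAGE_WRITE)
		pte = (pte & ~TOY_PAGE_WRITE) | TOY_PAGE_SAVEDWRITE;
	return pte;
}

/*
 * Writeback: write-protect the pte only if pte_write() says it is
 * writable. Returns true if the pte was actually write-protected.
 */
bool toy_mkclean(toy_pte_t *pte)
{
	if (!toy_pte_write(*pte))
		return false;	/* skipped: looks read-only already */
	*pte &= ~TOY_PAGE_WRITE;
	return true;
}

/* NUMA fault: restore the write permission from the savedwrite bit. */
toy_pte_t toy_numa_fault(toy_pte_t pte)
{
	if (pte & TOY_PAGE_SAVEDWRITE)
		pte = (pte & ~TOY_PAGE_SAVEDWRITE) | TOY_PAGE_WRITE;
	return pte;
}

/* Runs hint -> writeback -> fault; returns whether the pte ends up writable. */
bool toy_stale_write_after_writeback(void)
{
	toy_pte_t pte = TOY_PAGE_WRITE;

	pte = toy_make_numa_hint(pte);
	toy_mkclean(&pte);		/* skips the pte in this model */
	pte = toy_numa_fault(pte);
	return toy_pte_write(pte);
}
```

In this model `toy_stale_write_after_writeback()` returns true: the mapping is writable again after writeback although nothing write-protected it, which is the situation the filesystem does not expect.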
> Sometime later
> a process looks at the page through mmap, takes NUMA fault and
> do_numa_page() reestablishes a writeable mapping of the page although the
> filesystem does not expect there to be one and funny things happen
> afterwards...
>
> I'll defer to more mm-savvy people to decide how this should be fixed. My
> naive understanding is that page_mkclean_one() should clear the pte_write
> bit even for pages that are undergoing NUMA probation but I'm not sure
> about a preferred way to achieve that...
>
>
Yes, I found that and finally decided that, instead of fixing all those code
paths, we can update pte_write() to handle the autonuma-preserved write bit.
https://lkml.kernel.org/r/1488203787-17849-2-git-send-email-aneesh.kumar@linux.vnet.ibm.com
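As a rough sketch of that idea (not the patch itself), the toy model below makes the write check treat a stashed write bit as writable. The `fix_` names and the bit layout are hypothetical, and clearing both bits in the write-protect helper is an assumption made for this example; see the linked patch for the real change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, chosen for the example; not the real pte bits. */
#define FIX_PAGE_WRITE      (1u << 0)
#define FIX_PAGE_SAVEDWRITE (1u << 1)

typedef uint32_t fix_pte_t;

/* pte_write() that also honours the write bit stashed by NUMA hinting. */
bool fix_pte_write(fix_pte_t pte)
{
	return (pte & (FIX_PAGE_WRITE | FIX_PAGE_SAVEDWRITE)) != 0;
}

/* Assumption of this sketch: write-protecting drops the stashed bit too. */
fix_pte_t fix_pte_wrprotect(fix_pte_t pte)
{
	return pte & ~(FIX_PAGE_WRITE | FIX_PAGE_SAVEDWRITE);
}

/* NUMA hinting: move the write permission into the savedwrite bit. */
fix_pte_t fix_make_numa_hint(fix_pte_t pte)
{
	if (pte & FIX_PAGE_WRITE)
		pte = (pte & ~FIX_PAGE_WRITE) | FIX_PAGE_SAVEDWRITE;
	return pte;
}

/* Writeback: returns true if the pte was actually write-protected. */
bool fix_mkclean(fix_pte_t *pte)
{
	if (!fix_pte_write(*pte))
		return false;
	*pte = fix_pte_wrprotect(*pte);
	return true;
}

/* NUMA fault: restore the write permission from the savedwrite bit. */
fix_pte_t fix_numa_fault(fix_pte_t pte)
{
	if (pte & FIX_PAGE_SAVEDWRITE)
		pte = (pte & ~FIX_PAGE_SAVEDWRITE) | FIX_PAGE_WRITE;
	return pte;
}

/* Runs hint -> writeback -> fault; returns whether the pte ends up writable. */
bool fix_writable_after_writeback(void)
{
	fix_pte_t pte = FIX_PAGE_WRITE;

	pte = fix_make_numa_hint(pte);
	fix_mkclean(&pte);		/* now sees the stashed write bit */
	pte = fix_numa_fault(pte);
	return fix_pte_write(pte);
}
```

Here the writeback step no longer skips the NUMA-hinted pte, so `fix_writable_after_writeback()` returns false and callers that test pte_write() need no special casing.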
-aneesh