From: Nishanth Aravamudan <nacc@us.ibm.com>
To: William Lee Irwin III <wli@holomorphy.com>
Cc: kenchen@google.com, linux-mm@kvack.org, agl@us.ibm.com, dwg@au1.ibm.com
Subject: Re: FADV_DONTNEED on hugetlbfs files broken
Date: Sat, 17 Mar 2007 12:37:29 -0700 [thread overview]
Message-ID: <20070317193729.GA11449@us.ibm.com> (raw)
In-Reply-To: <20070317061322.GI8915@holomorphy.com>
On 16.03.2007 [23:13:22 -0700], William Lee Irwin III wrote:
> On Fri, Mar 16, 2007 at 10:13:09PM -0700, Nishanth Aravamudan wrote:
> > git commit 6649a3863232eb2e2f15ea6c622bd8ceacf96d76 "[PATCH] hugetlb:
> > preserve hugetlb pte dirty state" fixed one bug and caused another (or,
> > at least, a regression): FADV_DONTNEED no longer works on hugetlbfs
> > files. git-bisect revealed this commit to be the cause. I'm still trying
> > to figure out what the solution is (but it is also the start of the
> > weekend :) Maybe it's not a bug, but it is a change in behavior, and I
> > don't think it was clear from the commit message.
>
> Well, setting the pages always dirty like that will prevent things
> from dropping them because they think they still need to be written
> back. It is, however, legitimate and/or permissible to ignore
> fadvise() and/or madvise(); they are by definition only advisory. I
> think this is more of a "please add back FADV_DONTNEED support"
> affair.
Yes, that could be :) Sorry if my e-mail indicated I was asking
otherwise. I don't want Ken's commit to be reverted, as that would make
hugepages very nearly unusable on x86 and x86_64. But I had found a
functional change and wanted it to be documented. If hugepages can no
longer be dropped from the page cache, then we should make sure that is
clear (and expected/desired).
Now, even if I call fsync() on the file descriptor, I still don't get
the pages out of the page cache. It seems to me like fsync() would clear
the dirty state -- although perhaps with Ken's patch, writable hugetlbfs
pages will *always* be dirty? I'm still trying to figure out what ever
clears that dirty state (in hugetlbfs or anywhere else). Seems like
hugetlbfs truncates call cancel_dirty_page(), but the comment there
indicates it's only for truncates.
> Perhaps we should ask what ramfs, tmpfs, et al would do. Or, for that
> matter, if they suffer from the same issue as Ken Chen identified for
> hugetlbfs. Perhaps the issue is not hugetlb's dirty state, but
> drop_pagecache_sb() failing to check the bdi for BDI_CAP_NO_WRITEBACK.
> Or perhaps what safety guarantees drop_pagecache_sb() is supposed to
> have or lack.
A good point, and one I hadn't considered. I'm less concerned by the
drop_pagecache_sb() path (which is /proc/sys/vm/drop_caches, yes?),
although it appears that it and the FADV_DONTNEED code both end up
calling into invalidate_mapping_pages(). I'm still pretty new to this
part of the kernel code, and am trying to follow along as best I can.
In any case, if the problem were in drop_pagecache_sb(), it seems like
it wouldn't help the DONTNEED case, since that's a level above the call
to invalidate_mapping_pages().
I'll keep looking through the code and thinking, and if anyone has any
patches they'd like me to test, I'll be glad to.
Thanks,
Nish
--
Nishanth Aravamudan <nacc@us.ibm.com>
IBM Linux Technology Center
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2007-03-17 19:37 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-03-17 5:13 FADV_DONTNEED on hugetlbfs files broken Nishanth Aravamudan
2007-03-17 6:13 ` William Lee Irwin III
2007-03-17 19:37 ` Nishanth Aravamudan [this message]
2007-03-18 2:13 ` William Lee Irwin III
2007-03-18 7:43 ` Ken Chen
2007-03-18 17:27 ` Nishanth Aravamudan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20070317193729.GA11449@us.ibm.com \
--to=nacc@us.ibm.com \
--cc=agl@us.ibm.com \
--cc=dwg@au1.ibm.com \
--cc=kenchen@google.com \
--cc=linux-mm@kvack.org \
--cc=wli@holomorphy.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.