From: Chuck Lever <chuck.lever@oracle.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Andrew Morton <akpm@osdl.org>,
Trond Myklebust <Trond.Myklebust@netapp.com>,
Steve Dickson <steved@redhat.com>,
linux-mm@kvack.org
Subject: Re: Checking page_count(page) in invalidate_complete_page
Date: Mon, 25 Sep 2006 20:01:07 -0400
Message-ID: <45186DC3.7000902@oracle.com>
In-Reply-To: <45186481.1090306@yahoo.com.au>

Nick Piggin wrote:
> Chuck Lever wrote:
>
>> Nick Piggin wrote:
>>
>>> Andrew Morton wrote:
>>> Also, you can't guarantee anything much about its refcount even then
>>> (because it could be on a private reclaim list or pagevec somewhere).
>>>
>>>> We could retry the invalidation a few times, but that stinks.
>>>>
>>>> I think invalidate_inode_pages2() is sufficiently different from (ie:
>>>> stronger than) invalidate_inode_pages() to justify the addition of a
>>>> new
>>>> invalidate_complete_page2(), which skips the page refcount check.
>>>>
>>>
>>> Yes, I think that would be possible using the lock_page in do_no_page
>>> trick.
>>> That would also enable you to invalidate pages that have direct IO going
>>> into them, and other weird and wonderful get_user_pages happenings.
>>>
>>> I haven't thrown away those patches, and I am looking for a
>>> justification
>>> for them because they make the code look nicer ;)
>>>
>>> For 2.6.18.stable, Andrew's idea of checking the return value and retry
>>> might be the only option.
>>
>>
>> I think allowing callers of invalidate_inode_pages2() to get the
>> previous behavior is reasonable here. There are only 2 of them: v9fs
>> and the NFS client.
>
>
>
> That still reintroduces the page fault race, but if the dumb
> check'n'retry is
> no good then it may be OK for 2.6.18.stable, considering the page fault
> race
> is much less common than the reclaim one. Not sure, not my call.

The NFS client uses invalidate_inode_pages2 for files, symlinks, and
directories. The latter two won't have the do_no_page race since you
can't map those types of file objects.

For NFS files, the do_no_page race does exist (at least theoretically --
I've never seen a report of such a problem). Most people are not brave
enough to use shared mapped files on NFS... so that race may be very
rare indeed.

> Upstream, it should be fixed properly without re-introducing bugs along the
> way.

Of course... thanks for sending the history.

I'm wondering aloud here, because I'm a VM neophyte. I'm not sure how
common the reclaim race is in real environments. For instance, the test
I'm running is pretty simple, and I run it just after the client has
rebooted. Why is try_to_free_pages being called here? If I knew that I
could probably make a better guess about how common this race is.

Also, the last get_page() call is from pagevec_strip(). Why do we need
to try to strip buffers off of a page that is guaranteed not to have any
buffers?