All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nick Piggin <piggin@cyberone.com.au>
To: Andrew Morton <akpm@osdl.org>
Cc: a.p.zijlstra@chello.nl, clameter@sgi.com, torvalds@osdl.org,
	ak@suse.de, rohitseth@google.com, mbligh@google.com,
	hugh@veritas.com, riel@redhat.com, andrea@suse.de,
	arjan@infradead.org, apw@shadowen.org, mel@csn.ul.ie,
	marcelo@kvack.org, anton@samba.org, paulmck@us.ibm.com,
	linux-mm@kvack.org
Subject: Re: [RFC][PATCH 1/3] tracking dirty pages in shared mappings -V4
Date: Fri, 12 May 2006 15:05:04 +1000	[thread overview]
Message-ID: <44641780.8020608@cyberone.com.au> (raw)
In-Reply-To: <20060511213045.32b41aa6.akpm@osdl.org>

Andrew Morton wrote:

>Nick Piggin <piggin@cyberone.com.au> wrote:
>
>> >So let's see.  We take a write fault, we mark the page dirty then we return
>> >to userspace which will proceed with the write and will mark the pte dirty.
>> >
>> >Later, the VM will write the page out.
>> >
>> >Later still, the pte will get cleaned by reclaim or by munmap or whatever
>> >and the page will be marked dirty and the page will again be written out. 
>> >Potentially needlessly.
>> >
>>
>> page_wrprotect also marks the page clean,
>>
>
>Oh.  I missed that when reading the comment which describes
>page_wrprotect() (I do go on).
>

I guess page_wrprotect isn't a good name, because it would suggest it
can be used in situations where it would cause data loss.

page_mappings_mkclean or page_mkclean might be better (the wrprotect
is just a side effect of the fact that clean,shared mappings are
readonly).

>
>>so this window is very small.
>> The window is that the fault path might set_page_dirty, then throttle
>> on writeout, and the page gets written out before it really gets dirtied
>> by the application (which then has to fault again).
>>
>
>: int test_clear_page_dirty(struct page *page)
>: {
>: 	struct address_space *mapping = page_mapping(page);
>: 	unsigned long flags;
>: 
>: 	if (mapping) {
>: 		write_lock_irqsave(&mapping->tree_lock, flags);
>: 		if (TestClearPageDirty(page)) {
>: 			radix_tree_tag_clear(&mapping->page_tree,
>: 						page_index(page),
>: 						PAGECACHE_TAG_DIRTY);
>: 			write_unlock_irqrestore(&mapping->tree_lock, flags);
>: 			/*
>: 			 * We can continue to use `mapping' here because the
>: 			 * page is locked, which pins the address_space
>: 			 */
>
>So if userspace modifies the page right here, and marks the pte dirty.
>
>: 			if (mapping_cap_account_dirty(mapping)) {
>: 				page_wrprotect(page);
>
>We just lost that pte dirty bit, and hence the user's data.
>
>: 				dec_page_state(nr_dirty);
>: 			}
>: 			return 1;
>: 		}
>: 		write_unlock_irqrestore(&mapping->tree_lock, flags);
>: 		return 0;
>: 	}
>: 	return TestClearPageDirty(page);
>: }
>: 
>
>Which is just the sort of subtle and nasty problem I was referring to...
>
>If that's correct then I guess we need the
>
>                if (ptep_clear_flush_dirty(vma, addr, pte) ||
>                                page_test_and_clear_dirty(page))
>                        ret += set_page_dirty(page);
>
>treatment in page_wrprotect().
>
>Now I suppose it's not really a dataloss race, because in practice the
>kernel is about to write this page to backing store anwyay.  I guess.  I
>cannot immediately think of any clear_page_dirty() callers for whom that
>won't be true.
>

If they do a clear_page_dirty, then fail to clean the page, then fail to
subsequently set_page_dirty again, it is a data-loss situation anyway.

If they do set_page_dirty (which they have to, for correctness), then the
page has PG_dirty set again; true dirty bits are moved out of the ptes,
but that's no problem.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2006-05-12  5:05 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-05-05 20:35 [RFC][PATCH] tracking dirty pages in shared mappings Peter Zijlstra
2006-05-06 13:18 ` Nick Piggin
2006-05-06 13:34   ` Peter Zijlstra
2006-05-06 13:47     ` Nick Piggin
2006-05-06 15:29       ` Peter Zijlstra
2006-05-07  0:40         ` Nick Piggin
2006-05-07  3:43           ` Nick Piggin
2006-05-08  6:43         ` Christoph Lameter
2006-05-08  7:23           ` Peter Zijlstra
2006-05-08 19:20           ` [RFC][PATCH 1/2] tracking dirty pages in shared mappings -V3 Peter Zijlstra
2006-05-09  5:41             ` Christoph Lameter
2006-05-09  6:06               ` Peter Zijlstra
2006-05-09 20:44               ` [RFC][PATCH 1/3] tracking dirty pages in shared mappings -V4 Peter Zijlstra
2006-05-09 20:52                 ` Peter Chubb
2006-05-09 20:55                   ` Martin Bligh
2006-05-09 22:56                     ` Brian Twichell
2006-05-10  0:24                     ` Linus Torvalds
2006-05-10  0:29                       ` Nick Piggin
2006-05-10  1:24                         ` Linus Torvalds
2006-05-11 15:02                 ` Andrew Morton
2006-05-11 16:39                   ` Andy Whitcroft
2006-05-11 22:52                   ` Christoph Lameter
2006-05-11 23:30                     ` Linus Torvalds
2006-05-11 23:44                       ` Andrew Morton
2006-05-12  0:10                         ` Linus Torvalds
2006-05-12  8:07                         ` Andy Whitcroft
2006-05-12 14:25                           ` Martin J. Bligh
2006-05-14 15:58                         ` Andy Whitcroft
2006-05-12  1:51                   ` Nick Piggin
2006-05-12  4:30                     ` Andrew Morton
2006-05-12  5:05                       ` Nick Piggin [this message]
2006-05-12  7:06                       ` Peter Zijlstra
2006-05-12  8:04                         ` Nick Piggin
2006-05-12  8:52                           ` Peter Zijlstra
2006-05-12  8:07                         ` Nick Piggin
2006-05-12  4:51                   ` Christoph Lameter
2006-05-09 20:44               ` [RFC][PATCH 2/3] throttle writers of shared mappings Peter Zijlstra
2006-05-09 22:54                 ` Nick Piggin
2006-05-09 22:55                   ` Nick Piggin
2006-05-10  6:25                     ` Peter Zijlstra
2006-05-09 20:44               ` [RFC][PATCH 3/3] optimize follow_pages() Peter Zijlstra
2006-05-10  6:30                 ` Peter Zijlstra
2006-05-08 19:24           ` [RFC][PATCH 2/2] throttle writers of shared mappings Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=44641780.8020608@cyberone.com.au \
    --to=piggin@cyberone.com.au \
    --cc=a.p.zijlstra@chello.nl \
    --cc=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=andrea@suse.de \
    --cc=anton@samba.org \
    --cc=apw@shadowen.org \
    --cc=arjan@infradead.org \
    --cc=clameter@sgi.com \
    --cc=hugh@veritas.com \
    --cc=linux-mm@kvack.org \
    --cc=marcelo@kvack.org \
    --cc=mbligh@google.com \
    --cc=mel@csn.ul.ie \
    --cc=paulmck@us.ibm.com \
    --cc=riel@redhat.com \
    --cc=rohitseth@google.com \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.