From: David Miller <davem@davemloft.net>
To: torvalds@osdl.org
Cc: miklos@szeredi.hu, rmk+lkml@arm.linux.org.uk,
arjan@infradead.org, linux-kernel@vger.kernel.org,
linux-arch@vger.kernel.org, akpm@osdl.org
Subject: Re: fuse, get_user_pages, flush_anon_page, aliasing caches and all that again
Date: Sun, 31 Dec 2006 13:12:16 -0800 (PST)
Message-ID: <20061231.131216.105428418.davem@davemloft.net>
In-Reply-To: <Pine.LNX.4.64.0612311249240.4473@woody.osdl.org>

From: Linus Torvalds <torvalds@osdl.org>
Date: Sun, 31 Dec 2006 12:58:45 -0800 (PST)

> So there really is two different cases here:
>
> - flush the cache as seen by A PARTICULAR virtual mapping.
>
> This is ptrace, but it's other things like "unmap page from this VM"
> too.
>
> - flush the cache for all possible virtual mappings - simply because we
> don't even know who has it mapped dirty.
>
> And the thing is, the more I think about it, the more I end up
> wondering:
>
> I'm not even sure how valid this is. Whatever path needs to do this is
> likely doing something wrong anyway. If there are multiple possible
> sources of cache conflicts, the thing is a disaster and the end result
> depends on our ordering anyway, so I'd argue that it is just about as
> correct to flush as it is to NOT flush.
>
> So I have this nagging suspicion that "flush_dcache_page()" is always a
> bug when it is about "virtual caches". It should NEVER flush any virtual
> caches, since that whole operation is by necessity something where you
> should be talking about _which_ virtual cache you should flush.

The problem is the aliasing between the _1_ valid user mapping and the
kernel's virtual mapping. That case is very valid, and it is why we
have flush_dcache_page() to begin with.

> So "flush_dcache_page()" is - I think - more validly thought about as
> just DMA coherency (in a system where DMA does not participate in
> _physical_ cache coherency). Not about virtual caches at all.

And I guess that's what you're trying to say here.

I'm beginning to think that Ralf Baechle had the best idea here,
where he recently made it such that platforms could override
kmap() and friends even on non-HIGHMEM configurations.

In theory it's the perfect interface to handle this problem,
you flush exactly where the physical page is made visible to
the kernel for a cpu load/store. All the locations where that
happens are perfectly annotated already with kmap() calls.

So then there are two ways to touch user mapped pages:

1) Inside of a kmap()/kunmap() region.

2) Via copy_user_page()/clear_user_page()

The only core requirement is that the interfaces know the
virtual address the thing is mapped at, and after Ralf's
changes both #1 and #2 do have this information.

Using kmap() even takes care of the PIO "dma" cases where
the CPU reads/writes to the buffer for the data transfer.

Furthermore, an implementation of #1 and #2 can avoid
cache flushing altogether. Just like for HIGHMEM you
have a kmap() TLB mapping area that sets up a mapping at
the correct alias, and returns that pointer from kmap().
Since the alias is good, no cache flush is needed to access
the page in kernel space.

In fact I think this is what Ralf's implementation on MIPS is doing.
And, this is the scheme we use on sparc64 for {copy,clear}_user_page().
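
To make the "correct alias" idea concrete, here is a rough user-space
sketch in C of how a kmap()-style window could pick a kernel address
with the same cache color as the user address. The 8 KB page size, the
16 KB virtually indexed cache way, and every sketch_* name are
assumptions made up for illustration; this is not the MIPS or sparc64
code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry, for illustration only: 8 KB pages and a 16 KB
 * virtually indexed cache way, which gives two cache colors. */
#define SKETCH_PAGE_SHIFT  13
#define SKETCH_PAGE_SIZE   ((uintptr_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_CACHE_WAY   ((uintptr_t)16 * 1024)
#define SKETCH_NR_COLORS   (SKETCH_CACHE_WAY / SKETCH_PAGE_SIZE)

/* Cache color of a virtual address: the cache index bits that sit
 * above the page offset. Two addresses with the same color hit the
 * same cache lines, so they do not alias badly. */
static inline unsigned int sketch_color(uintptr_t vaddr)
{
    return (unsigned int)((vaddr >> SKETCH_PAGE_SHIFT) &
                          (SKETCH_NR_COLORS - 1));
}

/* Hypothetical helper: choose the slot inside a reserved kernel
 * window whose color matches the user mapping. window_base must be
 * aligned to the cache way size so that slot N has color N. */
static inline uintptr_t sketch_kmap_alias(uintptr_t window_base,
                                          uintptr_t user_vaddr)
{
    assert((window_base & (SKETCH_CACHE_WAY - 1)) == 0);
    return window_base +
           (uintptr_t)sketch_color(user_vaddr) * SKETCH_PAGE_SIZE;
}
```

Because the returned address has the same color as the user address,
loads and stores through it land in the same cache lines, which is why
no flush is needed in this scheme.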

Now, going in the opposite direction (kernel page made visible to
userspace for the first time, so you have to kick out the kernel side
mapping from the cache) can be handled either at set_pte_at() time
(this is what sparc64 does) or in update_mmu_cache().
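
The opposite direction can be modelled in a few lines of user-space C.
This is only a toy model under assumed parameters (8 KB pages, two
cache colors); the model_* names and the flush counter are invented
here and do not correspond to any real kernel structure. The point is
just that the kernel-side lines get kicked out once, at the moment the
page is installed in a user PTE, and only if the two addresses alias
badly.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 13   /* assumed 8 KB pages */
#define MODEL_NR_COLORS  2u   /* assumed two cache colors */

/* Hypothetical per-page bookkeeping for the model. */
struct model_page {
    uintptr_t    kernel_vaddr;  /* address the kernel used to touch the page */
    bool         kernel_dirty;  /* kernel alias may hold dirty cache lines */
    unsigned int flushes;       /* counts simulated cache flushes */
};

static unsigned int model_color(uintptr_t vaddr)
{
    return (unsigned int)((vaddr >> MODEL_PAGE_SHIFT) & (MODEL_NR_COLORS - 1));
}

/* The kernel stored to the page through its own mapping. */
static void model_kernel_write(struct model_page *pg)
{
    pg->kernel_dirty = true;
}

/* Stand-in for the set_pte_at() hook: the page becomes visible to
 * userspace at user_vaddr. Flush only if the kernel alias may be
 * dirty and sits at a different color than the user address. */
static void model_set_pte_at(struct model_page *pg, uintptr_t user_vaddr)
{
    if (pg->kernel_dirty &&
        model_color(pg->kernel_vaddr) != model_color(user_vaddr))
        pg->flushes++;          /* a real port would flush the D-cache here */
    pg->kernel_dirty = false;
}
```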

This leaves only one (arguably broken) case of, as you mention, the
user using MAP_SHARED at a set of several incompatible aliases. As
far as I can see, the only sane thing to do in that situation seems to
be to mark the thing non-virtually-cacheable in the user mapping PTEs
if the cpu architecture allows that.
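
As a hedged illustration of what "incompatible aliases" means, the
check below decides whether a set of user addresses for one page can
stay cacheable. It assumes 8 KB pages and two cache colors, and the
demo_* names are hypothetical; how a real port would then clear the
cacheable bits in the PTEs is architecture specific and not shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 13   /* assumed 8 KB pages */
#define DEMO_NR_COLORS  2u   /* assumed two cache colors */

static unsigned int demo_color(uintptr_t vaddr)
{
    return (unsigned int)((vaddr >> DEMO_PAGE_SHIFT) & (DEMO_NR_COLORS - 1));
}

/* Returns true when every mapping of the page has the same cache
 * color, so all of them share cache lines. If this returns false,
 * the mappings alias badly and the user PTEs would have to be made
 * non-virtually-cacheable. */
static bool demo_aliases_compatible(const uintptr_t *vaddrs, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (demo_color(vaddrs[i]) != demo_color(vaddrs[0]))
            return false;
    }
    return true;
}
```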