* flush_dcache_page does too much?
@ 2010-01-18 13:13 anfei
2010-01-18 13:33 ` Russell King - ARM Linux
0 siblings, 1 reply; 13+ messages in thread
From: anfei @ 2010-01-18 13:13 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
I'm studying the cache alias problem especially of VIPT, I found
function flush_dcache_page() does much more operations on ARM than MIPS.
Can we not flush the userspace mappings and icache, just like MIPS?
Are the cache more consistent with these operations?
As far as I know, flush_dcache_page is usually used as this:
kmap_atomic(page, ...);
write the page;
flush_dcache_page(page);
kunmap_atomic(...);
called in the path of write()/..., but since mmap() + write() is not
ensured to work (even on ARM currently), it's the userspace to consider
msync()/munmap(), it looks okay without flush the userspace mappings
here. Other cases seem the same if the userspace takes charge of the
cache problem.
Thanks,
Anfei.
^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much?
2010-01-18 13:13 flush_dcache_page does too much? anfei
@ 2010-01-18 13:33 ` Russell King - ARM Linux
2010-01-18 13:54 ` anfei
0 siblings, 1 reply; 13+ messages in thread
From: Russell King - ARM Linux @ 2010-01-18 13:33 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 18, 2010 at 09:13:46PM +0800, anfei wrote:
> I'm studying the cache alias problem especially of VIPT, I found
> function flush_dcache_page() does much more operations on ARM than MIPS.
> Can we not flush the userspace mappings and icache, just like MIPS?
> Are the cache more consistent with these operations?
>
> As far as I know, flush_dcache_page is usually used as this:
> kmap_atomic(page, ...);
> write the page;
> flush_dcache_page(page);
> kunmap_atomic(...);
> called in the path of write()/..., but since mmap() + write() is not
> ensured to work (even on ARM currently), it's the userspace to consider
> msync()/munmap(), it looks okay without flush the userspace mappings
> here. Other cases seem the same if the userspace takes charge of the
> cache problem.
On VIPT on ARM, flush_dcache_page() flushes:
1. the direct kernel mapping and aliases of that (which read()/write()
will touch.)
2. the user aliases, which may not be coherent with the direct kernel
mapping.
It is unsafe to avoid dealing with any of those - and doing so will cause
shared mappings to be incoherent with filesystem IO.
^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much?
2010-01-18 13:33 ` Russell King - ARM Linux
@ 2010-01-18 13:54 ` anfei
2010-01-18 14:00 ` Russell King - ARM Linux
0 siblings, 1 reply; 13+ messages in thread
From: anfei @ 2010-01-18 13:54 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 18, 2010 at 01:33:04PM +0000, Russell King - ARM Linux wrote:
> On Mon, Jan 18, 2010 at 09:13:46PM +0800, anfei wrote:
> > I'm studying the cache alias problem especially of VIPT, I found
> > function flush_dcache_page() does much more operations on ARM than MIPS.
> > Can we not flush the userspace mappings and icache, just like MIPS?
> > Are the cache more consistent with these operations?
> >
> > As far as I know, flush_dcache_page is usually used as this:
> > kmap_atomic(page, ...);
> > write the page;
> > flush_dcache_page(page);
> > kunmap_atomic(...);
> > called in the path of write()/..., but since mmap() + write() is not
> > ensured to work (even on ARM currently), it's the userspace to consider
> > msync()/munmap(), it looks okay without flush the userspace mappings
> > here. Other cases seem the same if the userspace takes charge of the
> > cache problem.
>
> On VIPT on ARM, flush_dcache_page() flushes:
>
> 1. the direct kernel mapping and aliases of that (which read()/write()
> will touch.)
>
> 2. the user aliases, which may not be coherent with the direct kernel
> mapping.
>
> It is unsafe to avoid dealing with any of those - and doing so will cause
> shared mappings to be incoherent with filesystem IO.
Do you mean this implementation can ensure the coherence between write
and shared mmapings? But it's easy to reproduce the alias problem by
this simple testcase (w/o error handler) on omap2430 with VIPT cache:
---
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
int main(void)
{
int fd;
int *addr;
int tmp;
int val = 0x11111111;
fd = open("abc", O_RDWR);
addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
*(addr+0) = 0x44444444;
tmp = *(addr+0);
*(addr+1) = 0x77777777;
write(fd, &val, sizeof(int));
close(fd);
return 0;
}
^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much?
2010-01-18 13:54 ` anfei
@ 2010-01-18 14:00 ` Russell King - ARM Linux
2010-01-18 14:15 ` anfei
0 siblings, 1 reply; 13+ messages in thread
From: Russell King - ARM Linux @ 2010-01-18 14:00 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote:
> Do you mean this implementation can ensure the coherence between write
> and shared mmapings? But it's easy to reproduce the alias problem by
> this simple testcase (w/o error handler) on omap2430 with VIPT cache:
Your program doesn't do anything to identify any problem. You don't
even say _what_ problem you see with this program.
If you have a specific case which fails, please show the problem, please
describe exactly the behaviour that you see, and what you expect to see.
^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much?
2010-01-18 14:00 ` Russell King - ARM Linux
@ 2010-01-18 14:15 ` anfei
2010-01-18 14:44 ` Russell King - ARM Linux
0 siblings, 1 reply; 13+ messages in thread
From: anfei @ 2010-01-18 14:15 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote:
> On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote:
> > Do you mean this implementation can ensure the coherence between write
> > and shared mmapings? But it's easy to reproduce the alias problem by
> > this simple testcase (w/o error handler) on omap2430 with VIPT cache:
>
> Your program doesn't do anything to identify any problem. You don't
> even say _what_ problem you see with this program.
>
Sorry for that.
> If you have a specific case which fails, please show the problem, please
> describe exactly the behaviour that you see, and what you expect to see.
The steps as these:
$ dd if=/dev/zero of=abc bs=4k count=1
$ ./a.out # this program
$ xxd abc | head -1 # check the result
I want to see it's always 0x11111111 0x77777777 at the beginning of file
"abc", but the result is not (the log is not at hand currently), but I
remember it like this after several runs:
0x11111111 0x77777777
0x44444444 0x77777777
0x11111111 0x77777777
0x44444444 0x77777777
0x44444444 0x77777777
If I add munmap()/msync() before write(), I can see it always as
expected (0x11111111 0x77777777).
^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much?
2010-01-18 14:15 ` anfei
@ 2010-01-18 14:44 ` Russell King - ARM Linux
2010-01-18 14:53 ` anfei
2010-01-18 14:57 ` anfei
0 siblings, 2 replies; 13+ messages in thread
From: Russell King - ARM Linux @ 2010-01-18 14:44 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 18, 2010 at 10:15:30PM +0800, anfei wrote:
> On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote:
> > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote:
> > > Do you mean this implementation can ensure the coherence between write
> > > and shared mmapings? But it's easy to reproduce the alias problem by
> > > this simple testcase (w/o error handler) on omap2430 with VIPT cache:
> >
> > Your program doesn't do anything to identify any problem. You don't
> > even say _what_ problem you see with this program.
> >
> Sorry for that.
>
> > If you have a specific case which fails, please show the problem, please
> > describe exactly the behaviour that you see, and what you expect to see.
Are you using a write allocate cache?
^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much?
2010-01-18 14:44 ` Russell King - ARM Linux
@ 2010-01-18 14:53 ` anfei
2010-01-18 14:57 ` anfei
1 sibling, 0 replies; 13+ messages in thread
From: anfei @ 2010-01-18 14:53 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 18, 2010 at 02:44:18PM +0000, Russell King - ARM Linux wrote:
> On Mon, Jan 18, 2010 at 10:15:30PM +0800, anfei wrote:
> > On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote:
> > > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote:
> > > > Do you mean this implementation can ensure the coherence between write
> > > > and shared mmapings? But it's easy to reproduce the alias problem by
> > > > this simple testcase (w/o error handler) on omap2430 with VIPT cache:
> > >
> > > Your program doesn't do anything to identify any problem. You don't
> > > even say _what_ problem you see with this program.
> > >
> > Sorry for that.
> >
> > > If you have a specific case which fails, please show the problem, please
> > > describe exactly the behaviour that you see, and what you expect to see.
>
> Are you using a write allocate cache?
I'm not sure, I can get more information tomorrow.
^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much?
2010-01-18 14:44 ` Russell King - ARM Linux
2010-01-18 14:53 ` anfei
@ 2010-01-18 14:57 ` anfei
2010-01-18 15:01 ` Russell King - ARM Linux
1 sibling, 1 reply; 13+ messages in thread
From: anfei @ 2010-01-18 14:57 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 18, 2010 at 02:44:18PM +0000, Russell King - ARM Linux wrote:
> On Mon, Jan 18, 2010 at 10:15:30PM +0800, anfei wrote:
> > On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote:
> > > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote:
> > > > Do you mean this implementation can ensure the coherence between write
> > > > and shared mmapings? But it's easy to reproduce the alias problem by
> > > > this simple testcase (w/o error handler) on omap2430 with VIPT cache:
> > >
> > > Your program doesn't do anything to identify any problem. You don't
> > > even say _what_ problem you see with this program.
> > >
> > Sorry for that.
> >
> > > If you have a specific case which fails, please show the problem, please
> > > describe exactly the behaviour that you see, and what you expect to see.
>
> Are you using a write allocate cache?
I guess not, because this line is neccessary to reproduce the issue:
tmp = *(addr+0);
If it's write allocate, this line may not be neccessary, since it's just
a read (and cache the data).
^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much?
2010-01-18 14:57 ` anfei
@ 2010-01-18 15:01 ` Russell King - ARM Linux
2010-01-19 0:16 ` anfei
0 siblings, 1 reply; 13+ messages in thread
From: Russell King - ARM Linux @ 2010-01-18 15:01 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 18, 2010 at 10:57:31PM +0800, anfei wrote:
> On Mon, Jan 18, 2010 at 02:44:18PM +0000, Russell King - ARM Linux wrote:
> > On Mon, Jan 18, 2010 at 10:15:30PM +0800, anfei wrote:
> > > On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote:
> > > > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote:
> > > > > Do you mean this implementation can ensure the coherence between write
> > > > > and shared mmapings? But it's easy to reproduce the alias problem by
> > > > > this simple testcase (w/o error handler) on omap2430 with VIPT cache:
> > > >
> > > > Your program doesn't do anything to identify any problem. You don't
> > > > even say _what_ problem you see with this program.
> > > >
> > > Sorry for that.
> > >
> > > > If you have a specific case which fails, please show the problem, please
> > > > describe exactly the behaviour that you see, and what you expect to see.
> >
> > Are you using a write allocate cache?
>
> I guess not, because this line is neccessary to reproduce the issue:
> tmp = *(addr+0);
> If it's write allocate, this line may not be neccessary, since it's just
> a read (and cache the data).
It makes no sense then - without write allocate, writes will go straight
through to the underlying page, bypassing the cache.
Sorry, no idea.
^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much?
2010-01-18 15:01 ` Russell King - ARM Linux
@ 2010-01-19 0:16 ` anfei
2010-01-19 13:05 ` anfei
0 siblings, 1 reply; 13+ messages in thread
From: anfei @ 2010-01-19 0:16 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 18, 2010 at 03:01:52PM +0000, Russell King - ARM Linux wrote:
> On Mon, Jan 18, 2010 at 10:57:31PM +0800, anfei wrote:
> > On Mon, Jan 18, 2010 at 02:44:18PM +0000, Russell King - ARM Linux wrote:
> > > On Mon, Jan 18, 2010 at 10:15:30PM +0800, anfei wrote:
> > > > On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote:
> > > > > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote:
> > > > > > Do you mean this implementation can ensure the coherence between write
> > > > > > and shared mmapings? But it's easy to reproduce the alias problem by
> > > > > > this simple testcase (w/o error handler) on omap2430 with VIPT cache:
> > > > >
> > > > > Your program doesn't do anything to identify any problem. You don't
> > > > > even say _what_ problem you see with this program.
> > > > >
> > > > Sorry for that.
> > > >
> > > > > If you have a specific case which fails, please show the problem, please
> > > > > describe exactly the behaviour that you see, and what you expect to see.
> > >
> > > Are you using a write allocate cache?
> >
> > I guess not, because this line is neccessary to reproduce the issue:
> > tmp = *(addr+0);
> > If it's write allocate, this line may not be neccessary, since it's just
> > a read (and cache the data).
>
> It makes no sense then - without write allocate, writes will go straight
> through to the underlying page, bypassing the cache.
>
Because of the read, the write is cache hitted too even on read allocate:
*(addr+0) = 0x44444444; <- bypass the cache
tmp = *(addr+0); <- read allocate
*(addr+1) = 0x77777777; <- same cacheline, cache hitted
So the two write values are cached, then the sequence in sys_write
cannot guarantee the coherence:
kmap_atomic(page);
copy to page;
kunmap_atomic(page);
flush_dcache_page(page);
It should call flush_dcache_page()@the beginning too in order to
flush the shared mapping. Actually, I think it's better to split this
function into two, such as:
flush_dcache_user_page(page);
copy to page;
kunmap_atomic(page);
flush_dcache_kern_page(page);
And this patch seems to fix it, any other fs doesn't call it need to add
that too.
diff --git a/mm/filemap.c b/mm/filemap.c
index 96ac6b0..07056fb 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2196,6 +2196,9 @@ again:
if (unlikely(status))
break;
+ if (mapping_writably_mapped(mapping))
+ flush_dcache_page(page);
+
pagefault_disable();
copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
pagefault_enable();
^ permalink raw reply related [flat|nested] 13+ messages in thread
* flush_dcache_page does too much?
2010-01-19 0:16 ` anfei
@ 2010-01-19 13:05 ` anfei
2010-01-19 17:44 ` Russell King - ARM Linux
0 siblings, 1 reply; 13+ messages in thread
From: anfei @ 2010-01-19 13:05 UTC (permalink / raw)
To: linux-arm-kernel
Hi Russell,
On Tue, Jan 19, 2010 at 08:16:36AM +0800, anfei wrote:
> On Mon, Jan 18, 2010 at 03:01:52PM +0000, Russell King - ARM Linux wrote:
> > On Mon, Jan 18, 2010 at 10:57:31PM +0800, anfei wrote:
> > > On Mon, Jan 18, 2010 at 02:44:18PM +0000, Russell King - ARM Linux wrote:
> > > > On Mon, Jan 18, 2010 at 10:15:30PM +0800, anfei wrote:
> > > > > On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote:
> > > > > > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote:
> > > > > > > Do you mean this implementation can ensure the coherence between write
> > > > > > > and shared mmapings? But it's easy to reproduce the alias problem by
> > > > > > > this simple testcase (w/o error handler) on omap2430 with VIPT cache:
> > > > > >
> > > > > > Your program doesn't do anything to identify any problem. You don't
> > > > > > even say _what_ problem you see with this program.
> > > > > >
> > > > > Sorry for that.
> > > > >
> > > > > > If you have a specific case which fails, please show the problem, please
> > > > > > describe exactly the behaviour that you see, and what you expect to see.
> > > >
> > > > Are you using a write allocate cache?
> > >
> > > I guess not, because this line is neccessary to reproduce the issue:
> > > tmp = *(addr+0);
> > > If it's write allocate, this line may not be neccessary, since it's just
> > > a read (and cache the data).
> >
> > It makes no sense then - without write allocate, writes will go straight
> > through to the underlying page, bypassing the cache.
> >
> Because of the read, the write is cache hitted too even on read allocate:
> *(addr+0) = 0x44444444; <- bypass the cache
> tmp = *(addr+0); <- read allocate
> *(addr+1) = 0x77777777; <- same cacheline, cache hitted
>
> So the two write values are cached, then the sequence in sys_write
> cannot guarantee the coherence:
> kmap_atomic(page);
> copy to page;
> kunmap_atomic(page);
> flush_dcache_page(page);
>
> It should call flush_dcache_page() at the beginning too in order to
> flush the shared mapping. Actually, I think it's better to split this
> function into two, such as:
> flush_dcache_user_page(page);
> copy to page;
> kunmap_atomic(page);
> flush_dcache_kern_page(page);
>
> And this patch seems to fix it, any other fs doesn't call it need to add
> that too.
>
Do you think this is a bug and I can send the patch, or it's the problem
of the test case?
Thanks,
Anfei.
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 96ac6b0..07056fb 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2196,6 +2196,9 @@ again:
> if (unlikely(status))
> break;
>
> + if (mapping_writably_mapped(mapping))
> + flush_dcache_page(page);
> +
> pagefault_disable();
> copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
> pagefault_enable();
^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much?
2010-01-19 13:05 ` anfei
@ 2010-01-19 17:44 ` Russell King - ARM Linux
2010-01-19 18:33 ` Jamie Lokier
0 siblings, 1 reply; 13+ messages in thread
From: Russell King - ARM Linux @ 2010-01-19 17:44 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Jan 19, 2010 at 09:05:47PM +0800, anfei wrote:
> Do you think this is a bug and I can send the patch, or it's the problem
> of the test case?
Yes, it's a bug, but it needs to be reported elsewhere; neither this
list nor I can sanction patches to generic code.
It needs to be reported to the linux-kernel and linux-mm mailing lists.
^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much?
2010-01-19 17:44 ` Russell King - ARM Linux
@ 2010-01-19 18:33 ` Jamie Lokier
0 siblings, 0 replies; 13+ messages in thread
From: Jamie Lokier @ 2010-01-19 18:33 UTC (permalink / raw)
To: linux-arm-kernel
Russell King - ARM Linux wrote:
> On Tue, Jan 19, 2010 at 09:05:47PM +0800, anfei wrote:
> > Do you think this is a bug and I can send the patch, or it's the problem
> > of the test case?
>
> Yes, it's a bug, but it needs to be reported elsewhere; neither this
> list nor I can sanction patches to generic code.
It's not necessarily a bug if the appropriate msync() call fixes the
test. Linux doesn't guarantee coherence between mappings and
read/write without msync() on some architectures, but I haven't seen
an official statement of whether it *should* or not in this type of
case.
I have looked at this in some detail a few years ago. When I looked
at MIPS and some other architecture code, I wasn't sure it was correct
even with msync() calls everywhere. It looked distinctly undertested.
The Linux msync(2) man page says:
msync() flushes changes made to the in-core copy of a file that was
mapped into memory using mmap(2) back to disk. Without use of this
call there is no guarantee that changes are written back before mun?
map(2) is called. To be more precise, the part of the file that corre?
sponds to the memory area starting at addr and having length length is
updated.
It's not a great description. MS_INVALIDATE talks about invalidating
"other mappings", not this one, in the Linux man page.
But on other OSes (SunOS, HP-UX for example), MS_INVALIDATE
invalidates _this_ mapping so that it sees the latest data written by
write() to the file.
In other words, Linux is a complete mess in this area. It happens to
work on some architectures, it's broken a little bit on others, broken
a lot on some, and despite reading a fair bit of code I'm still not
sure what the kernel is trying to promise.
Probably different people have different ideas about what's expected,
which is why MIPS and ARM kernels differ.
-- Jamie
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2010-01-19 18:33 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-18 13:13 flush_dcache_page does too much? anfei
2010-01-18 13:33 ` Russell King - ARM Linux
2010-01-18 13:54 ` anfei
2010-01-18 14:00 ` Russell King - ARM Linux
2010-01-18 14:15 ` anfei
2010-01-18 14:44 ` Russell King - ARM Linux
2010-01-18 14:53 ` anfei
2010-01-18 14:57 ` anfei
2010-01-18 15:01 ` Russell King - ARM Linux
2010-01-19 0:16 ` anfei
2010-01-19 13:05 ` anfei
2010-01-19 17:44 ` Russell King - ARM Linux
2010-01-19 18:33 ` Jamie Lokier
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).