* flush_dcache_page does too much?
@ 2010-01-18 13:13 anfei
2010-01-18 13:33 ` Russell King - ARM Linux
0 siblings, 1 reply; 13+ messages in thread
From: anfei @ 2010-01-18 13:13 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
I'm studying the cache alias problem especially of VIPT, I found
function flush_dcache_page() does much more operations on ARM than MIPS.
Can we not flush the userspace mappings and icache, just like MIPS?
Are the cache more consistent with these operations?
As far as I know, flush_dcache_page is usually used as this:
kmap_atomic(page, ...);
write the page;
flush_dcache_page(page);
kunmap_atomic(...);
called in the path of write()/..., but since mmap() + write() is not
ensured to work (even on ARM currently), it's the userspace to consider
msync()/munmap(), it looks okay without flush the userspace mappings
here. Other cases seem the same if the userspace takes charge of the
cache problem.
Thanks,
Anfei.
^ permalink raw reply [flat|nested] 13+ messages in thread* flush_dcache_page does too much? 2010-01-18 13:13 flush_dcache_page does too much? anfei @ 2010-01-18 13:33 ` Russell King - ARM Linux 2010-01-18 13:54 ` anfei 0 siblings, 1 reply; 13+ messages in thread From: Russell King - ARM Linux @ 2010-01-18 13:33 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jan 18, 2010 at 09:13:46PM +0800, anfei wrote: > I'm studying the cache alias problem especially of VIPT, I found > function flush_dcache_page() does much more operations on ARM than MIPS. > Can we not flush the userspace mappings and icache, just like MIPS? > Are the cache more consistent with these operations? > > As far as I know, flush_dcache_page is usually used as this: > kmap_atomic(page, ...); > write the page; > flush_dcache_page(page); > kunmap_atomic(...); > called in the path of write()/..., but since mmap() + write() is not > ensured to work (even on ARM currently), it's the userspace to consider > msync()/munmap(), it looks okay without flush the userspace mappings > here. Other cases seem the same if the userspace takes charge of the > cache problem. On VIPT on ARM, flush_dcache_page() flushes: 1. the direct kernel mapping and aliases of that (which read()/write() will touch.) 2. the user aliases, which may not be coherent with the direct kernel mapping. It is unsafe to avoid dealing with any of those - and doing so will cause shared mappings to be incoherent with filesystem IO. ^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much? 2010-01-18 13:33 ` Russell King - ARM Linux @ 2010-01-18 13:54 ` anfei 2010-01-18 14:00 ` Russell King - ARM Linux 0 siblings, 1 reply; 13+ messages in thread From: anfei @ 2010-01-18 13:54 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jan 18, 2010 at 01:33:04PM +0000, Russell King - ARM Linux wrote: > On Mon, Jan 18, 2010 at 09:13:46PM +0800, anfei wrote: > > I'm studying the cache alias problem especially of VIPT, I found > > function flush_dcache_page() does much more operations on ARM than MIPS. > > Can we not flush the userspace mappings and icache, just like MIPS? > > Are the cache more consistent with these operations? > > > > As far as I know, flush_dcache_page is usually used as this: > > kmap_atomic(page, ...); > > write the page; > > flush_dcache_page(page); > > kunmap_atomic(...); > > called in the path of write()/..., but since mmap() + write() is not > > ensured to work (even on ARM currently), it's the userspace to consider > > msync()/munmap(), it looks okay without flush the userspace mappings > > here. Other cases seem the same if the userspace takes charge of the > > cache problem. > > On VIPT on ARM, flush_dcache_page() flushes: > > 1. the direct kernel mapping and aliases of that (which read()/write() > will touch.) > > 2. the user aliases, which may not be coherent with the direct kernel > mapping. > > It is unsafe to avoid dealing with any of those - and doing so will cause > shared mappings to be incoherent with filesystem IO. Do you mean this implementation can ensure the coherence between write and shared mmapings? But it's easy to reproduce the alias problem by this simple testcase (w/o error handler) on omap2430 with VIPT cache: --- #include <stdio.h> #include <sys/mman.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h> int main(void) { int fd; int *addr; int tmp; int val = 0x11111111; fd = open("abc", O_RDWR); addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); *(addr+0) = 0x44444444; tmp = *(addr+0); *(addr+1) = 0x77777777; write(fd, &val, sizeof(int)); close(fd); return 0; } ^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much? 2010-01-18 13:54 ` anfei @ 2010-01-18 14:00 ` Russell King - ARM Linux 2010-01-18 14:15 ` anfei 0 siblings, 1 reply; 13+ messages in thread From: Russell King - ARM Linux @ 2010-01-18 14:00 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote: > Do you mean this implementation can ensure the coherence between write > and shared mmapings? But it's easy to reproduce the alias problem by > this simple testcase (w/o error handler) on omap2430 with VIPT cache: Your program doesn't do anything to identify any problem. You don't even say _what_ problem you see with this program. If you have a specific case which fails, please show the problem, please describe exactly the behaviour that you see, and what you expect to see. ^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much? 2010-01-18 14:00 ` Russell King - ARM Linux @ 2010-01-18 14:15 ` anfei 2010-01-18 14:44 ` Russell King - ARM Linux 0 siblings, 1 reply; 13+ messages in thread From: anfei @ 2010-01-18 14:15 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote: > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote: > > Do you mean this implementation can ensure the coherence between write > > and shared mmapings? But it's easy to reproduce the alias problem by > > this simple testcase (w/o error handler) on omap2430 with VIPT cache: > > Your program doesn't do anything to identify any problem. You don't > even say _what_ problem you see with this program. > Sorry for that. > If you have a specific case which fails, please show the problem, please > describe exactly the behaviour that you see, and what you expect to see. The steps as these: $ dd if=/dev/zero of=abc bs=4k count=1 $ ./a.out # this program $ xxd abc | head -1 # check the result I want to see it's always 0x11111111 0x77777777 at the beginning of file "abc", but the result is not (the log is not at hand currently), but I remember it like this after several runs: 0x11111111 0x77777777 0x44444444 0x77777777 0x11111111 0x77777777 0x44444444 0x77777777 0x44444444 0x77777777 If I add munmap()/msync() before write(), I can see it always as expected (0x11111111 0x77777777). ^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much? 2010-01-18 14:15 ` anfei @ 2010-01-18 14:44 ` Russell King - ARM Linux 2010-01-18 14:53 ` anfei 2010-01-18 14:57 ` anfei 0 siblings, 2 replies; 13+ messages in thread From: Russell King - ARM Linux @ 2010-01-18 14:44 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jan 18, 2010 at 10:15:30PM +0800, anfei wrote: > On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote: > > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote: > > > Do you mean this implementation can ensure the coherence between write > > > and shared mmapings? But it's easy to reproduce the alias problem by > > > this simple testcase (w/o error handler) on omap2430 with VIPT cache: > > > > Your program doesn't do anything to identify any problem. You don't > > even say _what_ problem you see with this program. > > > Sorry for that. > > > If you have a specific case which fails, please show the problem, please > > describe exactly the behaviour that you see, and what you expect to see. Are you using a write allocate cache? ^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much? 2010-01-18 14:44 ` Russell King - ARM Linux @ 2010-01-18 14:53 ` anfei 2010-01-18 14:57 ` anfei 1 sibling, 0 replies; 13+ messages in thread From: anfei @ 2010-01-18 14:53 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jan 18, 2010 at 02:44:18PM +0000, Russell King - ARM Linux wrote: > On Mon, Jan 18, 2010 at 10:15:30PM +0800, anfei wrote: > > On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote: > > > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote: > > > > Do you mean this implementation can ensure the coherence between write > > > > and shared mmapings? But it's easy to reproduce the alias problem by > > > > this simple testcase (w/o error handler) on omap2430 with VIPT cache: > > > > > > Your program doesn't do anything to identify any problem. You don't > > > even say _what_ problem you see with this program. > > > > > Sorry for that. > > > > > If you have a specific case which fails, please show the problem, please > > > describe exactly the behaviour that you see, and what you expect to see. > > Are you using a write allocate cache? I'm not sure, I can get more information tomorrow. ^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much? 2010-01-18 14:44 ` Russell King - ARM Linux 2010-01-18 14:53 ` anfei @ 2010-01-18 14:57 ` anfei 2010-01-18 15:01 ` Russell King - ARM Linux 1 sibling, 1 reply; 13+ messages in thread From: anfei @ 2010-01-18 14:57 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jan 18, 2010 at 02:44:18PM +0000, Russell King - ARM Linux wrote: > On Mon, Jan 18, 2010 at 10:15:30PM +0800, anfei wrote: > > On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote: > > > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote: > > > > Do you mean this implementation can ensure the coherence between write > > > > and shared mmapings? But it's easy to reproduce the alias problem by > > > > this simple testcase (w/o error handler) on omap2430 with VIPT cache: > > > > > > Your program doesn't do anything to identify any problem. You don't > > > even say _what_ problem you see with this program. > > > > > Sorry for that. > > > > > If you have a specific case which fails, please show the problem, please > > > describe exactly the behaviour that you see, and what you expect to see. > > Are you using a write allocate cache? I guess not, because this line is neccessary to reproduce the issue: tmp = *(addr+0); If it's write allocate, this line may not be neccessary, since it's just a read (and cache the data). ^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much? 2010-01-18 14:57 ` anfei @ 2010-01-18 15:01 ` Russell King - ARM Linux 2010-01-19 0:16 ` anfei 0 siblings, 1 reply; 13+ messages in thread From: Russell King - ARM Linux @ 2010-01-18 15:01 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jan 18, 2010 at 10:57:31PM +0800, anfei wrote: > On Mon, Jan 18, 2010 at 02:44:18PM +0000, Russell King - ARM Linux wrote: > > On Mon, Jan 18, 2010 at 10:15:30PM +0800, anfei wrote: > > > On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote: > > > > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote: > > > > > Do you mean this implementation can ensure the coherence between write > > > > > and shared mmapings? But it's easy to reproduce the alias problem by > > > > > this simple testcase (w/o error handler) on omap2430 with VIPT cache: > > > > > > > > Your program doesn't do anything to identify any problem. You don't > > > > even say _what_ problem you see with this program. > > > > > > > Sorry for that. > > > > > > > If you have a specific case which fails, please show the problem, please > > > > describe exactly the behaviour that you see, and what you expect to see. > > > > Are you using a write allocate cache? > > I guess not, because this line is neccessary to reproduce the issue: > tmp = *(addr+0); > If it's write allocate, this line may not be neccessary, since it's just > a read (and cache the data). It makes no sense then - without write allocate, writes will go straight through to the underlying page, bypassing the cache. Sorry, no idea. ^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much? 2010-01-18 15:01 ` Russell King - ARM Linux @ 2010-01-19 0:16 ` anfei 2010-01-19 13:05 ` anfei 0 siblings, 1 reply; 13+ messages in thread From: anfei @ 2010-01-19 0:16 UTC (permalink / raw) To: linux-arm-kernel On Mon, Jan 18, 2010 at 03:01:52PM +0000, Russell King - ARM Linux wrote: > On Mon, Jan 18, 2010 at 10:57:31PM +0800, anfei wrote: > > On Mon, Jan 18, 2010 at 02:44:18PM +0000, Russell King - ARM Linux wrote: > > > On Mon, Jan 18, 2010 at 10:15:30PM +0800, anfei wrote: > > > > On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote: > > > > > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote: > > > > > > Do you mean this implementation can ensure the coherence between write > > > > > > and shared mmapings? But it's easy to reproduce the alias problem by > > > > > > this simple testcase (w/o error handler) on omap2430 with VIPT cache: > > > > > > > > > > Your program doesn't do anything to identify any problem. You don't > > > > > even say _what_ problem you see with this program. > > > > > > > > > Sorry for that. > > > > > > > > > If you have a specific case which fails, please show the problem, please > > > > > describe exactly the behaviour that you see, and what you expect to see. > > > > > > Are you using a write allocate cache? > > > > I guess not, because this line is neccessary to reproduce the issue: > > tmp = *(addr+0); > > If it's write allocate, this line may not be neccessary, since it's just > > a read (and cache the data). > > It makes no sense then - without write allocate, writes will go straight > through to the underlying page, bypassing the cache. > Because of the read, the write is cache hitted too even on read allocate: *(addr+0) = 0x44444444; <- bypass the cache tmp = *(addr+0); <- read allocate *(addr+1) = 0x77777777; <- same cacheline, cache hitted So the two write values are cached, then the sequence in sys_write cannot guarantee the coherence: kmap_atomic(page); copy to page; kunmap_atomic(page); flush_dcache_page(page); It should call flush_dcache_page()@the beginning too in order to flush the shared mapping. Actually, I think it's better to split this function into two, such as: flush_dcache_user_page(page); copy to page; kunmap_atomic(page); flush_dcache_kern_page(page); And this patch seems to fix it, any other fs doesn't call it need to add that too. diff --git a/mm/filemap.c b/mm/filemap.c index 96ac6b0..07056fb 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2196,6 +2196,9 @@ again: if (unlikely(status)) break; + if (mapping_writably_mapped(mapping)) + flush_dcache_page(page); + pagefault_disable(); copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes); pagefault_enable(); ^ permalink raw reply related [flat|nested] 13+ messages in thread
* flush_dcache_page does too much? 2010-01-19 0:16 ` anfei @ 2010-01-19 13:05 ` anfei 2010-01-19 17:44 ` Russell King - ARM Linux 0 siblings, 1 reply; 13+ messages in thread From: anfei @ 2010-01-19 13:05 UTC (permalink / raw) To: linux-arm-kernel Hi Russell, On Tue, Jan 19, 2010 at 08:16:36AM +0800, anfei wrote: > On Mon, Jan 18, 2010 at 03:01:52PM +0000, Russell King - ARM Linux wrote: > > On Mon, Jan 18, 2010 at 10:57:31PM +0800, anfei wrote: > > > On Mon, Jan 18, 2010 at 02:44:18PM +0000, Russell King - ARM Linux wrote: > > > > On Mon, Jan 18, 2010 at 10:15:30PM +0800, anfei wrote: > > > > > On Mon, Jan 18, 2010 at 02:00:05PM +0000, Russell King - ARM Linux wrote: > > > > > > On Mon, Jan 18, 2010 at 09:54:31PM +0800, anfei wrote: > > > > > > > Do you mean this implementation can ensure the coherence between write > > > > > > > and shared mmapings? But it's easy to reproduce the alias problem by > > > > > > > this simple testcase (w/o error handler) on omap2430 with VIPT cache: > > > > > > > > > > > > Your program doesn't do anything to identify any problem. You don't > > > > > > even say _what_ problem you see with this program. > > > > > > > > > > > Sorry for that. > > > > > > > > > > > If you have a specific case which fails, please show the problem, please > > > > > > describe exactly the behaviour that you see, and what you expect to see. > > > > > > > > Are you using a write allocate cache? > > > > > > I guess not, because this line is neccessary to reproduce the issue: > > > tmp = *(addr+0); > > > If it's write allocate, this line may not be neccessary, since it's just > > > a read (and cache the data). > > > > It makes no sense then - without write allocate, writes will go straight > > through to the underlying page, bypassing the cache. > > > Because of the read, the write is cache hitted too even on read allocate: > *(addr+0) = 0x44444444; <- bypass the cache > tmp = *(addr+0); <- read allocate > *(addr+1) = 0x77777777; <- same cacheline, cache hitted > > So the two write values are cached, then the sequence in sys_write > cannot guarantee the coherence: > kmap_atomic(page); > copy to page; > kunmap_atomic(page); > flush_dcache_page(page); > > It should call flush_dcache_page() at the beginning too in order to > flush the shared mapping. Actually, I think it's better to split this > function into two, such as: > flush_dcache_user_page(page); > copy to page; > kunmap_atomic(page); > flush_dcache_kern_page(page); > > And this patch seems to fix it, any other fs doesn't call it need to add > that too. > Do you think this is a bug and I can send the patch, or it's the problem of the test case? Thanks, Anfei. > > diff --git a/mm/filemap.c b/mm/filemap.c > index 96ac6b0..07056fb 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -2196,6 +2196,9 @@ again: > if (unlikely(status)) > break; > > + if (mapping_writably_mapped(mapping)) > + flush_dcache_page(page); > + > pagefault_disable(); > copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes); > pagefault_enable(); ^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much? 2010-01-19 13:05 ` anfei @ 2010-01-19 17:44 ` Russell King - ARM Linux 2010-01-19 18:33 ` Jamie Lokier 0 siblings, 1 reply; 13+ messages in thread From: Russell King - ARM Linux @ 2010-01-19 17:44 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jan 19, 2010 at 09:05:47PM +0800, anfei wrote: > Do you think this is a bug and I can send the patch, or it's the problem > of the test case? Yes, it's a bug, but it needs to be reported elsewhere; neither this list nor I can sanction patches to generic code. It needs to be reported to the linux-kernel and linux-mm mailing lists. ^ permalink raw reply [flat|nested] 13+ messages in thread
* flush_dcache_page does too much? 2010-01-19 17:44 ` Russell King - ARM Linux @ 2010-01-19 18:33 ` Jamie Lokier 0 siblings, 0 replies; 13+ messages in thread From: Jamie Lokier @ 2010-01-19 18:33 UTC (permalink / raw) To: linux-arm-kernel Russell King - ARM Linux wrote: > On Tue, Jan 19, 2010 at 09:05:47PM +0800, anfei wrote: > > Do you think this is a bug and I can send the patch, or it's the problem > > of the test case? > > Yes, it's a bug, but it needs to be reported elsewhere; neither this > list nor I can sanction patches to generic code. It's not necessarily a bug if the appropriate msync() call fixes the test. Linux doesn't guarantee coherence between mappings and read/write without msync() on some architectures, but I haven't seen an official statement of whether it *should* or not in this type of case. I have looked at this in some detail a few years ago. When I looked at MIPS and some other architecture code, I wasn't sure it was correct even with msync() calls everywhere. It looked distinctly undertested. The Linux msync(2) man page says: msync() flushes changes made to the in-core copy of a file that was mapped into memory using mmap(2) back to disk. Without use of this call there is no guarantee that changes are written back before mun? map(2) is called. To be more precise, the part of the file that corre? sponds to the memory area starting at addr and having length length is updated. It's not a great description. MS_INVALIDATE talks about invalidating "other mappings", not this one, in the Linux man page. But on other OSes (SunOS, HP-UX for example), MS_INVALIDATE invalidates _this_ mapping so that it sees the latest data written by write() to the file. In other words, Linux is a complete mess in this area. It happens to work on some architectures, it's broken a little bit on others, broken a lot on some, and despite reading a fair bit of code I'm still not sure what the kernel is trying to promise. Probably different people have different ideas about what's expected, which is why MIPS and ARM kernels differ. -- Jamie ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2010-01-19 18:33 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-01-18 13:13 flush_dcache_page does too much? anfei 2010-01-18 13:33 ` Russell King - ARM Linux 2010-01-18 13:54 ` anfei 2010-01-18 14:00 ` Russell King - ARM Linux 2010-01-18 14:15 ` anfei 2010-01-18 14:44 ` Russell King - ARM Linux 2010-01-18 14:53 ` anfei 2010-01-18 14:57 ` anfei 2010-01-18 15:01 ` Russell King - ARM Linux 2010-01-19 0:16 ` anfei 2010-01-19 13:05 ` anfei 2010-01-19 17:44 ` Russell King - ARM Linux 2010-01-19 18:33 ` Jamie Lokier
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).