* race between kmap shootdown and cache maintenance
@ 2010-02-05 18:13 Gary King
  2010-02-07 15:30 ` Russell King - ARM Linux
  0 siblings, 1 reply; 9+ messages in thread

From: Gary King @ 2010-02-05 18:13 UTC (permalink / raw)
To: linux-arm-kernel

I have seen some instability with highmem enabled on Tegra 2 systems
(Cortex A9 SMP), where occasionally the kernel will panic with
unserviceable page faults from flush_dcache_page, at an address half-way
through a kmapped page.

I think this patch addresses the root cause: the cache maintenance code
doesn't increase the reference count of the kmap, so another thread may
kunmap the last reference and zap the corresponding PTE.

- Gary

Gary King
gking at nvidia.com

-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may
contain confidential information. Any unauthorized review, use, disclosure
or distribution is prohibited. If you are not the intended recipient,
please contact the sender by reply email and destroy all copies of the
original message.
-----------------------------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001--ARM-highmem-fix-race-between-cache-flush-and-kmap.patch
Type: application/octet-stream
Size: 2027 bytes
Desc: 0001--ARM-highmem-fix-race-between-cache-flush-and-kmap.patch
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20100205/2785f794/attachment-0001.obj>

^ permalink raw reply	[flat|nested] 9+ messages in thread
* race between kmap shootdown and cache maintenance
  2010-02-05 18:13 race between kmap shootdown and cache maintenance Gary King
@ 2010-02-07 15:30 ` Russell King - ARM Linux
  2010-02-08 21:57   ` Gary King
  0 siblings, 1 reply; 9+ messages in thread

From: Russell King - ARM Linux @ 2010-02-07 15:30 UTC (permalink / raw)
To: linux-arm-kernel

On Fri, Feb 05, 2010 at 10:13:03AM -0800, Gary King wrote:
> for highmem pages, flush_dcache_page must pin the kmap mapping in-place
> using kmap_high_get, to ensure that the cache maintenance does not race
> with another context calling kunmap_high on the same page and causing the
> PTE to be zapped.

You need to sign off on patches you send.

> ---
>  arch/arm/mm/flush.c |   18 +++++++++++++++---
>  1 files changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> index 6f3a4b7..69ee285 100644
> --- a/arch/arm/mm/flush.c
> +++ b/arch/arm/mm/flush.c
> @@ -117,7 +117,7 @@ void flush_ptrace_access(struct vm_area_struct *vma, struct page *page,
>
>  void __flush_dcache_page(struct address_space *mapping, struct page *page)
>  {
> -	void *addr = page_address(page);
> +	void *addr = NULL;

You shouldn't need an initializer here - the address will always be
initialized, either by kmap_high_get() or page_address().

>
>  	/*
>  	 * Writeback any data associated with the kernel mapping of this
> @@ -127,10 +127,17 @@ void __flush_dcache_page(struct address_space *mapping, struct page *page)
>  #ifdef CONFIG_HIGHMEM
>  	/*
>  	 * kmap_atomic() doesn't set the page virtual address, and
> -	 * kunmap_atomic() takes care of cache flushing already.
> +	 * kunmap_atomic() takes care of cache flushing already; however,
> +	 * the kmap must be pinned locally to ensure that no other context
> +	 * unmaps it during the cache maintenance
>  	 */
> -	if (addr)
> +	if (PageHighMem(page))
> +		addr = kmap_high_get(page);
> +	else
>  #endif
> +		addr = page_address(page);
> +
> +	if (addr)
>  		__cpuc_flush_dcache_area(addr, PAGE_SIZE);
>
>  	/*
> @@ -141,6 +148,11 @@ void __flush_dcache_page(struct address_space *mapping, struct page *page)
>  	if (mapping && cache_is_vipt_aliasing())
>  		flush_pfn_alias(page_to_pfn(page),
>  				page->index << PAGE_CACHE_SHIFT);
> +
> +#ifdef CONFIG_HIGHMEM
> +	if (PageHighMem(page) && addr)
> +		kunmap_high(page);
> +#endif

You don't need to hold on to the highmem kmap this long - the only thing
that it'd matter for is the __cpuc_flush_dcache_area() call.

You can combine this conditional with the test for
__cpuc_flush_dcache_area() as well.

^ permalink raw reply	[flat|nested] 9+ messages in thread
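[Editorial note: Russell's two suggestions above amount to: take the kmap pin only for highmem pages, flush while the pin is held, and drop the pin immediately after the flush. Below is a minimal userspace sketch of that control flow; the stub functions and counters are hypothetical stand-ins for the kernel primitives named in the thread (kmap_high_get, kunmap_high, __cpuc_flush_dcache_area), not the real implementations.]

```c
#include <assert.h>
#include <stddef.h>

/* Counters so the model's behaviour can be checked; not part of the kernel code. */
static int highmem_pin_count;   /* models the kmap reference taken by kmap_high_get() */
static int flush_count;

/* Hypothetical stand-ins for the kernel primitives discussed in the thread. */
static void *stub_kmap_high_get(int page_has_kmap)
{
    if (!page_has_kmap)
        return NULL;            /* kmap_high_get() returns NULL when the page isn't kmapped */
    highmem_pin_count++;
    return (void *)0x1000;
}

static void stub_kunmap_high(void)
{
    highmem_pin_count--;
}

static void stub_flush_dcache_area(void *addr)
{
    (void)addr;
    flush_count++;
}

/* Control flow with Russell's feedback applied: the pin is held only
 * across the flush itself, and kunmap is paired with a successful get. */
static void flush_dcache_page_model(int page_is_highmem, int page_has_kmap)
{
    if (page_is_highmem) {
        void *addr = stub_kmap_high_get(page_has_kmap);
        if (addr) {
            stub_flush_dcache_area(addr);
            stub_kunmap_high();  /* drop the pin as soon as the flush is done */
        }
    } else {
        /* lowmem: page_address() is always valid, no pinning needed */
        stub_flush_dcache_area((void *)0x2000);
    }
}
```

A highmem page with no current kmap is simply skipped here, which matches the patch's intent: kmap_atomic users get their flushing from kunmap_atomic instead.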
* race between kmap shootdown and cache maintenance
  2010-02-07 15:30 ` Russell King - ARM Linux
@ 2010-02-08 21:57   ` Gary King
  2010-02-09  3:35     ` Nicolas Pitre
  0 siblings, 1 reply; 9+ messages in thread

From: Gary King @ 2010-02-08 21:57 UTC (permalink / raw)
To: linux-arm-kernel

Fixed version attached.

-----Original Message-----
From: Russell King - ARM Linux [mailto:linux at arm.linux.org.uk]
Sent: Sunday, February 07, 2010 7:31 AM
Subject: Re: race between kmap shootdown and cache maintenance

> You need to sign off on patches you send.
[...]
> You shouldn't need an initializer here - the address will always be
> initialized, either by kmap_high_get() or page_address().
[...]
> You don't need to hold on to the highmem kmap this long - the only thing
> that it'd matter for is the __cpuc_flush_dcache_area() call.
>
> You can combine this conditional with the test for
> __cpuc_flush_dcache_area() as well.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001--ARM-highmem-fix-race-between-cache-flush-and-kmap.patch
Type: application/octet-stream
Size: 1827 bytes
Desc: 0001--ARM-highmem-fix-race-between-cache-flush-and-kmap.patch
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20100208/62c50b30/attachment.obj>

^ permalink raw reply	[flat|nested] 9+ messages in thread
* race between kmap shootdown and cache maintenance
  2010-02-08 21:57   ` Gary King
@ 2010-02-09  3:35     ` Nicolas Pitre
  2010-02-09  4:00       ` Gary King
  0 siblings, 1 reply; 9+ messages in thread

From: Nicolas Pitre @ 2010-02-09 3:35 UTC (permalink / raw)
To: linux-arm-kernel

On Mon, 8 Feb 2010, Gary King wrote:

> Fixed version attached.
>
> -----Original Message-----
> From: Russell King - ARM Linux [mailto:linux at arm.linux.org.uk]
> Sent: Sunday, February 07, 2010 7:31 AM
> To: Gary King
> Cc: 'linux-arm-kernel at lists.infradead.org'
> Subject: Re: race between kmap shootdown and cache maintenance
>
> On Fri, Feb 05, 2010 at 10:13:03AM -0800, Gary King wrote:
> > for highmem pages, flush_dcache_page must pin the kmap mapping in-place
> > using kmap_high_get, to ensure that the cache maintenance does not race
> > with another context calling kunmap_high on the same page and causing the
> > PTE to be zapped.

Is this actually possible?  Any flush_dcache_page() caller must have a
reference count on the given highmem page since no one is supposed to play
with a highmem page pointer without having called kmap() on it first.
Therefore any other context calling kunmap_high() is never expected to
drop the kmap ref count to zero.

So unless proven otherwise I think this patch is useless.

Nicolas

^ permalink raw reply	[flat|nested] 9+ messages in thread
* race between kmap shootdown and cache maintenance
  2010-02-09  3:35     ` Nicolas Pitre
@ 2010-02-09  4:00       ` Gary King
  2010-02-09  4:40         ` Nicolas Pitre
  0 siblings, 1 reply; 9+ messages in thread

From: Gary King @ 2010-02-09 4:00 UTC (permalink / raw)
To: linux-arm-kernel

The patch is not a no-op; without this patch I was seeing panics in
v7_flush_kern_dcache about 1 time in 3 boots, and with it the crash has
not reproduced in hundreds of boots.

However, from re-reading the highmem code, I think my original description
of the cause of the crash was slightly mistaken:

Kmap zero-flushing is lazy (it happens on a subsequent call to kmap), and
page_address is not set to NULL until the lazy flush happens. So if
page_address is called immediately after a kunmap call that dropped the
pin count to 1, a valid address is still returned. On SMP or PREEMPT
kernels, kmap may then be called in one context while cache maintenance is
underway on one of these pages in a different context, with
flush_all_zero_pkmaps invalidating the PTE (and TLB) of the page that is
actively undergoing maintenance.

- Gary

-----Original Message-----
From: Nicolas Pitre [mailto:nico at fluxnic.net]
Sent: Monday, February 08, 2010 7:36 PM
Subject: RE: race between kmap shootdown and cache maintenance

> Is this actually possible?  Any flush_dcache_page() caller must have a
> reference count on the given highmem page since no one is supposed to
> play with a highmem page pointer without having called kmap() on it
> first. Therefore any other context calling kunmap_high() is never
> expected to drop the kmap ref count to zero.
>
> So unless proven otherwise I think this patch is useless.

^ permalink raw reply	[flat|nested] 9+ messages in thread
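[Editorial note: Gary's corrected sequence can be modelled in a few lines. The sketch below is a deliberately simplified, hypothetical userspace model of the pkmap bookkeeping: only slot counts and slot occupancy are tracked, and the count conventions follow mm/highmem.c (0 = free, 1 = unused but still mapped, >= 2 = pinned). Everything else, including the function names, is invented for illustration. The point it demonstrates: a count of 1 leaves the address resolvable, yet any later kmap may lazily zap it mid-flush unless the flusher pins it first.]

```c
#include <assert.h>
#include <string.h>

#define PKMAP_SLOTS 4

static int pkmap_count[PKMAP_SLOTS];   /* 0 free, 1 unused-but-mapped, >=2 pinned */
static int slot_page[PKMAP_SLOTS];     /* which page each slot maps (-1 = none) */

static int model_kmap(int page)
{
    int i;

    /* flush_all_zero_pkmaps: lazily zap every unused-but-mapped slot first */
    for (i = 0; i < PKMAP_SLOTS; i++) {
        if (pkmap_count[i] == 1) {
            pkmap_count[i] = 0;
            slot_page[i] = -1;         /* PTE (and TLB entry) invalidated here */
        }
    }
    for (i = 0; i < PKMAP_SLOTS; i++) {
        if (pkmap_count[i] == 0) {
            pkmap_count[i] = 2;        /* mapped and held by the caller */
            slot_page[i] = page;
            return i;
        }
    }
    return -1;
}

static void model_kunmap(int slot)
{
    pkmap_count[slot]--;               /* may drop to 1: the address still resolves */
}

/* Returns 1 if the address used by the cache flush no longer maps the
 * flushed page after the interleaving Gary describes. */
static int flush_races_and_faults(int pin_during_flush)
{
    int i, slot;

    memset(pkmap_count, 0, sizeof pkmap_count);
    for (i = 0; i < PKMAP_SLOTS; i++)
        slot_page[i] = -1;

    slot = model_kmap(7);              /* some context kmaps page 7... */
    model_kunmap(slot);                /* ...and kunmaps it: count is now 1 */

    /* cache maintenance starts: page_address(page 7) still looks valid */
    if (pin_during_flush)
        pkmap_count[slot]++;           /* kmap_high_get(): pin it back to 2 */

    model_kmap(8);                     /* meanwhile another context kmaps page 8 */

    return slot_page[slot] != 7;       /* was the flush address shot down? */
}
```

Without the pin, the second kmap's lazy shootdown reclaims the slot mid-flush; with the pin, the slot is skipped and the flush address stays valid, which is exactly what Gary's patch arranges.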
* race between kmap shootdown and cache maintenance
  2010-02-09  4:00       ` Gary King
@ 2010-02-09  4:40         ` Nicolas Pitre
  2010-02-13  1:08           ` Gary King
  0 siblings, 1 reply; 9+ messages in thread

From: Nicolas Pitre @ 2010-02-09 4:40 UTC (permalink / raw)
To: linux-arm-kernel

On Mon, 8 Feb 2010, Gary King wrote:

> The patch is not a no-op; without this patch I was seeing panics in
> v7_flush_kern_dcache about 1 time in 3 boots, with it the crash has
> not reproduced in hundreds of boots.

I would still like to understand why.  Without a good explanation this
might simply be covering another bug.

> However, from re-reading the highmem code, I think my original
> description of the cause of the crash was slightly mistaken:
>
> Kmap zero-flushing is lazy (it happens on the subsequent call to
> kmap), and the page_address is not set to NULL until the lazy-flush
> happens. In this case, if page_address is called immediately following
> a kunmap call which resulted in the pin count dropping to 1, a valid
> address will be returned.

But that's where things seem wrong.  There should not be any caller of
flush_dcache_page() passing a page with no "owner".

Can you try this patch and see if it actually triggers, and if so what
the call backtrace is?

diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 6f3a4b7..7afdb4b 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -125,6 +125,8 @@ void __flush_dcache_page(struct address_space *mapping, struct page *page)
 	 * coherent with the kernels mapping.
 	 */
 #ifdef CONFIG_HIGHMEM
+	extern int kmap_high_pinned(struct page *);
+	BUG_ON(addr && PageHighMem(page) && !kmap_high_pinned(page));
 	/*
 	 * kmap_atomic() doesn't set the page virtual address, and
 	 * kunmap_atomic() takes care of cache flushing already.
diff --git a/mm/highmem.c b/mm/highmem.c
index 9c1e627..f50d83e 100644
--- a/mm/highmem.c
+++ b/mm/highmem.c
@@ -238,6 +238,18 @@ void *kmap_high_get(struct page *page)
 	unlock_kmap_any(flags);
 	return (void*) vaddr;
 }
+
+int kmap_high_pinned(struct page *page)
+{
+	unsigned long vaddr, flags;
+	int res;
+
+	lock_kmap_any(flags);
+	vaddr = (unsigned long)page_address(page);
+	res = (vaddr && pkmap_count[PKMAP_NR(vaddr)] >= 2);
+	unlock_kmap_any(flags);
+	return res;
+}
 #endif

 /**

Nicolas

^ permalink raw reply related	[flat|nested] 9+ messages in thread
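[Editorial note: the check Nicolas adds reduces to a small predicate over the pkmap state. The function below is a hypothetical standalone restatement of it for illustration only; the real kmap_high_pinned above additionally takes the kmap lock and derives the slot index from the page's virtual address.]

```c
#include <assert.h>

/* pkmap_count conventions (mm/highmem.c): 0 = slot free, 1 = mapping cached
 * but unused (eligible for lazy shootdown), n >= 2 = held by n-1 users.
 * A highmem kmap counts as "pinned" only when the page has a virtual
 * address AND its count is at least 2. */
static int model_kmap_high_pinned(unsigned long vaddr, int pkmap_count)
{
    return vaddr != 0 && pkmap_count >= 2;
}
```

A count of exactly 1 is precisely what makes the race possible: the address still resolves, yet nothing prevents a concurrent flush_all_zero_pkmaps from zapping it.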
* race between kmap shootdown and cache maintenance
  2010-02-09  4:40         ` Nicolas Pitre
@ 2010-02-13  1:08           ` Gary King
  2010-02-18  2:00             ` Arve Hjønnevåg
  0 siblings, 1 reply; 9+ messages in thread

From: Gary King @ 2010-02-13 1:08 UTC (permalink / raw)
To: linux-arm-kernel

I finally got a chance to test this; it looks like this condition may be
common: many (all?) of the various page-cache-like things seem to just
hand pages to flush_dcache_page blindly, and expect flush_dcache_page to
determine whether or not maintenance is required.

With your patch, I immediately hit the BUG() when the init process is
started; the backtrace is attached as unpinned_maint.log. I've also
attached the original crash log (orig_crash.log).

My kernel is a .29 derivative, but I can't find any patches that look like
they would address this issue (the mainline code still has the cache
maintenance calls in the same places).

- Gary

-----Original Message-----
From: Nicolas Pitre [mailto:nico at fluxnic.net]
Sent: Monday, February 08, 2010 8:40 PM
Subject: RE: race between kmap shootdown and cache maintenance

> But that's where things seem wrong.  There should not be any caller of
> flush_dcache_page() passing a page with no "owner".
>
> Can you try this patch and see if it actually triggers, and if so what
> the call backtrace is?

-------------- next part --------------
A non-text attachment was scrubbed...
Name: unpinned_maint.log
Type: application/octet-stream
Size: 3097 bytes
Desc: unpinned_maint.log
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20100212/c2a5bbd1/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: orig_crash.log
Type: application/octet-stream
Size: 2548 bytes
Desc: orig_crash.log
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20100212/c2a5bbd1/attachment-0001.obj>

^ permalink raw reply related	[flat|nested] 9+ messages in thread
* race between kmap shootdown and cache maintenance
  2010-02-13  1:08           ` Gary King
@ 2010-02-18  2:00             ` Arve Hjønnevåg
  2010-02-18  4:18               ` Nicolas Pitre
  0 siblings, 1 reply; 9+ messages in thread

From: Arve Hjønnevåg @ 2010-02-18 2:00 UTC (permalink / raw)
To: linux-arm-kernel

On Fri, Feb 12, 2010 at 5:08 PM, Gary King <GKing@nvidia.com> wrote:
> I finally got a chance to test this; it looks like this condition may be
> common: many (all?) of the various page cache-like things seem to just
> hand pages to flush_dcache_page blindly, and expect flush_dcache_page to
> determine whether or not maintenance is required.
>
> With your patch, I immediately hit the BUG() when the init process is
> started; the backtrace is attached as unpinned_maint.log.

I see the same crash on Nexus One with this patch applied to a
2.6.33-rc8 kernel.

If I change the BUG_ON to WARN_ON, the system boots further, but the
warning still triggers.

If I also apply the fix from this thread the system boots again, and I
see several calls that do not have the kmap-high pinned.
__get_user_pages, generic_file_buffered_write, __block_prepare_write,
block_write_end, blk_queue_bounce, __mpage_writepage:

<4>[ 44.110321] WARNING: at arch/arm/mm/flush.c:129 __flush_dcache_page+0x58/0xe4()
<4>[ 44.110534] Modules linked in:
<4>[ 44.110748] [<c002b878>] (unwind_backtrace+0x0/0xd8) from [<c0058318>] (warn_slowpath_common+0x48/0x60)
<4>[ 44.110961] [<c0058318>] (warn_slowpath_common+0x48/0x60) from [<c002d114>] (__flush_dcache_page+0x58/0xe4)
<4>[ 44.111175] [<c002d114>] (__flush_dcache_page+0x58/0xe4) from [<c002d238>] (flush_dcache_page+0x98/0xe4)
<4>[ 44.111358] [<c002d238>] (flush_dcache_page+0x98/0xe4) from [<c0091188>] (generic_file_buffered_write+0x144/0x270)
<4>[ 44.111572] [<c0091188>] (generic_file_buffered_write+0x144/0x270) from [<c0093400>] (__generic_file_aio_write+0x48c/0x4d4)
<4>[ 44.111785] [<c0093400>] (__generic_file_aio_write+0x48c/0x4d4) from [<c00934b0>] (generic_file_aio_write+0x68/0xc8)
<4>[ 44.111999] [<c00934b0>] (generic_file_aio_write+0x68/0xc8) from [<c00bf9d8>] (do_sync_write+0x9c/0xe4)
<4>[ 44.112213] [<c00bf9d8>] (do_sync_write+0x9c/0xe4) from [<c00c03c0>] (vfs_write+0xac/0x154)
<4>[ 44.112335] [<c00c03c0>] (vfs_write+0xac/0x154) from [<c00c0514>] (sys_write+0x3c/0x68)
<4>[ 44.112518] [<c00c0514>] (sys_write+0x3c/0x68) from [<c0026f40>] (ret_fast_syscall+0x0/0x2c)
<4>[ 44.148040] WARNING: at arch/arm/mm/flush.c:129 __flush_dcache_page+0x58/0xe4()
<4>[ 44.148193] Modules linked in:
<4>[ 44.148529] [<c002b878>] (unwind_backtrace+0x0/0xd8) from [<c0058318>] (warn_slowpath_common+0x48/0x60)
<4>[ 44.148742] [<c0058318>] (warn_slowpath_common+0x48/0x60) from [<c002d114>] (__flush_dcache_page+0x58/0xe4)
<4>[ 44.149047] [<c002d114>] (__flush_dcache_page+0x58/0xe4) from [<c002d238>] (flush_dcache_page+0x98/0xe4)
<4>[ 44.149383] [<c002d238>] (flush_dcache_page+0x98/0xe4) from [<c00e4f2c>] (__block_prepare_write+0x2c4/0x460)
<4>[ 44.149688] [<c00e4f2c>] (__block_prepare_write+0x2c4/0x460) from [<c00e52f0>] (block_write_begin+0x8c/0x104)
<4>[ 44.149993] [<c00e52f0>] (block_write_begin+0x8c/0x104) from [<c00e566c>] (cont_write_begin+0x304/0x344)
<4>[ 44.150329] [<c00e566c>] (cont_write_begin+0x304/0x344) from [<c01361c0>] (fat_write_begin+0x48/0x54)
<4>[ 44.150512] [<c01361c0>] (fat_write_begin+0x48/0x54) from [<c0091124>] (generic_file_buffered_write+0xe0/0x270)
<4>[ 44.150848] [<c0091124>] (generic_file_buffered_write+0xe0/0x270) from [<c0093400>] (__generic_file_aio_write+0x48c/0x4d4)
<4>[ 44.151153] [<c0093400>] (__generic_file_aio_write+0x48c/0x4d4) from [<c00934b0>] (generic_file_aio_write+0x68/0xc8)
<4>[ 44.151489] [<c00934b0>] (generic_file_aio_write+0x68/0xc8) from [<c00bf9d8>] (do_sync_write+0x9c/0xe4)
<4>[ 44.151794] [<c00bf9d8>] (do_sync_write+0x9c/0xe4) from [<c00c03c0>] (vfs_write+0xac/0x154)
<4>[ 44.152099] [<c00c03c0>] (vfs_write+0xac/0x154) from [<c00c0514>] (sys_write+0x3c/0x68)
<4>[ 44.152435] [<c00c0514>] (sys_write+0x3c/0x68) from [<c0026f40>] (ret_fast_syscall+0x0/0x2c)
<4>[ 44.158935] WARNING: at arch/arm/mm/flush.c:129 __flush_dcache_page+0x58/0xe4()
<4>[ 44.159240] Modules linked in:
<4>[ 44.159576] [<c002b878>] (unwind_backtrace+0x0/0xd8) from [<c0058318>] (warn_slowpath_common+0x48/0x60)
<4>[ 44.159881] [<c0058318>] (warn_slowpath_common+0x48/0x60) from [<c002d114>] (__flush_dcache_page+0x58/0xe4)
<4>[ 44.160217] [<c002d114>] (__flush_dcache_page+0x58/0xe4) from [<c002d238>] (flush_dcache_page+0x98/0xe4)
<4>[ 44.160522] [<c002d238>] (flush_dcache_page+0x98/0xe4) from [<c00e37c0>] (block_write_end+0x4c/0x68)
<4>[ 44.160858] [<c00e37c0>] (block_write_end+0x4c/0x68) from [<c00e3810>] (generic_write_end+0x34/0xd0)
<4>[ 44.161041] [<c00e3810>] (generic_write_end+0x34/0xd0) from [<c0136124>] (fat_write_end+0x2c/0x80)
<4>[ 44.161346] [<c0136124>] (fat_write_end+0x2c/0x80) from [<c00911c0>] (generic_file_buffered_write+0x17c/0x270)
<4>[ 44.161682] [<c00911c0>] (generic_file_buffered_write+0x17c/0x270) from [<c0093400>] (__generic_file_aio_write+0x48c/0x4d4)
<4>[ 44.161987] [<c0093400>] (__generic_file_aio_write+0x48c/0x4d4) from [<c00934b0>] (generic_file_aio_write+0x68/0xc8)
<4>[ 44.162292] [<c00934b0>] (generic_file_aio_write+0x68/0xc8) from [<c00bf9d8>] (do_sync_write+0x9c/0xe4)
<4>[ 44.162628] [<c00bf9d8>] (do_sync_write+0x9c/0xe4) from [<c00c03c0>] (vfs_write+0xac/0x154)
<4>[ 44.162933] [<c00c03c0>] (vfs_write+0xac/0x154) from [<c00c0514>] (sys_write+0x3c/0x68)
<4>[ 44.163116] [<c00c0514>] (sys_write+0x3c/0x68) from [<c0026f40>] (ret_fast_syscall+0x0/0x2c)
<4>[ 44.163574] ---[ end trace 1b75b31a2719f026 ]---
<4>[ 48.068237] WARNING: at arch/arm/mm/flush.c:129 __flush_dcache_page+0x58/0xe4()
<4>[ 48.068878] Modules linked in:
<4>[ 48.069580] [<c002b878>] (unwind_backtrace+0x0/0xd8) from [<c0058318>] (warn_slowpath_common+0x48/0x60)
<4>[ 48.070251] [<c0058318>] (warn_slowpath_common+0x48/0x60) from [<c002d114>] (__flush_dcache_page+0x58/0xe4)
<4>[ 48.070617] [<c002d114>] (__flush_dcache_page+0x58/0xe4) from [<c002d238>] (flush_dcache_page+0x98/0xe4)
<4>[ 48.071289] [<c002d238>] (flush_dcache_page+0x98/0xe4) from [<c00b40dc>] (blk_queue_bounce+0x170/0x30c)
<4>[ 48.071960] [<c00b40dc>] (blk_queue_bounce+0x170/0x30c) from [<c0162694>] (__make_request+0x44/0x424)
<4>[ 48.072631] [<c0162694>] (__make_request+0x44/0x424) from [<c0161074>] (generic_make_request+0x300/0x360)
<4>[ 48.073303] [<c0161074>] (generic_make_request+0x300/0x360) from [<c01611e0>] (submit_bio+0x10c/0x128)
<4>[ 48.073944] [<c01611e0>] (submit_bio+0x10c/0x128) from [<c00e26ac>] (submit_bh+0x170/0x194)
<4>[ 48.074340] [<c00e26ac>] (submit_bh+0x170/0x194) from [<c00e5e5c>] (__block_write_full_page+0x35c/0x4f8)
<4>[ 48.074981] [<c00e5e5c>] (__block_write_full_page+0x35c/0x4f8) from [<c00e60e0>] (block_write_full_page_endio+0xe8/0xec)
<4>[ 48.075653] [<c00e60e0>] (block_write_full_page_endio+0xe8/0xec) from [<c00ebfe4>] (__mpage_writepage+0x60c/0x65c)
<4>[ 48.076324] [<c00ebfe4>] (__mpage_writepage+0x60c/0x65c) from [<c0098da0>] (write_cache_pages+0x1f4/0x2f8)
<4>[ 48.076995] [<c0098da0>] (write_cache_pages+0x1f4/0x2f8) from [<c00ec210>] (mpage_writepages+0x48/0x70)
<4>[ 48.087097] [<c00ec210>] (mpage_writepages+0x48/0x70) from [<c0098ef8>] (do_writepages+0x2c/0x38)
<4>[ 48.087829] [<c0098ef8>] (do_writepages+0x2c/0x38) from [<c00dc7ac>] (writeback_single_inode+0x108/0x2fc)
<4>[ 48.088195] [<c00dc7ac>] (writeback_single_inode+0x108/0x2fc) from [<c00dd578>] (writeback_inodes_wb+0x3d0/0x52c)
<4>[ 48.088867] [<c00dd578>] (writeback_inodes_wb+0x3d0/0x52c) from [<c00dd80c>] (wb_writeback+0x138/0x1d0)
<4>[ 48.089508] [<c00dd80c>] (wb_writeback+0x138/0x1d0) from [<c00ddb6c>] (wb_do_writeback+0x19c/0x1c0)
<4>[ 48.090179] [<c00ddb6c>] (wb_do_writeback+0x19c/0x1c0) from [<c00ddbc8>] (bdi_writeback_task+0x38/0xb4)
<4>[ 48.090820] [<c00ddbc8>] (bdi_writeback_task+0x38/0xb4) from [<c00a4fe8>] (bdi_start_fn+0x8c/0x104)
<4>[ 48.091491] [<c00a4fe8>] (bdi_start_fn+0x8c/0x104) from [<c006cfe8>] (kthread+0x78/0x80)
<4>[ 48.091857] [<c006cfe8>] (kthread+0x78/0x80) from [<c002797c>] (kernel_thread_exit+0x0/0x8)
<4>[ 48.107055] WARNING: at arch/arm/mm/flush.c:129 __flush_dcache_page+0x58/0xe4()
<4>[ 48.107360] Modules linked in:
<4>[ 48.107574] [<c002b878>] (unwind_backtrace+0x0/0xd8) from [<c0058318>] (warn_slowpath_common+0x48/0x60)
<4>[ 48.107788] [<c0058318>] (warn_slowpath_common+0x48/0x60) from [<c002d114>] (__flush_dcache_page+0x58/0xe4)
<4>[ 48.108001] [<c002d114>] (__flush_dcache_page+0x58/0xe4) from [<c002d238>] (flush_dcache_page+0x98/0xe4)
<4>[ 48.108123] [<c002d238>] (flush_dcache_page+0x98/0xe4) from [<c00ebdd4>] (__mpage_writepage+0x3fc/0x65c)
<4>[ 48.108337] [<c00ebdd4>] (__mpage_writepage+0x3fc/0x65c) from [<c0098da0>] (write_cache_pages+0x1f4/0x2f8)
<4>[ 48.112121] [<c0098da0>] (write_cache_pages+0x1f4/0x2f8) from [<c00ec210>] (mpage_writepages+0x48/0x70)
<4>[ 48.112335] [<c00ec210>] (mpage_writepages+0x48/0x70) from [<c0098ef8>] (do_writepages+0x2c/0x38)
<4>[ 48.112548] [<c0098ef8>] (do_writepages+0x2c/0x38) from [<c00dc7ac>] (writeback_single_inode+0x108/0x2fc)
<4>[ 48.112792] [<c00dc7ac>] (writeback_single_inode+0x108/0x2fc) from [<c00dd578>] (writeback_inodes_wb+0x3d0/0x52c)
<4>[ 48.113006] [<c00dd578>] (writeback_inodes_wb+0x3d0/0x52c) from [<c00dd80c>] (wb_writeback+0x138/0x1d0)
<4>[ 48.113098] [<c00dd80c>] (wb_writeback+0x138/0x1d0) from [<c00ddb6c>] (wb_do_writeback+0x19c/0x1c0)
<4>[ 48.113311] [<c00ddb6c>] (wb_do_writeback+0x19c/0x1c0) from [<c00ddbc8>] (bdi_writeback_task+0x38/0xb4)
<4>[ 48.113525] [<c00ddbc8>] (bdi_writeback_task+0x38/0xb4) from [<c00a4fe8>] (bdi_start_fn+0x8c/0x104)
<4>[ 48.113739] [<c00a4fe8>] (bdi_start_fn+0x8c/0x104) from [<c006cfe8>] (kthread+0x78/0x80)
<4>[ 48.113952] [<c006cfe8>] (kthread+0x78/0x80) from [<c002797c>] (kernel_thread_exit+0x0/0x8)
<4>[ 87.644622] WARNING: at arch/arm/mm/flush.c:129 __flush_dcache_page+0x58/0xe4()
<4>[ 87.645233] Modules linked in:
<4>[ 87.646087] [<c002b878>] (unwind_backtrace+0x0/0xd8) from [<c0058318>] (warn_slowpath_common+0x48/0x60)
<4>[ 87.646759] [<c0058318>] (warn_slowpath_common+0x48/0x60) from [<c002d114>] (__flush_dcache_page+0x58/0xe4)
<4>[ 87.647430] [<c002d114>] (__flush_dcache_page+0x58/0xe4) from [<c002d238>] (flush_dcache_page+0x98/0xe4)
<4>[ 87.648101] [<c002d238>] (flush_dcache_page+0x98/0xe4) from [<c00a84d4>] (__get_user_pages+0x1ec/0x25c)
<4>[ 87.648773] [<c00a84d4>] (__get_user_pages+0x1ec/0x25c) from [<c00c4938>] (get_arg_page+0x48/0x9c)
<4>[ 87.649139] [<c00c4938>] (get_arg_page+0x48/0x9c) from [<c00c4a80>] (copy_strings+0xf4/0x208)
<4>[ 87.649810] [<c00c4a80>] (copy_strings+0xf4/0x208) from [<c00c63f4>] (do_execve+0x118/0x260)
<4>[ 87.650451] [<c00c63f4>] (do_execve+0x118/0x260) from [<c0029ef4>] (sys_execve+0x34/0x54)
<4>[ 87.651092] [<c0029ef4>] (sys_execve+0x34/0x54) from [<c0026f40>] (ret_fast_syscall+0x0/0x2c)
...

> From: Nicolas Pitre [mailto:nico at fluxnic.net]
> But that's where things seem wrong.  There should not be any caller of
> flush_dcache_page() passing a page with no "owner".
>

What do you mean by "owner"? It looks like the page is locked, but the
highmem mapping is not.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 9+ messages in thread
* race between kmap shootdown and cache maintenance
  2010-02-18  2:00             ` Arve Hjønnevåg
@ 2010-02-18  4:18               ` Nicolas Pitre
  0 siblings, 0 replies; 9+ messages in thread

From: Nicolas Pitre @ 2010-02-18 4:18 UTC (permalink / raw)
To: linux-arm-kernel

On Wed, 17 Feb 2010, Arve Hjønnevåg wrote:

> On Fri, Feb 12, 2010 at 5:08 PM, Gary King <GKing@nvidia.com> wrote:
> > I finally got a chance to test this; it looks like this condition may be
> > common: many (all?) of the various page cache-like things seem to just
> > hand pages to flush_dcache_page blindly, and expect flush_dcache_page
> > to determine whether or not maintenance is required.
> >
> > With your patch, I immediately hit the BUG() when the init process is
> > started; the backtrace is attached as unpinned_maint.log.
>
> I see the same crash on Nexus One with this patch applied to a
> 2.6.33-rc8 kernel.
>
> If I change the BUG_ON to WARN_ON, the system boots further, but the
> warning still triggers.
>
> If I also apply the fix from this thread the system boots again, and I
> see several calls that do not have the kmap-high pinned.

OK.  After further consideration, I think the original patch is
legitimate.

Nicolas

^ permalink raw reply	[flat|nested] 9+ messages in thread
end of thread, other threads:[~2010-02-18  4:18 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz  follow: Atom feed
-- links below jump to the message on this page --
2010-02-05 18:13 race between kmap shootdown and cache maintenance Gary King
2010-02-07 15:30 ` Russell King - ARM Linux
2010-02-08 21:57   ` Gary King
2010-02-09  3:35     ` Nicolas Pitre
2010-02-09  4:00       ` Gary King
2010-02-09  4:40         ` Nicolas Pitre
2010-02-13  1:08           ` Gary King
2010-02-18  2:00             ` Arve Hjønnevåg
2010-02-18  4:18               ` Nicolas Pitre