From: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
To: Lars Persson <lars.persson@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>,
"linux-mips@linux-mips.org" <linux-mips@linux-mips.org>,
"james.hogan@imgtec.com" <james.hogan@imgtec.com>,
"keescook@chromium.org" <keescook@chromium.org>,
"paul.burton@imgtec.com" <paul.burton@imgtec.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"manuel.lauss@gmail.com" <manuel.lauss@gmail.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"blogic@openwrt.org" <blogic@openwrt.org>,
"markos.chandras@imgtec.com" <markos.chandras@imgtec.com>
Subject: Re: [PATCH] Revert "MIPS: Remove race window in page fault handling"
Date: Fri, 5 Dec 2014 13:41:25 -0800 [thread overview]
Message-ID: <54822685.8070504@imgtec.com> (raw)
In-Reply-To: <1417771976.8400.8.camel@lnxlarper.se.axis.com>
Lars,
On 12/05/2014 01:32 AM, Lars Persson wrote:
> Hi
>
> Our setup includes both a non-DMA block device and a compressing
> file-system (UBIFS). A flush_dcache_page() is issued by UBIFS so your
> patch fixes another problem that we do not hit.
>
> The stack trace is not available now. Do we need it for any further
> analysis ? I think the mechanism of the race window is understood and it
> depends on the __flush_dcache_page() deciding that the flush should be
> postponed.
Unfortunately, the research of original case is still needed.
I looked into all cases of update_mmu_cache() besides HUGE page support
and NUMA, and I see:
1. insert_pfn()
It is used to put a special page (read - VDSO) into memory map. No
cache flush is needed here because page is done and flushed during
system boot.
2. do_wp_page(), first occurrence
It has flush_cache_page() before it sets PTE in
ptep_set_access_flags(). This flush is unconditional and affects all caches.
3. do_wp_page(), second case
It is done after preparing a clear new page or after COW. COW has
an appropriate cache flush of destination in copy_user_highpage(). The
immediate use of cleared new page as instruction (you had SIGILL,
right?)... hm-m, something wrong in application in this case.
4. do_swap_page()
Well, it may be a case of flush_icache_page() is not used (see
below) and page is taken from non-DMA swap. But I also recommend to look
into
http://patchwork.linux-mips.org/patch/7615/
there is a bug in swap entry number presentation and it may affect your
system.
5. do_anonymous_page()
The similar to case (3) - cleared new page, using of it as
instruction page may point to some app problem.
6. do_set_pte()
It also has flush_icache_page() which may have impact if not
implemented, see below.
7. handle_pte_fault()
Page is not touched and cache flush is NO-OP.
8. remove_migration_pte()
Well, it is a place for suspicion. But it should not run in
parallel with any running thread - dirtying page while other thread is
running is a way to disaster.
So, you see - if I understand it correctly, there is no place for
failure... besides application misbehaviour or potential kernel bug in
migration. Of course, I may miss something and that is a reason why
stack trace is desirable.
> I think the mechanism of the race window is understood and it
> depends on the __flush_dcache_page() deciding that the flush should be
> postponed.
As I remember, you said you use HIGHMEM patch, right? It changes a
little __flush_dcache_page() and flush of any mapped page is not
postponed anymore. So, it has an immediate effect for application pages.
- Leonid.
>
>
> - Lars
>
> On Fri, 2014-12-05 at 03:16 +0100, Leonid Yegoshin wrote:
>> (repeat mesg, first one went to wrong place)
>>
>> Lars,
>>
>> Do you have a stack trace or so then you found the second VPE between
>> set_pte_at and update_mmu_cache?
>> It would be interesting how it happens - generally, to get a consistent
>> SIGILL in applications due to misbehaviour of memory subsystem, the bug
>> in FS is not enough.
>>
>> Hold on - do you use non-DMA file system?
>> If so, I advice you to try this simple patch:
>>
>> Author: Leonid Yegoshin <yegoshin@mips.com>
>> Date: Tue Apr 2 14:20:37 2013 -0700
>>
>> MIPS: (opt) Fix of reading I-pages from non-DMA FS devices for ID
>> cache separation
>>
>> This optional fix provides a D-cache flush for instruction code
>> pages on
>> page faults. In case of non-DMA block device a driver doesn't know
>> that it
>> reads I-page and doesn't flush D-cache generally on systems without
>> cache aliasing. And that takes toll during page fault of
>> instruction pages.
>>
>> It is not a perfect fix, it should be considered as a temporary fix.
>> The permanent fix would track page origin in page cache and flushes
>> D-cache
>> during reception of page from driver only but not at each page fault.
>> It is not done yet.
>>
>> Change-Id: I43f5943d6ce0509729179615f6b81e77803a34ac
>> Author: Leonid Yegoshin <yegoshin@mips.com>
>> Signed-off-by: Leonid Yegoshin <yegoshin@mips.com>(imported from
>> commit 6ebd22eb7a3d9873582ebe990a77094f971652ee)(imported from commit
>> 0caf3b4a1eebb64572e81e4df6fdb3abf12c70
>>
>> arch/mips/include/asm/cacheflush.h:
>>
>> @@ -61,6 +61,9 @@ static inline void flush_anon_page(struct
>> vm_area_struct *vma,
>> static inline void flush_icache_page(struct vm_area_struct *vma,
>> struct page *page)
>> {
>> + if (cpu_has_dc_aliases ||
>> + ((vma->vm_flags & VM_EXEC) && !cpu_has_ic_fills_f_dc))
>> + __flush_dcache_page(page);
>> }
>>
>> extern void (*flush_icache_range)(unsigned long start, unsigned
>> long end);
>>
>>
>> It fixed crash problems with non-DMA FS in a couple of our customers.
>> Without it the non-DMA root FS crashes are catastrophic in aliasing
>> systems but it is still a problem for I-cache too but much rare.
>>
>> Unfortunately, it is also a performance hit, however is less than run a
>> page cache flush at each PTE setup. On 12/03/2014 06:03 AM, Lars Persson
>> wrote:
>>> It is the flush_dcache_page() that was called from the file-system
>>> reading the page contents into memory.
>>>
>>> - Lars
>>>
>>>
>
>
next prev parent reply other threads:[~2014-12-05 21:41 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-12-03 3:25 [PATCH] Revert "MIPS: Remove race window in page fault handling" Leonid Yegoshin
2014-12-03 3:25 ` Leonid Yegoshin
2014-12-03 9:31 ` Lars Persson
2014-12-03 13:24 ` Ralf Baechle
2014-12-03 13:42 ` Ralf Baechle
2014-12-03 14:03 ` Lars Persson
2014-12-03 19:28 ` Leonid Yegoshin
2014-12-05 2:16 ` Leonid Yegoshin
2014-12-05 9:32 ` Lars Persson
2014-12-05 21:41 ` Leonid Yegoshin [this message]
2014-12-08 9:18 ` Lars Persson
2014-12-03 14:20 ` Lars Persson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54822685.8070504@imgtec.com \
--to=leonid.yegoshin@imgtec.com \
--cc=akpm@linux-foundation.org \
--cc=blogic@openwrt.org \
--cc=james.hogan@imgtec.com \
--cc=keescook@chromium.org \
--cc=lars.persson@axis.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mips@linux-mips.org \
--cc=manuel.lauss@gmail.com \
--cc=markos.chandras@imgtec.com \
--cc=paul.burton@imgtec.com \
--cc=pbonzini@redhat.com \
--cc=ralf@linux-mips.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.