* [RFC 1/2] arm: cacheflush syscall: process only pages that are in the memory [not found] <CGME20180126111453eucas1p1330d8561386c3cf2bb457bc22c0d99a8@eucas1p1.samsung.com> @ 2018-01-26 11:14 ` Marek Szyprowski [not found] ` <CGME20180126111453eucas1p178663ac7b17d0c92cc889a42b3f5bcec@eucas1p1.samsung.com> 2018-01-26 11:32 ` [RFC 1/2] arm: " Russell King - ARM Linux 0 siblings, 2 replies; 8+ messages in thread From: Marek Szyprowski @ 2018-01-26 11:14 UTC (permalink / raw) To: linux-arm-kernel glibc calls the cacheflush syscall on the whole textrels section of relocated binaries. However, relocation usually doesn't touch all pages of that section, so not all of them are read into memory when this syscall is called. Meanwhile, the flush_cache_user_range() function unconditionally touches all pages in the provided range, resulting in additional overhead from reading in all the clean pages. Optimize this by calling flush_cache_user_range() only on the pages that are already in memory. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> --- arch/arm/kernel/traps.c | 25 +++++++++++++++++++------ 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 5e3633c24e63..a5ec262ab30e 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -564,23 +564,36 @@ static int bad_syscall(int n, struct pt_regs *regs) static inline int __do_cache_op(unsigned long start, unsigned long end) { - int ret; + struct vm_area_struct *vma = NULL; + int ret = 0; + down_read(&current->mm->mmap_sem); do { unsigned long chunk = min(PAGE_SIZE, end - start); + if (!vma || vma->vm_end <= start) { + vma = find_vma(current->mm, start); + if (!vma) { + ret = -EFAULT; + goto done; + } + } + if (fatal_signal_pending(current)) return 0; - ret = flush_cache_user_range(start, start + chunk); - if (ret) - return ret; + if (follow_page(vma, start, 0)) { + ret = flush_cache_user_range(start, start + chunk); + if (ret) + goto done; + } 
cond_resched(); start += chunk; } while (start < end); - - return 0; +done: + up_read(&current->mm->mmap_sem); + return ret; } static inline int -- 2.15.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
[parent not found: <CGME20180126111453eucas1p178663ac7b17d0c92cc889a42b3f5bcec@eucas1p1.samsung.com>]
* [RFC 2/2] arm64: compat: cacheflush syscall: process only pages that are in the memory [not found] ` <CGME20180126111453eucas1p178663ac7b17d0c92cc889a42b3f5bcec@eucas1p1.samsung.com> @ 2018-01-26 11:14 ` Marek Szyprowski 2018-01-26 17:41 ` Catalin Marinas 0 siblings, 1 reply; 8+ messages in thread From: Marek Szyprowski @ 2018-01-26 11:14 UTC (permalink / raw) To: linux-arm-kernel glibc in ARM 32-bit mode calls the cacheflush syscall on the whole textrels section of relocated binaries. However, relocation usually doesn't touch all pages of that section, so not all of them are read into memory when this syscall is called. Meanwhile, the flush_cache_user_range() function unconditionally touches all pages in the provided range, resulting in additional overhead from reading in all the clean pages. Optimize this by calling flush_cache_user_range() only on the pages that are already in memory. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> --- arch/arm64/kernel/sys_compat.c | 28 +++++++++++++++++++------- 1 file changed, 21 insertions(+), 7 deletions(-) diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c index 8b8bbd3eaa52..4a047db3fdd4 100644 --- a/arch/arm64/kernel/sys_compat.c +++ b/arch/arm64/kernel/sys_compat.c @@ -25,6 +25,7 @@ #include <linux/slab.h> #include <linux/syscalls.h> #include <linux/uaccess.h> +#include <linux/mm.h> #include <asm/cacheflush.h> #include <asm/unistd.h> @@ -32,23 +33,36 @@ static long __do_compat_cache_op(unsigned long start, unsigned long end) { - long ret; + struct vm_area_struct *vma = NULL; + long ret = 0; + down_read(&current->mm->mmap_sem); do { unsigned long chunk = min(PAGE_SIZE, end - start); + if (!vma || vma->vm_end <= start) { + vma = find_vma(current->mm, start); + if (!vma) { + ret = -EFAULT; + goto done; + } + } + if (fatal_signal_pending(current)) - return 0; + goto done; - ret = __flush_cache_user_range(start, start + chunk); - if (ret) - return ret; + if (follow_page(vma, start, 0)) { + ret = 
__flush_cache_user_range(start, start + chunk); + if (ret) + goto done; + } cond_resched(); start += chunk; } while (start < end); - - return 0; +done: + up_read(&current->mm->mmap_sem); + return ret; } static inline long -- 2.15.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [RFC 2/2] arm64: compat: cacheflush syscall: process only pages that are in the memory 2018-01-26 11:14 ` [RFC 2/2] arm64: compat: " Marek Szyprowski @ 2018-01-26 17:41 ` Catalin Marinas 2018-01-26 18:02 ` Catalin Marinas 0 siblings, 1 reply; 8+ messages in thread From: Catalin Marinas @ 2018-01-26 17:41 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 26, 2018 at 12:14:41PM +0100, Marek Szyprowski wrote: > @@ -32,23 +33,36 @@ > static long > __do_compat_cache_op(unsigned long start, unsigned long end) [...] > + if (follow_page(vma, start, 0)) { > + ret = __flush_cache_user_range(start, start + chunk); > + if (ret) > + goto done; > + } This looks pretty expensive for pages already in memory. Could we do some tricks with the AT instruction in __flush_cache_user_range() so that we skip the flushing if the page isn't there? We know that when a page is mapped, the cache will get cleaned/invalidated via set_pte_at() + sync_icache_dcache(), so the only problem of a race is flushing the caches twice for a page. If a page gets unmapped after AT, we may bring it back through the cache ops. -- Catalin ^ permalink raw reply [flat|nested] 8+ messages in thread
* [RFC 2/2] arm64: compat: cacheflush syscall: process only pages that are in the memory 2018-01-26 17:41 ` Catalin Marinas @ 2018-01-26 18:02 ` Catalin Marinas 0 siblings, 0 replies; 8+ messages in thread From: Catalin Marinas @ 2018-01-26 18:02 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 26, 2018 at 05:41:45PM +0000, Catalin Marinas wrote: > On Fri, Jan 26, 2018 at 12:14:41PM +0100, Marek Szyprowski wrote: > > @@ -32,23 +33,36 @@ > > static long > > __do_compat_cache_op(unsigned long start, unsigned long end) > [...] > > + if (follow_page(vma, start, 0)) { > > + ret = __flush_cache_user_range(start, start + chunk); > > + if (ret) > > + goto done; > > + } > > This looks pretty expensive for pages already in memory. Could we do > some tricks with the AT instruction in __flush_cache_user_range() so > that we skip the flushing if the page isn't there? We know that when a > page is mapped, the cache will get cleaned/invalidated via set_pte_at() > + sync_icache_dcache(), so the only problem of a race is flushing the > caches twice for a page. If a page gets unmapped after AT, we may bring > it back through the cache ops. Alternatively, we could try to use pagefault_disable()/enable() around this but we need to make sure we cover all the cases where a page may not be flushed even when it is actually present (i.e. old pte). For example, ptep_set_access_flags() skips __sync_icache_dcache(). -- Catalin ^ permalink raw reply [flat|nested] 8+ messages in thread
* [RFC 1/2] arm: cacheflush syscall: process only pages that are in the memory 2018-01-26 11:14 ` [RFC 1/2] arm: cacheflush syscall: process only pages that are in the memory Marek Szyprowski [not found] ` <CGME20180126111453eucas1p178663ac7b17d0c92cc889a42b3f5bcec@eucas1p1.samsung.com> @ 2018-01-26 11:32 ` Russell King - ARM Linux 2018-01-26 13:30 ` Marek Szyprowski 1 sibling, 1 reply; 8+ messages in thread From: Russell King - ARM Linux @ 2018-01-26 11:32 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 26, 2018 at 12:14:40PM +0100, Marek Szyprowski wrote: > glibc in calls cacheflush syscall on the whole textrels section of the > relocated binaries. However, relocation usually doesn't touch all pages > of that section, so not all of them are read to memory when calling this > syscall. However flush_cache_user_range() function will unconditionally > touch all pages from the provided range, resulting additional overhead > related to reading all clean pages. Optimize this by calling > flush_cache_user_range() only on the pages that are already in the > memory. What ensures that another CPU doesn't remove a page while we're flushing it? That will trigger a data abort, which will want to take the mmap_sem, causing a deadlock. 
> > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> > --- > arch/arm/kernel/traps.c | 25 +++++++++++++++++++------ > 1 file changed, 19 insertions(+), 6 deletions(-) > > diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c > index 5e3633c24e63..a5ec262ab30e 100644 > --- a/arch/arm/kernel/traps.c > +++ b/arch/arm/kernel/traps.c > @@ -564,23 +564,36 @@ static int bad_syscall(int n, struct pt_regs *regs) > static inline int > __do_cache_op(unsigned long start, unsigned long end) > { > - int ret; > + struct vm_area_struct *vma = NULL; > + int ret = 0; > > + down_read(&current->mm->mmap_sem); > do { > unsigned long chunk = min(PAGE_SIZE, end - start); > > + if (!vma || vma->vm_end <= start) { > + vma = find_vma(current->mm, start); > + if (!vma) { > + ret = -EFAULT; > + goto done; > + } > + } > + > if (fatal_signal_pending(current)) > return 0; > > - ret = flush_cache_user_range(start, start + chunk); > - if (ret) > - return ret; > + if (follow_page(vma, start, 0)) { > + ret = flush_cache_user_range(start, start + chunk); > + if (ret) > + goto done; > + } > > cond_resched(); > start += chunk; > } while (start < end); > - > - return 0; > +done: > + up_read(&current->mm->mmap_sem); > + return ret; > } > > static inline int > -- > 2.15.0 > -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up According to speedtest.net: 8.21Mbps down 510kbps up ^ permalink raw reply [flat|nested] 8+ messages in thread
* [RFC 1/2] arm: cacheflush syscall: process only pages that are in the memory 2018-01-26 11:32 ` [RFC 1/2] arm: " Russell King - ARM Linux @ 2018-01-26 13:30 ` Marek Szyprowski 2018-01-26 21:39 ` Russell King - ARM Linux 0 siblings, 1 reply; 8+ messages in thread From: Marek Szyprowski @ 2018-01-26 13:30 UTC (permalink / raw) To: linux-arm-kernel Hi Russell, On 2018-01-26 12:32, Russell King - ARM Linux wrote: > On Fri, Jan 26, 2018 at 12:14:40PM +0100, Marek Szyprowski wrote: >> glibc in calls cacheflush syscall on the whole textrels section of the >> relocated binaries. However, relocation usually doesn't touch all pages >> of that section, so not all of them are read to memory when calling this >> syscall. However flush_cache_user_range() function will unconditionally >> touch all pages from the provided range, resulting additional overhead >> related to reading all clean pages. Optimize this by calling >> flush_cache_user_range() only on the pages that are already in the >> memory. > What ensures that another CPU doesn't remove a page while we're > flushing it? That will trigger a data abort, which will want to > take the mmap_sem, causing a deadlock. I thought that taking mmap_sem will prevent pages from being removed. mmap_sem has been already taken in the previous implementation of that syscall, until code simplification done by commit 97c72d89ce0e ("ARM: cacheflush: don't bother rounding to nearest vma"). 
>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >> --- >> arch/arm/kernel/traps.c | 25 +++++++++++++++++++------ >> 1 file changed, 19 insertions(+), 6 deletions(-) >> >> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c >> index 5e3633c24e63..a5ec262ab30e 100644 >> --- a/arch/arm/kernel/traps.c >> +++ b/arch/arm/kernel/traps.c >> @@ -564,23 +564,36 @@ static int bad_syscall(int n, struct pt_regs *regs) >> static inline int >> __do_cache_op(unsigned long start, unsigned long end) >> { >> - int ret; >> + struct vm_area_struct *vma = NULL; >> + int ret = 0; >> >> + down_read(&current->mm->mmap_sem); >> do { >> unsigned long chunk = min(PAGE_SIZE, end - start); >> >> + if (!vma || vma->vm_end <= start) { >> + vma = find_vma(current->mm, start); >> + if (!vma) { >> + ret = -EFAULT; >> + goto done; >> + } >> + } >> + >> if (fatal_signal_pending(current)) >> return 0; >> >> - ret = flush_cache_user_range(start, start + chunk); >> - if (ret) >> - return ret; >> + if (follow_page(vma, start, 0)) { >> + ret = flush_cache_user_range(start, start + chunk); >> + if (ret) >> + goto done; >> + } >> >> cond_resched(); >> start += chunk; >> } while (start < end); >> - >> - return 0; >> +done: >> + up_read(&current->mm->mmap_sem); >> + return ret; >> } >> >> static inline int >> -- >> 2.15.0 >> Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland ^ permalink raw reply [flat|nested] 8+ messages in thread
* [RFC 1/2] arm: cacheflush syscall: process only pages that are in the memory 2018-01-26 13:30 ` Marek Szyprowski @ 2018-01-26 21:39 ` Russell King - ARM Linux 2018-01-31 6:03 ` Inki Dae 0 siblings, 1 reply; 8+ messages in thread From: Russell King - ARM Linux @ 2018-01-26 21:39 UTC (permalink / raw) To: linux-arm-kernel On Fri, Jan 26, 2018 at 02:30:47PM +0100, Marek Szyprowski wrote: > Hi Russell, > > On 2018-01-26 12:32, Russell King - ARM Linux wrote: > >On Fri, Jan 26, 2018 at 12:14:40PM +0100, Marek Szyprowski wrote: > >>glibc in calls cacheflush syscall on the whole textrels section of the > >>relocated binaries. However, relocation usually doesn't touch all pages > >>of that section, so not all of them are read to memory when calling this > >>syscall. However flush_cache_user_range() function will unconditionally > >>touch all pages from the provided range, resulting additional overhead > >>related to reading all clean pages. Optimize this by calling > >>flush_cache_user_range() only on the pages that are already in the > >>memory. > >What ensures that another CPU doesn't remove a page while we're > >flushing it? That will trigger a data abort, which will want to > >take the mmap_sem, causing a deadlock. > > I thought that taking mmap_sem will prevent pages from being removed. > mmap_sem has been already taken in the previous implementation of that > syscall, until code simplification done by commit 97c72d89ce0e ("ARM: > cacheflush: don't bother rounding to nearest vma"). No, you're not reading the previous code state correctly. Take a closer look at that commit. find_vma() requires that mmap_sem is held across the call as the VMA list is not stable without that semaphore held. However, more importantly, notice that it drops the semaphore _before_ calling the cache flushing function (__do_cache_op()). The point is that if __do_cache_op() faults, it will enter do_page_fault(), which will try to take the mmap_sem again, causing a deadlock. 
-- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up According to speedtest.net: 8.21Mbps down 510kbps up ^ permalink raw reply [flat|nested] 8+ messages in thread
* [RFC 1/2] arm: cacheflush syscall: process only pages that are in the memory 2018-01-26 21:39 ` Russell King - ARM Linux @ 2018-01-31 6:03 ` Inki Dae 0 siblings, 0 replies; 8+ messages in thread From: Inki Dae @ 2018-01-31 6:03 UTC (permalink / raw) To: linux-arm-kernel Hi Russell, On 2018-01-27 06:39, Russell King - ARM Linux wrote: > On Fri, Jan 26, 2018 at 02:30:47PM +0100, Marek Szyprowski wrote: >> Hi Russell, >> >> On 2018-01-26 12:32, Russell King - ARM Linux wrote: >>> On Fri, Jan 26, 2018 at 12:14:40PM +0100, Marek Szyprowski wrote: >>>> glibc in calls cacheflush syscall on the whole textrels section of the >>>> relocated binaries. However, relocation usually doesn't touch all pages >>>> of that section, so not all of them are read to memory when calling this >>>> syscall. However flush_cache_user_range() function will unconditionally >>>> touch all pages from the provided range, resulting additional overhead >>>> related to reading all clean pages. Optimize this by calling >>>> flush_cache_user_range() only on the pages that are already in the >>>> memory. >>> What ensures that another CPU doesn't remove a page while we're >>> flushing it? That will trigger a data abort, which will want to >>> take the mmap_sem, causing a deadlock. >> >> I thought that taking mmap_sem will prevent pages from being removed. >> mmap_sem has been already taken in the previous implementation of that >> syscall, until code simplification done by commit 97c72d89ce0e ("ARM: >> cacheflush: don't bother rounding to nearest vma"). > > No, you're not reading the previous code state correctly. Take a closer > look at that commit. > > find_vma() requires that mmap_sem is held across the call as the VMA > list is not stable without that semaphore held. However, more > importantly, notice that it drops the semaphore _before_ calling the > cache flushing function (__do_cache_op()). 
> > The point is that if __do_cache_op() faults, it will enter > do_page_fault(), which will try to take the mmap_sem again, causing > a deadlock. I'm not sure, but it seems this patch tries to cache-flush only the in-memory pages, so I think the page fault wouldn't happen, because flush_cache_user_range() always returns 0. Thanks, Inki Dae > ^ permalink raw reply [flat|nested] 8+ messages in thread