* [PATCH v2] ARM64: Kernel managed pages are only flushed
From: Bharat Bhushan @ 2014-03-05 11:25 UTC
To: linux-arm-kernel

The kernel can only access pages that map to managed memory,
so flush only valid kernel pages.

I observed a kernel crash when directly assigning a device using VFIO
and found that it was caused by accessing an invalid page.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
---
v1->v2
 Getting pfn using pte_pfn() in pfn_valid.

 arch/arm64/mm/flush.c |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
index e4193e3..319826a 100644
--- a/arch/arm64/mm/flush.c
+++ b/arch/arm64/mm/flush.c
@@ -72,7 +72,18 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
 
 void __sync_icache_dcache(pte_t pte, unsigned long addr)
 {
-	struct page *page = pte_page(pte);
+	struct page *page;
+
+#ifdef CONFIG_HAVE_ARCH_PFN_VALID
+	/*
+	 * We can only access pages that the kernel maps
+	 * as memory. Bail out for unmapped ones.
+	 */
+	if (!pfn_valid(pte_pfn(pte)))
+		return;
+
+#endif
+	page = pte_page(pte);
 
 	/* no flushing needed for anonymous pages */
 	if (!page_mapping(page))
-- 
1.7.0.4

* [PATCH v2] ARM64: Kernel managed pages are only flushed
From: Will Deacon @ 2014-03-05 16:12 UTC
To: linux-arm-kernel

On Wed, Mar 05, 2014 at 11:25:16AM +0000, Bharat Bhushan wrote:
> Kernel can only access pages which maps to managed memory.
> So flush only valid kernel pages.
>
> I observed kernel crash direct assigning a device using VFIO
> and found that it was caused because of accessing invalid page
>
> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
> ---
> v1->v2
>  Getting pfn usin pte_pfn() in pfn_valid.
>
>  arch/arm64/mm/flush.c |   13 ++++++++++++-
>  1 files changed, 12 insertions(+), 1 deletions(-)
>
> diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
> index e4193e3..319826a 100644
> --- a/arch/arm64/mm/flush.c
> +++ b/arch/arm64/mm/flush.c
> @@ -72,7 +72,18 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
>
>  void __sync_icache_dcache(pte_t pte, unsigned long addr)
>  {
> -	struct page *page = pte_page(pte);
> +	struct page *page;
> +
> +#ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +	/*
> +	 * We can only access pages that the kernel maps
> +	 * as memory. Bail out for unmapped ones.
> +	 */
> +	if (!pfn_valid(pte_pfn(pte)))
> +		return;
> +
> +#endif
> +	page = pte_page(pte);

How do you get into this function without a valid, userspace, executable pte?

I suspect you've got changes elsewhere and are calling this function in a
context where it's not supposed to be called.

Will

* [PATCH v2] ARM64: Kernel managed pages are only flushed
From: Bharat.Bhushan at freescale.com @ 2014-03-05 16:27 UTC
To: linux-arm-kernel

> -----Original Message-----
> From: Will Deacon [mailto:will.deacon at arm.com]
> Sent: Wednesday, March 05, 2014 9:43 PM
> To: Bhushan Bharat-R65777
> Cc: Catalin Marinas; linux-arm-kernel at lists.infradead.org; Bhushan Bharat-R65777
> Subject: Re: [PATCH v2] ARM64: Kernel managed pages are only flushed
>
> On Wed, Mar 05, 2014 at 11:25:16AM +0000, Bharat Bhushan wrote:
> > Kernel can only access pages which maps to managed memory.
> > So flush only valid kernel pages.
> >
> > I observed kernel crash direct assigning a device using VFIO and found
> > that it was caused because of accessing invalid page
> >
> > Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
> > ---
> > v1->v2
> >  Getting pfn usin pte_pfn() in pfn_valid.
> >
> >  arch/arm64/mm/flush.c |   13 ++++++++++++-
> >  1 files changed, 12 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
> > index e4193e3..319826a 100644
> > --- a/arch/arm64/mm/flush.c
> > +++ b/arch/arm64/mm/flush.c
> > @@ -72,7 +72,18 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
> >
> >  void __sync_icache_dcache(pte_t pte, unsigned long addr)
> >  {
> > -	struct page *page = pte_page(pte);
> > +	struct page *page;
> > +
> > +#ifdef CONFIG_HAVE_ARCH_PFN_VALID
> > +	/*
> > +	 * We can only access pages that the kernel maps
> > +	 * as memory. Bail out for unmapped ones.
> > +	 */
> > +	if (!pfn_valid(pte_pfn(pte)))
> > +		return;
> > +
> > +#endif
> > +	page = pte_page(pte);
>
> How do you get into this function without a valid, userspace, executable pte?
>
> I suspect you've got changes elsewhere and are calling this function in a
> context where it's not supposed to be called.

Let me describe the context in which this function is called.

When we directly assign a bus device to user space using VFIO (we have a
different Freescale-specific bus device, but we can take a PCI device for
discussion, as I think this logic applies equally to PCI devices), userspace
needs to mmap(PCI_BARx_offset); this PCI BAR offset is not kernel-visible
memory. The VFIO kernel-side mmap() code then calls remap_pfn_range() to map
the requested address, and remap_pfn_range() internally calls this function.

Thanks
-Bharat

>
> Will
>

* [PATCH v2] ARM64: Kernel managed pages are only flushed
From: Laura Abbott @ 2014-03-05 20:03 UTC
To: linux-arm-kernel

On 3/5/2014 8:27 AM, Bharat.Bhushan at freescale.com wrote:
>
>
>> -----Original Message-----
>> From: Will Deacon [mailto:will.deacon at arm.com]
>> Sent: Wednesday, March 05, 2014 9:43 PM
>> To: Bhushan Bharat-R65777
>> Cc: Catalin Marinas; linux-arm-kernel at lists.infradead.org; Bhushan Bharat-R65777
>> Subject: Re: [PATCH v2] ARM64: Kernel managed pages are only flushed
>>
>> On Wed, Mar 05, 2014 at 11:25:16AM +0000, Bharat Bhushan wrote:
>>> Kernel can only access pages which maps to managed memory.
>>> So flush only valid kernel pages.
>>>
>>> I observed kernel crash direct assigning a device using VFIO and found
>>> that it was caused because of accessing invalid page
>>>
>>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
>>> ---
>>> v1->v2
>>>  Getting pfn usin pte_pfn() in pfn_valid.
>>>
>>>  arch/arm64/mm/flush.c |   13 ++++++++++++-
>>>  1 files changed, 12 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
>>> index e4193e3..319826a 100644
>>> --- a/arch/arm64/mm/flush.c
>>> +++ b/arch/arm64/mm/flush.c
>>> @@ -72,7 +72,18 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
>>>
>>>  void __sync_icache_dcache(pte_t pte, unsigned long addr)
>>>  {
>>> -	struct page *page = pte_page(pte);
>>> +	struct page *page;
>>> +
>>> +#ifdef CONFIG_HAVE_ARCH_PFN_VALID
>>> +	/*
>>> +	 * We can only access pages that the kernel maps
>>> +	 * as memory. Bail out for unmapped ones.
>>> +	 */
>>> +	if (!pfn_valid(pte_pfn(pte)))
>>> +		return;
>>> +
>>> +#endif
>>> +	page = pte_page(pte);
>>
>> How do you get into this function without a valid, userspace, executable pte?
>>
>> I suspect you've got changes elsewhere and are calling this function in a
>> context where it's not supposed to be called.
>
> Below I will describe the context in which this function is called:
>
> When we direct assign a bus device (we have a different freescale specific bus
> device but we can take PCI device for discussion as this logic applies equally
> for PCI device I think) to user space using VFIO. Then userspace needs to
> mmap(PCI_BARx_offset: this PCI bar offset in not a kernel visible memory).
> Then VFIO-kernel mmap() ioctl code calls remap_pfn_range() for mapping the
> requested address. While remap_pfn_range() internally calls this function.
>

As someone who likes calling functions in contexts where they aren't
supposed to be called, I took a look at this because I was curious.

I can confirm the same problem when trying to mmap arbitrary I/O address
space with remap_pfn_range. We should only be hitting this if the pte is
marked as exec, per set_pte_at. With my test case, even mmapping with only
PROT_READ and PROT_WRITE was setting PROT_EXEC as well, which was
triggering the bug. This seems to be because the READ_IMPLIES_EXEC
personality was set, which was derived from

#define elf_read_implies_exec(ex,stk)   (stk != EXSTACK_DISABLE_X)

and none of the binaries I'm generating seem to be setting the stack
execute bit either way (all are EXECSTACK_DEFAULT).

It's not obvious what the best solution is here.

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
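
The chain described in the message above (no explicit stack marking, so READ_IMPLIES_EXEC is set, so a read/write mmap silently gains exec) can be sketched as a small userspace model. This is not kernel code. The enum values follow the stack-protection constants in include/linux/binfmts.h, where the default is spelled EXSTACK_DEFAULT; effective_prot() and demo_checks() are names assumed here for illustration, and the model ignores the kernel's exception for files on noexec mounts.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>   /* PROT_READ, PROT_WRITE, PROT_EXEC */

/* Stack-protection states an ELF loader can report for a binary. */
enum exstack {
    EXSTACK_DEFAULT   = 0,  /* no explicit marking: arch default applies */
    EXSTACK_DISABLE_X = 1,  /* stack explicitly marked non-executable */
    EXSTACK_ENABLE_X  = 2,  /* stack explicitly marked executable */
};

/* Same predicate as the macro quoted in the thread, as a function. */
bool elf_read_implies_exec(enum exstack stk)
{
    return stk != EXSTACK_DISABLE_X;
}

/*
 * Simplified model (assumption): with READ_IMPLIES_EXEC set, any
 * readable mapping request is widened to include PROT_EXEC.
 */
int effective_prot(int prot, bool read_implies_exec)
{
    if ((prot & PROT_READ) && read_implies_exec)
        prot |= PROT_EXEC;
    return prot;
}

/* Small self-check showing the case reported in the thread. */
void demo_checks(void)
{
    bool rie = elf_read_implies_exec(EXSTACK_DEFAULT);

    /* A plain read/write request ends up executable. */
    assert(effective_prot(PROT_READ | PROT_WRITE, rie) & PROT_EXEC);
}
```

Under this model, only a binary explicitly marked with a non-executable stack keeps a read/write mapping non-executable, which matches the observation that unmarked test binaries hit the exec path.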

* [PATCH v2] ARM64: Kernel managed pages are only flushed
From: Bharat.Bhushan at freescale.com @ 2014-03-06 3:38 UTC
To: linux-arm-kernel

> -----Original Message-----
> From: Laura Abbott [mailto:lauraa at codeaurora.org]
> Sent: Thursday, March 06, 2014 1:34 AM
> To: Bhushan Bharat-R65777; Will Deacon
> Cc: Wood Scott-B07421; Catalin Marinas; Yoder Stuart-B08248;
> linux-arm-kernel at lists.infradead.org
> Subject: Re: [PATCH v2] ARM64: Kernel managed pages are only flushed
>
> On 3/5/2014 8:27 AM, Bharat.Bhushan at freescale.com wrote:
> >
> >
> >> -----Original Message-----
> >> From: Will Deacon [mailto:will.deacon at arm.com]
> >> Sent: Wednesday, March 05, 2014 9:43 PM
> >> To: Bhushan Bharat-R65777
> >> Cc: Catalin Marinas; linux-arm-kernel at lists.infradead.org; Bhushan
> >> Bharat-R65777
> >> Subject: Re: [PATCH v2] ARM64: Kernel managed pages are only flushed
> >>
> >> On Wed, Mar 05, 2014 at 11:25:16AM +0000, Bharat Bhushan wrote:
> >>> Kernel can only access pages which maps to managed memory.
> >>> So flush only valid kernel pages.
> >>>
> >>> I observed kernel crash direct assigning a device using VFIO and
> >>> found that it was caused because of accessing invalid page
> >>>
> >>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
> >>> ---
> >>> v1->v2
> >>>  Getting pfn usin pte_pfn() in pfn_valid.
> >>>
> >>>  arch/arm64/mm/flush.c |   13 ++++++++++++-
> >>>  1 files changed, 12 insertions(+), 1 deletions(-)
> >>>
> >>> diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
> >>> index e4193e3..319826a 100644
> >>> --- a/arch/arm64/mm/flush.c
> >>> +++ b/arch/arm64/mm/flush.c
> >>> @@ -72,7 +72,18 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
> >>>
> >>>  void __sync_icache_dcache(pte_t pte, unsigned long addr)
> >>>  {
> >>> -	struct page *page = pte_page(pte);
> >>> +	struct page *page;
> >>> +
> >>> +#ifdef CONFIG_HAVE_ARCH_PFN_VALID
> >>> +	/*
> >>> +	 * We can only access pages that the kernel maps
> >>> +	 * as memory. Bail out for unmapped ones.
> >>> +	 */
> >>> +	if (!pfn_valid(pte_pfn(pte)))
> >>> +		return;
> >>> +
> >>> +#endif
> >>> +	page = pte_page(pte);
> >>
> >> How do you get into this function without a valid, userspace, executable pte?
> >>
> >> I suspect you've got changes elsewhere and are calling this function
> >> in a context where it's not supposed to be called.
> >
> > Below I will describe the context in which this function is called:
> >
> > When we direct assign a bus device (we have a different freescale
> > specific bus device but we can take PCI device for discussion as this
> > logic applies equally for PCI device I think) to user space using VFIO.
> > Then userspace needs to mmap(PCI_BARx_offset: this PCI bar offset in
> > not a kernel visible memory).
> > Then VFIO-kernel mmap() ioctl code calls remap_pfn_range() for
> > mapping the requested address. While remap_pfn_range() internally
> > calls this function.
> >
>
> As someone who likes calling functions in context where they aren't supposed to
> be called, I took a look a this because I was curious.

Are we saying that remap_pfn_range() should not be called in this case (the
case described earlier, of directly assigning a PCI device to user space
using VFIO)? x86 and powerpc call this same function.

>
> I can confirm the same problem trying to mmap arbitrary io address space with
> remap_pfn_range. We should only be hitting this if the pte is marked as exec per
> set_pte_at. With my test case, even mmaping with only PROT_READ and PROT_WRITE
> was setting PROT_EXEC as well which was triggering the bug. This seems to be
> because READ_IMPLIES_EXEC personality was set which was derived from
>
> #define elf_read_implies_exec(ex,stk)   (stk != EXSTACK_DISABLE_X)
>
> and none of the binaries I'm generating seem to be setting the stack execute bit
> either way (all are EXECSTACK_DEFAULT).

Yes, I agree that even if we set only PROT_READ and PROT_WRITE, it internally
ends up setting PROT_EXEC, so we enter this flow. But I see this as a second
issue. I am not sure, but theoretically it can still happen that we set
PROT_EXEC for an anonymous page.

So either __sync_icache_dcache() should check that it does not access an
anonymous struct page (which is what this patch does), or
__sync_icache_dcache() should not be called for an anonymous page. Maybe
something like this:

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f0bebc5..9493f3e 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -167,7 +167,7 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 			      pte_t *ptep, pte_t pte)
 {
 	if (pte_valid_user(pte)) {
-		if (pte_exec(pte))
+		if (pte_exec(pte) && pfn_valid(pte_pfn(pte)))
 			__sync_icache_dcache(pte, addr);
 		if (!pte_dirty(pte))
 			pte = pte_wrprotect(pte);

Please suggest if there is some other solution.

Thanks
-Bharat

>
> It's not obvious what the best solution is here.
>
> Thanks,
> Laura
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The
> Linux Foundation
>
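
The gating proposed in the set_pte_at() hunk above can be modelled in userspace to show which mappings would still reach the cache maintenance path. This is only a sketch: struct toy_pte, the RAM window, and every toy_* name are hypothetical stand-ins, not the kernel's pte_t, pfn_valid() or memory map.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page table entry; not the kernel's pte_t. */
struct toy_pte {
    unsigned long pfn;  /* page frame number the entry points at */
    bool valid_user;    /* models pte_valid_user() */
    bool exec;          /* models pte_exec() */
};

/* Hypothetical RAM window standing in for the kernel's memory map. */
#define TOY_RAM_START_PFN 0x80000UL
#define TOY_RAM_END_PFN   0xC0000UL   /* exclusive */

/* Models pfn_valid(): true only for frames inside the RAM window. */
bool toy_pfn_valid(unsigned long pfn)
{
    return pfn >= TOY_RAM_START_PFN && pfn < TOY_RAM_END_PFN;
}

/*
 * Models the proposed condition: the I-cache/D-cache sync is attempted
 * only for user, executable entries whose frame is backed by RAM.
 */
bool toy_needs_icache_sync(struct toy_pte pte)
{
    return pte.valid_user && pte.exec && toy_pfn_valid(pte.pfn);
}

/* Small self-check covering the RAM case and the device BAR case. */
void demo_checks(void)
{
    struct toy_pte ram = { 0x80010UL, true, true };
    struct toy_pte bar = { 0x01000UL, true, true };  /* device BAR, no struct page */

    assert(toy_needs_icache_sync(ram));
    assert(!toy_needs_icache_sync(bar));
}
```

In this model the device BAR mapping still carries the exec attribute; the check only avoids touching a struct page that does not exist.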

* [PATCH v2] ARM64: Kernel managed pages are only flushed
From: Will Deacon @ 2014-03-06 16:18 UTC
To: linux-arm-kernel

Hi Laura,

On Wed, Mar 05, 2014 at 08:03:58PM +0000, Laura Abbott wrote:
> On 3/5/2014 8:27 AM, Bharat.Bhushan at freescale.com wrote:
> >> On Wed, Mar 05, 2014 at 11:25:16AM +0000, Bharat Bhushan wrote:
> >>> Kernel can only access pages which maps to managed memory.
> >>> So flush only valid kernel pages.
> >>>
> >> How do you get into this function without a valid, userspace, executable pte?
> >>
> >> I suspect you've got changes elsewhere and are calling this function in a
> >> context where it's not supposed to be called.
> >
> > Below I will describe the context in which this function is called:
> >
> > When we direct assign a bus device (we have a different freescale specific bus
> > device but we can take PCI device for discussion as this logic applies equally
> > for PCI device I think) to user space using VFIO. Then userspace needs to
> > mmap(PCI_BARx_offset: this PCI bar offset in not a kernel visible memory).
> > Then VFIO-kernel mmap() ioctl code calls remap_pfn_range() for mapping the
> > requested address. While remap_pfn_range() internally calls this function.
> >
>
> As someone who likes calling functions in context where they aren't
> supposed to be called, I took a look a this because I was curious.

Somebody should hide your keyboard. Stephen?

> I can confirm the same problem trying to mmap arbitrary io address space
> with remap_pfn_range. We should only be hitting this if the pte is
> marked as exec per set_pte_at. With my test case, even mmaping with only
> PROT_READ and PROT_WRITE was setting PROT_EXEC as well which was
> triggering the bug. This seems to be because READ_IMPLIES_EXEC
> personality was set which was derived from
>
> #define elf_read_implies_exec(ex,stk)   (stk != EXSTACK_DISABLE_X)
>
> and none of the binaries I'm generating seem to be setting the stack
> execute bit either way (all are EXECSTACK_DEFAULT).
>
> It's not obvious what the best solution is here.

It would be nice if something like phys_mem_access_prot was used by the
callers, since this is used by the /dev/mem driver to make sure that the
pgprot is sane for the underlying pfn.

In the absence of that, I guess we could add the pfn_valid check (we have it
already on arch/arm/) but if that means we end up with executable devices,
we're still entering a world of
looks-like-my-instruction-fetcher-just-acked-an-irq style pain.

Will
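
The idea of deriving the protection from the underlying pfn, so that a non-RAM frame never ends up executable, can be sketched as a toy userspace model. Everything here is an assumption made for illustration: the TOY_PROT_* flags, the RAM window, and toy_sanitize_prot() are hypothetical, and the sketch does not reproduce what the kernel's phys_mem_access_prot actually does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical protection bits; not the kernel's pgprot encoding. */
#define TOY_PROT_READ   0x1u
#define TOY_PROT_WRITE  0x2u
#define TOY_PROT_EXEC   0x4u
#define TOY_PROT_DEVICE 0x8u   /* stands in for a device memory type */

/* Hypothetical RAM window; frames outside it are treated as device I/O. */
#define TOY_RAM_START_PFN 0x80000UL
#define TOY_RAM_END_PFN   0xC0000UL   /* exclusive */

bool toy_is_ram(unsigned long pfn)
{
    return pfn >= TOY_RAM_START_PFN && pfn < TOY_RAM_END_PFN;
}

/*
 * Assumed policy: RAM keeps the requested protection; anything else is
 * forced to a device type and loses the exec bit.
 */
unsigned int toy_sanitize_prot(unsigned long pfn, unsigned int prot)
{
    if (toy_is_ram(pfn))
        return prot;
    return (prot & ~TOY_PROT_EXEC) | TOY_PROT_DEVICE;
}

/* Small self-check: a device frame requested as RWX comes back as RW+device. */
void demo_checks(void)
{
    unsigned int rwx = TOY_PROT_READ | TOY_PROT_WRITE | TOY_PROT_EXEC;

    assert(toy_sanitize_prot(0x80010UL, rwx) == rwx);
    assert(!(toy_sanitize_prot(0x01000UL, rwx) & TOY_PROT_EXEC));
}
```

Compared with a pfn_valid() check placed only in front of the cache sync, a policy of this shape would also remove the exec attribute from the device mapping itself, which is the concern raised in the last paragraph of the message.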