* [PATCH v2] ARM64: Kernel managed pages are only flushed
@ 2014-03-05 11:25 Bharat Bhushan
2014-03-05 16:12 ` Will Deacon
0 siblings, 1 reply; 6+ messages in thread
From: Bharat Bhushan @ 2014-03-05 11:25 UTC (permalink / raw)
To: linux-arm-kernel
The kernel can only access pages that map to managed memory,
so flush only valid kernel pages.
I observed a kernel crash when directly assigning a device using VFIO
and found that it was caused by accessing an invalid page.
Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
---
v1->v2
Getting pfn using pte_pfn() in pfn_valid.
arch/arm64/mm/flush.c | 13 ++++++++++++-
1 files changed, 12 insertions(+), 1 deletions(-)
diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
index e4193e3..319826a 100644
--- a/arch/arm64/mm/flush.c
+++ b/arch/arm64/mm/flush.c
@@ -72,7 +72,18 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
void __sync_icache_dcache(pte_t pte, unsigned long addr)
{
- struct page *page = pte_page(pte);
+ struct page *page;
+
+#ifdef CONFIG_HAVE_ARCH_PFN_VALID
+ /*
+ * We can only access pages that the kernel maps
+ * as memory. Bail out for unmapped ones.
+ */
+ if (!pfn_valid(pte_pfn(pte)))
+ return;
+
+#endif
+ page = pte_page(pte);
/* no flushing needed for anonymous pages */
if (!page_mapping(page))
--
1.7.0.4
* [PATCH v2] ARM64: Kernel managed pages are only flushed
2014-03-05 11:25 [PATCH v2] ARM64: Kernel managed pages are only flushed Bharat Bhushan
@ 2014-03-05 16:12 ` Will Deacon
2014-03-05 16:27 ` Bharat.Bhushan at freescale.com
0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2014-03-05 16:12 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Mar 05, 2014 at 11:25:16AM +0000, Bharat Bhushan wrote:
> Kernel can only access pages which maps to managed memory.
> So flush only valid kernel pages.
>
> I observed kernel crash direct assigning a device using VFIO
> and found that it was caused because of accessing invalid page
>
> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
> ---
> v1->v2
> Getting pfn usin pte_pfn() in pfn_valid.
>
> arch/arm64/mm/flush.c | 13 ++++++++++++-
> 1 files changed, 12 insertions(+), 1 deletions(-)
>
> diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
> index e4193e3..319826a 100644
> --- a/arch/arm64/mm/flush.c
> +++ b/arch/arm64/mm/flush.c
> @@ -72,7 +72,18 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
>
> void __sync_icache_dcache(pte_t pte, unsigned long addr)
> {
> - struct page *page = pte_page(pte);
> + struct page *page;
> +
> +#ifdef CONFIG_HAVE_ARCH_PFN_VALID
> + /*
> + * We can only access pages that the kernel maps
> + * as memory. Bail out for unmapped ones.
> + */
> + if (!pfn_valid(pte_pfn(pte)))
> + return;
> +
> +#endif
> + page = pte_page(pte);
How do you get into this function without a valid, userspace, executable pte?
I suspect you've got changes elsewhere and are calling this function in a
context where it's not supposed to be called.
Will
* [PATCH v2] ARM64: Kernel managed pages are only flushed
2014-03-05 16:12 ` Will Deacon
@ 2014-03-05 16:27 ` Bharat.Bhushan at freescale.com
2014-03-05 20:03 ` Laura Abbott
0 siblings, 1 reply; 6+ messages in thread
From: Bharat.Bhushan at freescale.com @ 2014-03-05 16:27 UTC (permalink / raw)
To: linux-arm-kernel
> -----Original Message-----
> From: Will Deacon [mailto:will.deacon at arm.com]
> Sent: Wednesday, March 05, 2014 9:43 PM
> To: Bhushan Bharat-R65777
> Cc: Catalin Marinas; linux-arm-kernel at lists.infradead.org; Bhushan Bharat-R65777
> Subject: Re: [PATCH v2] ARM64: Kernel managed pages are only flushed
>
> On Wed, Mar 05, 2014 at 11:25:16AM +0000, Bharat Bhushan wrote:
> > Kernel can only access pages which maps to managed memory.
> > So flush only valid kernel pages.
> >
> > I observed kernel crash direct assigning a device using VFIO and found
> > that it was caused because of accessing invalid page
> >
> > Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
> > ---
> > v1->v2
> > Getting pfn usin pte_pfn() in pfn_valid.
> >
> > arch/arm64/mm/flush.c | 13 ++++++++++++-
> > 1 files changed, 12 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
> > index e4193e3..319826a 100644
> > --- a/arch/arm64/mm/flush.c
> > +++ b/arch/arm64/mm/flush.c
> > @@ -72,7 +72,18 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
> >
> > void __sync_icache_dcache(pte_t pte, unsigned long addr)
> > {
> > - struct page *page = pte_page(pte);
> > + struct page *page;
> > +
> > +#ifdef CONFIG_HAVE_ARCH_PFN_VALID
> > + /*
> > + * We can only access pages that the kernel maps
> > + * as memory. Bail out for unmapped ones.
> > + */
> > + if (!pfn_valid(pte_pfn(pte)))
> > + return;
> > +
> > +#endif
> > + page = pte_page(pte);
>
> How do you get into this function without a valid, userspace, executable pte?
>
> I suspect you've got changes elsewhere and are calling this function in a
> context where it's not supposed to be called.
Below I will describe the context in which this function is called:
When we directly assign a bus device to user space using VFIO (we have a different Freescale-specific bus device, but we can take a PCI device for discussion, as I think this logic applies equally to PCI devices), userspace needs to mmap() the device region (PCI_BARx_offset; this PCI BAR offset is not kernel-visible memory). The VFIO kernel mmap() code then calls remap_pfn_range() to map the requested address, and remap_pfn_range() internally calls this function.
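To make the failure mode concrete, here is a minimal userspace sketch in C. It is a toy model and not kernel code: the bank layout, the PFN values, and every toy_* name are assumptions invented for illustration. It only shows why a device BAR page frame fails a pfn_valid()-style check and why bailing out early avoids looking up a struct page that does not exist.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical RAM bank, expressed in page frame numbers (end exclusive). */
struct toy_bank {
    unsigned long start_pfn;
    unsigned long end_pfn;
};

/* Made-up layout: two RAM banks. Anything outside them has no struct page. */
static const struct toy_bank toy_banks[] = {
    { 0x80000UL,  0xc0000UL  },
    { 0x100000UL, 0x140000UL },
};

/* Toy stand-in for pfn_valid(): true only for PFNs inside a RAM bank. */
static bool toy_pfn_valid(unsigned long pfn)
{
    for (size_t i = 0; i < sizeof(toy_banks) / sizeof(toy_banks[0]); i++) {
        if (pfn >= toy_banks[i].start_pfn && pfn < toy_banks[i].end_pfn)
            return true;
    }
    return false;
}

/*
 * Toy stand-in for the patched __sync_icache_dcache(): returns true if it
 * would go on to inspect the struct page, false if it bails out early.
 */
static bool toy_sync_would_touch_page(unsigned long pfn)
{
    if (!toy_pfn_valid(pfn))
        return false;   /* device memory: no struct page to inspect */
    return true;
}
```

With this layout, a PFN such as 0x3f000 (standing in for a device BAR) is rejected before any page lookup, while a PFN inside either bank proceeds as before.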
Thanks
-Bharat
>
> Will
>
* [PATCH v2] ARM64: Kernel managed pages are only flushed
2014-03-05 16:27 ` Bharat.Bhushan at freescale.com
@ 2014-03-05 20:03 ` Laura Abbott
2014-03-06 3:38 ` Bharat.Bhushan at freescale.com
2014-03-06 16:18 ` Will Deacon
0 siblings, 2 replies; 6+ messages in thread
From: Laura Abbott @ 2014-03-05 20:03 UTC (permalink / raw)
To: linux-arm-kernel
On 3/5/2014 8:27 AM, Bharat.Bhushan at freescale.com wrote:
>
>
>> -----Original Message-----
>> From: Will Deacon [mailto:will.deacon at arm.com]
>> Sent: Wednesday, March 05, 2014 9:43 PM
>> To: Bhushan Bharat-R65777
>> Cc: Catalin Marinas; linux-arm-kernel at lists.infradead.org; Bhushan Bharat-R65777
>> Subject: Re: [PATCH v2] ARM64: Kernel managed pages are only flushed
>>
>> On Wed, Mar 05, 2014 at 11:25:16AM +0000, Bharat Bhushan wrote:
>>> Kernel can only access pages which maps to managed memory.
>>> So flush only valid kernel pages.
>>>
>>> I observed kernel crash direct assigning a device using VFIO and found
>>> that it was caused because of accessing invalid page
>>>
>>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
>>> ---
>>> v1->v2
>>> Getting pfn usin pte_pfn() in pfn_valid.
>>>
>>> arch/arm64/mm/flush.c | 13 ++++++++++++-
>>> 1 files changed, 12 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
>>> index e4193e3..319826a 100644
>>> --- a/arch/arm64/mm/flush.c
>>> +++ b/arch/arm64/mm/flush.c
>>> @@ -72,7 +72,18 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
>>>
>>> void __sync_icache_dcache(pte_t pte, unsigned long addr)
>>> {
>>> - struct page *page = pte_page(pte);
>>> + struct page *page;
>>> +
>>> +#ifdef CONFIG_HAVE_ARCH_PFN_VALID
>>> + /*
>>> + * We can only access pages that the kernel maps
>>> + * as memory. Bail out for unmapped ones.
>>> + */
>>> + if (!pfn_valid(pte_pfn(pte)))
>>> + return;
>>> +
>>> +#endif
>>> + page = pte_page(pte);
>>
>> How do you get into this function without a valid, userspace, executable pte?
>>
>> I suspect you've got changes elsewhere and are calling this function in a
>> context where it's not supposed to be called.
>
> Below I will describe the context in which this function is called:
>
> When we direct assign a bus device (we have a different freescale specific bus
> device but we can take PCI device for discussion as this logic applies equally
> for PCI device I think) to user space using VFIO. Then userspace needs to
> mmap(PCI_BARx_offset: this PCI bar offset in not a kernel visible memory).
> Then VFIO-kernel mmap() ioctl code calls remap_pfn_range() for mapping the
> requested address. While remap_pfn_range() internally calls this function.
>
As someone who likes calling functions in contexts where they aren't
supposed to be called, I took a look at this because I was curious.
I can confirm the same problem trying to mmap arbitrary io address space
with remap_pfn_range. We should only be hitting this if the pte is
marked as exec per set_pte_at. With my test case, even mmaping with only
PROT_READ and PROT_WRITE was setting PROT_EXEC as well which was
triggering the bug. This seems to be because READ_IMPLIES_EXEC
personality was set which was derived from
#define elf_read_implies_exec(ex,stk) (stk != EXSTACK_DISABLE_X)
and none of the binaries I'm generating seem to be setting the stack
execute bit either way (all are EXSTACK_DEFAULT).
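The chain described above can be sketched in userspace C. This is a hedged model rather than the kernel's code: the TOY_EXSTACK_* values are assumed to mirror the EXSTACK_* constants in include/linux/binfmts.h, and the toy_* helpers are hypothetical names that only imitate how the personality flag widens the requested protection.

```c
#include <stdbool.h>
#include <sys/mman.h>   /* PROT_READ, PROT_WRITE, PROT_EXEC */

/* Assumed to mirror EXSTACK_* in include/linux/binfmts.h. */
enum toy_exstack {
    TOY_EXSTACK_DEFAULT   = 0,  /* binary says nothing about the stack */
    TOY_EXSTACK_DISABLE_X = 1,  /* binary asks for a non-executable stack */
    TOY_EXSTACK_ENABLE_X  = 2,  /* binary asks for an executable stack */
};

/* Same predicate as the elf_read_implies_exec() macro quoted above. */
static bool toy_read_implies_exec(enum toy_exstack stk)
{
    return stk != TOY_EXSTACK_DISABLE_X;
}

/*
 * Model of the mmap() path: when READ_IMPLIES_EXEC is in effect, a request
 * that includes PROT_READ silently gains PROT_EXEC.
 */
static int toy_effective_prot(int prot, bool read_implies_exec)
{
    if ((prot & PROT_READ) && read_implies_exec)
        prot |= PROT_EXEC;
    return prot;
}
```

Under this model a binary left at the default stack setting gets READ_IMPLIES_EXEC, so a PROT_READ | PROT_WRITE request ends up executable, which is the condition under which set_pte_at() reaches the flush.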
It's not obvious what the best solution is here.
Thanks,
Laura
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
* [PATCH v2] ARM64: Kernel managed pages are only flushed
2014-03-05 20:03 ` Laura Abbott
@ 2014-03-06 3:38 ` Bharat.Bhushan at freescale.com
2014-03-06 16:18 ` Will Deacon
1 sibling, 0 replies; 6+ messages in thread
From: Bharat.Bhushan at freescale.com @ 2014-03-06 3:38 UTC (permalink / raw)
To: linux-arm-kernel
> -----Original Message-----
> From: Laura Abbott [mailto:lauraa at codeaurora.org]
> Sent: Thursday, March 06, 2014 1:34 AM
> To: Bhushan Bharat-R65777; Will Deacon
> Cc: Wood Scott-B07421; Catalin Marinas; Yoder Stuart-B08248; linux-arm-kernel at lists.infradead.org
> Subject: Re: [PATCH v2] ARM64: Kernel managed pages are only flushed
>
> On 3/5/2014 8:27 AM, Bharat.Bhushan at freescale.com wrote:
> >
> >
> >> -----Original Message-----
> >> From: Will Deacon [mailto:will.deacon at arm.com]
> >> Sent: Wednesday, March 05, 2014 9:43 PM
> >> To: Bhushan Bharat-R65777
> >> Cc: Catalin Marinas; linux-arm-kernel at lists.infradead.org; Bhushan
> >> Bharat-R65777
> >> Subject: Re: [PATCH v2] ARM64: Kernel managed pages are only flushed
> >>
> >> On Wed, Mar 05, 2014 at 11:25:16AM +0000, Bharat Bhushan wrote:
> >>> Kernel can only access pages which maps to managed memory.
> >>> So flush only valid kernel pages.
> >>>
> >>> I observed kernel crash direct assigning a device using VFIO and
> >>> found that it was caused because of accessing invalid page
> >>>
> >>> Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
> >>> ---
> >>> v1->v2
> >>> Getting pfn usin pte_pfn() in pfn_valid.
> >>>
> >>> arch/arm64/mm/flush.c | 13 ++++++++++++-
> >>> 1 files changed, 12 insertions(+), 1 deletions(-)
> >>>
> >>> diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
> >>> index e4193e3..319826a 100644
> >>> --- a/arch/arm64/mm/flush.c
> >>> +++ b/arch/arm64/mm/flush.c
> >>> @@ -72,7 +72,18 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
> >>>
> >>> void __sync_icache_dcache(pte_t pte, unsigned long addr)
> >>> {
> >>> - struct page *page = pte_page(pte);
> >>> + struct page *page;
> >>> +
> >>> +#ifdef CONFIG_HAVE_ARCH_PFN_VALID
> >>> + /*
> >>> + * We can only access pages that the kernel maps
> >>> + * as memory. Bail out for unmapped ones.
> >>> + */
> >>> + if (!pfn_valid(pte_pfn(pte)))
> >>> + return;
> >>> +
> >>> +#endif
> >>> + page = pte_page(pte);
> >>
> >> How do you get into this function without a valid, userspace, executable pte?
> >>
> >> I suspect you've got changes elsewhere and are calling this function
> >> in a context where it's not supposed to be called.
> >
> > Below I will describe the context in which this function is called:
> >
> > When we direct assign a bus device (we have a different freescale
> > specific bus device but we can take PCI device for discussion as this
> > logic applies equally for PCI device I think) to user space using VFIO.
> > Then userspace needs to mmap(PCI_BARx_offset: this PCI bar offset in not
> > a kernel visible memory).
> > Then VFIO-kernel mmap() ioctl code calls remap_pfn_range() for mapping
> > the requested address. While remap_pfn_range() internally calls this
> > function.
> >
>
> As someone who likes calling functions in context where they aren't supposed to
> be called, I took a look a this because I was curious.
Are we saying that remap_pfn_range() should not be called in such a case (the case described earlier, of directly assigning a PCI device to user space using VFIO)? But x86 and powerpc use only this function for that.
>
> I can confirm the same problem trying to mmap arbitrary io address space with
> remap_pfn_range. We should only be hitting this if the pte is marked as exec per
> set_pte_at. With my test case, even mmaping with only PROT_READ and PROT_WRITE
> was setting PROT_EXEC as well which was triggering the bug. This seems to be
> because READ_IMPLIES_EXEC personality was set which was derived from
>
> #define elf_read_implies_exec(ex,stk) (stk != EXSTACK_DISABLE_X)
>
> and none of the binaries I'm generating seem to be setting the stack execute bit
> either way (all are EXECSTACK_DEFAULT).
Yes, I agree that even if we set only PROT_READ and PROT_WRITE, it internally ends up setting PROT_EXEC, so we enter this flow. But I see this as a second issue. I am not sure, but theoretically it can still happen that we set PROT_EXEC for an anonymous page.
So either __sync_icache_dcache() should check that it does not access an anonymous struct page (which is what this patch does), or __sync_icache_dcache() should not be called for an anonymous page. Maybe something like this:
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f0bebc5..9493f3e 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -167,7 +167,7 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte)
{
if (pte_valid_user(pte)) {
- if (pte_exec(pte))
+ if (pte_exec(pte) && pfn_valid(pte_pfn(pte)))
__sync_icache_dcache(pte, addr);
if (!pte_dirty(pte))
pte = pte_wrprotect(pte);
Please suggest if there is some other solution.
Thanks
-Bharat
>
> It's not obvious what the best solution is here.
>
> Thanks,
> Laura
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The
> Linux Foundation
>
* [PATCH v2] ARM64: Kernel managed pages are only flushed
2014-03-05 20:03 ` Laura Abbott
2014-03-06 3:38 ` Bharat.Bhushan at freescale.com
@ 2014-03-06 16:18 ` Will Deacon
1 sibling, 0 replies; 6+ messages in thread
From: Will Deacon @ 2014-03-06 16:18 UTC (permalink / raw)
To: linux-arm-kernel
Hi Laura,
On Wed, Mar 05, 2014 at 08:03:58PM +0000, Laura Abbott wrote:
> On 3/5/2014 8:27 AM, Bharat.Bhushan at freescale.com wrote:
> >> On Wed, Mar 05, 2014 at 11:25:16AM +0000, Bharat Bhushan wrote:
> >>> Kernel can only access pages which maps to managed memory.
> >>> So flush only valid kernel pages.
> >>>
> >> How do you get into this function without a valid, userspace, executable pte?
> >>
> >> I suspect you've got changes elsewhere and are calling this function in a
> >> context where it's not supposed to be called.
> >
> > Below I will describe the context in which this function is called:
> >
> > When we direct assign a bus device (we have a different freescale specific bus
> > device but we can take PCI device for discussion as this logic applies equally
> > for PCI device I think) to user space using VFIO. Then userspace needs to
> > mmap(PCI_BARx_offset: this PCI bar offset in not a kernel visible memory).
> > Then VFIO-kernel mmap() ioctl code calls remap_pfn_range() for mapping the
> > requested address. While remap_pfn_range() internally calls this function.
> >
>
> As someone who likes calling functions in context where they aren't
> supposed to be called, I took a look a this because I was curious.
Somebody should hide your keyboard. Stephen?
> I can confirm the same problem trying to mmap arbitrary io address space
> with remap_pfn_range. We should only be hitting this if the pte is
> marked as exec per set_pte_at. With my test case, even mmaping with only
> PROT_READ and PROT_WRITE was setting PROT_EXEC as well which was
> triggering the bug. This seems to be because READ_IMPLIES_EXEC
> personality was set which was derived from
>
> #define elf_read_implies_exec(ex,stk) (stk != EXSTACK_DISABLE_X)
>
> and none of the binaries I'm generating seem to be setting the stack
> execute bit either way (all are EXECSTACK_DEFAULT).
>
> It's not obvious what the best solution is here.
It would be nice if something like phys_mem_access_prot was used by the
callers, since this is used by the /dev/mem driver to make sure that the
pgprot is sane for the underlying pfn. In the absence of that, I guess we
could add the pfn_valid check (we have it already on arch/arm/) but if that
means we end up with executable devices, we're still entering a world of
looks-like-my-instruction-fetcher-just-acked-an-irq style pain.
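The concern about executable devices can be illustrated with a small userspace sketch in C. To be clear about assumptions: this is not what phys_mem_access_prot does in the kernel; the TOY_PROT_* bits, the RAM range, and toy_sane_prot() are all hypothetical, and the policy shown (drop the exec bit and force a non-cached mapping for non-RAM page frames) is just one possible reading of "sane for the underlying pfn".

```c
#include <stdbool.h>

/* Hypothetical protection bits for the toy model (not kernel values). */
#define TOY_PROT_READ      0x1u
#define TOY_PROT_WRITE     0x2u
#define TOY_PROT_EXEC      0x4u
#define TOY_PROT_NONCACHED 0x8u

/* Made-up single RAM range, in page frame numbers (end exclusive). */
static bool toy_pfn_is_ram(unsigned long pfn)
{
    return pfn >= 0x80000UL && pfn < 0xc0000UL;
}

/*
 * Hypothetical policy: RAM keeps the requested protection; anything else
 * is treated as a device, so it loses exec and is mapped non-cached.
 */
static unsigned int toy_sane_prot(unsigned long pfn, unsigned int prot)
{
    if (toy_pfn_is_ram(pfn))
        return prot;
    return (prot & ~TOY_PROT_EXEC) | TOY_PROT_NONCACHED;
}
```

If callers applied a policy like this before remap_pfn_range(), a device mapping would never carry the exec bit, so the flush path would not be entered for it at all.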
Will