linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] arm: mm: dump: don't skip final region
@ 2014-12-05 12:20 Mark Rutland
  2014-12-05 16:23 ` Steve Capper
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Rutland @ 2014-12-05 12:20 UTC (permalink / raw)
  To: linux-arm-kernel

If the final page table entry we walk is a valid mapping, the page table
dumping code will not log the region this entry is part of, as the final
note_page call in walk_pgd will trigger an early return. Luckily this
isn't seen on contemporary systems as they typically don't have enough
RAM to extend the linear mapping right to the end of the address space.

In note_page, we log a region when we reach its end (i.e. we hit an
entry immediately afterwards which has different prot bits or is
invalid). The final entry has no subsequent entry, so we will not log
this immediately. We try to cater for this with a subsequent call to
note_page in ptdump_show, but this returns early as 0 is less than
USER_PGTABLES_CEILING, and hence we will skip a valid mapping if it
spans to the final entry we note.

Luckily, the final note_page call has level == 0, while all prior calls
have a non-zero level, so if we also check level we will both skip user
entries and handle the final note_page call correctly. Due to the way
addr is constructed in the walk_* functions, it can never be less than
LOWEST_ADDR when walking the page tables, so it is not necessary to
avoid dereferencing invalid table addresses. The existing checks for
st->current_prot and st->marker[1].start_address are sufficient to
ensure we will not print and/or dereference garbage when trying to log
information.

This patch adds the aformentioned check for level, ensuring we log all
regions in the kernel mappings, including those which span right to the
end of the address space.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm/mm/dump.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

I found this via inspection, rather than being hit by the problem described
above, but it seemed worthwhile to fix this up (otherwise the final note_page
call is pointless).

diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
index 5942493..f459c0d 100644
--- a/arch/arm/mm/dump.c
+++ b/arch/arm/mm/dump.c
@@ -220,7 +220,7 @@ static void note_page(struct pg_state *st, unsigned long addr, unsigned level, u
 	static const char units[] = "KMGTPE";
 	u64 prot = val & pg_level[level].mask;
 
-	if (addr < USER_PGTABLES_CEILING)
+	if (level && addr < USER_PGTABLES_CEILING)
 		return;
 
 	if (!st->level) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] arm: mm: dump: don't skip final region
  2014-12-05 12:20 [PATCH] arm: mm: dump: don't skip final region Mark Rutland
@ 2014-12-05 16:23 ` Steve Capper
  2014-12-08 12:05   ` Mark Rutland
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Capper @ 2014-12-05 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

On 5 December 2014 at 12:20, Mark Rutland <mark.rutland@arm.com> wrote:
> If the final page table entry we walk is a valid mapping, the page table
> dumping code will not log the region this entry is part of, as the final
> note_page call in walk_pgd will trigger an early return. Luckily this
> isn't seen on contemporary systems as they typically don't have enough
> RAM to extend the linear mapping right to the end of the address space.
>
> In note_page, we log a region when we reach its end (i.e. we hit an
> entry immediately afterwards which has different prot bits or is
> invalid). The final entry has no subsequent entry, so we will not log
> this immediately. We try to cater for this with a subsequent call to
> note_page in ptdump_show, but this returns early as 0 is less than
> USER_PGTABLES_CEILING, and hence we will skip a valid mapping if it
> spans to the final entry we note.
>
> Luckily, the final note_page call has level == 0, while all prior calls
> have a non-zero level, so if we also check level we will both skip user
> entries and handle the final note_page call correctly. Due to the way
> addr is constructed in the walk_* functions, it can never be less than
> LOWEST_ADDR when walking the page tables, so it is not necessary to
> avoid dereferencing invalid table addresses. The existing checks for
> st->current_prot and st->marker[1].start_address are sufficient to
> ensure we will not print and/or dereference garbage when trying to log
> information.

Do you mean USER_PGTABLES_CEILING instead of LOWEST_ADDR?

>
> This patch adds the aformentioned check for level, ensuring we log all
> regions in the kernel mappings, including those which span right to the
> end of the address space.
>

One comment above, then:
Acked-by: Steve Capper <steve.capper@linaro.org>

> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Steve Capper <steve.capper@linaro.org>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm/mm/dump.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> I found this via inspection, rather than being hit by the problem described
> above, but it seemed worthwhile to fix this up (otherwise the final note_page
> call is pointless).
>
> diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
> index 5942493..f459c0d 100644
> --- a/arch/arm/mm/dump.c
> +++ b/arch/arm/mm/dump.c
> @@ -220,7 +220,7 @@ static void note_page(struct pg_state *st, unsigned long addr, unsigned level, u
>         static const char units[] = "KMGTPE";
>         u64 prot = val & pg_level[level].mask;
>
> -       if (addr < USER_PGTABLES_CEILING)
> +       if (level && addr < USER_PGTABLES_CEILING)
>                 return;
>
>         if (!st->level) {
> --
> 1.9.1
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] arm: mm: dump: don't skip final region
  2014-12-05 16:23 ` Steve Capper
@ 2014-12-08 12:05   ` Mark Rutland
  2014-12-08 17:46     ` Steve Capper
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Rutland @ 2014-12-08 12:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 05, 2014 at 04:23:47PM +0000, Steve Capper wrote:
> On 5 December 2014 at 12:20, Mark Rutland <mark.rutland@arm.com> wrote:
> > If the final page table entry we walk is a valid mapping, the page table
> > dumping code will not log the region this entry is part of, as the final
> > note_page call in walk_pgd will trigger an early return. Luckily this
> > isn't seen on contemporary systems as they typically don't have enough
> > RAM to extend the linear mapping right to the end of the address space.
> >
> > In note_page, we log a region when we reach its end (i.e. we hit an
> > entry immediately afterwards which has different prot bits or is
> > invalid). The final entry has no subsequent entry, so we will not log
> > this immediately. We try to cater for this with a subsequent call to
> > note_page in ptdump_show, but this returns early as 0 is less than
> > USER_PGTABLES_CEILING, and hence we will skip a valid mapping if it
> > spans to the final entry we note.
> >
> > Luckily, the final note_page call has level == 0, while all prior calls
> > have a non-zero level, so if we also check level we will both skip user
> > entries and handle the final note_page call correctly. Due to the way
> > addr is constructed in the walk_* functions, it can never be less than
> > LOWEST_ADDR when walking the page tables, so it is not necessary to
> > avoid dereferencing invalid table addresses. The existing checks for
> > st->current_prot and st->marker[1].start_address are sufficient to
> > ensure we will not print and/or dereference garbage when trying to log
> > information.
> 
> Do you mean USER_PGTABLES_CEILING instead of LOWEST_ADDR?

Yes; sorry about that.

Looking at this again my commit message is bogus: we can construct an
address below USER_PGTABLES_CEILING in walk_pmd in the case of LPAE, as
USER_PGTABLES_CEILING isn't pgd_t aligned in that case.

As the swapper_pg_dir only contains kernel entries, the handling of
USER_PGTABLES_CEILING is just an optimisation to skip some entries that
we expect to be empty in the kernel page tables. We can handle them in
note_page as we do for other *_none() entries and don't need to skip
them early.

So I can entirely remove the check against USER_PGTABLES_CEILING in
note_page. Would that make sense to you?

I also think we should remove pgdoff from walk_pgd. It's only non-zero
for LPAE kernels, so currently if we corrupt the tables somehow and have
entries below pgdoff we won't note them, making that case harder to
spot.

Thanks,
Mark.

> >
> > This patch adds the aformentioned check for level, ensuring we log all
> > regions in the kernel mappings, including those which span right to the
> > end of the address space.
> >
> 
> One comment above, then:
> Acked-by: Steve Capper <steve.capper@linaro.org>
> 
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Russell King <linux@arm.linux.org.uk>
> > Cc: Steve Capper <steve.capper@linaro.org>
> > Cc: Will Deacon <will.deacon@arm.com>
> > ---
> >  arch/arm/mm/dump.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > I found this via inspection, rather than being hit by the problem described
> > above, but it seemed worthwhile to fix this up (otherwise the final note_page
> > call is pointless).
> >
> > diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
> > index 5942493..f459c0d 100644
> > --- a/arch/arm/mm/dump.c
> > +++ b/arch/arm/mm/dump.c
> > @@ -220,7 +220,7 @@ static void note_page(struct pg_state *st, unsigned long addr, unsigned level, u
> >         static const char units[] = "KMGTPE";
> >         u64 prot = val & pg_level[level].mask;
> >
> > -       if (addr < USER_PGTABLES_CEILING)
> > +       if (level && addr < USER_PGTABLES_CEILING)
> >                 return;
> >
> >         if (!st->level) {
> > --
> > 1.9.1
> >
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] arm: mm: dump: don't skip final region
  2014-12-08 12:05   ` Mark Rutland
@ 2014-12-08 17:46     ` Steve Capper
  2014-12-08 17:55       ` Russell King - ARM Linux
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Capper @ 2014-12-08 17:46 UTC (permalink / raw)
  To: linux-arm-kernel

On 8 December 2014 at 12:05, Mark Rutland <mark.rutland@arm.com> wrote:
> On Fri, Dec 05, 2014 at 04:23:47PM +0000, Steve Capper wrote:
>> On 5 December 2014 at 12:20, Mark Rutland <mark.rutland@arm.com> wrote:
>> > If the final page table entry we walk is a valid mapping, the page table
>> > dumping code will not log the region this entry is part of, as the final
>> > note_page call in walk_pgd will trigger an early return. Luckily this
>> > isn't seen on contemporary systems as they typically don't have enough
>> > RAM to extend the linear mapping right to the end of the address space.
>> >
>> > In note_page, we log a region when we reach its end (i.e. we hit an
>> > entry immediately afterwards which has different prot bits or is
>> > invalid). The final entry has no subsequent entry, so we will not log
>> > this immediately. We try to cater for this with a subsequent call to
>> > note_page in ptdump_show, but this returns early as 0 is less than
>> > USER_PGTABLES_CEILING, and hence we will skip a valid mapping if it
>> > spans to the final entry we note.
>> >
>> > Luckily, the final note_page call has level == 0, while all prior calls
>> > have a non-zero level, so if we also check level we will both skip user
>> > entries and handle the final note_page call correctly. Due to the way
>> > addr is constructed in the walk_* functions, it can never be less than
>> > LOWEST_ADDR when walking the page tables, so it is not necessary to
>> > avoid dereferencing invalid table addresses. The existing checks for
>> > st->current_prot and st->marker[1].start_address are sufficient to
>> > ensure we will not print and/or dereference garbage when trying to log
>> > information.
>>
>> Do you mean USER_PGTABLES_CEILING instead of LOWEST_ADDR?
>
> Yes; sorry about that.
>
> Looking at this again my commit message is bogus: we can construct an
> address below USER_PGTABLES_CEILING in walk_pmd in the case of LPAE, as
> USER_PGTABLES_CEILING isn't pgd_t aligned in that case.
>
> As the swapper_pg_dir only contains kernel entries, the handling of
> USER_PGTABLES_CEILING is just an optimisation to skip some entries that
> we expect to be empty in the kernel page tables. We can handle them in
> note_page as we do for other *_none() entries and don't need to skip
> them early.
>
> So I can entirely remove the check against USER_PGTABLES_CEILING in
> note_page. Would that make sense to you?

After some head scratching....

Basically, the swapper_pg_dir will only contain kernel pgd entries. So
at the beginning where one would normally user-space entries in a pgd
table, there should be null entries that are skipped over. So yes, I
agree, we should be able to remove the USER_PGTABLES_CEILING check.

>
> I also think we should remove pgdoff from walk_pgd. It's only non-zero
> for LPAE kernels, so currently if we corrupt the tables somehow and have
> entries below pgdoff we won't note them, making that case harder to
> spot.

I agree. Let's see what happens :-).

>
> Thanks,
> Mark.
>

Cheers,
-- 
Steve

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] arm: mm: dump: don't skip final region
  2014-12-08 17:46     ` Steve Capper
@ 2014-12-08 17:55       ` Russell King - ARM Linux
  2014-12-11 18:22         ` Mark Rutland
  0 siblings, 1 reply; 6+ messages in thread
From: Russell King - ARM Linux @ 2014-12-08 17:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 08, 2014 at 05:46:45PM +0000, Steve Capper wrote:
> After some head scratching....
> 
> Basically, the swapper_pg_dir will only contain kernel pgd entries. So
> at the beginning where one would normally user-space entries in a pgd
> table, there should be null entries that are skipped over. So yes, I
> agree, we should be able to remove the USER_PGTABLES_CEILING check.

The check is there to avoid noting anything about the lower entries.
There is the possibility for lower entries to contain something (the
low vector mapping), though you're probably right that we should
report those too.

-- 
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] arm: mm: dump: don't skip final region
  2014-12-08 17:55       ` Russell King - ARM Linux
@ 2014-12-11 18:22         ` Mark Rutland
  0 siblings, 0 replies; 6+ messages in thread
From: Mark Rutland @ 2014-12-11 18:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 08, 2014 at 05:55:03PM +0000, Russell King - ARM Linux wrote:
> On Mon, Dec 08, 2014 at 05:46:45PM +0000, Steve Capper wrote:
> > After some head scratching....
> > 
> > Basically, the swapper_pg_dir will only contain kernel pgd entries. So
> > at the beginning where one would normally user-space entries in a pgd
> > table, there should be null entries that are skipped over. So yes, I
> > agree, we should be able to remove the USER_PGTABLES_CEILING check.
> 
> The check is there to avoid noting anything about the lower entries.
> There is the possibility for lower entries to contain something (the
> low vector mapping), though you're probably right that we should
> report those too.

Ok. I'll send out a v2 which removes both uses of USER_PGTABLES_CEILING
and repots all entries.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-12-11 18:22 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-05 12:20 [PATCH] arm: mm: dump: don't skip final region Mark Rutland
2014-12-05 16:23 ` Steve Capper
2014-12-08 12:05   ` Mark Rutland
2014-12-08 17:46     ` Steve Capper
2014-12-08 17:55       ` Russell King - ARM Linux
2014-12-11 18:22         ` Mark Rutland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).