[PATCH v4.19 0/2] Custom backports for powerpc SLB issues

linuxppc-dev.lists.ozlabs.org archive mirror

* [PATCH v4.19 0/2] Custom backports for powerpc SLB issues
@ 2022-04-28 12:41 Michael Ellerman
  2022-04-28 12:41 ` [PATCH v4.19 1/2] powerpc/64/interrupt: Temporarily save PPR on stack to fix register corruption due to SLB miss Michael Ellerman
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Michael Ellerman @ 2022-04-28 12:41 UTC
  To: stable, gregkh; +Cc: linuxppc-dev, npiggin

Hi Greg,

Here are two custom backports to v4.19 for some powerpc issues we've discovered.
Both were fixed upstream as part of a large non-backportable rewrite. Other stable
kernel versions are not affected.

cheers

Michael Ellerman (1):
  powerpc/64s: Unmerge EX_LR and EX_DAR

Nicholas Piggin (1):
  powerpc/64/interrupt: Temporarily save PPR on stack to fix register
    corruption due to SLB miss

 arch/powerpc/include/asm/exception-64s.h | 37 ++++++++++++++----------
 1 file changed, 22 insertions(+), 15 deletions(-)

-- 
2.35.1



* [PATCH v4.19 1/2] powerpc/64/interrupt: Temporarily save PPR on stack to fix register corruption due to SLB miss
  2022-04-28 12:41 [PATCH v4.19 0/2] Custom backports for powerpc SLB issues Michael Ellerman
@ 2022-04-28 12:41 ` Michael Ellerman
  2022-04-29  8:57   ` Patch "powerpc/64/interrupt: Temporarily save PPR on stack to fix register corruption due to SLB miss" has been added to the 4.19-stable tree gregkh
  2022-04-28 12:41 ` [PATCH v4.19 2/2] powerpc/64s: Unmerge EX_LR and EX_DAR Michael Ellerman
  2022-04-29  8:56 ` [PATCH v4.19 0/2] Custom backports for powerpc SLB issues Greg KH
  2 siblings, 1 reply; 6+ messages in thread
From: Michael Ellerman @ 2022-04-28 12:41 UTC
  To: stable, gregkh; +Cc: linuxppc-dev, npiggin

From: Nicholas Piggin <npiggin@gmail.com>

This is a minimal stable kernel fix for the problem solved by
4c2de74cc869 ("powerpc/64: Interrupts save PPR on stack rather than
thread_struct").

Upstream kernels between 4.17 and 4.20 have this bug, so I propose this
patch for 4.19 stable.

Longer description from mpe:

In commit f384796c40dc ("powerpc/mm: Add support for handling > 512TB
address in SLB miss") we added support for using multiple context ids
per process. Previously accessing past the first context id was a fatal
error for the process. With the new support it became non-fatal, and so
the previous "bad_addr_slb" handler was changed to be the
"large_addr_slb" handler.

That handler uses the EXCEPTION_PROLOG_COMMON() macro, which in turn
calls the SAVE_PPR() macro. At the point where SAVE_PPR() is used, the
r9-r13 register values from the original user fault are saved in
paca->exslb. It's not until later in EXCEPTION_PROLOG_COMMON_2() that
they are saved from paca->exslb onto the kernel stack.

The PPR is saved into current->thread.ppr, which is notably not on the
kernel stack the way pt_regs are. This means we can take an SLB miss on
current->thread.ppr. If that happens in the "large_addr_slb" case we
will clobber the saved user r9-r13 in paca->exslb with kernel values.
Later we will save those clobbered values into the pt_regs on the stack,
and when we return to userspace those kernel values will be restored.

Typically this appears as some sort of segfault in userspace, with an
address that looks like a kernel address. In dmesg it can appear as:

  [19117.440331] some_program[1869625]: unhandled signal 11 at c00000000f6bda10 nip 00007fff780d559c lr 00007fff781ae56c code 30001

The upstream fix for this issue was to move PPR into pt_regs, on the
kernel stack, avoiding the possibility of an SLB fault when saving it.

However changing the size of pt_regs is an intrusive change, and has
side effects in other parts of the kernel. A minimal fix is to
temporarily save the PPR in an unused part of pt_regs, then save the
user register values from paca->exslb into pt_regs, and then move the
saved PPR into thread.ppr.
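
The ordering problem and the minimal fix described above can be modelled
in plain C. This is only a sketch under simplifying assumptions, not
kernel code: every toy_* name is hypothetical, a single slot stands in
for the r9-r13 area in paca->exslb, and a boolean decides whether the
store to thread.ppr takes a nested SLB miss.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy types; none of these exist in the kernel. */
struct toy_paca   { uint64_t exslb_r9; };         /* stands in for paca->exslb */
struct toy_regs   { uint64_t r9, result; };       /* stands in for pt_regs on the stack */
struct toy_thread { uint64_t ppr; };              /* stands in for current->thread */

#define TOY_KERNEL_R9 0xc00000000f6bda10ULL       /* some kernel-looking value */

/* A nested SLB miss reuses the exslb area and overwrites what was saved. */
static void toy_nested_slb_miss(struct toy_paca *paca)
{
	paca->exslb_r9 = TOY_KERNEL_R9;
}

/* The thread struct is not pinned in the SLB, so this store may miss. */
static void toy_store_thread_ppr(struct toy_paca *paca, struct toy_thread *t,
				 uint64_t ppr, bool miss)
{
	if (miss)
		toy_nested_slb_miss(paca);
	t->ppr = ppr;
}

/* Old order: write thread.ppr first, then copy exslb into pt_regs. */
uint64_t toy_prolog_old(uint64_t user_r9, uint64_t ppr, bool miss)
{
	struct toy_paca paca = { .exslb_r9 = user_r9 };
	struct toy_regs regs = { 0 };
	struct toy_thread thread = { 0 };

	toy_store_thread_ppr(&paca, &thread, ppr, miss); /* old SAVE_PPR */
	regs.r9 = paca.exslb_r9;                         /* PROLOG_COMMON_2 */
	return regs.r9;
}

/* New order: park PPR on the stack, copy exslb, then write thread.ppr. */
uint64_t toy_prolog_new(uint64_t user_r9, uint64_t ppr, bool miss)
{
	struct toy_paca paca = { .exslb_r9 = user_r9 };
	struct toy_regs regs = { 0 };
	struct toy_thread thread = { 0 };

	regs.result = ppr;                               /* new SAVE_PPR */
	regs.r9 = paca.exslb_r9;                         /* PROLOG_COMMON_2 */
	toy_store_thread_ppr(&paca, &thread, regs.result, miss); /* MOVE_PPR_TO_THREAD */
	return regs.r9;
}
```

With miss set to true, toy_prolog_old() returns the kernel-looking value
in place of the user's r9, while toy_prolog_new() returns the user value,
because the nested miss now happens only after exslb has been copied out.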

Fixes: f384796c40dc ("powerpc/mm: Add support for handling > 512TB address in SLB miss")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220316033235.903657-1-npiggin@gmail.com
---
 arch/powerpc/include/asm/exception-64s.h | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 35fb5b11955a..f0424c6fdeca 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -243,10 +243,22 @@
  * PPR save/restore macros used in exceptions_64s.S  
  * Used for P7 or later processors
  */
-#define SAVE_PPR(area, ra, rb)						\
+#define SAVE_PPR(area, ra)						\
+BEGIN_FTR_SECTION_NESTED(940)						\
+	ld	ra,area+EX_PPR(r13);	/* Read PPR from paca */	\
+	std	ra,RESULT(r1);		/* Store PPR in RESULT for now */ \
+END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,940)
+
+/*
+ * This is called after we are finished accessing 'area', so we can now take
+ * SLB faults accessing the thread struct, which will use PACA_EXSLB area.
+ * This is required because the large_addr_slb handler uses EXSLB and it also
+ * uses the common exception macros including this PPR saving.
+ */
+#define MOVE_PPR_TO_THREAD(ra, rb)					\
 BEGIN_FTR_SECTION_NESTED(940)						\
 	ld	ra,PACACURRENT(r13);					\
-	ld	rb,area+EX_PPR(r13);	/* Read PPR from paca */	\
+	ld	rb,RESULT(r1);		/* Read PPR from stack */	\
 	std	rb,TASKTHREADPPR(ra);					\
 END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,940)

@@ -515,9 +527,11 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 3:	EXCEPTION_PROLOG_COMMON_1();					   \
 	beq	4f;			/* if from kernel mode		*/ \
 	ACCOUNT_CPU_USER_ENTRY(r13, r9, r10);				   \
-	SAVE_PPR(area, r9, r10);					   \
+	SAVE_PPR(area, r9);						   \
 4:	EXCEPTION_PROLOG_COMMON_2(area)					   \
-	EXCEPTION_PROLOG_COMMON_3(n)					   \
+	beq	5f;			/* if from kernel mode		*/ \
+	MOVE_PPR_TO_THREAD(r9, r10);					   \
+5:	EXCEPTION_PROLOG_COMMON_3(n)					   \
 	ACCOUNT_STOLEN_TIME

 /* Save original regs values from save area to stack frame. */
-- 
2.35.1


* [PATCH v4.19 2/2] powerpc/64s: Unmerge EX_LR and EX_DAR
  2022-04-28 12:41 [PATCH v4.19 0/2] Custom backports for powerpc SLB issues Michael Ellerman
  2022-04-28 12:41 ` [PATCH v4.19 1/2] powerpc/64/interrupt: Temporarily save PPR on stack to fix register corruption due to SLB miss Michael Ellerman
@ 2022-04-28 12:41 ` Michael Ellerman
  2022-04-29  8:57   ` Patch "powerpc/64s: Unmerge EX_LR and EX_DAR" has been added to the 4.19-stable tree gregkh
  2022-04-29  8:56 ` [PATCH v4.19 0/2] Custom backports for powerpc SLB issues Greg KH
  2 siblings, 1 reply; 6+ messages in thread
From: Michael Ellerman @ 2022-04-28 12:41 UTC
  To: stable, gregkh; +Cc: linuxppc-dev, npiggin

The SLB miss handler is not fully re-entrant. It is able to work because
we ensure that the SLB entries for the kernel text and data segment, as
well as the kernel stack, are pinned in the SLB. Accesses to kernel data
outside of those areas have to be carefully managed and can only occur in
certain parts of the code. One way we deal with that is by storing some
values in temporary slots in the paca.

In v4.13 in commit dbeea1d6b4bd ("powerpc/64s/paca: EX_LR can be merged
with EX_DAR") we merged the storage for two temporary slots for register
storage during SLB miss handling. That was safe at the time because the
two slots were never used at the same time.

Unfortunately in v4.17 in commit c2b4d8b7417a ("powerpc/mm/hash64:
Increase the VA range") we broke that condition, and introduced a case
where the two slots could be in use at the same time, leading to one
being corrupted.

Specifically, in slb_miss_common(), when we detect that we're handling a
fault for a large virtual address (> 512TB) we go to the "8" label,
where we store the original fault address into paca->exslb[EX_DAR],
before jumping to large_addr_slb() (using rfid).

We then use the EXCEPTION_PROLOG_COMMON and RECONCILE_IRQ_STATE macros
to do exception setup, before reloading the fault address from
paca->exslb[EX_DAR] and storing it into pt_regs->dar (Data Address
Register).

However the code generated by those macros can cause a recursive SLB
miss on a kernel address in three places.

The first is the saving of the PPR (Program Priority Register), which
happens on all CPUs since Power7. The PPR is saved to the thread struct,
which can be anywhere in memory. There is also the call to
accumulate_stolen_time() if CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y and
CONFIG_PPC_SPLPAR=y, and also the call to trace_hardirqs_off() if
CONFIG_TRACE_IRQFLAGS=y. The latter two call into generic C code and can
lead to accesses anywhere in memory.

On modern 64-bit CPUs we have 1TB segments, so for any of those accesses
to cause an SLB fault they must access memory more than 1TB away from
the kernel text, data and kernel stack. That typically only happens on
machines with more than 1TB of RAM. However it is possible on multi-node
Power9 systems, because memory on the 2nd node begins at 32TB in the
linear mapping.
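
As a rough illustration of the segment arithmetic above, the following C
sketch checks whether two effective addresses fall in the same 1TB
segment. The toy_* names are hypothetical, and the linear mapping base of
0xc000000000000000 is an assumption of this sketch rather than something
stated in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_SEGMENT_SHIFT 40                       /* 1TB = 2^40 bytes */
#define TOY_TB            (1ULL << TOY_SEGMENT_SHIFT)
#define TOY_LINEAR_BASE   0xc000000000000000ULL    /* assumed linear map base */

/* Index of the 1TB segment that contains an effective address. */
static inline uint64_t toy_segment_of(uint64_t ea)
{
	return ea >> TOY_SEGMENT_SHIFT;
}

/* True if one pinned SLB entry for 'a' would also translate 'b'. */
bool toy_same_segment(uint64_t a, uint64_t b)
{
	return toy_segment_of(a) == toy_segment_of(b);
}
```

Under these assumptions an access at TOY_LINEAR_BASE + 32 * TOY_TB (the
start of the second node's memory) is not in the same segment as an
address near TOY_LINEAR_BASE, so it is not covered by an entry pinned for
that first segment and can take an SLB miss.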

If we take a recursive SLB fault then we will corrupt the original fault
address with the LR (Link Register) value, because the EX_DAR and EX_LR
slots share storage. Subsequently we will think we're trying to fault
that LR address, which is the wrong address, and will also most likely
lead to a segfault because the LR address will be < 512TB and so will be
rejected by slb_miss_large_addr().
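
The corruption can be reproduced in a small C model of the save area.
This is a sketch only: the toy_* names and the slot indices are
hypothetical, and the "large address" test simply compares against
512TB (2^49) as described above.

```c
#include <stdbool.h>
#include <stdint.h>

enum { TOY_SLOTS = 11 };                 /* hypothetical save area size */
#define TOY_512TB (1ULL << 49)

/*
 * Model one large-address fault that takes a recursive SLB miss.
 * dar_slot and lr_slot are indices into the toy save area; passing the
 * same index for both models the merged EX_LR/EX_DAR layout.
 */
uint64_t toy_large_addr_fault(unsigned int dar_slot, unsigned int lr_slot,
			      uint64_t fault_addr, uint64_t lr)
{
	uint64_t exslb[TOY_SLOTS] = { 0 };

	exslb[dar_slot] = fault_addr;    /* label "8": stash the fault address */
	exslb[lr_slot] = lr;             /* recursive SLB miss saves LR */
	return exslb[dar_slot];          /* large_addr_slb reloads the DAR */
}

/* Simplified stand-in for the range check on the reloaded address. */
bool toy_is_large_addr(uint64_t ea)
{
	return ea >= TOY_512TB;
}
```

With both slots at the same index, toy_large_addr_fault() returns the LR
value, which toy_is_large_addr() then rejects for an LR such as
0x128a61808. With distinct indices the original fault address survives.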

This appears as a spurious segfault to userspace, and if
show_unhandled_signals is enabled you will see a fault reported in dmesg
with the LR address, not the expected fault address, eg:

  prog[123]: segfault (11) at 128a61808 nip 128a618cc lr 128a61808 code 3 in prog[128a60000+10000]
  prog[123]: code: 4bffffa4 39200040 3ce00004 7d2903a6 3c000200 78e707c6 780083e4 7d3b4b78
  prog[123]: code: 7d455378 7d7d5b78 7d9f6378 7da46b78 <f8670000> 7d3a4b78 7d465378 7d7c5b78

Notice that the fault address == the LR, and the faulting instruction is
a simple store that should never use LR.

In upstream this was fixed in v4.20 in commit
48e7b7695745 ("powerpc/64s/hash: Convert SLB miss handlers to C"),
however that is a huge rewrite and not backportable.

The minimal fix for stable is to just unmerge the EX_LR and EX_DAR slots
again, avoiding the corruption of the DAR value. This uses an extra 8
bytes per CPU, which is negligible.
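
The EX_SIZE values in the diff of this patch follow from the slot
offsets: each slot is 8 bytes, so the size in u64 units is the offset of
the last slot plus 8, divided by 8. A small C sketch of that arithmetic,
using hypothetical toy_* names and the offsets that appear in the diff:

```c
#include <stddef.h>

#define TOY_SLOT_BYTES 8     /* each save slot is one u64 */
#define TOY_EX_PPR     64    /* last slot before this patch (!RELOCATABLE) */
#define TOY_EX_LR      72    /* new dedicated slot added by this patch */
#define TOY_EX_CTR_OLD 72    /* CONFIG_RELOCATABLE, before this patch */
#define TOY_EX_CTR_NEW 80    /* CONFIG_RELOCATABLE, after this patch */

/* Save area size in u64 units, given the byte offset of its last slot. */
size_t toy_ex_size_units(size_t last_slot_offset)
{
	return (last_slot_offset + TOY_SLOT_BYTES) / TOY_SLOT_BYTES;
}
```

This gives 9 and 10 units before the patch and 10 and 11 after it, which
matches the EX_SIZE changes, i.e. one extra 8-byte slot per save area.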

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/exception-64s.h | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index f0424c6fdeca..4fdae1c182df 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -48,11 +48,12 @@
 #define EX_CCR		52
 #define EX_CFAR		56
 #define EX_PPR		64
+#define EX_LR		72
 #if defined(CONFIG_RELOCATABLE)
-#define EX_CTR		72
-#define EX_SIZE		10	/* size in u64 units */
+#define EX_CTR		80
+#define EX_SIZE		11	/* size in u64 units */
 #else
-#define EX_SIZE		9	/* size in u64 units */
+#define EX_SIZE		10	/* size in u64 units */
 #endif

 /*
@@ -60,14 +61,6 @@
  */
 #define MAX_MCE_DEPTH	4

-/*
- * EX_LR is only used in EXSLB and where it does not overlap with EX_DAR
- * EX_CCR similarly with DSISR, but being 4 byte registers there is a hole
- * in the save area so it's not necessary to overlap them. Could be used
- * for future savings though if another 4 byte register was to be saved.
- */
-#define EX_LR		EX_DAR
-
 /*
  * EX_R3 is only used by the bad_stack handler. bad_stack reloads and
  * saves DAR from SPRN_DAR, and EX_DAR is not used. So EX_R3 can overlap
-- 
2.35.1


* Re: [PATCH v4.19 0/2] Custom backports for powerpc SLB issues
  2022-04-28 12:41 [PATCH v4.19 0/2] Custom backports for powerpc SLB issues Michael Ellerman
  2022-04-28 12:41 ` [PATCH v4.19 1/2] powerpc/64/interrupt: Temporarily save PPR on stack to fix register corruption due to SLB miss Michael Ellerman
  2022-04-28 12:41 ` [PATCH v4.19 2/2] powerpc/64s: Unmerge EX_LR and EX_DAR Michael Ellerman
@ 2022-04-29  8:56 ` Greg KH
  2 siblings, 0 replies; 6+ messages in thread
From: Greg KH @ 2022-04-29  8:56 UTC
  To: Michael Ellerman; +Cc: linuxppc-dev, npiggin, stable

On Thu, Apr 28, 2022 at 10:41:48PM +1000, Michael Ellerman wrote:
> Hi Greg,
> 
> Here are two custom backports to v4.19 for some powerpc issues we've discovered.
> Both were fixed upstream as part of a large non-backportable rewrite. Other stable
> kernel versions are not affected.
> 
> cheers
> 
> Michael Ellerman (1):
>   powerpc/64s: Unmerge EX_LR and EX_DAR
> 
> Nicholas Piggin (1):
>   powerpc/64/interrupt: Temporarily save PPR on stack to fix register
>     corruption due to SLB miss
> 
>  arch/powerpc/include/asm/exception-64s.h | 37 ++++++++++++++----------
>  1 file changed, 22 insertions(+), 15 deletions(-)
> 
> -- 
> 2.35.1
> 

Both now queued up, thanks.

greg k-h


* Patch "powerpc/64/interrupt: Temporarily save PPR on stack to fix register corruption due to SLB miss" has been added to the 4.19-stable tree
  2022-04-28 12:41 ` [PATCH v4.19 1/2] powerpc/64/interrupt: Temporarily save PPR on stack to fix register corruption due to SLB miss Michael Ellerman
@ 2022-04-29  8:57   ` gregkh
  0 siblings, 0 replies; 6+ messages in thread
From: gregkh @ 2022-04-29  8:57 UTC
  To: gregkh, linuxppc-dev, mpe, npiggin; +Cc: stable-commits

This is a note to let you know that I've just added the patch titled

    powerpc/64/interrupt: Temporarily save PPR on stack to fix register corruption due to SLB miss

to the 4.19-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     powerpc-64-interrupt-temporarily-save-ppr-on-stack-to-fix-register-corruption-due-to-slb-miss.patch
and it can be found in the queue-4.19 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.

From foo@baz Fri Apr 29 10:56:14 AM CEST 2022
From: Michael Ellerman <mpe@ellerman.id.au>
Date: Thu, 28 Apr 2022 22:41:49 +1000
Subject: powerpc/64/interrupt: Temporarily save PPR on stack to fix register corruption due to SLB miss
To: <stable@vger.kernel.org>, <gregkh@linuxfoundation.org>
Cc: <linuxppc-dev@lists.ozlabs.org>, <npiggin@gmail.com>
Message-ID: <20220428124150.375623-2-mpe@ellerman.id.au>

From: Nicholas Piggin <npiggin@gmail.com>

This is a minimal stable kernel fix for the problem solved by
4c2de74cc869 ("powerpc/64: Interrupts save PPR on stack rather than
thread_struct").

Upstream kernels between 4.17 and 4.20 have this bug, so I propose this
patch for 4.19 stable.

Longer description from mpe:

In commit f384796c40dc ("powerpc/mm: Add support for handling > 512TB
address in SLB miss") we added support for using multiple context ids
per process. Previously accessing past the first context id was a fatal
error for the process. With the new support it became non-fatal, and so
the previous "bad_addr_slb" handler was changed to be the
"large_addr_slb" handler.

That handler uses the EXCEPTION_PROLOG_COMMON() macro, which in turn
calls the SAVE_PPR() macro. At the point where SAVE_PPR() is used, the
r9-r13 register values from the original user fault are saved in
paca->exslb. It's not until later in EXCEPTION_PROLOG_COMMON_2() that
they are saved from paca->exslb onto the kernel stack.

The PPR is saved into current->thread.ppr, which is notably not on the
kernel stack the way pt_regs are. This means we can take an SLB miss on
current->thread.ppr. If that happens in the "large_addr_slb" case we
will clobber the saved user r9-r13 in paca->exslb with kernel values.
Later we will save those clobbered values into the pt_regs on the stack,
and when we return to userspace those kernel values will be restored.

Typically this appears as some sort of segfault in userspace, with an
address that looks like a kernel address. In dmesg it can appear as:

  [19117.440331] some_program[1869625]: unhandled signal 11 at c00000000f6bda10 nip 00007fff780d559c lr 00007fff781ae56c code 30001

The upstream fix for this issue was to move PPR into pt_regs, on the
kernel stack, avoiding the possibility of an SLB fault when saving it.

However changing the size of pt_regs is an intrusive change, and has
side effects in other parts of the kernel. A minimal fix is to
temporarily save the PPR in an unused part of pt_regs, then save the
user register values from paca->exslb into pt_regs, and then move the
saved PPR into thread.ppr.

Fixes: f384796c40dc ("powerpc/mm: Add support for handling > 512TB address in SLB miss")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220316033235.903657-1-npiggin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/include/asm/exception-64s.h |   22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -243,10 +243,22 @@
  * PPR save/restore macros used in exceptions_64s.S  
  * Used for P7 or later processors
  */
-#define SAVE_PPR(area, ra, rb)						\
+#define SAVE_PPR(area, ra)						\
+BEGIN_FTR_SECTION_NESTED(940)						\
+	ld	ra,area+EX_PPR(r13);	/* Read PPR from paca */	\
+	std	ra,RESULT(r1);		/* Store PPR in RESULT for now */ \
+END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,940)
+
+/*
+ * This is called after we are finished accessing 'area', so we can now take
+ * SLB faults accessing the thread struct, which will use PACA_EXSLB area.
+ * This is required because the large_addr_slb handler uses EXSLB and it also
+ * uses the common exception macros including this PPR saving.
+ */
+#define MOVE_PPR_TO_THREAD(ra, rb)					\
 BEGIN_FTR_SECTION_NESTED(940)						\
 	ld	ra,PACACURRENT(r13);					\
-	ld	rb,area+EX_PPR(r13);	/* Read PPR from paca */	\
+	ld	rb,RESULT(r1);		/* Read PPR from stack */	\
 	std	rb,TASKTHREADPPR(ra);					\
 END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,940)

@@ -515,9 +527,11 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 3:	EXCEPTION_PROLOG_COMMON_1();					   \
 	beq	4f;			/* if from kernel mode		*/ \
 	ACCOUNT_CPU_USER_ENTRY(r13, r9, r10);				   \
-	SAVE_PPR(area, r9, r10);					   \
+	SAVE_PPR(area, r9);						   \
 4:	EXCEPTION_PROLOG_COMMON_2(area)					   \
-	EXCEPTION_PROLOG_COMMON_3(n)					   \
+	beq	5f;			/* if from kernel mode		*/ \
+	MOVE_PPR_TO_THREAD(r9, r10);					   \
+5:	EXCEPTION_PROLOG_COMMON_3(n)					   \
 	ACCOUNT_STOLEN_TIME

 /* Save original regs values from save area to stack frame. */

Patches currently in stable-queue which might be from mpe@ellerman.id.au are

queue-4.19/powerpc-64s-unmerge-ex_lr-and-ex_dar.patch
queue-4.19/powerpc-64-interrupt-temporarily-save-ppr-on-stack-to-fix-register-corruption-due-to-slb-miss.patch


* Patch "powerpc/64s: Unmerge EX_LR and EX_DAR" has been added to the 4.19-stable tree
  2022-04-28 12:41 ` [PATCH v4.19 2/2] powerpc/64s: Unmerge EX_LR and EX_DAR Michael Ellerman
@ 2022-04-29  8:57   ` gregkh
  0 siblings, 0 replies; 6+ messages in thread
From: gregkh @ 2022-04-29  8:57 UTC
  To: gregkh, linuxppc-dev, mpe, npiggin; +Cc: stable-commits

This is a note to let you know that I've just added the patch titled

    powerpc/64s: Unmerge EX_LR and EX_DAR

to the 4.19-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     powerpc-64s-unmerge-ex_lr-and-ex_dar.patch
and it can be found in the queue-4.19 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.

From foo@baz Fri Apr 29 10:56:14 AM CEST 2022
From: Michael Ellerman <mpe@ellerman.id.au>
Date: Thu, 28 Apr 2022 22:41:50 +1000
Subject: powerpc/64s: Unmerge EX_LR and EX_DAR
To: <stable@vger.kernel.org>, <gregkh@linuxfoundation.org>
Cc: <linuxppc-dev@lists.ozlabs.org>, <npiggin@gmail.com>
Message-ID: <20220428124150.375623-3-mpe@ellerman.id.au>

From: Michael Ellerman <mpe@ellerman.id.au>

The SLB miss handler is not fully re-entrant. It is able to work because
we ensure that the SLB entries for the kernel text and data segment, as
well as the kernel stack, are pinned in the SLB. Accesses to kernel data
outside of those areas have to be carefully managed and can only occur in
certain parts of the code. One way we deal with that is by storing some
values in temporary slots in the paca.

In v4.13 in commit dbeea1d6b4bd ("powerpc/64s/paca: EX_LR can be merged
with EX_DAR") we merged the storage for two temporary slots for register
storage during SLB miss handling. That was safe at the time because the
two slots were never used at the same time.

Unfortunately in v4.17 in commit c2b4d8b7417a ("powerpc/mm/hash64:
Increase the VA range") we broke that condition, and introduced a case
where the two slots could be in use at the same time, leading to one
being corrupted.

Specifically, in slb_miss_common(), when we detect that we're handling a
fault for a large virtual address (> 512TB) we go to the "8" label,
where we store the original fault address into paca->exslb[EX_DAR],
before jumping to large_addr_slb() (using rfid).

We then use the EXCEPTION_PROLOG_COMMON and RECONCILE_IRQ_STATE macros
to do exception setup, before reloading the fault address from
paca->exslb[EX_DAR] and storing it into pt_regs->dar (Data Address
Register).

However the code generated by those macros can cause a recursive SLB
miss on a kernel address in three places.

The first is the saving of the PPR (Program Priority Register), which
happens on all CPUs since Power7. The PPR is saved to the thread struct,
which can be anywhere in memory. There is also the call to
accumulate_stolen_time() if CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y and
CONFIG_PPC_SPLPAR=y, and also the call to trace_hardirqs_off() if
CONFIG_TRACE_IRQFLAGS=y. The latter two call into generic C code and can
lead to accesses anywhere in memory.

On modern 64-bit CPUs we have 1TB segments, so for any of those accesses
to cause an SLB fault they must access memory more than 1TB away from
the kernel text, data and kernel stack. That typically only happens on
machines with more than 1TB of RAM. However it is possible on multi-node
Power9 systems, because memory on the 2nd node begins at 32TB in the
linear mapping.

If we take a recursive SLB fault then we will corrupt the original fault
address with the LR (Link Register) value, because the EX_DAR and EX_LR
slots share storage. Subsequently we will think we're trying to fault
that LR address, which is the wrong address, and will also most likely
lead to a segfault because the LR address will be < 512TB and so will be
rejected by slb_miss_large_addr().

This appears as a spurious segfault to userspace, and if
show_unhandled_signals is enabled you will see a fault reported in dmesg
with the LR address, not the expected fault address, eg:

  prog[123]: segfault (11) at 128a61808 nip 128a618cc lr 128a61808 code 3 in prog[128a60000+10000]
  prog[123]: code: 4bffffa4 39200040 3ce00004 7d2903a6 3c000200 78e707c6 780083e4 7d3b4b78
  prog[123]: code: 7d455378 7d7d5b78 7d9f6378 7da46b78 <f8670000> 7d3a4b78 7d465378 7d7c5b78

Notice that the fault address == the LR, and the faulting instruction is
a simple store that should never use LR.

In upstream this was fixed in v4.20 in commit
48e7b7695745 ("powerpc/64s/hash: Convert SLB miss handlers to C"),
however that is a huge rewrite and not backportable.

The minimal fix for stable is to just unmerge the EX_LR and EX_DAR slots
again, avoiding the corruption of the DAR value. This uses an extra 8
bytes per CPU, which is negligible.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/powerpc/include/asm/exception-64s.h |   15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -48,11 +48,12 @@
 #define EX_CCR		52
 #define EX_CFAR		56
 #define EX_PPR		64
+#define EX_LR		72
 #if defined(CONFIG_RELOCATABLE)
-#define EX_CTR		72
-#define EX_SIZE		10	/* size in u64 units */
+#define EX_CTR		80
+#define EX_SIZE		11	/* size in u64 units */
 #else
-#define EX_SIZE		9	/* size in u64 units */
+#define EX_SIZE		10	/* size in u64 units */
 #endif

 /*
@@ -61,14 +62,6 @@
 #define MAX_MCE_DEPTH	4

 /*
- * EX_LR is only used in EXSLB and where it does not overlap with EX_DAR
- * EX_CCR similarly with DSISR, but being 4 byte registers there is a hole
- * in the save area so it's not necessary to overlap them. Could be used
- * for future savings though if another 4 byte register was to be saved.
- */
-#define EX_LR		EX_DAR
-
-/*
  * EX_R3 is only used by the bad_stack handler. bad_stack reloads and
  * saves DAR from SPRN_DAR, and EX_DAR is not used. So EX_R3 can overlap
  * with EX_DAR.

Patches currently in stable-queue which might be from mpe@ellerman.id.au are

queue-4.19/powerpc-64s-unmerge-ex_lr-and-ex_dar.patch
queue-4.19/powerpc-64-interrupt-temporarily-save-ppr-on-stack-to-fix-register-corruption-due-to-slb-miss.patch
