[REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY

public inbox for linux-riscv@lists.infradead.org
 help / color / mirror / Atom feed

* [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
@ 2024-09-21 17:42 Jason Montleon
  2024-09-22  2:42 ` Kees Cook
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Montleon @ 2024-09-21 17:42 UTC (permalink / raw)
  To: linux-hardening; +Cc: Linux regressions mailing list, kees, linux-riscv

Starting with 6.11-rc1 I can no longer boot the Lichee Pi 4A with
FORTIFY_SOURCE enabled. This works on 6.10 up to at least 6.10.11.
However, with 6.11 I get no output at all from the kernel on the
serial console with FORTIFY_SOURCE enabled and the system never comes
online on network or otherwise as far as I can tell.

I did a bisect which led to:
2003e483a81cc235e29f77da3f6b256cb4b348e7
fortify: Do not special-case 0-sized destinations

If I revert this commit I can once again boot the Lichee Pi 4A with
FORTIFY_SOURCE enabled.

Thank you,
Jason Montleon

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-09-21 17:42 [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled Jason Montleon
@ 2024-09-22  2:42 ` Kees Cook
  2024-09-22 20:18   ` Jason Montleon
  0 siblings, 1 reply; 14+ messages in thread
From: Kees Cook @ 2024-09-22  2:42 UTC (permalink / raw)
  To: Jason Montleon, linux-hardening
  Cc: Linux regressions mailing list, linux-riscv



On September 21, 2024 10:42:11 AM PDT, Jason Montleon <jmontleo@redhat.com> wrote:
>Starting with 6.11-rc1 I can no longer boot the Lichee Pi 4A with
>FORTIFY_SOURCE enabled. This works on 6.10 up to at least 6.10.11.
>However, with 6.11 I get no output at all from the kernel on the
>serial console with FORTIFY_SOURCE enabled and the system never comes
>online on network or otherwise as far as I can tell.
>
>I did a bisect which led to:
>2003e483a81cc235e29f77da3f6b256cb4b348e7
>fortify: Do not special-case 0-sized destinations
>
>If I revert this commit I can once again boot the Lichee Pi 4A with
>FORTIFY_SOURCE enabled.
>

Thanks for the report! Are you able to catch what the error log shows? There must be some 0-sized array that snuck by.

Can you share your .config and compiler version?

-Kees

-- 
Kees Cook

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-09-22  2:42 ` Kees Cook
@ 2024-09-22 20:18   ` Jason Montleon
  2024-09-22 22:37     ` Kees Cook
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Montleon @ 2024-09-22 20:18 UTC (permalink / raw)
  To: Kees Cook; +Cc: linux-hardening, Linux regressions mailing list, linux-riscv

On Sat, Sep 21, 2024 at 10:42 PM Kees Cook <kees@kernel.org> wrote:
>
>
>
> On September 21, 2024 10:42:11 AM PDT, Jason Montleon <jmontleo@redhat.com> wrote:
> >Starting with 6.11-rc1 I can no longer boot the Lichee Pi 4A with
> >FORTIFY_SOURCE enabled. This works on 6.10 up to at least 6.10.11.
> >However, with 6.11 I get no output at all from the kernel on the
> >serial console with FORTIFY_SOURCE enabled and the system never comes
> >online on network or otherwise as far as I can tell.
> >
> >I did a bisect which led to:
> >2003e483a81cc235e29f77da3f6b256cb4b348e7
> >fortify: Do not special-case 0-sized destinations
> >
> >If I revert this commit I can once again boot the Lichee Pi 4A with
> >FORTIFY_SOURCE enabled.
> >
>
> Thanks for the report! Are you able to catch what the error log shows? There must be some 0-sized array that snuck by.
>
> Can you share your .config and compiler version?
>
Hi Kees,
Thank you for the quick reply! I am using the Fedora 40 packaged
version of gcc, gcc-14.1.1-1.fc40.riscv64.

I originally noticed this while testing a build of the Fedora RISC-V
.config on Fedora 40.
http://fedora.riscv.rocks:3000/rpms/kernel/src/branch/main-riscv64/kernel-riscv64-fedora.config

When I noticed I could not boot this on the lpi4a I tried the
defconfig(arch/riscv/configs/defconfig), which worked. After merging
the configs a bit at a time I narrowed it down to FORTIFY_SOURCE=y

To do the bisect I used the riscv defconfig, running make menuconfig
to turn on FORTIFY_SOURCE, and saving.
https://gist.github.com/jmontleon/9cdc778e9c9139296924d3f71b48067b

As far as logs, I am having a hard time gathering anything useful
because the boot fails so early. Normally with FORTIFY_SOURCE turned
on I get no output from the kernel at all.
https://gist.github.com/jmontleon/42167a7b6d71bb4db8b7ca7114893b86

With a config closer to the Fedora debug kernel config I got a bit
more, but it stopped booting here and doesn't seem much more useful.
https://gist.github.com/jmontleon/00426b3bff2c85a68370ca1fb5f968c7

If you have suggestions for getting more meaningful output I am happy to try.

The Fedora kernel config boots fine on the VisionFive 2, so I think it
is more specific to the hardware than RISC-V, maybe something T-Head
related if not specific to the Lichee Pi 4A. I was thinking because it
seems pretty hardware specific and failure is so early maybe it is due
to something in one of the THEAD errata or the patch function.

While trying some more things today I noticed if FORTIFY_SOURCE is
left unset and I also unset ERRATA_THEAD_MAE it similarly fails to
boot without output, so I think my idea is possible though I don't
have anything more concrete than that to back it up at the moment.

Thank you,
- Jason

> -Kees
>
> --
> Kees Cook
>

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-09-22 20:18   ` Jason Montleon
@ 2024-09-22 22:37     ` Kees Cook
  2024-09-24 15:58       ` Jason Montleon
  0 siblings, 1 reply; 14+ messages in thread
From: Kees Cook @ 2024-09-22 22:37 UTC (permalink / raw)
  To: Jason Montleon
  Cc: linux-hardening, Linux regressions mailing list, linux-riscv

On Sun, Sep 22, 2024 at 04:18:39PM -0400, Jason Montleon wrote:
> Thank you for the quick reply! I am using the Fedora 40 packaged
> version of gcc, gcc-14.1.1-1.fc40.riscv64.

Great! Thanks for the details.

> I originally noticed this while testing a build of the Fedora RISC-V
> .config on Fedora 40.
> http://fedora.riscv.rocks:3000/rpms/kernel/src/branch/main-riscv64/kernel-riscv64-fedora.config
> 
> When I noticed I could not boot this on the lpi4a I tried the
> defconfig(arch/riscv/configs/defconfig), which worked. After merging
> the configs a bit at a time I narrowed it down to FORTIFY_SOURCE=y

I'm trying to imagine how the boot would get stopped with 2003e483a81c
("fortify: Do not special-case 0-sized destinations") applied, since
that portion of the fortify checking is only supposed to warn (i.e. not
bug or panic) and then continue without blocking anything. I don't see
CONFIG_PANIC_ON_OOPS set in this config, so that's not it.

> To do the bisect I used the riscv defconfig, running make menuconfig
> to turn on FORTIFY_SOURCE, and saving.
> https://gist.github.com/jmontleon/9cdc778e9c9139296924d3f71b48067b
> 
> As far as logs, I am having a hard time gathering anything useful
> because the boot fails so early. Normally with FORTIFY_SOURCE turned
> on I get no output from the kernel at all.
> https://gist.github.com/jmontleon/42167a7b6d71bb4db8b7ca7114893b86
> 
> With a config closer to the Fedora debug kernel config I got a bit
> more, but it stopped booting here and doesn't seem much more useful.
> https://gist.github.com/jmontleon/00426b3bff2c85a68370ca1fb5f968c7

I don't see a panic_on_warn on the boot command line either:

  Kernel command line: console=ttyS0,115200 root=PARTUUID=c44914cf-1285-4aec-b5df-0224434d3e12 rootfstype=ext4 rw clk_ignore_unused earlycon=sbi loglevel=7 debug=console

> If you have suggestions for getting more meaningful output I am happy to try.

Can you try this patch? It should avoid using the "WARN" infrastructure
(if that is the source of blocking boot), but should still provide some
detail about what tripped it up (via the "regular" pr_*() logging). And
if it boots, can you look at the log to find what it reports for the
"Wanted to write ..." line(s)?

Thanks!

-Kees


diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
index 0d99bf11d260..469a3439959d 100644
--- a/include/linux/fortify-string.h
+++ b/include/linux/fortify-string.h
@@ -549,7 +549,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
 					 const size_t q_size,
 					 const size_t p_size_field,
 					 const size_t q_size_field,
-					 const u8 func)
+					 const u8 func,
+					 const char *field)
 {
 	if (__builtin_constant_p(size)) {
 		/*
@@ -610,8 +611,13 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
 	 * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832
 	 */
 	if (p_size_field != SIZE_MAX &&
-	    p_size != p_size_field && p_size_field < size)
-		return true;
+	    p_size != p_size_field && p_size_field < size) {
+		if (p_size_field == 0)
+			pr_warn("Wanted to write %zu to a 0-sized destination: %s\n",
+				size, field);
+		else
+			return true;
+	}
 
 	return false;
 }
@@ -625,7 +631,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
 	const size_t __q_size_field = (q_size_field);			\
 	fortify_warn_once(fortify_memcpy_chk(__fortify_size, __p_size,	\
 				     __q_size, __p_size_field,		\
-				     __q_size_field, FORTIFY_FUNC_ ##op), \
+				     __q_size_field, FORTIFY_FUNC_ ##op,\
+				     #p " at " FILE_LINE), \
 		  #op ": detected field-spanning write (size %zu) of single %s (size %zu)\n", \
 		  __fortify_size,					\
 		  "field \"" #p "\" at " FILE_LINE,			\


-- 
Kees Cook

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-09-22 22:37     ` Kees Cook
@ 2024-09-24 15:58       ` Jason Montleon
  2024-09-24 17:36         ` Kees Cook
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Montleon @ 2024-09-24 15:58 UTC (permalink / raw)
  To: Kees Cook; +Cc: linux-hardening, Linux regressions mailing list, linux-riscv

On Sun, Sep 22, 2024 at 6:38 PM Kees Cook <kees@kernel.org> wrote:
> Can you try this patch? It should avoid using the "WARN" infrastructure
> (if that is the source of blocking boot), but should still provide some
> detail about what tripped it up (via the "regular" pr_*() logging). And
> if it boots, can you look at the log to find what it reports for the
> "Wanted to write ..." line(s)?
>
> Thanks!
>
> -Kees
>
>
> diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
> index 0d99bf11d260..469a3439959d 100644
> --- a/include/linux/fortify-string.h
> +++ b/include/linux/fortify-string.h
> @@ -549,7 +549,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
>                                          const size_t q_size,
>                                          const size_t p_size_field,
>                                          const size_t q_size_field,
> -                                        const u8 func)
> +                                        const u8 func,
> +                                        const char *field)
>  {
>         if (__builtin_constant_p(size)) {
>                 /*
> @@ -610,8 +611,13 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
>          * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832
>          */
>         if (p_size_field != SIZE_MAX &&
> -           p_size != p_size_field && p_size_field < size)
> -               return true;
> +           p_size != p_size_field && p_size_field < size) {
> +               if (p_size_field == 0)
> +                       pr_warn("Wanted to write %zu to a 0-sized destination: %s\n",
> +                               size, field);
> +               else
> +                       return true;
> +       }
>
>         return false;
>  }
> @@ -625,7 +631,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
>         const size_t __q_size_field = (q_size_field);                   \
>         fortify_warn_once(fortify_memcpy_chk(__fortify_size, __p_size,  \
>                                      __q_size, __p_size_field,          \
> -                                    __q_size_field, FORTIFY_FUNC_ ##op), \
> +                                    __q_size_field, FORTIFY_FUNC_ ##op,\
> +                                    #p " at " FILE_LINE), \
>                   #op ": detected field-spanning write (size %zu) of single %s (size %zu)\n", \
>                   __fortify_size,                                       \
>                   "field \"" #p "\" at " FILE_LINE,                     \
>
>
> --
> Kees Cook
>

Hi Kees,
Sorry for the delay in responding. Your patch also caused the system
to hang booting, but I took it as inspiration to try some things.
TL;DR I see these two print several times:

Wanted to write 8 to a 0-sized destination: oldptr at
arch/riscv/errata/thead/errata.c:185
Wanted to write 28 to a 0-sized destination: oldptr at
arch/riscv/errata/thead/errata.c:185

Since I suspected CONFIG_ERRATA_THEAD_MAE which relies on
CONFIG_RISCV_ALTERNATIVE_EARLY I suspected it is just too early for
pr_*, etc. I tried trace_printk but that didn't produce any traces,
and then I found a comment which makes me think we're too early for
that too.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/kernel/alternative.c?h=v6.11#n214

I didn't want to give up so I wrote a simple module to figure out how
to use sbi_ecall and then incorporated that idea into your patch,
although even that was not straightforward:
https://gist.github.com/jmontleon/f3536ad70dc92bed992139ad5b266d5c

Full output: https://gist.github.com/jmontleon/823facff38e28b717f153e38adcfd7fe

Thank you again,
Jason Montleon

-- 
Jason Montleon        | email: jmontleo@redhat.com
Red Hat, Inc.         | gpg key: 0x069E3022
Cell: 508-496-0663    | irc: jmontleo / jmontleon


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-09-24 15:58       ` Jason Montleon
@ 2024-09-24 17:36         ` Kees Cook
  2024-09-25 14:32           ` Jason Montleon
  0 siblings, 1 reply; 14+ messages in thread
From: Kees Cook @ 2024-09-24 17:36 UTC (permalink / raw)
  To: Jason Montleon
  Cc: linux-hardening, Linux regressions mailing list, linux-riscv



On September 24, 2024 8:58:57 AM PDT, Jason Montleon <jmontleo@redhat.com> wrote:
>On Sun, Sep 22, 2024 at 6:38 PM Kees Cook <kees@kernel.org> wrote:
>> Can you try this patch? It should avoid using the "WARN" infrastructure
>> (if that is the source of blocking boot), but should still provide some
>> detail about what tripped it up (via the "regular" pr_*() logging). And
>> if it boots, can you look at the log to find what it reports for the
>> "Wanted to write ..." line(s)?
>>
>> Thanks!
>>
>> -Kees
>>
>>
>> diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
>> index 0d99bf11d260..469a3439959d 100644
>> --- a/include/linux/fortify-string.h
>> +++ b/include/linux/fortify-string.h
>> @@ -549,7 +549,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
>>                                          const size_t q_size,
>>                                          const size_t p_size_field,
>>                                          const size_t q_size_field,
>> -                                        const u8 func)
>> +                                        const u8 func,
>> +                                        const char *field)
>>  {
>>         if (__builtin_constant_p(size)) {
>>                 /*
>> @@ -610,8 +611,13 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
>>          * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832
>>          */
>>         if (p_size_field != SIZE_MAX &&
>> -           p_size != p_size_field && p_size_field < size)
>> -               return true;
>> +           p_size != p_size_field && p_size_field < size) {
>> +               if (p_size_field == 0)
>> +                       pr_warn("Wanted to write %zu to a 0-sized destination: %s\n",
>> +                               size, field);
>> +               else
>> +                       return true;
>> +       }
>>
>>         return false;
>>  }
>> @@ -625,7 +631,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
>>         const size_t __q_size_field = (q_size_field);                   \
>>         fortify_warn_once(fortify_memcpy_chk(__fortify_size, __p_size,  \
>>                                      __q_size, __p_size_field,          \
>> -                                    __q_size_field, FORTIFY_FUNC_ ##op), \
>> +                                    __q_size_field, FORTIFY_FUNC_ ##op,\
>> +                                    #p " at " FILE_LINE), \
>>                   #op ": detected field-spanning write (size %zu) of single %s (size %zu)\n", \
>>                   __fortify_size,                                       \
>>                   "field \"" #p "\" at " FILE_LINE,                     \
>>
>>
>> --
>> Kees Cook
>>
>
>Hi Kees,
>Sorry for the delay in responding. Your patch also caused the system
>to hang booting, but I took it as inspiration to try some things.
>TL;DR I see these two print several times:
>
>Wanted to write 8 to a 0-sized destination: oldptr at
>arch/riscv/errata/thead/errata.c:185
>Wanted to write 28 to a 0-sized destination: oldptr at
>arch/riscv/errata/thead/errata.c:185
>
>Since I suspected CONFIG_ERRATA_THEAD_MAE which relies on
>CONFIG_RISCV_ALTERNATIVE_EARLY I suspected it is just too early for
>pr_*, etc. I tried trace_printk but that didn't produce any traces,
>and then I found a comment which makes me think we're too early for
>that too.
>https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/kernel/alternative.c?h=v6.11#n214
>
>I didn't want to give up so I wrote a simple module to figure out how
>to use sbi_ecall and then incorporated that idea into your patch,
>although even that was not straightforward:
>https://gist.github.com/jmontleon/f3536ad70dc92bed992139ad5b266d5c

Oh nice work! Probably something riscv-specific needs to be wired up to the printk routines to do what you have there until printk is fully available  But that's a separate issue, I guess.

>Full output: https://gist.github.com/jmontleon/823facff38e28b717f153e38adcfd7fe

Does the system boot with your patch?

I would guess that this line at arch/riscv/errata/thead/errata.c:168:

void *oldptr, *altptr;

Should be:

u8 *oldptr, *altptr;

But maybe __ALT_PTR needs adjustment too? Hmm. I would have expected void * to have a size of SIZE_MAX here, though, so something else might be confusing the check.


-- 
Kees Cook

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-09-24 17:36         ` Kees Cook
@ 2024-09-25 14:32           ` Jason Montleon
  2024-10-01 14:28             ` Alexandre Ghiti
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Montleon @ 2024-09-25 14:32 UTC (permalink / raw)
  To: Kees Cook; +Cc: linux-hardening, Linux regressions mailing list, linux-riscv

On Tue, Sep 24, 2024 at 1:36 PM Kees Cook <kees@kernel.org> wrote:
>
>
>
> On September 24, 2024 8:58:57 AM PDT, Jason Montleon <jmontleo@redhat.com> wrote:
> >On Sun, Sep 22, 2024 at 6:38 PM Kees Cook <kees@kernel.org> wrote:
> >> Can you try this patch? It should avoid using the "WARN" infrastructure
> >> (if that is the source of blocking boot), but should still provide some
> >> detail about what tripped it up (via the "regular" pr_*() logging). And
> >> if it boots, can you look at the log to find what it reports for the
> >> "Wanted to write ..." line(s)?
> >>
> >> Thanks!
> >>
> >> -Kees
> >>
> >>
> >> diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
> >> index 0d99bf11d260..469a3439959d 100644
> >> --- a/include/linux/fortify-string.h
> >> +++ b/include/linux/fortify-string.h
> >> @@ -549,7 +549,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
> >>                                          const size_t q_size,
> >>                                          const size_t p_size_field,
> >>                                          const size_t q_size_field,
> >> -                                        const u8 func)
> >> +                                        const u8 func,
> >> +                                        const char *field)
> >>  {
> >>         if (__builtin_constant_p(size)) {
> >>                 /*
> >> @@ -610,8 +611,13 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
> >>          * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832
> >>          */
> >>         if (p_size_field != SIZE_MAX &&
> >> -           p_size != p_size_field && p_size_field < size)
> >> -               return true;
> >> +           p_size != p_size_field && p_size_field < size) {
> >> +               if (p_size_field == 0)
> >> +                       pr_warn("Wanted to write %zu to a 0-sized destination: %s\n",
> >> +                               size, field);
> >> +               else
> >> +                       return true;
> >> +       }
> >>
> >>         return false;
> >>  }
> >> @@ -625,7 +631,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
> >>         const size_t __q_size_field = (q_size_field);                   \
> >>         fortify_warn_once(fortify_memcpy_chk(__fortify_size, __p_size,  \
> >>                                      __q_size, __p_size_field,          \
> >> -                                    __q_size_field, FORTIFY_FUNC_ ##op), \
> >> +                                    __q_size_field, FORTIFY_FUNC_ ##op,\
> >> +                                    #p " at " FILE_LINE), \
> >>                   #op ": detected field-spanning write (size %zu) of single %s (size %zu)\n", \
> >>                   __fortify_size,                                       \
> >>                   "field \"" #p "\" at " FILE_LINE,                     \
> >>
> >>
> >> --
> >> Kees Cook
> >>
> >
> >Hi Kees,
> >Sorry for the delay in responding. Your patch also caused the system
> >to hang booting, but I took it as inspiration to try some things.
> >TL;DR I see these two print several times:
> >
> >Wanted to write 8 to a 0-sized destination: oldptr at
> >arch/riscv/errata/thead/errata.c:185
> >Wanted to write 28 to a 0-sized destination: oldptr at
> >arch/riscv/errata/thead/errata.c:185
> >
> >Since I suspected CONFIG_ERRATA_THEAD_MAE which relies on
> >CONFIG_RISCV_ALTERNATIVE_EARLY I suspected it is just too early for
> >pr_*, etc. I tried trace_printk but that didn't produce any traces,
> >and then I found a comment which makes me think we're too early for
> >that too.
> >https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/kernel/alternative.c?h=v6.11#n214
> >
> >I didn't want to give up so I wrote a simple module to figure out how
> >to use sbi_ecall and then incorporated that idea into your patch,
> >although even that was not straightforward:
> >https://gist.github.com/jmontleon/f3536ad70dc92bed992139ad5b266d5c
>
> Oh nice work! Probably something riscv-specific needs to be wired up to the printk routines to do what you have there until printk is fully available  But that's a separate issue, I guess.
>
> >Full output: https://gist.github.com/jmontleon/823facff38e28b717f153e38adcfd7fe
>
> Does the system boot with your patch?

Yes. It boots fully. It also booted with trace_printk, it just didn't
produce a trace, which again I think is expected at that stage. It
seems as long as there are no 'normal' print attempts, pr_()*, etc. it
works.

> I would guess that this line at arch/riscv/errata/thead/errata.c:168:
>
> void *oldptr, *altptr;
>
> Should be:
>
> u8 *oldptr, *altptr;

I tried this, but it didn't appear to make a difference, I will try
some more today, though I admit to being out of my element.



> But maybe __ALT_PTR needs adjustment too? Hmm. I would have expected void * to have a size of SIZE_MAX here, though, so something else might be confusing the check.
>
>
> --
> Kees Cook
>


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-09-25 14:32           ` Jason Montleon
@ 2024-10-01 14:28             ` Alexandre Ghiti
  2024-10-02 15:14               ` Jason Montleon
  0 siblings, 1 reply; 14+ messages in thread
From: Alexandre Ghiti @ 2024-10-01 14:28 UTC (permalink / raw)
  To: Jason Montleon, Kees Cook
  Cc: linux-hardening, Linux regressions mailing list, linux-riscv

Hi Jason,

On 25/09/2024 16:32, Jason Montleon wrote:
> On Tue, Sep 24, 2024 at 1:36 PM Kees Cook <kees@kernel.org> wrote:
>>
>>
>> On September 24, 2024 8:58:57 AM PDT, Jason Montleon <jmontleo@redhat.com> wrote:
>>> On Sun, Sep 22, 2024 at 6:38 PM Kees Cook <kees@kernel.org> wrote:
>>>> Can you try this patch? It should avoid using the "WARN" infrastructure
>>>> (if that is the source of blocking boot), but should still provide some
>>>> detail about what tripped it up (via the "regular" pr_*() logging). And
>>>> if it boots, can you look at the log to find what it reports for the
>>>> "Wanted to write ..." line(s)?
>>>>
>>>> Thanks!
>>>>
>>>> -Kees
>>>>
>>>>
>>>> diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
>>>> index 0d99bf11d260..469a3439959d 100644
>>>> --- a/include/linux/fortify-string.h
>>>> +++ b/include/linux/fortify-string.h
>>>> @@ -549,7 +549,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
>>>>                                           const size_t q_size,
>>>>                                           const size_t p_size_field,
>>>>                                           const size_t q_size_field,
>>>> -                                        const u8 func)
>>>> +                                        const u8 func,
>>>> +                                        const char *field)
>>>>   {
>>>>          if (__builtin_constant_p(size)) {
>>>>                  /*
>>>> @@ -610,8 +611,13 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
>>>>           * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832
>>>>           */
>>>>          if (p_size_field != SIZE_MAX &&
>>>> -           p_size != p_size_field && p_size_field < size)
>>>> -               return true;
>>>> +           p_size != p_size_field && p_size_field < size) {
>>>> +               if (p_size_field == 0)
>>>> +                       pr_warn("Wanted to write %zu to a 0-sized destination: %s\n",
>>>> +                               size, field);
>>>> +               else
>>>> +                       return true;
>>>> +       }
>>>>
>>>>          return false;
>>>>   }
>>>> @@ -625,7 +631,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
>>>>          const size_t __q_size_field = (q_size_field);                   \
>>>>          fortify_warn_once(fortify_memcpy_chk(__fortify_size, __p_size,  \
>>>>                                       __q_size, __p_size_field,          \
>>>> -                                    __q_size_field, FORTIFY_FUNC_ ##op), \
>>>> +                                    __q_size_field, FORTIFY_FUNC_ ##op,\
>>>> +                                    #p " at " FILE_LINE), \
>>>>                    #op ": detected field-spanning write (size %zu) of single %s (size %zu)\n", \
>>>>                    __fortify_size,                                       \
>>>>                    "field \"" #p "\" at " FILE_LINE,                     \
>>>>
>>>>
>>>> --
>>>> Kees Cook
>>>>
>>> Hi Kees,
>>> Sorry for the delay in responding. Your patch also caused the system
>>> to hang booting, but I took it as inspiration to try some things.
>>> TL;DR I see these two print several times:
>>>
>>> Wanted to write 8 to a 0-sized destination: oldptr at
>>> arch/riscv/errata/thead/errata.c:185
>>> Wanted to write 28 to a 0-sized destination: oldptr at
>>> arch/riscv/errata/thead/errata.c:185
>>>
>>> Since I suspected CONFIG_ERRATA_THEAD_MAE which relies on
>>> CONFIG_RISCV_ALTERNATIVE_EARLY I suspected it is just too early for
>>> pr_*, etc. I tried trace_printk but that didn't produce any traces,
>>> and then I found a comment which makes me think we're too early for
>>> that too.
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/kernel/alternative.c?h=v6.11#n214
>>>
>>> I didn't want to give up so I wrote a simple module to figure out how
>>> to use sbi_ecall and then incorporated that idea into your patch,
>>> although even that was not straightforward:
>>> https://gist.github.com/jmontleon/f3536ad70dc92bed992139ad5b266d5c
>> Oh nice work! Probably something riscv-specific needs to be wired up to the printk routines to do what you have there until printk is fully available  But that's a separate issue, I guess.
>>
>>> Full output: https://gist.github.com/jmontleon/823facff38e28b717f153e38adcfd7fe
>> Does the system boot with your patch?
> Yes. It boots fully. It also booted with trace_printk, it just didn't
> produce a trace, which again I think is expected at that stage. It
> seems as long as there are no 'normal' print attempts, pr_()*, etc. it
> works.


Hmmm I don't see what in your patch would fix your issue.

I sent a patch 
(https://lore.kernel.org/linux-riscv/20240829165048.49756-1-alexghiti@rivosinc.com/) 
that was merged in v6.11-rc7 which fixes something similar, could that 
be what fixed your problem?

The problem with this early code is that the MMU is not enabled yet and 
if your kernel is relocatable, the relocations are set "virtually" and 
then without the MMU, a call to an external function results in a 
trap...Which happens way too often, I have a few ideas to fix that, I 
need to find time to tackle this though.

I don't know FORTIFY_SOURCE, but I guess this could cause a call to an 
external function and then cause the issue I describe above right? We 
already remove all instrumentations from this early code so maybe we 
should remove FORTIFY_SOURCE too?

Thanks for digging into this,

Alex


>
>> I would guess that this line at arch/riscv/errata/thead/errata.c:168:
>>
>> void *oldptr, *altptr;
>>
>> Should be:
>>
>> u8 *oldptr, *altptr;
> I tried this, but it didn't appear to make a difference, I will try
> some more today, though I admit to being out of my element.
>
>
>
>> But maybe __ALT_PTR needs adjustment too? Hmm. I would have expected void * to have a size of SIZE_MAX here, though, so something else might be confusing the check.
>>
>>
>> --
>> Kees Cook
>>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-10-01 14:28             ` Alexandre Ghiti
@ 2024-10-02 15:14               ` Jason Montleon
  2024-10-03 14:41                 ` Alexandre Ghiti
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Montleon @ 2024-10-02 15:14 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Kees Cook, linux-hardening, Linux regressions mailing list,
	linux-riscv

On Tue, Oct 1, 2024 at 10:53 AM Alexandre Ghiti <alex@ghiti.fr> wrote:
>
> Hi Jason,
>
> On 25/09/2024 16:32, Jason Montleon wrote:
> > On Tue, Sep 24, 2024 at 1:36 PM Kees Cook <kees@kernel.org> wrote:
> >>
> >>
> >> On September 24, 2024 8:58:57 AM PDT, Jason Montleon <jmontleo@redhat.com> wrote:
> >>> On Sun, Sep 22, 2024 at 6:38 PM Kees Cook <kees@kernel.org> wrote:
> >>>> Can you try this patch? It should avoid using the "WARN" infrastructure
> >>>> (if that is the source of blocking boot), but should still provide some
> >>>> detail about what tripped it up (via the "regular" pr_*() logging). And
> >>>> if it boots, can you look at the log to find what it reports for the
> >>>> "Wanted to write ..." line(s)?
> >>>>
> >>>> Thanks!
> >>>>
> >>>> -Kees
> >>>>
> >>>>
> >>>> diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
> >>>> index 0d99bf11d260..469a3439959d 100644
> >>>> --- a/include/linux/fortify-string.h
> >>>> +++ b/include/linux/fortify-string.h
> >>>> @@ -549,7 +549,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
> >>>>                                           const size_t q_size,
> >>>>                                           const size_t p_size_field,
> >>>>                                           const size_t q_size_field,
> >>>> -                                        const u8 func)
> >>>> +                                        const u8 func,
> >>>> +                                        const char *field)
> >>>>   {
> >>>>          if (__builtin_constant_p(size)) {
> >>>>                  /*
> >>>> @@ -610,8 +611,13 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
> >>>>           * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832
> >>>>           */
> >>>>          if (p_size_field != SIZE_MAX &&
> >>>> -           p_size != p_size_field && p_size_field < size)
> >>>> -               return true;
> >>>> +           p_size != p_size_field && p_size_field < size) {
> >>>> +               if (p_size_field == 0)
> >>>> +                       pr_warn("Wanted to write %zu to a 0-sized destination: %s\n",
> >>>> +                               size, field);
> >>>> +               else
> >>>> +                       return true;
> >>>> +       }
> >>>>
> >>>>          return false;
> >>>>   }
> >>>> @@ -625,7 +631,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
> >>>>          const size_t __q_size_field = (q_size_field);                   \
> >>>>          fortify_warn_once(fortify_memcpy_chk(__fortify_size, __p_size,  \
> >>>>                                       __q_size, __p_size_field,          \
> >>>> -                                    __q_size_field, FORTIFY_FUNC_ ##op), \
> >>>> +                                    __q_size_field, FORTIFY_FUNC_ ##op,\
> >>>> +                                    #p " at " FILE_LINE), \
> >>>>                    #op ": detected field-spanning write (size %zu) of single %s (size %zu)\n", \
> >>>>                    __fortify_size,                                       \
> >>>>                    "field \"" #p "\" at " FILE_LINE,                     \
> >>>>
> >>>>
> >>>> --
> >>>> Kees Cook
> >>>>
> >>> Hi Kees,
> >>> Sorry for the delay in responding. Your patch also caused the system
> >>> to hang booting, but I took it as inspiration to try some things.
> >>> TL;DR I see these two print several times:
> >>>
> >>> Wanted to write 8 to a 0-sized destination: oldptr at
> >>> arch/riscv/errata/thead/errata.c:185
> >>> Wanted to write 28 to a 0-sized destination: oldptr at
> >>> arch/riscv/errata/thead/errata.c:185
> >>>
> >>> Since I suspected CONFIG_ERRATA_THEAD_MAE which relies on
> >>> CONFIG_RISCV_ALTERNATIVE_EARLY I suspected it is just too early for
> >>> pr_*, etc. I tried trace_printk but that didn't produce any traces,
> >>> and then I found a comment which makes me think we're too early for
> >>> that too.
> >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/kernel/alternative.c?h=v6.11#n214
> >>>
> >>> I didn't want to give up so I wrote a simple module to figure out how
> >>> to use sbi_ecall and then incorporated that idea into your patch,
> >>> although even that was not straightforward:
> >>> https://gist.github.com/jmontleon/f3536ad70dc92bed992139ad5b266d5c
> >> Oh nice work! Probably something riscv-specific needs to be wired up to the printk routines to do what you have there until printk is fully available  But that's a separate issue, I guess.
> >>
> >>> Full output: https://gist.github.com/jmontleon/823facff38e28b717f153e38adcfd7fe
> >> Does the system boot with your patch?
> > Yes. It boots fully. It also booted with trace_printk, it just didn't
> > produce a trace, which again I think is expected at that stage. It
> > seems as long as there are no 'normal' print attempts, pr_()*, etc. it
> > works.
>
>
> Hmmm I don't see what in your patch would fix your issue.
>
> I sent a patch
> (https://lore.kernel.org/linux-riscv/20240829165048.49756-1-alexghiti@rivosinc.com/)
> that was merged in v6.11-rc7 which fixes something similar, could that
> be what fixed your problem?
>

Hi Alex, Thank you for having a look.

Although I narrowed it down to starting with 6.11-rc1, the first time
I encountered this was actually with an rc7 Fedora build.
http://riscv.rocks/koji/buildinfo?buildID=338433

I also did a build from upstream 6.11.1 last night and still see the
problem trying to boot with it this morning.

> The problem with this early code is that the MMU is not enabled yet and
> if your kernel is relocatable, the relocations are set "virtually" and
> then without the MMU, a call to an external function results in a
> trap...Which happens way too often, I have a few ideas to fix that, I
> need to find time to tackle this though.
>
> I don't know FORTIFY_SOURCE, but I guess this could cause a call to an
> external function and then cause the issue I describe above right? We
> already remove all instrumentations from this early code so maybe we
> should remove FORTIFY_SOURCE too?
>
What you are describing here does not seem unreasonable to me, but I
simply do not know enough to say one way or the other.
Is there something I can do to collect more information to determine
if this is the cause?

Thank you again for having a look,
Jason

> Thanks for digging into this,
>
> Alex
>
>
> >
> >> I would guess that this line at arch/riscv/errata/thead/errata.c:168:
> >>
> >> void *oldptr, *altptr;
> >>
> >> Should be:
> >>
> >> u8 *oldptr, *altptr;
> > I tried this, but it didn't appear to make a difference, I will try
> > some more today, though I admit to being out of my element.
> >
> >
> >
> >> But maybe __ALT_PTR needs adjustment too? Hmm. I would have expected void * to have a size of SIZE_MAX here, though, so something else might be confusing the check.
> >>
> >>
> >> --
> >> Kees Cook
> >>
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
>


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-10-02 15:14               ` Jason Montleon
@ 2024-10-03 14:41                 ` Alexandre Ghiti
  2024-10-03 17:12                   ` Jason Montleon
  0 siblings, 1 reply; 14+ messages in thread
From: Alexandre Ghiti @ 2024-10-03 14:41 UTC (permalink / raw)
  To: Jason Montleon
  Cc: Kees Cook, linux-hardening, Linux regressions mailing list,
	linux-riscv

Hi Jason,

On 02/10/2024 17:14, Jason Montleon wrote:
> On Tue, Oct 1, 2024 at 10:53 AM Alexandre Ghiti <alex@ghiti.fr> wrote:
>> Hi Jason,
>>
>> On 25/09/2024 16:32, Jason Montleon wrote:
>>> On Tue, Sep 24, 2024 at 1:36 PM Kees Cook <kees@kernel.org> wrote:
>>>>
>>>> On September 24, 2024 8:58:57 AM PDT, Jason Montleon <jmontleo@redhat.com> wrote:
>>>>> On Sun, Sep 22, 2024 at 6:38 PM Kees Cook <kees@kernel.org> wrote:
>>>>>> Can you try this patch? It should avoid using the "WARN" infrastructure
>>>>>> (if that is the source of blocking boot), but should still provide some
>>>>>> detail about what tripped it up (via the "regular" pr_*() logging). And
>>>>>> if it boots, can you look at the log to find what it reports for the
>>>>>> "Wanted to write ..." line(s)?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> -Kees
>>>>>>
>>>>>>
>>>>>> diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
>>>>>> index 0d99bf11d260..469a3439959d 100644
>>>>>> --- a/include/linux/fortify-string.h
>>>>>> +++ b/include/linux/fortify-string.h
>>>>>> @@ -549,7 +549,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
>>>>>>                                            const size_t q_size,
>>>>>>                                            const size_t p_size_field,
>>>>>>                                            const size_t q_size_field,
>>>>>> -                                        const u8 func)
>>>>>> +                                        const u8 func,
>>>>>> +                                        const char *field)
>>>>>>    {
>>>>>>           if (__builtin_constant_p(size)) {
>>>>>>                   /*
>>>>>> @@ -610,8 +611,13 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
>>>>>>            * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832
>>>>>>            */
>>>>>>           if (p_size_field != SIZE_MAX &&
>>>>>> -           p_size != p_size_field && p_size_field < size)
>>>>>> -               return true;
>>>>>> +           p_size != p_size_field && p_size_field < size) {
>>>>>> +               if (p_size_field == 0)
>>>>>> +                       pr_warn("Wanted to write %zu to a 0-sized destination: %s\n",
>>>>>> +                               size, field);
>>>>>> +               else
>>>>>> +                       return true;
>>>>>> +       }
>>>>>>
>>>>>>           return false;
>>>>>>    }
>>>>>> @@ -625,7 +631,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
>>>>>>           const size_t __q_size_field = (q_size_field);                   \
>>>>>>           fortify_warn_once(fortify_memcpy_chk(__fortify_size, __p_size,  \
>>>>>>                                        __q_size, __p_size_field,          \
>>>>>> -                                    __q_size_field, FORTIFY_FUNC_ ##op), \
>>>>>> +                                    __q_size_field, FORTIFY_FUNC_ ##op,\
>>>>>> +                                    #p " at " FILE_LINE), \
>>>>>>                     #op ": detected field-spanning write (size %zu) of single %s (size %zu)\n", \
>>>>>>                     __fortify_size,                                       \
>>>>>>                     "field \"" #p "\" at " FILE_LINE,                     \
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Kees Cook
>>>>>>
>>>>> Hi Kees,
>>>>> Sorry for the delay in responding. Your patch also caused the system
>>>>> to hang booting, but I took it as inspiration to try some things.
>>>>> TL;DR I see these two print several times:
>>>>>
>>>>> Wanted to write 8 to a 0-sized destination: oldptr at
>>>>> arch/riscv/errata/thead/errata.c:185
>>>>> Wanted to write 28 to a 0-sized destination: oldptr at
>>>>> arch/riscv/errata/thead/errata.c:185
>>>>>
>>>>> Since I suspected CONFIG_ERRATA_THEAD_MAE which relies on
>>>>> CONFIG_RISCV_ALTERNATIVE_EARLY I suspected it is just too early for
>>>>> pr_*, etc. I tried trace_printk but that didn't produce any traces,
>>>>> and then I found a comment which makes me think we're too early for
>>>>> that too.
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/kernel/alternative.c?h=v6.11#n214
>>>>>
>>>>> I didn't want to give up so I wrote a simple module to figure out how
>>>>> to use sbi_ecall and then incorporated that idea into your patch,
>>>>> although even that was not straightforward:
>>>>> https://gist.github.com/jmontleon/f3536ad70dc92bed992139ad5b266d5c
>>>> Oh nice work! Probably something riscv-specific needs to be wired up to the printk routines to do what you have there until printk is fully available  But that's a separate issue, I guess.
>>>>
>>>>> Full output: https://gist.github.com/jmontleon/823facff38e28b717f153e38adcfd7fe
>>>> Does the system boot with your patch?
>>> Yes. It boots fully. It also booted with trace_printk, it just didn't
>>> produce a trace, which again I think is expected at that stage. It
>>> seems as long as there are no 'normal' print attempts, pr_()*, etc. it
>>> works.
>>
>> Hmmm I don't see what in your patch would fix your issue.
>>
>> I sent a patch
>> (https://lore.kernel.org/linux-riscv/20240829165048.49756-1-alexghiti@rivosinc.com/)
>> that was merged in v6.11-rc7 which fixes something similar, could that
>> be what fixed your problem?
>>
> Hi Alex, Thank you for having a look.
>
> Although I narrowed it down to starting with 6.11-rc1, the first time
> I encountered this was actually with an rc7 Fedora build.
> http://riscv.rocks/koji/buildinfo?buildID=338433
>
> I also did a build from upstream 6.11.1 last night and still see the
> problem trying to boot with it this morning.
>
>> The problem with this early code is that the MMU is not enabled yet and
>> if your kernel is relocatable, the relocations are set "virtually" and
>> then without the MMU, a call to an external function results in a
>> trap...Which happens way too often, I have a few ideas to fix that, I
>> need to find time to tackle this though.
>>
>> I don't know FORTIFY_SOURCE, but I guess this could cause a call to an
>> external function and then cause the issue I describe above right? We
>> already remove all instrumentations from this early code so maybe we
>> should remove FORTIFY_SOURCE too?


So I was able to reproduce the issue on qemu by adding a few tweaks, and 
indeed we trap in __warn_printk() on a virtual address but MMU is not 
enabled yet.

The following diff though allows me to pass this failure but I can't get 
much further in the boot since the tweaks I added won't allow it, can 
you give the following a try?

diff --git a/arch/riscv/errata/Makefile b/arch/riscv/errata/Makefile
index 8a27394851233..4913f3b3f198c 100644
--- a/arch/riscv/errata/Makefile
+++ b/arch/riscv/errata/Makefile
@@ -2,6 +2,10 @@ ifdef CONFIG_RELOCATABLE
  KBUILD_CFLAGS += -fno-pie
  endif

+ifdef CONFIG_RISCV_ALTERNATIVE_EARLY
+KBUILD_CFLAGS += -D__NO_FORTIFY
+endif
+
  obj-$(CONFIG_ERRATA_ANDES) += andes/
  obj-$(CONFIG_ERRATA_SIFIVE) += sifive/
  obj-$(CONFIG_ERRATA_THEAD) += thead/


Thanks,

Alex


>>
> What you are describing here does not seem unreasonable to me, but I
> simply do not know enough to say one way or the other.
> Is there something I can do to collect more information to determine
> if this is the cause?
>
> Thank you again for having a look,
> Jason
>
>> Thanks for digging into this,
>>
>> Alex
>>
>>
>>>> I would guess that this line at arch/riscv/errata/thead/errata.c:168:
>>>>
>>>> void *oldptr, *altptr;
>>>>
>>>> Should be:
>>>>
>>>> u8 *oldptr, *altptr;
>>> I tried this, but it didn't appear to make a difference, I will try
>>> some more today, though I admit to being out of my element.
>>>
>>>
>>>
>>>> But maybe __ALT_PTR needs adjustment too? Hmm. I would have expected void * to have a size of SIZE_MAX here, though, so something else might be confusing the check.
>>>>
>>>>
>>>> --
>>>> Kees Cook
>>>>
>>> _______________________________________________
>>> linux-riscv mailing list
>>> linux-riscv@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-riscv
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-10-03 14:41                 ` Alexandre Ghiti
@ 2024-10-03 17:12                   ` Jason Montleon
  2024-10-03 21:21                     ` Kees Cook
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Montleon @ 2024-10-03 17:12 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Kees Cook, linux-hardening, Linux regressions mailing list,
	linux-riscv

On Thu, Oct 3, 2024 at 10:41 AM Alexandre Ghiti <alex@ghiti.fr> wrote:
>
> Hi Jason,
>
> On 02/10/2024 17:14, Jason Montleon wrote:
> > On Tue, Oct 1, 2024 at 10:53 AM Alexandre Ghiti <alex@ghiti.fr> wrote:
> >> Hi Jason,
> >>
> >> On 25/09/2024 16:32, Jason Montleon wrote:
> >>> On Tue, Sep 24, 2024 at 1:36 PM Kees Cook <kees@kernel.org> wrote:
> >>>>
> >>>> On September 24, 2024 8:58:57 AM PDT, Jason Montleon <jmontleo@redhat.com> wrote:
> >>>>> On Sun, Sep 22, 2024 at 6:38 PM Kees Cook <kees@kernel.org> wrote:
> >>>>>> Can you try this patch? It should avoid using the "WARN" infrastructure
> >>>>>> (if that is the source of blocking boot), but should still provide some
> >>>>>> detail about what tripped it up (via the "regular" pr_*() logging). And
> >>>>>> if it boots, can you look at the log to find what it reports for the
> >>>>>> "Wanted to write ..." line(s)?
> >>>>>>
> >>>>>> Thanks!
> >>>>>>
> >>>>>> -Kees
> >>>>>>
> >>>>>>
> >>>>>> diff --git a/include/linux/fortify-string.h b/include/linux/fortify-string.h
> >>>>>> index 0d99bf11d260..469a3439959d 100644
> >>>>>> --- a/include/linux/fortify-string.h
> >>>>>> +++ b/include/linux/fortify-string.h
> >>>>>> @@ -549,7 +549,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
> >>>>>>                                            const size_t q_size,
> >>>>>>                                            const size_t p_size_field,
> >>>>>>                                            const size_t q_size_field,
> >>>>>> -                                        const u8 func)
> >>>>>> +                                        const u8 func,
> >>>>>> +                                        const char *field)
> >>>>>>    {
> >>>>>>           if (__builtin_constant_p(size)) {
> >>>>>>                   /*
> >>>>>> @@ -610,8 +611,13 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
> >>>>>>            * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832
> >>>>>>            */
> >>>>>>           if (p_size_field != SIZE_MAX &&
> >>>>>> -           p_size != p_size_field && p_size_field < size)
> >>>>>> -               return true;
> >>>>>> +           p_size != p_size_field && p_size_field < size) {
> >>>>>> +               if (p_size_field == 0)
> >>>>>> +                       pr_warn("Wanted to write %zu to a 0-sized destination: %s\n",
> >>>>>> +                               size, field);
> >>>>>> +               else
> >>>>>> +                       return true;
> >>>>>> +       }
> >>>>>>
> >>>>>>           return false;
> >>>>>>    }
> >>>>>> @@ -625,7 +631,8 @@ __FORTIFY_INLINE bool fortify_memcpy_chk(__kernel_size_t size,
> >>>>>>           const size_t __q_size_field = (q_size_field);                   \
> >>>>>>           fortify_warn_once(fortify_memcpy_chk(__fortify_size, __p_size,  \
> >>>>>>                                        __q_size, __p_size_field,          \
> >>>>>> -                                    __q_size_field, FORTIFY_FUNC_ ##op), \
> >>>>>> +                                    __q_size_field, FORTIFY_FUNC_ ##op,\
> >>>>>> +                                    #p " at " FILE_LINE), \
> >>>>>>                     #op ": detected field-spanning write (size %zu) of single %s (size %zu)\n", \
> >>>>>>                     __fortify_size,                                       \
> >>>>>>                     "field \"" #p "\" at " FILE_LINE,                     \
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Kees Cook
> >>>>>>
> >>>>> Hi Kees,
> >>>>> Sorry for the delay in responding. Your patch also caused the system
> >>>>> to hang booting, but I took it as inspiration to try some things.
> >>>>> TL;DR I see these two print several times:
> >>>>>
> >>>>> Wanted to write 8 to a 0-sized destination: oldptr at
> >>>>> arch/riscv/errata/thead/errata.c:185
> >>>>> Wanted to write 28 to a 0-sized destination: oldptr at
> >>>>> arch/riscv/errata/thead/errata.c:185
> >>>>>
> >>>>> Since I suspected CONFIG_ERRATA_THEAD_MAE which relies on
> >>>>> CONFIG_RISCV_ALTERNATIVE_EARLY I suspected it is just too early for
> >>>>> pr_*, etc. I tried trace_printk but that didn't produce any traces,
> >>>>> and then I found a comment which makes me think we're too early for
> >>>>> that too.
> >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/kernel/alternative.c?h=v6.11#n214
> >>>>>
> >>>>> I didn't want to give up so I wrote a simple module to figure out how
> >>>>> to use sbi_ecall and then incorporated that idea into your patch,
> >>>>> although even that was not straightforward:
> >>>>> https://gist.github.com/jmontleon/f3536ad70dc92bed992139ad5b266d5c
> >>>> Oh nice work! Probably something riscv-specific needs to be wired up to the printk routines to do what you have there until printk is fully available  But that's a separate issue, I guess.
> >>>>
> >>>>> Full output: https://gist.github.com/jmontleon/823facff38e28b717f153e38adcfd7fe
> >>>> Does the system boot with your patch?
> >>> Yes. It boots fully. It also booted with trace_printk, it just didn't
> >>> produce a trace, which again I think is expected at that stage. It
> >>> seems as long as there are no 'normal' print attempts, pr_()*, etc. it
> >>> works.
> >>
> >> Hmmm I don't see what in your patch would fix your issue.
> >>
> >> I sent a patch
> >> (https://lore.kernel.org/linux-riscv/20240829165048.49756-1-alexghiti@rivosinc.com/)
> >> that was merged in v6.11-rc7 which fixes something similar, could that
> >> be what fixed your problem?
> >>
> > Hi Alex, Thank you for having a look.
> >
> > Although I narrowed it down to starting with 6.11-rc1, the first time
> > I encountered this was actually with an rc7 Fedora build.
> > http://riscv.rocks/koji/buildinfo?buildID=338433
> >
> > I also did a build from upstream 6.11.1 last night and still see the
> > problem trying to boot with it this morning.
> >
> >> The problem with this early code is that the MMU is not enabled yet and
> >> if your kernel is relocatable, the relocations are set "virtually" and
> >> then without the MMU, a call to an external function results in a
> >> trap...Which happens way too often, I have a few ideas to fix that, I
> >> need to find time to tackle this though.
> >>
> >> I don't know FORTIFY_SOURCE, but I guess this could cause a call to an
> >> external function and then cause the issue I describe above right? We
> >> already remove all instrumentations from this early code so maybe we
> >> should remove FORTIFY_SOURCE too?
>
>
> So I was able to reproduce the issue on qemu by adding a few tweaks, and
> indeed we trap in __warn_printk() on a virtual address but MMU is not
> enabled yet.
>
> The following diff though allows me to pass this failure but I can't get
> much further in the boot since the tweaks I added won't allow it, can
> you give the following a try?
>
> diff --git a/arch/riscv/errata/Makefile b/arch/riscv/errata/Makefile
> index 8a27394851233..4913f3b3f198c 100644
> --- a/arch/riscv/errata/Makefile
> +++ b/arch/riscv/errata/Makefile
> @@ -2,6 +2,10 @@ ifdef CONFIG_RELOCATABLE
>   KBUILD_CFLAGS += -fno-pie
>   endif
>
> +ifdef CONFIG_RISCV_ALTERNATIVE_EARLY
> +KBUILD_CFLAGS += -D__NO_FORTIFY
> +endif
> +
>   obj-$(CONFIG_ERRATA_ANDES) += andes/
>   obj-$(CONFIG_ERRATA_SIFIVE) += sifive/
>   obj-$(CONFIG_ERRATA_THEAD) += thead/

Yes, this worked.

Thank you,
Jason

> Thanks,
>
> Alex
>
>
> >>
> > What you are describing here does not seem unreasonable to me, but I
> > simply do not know enough to say one way or the other.
> > Is there something I can do to collect more information to determine
> > if this is the cause?
> >
> > Thank you again for having a look,
> > Jason
> >
> >> Thanks for digging into this,
> >>
> >> Alex
> >>
> >>
> >>>> I would guess that this line at arch/riscv/errata/thead/errata.c:168:
> >>>>
> >>>> void *oldptr, *altptr;
> >>>>
> >>>> Should be:
> >>>>
> >>>> u8 *oldptr, *altptr;
> >>> I tried this, but it didn't appear to make a difference, I will try
> >>> some more today, though I admit to being out of my element.
> >>>
> >>>
> >>>
> >>>> But maybe __ALT_PTR needs adjustment too? Hmm. I would have expected void * to have a size of SIZE_MAX here, though, so something else might be confusing the check.
> >>>>
> >>>>
> >>>> --
> >>>> Kees Cook
> >>>>
> >>> _______________________________________________
> >>> linux-riscv mailing list
> >>> linux-riscv@lists.infradead.org
> >>> http://lists.infradead.org/mailman/listinfo/linux-riscv
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
>


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-10-03 17:12                   ` Jason Montleon
@ 2024-10-03 21:21                     ` Kees Cook
  2024-10-04 11:37                       ` Alexandre Ghiti
  0 siblings, 1 reply; 14+ messages in thread
From: Kees Cook @ 2024-10-03 21:21 UTC (permalink / raw)
  To: Jason Montleon
  Cc: Alexandre Ghiti, linux-hardening, Linux regressions mailing list,
	linux-riscv

On Thu, Oct 03, 2024 at 01:12:59PM -0400, Jason Montleon wrote:
> On Thu, Oct 3, 2024 at 10:41 AM Alexandre Ghiti <alex@ghiti.fr> wrote:
> > So I was able to reproduce the issue on qemu by adding a few tweaks, and
> > indeed we trap in __warn_printk() on a virtual address but MMU is not
> > enabled yet.
> >
> > The following diff though allows me to pass this failure but I can't get
> > much further in the boot since the tweaks I added won't allow it, can
> > you give the following a try?
> >
> > diff --git a/arch/riscv/errata/Makefile b/arch/riscv/errata/Makefile
> > index 8a27394851233..4913f3b3f198c 100644
> > --- a/arch/riscv/errata/Makefile
> > +++ b/arch/riscv/errata/Makefile
> > @@ -2,6 +2,10 @@ ifdef CONFIG_RELOCATABLE
> >   KBUILD_CFLAGS += -fno-pie
> >   endif
> >
> > +ifdef CONFIG_RISCV_ALTERNATIVE_EARLY
> > +KBUILD_CFLAGS += -D__NO_FORTIFY
> > +endif
> > +
> >   obj-$(CONFIG_ERRATA_ANDES) += andes/
> >   obj-$(CONFIG_ERRATA_SIFIVE) += sifive/
> >   obj-$(CONFIG_ERRATA_THEAD) += thead/
> 
> Yes, this worked.

Thanks for testing!

Yeah, this matches similar fortify disabling in other early boot areas.
Usually it's part of a common header, but setting it via the Makefile
also works. I'll leave it up to the riscv maintainers! :)

-Kees

-- 
Kees Cook

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-10-03 21:21                     ` Kees Cook
@ 2024-10-04 11:37                       ` Alexandre Ghiti
  2024-10-04 17:56                         ` Kees Cook
  0 siblings, 1 reply; 14+ messages in thread
From: Alexandre Ghiti @ 2024-10-04 11:37 UTC (permalink / raw)
  To: Kees Cook, Jason Montleon
  Cc: linux-hardening, Linux regressions mailing list, linux-riscv

Hi Kees, Jason,

On 03/10/2024 23:21, Kees Cook wrote:
> On Thu, Oct 03, 2024 at 01:12:59PM -0400, Jason Montleon wrote:
>> On Thu, Oct 3, 2024 at 10:41 AM Alexandre Ghiti <alex@ghiti.fr> wrote:
>>> So I was able to reproduce the issue on qemu by adding a few tweaks, and
>>> indeed we trap in __warn_printk() on a virtual address but MMU is not
>>> enabled yet.
>>>
>>> The following diff though allows me to pass this failure but I can't get
>>> much further in the boot since the tweaks I added won't allow it, can
>>> you give the following a try?
>>>
>>> diff --git a/arch/riscv/errata/Makefile b/arch/riscv/errata/Makefile
>>> index 8a27394851233..4913f3b3f198c 100644
>>> --- a/arch/riscv/errata/Makefile
>>> +++ b/arch/riscv/errata/Makefile
>>> @@ -2,6 +2,10 @@ ifdef CONFIG_RELOCATABLE
>>>    KBUILD_CFLAGS += -fno-pie
>>>    endif
>>>
>>> +ifdef CONFIG_RISCV_ALTERNATIVE_EARLY
>>> +KBUILD_CFLAGS += -D__NO_FORTIFY
>>> +endif
>>> +
>>>    obj-$(CONFIG_ERRATA_ANDES) += andes/
>>>    obj-$(CONFIG_ERRATA_SIFIVE) += sifive/
>>>    obj-$(CONFIG_ERRATA_THEAD) += thead/
>> Yes, this worked.


Great; thanks!


> Thanks for testing!
>
> Yeah, this matches similar fortify disabling in other early boot areas.
> Usually it's part of a common header, but setting it via the Makefile
> also works. I'll leave it up to the riscv maintainers! :)


I found a few other spots where we did not disable fortify, the easiest 
path I found was indeed to do that in the Makefile.

I have a question though: should we do something about the following 
warnings? Is there something wrong somewhere?

Wanted to write 8 to a 0-sized destination: oldptr at 
arch/riscv/errata/thead/errata.c:185
Wanted to write 28 to a 0-sized destination: oldptr at 
arch/riscv/errata/thead/errata.c:185

Thanks,

Alex


>
> -Kees
>

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled
  2024-10-04 11:37                       ` Alexandre Ghiti
@ 2024-10-04 17:56                         ` Kees Cook
  0 siblings, 0 replies; 14+ messages in thread
From: Kees Cook @ 2024-10-04 17:56 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Jason Montleon, linux-hardening, Linux regressions mailing list,
	linux-riscv

On Fri, Oct 04, 2024 at 01:37:44PM +0200, Alexandre Ghiti wrote:
> I have a question though: should we do something about the following
> warnings? Is there something wrong somewhere?
> 
> Wanted to write 8 to a 0-sized destination: oldptr at arch/riscv/errata/thead/errata.c:185
> Wanted to write 28 to a 0-sized destination: oldptr at arch/riscv/errata/thead/errata.c:185

I think it's a problem for fortify, not for the errata code. I'm
still looking into it -- they are unexpected warnings given the types
involved. I haven't been able to reproduce it in a stand-alone context
yet.

-- 
Kees Cook

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-10-04 17:56 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-21 17:42 [REGRESSION][BISECTED] Cannot boot Lichee Pi 4A with FORTIFY_SOURCE enabled Jason Montleon
2024-09-22  2:42 ` Kees Cook
2024-09-22 20:18   ` Jason Montleon
2024-09-22 22:37     ` Kees Cook
2024-09-24 15:58       ` Jason Montleon
2024-09-24 17:36         ` Kees Cook
2024-09-25 14:32           ` Jason Montleon
2024-10-01 14:28             ` Alexandre Ghiti
2024-10-02 15:14               ` Jason Montleon
2024-10-03 14:41                 ` Alexandre Ghiti
2024-10-03 17:12                   ` Jason Montleon
2024-10-03 21:21                     ` Kees Cook
2024-10-04 11:37                       ` Alexandre Ghiti
2024-10-04 17:56                         ` Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox