OpenSBI Archive on lore.kernel.org
* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
       [not found]               ` <20230712-fancied-aviator-270f51166407@spud>
@ 2023-07-13 22:12                 ` Conor Dooley
  2023-07-13 22:35                   ` Daniel Henrique Barboza
                                     ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Conor Dooley @ 2023-07-13 22:12 UTC (permalink / raw)
  To: opensbi

+CC OpenSBI Mailing list

I've not yet had the chance to bisect this, so adding the OpenSBI folks
to CC in case they might have an idea for what to try.

And a question for you below Daniel.

On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
> On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote:
> > On 7/12/23 18:35, Conor Dooley wrote:
> > > On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote:
> > > 
> > > > It is intentional. Those default marchid/mimpid vals were derived from the current
> > > > QEMU version ID/build and didn't mean much.
> > > > 
> > > > It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when
> > > > using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed
> > > > via command line.
> > > 
> > > Sounds good, thanks. I did just now go and check icicle to see what it
> > > would report & it does not boot. I'll go bisect...
> > 
> > BTW how are you booting the icicle board nowadays? I remember you mentioning about
> > some changes in the FDT being required to boot and whatnot.
> 
> I do direct kernel boots, as the HSS doesn't work anymore, and just lie
> a bit to QEMU about how much DDR we have.
> .PHONY: qemu-icicle
> qemu-icicle:
> 	$(qemu) -M microchip-icicle-kit \
> 		-m 3G -smp 5 \
> 		-kernel $(vmlinux_bin) \
> 		-dtb $(icicle_dtb) \
> 		-initrd $(initramfs) \
> 		-display none -serial null \
> 		-serial stdio \
> 		-D qemu.log -d unimp
> 
> The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU
> it thinks there's 1 GiB at 0x8000_0000 and 1 GiB at 0x10_0000_0000. The
> upstream devicetree (and current FPGA reference design) expects there to
> be 1 GiB at 0x8000_0000 and 1 GiB at 0x10_4000_0000. If I lie to QEMU,
> it thinks there is 1 GiB at 0x8000_0000 and 2 GiB at 0x10_0000_0000, and
> things just work. I prefer doing it this way than having to modify the
> DT, it is a lot easier to explain to people this way.
> 
> I've been meaning to work on the support for the icicle & mpfs in QEMU, but
> it just gets shunted down the priority list. I'd really like if a proper
> boot flow would run in QEMU, which means fixing whatever broke the HSS,
> but I've recently picked up maintainership of dt-binding stuff in Linux,
> so I've unfortunately got even less time to try and work on it. Maybe
> we'll get some new graduate in and I can make them suffer in my stead...
> 
> > If it's not too hard I'll add it in my test scripts to keep it under check. Perhaps
> > we can even add it to QEMU testsuite.
> 
> I don't think it really should be that bad, at least for the direct
> kernel boot, which is what I mainly care about, since I use it fairly
> often for debugging boot stuff in Linux.
> 
> Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit:
> commit aa903cf31391dd505b399627158f1292a6d19896
> Author: Bin Meng <bmeng@tinylab.org>
> Date:   Fri Jun 30 23:36:04 2023 +0800
> 
>     roms/opensbi: Upgrade from v1.2 to v1.3
>     
>     Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images.
> 
> And I see something like:
> qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \
>         -m 3G -smp 5 \
>         -kernel vmlinux.bin \
>         -dtb icicle.dtb \
>         -initrd initramfs.cpio.gz \
>         -display none -serial null \
>         -serial stdio \
>         -D qemu.log -d unimp

> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match

Why am I seeing these warnings? Does the mpfs machine type need to
disable some things? It only supports rv64imafdc per the DT, and
predates things like Zca existing, so emitting warnings does not seem
fair at all to me!

> 
> OpenSBI v1.3
>    ____                    _____ ____ _____
>   / __ \                  / ____|  _ \_   _|
>  | |  | |_ __   ___ _ __ | (___ | |_) || |
>  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
>  | |__| | |_) |  __/ | | |____) | |_) || |_
>   \____/| .__/ \___|_| |_|_____/|___/_____|
>         | |
>         |_|
> 
> init_coldboot: ipi init failed (error -1009)
> 
> Just to note, because we use our own firmware that vendors in OpenSBI
> and compiles only a significantly cut down number of files from it, we
> do not use the fw_dynamic etc flow on our hardware. As a result, we have
> not tested v1.3, nor do we have any immediate plans to change our
> platform firmware to vendor v1.3 either.
> 
> Unless there's something obvious to you, it sounds like I will need to
> go and bisect OpenSBI. That's a job for another day though, given the
> time.
> 
> Cheers,
> Conor.



* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-13 22:12                 ` Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type) Conor Dooley
@ 2023-07-13 22:35                   ` Daniel Henrique Barboza
  2023-07-13 23:04                   ` Conor Dooley
  2023-07-14  4:30                   ` Anup Patel
  2 siblings, 0 replies; 17+ messages in thread
From: Daniel Henrique Barboza @ 2023-07-13 22:35 UTC (permalink / raw)
  To: opensbi



On 7/13/23 19:12, Conor Dooley wrote:
> +CC OpenSBI Mailing list
> 
> I've not yet had the chance to bisect this, so adding the OpenSBI folks
> to CC in case they might have an idea for what to try.
> 
> And a question for you below Daniel.
> 
> On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
>> On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote:
>>> On 7/12/23 18:35, Conor Dooley wrote:
>>>> On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote:
>>>>
>>>>> It is intentional. Those default marchid/mimpid vals were derived from the current
>>>>> QEMU version ID/build and didn't mean much.
>>>>>
>>>>> It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when
>>>>> using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed
>>>>> via command line.
>>>>
>>>> Sounds good, thanks. I did just now go and check icicle to see what it
>>>> would report & it does not boot. I'll go bisect...
>>>
>>> BTW how are you booting the icicle board nowadays? I remember you mentioning about
>>> some changes in the FDT being required to boot and whatnot.
>>
>> I do direct kernel boots, as the HSS doesn't work anymore, and just lie
>> a bit to QEMU about how much DDR we have.
>> .PHONY: qemu-icicle
>> qemu-icicle:
>> 	$(qemu) -M microchip-icicle-kit \
>> 		-m 3G -smp 5 \
>> 		-kernel $(vmlinux_bin) \
>> 		-dtb $(icicle_dtb) \
>> 		-initrd $(initramfs) \
>> 		-display none -serial null \
>> 		-serial stdio \
>> 		-D qemu.log -d unimp
>>
>> The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU
>> it thinks there's 1 GiB at 0x8000_0000 and 1 GiB at 0x10_0000_0000. The
>> upstream devicetree (and current FPGA reference design) expects there to
>> be 1 GiB at 0x8000_0000 and 1 GiB at 0x10_4000_0000. If I lie to QEMU,
>> it thinks there is 1 GiB at 0x8000_0000 and 2 GiB at 0x10_0000_0000, and
>> things just work. I prefer doing it this way than having to modify the
>> DT, it is a lot easier to explain to people this way.
>>
>> I've been meaning to work on the support for the icicle & mpfs in QEMU, but
>> it just gets shunted down the priority list. I'd really like if a proper
>> boot flow would run in QEMU, which means fixing whatever broke the HSS,
>> but I've recently picked up maintainership of dt-binding stuff in Linux,
>> so I've unfortunately got even less time to try and work on it. Maybe
>> we'll get some new graduate in and I can make them suffer in my stead...
>>
>>> If it's not too hard I'll add it in my test scripts to keep it under check. Perhaps
>>> we can even add it to QEMU testsuite.
>>
>> I don't think it really should be that bad, at least for the direct
>> kernel boot, which is what I mainly care about, since I use it fairly
>> often for debugging boot stuff in Linux.
>>
>> Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit:
>> commit aa903cf31391dd505b399627158f1292a6d19896
>> Author: Bin Meng <bmeng@tinylab.org>
>> Date:   Fri Jun 30 23:36:04 2023 +0800
>>
>>      roms/opensbi: Upgrade from v1.2 to v1.3
>>      
>>      Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images.
>>
>> And I see something like:
>> qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \
>>          -m 3G -smp 5 \
>>          -kernel vmlinux.bin \
>>          -dtb icicle.dtb \
>>          -initrd initramfs.cpio.gz \
>>          -display none -serial null \
>>          -serial stdio \
>>          -D qemu.log -d unimp
> 
>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match
> 
> Why am I seeing these warnings? Does the mpfs machine type need to
> disable some things? It only supports rv64imafdc per the DT, and
> predates things like Zca existing, so emitting warnings does not seem
> fair at all to me!

QEMU will disable extensions that are newer than a priv spec version that is set
by the CPU. IIUC the icicle board is running a sifive_u54 CPU by default. That
CPU has a priv spec version 1_10_0. The CPU is also enabling C.

We will enable zca if C is enabled. C and D enabled will also enable zcd. But
then the priv check will disable both because zca and zcd have priv spec 1_12_0.
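
The enable-then-disable sequence described above can be sketched roughly as
follows. This is a simplified Python model, not QEMU's code: the function
name, the version tuples and the table of minimum priv spec versions are
assumptions made for illustration, and only zca/zcd are modelled.

```python
# Simplified model of the behaviour described above; this is not QEMU code.
# The constants and the table layout are illustrative assumptions.
PRIV_1_10_0 = (1, 10, 0)
PRIV_1_12_0 = (1, 12, 0)

# Minimum priv spec each implied extension requires, per the message above.
MIN_PRIV = {"zca": PRIV_1_12_0, "zcd": PRIV_1_12_0}


def resolve_extensions(base, cpu_priv):
    """Return (enabled extensions, warnings) for a CPU with the given priv spec."""
    enabled = set(base)
    # C implies zca; C together with D also implies zcd.
    if "c" in enabled:
        enabled.add("zca")
        if "d" in enabled:
            enabled.add("zcd")
    warnings = []
    # Drop anything newer than the CPU's priv spec, warning once per extension.
    for ext in sorted(enabled):
        if MIN_PRIV.get(ext, PRIV_1_10_0) > cpu_priv:
            enabled.discard(ext)
            warnings.append(
                f"disabling {ext} extension because privilege spec version does not match"
            )
    return enabled, warnings
```

With a base of imafdc and priv spec 1_10_0 the model ends up with exactly
imafdc enabled and two warnings (zca and zcd). With a base that lacks D it
produces only the zca warning, which would be consistent with hart 0 in the
log above if that hart has no D extension.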

This is a side effect of a change that I made a few months ago. Back then we
weren't disabling stuff correctly. The warnings are annoying but benign.
And apparently the sifive_u54 CPU has been inconsistent for some time and
we only noticed just now.

Now, if the icicle board is supposed to have zca and zcd then we have a problem.
We'll need to discuss whether we move the sifive_u54 CPU priv spec to 1_12_0 (I'm not
sure how this will affect other boards that use this CPU) or remove this priv spec
disable code altogether from QEMU.


Thanks,

Daniel



> 
>>
>> OpenSBI v1.3
>>     ____                    _____ ____ _____
>>    / __ \                  / ____|  _ \_   _|
>>   | |  | |_ __   ___ _ __ | (___ | |_) || |
>>   | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
>>   | |__| | |_) |  __/ | | |____) | |_) || |_
>>    \____/| .__/ \___|_| |_|_____/|___/_____|
>>          | |
>>          |_|
>>
>> init_coldboot: ipi init failed (error -1009)
>>
>> Just to note, because we use our own firmware that vendors in OpenSBI
>> and compiles only a significantly cut down number of files from it, we
>> do not use the fw_dynamic etc flow on our hardware. As a result, we have
>> not tested v1.3, nor do we have any immediate plans to change our
>> platform firmware to vendor v1.3 either.
>>
>> Unless there's something obvious to you, it sounds like I will need to
>> go and bisect OpenSBI. That's a job for another day though, given the
>> time.
>>
>> Cheers,
>> Conor.
> 
> 



* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-13 22:12                 ` Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type) Conor Dooley
  2023-07-13 22:35                   ` Daniel Henrique Barboza
@ 2023-07-13 23:04                   ` Conor Dooley
  2023-07-14  4:30                   ` Anup Patel
  2 siblings, 0 replies; 17+ messages in thread
From: Conor Dooley @ 2023-07-13 23:04 UTC (permalink / raw)
  To: opensbi

On Thu, Jul 13, 2023 at 11:12:33PM +0100, Conor Dooley wrote:
> +CC OpenSBI Mailing list
> 
> I've not yet had the chance to bisect this, so adding the OpenSBI folks
> to CC in case they might have an idea for what to try.

NVM this, I bisected it. Logs below.

> And a question for you below Daniel.
> 
> On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
> > On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote:
> > > On 7/12/23 18:35, Conor Dooley wrote:
> > > > On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote:
> > > > 
> > > > > It is intentional. Those default marchid/mimpid vals were derived from the current
> > > > > QEMU version ID/build and didn't mean much.
> > > > > 
> > > > > It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when
> > > > > using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed
> > > > > via command line.
> > > > 
> > > > Sounds good, thanks. I did just now go and check icicle to see what it
> > > > would report & it does not boot. I'll go bisect...
> > > 
> > > BTW how are you booting the icicle board nowadays? I remember you mentioning about
> > > some changes in the FDT being required to boot and whatnot.
> > 
> > I do direct kernel boots, as the HSS doesn't work anymore, and just lie
> > a bit to QEMU about how much DDR we have.
> > .PHONY: qemu-icicle
> > qemu-icicle:
> > 	$(qemu) -M microchip-icicle-kit \
> > 		-m 3G -smp 5 \
> > 		-kernel $(vmlinux_bin) \
> > 		-dtb $(icicle_dtb) \
> > 		-initrd $(initramfs) \
> > 		-display none -serial null \
> > 		-serial stdio \
> > 		-D qemu.log -d unimp
> > 
> > The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU
> > it thinks there's 1 GiB at 0x8000_0000 and 1 GiB at 0x10_0000_0000. The
> > upstream devicetree (and current FPGA reference design) expects there to
> > be 1 GiB at 0x8000_0000 and 1 GiB at 0x10_4000_0000. If I lie to QEMU,
> > it thinks there is 1 GiB at 0x8000_0000 and 2 GiB at 0x10_0000_0000, and
> > things just work. I prefer doing it this way than having to modify the
> > DT, it is a lot easier to explain to people this way.
> > 
> > I've been meaning to work on the support for the icicle & mpfs in QEMU, but
> > it just gets shunted down the priority list. I'd really like if a proper
> > boot flow would run in QEMU, which means fixing whatever broke the HSS,
> > but I've recently picked up maintainership of dt-binding stuff in Linux,
> > so I've unfortunately got even less time to try and work on it. Maybe
> > we'll get some new graduate in and I can make them suffer in my stead...
> > 
> > > If it's not too hard I'll add it in my test scripts to keep it under check. Perhaps
> > > we can even add it to QEMU testsuite.
> > 
> > I don't think it really should be that bad, at least for the direct
> > kernel boot, which is what I mainly care about, since I use it fairly
> > often for debugging boot stuff in Linux.
> > 
> > Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit:
> > commit aa903cf31391dd505b399627158f1292a6d19896
> > Author: Bin Meng <bmeng@tinylab.org>
> > Date:   Fri Jun 30 23:36:04 2023 +0800
> > 
> >     roms/opensbi: Upgrade from v1.2 to v1.3
> >     
> >     Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images.
> > 
> > And I see something like:
> > qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \
> >         -m 3G -smp 5 \
> >         -kernel vmlinux.bin \
> >         -dtb icicle.dtb \
> >         -initrd initramfs.cpio.gz \
> >         -display none -serial null \
> >         -serial stdio \
> >         -D qemu.log -d unimp
> 
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match
> 
> Why am I seeing these warnings? Does the mpfs machine type need to
> disable some things? It only supports rv64imafdc per the DT, and
> predates things like Zca existing, so emitting warnings does not seem
> fair at all to me!

> > OpenSBI v1.3
> >    ____                    _____ ____ _____
> >   / __ \                  / ____|  _ \_   _|
> >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> >  | |__| | |_) |  __/ | | |____) | |_) || |_
> >   \____/| .__/ \___|_| |_|_____/|___/_____|
> >         | |
> >         |_|
> > 
> > init_coldboot: ipi init failed (error -1009)

This can be reproduced using OpenSBI built with `make PLATFORM=generic`
and the QEMU incantation quoted above, with a -bios argument added.
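
As a rough illustration of that reproduction, the sketch below assembles the
same command line with an optional -bios argument. The file names are the
ones from the command quoted above; the helper name is hypothetical, and the
firmware path is only an assumption about where a `make PLATFORM=generic`
build of OpenSBI leaves its output.

```python
# Sketch only: file names come from the command quoted earlier in the thread;
# the firmware path used below is an assumed OpenSBI build output location.
import shlex


def qemu_icicle_args(bios=None):
    """Build the QEMU argument list, optionally overriding the bundled firmware."""
    args = [
        "qemu-system-riscv64", "-M", "microchip-icicle-kit",
        "-m", "3G", "-smp", "5",
        "-kernel", "vmlinux.bin",
        "-dtb", "icicle.dtb",
        "-initrd", "initramfs.cpio.gz",
        "-display", "none", "-serial", "null",
        "-serial", "stdio",
        "-D", "qemu.log", "-d", "unimp",
    ]
    if bios is not None:
        # Point QEMU at a locally built OpenSBI instead of the one it ships.
        args += ["-bios", bios]
    return args


if __name__ == "__main__":
    # Assumed path; adjust to wherever the OpenSBI build placed its firmware.
    fw = "opensbi/build/platform/generic/firmware/fw_dynamic.bin"
    print(shlex.join(qemu_icicle_args(fw)))
```

Printing the command rather than running it keeps the sketch usable for
pasting into a bisect script or a Makefile rule.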

Thanks,
Conor.

acbd8fce9e5d92f07d344388a3b046f1722ce072 is the first bad commit
commit acbd8fce9e5d92f07d344388a3b046f1722ce072
Author: Anup Patel <apatel@ventanamicro.com>
Date:   Wed Apr 19 21:23:53 2023 +0530

    lib: utils/ipi: Use scratch space to save per-HART MSWI pointer
    
    Instead of using a global array indexed by hartid, we should use
    scratch space to save per-HART MSWI pointer.
    
    Signed-off-by: Anup Patel <apatel@ventanamicro.com>
    Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

 lib/utils/ipi/aclint_mswi.c | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)
git bisect start
# status: waiting for both good and bad commits
# bad: [2552799a1df30a3dcd2321a8b75d61d06f5fb9fc] include: Bump-up version to 1.3
git bisect bad 2552799a1df30a3dcd2321a8b75d61d06f5fb9fc
# status: waiting for good commit(s), bad commit known
# good: [6b5188ca14e59ce7bf71afe4e7d3d557c3d31bf8] include: Bump-up version to 1.2
git bisect good 6b5188ca14e59ce7bf71afe4e7d3d557c3d31bf8
# good: [6b5188ca14e59ce7bf71afe4e7d3d557c3d31bf8] include: Bump-up version to 1.2
git bisect good 6b5188ca14e59ce7bf71afe4e7d3d557c3d31bf8
# good: [908be1b85c8ff0695ea226fbbf0ff24a779cdece] gpio/starfive: add gpio driver and support gpio reset
git bisect good 908be1b85c8ff0695ea226fbbf0ff24a779cdece
# good: [6bc02dede86c47f87e65293b7099e9caf3b22c29] lib: sbi: Simplify sbi_ipi_process remove goto
git bisect good 6bc02dede86c47f87e65293b7099e9caf3b22c29
# good: [bbff53fe3b6cdd3c9bc084d489640d7ee2a3f831] lib: sbi_pmu: Use heap for per-HART PMU state
git bisect good bbff53fe3b6cdd3c9bc084d489640d7ee2a3f831
# bad: [f0516beae068ffce0d5a79f09a96904a661a25ba] lib: utils/timer: Use scratch space to save per-HART MTIMER pointer
git bisect bad f0516beae068ffce0d5a79f09a96904a661a25ba
# good: [5a8cfcdf19d98b8dc5dd5a087a2eceb7f5b185fb] lib: utils/ipi: Use heap in ACLINT MSWI driver
git bisect good 5a8cfcdf19d98b8dc5dd5a087a2eceb7f5b185fb
# good: [7e5636ac3788451991a65791c5adcc7798dcc22a] lib: utils/timer: Use heap in ACLINT MTIMER driver
git bisect good 7e5636ac3788451991a65791c5adcc7798dcc22a
# bad: [acbd8fce9e5d92f07d344388a3b046f1722ce072] lib: utils/ipi: Use scratch space to save per-HART MSWI pointer
git bisect bad acbd8fce9e5d92f07d344388a3b046f1722ce072
# good: [3c1c972cb69d670ddc309391c4db76f1f19fd77e] lib: utils/fdt: Use heap in FDT domain parsing
git bisect good 3c1c972cb69d670ddc309391c4db76f1f19fd77e
# first bad commit: [acbd8fce9e5d92f07d344388a3b046f1722ce072] lib: utils/ipi: Use scratch space to save per-HART MSWI pointer


* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-13 22:12                 ` Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type) Conor Dooley
  2023-07-13 22:35                   ` Daniel Henrique Barboza
  2023-07-13 23:04                   ` Conor Dooley
@ 2023-07-14  4:30                   ` Anup Patel
  2023-07-14 10:19                     ` Conor Dooley
  2 siblings, 1 reply; 17+ messages in thread
From: Anup Patel @ 2023-07-14  4:30 UTC (permalink / raw)
  To: opensbi

On Fri, Jul 14, 2023 at 3:43 AM Conor Dooley <conor@kernel.org> wrote:
>
> +CC OpenSBI Mailing list
>
> I've not yet had the chance to bisect this, so adding the OpenSBI folks
> to CC in case they might have an idea for what to try.
>
> And a question for you below Daniel.
>
> On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
> > On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote:
> > > On 7/12/23 18:35, Conor Dooley wrote:
> > > > On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote:
> > > >
> > > > > It is intentional. Those default marchid/mimpid vals were derived from the current
> > > > > QEMU version ID/build and didn't mean much.
> > > > >
> > > > > It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when
> > > > > using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed
> > > > > via command line.
> > > >
> > > > Sounds good, thanks. I did just now go and check icicle to see what it
> > > > would report & it does not boot. I'll go bisect...
> > >
> > > BTW how are you booting the icicle board nowadays? I remember you mentioning about
> > > some changes in the FDT being required to boot and whatnot.
> >
> > I do direct kernel boots, as the HSS doesn't work anymore, and just lie
> > a bit to QEMU about how much DDR we have.
> > .PHONY: qemu-icicle
> > qemu-icicle:
> >       $(qemu) -M microchip-icicle-kit \
> >               -m 3G -smp 5 \
> >               -kernel $(vmlinux_bin) \
> >               -dtb $(icicle_dtb) \
> >               -initrd $(initramfs) \
> >               -display none -serial null \
> >               -serial stdio \
> >               -D qemu.log -d unimp
> >
> > The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU
> > it thinks there's 1 GiB at 0x8000_0000 and 1 GiB at 0x10_0000_0000. The
> > upstream devicetree (and current FPGA reference design) expects there to
> > be 1 GiB at 0x8000_0000 and 1 GiB at 0x10_4000_0000. If I lie to QEMU,
> > it thinks there is 1 GiB at 0x8000_0000 and 2 GiB at 0x10_0000_0000, and
> > things just work. I prefer doing it this way than having to modify the
> > DT, it is a lot easier to explain to people this way.
> >
> > I've been meaning to work on the support for the icicle & mpfs in QEMU, but
> > it just gets shunted down the priority list. I'd really like if a proper
> > boot flow would run in QEMU, which means fixing whatever broke the HSS,
> > but I've recently picked up maintainership of dt-binding stuff in Linux,
> > so I've unfortunately got even less time to try and work on it. Maybe
> > we'll get some new graduate in and I can make them suffer in my stead...
> >
> > > If it's not too hard I'll add it in my test scripts to keep it under check. Perhaps
> > > we can even add it to QEMU testsuite.
> >
> > I don't think it really should be that bad, at least for the direct
> > kernel boot, which is what I mainly care about, since I use it fairly
> > often for debugging boot stuff in Linux.
> >
> > Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit:
> > commit aa903cf31391dd505b399627158f1292a6d19896
> > Author: Bin Meng <bmeng@tinylab.org>
> > Date:   Fri Jun 30 23:36:04 2023 +0800
> >
> >     roms/opensbi: Upgrade from v1.2 to v1.3
> >
> >     Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images.
> >
> > And I see something like:
> > qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \
> >         -m 3G -smp 5 \
> >         -kernel vmlinux.bin \
> >         -dtb icicle.dtb \
> >         -initrd initramfs.cpio.gz \
> >         -display none -serial null \
> >         -serial stdio \
> >         -D qemu.log -d unimp
>
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match
>
> Why am I seeing these warnings? Does the mpfs machine type need to
> disable some things? It only supports rv64imafdc per the DT, and
> predates things like Zca existing, so emitting warnings does not seem
> fair at all to me!
>
> >
> > OpenSBI v1.3
> >    ____                    _____ ____ _____
> >   / __ \                  / ____|  _ \_   _|
> >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> >  | |__| | |_) |  __/ | | |____) | |_) || |_
> >   \____/| .__/ \___|_| |_|_____/|___/_____|
> >         | |
> >         |_|
> >
> > init_coldboot: ipi init failed (error -1009)
> >
> > Just to note, because we use our own firmware that vendors in OpenSBI
> > and compiles only a significantly cut down number of files from it, we
> > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > not tested v1.3, nor do we have any immediate plans to change our
> > platform firmware to vendor v1.3 either.
> >
> > Unless there's something obvious to you, it sounds like I will need to
> > go and bisect OpenSBI. That's a job for another day though, given the
> > time.
> >

The real issue is that some CPU/HART DT nodes are marked as disabled in the
DT passed to OpenSBI 1.3.

This issue does not exist in any of the DTs generated by QEMU but some
of the DTs in the kernel (such as microchip and SiFive board DTs) have
the E-core disabled.

I had discovered this issue in a totally different context after the OpenSBI 1.3
release happened. This issue is already fixed in the latest OpenSBI by the
following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
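
To make the failure mode concrete, here is a toy Python model of how a
per-HART scratch lookup can fail when a hart is disabled in the DT. It is
inferred only from the description above and the commit titles; every name
in it is hypothetical, it is not OpenSBI's code, and the actual fix may
differ in detail.

```python
# Toy model of the failure mode; all names are hypothetical, not OpenSBI's API.
class LookupFailed(Exception):
    pass


def make_scratch_table(dt_cpus):
    """Scratch space exists only for harts whose DT node is not disabled."""
    return {hartid: {} for hartid, status in dt_cpus.items() if status != "disabled"}


def mswi_cold_init(scratch_table, first_hartid, hart_count, skip_missing):
    """Record the MSWI device in the scratch space of each hart it covers."""
    mswi = {"first_hartid": first_hartid, "hart_count": hart_count}
    for hartid in range(first_hartid, first_hartid + hart_count):
        scratch = scratch_table.get(hartid)
        if scratch is None:
            if skip_missing:
                continue  # tolerate harts that are disabled in the DT
            raise LookupFailed(f"no scratch space for hart {hartid}")
        scratch["mswi"] = mswi
    return mswi
```

With hart 0 marked disabled and harts 1 to 4 enabled, the strict variant
(skip_missing=False) fails on the first lookup, while the tolerant variant
initialises the four enabled harts and carries on.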

I always assumed that Microchip hss.bin is the preferred BIOS for the
QEMU microchip-icicle-kit machine but I guess that's not true.

At this point, you can either:
1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
    microchip-icicle-kit machine with OpenSBI 1.3

Regards,
Anup



* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-14  4:30                   ` Anup Patel
@ 2023-07-14 10:19                     ` Conor Dooley
  2023-07-14 12:28                       ` Conor Dooley
  2023-07-14 12:35                       ` Anup Patel
  0 siblings, 2 replies; 17+ messages in thread
From: Conor Dooley @ 2023-07-14 10:19 UTC (permalink / raw)
  To: opensbi

On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:

> > > OpenSBI v1.3
> > >    ____                    _____ ____ _____
> > >   / __ \                  / ____|  _ \_   _|
> > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > >         | |
> > >         |_|
> > >
> > > init_coldboot: ipi init failed (error -1009)
> > >
> > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > and compiles only a significantly cut down number of files from it, we
> > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > not tested v1.3, nor do we have any immediate plans to change our
> > > platform firmware to vendor v1.3 either.
> > >
> > > I unless there's something obvious to you, it sounds like I will need to
> > > go and bisect OpenSBI. That's a job for another day though, given the
> > > time.
> > >
> 
> The real issue is some CPU/HART DT nodes marked as disabled in the
> DT passed to OpenSBI 1.3.
> 
> This issue does not exist in any of the DTs generated by QEMU but some
> of the DTs in the kernel (such as microchip and SiFive board DTs) have
> the E-core disabled.
> 
> I had discovered this issue in a totally different context after the OpenSBI 1.3
> release happened. This issue is already fixed in the latest OpenSBI by the
> following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> Fix sbi_hartid_to_scratch() usage in ACLINT drivers").

Great, thanks Anup! I thought I had tested tip-of-tree too, but
obviously not.

> I always assumed that Microchip hss.bin is the preferred BIOS for the
> QEMU microchip-icicle-kit machine but I guess that's not true.

Unfortunately the HSS has not worked in QEMU for a long time, and while
I would love to fix it, I am pretty stretched for spare time to begin
with.
I usually just do direct kernel boots, which use the OpenSBI that comes
with QEMU, as I am sure you already know :)

> At this point, you can either:
> 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine

> 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
>     microchip-icicle-kit machine with OpenSBI 1.3

Will OpenSBI disable it? If not, I think option 2) needs to be to remove
the DT node. I'll just use tip-of-tree myself & up to the

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-14 10:19                     ` Conor Dooley
@ 2023-07-14 12:28                       ` Conor Dooley
  2023-07-15  9:12                         ` Atish Patra
  2023-07-14 12:35                       ` Anup Patel
  1 sibling, 1 reply; 17+ messages in thread
From: Conor Dooley @ 2023-07-14 12:28 UTC (permalink / raw)
  To: opensbi

On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> 
> > > > OpenSBI v1.3
> > > >    ____                    _____ ____ _____
> > > >   / __ \                  / ____|  _ \_   _|
> > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > >         | |
> > > >         |_|
> > > >
> > > > init_coldboot: ipi init failed (error -1009)
> > > >
> > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > and compiles only a significantly cut down number of files from it, we
> > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > platform firmware to vendor v1.3 either.
> > > >
> > > > I unless there's something obvious to you, it sounds like I will need to
> > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > time.
> > > >
> > 
> > The real issue is some CPU/HART DT nodes marked as disabled in the
> > DT passed to OpenSBI 1.3.
> > 
> > This issue does not exist in any of the DTs generated by QEMU but some
> > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > the E-core disabled.
> > 
> > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > release happened. This issue is already fixed in the latest OpenSBI by the
> > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> 
> Great, thanks Anup! I thought I had tested tip-of-tree too, but
> obviously not.
> 
> > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > QEMU microchip-icicle-kit machine but I guess that's not true.
> 
> Unfortunately the HSS has not worked in QEMU for a long time, and while
> I would love to fix it, but am pretty stretched for spare time to begin
> with.
> I usually just do direct kernel boots, which use the OpenSBI that comes
> with QEMU, as I am sure you already know :)
> 
> > At this point, you can either:
> > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine

I forgot to reply to this point, wondering what should be done with
QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
of whether I can go and build a fixed version of OpenSBI.

> > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> >     microchip-icicle-kit machine with OpenSBI 1.3
> 
> Will OpenSBI disable it? If not, I think option 2) needs to be remove
> the DT node. I'll just use tip-of-tree myself & up to the 

Clearly didn't finish this comment. It was meant to say "up to the QEMU
maintainers what they want to do on the QEMU side of things".

Thanks,
Conor.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-14 10:19                     ` Conor Dooley
  2023-07-14 12:28                       ` Conor Dooley
@ 2023-07-14 12:35                       ` Anup Patel
  1 sibling, 0 replies; 17+ messages in thread
From: Anup Patel @ 2023-07-14 12:35 UTC (permalink / raw)
  To: opensbi

On Fri, Jul 14, 2023 at 3:50 PM Conor Dooley <conor@kernel.org> wrote:
>
> On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
>
> > > > OpenSBI v1.3
> > > >    ____                    _____ ____ _____
> > > >   / __ \                  / ____|  _ \_   _|
> > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > >         | |
> > > >         |_|
> > > >
> > > > init_coldboot: ipi init failed (error -1009)
> > > >
> > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > and compiles only a significantly cut down number of files from it, we
> > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > platform firmware to vendor v1.3 either.
> > > >
> > > > I unless there's something obvious to you, it sounds like I will need to
> > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > time.
> > > >
> >
> > The real issue is some CPU/HART DT nodes marked as disabled in the
> > DT passed to OpenSBI 1.3.
> >
> > This issue does not exist in any of the DTs generated by QEMU but some
> > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > the E-core disabled.
> >
> > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > release happened. This issue is already fixed in the latest OpenSBI by the
> > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
>
> Great, thanks Anup! I thought I had tested tip-of-tree too, but
> obviously not.
>
> > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > QEMU microchip-icicle-kit machine but I guess that's not true.
>
> Unfortunately the HSS has not worked in QEMU for a long time, and while
> I would love to fix it, but am pretty stretched for spare time to begin
> with.
> I usually just do direct kernel boots, which use the OpenSBI that comes
> with QEMU, as I am sure you already know :)
>
> > At this point, you can either:
> > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
>
> > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> >     microchip-icicle-kit machine with OpenSBI 1.3
>
> Will OpenSBI disable it? If not, I think option 2) needs to be remove
> the DT node. I'll just use tip-of-tree myself & up to the

Currently, the FDT fixup code in OpenSBI will disable any CPU DT node
that satisfies either of the following:
1) CPU is not assigned to the current domain
2) CPU does not have "mmu-type" DT property
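
The two conditions above can be sketched as a small predicate. This is
only an illustrative model in Python, not OpenSBI code: the CpuNode
class, the function names, and the idea that the E-core node is the one
without "mmu-type" are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CpuNode:
    """Hypothetical stand-in for a CPU node in the device tree."""
    hartid: int
    properties: dict = field(default_factory=dict)


def should_disable(cpu, domain_hartids):
    """Return True if either condition from the list above holds."""
    not_in_domain = cpu.hartid not in domain_hartids
    has_no_mmu = "mmu-type" not in cpu.properties
    return not_in_domain or has_no_mmu


def fixup_cpus(cpus, domain_hartids):
    """Mark every matching CPU node as disabled, in place."""
    for cpu in cpus:
        if should_disable(cpu, domain_hartids):
            cpu.properties["status"] = "disabled"
    return cpus


# Assumed layout: hart 0 is an E-core without "mmu-type",
# harts 1-4 are application cores that declare one.
cpus = [CpuNode(0)] + [CpuNode(h, {"mmu-type": "riscv,sv39"}) for h in range(1, 5)]
fixup_cpus(cpus, domain_hartids={0, 1, 2, 3, 4})
disabled = [c.hartid for c in cpus if c.properties.get("status") == "disabled"]
print(disabled)  # prints [0]
```

In this model only the hart without "mmu-type" ends up disabled, even
though all five harts belong to the domain.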

Regards,
Anup


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-14 12:28                       ` Conor Dooley
@ 2023-07-15  9:12                         ` Atish Patra
  2023-07-19  1:32                           ` Alistair Francis
  0 siblings, 1 reply; 17+ messages in thread
From: Atish Patra @ 2023-07-15  9:12 UTC (permalink / raw)
  To: opensbi

On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
>
> On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> >
> > > > > OpenSBI v1.3
> > > > >    ____                    _____ ____ _____
> > > > >   / __ \                  / ____|  _ \_   _|
> > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > >         | |
> > > > >         |_|
> > > > >
> > > > > init_coldboot: ipi init failed (error -1009)
> > > > >
> > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > and compiles only a significantly cut down number of files from it, we
> > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > platform firmware to vendor v1.3 either.
> > > > >
> > > > > I unless there's something obvious to you, it sounds like I will need to
> > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > time.
> > > > >
> > >
> > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > DT passed to OpenSBI 1.3.
> > >
> > > This issue does not exist in any of the DTs generated by QEMU but some
> > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > the E-core disabled.
> > >
> > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> >
> > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > obviously not.
> >
> > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > QEMU microchip-icicle-kit machine but I guess that's not true.
> >
> > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > I would love to fix it, but am pretty stretched for spare time to begin
> > with.
> > I usually just do direct kernel boots, which use the OpenSBI that comes
> > with QEMU, as I am sure you already know :)
> >
> > > At this point, you can either:
> > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
>
> I forgot to reply to this point, wondering what should be done with
> QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> of whether I can go and build a fixed version of OpenSBI.
>
FYI: The no-map fix went into OpenSBI v1.3. Without the upgrade, anyone
using a recent kernel (> v6.4) may hit those random linear-map-related
issues (in the hibernation or EFI boot paths).

There are three possible scenarios:

1. Upgrade to OpenSBI v1.3: users of the microchip-icicle-kit or
sifive fu540 machines may hit this issue if the device tree has a
disabled hart (the E-core).
2. Do not upgrade and stay on OpenSBI v1.2: anyone using hibernation or
UEFI may have issues [1].
3. Include a non-release version of OpenSBI with the fix in QEMU, as an
exception.

#3 probably deviates from policy and sets a bad precedent, so I am not
advocating for it ;)
For both #1 & #2, the solution would be to pass the latest OpenSBI via
the -bios argument instead of using the stock one.
I could be wrong, but my guess is that more users would be affected by
#2 than by #1.

[1] https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaishuai@tinylab.org/
> > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > >     microchip-icicle-kit machine with OpenSBI 1.3
> >
> > Will OpenSBI disable it? If not, I think option 2) needs to be remove
> > the DT node. I'll just use tip-of-tree myself & up to the
>
> Clearly didn't finish this comment. It was meant to say "up to the QEMU
> maintainers what they want to do on the QEMU side of things".
>
> Thanks,
> Conor.



-- 
Regards,
Atish


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-15  9:12                         ` Atish Patra
@ 2023-07-19  1:32                           ` Alistair Francis
  2023-07-19  5:39                             ` Anup Patel
  2023-07-19  7:07                             ` Conor Dooley
  0 siblings, 2 replies; 17+ messages in thread
From: Alistair Francis @ 2023-07-19  1:32 UTC (permalink / raw)
  To: opensbi

On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
>
> On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> >
> > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > >
> > > > > > OpenSBI v1.3
> > > > > >    ____                    _____ ____ _____
> > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > >         | |
> > > > > >         |_|
> > > > > >
> > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > >
> > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > platform firmware to vendor v1.3 either.
> > > > > >
> > > > > > I unless there's something obvious to you, it sounds like I will need to
> > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > time.
> > > > > >
> > > >
> > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > DT passed to OpenSBI 1.3.
> > > >
> > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > the E-core disabled.
> > > >
> > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > >
> > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > obviously not.
> > >
> > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > >
> > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > I would love to fix it, but am pretty stretched for spare time to begin
> > > with.
> > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > with QEMU, as I am sure you already know :)
> > >
> > > > At this point, you can either:
> > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> >
> > I forgot to reply to this point, wondering what should be done with
> > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > of whether I can go and build a fixed version of OpenSBI.
> >
> FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> user using the latest kernel (> v6.4)
> may hit those random linear map related issues (in hibernation or EFI
> booting path).
>
> There are three possible scenarios:
>
> 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> or sifive fu540 machine users
> may hit this issue if the device tree has the disabled hart (e core).
> 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> have issues [1]
> 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
>
> #3 probably deviates from policy and sets a bad precedent. So I am not
> advocating for it though ;)
> For both #1 & #2, the solution would be to use the latest OpenSBI in
> -bios argument instead of the stock one.
> I could be wrong but my guess is the number of users facing #2 would
> be higher than #1.

Thanks for that info Atish!

We are stuck in a bad situation.

The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
do you think you could do that?

Otherwise I think we should stick with OpenSBI 1.3. Considering that
it fixes UEFI boot issues for the virt board (which would be the most
used), it seems like the best call to make. People using the other
boards are unfortunately stuck building their own OpenSBI.

If there is no OpenSBI 1.3.1 release we should add something to the
release notes. @Conor Dooley are you able to give a clear sentence on
how the boot fails?

Alistair

>
> [1] https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaishuai@tinylab.org/
> > > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > > >     microchip-icicle-kit machine with OpenSBI 1.3
> > >
> > > Will OpenSBI disable it? If not, I think option 2) needs to be remove
> > > the DT node. I'll just use tip-of-tree myself & up to the
> >
> > Clearly didn't finish this comment. It was meant to say "up to the QEMU
> > maintainers what they want to do on the QEMU side of things".
> >
> > Thanks,
> > Conor.
>
>
>
> --
> Regards,
> Atish
>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-19  1:32                           ` Alistair Francis
@ 2023-07-19  5:39                             ` Anup Patel
  2023-07-19  9:53                               ` Alistair Francis
  2023-07-19  7:07                             ` Conor Dooley
  1 sibling, 1 reply; 17+ messages in thread
From: Anup Patel @ 2023-07-19  5:39 UTC (permalink / raw)
  To: opensbi

On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> >
> > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > >
> > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > >
> > > > > > > OpenSBI v1.3
> > > > > > >    ____                    _____ ____ _____
> > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > >         | |
> > > > > > >         |_|
> > > > > > >
> > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > >
> > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > platform firmware to vendor v1.3 either.
> > > > > > >
> > > > > > > I unless there's something obvious to you, it sounds like I will need to
> > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > time.
> > > > > > >
> > > > >
> > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > DT passed to OpenSBI 1.3.
> > > > >
> > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > the E-core disabled.
> > > > >
> > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > >
> > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > obviously not.
> > > >
> > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > >
> > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > I would love to fix it, but am pretty stretched for spare time to begin
> > > > with.
> > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > with QEMU, as I am sure you already know :)
> > > >
> > > > > At this point, you can either:
> > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > >
> > > I forgot to reply to this point, wondering what should be done with
> > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > of whether I can go and build a fixed version of OpenSBI.
> > >
> > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > user using the latest kernel (> v6.4)
> > may hit those random linear map related issues (in hibernation or EFI
> > booting path).
> >
> > There are three possible scenarios:
> >
> > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > or sifive fu540 machine users
> > may hit this issue if the device tree has the disabled hart (e core).
> > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > have issues [1]
> > 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
> >
> > #3 probably deviates from policy and sets a bad precedent. So I am not
> > advocating for it though ;)
> > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > -bios argument instead of the stock one.
> > I could be wrong but my guess is the number of users facing #2 would
> > be higher than #1.
>
> Thanks for that info Atish!
>
> We are stuck in a bad situation.
>
> The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> do you think you could do that?

OpenSBI has a major number and a minor number in its version, but it
does not have a release/patch number, so it would be best to treat
OpenSBI vX.Y.Z as bug fixes on top of OpenSBI vX.Y. In other words,
supervisor software won't be able to differentiate between OpenSBI
vX.Y.Z and OpenSBI vX.Y using sbi_get_impl_version().
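
A rough sketch of why the two are indistinguishable, assuming the
implementation version is packed as the major number in the upper 16
bits and the minor number in the lower 16 bits. That packing and the
helper name opensbi_impl_version are assumptions for illustration; they
are not spelled out in this thread.

```python
def opensbi_impl_version(major, minor, patch=0):
    """Model of the value reported through sbi_get_impl_version().

    Assumed encoding: (major << 16) | minor. The patch argument is
    accepted but deliberately unused, because the encoding has no
    field to carry it.
    """
    return (major << 16) | minor


v1_3 = opensbi_impl_version(1, 3)
v1_3_1 = opensbi_impl_version(1, 3, patch=1)

# Both releases report the same value under this encoding.
print(hex(v1_3), hex(v1_3_1), v1_3 == v1_3_1)  # prints 0x10003 0x10003 True
```

Under that assumption, supervisor software that needs to know whether
the fix is present cannot rely on the reported version alone.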

There are only three commits between OpenSBI v1.3 and the ACLINT fix,
so as a one-off case I will go ahead and create OpenSBI v1.3.1
containing only four commits on top of OpenSBI v1.3.

Does this sound okay?

>
> Otherwise I think we should stick with OpenSBI 1.3. Considering that
> it fixes UEFI boot issues for the virt board (which would be the most
> used) it seems like a best call to make. People using the other boards
> are unfortunately stuck building their own OpenSBI release.
>
> If there is no OpenSBI 1.3.1 release we should add something to the
> release notes. @Conor Dooley are you able to give a clear sentence on
> how the boot fails?
>
> Alistair
>
> >
> > [1] https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaishuai@tinylab.org/
> > > > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > > > >     microchip-icicle-kit machine with OpenSBI 1.3
> > > >
> > > > Will OpenSBI disable it? If not, I think option 2) needs to be remove
> > > > the DT node. I'll just use tip-of-tree myself & up to the
> > >
> > > Clearly didn't finish this comment. It was meant to say "up to the QEMU
> > > maintainers what they want to do on the QEMU side of things".
> > >
> > > Thanks,
> > > Conor.
> >
> >
> >
> > --
> > Regards,
> > Atish
> >

Regards,
Anup


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-19  1:32                           ` Alistair Francis
  2023-07-19  5:39                             ` Anup Patel
@ 2023-07-19  7:07                             ` Conor Dooley
  1 sibling, 0 replies; 17+ messages in thread
From: Conor Dooley @ 2023-07-19  7:07 UTC (permalink / raw)
  To: opensbi

On Wed, Jul 19, 2023 at 11:32:55AM +1000, Alistair Francis wrote:

> If there is no OpenSBI 1.3.1 release we should add something to the
> release notes. @Conor Dooley are you able to give a clear sentence on
> how the boot fails?

Uhh, I'll give it a shot, but hopefully it is not required :)

In version v1.3, OpenSBI's aclint drivers fail to initialise if they
encounter a disabled CPU node in the devicetree. Attempting to boot
using, for example, the Linux kernel's PolarFire SoC or Freedom U540
devicetrees, will fail with the error:
"init_coldboot: ipi init failed (error -1009)"
Please see OpenSBI commit c6a3573 ("lib: utils: Fix sbi_hartid_to_scratch()
usage in ACLINT drivers")
<https://github.com/riscv-software-src/opensbi/commit/c6a35733b74aeff612398f274ed19a74f81d1f37>
for the fix.
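
The failure mode described above can be modelled roughly as follows.
This is a simplified Python sketch, not the OpenSBI source: it assumes
that a disabled hart has no scratch area and that the fix amounts to
skipping such harts instead of returning an error, which the thread does
not spell out. The names cold_init, SBI_OK and SBI_ERR are hypothetical,
and SBI_ERR is a placeholder rather than a real OpenSBI error code.

```python
SBI_OK = 0
SBI_ERR = -1  # placeholder value, not the real OpenSBI error code


def cold_init(hartids, scratch_table, skip_missing):
    """Model a driver's cold-boot loop over the harts it serves.

    scratch_table maps hart id -> scratch area; disabled harts are
    assumed to have no entry. With skip_missing=False the loop bails
    out on the first such hart (the v1.3-style failure); with
    skip_missing=True it moves on to the next hart.
    """
    initialised = []
    for hartid in hartids:
        scratch = scratch_table.get(hartid)
        if scratch is None:
            if skip_missing:
                continue
            return SBI_ERR, initialised
        initialised.append(hartid)
    return SBI_OK, initialised


# Assumed layout: hart 0 (the E-core) is disabled, harts 1-4 are enabled.
scratch_table = {h: object() for h in range(1, 5)}

failing = cold_init(range(5), scratch_table, skip_missing=False)
fixed = cold_init(range(5), scratch_table, skip_missing=True)
print(failing)  # prints (-1, [])
print(fixed)    # prints (0, [1, 2, 3, 4])
```

In this model a single disabled hart at the start of the list is enough
to abort initialisation for every other hart, which matches the
all-or-nothing boot failure seen here.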


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-19  5:39                             ` Anup Patel
@ 2023-07-19  9:53                               ` Alistair Francis
  2023-07-19 15:21                                 ` Anup Patel
  0 siblings, 1 reply; 17+ messages in thread
From: Alistair Francis @ 2023-07-19  9:53 UTC (permalink / raw)
  To: opensbi

On Wed, Jul 19, 2023 at 3:39 PM Anup Patel <anup@brainfault.org> wrote:
>
> On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
> >
> > On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> > >
> > > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > > >
> > > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > >
> > > > > > > > OpenSBI v1.3
> > > > > > > >    ____                    _____ ____ _____
> > > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > > >         | |
> > > > > > > >         |_|
> > > > > > > >
> > > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > > >
> > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > >
> > > > > > > > I unless there's something obvious to you, it sounds like I will need to
> > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > time.
> > > > > > > >
> > > > > >
> > > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > > DT passed to OpenSBI 1.3.
> > > > > >
> > > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > > the E-core disabled.
> > > > > >
> > > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > > >
> > > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > > obviously not.
> > > > >
> > > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > > >
> > > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > > I would love to fix it, but am pretty stretched for spare time to begin
> > > > > with.
> > > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > > with QEMU, as I am sure you already know :)
> > > > >
> > > > > > At this point, you can either:
> > > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > > >
> > > > I forgot to reply to this point, wondering what should be done with
> > > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > > of whether I can go and build a fixed version of OpenSBI.
> > > >
> > > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > > user using the latest kernel (> v6.4)
> > > may hit those random linear map related issues (in hibernation or EFI
> > > booting path).
> > >
> > > There are three possible scenarios:
> > >
> > > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > > or sifive fu540 machine users
> > > may hit this issue if the device tree has the disabled hart (e core).
> > > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > > have issues [1]
> > > 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
> > >
> > > #3 probably deviates from policy and sets a bad precedent. So I am not
> > > advocating for it though ;)
> > > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > > -bios argument instead of the stock one.
> > > I could be wrong but my guess is the number of users facing #2 would
> > > be higher than #1.
> >
> > Thanks for that info Atish!
> >
> > We are stuck in a bad situation.
> >
> > The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> > do you think you could do that?
>
> OpenSBI has a major number and a minor number in its version, but it does
> not have a release/patch number, so it would be best to treat OpenSBI vX.Y.Z
> as bug fixes on top of OpenSBI vX.Y. In other words, supervisor software
> won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
> using sbi_get_impl_version().
>
> There are only three commits between the ACLINT fix and OpenSBI v1.3,
> so as a one-off case I will go ahead and create OpenSBI v1.3.1 containing
> only four commits on top of OpenSBI v1.3.
>
> Does this sound okay ?

That sounds fine to me. It fixes the issue for the Microsemi board and
it's a very small change between 1.3 and 1.3.1

Alistair

>
> >
> > Otherwise I think we should stick with OpenSBI 1.3. Considering that
> > it fixes UEFI boot issues for the virt board (which would be the most
> > used) it seems like a best call to make. People using the other boards
> > are unfortunately stuck building their own OpenSBI release.
> >
> > If there is no OpenSBI 1.3.1 release we should add something to the
> > release notes. @Conor Dooley are you able to give a clear sentence on
> > how the boot fails?
> >
> > Alistair
> >
> > >
> > > [1] https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaishuai at tinylab.org/
> > > > > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > > > > >     microchip-icicle-kit machine with OpenSBI 1.3
> > > > >
> > > > > Will OpenSBI disable it? If not, I think option 2) needs to be remove
> > > > > the DT node. I'll just use tip-of-tree myself & up to the
> > > >
> > > > Clearly didn't finish this comment. It was meant to say "up to the QEMU
> > > > maintainers what they want to do on the QEMU side of things".
> > > >
> > > > Thanks,
> > > > Conor.
> > >
> > >
> > >
> > > --
> > > Regards,
> > > Atish
> > >
>
> Regards,
> Anup



* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-19  9:53                               ` Alistair Francis
@ 2023-07-19 15:21                                 ` Anup Patel
  2023-07-19 15:45                                   ` Bin Meng
  0 siblings, 1 reply; 17+ messages in thread
From: Anup Patel @ 2023-07-19 15:21 UTC (permalink / raw)
  To: opensbi

On Wed, Jul 19, 2023 at 3:23 PM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Wed, Jul 19, 2023 at 3:39 PM Anup Patel <anup@brainfault.org> wrote:
> >
> > On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
> > >
> > > On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > >
> > > > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > > > >
> > > > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > > >
> > > > > > > > > OpenSBI v1.3
> > > > > > > > >    ____                    _____ ____ _____
> > > > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > > > >         | |
> > > > > > > > >         |_|
> > > > > > > > >
> > > > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > > > >
> > > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > > >
> > > > > > > > > > Unless there's something obvious to you, it sounds like I will need to
> > > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > > time.
> > > > > > > > >
> > > > > > >
> > > > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > > > DT passed to OpenSBI 1.3.
> > > > > > >
> > > > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > > > the E-core disabled.
> > > > > > >
> > > > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > > > >
> > > > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > > > obviously not.
> > > > > >
> > > > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > > > >
> > > > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > > > > I would love to fix it, I am pretty stretched for spare time to begin
> > > > > > with.
> > > > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > > > with QEMU, as I am sure you already know :)
> > > > > >
> > > > > > > At this point, you can either:
> > > > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > > > >
> > > > > I forgot to reply to this point, wondering what should be done with
> > > > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > > > of whether I can go and build a fixed version of OpenSBI.
> > > > >
> > > > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > > > user using the latest kernel (> v6.4)
> > > > may hit those random linear map related issues (in hibernation or EFI
> > > > booting path).
> > > >
> > > > There are three possible scenarios:
> > > >
> > > > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > > > or sifive fu540 machine users
> > > > may hit this issue if the device tree has the disabled hart (e core).
> > > > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > > > have issues [1]
> > > > 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
> > > >
> > > > #3 probably deviates from policy and sets a bad precedent. So I am not
> > > > advocating for it though ;)
> > > > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > > > -bios argument instead of the stock one.
> > > > I could be wrong but my guess is the number of users facing #2 would
> > > > be higher than #1.
> > >
> > > Thanks for that info Atish!
> > >
> > > We are stuck in a bad situation.
> > >
> > > The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> > > do you think you could do that?
> >
> > OpenSBI has a major number and a minor number in its version, but it does
> > not have a release/patch number, so it would be best to treat OpenSBI vX.Y.Z
> > as bug fixes on top of OpenSBI vX.Y. In other words, supervisor software
> > won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
> > using sbi_get_impl_version().
> >
> > There are only three commits between the ACLINT fix and OpenSBI v1.3,
> > so as a one-off case I will go ahead and create OpenSBI v1.3.1 containing
> > only four commits on top of OpenSBI v1.3.
> >
> > Does this sound okay ?
>
> That sounds fine to me. It fixes the issue for the Microsemi board and
> it's a very small change between 1.3 and 1.3.1

Please check
https://github.com/riscv-software-src/opensbi/releases/tag/v1.3.1

I hope this helps.
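
As background to the versioning point quoted above (supervisor software
cannot tell vX.Y.Z from vX.Y via sbi_get_impl_version()), here is a minimal
Python sketch of a packed version value with no patch field. The bit layout
(major in the upper 16 bits, minor in the lower 16 bits) is an assumption
made for illustration, and the helper names are hypothetical, not OpenSBI
functions.

```python
from typing import Tuple


def pack_impl_version(major: int, minor: int) -> int:
    """Pack major/minor into one integer (assumed layout: major << 16 | minor).

    There is deliberately no slot for a patch level.
    """
    return (major << 16) | (minor & 0xFFFF)


def unpack_impl_version(value: int) -> Tuple[int, int]:
    """Split a packed value back into (major, minor)."""
    return value >> 16, value & 0xFFFF


def impl_version_for_release(release: str) -> int:
    """Map a release name such as 'v1.3' or 'v1.3.1' to the packed value.

    Any third component is dropped, because the packed value cannot hold it.
    """
    parts = release.lstrip("v").split(".")
    return pack_impl_version(int(parts[0]), int(parts[1]))


if __name__ == "__main__":
    for name in ("v1.3", "v1.3.1"):
        print(name, hex(impl_version_for_release(name)))
```

Under that assumed layout both release names map to the same packed value,
so the reported version alone cannot confirm whether a given firmware build
contains the ACLINT fix.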

Regards,
Anup

>
> Alistair
>
> >
> > >
> > > Otherwise I think we should stick with OpenSBI 1.3. Considering that
> > > it fixes UEFI boot issues for the virt board (which would be the most
> > > used) it seems like a best call to make. People using the other boards
> > > are unfortunately stuck building their own OpenSBI release.
> > >
> > > If there is no OpenSBI 1.3.1 release we should add something to the
> > > release notes. @Conor Dooley are you able to give a clear sentence on
> > > how the boot fails?
> > >
> > > Alistair
> > >
> > > >
> > > > [1] https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaishuai at tinylab.org/
> > > > > > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > > > > > >     microchip-icicle-kit machine with OpenSBI 1.3
> > > > > >
> > > > > > Will OpenSBI disable it? If not, I think option 2) needs to be remove
> > > > > > the DT node. I'll just use tip-of-tree myself & up to the
> > > > >
> > > > > Clearly didn't finish this comment. It was meant to say "up to the QEMU
> > > > > maintainers what they want to do on the QEMU side of things".
> > > > >
> > > > > Thanks,
> > > > > Conor.
> > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > > Atish
> > > >
> >
> > Regards,
> > Anup



* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-19 15:21                                 ` Anup Patel
@ 2023-07-19 15:45                                   ` Bin Meng
  2023-07-19 16:10                                     ` Anup Patel
  2023-07-19 16:17                                     ` Andreas Schwab
  0 siblings, 2 replies; 17+ messages in thread
From: Bin Meng @ 2023-07-19 15:45 UTC (permalink / raw)
  To: opensbi

On Wed, Jul 19, 2023 at 11:22 PM Anup Patel <anup@brainfault.org> wrote:
>
> On Wed, Jul 19, 2023 at 3:23 PM Alistair Francis <alistair23@gmail.com> wrote:
> >
> > On Wed, Jul 19, 2023 at 3:39 PM Anup Patel <anup@brainfault.org> wrote:
> > >
> > > On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
> > > >
> > > > On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > > >
> > > > > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > > > > >
> > > > > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > > > >
> > > > > > > > > > OpenSBI v1.3
> > > > > > > > > >    ____                    _____ ____ _____
> > > > > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > > > > >         | |
> > > > > > > > > >         |_|
> > > > > > > > > >
> > > > > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > > > > >
> > > > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > > > >
> > > > > > > > > > > Unless there's something obvious to you, it sounds like I will need to
> > > > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > > > time.
> > > > > > > > > >
> > > > > > > >
> > > > > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > > > > DT passed to OpenSBI 1.3.
> > > > > > > >
> > > > > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > > > > the E-core disabled.
> > > > > > > >
> > > > > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > > > > >
> > > > > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > > > > obviously not.
> > > > > > >
> > > > > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > > > > >
> > > > > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > > > > > I would love to fix it, I am pretty stretched for spare time to begin
> > > > > > > with.
> > > > > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > > > > with QEMU, as I am sure you already know :)
> > > > > > >
> > > > > > > > At this point, you can either:
> > > > > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > > > > >
> > > > > > I forgot to reply to this point, wondering what should be done with
> > > > > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > > > > of whether I can go and build a fixed version of OpenSBI.
> > > > > >
> > > > > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > > > > user using the latest kernel (> v6.4)
> > > > > may hit those random linear map related issues (in hibernation or EFI
> > > > > booting path).
> > > > >
> > > > > There are three possible scenarios:
> > > > >
> > > > > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > > > > or sifive fu540 machine users
> > > > > may hit this issue if the device tree has the disabled hart (e core).
> > > > > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > > > > have issues [1]
> > > > > 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
> > > > >
> > > > > #3 probably deviates from policy and sets a bad precedent. So I am not
> > > > > advocating for it though ;)
> > > > > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > > > > -bios argument instead of the stock one.
> > > > > I could be wrong but my guess is the number of users facing #2 would
> > > > > be higher than #1.
> > > >
> > > > Thanks for that info Atish!
> > > >
> > > > We are stuck in a bad situation.
> > > >
> > > > The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> > > > do you think you could do that?
> > >
> > > OpenSBI has a major number and a minor number in its version, but it does
> > > not have a release/patch number, so it would be best to treat OpenSBI vX.Y.Z
> > > as bug fixes on top of OpenSBI vX.Y. In other words, supervisor software
> > > won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
> > > using sbi_get_impl_version().
> > >
> > > There are only three commits between the ACLINT fix and OpenSBI v1.3,
> > > so as a one-off case I will go ahead and create OpenSBI v1.3.1 containing
> > > only four commits on top of OpenSBI v1.3.
> > >
> > > Does this sound okay ?
> >
> > That sounds fine to me. It fixes the issue for the Microsemi board and
> > it's a very small change between 1.3 and 1.3.1
>
> Please check
> https://github.com/riscv-software-src/opensbi/releases/tag/v1.3.1
>
> I hope this helps.

Hi Alistair,

Do we need to update QEMU's opensbi binaries to v1.3.1?

Hi Anup,

Somehow I cannot see the 'tag' v1.3.1 being populated in the opensbi
git repo. Am I missing anything?

Regards,
Bin



* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-19 15:45                                   ` Bin Meng
@ 2023-07-19 16:10                                     ` Anup Patel
  2023-07-19 16:18                                       ` Bin Meng
  2023-07-19 16:17                                     ` Andreas Schwab
  1 sibling, 1 reply; 17+ messages in thread
From: Anup Patel @ 2023-07-19 16:10 UTC (permalink / raw)
  To: opensbi

Hi Bin,

On Wed, Jul 19, 2023 at 9:15 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> On Wed, Jul 19, 2023 at 11:22 PM Anup Patel <anup@brainfault.org> wrote:
> >
> > On Wed, Jul 19, 2023 at 3:23 PM Alistair Francis <alistair23@gmail.com> wrote:
> > >
> > > On Wed, Jul 19, 2023 at 3:39 PM Anup Patel <anup@brainfault.org> wrote:
> > > >
> > > > On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
> > > > >
> > > > > On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > > > >
> > > > > > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > > > > > >
> > > > > > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > > > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > > > > >
> > > > > > > > > > > OpenSBI v1.3
> > > > > > > > > > >    ____                    _____ ____ _____
> > > > > > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > > > > > >         | |
> > > > > > > > > > >         |_|
> > > > > > > > > > >
> > > > > > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > > > > > >
> > > > > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > > > > >
> > > > > > > > > > > > Unless there's something obvious to you, it sounds like I will need to
> > > > > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > > > > time.
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > > > > > DT passed to OpenSBI 1.3.
> > > > > > > > >
> > > > > > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > > > > > the E-core disabled.
> > > > > > > > >
> > > > > > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > > > > > >
> > > > > > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > > > > > obviously not.
> > > > > > > >
> > > > > > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > > > > > >
> > > > > > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > > > > > > I would love to fix it, I am pretty stretched for spare time to begin
> > > > > > > > with.
> > > > > > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > > > > > with QEMU, as I am sure you already know :)
> > > > > > > >
> > > > > > > > > At this point, you can either:
> > > > > > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > > > > > >
> > > > > > > I forgot to reply to this point, wondering what should be done with
> > > > > > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > > > > > of whether I can go and build a fixed version of OpenSBI.
> > > > > > >
> > > > > > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > > > > > user using the latest kernel (> v6.4)
> > > > > > may hit those random linear map related issues (in hibernation or EFI
> > > > > > booting path).
> > > > > >
> > > > > > There are three possible scenarios:
> > > > > >
> > > > > > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > > > > > or sifive fu540 machine users
> > > > > > may hit this issue if the device tree has the disabled hart (e core).
> > > > > > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > > > > > have issues [1]
> > > > > > 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
> > > > > >
> > > > > > #3 probably deviates from policy and sets a bad precedent. So I am not
> > > > > > advocating for it though ;)
> > > > > > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > > > > > -bios argument instead of the stock one.
> > > > > > I could be wrong but my guess is the number of users facing #2 would
> > > > > > be higher than #1.
> > > > >
> > > > > Thanks for that info Atish!
> > > > >
> > > > > We are stuck in a bad situation.
> > > > >
> > > > > The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> > > > > do you think you could do that?
> > > >
> > > > OpenSBI has a major number and a minor number in its version, but it does
> > > > not have a release/patch number, so it would be best to treat OpenSBI vX.Y.Z
> > > > as bug fixes on top of OpenSBI vX.Y. In other words, supervisor software
> > > > won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
> > > > using sbi_get_impl_version().
> > > >
> > > > There are only three commits between the ACLINT fix and OpenSBI v1.3,
> > > > so as a one-off case I will go ahead and create OpenSBI v1.3.1 containing
> > > > only four commits on top of OpenSBI v1.3.
> > > >
> > > > Does this sound okay ?
> > >
> > > That sounds fine to me. It fixes the issue for the Microsemi board and
> > > it's a very small change between 1.3 and 1.3.1
> >
> > Please check
> > https://github.com/riscv-software-src/opensbi/releases/tag/v1.3.1
> >
> > I hope this helps.
>
> Hi Alistair,
>
> Do we need to update QEMU's opensbi binaries to v1.3.1?
>
> Hi Anup,
>
> Somehow I cannot see the 'tag' v1.3.1 being populated in the opensbi
> git repo. Am I missing anything?

There is a v1.3.1 tag in https://github.com/riscv-software-src/opensbi
(Try cloning the repo again?)

The commit history of v1.3.1 is the v1.3 tag + 5 cherry-picked commits,
which means the commit history of the master branch is not the same
as the commit history of v1.3.1.

Regards,
Anup



* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-19 15:45                                   ` Bin Meng
  2023-07-19 16:10                                     ` Anup Patel
@ 2023-07-19 16:17                                     ` Andreas Schwab
  1 sibling, 0 replies; 17+ messages in thread
From: Andreas Schwab @ 2023-07-19 16:17 UTC (permalink / raw)
  To: opensbi

On Jul 19 2023, Bin Meng wrote:

>> Please check
>> https://github.com/riscv-software-src/opensbi/releases/tag/v1.3.1
>>
>> I hope this helps.
>
> Hi Alistair,
>
> Do we need to update QEMU's opensbi binaries to v1.3.1?
>
> Hi Anup,
>
> Somehow I cannot see the 'tag' v1.3.1 being populated in the opensbi
> git repo. Am I missing anything?

You need to run git fetch --tags, because the tag is not part of any
branch, thus not fetched automatically.
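
To illustrate the rule described above (a plain fetch only brings in tags
that point into history it is already fetching, hence the need for
git fetch --tags), here is a toy Python model. It is not git's
implementation; the commit names, the graph, and the helper functions are
all hypothetical and only mirror the shape of the v1.3 / v1.3.1 situation.

```python
def ancestors(parents, tip):
    """Return the set of commits reachable from `tip`, including `tip`."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return seen


def tags_followed(parents, branches, tags):
    """Tags a branch-only fetch would pick up in this toy model:
    those whose target commit is reachable from some fetched branch tip."""
    reachable = set()
    for tip in branches.values():
        reachable |= ancestors(parents, tip)
    return sorted(name for name, commit in tags.items() if commit in reachable)


# Hypothetical history: master continues past the release commit, while the
# point release is built from cherry-picks on top of the release tag.
parents = {
    "base": [],
    "rel-1.3": ["base"],
    "m1": ["rel-1.3"],
    "m2": ["m1"],
    "pick1": ["rel-1.3"],
    "pick2": ["pick1"],
}
branches = {"master": "m2"}
tags = {"v1.3": "rel-1.3", "v1.3.1": "pick2"}

followed = tags_followed(parents, branches, tags)
print(followed)
```

In this model only the v1.3 tag is followed, because the cherry-picked
commits under v1.3.1 are not ancestors of the master tip.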

-- 
Andreas Schwab, schwab at linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."



* Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type)
  2023-07-19 16:10                                     ` Anup Patel
@ 2023-07-19 16:18                                       ` Bin Meng
  0 siblings, 0 replies; 17+ messages in thread
From: Bin Meng @ 2023-07-19 16:18 UTC (permalink / raw)
  To: opensbi

Hi Anup,

On Thu, Jul 20, 2023 at 12:10 AM Anup Patel <apatel@ventanamicro.com> wrote:
>
> Hi Bin,
>
> On Wed, Jul 19, 2023 at 9:15 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >
> > On Wed, Jul 19, 2023 at 11:22 PM Anup Patel <anup@brainfault.org> wrote:
> > >
> > > On Wed, Jul 19, 2023 at 3:23 PM Alistair Francis <alistair23@gmail.com> wrote:
> > > >
> > > > On Wed, Jul 19, 2023 at 3:39 PM Anup Patel <anup@brainfault.org> wrote:
> > > > >
> > > > > On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
> > > > > >
> > > > > > On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > > > > >
> > > > > > > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > > > > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > > > > > >
> > > > > > > > > > > > OpenSBI v1.3
> > > > > > > > > > > >    ____                    _____ ____ _____
> > > > > > > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > > > > > > >         | |
> > > > > > > > > > > >         |_|
> > > > > > > > > > > >
> > > > > > > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > > > > > > >
> > > > > > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > > > > > >
> > > > > > > > > > > > > Unless there's something obvious to you, it sounds like I will need to
> > > > > > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > > > > > time.
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > > > > > > DT passed to OpenSBI 1.3.
> > > > > > > > > >
> > > > > > > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > > > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > > > > > > the E-core disabled.
> > > > > > > > > >
> > > > > > > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > > > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > > > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > > > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > > > > > > >
> > > > > > > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > > > > > > obviously not.
> > > > > > > > >
> > > > > > > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > > > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > > > > > > >
> > > > > > > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > > > > > > > I would love to fix it, I am pretty stretched for spare time to begin
> > > > > > > > > with.
> > > > > > > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > > > > > > with QEMU, as I am sure you already know :)
> > > > > > > > >
> > > > > > > > > > At this point, you can either:
> > > > > > > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > > > > > > >
> > > > > > > > I forgot to reply to this point, wondering what should be done with
> > > > > > > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > > > > > > of whether I can go and build a fixed version of OpenSBI.
> > > > > > > >
> > > > > > > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > > > > > > user using the latest kernel (> v6.4)
> > > > > > > may hit those random linear map related issues (in hibernation or EFI
> > > > > > > booting path).
> > > > > > >
> > > > > > > There are three possible scenarios:
> > > > > > >
> > > > > > > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > > > > > > or sifive fu540 machine users
> > > > > > > may hit this issue if the device tree has the disabled hart (e core).
> > > > > > > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > > > > > > have issues [1]
> > > > > > > 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
> > > > > > >
> > > > > > > #3 probably deviates from policy and sets a bad precedent. So I am not
> > > > > > > advocating for it though ;)
> > > > > > > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > > > > > > -bios argument instead of the stock one.
> > > > > > > I could be wrong but my guess is the number of users facing #2 would
> > > > > > > be higher than #1.
> > > > > >
> > > > > > Thanks for that info Atish!
> > > > > >
> > > > > > We are stuck in a bad situation.
> > > > > >
> > > > > > The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> > > > > > do you think you could do that?
> > > > >
> > > > > OpenSBI has a major number and a minor number in its version, but it does
> > > > > not have a release/patch number, so it would be best to treat OpenSBI vX.Y.Z
> > > > > as bug fixes on top of OpenSBI vX.Y. In other words, supervisor software
> > > > > won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
> > > > > using sbi_get_impl_version().
> > > > >
> > > > > There are only three commits between the ACLINT fix and OpenSBI v1.3,
> > > > > so as a one-off case I will go ahead and create OpenSBI v1.3.1 containing
> > > > > only four commits on top of OpenSBI v1.3.
> > > > >
> > > > > Does this sound okay ?
> > > >
> > > > That sounds fine to me. It fixes the issue for the Microsemi board and
> > > > it's a very small change between 1.3 and 1.3.1
> > >
> > > Please check
> > > https://github.com/riscv-software-src/opensbi/releases/tag/v1.3.1
> > >
> > > I hope this helps.
> >
> > Hi Alistair,
> >
> > Do we need to update QEMU's opensbi binaries to v1.3.1?
> >
> > Hi Anup,
> >
> > Somehow I cannot see the 'tag' v1.3.1 being populated in the opensbi
> > git repo. Am I missing anything?
>
> There is a v1.3.1 tag in https://github.com/riscv-software-src/opensbi
> (Try cloning the repo again?)
>
> The commit history of v1.3.1 is v1.3 tag + 5 cherry picked commits
> which means the commit history of the master branch is not the same
> as the commit history of v1.3.1.

I see. I was seeing a GitHub warning message when I viewed the last
commit [1] from tag v1.3.1:

"This commit does not belong to any branch on this repository, and may
belong to a fork outside of the repository."

and was misled. Thanks for the hint. I have now added "--tags" to my
git fetch and can see the new tag.

[1] https://github.com/riscv-software-src/opensbi/commit/057eb10b6d523540012e6947d5c9f63e95244e94

Regards,
Bin



end of thread, other threads:[~2023-07-19 16:18 UTC | newest]

Thread overview: 17+ messages
     [not found] <20230712190149.424675-1-dbarboza@ventanamicro.com>
     [not found] ` <20230712190149.424675-7-dbarboza@ventanamicro.com>
     [not found]   ` <20230712-stench-happiness-40c2ea831257@spud>
     [not found]     ` <3e9b5be8-d3ca-3a17-bef9-4a6a5bdc0ad0@ventanamicro.com>
     [not found]       ` <20230712-tulip-replica-0322e71c3e81@spud>
     [not found]         ` <744cbde6-7ce5-c327-3c5a-3858e994cc39@ventanamicro.com>
     [not found]           ` <20230712-superhero-rabid-578605f52927@spud>
     [not found]             ` <5dd3366d-13ba-c7fb-554f-549d97e7d4f9@ventanamicro.com>
     [not found]               ` <20230712-fancied-aviator-270f51166407@spud>
2023-07-13 22:12                 ` Boot failure after QEMU's upgrade to OpenSBI v1.3 (was Re: [PATCH for-8.2 6/7] target/riscv: add 'max' CPU type) Conor Dooley
2023-07-13 22:35                   ` Daniel Henrique Barboza
2023-07-13 23:04                   ` Conor Dooley
2023-07-14  4:30                   ` Anup Patel
2023-07-14 10:19                     ` Conor Dooley
2023-07-14 12:28                       ` Conor Dooley
2023-07-15  9:12                         ` Atish Patra
2023-07-19  1:32                           ` Alistair Francis
2023-07-19  5:39                             ` Anup Patel
2023-07-19  9:53                               ` Alistair Francis
2023-07-19 15:21                                 ` Anup Patel
2023-07-19 15:45                                   ` Bin Meng
2023-07-19 16:10                                     ` Anup Patel
2023-07-19 16:18                                       ` Bin Meng
2023-07-19 16:17                                     ` Andreas Schwab
2023-07-19  7:07                             ` Conor Dooley
2023-07-14 12:35                       ` Anup Patel
