* [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug @ 2022-05-04 12:49 Nicolas Frattaroli 2022-05-04 13:21 ` Sudeep Holla 0 siblings, 1 reply; 13+ messages in thread From: Nicolas Frattaroli @ 2022-05-04 12:49 UTC (permalink / raw) To: linux-arm-kernel Cc: linux-rockchip, Heiko Stuebner, Liang Chen, Sudeep Holla, linux-kernel, Kever Yang Good day, a user on the #linux-rockchip channel on the Libera.chat IRC network reported that their RK3568 was no longer getting a CPU and GPU clock from scmi and consequently not booting when using linux-next. This was bisected down to the following commit: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/firmware/arm_scmi/base.c?h=next-20220503&id=3b0041f6e10e5bdbb646d98172be43e88734ed62 The error message in the log is as follows: arm-scmi firmware:scmi: Malformed reply - real_sz:8 calc_sz:4, t->rx.len is 12, sizeof(u32) is 4, loop_num_ret is 3 The rockchip firmware (bl31) being used was v1.32, from here: https://github.com/JeffyCN/rockchip_mirrors/blob/rkbin/bin/rk35/rk3568_bl31_v1.32.elf This seems like a non-fatal firmware bug, for which a kernel workaround is certainly possible, but it would be good if rockchip could fix this in their firmware. The user going by "amazingfate" reported that commenting out the ret = -EPROTO; break; fixes the issue for them. I'm writing here to get the discussion started on how we can resolve this before the Linux 5.19 release. Sudeep Holla has already told me they'll gladly add a workaround before the 5.19 release, but would rather see this fixed in the vendor firmware first. Would rockchip be able and willing to fix it and publish a new bl31 for rk3568? Regards, Nicolas Frattaroli _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug 2022-05-04 12:49 [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug Nicolas Frattaroli @ 2022-05-04 13:21 ` Sudeep Holla 2022-05-04 17:51 ` Nicolas Frattaroli 0 siblings, 1 reply; 13+ messages in thread From: Sudeep Holla @ 2022-05-04 13:21 UTC (permalink / raw) To: Nicolas Frattaroli Cc: linux-arm-kernel, linux-rockchip, Cristian Marussi, Heiko Stuebner, Liang Chen, linux-kernel, Kever Yang + Cristian Hi Nicolas, Thanks for the formal report. On Wed, May 04, 2022 at 02:49:07PM +0200, Nicolas Frattaroli wrote: > Good day, > > a user on the #linux-rockchip channel on the Libera.chat IRC network > reported that their RK3568 was no longer getting a CPU and GPU clock > from scmi and consequently not booting when using linux-next. This > was bisected down to the following commit: > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/firmware/arm_scmi/base.c?h=next-20220503&id=3b0041f6e10e5bdbb646d98172be43e88734ed62 > > The error message in the log is as follows: > > arm-scmi firmware:scmi: Malformed reply - real_sz:8 calc_sz:4, t->rx.len is 12, sizeof(u32) is 4, loop_num_ret is 3 > > The rockchip firmware (bl31) being used was v1.32, from here: > > https://github.com/JeffyCN/rockchip_mirrors/blob/rkbin/bin/rk35/rk3568_bl31_v1.32.elf > So this platform is not supported in upstream TF-A like its predecessors ? > This seems like a non-fatal firmware bug, for which a kernel workaround is > certainly possible, but it would be good if rockchip could fix this in their > firmware. > Indeed, we added this check finding issue in one of our tests. Luckily it helped to unearth the same issue on this platform, but due to the nature of its f/w release, it is bit unfortunate that it can't be fixed easily and quickly. But I really wish this gets fixed in the firmware. Are there any other f/w bugs reported so far ? If so how are they fixed as I don't expect all such bugs can be worked around in the kernel though this might be. I would like to hear details there if possible. > The user going by "amazingfate" reported that commenting out the > ret = -EPROTO; break; > fixes the issue for them. > Sure, or we could relax the check as calc_sz <= real_sz or something so that the reverse is still caught and handled as OS might read junk data in the later case. > I'm writing here to get the discussion started on how we can resolve this > before the Linux 5.19 release. > Agreed, I just sent by pull request for this literally few hours ago. > Sudeep Holla has already told me they'll gladly add a workaround before > the 5.19 release, but would rather see this fixed in the vendor firmware > first. Would rockchip be able and willing to fix it and publish a new > bl31 for rk3568? > Indeed and as mentioned above details on how other such f/w bugs are dealt in general esp that the firmware is blob release and one can't fix it easily. Do we have a bugzilla kind of setup to report and get the bugs fixed ? -- Regards, Sudeep _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug 2022-05-04 13:21 ` Sudeep Holla @ 2022-05-04 17:51 ` Nicolas Frattaroli 2022-05-05 8:03 ` Cristian Marussi 0 siblings, 1 reply; 13+ messages in thread From: Nicolas Frattaroli @ 2022-05-04 17:51 UTC (permalink / raw) To: Sudeep Holla Cc: linux-arm-kernel, linux-rockchip, Cristian Marussi, Heiko Stuebner, Liang Chen, linux-kernel, Kever Yang, Jeffy Chen, Peter Geis On Mittwoch, 4. Mai 2022 15:21:30 CEST Sudeep Holla wrote: > + Cristian > > Hi Nicolas, > > Thanks for the formal report. > > On Wed, May 04, 2022 at 02:49:07PM +0200, Nicolas Frattaroli wrote: > > Good day, > > > > a user on the #linux-rockchip channel on the Libera.chat IRC network > > reported that their RK3568 was no longer getting a CPU and GPU clock > > from scmi and consequently not booting when using linux-next. This > > was bisected down to the following commit: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/firmware/arm_scmi/base.c?h=next-20220503&id=3b0041f6e10e5bdbb646d98172be43e88734ed62 > > > > The error message in the log is as follows: > > > > arm-scmi firmware:scmi: Malformed reply - real_sz:8 calc_sz:4, t->rx.len is 12, sizeof(u32) is 4, loop_num_ret is 3 > > > > The rockchip firmware (bl31) being used was v1.32, from here: > > > > https://github.com/JeffyCN/rockchip_mirrors/blob/rkbin/bin/rk35/rk3568_bl31_v1.32.elf > > > > So this platform is not supported in upstream TF-A like its predecessors ? Hello, it is not yet supported by upstream. Rockchip plans to release the sources for it at some point if I recall correctly, but I believe their software team has been very busy due to new hardware releases, so it hasn't happened yet. I hope we'll see an open source release of the TF-A sources eventually, so that for bugs like this we can always fix them without the vendor needing to do it for us. > > > This seems like a non-fatal firmware bug, for which a kernel workaround is > > certainly possible, but it would be good if rockchip could fix this in their > > firmware. > > > > Indeed, we added this check finding issue in one of our tests. Luckily > it helped to unearth the same issue on this platform, but due to the > nature of its f/w release, it is bit unfortunate that it can't be fixed > easily and quickly. But I really wish this gets fixed in the firmware. > Are there any other f/w bugs reported so far ? If so how are they fixed > as I don't expect all such bugs can be worked around in the kernel though > this might be. I would like to hear details there if possible. I'm not aware of how the rockchip bug report workflow works. They seemingly did update the firmware multiple times, last in October of 2021. The official rockchip repository at [1] hasn't been kept as up to date as the mirror by a rockchip employee at [2], so most people seem to have been using the latter. Speaking of which, I'll add the owner of that repo to the CC of this thread to make sure this doesn't get lost. Rockchip lists an e-mail at [3] for reporting issues at, but this seems to relate to their open-source documentation. The official github repository of "rkbin" on the "rockchip-linux" organisation does not have issues enabled, so submitting a bug report through that is unfortunately not possible. > > > The user going by "amazingfate" reported that commenting out the > > ret = -EPROTO; break; > > fixes the issue for them. > > > > Sure, or we could relax the check as calc_sz <= real_sz or something so > that the reverse is still caught and handled as OS might read junk data in > the later case. This seems like a good solution, that way we're unlikely to ever run into a situation where the kernel does the wrong thing here even if we're less strict about the check. In either case, it should print a dev_err though, it's still an error even if we can tolerate it in some cases. > > > I'm writing here to get the discussion started on how we can resolve this > > before the Linux 5.19 release. > > > > Agreed, I just sent by pull request for this literally few hours ago. > > > Sudeep Holla has already told me they'll gladly add a workaround before > > the 5.19 release, but would rather see this fixed in the vendor firmware > > first. Would rockchip be able and willing to fix it and publish a new > > bl31 for rk3568? > > > > Indeed and as mentioned above details on how other such f/w bugs are dealt > in general esp that the firmware is blob release and one can't fix it easily. > Do we have a bugzilla kind of setup to report and get the bugs fixed ? It's worth mentioning that I think even if we get Rockchip to fix the bug in the firmware, I believe Linux should still add a workaround, as otherwise people running older firmware who are upgrading their kernels could suddenly have unbootable systems and don't know why that happened. Regards, Nicolas Frattaroli PS: I've also CC'd Peter Geis as he has worked on the RK356x support in mainline a bunch and I believe he has been in contact with Rockchip about releasing the TF-A sources before. [1]: https://github.com/rockchip-linux/rkbin [2]: https://github.com/JeffyCN/rockchip_mirrors/tree/rkbin [3]: http://opensource.rock-chips.com/wiki_Main_Page _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug 2022-05-04 17:51 ` Nicolas Frattaroli @ 2022-05-05 8:03 ` Cristian Marussi 2022-05-05 9:40 ` Etienne Carriere 0 siblings, 1 reply; 13+ messages in thread From: Cristian Marussi @ 2022-05-05 8:03 UTC (permalink / raw) To: Nicolas Frattaroli Cc: Sudeep Holla, linux-arm-kernel, linux-rockchip, Heiko Stuebner, Liang Chen, linux-kernel, Kever Yang, Jeffy Chen, Peter Geis, Etienne Carriere On Wed, May 04, 2022 at 07:51:45PM +0200, Nicolas Frattaroli wrote: > On Mittwoch, 4. Mai 2022 15:21:30 CEST Sudeep Holla wrote: > > + Cristian +Etieenne Hi Nicolas, > > > > Hi Nicolas, > > > > Thanks for the formal report. > > > > On Wed, May 04, 2022 at 02:49:07PM +0200, Nicolas Frattaroli wrote: > > > Good day, > > > > > > a user on the #linux-rockchip channel on the Libera.chat IRC network > > > reported that their RK3568 was no longer getting a CPU and GPU clock > > > from scmi and consequently not booting when using linux-next. This > > > was bisected down to the following commit: > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/firmware/arm_scmi/base.c?h=next-20220503&id=3b0041f6e10e5bdbb646d98172be43e88734ed62 > > > > > > The error message in the log is as follows: > > > > > > arm-scmi firmware:scmi: Malformed reply - real_sz:8 calc_sz:4, t->rx.len is 12, sizeof(u32) is 4, loop_num_ret is 3 > > > > > > The rockchip firmware (bl31) being used was v1.32, from here: > > > > > > https://github.com/JeffyCN/rockchip_mirrors/blob/rkbin/bin/rk35/rk3568_bl31_v1.32.elf > > > > > > > So this platform is not supported in upstream TF-A like its predecessors ? > > Hello, > > it is not yet supported by upstream. Rockchip plans to release the sources > for it at some point if I recall correctly, but I believe their software > team has been very busy due to new hardware releases, so it hasn't happened > yet. I hope we'll see an open source release of the TF-A sources eventually, > so that for bugs like this we can always fix them without the vendor needing > to do it for us. > > > > > > This seems like a non-fatal firmware bug, for which a kernel workaround is > > > certainly possible, but it would be good if rockchip could fix this in their > > > firmware. > > > > > > > Indeed, we added this check finding issue in one of our tests. Luckily > > it helped to unearth the same issue on this platform, but due to the > > nature of its f/w release, it is bit unfortunate that it can't be fixed > > easily and quickly. But I really wish this gets fixed in the firmware. > > Are there any other f/w bugs reported so far ? If so how are they fixed > > as I don't expect all such bugs can be worked around in the kernel though > > this might be. I would like to hear details there if possible. > > I'm not aware of how the rockchip bug report workflow works. They seemingly > did update the firmware multiple times, last in October of 2021. > > The official rockchip repository at [1] hasn't been kept as up to date as > the mirror by a rockchip employee at [2], so most people seem to have been > using the latter. Speaking of which, I'll add the owner of that repo to > the CC of this thread to make sure this doesn't get lost. > > Rockchip lists an e-mail at [3] for reporting issues at, but this seems to > relate to their open-source documentation. The official github repository > of "rkbin" on the "rockchip-linux" organisation does not have issues > enabled, so submitting a bug report through that is unfortunately not > possible. Having a quick look at TF-A SCMI code in charge of this message (at least in the upstream): https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tree/drivers/scmi-msg/base.c#n136 it seems to me that the bug lies in the fact that the BASE_DISCOVER_PROTOCOLS response message built by the function above is not properly sized: the trailing message payload carrying the list of protocols (after returned_protos field) returns always a fixed arbitrarily sized payload, possibly zeroed if fewer protocols are carried. IOW, even though the answer in this case carries 3 items/protocols, the payload is anyway 8 bytes (with the last 5 bytes zeroed), while by the spec it should have been just 4 bytes. (in fact testing the kernel fix on a JUNO with last SCP fw release did NOT expose any issue...) I think a fix FW side could be something along these lines (UNTESTED NOR BUILT ! ... I Cc'ed Etienne that seems the author of this bit) This basically mirrors the same checks introduced in kernel...if someone is fancy/able to test it.... --- diff --git a/drivers/scmi-msg/base.c b/drivers/scmi-msg/base.c index 2d7203451..35c99e308 100644 --- a/drivers/scmi-msg/base.c +++ b/drivers/scmi-msg/base.c @@ -142,6 +142,7 @@ static void discover_list_protocols(struct scmi_msg *msg) uint8_t outargs[sizeof(p2a) + MAX_PROTOCOL_IN_LIST] = { 0U }; const uint8_t *list = NULL; unsigned int count = 0U; + size_t list_sz = 0U; if (msg->in_size != sizeof(*a2p)) { scmi_status_response(msg, SCMI_PROTOCOL_ERROR); @@ -163,9 +164,12 @@ static void discover_list_protocols(struct scmi_msg *msg) p2a.num_protocols = count; memcpy(outargs, &p2a, sizeof(p2a)); - memcpy(outargs + sizeof(p2a), list + a2p->skip, count); - - scmi_write_response(msg, outargs, sizeof(outargs)); + if (count) { + memcpy(outargs + sizeof(p2a), list + a2p->skip, count); + list_sz = (1 + (count - 1) / sizeof(uint32_t)) * + sizeof(uint32_t); + } + scmi_write_response(msg, outargs, sizeof(p2a) + list_sz); } > > > > > > The user going by "amazingfate" reported that commenting out the > > > ret = -EPROTO; break; > > > fixes the issue for them. > > > > > > > Sure, or we could relax the check as calc_sz <= real_sz or something so > > that the reverse is still caught and handled as OS might read junk data in > > the later case. > > This seems like a good solution, that way we're unlikely to ever run into a > situation where the kernel does the wrong thing here even if we're less > strict about the check. In either case, it should print a dev_err though, > it's still an error even if we can tolerate it in some cases. > > > Beside fixing the FW, adding a workaround like the one Sudeep mentioned to avoid killing old fw plaform seems reasonable: we can just _err() as long as Kernel is not put into peril (i.e. calc_sz <= real_sz ) without necessarily bail out if an out of spec, but harmless, message is spotted. > > > I'm writing here to get the discussion started on how we can resolve this > > > before the Linux 5.19 release. > > > > > > > Agreed, I just sent by pull request for this literally few hours ago. > > > > > Sudeep Holla has already told me they'll gladly add a workaround before > > > the 5.19 release, but would rather see this fixed in the vendor firmware > > > first. Would rockchip be able and willing to fix it and publish a new > > > bl31 for rk3568? > > > > > > > Indeed and as mentioned above details on how other such f/w bugs are dealt > > in general esp that the firmware is blob release and one can't fix it easily. > > Do we have a bugzilla kind of setup to report and get the bugs fixed ? > > It's worth mentioning that I think even if we get Rockchip to fix the bug in > the firmware, I believe Linux should still add a workaround, as otherwise > people running older firmware who are upgrading their kernels could suddenly > have unbootable systems and don't know why that happened. > Agree. Thanks, Cristian _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug 2022-05-05 8:03 ` Cristian Marussi @ 2022-05-05 9:40 ` Etienne Carriere 2022-05-05 10:10 ` Sudeep Holla ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Etienne Carriere @ 2022-05-05 9:40 UTC (permalink / raw) To: Cristian Marussi Cc: Nicolas Frattaroli, Sudeep Holla, linux-arm-kernel, linux-rockchip, Heiko Stuebner, Liang Chen, linux-kernel, Kever Yang, Jeffy Chen, Peter Geis Hello Nicolas, Cristian, On Thu, 5 May 2022 at 10:03, Cristian Marussi <cristian.marussi@arm.com> wrote: > > On Wed, May 04, 2022 at 07:51:45PM +0200, Nicolas Frattaroli wrote: > > On Mittwoch, 4. Mai 2022 15:21:30 CEST Sudeep Holla wrote: > > > + Cristian > > +Etieenne > > Hi Nicolas, > > > > > > > Hi Nicolas, > > > > > > Thanks for the formal report. > > > > > > On Wed, May 04, 2022 at 02:49:07PM +0200, Nicolas Frattaroli wrote: > > > > Good day, > > > > > > > > a user on the #linux-rockchip channel on the Libera.chat IRC network > > > > reported that their RK3568 was no longer getting a CPU and GPU clock > > > > from scmi and consequently not booting when using linux-next. This > > > > was bisected down to the following commit: > > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/firmware/arm_scmi/base.c?h=next-20220503&id=3b0041f6e10e5bdbb646d98172be43e88734ed62 > > > > > > > > The error message in the log is as follows: > > > > > > > > arm-scmi firmware:scmi: Malformed reply - real_sz:8 calc_sz:4, t->rx.len is 12, sizeof(u32) is 4, loop_num_ret is 3 > > > > > > > > The rockchip firmware (bl31) being used was v1.32, from here: > > > > > > > > https://github.com/JeffyCN/rockchip_mirrors/blob/rkbin/bin/rk35/rk3568_bl31_v1.32.elf > > > > > > > > > > So this platform is not supported in upstream TF-A like its predecessors ? > > > > Hello, > > > > it is not yet supported by upstream. Rockchip plans to release the sources > > for it at some point if I recall correctly, but I believe their software > > team has been very busy due to new hardware releases, so it hasn't happened > > yet. I hope we'll see an open source release of the TF-A sources eventually, > > so that for bugs like this we can always fix them without the vendor needing > > to do it for us. > > > > > > > > > This seems like a non-fatal firmware bug, for which a kernel workaround is > > > > certainly possible, but it would be good if rockchip could fix this in their > > > > firmware. > > > > > > > > > > Indeed, we added this check finding issue in one of our tests. Luckily > > > it helped to unearth the same issue on this platform, but due to the > > > nature of its f/w release, it is bit unfortunate that it can't be fixed > > > easily and quickly. But I really wish this gets fixed in the firmware. > > > Are there any other f/w bugs reported so far ? If so how are they fixed > > > as I don't expect all such bugs can be worked around in the kernel though > > > this might be. I would like to hear details there if possible. > > > > I'm not aware of how the rockchip bug report workflow works. They seemingly > > did update the firmware multiple times, last in October of 2021. > > > > The official rockchip repository at [1] hasn't been kept as up to date as > > the mirror by a rockchip employee at [2], so most people seem to have been > > using the latter. Speaking of which, I'll add the owner of that repo to > > the CC of this thread to make sure this doesn't get lost. > > > > Rockchip lists an e-mail at [3] for reporting issues at, but this seems to > > relate to their open-source documentation. The official github repository > > of "rkbin" on the "rockchip-linux" organisation does not have issues > > enabled, so submitting a bug report through that is unfortunately not > > possible. > > Having a quick look at TF-A SCMI code in charge of this message (at least in > the upstream): > > https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tree/drivers/scmi-msg/base.c#n136 > > it seems to me that the bug lies in the fact that the BASE_DISCOVER_PROTOCOLS > response message built by the function above is not properly sized: the trailing > message payload carrying the list of protocols (after returned_protos field) returns > always a fixed arbitrarily sized payload, possibly zeroed if fewer protocols are > carried. > > IOW, even though the answer in this case carries 3 items/protocols, the payload > is anyway 8 bytes (with the last 5 bytes zeroed), while by the spec it should have > been just 4 bytes. > > (in fact testing the kernel fix on a JUNO with last SCP fw release did NOT expose > any issue...) > > I think a fix FW side could be something along these lines (UNTESTED NOR > BUILT ! ... I Cc'ed Etienne that seems the author of this bit) > > This basically mirrors the same checks introduced in kernel...if someone > is fancy/able to test it.... Indeed the firmware implementation is wrong in TF-A. And also in OP-TEE by the way: https://github.com/OP-TEE/optee_os/blob/3.17.0/core/drivers/scmi-msg/base.c#L163-L166 @Nicoals, do you want to send a patch to TF-A, or do you want me to do it? I can fix the optee_os implementation. I'll tell you when I'll have created a P-R. The fix is the same for TF-A and OP-TEE. Proposal from Cristian looks good to me, maybe simplified: ```patch memcpy(outargs, &p2a, sizeof(p2a)); memcpy(outargs + sizeof(p2a), list + a2p->skip, count); - scmi_write_response(msg, outargs, sizeof(outargs)); + list_sz = (1 + (count - 1) / sizeof(uint32_t)) * sizeof(uint32_t); + scmi_write_response(msg, outargs, sizeof(p2a) + list_sz); ``` > > diff --git a/drivers/scmi-msg/base.c b/drivers/scmi-msg/base.c > index 2d7203451..35c99e308 100644 > --- a/drivers/scmi-msg/base.c > +++ b/drivers/scmi-msg/base.c > @@ -142,6 +142,7 @@ static void discover_list_protocols(struct scmi_msg *msg) > uint8_t outargs[sizeof(p2a) + MAX_PROTOCOL_IN_LIST] = { 0U }; > const uint8_t *list = NULL; > unsigned int count = 0U; > + size_t list_sz = 0U; > > if (msg->in_size != sizeof(*a2p)) { > scmi_status_response(msg, SCMI_PROTOCOL_ERROR); > @@ -163,9 +164,12 @@ static void discover_list_protocols(struct scmi_msg *msg) > p2a.num_protocols = count; > > memcpy(outargs, &p2a, sizeof(p2a)); > - memcpy(outargs + sizeof(p2a), list + a2p->skip, count); > - > - scmi_write_response(msg, outargs, sizeof(outargs)); > + if (count) { > + memcpy(outargs + sizeof(p2a), list + a2p->skip, count); > + list_sz = (1 + (count - 1) / sizeof(uint32_t)) * > + sizeof(uint32_t); > + } > + scmi_write_response(msg, outargs, sizeof(p2a) + list_sz); > } > > > > > > > > > > > The user going by "amazingfate" reported that commenting out the > > > > ret = -EPROTO; break; > > > > fixes the issue for them. > > > > > > > > > > Sure, or we could relax the check as calc_sz <= real_sz or something so > > > that the reverse is still caught and handled as OS might read junk data in > > > the later case. > > > > This seems like a good solution, that way we're unlikely to ever run into a > > situation where the kernel does the wrong thing here even if we're less > > strict about the check. In either case, it should print a dev_err though, > > it's still an error even if we can tolerate it in some cases. > > > > > > > Beside fixing the FW, adding a workaround like the one Sudeep mentioned to > avoid killing old fw plaform seems reasonable: we can just _err() as long as > Kernel is not put into peril (i.e. calc_sz <= real_sz ) without > necessarily bail out if an out of spec, but harmless, message is > spotted. > > > > > I'm writing here to get the discussion started on how we can resolve this > > > > before the Linux 5.19 release. > > > > > > > > > > Agreed, I just sent by pull request for this literally few hours ago. > > > > > > > Sudeep Holla has already told me they'll gladly add a workaround before > > > > the 5.19 release, but would rather see this fixed in the vendor firmware > > > > first. Would rockchip be able and willing to fix it and publish a new > > > > bl31 for rk3568? > > > > > > > > > > Indeed and as mentioned above details on how other such f/w bugs are dealt > > > in general esp that the firmware is blob release and one can't fix it easily. > > > Do we have a bugzilla kind of setup to report and get the bugs fixed ? > > > > It's worth mentioning that I think even if we get Rockchip to fix the bug in > > the firmware, I believe Linux should still add a workaround, as otherwise > > people running older firmware who are upgrading their kernels could suddenly > > have unbootable systems and don't know why that happened. > > > > Agree. Agree also. Sorry for that bug in TF-A/OP-TEE. BR, Etienne > > Thanks, > Cristian _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug 2022-05-05 9:40 ` Etienne Carriere @ 2022-05-05 10:10 ` Sudeep Holla 2022-05-05 13:42 ` Nicolas Frattaroli 2022-05-05 10:47 ` Cristian Marussi 2022-05-12 11:11 ` Nicolas Frattaroli 2 siblings, 1 reply; 13+ messages in thread From: Sudeep Holla @ 2022-05-05 10:10 UTC (permalink / raw) To: Nicolas Frattaroli Cc: Cristian Marussi, Etienne Carriere, linux-arm-kernel, linux-rockchip, Heiko Stuebner, Liang Chen, linux-kernel, Kever Yang, Jeffy Chen, Peter Geis > > > > On Wed, May 04, 2022 at 02:49:07PM +0200, Nicolas Frattaroli wrote: > > > > > Good day, > > > > > > > > > > a user on the #linux-rockchip channel on the Libera.chat IRC network > > > > > reported that their RK3568 was no longer getting a CPU and GPU clock > > > > > from scmi and consequently not booting when using linux-next. This > > > > > was bisected down to the following commit: OK I missed to read the above properly earlier. If scmi probe failure is resulting in Linux boot failure, then that is another bug that needs fixing. Why does not getting CPU clock block the boot. I would like to see the boot logs. I considered this issue to be non-fatal and must be just ending up disabling all SCMI communication. But the reported issue is boot failure which sounds like another/different bug and I would like that to be fixed first before we push the workaround for the reported issue so that it is not ignored. Has anyone analysed why the absence of CPU clock results in boot failure ? Are you running the upstream kernel itself ? -- Regards, Sudeep _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug 2022-05-05 10:10 ` Sudeep Holla @ 2022-05-05 13:42 ` Nicolas Frattaroli 0 siblings, 0 replies; 13+ messages in thread From: Nicolas Frattaroli @ 2022-05-05 13:42 UTC (permalink / raw) To: Sudeep Holla Cc: Cristian Marussi, Etienne Carriere, linux-arm-kernel, linux-rockchip, Heiko Stuebner, Liang Chen, linux-kernel, Kever Yang, Jeffy Chen, Peter Geis On Donnerstag, 5. Mai 2022 12:10:32 CEST Sudeep Holla wrote: > > > > > On Wed, May 04, 2022 at 02:49:07PM +0200, Nicolas Frattaroli wrote: > > > > > > Good day, > > > > > > > > > > > > a user on the #linux-rockchip channel on the Libera.chat IRC network > > > > > > reported that their RK3568 was no longer getting a CPU and GPU clock > > > > > > from scmi and consequently not booting when using linux-next. This > > > > > > was bisected down to the following commit: > > OK I missed to read the above properly earlier. If scmi probe failure is > resulting in Linux boot failure, then that is another bug that needs fixing. > Why does not getting CPU clock block the boot. I would like to see the boot > logs. I considered this issue to be non-fatal and must be just ending up > disabling all SCMI communication. But the reported issue is boot failure > which sounds like another/different bug and I would like that to be fixed > first before we push the workaround for the reported issue so that it is > not ignored. > > Has anyone analysed why the absence of CPU clock results in boot failure ? > Are you running the upstream kernel itself ? > > Hello, I'm sorry, I seem to have misinterpreted the original user's messages as having been a boot failure. Upon re-reading the logs, this doesn't seem to have been explicitly mentioned. I therefore assume this wasn't causing a failure to boot. Sadly the user isn't in the IRC channel at this moment so I cannot ask them further questions. I have tested this out on my own RK3566 based platform, and found that we get the following: $ sudo dmesg | grep arm-scmi [ 0.247134] arm-scmi firmware:scmi: Enabled polling mode TX channel - prot_id:16 [ 0.247526] arm-scmi firmware:scmi: SCMI Notifications - Core Enabled. [ 0.247760] arm-scmi firmware:scmi: Malformed reply - real_sz:8 calc_sz:4 [ 0.247773] arm-scmi firmware:scmi: SCMI Protocol v2.0 'rockchip:' Firmware version 0x0 [ 0.247920] arm-scmi firmware:scmi: SCMI protocol 20 not implemented [ 1.545441] arm-scmi firmware:scmi: Failed. SCMI protocol 20 not active. [ 1.562958] arm-scmi firmware:scmi: Failed. SCMI protocol 23 not active. [ 1.565676] arm-scmi firmware:scmi: Failed. SCMI protocol 22 not active. [ 2.094446] arm-scmi firmware:scmi: Failed. SCMI protocol 21 not active. [ 2.103474] arm-scmi firmware:scmi: Failed. SCMI protocol 19 not active. [ 5.586871] arm-scmi firmware:scmi: Failed. SCMI protocol 17 not active. [ 5.593178] arm-scmi firmware:scmi: Failed. SCMI protocol 21 not active. $ sudo dmesg | grep clk [ 18.255901] panfrost fde60000.gpu: clk init failed -517 [ 18.686720] panfrost fde60000.gpu: clk init failed -517 The system does boot, it's just awfully slow. This is not a boot failure, but arguably still a pretty bad failure mode to find oneself in. Sorry for the confusion this caused. Regards, Nicolas Frattaroli _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug 2022-05-05 9:40 ` Etienne Carriere 2022-05-05 10:10 ` Sudeep Holla @ 2022-05-05 10:47 ` Cristian Marussi 2022-05-05 14:21 ` Cristian Marussi 2022-05-12 11:11 ` Nicolas Frattaroli 2 siblings, 1 reply; 13+ messages in thread From: Cristian Marussi @ 2022-05-05 10:47 UTC (permalink / raw) To: Etienne Carriere Cc: Nicolas Frattaroli, Sudeep Holla, linux-arm-kernel, linux-rockchip, Heiko Stuebner, Liang Chen, linux-kernel, Kever Yang, Jeffy Chen, Peter Geis On Thu, May 05, 2022 at 11:40:09AM +0200, Etienne Carriere wrote: > Hello Nicolas, Cristian, > > On Thu, 5 May 2022 at 10:03, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > On Wed, May 04, 2022 at 07:51:45PM +0200, Nicolas Frattaroli wrote: > > > On Mittwoch, 4. Mai 2022 15:21:30 CEST Sudeep Holla wrote: > > > > + Cristian > > > > +Etieenne > > > > Hi Nicolas, > > Hi Etienne, [snip] > > Having a quick look at TF-A SCMI code in charge of this message (at least in > > the upstream): > > > > https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tree/drivers/scmi-msg/base.c#n136 > > > > it seems to me that the bug lies in the fact that the BASE_DISCOVER_PROTOCOLS > > response message built by the function above is not properly sized: the trailing > > message payload carrying the list of protocols (after returned_protos field) returns > > always a fixed arbitrarily sized payload, possibly zeroed if fewer protocols are > > carried. > > > > IOW, even though the answer in this case carries 3 items/protocols, the payload > > is anyway 8 bytes (with the last 5 bytes zeroed), while by the spec it should have > > been just 4 bytes. > > > > (in fact testing the kernel fix on a JUNO with last SCP fw release did NOT expose > > any issue...) > > > > I think a fix FW side could be something along these lines (UNTESTED NOR > > BUILT ! ... I Cc'ed Etienne that seems the author of this bit) > > > > This basically mirrors the same checks introduced in kernel...if someone > > is fancy/able to test it.... > > Indeed the firmware implementation is wrong in TF-A. > And also in OP-TEE by the way: > https://github.com/OP-TEE/optee_os/blob/3.17.0/core/drivers/scmi-msg/base.c#L163-L166 > > @Nicoals, do you want to send a patch to TF-A, or do you want me to do it? > > I can fix the optee_os implementation. I'll tell you when I'll have > created a P-R. > The fix is the same for TF-A and OP-TEE. > Proposal from Cristian looks good to me, maybe simplified: > > ```patch > memcpy(outargs, &p2a, sizeof(p2a)); > memcpy(outargs + sizeof(p2a), list + a2p->skip, count); > > - scmi_write_response(msg, outargs, sizeof(outargs)); > + list_sz = (1 + (count - 1) / sizeof(uint32_t)) * sizeof(uint32_t); > + scmi_write_response(msg, outargs, sizeof(p2a) + list_sz); > ``` > I don't think list_sz is properly calculated if you don't rule out the count = 0 case (did the same mistake in Kernel at first :D)...if count is zero list_sz ends up being 4 [(1 + (0 - 1) / 4 ) * 4] while it should be zero. (...and 'if (count)' also avoid an unneeded memcpy of zero bytes) Moreover reviewing my own proposal code below it's probably easier to use some sort of macro like ALIGN(count, 4) if there's one in TF-A/OP-TEE. (...checking anyway that can handle correctly the zero case..) Thanks, Etienne _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug 2022-05-05 10:47 ` Cristian Marussi @ 2022-05-05 14:21 ` Cristian Marussi 0 siblings, 0 replies; 13+ messages in thread From: Cristian Marussi @ 2022-05-05 14:21 UTC (permalink / raw) To: Etienne Carriere Cc: Nicolas Frattaroli, Sudeep Holla, linux-arm-kernel, linux-rockchip, Heiko Stuebner, Liang Chen, linux-kernel, Kever Yang, Jeffy Chen, Peter Geis On Thu, May 05, 2022 at 11:47:41AM +0100, Cristian Marussi wrote: > On Thu, May 05, 2022 at 11:40:09AM +0200, Etienne Carriere wrote: > > Hello Nicolas, Cristian, > > > > On Thu, 5 May 2022 at 10:03, Cristian Marussi <cristian.marussi@arm.com> wrote: > > > > > > On Wed, May 04, 2022 at 07:51:45PM +0200, Nicolas Frattaroli wrote: > > > > On Mittwoch, 4. Mai 2022 15:21:30 CEST Sudeep Holla wrote: > > > > > + Cristian > > > > > > +Etieenne > > > > > > Hi Nicolas, > > > > > Hi Etienne, > > [snip] > > > > Having a quick look at TF-A SCMI code in charge of this message (at least in > > > the upstream): > > > > > > https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tree/drivers/scmi-msg/base.c#n136 > > > > > > it seems to me that the bug lies in the fact that the BASE_DISCOVER_PROTOCOLS > > > response message built by the function above is not properly sized: the trailing > > > message payload carrying the list of protocols (after returned_protos field) returns > > > always a fixed arbitrarily sized payload, possibly zeroed if fewer protocols are > > > carried. > > > > > > IOW, even though the answer in this case carries 3 items/protocols, the payload > > > is anyway 8 bytes (with the last 5 bytes zeroed), while by the spec it should have > > > been just 4 bytes. > > > > > > (in fact testing the kernel fix on a JUNO with last SCP fw release did NOT expose > > > any issue...) > > > > > > I think a fix FW side could be something along these lines (UNTESTED NOR > > > BUILT ! ... I Cc'ed Etienne that seems the author of this bit) > > > > > > This basically mirrors the same checks introduced in kernel...if someone > > > is fancy/able to test it.... > > > > Indeed the firmware implementation is wrong in TF-A. > > And also in OP-TEE by the way: > > https://github.com/OP-TEE/optee_os/blob/3.17.0/core/drivers/scmi-msg/base.c#L163-L166 > > > > @Nicoals, do you want to send a patch to TF-A, or do you want me to do it? > > > > I can fix the optee_os implementation. I'll tell you when I'll have > > created a P-R. > > The fix is the same for TF-A and OP-TEE. > > Proposal from Cristian looks good to me, maybe simplified: > > > > ```patch > > memcpy(outargs, &p2a, sizeof(p2a)); > > memcpy(outargs + sizeof(p2a), list + a2p->skip, count); > > > > - scmi_write_response(msg, outargs, sizeof(outargs)); > > + list_sz = (1 + (count - 1) / sizeof(uint32_t)) * sizeof(uint32_t); > > + scmi_write_response(msg, outargs, sizeof(p2a) + list_sz); > > ``` > > > > I don't think list_sz is properly calculated if you don't rule out the > count = 0 case (did the same mistake in Kernel at first :D)...if count is > zero list_sz ends up being 4 [(1 + (0 - 1) / 4 ) * 4] while it should be > zero. (...and 'if (count)' also avoid an unneeded memcpy of zero bytes) > > Moreover reviewing my own proposal code below it's probably easier to use > some sort of macro like ALIGN(count, 4) if there's one in TF-A/OP-TEE. > (...checking anyway that can handle correctly the zero case..) > > Thanks, > Etienne > Sorry this was meant to be Thanks, Cristian :P but I messed up the snipping Cristian _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug 2022-05-05 9:40 ` Etienne Carriere 2022-05-05 10:10 ` Sudeep Holla 2022-05-05 10:47 ` Cristian Marussi @ 2022-05-12 11:11 ` Nicolas Frattaroli 2022-05-12 14:34 ` Sudeep Holla 2 siblings, 1 reply; 13+ messages in thread From: Nicolas Frattaroli @ 2022-05-12 11:11 UTC (permalink / raw) To: Cristian Marussi, Etienne Carriere Cc: Sudeep Holla, linux-arm-kernel, linux-rockchip, Heiko Stuebner, Liang Chen, linux-kernel, Kever Yang, Jeffy Chen, Peter Geis Hello, sorry for the late reply, completely missed that there was a question for me in this mail. On Donnerstag, 5. Mai 2022 11:40:09 CEST Etienne Carriere wrote: > Hello Nicolas, Cristian, > [...] > > Indeed the firmware implementation is wrong in TF-A. > And also in OP-TEE by the way: > https://github.com/OP-TEE/optee_os/blob/3.17.0/core/drivers/scmi-msg/base.c#L163-L166 > > @Nicoals, do you want to send a patch to TF-A, or do you want me to do it? I have no experience with TF-A, so I'd prefer if you could do it. In good news, Rockchip has confirmed they're preparing to release RK356x TF-A sources, so I'll be able to port the patch over to their sources once they are released, if they don't already apply it themselves. > > I can fix the optee_os implementation. I'll tell you when I'll have > created a P-R. > The fix is the same for TF-A and OP-TEE. > Proposal from Cristian looks good to me, maybe simplified: > > ```patch > memcpy(outargs, &p2a, sizeof(p2a)); > memcpy(outargs + sizeof(p2a), list + a2p->skip, count); > > - scmi_write_response(msg, outargs, sizeof(outargs)); > + list_sz = (1 + (count - 1) / sizeof(uint32_t)) * sizeof(uint32_t); > + scmi_write_response(msg, outargs, sizeof(p2a) + list_sz); > ``` > [...] > > BR, > Etienne Regards, Nicolas Frattaroli _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug 2022-05-12 11:11 ` Nicolas Frattaroli @ 2022-05-12 14:34 ` Sudeep Holla 2022-05-13 8:44 ` Kever Yang 2022-05-13 8:58 ` Etienne Carriere 0 siblings, 2 replies; 13+ messages in thread From: Sudeep Holla @ 2022-05-12 14:34 UTC (permalink / raw) To: Nicolas Frattaroli Cc: Cristian Marussi, Etienne Carriere, linux-arm-kernel, linux-rockchip, Heiko Stuebner, Liang Chen, linux-kernel, Kever Yang, Jeffy Chen, Peter Geis On Thu, May 12, 2022 at 01:11:22PM +0200, Nicolas Frattaroli wrote: > Hello, > > sorry for the late reply, completely missed that there was a question > for me in this mail. > > On Donnerstag, 5. Mai 2022 11:40:09 CEST Etienne Carriere wrote: > > Hello Nicolas, Cristian, > > [...] > > > > Indeed the firmware implementation is wrong in TF-A. > > And also in OP-TEE by the way: > > https://github.com/OP-TEE/optee_os/blob/3.17.0/core/drivers/scmi-msg/base.c#L163-L166 > > > > @Nicoals, do you want to send a patch to TF-A, or do you want me to do it? > > I have no experience with TF-A, so I'd prefer if you could do it. > > In good news, Rockchip has confirmed they're preparing to release RK356x > TF-A sources, so I'll be able to port the patch over to their sources once > they are released, if they don't already apply it themselves. > So, there is no way to get a blob release with the patch applied ? We know it is a bug in TF-A and if Rockchip is using that codebase, it will be the same bug there too causing this issue. Waiting until the code is released and the proper TF-A port is done may not be acceptable for many developers. So someone from the rockchip doing these tf-a blob release must get involved here, understand the situation and get the fixed. We can workaround, but I want to hear that it will be fixed instead of getting ignored until proper port is available. Etienne, are you not the author for the related TF-A code ? How can we get that fixed in the TF-A code base. If you are not, then I will try to get my repo login credentials sorted so that I can only push TF-A change. -- Regards, Sudeep _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug 2022-05-12 14:34 ` Sudeep Holla @ 2022-05-13 8:44 ` Kever Yang 2022-05-13 8:58 ` Etienne Carriere 1 sibling, 0 replies; 13+ messages in thread From: Kever Yang @ 2022-05-13 8:44 UTC (permalink / raw) To: Sudeep Holla, Nicolas Frattaroli Cc: Cristian Marussi, Etienne Carriere, linux-arm-kernel, linux-rockchip, Heiko Stuebner, Liang Chen, linux-kernel, Jeffy Chen, Peter Geis Hi Sudeep, Rockchip has notice this issue, we will check how to fix this issue in our ATF release. Thanks, - Kever On 2022/5/12 22:34, Sudeep Holla wrote: > On Thu, May 12, 2022 at 01:11:22PM +0200, Nicolas Frattaroli wrote: >> Hello, >> >> sorry for the late reply, completely missed that there was a question >> for me in this mail. >> >> On Donnerstag, 5. Mai 2022 11:40:09 CEST Etienne Carriere wrote: >>> Hello Nicolas, Cristian, >>> [...] >>> >>> Indeed the firmware implementation is wrong in TF-A. >>> And also in OP-TEE by the way: >>> https://github.com/OP-TEE/optee_os/blob/3.17.0/core/drivers/scmi-msg/base.c#L163-L166 >>> >>> @Nicoals, do you want to send a patch to TF-A, or do you want me to do it? >> I have no experience with TF-A, so I'd prefer if you could do it. >> >> In good news, Rockchip has confirmed they're preparing to release RK356x >> TF-A sources, so I'll be able to port the patch over to their sources once >> they are released, if they don't already apply it themselves. >> > So, there is no way to get a blob release with the patch applied ? > We know it is a bug in TF-A and if Rockchip is using that codebase, it > will be the same bug there too causing this issue. Waiting until the > code is released and the proper TF-A port is done may not be acceptable > for many developers. So someone from the rockchip doing these tf-a blob > release must get involved here, understand the situation and get the fixed. > > We can workaround, but I want to hear that it will be fixed instead of > getting ignored until proper port is available. > > Etienne, are you not the author for the related TF-A code ? How can we > get that fixed in the TF-A code base. If you are not, then I will try to > get my repo login credentials sorted so that I can only push TF-A change. > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug 2022-05-12 14:34 ` Sudeep Holla 2022-05-13 8:44 ` Kever Yang @ 2022-05-13 8:58 ` Etienne Carriere 1 sibling, 0 replies; 13+ messages in thread From: Etienne Carriere @ 2022-05-13 8:58 UTC (permalink / raw) To: Sudeep Holla Cc: Nicolas Frattaroli, Cristian Marussi, linux-arm-kernel, linux-rockchip, Heiko Stuebner, Liang Chen, linux-kernel, Kever Yang, Jeffy Chen, Peter Geis Hello all, On Thu, 12 May 2022 at 16:34, Sudeep Holla <sudeep.holla@arm.com> wrote: > > On Thu, May 12, 2022 at 01:11:22PM +0200, Nicolas Frattaroli wrote: > > Hello, > > > > sorry for the late reply, completely missed that there was a question > > for me in this mail. > > > > On Donnerstag, 5. Mai 2022 11:40:09 CEST Etienne Carriere wrote: > > > Hello Nicolas, Cristian, > > > [...] > > > > > > Indeed the firmware implementation is wrong in TF-A. > > > And also in OP-TEE by the way: > > > https://github.com/OP-TEE/optee_os/blob/3.17.0/core/drivers/scmi-msg/base.c#L163-L166 > > > > > > @Nicoals, do you want to send a patch to TF-A, or do you want me to do it? > > > > I have no experience with TF-A, so I'd prefer if you could do it. > > > > In good news, Rockchip has confirmed they're preparing to release RK356x > > TF-A sources, so I'll be able to port the patch over to their sources once > > they are released, if they don't already apply it themselves. > > > > So, there is no way to get a blob release with the patch applied ? > We know it is a bug in TF-A and if Rockchip is using that codebase, it > will be the same bug there too causing this issue. Waiting until the > code is released and the proper TF-A port is done may not be acceptable > for many developers. So someone from the rockchip doing these tf-a blob > release must get involved here, understand the situation and get the fixed. > > We can workaround, but I want to hear that it will be fixed instead of > getting ignored until proper port is available. > > Etienne, are you not the author for the related TF-A code ? How can we > get that fixed in the TF-A code base. If you are not, then I will try to > get my repo login credentials sorted so that I can only push TF-A change. Yes, I'm the author of the code. I have created a gerrit change in TF-A to fix the issue: https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/15196 OP-TEE implementation also embeds the same scmi-msg drivers and have the same issue. I should be fixed by P-R: https://github.com/OP-TEE/optee_os/pull/5334 I will ask TF-A if this specific change can hit TF-A release v2.7 tag, if posisble. Feedbacks are welcome on these 2 forums. Regards, etienne > > -- > Regards, > Sudeep _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2022-05-13 8:59 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-05-04 12:49 [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug Nicolas Frattaroli 2022-05-04 13:21 ` Sudeep Holla 2022-05-04 17:51 ` Nicolas Frattaroli 2022-05-05 8:03 ` Cristian Marussi 2022-05-05 9:40 ` Etienne Carriere 2022-05-05 10:10 ` Sudeep Holla 2022-05-05 13:42 ` Nicolas Frattaroli 2022-05-05 10:47 ` Cristian Marussi 2022-05-05 14:21 ` Cristian Marussi 2022-05-12 11:11 ` Nicolas Frattaroli 2022-05-12 14:34 ` Sudeep Holla 2022-05-13 8:44 ` Kever Yang 2022-05-13 8:58 ` Etienne Carriere
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).