From: Nicolas Frattaroli
To: Sudeep Holla
Cc: linux-arm-kernel@lists.infradead.org, linux-rockchip@lists.infradead.org,
    Cristian Marussi, Heiko Stuebner, Liang Chen, linux-kernel@vger.kernel.org,
    Kever Yang, Jeffy Chen, Peter Geis
Subject: Re: [BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug
Date: Wed, 04 May 2022 19:51:45 +0200
Message-ID: <3764923.NsmnsBrXv5@archbook>
In-Reply-To: <20220504132130.wmmmge6qjc675jw6@bogus>
References: <1698297.NAKyZzlH2u@archbook> <20220504132130.wmmmge6qjc675jw6@bogus>

On Wednesday, 4 May 2022 15:21:30 CEST Sudeep Holla wrote:
> + Cristian
>
> Hi Nicolas,
>
> Thanks for the formal report.
>
> On Wed, May 04, 2022 at 02:49:07PM +0200, Nicolas Frattaroli wrote:
> > Good day,
> >
> > a user on the #linux-rockchip channel on the Libera.chat IRC network
> > reported that their RK3568 was no longer getting a CPU and GPU clock
> > from scmi and consequently not booting when using linux-next.
> > This was bisected down to the following commit:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/firmware/arm_scmi/base.c?h=next-20220503&id=3b0041f6e10e5bdbb646d98172be43e88734ed62
> >
> > The error message in the log is as follows:
> >
> > arm-scmi firmware:scmi: Malformed reply - real_sz:8 calc_sz:4, t->rx.len is 12, sizeof(u32) is 4, loop_num_ret is 3
> >
> > The rockchip firmware (bl31) being used was v1.32, from here:
> >
> > https://github.com/JeffyCN/rockchip_mirrors/blob/rkbin/bin/rk35/rk3568_bl31_v1.32.elf
> >
>
> So this platform is not supported in upstream TF-A like its predecessors ?

Hello,

it is not yet supported by upstream. Rockchip plans to release the sources
for it at some point if I recall correctly, but I believe their software team
has been very busy due to new hardware releases, so it hasn't happened yet.

I hope we'll see an open source release of the TF-A sources eventually, so
that for bugs like this we can always fix them without the vendor needing to
do it for us.

>
> > This seems like a non-fatal firmware bug, for which a kernel workaround is
> > certainly possible, but it would be good if rockchip could fix this in their
> > firmware.
> >
>
> Indeed, we added this check finding issue in one of our tests. Luckily
> it helped to unearth the same issue on this platform, but due to the
> nature of its f/w release, it is bit unfortunate that it can't be fixed
> easily and quickly. But I really wish this gets fixed in the firmware.
> Are there any other f/w bugs reported so far ? If so how are they fixed
> as I don't expect all such bugs can be worked around in the kernel though
> this might be. I would like to hear details there if possible.

I'm not aware of how the rockchip bug report workflow works. They seemingly
did update the firmware multiple times, last in October of 2021.
The official rockchip repository at [1] hasn't been kept as up to date as the
mirror maintained by a rockchip employee at [2], so most people seem to have
been using the latter. Speaking of which, I'll add the owner of that repo to
the CC of this thread to make sure this doesn't get lost.

Rockchip lists an e-mail address at [3] for reporting issues, but this seems
to relate to their open-source documentation. The official GitHub repository
of "rkbin" in the "rockchip-linux" organisation does not have issues enabled,
so submitting a bug report through that is unfortunately not possible.

>
> > The user going by "amazingfate" reported that commenting out the
> > ret = -EPROTO; break;
> > fixes the issue for them.
> >
>
> Sure, or we could relax the check as calc_sz <= real_sz or something so
> that the reverse is still caught and handled as OS might read junk data in
> the later case.

This seems like a good solution. That way we're unlikely to ever run into a
situation where the kernel does the wrong thing here, even if we're less
strict about the check. In either case it should still print a dev_err,
though, as it remains an error even if we can tolerate it in some cases.

>
> > I'm writing here to get the discussion started on how we can resolve this
> > before the Linux 5.19 release.
> >
>
> Agreed, I just sent by pull request for this literally few hours ago.
>
> > Sudeep Holla has already told me they'll gladly add a workaround before
> > the 5.19 release, but would rather see this fixed in the vendor firmware
> > first. Would rockchip be able and willing to fix it and publish a new
> > bl31 for rk3568?
> >
>
> Indeed and as mentioned above details on how other such f/w bugs are dealt
> in general esp that the firmware is blob release and one can't fix it easily.
> Do we have a bugzilla kind of setup to report and get the bugs fixed ?

It's worth mentioning that even if we get Rockchip to fix the bug in the
firmware, I believe Linux should still add a workaround, as otherwise people
running older firmware who upgrade their kernels could suddenly have
unbootable systems and not know why that happened.

Regards,
Nicolas Frattaroli

PS: I've also CC'd Peter Geis as he has worked on the RK356x support in
mainline a bunch and I believe he has been in contact with Rockchip about
releasing the TF-A sources before.

[1]: https://github.com/rockchip-linux/rkbin
[2]: https://github.com/JeffyCN/rockchip_mirrors/tree/rkbin
[3]: http://opensource.rock-chips.com/wiki_Main_Page
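
For reference, the relaxed check suggested in the thread (tolerate a reply
that is longer than expected, still reject one that is shorter) could look
roughly like the standalone C sketch below. This is not the kernel code:
expected_payload_size() and check_reply_size() are hypothetical names,
fprintf() stands in for dev_err(), and rounding the protocol count up to
whole 32-bit words is an assumption, chosen because it reproduces the
real_sz:8 and calc_sz:4 values from the log quoted above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: payload bytes needed for num_ids one-byte protocol
 * IDs, rounded up to a whole number of 32-bit words (assumed layout).
 */
static size_t expected_payload_size(size_t num_ids)
{
	if (num_ids == 0)
		return 0;
	return (1 + (num_ids - 1) / sizeof(uint32_t)) * sizeof(uint32_t);
}

/*
 * Hypothetical relaxed check. rx_len is the full reply length, including
 * one leading 32-bit word that is not part of the ID list.
 * Returns 0 if the reply is usable, -EPROTO if it is too short.
 */
static int check_reply_size(size_t rx_len, size_t num_ids)
{
	size_t calc_sz = expected_payload_size(num_ids);
	size_t real_sz;

	/* A reply without even the leading word cannot be parsed. */
	if (rx_len < sizeof(uint32_t))
		return -EPROTO;

	real_sz = rx_len - sizeof(uint32_t);

	/* Any mismatch is still reported, even when it is tolerated. */
	if (calc_sz != real_sz)
		fprintf(stderr, "Malformed reply - real_sz:%zu calc_sz:%zu\n",
			real_sz, calc_sz);

	/* Too short: the OS would read junk past the reply, so reject it. */
	if (calc_sz > real_sz)
		return -EPROTO;

	/* Exact or oversized: the extra bytes are simply ignored. */
	return 0;
}
```

With the values from the log (rx_len of 12 and 3 protocol IDs), this sketch
logs the mismatch and returns 0, so the caller can carry on. A reply with
rx_len of 4 for the same 3 IDs returns -EPROTO.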