From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mian Yousaf Kaukab
Subject: Re: arm64: tegra186: bpmp: kernel crash while decompressing initrd
Date: Mon, 11 May 2020 17:23:30 +0200
Message-ID: <20200511152330.GA1718@suse.de>
References: <20200508084041.23366-1-ykaukab@suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-tegra-owner@vger.kernel.org
To: Robin Murphy
Cc: talho@nvidia.com, thierry.reding@gmail.com, jonathanh@nvidia.com,
    linux-tegra@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, afaerber@suse.de
List-Id: linux-tegra@vger.kernel.org

On Mon, May 11, 2020 at 12:25:00PM +0100, Robin Murphy wrote:
> On 2020-05-08 9:40 am, Mian Yousaf Kaukab wrote:
> > I am seeing the following kernel crash on Jetson TX2. The board is
> > flashed with firmware bits from L4T R32.4.2 with upstream u-boot. The
> > crash always happens while decompressing the initrd. The initrd is
> > approximately 80 MiB in size and compressed with xz
> > (xz --check=crc32 --lzma2=dict=32MiB). The crash is not observed if
> > the same initrd is compressed with gzip.
> > [1] was a previous attempt to work around the same issue.
> >
> > [    0.651168] Trying to unpack rootfs image as initramfs...
> > [    2.890171] SError Interrupt on CPU0, code 0xbf40c000 -- SError
> > [    2.890174] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G S                5.7.0-rc4-next-20200505 #22
> > [    2.890175] Hardware name: nvidia p2771-0000/p2771-0000, BIOS 2020.04-rc3 03/25/2020
> > [    2.890176] pstate: 20000005 (nzCv daif -PAN -UAO BTYPE=--)
> > [    2.890177] pc : lzma_main+0x648/0x908
> > [    2.890178] lr : lzma_main+0x330/0x908
> > [    2.890179] sp : ffff80001003bb70
> > [    2.890180] x29: ffff80001003bb70 x28: 0000000004d794a4
> > [    2.890183] x27: 0000000004769941 x26: ffff0001eb064000
> > [    2.890185] x25: ffff0001eb060028 x24: 0000000000000002
> > [    2.890187] x23: 0000000000000003 x22: 0000000000000007
> > [    2.890189] x21: 0000000000611f4b x20: ffff0001eb060000
> > [    2.890192] x19: ffff80001003bcb8 x18: 0000000000000068
> > [    2.890194] x17: 00000000000000c0 x16: fffffe00076b2108
> > [    2.890196] x15: 0000000000000800 x14: 0000000000ffffff
> > [    2.890198] x13: 0000000000000001 x12: ffff0001eb060000
> > [    2.890200] x11: 0000000000000600 x10: ffff0001eb060028
> > [    2.890202] x9 : 00000000ffbb2a08 x8 : 0000000000000ed0
> > [    2.890204] x7 : 00000000011553ec x6 : 0000000000000000
> > [    2.890206] x5 : 0000000000000000 x4 : 0000000000000006
> > [    2.890208] x3 : 00000000015a29e4 x2 : ffff0001eb062d0c
> > [    2.890210] x1 : 000000000000000c x0 : 000000000263de44
> >
> > With some debugging aid ported from Nvidia downstream kernel [2] the
> > actual cause was found:
> >
> > [    0.761525] Trying to unpack rootfs image as initramfs...
> > [    2.955499] CPU0: SError: mpidr=0x80000100, esr=0xbf40c000
> > [    2.955502] CPU1: SError: mpidr=0x80000000, esr=0xbe000000
> > [    2.955505] CPU2: SError: mpidr=0x80000001, esr=0xbe000000
> > [    2.955506] CPU3: SError: mpidr=0x80000101, esr=0xbf40c000
> > [    2.955507] ROC:CCE Machine Check Error:
> > [    2.955508] ROC:CCE Registers:
> > [    2.955509]  STAT: 0xb400000000400415
> > [    2.955510]  ADDR: 0x400c00e7a00c
> > [    2.955511]  MSC1: 0x80ffc
> > [    2.955512]  MSC2: 0x3900000000800
> > [    2.955513] --------------------------------------
> > [    2.955514] Decoded ROC:CCE Machine Check:
> > [    2.955515]  Uncorrected (this is fatal)
> > [    2.955516]  Error reporting enabled when error arrived
> > [    2.955517]  Error Code = 0x415
> > [    2.955518]  Poison Error
> > [    2.955518]  Command = NCRd (0xc)
> > [    2.955519]  Address Type = Non-Secure DRAM
> > [    2.955521]  Address = 0x30039e80 -- 30000000.sysram + 0x39e80
> > [    2.955521]  TLimit = 0x3ff
> > [    2.955522]  Poison Error Mask = 0x80
> > [    2.955523]  More Info = 0x800
> > [    2.955524]  Timeout Info = 0x0
> > [    2.955525]          Poison Info = 0x800
> > [    2.955526]          Read Request failed GSC checks
> > [    2.955527]  Source = L2_1 (A57) (0x1)
> > [    2.955528]  TID = 0xe
> >
> > IIUC, there was a read request for 0x30039e80 from EL1/2 which failed.
> > This address falls in the sysram security aperture and hence a read
> > from normal mode failed.
> >
> > sysram is mapped at 0x3000_0000 to 0x3004_ffff and is managed by the
> > sram driver (drivers/misc/sram.c). There are two reserved pools for
> > BPMP driver communication at 0x3004_e000 and 0x3004_f000 of 0x1000
> > bytes each.
> >
> > The sram driver maps the complete 0x3000_0000 to 0x3004_ffff range as
> > normal memory.
>
> That's your problem. It's not really worth attempting to reason about, the
> architecture says that anything mapped as Normal memory may be speculatively
> accessed at any time, so no amount of second-guessing is going to save you
> in general.
> Don't make stuff accessible to the kernel that it doesn't need
> to access, and especially don't make stuff accessible to the kernel if
> accessing it will kill the system.
>
I agree, and [1] was an attempt in that direction. What I wonder is why the
processor is speculating on an address range which the kernel has never
accessed. Is it correct behavior for the CPU to speculate in EL1/EL2 on an
address accessed in EL3?

> > However, only the BPMP reserved pools (0x3004_e000 - 0x3004_ffff)
> > are accessible from the kernel. Address 0x3003_9e80 is inaccessible
> > from the kernel and a read to it (which I believe is speculative)
> > causes the SError. The only driver which uses sysram is not
> > initialized at this point (rootfs_initcall level). This is because,
> > since
> > commit d70f5e541ab3 ("firmware: tegra: Make BPMP a regular driver")
> > the bpmp driver is initialized at device_initcall level.
> >
> > If none of the drivers on the kernel side is using the 0x3003_9e80
> > address range, why does a read to it occur, even speculatively? Could
> > it be that some EL3 software didn't clean up after itself properly?
> > Any suggestions on debugging this issue further?
> >
> > Another solution suggested in [1] was to add no-memory-wc to the
> > sysram node in the device tree so that sysram is mapped as device
> > memory, thus preventing any speculative access. However, it causes
> > another set of issues with the bpmp driver. That may be a discussion
> > for another time.
>
> AFAICS the truly correct solution is what Stephen initially suggested there
> - for the boot process to somehow describe which parts of SRAM have been
> reserved by Secure software and/or which parts remain Non-Secure, and for
> the kernel driver to only map and use the latter.
Yes, let me see what I can do.
>
> Robin.
Thanks,
Yousaf
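
P.S. For reference, here is a minimal sketch (plain Python, not kernel code) of the address arithmetic discussed above. It only restates the numbers from this thread: the sysram range, the two BPMP pools, and the faulting address from the ROC:CCE dump. The names SYSRAM, BPMP_POOLS and classify() are made up for illustration and do not correspond to anything in drivers/misc/sram.c.

```python
# Illustrative only: addresses come from the report in this thread.
# SYSRAM, BPMP_POOLS and classify() are hypothetical names for this sketch.

SYSRAM = (0x3000_0000, 0x3004_FFFF)   # inclusive range mapped by the sram driver
BPMP_POOLS = [                        # (base, size) of the two reserved pools
    (0x3004_E000, 0x1000),
    (0x3004_F000, 0x1000),
]


def classify(addr):
    """Say where addr falls relative to the sysram layout described above."""
    start, end = SYSRAM
    if not start <= addr <= end:
        return "outside sysram"
    for base, size in BPMP_POOLS:
        if base <= addr < base + size:
            return "bpmp pool"
    # Inside the mapped range but not in a pool the kernel is meant to use.
    return "sysram, not kernel-accessible"


fault = 0x3003_9E80                   # address reported by the ROC:CCE decode
print(hex(fault - SYSRAM[0]), classify(fault))
```

The faulting address lands in the part of the range that is mapped as Normal memory but is not one of the BPMP pools, which is the region that would not be mapped at all if the driver only mapped the Non-Secure parts.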