Subject: Re: [PATCH net-next] cxgb4: append firmware dump to vmcore in kernel panic
To: Jakub Kicinski, Rahul Lakkireddy
Cc: David Miller, "netdev@vger.kernel.org", Ganesh GR, Nirranjan Kirubaharan, Indranil Choudhury
References: <1518702882-29688-1-git-send-email-rahul.lakkireddy@chelsio.com> <20180216.154101.1707041533181015167.davem@davemloft.net> <20180219123416.GA7737@chelsio.com> <20180220164304.2a38c962@cakuba.netronome.com>
From: Florian Fainelli
Message-ID: <4e8620fa-daf2-b4ad-f1ba-a1b879e29ef5@gmail.com>
Date: Tue, 20 Feb 2018 16:51:03 -0800
In-Reply-To: <20180220164304.2a38c962@cakuba.netronome.com>

On 02/20/2018 04:43 PM, Jakub Kicinski wrote:
> On Mon, 19 Feb 2018 18:04:17 +0530, Rahul Lakkireddy wrote:
>> Our requirement is to analyze the state of firmware/hardware at the
>> time of kernel panic.
>
> I was wondering about this since you posted the patch and I can't come
> up with any specific scenario where kernel crash would correlate
> clearly with device state in non-trivial way.
>
> Perhaps there is something about cxgb4 HW/FW that makes this useful.
> Could you explain? Could you give a real life example of a bug?
> Is it related to the TOE-looking TLS offload Atul is posting?
>
> Is the panic you're targeting here real or manually triggered from user
> space to get a full dump of kernel and FW?
>
> That's me trying to guess what you're doing.. :)
>

One case where this might be helpful is if you are chasing down DMA
corruption and you would like to get a nearly instant capture of both
the kernel's memory and the state of the adapter that may be responsible
for it.

This is probably not 100% foolproof, because there is a timing window
during which the two dumps are taken, and that alone might influence the
captured memory view.

Just guessing of course.
-- 
Florian