From: ebiederm@xmission.com (Eric W. Biederman)
To: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Cc: Indranil Choudhury <indranil@chelsio.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Nirranjan Kirubaharan <nirranjan@chelsio.com>,
"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"stephen@networkplumber.org" <stephen@networkplumber.org>,
"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"torvalds@linux-foundation.org" <torvalds@linux-foundation.org>,
"davem@davemloft.net" <davem@davemloft.net>,
Ganesh GR <ganeshgr@chelsio.com>
Subject: Re: [PATCH net-next v2 0/2] kernel: add support to collect hardware logs in crash recovery kernel
Date: Tue, 27 Mar 2018 10:59:50 -0500
Message-ID: <87tvt1ecg9.fsf@xmission.com>
In-Reply-To: <20180327152715.GA18097@chelsio.com> (Rahul Lakkireddy's message of "Tue, 27 Mar 2018 20:57:16 +0530")
Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> writes:
> On Tuesday, March 03/27/18, 2018 at 18:47:34 +0530, Eric W. Biederman wrote:
>> Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> writes:
>>
>> > On Saturday, March 03/24/18, 2018 at 20:50:52 +0530, Eric W. Biederman wrote:
>> >>
>> >> Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> writes:
>> >>
>> >> > On production servers running variety of workloads over time, kernel
>> >> > panic can happen sporadically after days or even months. It is
>> >> > important to collect as much debug logs as possible to root cause
>> >> > and fix the problem, that may not be easy to reproduce. Snapshot of
>> >> > underlying hardware/firmware state (like register dump, firmware
>> >> > logs, adapter memory, etc.), at the time of kernel panic will be very
>> >> > helpful while debugging the culprit device driver.
>> >> >
>> >> > This series of patches add new generic framework that enable device
>> >> > drivers to collect device specific snapshot of the hardware/firmware
>> >> > state of the underlying device in the crash recovery kernel. In crash
>> >> > recovery kernel, the collected logs are exposed via /sys/kernel/crashdd/
>> >> > directory, which is copied by user space scripts for post-analysis.
>> >> >
>> >> > A kernel module crashdd is newly added. In crash recovery kernel,
>> >> > crashdd exposes /sys/kernel/crashdd/ directory containing device
>> >> > specific hardware/firmware logs.
>> >>
>> >> Have you looked at instead of adding a sysfs file adding the dumps
>> >> as additional elf notes in /proc/vmcore?
>> >>
>> >
>> > I see the crash recovery kernel's memory is not present in any of
>> > the PT_LOAD headers. So, makedumpfile is not collecting the dumps
>> > that are in crash recovery kernel's memory.
>> >
>> > Also, are you suggesting exporting the dumps themselves as PT_NOTE
>> > instead? I'll look into doing it this way.
>>
>> Yes. I was suggesting exporting the dumps themselves as PT_NOTE
>> in /proc/vmcore. I think that will allow makedumpfile to collect
>> your new information without modification.
>>
>
> If I export the dumps themselves as PT_NOTE in /proc/vmcore, can the
> crash tool work without modification; i.e can crash tool extract these
> notes?

I believe crash would need to be taught about these notes. This is
something new.

However "readelf -a random_elf_file" does display elf notes, and elf
notes in general are not hard to extract.

What I expect from an encoding in ELF core dump format is a way to
capture the data, a way to encode the data, and a way to transport the
data to the people who care. Analysis tools are easy enough after the
fact.

Eric
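
To illustrate the point that ELF notes are not hard to encode or extract,
here is a minimal sketch in Python. It is not taken from the patch series
or from any existing tool. It assumes the common core-dump note layout: a
header of three 32-bit words (n_namesz, n_descsz, n_type) followed by the
name and the descriptor, each padded to a 4-byte boundary, in little-endian
byte order. The owner name "DEMO", the note types, and the payloads are
hypothetical placeholders, not values used by any driver.

```python
import struct

# Note header: n_namesz, n_descsz, n_type (little-endian is an assumption).
NHDR = struct.Struct("<III")


def _pad4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def build_note(name, note_type, desc):
    """Encode one note: header, NUL-terminated name, descriptor."""
    name_bytes = name.encode("ascii") + b"\0"
    out = NHDR.pack(len(name_bytes), len(desc), note_type)
    out += name_bytes.ljust(_pad4(len(name_bytes)), b"\0")
    out += desc.ljust(_pad4(len(desc)), b"\0")
    return out


def parse_notes(buf):
    """Walk the raw bytes of a note segment, returning (name, type, desc)."""
    notes = []
    off = 0
    while off + NHDR.size <= len(buf):
        namesz, descsz, ntype = NHDR.unpack_from(buf, off)
        off += NHDR.size
        name = buf[off:off + namesz].rstrip(b"\0").decode("ascii")
        off += _pad4(namesz)
        desc = buf[off:off + descsz]
        off += _pad4(descsz)
        notes.append((name, ntype, desc))
    return notes


# Hypothetical payloads standing in for a device dump.
segment = build_note("DEMO", 1, b"abc") + build_note("DEMO", 2, b"12345678")
for name, ntype, desc in parse_notes(segment):
    print(name, ntype, desc)
```

The same walk would apply to the bytes of a PT_NOTE segment read out of a
saved vmcore, once the program headers have been used to locate it.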