From: Dave Jones <davej@codemonkey.org.uk>
To: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: "kexec@lists.infradead.org" <kexec@lists.infradead.org>
Subject: Re: makedumpfile: Fix divide by zero in print_report()
Date: Tue, 8 Oct 2019 17:37:33 -0400
Message-ID: <20191008213733.GA21304@codemonkey.org.uk>
In-Reply-To: <4AE2DC15AC0B8543882A74EA0D43DBEC0359146E@BPXM09GP.gisp.nec.co.jp>

On Mon, Oct 07, 2019 at 08:13:07PM +0000, Kazuhito Hagio wrote:
 
 > > [  518.819690] Original pages  : 0x0000000000000000
 > > [  518.828894]   Excluded pages   : 0x0000000003decd15
 > > [  518.838635]     Pages filled with zero  : 0x00000000000210ee
 > > [  518.849920]     Non-private cache pages : 0x000000000000271a
 > > [  518.861218]     Private cache pages     : 0x000000000000da47
 > > [  518.872502]     User process data pages : 0x0000000003d6bdc8
 > > [  518.883786]     Free pages              : 0x000000000004fcfe
 > > [  518.895070]     Hwpoison pages          : 0x0000000000000000
 > > [  518.906356]     Offline pages           : 0x0000000000000000
 > > [  518.917659]   Remaining pages  : 0xfffffffffc2132eb
 > > [  518.927398] Memory Hole     : 0x0000000004080000
 >
 > This is the known issue that I wrote above and am looking for a safe fix.
 > How does this patch work?

I'll give this a try, and see how it goes for a few days.
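
For reference, the figures in the quoted report are consistent with an
unsigned 64-bit wraparound: the excluded categories add up to the
"Excluded pages" value, and subtracting that from an "Original pages"
count of zero gives the "Remaining pages" value. The sketch below only
redoes that arithmetic on the numbers from the report. It assumes the
remaining count is computed as original minus excluded in an unsigned
64-bit type; that is an assumption, not something checked against the
makedumpfile source, and the variable names are illustrative.

```python
MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit arithmetic

# Values copied from the quoted report.
original = 0x0000000000000000
excluded_parts = {
    "zero": 0x00000000000210ee,
    "non_private_cache": 0x000000000000271a,
    "private_cache": 0x000000000000da47,
    "user": 0x0000000003d6bdc8,
    "free": 0x000000000004fcfe,
    "hwpoison": 0x0,
    "offline": 0x0,
}

excluded = sum(excluded_parts.values())
# Assumed formula: remaining = original - excluded, wrapped to 64 bits.
remaining = (original - excluded) & MASK64

print(hex(excluded))   # 0x3decd15
print(hex(remaining))  # 0xfffffffffc2132eb
```

So the odd "Remaining pages" figure follows directly from "Original
pages" being zero, rather than being a separate symptom.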

 > If it looks good, I'll look into its side effects further,
 > but might take some time..


 > > And the crashdump seems corrupt:
 > > 
 > Could you show me the output of "readelf -a vmcore"?

See below.

 > Does this issue always reproduce?

Not 100% of the time. Sometimes we do get valid dumps from these hosts.
My guess so far is that it has something to do with how much memory
makedumpfile was able to discard with -d31.
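
As a reminder of what -d31 selects: the dump level is a bitmask, and 31
sets all five exclusion bits, which line up with the five non-zero
categories in the report above. A minimal sketch of the decoding, based
on my reading of the dump level table in makedumpfile(8); the dictionary
and function names are made up for illustration.

```python
# Dump level bits as described in makedumpfile(8) (assumed from the man page).
DUMP_LEVEL_BITS = {
    1: "pages filled with zero",
    2: "non-private cache pages",
    4: "private cache pages",
    8: "user process data pages",
    16: "free pages",
}

def excluded_page_types(level):
    """Return the page types a given -d level asks makedumpfile to exclude."""
    return [name for bit, name in DUMP_LEVEL_BITS.items() if level & bit]

print(excluded_page_types(31))
```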


Common case seems to be:

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              CORE (Core file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         23881
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0

There are no sections in this file.

There are no sections to group in this file.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0
  NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0
  NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0
  NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0
  NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0
  NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0
  NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0
  NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0
  NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0
  NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
... <repeats for thousands of lines>
  NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0
  NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000         0

There is no dynamic section in this file.

There are no relocations in this file.

The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.

Dynamic symbol information is not available for displaying symbols.

No version information found in this file.
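
To spot this failure mode without paging through thousands of lines of
readelf output, something like the following could count the all-zero
program headers directly. It is a rough sketch that assumes a
little-endian ELF64 file with 56-byte program headers, as in the header
above, and does not handle the extended e_phnum (PN_XNUM) case. The
function name is hypothetical.

```python
import struct

EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, 64 bytes
PHDR64 = struct.Struct("<IIQQQQQQ")          # ELF64 program header, 56 bytes

def count_empty_phdrs(f):
    """Return (e_phnum, number of all-zero program headers) for a binary file object."""
    ehdr = f.read(EHDR64.size)
    if len(ehdr) < EHDR64.size:
        raise ValueError("file too short for an ELF64 header")
    fields = EHDR64.unpack(ehdr)
    ident, phoff, phentsize, phnum = fields[0], fields[5], fields[9], fields[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    if phentsize != PHDR64.size:
        raise ValueError("unexpected program header size")
    # Read only the program header table, not the (possibly huge) dump body.
    f.seek(phoff)
    table = f.read(phnum * phentsize)
    empty = 0
    for off in range(0, len(table) - phentsize + 1, phentsize):
        if not any(table[off:off + phentsize]):
            empty += 1  # every byte zero: readelf shows this as a NULL entry
    return phnum, empty
```

Usage would be along the lines of `count_empty_phdrs(open("vmcore", "rb"))`;
a result where both numbers are equal matches the common case shown
above.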



There are some other failure cases with non-null data, so there may be more than one bug here.
I haven't seen an obvious pattern to this, e.g.:

https://pastebin.com/2uM4sBCF



I'll put your patch on some of the affected hosts and see if this
changes behaviour in any way.

thanks,
	Dave


