public inbox for kexec@lists.infradead.org
From: WANG Cong <xiyou.wangcong@gmail.com>
To: kexec@lists.infradead.org
Subject: Re: Kexec & Memory Zones question
Date: Wed, 18 May 2011 02:40:26 +0000 (UTC)
Message-ID: <iqvbiq$ctv$2@dough.gmane.org>
In-Reply-To: BANLkTi=ds+qwHd5D3LiYQQf6-zqDOkZ_dw@mail.gmail.com

On Tue, 17 May 2011 19:05:13 -0700, Sujit V wrote:

> We found the root cause for this issue in the bootmem allocator.
> 
> The 96GB NUMA system has two memory nodes, each with 48GB. Node 0 had
> zones dma, dma32 & normal; node 1 had only zone normal.
> 
> During early boot (i.e. kernel/setup.c), the bootmem allocator uses the
> find_free_area API from the e820 map to allocate some of its data
> structures, i.e. the bitmap. (The bootmem bitmap tracks free and used
> pages with 1 bit per 4K page. The reserve_bootmem() API is used to
> reserve pages in it.)
> 
> The amount of memory required to represent the bitmap for node 0 with
> 48GB is (48GB / (4K * 8)) = 1.5MB.
> 
> The start address of the free area of size 1.5 MB returned by e820 map
> was
>>> bitmap starts at  PA (0xf9b000) size 1.5MB
> 0xf9b000 + 1.5 MB = 17.13MB
> 
> The bootmem bitmap used a 1.13MB section of what was supposed to be the
> crashkernel reserved area.
> Later, boot param parsing sees crashkernel=128M@16M and reserves that
> area using reserve_bootmem().
> 
> 
> Later, when paging_init() is called, the bootmem allocator is retired.
> At this point it frees the memory allocated to the bitmap and gives it
> to the system page allocator, i.e. pages from 16MB to 17.13MB are given
> to the system page allocator (even though those pages are reserved for
> the crashkernel).
> 
> So pages in this memory range were given to some system resources. When
> kexec loaded the kdump kernel in the 128M@16M range, it corrupted that
> memory and we saw the system crash.
> 
> I fixed the boot mem allocator and then things worked correctly.


Yes, this is a bug in the bootmem allocator. Before the switch to
memblock, the old bootmem allocator marked the crashkernel reservation
as exclusive, which means it must not use any memory area already used
by others; thus in this case the crashkernel memory reservation should
fail.
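
The arithmetic in the report can be checked with a short sketch. This is
only an illustration in Python, not kernel code: the variable names are
made up, and it assumes binary units (1MB = 2^20 bytes) and 4K pages as
in the quoted text. Under those assumptions the bitmap ends at roughly
17.1MB, so a bit over 1MB of it lies inside the 128M@16M region, which is
in line with the figures quoted above.

```python
# Sketch only: names are hypothetical, values come from the quoted report.
PAGE_SIZE = 4 * 1024            # 4K pages
MiB = 1024 * 1024
GiB = 1024 * MiB

# One bit per page for a 48GB node.
node0_bytes = 48 * GiB
bitmap_bytes = node0_bytes // PAGE_SIZE // 8

# Physical address where the report says the bitmap was placed.
bitmap_start = 0xF9B000
bitmap_end = bitmap_start + bitmap_bytes

# crashkernel=128M@16M
crash_start = 16 * MiB
crash_end = crash_start + 128 * MiB

# Size of the intersection of the two ranges (0 if they do not overlap).
overlap_bytes = max(0, min(bitmap_end, crash_end) - max(bitmap_start, crash_start))

print("bitmap size : %.2f MiB" % (bitmap_bytes / MiB))
print("bitmap range: %#x - %#x" % (bitmap_start, bitmap_end))
print("overlap     : %d bytes (%d pages)" % (overlap_bytes, overlap_bytes // PAGE_SIZE))
```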

> 
> 
> Ours is a 2.6.23 kernel.
> The later versions of the kernel have some other mechanism for early
> memory reservation (like early_res & memblock)
> 

Right, I think that version of the kernel is still using the old bootmem
allocator, so you can change the crashkernel reservation to be
exclusive.
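
To illustrate what an exclusive reservation buys here, below is a toy
model in Python. It is not the kernel's bootmem code: the class, the
method and the `exclusive` parameter are hypothetical, and the only thing
it is meant to show is the idea that an exclusive reserve refuses a range
in which some page is already marked used, while a plain reserve silently
accepts the overlap. The page frame numbers are derived from the
addresses in the report, assuming 4K pages.

```python
EBUSY = 16  # used here only as an error return code


class ToyBootmem:
    """Toy page bitmap, one flag per page frame (not the kernel structure)."""

    def __init__(self, npages):
        self.used = [False] * npages

    def reserve(self, start_pfn, npages, exclusive=False):
        """Mark pages as used; in exclusive mode fail if any is already used."""
        end_pfn = start_pfn + npages
        if exclusive and any(self.used[start_pfn:end_pfn]):
            return -EBUSY
        for pfn in range(start_pfn, end_pfn):
            self.used[pfn] = True
        return 0


bm = ToyBootmem(0x10000)                 # model the first 256MB of memory

# The allocator's own bitmap: 1.5MB (0x180 pages) at PA 0xf9b000.
bm.reserve(0xF9B, 0x180)

# crashkernel=128M@16M is pfn 0x1000, 0x8000 pages.
strict = bm.reserve(0x1000, 0x8000, exclusive=True)   # overlap is detected
plain = bm.reserve(0x1000, 0x8000)                    # overlap goes unnoticed

print("exclusive reserve:", strict)
print("plain reserve    :", plain)
```

With the exclusive variant, the conflict shows up at reservation time as
an error instead of later as memory corruption when the kdump kernel is
loaded.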


