From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Rushikesh Jadhav <2rushikeshj@gmail.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: Whats effect of EXTRA_MEM_RATIO
Date: Tue, 16 Jul 2013 12:28:16 -0400
Message-ID: <20130716162816.GB9901@phenom.dumpdata.com>
In-Reply-To: <CAO9XypVSnq-4h91xANott7+f5EKQfpYJ-D63E0RS3uTNqdQSew@mail.gmail.com>
On Tue, Jul 16, 2013 at 09:47:57PM +0530, Rushikesh Jadhav wrote:
> On Tue, Jul 16, 2013 at 9:12 PM, Konrad Rzeszutek Wilk <
> konrad.wilk@oracle.com> wrote:
>
> > On Wed, Jul 10, 2013 at 01:36:44AM +0530, Rushikesh Jadhav wrote:
> > > Sorry about delayed response but I've again got hit by this magic number
> > 10.
> > >
> > > While reading and doing more work on subject topic I found a 2 year older
> > > commit which gives some clue.
> > >
> > https://github.com/torvalds/linux/commit/d312ae878b6aed3912e1acaaf5d0b2a9d08a4f11
> > >
> > > It says that the reserved low memory defaults to 1/32 of total RAM, so I
> > > think EXTRA_MEM_RATIO up to 32 should be ok, but it gives no clue about
> > > the number 10.
> > >
> > > Specially, Exact Commit
> > >
> > https://github.com/torvalds/linux/commit/698bb8d14a5b577b6841acaccdf5095d3b7c7389
> > > says that 10x seems like a reasonable balance but can I make a pull
> > > request to make it say 16 or 20.
> >
> > Did you look at the 'struct page' and how it is setup in the kernel?
> > Or rather, how much space it consumes?
> >
>
> Hi Konrad,
>
> I checked struct page but wasn't able to work out its exact size for a PV
> kernel, though it does go in lowmem. I did something else to tackle the
> EXTRA_MEM_RATIO problem for myself.
What exactly is the problem statement?
>
> There are few situations
> 1. PV 3.4.50 kernel does not know about static max mem for domain & it
> always starts with base memory
It does not? There aren't any hypercalls to figure this out?
> 2. The scalability of domain is decided by this EXTRA_MEM_RATIO which is =
> 10 as default.
> 3. 10x scalability is always there irrespective of max mem (even if base
> mem = max mem), because it is a compile-time #define EXTRA_MEM_RATIO (10)
> 4. To achieve 10x scalability the guest kernel has to make page table
> entries and loses a considerable amount of RAM, e.g. on a Debian guest with
> base & max mem = 512MB, for EXTRA_MEM_RATIO=10 the free command shows 327MB
> total memory, so a loss of 512MB - 327MB = 185MB.
> On the same Debian guest with base & max mem = 512MB, for EXTRA_MEM_RATIO=1 free
> shows 485MB total memory, so a loss of 512MB - 485MB = 27MB only.
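For a rough feel of the 'struct page' cost, here is a back-of-the-envelope
sketch (userspace C, not kernel code). The 64-byte struct page and the 4 KiB
page size are assumptions; the real size depends on the kernel version and
config, and memmap_bytes() is a name made up for this example.

```c
#include <stdint.h>

/* Assumptions: 4 KiB pages and a 64-byte struct page (typical for x86-64,
 * but not guaranteed; check your own kernel build). */
#define PAGE_SIZE_BYTES   4096ULL
#define STRUCT_PAGE_BYTES 64ULL

/* Bytes of struct page needed to describe the base memory plus the extra
 * region of (ratio * base) that the guest sets aside for ballooning up. */
uint64_t memmap_bytes(uint64_t base_bytes, unsigned int ratio)
{
    uint64_t covered = base_bytes + (uint64_t)ratio * base_bytes;

    return (covered / PAGE_SIZE_BYTES) * STRUCT_PAGE_BYTES;
}
```

With base = 512 MiB this gives 88 MiB for a ratio of 10 and 16 MiB for a
ratio of 1. That is only part of the difference you measured, so other
per-page bookkeeping (page tables among them, presumably) accounts for the
rest.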
>
> Now to avoid this problem I made extra_mem_ratio as a boot kernel param and
> now I can customize the "extra_mem_ratio" variable in grub.cfg depending on
> my expected scalability. e.g.
>
> kernel /vmlinuz-3.4.50-8.el6.x86_64 ro root=/dev/mapper/vg_94762034-lv_root
> rd_LVM_LV=vg_94762034/lv_swap rd_NO_LUKS LANG=en_US.UTF-8
> rd_LVM_LV=vg_94762034/lv_root KEYTABLE=us console=hvc0 rd_NO_MD quiet
> SYSFONT=latarcyrheb-sun16 rhgb crashkernel=auto *extra_mem_ratio=4* rd_NO_DM
>
> There is no need to recompile guest kernel each time to change
> EXTRA_MEM_RATIO
Right.
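I have not seen your patch, so purely as an illustration: the parsing side
could look something like this userspace sketch. parse_extra_mem_ratio() and
DEFAULT_EXTRA_MEM_RATIO are hypothetical names; in the kernel you would
presumably hook this up via early_param() instead of scanning the string by
hand.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_EXTRA_MEM_RATIO 10UL  /* mirrors the current hard-coded value */

/* Return N from "extra_mem_ratio=N" on a space-separated command line, or
 * the default when the option is absent, malformed, or zero. */
unsigned long parse_extra_mem_ratio(const char *cmdline)
{
    static const char key[] = "extra_mem_ratio=";
    const size_t len = sizeof(key) - 1;
    const char *p = cmdline;

    while ((p = strstr(p, key)) != NULL) {
        /* Only accept a match that starts a word. */
        if (p == cmdline || p[-1] == ' ') {
            const char *val = p + len;
            char *end;
            unsigned long v;

            if (!isdigit((unsigned char)*val))
                return DEFAULT_EXTRA_MEM_RATIO;
            v = strtoul(val, &end, 10);
            if (v >= 1 && (*end == '\0' || *end == ' '))
                return v;
            return DEFAULT_EXTRA_MEM_RATIO;
        }
        p += len;
    }
    return DEFAULT_EXTRA_MEM_RATIO;
}
```

Falling back to 10 on anything unparsable keeps the existing behaviour for
guests that do not pass the option.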
>
> EXTRA_MEM_RATIO in Kernel 3.x looks like a threat for PV XEN Guests as 10
> is a magic hard coded figure for scalability.
Why not just use the CONFIG_XEN_MEMORY_HOTPLUG mechanism, which
will allocate the 'struct page' within the newly added memory regions?
>
> Your views please ?
>
> With reference to highmem and lowmem, I found that the lowmem is kernel
> space and highmem is userspace. This means that the available RAM is
> divided and memory page structures are filled in lowmem which could be 1/3
> of base memory. So for bigger scalability, lowmem would be filled with
> pages only to address the scalability.
Right, but this problem affects _only_ 32-bit guests. 64-bit guests don't
have highmem. Everything is in 'lowmem'.
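To make the 32-bit vs 64-bit difference concrete, the clamp in setup.c boils
down to the sketch below (userspace C; clamp_extra_pages() and its parameter
names are made up for illustration, with maxmem_pfn standing in for
PFN_DOWN(MAXMEM)).

```c
#include <stdint.h>

uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/*
 * ratio       - EXTRA_MEM_RATIO
 * max_pfn     - initial allocation, in pages
 * maxmem_pfn  - stand-in for PFN_DOWN(MAXMEM): the lowmem limit on a
 *               32-bit highmem kernel, effectively unlimited on 64-bit
 * extra_pages - pages between the initial allocation and the domain max
 */
uint64_t clamp_extra_pages(uint64_t ratio, uint64_t max_pfn,
                           uint64_t maxmem_pfn, uint64_t extra_pages)
{
    /* The base is capped at lowmem so that lowmem cannot be filled
     * completely by the bookkeeping for the extra region. */
    uint64_t base = min_u64(max_pfn, maxmem_pfn);

    return min_u64(ratio * base, extra_pages);
}
```

Taking hypothetical numbers, with 0x80000 initial pages (2 GiB) and 0x380000
extra pages requested, a 64-bit guest with ratio 10 keeps all 0x380000,
while a 32-bit guest whose lowmem limit is assumed to be 0x38000 pages gets
clamped to 0x230000.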
>
>
> > >
> > > Any ideas ?
> > >
> > >
> > > On Mon, Jun 3, 2013 at 11:20 PM, Konrad Rzeszutek Wilk <
> > > konrad.wilk@oracle.com> wrote:
> > >
> > > > On Mon, Jun 03, 2013 at 09:58:36PM +0530, Rushikesh Jadhav wrote:
> > > > > On Mon, Jun 3, 2013 at 5:40 PM, Konrad Rzeszutek Wilk <
> > > > > konrad.wilk@oracle.com> wrote:
> > > > >
> > > > > > On Sun, Jun 02, 2013 at 02:57:11AM +0530, Rushikesh Jadhav wrote:
> > > > > > > Hi guys,
> > > > > > >
> > > > > > > Im fairly new to the Xen Development & trying to understand
> > > > ballooning.
> > > > > >
> > > > > > OK.
> > > > > > >
> > > > > > > While compiling a DomU kernel I'm trying to understand the e820
> > > > memory
> > > > > > map
> > > > > > > w.r.t Xen,
> > > > > > >
> > > > > > > I have modified arch/x86/xen/setup.c EXTRA_MEM_RATIO to 1 and
> > can
> > > > see
> > > > > > > that the guest can not balloon up more than 2GB. Below is the
> > memory
> > > > map
> > > > > > of
> > > > > > > DomU with max mem as 16GB.
> > > > > > >
> > > > > > > for EXTRA_MEM_RATIO = 1
> > > > > > >
> > > > > > > BIOS-provided physical RAM map:
> > > > > > > Xen: 0000000000000000 - 00000000000a0000 (usable)
> > > > > > > Xen: 00000000000a0000 - 0000000000100000 (reserved)
> > > > > > > Xen: 0000000000100000 - 0000000080000000 (usable)
> > > > > > > Xen: 0000000080000000 - 0000000400000000 (unusable)
> > > > > > > NX (Execute Disable) protection: active
> > > > > > > DMI not present or invalid.
> > > > > > > e820 update range: 0000000000000000 - 0000000000010000 (usable)
> > ==>
> > > > > > > (reserved)
> > > > > > > e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
> > > > > > > No AGP bridge found
> > > > > > > last_pfn = 0x80000 max_arch_pfn = 0x400000000
> > > > > > > initial memory mapped : 0 - 0436c000
> > > > > > > Base memory trampoline at [ffff88000009b000] 9b000 size 20480
> > > > > > > init_memory_mapping: 0000000000000000-0000000080000000
> > > > > > > 0000000000 - 0080000000 page 4k
> > > > > > > kernel direct mapping tables up to 80000000 @ bfd000-1000000
> > > > > > > xen: setting RW the range fd6000 - 1000000
> > > > > > >
> > > > > > >
> > > > > > > for EXTRA_MEM_RATIO = 10 the map is like below and can balloon
> > up to
> > > > > > 16GB.
> > > > > > >
> > > > > >
> > > > > > Right, that is the default value.
> > > > > >
> > > > >
> > > > > What are the good or bad effects of making it 20.
> > > > > I found that increasing this number causes base memory to fill up (
> > in
> > > > many
> > > > > MBs ) and increases the range of Base~Max.
> > > >
> > > > That sounds about right. I would suggest you look in the free Linux
> > > > kernel book and look at the section that deals with 'struct page',
> > > > lowmem and highmem. That should explain what is consuming the lowmem
> > > > memory.
> > > >
> > > > >
> > > > >
> > > > > >
> > > > > > > BIOS-provided physical RAM map:
> > > > > > > Xen: 0000000000000000 - 00000000000a0000 (usable)
> > > > > > > Xen: 00000000000a0000 - 0000000000100000 (reserved)
> > > > > > > Xen: 0000000000100000 - 0000000400000000 (usable)
> > > > > > > NX (Execute Disable) protection: active
> > > > > > > DMI not present or invalid.
> > > > > > > e820 update range: 0000000000000000 - 0000000000010000 (usable)
> > ==>
> > > > > > > (reserved)
> > > > > > > e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
> > > > > > > No AGP bridge found
> > > > > > > last_pfn = 0x400000 max_arch_pfn = 0x400000000
> > > > > > > last_pfn = 0x100000 max_arch_pfn = 0x400000000
> > > > > > > initial memory mapped : 0 - 0436c000
> > > > > > > Base memory trampoline at [ffff88000009b000] 9b000 size 20480
> > > > > > > init_memory_mapping: 0000000000000000-0000000100000000
> > > > > > > 0000000000 - 0100000000 page 4k
> > > > > > > kernel direct mapping tables up to 100000000 @ 7fb000-1000000
> > > > > > > xen: setting RW the range fd6000 - 1000000
> > > > > > > init_memory_mapping: 0000000100000000-0000000400000000
> > > > > > > 0100000000 - 0400000000 page 4k
> > > > > > > kernel direct mapping tables up to 400000000 @ 601ef000-62200000
> > > > > > > xen: setting RW the range 619fb000 - 62200000
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Can someone please help me understand its behavior and
> > importance ?
> > > > > >
> > > > > > Here is the explanation from the code:
> > > > > >
> > > > > > 384 /*
> > > > > > 385 * Clamp the amount of extra memory to a
> > EXTRA_MEM_RATIO
> > > > > > 386 * factor the base size. On non-highmem systems, the
> > base
> > > > > > 387 * size is the full initial memory allocation; on
> > highmem
> > > > it
> > > > > > 388 * is limited to the max size of lowmem, so that it
> > doesn't
> > > > > > 389 * get completely filled.
> > > > > > 390 *
> > > > > >
> > > > >
> > > > > "highmem is limited to the max size of lowmem"
> > > > > Does it mean "1/3" or maximum possible memory or startup memory ?
> > > >
> > > > For my answer to make sense I would steer you toward looking what
> > > > highmem and lowmem are. That should give you an idea of the memory
> > > > limitations 32-bit kernels have.
> > > > > In what cases it can get completely filled ?
> > > >
> > > > Yes.
> > > > >
> > > > >
> > > > > > 391 * In principle there could be a problem in lowmem
> > systems
> > > > if
> > > > > > 392 * the initial memory is also very large with respect
> > to
> > > > > > 393 * lowmem, but we won't try to deal with that here.
> > > > > > 394 */
> > > > > > 395 extra_pages = min(EXTRA_MEM_RATIO * min(max_pfn,
> > > > > > PFN_DOWN(MAXMEM)),
> > > > > > 396 extra_pages);
> > > > > >
> > > > > > > I am unclear on what exactly you want to learn. The hypercalls
> > or
> > > > > how
> > > > > > > the ballooning happens? If so I would recommend you work backwards -
> > > > look
> > > > > > at the balloon driver itself, how it decreases/increases the
> > memory,
> > > > and
> > > > > > what
> > > > > > data structures it uses to figure out how much memory it can use.
> > Then
> > > > you
> > > > > > can go back to the setup.c to get an idea on how the E820 is being
> > > > created.
> > > > > >
> > > > > >
> > > > > Thanks. I'll check more from drivers/xen/balloon.c
> > > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > Thanks.
> > > > > >
> > > > > > > _______________________________________________
> > > > > > > Xen-devel mailing list
> > > > > > > Xen-devel@lists.xen.org
> > > > > > > http://lists.xen.org/xen-devel
> > > > > >
> > > > > >
> > > >
> >
Thread overview: 10+ messages
2013-06-01 21:27 Whats effect of EXTRA_MEM_RATIO Rushikesh Jadhav
2013-06-03 12:10 ` Konrad Rzeszutek Wilk
2013-06-03 16:28 ` Rushikesh Jadhav
2013-06-03 17:50 ` Konrad Rzeszutek Wilk
2013-07-09 20:06 ` Rushikesh Jadhav
2013-07-16 15:42 ` Konrad Rzeszutek Wilk
2013-07-16 16:17 ` Rushikesh Jadhav
2013-07-16 16:28 ` Konrad Rzeszutek Wilk [this message]
2013-07-16 17:26 ` Rushikesh Jadhav
2013-07-16 19:49 ` Rushikesh Jadhav