From: George Dunlap <george.dunlap@eu.citrix.com>
To: Jan Beulich <JBeulich@novell.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: PoD issue
Date: Fri, 29 Jan 2010 16:01:07 +0000
Message-ID: <4B630643.2000904@eu.citrix.com>
In-Reply-To: <4B630C63020000780002CC11@vpn.id2.novell.com>
What seems likely to me is that Xen (setting the PoD target) and the
balloon driver (allocating memory) have a different way of calculating
the amount of guest memory. So the balloon driver thinks it's done
handing memory back to Xen while there are still more outstanding PoD
entries than there are pages in the PoD memory pool. What balloon
driver are you using? Can you let me know max_mem, target, and what the
balloon driver has reached before calling it quits? (Although 13,000
pages is an awful lot to be off by: 54 MB...)
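(For what it's worth, assuming 4 KiB pages the conversion is just
13,000 * 4 KiB = 52,000 KiB, i.e. roughly 53 MB, so Jan's "over 13,000
pages" lines up with the ~54 MB figure.)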
Re what "B" means, below is a rather long-winded explanation that will,
hopefully, be clear. :-)
Hmm, I'm not sure what the guest balloon driver's "Current allocation"
means either. :-) Does it mean, "Size of the current balloon" (i.e.,
starts at 0 and grows as the balloon driver allocates guest pages and
hands them back to Xen)? Or does it mean, "Amount of memory guest
currently has allocated to it" (i.e., starts at static_max and goes down
as the balloon driver allocates guest pages and hands them back to Xen)?
In the comment, B does *not* mean "the size of the balloon" (i.e., the
number of pages allocated from the guest OS by the balloon driver).
Rather, B means "Amount of memory the guest currently thinks it has
allocated to it." B starts at M at boot. The balloon driver will try
to make B=T by inflating the size of the balloon to M-T. Clear as mud?
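To put the same thing in pseudo-C (the names below are mine, purely for
illustration; this is not code from the tree or from any balloon driver):

/*
 * M: static max (pages)            T: current target (pages)
 * B: what the guest thinks it currently has allocated (pages)
 * P: pages Xen has actually populated
 */
struct mem_view {
    unsigned long M, T, B, P;
};

/* The balloon driver drives B down toward T by allocating (inflating
 * the balloon by) M - T pages from the guest OS and returning them to
 * Xen; B only changes when the balloon driver does this. */
static unsigned long balloon_pages_needed(const struct mem_view *v)
{
    return v->M - v->T;
}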
Let's make a concrete example. Let's say static max is 409,600K
(100,000 pages).
M=100,000 and doesn't change. Let's say that T is 50,000.
At boot:
B == M == 100,000.
P == 0
tot_pages == pod.count == 50,000
entry_count == 100,000
Thus the following invariants hold:
* 0 <= P (0) <= T (50,000) <= B (100,000) <= M (100,000)
* entry_count (100,000) == B (100,000) - P (0)
* tot_pages (50,000) == P (0) + pod.count (50,000)
As the guest boots, pages will be populated from the cache; P increases,
but entry_count and pod.count decrease. Let's say that 25,000 pages get
allocated just before the balloon driver runs:
* 0 <= P (25,000) <= T (50,000) <= B(100,000) <= M (100,000)
* entry_count (75,000) == B (100,000) - P (25,000)
* tot_pages (50,000) == P (25,000) + pod.count (25,000)
Then the balloon driver runs. It should try to allocate 50,000 pages
total (M - T). For simplicity, let's say that the balloon driver only
allocates pages that haven't been populated yet (i.e., ones still backed
by PoD entries). When it's halfway there, having allocated
25,000 pages, things look like this:
* 0 <= P (25,000) <= T (50,000) <= B (75,000) <= M (100,000)
* entry_count (50,000) == B (75,000) - P (25,000)
* tot_pages (50,000) == P (25,000) + pod.count (25,000)
Eventually the balloon driver should reach its new target of 50,000,
having allocated 50,000 pages:
* 0 <= P (25,000) <= T (50,000) <= B (50,000) <= M(100,000)
* entry_count(25,000) == B(50,000) - P (25,000)
* tot_pages (50,000) == P(25,000) + pod.count(25,000)
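So the two equations from the comment can be read as an invariant check
(again just a sketch; the field names in the comments are the ones from
your mail, but the helper itself is hypothetical, not code in the tree):

#include <stdbool.h>

struct pod_snapshot {
    unsigned long M, T, B, P;
    unsigned long entry_count;  /* d->arch.p2m->pod.entry_count */
    unsigned long pod_count;    /* d->arch.p2m->pod.count */
    unsigned long tot_pages;    /* d->tot_pages */
};

/* Holds at every step of the example above. */
static bool pod_invariants_hold(const struct pod_snapshot *s)
{
    return s->P <= s->T && s->T <= s->B && s->B <= s->M
        && s->entry_count == s->B - s->P
        && s->tot_pages   == s->P + s->pod_count;
}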
The reason for the logic is so that we can do the Right Thing if the
target is changed after the balloon driver has only ballooned half way
(to 75,000 pages). If you're not changing the target before the balloon
driver has reached it, this extra logic shouldn't really come into play.
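Derived purely from those equations (not from the actual
p2m_pod_set_mem_target() code), the PoD cache size that keeps things
consistent for a new target T' works out to T' - P: once the balloon
driver finishes, B == T', so entry_count == T' - P, and
tot_pages == P + pod.count lands on T' exactly when pod.count == T' - P.
Roughly:

/* Sketch only, derived from the invariants above. */
static unsigned long pod_cache_target(unsigned long T_new, unsigned long P)
{
    return T_new > P ? T_new - P : 0;
}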
-George
Jan Beulich wrote:
> George,
>
> before diving deeply into the PoD code, I hope you have some idea that
> might ease the debugging that's apparently going to be needed.
>
> Following the comment immediately before p2m_pod_set_mem_target(),
> there's an apparent inconsistency with the accounting: While the guest
> in question properly balloons down to its intended setting (1G, with a
> maxmem setting of 2G), the combination of the equations
>
> d->arch.p2m->pod.entry_count == B - P
> d->tot_pages == P + d->arch.p2m->pod.count
>
> doesn't hold (provided I interpreted the meaning of B correctly - I
> took this from the guest balloon driver's "Current allocation" report,
> converted to pages); there's a difference of over 13000 pages.
> Obviously, as soon as the guest uses up enough of its memory, it
> will get crashed by the PoD code.
>
> In two runs I did, the difference (and hence the number of entries
> reported in the eventual crash message) was identical, implying to
> me that this is not a simple race, but rather a systematic problem.
>
> Even on the initial dump taken (when the guest was sitting at the
> boot manager screen), there already appears to be a difference of
> 800 pages (it's my understanding that at this point the difference
> between entries and cache should equal the difference between
> maxmem and mem).
>
> Does this ring any bells? Any hints how to debug this? In any case
> I'm attaching the full log in case you want to look at it.
>
> Jan
>