From: Juergen Gross <jgross@suse.com>
To: Steven Haigh <netwiz@crc.id.au>, xen-devel@lists.xenproject.org
Subject: Re: RFE: Detect NUMA misconfigurations and prevent machine freezes
Date: Wed, 29 Aug 2018 07:49:40 +0200
Message-ID: <29fc38c6-8908-2f6f-e496-a1644f56c59e@suse.com>
In-Reply-To: <6564259.GaOmOO5kt2@wopr.lan.crc.id.au>
On 29/08/18 07:33, Steven Haigh wrote:
> When playing with NUMA support recently, I noticed a host would always hang
> when trying to create a cpupool for the second NUMA node in the system.
>
> I was using the following commands:
> # xl cpupool-create name="Pool-1" sched="credit2"
> # xl cpupool-cpu-remove Pool-0 node:1
> # xl cpupool-cpu-add Pool-1 node:1
>
> After the last command, the system would hang - requiring a hard reset of the
> machine to fix.
>
> I tried a different variation with the same result:
> # xl cpupool-create name="Pool-1" sched="credit2"
> # xl cpupool-cpu-remove Pool-0 node:1
> # xl cpupool-cpu-add Pool-1 12
>
> It turns out that the RAM was installed sub-optimally in this machine. A
> partial output from 'xl info -n' shows:
> numa_info :
> node: memsize memfree distances
> 0: 67584 62608 10,21
> 1: 0 0 21,10
>
> A machine where we could get this working every time shows:
> node: memsize memfree distances
> 0: 34816 30483 10,21
> 1: 32768 32125 21,10
>
> As we can deduce RAM misconfigurations in this scenario, I believe we should
> check to ensure that RAM configuration / layout is sane *before* attempting to
> split the system and print a warning.
>
> This would prevent a hard system freeze in this scenario.
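
The pre-check proposed above could be sketched roughly as follows. This is
only an illustration, not part of xl or the Xen toolstack: the function names
parse_numa_info and memoryless_nodes are hypothetical, and the parser assumes
the numa_info table layout shown in the quoted 'xl info -n' output.

```python
import re

# One row of the numa_info table: "<node>: <memsize> <memfree> <distances>"
NODE_ROW = re.compile(r"^(\d+):\s+(\d+)\s+(\d+)\s+(\d+(?:,\d+)*)$")


def parse_numa_info(text):
    """Return {node_id: (memsize, memfree)} parsed from 'xl info -n' style text.

    Only rows that follow the "node: memsize memfree distances" header are
    read, so other tables in the same output are not mistaken for NUMA rows.
    """
    nodes = {}
    in_table = False
    for line in text.splitlines():
        stripped = line.strip()
        if not in_table:
            # Wait for the header row of the numa_info table.
            if stripped.startswith("node:") and "memsize" in stripped:
                in_table = True
            continue
        match = NODE_ROW.match(stripped)
        if match is None:
            # First non-matching line ends the table.
            break
        nodes[int(match.group(1))] = (int(match.group(2)), int(match.group(3)))
    return nodes


def memoryless_nodes(text):
    """Return the sorted IDs of NUMA nodes that report a memsize of 0."""
    return sorted(
        node for node, (memsize, _memfree) in parse_numa_info(text).items()
        if memsize == 0
    )
```

A wrapper could feed the text of 'xl info -n' to memoryless_nodes() and print
a warning for each node returned before any cpupool commands are run.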

RAM placement should not matter here. As the name already suggests,
cpupools assign CPUs. Allocated RAM will preferably be taken from a
local node, but this shouldn't be mandatory for success.

Would it be possible to use a debug hypervisor (e.g. 4.12-unstable) to
generate a verbose log (hypervisor boot parameter "loglvl=all") and
send the complete hypervisor log?

Juergen
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Thread overview: 11+ messages
2018-08-29 5:33 RFE: Detect NUMA misconfigurations and prevent machine freezes Steven Haigh
2018-08-29 5:49 ` Juergen Gross [this message]
2018-08-30 4:01 ` Steven Haigh
2018-08-30 7:13 ` Juergen Gross
2018-08-30 8:33 ` Jan Beulich
2018-08-30 8:49 ` BUG: sched=credit2 crashes system when using cpupools Steven Haigh
2018-09-12 15:11 ` Dario Faggioli
2018-09-12 15:27 ` Steven Haigh
2018-09-12 15:13 ` RFE: Detect NUMA misconfigurations and prevent machine freezes Dario Faggioli
2018-09-13 11:49 ` Dario Faggioli
2018-08-29 6:39 ` Jan Beulich