xen-devel.lists.xenproject.org archive mirror
* RFE: Detect NUMA misconfigurations and prevent machine freezes
@ 2018-08-29  5:33 Steven Haigh
  2018-08-29  5:49 ` Juergen Gross
  2018-08-29  6:39 ` Jan Beulich
  0 siblings, 2 replies; 11+ messages in thread
From: Steven Haigh @ 2018-08-29  5:33 UTC
  To: xen-devel


When playing with NUMA support recently, I noticed a host would always hang 
when trying to create a cpupool for the second NUMA node in the system.

I was using the following commands:
# xl cpupool-create name=\"Pool-1\" sched=\"credit2\"
# xl cpupool-cpu-remove Pool-0 node:1
# xl cpupool-cpu-add Pool-1 node:1

After the last command, the system would hang - requiring a hard reset of the 
machine to fix.

I tried a different variation with the same result:
# xl cpupool-create name=\"Pool-1\" sched=\"credit2\"
# xl cpupool-cpu-remove Pool-0 node:1
# xl cpupool-cpu-add Pool-1 12

It turns out that the RAM was installed sub-optimally in this machine: all of 
it is attached to node 0, and node 1 has none. A partial output from 
'xl info -n' shows:
numa_info              :
node:    memsize    memfree    distances
  0:     67584      62608      10,21
  1:         0          0      21,10

A machine where we could get this working every time shows:
node:    memsize    memfree    distances
  0:     34816      30483      10,21
  1:     32768      32125      21,10
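
For illustration, the two excerpts above can be told apart mechanically. The
following is a minimal sketch, not part of xl: the row layout is assumed from
the excerpts rather than from any specification, and the names
parse_numa_rows and memoryless_nodes are hypothetical.

```python
import re

# Matches rows such as "  0:     67584      62608      10,21".
# The column layout is an assumption taken from the excerpts above.
ROW = re.compile(r"^\s*(\d+):\s+(\d+)\s+(\d+)\s+([\d,]+)\s*$")


def parse_numa_rows(text):
    """Return {node: (memsize, memfree)} for every line that matches ROW."""
    nodes = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            nodes[int(m.group(1))] = (int(m.group(2)), int(m.group(3)))
    return nodes


def memoryless_nodes(nodes):
    """Return the sorted ids of nodes whose reported memsize is 0."""
    return sorted(n for n, (memsize, _free) in nodes.items() if memsize == 0)


# The two excerpts quoted in this message.
HANGING_HOST = """\
node:    memsize    memfree    distances
  0:     67584      62608      10,21
  1:         0          0      21,10
"""

WORKING_HOST = """\
node:    memsize    memfree    distances
  0:     34816      30483      10,21
  1:     32768      32125      21,10
"""

print(memoryless_nodes(parse_numa_rows(HANGING_HOST)))  # [1]
print(memoryless_nodes(parse_numa_rows(WORKING_HOST)))  # []
```

The header line is skipped because it does not start with a node number, so
only the per-node rows are considered.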

As this kind of RAM misconfiguration can be deduced from the NUMA information, 
I believe we should check that the RAM configuration / layout is sane *before* 
attempting to split the system, and print a warning if it is not.

This would prevent a hard system freeze in this scenario.
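
A rough sketch of what such a check could look like is below. It only encodes
the hypothesis in this message (a node with no memory should not be split off
without a warning); it is not existing xl behaviour, and the function name
preflight_warning and the shape of its input are hypothetical.

```python
def preflight_warning(memsize_by_node, node):
    """Return a warning string if splitting off `node` looks unsafe, else None.

    memsize_by_node is assumed to map NUMA node id -> memsize as reported
    in the numa_info section of 'xl info -n'.
    """
    if node not in memsize_by_node:
        return f"node {node} is not listed in numa_info"
    if memsize_by_node[node] == 0:
        return (f"node {node} reports no memory; "
                "check the RAM layout before moving its CPUs to a new pool")
    return None


# Values taken from the two hosts described in this message.
hanging_host = {0: 67584, 1: 0}
working_host = {0: 34816, 1: 32768}

for name, host in (("hanging", hanging_host), ("working", working_host)):
    warning = preflight_warning(host, 1)
    print(name, "->", warning if warning else "ok")
```

Returning the message instead of printing it leaves the caller free to either
warn and continue or refuse the operation.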

-- 
Steven Haigh

📧 netwiz@crc.id.au       💻 https://www.crc.id.au
📞 +61 (3) 9001 6090    📱 0412 935 897



Thread overview: 11+ messages
2018-08-29  5:33 RFE: Detect NUMA misconfigurations and prevent machine freezes Steven Haigh
2018-08-29  5:49 ` Juergen Gross
2018-08-30  4:01   ` Steven Haigh
2018-08-30  7:13     ` Juergen Gross
2018-08-30  8:33     ` Jan Beulich
2018-08-30  8:49       ` BUG: sched=credit2 crashes system when using cpupools Steven Haigh
2018-09-12 15:11         ` Dario Faggioli
2018-09-12 15:27           ` Steven Haigh
2018-09-12 15:13     ` RFE: Detect NUMA misconfigurations and prevent machine freezes Dario Faggioli
2018-09-13 11:49       ` Dario Faggioli
2018-08-29  6:39 ` Jan Beulich
