* [xen-4.2-testing test] 58584: regressions - trouble: blocked/broken/fail/pass
@ 2015-06-16 3:43 osstest service user
2015-06-17 8:53 ` stable trees (was: [xen-4.2-testing test] 58584: regressions) Jan Beulich
0 siblings, 1 reply; 32+ messages in thread
From: osstest service user @ 2015-06-16 3:43 UTC (permalink / raw)
To: xen-devel; +Cc: ian.jackson
flight 58584 xen-4.2-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58584/
Regressions :-(
Tests which did not succeed and are blocking,
including tests which could not be run:
build-amd64-libvirt 3 host-install(3) broken REGR. vs. 58411
test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail in 58460 REGR. vs. 58411
Tests which are failing intermittently (not blocking):
test-amd64-amd64-pair 3 host-install/src_host(3) broken pass in 58540
test-i386-i386-libvirt 3 host-install(3) broken pass in 58540
test-amd64-i386-pair 4 host-install/dst_host(4) broken pass in 58540
test-amd64-i386-qemuu-freebsd10-amd64 3 host-install(3) broken pass in 58540
test-amd64-i386-xl-win7-amd64 3 host-install(3) broken pass in 58540
test-amd64-i386-xl-winxpsp3-vcpus1 3 host-install(3) broken pass in 58540
test-amd64-amd64-xl-qemut-winxpsp3 3 host-install(3) broken pass in 58540
test-amd64-amd64-xl-win7-amd64 16 guest-stop fail in 58540 pass in 58584
test-amd64-amd64-xl-qemuu-winxpsp3 15 guest-localmigrate/x10 fail in 58540 pass in 58584
test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-localmigrate.2 fail pass in 58460
test-amd64-amd64-xl-qemuu-ovmf-amd64 6 xen-boot fail pass in 58540
Regressions which are regarded as allowable (not blocking):
test-amd64-i386-xl-win7-amd64 16 guest-stop fail in 58540 like 58411
test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 58411
test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58411
Tests which did not succeed, but are not blocking:
test-amd64-amd64-rumpuserxen-amd64 1 build-check(1) blocked n/a
test-amd64-i386-rumpuserxen-i386 1 build-check(1) blocked n/a
test-i386-i386-rumpuserxen-i386 1 build-check(1) blocked n/a
test-amd64-amd64-libvirt 1 build-check(1) blocked n/a
test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail in 58540 never pass
test-amd64-amd64-libvirt 12 migrate-support-check fail in 58540 never pass
test-i386-i386-libvirt 12 migrate-support-check fail in 58540 never pass
test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail never pass
build-amd64-rumpuserxen 5 rumpuserxen-build fail never pass
build-i386-rumpuserxen 5 rumpuserxen-build fail never pass
test-amd64-i386-libvirt 12 migrate-support-check fail never pass
test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail never pass
test-amd64-i386-xend-winxpsp3 20 leak-check/check fail never pass
test-amd64-i386-xend-qemut-winxpsp3 20 leak-check/check fail never pass
version targeted for testing:
xen 97134c441d6d81ba0d7cdcfdc4d8315115b99dce
baseline version:
xen 21a8344ca38a2797a13b4bf57031b6f49ae12ccb
------------------------------------------------------------
People who touched revisions under test:
Ian Campbell <ian.campbell@citrix.com>
Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich <jbeulich@suse.com>
Stefano Stabellini <stefano.stabellini@eu.citrix.com>
------------------------------------------------------------
jobs:
build-amd64 pass
build-i386 pass
build-amd64-libvirt broken
build-i386-libvirt pass
build-amd64-pvops pass
build-i386-pvops pass
build-amd64-rumpuserxen fail
build-i386-rumpuserxen fail
test-amd64-amd64-xl pass
test-amd64-i386-xl pass
test-i386-i386-xl pass
test-amd64-i386-rhel6hvm-amd pass
test-amd64-i386-qemut-rhel6hvm-amd pass
test-amd64-i386-qemuu-rhel6hvm-amd pass
test-amd64-amd64-xl-qemut-debianhvm-amd64 pass
test-amd64-i386-xl-qemut-debianhvm-amd64 pass
test-amd64-amd64-xl-qemuu-debianhvm-amd64 pass
test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
test-amd64-i386-qemuu-freebsd10-amd64 broken
test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
test-amd64-i386-xl-qemuu-ovmf-amd64 fail
test-amd64-amd64-rumpuserxen-amd64 blocked
test-amd64-amd64-xl-qemut-win7-amd64 fail
test-amd64-i386-xl-qemut-win7-amd64 fail
test-amd64-amd64-xl-qemuu-win7-amd64 fail
test-amd64-i386-xl-qemuu-win7-amd64 fail
test-amd64-amd64-xl-win7-amd64 pass
test-amd64-i386-xl-win7-amd64 broken
test-amd64-amd64-xl-credit2 pass
test-i386-i386-xl-credit2 pass
test-amd64-i386-qemuu-freebsd10-i386 pass
test-amd64-i386-rumpuserxen-i386 blocked
test-i386-i386-rumpuserxen-i386 blocked
test-amd64-i386-rhel6hvm-intel pass
test-amd64-i386-qemut-rhel6hvm-intel pass
test-amd64-i386-qemuu-rhel6hvm-intel pass
test-amd64-amd64-libvirt blocked
test-amd64-i386-libvirt pass
test-i386-i386-libvirt broken
test-amd64-amd64-xl-multivcpu pass
test-i386-i386-xl-multivcpu pass
test-amd64-amd64-pair broken
test-amd64-i386-pair broken
test-i386-i386-pair pass
test-amd64-amd64-xl-sedf-pin pass
test-i386-i386-xl-sedf-pin pass
test-amd64-amd64-pv pass
test-amd64-i386-pv pass
test-i386-i386-pv pass
test-amd64-amd64-xl-sedf pass
test-i386-i386-xl-sedf pass
test-amd64-i386-xl-qemut-winxpsp3-vcpus1 pass
test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 pass
test-amd64-i386-xl-winxpsp3-vcpus1 broken
test-amd64-i386-xend-qemut-winxpsp3 fail
test-amd64-amd64-xl-qemut-winxpsp3 broken
test-i386-i386-xl-qemut-winxpsp3 pass
test-amd64-amd64-xl-qemuu-winxpsp3 pass
test-i386-i386-xl-qemuu-winxpsp3 pass
test-amd64-i386-xend-winxpsp3 fail
test-amd64-amd64-xl-winxpsp3 pass
test-i386-i386-xl-winxpsp3 pass
------------------------------------------------------------
sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images
Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs
Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary
broken-step build-amd64-libvirt host-install(3)
broken-step test-amd64-amd64-pair host-install/src_host(3)
broken-step test-i386-i386-libvirt host-install(3)
broken-step test-amd64-i386-pair host-install/dst_host(4)
broken-step test-amd64-i386-qemuu-freebsd10-amd64 host-install(3)
broken-step test-amd64-i386-xl-win7-amd64 host-install(3)
broken-step test-amd64-i386-xl-winxpsp3-vcpus1 host-install(3)
broken-step test-amd64-amd64-xl-qemut-winxpsp3 host-install(3)
Not pushing.
------------------------------------------------------------
commit 97134c441d6d81ba0d7cdcfdc4d8315115b99dce
Author: Ian Jackson <ian.jackson@eu.citrix.com>
Date: Thu Jun 11 16:49:25 2015 +0100
QEMU_TAG update
========================================
commit 1259e092ee27f444f683f0d76a13a8a72d3f26cb
Author: Jan Beulich <jbeulich@suse.com>
Date: Wed Jun 10 14:17:55 2015 +0100
xen/pt: unknown PCI config space fields should be read-only
... by default. Add a per-device "permissive" mode similar to pciback's
to allow restoring previous behavior (and hence break security again,
i.e. should be used only for trusted guests).
This is part of XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
commit d1ea61a5a1e5eac3184b80b4441a9ae6227a5241
Author: Jan Beulich <jbeulich@suse.com>
Date: Wed Jun 10 14:17:55 2015 +0100
xen/pt: add a few PCI config space field descriptions
Since the next patch will turn all not explicitly described fields
read-only by default, those fields that have guest writable bits need
to be given explicit descriptors.
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
commit cd6d2d0599832a90f7265b13aa8bb8c3c4d3f7ce
Author: Jan Beulich <jbeulich@suse.com>
Date: Wed Jun 10 14:17:55 2015 +0100
xen/pt: mark reserved bits in PCI config space fields
The adjustments are solely to make the subsequent patches work right
(and hence make the patch set consistent), namely if permissive mode
(introduced by the last patch) gets used (as both reserved registers
and reserved fields must be similarly protected from guest access in
default mode, but the guest should be allowed access to them in
permissive mode).
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
commit f43df9842e00898151a8689914f6d4e9cbc37bd2
Author: Jan Beulich <jbeulich@suse.com>
Date: Wed Jun 10 14:17:55 2015 +0100
xen/pt: mark all PCIe capability bits read-only
xen_pt_emu_reg_pcie[]'s PCI_EXP_DEVCAP needs to cover all bits as read-
only to avoid unintended write-back (just a precaution, the field ought
to be read-only in hardware).
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
commit b9f70510a5731a8ed7527fcbf0c92df0054e5386
Author: Jan Beulich <jbeulich@suse.com>
Date: Wed Jun 10 14:17:55 2015 +0100
xen/pt: split out calculation of throughable mask in PCI config space handling
This is just to avoid having to adjust that calculation later in
multiple places.
Note that including ->ro_mask in get_throughable_mask()'s calculation
is only an apparent (i.e. benign) behavioral change: For r/o fields it
doesn't matter whether they get passed through - either the same flag
is also set in emu_mask (then there's no change at all) or the field is
r/o in hardware (and hence a write won't change it anyway).
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
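The mask arithmetic described in this commit message can be made concrete with a minimal Python sketch. It is not the QEMU C code: the helper names, the `valid_mask` parameter and the example values are assumptions chosen for illustration. The only thing taken from the text is that bits set in emu_mask or ro_mask are excluded from pass-through.

```python
def get_throughable_mask(emu_mask: int, ro_mask: int, valid_mask: int) -> int:
    """Bits of a guest write that may reach the physical device.

    Hypothetical helper: a bit is passed through only if it is neither
    emulated (emu_mask) nor read-only (ro_mask), and lies inside the
    bytes actually accessed (valid_mask).
    """
    return ~(emu_mask | ro_mask) & valid_mask


def merge_value(guest_val: int, host_val: int, throughable: int) -> int:
    """Take throughable bits from the guest, all other bits from the host."""
    return (guest_val & throughable) | (host_val & ~throughable)


# Illustrative 16-bit field: low byte emulated, bit 15 read-only.
mask = get_throughable_mask(emu_mask=0x00FF, ro_mask=0x8000, valid_mask=0xFFFF)
to_write = merge_value(guest_val=0xFFFF, host_val=0x0000, throughable=mask)
```

If a read-only bit is also set in emu_mask, adding it to ro_mask leaves the result unchanged, which matches the first half of the "benign" argument in the message.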
commit 5c6e4c043793bee997cd396de544bc9bcf5e74d2
Author: Jan Beulich <jbeulich@suse.com>
Date: Wed Jun 10 14:17:55 2015 +0100
xen/pt: correctly handle PM status bit
xen_pt_pmcsr_reg_write() needs an adjustment to deal with the RW1C
nature of the not passed through bit 15 (PCI_PM_CTRL_PME_STATUS).
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
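Why an RW1C ("write 1 to clear") bit needs special care on write-back can be shown with a small Python model. This is a sketch under stated assumptions, not the xen_pt_pmcsr_reg_write() implementation: the register model, the helper names and the choice of throughable bit are hypothetical. The only facts used are that PCI_PM_CTRL_PME_STATUS is bit 15 and that writing 1 to an RW1C bit clears it.

```python
PCI_PM_CTRL_PME_STATUS = 0x8000  # bit 15, RW1C


def hw_write(current: int, written: int) -> int:
    """Model a hardware write to a 16-bit register whose bit 15 is RW1C.

    Ordinary bits take the written value; the RW1C bit is cleared when a
    1 is written and left unchanged when a 0 is written.
    """
    rw_bits = written & ~PCI_PM_CTRL_PME_STATUS & 0xFFFF
    status = current & PCI_PM_CTRL_PME_STATUS
    if written & PCI_PM_CTRL_PME_STATUS:
        status = 0
    return rw_bits | status


def naive_writeback(host_val: int, guest_val: int, throughable: int) -> int:
    """Merge guest bits into the value read from the host."""
    return (guest_val & throughable) | (host_val & ~throughable)


def rw1c_safe_writeback(host_val: int, guest_val: int, throughable: int) -> int:
    """Same merge, but never write a 1 back to the RW1C status bit."""
    merged = naive_writeback(host_val, guest_val, throughable)
    return merged & ~PCI_PM_CTRL_PME_STATUS & 0xFFFF


host = 0x8000         # PME status currently set in hardware
guest = 0x0004        # guest touches one unrelated bit (illustrative)
throughable = 0x0004  # illustrative: only that bit is passed through

after_naive = hw_write(host, naive_writeback(host, guest, throughable))
after_safe = hw_write(host, rw1c_safe_writeback(host, guest, throughable))
```

In this model the naive merge writes the set status bit straight back and thereby clears it, while the adjusted merge leaves the pending status intact.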
commit 8c245fa40cd01527dac57292e5497e0fc1515e25
Author: Jan Beulich <jbeulich@suse.com>
Date: Wed Jun 10 14:17:54 2015 +0100
xen/pt: consolidate PM capability emu_mask
There's no point in xen_pt_pmcsr_reg_{read,write}() each ORing
PCI_PM_CTRL_STATE_MASK and PCI_PM_CTRL_NO_SOFT_RESET into a local
emu_mask variable - we can have the same effect by setting the field
descriptor's emu_mask member suitably right away. Note that
xen_pt_pmcsr_reg_write() is being retained in order to allow later
patches to be less intrusive.
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
commit 7307e1523ae7deb9ea206a75a23ecc8e60524575
Author: Jan Beulich <jbeulich@suse.com>
Date: Wed Jun 10 14:17:45 2015 +0100
xen/MSI: don't open-code pass-through of enable bit modifications
Without this the actual XSA-131 fix would cause the enable bit to not
get set anymore (due to the write back getting suppressed there based
on the OR of emu_mask, ro_mask, and res_mask).
Note that the fiddling with the enable bit shouldn't really be done by
qemu, but making this work right (via libxc and the hypervisor) will
require more extensive changes, which can be postponed until after the
security issue got addressed.
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
commit c5f7efbbf46d5d2405d3012e10ea510346bb5e88
Author: Jan Beulich <jbeulich@suse.com>
Date: Wed Jun 10 14:17:26 2015 +0100
xen/MSI-X: disable logging by default
... to avoid allowing the guest to cause the control domain's disk to
fill.
This is XSA-130.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
commit bea6adbd1e446c4504c75ed11f3557ab742b87b8
Author: Jan Beulich <jbeulich@suse.com>
Date: Wed Jun 10 14:17:24 2015 +0100
xen: don't allow guest to control MSI mask register
It's being used by the hypervisor. For now simply mimic a device not
capable of masking, and fully emulate any accesses a guest may issue
nevertheless as simple reads/writes without side effects.
This is XSA-129.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
commit 7b85a1e9cdef8de686780a6e3506448ceca37572
Author: Jan Beulich <jbeulich@suse.com>
Date: Wed Jun 10 14:17:22 2015 +0100
xen: properly gate host writes of modified PCI CFG contents
The old logic didn't work as intended when an access spanned multiple
fields (for example a 32-bit access to the location of the MSI Message
Data field with the high 16 bits not being covered by any known field).
Remove it and derive which fields not to write to from the accessed
fields' emulation masks: When they're all ones, there's no point in
doing any host write.
This fixes a secondary issue at once: We obviously shouldn't make any
host write attempt when already the host read failed.
This is XSA-128.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
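The gating rule described for XSA-128 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the QEMU code: the `Field` tuple, the helper name and the example offsets and masks are hypothetical. Bytes of an access that no known field covers are treated here as not emulated, as in the multi-field example given in the message.

```python
from typing import NamedTuple, Sequence


class Field(NamedTuple):
    offset: int    # byte offset of the field in config space
    size: int      # field size in bytes
    emu_mask: int  # bits of the field that are fully emulated


def needs_host_write(addr: int, length: int, fields: Sequence[Field]) -> bool:
    """Return False when every accessed bit is emulated.

    The emulation masks of all fields touched by the access are combined
    into one mask over the accessed bytes; if it is all ones, a host
    write would change nothing and can be skipped.
    """
    combined = 0
    for f in fields:
        for i in range(f.size):
            byte_addr = f.offset + i
            if addr <= byte_addr < addr + length:
                byte_mask = (f.emu_mask >> (8 * i)) & 0xFF
                combined |= byte_mask << (8 * (byte_addr - addr))
    all_ones = (1 << (8 * length)) - 1
    return combined != all_ones


# Hypothetical layout: a fully emulated 16-bit field at offset 0x08,
# with the following two bytes not described by any field.
fields = [Field(offset=0x08, size=2, emu_mask=0xFFFF)]
```

With this layout a 16-bit access at 0x08 needs no host write, whereas a 32-bit access at the same offset still does, because its upper half is not covered by any emulated field.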
^ permalink raw reply [flat|nested] 32+ messages in thread
* stable trees (was: [xen-4.2-testing test] 58584: regressions)
2015-06-16 3:43 [xen-4.2-testing test] 58584: regressions - trouble: blocked/broken/fail/pass osstest service user
@ 2015-06-17 8:53 ` Jan Beulich
2015-06-17 10:26 ` Ian Jackson
0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2015-06-17 8:53 UTC (permalink / raw)
To: Lars Kurth, ian.jackson, Stefano Stabellini; +Cc: xen-devel

>>> On 16.06.15 at 05:43, <osstest@xenbits.xen.org> wrote:
> flight 58584 xen-4.2-testing real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/58584/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
> build-amd64-libvirt 3 host-install(3) broken REGR. vs. 58411
> test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail in 58460 REGR. vs. 58411

Just made another round through the history of these failures as far
as they're accessible through the mailing list archives. The first
qemuu related occurrences were in flights 50318, 50315, 50285, and
50330 (for 4.2 through 4.5 respectively). In those or time wise close
ones there were also a few other migration failures (i.e. non-qemuu
related). Considering how close these flight numbers are to 50000 (and
that there are very few earlier flights' [in the > 50000 range] results
available on the list), it would seem to me as if we "acquired" these
failures with the switch over to the new osstest instance.

Which leaves several options:
- the problem was always there, but hidden by some factor in the
  old osstest instance,
- this is an infrastructure problem in the new osstest instance
  (after all what makes the tests fail is a ping timing out, which can
  have a variety of reasons),
- this is a build or runtime problem due to software differences
  between the old and new instances (no idea whether exact same
  package versions were used at the time of the switchover),
- yet something else I can't think of right now.

One aspect making me indeed consider the build (or less likely
runtime) aspect is that we're seeing the frequent migration failures
in the stable trees only - other than unstable they're all getting built
with debug=n.

While I agree that it wouldn't be nice to release 4.5.1 with these
failures not understood, the current situation (with no-one having
a real idea of what's going on, and apparently also no-one really
trying to debug the issue - it being migration _and_ [apparently]
qemuu specific I don't really feel qualified myself, leaving aside any
time constraints, which certainly also apply to others) will lead to
an indefinite stall on both this tree and the 4.4 one (4.4.3 would
be due in about a month, i.e. normally I would send out a call for
backport requests around now).

Jan

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)
2015-06-17 8:53 ` stable trees (was: [xen-4.2-testing test] 58584: regressions) Jan Beulich
@ 2015-06-17 10:26 ` Ian Jackson
2015-06-17 13:16 ` Stefano Stabellini
2015-06-18 11:37 ` Jan Beulich
0 siblings, 2 replies; 32+ messages in thread
From: Ian Jackson @ 2015-06-17 10:26 UTC (permalink / raw)
To: Jan Beulich; +Cc: Lars Kurth, xen-devel, Stefano Stabellini

Jan Beulich writes ("stable trees (was: [xen-4.2-testing test] 58584: regressions)"):
> Which leaves several options:
> - the problem was always there, but hidden by some factor in the
> old osstest instance,

I think this is most likely. The old system had much older hosts.

I think this is a race that we now happen to lose most of the time.

> - this is an infrastructure problem in the new osstest instance
> (after all what makes the tests fail is a ping timing out, which can
> have a variety of reasons),

I think this is very unlikely. When we were investigating the FreeBSD
migration failure, I looked at this possibility in some depth. I ran
a number of long-term ping tests between various infrastructure
machines and test boxes and saw nothing untoward (for example, no
unexpected packet loss). (In the end the problem turned out to be a
race bug in the FreeBSD netfront, which would try to send the
gratuitous ARP before the backend was up.)

> - this is a build or runtime problem due to software differences
> between the old and new instances (no idea whether exact same
> package versions were used at the time of the switchover),

All the builds are done on hosts frequently reinstalled from Debian
upstream. The compiler would change if Debian released an updated
package but not otherwise. So the old and new build environments
would be very close to identical, apart from the hardware, hostnames,
etc.

> One aspect making me indeed consider the build (or less likely
> runtime) aspect is that we're seeing the frequent migration failures
> in the stable trees only - other than unstable they're all getting built
> with debug=n.

Races frequently come and go with that kind of change.

> While I agree that it wouldn't be nice to release 4.5.1 with these
> failures not understood, the current situation (with no-one having
> a real idea of what's going on, and apparently also no-one really
> trying to debug the issue - it being migration _and_ [apparently]
> qemuu specific I don't really feel qualified myself, leaving aside any
> time constraints, which certainly also apply to others) will lead to
> an indefinite stall on both this tree and the 4.4 one (4.4.3 would
> be due in about a month, i.e. normally I would send out a call for
> backport requests around now).

I think going ahead with 4.5.1 anyway would be a reasonable choice.

Stefano ?

Ian.

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)
2015-06-17 10:26 ` Ian Jackson
@ 2015-06-17 13:16 ` Stefano Stabellini
2015-06-18 11:37 ` Jan Beulich
1 sibling, 0 replies; 32+ messages in thread
From: Stefano Stabellini @ 2015-06-17 13:16 UTC (permalink / raw)
To: Ian Jackson; +Cc: Lars Kurth, xen-devel, Jan Beulich, Stefano Stabellini

On Wed, 17 Jun 2015, Ian Jackson wrote:
> Jan Beulich writes ("stable trees (was: [xen-4.2-testing test] 58584: regressions)"):
> > Which leaves several options:
> > - the problem was always there, but hidden by some factor in the
> > old osstest instance,
>
> I think this is most likely. The old system had much older hosts.
>
> I think this is a race that we now happen to lose most of the time.
>
> > - this is an infrastructure problem in the new osstest instance
> > (after all what makes the tests fail is a ping timing out, which can
> > have a variety of reasons),
>
> I think this is very unlikely. When we were investigating the FreeBSD
> migration failure, I looked at this possibility in some depth. I ran
> a number of long-term ping tests between various infrastructure
> machines and test boxes and saw nothing untoward (for example, no
> unexpected packet loss). (In the end the problem turned out to be a
> race bug in the FreeBSD netfront, which would try to send the
> gratuitous ARP before the backend was up.)
>
> > - this is a build or runtime problem due to software differences
> > between the old and new instances (no idea whether exact same
> > package versions were used at the time of the switchover),
>
> All the builds are done on hosts frequently reinstalled from Debian
> upstream. The compiler would change if Debian released an updated
> package but not otherwise. So the old and new build environments
> would be very close to identical, apart from the hardware, hostnames,
> etc.
>
> > One aspect making me indeed consider the build (or less likely
> > runtime) aspect is that we're seeing the frequent migration failures
> > in the stable trees only - other than unstable they're all getting built
> > with debug=n.
>
> Races frequently come and go with that kind of change.
>
> > While I agree that it wouldn't be nice to release 4.5.1 with these
> > failures not understood, the current situation (with no-one having
> > a real idea of what's going on, and apparently also no-one really
> > trying to debug the issue - it being migration _and_ [apparently]
> > qemuu specific I don't really feel qualified myself, leaving aside any
> > time constraints, which certainly also apply to others) will lead to
> > an indefinite stall on both this tree and the 4.4 one (4.4.3 would
> > be due in about a month, i.e. normally I would send out a call for
> > backport requests around now).
>
> I think going ahead with 4.5.1 anyway would be a reasonable choice.
>
> Stefano ?

I agree

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)
2015-06-17 10:26 ` Ian Jackson
2015-06-17 13:16 ` Stefano Stabellini
@ 2015-06-18 11:37 ` Jan Beulich
2015-06-18 14:22 ` Ian Campbell
1 sibling, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2015-06-18 11:37 UTC (permalink / raw)
To: Ian Jackson, Stefano Stabellini; +Cc: Lars Kurth, xen-devel

>>> On 17.06.15 at 12:26, <Ian.Jackson@eu.citrix.com> wrote:
> Jan Beulich writes ("stable trees (was: [xen-4.2-testing test] 58584:
> regressions)"):
>> Which leaves several options:
>> - the problem was always there, but hidden by some factor in the
>> old osstest instance,
>
> I think this is most likely. The old system had much older hosts.
>
> I think this is a race that we now happen to lose most of the time.

For verification purposes, would it be possible to set up a couple of
flights on the old instance for one of the stable trees?

>> One aspect making me indeed consider the build (or less likely
>> runtime) aspect is that we're seeing the frequent migration failures
>> in the stable trees only - other than unstable they're all getting built
>> with debug=n.
>
> Races frequently come and go with that kind of change.

True. Question then still is who will try to look into the issue (as
right now it is quite harmful to the progress the stable trees can
make towards getting pushed).

Jan

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)
2015-06-18 11:37 ` Jan Beulich
@ 2015-06-18 14:22 ` Ian Campbell
2015-06-19 9:51 ` Jan Beulich
0 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2015-06-18 14:22 UTC (permalink / raw)
To: Jan Beulich; +Cc: Lars Kurth, Ian Jackson, xen-devel, Stefano Stabellini

On Thu, 2015-06-18 at 12:37 +0100, Jan Beulich wrote:
> >>> On 17.06.15 at 12:26, <Ian.Jackson@eu.citrix.com> wrote:
> > Jan Beulich writes ("stable trees (was: [xen-4.2-testing test] 58584:
> > regressions)"):
> >> Which leaves several options:
> >> - the problem was always there, but hidden by some factor in the
> >> old osstest instance,
> >
> > I think this is most likely. The old system had much older hosts.
> >
> > I think this is a race that we now happen to lose most of the time.
>
> For verification purposes, would it be possible to set up a couple of
> flights on the old instance for one of the stable trees?

I can try and run something adhoc on the old system if you can let me
know exactly which jobs (test-*-*-*) and branches you are interested in.

Ian.

> >> One aspect making me indeed consider the build (or less likely
> >> runtime) aspect is that we're seeing the frequent migration failures
> >> in the stable trees only - other than unstable they're all getting built
> >> with debug=n.
> >
> > Races frequently come and go with that kind of change.
>
> True. Question then still is who will try to look into the issue (as
> right now it is quite harmful to the progress the stable trees can
> make towards getting pushed).
>
> Jan
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)
2015-06-18 14:22 ` Ian Campbell
@ 2015-06-19 9:51 ` Jan Beulich
2015-06-19 11:07 ` Ian Campbell
0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2015-06-19 9:51 UTC (permalink / raw)
To: Ian Campbell; +Cc: Lars Kurth, Ian Jackson, xen-devel, Stefano Stabellini

>>> On 18.06.15 at 16:22, <ian.campbell@citrix.com> wrote:
> On Thu, 2015-06-18 at 12:37 +0100, Jan Beulich wrote:
>> >>> On 17.06.15 at 12:26, <Ian.Jackson@eu.citrix.com> wrote:
>> > Jan Beulich writes ("stable trees (was: [xen-4.2-testing test] 58584:
>> > regressions)"):
>> >> Which leaves several options:
>> >> - the problem was always there, but hidden by some factor in the
>> >> old osstest instance,
>> >
>> > I think this is most likely. The old system had much older hosts.
>> >
>> > I think this is a race that we now happen to lose most of the time.
>>
>> For verification purposes, would it be possible to set up a couple of
>> flights on the old instance for one of the stable trees?
>
> I can try and run something adhoc on the old system if you can let me
> know exactly which jobs (test-*-*-*) and branches you are interested in.

Any or all of test-amd64-*-xl-qemuu-win* (not sure whether you
can specify wildcards), and I guess stable-4.5 (or staging-4.5)
would be the most natural branch choice.

Thanks, Jan

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)
2015-06-19 9:51 ` Jan Beulich
@ 2015-06-19 11:07 ` Ian Campbell
2015-06-24 9:06 ` Ian Campbell
0 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2015-06-19 11:07 UTC (permalink / raw)
To: Jan Beulich; +Cc: Lars Kurth, Ian Jackson, xen-devel, Stefano Stabellini

On Fri, 2015-06-19 at 10:51 +0100, Jan Beulich wrote:
> >>> On 18.06.15 at 16:22, <ian.campbell@citrix.com> wrote:
> > On Thu, 2015-06-18 at 12:37 +0100, Jan Beulich wrote:
> >> >>> On 17.06.15 at 12:26, <Ian.Jackson@eu.citrix.com> wrote:
> >> > Jan Beulich writes ("stable trees (was: [xen-4.2-testing test] 58584:
> >> > regressions)"):
> >> >> Which leaves several options:
> >> >> - the problem was always there, but hidden by some factor in the
> >> >> old osstest instance,
> >> >
> >> > I think this is most likely. The old system had much older hosts.
> >> >
> >> > I think this is a race that we now happen to lose most of the time.
> >>
> >> For verification purposes, would it be possible to set up a couple of
> >> flights on the old instance for one of the stable trees?
> >
> > I can try and run something adhoc on the old system if you can let me
> > know exactly which jobs (test-*-*-*) and branches you are interested in.
>
> Any or all of test-amd64-*-xl-qemuu-win* (not sure whether you
> can specify wildcards), and I guess stable-4.5 (or staging-4.5)
> would be the most natural branch choice.

I think the tools can do wildcards, yes.

I've kicked off a full adhoc xen-4.5-testing flight so I have a local
template to copy the jobs from for some repeated runs with just the
problem flights (it's just easier to do that than to invent a cut-down
flight from scratch...).

>
> Thanks, Jan
>

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)
2015-06-19 11:07 ` Ian Campbell
@ 2015-06-24 9:06 ` Ian Campbell
2015-06-24 9:38 ` Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) Ian Campbell
2015-06-24 9:45 ` stable trees (was: [xen-4.2-testing test] 58584: regressions) Jan Beulich
0 siblings, 2 replies; 32+ messages in thread
From: Ian Campbell @ 2015-06-24 9:06 UTC (permalink / raw)
To: Jan Beulich; +Cc: Lars Kurth, Stefano Stabellini, Ian Jackson, xen-devel

On Fri, 2015-06-19 at 12:07 +0100, Ian Campbell wrote:
> On Fri, 2015-06-19 at 10:51 +0100, Jan Beulich wrote:
> > >>> On 18.06.15 at 16:22, <ian.campbell@citrix.com> wrote:
> > > On Thu, 2015-06-18 at 12:37 +0100, Jan Beulich wrote:
> > >> >>> On 17.06.15 at 12:26, <Ian.Jackson@eu.citrix.com> wrote:
> > >> > Jan Beulich writes ("stable trees (was: [xen-4.2-testing test] 58584:
> > >> > regressions)"):
> > >> >> Which leaves several options:
> > >> >> - the problem was always there, but hidden by some factor in the
> > >> >> old osstest instance,
> > >> >
> > >> > I think this is most likely. The old system had much older hosts.
> > >> >
> > >> > I think this is a race that we now happen to lose most of the time.
> > >>
> > >> For verification purposes, would it be possible to set up a couple of
> > >> flights on the old instance for one of the stable trees?
> > >
> > > I can try and run something adhoc on the old system if you can let me
> > > know exactly which jobs (test-*-*-*) and branches you are interested in.
> >
> > Any or all of test-amd64-*-xl-qemuu-win* (not sure whether you
> > can specify wildcards), and I guess stable-4.5 (or staging-4.5)
> > would be the most natural branch choice.
>
> I think the tools can do wildcards, yes.
>
> I've kicked off a full adhoc xen-4.5-testing flight so I have a local
> template to copy the jobs from for some repeated runs with just the
> problem flights (it's just easier to do that than to invent a cut-down
> flight from scratch...).

After that baseline I ran a few tests of just the windows + qemuu stuff:

http://xenbits.xen.org/people/ianc/tmp/adhoc/37619/ was allowing free
rein on the machines and was mostly successful, apart from the
windows-install failure on lake-frog. Looking at the test history this
seems to have always been a problem on the old infra.

*-frog are "AMD Opteron(tm) Processor 6168" which is as close as the
old infra has to the new colo's merlot[01] which is "AMD Opteron(tm)
Processor 6376". With that in mind I reran with things limited to the
two *-frog boxes and got
http://xenbits.xen.org/people/ianc/tmp/adhoc/37624/. The
windows-install failure of winxpsp3 persisted but there was no
migration failure elsewhere.

It's not a lot of data, but in comparison with the results in the colo:
http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-xl-qemuu-win7-amd64/xen-4.5-testing.html
it looks like it's the newer system which is exposing the issue.

Ian.

^ permalink raw reply [flat|nested] 32+ messages in thread
* Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-24 9:06 ` Ian Campbell @ 2015-06-24 9:38 ` Ian Campbell 2015-06-24 12:29 ` Dario Faggioli 2015-06-26 10:37 ` Ian Campbell 2015-06-24 9:45 ` stable trees (was: [xen-4.2-testing test] 58584: regressions) Jan Beulich 1 sibling, 2 replies; 32+ messages in thread From: Ian Campbell @ 2015-06-24 9:38 UTC (permalink / raw) To: Jan Beulich, Boris Ostrovsky, Suravee Suthikulpanit, Aravind Gopalakrishnan Cc: Lars Kurth, Stefano Stabellini, Dario Faggioli, Ian Jackson, Anthony Perard, xen-devel Adding Boris+Suravee+Aravind (AMD/SVM maintainers), Dario (NUMA) and Jim +Anthony (libvirt) to the CC. TL;DR osstest is exposing issues running on "AMD Opteron(tm) Processor 6376" in at least a couple of test cases. It would be good if someone from AMD could have a look. The systems here == merlot[01], which seem to be having with win7 live migration tests as well as libvirt when starting PV guests. They each contain "AMD Opteron(tm) Processor 6376" processors with 32 threads in 4 nodes and seem to have a strange NUMA layout with no RAM on nodes 1 or 3. The test history on these machines: http://logs.test-lab.xenproject.org/osstest/results/host/merlot0.html http://logs.test-lab.xenproject.org/osstest/results/host/merlot1.html I just posted some analysis of the windows cases (including experiments on the old Cambridge test infra with "AMD Opteron(tm) Processor 6168" processes) in: http://lists.xen.org/archives/html/xen-devel/2015-06/msg03713.html I've also been investigating the libvirt guest-start failures. The symptom is a 10s timeout starting qemu. 
Anthony is seeing this with openstack too and did some analysis in
http://thread.gmane.org/gmane.comp.emulators.xen.devel/246473/focus=249172
onwards, but it may be that this is unrelated to the osstest failures
and that for Anthony's scenario the 10s timeout could be explained by
the openstack tempest tests starting lots of VMs in parallel.

However for the osstests we are starting a single PV domain on an
otherwise idle host. There should be no reason for qemu to take as long
as 10s to come up in that case, even with pessimal NUMA layout (IMHO at
least). By comparison on other hosts starting qemu seems to take 2-4s,
so merlot is at least 2.5-5 times worse.

I tried running some adhoc tests on the old infra tied to the *-frog
machines (which are the Opteron 6168 ones):
http://xenbits.xen.org/people/ianc/tmp/adhoc/37623/
http://xenbits.xen.org/people/ianc/tmp/adhoc/37625/

The -xsm failures are because I botched the flight configuration; the
interesting information is that the other ones passed both times
(migrate-support is expected to fail at the moment).

Supposing that the NUMA oddities might be what is exposing this issue I
tried an adhoc run on the merlot machines where I specified
"dom0_max_vcpus=8 dom0_nodes=0" on the hypervisor command line:
http://logs.test-lab.xenproject.org/osstest/logs/58853/

Again, I messed up the config for the -xsm case, so ignore.

The interesting thing is that the extra NUMA settings were apparently
_not_ helpful. From
http://logs.test-lab.xenproject.org/osstest/logs/58853/test-amd64-amd64-libvirt/serial-merlot0.log
I can see they were applied:
Jun 23 15:50:34.205057 (XEN) Command line: placeholder conswitch=x watchdog com1=115200,8n1 console=com1,vga gdb=com1 dom0_mem=512M,max:512M ucode=scan dom0_max_vcpus=8 dom0_nodes=0
[...]
Jun 23 15:50:38.309057 (XEN) Dom0 has maximum 8 VCPUs The memory info Jun 23 15:56:27.749008 (XEN) Memory location of each domain: Jun 23 15:56:27.756965 (XEN) Domain 0 (total: 131072): Jun 23 15:56:27.756983 (XEN) Node 0: 126905 Jun 23 15:56:27.756998 (XEN) Node 1: 0 Jun 23 15:56:27.764952 (XEN) Node 2: 4167 Jun 23 15:56:27.764969 (XEN) Node 3: 0 suggests at least a small amount of cross-node memory allocation (16M out of dom0s 512M total). That's probably small enough to be OK. And it seems as if the 8 dom0 vcpus are correctly pinned to the first 8 cpus (the ones in node 0): Jun 23 15:56:43.797055 (XEN) VCPU information and callbacks for domain 0: Jun 23 15:56:43.797110 (XEN) VCPU0: CPU4 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={4} Jun 23 15:56:43.805078 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} Jun 23 15:56:43.813121 (XEN) pause_count=0 pause_flags=1 Jun 23 15:56:43.813157 (XEN) No periodic timer Jun 23 15:56:43.821050 (XEN) VCPU1: CPU3 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={3} Jun 23 15:56:43.829044 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} Jun 23 15:56:43.829082 (XEN) pause_count=0 pause_flags=1 Jun 23 15:56:43.837051 (XEN) No periodic timer Jun 23 15:56:43.837084 (XEN) VCPU2: CPU5 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={5} Jun 23 15:56:43.845102 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} Jun 23 15:56:43.853035 (XEN) pause_count=0 pause_flags=1 Jun 23 15:56:43.853071 (XEN) No periodic timer Jun 23 15:56:43.853099 (XEN) VCPU3: CPU7 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={7} Jun 23 15:56:43.861102 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} Jun 23 15:56:43.869110 (XEN) pause_count=0 pause_flags=1 Jun 23 15:56:43.869145 (XEN) No periodic timer Jun 23 15:56:43.877014 (XEN) VCPU4: CPU0 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={} Jun 23 15:56:43.877038 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} Jun 23 15:56:43.885053 (XEN) 
pause_count=0 pause_flags=1 Jun 23 15:56:43.885088 (XEN) No periodic timer Jun 23 15:56:43.893085 (XEN) VCPU5: CPU0 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={} Jun 23 15:56:43.901075 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} Jun 23 15:56:43.901134 (XEN) pause_count=0 pause_flags=1 Jun 23 15:56:43.909010 (XEN) No periodic timer Jun 23 15:56:43.909048 (XEN) VCPU6: CPU2 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={2} Jun 23 15:56:43.917065 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} Jun 23 15:56:43.925055 (XEN) pause_count=0 pause_flags=1 Jun 23 15:56:43.925074 (XEN) No periodic timer Jun 23 15:56:43.925095 (XEN) VCPU7: CPU6 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={6} Jun 23 15:56:43.933119 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} Jun 23 15:56:43.941080 (XEN) pause_count=0 pause_flags=1 Jun 23 15:56:43.941129 (XEN) No periodic timer So whatever the issue is it doesn't seem to be particularly related to the strange NUMA layout. Ian. ^ permalink raw reply [flat|nested] 32+ messages in thread
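The "16M out of dom0s 512M total" figure in the message above can be checked from the per-node page counts in the "Memory location of each domain" dump. The following is only a rough sketch of that arithmetic: it assumes the counts are 4 KiB pages (the usual x86 page size), and the names `PAGE_SIZE_KIB`, `node_pages` and `pages_to_mib` are made up for illustration rather than taken from Xen or osstest.

```python
import re

# Assumption: the dump counts 4 KiB x86 pages.
PAGE_SIZE_KIB = 4

# Per-node lines copied from the serial log quoted in the message above.
MEMORY_DUMP = """\
(XEN) Domain 0 (total: 131072):
(XEN) Node 0: 126905
(XEN) Node 1: 0
(XEN) Node 2: 4167
(XEN) Node 3: 0
"""


def node_pages(dump):
    """Map node number -> page count for every 'Node N: count' line."""
    return {int(node): int(count)
            for node, count in re.findall(r"Node (\d+): (\d+)", dump)}


def pages_to_mib(pages):
    """Convert a page count to MiB under the 4 KiB page assumption."""
    return pages * PAGE_SIZE_KIB / 1024


pages = node_pages(MEMORY_DUMP)
total_pages = sum(pages.values())
# Share of dom0's memory that landed outside node 0.
off_node_share = (total_pages - pages[0]) / total_pages

print(f"total:  {pages_to_mib(total_pages):.1f} MiB")
print(f"node 2: {pages_to_mib(pages[2]):.1f} MiB ({off_node_share:.1%} off-node)")
```

Under that assumption the node 2 allocation comes out at a little over 16 MiB of 512 MiB, which matches the estimate given in the mail.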
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-24 9:38 ` Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) Ian Campbell @ 2015-06-24 12:29 ` Dario Faggioli 2015-06-24 12:41 ` Jan Beulich 2015-06-26 10:37 ` Ian Campbell 1 sibling, 1 reply; 32+ messages in thread From: Dario Faggioli @ 2015-06-24 12:29 UTC (permalink / raw) To: Ian Campbell Cc: Lars Kurth, Jan Beulich, Stefano Stabellini, Ian Jackson, Aravind Gopalakrishnan, Suravee Suthikulpanit, Anthony Perard, xen-devel, Boris Ostrovsky [-- Attachment #1.1: Type: text/plain, Size: 5912 bytes --] On Wed, 2015-06-24 at 10:38 +0100, Ian Campbell wrote: > Adding Boris+Suravee+Aravind (AMD/SVM maintainers), Dario (NUMA) and Jim > +Anthony (libvirt) to the CC. > Supposing that the NUMA oddities might be what is exposing this issue I > tried an adhoc run on the merlot machines where I specified > "dom0_max_vcpus=8 dom0_nodes=0" on the hypervisor command line: > http://logs.test-lab.xenproject.org/osstest/logs/58853/ > > Again, I messed up the config for the -xsm case, so ignore. > > The interesting thing is that the extra NUMA settings were > apparently_not_ helpful. From > http://logs.test-lab.xenproject.org/osstest/logs/58853/test-amd64-amd64-libvirt/serial-merlot0.log I can see they were applied: > Jun 23 15:50:34.205057 (XEN) Command line: placeholder conswitch=x watchdog com1=115200,8n1 console=com1,vga gdb=com1 dom0_mem=512M,max:512M ucode=scan dom0_max_vcpus=8 dom0_nodes=0 > [...] > Jun 23 15:50:38.309057 (XEN) Dom0 has maximum 8 VCPUs > IIRC, you can drop the dom0_max_vcpus=8, as Xen would figure it out automatically, as a consequence of dom0_nodes=0. In any case, it doesn't hurt. Maybe we can try running this again with dom0_nodes=2 (the other node with memory attached). 
I wouldn't know what to expect, though, so, yes, it's a shot in the dark, but since we're out of plausible theories... :-/ > The memory info > Jun 23 15:56:27.749008 (XEN) Memory location of each domain: > Jun 23 15:56:27.756965 (XEN) Domain 0 (total: 131072): > Jun 23 15:56:27.756983 (XEN) Node 0: 126905 > Jun 23 15:56:27.756998 (XEN) Node 1: 0 > Jun 23 15:56:27.764952 (XEN) Node 2: 4167 > Jun 23 15:56:27.764969 (XEN) Node 3: 0 > suggests at least a small amount of cross-node memory allocation (16M > out of dom0s 512M total). That's probably small enough to be OK. > Yeah, that is in line with what you usually get with dom0_nodes. Most of the memory, as you noted, comes from the proper node. We're just not (yet?) at the point where _all_ of it can come from there. > And it seems as if the 8 dom0 vcpus are correctly pinned to the first 8 > cpus (the ones in node 0): > Jun 23 15:56:43.797055 (XEN) VCPU information and callbacks for domain 0: > Jun 23 15:56:43.797110 (XEN) VCPU0: CPU4 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={4} > Jun 23 15:56:43.805078 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} > Jun 23 15:56:43.813121 (XEN) pause_count=0 pause_flags=1 > Jun 23 15:56:43.813157 (XEN) No periodic timer > Jun 23 15:56:43.821050 (XEN) VCPU1: CPU3 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={3} > Jun 23 15:56:43.829044 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} > Jun 23 15:56:43.829082 (XEN) pause_count=0 pause_flags=1 > Jun 23 15:56:43.837051 (XEN) No periodic timer > Jun 23 15:56:43.837084 (XEN) VCPU2: CPU5 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={5} > Jun 23 15:56:43.845102 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} > Jun 23 15:56:43.853035 (XEN) pause_count=0 pause_flags=1 > Jun 23 15:56:43.853071 (XEN) No periodic timer > Jun 23 15:56:43.853099 (XEN) VCPU3: CPU7 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={7} > Jun 23 15:56:43.861102 (XEN) cpu_hard_affinity={0-7} 
cpu_soft_affinity={0-7} > Jun 23 15:56:43.869110 (XEN) pause_count=0 pause_flags=1 > Jun 23 15:56:43.869145 (XEN) No periodic timer > Jun 23 15:56:43.877014 (XEN) VCPU4: CPU0 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={} > Jun 23 15:56:43.877038 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} > Jun 23 15:56:43.885053 (XEN) pause_count=0 pause_flags=1 > Jun 23 15:56:43.885088 (XEN) No periodic timer > Jun 23 15:56:43.893085 (XEN) VCPU5: CPU0 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={} > Jun 23 15:56:43.901075 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} > Jun 23 15:56:43.901134 (XEN) pause_count=0 pause_flags=1 > Jun 23 15:56:43.909010 (XEN) No periodic timer > Jun 23 15:56:43.909048 (XEN) VCPU6: CPU2 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={2} > Jun 23 15:56:43.917065 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} > Jun 23 15:56:43.925055 (XEN) pause_count=0 pause_flags=1 > Jun 23 15:56:43.925074 (XEN) No periodic timer > Jun 23 15:56:43.925095 (XEN) VCPU7: CPU6 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={6} > Jun 23 15:56:43.933119 (XEN) cpu_hard_affinity={0-7} cpu_soft_affinity={0-7} > Jun 23 15:56:43.941080 (XEN) pause_count=0 pause_flags=1 > Jun 23 15:56:43.941129 (XEN) No periodic timer > > So whatever the issue is it doesn't seem to be particularly related to > the strange NUMA layout. > Exactly. Inspecting the logs and looking at the dump of scheduler info, pCPUs info and vCPUs info, everything seems completely and fully idle, at the time the debugkeys are sent to the box. There are no vCPUs active or waiting in any runqueue, all the host pCPUs are in idle_loop() and all Dom0 vCPUs are in ffffffff810013aa, which should be xen_hypercall_sched_op... 
So, if there was something keeping the system busy enough to make QEMU miss the 10 secs timeout, any dead or live lock, either in Xen or Dom0, it seems to be gone by when we realize things have gone bad and go inspecting the system (as a further, although of course not conclusive, proof of that, we do manage to see the output of `xl info', `xl list', etc., performed during ts-capture-logs, so the system is indeed able to respond). Regards, Dario -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) [-- Attachment #1.2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 181 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions))
2015-06-24 12:29 ` Dario Faggioli
@ 2015-06-24 12:41 ` Jan Beulich
2015-06-24 13:15 ` Dario Faggioli
0 siblings, 1 reply; 32+ messages in thread
From: Jan Beulich @ 2015-06-24 12:41 UTC (permalink / raw)
To: Dario Faggioli, Ian Campbell
Cc: Lars Kurth, Stefano Stabellini, Ian Jackson, Aravind Gopalakrishnan, Suravee Suthikulpanit, Anthony Perard, xen-devel, Boris Ostrovsky

>>> On 24.06.15 at 14:29, <dario.faggioli@citrix.com> wrote:
> On Wed, 2015-06-24 at 10:38 +0100, Ian Campbell wrote:
>> The memory info
>> Jun 23 15:56:27.749008 (XEN) Memory location of each domain:
>> Jun 23 15:56:27.756965 (XEN) Domain 0 (total: 131072):
>> Jun 23 15:56:27.756983 (XEN) Node 0: 126905
>> Jun 23 15:56:27.756998 (XEN) Node 1: 0
>> Jun 23 15:56:27.764952 (XEN) Node 2: 4167
>> Jun 23 15:56:27.764969 (XEN) Node 3: 0
>> suggests at least a small amount of cross-node memory allocation (16M
>> out of dom0s 512M total). That's probably small enough to be OK.
>>
> Yeah, that is in line with what you usually get with dom0_nodes. Most of
> the memory, as you noted, comes from the proper node. We're just not
> (yet?) at the point where _all_ of it can come from there.

Actually as long as there is enough memory on the requested node
(minus any amount set aside for the DMA pool), this shouldn't
happen (and I had seen this to be clean in my own testing). There
being 8Gb per node, I see no immediate reason why memory from
node 2 would be handed out. Still I wouldn't suspect this to matter
here.

Jan

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-24 12:41 ` Jan Beulich @ 2015-06-24 13:15 ` Dario Faggioli 2015-06-24 13:28 ` Jan Beulich 0 siblings, 1 reply; 32+ messages in thread From: Dario Faggioli @ 2015-06-24 13:15 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel, Boris Ostrovsky [-- Attachment #1.1: Type: text/plain, Size: 2198 bytes --] [Moving most people to Bcc, as this is indeed unrelated to the original topic] On Wed, 2015-06-24 at 13:41 +0100, Jan Beulich wrote: > >>> On 24.06.15 at 14:29, <dario.faggioli@citrix.com> wrote: > > On Wed, 2015-06-24 at 10:38 +0100, Ian Campbell wrote: > >> The memory info > >> Jun 23 15:56:27.749008 (XEN) Memory location of each domain: > >> Jun 23 15:56:27.756965 (XEN) Domain 0 (total: 131072): > >> Jun 23 15:56:27.756983 (XEN) Node 0: 126905 > >> Jun 23 15:56:27.756998 (XEN) Node 1: 0 > >> Jun 23 15:56:27.764952 (XEN) Node 2: 4167 > >> Jun 23 15:56:27.764969 (XEN) Node 3: 0 > >> suggests at least a small amount of cross-node memory allocation (16M > >> out of dom0s 512M total). That's probably small enough to be OK. > >> > > Yeah, that is in line with what you usually get with dom0_nodes. Most of > > the memory, as you noted, comes from the proper node. We're just not > > (yet?) at the point where _all_ of it can come from there. > > Actually as long as there is enough memory on the requested node > (minus any amount set aside for the DMA pool), this shouldn't > happen (and I had seen this to be clean in my own testing). > ISTR some allocation not being 'converted'. Perhaps I'm misremembering. > There > being 8Gb per node, I see no immediate reason why memory from > node 2 would be handed out. Still I wouldn't suspect this to matter > here. 
>
On my 2 nodes test box with the following configuration:
(XEN) SRAT: Node 1 PXM 1 0-dc000000
(XEN) SRAT: Node 1 PXM 1 100000000-1a4000000
(XEN) SRAT: Node 0 PXM 0 1a4000000-324000000

with 'dom0_nodes=0', I see this:
(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 131072):
(XEN) Node 0: 114664
(XEN) Node 1: 16408

while with 'dom0_nodes=1', this:
(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 131072):
(XEN) Node 0: 7749
(XEN) Node 1: 123323

Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

^ permalink raw reply [flat|nested] 32+ messages in thread
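To put the SRAT ranges of the 2-node test box above into perspective, it may help to total the memory each node owns. This is a minimal sketch, not anything taken from Xen itself: it assumes each range is printed as hexadecimal `start-end` with an exclusive end address, and the regular expression and the `node_sizes_mib` helper are hypothetical names introduced here.

```python
import re
from collections import defaultdict

# SRAT lines copied from the message above.
SRAT_LOG = """\
(XEN) SRAT: Node 1 PXM 1 0-dc000000
(XEN) SRAT: Node 1 PXM 1 100000000-1a4000000
(XEN) SRAT: Node 0 PXM 0 1a4000000-324000000
"""

# Assumption: "start-end" are hex byte addresses and end is exclusive.
SRAT_RE = re.compile(r"SRAT: Node (\d+) PXM \d+ ([0-9a-f]+)-([0-9a-f]+)")


def node_sizes_mib(log):
    """Sum the size of every SRAT range per node and return MiB totals."""
    size_bytes = defaultdict(int)
    for node, start, end in SRAT_RE.findall(log):
        size_bytes[int(node)] += int(end, 16) - int(start, 16)
    return {node: size >> 20 for node, size in size_bytes.items()}


for node, mib in sorted(node_sizes_mib(SRAT_LOG).items()):
    print(f"node {node}: {mib} MiB")
```

Under those assumptions each node holds 6 GiB, well above dom0's 512M, so a plain shortage of memory on the requested node would not obviously explain the off-node pages. The same ranges also show that all memory below 4 GiB on this box is attributed to node 1.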
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-24 13:15 ` Dario Faggioli @ 2015-06-24 13:28 ` Jan Beulich 2015-06-24 13:54 ` Dario Faggioli 0 siblings, 1 reply; 32+ messages in thread From: Jan Beulich @ 2015-06-24 13:28 UTC (permalink / raw) To: Dario Faggioli; +Cc: xen-devel, Boris Ostrovsky >>> On 24.06.15 at 15:15, <dario.faggioli@citrix.com> wrote: > [Moving most people to Bcc, as this is indeed unrelated to the original > topic] > > On Wed, 2015-06-24 at 13:41 +0100, Jan Beulich wrote: >> >>> On 24.06.15 at 14:29, <dario.faggioli@citrix.com> wrote: >> > On Wed, 2015-06-24 at 10:38 +0100, Ian Campbell wrote: >> >> The memory info >> >> Jun 23 15:56:27.749008 (XEN) Memory location of each domain: >> >> Jun 23 15:56:27.756965 (XEN) Domain 0 (total: 131072): >> >> Jun 23 15:56:27.756983 (XEN) Node 0: 126905 >> >> Jun 23 15:56:27.756998 (XEN) Node 1: 0 >> >> Jun 23 15:56:27.764952 (XEN) Node 2: 4167 >> >> Jun 23 15:56:27.764969 (XEN) Node 3: 0 >> >> suggests at least a small amount of cross-node memory allocation (16M >> >> out of dom0s 512M total). That's probably small enough to be OK. >> >> >> > Yeah, that is in line with what you usually get with dom0_nodes. Most of >> > the memory, as you noted, comes from the proper node. We're just not >> > (yet?) at the point where _all_ of it can come from there. >> >> Actually as long as there is enough memory on the requested node >> (minus any amount set aside for the DMA pool), this shouldn't >> happen (and I had seen this to be clean in my own testing). >> > ISTR some allocation not being 'converted'. Perhaps I'm misremembering. Quite possible that I overlooked some. >> There >> being 8Gb per node, I see no immediate reason why memory from >> node 2 would be handed out. Still I wouldn't suspect this to matter >> here. 
>> > On my 2 nodes test box with the following configuration: > (XEN) SRAT: Node 1 PXM 1 0-dc000000 > (XEN) SRAT: Node 1 PXM 1 100000000-1a4000000 > (XEN) SRAT: Node 0 PXM 0 1a4000000-324000000 > > with 'dom0_nodes=0', I see this: > (XEN) Memory location of each domain: > (XEN) Domain 0 (total: 131072): > (XEN) Node 0: 114664 > (XEN) Node 1: 16408 > > while with 'dom0_nodes=1', this: > (XEN) Memory location of each domain: > (XEN) Domain 0 (total: 131072): > (XEN) Node 0: 7749 > (XEN) Node 1: 123323 In the latter case I'm not surprised, except by the odd number: The SWIOTLB would (except on very small systems, which normally wouldn't be NUMA anyway) always live on node 0. But overall it looks like there's something needing to be fixed. Jan ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-24 13:28 ` Jan Beulich @ 2015-06-24 13:54 ` Dario Faggioli 0 siblings, 0 replies; 32+ messages in thread From: Dario Faggioli @ 2015-06-24 13:54 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel, Boris Ostrovsky [-- Attachment #1.1: Type: text/plain, Size: 1946 bytes --] On Wed, 2015-06-24 at 14:28 +0100, Jan Beulich wrote: > >>> On 24.06.15 at 15:15, <dario.faggioli@citrix.com> wrote: > > ISTR some allocation not being 'converted'. Perhaps I'm misremembering. > > Quite possible that I overlooked some. > I meant something that we (you!) were aware of, that came up during review. I therefore went back and checked the archives, and found out that what my memory was hinting at was the discussion that there has been about the IOMMU side of the original series submission: http://lists.xenproject.org/archives/html/xen-devel/2015-03/msg00690.html http://lists.xenproject.org/archives/html/xen-devel/2015-03/msg00681.html Anyway... > > On my 2 nodes test box with the following configuration: > > (XEN) SRAT: Node 1 PXM 1 0-dc000000 > > (XEN) SRAT: Node 1 PXM 1 100000000-1a4000000 > > (XEN) SRAT: Node 0 PXM 0 1a4000000-324000000 > > > > with 'dom0_nodes=0', I see this: > > (XEN) Memory location of each domain: > > (XEN) Domain 0 (total: 131072): > > (XEN) Node 0: 114664 > > (XEN) Node 1: 16408 > > > > while with 'dom0_nodes=1', this: > > (XEN) Memory location of each domain: > > (XEN) Domain 0 (total: 131072): > > (XEN) Node 0: 7749 > > (XEN) Node 1: 123323 > > In the latter case I'm not surprised, except by the odd number: The > SWIOTLB would (except on very small systems, which normally > wouldn't be NUMA anyway) always live on node 0. > ...Right. > But overall it looks like there's something needing to be fixed. > I can have a look... If you also will, and would like me to check or run/test anything on the box above, feel free to ask. 
:-) Regards, Dario -- <<This happens because I choose it to happen!>> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions))
2015-06-24 9:38 ` Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) Ian Campbell
2015-06-24 12:29 ` Dario Faggioli
@ 2015-06-26 10:37 ` Ian Campbell
2015-06-26 10:49 ` Jan Beulich
2015-06-26 12:20 ` Jan Beulich
1 sibling, 2 replies; 32+ messages in thread
From: Ian Campbell @ 2015-06-26 10:37 UTC (permalink / raw)
To: Jan Beulich
Cc: Lars Kurth, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, Suravee Suthikulpanit, Anthony Perard, xen-devel, Boris Ostrovsky

On Wed, 2015-06-24 at 10:38 +0100, Ian Campbell wrote:
> TL;DR osstest is exposing issues running on "AMD Opteron(tm) Processor
> 6376" in at least a couple of test cases. It would be good if someone
> from AMD could have a look.

At Andy Cooper's request I ran a quick job with mtrr.show=true
http://logs.test-lab.xenproject.org/osstest/logs/58909/

I think the relevant serial output is:
Jun 26 09:57:42.325077 (XEN) MTRR default type: uncachable
Jun 26 09:57:42.325111 (XEN) MTRR fixed ranges enabled:
Jun 26 09:57:42.333068 (XEN) 00000-9ffff write-back
Jun 26 09:57:42.333101 (XEN) a0000-bffff uncachable
Jun 26 09:57:42.333128 (XEN) c0000-fffff write-back
Jun 26 09:57:42.341077 (XEN) MTRR variable ranges enabled:
Jun 26 09:57:42.341110 (XEN) 0 base 000000000000 mask ffff80000000 write-back
Jun 26 09:57:42.349088 (XEN) 1 base 000080000000 mask ffffc0000000 write-back
Jun 26 09:57:42.349124 (XEN) 2 disabled
Jun 26 09:57:42.357068 (XEN) 3 disabled
Jun 26 09:57:42.357098 (XEN) 4 disabled
Jun 26 09:57:42.357122 (XEN) 5 disabled
Jun 26 09:57:42.357147 (XEN) 6 disabled
Jun 26 09:57:42.365063 (XEN) 7 disabled

Ian.

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-26 10:37 ` Ian Campbell @ 2015-06-26 10:49 ` Jan Beulich 2015-06-26 11:16 ` Ian Campbell 2015-06-26 12:20 ` Jan Beulich 1 sibling, 1 reply; 32+ messages in thread From: Jan Beulich @ 2015-06-26 10:49 UTC (permalink / raw) To: Ian Campbell Cc: Lars Kurth, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, Suravee Suthikulpanit, Anthony Perard, xen-devel, Boris Ostrovsky >>> On 26.06.15 at 12:37, <ian.campbell@citrix.com> wrote: > At Andy Cooper's request I ran a quick job with mtrr.show=true > http://logs.test-lab.xenproject.org/osstest/logs/58909/ > > I think the relevant serial output is: > Jun 26 09:57:42.325077 (XEN) MTRR default type: uncachable > Jun 26 09:57:42.325111 (XEN) MTRR fixed ranges enabled: > Jun 26 09:57:42.333068 (XEN) 00000-9ffff write-back > Jun 26 09:57:42.333101 (XEN) a0000-bffff uncachable > Jun 26 09:57:42.333128 (XEN) c0000-fffff write-back > Jun 26 09:57:42.341077 (XEN) MTRR variable ranges enabled: > Jun 26 09:57:42.341110 (XEN) 0 base 000000000000 mask ffff80000000 write-back > Jun 26 09:57:42.349088 (XEN) 1 base 000080000000 mask ffffc0000000 write-back > Jun 26 09:57:42.349124 (XEN) 2 disabled > Jun 26 09:57:42.357068 (XEN) 3 disabled > Jun 26 09:57:42.357098 (XEN) 4 disabled > Jun 26 09:57:42.357122 (XEN) 5 disabled > Jun 26 09:57:42.357147 (XEN) 6 disabled > Jun 26 09:57:42.365063 (XEN) 7 disabled This alone would mean UC for all memory above 4G. But I seem to recall AMD having some mechanism to avoid using MTRRs for this case. Let me try to dig this out once back from lunch. Jan ^ permalink raw reply [flat|nested] 32+ messages in thread
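The observation that the MTRR dump alone "would mean UC for all memory above 4G" can be reproduced by decoding the two enabled variable ranges. What follows is a sketch under stated assumptions only: the physical address width is taken to be 48 bits (inferred from the 12-hex-digit masks in the dump), the fixed ranges below 1 MiB and the overlap rules between ranges are ignored, and it does not model any AMD-specific mechanism that may override MTRRs for high memory. The names `decode` and `memory_type` are illustrative.

```python
# Assumption: 48-bit physical addresses, inferred from the printed mask width.
PHYS_BITS = 48
# "MTRR default type" as reported in the dump.
DEFAULT_TYPE = "uncachable"

# (base, mask, type) for variable ranges 0 and 1 from the dump; 2-7 are disabled.
VARIABLE_RANGES = [
    (0x000000000000, 0xFFFF80000000, "write-back"),
    (0x000080000000, 0xFFFFC0000000, "write-back"),
]

GIB = 1 << 30


def decode(base, mask, phys_bits=PHYS_BITS):
    """Return the half-open [start, end) span covered by one variable range."""
    full = (1 << phys_bits) - 1
    size = (~mask & full) + 1
    # A well-formed mask is contiguous, so the covered size is a power of two.
    if size & (size - 1):
        raise ValueError("non-contiguous MTRR mask")
    return base, base + size


def memory_type(addr):
    """Type for an address at or above 1 MiB, using variable ranges only."""
    for base, mask, mtype in VARIABLE_RANGES:
        start, end = decode(base, mask)
        if start <= addr < end:
            return mtype
    return DEFAULT_TYPE


spans = [decode(base, mask) for base, mask, _ in VARIABLE_RANGES]
for start, end in spans:
    print(f"{start:#014x}-{end:#014x} write-back ({(end - start) // GIB} GiB)")
print("4 GiB and up:", memory_type(4 * GIB))
```

With these assumptions the two ranges cover 0-2 GiB and 2-3 GiB as write-back, and anything from 3 GiB upwards, including all RAM above 4 GiB, falls back to the uncachable default.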
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-26 10:49 ` Jan Beulich @ 2015-06-26 11:16 ` Ian Campbell 2015-06-26 12:37 ` Ian Campbell 0 siblings, 1 reply; 32+ messages in thread From: Ian Campbell @ 2015-06-26 11:16 UTC (permalink / raw) To: Jan Beulich Cc: Lars Kurth, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, Suravee Suthikulpanit, Anthony Perard, xen-devel, Boris Ostrovsky On Fri, 2015-06-26 at 11:49 +0100, Jan Beulich wrote: > >>> On 26.06.15 at 12:37, <ian.campbell@citrix.com> wrote: > > At Andy Cooper's request I ran a quick job with mtrr.show=true > > http://logs.test-lab.xenproject.org/osstest/logs/58909/ > > > > I think the relevant serial output is: > > Jun 26 09:57:42.325077 (XEN) MTRR default type: uncachable > > Jun 26 09:57:42.325111 (XEN) MTRR fixed ranges enabled: > > Jun 26 09:57:42.333068 (XEN) 00000-9ffff write-back > > Jun 26 09:57:42.333101 (XEN) a0000-bffff uncachable > > Jun 26 09:57:42.333128 (XEN) c0000-fffff write-back > > Jun 26 09:57:42.341077 (XEN) MTRR variable ranges enabled: > > Jun 26 09:57:42.341110 (XEN) 0 base 000000000000 mask ffff80000000 write-back > > Jun 26 09:57:42.349088 (XEN) 1 base 000080000000 mask ffffc0000000 write-back > > Jun 26 09:57:42.349124 (XEN) 2 disabled > > Jun 26 09:57:42.357068 (XEN) 3 disabled > > Jun 26 09:57:42.357098 (XEN) 4 disabled > > Jun 26 09:57:42.357122 (XEN) 5 disabled > > Jun 26 09:57:42.357147 (XEN) 6 disabled > > Jun 26 09:57:42.365063 (XEN) 7 disabled > > This alone would mean UC for all memory above 4G. But I seem to > recall AMD having some mechanism to avoid using MTRRs for this > case. Let me try to dig this out once back from lunch. While you do that it seems like I may as well try a run with "e820-mtrr-clip" given to Xen. Ian. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-26 11:16 ` Ian Campbell @ 2015-06-26 12:37 ` Ian Campbell 2015-06-26 12:59 ` Jan Beulich 0 siblings, 1 reply; 32+ messages in thread From: Ian Campbell @ 2015-06-26 12:37 UTC (permalink / raw) To: Jan Beulich Cc: Lars Kurth, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, Suravee Suthikulpanit, Anthony Perard, xen-devel, Boris Ostrovsky On Fri, 2015-06-26 at 12:16 +0100, Ian Campbell wrote: > On Fri, 2015-06-26 at 11:49 +0100, Jan Beulich wrote: > > >>> On 26.06.15 at 12:37, <ian.campbell@citrix.com> wrote: > > > At Andy Cooper's request I ran a quick job with mtrr.show=true > > > http://logs.test-lab.xenproject.org/osstest/logs/58909/ > > > > > > I think the relevant serial output is: > > > Jun 26 09:57:42.325077 (XEN) MTRR default type: uncachable > > > Jun 26 09:57:42.325111 (XEN) MTRR fixed ranges enabled: > > > Jun 26 09:57:42.333068 (XEN) 00000-9ffff write-back > > > Jun 26 09:57:42.333101 (XEN) a0000-bffff uncachable > > > Jun 26 09:57:42.333128 (XEN) c0000-fffff write-back > > > Jun 26 09:57:42.341077 (XEN) MTRR variable ranges enabled: > > > Jun 26 09:57:42.341110 (XEN) 0 base 000000000000 mask ffff80000000 write-back > > > Jun 26 09:57:42.349088 (XEN) 1 base 000080000000 mask ffffc0000000 write-back > > > Jun 26 09:57:42.349124 (XEN) 2 disabled > > > Jun 26 09:57:42.357068 (XEN) 3 disabled > > > Jun 26 09:57:42.357098 (XEN) 4 disabled > > > Jun 26 09:57:42.357122 (XEN) 5 disabled > > > Jun 26 09:57:42.357147 (XEN) 6 disabled > > > Jun 26 09:57:42.365063 (XEN) 7 disabled > > > > This alone would mean UC for all memory above 4G. But I seem to > > recall AMD having some mechanism to avoid using MTRRs for this > > case. Let me try to dig this out once back from lunch. > > While you do that it seems like I may as well try a run with > "e820-mtrr-clip" given to Xen. 
According to http://logs.test-lab.xenproject.org/osstest/logs/58914/ it
didn't make any difference to the end result.

It did seem to cause a huge number of
Jun 26 11:51:29.933067 (XEN) AMD-Vi: IO_PAGE_FAULT: domain = 0, device id = 0x92, fault address = 0xbdfe7000, flags = 0
messages which weren't there before; I'm not sure if that is a clue or
not.

Ian.

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-26 12:37 ` Ian Campbell @ 2015-06-26 12:59 ` Jan Beulich 2015-06-26 14:44 ` Ian Campbell 0 siblings, 1 reply; 32+ messages in thread From: Jan Beulich @ 2015-06-26 12:59 UTC (permalink / raw) To: Ian Campbell Cc: Lars Kurth, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, Suravee Suthikulpanit, Anthony Perard, xen-devel, Boris Ostrovsky >>> On 26.06.15 at 14:37, <ian.campbell@citrix.com> wrote: > On Fri, 2015-06-26 at 12:16 +0100, Ian Campbell wrote: >> On Fri, 2015-06-26 at 11:49 +0100, Jan Beulich wrote: >> > >>> On 26.06.15 at 12:37, <ian.campbell@citrix.com> wrote: >> > > At Andy Cooper's request I ran a quick job with mtrr.show=true >> > > http://logs.test-lab.xenproject.org/osstest/logs/58909/ >> > > >> > > I think the relevant serial output is: >> > > Jun 26 09:57:42.325077 (XEN) MTRR default type: uncachable >> > > Jun 26 09:57:42.325111 (XEN) MTRR fixed ranges enabled: >> > > Jun 26 09:57:42.333068 (XEN) 00000-9ffff write-back >> > > Jun 26 09:57:42.333101 (XEN) a0000-bffff uncachable >> > > Jun 26 09:57:42.333128 (XEN) c0000-fffff write-back >> > > Jun 26 09:57:42.341077 (XEN) MTRR variable ranges enabled: >> > > Jun 26 09:57:42.341110 (XEN) 0 base 000000000000 mask ffff80000000 > write-back >> > > Jun 26 09:57:42.349088 (XEN) 1 base 000080000000 mask ffffc0000000 > write-back >> > > Jun 26 09:57:42.349124 (XEN) 2 disabled >> > > Jun 26 09:57:42.357068 (XEN) 3 disabled >> > > Jun 26 09:57:42.357098 (XEN) 4 disabled >> > > Jun 26 09:57:42.357122 (XEN) 5 disabled >> > > Jun 26 09:57:42.357147 (XEN) 6 disabled >> > > Jun 26 09:57:42.365063 (XEN) 7 disabled >> > >> > This alone would mean UC for all memory above 4G. But I seem to >> > recall AMD having some mechanism to avoid using MTRRs for this >> > case. Let me try to dig this out once back from lunch. 
>> >> While you do that it seems like I may as well try a run with >> "e820-mtrr-clip" given to Xen. > > According to http://logs.test-lab.xenproject.org/osstest/logs/58914/ it > didn't make any difference to the end result. > > It did seems to cause a huge number of > Jun 26 11:51:29.933067 (XEN) AMD-Vi: IO_PAGE_FAULT: domain = 0, device id = > 0x92, fault address = 0xbdfe7000, flags = 0 > messages which weren't there before, not sure if that is a clue or not. I think that's a result of amd_iommu_hwdom_init() now stopping below the reserved ranges right below 3Gb. I.e. these ought to go away if you had the system use minimally more than 4Gb. I also think that you'd see those too without limiting memory if the reserved range was large enough to not share a PDX with the highest RAM page below 4Gb (due to the way mfn_valid() works), or if we indeed only mapped RAM pages there. Jan ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-26 12:59 ` Jan Beulich @ 2015-06-26 14:44 ` Ian Campbell 2015-06-26 14:53 ` Jan Beulich 0 siblings, 1 reply; 32+ messages in thread From: Ian Campbell @ 2015-06-26 14:44 UTC (permalink / raw) To: Jan Beulich Cc: Lars Kurth, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, Suravee Suthikulpanit, Anthony Perard, xen-devel, Boris Ostrovsky On Fri, 2015-06-26 at 13:59 +0100, Jan Beulich wrote: > >>> On 26.06.15 at 14:37, <ian.campbell@citrix.com> wrote: > > On Fri, 2015-06-26 at 12:16 +0100, Ian Campbell wrote: > >> On Fri, 2015-06-26 at 11:49 +0100, Jan Beulich wrote: > >> > >>> On 26.06.15 at 12:37, <ian.campbell@citrix.com> wrote: > >> > > At Andy Cooper's request I ran a quick job with mtrr.show=true > >> > > http://logs.test-lab.xenproject.org/osstest/logs/58909/ > >> > > > >> > > I think the relevant serial output is: > >> > > Jun 26 09:57:42.325077 (XEN) MTRR default type: uncachable > >> > > Jun 26 09:57:42.325111 (XEN) MTRR fixed ranges enabled: > >> > > Jun 26 09:57:42.333068 (XEN) 00000-9ffff write-back > >> > > Jun 26 09:57:42.333101 (XEN) a0000-bffff uncachable > >> > > Jun 26 09:57:42.333128 (XEN) c0000-fffff write-back > >> > > Jun 26 09:57:42.341077 (XEN) MTRR variable ranges enabled: > >> > > Jun 26 09:57:42.341110 (XEN) 0 base 000000000000 mask ffff80000000 > > write-back > >> > > Jun 26 09:57:42.349088 (XEN) 1 base 000080000000 mask ffffc0000000 > > write-back > >> > > Jun 26 09:57:42.349124 (XEN) 2 disabled > >> > > Jun 26 09:57:42.357068 (XEN) 3 disabled > >> > > Jun 26 09:57:42.357098 (XEN) 4 disabled > >> > > Jun 26 09:57:42.357122 (XEN) 5 disabled > >> > > Jun 26 09:57:42.357147 (XEN) 6 disabled > >> > > Jun 26 09:57:42.365063 (XEN) 7 disabled > >> > > >> > This alone would mean UC for all memory above 4G. 
But I seem to > >> > recall AMD having some mechanism to avoid using MTRRs for this > >> > case. Let me try to dig this out once back from lunch.
> >>
> >> While you do that it seems like I may as well try a run with
> >> "e820-mtrr-clip" given to Xen.
> >
> > According to http://logs.test-lab.xenproject.org/osstest/logs/58914/ it
> > didn't make any difference to the end result.
> >
> > It did seem to cause a huge number of
> > Jun 26 11:51:29.933067 (XEN) AMD-Vi: IO_PAGE_FAULT: domain = 0, device id =
> > 0x92, fault address = 0xbdfe7000, flags = 0
> > messages which weren't there before, not sure if that is a clue or not.
>
> I think that's a result of amd_iommu_hwdom_init() now stopping
> below the reserved ranges right below 3Gb. I.e. these ought to
> go away if you had the system use minimally more than 4Gb. I
> also think that you'd see those too without limiting memory if the
> reserved range was large enough to not share a PDX with the
> highest RAM page below 4Gb (due to the way mfn_valid() works),
> or if we indeed only mapped RAM pages there.

I think you are probably speaking hypothetically, but just in case: Do
you actually want me to do any of that? I'm not sure how easy it would
be.

Ian.

* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-26 14:44 ` Ian Campbell @ 2015-06-26 14:53 ` Jan Beulich 0 siblings, 0 replies; 32+ messages in thread From: Jan Beulich @ 2015-06-26 14:53 UTC (permalink / raw) To: Ian Campbell Cc: Lars Kurth, StefanoStabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, SuraveeSuthikulpanit, Anthony Perard, xen-devel, Boris Ostrovsky >>> On 26.06.15 at 16:44, <ian.campbell@citrix.com> wrote: > On Fri, 2015-06-26 at 13:59 +0100, Jan Beulich wrote: >> >>> On 26.06.15 at 14:37, <ian.campbell@citrix.com> wrote: >> > On Fri, 2015-06-26 at 12:16 +0100, Ian Campbell wrote: >> >> On Fri, 2015-06-26 at 11:49 +0100, Jan Beulich wrote: >> >> > >>> On 26.06.15 at 12:37, <ian.campbell@citrix.com> wrote: >> >> > > At Andy Cooper's request I ran a quick job with mtrr.show=true >> >> > > http://logs.test-lab.xenproject.org/osstest/logs/58909/ >> >> > > >> >> > > I think the relevant serial output is: >> >> > > Jun 26 09:57:42.325077 (XEN) MTRR default type: uncachable >> >> > > Jun 26 09:57:42.325111 (XEN) MTRR fixed ranges enabled: >> >> > > Jun 26 09:57:42.333068 (XEN) 00000-9ffff write-back >> >> > > Jun 26 09:57:42.333101 (XEN) a0000-bffff uncachable >> >> > > Jun 26 09:57:42.333128 (XEN) c0000-fffff write-back >> >> > > Jun 26 09:57:42.341077 (XEN) MTRR variable ranges enabled: >> >> > > Jun 26 09:57:42.341110 (XEN) 0 base 000000000000 mask ffff80000000 >> > write-back >> >> > > Jun 26 09:57:42.349088 (XEN) 1 base 000080000000 mask ffffc0000000 >> > write-back >> >> > > Jun 26 09:57:42.349124 (XEN) 2 disabled >> >> > > Jun 26 09:57:42.357068 (XEN) 3 disabled >> >> > > Jun 26 09:57:42.357098 (XEN) 4 disabled >> >> > > Jun 26 09:57:42.357122 (XEN) 5 disabled >> >> > > Jun 26 09:57:42.357147 (XEN) 6 disabled >> >> > > Jun 26 09:57:42.365063 (XEN) 7 disabled >> >> > >> >> > This alone would mean UC for all memory above 4G. 
But I seem to >> >> > recall AMD having some mechanism to avoid using MTRRs for this >> >> > case. Let me try to dig this out once back from lunch.
>> >>
>> >> While you do that it seems like I may as well try a run with
>> >> "e820-mtrr-clip" given to Xen.
>> >
>> > According to http://logs.test-lab.xenproject.org/osstest/logs/58914/ it
>> > didn't make any difference to the end result.
>> >
>> > It did seem to cause a huge number of
>> > Jun 26 11:51:29.933067 (XEN) AMD-Vi: IO_PAGE_FAULT: domain = 0, device id =
>> > 0x92, fault address = 0xbdfe7000, flags = 0
>> > messages which weren't there before, not sure if that is a clue or not.
>>
>> I think that's a result of amd_iommu_hwdom_init() now stopping
>> below the reserved ranges right below 3Gb. I.e. these ought to
>> go away if you had the system use minimally more than 4Gb. I
>> also think that you'd see those too without limiting memory if the
>> reserved range was large enough to not share a PDX with the
>> highest RAM page below 4Gb (due to the way mfn_valid() works),
>> or if we indeed only mapped RAM pages there.
>
> I think you are probably speaking hypothetically, but just in case: Do
> you actually want me to do any of that? I'm not sure how easy it would
> be.

No, that was indeed meant only as an explanation, not as something
to test.

Jan

* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-26 10:37 ` Ian Campbell 2015-06-26 10:49 ` Jan Beulich @ 2015-06-26 12:20 ` Jan Beulich 2015-06-26 14:34 ` Ian Campbell 1 sibling, 1 reply; 32+ messages in thread From: Jan Beulich @ 2015-06-26 12:20 UTC (permalink / raw) To: Ian Campbell Cc: Lars Kurth, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, Suravee Suthikulpanit, Anthony Perard, xen-devel, Boris Ostrovsky >>> On 26.06.15 at 12:37, <ian.campbell@citrix.com> wrote: > On Wed, 2015-06-24 at 10:38 +0100, Ian Campbell wrote: >> TL;DR osstest is exposing issues running on "AMD Opteron(tm) Processor >> 6376" in at least a couple of test cases. It would be good if someone >> from AMD could have a look. > > At Andy Cooper's request I ran a quick job with mtrr.show=true > http://logs.test-lab.xenproject.org/osstest/logs/58909/ > > I think the relevant serial output is: > Jun 26 09:57:42.325077 (XEN) MTRR default type: uncachable > Jun 26 09:57:42.325111 (XEN) MTRR fixed ranges enabled: > Jun 26 09:57:42.333068 (XEN) 00000-9ffff write-back > Jun 26 09:57:42.333101 (XEN) a0000-bffff uncachable > Jun 26 09:57:42.333128 (XEN) c0000-fffff write-back > Jun 26 09:57:42.341077 (XEN) MTRR variable ranges enabled: > Jun 26 09:57:42.341110 (XEN) 0 base 000000000000 mask ffff80000000 write-back > Jun 26 09:57:42.349088 (XEN) 1 base 000080000000 mask ffffc0000000 write-back > Jun 26 09:57:42.349124 (XEN) 2 disabled > Jun 26 09:57:42.357068 (XEN) 3 disabled > Jun 26 09:57:42.357098 (XEN) 4 disabled > Jun 26 09:57:42.357122 (XEN) 5 disabled > Jun 26 09:57:42.357147 (XEN) 6 disabled > Jun 26 09:57:42.365063 (XEN) 7 disabled See the patch just sent for how to get the missing piece of information out of the system. Albeit I just realized that this still leaves the possibility of SYSCFG or TOP_MEM2 disagreeing between CPUs. 
Jan

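The MTRR dump quoted above can be decoded by hand, and a small sketch makes the arithmetic explicit. The base/mask pairs and the default type are copied from the serial log; the 48-bit physical address width is an assumption inferred from the 12-hex-digit masks, and the helper names (`mtrr_range`, `memory_type`) are made up for illustration. Fixed-range MTRRs (below 1 MiB) and the overlap precedence rules are deliberately not modelled.

```python
# Assumption: 48 physical address bits, inferred from the 12-hex-digit masks.
PHYS_ADDR_BITS = 48

# (base, mask, type) copied from the "MTRR variable ranges enabled" log lines.
VARIABLE_MTRRS = [
    (0x000000000000, 0xFFFF80000000, "write-back"),
    (0x000080000000, 0xFFFFC0000000, "write-back"),
]
DEFAULT_TYPE = "uncachable"  # "MTRR default type" from the same log


def mtrr_range(base, mask, phys_bits=PHYS_ADDR_BITS):
    """Return (start, end_exclusive) covered by one contiguous variable MTRR."""
    full = (1 << phys_bits) - 1
    inverted = ~mask & full
    # A contiguous mask is all ones followed by all zeros, so its
    # inverse must have the form 2**k - 1.
    if inverted & (inverted + 1):
        raise ValueError(f"non-contiguous mask {mask:#x}")
    return base, base + inverted + 1


def memory_type(addr):
    """Type for addr from the variable ranges, else the default type."""
    for base, mask, mtype in VARIABLE_MTRRS:
        start, end = mtrr_range(base, mask)
        if start <= addr < end:
            return mtype
    return DEFAULT_TYPE


for base, mask, mtype in VARIABLE_MTRRS:
    start, end = mtrr_range(base, mask)
    print(f"{start:#014x}-{end - 1:#014x} {mtype}")
```

Under these assumptions the two enabled ranges cover 0 to 2 GiB and 2 GiB to 3 GiB, so any address at or above 4 GiB falls through to the default type, which matches the "UC for all memory above 4G" reading in the thread.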
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-26 12:20 ` Jan Beulich @ 2015-06-26 14:34 ` Ian Campbell 2015-06-26 14:52 ` Jan Beulich 0 siblings, 1 reply; 32+ messages in thread From: Ian Campbell @ 2015-06-26 14:34 UTC (permalink / raw) To: Jan Beulich Cc: Lars Kurth, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, Suravee Suthikulpanit, Anthony Perard, xen-devel, Boris Ostrovsky On Fri, 2015-06-26 at 13:20 +0100, Jan Beulich wrote: > >>> On 26.06.15 at 12:37, <ian.campbell@citrix.com> wrote: > > On Wed, 2015-06-24 at 10:38 +0100, Ian Campbell wrote: > >> TL;DR osstest is exposing issues running on "AMD Opteron(tm) Processor > >> 6376" in at least a couple of test cases. It would be good if someone > >> from AMD could have a look. > > > > At Andy Cooper's request I ran a quick job with mtrr.show=true > > http://logs.test-lab.xenproject.org/osstest/logs/58909/ > > > > I think the relevant serial output is: > > Jun 26 09:57:42.325077 (XEN) MTRR default type: uncachable > > Jun 26 09:57:42.325111 (XEN) MTRR fixed ranges enabled: > > Jun 26 09:57:42.333068 (XEN) 00000-9ffff write-back > > Jun 26 09:57:42.333101 (XEN) a0000-bffff uncachable > > Jun 26 09:57:42.333128 (XEN) c0000-fffff write-back > > Jun 26 09:57:42.341077 (XEN) MTRR variable ranges enabled: > > Jun 26 09:57:42.341110 (XEN) 0 base 000000000000 mask ffff80000000 write-back > > Jun 26 09:57:42.349088 (XEN) 1 base 000080000000 mask ffffc0000000 write-back > > Jun 26 09:57:42.349124 (XEN) 2 disabled > > Jun 26 09:57:42.357068 (XEN) 3 disabled > > Jun 26 09:57:42.357098 (XEN) 4 disabled > > Jun 26 09:57:42.357122 (XEN) 5 disabled > > Jun 26 09:57:42.357147 (XEN) 6 disabled > > Jun 26 09:57:42.365063 (XEN) 7 disabled > > See the patch just sent for how to get the missing piece of > information out of the system. 
Albeit I just realized that this still > leaves the possibility of SYSCFG or TOP_MEM2 disagreeing > between CPUs.

I did this using rdmsr from mst-tools instead, running on a native
kernel gave:

# for i in $(seq 0 31) ;do rdmsr -p $i MSR_K8_TOP_MEM2; done
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0

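For the concern about SYSCFG or TOP_MEM2 disagreeing between CPUs, the per-CPU reads can also be done by register number and compared programmatically. This is only a sketch: it assumes a native Linux kernel with the `msr` driver loaded and root access, and it uses the register numbers that Linux's `msr-index.h` assigns to `MSR_K8_SYSCFG` and `MSR_K8_TOP_MEM2`. The function names are hypothetical.

```python
import os
import struct
from collections import Counter

# Register numbers as defined in Linux's msr-index.h.
MSR_K8_SYSCFG = 0xC0010010
MSR_K8_TOP_MEM2 = 0xC001001D


def read_msr(cpu, reg):
    """Read one 64-bit MSR via /dev/cpu/<cpu>/msr (needs root and the msr module)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr driver uses the file offset as the register number.
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def disagreeing_cpus(values):
    """Given {cpu: value}, return the entries that differ from the most common value."""
    if not values:
        return {}
    common, _count = Counter(values.values()).most_common(1)[0]
    return {cpu: v for cpu, v in values.items() if v != common}


# Example use on the 32-CPU box from the thread (not run here):
#   top_mem2 = {cpu: read_msr(cpu, MSR_K8_TOP_MEM2) for cpu in range(32)}
#   print(disagreeing_cpus(top_mem2) or "all CPUs agree")
```

Passing the register by number avoids depending on how a given `rdmsr` build parses a symbolic name, and the same loop can be repeated with `MSR_K8_SYSCFG`.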
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions))
2015-06-26 14:34 ` Ian Campbell
@ 2015-06-26 14:52 ` Jan Beulich
2015-06-26 16:23 ` Ian Campbell
2015-06-26 19:36 ` Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees Boris Ostrovsky
0 siblings, 2 replies; 32+ messages in thread
From: Jan Beulich @ 2015-06-26 14:52 UTC (permalink / raw)
To: Aravind Gopalakrishnan, suravee.suthikulpanit, Ian Campbell
Cc: Lars Kurth, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Anthony Perard, xen-devel, Boris Ostrovsky

>>> On 26.06.15 at 16:34, <ian.campbell@citrix.com> wrote:
> I did this using rdmsr from mst-tools instead, running on a native
> kernel gave:
>
> # for i in $(seq 0 31) ;do rdmsr -p $i MSR_K8_TOP_MEM2; done
> 0
>[...]
> 0

Uniformly uncachable for everything above 4Gb then. And I suppose
you already checked that there's no BIOS update available?

I'm not sure if it would be reasonable for us to work around this.
Suravee, Aravind - do you (or colleagues of yours) have any experience
with systems mis-configured like this one?

Otoh I'm then pretty confused by your E820 clipping experiment not
having yielded any better results. I'm starting to suspect two
problems...

Jan

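The reasoning behind "uniformly uncachable for everything above 4Gb" can be sketched as follows. This is a simplified model, not Xen's or AMD's actual logic: it assumes the SYSCFG bit positions used by Linux (`MtrrTom2En` at bit 21, `Tom2ForceMemTypeWB` at bit 22) and that, with both set, addresses from 4 GiB up to TOP_MEM2 default to write-back. Variable MTRRs that explicitly cover an address are not modelled, and the function name is hypothetical.

```python
FOUR_GIB = 1 << 32

# Assumed SYSCFG bit positions (as used in Linux's MTRR code).
SYSCFG_MTRR_TOM2_EN = 1 << 21    # MtrrTom2En
SYSCFG_TOM2_FORCE_WB = 1 << 22   # Tom2ForceMemTypeWB


def default_type_above_4g(addr, top_mem2, syscfg, mtrr_default="uncachable"):
    """Default memory type for an address at or above 4 GiB (simplified model)."""
    if addr < FOUR_GIB:
        raise ValueError("this sketch only covers addresses at or above 4 GiB")
    forced_wb = bool(syscfg & SYSCFG_MTRR_TOM2_EN) and bool(syscfg & SYSCFG_TOM2_FORCE_WB)
    if forced_wb and addr < top_mem2:
        return "write-back"
    # Otherwise nothing overrides the MTRR default type.
    return mtrr_default


both = SYSCFG_MTRR_TOM2_EN | SYSCFG_TOM2_FORCE_WB
# TOP_MEM2 reads as 0 in the thread, so no address can be below it:
print(default_type_above_4g(0x100000000, top_mem2=0, syscfg=both))
```

In this model a TOP_MEM2 of 0 leaves the whole range above 4 GiB at the MTRR default type regardless of the SYSCFG bits, which is why the SYSCFG value alone would not rescue the configuration shown in the log.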
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) 2015-06-26 14:52 ` Jan Beulich @ 2015-06-26 16:23 ` Ian Campbell 2015-06-26 19:36 ` Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees Boris Ostrovsky 1 sibling, 0 replies; 32+ messages in thread From: Ian Campbell @ 2015-06-26 16:23 UTC (permalink / raw) To: Jan Beulich Cc: Lars Kurth, StefanoStabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, suravee.suthikulpanit, Anthony Perard, xen-devel, Boris Ostrovsky On Fri, 2015-06-26 at 15:52 +0100, Jan Beulich wrote: > >>> On 26.06.15 at 16:34, <ian.campbell@citrix.com> wrote: > > I did this using rdmsr from mst-tools instead, running on a native > > kernel gave: > > > > # for i in $(seq 0 31) ;do rdmsr -p $i MSR_K8_TOP_MEM2; done > > 0 > >[...] > > 0 > > Uniformly uncachable for everything above 4Gb then. And I suppose > you already checked that there's no BIOS update available? I hadn't, but I have now. dmidecode tells me it is a "ProLiant DL385p Gen8" with BIOS version A28. AFAICT from the HP support site A28 is the latest version. Full dmidecode below in case it might be of interest. Ian. # dmidecode 2.11 SMBIOS 2.8 present. # SMBIOS implementations newer than version 2.7 are not # fully supported by this version of dmidecode. 190 structures occupying 6144 bytes. Table at 0xBFB7D000. 
Handle 0x0000, DMI type 0, 24 bytes BIOS Information Vendor: HP Version: A28 Release Date: 02/06/2014 Address: 0xF0000 Runtime Size: 64 kB ROM Size: 8192 kB Characteristics: PCI is supported PNP is supported BIOS is upgradeable BIOS shadowing is allowed ESCD support is available Boot from CD is supported Selectable boot is supported EDD is supported 5.25"/360 kB floppy services are supported (int 13h) 5.25"/1.2 MB floppy services are supported (int 13h) 3.5"/720 kB floppy services are supported (int 13h) Print screen service is supported (int 5h) 8042 keyboard services are supported (int 9h) Serial services are supported (int 14h) Printer services are supported (int 17h) CGA/mono video services are supported (int 10h) ACPI is supported USB legacy is supported BIOS boot specification is supported Function key-initiated network boot is supported Targeted content distribution is supported Firmware Revision: 1.51 Handle 0x0100, DMI type 1, 27 bytes System Information Manufacturer: HP Product Name: ProLiant DL385p Gen8 Version: Not Specified Serial Number: MXQ44207T1 UUID: 37303137-3532-584D-5134-343230375431 Wake-up Type: Power Switch SKU Number: 710725-S01 Family: ProLiant Handle 0x0300, DMI type 3, 21 bytes Chassis Information Manufacturer: HP Type: Rack Mount Chassis Lock: Not Present Version: Not Specified Serial Number: MXQ44207T1 Asset Tag: Boot-up State: Critical Power Supply State: Critical Thermal State: Safe Security Status: Unknown OEM Information: 0x00000000 Height: 2 U Number Of Power Cords: 2 Contained Elements: 0 Handle 0x0400, DMI type 4, 42 bytes Processor Information Socket Designation: Proc 1 Type: Central Processor Family: Opteron Manufacturer: AMD ID: 20 0F 60 00 FF FB 8B 17 Signature: Family 21, Model 2, Stepping 0 Flags: FPU (Floating-point unit on-chip) VME (Virtual mode extension) DE (Debugging extension) PSE (Page size extension) TSC (Time stamp counter) MSR (Model specific registers) PAE (Physical address extension) MCE (Machine check 
exception) CX8 (CMPXCHG8 instruction supported) APIC (On-chip APIC hardware supported) SEP (Fast system call) MTRR (Memory type range registers) PGE (Page global enable) MCA (Machine check architecture) CMOV (Conditional move instruction supported) PAT (Page attribute table) PSE-36 (36-bit page size extension) CLFSH (CLFLUSH instruction supported) MMX (MMX technology supported) FXSR (FXSAVE and FXSTOR instructions supported) SSE (Streaming SIMD extensions) SSE2 (Streaming SIMD extensions 2) HTT (Multi-threading) Version: AMD Opteron(tm) Processor 6376 Voltage: 1.4 V External Clock: 200 MHz Max Speed: 3500 MHz Current Speed: 2300 MHz Status: Populated, Enabled Upgrade: Socket G34 L1 Cache Handle: 0x0710 L2 Cache Handle: 0x0720 L3 Cache Handle: 0x0730 Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Core Count: 16 Core Enabled: 16 Thread Count: 16 Characteristics: 64-bit capable Multi-Core Execute Protection Enhanced Virtualization Power/Performance Control Handle 0x0401, DMI type 4, 42 bytes Processor Information Socket Designation: Proc 2 Type: Central Processor Family: Opteron Manufacturer: AMD ID: 20 0F 60 00 FF FB 8B 17 Signature: Family 21, Model 2, Stepping 0 Flags: FPU (Floating-point unit on-chip) VME (Virtual mode extension) DE (Debugging extension) PSE (Page size extension) TSC (Time stamp counter) MSR (Model specific registers) PAE (Physical address extension) MCE (Machine check exception) CX8 (CMPXCHG8 instruction supported) APIC (On-chip APIC hardware supported) SEP (Fast system call) MTRR (Memory type range registers) PGE (Page global enable) MCA (Machine check architecture) CMOV (Conditional move instruction supported) PAT (Page attribute table) PSE-36 (36-bit page size extension) CLFSH (CLFLUSH instruction supported) MMX (MMX technology supported) FXSR (FXSAVE and FXSTOR instructions supported) SSE (Streaming SIMD extensions) SSE2 (Streaming SIMD extensions 2) HTT (Multi-threading) Version: AMD Opteron(tm) Processor 
6376 Voltage: 1.4 V External Clock: 200 MHz Max Speed: 3500 MHz Current Speed: 2300 MHz Status: Populated, Idle Upgrade: Socket G34 L1 Cache Handle: 0x0711 L2 Cache Handle: 0x0721 L3 Cache Handle: 0x0731 Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Core Count: 16 Core Enabled: 16 Thread Count: 16 Characteristics: 64-bit capable Multi-Core Execute Protection Enhanced Virtualization Power/Performance Control Handle 0x0710, DMI type 7, 19 bytes Cache Information Socket Designation: Processor 1 Internal L1 Cache Configuration: Enabled, Not Socketed, Level 1 Operational Mode: Write Back Location: Internal Installed Size: 768 kB Maximum Size: 768 kB Supported SRAM Types: Burst Installed SRAM Type: Burst Speed: Unknown Error Correction Type: Unknown System Type: Unknown Associativity: 2-way Set-associative Handle 0x0711, DMI type 7, 19 bytes Cache Information Socket Designation: Processor 2 Internal L1 Cache Configuration: Enabled, Not Socketed, Level 1 Operational Mode: Write Back Location: Internal Installed Size: 768 kB Maximum Size: 768 kB Supported SRAM Types: Burst Installed SRAM Type: Burst Speed: Unknown Error Correction Type: Unknown System Type: Unknown Associativity: 2-way Set-associative Handle 0x0720, DMI type 7, 19 bytes Cache Information Socket Designation: Processor 1 Internal L2 Cache Configuration: Enabled, Not Socketed, Level 2 Operational Mode: Varies With Memory Address Location: Internal Installed Size: 16384 kB Maximum Size: 16384 kB Supported SRAM Types: Burst Installed SRAM Type: Burst Speed: Unknown Error Correction Type: Unknown System Type: Unknown Associativity: 16-way Set-associative Handle 0x0721, DMI type 7, 19 bytes Cache Information Socket Designation: Processor 2 Internal L2 Cache Configuration: Enabled, Not Socketed, Level 2 Operational Mode: Varies With Memory Address Location: Internal Installed Size: 16384 kB Maximum Size: 16384 kB Supported SRAM Types: Burst Installed SRAM Type: Burst Speed: 
Unknown Error Correction Type: Unknown System Type: Unknown Associativity: 16-way Set-associative Handle 0x0730, DMI type 7, 19 bytes Cache Information Socket Designation: Processor 1 Internal L3 Cache Configuration: Enabled, Not Socketed, Level 3 Operational Mode: Varies With Memory Address Location: Internal Installed Size: 16384 kB Maximum Size: 16384 kB Supported SRAM Types: Burst Installed SRAM Type: Burst Speed: Unknown Error Correction Type: Unknown System Type: Unknown Associativity: 20-way Set-associative Handle 0x0731, DMI type 7, 19 bytes Cache Information Socket Designation: Processor 2 Internal L3 Cache Configuration: Enabled, Not Socketed, Level 3 Operational Mode: Varies With Memory Address Location: Internal Installed Size: 16384 kB Maximum Size: 16384 kB Supported SRAM Types: Burst Installed SRAM Type: Burst Speed: Unknown Error Correction Type: Unknown System Type: Unknown Associativity: 20-way Set-associative Handle 0x0801, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J48 Internal Connector Type: Access Bus (USB) External Reference Designator: USB Port 1 External Connector Type: Access Bus (USB) Port Type: USB Handle 0x0802, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J47 Internal Connector Type: Access Bus (USB) External Reference Designator: USB Port 2 External Connector Type: Access Bus (USB) Port Type: USB Handle 0x0803, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J73 Internal Connector Type: Access Bus (USB) External Reference Designator: USB Port 3 External Connector Type: Access Bus (USB) Port Type: USB Handle 0x0804, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J71 Internal Connector Type: Access Bus (USB) External Reference Designator: USB Port 4 External Connector Type: Access Bus (USB) Port Type: USB Handle 0x0805, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J70 Internal 
Connector Type: Access Bus (USB) External Reference Designator: USB Port 5 External Connector Type: Access Bus (USB) Port Type: USB Handle 0x0806, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J50 Internal Connector Type: Access Bus (USB) External Reference Designator: USB Port 6 External Connector Type: Access Bus (USB) Port Type: USB Handle 0x0807, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J45 Internal Connector Type: Access Bus (USB) External Reference Designator: USB Port 7 External Connector Type: Access Bus (USB) Port Type: USB Handle 0x0808, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J72 Internal Connector Type: Access Bus (USB) External Reference Designator: USB Port 8 External Connector Type: Access Bus (USB) Port Type: USB Handle 0x0809, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J36 Internal Connector Type: None External Reference Designator: Video Port External Connector Type: DB-15 female Port Type: Video Port Handle 0x080A, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J87 Internal Connector Type: None External Reference Designator: COM Port External Connector Type: DB-9 male Port Type: Serial Port 16550A Compatible Handle 0x080B, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J43 Internal Connector Type: None External Reference Designator: NIC port 1 External Connector Type: RJ-45 Port Type: Network Port Handle 0x080C, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J43 Internal Connector Type: None External Reference Designator: NIC port 2 External Connector Type: RJ-45 Port Type: Network Port Handle 0x080D, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J43 Internal Connector Type: None External Reference Designator: NIC port 3 External Connector Type: RJ-45 Port Type: Network Port Handle 
0x080E, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J43 Internal Connector Type: None External Reference Designator: NIC port 4 External Connector Type: RJ-45 Port Type: Network Port Handle 0x080F, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J39 Internal Connector Type: None External Reference Designator: ILO NIC port External Connector Type: RJ-45 Port Type: Network Port Handle 0x0810, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J42 Internal Connector Type: None External Reference Designator: Video Port External Connector Type: DB-15 female Port Type: Video Port Handle 0x0901, DMI type 9, 17 bytes System Slot Information Designation: PCI-E Slot 1 Type: x8 PCI Express 2 x16 Current Usage: Available Length: Long ID: 1 Characteristics: 3.3 V is provided PME signal is supported Bus Address: 0000:05:00.0 Handle 0x0902, DMI type 9, 17 bytes System Slot Information Designation: PCI-E Slot 2 Type: x8 PCI Express 2 Current Usage: Available Length: Short ID: 2 Characteristics: 3.3 V is provided PME signal is supported Bus Address: 0000:08:00.0 Handle 0x0903, DMI type 9, 17 bytes System Slot Information Designation: PCI-E Slot 3 Type: x4 PCI Express 2 x8 Current Usage: Available Length: Short ID: 3 Characteristics: 3.3 V is provided PME signal is supported Bus Address: 0000:0b:00.0 Handle 0x0904, DMI type 9, 17 bytes System Slot Information Designation: PCI-E Slot 4 Type: x16 PCI Express 2 Current Usage: Available Length: Long ID: 4 Characteristics: 3.3 V is provided PME signal is supported Bus Address: 0000:41:00.0 Handle 0x0905, DMI type 9, 17 bytes System Slot Information Designation: PCI-E Slot 5 Type: x8 PCI Express 2 Current Usage: Available Length: Short ID: 5 Characteristics: 3.3 V is provided PME signal is supported Bus Address: 0000:42:00.0 Handle 0x0906, DMI type 9, 17 bytes System Slot Information Designation: PCI-E Slot 6 Type: x8 PCI Express 2 Current Usage: 
Available Length: Short ID: 6 Characteristics: 3.3 V is provided PME signal is supported Bus Address: 0000:43:00.0 Handle 0x0B00, DMI type 11, 5 bytes OEM Strings String 1: PSF: String 2: Product ID: 710725-S01 Handle 0x1000, DMI type 16, 23 bytes Physical Memory Array Location: System Board Or Motherboard Use: System Memory Error Correction Type: Single-bit ECC Maximum Capacity: 768 GB Error Information Handle: Not Provided Number Of Devices: 24 Handle 0x1100, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: 8192 MB Form Factor: DIMM Set: 1 Locator: Proc 1 DIMM 1A Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Registered (Buffered) Speed: 1333 MHz Manufacturer: HP Serial Number: Not Specified Asset Tag: Not Specified Part Number: 647650-171 Rank: 2 Configured Clock Speed: 1333 MHz Handle 0x1101, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 2 Locator: Proc 1 DIMM 2I Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1102, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 3 Locator: Proc 1 DIMM 3E Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1103, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form 
Factor: DIMM Set: 4 Locator: Proc 1 DIMM 4C Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1104, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 5 Locator: Proc 1 DIMM 5K Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1105, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 6 Locator: Proc 1 DIMM 6G Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1106, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 7 Locator: Proc 1 DIMM 7B Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1107, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 8 Locator: Proc 1 DIMM 8J Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset 
Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1108, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 9 Locator: Proc 1 DIMM 9F Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1109, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 10 Locator: Proc 1 DIMM 10D Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x110A, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 11 Locator: Proc 1 DIMM 11L Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x110B, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 12 Locator: Proc 1 DIMM 12H Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x110C, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error 
Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: 8192 MB Form Factor: DIMM Set: 13 Locator: Proc 2 DIMM 1A Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Registered (Buffered) Speed: 1333 MHz Manufacturer: HP Serial Number: Not Specified Asset Tag: Not Specified Part Number: 647650-171 Rank: 2 Configured Clock Speed: 1333 MHz Handle 0x110D, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 14 Locator: Proc 2 DIMM 2I Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x110E, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 15 Locator: Proc 2 DIMM 3E Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x110F, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 16 Locator: Proc 2 DIMM 4C Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1110, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 17 Locator: Proc 2 DIMM 5K Bank Locator: Not Specified 
Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1111, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 18 Locator: Proc 2 DIMM 6G Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1112, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 19 Locator: Proc 2 DIMM 7B Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1113, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 20 Locator: Proc 2 DIMM 8J Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1114, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 21 Locator: Proc 2 DIMM 9F Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown 
Configured Clock Speed: Unknown Handle 0x1115, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 22 Locator: Proc 2 DIMM 10D Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1116, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 23 Locator: Proc 2 DIMM 11L Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1117, DMI type 17, 40 bytes Memory Device Array Handle: 0x1000 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: No Module Installed Form Factor: DIMM Set: 24 Locator: Proc 2 DIMM 12H Bank Locator: Not Specified Type: DDR3 Type Detail: Synchronous Speed: Unknown Manufacturer: UNKNOWN Serial Number: Not Specified Asset Tag: Not Specified Part Number: NOT AVAILABLE Rank: Unknown Configured Clock Speed: Unknown Handle 0x1300, DMI type 19, 31 bytes Memory Array Mapped Address Starting Address: 0x00000000000 Ending Address: 0x000BFFFFFFF Range Size: 3 GB Physical Array Handle: 0x1000 Partition Width: 12 Handle 0x1301, DMI type 19, 31 bytes Memory Array Mapped Address Starting Address: 0x00100000000 Ending Address: 0x0043EFFFFFF Range Size: 13296 MB Physical Array Handle: 0x1000 Partition Width: 12 Handle 0x2000, DMI type 32, 11 bytes System Boot Information Status: Firmware-detected hardware failure Handle 0x2600, DMI type 38, 18 bytes IPMI Device Information Interface Type: KCS (Keyboard Control 
Style) Specification Version: 2.0 I2C Slave Address: 0x10 NV Storage Device: Not Present Base Address: 0x0000000000000CA2 (I/O) Register Spacing: Successive Byte Boundaries Handle 0x2700, DMI type 39, 22 bytes System Power Supply Power Unit Group: 1 Location: Not Specified Name: Power Supply 1 Manufacturer: HP Serial Number: 5BXRF0DLL6Y771 Asset Tag: Not Specified Model Part Number: 656363-B21 Revision: Not Specified Max Power Capacity: 750 W Status: Present, Unknown Type: Unknown Input Voltage Range Switching: Unknown Plugged: Yes Hot Replaceable: Yes Handle 0x2701, DMI type 39, 22 bytes System Power Supply Power Unit Group: 1 Location: Not Specified Name: Power Supply 2 Manufacturer: HP Serial Number: 5BXRF0DLL6Y24S Asset Tag: Not Specified Model Part Number: 656363-B21 Revision: Not Specified Max Power Capacity: 750 W Status: Present, Unknown Type: Unknown Input Voltage Range Switching: Unknown Plugged: Yes Hot Replaceable: Yes Handle 0x2901, DMI type 41, 11 bytes Onboard Device Reference Designation: NIC Port 1 Type: Ethernet Status: Enabled Type Instance: 1 Bus Address: 0000:04:00.0 Handle 0x2902, DMI type 41, 11 bytes Onboard Device Reference Designation: NIC Port 2 Type: Ethernet Status: Enabled Type Instance: 2 Bus Address: 0000:04:00.1 Handle 0x2903, DMI type 41, 11 bytes Onboard Device Reference Designation: NIC Port 3 Type: Ethernet Status: Enabled Type Instance: 3 Bus Address: 0000:04:00.2 Handle 0x2904, DMI type 41, 11 bytes Onboard Device Reference Designation: NIC Port 4 Type: Ethernet Status: Enabled Type Instance: 4 Bus Address: 0000:04:00.3 Handle 0x2945, DMI type 41, 11 bytes Onboard Device Reference Designation: Storage Controller Type: SAS Controller Status: Enabled Type Instance: 1 Bus Address: 0000:03:00.0 Handle 0xC101, DMI type 193, 9 bytes OEM-specific Type Header and Data: C1 09 01 C1 01 01 02 03 04 Strings: 02/06/2014 05/05/2012 Handle 0xC200, DMI type 194, 5 bytes OEM-specific Type Header and Data: C2 05 00 C2 11 Handle 0xC300, DMI type 
195, 7 bytes OEM-specific Type Header and Data: C3 07 00 C3 01 B4 00 Strings: $0E1107BE Handle 0xC400, DMI type 196, 13 bytes OEM-specific Type Header and Data: C4 0D 00 C4 00 00 00 00 00 00 01 02 00 Handle 0xC500, DMI type 197, 12 bytes OEM-specific Type Header and Data: C5 0C 00 C5 00 04 20 01 FF 01 73 00 Handle 0xC501, DMI type 197, 12 bytes OEM-specific Type Header and Data: C5 0C 01 C5 01 04 40 00 FF 02 73 00 Handle 0xC600, DMI type 198, 11 bytes OEM-specific Type Header and Data: C6 0B 00 C6 01 00 00 00 00 00 01 Handle 0xC900, DMI type 201, 11 bytes OEM-specific Type Header and Data: C9 0B 00 C9 FA 01 00 00 40 0B 01 Handle 0xCA00, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 00 CA 00 11 FF 01 01 Handle 0xCA01, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 01 CA 01 11 FF 02 01 Handle 0xCA02, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 02 CA 02 11 FF 03 01 Handle 0xCA03, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 03 CA 03 11 FF 04 01 Handle 0xCA04, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 04 CA 04 11 FF 05 01 Handle 0xCA05, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 05 CA 05 11 FF 06 01 Handle 0xCA06, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 06 CA 06 11 FF 07 01 Handle 0xCA07, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 07 CA 07 11 FF 08 01 Handle 0xCA08, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 08 CA 08 11 FF 09 01 Handle 0xCA09, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 09 CA 09 11 FF 0A 01 Handle 0xCA0A, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 0A CA 0A 11 FF 0B 01 Handle 0xCA0B, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 0B CA 0B 11 FF 0C 01 Handle 0xCA0C, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 0C CA 0C 11 FF 01 02 Handle 0xCA0D, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 0D 
CA 0D 11 FF 02 02 Handle 0xCA0E, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 0E CA 0E 11 FF 03 02 Handle 0xCA0F, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 0F CA 0F 11 FF 04 02 Handle 0xCA10, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 10 CA 10 11 FF 05 02 Handle 0xCA11, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 11 CA 11 11 FF 06 02 Handle 0xCA12, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 12 CA 12 11 FF 07 02 Handle 0xCA13, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 13 CA 13 11 FF 08 02 Handle 0xCA14, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 14 CA 14 11 FF 09 02 Handle 0xCA15, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 15 CA 15 11 FF 0A 02 Handle 0xCA16, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 16 CA 16 11 FF 0B 02 Handle 0xCA17, DMI type 202, 9 bytes OEM-specific Type Header and Data: CA 09 17 CA 17 11 FF 0C 02 Handle 0xD100, DMI type 209, 36 bytes HP BIOS NIC PCI and MAC Information NIC 1: PCI device 04:00.0, MAC address 40:A8:F0:1F:AE:7C NIC 2: PCI device 04:00.1, MAC address 40:A8:F0:1F:AE:7D NIC 3: PCI device 04:00.2, MAC address 40:A8:F0:1F:AE:7E NIC 4: PCI device 04:00.3, MAC address 40:A8:F0:1F:AE:7F Handle 0xD700, DMI type 215, 6 bytes OEM-specific Type Header and Data: D7 06 00 D7 00 05 Handle 0xD800, DMI type 216, 23 bytes OEM-specific Type Header and Data: D8 17 00 D8 01 00 01 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Strings: System ROM 02/06/2014 Handle 0xD801, DMI type 216, 23 bytes OEM-specific Type Header and Data: D8 17 01 D8 02 00 01 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Strings: Redundant System ROM 02/06/2014 Handle 0xD802, DMI type 216, 23 bytes OEM-specific Type Header and Data: D8 17 02 D8 03 00 01 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Strings: System ROM Bootblock 05/05/2012 Handle 0xD803, DMI type 216, 23 bytes OEM-specific Type 
Header and Data: D8 17 03 D8 04 00 01 02 02 33 00 00 00 00 00 00 00 00 00 00 00 0C 00 Strings: Power Management Controller Firmware 3.3 Handle 0xD804, DMI type 216, 23 bytes OEM-specific Type Header and Data: D8 17 04 D8 05 00 01 02 02 27 00 00 00 00 00 00 00 00 00 00 00 0C 00 Strings: Power Management Controller Firmware Bootloader 2.7 Handle 0xD805, DMI type 216, 23 bytes OEM-specific Type Header and Data: D8 17 05 D8 08 00 01 00 01 23 23 00 00 00 00 00 00 00 00 00 00 00 00 Strings: System Programmable Logic Device Handle 0xD806, DMI type 216, 23 bytes OEM-specific Type Header and Data: D8 17 06 D8 08 00 01 00 01 0C 0C 00 00 00 00 00 00 00 00 00 00 00 00 Strings: SAS Programmable Logic Device Handle 0xDB00, DMI type 219, 32 bytes OEM-specific Type Header and Data: DB 20 00 DB DF 0B 00 00 0F 00 00 00 00 00 00 00 07 08 00 00 00 00 00 00 01 00 00 00 00 00 00 00 Handle 0xDF00, DMI type 223, 7 bytes OEM-specific Type Header and Data: DF 07 00 DF 66 46 70 Handle 0xE000, DMI type 224, 5 bytes OEM-specific Type Header and Data: E0 05 00 E0 00 Handle 0xE200, DMI type 226, 21 bytes OEM-specific Type Header and Data: E2 15 00 E2 37 31 30 37 32 35 4D 58 51 34 34 32 30 37 54 31 01 Strings: MXQ44207T1 Handle 0xE300, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 00 E3 00 04 00 11 08 A0 01 00 Handle 0xE301, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 01 E3 00 04 01 11 08 A2 01 00 Handle 0xE302, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 02 E3 00 04 02 11 08 A4 01 00 Handle 0xE303, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 03 E3 00 04 03 11 08 A6 01 01 Handle 0xE304, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 04 E3 00 04 04 11 08 A8 01 01 Handle 0xE305, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 05 E3 00 04 05 11 08 AA 01 01 Handle 0xE306, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 06 E3 00 04 06 11 09 A0 01 02 Handle 0xE307, DMI type 
227, 12 bytes OEM-specific Type Header and Data: E3 0C 07 E3 00 04 07 11 09 A2 01 02 Handle 0xE308, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 08 E3 00 04 08 11 09 A4 01 02 Handle 0xE309, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 09 E3 00 04 09 11 09 A6 01 03 Handle 0xE30A, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 0A E3 00 04 0A 11 09 A8 01 03 Handle 0xE30B, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 0B E3 00 04 0B 11 09 AA 01 03 Handle 0xE30C, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 0C E3 01 04 0C 11 0A A0 01 04 Handle 0xE30D, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 0D E3 01 04 0D 11 0A A2 01 04 Handle 0xE30E, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 0E E3 01 04 0E 11 0A A4 01 04 Handle 0xE30F, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 0F E3 01 04 0F 11 0A A6 01 05 Handle 0xE310, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 10 E3 01 04 10 11 0A A8 01 05 Handle 0xE311, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 11 E3 01 04 11 11 0A AA 01 05 Handle 0xE312, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 12 E3 01 04 12 11 0B A0 01 06 Handle 0xE313, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 13 E3 01 04 13 11 0B A2 01 06 Handle 0xE314, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 14 E3 01 04 14 11 0B A4 01 06 Handle 0xE315, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 15 E3 01 04 15 11 0B A6 01 07 Handle 0xE316, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 16 E3 01 04 16 11 0B A8 01 07 Handle 0xE317, DMI type 227, 12 bytes OEM-specific Type Header and Data: E3 0C 17 E3 01 04 17 11 0B AA 01 07 Handle 0xE400, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 00 E4 00 00 00 00 00 00 00 FF 00 00 Handle 0xE401, DMI type 228, 14 bytes OEM-specific Type 
Header and Data: E4 0E 01 E4 01 00 00 00 00 00 00 FF 00 00 Handle 0xE402, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 02 E4 02 00 00 00 00 00 00 FF 00 00 Handle 0xE403, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 03 E4 03 00 00 00 00 00 00 FF 00 00 Handle 0xE404, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 04 E4 04 00 00 00 00 00 00 FF 00 00 Handle 0xE405, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 05 E4 05 00 00 00 00 00 00 FF 00 00 Handle 0xE406, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 06 E4 06 00 00 00 00 00 00 FF 01 00 Handle 0xE407, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 07 E4 07 00 00 00 00 00 00 FF 00 00 Handle 0xE408, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 08 E4 08 04 E0 00 F8 04 00 00 00 00 Handle 0xE409, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 09 E4 09 04 E0 00 F8 05 00 00 00 00 Handle 0xE40A, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 0A E4 0A 04 E0 00 F8 06 00 00 00 00 Handle 0xE40B, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 0B E4 0B 04 E0 00 F8 07 00 00 00 00 Handle 0xE40C, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 0C E4 0C 01 3E FF 08 00 00 07 00 00 Handle 0xE40D, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 0D E4 0D 01 3E FF 09 00 00 07 00 00 Handle 0xE40E, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 0E E4 0E 01 3E FF 10 00 00 07 00 00 Handle 0xE40F, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 0F E4 0F 01 3E FF 11 00 00 07 00 00 Handle 0xE410, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 10 E4 10 01 3E FF 12 00 00 07 00 00 Handle 0xE411, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 11 E4 11 01 3E FF 13 00 00 07 00 00 Handle 0xE412, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 12 E4 12 01 3E FF 20 
00 00 07 00 00 Handle 0xE413, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 13 E4 13 01 3E FF 21 00 00 07 00 00 Handle 0xE414, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 14 E4 14 01 3E FF 22 00 00 07 00 00 Handle 0xE415, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 15 E4 15 01 3E FF 23 00 00 07 00 00 Handle 0xE416, DMI type 228, 14 bytes OEM-specific Type Header and Data: E4 0E 16 E4 16 01 3E FF 0A 00 00 07 04 00 Handle 0xE500, DMI type 229, 100 bytes OEM-specific Type Header and Data: E5 64 00 E5 24 48 44 44 00 E0 FF BD 00 00 00 00 00 20 00 00 24 4F 43 53 00 80 FF BD 00 00 00 00 00 40 00 00 24 43 52 50 00 E0 F7 BD 00 00 00 00 00 00 02 00 24 44 46 43 00 C0 FF BD 00 00 00 00 00 04 00 00 24 4F 43 42 00 80 FE BD 00 00 00 00 00 00 01 00 24 53 41 45 00 D0 FF BD 00 00 00 00 00 10 00 00 Handle 0xE600, DMI type 230, 11 bytes OEM-specific Type Header and Data: E6 0B 00 E6 00 27 01 02 02 03 A0 Strings: LTEON 13 Handle 0xE601, DMI type 230, 11 bytes OEM-specific Type Header and Data: E6 0B 01 E6 01 27 01 02 02 03 A2 Strings: LTEON 13 Handle 0xE800, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 00 E8 00 11 05 00 00 00 46 05 DC 05 Handle 0xE801, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 01 E8 01 11 00 00 00 00 00 00 00 00 Handle 0xE802, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 02 E8 02 11 00 00 00 00 00 00 00 00 Handle 0xE803, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 03 E8 03 11 00 00 00 00 00 00 00 00 Handle 0xE804, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 04 E8 04 11 00 00 00 00 00 00 00 00 Handle 0xE805, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 05 E8 05 11 00 00 00 00 00 00 00 00 Handle 0xE806, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 06 E8 06 11 00 00 00 00 00 00 00 00 Handle 0xE807, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 07 E8 07 11 00 00 
00 00 00 00 00 00 Handle 0xE808, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 08 E8 08 11 00 00 00 00 00 00 00 00 Handle 0xE809, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 09 E8 09 11 00 00 00 00 00 00 00 00 Handle 0xE80A, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 0A E8 0A 11 00 00 00 00 00 00 00 00 Handle 0xE80B, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 0B E8 0B 11 00 00 00 00 00 00 00 00 Handle 0xE80C, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 0C E8 0C 11 05 00 00 00 46 05 DC 05 Handle 0xE80D, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 0D E8 0D 11 00 00 00 00 00 00 00 00 Handle 0xE80E, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 0E E8 0E 11 00 00 00 00 00 00 00 00 Handle 0xE80F, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 0F E8 0F 11 00 00 00 00 00 00 00 00 Handle 0xE810, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 10 E8 10 11 00 00 00 00 00 00 00 00 Handle 0xE811, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 11 E8 11 11 00 00 00 00 00 00 00 00 Handle 0xE812, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 12 E8 12 11 00 00 00 00 00 00 00 00 Handle 0xE813, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 13 E8 13 11 00 00 00 00 00 00 00 00 Handle 0xE814, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 14 E8 14 11 00 00 00 00 00 00 00 00 Handle 0xE815, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 15 E8 15 11 00 00 00 00 00 00 00 00 Handle 0xE816, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 16 E8 16 11 00 00 00 00 00 00 00 00 Handle 0xE817, DMI type 232, 14 bytes OEM-specific Type Header and Data: E8 0E 17 E8 17 11 00 00 00 00 00 00 00 00 Handle 0x7F00, DMI type 127, 4 bytes End Of Table ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees
2015-06-26 14:52 ` Jan Beulich
2015-06-26 16:23 ` Ian Campbell
@ 2015-06-26 19:36 ` Boris Ostrovsky
2015-06-26 20:07 ` Ian Campbell
1 sibling, 1 reply; 32+ messages in thread
From: Boris Ostrovsky @ 2015-06-26 19:36 UTC (permalink / raw)
To: Jan Beulich, Aravind Gopalakrishnan, suravee.suthikulpanit, Ian Campbell
Cc: Lars Kurth, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Anthony Perard, xen-devel

On 06/26/2015 10:52 AM, Jan Beulich wrote:
>>>> On 26.06.15 at 16:34, <ian.campbell@citrix.com> wrote:
>> I did this using rdmsr from mst-tools instead, running on a native
>> kernel gave:
>>
>> # for i in $(seq 0 31) ;do rdmsr -p $i MSR_K8_TOP_MEM2; done
>> 0
>> [...]
>> 0

Is MSR_K8_TOP_MEM2 defined somewhere in the shell?

Just to make sure, could you use explicit address, i.e.

for i in $(seq 0 31) ;do rdmsr -p $i 0xc001001d; done

(and if they are still all zeroes, can you read 0xc0010010 (SYSCFG) as
well?)

-boris

> Uniformly uncachable for everything above 4Gb then. And I suppose
> you already checked that there's no BIOS update available?
>
> I'm not sure if it would be reasonable for us to work around this.
> Suravee, Aravind - do you (or colleagues of yours) have any
> experience with systems mis-configured like this one?
>
> Otoh I'm then pretty confused by your E820 clipping experiment not
> having yielded any better results. I'm starting to suspect two
> problems...
>
> Jan
>
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees
2015-06-26 19:36 ` Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees Boris Ostrovsky
@ 2015-06-26 20:07 ` Ian Campbell
2015-06-29 10:23 ` Ian Campbell
0 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2015-06-26 20:07 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: Lars Kurth, Jan Beulich, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, suravee.suthikulpanit, Anthony Perard, xen-devel

On Fri, 2015-06-26 at 15:36 -0400, Boris Ostrovsky wrote:
> On 06/26/2015 10:52 AM, Jan Beulich wrote:
> >>>> On 26.06.15 at 16:34, <ian.campbell@citrix.com> wrote:
> >> I did this using rdmsr from mst-tools instead, running on a native
> >> kernel gave:
> >>
> >> # for i in $(seq 0 31) ;do rdmsr -p $i MSR_K8_TOP_MEM2; done
> >> 0
> >> [...]
> >> 0
>
> Is MSR_K8_TOP_MEM2 defined somewhere in the shell?

There is no $ there, so it wouldn't make any difference...

I had foolishly assumed that rdmsr would either know the names of the
MSRs or it would complain about a string it didn't understand which
wasn't a number.

Instead it just reads some random register which happens to be
strtoul("MSR_K8_TOP_MEM2"), how helpful.

> Just to make sure, could you use explicit address, i.e.
>
> for i in $(seq 0 31) ;do rdmsr -p $i 0xc001001d; done
>
> (and if they are still all zeroes, can you read 0xc0010010 (SYSCFG) as
> well?)

I'll try this next week.

Ian.
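The failure mode described in this message, a symbolic name being silently parsed as a number, can be caught before rdmsr is ever invoked. What follows is only a minimal sketch under stated assumptions: is_msr_address and checked_rdmsr are hypothetical helper names, not part of msr-tools. The check accepts plain decimal or 0x-prefixed hexadecimal strings only, so a name such as MSR_K8_TOP_MEM2 is rejected instead of being converted to 0, which is what C's strtoul yields when it finds no digits to parse.

```shell
# Hypothetical guard (not part of msr-tools): succeed only if the
# argument is a decimal or 0x-prefixed hexadecimal number.
is_msr_address() {
    case "$1" in
        0[xX])                return 1 ;;  # prefix with no digits
        0[xX]*[!0-9a-fA-F]*)  return 1 ;;  # non-hex character after 0x
        0[xX]*)               return 0 ;;  # valid hexadecimal
        ''|*[!0-9]*)          return 1 ;;  # empty, or not purely decimal
        *)                    return 0 ;;  # valid decimal
    esac
}

# Hypothetical wrapper: refuse to call rdmsr with a symbolic name.
checked_rdmsr() {
    cpu=$1
    reg=$2
    if ! is_msr_address "$reg"; then
        echo "refusing non-numeric MSR address: $reg" >&2
        return 2
    fi
    rdmsr -p "$cpu" "$reg"
}
```

With this in place, the loop from the thread could call checked_rdmsr "$i" 0xc001001d, while checked_rdmsr "$i" MSR_K8_TOP_MEM2 would fail loudly rather than print a misleading 0.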
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees
2015-06-26 20:07 ` Ian Campbell
@ 2015-06-29 10:23 ` Ian Campbell
2015-06-29 13:13 ` Boris Ostrovsky
0 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2015-06-29 10:23 UTC (permalink / raw)
To: Boris Ostrovsky
Cc: Lars Kurth, suravee.suthikulpanit, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, Jan Beulich, Anthony Perard, xen-devel

On Fri, 2015-06-26 at 21:07 +0100, Ian Campbell wrote:
> On Fri, 2015-06-26 at 15:36 -0400, Boris Ostrovsky wrote:
> > On 06/26/2015 10:52 AM, Jan Beulich wrote:
> > >>>> On 26.06.15 at 16:34, <ian.campbell@citrix.com> wrote:
> > >> I did this using rdmsr from mst-tools instead, running on a native
> > >> kernel gave:
> > >>
> > >> # for i in $(seq 0 31) ;do rdmsr -p $i MSR_K8_TOP_MEM2; done
> > >> 0
> > >> [...]
> > >> 0
> >
> > Is MSR_K8_TOP_MEM2 defined somewhere in the shell?
>
> There is no $ there, so it wouldn't make any difference...
>
> I had foolishly assumed that rdmsr would either know the names of the
> MSRs or it would complain about a string it didn't understand which
> wasn't a number.
>
> Instead it just reads some random register which happens to be
> strtoul("MSR_K8_TOP_MEM2"), how helpful.

=> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=790075

> > Just to make sure, could you use explicit address, i.e.
> >
> > for i in $(seq 0 31) ;do rdmsr -p $i 0xc001001d; done

It reported 43f000000 on all processors on native (and only the first 8
on Xen due to limited dom0 vcpus).

> >
> > (and if they are still all zeroes, can you read 0xc0010010 (SYSCFG) as
> > well?)

It wasn't all zeroes, but anyway, it reported 740000 on all processors
on native (I forgot to run under Xen).

Ian.
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees
2015-06-29 10:23 ` Ian Campbell
@ 2015-06-29 13:13 ` Boris Ostrovsky
2015-07-06 9:38 ` Jan Beulich
0 siblings, 1 reply; 32+ messages in thread
From: Boris Ostrovsky @ 2015-06-29 13:13 UTC (permalink / raw)
To: Ian Campbell
Cc: Lars Kurth, suravee.suthikulpanit, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, Jan Beulich, Anthony Perard, xen-devel

On 06/29/2015 06:23 AM, Ian Campbell wrote:
> On Fri, 2015-06-26 at 21:07 +0100, Ian Campbell wrote:
>> On Fri, 2015-06-26 at 15:36 -0400, Boris Ostrovsky wrote:
>>> On 06/26/2015 10:52 AM, Jan Beulich wrote:
>>>>>>> On 26.06.15 at 16:34, <ian.campbell@citrix.com> wrote:
>>>>> I did this using rdmsr from mst-tools instead, running on a native
>>>>> kernel gave:
>>>>>
>>>>> # for i in $(seq 0 31) ;do rdmsr -p $i MSR_K8_TOP_MEM2; done
>>>>> 0
>>>>> [...]
>>>>> 0
>>> Is MSR_K8_TOP_MEM2 defined somewhere in the shell?
>> There is no $ there, so it wouldn't make any difference...
>>
>> I had foolishly assumed that rdmsr would either know the names of the
>> MSRs or it would complain about a string it didn't understand which
>> wasn't a number.
>>
>> Instead it just reads some random register which happens to be
>> strtoul("MSR_K8_TOP_MEM2"), how helpful.
> => https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=790075
>
>>> Just to make sure, could you use explicit address, i.e.
>>>
>>> for i in $(seq 0 31) ;do rdmsr -p $i 0xc001001d; done
> It reported 43f000000 on all processors on native (and only the first 8
> on Xen due to limited dom0 vcpus).
>
>>> (and if they are still all zeroes, can you read 0xc0010010 (SYSCFG) as
>>> well?)
> It wasn't all zeroes, but anyway, it reported 740000 on all processors
> on native (I forgot to run under Xen).

Thanks, so this means that we do have WB memory above 4G.

(And I am not sure I understand why Jan said MTRRs show that memory
above 4G is UC in
http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg04397.html .
The log also seems to suggest that it is WB, doesn't it?)

-boris
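The two register values reported in this exchange can be decoded with plain shell arithmetic. The sketch below rests on assumptions that should be checked against AMD's documentation for the CPU family in question: it takes MtrrTom2En to be SYSCFG bit 21 and Tom2ForceMemTypeWB to be SYSCFG bit 22. The helper names to_mib and bit are hypothetical.

```shell
# Values reported in the thread for MSR 0xc001001d (TOP_MEM2)
# and MSR 0xc0010010 (SYSCFG).
top_mem2=0x43f000000
syscfg=0x740000

# Convert a byte address or size to MiB.
to_mib() { echo $(( $1 >> 20 )); }

# Print 1 if bit $2 of value $1 is set, else 0.
bit() { echo $(( ($1 >> $2) & 1 )); }

echo "TOP_MEM2:                $(to_mib "$top_mem2") MiB"
echo "DRAM above 4G:           $(to_mib $(( $top_mem2 - 0x100000000 ))) MiB"
echo "MtrrTom2En (21):         $(bit "$syscfg" 21)"   # assumed bit position
echo "Tom2ForceMemTypeWB (22): $(bit "$syscfg" 22)"   # assumed bit position
```

With the values from the thread this gives 17392 MiB for TOP_MEM2 and 13296 MiB above 4G, the latter agreeing with the 13296 MB range in the DMI memory map earlier in the thread. Both bits come out as 1, which, if the assumed bit positions are right, is consistent with the conclusion that memory between 4G and TOP_MEM2 is WB.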
* Re: Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees
2015-06-29 13:13 ` Boris Ostrovsky
@ 2015-07-06 9:38 ` Jan Beulich
0 siblings, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2015-07-06 9:38 UTC (permalink / raw)
To: Ian Campbell, Boris Ostrovsky
Cc: Lars Kurth, Stefano Stabellini, Andrew Cooper, Dario Faggioli, Ian Jackson, Aravind Gopalakrishnan, suravee.suthikulpanit, Anthony Perard, xen-devel

>>> On 29.06.15 at 15:13, <boris.ostrovsky@oracle.com> wrote:
> On 06/29/2015 06:23 AM, Ian Campbell wrote:
>> On Fri, 2015-06-26 at 21:07 +0100, Ian Campbell wrote:
>>> On Fri, 2015-06-26 at 15:36 -0400, Boris Ostrovsky wrote:
>>>> Just to make sure, could you use explicit address, i.e.
>>>>
>>>> for i in $(seq 0 31) ;do rdmsr -p $i 0xc001001d; done
>> It reported 43f000000 on all processors on native (and only the first 8
>> on Xen due to limited dom0 vcpus).
>>
>>>> (and if they are still all zeroes, can you read 0xc0010010 (SYSCFG) as
>>>> well?)
>> It wasn't all zeroes, but anyway, it reported 740000 on all processors
>> on native (I forgot to run under Xen).
>
> Thanks, so this means that we do have WB memory above 4G.

Good (because no fw problem) and bad (because still no reason for the
observed behavior).

> (And I am not sure I understand why Jan said MTRRs show that memory
> above 4G is UC in
> http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg04397.html .
> The log also seems to suggest that it is WB, doesn't it?)

How would it? The two MTRRs only cover the ranges 0-2G and 2G-3G
afaics.

Jan
* Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)
2015-06-24 9:06 ` Ian Campbell
2015-06-24 9:38 ` Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) Ian Campbell
@ 2015-06-24 9:45 ` Jan Beulich
1 sibling, 0 replies; 32+ messages in thread
From: Jan Beulich @ 2015-06-24 9:45 UTC (permalink / raw)
To: Ian Campbell; +Cc: Lars Kurth, Ian Jackson, xen-devel, Stefano Stabellini

>>> On 24.06.15 at 11:06, <ian.campbell@citrix.com> wrote:
> After that baseline I ran a few tests of just the windows + qemuu stuff:
> http://xenbits.xen.org/people/ianc/tmp/adhoc/37619/
>
> was allowing free reign on the machines and was mostly successful, apart
> from the windows-install failure on lake-frog. Looking at the test
> history this seems to have always been a problem on the old infra.
> *-frog are "AMD Opteron(tm) Processor 6168" which is as close as the old
> infra has to the new colos merlot[01] which is "AMD Opteron(tm)
> Processor 6376".
>
> With that in mind I reran with things limited to the two frog-* boxes
> and got http://xenbits.xen.org/people/ianc/tmp/adhoc/37624/.
>
> The windows-install of winxpsp3 persisted but there was no migration
> failure elsewhere.
>
> It's not a lot of data, but in comparison with the results in the colo:
> http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64
> -xl-qemuu-win7-amd64/xen-4.5-testing.html
> it looks like it's the newer system which is exposing the issue.

Thanks for doing all of this! While not pointing towards a solution on
the side of the newer systems, it at least reassures us that we didn't
release regressing software with 4.5.1.

Jan
end of thread, other threads:[~2015-07-06 9:38 UTC | newest]
Thread overview: 32+ messages
2015-06-16 3:43 [xen-4.2-testing test] 58584: regressions - trouble: blocked/broken/fail/pass osstest service user
2015-06-17 8:53 ` stable trees (was: [xen-4.2-testing test] 58584: regressions) Jan Beulich
2015-06-17 10:26 ` Ian Jackson
2015-06-17 13:16 ` Stefano Stabellini
2015-06-18 11:37 ` Jan Beulich
2015-06-18 14:22 ` Ian Campbell
2015-06-19 9:51 ` Jan Beulich
2015-06-19 11:07 ` Ian Campbell
2015-06-24 9:06 ` Ian Campbell
2015-06-24 9:38 ` Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees (was: [xen-4.2-testing test] 58584: regressions)) Ian Campbell
2015-06-24 12:29 ` Dario Faggioli
2015-06-24 12:41 ` Jan Beulich
2015-06-24 13:15 ` Dario Faggioli
2015-06-24 13:28 ` Jan Beulich
2015-06-24 13:54 ` Dario Faggioli
2015-06-26 10:37 ` Ian Campbell
2015-06-26 10:49 ` Jan Beulich
2015-06-26 11:16 ` Ian Campbell
2015-06-26 12:37 ` Ian Campbell
2015-06-26 12:59 ` Jan Beulich
2015-06-26 14:44 ` Ian Campbell
2015-06-26 14:53 ` Jan Beulich
2015-06-26 12:20 ` Jan Beulich
2015-06-26 14:34 ` Ian Campbell
2015-06-26 14:52 ` Jan Beulich
2015-06-26 16:23 ` Ian Campbell
2015-06-26 19:36 ` Problems with merlot* AMD Opteron 6376 systems (Was Re: stable trees Boris Ostrovsky
2015-06-26 20:07 ` Ian Campbell
2015-06-29 10:23 ` Ian Campbell
2015-06-29 13:13 ` Boris Ostrovsky
2015-07-06 9:38 ` Jan Beulich
2015-06-24 9:45 ` stable trees (was: [xen-4.2-testing test] 58584: regressions) Jan Beulich