* [Bug 188271] New: IOMMU DMAR fault with NVIDIA CUDA peer to peer
@ 2016-11-21 17:16 bugzilla-daemon
2016-11-21 17:17 ` [Bug 188271] " bugzilla-daemon
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: bugzilla-daemon @ 2016-11-21 17:16 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=188271
Bug ID: 188271
Summary: IOMMU DMAR fault with NVIDIA CUDA peer to peer
Product: Drivers
Version: 2.5
Kernel Version: 4.8.6
Hardware: x86-64
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri@kernel-bugs.osdl.org
Reporter: vadim@sourced.tech
Regression: No
My motherboard is a Supermicro X10DRG-Q (details in the attached dmidecode
output). It has two Xeon E5-2620 v4 CPUs (details in the attached lscpu
output). Two Titan X (2016) GPUs are installed in PCIe slots (see the
nvidia-smi output). After peer-to-peer access between the two cards is
enabled, cudaMemcpyPeer() hangs and dmesg shows:
[16193.612535] DMAR: DRHD: handling fault status reg 602
[16193.617662] DMAR: [DMA Write] Request device [82:00.0] fault addr 387fc000c000 [fault reason 05] PTE Write access is not set
[16193.661857] DMAR: DRHD: handling fault status reg 702
[16193.666976] DMAR: [DMA Write] Request device [82:00.0] fault addr f8139000 [fault reason 05] PTE Write access is not set
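For triage, the fault lines quoted above can be split into fields mechanically. The following is a minimal Python sketch, assuming the field layout seen in those quoted lines; the regular expressions and the names unwrap() and parse_dmar_faults() are illustrative and are not taken from the kernel source or from any existing tool.

```python
import re

# A new dmesg record starts with a "[seconds.micros]" timestamp.
TS_RE = re.compile(r"^\[\s*\d+\.\d+\]")

# Layout inferred only from the fault lines quoted in this report (assumption).
FAULT_RE = re.compile(
    r"DMAR: \[DMA (?P<op>Read|Write)\]\s+Request device \[(?P<device>[0-9a-f:.]+)\]\s+"
    r"fault addr (?P<addr>[0-9a-f]+)\s+\[fault reason (?P<reason>\d+)\]\s+(?P<text>.+)"
)


def unwrap(log):
    """Re-join records that were hard-wrapped when pasted into a mail."""
    records = []
    for line in log.splitlines():
        line = line.strip()
        if not line:
            continue
        if TS_RE.match(line) or not records:
            records.append(line)          # timestamp: start of a new record
        else:
            records[-1] += " " + line     # no timestamp: continuation line
    return records


def parse_dmar_faults(log):
    """Return one dict per DMA fault record found in the log text."""
    faults = []
    for record in unwrap(log):
        m = FAULT_RE.search(record)
        if m:
            faults.append({
                "op": m.group("op"),
                "device": m.group("device"),        # PCI bus:device.function
                "addr": int(m.group("addr"), 16),   # faulting DMA address
                "reason": int(m.group("reason")),
                "text": m.group("text").strip(),
            })
    return faults
```

Feeding it the dmesg excerpt above gives one entry per "[DMA Write]" record; the "handling fault status reg" lines are skipped because they do not match the fault pattern.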
I am using CoreOS, and all of this happens inside a Docker container
running with --device /dev/nvidiactl --device /dev/nvidia0 --device /dev/nvidia1
--device /dev/nvidia-uvm --privileged --security-opt seccomp=unconfined
Adding intel_iommu=igfx_off to the kernel command line cures the problem,
and peer-to-peer then works perfectly.
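To confirm which intel_iommu options a running kernel was actually booted with, the command line can be inspected. A minimal Python sketch follows; the function name intel_iommu_options() is hypothetical, and it relies on intel_iommu= taking a comma-separated list of values.

```python
def intel_iommu_options(cmdline):
    """Collect the values of every intel_iommu= parameter on a kernel command line."""
    options = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "intel_iommu" and sep:
            # Several values may be combined, e.g. intel_iommu=on,igfx_off
            options.extend(v for v in value.split(",") if v)
    return options
```

On a live system the input would be the contents of /proc/cmdline, for example intel_iommu_options(open("/proc/cmdline").read()).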
--
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
* [Bug 188271] IOMMU DMAR fault with NVIDIA CUDA peer to peer
From: bugzilla-daemon @ 2016-11-21 17:17 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=188271
--- Comment #1 from Vadim Markovtsev <vadim@sourced.tech> ---
Created attachment 245361
--> https://bugzilla.kernel.org/attachment.cgi?id=245361&action=edit
dmidecode -t 2
* [Bug 188271] IOMMU DMAR fault with NVIDIA CUDA peer to peer
From: bugzilla-daemon @ 2016-11-21 17:17 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=188271
--- Comment #2 from Vadim Markovtsev <vadim@sourced.tech> ---
Created attachment 245371
--> https://bugzilla.kernel.org/attachment.cgi?id=245371&action=edit
lscpu
* [Bug 188271] IOMMU DMAR fault with NVIDIA CUDA peer to peer
From: bugzilla-daemon @ 2016-11-21 17:17 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=188271
--- Comment #3 from Vadim Markovtsev <vadim@sourced.tech> ---
Created attachment 245381
--> https://bugzilla.kernel.org/attachment.cgi?id=245381&action=edit
lspci -knnv
* [Bug 188271] IOMMU DMAR fault with NVIDIA CUDA peer to peer
From: bugzilla-daemon @ 2016-11-21 17:18 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=188271
--- Comment #4 from Vadim Markovtsev <vadim@sourced.tech> ---
Created attachment 245391
--> https://bugzilla.kernel.org/attachment.cgi?id=245391&action=edit
nvidia-smi topo -m
* [Bug 188271] IOMMU DMAR fault with NVIDIA CUDA peer to peer
From: bugzilla-daemon @ 2016-11-21 17:18 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=188271
--- Comment #5 from Vadim Markovtsev <vadim@sourced.tech> ---
Created attachment 245401
--> https://bugzilla.kernel.org/attachment.cgi?id=245401&action=edit
cat /proc/cmdline
Added intel_iommu=off
* [Bug 188271] IOMMU DMAR fault with NVIDIA CUDA peer to peer
From: bugzilla-daemon @ 2016-11-21 17:18 UTC (permalink / raw)
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=188271
--- Comment #6 from Vadim Markovtsev <vadim@sourced.tech> ---
Created attachment 245411
--> https://bugzilla.kernel.org/attachment.cgi?id=245411&action=edit
uname -a