From: Alex Williamson <alex.williamson@redhat.com>
To: Bob Chen <a175818323@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
"Marcel Apfelbaum" <marcel@redhat.com>, 陈博 <chenbo02@meituan.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] About virtio device hotplug in Q35! [External email. Read with caution]
Date: Mon, 31 Jul 2017 23:46:26 -0600
Message-ID: <20170731234626.7664be18@w520.home>
In-Reply-To: <CAMxP3BTFgwJtjh78hNBCoxBp1WsnZMZLsqzb3McqCq=-SX0a4g@mail.gmail.com>
On Tue, 1 Aug 2017 13:04:46 +0800
Bob Chen <a175818323@gmail.com> wrote:
> Hi,
>
> This is a sketch of my hardware topology.
>
>          CPU0           <- QPI ->           CPU1
>            |                                  |
>  Root Port(at PCIe.0)               Root Port(at PCIe.1)
>       /         \                        /         \
Is each of these lines above a separate root port? I.e., does each root
complex host two root ports, each with a two-port switch downstream of
it?
>   Switch      Switch                 Switch      Switch
>    /  \        /  \                   /  \        /  \
>  GPU  GPU    GPU  GPU               GPU  GPU    GPU  GPU
>
>
> And below are the p2p bandwidth test results.
>
> Host:
> D\D       0       1       2       3       4       5       6       7
>   0  426.91   25.32   19.72   19.72   19.69   19.68   19.75   19.66
>   1   25.31  427.61   19.74   19.72   19.66   19.68   19.74   19.73
>   2   19.73   19.73  429.49   25.33   19.66   19.74   19.73   19.74
>   3   19.72   19.71   25.36  426.68   19.70   19.71   19.77   19.74
>   4   19.72   19.72   19.73   19.75  425.75   25.33   19.72   19.71
>   5   19.71   19.75   19.76   19.75   25.35  428.11   19.69   19.70
>   6   19.76   19.72   19.79   19.78   19.73   19.74  425.75   25.35
>   7   19.69   19.75   19.79   19.75   19.72   19.72   25.39  427.15
>
> VM:
> D\D       0       1       2       3       4       5       6       7
>   0  427.38   10.52   18.99   19.11   19.75   19.62   19.75   19.71
>   1   10.53  426.68   19.28   19.19   19.73   19.71   19.72   19.73
>   2   18.88   19.30  426.92   10.48   19.66   19.71   19.67   19.68
>   3   18.93   19.18   10.45  426.94   19.69   19.72   19.67   19.72
>   4   19.60   19.66   19.69   19.70  428.13   10.49   19.40   19.57
>   5   19.52   19.74   19.72   19.69   10.44  426.45   19.68   19.61
>   6   19.63   19.50   19.72   19.64   19.59   19.66  426.91   10.47
>   7   19.69   19.75   19.70   19.69   19.66   19.74   10.45  426.23
Interesting test. How did you get these numbers? What are the units,
GB/s?
> In the VM, the bandwidth between two GPUs under the same physical switch is
> obviously lower, as per the reasons you said in former threads.
Hmm, I'm not sure I can explain why that number is lower than the
bandwidth to more remote GPUs, though. Is the test reading and writing
simultaneously, so that we overload the link to the upstream switch
port? Otherwise I'd expect the bidirectional support in PCIe to be able
to handle the bandwidth. Does the test have a read-only or write-only
mode?
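To make the tiers in the two tables easier to compare, here is a rough Python sketch that averages the posted numbers by topological distance. The grouping is an assumption read off the sketch above (GPUs 2k and 2k+1 share a switch, GPUs 0-3 hang off CPU0 and 4-7 off CPU1); the names `tier` and `tier_means` are only illustrative, and the units are whatever the original test reported.

```python
from statistics import mean

# Bandwidth matrices copied from the quoted tables (units not stated in the post).
HOST = """
426.91 25.32 19.72 19.72 19.69 19.68 19.75 19.66
25.31 427.61 19.74 19.72 19.66 19.68 19.74 19.73
19.73 19.73 429.49 25.33 19.66 19.74 19.73 19.74
19.72 19.71 25.36 426.68 19.70 19.71 19.77 19.74
19.72 19.72 19.73 19.75 425.75 25.33 19.72 19.71
19.71 19.75 19.76 19.75 25.35 428.11 19.69 19.70
19.76 19.72 19.79 19.78 19.73 19.74 425.75 25.35
19.69 19.75 19.79 19.75 19.72 19.72 25.39 427.15
"""

VM = """
427.38 10.52 18.99 19.11 19.75 19.62 19.75 19.71
10.53 426.68 19.28 19.19 19.73 19.71 19.72 19.73
18.88 19.30 426.92 10.48 19.66 19.71 19.67 19.68
18.93 19.18 10.45 426.94 19.69 19.72 19.67 19.72
19.60 19.66 19.69 19.70 428.13 10.49 19.40 19.57
19.52 19.74 19.72 19.69 10.44 426.45 19.68 19.61
19.63 19.50 19.72 19.64 19.59 19.66 426.91 10.47
19.69 19.75 19.70 19.69 19.66 19.74 10.45 426.23
"""


def parse(table):
    """Turn a whitespace-separated table into a list of float rows."""
    return [[float(x) for x in row.split()] for row in table.strip().splitlines()]


def tier(i, j):
    """Classify a GPU pair. ASSUMPTION: layout inferred from the ASCII sketch."""
    if i // 2 == j // 2:
        return "same switch"
    if i // 4 == j // 4:
        return "same CPU, different switch"
    return "across QPI"


def tier_means(matrix):
    """Average the off-diagonal entries of each tier."""
    buckets = {}
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if i != j:  # the diagonal is device-local bandwidth
                buckets.setdefault(tier(i, j), []).append(value)
    return {name: round(mean(values), 2) for name, values in buckets.items()}


host = tier_means(parse(HOST))
vm = tier_means(parse(VM))
for name in host:
    print(f"{name:28s} host={host[name]:6.2f}  vm={vm[name]:6.2f}")
```

If the grouping assumption holds, only the "same switch" tier differs markedly between host and VM, which is the case the question above is about.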
> But what confused me most is that GPUs under different switches could
> achieve the same speed, as well as in the Host. Does that mean after IOMMU
> address translation, data traversing has utilized QPI bus by default? Even
> these two devices do not belong to the same PCIe bus?
Yes, of course. Once the transaction is translated by the IOMMU it's
just a matter of routing the resulting address, whether that's back
down the I/O hierarchy under the same root complex or across the QPI
link to the other root complex. The translated address could just as
easily be to RAM that lives on the other side of the QPI link. Also, it
seems like the IOMMU overhead is perhaps negligible here, unless the
IOMMU is actually being used in both cases.
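One way to tell whether the IOMMU is in use for a given boot is to look at the IOMMU groups the kernel exposes in sysfs and at the IOMMU-related kernel parameters. The following is a minimal Python sketch under that assumption; the helper names `iommu_group_count` and `iommu_flags` are made up for illustration, and an empty parameter list does not by itself prove the IOMMU is off, since defaults depend on the kernel configuration.

```python
from pathlib import Path


def iommu_group_count(root="/sys/kernel/iommu_groups"):
    """Count IOMMU groups under `root`; 0 suggests no active IOMMU."""
    path = Path(root)
    if not path.is_dir():
        return 0
    return sum(1 for entry in path.iterdir() if entry.is_dir())


def iommu_flags(cmdline):
    """Return IOMMU-related parameters found in a kernel command line string."""
    prefixes = ("intel_iommu=", "amd_iommu=", "iommu=")
    return [token for token in cmdline.split() if token.startswith(prefixes)]


if __name__ == "__main__":
    print("IOMMU groups:", iommu_group_count())
    cmdline_file = Path("/proc/cmdline")
    if cmdline_file.exists():
        print("IOMMU parameters:", iommu_flags(cmdline_file.read_text()))
```

Running this on the host before each bandwidth test would record which configuration the numbers belong to.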
In the host test, is the IOMMU still enabled? The routing of PCIe
transactions is going to be governed by ACS, which Linux enables
whenever the IOMMU is enabled, not just when a device is assigned to a
VM. It would be interesting to see if another performance tier is
exposed if the IOMMU is entirely disabled, or perhaps it might better
expose the overhead of the IOMMU translation. It would also be
interesting to see the ACS settings in lspci for each downstream port
for each test. Thanks,
Alex
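
For collecting the ACS settings mentioned above, a small Python sketch like the following could pull the `ACSCtl` flags per device out of `lspci -vvv` text. The sample text is a hypothetical excerpt (device addresses, names, and capability offsets are invented), and `parse_acs_ctl` and `redirects_p2p` are illustrative names. The interpretation that `ReqRedir+` and `CmpltRedir+` on a downstream port send peer-to-peer traffic upstream rather than directly across the switch is my reading of ACS, so treat it as an assumption to verify.

```python
import re

# Hypothetical excerpt in the layout `lspci -vvv` uses for the ACS capability.
SAMPLE = """\
03:08.0 PCI bridge: Example switch downstream port (hypothetical)
    Capabilities: [f24 v1] Access Control Services
        ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl+ DirectTrans+
        ACSCtl: SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl- DirectTrans-
03:10.0 PCI bridge: Example switch downstream port (hypothetical)
    Capabilities: [f24 v1] Access Control Services
        ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd+ EgressCtrl+ DirectTrans+
        ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
"""

# A device header starts in column 0 with [domain:]bus:device.function.
DEVICE_RE = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")
ACSCTL_RE = re.compile(r"^\s*ACSCtl:\s*(.*)$")
FLAG_RE = re.compile(r"(\w+)([+-])")


def parse_acs_ctl(text):
    """Map each device address to its ACSCtl flags as {name: enabled}."""
    result = {}
    device = None
    for line in text.splitlines():
        header = DEVICE_RE.match(line)
        if header:
            device = header.group(1)
            continue
        ctl = ACSCTL_RE.match(line)
        if ctl and device is not None:
            result[device] = {
                name: sign == "+" for name, sign in FLAG_RE.findall(ctl.group(1))
            }
    return result


def redirects_p2p(flags):
    """True when both request and completion redirect are enabled."""
    return flags.get("ReqRedir", False) and flags.get("CmpltRedir", False)


if __name__ == "__main__":
    for address, flags in parse_acs_ctl(SAMPLE).items():
        state = "redirects P2P upstream" if redirects_p2p(flags) else "no P2P redirect"
        print(address, state)
```

To use it on a real system, feed it the output of `lspci -vvv` run as root (capabilities are typically hidden from unprivileged users) once on the host and once with the devices assigned to the VM.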