From: "Michael S. Tsirkin" <mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Alexander Duyck
<alexander.h.duyck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Yan Burman <yanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
Paolo Bonzini <pbonzini-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Asias He <asias-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: decent performance drop for SCSI LLD / SAN initiator when iommu is turned on
Date: Tue, 7 May 2013 15:22:35 +0300
Message-ID: <20130507122235.GA21361@redhat.com>
In-Reply-To: <5188304E.9050603-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
On Mon, May 06, 2013 at 03:35:58PM -0700, Alexander Duyck wrote:
> On 05/06/2013 02:39 PM, Or Gerlitz wrote:
> > On Thu, May 2, 2013 at 4:56 AM, Michael S. Tsirkin <mst-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> >> On Thu, May 02, 2013 at 02:11:15AM +0300, Or Gerlitz wrote:
> >>> So we've noted that when configuring the kernel && booting with intel
> >>> iommu set to on on a physical node (non VM, and without enabling SRIOV
> >>> by the HW device driver) raw performance of the iSER (iSCSI RDMA) SAN
> >>> initiator is reduced notably, e.g. in the testbed we looked at today we
> >>> had ~260K 1KB random IOPS and 5.5GB/s BW for 128KB IOs with iommu
> >>> turned off for single LUN, and ~150K IOPS and 4GB/s BW with iommu
> >>> turned on. No change on the target node between runs.
> >> That's why we have iommu=pt.
> >> See definition of iommu_pass_through in arch/x86/kernel/pci-dma.c.
> >
> >
> > Hi Michael (hope you feel better),
> >
> > We did some runs with the pt approach you suggested and still didn't
> > get the promised gain -- in parallel we came across this 2012 commit
> > f800326dc "ixgbe: Replace standard receive path with a page based
> > receive" where they say "[...] we are able to see a considerable
> > performance gain when an IOMMU is enabled because we are no longer
> > unmapping every buffer on receive [...] instead we can simply call
> > sync_single_range [...]". Looking at the commit, you can see that they
> > allocate a page/skb, dma_map it initially, and later in the life cycle
> > of that buffer use dma_sync_for_device/cpu, avoiding dma_map/unmap on
> > the fast path.
> >
> > A few questions on which I'd love to hear people's opinions. First, this
> > approach seems cool for the network device RX path, but what about the TX
> > path? Any idea how to avoid dma_map for it, or why calling
> > dma_map/unmap for every buffer on the TX path doesn't involve a notable
> > perf hit? Second, I don't see how to apply the method to block devices,
> > since these devices don't allocate buffers but rather get a scatter-gather
> > list of pages from upper layers, issue dma_map_sg on them and submit
> > the IO, and later, when done, call dma_unmap_sg.
> >
> > Or.
>
> The Tx path ends up taking a performance hit if IOMMU is enabled. It
> just isn't as severe due to things like TSO.
>
> One way to work around the performance penalty is to allocate bounce
> buffers and just leave them static mapped. Then you can simply memcpy
> the data to the buffers and avoid the locking overhead of
> allocating/freeing IOMMU resources. It consumes more memory but works
> around the IOMMU limitations.
>
> Thanks,
>
> Alex
But why isn't iommu=pt effective?
AFAIK the whole point of it was to give up on security
for host-controlled devices, but still get a
measure of security for assigned devices.
--
MST
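
[Editorial illustration, not part of the original message.] The
static bounce-buffer workaround Alex describes in the quoted text can
be sketched roughly as below. This is a user-space model only, not
driver code: fake_map(), fake_dma_addr_t, struct bounce_pool and the
pool_* helpers are all hypothetical names invented for the sketch, and
the "mapping" is just a counter standing in for the expensive IOMMU
map operation. The point it tries to show is that mapping happens once
per slot at init time, while the fast path only does a memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_SLOTS 4      /* assumed pool depth, chosen arbitrarily */
#define SLOT_SIZE  4096   /* assumed per-slot size, one page */

/* Hypothetical stand-in for the bus address a device would use. */
typedef uintptr_t fake_dma_addr_t;

struct bounce_pool {
    unsigned char   data[POOL_SLOTS][SLOT_SIZE]; /* bounce buffers */
    fake_dma_addr_t dma[POOL_SLOTS];             /* mapped once at init */
    int             busy[POOL_SLOTS];            /* slot owned by "device" */
    unsigned long   map_calls;                   /* counts modelled map operations */
};

/* Models the costly map step; a real driver would call the DMA API here. */
static fake_dma_addr_t fake_map(struct bounce_pool *p, void *cpu_addr)
{
    p->map_calls++;
    return (fake_dma_addr_t)cpu_addr;
}

/* Slow path: map every slot once and keep the mappings for the pool's life. */
void pool_init(struct bounce_pool *p)
{
    memset(p, 0, sizeof(*p));
    for (int i = 0; i < POOL_SLOTS; i++)
        p->dma[i] = fake_map(p, p->data[i]);
}

/*
 * Fast path: copy the payload into a free, already-mapped slot.
 * No map/unmap happens here. A real driver would also need to sync
 * the buffer for the device after the copy.
 * Returns the slot index, or -1 if the payload is too big or the pool is full.
 */
int pool_stage(struct bounce_pool *p, const void *src, size_t len,
               fake_dma_addr_t *dma_out)
{
    if (len > SLOT_SIZE)
        return -1;
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (!p->busy[i]) {
            p->busy[i] = 1;
            memcpy(p->data[i], src, len);
            *dma_out = p->dma[i];
            return i;
        }
    }
    return -1;
}

/* Completion: hand the slot back; the mapping itself stays in place. */
void pool_complete(struct bounce_pool *p, int slot)
{
    if (slot >= 0 && slot < POOL_SLOTS)
        p->busy[slot] = 0;
}
```

In this model, any number of pool_stage()/pool_complete() cycles leaves
map_calls equal to POOL_SLOTS. The cost is the extra memory for the
pool and a memcpy per I/O, which is the trade-off named in the quoted
text.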