From: Shaohua Li <shli-b10kYP2dOMg@public.gmane.org>
To: Adam Morrison <mad-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org>
Cc: Kernel-team-b10kYP2dOMg@public.gmane.org,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Subject: Re: [PATCH V2 0/4]IOMMU: avoid lock contention in iova allocation
Date: Thu, 21 Jan 2016 16:18:10 -0800 [thread overview]
Message-ID: <20160122001801.GA4114061@devbig084.prn1.facebook.com> (raw)
In-Reply-To: <20160121231326.GA16905-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org>
On Fri, Jan 22, 2016 at 01:14:56AM +0200, Adam Morrison wrote:
> On Wed, Jan 20, 2016 at 01:14:35PM -0800, Shaohua Li wrote:
>
> > > My understanding from the above is that the only issue with our
> > > patchset was not dealing with pfn_limit. I can just fix that and
> > > repost, sounds good?
> >
> > Sure, please do it. For the patches, I'm not comformatable about the
> > per-cpu deferred invalidation. One important benefit of IOMMU is
> > isolation. Deferred invalidation already loose the isolation, per-cpu
> > invalidation loose further. It would be better we can flush all per-cpu
> > invalidation entries if one cpu hits its per-cpu limit. Also you'd
> > better look at CPU hotplug. We don't want to lose the invalidation
> > entries if one cpu is hot removed.
>
> I'll look into these.
>
> > The per-cpu iova implementation looks unnecessary complicated. I know
> > you are referring the paper, but the whole point is batch
> > allocation/free.
>
> Batched allocation/free isn't enough. It still creates spinlock
> contention, even if there is per-cpu invalidation (that gets rid of
> async_umap_flush_lock). Here are sample results from our memcached
> test (throughput of querying 16 memcached instances on a 16-core box
> with an Intel XL710 NIC):
>
> batched alloc/free, iommu=on:
> 313,161 memcached transactions/sec (= 29% of iommu=off)
>
> batched alloc/free + per-cpu invalidations, iommu=on:
> 434,590 memcached transactions/sec (= 40% of iommu=off)
>
> perf report:
> 61.15% 0.33% swapper [kernel.kallsyms] [k] _raw_spin_lock_irqsave
> |
> ---_raw_spin_lock_irqsave
> |
> |--87.81%-- free_iova_array
> |--11.71%-- alloc_iova
>
> In contrast, the per-cpu magazine cache in our patchset enables iova
> allocation/free to complete without accessing the iova allocator at
> all. So we don't touch the rbtree spinlock, and also complete iova
> allocation in constant time, which avoids the linear-time allocations
> that the iova allocator suffers from. (These were described in the
> paper "Efficient intra-operating system protection against harmful
> DMAs", presented at the USENIX FAST 2015 conference.) The end result:
>
> magazines cache + per-cpu invalidations, iommu=on:
> 1,067,586 memcached transactions/sec (= 98% of iommu=off)
Yes, I know a typical per-cpu cache algorithm will free entry to the
per-cpu cache first. I didn't do it because it's not worthy in my test.
We definitely should do it if it's worthy though. My point is we don't
need implement the per-cpu that complicated. It could be simply:
alloc:
- refill per-cpu cache if it's empty
- alloc from per-cpu cache
free
- free to per-cpu cache
- if per-cpu cache hits threshold, free to global
And please don't cache too much. The DMA address below 4G is still
precious.
Thanks,
Shaohua
next prev parent reply other threads:[~2016-01-22 0:18 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-09 0:44 [PATCH V2 0/4]IOMMU: avoid lock contention in iova allocation Shaohua Li
[not found] ` <cover.1452297604.git.shli-b10kYP2dOMg@public.gmane.org>
2016-01-09 0:44 ` [PATCH V2 1/4] IOMMU: add a percpu cache for " Shaohua Li
2016-01-09 0:44 ` [PATCH V2 2/4] iommu: free_iova doesn't need lock twice Shaohua Li
2016-01-09 0:44 ` [PATCH V2 3/4] intel-iommu: remove find_iova in unmap path Shaohua Li
2016-01-09 0:44 ` [PATCH V2 4/4] intel-iommu: do batch iova free Shaohua Li
2016-01-10 22:56 ` [PATCH V2 0/4]IOMMU: avoid lock contention in iova allocation Adam Morrison
[not found] ` <20160110225610.GA16778-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org>
2016-01-11 3:37 ` Shaohua Li
[not found] ` <20160111033749.GA1811285-tb7CFzD8y5b7E6g3fPdp/g2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2016-01-20 12:21 ` Joerg Roedel
[not found] ` <20160120122103.GE18805-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2016-01-20 18:10 ` Shaohua Li
[not found] ` <20160120180956.GA230160-tb7CFzD8y5b7E6g3fPdp/g2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2016-01-20 20:47 ` Adam Morrison
[not found] ` <CAHMfzJmAUWT8X4492_84smUgQMW9pW1yNfOxC6e7ur9TitA2cA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-01-20 21:14 ` Shaohua Li
[not found] ` <20160120211421.GA684235-tb7CFzD8y5b7E6g3fPdp/g2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2016-01-21 23:14 ` Adam Morrison
[not found] ` <20160121231326.GA16905-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org>
2016-01-22 0:18 ` Shaohua Li [this message]
[not found] ` <20160122001801.GA4114061-tb7CFzD8y5b7E6g3fPdp/g2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2016-01-22 11:35 ` Adam Morrison
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160122001801.GA4114061@devbig084.prn1.facebook.com \
--to=shli-b10kyp2domg@public.gmane.org \
--cc=Kernel-team-b10kYP2dOMg@public.gmane.org \
--cc=iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
--cc=mad-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.