From: Adam Morrison <mad-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org>
To: dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org,
joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Cc: serebrin-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org,
dan-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org,
omer-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org,
shli-b10kYP2dOMg@public.gmane.org,
gvdl-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org,
Kernel-team-b10kYP2dOMg@public.gmane.org
Subject: [PATCH v2 0/7] Intel IOMMU scalability improvements
Date: Wed, 13 Apr 2016 21:50:49 +0300
Message-ID: <cover.1460548546.git.mad@cs.technion.ac.il>
This patchset improves the scalability of the Intel IOMMU code by
resolving two spinlock bottlenecks, yielding up to ~5x performance
improvement and approaching iommu=off performance.
For example, here's the throughput obtained by 16 memcached instances
running on a 16-core Sandy Bridge system, accessed using memslap on
another machine that has iommu=off, using the default memslap config
(64-byte keys, 1024-byte values, and 10%/90% SET/GET ops):
stock iommu=off:
    990,803 memcached transactions/sec (=100%, median of 10 runs).
stock iommu=on:
    221,416 memcached transactions/sec (=22%).
    [61.70% 0.63% memcached [kernel.kallsyms] [k] _raw_spin_lock_irqsave]
patched iommu=on:
    963,457 memcached transactions/sec (=97%).
    [1.04% 0.94% memcached [kernel.kallsyms] [k] _raw_spin_lock_irqsave]
The two resolved spinlocks:
- Deferred IOTLB invalidations are batched in a global data structure
and serialized under a spinlock (add_unmap() & flush_unmaps()); this
patchset batches IOTLB invalidations in a per-CPU data structure.
- IOVA management (alloc_iova() & __free_iova()) is serialized under
the rbtree spinlock; this patchset adds per-CPU caches of allocated
IOVAs so that the rbtree doesn't get accessed frequently. (Adding a
cache above the existing IOVA allocator is less intrusive than dynamic
identity mapping and helps keep IOMMU page table usage low; see
Patch 7.)
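To illustrate the second point, here is a minimal userspace sketch of a per-CPU cache layered above a slow, lock-protected allocator. It is not the code from Patch 7: the names (slow_alloc, cpu_cache, cached_alloc_pfn(), cached_free_pfn()), the CPU count and the cache capacity are all hypothetical, and the "rbtree" is reduced to a counter that records how often its lock would have been taken.

```c
#include <assert.h>
#include <stddef.h>

#define NCPUS      4	/* hypothetical CPU count */
#define CACHE_SIZE 8	/* hypothetical per-CPU cache capacity */

/* Stand-in for the rbtree allocator; it only counts lock acquisitions. */
struct slow_alloc {
	unsigned long next_pfn;
	unsigned long lock_acquisitions;
};

/* Small per-CPU stack of recently freed IOVA frame numbers. */
struct cpu_cache {
	unsigned long pfns[CACHE_SIZE];
	size_t count;
};

struct cached_alloc {
	struct slow_alloc slow;
	struct cpu_cache cpu[NCPUS];
};

static unsigned long slow_alloc_pfn(struct slow_alloc *s)
{
	s->lock_acquisitions++;		/* models taking the rbtree spinlock */
	return s->next_pfn++;
}

static void slow_free_pfn(struct slow_alloc *s, unsigned long pfn)
{
	(void)pfn;			/* this sketch does not recycle here */
	s->lock_acquisitions++;		/* models taking the rbtree spinlock */
}

/* Fast path: reuse a cached frame number without touching the lock. */
static unsigned long cached_alloc_pfn(struct cached_alloc *a, int cpu)
{
	struct cpu_cache *c = &a->cpu[cpu];

	assert(cpu >= 0 && cpu < NCPUS);
	if (c->count > 0)
		return c->pfns[--c->count];
	return slow_alloc_pfn(&a->slow);
}

/* Fast path: park the frame number in the per-CPU cache if there is room. */
static void cached_free_pfn(struct cached_alloc *a, int cpu, unsigned long pfn)
{
	struct cpu_cache *c = &a->cpu[cpu];

	assert(cpu >= 0 && cpu < NCPUS);
	if (c->count < CACHE_SIZE) {
		c->pfns[c->count++] = pfn;
		return;
	}
	slow_free_pfn(&a->slow, pfn);
}
```

In this sketch, an alloc/free/alloc sequence on one CPU takes the modelled lock once, on the first allocation only. The fixed capacity stands in for the cap on per-CPU cache size, which bounds how much IOVA space the caches can hold back.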
The paper "Utilizing the IOMMU Scalably" (presented at the 2015 USENIX
Annual Technical Conference) contains many more details and experiments:
https://www.usenix.org/system/files/conference/atc15/atc15-paper-peleg.pdf
v2:
* Extend IOVA API instead of modifying it, to not break the API's other
non-Intel callers.
* Perform the deferred invalidations of all CPUs if one CPU hits its
per-CPU limit, so that we don't defer invalidations more than before.
* Smaller cap on per-CPU cache size, to consume less of the IOVA space.
* Free resources and perform IOTLB invalidations when a CPU is hot-unplugged.
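The per-CPU deferral and the v2 flush rule can be sketched in userspace as follows. This is an illustration under stated assumptions, not the code from Patch 2: the names (deferred_queue, deferred_state, defer_unmap(), flush_all()), the CPU count and the queue limit are hypothetical, and the IOTLB flush is modelled by a counter. Locking and any timer-driven flush are left out.

```c
#include <assert.h>
#include <stddef.h>

#define NCPUS       4	/* hypothetical CPU count */
#define QUEUE_LIMIT 3	/* hypothetical per-CPU deferral limit */

/* Per-CPU queue of unmapped frame numbers awaiting IOTLB invalidation. */
struct deferred_queue {
	unsigned long pfns[QUEUE_LIMIT];
	size_t count;
};

struct deferred_state {
	struct deferred_queue cpu[NCPUS];
	unsigned long flushes;		/* modelled IOTLB flushes */
	unsigned long released;		/* entries released after a flush */
};

/* Drain every CPU's queue, not only the queue that filled up. */
static void flush_all(struct deferred_state *s)
{
	s->flushes++;			/* one modelled flush covers all queues */
	for (int i = 0; i < NCPUS; i++) {
		s->released += s->cpu[i].count;
		s->cpu[i].count = 0;
	}
}

/* Queue an unmap on the calling CPU; flush everything at the limit. */
static void defer_unmap(struct deferred_state *s, int cpu, unsigned long pfn)
{
	struct deferred_queue *q = &s->cpu[cpu];

	assert(cpu >= 0 && cpu < NCPUS);
	q->pfns[q->count++] = pfn;
	if (q->count == QUEUE_LIMIT)
		flush_all(s);
}
```

Because flush_all() drains every queue, an entry deferred on a lightly loaded CPU is released as soon as any other CPU reaches the limit, so no entry waits longer than it would have in a single global queue of the same size.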
Omer Peleg (7):
iommu: refactoring of deferred flush entries
iommu: per-cpu deferred invalidation queues
iommu: correct flush_unmaps pfn usage
iommu: only unmap mapped entries
iommu: avoid dev iotlb logic in intel-iommu for domains with no dev
iotlbs
iommu: change intel-iommu to use IOVA frame numbers
iommu: introduce per-cpu caching to iova allocation
drivers/iommu/intel-iommu.c | 318 ++++++++++++++++++++++++++-----------
drivers/iommu/iova.c | 372 +++++++++++++++++++++++++++++++++++++++++---
include/linux/iova.h | 23 ++-
3 files changed, 593 insertions(+), 120 deletions(-)
--
1.9.1
Thread overview: 17+ messages
2016-04-13 18:50 Adam Morrison [this message]
[not found] ` <cover.1460548546.git.mad-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org>
2016-04-13 18:51 ` [PATCH v2 1/7] iommu: refactoring of deferred flush entries Adam Morrison
2016-04-13 18:51 ` [PATCH v2 2/7] iommu: per-cpu deferred invalidation queues Adam Morrison
2016-04-13 18:51 ` [PATCH v2 3/7] iommu: correct flush_unmaps pfn usage Adam Morrison
2016-04-13 18:52 ` [PATCH v2 4/7] iommu: only unmap mapped entries Adam Morrison
[not found] ` <e07164c8d0aaff68cabd2cf8e3aee9ed20882ae4.1460548546.git.mad-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org>
2016-04-13 20:37 ` Shaohua Li
2016-04-13 18:52 ` [PATCH v2 5/7] iommu: avoid dev iotlb logic in intel-iommu for domains with no dev iotlbs Adam Morrison
2016-04-13 18:52 ` [PATCH v2 6/7] iommu: change intel-iommu to use IOVA frame numbers Adam Morrison
2016-04-13 18:52 ` [PATCH v2 7/7] iommu: introduce per-cpu caching to iova allocation Adam Morrison
[not found] ` <b208a304d83088aae7ecac10a3062dc57c0a2f79.1460548546.git.mad-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org>
2016-04-13 20:43 ` Shaohua Li
2016-04-14 18:26 ` Benjamin Serebrin via iommu
[not found] ` <CAN+hb0W=+tuQp3cm_VKoU=LKiVQDPMtGrZGq=59rcaWsy2S-+A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-04-14 21:05 ` Adam Morrison
[not found] ` <CAHMfzJmjZWeUpmTVb-Z7NMJUp0N84ZK4zwUWdAKHv4sd4TXPMg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-04-14 21:18 ` Benjamin Serebrin via iommu
[not found] ` <CAN+hb0WOaokFYc6C+mR6rdj4WwmMUSzDHDZngfUvy-5cEve_-g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-04-14 21:33 ` Shaohua Li
[not found] ` <20160414213326.GA474260-tb7CFzD8y5b7E6g3fPdp/g2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2016-04-15 4:59 ` Benjamin Serebrin via iommu
[not found] ` <CAN+hb0WRDYCpY8xoUvvGu4SSD83F9VTMW=9W=xfjYtJV-dijmQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-04-15 17:55 ` Shaohua Li
[not found] ` <20160415175520.GA2644484-tb7CFzD8y5b7E6g3fPdp/g2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2016-04-17 18:05 ` Adam Morrison