From: Adam Morrison <mad-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org>
To: dwmw2-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org,
	joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Cc: serebrin-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org,
	dan-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org,
	omer-FrESSTt7Abv7r6psnUbsSmZHpeb/A1Y/@public.gmane.org,
	shli-b10kYP2dOMg@public.gmane.org,
	gvdl-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org,
	Kernel-team-b10kYP2dOMg@public.gmane.org
Subject: [PATCH v2 0/7] Intel IOMMU scalability improvements
Date: Wed, 13 Apr 2016 21:50:49 +0300
Message-ID: <cover.1460548546.git.mad@cs.technion.ac.il>

This patchset improves the scalability of the Intel IOMMU code by
resolving two spinlock bottlenecks, yielding up to a ~5x throughput
improvement and bringing performance close to that of iommu=off.

For example, here's the throughput obtained by 16 memcached instances
running on a 16-core Sandy Bridge system, accessed using memslap on
another machine that has iommu=off, using the default memslap config
(64-byte keys, 1024-byte values, and 10%/90% SET/GET ops):

    stock iommu=off:
       990,803 memcached transactions/sec (=100%, median of 10 runs).
    stock iommu=on:
       221,416 memcached transactions/sec (=22%).
       [61.70%    0.63%  memcached       [kernel.kallsyms]      [k] _raw_spin_lock_irqsave]
    patched iommu=on:
       963,457 memcached transactions/sec (=97%).
       [1.04%     0.94%  memcached       [kernel.kallsyms]      [k] _raw_spin_lock_irqsave]
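
As a quick check of the percentages above, here is a minimal C sketch that
recomputes them from the quoted throughput figures. The helper and constant
names (percent_of(), speedup(), STOCK_IOMMU_OFF, etc.) are made up for this
illustration and are not part of the patchset.

```c
#include <assert.h>

/* Throughput figures quoted above, in memcached transactions/sec. */
static const double STOCK_IOMMU_OFF   = 990803.0;
static const double STOCK_IOMMU_ON    = 221416.0;
static const double PATCHED_IOMMU_ON  = 963457.0;

/* Hypothetical helper: value as a percentage of a baseline. */
static double percent_of(double value, double baseline)
{
	assert(baseline > 0.0);
	return 100.0 * value / baseline;
}

/* Hypothetical helper: ratio of throughput after a change to before it. */
static double speedup(double after, double before)
{
	assert(before > 0.0);
	return after / before;
}
```

percent_of(STOCK_IOMMU_ON, STOCK_IOMMU_OFF) comes to about 22.3 and
percent_of(PATCHED_IOMMU_ON, STOCK_IOMMU_OFF) to about 97.2, matching the
rounded figures above; speedup(PATCHED_IOMMU_ON, STOCK_IOMMU_ON) is about
4.35 for this particular workload.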

The two spinlock bottlenecks resolved:

 - Deferred IOTLB invalidations are batched in a global data structure
   and serialized under a spinlock (add_unmap() & flush_unmaps()); this
   patchset batches IOTLB invalidations in a per-CPU data structure.

 - IOVA management (alloc_iova() & __free_iova()) is serialized under
   the rbtree spinlock; this patchset adds per-CPU caches of allocated
   IOVAs so that the rbtree doesn't get accessed frequently. (Adding a
   cache above the existing IOVA allocator is less intrusive than dynamic
   identity mapping and helps keep IOMMU page table usage low; see
   Patch 7.)
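
To illustrate the second point, here is a minimal userspace sketch of a
per-CPU cache placed in front of a lock-protected allocator. It is not the
code from Patch 7: the names (slow_alloc(), cached_alloc(), CACHE_CAP, ...)
are hypothetical, the "rbtree" is replaced by a bump counter, all ranges are
assumed to be a single size, and the CPU is passed explicitly instead of
using real per-CPU variables with preemption disabled.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS   4	/* hypothetical CPU count */
#define CACHE_CAP 8	/* hypothetical per-CPU cache capacity */

/* Stand-in for the rbtree allocator: a bump counter under one spinlock. */
static atomic_flag pool_lock = ATOMIC_FLAG_INIT;
static unsigned long next_pfn = 0x1000;
static unsigned long lock_acquisitions;	/* counts slow-path entries */

static void pool_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&pool_lock,
						 memory_order_acquire))
		;	/* spin until the lock is free */
	lock_acquisitions++;
}

static void pool_lock_release(void)
{
	atomic_flag_clear_explicit(&pool_lock, memory_order_release);
}

/* Slow path: every call serializes on the global lock. */
static unsigned long slow_alloc(void)
{
	unsigned long pfn;

	pool_lock_acquire();
	pfn = next_pfn++;
	pool_lock_release();
	return pfn;
}

static void slow_free(unsigned long pfn)
{
	(void)pfn;	/* the toy pool never reuses freed pfns */
	pool_lock_acquire();
	pool_lock_release();
}

/* Per-CPU cache: a small stack of recently freed pfns. */
struct cpu_cache {
	unsigned long pfns[CACHE_CAP];
	int n;
};
static struct cpu_cache caches[NR_CPUS];

/* Fast path: reuse a cached pfn without touching the global lock. */
static unsigned long cached_alloc(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (caches[cpu].n > 0)
		return caches[cpu].pfns[--caches[cpu].n];
	return slow_alloc();
}

/* Fast path: stash the pfn locally; fall back when the cache is full. */
static void cached_free(int cpu, unsigned long pfn)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (caches[cpu].n < CACHE_CAP) {
		caches[cpu].pfns[caches[cpu].n++] = pfn;
		return;
	}
	slow_free(pfn);
}
```

In this sketch, a CPU that repeatedly maps and unmaps takes the global lock
only on its first allocation; later cached_alloc()/cached_free() pairs are
served from its own cache, which is the effect the per-CPU IOVA caches aim
for.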

The paper "Utilizing the IOMMU Scalably" (presented at the 2015 USENIX
Annual Technical Conference) contains many more details and experiments:

  https://www.usenix.org/system/files/conference/atc15/atc15-paper-peleg.pdf

v2:

 * Extend the IOVA API instead of modifying it, so as not to break the
   API's non-Intel callers.
 * Perform all pending per-CPU invalidations if any CPU hits its per-CPU
   limit, so that we don't defer invalidations more than before.
 * Smaller cap on per-CPU cache size, to consume less of the IOVA space.
 * Free resources and perform IOTLB invalidations when a CPU is hot-unplugged.
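
The "flush everything when one CPU hits its limit" policy from the list
above can be sketched roughly as follows. This is a single-threaded
userspace illustration, not the patch code: defer_invalidation(),
flush_all_queues() and PER_CPU_LIMIT are hypothetical names, the actual
IOTLB invalidation is replaced by a counter, and the locking a real
cross-CPU flush would need is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS       4	/* hypothetical CPU count */
#define PER_CPU_LIMIT 4	/* hypothetical per-CPU queue capacity */

/* One deferred-invalidation queue per CPU (no shared global list). */
struct deferred_queue {
	unsigned long pfns[PER_CPU_LIMIT];
	size_t n;
};
static struct deferred_queue queues[NR_CPUS];

static unsigned long invalidated;	/* entries flushed so far */
static unsigned long flush_rounds;	/* number of global flushes */

/* Stand-in for issuing the IOTLB invalidations of one queue. */
static void flush_queue(struct deferred_queue *q)
{
	invalidated += q->n;
	q->n = 0;
}

/* Drain every CPU's queue, not just the one that filled up. */
static void flush_all_queues(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		flush_queue(&queues[cpu]);
	flush_rounds++;
}

/* Queue an invalidation on this CPU; flush all queues once it is full. */
static void defer_invalidation(int cpu, unsigned long pfn)
{
	struct deferred_queue *q;

	assert(cpu >= 0 && cpu < NR_CPUS);
	q = &queues[cpu];
	q->pfns[q->n++] = pfn;
	if (q->n == PER_CPU_LIMIT)
		flush_all_queues();
}
```

With this policy, an entry queued on an otherwise idle CPU is still flushed
as soon as any busy CPU fills its own queue, so no entry waits behind more
than one queue's worth of work on another CPU.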


Omer Peleg (7):
  iommu: refactoring of deferred flush entries
  iommu: per-cpu deferred invalidation queues
  iommu: correct flush_unmaps pfn usage
  iommu: only unmap mapped entries
  iommu: avoid dev iotlb logic in intel-iommu for domains with no dev
    iotlbs
  iommu: change intel-iommu to use IOVA frame numbers
  iommu: introduce per-cpu caching to iova allocation

 drivers/iommu/intel-iommu.c | 318 ++++++++++++++++++++++++++-----------
 drivers/iommu/iova.c        | 372 +++++++++++++++++++++++++++++++++++++++++---
 include/linux/iova.h        |  23 ++-
 3 files changed, 593 insertions(+), 120 deletions(-)

-- 
1.9.1

