iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: "Tian, Kevin" <kevin.tian@intel.com>,
	Raj Ashok <ashok.raj@intel.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.com>,
	LKML <linux-kernel@vger.kernel.org>,
	iommu@lists.linux-foundation.org,
	Alex Williamson <alex.williamson@redhat.com>,
	"Wu, Hao" <hao.wu@intel.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Jonathan Cameron <jic23@kernel.org>
Subject: Re: [PATCH 00/10] IOASID extensions for guest SVA
Date: Thu, 2 Apr 2020 09:09:09 -0700	[thread overview]
Message-ID: <20200402090909.4475d430@jacob-builder> (raw)
In-Reply-To: <20200402122633.GC1176452@myrica>

On Thu, 2 Apr 2020 14:26:33 +0200
Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:

> On Wed, Apr 01, 2020 at 04:38:42PM -0700, Jacob Pan wrote:
> > On Wed, 1 Apr 2020 16:03:01 +0200
> > Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:
> >   
> > > Hi Jacob,
> > > 
> > > On Wed, Mar 25, 2020 at 10:55:21AM -0700, Jacob Pan wrote:  
> > > > IOASID was introduced in v5.5 as a generic kernel allocator
> > > > service for both PCIe Process Address Space ID (PASID) and ARM
> > > > SMMU's Sub Stream ID. In addition to basic ID allocation,
> > > > ioasid_set was introduced as a token that is shared by a group
> > > > of IOASIDs. This set token can be used for permission checking
> > > > but lack of some features needed by guest Shared Virtual
> > > > Address (SVA). In addition, IOASID support for life cycle
> > > > management is needed among multiple users.
> > > > 
> > > > This patchset introduces two extensions to the IOASID code,
> > > > 1. IOASID set operations
> > > > 2. Notifications for IOASID state synchronization    
> > > 
> > > My main concern with this series is patch 7 changing the spinlock
> > > to a mutex, which prevents SVA from calling ioasid_free() from
> > > the RCU callback of MMU notifiers. Could we use atomic notifiers,
> > > or do the FREE notification another way?
> > >   
> > Maybe I am looking at the wrong code, I thought
> > mmu_notifier_ops.free_notifier() is called outside spinlock with
> > call_srcu(), which will be invoked in the thread context.
> > in mmu_notifier.c mmu_notifier_put()
> > 	spin_unlock(&mm->notifier_subscriptions->lock);
> > 
> > 	call_srcu(&srcu, &subscription->rcu,
> > mmu_notifier_free_rcu);  
> 
> free_notifier() is called from RCU callback, and according to
> Documentation/RCU/checklist.txt:
> 
> 5.      If call_rcu() or call_srcu() is used, the callback function
> will be called from softirq context.  In particular, it cannot block.
> 
> When applying the patch I get the sleep-in-atomic warning:
> 
> [   87.861793] BUG: sleeping function called from invalid context at
> kernel/locking/mutex.c:935 [   87.863293] in_atomic(): 1,
> irqs_disabled(): 0, non_block: 0, pid: 74, name: kworker/6:1
> [   87.863993] 2 locks held by kworker/6:1/74: [   87.864493]  #0:
> ffffff885ac12538 ((wq_completion)rcu_gp){+.+.}-{0:0}, at:
> process_one_work+0x740/0x1880 [   87.865593]  #1: ffffff88591efd30
> ((work_completion)(&sdp->work)){+.+.}-{0:0}, at:
> process_one_work+0x740/0x1880 [   87.866993] CPU: 6 PID: 74 Comm:
> kworker/6:1 Not tainted 5.6.0-next-20200331+ #121 [   87.867393]
> Hardware name: FVP Base (DT) [   87.867893] Workqueue: rcu_gp
> srcu_invoke_callbacks [   87.868393] Call trace: [   87.868793]
> dump_backtrace+0x0/0x310 [   87.869293]  show_stack+0x14/0x20
> [   87.869693]  dump_stack+0x124/0x180 [   87.870193]
> ___might_sleep+0x2ac/0x428 [   87.870693]  __might_sleep+0x88/0x168
> [   87.871094]  __mutex_lock+0xa0/0x1270
> [   87.871593]  mutex_lock_nested+0x1c/0x28
> [   87.872093]  ioasid_free+0x28/0x48
> [   87.872493]  io_mm_free+0x1d0/0x608
> [   87.872993]  mmu_notifier_free_rcu+0x74/0xe8
> [   87.873393]  srcu_invoke_callbacks+0x1d0/0x2c8
> [   87.873893]  process_one_work+0x858/0x1880
> [   87.874393]  worker_thread+0x314/0xcd0
> [   87.874793]  kthread+0x318/0x400
> [   87.875293]  ret_from_fork+0x10/0x18
> 
You are right, I was reading call_srcu comments too fast. I guess
rcu callbacks are still in softirq not offloaded to kernel threads.

 *
 * The callback will be invoked from process context, but must
 *  nevertheless be fast and must not block.
 */
So even atomic works in principle but not a good idea since it may take
a long time.

> > 
> > Anyway, if we have to use atomic. I tried atomic notifier first,
> > there are two subscribers to the free event on x86.
> > 1. IOMMU
> > 2. KVM
> > 
> > For #1, the problem is that in the free operation, VT-d driver
> > needs to do a lot of clean up in thread context.
> > - hold a mutex to traverse a list of devices
> > - clear PASID entry and flush cache
> > 
> > For #2, KVM might be able to deal with spinlocks for updating VMCS
> > PASID translation table. +Hao
> > 
> > Perhaps two solutions I can think of:
> > 1. Use a cyclic IOASID allocator. The main reason of clean up at
> > free is to prevent race with IOASID alloc. Similar to PID, 2M IOASID
> > will take long time overflow. Then we can use atomic notifier and a
> > deferred workqueue to do IOMMU cleanup. The downside is a large and
> > growing PASID table, may not be a performance issue since it has
> > TLB.  
> 
> That might be a problem for SMMU, which has 1024 * 64kB leaf PASID
> tables, for a total of 64MB per endpoint if there is too much
> fragmentation in the IOASID space.
> 
OK. Not an option here :(

> > 2. Let VFIO ensure free always happen after unbind. Then there is no
> > need to do cleanup. But that requires VFIO to keep track of all the
> > PASIDs within each VM. When the VM terminates, VFIO is responsible
> > for the clean up. That was Yi's original proposal. I also tried to
> > provide an IOASID set iterator for VFIO to free the IOASIDs within
> > each VM/set, but the private data belongs to IOMMU driver.  
> 
> Not really my place to comment on this, but I find it nicer to use the
> same gpasid_unbind() path when VFIO frees a PASID as when the guest
> explicitly unbinds before freeing. 
> 

Might be the only option now.

Thanks,

Jacob

> Thanks,
> Jean
> 

[Jacob Pan]
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

      reply	other threads:[~2020-04-02 16:03 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-25 17:55 [PATCH 00/10] IOASID extensions for guest SVA Jacob Pan
2020-03-25 17:55 ` [PATCH 01/10] iommu/ioasid: Introduce system-wide capacity Jacob Pan
2020-03-27  8:07   ` Tian, Kevin
2020-03-27 16:08     ` Jacob Pan
2020-04-01 13:45   ` Jean-Philippe Brucker
2020-04-01 22:50     ` Jacob Pan
2020-03-25 17:55 ` [PATCH 02/10] iommu/vt-d: Set IOASID capacity when SVM is enabled Jacob Pan
2020-03-27  8:08   ` Tian, Kevin
2020-03-25 17:55 ` [PATCH 03/10] iommu/ioasid: Introduce per set allocation APIs Jacob Pan
2020-03-26  2:12   ` Lu Baolu
2020-03-26 21:30     ` Jacob Pan
2020-03-27  8:38   ` Tian, Kevin
2020-03-27 16:59     ` Jacob Pan
2020-03-28  6:32       ` Tian, Kevin
2020-04-01 13:47   ` Jean-Philippe Brucker
2020-04-06 20:02     ` Jacob Pan
2020-04-07 11:01       ` Jean-Philippe Brucker
2020-04-21 21:51         ` Jacob Pan
2020-03-25 17:55 ` [PATCH 04/10] iommu/ioasid: Rename ioasid_set_data to avoid confusion with ioasid_set Jacob Pan
2020-03-27  9:35   ` Tian, Kevin
2020-03-25 17:55 ` [PATCH 05/10] iommu/ioasid: Create an IOASID set for host SVA use Jacob Pan
2020-03-27  9:41   ` Tian, Kevin
2020-03-27 17:28     ` Jacob Pan
2020-03-28  6:33       ` Tian, Kevin
2020-04-01 13:53   ` Jean-Philippe Brucker
2020-04-06 15:33     ` Jacob Pan
2020-04-07 11:01       ` Jean-Philippe Brucker
2020-04-13 22:06         ` Jacob Pan
2020-04-15 15:10           ` Jean-Philippe Brucker
2020-03-25 17:55 ` [PATCH 06/10] iommu/ioasid: Convert to set aware allocations Jacob Pan
2020-03-27  9:54   ` Tian, Kevin
2020-03-27 17:41     ` Jacob Pan
2020-03-28  6:40       ` Tian, Kevin
2020-04-06 20:07         ` Jacob Pan
2020-04-01 13:55   ` Jean-Philippe Brucker
2020-04-01 22:45     ` Jacob Pan
2020-03-25 17:55 ` [PATCH 07/10] iommu/ioasid: Use mutex instead of spinlock Jacob Pan
2020-03-27  9:55   ` Tian, Kevin
2020-04-01 13:58   ` Jean-Philippe Brucker
2020-03-25 17:55 ` [PATCH 08/10] iommu/ioasid: Introduce notifier APIs Jacob Pan
2020-03-27 10:03   ` Tian, Kevin
2020-03-27 18:36     ` Jacob Pan
2020-03-28  6:43       ` Tian, Kevin
2020-03-31 15:13         ` Jacob Pan
2020-04-01 14:00   ` Jean-Philippe Brucker
2020-04-10 15:43     ` Jacob Pan
2020-03-25 17:55 ` [PATCH 09/10] iommu/ioasid: Support ioasid_set quota adjustment Jacob Pan
2020-03-27 10:09   ` Tian, Kevin
2020-03-27 23:30     ` Jacob Pan
2020-03-28  6:44       ` Tian, Kevin
2020-03-25 17:55 ` [PATCH 10/10] iommu/vt-d: Register PASID notifier for status change Jacob Pan
2020-03-27 10:22   ` Tian, Kevin
2020-03-27 23:47     ` Jacob Pan
2020-04-01 14:03 ` [PATCH 00/10] IOASID extensions for guest SVA Jean-Philippe Brucker
2020-04-01 23:38   ` Jacob Pan
2020-04-02 12:26     ` Jean-Philippe Brucker
2020-04-02 16:09       ` Jacob Pan [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200402090909.4475d430@jacob-builder \
    --to=jacob.jun.pan@linux.intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=dwmw2@infradead.org \
    --cc=hao.wu@intel.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jean-philippe@linaro.com \
    --cc=jean-philippe@linaro.org \
    --cc=jic23@kernel.org \
    --cc=kevin.tian@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).