public inbox for linux-kernel@vger.kernel.org
From: Jason Gunthorpe <jgg@nvidia.com>
To: Lu Baolu <baolu.lu@linux.intel.com>
Cc: iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
	Christoph Hellwig <hch@infradead.org>,
	Kevin Tian <kevin.tian@intel.com>, Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 4/6] iommu: Move lock from iommu_change_dev_def_domain() to its caller
Date: Thu, 9 Mar 2023 21:16:51 -0400
Message-ID: <ZAqFA8KERGh2Emob@nvidia.com>
In-Reply-To: <20230306025804.13912-5-baolu.lu@linux.intel.com>

On Mon, Mar 06, 2023 at 10:58:02AM +0800, Lu Baolu wrote:
> The intention is to make it possible to put group ownership check and
> default domain change in a same critical region protected by the group's
> mutex lock. No intentional functional change.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/iommu.c | 29 ++++++++++++++---------------
>  1 file changed, 14 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 0bcd9625090d..f8f400548a10 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2945,7 +2945,7 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
>  	int ret, dev_def_dom;
>  	struct device *dev;
>  
> -	mutex_lock(&group->mutex);
> +	lockdep_assert_held(&group->mutex);
>  
>  	if (group->default_domain != group->domain) {
>  		dev_err_ratelimited(prev_dev, "Group not assigned to default domain\n");
> @@ -3033,28 +3033,15 @@ static int iommu_change_dev_def_domain(struct iommu_group *group,
>  		goto free_new_domain;
>  
>  	group->domain = group->default_domain;
> -
> -	/*
> -	 * Release the mutex here because ops->probe_finalize() call-back of
> -	 * some vendor IOMMU drivers calls arm_iommu_attach_device() which
> -	 * in-turn might call back into IOMMU core code, where it tries to take
> -	 * group->mutex, resulting in a deadlock.
> -	 */
> -	mutex_unlock(&group->mutex);
> -
> -	/* Make sure dma_ops is appropriatley set */
> -	iommu_group_do_probe_finalize(dev, group->default_domain);
>  	iommu_domain_free(prev_dom);
> +
>  	return 0;
>  
>  free_new_domain:
>  	iommu_domain_free(group->default_domain);
>  	group->default_domain = prev_dom;
>  	group->domain = prev_dom;
> -
>  out:
> -	mutex_unlock(&group->mutex);
> -
>  	return ret;
>  }
>  
> @@ -3142,7 +3129,19 @@ static ssize_t iommu_group_store_type(struct iommu_group *group,
>  		goto out;
>  	}
>  
> +	mutex_lock(&group->mutex);
>  	ret = iommu_change_dev_def_domain(group, dev, req_type);
> +	/*
> +	 * Release the mutex here because ops->probe_finalize() call-back of
> +	 * some vendor IOMMU drivers calls arm_iommu_attach_device() which
> +	 * in-turn might call back into IOMMU core code, where it tries to take
> +	 * group->mutex, resulting in a deadlock.
> +	 */
> +	mutex_unlock(&group->mutex);
> +
> +	/* Make sure dma_ops is appropriatley set */
> +	if (!ret)
> +		iommu_group_do_probe_finalize(dev, group->default_domain);

Everything about iommu_group_do_probe_finalize() is still unsafe
against races with release. :(

Pre-existing bug, so maybe leave it alone for this series :\

To fix it I'd suggest splitting probe_finalize ops into probe_finalize
and probe_finalized_unlocked.

Only have the "bad" deadlocky drivers use the unlocked variant and fix
intel and virtio to use the safe variant.

We can decide which variant to use under the mutex and then at least
"good" drivers don't have this race.
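
Roughly, as a userspace model of the idea and not kernel code: a pthread
mutex stands in for group->mutex, and every struct, function and driver
name below (model_ops, model_group, model_finalize, good_ops, bad_ops) is
made up for illustration. Only the two op names come from the suggestion
above. The variant is chosen while the mutex is held; only the unlocked
variant is called after dropping it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct model_group;

/* Hypothetical split op table; this is not the kernel's struct iommu_ops. */
struct model_ops {
	/* Safe to call with the group mutex held. */
	void (*probe_finalize)(struct model_group *group);
	/* May take the group mutex itself, so must be called unlocked. */
	void (*probe_finalized_unlocked)(struct model_group *group);
};

struct model_group {
	pthread_mutex_t mutex;
	const struct model_ops *ops;
	int finalized;      /* number of finalize callbacks that ran */
	int saw_lock_held;  /* set by good_finalize() if the mutex was held */
};

static void model_group_init(struct model_group *group,
			     const struct model_ops *ops)
{
	pthread_mutex_init(&group->mutex, NULL);
	group->ops = ops;
	group->finalized = 0;
	group->saw_lock_held = 0;
}

/* "Good" driver: never re-enters the core, so it can run under the lock. */
static void good_finalize(struct model_group *group)
{
	int rc = pthread_mutex_trylock(&group->mutex);

	if (rc == 0)	/* lock was free: undo the trylock, record "not held" */
		pthread_mutex_unlock(&group->mutex);
	group->saw_lock_held = (rc == EBUSY);
	group->finalized++;
}

/* "Bad" driver: re-takes the mutex, as a call back into the core would. */
static void bad_finalize(struct model_group *group)
{
	pthread_mutex_lock(&group->mutex);
	group->finalized++;
	pthread_mutex_unlock(&group->mutex);
}

static const struct model_ops good_ops = { .probe_finalize = good_finalize };
static const struct model_ops bad_ops = {
	.probe_finalized_unlocked = bad_finalize,
};

/* Returns 1 if the callback ran under the mutex, 0 if it ran unlocked. */
static int model_finalize(struct model_group *group)
{
	void (*unlocked)(struct model_group *) = NULL;

	pthread_mutex_lock(&group->mutex);
	/* Decide which variant to use while the mutex is held. */
	if (group->ops->probe_finalize)
		group->ops->probe_finalize(group);
	else
		unlocked = group->ops->probe_finalized_unlocked;
	pthread_mutex_unlock(&group->mutex);

	/* Only the deadlock-prone drivers still run outside the lock. */
	if (unlocked)
		unlocked(group);
	return unlocked == NULL;
}
```

In this model a group using good_ops is finalized entirely inside the
critical region, while a group using bad_ops keeps today's unlocked
behaviour and with it the race against release.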

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason
