public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: Joerg Roedel <joro@8bytes.org>, Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
	Gerd Bayer <gbayer@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Joerg Roedel <jroedel@suse.de>,
	iommu@lists.linux.dev, linux-kernel@vger.kernel.org,
	linux-s390@vger.kernel.org
Subject: Re: [PATCH] iommu/s390: Implement blocking domain
Date: Thu, 8 Aug 2024 10:44:51 -0300	[thread overview]
Message-ID: <20240808134451.GC1985367@ziepe.ca> (raw)
In-Reply-To: <20240806-blocking_domain-v1-1-8abc18e37e52@linux.ibm.com>

On Tue, Aug 06, 2024 at 03:45:15PM +0200, Niklas Schnelle wrote:
> This fixes a crash when surprise hot-unplugging a PCI device. This crash
> happens because during hot-unplug __iommu_group_set_domain_nofail()
> attaching the default domain fails when the platform no longer
> recognizes the device as it has already been removed and we end up with
> a NULL domain pointer and UAF.

Huh?

A device can't be removed before telling the iommu subsystem about
it. How did the domain become NULL asynchronously while the iommu side
thought it was still alive?? That seems like the main bug here..

> Note: I somewhat suspect this to be related to the following discussion
> or at least we have seen the same backtraces in reports that we suspect
> to be caused by the issue fixed with this patch.

No, it shouldn't be. That bug is because the netstack is continuing to
use a struct device with the DMA API after the device driver has been
removed. That is just illegal.

> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index ed6c5cb60c5a..91b3b23bf55c 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -119,8 +119,11 @@ static int __iommu_group_set_domain(struct iommu_group *group,
>  static void __iommu_group_set_domain_nofail(struct iommu_group *group,
>  					    struct iommu_domain *new_domain)
>  {
> -	WARN_ON(__iommu_group_set_domain_internal(
> -		group, new_domain, IOMMU_SET_DOMAIN_MUST_SUCCEED));
> +	int ret = __iommu_group_set_domain_internal(
> +		group, new_domain, IOMMU_SET_DOMAIN_MUST_SUCCEED);
> +
> +	/* Allow attach to fail when the device is gone */
> +	WARN_ON(ret && ret != -ENODEV);
>  }

Like this doesn't really make sense. Until the iommu subystem removes
the device from the group it really cannot be "gone".

The hypervisor could fail attachment because the hypervisor has
already fenced the device. Sure, that make sense, but it is not really
that the device is gone from a Linux perspective, it is just now in a
forced-blocked state.

Like Lu says, if we need to add a new flow for devices that are now
force-blocking and cannot be changed then it will need its own error
code.

But what is the backtrace that runs into this warn on? VFIO exiting
and trying to put the device back to the DMA API?

Though I feel like more is needed here if you expect to allow the
nofail version of this to actually fail.. For instance a force-blocked
device should block driver binding through the dma_owner APIs.

> +static int blocking_domain_attach_device(struct iommu_domain *domain,
> +					 struct device *dev)
> +{
> +	struct s390_domain *s390_domain = to_s390_domain(domain);
> +	struct zpci_dev *zdev = to_zpci_dev(dev);
> +	unsigned long flags;
> +
> +	if (!zdev)
> +		return 0;
> +
> +	/* Detach sets the blocking domain */
> +	if (zdev->s390_domain)
> +		s390_iommu_detach_device(&zdev->s390_domain->domain, dev);

When I've done these conversions on other drivers it was the case that
zdev->s390_domain is never NULL. Instead probe_device immediately
starts it as a blocking_domain or fails to probe.

This way we don't ever have the ill defined notion of NULL here.

> @@ -777,6 +812,8 @@ static int __init s390_iommu_init(void)
>  subsys_initcall(s390_iommu_init);
>  
>  static const struct iommu_ops s390_iommu_ops = {
> +	.blocked_domain		= &s390_blocking_domain.domain,
> +	.release_domain		= &s390_blocking_domain.domain,

If you set release_domain then remove s390_iommu_release_device(), it
is the same code.

Jason

  parent reply	other threads:[~2024-08-08 13:44 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-06 13:45 [PATCH] iommu/s390: Implement blocking domain Niklas Schnelle
2024-08-06 14:10 ` Niklas Schnelle
2024-08-07  3:09 ` Baolu Lu
2024-08-08 13:44 ` Jason Gunthorpe [this message]
2024-08-08 17:05   ` Matthew Rosato
2024-08-08 19:41     ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240808134451.GC1985367@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=agordeev@linux.ibm.com \
    --cc=gbayer@linux.ibm.com \
    --cc=gerald.schaefer@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=iommu@lists.linux.dev \
    --cc=joro@8bytes.org \
    --cc=jroedel@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=robin.murphy@arm.com \
    --cc=schnelle@linux.ibm.com \
    --cc=svens@linux.ibm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox