* [PATCH] iommu: fix double spin_lock_irqsave on `device_domain_lock'
@ 2016-10-17 8:54 Iago Abal
From: Iago Abal @ 2016-10-17 8:54 UTC (permalink / raw)
To: Joerg Roedel; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Iago Abal
From: Iago Abal <mail-nS/Qcn0WMKPsq35pWSNszA@public.gmane.org>
The EBA code analyzer (https://github.com/models-team/eba) reported
the following double lock:
1. In function `disable_dmar_iommu' defined at 1706;
2. the lock `device_domain_lock' is first taken in line 1714:
// FIRST
spin_lock_irqsave(&device_domain_lock, flags);
3. enter the `list_for_each_entry_safe' loop at 1715;
4. call function `dmar_remove_one_dev_info' (defined at 4851) in line 1726;
5. finally, the lock is taken a second time in line 4857:
// SECOND: DOUBLE LOCK !!!
spin_lock_irqsave(&device_domain_lock, flags);
In addition, within that same loop, there is also a call to `domain_exit', which
calls `domain_remove_dev_info', which also takes `device_domain_lock'.
I fixed the potential deadlock by releasing `device_domain_lock' during the
execution of the loop body. This seems to respect the locking assumptions made
by the rest of the code: both `dmar_remove_one_dev_info' and `domain_exit' will
(directly or indirectly) take that lock, so they should not be called with it held.
Function `domain_type_is_vm_or_si' just checks `domain->flags' and there seem
to be no concurrent writes to this field.
Signed-off-by: Iago Abal <mail-nS/Qcn0WMKPsq35pWSNszA@public.gmane.org>
---
drivers/iommu/intel-iommu.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a4407ea..05796a8 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1721,12 +1721,16 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
 		if (!info->dev || !info->domain)
 			continue;

+		spin_unlock_irqrestore(&device_domain_lock, flags);
+
 		domain = info->domain;

 		dmar_remove_one_dev_info(domain, info->dev);

 		if (!domain_type_is_vm_or_si(domain))
 			domain_exit(domain);
+
+		spin_lock_irqsave(&device_domain_lock, flags);
 	}
 	spin_unlock_irqrestore(&device_domain_lock, flags);
--
1.9.1
* Re: [PATCH] iommu: fix double spin_lock_irqsave on `device_domain_lock'
From: Joerg Roedel @ 2016-11-03 20:51 UTC (permalink / raw)
To: Iago Abal; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Iago Abal
On Mon, Oct 17, 2016 at 10:54:40AM +0200, Iago Abal wrote:
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index a4407ea..05796a8 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -1721,12 +1721,16 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
> if (!info->dev || !info->domain)
> continue;
>
> + spin_unlock_irqrestore(&device_domain_lock, flags);
> +
> domain = info->domain;
>
> dmar_remove_one_dev_info(domain, info->dev);
>
> if (!domain_type_is_vm_or_si(domain))
> domain_exit(domain);
> +
> + spin_lock_irqsave(&device_domain_lock, flags);
> }
> spin_unlock_irqrestore(&device_domain_lock, flags);
No, you can't just release the lock to re-acquire it in
dmar_remove_one_dev_info(). This introduces new races, as the list you
are walking is no longer protected by the lock. The right solution is to
call a variant of dmar_remove_one_dev_info() which does not take the
lock. It turns out this function already exists, so the patch looks like
below. Can you check if this is still correct and resubmit your patch?
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a4407ea..3cadde2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1723,7 +1723,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)

 		domain = info->domain;

-		dmar_remove_one_dev_info(domain, info->dev);
+		__dmar_remove_one_dev_info(info);

 		if (!domain_type_is_vm_or_si(domain))
 			domain_exit(domain);
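
The locked/unlocked split suggested above can be illustrated outside the kernel. The following is only a userspace sketch, not the intel-iommu code: `struct toy_lock`, `remove_item()`, `remove_item_locked()` and `remove_owned_by()` are hypothetical names, and the toy lock merely asserts on a second acquisition where a real non-recursive spinlock would deadlock. The kernel marks the "caller holds the lock" variant with a `__` prefix; the sketch uses a `_locked` suffix instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a non-recursive lock: a second acquire trips an assert. */
struct toy_lock { bool held; };

static void toy_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
static void toy_release(struct toy_lock *l) { assert(l->held); l->held = false; }

struct item { struct item *next; int owner; };

static struct toy_lock list_lock;
static struct item *list_head;

/* Caller must hold list_lock. Unlinks 'victim' if it is on the list. */
static void remove_item_locked(struct item *victim)
{
	struct item **pp;

	assert(list_lock.held);
	for (pp = &list_head; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			victim->next = NULL;
			return;
		}
	}
}

/* Wrapper for callers that do not already hold list_lock. */
static void remove_item(struct item *victim)
{
	toy_acquire(&list_lock);
	remove_item_locked(victim);
	toy_release(&list_lock);
}

/* Removes every item owned by 'owner' in one locked walk; returns the count. */
static int remove_owned_by(int owner)
{
	struct item *it, *next;
	int removed = 0;

	toy_acquire(&list_lock);
	for (it = list_head; it; it = next) {
		next = it->next;	/* saved before the item is unlinked */
		if (it->owner != owner)
			continue;
		/* remove_item() here would acquire list_lock a second time. */
		remove_item_locked(it);
		removed++;
	}
	toy_release(&list_lock);
	return removed;
}
```

The list stays protected for the whole walk, which is the property lost when the lock is dropped inside the loop body.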
* Re: [PATCH] iommu: fix double spin_lock_irqsave on `device_domain_lock'
From: Iago Abal @ 2016-11-04 9:38 UTC (permalink / raw)
To: Joerg Roedel; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
On Thu, Nov 3, 2016 at 9:51 PM, Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org> wrote:
>
> No, you can't just release the lock to re-acquire it in
> dmar_remove_one_dev_info(). This introduces new races, as the list you
> are walking is no longer protected by the lock. The right solution is to
> call a variant of dmar_remove_one_dev_info() which does not take the
> lock. It turns out this function already exists, so the patch looks like
> below. Can you check if this is still correct and resubmit your patch?
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index a4407ea..3cadde2 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -1723,7 +1723,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
>
> domain = info->domain;
>
> - dmar_remove_one_dev_info(domain, info->dev);
> + __dmar_remove_one_dev_info(info);
>
> if (!domain_type_is_vm_or_si(domain))
> domain_exit(domain);
That patch was actually my first attempt at fixing the problem, but I
ran the tool and I found a second possibility of deadlock:
`domain_exit' calls `domain_remove_dev_info', which also takes
`device_domain_lock'.
Alternatively I could add another `__domain_exit' function that
doesn't take the lock.
Would that be fine?
-- iago
* [PATCH] iommu/vt-d: Fix dead-locks in disable_dmar_iommu() path
From: Joerg Roedel @ 2016-11-08 14:15 UTC (permalink / raw)
To: Iago Abal; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
On Fri, Nov 04, 2016 at 10:38:29AM +0100, Iago Abal wrote:
> That patch was actually my first attempt at fixing the problem, but I
> ran the tool and I found a second possibility of deadlock:
> `domain_exit' calls to `domain_remove_dev_info', which also spin_locks
> on `device_domain_lock'.
>
> Alternatively I could add another `__domain_exit' function that
> doesn't take the lock.
That is actually not easy to do. I'd rather go with the simpler
solution of dropping the lock around the domain_exit() invocation and
then restarting the list walk afterwards, as in the patch below:
From bea64033dd7b5fb6296eda8266acab6364ce1554 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
Date: Tue, 8 Nov 2016 15:08:26 +0100
Subject: [PATCH] iommu/vt-d: Fix dead-locks in disable_dmar_iommu() path
It turns out that the disable_dmar_iommu() code-path tried
to get the device_domain_lock recursively, which will
dead-lock when this code runs on dmar removal. Fix both
code-paths that could lead to the dead-lock.
Fixes: 55d940430ab9 ('iommu/vt-d: Get rid of domain->iommu_lock')
Reported-by: Iago Abal <iari-YidNj35/HaM@public.gmane.org>
Signed-off-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
---
drivers/iommu/intel-iommu.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a4407ea..3965e73 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1711,6 +1711,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
 	if (!iommu->domains || !iommu->domain_ids)
 		return;

+again:
 	spin_lock_irqsave(&device_domain_lock, flags);
 	list_for_each_entry_safe(info, tmp, &device_domain_list, global) {
 		struct dmar_domain *domain;
@@ -1723,10 +1724,19 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)

 		domain = info->domain;

-		dmar_remove_one_dev_info(domain, info->dev);
+		__dmar_remove_one_dev_info(info);

-		if (!domain_type_is_vm_or_si(domain))
+		if (!domain_type_is_vm_or_si(domain)) {
+			/*
+			 * The domain_exit() function can't be called under
+			 * device_domain_lock, as it takes this lock itself.
+			 * So release the lock here and re-run the loop
+			 * afterwards.
+			 */
+			spin_unlock_irqrestore(&device_domain_lock, flags);
 			domain_exit(domain);
+			goto again;
+		}
 	}
 	spin_unlock_irqrestore(&device_domain_lock, flags);
--
2.6.6
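
The "drop the lock, then restart the walk" approach in the patch above can be sketched in userspace as follows. This is an illustration under assumptions, not the driver code: `struct toy_lock`, `struct entry`, `teardown()`, `unlink_entry_locked()` and `drain_group()` are all hypothetical, and the toy lock asserts on re-acquisition instead of deadlocking. The point it tries to show is that the entry is unlinked before the lock is dropped, so every restart has made progress and the loop terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a non-recursive lock: a second acquire trips an assert. */
struct toy_lock { bool held; };

static void toy_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
static void toy_release(struct toy_lock *l) { assert(l->held); l->held = false; }

struct entry { struct entry *next; int group; bool needs_teardown; };

static struct toy_lock table_lock;
static struct entry *table_head;
static int teardowns;

/* Takes table_lock itself, so it must not be called with the lock held. */
static void teardown(struct entry *e)
{
	toy_acquire(&table_lock);
	e->needs_teardown = false;
	teardowns++;
	toy_release(&table_lock);
}

/* Caller must hold table_lock. */
static void unlink_entry_locked(struct entry *e)
{
	struct entry **pp;

	assert(table_lock.held);
	for (pp = &table_head; *pp; pp = &(*pp)->next) {
		if (*pp == e) {
			*pp = e->next;
			e->next = NULL;
			return;
		}
	}
}

/* Removes all entries of 'group'; returns how often the walk was restarted. */
static int drain_group(int group)
{
	struct entry *e, *next;
	int restarts = 0;

again:
	toy_acquire(&table_lock);
	for (e = table_head; e; e = next) {
		next = e->next;
		if (e->group != group)
			continue;

		unlink_entry_locked(e);	/* progress is made before any restart */

		if (e->needs_teardown) {
			/* The saved 'next' cannot be trusted once the lock is dropped. */
			toy_release(&table_lock);
			teardown(e);
			restarts++;
			goto again;
		}
	}
	toy_release(&table_lock);
	return restarts;
}
```

The cost of this design is that the walk may start over once per entry that needs teardown, which is acceptable on a rarely executed removal path.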