* dma_ops_domain_alloc causes kernel 4.1.0-next-20150626+ panic
@ 2015-06-29 17:44 George Wang
[not found] ` <CAPBX1x+zagVVYebbXU0M7VkEaDkzvqBGnkt6PW_N42fRQRQ9Gg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: George Wang @ 2015-06-29 17:44 UTC (permalink / raw)
To: joro-zLv9SwRftAIdnm+yROfE0A,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
Hi,
I am trying to run some tests on kernel 4.1.0-next-20150626+, but it
panics in amd_iommu_attach_dev. After some digging in amd_iommu.c,
I found the suspect code:
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index d3e5e9a..4f6da17 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1882,6 +1882,7 @@ static struct dma_ops_domain *dma_ops_domain_alloc(void)
return NULL;
spin_lock_init(&dma_dom->domain.lock);
+ mutex_init(&dma_dom->domain.api_lock);
Once I initialize the api_lock, I can get past this point (and on to another problem).
Thanks,
George
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: dma_ops_domain_alloc causes kernel 4.1.0-next-20150626+ panic
[not found] ` <CAPBX1x+zagVVYebbXU0M7VkEaDkzvqBGnkt6PW_N42fRQRQ9Gg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-06-29 19:34 ` Joerg Roedel
[not found] ` <20150629193402.GM18569-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Joerg Roedel @ 2015-06-29 19:34 UTC (permalink / raw)
To: George Wang; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
On Tue, Jun 30, 2015 at 01:44:34AM +0800, George Wang wrote:
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index d3e5e9a..4f6da17 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -1882,6 +1882,7 @@ static struct dma_ops_domain *dma_ops_domain_alloc(void)
> return NULL;
>
> spin_lock_init(&dma_dom->domain.lock);
> + mutex_init(&dma_dom->domain.api_lock);
>
> Once I initialize the api_lock, I can get past this point (and on to another problem).
How do you trigger this? The DMA-API domains are not used via the
IOMMU-API yet, so initializing the api_lock for them shouldn't matter.
Joerg
* Re: dma_ops_domain_alloc causes kernel 4.1.0-next-20150626+ panic
[not found] ` <20150629193402.GM18569-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2015-06-30 3:55 ` George Wang
[not found] ` <CAPBX1xLA_GDeoi9wq-9A7njwzL3NBqJYYT_PqhwEzBAg=9=8kA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: George Wang @ 2015-06-30 3:55 UTC (permalink / raw)
To: Joerg Roedel; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
On Tue, Jun 30, 2015 at 3:34 AM, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org> wrote:
> On Tue, Jun 30, 2015 at 01:44:34AM +0800, George Wang wrote:
>> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
>> index d3e5e9a..4f6da17 100644
>> --- a/drivers/iommu/amd_iommu.c
>> +++ b/drivers/iommu/amd_iommu.c
>> @@ -1882,6 +1882,7 @@ static struct dma_ops_domain *dma_ops_domain_alloc(void)
>> return NULL;
>>
>> spin_lock_init(&dma_dom->domain.lock);
>> + mutex_init(&dma_dom->domain.api_lock);
>>
>> Once I initialize the api_lock, I can get past this point (and on to another problem).
>
> How do you trigger this? The DMA-API domains are not used via the
> IOMMU-API yet, so initializing the api_lock for them shouldn't matter.
>
>
> Joerg
>
I don't know what triggers it; I just built the kernel, installed it, and
it panicked. The call trace is below:
[ 11.687392] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 11.690196] IP: [<ffffffff813326ef>] __list_add+0x1f/0xc0
[ 11.692026] PGD 0
[ 11.692794] Oops: 0000 [#1] SMP
[ 11.693939] Modules linked in:
[ 11.694997] CPU: 11 PID: 1 Comm: swapper/0 Not tainted 4.1.0-next-20150626+ #6
[ 11.697415] Hardware name: HP ProLiant DL385p Gen8, BIOS A28 02/06/2014
[ 11.699683] task: ffff880835888000 ti: ffff880236918000 task.ti: ffff880236918000
[ 11.702281] RIP: 0010:[<ffffffff813326ef>]  [<ffffffff813326ef>] __list_add+0x1f/0xc0
[ 11.704935] RSP: 0018:ffff88023691b968  EFLAGS: 00010246
[ 11.706702] RAX: 00000000ffffffff RBX: ffff88023691b998 RCX: ffff880835888000
[ 11.709199] RDX: ffff880634f58468 RSI: 0000000000000000 RDI: ffff88023691b998
[ 11.711597] RBP: ffff88023691b988 R08: 0000000000000000 R09: ffff88023691bab8
[ 11.714022] R10: 00000000000f0000 R11: ffff880000000000 R12: ffff880634f58468
[ 11.716415] R13: 0000000000000000 R14: 00000000ffffffff R15: ffff880634f58468
[ 11.718909] FS:  0000000000000000(0000) GS:ffff880637d40000(0000) knlGS:0000000000000000
[ 11.721575] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 11.723541] CR2: 0000000000000000 CR3: 00000000019d4000 CR4: 00000000000406e0
[ 11.725960] Stack:
[ 11.726632]  0000000000001000 ffff880634f58460 ffff880634f58464 ffff880835888000
[ 11.729440]  ffff88023691b9e8 ffffffff8168fde1 ffff88023691b9f8 ffffffff81318798
[ 11.732131]  000000000000a1ff 000000008d3d0bb4 00002fdd3691b9e8 ffff880634f58460
[ 11.734774] Call Trace:
[ 11.735635]  [<ffffffff8168fde1>] __mutex_lock_slowpath+0x91/0x120
[ 11.737676]  [<ffffffff81318798>] ? ida_simple_get+0x98/0x100
[ 11.739682]  [<ffffffff8168fe93>] mutex_lock+0x23/0x37
[ 11.741407]  [<ffffffff8143513a>] amd_iommu_map+0x4a/0x1b0
[ 11.743293]  [<ffffffff8143081a>] iommu_map+0xfa/0x200
[ 11.745025]  [<ffffffff81431587>] iommu_group_add_device+0x327/0x390
[ 11.747184]  [<ffffffff814316fb>] iommu_group_get_for_dev+0x10b/0x1f0
[ 11.849564]  [<ffffffff81436ac6>] amd_iommu_add_device+0x1b6/0x580
[ 11.851645]  [<ffffffff8168d891>] ? __schedule+0xe1/0x890
[ 11.853508]  [<ffffffff814304db>] add_iommu_group+0x2b/0x50
[ 11.857765]  [<ffffffff8144b40c>] bus_for_each_dev+0x6c/0xc0
[ 11.859752]  [<ffffffff814311b4>] ? bus_set_iommu+0x54/0x100
[ 11.861698]  [<ffffffff8143121e>] bus_set_iommu+0xbe/0x100
[ 11.863485]  [<ffffffff81b77e46>] amd_iommu_init_api+0x17/0x19
[ 11.865473]  [<ffffffff81b7993c>] state_next+0x57e/0x715
[ 11.867212]  [<ffffffff81b37eec>] ? memblock_find_dma_reserve+0x177/0x177
[ 11.869577]  [<ffffffff81b79aed>] iommu_go_to_state+0x1a/0x2d
[ 11.871577]  [<ffffffff81b79b72>] amd_iommu_init+0x15/0xfc
[ 11.873425]  [<ffffffff81b37eff>] pci_iommu_init+0x13/0x3e
[ 11.875259]  [<ffffffff8100213d>] do_one_initcall+0xcd/0x1f0
[ 11.877162]  [<ffffffff81098d00>] ? parse_args+0x220/0x470
[ 11.879122]  [<ffffffff810bd548>] ? __wake_up+0x48/0x60
[ 11.880872]  [<ffffffff81b2e349>] kernel_init_freeable+0x1a5/0x249
[ 12.282919]  [<ffffffff81b2d9dd>] ? initcall_blacklist+0xb6/0xb6
[ 12.285018]  [<ffffffff8167b9a0>] ? rest_init+0x80/0x80
[ 12.286803]  [<ffffffff8167b9ae>] kernel_init+0xe/0xe0
[ 12.288621]  [<ffffffff81691f5f>] ret_from_fork+0x3f/0x70
[ 12.290761]  [<ffffffff8167b9a0>] ? rest_init+0x80/0x80
[ 12.292516] Code: ff ff ff e9 31 ff ff ff 0f 1f 40 00 55 48 89 e5 41 55 49 89 f5 41 54 49 89 d4 53 48 89 fb 48 83 ec 08 4c 8b 42 08 49 39 f0 75 2e <4d> 8b 45 00 4d 39 c4 75 6c 4c 39 e3 74 42 4c 39 eb 74 3d 49 89
[ 12.301447] RIP  [<ffffffff813326ef>] __list_add+0x1f/0xc0
[ 12.303331] RSP <ffff88023691b968>
[ 12.304516] CR2: 0000000000000000
[ 12.305657] ---[ end trace 20a8e3deaab91b75 ]---
I think the call path is
add_iommu_group->amd_iommu_add_device->init_iommu_group->iommu_group_get_for_dev->iommu_group_add_device->iommu_group_create_direct_mappings->iommu_map->amd_iommu_map->mutex_lock(&domain->api_lock),
but the domain is allocated via amd_iommu_domain_alloc->dma_ops_domain_alloc,
which does not initialize the api_lock of the protection_domain, hence the
panic.
Thanks,
Xu
* Re: dma_ops_domain_alloc causes kernel 4.1.0-next-20150626+ panic
[not found] ` <CAPBX1xLA_GDeoi9wq-9A7njwzL3NBqJYYT_PqhwEzBAg=9=8kA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-06-30 7:44 ` Joerg Roedel
[not found] ` <20150630074454.GO18569-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Joerg Roedel @ 2015-06-30 7:44 UTC (permalink / raw)
To: George Wang; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
On Tue, Jun 30, 2015 at 11:55:24AM +0800, George Wang wrote:
> [ 11.734774] Call Trace:
> [ 11.735635] [<ffffffff8168fde1>] __mutex_lock_slowpath+0x91/0x120
> [ 11.737676] [<ffffffff81318798>] ? ida_simple_get+0x98/0x100
> [ 11.739682] [<ffffffff8168fe93>] mutex_lock+0x23/0x37
> [ 11.741407] [<ffffffff8143513a>] amd_iommu_map+0x4a/0x1b0
> [ 11.743293] [<ffffffff8143081a>] iommu_map+0xfa/0x200
> [ 11.745025] [<ffffffff81431587>] iommu_group_add_device+0x327/0x390
> [ 11.747184]  [<ffffffff814316fb>] iommu_group_get_for_dev+0x10b/0x1f0
> [ 11.849564] [<ffffffff81436ac6>] amd_iommu_add_device+0x1b6/0x580
Ah, your AMD IOMMU system probably has unity mappings defined in its
ACPI table. I don't have a system with unity mappings defined, so I
couldn't test this. What system are you running this test on (system or
mainboard vendor and type)?
Anyway, here is a patch that should fix this issue for you, can you
please test it?
From a83e7544c3bc1bd843478e0809cc9781e844fd08 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
Date: Tue, 30 Jun 2015 08:56:11 +0200
Subject: [PATCH] iommu/amd: Introduce protection_domain_init() function
This function contains the common parts between the
initialization of dma_ops_domains and usual protection
domains. This also fixes a long-standing bug which was
uncovered by recent changes, in which the api_lock was not
initialized for dma_ops_domains.
Reported-by: George Wang <xuw2015-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
---
drivers/iommu/amd_iommu.c | 26 ++++++++++++++++----------
1 file changed, 16 insertions(+), 10 deletions(-)
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index c5677ed..cedbf00 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -116,6 +116,7 @@ struct kmem_cache *amd_iommu_irq_cache;
static void update_domain(struct protection_domain *domain);
static int alloc_passthrough_domain(void);
+static int protection_domain_init(struct protection_domain *domain);
/****************************************************************************
*
@@ -1880,12 +1881,9 @@ static struct dma_ops_domain *dma_ops_domain_alloc(void)
if (!dma_dom)
return NULL;
- spin_lock_init(&dma_dom->domain.lock);
-
- dma_dom->domain.id = domain_id_alloc();
- if (dma_dom->domain.id == 0)
+ if (protection_domain_init(&dma_dom->domain))
goto free_dma_dom;
- INIT_LIST_HEAD(&dma_dom->domain.dev_list);
+
dma_dom->domain.mode = PAGE_MODE_2_LEVEL;
dma_dom->domain.pt_root = (void *)get_zeroed_page(GFP_KERNEL);
dma_dom->domain.flags = PD_DMA_OPS_MASK;
@@ -2915,6 +2913,18 @@ static void protection_domain_free(struct protection_domain *domain)
kfree(domain);
}
+static int protection_domain_init(struct protection_domain *domain)
+{
+ spin_lock_init(&domain->lock);
+ mutex_init(&domain->api_lock);
+ domain->id = domain_id_alloc();
+ if (!domain->id)
+ return -ENOMEM;
+ INIT_LIST_HEAD(&domain->dev_list);
+
+ return 0;
+}
+
static struct protection_domain *protection_domain_alloc(void)
{
struct protection_domain *domain;
@@ -2923,12 +2933,8 @@ static struct protection_domain *protection_domain_alloc(void)
if (!domain)
return NULL;
- spin_lock_init(&domain->lock);
- mutex_init(&domain->api_lock);
- domain->id = domain_id_alloc();
- if (!domain->id)
+ if (protection_domain_init(domain))
goto out_err;
- INIT_LIST_HEAD(&domain->dev_list);
add_domain_to_list(domain);
--
1.8.4.5
* Re: dma_ops_domain_alloc causes kernel 4.1.0-next-20150626+ panic
[not found] ` <20150630074454.GO18569-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
@ 2015-07-01 5:20 ` George Wang
[not found] ` <CAPBX1x+OK8EMwDsripY71jF44d73Qv0jBxyM+jJgPMzNVPTyaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: George Wang @ 2015-07-01 5:20 UTC (permalink / raw)
To: Joerg Roedel; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
On Tue, Jun 30, 2015 at 3:44 PM, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org> wrote:
> On Tue, Jun 30, 2015 at 11:55:24AM +0800, George Wang wrote:
>> [ 11.734774] Call Trace:
>> [ 11.735635] [<ffffffff8168fde1>] __mutex_lock_slowpath+0x91/0x120
>> [ 11.737676] [<ffffffff81318798>] ? ida_simple_get+0x98/0x100
>> [ 11.739682] [<ffffffff8168fe93>] mutex_lock+0x23/0x37
>> [ 11.741407] [<ffffffff8143513a>] amd_iommu_map+0x4a/0x1b0
>> [ 11.743293] [<ffffffff8143081a>] iommu_map+0xfa/0x200
>> [ 11.745025] [<ffffffff81431587>] iommu_group_add_device+0x327/0x390
>> [ 11.747184]  [<ffffffff814316fb>] iommu_group_get_for_dev+0x10b/0x1f0
>> [ 11.849564] [<ffffffff81436ac6>] amd_iommu_add_device+0x1b6/0x580
>
> Ah, your AMD IOMMU system probably has unity mappings defined in its
> ACPI table. I don't have a system with unity mappings defined, so I
> couldn't test this. What system are you running this test on (system or
> mainboard vendor and type)?
I'm not familiar with unity mappings; I will read up on them.
I ran lspci and dmidecode to get some information about my machine. I am
not sure whether it is useful to you.
If you need more information, please let me know.
[root@hp-dl385pg8-09 linux-next]# lspci
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890
Northbridge only dual slot (2x16) PCI-e GFX Hydra part (rev 02)
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD/ATI] RD990 I/O Memory
Management Unit (IOMMU)
00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI
to PCI bridge (PCI express gpp port B)
00:0a.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890 PCI
to PCI bridge (external gfx1 port A)
00:0c.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] RD890S PCI
Express bridge for GPP2 port 1
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI]
SB7x0/SB8x0/SB9x0 SATA Controller [IDE mode]
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI]
SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
--snip--
[root@hp-dl385pg8-09 linux-next]# dmidecode|grep -A16 "System Information"
System Information
Manufacturer: HP
Product Name: ProLiant DL385p Gen8
Version: Not Specified
Serial Number: 6CU428FNLL
UUID: 32333536-3330-4336-5534-3238464E4C4C
Wake-up Type: Power Switch
SKU Number: 653203-B21
Family: ProLiant
>
> Anyway, here is a patch that should fix this issue for you, can you
> please test it?
Thanks for your work. I applied this patch, and it works well for me.
Thanks,
George
* Re: dma_ops_domain_alloc causes kernel 4.1.0-next-20150626+ panic
[not found] ` <CAPBX1x+OK8EMwDsripY71jF44d73Qv0jBxyM+jJgPMzNVPTyaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-07-01 6:44 ` Joerg Roedel
0 siblings, 0 replies; 6+ messages in thread
From: Joerg Roedel @ 2015-07-01 6:44 UTC (permalink / raw)
To: George Wang; +Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
Hi George,
On Wed, Jul 01, 2015 at 01:20:59PM +0800, George Wang wrote:
> [root@hp-dl385pg8-09 linux-next]# dmidecode|grep -A16 "System Information"
> System Information
> Manufacturer: HP
> Product Name: ProLiant DL385p Gen8
> Version: Not Specified
> Serial Number: 6CU428FNLL
> UUID: 32333536-3330-4336-5534-3238464E4C4C
> Wake-up Type: Power Switch
> SKU Number: 653203-B21
> Family: ProLiant
Thanks for that info, so it's HP hardware that has the unity mappings.
> Thanks for your work. I applied this patch, and it works well for me.
Thanks for testing, I'll send the fix upstream ASAP.
Joerg
end of thread, other threads:[~2015-07-01 6:44 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-29 17:44 dma_ops_domain_alloc causes kernel 4.1.0-next-20150626+ panic George Wang
[not found] ` <CAPBX1x+zagVVYebbXU0M7VkEaDkzvqBGnkt6PW_N42fRQRQ9Gg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-06-29 19:34 ` Joerg Roedel
[not found] ` <20150629193402.GM18569-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2015-06-30 3:55 ` George Wang
[not found] ` <CAPBX1xLA_GDeoi9wq-9A7njwzL3NBqJYYT_PqhwEzBAg=9=8kA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-06-30 7:44 ` Joerg Roedel
[not found] ` <20150630074454.GO18569-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>
2015-07-01 5:20 ` George Wang
[not found] ` <CAPBX1x+OK8EMwDsripY71jF44d73Qv0jBxyM+jJgPMzNVPTyaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-07-01 6:44 ` Joerg Roedel