From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754027AbbIWKWe (ORCPT ); Wed, 23 Sep 2015 06:22:34 -0400
Received: from h1446028.stratoserver.net ([85.214.92.142]:59635 "EHLO mail.ahsoftware.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752234AbbIWKWd (ORCPT ); Wed, 23 Sep 2015 06:22:33 -0400
From: Alexander Holler
Subject: AMD-IOMMU and problem with __init(data)?
To: linux-kernel@vger.kernel.org
Message-ID: <56027D60.3070903@ahsoftware.de>
Date: Wed, 23 Sep 2015 12:22:24 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

It looks like I have a problem with the AMD IOMMU and its handling of
__init or __initdata.

I'm working on something which stores some structs right after
INIT_CALLS but before CON_INITCALL (see
include/asm-generic/vmlinux.lds.h). These structures are accessed right
after the initcalls of level fs (5, see init/main.c) have been called.

Here is how that structure is defined:

--
struct _annotated_initcall {
	initcall_t initcall;
	unsigned driver_id;
	unsigned *dependencies;
	struct device_driver *driver;
};
extern struct _annotated_initcall
	__annotated_initcall_start[], __annotated_initcall_end[];
--

The code which uses it is:

--
struct _annotated_initcall *ac;

ac = __annotated_initcall_start;
for (; ac < __annotated_initcall_end; ++ac)
	pr_info("AHO: ac %p ic %p ID %u deps %p drv %p\n",
		ac, ac->initcall, ac->driver_id,
		ac->dependencies, ac->driver);
--

With CONFIG_AMD_IOMMU enabled, the following happens:

--
(...)
[ 1.240362] io scheduler noop registered
[ 1.395764] iommu: Adding device 0000:00:00.0 to group 0
[ 1.401478] iommu: Adding device 0000:00:01.0 to group 1
[ 1.406828] iommu: Adding device 0000:00:01.1 to group 1
[ 1.412487] iommu: Adding device 0000:00:10.0 to group 2
[ 1.417839] iommu: Adding device 0000:00:10.1 to group 2
[ 1.423501] iommu: Adding device 0000:00:11.0 to group 3
[ 1.429157] iommu: Adding device 0000:00:12.0 to group 4
[ 1.434510] iommu: Adding device 0000:00:12.2 to group 4
[ 1.440166] iommu: Adding device 0000:00:13.0 to group 5
[ 1.445520] iommu: Adding device 0000:00:13.2 to group 5
[ 1.451196] iommu: Adding device 0000:00:14.0 to group 6
[ 1.456551] iommu: Adding device 0000:00:14.2 to group 6
[ 1.461904] iommu: Adding device 0000:00:14.3 to group 6
[ 1.467568] iommu: Adding device 0000:00:14.4 to group 7
[ 1.473226] iommu: Adding device 0000:00:15.0 to group 8
[ 1.480807] iommu: Adding device 0000:00:15.1 to group 8
[ 1.486470] iommu: Adding device 0000:00:18.0 to group 9
[ 1.491822] iommu: Adding device 0000:00:18.1 to group 9
[ 1.497176] iommu: Adding device 0000:00:18.2 to group 9
[ 1.502528] iommu: Adding device 0000:00:18.3 to group 9
[ 1.507883] iommu: Adding device 0000:00:18.4 to group 9
[ 1.513237] iommu: Adding device 0000:00:18.5 to group 9
[ 1.518593] iommu: Adding device 0000:03:00.0 to group 8
[ 1.523932] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 1.529276] AMD-Vi: Extended features: PreF PPR GT IA
[ 1.534776] AMD-Vi: Interrupt remapping enabled
[ 1.539496] AMD-Vi: Lazy IO/TLB flushing enabled
[ 1.545741] AHO: count_annotated 25
[ 1.549259] AHO: build inventory
[ 1.552517] AHO: ac ffffffff81d400d8 ic (null) ID 2177560225 deps 00000000000000b0 drv ffffffff81d25090
[ 1.562801] BUG: unable to handle kernel paging request at 00000000039c2af5
(...)
--

The bug happens because the field driver_id of the structure (and likely
the other stuff) is wrong.
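One arithmetic observation about the failing line, offered as a sketch and not as a diagnosis: the printed deps value 0xb0 is decimal 176, which is a plausible driver_id, and the printed drv value looks like a pointer into kernel data. The userspace helper below (all names in it are hypothetical, it is not kernel code) assumes the x86_64 layout of four 8-byte slots per entry (initcall, driver_id plus padding, dependencies, driver) and re-reads the printed values one slot later. It also reports how far an address sits from an alignment boundary. The upper 32 bits of the second slot are unknown, because %u prints only 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace helper for inspecting the dump; not kernel code. */
struct shifted_view {
	uint32_t initcall_low32; /* only the low 32 bits are known from the log */
	uint32_t driver_id;
	uint64_t dependencies;
};

/*
 * The four 8-byte slots of the failing entry, as printed in the log:
 * ic, ID (low 32 bits only), deps, drv.
 */
const uint64_t failing_dump[4] = {
	UINT64_C(0),                  /* ic (null) */
	UINT64_C(2177560225),         /* ID, printed with %u */
	UINT64_C(0xb0),               /* deps */
	UINT64_C(0xffffffff81d25090), /* drv */
};

/*
 * Re-read the same memory as if the entry started one 8-byte slot later.
 * This is an assumption being tested, not an established fact.
 */
struct shifted_view reread_one_slot_later(const uint64_t words[4])
{
	struct shifted_view v;

	assert(words != NULL);
	v.initcall_low32 = (uint32_t)(words[1] & UINT64_C(0xffffffff));
	v.driver_id = (uint32_t)(words[2] & UINT64_C(0xffffffff));
	v.dependencies = words[3];
	return v;
}

/* Distance of addr past the previous multiple of boundary. */
unsigned misalignment(uint64_t addr, unsigned boundary)
{
	assert(boundary != 0);
	return (unsigned)(addr % boundary);
}
```

Calling reread_one_slot_later(failing_dump) gives driver_id 176 and a dependencies pointer of ffffffff81d25090. misalignment() reports that the failing ac address ffffffff81d400d8 is 8 bytes past a 16-byte boundary. So the values are at least consistent with the loop starting 8 bytes before the first real entry; whether padding ahead of the first entry is the actual cause is only a guess.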

If I disable CONFIG_AMD_IOMMU, the output looks as it should, which is
also how it looks on ARM and Intel systems:

--
(...)
[ 1.151906] io scheduler noop registered
[ 1.307088] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 1.313563] software IO TLB [mem 0x894ca000-0x8d4ca000] (64MB) mapped at [ffff8800894ca000-ffff88008d4c9fff]
[ 1.323411] AHO: count_annotated 25
[ 1.326934] AHO: build inventory
[ 1.330189] AHO: ac ffffffff81d3cea0 ic ffffffff81cadcb4 ID 176 deps ffffffff81d22090 drv (null)
(...)
--

Does anyone have an idea what's going on?

The kernel is 4.2.1 (x86_64) and the hardware is an A10-5800K.

If necessary, I could try to put together a small patch which kills a
system (it is reproducible here).

Thanks in advance,

Alexander Holler