From: Lutz Vieweg <lvml-i6VILw57VWU@public.gmane.org>
To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Cc: e1000-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Subject: Re: [E1000-devel] AMD-Vi: Event logged IO_PAGE_FAULT - ixgbe Detected Tx Unit Hang - Reset adapter - master disable timed out
Date: Mon, 13 Jun 2016 19:40:11 +0200
Message-ID: <575EEFFB.20004@5t9.de>
In-Reply-To: <CAKT61h9cNnGDNugoWXYcpN1VjVK3Hn-VOW+TwHahj5EXzfsXgA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On 06/13/2016 04:46 AM, Wan ZongShun wrote:
>> With "iommu=pt":
>>>
>>> [ 4.832580] iommu: Adding device 0000:04:00.0 to group 13
>>> [ 4.832838] iommu: Using direct mapping for device 0000:04:00.0
>>
>
> That is right, you will pass through AMD IOMMU when you set iommu=pt.
>
>> ...
>>>
>>> [ 4.837074] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
>>> [ 4.837305] AMD-Vi: Found IOMMU at 0000:40:00.2 cap 0x40
>>> [ 4.837535] AMD-Vi: Interrupt remapping enabled
>>> [ 4.838062] AMD-Vi: Lazy IO/TLB flushing enabled
>>> [ 4.838291] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
>>> [ 4.838533] software IO TLB [mem 0xd3e80000-0xd7e80000] (64MB) mapped at [ffff8800d3e80000-ffff8800d7e7ffff]
>>
>>
>> I hope that doesn't mean all my network data is now passing through
>> an additional copy-by-CPU... that would be kind of the opposite of what
>> "iommu=pt" seemed to promise :-)
>
> It depends.
>
> Firstly, I need to know if your ethernet card works well now or not
> after you set iommu=pt.
Too early to tell. The NIC has now worked for 4 days without failing,
but that is about as long as it took after the upgrade to linux-4.6.1
before the bug first appeared.
I'd say celebrating "works with iommu=pt" has to wait at least two weeks
or so before it is reasonably probable that the option is the reason.
> If your ethernet card has a 64-bit (not 32-bit) DMA addressing capability,
> that is OK, you will not be impacted by the bounce buffer.
> But iommu=pt is a terrible option, it makes all devices bypass the IOMMU.
Why is that terrible? The documentation I found on what iommu=pt actually
means was pretty scarce, but I noticed that many places recommend using
this option for 10G NICs.
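To make the 64-bit versus 32-bit DMA remark quoted above concrete, here is a minimal sketch of the addressing check that decides whether a buffer would have to go through the bounce buffer. It is a simplified model, not the kernel's actual code path; the function name needs_bounce and the 4 KiB buffer are made up for illustration. The top-of-RAM address is taken from the /proc/iomem listing further down in this message.

```python
def needs_bounce(addr: int, size: int, dma_bits: int) -> bool:
    """Return True if the buffer [addr, addr + size) is not fully reachable
    by a device that can only drive dma_bits address bits.

    Simplified model (assumption): a buffer must be bounced when its last
    byte lies above the device's DMA mask.
    """
    dma_mask = (1 << dma_bits) - 1
    return addr + size - 1 > dma_mask


# Highest "System RAM" address in the /proc/iomem listing of this message.
TOP_OF_RAM = 0x2026FFFFFF

# Hypothetical 4 KiB buffer placed at the very top of RAM.
buf = TOP_OF_RAM - 4095

print(needs_bounce(buf, 4096, 64))     # device with a 64-bit DMA mask
print(needs_bounce(buf, 4096, 32))     # device limited to 32-bit DMA
print(needs_bounce(0x1000, 4096, 32))  # low buffer, 32-bit device
```

Under this model only the 32-bit device with the high buffer needs bouncing, which matches the statement that a NIC with 64-bit DMA addressing should not be affected by SWIOTLB.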
> If you want to get further help, please try:
>
> (1) Please add the 'amd_iommu_dump' option to your kernel boot options, and
> send your full kernel logs and lspci info. Don't add iommu=pt.
> (2) Add the amd_iommu=fullflush option to your kernel boot options, just try it.
Will try that when the NIC becomes unavailable again.
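Since several boot options are in play here (iommu=pt, amd_iommu_dump, amd_iommu=fullflush), a small sketch for checking which of them a kernel was actually booted with may help when collecting logs. The helper names and the sample command line are hypothetical; only the three option names come from this thread.

```python
def iommu_options(cmdline: str) -> dict:
    """Collect the IOMMU-related parameters discussed in this thread
    from a kernel command line string."""
    wanted = ("iommu", "amd_iommu", "amd_iommu_dump")
    found = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key in wanted:
            # A parameter may appear more than once, so keep every value.
            found.setdefault(key, []).append(value)
    return found


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()


# Hypothetical command line, for illustration only.
sample = "BOOT_IMAGE=/vmlinuz ro quiet amd_iommu_dump amd_iommu=fullflush"
print(iommu_options(sample))
```

On a live system, iommu_options(read_cmdline()) would report the options in effect, which makes it easy to confirm that iommu=pt really was removed before a test run.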
>> One more thing I find curious, but this didn't change with "iommu=pt":
>>>
>>> [ 0.000000] AGP: Checking aperture...
>>> [ 0.000000] AGP: No AGP bridge found
>>> [ 0.000000] AGP: Node 0: aperture [bus addr 0x00000000-0x01ffffff] (32MB)
>>> [ 0.000000] AGP: Your BIOS doesn't leave an aperture memory hole
>>> [ 0.000000] AGP: Please enable the IOMMU option in the BIOS setup
>>> [ 0.000000] AGP: This costs you 64MB of RAM
>>> [ 0.000000] AGP: Mapping aperture over RAM [mem 0xcc000000-0xcfffffff] (65536KB)
>>
>> I checked and the IOMMU option is definitely enabled in the BIOS setup.
>> So am I right to assume that these messages are irrelevant (since AGP as
>> a whole is irrelevant on this server)?
>
> Please cat /proc/iomem, send the information.
Here it is:
> 00000000-00000fff : reserved
> 00001000-00097bff : System RAM
> 00097c00-0009ffff : reserved
> 000a0000-000bffff : PCI Bus 0000:00
> 000c0000-000c7fff : Video ROM
> 000ce800-000d43ff : Adapter ROM
> 000d4800-000d57ff : Adapter ROM
> 000e6000-000fffff : reserved
>   000f0000-000fffff : System ROM
> 00100000-d7e7ffff : System RAM
>   01000000-01688c05 : Kernel code
>   01688c06-01d4f53f : Kernel data
>   01eea000-02174fff : Kernel bss
> d7e80000-d7e8dfff : RAM buffer
> d7e8e000-d7e8ffff : reserved
> d7e90000-d7eb3fff : ACPI Tables
> d7eb4000-d7edffff : ACPI Non-volatile Storage
> d7ee0000-d7ffffff : reserved
> d9000000-daffffff : PCI Bus 0000:40
>   d9000000-d90003ff : IOAPIC 2
>   d9010000-d9013fff : amd_iommu
> db000000-dcffffff : PCI Bus 0000:00
>   db000000-dbffffff : PCI Bus 0000:01
>     db000000-dbffffff : 0000:01:04.0
>       db000000-dbffffff : mgadrmfb_vram
>   dcd00000-dcffffff : PCI Bus 0000:04
>     dcdfc000-dcdfffff : 0000:04:00.0
>       dcdfc000-dcdfffff : ixgbe
>     dce00000-dcffffff : 0000:04:00.0
>       dce00000-dcffffff : ixgbe
> dd000000-dfffffff : PCI Bus 0000:00
>   def00000-df7fffff : PCI Bus 0000:01
>     deffc000-deffffff : 0000:01:04.0
>       deffc000-deffffff : mgadrmfb_mmio
>     df000000-df7fffff : 0000:01:04.0
>   dfaf6000-dfaf6fff : 0000:00:12.1
>     dfaf6000-dfaf6fff : ohci_hcd
>   dfaf7000-dfaf7fff : 0000:00:12.0
>     dfaf7000-dfaf7fff : ohci_hcd
>   dfaf8400-dfaf87ff : 0000:00:11.0
>     dfaf8400-dfaf87ff : ahci
>   dfaf8800-dfaf88ff : 0000:00:12.2
>     dfaf8800-dfaf88ff : ehci_hcd
>   dfaf8c00-dfaf8cff : 0000:00:13.2
>     dfaf8c00-dfaf8cff : ehci_hcd
>   dfaf9000-dfaf9fff : 0000:00:13.1
>     dfaf9000-dfaf9fff : ohci_hcd
>   dfafa000-dfafafff : 0000:00:13.0
>     dfafa000-dfafafff : ohci_hcd
>   dfafb000-dfafbfff : 0000:00:14.5
>     dfafb000-dfafbfff : ohci_hcd
>   dfb00000-dfbfffff : PCI Bus 0000:02
>     dfb1c000-dfb1ffff : 0000:02:00.1
>       dfb1c000-dfb1ffff : igb
>     dfb20000-dfb3ffff : 0000:02:00.1
>     dfb40000-dfb5ffff : 0000:02:00.1
>       dfb40000-dfb5ffff : igb
>     dfb60000-dfb7ffff : 0000:02:00.1
>       dfb60000-dfb7ffff : igb
>     dfb9c000-dfb9ffff : 0000:02:00.0
>       dfb9c000-dfb9ffff : igb
>     dfba0000-dfbbffff : 0000:02:00.0
>     dfbc0000-dfbdffff : 0000:02:00.0
>       dfbc0000-dfbdffff : igb
>     dfbe0000-dfbfffff : 0000:02:00.0
>       dfbe0000-dfbfffff : igb
>   dfc00000-dfcfffff : PCI Bus 0000:03
>     dfc3c000-dfc3ffff : 0000:03:00.0
>       dfc3c000-dfc3ffff : mpt2sas
>     dfc40000-dfc7ffff : 0000:03:00.0
>       dfc40000-dfc7ffff : mpt2sas
>     dfc80000-dfcfffff : 0000:03:00.0
>   dfd00000-dfdfffff : PCI Bus 0000:04
>     dfd80000-dfdfffff : 0000:04:00.0
>   dfe00000-dfffffff : PCI Bus 0000:05
>     dfeb0000-dfebffff : 0000:05:00.0
>       dfeb0000-dfebffff : mpt2sas
>     dfec0000-dfefffff : 0000:05:00.0
>       dfec0000-dfefffff : mpt2sas
>     dff00000-dfffffff : 0000:05:00.0
> e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
>   e0000000-efffffff : reserved
>     e0000000-efffffff : pnp 00:0a
> f6000000-f6003fff : amd_iommu
> fec00000-fec003ff : IOAPIC 0
> fec10000-fec1001f : pnp 00:04
> fec20000-fec203ff : IOAPIC 1
> fed00000-fed003ff : HPET 2
>   fed00000-fed003ff : PNP0103:00
> fed40000-fed44fff : PCI Bus 0000:00
> fee00000-fee00fff : Local APIC
>   fee00000-fee00fff : pnp 00:03
> ffb80000-ffbfffff : pnp 00:04
> ffe00000-ffffffff : reserved
>   ffe50000-ffe5e05f : pnp 00:04
> 100000000-2026ffffff : System RAM
> 2027000000-2027ffffff : RAM buffer
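For anyone who wants to pick individual regions out of a listing like the one above, a rough sketch of a parser follows. The function names are made up for illustration, and the regular expression assumes the "start-end : name" layout shown here, optionally preceded by a "> " quote marker and indentation. The two sample lines are the ixgbe entries copied from the listing.

```python
import re

# Matches "start-end : name", tolerating a leading "> " and indentation.
LINE = re.compile(r"^\s*(?:>\s*)?([0-9a-f]+)-([0-9a-f]+) : (.+?)\s*$")


def parse_iomem(text: str) -> list:
    """Turn /proc/iomem-style text into (start, end, name) tuples."""
    entries = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            start, end = int(m.group(1), 16), int(m.group(2), 16)
            entries.append((start, end, m.group(3)))
    return entries


def regions_named(entries: list, name: str) -> list:
    """Return the (start, end) ranges whose owner is exactly `name`."""
    return [(s, e) for s, e, n in entries if n == name]


def size_kib(start: int, end: int) -> int:
    """Size of an inclusive address range in KiB."""
    return (end - start + 1) // 1024


sample = """\
> dcdfc000-dcdfffff : ixgbe
> dce00000-dcffffff : ixgbe
"""

for start, end in regions_named(parse_iomem(sample), "ixgbe"):
    print(f"{start:#x}-{end:#x}: {size_kib(start, end)} KiB")
```

The same helpers can be pointed at the full listing to look up other owners, for example "System RAM" or "amd_iommu".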
Regards,
Lutz Vieweg