iommu.lists.linux-foundation.org archive mirror
From: Lutz Vieweg <lvml-i6VILw57VWU@public.gmane.org>
To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Cc: e1000-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Subject: Re: [E1000-devel] AMD-Vi: Event logged IO_PAGE_FAULT - ixgbe Detected Tx Unit Hang - Reset adapter - master disable timed out
Date: Mon, 13 Jun 2016 19:40:11 +0200
Message-ID: <575EEFFB.20004@5t9.de>
In-Reply-To: <CAKT61h9cNnGDNugoWXYcpN1VjVK3Hn-VOW+TwHahj5EXzfsXgA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On 06/13/2016 04:46 AM, Wan ZongShun wrote:
>> With "iommu=pt":
>>>
>>> [    4.832580] iommu: Adding device 0000:04:00.0 to group 13
>>> [    4.832838] iommu: Using direct mapping for device 0000:04:00.0
>>
>
> That is right, you will pass through AMD IOMMU when you set iommu=pt.
>
>> ...
>>>
>>> [    4.837074] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
>>> [    4.837305] AMD-Vi: Found IOMMU at 0000:40:00.2 cap 0x40
>>> [    4.837535] AMD-Vi: Interrupt remapping enabled
>>> [    4.838062] AMD-Vi: Lazy IO/TLB flushing enabled
>>> [    4.838291] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
>>> [    4.838533] software IO TLB [mem 0xd3e80000-0xd7e80000] (64MB) mapped at [ffff8800d3e80000-ffff8800d7e7ffff]
>>
>>
>> I hope that doesn't mean all my network data is now passing through
>> an additional copy-by-CPU... that would be kind of the opposite of what
>> "iommu=pt" seemed to promise :-)
>
> It depends.
>
> Firstly, I need to know if your ethernet card works well now or not
> after you set iommu=pt.

Too early to tell. The NIC has worked for the last 4 days without
failing, but that is only about as long as it took after the upgrade
to linux-4.6.1 before the bug was first encountered.

I'd say celebrating "works with iommu=pt" has to wait at least two weeks
or so, until it is reasonably probable that the option is the reason it works.

> If your ethernet card with 64bit(not 32bit) DMA addressable cap, that
> is ok, you will not be impacted by bounce buffer.
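
To make sure I understand the condition: below is a rough sketch of the
check as I read it, not of what the kernel actually does (the real
decision is made per mapping by the DMA API). TOP_OF_RAM is the end of
the last "System RAM" range in the /proc/iomem output further down; the
function names are made up for illustration.

```python
# Simplified model: a buffer has to be bounced if its address does not fit
# into the device's DMA mask. The kernel decides this per mapping; this
# sketch only checks whether any RAM lies beyond the mask at all.

TOP_OF_RAM = 0x2026FFFFFF  # end of the last "System RAM" range in /proc/iomem


def dma_limit(mask_bits):
    """Highest address a device with a mask_bits-wide DMA mask can reach."""
    return (1 << mask_bits) - 1


def may_need_bounce(mask_bits, top_of_ram=TOP_OF_RAM):
    """True if some RAM lies above what the device can address directly."""
    return top_of_ram > dma_limit(mask_bits)


for bits in (32, 64):
    print(bits, may_need_bounce(bits))
# prints:
# 32 True
# 64 False
```

So on this box a 32-bit-only device could end up in the bounce buffer,
while a 64-bit capable one should not, if this simplified model holds.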

> But iommu=pt is a terrible option, that make all devices bypass the iommu.

Why is that terrible? The documentation I found on what iommu=pt actually
means was pretty scarce, but I noticed that many places recommend using
this option for 10G NICs.

> If you want to get further help, Please try:
>
> (1)Please add 'amd_iommu_dump' option in your kernel boot option, and
> send your full kernel logs, lspci info, don't add iommu=pt.
> (2) Add amd_iommu=fullflush option to kernel boot option, just try it.

Will try that when the NIC becomes unavailable again.
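
To avoid confusion about which options a given boot actually used when I
send logs, I would check /proc/cmdline with something like the sketch
below. The helper names and the example command line are made up for
illustration; only the three option names come from this thread.

```python
def iommu_options(cmdline):
    """Return the IOMMU-related tokens found in a kernel command line."""
    wanted = ("iommu=", "amd_iommu=", "amd_iommu_dump")
    return [tok for tok in cmdline.split() if tok.startswith(wanted)]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()


# Hypothetical command line, as it might look for the test boot suggested above.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro amd_iommu_dump amd_iommu=fullflush"
print(iommu_options(example))
# prints: ['amd_iommu_dump', 'amd_iommu=fullflush']
```

On the live system, iommu_options(read_cmdline()) would list whatever is
really set, so a leftover iommu=pt would show up immediately.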

>> One more thing I find curious, but this didn't change with "iommu=pt":
>>>
>>> [    0.000000] AGP: Checking aperture...
>>> [    0.000000] AGP: No AGP bridge found
>>> [    0.000000] AGP: Node 0: aperture [bus addr 0x00000000-0x01ffffff] (32MB)
>>> [    0.000000] AGP: Your BIOS doesn't leave an aperture memory hole
>>> [    0.000000] AGP: Please enable the IOMMU option in the BIOS setup
>>> [    0.000000] AGP: This costs you 64MB of RAM
>>> [    0.000000] AGP: Mapping aperture over RAM [mem 0xcc000000-0xcfffffff] (65536KB)
>>
>> I checked and the IOMMU-option is definitely enabled in the BIOS setup.
>> So I assume right that these message are irrelevant (since AGP as a whole
>> is irrelevant on this server)?
>
> Please cat /proc/iomem, send the information.

Here it is:
> 00000000-00000fff : reserved
> 00001000-00097bff : System RAM
> 00097c00-0009ffff : reserved
> 000a0000-000bffff : PCI Bus 0000:00
> 000c0000-000c7fff : Video ROM
> 000ce800-000d43ff : Adapter ROM
> 000d4800-000d57ff : Adapter ROM
> 000e6000-000fffff : reserved
>   000f0000-000fffff : System ROM
> 00100000-d7e7ffff : System RAM
>   01000000-01688c05 : Kernel code
>   01688c06-01d4f53f : Kernel data
>   01eea000-02174fff : Kernel bss
> d7e80000-d7e8dfff : RAM buffer
> d7e8e000-d7e8ffff : reserved
> d7e90000-d7eb3fff : ACPI Tables
> d7eb4000-d7edffff : ACPI Non-volatile Storage
> d7ee0000-d7ffffff : reserved
> d9000000-daffffff : PCI Bus 0000:40
>   d9000000-d90003ff : IOAPIC 2
>   d9010000-d9013fff : amd_iommu
> db000000-dcffffff : PCI Bus 0000:00
>   db000000-dbffffff : PCI Bus 0000:01
>     db000000-dbffffff : 0000:01:04.0
>       db000000-dbffffff : mgadrmfb_vram
>   dcd00000-dcffffff : PCI Bus 0000:04
>     dcdfc000-dcdfffff : 0000:04:00.0
>       dcdfc000-dcdfffff : ixgbe
>     dce00000-dcffffff : 0000:04:00.0
>       dce00000-dcffffff : ixgbe
> dd000000-dfffffff : PCI Bus 0000:00
>   def00000-df7fffff : PCI Bus 0000:01
>     deffc000-deffffff : 0000:01:04.0
>       deffc000-deffffff : mgadrmfb_mmio
>     df000000-df7fffff : 0000:01:04.0
>   dfaf6000-dfaf6fff : 0000:00:12.1
>     dfaf6000-dfaf6fff : ohci_hcd
>   dfaf7000-dfaf7fff : 0000:00:12.0
>     dfaf7000-dfaf7fff : ohci_hcd
>   dfaf8400-dfaf87ff : 0000:00:11.0
>     dfaf8400-dfaf87ff : ahci
>   dfaf8800-dfaf88ff : 0000:00:12.2
>     dfaf8800-dfaf88ff : ehci_hcd
>   dfaf8c00-dfaf8cff : 0000:00:13.2
>     dfaf8c00-dfaf8cff : ehci_hcd
>   dfaf9000-dfaf9fff : 0000:00:13.1
>     dfaf9000-dfaf9fff : ohci_hcd
>   dfafa000-dfafafff : 0000:00:13.0
>     dfafa000-dfafafff : ohci_hcd
>   dfafb000-dfafbfff : 0000:00:14.5
>     dfafb000-dfafbfff : ohci_hcd
>   dfb00000-dfbfffff : PCI Bus 0000:02
>     dfb1c000-dfb1ffff : 0000:02:00.1
>       dfb1c000-dfb1ffff : igb
>     dfb20000-dfb3ffff : 0000:02:00.1
>     dfb40000-dfb5ffff : 0000:02:00.1
>       dfb40000-dfb5ffff : igb
>     dfb60000-dfb7ffff : 0000:02:00.1
>       dfb60000-dfb7ffff : igb
>     dfb9c000-dfb9ffff : 0000:02:00.0
>       dfb9c000-dfb9ffff : igb
>     dfba0000-dfbbffff : 0000:02:00.0
>     dfbc0000-dfbdffff : 0000:02:00.0
>       dfbc0000-dfbdffff : igb
>     dfbe0000-dfbfffff : 0000:02:00.0
>       dfbe0000-dfbfffff : igb
>   dfc00000-dfcfffff : PCI Bus 0000:03
>     dfc3c000-dfc3ffff : 0000:03:00.0
>       dfc3c000-dfc3ffff : mpt2sas
>     dfc40000-dfc7ffff : 0000:03:00.0
>       dfc40000-dfc7ffff : mpt2sas
>     dfc80000-dfcfffff : 0000:03:00.0
>   dfd00000-dfdfffff : PCI Bus 0000:04
>     dfd80000-dfdfffff : 0000:04:00.0
>   dfe00000-dfffffff : PCI Bus 0000:05
>     dfeb0000-dfebffff : 0000:05:00.0
>       dfeb0000-dfebffff : mpt2sas
>     dfec0000-dfefffff : 0000:05:00.0
>       dfec0000-dfefffff : mpt2sas
>     dff00000-dfffffff : 0000:05:00.0
> e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
>   e0000000-efffffff : reserved
>     e0000000-efffffff : pnp 00:0a
> f6000000-f6003fff : amd_iommu
> fec00000-fec003ff : IOAPIC 0
> fec10000-fec1001f : pnp 00:04
> fec20000-fec203ff : IOAPIC 1
> fed00000-fed003ff : HPET 2
>   fed00000-fed003ff : PNP0103:00
> fed40000-fed44fff : PCI Bus 0000:00
> fee00000-fee00fff : Local APIC
>   fee00000-fee00fff : pnp 00:03
> ffb80000-ffbfffff : pnp 00:04
> ffe00000-ffffffff : reserved
>   ffe50000-ffe5e05f : pnp 00:04
> 100000000-2026ffffff : System RAM
> 2027000000-2027ffffff : RAM buffer
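
In case it helps with reading the listing: a small sketch that parses
/proc/iomem-style text and prints the size of each region claimed by a
given driver. The function names are my own, and the sample is just the
ixgbe part of the output above (without the "> " quoting).

```python
import re

# One /proc/iomem line: optional indentation, "start-end : name" in hex.
LINE = re.compile(r"^(\s*)([0-9a-f]+)-([0-9a-f]+) : (.*)$")


def parse_iomem(text):
    """Parse /proc/iomem-style text into (depth, start, end, name) tuples."""
    regions = []
    for raw in text.splitlines():
        m = LINE.match(raw)
        if not m:
            continue  # skip anything that is not a region line
        indent, start, end, name = m.groups()
        # Nesting is shown with two spaces of indentation per level.
        regions.append((len(indent) // 2, int(start, 16), int(end, 16), name.strip()))
    return regions


def sizes_for(regions, name):
    """Sizes in bytes of all regions with exactly this name (end is inclusive)."""
    return [end - start + 1 for _, start, end, n in regions if n == name]


SAMPLE = """\
  dcd00000-dcffffff : PCI Bus 0000:04
    dcdfc000-dcdfffff : 0000:04:00.0
      dcdfc000-dcdfffff : ixgbe
    dce00000-dcffffff : 0000:04:00.0
      dce00000-dcffffff : ixgbe
"""

for size in sizes_for(parse_iomem(SAMPLE), "ixgbe"):
    print(size // 1024, "KiB")
# prints:
# 16 KiB
# 2048 KiB
```

So the ixgbe driver has claimed a 16 KiB and a 2 MiB region of device
0000:04:00.0.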

Regards,

Lutz Vieweg
