dm-devel.redhat.com archive mirror
From: Andreas Hartmann <andihartmann-KuiJ5kEpwI6ELgA04lAiVw@public.gmane.org>
To: Mikulas Patocka
	<mpatocka-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Andreas Hartmann
	<andihartmann-KuiJ5kEpwI6ELgA04lAiVw@public.gmane.org>,
	Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Leo Duran <leo.duran-5C7GfCeVMHo@public.gmane.org>
Cc: linux-pci <linux-pci-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Jens Axboe <axboe-b10kYP2dOMg@public.gmane.org>,
	device-mapper development
	<dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Linus Torvalds
	<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>,
	Milan Broz <mbroz-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: [dm-devel] AMD-Vi IO_PAGE_FAULTs and ata3.00: failed command: READ FPDMA QUEUED errors since Linux 4.0
Date: Sun, 20 Sep 2015 08:50:40 +0200
Message-ID: <55FE5740.2060701@maya.org>
In-Reply-To: <alpine.LRH.2.02.1508021347480.17729-Hpncn10jQN4oNljnaZt3ZvA+iT7yCHsGwRM8/txMwJMAicBL8TP8PQ@public.gmane.org>

On 08/02/2015 at 07:57 PM, Mikulas Patocka wrote:
> 
> 
> On Sun, 2 Aug 2015, Andreas Hartmann wrote:
> 
>> On 08/01/2015 at 04:20 PM Andreas Hartmann wrote:
>>> On 07/28/2015 at 09:29 PM, Mike Snitzer wrote:
>>> [...]
>>>> Mikulas was saying to bisect what is causing ATA to fail.
>>>
>>> Some good news and some bad news. The good news first:
>>>
>>> Your patchset
>>>
>>> f3396c58fd8442850e759843457d78b6ec3a9589,
>>> cf2f1abfbd0dba701f7f16ef619e4d2485de3366,
>>> 7145c241a1bf2841952c3e297c4080b357b3e52d,
>>> 94f5e0243c48aa01441c987743dc468e2d6eaca2,
>>> dc2676210c425ee8e5cb1bec5bc84d004ddf4179,
>>> 0f5d8e6ee758f7023e4353cca75d785b2d4f6abe,
>>> b3c5fd3052492f1b8d060799d4f18be5a5438add
>>>
>>> seems to work fine w/ 3.18.19 !!
>>>
>>> Why did I test it with 3.18.x now? Because I suddenly got two ata errors
>>> (ata1 and ata2) with clean 3.19.8 (w/o the AMD-Vi IO_PAGE_FAULTs) during
>>> normal operation. This means: 3.19 must already be broken, too.
>>>
>>> Therefore, I applied your patchset to 3.18.x and it seems to work like a
>>> charm - I don't get any AMD-Vi IO_PAGE_FAULTs on boot and no ata errors
>>> (so far).
>>>
>>>
>>> Next, I tried to bisect between 3.18 and 3.19 with your patchset
>>> applied, because with this patchset applied the problem can be seen
>>> easily and directly on boot. Unfortunately, this worked for only a few
>>> git bisect rounds until I got stuck because of conflicts with your
>>> extra patches:
>>
>> [Resolved the problems written at the last post.]
>>
>> Bisecting ended here:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=34b48db66e08ca1c1bc07cf305d672ac940268dc
>>
>> block: remove artifical max_hw_sectors cap
>>
>>
>> Reverting this patch on 3.19 and 4.1 makes things work again. I didn't
>> test 4.0, but I think it's the same. No more AMD-Vi IO_PAGE_FAULTs with
>> that patch reverted.

After a long period of testing, I can now say that max_sectors_kb can be
set to 1024; higher values produce AMD-Vi IO_PAGE_FAULTs and ata faults.
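
For readers who want to apply the same limit, here is a minimal sketch of
capping max_sectors_kb through sysfs. It assumes the standard
/sys/block/<dev>/queue/ attributes max_sectors_kb and max_hw_sectors_kb;
the helper names and the 1024 default are illustrative only and are not
taken from any patch in this thread.

```python
from pathlib import Path


def read_kb(queue_dir: Path, name: str) -> int:
    """Read an integer KiB value from a block queue attribute file."""
    return int((queue_dir / name).read_text().strip())


def cap_max_sectors_kb(queue_dir: Path, limit_kb: int = 1024) -> int:
    """Lower max_sectors_kb to limit_kb if it is currently higher.

    queue_dir is expected to be something like /sys/block/sda/queue
    (the device name is an assumption). The value is never raised and
    never set above max_hw_sectors_kb. Returns the resulting value.
    """
    current = read_kb(queue_dir, "max_sectors_kb")
    hw_max = read_kb(queue_dir, "max_hw_sectors_kb")
    target = min(current, limit_kb, hw_max)
    if target != current:
        # Writing to the real sysfs file requires root privileges.
        (queue_dir / "max_sectors_kb").write_text(f"{target}\n")
    return target
```

As root, cap_max_sectors_kb(Path("/sys/block/sda/queue")) would apply the
limit to one disk. The setting does not persist across reboots, so it
would have to be reapplied at boot, e.g. from a udev rule or init script.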


The patch "sd: Fix maximum I/O size for BLOCK_PC requests"[1], which is
part of 4.1.7, also produces ata errors and AMD-Vi IO_PAGE_FAULTs, and
does so already during boot, regardless of whether "block: remove
artifical max_hw_sectors cap"[2] is applied.


Next, I tested the "dm crypt: constrain crypt device's max_segment_size
to PAGE_SIZE" patch[3], applied to an otherwise unchanged 4.1.7 kernel
without setting max_sectors_kb to 1024.

Interestingly, booting was fine, but I saw lots of ata errors afterwards
as soon as there was load on the md RAID 1 (e.g. during a kernel
compile), which is built on *rotational* disks:


[  367.264873] ata2.00: exception Emask 0x0 SAct 0x7fbfffff SErr 0x0 action 0x6 frozen
[  367.264883] ata2.00: failed command: WRITE FPDMA QUEUED
[  367.264893] ata2.00: cmd 61/40:00:b0:7b:d4/05:00:06:00:00/40 tag 0 ncq 688128 out
[  367.264893]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  367.264899] ata2.00: status: { DRDY }
...
[  367.265332] ata2.00: failed command: WRITE FPDMA QUEUED
[  367.265339] ata2.00: cmd 61/40:f0:30:71:d4/05:00:06:00:00/40 tag 30 ncq 688128 out
[  367.265339]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  367.265343] ata2.00: status: { DRDY }
[  367.265350] ata2: hard resetting link
[  367.775330] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  367.776970] ata2.00: configured for UDMA/133
[  367.776997] ata2.00: device reported invalid CHS sector 0
...
[  367.777761] ata2: EH complete
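
To read the request size out of the "cmd" lines above, here is a small
sketch that decodes the taskfile dump. It assumes the usual libata field
order (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:
hob_lbal:hob_lbam:hob_lbah/device) and that, for FPDMA QUEUED commands,
the sector count is carried in the feature registers. The function name
and the 512-byte sector size are assumptions made for illustration.

```python
import re


def decode_ncq_cmd(line: str, sector_size: int = 512) -> dict:
    """Decode a libata 'cmd ... tag N' log line for an NCQ command."""
    m = re.search(r"cmd (\S+) tag (\d+)", line)
    if m is None:
        raise ValueError("no 'cmd ... tag N' field found")
    command, low, high, device = m.group(1).split("/")
    # Current register values, then the "high order byte" (hob) values.
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in high.split(":")
    )
    # NCQ: the 16-bit sector count lives in hob_feature:feature.
    sectors = (hob_feature << 8) | feature
    # 48-bit LBA assembled from the six LBA bytes.
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {
        "command": int(command, 16),
        "tag": int(m.group(2)),
        "sectors": sectors,
        "bytes": sectors * sector_size,
        "lba": lba,
    }
```

Applied to the first failed command, this yields 0x0540 = 1344 sectors,
i.e. 688128 bytes (672 KiB), which matches the "ncq 688128 out" value
printed by the kernel.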


In other words: using an unpatched kernel >= 3.19 carries a high risk of
breaking filesystems under some as yet unknown conditions [4].

>>
>>
>> Please check why this patch triggers AMD-Vi IO_PAGE_FAULTS.
> 
> I would submit this bug to the maintainers of AMD-Vi. They understand the
> hardware, so they should be able to tell why large I/O requests result in
> IO_PAGE_FAULTs.
>
> It is probably a bug either in the AMD-Vi driver or in the hardware.

So far, I haven't heard anything from the AMD-Vi maintainers.


Regards,
Andreas Hartmann


[1] http://thread.gmane.org/gmane.linux.kernel.commits.head/538464
[2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=34b48db66e08ca1c1bc07cf305d672ac940268dc
[3] http://news.gmane.org/find-root.php?group=gmane.linux.kernel&article=2036495
[4] http://thread.gmane.org/gmane.linux.kernel.pci/43851/focus=44011
