From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mout2.freenet.de ([195.4.92.92]:56574 "EHLO mout2.freenet.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751938AbbJHSYR (ORCPT ); Thu, 8 Oct 2015 14:24:17 -0400
Subject: Re: [dm-devel] AMD-Vi IO_PAGE_FAULTs and ata3.00: failed command: READ FPDMA QUEUED errors since Linux 4.0
To: Joerg Roedel
References: <55FE5740.2060701@maya.org> <20150929152100.GL3036@8bytes.org>
        <20150929162042.GR3036@8bytes.org> <560BF73F.8000008@maya.org>
        <20151006101356.GE12506@8bytes.org> <56141507.7040103@maya.org>
        <56148A1B.5060506@maya.org> <20151007161022.GI28811@8bytes.org>
        <56154DEA.5050901@maya.org> <20151008163957.GK28811@8bytes.org>
Cc: Mikulas Patocka , iommu@lists.linux-foundation.org, Leo Duran ,
        Christoph Hellwig , device-mapper development , Milan Broz ,
        Jens Axboe , linux-pci , Linus Torvalds
From: Andreas Hartmann
Message-ID: <5616B436.1000802@maya.org>
Date: Thu, 8 Oct 2015 20:21:42 +0200
MIME-Version: 1.0
In-Reply-To: <20151008163957.GK28811@8bytes.org>
Content-Type: text/plain; charset=windows-1252; format=flowed
Sender: linux-pci-owner@vger.kernel.org
List-ID:

On 08.10.2015 at 18:39, Joerg Roedel wrote:
> On Wed, Oct 07, 2015 at 06:52:58PM +0200, Andreas Hartmann wrote:
>> To reproduce the error:
>> First I mounted /daten2, afterwards /raid/mt, which produces the errors.
>> The ssd mounts have been already active (during boot by fstab).
>
> Okay, I spent the day on that problem, and managed to reproduce it here
> on one of my AMD IOMMU boxes. It wasn't an easy journey, as I can only
> reproduce it if I set up the crypto partition and everything above that
> (like mounting the lvm volumes) _after_ the system has finished booting.
> If everything is set up during system boot it works fine and I don't see
> any IO_PAGE_FAULTS.

Thank you very much for spending so much of your time to reproduce the
problem!

> I also tried kernel v4.3-rc4 first, to have it tested with a
> self-compiled kernel. It didn't show up there, so I built a 4.1.0, where
> it showed up again. Something seems to have fixed the issue in the
> latest kernels.
>
> So I looked a little bit around at the commits that were merged into the
> respective parts involved here, and found this one:
>
> 586b286 dm crypt: constrain crypt device's max_segment_size to PAGE_SIZE
>
> The problem fixed with this commit looks quite similar to what you have
> seen (except that there was no IOMMU involved). So I cherry-picked that
> commit on 4.1.0 and tested that. The problem was gone.

That's true - I already knew this patch and tested it some weeks ago -
unfortunately it doesn't fix the problem here.

To be really sure, I retested it again just now. I couldn't see any
IO_PAGE_FAULTS errors today (unfortunately I can't remember whether they
were also absent a few weeks ago) - but the ata errors remain.

Therefore, this patch isn't a solution for the problem I encounter here.

> So it looks like it was a dm-crypt issue, the patch went into v4.3-rc3,
> either this kernel or rc4 should fix the problem for you too. Can you
> please verify this is fixed for you too with v4.3-rc4?

As I already wrote, I couldn't even see the problem with v4.3-rc2 any
more (as far as I was able to test, given the other problem). I have to
do some more tests with this kernel now to be really sure.

Kind regards,
Andreas