From: arno@natisbad.org (Arnaud Ebalard)
To: Robert Hancock <hancockrwd@gmail.com>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>,
	Andrew Lunn <andrew@lunn.ch>, Jason Cooper <jason@lakedaemon.net>,
	linux-ide@vger.kernel.org,
	Jason Gunthorpe <jgunthorpe@obsidianresearch.com>,
	Marc Carino <marc.ceeeee@gmail.com>,
	Ezequiel Garcia <ezequiel.garcia@free-electrons.com>,
	Tejun Heo <tj@kernel.org>,
	Gregory Clement <gregory.clement@free-electrons.com>,
	willy tarreau <w@1wt.eu>,
	linux-arm-kernel@lists.infradead.org,
	Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Subject: Re: [BUG,REGRESSION] SATA regression on 3.12.0-rc4 kernel
Date: Tue, 08 Oct 2013 08:10:39 +0200
Message-ID: <87y564waw0.fsf@natisbad.org>
In-Reply-To: 52537015.6040200@gmail.com

Hi Robert,

Robert Hancock <hancockrwd@gmail.com> writes:

> On 10/07/2013 01:12 PM, Arnaud Ebalard wrote:
>> Hi guys,
>>
>> yesterday, I reported on arm kernel mailing list what looked like a sata
>> regression on my platform (Marvell Armada 370-based NETGEAR ReadyNAS
>> 102). I initially thought this was an ARM-related issue. My initial
>> email, provided below, contains various details on the platform and the
>> error encountered.
>>
>> Today, before starting a painful git bisect, I decided to git log
>> sata_mv.c code and then more generally drivers/ata to quickly end up on
>> commit ed36911c747c (libata: Add support for SEND/RECEIVE FPDMA QUEUED)
>> against which I got suspicious after looking again at the errors I had:
>>
>> [  417.288155] ata1.00: exception Emask 0x0 SAct 0x1fff6001 SErr 0x0 action 0x6 frozen
>> [  417.295838] ata1.00: failed command: WRITE FPDMA QUEUED
>> [  417.301097] ata1.00: cmd 61/48:00:80:ad:0b/00:00:0c:00:00/40 tag 0 ncq 36864 out
>> [  417.315896] ata1.00: status: { DRDY }
>> [  417.319570] ata1.00: failed command: WRITE FPDMA QUEUED
>> [  417.324814] ata1.00: cmd 61/08:68:70:a1:87/00:00:0d:00:00/40 tag 13 ncq 4096 out
>> [  417.339619] ata1.00: status: { DRDY }
>> [  417.343288] ata1.00: failed command: WRITE FPDMA QUEUED
>> [  417.348536] ata1.00: cmd 61/08:70:28:a2:87/00:00:0d:00:00/40 tag 14 ncq 4096 out
>> [  417.363341] ata1.00: status: { DRDY }
>> [  417.367010] ata1.00: failed command: WRITE FPDMA QUEUED
>> [  417.372257] ata1.00: cmd 61/08:80:80:a3:87/00:00:0d:00:00/40 tag 16 ncq 4096 out
>> [  417.387061] ata1.00: status: { DRDY }
>> [  417.390733] ata1.00: failed command: WRITE FPDMA QUEUED
>> [  417.395977] ata1.00: cmd 61/08:88:58:a1:c7/00:00:0d:00:00/40 tag 17 ncq 4096 out
>> [  417.410782] ata1.00: status: { DRDY }
>>
>> Reverting both 87fb6c31b9 (libata: Add support for queued DSM TRIM) and
>> ed36911c74 (libata: Add support for SEND/RECEIVE FPDMA QUEUED) makes the
>> problem disappear. Note: reverting 87fb6c31b9 is not enough and I cannot
>> compile the kernel with only the latter reverted.
>>
>> If you need more info on the platform or want me to test some fix,
>> do not hesitate.
>
> I assume that it consistently fails on a non-working kernel and
> consistently works with those patches reverted? Given that both of
> those patches seem to only be touching SSDs with NCQ trim support, it
> seems odd they would be breaking a normal hard drive, but maybe there
> is some unexpected side effect..

With two different disks (same model though, i.e. 250GB 3.5" WD blue), it
consistently works on 3.11.4 and consistently fails on 3.12-rc3 and
3.12-rc4 (other 3.12-rc versions not tested). The problem is easy to
reproduce: I just need to perform some disk operations. With the two
commits reverted from 3.12-rc4, I can consistently run a "find / -exec
sha256sum '{}' \;" without any error.
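
For anyone wanting to generate a comparable read load from a script, here is
a rough Python equivalent of the find/sha256sum command above. It is only a
sketch: the function names are made up for illustration, symlinks are
skipped, and it is best pointed at a directory on the disk under test rather
than at / (which would also walk /proc and /sys).

```python
import hashlib
import os


def sha256_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Hash every regular file under root and return {path: hexdigest}.

    Symlinks are skipped and unreadable files are ignored, so one bad
    entry does not stop the walk.
    """
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                results[path] = sha256_file(path)
            except OSError:
                continue
    return results
```

Calling hash_tree() on a mount point of the affected disk while watching the
kernel log should exercise the same read path as the shell command.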

What I do not understand is why the log reports failed FPDMA commands if
the feature is supposed to be SSD-related (looking only at commit
messages: 87fb6c31b9 seems SSD-related, ed36911c74 does not). Is it
possible that the feature detection is what is causing the issue? Or
that the hardware reports support without actually having it? I can test
with a different disk if you think it would help.
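
To make the quoted log lines easier to read, here is a small Python sketch
that decodes the "cmd" taskfile dump and the SAct mask. The field order
(command/feature:nsect:lbal:lbam:lbah/hob_*/device) is assumed from how
libata prints the taskfile, the NCQ layout (sector count in the feature
fields, tag in bits 7:3 of nsect) is assumed from the ATA NCQ command
format, and 512-byte logical sectors are assumed. The helper names are
hypothetical.

```python
import re

# Opcodes of the NCQ commands relevant to this thread.
NCQ_OPCODES = {
    0x60: "READ FPDMA QUEUED",
    0x61: "WRITE FPDMA QUEUED",
    0x64: "SEND FPDMA QUEUED",
    0x65: "RECEIVE FPDMA QUEUED",
}

_FIVE = r"((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
CMD_RE = re.compile(r"cmd ([0-9a-f]{2})/" + _FIVE + "/" + _FIVE
                    + r"/([0-9a-f]{2}) tag (\d+)")


def active_tags(sact):
    """List the NCQ tags whose bit is set in the 32-bit SActive mask."""
    return [tag for tag in range(32) if (sact >> tag) & 1]


def decode_cmd(line, sector_size=512):
    """Decode one 'cmd ../..:..' log line; return None if it does not match."""
    m = CMD_RE.search(line)
    if m is None:
        return None
    command = int(m.group(1), 16)
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in m.group(2).split(":"))
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in m.group(3).split(":"))
    # For NCQ commands a sector count of 0 means 65536 sectors.
    sectors = ((hob_feature << 8) | feature) or 65536
    lba = (lbal | (lbam << 8) | (lbah << 16)
           | (hob_lbal << 24) | (hob_lbam << 32) | (hob_lbah << 40))
    return {
        "name": NCQ_OPCODES.get(command, "opcode 0x%02x" % command),
        "tag_in_taskfile": nsect >> 3,   # tag encoded in the command itself
        "tag_logged": int(m.group(5)),   # tag printed by libata
        "sectors": sectors,
        "bytes": sectors * sector_size,
        "lba": lba,
    }
```

Fed the lines quoted above, decode_cmd() reports opcode 0x61 (WRITE FPDMA
QUEUED) for each failed command, not 0x64/0x65, and active_tags(0x1fff6001)
lists the 16 tags that were outstanding when the port was frozen.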

Cheers,

a+
