From: Jeff Garzik <jeff@garzik.org>
To: Torsten Kaiser <just.for.lkml@googlemail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-ide@vger.kernel.org,
	Kuan Luo <kluo@nvidia.com>, Peer Chen <pchen@nvidia.com>
Subject: Re: 2.6.23-mm1
Date: Sat, 13 Oct 2007 08:19:17 -0400
Message-ID: <4710B7C5.5050403@garzik.org>
In-Reply-To: <64bb37e0710130503haa66d6eu93e75ecdc78ac866@mail.gmail.com>

Torsten Kaiser wrote:
> On 10/13/07, Jeff Garzik <jeff@garzik.org> wrote:
>> Torsten Kaiser wrote:
>>> On 10/12/07, Andrew Morton <akpm@linux-foundation.org> wrote:
>>>> On Fri, 12 Oct 2007 10:31:42 +0200 "Torsten Kaiser" <just.for.lkml@googlemail.com> wrote:
>>>>> Oct 12 10:23:03 treogen smartd[6091]: Device: /dev/sdc, not found in
>>>>> smartd database.
>>>> hm.
>>>>
>>>>> Oct 12 10:23:03 treogen [  105.990000] WARNING: at
>>>>> drivers/ata/libata-core.c:5752 ata_qc_issue()
>>>> Let's cc linux-ide.
>>>>
>>>>> Oct 12 10:23:03 treogen [  105.990000]
>>>>> Oct 12 10:23:03 treogen [  105.990000] Call Trace:
>>>>> Oct 12 10:23:03 treogen [  105.990000]  [<ffffffff804442ef>]
>>>>> ata_qc_issue+0x47f/0x540
>>>>> Oct 12 10:23:03 treogen [  105.990000]  [<ffffffff80432e60>] scsi_done+0x0/0x20
>>>>> Oct 12 10:23:03 treogen [  105.990000]  [<ffffffff80449c80>]
>>>>> ata_scsi_flush_xlat+0x0/0x30
>>> Oct 13 07:46:48 treogen [   99.850000]
>>> Oct 13 07:46:48 treogen [   99.850000] ata3: EH in SWNCQ
>>> mode,QC:qc_active 0x3 sactive 0x1
>>> Oct 13 07:46:48 treogen [   99.850000] ata3: SWNCQ:qc_active 0x1
>>> defer_bits 0x0 last_issue_tag 0x0
>> The WARNING indicates that there is a SWNCQ bug in sata_nv.  Given that
>> the problem appears when SYNCHRONIZE CACHE is being issued, I would
> 
> I can't follow you on SYNCHRONIZE CACHE.
> The only commands written to the syslog in the errors were
> 0x60==ATA_CMD_FPDMA_READ and 0xB0 (which is not in
> include/linux/ata.h, but ATA-6 says that this is SMART related. That
> makes sense, as smartd is failing).

In the traceback you have "ata_scsi_flush_xlat", which is the function 
that translates a SCSI sync-cache command into an ATA flush-cache command.
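
As a rough illustration of what that translation amounts to, here is a 
minimal standalone sketch. It is not the libata code: the opcode values 
are the ones from the SCSI and ATA command sets, but the function name 
sketch_flush_xlat and the has_flush_ext parameter are made up for this 
example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Opcode values from the SCSI and ATA command sets. */
enum {
	SCSI_SYNCHRONIZE_CACHE_10 = 0x35,
	ATA_FLUSH_CACHE           = 0xE7,
	ATA_FLUSH_CACHE_EXT       = 0xEA,
};

/*
 * Hypothetical helper, for illustration only: map a SCSI opcode to the
 * ATA flush opcode a translation layer would issue.  Returns -1 when the
 * SCSI command is not SYNCHRONIZE CACHE(10).  has_flush_ext stands for
 * "the device supports the 48-bit FLUSH CACHE EXT command".
 */
int sketch_flush_xlat(uint8_t scsi_opcode, bool has_flush_ext)
{
	if (scsi_opcode != SCSI_SYNCHRONIZE_CACHE_10)
		return -1;

	return has_flush_ext ? ATA_FLUSH_CACHE_EXT : ATA_FLUSH_CACHE;
}
```

The point for this thread is that the resulting ATA flush command 
carries no data and is not an NCQ command.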

The "WARNING: at drivers/ata/libata-core.c:5752 ata_qc_issue()" also 
guides us to the code comment

         /* Make sure only one non-NCQ command is outstanding.  The
          * check is skipped for old EH because it reuses active qc to
          * request ATAPI sense.
          */

which is a check related to NCQ->off and off->NCQ edge cases.
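
To make the invariant concrete, here is a simplified standalone model of 
that kind of check. It is a sketch under my own assumptions, not the 
code at libata-core.c:5752: struct sketch_port, sketch_issue() and 
SKETCH_TAG_NONE are hypothetical names, and the model only tracks a 
bitmask of in-flight NCQ tags plus the tag of a single in-flight non-NCQ 
command.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_TAG_NONE (-1)

/* Hypothetical, reduced view of per-port command state. */
struct sketch_port {
	uint32_t sactive;	/* bitmask of in-flight NCQ tags */
	int active_tag;		/* in-flight non-NCQ tag, or SKETCH_TAG_NONE */
};

/*
 * Record the issue of a command with the given tag (assumed < 32).
 * Returns true if the issue respects the invariant, false in the
 * situations where a WARNING would be expected:
 *   - any command issued while a non-NCQ command is outstanding
 *   - an NCQ command whose tag is already in flight
 *   - a non-NCQ command issued while NCQ commands are in flight
 */
bool sketch_issue(struct sketch_port *p, unsigned int tag, bool is_ncq)
{
	bool ok = (p->active_tag == SKETCH_TAG_NONE);

	if (is_ncq) {
		ok = ok && !(p->sactive & (1u << tag));
		p->sactive |= 1u << tag;
	} else {
		ok = ok && (p->sactive == 0);
		p->active_tag = (int)tag;
	}

	return ok;
}
```

In this model, issuing an NCQ command on tag 0 and then a non-NCQ 
command on tag 1 makes the second call return false. That appears to be 
the same shape as the quoted "qc_active 0x3 sactive 0x1" line: two 
commands active, only one of them NCQ.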

So those are the two bits of information I found interesting.


>> guess that sata_nv is not properly handling non-queued commands.
> 
> But that still seems correct, as I would not expect that SMART
> commands get queued. (That's just a guess, as I did not try to find the
> code that makes this distinction.)
> 
>> This is a patch from libata-dev.git#nv-swncq (via #ALL).
> 
> Comparing sata_nv.c from 2.6.23-rc8-mm1 and 2.6.23-mm1, I see two
> changes that look suspicious:
> 
> http://git.kernel.org/?p=linux/kernel/git/jgarzik/libata-dev.git;a=commitdiff;h=31cc23b34913bc173680bdc87af79e551bf8cc0d
> 
> The comment says: "ahci and sata_sil24 are converted to use ata_std_qc_defer()."
> But the patch also adds ".qc_defer = ata_std_qc_defer," to sata_nv.c
> 
> The second change is the removal of the 'lock' spinlock from sata_nv.c
> that was used in nv_swncq_qc_issue and nv_swncq_host_interrupt.
> 
> Should I try to revert one or both of these changes?

If you are git-capable, IMO the next steps in problem elimination should be

* download latest linux-2.6.git (currently 
752097cec53eea111d087c545179b421e2bde98a)
* build and test linux-2.6.git, to establish a new baseline
* download latest libata-dev.git#nv-swncq (currently 
3cb664c2d319a4fde5028c3c5dab6221fe70bd2d)
* build and test, with sata_nv module option swncq=0
* build and test, with sata_nv module option swncq=1

That will get -mm out of the picture, use the same baseline kernel for 
all three tests (nv-swncq is based off of 
752097cec53eea111d087c545179b421e2bde98a) and narrow things down to the 
precise changes that went upstream (or are on the 'nv-swncq' branch, 
waiting to go upstream).

My gut feeling is that there is a lingering bug in sata_nv SWNCQ somewhere.

	Jeff



