From: Robert Hancock <hancockr@shaw.ca>
To: Gabor Gombas <gombasg@sztaki.hu>
Cc: Allen Martin <AMartin@nvidia.com>, Mark Lord <lkml@rtr.ca>,
Jeff Garzik <jeff@garzik.org>, Tejun Heo <htejun@gmail.com>,
linux-kernel@vger.kernel.org, linux-ide@vger.kernel.org,
Kuan Luo <kluo@nvidia.com>, Peer Chen <pchen@nvidia.com>
Subject: Re: sata_nv + ADMA + Samsung disk problem
Date: Fri, 11 Jan 2008 19:10:15 -0600 [thread overview]
Message-ID: <47881377.4070508@shaw.ca> (raw)
In-Reply-To: <20080111231813.GB7052@boogie.lpds.sztaki.hu>
Gabor Gombas wrote:
> On Mon, Jan 07, 2008 at 06:10:29PM -0600, Robert Hancock wrote:
>
>> Gabor, I just noticed you said that it worked OK in 2.6.20, yet 2.6.22
>> fails. 2.6.20 had ADMA support as well, so I wonder what change started
>> causing the problem. Would it be possible for you to do a git bisect (or
>> at least try 2.6.21 to try and narrow it down)?
>
> I've now booted 2.6.21.7, we'll see. The problem with the bisection is
> that I can't explicitely trigger the bug so I can't say for sure if a
> kernel is good or it is just needs more time to trigger. The average
> uptime of this machine is just a couple hours a day.
>
> For example, with 2.6.24-rc6 it took over 3 hours for the first disk to
> trigger the bug and the second disk needed more than 7 hours. This
> machine is seldom turned on for that long.
If you want to try to reproduce the problem more rapidly, you can try
the recipe I just suggested to the NVIDIA guys:
Run 2 instances of this C program, with different output files as the
argument, i.e. save this to fsynctest.c, and do
gcc fsynctest.c -o fsynctest
./fsynctest testfile & ./fsynctest testfile2 &
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
int main(int argc, char* argv[])
{
int i;
int fd = open( argv[1], O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR |
S_IWUSR);
if(fd == -1)
{
perror("open");
return 1;
}
for(i=0;i<1000000;i++)
{
int rc = write(fd, "0", 1);
if( rc != 1 )
{
perror("write");
return 2;
}
rc = fsync(fd);
if(rc)
{
perror("fsync");
return 2;
}
}
return 0;
}
Also run one instance of this:
dd if=/dev/zero of=blankfile bs=512 count=100000 oflag=direct
and one of this:
while /bin/true; do sdparm --command=sync /dev/sdb; done
all at the same time. In my experience, it helps to disable cpufreq (on
Red Hat/Fedora, /sbin/service cpuspeed stop) to force the CPU to run at
max frequency all the time. After a few minutes I got this:
ata4: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000
status 0x400 next cpb count 0x2 next cpb idx 0x0
ata4: CPB 0: ctl_flags 0x1f, resp_flags 0x0
ata4: CPB 1: ctl_flags 0x1f, resp_flags 0x0
ata4: CPB 2: ctl_flags 0x1f, resp_flags 0x0
ata4: timeout waiting for ADMA IDLE, stat=0x400
ata4: timeout waiting for ADMA LEGACY, stat=0x400
ata4.00: exception Emask 0x0 SAct 0x7 SErr 0x0 action 0x2 frozen
ata4.00: cmd 61/08:00:e0:74:64/00:00:0a:00:00/40 tag 0 ncq 4096 out
res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4.00: status: { DRDY }
ata4.00: cmd 61/08:08:30:5b:76/00:00:0c:00:00/40 tag 1 ncq 4096 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4.00: status: { DRDY }
ata4.00: cmd 61/01:10:ba:51:77/00:00:0c:00:00/40 tag 2 ncq 512 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4.00: status: { DRDY }
ata4: soft resetting link
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata4.00: configured for UDMA/133
ata4: EH complete
prev parent reply other threads:[~2008-01-12 1:10 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-08-08 12:08 sata_nv + ADMA + Samsung disk problem Gabor Gombas
2007-08-14 9:30 ` Tejun Heo
2007-08-14 12:02 ` Gabor Gombas
2007-08-16 16:06 ` Gabor Gombas
2007-08-16 18:45 ` Jim Paris
2008-01-01 16:44 ` Gabor Gombas
2008-01-02 3:25 ` Tejun Heo
2008-01-02 4:03 ` Robert Hancock
2008-01-02 4:20 ` Robert Hancock
2008-01-02 4:25 ` Tejun Heo
2008-01-02 6:19 ` Jeff Garzik
2008-01-02 6:39 ` Robert Hancock
2008-01-02 6:55 ` Tejun Heo
2008-01-03 0:27 ` Robert Hancock
2008-01-02 17:23 ` Allen Martin
2008-01-02 17:23 ` Allen Martin
2008-01-02 18:57 ` Jeff Garzik
2008-01-02 23:23 ` Allen Martin
2008-01-02 23:23 ` Allen Martin
2008-01-03 0:21 ` Robert Hancock
2008-01-03 4:14 ` Mark Lord
2008-01-03 4:17 ` Mark Lord
2008-01-03 4:54 ` Robert Hancock
2008-01-03 15:44 ` Mark Lord
2008-01-03 15:47 ` Mark Lord
2008-01-03 21:13 ` Benjamin Herrenschmidt
2008-01-04 1:43 ` Robert Hancock
2008-01-04 5:51 ` Benjamin Herrenschmidt
2008-01-04 0:41 ` Allen Martin
2008-01-04 0:41 ` Allen Martin
2008-01-04 2:51 ` Robert Hancock
2008-01-08 0:10 ` Robert Hancock
2008-01-11 23:18 ` Gabor Gombas
2008-01-12 1:10 ` Robert Hancock [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=47881377.4070508@shaw.ca \
--to=hancockr@shaw.ca \
--cc=AMartin@nvidia.com \
--cc=gombasg@sztaki.hu \
--cc=htejun@gmail.com \
--cc=jeff@garzik.org \
--cc=kluo@nvidia.com \
--cc=linux-ide@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lkml@rtr.ca \
--cc=pchen@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.