* Flush drive cache before issuing ATA_16 from userspace?
@ 2008-08-19 13:41 Mark Lord
2008-08-19 16:38 ` Grant Grundler
2008-08-19 22:05 ` Mark Lord
0 siblings, 2 replies; 5+ messages in thread
From: Mark Lord @ 2008-08-19 13:41 UTC (permalink / raw)
To: Jeff Garzik, Tejun Heo, Alan Cox; +Cc: IDE/ATA development list
I have a rather busy MythTV system here, with four tuners
and a hirez HDTV for the display.
It uses a pair of 750GB Hitachi SATA drives (RAID0) for storage.
I wanted to see how warm the drives get, so I set up a monitoring
program that invokes hddtemp every 20-30 seconds or so, updating
a front panel display with the current drive temperature of /dev/sdb.
So far, so good.
But.. when the machine is busy recording a hi-def (17mbit/sec) stream
whilst also playing back a hi-def stream, libata locks up and resets
/dev/sdb periodically, say once every minute or so (quite irregular).
This causes lots of recording bits to be dropped, ruining later playback.
The dual-core system was using 2.6.24.3 (32-bit) at the time,
and libata.ahci w/NCQ on both drives.
I retested with "hdparm -Q1 /dev/sd?" and it didn't help -- same problem.
Looking at the system logs, and running a full S.M.A.R.T. test shows
both drives to be clean (no media faults found), other than libata
reporting timeouts as here:
..
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in
res 40/00:00:00:4f:c2/00:01:00:00:00/00 Emask 0x4 (timeout)
ata2.00: status: { DRDY }
ata2: soft resetting link
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: configured for UDMA/133
ata2: EH complete
sd 1:0:0:0: [sdb] 1465149168 512-byte hardware sectors (750156 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
..
That's an IDENTIFY (0xEC) command timing out.
The hddtemp program does it's work by issuing IDENTIFY and SMART
commands to the target drive, /dev/sdb in this case.
ioctl(3, 0x30d, 0xbfd2c418)
ioctl(3, 0x31f, 0xbfd2c60c)
ioctl(3, 0x31f, 0xbfd2c614)
ioctl(3, 0x31f, 0xbfd2c408)
So that 0xEC most likely came from the hddtemp program,
since libata doesn't normally issue them after probing.
So why is it timing out? Well, these drives have 32MB onboard caches,
and I'm guessing that something (firmware, whatever) tries to empty that
cache before processing the issued IDENTIFY command. And we time out
before the drive has a chance to actually process the IDENTIFY.
This is a problem for libata, one that will become more common as drives
get larger and larger caches. Fiddling with timeouts isn't really the
greatest solution, though it could help.
A more deterministic solution might be to issue a CACHE FLUSH command
ahead of any ATA_16 command (perhaps other than a R/W command?),
so that the timeout for the ATA_16 (IDENTIFY in this case) won't have
to account for an unpredictable write cache on the drive.
Or maybe there's a better way. Suggestions ??
Meanwhile, I'm installing 2.6.26.2 on the box later today,
but don't really expect anything to be different with the newer kernel.
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: Flush drive cache before issuing ATA_16 from userspace?
2008-08-19 13:41 Flush drive cache before issuing ATA_16 from userspace? Mark Lord
@ 2008-08-19 16:38 ` Grant Grundler
2008-08-19 22:05 ` Mark Lord
1 sibling, 0 replies; 5+ messages in thread
From: Grant Grundler @ 2008-08-19 16:38 UTC (permalink / raw)
To: Mark Lord; +Cc: Jeff Garzik, Tejun Heo, Alan Cox, IDE/ATA development list
On Tue, Aug 19, 2008 at 6:41 AM, Mark Lord <liml@rtr.ca> wrote:
> I have a rather busy MythTV system here, with four tuners
> and a hirez HDTV for the display.
>
> It uses a pair of 750GB Hitachi SATA drives (RAID0) for storage.
>
> I wanted to see how warm the drives get, so I set up a monitoring
> program that invokes hddtemp every 20-30 seconds or so, updating
> a front panel display with the current drive temperature of /dev/sdb.
>
> So far, so good.
>
> But.. when the machine is busy recording a hi-def (17mbit/sec) stream
> whilst also playing back a hi-def stream, libata locks up and resets
> /dev/sdb periodically, say once every minute or so (quite irregular).
> This causes lots of recording bits to be dropped, ruining later playback.
>
> The dual-core system was using 2.6.24.3 (32-bit) at the time,
> and libata.ahci w/NCQ on both drives.
> I retested with "hdparm -Q1 /dev/sd?" and it didn't help -- same problem.
>
> Looking at the system logs, and running a full S.M.A.R.T. test shows
> both drives to be clean (no media faults found), other than libata
> reporting timeouts as here:
>
> ..
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in
> res 40/00:00:00:4f:c2/00:01:00:00:00/00 Emask 0x4 (timeout)
> ata2.00: status: { DRDY }
> ata2: soft resetting link
> ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata2.00: configured for UDMA/133
> ata2: EH complete
> sd 1:0:0:0: [sdb] 1465149168 512-byte hardware sectors (750156 MB)
> sd 1:0:0:0: [sdb] Write Protect is off
> sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support
> DPO or FUA
> ..
>
> That's an IDENTIFY (0xEC) command timing out.
> The hddtemp program does it's work by issuing IDENTIFY and SMART
> commands to the target drive, /dev/sdb in this case.
>
> ioctl(3, 0x30d, 0xbfd2c418)
> ioctl(3, 0x31f, 0xbfd2c60c)
> ioctl(3, 0x31f, 0xbfd2c614)
> ioctl(3, 0x31f, 0xbfd2c408)
>
> So that 0xEC most likely came from the hddtemp program,
> since libata doesn't normally issue them after probing.
>
> So why is it timing out? Well, these drives have 32MB onboard caches,
> and I'm guessing that something (firmware, whatever) tries to empty that
> cache before processing the issued IDENTIFY command. And we time out
> before the drive has a chance to actually process the IDENTIFY.
This sounds like a drive firmware bug. I can't think of any reason for IDENTIFY
to cause a cache flush. Not unless the IDENTIFY return values can be
influenced by pending writes - ie some of the IDENTIFY data is coming
from media.
We have seen some drives take up to 350ms to process a SMART data
request (Vendor fixed this in later firmware). Even the "better" drives
(different vendor) can take up to 100ms for SMART data to be returned.
(Sorry, I can't name vendors - they know who they are and can speak
for themselves.)
> This is a problem for libata, one that will become more common as drives
> get larger and larger caches. Fiddling with timeouts isn't really the
> greatest solution, though it could help.
>
> A more deterministic solution might be to issue a CACHE FLUSH command
> ahead of any ATA_16 command (perhaps other than a R/W command?),
> so that the timeout for the ATA_16 (IDENTIFY in this case) won't have
> to account for an unpredictable write cache on the drive.
I'm getting more worried that faster drives (more throughput and _average_
latency is well below 15ms) will have extremely high latency in the worst
cases despite having no HW errors.
This IDENTIFY case is the third case I'm aware of.
> Or maybe there's a better way. Suggestions ??
Maybe get a SATA trace to confirm it's the drive's fault.
But otherwise no better ideas besides complaining to
the vendor and asking them to fix the problem.
hth,
grant
>
> Meanwhile, I'm installing 2.6.26.2 on the box later today,
> but don't really expect anything to be different with the newer kernel.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ide" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: Flush drive cache before issuing ATA_16 from userspace?
2008-08-19 13:41 Flush drive cache before issuing ATA_16 from userspace? Mark Lord
2008-08-19 16:38 ` Grant Grundler
@ 2008-08-19 22:05 ` Mark Lord
2008-08-20 4:08 ` Tejun Heo
1 sibling, 1 reply; 5+ messages in thread
From: Mark Lord @ 2008-08-19 22:05 UTC (permalink / raw)
To: Jeff Garzik, Tejun Heo, Alan Cox; +Cc: IDE/ATA development list
Mark Lord wrote:
> I have a rather busy MythTV system here, with four tuners
> and a hirez HDTV for the display.
>
> It uses a pair of 750GB Hitachi SATA drives (RAID0) for storage.
>
> I wanted to see how warm the drives get, so I set up a monitoring
> program that invokes hddtemp every 20-30 seconds or so, updating
> a front panel display with the current drive temperature of /dev/sdb.
>
> So far, so good.
>
> But.. when the machine is busy recording a hi-def (17mbit/sec) stream
> whilst also playing back a hi-def stream, libata locks up and resets
> /dev/sdb periodically, say once every minute or so (quite irregular).
> This causes lots of recording bits to be dropped, ruining later playback.
>
> The dual-core system was using 2.6.24.3 (32-bit) at the time,
> and libata.ahci w/NCQ on both drives.
> I retested with "hdparm -Q1 /dev/sd?" and it didn't help -- same problem.
>
> Looking at the system logs, and running a full S.M.A.R.T. test shows
> both drives to be clean (no media faults found), other than libata
> reporting timeouts as here:
>
> ..
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in
> res 40/00:00:00:4f:c2/00:01:00:00:00/00 Emask 0x4 (timeout)
> ata2.00: status: { DRDY }
> ata2: soft resetting link
> ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata2.00: configured for UDMA/133
> ata2: EH complete
> sd 1:0:0:0: [sdb] 1465149168 512-byte hardware sectors (750156 MB)
> sd 1:0:0:0: [sdb] Write Protect is off
> sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
> support DPO or FUA
> ..
>
> That's an IDENTIFY (0xEC) command timing out.
> The hddtemp program does it's work by issuing IDENTIFY and SMART
> commands to the target drive, /dev/sdb in this case.
>
> ioctl(3, 0x30d, 0xbfd2c418)
> ioctl(3, 0x31f, 0xbfd2c60c)
> ioctl(3, 0x31f, 0xbfd2c614)
> ioctl(3, 0x31f, 0xbfd2c408)
>
> So that 0xEC most likely came from the hddtemp program,
> since libata doesn't normally issue them after probing.
>
> So why is it timing out? Well, these drives have 32MB onboard caches,
> and I'm guessing that something (firmware, whatever) tries to empty that
> cache before processing the issued IDENTIFY command. And we time out
> before the drive has a chance to actually process the IDENTIFY.
..
Another possibility could be some kind of bug in libata or ahci.c.
It seems unlikely -- .qc_defer ought to prevent issues -- but I haven't
really poked around in there. And this is a "production" machine :)
so we don't like to use it (much) for debugging kernels if we can help it.
Cheers
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: Flush drive cache before issuing ATA_16 from userspace?
2008-08-19 22:05 ` Mark Lord
@ 2008-08-20 4:08 ` Tejun Heo
2008-08-21 22:02 ` Mark Lord
0 siblings, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2008-08-20 4:08 UTC (permalink / raw)
To: Mark Lord; +Cc: Jeff Garzik, Alan Cox, IDE/ATA development list
Mark Lord wrote:
>> That's an IDENTIFY (0xEC) command timing out.
>> The hddtemp program does it's work by issuing IDENTIFY and SMART
>> commands to the target drive, /dev/sdb in this case.
>>
>> ioctl(3, 0x30d, 0xbfd2c418)
>> ioctl(3, 0x31f, 0xbfd2c60c)
>> ioctl(3, 0x31f, 0xbfd2c614)
>> ioctl(3, 0x31f, 0xbfd2c408)
>>
>> So that 0xEC most likely came from the hddtemp program,
>> since libata doesn't normally issue them after probing.
>>
>> So why is it timing out? Well, these drives have 32MB onboard caches,
>> and I'm guessing that something (firmware, whatever) tries to empty that
>> cache before processing the issued IDENTIFY command. And we time out
>> before the drive has a chance to actually process the IDENTIFY.
> ..
>
> Another possibility could be some kind of bug in libata or ahci.c.
> It seems unlikely -- .qc_defer ought to prevent issues -- but I haven't
> really poked around in there. And this is a "production" machine :)
> so we don't like to use it (much) for debugging kernels if we can help it.
Can you please hack up a program to issue IDENTIFY with different
timeouts and see when the condition triggers?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Flush drive cache before issuing ATA_16 from userspace?
2008-08-20 4:08 ` Tejun Heo
@ 2008-08-21 22:02 ` Mark Lord
0 siblings, 0 replies; 5+ messages in thread
From: Mark Lord @ 2008-08-21 22:02 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jeff Garzik, Alan Cox, IDE/ATA development list
Tejun Heo wrote:
> Mark Lord wrote:
>>> That's an IDENTIFY (0xEC) command timing out.
>>> The hddtemp program does it's work by issuing IDENTIFY and SMART
>>> commands to the target drive, /dev/sdb in this case.
>>>
>>> ioctl(3, 0x30d, 0xbfd2c418)
>>> ioctl(3, 0x31f, 0xbfd2c60c)
>>> ioctl(3, 0x31f, 0xbfd2c614)
>>> ioctl(3, 0x31f, 0xbfd2c408)
>>>
>>> So that 0xEC most likely came from the hddtemp program,
>>> since libata doesn't normally issue them after probing.
>>>
>>> So why is it timing out? Well, these drives have 32MB onboard caches,
>>> and I'm guessing that something (firmware, whatever) tries to empty that
>>> cache before processing the issued IDENTIFY command. And we time out
>>> before the drive has a chance to actually process the IDENTIFY.
>> ..
>>
>> Another possibility could be some kind of bug in libata or ahci.c.
>> It seems unlikely -- .qc_defer ought to prevent issues -- but I haven't
>> really poked around in there. And this is a "production" machine :)
>> so we don't like to use it (much) for debugging kernels if we can help it.
>
> Can you please hack up a program to issue IDENTIFY with different
> timeouts and see when the condition triggers?
..
Yeah, there's a bunch of stuff like that for me to do, someday. :)
Meanwhile, summer finally arrived here today, so..
Cheers
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2008-08-21 22:02 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-08-19 13:41 Flush drive cache before issuing ATA_16 from userspace? Mark Lord
2008-08-19 16:38 ` Grant Grundler
2008-08-19 22:05 ` Mark Lord
2008-08-20 4:08 ` Tejun Heo
2008-08-21 22:02 ` Mark Lord
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).