Re: [PATCH 2/4] libata: Implement disk shock protection support

linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Tejun Heo <htejun@gmail.com>
To: Elias Oltmanns <eo@nebensachen.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>,
	Jeff Garzik <jeff@garzik.org>,
	Randy Dunlap <randy.dunlap@oracle.com>,
	linux-ide@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4] libata: Implement disk shock protection support
Date: Fri, 05 Sep 2008 10:51:04 +0200	[thread overview]
Message-ID: <48C0F2F8.1040308@gmail.com> (raw)
In-Reply-To: <87ej3zrf3o.fsf@denkblock.local>

Elias Oltmanns wrote:
> Tejun Heo <htejun@gmail.com> wrote:
>>  extern struct device_attribute **libata_sdev_attrs;
>>
>>  #define ATA_BASE_SHT(name)				\
>>  ....
>> 	.sdev_attrs		= libata_sdev_attrs;	\
>>  ....
>>
>> Which will give unload_heads to all libata drivers.  As ahci needs its
>> own node it would need to define its own sdev_attrs tho.
> 
> Dear me, I totally forgot about that, didn't I. Anyway, I meant to ask
> you about that when you mentioned it the last time round, so thanks for
> explaining in more detail. I'll do it this way then.

Great.

>> Isn't seconds a bit too crude? Or it just doesn't matter as it's
>> usually adjusted before expiring?  For most time interval values
>> (except for transfer timings of course) in ATA land, millisecs seem to
>> be good enough and I've been trying to unify things that direction.
> 
> Well, I can see your point. Technically, we are talking about magnitudes
> in the order of seconds rather than milliseconds here because the specs
> only guarantee command completion for head unload in 300 or even 500
> msecs. This means that the daemon should always schedule timeouts well
> above this limit. That's the reason why we have only accepted timeouts
> in seconds rather than milliseconds at the user's request. When reading
> from sysfs, we have returned seconds for consistency. I'm a bit torn
> between the options now:
> 
> 1. Switch the interface completely to msecs: consistent with the rest of
>    libata but slightly misleading because it may promise more accuracy
>    than we can actually provide for;
> 2. keep it the way it was (i.e. seconds on read and write): we don't
>    promise too much as far as accuracy is concerned, but it is
>    inconsistent with the rest of libata. Besides, user space can still
>    issue a 0 and another nonzero timeout within a very short time and we
>    don't protect against that anyway;
> 3. only switch to msecs on read: probably the worst of all options.
> 
> What do you think?

My favorite is #1.  Millisecond is small amount of time but it's also
not hard to imagine some future cases where, say, 0.5 sec of
granuality makes some difference.

>> Hmmm... Sorry to bring another issue with it but I think the interface
>> is a bit convoluted.  The unpark node is per-dev but the action is
>> per-port but devices can opt out by writing -2.  Also, although the
>> sysfs nodes are per-dev, writing to a node changes the value of park
>> node in the device sharing the port except when the value is -1 or -2.
>> That's strange, right?
> 
> Well, it is strange, but it pretty much reflects reality as close as it
> can get. Devices can only opt in / out of actually issuing the unload
> command but they will always stop I/O and thus be affected by the
> timeout (intentionally).
> 
>> How about something like the following?
>>
>> * In park_store: set dev->unpark_timeout, kick and wake up EH.
>>
>> * In park EH action: until the latest of all unpark_timeout are
>>   passed, park all drives whose unpark_timeout is in future.  When
>>   none of the drives needs to be parked (all timers expired), the
>>   action completes.
>>
>> * There probably needs to be a flag to indicate that the timeout is
>>   valid; otherwise, we could get spurious head unparking after jiffies
>>   wraps (or maybe just use jiffies_64?).
>>
>> With something like the above, the interface is cleanly per-dev and we
>> wouldn't need -1/-2 special cases.  The implementation is still
>> per-port but we can change that later without modifying userland
>> interface.
> 
> First of all, we cannot do a proper per-dev implementation internally.

Not yet but I think we should move toward per-queue EH which will
enable fine-grained exception handling like this.  Such approach would
also help things like ATAPI CHECK_SENSE behind PMP.  I think it's
better to define the interface which suits the problem best rather
than reflects the current implementation.

> Admittedly, we could do it per-link rather than per-port, but the point
> I'm making is this: there really is just *one* grobal timeout (per-port
> now or perhaps per-link in the long run). The confusing thing right now
> is that you can read the current timeout on any device, but you can only
> set a timeout on a device that actually supports head unloading. Perhaps
> we should return something like "n/a" when reading the sysfs attribute
> for a device that doesn't support head unloads, even though a timer on
> that port may be running because the other device has just received an
> unload request. This way, both devices will be affected by the timeout,
> but you can only read it on the device where you can change it as well.
> Would that suit you?

If the timeout is global, it's best to have one knob.  If the timeout
is per-port, it's best to have one knob per-port, and so on.  I can't
think of a good reason to implement per-port timeout with per-device
opt out instead of doing per-device timeout from the beginning.  It
just doesn't make much sense interface-wise to me.  As this is an
interface which is gonna stick around for a long time, I really think
it should be done as straight forward as possible even though the
current implementation of the feature has to do it in more crude
manner.

Thanks.

-- 
tejun

next prev parent reply	other threads:[~2008-09-05  8:51 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-29 21:11 [RFC] Disk shock protection in GNU/Linux (take 2) Elias Oltmanns
2008-08-29 21:16 ` [PATCH 1/4] Introduce ata_id_has_unload() Elias Oltmanns
2008-08-30 11:56   ` Sergei Shtylyov
2008-08-30 17:29     ` Elias Oltmanns
2008-08-30 18:01       ` Sergei Shtylyov
2008-08-29 21:20 ` [PATCH 2/4] libata: Implement disk shock protection support Elias Oltmanns
2008-08-30  9:33   ` Tejun Heo
2008-08-30 23:38     ` Elias Oltmanns
2008-08-31  9:25       ` Tejun Heo
2008-08-31 12:08         ` Elias Oltmanns
2008-08-31 13:03           ` Tejun Heo
2008-08-31 14:32             ` Bartlomiej Zolnierkiewicz
2008-08-31 17:07               ` Elias Oltmanns
2008-08-31 19:35                 ` Bartlomiej Zolnierkiewicz
2008-09-01 15:41                   ` Elias Oltmanns
2008-09-01  2:08                 ` Henrique de Moraes Holschuh
2008-09-01  9:37                   ` Matthew Garrett
2008-08-31 16:14             ` Elias Oltmanns
2008-09-01  8:33               ` Tejun Heo
2008-09-01 14:51                 ` Elias Oltmanns
2008-09-01 16:43                   ` Tejun Heo
2008-09-03 20:23                     ` Elias Oltmanns
2008-09-04  9:06                       ` Tejun Heo
2008-09-04 17:32                         ` Elias Oltmanns
2008-09-05  8:51                           ` Tejun Heo [this message]
2008-09-10 13:53                             ` Elias Oltmanns
2008-09-10 14:40                               ` Tejun Heo
2008-09-10 19:28                                 ` Elias Oltmanns
2008-09-10 20:23                                   ` Tejun Heo
2008-09-10 21:04                                     ` Elias Oltmanns
2008-09-10 22:56                                       ` Tejun Heo
2008-09-11 12:26                                         ` Elias Oltmanns
2008-09-11 12:51                                           ` Tejun Heo
2008-09-11 13:01                                             ` Tejun Heo
2008-09-11 18:28                                               ` Valdis.Kletnieks
2008-09-11 23:25                                                 ` Tejun Heo
2008-09-12 10:15                                                   ` Elias Oltmanns
2008-09-12 18:11                                                     ` Valdis.Kletnieks
2008-09-17 15:26                                           ` Elias Oltmanns
2008-08-29 21:26 ` [PATCH 3/4] ide: " Elias Oltmanns
2008-09-01 19:29   ` Bartlomiej Zolnierkiewicz
2008-09-03 20:01     ` Elias Oltmanns
2008-09-03 21:33       ` Elias Oltmanns
2008-09-05 17:33       ` Bartlomiej Zolnierkiewicz
2008-09-12  9:55         ` Elias Oltmanns
2008-09-12 11:55           ` Elias Oltmanns
2008-09-15 19:15           ` Elias Oltmanns
2008-09-15 23:22             ` Bartlomiej Zolnierkiewicz
2008-09-17 15:28           ` Elias Oltmanns
2008-08-29 21:28 ` [PATCH 4/4] Add documentation for hard disk shock protection interface Elias Oltmanns
2008-09-08 22:04   ` Randy Dunlap
2008-09-16 16:53     ` Elias Oltmanns

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48C0F2F8.1040308@gmail.com \
    --to=htejun@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=bzolnier@gmail.com \
    --cc=eo@nebensachen.de \
    --cc=jeff@garzik.org \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=randy.dunlap@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).