From: Tejun Heo <htejun@gmail.com>
To: Elias Oltmanns <eo@nebensachen.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>,
Andrew Morton <akpm@linux-foundation.org>,
Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>,
Jeff Garzik <jeff@garzik.org>,
Randy Dunlap <randy.dunlap@oracle.com>,
linux-ide@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4] libata: Implement disk shock protection support
Date: Fri, 05 Sep 2008 10:51:04 +0200 [thread overview]
Message-ID: <48C0F2F8.1040308@gmail.com> (raw)
In-Reply-To: <87ej3zrf3o.fsf@denkblock.local>
Elias Oltmanns wrote:
> Tejun Heo <htejun@gmail.com> wrote:
>> extern struct device_attribute **libata_sdev_attrs;
>>
>> #define ATA_BASE_SHT(name) \
>> ....
>> .sdev_attrs = libata_sdev_attrs; \
>> ....
>>
>> Which will give unload_heads to all libata drivers. As ahci needs its
>> own node it would need to define its own sdev_attrs tho.
>
> Dear me, I totally forgot about that, didn't I. Anyway, I meant to ask
> you about that when you mentioned it the last time round, so thanks for
> explaining in more detail. I'll do it this way then.
Great.
>> Isn't seconds a bit too crude? Or it just doesn't matter as it's
>> usually adjusted before expiring? For most time interval values
>> (except for transfer timings of course) in ATA land, millisecs seem to
>> be good enough and I've been trying to unify things that direction.
>
> Well, I can see your point. Technically, we are talking about magnitudes
> in the order of seconds rather than milliseconds here because the specs
> only guarantee command completion for head unload in 300 or even 500
> msecs. This means that the daemon should always schedule timeouts well
> above this limit. That's the reason why we have only accepted timeouts
> in seconds rather than milliseconds at the user's request. When reading
> from sysfs, we have returned seconds for consistency. I'm a bit torn
> between the options now:
>
> 1. Switch the interface completely to msecs: consistent with the rest of
> libata but slightly misleading because it may promise more accuracy
> than we can actually provide for;
> 2. keep it the way it was (i.e. seconds on read and write): we don't
> promise too much as far as accuracy is concerned, but it is
> inconsistent with the rest of libata. Besides, user space can still
> issue a 0 and another nonzero timeout within a very short time and we
> don't protect against that anyway;
> 3. only switch to msecs on read: probably the worst of all options.
>
> What do you think?
My favorite is #1. Millisecond is small amount of time but it's also
not hard to imagine some future cases where, say, 0.5 sec of
granuality makes some difference.
>> Hmmm... Sorry to bring another issue with it but I think the interface
>> is a bit convoluted. The unpark node is per-dev but the action is
>> per-port but devices can opt out by writing -2. Also, although the
>> sysfs nodes are per-dev, writing to a node changes the value of park
>> node in the device sharing the port except when the value is -1 or -2.
>> That's strange, right?
>
> Well, it is strange, but it pretty much reflects reality as close as it
> can get. Devices can only opt in / out of actually issuing the unload
> command but they will always stop I/O and thus be affected by the
> timeout (intentionally).
>
>> How about something like the following?
>>
>> * In park_store: set dev->unpark_timeout, kick and wake up EH.
>>
>> * In park EH action: until the latest of all unpark_timeout are
>> passed, park all drives whose unpark_timeout is in future. When
>> none of the drives needs to be parked (all timers expired), the
>> action completes.
>>
>> * There probably needs to be a flag to indicate that the timeout is
>> valid; otherwise, we could get spurious head unparking after jiffies
>> wraps (or maybe just use jiffies_64?).
>>
>> With something like the above, the interface is cleanly per-dev and we
>> wouldn't need -1/-2 special cases. The implementation is still
>> per-port but we can change that later without modifying userland
>> interface.
>
> First of all, we cannot do a proper per-dev implementation internally.
Not yet but I think we should move toward per-queue EH which will
enable fine-grained exception handling like this. Such approach would
also help things like ATAPI CHECK_SENSE behind PMP. I think it's
better to define the interface which suits the problem best rather
than reflects the current implementation.
> Admittedly, we could do it per-link rather than per-port, but the point
> I'm making is this: there really is just *one* grobal timeout (per-port
> now or perhaps per-link in the long run). The confusing thing right now
> is that you can read the current timeout on any device, but you can only
> set a timeout on a device that actually supports head unloading. Perhaps
> we should return something like "n/a" when reading the sysfs attribute
> for a device that doesn't support head unloads, even though a timer on
> that port may be running because the other device has just received an
> unload request. This way, both devices will be affected by the timeout,
> but you can only read it on the device where you can change it as well.
> Would that suit you?
If the timeout is global, it's best to have one knob. If the timeout
is per-port, it's best to have one knob per-port, and so on. I can't
think of a good reason to implement per-port timeout with per-device
opt out instead of doing per-device timeout from the beginning. It
just doesn't make much sense interface-wise to me. As this is an
interface which is gonna stick around for a long time, I really think
it should be done as straight forward as possible even though the
current implementation of the feature has to do it in more crude
manner.
Thanks.
--
tejun
next prev parent reply other threads:[~2008-09-05 8:51 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-08-29 21:11 [RFC] Disk shock protection in GNU/Linux (take 2) Elias Oltmanns
2008-08-29 21:16 ` [PATCH 1/4] Introduce ata_id_has_unload() Elias Oltmanns
2008-08-30 11:56 ` Sergei Shtylyov
2008-08-30 17:29 ` Elias Oltmanns
2008-08-30 18:01 ` Sergei Shtylyov
2008-08-29 21:20 ` [PATCH 2/4] libata: Implement disk shock protection support Elias Oltmanns
2008-08-30 9:33 ` Tejun Heo
2008-08-30 23:38 ` Elias Oltmanns
2008-08-31 9:25 ` Tejun Heo
2008-08-31 12:08 ` Elias Oltmanns
2008-08-31 13:03 ` Tejun Heo
2008-08-31 14:32 ` Bartlomiej Zolnierkiewicz
2008-08-31 17:07 ` Elias Oltmanns
2008-08-31 19:35 ` Bartlomiej Zolnierkiewicz
2008-09-01 15:41 ` Elias Oltmanns
2008-09-01 2:08 ` Henrique de Moraes Holschuh
2008-09-01 9:37 ` Matthew Garrett
2008-08-31 16:14 ` Elias Oltmanns
2008-09-01 8:33 ` Tejun Heo
2008-09-01 14:51 ` Elias Oltmanns
2008-09-01 16:43 ` Tejun Heo
2008-09-03 20:23 ` Elias Oltmanns
2008-09-04 9:06 ` Tejun Heo
2008-09-04 17:32 ` Elias Oltmanns
2008-09-05 8:51 ` Tejun Heo [this message]
2008-09-10 13:53 ` Elias Oltmanns
2008-09-10 14:40 ` Tejun Heo
2008-09-10 19:28 ` Elias Oltmanns
2008-09-10 20:23 ` Tejun Heo
2008-09-10 21:04 ` Elias Oltmanns
2008-09-10 22:56 ` Tejun Heo
2008-09-11 12:26 ` Elias Oltmanns
2008-09-11 12:51 ` Tejun Heo
2008-09-11 13:01 ` Tejun Heo
2008-09-11 18:28 ` Valdis.Kletnieks
2008-09-11 23:25 ` Tejun Heo
2008-09-12 10:15 ` Elias Oltmanns
2008-09-12 18:11 ` Valdis.Kletnieks
2008-09-17 15:26 ` Elias Oltmanns
2008-08-29 21:26 ` [PATCH 3/4] ide: " Elias Oltmanns
2008-09-01 19:29 ` Bartlomiej Zolnierkiewicz
2008-09-03 20:01 ` Elias Oltmanns
2008-09-03 21:33 ` Elias Oltmanns
2008-09-05 17:33 ` Bartlomiej Zolnierkiewicz
2008-09-12 9:55 ` Elias Oltmanns
2008-09-12 11:55 ` Elias Oltmanns
2008-09-15 19:15 ` Elias Oltmanns
2008-09-15 23:22 ` Bartlomiej Zolnierkiewicz
2008-09-17 15:28 ` Elias Oltmanns
2008-08-29 21:28 ` [PATCH 4/4] Add documentation for hard disk shock protection interface Elias Oltmanns
2008-09-08 22:04 ` Randy Dunlap
2008-09-16 16:53 ` Elias Oltmanns
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=48C0F2F8.1040308@gmail.com \
--to=htejun@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=bzolnier@gmail.com \
--cc=eo@nebensachen.de \
--cc=jeff@garzik.org \
--cc=linux-ide@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=randy.dunlap@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).