From: Elias Oltmanns <eo@nebensachen.de>
To: Tejun Heo <htejun@gmail.com>
Cc: linux-ide@vger.kernel.org
Subject: Re: Current qc_defer implementation may lead to infinite recursion
Date: Mon, 11 Feb 2008 08:57:01 +0100 [thread overview]
Message-ID: <871w7k9b8y.fsf@denkblock.local> (raw)
In-Reply-To: <47AFD7C1.7070204@gmail.com> (Tejun Heo's message of "Mon, 11 Feb 2008 14:06:09 +0900")
Tejun Heo <htejun@gmail.com> wrote:
> Elias Oltmanns wrote:
>> Hi Tejun,
>>
>> due to your commit 31cc23b34913bc173680bdc87af79e551bf8cc0d libata now
>> sets max_host_blocked and max_device_blocked to 1 for all devices it
>> manages. Under certain conditions this may lead to system lockups due to
>> infinite recursion as I have explained to James on the scsi list (kept
>> you cc-ed). James told me that it was the business of libata to make
>> sure that such a recursion cannot happen.
>>
>> In my discussion with James I imprudently claimed that this was easy to
>> fix in libata. However, after giving the matter some thought, I'm not at
>> all sure as to what exactly should be done about it. The easy bit is
>> that max_host_blocked and max_device_blocked should be left alone as
>> long as the low level driver does not provide the ->qc_defer() callback.
>> But even if the driver has defined this callback, ata_std_qc_defer() for
>> one will not prevent this recursion on a uniprocessor, whereas things
>> might work out well on an SMP system due to the lock fiddling in the
>> scsi midlayer.
>>
>> As a conclusion, the current implementation makes it imperative to leave
>> max_host_blocked and max_device_blocked alone on a uniprocessor system.
>> For SMP systems the current implementation might just be fine but even
>> there it might just as well be a good idea to make the adjustment
>> depending on ->qc_defer != NULL.
>
> Hmmm... The reason why max_host_blocked and max_device_blocked are set
> to 1 is to let libata re-consider status after each command completion
> as blocked status can be rather complex w/ PMP. I haven't really
> followed the code yet but you're saying that blocked count of 2 should
> be used for that behavior, right?
Not quite. On an SMP system the current implementation will probably do
exactly what you had in mind. In particular, setting max_device_blocked
and max_host_blocked to 1 seems to be the right thing to do in this
case.
>
> Another strange thing is that there hasn't been any such lock up /
> infinite recursion report till now although ->qc_defer mechanism bas
> been used widely for some time now. Can you reproduce the problem w/o
> the disk shock protection?
No, unfortunately, I'm unable to reproduce this without the patch on my
machine. This is for purely technical reasons though because I'm using
ata_piix. Running a vanilla kernel, I'd expect everything to work just
fine except for one case: A non-SMP system using a driver that provides
the ->qc_defer() callback. Currently, the ->qc_defer() callback is the
only thing that can possibly send a non zero return value to the scsi
midlayer. Once it does, however, the driver will only get a chance to
complete some qcs before ->qc_defer() is called again provided that
multithreading is supported.
So, what I'm saying is this: If the low level driver doesn't provide a
->qc_defer() callback, there is no (obvious) reason why
max_device_blocked and max_host_blocked should be set to 1 since libata
won't gain anything by it. However, it is not a bug either, even though
James considers it suboptimal and I will have to think about a solution
for my patch. On the other hand, once a driver defines the ->qc_defer()
callback, we really have a bug because things will go wrong once
->qc_defer() returns non zero on a uniprocessor. So, in this case
max_device_blocked and max_host_blocked should be set to 1 on an SMP
system and *have to* be bigger than 1 otherwise.
Regards,
Elias
next prev parent reply other threads:[~2008-02-11 7:57 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-02-10 17:54 Current qc_defer implementation may lead to infinite recursion Elias Oltmanns
2008-02-11 5:06 ` Tejun Heo
2008-02-11 7:57 ` Elias Oltmanns [this message]
2008-02-11 8:43 ` Tejun Heo
2008-02-11 22:03 ` Elias Oltmanns
2008-02-12 1:14 ` Tejun Heo
2008-02-12 8:57 ` Elias Oltmanns
2008-02-12 9:05 ` Tejun Heo
2008-02-12 9:43 ` Elias Oltmanns
2008-02-12 12:56 ` Tejun Heo
2008-02-18 20:03 ` Elias Oltmanns
2008-04-12 8:02 ` Elias Oltmanns
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=871w7k9b8y.fsf@denkblock.local \
--to=eo@nebensachen.de \
--cc=htejun@gmail.com \
--cc=linux-ide@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).