linux-ide.vger.kernel.org archive mirror
From: Elias Oltmanns <eo@nebensachen.de>
To: Tejun Heo <htejun@gmail.com>,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	Jens Axboe <jens.axboe@oracle.com>
Cc: linux-ide@vger.kernel.org, linux-scsi@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: Prevent busy looping
Date: Thu, 17 Apr 2008 10:50:20 +0200
Message-ID: <87ve2gc1bn.fsf@denkblock.local>
In-Reply-To: 20080417071335.GR12774@kernel.dk

Jens Axboe <jens.axboe@oracle.com> wrote:
> On Thu, Apr 17 2008, Elias Oltmanns wrote:
>> Jens Axboe <jens.axboe@oracle.com> wrote:
>> > On Wed, Apr 16 2008, Elias Oltmanns wrote:
>> >> blk_run_queue() as well as blk_start_queue() plug the device on reentry
>> >> and schedule blk_unplug_work() right afterwards. However,
>> >> blk_plug_device() takes care of that already and makes sure that there is
>> >> a short delay before blk_unplug_work() is scheduled. This is important
>> >> to prevent busy looping and possibly system lockups as observed here:
>> >> <http://permalink.gmane.org/gmane.linux.ide/28351>.
>> >
>> > If you call blk_start_queue() and blk_run_queue(), you better mean it.
>> > There should be no delay. The only reason it does blk_plug_device() is
>> > so that the work queue function will actually do some work.
>> 
>> Well, I'm mainly concerned with blk_run_queue(). In a comment it says
>> that it should recurse only once so as not to overrun the stack. On my
>> machine, however, immediate rescheduling may have exactly as disastrous
>> consequences as an overrunning stack would have since the system locks
>> up completely.
>> 
>> Just to get this straight: Are low level drivers allowed to rely on
>> blk_run_queue() that there will be no loops or do they have to make sure
>> that this function is not called from the request_fn() of the same
>> queue?
>
> It's not really designed for being called recursively. Which isn't the
> problem imo, the problem is SCSI apparently being dumb and calling
> blk_run_queue() all the time. blk_run_queue() must run the queue NOW. If
> SCSI wants something like 'run the queue in a bit', it should use
> blk_plug_device() instead.

James would probably argue that this is alright as long as
max_device_blocked and max_host_blocked are bigger than one.
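
To illustrate why a value bigger than one matters, here is a toy
userspace model of the blocked countdown. It is only a sketch: the
toy_* names are made up for this example, and the logic merely mimics
the countdown as I understand it from scsi_dev_queue_ready(); it is not
the midlayer code. The assumption is that a blocked device is retried
immediately once the counter reaches zero, and that every other pass
plugs the queue and therefore waits for the unplug delay.

```c
/* Toy stand-in for the relevant scsi_device fields (hypothetical). */
struct toy_sdev {
	int device_busy;        /* commands currently outstanding */
	int device_blocked;     /* countdown, reloaded when the LLD is busy */
	int max_device_blocked; /* reload value (1 before the patch, 2 after) */
};

/* Mark the device blocked, as happens when a command is deferred. */
static void toy_block(struct toy_sdev *sdev)
{
	sdev->device_blocked = sdev->max_device_blocked;
}

/*
 * Return 1 if a request may be dispatched right now, 0 if the queue
 * would be plugged and retried after the unplug delay.
 */
static int toy_queue_ready(struct toy_sdev *sdev)
{
	if (sdev->device_busy == 0 && sdev->device_blocked) {
		/* Unblock only once the counter has run down to zero. */
		if (--sdev->device_blocked != 0)
			return 0;
	}
	return 1;
}

/* Count how many delayed (plugged) passes precede the next dispatch. */
static int toy_delayed_passes(int max_blocked)
{
	struct toy_sdev sdev = { 0, 0, max_blocked };
	int delayed = 0;

	toy_block(&sdev);
	while (!toy_queue_ready(&sdev))
		delayed++;
	return delayed;
}
```

In this model toy_delayed_passes(1) yields no delayed pass at all, so a
blocked device is retried straight away, whereas toy_delayed_passes(2)
yields exactly one delayed pass before the retry.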

>
>> > In the newer kernels we just do:
>> >
>> >         set_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags);
>> >         kblockd_schedule_work(q, &q->unplug_work);
>> >
>> > instead, which is much better.
>> 
>> Only as long as it doesn't get called from the request_fn() of the same
>> queue. Otherwise, there may be no chance for other threads to clear the
>> condition that caused blk_run_queue() to be called in the first place.
>
> Broken usage.

Right. Tejun, would it be possible to apply the patch below (2.6.25) or
do you see any alternative?

Regards,

Elias


[-- Attachment #2: adjust-blocked-counters.patch --]
[-- Type: text/x-patch, Size: 821 bytes --]

---

 drivers/ata/libata-scsi.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 1579539..ce865e9 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -831,7 +831,7 @@ static void ata_scsi_sdev_config(struct scsi_device *sdev)
 	 * prevent SCSI midlayer from automatically deferring
 	 * requests.
 	 */
-	sdev->max_device_blocked = 1;
+	sdev->max_device_blocked = 2;
 }
 
 /**
@@ -3206,7 +3206,7 @@ int ata_scsi_add_hosts(struct ata_host *host, struct scsi_host_template *sht)
 		 * Set host_blocked to 1 to prevent SCSI midlayer from
 		 * automatically deferring requests.
 		 */
-		shost->max_host_blocked = 1;
+		shost->max_host_blocked = 2;
 
 		rc = scsi_add_host(ap->scsi_host, ap->host->dev);
 		if (rc)
