Re: [RFC] [PATCH 1/2] blk request timeout handler patches

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Jens Axboe <jens.axboe@oracle.com>
To: linux-scsi@vger.kernel.org
Subject: Re: [RFC] [PATCH 1/2] blk request timeout handler patches
Date: Mon, 8 Oct 2007 09:04:41 +0200	[thread overview]
Message-ID: <20071008070441.GA25784@kernel.dk> (raw)
In-Reply-To: <20071008065424.GA19845@us.ibm.com>

On Sun, Oct 07 2007, malahal@us.ibm.com wrote:
> Jens Axboe [jens.axboe@oracle.com] wrote:
> > On Thu, Oct 04 2007, malahal@us.ibm.com wrote:
> > > Mike Christie's patches refreshed to 2.6.23-rc8-mm1.
> > > 
> > > Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
> > > Signed-off-by: Malahal Naineni <malahal@us.ibm.com>
> > > 
> > > 
> > > diff -r 3697367c6e4d block/ll_rw_blk.c
> > > --- a/block/ll_rw_blk.c	Thu Sep 27 00:12:13 2007 -0700
> > > +++ b/block/ll_rw_blk.c	Thu Sep 27 00:13:07 2007 -0700
> > > @@ -181,6 +181,19 @@ void blk_queue_softirq_done(struct reque
> > >  
> > >  EXPORT_SYMBOL(blk_queue_softirq_done);
> > >  
> > > +void blk_queue_rq_timeout(struct request_queue *q, unsigned int timeout)
> > > +{
> > > +	q->rq_timeout = timeout;
> > > +}
> > > +EXPORT_SYMBOL_GPL(blk_queue_rq_timeout);
> > > +
> > > +void blk_queue_rq_timed_out(struct request_queue *q, rq_timed_out_fn *fn)
> > > +{
> > > +	q->rq_timed_out_fn = fn;
> > > +}
> > > +
> > > +EXPORT_SYMBOL_GPL(blk_queue_rq_timed_out);
> > > +
> > >  /**
> > >   * blk_queue_make_request - define an alternate make_request function for a device
> > >   * @q:  the request queue for the device to be affected
> > > @@ -243,7 +256,9 @@ static void rq_init(struct request_queue
> > >  {
> > >  	INIT_LIST_HEAD(&rq->queuelist);
> > >  	INIT_LIST_HEAD(&rq->donelist);
> > > -
> > > +	init_timer(&rq->timer);
> > > +
> > > +	rq->timeout = 0;
> > >  	rq->errors = 0;
> > >  	rq->bio = rq->biotail = NULL;
> > >  	INIT_HLIST_NODE(&rq->hash);
> > > @@ -2305,6 +2320,7 @@ EXPORT_SYMBOL(blk_start_queueing);
> > >   */
> > >  void blk_requeue_request(struct request_queue *q, struct request *rq)
> > >  {
> > > +	blk_delete_timer(rq);
> > >  	blk_add_trace_rq(q, rq, BLK_TA_REQUEUE);
> > >  
> > >  	if (blk_rq_tagged(rq))
> > > @@ -3647,8 +3663,121 @@ static struct notifier_block blk_cpu_not
> > >  };
> > >  
> > >  /**
> > > + * blk_delete_timer - Delete/cancel timer for a given function.
> > > + * @req:	request that we are canceling timer for
> > > + *
> > > + * Return value:
> > > + *     1 if we were able to detach the timer.  0 if we blew it, and the
> > > + *     timer function has already started to run.
> > > + **/
> > > +int blk_delete_timer(struct request *req)
> > > +{
> > > +	int rtn;
> > > +
> > > +	if (!req->q->rq_timed_out_fn)
> > > +		return 1;
> > > +
> > > +	rtn = del_timer(&req->timer);
> > > +	req->timer.data = (unsigned long)NULL;
> > > +	req->timer.function = NULL;
> > > +
> > > +	return rtn;
> > > +}
> > 
> > Hmm, that looks fishy. What if the timer _just_ fired, yet it hasn't
> > retrieved ->data yet?
> > 
> 
> Looks like it is just copied from the SCSI code. More over _run_timers()
> copies data and function on to stack variables before calling the timer
> function. So, I don't think there is a problem here but assuming that
> implementation detail is bad! I would like modify the above function to
> reinitialize data and function fields only if del_timer() returns TRUE.
> Please let me know if that is OK.

I guess it could be OK, but it then deserves a big fat comment
describing why it is ok.

> > > +
> > > +EXPORT_SYMBOL_GPL(blk_delete_timer);
> > > +
> > > +static void blk_rq_timed_out(struct request *req)
> > > +{
> > > +	struct request_queue *q = req->q;
> > > +
> > > +	switch (q->rq_timed_out_fn(req)) {
> > > +	case BLK_EH_HANDLED:
> > > +		__blk_complete_request(req);
> > > +		return;
> > > +	case BLK_EH_RESET_TIMER:
> > > +		blk_add_timer(req);
> > > +		return;
> > > +	case BLK_EH_NOT_HANDLED:
> > > +		/*
> > > +		 * LLD handles this for now but in the future
> > > +		 * we can send a request msg to abort the command
> > > +		 * and we can move more of the generic scsi eh code to
> > > +		 * the blk layer.
> > > +		 */
> > > +		return;
> > > +	}
> > > +}
> > > +
> > > +/**
> > > + * blk_abort_req -- Request request recovery for the specified command
> > > + * req:		pointer to the request of interest
> > > + *
> > > + * This function requests that the block layer start recovery for the
> > > + * request by deleting the timer and calling the q's timeout function.
> > > + * LLDDs who implement their own error recovery MAY ignore the timeout
> > > + * event if they generated blk_abort_req.
> > > + */
> > > +void blk_abort_req(struct request *req)
> > > +{
> > > +        if (!blk_delete_timer(req))
> > > +                return;
> > > +        blk_rq_timed_out(req);
> > > +}
> > > +
> > > +EXPORT_SYMBOL_GPL(blk_abort_req);
> > > +
> > > +/**
> > > + * blk_add_timer - Start timeout timer for a single request
> > > + * @req:	request that is about to start running.
> > > + *
> > > + * Notes:
> > > + *    Each request has its own timer, and as it is added to the queue, we
> > > + *    set up the timer.  When the request completes, we cancel the timer.
> > > + **/
> > > +void blk_add_timer(struct request *req)
> > > +{
> > > +	struct request_queue *q = req->q;
> > > +
> > > +	/*
> > > +	 * If the clock was already running for this command, then
> > > +	 * first delete the timer.  The timer handling code gets rather
> > > +	 * confused if we don't do this.
> > > +	 */
> > > +	if (req->timer.function)
> > > +		del_timer(&req->timer);
> > > +
> > > +	req->timer.data = (unsigned long)req;
> > > +	if (req->timeout)
> > > +		req->timer.expires = jiffies + req->timeout;
> > > +	else
> > > +		req->timer.expires = jiffies + q->rq_timeout;
> > > +	req->timer.function = (void (*)(unsigned long))blk_rq_timed_out;
> > 
> > Why the cast?
> > 
> 
> Because blk_rq_timed_out() doesn't take "unsigned long" as its input
> parameter. Again, copied from SCSI.

Don't copy stuff from SCSI just because you can, it's allowed to fix
things up along the way! I'd propose adding blk_rq_timed_out_timer()
ala:

static void blk_rq_timed_out_timer(unsigned long data)
{
        struct request *rq = (struct request *) data;

        blk_rq_timed_out(rq);
}

and using that to get rid of the cast.

> 
> > > +        add_timer(&req->timer);
> > 
> > You have space vs tab issues.
> > 
> > > +}
> > 
> > That's an ugly interface. I'd say it's a bug to call blk_add_timer()
> > with it already pending, so fix the API.
> 
> blk_add_timer is also called from the timer function (blk_rq_timed_out) if
> BLK_EH_RESET_TIMER is retuned. This is also copied from SCSI! Please let me
> know if you want to do it some other way.

I thought I just did :-)

Delete the timer first, the API you are proposing isn't very nice.

-- 
Jens Axboe

next prev parent reply	other threads:[~2007-10-08  7:03 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-10-04 18:12 [RFC] [PATCH 0/2] blk request timeout handler patches malahal
2007-10-04 18:17 ` [RFC] [PATCH 1/2] " malahal
2007-10-04 18:52   ` Randy Dunlap
2007-10-04 20:40   ` Salyzyn, Mark
2007-10-05 12:49   ` Jens Axboe
2007-10-08  6:54     ` malahal
2007-10-08  7:04       ` Jens Axboe [this message]
2007-10-09  5:36     ` [RFC] [PATCH 1/1] " malahal
2007-10-09  9:14       ` Jens Axboe
2007-10-09 14:26         ` malahal
2007-10-09 12:00       ` Matthew Wilcox
2007-10-09 12:15         ` Jens Axboe
2007-10-09 15:56           ` James Bottomley
2007-10-09 17:23             ` malahal
2007-10-10 12:25             ` Jens Axboe
2007-10-10 16:58               ` malahal
2007-10-10 17:04                 ` Jens Axboe
2007-10-11 18:01               ` malahal
2007-10-11 18:24                 ` Jens Axboe
2007-10-11 18:33                   ` Jens Axboe
2007-10-23  1:45                     ` malahal
2007-10-23  6:30                       ` malahal
2007-10-23 11:59                       ` Jens Axboe
2007-10-05 12:50   ` [RFC] [PATCH 1/2] " Jens Axboe
2007-10-04 18:20 ` [RFC] [PATCH 2/2] " malahal
2007-10-04 18:32   ` Randy Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20071008070441.GA25784@kernel.dk \
    --to=jens.axboe@oracle.com \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).