All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Bart Van Assche <Bart.VanAssche@wdc.com>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"israelr@mellanox.com" <israelr@mellanox.com>,
	"sagi@grimberg.me" <sagi@grimberg.me>, "hch@lst.de" <hch@lst.de>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	"maxg@mellanox.com" <maxg@mellanox.com>,
	"tj@kernel.org" <tj@kernel.org>
Subject: Re: [PATCH v4] blk-mq: Fix race conditions in request timeout handling
Date: Tue, 10 Apr 2018 21:55:47 +0800	[thread overview]
Message-ID: <20180410135541.GA22340@ming.t460p> (raw)
In-Reply-To: <b36ea1107be4f32fb04fffdde3ee45a2e0fcc36c.camel@wdc.com>

On Tue, Apr 10, 2018 at 12:58:04PM +0000, Bart Van Assche wrote:
> On Tue, 2018-04-10 at 16:41 +0800, Ming Lei wrote:
> > On Mon, Apr 09, 2018 at 06:34:55PM -0700, Bart Van Assche wrote:
> > > If a completion occurs after blk_mq_rq_timed_out() has reset
> > > rq->aborted_gstate and the request is again in flight when the timeout
> > 
> > Given rq->aborted_gstate is reset only for BLK_EH_RESET_TIMER, I
> > think you are addressing the race between normal completion and
> > the timeout of BLK_EH_RESET_TIMER.
> 
> Yes, that's correct. I will make this more explicit.
> 
> > > expires then a request will be completed twice: a first time by the
> > > timeout handler and a second time when the regular completion occurs.
> > 
> > This patch looks simpler, and more like the previous model of
> > using blk_mark_rq_complete().
> 
> That's also correct.
> 
> > I have one question:
> > 
> > - with this patch, rq's state is updated atomically as cmpxchg. Suppose
> > one rq is timed out, the rq's state is updated as MQ_RQ_COMPLETE by
> > blk_mq_check_expired(), then ops->timeout() is run and
> > BLK_EH_RESET_TIMER is returned, so blk_mq_add_timer(req, MQ_RQ_COMPLETE,
> > MQ_RQ_IN_FLIGHT) is called to update rq's state into MQ_RQ_IN_FLIGHT.
> > 
> > Now the original normal completion still may occur after rq's state
> > becomes MQ_RQ_IN_FLIGHT, and seems there is still the double completion
> > with this patch? Maybe I am wrong, please explain a bit.
> 
> That scenario won't result in a double completion. After the timer has
> been reset the block driver blk_mq_complete_request() call will change
> the request state from MQ_RQ_IN_FLIGHT into MQ_RQ_COMPLETE. The next
> time blk_mq_check_expired() is called it will execute the following code:
> 
> 	blk_mq_change_rq_state(rq, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE);
> 
> That function call only changes the request state if the current state is
> IN_FLIGHT. However, the blk_mq_complete_request() call changed the request
> state into COMPLETE. Hence, the above blk_mq_change_rq_state() call will
> return false and the blk-mq timeout code will skip this request. If the
> request would already have been reused and would have been marked again as
> IN_FLIGHT then its deadline will also have been updated and the request
> will be skipped by the timeout code because its deadline has not yet
> expired.

OK.

Then I have same question with Jianchao, what is the actual double
complete in linus tree between BLK_EH_RESET_TIMER and normal completion?

Follows my understanding:

1) when timeout is detected on one request, its aborted_gstate is
updated in blk_mq_check_expired().

2) run synchronize_rcu(), and make sure all pending completion is done

3) run blk_mq_rq_timed_out()
- ret = ops->timeout
- blk_mq_rq_update_aborted_gstate(req, 0)
- blk_add_timer(req);

If normal completion is done between 1) and reset aborted_gstate in 3),
blk_mq_complete_request() will be called, and found that aborted_gstate
is set, then the rq won't be completed really.

If normal completion is done after reset aborted_gstate in 3), it should
be same with applying this patch.

> 
> > And synchronize_rcu() is needed by Tejun's approach between marking
> > COMPLETE and handling this rq's timeout, and the time can be quite long,
> > I guess it might be easier to trigger?
> 
> I have done what I could to trigger races between the regular completion
> path and the timeout code in my tests. Without this patch if I run the
> srp-test software I see crashes being reported in the rdma_rxe driver but
> with this patch applied I don't see any crashes with my tests.

I believe this patch may fix this issue, but I think the idea behind
should be understood.

Thanks,
Ming

  reply	other threads:[~2018-04-10 13:55 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-10  1:34 [PATCH v4] blk-mq: Fix race conditions in request timeout handling Bart Van Assche
2018-04-10  7:59 ` jianchao.wang
2018-04-10 10:04   ` Ming Lei
2018-04-10 12:04     ` Shan Hai
2018-04-10 13:01   ` Bart Van Assche
2018-04-10 13:01     ` Bart Van Assche
2018-04-10 14:32     ` jianchao.wang
2018-04-10  8:41 ` Ming Lei
2018-04-10 12:58   ` Bart Van Assche
2018-04-10 12:58     ` Bart Van Assche
2018-04-10 13:55     ` Ming Lei [this message]
2018-04-10 14:09       ` Bart Van Assche
2018-04-10 14:09         ` Bart Van Assche
2018-04-10 14:30         ` Ming Lei
2018-04-10 15:02           ` Bart Van Assche
2018-04-10 15:02             ` Bart Van Assche
2018-04-10 15:25             ` Ming Lei
2018-04-10 15:30               ` tj
2018-04-10 15:38                 ` Ming Lei
2018-04-10 15:40                   ` tj
2018-04-10 21:33                     ` tj
2018-04-10 21:46                       ` Bart Van Assche
2018-04-10 21:46                         ` Bart Van Assche
2018-04-10 21:54                         ` tj
2018-04-11 12:50                           ` Bart Van Assche
2018-04-11 12:50                             ` Bart Van Assche
2018-04-11 14:16                             ` tj
2018-04-11 18:38                             ` Martin Steigerwald
2018-04-11 18:38                               ` Martin Steigerwald
2018-04-11 14:24                           ` Sagi Grimberg
2018-04-11 14:43                             ` tj
2018-04-11 16:16                             ` Israel Rukshin
2018-04-11 17:07                               ` tj
2018-04-11 21:31                                 ` tj
2018-04-12  8:59                                   ` Israel Rukshin
2018-04-12 13:35                                     ` tj
2018-04-15 12:28                                       ` Israel Rukshin
2018-04-18 16:34                           ` Bart Van Assche
2018-04-10  9:55 ` Christoph Hellwig
2018-04-10 13:26   ` Bart Van Assche
2018-04-10 13:26     ` Bart Van Assche
2018-04-10 14:50     ` hch
2018-04-10 14:41   ` Jens Axboe
2018-04-10 14:20 ` Tejun Heo
2018-04-10 14:30   ` Bart Van Assche
2018-04-10 14:30     ` Bart Van Assche
2018-04-10 14:33     ` tj

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180410135541.GA22340@ming.t460p \
    --to=ming.lei@redhat.com \
    --cc=Bart.VanAssche@wdc.com \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=israelr@mellanox.com \
    --cc=linux-block@vger.kernel.org \
    --cc=maxg@mellanox.com \
    --cc=sagi@grimberg.me \
    --cc=stable@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.