From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Sender: Tejun Heo
Date: Tue, 10 Apr 2018 08:30:31 -0700
From: "tj@kernel.org"
To: Ming Lei
Cc: Bart Van Assche, "linux-block@vger.kernel.org",
 "israelr@mellanox.com", "sagi@grimberg.me", "hch@lst.de",
 "stable@vger.kernel.org", "axboe@kernel.dk", "maxg@mellanox.com"
Subject: Re: [PATCH v4] blk-mq: Fix race conditions in request timeout handling
Message-ID: <20180410153031.GO3126663@devbig577.frc2.facebook.com>
References: <20180410013455.7448-1-bart.vanassche@wdc.com>
 <20180410084133.GB9133@ming.t460p>
 <20180410135541.GA22340@ming.t460p>
 <7c4a8b019182d5a9259ef20e6462b3f6a533abed.camel@wdc.com>
 <20180410143007.GB22340@ming.t460p>
 <20180410152553.GC22340@ming.t460p>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <20180410152553.GC22340@ming.t460p>
List-ID:

Hello, Ming.

On Tue, Apr 10, 2018 at 11:25:54PM +0800, Ming Lei wrote:
> +	if (time_after_eq(jiffies, deadline) &&
> +	    blk_mq_change_rq_state(rq, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE)) {
> +		blk_mq_rq_timed_out(rq, reserved);
>
> Normal completion still can happen between blk_mq_change_rq_state()
> and blk_mq_rq_timed_out().
>
> In tj's approach, there is synchronize_rcu() between writing aborted_gstate
> and blk_mq_rq_timed_out, it is easier for normal completion to happen during
> the big window.

I don't think plugging this hole is all that difficult, but this
shouldn't lead to any critical failures.  If so, that'd be a driver
bug.

Thanks.

-- 
tejun