Date: Tue, 12 Dec 2017 09:01:10 -0800
From: Tejun Heo
To: "jianchao.wang"
Cc: axboe@kernel.dk, linux-kernel@vger.kernel.org, oleg@redhat.com, peterz@infradead.org, kernel-team@fb.com, osandov@fb.com
Subject: Re: [PATCH 6/6] blk-mq: remove REQ_ATOM_STARTED
Message-ID: <20171212170110.GE3919388@devbig577.frc2.facebook.com>
References: <20171209192525.982030-1-tj@kernel.org> <20171209192525.982030-7-tj@kernel.org> <8c52269a-d5a9-d13c-bdb6-8f47cdaed982@oracle.com>
In-Reply-To: <8c52269a-d5a9-d13c-bdb6-8f47cdaed982@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Hello, Jianchao.

On Tue, Dec 12, 2017 at 06:09:32PM +0800, jianchao.wang wrote:
> > @@ -786,18 +779,6 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
> >  	const struct blk_mq_ops *ops = req->q->mq_ops;
> >  	enum blk_eh_timer_return ret = BLK_EH_RESET_TIMER;
> >
> > -	/*
> > -	 * We know that complete is set at this point. If STARTED isn't set
> > -	 * anymore, then the request isn't active and the "timeout" should
> > -	 * just be ignored. This can happen due to the bitflag ordering.
> > -	 * Timeout first checks if STARTED is set, and if it is, assumes
> > -	 * the request is active. But if we race with completion, then
> > -	 * both flags will get cleared. So check here again, and ignore
> > -	 * a timeout event with a request that isn't active.
> > -	 */
> > -	if (!test_bit(REQ_ATOM_STARTED, &req->atomic_flags))
> > -		return;
> > -
> >  	if (ops->timeout)
> >  		ret = ops->timeout(req, reserved);
> >
> The BLK_EH_RESET_TIMER case has not been covered here. In that case,
> the timer will be re-armed, but the gstate and aborted_gstate are
> not updated and still equal with each other. Consequently, when the
> request is completed later, the __blk_mq_complete_request() will be
> missed, then the request will expire again. The aborted_gstate
> should be updated in the BLK_EH_RESET_TIMER case.

You're right.  This is inherently racy tho.  Nothing prevented the
command from completing before complete was cleared.  I'll just clear
aborted_gstate which should behave the same way.

Thanks.

-- 
tejun
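
The failure mode described in the quoted report, and the effect of clearing
aborted_gstate, can be illustrated with a small user-space model. This is
only a sketch under stated assumptions, not the kernel code: every model_*
name is hypothetical, the gstate/aborted_gstate fields are reduced to plain
integers, and 0 is assumed never to be a live gstate value. No timer,
locking or generation handling is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the request state in the thread. */
struct model_request {
	uint64_t gstate;         /* generation/state of the current use */
	uint64_t aborted_gstate; /* generation claimed by the timeout path */
	int completions;         /* how many completions actually ran */
};

enum model_eh_ret { MODEL_EH_HANDLED, MODEL_EH_RESET_TIMER };

/* Completion is skipped while the timeout path owns this generation. */
static bool model_complete(struct model_request *rq)
{
	if (rq->aborted_gstate == rq->gstate)
		return false;
	rq->completions++;
	return true;
}

/*
 * Timeout path: claim the generation, then act on the driver's verdict.
 * With clear_on_reset set, the claim is dropped when the timer is re-armed.
 */
static void model_timed_out(struct model_request *rq, enum model_eh_ret ret,
			    bool clear_on_reset)
{
	assert(rq->gstate != 0); /* model assumption: 0 means "no claim" */
	rq->aborted_gstate = rq->gstate;
	if (ret == MODEL_EH_RESET_TIMER && clear_on_reset)
		rq->aborted_gstate = 0;
}

/* One timeout answered with RESET_TIMER, followed by a late completion. */
static int model_run(bool clear_on_reset)
{
	struct model_request rq = { .gstate = 42, .aborted_gstate = 0,
				    .completions = 0 };

	model_timed_out(&rq, MODEL_EH_RESET_TIMER, clear_on_reset);
	model_complete(&rq);
	return rq.completions;
}
```

In this model, model_run(false) leaves the two values equal, so the late
completion is dropped and the request would only ever expire again, while
model_run(true) lets the completion through.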