From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Subject: Re: [PATCH] null_blk: add 'requeue' fault attribute
To: Omar Sandoval
Cc: "linux-block@vger.kernel.org"
References: <20180228085136.GA27941@vader.DHCP.thefacebook.com>
 <1317b3c8-0f6e-a229-414a-de81bb69be2f@kernel.dk>
 <20180228161408.GA8502@vader>
From: Jens Axboe
Message-ID: <95a7fe14-cc41-e476-54ca-6a08e319082a@kernel.dk>
Date: Wed, 28 Feb 2018 09:15:37 -0700
MIME-Version: 1.0
In-Reply-To: <20180228161408.GA8502@vader>
Content-Type: text/plain; charset=utf-8
List-ID:

On 2/28/18 9:14 AM, Omar Sandoval wrote:
> On Wed, Feb 28, 2018 at 08:28:25AM -0700, Jens Axboe wrote:
>> On 2/28/18 1:51 AM, Omar Sandoval wrote:
>>> On Tue, Feb 27, 2018 at 03:34:53PM -0700, Jens Axboe wrote:
>>>> Similarly to the support we have for testing/faking timeouts for
>>>> null_blk, this adds support for triggering a requeue condition.
>>>> Considering the issues around restart we've been seeing, this should be
>>>> a useful addition to the testing arsenal to ensure that we are handling
>>>> requeue conditions correctly.
>>>>
>>>> This works for queue mode 1 (legacy request_fn based path) and 2 (blk-mq
>>>> path), as there's no good way to do requeue with a bio based driver.
>>>> This is similar to the timeout path.
>>>>
>>>> Signed-off-by: Jens Axboe
>>>>
>>>> ---
>>>>
>>>>  null_blk.c | 55 +++++++++++++++++++++++++++++++++++++++++++------------
>>>>  1 file changed, 43 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
>>>> index 287a09611c0f..363536572e19 100644
>>>> --- a/drivers/block/null_blk.c
>>>> +++ b/drivers/block/null_blk.c
>>>
>>> [snip]
>>>
>>>> @@ -1422,10 +1440,12 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
>>>>
>>>>  	blk_mq_start_request(bd->rq);
>>>>
>>>> -	if (!should_timeout_request(bd->rq))
>>>> -		return null_handle_cmd(cmd);
>>>> +	if (should_requeue_request(bd->rq))
>>>> +		return BLK_STS_RESOURCE;
>>>
>>> Hm, this goes through the less interesting requeue path, add to the
>>> dispatch list and __blk_mq_requeue_request(). blk_mq_requeue_request()
>>> is the one that I wanted to test since that's the one that needs to call
>>> the scheduler hook.
>>
>> Until recently, it would have :-)
>>
>> Both of them are interesting to test, though. Most of the core stall
>> cases would have been triggered by going through the STS_RESOURCE case.
>> How about we just make it exercise both? The below patch alternates
>> between them when we have chosen to requeue.
>
> Works for me. One idle thought, if we set this up to always requeue,
> then it won't make any progress. Maybe we should limit the number of
> times each request can be requeued so people (me) don't lock up their
> test systems? Either way,

Dunno, that gets into the "doctor it hurts when I shoot myself in the
foot" territory. Same can be said for the timeout setting. I think we
should just ignore that.

-- 
Jens Axboe