From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from mail-it0-f43.google.com ([209.85.214.43]:39009 "EHLO
	mail-it0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1030839AbeBNPTM (ORCPT );
	Wed, 14 Feb 2018 10:19:12 -0500
Received: by mail-it0-f43.google.com with SMTP id l187so5717747ith.4
	for ; Wed, 14 Feb 2018 07:19:12 -0800 (PST)
Subject: Re: [PATCH BUGFIX V3] block, bfq: add requeue-request hook
To: Paolo Valente , Mike Galbraith
Cc: Oleksandr Natalenko , stable
References: <20180207211920.6343-1-paolo.valente@linaro.org>
	<1518197379.26824.31.camel@gmx.de>
	<6394471.U0O273vb9H@natalenko.name>
	<9E24F648-C93A-4CEA-A1B6-B041540CEAAE@linaro.org>
	<1518434553.13087.7.camel@gmx.de>
	<805e9d5af9aed3fbc2697fdb0ce51e88@natalenko.name>
	<1518498175.6944.48.camel@gmx.de>
	<816E0B1B-B2D9-4604-A8CB-1E32AFBF6C22@linaro.org>
	<1518504174.6944.71.camel@gmx.de>
	<1518504640.6944.73.camel@gmx.de>
	<1518532236.15792.25.camel@gmx.de>
	<1518587910.5647.14.camel@gmx.de>
	<1518591888.6752.12.camel@gmx.de>
	<1518592512.6752.14.camel@gmx.de>
	<07DA441B-C2E8-4F65-B674-11C87D2F084B@linaro.org>
From: Jens Axboe
Message-ID: <0cdbafe7-fe13-51b4-4e86-e7453026508e@kernel.dk>
Date: Wed, 14 Feb 2018 08:19:07 -0700
MIME-Version: 1.0
In-Reply-To: <07DA441B-C2E8-4F65-B674-11C87D2F084B@linaro.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: stable-owner@vger.kernel.org
List-ID: 

On 2/14/18 1:56 AM, Paolo Valente wrote:
> 
> 
>> On 14 Feb 2018, at 08:15, Mike Galbraith wrote:
>> 
>> On Wed, 2018-02-14 at 08:04 +0100, Mike Galbraith wrote:
>>> 
>>> And _of course_, roughly two minutes later, IO stalled.
>> 
>> P.S.
>> 
>> crash> bt 19117
>> PID: 19117  TASK: ffff8803d2dcd280  CPU: 7  COMMAND: "kworker/7:2"
>>  #0 [ffff8803f7207bb8] __schedule at ffffffff81595e18
>>  #1 [ffff8803f7207c40] schedule at ffffffff81596422
>>  #2 [ffff8803f7207c50] io_schedule at ffffffff8108a832
>>  #3 [ffff8803f7207c60] blk_mq_get_tag at ffffffff8129cd1e
>>  #4 [ffff8803f7207cc0] blk_mq_get_request at ffffffff812987cc
>>  #5 [ffff8803f7207d00] blk_mq_alloc_request at ffffffff81298a9a
>>  #6 [ffff8803f7207d38] blk_get_request_flags at ffffffff8128e674
>>  #7 [ffff8803f7207d60] scsi_execute at ffffffffa0025b58 [scsi_mod]
>>  #8 [ffff8803f7207d98] scsi_test_unit_ready at ffffffffa002611c [scsi_mod]
>>  #9 [ffff8803f7207df8] sd_check_events at ffffffffa0212747 [sd_mod]
>> #10 [ffff8803f7207e20] disk_check_events at ffffffff812a0f85
>> #11 [ffff8803f7207e78] process_one_work at ffffffff81079867
>> #12 [ffff8803f7207eb8] worker_thread at ffffffff8107a127
>> #13 [ffff8803f7207f10] kthread at ffffffff8107ef48
>> #14 [ffff8803f7207f50] ret_from_fork at ffffffff816001a5
>> crash>
> 
> This has evidently to do with tag pressure. I've looked for a way to
> easily reduce the number of tags online, so as to put your system in
> the bad spot deterministically. But at no avail. Does anyone know a
> way to do it?

The key here might be that it's not a regular file system request, which
I'm sure bfq probably handles differently. So it's possible that you are
slowly leaking those tags, and we end up in this miserable situation after
a while.

-- 
Jens Axboe