Date: Wed, 17 Aug 2022 18:30:59 +0200
From: Jan Kara
To: Chris Murphy
Cc: Jan Kara, Holger Hoffstätte, Nikolay Borisov, Jens Axboe,
    Paolo Valente, Linux-RAID, linux-block, linux-kernel, Josef Bacik
Subject: Re: stalling IO regression since linux 5.12, through 5.18
Message-ID: <20220817163059.kigrvdfmxfswmhls@quack3>
In-Reply-To: <85a141ae-56a7-4dcd-b75a-04be4b276b3a@www.fastmail.com>
X-Mailing-List: linux-raid@vger.kernel.org

On Wed 17-08-22 11:09:26, Chris Murphy wrote:
>
>
> On Wed, Aug 17, 2022, at 7:49 AM, Jan Kara wrote:
>
> >
> > Another thing worth trying is to compile the kernel without
> > CONFIG_BFQ_GROUP_IOSCHED. That will essentially disable cgroup support in
> > BFQ so we will see whether the problem may be cgroup related or not.
>
> The problem happens with a 5.12.0 kernel built without
> CONFIG_BFQ_GROUP_IOSCHED.

Thanks for testing! Just to answer your previous question: This is
different from cgroup.disable=io because BFQ takes different code paths. So
this makes it even less likely this is some obscure BFQ bug.
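
For anyone repeating the test above, it can help to confirm that the booted kernel really was built without CONFIG_BFQ_GROUP_IOSCHED. The following is a minimal sketch, not part of the original exchange: it parses kernel config text (typically found in /boot/config-$(uname -r), or in /proc/config.gz when CONFIG_IKCONFIG_PROC is enabled). The helper name config_option_state and the sample text are hypothetical.

```python
def config_option_state(config_text: str, option: str) -> str:
    """Return the value of a Kconfig option ('y', 'm', ...) or 'unset'.

    Hypothetical helper for illustration; it only understands the two
    common line forms found in a kernel .config file.
    """
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options are recorded as a comment line.
        if line == f"# {option} is not set":
            return "unset"
        # Enabled options are recorded as OPTION=value.
        if line.startswith(f"{option}="):
            return line.split("=", 1)[1]
    # An option that is not mentioned at all is also not enabled.
    return "unset"


# Made-up sample resembling a kernel built with BFQ but without cgroup support.
sample = """\
CONFIG_IOSCHED_BFQ=y
# CONFIG_BFQ_GROUP_IOSCHED is not set
"""

print(config_option_state(sample, "CONFIG_BFQ_GROUP_IOSCHED"))  # prints "unset"
```

To use it on a real system, read the config file into a string and pass it in place of `sample`.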

The reason BFQ could behave differently from mq-deadline here is that it
artificially reduces the device queue depth (it sets shallow_depth when
allocating new tags), and maybe that triggers some bug in request tag
allocation.

BTW, are you sure the first problematic kernel is 5.12? Support for shared
tagsets was added to the megaraid_sas driver in 5.11 (5.11-rc3 in particular,
commit 81e7eb5bf08f3 ("Revert "Revert "scsi: megaraid_sas: Added support for
shared host tagset for cpuhotplug"")), and that is one candidate I'd expect to
start triggering issues.

That may also be an interesting thing to try: can you boot with the
"megaraid_sas.host_tagset_enable=0" kernel option and see whether the issue
reproduces?

								Honza
-- 
Jan Kara
SUSE Labs, CR
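
After rebooting with the suggested megaraid_sas option, one way to check that it actually reached the kernel is to look for it in the boot command line (exposed in /proc/cmdline). The sketch below is an illustration only, not something from the thread: the function name module_param_from_cmdline is hypothetical, and it assumes simple whitespace-separated tokens without quoting. It also shows why the option must be written without spaces around "=".

```python
from typing import Optional


def module_param_from_cmdline(cmdline: str, module: str, param: str) -> Optional[str]:
    """Return the value given for module.param on a kernel command line.

    Hypothetical helper for illustration. Assumes plain whitespace-separated
    tokens; if the parameter appears more than once, the last value wins.
    """
    prefix = f"{module}.{param}="
    value = None
    for token in cmdline.split():
        if token.startswith(prefix):
            value = token[len(prefix):]
    return value


# Made-up command lines for demonstration.
good = "root=/dev/sda2 ro quiet megaraid_sas.host_tagset_enable=0"
bad = "root=/dev/sda2 ro quiet megaraid_sas.host_tagset_enable = 0"

print(module_param_from_cmdline(good, "megaraid_sas", "host_tagset_enable"))  # prints "0"
print(module_param_from_cmdline(bad, "megaraid_sas", "host_tagset_enable"))   # prints "None"
```

On a live system the input would be the contents of /proc/cmdline; the second call illustrates that a version with spaces splits into separate tokens and is not seen as an assignment by this parser.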