From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07611EB64DD for ; Wed, 9 Aug 2023 11:20:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233517AbjHILUK (ORCPT ); Wed, 9 Aug 2023 07:20:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53294 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233511AbjHILUJ (ORCPT ); Wed, 9 Aug 2023 07:20:09 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 79483FA for ; Wed, 9 Aug 2023 04:20:08 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 17A77631D2 for ; Wed, 9 Aug 2023 11:20:08 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 24C76C433C7; Wed, 9 Aug 2023 11:20:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1691580007; bh=GkTnCfG9CqE9cItvm7MTBbPF8bOrujndLn7AK3y/aow=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ew8kAnZ2vBNWWTrNVnKTVB5Ws7S0CbAjIclX3/LHDooulat9s9XpHNT+Nib20xjdE A7Ib488bN1DGD5UUvHu1vT8dslK3ak93+q3goSfPL58p6aaoJkw0KSDYxTa5EQKvyr W5z7/KjMRC1awfy/ln8hgJmDVzTV7V3ADgx+wzes= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Yu Kuai , Song Liu , Sasha Levin Subject: [PATCH 4.19 194/323] md/raid10: prevent soft lockup while flush writes Date: Wed, 9 Aug 2023 12:40:32 +0200 Message-ID: <20230809103707.052098422@linuxfoundation.org> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230809103658.104386911@linuxfoundation.org> References: <20230809103658.104386911@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Yu Kuai [ Upstream commit 010444623e7f4da6b4a4dd603a7da7469981e293 ] Currently, there is no limit for raid1/raid10 plugged bio. While flushing writes, raid1 has cond_resched() while raid10 doesn't, and too many writes can cause soft lockup. Follow up soft lockup can be triggered easily with writeback test for raid10 with ramdisks: watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293] Call Trace: call_rcu+0x16/0x20 put_object+0x41/0x80 __delete_object+0x50/0x90 delete_object_full+0x2b/0x40 kmemleak_free+0x46/0xa0 slab_free_freelist_hook.constprop.0+0xed/0x1a0 kmem_cache_free+0xfd/0x300 mempool_free_slab+0x1f/0x30 mempool_free+0x3a/0x100 bio_free+0x59/0x80 bio_put+0xcf/0x2c0 free_r10bio+0xbf/0xf0 raid_end_bio_io+0x78/0xb0 one_write_done+0x8a/0xa0 raid10_end_write_request+0x1b4/0x430 bio_endio+0x175/0x320 brd_submit_bio+0x3b9/0x9b7 [brd] __submit_bio+0x69/0xe0 submit_bio_noacct_nocheck+0x1e6/0x5a0 submit_bio_noacct+0x38c/0x7e0 flush_pending_writes+0xf0/0x240 raid10d+0xac/0x1ed0 Fix the problem by adding cond_resched() to raid10 like what raid1 did. Note that unlimited plugged bio still need to be optimized, for example, in the case of lots of dirty pages writeback, this will take lots of memory and io will spend a long time in plug, hence io latency is bad. Signed-off-by: Yu Kuai Signed-off-by: Song Liu Link: https://lore.kernel.org/r/20230529131106.2123367-2-yukuai1@huaweicloud.com Signed-off-by: Sasha Levin --- drivers/md/raid10.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index d46056b07c079..bee694be20132 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -942,6 +942,7 @@ static void flush_pending_writes(struct r10conf *conf) else generic_make_request(bio); bio = next; + cond_resched(); } blk_finish_plug(&plug); } else @@ -1127,6 +1128,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule) else generic_make_request(bio); bio = next; + cond_resched(); } kfree(plug); } -- 2.39.2