From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3E193EB64DA for ; Sun, 2 Jul 2023 19:43:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231411AbjGBTnM (ORCPT ); Sun, 2 Jul 2023 15:43:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52906 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231360AbjGBTmg (ORCPT ); Sun, 2 Jul 2023 15:42:36 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E80EB2952; Sun, 2 Jul 2023 12:41:44 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8CA3360CB9; Sun, 2 Jul 2023 19:40:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 50BD3C433C7; Sun, 2 Jul 2023 19:40:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1688326841; bh=bNQgnsly7EI2/yJDSKf96Jch8Y79fbQnrYfAucSxEJA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=lxFx8QkIE/HNosAIcv5qQ6Sy7DjB1qKHtc1ZrjQhfnB5G8vT+r+6eaupwWuZLrjUW /YiKFZqT76oYhdaaBJqt0IyRADRmwOC8XwOkqSU7/1q5G4QsGwiKBgNbafWIhDjGsX U6dKvNy4arh+M0K4dC3v58KRsm2dv8Ia9WJh8tjqKm5n607IQX4FTjgI/aXby7ir7c FYTf4dtl791n/9uIPYTfXy+6+brrCHI0aXUT1oYf/GPLZFYa7M/U9g6uG4WknGChNh lWedKJNFfP+8T2g0UPu34uY+LBVvCraMPJZB5g/qfXQ7Zzb2cI+WB/T3emmaCwY8vJ wfavMNkwwZ8pg== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Yu Kuai , Song Liu , Sasha Levin , linux-raid@vger.kernel.org Subject: [PATCH AUTOSEL 6.4 08/15] md/raid10: prevent soft lockup while flush writes Date: Sun, 2 Jul 2023 15:40:13 -0400 Message-Id: <20230702194020.1776895-8-sashal@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230702194020.1776895-1-sashal@kernel.org> References: <20230702194020.1776895-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-stable-base: Linux 6.4.1 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Yu Kuai [ Upstream commit 010444623e7f4da6b4a4dd603a7da7469981e293 ] Currently, there is no limit for raid1/raid10 plugged bio. While flushing writes, raid1 has cond_resched() while raid10 doesn't, and too many writes can cause soft lockup. Follow up soft lockup can be triggered easily with writeback test for raid10 with ramdisks: watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293] Call Trace: call_rcu+0x16/0x20 put_object+0x41/0x80 __delete_object+0x50/0x90 delete_object_full+0x2b/0x40 kmemleak_free+0x46/0xa0 slab_free_freelist_hook.constprop.0+0xed/0x1a0 kmem_cache_free+0xfd/0x300 mempool_free_slab+0x1f/0x30 mempool_free+0x3a/0x100 bio_free+0x59/0x80 bio_put+0xcf/0x2c0 free_r10bio+0xbf/0xf0 raid_end_bio_io+0x78/0xb0 one_write_done+0x8a/0xa0 raid10_end_write_request+0x1b4/0x430 bio_endio+0x175/0x320 brd_submit_bio+0x3b9/0x9b7 [brd] __submit_bio+0x69/0xe0 submit_bio_noacct_nocheck+0x1e6/0x5a0 submit_bio_noacct+0x38c/0x7e0 flush_pending_writes+0xf0/0x240 raid10d+0xac/0x1ed0 Fix the problem by adding cond_resched() to raid10 like what raid1 did. Note that unlimited plugged bio still need to be optimized, for example, in the case of lots of dirty pages writeback, this will take lots of memory and io will spend a long time in plug, hence io latency is bad. Signed-off-by: Yu Kuai Signed-off-by: Song Liu Link: https://lore.kernel.org/r/20230529131106.2123367-2-yukuai1@huaweicloud.com Signed-off-by: Sasha Levin --- drivers/md/raid10.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 4fcfcb350d2b4..99163661170f3 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -921,6 +921,7 @@ static void flush_pending_writes(struct r10conf *conf) else submit_bio_noacct(bio); bio = next; + cond_resched(); } blk_finish_plug(&plug); } else @@ -1142,6 +1143,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule) else submit_bio_noacct(bio); bio = next; + cond_resched(); } kfree(plug); } -- 2.39.2