From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Yu Kuai, Song Liu, Sasha Levin, linux-raid@vger.kernel.org
Subject: [PATCH AUTOSEL 4.14 4/5] md/raid10: prevent soft lockup while flush writes
Date: Sun, 2 Jul 2023 15:42:29 -0400
Message-Id: <20230702194230.1779535-4-sashal@kernel.org>
In-Reply-To: <20230702194230.1779535-1-sashal@kernel.org>
References: <20230702194230.1779535-1-sashal@kernel.org>
X-stable: review
X-stable-base: Linux 4.14.320

From: Yu Kuai

[ Upstream commit 010444623e7f4da6b4a4dd603a7da7469981e293 ]

Currently, there is no limit on the number of plugged bios for
raid1/raid10. When flushing writes, raid1 calls cond_resched() but raid10
does not, so too many writes can cause a soft lockup.

The following soft lockup can be triggered easily with a writeback test
for raid10 on ramdisks:

watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
Call Trace:
 call_rcu+0x16/0x20
 put_object+0x41/0x80
 __delete_object+0x50/0x90
 delete_object_full+0x2b/0x40
 kmemleak_free+0x46/0xa0
 slab_free_freelist_hook.constprop.0+0xed/0x1a0
 kmem_cache_free+0xfd/0x300
 mempool_free_slab+0x1f/0x30
 mempool_free+0x3a/0x100
 bio_free+0x59/0x80
 bio_put+0xcf/0x2c0
 free_r10bio+0xbf/0xf0
 raid_end_bio_io+0x78/0xb0
 one_write_done+0x8a/0xa0
 raid10_end_write_request+0x1b4/0x430
 bio_endio+0x175/0x320
 brd_submit_bio+0x3b9/0x9b7 [brd]
 __submit_bio+0x69/0xe0
 submit_bio_noacct_nocheck+0x1e6/0x5a0
 submit_bio_noacct+0x38c/0x7e0
 flush_pending_writes+0xf0/0x240
 raid10d+0xac/0x1ed0

Fix the problem by adding cond_resched() to raid10, as raid1 already does.

Note that unlimited plugged bios still need to be optimized. For example,
when many dirty pages are written back, the plugged bios consume a lot of
memory and I/O spends a long time in the plug, so I/O latency is bad.

Signed-off-by: Yu Kuai
Signed-off-by: Song Liu
Link: https://lore.kernel.org/r/20230529131106.2123367-2-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin
---
 drivers/md/raid10.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 95c3a21cd7335..f2dbc1554328a 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -916,6 +916,7 @@ static void flush_pending_writes(struct r10conf *conf)
 			else
 				generic_make_request(bio);
 			bio = next;
+			cond_resched();
 		}
 		blk_finish_plug(&plug);
 	} else
@@ -1101,6 +1102,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
 		else
 			generic_make_request(bio);
 		bio = next;
+		cond_resched();
 	}
 	kfree(plug);
 }
-- 
2.39.2
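
For readers unfamiliar with the pattern, here is a rough userspace sketch
of what the two hunks do: drain an unbounded list of queued work and offer
the CPU to other tasks after every item. It is not the raid10 code. The
names struct pending_item, submit_item() and drain_pending() are
hypothetical, and sched_yield() is only an approximate stand-in for
cond_resched(), which in the kernel reschedules only when a reschedule is
actually pending.

```c
#include <sched.h>   /* sched_yield() */
#include <stddef.h>  /* NULL, size_t */

/* Hypothetical stand-in for a queued bio: one node of a singly linked list. */
struct pending_item {
	struct pending_item *next;
	int submitted;
};

/* Hypothetical stand-in for generic_make_request(): just marks the item. */
static void submit_item(struct pending_item *item)
{
	item->submitted = 1;
}

/*
 * Drain the whole list, yielding after each item so that a very long list
 * cannot monopolize the CPU. Returns the number of items submitted.
 */
static size_t drain_pending(struct pending_item *head)
{
	struct pending_item *item = head;
	size_t count = 0;

	while (item) {
		/* Save the successor before detaching the current item. */
		struct pending_item *next = item->next;

		item->next = NULL;
		submit_item(item);
		item = next;
		count++;

		/* Assumption: userspace analogue of the added cond_resched(). */
		sched_yield();
	}
	return count;
}
```

Calling drain_pending() on a three-element list submits all three items
and returns 3. The yield sits at the end of the loop body, the same
position the patch uses, so every iteration gives the scheduler a chance
to run something else regardless of how long the list is.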