From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.1 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2A88DC432C0 for ; Fri, 22 Nov 2019 06:02:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 00CE32071B for ; Fri, 22 Nov 2019 06:02:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1574402555; bh=LHiUdUHDdThUMjt+1McqXD4GQsQ6DLH4P+4atOmQcvU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=KChJ3qlCrloLvj/r1IZfMSYnhKxofiyAkSWqpgDzQm3nPdpOL9MF1ci8mCicdJiEu T168g2UVDZz9czRhEV37UD3NpGKWMdzY0pAf+jRWg5en7FpYm7Q4ORmetzCwPv6Dnu vjap4NwdoTxTZXXPCr+F8B0gIKlpRaucEw9k/oCM= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729752AbfKVGCd (ORCPT ); Fri, 22 Nov 2019 01:02:33 -0500 Received: from mail.kernel.org ([198.145.29.99]:41194 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728312AbfKVGC3 (ORCPT ); Fri, 22 Nov 2019 01:02:29 -0500 Received: from sasha-vm.mshome.net (c-73-47-72-35.hsd1.nh.comcast.net [73.47.72.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id B63342070B; Fri, 22 Nov 2019 06:02:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1574402548; bh=LHiUdUHDdThUMjt+1McqXD4GQsQ6DLH4P+4atOmQcvU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=RClvpUSAA0UIjH68VQJU6g5071GL9wVPLFgjE6P2NLk4VqWpq61BbMeBGTTizoep6 g7kWOSsPb5TbtBOwK4RgKn7nTcAlLVH/fh4INty8Y1t8iW6BB26EHTGZkuI2uEFEQN CZvnouoWjjwIRDQ0WtKViNIMNXQa+8VHkIO0zGs8= From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Lars Ellenberg , Jens Axboe , Sasha Levin , drbd-dev@lists.linbit.com, linux-block@vger.kernel.org Subject: [PATCH AUTOSEL 4.9 54/91] drbd: do not block when adjusting "disk-options" while IO is frozen Date: Fri, 22 Nov 2019 01:00:52 -0500 Message-Id: <20191122060129.4239-53-sashal@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20191122060129.4239-1-sashal@kernel.org> References: <20191122060129.4239-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Lars Ellenberg [ Upstream commit f708bd08ecbdc23d03aaedf5b3311ebe44cfdb50 ] "suspending" IO is overloaded. It can mean "do not allow new requests" (obviously), but it also may mean "must not complete pending IO", for example while the fencing handlers do their arbitration. When adjusting disk options, we suspend io (disallow new requests), then wait for the activity-log to become unused (drain all IO completions), and possibly replace it with a new activity log of different size. If the other "suspend IO" aspect is active, pending IO completions won't happen, and we would block forever (unkillable drbdsetup process). Fix this by skipping the activity log adjustment if the "al-extents" setting did not change. Also, in case it did change, fail early without blocking if it looks like we would block forever. Signed-off-by: Lars Ellenberg Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin --- drivers/block/drbd/drbd_nl.c | 37 ++++++++++++++++++++++++++++-------- 1 file changed, 29 insertions(+), 8 deletions(-) diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c index ff26f676f24d1..b809f325c2bea 100644 --- a/drivers/block/drbd/drbd_nl.c +++ b/drivers/block/drbd/drbd_nl.c @@ -1508,6 +1508,30 @@ static void sanitize_disk_conf(struct drbd_device *device, struct disk_conf *dis } } +static int disk_opts_check_al_size(struct drbd_device *device, struct disk_conf *dc) +{ + int err = -EBUSY; + + if (device->act_log && + device->act_log->nr_elements == dc->al_extents) + return 0; + + drbd_suspend_io(device); + /* If IO completion is currently blocked, we would likely wait + * "forever" for the activity log to become unused. So we don't. */ + if (atomic_read(&device->ap_bio_cnt)) + goto out; + + wait_event(device->al_wait, lc_try_lock(device->act_log)); + drbd_al_shrink(device); + err = drbd_check_al_size(device, dc); + lc_unlock(device->act_log); + wake_up(&device->al_wait); +out: + drbd_resume_io(device); + return err; +} + int drbd_adm_disk_opts(struct sk_buff *skb, struct genl_info *info) { struct drbd_config_context adm_ctx; @@ -1570,15 +1594,12 @@ int drbd_adm_disk_opts(struct sk_buff *skb, struct genl_info *info) } } - drbd_suspend_io(device); - wait_event(device->al_wait, lc_try_lock(device->act_log)); - drbd_al_shrink(device); - err = drbd_check_al_size(device, new_disk_conf); - lc_unlock(device->act_log); - wake_up(&device->al_wait); - drbd_resume_io(device); - + err = disk_opts_check_al_size(device, new_disk_conf); if (err) { + /* Could be just "busy". Ignore? + * Introduce dedicated error code? */ + drbd_msg_put_info(adm_ctx.reply_skb, + "Try again without changing current al-extents setting"); retcode = ERR_NOMEM; goto fail_unlock; } -- 2.20.1