Date: Fri, 4 Oct 2024 23:32:34 +0900
From: Sergey Senozhatsky
To: Christoph Hellwig
Cc: Sergey Senozhatsky, Jens Axboe, linux-block@vger.kernel.org, Yang Yang
Subject: Re: block: del_gendisk() vs blk_queue_enter() race condition
Message-ID: <20241004143234.GR11458@google.com>
References: <20241003085610.GK11458@google.com>
 <20241003140051.GM11458@google.com>
 <20241003141709.GN11458@google.com>
 <20241004042127.GO11458@google.com>
 <20241004074818.GP11458@google.com>

On (24/10/04 05:20), Christoph Hellwig wrote:
> On Fri, Oct 04, 2024 at 04:48:18PM +0900, Sergey Senozhatsky wrote:
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index bc5e8c5eaac9..ccd36cb5ada7 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -292,6 +292,16 @@ void blk_queue_start_drain(struct request_queue *q)
> >  	wake_up_all(&q->mq_freeze_wq);
> >  }
> >  
> > +void blk_queue_disk_dead(struct request_queue *q)
> > +{
> > +	struct gendisk *disk = q->disk;
> > +
> > +	if (WARN_ON_ONCE(!test_bit(GD_DEAD, &disk->state)))
> > +		return;
> > +	/* Make blk_queue_enter() reexamine the GD_DEAD flag. */
> > +	wake_up_all(&q->mq_freeze_wq);
> > +}
>
> Why is this a separate helper vs just doing the wake_up_all in the
> only caller that sets (with the suggested fixup anyway) GD_DEAD?

It looked to me like whatever happens to ->mq_freeze_wq stays in
Las^W blk-core or blk-mq, so I added a new helper to follow suit,
IOW to not spread ->mq_freeze_wq wakeups across multiple files.

> > +		   blk_queue_dying(q) ||
> > +		   test_bit(GD_DEAD, &disk->state));
>
> This needs to check for a NULL disk.

Ack.

> And now that I'm looking at the code a bit more this makes me worried
> that checking for q->disk here sounds like a good way to hit a race with
> clearing it.  So I fear we need the other hack variant that sets
> QUEUE_FLAG_DYING unconditionally in __blk_mark_disk_dead and then clears
> it again (for GD_OWNS_QUEUE only) toward the end of del_gendisk.

Hmm, setting QUEUE_FLAG_DYING unconditionally in __blk_mark_disk_dead()
implies moving it up, to the very top of del_gendisk(), before the first
time we grab ->open_mutex, because that's the issue we are hitting.
Doesn't that risk re-introducing the previous deadlock scenario (the one
you pointed out earlier), because of the "don't acquire ->open_mutex
after freezing the queue" rule?
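
FWIW, to make sure we are talking about the same thing, below is a
sketch (not a tested patch) of the NULL check as I understand it.  The
blk_disk_dead() helper name is made up, the wait_event() condition is
paraphrased from memory, and the READ_ONCE() snapshot is my addition;
per your point above it does not close the race with del_gendisk()
clearing q->disk, it only avoids the plain NULL deref:

	/*
	 * Sketch only.  Nothing here pins the gendisk, so the snapshot
	 * can still race with del_gendisk() clearing q->disk.
	 */
	static inline bool blk_disk_dead(struct request_queue *q)
	{
		struct gendisk *disk = READ_ONCE(q->disk);

		return disk && test_bit(GD_DEAD, &disk->state);
	}

	/* ...and in blk_queue_enter()'s wait: */
		wait_event(q->mq_freeze_wq,
			   (!q->mq_freeze_depth &&
			    blk_pm_resume_queue(pm, q)) ||
			   blk_queue_dying(q) ||
			   blk_disk_dead(q));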
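
And here is how I read the QUEUE_FLAG_DYING variant, again just a
sketch with the ordering being my guess; the clear-again step at the
end of del_gendisk() is only indicated by a comment because I'm not
sure I have the exact GD_OWNS_QUEUE condition right:

	static void __blk_mark_disk_dead(struct gendisk *disk)
	{
		if (test_and_set_bit(GD_DEAD, &disk->state))
			return;

		/* unconditionally, no GD_OWNS_QUEUE check at this point */
		blk_queue_flag_set(QUEUE_FLAG_DYING, disk->queue);

		/* make blk_queue_enter() re-examine QUEUE_FLAG_DYING */
		wake_up_all(&disk->queue->mq_freeze_wq);

		/*
		 * del_gendisk() would then clear QUEUE_FLAG_DYING again
		 * toward its end, per the GD_OWNS_QUEUE condition you
		 * describe above.
		 */
	}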