Date: Mon, 28 Aug 2023 07:27:36 -0700
From: Christoph Hellwig
To: Al Viro
Cc: Jan Kara, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	Christoph Hellwig, Alasdair Kergon, Andrew Morton, Anna Schumaker,
	Chao Yu, Christian Borntraeger, "Darrick J. Wong", Dave Kleikamp,
	David Sterba, dm-devel@redhat.com, drbd-dev@lists.linbit.com,
	Gao Xiang, Jack Wang, Jaegeuk Kim,
	jfs-discussion@lists.sourceforge.net, Joern Engel, Joseph Qi,
	Kent Overstreet, linux-bcache@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-erofs@lists.ozlabs.org,
	linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	linux-mm@kvack.org, linux-mtd@lists.infradead.org,
	linux-nfs@vger.kernel.org, linux-nilfs@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-pm@vger.kernel.org,
	linux-raid@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-xfs@vger.kernel.org,
	"Md. Haris Iqbal", Mike Snitzer, Minchan Kim,
	ocfs2-devel@oss.oracle.com, reiserfs-devel@vger.kernel.org,
	Sergey Senozhatsky, Song Liu, Sven Schnelle,
	target-devel@vger.kernel.org, Ted Tso, Trond Myklebust,
	xen-devel@lists.xenproject.org, Jens Axboe, Christian Brauner
Subject: Re: [PATCH v2 0/29] block: Make blkdev_get_by_*() return handle
References: <20230810171429.31759-1-jack@suse.cz>
	<20230825015843.GB95084@ZenIV>
	<20230825134756.o3wpq6bogndukn53@quack3>
	<20230826022852.GO3390869@ZenIV>
In-Reply-To: <20230826022852.GO3390869@ZenIV>

On Sat, Aug 26, 2023 at 03:28:52AM +0100, Al Viro wrote:
> I mean, look at claim_swapfile() for example:
>                 p->bdev = blkdev_get_by_dev(inode->i_rdev,
>                                    FMODE_READ | FMODE_WRITE | FMODE_EXCL, p);
>                 if (IS_ERR(p->bdev)) {
>                         error = PTR_ERR(p->bdev);
>                         p->bdev = NULL;
>                         return error;
>                 }
>                 p->old_block_size = block_size(p->bdev);
>                 error = set_blocksize(p->bdev, PAGE_SIZE);
>                 if (error < 0)
>                         return error;
> we already have the file opened, and we keep it opened all the way until
> the swapoff(2); here we have noticed that it's a block device and we
>         * open the fucker again (by device number), this time claiming
> it with our swap_info_struct as holder, to be closed at swapoff(2) time
> (just before we close the file)

Note that some drivers look at FMODE_EXCL/BLK_OPEN_EXCL in ->open.
These checks are probably bogus and maybe we want to kill them, but
that will need an audit first.

> BTW, what happens if two threads call ioctl(fd, BLKBSZSET, &n)
> for the same descriptor that happens to have been opened O_EXCL?
> Without O_EXCL they would've been unable to claim the sucker at the same
> time - the holder we are using is the address of a function argument,
> i.e. something that points to kernel stack of the caller.  Those would
> conflict and we either get set_blocksize() calls fully serialized, or
> one of the callers would eat -EBUSY.  Not so in "opened with O_EXCL"
> case - they can very well overlap and IIRC set_blocksize() does *not*
> expect that kind of crap...  It's all under CAP_SYS_ADMIN, so it's not
> as if it was a meaningful security hole anyway, but it does look fishy.

The user gets to keep the pieces.  BLKBSZSET is kinda bogus anyway, as
the soft blocksize only matters for buffer_head-like I/O, and there
only for file systems.  No idea why anyone would set it manually.
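
For illustration only, here is a toy user-space model of the holder conflict described in the quoted BLKBSZSET paragraph. It is a sketch under stated assumptions, not kernel code: struct toy_bdev and every toy_* function are hypothetical names invented for this example. The two "threads" are interleaved by hand so the outcome is deterministic. The model assumes, as the quoted text describes, that a descriptor opened without O_EXCL makes the ioctl take a temporary claim keyed on a stack address, while a descriptor opened with O_EXCL skips that claim.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a block device's exclusive-claim state. */
struct toy_bdev {
    const void *holder;   /* NULL while nobody holds a temporary claim */
    int in_set_blocksize; /* callers currently "inside" set_blocksize */
    int max_overlap;      /* highest number of simultaneous callers seen */
};

/* A claim succeeds if the device is unclaimed or held by the same holder. */
static int toy_claim(struct toy_bdev *b, const void *holder)
{
    if (b->holder && b->holder != holder)
        return -EBUSY;
    b->holder = holder;
    return 0;
}

/* Start a BLKBSZSET-like call. Without O_EXCL the caller first claims the
 * device with an address on its own stack; with O_EXCL no claim is taken. */
static int toy_ioctl_begin(struct toy_bdev *b, int opened_excl, const void *arg)
{
    if (!opened_excl) {
        int err = toy_claim(b, arg);
        if (err)
            return err;
    }
    b->in_set_blocksize++;
    if (b->in_set_blocksize > b->max_overlap)
        b->max_overlap = b->in_set_blocksize;
    return 0;
}

/* Finish the call and drop the temporary claim if one was taken. */
static void toy_ioctl_end(struct toy_bdev *b, int opened_excl)
{
    assert(b->in_set_blocksize > 0);
    b->in_set_blocksize--;
    if (!opened_excl)
        b->holder = NULL;
}

/* Caller B starts while caller A is still inside the call.
 * Returns B's result and stores the peak overlap in *overlap. */
int toy_two_callers(int opened_excl, int *overlap)
{
    struct toy_bdev b = { 0 };
    int arg_a = 4096, arg_b = 512; /* distinct stack addresses act as holders */
    int ret_a = toy_ioctl_begin(&b, opened_excl, &arg_a);
    int ret_b = toy_ioctl_begin(&b, opened_excl, &arg_b);

    if (ret_b == 0)
        toy_ioctl_end(&b, opened_excl);
    if (ret_a == 0)
        toy_ioctl_end(&b, opened_excl);
    *overlap = b.max_overlap;
    return ret_b;
}
```

Calling toy_two_callers(0, &overlap) models the non-O_EXCL case: the second caller gets -EBUSY and at most one caller is ever inside. Calling toy_two_callers(1, &overlap) models the O_EXCL case: both callers get in and the recorded overlap is 2, which is the unserialized situation the quoted text calls fishy.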