From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from fhigh-a3-smtp.messagingengine.com (fhigh-a3-smtp.messagingengine.com [103.168.172.154]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A8E2421504E; Fri, 11 Jul 2025 19:11:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.154 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752261111; cv=none; b=F1a1FxkY39OBXT02ri+6Ni0ifrCSNi+hy7ksKKYPs+jaSUhGqMxpkCrOlJiiuvEIoeJptyr29CBml9bVPJ1CqBj28BhKJf/6yOP5ggv4XQDaoy1ED/u1axArQdeNLyAmL+EYy1y6IMmiudOrmPjyYp7mWWUSAKh3b+L1UADyaaM= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752261111; c=relaxed/simple; bh=IIg3URYmv6xWYI14i1THAS7URsoobYG/rgtasZkZ9OQ=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=T+JpRTIAUiVlqEM8FW6AQwBvXSAnOoFDE37URY7EEa0593IVsGuvj+hu0BnTOSrtmxbwhOBZKvn+DNHUtMJLXpSA0p5aYqHS2gHx41DACtgShkXrqp9Y45Q4Ld3ewhfu7d/mDFR7Xqcps4N3WeZuI7BIDcACJ2dG/OKbCL2z0qM= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=maowtm.org; spf=pass smtp.mailfrom=maowtm.org; dkim=pass (2048-bit key) header.d=maowtm.org header.i=@maowtm.org header.b=TIbcL63m; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=entiRvvW; arc=none smtp.client-ip=103.168.172.154 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=maowtm.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=maowtm.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=maowtm.org header.i=@maowtm.org header.b="TIbcL63m"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="entiRvvW" Received: from phl-compute-01.internal (phl-compute-01.phl.internal [10.202.2.41]) by mailfhigh.phl.internal (Postfix) with ESMTP id CE0DF14000BA; Fri, 11 Jul 2025 15:11:48 -0400 (EDT) Received: from phl-mailfrontend-01 ([10.202.2.162]) by phl-compute-01.internal (MEProxy); Fri, 11 Jul 2025 15:11:48 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=maowtm.org; h=cc :cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm2; t=1752261108; x=1752347508; bh=zVqXbJMyuYU/2MbDexMJAmbazqSopFPwjDQCJvXjDSs=; b= TIbcL63m4VSaBnN3C60VDQT+NlcNNA6nunXsxXM63w+wHOsw5n5/Q4MQfTxbLKHG FMAegqxnvKYlt1t0FEkXZOTl7CVBMTPGhi9SJzzLETPWxGEmJeFYgP/+WwFzgAHi XjWTP0F3Sl1eKU69lNCU+4TU2IZv5BWZlP3/sIJJmHqbr4tBEeT9NcrQygJh4RHk e2f/pPcJ9Zt6iUL/rCQa2/UhDgQsARgetTVE2t6SZzZ8Is78o5x+8PV8gL2gEK2i GdoZaLu4ZwNgff4iNPnst9tY7GNPworsIdIMzigQEJ3Fzv+lKIPQu01DtfBEcTJJ 3gGSsmiMKZIUBsu4iD9URA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1752261108; x= 1752347508; bh=zVqXbJMyuYU/2MbDexMJAmbazqSopFPwjDQCJvXjDSs=; b=e ntiRvvWup2vYHXmq1f5P/DM8XAlJcNZfTkbKTFTxU/Lygko0jcVWJDnp+i3mX6tR eIXjW90D7jlZjpwspusx32U1Cb/GfJ+UtD5hqIn4EiZj0UZKG/NFgDcuIlu6YLmO bjW5uDyHuFynPKJPEzbVv7p/lIXY8NgZ5BTHZN+NCvs+M4WStm8Bs3gF6dwxzzSf R+oPlTKl2NnxsRnWKgMMYBgYz63/dDMhqb4LzDVEE7cfcGqwJYbtnkthTzTe8i+S I6r0zPDd3FNUoBDep0q0kC4J0Dmu9Q7ilcroWAQAfZT4YXcEpvlcfPzFUUQZlH66 Na/PX6JtXfSxWOVfGZ9bQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdefgdeggeduvdcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpuffrtefokffrpgfnqfghnecuuegr ihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenucfjug hrpefkffggfgfuvfevfhfhjggtgfesthejredttddvjeenucfhrhhomhepvfhinhhgmhgr ohcuhggrnhhguceomhesmhgrohifthhmrdhorhhgqeenucggtffrrghtthgvrhhnpeefvd ehleeutdfhlefgvedvgfeklefgleekgedtvdehvdfgtdefieelhfdutefgudenucevlhhu shhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehmsehmrghofihtmh drohhrghdpnhgspghrtghpthhtohepudefpdhmohguvgepshhmthhpohhuthdprhgtphht thhopehvihhrohesiigvnhhivhdrlhhinhhugidrohhrghdruhhkpdhrtghpthhtohepvg hrihgtvhhhsehkvghrnhgvlhdrohhrghdprhgtphhtthhopegrshhmrgguvghushestgho uggvfihrvggtkhdrohhrghdprhgtphhtthhopehluhgthhhosehiohhnkhhovhdrnhgvth dprhgtphhtthhopehlihhnuhigpghoshhssegtrhhuuggvsgihthgvrdgtohhmpdhrtghp thhtohepvhelfhhssehlihhsthhsrdhlihhnuhigrdguvghvpdhrtghpthhtohepmhhitg esughighhikhhougdrnhgvthdprhgtphhtthhopehgnhhorggtkhesghhoohhglhgvrdgt ohhmpdhrtghpthhtoheplhhinhhugidqshgvtghurhhithihqdhmohguuhhlvgesvhhgvg hrrdhkvghrnhgvlhdrohhrgh X-ME-Proxy: Feedback-ID: i580e4893:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 11 Jul 2025 15:11:45 -0400 (EDT) Message-ID: Date: Fri, 11 Jul 2025 20:11:44 +0100 Precedence: bulk X-Mailing-List: linux-security-module@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [RFC PATCH 1/6] fs/9p: Add ability to identify inode by path for .L To: Al Viro Cc: Eric Van Hensbergen , Dominique Martinet , Latchesar Ionkov , Christian Schoenebeck , v9fs@lists.linux.dev, =?UTF-8?Q?Micka=C3=ABl_Sala=C3=BCn?= , =?UTF-8?Q?G=C3=BCnther_Noack?= , linux-security-module@vger.kernel.org, Jan Kara , Amir Goldstein , Matthew Bobrowski , linux-fsdevel@vger.kernel.org References: <20250705002536.GW1880847@ZenIV> Content-Language: en-US From: Tingmao Wang In-Reply-To: <20250705002536.GW1880847@ZenIV> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Hi Al, thanks for the review :) I haven't had the chance to properly think about this until today, so apologies for the delay. On 7/5/25 01:19, Al Viro wrote: > On Sun, Apr 06, 2025 at 09:43:02PM +0100, Tingmao Wang wrote: > >> +struct v9fs_ino_path *make_ino_path(struct dentry *dentry) >> +{ >> + struct v9fs_ino_path *path; >> + size_t path_components = 0; >> + struct dentry *curr = dentry; >> + ssize_t i; >> + >> + lockdep_assert_held_read(&v9fs_dentry2v9ses(dentry)->rename_sem); >> + >> + rcu_read_lock(); >> + >> + /* Don't include the root dentry */ >> + while (curr->d_parent != curr) { >> + path_components++; >> + curr = curr->d_parent; >> + } >> + if (WARN_ON(path_components > SSIZE_MAX)) { (Looking at this again I think this check is a bit bogus. I don't know how would it be possible at all for us to have > SSIZE_MAX deep directories especially since each level requires a dentry allocation, but even if this check is actually useful, it should be in the while loop, before each path_components++) >> + rcu_read_unlock(); >> + return NULL; >> + } >> + >> + path = kmalloc(struct_size(path, names, path_components), >> + GFP_KERNEL); > > Blocking allocation under rcu_read_lock(). I think my first instinct of how to fix this, if the original code is correct barring this allocation issue, would be to take rcu read lock twice (first walk to calculate how much to allocate, then second walk to actually take the snapshots). We should be safe to rcu_read_unlock() in the middle as long as caller has a reference to the target dentry (this needs to be true even if we just do one rcu_read_lock() anyway), and we can start a parent walk again. The v9fs rename_sem should ensure we see the same path again. Alternatively, we can use dget_parent to do the walk, and not lock RCU at all. We still need to walk twice tho, to know how much to allocate. But for now I will keep the current approach. New version: /* * Must hold rename_sem due to traversing parents. Caller must hold * reference to dentry. */ struct v9fs_ino_path *make_ino_path(struct dentry *dentry) { struct v9fs_ino_path *path; size_t path_components = 0; struct dentry *curr = dentry; ssize_t i; lockdep_assert_held_read(&v9fs_dentry2v9ses(dentry)->rename_sem); might_sleep(); /* Allocation below might block */ rcu_read_lock(); /* Don't include the root dentry */ while (curr->d_parent != curr) { if (WARN_ON(path_components >= SSIZE_MAX)) { rcu_read_unlock(); return NULL; } path_components++; curr = curr->d_parent; } /* * Allocation can block so don't do it in RCU (and because the * allocation might be large, since name_snapshot leaves space for * inline str, not worth trying GFP_ATOMIC) */ rcu_read_unlock(); path = kmalloc(struct_size(path, names, path_components), GFP_KERNEL); if (!path) { rcu_read_unlock(); return NULL; } path->nr_components = path_components; curr = dentry; rcu_read_lock(); for (i = path_components - 1; i >= 0; i--) { take_dentry_name_snapshot(&path->names[i], curr); curr = curr->d_parent; } WARN_ON(curr != curr->d_parent); rcu_read_unlock(); return path; } How does this look? On 7/5/25 01:25, Al Viro wrote: > On Sun, Apr 06, 2025 at 09:43:02PM +0100, Tingmao Wang wrote: >> +bool ino_path_compare(struct v9fs_ino_path *ino_path, >> + struct dentry *dentry) >> +{ >> + struct dentry *curr = dentry; >> + struct qstr *curr_name; >> + struct name_snapshot *compare; >> + ssize_t i; >> + >> + lockdep_assert_held_read(&v9fs_dentry2v9ses(dentry)->rename_sem); >> + >> + rcu_read_lock(); >> + for (i = ino_path->nr_components - 1; i >= 0; i--) { >> + if (curr->d_parent == curr) { >> + /* We're supposed to have more components to walk */ >> + rcu_read_unlock(); >> + return false; >> + } >> + curr_name = &curr->d_name; >> + compare = &ino_path->names[i]; >> + /* >> + * We can't use hash_len because it is salted with the parent >> + * dentry pointer. We could make this faster by pre-computing our >> + * own hashlen for compare and ino_path outside, probably. >> + */ >> + if (curr_name->len != compare->name.len) { >> + rcu_read_unlock(); >> + return false; >> + } >> + if (strncmp(curr_name->name, compare->name.name, >> + curr_name->len) != 0) { > > ... without any kind of protection for curr_name. Incidentally, > what about rename()? Not a cross-directory one, just one that > changes the name of a subdirectory within the same parent? As far as I can tell, in v9fs_vfs_rename, v9ses->rename_sem is taken for both same-parent and different parent renames, so I think we're safe here (and hopefully for any v9fs dentries, nobody should be causing d_name to change except for ourselves when we call d_move in v9fs_vfs_rename? If yes then because we also take v9ses->rename_sem, in theory we should be fine here...?) (Let me know if I missed anything. I'm assuming only the filesystem "owning" a dentry should d_move/d_exchange the dentry.) However, I see that there is a d_same_name function in dcache.c which is slightly more careful (but still requires the caller to check the dentry seqcount, which we do not need to because of the reasoning above), and in hindsight I think that is probably the more proper way to do this comparison (and will also handle case-insensitivity, although I've not explored if this is applicable to 9pfs). New version: /* * Must hold rename_sem due to traversing parents */ bool ino_path_compare(struct v9fs_ino_path *ino_path, struct dentry *dentry) { struct dentry *curr = dentry; struct name_snapshot *compare; ssize_t i; lockdep_assert_held_read(&v9fs_dentry2v9ses(dentry)->rename_sem); rcu_read_lock(); for (i = ino_path->nr_components - 1; i >= 0; i--) { if (curr->d_parent == curr) { /* We're supposed to have more components to walk */ rcu_read_unlock(); return false; } compare = &ino_path->names[i]; if (!d_same_name(curr, curr->d_parent, &compare->name)) { rcu_read_unlock(); return false; } curr = curr->d_parent; } rcu_read_unlock(); if (curr != curr->d_parent) { /* dentry is deeper than ino_path */ return false; } return true; } If you think this is not enough, can you suggest what would be needed? I'm thinking maybe we can check dentry seqcount to be safe, but from earlier discussion in "bpf path iterator" my impression is that that is VFS internal data - can we use it here (if needed)? (I assume, from looking at the code, just having a reference does not prevent d_name from changing)