* [Question] Unlinking original file of bind mounted file.
From: Yun Levi @ 2022-12-30 8:08 UTC
To: linux-fsdevel

Hello fs-devel folks,

I have a few questions about how the situation below is handled.

======================================================
1. mount --bind {somefile} {target}
2. rm -f {somefile}
=======================================================

When this happens, step (2) works: the file is removed.
But the inode of {somefile} stays live with i_nlink = 0, and its
ext4_inode_info is in the orphan state in ext4.

IIUC, because the ext4 inode entry is removed on disk via ext4_unlink,
it seems possible that the inode entry freed by the unlink in step (2)
will be used again when a new file is created.

Suppose a newly created file recycles the inode entry unlinked in step (2)
while the bind-mounted file is still live.

In that situation, it seems the bind-mounted file could be used to
manipulate the data of the newly created file and access it arbitrarily.

I don't know whether it is right to allow access to the removed file via
the bind-mounted file, and whether this is specified filesystem behavior
or something designed by ext4 only.

Thanks.

--
Best regards,
Levi
* Re: [Question] Unlinking original file of bind mounted file.
From: Matthew Wilcox @ 2022-12-30 10:58 UTC
To: Yun Levi; +Cc: linux-fsdevel

On Fri, Dec 30, 2022 at 05:08:31PM +0900, Yun Levi wrote:
> Hello fs-devel folks,
>
> I have a few questions about how the situation below is handled.
>
> ======================================================
> 1. mount --bind {somefile} {target}
> 2. rm -f {somefile}
> =======================================================
>
> When this happens, step (2) works: the file is removed.
> But the inode of {somefile} stays live with i_nlink = 0, and its
> ext4_inode_info is in the orphan state in ext4.
>
> IIUC, because the ext4 inode entry is removed on disk via ext4_unlink,
> it seems possible that the inode entry freed by the unlink in step (2)
> will be used again when a new file is created.

No, that's not correct. Here's how to think about Unix files (not just
ext4, going all the way back to the 1970s). Each inode has a reference
count. All kinds of things hold a reference count to an inode; some of
the more common ones are a name in a directory, an open file, a mmap of
that open file, passing a file descriptor through a unix socket, etc, etc.

Unlink removes a name from a directory. That causes the reference count
to be decreased, but the inode will only be released if that causes the
reference count to drop to 0. If the file is open, or it has multiple
names, it won't be removed.

mount --bind obviously isn't traditional Unix, but it fits in the same
paradigm. It causes a new reference count to be taken on the inode.
So you can remove the original name that was used to create the link,
and that causes i_nlink to drop to 0, but the in-memory refcount is
still positive, so the inode will not be reused.
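
The same rule can be observed without root privileges by using an open file
descriptor as the extra reference instead of a bind mount. The following is
a minimal sketch, assuming a local Linux filesystem (network filesystems
such as NFS may report a different link count); the function name
unlink_while_open is made up for illustration.

```python
import os
import tempfile


def unlink_while_open():
    """Remove a file's only name while holding it open.

    Returns (st_nlink, data, name_still_exists) as seen afterwards.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here")
        # Drop the only directory entry; the open fd is now the last reference.
        os.unlink(path)
        st = os.fstat(fd)
        # The data stays readable through the descriptor.
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 64)
        return st.st_nlink, data, os.path.exists(path)
    finally:
        # Closing the fd drops the last reference; only then can the
        # filesystem release the inode.
        os.close(fd)


if __name__ == "__main__":
    print(unlink_while_open())
```

The link count reported through the descriptor is 0, yet the contents are
still there: the name is gone, but the inode is not released until the last
reference goes away.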
[parent not found: <CAM7-yPROANYjeGn3ECfqmn0sLzEQPUpzCyU5zSN3-mJv3UA4CA@mail.gmail.com>]
* Fwd: [Question] Unlinking original file of bind mounted file.
From: Yun Levi @ 2022-12-30 11:16 UTC
Cc: linux-fsdevel

> No, that's not correct. Here's how to think about Unix files (not just
> ext4, going all the way back to the 1970s). Each inode has a reference
> count. All kinds of things hold a reference count to an inode; some of
> the more common ones are a name in a directory, an open file, a mmap of
> that open file, passing a file descriptor through a unix socket, etc, etc.
>
> Unlink removes a name from a directory. That causes the reference count
> to be decreased, but the inode will only be released if that causes the
> reference count to drop to 0. If the file is open, or it has multiple
> names, it won't be removed.
>
> mount --bind obviously isn't traditional Unix, but it fits in the same
> paradigm. It causes a new reference count to be taken on the inode.
> So you can remove the original name that was used to create the link,
> and that causes i_nlink to drop to 0, but the in-memory refcount is
> still positive, so the inode will not be reused.
>

Actually, when the bind mount happens on some file, it doesn't
increase inode->i_count.
Instead, it increases the dentry's refcount.
So if we do "mount --bind a b",
it just increases the reference count of the dentry of a, not the i_count of a.

So, when we rm -f a, it just puts the reference on the dentry, but
inode->i_count is not decreased.
When b is unlinked, the dentry is finally killed and the inode is freed.

That's why the inode's count stays at 1 even though a was unlinked,
while inode->i_nlink becomes 0.

Here is what I saw via crash: the reference count of b's inode after
unlinking the original a.

// 0xffff0000c6af9d18 is the inode whose original file was unlinked
// (mentioned as b above).
crash> struct inode.i_count 0xffff0000c6af9d18
  i_count = {
    counter = 1
  },
crash> struct inode.i_nlink 0xffff0000c6af9d18
  i_nlink = 0,
* Re: [Question] Unlinking original file of bind mounted file.
From: Eric Biggers @ 2022-12-30 21:51 UTC
To: Yun Levi; +Cc: linux-fsdevel

On Fri, Dec 30, 2022 at 08:16:19PM +0900, Yun Levi wrote:
> > No, that's not correct. Here's how to think about Unix files (not just
> > ext4, going all the way back to the 1970s). Each inode has a reference
> > count. All kinds of things hold a reference count to an inode; some of
> > the more common ones are a name in a directory, an open file, a mmap of
> > that open file, passing a file descriptor through a unix socket, etc, etc.
> >
> > Unlink removes a name from a directory. That causes the reference count
> > to be decreased, but the inode will only be released if that causes the
> > reference count to drop to 0. If the file is open, or it has multiple
> > names, it won't be removed.
> >
> > mount --bind obviously isn't traditional Unix, but it fits in the same
> > paradigm. It causes a new reference count to be taken on the inode.
> > So you can remove the original name that was used to create the link,
> > and that causes i_nlink to drop to 0, but the in-memory refcount is
> > still positive, so the inode will not be reused.
> >
>
> Actually, when the bind mount happens on some file, it doesn't
> increase inode->i_count.
> Instead, it increases the dentry's refcount.
> So if we do "mount --bind a b",
> it just increases the reference count of the dentry of a, not the i_count of a.

Sure, but the dentry pins the inode.

> So, when we rm -f a, it just puts the reference on the dentry

No, it doesn't change the refcount of the dentry. The unlink does temporarily
increment, and then decrement, the refcount. However, there is still another
reference that's held by the bind mount. For that reason, the dentry's inode is
not released yet; instead, the dentry is just made unavailable to lookups.

> When b is unlinked, the dentry is finally killed and the inode is freed.

You can't actually do that, because the unlink fails with EBUSY. And even if
you could, it would be a different dentry (b instead of a).

> Here is what I saw via crash

If you have a reproducer for an actual crash, please provide it. (And if you do
indeed have an actual crash, please consider that its root cause may be
completely unrelated to the theory that you've described...)

- Eric
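
The counting described above can be followed with a small toy model. This is
only a sketch for illustration, not kernel code: the class and method names
borrow the kernel's vocabulary (dget, dput, iput), but the logic is heavily
simplified and assumes exactly one dentry per inode.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    i_nlink: int = 1       # names in directories (on-disk link count)
    i_count: int = 0       # in-memory references
    evicted: bool = False  # True once the filesystem may free the inode

    def iput(self) -> None:
        self.i_count -= 1
        if self.i_count == 0 and self.i_nlink == 0:
            self.evicted = True


@dataclass
class Dentry:
    inode: Inode
    d_count: int = 0
    hashed: bool = True    # visible to name lookups

    def __post_init__(self) -> None:
        self.inode.i_count += 1  # the dentry pins its inode

    def dget(self) -> None:
        self.d_count += 1

    def dput(self) -> None:
        self.d_count -= 1
        # An unhashed dentry with no users is killed and drops its inode.
        if self.d_count == 0 and not self.hashed:
            self.inode.iput()


def unlink(dentry: Dentry) -> None:
    dentry.dget()               # temporary reference taken by the unlink
    dentry.inode.i_nlink -= 1
    dentry.hashed = False       # made unavailable to lookups
    dentry.dput()               # temporary reference dropped again


if __name__ == "__main__":
    inode = Inode()
    a = Dentry(inode)
    a.dget()                    # stands in for "mount --bind a b"
    unlink(a)                   # stands in for "rm -f a"
    print(inode.i_count, inode.i_nlink, inode.evicted)  # 1 0 False
    a.dput()                    # stands in for "umount b"
    print(inode.evicted)        # True
```

In this model the state after the unlink is i_count = 1 and i_nlink = 0, and
the inode only becomes eligible for freeing once the reference standing in
for the bind mount is dropped.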
* Re: [Question] Unlinking original file of bind mounted file.
From: Yun Levi @ 2022-12-30 22:58 UTC
To: Eric Biggers; +Cc: linux-fsdevel

> No, it doesn't change the refcount of the dentry. The unlink does temporarily
> increment, and then decrement, the refcount. However, there is still another
> reference that's held by the bind mount. For that reason, the dentry's inode is
> not released yet; instead, the dentry is just made unavailable to lookups.
>
> You can't actually do that, because the unlink fails with EBUSY. And even if
> you could, it would be a different dentry (b instead of a).

Thanks for the correction! What I said was wrong, sorry!

> If you have a reproducer for an actual crash, please provide it. (And if you do
> indeed have an actual crash, please consider that its root cause may be
> completely unrelated to the theory that you've described...)

What I describe doesn't cause any panic, but I traced the situation below
on a live system with crash:

======================================================
/**
 * NOT directory bind, file bind.
 */
1. mount --bind {original file} {bind file}

// original's inode->i_count = 1, inode->i_nlink =0, and ext4_inode
becomes orphaned,
// inode->i_no which managed by ext4 is freed and become reusable.
2. rm -f {original file}
=======================================================

No removal of the bind itself is issued.
After step (2), from the VFS point of view the inode seems live,
but from the ext4 point of view it becomes "orphaned", and its
inode->i_no, which is managed by ext4, is removed and reused by
another newly created file.

In that situation, if inode->i_no is reused, it seems the bind file
allows "arbitrary access" (write, read, ...) to the data of the newly
created file with the same i_no.

What I wonder is whether this is intended behavior or not.
IMHO, shouldn't unlinking the original be prohibited to prevent the
arbitrary access above?

Thanks.

--
Sincerely,
Levi
* Re: [Question] Unlinking original file of bind mounted file.
From: Eric Biggers @ 2022-12-30 23:05 UTC
To: Yun Levi; +Cc: linux-fsdevel

On Sat, Dec 31, 2022 at 07:58:09AM +0900, Yun Levi wrote:
> /**
>  * NOT directory bind, file bind.
>  */
> 1. mount --bind {original file} {bind file}
>
> // original's inode->i_count = 1, inode->i_nlink =0, and ext4_inode
> becomes orphaned,
> // inode->i_no which managed by ext4 is freed and become reusable.
> 2. rm -f {original file}

ext4 doesn't free the inode number while the inode is still on the orphan list.
So your claim "inode->i_no which managed by ext4 is freed and become reusable"
is wrong.

- Eric
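
A quick userspace check of this point, again using an open file descriptor
as the remaining reference rather than a bind mount (which would need root):
the sketch below unlinks a file while holding it open, creates a batch of
new files in the same directory, and reports whether any of them received
the held inode number. The function name and the file count are arbitrary
choices for illustration, and the check says nothing about ext4 internals
such as the orphan list; it only observes inode numbers from outside.

```python
import os
import tempfile


def inode_reused_while_held(n_new_files: int = 200) -> bool:
    """Return True if a new file got the inode number of a file that is
    unlinked but still held open."""
    with tempfile.TemporaryDirectory() as d:
        victim = os.path.join(d, "victim")
        fd = os.open(victim, os.O_CREAT | os.O_RDWR, 0o600)
        try:
            held_ino = os.fstat(fd).st_ino
            os.unlink(victim)  # link count drops to 0, the fd keeps the inode
            new_inos = set()
            for i in range(n_new_files):
                path = os.path.join(d, f"new{i}")
                with open(path, "w"):
                    pass
                new_inos.add(os.stat(path).st_ino)
            return held_ino in new_inos
        finally:
            os.close(fd)


if __name__ == "__main__":
    print("inode number reused while held:", inode_reused_while_held())
```

On a filesystem that behaves as described, none of the new files is handed
the number of the unlinked inode for as long as the reference is held.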
* Re: [Question] Unlinking original file of bind mounted file.
From: Yun Levi @ 2022-12-31 4:35 UTC
To: Eric Biggers; +Cc: linux-fsdevel

> ext4 doesn't free the inode number while the inode is still on the orphan list.
> So your claim "inode->i_no which managed by ext4 is freed and become reusable"
> is wrong.

Thanks :) I didn't know that! And sorry for the noise.