From: "Theodore Ts'o" <tytso@mit.edu>
To: Mitta Sai Chaithanya <mittas@microsoft.com>
Cc: "linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
Nilesh Awate <Nilesh.Awate@microsoft.com>,
Ganesan Kalyanasundaram <ganesanka@microsoft.com>,
Pawan Sharma <sharmapawan@microsoft.com>
Subject: Re: [EXTERNAL] Re: EXT4/JBD2 Not Fully Released device after unmount of NVMe-oF Block Device
Date: Tue, 3 Jun 2025 00:29:04 +0000
Message-ID: <20250603002904.GE179983@mit.edu>
In-Reply-To: <TYZP153MB0627DED95B9B9B2E86D66EFED762A@TYZP153MB0627.APCP153.PROD.OUTLOOK.COM>
On Mon, Jun 02, 2025 at 09:32:18PM +0000, Mitta Sai Chaithanya wrote:
> However, after the connection is re-established and the device is
> unmounted from all namespaces, I still observe errors from both ext4
> and jbd2 when the device is especially disconnected.
How do you *know* that you've unmounted the device in all namespaces?
I seem to recall that some process (I think one of the systemd
daemons, but I could be wrong) was creating a namespace that users
were not expecting, resulting in the device staying mounted when
users did not expect it.
The fact that /proc/fs/ext4/<device_name> still exists means that the
kernel (specifically, the VFS layer) doesn't think that the file
system can be shut down. As a result, the VFS layer has not called
ext4's put_super() and kill_sb() methods. And so yes, I/O activity
can still happen, because the file system has not been shut down.
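
The check described above can be scripted. What follows is a minimal sketch, not a definitive tool: the function name `ext4_superblock_is_live` and its `proc_root` parameter are hypothetical, introduced only for illustration. It assumes that the directory under /proc/fs/ext4 is named after the block device (for example `nvme0n1`, an assumed name).

```python
import os


def ext4_superblock_is_live(device_name, proc_root="/proc"):
    """Return True if <proc_root>/fs/ext4/<device_name> exists.

    While that directory is present, the kernel still holds the ext4
    superblock for the device, so the file system has not been shut down.
    """
    return os.path.isdir(os.path.join(proc_root, "fs", "ext4", device_name))
```

Calling `ext4_superblock_is_live("nvme0n1")` after the last expected umount gives a quick yes/no answer before the NVMe-oF connection is torn down. The `proc_root` argument exists only so the function can be exercised against a scratch directory.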
If you still see /proc/fs/ext4/<device_name>, my suggestion would be
to grep /proc/*/mounts to see which processes have a namespace
which still has the device mounted. I suspect that you will see that
there is some namespace that you weren't aware of that is keeping the
ext4 struct super_block object pinned and alive.
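
The grep over /proc/*/mounts can also be done with a short script that groups the matching processes by mount namespace. This is a sketch under stated assumptions: `find_mount_holders` and `proc_root` are hypothetical names, and the device is assumed to appear in the first field of each mounts line under the exact path passed in (a device mounted via a different path, such as a symlink or device-mapper name, would not match). Run it as root, since other users' /proc entries are not always readable.

```python
import os


def find_mount_holders(device, proc_root="/proc"):
    """Map mount-namespace id -> sorted PIDs whose mounts table lists `device`."""
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries under /proc are processes
        pid_dir = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(pid_dir, "mounts")) as f:
                # First whitespace-separated field of each line is the device.
                mounted = any(line.split()[:1] == [device] for line in f)
            if not mounted:
                continue
            # The link target looks like "mnt:[4026531840]" and identifies
            # the mount namespace the process lives in.
            ns = os.readlink(os.path.join(pid_dir, "ns", "mnt"))
        except OSError:
            # The process exited or we lack permission; skip it.
            continue
        holders.setdefault(ns, []).append(int(entry))
    return {ns: sorted(pids) for ns, pids in holders.items()}
```

A call such as `find_mount_holders("/dev/nvme0n1")` (device path assumed) returns an empty dict when no process sees the device as mounted. Any namespace id in the result that differs from the one of your own shell points at a namespace that is still keeping the file system alive.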
> Another point I would like to mention, I am observing JBD2 errors especially after NVMe-oF device has been disconnected and below are the logs.
Sure, but that's the effect, not the cause, of the NVMe-oF device
getting ripped down while the file system is still active. Which I am
99.997% sure is because it is still mounted in some namespace. The
other 0.003% chance is that there is some refcount problem in the VFS
subsystem, and I would suggest that you ask Microsoft's VFS experts
(such as Christian Brauner, who is one of the VFS maintainers) to take
a look. I very much doubt it is a kernel bug, though.
- Ted