From: Steve Dickson <steved@redhat.com>
To: Aaron Tomlin <atomlin@atomlin.com>, tbecker@redhat.com
Cc: yi.zhang@redhat.com, linux-nfs@vger.kernel.org
Subject: Re: [PATCH] nfsrahead: enable event-driven mountinfo monitoring and skip non-NFS devices
Date: Fri, 6 Mar 2026 17:10:03 -0500
Message-ID: <ea495f1d-1464-4f9d-91de-dd3fe828fcff@redhat.com>
In-Reply-To: <20260306161929.4148128-1-atomlin@atomlin.com>

On 3/6/26 11:19 AM, Aaron Tomlin wrote:
> The nfsrahead utility relies on parsing "/proc/self/mountinfo" to
> correlate a device number with a specific NFS mount point. However, due
> to the asynchronous nature of system initialisation, the relevant entry
> in mountinfo may not be immediately available when the tool is executed.
>
> Currently, the utility employs a naive polling mechanism, retrying the
> search five times with a fixed 50ms delay (totalling 250ms). This
> approach proves brittle on systems under heavy load or during
> distinctively slow boot sequences.
>
> To mitigate this race condition and improve robustness, update
> get_device_info() to utilise the libmount monitoring API.
>
> The new implementation introduces the following logic:
>
> 1. Initialises a monitor on /proc/self/mountinfo using
> mnt_new_monitor().
>
> 2. Replaces the fixed polling loop with mnt_monitor_wait().
>
> 3. Increases the maximum wait time to 10 seconds (MNT_NM_TIMEOUT).
>
> 4. Introduces a fast-path rejection mechanism. NFS backing devices are
> allocated from the kernel's unnamed block device pool (major number
> 0). While some local multi-device filesystems (such as Btrfs) also
> utilise anonymous device numbers, physical hardware block devices
> (e.g., sda, nvme) always possess specific, non-zero major numbers.
> By instantly exiting with -ENODEV for any device string not
> beginning with "0:", we safely bypass the monitor for physical
> drives, preventing the exhaustion of udev worker threads.
> See set_anon_super() and get_anon_bdev().
>
> 5. Implements strict monotonic deadline tracking within the monitor
> loop to prevent indefinite blocking.
>
> Fixes: 2b62ac4c ("nfsrahead: enable event-driven mountinfo monitoring")
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Link: https://lore.kernel.org/linux-block/CAHj4cs8URj2fJ7KyP9ViAm6npVOaMiAErnw2uFyPYEU2wb7G_w@mail.gmail.com/T/#t
> Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
Committed... (tag: nfs-utils-2-8-6-rc4)
steved.
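
For readers following along, the fast-path test described in item 4
above can be isolated as a small predicate. This is only a sketch:
is_anon_device() is a hypothetical name, not a function in nfs-utils,
and it assumes the device string has the "major:minor" form that the
patch compares against.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of nfs-utils): returns true when a
 * "major:minor" string names a device with major number 0, i.e. one
 * allocated from the kernel's anonymous (unnamed) device pool.
 */
static bool is_anon_device(const char *device_number)
{
	if (device_number == NULL)
		return false;

	/* Same prefix test as the patch: the string must start with "0:". */
	return strncmp(device_number, "0:", 2) == 0;
}
```

A true result only means the device is a candidate (Btrfs and other
anonymous-device filesystems also pass), so the mountinfo lookup is
still needed afterwards to confirm an NFS mount.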
> ---
>
> Hi Steve,
>
> This patch should resolve the udev worker exhaustion issue reported by
> Yi. It applies cleanly on top of the current nfs-utils tree, after your
> revert [1].
>
> Thank you.
>
> [1]: https://lore.kernel.org/linux-nfs/20260305124221.55407-1-steved@redhat.com/
>
>
> tools/nfsrahead/main.c | 55 +++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 54 insertions(+), 1 deletion(-)
>
> diff --git a/tools/nfsrahead/main.c b/tools/nfsrahead/main.c
> index b7b889ff..78cd2581 100644
> --- a/tools/nfsrahead/main.c
> +++ b/tools/nfsrahead/main.c
> @@ -3,6 +3,7 @@
> #include <stdlib.h>
> #include <errno.h>
> #include <unistd.h>
> +#include <time.h>
>
> #include <libmount/libmount.h>
> #include <sys/sysmacros.h>
> @@ -17,6 +18,8 @@
> #define CONF_NAME "nfsrahead"
> #define NFS_DEFAULT_READAHEAD 128
>
> +#define MNT_NM_TIMEOUT 10000
> +
> /* Device information from the system */
> struct device_info {
> char *device_number;
> @@ -117,7 +120,57 @@ out_free_device_info:
>
> static int get_device_info(const char *device_number, struct device_info *device_info)
> {
> -	int ret = get_mountinfo(device_number, device_info, MOUNTINFO_PATH);
> +	int ret;
> +	struct libmnt_monitor *mn = NULL;
> +	struct timespec start, now;
> +	int remaining_ms = MNT_NM_TIMEOUT;
> +
> +	/*
> +	 * Fast-path rejection:
> +	 * NFS backing devices always use the anonymous block device major number (0).
> +	 * If the device number does not start with "0:", it is a physical block device
> +	 * and will never be an NFS mount. Exit immediately to prevent blocking udev.
> +	 */
> +	if (strncmp(device_number, "0:", 2) != 0)
> +		return -ENODEV;
> +
> +	ret = get_mountinfo(device_number, device_info, MOUNTINFO_PATH);
> +	if (ret == 0)
> +		return 0;
> +
> +	mn = mnt_new_monitor();
> +	if (!mn)
> +		goto fallback;
> +
> +	if (mnt_monitor_enable_kernel(mn, 1) < 0) {
> +		mnt_unref_monitor(mn);
> +		goto fallback;
> +	}
> +
> +	clock_gettime(CLOCK_MONOTONIC, &start);
> +
> +	while (remaining_ms > 0) {
> +		int rc = mnt_monitor_wait(mn, remaining_ms);
> +		if (rc > 0) {
> +			ret = get_mountinfo(device_number, device_info, MOUNTINFO_PATH);
> +			if (ret == 0) {
> +				mnt_unref_monitor(mn);
> +				return 0;
> +			}
> +		} else {
> +			break;
> +		}
> +
> +		clock_gettime(CLOCK_MONOTONIC, &now);
> +		long elapsed_ms = (now.tv_sec - start.tv_sec) * 1000 +
> +				  (now.tv_nsec - start.tv_nsec) / 1000000;
> +		remaining_ms = MNT_NM_TIMEOUT - elapsed_ms;
> +	}
> +
> +	mnt_unref_monitor(mn);
> +	return ret;
> +
> +fallback:
> 	for (int retry_count = 0; retry_count < 5 && ret != 0; retry_count++) {
> 		usleep(50000);
> 		ret = get_mountinfo(device_number, device_info, MOUNTINFO_PATH);
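
As an aside, the deadline arithmetic in the monitor loop above can be
read in isolation. The sketch below is not code from nfs-utils:
elapsed_ms() and remaining_ms() are hypothetical helper names, and the
clamp to zero is an addition for illustration. It shows why subtracting
two CLOCK_MONOTONIC samples gives a sound millisecond budget even when
the tv_nsec difference is negative.

```c
#include <time.h>

/*
 * Hypothetical helper: milliseconds elapsed between two samples of the
 * same clock. A negative tv_nsec difference is compensated by the
 * whole-second term, so no explicit borrow is needed.
 */
static long elapsed_ms(const struct timespec *start,
		       const struct timespec *now)
{
	return (long)(now->tv_sec - start->tv_sec) * 1000 +
	       (now->tv_nsec - start->tv_nsec) / 1000000;
}

/*
 * Hypothetical helper: time left in a timeout_ms budget, clamped at
 * zero so that a caller can pass the result straight to a wait call.
 */
static int remaining_ms(const struct timespec *start,
			const struct timespec *now, int timeout_ms)
{
	long left = timeout_ms - elapsed_ms(start, now);

	return left > 0 ? (int)left : 0;
}
```

In the patch, the samples come from clock_gettime(CLOCK_MONOTONIC, ...)
and the budget is MNT_NM_TIMEOUT. Because the monotonic clock is not
affected by wall-clock adjustments, the loop cannot be stretched or cut
short by a time change during boot.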
Thread overview: 4+ messages
2026-03-06 16:19 [PATCH] nfsrahead: enable event-driven mountinfo monitoring and skip non-NFS devices Aaron Tomlin
2026-03-06 22:10 ` Steve Dickson [this message]
2026-03-09 12:38 ` Yi Zhang
2026-03-09 13:29 ` Aaron Tomlin