* [PATCH] RFC: autofs-5.1.9 - flag removed entries as stale
@ 2025-08-01 15:22 David Disseldorp
2025-08-04 1:31 ` Ian Kent
2025-09-24 8:36 ` Ian Kent
0 siblings, 2 replies; 5+ messages in thread
From: David Disseldorp @ 2025-08-01 15:22 UTC
To: autofs; +Cc: David Disseldorp
This effectively reverts commit 21ce28d ("autofs-5.1.4 - mark removed
cache entry negative"), which causes the kernel to stall in autofs_wait
for the following workload:

  cat > /etc/auto.direct <<EOF
  /nfs/share $mount_args ${NFS_SERVER}:/${NFS_SHARE}
  EOF

  setsid --fork automount --debug --foreground &> /automount.log
  sleep 1

  touch /test.run
  setsid --fork /bin/bash -c \
    "while [[ -f /test.run ]]; do df -ia >> /test.log; sleep 1; done"
  echo "df loop logging to /test.log"

  sleep 2
  echo "changing and reloading auto.direct"
  echo > /etc/auto.direct
  killall -HUP automount

  sleep 2
  echo "unmounting..."
  umount /nfs/share || echo "umount failed"

The current behaviour sees us hit:
  handle_packet_missing_direct:1352: can't find map entry for ()
...which doesn't respond to the kernel, triggering the stall.

This approach adds a new MOUNT_FLAG_STALE flag to track removed map
entries, while keeping enough state around to respond in the
handle_packet_missing_direct case.

RFC:
- needs further testing (e.g. indirect maps)
- I'm not familiar with the codebase so this may be the wrong approach
- we may need a background job to purge MOUNT_FLAG_STALE entries?

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
daemon/direct.c | 8 ++++++--
daemon/indirect.c | 8 ++++++--
daemon/lookup.c | 11 ++++-------
include/automount.h | 3 +++
4 files changed, 19 insertions(+), 11 deletions(-)
diff --git a/daemon/direct.c b/daemon/direct.c
index 42baac8..5e78c40 100644
--- a/daemon/direct.c
+++ b/daemon/direct.c
@@ -1389,8 +1389,12 @@ int handle_packet_missing_direct(struct autofs_point *ap, autofs_packet_missing_
return 0;
}
- /* Check if we recorded a mount fail for this key */
- if (me->status >= monotonic_time(NULL)) {
+ /*
+ * Check if we recorded a mount fail for this key, or the entry has
+ * been removed.
+ */
+ if (me->status >= monotonic_time(NULL) ||
+ me->flags & MOUNT_FLAG_STALE) {
ops->send_fail(ap->logopt,
ioctlfd, pkt->wait_queue_token, -ENOENT);
ops->close(ap->logopt, ioctlfd);
diff --git a/daemon/indirect.c b/daemon/indirect.c
index 7d4aad7..934bb74 100644
--- a/daemon/indirect.c
+++ b/daemon/indirect.c
@@ -798,8 +798,12 @@ int handle_packet_missing_indirect(struct autofs_point *ap, autofs_packet_missin
me = lookup_source_mapent(ap, pkt->name, LKP_DISTINCT);
if (me) {
- /* Check if we recorded a mount fail for this key */
- if (me->status >= monotonic_time(NULL)) {
+ /*
+ * Check if we recorded a mount fail for this key, or the entry
+ * has been removed.
+ */
+ if (me->status >= monotonic_time(NULL) ||
+ me->flags & MOUNT_FLAG_STALE) {
ops->send_fail(ap->logopt, ap->ioctlfd,
pkt->wait_queue_token, -ENOENT);
cache_unlock(me->mc);
diff --git a/daemon/lookup.c b/daemon/lookup.c
index dc77948..ad0b460 100644
--- a/daemon/lookup.c
+++ b/daemon/lookup.c
@@ -1416,15 +1416,12 @@ void lookup_prune_one_cache(struct autofs_point *ap, struct mapent_cache *mc, ti
if (valid && valid->mc == mc) {
/*
* We've found a map entry that has been removed from
- * the current cache so it isn't really valid. Set the
- * mapent negative to prevent further mount requests
+ * the current cache so it isn't really valid. Flag the
+ * mapent stale to prevent further mount requests
* using the cache entry.
*/
- debug(ap->logopt, "removed map entry detected, mark negative");
- if (valid->mapent) {
- free(valid->mapent);
- valid->mapent = NULL;
- }
+ debug(ap->logopt, "removed map entry detected, mark stale");
+ valid->flags |= MOUNT_FLAG_STALE;
cache_unlock(valid->mc);
valid = NULL;
}
diff --git a/include/automount.h b/include/automount.h
index 9548db8..007d020 100644
--- a/include/automount.h
+++ b/include/automount.h
@@ -548,6 +548,9 @@ struct kernel_mod_version {
/* Indicator for applications to ignore the mount entry */
#define MOUNT_FLAG_IGNORE 0x1000
+/* map has been removed, but we can't clean up yet */
+#define MOUNT_FLAG_STALE 0x2000
+
struct autofs_point {
pthread_t thid;
char *path; /* Mount point name */
--
2.43.0
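
For readers following the thread, the check this patch adds to both
missing-packet handlers can be modelled in isolation. The sketch below is
not autofs code: struct entry_model and missing_reply() are hypothetical
names, and only the MOUNT_FLAG_STALE value and the shape of the condition
are taken from the patch. It assumes that status holds the expiry time of
a recorded mount failure, as the existing comment in the handlers suggests.

```c
#include <errno.h>
#include <time.h>

/* Value taken from the patch; everything else here is a hypothetical model. */
#define MOUNT_FLAG_STALE 0x2000

/* Hypothetical stand-in for a cached map entry (not the real struct mapent). */
struct entry_model {
	time_t status;		/* assumed: expiry of a recorded mount failure */
	unsigned int flags;	/* bit flags, e.g. MOUNT_FLAG_STALE */
};

/*
 * Return the errno value the request should be failed with, or 0 if the
 * mount should be attempted. Either an unexpired failure record or the
 * stale flag is enough to fail the request, so the caller always has an
 * answer to send back instead of leaving the requester waiting.
 */
static int missing_reply(const struct entry_model *me, time_t now)
{
	if (me->status >= now || (me->flags & MOUNT_FLAG_STALE))
		return ENOENT;
	return 0;
}
```

With now = 1000, an entry with status 0 and no flags yields 0, while an
entry with status 1060, or one with MOUNT_FLAG_STALE set, yields ENOENT.
The parentheses around the flag test are only for readability; & already
binds tighter than || in C.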
* Re: [PATCH] RFC: autofs-5.1.9 - flag removed entries as stale
2025-08-01 15:22 [PATCH] RFC: autofs-5.1.9 - flag removed entries as stale David Disseldorp
@ 2025-08-04 1:31 ` Ian Kent
2025-08-05 8:37 ` David Disseldorp
2025-09-24 8:36 ` Ian Kent
1 sibling, 1 reply; 5+ messages in thread
From: Ian Kent @ 2025-08-04 1:31 UTC
To: David Disseldorp, autofs
Hi David,

I was surprised to see this because I'm working on the very same problem.
But since I didn't have a reproducer, I've painstakingly worked through
the map reload related code.

I don't know if my changes have fixed the problem, but I can post them
for you to try out. The main reason I would prefer to use my changes
(if they do fix the problem) is that I found quite a few problems with
map reload not working properly, which led to spending a bunch of time on
that. One of the changes fixes the valid entry lookup and removes setting
the entry negative in lookup_prune_one_cache(), and I think it fixes the
devid handling in do_readmap_mount().

I haven't quite finished the series yet, so I'll post it when I have,
hopefully today.

Umm, I probably should give your reproducer a go ... perhaps later ...

Ian
* Re: [PATCH] RFC: autofs-5.1.9 - flag removed entries as stale
2025-08-04 1:31 ` Ian Kent
@ 2025-08-05 8:37 ` David Disseldorp
2025-08-06 0:36 ` Ian Kent
0 siblings, 1 reply; 5+ messages in thread
From: David Disseldorp @ 2025-08-05 8:37 UTC
To: Ian Kent; +Cc: autofs
Hi Ian,

On Mon, 4 Aug 2025 09:31:22 +0800, Ian Kent wrote:

> Hi David,
>
> I was surprised to see this because I'm working on the very same problem.
> But since I didn't have a reproducer I've painstakingly worked through
> the map reload related code.
>
> I don't know if my changes have fixed the problem but I can post them
> for you to try them out.

Sounds good to me, I'm happy to run any rough drafts in my reproducer
environment.

Thanks, David
* Re: [PATCH] RFC: autofs-5.1.9 - flag removed entries as stale
2025-08-05 8:37 ` David Disseldorp
@ 2025-08-06 0:36 ` Ian Kent
0 siblings, 0 replies; 5+ messages in thread
From: Ian Kent @ 2025-08-06 0:36 UTC
To: David Disseldorp; +Cc: autofs
On 5/8/25 16:37, David Disseldorp wrote:
> Hi Ian,
>
> On Mon, 4 Aug 2025 09:31:22 +0800, Ian Kent wrote:
>
>> Hi David,
>>
>> I was surprised to see this because I'm working on the very same problem.
>> But since I didn't have a reproducer I've painstakingly worked through
>> the map reload related code.
>>
>> I don't know if my changes have fixed the problem but I can post them
>> for you to try them out.
>
> Sounds good to me, I'm happy to run any rough drafts in my reproducer
> environment.

I've been a bit ill with a respiratory infection so I've been slowed up.
I'll post the series as soon as I can give the latest revision a bit of
a test myself.

Ian
* Re: [PATCH] RFC: autofs-5.1.9 - flag removed entries as stale
2025-08-01 15:22 [PATCH] RFC: autofs-5.1.9 - flag removed entries as stale David Disseldorp
2025-08-04 1:31 ` Ian Kent
@ 2025-09-24 8:36 ` Ian Kent
1 sibling, 0 replies; 5+ messages in thread
From: Ian Kent @ 2025-09-24 8:36 UTC
To: David Disseldorp, autofs
On 1/8/25 23:22, David Disseldorp wrote:
> RFC:
> - needs further testing (e.g. indirect maps)
> - I'm not familiar with the codebase so this may be the wrong approach
> - we may need a background job to purge MOUNT_FLAG_STALE entries?

In light of the work I did reviewing the map entry cache pruning, this
patch seems not quite right, but it is a good approach and I do like
it. I'll see if I can work it into what I've done to the cache pruning
without breaking it.

Ian
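
The open RFC question about purging MOUNT_FLAG_STALE entries can be
illustrated with a small stand-alone model. This is only a sketch under
stated assumptions, not how autofs prunes its cache: struct entry_model,
its in_use field, push_entry() and purge_stale() are all hypothetical,
and the real cache has its own structure and locking that are ignored
here. The idea shown is a single pass that unlinks and frees entries that
are flagged stale and no longer referenced.

```c
#include <stdlib.h>

/* Value taken from the patch; everything else here is a hypothetical model. */
#define MOUNT_FLAG_STALE 0x2000

/* Hypothetical, simplified cache entry (not the real struct mapent). */
struct entry_model {
	struct entry_model *next;
	unsigned int flags;
	int in_use;		/* assumed: non-zero while still referenced */
};

/*
 * Unlink and free every stale entry that is no longer in use.
 * Returns the number of entries purged.
 */
static size_t purge_stale(struct entry_model **head)
{
	struct entry_model **link = head;
	size_t purged = 0;

	while (*link) {
		struct entry_model *e = *link;

		if ((e->flags & MOUNT_FLAG_STALE) && !e->in_use) {
			*link = e->next;	/* unlink before freeing */
			free(e);
			purged++;
		} else {
			link = &e->next;	/* keep entry, advance */
		}
	}
	return purged;
}

/* Hypothetical helper: push a new entry onto the front of the list. */
static struct entry_model *push_entry(struct entry_model **head,
				      unsigned int flags, int in_use)
{
	struct entry_model *e = calloc(1, sizeof(*e));

	if (!e)
		return NULL;
	e->flags = flags;
	e->in_use = in_use;
	e->next = *head;
	*head = e;
	return e;
}
```

A stale entry that is still in use survives the pass and is picked up by
a later one once in_use drops to zero, which matches the "can't clean up
yet" wording of the flag's comment in the patch.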