* Re: [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer [not found] ` <20260522164415.846589-5-mwilck@suse.com> @ 2026-05-27 9:28 ` Hannes Reinecke 2026-05-27 10:43 ` Martin Wilck 2026-05-27 23:16 ` Benjamin Marzinski 1 sibling, 1 reply; 10+ messages in thread From: Hannes Reinecke @ 2026-05-27 9:28 UTC (permalink / raw) To: Martin Wilck, Christophe Varoqui, Benjamin Marzinski, Brian Bunker, dm-devel Cc: Xose Vazquez Perez, Martin Wilck On 5/22/26 18:44, Martin Wilck wrote: > Add a set of simple functions to handle refcounted pointers. > > Signed-off-by: Martin Wilck <mwilck@suse.com> > Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> > --- > libmpathutil/libmpathutil.version | 7 +++++ > libmpathutil/util.c | 45 ++++++++++++++++++++++++++++--- > libmpathutil/util.h | 5 ++++ > 3 files changed, 53 insertions(+), 4 deletions(-) > That really looks like the Hazard pointers from Paul McK and Mathieu Desnoyers. Maybe have a look and see if you cannot re-use the approach here. > diff --git a/libmpathutil/libmpathutil.version b/libmpathutil/libmpathutil.version > index c69120d..51594ad 100644 > --- a/libmpathutil/libmpathutil.version > +++ b/libmpathutil/libmpathutil.version > @@ -194,3 +194,10 @@ global: > local: > *; > }; > + > +LIBMPATHUTIL_6.1 { > +global: > + alloc_shared_ptr; > + get_shared_ptr; > + put_shared_ptr; > +} LIBMPATHUTIL_6.0; > diff --git a/libmpathutil/util.c b/libmpathutil/util.c > index 23a9797..3c623ec 100644 > --- a/libmpathutil/util.c > +++ b/libmpathutil/util.c > @@ -12,15 +12,13 @@ > #include <dirent.h> > #include <unistd.h> > #include <errno.h> > +#include <urcu/uatomic.h> > #include "mt-udev-wrap.h" > > #include "util.h" > #include "debug.h" > -#include "checkers.h" > #include "vector.h" > -#include "structs.h" > -#include "config.h" > -#include "log.h" > +#include "list.h" > > size_t > strchop(char *str) > @@ -384,3 +382,42 @@ void cleanup_udev_device(struct udev_device **udd) > if (*udd) > udev_device_unref(*udd); > } > + > +struct shared_ptr { > + long refcnt; > + void (*destructor)(void *); > + char __attribute__((aligned(sizeof(void *)))) ptr[]; > +}; > + > +void *alloc_shared_ptr(size_t size, void (*destructor)(void *)) > +{ > + struct shared_ptr *sp = malloc(sizeof(*sp) + size); > + > + if (!sp) > + return NULL; > + uatomic_set(&sp->refcnt, 1); > + sp->destructor = destructor; > + return sp->ptr; > +} > + > +void get_shared_ptr(void *ptr) > +{ > + struct shared_ptr *sp = container_of(ptr, struct shared_ptr, ptr); > + > + if (uatomic_add_return(&sp->refcnt, 1) < 0) > + condlog(0, "%s: refcount overflow", __func__); > +} > + > +void put_shared_ptr(void *ptr) > +{ > + struct shared_ptr *sp; > + > + if (!ptr) > + return; > + sp = container_of(ptr, struct shared_ptr, ptr); > + if (uatomic_sub_return(&sp->refcnt, 1) == 0) { > + if (sp->destructor) > + sp->destructor(ptr); > + free(sp); > + } > +} I _think_ the could be implemented better using RCU primitives; in the end, what you really care about is whether the pointer is valid (9which is what RCU provides). Or indeed use the hazard pointer approach, and count the number of RCU references to that object ... > diff --git a/libmpathutil/util.h b/libmpathutil/util.h > index aed1bc1..ffbb182 100644 > --- a/libmpathutil/util.h > +++ b/libmpathutil/util.h > @@ -160,4 +160,9 @@ void cleanup_charp(char **p); > void cleanup_ucharp(unsigned char **p); > void cleanup_udev_device(struct udev_device **udd); > void cleanup_bitfield(union bitfield **p); > + > +void *alloc_shared_ptr(size_t size, void (*destructor)(void *)); > +void get_shared_ptr(void *ptr); > +void put_shared_ptr(void *ptr); > + > #endif /* UTIL_H_INCLUDED */ ... or just leave it as it is if the approach is to complex. I doubt it's time-critical. Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer 2026-05-27 9:28 ` [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer Hannes Reinecke @ 2026-05-27 10:43 ` Martin Wilck 0 siblings, 0 replies; 10+ messages in thread From: Martin Wilck @ 2026-05-27 10:43 UTC (permalink / raw) To: Hannes Reinecke, Christophe Varoqui, Benjamin Marzinski, Brian Bunker, dm-devel Cc: Xose Vazquez Perez Hi Hannes, On Wed, 2026-05-27 at 11:28 +0200, Hannes Reinecke wrote: > On 5/22/26 18:44, Martin Wilck wrote: > > Add a set of simple functions to handle refcounted pointers. > > > > Signed-off-by: Martin Wilck <mwilck@suse.com> > > Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> > > --- > > libmpathutil/libmpathutil.version | 7 +++++ > > libmpathutil/util.c | 45 > > ++++++++++++++++++++++++++++--- > > libmpathutil/util.h | 5 ++++ > > 3 files changed, 53 insertions(+), 4 deletions(-) > > > That really looks like the Hazard pointers from Paul McK and Mathieu > Desnoyers. Maybe have a look and see if you cannot re-use the > approach here. Thanks for the hint. Hazard pointers are a far more advanced complicated concept, focussed on cache efficiency and performance. That's important for the kernel, but it would be over-engineered for libmultipath, IMO. We just need to make sure that the pointer is not freed as long as it's still referenced somewhere. AFAICS, Mathieu's code is targeted at the kernel. We could consider using some generic user-space hazard pointer implementation for libmultipath if there is something we could take off the shelf (like, as a part of liburcu, which we're using anyway). But I don't see a strong need for it. There's no doubt that libmultipath is inefficient in many ways, but it isn't CPU-intensive, and thus correctness and simplicity matter more than efficiency. > > > diff --git a/libmpathutil/libmpathutil.version > > b/libmpathutil/libmpathutil.version > > index c69120d..51594ad 100644 > > --- a/libmpathutil/libmpathutil.version > > +++ b/libmpathutil/libmpathutil.version > > @@ -194,3 +194,10 @@ global: > > local: > > *; > > }; > > + > > +LIBMPATHUTIL_6.1 { > > +global: > > + alloc_shared_ptr; > > + get_shared_ptr; > > + put_shared_ptr; > > +} LIBMPATHUTIL_6.0; > > diff --git a/libmpathutil/util.c b/libmpathutil/util.c > > index 23a9797..3c623ec 100644 > > --- a/libmpathutil/util.c > > +++ b/libmpathutil/util.c > > @@ -12,15 +12,13 @@ > > #include <dirent.h> > > #include <unistd.h> > > #include <errno.h> > > +#include <urcu/uatomic.h> > > #include "mt-udev-wrap.h" > > > > #include "util.h" > > #include "debug.h" > > -#include "checkers.h" > > #include "vector.h" > > -#include "structs.h" > > -#include "config.h" > > -#include "log.h" > > +#include "list.h" > > > > size_t > > strchop(char *str) > > @@ -384,3 +382,42 @@ void cleanup_udev_device(struct udev_device > > **udd) > > if (*udd) > > udev_device_unref(*udd); > > } > > + > > +struct shared_ptr { > > + long refcnt; > > + void (*destructor)(void *); > > + char __attribute__((aligned(sizeof(void *)))) ptr[]; > > +}; > > + > > +void *alloc_shared_ptr(size_t size, void (*destructor)(void *)) > > +{ > > + struct shared_ptr *sp = malloc(sizeof(*sp) + size); > > + > > + if (!sp) > > + return NULL; > > + uatomic_set(&sp->refcnt, 1); > > + sp->destructor = destructor; > > + return sp->ptr; > > +} > > + > > +void get_shared_ptr(void *ptr) > > +{ > > + struct shared_ptr *sp = container_of(ptr, struct > > shared_ptr, ptr); > > + > > + if (uatomic_add_return(&sp->refcnt, 1) < 0) > > + condlog(0, "%s: refcount overflow", __func__); > > +} > > + > > +void put_shared_ptr(void *ptr) > > +{ > > + struct shared_ptr *sp; > > + > > + if (!ptr) > > + return; > > + sp = container_of(ptr, struct shared_ptr, ptr); > > + if (uatomic_sub_return(&sp->refcnt, 1) == 0) { > > + if (sp->destructor) > > + sp->destructor(ptr); > > + free(sp); > > + } > > +} > > I _think_ the could be implemented better using RCU primitives; > in the end, what you really care about is whether the pointer is > valid (9which is what RCU provides). What we want to achieve is just that (the destructor is called and) the memory is freed when the last reference is dropped, but no sooner. It's the same concept that the kernel uses for refcounted objects all over the place. > Or indeed use the hazard pointer approach, and count the number > of RCU references to that object ... Like I said, I'd like to keep it simple. The basic refcounting that this patch implements does the job for libmultipath. I just created this admittedly simple abstraction because the multipath code was using reference counters in multiple places, and I wanted to consolidate the code and add a unit test for it. It's not the purpose of this patch set to optimize for performance. If we want to optimize it, the API is simple enough that we can substitute the implementation easily with something more efficient in the future. > > > diff --git a/libmpathutil/util.h b/libmpathutil/util.h > > index aed1bc1..ffbb182 100644 > > --- a/libmpathutil/util.h > > +++ b/libmpathutil/util.h > > @@ -160,4 +160,9 @@ void cleanup_charp(char **p); > > void cleanup_ucharp(unsigned char **p); > > void cleanup_udev_device(struct udev_device **udd); > > void cleanup_bitfield(union bitfield **p); > > + > > +void *alloc_shared_ptr(size_t size, void (*destructor)(void *)); > > +void get_shared_ptr(void *ptr); > > +void put_shared_ptr(void *ptr); > > + > > #endif /* UTIL_H_INCLUDED */ > > ... or just leave it as it is if the approach is to complex. > I doubt it's time-critical. > Exactly. I'd like to see what Ben and others think. Regards, Martin > Cheers, > > Hannes ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer [not found] ` <20260522164415.846589-5-mwilck@suse.com> 2026-05-27 9:28 ` [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer Hannes Reinecke @ 2026-05-27 23:16 ` Benjamin Marzinski 2026-05-28 7:50 ` Martin Wilck 1 sibling, 1 reply; 10+ messages in thread From: Benjamin Marzinski @ 2026-05-27 23:16 UTC (permalink / raw) To: Martin Wilck Cc: Christophe Varoqui, Brian Bunker, dm-devel, Xose Vazquez Perez, Martin Wilck On Fri, May 22, 2026 at 06:44:00PM +0200, Martin Wilck wrote: > Add a set of simple functions to handle refcounted pointers. > > Signed-off-by: Martin Wilck <mwilck@suse.com> > Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> > --- > libmpathutil/libmpathutil.version | 7 +++++ > libmpathutil/util.c | 45 ++++++++++++++++++++++++++++--- > libmpathutil/util.h | 5 ++++ > 3 files changed, 53 insertions(+), 4 deletions(-) > > diff --git a/libmpathutil/util.c b/libmpathutil/util.c > index 23a9797..3c623ec 100644 > --- a/libmpathutil/util.c > +++ b/libmpathutil/util.c > @@ -384,3 +382,42 @@ void cleanup_udev_device(struct udev_device **udd) > if (*udd) > udev_device_unref(*udd); > } > + > +struct shared_ptr { > + long refcnt; > + void (*destructor)(void *); > + char __attribute__((aligned(sizeof(void *)))) ptr[]; > +}; > + > +void *alloc_shared_ptr(size_t size, void (*destructor)(void *)) > +{ > + struct shared_ptr *sp = malloc(sizeof(*sp) + size); > + > + if (!sp) > + return NULL; > + uatomic_set(&sp->refcnt, 1); > + sp->destructor = destructor; > + return sp->ptr; > +} > + > +void get_shared_ptr(void *ptr) > +{ We should probably return here if ptr is NULL, like we do in put_shared_ptr(). Alternatively, this might be a reasonable place for an assert(), since nobody should be calling these with NULL pointers. As far as the general question Hannes raise about this implentantion, I'm happy with it as-is. -Ben > + struct shared_ptr *sp = container_of(ptr, struct shared_ptr, ptr); > + > + if (uatomic_add_return(&sp->refcnt, 1) < 0) > + condlog(0, "%s: refcount overflow", __func__); > +} > + > +void put_shared_ptr(void *ptr) > +{ > + struct shared_ptr *sp; > + > + if (!ptr) > + return; > + sp = container_of(ptr, struct shared_ptr, ptr); > + if (uatomic_sub_return(&sp->refcnt, 1) == 0) { > + if (sp->destructor) > + sp->destructor(ptr); > + free(sp); > + } > +} ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer 2026-05-27 23:16 ` Benjamin Marzinski @ 2026-05-28 7:50 ` Martin Wilck 0 siblings, 0 replies; 10+ messages in thread From: Martin Wilck @ 2026-05-28 7:50 UTC (permalink / raw) To: Benjamin Marzinski Cc: Christophe Varoqui, Brian Bunker, dm-devel, Xose Vazquez Perez On Wed, 2026-05-27 at 19:16 -0400, Benjamin Marzinski wrote: > On Fri, May 22, 2026 at 06:44:00PM +0200, Martin Wilck wrote: > > Add a set of simple functions to handle refcounted pointers. > > > > Signed-off-by: Martin Wilck <mwilck@suse.com> > > Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> > > --- > > libmpathutil/libmpathutil.version | 7 +++++ > > libmpathutil/util.c | 45 > > ++++++++++++++++++++++++++++--- > > libmpathutil/util.h | 5 ++++ > > 3 files changed, 53 insertions(+), 4 deletions(-) > > > > diff --git a/libmpathutil/util.c b/libmpathutil/util.c > > index 23a9797..3c623ec 100644 > > --- a/libmpathutil/util.c > > +++ b/libmpathutil/util.c > > @@ -384,3 +382,42 @@ void cleanup_udev_device(struct udev_device > > **udd) > > if (*udd) > > udev_device_unref(*udd); > > } > > + > > +struct shared_ptr { > > + long refcnt; > > + void (*destructor)(void *); > > + char __attribute__((aligned(sizeof(void *)))) ptr[]; > > +}; > > + > > +void *alloc_shared_ptr(size_t size, void (*destructor)(void *)) > > +{ > > + struct shared_ptr *sp = malloc(sizeof(*sp) + size); > > + > > + if (!sp) > > + return NULL; > > + uatomic_set(&sp->refcnt, 1); > > + sp->destructor = destructor; > > + return sp->ptr; > > +} > > + > > +void get_shared_ptr(void *ptr) > > +{ > > We should probably return here if ptr is NULL, like we do in > put_shared_ptr(). Alternatively, this might be a reasonable place for > an > assert(), since nobody should be calling these with NULL pointers. That makes sense. I'll add an assert(), and change put_shared_ptr() accordingly. Martin ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <20260522164415.846589-10-mwilck@suse.com>]
* Re: [PATCH v2 09/19] libmpathutil: runner: use shared_ptr [not found] ` <20260522164415.846589-10-mwilck@suse.com> @ 2026-05-27 9:40 ` Hannes Reinecke 2026-05-27 10:11 ` Martin Wilck 0 siblings, 1 reply; 10+ messages in thread From: Hannes Reinecke @ 2026-05-27 9:40 UTC (permalink / raw) To: Martin Wilck, Christophe Varoqui, Benjamin Marzinski, Brian Bunker, dm-devel Cc: Xose Vazquez Perez, Martin Wilck On 5/22/26 18:44, Martin Wilck wrote: > Use shared_ptr to track the runner_context refcount. > > Signed-off-by: Martin Wilck <mwilck@suse.com> > Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> > --- > libmpathutil/runner.c | 30 +++++++++--------------------- > 1 file changed, 9 insertions(+), 21 deletions(-) > > diff --git a/libmpathutil/runner.c b/libmpathutil/runner.c > index 8c6d6b9..56abd03 100644 > --- a/libmpathutil/runner.c > +++ b/libmpathutil/runner.c > @@ -30,7 +30,6 @@ const char *runner_state_name(int state) > } > > struct runner_context { > - int refcount; > int status; > struct timespec deadline; > pthread_t thr; > @@ -39,17 +38,6 @@ struct runner_context { > char __attribute__((aligned(sizeof(void *)))) data[]; > }; > > -static void release_context(struct runner_context *rctx) > -{ > - int n; > - > - n = uatomic_sub_return(&rctx->refcount, 1); > - assert(n >= 0); > - > - if (n == 0) > - free(rctx); > -} > - > static void cleanup_context(struct runner_context **prctx) > { > struct runner_context *rctx = *prctx; > @@ -65,7 +53,7 @@ static void cleanup_context(struct runner_context **prctx) > "%s: runner %p finished in state '%s'", __func__, rctx, > runner_state_name(st)); > } > - release_context(rctx); > + put_shared_ptr(rctx); > } > > static void *runner_thread(void *arg) > @@ -147,7 +135,7 @@ repeat: > void release_runner(struct runner_context *rctx) > { > cancel_runner(rctx); > - release_context(rctx); > + put_shared_ptr(rctx); > } > > int check_runner(struct runner_context *rctx, void *data, unsigned int size) > @@ -195,17 +183,17 @@ struct runner_context *get_runner(runner_func func, void *data, > return NULL; > } > > - rctx = malloc(sizeof(*rctx) + size); > + rctx = alloc_shared_ptr(sizeof(*rctx) + size, NULL); > if (!rctx) > return NULL; > > rctx->func = func; > /* > - * We have to set the refcount to 2 here. The runner thread may be > - * cancelled before it even had the chance to increase the refcount, > - * which could result in a use-after-free in cleanup_context(). > + * Take an additional reference here. The runner thread may be > + * cancelled before it even had the chance to take a reference, which > + * could result in a use-after-free in cleanup_context(). > */ > - uatomic_set(&rctx->refcount, 2); > + get_shared_ptr(rctx); > uatomic_set(&rctx->status, RUNNER_IDLE); > memcpy(rctx->data, data, size); > > @@ -222,8 +210,8 @@ struct runner_context *get_runner(runner_func func, void *data, > > if (rc) { > condlog(1, "%s: pthread_create(): %s", __func__, strerror(rc)); > - uatomic_dec(&rctx->refcount); > - release_context(rctx); > + put_shared_ptr(rctx); > + put_shared_ptr(rctx); > return NULL; > } > return rctx; Hmm. While I fully get the idea why one would want to use refcounted objects, there really is only one snag: this is multipath. We really, _really_, should make sure the daemon can execute even if the root filesystem is not available. So if we start dropping the reference for the checker (eg if the last path drops and the checker is no longer used), we might end up freeing the memory (and dropping the modules altogether). But if the path comes back we need to reinstate the checker, which not only requires additional memory (which might need a recursion into the filesystem to get free pages) but we also might need to read the checker module from disk (again). So plenty of opportunity to deaslock waithing for the disk be become readable. Maybe it's an idea to have multipathd loading the modules initially, keeping a reference to them, and then dropping the reference once multipathd itself shuts down? Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 09/19] libmpathutil: runner: use shared_ptr 2026-05-27 9:40 ` [PATCH v2 09/19] libmpathutil: runner: use shared_ptr Hannes Reinecke @ 2026-05-27 10:11 ` Martin Wilck 2026-05-27 13:13 ` Hannes Reinecke 0 siblings, 1 reply; 10+ messages in thread From: Martin Wilck @ 2026-05-27 10:11 UTC (permalink / raw) To: Hannes Reinecke, Christophe Varoqui, Benjamin Marzinski, Brian Bunker, dm-devel Cc: Xose Vazquez Perez On Wed, 2026-05-27 at 11:40 +0200, Hannes Reinecke wrote: > On 5/22/26 18:44, Martin Wilck wrote: > > Use shared_ptr to track the runner_context refcount. > > > > Signed-off-by: Martin Wilck <mwilck@suse.com> > > Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> > > --- > > libmpathutil/runner.c | 30 +++++++++--------------------- > > 1 file changed, 9 insertions(+), 21 deletions(-) > > > > diff --git a/libmpathutil/runner.c b/libmpathutil/runner.c > > index 8c6d6b9..56abd03 100644 > > --- a/libmpathutil/runner.c > > +++ b/libmpathutil/runner.c > > @@ -30,7 +30,6 @@ const char *runner_state_name(int state) > > } > > > > struct runner_context { > > - int refcount; > > int status; > > struct timespec deadline; > > pthread_t thr; > > @@ -39,17 +38,6 @@ struct runner_context { > > char __attribute__((aligned(sizeof(void *)))) data[]; > > }; > > > > -static void release_context(struct runner_context *rctx) > > -{ > > - int n; > > - > > - n = uatomic_sub_return(&rctx->refcount, 1); > > - assert(n >= 0); > > - > > - if (n == 0) > > - free(rctx); > > -} > > - > > static void cleanup_context(struct runner_context **prctx) > > { > > struct runner_context *rctx = *prctx; > > @@ -65,7 +53,7 @@ static void cleanup_context(struct runner_context > > **prctx) > > "%s: runner %p finished in state '%s'", > > __func__, rctx, > > runner_state_name(st)); > > } > > - release_context(rctx); > > + put_shared_ptr(rctx); > > } > > > > static void *runner_thread(void *arg) > > @@ -147,7 +135,7 @@ repeat: > > void release_runner(struct runner_context *rctx) > > { > > cancel_runner(rctx); > > - release_context(rctx); > > + put_shared_ptr(rctx); > > } > > > > int check_runner(struct runner_context *rctx, void *data, > > unsigned int size) > > @@ -195,17 +183,17 @@ struct runner_context *get_runner(runner_func > > func, void *data, > > return NULL; > > } > > > > - rctx = malloc(sizeof(*rctx) + size); > > + rctx = alloc_shared_ptr(sizeof(*rctx) + size, NULL); > > if (!rctx) > > return NULL; > > > > rctx->func = func; > > /* > > - * We have to set the refcount to 2 here. The runner > > thread may be > > - * cancelled before it even had the chance to increase the > > refcount, > > - * which could result in a use-after-free in > > cleanup_context(). > > + * Take an additional reference here. The runner thread > > may be > > + * cancelled before it even had the chance to take a > > reference, which > > + * could result in a use-after-free in cleanup_context(). > > */ > > - uatomic_set(&rctx->refcount, 2); > > + get_shared_ptr(rctx); > > uatomic_set(&rctx->status, RUNNER_IDLE); > > memcpy(rctx->data, data, size); > > > > @@ -222,8 +210,8 @@ struct runner_context *get_runner(runner_func > > func, void *data, > > > > if (rc) { > > condlog(1, "%s: pthread_create(): %s", __func__, > > strerror(rc)); > > - uatomic_dec(&rctx->refcount); > > - release_context(rctx); > > + put_shared_ptr(rctx); > > + put_shared_ptr(rctx); > > return NULL; > > } > > return rctx; > > Hmm. While I fully get the idea why one would want to use refcounted > objects, there really is only one snag: this is multipath. > We really, _really_, should make sure the daemon can execute even if > the root filesystem is not available. > So if we start dropping the reference for the checker (eg if the last > path drops and the checker is no longer used), we might end up > freeing > the memory (and dropping the modules altogether). I am pretty sure that we handle this correctly. The current patch set doesn't change anything in this respect, except translating the hard- coded refcounting into the shared_ptr framework. > But if the path comes back we need to reinstate the checker, which > not > only requires additional memory (which might need a recursion into > the > filesystem to get free pages) but we also might need to read the > checker > module from disk (again). So plenty of opportunity to deaslock > waithing > for the disk be become readable. In checker_get(), we call add_checker_class() if the class is not yet loaded, which will take one ref on the class, and then get_shared_ptr() for every path (including the one for which add_checker_class() had been called), which means we have (number of paths + 1) references on the class, and will only drop the last ref when multipathd calls cleanup_checkers() during exit. Therefore I don't think it's possible that we unload the shared library prematurely, and have to reload it later. Regards Martin ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 09/19] libmpathutil: runner: use shared_ptr 2026-05-27 10:11 ` Martin Wilck @ 2026-05-27 13:13 ` Hannes Reinecke 2026-05-27 14:11 ` Martin Wilck 0 siblings, 1 reply; 10+ messages in thread From: Hannes Reinecke @ 2026-05-27 13:13 UTC (permalink / raw) To: Martin Wilck, Christophe Varoqui, Benjamin Marzinski, Brian Bunker, dm-devel Cc: Xose Vazquez Perez On 5/27/26 12:11, Martin Wilck wrote: > On Wed, 2026-05-27 at 11:40 +0200, Hannes Reinecke wrote: >> On 5/22/26 18:44, Martin Wilck wrote: >>> Use shared_ptr to track the runner_context refcount. >>> >>> Signed-off-by: Martin Wilck <mwilck@suse.com> >>> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> >>> --- >>> libmpathutil/runner.c | 30 +++++++++--------------------- >>> 1 file changed, 9 insertions(+), 21 deletions(-) >>> >>> diff --git a/libmpathutil/runner.c b/libmpathutil/runner.c >>> index 8c6d6b9..56abd03 100644 >>> --- a/libmpathutil/runner.c >>> +++ b/libmpathutil/runner.c >>> @@ -30,7 +30,6 @@ const char *runner_state_name(int state) >>> } >>> >>> struct runner_context { >>> - int refcount; >>> int status; >>> struct timespec deadline; >>> pthread_t thr; >>> @@ -39,17 +38,6 @@ struct runner_context { >>> char __attribute__((aligned(sizeof(void *)))) data[]; >>> }; >>> >>> -static void release_context(struct runner_context *rctx) >>> -{ >>> - int n; >>> - >>> - n = uatomic_sub_return(&rctx->refcount, 1); >>> - assert(n >= 0); >>> - >>> - if (n == 0) >>> - free(rctx); >>> -} >>> - >>> static void cleanup_context(struct runner_context **prctx) >>> { >>> struct runner_context *rctx = *prctx; >>> @@ -65,7 +53,7 @@ static void cleanup_context(struct runner_context >>> **prctx) >>> "%s: runner %p finished in state '%s'", >>> __func__, rctx, >>> runner_state_name(st)); >>> } >>> - release_context(rctx); >>> + put_shared_ptr(rctx); >>> } >>> >>> static void *runner_thread(void *arg) >>> @@ -147,7 +135,7 @@ repeat: >>> void release_runner(struct runner_context *rctx) >>> { >>> cancel_runner(rctx); >>> - release_context(rctx); >>> + put_shared_ptr(rctx); >>> } >>> >>> int check_runner(struct runner_context *rctx, void *data, >>> unsigned int size) >>> @@ -195,17 +183,17 @@ struct runner_context *get_runner(runner_func >>> func, void *data, >>> return NULL; >>> } >>> >>> - rctx = malloc(sizeof(*rctx) + size); >>> + rctx = alloc_shared_ptr(sizeof(*rctx) + size, NULL); >>> if (!rctx) >>> return NULL; >>> >>> rctx->func = func; >>> /* >>> - * We have to set the refcount to 2 here. The runner >>> thread may be >>> - * cancelled before it even had the chance to increase the >>> refcount, >>> - * which could result in a use-after-free in >>> cleanup_context(). >>> + * Take an additional reference here. The runner thread >>> may be >>> + * cancelled before it even had the chance to take a >>> reference, which >>> + * could result in a use-after-free in cleanup_context(). >>> */ >>> - uatomic_set(&rctx->refcount, 2); >>> + get_shared_ptr(rctx); >>> uatomic_set(&rctx->status, RUNNER_IDLE); >>> memcpy(rctx->data, data, size); >>> >>> @@ -222,8 +210,8 @@ struct runner_context *get_runner(runner_func >>> func, void *data, >>> >>> if (rc) { >>> condlog(1, "%s: pthread_create(): %s", __func__, >>> strerror(rc)); >>> - uatomic_dec(&rctx->refcount); >>> - release_context(rctx); >>> + put_shared_ptr(rctx); >>> + put_shared_ptr(rctx); >>> return NULL; >>> } >>> return rctx; >> >> Hmm. While I fully get the idea why one would want to use refcounted >> objects, there really is only one snag: this is multipath. >> We really, _really_, should make sure the daemon can execute even if >> the root filesystem is not available. >> So if we start dropping the reference for the checker (eg if the last >> path drops and the checker is no longer used), we might end up >> freeing >> the memory (and dropping the modules altogether). > > I am pretty sure that we handle this correctly. The current patch set > doesn't change anything in this respect, except translating the hard- > coded refcounting into the shared_ptr framework. > >> But if the path comes back we need to reinstate the checker, which >> not >> only requires additional memory (which might need a recursion into >> the >> filesystem to get free pages) but we also might need to read the >> checker >> module from disk (again). So plenty of opportunity to deaslock >> waithing >> for the disk be become readable. > > In checker_get(), we call add_checker_class() if the class is not yet > loaded, which will take one ref on the class, and then get_shared_ptr() > for every path (including the one for which add_checker_class() had > been called), which means we have (number of paths + 1) references on > the class, and will only drop the last ref when multipathd calls > cleanup_checkers() during exit. > > Therefore I don't think it's possible that we unload the shared library > prematurely, and have to reload it later. > Hmm. Really hmmmmm. If the checkers are loaded during startup, and unloaded on shutdown, there really is no point in refcounting them, is there? Wouldn't global pointers to each check sufficient here, avoiding the need for refcounting altogether? Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 09/19] libmpathutil: runner: use shared_ptr 2026-05-27 13:13 ` Hannes Reinecke @ 2026-05-27 14:11 ` Martin Wilck 2026-05-27 17:30 ` Hannes Reinecke 0 siblings, 1 reply; 10+ messages in thread From: Martin Wilck @ 2026-05-27 14:11 UTC (permalink / raw) To: Hannes Reinecke, Christophe Varoqui, Benjamin Marzinski, Brian Bunker, dm-devel Cc: Xose Vazquez Perez On Wed, 2026-05-27 at 15:13 +0200, Hannes Reinecke wrote: > On 5/27/26 12:11, Martin Wilck wrote: > > On Wed, 2026-05-27 at 11:40 +0200, Hannes Reinecke wrote: > > > > > > > But if the path comes back we need to reinstate the checker, > > > which > > > not > > > only requires additional memory (which might need a recursion > > > into > > > the > > > filesystem to get free pages) but we also might need to read the > > > checker > > > module from disk (again). So plenty of opportunity to deaslock > > > waithing > > > for the disk be become readable. > > > > In checker_get(), we call add_checker_class() if the class is not > > yet > > loaded, which will take one ref on the class, and then > > get_shared_ptr() > > for every path (including the one for which add_checker_class() had > > been called), which means we have (number of paths + 1) references > > on > > the class, and will only drop the last ref when multipathd calls > > cleanup_checkers() during exit. > > > > Therefore I don't think it's possible that we unload the shared > > library > > prematurely, and have to reload it later. > > > Hmm. > > Really hmmmmm. > > If the checkers are loaded during startup, and unloaded on shutdown, > there really is no point in refcounting them, is there? > Wouldn't global pointers to each check sufficient here, avoiding > the need for refcounting altogether? Currently checker and prioritizer DSOs aren't loaded during startup. They are loaded when the first path that uses the given checker is initialized. I agree that it would be possible to simplify the code by just loading all checkers and prioritizers in advance, in which case we wouldn't need refcounting, like you wrote. Actually, we don't need to use shared objects in the first place; we could simply include all the checker and prioritizer code in libmultipath itself. The shared object architecture dates back to multipath-tools 0.4.9 (2008). We (actually, you ;-) ) added refcounting to fix races in 323090f ("Use refcounting for checkers"). So far it hasn't occured to me to change this architecture, which has been working well for over a decade. Anyway, this discussion is orthogonal to my current patch set. Regards Martin ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 09/19] libmpathutil: runner: use shared_ptr 2026-05-27 14:11 ` Martin Wilck @ 2026-05-27 17:30 ` Hannes Reinecke 0 siblings, 0 replies; 10+ messages in thread From: Hannes Reinecke @ 2026-05-27 17:30 UTC (permalink / raw) To: Martin Wilck, Christophe Varoqui, Benjamin Marzinski, Brian Bunker, dm-devel Cc: Xose Vazquez Perez On 5/27/26 16:11, Martin Wilck wrote: > On Wed, 2026-05-27 at 15:13 +0200, Hannes Reinecke wrote: >> On 5/27/26 12:11, Martin Wilck wrote: >>> On Wed, 2026-05-27 at 11:40 +0200, Hannes Reinecke wrote: >>> >>> >>>> But if the path comes back we need to reinstate the checker, >>>> which >>>> not >>>> only requires additional memory (which might need a recursion >>>> into >>>> the >>>> filesystem to get free pages) but we also might need to read the >>>> checker >>>> module from disk (again). So plenty of opportunity to deaslock >>>> waithing >>>> for the disk be become readable. >>> >>> In checker_get(), we call add_checker_class() if the class is not >>> yet >>> loaded, which will take one ref on the class, and then >>> get_shared_ptr() >>> for every path (including the one for which add_checker_class() had >>> been called), which means we have (number of paths + 1) references >>> on >>> the class, and will only drop the last ref when multipathd calls >>> cleanup_checkers() during exit. >>> >>> Therefore I don't think it's possible that we unload the shared >>> library >>> prematurely, and have to reload it later. >>> >> Hmm. >> >> Really hmmmmm. >> >> If the checkers are loaded during startup, and unloaded on shutdown, >> there really is no point in refcounting them, is there? >> Wouldn't global pointers to each check sufficient here, avoiding >> the need for refcounting altogether? > > Currently checker and prioritizer DSOs aren't loaded during startup. > They are loaded when the first path that uses the given checker is > initialized. > > I agree that it would be possible to simplify the code by just loading > all checkers and prioritizers in advance, in which case we wouldn't > need refcounting, like you wrote. Actually, we don't need to use shared > objects in the first place; we could simply include all the checker and > prioritizer code in libmultipath itself. > > The shared object architecture dates back to multipath-tools 0.4.9 > (2008). We (actually, you ;-) ) added refcounting to fix races in > 323090f ("Use refcounting for checkers"). So far it hasn't occured to > me to change this architecture, which has been working well for over a > decade. > > Anyway, this discussion is orthogonal to my current patch set. > That is true. So maybe for a later cleanup. Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect hare@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 00/19] multipath-tools: asynchronous checker framework [not found] <20260522164415.846589-1-mwilck@suse.com> [not found] ` <20260522164415.846589-5-mwilck@suse.com> [not found] ` <20260522164415.846589-10-mwilck@suse.com> @ 2026-06-02 5:53 ` Benjamin Marzinski 2 siblings, 0 replies; 10+ messages in thread From: Benjamin Marzinski @ 2026-06-02 5:53 UTC (permalink / raw) To: Martin Wilck Cc: Christophe Varoqui, Brian Bunker, dm-devel, Xose Vazquez Perez, Martin Wilck On Fri, May 22, 2026 at 06:43:56PM +0200, Martin Wilck wrote: > Based on my previous patch set [1], which introduced the "runner" thread > handling code as an abstract implementation of the asynchronous checker > logic in the TUR checker, this set introduces a generic "async_checker" > type and converts all existing checkers except directio to this framework. > With this set applied, all checkers run asyncrhonously unless explicitly > forbidden to do so with the "force_sync" option. > > Patch 1-3 are minor preparations. Patch 4-8 introduce a general concept of a > refcounted "shared" pointer and use it to replace multiple instances > of hard-coded refcounts on objects. Patch 9 converts the current TUR code > into a generic asynchronous checker type by separating the actual TUR > code from the handling of the async runners. Patch 10 adds support for > this async checker type in the checkers code by searching for > "libcheck_async_func" with dlsym() in checker DSOs. Patch 11-15 convert > all checkers except emc_clariion to the async_checker framework. > emc_clariion is special because it needs extra context both for the checker > itself and for multipath-related information. Patch 16 and 17 add the > respective additional fields to the async_checker framework. The mpcontext > handling is rewritten, because the current apporach of handling void ** > pointers would be disastrous in scenarious with truly asynchronous checker > threads that may hang forever. Patch 18 finally converts emc_clariion, > using the concepts from patch 16 and 17. > > Comments and reviews welcome. > > Regards, > Martin > Sorry for the late review. For the set: Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2026-06-02 5:53 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20260522164415.846589-1-mwilck@suse.com>
[not found] ` <20260522164415.846589-5-mwilck@suse.com>
2026-05-27 9:28 ` [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer Hannes Reinecke
2026-05-27 10:43 ` Martin Wilck
2026-05-27 23:16 ` Benjamin Marzinski
2026-05-28 7:50 ` Martin Wilck
[not found] ` <20260522164415.846589-10-mwilck@suse.com>
2026-05-27 9:40 ` [PATCH v2 09/19] libmpathutil: runner: use shared_ptr Hannes Reinecke
2026-05-27 10:11 ` Martin Wilck
2026-05-27 13:13 ` Hannes Reinecke
2026-05-27 14:11 ` Martin Wilck
2026-05-27 17:30 ` Hannes Reinecke
2026-06-02 5:53 ` [PATCH v2 00/19] multipath-tools: asynchronous checker framework Benjamin Marzinski
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.