* Re: [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer
[not found] ` <20260522164415.846589-5-mwilck@suse.com>
@ 2026-05-27 9:28 ` Hannes Reinecke
2026-05-27 10:43 ` Martin Wilck
2026-05-27 23:16 ` Benjamin Marzinski
1 sibling, 1 reply; 10+ messages in thread
From: Hannes Reinecke @ 2026-05-27 9:28 UTC (permalink / raw)
To: Martin Wilck, Christophe Varoqui, Benjamin Marzinski,
Brian Bunker, dm-devel
Cc: Xose Vazquez Perez, Martin Wilck
On 5/22/26 18:44, Martin Wilck wrote:
> Add a set of simple functions to handle refcounted pointers.
>
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
> libmpathutil/libmpathutil.version | 7 +++++
> libmpathutil/util.c | 45 ++++++++++++++++++++++++++++---
> libmpathutil/util.h | 5 ++++
> 3 files changed, 53 insertions(+), 4 deletions(-)
>
That really looks like the Hazard pointers from Paul McK and Mathieu
Desnoyers. Maybe have a look and see if you cannot re-use the
approach here.
> diff --git a/libmpathutil/libmpathutil.version b/libmpathutil/libmpathutil.version
> index c69120d..51594ad 100644
> --- a/libmpathutil/libmpathutil.version
> +++ b/libmpathutil/libmpathutil.version
> @@ -194,3 +194,10 @@ global:
> local:
> *;
> };
> +
> +LIBMPATHUTIL_6.1 {
> +global:
> + alloc_shared_ptr;
> + get_shared_ptr;
> + put_shared_ptr;
> +} LIBMPATHUTIL_6.0;
> diff --git a/libmpathutil/util.c b/libmpathutil/util.c
> index 23a9797..3c623ec 100644
> --- a/libmpathutil/util.c
> +++ b/libmpathutil/util.c
> @@ -12,15 +12,13 @@
> #include <dirent.h>
> #include <unistd.h>
> #include <errno.h>
> +#include <urcu/uatomic.h>
> #include "mt-udev-wrap.h"
>
> #include "util.h"
> #include "debug.h"
> -#include "checkers.h"
> #include "vector.h"
> -#include "structs.h"
> -#include "config.h"
> -#include "log.h"
> +#include "list.h"
>
> size_t
> strchop(char *str)
> @@ -384,3 +382,42 @@ void cleanup_udev_device(struct udev_device **udd)
> if (*udd)
> udev_device_unref(*udd);
> }
> +
> +struct shared_ptr {
> + long refcnt;
> + void (*destructor)(void *);
> + char __attribute__((aligned(sizeof(void *)))) ptr[];
> +};
> +
> +void *alloc_shared_ptr(size_t size, void (*destructor)(void *))
> +{
> + struct shared_ptr *sp = malloc(sizeof(*sp) + size);
> +
> + if (!sp)
> + return NULL;
> + uatomic_set(&sp->refcnt, 1);
> + sp->destructor = destructor;
> + return sp->ptr;
> +}
> +
> +void get_shared_ptr(void *ptr)
> +{
> + struct shared_ptr *sp = container_of(ptr, struct shared_ptr, ptr);
> +
> + if (uatomic_add_return(&sp->refcnt, 1) < 0)
> + condlog(0, "%s: refcount overflow", __func__);
> +}
> +
> +void put_shared_ptr(void *ptr)
> +{
> + struct shared_ptr *sp;
> +
> + if (!ptr)
> + return;
> + sp = container_of(ptr, struct shared_ptr, ptr);
> + if (uatomic_sub_return(&sp->refcnt, 1) == 0) {
> + if (sp->destructor)
> + sp->destructor(ptr);
> + free(sp);
> + }
> +}
I _think_ the could be implemented better using RCU primitives;
in the end, what you really care about is whether the pointer is
valid (9which is what RCU provides).
Or indeed use the hazard pointer approach, and count the number
of RCU references to that object ...
> diff --git a/libmpathutil/util.h b/libmpathutil/util.h
> index aed1bc1..ffbb182 100644
> --- a/libmpathutil/util.h
> +++ b/libmpathutil/util.h
> @@ -160,4 +160,9 @@ void cleanup_charp(char **p);
> void cleanup_ucharp(unsigned char **p);
> void cleanup_udev_device(struct udev_device **udd);
> void cleanup_bitfield(union bitfield **p);
> +
> +void *alloc_shared_ptr(size_t size, void (*destructor)(void *));
> +void get_shared_ptr(void *ptr);
> +void put_shared_ptr(void *ptr);
> +
> #endif /* UTIL_H_INCLUDED */
... or just leave it as it is if the approach is to complex.
I doubt it's time-critical.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 09/19] libmpathutil: runner: use shared_ptr
[not found] ` <20260522164415.846589-10-mwilck@suse.com>
@ 2026-05-27 9:40 ` Hannes Reinecke
2026-05-27 10:11 ` Martin Wilck
0 siblings, 1 reply; 10+ messages in thread
From: Hannes Reinecke @ 2026-05-27 9:40 UTC (permalink / raw)
To: Martin Wilck, Christophe Varoqui, Benjamin Marzinski,
Brian Bunker, dm-devel
Cc: Xose Vazquez Perez, Martin Wilck
On 5/22/26 18:44, Martin Wilck wrote:
> Use shared_ptr to track the runner_context refcount.
>
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
> libmpathutil/runner.c | 30 +++++++++---------------------
> 1 file changed, 9 insertions(+), 21 deletions(-)
>
> diff --git a/libmpathutil/runner.c b/libmpathutil/runner.c
> index 8c6d6b9..56abd03 100644
> --- a/libmpathutil/runner.c
> +++ b/libmpathutil/runner.c
> @@ -30,7 +30,6 @@ const char *runner_state_name(int state)
> }
>
> struct runner_context {
> - int refcount;
> int status;
> struct timespec deadline;
> pthread_t thr;
> @@ -39,17 +38,6 @@ struct runner_context {
> char __attribute__((aligned(sizeof(void *)))) data[];
> };
>
> -static void release_context(struct runner_context *rctx)
> -{
> - int n;
> -
> - n = uatomic_sub_return(&rctx->refcount, 1);
> - assert(n >= 0);
> -
> - if (n == 0)
> - free(rctx);
> -}
> -
> static void cleanup_context(struct runner_context **prctx)
> {
> struct runner_context *rctx = *prctx;
> @@ -65,7 +53,7 @@ static void cleanup_context(struct runner_context **prctx)
> "%s: runner %p finished in state '%s'", __func__, rctx,
> runner_state_name(st));
> }
> - release_context(rctx);
> + put_shared_ptr(rctx);
> }
>
> static void *runner_thread(void *arg)
> @@ -147,7 +135,7 @@ repeat:
> void release_runner(struct runner_context *rctx)
> {
> cancel_runner(rctx);
> - release_context(rctx);
> + put_shared_ptr(rctx);
> }
>
> int check_runner(struct runner_context *rctx, void *data, unsigned int size)
> @@ -195,17 +183,17 @@ struct runner_context *get_runner(runner_func func, void *data,
> return NULL;
> }
>
> - rctx = malloc(sizeof(*rctx) + size);
> + rctx = alloc_shared_ptr(sizeof(*rctx) + size, NULL);
> if (!rctx)
> return NULL;
>
> rctx->func = func;
> /*
> - * We have to set the refcount to 2 here. The runner thread may be
> - * cancelled before it even had the chance to increase the refcount,
> - * which could result in a use-after-free in cleanup_context().
> + * Take an additional reference here. The runner thread may be
> + * cancelled before it even had the chance to take a reference, which
> + * could result in a use-after-free in cleanup_context().
> */
> - uatomic_set(&rctx->refcount, 2);
> + get_shared_ptr(rctx);
> uatomic_set(&rctx->status, RUNNER_IDLE);
> memcpy(rctx->data, data, size);
>
> @@ -222,8 +210,8 @@ struct runner_context *get_runner(runner_func func, void *data,
>
> if (rc) {
> condlog(1, "%s: pthread_create(): %s", __func__, strerror(rc));
> - uatomic_dec(&rctx->refcount);
> - release_context(rctx);
> + put_shared_ptr(rctx);
> + put_shared_ptr(rctx);
> return NULL;
> }
> return rctx;
Hmm. While I fully get the idea why one would want to use refcounted
objects, there really is only one snag: this is multipath.
We really, _really_, should make sure the daemon can execute even if
the root filesystem is not available.
So if we start dropping the reference for the checker (eg if the last
path drops and the checker is no longer used), we might end up freeing
the memory (and dropping the modules altogether).
But if the path comes back we need to reinstate the checker, which not
only requires additional memory (which might need a recursion into the
filesystem to get free pages) but we also might need to read the checker
module from disk (again). So plenty of opportunity to deaslock waithing
for the disk be become readable.
Maybe it's an idea to have multipathd loading the modules initially,
keeping a reference to them, and then dropping the reference once
multipathd itself shuts down?
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 09/19] libmpathutil: runner: use shared_ptr
2026-05-27 9:40 ` [PATCH v2 09/19] libmpathutil: runner: use shared_ptr Hannes Reinecke
@ 2026-05-27 10:11 ` Martin Wilck
2026-05-27 13:13 ` Hannes Reinecke
0 siblings, 1 reply; 10+ messages in thread
From: Martin Wilck @ 2026-05-27 10:11 UTC (permalink / raw)
To: Hannes Reinecke, Christophe Varoqui, Benjamin Marzinski,
Brian Bunker, dm-devel
Cc: Xose Vazquez Perez
On Wed, 2026-05-27 at 11:40 +0200, Hannes Reinecke wrote:
> On 5/22/26 18:44, Martin Wilck wrote:
> > Use shared_ptr to track the runner_context refcount.
> >
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> > libmpathutil/runner.c | 30 +++++++++---------------------
> > 1 file changed, 9 insertions(+), 21 deletions(-)
> >
> > diff --git a/libmpathutil/runner.c b/libmpathutil/runner.c
> > index 8c6d6b9..56abd03 100644
> > --- a/libmpathutil/runner.c
> > +++ b/libmpathutil/runner.c
> > @@ -30,7 +30,6 @@ const char *runner_state_name(int state)
> > }
> >
> > struct runner_context {
> > - int refcount;
> > int status;
> > struct timespec deadline;
> > pthread_t thr;
> > @@ -39,17 +38,6 @@ struct runner_context {
> > char __attribute__((aligned(sizeof(void *)))) data[];
> > };
> >
> > -static void release_context(struct runner_context *rctx)
> > -{
> > - int n;
> > -
> > - n = uatomic_sub_return(&rctx->refcount, 1);
> > - assert(n >= 0);
> > -
> > - if (n == 0)
> > - free(rctx);
> > -}
> > -
> > static void cleanup_context(struct runner_context **prctx)
> > {
> > struct runner_context *rctx = *prctx;
> > @@ -65,7 +53,7 @@ static void cleanup_context(struct runner_context
> > **prctx)
> > "%s: runner %p finished in state '%s'",
> > __func__, rctx,
> > runner_state_name(st));
> > }
> > - release_context(rctx);
> > + put_shared_ptr(rctx);
> > }
> >
> > static void *runner_thread(void *arg)
> > @@ -147,7 +135,7 @@ repeat:
> > void release_runner(struct runner_context *rctx)
> > {
> > cancel_runner(rctx);
> > - release_context(rctx);
> > + put_shared_ptr(rctx);
> > }
> >
> > int check_runner(struct runner_context *rctx, void *data,
> > unsigned int size)
> > @@ -195,17 +183,17 @@ struct runner_context *get_runner(runner_func
> > func, void *data,
> > return NULL;
> > }
> >
> > - rctx = malloc(sizeof(*rctx) + size);
> > + rctx = alloc_shared_ptr(sizeof(*rctx) + size, NULL);
> > if (!rctx)
> > return NULL;
> >
> > rctx->func = func;
> > /*
> > - * We have to set the refcount to 2 here. The runner
> > thread may be
> > - * cancelled before it even had the chance to increase the
> > refcount,
> > - * which could result in a use-after-free in
> > cleanup_context().
> > + * Take an additional reference here. The runner thread
> > may be
> > + * cancelled before it even had the chance to take a
> > reference, which
> > + * could result in a use-after-free in cleanup_context().
> > */
> > - uatomic_set(&rctx->refcount, 2);
> > + get_shared_ptr(rctx);
> > uatomic_set(&rctx->status, RUNNER_IDLE);
> > memcpy(rctx->data, data, size);
> >
> > @@ -222,8 +210,8 @@ struct runner_context *get_runner(runner_func
> > func, void *data,
> >
> > if (rc) {
> > condlog(1, "%s: pthread_create(): %s", __func__,
> > strerror(rc));
> > - uatomic_dec(&rctx->refcount);
> > - release_context(rctx);
> > + put_shared_ptr(rctx);
> > + put_shared_ptr(rctx);
> > return NULL;
> > }
> > return rctx;
>
> Hmm. While I fully get the idea why one would want to use refcounted
> objects, there really is only one snag: this is multipath.
> We really, _really_, should make sure the daemon can execute even if
> the root filesystem is not available.
> So if we start dropping the reference for the checker (eg if the last
> path drops and the checker is no longer used), we might end up
> freeing
> the memory (and dropping the modules altogether).
I am pretty sure that we handle this correctly. The current patch set
doesn't change anything in this respect, except translating the hard-
coded refcounting into the shared_ptr framework.
> But if the path comes back we need to reinstate the checker, which
> not
> only requires additional memory (which might need a recursion into
> the
> filesystem to get free pages) but we also might need to read the
> checker
> module from disk (again). So plenty of opportunity to deaslock
> waithing
> for the disk be become readable.
In checker_get(), we call add_checker_class() if the class is not yet
loaded, which will take one ref on the class, and then get_shared_ptr()
for every path (including the one for which add_checker_class() had
been called), which means we have (number of paths + 1) references on
the class, and will only drop the last ref when multipathd calls
cleanup_checkers() during exit.
Therefore I don't think it's possible that we unload the shared library
prematurely, and have to reload it later.
Regards
Martin
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer
2026-05-27 9:28 ` [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer Hannes Reinecke
@ 2026-05-27 10:43 ` Martin Wilck
0 siblings, 0 replies; 10+ messages in thread
From: Martin Wilck @ 2026-05-27 10:43 UTC (permalink / raw)
To: Hannes Reinecke, Christophe Varoqui, Benjamin Marzinski,
Brian Bunker, dm-devel
Cc: Xose Vazquez Perez
Hi Hannes,
On Wed, 2026-05-27 at 11:28 +0200, Hannes Reinecke wrote:
> On 5/22/26 18:44, Martin Wilck wrote:
> > Add a set of simple functions to handle refcounted pointers.
> >
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> > libmpathutil/libmpathutil.version | 7 +++++
> > libmpathutil/util.c | 45
> > ++++++++++++++++++++++++++++---
> > libmpathutil/util.h | 5 ++++
> > 3 files changed, 53 insertions(+), 4 deletions(-)
> >
> That really looks like the Hazard pointers from Paul McK and Mathieu
> Desnoyers. Maybe have a look and see if you cannot re-use the
> approach here.
Thanks for the hint. Hazard pointers are a far more advanced
complicated concept, focussed on cache efficiency and performance.
That's important for the kernel, but it would be over-engineered for
libmultipath, IMO. We just need to make sure that the pointer is not
freed as long as it's still referenced somewhere.
AFAICS, Mathieu's code is targeted at the kernel. We could consider
using some generic user-space hazard pointer implementation for
libmultipath if there is something we could take off the shelf (like,
as a part of liburcu, which we're using anyway). But I don't see a
strong need for it. There's no doubt that libmultipath is inefficient
in many ways, but it isn't CPU-intensive, and thus correctness and
simplicity matter more than efficiency.
>
> > diff --git a/libmpathutil/libmpathutil.version
> > b/libmpathutil/libmpathutil.version
> > index c69120d..51594ad 100644
> > --- a/libmpathutil/libmpathutil.version
> > +++ b/libmpathutil/libmpathutil.version
> > @@ -194,3 +194,10 @@ global:
> > local:
> > *;
> > };
> > +
> > +LIBMPATHUTIL_6.1 {
> > +global:
> > + alloc_shared_ptr;
> > + get_shared_ptr;
> > + put_shared_ptr;
> > +} LIBMPATHUTIL_6.0;
> > diff --git a/libmpathutil/util.c b/libmpathutil/util.c
> > index 23a9797..3c623ec 100644
> > --- a/libmpathutil/util.c
> > +++ b/libmpathutil/util.c
> > @@ -12,15 +12,13 @@
> > #include <dirent.h>
> > #include <unistd.h>
> > #include <errno.h>
> > +#include <urcu/uatomic.h>
> > #include "mt-udev-wrap.h"
> >
> > #include "util.h"
> > #include "debug.h"
> > -#include "checkers.h"
> > #include "vector.h"
> > -#include "structs.h"
> > -#include "config.h"
> > -#include "log.h"
> > +#include "list.h"
> >
> > size_t
> > strchop(char *str)
> > @@ -384,3 +382,42 @@ void cleanup_udev_device(struct udev_device
> > **udd)
> > if (*udd)
> > udev_device_unref(*udd);
> > }
> > +
> > +struct shared_ptr {
> > + long refcnt;
> > + void (*destructor)(void *);
> > + char __attribute__((aligned(sizeof(void *)))) ptr[];
> > +};
> > +
> > +void *alloc_shared_ptr(size_t size, void (*destructor)(void *))
> > +{
> > + struct shared_ptr *sp = malloc(sizeof(*sp) + size);
> > +
> > + if (!sp)
> > + return NULL;
> > + uatomic_set(&sp->refcnt, 1);
> > + sp->destructor = destructor;
> > + return sp->ptr;
> > +}
> > +
> > +void get_shared_ptr(void *ptr)
> > +{
> > + struct shared_ptr *sp = container_of(ptr, struct
> > shared_ptr, ptr);
> > +
> > + if (uatomic_add_return(&sp->refcnt, 1) < 0)
> > + condlog(0, "%s: refcount overflow", __func__);
> > +}
> > +
> > +void put_shared_ptr(void *ptr)
> > +{
> > + struct shared_ptr *sp;
> > +
> > + if (!ptr)
> > + return;
> > + sp = container_of(ptr, struct shared_ptr, ptr);
> > + if (uatomic_sub_return(&sp->refcnt, 1) == 0) {
> > + if (sp->destructor)
> > + sp->destructor(ptr);
> > + free(sp);
> > + }
> > +}
>
> I _think_ the could be implemented better using RCU primitives;
> in the end, what you really care about is whether the pointer is
> valid (9which is what RCU provides).
What we want to achieve is just that (the destructor is called and) the
memory is freed when the last reference is dropped, but no sooner. It's
the same concept that the kernel uses for refcounted objects all over
the place.
> Or indeed use the hazard pointer approach, and count the number
> of RCU references to that object ...
Like I said, I'd like to keep it simple. The basic refcounting
that this patch implements does the job for libmultipath.
I just created this admittedly simple abstraction because the multipath
code was using reference counters in multiple places, and I wanted to
consolidate the code and add a unit test for it. It's not the purpose
of this patch set to optimize for performance. If we want to optimize
it, the API is simple enough that we can substitute the implementation
easily with something more efficient in the future.
>
> > diff --git a/libmpathutil/util.h b/libmpathutil/util.h
> > index aed1bc1..ffbb182 100644
> > --- a/libmpathutil/util.h
> > +++ b/libmpathutil/util.h
> > @@ -160,4 +160,9 @@ void cleanup_charp(char **p);
> > void cleanup_ucharp(unsigned char **p);
> > void cleanup_udev_device(struct udev_device **udd);
> > void cleanup_bitfield(union bitfield **p);
> > +
> > +void *alloc_shared_ptr(size_t size, void (*destructor)(void *));
> > +void get_shared_ptr(void *ptr);
> > +void put_shared_ptr(void *ptr);
> > +
> > #endif /* UTIL_H_INCLUDED */
>
> ... or just leave it as it is if the approach is to complex.
> I doubt it's time-critical.
>
Exactly. I'd like to see what Ben and others think.
Regards,
Martin
> Cheers,
>
> Hannes
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 09/19] libmpathutil: runner: use shared_ptr
2026-05-27 10:11 ` Martin Wilck
@ 2026-05-27 13:13 ` Hannes Reinecke
2026-05-27 14:11 ` Martin Wilck
0 siblings, 1 reply; 10+ messages in thread
From: Hannes Reinecke @ 2026-05-27 13:13 UTC (permalink / raw)
To: Martin Wilck, Christophe Varoqui, Benjamin Marzinski,
Brian Bunker, dm-devel
Cc: Xose Vazquez Perez
On 5/27/26 12:11, Martin Wilck wrote:
> On Wed, 2026-05-27 at 11:40 +0200, Hannes Reinecke wrote:
>> On 5/22/26 18:44, Martin Wilck wrote:
>>> Use shared_ptr to track the runner_context refcount.
>>>
>>> Signed-off-by: Martin Wilck <mwilck@suse.com>
>>> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
>>> ---
>>> libmpathutil/runner.c | 30 +++++++++---------------------
>>> 1 file changed, 9 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/libmpathutil/runner.c b/libmpathutil/runner.c
>>> index 8c6d6b9..56abd03 100644
>>> --- a/libmpathutil/runner.c
>>> +++ b/libmpathutil/runner.c
>>> @@ -30,7 +30,6 @@ const char *runner_state_name(int state)
>>> }
>>>
>>> struct runner_context {
>>> - int refcount;
>>> int status;
>>> struct timespec deadline;
>>> pthread_t thr;
>>> @@ -39,17 +38,6 @@ struct runner_context {
>>> char __attribute__((aligned(sizeof(void *)))) data[];
>>> };
>>>
>>> -static void release_context(struct runner_context *rctx)
>>> -{
>>> - int n;
>>> -
>>> - n = uatomic_sub_return(&rctx->refcount, 1);
>>> - assert(n >= 0);
>>> -
>>> - if (n == 0)
>>> - free(rctx);
>>> -}
>>> -
>>> static void cleanup_context(struct runner_context **prctx)
>>> {
>>> struct runner_context *rctx = *prctx;
>>> @@ -65,7 +53,7 @@ static void cleanup_context(struct runner_context
>>> **prctx)
>>> "%s: runner %p finished in state '%s'",
>>> __func__, rctx,
>>> runner_state_name(st));
>>> }
>>> - release_context(rctx);
>>> + put_shared_ptr(rctx);
>>> }
>>>
>>> static void *runner_thread(void *arg)
>>> @@ -147,7 +135,7 @@ repeat:
>>> void release_runner(struct runner_context *rctx)
>>> {
>>> cancel_runner(rctx);
>>> - release_context(rctx);
>>> + put_shared_ptr(rctx);
>>> }
>>>
>>> int check_runner(struct runner_context *rctx, void *data,
>>> unsigned int size)
>>> @@ -195,17 +183,17 @@ struct runner_context *get_runner(runner_func
>>> func, void *data,
>>> return NULL;
>>> }
>>>
>>> - rctx = malloc(sizeof(*rctx) + size);
>>> + rctx = alloc_shared_ptr(sizeof(*rctx) + size, NULL);
>>> if (!rctx)
>>> return NULL;
>>>
>>> rctx->func = func;
>>> /*
>>> - * We have to set the refcount to 2 here. The runner
>>> thread may be
>>> - * cancelled before it even had the chance to increase the
>>> refcount,
>>> - * which could result in a use-after-free in
>>> cleanup_context().
>>> + * Take an additional reference here. The runner thread
>>> may be
>>> + * cancelled before it even had the chance to take a
>>> reference, which
>>> + * could result in a use-after-free in cleanup_context().
>>> */
>>> - uatomic_set(&rctx->refcount, 2);
>>> + get_shared_ptr(rctx);
>>> uatomic_set(&rctx->status, RUNNER_IDLE);
>>> memcpy(rctx->data, data, size);
>>>
>>> @@ -222,8 +210,8 @@ struct runner_context *get_runner(runner_func
>>> func, void *data,
>>>
>>> if (rc) {
>>> condlog(1, "%s: pthread_create(): %s", __func__,
>>> strerror(rc));
>>> - uatomic_dec(&rctx->refcount);
>>> - release_context(rctx);
>>> + put_shared_ptr(rctx);
>>> + put_shared_ptr(rctx);
>>> return NULL;
>>> }
>>> return rctx;
>>
>> Hmm. While I fully get the idea why one would want to use refcounted
>> objects, there really is only one snag: this is multipath.
>> We really, _really_, should make sure the daemon can execute even if
>> the root filesystem is not available.
>> So if we start dropping the reference for the checker (eg if the last
>> path drops and the checker is no longer used), we might end up
>> freeing
>> the memory (and dropping the modules altogether).
>
> I am pretty sure that we handle this correctly. The current patch set
> doesn't change anything in this respect, except translating the hard-
> coded refcounting into the shared_ptr framework.
>
>> But if the path comes back we need to reinstate the checker, which
>> not
>> only requires additional memory (which might need a recursion into
>> the
>> filesystem to get free pages) but we also might need to read the
>> checker
>> module from disk (again). So plenty of opportunity to deaslock
>> waithing
>> for the disk be become readable.
>
> In checker_get(), we call add_checker_class() if the class is not yet
> loaded, which will take one ref on the class, and then get_shared_ptr()
> for every path (including the one for which add_checker_class() had
> been called), which means we have (number of paths + 1) references on
> the class, and will only drop the last ref when multipathd calls
> cleanup_checkers() during exit.
>
> Therefore I don't think it's possible that we unload the shared library
> prematurely, and have to reload it later.
>
Hmm.
Really hmmmmm.
If the checkers are loaded during startup, and unloaded on shutdown,
there really is no point in refcounting them, is there?
Wouldn't global pointers to each check sufficient here, avoiding
the need for refcounting altogether?
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 09/19] libmpathutil: runner: use shared_ptr
2026-05-27 13:13 ` Hannes Reinecke
@ 2026-05-27 14:11 ` Martin Wilck
2026-05-27 17:30 ` Hannes Reinecke
0 siblings, 1 reply; 10+ messages in thread
From: Martin Wilck @ 2026-05-27 14:11 UTC (permalink / raw)
To: Hannes Reinecke, Christophe Varoqui, Benjamin Marzinski,
Brian Bunker, dm-devel
Cc: Xose Vazquez Perez
On Wed, 2026-05-27 at 15:13 +0200, Hannes Reinecke wrote:
> On 5/27/26 12:11, Martin Wilck wrote:
> > On Wed, 2026-05-27 at 11:40 +0200, Hannes Reinecke wrote:
> >
> >
> > > But if the path comes back we need to reinstate the checker,
> > > which
> > > not
> > > only requires additional memory (which might need a recursion
> > > into
> > > the
> > > filesystem to get free pages) but we also might need to read the
> > > checker
> > > module from disk (again). So plenty of opportunity to deaslock
> > > waithing
> > > for the disk be become readable.
> >
> > In checker_get(), we call add_checker_class() if the class is not
> > yet
> > loaded, which will take one ref on the class, and then
> > get_shared_ptr()
> > for every path (including the one for which add_checker_class() had
> > been called), which means we have (number of paths + 1) references
> > on
> > the class, and will only drop the last ref when multipathd calls
> > cleanup_checkers() during exit.
> >
> > Therefore I don't think it's possible that we unload the shared
> > library
> > prematurely, and have to reload it later.
> >
> Hmm.
>
> Really hmmmmm.
>
> If the checkers are loaded during startup, and unloaded on shutdown,
> there really is no point in refcounting them, is there?
> Wouldn't global pointers to each check sufficient here, avoiding
> the need for refcounting altogether?
Currently checker and prioritizer DSOs aren't loaded during startup.
They are loaded when the first path that uses the given checker is
initialized.
I agree that it would be possible to simplify the code by just loading
all checkers and prioritizers in advance, in which case we wouldn't
need refcounting, like you wrote. Actually, we don't need to use shared
objects in the first place; we could simply include all the checker and
prioritizer code in libmultipath itself.
The shared object architecture dates back to multipath-tools 0.4.9
(2008). We (actually, you ;-) ) added refcounting to fix races in
323090f ("Use refcounting for checkers"). So far it hasn't occured to
me to change this architecture, which has been working well for over a
decade.
Anyway, this discussion is orthogonal to my current patch set.
Regards
Martin
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 09/19] libmpathutil: runner: use shared_ptr
2026-05-27 14:11 ` Martin Wilck
@ 2026-05-27 17:30 ` Hannes Reinecke
0 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2026-05-27 17:30 UTC (permalink / raw)
To: Martin Wilck, Christophe Varoqui, Benjamin Marzinski,
Brian Bunker, dm-devel
Cc: Xose Vazquez Perez
On 5/27/26 16:11, Martin Wilck wrote:
> On Wed, 2026-05-27 at 15:13 +0200, Hannes Reinecke wrote:
>> On 5/27/26 12:11, Martin Wilck wrote:
>>> On Wed, 2026-05-27 at 11:40 +0200, Hannes Reinecke wrote:
>>>
>>>
>>>> But if the path comes back we need to reinstate the checker,
>>>> which
>>>> not
>>>> only requires additional memory (which might need a recursion
>>>> into
>>>> the
>>>> filesystem to get free pages) but we also might need to read the
>>>> checker
>>>> module from disk (again). So plenty of opportunity to deaslock
>>>> waithing
>>>> for the disk be become readable.
>>>
>>> In checker_get(), we call add_checker_class() if the class is not
>>> yet
>>> loaded, which will take one ref on the class, and then
>>> get_shared_ptr()
>>> for every path (including the one for which add_checker_class() had
>>> been called), which means we have (number of paths + 1) references
>>> on
>>> the class, and will only drop the last ref when multipathd calls
>>> cleanup_checkers() during exit.
>>>
>>> Therefore I don't think it's possible that we unload the shared
>>> library
>>> prematurely, and have to reload it later.
>>>
>> Hmm.
>>
>> Really hmmmmm.
>>
>> If the checkers are loaded during startup, and unloaded on shutdown,
>> there really is no point in refcounting them, is there?
>> Wouldn't global pointers to each check sufficient here, avoiding
>> the need for refcounting altogether?
>
> Currently checker and prioritizer DSOs aren't loaded during startup.
> They are loaded when the first path that uses the given checker is
> initialized.
>
> I agree that it would be possible to simplify the code by just loading
> all checkers and prioritizers in advance, in which case we wouldn't
> need refcounting, like you wrote. Actually, we don't need to use shared
> objects in the first place; we could simply include all the checker and
> prioritizer code in libmultipath itself.
>
> The shared object architecture dates back to multipath-tools 0.4.9
> (2008). We (actually, you ;-) ) added refcounting to fix races in
> 323090f ("Use refcounting for checkers"). So far it hasn't occured to
> me to change this architecture, which has been working well for over a
> decade.
>
> Anyway, this discussion is orthogonal to my current patch set.
>
That is true.
So maybe for a later cleanup.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer
[not found] ` <20260522164415.846589-5-mwilck@suse.com>
2026-05-27 9:28 ` [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer Hannes Reinecke
@ 2026-05-27 23:16 ` Benjamin Marzinski
2026-05-28 7:50 ` Martin Wilck
1 sibling, 1 reply; 10+ messages in thread
From: Benjamin Marzinski @ 2026-05-27 23:16 UTC (permalink / raw)
To: Martin Wilck
Cc: Christophe Varoqui, Brian Bunker, dm-devel, Xose Vazquez Perez,
Martin Wilck
On Fri, May 22, 2026 at 06:44:00PM +0200, Martin Wilck wrote:
> Add a set of simple functions to handle refcounted pointers.
>
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
> libmpathutil/libmpathutil.version | 7 +++++
> libmpathutil/util.c | 45 ++++++++++++++++++++++++++++---
> libmpathutil/util.h | 5 ++++
> 3 files changed, 53 insertions(+), 4 deletions(-)
>
> diff --git a/libmpathutil/util.c b/libmpathutil/util.c
> index 23a9797..3c623ec 100644
> --- a/libmpathutil/util.c
> +++ b/libmpathutil/util.c
> @@ -384,3 +382,42 @@ void cleanup_udev_device(struct udev_device **udd)
> if (*udd)
> udev_device_unref(*udd);
> }
> +
> +struct shared_ptr {
> + long refcnt;
> + void (*destructor)(void *);
> + char __attribute__((aligned(sizeof(void *)))) ptr[];
> +};
> +
> +void *alloc_shared_ptr(size_t size, void (*destructor)(void *))
> +{
> + struct shared_ptr *sp = malloc(sizeof(*sp) + size);
> +
> + if (!sp)
> + return NULL;
> + uatomic_set(&sp->refcnt, 1);
> + sp->destructor = destructor;
> + return sp->ptr;
> +}
> +
> +void get_shared_ptr(void *ptr)
> +{
We should probably return here if ptr is NULL, like we do in
put_shared_ptr(). Alternatively, this might be a reasonable place for an
assert(), since nobody should be calling these with NULL pointers.
As far as the general question Hannes raise about this implentantion,
I'm happy with it as-is.
-Ben
> + struct shared_ptr *sp = container_of(ptr, struct shared_ptr, ptr);
> +
> + if (uatomic_add_return(&sp->refcnt, 1) < 0)
> + condlog(0, "%s: refcount overflow", __func__);
> +}
> +
> +void put_shared_ptr(void *ptr)
> +{
> + struct shared_ptr *sp;
> +
> + if (!ptr)
> + return;
> + sp = container_of(ptr, struct shared_ptr, ptr);
> + if (uatomic_sub_return(&sp->refcnt, 1) == 0) {
> + if (sp->destructor)
> + sp->destructor(ptr);
> + free(sp);
> + }
> +}
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer
2026-05-27 23:16 ` Benjamin Marzinski
@ 2026-05-28 7:50 ` Martin Wilck
0 siblings, 0 replies; 10+ messages in thread
From: Martin Wilck @ 2026-05-28 7:50 UTC (permalink / raw)
To: Benjamin Marzinski
Cc: Christophe Varoqui, Brian Bunker, dm-devel, Xose Vazquez Perez
On Wed, 2026-05-27 at 19:16 -0400, Benjamin Marzinski wrote:
> On Fri, May 22, 2026 at 06:44:00PM +0200, Martin Wilck wrote:
> > Add a set of simple functions to handle refcounted pointers.
> >
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> > libmpathutil/libmpathutil.version | 7 +++++
> > libmpathutil/util.c | 45
> > ++++++++++++++++++++++++++++---
> > libmpathutil/util.h | 5 ++++
> > 3 files changed, 53 insertions(+), 4 deletions(-)
> >
> > diff --git a/libmpathutil/util.c b/libmpathutil/util.c
> > index 23a9797..3c623ec 100644
> > --- a/libmpathutil/util.c
> > +++ b/libmpathutil/util.c
> > @@ -384,3 +382,42 @@ void cleanup_udev_device(struct udev_device
> > **udd)
> > if (*udd)
> > udev_device_unref(*udd);
> > }
> > +
> > +struct shared_ptr {
> > + long refcnt;
> > + void (*destructor)(void *);
> > + char __attribute__((aligned(sizeof(void *)))) ptr[];
> > +};
> > +
> > +void *alloc_shared_ptr(size_t size, void (*destructor)(void *))
> > +{
> > + struct shared_ptr *sp = malloc(sizeof(*sp) + size);
> > +
> > + if (!sp)
> > + return NULL;
> > + uatomic_set(&sp->refcnt, 1);
> > + sp->destructor = destructor;
> > + return sp->ptr;
> > +}
> > +
> > +void get_shared_ptr(void *ptr)
> > +{
>
> We should probably return here if ptr is NULL, like we do in
> put_shared_ptr(). Alternatively, this might be a reasonable place for
> an
> assert(), since nobody should be calling these with NULL pointers.
That makes sense. I'll add an assert(), and change put_shared_ptr()
accordingly.
Martin
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 00/19] multipath-tools: asynchronous checker framework
[not found] <20260522164415.846589-1-mwilck@suse.com>
[not found] ` <20260522164415.846589-10-mwilck@suse.com>
[not found] ` <20260522164415.846589-5-mwilck@suse.com>
@ 2026-06-02 5:53 ` Benjamin Marzinski
2 siblings, 0 replies; 10+ messages in thread
From: Benjamin Marzinski @ 2026-06-02 5:53 UTC (permalink / raw)
To: Martin Wilck
Cc: Christophe Varoqui, Brian Bunker, dm-devel, Xose Vazquez Perez,
Martin Wilck
On Fri, May 22, 2026 at 06:43:56PM +0200, Martin Wilck wrote:
> Based on my previous patch set [1], which introduced the "runner" thread
> handling code as an abstract implementation of the asynchronous checker
> logic in the TUR checker, this set introduces a generic "async_checker"
> type and converts all existing checkers except directio to this framework.
> With this set applied, all checkers run asyncrhonously unless explicitly
> forbidden to do so with the "force_sync" option.
>
> Patch 1-3 are minor preparations. Patch 4-8 introduce a general concept of a
> refcounted "shared" pointer and use it to replace multiple instances
> of hard-coded refcounts on objects. Patch 9 converts the current TUR code
> into a generic asynchronous checker type by separating the actual TUR
> code from the handling of the async runners. Patch 10 adds support for
> this async checker type in the checkers code by searching for
> "libcheck_async_func" with dlsym() in checker DSOs. Patch 11-15 convert
> all checkers except emc_clariion to the async_checker framework.
> emc_clariion is special because it needs extra context both for the checker
> itself and for multipath-related information. Patch 16 and 17 add the
> respective additional fields to the async_checker framework. The mpcontext
> handling is rewritten, because the current apporach of handling void **
> pointers would be disastrous in scenarious with truly asynchronous checker
> threads that may hang forever. Patch 18 finally converts emc_clariion,
> using the concepts from patch 16 and 17.
>
> Comments and reviews welcome.
>
> Regards,
> Martin
>
Sorry for the late review. For the set:
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2026-06-02 5:53 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20260522164415.846589-1-mwilck@suse.com>
[not found] ` <20260522164415.846589-10-mwilck@suse.com>
2026-05-27 9:40 ` [PATCH v2 09/19] libmpathutil: runner: use shared_ptr Hannes Reinecke
2026-05-27 10:11 ` Martin Wilck
2026-05-27 13:13 ` Hannes Reinecke
2026-05-27 14:11 ` Martin Wilck
2026-05-27 17:30 ` Hannes Reinecke
[not found] ` <20260522164415.846589-5-mwilck@suse.com>
2026-05-27 9:28 ` [PATCH v2 04/19] libmpathutil: add implementation of generic shared pointer Hannes Reinecke
2026-05-27 10:43 ` Martin Wilck
2026-05-27 23:16 ` Benjamin Marzinski
2026-05-28 7:50 ` Martin Wilck
2026-06-02 5:53 ` [PATCH v2 00/19] multipath-tools: asynchronous checker framework Benjamin Marzinski
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.