public inbox for dm-devel@redhat.com
 help / color / mirror / Atom feed
From: Benjamin Marzinski <bmarzins@redhat.com>
To: Martin Wilck <mwilck@suse.com>
Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>,
	dm-devel@lists.linux.dev
Subject: Re: [PATCH 18/21] libmpathutil: add wrapper code for libudev
Date: Fri, 19 Dec 2025 12:27:27 -0500	[thread overview]
Message-ID: <aUWK_4QoG6JFWIqG@redhat.com> (raw)
In-Reply-To: <af00ff3e05b643a9ed41e23a9f0ba4553d6c2e4b.camel@suse.com>

On Fri, Dec 19, 2025 at 09:06:49AM +0100, Martin Wilck wrote:
> On Thu, 2025-12-18 at 19:38 -0500, Benjamin Marzinski wrote:
> > On Wed, Dec 17, 2025 at 10:21:10PM +0100, Martin Wilck wrote:
> > It sounds like
> > the functions themselves are safe to call from multiple threads, as
> > long
> > as they are working on separate data. If they were manipulating
> > global
> > state that wouldn't be true.
> 
> They aren't working on completely separate data, that's the problem. We
> only have one "struct udev", which provides the context for the entire
> operation of libudev in our code. The udev_device structs we use are
> created from that object and link to it internally. Individual
> udev_device structs are linked together e.g. through parent
> releationships. That's why devices obtained by udev_device_get_parent()
> et al. don't need to be unref()'d; they're referenced by their children
> and supposed to be freed together with the last child.
> 
> As the udev operations are opaque, we can't make any assumptions about
> which of the libudev funtions allocate memory, or modify shared data
> structures otherwise. Some properties are fetched "lazily", when the
> user actually needs them, and then cached in libudev. The 2-layer
> architecture with udev_device being actually a shallow wrapper around
> sd_device etc. doesn't make it easier to assess the code flow inside
> the systemd libraries.
> 
> These issues should be fixed by serializing the libudev accesses with a
> mutex. I agree with you that I don't understand what could still go
> wrong if that's done.
> 
> In theory we could have separate "struct udev" devices per thread, but
> keeping these in sync with each other would be a nightmare. We could
> rather give up running multithreaded altogether.

Yeah. I started looking into this, and quickly realized the issue with
the shared struct udev and gave up. I can easily believe that two
threads running at the same time could corrupt a sharted object, and
since everything is linked to one struct udev, there's no way that we
can claim that there can't be shared objects in any of our calls.

I assume that we've likely been pretty safe because most (maybe all) of
our udev device accesses (which is where I assume things could get
corrupted the easiest) have been protected by the vecs lock. But the
purge code is about to add a bunch of unprotected accesses, so clearly
we need some locking there. 
 
> > 
> > > Add wrapper code that wraps all calls to libudev in a critical
> > > section
> > > holding a specific mutex.
> > > 
> > > [1] https://man7.org/linux/man-pages/man3/libudev.3.html
> > 
> > I assume that there must be times when use call a udev outside of the
> > vecs lock. Did you check how often that is? Not that would be great
> > to
> > pile even more work on the vecs lock.
> 
> I wouldn't want to do that. Even if we could create a comprehensive
> assessment now, it could become incorrect any time just by adding an
> innocent libudev call somewhere. Also, I wouldn't want to overload the
> vecs lock further. This code is ugly, but I am pretty sure that the
> mutex operations will be light-weight. The mutex is always at the
> bottom of our call stack, so no ABBA deadlock is possibe.
> 
> I started out by just wrapping some libudev calls, but I soon realized
> that we can't do that without making assumptions about libudev's inner
> workings, which we shouldn't (see above).
> 
> That said, I agree that it's unclear if this is really worth the
> effort. I did it because I was hoping to fix the memory leak issue #129
> with it. But that attempt was unsuccessful. So we have no evidence that
> this serialization fixes any actual issue. multipathd has been using
> libudev from multi-theaded context for 20 years, and we have never
> encountered a problem that was caused by that (or if we did, we didn't
> realize). At least in theory, the mutex could cause a deadlock or
> noticeable delay if some libudev operation blocks or crashes, so the
> change is not without risk. 
> 
> So, consider this part of the patch set an RFC.
> 
> OTOH, the libudev man page is pretty clear about being not thread-safe.

Especially with the purge code adding otherwise unprotected udev device
accesses, I think this is probably a good idea. I don't recall seeing
libudev calls block. I'm betting the real risk is for systems where the
root filesystem is multipathed, and you lose all paths. But that's
always the scary case, and we do enough udev accesses with the vecs lock
held, that if we can get wedged in one, we're likely already gonna get
locked up, even without these mutexes.

-Ben
 
> > Also, git flagged some whitespace errors.
> 
> Ups, will fix.
> 
> Thanks,
> Martin


  reply	other threads:[~2025-12-19 17:27 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-17 21:20 [PATCH 00/21] Multipath-tools: various bug fixes Martin Wilck
2025-12-17 21:20 ` [PATCH 01/21] libmultipath: drop drop_multipath Martin Wilck
2025-12-17 21:20 ` [PATCH 02/21] libmultipath: don't access path members in free_pgvec() Martin Wilck
2025-12-17 21:20 ` [PATCH 03/21] multipathd: free paths in checker_finished() Martin Wilck
2025-12-18 23:34   ` Benjamin Marzinski
2025-12-18 23:40     ` Martin Wilck
2025-12-19 10:38       ` Martin Wilck
2025-12-17 21:20 ` [PATCH 04/21] libmultipath: don't touch mpvec in remove_map() Martin Wilck
2025-12-17 21:20 ` [PATCH 05/21] libmpathutil: constify find_slot() Martin Wilck
2025-12-17 21:20 ` [PATCH 06/21] libmultipath: don't free paths in orphan_paths() Martin Wilck
2025-12-17 21:20 ` [PATCH 07/21] libmultipath: free orphaned paths in check_removed_paths() Martin Wilck
2025-12-17 21:21 ` [PATCH 08/21] libmultipath: remove free_paths argument from free_pathgroup() Martin Wilck
2025-12-17 21:21 ` [PATCH 09/21] libmultipath: fix numeric value of free_paths in free_multipaths() Martin Wilck
2025-12-19  0:10   ` Benjamin Marzinski
2025-12-17 21:21 ` [PATCH 10/21] libmultipath: remove free_paths argument from free_pgvec() Martin Wilck
2025-12-17 21:21 ` [PATCH 11/21] libmultipath: remove free_paths argument from free_multipathvec() Martin Wilck
2025-12-17 21:21 ` [PATCH 12/21] libmultipath: free_multipath: fix FREE_PATHS case Martin Wilck
2025-12-19  0:15   ` Benjamin Marzinski
2025-12-17 21:21 ` [PATCH 13/21] multipath-tools: Fix ISO C23 errors with strchr() Martin Wilck
2025-12-17 21:21 ` [PATCH 14/21] libmultipath: simplify sysfs_get_target_nodename() Martin Wilck
2025-12-17 21:21 ` [PATCH 15/21] multipathd: join the init_unwinder dummy thread Martin Wilck
2025-12-17 21:21 ` [PATCH 16/21] kpartx: fix some memory leaks Martin Wilck
2025-12-17 21:21 ` [PATCH 17/21] libmpathutil: use union for bitfield Martin Wilck
2025-12-19  0:17   ` Benjamin Marzinski
2025-12-19  8:08     ` Martin Wilck
2025-12-17 21:21 ` [PATCH 18/21] libmpathutil: add wrapper code for libudev Martin Wilck
2025-12-19  0:38   ` Benjamin Marzinski
2025-12-19  8:06     ` Martin Wilck
2025-12-19 17:27       ` Benjamin Marzinski [this message]
2025-12-17 21:21 ` [PATCH 19/21] multipath-tools: use the libudev wrapper functions Martin Wilck
2025-12-17 21:21 ` [PATCH 20/21] Makefile: add functionality to determine cmocka version Martin Wilck
2025-12-17 21:21 ` [PATCH 21/21] multipath-tools tests: adaptations for cmocka 2.0 Martin Wilck
2025-12-19  0:40 ` [PATCH 00/21] Multipath-tools: various bug fixes Benjamin Marzinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aUWK_4QoG6JFWIqG@redhat.com \
    --to=bmarzins@redhat.com \
    --cc=christophe.varoqui@opensvc.com \
    --cc=dm-devel@lists.linux.dev \
    --cc=mwilck@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox