From: Martin Wilck <mwilck@suse.com>
To: Julian Andres Klode <julian.klode@canonical.com>,
Christophe Varoqui <christophe.varoqui@opensvc.com>,
Device-mapper development mailing list <dm-devel@redhat.com>
Cc: Guan Junxiong <guanjunxiong@huawei.com>
Subject: Re: multipath-tools 0.7.4 failure to remove device
Date: Fri, 12 Jan 2018 21:35:39 +0100
Message-ID: <1515789339.3409.33.camel@suse.com>
In-Reply-To: <20180112083833.apsybil4ux4d2y32@jak-x230>
On Fri, 2018-01-12 at 09:38 +0100, Julian Andres Klode wrote:
>
> and then we get I/O error on the device and it's rendered unusable. It's
> also crashing in uev_pathfail_check() occasionally because find_path_by_devt()
> returns NULL, so I applied the following patch to at least continue, but that's
> obviously wrong - We get an udev event for a device which does not exist in /dev
> (but it should)?
Adding Guan, as the pathfail check is from his code.
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -1090,6 +1090,11 @@ uev_pathfail_check(struct uevent *uev, s
>  	lock(&vecs->lock);
>  	pthread_testcancel();
>  	pp = find_path_by_devt(vecs->pathvec, devt);
> +	if (!pp) {
> +		condlog(3, "%s: Cannot find path by dm path %s", uev->kernel, devt);
> +		FREE(devt);
> +		goto out;
> +	}
>  	r = io_err_stat_handle_pathfail(pp);
>  	lock_cleanup_pop(vecs->lock);
You need to clean up the lock in the error path. I'd prefer checking
for a NULL path argument in io_err_stat_handle_pathfail(). See
attachment.
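
To illustrate the point about the error path, here is a minimal,
self-contained sketch (not the multipathd code). It assumes that
lock_cleanup_pop() boils down to pthread_cleanup_pop(1) and that the
"out:" label sits after it, so a "goto out" between push and pop would
skip the unlock. The names demo_lock, demo_path, demo_handle_pathfail()
and demo_pathfail_check() are hypothetical stand-ins. With the NULL
check in the callee, the caller keeps a single straight-line push/pop
pair and the lock is always released:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct path; only used for this sketch. */
struct demo_path {
	int io_err_disable_reinstate;
};

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Cleanup handler: runs on cancellation and on pthread_cleanup_pop(1). */
static void demo_cleanup_unlock(void *arg)
{
	pthread_mutex_unlock(arg);
}

/* The callee tolerates NULL, so the caller needs no extra exit path. */
static int demo_handle_pathfail(struct demo_path *pp)
{
	if (pp == NULL)
		return 1;
	return 0;
}

static int demo_pathfail_check(struct demo_path *pp)
{
	int r;

	pthread_mutex_lock(&demo_lock);
	pthread_cleanup_push(demo_cleanup_unlock, &demo_lock);
	pthread_testcancel();
	/* No goto/return between push and pop: that would skip the unlock. */
	r = demo_handle_pathfail(pp);
	pthread_cleanup_pop(1);	/* always runs the handler, releasing the lock */
	return r;
}
```

Calling demo_pathfail_check(NULL) returns 1 and leaves demo_lock free
for the next caller. Build with -pthread.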
I'm assuming that you are not using the "marginal path" logic. In
general I don't like the fact that PATH_FAILED events are handled at
all in multipathd if this logic is inactive; that code path is only
needed for this purpose. But that's just a side note.
> Jan 12 09:17:52 autopkgtest kernel: device-mapper: multipath: Failing path 8:16.
> Jan 12 09:17:52 autopkgtest kernel: sd 3:0:0:1: [sdb] Synchronizing SCSI cache
> Jan 12 09:17:52 autopkgtest multipath[6909]: 8:16: cannot find block device
> Jan 12 09:17:52 autopkgtest multipath[6909]: 8:16: Empty device name
> Jan 12 09:17:52 autopkgtest multipath[6909]: 8:16: Empty device name
> Jan 12 09:17:52 autopkgtest multipath[6909]: get_udev_device: failed to look up 8:16 with type 1
> Jan 12 09:17:52 autopkgtest multipath[6909]: dm-0: usable paths found
> Jan 12 09:17:53 autopkgtest iscsid[649]: Connection2:0 to [target: iqn.2016-11.foo.com:target.iscsi, portal: 127.0.0.1,3260] through [iface: default] is shutdown.
>
> We can see that it correctly removed the first device (sda) - except well, it seems to try
> again and fail with the part where it would have crashed. But when it tries to lookup the
> second one it fails.
>
> Given that this works in 0.6.4, I think it's a bug that appeared later on,
> but I can't really pin point the source of it.
Well, it may be because of the locking being broken by your patch.
If you look at the journal you sent, multipathd never prints a single
message after the removal of sda, until it says
Jan 12 09:18:37 autopkgtest multipathd[1980]: exit (signal)
That makes me think it hangs somehow, which could well be explained by
the lock not being released. Please retry with the attached patch.
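
As a rough illustration of why a leaked lock would look like a hang,
here is a small sketch. It is only a simulation: leaky_check() and
lock_is_stuck() are hypothetical names, and the early "return" stands
in for an exit path that bypasses the unlock. Once the lock is leaked,
any later attempt to take it fails (or, with a blocking lock call,
waits forever), which would match a daemon that stops logging:

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t stuck_lock = PTHREAD_MUTEX_INITIALIZER;

/* Mimics the faulty error path: the lock is taken, then the function
 * leaves early without releasing it. */
static int leaky_check(const void *pp)
{
	pthread_mutex_lock(&stuck_lock);
	if (pp == NULL)
		return 1;	/* bug: stuck_lock is still held here */
	pthread_mutex_unlock(&stuck_lock);
	return 0;
}

/* Returns 1 if the mutex is currently held, 0 if it could be taken.
 * trylock is used so that the probe itself never blocks. */
static int lock_is_stuck(void)
{
	int rc = pthread_mutex_trylock(&stuck_lock);

	if (rc == 0) {
		pthread_mutex_unlock(&stuck_lock);
		return 0;
	}
	return rc == EBUSY;
}
```

Before the faulty call lock_is_stuck() reports 0; after
leaky_check(NULL) it reports 1, and a thread using a plain
pthread_mutex_lock() at that point would block indefinitely.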
We are seeing the *multipath* messages ([6909]), which are printed by
multipath during udev rule processing, because the map still holds
references to the deleted path.
Regards,
Martin
--
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
[-- Attachment #2: deal-with-NULL-path-in-pathfail-handler.patch --]
[-- Type: text/x-patch, Size: 849 bytes --]
commit c4d48c633b0825941024a34acf2304a6f5a2d17d (HEAD -> upstream)
Author: Martin Wilck <mwilck@suse.com>
Date:   Fri Jan 12 21:21:49 2018 +0100

    libmultipath: deal with NULL path in pathfail handler

    This avoids a crash for paths which are already deleted.

    Reported-by: Julian Andres Klode <julian.klode@canonical.com>

diff --git a/libmultipath/io_err_stat.c b/libmultipath/io_err_stat.c
index 75a6df67c207..d2d2276a523e 100644
--- a/libmultipath/io_err_stat.c
+++ b/libmultipath/io_err_stat.c
@@ -315,6 +315,10 @@ int io_err_stat_handle_pathfail(struct path *path)
 	struct timespec curr_time;
 	int res;
 
+	if (path == NULL) {
+		io_err_stat_log(1, "%s: called with empty path", __func__);
+		return 1;
+	}
 	if (path->io_err_disable_reinstate) {
 		io_err_stat_log(3, "%s: reinstate is already disabled",
 			path->dev);