dm-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Zdenek Kabelac <zkabelac@redhat.com>
To: Benjamin Marzinski <bmarzins@redhat.com>,
	device-mapper development <dm-devel@redhat.com>
Cc: Christophe Varoqui <christophe.varoqui@gmail.com>
Subject: Re: [PATCH v2 10/18] multipathd: delay reloads during creation
Date: Fri, 8 Apr 2016 10:36:07 +0200	[thread overview]
Message-ID: <57076D77.6020208@redhat.com> (raw)
In-Reply-To: <1460071212-21018-11-git-send-email-bmarzins@redhat.com>

Dne 8.4.2016 v 01:20 Benjamin Marzinski napsal(a):
> lvm needs PV devices to not be suspended while the udev rules are
> running, for them to be correctly identified as PVs. However, multipathd
> will often be in a situation where it will create a multipath device
> upon seeing a path, and then immediately reload the device upon seeing
> another path.  If multipath is reloading a device while processing the
> udev event from its creation, lvm can fail to identify it as a PV. This
> can cause systems to fail to boot. Unfortunately, using udev
> synchronization cookies to solve this issue would cause a host of other
> issues that could only be avoided by a pretty substantial change in how
> multipathd does locking and event processing.


This is somewhat misunderstanding of the core problem it's not about 'lvm2 
needs to be not suspended'. So let me elaborate here with more details.
(And Peter please fix me in case of mistakes ;)


Lvm2 command has lvm.conf settings allowing a user to select if he wants to 
scan devices that are suspended or not - there is already a 'race' - since 
checking device is not suspended and then opening it has lots of room for the 
race, but it's not a major issue in this case.


The reasons to skip reading suspended devices are primarily to avoid holding 
VG lock for potentially a very long time and also avoiding udev with its 
'built-in' 30sec timeout to kills its worker process blocked in blkid scan
with a danger of marking device as GONE and having further consequences like 
an automated umount by systemd....


Thus lvm2/dm udev rules implemented a (racy) check for skipping a read of 
suspended device and this check may also skip the call of pvscan with the 
generic assumption  RESUME goes afterwards with a CHANGE event and device will 
be properly scanned anyway - so there would be no info lost - only gets into 
udev database later.


However now the multipathd *kills* this assumption - since the current udev 
rules implementation for multipath devices targets only for the initial scan 
and all subsequent RESUMES are supposed to be ignored as it's believed the 
device remained in the same state and only some paths have been added/lost. 
Scanning such a device thus shall not change any remembered info in udev 
database. As 'extra' bonus multipath may SUSPEND devices (and that's somehow 
solved by this patch) right after the initial activation of the device so the 
lvm2 check for skipping of suspended devices may have skipped the whole 
discovery operation and since further RESUMES were meant to be ignore, device 
remained invisible forever.

Now we get to the best technical solution for multipath here with other 
surrounding software (i.e. udev) - before multipath starts to mark RESUMES as 
'ignorable' it should check/validate if udevdb already does contain a valid 
information about the device (i.e. it's been scanned...)  and only in this 
case it would mark this device to be ignored.

This may of course mean there will be few more extra initial repeated scans - 
but it's the only 'safest' way to proceed (i.e. you can't resolve the problem 
with cookie waiting on loaded system - since the udev 30sec timeout is 
unpredictable....)

Regards

Zdenek

  reply	other threads:[~2016-04-08  8:36 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-07 23:19 [PATCH v2 00/18] Multipath patch sync Benjamin Marzinski
2016-04-07 23:19 ` [PATCH v2 01/18] multipathd: use /run instead of /var/run Benjamin Marzinski
2016-04-07 23:19 ` [PATCH v2 02/18] retrigger uevents to try and get the uid through udev Benjamin Marzinski
2016-04-07 23:19 ` [PATCH v2 03/18] Fix issues with user_friendly_names initramfs bindings Benjamin Marzinski
2016-04-07 23:19 ` [PATCH v2 04/18] Add libmpathcmd library and use it internally Benjamin Marzinski
2016-04-07 23:19 ` [PATCH v2 05/18] libmultipath: add ignore_new_boot_devs option Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 06/18] libmultipath: fix PAD and PRINT macros Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 07/18] libmultipath: Cut down on alua prioritizer ioctls Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 08/18] multipathd: fail if pidfile can't be created Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 09/18] libmultipath: check correct function for define Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 10/18] multipathd: delay reloads during creation Benjamin Marzinski
2016-04-08  8:36   ` Zdenek Kabelac [this message]
2016-04-08 21:53     ` Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 11/18] multipath: Fix minor text issues Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 12/18] kpartx: verify partition devices Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 13/18] multipath: add exclusive_pref_bit for alua prio Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 14/18] multipathd: print "fail" when remove fails Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 15/18] multipath: check partitions unused before removing Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 16/18] multipathd.service: remove blk-availability Requires Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 17/18] multipathd: use 64-bit int for command key Benjamin Marzinski
2016-04-07 23:20 ` [PATCH v2 18/18] multipath: add wwn keyword to weightedpath prio Benjamin Marzinski
2016-04-18  9:36 ` [PATCH v2 00/18] Multipath patch sync Christophe Varoqui

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57076D77.6020208@redhat.com \
    --to=zkabelac@redhat.com \
    --cc=bmarzins@redhat.com \
    --cc=christophe.varoqui@gmail.com \
    --cc=dm-devel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).