All of lore.kernel.org
 help / color / mirror / Atom feed
From: Benjamin Marzinski <bmarzins@redhat.com>
To: Martin Wilck <mwilck@suse.com>
Cc: Christophe Varoqui <christophe.varoqui@opensvc.com>,
	dm-devel@lists.linux.dev
Subject: Re: [PATCH v3 15/15] multipathd: enable pathgroups in checker_finished()
Date: Tue, 21 Jan 2025 16:16:03 -0500	[thread overview]
Message-ID: <Z5AOk0sjCtfxDggL@redhat.com> (raw)
In-Reply-To: <bd30c82ae259c15cfcb9b26985c3a6c139986822.camel@suse.com>

On Tue, Jan 21, 2025 at 12:12:50PM +0100, Martin Wilck wrote:
> On Mon, 2025-01-20 at 18:51 -0500, Benjamin Marzinski wrote:
> > On Fri, Jan 17, 2025 at 09:27:38PM +0100, Martin Wilck wrote:
> 
> > Alternatively, we could make the check_finished() logic match the
> > update_path_state() logic exactly by just setting a flag in the
> > path's
> > pathgroup during enable_group() (and possibly not call
> > dm_enablegroup()
> > at all, 
> 
> That was my first thought, but it doesn't work, because we call  
> free_pgvec(mpp->pg, KEEP_PATHS) update_multipath_strings(), and
> reallocate the vector in disassemble_map. I guess we could change that,
> but it would be another intrusive patch set. Currently we just can't
> store any persistent properties in struct pathgroup.

Oh, yeah. nuts.

> > but I suppose that there could be a benefit to re-enabling the
> > group as soon as possible. I'm still kinda fuzzy on whether the
> > kernel's
> > own pathgroup re-enabling code makes all this redundant).
> 
> The kernel will even try "bypassed" PGs if it finds no other. But that
> means that e.g. the marginal pathgroup would be tried before disabled
> PGs [1]. In general, I don't think we can rely on the kernel in this
> respect.
>
> >  Then, in
> > enable_pathrgoups(), instead of checking each path, we just need to
> > check if pgp->need_reenable is set for the pathgroup().
> 
> As I said, as long as we discard the PGs, I guess we'll have to test
> the path flags. It's difficult to come up with a decent test case for
> this, in particular if checking for PATH_UP is not enough...
> 
> 
> > [...]
> > 
> > But perhaps the best answer is to just to say that this is a corner
> > case
> > where we skip a multipathd action that might be totally unnecessary.
> > And
> > even if it is a problem, it will be fixed if multipathd ever decides
> > that the kernel is using the wrong pathgroup and should switch, or
> > whenever the table gets reloaded. Maybe the whole patch is
> > unnecessary.
> 
> I'm going to submit a v4 that adds a test for the checker flags.

Seems reasonable.

> 
> Btw, I could only find one place in the kernel code where the kernel
> actively sets the "bypassed" flag, when the SCSI device handler returns
> SCSL_DH_DEV_TEMP_BUSY [2]. This seems rather questionable to me. First
> of all, this seems to assume that PGs are mapped to "controllers", i.e.
> something like "group_by_tpg", which isn't necessarily the case.
> Second, while the EMC device handler uses this error code in this way,
> in the ALUA handler it rather means a memory allocation failure, and
> the other device handlers don't support it at all.

Yeah, I really wish I knew how this was supposed to work. For the ALUA
devices, it's likely fine to clear bypassed when the device is in
PATH_UP. Either the hardware controller just ran into a temporary error,
and we should go back to using this pathgroup based on its priority, or
there is a persistent memory issue, it which case it likey doesn't
matter what we do, since the system has larger problems than using a
non-optimal pathgroup.

It's the EMC devices that seem like there should be a real right and
wrong way to do things. For instance, even though the code doesn't
currently clear bypassed for PATH_GHOST devices, that might be the right
thing to do.  Imagine if you tried to switch to a new pathgroup, but the
controller had some issue and so you bypassed it, and that issue was
later resolved.  In that case, you want to clear the bypassed flag. But
since you are using some other controller, the paths in the bypassed
pathgroup could well be in PATH_GHOST. This means that you are less
likely to pick that pathgroup in the future, so the bypass flag won't
get cleared. And multipathd won't do anything to fix this. So the
correct answer might be to just check for
pp->is_checked == CHECK_PATH_NEW_UP. But maybe not. I dunno.

> I think we can conclude that the kernel setting "bypassed" on a path
> group by itself is indeed a corner case, except for EMC Clariion, which
> is quite dated by now.
> 
> > Thoughts? I'm clearly thinking too much.
> > 
> 
> No, you are not. I strongly appreciate these comments.
> 
> Regards,
> Martin
> 
> [1] Unrelated wild thought: in view of the logic the kernel applies for
> "bypassed" PGs, it might actually make sense to use this flag for 
> marginal pathgroups. They would be tried if all else fails, which is
> more or less the behavior we want for them. What do you think?

Maybe, but I kind of doubt it. It will get cleared by the kernel if we
ever call switch_pathgroup(), but we could reset it. AFAICS the only
real "benefit" would be that regular bypassed pathgroups would get used
instead of marginal pathgroups. But if we're doing the right thing with
clearing the bypassed flag, then when a pathgroup is bypassed, it likely
can't be initialized (at least for the EMC handler). It might make more
sense to use a marginal pathgroup that can be initialized instead of a
regular pathgroup that quite possibly can't at all.

-Ben

> [2] https://elixir.bootlin.com/linux/v6.12.6/source/drivers/md/dm-mpath.c#L1564


      reply	other threads:[~2025-01-21 21:16 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-17 20:27 [PATCH v3 00/15] multipathd: More map reload handling, and checkerloop work Martin Wilck
2025-01-17 20:27 ` [PATCH v3 01/15] multipathd: don't reload map in update_mpp_prio() Martin Wilck
2025-01-17 20:27 ` [PATCH v3 02/15] multipathd: remove dm_get_info() call from refresh_multipath() Martin Wilck
2025-01-17 20:27 ` [PATCH v3 03/15] multipathd: sync maps at end of checkerloop Martin Wilck
2025-01-22 18:15   ` Benjamin Marzinski
2025-01-17 20:27 ` [PATCH v3 04/15] multipathd: emit a warning if a map remains inconsistent after reload Martin Wilck
2025-01-22 18:15   ` Benjamin Marzinski
2025-01-17 20:27 ` [PATCH v3 05/15] multipathd: move yielding for waiters to start of checkerloop Martin Wilck
2025-01-17 20:27 ` [PATCH v3 06/15] multipathd: add checker_finished() Martin Wilck
2025-01-17 20:27 ` [PATCH v3 07/15] multipathd: move "tick" calls into checker_finished() Martin Wilck
2025-01-17 20:27 ` [PATCH v3 08/15] multipathd: don't call reload_and_sync_map() from deferred_failback_tick() Martin Wilck
2025-01-22 18:16   ` Benjamin Marzinski
2025-01-17 20:27 ` [PATCH v3 09/15] multipathd: move retry_count_tick() into existing mpvec loop Martin Wilck
2025-01-17 20:27 ` [PATCH v3 10/15] multipathd: don't call update_map() from missing_uev_wait_tick() Martin Wilck
2025-01-17 20:27 ` [PATCH v3 11/15] multipathd: don't call udpate_map() from ghost_delay_tick() Martin Wilck
2025-01-17 20:27 ` [PATCH v3 12/15] multipathd: only call reload_and_sync_map() when ghost delay expires Martin Wilck
2025-01-17 20:27 ` [PATCH v3 13/15] multipathd: remove non-existent maps in checkerloop Martin Wilck
2025-01-17 20:27 ` [PATCH v3 14/15] multipathd: remove mpvec_garbage_collector() Martin Wilck
2025-01-17 20:27 ` [PATCH v3 15/15] multipathd: enable pathgroups in checker_finished() Martin Wilck
2025-01-20 23:51   ` Benjamin Marzinski
2025-01-21 11:12     ` Martin Wilck
2025-01-21 21:16       ` Benjamin Marzinski [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z5AOk0sjCtfxDggL@redhat.com \
    --to=bmarzins@redhat.com \
    --cc=christophe.varoqui@opensvc.com \
    --cc=dm-devel@lists.linux.dev \
    --cc=mwilck@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.