From: Tore Anderson <tore@linpro.no>
To: Linux SCSI Mailing List <linux-scsi@vger.kernel.org>
Subject: Disabling dev_loss_tmo?
Date: Tue, 13 Nov 2007 10:11:55 +0100
Message-ID: <47396A5B.5040001@linpro.no>

Hi.  Recent kernels will remove the block devices if an FC rport is lost,
which causes a number of problems when dm-multipath is used:

1) Multipathd will receive an event notifying it of the removed rport,
and will respond by removing the path.  This causes a suspend which
flushes outstanding I/O, and in an all-paths-down scenario this will
cause I/O errors to propagate up to the file system layer - even if
queue_if_no_path is in use.  This is fixed in newer versions of
multipath-tools, but old versions are still shipped by the various
server distros.

http://thread.gmane.org/gmane.linux.kernel.device-mapper.devel/4005

2) Multipathd will often keep the device open while it's being removed,
resulting in an error message when attempting to re-register the
recently revived rport:

«object_add failed for H:B:T:L with -EEXIST, don't try to register
things with the same name in the same directory»

The newly added path will therefore not make it back into the
dm-multipath map (and won't be available as a block device either).

http://thread.gmane.org/gmane.linux.kernel.device-mapper.devel/4240/focus=4255

3) Even when the -EEXIST error doesn't show up, udev/multipath/something
seems to get it wrong sometimes. Either the revived path is added to the
wrong (a new) priority group, or it's not added at all.  Most of the
time it works fine, but it can't be relied upon in my experience.
Haven't been able to track this one down, unfortunately.

Anyway.  I believe all of these problems would be possible to avoid if I
could simply make it so that block devices would never be removed due to
rports becoming unavailable.  dm-multipath would fail the path anyway,
and multipathd would just keep on testing its availability and would
reinstate it when/if it came back online.  If it didn't, it would of
course hang around as harmless junk - but fibre channel SANs are usually
quite stable anyway, and the admin will always have the possibility of
removing the block device manually if it bugs him.  In any case it would
be better than the loss of reliability I experience now.

So what I suggest is a way of disabling dev_loss_tmo (or setting it to
unlimited).  Think that's doable for a kernel newbie like me, or are
there any takers?
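
For reference, the timeout in question is exposed per remote port by the
FC transport class in sysfs.  Below is a minimal sketch for listing the
current values before experimenting.  It assumes the usual layout of
/sys/class/fc_remote_ports/rport-H:B-R/dev_loss_tmo; the helper name
read_dev_loss_tmo is made up for illustration, and the script only reads
the values, it does not change them.

```python
from pathlib import Path


def read_dev_loss_tmo(sysfs_root="/sys/class/fc_remote_ports"):
    """Return {rport name: dev_loss_tmo in seconds} for each FC remote port.

    Assumes each remote port appears as <sysfs_root>/rport-*/dev_loss_tmo.
    """
    values = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        # FC transport class not present (or not a Linux host).
        return values
    for entry in sorted(root.iterdir()):
        if not entry.name.startswith("rport-"):
            continue
        attr = entry / "dev_loss_tmo"
        try:
            values[entry.name] = int(attr.read_text().strip())
        except (OSError, ValueError):
            # Attribute missing, unreadable, or not a plain integer: skip it.
            continue
    return values


if __name__ == "__main__":
    for rport, tmo in read_dev_loss_tmo().items():
        print(f"{rport}: dev_loss_tmo={tmo}s")
```

The sysfs root is a parameter so the function can be pointed at a test
directory; on a host without FC remote ports it returns an empty dict.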

Regards
-- 
Tore Anderson
