* [RFC] How to fix system stall on root volume multipath
@ 2007-11-09 23:17 Kiyoshi Ueda
From: Kiyoshi Ueda @ 2007-11-09 23:17 UTC
  To: dm-devel

Hi,

If we use multipath for "/", a temporary failure of all paths can stall
the system, because multipathd depends on callout programs that live on "/".
I would like to hear your comments on my idea for fixing it.

For example, the script below stalls the system in the following
environment.
  o "/" on a multipath device
  o setting 'no_path_retry = queue'
  o using priority callout (If your storage doesn't have priority
    callout, using "/bin/echo 1" should be fine for testing.)
-----------------------------------------------------------------
#!/bin/sh

# specify all paths for your root filesystem
paths="sdd sdg"

while true; do
	for dev in $paths; do
		echo offline > /sys/block/${dev}/device/state
	done

	for dev in $paths; do
		echo running > /sys/block/${dev}/device/state
	done
done
-----------------------------------------------------------------
This is because the path checker thread stalls while executing
the priority callout, which has to be loaded from "/" while I/O to "/"
is queued, so the revived paths are never reinstated.


To fix it, I propose building all priority callouts into
multipathd as library functions, as is done for path checkers.
(External priority callouts would remain available as an option.)
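
As an illustration only, the built-in approach could look roughly like
the sketch below. It is not taken from multipath-tools: the names
prio_fn, builtin_prios, prio_const, lookup_builtin_prio,
run_external_prio and get_priority are all hypothetical, and the
external fallback simply uses popen(). The point is that a built-in
prioritizer is an ordinary function call, while the external path
still needs fork/exec and a binary read from disk.

```c
#define _POSIX_C_SOURCE 200809L	/* for popen()/pclose() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical signature: a prioritizer maps a device name to a priority. */
typedef int (*prio_fn)(const char *dev);

/* Built-in stand-in for a "/bin/echo 1" callout: always returns 1. */
static int prio_const(const char *dev)
{
	(void)dev;
	return 1;
}

struct builtin_prio {
	const char *name;
	prio_fn fn;
};

/* Table of prioritizers compiled into the daemon (hypothetical). */
static const struct builtin_prio builtin_prios[] = {
	{ "const", prio_const },
};

/* Return the built-in prioritizer called 'name', or NULL if none. */
static prio_fn lookup_builtin_prio(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(builtin_prios) / sizeof(builtin_prios[0]); i++)
		if (strcmp(builtin_prios[i].name, name) == 0)
			return builtin_prios[i].fn;
	return NULL;
}

/*
 * Optional external callout. This needs fork/exec and file I/O, so it
 * can block while I/O to "/" is queued. Returns -1 on failure.
 */
static int run_external_prio(const char *cmd)
{
	char buf[32];
	int prio = -1;
	FILE *p = popen(cmd, "r");

	if (!p)
		return -1;
	if (fgets(buf, sizeof(buf), p))
		prio = atoi(buf);
	if (pclose(p) != 0)
		prio = -1;
	return prio;
}

/* Prefer a built-in prioritizer; fall back to the external command. */
int get_priority(const char *name_or_cmd, const char *dev)
{
	prio_fn fn;

	assert(name_or_cmd != NULL);
	fn = lookup_builtin_prio(name_or_cmd);
	if (fn)
		return fn(dev);
	return run_external_prio(name_or_cmd);
}
```

With this layout, get_priority("const", "sdd") never leaves the
process, whereas an unknown name is treated as a command line and run
externally, which keeps today's behaviour available as an option.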

The proposal doesn't help if the target device of a failed path is
deleted and then re-added when the path comes back, because getuid
callouts are still used for path addition. However, target device
deletion can be controlled by the "dev_loss_tmo" parameter of the
transport layer.
Also, the source code of the getuid callouts lives outside of
multipath-tools.
So I think building in only the priority callouts is enough for now.


Ideally, multipathd should neither do file I/O nor allocate memory once
it has started.
I think the proposal above is a first step towards that ideal multipathd.
What do you think about it?
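
One common way to approach the "no memory allocation after startup"
goal is to reserve everything up front and pin it in RAM. The sketch
below is only an assumption about how that might be done, not
multipathd code: lock_daemon_memory, pool_get, pool_put and the pool
sizes are hypothetical names and values, and mlockall() is simply the
standard call for locking a process's pages.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#define POOL_SLOTS 16	/* hypothetical: number of preallocated buffers */
#define SLOT_SIZE  256	/* hypothetical: size of each buffer in bytes */

/* Static storage: reserved at load time, no malloc() at runtime. */
static unsigned char pool[POOL_SLOTS][SLOT_SIZE];
static unsigned char in_use[POOL_SLOTS];

/*
 * Try to pin all current and future pages in RAM so the daemon never
 * has to page anything in from "/". Returns 0 on success, -1 on
 * failure (for example when the process lacks the privilege).
 */
int lock_daemon_memory(void)
{
	return mlockall(MCL_CURRENT | MCL_FUTURE);
}

/* Hand out a free preallocated buffer, or NULL if all are in use. */
void *pool_get(void)
{
	size_t i;

	for (i = 0; i < POOL_SLOTS; i++) {
		if (!in_use[i]) {
			in_use[i] = 1;
			return pool[i];
		}
	}
	return NULL;
}

/* Return a buffer obtained from pool_get() to the pool. */
void pool_put(void *p)
{
	unsigned char (*slot)[SLOT_SIZE] = p;

	assert(slot >= pool && slot < pool + POOL_SLOTS);
	in_use[slot - pool] = 0;
}
```

A daemon built this way would call lock_daemon_memory() once during
startup and then use pool_get()/pool_put() in its checker loop, so an
all-paths failure can no longer block it on allocation or page-in.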

Thanks,
Kiyoshi Ueda


Thread overview: 9+ messages
2007-11-09 23:17 [RFC] How to fix system stall on root volume multipath Kiyoshi Ueda
2007-11-10  3:31 ` Alasdair G Kergon
2007-11-12 16:24   ` Kiyoshi Ueda
2007-11-13  1:01     ` Christophe Varoqui
2007-11-13 19:30       ` Kiyoshi Ueda
2007-11-13 22:02       ` Benjamin Marzinski
2007-11-16  8:41       ` Christophe Varoqui
2007-11-19  0:17         ` Christophe Varoqui
2007-12-20 18:10           ` Kiyoshi Ueda
