All of lore.kernel.org
 help / color / mirror / Atom feed
From: Martin Wilck <mwilck@suse.com>
To: Guan Junxiong <guanjunxiong@huawei.com>,
	dm-devel@redhat.com, christophe.varoqui@opensvc.com
Cc: chengjike.cheng@huawei.com, niuhaoxin@huawei.com, shenhong09@huawei.com
Subject: Re: [PATCH] multipath-tools: intermittent IO error accounting to improve reliability
Date: Mon, 04 Sep 2017 21:36:42 +0200	[thread overview]
Message-ID: <1504553802.4605.10.camel@suse.com> (raw)
In-Reply-To: <a82c0dc5-0aae-7c22-400d-9efc63f6bbaa@huawei.com>

On Tue, 2017-08-29 at 11:18 +0800, Guan Junxiong wrote:

> Three parameters are added for the admin:
> > > "path_io_err_sample_time",
> > > "path_io_err_num_threshold" and "path_io_err_recovery_time".
> > > If path_io_err_sample_time and path_io_err_recovery_time are set
> > > to a
> > > value greater than 0, when a path fail event occurs due to an IO
> > > error,
> > > multipathd will enqueue this path into a queue of which each
> > > member
> > > is
> > > sent direct reading asynchronous io at a fixed sample rate of
> > > 100HZ. 
> > 
> > 1. I don't think it's prudent to start the flakiness check for
> > every
> > PATH_DOWN event. Rather, the test should only be started for paths
> > that
> > have failed repeatedly in a certain time frame, so that we have
> > reason
> > to assume they're flaky.
> > 
> 
> Sounds reasonable. And IMO, at least two new options of
> multipath.conf
> should be introduced: time frame and threshold if we let the admin
> to tune this. OTOH, if we don't want to bother the admin, we handle
> the pre-flacky check automatically.  Maybe 60 second of time frame
> and 2 PATH_UP->PATH_DOWNH events of thredshold is reasonable.
> Which solution do you prefer?

I wonder if the exisiting san_path_err_threshold etx. could be reused
for that. As I said before, I can't imagine that they could be used
reasonably in parallel with your approach, anyway. I can't help youwith the default values, you have done the testing. 

> > 2. How did you come up with the 100Hz sample rate value? It sounds
> > quite high in the first place (that was my first thought when I
> > read
> > it), and OTOH those processes that run into intermittent IO errors
> > are
> > probably the ones that do I/O almost continuously, so maybe reading
> > continuously for a limited time is the better idea?
> > 
> 
> I am afraid of whether reading for a limited time will affect the
> usable
> bandwidth of the true up-layer transaction. If not, reading
> continuously
> is the nearest to the real case.
> How about 10Hz sample rate and 10ms continuous reading?  In other
> words,
> every 100ms interval, we read continuously for 10ms.
> 
> 
> > 3. I would suggest setting the dm status of paths undergoing the
> > flakiness test to "failed" immediately. That way the IO caused by
> > the
> > test will not interfere with the regular IO on the path. 
> > 
> 
> Great.
> 
> > 4. I can see that you chose aio to avoid having to create a
> > separate
> > thread for each path being checked. But I'm wondering whether using
> > aio
> > for this is a good choice in general. My concern with aio is that
> > to my
> > knowledge there's no good low-level cancellation mechanism. When
> > you
> > io_cancel() one of your requests, you can be sure not to get
> > notified
> > of its completion any more, but that doesn't mean that it wouldn't
> > continue to lurk down in the block layer. But I may be overlooking
> > essential stuff here.
> > 
> > > Thethr
> > > IO accounting process for a path will last for
> > > path_io_err_sample_time.
> > > If the number of IO error on a particular path is greater than
> > > the
> > > path_io_err_num_threshold, then the path will not reinstate for
> > 
> > It would be more user-friendly to allow the user to specify the
> > error
> > rate threshold as a percentage
> 
> Error rate threshold as a percentage is not enough to distinguish
> small
> error rate. How about a permillage rate (1/1000)?
> 
> 
> > > +struct io_err_stat_path *find_err_path_by_dev(vector pathvec,
> > > char
> > > *dev)
> > 
> > You are re-implementing generic functionality here
> > (find_path_by_dev),
> > why?
> > 
> 
> Yes, because the structure is different: struct  io_err_stat_path and
> struct path.

Why don't you just add a struct io_err_stat_path* in "struct path"?

Regards,
Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

  reply	other threads:[~2017-09-04 19:36 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-22 10:07 [PATCH] multipath-tools: intermittent IO error accounting to improve reliability Guan Junxiong
2017-08-22 15:37 ` Hannes Reinecke
2017-08-24  9:59   ` Guan Junxiong
2017-08-28 11:13     ` Martin Wilck
2017-08-29  1:16       ` Guan Junxiong
2017-08-28 13:28 ` Martin Wilck
2017-08-29  3:18   ` Guan Junxiong
2017-09-04 19:36     ` Martin Wilck [this message]
2017-09-05  1:36       ` Guan Junxiong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1504553802.4605.10.camel@suse.com \
    --to=mwilck@suse.com \
    --cc=chengjike.cheng@huawei.com \
    --cc=christophe.varoqui@opensvc.com \
    --cc=dm-devel@redhat.com \
    --cc=guanjunxiong@huawei.com \
    --cc=niuhaoxin@huawei.com \
    --cc=shenhong09@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.