From: Michael Tokarev <mjt@tls.msk.ru>
To: linux-raid@vger.kernel.org
Subject: Re: "Robust Read"
Date: Sun, 20 Mar 2005 01:05:19 +0300 [thread overview]
Message-ID: <423CA21F.9060503@tls.msk.ru> (raw)
In-Reply-To: <19gug2-0bj.ln1@news.it.uc3m.es>
[-- Attachment #1: Type: text/plain, Size: 1801 bytes --]
Peter T. Breuer wrote:
> Michael Tokarev <mjt@tls.msk.ru> wrote:
[]
> But your approach is fine - it's just that (a) I don't like to mess
> with the struct sizes, as that makes modules binary incompatible,
> instead of just having extra functionalities, and (b) I really think
> it's not going to be easily maintainable to subvert read_balance, and
> (c), it fragments the patch more, and (d), I really think you only need
> to count and pray, not keep an exact map.
The point a) is moot, because this whole structure is used in raid1.c ONLY.
(I don't know why it is placed into raid1.h header file instead of into
raid1.c directly, but that's a different topic).
> Maybe if you were to use the existing "remaining" field for the birmap,
> you would get rid of objection (a)! :-). You could even rename it via a
> #define :-).
atomic_t has its own API, it can't be used for &= and |= operations for
example.
>>BTW, what's the reason to use raid1d for retries in the first place?
>>Why not just resubmit the request in reschedule_retry() directly,
>>the same way it is done in raid1d?
>
> No reason that I recall. Isn't raid1d going to do it offline? You want
> to do it sync instead? Anyway, the rerite code is completely
> uninvestigated, so please try anything you like!
I asked about retrying failed read, not about re-writes.
[]
>>>>+ r1_bio->tried_disks = 0;
>>>
>>>Hmm. Ditto. I suppose we are making the read request here. Yes we are.
>>>I had set the "remaining" count instead.
Note in your version you have a chance to request a read from
first disk again, in case you have some failed/missing disks.
So i think that instead of usage of "remaining" field, it's better
to remember the disk from which we did the first read, and loop
over all disks up to the first one.
/mjt
[-- Attachment #2: raid1_robust_read-2.6.11.diff --]
[-- Type: text/plain, Size: 2809 bytes --]
--- ./include/linux/raid/raid1.h.orig Wed Mar 2 10:38:10 2005
+++ ./include/linux/raid/raid1.h Sun Mar 20 00:50:59 2005
@@ -83,6 +83,7 @@ struct r1bio_s {
* if the IO is in READ direction, then this is where we read
*/
int read_disk;
+ unsigned long first_disk; /* disk we started read from */
struct list_head retry_list;
/*
--- ./drivers/md/raid1.c.orig Wed Mar 2 10:38:10 2005
+++ ./drivers/md/raid1.c Sun Mar 20 00:51:57 2005
@@ -231,12 +231,13 @@ static int raid1_end_read_request(struct
return 1;
mirror = r1_bio->read_disk;
+
+ update_head_pos(mirror, r1_bio);
+
/*
- * this branch is our 'one mirror IO has finished' event handler:
+ * we have only one bio on the read side
*/
- if (!uptodate)
- md_error(r1_bio->mddev, conf->mirrors[mirror].rdev);
- else
+ if (uptodate) {
/*
* Set R1BIO_Uptodate in our master bio, so that
* we will return a good error code for to the higher
@@ -247,14 +248,8 @@ static int raid1_end_read_request(struct
* wait for the 'master' bio.
*/
set_bit(R1BIO_Uptodate, &r1_bio->state);
-
- update_head_pos(mirror, r1_bio);
-
- /*
- * we have only one bio on the read side
- */
- if (uptodate)
raid_end_bio_io(r1_bio);
+ }
else {
/*
* oops, read error:
@@ -421,6 +416,42 @@ rb_out:
return new_disk;
}
+/*
+ * This routine returns the disk from which the requested read should
+ * be done in case previous read failed. We don't try to optimize
+ * read positions here as in read_balance().
+ *
+ * The rdev for the device selected will have nr_pending incremented.
+ */
+static int read_remap(conf_t *conf, r1bio_t *r1_bio)
+{
+ int disk;
+ int first_disk;
+
+ rcu_read_lock();
+
+ /* Choose the next operation device */
+ first_disk = r1_bio->first_disk;
+ disk = r1_bio->read_disk;
+
+ do {
+ ++disk;
+ if (disk >= conf->raid_disks)
+ disk = 0;
+ if (disk == first_disk) {
+ rcu_read_unlock();
+ return -1;
+ }
+ } while (!conf->mirrors[disk].rdev ||
+ !conf->mirrors[disk].rdev->in_sync);
+
+ /* don't bother updating last position */
+ atomic_inc(&conf->mirrors[disk].rdev->nr_pending);
+ rcu_read_unlock();
+
+ return disk;
+}
+
static void unplug_slaves(mddev_t *mddev)
{
conf_t *conf = mddev_to_conf(mddev);
@@ -560,6 +591,7 @@ static int make_request(request_queue_t
mirror = conf->mirrors + rdisk;
r1_bio->read_disk = rdisk;
+ r1_bio->first_disk = rdisk;
read_bio = bio_clone(bio, GFP_NOIO);
@@ -934,7 +966,7 @@ static void raid1d(mddev_t *mddev)
} else {
int disk;
bio = r1_bio->bios[r1_bio->read_disk];
- if ((disk=read_balance(conf, r1_bio)) == -1) {
+ if ((disk=read_remap(conf, r1_bio)) < 0) {
printk(KERN_ALERT "raid1: %s: unrecoverable I/O"
" read error for block %llu\n",
bdevname(bio->bi_bdev,b),
next prev parent reply other threads:[~2005-03-19 22:05 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-03-09 22:18 [PATCH 1/2] md bitmap bug fixes Paul Clements
2005-03-09 22:19 ` [PATCH 2/2] " Paul Clements
2005-03-14 4:43 ` [PATCH 1/2] " Neil Brown
2005-03-14 9:44 ` Lars Marowsky-Bree
2005-03-14 10:22 ` Neil Brown
2005-03-14 11:24 ` Lars Marowsky-Bree
2005-03-14 22:54 ` Neil Brown
2005-03-18 10:33 ` Lars Marowsky-Bree
2005-03-18 12:52 ` Peter T. Breuer
2005-03-18 13:42 ` Lars Marowsky-Bree
2005-03-18 14:50 ` Peter T. Breuer
2005-03-18 17:03 ` Paul Clements
2005-03-18 18:43 ` Peter T. Breuer
2005-03-18 19:01 ` Mario Holbe
2005-03-18 19:33 ` Peter T. Breuer
2005-03-18 20:24 ` Mario Holbe
2005-03-18 21:01 ` Andy Smith
2005-03-19 11:43 ` Peter T. Breuer
2005-03-19 12:58 ` Lars Marowsky-Bree
2005-03-19 13:27 ` Peter T. Breuer
2005-03-19 14:07 ` Lars Marowsky-Bree
2005-03-19 15:06 ` Peter T. Breuer
2005-03-19 15:24 ` Mario Holbe
2005-03-19 15:58 ` Peter T. Breuer
2005-03-19 16:24 ` Lars Marowsky-Bree
2005-03-19 17:19 ` Peter T. Breuer
2005-03-19 17:36 ` Lars Marowsky-Bree
2005-03-19 17:44 ` Guy
2005-03-19 17:54 ` Lars Marowsky-Bree
2005-03-19 18:05 ` Guy
2005-03-19 20:29 ` berk walker
2005-03-19 18:11 ` Peter T. Breuer
2005-03-18 19:43 ` Paul Clements
2005-03-19 12:10 ` Peter T. Breuer
2005-03-21 16:07 ` Paul Clements
2005-03-21 18:56 ` Luca Berra
2005-03-21 19:58 ` Paul Clements
2005-03-21 20:45 ` Peter T. Breuer
2005-03-21 21:09 ` Gil
2005-03-21 21:19 ` Paul Clements
2005-03-21 22:15 ` Peter T. Breuer
2005-03-22 22:35 ` Peter T. Breuer
2005-03-21 21:32 ` Guy
2005-03-22 9:35 ` Luca Berra
2005-03-22 10:02 ` Peter T. Breuer
2005-03-23 20:31 ` Luca Berra
2005-03-25 18:51 ` Peter T. Breuer
2005-03-25 20:54 ` berk walker
2005-03-25 20:56 ` berk walker
2005-03-18 17:16 ` Luca Berra
2005-03-18 17:57 ` Lars Marowsky-Bree
2005-03-18 21:46 ` Michael Tokarev
2005-03-19 9:05 ` Lars Marowsky-Bree
2005-03-19 12:16 ` Peter T. Breuer
2005-03-19 12:34 ` Michael Tokarev
2005-03-19 12:53 ` Peter T. Breuer
2005-03-19 16:08 ` "Robust Read" (was: [PATCH 1/2] md bitmap bug fixes) Michael Tokarev
2005-03-19 17:03 ` "Robust Read" Peter T. Breuer
2005-03-19 20:20 ` Michael Tokarev
2005-03-19 20:56 ` Peter T. Breuer
2005-03-19 22:05 ` Michael Tokarev [this message]
2005-03-19 22:30 ` Peter T. Breuer
2005-03-15 4:24 ` [PATCH 1/2] md bitmap bug fixes Paul Clements
2005-03-17 20:51 ` [PATCH 0/3] md bitmap-based asynchronous writes Paul Clements
2005-03-17 20:53 ` [PATCH 1/3] md bitmap async write enabling Paul Clements
2005-03-17 20:55 ` [PATCH 2/3] md bitmap async writes for raid1 Paul Clements
2005-03-17 20:56 ` [PATCH 3/3] mdadm: bitmap async writes Paul Clements
2005-03-21 4:21 ` [PATCH 0/3] md bitmap-based asynchronous writes Neil Brown
2005-03-21 16:31 ` Paul Clements
2005-03-21 22:09 ` Neil Brown
2005-03-22 8:35 ` Peter T. Breuer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=423CA21F.9060503@tls.msk.ru \
--to=mjt@tls.msk.ru \
--cc=linux-raid@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).