From: Phillip Susi <psusi@cfl.rr.com>
To: Michael Tokarev <mjt@tls.msk.ru>
Cc: sander@humilis.net, NeilBrown <neilb@suse.de>,
linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
"Steinar H. Gunderson" <sgunderson@bigfoot.com>
Subject: Re: [PATCH 000 of 5] md: Introduction
Date: Tue, 17 Jan 2006 17:38:50 -0500
Message-ID: <43CD71FA.4090908@cfl.rr.com>
In-Reply-To: <43CCD453.9070900@tls.msk.ru>
Michael Tokarev wrote:
<snip>
> Compare this with my statement about "offline" "reshaper" above:
> separate userspace (easier to write/debug compared with kernel
> space) program which operates on an inactive array (no locking
> needed, no need to worry about other I/O operations going to the
> array at the time of reshaping etc), with an ability to plan its
> I/O strategy in a lot more efficient and safer way... Yes this
> approach has one downside: the array has to be inactive. But in
> my opinion it's worth it, compared to more possibilities to lose
> your data, even if you do NOT use that feature at all...
>
>
I also like the idea of this kind of thing going in user space. I was
also under the impression that md was going to be phased out and
replaced by the device mapper. I've been kicking around the idea of a
user space utility that manipulates the device mapper tables and
performs the block moves itself to reshape a raid array. It doesn't seem
like it would be that difficult, and it would not require modifying the
kernel at all. The basic idea is something like this:
/dev/mapper/raid is your raid array, which is mapped to a stripe across
/dev/sda and /dev/sdb. When you want to expand the stripe to add
/dev/sdc to the array, you create three new devices:

/dev/mapper/raid-old: a copy of the old mapper table, striping sda and
    sdb
/dev/mapper/raid-progress: a linear map whose size is the new stripe
    width, pointing to raid-new
/dev/mapper/raid-new: what the raid will look like when done, i.e. a
    stripe of sda, sdb, and sdc
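
To make the three tables concrete, here is a minimal sketch that only
builds the table strings one would hand to dmsetup, using the standard
"striped" and "linear" target line formats. The member size, the chunk
size and the helper names are assumptions made up for illustration;
nothing here touches a real device.

```python
def striped_table(devices, dev_sectors, chunk_sectors):
    """Build a one-line dm table that stripes across `devices`.

    Line format: <start> <length> striped <#stripes> <chunk> <dev> <offset> ...
    All sizes are in 512-byte sectors.
    """
    n = len(devices)
    # Use only whole chunks of each member so the total is stripe-aligned.
    usable = (dev_sectors // chunk_sectors) * chunk_sectors
    members = " ".join(f"{dev} 0" for dev in devices)
    return f"0 {usable * n} striped {n} {chunk_sectors} {members}"


def linear_table(start, length, target_dev, offset):
    """Build a one-line dm table mapping a range linearly onto target_dev."""
    return f"{start} {length} linear {target_dev} {offset}"


# Hypothetical geometry: 1 GiB members, 64 KiB chunks.
DEV_SECTORS = 2097152
CHUNK = 128

raid_old = striped_table(["/dev/sda", "/dev/sdb"], DEV_SECTORS, CHUNK)
raid_new = striped_table(["/dev/sda", "/dev/sdb", "/dev/sdc"], DEV_SECTORS, CHUNK)
# raid-progress spans one stripe of the new layout (three chunks).
raid_progress = linear_table(0, 3 * CHUNK, "/dev/mapper/raid-new", 0)

for name, table in (("raid-old", raid_old),
                    ("raid-progress", raid_progress),
                    ("raid-new", raid_new)):
    print(f"{name}: {table}")
```

Each string could then be loaded with something like "dmsetup create
raid-old --table ...", though the exact invocation is left out here on
purpose.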

Then you replace /dev/mapper/raid with a linear map to raid-new,
raid-progress, and raid-old, in that order. Initially the chunks taken
from raid-new and raid-progress both have zero length, so all accesses
still go to raid-old. For each stripe in the array, you change
raid-progress to point to the corresponding blocks in raid-new, but
leave it suspended, so IO to this stripe will block. Then you update the
raid map so that raid-progress overlays the stripe you are working on
and catches its IO instead of letting it go to raid-old. After you read
that stripe from raid-old and write it to raid-new, you resume
raid-progress to flush any blocked writes to the raid-new stripe.
Finally, you update raid so the stripe that was in progress now maps to
raid-new.

Repeat for each stripe in the array, then replace the raid table with
raid-new's table and delete the three temporary devices.
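
The per-stripe sequence can be written down as a dry-run planner. This
is only a sketch of the ordering described above: it yields the steps
and the table lines for /dev/mapper/raid at each point, and performs no
IO. The function names and step labels are hypothetical, and leaving out
zero-length segments is an assumption (the initial state is simply a
single segment pointing at raid-old).

```python
def raid_table(done, in_progress, total):
    """Table lines for /dev/mapper/raid at one point in the reshape.

    done        -- sectors already copied, served by raid-new
    in_progress -- length of the stripe being copied, served by raid-progress
    total       -- size of the old array in sectors
    """
    rest = done + in_progress
    segments = [
        (0, done, "/dev/mapper/raid-new", 0),
        (done, in_progress, "/dev/mapper/raid-progress", 0),
        (rest, total - rest, "/dev/mapper/raid-old", rest),
    ]
    # Assumption: zero-length segments are omitted rather than loaded.
    return [f"{start} {length} linear {dev} {off}"
            for start, length, dev, off in segments if length > 0]


def reshape_plan(total, stripe):
    """Yield (action, detail) pairs in the order they would be carried out."""
    done = 0
    while done < total:
        width = min(stripe, total - done)
        # 1. Point raid-progress at the target blocks, leaving it suspended.
        yield ("suspend and reload raid-progress",
               f"0 {width} linear /dev/mapper/raid-new {done}")
        # 2. Overlay the stripe so new IO to it blocks in raid-progress.
        yield ("reload raid", raid_table(done, width, total))
        # 3. Read the stripe from raid-old and write it to raid-new.
        yield ("copy raid-old to raid-new", (done, width))
        # 4. Let any blocked writes through to the raid-new stripe.
        yield ("resume raid-progress", None)
        done += width
        # 5. The finished stripe is now served by raid-new directly.
        yield ("reload raid", raid_table(done, 0, total))
    yield ("replace raid with raid-new's table, remove temporary devices", None)


if __name__ == "__main__":
    for action, detail in reshape_plan(total=1000, stripe=384):
        print(action, detail)
```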
Adding transaction logging to the user mode utility wouldn't be very
hard either.
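
As a rough idea of what such logging might look like, here is a sketch
of an append-only journal with one "begin" and one "commit" record per
stripe, synced to disk before the tool moves on. The record layout and
function names are invented for illustration. It also assumes that
re-copying an interrupted stripe from raid-old is safe, which holds only
if writing a stripe never overwrites blocks of raid-old that have not
been copied yet; that would need to be checked for a given layout.

```python
import json
import os


def log_record(path, record):
    """Append one JSON record and force it to stable storage."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())


def resume_point(path):
    """Return (sectors fully committed, pending (start, length) or None)."""
    committed, pending = 0, None
    try:
        with open(path, encoding="utf-8") as f:
            for line in f:
                try:
                    rec = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn final record: stop reading here
                if rec["op"] == "begin":
                    pending = (rec["start"], rec["length"])
                elif rec["op"] == "commit":
                    committed = rec["start"] + rec["length"]
                    pending = None
    except FileNotFoundError:
        pass  # no journal yet: nothing has been copied
    return committed, pending
```

On restart, the tool would rebuild the raid table for the committed
offset and redo the pending stripe, if any.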