From: Phillip Susi <psusi@cfl.rr.com>
To: Michael Tokarev <mjt@tls.msk.ru>
Cc: sander@humilis.net, NeilBrown <neilb@suse.de>,
linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
"Steinar H. Gunderson" <sgunderson@bigfoot.com>
Subject: Re: [PATCH 000 of 5] md: Introduction
Date: Tue, 17 Jan 2006 17:38:50 -0500
Message-ID: <43CD71FA.4090908@cfl.rr.com>
In-Reply-To: <43CCD453.9070900@tls.msk.ru>
Michael Tokarev wrote:
<snip>
> Compare this with my statement about "offline" "reshaper" above:
> separate userspace (easier to write/debug compared with kernel
> space) program which operates on an inactive array (no locking
> needed, no need to worry about other I/O operations going to the
> array at the time of reshaping etc), with an ability to plan its
> I/O strategy in a lot more efficient and safer way... Yes this
> approach has one downside: the array has to be inactive. But in
> my opinion it's worth it, compared to more possibilities to lose
> your data, even if you do NOT use that feature at all...
>
>
I also like the idea of doing this kind of thing in user space. I was
under the impression that md was going to be phased out and replaced by
the device mapper. I've been kicking around the idea of a user space
utility that manipulates the device mapper tables and performs the block
moves itself to reshape a raid array. It doesn't seem like it would be
that difficult, and it would not require modifying the kernel at all.
The basic idea is something like this:

/dev/mapper/raid is your raid array, which is mapped to a stripe across
/dev/sda and /dev/sdb. When you want to expand the stripe to add /dev/sdc
to the array, you create three new devices:

/dev/mapper/raid-old: copy of the old mapper table, striping sda and sdb

/dev/mapper/raid-progress: linear map with size = new stripe width,
pointing to raid-new

/dev/mapper/raid-new: what the raid will look like when done, i.e. a
stripe of sda, sdb, and sdc

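As a rough sketch of what the raid-old and raid-new tables might look
like, the snippet below builds one-line tables for the device mapper's
striped target. The helper name striped_table, the device size of
1,000,000 sectors and the 128-sector chunk size are assumptions made up
for illustration; only the table layout (start, length, "striped",
number of stripes, chunk size, then device/offset pairs) is taken from
the device mapper itself.

```python
def striped_table(devices, dev_sectors, chunk_sectors):
    """Return a one-line device-mapper table for a striped target.

    devices       -- list of backing device paths
    dev_sectors   -- usable 512-byte sectors on each backing device
    chunk_sectors -- stripe chunk size in sectors
    """
    # Round each device down to a whole number of chunks so the total
    # length is a multiple of chunk_sectors * len(devices).
    per_dev = (dev_sectors // chunk_sectors) * chunk_sectors
    length = per_dev * len(devices)
    # Every member starts at offset 0 on its backing device.
    members = " ".join(f"{dev} 0" for dev in devices)
    return f"0 {length} striped {len(devices)} {chunk_sectors} {members}"


# Hypothetical sizes: 1,000,000 sectors per disk, 128-sector chunks.
old_table = striped_table(["/dev/sda", "/dev/sdb"], 1_000_000, 128)
new_table = striped_table(["/dev/sda", "/dev/sdb", "/dev/sdc"], 1_000_000, 128)
print(old_table)
print(new_table)
```

A line like this could then be handed to something like
dmsetup create raid-old --table "...", though the utility would more
likely talk to the device mapper through its library.
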
Then you replace /dev/mapper/raid with a linear map to raid-new,
raid-progress, and raid-old, in that order. Initially the lengths of the
chunks from raid-new and raid-progress are zero, so you will still be
entirely accessing raid-old. For each stripe in the array, you change
raid-progress to point to the corresponding blocks in raid-new, but leave
it suspended, so IO to this stripe will block. Then you update the raid
map so raid-progress overlays the stripe you are working on, to catch IO
instead of allowing it to go to raid-old. After you read that stripe
from raid-old and write it to raid-new, you resume raid-progress to flush
any blocked writes to the raid-new stripe. Finally, you update raid so the
previously in-progress stripe now maps to raid-new.

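To make the three-part map concrete, here is a minimal sketch that
computes the table lines for /dev/mapper/raid and for raid-progress at a
given point in the reshape. The function names and the sector counts in
the example are hypothetical. Zero-length segments are simply left out
of the table here, on the assumption that a table line has to cover at
least one sector.

```python
def raid_table(done, unit, old_len):
    """Table lines for /dev/mapper/raid while one stripe is being moved.

    done    -- sectors at the front of the array already copied to raid-new
    unit    -- sectors in the stripe currently being moved
    old_len -- total sectors in the old array
    """
    lines = []
    # Already-moved region: served by raid-new.
    if done > 0:
        lines.append(f"0 {done} linear /dev/mapper/raid-new 0")
    # Stripe in flight: served by raid-progress, which may be suspended.
    moving = min(unit, old_len - done)
    if moving > 0:
        lines.append(f"{done} {moving} linear /dev/mapper/raid-progress 0")
    # Everything behind it: still served by raid-old at the same offset.
    rest = done + moving
    if rest < old_len:
        lines.append(f"{rest} {old_len - rest} linear /dev/mapper/raid-old {rest}")
    return lines


def progress_table(done, unit, old_len):
    """Table line for raid-progress: a window onto the stripe in raid-new."""
    moving = min(unit, old_len - done)
    return f"0 {moving} linear /dev/mapper/raid-new {done}"


# Hypothetical state: two 384-sector stripes done, the third in flight.
for line in raid_table(768, 384, 1999872):
    print(line)
print(progress_table(768, 384, 1999872))
```
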
Repeat for each stripe in the array, then replace the raid table with
raid-new's table and delete the three temporary devices.

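The copy loop itself can be modelled in memory to check that moving the
data front to back, one new-width stripe at a time, never overwrites a
chunk that has not been copied yet. This is only a toy model under
stated assumptions: the disks are Python lists, one list slot stands for
one chunk, and there is no suspended IO and no crash handling. The names
locate and reshape are made up for the sketch.

```python
def locate(chunk, ndisks):
    """Map a logical chunk number to (disk index, chunk offset on that disk)."""
    return chunk % ndisks, chunk // ndisks


def reshape(disks, old_n, new_n, total_chunks):
    """Move total_chunks logical chunks from an old_n-disk stripe layout
    to a new_n-disk stripe layout, in place, front to back."""
    for start in range(0, total_chunks, new_n):
        unit = range(start, min(start + new_n, total_chunks))
        # Read the whole unit through the old layout first...
        buf = []
        for c in unit:
            d, off = locate(c, old_n)
            buf.append(disks[d][off])
        # ...and only then write it through the new layout.
        for c, data in zip(unit, buf):
            d, off = locate(c, new_n)
            disks[d][off] = data


chunks_per_disk = 6
disks = [[None] * chunks_per_disk for _ in range(3)]
total = 2 * chunks_per_disk  # capacity of the old two-disk stripe

# Fill the array through the old two-disk layout.
for c in range(total):
    d, off = locate(c, 2)
    disks[d][off] = f"data{c}"

reshape(disks, 2, 3, total)

# Read everything back through the new three-disk layout.
result = [disks[locate(c, 3)[0]][locate(c, 3)[1]] for c in range(total)]
print(result == [f"data{c}" for c in range(total)])
```

In this model, reading the whole unit before writing any of it matters:
the new location of one chunk in a unit can be the old location of
another chunk in the same unit.
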
Adding transaction logging to the user mode utility wouldn't be very
hard either.
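
One possible shape for such a log, purely as a sketch: before a stripe
is rewritten, its start sector and contents are saved to a file that
lives somewhere other than the array being reshaped, so that after a
crash the write can be replayed from the saved copy. The record layout,
the function names and the choice of a CRC32 check are all assumptions
for illustration, not anything the device mapper provides.

```python
import os
import struct
import zlib

# Record header: start sector, payload length, CRC32 of the payload.
HEADER = struct.Struct("<QII")


def log_unit(path, start, payload):
    """Durably record the unit about to be rewritten, before touching the disks."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(HEADER.pack(start, len(payload), zlib.crc32(payload)))
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
    # Swap the new record in with a rename, so a reader sees either the
    # previous record or the complete new one.
    os.replace(tmp, path)


def read_log(path):
    """Return (start, payload) of the logged unit, or None if absent or damaged."""
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except FileNotFoundError:
        return None
    if len(raw) < HEADER.size:
        return None
    start, length, crc = HEADER.unpack_from(raw)
    payload = raw[HEADER.size:HEADER.size + length]
    if len(payload) != length or zlib.crc32(payload) != crc:
        return None
    return start, payload
```

On restart, the utility would call read_log(), write the payload to
raid-new at the recorded start sector again if a record is found, and
carry on from the next stripe.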