From: Paul Clements <Paul.Clements@SteelEye.com>
To: linux-raid@vger.kernel.org, ptb@it.uc3m.es, mingo@redhat.com,
"james.bottomley" <james.bottomley@SteelEye.com>,
Neil Brown <neilb@cse.unsw.edu.au>
Subject: Re: [ANNOUNCE][PATCH 2.6] md: persistent (file-backed) bitmap and async writes
Date: Fri, 30 Jan 2004 17:52:05 -0500
Message-ID: <401AE015.56F4B384@SteelEye.com>
In-Reply-To: <40198E85.29EBC8E0@SteelEye.com>
I've uploaded a patch against the latest mdadm (1.5.0):
http://parisc-linux.org/~jejb/md_bitmap/mdadm_1_5_0.diff
Thanks,
Paul
Paul Clements wrote:
>
> Description
> ===========
> This patch provides the md driver with the ability to track
> resync/rebuild progress with a bitmap. It also gives the raid1 driver
> the ability to perform asynchronous writes (i.e., writes are
> acknowledged before they actually reach the secondary disk). The bitmap
> and asynchronous write capabilities are primarily useful when raid1 is
> employed in data replication (e.g., with a remote disk served over nbd
> as the secondary device). However, the bitmap is also useful for
> reducing resync/rebuild times with ordinary (local) raid1, raid5, and
> raid6 arrays.
>
> Background
> ==========
> This patch is an adjunct to Peter T. Breuer's raid1 bitmap code (fr1
> v2.14, ftp://oboe.it.uc3m.es/pub/Programs/fr1-2.14.tgz). The code was
> originally written for 2.4 (I have patches vs. 2.4.19/20 Red Hat and
> SuSE kernels, if anyone is interested). The 2.4 version of this patch
> has undergone extensive alpha, beta, and stress testing, including a WAN
> setup where a 500MB partition was mirrored across the U.S. The 2.6
> version of the patch remains as close to the 2.4 version as possible,
> while still allowing it to function properly in the 2.6 kernel. The 2.6
> code has also been tested quite a bit and is fairly stable.
>
> Features
> ========
>
> Persistent Bitmap
> -----------------
> The bitmap tracks which blocks are out of sync between the primary and
> secondary disk in a raid1 array (in raid5, the bitmap would indicate
> which stripes need to be rebuilt). The bitmap is stored in memory (for
> speed) and on disk (for persistence, so that a full resync is never
> needed, even after a failure or reboot).
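
As a rough illustration of the idea quoted above (this is not code from the patch), a dirty-chunk bitmap can be sketched in a few lines of Python. The class name, method names, and sizes below are hypothetical; the only thing taken from the description is that one bit stands for one fixed-size region that may be out of sync.

```python
class ChunkBitmap:
    """One bit per fixed-size chunk; a set bit means the chunk may be out of sync."""

    def __init__(self, device_size, chunk_size):
        self.chunk_size = chunk_size
        # Round up so a partial trailing chunk still gets a bit.
        self.nchunks = (device_size + chunk_size - 1) // chunk_size
        self.bits = bytearray((self.nchunks + 7) // 8)

    def _chunks(self, offset, length):
        # Chunk indexes touched by a write of `length` bytes (length > 0).
        first = offset // self.chunk_size
        last = (offset + length - 1) // self.chunk_size
        return range(first, last + 1)

    def mark_dirty(self, offset, length):
        for c in self._chunks(offset, length):
            self.bits[c // 8] |= 1 << (c % 8)

    def mark_clean(self, chunk):
        self.bits[chunk // 8] &= ~(1 << (chunk % 8)) & 0xFF

    def dirty_chunks(self):
        return [c for c in range(self.nchunks)
                if self.bits[c // 8] & (1 << (c % 8))]


bm = ChunkBitmap(device_size=64 * 1024, chunk_size=4096)  # 16 chunks
bm.mark_dirty(offset=4000, length=200)    # straddles chunks 0 and 1
bm.mark_dirty(offset=40960, length=512)   # falls entirely in chunk 10
print(bm.dirty_chunks())                  # [0, 1, 10]
```

After a failure, only the chunks returned by `dirty_chunks()` would need to be copied to the secondary, rather than the whole device.
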
>
> There is a kernel daemon that periodically (lazily) clears bits in the
> bitmap file (this reduces the number and frequency of disk writes to the
> bitmap file).
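
The effect of lazy clearing can be sketched as follows. This is a toy model, not the daemon from the patch: it assumes that bits are set in the file immediately (so a crash cannot lose them), that clears are only recorded in memory, and that a periodic tick pushes all pending clears to the file in one batched update. All names are hypothetical.

```python
class LazyBitmapFile:
    """Toy model: sets go to disk at once, clears wait for the daemon."""

    def __init__(self):
        self.in_memory = set()   # chunks currently dirty in RAM
        self.on_disk = set()     # chunks marked dirty in the bitmap file
        self.disk_writes = 0     # number of updates to the bitmap file

    def set_bit(self, chunk):
        self.in_memory.add(chunk)
        if chunk not in self.on_disk:
            # Assumption: the file must show the bit before the data write starts.
            self.on_disk.add(chunk)
            self.disk_writes += 1

    def clear_bit(self, chunk):
        # Only the in-memory copy changes; the file is left alone for now.
        self.in_memory.discard(chunk)

    def daemon_tick(self):
        # Clear every bit that is set on disk but no longer dirty in memory.
        stale = self.on_disk - self.in_memory
        if stale:
            self.on_disk -= stale
            self.disk_writes += 1   # modeled as one batched update
        return len(stale)


bf = LazyBitmapFile()
for _ in range(100):        # 100 writes to the same hot chunk
    bf.set_bit(7)
    bf.clear_bit(7)
print(bf.disk_writes)       # 1
bf.daemon_tick()
print(bf.disk_writes)       # 2
```

In this model a hundred set/clear cycles on one chunk cost two bitmap-file updates instead of two hundred, which is the saving the paragraph above describes.
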
>
> The bitmap can also be rescaled; that is, the amount of data that each
> bit represents can be changed. A larger chunk means fewer bits to
> maintain, at the cost of coarser granularity (more data is resynced for
> each dirty bit).
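
The granularity trade-off can be made concrete with a small sketch. The function below is hypothetical and only covers coarsening (making each bit cover more data); it is not taken from the patch. A coarse bit must be set if any of the fine bits it replaces was set.

```python
def rescale(bits, factor):
    """Coarsen a bitmap: each new bit covers `factor` old chunks."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    # A coarse chunk is dirty if any fine chunk inside it is dirty.
    return [any(bits[i:i + factor]) for i in range(0, len(bits), factor)]


fine = [0, 1, 1, 0,  0, 0, 0, 0,  0, 0, 0, 1]   # twelve 4 KB chunks
coarse = rescale(fine, 4)                        # three 16 KB chunks
print(coarse)                                    # [True, False, True]

fine_resync_kb = sum(fine) * 4        # 3 dirty chunks * 4 KB  = 12 KB
coarse_resync_kb = sum(coarse) * 16   # 2 dirty chunks * 16 KB = 32 KB
print(fine_resync_kb, coarse_resync_kb)          # 12 32
```

The coarse bitmap is a quarter of the size, but the same set of writes now forces 32 KB of resync instead of 12 KB.
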
>
> Currently, the bitmap code has been implemented only for raid1, but it
> could easily be leveraged by other raid drivers (namely raid5 and raid6)
> by adding a few calls to the bitmap routines in the appropriate places.
>
> Asynchronous Writes
> -------------------
> The asynchronous write capability allows the raid1 driver to function
> more efficiently in data replication environments (i.e., where the
> secondary disk is remote). Asynchronous writes allow us to overcome high
> network latency by filling the network pipe.
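
To see why keeping many writes in flight matters on a high-latency link, here is a back-of-the-envelope model. It is a simplification under stated assumptions (fixed write size, fixed round-trip time, no protocol overhead), and the link speed and latency figures are made up for illustration.

```python
def replication_throughput(link_bytes_per_s, rtt_ms, write_size, max_outstanding):
    """Upper bound on bytes/s when at most max_outstanding writes are in flight."""
    # Each round trip can acknowledge at most max_outstanding writes.
    window_limited = max_outstanding * write_size * 1000 / rtt_ms
    return min(link_bytes_per_s, window_limited)


link = 1_250_000   # hypothetical 10 Mbit/s WAN link, in bytes per second
rtt_ms = 80        # hypothetical cross-country round-trip time

# One write at a time: the sender idles for a full round trip per write.
print(replication_throughput(link, rtt_ms, 4096, 1))     # 51200.0

# Up to 512 outstanding writes: the link itself becomes the limit.
print(replication_throughput(link, rtt_ms, 4096, 512))   # 1250000
```

Under these assumptions a single outstanding 4 KB write uses about 4% of the link, while a window of 512 writes is more than enough to keep it full.
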
>
> Modifications to mdadm
> ----------------------
> I have modified Neil's mdadm tool to allow it to configure the
> additional bitmap and async parameters. The attached patch is against
> the 1.2 mdadm release. Briefly, the new options are:
>
> Creation:
>
> mdadm -C /dev/md0 -l 1 -n 2 --persistent --async=512 \
>     --bitmap=/tmp/bitmap_md0_file,4096,5 /dev/xxx /dev/yyy
>
> This creates a raid1 array with:
>
> * 2 disks
> * a persistent superblock
> * asynchronous writes enabled (maximum of 512 outstanding writes)
> * bitmap enabled (using the file /tmp/bitmap_md0_file)
> * a bitmap chunksize of 4k (bitmap chunksize determines how much data
> each bitmap bit represents)
> * the bitmap daemon set to wake up every 5 seconds to clear bits in the
> bitmap file (if needed)
> * /dev/xxx as the primary disk
> * /dev/yyy as the backup disk (when asynchronous writes are enabled, the
> second disk in the array is labelled as a "backup", indicating that it
> is remote, and thus no reads will be issued to the device)
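
The three comma-separated fields of the create-time `--bitmap` value can be picked apart as sketched below. This is not the parser from the mdadm patch; the function and field names are hypothetical, and the sketch assumes the chunk size is given in bytes and the daemon delay in seconds, as the list above suggests. It handles only the three-field creation form, not the path-only form used when assembling.

```python
from typing import NamedTuple


class BitmapSpec(NamedTuple):
    path: str
    chunk_bytes: int
    daemon_sleep_s: int


def parse_bitmap_option(value):
    """Split 'file,chunksize,delay' into its three parts."""
    # rsplit from the right so a comma inside the path does not break parsing.
    path, chunk, delay = value.rsplit(",", 2)
    spec = BitmapSpec(path, int(chunk), int(delay))
    if spec.chunk_bytes <= 0 or spec.daemon_sleep_s <= 0:
        raise ValueError("chunk size and delay must be positive")
    return spec


def bitmap_bits(array_bytes, chunk_bytes):
    """Number of bits needed to cover the array, rounding up."""
    return -(-array_bytes // chunk_bytes)


spec = parse_bitmap_option("/tmp/bitmap_md0_file,4096,5")
print(spec.path, spec.chunk_bytes, spec.daemon_sleep_s)

# For a hypothetical 1 GiB array, 4 KB chunks need 262144 bits (32 KiB).
bits = bitmap_bits(1 << 30, spec.chunk_bytes)
print(bits, bits // 8)
```
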
>
> Assembling:
>
> mdadm -A /dev/md0 --bitmap=/tmp/bitmap_md0_file /dev/xxx /dev/yyy
>
> This assembles an existing array and configures it to use a bitmap file.
> The bitmap file pathname is not stored in the array superblock, and so
> must be specified every time the array is assembled.
>
> Details:
>
> mdadm -D /dev/md0
>
> This will display information about /dev/md0, including some additional
> information about the bitmap and async parameters.
>
> I've also added some information to the /proc/mdstat file:
>
> # cat /proc/mdstat
> Personalities : [raid1]
> md1 : active raid1 loop0[0] loop1[1](B)
> 39936 blocks [2/2] [UU]
> async: 0/256 outstanding writes
> bitmap: 1/1 pages (15 cached) [64KB], 64KB chunk, file: /tmp/bitmap_md1
>
> unused devices: <none>
>
> More details on the design and implementation can be found in Section 3
> of my 2003 OLS Paper:
> http://archive.linuxsymposium.org/ols2003/Proceedings/All-Reprints/Reprint-Clements-OLS2003.pdf
>
> Patch Location
> ==============
>
> Finally, the patches are available here:
>
> kernel patch vs. 2.6.2-rc2-bk3
> ------------------------------
> http://parisc-linux.org/~jejb/md_bitmap/md_bitmap_2_30_2_6_2_RC2_BK3_RELEASE.diff
>
> mdadm patch vs. 1.2.0
> ---------------------
> http://parisc-linux.org/~jejb/md_bitmap/mdadm_1_2_0.diff
>
> So if you're interested, please review, test, ask questions, etc. Any
> feedback is welcome.
>
> Thanks,
> Paul