From: Paul Clements <Paul.Clements@SteelEye.com>
To: linux-raid@vger.kernel.org, ptb@it.uc3m.es, mingo@redhat.com,
	"james.bottomley" <james.bottomley@SteelEye.com>,
	Neil Brown <neilb@cse.unsw.edu.au>
Subject: Re: [ANNOUNCE][PATCH 2.6] md: persistent (file-backed) bitmap and async writes
Date: Fri, 30 Jan 2004 17:52:05 -0500
Message-ID: <401AE015.56F4B384@SteelEye.com>
In-Reply-To: 40198E85.29EBC8E0@SteelEye.com

I've uploaded a patch against the latest mdadm (1.5.0):

http://parisc-linux.org/~jejb/md_bitmap/mdadm_1_5_0.diff

Thanks,
Paul


Paul Clements wrote:
> 
> Description
> ===========
> This patch provides the md driver with the ability to track
> resync/rebuild progress with a bitmap. It also gives the raid1 driver
> the ability to perform asynchronous writes (i.e., writes are
> acknowledged before they actually reach the secondary disk). The bitmap
> and asynchronous write capabilities are primarily useful when raid1 is
> employed in data replication (e.g., with a remote disk served over nbd
> as the secondary device). However, the bitmap is also useful for
> reducing resync/rebuild times with ordinary (local) raid1, raid5, and
> raid6 arrays.
> 
> Background
> ==========
> This patch is an adjunct to Peter T. Breuer's raid1 bitmap code (fr1
> v2.14, ftp://oboe.it.uc3m.es/pub/Programs/fr1-2.14.tgz). The code was
> originally written for 2.4 (I have patches vs. 2.4.19/20 Red Hat and
> SuSE kernels, if anyone is interested). The 2.4 version of this patch
> has undergone extensive alpha, beta, and stress testing, including a WAN
> setup where a 500MB partition was mirrored across the U.S. The 2.6
> version of the patch remains as close to the 2.4 version as possible,
> while still allowing it to function properly in the 2.6 kernel. The 2.6
> code has also been tested quite a bit and is fairly stable.
> 
> Features
> ========
> 
> Persistent Bitmap
> -----------------
> The bitmap tracks which blocks are out of sync between the primary and
> secondary disk in a raid1 array (in raid5, the bitmap would indicate
> which stripes need to be rebuilt). The bitmap is stored in memory (for
> speed) and on disk (for persistence, so that a full resync is never
> needed, even after a failure or reboot).
> 
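To make the bookkeeping concrete, here is a minimal sketch of a chunk-granular dirty bitmap. It is not the code from the kernel patch; the class and method names are hypothetical, and it assumes one bit per fixed-size chunk, which is the simplest reading of the description above.

```python
class ChunkBitmap:
    """One bit per fixed-size chunk; a set bit means the chunk may be out of sync."""

    def __init__(self, device_bytes, chunk_bytes):
        self.chunk_bytes = chunk_bytes
        self.nchunks = -(-device_bytes // chunk_bytes)  # ceiling division
        self.bits = bytearray(-(-self.nchunks // 8))

    def _chunks(self, offset, length):
        # Every chunk touched by the byte range [offset, offset + length).
        first = offset // self.chunk_bytes
        last = (offset + length - 1) // self.chunk_bytes
        return range(first, last + 1)

    def mark_dirty(self, offset, length):
        # Called before a write is issued to the mirrors.
        for c in self._chunks(offset, length):
            self.bits[c // 8] |= 1 << (c % 8)

    def clear(self, chunk):
        # Called once the chunk is known to be identical on both disks.
        self.bits[chunk // 8] &= ~(1 << (chunk % 8)) & 0xFF

    def dirty_chunks(self):
        # A resync only needs to copy these chunks, not the whole device.
        return [c for c in range(self.nchunks)
                if self.bits[c // 8] & (1 << (c % 8))]
```

With a 1 MB device and 4096-byte chunks, a 200-byte write at offset 4000 straddles the first chunk boundary, so only chunks 0 and 1 would need to be resynced after a failure.
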
> There is a kernel daemon that periodically (lazily) clears bits in the
> bitmap file (this reduces the number and frequency of disk writes to the
> bitmap file).
> 
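The effect of lazy clearing can be sketched as follows. This is an illustration only, not the daemon from the patch: the names are hypothetical, a Python set stands in for the bitmap file, and it assumes that bits are set on disk immediately while clears are batched until the next daemon wakeup.

```python
class LazyBitmapFile:
    """Sets reach the file immediately; clears are batched until the daemon runs."""

    def __init__(self):
        self.on_disk = set()        # stand-in for the bits stored in the bitmap file
        self.pending_clear = set()  # clears waiting for the next daemon wakeup
        self.file_writes = 0        # number of writes issued to the bitmap file

    def set_bit(self, chunk):
        # A chunk that is about to be written again must not be cleared.
        self.pending_clear.discard(chunk)
        if chunk not in self.on_disk:
            self.on_disk.add(chunk)
            self.file_writes += 1

    def clear_bit(self, chunk):
        # Only remember the clear; the file is not touched here.
        if chunk in self.on_disk:
            self.pending_clear.add(chunk)

    def daemon_tick(self):
        # Periodic wakeup: flush all pending clears in a single file update.
        if not self.pending_clear:
            return
        self.on_disk -= self.pending_clear
        self.pending_clear.clear()
        self.file_writes += 1
```

In this model a chunk that is rewritten before the daemon wakes up costs no extra write to the bitmap file, because its bit is still set on disk.
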
> The bitmap can also be rescaled, i.e., the amount of data that each bit
> represents can be changed. Rescaling to a larger chunk size increases
> efficiency at the cost of reduced bitmap granularity.
> 
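One way to picture rescaling, as a sketch with hypothetical function names and a plain list of booleans standing in for the bitmap (not the routine from the patch): when each new bit covers several old chunks, it must be set if any of them was dirty; going the other way, a dirty coarse bit has to mark every finer chunk it covered.

```python
def coarsen(bits, factor):
    """Each new bit covers `factor` old chunks and is dirty if any of them was."""
    return [any(bits[i:i + factor]) for i in range(0, len(bits), factor)]


def refine(bits, factor):
    """Each old bit is split into `factor` new bits that inherit its state."""
    return [b for b in bits for _ in range(factor)]
```

Coarsening and then refining does not restore the original bitmap: clean chunks that shared a coarse bit with a dirty one come back dirty, which is the granularity cost mentioned above.
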
> Currently, the bitmap code has been implemented only for raid1, but it
> could easily be leveraged by other raid drivers (namely raid5 and raid6)
> by adding a few calls to the bitmap routines in the appropriate places.
> 
> Asynchronous Writes
> -------------------
> The asynchronous write capability allows the raid1 driver to function
> more efficiently in data replication environments (i.e., where the
> secondary disk is remote). Asynchronous writes allow us to overcome high
> network latency by filling the network pipe.
> 
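A back-of-the-envelope model shows why allowing many outstanding writes helps on a high-latency link. This is a simplified model of my own for illustration, not a measurement: it assumes every write is the same size and that throughput is limited either by the round-trip time or by the link bandwidth, whichever is lower. The function and parameter names are hypothetical.

```python
def writes_per_second(rtt_s, max_outstanding, link_bytes_per_s, write_bytes):
    """Upper bound on replicated writes per second in a simple window model."""
    # At most `max_outstanding` writes can be in flight per round trip.
    latency_bound = max_outstanding / rtt_s
    # The link itself can carry only so many writes per second.
    bandwidth_bound = link_bytes_per_s / write_bytes
    return min(latency_bound, bandwidth_bound)


# Example: 50 ms round trip, 10 MB/s link, 4096-byte writes.
one_at_a_time = writes_per_second(0.05, 1, 10_000_000, 4096)
windowed = writes_per_second(0.05, 512, 10_000_000, 4096)
```

With one write in flight at a time the round trip dominates (about 20 writes per second in this example); with up to 512 outstanding writes the bound becomes the link bandwidth (about 2441 writes per second).
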
> Modifications to mdadm
> ----------------------
> I have modified Neil's mdadm tool to allow it to configure the
> additional bitmap and async parameters. The attached patch is against
> the 1.2 mdadm release. Briefly, the new options are:
> 
> Creation:
> 
> mdadm -C /dev/md0 -l 1 -n 2 --persistent --async=512 \
>     --bitmap=/tmp/bitmap_md0_file,4096,5 /dev/xxx /dev/yyy
> 
> This creates a raid1 array with:
> 
> * 2 disks
> * a persistent superblock
> * asynchronous writes enabled (maximum of 512 outstanding writes)
> * bitmap enabled (using the file /tmp/bitmap_md0_file)
> * a bitmap chunksize of 4k (bitmap chunksize determines how much data
>   each bitmap bit represents)
> * the bitmap daemon set to wake up every 5 seconds to clear bits in the
>   bitmap file (if needed)
> * /dev/xxx as the primary disk
> * /dev/yyy as the backup disk (when asynchronous writes are enabled, the
>   second disk in the array is labelled as a "backup", indicating that it
>   is remote, and thus no reads will be issued to the device)
> 
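For reference, the value passed to --bitmap in the creation example has the shape file,chunksize,delay. The sketch below splits such a value; it is not mdadm's actual option parser. The function name is hypothetical, and treating the second and third fields as optional is an assumption based on the assemble example, which passes only the file name.

```python
def parse_bitmap_option(value):
    """Split 'file[,chunksize[,delay]]' into (path, chunk_bytes, delay_seconds)."""
    parts = value.split(",")
    path = parts[0]
    # Fields that are not given are returned as None rather than guessed.
    chunk_bytes = int(parts[1]) if len(parts) > 1 else None
    delay_seconds = int(parts[2]) if len(parts) > 2 else None
    return path, chunk_bytes, delay_seconds
```

For the creation example this yields the bitmap file path, a 4096-byte chunk size, and a 5-second daemon interval.
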
> Assembling:
> 
> mdadm -A /dev/md0 --bitmap=/tmp/bitmap_md0_file /dev/xxx /dev/yyy
> 
> This assembles an existing array and configures it to use a bitmap file.
> The bitmap file pathname is not stored in the array superblock, and so
> must be specified every time the array is assembled.
> 
> Details:
> 
> mdadm -D /dev/md0
> 
> This will display information about /dev/md0, including some additional
> information about the bitmap and async parameters.
> 
> I've also added some information to the /proc/mdstat file:
> 
> # cat /proc/mdstat
> Personalities : [raid1]
> md1 : active raid1 loop0[0] loop1[1](B)
>       39936 blocks [2/2] [UU]
>       async: 0/256 outstanding writes
>       bitmap: 1/1 pages (15 cached) [64KB], 64KB chunk, file: /tmp/bitmap_md1
> 
> unused devices: <none>
> 
> More details on the design and implementation can be found in Section 3
> of my 2003 OLS Paper:
> http://archive.linuxsymposium.org/ols2003/Proceedings/All-Reprints/Reprint-Clements-OLS2003.pdf
> 
> Patch Location
> ==============
> 
> Finally, the patches are available here:
> 
> kernel patch vs. 2.6.2-rc2-bk3
> ------------------------------
> http://parisc-linux.org/~jejb/md_bitmap/md_bitmap_2_30_2_6_2_RC2_BK3_RELEASE.diff
> 
> mdadm patch vs. 1.2.0
> ---------------------
> http://parisc-linux.org/~jejb/md_bitmap/mdadm_1_2_0.diff
> 
> So if you're interested, please review, test, ask questions, etc. Any
> feedback is welcome.
> 
> Thanks,
> Paul
