* [PATCH 0/6] md: Remove the hash tables from raid0.
@ 2009-05-14 10:43 Andre Noll
2009-05-14 10:43 ` [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones Andre Noll
` (5 more replies)
0 siblings, 6 replies; 39+ messages in thread
From: Andre Noll @ 2009-05-14 10:43 UTC (permalink / raw)
To: neilb; +Cc: raziebe, linux-raid, Andre Noll
As mentioned by Neil, the raid0 hash table code probably does not
add any value. Moreover, it contains some rather strange sector_t
manipulations which are needed to set up and maintain the table.
This patch series against Neil's for-next tree as of yesterday removes
the hash table from the raid0 code.
Patch #1 replaces the hash table lookup by a simple function that
loops over all strip zones to find the zone that holds a given sector.
This change makes it possible to get rid of the hash table itself
(patch #2) and of the related fields of struct raid0_private_data
(patch #3).
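The lookup that patch #1 introduces can be modeled as a small standalone C program. This is a userspace sketch with simplified stand-in types (the names follow the patch, but `sector_t`, `strip_zone` and the conf struct here are hypothetical reductions, not the kernel definitions):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types. */
typedef unsigned long long sector_t;

struct strip_zone {
	sector_t zone_start;	/* first array sector of this zone */
	sector_t sectors;	/* number of sectors in this zone */
};

struct raid0_conf {
	int nr_strip_zones;
	struct strip_zone *strip_zone;
};

/* Linear scan replacing the hash table lookup: return the zone that
 * holds 'sector', or NULL if the sector lies beyond the array.  The
 * kernel patch calls BUG() in the out-of-range case instead. */
static struct strip_zone *find_zone(struct raid0_conf *conf, sector_t sector)
{
	int i;

	for (i = 0; i < conf->nr_strip_zones; i++) {
		struct strip_zone *z = conf->strip_zone + i;

		if (sector < z->zone_start + z->sectors)
			return z;
	}
	return NULL;
}
```

Since the zone count is bounded by the number of drives, this O(nr_strip_zones) scan is cheap in practice.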
Patch #4 makes raid0 return a proper error code rather than -ENOMEM
when the array could not be started for reasons other than memory
shortage.
The remaining two patches are simple cleanups that further simplify
the raid0 code a bit.
The patched kernel has been tested with a smallish raid0 array
consisting of five devices of varying sizes (created and filled with
contents by an unpatched kernel) and seems to work just fine. It
passes the raid0 tests of the mdadm test suite.
Please consider for inclusion.
drivers/md/raid0.c | 150 ++++++++++++++--------------------------------------
drivers/md/raid0.h | 4 --
2 files changed, 39 insertions(+), 115 deletions(-)
Thanks
Andre
^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones.
  2009-05-14 10:43 [PATCH 0/6] md: Remove the hash tables from raid0 Andre Noll
@ 2009-05-14 10:43 ` Andre Noll
  2009-05-14 11:15   ` SandeepKsinha
                     ` (3 more replies)
  2009-05-14 10:43 ` [PATCH] md: raid0: Remove hash table Andre Noll
                     ` (4 subsequent siblings)
  5 siblings, 4 replies; 39+ messages in thread
From: Andre Noll @ 2009-05-14 10:43 UTC (permalink / raw)
To: neilb; +Cc: raziebe, linux-raid, Andre Noll

The number of strip_zones of a raid0 array is bounded by the number of
drives in the array and is in fact much smaller for typical setups. For
example, any raid0 array containing identical disks will have only
a single strip_zone.

Therefore, the hash tables which are used for quickly finding the
strip_zone that holds a particular sector are of questionable value
and add quite a bit of unnecessary complexity.

This patch replaces the hash table lookup by equivalent code which
simply loops over all strip zones to find the zone that holds the
given sector.

Subsequent cleanup patches will remove the hash table structure.
Signed-off-by: Andre Noll <maan@systemlinux.org>
---
 drivers/md/raid0.c |   32 +++++++++++++++++++-------------
 1 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index c08d755..9fd3c3c 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -398,6 +398,22 @@ static int raid0_stop (mddev_t *mddev)
 	return 0;
 }
 
+/* Find the zone which holds a particular offset */
+static struct strip_zone *find_zone(struct raid0_private_data *conf,
+				    sector_t sector)
+{
+	int i;
+
+	for (i = 0; i < conf->nr_strip_zones; i++) {
+		struct strip_zone *z = conf->strip_zone + i;
+
+		if (sector < z->zone_start + z->sectors)
+			return z;
+	}
+	BUG();
+	return NULL;
+}
+
 static int raid0_make_request (struct request_queue *q, struct bio *bio)
 {
 	mddev_t *mddev = q->queuedata;
@@ -443,20 +459,10 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio)
 		bio_pair_release(bp);
 		return 0;
 	}
-
-
-	{
-		sector_t x = sector >> conf->sector_shift;
-		sector_div(x, (u32)conf->spacing);
-		zone = conf->hash_table[x];
-	}
-
-	while (sector >= zone->zone_start + zone->sectors)
-		zone++;
-
+	zone = find_zone(conf, sector);
+	if (!zone)
+		return 1;
 	sect_in_chunk = bio->bi_sector & (chunk_sects - 1);
-
-
 	{
 		sector_t x = (sector - zone->zone_start) >> chunksect_bits;
-- 
1.5.4.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones.
  2009-05-14 10:43 ` [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones Andre Noll
@ 2009-05-14 11:15   ` SandeepKsinha
  2009-05-14 11:15   ` NeilBrown
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: SandeepKsinha @ 2009-05-14 11:15 UTC (permalink / raw)
To: Andre Noll; +Cc: neilb, raziebe, linux-raid

Hi Andre,

On Thu, May 14, 2009 at 4:13 PM, Andre Noll <maan@systemlinux.org> wrote:
> The number of strip_zones of a raid0 array is bounded by the number of
> drives in the array and is in fact much smaller for typical setups. For
> example, any raid0 array containing identical disks will have only
> a single strip_zone.
>
> Therefore, the hash tables which are used for quickly finding the
> strip_zone that holds a particular sector are of questionable value
> and add quite a bit of unnecessary complexity.
>
> This patch replaces the hash table lookup by equivalent code which
> simply loops over all strip zones to find the zone that holds the
> given sector.
>
> Subsequent cleanup patches will remove the hash table structure.
>
> Signed-off-by: Andre Noll <maan@systemlinux.org>
> ---
>  drivers/md/raid0.c |   32 +++++++++++++++++++-------------
>  1 files changed, 19 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index c08d755..9fd3c3c 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -398,6 +398,22 @@ static int raid0_stop (mddev_t *mddev)
>  	return 0;
>  }
>
> +/* Find the zone which holds a particular offset */
> +static struct strip_zone *find_zone(struct raid0_private_data *conf,
> +				    sector_t sector)
> +{
> +	int i;
> +
> +	for (i = 0; i < conf->nr_strip_zones; i++) {
> +		struct strip_zone *z = conf->strip_zone + i;
> +
> +		if (sector < z->zone_start + z->sectors)
> +			return z;
> +	}
> +	BUG();
> +	return NULL;
> +}
> +
>  static int raid0_make_request (struct request_queue *q, struct bio *bio)
>  {
>  	mddev_t *mddev = q->queuedata;
> @@ -443,20 +459,10 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio)
>  		bio_pair_release(bp);
>  		return 0;
>  	}
> -
> -
> -	{
> -		sector_t x = sector >> conf->sector_shift;
> -		sector_div(x, (u32)conf->spacing);
> -		zone = conf->hash_table[x];
> -	}
> -
> -	while (sector >= zone->zone_start + zone->sectors)
> -		zone++;
> -
> +	zone = find_zone(conf, sector);
> +	if (!zone)
> +		return 1;

Firstly, will find_zone ever return NULL? Your code already asserts
on that situation in find_zone. But in my opinion, you should allow
NULL to be returned from find_zone(), and handle NULL in the caller
function more gracefully than panicking (BUG).

Also, the above lines mean that if zone == NULL, you return 1, which
will eventually submit your request through generic_make_request.

I might be wrong as I am quite new to this code. Kindly forgive.

>  	sect_in_chunk = bio->bi_sector & (chunk_sects - 1);
> -
> -
>  	{
>  		sector_t x = (sector - zone->zone_start) >> chunksect_bits;
>
> --
> 1.5.4.3
>

--
Regards,
Sandeep.

“To learn is to change. Education is a process that changes the learner.”
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones.
  2009-05-14 10:43 ` [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones Andre Noll
  2009-05-14 11:15   ` SandeepKsinha
@ 2009-05-14 11:15   ` NeilBrown
  2009-05-14 12:10     ` Andre Noll
  2009-05-14 12:22       ` [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones Neil Brown
  2009-05-14 12:01   ` SandeepKsinha
  2009-05-14 14:13   ` raz ben yehuda
  3 siblings, 2 replies; 39+ messages in thread
From: NeilBrown @ 2009-05-14 11:15 UTC (permalink / raw)
Cc: raziebe, linux-raid, Andre Noll

On Thu, May 14, 2009 8:43 pm, Andre Noll wrote:
> The number of strip_zones of a raid0 array is bounded by the number of
> drives in the array and is in fact much smaller for typical setups. For
> example, any raid0 array containing identical disks will have only
> a single strip_zone.
>
> Therefore, the hash tables which are used for quickly finding the
> strip_zone that holds a particular sector are of questionable value
> and add quite a bit of unnecessary complexity.
>
> This patch replaces the hash table lookup by equivalent code which
> simply loops over all strip zones to find the zone that holds the
> given sector.
>
> Subsequent cleanup patches will remove the hash table structure.
>
> Signed-off-by: Andre Noll <maan@systemlinux.org>
> ---
>  drivers/md/raid0.c |   32 +++++++++++++++++++-------------
>  1 files changed, 19 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index c08d755..9fd3c3c 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -398,6 +398,22 @@ static int raid0_stop (mddev_t *mddev)
>  	return 0;
>  }
>
> +/* Find the zone which holds a particular offset */
> +static struct strip_zone *find_zone(struct raid0_private_data *conf,
> +				    sector_t sector)
> +{
> +	int i;
> +
> +	for (i = 0; i < conf->nr_strip_zones; i++) {
> +		struct strip_zone *z = conf->strip_zone + i;
> +
> +		if (sector < z->zone_start + z->sectors)
> +			return z;
> +	}

Maybe I'm being petty, but I really don't like this...

I really like

> -	while (sector >= zone->zone_start + zone->sectors)
> -		zone++;

It seems to capture what is really happening.
But I can see that your code has a defensive aspect to it
which is hard to argue against.

It's probably the fact that there are two loop variables (i and z)
that bothers me....
Here is a thought.  How about we extend the strip_zone array
by one element and put a sentinel at the end, with ->zone_start being
the size of the drive ... or maybe even "max sector".

Then the loop could be

	for (i = 0; i < conf->nr_strip_zones; i++)
		if (sector < conf->strip_zone[i+1].zone_start)
			return conf->strip_zone + i;

Save ourselves an 'add' that way :-)

> +	BUG();
> +	return NULL;
> +}
> +
>  static int raid0_make_request (struct request_queue *q, struct bio *bio)
>  {
>  	mddev_t *mddev = q->queuedata;
> @@ -443,20 +459,10 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio)
>  		bio_pair_release(bp);
>  		return 0;
>  	}
> -
> -
> -	{
> -		sector_t x = sector >> conf->sector_shift;
> -		sector_div(x, (u32)conf->spacing);
> -		zone = conf->hash_table[x];
> -	}
> -
> -	while (sector >= zone->zone_start + zone->sectors)
> -		zone++;
> -
> +	zone = find_zone(conf, sector);
> +	if (!zone)
> +		return 1;

I don't much like those last two lines either.  zone will
never be NULL because if find_zone doesn't find anything, it
calls BUG.  And returning from make_request without doing any IO
is just wrong.  So those two lines can go.

Thanks,
NeilBrown

>  	sect_in_chunk = bio->bi_sector & (chunk_sects - 1);
> -
> -
>  	{
>  		sector_t x = (sector - zone->zone_start) >> chunksect_bits;
>
> --
> 1.5.4.3
>

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones.
  2009-05-14 11:15   ` NeilBrown
@ 2009-05-14 12:10     ` Andre Noll
  2009-05-14 12:25       ` NeilBrown
  2009-05-14 12:22     ` [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones Neil Brown
  1 sibling, 1 reply; 39+ messages in thread
From: Andre Noll @ 2009-05-14 12:10 UTC (permalink / raw)
To: NeilBrown; +Cc: raziebe, linux-raid

[-- Attachment #1: Type: text/plain, Size: 2399 bytes --]

On 21:15, NeilBrown wrote:
> > +/* Find the zone which holds a particular offset */
> > +static struct strip_zone *find_zone(struct raid0_private_data *conf,
> > +				    sector_t sector)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < conf->nr_strip_zones; i++) {
> > +		struct strip_zone *z = conf->strip_zone + i;
> > +
> > +		if (sector < z->zone_start + z->sectors)
> > +			return z;
> > +	}
>
> Maybe I'm being petty, but I really don't like this...
>
> I really like
>
> > -	while (sector >= zone->zone_start + zone->sectors)
> > -		zone++;
>
> It seems to capture what is really happening.
> But I can see that your code has a defensive aspect to it
> which is hard to argue against.
>
> It's probably the fact that there are two loop variables (i and z)
> that bothers me....
> Here is a thought.  How about we extend the strip_zone array
> by one element and put a sentinel at the end, with ->zone_start being
> the size of the drive ... or maybe even "max sector".
>
> Then the loop could be
>
> 	for (i = 0; i < conf->nr_strip_zones; i++)
> 		if (sector < conf->strip_zone[i+1].zone_start)
> 			return conf->strip_zone + i;
>
> Save ourselves an 'add' that way :-)

Saving an addition is a good thing as find_zone() is certainly
performance-critical. After all, that's why the original author used
hash tables.

The drawback is of course that we will happily return the last zone
for (invalid) requests beyond the array size.
If that happens, data corruption on the drive(s) of the last zone
will likely be the result, so I feel a bit uncomfortable doing this.

How about extending strip_zone as you propose, storing the array size
in ->zone_start of the last element and still calling BUG() after
the loop? That still saves the addition and avoids the mentioned
data corruption.

> > +	zone = find_zone(conf, sector);
> > +	if (!zone)
> > +		return 1;
>
> I don't much like those last two lines either.  zone will
> never be NULL because if find_zone doesn't find anything, it
> calls BUG.  And returning from make_request without doing any IO
> is just wrong.  So those two lines can go.

Yup. Do you want me to rework the patches, or should I follow up with
patches on top of the series?

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread
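The sentinel variant being discussed can be sketched as a standalone userspace model. This is a hypothetical reduction (simplified types, no kernel headers); the extra array element's zone_start holds the array size, so the scan needs no per-iteration addition, and out-of-range sectors still fall off the end where the kernel code would call BUG():

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long sector_t;

struct strip_zone {
	sector_t zone_start;	/* first array sector of this zone */
};

/* Lookup over a sentinel-terminated zone array: z[] has nr_zones real
 * entries plus one sentinel whose zone_start is the total array size.
 * Returns NULL for out-of-range sectors (the kernel would BUG()). */
static struct strip_zone *find_zone(struct strip_zone *z, int nr_zones,
				    sector_t sector)
{
	int i;

	for (i = 0; i < nr_zones; i++)
		if (sector < z[i + 1].zone_start)	/* reads the sentinel */
			return z + i;
	return NULL;
}
```

Compared with the two-variable loop in the patch, each iteration performs only a comparison against the next zone_start, which is the "save ourselves an 'add'" point above.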
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones.
  2009-05-14 12:10     ` Andre Noll
@ 2009-05-14 12:25       ` NeilBrown
  2009-05-14 12:54         ` Sujit Karataparambil
                           ` (3 more replies)
  0 siblings, 4 replies; 39+ messages in thread
From: NeilBrown @ 2009-05-14 12:25 UTC (permalink / raw)
To: Andre Noll; +Cc: raziebe, linux-raid

On Thu, May 14, 2009 10:10 pm, Andre Noll wrote:
> Yup. Do you want me to rework the patches, or should I follow up with
> patches on top of the series?

A fresh new set please.  And I agree with everything else you said in
the email.

NeilBrown

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones.
  2009-05-14 12:25       ` NeilBrown
@ 2009-05-14 12:54         ` Sujit Karataparambil
  2009-05-14 15:00           ` SandeepKsinha
  2009-05-14 15:58 ` PATCH md [001:002]: raid0: fix chunk size to 4K*n granularity raz ben yehuda
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 39+ messages in thread
From: Sujit Karataparambil @ 2009-05-14 12:54 UTC (permalink / raw)
To: NeilBrown; +Cc: Andre Noll, raziebe, linux-raid

Hi Neil,

Out of curiosity, could you please tell what happens when NULL is
returned by find_zone?

On Thu, May 14, 2009 at 5:55 PM, NeilBrown <neilb@suse.de> wrote:
> On Thu, May 14, 2009 10:10 pm, Andre Noll wrote:
>> Yup. Do you want me to rework the patches, or should I follow up with
>> patches on top of the series?
>
> A fresh new set please.  And I agree with everything else you said in
> the email.
>
> NeilBrown

-- 
-- Sujit K M

^ permalink raw reply	[flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones.
  2009-05-14 12:54         ` Sujit Karataparambil
@ 2009-05-14 15:00           ` SandeepKsinha
  0 siblings, 0 replies; 39+ messages in thread
From: SandeepKsinha @ 2009-05-14 15:00 UTC (permalink / raw)
To: Sujit Karataparambil; +Cc: NeilBrown, Andre Noll, raziebe, linux-raid

Hi Sujit,

On Thu, May 14, 2009 at 6:24 PM, Sujit Karataparambil <sjt.kar@gmail.com> wrote:
> Hi Neil,
>
> Out of curiosity, could you please tell what happens when NULL is
> returned by find_zone?

find_zone will never return NULL, as we have already asserted that
condition. If we are unable to find a zone, we invoke BUG().

> On Thu, May 14, 2009 at 5:55 PM, NeilBrown <neilb@suse.de> wrote:
>> On Thu, May 14, 2009 10:10 pm, Andre Noll wrote:
>>> Yup. Do you want me to rework the patches, or should I follow up with
>>> patches on top of the series?
>>
>> A fresh new set please.  And I agree with everything else you said in
>> the email.
>>
>> NeilBrown

-- 
Regards,
Sandeep.

^ permalink raw reply	[flat|nested] 39+ messages in thread
* PATCH md [001:002]: raid0: fix chunk size to 4K*n granularity
  2009-05-14 12:25       ` NeilBrown
  2009-05-14 12:54         ` Sujit Karataparambil
@ 2009-05-14 15:58         ` raz ben yehuda
  2009-05-14 14:07           ` Andre Noll
  2009-05-14 22:35           ` Neil Brown
  2009-05-14 16:00 ` Subject: PATCH[002:002] md: raid0: dump raid configuration raz ben yehuda
  2009-05-14 17:12 ` Subject: [PATCH] mdadm: raid0: support chunks of 4K*n for raid0 raz ben yehuda
  3 siblings, 2 replies; 39+ messages in thread
From: raz ben yehuda @ 2009-05-14 15:58 UTC (permalink / raw)
To: NeilBrown; +Cc: Andre Noll, linux-raid

Move the raid0 chunk size to 4K*n granularity.

The motivation for this patch is to have better access to raid550. If
a raid5 is a 3M stripe (4-1), and you have two of these raid5's, and
on top of them a raid0, it is better to access the raid550 with 3MB
buffers and not 1M (no raid5 write penalty).

Andre, the patch is applied on top of your last post. Now it is your
turn to merge :)

 md.c    |   24 ++++++++++-----
 raid0.c |  102 ++++++++++++++++++++++++++++++----------------------------------
 2 files changed, 65 insertions(+), 61 deletions(-)

Signed-off-by: raziebe@gmail.com

diff --git a/drivers/md/md.c b/drivers/md/md.c
index ed5727c..5eab782 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -440,12 +440,14 @@ static inline sector_t calc_dev_sboffset(struct block_device *bdev)
 	return MD_NEW_SIZE_SECTORS(num_sectors);
 }
 
+
 static sector_t calc_num_sectors(mdk_rdev_t *rdev, unsigned chunk_size)
 {
 	sector_t num_sectors = rdev->sb_start;
-
-	if (chunk_size)
-		num_sectors &= ~((sector_t)chunk_size/512 - 1);
+	if (chunk_size) {
+		int chunk_sects = chunk_size>>9;
+		num_sectors = (num_sectors/chunk_sects)*chunk_sects;
+	}
 	return num_sectors;
 }
 
@@ -3512,7 +3514,7 @@ min_sync_store(mddev_t *mddev, const char *buf, size_t len)
 
 	/* Must be a multiple of chunk_size */
 	if (mddev->chunk_size) {
-		if (min & (sector_t)((mddev->chunk_size>>9)-1))
+		if (min % (sector_t)(mddev->chunk_size>>9))
 			return -EINVAL;
 	}
 	mddev->resync_min = min;
@@ -3549,7 +3551,7 @@ max_sync_store(mddev_t *mddev, const char *buf, size_t len)
 
 	/* Must be a multiple of chunk_size */
 	if (mddev->chunk_size) {
-		if (max & (sector_t)((mddev->chunk_size>>9)-1))
+		if (max % (sector_t)((mddev->chunk_size>>9)))
			return -EINVAL;
 	}
 	mddev->resync_max = max;
@@ -3993,11 +3995,19 @@ static int do_md_run(mddev_t * mddev)
 		/*
 		 * chunk-size has to be a power of 2
 		 */
-		if ( (1 << ffz(~chunk_size)) != chunk_size) {
+		if ((1 << ffz(~chunk_size)) != chunk_size &&
+			mddev->level != 0) {
 			printk(KERN_ERR "chunk_size of %d not valid\n", chunk_size);
 			return -EINVAL;
 		}
-
+		/*
+		 * raid0 chunk size has to divide by a page
+		 */
+		if (mddev->level == 0 && (chunk_size % 4096)) {
+			printk(KERN_ERR "chunk_size of %d not valid\n",
+				chunk_size);
+			return -EINVAL;
+		}
 		/* devices must have minimum size of one chunk */
 		list_for_each_entry(rdev, &mddev->disks, same_set) {
 			if (test_bit(Faulty, &rdev->flags))
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 36b747a..9865316 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -53,7 +53,7 @@ static int raid0_congested(void *data, int bits)
 }
 
 
-static int create_strip_zones (mddev_t *mddev)
+static int raid0_create_strip_zones(mddev_t *mddev)
 {
 	int i, c, j;
 	sector_t current_start, curr_zone_start;
@@ -237,7 +237,7 @@ static int raid0_mergeable_bvec(struct request_queue *q,
 	unsigned int chunk_sectors = mddev->chunk_size >> 9;
 	unsigned int bio_sectors = bvm->bi_size >> 9;
 
-	max = (chunk_sectors - ((sector & (chunk_sectors - 1)) + bio_sectors)) << 9;
+	max = (chunk_sectors - ((sector % chunk_sectors) + bio_sectors)) << 9;
 	if (max < 0) max = 0; /* bio_add cannot handle a negative return */
 	if (max <= biovec->bv_len && bio_sectors == 0)
 		return biovec->bv_len;
@@ -259,26 +259,37 @@ static sector_t raid0_size(mddev_t *mddev, sector_t sectors, int raid_disks)
 	return array_sectors;
 }
 
+static int raid0_is_power2_chunk(mddev_t *mddev)
+{
+	if ((1 << ffz(~mddev->chunk_size)) == mddev->chunk_size)
+		return 1;
+	return 0;
+}
+
+
 static int raid0_run(mddev_t *mddev)
 {
 	int ret;
+	int segment_boundary = (mddev->chunk_size>>1)-1;
 
 	if (mddev->chunk_size == 0) {
 		printk(KERN_ERR "md/raid0: non-zero chunk size required.\n");
 		return -EINVAL;
 	}
-	printk(KERN_INFO "%s: setting max_sectors to %d, segment boundary to %d\n",
-	       mdname(mddev),
-	       mddev->chunk_size >> 9,
-	       (mddev->chunk_size>>1)-1);
 	blk_queue_max_sectors(mddev->queue, mddev->chunk_size >> 9);
-	blk_queue_segment_boundary(mddev->queue, (mddev->chunk_size>>1) - 1);
+	if (!raid0_is_power2_chunk(mddev))
+		segment_boundary = ~(ffz(~mddev->chunk_size))>>1;
+	printk(KERN_INFO "%s: setting max_sectors to %d, segment boundary to %d\n",
+		mdname(mddev),
+		mddev->chunk_size >> 9,
+		segment_boundary);
+	blk_queue_segment_boundary(mddev->queue, segment_boundary);
 	mddev->queue->queue_lock = &mddev->queue->__queue_lock;
 
 	mddev->private = kmalloc(sizeof(raid0_conf_t), GFP_KERNEL);
 	if (!mddev->private)
 		return -ENOMEM;
-	ret = create_strip_zones(mddev);
+	ret = raid0_create_strip_zones(mddev);
 	if (ret < 0) {
 		kfree(mddev->private);
 		mddev->private = NULL;
@@ -322,31 +333,35 @@ static int raid0_stop (mddev_t *mddev)
 	return 0;
 }
 
-/* Find the zone which holds a particular offset */
-static struct strip_zone *find_zone(struct raid0_private_data *conf,
-				    sector_t sector)
+static int raid0_position_bio(mddev_t *mddev, struct bio *bio, sector_t sector)
 {
-	int i;
-
-	for (i = 0; i < conf->nr_strip_zones; i++) {
-		struct strip_zone *z = conf->strip_zone + i;
-
-		if (sector < z->zone_start + z->sectors)
-			return z;
-	}
-	BUG();
-	return NULL;
+	sector_t sect_in_chunk;
+	mdk_rdev_t *tmp_dev;
+	sector_t chunk_in_dev;
+	sector_t rsect;
+	sector_t x;
+	raid0_conf_t *conf = mddev_to_conf(mddev);
+	sector_t chunk_sects = mddev->chunk_size >> 9;
+	struct strip_zone *zone = &conf->strip_zone[0];
+
+	while (sector >= zone->zone_start + zone->sectors)
+		zone++;
+	sect_in_chunk = sector % chunk_sects;
+	x = (sector - zone->zone_start) / chunk_sects;
+	sector_div(x, zone->nb_dev);
+	chunk_in_dev = x;
+	x = sector / chunk_sects;
+	tmp_dev = zone->dev[sector_div(x, zone->nb_dev)];
+	rsect = (chunk_in_dev * chunk_sects) + zone->dev_start + sect_in_chunk;
+	bio->bi_bdev = tmp_dev->bdev;
+	bio->bi_sector = rsect + tmp_dev->data_offset;
+	return 0;
 }
 
-static int raid0_make_request (struct request_queue *q, struct bio *bio)
+static int raid0_make_request(struct request_queue *q, struct bio *bio)
 {
 	mddev_t *mddev = q->queuedata;
-	unsigned int sect_in_chunk, chunksect_bits, chunk_sects;
-	raid0_conf_t *conf = mddev_to_conf(mddev);
-	struct strip_zone *zone;
-	mdk_rdev_t *tmp_dev;
-	sector_t chunk;
-	sector_t sector, rsect;
+	unsigned int chunk_sects;
 	const int rw = bio_data_dir(bio);
 	int cpu;
 
@@ -362,10 +377,9 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio)
 	part_stat_unlock();
 
 	chunk_sects = mddev->chunk_size >> 9;
-	chunksect_bits = ffz(~chunk_sects);
-	sector = bio->bi_sector;
 
-	if (unlikely(chunk_sects < (bio->bi_sector & (chunk_sects - 1)) + (bio->bi_size >> 9))) {
+	if (unlikely(chunk_sects < ((bio->bi_sector % chunk_sects)
+		+ (bio->bi_size >> 9)))) {
 		struct bio_pair *bp;
 		/* Sanity check -- queue functions should prevent this happening */
 		if (bio->bi_vcnt != 1 ||
@@ -374,7 +388,8 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio)
 		/* This is a one page bio that upper layers
 		 * refuse to split for us, so we need to split it.
 		 */
-		bp = bio_split(bio, chunk_sects - (bio->bi_sector & (chunk_sects - 1)));
+		bp = bio_split(bio, chunk_sects -
+			(bio->bi_sector % chunk_sects));
 		if (raid0_make_request(q, &bp->bio1))
 			generic_make_request(&bp->bio1);
 		if (raid0_make_request(q, &bp->bio2))
@@ -383,29 +398,8 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio)
 		bio_pair_release(bp);
 		return 0;
 	}
-	zone = find_zone(conf, sector);
-	if (!zone)
+	if (!raid0_position_bio(mddev, bio, bio->bi_sector))
 		return 1;
-	sect_in_chunk = bio->bi_sector & (chunk_sects - 1);
-	{
-		sector_t x = (sector - zone->zone_start) >> chunksect_bits;
-
-		sector_div(x, zone->nb_dev);
-		chunk = x;
-
-		x = sector >> chunksect_bits;
-		tmp_dev = zone->dev[sector_div(x, zone->nb_dev)];
-	}
-	rsect = (chunk << chunksect_bits) + zone->dev_start + sect_in_chunk;
-
-	bio->bi_bdev = tmp_dev->bdev;
-	bio->bi_sector = rsect + tmp_dev->data_offset;
-
-	/*
-	 * Let the main block layer submit the IO and resolve recursion:
-	 */
-	return 1;
-
 bad_map:
 	printk("raid0_make_request bug: can't convert block across chunks"
 		" or bigger than %dk %llu %d\n", chunk_sects / 2,

^ permalink raw reply related	[flat|nested] 39+ messages in thread
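The address arithmetic in raid0_position_bio() above can be checked with a small standalone model. This is a hypothetical userspace sketch restricted to a single zone starting at sector 0 (so zone_start and dev_start are both 0), using plain '/' and '%' where the kernel must use sector_div():

```c
#include <assert.h>

typedef unsigned long long sector_t;

/* Map an array sector to (device index, sector on that device) for a
 * single-zone raid0 with 'nb_dev' devices and 'chunk_sects' sectors
 * per chunk, following the arithmetic of raid0_position_bio(). */
static void map_sector(sector_t sector, sector_t chunk_sects, int nb_dev,
		       int *dev, sector_t *dev_sector)
{
	sector_t sect_in_chunk = sector % chunk_sects;	/* offset inside chunk */
	sector_t chunk = sector / chunk_sects;		/* chunk number in array */
	sector_t chunk_in_dev = chunk / nb_dev;		/* chunk number on device */

	*dev = (int)(chunk % nb_dev);			/* round-robin device */
	*dev_sector = chunk_in_dev * chunk_sects + sect_in_chunk;
}
```

With 2 devices and 8-sector chunks, sectors 0-7 land on device 0, sectors 8-15 on device 1, sectors 16-23 back on device 0 at offset 8, and so on, which is exactly the striping the diff computes via sector_div().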
* Re: PATCH md [001:002]: raid0: fix chunk size to 4K*n granularity
  2009-05-14 15:58 ` PATCH md [001:002]: raid0: fix chunk size to 4K*n granularity raz ben yehuda
@ 2009-05-14 14:07   ` Andre Noll
  2009-05-14 22:35   ` Neil Brown
  1 sibling, 0 replies; 39+ messages in thread
From: Andre Noll @ 2009-05-14 14:07 UTC (permalink / raw)
To: raz ben yehuda; +Cc: NeilBrown, linux-raid

[-- Attachment #1: Type: text/plain, Size: 3382 bytes --]

On 18:58, raz ben yehuda wrote:
> Andre, the patch is applied on top of your last post. Now it is your
> turn to merge :)

I'll have to rework my patch series and will hopefully send the new
series tomorrow. If it's clear that moving raid0 to 4K granularity
is a good thing to do, I'll add your patches on top of the new series.

>  static sector_t calc_num_sectors(mdk_rdev_t *rdev, unsigned chunk_size)
>  {
>  	sector_t num_sectors = rdev->sb_start;
> -
> -	if (chunk_size)
> -		num_sectors &= ~((sector_t)chunk_size/512 - 1);
> +	if (chunk_size) {
> +		int chunk_sects = chunk_size>>9;

As chunk_size is unsigned, chunk_sects should probably also be
unsigned.

> +		num_sectors = (num_sectors/chunk_sects)*chunk_sects;

A ROUND_DOWN macro perhaps? There are only four of them in the kernel
tree :-/

> @@ -3512,7 +3514,7 @@ min_sync_store(mddev_t *mddev, const char *buf, size_t len)
>
>  	/* Must be a multiple of chunk_size */
>  	if (mddev->chunk_size) {
> -		if (min & (sector_t)((mddev->chunk_size>>9)-1))
> +		if (min % (sector_t)(mddev->chunk_size>>9))
>  			return -EINVAL;
>  	}
>  	mddev->resync_min = min;
> @@ -3549,7 +3551,7 @@ max_sync_store(mddev_t *mddev, const char *buf, size_t len)
>
>  	/* Must be a multiple of chunk_size */
>  	if (mddev->chunk_size) {
> -		if (max & (sector_t)((mddev->chunk_size>>9)-1))
> +		if (max % (sector_t)((mddev->chunk_size>>9)))

If we decide to clean these up, we might as well do it right and use
ffz()/ffs().

> +		/*
> +		 * raid0 chunk size has to divide by a page
> +		 */

Nitpick: I'd prefer to say "must be a multiple of", and the
indentation does not match coding style standards.

> +static int raid0_is_power2_chunk(mddev_t *mddev)
> +{
> +	if ((1 << ffz(~mddev->chunk_size)) == mddev->chunk_size)
> +		return 1;
> +	return 0;
> +}

This could go to md.h as an inline function or at least be used
everywhere in raid0.c.

> -/* Find the zone which holds a particular offset */
> -static struct strip_zone *find_zone(struct raid0_private_data *conf,
> -				    sector_t sector)
> +static int raid0_position_bio(mddev_t *mddev, struct bio *bio, sector_t sector)
>  {
> -	int i;
> -
> -	for (i = 0; i < conf->nr_strip_zones; i++) {
> -		struct strip_zone *z = conf->strip_zone + i;
> -
> -		if (sector < z->zone_start + z->sectors)
> -			return z;
> -	}
> -	BUG();
> -	return NULL;
> +	sector_t sect_in_chunk;
> +	mdk_rdev_t *tmp_dev;
> +	sector_t chunk_in_dev;
> +	sector_t rsect;
> +	sector_t x;
> +	raid0_conf_t *conf = mddev_to_conf(mddev);
> +	sector_t chunk_sects = mddev->chunk_size >> 9;
> +	struct strip_zone *zone = &conf->strip_zone[0];
> +
> +	while (sector >= zone->zone_start + zone->sectors)
> +		zone++;
> +	sect_in_chunk = sector % chunk_sects;
> +	x = (sector - zone->zone_start) / chunk_sects;
> +	sector_div(x, zone->nb_dev);
> +	chunk_in_dev = x;
> +	x = sector / chunk_sects;
> +	tmp_dev = zone->dev[sector_div(x, zone->nb_dev)];
> +	rsect = (chunk_in_dev * chunk_sects) + zone->dev_start + sect_in_chunk;
> +	bio->bi_bdev = tmp_dev->bdev;
> +	bio->bi_sector = rsect + tmp_dev->data_offset;
> +	return 0;
> }

This clashes of course with the planned changes that extend
strip_zone. I'll see what I can do.

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread
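The ROUND_DOWN macro that Andre suggests for calc_num_sectors() can be sketched in a standalone userspace model. ROUND_DOWN is a hypothetical name here (the kernel tree had no canonical version at the time), and the kernel code would have to use sector_div() instead of '/' on sector_t:

```c
#include <assert.h>

typedef unsigned long long sector_t;

/* Truncate x down to the nearest multiple of y (assumes y != 0). */
#define ROUND_DOWN(x, y) (((x) / (y)) * (y))

/* Userspace model of the patched calc_num_sectors(): truncate a
 * device size (in sectors) to a whole number of chunks. */
static sector_t calc_num_sectors(sector_t num_sectors, unsigned chunk_size)
{
	if (chunk_size) {
		unsigned chunk_sects = chunk_size >> 9;	/* bytes -> sectors */

		num_sectors = ROUND_DOWN(num_sectors, (sector_t)chunk_sects);
	}
	return num_sectors;
}
```

Unlike the old `num_sectors &= ~(chunk_sects - 1)` form, the divide-and-multiply version is correct for non-power-of-2 chunk sizes, which is the whole point of the 4K*n patch.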
* Re: PATCH md [001:002]: raid0: fix chunk size to 4K*n granularity 2009-05-14 15:58 ` PATCH md [001:002]: raid0: fix chunk size to 4K*n granularity raz ben yehuda 2009-05-14 14:07 ` Andre Noll @ 2009-05-14 22:35 ` Neil Brown 2009-05-18 22:58 ` raz ben yehuda 1 sibling, 1 reply; 39+ messages in thread From: Neil Brown @ 2009-05-14 22:35 UTC (permalink / raw) To: raz ben yehuda; +Cc: Andre Noll, linux-raid On Thursday May 14, raziebe@gmail.com wrote: > move raid0 chunk size to 4K*n granularity. > motivation for this patch is to have a better access to raid550. if a raid 5 is 3M > stripe (4-1),and you have two of these raids 5's, and on top of you have a raid0, > it is better to access raid550 with a 3MB buffers and not 1M ( no raid5 write penalty). I don't think you can justify "no raid5 write penalty" but it could possibly reduce the raid5 write penalty. > Andre, Patch is applied on top of your last post. now it is your turn to merge :) > md.c | 24 ++++++++++----- > raid0.c | 102 ++++++++++++++++++++++++++++++---------------------------------- > > 2 files changed, 65 insertions(+), 61 deletions(-) > Signed-Off-by:raziebe@gmail.com > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index ed5727c..5eab782 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -440,12 +440,14 @@ static inline sector_t calc_dev_sboffset(struct block_device *bdev) > return MD_NEW_SIZE_SECTORS(num_sectors); > } > > + Just a single blank line between functions please. > static sector_t calc_num_sectors(mdk_rdev_t *rdev, unsigned chunk_size) > { > sector_t num_sectors = rdev->sb_start; > - > - if (chunk_size) > - num_sectors &= ~((sector_t)chunk_size/512 - 1); > + if (chunk_size) { > + int chunk_sects = chunk_size>>9; > + num_sectors = (num_sectors/chunk_sects)*chunk_sects; > + } You need to use "sector_div" here. On a 32 bit host with 64bit sector_t you cannot use normal division. 
sector_div takes a sector_t and an "unsigned long", divides the one by the other, stores the quotient in the sector_t and returns the remainder. > return num_sectors; > } > > @@ -3512,7 +3514,7 @@ min_sync_store(mddev_t *mddev, const char *buf, size_t len) > > /* Must be a multiple of chunk_size */ > if (mddev->chunk_size) { > - if (min & (sector_t)((mddev->chunk_size>>9)-1)) > + if (min % (sector_t)(mddev->chunk_size>>9)) sector_div again. > return -EINVAL; > } > mddev->resync_min = min; > @@ -3549,7 +3551,7 @@ max_sync_store(mddev_t *mddev, const char *buf, size_t len) > > /* Must be a multiple of chunk_size */ > if (mddev->chunk_size) { > - if (max & (sector_t)((mddev->chunk_size>>9)-1)) > + if (max % (sector_t)((mddev->chunk_size>>9))) and again. > return -EINVAL; > } > mddev->resync_max = max; > @@ -3993,11 +3995,19 @@ static int do_md_run(mddev_t * mddev) > /* > * chunk-size has to be a power of 2 > */ > - if ( (1 << ffz(~chunk_size)) != chunk_size) { > + if ((1 << ffz(~chunk_size)) != chunk_size && > + mddev->level != 0) { > printk(KERN_ERR "chunk_size of %d not valid\n", chunk_size); > return -EINVAL; > } > - > + /* > + * raid0 chunk size has to divide by a page > + */ > + if (mddev->level == 0 && (chunk_size % 4096)) { > + printk(KERN_ERR "chunk_size of %d not valid\n", > + chunk_size); > + return -EINVAL; > + } Why does the chunk size have to be a multiple of page size? If there is a reason, include it in the comment. And the size of a page is PAGE_SIZE, not 4096. On some systems PAGE_SIZE is 65536. 
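The sector_div() contract Neil describes above — divide the sector_t in place, hand back the remainder — can be sketched in userspace. This is only an illustrative model (model_sector_div and round_to_chunks are made-up names, not the kernel's helpers), showing the behaviour the suggested calc_num_sectors fix relies on:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Userspace model of sector_div() semantics: divide *s by div in
 * place and return the remainder.  The real kernel helper avoids
 * 64-by-64 division instructions on 32-bit hosts; this sketch only
 * mirrors the observable result. */
static unsigned long model_sector_div(sector_t *s, unsigned long div)
{
	unsigned long rem = (unsigned long)(*s % div);
	*s /= div;
	return rem;
}

/* Rounding a device size down to a whole number of chunks, the way
 * the reviewed calc_num_sectors() change should do it once it uses
 * sector_div instead of the plain / operator. */
static sector_t round_to_chunks(sector_t num_sectors, unsigned int chunk_size)
{
	if (chunk_size) {
		unsigned long chunk_sects = chunk_size >> 9;
		sector_t q = num_sectors;

		(void)model_sector_div(&q, chunk_sects); /* q = full chunks */
		num_sectors = q * chunk_sects;
	}
	return num_sectors;
}
```

On a 32-bit kernel the point is not the arithmetic itself but that the compiler must not emit a 64-bit division; the model above only fixes the semantics.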
> /* devices must have minimum size of one chunk */ > list_for_each_entry(rdev, &mddev->disks, same_set) { > if (test_bit(Faulty, &rdev->flags)) > diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c > index 36b747a..9865316 100644 > --- a/drivers/md/raid0.c > +++ b/drivers/md/raid0.c > @@ -53,7 +53,7 @@ static int raid0_congested(void *data, int bits) > } > > > -static int create_strip_zones (mddev_t *mddev) > +static int raid0_create_strip_zones(mddev_t *mddev) Why did you add the "raid0_". There is no value in a prefix like that on static functions. Removing the trailing space is good though! > { > int i, c, j; > sector_t current_start, curr_zone_start; > @@ -237,7 +237,7 @@ static int raid0_mergeable_bvec(struct request_queue *q, > unsigned int chunk_sectors = mddev->chunk_size >> 9; > unsigned int bio_sectors = bvm->bi_size >> 9; > > - max = (chunk_sectors - ((sector & (chunk_sectors - 1)) + bio_sectors)) << 9; > + max = (chunk_sectors - ((sector % chunk_sectors) + bio_sectors)) << 9; sector_div... but you get the point, I'll stop mentioning it. > if (max < 0) max = 0; /* bio_add cannot handle a negative return */ > if (max <= biovec->bv_len && bio_sectors == 0) > return biovec->bv_len; > @@ -259,26 +259,37 @@ static sector_t raid0_size(mddev_t *mddev, sector_t sectors, int raid_disks) > return array_sectors; > } > > +static int raid0_is_power2_chunk(mddev_t *mddev) > +{ > + if ((1 << ffz(~mddev->chunk_size)) == mddev->chunk_size) > + return 1; > + return 0; > +} include/linux/log2.h has is_power_of_2. Use that. And just say is_power_of_2(mddev->chunk_size) where it is needed. No need for a separate function. 
> + > + > static int raid0_run(mddev_t *mddev) > { > int ret; > + int segment_boundary = (mddev->chunk_size>>1)-1; > > if (mddev->chunk_size == 0) { > printk(KERN_ERR "md/raid0: non-zero chunk size required.\n"); > return -EINVAL; > } > - printk(KERN_INFO "%s: setting max_sectors to %d, segment boundary to %d\n", > - mdname(mddev), > - mddev->chunk_size >> 9, > - (mddev->chunk_size>>1)-1); > blk_queue_max_sectors(mddev->queue, mddev->chunk_size >> 9); > - blk_queue_segment_boundary(mddev->queue, (mddev->chunk_size>>1) - 1); > + if (!raid0_is_power2_chunk(mddev)) > + segment_boundary = ~(ffz(~mddev->chunk_size))>>1; > + printk(KERN_INFO "%s: setting max_sectors to %d, segment boundary to %d\n", > + mdname(mddev), > + mddev->chunk_size >> 9, > + segment_boundary); > + blk_queue_segment_boundary(mddev->queue, segment_boundary); Setting blk_queue_segment_boundary here seems wrong. I know you didn't add that code, you just modified. But I think it might be best to remove it first - in a separate patch. I've just asked Peter Chubb who added that code in the first place if he has any idea why it was there. > mddev->queue->queue_lock = &mddev->queue->__queue_lock; > > mddev->private = kmalloc(sizeof(raid0_conf_t), GFP_KERNEL); > if (!mddev->private) > return -ENOMEM; > - ret = create_strip_zones(mddev); > + ret = raid0_create_strip_zones(mddev); > if (ret < 0) { > kfree(mddev->private); > mddev->private = NULL; > @@ -322,31 +333,35 @@ static int raid0_stop (mddev_t *mddev) > return 0; > } > > -/* Find the zone which holds a particular offset */ > -static struct strip_zone *find_zone(struct raid0_private_data *conf, > - sector_t sector) > +static int raid0_position_bio(mddev_t *mddev, struct bio *bio, sector_t sector) You've removed a comment, changed the name of the function that it described, but have not provided a new comment to describe the new function.. And no "raid0_" prefix please. 
I know a lot of function do still have that prefix but I would like the number to decrease, not increase. It is redundant. > { > - int i; > - > - for (i = 0; i < conf->nr_strip_zones; i++) { > - struct strip_zone *z = conf->strip_zone + i; > - > - if (sector < z->zone_start + z->sectors) > - return z; > - } > - BUG(); > - return NULL; > + sector_t sect_in_chunk; > + mdk_rdev_t *tmp_dev; > + sector_t chunk_in_dev; > + sector_t rsect; > + sector_t x; > + raid0_conf_t *conf = mddev_to_conf(mddev); > + sector_t chunk_sects = mddev->chunk_size >> 9; > + struct strip_zone *zone = &conf->strip_zone[0]; > + > + while (sector >= zone->zone_start + zone->sectors) > + zone++; > + sect_in_chunk = sector % chunk_sects; > + x = (sector - zone->zone_start) / chunk_sects; > + sector_div(x, zone->nb_dev); > + chunk_in_dev = x; > + x = sector / chunk_sects; > + tmp_dev = zone->dev[sector_div(x, zone->nb_dev)]; > + rsect = (chunk_in_dev * chunk_sects) + zone->dev_start + sect_in_chunk; > + bio->bi_bdev = tmp_dev->bdev; > + bio->bi_sector = rsect + tmp_dev->data_offset; > + return 0; > } I would really like it if we only did division if division were need, and just use shift when that is possible. e..g if (is_power_of_two()) do the shift version else do the divide version. 
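The shift-when-possible idea Neil suggests rests on two identities that hold whenever chunk_sects is a power of two. A small userspace sanity sketch (GCC's __builtin_ctz stands in for the kernel's ffz(~x) to get the chunk bits; the function names are illustrative, not driver code):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* For a power-of-two chunk size, modulo is a mask ... */
static sector_t mod_pow2(sector_t sector, unsigned int chunk_sects)
{
	return sector & (chunk_sects - 1);	/* == sector % chunk_sects */
}

/* ... and division is a shift, so the hot path can skip sector_div()
 * entirely. */
static sector_t div_pow2(sector_t sector, unsigned int chunk_sects)
{
	unsigned int chunksect_bits = __builtin_ctz(chunk_sects);

	return sector >> chunksect_bits;	/* == sector / chunk_sects */
}
```

For a non-power-of-2 chunk size neither identity holds, which is why the patch has to fall back to % and sector_div there.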
> > -static int raid0_make_request (struct request_queue *q, struct bio *bio) > +static int raid0_make_request(struct request_queue *q, struct bio *bio) > { > mddev_t *mddev = q->queuedata; > - unsigned int sect_in_chunk, chunksect_bits, chunk_sects; > - raid0_conf_t *conf = mddev_to_conf(mddev); > - struct strip_zone *zone; > - mdk_rdev_t *tmp_dev; > - sector_t chunk; > - sector_t sector, rsect; > + unsigned int chunk_sects; > const int rw = bio_data_dir(bio); > int cpu; > > @@ -362,10 +377,9 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio) > part_stat_unlock(); > > chunk_sects = mddev->chunk_size >> 9; > - chunksect_bits = ffz(~chunk_sects); > - sector = bio->bi_sector; > > - if (unlikely(chunk_sects < (bio->bi_sector & (chunk_sects - 1)) + (bio->bi_size >> 9))) { > + if (unlikely(chunk_sects < ((bio->bi_sector % chunk_sects) > + + (bio->bi_size >> 9)))) { Please keep things lines up properly. The contents of a bracketed expression should never be to the left of the opening bracket, unless the opening bracket is the last symbol on the line. So this should be: > + if (unlikely(chunk_sects < ((bio->bi_sector % chunk_sects) > + + (bio->bi_size >> 9)))) { > struct bio_pair *bp; > /* Sanity check -- queue functions should prevent this happening */ > if (bio->bi_vcnt != 1 || > @@ -374,7 +388,8 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio) > /* This is a one page bio that upper layers > * refuse to split for us, so we need to split it. 
> */ > - bp = bio_split(bio, chunk_sects - (bio->bi_sector & (chunk_sects - 1))); > + bp = bio_split(bio, chunk_sects - > + (bio->bi_sector % chunk_sects)); and this should be > + bp = bio_split(bio, chunk_sects - > + (bio->bi_sector % chunk_sects)); > if (raid0_make_request(q, &bp->bio1)) > generic_make_request(&bp->bio1); > if (raid0_make_request(q, &bp->bio2)) > @@ -383,29 +398,8 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio) > bio_pair_release(bp); > return 0; > } > - zone = find_zone(conf, sector); > - if (!zone) > + if (!raid0_position_bio(mddev, bio, bio->bi_sector)) > return 1; > - sect_in_chunk = bio->bi_sector & (chunk_sects - 1); > - { > - sector_t x = (sector - zone->zone_start) >> chunksect_bits; > - > - sector_div(x, zone->nb_dev); > - chunk = x; > - > - x = sector >> chunksect_bits; > - tmp_dev = zone->dev[sector_div(x, zone->nb_dev)]; > - } > - rsect = (chunk << chunksect_bits) + zone->dev_start + sect_in_chunk; > - > - bio->bi_bdev = tmp_dev->bdev; > - bio->bi_sector = rsect + tmp_dev->data_offset; > - > - /* > - * Let the main block layer submit the IO and resolve recursion: > - */ > - return 1; > - > bad_map: > printk("raid0_make_request bug: can't convert block across chunks" > " or bigger than %dk %llu %d\n", chunk_sects / 2, > Thanks. I haven't thoroughly read through 'position_bio' so I reserve the right to comment further on that :-) But apart from the various observations above, it looks like it is heading in the right direction. NeilBrown ^ permalink raw reply [flat|nested] 39+ messages in thread
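For anyone following the arithmetic in the position_bio hunk reviewed above, the sector-to-device mapping can be modelled in a few lines. This is a simplified userspace sketch, not the driver code — struct zone_model and map_sector are invented names, and the zone walk and bio handling are omitted:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Simplified model of one strip_zone and of the mapping that the
 * patch performs: which member device an array sector lands on, and
 * at which offset within that device. */
struct zone_model {
	sector_t zone_start;	/* first array sector covered by the zone */
	sector_t dev_start;	/* offset of the zone inside each device */
	unsigned int nb_dev;	/* devices striped across in this zone */
};

static void map_sector(const struct zone_model *z, sector_t sector,
		       unsigned int chunk_sects,
		       unsigned int *dev_index, sector_t *dev_sector)
{
	sector_t sect_in_chunk = sector % chunk_sects;
	sector_t chunk = (sector - z->zone_start) / chunk_sects;
	sector_t chunk_in_dev = chunk / z->nb_dev;

	/* Round-robin across the zone's devices, chunk by chunk. */
	*dev_index = (unsigned int)((sector / chunk_sects) % z->nb_dev);
	*dev_sector = chunk_in_dev * chunk_sects
		      + z->dev_start + sect_in_chunk;
}
```

As in the patch, the device index is derived from the absolute sector while the chunk-within-device comes from the zone-relative sector, mirroring the original code's two sector_div calls.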
* Re: PATCH md [001:002]: raid0: fix chunk size to 4K*n granularity 2009-05-14 22:35 ` Neil Brown @ 2009-05-18 22:58 ` raz ben yehuda 0 siblings, 0 replies; 39+ messages in thread From: raz ben yehuda @ 2009-05-18 22:58 UTC (permalink / raw) To: Neil Brown; +Cc: linux-raid Hello Neil. Following this email are 6 patches. I fixed the errors as required. patch 1: Because of the removal of the device list from the strip_zones, raid0 did not compile with the MD_DEBUG flag on. patch 2: informative: print the raid0 zones to the user. patch 3: Add support for chunk sizes of 4K*n instead of 4K*2^n. patch 4: md to support page size chunks in the case of raid 0. patch 5: md to support 1K*n chunks only in the case of raid 0. patch 6: fix mdadm to pass 1K*n chunks only in the case of raid0. Commit is stacked on top of mdadm.c. On Fri, 2009-05-15 at 08:35 +1000, Neil Brown wrote: > On Thursday May 14, raziebe@gmail.com wrote: > > move raid0 chunk size to 4K*n granularity. > > motivation for this patch is to have a better access to raid550. if a raid 5 is 3M > > stripe (4-1),and you have two of these raids 5's, and on top of you have a raid0, > > it is better to access raid550 with a 3MB buffers and not 1M ( no raid5 write penalty). > > I don't think you can justify "no raid5 write penalty" but it could > possibly reduce the raid5 write penalty. > > > Andre, Patch is applied on top of your last post. now it is your turn to merge :) > > md.c | 24 ++++++++++----- > > raid0.c | 102 ++++++++++++++++++++++++++++++---------------------------------- > > > > 2 files changed, 65 insertions(+), 61 deletions(-) > > Signed-Off-by:raziebe@gmail.com > > > > diff --git a/drivers/md/md.c b/drivers/md/md.c > > index ed5727c..5eab782 100644 > > --- a/drivers/md/md.c > > +++ b/drivers/md/md.c > > @@ -440,12 +440,14 @@ static inline sector_t calc_dev_sboffset(struct block_device *bdev) > > return MD_NEW_SIZE_SECTORS(num_sectors); > > } > > > > + > > Just a single blank line between functions please. 
> > > static sector_t calc_num_sectors(mdk_rdev_t *rdev, unsigned chunk_size) > > { > > sector_t num_sectors = rdev->sb_start; > > - > > - if (chunk_size) > > - num_sectors &= ~((sector_t)chunk_size/512 - 1); > > + if (chunk_size) { > > + int chunk_sects = chunk_size>>9; > > + num_sectors = (num_sectors/chunk_sects)*chunk_sects; > > + } > > You need to use "sector_div" here. On a 32 bit host with 64bit > sector_t you cannot use normal division. > sector_div takes a sector_t and an "unsigned long", divides the one by > the other, stores the quotient in the sector_t and returns the > remainder. > > > return num_sectors; > > } > > > > @@ -3512,7 +3514,7 @@ min_sync_store(mddev_t *mddev, const char *buf, size_t len) > > > > /* Must be a multiple of chunk_size */ > > if (mddev->chunk_size) { > > - if (min & (sector_t)((mddev->chunk_size>>9)-1)) > > + if (min % (sector_t)(mddev->chunk_size>>9)) > > sector_div again. > > > return -EINVAL; > > } > > mddev->resync_min = min; > > @@ -3549,7 +3551,7 @@ max_sync_store(mddev_t *mddev, const char *buf, size_t len) > > > > /* Must be a multiple of chunk_size */ > > if (mddev->chunk_size) { > > - if (max & (sector_t)((mddev->chunk_size>>9)-1)) > > + if (max % (sector_t)((mddev->chunk_size>>9))) > > and again. > > > return -EINVAL; > > } > > mddev->resync_max = max; > > @@ -3993,11 +3995,19 @@ static int do_md_run(mddev_t * mddev) > > /* > > * chunk-size has to be a power of 2 > > */ > > - if ( (1 << ffz(~chunk_size)) != chunk_size) { > > + if ((1 << ffz(~chunk_size)) != chunk_size && > > + mddev->level != 0) { > > printk(KERN_ERR "chunk_size of %d not valid\n", chunk_size); > > return -EINVAL; > > } > > - > > + /* > > + * raid0 chunk size has to divide by a page > > + */ > > + if (mddev->level == 0 && (chunk_size % 4096)) { > > + printk(KERN_ERR "chunk_size of %d not valid\n", > > + chunk_size); > > + return -EINVAL; > > + } > > Why does the chunk size have to be a multiple of page size? 
> If there is a reason, include it in the comment. > And the size of a page is PAGE_SIZE, not 4096. On some systems > PAGE_SIZE is 65536. > > > > /* devices must have minimum size of one chunk */ > > list_for_each_entry(rdev, &mddev->disks, same_set) { > > if (test_bit(Faulty, &rdev->flags)) > > diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c > > index 36b747a..9865316 100644 > > --- a/drivers/md/raid0.c > > +++ b/drivers/md/raid0.c > > @@ -53,7 +53,7 @@ static int raid0_congested(void *data, int bits) > > } > > > > > > -static int create_strip_zones (mddev_t *mddev) > > +static int raid0_create_strip_zones(mddev_t *mddev) > > Why did you add the "raid0_". There is no value in a prefix like that > on static functions. Removing the trailing space is good though! > > > { > > int i, c, j; > > sector_t current_start, curr_zone_start; > > @@ -237,7 +237,7 @@ static int raid0_mergeable_bvec(struct request_queue *q, > > unsigned int chunk_sectors = mddev->chunk_size >> 9; > > unsigned int bio_sectors = bvm->bi_size >> 9; > > > > - max = (chunk_sectors - ((sector & (chunk_sectors - 1)) + bio_sectors)) << 9; > > + max = (chunk_sectors - ((sector % chunk_sectors) + bio_sectors)) << 9; > > sector_div... but you get the point, I'll stop mentioning it. > > > if (max < 0) max = 0; /* bio_add cannot handle a negative return */ > > if (max <= biovec->bv_len && bio_sectors == 0) > > return biovec->bv_len; > > @@ -259,26 +259,37 @@ static sector_t raid0_size(mddev_t *mddev, sector_t sectors, int raid_disks) > > return array_sectors; > > } > > > > +static int raid0_is_power2_chunk(mddev_t *mddev) > > +{ > > + if ((1 << ffz(~mddev->chunk_size)) == mddev->chunk_size) > > + return 1; > > + return 0; > > +} > > include/linux/log2.h has is_power_of_2. Use that. > And just say > is_power_of_2(mddev->chunk_size) > where it is needed. No need for a separate function. 
> > > > + > > + > > static int raid0_run(mddev_t *mddev) > > { > > int ret; > > + int segment_boundary = (mddev->chunk_size>>1)-1; > > > > if (mddev->chunk_size == 0) { > > printk(KERN_ERR "md/raid0: non-zero chunk size required.\n"); > > return -EINVAL; > > } > > - printk(KERN_INFO "%s: setting max_sectors to %d, segment boundary to %d\n", > > - mdname(mddev), > > - mddev->chunk_size >> 9, > > - (mddev->chunk_size>>1)-1); > > blk_queue_max_sectors(mddev->queue, mddev->chunk_size >> 9); > > - blk_queue_segment_boundary(mddev->queue, (mddev->chunk_size>>1) - 1); > > + if (!raid0_is_power2_chunk(mddev)) > > + segment_boundary = ~(ffz(~mddev->chunk_size))>>1; > > + printk(KERN_INFO "%s: setting max_sectors to %d, segment boundary to %d\n", > > + mdname(mddev), > > + mddev->chunk_size >> 9, > > + segment_boundary); > > + blk_queue_segment_boundary(mddev->queue, segment_boundary); > > Setting blk_queue_segment_boundary here seems wrong. I know you > didn't add that code, you just modified. But I think it might be > best to remove it first - in a separate patch. > I've just asked Peter Chubb who added that code in the first place if > he has any idea why it was there. > > > mddev->queue->queue_lock = &mddev->queue->__queue_lock; > > > > mddev->private = kmalloc(sizeof(raid0_conf_t), GFP_KERNEL); > > if (!mddev->private) > > return -ENOMEM; > > - ret = create_strip_zones(mddev); > > + ret = raid0_create_strip_zones(mddev); > > if (ret < 0) { > > kfree(mddev->private); > > mddev->private = NULL; > > @@ -322,31 +333,35 @@ static int raid0_stop (mddev_t *mddev) > > return 0; > > } > > > > -/* Find the zone which holds a particular offset */ > > -static struct strip_zone *find_zone(struct raid0_private_data *conf, > > - sector_t sector) > > +static int raid0_position_bio(mddev_t *mddev, struct bio *bio, sector_t sector) > > You've removed a comment, changed the name of the function that it > described, but have not provided a new comment to describe the new > function.. 
> > And no "raid0_" prefix please. I know a lot of function do still have > that prefix but I would like the number to decrease, not increase. It > is redundant. > > > > { > > - int i; > > - > > - for (i = 0; i < conf->nr_strip_zones; i++) { > > - struct strip_zone *z = conf->strip_zone + i; > > - > > - if (sector < z->zone_start + z->sectors) > > - return z; > > - } > > - BUG(); > > - return NULL; > > + sector_t sect_in_chunk; > > + mdk_rdev_t *tmp_dev; > > + sector_t chunk_in_dev; > > + sector_t rsect; > > + sector_t x; > > + raid0_conf_t *conf = mddev_to_conf(mddev); > > + sector_t chunk_sects = mddev->chunk_size >> 9; > > + struct strip_zone *zone = &conf->strip_zone[0]; > > + > > + while (sector >= zone->zone_start + zone->sectors) > > + zone++; > > + sect_in_chunk = sector % chunk_sects; > > + x = (sector - zone->zone_start) / chunk_sects; > > + sector_div(x, zone->nb_dev); > > + chunk_in_dev = x; > > + x = sector / chunk_sects; > > + tmp_dev = zone->dev[sector_div(x, zone->nb_dev)]; > > + rsect = (chunk_in_dev * chunk_sects) + zone->dev_start + sect_in_chunk; > > + bio->bi_bdev = tmp_dev->bdev; > > + bio->bi_sector = rsect + tmp_dev->data_offset; > > + return 0; > > } > > I would really like it if we only did division if division were need, > and just use shift when that is possible. > e..g > if (is_power_of_two()) > do the shift version > else > do the divide version. 
> > > > > > -static int raid0_make_request (struct request_queue *q, struct bio *bio) > > +static int raid0_make_request(struct request_queue *q, struct bio *bio) > > { > > mddev_t *mddev = q->queuedata; > > - unsigned int sect_in_chunk, chunksect_bits, chunk_sects; > > - raid0_conf_t *conf = mddev_to_conf(mddev); > > - struct strip_zone *zone; > > - mdk_rdev_t *tmp_dev; > > - sector_t chunk; > > - sector_t sector, rsect; > > + unsigned int chunk_sects; > > const int rw = bio_data_dir(bio); > > int cpu; > > > > @@ -362,10 +377,9 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio) > > part_stat_unlock(); > > > > chunk_sects = mddev->chunk_size >> 9; > > - chunksect_bits = ffz(~chunk_sects); > > - sector = bio->bi_sector; > > > > - if (unlikely(chunk_sects < (bio->bi_sector & (chunk_sects - 1)) + (bio->bi_size >> 9))) { > > + if (unlikely(chunk_sects < ((bio->bi_sector % chunk_sects) > > + + (bio->bi_size >> 9)))) { > > Please keep things lines up properly. The contents of a bracketed > expression should never be to the left of the opening bracket, unless > the opening bracket is the last symbol on the line. > So this should be: > > + if (unlikely(chunk_sects < ((bio->bi_sector % chunk_sects) > > + + (bio->bi_size >> 9)))) { > > > struct bio_pair *bp; > > /* Sanity check -- queue functions should prevent this happening */ > > if (bio->bi_vcnt != 1 || > > @@ -374,7 +388,8 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio) > > /* This is a one page bio that upper layers > > * refuse to split for us, so we need to split it. 
> > */ > > - bp = bio_split(bio, chunk_sects - (bio->bi_sector & (chunk_sects - 1))); > > + bp = bio_split(bio, chunk_sects - > > + (bio->bi_sector % chunk_sects)); > > and this should be > > + bp = bio_split(bio, chunk_sects - > > + (bio->bi_sector % chunk_sects)); > > > > if (raid0_make_request(q, &bp->bio1)) > > generic_make_request(&bp->bio1); > > if (raid0_make_request(q, &bp->bio2)) > > @@ -383,29 +398,8 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio) > > bio_pair_release(bp); > > return 0; > > } > > - zone = find_zone(conf, sector); > > - if (!zone) > > + if (!raid0_position_bio(mddev, bio, bio->bi_sector)) > > return 1; > > - sect_in_chunk = bio->bi_sector & (chunk_sects - 1); > > - { > > - sector_t x = (sector - zone->zone_start) >> chunksect_bits; > > - > > - sector_div(x, zone->nb_dev); > > - chunk = x; > > - > > - x = sector >> chunksect_bits; > > - tmp_dev = zone->dev[sector_div(x, zone->nb_dev)]; > > - } > > - rsect = (chunk << chunksect_bits) + zone->dev_start + sect_in_chunk; > > - > > - bio->bi_bdev = tmp_dev->bdev; > > - bio->bi_sector = rsect + tmp_dev->data_offset; > > - > > - /* > > - * Let the main block layer submit the IO and resolve recursion: > > - */ > > - return 1; > > - > > bad_map: > > printk("raid0_make_request bug: can't convert block across chunks" > > " or bigger than %dk %llu %d\n", chunk_sects / 2, > > > > > Thanks. > I haven't thoroughly read through 'position_bio' so I reserve the > right to comment further on that :-) > But apart from the various observations above, it looks like it is > heading in the right direction. > > NeilBrown ^ permalink raw reply [flat|nested] 39+ messages in thread
* Subject: PATCH[002:002] md: raid0: dump raid configuration 2009-05-14 12:25 ` NeilBrown 2009-05-14 12:54 ` Sujit Karataparambil 2009-05-14 15:58 ` PATCH md [001:002]: raid0: fix chunk size to 4K*n granularity raz ben yehuda @ 2009-05-14 16:00 ` raz ben yehuda 2009-05-14 17:12 ` Subject: [PATCH] mdadm: raid0: support chunks of 4K*n for raid0 raz ben yehuda 3 siblings, 0 replies; 39+ messages in thread From: raz ben yehuda @ 2009-05-14 16:00 UTC (permalink / raw) To: NeilBrown; +Cc: Andre Noll, linux-raid Tell the user what the raid0 configuration looks like. raid0.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) Signed-Off-by:raziebe@gmail.com --- diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c index 9865316..2ce81d0 100644 --- a/drivers/md/raid0.c +++ b/drivers/md/raid0.c @@ -267,6 +267,29 @@ static int raid0_is_power2_chunk(mddev_t *mddev) } +static void raid0_dump_zones(mddev_t *mddev) +{ + int j, k, h; + char b[BDEVNAME_SIZE]; + raid0_conf_t *conf = mddev_to_conf(mddev); + printk(KERN_INFO "***** %s configuration ******", + mdname(mddev)); + h = 0; + for (j = 0; j < conf->nr_strip_zones; j++) { + printk("\nzone%d", j); + printk("=["); + for (k = 0; k < conf->strip_zone[j].nb_dev; k++) + printk("%s/", bdevname( + conf->strip_zone[j].dev[k]->bdev, b)); + printk("]\n\t zone offset=%llu device offset=%llu size=%llukb\n", + (unsigned long long)conf->strip_zone[j].zone_start, + (unsigned long long)conf->strip_zone[j].dev_start, + (unsigned long long)conf->strip_zone[j].sectors>>1); + } + printk(KERN_INFO "**********************************\n\n"); +} + + static int raid0_run(mddev_t *mddev) { int ret; @@ -317,6 +340,7 @@ static int raid0_run(mddev_t *mddev) } blk_queue_merge_bvec(mddev->queue, raid0_mergeable_bvec); + raid0_dump_zones(mddev); return 0; } ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Subject: [PATCH] mdadm: raid0: support chunks of 4K*n for raid0 2009-05-14 12:25 ` NeilBrown ` (2 preceding siblings ...) 2009-05-14 16:00 ` Subject: PATCH[002:002] md: raid0: dump raid configuration raz ben yehuda @ 2009-05-14 17:12 ` raz ben yehuda 2009-05-15 3:59 ` Sujit Karataparambil 3 siblings, 1 reply; 39+ messages in thread From: raz ben yehuda @ 2009-05-14 17:12 UTC (permalink / raw) To: NeilBrown; +Cc: linux-raid Neil. I do not know what the policy is for mdadm patches. I would be thankful for some instructions. Below is the patch I applied to support raid0 chunk sizes of 4K*n. --- --- mdadm.c (revision 16152) +++ mdadm.c (working copy) @@ -314,7 +314,8 @@ exit(2); } chunk = strtol(optarg, &c, 10); - if (!optarg[0] || *c || chunk<4 || (chunk%4)) { + if (!optarg[0] || *c || chunk<4 || (((chunk-1)&chunk) && level !=0) + || (((chunk%4) && level ==0) )) { fprintf(stderr, Name ": invalid chunk/rounding value: %s\n", optarg); exit(2); ^ permalink raw reply [flat|nested] 39+ messages in thread
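Unpacked, the condition in the patch above amounts to: chunk (given in KiB, as mdadm takes it on the command line) must be at least 4, a multiple of 4 for raid0, and a power of two for every other level. A model of just that expression (chunk_valid is an illustrative name, not an mdadm function):

```c
#include <assert.h>

/* Model of the patched mdadm chunk check: chunk is in KiB.  raid0 may
 * use any multiple of 4 KiB; other levels still require a power of
 * two, tested with the classic (chunk - 1) & chunk trick. */
static int chunk_valid(long chunk, int level)
{
	if (chunk < 4)
		return 0;		/* below the 4K minimum */
	if (level == 0)
		return chunk % 4 == 0;	/* raid0: any 4K multiple */
	return (chunk & (chunk - 1)) == 0;
}
```

The (chunk & (chunk - 1)) == 0 test works because a power of two has a single set bit, so clearing the lowest set bit yields zero.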
* Re: Subject: [PATCH] mdadm: raid0: support chunks of 4K*n for raid0 2009-05-14 17:12 ` Subject: [PATCH] mdadm: raid0: support chunks of 4K*n for raid0 raz ben yehuda @ 2009-05-15 3:59 ` Sujit Karataparambil 2009-05-15 6:01 ` Raz 0 siblings, 1 reply; 39+ messages in thread From: Sujit Karataparambil @ 2009-05-15 3:59 UTC (permalink / raw) To: raz ben yehuda; +Cc: NeilBrown, linux-raid Neil, I do not understand why this needs to be done in raid. Why should the raid be taking care of 4K*n data sets? What I see of this patch is that it specifically checks for some sort of 4K boundaries. Thanks, Sujit On Thu, May 14, 2009 at 10:42 PM, raz ben yehuda <raziebe@gmail.com> wrote: > Neil. > I do not know what is policy for mdadm patches. I would thankful for some instructions. > The bellow is the patch I applied to support raid0 chunk sizes 4K*n. > --- > --- mdadm.c (revision 16152) > +++ mdadm.c (working copy) > @@ -314,7 +314,8 @@ > exit(2); > } > chunk = strtol(optarg, &c, 10); > - if (!optarg[0] || *c || chunk<4 || (chunk%4)) { > + if (!optarg[0] || *c || chunk<4 || (((chunk-1)&chunk) && level !=0) > + || (((chunk%4) && level ==0) )) { > fprintf(stderr, Name ": invalid chunk/rounding value: %s\n", > optarg); > exit(2); > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- -- Sujit K M -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Subject: [PATCH] mdadm: raid0: support chunks of 4K*n for raid0 2009-05-15 3:59 ` Sujit Karataparambil @ 2009-05-15 6:01 ` Raz 2009-05-15 6:45 ` Sujit Karataparambil 0 siblings, 1 reply; 39+ messages in thread From: Raz @ 2009-05-15 6:01 UTC (permalink / raw) To: Sujit Karataparambil; +Cc: linux-raid On Fri, May 15, 2009 at 6:59 AM, Sujit Karataparambil <sjt.kar@gmail.com> wrote: > Neil, > > I donot understand why this needs to be done in raid. Why should be the raid where else? > taking care of 4k*n data sets. > What I see of this patch is that it specifically checks for some sort > of 4K Boundaries. instead of 4K*2^n where n>=0. Correct and recompute the IO position. > Thanks, > Sujit > > On Thu, May 14, 2009 at 10:42 PM, raz ben yehuda <raziebe@gmail.com> wrote: >> Neil. >> I do not know what is policy for mdadm patches. I would thankful for some instructions. >> The bellow is the patch I applied to support raid0 chunk sizes 4K*n. >> --- >> --- mdadm.c (revision 16152) >> +++ mdadm.c (working copy) >> @@ -314,7 +314,8 @@ >> exit(2); >> } >> chunk = strtol(optarg, &c, 10); >> - if (!optarg[0] || *c || chunk<4 || (chunk%4)) { >> + if (!optarg[0] || *c || chunk<4 || (((chunk-1)&chunk) && level !=0) >> + || (((chunk%4) && level ==0) )) { >> fprintf(stderr, Name ": invalid chunk/rounding value: %s\n", >> optarg); >> exit(2); >> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> -- -- Sujit K M -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Subject: [PATCH] mdadm: raid0: support chunks of 4K*n for raid0 2009-05-15 6:01 ` Raz @ 2009-05-15 6:45 ` Sujit Karataparambil 2009-05-15 8:39 ` NeilBrown 0 siblings, 1 reply; 39+ messages in thread From: Sujit Karataparambil @ 2009-05-15 6:45 UTC (permalink / raw) To: Raz; +Cc: linux-raid, neilb On Fri, May 15, 2009 at 11:31 AM, Raz <raziebe@gmail.com> wrote: > where else ? See you were the one who has sent out the patch. Ok? Then you should know the reason for patch? What I had asked from Neil is an RFC Cross Checking the Patch. So I donot expect an Reply From You. Ok? >> taking care of 4k*n data sets. >> What I see of this patch is that it specifically checks for some sort >> of 4K Boundaries. > instead of 4K*2^n where n>=0 > correct and recompure IO posistion. >>> fprintf(stderr, Name ": invalid chunk/rounding value: %s\n", Could you tell me how 4K*2^n is used to do the patch what ever it was sent for? Or is it that the patch is incomplete. Kindly Maintain Some sort of mailing list courtesy.Ok? Reply only when you are asked to. Ok? -- -- Sujit K M ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Subject: [PATCH] mdadm: raid0: support chunks of 4K*n for raid0 2009-05-15 6:45 ` Sujit Karataparambil @ 2009-05-15 8:39 ` NeilBrown 2009-05-15 15:45 ` Raz 0 siblings, 1 reply; 39+ messages in thread From: NeilBrown @ 2009-05-15 8:39 UTC (permalink / raw) To: Sujit Karataparambil; +Cc: Raz, linux-raid On Fri, May 15, 2009 4:45 pm, Sujit Karataparambil wrote: > On Fri, May 15, 2009 at 11:31 AM, Raz <raziebe@gmail.com> wrote: >> where else ? > > See you were the one who has sent out the patch. Ok? Then you should > know the reason for patch? What I had asked from Neil is an RFC Cross > Checking the Patch. So I donot expect an Reply From You. Ok? >>> taking care of 4k*n data sets. >>> What I see of this patch is that it specifically checks for some sort >>> of 4K Boundaries. > > >> instead of 4K*2^n where n>=0 >> correct and recompure IO posistion. >>>> fprintf(stderr, Name ": invalid >>>> chunk/rounding value: %s\n", > > Could you tell me how 4K*2^n is used to do the patch what ever it was > sent for? Or > is it that the patch is incomplete. > > Kindly Maintain Some sort of mailing list courtesy.Ok? Reply only when > you are asked to. > Ok? That really isn't an appropriate thing to say. If people want to reply (politely) this questions directed at other people on a public list, I think that is quite appropriate. In fact I value it as it often means I don't need to reply because someone else has done it for me. Now I agree that "where else" is not a very helpful reply, but I'm not sure that the original question was crystal clear either. Maybe there is a bit of a language barrier - I'm sure not everyone on this list has English as their first language, yet English is the language we use (for which I'm very thankful as I don't speak anything else). So a little bit of patience all around would be good. Also it would help if explanations were a little more detailed, and even if questions contained more detail too. 
In the case of the original patch, I'm not really sure of the point of the patch myself. The text "I do not know what is policy for mdadm patches. I would thankful for some instructions." is clear enough. The answer is "send them to me" in much the same format as the kernel patches you have seen sent by e.g. Andre Noll. The text "The bellow is the patch I applied to support raid0 chunk sizes 4K*n." confuses me as it doesn't seem add anything, it only removes. i.e. it removes the possibility of passing a non-power-of-2 chunk size to raid levels other than 0. That might be an appropriate thing to do, but the commentary that came with the patch doesn't say it does that, and says something quite different, so I'm not really sure what to make of it. Which is one of the reasons I didn't reply straight away - I wanted to give myself time to read it again and make sure I wasn't missing anything. NeilBrown > > -- > -- Sujit K M > ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: Subject: [PATCH] mdadm: raid0: support chunks of 4K*n for raid0 2009-05-15 8:39 ` NeilBrown @ 2009-05-15 15:45 ` Raz 0 siblings, 0 replies; 39+ messages in thread From: Raz @ 2009-05-15 15:45 UTC (permalink / raw) To: NeilBrown, sjt.kar; +Cc: Linux RAID Mailing List Well, first, anyone who somehow, in some way, got confused or insulted, please accept my apologies. This whole thing is just a big misunderstanding, and on my part. Please see inline. I hope we are ok now Sujit. On 5/15/09, NeilBrown <neilb@suse.de> wrote: > On Fri, May 15, 2009 4:45 pm, Sujit Karataparambil wrote: > > On Fri, May 15, 2009 at 11:31 AM, Raz <raziebe@gmail.com> wrote: > >> where else ? > > > > See you were the one who has sent out the patch. Ok? Then you should > > know the reason for patch? What I had asked from Neil is an RFC Cross > > Checking the Patch. So I donot expect an Reply From You. Ok? > >>> taking care of 4k*n data sets. > >>> What I see of this patch is that it specifically checks for some sort > >>> of 4K Boundaries. > > > > > >> instead of 4K*2^n where n>=0 > >> correct and recompure IO posistion. > >>>> fprintf(stderr, Name ": invalid > >>>> chunk/rounding value: %s\n", > > > > Could you tell me how 4K*2^n is used to do the patch what ever it was > > sent for? Or > > is it that the patch is incomplete. > > > > Kindly Maintain Some sort of mailing list courtesy.Ok? Reply only when > > you are asked to. > > Ok? > > That really isn't an appropriate thing to say. If people want to > reply (politely) to questions directed at other people on a public > list, I think that is quite appropriate. In fact I value it as it > often means I don't need to reply because someone else has done > it for me. > > Now I agree that "where else" is not a very helpful reply, but I'm Ok, sorry about that. I just do not understand what the problem is. > not sure that the original question was crystal clear either. 
Maybe > there is a bit of a language barrier - I'm sure not everyone on this > list has English as their first language, yet English is the language > we use (for which I'm very thankful as I don't speak anything else). > So a little bit of patience all around would be good. > > Also it would help if explanations were a little more detailed, and > even if questions contained more detail too. > > In the case of the original patch, I'm not really sure of the point > of the patch myself. > > The text > "I do not know what is policy for mdadm patches. I would thankful for some > instructions." > > is clear enough. The answer is "send them to me" in much the same format > as the kernel patches you have seen sent by e.g. Andre Noll. > The text > "The bellow is the patch I applied to support raid0 chunk sizes 4K*n." > > confuses me as it doesn't seem to add anything, it only removes. > i.e. it removes the possibility of passing a non-power-of-2 > chunk size to raid levels other than 0. lol.... the patch is my mistake. The patch was not against your original code. The original mdadm.c code is: if (!optarg[0] || *c || chunk<4 || ((chunk-1)&chunk)) ... All I want is to enable the chunk fix only for raid0. This is all my mistake, because I am not working under your mdadm git repository but under a private svn repository. Would you be so kind as to direct me to your git's URL (there are several mdadm gits I found on Google)? I will repost this fix. Thank you. raz > That might be an appropriate thing to do, but the commentary that > came with the patch doesn't say it does that, and says something > quite different, so I'm not really sure what to make of it. Which > is one of the reasons I didn't reply straight away - I wanted to give > myself time to read it again and make sure I wasn't missing anything. > > > NeilBrown > > > > > -- > > -- Sujit K M > > > > ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones. 2009-05-14 11:15 ` NeilBrown 2009-05-14 12:10 ` Andre Noll @ 2009-05-14 12:22 ` Neil Brown 2009-05-14 15:51 ` raz ben yehuda 1 sibling, 1 reply; 39+ messages in thread From: Neil Brown @ 2009-05-14 12:22 UTC (permalink / raw) To: Andre Noll; +Cc: raziebe, linux-raid On Thursday May 14, neilb@suse.de wrote: > > Then the loop could be > for (i = 0; i < conf->nr_strip_zones; i++) > if (sector < conf->strip_zone[i+1]) > return conf->strip_zone[i] > > Save our selves an 'add' that way :-) That should have been > > Then the loop could be > for (i = 0; i < conf->nr_strip_zones; i++) > if (sector < conf->strip_zone[i+1]->zone_start) > return conf->strip_zone[i] > > Save ourselves an 'add' that way :-) and as we aren't using ->sectors in this loop any more, I think it can be discarded completely, just making struct strip_zone 20% smaller and improving cache performance. What do you think? NeilBrown ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones. 2009-05-14 12:22 ` [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones Neil Brown @ 2009-05-14 15:51 ` raz ben yehuda 2009-05-14 20:38 ` NeilBrown 2009-05-15 17:30 ` Andre Noll 0 siblings, 2 replies; 39+ messages in thread From: raz ben yehuda @ 2009-05-14 15:51 UTC (permalink / raw) To: Neil Brown; +Cc: Andre Noll, linux-raid On Thu, 2009-05-14 at 22:22 +1000, Neil Brown wrote: > On Thursday May 14, neilb@suse.de wrote: > > > > Then the loop could be > > for (i = 0; i < conf->nr_strip_zones; i++) > > if (sector < conf->strip_zone[i+1]) > > return conf->strip_zone[i] > > > > Save our selves an 'add' that way :-) > > That should have been > > > > > Then the loop could be > > for (i = 0; i < conf->nr_strip_zones; i++) > > if (sector < conf->strip_zone[i+1]->zone_start) > > return conf->strip_zone[i] > > > > Save ourselves an 'add' that way :-) > > and as we aren't using ->sectors in this loop any more, I think it > can be discarded completely, just making struct strip_zone 20% smaller > and improving cache performance. > > What do you think? hamm.. please do not do it now. I think i need it for the reshape because I have to traverse each zone up to its end. > NeilBrown ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones. 2009-05-14 15:51 ` raz ben yehuda @ 2009-05-14 20:38 ` NeilBrown 2009-05-15 13:18 ` Andre Noll 0 siblings, 1 reply; 39+ messages in thread From: NeilBrown @ 2009-05-14 20:38 UTC (permalink / raw) To: raz ben yehuda; +Cc: Andre Noll, linux-raid On Fri, May 15, 2009 1:51 am, raz ben yehuda wrote: > > On Thu, 2009-05-14 at 22:22 +1000, Neil Brown wrote: >> On Thursday May 14, neilb@suse.de wrote: >> > >> > Then the loop could be >> > for (i = 0; i < conf->nr_strip_zones; i++) >> > if (sector < conf->strip_zone[i+1]) >> > return conf->strip_zone[i] >> > >> > Save our selves an 'add' that way :-) >> >> That should have been >> >> > >> > Then the loop could be >> > for (i = 0; i < conf->nr_strip_zones; i++) >> > if (sector < conf->strip_zone[i+1]->zone_start) >> > return conf->strip_zone[i] >> > >> > Save ourselves an 'add' that way :-) >> >> and as we aren't using ->sectors in this loop any more, I think it >> can be discarded completely, just making struct strip_zone 20% smaller >> and improving cache performance. >> >> What do you think? > hamm.. please do not do it now. I think i need it for the reshape > because I have to traverse each zone up to its end. You don't lose that ability. The end of one zone matches the beginning of the next zone. So run from zone[0].zone_start to zone[1].zone_start. I was thinking further about this and realised that in the first zone, zone_start is always 0. And we added a sentinel zone for which only the zone_start value is set. This seems silly. Instead we can just store 'zone_end' in each zone (or possibly start_of_next_zone) and then we don't need the sentinel and don't waste the zone_start at the beginning. NeilBrown >> NeilBrown > > > ^ permalink raw reply [flat|nested] 39+ messages in thread
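[The zone_end scheme Neil describes above can be modelled in a small userspace sketch. The struct, function names, and sample geometry here are illustrative stand-ins, not the eventual kernel code:]

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long sector_t;

/* Each zone records only where it ends; the end of zone i is the start
 * of zone i + 1, so neither the sentinel zone nor a zone_start field for
 * the first zone (which always starts at sector 0) is needed. */
struct strip_zone {
	sector_t zone_end;
};

static struct strip_zone *find_zone(struct strip_zone *z, int nr_zones,
				    sector_t sector)
{
	int i;

	for (i = 0; i < nr_zones; i++)
		if (sector < z[i].zone_end)
			return &z[i];
	return NULL; /* sector lies beyond the array */
}
```

[With zones ending at 100, 250 and 300, sector 99 falls in zone 0, sector 100 in zone 1, and sector 300 is out of range.]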
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones. 2009-05-14 20:38 ` NeilBrown @ 2009-05-15 13:18 ` Andre Noll 0 siblings, 0 replies; 39+ messages in thread From: Andre Noll @ 2009-05-15 13:18 UTC (permalink / raw) To: NeilBrown; +Cc: raz ben yehuda, linux-raid [-- Attachment #1: Type: text/plain, Size: 596 bytes --] On 06:38, NeilBrown wrote: > I was thinking further about this and realised that in the first zone, > zone_start is always 0. And we added a sentinal zone for which only > the zone_start value is set. This seems silly. Yes, it is. > Instead we can just store 'zone_end' in each zone (or possibly > start_of_next_zone) and then we don't need the sentinal and don't waste > the zone_start at the beginning. That's what I did in the new version of the patch set. I'll send it out shortly. Andre -- The only person who always got his work done by Friday was Robinson Crusoe [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones. 2009-05-14 15:51 ` raz ben yehuda 2009-05-14 20:38 ` NeilBrown @ 2009-05-15 17:30 ` Andre Noll 2009-05-15 21:19 ` Raz 1 sibling, 1 reply; 39+ messages in thread From: Andre Noll @ 2009-05-15 17:30 UTC (permalink / raw) To: raz ben yehuda; +Cc: Neil Brown, linux-raid [-- Attachment #1: Type: text/plain, Size: 788 bytes --] On 18:51, raz ben yehuda wrote: > > and as we aren't using ->sectors in this loop any more, I think it > > can be discarded completely, just making struct strip_zone 20% smaller > > and improving cache performance. > > > > What do you think? > hamm.. please do not do it now. I think i need it for the reshape > because I have to traverse each zone up to its end. Well, you can always get the zone size of zone #i by z[i].zone_end - z[i - 1].zone_end (with z[-1].zone_end := 0). Since reshape is a slow operation anyway, the additional subtraction won't hurt, and we'd have the smaller cache footprint during normal operation. So I'd be inclined to kill ->sectors as well. Andre -- The only person who always got his work done by Friday was Robinson Crusoe [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
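[Andre's point above, that a zone's size is recoverable from consecutive zone_end values, reduces to a one-line helper. This is a hypothetical userspace model, not the kernel code itself:]

```c
#include <assert.h>

typedef unsigned long long sector_t;

struct strip_zone {
	sector_t zone_end;
};

/* z[i].zone_end - z[i - 1].zone_end, with z[-1].zone_end taken as 0:
 * the per-zone ->sectors field becomes redundant. */
static sector_t zone_sectors(const struct strip_zone *z, int i)
{
	return z[i].zone_end - (i > 0 ? z[i - 1].zone_end : 0);
}
```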
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones. 2009-05-15 17:30 ` Andre Noll @ 2009-05-15 21:19 ` Raz 2009-05-18 8:21 ` Andre Noll 0 siblings, 1 reply; 39+ messages in thread From: Raz @ 2009-05-15 21:19 UTC (permalink / raw) To: Andre Noll; +Cc: Neil Brown, linux-raid I have not yet patched my code with this new series, but where is patch 1/6? Does it exist, or is it simply wrong numbering? On Fri, May 15, 2009 at 8:30 PM, Andre Noll <maan@systemlinux.org> wrote: > On 18:51, raz ben yehuda wrote: > >> > and as we aren't using ->sectors in this loop any more, I think it >> > can be discarded completely, just making struct strip_zone 20% smaller >> > and improving cache performance. >> > >> > What do you think? >> hamm.. please do not do it now. I think i need it for the reshape >> because I have to traverse each zone up to its end. > > Well, you can always get the zone size of zone #i by > > z[i].zone_end - z[i - 1].zone_end > > (with z[-1].zone_end := 0). Since reshape is a slow operation anyway, > the additional subtraction won't hurt, and we'd have the smaller > cache footprint during normal operation. So I'd be inclined to kill > ->sectors as well. > > Andre > -- > The only person who always got his work done by Friday was Robinson Crusoe > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.1 (GNU/Linux) > > iD8DBQFKDabAWto1QDEAkw8RAoEmAKCqYyRinCYIuXWi4fhJSLKfj//2XgCfYTfp > z0lX/gg5fAfGb8IM0XKOAl4= > =ShCN > -----END PGP SIGNATURE----- > > -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones. 2009-05-15 21:19 ` Raz @ 2009-05-18 8:21 ` Andre Noll 0 siblings, 0 replies; 39+ messages in thread From: Andre Noll @ 2009-05-18 8:21 UTC (permalink / raw) To: Raz; +Cc: Neil Brown, linux-raid [-- Attachment #1: Type: text/plain, Size: 299 bytes --] On 00:19, Raz wrote: > I have not patched yet my code with this new serie, but where is patch > 1/6 ? Does it exist or simply wrong numbering ? It's all in Neil's for-next tree. Sorry for the confusion. Andre -- The only person who always got his work done by Friday was Robinson Crusoe [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones. 2009-05-14 10:43 ` [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones Andre Noll 2009-05-14 11:15 ` SandeepKsinha 2009-05-14 11:15 ` NeilBrown @ 2009-05-14 12:01 ` SandeepKsinha 2009-05-14 12:15 ` SandeepKsinha 2009-05-14 14:13 ` raz ben yehuda 3 siblings, 1 reply; 39+ messages in thread From: SandeepKsinha @ 2009-05-14 12:01 UTC (permalink / raw) To: Andre Noll; +Cc: neilb, raziebe, linux-raid On Thu, May 14, 2009 at 4:13 PM, Andre Noll <maan@systemlinux.org> wrote: > The number of strip_zones of a raid0 array is bounded by the number of > drives in the array and is in fact much smaller for typical setups. For > example, any raid0 array containing identical disks will have only > a single strip_zone. > > Therefore, the hash tables which are used for quickly finding the > strip_zone that holds a particular sector are of questionable value > and add quite a bit of unnecessary complexity. > > This patch replaces the hash table lookup by equivalent code which > simply loops over all strip zones to find the zone that holds the > given sector. > > Subsequent cleanup patches will remove the hash table structure. 
> > Signed-off-by: Andre Noll <maan@systemlinux.org> > --- > drivers/md/raid0.c | 32 +++++++++++++++++++------------- > 1 files changed, 19 insertions(+), 13 deletions(-) > > diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c > index c08d755..9fd3c3c 100644 > --- a/drivers/md/raid0.c > +++ b/drivers/md/raid0.c > @@ -398,6 +398,22 @@ static int raid0_stop (mddev_t *mddev) > return 0; > } > > +/* Find the zone which holds a particular offset */ > +static struct strip_zone *find_zone(struct raid0_private_data *conf, > + sector_t sector) > +{ > + int i; > + > + for (i = 0; i < conf->nr_strip_zones; i++) { > + struct strip_zone *z = conf->strip_zone + i; > + > + if (sector < z->zone_start + z->sectors) > + return z; > + } > + BUG(); > + return NULL; > +} > + The point here is that rather than figuring the issue after traversing through the list of all the zones, you can make a check and smartly get away with the situation without wasting extra effort. int i; BUG_ON(sector > conf->stripe_zone[0].sectors); for (i = 0; i < conf->nr_strip_zones; i++) { struct strip_zone *z = conf->strip_zone + i; if (sector < z->zone_start + z->sectors) return z; } return NULL; > static int raid0_make_request (struct request_queue *q, struct bio *bio) > { > mddev_t *mddev = q->queuedata; > @@ -443,20 +459,10 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio) > bio_pair_release(bp); > return 0; > } > - > - > - { > - sector_t x = sector >> conf->sector_shift; > - sector_div(x, (u32)conf->spacing); > - zone = conf->hash_table[x]; > - } > - > - while (sector >= zone->zone_start + zone->sectors) > - zone++; > - > + zone = find_zone(conf, sector); > + if (!zone) > + return 1; > sect_in_chunk = bio->bi_sector & (chunk_sects - 1); > - > - > { > sector_t x = (sector - zone->zone_start) >> chunksect_bits; > > -- > 1.5.4.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo 
info at http://vger.kernel.org/majordomo-info.html > -- Regards, Sandeep. “To learn is to change. Education is a process that changes the learner.” -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones. 2009-05-14 12:01 ` SandeepKsinha @ 2009-05-14 12:15 ` SandeepKsinha 0 siblings, 0 replies; 39+ messages in thread From: SandeepKsinha @ 2009-05-14 12:15 UTC (permalink / raw) To: Andre Noll; +Cc: neilb, raziebe, linux-raid On Thu, May 14, 2009 at 5:31 PM, SandeepKsinha <sandeepksinha@gmail.com> wrote: > On Thu, May 14, 2009 at 4:13 PM, Andre Noll <maan@systemlinux.org> wrote: >> The number of strip_zones of a raid0 array is bounded by the number of >> drives in the array and is in fact much smaller for typical setups. For >> example, any raid0 array containing identical disks will have only >> a single strip_zone. >> >> Therefore, the hash tables which are used for quickly finding the >> strip_zone that holds a particular sector are of questionable value >> and add quite a bit of unnecessary complexity. >> >> This patch replaces the hash table lookup by equivalent code which >> simply loops over all strip zones to find the zone that holds the >> given sector. >> >> Subsequent cleanup patches will remove the hash table structure. 
>> >> Signed-off-by: Andre Noll <maan@systemlinux.org> >> --- >> drivers/md/raid0.c | 32 +++++++++++++++++++------------- >> 1 files changed, 19 insertions(+), 13 deletions(-) >> >> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c >> index c08d755..9fd3c3c 100644 >> --- a/drivers/md/raid0.c >> +++ b/drivers/md/raid0.c >> @@ -398,6 +398,22 @@ static int raid0_stop (mddev_t *mddev) >> return 0; >> } >> >> +/* Find the zone which holds a particular offset */ >> +static struct strip_zone *find_zone(struct raid0_private_data *conf, >> + sector_t sector) >> +{ >> + int i; >> + >> + for (i = 0; i < conf->nr_strip_zones; i++) { >> + struct strip_zone *z = conf->strip_zone + i; >> + >> + if (sector < z->zone_start + z->sectors) >> + return z; >> + } >> + BUG(); >> + return NULL; >> +} >> + > > The point here is that rather than figuring the issue after traversing > through the list of all the zones, > you can make a check and smartly get away with the situation without > wasting extra effort. > > int i; > > BUG_ON(sector > conf->stripe_zone[0].sectors); > > for (i = 0; i < conf->nr_strip_zones; i++) { > struct strip_zone *z = conf->strip_zone + i; > > if (sector < z->zone_start + z->sectors) > return z; > } > > return NULL; > Sorry taken back. This won't exactly help. But thinking in similar lines can be helpful. 
> > >> static int raid0_make_request (struct request_queue *q, struct bio *bio) >> { >> mddev_t *mddev = q->queuedata; >> @@ -443,20 +459,10 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio) >> bio_pair_release(bp); >> return 0; >> } >> - >> - >> - { >> - sector_t x = sector >> conf->sector_shift; >> - sector_div(x, (u32)conf->spacing); >> - zone = conf->hash_table[x]; >> - } >> - >> - while (sector >= zone->zone_start + zone->sectors) >> - zone++; >> - >> + zone = find_zone(conf, sector); >> + if (!zone) >> + return 1; >> sect_in_chunk = bio->bi_sector & (chunk_sects - 1); >> - >> - >> { >> sector_t x = (sector - zone->zone_start) >> chunksect_bits; >> >> -- >> 1.5.4.3 >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > > -- > Regards, > Sandeep. > > > > > > > “To learn is to change. Education is a process that changes the learner.” > -- Regards, Sandeep. “To learn is to change. Education is a process that changes the learner.” -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones. 2009-05-14 10:43 ` [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones Andre Noll ` (2 preceding siblings ...) 2009-05-14 12:01 ` SandeepKsinha @ 2009-05-14 14:13 ` raz ben yehuda 3 siblings, 0 replies; 39+ messages in thread From: raz ben yehuda @ 2009-05-14 14:13 UTC (permalink / raw) To: Andre Noll; +Cc: neilb, linux-raid On Thu, 2009-05-14 at 12:43 +0200, Andre Noll wrote: > The number of strip_zones of a raid0 array is bounded by the number of > drives in the array and is in fact much smaller for typical setups. For > example, any raid0 array containing identical disks will have only > a single strip_zone. > > Therefore, the hash tables which are used for quickly finding the > strip_zone that holds a particular sector are of questionable value > and add quite a bit of unnecessary complexity. > > This patch replaces the hash table lookup by equivalent code which > simply loops over all strip zones to find the zone that holds the > given sector. > > Subsequent cleanup patches will remove the hash table structure. 
> > Signed-off-by: Andre Noll <maan@systemlinux.org> > --- > drivers/md/raid0.c | 32 +++++++++++++++++++------------- > 1 files changed, 19 insertions(+), 13 deletions(-) > > diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c > index c08d755..9fd3c3c 100644 > --- a/drivers/md/raid0.c > +++ b/drivers/md/raid0.c > @@ -398,6 +398,22 @@ static int raid0_stop (mddev_t *mddev) > return 0; > } > > +/* Find the zone which holds a particular offset */ > +static struct strip_zone *find_zone(struct raid0_private_data *conf, > + sector_t sector) > +{ > + int i; > + > + for (i = 0; i < conf->nr_strip_zones; i++) { > + struct strip_zone *z = conf->strip_zone + i; > + > + if (sector < z->zone_start + z->sectors) > + return z; > + } > + BUG(); > + return NULL; > +} > + > static int raid0_make_request (struct request_queue *q, struct bio *bio) > { > mddev_t *mddev = q->queuedata; > @@ -443,20 +459,10 @@ static int raid0_make_request (struct request_queue *q, struct bio *bio) > bio_pair_release(bp); > return 0; > } > - > - > - { > - sector_t x = sector >> conf->sector_shift; > - sector_div(x, (u32)conf->spacing); > - zone = conf->hash_table[x]; > - } > - > - while (sector >= zone->zone_start + zone->sectors) > - zone++; > - > + zone = find_zone(conf, sector); > + if (!zone) > + return 1; ? if you cannot serve the io you shouldn't you return an error ? > sect_in_chunk = bio->bi_sector & (chunk_sects - 1); > - > - > { > sector_t x = (sector - zone->zone_start) >> chunksect_bits; > ^ permalink raw reply [flat|nested] 39+ messages in thread
* [PATCH] md: raid0: Remove hash table. 2009-05-14 10:43 [PATCH 0/6] md: Remove the hash tables from raid0 Andre Noll 2009-05-14 10:43 ` [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones Andre Noll @ 2009-05-14 10:43 ` Andre Noll 2009-05-14 10:43 ` [PATCH] md: raid0: Remove hash spacing and sector shift Andre Noll ` (3 subsequent siblings) 5 siblings, 0 replies; 39+ messages in thread From: Andre Noll @ 2009-05-14 10:43 UTC (permalink / raw) To: neilb; +Cc: raziebe, linux-raid, Andre Noll The raid0 hash table has become unused due to the changes in the previous patch. This patch removes the hash table allocation and setup code and kills the hash_table field of struct raid0_private_data. Signed-off-by: Andre Noll <maan@systemlinux.org> --- drivers/md/raid0.c | 12 ------------ drivers/md/raid0.h | 1 - 2 files changed, 0 insertions(+), 13 deletions(-) diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c index 9fd3c3c..6bae90c 100644 --- a/drivers/md/raid0.c +++ b/drivers/md/raid0.c @@ -330,22 +330,14 @@ static int raid0_run (mddev_t *mddev) nb_zone = s + round; } printk(KERN_INFO "raid0 : nb_zone is %d.\n", nb_zone); - - printk(KERN_INFO "raid0 : Allocating %zu bytes for hash.\n", - nb_zone*sizeof(struct strip_zone*)); - conf->hash_table = kmalloc (sizeof (struct strip_zone *)*nb_zone, GFP_KERNEL); - if (!conf->hash_table) - goto out_free_conf; sectors = conf->strip_zone[cur].sectors; - conf->hash_table[0] = conf->strip_zone + cur; for (i=1; i< nb_zone; i++) { while (sectors <= conf->spacing) { cur++; sectors += conf->strip_zone[cur].sectors; } sectors -= conf->spacing; - conf->hash_table[i] = conf->strip_zone + cur; } if (conf->sector_shift) { conf->spacing >>= conf->sector_shift; @@ -388,8 +380,6 @@ static int raid0_stop (mddev_t *mddev) raid0_conf_t *conf = mddev_to_conf(mddev); blk_sync_queue(mddev->queue); /* the unplug fn references 'conf'*/ - kfree(conf->hash_table); - conf->hash_table = NULL; kfree(conf->strip_zone); conf->strip_zone 
= NULL; kfree(conf); @@ -502,8 +492,6 @@ static void raid0_status (struct seq_file *seq, mddev_t *mddev) h = 0; for (j = 0; j < conf->nr_strip_zones; j++) { seq_printf(seq, " z%d", j); - if (conf->hash_table[h] == conf->strip_zone+j) - seq_printf(seq, "(h%d)", h++); seq_printf(seq, "=["); for (k = 0; k < conf->strip_zone[j].nb_dev; k++) seq_printf(seq, "%s/", bdevname( diff --git a/drivers/md/raid0.h b/drivers/md/raid0.h index 824b12e..aac3613 100644 --- a/drivers/md/raid0.h +++ b/drivers/md/raid0.h @@ -12,7 +12,6 @@ struct strip_zone struct raid0_private_data { - struct strip_zone **hash_table; /* Table of indexes into strip_zone */ struct strip_zone *strip_zone; mdk_rdev_t **devlist; /* lists of rdevs, pointed to by strip_zone->dev */ int nr_strip_zones; -- 1.5.4.3 ^ permalink raw reply related [flat|nested] 39+ messages in thread
* [PATCH] md: raid0: Remove hash spacing and sector shift. 2009-05-14 10:43 [PATCH 0/6] md: Remove the hash tables from raid0 Andre Noll 2009-05-14 10:43 ` [PATCH] md: raid0: Replace hash table lookup by looping over all strip_zones Andre Noll 2009-05-14 10:43 ` [PATCH] md: raid0: Remove hash table Andre Noll @ 2009-05-14 10:43 ` Andre Noll 2009-05-14 10:43 ` [PATCH] md: raid0: Make raid0_run() return a proper error code Andre Noll ` (2 subsequent siblings) 5 siblings, 0 replies; 39+ messages in thread From: Andre Noll @ 2009-05-14 10:43 UTC (permalink / raw) To: neilb; +Cc: raziebe, linux-raid, Andre Noll The "sector_shift" and "spacing" fields of struct raid0_private_data were only used for the hash table lookups. So the removal of the hash table allows get rid of these fields as well which simplifies create_strip_zones() and raid0_run() quite a bit. Signed-off-by: Andre Noll <maan@systemlinux.org> --- drivers/md/raid0.c | 63 +--------------------------------------------------- drivers/md/raid0.h | 3 -- 2 files changed, 1 insertions(+), 65 deletions(-) diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c index 6bae90c..2401e94 100644 --- a/drivers/md/raid0.c +++ b/drivers/md/raid0.c @@ -57,7 +57,6 @@ static int create_strip_zones (mddev_t *mddev) { int i, c, j; sector_t current_start, curr_zone_start; - sector_t min_spacing; raid0_conf_t *conf = mddev_to_conf(mddev); mdk_rdev_t *smallest, *rdev1, *rdev2, *rdev; struct strip_zone *zone; @@ -206,28 +205,7 @@ static int create_strip_zones (mddev_t *mddev) (unsigned long long)current_start); } - /* Now find appropriate hash spacing. - * We want a number which causes most hash entries to cover - * at most two strips, but the hash table must be at most - * 1 PAGE. We choose the smallest strip, or contiguous collection - * of strips, that has big enough size. We never consider the last - * strip though as it's size has no bearing on the efficacy of the hash - * table. 
- */ - conf->spacing = curr_zone_start; - min_spacing = curr_zone_start; - sector_div(min_spacing, PAGE_SIZE/sizeof(struct strip_zone*)); - for (i=0; i < conf->nr_strip_zones-1; i++) { - sector_t s = 0; - for (j = i; j < conf->nr_strip_zones - 1 && - s < min_spacing; j++) - s += conf->strip_zone[j].sectors; - if (s >= min_spacing && s < conf->spacing) - conf->spacing = s; - } - mddev->queue->unplug_fn = raid0_unplug; - mddev->queue->backing_dev_info.congested_fn = raid0_congested; mddev->queue->backing_dev_info.congested_data = mddev; @@ -277,10 +255,8 @@ static sector_t raid0_size(mddev_t *mddev, sector_t sectors, int raid_disks) return array_sectors; } -static int raid0_run (mddev_t *mddev) +static int raid0_run(mddev_t *mddev) { - unsigned cur=0, i=0, nb_zone; - s64 sectors; raid0_conf_t *conf; if (mddev->chunk_size == 0) { @@ -310,43 +286,6 @@ static int raid0_run (mddev_t *mddev) printk(KERN_INFO "raid0 : md_size is %llu sectors.\n", (unsigned long long)mddev->array_sectors); - printk(KERN_INFO "raid0 : conf->spacing is %llu sectors.\n", - (unsigned long long)conf->spacing); - { - sector_t s = raid0_size(mddev, 0, 0); - sector_t space = conf->spacing; - int round; - conf->sector_shift = 0; - if (sizeof(sector_t) > sizeof(u32)) { - /*shift down space and s so that sector_div will work */ - while (space > (sector_t) (~(u32)0)) { - s >>= 1; - space >>= 1; - s += 1; /* force round-up */ - conf->sector_shift++; - } - } - round = sector_div(s, (u32)space) ? 
1 : 0; - nb_zone = s + round; - } - printk(KERN_INFO "raid0 : nb_zone is %d.\n", nb_zone); - sectors = conf->strip_zone[cur].sectors; - - for (i=1; i< nb_zone; i++) { - while (sectors <= conf->spacing) { - cur++; - sectors += conf->strip_zone[cur].sectors; - } - sectors -= conf->spacing; - } - if (conf->sector_shift) { - conf->spacing >>= conf->sector_shift; - /* round spacing up so when we divide by it, we - * err on the side of too-low, which is safest - */ - conf->spacing++; - } - /* calculate the max read-ahead size. * For read-ahead of large files to be effective, we need to * readahead at least twice a whole stripe. i.e. number of devices diff --git a/drivers/md/raid0.h b/drivers/md/raid0.h index aac3613..218d7f3 100644 --- a/drivers/md/raid0.h +++ b/drivers/md/raid0.h @@ -15,9 +15,6 @@ struct raid0_private_data struct strip_zone *strip_zone; mdk_rdev_t **devlist; /* lists of rdevs, pointed to by strip_zone->dev */ int nr_strip_zones; - - sector_t spacing; - int sector_shift; /* shift this before divide by spacing */ }; typedef struct raid0_private_data raid0_conf_t; -- 1.5.4.3 ^ permalink raw reply related [flat|nested] 39+ messages in thread
* [PATCH] md: raid0: Make raid0_run() return a proper error code. 2009-05-14 10:43 [PATCH 0/6] md: Remove the hash tables from raid0 Andre Noll ` (2 preceding siblings ...) 2009-05-14 10:43 ` [PATCH] md: raid0: Remove hash spacing and sector shift Andre Noll @ 2009-05-14 10:43 ` Andre Noll 2009-05-14 11:21 ` NeilBrown 2009-05-14 10:43 ` [PATCH] md: raid0: Kfree() strip_zone and devlist in create_strip_zones() Andre Noll 2009-05-14 10:43 ` [PATCH] md: raid0: Simplify raid0_run() Andre Noll 5 siblings, 1 reply; 39+ messages in thread From: Andre Noll @ 2009-05-14 10:43 UTC (permalink / raw) To: neilb; +Cc: raziebe, linux-raid, Andre Noll Currently raid0_run() always returns -ENOMEM on errors. This is incorrect as running the array might fail for other reasons, for example because not all component devices were available. This patch changes create_strip_zones() so that it returns a proper error code (either -ENOMEM or -EINVAL) rather than 1 on errors and makes raid0_run(), its single caller, return that value instead of -ENOMEM. 
Signed-off-by: Andre Noll <maan@systemlinux.org> --- drivers/md/raid0.c | 17 +++++++++-------- 1 files changed, 9 insertions(+), 8 deletions(-) diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c index 2401e94..0f4330f 100644 --- a/drivers/md/raid0.c +++ b/drivers/md/raid0.c @@ -106,12 +106,12 @@ static int create_strip_zones (mddev_t *mddev) conf->strip_zone = kzalloc(sizeof(struct strip_zone)* conf->nr_strip_zones, GFP_KERNEL); if (!conf->strip_zone) - return 1; + return -ENOMEM; conf->devlist = kzalloc(sizeof(mdk_rdev_t*)* conf->nr_strip_zones*mddev->raid_disks, GFP_KERNEL); if (!conf->devlist) - return 1; + return -ENOMEM; /* The first zone must contain all devices, so here we check that * there is a proper alignment of slots to devices and find them all @@ -211,8 +211,8 @@ static int create_strip_zones (mddev_t *mddev) printk(KERN_INFO "raid0: done.\n"); return 0; - abort: - return 1; +abort: + return -EINVAL; } /** @@ -258,6 +258,7 @@ static sector_t raid0_size(mddev_t *mddev, sector_t sectors, int raid_disks) static int raid0_run(mddev_t *mddev) { raid0_conf_t *conf; + int ret; if (mddev->chunk_size == 0) { printk(KERN_ERR "md/raid0: non-zero chunk size required.\n"); @@ -273,12 +274,13 @@ static int raid0_run(mddev_t *mddev) conf = kmalloc(sizeof (raid0_conf_t), GFP_KERNEL); if (!conf) - goto out; + return -ENOMEM; mddev->private = (void *)conf; conf->strip_zone = NULL; conf->devlist = NULL; - if (create_strip_zones (mddev)) + ret = create_strip_zones(mddev); + if (ret < 0) goto out_free_conf; /* calculate array device size */ @@ -310,8 +312,7 @@ out_free_conf: kfree(conf->devlist); kfree(conf); mddev->private = NULL; -out: - return -ENOMEM; + return ret; } static int raid0_stop (mddev_t *mddev) -- 1.5.4.3 ^ permalink raw reply related [flat|nested] 39+ messages in thread
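[The error-handling change in this patch follows the usual kernel pattern: the helper reports a specific negative errno and the caller forwards it unchanged instead of substituting its own code. A minimal userspace sketch of that pattern, with illustrative function names and failure flags in place of the real md code:]

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for create_strip_zones(): returns 0 on success, or a
 * negative errno describing the actual failure. */
static int create_zones(int alloc_fails, int layout_bad)
{
	if (alloc_fails)
		return -ENOMEM;
	if (layout_bad)
		return -EINVAL;
	return 0;
}

/* Stand-in for raid0_run(): propagate the helper's error code rather
 * than collapsing every failure into -ENOMEM as the old code did. */
static int run_array(int alloc_fails, int layout_bad)
{
	int ret = create_zones(alloc_fails, layout_bad);

	if (ret < 0)
		return ret;
	return 0;
}
```

[A caller can now distinguish a memory shortage from an invalid layout by the returned value.]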
* Re: [PATCH] md: raid0: Make raid0_run() return a proper error code.
2009-05-14 10:43 ` [PATCH] md: raid0: Make raid0_run() return a proper error code Andre Noll
@ 2009-05-14 11:21 ` NeilBrown
2009-05-14 11:42   ` Andre Noll
0 siblings, 1 reply; 39+ messages in thread
From: NeilBrown @ 2009-05-14 11:21 UTC (permalink / raw)
Cc: raziebe, linux-raid, Andre Noll

On Thu, May 14, 2009 8:43 pm, Andre Noll wrote:
> Currently raid0_run() always returns -ENOMEM on errors. This is
> incorrect as running the array might fail for other reasons, for
> example because not all component devices were available.
>
> This patch changes create_strip_zones() so that it returns a proper
> error code (either -ENOMEM or -EINVAL) rather than 1 on errors and
> makes raid0_run(), its single caller, return that value instead
> of -ENOMEM.
>
> [...]
>
> @@ -273,12 +274,13 @@ static int raid0_run(mddev_t *mddev)
>
>  	conf = kmalloc(sizeof (raid0_conf_t), GFP_KERNEL);
>  	if (!conf)
> -		goto out;
> +		return -ENOMEM;
>  	mddev->private = (void *)conf;
>
>  	conf->strip_zone = NULL;
>  	conf->devlist = NULL;
> -	if (create_strip_zones (mddev))
> +	ret = create_strip_zones(mddev);
> +	if (ret < 0)
>  		goto out_free_conf;

Can we go one step further here and move the allocation of ->private
into create_strip_zones? It seems funny allocating it here and never
actually using it directly.

??

Everything else looks good. Thanks for doing this.

NeilBrown
* Re: [PATCH] md: raid0: Make raid0_run() return a proper error code.
2009-05-14 11:21 ` NeilBrown
@ 2009-05-14 11:42 ` Andre Noll
0 siblings, 0 replies; 39+ messages in thread
From: Andre Noll @ 2009-05-14 11:42 UTC (permalink / raw)
To: NeilBrown; +Cc: raziebe, linux-raid

[-- Attachment #1: Type: text/plain, Size: 662 bytes --]

On 21:21, NeilBrown wrote:
> >  	conf->strip_zone = NULL;
> >  	conf->devlist = NULL;
> > -	if (create_strip_zones (mddev))
> > +	ret = create_strip_zones(mddev);
> > +	if (ret < 0)
> >  		goto out_free_conf;
>
> Can we go one step further here and move the allocation of ->private
> into create_strip_zones? It seems funny allocating it here and never
> actually using it directly.

Yes, I thought about that but wanted the patches to be as small as
possible. I'll follow up with another patch that moves the allocation
to create_strip_zones().

Thanks
Andre
--
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
* [PATCH] md: raid0: Kfree() strip_zone and devlist in create_strip_zones().
2009-05-14 10:43 [PATCH 0/6] md: Remove the hash tables from raid0 Andre Noll
` (3 preceding siblings ...)
2009-05-14 10:43 ` [PATCH] md: raid0: Make raid0_run() return a proper error code Andre Noll
@ 2009-05-14 10:43 ` Andre Noll
2009-05-14 10:43 ` [PATCH] md: raid0: Simplify raid0_run() Andre Noll
5 siblings, 0 replies; 39+ messages in thread
From: Andre Noll @ 2009-05-14 10:43 UTC (permalink / raw)
To: neilb; +Cc: raziebe, linux-raid, Andre Noll

It's cleaner to kfree() these in the same function where they are
allocated. This allows us to simplify raid0_run() a bit, which will be
done in the next patch.

Signed-off-by: Andre Noll <maan@systemlinux.org>
---
 drivers/md/raid0.c | 12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 0f4330f..5ff6290 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -110,8 +110,10 @@ static int create_strip_zones (mddev_t *mddev)
 	conf->devlist = kzalloc(sizeof(mdk_rdev_t*)*
 				conf->nr_strip_zones*mddev->raid_disks,
 				GFP_KERNEL);
-	if (!conf->devlist)
+	if (!conf->devlist) {
+		kfree(conf->strip_zone);
 		return -ENOMEM;
+	}

 	/* The first zone must contain all devices, so here we check that
 	 * there is a proper alignment of slots to devices and find them all
@@ -212,6 +214,8 @@ static int create_strip_zones (mddev_t *mddev)
 	printk(KERN_INFO "raid0: done.\n");
 	return 0;
 abort:
+	kfree(conf->strip_zone);
+	kfree(conf->devlist);
 	return -EINVAL;
 }

@@ -276,9 +280,7 @@ static int raid0_run(mddev_t *mddev)
 	if (!conf)
 		return -ENOMEM;
 	mddev->private = (void *)conf;
-
-	conf->strip_zone = NULL;
-	conf->devlist = NULL;
+
 	ret = create_strip_zones(mddev);
 	if (ret < 0)
 		goto out_free_conf;
@@ -308,8 +310,6 @@ static int raid0_run(mddev_t *mddev)
 	return 0;

 out_free_conf:
-	kfree(conf->strip_zone);
-	kfree(conf->devlist);
 	kfree(conf);
 	mddev->private = NULL;
 	return ret;
--
1.5.4.3
* [PATCH] md: raid0: Simplify raid0_run().
2009-05-14 10:43 [PATCH 0/6] md: Remove the hash tables from raid0 Andre Noll
` (4 preceding siblings ...)
2009-05-14 10:43 ` [PATCH] md: raid0: Kfree() strip_zone and devlist in create_strip_zones() Andre Noll
@ 2009-05-14 10:43 ` Andre Noll
2009-05-14 11:43   ` SandeepKsinha
2009-05-14 14:03   ` raz ben yehuda
5 siblings, 2 replies; 39+ messages in thread
From: Andre Noll @ 2009-05-14 10:43 UTC (permalink / raw)
To: neilb; +Cc: raziebe, linux-raid, Andre Noll

Get rid of the local variable "conf" and of an unnecessary cast.

Signed-off-by: Andre Noll <maan@systemlinux.org>
---
 drivers/md/raid0.c | 20 +++++++------------
 1 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 5ff6290..36b747a 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -261,7 +261,6 @@ static sector_t raid0_size(mddev_t *mddev, sector_t sectors, int raid_disks)

 static int raid0_run(mddev_t *mddev)
 {
-	raid0_conf_t *conf;
 	int ret;

 	if (mddev->chunk_size == 0) {
@@ -276,14 +275,15 @@ static int raid0_run(mddev_t *mddev)
 	blk_queue_segment_boundary(mddev->queue, (mddev->chunk_size>>1) - 1);
 	mddev->queue->queue_lock = &mddev->queue->__queue_lock;

-	conf = kmalloc(sizeof (raid0_conf_t), GFP_KERNEL);
-	if (!conf)
+	mddev->private = kmalloc(sizeof(raid0_conf_t), GFP_KERNEL);
+	if (!mddev->private)
 		return -ENOMEM;
-	mddev->private = (void *)conf;
-
 	ret = create_strip_zones(mddev);
-	if (ret < 0)
-		goto out_free_conf;
+	if (ret < 0) {
+		kfree(mddev->private);
+		mddev->private = NULL;
+		return ret;
+	}

 	/* calculate array device size */
 	md_set_array_sectors(mddev, raid0_size(mddev, 0, 0));
@@ -305,14 +305,8 @@ static int raid0_run(mddev_t *mddev)
 		mddev->queue->backing_dev_info.ra_pages = 2* stripe;
 	}

-
 	blk_queue_merge_bvec(mddev->queue, raid0_mergeable_bvec);
 	return 0;
-
-out_free_conf:
-	kfree(conf);
-	mddev->private = NULL;
-	return ret;
 }

 static int raid0_stop (mddev_t *mddev)
--
1.5.4.3
* Re: [PATCH] md: raid0: Simplify raid0_run().
2009-05-14 10:43 ` [PATCH] md: raid0: Simplify raid0_run() Andre Noll
@ 2009-05-14 11:43 ` SandeepKsinha
2009-05-14 12:06   ` NeilBrown
2009-05-14 14:03 ` raz ben yehuda
1 sibling, 1 reply; 39+ messages in thread
From: SandeepKsinha @ 2009-05-14 11:43 UTC (permalink / raw)
To: Andre Noll; +Cc: neilb, raziebe, linux-raid

Hi Andre,

On Thu, May 14, 2009 at 4:13 PM, Andre Noll <maan@systemlinux.org> wrote:
> Get rid of the local variable "conf" and of an unnecessary cast.
>
> [...]
>
> -	conf = kmalloc(sizeof (raid0_conf_t), GFP_KERNEL);
> -	if (!conf)
> +	mddev->private = kmalloc(sizeof(raid0_conf_t), GFP_KERNEL);
> +	if (!mddev->private)
>  		return -ENOMEM;
> -	mddev->private = (void *)conf;
> -
>  	ret = create_strip_zones(mddev);
> -	if (ret < 0)
> -		goto out_free_conf;
> +	if (ret < 0) {
> +		kfree(mddev->private);
> +		mddev->private = NULL;
> +		return ret;
> +	}

I believe the use of goto statements keeps the code more structured,
especially for code cleanup in case of failures.

It's better to revert back to the gotos so that if a new piece of code
is added later and there are failure cases to handle, you will have a
common cleanup routine.

I think Neil can comment better on this one.

--
Regards,
Sandeep.

"To learn is to change. Education is a process that changes the learner."

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH] md: raid0: Simplify raid0_run().
2009-05-14 11:43 ` SandeepKsinha
@ 2009-05-14 12:06 ` NeilBrown
0 siblings, 0 replies; 39+ messages in thread
From: NeilBrown @ 2009-05-14 12:06 UTC (permalink / raw)
To: SandeepKsinha; +Cc: Andre Noll, raziebe, linux-raid

On Thu, May 14, 2009 9:43 pm, SandeepKsinha wrote:
> [...]
>
> I believe the use of goto statements keeps the code more structured,
> especially for code cleanup in case of failures.
>
> It's better to revert back to the gotos so that if a new piece of code
> is added later and there are failure cases to handle, you will have a
> common cleanup routine.
>
> I think Neil can comment better on this one.

I'm in two minds about this.

In general I prefer

	if (something went wrong)
		return error;

to

	if (something went wrong)
		goto handle_error;
	...
handle_error:
	return error;

as it is easier to read.

However the point about maintainability is valid. Having to put all
those gotos back in, if you later find that you need to free or unlock
something, is error prone (you might miss a return). So having exactly
one return per function is a valuable goal.

So in the general case there are good arguments in both directions,
which means you have to make a decision based on the specifics of a
given case. In this case, I think raid0_run is sufficiently small (now
that all the hash related code is gone) that the benefits of having
gotos are minimal. So I would be inclined to leave it as it is.

Thanks for the suggestion.

NeilBrown
* Re: [PATCH] md: raid0: Simplify raid0_run().
2009-05-14 10:43 ` [PATCH] md: raid0: Simplify raid0_run() Andre Noll
2009-05-14 11:43 ` SandeepKsinha
@ 2009-05-14 14:03 ` raz ben yehuda
1 sibling, 0 replies; 39+ messages in thread
From: raz ben yehuda @ 2009-05-14 14:03 UTC (permalink / raw)
To: Andre Noll; +Cc: neilb, linux-raid

Andre, I applied your patches and tested them a bit. They seem to be
working, so thank you.

Neil, I will continue on top of Andre's patches, if this is OK with
you. I promise to split my patches up better now.

On Thu, 2009-05-14 at 12:43 +0200, Andre Noll wrote:
> Get rid of the local variable "conf" and of an unnecessary cast.
>
> [...]