From: raz ben yehuda <raziebe@gmail.com>
To: Neil Brown <neilb@suse.de>
Cc: linux raid <linux-raid@vger.kernel.org>
Subject: Re: Subject: [PATCH 006/009]: raid1: chunk size check in run
Date: Thu, 21 May 2009 16:32:12 +0300
Message-ID: <1242912733.3598.42.camel@raz>
In-Reply-To: <18964.50795.747389.606277@notabene.brown>


On Thu, 2009-05-21 at 13:11 +1000, Neil Brown wrote:
> On Wednesday May 20, raziebe@gmail.com wrote:
> > Neil
> > First I thank you for your effort. Now I can work in full steam on the
> > reshape on top of the new raid0 code. Currently this is what I have in
> > mind. If you have any design suggestions I would be happy to hear them before
> > the coding.
> > 
> >    I added : raid0_add_hot that:
> > 	1. checks if the new disk size is smaller than the raid chunk size. If
> > so, reject.
> > 	2. checks if the new disk's max_hw_sectors is smaller than the
> > raid's. If so, generate a warning but do not reject.
> >  	3. adds a disk to the raid0 disk list and turns off its in_sync bit.
> 
> I don't think the 'in_sync' bit is used in raid0 currently, so that
> bit seems irrelevant, but shouldn't hurt.
> > 
> > I will add raid0_check_reshape 
> >       This procedure prepares the raid for the reshape process.
> > 	1. Creates a temporary mddev with the same disks as the raid's plus
> > the new disks. This raid acts as a mere mapping, so I will be able to
> > map sectors to the new target raid in the reshape process. This means I
> > have to do work in create_strip_zones and raid0_run (separate patch).
> >         2. Sets the target raid transfer size.
> > 	3. Creates an allocation scheme for reshape bio allocation. I reshape in
> > chunk-size units.
> > 	4. create raid0_reshape thread for writes.
> > 	5. wake up raid0_sync thread. 
> 
> Do you really need a temporary mddev, or just a temporary 'conf'??
I need mddev because I want to use map_sector and create_strip. I
can certainly change map_sector and create_strip to work with conf instead
of mddev, though that will make create_strip quite cumbersome, so
I will split create_strip into several independent functions.
Do you agree?
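To illustrate what mapping against a conf alone could look like, here is a minimal userspace sketch. It assumes a single zone of equally sized devices, which is a simplification of the real raid0 layout; simple_conf, dev_loc and simple_map_sector are hypothetical names, not the existing raid0 code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical, stripped-down stand-in for the raid0 private conf:
 * one zone, all member devices the same size. */
struct simple_conf {
	unsigned int nb_dev;  /* number of member devices */
	sector_t chunk_sects; /* chunk size in sectors */
};

struct dev_loc {
	unsigned int dev;     /* index of the member device */
	sector_t sector;      /* sector offset on that device */
};

/* Map an array sector to (device, device sector) using only the conf. */
static struct dev_loc simple_map_sector(const struct simple_conf *conf,
					sector_t sector)
{
	struct dev_loc loc;
	sector_t chunk, offset;

	assert(conf->nb_dev > 0 && conf->chunk_sects > 0);

	chunk = sector / conf->chunk_sects;   /* which chunk of the array */
	offset = sector % conf->chunk_sects;  /* offset inside that chunk */

	loc.dev = (unsigned int)(chunk % conf->nb_dev);
	loc.sector = (chunk / conf->nb_dev) * conf->chunk_sects + offset;
	return loc;
}
```

With two such confs (old and new device counts), the same array sector can be resolved under either mapping without a second mddev.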
> Having to create the raid0_reshape thread just for writes is a bit
> unfortunate, but it probably is the easiest approach.  You might be
> able to get the raid0_sync thread to do them, but that would be messy
> I expect.
I will start with the easy approach, meaning a separate thread for the writes.
Once I am done, I will see how I can merge the reads and writes to work
in md_sync.
> > 
> > I will add raid0_sync: raid0_sync acts as the reshape read side process.
> > 
> >     1. Allocates a read bio.
> >     2. Map_bio target with find_zone and map_sector; both map_sector and
> > find_zone are using the old raid mappings.
> >     3. Deactivate the raid.
> >     4. Lock and wait for the raid to be emptied from any previous IOs.
> >     5. Generate a read request.
> >     6. Release the lock.
> 
> I think that sounds correct.
> 
> > 
> > I will add reshape_read_endio: 
> > 	if IO is successful then:
> > 		add the bio to reshape_list
> > 	else
> > 		add the bio to a retry list ( how many retries .. ?)
> 
> zero retries.  The underlying block device has done all the retries
> that are appropriate.  If you get a read error, then that block is
> gone.  Probably the best you can do is write garbage to the
> destination and report the error.
> 
> > 
> > I will add raid0_reshape: 
> > 	raid0_reshape is a md_thread that polls on the reshape_list and
> > commences writes based on the reads.
> > 	1. Grab a bio from reshape list.
> > 	2. map sector and find zone on the new raid mappings. 
> > 	3. set bio direction to write.
> > 	4. generate a write.
> > 	
> > 	if bio is in retry_list retry the bio.
> > 	if bio is in active_io list do the bio.
> > 	
> > I will add a reshape_write_endio that just frees the bio and its pages.
> 
> OK (except for the retry).
> 
> > 
> > raid0_make_request
> > 	I will add a check and see if the raid is in reshape. 
> > 	if so then
> > 		if IO is in the new mappings area we generate the IO
> > 				from the new mappings.
> > 		if IO is in the old mappings then we generate the IO
> > 				from the old mappings ( race here .. no ?)
> > 		if IO is in the current reshape active area, we push the io to an
> > active_io list that will be processed by raid0_reshape.
> 
> This doesn't seem to match what you say above.
> If you don't submit a read for 'reshape' until all IO has drained,  
> then presumably you would just block any incoming IO until the current
> reshape requests have all finished.  i.e. you only ever have IO or
> reshape, but not both.
Where did I say that? I guess I wasn't clear.
> Alternately  you could have a sliding window covering the area
> that is currently being reshaped.
> If an IO comes in for that area, you need to either
>   - close the window and perform the IO, or
>   - wait for the window to slide past.
> I would favour the latter.  But queueing the IO for raid0_reshape doesn't
> really gain you anything I think.
I wasn't clear enough. The "current reshape active area" is my sliding window.
I wait for the window to slide past; this is exactly what I had in mind. So yes,
this is what I am writing: a reshape_window.
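A minimal userspace sketch of how incoming IO could be routed around such a window. It assumes the reshape walks the array from low to high sectors, so everything below the window is already in the new layout and everything above it is still in the old one; reshape_window, io_route and route_io are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

enum io_route {
	IO_NEW_MAP, /* already reshaped: use the new mappings */
	IO_OLD_MAP, /* not reached yet: use the old mappings */
	IO_WAIT     /* touches the window: wait for it to slide past */
};

/* Hypothetical window covering sectors [start, end) of the array. */
struct reshape_window {
	sector_t start;
	sector_t end;
};

/* Decide how to handle an IO covering [sector, sector + nr_sectors). */
static enum io_route route_io(const struct reshape_window *w,
			      sector_t sector, sector_t nr_sectors)
{
	sector_t io_end = sector + nr_sectors; /* one past the last sector */

	assert(w->start <= w->end);

	if (io_end <= w->start)
		return IO_NEW_MAP;
	if (sector >= w->end)
		return IO_OLD_MAP;
	/* Overlaps the window, even partially. */
	return IO_WAIT;
}
```

An IO that straddles a window boundary is treated as inside the window here, which is the conservative choice.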
> Issues that you haven't mentioned:
>   - metadata update: you need to record progress in the metadata
>     as the window slides along, in case of an unclean restart
I thought md did that for me; apparently it doesn't. Should I call
md_allow_write (which calls md_update_sbs)? How frequently?
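One possible shape for the progress bookkeeping, as a userspace sketch only. The checkpoint interval is a placeholder (the right frequency is exactly the open question above), the actual metadata write through md is not shown, and reshape_progress and reshape_advance are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical progress tracker for the reshape. */
struct reshape_progress {
	sector_t position; /* how far the window has moved */
	sector_t recorded; /* last position written to the metadata */
};

/*
 * Advance the reshape by 'sectors'. Returns 1 when at least 'interval'
 * sectors have been reshaped since the last recorded position, meaning
 * the caller should now update the metadata; returns 0 otherwise.
 */
static int reshape_advance(struct reshape_progress *p, sector_t sectors,
			   sector_t interval)
{
	assert(interval > 0);

	p->position += sectors;
	if (p->position - p->recorded < interval)
		return 0;

	/* The caller is expected to persist 'position' at this point. */
	p->recorded = p->position;
	return 1;
}
```

A smaller interval means less work to redo after an unclean restart but more metadata writes.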
>   - Unless you only schedule one chunk at a time (which would
>     slow things down I expect), you need to ensure that you don't schedule
>     a write to a block for which the read hasn't completed yet.
Ah, yes. I thought of it but forgot. My question is how: should I
simply use an interruptible sleep? What do you do in raid5?
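One way to express the ordering rule, as a userspace sketch: each chunk in the window carries a state, and a write may only start once that chunk's read has completed. This is not a claim about what raid5 does. In the kernel the writer would presumably sleep on a wait queue until the state changes instead of polling. window_slots and the helpers are hypothetical names.

```c
#include <assert.h>

#define WINDOW_CHUNKS 8 /* arbitrary window size for the sketch */

enum chunk_state {
	CHUNK_IDLE,      /* nothing in flight */
	CHUNK_READING,   /* read submitted, not completed */
	CHUNK_READ_DONE, /* data available, write may start */
	CHUNK_WRITING    /* write submitted */
};

struct window_slots {
	enum chunk_state state[WINDOW_CHUNKS];
};

static void slot_start_read(struct window_slots *w, int i)
{
	assert(i >= 0 && i < WINDOW_CHUNKS);
	assert(w->state[i] == CHUNK_IDLE);
	w->state[i] = CHUNK_READING;
}

/* Called from the read completion path. */
static void slot_read_done(struct window_slots *w, int i)
{
	assert(i >= 0 && i < WINDOW_CHUNKS);
	assert(w->state[i] == CHUNK_READING);
	w->state[i] = CHUNK_READ_DONE;
}

/* Returns 1 and claims the chunk if its read has completed, else 0. */
static int slot_try_start_write(struct window_slots *w, int i)
{
	assert(i >= 0 && i < WINDOW_CHUNKS);
	if (w->state[i] != CHUNK_READ_DONE)
		return 0;
	w->state[i] = CHUNK_WRITING;
	return 1;
}
```

Because the write path only ever claims chunks in CHUNK_READ_DONE, several reads can be in flight at once without a write overtaking its read.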
>     This is particularly an issue if you support changing the
>     chunk size.
>   - I assume you are (currently) only supporting a reshape that
>     increases the size of the array and the number of devices?
Neither changing the chunk size nor shrinking is in my spec, so only growing is supported.
Maybe when I finish my studies (you can google for "offsched + raz" and follow the link ... )
I will have some raid quality time.
> NeilBrown


Thread overview: 6+ messages
2009-05-19 16:04 Subject: [PATCH 006/009]: raid1: chunk size check in run raz ben yehuda
2009-05-20  1:45 ` Neil Brown
2009-05-20 13:50   ` raz ben yehuda
2009-05-21  3:11     ` Neil Brown
2009-05-21 13:32       ` raz ben yehuda [this message]
2009-05-21 11:33         ` Neil Brown