From: "Keld Jørn Simonsen" <keld@keldix.com>
To: Roberto Spadim <roberto@spadim.com.br>
Cc: "Keld Jørn Simonsen" <keld@keldix.com>,
"Jon Nelson" <jnelson-linux-raid@jamponi.net>,
linux-raid@vger.kernel.org
Subject: Re: What's the typical RAID10 setup?
Date: Wed, 2 Feb 2011 23:13:59 +0100
Message-ID: <20110202221358.GA16382@www2.open-std.org>
In-Reply-To: <AANLkTin5o5gqogn5xHUfGpCV55D=76efpJQWs3Wmx+D8@mail.gmail.com>

Hmm, Roberto, I think we are already close to the theoretical maximum
with some of the raid1/raid10 code, and my nose tells me that we can
gain more by minimizing CPU usage.
Or maybe by adding some threading to the raid modules - they all
currently run single-threaded.
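
As a rough user-space illustration of the threading idea - purely a
sketch using pthreads and a made-up static stripe partitioning, nothing
like the actual md module internals:

#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define NWORKERS  4
#define NSTRIPES  16
#define STRIPE_SZ 4096

/* Two data legs per stripe; workers compute the XOR parity. */
static unsigned char data[NSTRIPES][2][STRIPE_SZ];
static unsigned char parity[NSTRIPES][STRIPE_SZ];

static void *worker(void *arg)
{
    long id = (long)arg;

    /* Static partitioning: worker i handles stripes i, i+NWORKERS, ... */
    for (long s = id; s < NSTRIPES; s += NWORKERS)
        for (int b = 0; b < STRIPE_SZ; b++)
            parity[s][b] = data[s][0][b] ^ data[s][1][b];
    return NULL;
}

int main(void)
{
    pthread_t tid[NWORKERS];

    memset(data, 0xab, sizeof(data));
    for (long i = 0; i < NWORKERS; i++)
        pthread_create(&tid[i], NULL, worker, (void *)i);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(tid[i], NULL);
    printf("parity[0][0] = 0x%02x\n", parity[0][0]);
    return 0;
}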
Best regards
keld
On Wed, Feb 02, 2011 at 06:28:27PM -0200, Roberto Spadim wrote:
> Earlier I posted this thread at this page:
> https://bbs.archlinux.org/viewtopic.php?pid=887267
> to keep the email volume on this list down.
>
> 2011/2/2 Keld Jørn Simonsen <keld@keldix.com>:
> > Hmm, Roberto, where are the gains?
>
> It's difficult to explain... NCQ and the Linux scheduler don't help a
> mirror, they help a single device.
> A new scheduler for mirrors can be written (round robin, closest head,
> others).
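
For reference, a minimal user-space sketch of those two classic mirror
read policies (structures and helper names are hypothetical; in the
kernel the real logic lives in read_balance() in drivers/md/raid1.c):

#include <stdio.h>
#include <stdlib.h>

struct mirror_leg {
    long head_pos;   /* last sector this disk serviced */
};

/* Round robin: alternate legs regardless of head position. */
static int pick_round_robin(int nlegs)
{
    static int next;
    return next++ % nlegs;
}

/* Closest head: pick the leg whose head is nearest the target sector. */
static int pick_closest_head(struct mirror_leg *legs, int nlegs, long sector)
{
    int best = 0;
    long best_dist = labs(legs[0].head_pos - sector);

    for (int i = 1; i < nlegs; i++) {
        long dist = labs(legs[i].head_pos - sector);
        if (dist < best_dist) {
            best = i;
            best_dist = dist;
        }
    }
    return best;
}

int main(void)
{
    struct mirror_leg legs[2] = { { 1000 }, { 500000 } };

    printf("round robin -> leg %d\n", pick_round_robin(2));
    printf("closest head for sector 1200 -> leg %d\n",
           pick_closest_head(legs, 2, 1200));
    return 0;
}

Note that neither policy knows whether a leg is busy, which is exactly
the gap discussed below.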
>
> > I think it is hard to make raid1 better than it is today.
> I don't think so. The head position only matters for hard disks
> (rotational), not for solid state disks. But let's leave SSDs aside and
> talk only about hard disks. In an array with a 5000rpm and a 10000rpm
> disk, will reads be faster from the 10000rpm one? We don't know the I/O
> model of that device, but it probably will be faster; yet when it is
> busy we could use the 5000rpm disk instead. That's the point: closest
> head alone doesn't help. We also need to know the queue (the list of
> I/Os being processed) and the time to finish the current I/O.
>
> > Normally the driver orders the reads to minimize head movement
> > and losses from rotational latency. Where can we improve that?
>
> There's no way to improve that part; it's already very good! But it
> works per hard disk, not per mirror. Since we know when a disk is busy,
> we can use another mirror (another disk with the same information)
> instead. That's what I want.
>
> > Also, what about conflicts with the elevator algorithm?
> Elevators are based on a model of the disk. Think of a disk as: Linux
> elevator + NCQ + physical disk. The sum of those three pieces of
> information gives us the time-based information to select the best
> device.
> Maybe with more complex code (per elevator) we could know the time
> spent executing each I/O, but that is a lot of work.
> For a first model, let's just think about the parameters of our model
> (Linux elevator + NCQ + disk).
> In a second version we could implement the elevator time calculation (a
> network block device, NBD, has an elevator at the server side plus the
> TCP/IP stack at both client and server sides, right?).
>
> > There are several scheduling algorithms available, and each has
> > its merits. Will your new scheme work against these?
> > Or is your new scheme just another scheduling algorithm?
>
> It's a scheduler for mirrors.
> Round robin is one algorithm for mirrors; closest head is another.
> My 'new' algorithm will also be for mirrors (if anyone helps me code it
> for the Linux kernel, hehehe - I haven't coded for the Linux kernel
> yet, only for user space).
>
> noop, deadline and cfq aren't for mirrors; they solve the single-device
> problem (and help raid0-style layouts - linear or stripe - much as if
> your hard disk had more than one head).
>
> > I think I learned that scheduling is per drive, not per file system.
> Yes, you learned right! =)
> /dev/md0 (raid1) is a device with scheduling (closest head, round robin)
> /dev/sda is a device with scheduling (noop, deadline, cfq, others)
> /dev/sda1 is a device with scheduling (it sends all I/O directly to /dev/sda)
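
(As an aside: the per-disk elevator is visible and switchable through
sysfs. A small sketch, assuming /dev/sda exists and the program runs as
root:)

#include <stdio.h>

int main(void)
{
    const char *path = "/sys/block/sda/queue/scheduler";
    char line[256];
    FILE *f = fopen(path, "r");

    /* The active elevator is shown in brackets, e.g. "noop [deadline] cfq". */
    if (f && fgets(line, sizeof(line), f))
        printf("current: %s", line);
    if (f)
        fclose(f);

    /* Writing a name switches the elevator for that disk only. */
    f = fopen(path, "w");
    if (f) {
        fputs("deadline", f);
        fclose(f);
    }
    return 0;
}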
>
> The new algorithm is just for mirrors (raid1). I don't remember whether
> raid5/6 are mirror based too; if they are, they could be optimized with
> this algorithm as well.
>
> raid0 doesn't have mirrors, but information is striped per device (not
> with linear), which is why it can be faster: it can do parallel reads.
>
> With closest head we can't always use the best disk: we may use a
> single disk all the time just because its head is closer, even though
> it's not the fastest disk. (That's why write-mostly was implemented: we
> don't use those devices for reads, only for writes or when a mirror
> fails. But it's not perfect for speed; a better algorithm can be made.
> For identical disks round robin works well, and better than closest
> head when the devices are solid state disks.)
> OK, under high load maybe closest head beats this algorithm? Yes, if
> you only use hard disks. But if you mix hard disks + solid state +
> network block devices + floppy disks + any other device, no single
> fixed algorithm is best for I/O over mirrors.
>
>
> > and is it reading or writing or both? Normally we are dependent on the
> > reading, as we cannot process data before we have read them.
> > OTOH writing is less time critical, as nobody is waiting for it.
> It must be implemented on both write and read: on write only for the
> time calculations, on read to select the best mirror.
> For writes we must write to all mirrors (synchronous write is better;
> async isn't safe against power failure).
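
A tiny user-space sketch of that write rule, with hypothetical leg file
descriptors - real md of course does this inside the kernel with bios,
not write(2):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write one buffer to all mirror legs at the same offset, and make it
 * durable on each leg before reporting success. */
static int mirror_write(int *fds, int nlegs, const void *buf,
                        size_t len, off_t off)
{
    for (int i = 0; i < nlegs; i++) {
        if (pwrite(fds[i], buf, len, off) != (ssize_t)len)
            return -1;
        if (fsync(fds[i]) != 0)
            return -1;
    }
    return 0;
}

int main(void)
{
    int fds[2] = {
        open("/tmp/leg0.img", O_RDWR | O_CREAT, 0600),
        open("/tmp/leg1.img", O_RDWR | O_CREAT, 0600),
    };
    char buf[512] = "hello mirror";

    if (mirror_write(fds, 2, buf, sizeof(buf), 0) != 0)
        perror("mirror_write");
    return 0;
}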
>
> > Or is it maximum throughput you want?
> > Or a mix, given some constraints?
> It's maximum performance: which strategy spends the least time
> executing the current I/O, based on the time to access the disk, the
> time to read the bytes, and the time to wait for other I/Os being
> executed.
>
> That's for mirror selection, not for per-disk I/O.
> For the disks themselves we can use the noop, deadline or cfq
> schedulers, and TCP/IP tweaks for network block devices.
>
> A model-identification step must run first to tell the mirror-selection
> algorithm what the model of each device is.
> Model: time to read X bytes, time to move the head, time to start a
> read, time to write - times per byte, per KB, per unit of work.
> Calculate the estimated time on each device and select the one (mirror)
> with the minimal value to execute our read.
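
To make the proposal concrete, a minimal sketch of that model-plus-argmin
step; every structure and model parameter below is invented for
illustration:

#include <stdio.h>
#include <stdlib.h>

struct dev_model {
    double seek_ns_per_sector; /* head movement cost (0 for SSD/NBD) */
    double ns_per_byte;        /* transfer cost */
    double fixed_start_ns;     /* command setup / network round trip */
    long   head_pos;           /* last serviced sector */
    int    pending;            /* I/Os already queued on this leg */
    double avg_pending_ns;     /* mean cost of one queued I/O */
};

/* Estimated time for this leg to finish a read of 'bytes' at 'sector'. */
static double estimate_ns(const struct dev_model *d, long sector, long bytes)
{
    double t = d->fixed_start_ns;

    t += d->seek_ns_per_sector * labs(d->head_pos - sector);
    t += d->ns_per_byte * bytes;
    t += d->avg_pending_ns * d->pending;  /* wait behind the queue */
    return t;
}

/* Pick the mirror leg expected to finish this read first. */
static int pick_mirror(const struct dev_model *legs, int n,
                       long sector, long bytes)
{
    int best = 0;
    double best_t = estimate_ns(&legs[0], sector, bytes);

    for (int i = 1; i < n; i++) {
        double t = estimate_ns(&legs[i], sector, bytes);
        if (t < best_t) {
            best = i;
            best_t = t;
        }
    }
    return best;
}

int main(void)
{
    struct dev_model legs[2] = {
        /* 10000rpm disk: fast transfer, but 4 I/Os already queued */
        { 50.0, 5.0, 100000.0, 0,     4, 2e6 },
        /* 5000rpm disk: slower, but idle and its head is nearby */
        { 90.0, 9.0, 100000.0, 90000, 0, 4e6 },
    };

    printf("read sector 100000 -> leg %d\n",
           pick_mirror(legs, 2, 100000, 4096));
    return 0;
}

With these made-up numbers the idle 5000rpm leg wins, which is exactly
the mixed-speed case argued above.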
>
>
> >
> > best regards
> > keld
>
> Thanks, keld.
>
> Sorry if I make this mailing list thread so big.
>
>
>
> --
> Roberto Spadim
> Spadim Technology / SPAEmpresarial