* very strange (maybe) raid1 testing results
From: Jon Nelson @ 2007-05-30 2:52 UTC
To: linux-raid

I assembled a 3-component raid1 from three 4GB partitions.
After syncing, I ran the following script:
for bs in 32 64 128 192 256 384 512 768 1024 ; do \
    let COUNT="2048 * 1024 / ${bs}"; \
    echo -n "${bs}K bs - "; \
    dd if=/dev/md1 of=/dev/null bs=${bs}k count=$COUNT iflag=direct 2>&1 |
        grep 'copied' ; \
done

I also ran 'dstat' (like iostat) in another terminal. What I noticed was
very unexpected to me, so I re-ran it several times. I confirmed my
initial observation - every time a new dd process ran, *all* of the read
I/O for that process came from a single disk. It does not appear to
depend on block size - if I stop and re-run the script, the next drive
in line takes all of the I/O - it goes sda, sdc, sdb and back to sda,
and so on.

I am getting 70-80MB/s read rates as reported via dstat, and 60-80MB/s
as reported by dd. What I don't understand is why just one disk is being
used here, instead of two or more. I tried different versions of
metadata, and using a bitmap makes no difference. I created the array
with (allowing for variations of bitmap and metadata version):
mdadm --create --level=1 --raid-devices=3 /dev/md1 /dev/sda3 /dev/sdb3 /dev/sdc3
I am running 2.6.18.8-0.3-default on x86_64, openSUSE 10.2.
Am I doing something wrong or is something weird going on?
--
Jon Nelson <jnelson-linux-raid@jamponi.net>
* Re: very strange (maybe) raid1 testing results
From: Richard Scobie @ 2007-05-31 2:35 UTC
To: linux-raid

Jon Nelson wrote:

> I am getting 70-80MB/s read rates as reported via dstat, and 60-80MB/s
> as reported by dd. What I don't understand is why just one disk is being
> used here, instead of two or more. I tried different versions of
> metadata, and using a bitmap makes no difference. I created the array
> with (allowing for variations of bitmap and metadata version):

This is normal for md RAID1. What you should find is that for
concurrent reads, each read will be serviced by a different disk,
until no. of reads = no. of drives.

Regards,

Richard
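
A quick way to see the concurrent-read behaviour Richard describes is to
start several readers at once and watch the per-disk traffic with dstat or
iostat in another terminal. The snippet below is only a sketch: the array
name is taken from the original post, while the offsets and sizes are
arbitrary illustrative values.

    # Three concurrent sequential readers at different offsets on the array.
    # With md RAID1, each stream should end up being serviced by a different
    # mirror (watch the per-disk numbers in dstat/iostat while this runs).
    for skip in 0 1024 2048 ; do
        dd if=/dev/md1 of=/dev/null bs=1M count=1024 skip=${skip} iflag=direct &
    done
    wait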
* Re: very strange (maybe) raid1 testing results
From: Jon Nelson @ 2007-05-31 2:58 UTC
Cc: linux-raid

On Thu, 31 May 2007, Richard Scobie wrote:

> Jon Nelson wrote:
>
> > I am getting 70-80MB/s read rates as reported via dstat, and 60-80MB/s as
> > reported by dd. What I don't understand is why just one disk is being used
> > here, instead of two or more. I tried different versions of metadata, and
> > using a bitmap makes no difference. I created the array with (allowing for
> > variations of bitmap and metadata version):
>
> This is normal for md RAID1. What you should find is that for
> concurrent reads, each read will be serviced by a different disk,
> until no. of reads = no. of drives.

Alright. To clarify, let's assume some process (like a single-threaded
webserver) using a raid1 to store content (who knows why, let's just say
it is), and also assume that the I/O load is 100% reads. Given that the
server does not fork (or create a thread) for each request, does that
mean that every single web request is essentially serviced from one
disk, always? What mechanism determines which disk actually services the
request?

--
Jon Nelson <jnelson-linux-raid@jamponi.net>
* Re: very strange (maybe) raid1 testing results
From: Jon Nelson @ 2007-05-31 3:05 UTC
Cc: linux-raid

On Wed, 30 May 2007, Jon Nelson wrote:

> On Thu, 31 May 2007, Richard Scobie wrote:
>
> > Jon Nelson wrote:
> >
> > > I am getting 70-80MB/s read rates as reported via dstat, and 60-80MB/s as
> > > reported by dd. What I don't understand is why just one disk is being used
> > > here, instead of two or more. I tried different versions of metadata, and
> > > using a bitmap makes no difference. I created the array with (allowing for
> > > variations of bitmap and metadata version):
> >
> > This is normal for md RAID1. What you should find is that for
> > concurrent reads, each read will be serviced by a different disk,
> > until no. of reads = no. of drives.
>
> Alright. To clarify, let's assume some process (like a single-threaded
> webserver) using a raid1 to store content (who knows why, let's just say
> it is), and also assume that the I/O load is 100% reads. Given that the
> server does not fork (or create a thread) for each request, does that
> mean that every single web request is essentially serviced from one
> disk, always? What mechanism determines which disk actually services the
> request?

It's probably bad form to reply to one's own posts, but I just found

    static int read_balance(conf_t *conf, r1bio_t *r1_bio)

in raid1.c which, if I'm reading the rest of the source correctly,
basically says "pick the disk whose current head position is closest".
This *could* explain the behavior I was seeing. Is that not correct?

--
Jon Nelson <jnelson-linux-raid@jamponi.net>
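
For anyone who wants to look at the same code, the read balancing logic
lives in the md RAID1 driver of the kernel source tree. Assuming a kernel
tree unpacked under /usr/src/linux (adjust the path for your setup):

    # show where the RAID1 read-balancing code is referenced and defined
    grep -n 'read_balance' /usr/src/linux/drivers/md/raid1.c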
* Re: very strange (maybe) raid1 testing results
From: Neil Brown @ 2007-05-31 4:28 UTC
To: Jon Nelson; +Cc: linux-raid

On Wednesday May 30, jnelson-linux-raid@jamponi.net wrote:

> On Wed, 30 May 2007, Jon Nelson wrote:
>
> > On Thu, 31 May 2007, Richard Scobie wrote:
> >
> > > Jon Nelson wrote:
> > >
> > > > I am getting 70-80MB/s read rates as reported via dstat, and 60-80MB/s as
> > > > reported by dd. What I don't understand is why just one disk is being used
> > > > here, instead of two or more. I tried different versions of metadata, and
> > > > using a bitmap makes no difference. I created the array with (allowing for
> > > > variations of bitmap and metadata version):
> > >
> > > This is normal for md RAID1. What you should find is that for
> > > concurrent reads, each read will be serviced by a different disk,
> > > until no. of reads = no. of drives.
> >
> > Alright. To clarify, let's assume some process (like a single-threaded
> > webserver) using a raid1 to store content (who knows why, let's just say
> > it is), and also assume that the I/O load is 100% reads. Given that the
> > server does not fork (or create a thread) for each request, does that
> > mean that every single web request is essentially serviced from one
> > disk, always? What mechanism determines which disk actually services the
> > request?
>
> It's probably bad form to reply to one's own posts, but I just found
>
>     static int read_balance(conf_t *conf, r1bio_t *r1_bio)
>
> in raid1.c which, if I'm reading the rest of the source correctly,
> basically says "pick the disk whose current head position is closest".
> This *could* explain the behavior I was seeing. Is that not correct?

Yes, that is correct.

md/raid1 will send a completely sequential read request to just one
device. There is not much to be gained by doing anything else.

md/raid10 in 'far' or 'offset' mode lays the data out differently and
will issue read requests to all devices and often get better read
throughput at some cost in write throughput.

NeilBrown
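
For reference, a raid10 array using the 'far' or 'offset' layouts Neil
mentions could be created over the same three partitions along these lines.
This is only a sketch: /dev/md2 is an arbitrary name, the 2-copy layouts
shown (f2, o2) are just examples, and the partitions would of course have
to be released from the existing raid1 first.

    # raid10 with 2 'far' copies spread across the 3 partitions
    mdadm --create /dev/md2 --level=10 --layout=f2 --raid-devices=3 \
        /dev/sda3 /dev/sdb3 /dev/sdc3

    # the same array with the 'offset' layout instead
    mdadm --create /dev/md2 --level=10 --layout=o2 --raid-devices=3 \
        /dev/sda3 /dev/sdb3 /dev/sdc3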
* Re: very strange (maybe) raid1 testing results
From: Bill Davidsen @ 2007-05-31 12:48 UTC
To: Neil Brown; +Cc: Jon Nelson, linux-raid

Neil Brown wrote:
> On Wednesday May 30, jnelson-linux-raid@jamponi.net wrote:
>
>> On Wed, 30 May 2007, Jon Nelson wrote:
>>
>>> On Thu, 31 May 2007, Richard Scobie wrote:
>>>
>>>> Jon Nelson wrote:
>>>>
>>>>> I am getting 70-80MB/s read rates as reported via dstat, and 60-80MB/s as
>>>>> reported by dd. What I don't understand is why just one disk is being used
>>>>> here, instead of two or more. I tried different versions of metadata, and
>>>>> using a bitmap makes no difference. I created the array with (allowing for
>>>>> variations of bitmap and metadata version):
>>>>
>>>> This is normal for md RAID1. What you should find is that for
>>>> concurrent reads, each read will be serviced by a different disk,
>>>> until no. of reads = no. of drives.
>>>
>>> Alright. To clarify, let's assume some process (like a single-threaded
>>> webserver) using a raid1 to store content (who knows why, let's just say
>>> it is), and also assume that the I/O load is 100% reads. Given that the
>>> server does not fork (or create a thread) for each request, does that
>>> mean that every single web request is essentially serviced from one
>>> disk, always? What mechanism determines which disk actually services the
>>> request?
>>
>> It's probably bad form to reply to one's own posts, but I just found
>>
>>     static int read_balance(conf_t *conf, r1bio_t *r1_bio)
>>
>> in raid1.c which, if I'm reading the rest of the source correctly,
>> basically says "pick the disk whose current head position is closest".
>> This *could* explain the behavior I was seeing. Is that not correct?
>
> Yes, that is correct.
> md/raid1 will send a completely sequential read request to just one
> device. There is not much to be gained by doing anything else.
> md/raid10 in 'far' or 'offset' mode lays the data out differently and
> will issue read requests to all devices and often get better read
> throughput at some cost in write throughput.

The whole "single process" thing may be a distraction rather than a
solution, as well. I wrote a small program using pthreads which shared
reads of a file between N threads in 1k blocks, such that each read was
preceded by a seek. It *seemed* that these were being combined in the
block layer before being passed on to the md logic, and treated as a
single read as nearly as I could tell.

I did NOT look at actual disk i/o (didn't care), but rather only at the
transfer rate from the file to memory, which did not change significantly
from 1..N threads active, where N was the number of mirrors. And RAID-10
did as well with one thread as several.

--
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979