* layout error (with my test code)
@ 2012-04-12 16:26 Alex Elder
2012-04-12 16:31 ` Sage Weil
From: Alex Elder @ 2012-04-12 16:26 UTC
To: ceph-devel
I sent this a couple days ago but I apparently got booted off
the list, so I don't think it got out.
-Alex
I'm running suites/iozone.sh on a 3-node ceph cluster with
each running kernel ceph-client/wip-layout-helpers.
I've hit the same error twice now, and it seems to occur only
when running with particular arguments.
Here are the three commands in that workunit:
iozone -c -e -s 1024M -r 16K -t 1 -F f1 -i 0 -i 1
iozone -c -e -s 1024M -r 1M -t 1 -F f2 -i 0 -i 1
iozone -c -e -s 10240M -r 1M -t 1 -F f3 -i 0 -i 1
The first two run to completion without a problem. The third
one runs for a while and then reports something like what's
below, and then hangs the test (system is still operational).
I see this in the syslog, but I'm not sure whether its timing
lines up with the failure:
[ 3925.501128] libceph: osd1 10.214.133.32:6800 socket closed
Since it shows up only with the 10GB file size and 1MB record
size I am wondering if this combination hits some sort of
boundary that would help me understand what's wrong. Anyone
have any ideas?
Here is how my three nodes are configured in the teuthology file:
- [mon.a, mon.c, osd.0]
- [mon.b, mds.a, osd.1]
- [client.0]
Thanks.
-Alex
Run began: Wed Apr 11 08:36:52 2012
Include close in write timing
Include fsync in write timing
File size set to 10485760 KB
Record Size 1024 KB
Command line used: iozone -c -e -s 10240M -r 1M -t 1 -F f3 -i 0 -i 1
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Throughput test with 1 process
Each process writes a 10485760 Kbyte file in 1024 Kbyte records
Error writing block 9408, fd= 3
Children see throughput for 1 initial writers = 0.00 KB/sec
Parent sees throughput for 1 initial writers = 0.00 KB/sec
Min throughput per process = 0.00 KB/sec
Max throughput per process = 0.00 KB/sec
Avg throughput per process = 0.00 KB/sec
Min xfer = 0.00 KB
Child 0
f3: No such file or directory
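One way to look for a boundary is to convert the failing block number from the output above into a byte offset and an object position. The sketch below does only that arithmetic. It assumes that iozone counts blocks from zero and that the file uses 4 MiB objects with no striping; neither is stated in this thread, and the helper name locate_block is made up for illustration.

```python
MIB = 1024 * 1024


def locate_block(block_index, record_size, object_size):
    """Map a record index to (byte offset, object index, offset in object)."""
    offset = block_index * record_size
    object_index, offset_in_object = divmod(offset, object_size)
    return offset, object_index, offset_in_object


# From the report: 1 MiB records, "Error writing block 9408".
# ASSUMPTION: zero-based block numbers and 4 MiB objects, no striping.
offset, object_index, offset_in_object = locate_block(9408, 1 * MIB, 4 * MIB)
print(offset, object_index, offset_in_object)  # 9865003008 2352 0
```

Under those assumptions the failing record starts at the first byte of an object. With 1 MiB records and 4 MiB objects that is true of one record in four, so on its own this is weak evidence; changing object_size to the layout actually in use would give a more meaningful answer.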
* Re: layout error (with my test code)
2012-04-12 16:26 layout error (with my test code) Alex Elder
@ 2012-04-12 16:31 ` Sage Weil
From: Sage Weil @ 2012-04-12 16:31 UTC
To: Alex Elder; +Cc: ceph-devel
On Thu, 12 Apr 2012, Alex Elder wrote:
> I sent this a couple days ago but I apparently got booted off
> the list, so I don't think it got out.
>
> -Alex
>
> I'm running suites/iozone.sh on a 3-node ceph cluster with
> each running kernel ceph-client/wip-layout-helpers.
>
> I've hit a consistent error twice now, but it seems to be
> hitting it when running with particular arguments.
>
> Here are the three commands in that workunit:
> iozone -c -e -s 1024M -r 16K -t 1 -F f1 -i 0 -i 1
> iozone -c -e -s 1024M -r 1M -t 1 -F f2 -i 0 -i 1
> iozone -c -e -s 10240M -r 1M -t 1 -F f3 -i 0 -i 1
>
> The first two run to completion without a problem. The third
> one runs for a while and then reports something like what's
> below, and then hangs the test (system is still operational).
> I see this in the syslog, but I'm not sure its timing aligned
> with the failure:
> [ 3925.501128] libceph: osd1 10.214.133.32:6800 socket closed
>
> Since it shows up only with the 10GB file size and 1MB record
> size I am wondering if this combination hits some sort of
> boundary that would help me understand what's wrong. Anyone
> have any ideas?
Nothing off the top of my head. Is this 100% reproducible? With just the
third run?
sage
>
> Here is how my three nodes are configured in the teuthology file:
> - [mon.a, mon.c, osd.0]
> - [mon.b, mds.a, osd.1]
> - [client.0]
>
> Thanks.
>
> -Alex
>
>
> Run began: Wed Apr 11 08:36:52 2012
>
> Include close in write timing
> Include fsync in write timing
> File size set to 10485760 KB
> Record Size 1024 KB
> Command line used: iozone -c -e -s 10240M -r 1M -t 1 -F f3 -i 0 -i 1
> Output is in Kbytes/sec
> Time Resolution = 0.000001 seconds.
> Processor cache size set to 1024 Kbytes.
> Processor cache line size set to 32 bytes.
> File stride size set to 17 * record size.
> Throughput test with 1 process
> Each process writes a 10485760 Kbyte file in 1024 Kbyte records
>
> Error writing block 9408, fd= 3
>
> Children see throughput for 1 initial writers = 0.00 KB/sec
> Parent sees throughput for 1 initial writers = 0.00 KB/sec
> Min throughput per process = 0.00 KB/sec
> Max throughput per process = 0.00 KB/sec
> Avg throughput per process = 0.00 KB/sec
> Min xfer = 0.00 KB
>
> Child 0
> f3: No such file or directory
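To answer the question above about whether the third run fails on its own, and how often, it could be repeated in isolation. This is a minimal sketch, not part of the workunit: the iozone arguments are copied from the report, while repeat_run, the attempt count, and the timeout are hypothetical. A process blocked inside the kernel may not respond to the kill that follows a timeout, so the loop itself can still stall.

```python
import subprocess

# Arguments copied from the third command in the workunit.
THIRD_RUN = ["iozone", "-c", "-e", "-s", "10240M", "-r", "1M",
             "-t", "1", "-F", "f3", "-i", "0", "-i", "1"]


def repeat_run(cmd, attempts, timeout_s, runner=subprocess.run):
    """Run cmd several times; return "ok", "error", or "hang" per attempt."""
    outcomes = []
    for _ in range(attempts):
        try:
            result = runner(cmd, capture_output=True, text=True,
                            timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # The report says the test hangs after the error is printed.
            outcomes.append("hang")
            continue
        failed = (result.returncode != 0
                  or "Error writing block" in result.stdout)
        outcomes.append("error" if failed else "ok")
    return outcomes
```

Calling repeat_run(THIRD_RUN, attempts=5, timeout_s=3600) from a directory on the ceph mount would give one outcome per attempt; the one-hour timeout is a guess and should be longer than a healthy run takes.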