* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
From: Jeff Wu @ 2010-11-29  3:53 UTC
To: sage, yehuda, gregf, colinm; +Cc: ceph-devel, mllv

HTML ---> TEXT, re-send
=================================================

Hi,

I've recently been using FFSB and iozone to run performance tests with Ceph v0.23 on my platform. The FFSB configuration files and ceph.conf are attached.

I am using one server (172.16.10.171) for the MDS and MON daemons and as the client host, one server (172.16.10.42) for OSD0, and one server (172.16.10.65) for OSD1. All three machines have Gigabit Ethernet cards and are connected through a Gigabit router.

The disks are formatted with ext4 in no-journal mode and with btrfs.

The following is my platform info and test results:

ceph: 0.23
OS: ubuntu 10.10 x86_64, 2.6.35-22-generic kernel
Ethernet: one Gigabit

MON / MDS / client host:
    CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
    Memory: 2GB
    client: mount.ceph 172.16.10.171:6789:/ /mnt/ceph

OSD0 host:
    CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
    Memory: 2GB

OSD1 host:
    CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3600+
    Memory: 4GB

1) Here are the FFSB test results on ceph+btrfs disk

                       8 threads      16 threads     32 threads
large_file_create      14.7 MB/sec    16.4 MB/sec    17.8 MB/sec
sequential_reads       15.5 MB/sec    16 MB/sec      17 MB/sec
random_reads           490 KB/sec     594 KB/sec     664 KB/sec
random_writes          57.2 MB/sec    68.4 MB/sec    72.1 MB/sec
mailserver (read)      85.8 KB/sec    236 KB/sec     286 KB/sec
mailserver (write)     36 KB/sec      132 KB/sec     129 KB/sec

2) For comparison, here are the FFSB test results on ceph+ext4 disk with no journal

                       8 threads      16 threads     32 threads
large_file_create      7.92 MB/sec    8.09 MB/sec    8.46 MB/sec
sequential_reads       8.19 MB/sec    8.77 MB/sec    8.14 MB/sec
random_reads           786 KB/sec     556 KB/sec     170 KB/sec
random_writes          52.9 MB/sec    63 MB/sec      59.1 MB/sec
mailserver (read)      456 KB/sec     249 KB/sec     485 KB/sec
mailserver (write)     228 KB/sec     120 KB/sec     226 KB/sec

3) Here are the iozone test results on ceph+btrfs disk, file size 6GB. Output is in Kbytes/sec.

    Iozone: Performance Test of File I/O
            Version $Revision: 3.353 $
        Compiled for 64 bit mode.
        Build: linux-ia64

    Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                 Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                 Fabrice Bacchella, Zhenghua Xue, Qin Li.

    Run began: Fri Nov 26 09:00:33 2010

    Include close in write timing
    Include fsync in write timing
    Auto Mode
    Using minimum file size of 6291456 kilobytes.
    Using maximum file size of 6291456 kilobytes.
    Excel chart generation enabled
    Command line used: ./benchmark/iozone/iozone_x86_64 -c -e -a -n 6144M -g 6144M -i 0 -i 1 -i 2 -f /mnt/ceph/f1 -Rb ./benchmark/iozone/iozone.201011260900.xls
    Output is in Kbytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.

                                                          random   random     bkwd   record   stride
        KB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
   6291456       64     6627     6417     9898    10334     3629     5908
   6291456      128     6803     7182    10200    10582     5106     6268
   6291456      256     6734     7249    10821    11224     7348     7135
   6291456      512     7109     7213    10538    10682     9392     7788
   6291456     1024     6932     7616    11204    10873     8673     8467
   6291456     2048     7896     7669    11025     9981    10258     7770
   6291456     4096     6933     7084    10590    10703    10450     7758
   6291456     8192     7215     7192    10490    10700    11110     7838
   6291456    16384     6557     6646    10224    11179    10738     7062

4) For comparison, here are the iozone test results on ceph+ext4 disk with no journal, file size 6GB. Output is in Kbytes/sec.

    Iozone: Performance Test of File I/O
            Version $Revision: 3.353 $
        Compiled for 64 bit mode.
        Build: linux-ia64

    Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                 Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                 Fabrice Bacchella, Zhenghua Xue, Qin Li.

    Run began: Thu Nov 25 08:42:25 2010

    Include close in write timing
    Include fsync in write timing
    Auto Mode
    Using minimum file size of 6291456 kilobytes.
    Using maximum file size of 6291456 kilobytes.
    Excel chart generation enabled
    Command line used: ./benchmark/iozone/iozone_x86_64 -c -e -a -n 6144M -g 6144M -i 0 -i 1 -i 2 -f /mnt/ceph/f1 -Rb ./benchmark/iozone/iozone.201011250841.xls
    Output is in Kbytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 1024 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.

                                                          random   random     bkwd   record   stride
        KB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
   6291456       64     7214     7128     9847     9991     3112     4168
   6291456      128     7514     7367    10601    10281     4667     6041
   6291456      256     7420     7414    11041    11238     6933     7860
   6291456      512     8190     8120    11449    11166     9001     7959
   6291456     1024     7611     7702    10497    10391     7497     8887
   6291456     2048     7516     7408     9908    10254     8639     8387
   6291456     4096     7355     7453    10383    10598     9469     7554
   6291456     8192     7415     7651    10244    10240     9868     8450
   6291456    16384     7200     7166     9877     9778     9829     8228

Are these results reasonable? They seem too slow; maybe I am doing something wrong.
Could you give me some Ceph performance test results for reference?

If you have any ideas, please let me know. Thanks.

Jeff.Wu

=============================ceph.conf =======================================
;
; Sample ceph ceph.conf file.
;
; This file defines cluster membership, the various locations
; that Ceph stores data, and any other runtime options.

; If a 'host' is defined for a daemon, the start/stop script will
; verify that it matches the hostname (or else ignore it). If it is
; not defined, it is assumed that the daemon is intended to start on
; the current host (e.g., in a setup with a startup.conf on each
; node).

; global
[global]
    ; enable secure authentication
    ; auth supported = cephx
    keyring = /etc/ceph/keyring.bin

; monitors
; You need at least one. You need at least three if you want to
; tolerate any node failures. Always create an odd number.
[mon]
    mon data = /opt/ceph/data/mon$id

    ; logging, for debugging monitor crashes, in order of
    ; their likelihood of being helpful :)
    ;debug ms = 20
    ;debug mon = 20
    ;debug paxos = 20
    ;debug auth = 20

[mon0]
    host = ubuntu-mon0
    mon addr = 172.16.10.171:6789

[mon1]
    host = ubuntu-mon0
    mon addr = 172.16.10.171:6790

[mon2]
    host = ubuntu-mon0
    mon addr = 172.16.10.171:6791

; mds
; You need at least one. Define two to get a standby.
[mds]
    ; where the mds keeps it's secret encryption keys
    keyring = /etc/ceph/keyring.$name

    ; mds logging to debug issues.
    ;debug ms = 20
    ;debug mds = 20

[mds.0]
    host = ubuntu-mon0

[mds.1]
    host = ubuntu-mon0

; osd
; You need at least one. Two if you want data to be replicated.
; Define as many as you like.
[osd]
    ; This is where the btrfs volume will be mounted.
    osd data = /opt/ceph/data/osd$id
    ;osd journal = /opt/ceph/data/osd$id/journal
    osd class tmp = /var/lib/ceph/tmp

    ; Ideally, make this a separate disk or partition. A few
    ; hundred MB should be enough; more if you have fast or many
    ; disks. You can use a file under the osd data dir if need be
    ; (e.g. /data/osd$id/journal), but it will be slower than a
    ; separate disk or partition.

    ; This is an example of a file-based journal.
    ; osd journal size = 1000 ; journal size, in megabytes

    keyring = /etc/ceph/keyring.$name

    ; osd logging to debug osd issues, in order of likelihood of being
    ; helpful
    ;debug ms = 20
    ;debug osd = 20
    ;debug filestore = 20
    ;debug journal = 20

[osd0]
    host = ubuntu-osd0
    ;osd journal size = 1000 ; journal size, in megabytes

    ; if 'btrfs devs' is not specified, you're responsible for
    ; setting up the 'osd data' dir. if it is not btrfs, things
    ; will behave up until you try to recover from a crash (which
    ; usually fine for basic testing).
    ; btrfs devs = /dev/sdx

[osd1]
    host = ubuntu-osd1
    ;osd data = /opt/data/osd$id
    ;osd journal = /opt/data/osd$id/journal
    ;filestore journal writeahead = true
    ;osd journal size = 1000 ; journal size, in megabytes
    ;btrfs devs = /dev/sdy

;[osd2]
;host = zeta
;btrfs devs = /dev/sdx

;[osd3]
;host = eta
;btrfs devs = /dev/sdy

================================= large_files_create========================================
# Large file creates
# Creating 1024 MB files.

time=300
alignio=1
directio=0

[filesystem0]
    location=%TESTPATH%

    # All created files will be 1024 MB.
    min_filesize=1024M
    max_filesize=1024M
[end0]

[threadgroup0]
    num_threads=32 # 8,16

    create_weight=1

    write_blocksize=4K

    [stats]
        enable_stats=1
        enable_range=1

        msec_range 0.00 0.01
        msec_range 0.01 0.02
        msec_range 0.02 0.05
        msec_range 0.05 0.10
        msec_range 0.10 0.20
        msec_range 0.20 0.50
        msec_range 0.50 1.00
        msec_range 1.00 2.00
        msec_range 2.00 5.00
        msec_range 5.00 10.00
        msec_range 10.00 20.00
        msec_range 20.00 50.00
        msec_range 50.00 100.00
        msec_range 100.00 200.00
        msec_range 200.00 500.00
        msec_range 500.00 1000.00
        msec_range 1000.00 2000.00
        msec_range 2000.00 5000.00
        msec_range 5000.00 10000.00
    [end]
[end0]

================================= mail server ========================================
# Mail server simulation.
# 1024 file

time=300
alignio=1
directio=0

[filesystem0]
    location=%TESTPATH%
    num_files=1024
    num_dirs=100

    # File sizes range from 1kB to 1MB.
    size_weight 1KB 10
    size_weight 2KB 15
    size_weight 4KB 16
    size_weight 8KB 16
    size_weight 16KB 15
    size_weight 32KB 10
    size_weight 64KB 8
    size_weight 128KB 4
    size_weight 256KB 3
    size_weight 512KB 2
    size_weight 1MB 1
[end0]

[threadgroup0]
    num_threads=32 # 8,16

    readall_weight=4
    create_fsync_weight=2
    delete_weight=1

    write_size=4KB
    write_blocksize=4KB

    read_size=4KB
    read_blocksize=4KB

    [stats]
        enable_stats=1
        enable_range=1

        msec_range 0.00 0.01
        msec_range 0.01 0.02
        msec_range 0.02 0.05
        msec_range 0.05 0.10
        msec_range 0.10 0.20
        msec_range 0.20 0.50
        msec_range 0.50 1.00
        msec_range 1.00 2.00
        msec_range 2.00 5.00
        msec_range 5.00 10.00
        msec_range 10.00 20.00
        msec_range 20.00 50.00
        msec_range 50.00 100.00
        msec_range 100.00 200.00
        msec_range 200.00 500.00
        msec_range 500.00 1000.00
        msec_range 1000.00 2000.00
        msec_range 2000.00 5000.00
        msec_range 5000.00 10000.00
    [end]
[end0]

================================= random reads========================================
# Large file random reads.
# 256 files, 100MB per file.

time=300 # 5 min
alignio=1

[filesystem0]
    location=%TESTPATH%
    num_files=256
    min_filesize=100M # 100 MB
    max_filesize=100M
    reuse=1
[end0]

[threadgroup0]
    num_threads=32 # 8,16

    read_random=1
    read_weight=1

    read_size=1M # 1 MB
    read_blocksize=4k

    [stats]
        enable_stats=1
        enable_range=1

        msec_range 0.00 0.01
        msec_range 0.01 0.02
        msec_range 0.02 0.05
        msec_range 0.05 0.10
        msec_range 0.10 0.20
        msec_range 0.20 0.50
        msec_range 0.50 1.00
        msec_range 1.00 2.00
        msec_range 2.00 5.00
        msec_range 5.00 10.00
        msec_range 10.00 20.00
        msec_range 20.00 50.00
        msec_range 50.00 100.00
        msec_range 100.00 200.00
        msec_range 200.00 500.00
        msec_range 500.00 1000.00
        msec_range 1000.00 2000.00
        msec_range 2000.00 5000.00
        msec_range 5000.00 10000.00
    [end]
[end0]

================================= random writes========================================
# Large file random writes.
# 256 files, 100MB per file.

time=300 # 5 min
alignio=1

[filesystem0]
    location=%TESTPATH%
    num_files=256
    min_filesize=100M # 100 MB
    max_filesize=100M
    reuse=1
[end0]

[threadgroup0]
    num_threads=32 # 8,16

    write_random=1
    write_weight=1

    write_size=1M # 1 MB
    write_blocksize=4k

    [stats]
        enable_stats=1
        enable_range=1

        msec_range 0.00 0.01
        msec_range 0.01 0.02
        msec_range 0.02 0.05
        msec_range 0.05 0.10
        msec_range 0.10 0.20
        msec_range 0.20 0.50
        msec_range 0.50 1.00
        msec_range 1.00 2.00
        msec_range 2.00 5.00
        msec_range 5.00 10.00
        msec_range 10.00 20.00
        msec_range 20.00 50.00
        msec_range 50.00 100.00
        msec_range 100.00 200.00
        msec_range 200.00 500.00
        msec_range 500.00 1000.00
        msec_range 1000.00 2000.00
        msec_range 2000.00 5000.00
        msec_range 5000.00 10000.00
    [end]
[end0]

================================= sequential reads========================================
# Large file sequential reads.
# 256 files, 100MB per file.

time=300 # 5 min
alignio=1

[filesystem0]
    location=%TESTPATH%
    num_files=256
    min_filesize=100M # 100 MB
    max_filesize=100M # 100 MB
    reuse=1
[end0]

[threadgroup0]
    num_threads=32 # 8,16

    read_weight=1

    read_size=1M # 1 MB
    read_blocksize=4k

    [stats]
        enable_stats=1
        enable_range=1

        msec_range 0.00 0.01
        msec_range 0.01 0.02
        msec_range 0.02 0.05
        msec_range 0.05 0.10
        msec_range 0.10 0.20
        msec_range 0.20 0.50
        msec_range 0.50 1.00
        msec_range 1.00 2.00
        msec_range 2.00 5.00
        msec_range 5.00 10.00
        msec_range 10.00 20.00
        msec_range 20.00 50.00
        msec_range 50.00 100.00
        msec_range 100.00 200.00
        msec_range 200.00 500.00
        msec_range 500.00 1000.00
        msec_range 1000.00 2000.00
        msec_range 2000.00 5000.00
        msec_range 5000.00 10000.00
    [end]
[end0]
==================================================================

* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
From: Gregory Farnum @ 2010-11-29 17:07 UTC
To: cpwu; +Cc: sage, yehuda, colinm, ceph-devel, mllv

On Sun, Nov 28, 2010 at 7:25 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
>
> Hi ,
>
> I've recently been using FFSB and iozone to do performance test with Ceph
> v0.23 on my platform.
> the attachment file are FFSB configuration file and ceph.conf.

So you're not using OSD journals in either test configuration? You're
going to get pretty terrible write results without a journal.
The reads are clearly slower than they should be and you could
probably get better results by adjusting the caching behaviors. We
haven't done too much work on optimizing read behavior.

Could you run "ceph osd tell * bench", then run "ceph -w", and report
the results? (That'll just run local benchmarking on the OSD to report
the approximate write speed it's capable of.)
You can also run "rados -p data bench 60 write", and then "rados -p
data bench 0 seq" to get a simpler (better understood) performance
test.
With this data as a baseline we can start looking at what might be
causing trouble.
-Greg

* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
From: Jeff Wu @ 2010-11-30  2:55 UTC
To: Gregory Farnum
Cc: sage@newdream.net, yehuda@hq.newdream.net, colinm@hq.newdream.net, ceph-devel@vger.kernel.org, Andrew Lv

On Tue, 2010-11-30 at 01:07 +0800, Gregory Farnum wrote:
> On Sun, Nov 28, 2010 at 7:25 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> >
> > Hi ,
> >
> > I've recently been using FFSB and iozone to do performance test with Ceph
> > v0.23 on my platform.
> > the attachment file are FFSB configuration file and ceph.conf.
> So you're not using OSD journals in either test configuration? You're
> going to get pretty terrible write results without a journal.
> The reads are clearly slower than they should be and you could
> probably get better results by adjusting the caching behaviors. We
> haven't done too much work on optimizing read behavior.

Hi Greg,
thank you for your suggestions.

>
> Could you run "ceph osd tell * bench", then run "ceph -w", and report
> the results? (That'll just run local benchmarking on the OSD to report
> the approximate write speed it's capable of.)

I ran the command "$ sudo ceph osd tell 0/1 bench" six times and used "$ sudo ceph -w" to get the following results:

osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.906598 sec at 49775 KB/sec
osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.023294 sec at 49384 KB/sec
osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.834682 sec at 51535 KB/sec
osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 20.792697 sec at 37547 KB/sec
osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.058412 sec at 77191 KB/sec
osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.113612 sec at 47369 KB/sec

The detailed test logs are attached below.

> You can also run "rados -p data bench 60 write", and then "rados -p
> data bench 0 seq" to get a simpler (better understood) performance
> test.

I ran the command "rados -p data bench 60 write" twice and got these results:

$ sudo rados -p data bench 60 write
..........................
..........................

Total time run:        76.182225
Total writes made:     121
Write size:            4194304
Bandwidth (MB/sec):    6.219

Average Latency:       13.3068
Max latency:           23.9986
Min latency:           7.01847

$ sudo rados -p data bench 60 write
..........................
..........................

Total time run:        74.830651
Total writes made:     97
Write size:            4194304
Bandwidth (MB/sec):    4.714

Average Latency:       15.5064
Max latency:           24.5641
Min latency:           3.50005

But running "$ sudo rados -p data bench 0 seq" failed to produce any results. Maybe it's a bug in ceph version 0.23.

logs:
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0      16        16         0         0         0         -         0
read got -2
error during benchmark: -2
./common/Mutex.h: In function 'Mutex::~Mutex()':
./common/Mutex.h:97: FAILED assert(nlock == 0)
 ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56)
...............................
...............................

The detailed logs are attached below.

> With this data as a baseline we can start looking at what might be
> causing trouble.
> -Greg

==========================================================
1. Process

transoft@ubuntu-mon0:/usr/local/etc/ceph$ ps -ef
root     12919     1  2 Nov26 ?        02:23:45 /usr/local/bin/cmon -i 0 -c ceph.conf
root     12952     1  1 Nov26 ?        01:30:03 /usr/local/bin/cmon -i 1 -c ceph.conf
root     12987     1  1 Nov26 ?        01:51:42 /usr/local/bin/cmon -i 2 -c ceph.conf
root     13036     1  1 Nov26 ?        01:00:08 /usr/local/bin/cmds -i 0 -c ceph.conf
root     13075     1  0 Nov26 ?        00:47:47 /usr/local/bin/cmds -i 1 -c ceph.conf

==========================================================
2.
ceph -w transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench 2010-11-30 08:47:09.080125 mon <- [osd,tell,0,bench] 2010-11-30 08:47:09.080378 mon1 -> 'ok' (0) transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench 2010-11-30 08:47:54.520159 mon <- [osd,tell,1,bench] 2010-11-30 08:47:54.520433 mon2 -> 'ok' (0) transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench 2010-11-30 08:48:29.590115 mon <- [osd,tell,0,bench] 2010-11-30 08:48:29.590365 mon2 -> 'ok' (0) transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench 2010-11-30 08:48:47.450092 mon <- [osd,tell,1,bench] 2010-11-30 08:48:47.450341 mon1 -> 'ok' (0) transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench 2010-11-30 08:49:27.240742 mon <- [osd,tell,0,bench] 2010-11-30 08:49:27.241076 mon1 -> 'ok' (0) transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench 2010-11-30 08:49:43.500749 mon <- [osd,tell,1,bench] 2010-11-30 08:49:43.501043 mon2 -> 'ok' (0) transoft@ubuntu-mon0:/usr/local/etc/ceph$ transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph -w 2010-11-30 08:47:04.150457 pg v8701: 528 pgs: 528 active+clean; 10674 KB data, 442 MB used, 219 GB / 219 GB avail 2010-11-30 08:47:04.151228 mds e5: 1/1/1 up {0=up:active}, 1 up:standby 2010-11-30 08:47:04.151253 osd e6: 2 osds: 2 up, 2 in 2010-11-30 08:47:04.151319 log 2010-11-30 08:45:17.795260 osd0 172.16.10.42:6800/16864 3 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.828886 sec at 51682 KB/sec 2010-11-30 08:47:04.151375 mon e1: 3 mons at {0=172.16.10.171:6789/0,1=172.16.10.171:6790/0,2=172.16.10.171:6791/0} 2010-11-30 08:47:22.487639 log 2010-11-30 08:47:00.704960 osd0 172.16.10.42:6800/16864 4 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.906598 sec at 49775 KB/sec 2010-11-30 08:48:17.047842 log 2010-11-30 09:52:29.975820 osd1 172.16.10.65:6800/6678 3 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.023294 sec at 49384 KB/sec 2010-11-30 08:48:42.915344 
log 2010-11-30 08:48:21.135651 osd0 172.16.10.42:6800/16864 5 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.834682 sec at 51535 KB/sec 2010-11-30 08:49:10.000303 log 2010-11-30 09:53:22.645957 osd1 172.16.10.65:6800/6678 4 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 20.792697 sec at 37547 KB/sec 2010-11-30 08:49:40.809023 log 2010-11-30 08:49:19.005413 osd0 172.16.10.42:6800/16864 6 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.058412 sec at 77191 KB/sec 2010-11-30 08:50:06.184830 log 2010-11-30 09:54:19.064900 osd1 172.16.10.65:6800/6678 5 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.113612 sec at 47369 KB/sec =========================================================== 3. rados -p data bench 60 write transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo rados -p data bench 60 write Maintaining 16 concurrent writes of 4194304 bytes for at least 60 seconds. sec Cur ops started finished avg MB/s cur MB/s last lat avg lat 0 16 16 0 0 0 - 0 1 16 16 0 0 0 - 0 2 16 16 0 0 0 - 0 3 16 16 0 0 0 - 0 4 16 16 0 0 0 - 0 5 16 17 1 0.793474 0.8 7.9252 7.9252 6 16 17 1 0.662028 0 - 7.9252 7 16 17 1 0.567947 0 - 7.9252 8 16 24 8 3.97823 9.33333 10.5391 10.1201 9 16 24 8 3.53801 0 - 10.1201 10 16 38 22 8.76019 28 7.13522 9.96534 11 16 38 22 7.96645 0 - 9.96534 12 16 38 22 7.30461 0 - 9.96534 13 16 38 22 6.74431 0 - 9.96534 14 16 38 22 6.26388 0 - 9.96534 15 16 38 22 5.84735 0 - 9.96534 16 16 41 25 6.23039 2 8.7632 10.3001 17 16 41 25 5.86468 0 - 10.3001 18 16 41 25 5.53953 0 - 10.3001 19 16 52 36 7.55792 14.6667 10.3215 11.3923 min lat: 7.01847 max lat: 15.8599 avg lat: 11.3923 sec Cur ops started finished avg MB/s cur MB/s last lat avg lat 20 16 52 36 7.18061 0 - 11.3923 21 16 56 40 7.59925 8 13.7597 11.8294 22 16 56 40 7.25439 0 - 11.8294 23 16 56 40 6.93949 0 - 11.8294 24 16 56 40 6.65079 0 - 11.8294 25 16 63 47 7.50258 7 7.33881 11.2982 26 16 63 47 7.21441 0 - 11.2982 27 16 68 52 7.68667 10 9.28991 11.1051 28 16 68 52 7.41251 0 - 11.1051 29 16 68 
52 7.15725 0 - 11.1051 30 16 68 52 6.91898 0 - 11.1051 31 16 68 52 6.69606 0 - 11.1051 32 16 79 63 7.85927 8.8 15.0751 11.933 33 16 79 63 7.62121 0 - 11.933 34 16 79 63 7.3973 0 - 11.933 35 16 79 63 7.18618 0 - 11.933 36 16 79 63 6.98678 0 - 11.933 37 16 84 68 7.33767 4 17.4944 12.3426 38 16 84 68 7.14464 0 - 12.3426 39 16 85 69 7.06399 2 8.24284 12.2832 min lat: 7.01847 max lat: 19.7384 avg lat: 12.2832 sec Cur ops started finished avg MB/s cur MB/s last lat avg lat 40 16 85 69 6.88751 0 - 12.2832 41 16 85 69 6.71971 0 - 12.2832 42 16 85 69 6.55985 0 - 12.2832 43 16 87 71 6.59315 2 15.8845 12.3846 44 16 87 71 6.44343 0 - 12.3846 45 16 87 71 6.30037 0 - 12.3846 46 16 87 71 6.16352 0 - 12.3846 47 16 91 75 6.37235 4 21.8578 12.8896 48 16 91 75 6.23967 0 - 12.8896 49 16 101 85 6.92742 20 18.2741 13.1211 50 16 101 85 6.78897 0 - 13.1211 51 16 101 85 6.65595 0 - 13.1211 52 16 101 85 6.52804 0 - 13.1211 53 16 101 85 6.40496 0 - 13.1211 54 16 111 95 7.02602 8 13.1645 12.939 55 16 111 95 6.89835 0 - 12.939 56 16 111 95 6.77524 0 - 12.939 57 16 111 95 6.65646 0 - 12.939 58 16 111 95 6.54176 0 - 12.939 59 16 115 99 6.70174 3.2 13.1073 12.9458 min lat: 7.01847 max lat: 21.8578 avg lat: 12.9364 sec Cur ops started finished avg MB/s cur MB/s last lat avg lat 60 16 120 104 6.92291 20 11.3313 12.9364 61 15 121 106 6.94046 8 13.3389 12.9444 62 15 121 106 6.82859 0 - 12.9444 63 15 121 106 6.72026 0 - 12.9444 64 15 121 106 6.61533 0 - 12.9444 65 15 121 106 6.51362 0 - 12.9444 66 11 121 110 6.65706 3.2 14.3524 12.9974 67 11 121 110 6.55777 0 - 12.9974 68 11 121 110 6.46139 0 - 12.9974 69 11 121 110 6.3678 0 - 12.9974 70 11 121 110 6.27689 0 - 12.9974 71 2 121 119 6.69486 7.2 13.2646 13.1676 72 2 121 119 6.60193 0 - 13.1676 73 2 121 119 6.51154 0 - 13.1676 74 2 121 119 6.4236 0 - 13.1676 75 2 121 119 6.338 0 - 13.1676 76 2 121 119 6.25465 0 - 13.1676 Total time run: 76.182225 Total writes made: 121 Write size: 4194304 Bandwidth (MB/sec): 6.219 Average Latency: 13.3068 Max latency: 
23.9986 Min latency: 7.01847 transoft@ubuntu-mon0:/usr/local/etc/ceph$ transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo rados -p data bench 60 write Maintaining 16 concurrent writes of 4194304 bytes for at least 60 seconds. sec Cur ops started finished avg MB/s cur MB/s last lat avg lat 0 16 16 0 0 0 - 0 1 16 16 0 0 0 - 0 2 16 16 0 0 0 - 0 3 16 17 1 1.31537 1.33333 3.50005 3.50005 4 16 17 1 0.989638 0 - 3.50005 5 16 17 1 0.793226 0 - 3.50005 6 16 17 1 0.661871 0 - 3.50005 7 16 17 1 0.567841 0 - 3.50005 8 16 17 1 0.497201 0 - 3.50005 9 16 17 1 0.442194 0 - 3.50005 10 16 17 1 0.398146 0 - 3.50005 11 16 17 1 0.362079 0 - 3.50005 12 16 25 9 2.98799 3.55556 12.0273 17.9704 13 16 27 11 3.37183 8 12.0974 16.8994 14 16 33 17 4.83996 24 12.8105 16.4478 15 16 33 17 4.5181 0 - 16.4478 16 16 33 17 4.2364 0 - 16.4478 17 16 33 17 3.98777 0 - 16.4478 18 16 33 17 3.76669 0 - 16.4478 19 16 38 22 4.6185 4 14.6053 16.0123 min lat: 3.50005 max lat: 20.9926 avg lat: 16.0123 sec Cur ops started finished avg MB/s cur MB/s last lat avg lat 20 16 38 22 4.38793 0 - 16.0123 21 16 38 22 4.17939 0 - 16.0123 22 16 38 22 3.98976 0 - 16.0123 23 16 38 22 3.81659 0 - 16.0123 24 16 38 22 3.65782 0 - 16.0123 25 16 47 31 4.94834 6 13.0266 15.7939 26 16 47 31 4.75829 0 - 15.7939 27 16 47 31 4.58231 0 - 15.7939 28 16 47 31 4.41888 0 - 15.7939 29 16 47 31 4.26672 0 - 15.7939 30 16 47 31 4.12467 0 - 15.7939 31 16 54 38 4.89315 4.66667 14.5712 16.1038 32 16 54 38 4.74042 0 - 16.1038 33 16 54 38 4.59694 0 - 16.1038 34 16 54 38 4.46188 0 - 16.1038 35 16 54 38 4.33454 0 - 16.1038 36 16 63 47 5.21238 7.2 12.2193 15.3573 37 16 66 50 5.39536 12 9.59416 15.0125 38 16 66 50 5.25352 0 - 15.0125 39 16 66 50 5.11895 0 - 15.0125 min lat: 3.50005 max lat: 24.5641 avg lat: 15.0125 sec Cur ops started finished avg MB/s cur MB/s last lat avg lat 40 16 66 50 4.99107 0 - 15.0125 41 16 66 50 4.86947 0 - 15.0125 42 16 73 57 5.41915 5.6 11.6831 14.5749 43 16 73 57 5.29323 0 - 14.5749 44 16 73 57 5.17304 0 - 14.5749 45 16 73 57 
5.05817 0 - 14.5749 46 16 73 57 4.94831 0 - 14.5749 47 16 82 66 5.60781 7.2 14.1181 14.6534 48 16 82 66 5.49105 0 - 14.6534 49 16 82 66 5.37907 0 - 14.6534 50 16 82 66 5.27157 0 - 14.6534 51 16 82 66 5.16829 0 - 14.6534 52 16 82 66 5.06898 0 - 14.6534 53 16 88 72 5.42554 4 14.2779 14.6222 54 16 88 72 5.32512 0 - 14.6222 55 16 88 72 5.22837 0 - 14.6222 56 16 88 72 5.13507 0 - 14.6222 57 16 88 72 5.04505 0 - 14.6222 58 16 88 72 4.95812 0 - 14.6222 59 16 90 74 5.00954 1.33333 18.7183 14.808 min lat: 3.50005 max lat: 24.5641 avg lat: 14.8389 sec Cur ops started finished avg MB/s cur MB/s last lat avg lat 60 16 96 80 5.32549 24 16.2963 14.8389 61 16 96 80 5.23826 0 - 14.8389 62 16 96 80 5.15383 0 - 14.8389 63 16 96 80 5.07208 0 - 14.8389 64 13 97 84 5.24252 4 13.2043 15.0243 65 13 97 84 5.16191 0 - 15.0243 66 13 97 84 5.08375 0 - 15.0243 67 13 97 84 5.00792 0 - 15.0243 68 13 97 84 4.93432 0 - 15.0243 69 13 97 84 4.86285 0 - 15.0243 70 5 97 92 5.24994 5.33333 18.0332 15.3781 71 5 97 92 5.17604 0 - 15.3781 72 5 97 92 5.10419 0 - 15.3781 73 5 97 92 5.0343 0 - 15.3781 74 5 97 92 4.96631 0 - 15.3781 Total time run: 74.830651 Total writes made: 97 Write size: 4194304 Bandwidth (MB/sec): 4.714 Average Latency: 15.5064 Max latency: 24.5641 Min latency: 3.50005 transoft@ubuntu-mon0:/usr/local/etc/ceph$ =========================================================== 4. rados -p data bench 0 seq transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo rados -p data bench 0 seq sec Cur ops started finished avg MB/s cur MB/s last lat avg lat 0 16 16 0 0 0 - 0 read got -2 error during benchmark: -2 ./common/Mutex.h: In function 'Mutex::~Mutex()': ./common/Mutex.h:97: FAILED assert(nlock == 0) ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56) 1: rados() [0x40c941] 2: (exit()+0xe2) [0x7f518ea744f2] 3: (__libc_start_main()+0x105) [0x7f518ea59d95] 4: rados() [0x405b19] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 
./common/Mutex.h: In function 'Mutex::~Mutex()': ./common/Mutex.h:97: FAILED assert(nlock == 0) ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56) 1: rados() [0x40c941] 2: (exit()+0xe2) [0x7f518ea744f2] 3: (__libc_start_main()+0x105) [0x7f518ea59d95] 4: rados() [0x405b19] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. terminate called after throwing an instance of 'ceph::FailedAssertion' *** Caught signal (ABRT) *** ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56) 1: (sigabrt_handler(int)+0x91) [0x7f518fe7dec1] 2: (()+0x33c20) [0x7f518ea6ec20] 3: (gsignal()+0x35) [0x7f518ea6eba5] 4: (abort()+0x180) [0x7f518ea726b0] 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f518f3126bd] 6: (()+0xb9906) [0x7f518f310906] 7: (()+0xb9933) [0x7f518f310933] 8: (()+0xb9a3e) [0x7f518f310a3e] 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x456) [0x7f518fe5ab96] 10: rados() [0x40c941] 11: (exit()+0xe2) [0x7f518ea744f2] 12: (__libc_start_main()+0x105) [0x7f518ea59d95] 13: rados() [0x405b19] Aborted transoft@ubuntu-mon0:/usr/local/etc/ceph$ transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo rados -p data bench 60 seq sec Cur ops started finished avg MB/s cur MB/s last lat avg lat 0 16 16 0 0 0 - 0 read got -2 error during benchmark: -2 ./common/Mutex.h: In function 'Mutex::~Mutex()': ./common/Mutex.h:97: FAILED assert(nlock == 0) ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56) 1: rados() [0x40c941] 2: (exit()+0xe2) [0x7f795c31c4f2] 3: (__libc_start_main()+0x105) [0x7f795c301d95] 4: rados() [0x405b19] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 
./common/Mutex.h: In function 'Mutex::~Mutex()': ./common/Mutex.h:97: FAILED assert(nlock == 0) ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56) 1: rados() [0x40c941] 2: (exit()+0xe2) [0x7f795c31c4f2] 3: (__libc_start_main()+0x105) [0x7f795c301d95] 4: rados() [0x405b19] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. terminate called after throwing an instance of 'ceph::FailedAssertion' *** Caught signal (ABRT) *** ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56) 1: (sigabrt_handler(int)+0x91) [0x7f795d725ec1] 2: (()+0x33c20) [0x7f795c316c20] 3: (gsignal()+0x35) [0x7f795c316ba5] 4: (abort()+0x180) [0x7f795c31a6b0] 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f795cbba6bd] 6: (()+0xb9906) [0x7f795cbb8906] 7: (()+0xb9933) [0x7f795cbb8933] 8: (()+0xb9a3e) [0x7f795cbb8a3e] 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x456) [0x7f795d702b96] 10: rados() [0x40c941] 11: (exit()+0xe2) [0x7f795c31c4f2] 12: (__libc_start_main()+0x105) [0x7f795c301d95] 13: rados() [0x405b19] Aborted transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo rados -p data bench 60 rand Random test not implemented yet! error during benchmark: -1 transoft@ubuntu-mon0:/usr/local/etc/ceph$ Best Regards Jeff.Wu -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 14+ messages in thread
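For cross-checking the `osd bench` lines from `ceph -w` quoted in the message above, a short parser can recompute the rate from the logged size and elapsed time. This is only a sketch: the regular expression is written against the log lines shown in this thread, the names `BENCH_RE` and `parse_bench_line` are made up for illustration, and it assumes 1 MB = 1024 KB. For the first osd0 line it gives roughly 81,000 KB/sec rather than the logged 49775 KB/sec; the thread does not say how the logged figure is derived.

```python
import re

# Matches lines such as:
#   osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.906598 sec at 49775 KB/sec
# The pattern is an assumption based on the log lines quoted in this thread.
BENCH_RE = re.compile(
    r"(osd\d+)\b.*?bench: wrote (\d+) MB in blocks of (\d+) KB "
    r"in ([\d.]+) sec at (\d+) KB/sec"
)


def parse_bench_line(line):
    """Return the fields of one bench log line, or None if it does not match."""
    m = BENCH_RE.search(line)
    if m is None:
        return None
    osd, size_mb, block_kb, seconds, reported = m.groups()
    seconds = float(seconds)
    return {
        "osd": osd,
        "size_mb": int(size_mb),
        "block_kb": int(block_kb),
        "seconds": seconds,
        "reported_kb_s": int(reported),
        # Rate implied by size and elapsed time, assuming 1 MB = 1024 KB.
        "recomputed_kb_s": int(size_mb) * 1024 / seconds,
    }


if __name__ == "__main__":
    sample = (
        "osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB "
        "in 12.906598 sec at 49775 KB/sec"
    )
    result = parse_bench_line(sample)
    print(result["osd"], result["reported_kb_s"], round(result["recomputed_kb_s"]))
```

The same function also accepts the longer timestamped lines from the `ceph -w` log, since it only searches for the `osdN ... bench: wrote ...` portion.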
* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
From: Gregory Farnum @ 2010-11-30  3:18 UTC
To: cpwu; +Cc: ceph-devel@vger.kernel.org, Andrew Lv

On Mon, Nov 29, 2010 at 6:55 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
>> Could you run "ceph osd tell * bench", then run "ceph -w", and report
>> the results? (That'll just run local benchmarking on the OSD to report
>> the approximate write speed it's capable of.)
>
> I run six times for the command: "$ sudo ceph osd tell 0/1 bench "
> use "$ sudo ceph -w" to get the following results:
>
> osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.906598 sec
> at 49775 KB/sec
> osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.023294 sec
> at 49384 KB/sec
> osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.834682 sec
> at 51535 KB/sec
> osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 20.792697 sec
> at 37547 KB/sec
> osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.058412 sec
> at 77191 KB/sec
> osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.113612 sec
> at 47369 KB/sec

Okay, those are a bit slow but reasonable. Based on these I'd expect
you to generally manage about 40-50MB/s (since everything is
replicated; in a 2-disk configuration it'll just be the speed of your
slowest disk), assuming a properly configured system.

>> You can also run "rados -p data bench 60 write", and then "rados -p
>> data bench 0 seq" to get a simpler (better understood) performance
>> test.
>
> I run twice for the command: "rados -p data bench 60 write" ,
> Get the results:
> $ sudo rados -p data bench 60 write
> ..........................
> ..........................
>
> Total time run: 76.182225
> Total writes made: 121
> Write size: 4194304
> Bandwidth (MB/sec): 6.219
>
> Average Latency: 13.3068
> Max latency: 23.9986
> Min latency: 7.01847

WOAH. That's a lot of latency. Rather more than I'd expect to get just
from seek times in a non-journaled environment. What's the round-trip
time to ping the OSDs from your client? Are your disks okay?

> but run "$ sudo rados -p data bench 0 seq"
> fail to get the results, Maybe it's a bug,ceph version 0.23.

Oh right, sorry. We put in a quick fix to make the write benchmark
scale across multiple client writers and forgot to adjust the read
benchmark. *oops*

* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-11-30  3:18   ` Gregory Farnum
@ 2010-11-30  6:19     ` Jeff Wu
  2010-11-30 17:07       ` Gregory Farnum
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Wu @ 2010-11-30  6:19 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel@vger.kernel.org, Andrew Lv

On Tue, 2010-11-30 at 11:18 +0800, Gregory Farnum wrote:
> On Mon, Nov 29, 2010 at 6:55 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> >> Could you run "ceph osd tell * bench", then run "ceph -w", and report
> >> the results? (That'll just run local benchmarking on the OSD to report
> >> the approximate write speed it's capable of.)
> >
> > I run six times for the command: "$ sudo ceph osd tell 0/1 bench "
> > use "$ sudo ceph -w" to get the following results:
> >
> > osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.906598 sec
> > at 49775 KB/sec
> > osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.023294 sec
> > at 49384 KB/sec
> > osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.834682 sec
> > at 51535 KB/sec
> > osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 20.792697 sec
> > at 37547 KB/sec
> > osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.058412 sec
> > at 77191 KB/sec
> > osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.113612 sec
> > at 47369 KB/sec
> Okay, those are a bit slow but reasonable. Based on these I'd expect
> you to generally manage about 40-50MB/s (since everything is
> replicated; in a 2-disk configuration it'll just be the speed of your
> slowest disk), assuming a properly configured system.

Is "40-50MB/s" the speed of the bench run on the local btrfs disk,
not the speed of a bench run from the client to the OSD server?
At this speed, will a bench run from the client to the OSD server get
about 20~25MB/s (40~50MB/s / 2)?

>
> >> You can also run "rados -p data bench 60 write", and then "rados -p
> >> data bench 0 seq" to get a simpler (better understood) performance
> >> test.
> >
> > I run twice for the command: "rados -p data bench 60 write" ,
> > Get the results:
> > $ sudo rados -p data bench 60 write
> > ..........................
> > ..........................
> >
> > Total time run:        76.182225
> > Total writes made:     121
> > Write size:            4194304
> > Bandwidth (MB/sec):    6.219
> >
> > Average Latency:       13.3068
> > Max latency:           23.9986
> > Min latency:           7.01847
> WOAH. That's a lot of latency. Rather more than I'd expect to get just
> from seek times in a non-journaled environment. What's the round-trip
> time to ping the OSDs from your client? Are your disks okay?
>

Hi, I measured the RTT, and the disks are okay.

1. RTT:
client ping OSD0 (172.16.10.42): rtt min/avg/max/mdev = 0.112/0.116/0.122/0.005 ms
client ping OSD1 (172.16.10.65): rtt min/avg/max/mdev = 0.103/0.124/0.143/0.016 ms

2. disks

1) mon data is stored on sda4, which is formatted with mkfs.btrfs
$ lsscsi
[0:0:0:0]    disk    ATA      WDC WD3200AAKS-7 02.0  /dev/sda
$ fdisk -l
.............
/dev/sda4           17264       38913   173897568+  83  Linux

2) OSD0 data is stored on sda5, which is formatted with mkfs.btrfs
~$ lsscsi
[0:0:0:0]    disk    ATA      ST3320418AS      CC45  /dev/sda
$ fdisk -l
.......................
/dev/sda5           18480       38913   164131649   83  Linux

3) OSD1 data is stored on sda5, which is formatted with mkfs.btrfs
$ lsscsi
[0:0:0:0]    disk    ATA      ST3160812AS      3.AD  /dev/sda
$ fdisk -l
/dev/sda5           11185       19452    66405470+  83  Linux

More detailed logs are attached (see below).

Thank you
jeff.wu

> > but run "$ sudo rados -p data bench 0 seq"
> > fail to get the results, Maybe it's a bug,ceph version 0.23.
> Oh right, sorry. We put in a quick fix to make the write benchmark
> scale across multiple client writers and forgot to adjust the read
> benchmark. *oops*

Thank you!

logs:
=================================================================
transoft@ubuntu-mon0:/usr/local/etc/ceph$ ping 172.16.10.65
PING 172.16.10.65 (172.16.10.65) 56(84) bytes of data.
64 bytes from 172.16.10.65: icmp_req=1 ttl=64 time=0.127 ms
64 bytes from 172.16.10.65: icmp_req=2 ttl=64 time=0.121 ms
64 bytes from 172.16.10.65: icmp_req=3 ttl=64 time=0.127 ms
64 bytes from 172.16.10.65: icmp_req=4 ttl=64 time=0.124 ms
64 bytes from 172.16.10.65: icmp_req=5 ttl=64 time=0.128 ms
64 bytes from 172.16.10.65: icmp_req=6 ttl=64 time=0.128 ms
64 bytes from 172.16.10.65: icmp_req=7 ttl=64 time=0.127 ms
64 bytes from 172.16.10.65: icmp_req=8 ttl=64 time=0.126 ms
64 bytes from 172.16.10.65: icmp_req=9 ttl=64 time=0.126 ms
64 bytes from 172.16.10.65: icmp_req=10 ttl=64 time=0.128 ms
64 bytes from 172.16.10.65: icmp_req=11 ttl=64 time=0.120 ms
64 bytes from 172.16.10.65: icmp_req=12 ttl=64 time=0.127 ms
64 bytes from 172.16.10.65: icmp_req=13 ttl=64 time=0.120 ms
64 bytes from 172.16.10.65: icmp_req=14 ttl=64 time=0.125 ms
64 bytes from 172.16.10.65: icmp_req=15 ttl=64 time=0.131 ms
64 bytes from 172.16.10.65: icmp_req=16 ttl=64 time=0.125 ms
64 bytes from 172.16.10.65: icmp_req=17 ttl=64 time=0.124 ms
64 bytes from 172.16.10.65: icmp_req=18 ttl=64 time=0.103 ms
64 bytes from 172.16.10.65: icmp_req=19 ttl=64 time=0.124 ms
64 bytes from 172.16.10.65: icmp_req=20 ttl=64 time=0.125 ms
64 bytes from 172.16.10.65: icmp_req=21 ttl=64 time=0.143 ms
64 bytes from 172.16.10.65: icmp_req=22 ttl=64 time=0.127 ms
64 bytes from 172.16.10.65: icmp_req=23 ttl=64 time=0.117 ms
^C
--- 172.16.10.65 ping statistics ---
23 packets transmitted, 23 received, 0% packet loss, time 21999ms
rtt min/avg/max/mdev = 0.103/0.124/0.143/0.016 ms

transoft@ubuntu-mon0:/usr/local/etc/ceph$ ping 172.16.10.42
PING 172.16.10.42 (172.16.10.42) 56(84) bytes of data.
64 bytes from 172.16.10.42: icmp_req=1 ttl=64 time=0.121 ms
64 bytes from 172.16.10.42: icmp_req=2 ttl=64 time=0.116 ms
64 bytes from 172.16.10.42: icmp_req=3 ttl=64 time=0.122 ms
64 bytes from 172.16.10.42: icmp_req=5 ttl=64 time=0.117 ms
64 bytes from 172.16.10.42: icmp_req=6 ttl=64 time=0.112 ms
64 bytes from 172.16.10.42: icmp_req=7 ttl=64 time=0.117 ms
64 bytes from 172.16.10.42: icmp_req=8 ttl=64 time=0.112 ms
64 bytes from 172.16.10.42: icmp_req=9 ttl=64 time=0.112 ms
64 bytes from 172.16.10.42: icmp_req=10 ttl=64 time=0.112 ms
64 bytes from 172.16.10.42: icmp_req=11 ttl=64 time=0.117 ms
64 bytes from 172.16.10.42: icmp_req=12 ttl=64 time=0.114 ms
64 bytes from 172.16.10.42: icmp_req=13 ttl=64 time=0.121 ms
^C
--- 172.16.10.42 ping statistics ---
13 packets transmitted, 12 received, 7% packet loss, time 11999ms
rtt min/avg/max/mdev = 0.112/0.116/0.122/0.005 ms
transoft@ubuntu-mon0:/usr/local/etc/ceph$

transoft@ubuntu-mon0:~$ lsscsi
[0:0:0:0]    disk    ATA      WDC WD3200AAKS-7 02.0  /dev/sda
transoft@ubuntu-mon0:~$ sudo fdisk -l
[sudo] password for transoft:

Disk /dev/sda: 320.1 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00006279

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        4864    39061504   83  Linux
/dev/sda2            4864       17264    99607553    5  Extended
/dev/sda3           38914       38914           0+  83  Linux
/dev/sda4           17264       38913   173897568+  83  Linux
/dev/sda5            4864       17021    97654784   83  Linux
/dev/sda6           17021       17264     1951744   82  Linux swap / Solaris

Partition table entries are not in disk order
transoft@ubuntu-mon0:~$

transoft@ubuntu-osd0:~$ lsscsi
[0:0:0:0]    disk    ATA      ST3320418AS      CC45  /dev/sda
transoft@ubuntu-osd0:~$ sudo fdisk -l
[sudo] password for transoft:

Disk /dev/sda: 320.1 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x50000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        6079    48827392   83  Linux
/dev/sda2            6079       18237    97655808   83  Linux
/dev/sda3           18237       18480     1952768   82  Linux swap / Solaris
/dev/sda4           18480       38913   164131680+   5  Extended
/dev/sda5           18480       38913   164131649   83  Linux
transoft@ubuntu-osd0:~$

transoft@ubuntu-osd1:~$ lsscsi
[0:0:0:0]    disk    ATA      ST3160812AS      3.AD  /dev/sda
transoft@ubuntu-osd1:~$ sudo fdisk -l
[sudo] password for transoft:

Disk /dev/sda: 160.0 GB, 160000000000 bytes
255 heads, 63 sectors/track, 19452 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xc45cc45c

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        3648    29295616   83  Linux
/dev/sda2            3648       10942    58593280   83  Linux
/dev/sda3           10942       11185     1952768   82  Linux swap / Solaris
/dev/sda4           11185       19452    66405502    5  Extended
/dev/sda5           11185       19452    66405470+  83  Linux
transoft@ubuntu-osd1:~$
* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-11-30  6:19     ` Jeff Wu
@ 2010-11-30 17:07       ` Gregory Farnum
  2010-12-01  1:35         ` Jeff Wu
  0 siblings, 1 reply; 14+ messages in thread
From: Gregory Farnum @ 2010-11-30 17:07 UTC (permalink / raw)
  To: cpwu; +Cc: ceph-devel@vger.kernel.org, Andrew Lv

On Mon, Nov 29, 2010 at 10:19 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> Is "40-50MB/s" the speed that it run bench at local btrfs disk ?
> not the speed that run bench from client to osd server ?
> with this speed ,run bench from client to osd server ,will which get
> about 20~25MB/s( 40~50MB /2 )speed ?

Data on Ceph is replicated across 2 OSDs (by default; this is
configurable). So while figuring out potential performance involves a
lot of variables, in a simple case like this where you aren't bounded
by network bandwidth you'll find that your read/write performance
simply tracks the slower disk. I'd expect your Ceph tests (at least
the streaming ones) to run at 40-50MB/s.

Given that everything else is okay, I cannot stress enough that
running without a journal is going to cause significant performance
degradations. I have a hard time believing that it's responsible for
13-second latencies, but it's possible. So how about you set up a
journal (it can just be a file or new partition on the drives you're
already using) and report back your results after you do that. :)
Adding a journal to the OSDs lets them turn all their random writes
into streaming ones.
-Greg
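Editor's sketch of the reasoning in the message above, in Python. It assumes, as Greg describes, that every write lands on both OSDs and that the network is not the bottleneck, so sustained write throughput is bounded by the slower OSD. The function name is hypothetical, and the sample figures are the `ceph osd tell ... bench` rates (KB/sec) reported earlier in the thread; journaling and client-side overhead are ignored.

```python
from statistics import mean


def expected_write_kb_per_sec(per_osd_samples):
    """Estimate replicated write throughput as the slowest OSD's mean rate.

    per_osd_samples maps an OSD name to its measured bench rates in KB/sec.
    Assumption: each write goes to every listed OSD, so the slowest one
    sets the pace.
    """
    return min(mean(samples) for samples in per_osd_samples.values())


# Rates from the first (non-journaled) "ceph osd tell 0/1 bench" runs.
bench_kb_per_sec = {
    "osd0": [49775, 51535, 77191],
    "osd1": [49384, 37547, 47369],
}

estimate = expected_write_kb_per_sec(bench_kb_per_sec)
print(f"expected streaming write rate: about {estimate / 1024:.1f} MB/s")
```

With these inputs the estimate falls inside the 40-50MB/s range quoted above, which is the point of the "slowest disk" argument.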
* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-11-30 17:07       ` Gregory Farnum
@ 2010-12-01  1:35         ` Jeff Wu
  2010-12-01  6:59           ` Jeff Wu
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Wu @ 2010-12-01  1:35 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel@vger.kernel.org, Andrew Lv

On Wed, 2010-12-01 at 01:07 +0800, Gregory Farnum wrote:
> On Mon, Nov 29, 2010 at 10:19 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> > Is "40-50MB/s" the speed that it run bench at local btrfs disk ?
> > not the speed that run bench from client to osd server ?
> > with this speed ,run bench from client to osd server ,will which get
> > about 20~25MB/s( 40~50MB /2 )speed ?
> Data on Ceph is replicated across 2 OSDs (by default; this is
> configurable). So while figuring out potential performance involves a
> lot of variables, in a simple case like this where you aren't bounded
> by network bandwidth you'll find that your read/write performance
> simply tracks the slower disk. I'd expect your Ceph tests (at least
> the streaming ones) to run at 40-50MB/s.

Hi Greg, thank you very much for your quick reply.

>
> Given that everything else is okay, I cannot stress enough that
> running without a journal is going to cause significant performance
> degradations. I have a hard time believing that it's responsible for
> 13-second latencies, but it's possible. So how about you set up a
> journal (it can just be a file or new partition on the drives you're
> already using) and report back your results after you do that. :)

I will add a journal to ceph.conf and try it.

> Adding a journal to the OSDs lets them turn all their random writes
> into streaming ones.
> -Greg
* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-01  1:35         ` Jeff Wu
@ 2010-12-01  6:59           ` Jeff Wu
  2010-12-01 16:05             ` Gregory Farnum
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Wu @ 2010-12-01  6:59 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel@vger.kernel.org, Andrew Lv

On Wed, 2010-12-01 at 09:35 +0800, Jeff Wu wrote:
>
> On Wed, 2010-12-01 at 01:07 +0800, Gregory Farnum wrote:
> > On Mon, Nov 29, 2010 at 10:19 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> > > Is "40-50MB/s" the speed that it run bench at local btrfs disk ?
> > > not the speed that run bench from client to osd server ?
> > > with this speed ,run bench from client to osd server ,will which get
> > > about 20~25MB/s( 40~50MB /2 )speed ?
> > Data on Ceph is replicated across 2 OSDs (by default; this is
> > configurable). So while figuring out potential performance involves a
> > lot of variables, in a simple case like this where you aren't bounded
> > by network bandwidth you'll find that your read/write performance
> > simply tracks the slower disk. I'd expect your Ceph tests (at least
> > the streaming ones) to run at 40-50MB/s.
>
> Hi Greg,thank you very much for your quickly reply.
>
> >
> > Given that everything else is okay, I cannot stress enough that
> > running without a journal is going to cause significant performance
> > degradations. I have a hard time believing that it's responsible for
> > 13-second latencies, but it's possible. So how about you set up a
> > journal (it can just be a file or new partition on the drives you're
> > already using) and report back your results after you do that. :)
>
> I will add journal to ceph.conf to try it .
>

Hi Greg,

Following your suggestion, I added the journal config:
"
        osd data = /opt/ceph/data/osd$id
        osd journal = /home/transoft/data/osd$id/journal
        filestore journal writeahead = true
        osd journal size = 10000
"
to ceph.conf. The full ceph.conf is attached below.

Then I ran the command "$ sudo ceph osd tell 0/1 bench" six times and
got these results:

$ sudo ceph -w

osd0 172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 29.818194 sec at 28201 KB/sec
osd0 172.16.10.42:6800/17347 2 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 30.013058 sec at 34801 KB/sec
osd0 172.16.10.42:6800/17347 3 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 30.463511 sec at 30274 KB/sec

osd1 172.16.10.65:6800/4845 1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 165.067603 sec at 6329 KB/sec
osd1 172.16.10.65:6800/4845 2 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 181.034333 sec at 5782 KB/sec
osd1 172.16.10.65:6800/4845 3 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 196.055812 sec at 5334 KB/sec

I also used "dd" to test the raw drives and got these logs:

1. OSD0, /opt formatted with mkfs.btrfs

$ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024
1024+0 records in
1024+0 records out
2147483648 bytes transfered in 21.4497 secs(100 MB/sec)

2. OSD1, /opt formatted with mkfs.btrfs

~$ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024
1024+0 records in
1024+0 records out
2147483648 bytes transfered in 48.2037 secs(44.6 MB/sec)

Judging from these logs, the OSD1 disk speed might limit the test performance.

I also found an issue. Take the following steps:

$ mkcephfs -c ceph.conf -v --mkbtrfs -a
$ init-ceph - ceph.conf --btrfs -v -a start
then execute:
$ init-ceph - ceph.conf --btrfs -v -a stop

This command can't stop the cosd processes on OSD0 and OSD1:
OSD0:
/usr/local/bin/cosd -i 0 -c ceph.conf
OSD1:
/usr/local/bin/cosd -i 1 -c ceph.conf

Then I created the folder "/var/run/ceph" on the OSD0 and OSD1 hosts
manually and executed:
$ init-ceph - ceph.conf --btrfs -v -a stop
With the folder in place, this command can stop the cosd processes on
OSD0 and OSD1:
/usr/local/bin/cosd -i 0 -c ceph.conf
/usr/local/bin/cosd -i 1 -c ceph.conf

Thanks,
Jeff.Wu

> > Adding a journal to the OSDs lets them turn all their random writes
> > into streaming ones.
> > -Greg

=========================================================
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench
2010-12-01 10:45:13.670910 mon <- [osd,tell,0,bench]
2010-12-01 10:45:13.671180 mon1 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench
2010-12-01 10:45:29.350198 mon <- [osd,tell,0,bench]
2010-12-01 10:45:29.350457 mon1 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench
2010-12-01 10:45:31.000281 mon <- [osd,tell,0,bench]
2010-12-01 10:45:31.000560 mon0 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench
2010-12-01 10:45:34.860782 mon <- [osd,tell,1,bench]
2010-12-01 10:45:34.861020 mon1 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench
2010-12-01 10:45:36.760811 mon <- [osd,tell,1,bench]
2010-12-01 10:45:36.761161 mon2 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench
2010-12-01 10:45:37.530714 mon <- [osd,tell,1,bench]
2010-12-01 10:45:37.530968 mon2 -> 'ok' (0)

transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph -w
2010-12-01 10:44:59.450653 pg v13: 528 pgs: 528 active+clean; 12 KB data, 5304 KB used, 219 GB / 219 GB avail
2010-12-01 10:44:59.451365 mds e5: 1/1/1 up {0=up:active}, 1 up:standby
2010-12-01 10:44:59.451387 osd e6: 2 osds: 2 up, 2 in
2010-12-01 10:44:59.451412 log 2010-12-01 10:43:43.044865 mon0 172.16.10.171:6789/0 7 : [INF] mds0 172.16.10.171:6801/2482 up:active
2010-12-01 10:44:59.451440 mon e1: 3 mons at {0=172.16.10.171:6789/0,1=172.16.10.171:6790/0,2=172.16.10.171:6791/0}
2010-12-01 10:46:45.000262 log 2010-12-01 10:45:15.599526 osd0 172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 29.818194 sec at 28201 KB/sec
2010-12-01 10:46:45.000262 log 2010-12-01 10:45:46.062142 osd0 172.16.10.42:6800/17347 2 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 30.013058 sec at 34801 KB/sec
2010-12-01 10:46:45.000262 log 2010-12-01 10:46:16.836607 osd0 172.16.10.42:6800/17347 3 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 30.463511 sec at 30274 KB/sec
2010-12-01 10:48:20.042152 pg v14: 528 pgs: 528 active+clean; 32780 KB data, 888 MB used, 218 GB / 219 GB avail
2010-12-01 10:50:50.038298 pg v15: 528 pgs: 528 active+clean; 73740 KB data, 54928 KB used, 219 GB / 219 GB avail
2010-12-01 10:52:15.074470 pg v16: 528 pgs: 528 active+clean; 73740 KB data, 79440 KB used, 219 GB / 219 GB avail
2010-12-01 10:54:55.546098 log 2010-12-01 11:52:34.244851 osd1 172.16.10.65:6800/4845 1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 165.067603 sec at 6329 KB/sec
2010-12-01 10:54:55.546098 log 2010-12-01 11:55:52.010739 osd1 172.16.10.65:6800/4845 2 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 181.034333 sec at 5782 KB/sec
2010-12-01 10:54:55.546098 log 2010-12-01 11:59:09.560115 osd1 172.16.10.65:6800/4845 3 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 196.055812 sec at 5334 KB/sec
2010-12-01 10:55:01.001357 pg v17: 528 pgs: 528 active+clean; 73741 KB data, 1106 MB used, 218 GB / 219 GB avail

============ceph.conf====================
;
; Sample ceph ceph.conf file.
;
; This file defines cluster membership, the various locations
; that Ceph stores data, and any other runtime options.

; If a 'host' is defined for a daemon, the start/stop script will
; verify that it matches the hostname (or else ignore it). If it is
; not defined, it is assumed that the daemon is intended to start on
; the current host (e.g., in a setup with a startup.conf on each
; node).

; global
[global]
        ; enable secure authentication
        ; auth supported = cephx
        keyring = /etc/ceph/keyring.bin

; monitors
; You need at least one. You need at least three if you want to
; tolerate any node failures. Always create an odd number.
[mon]
        mon data = /opt/ceph/data/mon$id
        ;mon data = /home/transoft/data/mon$id

        ; logging, for debugging monitor crashes, in order of
        ; their likelihood of being helpful :)
        ;debug ms = 20
        ;debug mon = 20
        ;debug paxos = 20
        ;debug auth = 20

[mon0]
        host = ubuntu-mon0
        mon addr = 172.16.10.171:6789

[mon1]
        host = ubuntu-mon0
        mon addr = 172.16.10.171:6790

[mon2]
        host = ubuntu-mon0
        mon addr = 172.16.10.171:6791

; mds
; You need at least one. Define two to get a standby.
[mds]
        ; where the mds keeps its secret encryption keys
        keyring = /etc/ceph/keyring.$name

        ; mds logging to debug issues.
        ;debug ms = 20
        ;debug mds = 20

[mds.0]
        host = ubuntu-mon0

[mds.1]
        host = ubuntu-mon0

; osd
; You need at least one. Two if you want data to be replicated.
; Define as many as you like.
[osd]
        ; This is where the btrfs volume will be mounted.
        ;osd data = /opt/ceph/data/osd$id
        osd class tmp = /var/lib/ceph/tmp

        ; Ideally, make this a separate disk or partition. A few
        ; hundred MB should be enough; more if you have fast or many
        ; disks. You can use a file under the osd data dir if need be
        ; (e.g. /data/osd$id/journal), but it will be slower than a
        ; separate disk or partition.

        ; This is an example of a file-based journal.
        ;osd journal = /home/transoft/data/osd$id/journal
        ;filestore journal writeahead = true
        ; journal size, in megabytes
        ;osd journal size = 1000
        keyring = /etc/ceph/keyring.$name

        ; osd logging to debug osd issues, in order of likelihood of being
        ; helpful
        ;debug ms = 20
        ;debug osd = 20
        ;debug filestore = 20
        ;debug journal = 20

[osd0]
        host = ubuntu-osd0
        osd data = /opt/ceph/data/osd$id
        osd journal = /home/transoft/data/osd$id/journal
        filestore journal writeahead = true
        osd journal size = 10000

        ; if 'btrfs devs' is not specified, you're responsible for
        ; setting up the 'osd data' dir. if it is not btrfs, things
        ; will behave up until you try to recover from a crash (which
        ; usually fine for basic testing).
        ; btrfs devs = /dev/sdx

[osd1]
        host = ubuntu-osd1
        osd data = /opt/ceph/data/osd$id
        osd journal = /home/transoft/data/osd$id/journal
        filestore journal writeahead = true
        osd journal size = 10000
        ;btrfs devs = /dev/sdy

;[osd2]
;host = zeta
;btrfs devs = /dev/sdx

;[osd3]
;host = eta
;btrfs devs = /dev/sdy
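Editor's sketch: the `bench` lines in the `ceph -w` output of this message can be averaged per OSD with a few lines of Python. The regular expression is an assumption fitted to the log lines as pasted here (it is not an official Ceph log format definition), the function name is hypothetical, and it expects each bench line to be unwrapped onto a single line.

```python
import re
from collections import defaultdict
from statistics import mean

# Matches e.g. "osd0 172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB
# in blocks of 4096 KB in 29.818194 sec at 28201 KB/sec" (on one line).
BENCH_RE = re.compile(
    r"(?P<osd>osd\d+) \S+ \d+ : \[INF\] bench: wrote (?P<mb>\d+) MB "
    r"in blocks of (?P<block_kb>\d+) KB in (?P<secs>[\d.]+) sec "
    r"at (?P<rate>\d+) KB/sec"
)


def mean_bench_rate(log_text):
    """Return {osd name: mean reported bench rate in KB/sec}."""
    rates = defaultdict(list)
    for match in BENCH_RE.finditer(log_text):
        rates[match.group("osd")].append(int(match.group("rate")))
    return {osd: mean(values) for osd, values in rates.items()}


# The six bench lines reported after the journal was enabled.
SAMPLE = """\
osd0 172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 29.818194 sec at 28201 KB/sec
osd0 172.16.10.42:6800/17347 2 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 30.013058 sec at 34801 KB/sec
osd0 172.16.10.42:6800/17347 3 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 30.463511 sec at 30274 KB/sec
osd1 172.16.10.65:6800/4845 1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 165.067603 sec at 6329 KB/sec
osd1 172.16.10.65:6800/4845 2 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 181.034333 sec at 5782 KB/sec
osd1 172.16.10.65:6800/4845 3 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 196.055812 sec at 5334 KB/sec
"""

for osd, rate in sorted(mean_bench_rate(SAMPLE).items()):
    print(f"{osd}: {rate} KB/sec on average")
```

Because the pattern is searched rather than anchored, it also matches the timestamp-prefixed lines in the full log.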
* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-01  6:59           ` Jeff Wu
@ 2010-12-01 16:05             ` Gregory Farnum
  2010-12-02  1:38               ` Jeff Wu
  0 siblings, 1 reply; 14+ messages in thread
From: Gregory Farnum @ 2010-12-01 16:05 UTC (permalink / raw)
  To: cpwu; +Cc: ceph-devel@vger.kernel.org, Andrew Lv

On Tue, Nov 30, 2010 at 10:57 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> Hi ,greg,
>
> With your suggestions, i add the journal config:
> "
> osd data = /opt/ceph/data/osd$id
> osd journal = /home/transoft/data/osd$id/journal
> filestore journal writeahead = true
> osd journal size = 10000
> "
> to ceph.conf. the detail ceph.conf attached below.
>
> then , run six times for the command: "$ sudo ceph osd tell 0/1 bench" ,get
> the results:
>
>
> $ sudo ceph -w
>
> osd0 172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB in blocks of
> 4096 KB in 29.818194 sec at 28201 KB/sec
> osd0 172.16.10.42:6800/17347 2 : [INF] bench: wrote 1024 MB in blocks of
> 4096 KB in 30.013058 sec at 34801 KB/sec
> osd0 172.16.10.42:6800/17347 3 : [INF] bench: wrote 1024 MB in blocks of
> 4096 KB in 30.463511 sec at 30274 KB/sec
>
> osd1 172.16.10.65:6800/4845 1 : [INF] bench: wrote 1024 MB in blocks of 4096
> KB in 165.067603 sec at 6329 KB/sec
> osd1 172.16.10.65:6800/4845 2 : [INF] bench: wrote 1024 MB in blocks of 4096
> KB in 181.034333 sec at 5782 KB/sec
> osd1 172.16.10.65:6800/4845 3 : [INF] bench: wrote 1024 MB in blocks of 4096
> KB in 196.055812 sec at 5334 KB/sec
>
> and i also use "dd" to test raw drive, get the logs:
>
> 1. OSD0, mkfs.btrfs format /opt
>
> $ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024
> 1024+0 records in
> 1024+0 records out
> 2147483648 bytes transfered in 21.4497 secs(100 MB/sec)
>
> 2. OSD1 ,mkfs. btrfs format /opt
>
> ~$ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024
> 1024+0 records in
> 1024+0 records out
> 2147483648 bytes transfered in 48.2037 secs(44.6 MB/sec)
>
> with these logs, OSD1 disk speed might limit the test performance.

Yes, it looks to me like your OSD1 disk isn't working properly.
Switching it from one streaming write to two streaming writes
shouldn't reduce it to 20% of its original performance (~45MB/s to
5MB/s*2). The same change in tasks doesn't impact OSD0's disk at all.

> and i also detect a issue ,take the following steps:
>
> $ mkcephfs -c ceph.conf -v --mkbtrfs -a
> $ init-ceph - ceph.conf --btrfs -v -a start
> then execute:
> $ init-ceph - ceph.conf --btrfs -v -a stop
>
> this command can't stop OSD0 and OSD1 cosd process:
> OSD0:
> /usr/local/bin/cosd -i 0 -c ceph.conf
> OSD1:
> /usr/local/bin/cosd -i 1 -c ceph.conf

Not sure I understand what you're doing here. Also, it looks like
you've got a malformed command there -- you don't specify the "-c"
option, just the nonexistent "-" option. ;)
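Editor's sketch of the arithmetic behind the "20%" remark above, in Python. It uses Greg's rounded figures (about 5 MB/s per bench stream with the journal, about 45 MB/s from the plain `dd` run) and assumes, as his "two streaming writes" wording suggests, that with a file-based journal each benchmark write reaches the disk twice. The function name and the `writes_per_op` parameter are illustrative, not part of any Ceph tool.

```python
def journaled_efficiency(bench_mb_per_sec, raw_mb_per_sec, writes_per_op=2):
    """Fraction of the raw streaming rate the disk sustains under the bench.

    Assumption: each benchmark write hits the disk writes_per_op times
    (journal plus data store), so the disk-level rate is the bench rate
    multiplied by writes_per_op.
    """
    return bench_mb_per_sec * writes_per_op / raw_mb_per_sec


osd1 = journaled_efficiency(bench_mb_per_sec=5, raw_mb_per_sec=45)
print(f"OSD1 sustains {osd1:.0%} of its single-stream dd rate")
```

A healthy disk would be expected to stay much closer to 100% here, which is why the result points at the OSD1 drive rather than at Ceph.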
* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-01 16:05             ` Gregory Farnum
@ 2010-12-02  1:38               ` Jeff Wu
  2010-12-02  2:35                 ` Gregory Farnum
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Wu @ 2010-12-02  1:38 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel@vger.kernel.org, Andrew Lv

On Thu, 2010-12-02 at 00:05 +0800, Gregory Farnum wrote:
> On Tue, Nov 30, 2010 at 10:57 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> > Hi ,greg,
> >
> > With your suggestions, i add the journal config:
> > "
> > osd data = /opt/ceph/data/osd$id
> > osd journal = /home/transoft/data/osd$id/journal
> > filestore journal writeahead = true
> > osd journal size = 10000
> > "
> > to ceph.conf. the detail ceph.conf attached below.
> >
> > then , run six times for the command: "$ sudo ceph osd tell 0/1 bench" ,get
> > the results:
> >
> >
> > $ sudo ceph -w
> >
> > osd0 172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB in blocks of
> > 4096 KB in 29.818194 sec at 28201 KB/sec
> > osd0 172.16.10.42:6800/17347 2 : [INF] bench: wrote 1024 MB in blocks of
> > 4096 KB in 30.013058 sec at 34801 KB/sec
> > osd0 172.16.10.42:6800/17347 3 : [INF] bench: wrote 1024 MB in blocks of
> > 4096 KB in 30.463511 sec at 30274 KB/sec
> >
> > osd1 172.16.10.65:6800/4845 1 : [INF] bench: wrote 1024 MB in blocks of 4096
> > KB in 165.067603 sec at 6329 KB/sec
> > osd1 172.16.10.65:6800/4845 2 : [INF] bench: wrote 1024 MB in blocks of 4096
> > KB in 181.034333 sec at 5782 KB/sec
> > osd1 172.16.10.65:6800/4845 3 : [INF] bench: wrote 1024 MB in blocks of 4096
> > KB in 196.055812 sec at 5334 KB/sec
> >
> > and i also use "dd" to test raw drive, get the logs:
> >
> > 1. OSD0, mkfs.btrfs format /opt
> >
> > $ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024
> > 1024+0 records in
> > 1024+0 records out
> > 2147483648 bytes transfered in 21.4497 secs(100 MB/sec)
> >
> > 2. OSD1 ,mkfs. btrfs format /opt
> >
> > ~$ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024
> > 1024+0 records in
> > 1024+0 records out
> > 2147483648 bytes transfered in 48.2037 secs(44.6 MB/sec)
> >
> > with these logs, OSD1 disk speed might limit the test performance.
> Yes, it looks to me like your OSD1 disk isn't working properly.
> Switching it from one streaming write to two streaming writes
> shouldn't reduce it to 20% of its original performance (~45MB/s to
> 5MB/s*2). The same change in tasks doesn't impact OSD0's disk at all.
>

Hi Greg, thank you for your detailed comments.

> > and i also detect a issue ,take the following steps:
> >
> > $ mkcephfs -c ceph.conf -v --mkbtrfs -a
> > $ init-ceph - ceph.conf --btrfs -v -a start
> > then execute:
> > $ init-ceph - ceph.conf --btrfs -v -a stop
> >
> > this command can't stop OSD0 and OSD1 cosd process:
> > OSD0:
> > /usr/local/bin/cosd -i 0 -c ceph.conf
> > OSD1:
> > /usr/local/bin/cosd -i 1 -c ceph.conf
> Not sure I understand what you're doing here. Also, it looks like
> you've got a malformed command there -- you don't specify the "-c"
> option, just the nonexistent "-" option. ;)

Oh, sorry. I mean that if I don't create the folder "/var/run/ceph" on
the OSD hosts manually, the command
"$ init-ceph -c ceph.conf --btrfs -v -a stop" can't stop the cosd
processes on the OSD hosts, like this:

OSD0 host:
$ ps -ef | grep cosd
root     13987     1  0 Dec01 ?        00:02:55 /usr/local/bin/cosd -i 0 -c ceph.conf

OSD1 host:
$ ps -ef | grep cosd
root     14028     1  0 Dec01 ?        00:02:53 /usr/local/bin/cosd -i 1 -c ceph.conf

I have to execute "kill -9 13987" and "kill -9 14028" to kill the cosd
processes manually; otherwise the next
"$ init-ceph -c ceph.conf --btrfs -v -a start" command will fail.

Jeff Wu
* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-02  1:38               ` Jeff Wu
@ 2010-12-02  2:35                 ` Gregory Farnum
  2010-12-02  3:22                   ` Jeff Wu
  0 siblings, 1 reply; 14+ messages in thread
From: Gregory Farnum @ 2010-12-02  2:35 UTC (permalink / raw)
  To: cpwu, Sage Weil; +Cc: ceph-devel@vger.kernel.org, Andrew Lv

On Wed, Dec 1, 2010 at 5:38 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
>> > and i also detect a issue ,take the following steps:
>> >
>> > $ mkcephfs -c ceph.conf -v --mkbtrfs -a
>> > $ init-ceph - ceph.conf --btrfs -v -a start
>> > then execute:
>> > $ init-ceph - ceph.conf --btrfs -v -a stop
>> >
>> > this command can't stop OSD0 and OSD1 cosd process:
>> > OSD0:
>> > /usr/local/bin/cosd -i 0 -c ceph.conf
>> > OSD1:
>> > /usr/local/bin/cosd -i 1 -c ceph.conf
>> Not sure I understand what you're doing here. Also, it looks like
>> you've got a malformed command there -- you don't specify the "-c"
>> option, just the nonexistent "-" option. ;)
>
> Oh,Sorry, i mean that , if i don't create folder "/var/run/ceph" at OSD
> hosts manually. Execute the command : "$init-ceph -c ceph.conf --btrfs
> -v -a stop " ,which can't auto-kill OSD host cosd process, like this :
> OSD0 host:
> $ ps -ef | grep cosd
> root 13987 1 0 Dec01 ? 00:02:55 /usr/local/bin/cosd -i 0
> -c ceph.conf
> OSD1 host:
> $ ps -ef | grep cosd
> root 14028 1 0 Dec01 ? 00:02:53 /usr/local/bin/cosd -i 1
> -c ceph.conf
>
> I have to execute "kill -9 13987 " and "kill -9 14028" to kill cosd
> process manually, or ,next time , it will fail to execute "$ init-ceph
> -c ceph.conf --btrfs -v -a start " command.

Ah. I think that /var/run/ceph is where init-ceph stores the PIDs. It
ought to be created automatically; if it's not we should fix that.
What version of Ceph are you running, and where from? I'd imagine it's
being packaged wrong or something.
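Editor's sketch of the workaround discussed above, in Python: make sure the PID directory exists before starting the daemons. This is not how init-ceph itself behaves; that `/var/run/ceph` holds the PID files is Greg's guess in this thread, and the function name is hypothetical. Creating the real directory normally needs root, so the demonstration runs against a scratch directory.

```python
import os
import tempfile


def ensure_pid_dir(path="/var/run/ceph", mode=0o755):
    """Create the PID directory if it is missing.

    Returns True if the directory had to be created, False if it was
    already present.
    """
    existed = os.path.isdir(path)
    os.makedirs(path, mode=mode, exist_ok=True)
    return not existed


# Demonstrate on a temporary directory so no root privileges are needed.
with tempfile.TemporaryDirectory() as scratch:
    demo_dir = os.path.join(scratch, "run", "ceph")
    print(ensure_pid_dir(demo_dir))  # first call creates the directory
    print(ensure_pid_dir(demo_dir))  # second call finds it already there
```

On the OSD hosts in this thread the same effect was achieved by creating the folder by hand before running the stop command.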
* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-02  2:35 ` Gregory Farnum
@ 2010-12-02  3:22   ` Jeff Wu
  2010-12-02  6:10     ` Sage Weil
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Wu @ 2010-12-02  3:22 UTC
To: Gregory Farnum; +Cc: Sage Weil, ceph-devel@vger.kernel.org, Andrew Lv

On Thu, 2010-12-02 at 10:35 +0800, Gregory Farnum wrote:
> Ah. I think that /var/run/ceph is where init-ceph stores the PIDs. It
> ought to be created automatically; if it's not we should fix that.
> What version of Ceph are you running, and where from? I'd imagine it's
> being packaged wrong or something.

Hi,

I downloaded Ceph 0.23 from
http://ceph.newdream.net/download/ceph-0.23.tar.gz

So maybe I should add

"
[global]
        pid file = /var/run/ceph/$name.pid
        ..............................
"

to ceph.conf?
* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-02  3:22 ` Jeff Wu
@ 2010-12-02  6:10   ` Sage Weil
  2010-12-02  7:31     ` Jeff Wu
  0 siblings, 1 reply; 14+ messages in thread
From: Sage Weil @ 2010-12-02  6:10 UTC
To: Jeff Wu; +Cc: Gregory Farnum, ceph-devel@vger.kernel.org, Andrew Lv

On Thu, 2 Dec 2010, Jeff Wu wrote:
> On Thu, 2010-12-02 at 10:35 +0800, Gregory Farnum wrote:
> > Ah. I think that /var/run/ceph is where init-ceph stores the PIDs. It
> > ought to be created automatically; if it's not we should fix that.
> > What version of Ceph are you running, and where from? I'd imagine it's
> > being packaged wrong or something.
>
> Hi,
> I downloaded Ceph 0.23 from
> http://ceph.newdream.net/download/ceph-0.23.tar.gz
>
> So maybe I should add
> "
> [global]
>         pid file = /var/run/ceph/$name.pid

That's the default, so no... I think the problem is that 'make install'
doesn't do 'mkdir -p /var/run/ceph'. The .deb and .rpm create the dir, but
an install from source does not. There is probably a similar problem with
the osd class tmp dir.

sage
* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-02  6:10 ` Sage Weil
@ 2010-12-02  7:31   ` Jeff Wu
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff Wu @ 2010-12-02  7:31 UTC
To: Sage Weil; +Cc: Gregory Farnum, ceph-devel@vger.kernel.org, Andrew Lv

On Thu, 2010-12-02 at 14:10 +0800, Sage Weil wrote:
> That's the default, so no... I think the problem is that 'make install'
> doesn't do 'mkdir -p /var/run/ceph'. The .deb and .rpm create the dir,
> but an install from source does not. There is probably a similar problem
> with the osd class tmp dir.

Yes. When I try to use RBD, I also need to create "/var/lib/ceph/tmp"
manually. I installed the Ceph server from ceph-0.23.tar.gz.

Jeff Wu
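The manual workaround from this thread (creating the directories that a source install leaves missing) can be sketched as a small shell function. The two default paths are the ones named in the messages above; the function name ensure_ceph_dirs and the option to pass other paths as arguments are assumptions for illustration, not part of Ceph.

```shell
#!/bin/sh
# Hypothetical helper, NOT shipped with Ceph: create the runtime directories
# that 'make install' from the source tarball reportedly does not create.
ensure_ceph_dirs() {
    # Default to the two paths mentioned in this thread; pass other paths
    # as arguments to override them.
    if [ "$#" -eq 0 ]; then
        set -- /var/run/ceph /var/lib/ceph/tmp
    fi

    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            # mkdir -p also creates any missing parent directories.
            mkdir -p "$dir" || return 1
            echo "created $dir"
        fi
    done
}
```

Running `ensure_ceph_dirs` as root on each MON, MDS and OSD host before the first start would cover both cases reported here. Directories that already exist are left untouched, so the call is safe to repeat.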
Thread overview: 14+ messages
[not found] <1291001135.1872.106.camel@cephhost>
[not found] ` <1291001398.1872.107.camel@cephhost>
[not found] ` <1291001962.1872.113.camel@cephhost>
[not found] ` <1291002250.1872.116.camel@cephhost>
2010-11-29 3:53 ` Performence test on ceph v0.23 + EXT4 and Btrfs Jeff Wu
2010-11-29 17:07 ` Gregory Farnum
2010-11-30 2:55 ` Jeff Wu
2010-11-30 3:18 ` Gregory Farnum
2010-11-30 6:19 ` Jeff Wu
2010-11-30 17:07 ` Gregory Farnum
2010-12-01 1:35 ` Jeff Wu
2010-12-01 6:59 ` Jeff Wu
2010-12-01 16:05 ` Gregory Farnum
2010-12-02 1:38 ` Jeff Wu
2010-12-02 2:35 ` Gregory Farnum
2010-12-02 3:22 ` Jeff Wu
2010-12-02 6:10 ` Sage Weil
2010-12-02 7:31 ` Jeff Wu