* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
       [not found]     ` <1291002250.1872.116.camel@cephhost>
@ 2010-11-29  3:53       ` Jeff Wu
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff Wu @ 2010-11-29  3:53 UTC (permalink / raw)
  To: sage, yehuda, gregf, colinm; +Cc: ceph-devel, mllv

Re-sending as plain text (the original was HTML).
=================================================

Hi,

I've recently been using FFSB and iozone to run performance tests with Ceph v0.23 on my platform.
The FFSB configuration files and ceph.conf are attached.

I am using one server (172.16.10.171) for the MDS and MON daemons and as the client host,
one server (172.16.10.42) for OSD0, and one server (172.16.10.65) for OSD1.
All three machines have Gigabit Ethernet cards and are connected through a Gigabit router.
The disks are formatted with either ext4 (in no-journal mode) or btrfs, depending on the test.

The following are my platform details and test results:
Ceph: 0.23
OS: Ubuntu 10.10 x86_64, 2.6.35-22-generic kernel.

Ethernet: one Gigabit

MON/MDS/client host:
CPU: Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz
Memory: 2GB
Client:
mount.ceph 172.16.10.171:6789:/  /mnt/ceph

OSD0 host:
CPU:Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz
Memory: 2GB

OSD1 host:
CPU:AMD Athlon(tm) 64 X2 Dual Core Processor 3600+
Memory: 4GB


1) Here are the FFSB test results on the ceph+btrfs disk:

                    8 threads           16 threads            32 threads
large_file_create   14.7 MB/sec         16.4 MB/sec           17.8 MB/sec
sequential_reads    15.5 MB/sec         16 MB/sec             17 MB/sec
random_reads        490 KB/sec          594 KB/sec            664 KB/sec
random_writes       57.2 MB/sec         68.4 MB/sec           72.1 MB/sec
mailserver
  Read              85.8 KB/sec         236 KB/sec            286 KB/sec
  Write             36 KB/sec           132 KB/sec            129 KB/sec


2) For comparison, here are the FFSB test results on the ceph+ext4 disk with no journal:

                    8 threads           16 threads            32 threads
large_file_create   7.92 MB/sec         8.09 MB/sec           8.46 MB/sec
sequential_reads    8.19 MB/sec         8.77 MB/sec           8.14 MB/sec
random_reads        786 KB/sec          556 KB/sec            170 KB/sec
random_writes       52.9 MB/sec         63 MB/sec             59.1 MB/sec
mailserver
  Read              456 KB/sec          249 KB/sec            485 KB/sec
  Write             228 KB/sec          120 KB/sec            226 KB/sec
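
As a rough cross-check of the two FFSB tables above, the sketch below computes the btrfs-to-ext4 throughput ratio for the three workloads reported in MB/sec. The numbers are copied from the tables; the variable and function names are made up for illustration, and the ratios say nothing about why the two setups differ.

```python
# FFSB throughput copied from the two tables above, in MB/sec,
# for 8, 16 and 32 threads respectively.
btrfs = {
    "large_file_create": [14.7, 16.4, 17.8],
    "sequential_reads": [15.5, 16.0, 17.0],
    "random_writes": [57.2, 68.4, 72.1],
}
ext4 = {
    "large_file_create": [7.92, 8.09, 8.46],
    "sequential_reads": [8.19, 8.77, 8.14],
    "random_writes": [52.9, 63.0, 59.1],
}
threads = [8, 16, 32]


def ratios(a, b):
    """Return the element-wise ratio a/b, rounded to two decimals."""
    return [round(x / y, 2) for x, y in zip(a, b)]


# Ratio above 1 means the btrfs run was faster than the ext4 run.
speedup = {name: ratios(btrfs[name], ext4[name]) for name in btrfs}

for name, values in speedup.items():
    cells = ", ".join(f"{t} threads: {v}x" for t, v in zip(threads, values))
    print(f"{name:18s} {cells}")
```

On these numbers, large file creation and sequential reads come out at roughly twice the ext4 rate on btrfs, while random writes differ far less.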

3) Here are the iozone test results on the ceph+btrfs disk (file size 6 GB; output is in Kbytes/sec):

	Iozone: Performance Test of File I/O
	        Version $Revision: 3.353 $
		Compiled for 64 bit mode.
		Build: linux-ia64 

	Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
	             Al Slater, Scott Rhine, Mike Wisner, Ken Goss
	             Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
	             Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
	             Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
	             Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
	             Fabrice Bacchella, Zhenghua Xue, Qin Li.

	Run began: Fri Nov 26 09:00:33 2010

	Include close in write timing
	Include fsync in write timing
	Auto Mode
	Using minimum file size of 6291456 kilobytes.
	Using maximum file size of 6291456 kilobytes.
	Excel chart generation enabled
	Command line used: ./benchmark/iozone/iozone_x86_64 -c -e -a -n 6144M -g 6144M -i 0 -i 1 -i 2 -f /mnt/ceph/f1 -Rb ./benchmark/iozone/iozone.201011260900.xls
	Output is in Kbytes/sec
	Time Resolution = 0.000001 seconds.
	Processor cache size set to 1024 Kbytes.
	Processor cache line size set to 32 bytes.
	File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride                                   
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456      64    6627    6417     9898    10334    3629    5908                                                          
         6291456     128    6803    7182    10200    10582    5106    6268                                                          
         6291456     256    6734    7249    10821    11224    7348    7135                                                          
         6291456     512    7109    7213    10538    10682    9392    7788                                                          
         6291456    1024    6932    7616    11204    10873    8673    8467                                                          
         6291456    2048    7896    7669    11025     9981   10258    7770                                                          
         6291456    4096    6933    7084    10590    10703   10450    7758                                                          
         6291456    8192    7215    7192    10490    10700   11110    7838                                                          
         6291456   16384    6557    6646    10224    11179   10738    7062                                                          

4) For comparison, here are the iozone test results on the ceph+ext4 disk with no journal (file size 6 GB; output is in Kbytes/sec):

	Iozone: Performance Test of File I/O
	        Version $Revision: 3.353 $
		Compiled for 64 bit mode.
		Build: linux-ia64 

	Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
	             Al Slater, Scott Rhine, Mike Wisner, Ken Goss
	             Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
	             Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
	             Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
	             Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
	             Fabrice Bacchella, Zhenghua Xue, Qin Li.

	Run began: Thu Nov 25 08:42:25 2010

	Include close in write timing
	Include fsync in write timing
	Auto Mode
	Using minimum file size of 6291456 kilobytes.
	Using maximum file size of 6291456 kilobytes.
	Excel chart generation enabled
	Command line used: ./benchmark/iozone/iozone_x86_64 -c -e -a -n 6144M -g 6144M -i 0 -i 1 -i 2 -f /mnt/ceph/f1 -Rb ./benchmark/iozone/iozone.201011250841.xls
	Output is in Kbytes/sec
	Time Resolution = 0.000001 seconds.
	Processor cache size set to 1024 Kbytes.
	Processor cache line size set to 32 bytes.
	File stride size set to 17 * record size.
                                                            random  random    bkwd   record   stride                                   
              KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
         6291456      64    7214    7128     9847     9991    3112    4168                                                          
         6291456     128    7514    7367    10601    10281    4667    6041                                                          
         6291456     256    7420    7414    11041    11238    6933    7860                                                          
         6291456     512    8190    8120    11449    11166    9001    7959                                                          
         6291456    1024    7611    7702    10497    10391    7497    8887                                                          
         6291456    2048    7516    7408     9908    10254    8639    8387                                                          
         6291456    4096    7355    7453    10383    10598    9469    7554                                                          
         6291456    8192    7415    7651    10244    10240    9868    8450                                                          
         6291456   16384    7200    7166     9877     9778    9829    8228         
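
To put the two iozone tables in perspective, the sketch below averages the sequential write and read columns over all record lengths and compares them with the raw line rate of Gigabit Ethernet. The figures are copied from the tables; the line rate ignores protocol overhead, so it is only an upper bound, and all names in the code are illustrative.

```python
from statistics import mean

# Sequential "write" and "read" columns (Kbytes/sec) from the two iozone
# tables, one value per record length from 64 KB to 16384 KB.
iozone = {
    "btrfs": {
        "write": [6627, 6803, 6734, 7109, 6932, 7896, 6933, 7215, 6557],
        "read": [9898, 10200, 10821, 10538, 11204, 11025, 10590, 10490, 10224],
    },
    "ext4": {
        "write": [7214, 7514, 7420, 8190, 7611, 7516, 7355, 7415, 7200],
        "read": [9847, 10601, 11041, 11449, 10497, 9908, 10383, 10244, 9877],
    },
}

# Raw Gigabit Ethernet line rate expressed in Kbytes/sec (1 Kbyte = 1024 bytes).
GIGABIT_KB_PER_SEC = 1_000_000_000 / 8 / 1024


def summarize(columns):
    """Average each column over all record lengths."""
    return {op: mean(values) for op, values in columns.items()}


summary = {fs: summarize(cols) for fs, cols in iozone.items()}

for fs, ops in summary.items():
    for op, kb in ops.items():
        share = 100 * kb / GIGABIT_KB_PER_SEC
        print(f"{fs:5s} {op:5s} {kb / 1024:5.1f} MB/sec  ({share:4.1f}% of line rate)")
```

Both configurations average about 7 MB/sec for writes and about 10 MB/sec for reads, which is under a tenth of what a single Gigabit link can carry.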


Are these results reasonable? They seem too slow; maybe I am doing something wrong.
Could you share some Ceph performance test results for reference?

If you have any ideas, please let me know. Thanks.

Jeff.Wu




=============================ceph.conf =======================================

;
; Sample ceph ceph.conf file.
;
; This file defines cluster membership, the various locations
; that Ceph stores data, and any other runtime options.

; If a 'host' is defined for a daemon, the start/stop script will
; verify that it matches the hostname (or else ignore it).  If it is
; not defined, it is assumed that the daemon is intended to start on
; the current host (e.g., in a setup with a startup.conf on each
; node).

; global
[global]
	; enable secure authentication
	; auth supported = cephx
	keyring = /etc/ceph/keyring.bin
; monitors
;  You need at least one.  You need at least three if you want to
;  tolerate any node failures.  Always create an odd number.
[mon]
	mon data = /opt/ceph/data/mon$id

	; logging, for debugging monitor crashes, in order of
	; their likelihood of being helpful :)
	;debug ms = 20
	;debug mon = 20
	;debug paxos = 20
	;debug auth = 20

[mon0]
	host = ubuntu-mon0
	mon addr = 172.16.10.171:6789

[mon1]
	host = ubuntu-mon0
	mon addr = 172.16.10.171:6790

[mon2]
	host = ubuntu-mon0
	mon addr = 172.16.10.171:6791

; mds
;  You need at least one.  Define two to get a standby.
[mds]
	; where the mds keeps its secret encryption keys
	keyring = /etc/ceph/keyring.$name

	; mds logging to debug issues.
	;debug ms = 20
	;debug mds = 20

[mds.0]
	host = ubuntu-mon0

[mds.1]
	host = ubuntu-mon0

; osd
;  You need at least one.  Two if you want data to be replicated.
;  Define as many as you like.
[osd]
	; This is where the btrfs volume will be mounted.
	osd data = /opt/ceph/data/osd$id
	;osd journal = /opt/ceph/data/osd$id/journal
	osd class tmp = /var/lib/ceph/tmp

	; Ideally, make this a separate disk or partition.  A few
 	; hundred MB should be enough; more if you have fast or many
 	; disks.  You can use a file under the osd data dir if need be
 	; (e.g. /data/osd$id/journal), but it will be slower than a
 	; separate disk or partition.

        ; This is an example of a file-based journal.
	; osd journal size = 1000 ; journal size, in megabytes
	keyring = /etc/ceph/keyring.$name

	; osd logging to debug osd issues, in order of likelihood of being
	; helpful
	;debug ms = 20
	;debug osd = 20
	;debug filestore = 20
	;debug journal = 20

[osd0]
	host = ubuntu-osd0

	;osd journal size = 1000 ; journal size, in megabytes
	; if 'btrfs devs' is not specified, you're responsible for
	; setting up the 'osd data' dir.  if it is not btrfs, things
	; will behave up until you try to recover from a crash (which is
	; usually fine for basic testing).
	; btrfs devs = /dev/sdx

[osd1]
	host = ubuntu-osd1
	;osd data = /opt/data/osd$id
	;osd journal = /opt/data/osd$id/journal
	;filestore journal writeahead = true
	;osd journal size = 1000 ; journal size, in megabytes
	;btrfs devs = /dev/sdy

;[osd2]
	;host = zeta
	;btrfs devs = /dev/sdx

;[osd3]
	;host = eta
	;btrfs devs = /dev/sdy

================================= large_files_create========================================


# Large file creates
# Creating 1024 MB files.

time=300
alignio=1
directio=0

[filesystem0]
	location=%TESTPATH%

	# All created files will be 1024 MB.
	min_filesize=1024M
	max_filesize=1024M
[end0]

[threadgroup0]
	num_threads=32  # 8,16

	create_weight=1

	write_blocksize=4K

	[stats]
		enable_stats=1
		enable_range=1

		msec_range    0.00      0.01
		msec_range    0.01      0.02
		msec_range    0.02      0.05
		msec_range    0.05      0.10
		msec_range    0.10      0.20
		msec_range    0.20      0.50
		msec_range    0.50      1.00
		msec_range    1.00      2.00
		msec_range    2.00      5.00
		msec_range    5.00     10.00
		msec_range   10.00     20.00
		msec_range   20.00     50.00
		msec_range   50.00    100.00
		msec_range  100.00    200.00
		msec_range  200.00    500.00
		msec_range  500.00   1000.00
		msec_range 1000.00   2000.00
		msec_range 2000.00   5000.00
		msec_range 5000.00  10000.00
	[end]
[end0]



================================= mail server ========================================


# Mail server simulation.
# 1024 files

time=300
alignio=1
directio=0

[filesystem0]
	location=%TESTPATH%
	num_files=1024
	num_dirs=100

	# File sizes range from 1kB to 1MB.
	size_weight 1KB 10
	size_weight 2KB 15
	size_weight 4KB 16
	size_weight 8KB 16
	size_weight 16KB 15
	size_weight 32KB 10
	size_weight 64KB 8
	size_weight 128KB 4
	size_weight 256KB 3
	size_weight 512KB 2
	size_weight 1MB 1
[end0]

[threadgroup0]
	num_threads=32 # 8,16

	readall_weight=4
	create_fsync_weight=2
	delete_weight=1

	write_size=4KB
	write_blocksize=4KB

	read_size=4KB
	read_blocksize=4KB

	[stats]
		enable_stats=1
		enable_range=1

		msec_range    0.00      0.01
		msec_range    0.01      0.02
		msec_range    0.02      0.05
		msec_range    0.05      0.10
		msec_range    0.10      0.20
		msec_range    0.20      0.50
		msec_range    0.50      1.00
		msec_range    1.00      2.00
		msec_range    2.00      5.00
		msec_range    5.00     10.00
		msec_range   10.00     20.00
		msec_range   20.00     50.00
		msec_range   50.00    100.00
		msec_range  100.00    200.00
		msec_range  200.00    500.00
		msec_range  500.00   1000.00
		msec_range 1000.00   2000.00
		msec_range 2000.00   5000.00
		msec_range 5000.00  10000.00
	[end]
[end0]
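
For a sense of scale, the sketch below estimates the mean file size and initial data set size implied by the mail server profile above. It assumes that each size_weight line gives a relative frequency for that file size and that 1MB means 1024 KB; both are assumptions about how FFSB interprets the profile, not something stated in this thread.

```python
# (file size in KB, weight) pairs copied from the size_weight lines above.
size_weights = [
    (1, 10), (2, 15), (4, 16), (8, 16), (16, 15), (32, 10),
    (64, 8), (128, 4), (256, 3), (512, 2), (1024, 1),
]
NUM_FILES = 1024  # num_files in the [filesystem0] section

# Assumption: weights are relative frequencies, so the expected file size
# is the weighted average of the listed sizes.
total_weight = sum(weight for _, weight in size_weights)
mean_size_kb = sum(size * weight for size, weight in size_weights) / total_weight
dataset_mb = mean_size_kb * NUM_FILES / 1024

print(f"total weight:   {total_weight}")
print(f"mean file size: {mean_size_kb:.2f} KB")
print(f"initial data:   {dataset_mb:.2f} MB")
```

Under these assumptions the initial file set is only a few tens of megabytes, so this workload is dominated by small-file operations rather than by bulk data transfer.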


================================= random reads========================================
# Large file random reads.
# 256 files, 100MB per file.

time=300  # 5 min
alignio=1

[filesystem0]
	location=%TESTPATH%
	num_files=256
	min_filesize=100M  # 100 MB
	max_filesize=100M
	reuse=1
[end0]

[threadgroup0]
	num_threads=32 # 8,16

	read_random=1
	read_weight=1

	read_size=1M  # 1 MB
	read_blocksize=4k

	[stats]
		enable_stats=1
		enable_range=1

		msec_range    0.00      0.01
		msec_range    0.01      0.02
		msec_range    0.02      0.05
		msec_range    0.05      0.10
		msec_range    0.10      0.20
		msec_range    0.20      0.50
		msec_range    0.50      1.00
		msec_range    1.00      2.00
		msec_range    2.00      5.00
		msec_range    5.00     10.00
		msec_range   10.00     20.00
		msec_range   20.00     50.00
		msec_range   50.00    100.00
		msec_range  100.00    200.00
		msec_range  200.00    500.00
		msec_range  500.00   1000.00
		msec_range 1000.00   2000.00
		msec_range 2000.00   5000.00
		msec_range 5000.00  10000.00
	[end]
[end0]


================================= random writes========================================

# Large file random writes.
# 256 files, 100MB per file.

time=300  # 5 min
alignio=1

[filesystem0]
        location=%TESTPATH%
        num_files=256
        min_filesize=100M  # 100 MB
        max_filesize=100M
        reuse=1
[end0]

[threadgroup0]
        num_threads=32 # 8,16

        write_random=1
        write_weight=1

        write_size=1M  # 1 MB
        write_blocksize=4k

        [stats]
                enable_stats=1
                enable_range=1

                msec_range    0.00      0.01
                msec_range    0.01      0.02
                msec_range    0.02      0.05
                msec_range    0.05      0.10
                msec_range    0.10      0.20
                msec_range    0.20      0.50
                msec_range    0.50      1.00
                msec_range    1.00      2.00
                msec_range    2.00      5.00
                msec_range    5.00     10.00
                msec_range   10.00     20.00
                msec_range   20.00     50.00
                msec_range   50.00    100.00
                msec_range  100.00    200.00
                msec_range  200.00    500.00
                msec_range  500.00   1000.00
                msec_range 1000.00   2000.00
                msec_range 2000.00   5000.00
                msec_range 5000.00  10000.00
        [end]
[end0]

================================= sequential reads========================================


# Large file sequential reads.
# 256 files, 100MB per file.

time=300  # 5 min
alignio=1

[filesystem0]
	location=%TESTPATH%
	num_files=256
	min_filesize=100M  # 100 MB
	max_filesize=100M  # 100 MB
	reuse=1
[end0]

[threadgroup0]
	num_threads=32 # 8,16
	read_weight=1
	read_size=1M  # 1 MB
	read_blocksize=4k

	[stats]
		enable_stats=1
		enable_range=1

		msec_range    0.00      0.01
		msec_range    0.01      0.02
		msec_range    0.02      0.05
		msec_range    0.05      0.10
		msec_range    0.10      0.20
		msec_range    0.20      0.50
		msec_range    0.50      1.00
		msec_range    1.00      2.00
		msec_range    2.00      5.00
		msec_range    5.00     10.00
		msec_range   10.00     20.00
		msec_range   20.00     50.00
		msec_range   50.00    100.00
		msec_range  100.00    200.00
		msec_range  200.00    500.00
		msec_range  500.00   1000.00
		msec_range 1000.00   2000.00
		msec_range 2000.00   5000.00
		msec_range 5000.00  10000.00
	[end]
[end0]

==================================================================



* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
       [not found] <1291001135.1872.106.camel@cephhost>
       [not found] ` <1291001398.1872.107.camel@cephhost>
@ 2010-11-29 17:07 ` Gregory Farnum
  2010-11-30  2:55   ` Jeff Wu
  1 sibling, 1 reply; 14+ messages in thread
From: Gregory Farnum @ 2010-11-29 17:07 UTC (permalink / raw)
  To: cpwu; +Cc: sage, yehuda, colinm, ceph-devel, mllv

On Sun, Nov 28, 2010 at 7:25 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
>
> Hi ,
>
> I've recently been using FFSB and iozone to do performance test with Ceph
> v0.23 on my platform.
> the attachment file are FFSB configuration file and ceph.conf.
So you're not using OSD journals in either test configuration? You're
going to get pretty terrible write results without a journal.
The reads are clearly slower than they should be and you could
probably get better results by adjusting the caching behaviors. We
haven't done too much work on optimizing read behavior.

Could you run "ceph osd tell * bench", then run "ceph -w", and report
the results? (That'll just run local benchmarking on the OSD to report
the approximate write speed it's capable of.)
You can also run "rados -p data bench 60 write", and then "rados -p
data bench 0 seq" to get a simpler (better understood) performance
test.
With this data as a baseline we can start looking at what might be
causing trouble.
-Greg
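
A quick way to confirm the journal question raised above is to check whether any uncommented "osd journal" line exists in ceph.conf. The sketch below is a minimal hand-rolled check, not a Ceph tool: it only understands ';' comments, '[section]' headers and 'key = value' lines as they appear in the posted file, and the helper names are hypothetical.

```python
def active_settings(conf_text):
    """Map each section name to its uncommented 'key = value' settings."""
    sections = {}
    current = None
    for raw in conf_text.splitlines():
        # Drop ';' comments (whole-line or trailing) and surrounding whitespace.
        line = raw.split(";", 1)[0].strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            sections[current][key.strip()] = value.strip()
    return sections


def has_osd_journal(conf_text):
    """True if any section sets 'osd journal' on an uncommented line."""
    return any("osd journal" in keys for keys in active_settings(conf_text).values())


# Excerpt of the [osd] and [osd0] sections from the posted ceph.conf.
sample = """
[osd]
    osd data = /opt/ceph/data/osd$id
    ;osd journal = /opt/ceph/data/osd$id/journal
    ; osd journal size = 1000 ; journal size, in megabytes
[osd0]
    host = ubuntu-osd0
"""

print(has_osd_journal(sample))  # both journal lines are commented out
```

Removing the leading ';' from the "osd journal" line in the excerpt makes the check return True.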


* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-11-29 17:07 ` Gregory Farnum
@ 2010-11-30  2:55   ` Jeff Wu
  2010-11-30  3:18     ` Gregory Farnum
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Wu @ 2010-11-30  2:55 UTC (permalink / raw)
  To: Gregory Farnum
  Cc: sage@newdream.net, yehuda@hq.newdream.net, colinm@hq.newdream.net,
	ceph-devel@vger.kernel.org, Andrew Lv



On Tue, 2010-11-30 at 01:07 +0800, Gregory Farnum wrote:
> On Sun, Nov 28, 2010 at 7:25 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> >
> > Hi ,
> >
> > I've recently been using FFSB and iozone to do performance test with Ceph
> > v0.23 on my platform.
> > the attachment file are FFSB configuration file and ceph.conf.
> So you're not using OSD journals in either test configuration? You're
> going to get pretty terrible write results without a journal.
> The reads are clearly slower than they should be and you could
> probably get better results by adjusting the caching behaviors. We
> haven't done too much work on optimizing read behavior.

Hi Greg, thank you for your suggestions.

> 
> Could you run "ceph osd tell * bench", then run "ceph -w", and report
> the results? (That'll just run local benchmarking on the OSD to report
> the approximate write speed it's capable of.)

I ran the command "$ sudo ceph osd tell 0/1 bench" six times (three times for each OSD)
and used "$ sudo ceph -w" to get the following results:

osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.906598 sec at 49775 KB/sec
osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.023294 sec at 49384 KB/sec
osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.834682 sec at 51535 KB/sec
osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 20.792697 sec at 37547 KB/sec
osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.058412 sec at 77191 KB/sec
osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.113612 sec at 47369 KB/sec

The detailed test logs are attached below.
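
When reading the OSD bench lines above, it can help to print the reported KB/sec figure next to the rate implied by 1024 MB divided by the elapsed time. The sketch below does only that. How v0.23 derives the reported figure is not stated in this thread, so a mismatch between the two columns is not necessarily an error; the regular expression and helper names are illustrative.

```python
import re

# Bench lines copied from the "ceph -w" output, one line per result.
log_lines = [
    "osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.906598 sec at 49775 KB/sec",
    "osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.023294 sec at 49384 KB/sec",
    "osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.834682 sec at 51535 KB/sec",
    "osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 20.792697 sec at 37547 KB/sec",
    "osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.058412 sec at 77191 KB/sec",
    "osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.113612 sec at 47369 KB/sec",
]

PATTERN = re.compile(
    r"(osd\d+)\s*: \[INF\] bench: wrote (\d+) MB in blocks of \d+ KB "
    r"in ([\d.]+) sec at (\d+) KB/sec"
)


def parse(line):
    """Extract the OSD name, elapsed time, reported rate and implied rate."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"not a bench line: {line!r}")
    size_mb = int(match.group(2))
    seconds = float(match.group(3))
    return {
        "osd": match.group(1),
        "seconds": seconds,
        "reported_kb_s": int(match.group(4)),
        # Rate implied by total size and elapsed time (1 MB = 1024 KB).
        "implied_kb_s": size_mb * 1024 / seconds,
    }


results = [parse(line) for line in log_lines]

for r in results:
    print(f"{r['osd']}  {r['seconds']:9.3f} sec  "
          f"reported {r['reported_kb_s']:6d} KB/sec  "
          f"implied {r['implied_kb_s']:8.0f} KB/sec")
```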


> You can also run "rados -p data bench 60 write", and then "rados -p
> data bench 0 seq" to get a simpler (better understood) performance
> test.

I ran the command "rados -p data bench 60 write" twice
and got the following results:
$ sudo rados -p data bench 60 write
..........................
..........................

Total time run:        76.182225
Total writes made:     121
Write size:            4194304
Bandwidth (MB/sec):    6.219 

Average Latency:       13.3068
Max latency:           23.9986
Min latency:           7.01847

$ sudo rados -p data bench 60 write
..........................
..........................
Total time run:        74.830651
Total writes made:     97
Write size:            4194304
Bandwidth (MB/sec):    4.714 

Average Latency:       15.5064
Max latency:           24.5641
Min latency:           3.50005

However, running "$ sudo rados -p data bench 0 seq"
failed to produce any results. Maybe it's a bug in ceph version 0.23.
Logs:
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0      16        16         0         0         0         -         0
read got -2
error during benchmark: -2
./common/Mutex.h: In function 'Mutex::~Mutex()':
./common/Mutex.h:97: FAILED assert(nlock == 0)
 ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56)

...............................
...............................

The detailed logs are attached below.



> With this data as a baseline we can start looking at what might be
> causing trouble.
> -Greg


==========================================================
1. Process

transoft@ubuntu-mon0:/usr/local/etc/ceph$ ps -ef
root     12919     1  2 Nov26 ?        02:23:45 /usr/local/bin/cmon -i 0 -c ceph.conf
root     12952     1  1 Nov26 ?        01:30:03 /usr/local/bin/cmon -i 1 -c ceph.conf
root     12987     1  1 Nov26 ?        01:51:42 /usr/local/bin/cmon -i 2 -c ceph.conf
root     13036     1  1 Nov26 ?        01:00:08 /usr/local/bin/cmds -i 0 -c ceph.conf
root     13075     1  0 Nov26 ?        00:47:47 /usr/local/bin/cmds -i 1 -c ceph.conf

==========================================================
2. ceph -w 

transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench
2010-11-30 08:47:09.080125 mon <- [osd,tell,0,bench]
2010-11-30 08:47:09.080378 mon1 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench
2010-11-30 08:47:54.520159 mon <- [osd,tell,1,bench]
2010-11-30 08:47:54.520433 mon2 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench
2010-11-30 08:48:29.590115 mon <- [osd,tell,0,bench]
2010-11-30 08:48:29.590365 mon2 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench
2010-11-30 08:48:47.450092 mon <- [osd,tell,1,bench]
2010-11-30 08:48:47.450341 mon1 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench
2010-11-30 08:49:27.240742 mon <- [osd,tell,0,bench]
2010-11-30 08:49:27.241076 mon1 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench
2010-11-30 08:49:43.500749 mon <- [osd,tell,1,bench]
2010-11-30 08:49:43.501043 mon2 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ 

transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph -w
2010-11-30 08:47:04.150457    pg v8701: 528 pgs: 528 active+clean; 10674 KB data, 442 MB used, 219 GB / 219 GB avail
2010-11-30 08:47:04.151228   mds e5: 1/1/1 up {0=up:active}, 1 up:standby
2010-11-30 08:47:04.151253   osd e6: 2 osds: 2 up, 2 in
2010-11-30 08:47:04.151319   log 2010-11-30 08:45:17.795260 osd0 172.16.10.42:6800/16864 3 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.828886 sec at 51682 KB/sec
2010-11-30 08:47:04.151375   mon e1: 3 mons at {0=172.16.10.171:6789/0,1=172.16.10.171:6790/0,2=172.16.10.171:6791/0}
2010-11-30 08:47:22.487639   log 2010-11-30 08:47:00.704960 osd0 172.16.10.42:6800/16864 4 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.906598 sec at 49775 KB/sec
2010-11-30 08:48:17.047842   log 2010-11-30 09:52:29.975820 osd1 172.16.10.65:6800/6678 3 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.023294 sec at 49384 KB/sec
2010-11-30 08:48:42.915344   log 2010-11-30 08:48:21.135651 osd0 172.16.10.42:6800/16864 5 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.834682 sec at 51535 KB/sec
2010-11-30 08:49:10.000303   log 2010-11-30 09:53:22.645957 osd1 172.16.10.65:6800/6678 4 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 20.792697 sec at 37547 KB/sec
2010-11-30 08:49:40.809023   log 2010-11-30 08:49:19.005413 osd0 172.16.10.42:6800/16864 6 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.058412 sec at 77191 KB/sec
2010-11-30 08:50:06.184830   log 2010-11-30 09:54:19.064900 osd1 172.16.10.65:6800/6678 5 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.113612 sec at 47369 KB/sec



===========================================================
3. rados -p data bench 60 write

transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo rados -p data bench 60 write
Maintaining 16 concurrent writes of 4194304 bytes for at least 60 seconds.
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0      16        16         0         0         0         -         0
    1      16        16         0         0         0         -         0
    2      16        16         0         0         0         -         0
    3      16        16         0         0         0         -         0
    4      16        16         0         0         0         -         0
    5      16        17         1  0.793474       0.8    7.9252    7.9252
    6      16        17         1  0.662028         0         -    7.9252
    7      16        17         1  0.567947         0         -    7.9252
    8      16        24         8   3.97823   9.33333   10.5391   10.1201
    9      16        24         8   3.53801         0         -   10.1201
   10      16        38        22   8.76019        28   7.13522   9.96534
   11      16        38        22   7.96645         0         -   9.96534
   12      16        38        22   7.30461         0         -   9.96534
   13      16        38        22   6.74431         0         -   9.96534
   14      16        38        22   6.26388         0         -   9.96534
   15      16        38        22   5.84735         0         -   9.96534
   16      16        41        25   6.23039         2    8.7632   10.3001
   17      16        41        25   5.86468         0         -   10.3001
   18      16        41        25   5.53953         0         -   10.3001
   19      16        52        36   7.55792   14.6667   10.3215   11.3923
min lat: 7.01847 max lat: 15.8599 avg lat: 11.3923
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   20      16        52        36   7.18061         0         -   11.3923
   21      16        56        40   7.59925         8   13.7597   11.8294
   22      16        56        40   7.25439         0         -   11.8294
   23      16        56        40   6.93949         0         -   11.8294
   24      16        56        40   6.65079         0         -   11.8294
   25      16        63        47   7.50258         7   7.33881   11.2982
   26      16        63        47   7.21441         0         -   11.2982
   27      16        68        52   7.68667        10   9.28991   11.1051
   28      16        68        52   7.41251         0         -   11.1051
   29      16        68        52   7.15725         0         -   11.1051
   30      16        68        52   6.91898         0         -   11.1051
   31      16        68        52   6.69606         0         -   11.1051
   32      16        79        63   7.85927       8.8   15.0751    11.933
   33      16        79        63   7.62121         0         -    11.933
   34      16        79        63    7.3973         0         -    11.933
   35      16        79        63   7.18618         0         -    11.933
   36      16        79        63   6.98678         0         -    11.933
   37      16        84        68   7.33767         4   17.4944   12.3426
   38      16        84        68   7.14464         0         -   12.3426
   39      16        85        69   7.06399         2   8.24284   12.2832
min lat: 7.01847 max lat: 19.7384 avg lat: 12.2832
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   40      16        85        69   6.88751         0         -   12.2832
   41      16        85        69   6.71971         0         -   12.2832
   42      16        85        69   6.55985         0         -   12.2832
   43      16        87        71   6.59315         2   15.8845   12.3846
   44      16        87        71   6.44343         0         -   12.3846
   45      16        87        71   6.30037         0         -   12.3846
   46      16        87        71   6.16352         0         -   12.3846
   47      16        91        75   6.37235         4   21.8578   12.8896
   48      16        91        75   6.23967         0         -   12.8896
   49      16       101        85   6.92742        20   18.2741   13.1211
   50      16       101        85   6.78897         0         -   13.1211
   51      16       101        85   6.65595         0         -   13.1211
   52      16       101        85   6.52804         0         -   13.1211
   53      16       101        85   6.40496         0         -   13.1211
   54      16       111        95   7.02602         8   13.1645    12.939
   55      16       111        95   6.89835         0         -    12.939
   56      16       111        95   6.77524         0         -    12.939
   57      16       111        95   6.65646         0         -    12.939
   58      16       111        95   6.54176         0         -    12.939
   59      16       115        99   6.70174       3.2   13.1073   12.9458
min lat: 7.01847 max lat: 21.8578 avg lat: 12.9364
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   60      16       120       104   6.92291        20   11.3313   12.9364
   61      15       121       106   6.94046         8   13.3389   12.9444
   62      15       121       106   6.82859         0         -   12.9444
   63      15       121       106   6.72026         0         -   12.9444
   64      15       121       106   6.61533         0         -   12.9444
   65      15       121       106   6.51362         0         -   12.9444
   66      11       121       110   6.65706       3.2   14.3524   12.9974
   67      11       121       110   6.55777         0         -   12.9974
   68      11       121       110   6.46139         0         -   12.9974
   69      11       121       110    6.3678         0         -   12.9974
   70      11       121       110   6.27689         0         -   12.9974
   71       2       121       119   6.69486       7.2   13.2646   13.1676
   72       2       121       119   6.60193         0         -   13.1676
   73       2       121       119   6.51154         0         -   13.1676
   74       2       121       119    6.4236         0         -   13.1676
   75       2       121       119     6.338         0         -   13.1676
   76       2       121       119   6.25465         0         -   13.1676
Total time run:        76.182225
Total writes made:     121
Write size:            4194304
Bandwidth (MB/sec):    6.219 

Average Latency:       13.3068
Max latency:           23.9986
Min latency:           7.01847
transoft@ubuntu-mon0:/usr/local/etc/ceph$ 
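
One pattern worth noting in the per-second output of this run is how many seconds complete no writes at all ("cur MB/s" of 0). The sketch below counts those seconds for the first twelve rows only, with each wrapped row rejoined onto one line. The parsing is a simple whitespace split and the function name is made up for illustration; it is not part of the rados tool.

```python
# First twelve per-second rows of the run above. Columns:
# sec, cur ops, started, finished, avg MB/s, cur MB/s, last lat, avg lat.
rows = """
0 16 16 0 0 0 - 0
1 16 16 0 0 0 - 0
2 16 16 0 0 0 - 0
3 16 16 0 0 0 - 0
4 16 16 0 0 0 - 0
5 16 17 1 0.793474 0.8 7.9252 7.9252
6 16 17 1 0.662028 0 - 7.9252
7 16 17 1 0.567947 0 - 7.9252
8 16 24 8 3.97823 9.33333 10.5391 10.1201
9 16 24 8 3.53801 0 - 10.1201
10 16 38 22 8.76019 28 7.13522 9.96534
11 16 38 22 7.96645 0 - 9.96534
"""


def stalled_seconds(text):
    """Return the 'sec' values of rows whose 'cur MB/s' column is zero."""
    stalled = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 8:
            continue  # skip headers or malformed rows
        sec, cur_mb_s = int(fields[0]), float(fields[5])
        if cur_mb_s == 0:
            stalled.append(sec)
    return stalled


stalled = stalled_seconds(rows)
print(f"{len(stalled)} of 12 seconds completed no writes: {stalled}")
```

In this excerpt, 9 of the first 12 seconds show no completed writes, which matches the bursty pattern visible in the full table.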

transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo rados -p data bench 60 write
Maintaining 16 concurrent writes of 4194304 bytes for at least 60 seconds.
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0      16        16         0         0         0         -         0
    1      16        16         0         0         0         -         0
    2      16        16         0         0         0         -         0
    3      16        17         1   1.31537   1.33333   3.50005   3.50005
    4      16        17         1  0.989638         0         -   3.50005
    5      16        17         1  0.793226         0         -   3.50005
    6      16        17         1  0.661871         0         -   3.50005
    7      16        17         1  0.567841         0         -   3.50005
    8      16        17         1  0.497201         0         -   3.50005
    9      16        17         1  0.442194         0         -   3.50005
   10      16        17         1  0.398146         0         -   3.50005
   11      16        17         1  0.362079         0         -   3.50005
   12      16        25         9   2.98799   3.55556   12.0273   17.9704
   13      16        27        11   3.37183         8   12.0974   16.8994
   14      16        33        17   4.83996        24   12.8105   16.4478
   15      16        33        17    4.5181         0         -   16.4478
   16      16        33        17    4.2364         0         -   16.4478
   17      16        33        17   3.98777         0         -   16.4478
   18      16        33        17   3.76669         0         -   16.4478
   19      16        38        22    4.6185         4   14.6053   16.0123
min lat: 3.50005 max lat: 20.9926 avg lat: 16.0123
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
   20      16        38        22   4.38793         0         -   16.0123
   21      16        38        22   4.17939         0         -   16.0123
   22      16        38        22   3.98976         0         -   16.0123
   23      16        38        22   3.81659         0         -   16.0123
   24      16        38        22   3.65782         0         -   16.0123
   25      16        47        31   4.94834         6   13.0266   15.7939
   26      16        47        31   4.75829         0         -   15.7939
   27      16        47        31   4.58231         0         -
15.7939
   28      16        47        31   4.41888         0         -
15.7939
   29      16        47        31   4.26672         0         -
15.7939
   30      16        47        31   4.12467         0         -
15.7939
   31      16        54        38   4.89315   4.66667   14.5712
16.1038
   32      16        54        38   4.74042         0         -
16.1038
   33      16        54        38   4.59694         0         -
16.1038
   34      16        54        38   4.46188         0         -
16.1038
   35      16        54        38   4.33454         0         -
16.1038
   36      16        63        47   5.21238       7.2   12.2193
15.3573
   37      16        66        50   5.39536        12   9.59416
15.0125
   38      16        66        50   5.25352         0         -
15.0125
   39      16        66        50   5.11895         0         -
15.0125
min lat: 3.50005 max lat: 24.5641 avg lat: 15.0125
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg
lat
   40      16        66        50   4.99107         0         -
15.0125
   41      16        66        50   4.86947         0         -
15.0125
   42      16        73        57   5.41915       5.6   11.6831
14.5749
   43      16        73        57   5.29323         0         -
14.5749
   44      16        73        57   5.17304         0         -
14.5749
   45      16        73        57   5.05817         0         -
14.5749
   46      16        73        57   4.94831         0         -
14.5749
   47      16        82        66   5.60781       7.2   14.1181
14.6534
   48      16        82        66   5.49105         0         -
14.6534
   49      16        82        66   5.37907         0         -
14.6534
   50      16        82        66   5.27157         0         -
14.6534
   51      16        82        66   5.16829         0         -
14.6534
   52      16        82        66   5.06898         0         -
14.6534
   53      16        88        72   5.42554         4   14.2779
14.6222
   54      16        88        72   5.32512         0         -
14.6222
   55      16        88        72   5.22837         0         -
14.6222
   56      16        88        72   5.13507         0         -
14.6222
   57      16        88        72   5.04505         0         -
14.6222
   58      16        88        72   4.95812         0         -
14.6222
   59      16        90        74   5.00954   1.33333   18.7183
14.808
min lat: 3.50005 max lat: 24.5641 avg lat: 14.8389
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg
lat
   60      16        96        80   5.32549        24   16.2963
14.8389
   61      16        96        80   5.23826         0         -
14.8389
   62      16        96        80   5.15383         0         -
14.8389
   63      16        96        80   5.07208         0         -
14.8389
   64      13        97        84   5.24252         4   13.2043
15.0243
   65      13        97        84   5.16191         0         -
15.0243
   66      13        97        84   5.08375         0         -
15.0243
   67      13        97        84   5.00792         0         -
15.0243
   68      13        97        84   4.93432         0         -
15.0243
   69      13        97        84   4.86285         0         -
15.0243
   70       5        97        92   5.24994   5.33333   18.0332
15.3781
   71       5        97        92   5.17604         0         -
15.3781
   72       5        97        92   5.10419         0         -
15.3781
   73       5        97        92    5.0343         0         -
15.3781
   74       5        97        92   4.96631         0         -
15.3781
Total time run:        74.830651
Total writes made:     97
Write size:            4194304
Bandwidth (MB/sec):    4.714 

Average Latency:       15.5064
Max latency:           24.5641
Min latency:           3.50005
transoft@ubuntu-mon0:/usr/local/etc/ceph$ 

===========================================================
4. rados -p data bench 0 seq

transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo rados -p data bench 0 seq
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0      16        16         0         0         0         -         0
read got -2
error during benchmark: -2
./common/Mutex.h: In function 'Mutex::~Mutex()':
./common/Mutex.h:97: FAILED assert(nlock == 0)
 ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56)
 1: rados() [0x40c941]
 2: (exit()+0xe2) [0x7f518ea744f2]
 3: (__libc_start_main()+0x105) [0x7f518ea59d95]
 4: rados() [0x405b19]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
./common/Mutex.h: In function 'Mutex::~Mutex()':
./common/Mutex.h:97: FAILED assert(nlock == 0)
 ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56)
 1: rados() [0x40c941]
 2: (exit()+0xe2) [0x7f518ea744f2]
 3: (__libc_start_main()+0x105) [0x7f518ea59d95]
 4: rados() [0x405b19]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
*** Caught signal (ABRT) ***
 ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56)
 1: (sigabrt_handler(int)+0x91) [0x7f518fe7dec1]
 2: (()+0x33c20) [0x7f518ea6ec20]
 3: (gsignal()+0x35) [0x7f518ea6eba5]
 4: (abort()+0x180) [0x7f518ea726b0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f518f3126bd]
 6: (()+0xb9906) [0x7f518f310906]
 7: (()+0xb9933) [0x7f518f310933]
 8: (()+0xb9a3e) [0x7f518f310a3e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x456) [0x7f518fe5ab96]
 10: rados() [0x40c941]
 11: (exit()+0xe2) [0x7f518ea744f2]
 12: (__libc_start_main()+0x105) [0x7f518ea59d95]
 13: rados() [0x405b19]
Aborted
transoft@ubuntu-mon0:/usr/local/etc/ceph$ 

transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo rados -p data bench 60 seq
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0      16        16         0         0         0         -         0
read got -2
error during benchmark: -2
./common/Mutex.h: In function 'Mutex::~Mutex()':
./common/Mutex.h:97: FAILED assert(nlock == 0)
 ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56)
 1: rados() [0x40c941]
 2: (exit()+0xe2) [0x7f795c31c4f2]
 3: (__libc_start_main()+0x105) [0x7f795c301d95]
 4: rados() [0x405b19]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
./common/Mutex.h: In function 'Mutex::~Mutex()':
./common/Mutex.h:97: FAILED assert(nlock == 0)
 ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56)
 1: rados() [0x40c941]
 2: (exit()+0xe2) [0x7f795c31c4f2]
 3: (__libc_start_main()+0x105) [0x7f795c301d95]
 4: rados() [0x405b19]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
*** Caught signal (ABRT) ***
 ceph version 0.23 (commit:5d1d8d0c4602be9819cc9f7aea562fccbb005a56)
 1: (sigabrt_handler(int)+0x91) [0x7f795d725ec1]
 2: (()+0x33c20) [0x7f795c316c20]
 3: (gsignal()+0x35) [0x7f795c316ba5]
 4: (abort()+0x180) [0x7f795c31a6b0]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f795cbba6bd]
 6: (()+0xb9906) [0x7f795cbb8906]
 7: (()+0xb9933) [0x7f795cbb8933]
 8: (()+0xb9a3e) [0x7f795cbb8a3e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x456) [0x7f795d702b96]
 10: rados() [0x40c941]
 11: (exit()+0xe2) [0x7f795c31c4f2]
 12: (__libc_start_main()+0x105) [0x7f795c301d95]
 13: rados() [0x405b19]
Aborted

transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo rados -p data bench 60 rand
Random test not implemented yet!
error during benchmark: -1
transoft@ubuntu-mon0:/usr/local/etc/ceph$ 




Best Regards
Jeff.Wu


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-11-30  2:55   ` Jeff Wu
@ 2010-11-30  3:18     ` Gregory Farnum
  2010-11-30  6:19       ` Jeff Wu
  0 siblings, 1 reply; 14+ messages in thread
From: Gregory Farnum @ 2010-11-30  3:18 UTC (permalink / raw)
  To: cpwu; +Cc: ceph-devel@vger.kernel.org, Andrew Lv

On Mon, Nov 29, 2010 at 6:55 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
>> Could you run "ceph osd tell * bench", then run "ceph -w", and report
>> the results? (That'll just run local benchmarking on the OSD to report
>> the approximate write speed it's capable of.)
>
> I run six times for the command: "$ sudo ceph osd tell 0/1 bench "
> use "$ sudo ceph -w" to get the following results:
>
> osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.906598 sec
> at 49775 KB/sec
> osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.023294 sec
> at 49384 KB/sec
> osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.834682 sec
> at 51535 KB/sec
> osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 20.792697 sec
> at 37547 KB/sec
> osd0  : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.058412 sec
> at 77191 KB/sec
> osd1  : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.113612 sec
> at 47369 KB/sec
Okay, those are a bit slow but reasonable. Based on these I'd expect
you to generally manage about 40-50MB/s (since everything is
replicated; in a 2-disk configuration it'll just be the speed of your
slowest disk), assuming a properly configured system.
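
As a rough illustration of that estimate, here is a minimal Python sketch. It assumes every write has to land on both OSDs, so the slower one sets the pace, and it uses the rates that "ceph -w" reported in the quoted output; the function name is made up for this example.

```python
from statistics import mean

# Rates (KB/sec) as reported by "ceph osd tell N bench" in the quoted output.
reported_kb_per_sec = {
    "osd0": [49775, 51535, 77191],
    "osd1": [49384, 37547, 47369],
}

def replicated_write_bound(rates_by_osd):
    """Return (slowest OSD name, its mean rate in KB/sec).

    Assumption: with 2x replication every write waits for both OSDs,
    so the slowest one bounds streaming throughput.
    """
    means = {osd: mean(rates) for osd, rates in rates_by_osd.items()}
    slowest = min(means, key=means.get)
    return slowest, means[slowest]

slowest, bound = replicated_write_bound(reported_kb_per_sec)
print(f"{slowest} bounds writes at roughly {bound / 1000:.1f} MB/s")
```

With these inputs the slower OSD averages a bit under 45 MB/s, which sits inside the 40-50MB/s range.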

>> You can also run "rados -p data bench 60 write", and then "rados -p
>> data bench 0 seq" to get a simpler (better understood) performance
>> test.
>
> I run twice for the command: "rados -p data bench 60 write" ,
> Get the results:
> $ sudo rados -p data bench 60 write
> ..........................
> ..........................
>
> Total time run:        76.182225
> Total writes made:     121
> Write size:            4194304
> Bandwidth (MB/sec):    6.219
>
> Average Latency:       13.3068
> Max latency:           23.9986
> Min latency:           7.01847
WOAH. That's a lot of latency. Rather more than I'd expect to get just
from seek times in a non-journaled environment. What's the round-trip
time to ping the OSDs from your client? Are your disks okay?
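
To put those latencies in perspective, here is a back-of-the-envelope check using Little's law (throughput is roughly operations in flight divided by average latency). This is only a sketch, not how "rados bench" computes its own numbers. The 16 concurrent writes of 4194304 bytes come from the benchmark's banner, and the function name is made up.

```python
def littles_law_mb_per_sec(in_flight, op_bytes, avg_latency_sec):
    """Estimate steady-state throughput from concurrency and latency."""
    ops_per_sec = in_flight / avg_latency_sec
    return ops_per_sec * op_bytes / (1024 * 1024)

# 16 writes in flight, 4194304 bytes each, 13.3068 s average latency.
estimate = littles_law_mb_per_sec(16, 4194304, 13.3068)
print(f"about {estimate:.1f} MB/s")
```

The estimate lands near 4.8 MB/s, the same order of magnitude as the 6.219 MB/s reported, so the low bandwidth is consistent with the latency rather than a separate problem.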

> but run  "$ sudo rados -p data bench 0 seq"
> fail to get the results, Maybe it's a bug,ceph version 0.23.
Oh right, sorry. We put in a quick fix to make the write benchmark
scale across multiple client writers and forgot to adjust the read
benchmark. *oops*

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-11-30  3:18     ` Gregory Farnum
@ 2010-11-30  6:19       ` Jeff Wu
  2010-11-30 17:07         ` Gregory Farnum
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Wu @ 2010-11-30  6:19 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel@vger.kernel.org, Andrew Lv



On Tue, 2010-11-30 at 11:18 +0800, Gregory Farnum wrote:
> On Mon, Nov 29, 2010 at 6:55 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> >> Could you run "ceph osd tell * bench", then run "ceph -w", and report
> >> the results? (That'll just run local benchmarking on the OSD to report
> >> the approximate write speed it's capable of.)
> >
> > I run six times for the command: "$ sudo ceph osd tell 0/1 bench "
> > use "$ sudo ceph -w" to get the following results:
> >
> > osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.906598 sec
> > at 49775 KB/sec
> > osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.023294 sec
> > at 49384 KB/sec
> > osd0 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.834682 sec
> > at 51535 KB/sec
> > osd1 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 20.792697 sec
> > at 37547 KB/sec
> > osd0  : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.058412 sec
> > at 77191 KB/sec
> > osd1  : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 21.113612 sec
> > at 47369 KB/sec
> Okay, those are a bit slow but reasonable. Based on these I'd expect
> you to generally manage about 40-50MB/s (since everything is
> replicated; in a 2-disk configuration it'll just be the speed of your
> slowest disk), assuming a properly configured system.

Is "40-50MB/s" the speed of the bench run locally on the btrfs disk,
not the speed of a bench run from the client to the OSD server?
With this speed, will a bench run from the client to the OSD server get
about 20~25MB/s (40~50MB/s / 2)?


> 
> >> You can also run "rados -p data bench 60 write", and then "rados -p
> >> data bench 0 seq" to get a simpler (better understood) performance
> >> test.
> >
> > I run twice for the command: "rados -p data bench 60 write" ,
> > Get the results:
> > $ sudo rados -p data bench 60 write
> > ..........................
> > ..........................
> >
> > Total time run:        76.182225
> > Total writes made:     121
> > Write size:            4194304
> > Bandwidth (MB/sec):    6.219
> >
> > Average Latency:       13.3068
> > Max latency:           23.9986
> > Min latency:           7.01847
> WOAH. That's a lot of latency. Rather more than I'd expect to get just
> from seek times in a non-journaled environment. What's the round-trip
> time to ping the OSDs from your client? Are your disks okay?
> 
Hi, I measured the RTT, and the disks are okay.

1. RTT:
client ping OSD1 (172.16.10.65):
rtt min/avg/max/mdev = 0.103/0.124/0.143/0.016 ms
client ping OSD0 (172.16.10.42):
rtt min/avg/max/mdev = 0.112/0.116/0.122/0.005 ms

2. disks

1) mon data is stored on sda4, which is formatted with mkfs.btrfs
$lsscsi
[0:0:0:0]    disk    ATA      WDC WD3200AAKS-7 02.0  /dev/sda
$fdisk -l
.............
/dev/sda4           17264       38913   173897568+  83  Linux

2) OSD0 data is stored on sda5, which is formatted with mkfs.btrfs
~$ lsscsi
[0:0:0:0]    disk    ATA      ST3320418AS      CC45  /dev/sda
$ fdisk -l 
.......................
/dev/sda5           18480       38913   164131649   83  Linux


3) OSD1 data is stored on sda5, which is formatted with mkfs.btrfs

$ lsscsi
[0:0:0:0]    disk    ATA      ST3160812AS      3.AD  /dev/sda
$ fdisk -l
/dev/sda5           11185       19452    66405470+  83  Linux


More detailed logs are attached (see below).


Thank you 
jeff.wu

> > but run  "$ sudo rados -p data bench 0 seq"
> > fail to get the results, Maybe it's a bug,ceph version 0.23.
> Oh right, sorry. We put in a quick fix to make the write benchmark
> scale across multiple client writers and forgot to adjust the read
> benchmark. *oops*

thank you !


logs:
=================================================================

transoft@ubuntu-mon0:/usr/local/etc/ceph$ ping 172.16.10.65
PING 172.16.10.65 (172.16.10.65) 56(84) bytes of data.
64 bytes from 172.16.10.65: icmp_req=1 ttl=64 time=0.127 ms
64 bytes from 172.16.10.65: icmp_req=2 ttl=64 time=0.121 ms
64 bytes from 172.16.10.65: icmp_req=3 ttl=64 time=0.127 ms
64 bytes from 172.16.10.65: icmp_req=4 ttl=64 time=0.124 ms
64 bytes from 172.16.10.65: icmp_req=5 ttl=64 time=0.128 ms
64 bytes from 172.16.10.65: icmp_req=6 ttl=64 time=0.128 ms
64 bytes from 172.16.10.65: icmp_req=7 ttl=64 time=0.127 ms
64 bytes from 172.16.10.65: icmp_req=8 ttl=64 time=0.126 ms
64 bytes from 172.16.10.65: icmp_req=9 ttl=64 time=0.126 ms
64 bytes from 172.16.10.65: icmp_req=10 ttl=64 time=0.128 ms
64 bytes from 172.16.10.65: icmp_req=11 ttl=64 time=0.120 ms
64 bytes from 172.16.10.65: icmp_req=12 ttl=64 time=0.127 ms
64 bytes from 172.16.10.65: icmp_req=13 ttl=64 time=0.120 ms
64 bytes from 172.16.10.65: icmp_req=14 ttl=64 time=0.125 ms
64 bytes from 172.16.10.65: icmp_req=15 ttl=64 time=0.131 ms
64 bytes from 172.16.10.65: icmp_req=16 ttl=64 time=0.125 ms
64 bytes from 172.16.10.65: icmp_req=17 ttl=64 time=0.124 ms
64 bytes from 172.16.10.65: icmp_req=18 ttl=64 time=0.103 ms
64 bytes from 172.16.10.65: icmp_req=19 ttl=64 time=0.124 ms
64 bytes from 172.16.10.65: icmp_req=20 ttl=64 time=0.125 ms
64 bytes from 172.16.10.65: icmp_req=21 ttl=64 time=0.143 ms
64 bytes from 172.16.10.65: icmp_req=22 ttl=64 time=0.127 ms
64 bytes from 172.16.10.65: icmp_req=23 ttl=64 time=0.117 ms
^C
--- 172.16.10.65 ping statistics ---
23 packets transmitted, 23 received, 0% packet loss, time 21999ms
rtt min/avg/max/mdev = 0.103/0.124/0.143/0.016 ms


transoft@ubuntu-mon0:/usr/local/etc/ceph$ ping 172.16.10.42
PING 172.16.10.42 (172.16.10.42) 56(84) bytes of data.
64 bytes from 172.16.10.42: icmp_req=1 ttl=64 time=0.121 ms
64 bytes from 172.16.10.42: icmp_req=2 ttl=64 time=0.116 ms
64 bytes from 172.16.10.42: icmp_req=3 ttl=64 time=0.122 ms
64 bytes from 172.16.10.42: icmp_req=5 ttl=64 time=0.117 ms
64 bytes from 172.16.10.42: icmp_req=6 ttl=64 time=0.112 ms
64 bytes from 172.16.10.42: icmp_req=7 ttl=64 time=0.117 ms
64 bytes from 172.16.10.42: icmp_req=8 ttl=64 time=0.112 ms
64 bytes from 172.16.10.42: icmp_req=9 ttl=64 time=0.112 ms
64 bytes from 172.16.10.42: icmp_req=10 ttl=64 time=0.112 ms
64 bytes from 172.16.10.42: icmp_req=11 ttl=64 time=0.117 ms
64 bytes from 172.16.10.42: icmp_req=12 ttl=64 time=0.114 ms
64 bytes from 172.16.10.42: icmp_req=13 ttl=64 time=0.121 ms
^C
--- 172.16.10.42 ping statistics ---
13 packets transmitted, 12 received, 7% packet loss, time 11999ms
rtt min/avg/max/mdev = 0.112/0.116/0.122/0.005 ms
transoft@ubuntu-mon0:/usr/local/etc/ceph$ 


transoft@ubuntu-mon0:~$ lsscsi
[0:0:0:0]    disk    ATA      WDC WD3200AAKS-7 02.0  /dev/sda
transoft@ubuntu-mon0:~$ sudo fdisk -l
[sudo] password for transoft: 

Disk /dev/sda: 320.1 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00006279

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        4864    39061504   83  Linux
/dev/sda2            4864       17264    99607553    5  Extended
/dev/sda3           38914       38914           0+  83  Linux
/dev/sda4           17264       38913   173897568+  83  Linux
/dev/sda5            4864       17021    97654784   83  Linux
/dev/sda6           17021       17264     1951744   82  Linux swap / Solaris

Partition table entries are not in disk order
transoft@ubuntu-mon0:~$ 



transoft@ubuntu-osd0:~$ lsscsi
[0:0:0:0]    disk    ATA      ST3320418AS      CC45  /dev/sda
transoft@ubuntu-osd0:~$ sudo fdisk -l
[sudo] password for transoft: 

Disk /dev/sda: 320.1 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x50000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        6079    48827392   83  Linux
/dev/sda2            6079       18237    97655808   83  Linux
/dev/sda3           18237       18480     1952768   82  Linux swap / Solaris
/dev/sda4           18480       38913   164131680+   5  Extended
/dev/sda5           18480       38913   164131649   83  Linux
transoft@ubuntu-osd0:~$ 



transoft@ubuntu-osd1:~$ lsscsi
[0:0:0:0]    disk    ATA      ST3160812AS      3.AD  /dev/sda
transoft@ubuntu-osd1:~$ sudo fdisk -l
[sudo] password for transoft: 

Disk /dev/sda: 160.0 GB, 160000000000 bytes
255 heads, 63 sectors/track, 19452 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xc45cc45c

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        3648    29295616   83  Linux
/dev/sda2            3648       10942    58593280   83  Linux
/dev/sda3           10942       11185     1952768   82  Linux swap / Solaris
/dev/sda4           11185       19452    66405502    5  Extended
/dev/sda5           11185       19452    66405470+  83  Linux
transoft@ubuntu-osd1:~$ 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-11-30  6:19       ` Jeff Wu
@ 2010-11-30 17:07         ` Gregory Farnum
  2010-12-01  1:35           ` Jeff Wu
  0 siblings, 1 reply; 14+ messages in thread
From: Gregory Farnum @ 2010-11-30 17:07 UTC (permalink / raw)
  To: cpwu; +Cc: ceph-devel@vger.kernel.org, Andrew Lv

On Mon, Nov 29, 2010 at 10:19 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> Is "40-50MB/s" the speed that it run bench at local btrfs disk ?
> not the speed that run bench from client to osd server ?
> with this speed ,run bench from client to osd server ,will which  get
> about 20~25MB/s( 40~50MB /2 )speed ?
Data on Ceph is replicated across 2 OSDs (by default; this is
configurable). So while figuring out potential performance involves a
lot of variables, in a simple case like this where you aren't bounded
by network bandwidth you'll find that your read/write performance
simply tracks the slower disk. I'd expect your Ceph tests (at least
the streaming ones) to run at 40-50MB/s.

Given that everything else is okay, I cannot stress enough that
running without a journal is going to cause significant performance
degradations. I have a hard time believing that it's responsible for
13-second latencies, but it's possible. So how about you set up a
journal (it can just be a file or new partition on the drives you're
already using) and report back your results after you do that. :)
Adding a journal to the OSDs lets them turn all their random writes
into streaming ones.
-Greg

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-11-30 17:07         ` Gregory Farnum
@ 2010-12-01  1:35           ` Jeff Wu
  2010-12-01  6:59             ` Jeff Wu
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Wu @ 2010-12-01  1:35 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel@vger.kernel.org, Andrew Lv



On Wed, 2010-12-01 at 01:07 +0800, Gregory Farnum wrote:
> On Mon, Nov 29, 2010 at 10:19 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> > Is "40-50MB/s" the speed that it run bench at local btrfs disk ?
> > not the speed that run bench from client to osd server ?
> > with this speed ,run bench from client to osd server ,will which  get
> > about 20~25MB/s( 40~50MB /2 )speed ?
> Data on Ceph is replicated across 2 OSDs (by default; this is
> configurable). So while figuring out potential performance involves a
> lot of variables, in a simple case like this where you aren't bounded
> by network bandwidth you'll find that your read/write performance
> simply tracks the slower disk. I'd expect your Ceph tests (at least
> the streaming ones) to run at 40-50MB/s.

Hi Greg, thank you very much for your quick reply.
> 
> Given that everything else is okay, I cannot stress enough that
> running without a journal is going to cause significant performance
> degradations. I have a hard time believing that it's responsible for
> 13-second latencies, but it's possible. So how about you set up a
> journal (it can just be a file or new partition on the drives you're
> already using) and report back your results after you do that. :)

I will add a journal to ceph.conf and try it.



> Adding a journal to the OSDs lets them turn all their random writes
> into streaming ones.
> -Greg


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-01  1:35           ` Jeff Wu
@ 2010-12-01  6:59             ` Jeff Wu
  2010-12-01 16:05               ` Gregory Farnum
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Wu @ 2010-12-01  6:59 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel@vger.kernel.org, Andrew Lv



On Wed, 2010-12-01 at 09:35 +0800, Jeff Wu wrote:
> 
> On Wed, 2010-12-01 at 01:07 +0800, Gregory Farnum wrote:
> > On Mon, Nov 29, 2010 at 10:19 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> > > Is "40-50MB/s" the speed that it run bench at local btrfs disk ?
> > > not the speed that run bench from client to osd server ?
> > > with this speed ,run bench from client to osd server ,will which  get
> > > about 20~25MB/s( 40~50MB /2 )speed ?
> > Data on Ceph is replicated across 2 OSDs (by default; this is
> > configurable). So while figuring out potential performance involves a
> > lot of variables, in a simple case like this where you aren't bounded
> > by network bandwidth you'll find that your read/write performance
> > simply tracks the slower disk. I'd expect your Ceph tests (at least
> > the streaming ones) to run at 40-50MB/s.
> 
> Hi Greg,thank you very much for your quickly reply.
> > 
> > Given that everything else is okay, I cannot stress enough that
> > running without a journal is going to cause significant performance
> > degradations. I have a hard time believing that it's responsible for
> > 13-second latencies, but it's possible. So how about you set up a
> > journal (it can just be a file or new partition on the drives you're
> > already using) and report back your results after you do that. :)
> 
> I will add journal to ceph.conf to try it . 
> 
> 
Hi Greg,

Following your suggestion, I added the journal config:
"
osd data = /opt/ceph/data/osd$id
osd journal = /home/transoft/data/osd$id/journal
filestore journal writeahead = true
osd journal size = 10000
" 
to ceph.conf. The full ceph.conf is attached below.
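
For reference, a small Python sketch of how the "$id" placeholder in those settings resolves per OSD. The "[osd]" section name is an assumption (the fragment above does not show which section the lines sit in), the helper name is made up, and the expansion is done by plain string replacement rather than by Ceph itself.

```python
import configparser

# The four settings from the message, placed under an [osd] section.
# The section name is an assumption; the message does not show it.
FRAGMENT = """
[osd]
osd data = /opt/ceph/data/osd$id
osd journal = /home/transoft/data/osd$id/journal
filestore journal writeahead = true
osd journal size = 10000
"""

def journal_settings(text, osd_id):
    """Return (journal path, journal size) for one OSD id."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    osd = parser["osd"]
    # Expand the $id placeholder by hand for the given OSD.
    path = osd["osd journal"].replace("$id", str(osd_id))
    return path, int(osd["osd journal size"])

path, size = journal_settings(FRAGMENT, 0)
print(path, size)
```

Note that with these settings the journal file lives under /home while the OSD data lives under /opt.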

Then I ran the command "$ sudo ceph osd tell 0/1 bench" six times
and got these results:


$ sudo ceph -w

osd0 172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 29.818194 sec at 28201 KB/sec
osd0 172.16.10.42:6800/17347 2 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 30.013058 sec at 34801 KB/sec
osd0 172.16.10.42:6800/17347 3 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 30.463511 sec at 30274 KB/sec

osd1 172.16.10.65:6800/4845 1 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 165.067603 sec at 6329 KB/sec
osd1 172.16.10.65:6800/4845 2 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 181.034333 sec at 5782 KB/sec
osd1 172.16.10.65:6800/4845 3 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 196.055812 sec at 5334 KB/sec
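
A minimal Python sketch for pulling the numbers out of bench lines like these, so the two OSDs can be compared side by side. The regular expression is written against the line format shown above and is an assumption about it, not a documented format; the sample contains two of the lines, wrapped the way they appear in this mail.

```python
import re
from collections import defaultdict

# Two of the lines from above, wrapped the way the mail client left them.
LOG = """\
osd0 172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 29.818194 sec at 28201 KB/sec
osd1 172.16.10.65:6800/4845 1 : [INF] bench: wrote 1024 MB in blocks of
4096 KB in 165.067603 sec at 6329 KB/sec
"""

BENCH = re.compile(
    r"(osd\d+)\s.*?bench: wrote (\d+) MB in blocks of (\d+) KB "
    r"in ([\d.]+) sec at (\d+) KB/sec"
)

def parse_bench(text):
    """Map each OSD name to a list of (elapsed seconds, reported KB/sec)."""
    joined = " ".join(text.split())  # undo the mail line wrapping
    results = defaultdict(list)
    for osd, _mb, _block_kb, secs, rate in BENCH.findall(joined):
        results[osd].append((float(secs), int(rate)))
    return dict(results)

for osd, runs in parse_bench(LOG).items():
    print(osd, runs)
```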

I also used "dd" to test the disks directly and got these logs:

1. OSD0, /opt formatted with mkfs.btrfs

$ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024 
1024+0 records in
1024+0 records out
2147483648 bytes transfered in 21.4497 secs(100 MB/sec)

2. OSD1, /opt formatted with mkfs.btrfs

~$ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024
1024+0 records in
1024+0 records out
2147483648 bytes transfered in 48.2037 secs(44.6 MB/sec)

Judging from these logs, the OSD1 disk speed might be limiting the test performance.
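
The dd figures can be cross-checked from the byte counts and elapsed times. A short Python sketch, assuming the rate is in decimal megabytes (10^6 bytes) per second; the function name is made up.

```python
def mb_per_sec(total_bytes, seconds):
    """Throughput in decimal megabytes per second."""
    return total_bytes / seconds / 1_000_000

# Byte counts and elapsed times copied from the two dd runs above.
runs = {"OSD0": (2147483648, 21.4497), "OSD1": (2147483648, 48.2037)}
rates = {host: mb_per_sec(*run) for host, run in runs.items()}

for host, rate in rates.items():
    print(f"{host}: {rate:.1f} MB/s")
print(f"OSD0 is {rates['OSD0'] / rates['OSD1']:.1f}x faster than OSD1")
```

Both results agree with the rates dd printed, and OSD0 comes out a little over twice as fast as OSD1 for this sequential write.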

I also found an issue. To reproduce it, take the following steps:

$ mkcephfs -c ceph.conf -v --mkbtrfs -a
$ init-ceph -c ceph.conf --btrfs -v -a start
then execute:
$ init-ceph -c ceph.conf --btrfs -v -a stop

This command fails to stop the cosd processes on OSD0 and OSD1:
OSD0:
/usr/local/bin/cosd -i 0 -c ceph.conf
OSD1:
/usr/local/bin/cosd -i 1 -c ceph.conf


Then I created the folder "/var/run/ceph" on the OSD0 and OSD1 hosts
manually and executed:
$ init-ceph -c ceph.conf --btrfs -v -a stop

Now the command does stop the cosd processes on OSD0 and OSD1:

/usr/local/bin/cosd -i 0 -c ceph.conf
/usr/local/bin/cosd -i 1 -c ceph.conf


Thanks,
Jeff.Wu

> 
> > Adding a journal to the OSDs lets them turn all their random writes
> > into streaming ones.
> > -Greg
> 

=========================================================
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench
2010-12-01 10:45:13.670910 mon <- [osd,tell,0,bench]
2010-12-01 10:45:13.671180 mon1 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench
2010-12-01 10:45:29.350198 mon <- [osd,tell,0,bench]
2010-12-01 10:45:29.350457 mon1 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 0 bench
2010-12-01 10:45:31.000281 mon <- [osd,tell,0,bench]
2010-12-01 10:45:31.000560 mon0 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench
2010-12-01 10:45:34.860782 mon <- [osd,tell,1,bench]
2010-12-01 10:45:34.861020 mon1 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench
2010-12-01 10:45:36.760811 mon <- [osd,tell,1,bench]
2010-12-01 10:45:36.761161 mon2 -> 'ok' (0)
transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph osd tell 1 bench
2010-12-01 10:45:37.530714 mon <- [osd,tell,1,bench]
2010-12-01 10:45:37.530968 mon2 -> 'ok' (0)

transoft@ubuntu-mon0:/usr/local/etc/ceph$ sudo ceph -w

2010-12-01 10:44:59.450653    pg v13: 528 pgs: 528 active+clean; 12 KB
data, 5304 KB used, 219 GB / 219 GB avail
2010-12-01 10:44:59.451365   mds e5: 1/1/1 up {0=up:active}, 1
up:standby
2010-12-01 10:44:59.451387   osd e6: 2 osds: 2 up, 2 in
2010-12-01 10:44:59.451412   log 2010-12-01 10:43:43.044865 mon0
172.16.10.171:6789/0 7 : [INF] mds0 172.16.10.171:6801/2482 up:active
2010-12-01 10:44:59.451440   mon e1: 3 mons at
{0=172.16.10.171:6789/0,1=172.16.10.171:6790/0,2=172.16.10.171:6791/0}
2010-12-01 10:46:45.000262   log 2010-12-01 10:45:15.599526 osd0
172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB in blocks of 4096
KB in 29.818194 sec at 28201 KB/sec
2010-12-01 10:46:45.000262   log 2010-12-01 10:45:46.062142 osd0
172.16.10.42:6800/17347 2 : [INF] bench: wrote 1024 MB in blocks of 4096
KB in 30.013058 sec at 34801 KB/sec
2010-12-01 10:46:45.000262   log 2010-12-01 10:46:16.836607 osd0
172.16.10.42:6800/17347 3 : [INF] bench: wrote 1024 MB in blocks of 4096
KB in 30.463511 sec at 30274 KB/sec
2010-12-01 10:48:20.042152    pg v14: 528 pgs: 528 active+clean; 32780
KB data, 888 MB used, 218 GB / 219 GB avail
2010-12-01 10:50:50.038298    pg v15: 528 pgs: 528 active+clean; 73740
KB data, 54928 KB used, 219 GB / 219 GB avail
2010-12-01 10:52:15.074470    pg v16: 528 pgs: 528 active+clean; 73740
KB data, 79440 KB used, 219 GB / 219 GB avail
2010-12-01 10:54:55.546098   log 2010-12-01 11:52:34.244851 osd1
172.16.10.65:6800/4845 1 : [INF] bench: wrote 1024 MB in blocks of 4096
KB in 165.067603 sec at 6329 KB/sec
2010-12-01 10:54:55.546098   log 2010-12-01 11:55:52.010739 osd1
172.16.10.65:6800/4845 2 : [INF] bench: wrote 1024 MB in blocks of 4096
KB in 181.034333 sec at 5782 KB/sec
2010-12-01 10:54:55.546098   log 2010-12-01 11:59:09.560115 osd1
172.16.10.65:6800/4845 3 : [INF] bench: wrote 1024 MB in blocks of 4096
KB in 196.055812 sec at 5334 KB/sec
2010-12-01 10:55:01.001357    pg v17: 528 pgs: 528 active+clean; 73741
KB data, 1106 MB used, 218 GB / 219 GB avail


============ceph.conf====================


;
; Sample ceph ceph.conf file.
;
; This file defines cluster membership, the various locations
; that Ceph stores data, and any other runtime options.

; If a 'host' is defined for a daemon, the start/stop script will
; verify that it matches the hostname (or else ignore it).  If it is
; not defined, it is assumed that the daemon is intended to start on
; the current host (e.g., in a setup with a startup.conf on each
; node).

; global
[global]
; enable secure authentication
; auth supported = cephx
keyring = /etc/ceph/keyring.bin
; monitors
;  You need at least one.  You need at least three if you want to
;  tolerate any node failures.  Always create an odd number.
[mon]
mon data = /opt/ceph/data/mon$id
;mon data = /home/transoft/data/mon$id

; logging, for debugging monitor crashes, in order of
; their likelihood of being helpful :)
;debug ms = 20
;debug mon = 20
;debug paxos = 20
;debug auth = 20

[mon0]
host = ubuntu-mon0
mon addr = 172.16.10.171:6789

[mon1]
host = ubuntu-mon0
mon addr = 172.16.10.171:6790

[mon2]
host = ubuntu-mon0
mon addr = 172.16.10.171:6791

; mds
;  You need at least one.  Define two to get a standby.
[mds]
; where the mds keeps its secret encryption keys
keyring = /etc/ceph/keyring.$name

; mds logging to debug issues.
;debug ms = 20
;debug mds = 20

[mds.0]
host = ubuntu-mon0

[mds.1]
host = ubuntu-mon0

; osd
;  You need at least one.  Two if you want data to be replicated.
;  Define as many as you like.
[osd]
; This is where the btrfs volume will be mounted.
;osd data = /opt/ceph/data/osd$id
osd class tmp = /var/lib/ceph/tmp

; Ideally, make this a separate disk or partition.  A few
; hundred MB should be enough; more if you have fast or many
; disks.  You can use a file under the osd data dir if need be
; (e.g. /data/osd$id/journal), but it will be slower than a
; separate disk or partition.

; This is an example of a file-based journal.
;osd journal = /home/transoft/data/osd$id/journal
;filestore journal writeahead = true
; journal size, in megabytes
;osd journal size = 1000 
keyring = /etc/ceph/keyring.$name

; osd logging to debug osd issues, in order of likelihood of being
; helpful
;debug ms = 20
;debug osd = 20
;debug filestore = 20
;debug journal = 20

[osd0]
host = ubuntu-osd0
osd data = /opt/ceph/data/osd$id
osd journal = /home/transoft/data/osd$id/journal
filestore journal writeahead = true
osd journal size = 10000 
; if 'btrfs devs' is not specified, you're responsible for
; setting up the 'osd data' dir.  if it is not btrfs, things
; will behave up until you try to recover from a crash (which
; is usually fine for basic testing).
; btrfs devs = /dev/sdx

[osd1]
host = ubuntu-osd1
osd data = /opt/ceph/data/osd$id
osd journal = /home/transoft/data/osd$id/journal
filestore journal writeahead = true
osd journal size = 10000 
;btrfs devs = /dev/sdy

;[osd2]
;host = zeta
;btrfs devs = /dev/sdx

;[osd3]
;host = eta
;btrfs devs = /dev/sdy

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-01  6:59             ` Jeff Wu
@ 2010-12-01 16:05               ` Gregory Farnum
  2010-12-02  1:38                 ` Jeff Wu
  0 siblings, 1 reply; 14+ messages in thread
From: Gregory Farnum @ 2010-12-01 16:05 UTC (permalink / raw)
  To: cpwu; +Cc: ceph-devel@vger.kernel.org, Andrew Lv

On Tue, Nov 30, 2010 at 10:57 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> Hi ,greg,
>
> With your suggestions, i add the journal config:
> "
> osd data = /opt/ceph/data/osd$id
> osd journal = /home/transoft/data/osd$id/journal
> filestore journal writeahead = true
> osd journal size = 10000
> "
> to ceph.conf.  the  detail ceph.conf attached below.
>
> then , run six times for the commad: "$ sudo ceph osd tell 0/1 bench" ,get
> the results:
>
>
> $ sudo ceph -w
>
> osd0 172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB in blocks of
> 4096 KB in 29.818194 sec at 28201 KB/sec
> osd0 172.16.10.42:6800/17347 2 : [INF] bench: wrote 1024 MB in blocks of
> 4096 KB in 30.013058 sec at 34801 KB/sec
> osd0 172.16.10.42:6800/17347 3 : [INF] bench: wrote 1024 MB in blocks of
> 4096 KB in 30.463511 sec at 30274 KB/sec
>
> osd1 172.16.10.65:6800/4845 1 : [INF] bench: wrote 1024 MB in blocks of 4096
> KB in 165.067603 sec at 6329 KB/sec
> osd1 172.16.10.65:6800/4845 2 : [INF] bench: wrote 1024 MB in blocks of 4096
> KB in 181.034333 sec at 5782 KB/sec
> osd1 172.16.10.65:6800/4845 3 : [INF] bench: wrote 1024 MB in blocks of 4096
> KB in 196.055812 sec at 5334 KB/sec
>
> and i also use "dd" to test raw drive, get the logs:
>
> 1. OSD0, mkfs.btrfs format /opt
>
> $ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024
> 1024+0 records in
> 1024+0 records out
> 2147483648 bytes transfered in 21.4497 secs(100 MB/sec)
>
> 2. OSD1 ,mkfs. btrfs format /opt
>
> ~$ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024
> 1024+0 records in
> 1024+0 records out
> 2147483648 bytes transfered in 48.2037 secs(44.6 MB/sec)
>
> with these logs, OSD1 disk speed might limit the  test performance.
Yes, it looks to me like your OSD1 disk isn't working properly.
Switching it from one streaming write to two streaming writes
shouldn't reduce it to 20% of its original performance (~45MB/s to
5MB/s*2). The same change in tasks doesn't impact OSD0's disk at all.
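The comparison above can be checked with a little arithmetic. The sketch below takes the bench rates reported by `ceph -w` and the dd rates from the quoted logs, and assumes (as the reply does) that every benchmark byte is written twice on the same disk, once to the journal and once to the data store. It also assumes that the bench "KB" is 1024 bytes and the dd "MB" is 10^6 bytes. The helper name `disk_fraction` is made up for illustration.

```shell
#!/bin/sh
# disk_fraction DD_MB_PER_SEC BENCH_KB_PER_SEC...
# Prints the share of the raw dd rate that the disk sustains when every
# benchmark byte is written twice (journal + data store). Hypothetical helper.
disk_fraction() {
    dd_rate=$1; shift
    printf '%s\n' "$@" | awk -v dd="$dd_rate" '
        { sum += $1; n++ }
        END {
            avg_kb  = sum / n                        # mean bench rate, KB/sec
            disk_mb = avg_kb * 2 * 1024 / 1000000    # two streams, in MB/sec
            printf "%.0f%%\n", 100 * disk_mb / dd
        }'
}

disk_fraction 100  28201 34801 30274   # osd0: dd rate, then three bench runs
disk_fraction 44.6 6329 5782 5334      # osd1: dd rate, then three bench runs
```

Under these assumptions osd1 ends up at roughly a quarter of its dd rate, while osd0 keeps well over half of its dd rate, so the drop on osd1 is much steeper than the doubled write load alone would explain.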

> and i also detect a issue ,take the following steps:
>
> $. mckephfs -c ceph.conf -v --mkbtrfs -a
> $  init-ceph - ceph.conf --btrfs -v -a start
> then execute:
> $  init-ceph - ceph.conf --btrfs -v -a stop
>
> this command can't stop OSD0 and OSD1 cosd process:
> OSD0:
> /usr/local/bin/cosd -i 0 -c ceph.conf
> OSD1:
> /usr/local/bin/cosd -i 1 -c ceph.conf
Not sure I understand what you're doing here. Also, it looks like
you've got a malformed command there -- you don't specify the "-c"
option, just the nonexistent "-" option. ;)

* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-01 16:05               ` Gregory Farnum
@ 2010-12-02  1:38                 ` Jeff Wu
  2010-12-02  2:35                   ` Gregory Farnum
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Wu @ 2010-12-02  1:38 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel@vger.kernel.org, Andrew Lv



On Thu, 2010-12-02 at 00:05 +0800, Gregory Farnum wrote:
> On Tue, Nov 30, 2010 at 10:57 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> > Hi ,greg,
> >
> > With your suggestions, i add the journal config:
> > "
> > osd data = /opt/ceph/data/osd$id
> > osd journal = /home/transoft/data/osd$id/journal
> > filestore journal writeahead = true
> > osd journal size = 10000
> > "
> > to ceph.conf.  the  detail ceph.conf attached below.
> >
> > then , run six times for the commad: "$ sudo ceph osd tell 0/1 bench" ,get
> > the results:
> >
> >
> > $ sudo ceph -w
> >
> > osd0 172.16.10.42:6800/17347 1 : [INF] bench: wrote 1024 MB in blocks of
> > 4096 KB in 29.818194 sec at 28201 KB/sec
> > osd0 172.16.10.42:6800/17347 2 : [INF] bench: wrote 1024 MB in blocks of
> > 4096 KB in 30.013058 sec at 34801 KB/sec
> > osd0 172.16.10.42:6800/17347 3 : [INF] bench: wrote 1024 MB in blocks of
> > 4096 KB in 30.463511 sec at 30274 KB/sec
> >
> > osd1 172.16.10.65:6800/4845 1 : [INF] bench: wrote 1024 MB in blocks of 4096
> > KB in 165.067603 sec at 6329 KB/sec
> > osd1 172.16.10.65:6800/4845 2 : [INF] bench: wrote 1024 MB in blocks of 4096
> > KB in 181.034333 sec at 5782 KB/sec
> > osd1 172.16.10.65:6800/4845 3 : [INF] bench: wrote 1024 MB in blocks of 4096
> > KB in 196.055812 sec at 5334 KB/sec
> >
> > and i also use "dd" to test raw drive, get the logs:
> >
> > 1. OSD0, mkfs.btrfs format /opt
> >
> > $ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024
> > 1024+0 records in
> > 1024+0 records out
> > 2147483648 bytes transfered in 21.4497 secs(100 MB/sec)
> >
> > 2. OSD1 ,mkfs. btrfs format /opt
> >
> > ~$ sudo dd if=/dev/zero of=/opt/dd.img bs=2M count=1024
> > 1024+0 records in
> > 1024+0 records out
> > 2147483648 bytes transfered in 48.2037 secs(44.6 MB/sec)
> >
> > with these logs, OSD1 disk speed might limit the  test performance.
> Yes, it looks to me like your OSD1 disk isn't working properly.
> Switching it from one streaming write to two streaming writes
> shouldn't reduce it to 20% of its original performance (~45MB/s to
> 5MB/s*2). The same change in tasks doesn't impact OSD0's disk at all.
> 
Hi Greg, thank you for your detailed comments.


> > and i also detect a issue ,take the following steps:
> >
> > $. mckephfs -c ceph.conf -v --mkbtrfs -a
> > $  init-ceph - ceph.conf --btrfs -v -a start
> > then execute:
> > $  init-ceph - ceph.conf --btrfs -v -a stop
> >
> > this command can't stop OSD0 and OSD1 cosd process:
> > OSD0:
> > /usr/local/bin/cosd -i 0 -c ceph.conf
> > OSD1:
> > /usr/local/bin/cosd -i 1 -c ceph.conf
> Not sure I understand what you're doing here. Also, it looks like
> you've got a malformed command there -- you don't specify the "-c"
> option, just the nonexistent "-" option. ;)

Oh, sorry. I mean that if I don't create the folder "/var/run/ceph" on the
OSD hosts manually, the command "$ init-ceph -c ceph.conf --btrfs -v -a stop"
can't stop the cosd processes on the OSD hosts, like this:
OSD0 host:
$ ps -ef | grep cosd
root     13987     1  0 Dec01 ?        00:02:55 /usr/local/bin/cosd -i 0 -c ceph.conf
OSD1 host:
$ ps -ef | grep cosd
root     14028     1  0 Dec01 ?        00:02:53 /usr/local/bin/cosd -i 1 -c ceph.conf

I have to run "kill -9 13987" and "kill -9 14028" to kill the cosd
processes manually; otherwise the next "$ init-ceph -c ceph.conf --btrfs
-v -a start" command fails.


jeff,wu

* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-02  1:38                 ` Jeff Wu
@ 2010-12-02  2:35                   ` Gregory Farnum
  2010-12-02  3:22                     ` Jeff Wu
  0 siblings, 1 reply; 14+ messages in thread
From: Gregory Farnum @ 2010-12-02  2:35 UTC (permalink / raw)
  To: cpwu, Sage Weil; +Cc: ceph-devel@vger.kernel.org, Andrew Lv

On Wed, Dec 1, 2010 at 5:38 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
>> > and i also detect a issue ,take the following steps:
>> >
>> > $. mckephfs -c ceph.conf -v --mkbtrfs -a
>> > $  init-ceph - ceph.conf --btrfs -v -a start
>> > then execute:
>> > $  init-ceph - ceph.conf --btrfs -v -a stop
>> >
>> > this command can't stop OSD0 and OSD1 cosd process:
>> > OSD0:
>> > /usr/local/bin/cosd -i 0 -c ceph.conf
>> > OSD1:
>> > /usr/local/bin/cosd -i 1 -c ceph.conf
>> Not sure I understand what you're doing here. Also, it looks like
>> you've got a malformed command there -- you don't specify the "-c"
>> option, just the nonexistent "-" option. ;)
>
> Oh,Sorry, i mean that , if i don't create folder "/var/run/ceph" at OSD
> hosts manually. Execute the command : "$init-ceph -c ceph.conf --btrfs
> -v -a stop " ,which can't auto-kill OSD host cosd process, like this :
> OSD0 host:
> $ ps -ef | grep cosd
> root     13987     1  0 Dec01 ?        00:02:55 /usr/local/bin/cosd -i 0
> -c ceph.conf
> OSD1 host:
> $ ps -ef | grep cosd
> root     14028     1  0 Dec01 ?        00:02:53 /usr/local/bin/cosd -i 1
> -c ceph.conf
>
> I have to execute "kill -9 13987 " and "kill -9 14028" to kill cosd
> process manually, or ,next time , it will fail to execute "$ init-ceph
> -c ceph.conf --btrfs -v -a start " command.
Ah. I think that /var/run/ceph is where init-ceph stores the PIDs. It
ought to be created automatically; if it's not we should fix that.
What version of Ceph are you running, and where from? I'd imagine it's
being packaged wrong or something.
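To illustrate why a missing PID directory leaves daemons running, here is a minimal sketch of a pidfile-based start/stop pair. It is not the init-ceph code: the function names `start_daemon` and `stop_daemon`, the `PID_DIR` variable, and the use of `sleep` as a stand-in daemon are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical stand-in for an init script; not taken from init-ceph.
PID_DIR=${PID_DIR:-/tmp/pidfile-demo.$$}

# start_daemon NAME: launch a placeholder daemon and try to record its PID.
start_daemon() {
    sleep 300 >/dev/null 2>&1 &
    LAST_PID=$!
    # If PID_DIR does not exist, this write fails and the PID is not recorded.
    { echo "$LAST_PID" > "$PID_DIR/$1.pid"; } 2>/dev/null
}

# stop_daemon NAME: kill the PID recorded for NAME, if there is one.
stop_daemon() {
    pidfile="$PID_DIR/$1.pid"
    if [ ! -r "$pidfile" ]; then
        echo "no pid file for $1; nothing stopped" >&2
        return 1
    fi
    kill "$(cat "$pidfile")" && rm -f "$pidfile"
}

# Without the directory: the daemon starts, but stop cannot find it.
start_daemon osd0 || echo "could not write pid file"
stop_daemon osd0 || kill "$LAST_PID"    # manual cleanup, like the kill -9 above

# With the directory: the PID is recorded and stop works.
mkdir -p "$PID_DIR"
start_daemon osd0 && stop_daemon osd0
rm -rf "$PID_DIR"
```

In this sketch the start step still succeeds in launching the process when the directory is absent, which matches the symptom described in the thread: the daemons run, but a later stop has no record of them.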

* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-02  2:35                   ` Gregory Farnum
@ 2010-12-02  3:22                     ` Jeff Wu
  2010-12-02  6:10                       ` Sage Weil
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Wu @ 2010-12-02  3:22 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: Sage Weil, ceph-devel@vger.kernel.org, Andrew Lv



On Thu, 2010-12-02 at 10:35 +0800, Gregory Farnum wrote:
> On Wed, Dec 1, 2010 at 5:38 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> >> > and i also detect a issue ,take the following steps:
> >> >
> >> > $. mckephfs -c ceph.conf -v --mkbtrfs -a
> >> > $  init-ceph - ceph.conf --btrfs -v -a start
> >> > then execute:
> >> > $  init-ceph - ceph.conf --btrfs -v -a stop
> >> >
> >> > this command can't stop OSD0 and OSD1 cosd process:
> >> > OSD0:
> >> > /usr/local/bin/cosd -i 0 -c ceph.conf
> >> > OSD1:
> >> > /usr/local/bin/cosd -i 1 -c ceph.conf
> >> Not sure I understand what you're doing here. Also, it looks like
> >> you've got a malformed command there -- you don't specify the "-c"
> >> option, just the nonexistent "-" option. ;)
> >
> > Oh,Sorry, i mean that , if i don't create folder "/var/run/ceph" at OSD
> > hosts manually. Execute the command : "$init-ceph -c ceph.conf --btrfs
> > -v -a stop " ,which can't auto-kill OSD host cosd process, like this :
> > OSD0 host:
> > $ ps -ef | grep cosd
> > root     13987     1  0 Dec01 ?        00:02:55 /usr/local/bin/cosd -i 0
> > -c ceph.conf
> > OSD1 host:
> > $ ps -ef | grep cosd
> > root     14028     1  0 Dec01 ?        00:02:53 /usr/local/bin/cosd -i 1
> > -c ceph.conf
> >
> > I have to execute "kill -9 13987 " and "kill -9 14028" to kill cosd
> > process manually, or ,next time , it will fail to execute "$ init-ceph
> > -c ceph.conf --btrfs -v -a start " command.
> Ah. I think that /var/run/ceph is where init-ceph stores the PIDs. It
> ought to be created automatically; if it's not we should fix that.
> What version of Ceph are you running, and where from? I'd imagine it's
> being packaged wrong or something.

Hi,
I downloaded Ceph 0.23 from
http://ceph.newdream.net/download/ceph-0.23.tar.gz

So maybe I should add
"
[global]
       pid file = /var/run/ceph/$name.pid
..............................
"
to ceph.conf?

* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-02  3:22                     ` Jeff Wu
@ 2010-12-02  6:10                       ` Sage Weil
  2010-12-02  7:31                         ` Jeff Wu
  0 siblings, 1 reply; 14+ messages in thread
From: Sage Weil @ 2010-12-02  6:10 UTC (permalink / raw)
  To: Jeff Wu; +Cc: Gregory Farnum, ceph-devel@vger.kernel.org, Andrew Lv

On Thu, 2 Dec 2010, Jeff Wu wrote:
> On Thu, 2010-12-02 at 10:35 +0800, Gregory Farnum wrote:
> > On Wed, Dec 1, 2010 at 5:38 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> > >> > and i also detect a issue ,take the following steps:
> > >> >
> > >> > $. mckephfs -c ceph.conf -v --mkbtrfs -a
> > >> > $  init-ceph - ceph.conf --btrfs -v -a start
> > >> > then execute:
> > >> > $  init-ceph - ceph.conf --btrfs -v -a stop
> > >> >
> > >> > this command can't stop OSD0 and OSD1 cosd process:
> > >> > OSD0:
> > >> > /usr/local/bin/cosd -i 0 -c ceph.conf
> > >> > OSD1:
> > >> > /usr/local/bin/cosd -i 1 -c ceph.conf
> > >> Not sure I understand what you're doing here. Also, it looks like
> > >> you've got a malformed command there -- you don't specify the "-c"
> > >> option, just the nonexistent "-" option. ;)
> > >
> > > Oh,Sorry, i mean that , if i don't create folder "/var/run/ceph" at OSD
> > > hosts manually. Execute the command : "$init-ceph -c ceph.conf --btrfs
> > > -v -a stop " ,which can't auto-kill OSD host cosd process, like this :
> > > OSD0 host:
> > > $ ps -ef | grep cosd
> > > root     13987     1  0 Dec01 ?        00:02:55 /usr/local/bin/cosd -i 0
> > > -c ceph.conf
> > > OSD1 host:
> > > $ ps -ef | grep cosd
> > > root     14028     1  0 Dec01 ?        00:02:53 /usr/local/bin/cosd -i 1
> > > -c ceph.conf
> > >
> > > I have to execute "kill -9 13987 " and "kill -9 14028" to kill cosd
> > > process manually, or ,next time , it will fail to execute "$ init-ceph
> > > -c ceph.conf --btrfs -v -a start " command.
> > Ah. I think that /var/run/ceph is where init-ceph stores the PIDs. It
> > ought to be created automatically; if it's not we should fix that.
> > What version of Ceph are you running, and where from? I'd imagine it's
> > being packaged wrong or something.
> 
> Hi ,
> i download ceph 0.23 from 
> http://ceph.newdream.net/download/ceph-0.23.tar.gz
> 
> So , maybe , should i add 
> "
> [global]
>        pid file = /var/run/ceph/$name.pid

That's the default, so no...  I think the problem is that 'make install' 
doesn't do 'mkdir -p /var/run/ceph'.  The .deb and .rpm create the dir, but an 
install from source does not.  There is probably a similar problem with 
the osd class tmp dir.
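
Until the source install creates these directories itself, one workaround is to create them by hand on every host after 'make install'. The sketch below does that for the two paths mentioned in this thread. The `ROOT` prefix variable and the function name `make_ceph_dirs` are only there so the snippet can be tried without root; they are not part of Ceph.

```shell
#!/bin/sh
# Create the runtime directories that the .deb/.rpm packages would create.
# ROOT is a hypothetical prefix for dry runs; leave it empty on a real host.
ROOT=${ROOT:-}

make_ceph_dirs() {
    for dir in /var/run/ceph /var/lib/ceph/tmp; do
        mkdir -p "$ROOT$dir" || return 1
    done
}

# Dry run into a scratch directory:
#   ROOT=/tmp/ceph-root; make_ceph_dirs
# On a real OSD host (needs root):
#   sudo mkdir -p /var/run/ceph /var/lib/ceph/tmp
```

If /var/run is a tmpfs on the host, the first directory would presumably have to be recreated after each reboot, so it may be worth running this from whatever starts the daemons.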

sage


* Re: Performence test on ceph v0.23 + EXT4 and Btrfs
  2010-12-02  6:10                       ` Sage Weil
@ 2010-12-02  7:31                         ` Jeff Wu
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff Wu @ 2010-12-02  7:31 UTC (permalink / raw)
  To: Sage Weil; +Cc: Gregory Farnum, ceph-devel@vger.kernel.org, Andrew Lv



On Thu, 2010-12-02 at 14:10 +0800, Sage Weil wrote:
> On Thu, 2 Dec 2010, Jeff Wu wrote:
> > On Thu, 2010-12-02 at 10:35 +0800, Gregory Farnum wrote:
> > > On Wed, Dec 1, 2010 at 5:38 PM, Jeff Wu <cpwu@tnsoft.com.cn> wrote:
> > > >> > and i also detect a issue ,take the following steps:
> > > >> >
> > > >> > $. mckephfs -c ceph.conf -v --mkbtrfs -a
> > > >> > $  init-ceph - ceph.conf --btrfs -v -a start
> > > >> > then execute:
> > > >> > $  init-ceph - ceph.conf --btrfs -v -a stop
> > > >> >
> > > >> > this command can't stop OSD0 and OSD1 cosd process:
> > > >> > OSD0:
> > > >> > /usr/local/bin/cosd -i 0 -c ceph.conf
> > > >> > OSD1:
> > > >> > /usr/local/bin/cosd -i 1 -c ceph.conf
> > > >> Not sure I understand what you're doing here. Also, it looks like
> > > >> you've got a malformed command there -- you don't specify the "-c"
> > > >> option, just the nonexistent "-" option. ;)
> > > >
> > > > Oh,Sorry, i mean that , if i don't create folder "/var/run/ceph" at OSD
> > > > hosts manually. Execute the command : "$init-ceph -c ceph.conf --btrfs
> > > > -v -a stop " ,which can't auto-kill OSD host cosd process, like this :
> > > > OSD0 host:
> > > > $ ps -ef | grep cosd
> > > > root     13987     1  0 Dec01 ?        00:02:55 /usr/local/bin/cosd -i 0
> > > > -c ceph.conf
> > > > OSD1 host:
> > > > $ ps -ef | grep cosd
> > > > root     14028     1  0 Dec01 ?        00:02:53 /usr/local/bin/cosd -i 1
> > > > -c ceph.conf
> > > >
> > > > I have to execute "kill -9 13987 " and "kill -9 14028" to kill cosd
> > > > process manually, or ,next time , it will fail to execute "$ init-ceph
> > > > -c ceph.conf --btrfs -v -a start " command.
> > > Ah. I think that /var/run/ceph is where init-ceph stores the PIDs. It
> > > ought to be created automatically; if it's not we should fix that.
> > > What version of Ceph are you running, and where from? I'd imagine it's
> > > being packaged wrong or something.
> > 
> > Hi ,
> > i download ceph 0.23 from 
> > http://ceph.newdream.net/download/ceph-0.23.tar.gz
> > 
> > So , maybe , should i add 
> > "
> > [global]
> >        pid file = /var/run/ceph/$name.pid
> 
> That's the default, so no...  I think the problem is that 'make install' 
> doesn't do 'mkdir -p /var/run/ceph'.  The .deb and .rpm create the dir, but an 
> install from source does not.  There is probably a similar problem with 
> the osd class tmp dir.
> 
Yes, when I try to use RBD, I also need to create "/var/lib/ceph/tmp"
manually. I installed the Ceph server from ceph-0.23.tar.gz.
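
A small pre-flight check can catch this before the daemons are started. The sketch below reads the "osd class tmp" path out of a ceph.conf and lists any required directory that does not exist yet. The helpers `conf_value` and `missing_dirs` are hypothetical, and the parsing is deliberately simplified: it ignores sections and just takes the first uncommented match.

```shell
#!/bin/sh
# conf_value FILE KEY: print the value of the first uncommented
# "KEY = value" line in FILE (sections are ignored in this sketch).
conf_value() {
    awk -F'=' -v key="$2" '
        /^[ \t]*;/ { next }                  # skip ";" comment lines
        {
            k = $1; sub(/^[ \t]+/, "", k); sub(/[ \t]+$/, "", k)
            if (k == key) {
                v = $2; sub(/^[ \t]+/, "", v); sub(/[ \t]+$/, "", v)
                print v; exit
            }
        }' "$1"
}

# missing_dirs DIR...: print each named directory that does not exist.
missing_dirs() {
    for dir in "$@"; do
        if [ -n "$dir" ] && [ ! -d "$dir" ]; then
            echo "$dir"
        fi
    done
}

# Example use on an OSD host, with ceph.conf in the current directory:
#   missing_dirs /var/run/ceph "$(conf_value ceph.conf 'osd class tmp')"
```

Any path it prints would then need a manual "mkdir -p" on that host before running init-ceph.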

Jeff,wu

> sage


end of thread, other threads:[~2010-12-02  7:30 UTC | newest]

Thread overview: 14+ messages
     [not found] <1291001135.1872.106.camel@cephhost>
     [not found] ` <1291001398.1872.107.camel@cephhost>
     [not found]   ` <1291001962.1872.113.camel@cephhost>
     [not found]     ` <1291002250.1872.116.camel@cephhost>
2010-11-29  3:53       ` Performence test on ceph v0.23 + EXT4 and Btrfs Jeff Wu
2010-11-29 17:07 ` Gregory Farnum
2010-11-30  2:55   ` Jeff Wu
2010-11-30  3:18     ` Gregory Farnum
2010-11-30  6:19       ` Jeff Wu
2010-11-30 17:07         ` Gregory Farnum
2010-12-01  1:35           ` Jeff Wu
2010-12-01  6:59             ` Jeff Wu
2010-12-01 16:05               ` Gregory Farnum
2010-12-02  1:38                 ` Jeff Wu
2010-12-02  2:35                   ` Gregory Farnum
2010-12-02  3:22                     ` Jeff Wu
2010-12-02  6:10                       ` Sage Weil
2010-12-02  7:31                         ` Jeff Wu
