All of lore.kernel.org
 help / color / mirror / Atom feed
* Trouble
@ 2011-04-10  0:08 Fyodor Ustinov
  2011-04-10 14:25 ` Trouble Fyodor Ustinov
  0 siblings, 1 reply; 2+ messages in thread
From: Fyodor Ustinov @ 2011-04-10  0:08 UTC (permalink / raw)
  To: ceph-devel

Hi!

Trouble.

stb1 - ceph client (xx.xx.xx.104)
4 osd servers (stc1 - stc4, xx.xx.xx.200 - xx.xx.xx.203).
1 mon+mds server (mds1 - xx.xx.xx.199).
ceph - 0.26

New, empty fs.

root@stc1:~# ceph osd dump -o -
2011-04-09 21:12:05.365830 mon <- [osd,dump]
2011-04-09 21:12:05.366348 mon0 -> 'dumped osdmap epoch 5' (0)
epoch 5
fsid bc6102b2-46a9-8bcf-6eed-a05a8de25a2e
created 2011-04-09 21:05:28.352664
modifed 2011-04-09 21:05:41.640899
flags

pg_pool 0 'data' pg_pool(rep pg_size 2 crush_ruleset 0 object_hash 
rjenkins pg_num 256 pgp_num 256 lpg_num 2 lpgp_num 2 last_change 1 owner 0)
pg_pool 1 'metadata' pg_pool(rep pg_size 2 crush_ruleset 1 object_hash 
rjenkins pg_num 256 pgp_num 256 lpg_num 2 lpgp_num 2 last_change 1 owner 0)
pg_pool 2 'casdata' pg_pool(rep pg_size 2 crush_ruleset 2 object_hash 
rjenkins pg_num 256 pgp_num 256 lpg_num 2 lpgp_num 2 last_change 1 owner 0)
pg_pool 3 'rbd' pg_pool(rep pg_size 2 crush_ruleset 3 object_hash 
rjenkins pg_num 256 pgp_num 256 lpg_num 2 lpgp_num 2 last_change 1 owner 0)

max_osd 4
osd0 up   in  weight 1 up_from 2 up_thru 4 down_at 0 last_clean_interval 
0-0 xx.xx.xx.200:6800/9065 xx.xx.xx.200:6801/9065 xx.xx.xx.200:6802/9065
osd1 up   in  weight 1 up_from 3 up_thru 4 down_at 0 last_clean_interval 
0-0 xx.xx.xx.201:6800/9153 xx.xx.xx.201:6801/9153 xx.xx.xx.201:6802/9153
osd2 up   in  weight 1 up_from 4 up_thru 4 down_at 0 last_clean_interval 
0-0 xx.xx.xx.202:6800/9915 xx.xx.xx.202:6801/9915 xx.xx.xx.202:6802/9915
osd3 up   in  weight 1 up_from 4 up_thru 4 down_at 0 last_clean_interval 
0-0 xx.xx.xx.203:6800/3325 xx.xx.xx.203:6801/3325 xx.xx.xx.203:6802/3325


root@stc1:~# rados df
pool name                 KB      objects       clones     degraded      
unfound           rd        rd KB           wr        wr KB
casdata                    0            0            0            
0           0            0            0            0            0
data                       0            0            0            
0           0            0            0            0            0
metadata                  21           21            0            
0           0           12            0           48           22
rbd                        0            0            0            
0           0            0            0            0            0
   total used          844432           21
   total avail     3624515892
   total space     3768948976

2011-04-09 21:08:21.478776    pg v94: 1056 pgs: 1056 active+clean; 21 KB 
data, 824 MB used, 3456 GB / 3594 GB avail

root@stb1:~# mount -t ceph stm1:/ /mnt

root@stb1:~# df -h /mnt
Filesystem            Size  Used Avail Use% Mounted on
xx.xx.xx.199:/      3.6T  825M  3.4T   1% /mnt
root@stb1:~#

root@stb1:~# bonnie++ -d /mnt -u root -n 128:10:0:5 -m stb1-1
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.96       ------Sequential Output------ --Sequential Input- 
--Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- 
--Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  
/sec %CP
stb1-1           8G  1150  97 102418   6 32516   2  2139  95 96899   4 
771.7  15
Latency              8881us    1842ms    4292ms   11700us     252ms   
47795us
Version  1.96       ------Sequential Create------ --------Random 
Create--------
stb1-1              -Create-- --Read--- -Delete-- -Create-- --Read--- 
-Delete--
files:max            /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  
/sec %CP
          128:10:0/5   120   0 40511  22   109   0    94   0    93   
0    28   0
Latency             13858ms   27269us      300s   23428ms     965ms    
3740ms
1.96,1.96,stb1-1,1,1302367781,8G,,1150,97,102418,6,32516,2,2139,95,96899,4,771.7,15,128,10,,,5,120,0,40511,22,109,0,94,0,93,0,28,0,8881us,1842ms,4292ms,11700us,252ms,47795us,13858ms,27269us,300s,23428ms,965ms,3740ms

Messages in mds.log:

2011-04-09 21:09:12.355366 7fa114795700 -- xx.xx.xx.199:6800/20513 >> 
xx.xx.xx.104:0/2435973235 pipe(0xea48b0 sd=10 pgs=0 cs=0 l=0).accept 
peer addr is really xx.xx.xx.104:0/2435973235 (socket is 77.120
.104.104:45340/0)
2011-04-09 21:46:29.219213 7fa115f9f700 log [INF] : closing stale 
session client4109 xx.xx.xx.104:0/2435973235 after 301.146714
2011-04-09 21:46:33.078161 7fa114795700 -- xx.xx.xx.199:6800/20513 >> 
xx.xx.xx.104:0/2435973235 pipe(0x517dcd0 sd=10 pgs=0 cs=0 l=0).accept 
peer addr is really xx.xx.xx.104:0/2435973235 (socket is 
xx.xx.xx.104:34911/0)
2011-04-09 21:46:33.083027 7fa114795700 -- xx.xx.xx.199:6800/20513 >> 
xx.xx.xx.104:0/2435973235 pipe(0x517dcd0 sd=10 pgs=0 cs=0 l=0).accept we 
reset (peer sent cseq 2), sending RESETSESSION
2011-04-09 21:46:33.151068 7fa114795700 -- xx.xx.xx.199:6800/20513 >> 
xx.xx.xx.104:0/2435973235 pipe(0x24b3e90 sd=10 pgs=0 cs=0 l=0).accept 
peer addr is really xx.xx.xx.104:0/2435973235 (socket is 
xx.xx.xx.104:34912/0)
2011-04-09 21:46:33.392349 7fa1170a2700 mds0.server  no longer in 
reconnect state, ignoring reconnect, sending close
2011-04-09 21:46:33.392481 7fa1170a2700 log [INF] : denied reconnect 
attempt (mds is up:active) from client4109 xx.xx.xx.104:0/2435973235 
after 2011-04-09 21:46:33.392330 (allowed interval 45)
2011-04-09 21:46:33.431556 7fa114795700 -- xx.xx.xx.199:6800/20513 >> 
xx.xx.xx.104:0/2435973235 pipe(0x24b3e90 sd=10 pgs=9 cs=1 l=0).fault 
initiating reconnect
2011-04-09 21:46:33.434201 7fa11417d700 -- xx.xx.xx.199:6800/20513 >> 
xx.xx.xx.104:0/2435973235 pipe(0x517dcd0 sd=11 pgs=0 cs=0 l=0).accept 
peer addr is really xx.xx.xx.104:0/2435973235 (socket is 
xx.xx.xx.104:34913/0)
2011-04-09 21:46:33.442067 7fa11417d700 -- xx.xx.xx.199:6800/20513 >> 
xx.xx.xx.104:0/2435973235 pipe(0x517dcd0 sd=11 pgs=0 cs=0 l=0).accept 
connect_seq 0 vs existing 2 state 3
2011-04-09 21:46:33.442109 7fa11417d700 -- xx.xx.xx.199:6800/20513 >> 
xx.xx.xx.104:0/2435973235 pipe(0x517dcd0 sd=11 pgs=0 cs=0 l=0).accept 
peer reset, then tried to connect to us, replacing
2011-04-09 21:46:33.448964 7fa1170a2700 mds0.1 ms_handle_remote_reset on 
xx.xx.xx.104:0/2435973235
2011-04-09 22:51:32.482137 7fa1170a2700 mds0.cache.dir(10000020009) 
mismatch between head items and fnode.fragstat! printing dentries

And many-many messages like:

2011-04-09 22:51:32.482192 7fa1170a2700 mds0.cache.dir(10000020009) 
[dentry #1/Bonnie.11304/00000/0000000000d9 [2,head] auth (dversion lock) 
v=27380 inode=0x7fa0f4a72fa0 | inodepin 0x7fa0ef246cf0]
2011-04-09 22:51:32.482208 7fa1170a2700 mds0.cache.dir(10000020009) 
[dentry #1/Bonnie.11304/00000/0000000008 [2,head] auth (dversion lock) 
pv=0 v=125745 inode=0x7fa0d02939e0 | inodepin 0x7fa0cb76c418]
2011-04-09 22:51:32.482223 7fa1170a2700 mds0.cache.dir(10000020009) 
[dentry #1/Bonnie.11304/00000/000000000e [2,head] auth (dversion lock) 
pv=0 v=108861 inode=0x7fa0d2776fc0 | inodepin 0x7fa0cbf5dae8]
2011-04-09 22:51:32.482236 7fa1170a2700 mds0.cache.dir(10000020009) 
[dentry #1/Bonnie.11304/00000/0000000028 [2,head] auth (dversion lock) 
v=33172 inode=0x7fa0f4bf0020 | inodepin 0x7fa0fcea2c60]
2011-04-09 22:51:32.482250 7fa1170a2700 mds0.cache.dir(10000020009) 
[dentry #1/Bonnie.11304/00000/0000000039 [2,head] auth (dversion lock) 
pv=0 v=117571 inode=0x7fa0cf2783d0 | inodepin 0x7fa0cb3f4600]
---

root@stb1:~# ls -al /mnt
total 4
drwxr-xr-x  1 root root    0 2011-04-10 00:04 .
drwxr-xr-x 23 root root 4096 2011-04-07 02:02 ..
root@stb1:~# df -h /mnt
Filesystem            Size  Used Avail Use% Mounted on
xx.xx.xx.199:/      3.6T  1.8G  3.4T   1% /mnt

root@stc1:~# rados df
pool name                 KB      objects       clones     degraded      
unfound           rd        rd KB           wr        wr KB
casdata                    0            0            0            
0           0            0            0            0            0
data                   32005        32005            0            
0           0        88297     14912171       451822     17025197
metadata              146855           63            0            
0           0         4468     37326846       213199      2451206
rbd                        0            0            0            
0           0            0            0            0            0
   total used         1798788        32068
   total avail     3623561536
   total space     3768948976
root@stc1:~#

--------------------

Configuration:

ceph.conf:
[global]
         max open files = 131072

[mon]
         mon data = /mfs/data/mon$id
[mon0]
         host = stm1
         mon addr = xx.xx.xx.199:6789
[mds]
         keyring = /mfs/data/keyring.$name
[mds1]
         host = stm1
         mds addr = xx.xx.xx.199
[osd]
         osd data = /mfs
         osd journal = /journal/journal
         osd journal size = 500
         journal dio = false
[osd0]
         host = stc1
[osd1]
         host = stc2
[osd2]
         host = stc3
[osd3]
         host = stc4



WBR,
    Fyodor.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2011-04-10 14:25 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-04-10  0:08 Trouble Fyodor Ustinov
2011-04-10 14:25 ` Trouble Fyodor Ustinov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.