All of lore.kernel.org
 help / color / mirror / Atom feed
* pg 0.xxxx on [] is laggy
@ 2010-06-29 12:12 Sébastien Paolacci
  2010-06-29 15:58 ` Sage Weil
  0 siblings, 1 reply; 10+ messages in thread
From: Sébastien Paolacci @ 2010-06-29 12:12 UTC (permalink / raw)
  To: ceph-devel

Hello,

First of all, many thanks for this wonderful piece of software that
actually looks very promising. It's a segment that imho definitely
lack of credible open source alternatives to the (sometimes infamous
and inefficient) proprietary systems.

So I've just pulled the unstable branch from last Sunday and are few
outcomes (local vm, sorry for that, but as for a first try...):

- Build: transparent, which is actually not so common for an unstable
branch of a said to not be mature project ;). Thanks.

- Config: it's a bit difficult to understand the real meaning of all
the available options (debian and suse dedicated pages are however
very helpful), so documentation is sparse, as expected, and I should
have start by reading the code anyway (so my bad at the end).

- First setup attempt left me with a "mon fs missing 'whoami'.. did
you run mkcephfs?" (see end of mail) I just echoed a "0" in
"/data/mom0/whoami" and it did startup.

- cfuse -m 127.0.01:5678/ /mnt/ceph is eating all my memory and
crashes with a bad_alloc
root@debian-vm1:/home/seb# gdb cfuse
(gdb) run -m 127.0.0.1:6789/ /mnt/ceph/
Starting program: /usr/local/bin/cfuse -m 127.0.0.1:6789/ /mnt/ceph/
[Thread debugging using libthread_db enabled]
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

Program received signal SIGABRT, Aborted.
0x00007ffff6b12175 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00007ffff6b12175 in raise () from /lib/libc.so.6
#1  0x00007ffff6b14f80 in abort () from /lib/libc.so.6
#2  0x00007ffff73a5dd5 in __gnu_cxx::__verbose_terminate_handler() ()
from /usr/lib/libstdc++.so.6
#3  0x00007ffff73a4176 in ?? () from /usr/lib/libstdc++.so.6
#4  0x00007ffff73a41a3 in std::terminate() () from /usr/lib/libstdc++.so.6
#5  0x00007ffff73a429e in __cxa_throw () from /usr/lib/libstdc++.so.6
#6  0x00007ffff73a472d in operator new(unsigned long) () from
/usr/lib/libstdc++.so.6
#7  0x000000000053884d in
__gnu_cxx::new_allocator<entity_addr_t>::allocate
(this=0x7fffffffdcf0, __position=..., __x=<value optimized out>)
    at /usr/include/c++/4.4/ext/new_allocator.h:89
#8  std::_Vector_base<entity_addr_t, std::allocator<entity_addr_t>
>::_M_allocate (this=0x7fffffffdcf0, __position=..., __x=<value
optimized out>)
    at /usr/include/c++/4.4/bits/stl_vector.h:140
#9  std::vector<entity_addr_t, std::allocator<entity_addr_t>
>::_M_insert_aux (this=0x7fffffffdcf0, __position=..., __x=<value
optimized out>)
    at /usr/include/c++/4.4/bits/vector.tcc:322
#10 0x000000000053630e in std::vector<entity_addr_t,
std::allocator<entity_addr_t> >::push_back (s=<value optimized out>,
vec=...)
    at /usr/include/c++/4.4/bits/stl_vector.h:741
#11 parse_ip_port_vec (s=<value optimized out>, vec=...) at config.cc:182
#12 0x000000000052a9a2 in MonClient::build_initial_monmap
(this=0x7fffffffddc0) at mon/MonClient.cc:62
#13 0x000000000044347c in main (argc=2, argv=0x7c65e0, envp=<value
optimized out>) at cfuse.cc:67
(gdb) fr 11
#11 parse_ip_port_vec (s=<value optimized out>, vec=...) at config.cc:182
182         vec.push_back(a);
(gdb) info local
a = {type = 0, nonce = 0, {addr = {ss_family = 0, __ss_align = 0,
__ss_padding = '\000' <repeats 111 times>}, addr4 = {sin_family = 0,
sin_port = 0, sin_addr = {s_addr = 0},
      sin_zero = "\000\000\000\000\000\000\000"}, addr6 = {sin6_family
= 0, sin6_port = 0, sin6_flowinfo = 0, sin6_addr = {__in6_u =
{__u6_addr8 = '\000' <repeats 15 times>,
          __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0,
0, 0}}}, sin6_scope_id = 0}}}
p = 0x7c666e "/"
end = 0x7c666f ""
(gdb)


- cfuse /mnt/ceph is however working as expected. Creating files and
browsing /mnt/ceph content provide with the desired result, dbench -D
/mnt/ceph/ -t 10 2 however seems to endless wait for completion. On
the cfuse side, I'm getting (what seems to be) an endless serie of "pg
0.xxx on [] is laggy"

10.06.29_13:09:29.611244 7fe0e7571710 -- 127.0.0.1:0/2807 >>
127.0.0.1:6800/2736 pipe(0xd083b0 sd=6 pgs=3 cs=1 l=0).getting message
bytes now, currently using 352/104857600
10.06.29_13:09:29.611267 7fe0e7571710 -- 127.0.0.1:0/2807 >>
127.0.0.1:6800/2736 pipe(0xd083b0 sd=6 pgs=3 cs=1 l=0).aborted = 0
10.06.29_13:09:29.611281 7fe0e7571710 -- 127.0.0.1:0/2807 >>
127.0.0.1:6800/2736 pipe(0xd083b0 sd=6 pgs=3 cs=1 l=0).reader got
message 1779 0xd71c70 client_caps(flush_ack ino 10000000089 139 seq 3
caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=- follows 0 size 0/0 mtime
0.000000) v1
10.06.29_13:09:29.611309 7fe0e7d72710 -- 127.0.0.1:0/2807 >>
127.0.0.1:6800/2736 pipe(0xd083b0 sd=6 pgs=3 cs=1 l=0).writer: state =
2 policy.server=0
10.06.29_13:09:29.611320 7fe0e7d72710 -- 127.0.0.1:0/2807 >>
127.0.0.1:6800/2736 pipe(0xd083b0 sd=6 pgs=3 cs=1 l=0).write_ack 1779
10.06.29_13:09:29.611331 7fe0e7d72710 -- 127.0.0.1:0/2807 >>
127.0.0.1:6800/2736 pipe(0xd083b0 sd=6 pgs=3 cs=1 l=0).writer: state =
2 policy.server=0
10.06.29_13:09:29.611347 7fe0ead78710 -- 127.0.0.1:0/2807 <== mds0
127.0.0.1:6800/2736 1777 ==== client_caps(flush_ack ino 10000000086
136 seq 3 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=- follows 0 size 0/0
mtime 0.000000) v1 ==== 176+0+0 (407452104 0 0) 0xd71210
10.06.29_13:09:29.611371 7fe0ead78710 -- 127.0.0.1:0/2807 <== mds0
127.0.0.1:6800/2736 1778 ==== client_caps(flush_ack ino 10000000087
137 seq 3 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=- follows 0 size 0/0
mtime 0.000000) v1 ==== 176+0+0 (3805174403 0 0) 0xd70e70
10.06.29_13:09:29.611746 7fe0ead78710 -- 127.0.0.1:0/2807 <== mds0
127.0.0.1:6800/2736 1779 ==== client_caps(flush_ack ino 10000000089
139 seq 3 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=- follows 0 size 0/0
mtime 0.000000) v1 ==== 176+0+0 (1062806157 0 0) 0xd71c70
10.06.29_13:09:32.265289 7fe0e9d76710 -- 127.0.0.1:0/2807 >>
127.0.0.1:6789/0 pipe(0xd09010 sd=5 pgs=3 cs=1 l=1).writer: state = 2
policy.server=0
10.06.29_13:09:32.265350 7fe0e9d76710 -- 127.0.0.1:0/2807 >>
127.0.0.1:6789/0 pipe(0xd09010 sd=5 pgs=3 cs=1 l=1).write_keepalive
10.06.29_13:09:32.265476 7fe0e9d76710 -- 127.0.0.1:0/2807 >>
127.0.0.1:6789/0 pipe(0xd09010 sd=5 pgs=3 cs=1 l=1).writer: state = 2
policy.server=0
10.06.29_13:09:32.289977 7fe0e8d74710 client4099.objecter  pg 0.4e11
on [] is laggy: 93
10.06.29_13:09:32.290055 7fe0e8d74710 client4099.objecter  pg 0.eba2
on [] is laggy: 7
10.06.29_13:09:32.290079 7fe0e8d74710 client4099.objecter  pg 0.e050
on [] is laggy: 112
10.06.29_13:09:32.290097 7fe0e8d74710 client4099.objecter  pg 0.9066
on [] is laggy: 34
10.06.29_13:09:32.290116 7fe0e8d74710 client4099.objecter  pg 0.df8a
on [] is laggy: 106
10.06.29_13:09:32.290142 7fe0e8d74710 client4099.objecter  pg 0.3b2e
on [] is laggy: 70
10.06.29_13:09:32.290162 7fe0e8d74710 client4099.objecter  pg 0.4136
on [] is laggy: 62
10.06.29_13:09:32.290178 7fe0e8d74710 client4099.objecter  pg 0.8c98
on [] is laggy: 32
10.06.29_13:09:32.290193 7fe0e8d74710 client4099.objecter  pg 0.9d2b
on [] is laggy: 54
10.06.29_13:09:32.290208 7fe0e8d74710 client4099.objecter  pg 0.d372
on [] is laggy: 52
10.06.29_13:09:32.290227 7fe0e8d74710 client4099.objecter  pg 0.661 on
[] is laggy: 64
10.06.29_13:09:32.290241 7fe0e8d74710 client4099.objecter  pg 0.c822
on [] is laggy: 46
10.06.29_13:09:32.290256 7fe0e8d74710 client4099.objecter  pg 0.cbe6
on [] is laggy: 110
10.06.29_13:09:32.290270 7fe0e8d74710 client4099.objecter  pg 0.8506
on [] is laggy: 44
10.06.29_13:09:32.290284 7fe0e8d74710 client4099.objecter  pg 0.9960
on [] is laggy: 83
10.06.29_13:09:32.290299 7fe0e8d74710 client4099.objecter  pg 0.1571
on [] is laggy: 39
(...)

As previously stated I just have a very partial understanding of the
system and I barely took time to look at the sources, so the most
probable reason for this to happen is obviously a
misconfiguration/misuse issue on my side. Shouldn't it however be the
case, what could I provide you with (or where should I start) to
further investigate?

Thanks,
Sebastien

root@debian-vm1:/home/seb# cat /etc/ceph/ceph.conf | grep -v '^;'
[global]
       pid file = /var/run/ceph/$name.pid
       debug ms = 10
[mon]
       mon data = /data/mon$id
[mon0]
       host = debian-vm1
       mon addr = 127.0.0.1:6789
[mds]
[mds0]
       host = debian-vm1
[osd]
       sudo = true
       osd data = /data/osd$id
       osd journal = /data/osd$id/journal
       osd journal size = 128
       filestore journal writeahead = true
[osd0]
       host = debian-vm1

root@debian-vm1:/home/seb# mount | grep data
/dev/sda7 on /data type ext3 (rw,user_xattr)

root@debian-vm1:/home/seb# clear; mkcephfs --allhosts -v -c /etc/ceph/ceph.conf
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon  "mon addr" ""
/usr/local/bin/monmaptool --create --clobber --add 127.0.0.1:6789
--print /tmp/monmap.2416
/usr/local/bin/monmaptool: monmap file /tmp/monmap.2416
/usr/local/bin/monmaptool: generated fsid d5f42ea9-a17f-0916-62a0-6d2e1ea822da
epoch 1
fsid d5f42ea9-a17f-0916-62a0-6d2e1ea822da
last_changed 10.06.29_13:07:31.438903
created 10.06.29_13:07:31.438903
        mon0 127.0.0.1:6789/0
/usr/local/bin/monmaptool: writing epoch 1 to /tmp/monmap.2416 (1 monitors)
max osd in /etc/ceph/ceph.conf is 0, num osd is 1
/usr/local/bin/osdmaptool: osdmap file '/tmp/osdmap.2416'
/usr/local/bin/osdmaptool: writing epoch 1 to /tmp/osdmap.2416
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon  "crush map src" ""
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon  "crush map" ""
Building admin keyring at /tmp/admin.keyring.2416
creating /tmp/admin.keyring.2416
Building monitor keyring with all service keys
creating /tmp/monkeyring.2416
importing contents of /tmp/admin.keyring.2416 into /tmp/monkeyring.2416
creating /tmp/keyring.mds.0
importing contents of /tmp/keyring.mds.0 into /tmp/monkeyring.2416
creating /tmp/keyring.osd.0
importing contents of /tmp/keyring.osd.0 into /tmp/monkeyring.2416
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon  "user" ""
=== mon0 ===
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon  "mon data" ""
--- debian-vm1# /usr/local/bin/cmon -c /etc/ceph/ceph.conf --mkfs -i 0
--monmap /tmp/monmap.2416 --osdmap /tmp/osdmap.2416 -k
/tmp/keyring.2416 ; rm -f /tmp/keyring.2416
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 **          testing and review.  Do not trust it with important data.       **
/usr/local/bin/cmon: created monfs at /data/mon0 for mon0
`/tmp/admin.keyring.2416' -> `/data/mon0/admin_keyring.bin'
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mds  "user" ""
=== mds0 ===
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mds  "keyring" ""
WARNING: no keyring specified for mds0
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd  "user" ""
=== osd0 ===
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd  "osd data" ""
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd  "osd journal" ""
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd  "keyring" ""
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd  "btrfs path"
"/data/osd0"
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd  "btrfs devs" ""
/usr/local/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd  "btrfs
options" "noatime"
--- debian-vm1# test -d /data/osd0 || mkdir -p /data/osd0
--- debian-vm1# test -d /data/osd0/journal || mkdir -p /data/osd0
--- debian-vm1# /usr/local/bin/cosd -c /etc/ceph/ceph.conf --monmap
/tmp/monmap.2416 -i 0 --mkfs --osd-data /data/osd0
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 **          testing and review.  Do not trust it with important data.       **
created object store /data/osd0 journal /data/osd0/journal for osd0
fsid d5f42ea9-a17f-0916-62a0-6d2e1ea822da
WARNING: no keyring specified for osd0
root@debian-vm1:/home/seb# /etc/init.d/ceph start
=== mon0 ===
Starting Ceph mon0 on debian-vm1...
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 **          testing and review.  Do not trust it with important data.       **
mon fs missing 'whoami'.. did you run mkcephfs?
failed: ' /usr/bin/cmon -i 0 -c /etc/ceph/ceph.conf '

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: pg 0.xxxx on [] is laggy
  2010-06-29 12:12 pg 0.xxxx on [] is laggy Sébastien Paolacci
@ 2010-06-29 15:58 ` Sage Weil
  2010-06-30 12:05   ` Sébastien Paolacci
  0 siblings, 1 reply; 10+ messages in thread
From: Sage Weil @ 2010-06-29 15:58 UTC (permalink / raw)
  To: Sébastien Paolacci; +Cc: ceph-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3007 bytes --]

Hi Sebastien,

On Tue, 29 Jun 2010, Sébastien Paolacci wrote:
> First of all, many thanks for this wonderful piece of software that
> actually looks very promising. It's a segment that imho definitely
> lack of credible open source alternatives to the (sometimes infamous
> and inefficient) proprietary systems.

Thanks!

> So I've just pulled the unstable branch from last Sunday and are few
> outcomes (local vm, sorry for that, but as for a first try...):
> 
> - Build: transparent, which is actually not so common for an unstable
> branch of a said to not be mature project ;). Thanks.
> 
> - Config: it's a bit difficult to understand the real meaning of all
> the available options (debian and suse dedicated pages are however
> very helpful), so documentation is sparse, as expected, and I should
> have start by reading the code anyway (so my bad at the end).
> 
> - First setup attempt left me with a "mon fs missing 'whoami'.. did
> you run mkcephfs?" (see end of mail) I just echoed a "0" in
> "/data/mom0/whoami" and it did startup.

It looks like when you ran /etc/init.d/ceph is tried to start 
/usr/bin/cmon, although I notice lots of /usr/local in your mkcephfs 
output.  Do you by chance build from source and then 'make install', and 
then also install a .deb or .rpm?  The "mon fs missing 'whoami'" is an old 
error message that no longer appears in the 'unstable' branch, so there is 
an old binary or old source involved somewhere.

> - cfuse -m 127.0.01:5678/ /mnt/ceph is eating all my memory and
> crashes with a bad_alloc

Fixed this.. there was an ip address parsing error.  The trailing '/' 
shouldn't be there, and wasn't getting ignored.

> - cfuse /mnt/ceph is however working as expected. Creating files and
> browsing /mnt/ceph content provide with the desired result, dbench -D
> /mnt/ceph/ -t 10 2 however seems to endless wait for completion. On
> the cfuse side, I'm getting (what seems to be) an endless serie of "pg
> 0.xxx on [] is laggy"

That means the OSD isn't responding for some request(s).  Did cosd start?  
Does a 'ceph -s' show some osds are 'up'?  If cosd crashed, the output log 
or gdb backtrack would be helpful.

> root@debian-vm1:/home/seb# cat /etc/ceph/ceph.conf | grep -v '^;'
> [global]
>        pid file = /var/run/ceph/$name.pid
>        debug ms = 10
I wouldn't put this in [global] or you will clutter up output from things 
like 'ceph -s'.
> [mon]
	debug ms = 1    ; is usually enough msgr output
>        mon data = /data/mon$id
> [mon0]
>        host = debian-vm1
>        mon addr = 127.0.0.1:6789
> [mds]
	debug ms = 1    ; is usually enough msgr output
> [mds0]
>        host = debian-vm1
> [osd]
	debug ms = 1    ; is usually enough msgr output
>        sudo = true
>        osd data = /data/osd$id
>        osd journal = /data/osd$id/journal
>        osd journal size = 128
>        filestore journal writeahead = true
> [osd0]
>        host = debian-vm1

sage

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: pg 0.xxxx on [] is laggy
  2010-06-29 15:58 ` Sage Weil
@ 2010-06-30 12:05   ` Sébastien Paolacci
  2010-06-30 15:45     ` Thomas Mueller
  2010-06-30 18:06     ` Sage Weil
  0 siblings, 2 replies; 10+ messages in thread
From: Sébastien Paolacci @ 2010-06-30 12:05 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

Hello Sage,

It actually works better when removing the old deb pacakge;)... thanks!

The "pg 0.xxxx on [] is laggy" messages have also now disappeared.

Here is, just for information, the obtained two processes dbench
output (which is not so bad given the vm the system in running into):
Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX       1632     8.835   213.966
 Close           1237     0.377   157.360
 Rename            73    24.290   195.969
 Unlink           291     6.514   244.166
 Qpathinfo       1455     4.544   237.949
 Qfileinfo        311     0.233     5.723
 Qfsinfo          270     1.854   110.355
 Sfileinfo        168     3.090    34.291
 Find             552    11.182   192.610
 WriteX          1076     4.916   409.798
 ReadX           2437     0.408    61.635
 LockX              4     0.011     0.017
 UnlockX            4     0.006     0.009
 Flush            140     8.630   175.423

Throughput 2.88849 MB/sec  2 clients  2 procs  max_latency=409.805 ms

I however get the a bunch of metadata sync warnings on the cfuse side
"10.06.30_13:16:20.556187 7f2bc5817720 client4100 fsync - not syncing
metadata yet.. implement me"


On a different topic (sorry for the mix-up), pushing the journal on a
seperate partition give me the following crash:

--- debian-vm1# /usr/local/bin/cosd -c /etc/ceph/ceph.conf --monmap
/tmp/monmap.1924 -i 0 --mkfs --osd-data /data/osd0
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 **          testing and review.  Do not trust it with important data.       **
os/FileJournal.cc: In function 'void FileJournal::write_bl(off64_t&,
ceph::bufferlist&)':
os/FileJournal.cc:503: FAILED assert((bl.length() & ~ceph::_page_mask) == 0)
 1: (FileJournal::do_write(ceph::buffer::list&)+0x2a0) [0x55adc0]
 2: (FileJournal::write_thread_entry()+0x1fa) [0x55db4a]
 3: (FileJournal::Writer::entry()+0xd) [0x54f41d]
 4: (Thread::_entry_func(void*)+0x7) [0x46bac7]
 5: (()+0x68ba) [0x7f5dfce098ba]
 6: (clone()+0x6d) [0x7f5dfc02401d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion*'
Aborted (core dumped)

root@debian-vm1:~# gdb /usr/local/bin/cosd -c core
(gdb) bt full
#0  0x00007f5dfbf87175 in raise () from /lib/libc.so.6
No symbol table info available.
#1  0x00007f5dfbf89f80 in abort () from /lib/libc.so.6
No symbol table info available.
#2  0x00007f5dfc81add5 in __gnu_cxx::__verbose_terminate_handler() ()
from /usr/lib/libstdc++.so.6
No symbol table info available.
#3  0x00007f5dfc819176 in ?? () from /usr/lib/libstdc++.so.6
No symbol table info available.
#4  0x00007f5dfc8191a3 in std::terminate() () from /usr/lib/libstdc++.so.6
No symbol table info available.
#5  0x00007f5dfc81929e in __cxa_throw () from /usr/lib/libstdc++.so.6
No symbol table info available.
#6  0x00000000005abb38 in ceph::__ceph_assert_fail (assertion=0x5e4ed8
"(bl.length() & ~ceph::_page_mask) == 0", file=<value optimized out>,
line=503,
    func=<value optimized out>) at common/assert.cc:30
No locals.
#7  0x0000000000557c60 in FileJournal::write_bl (this=0x24fd720,
pos=@0x7f5dfbd3cd68, bl=...) at os/FileJournal.cc:503
        __PRETTY_FUNCTION__ = "void FileJournal::write_bl(off64_t&,
ceph::bufferlist&)"
        err = <value optimized out>
#8  0x000000000055adc0 in FileJournal::do_write (this=0x24fd720,
bl=...) at os/FileJournal.cc:568
        __PRETTY_FUNCTION__ = "void FileJournal::do_write(ceph::bufferlist&)"
        lat = {tv = {tv_sec = 4224962000, tv_usec = 32605}}
        hbp = {_raw = 0x2500450, _off = 0, _len = 1024}
        pos = 1024
        split = <value optimized out>
#9  0x000000000055db4a in FileJournal::write_thread_entry
(this=0x24fd720) at os/FileJournal.cc:657
        orig_ops = 1
        bl = {_buffers = {<std::_List_base<ceph::buffer::ptr,
std::allocator<ceph::buffer::ptr> >> = {
              _M_impl =
{<std::allocator<std::_List_node<ceph::buffer::ptr> >> =
{<__gnu_cxx::new_allocator<std::_List_node<ceph::buffer::ptr> >> =
{<No data fields>}, <No data fields>}, _M_node = {_M_next = 0x2500360,
_M_prev = 0x2500360}}}, <No data fields>}, _len = 5120, append_buffer
= {_raw = 0x25002d0, _off = 0, _len = 80}, last_p = {
            bl = 0x7f5dfbd3ce10, ls = 0x7f5dfbd3ce10, off = 0, p =
{_M_node = 0x7f5dfbd3ce10}, p_off = 0}}
        orig_bytes = 206
        r = 0
        __PRETTY_FUNCTION__ = "void FileJournal::write_thread_entry()"
#10 0x000000000054f41d in FileJournal::Writer::entry (this=<value
optimized out>) at os/FileJournal.h:140
No locals.
#11 0x000000000046bac7 in Thread::_entry_func (arg=0x800) at
./common/Thread.h:39
        r = <value optimized out>
#12 0x00007f5dfce098ba in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#13 0x00007f5dfc02401d in clone () from /lib/libc.so.6
No symbol table info available.
#14 0x0000000000000000 in ?? ()
No symbol table info available.
(gdb)

I don't have so much time to investigate right now but I'll give a try
later today.

Sebastien

[global]
       pid file = /var/run/ceph/$name.pid
[mon]
       debug ms = 1
       mon data = /data/mon$id
[mon0]
       host = debian-vm1
       mon addr = 127.0.0.1:6789
[mds]
       debug ms = 1
[mds0]
       host = debian-vm1
[osd]
       debug ms = 1
       sudo = true
       osd data = /data/osd$id
       osd journal = /ceph-journal/osd$id
       osd journal size = 256
       filestore journal writeahead = true
[osd0]
       host = debian-vm1


with
/dev/sda6 on /ceph-journal type ext3 (rw,user_xattr)
/dev/sda7 on /data type ext3 (rw,user_xattr)


2010/6/29 Sage Weil <sage@newdream.net>:
> Hi Sebastien,
>
> On Tue, 29 Jun 2010, Sébastien Paolacci wrote:
>> First of all, many thanks for this wonderful piece of software that
>> actually looks very promising. It's a segment that imho definitely
>> lack of credible open source alternatives to the (sometimes infamous
>> and inefficient) proprietary systems.
>
> Thanks!
>
>> So I've just pulled the unstable branch from last Sunday and are few
>> outcomes (local vm, sorry for that, but as for a first try...):
>>
>> - Build: transparent, which is actually not so common for an unstable
>> branch of a said to not be mature project ;). Thanks.
>>
>> - Config: it's a bit difficult to understand the real meaning of all
>> the available options (debian and suse dedicated pages are however
>> very helpful), so documentation is sparse, as expected, and I should
>> have start by reading the code anyway (so my bad at the end).
>>
>> - First setup attempt left me with a "mon fs missing 'whoami'.. did
>> you run mkcephfs?" (see end of mail) I just echoed a "0" in
>> "/data/mom0/whoami" and it did startup.
>
> It looks like when you ran /etc/init.d/ceph is tried to start
> /usr/bin/cmon, although I notice lots of /usr/local in your mkcephfs
> output.  Do you by chance build from source and then 'make install', and
> then also install a .deb or .rpm?  The "mon fs missing 'whoami'" is an old
> error message that no longer appears in the 'unstable' branch, so there is
> an old binary or old source involved somewhere.
>
>> - cfuse -m 127.0.01:5678/ /mnt/ceph is eating all my memory and
>> crashes with a bad_alloc
>
> Fixed this.. there was an ip address parsing error.  The trailing '/'
> shouldn't be there, and wasn't getting ignored.
>
>> - cfuse /mnt/ceph is however working as expected. Creating files and
>> browsing /mnt/ceph content provide with the desired result, dbench -D
>> /mnt/ceph/ -t 10 2 however seems to endless wait for completion. On
>> the cfuse side, I'm getting (what seems to be) an endless serie of "pg
>> 0.xxx on [] is laggy"
>
> That means the OSD isn't responding for some request(s).  Did cosd start?
> Does a 'ceph -s' show some osds are 'up'?  If cosd crashed, the output log
> or gdb backtrack would be helpful.
>
>> root@debian-vm1:/home/seb# cat /etc/ceph/ceph.conf | grep -v '^;'
>> [global]
>>        pid file = /var/run/ceph/$name.pid
>>        debug ms = 10
> I wouldn't put this in [global] or you will clutter up output from things
> like 'ceph -s'.
>> [mon]
>        debug ms = 1    ; is usually enough msgr output
>>        mon data = /data/mon$id
>> [mon0]
>>        host = debian-vm1
>>        mon addr = 127.0.0.1:6789
>> [mds]
>        debug ms = 1    ; is usually enough msgr output
>> [mds0]
>>        host = debian-vm1
>> [osd]
>        debug ms = 1    ; is usually enough msgr output
>>        sudo = true
>>        osd data = /data/osd$id
>>        osd journal = /data/osd$id/journal
>>        osd journal size = 128
>>        filestore journal writeahead = true
>> [osd0]
>>        host = debian-vm1
>
> sage
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: pg 0.xxxx on [] is laggy
  2010-06-30 12:05   ` Sébastien Paolacci
@ 2010-06-30 15:45     ` Thomas Mueller
  2010-06-30 20:57       ` Sébastien Paolacci
  2010-06-30 18:06     ` Sage Weil
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Mueller @ 2010-06-30 15:45 UTC (permalink / raw)
  To: ceph-devel

Am Wed, 30 Jun 2010 14:05:34 +0200 schrieb Sébastien Paolacci:

> Hello Sage,
> 
> It actually works better when removing the old deb pacakge;)... thanks!
> 
> The "pg 0.xxxx on [] is laggy" messages have also now disappeared.
> 
> Here is, just for information, the obtained two processes dbench output
> (which is not so bad given the vm the system in running into): Operation
>      Count    AvgLat    MaxLat
>  ---------------------------------------- NTCreateX       1632     8.835
>    213.966 Close           1237     0.377   157.360 Rename            73
>     24.290   195.969 Unlink           291     6.514   244.166 Qpathinfo 
>       1455     4.544   237.949 Qfileinfo        311     0.233     5.723
>  Qfsinfo          270     1.854   110.355 Sfileinfo        168     3.090
>     34.291 Find             552    11.182   192.610 WriteX          1076
>      4.916   409.798 ReadX           2437     0.408    61.635 LockX     
>          4     0.011     0.017 UnlockX            4     0.006     0.009
>  Flush            140     8.630   175.423
> 
> Throughput 2.88849 MB/sec  2 clients  2 procs  max_latency=409.805 ms
> 
> I however get the a bunch of metadata sync warnings on the cfuse side
> "10.06.30_13:16:20.556187 7f2bc5817720 client4100 fsync - not syncing
> metadata yet.. implement me"

give the kclient a try? git://ceph.newdream.net/ceph-client-
standalone.git branch unstable-backport for use with "older" kernels.



> [osd]
>        debug ms = 1
>        sudo = true
>        osd data = /data/osd$id
>        osd journal = /ceph-journal/osd$id

why not let ceph manage the whole partition without a fs inbetween ? like

osd journal = /dev/sdX1

(and remove "journal size")

>        filestore journal writeahead = true

seems to be true by default (http://ceph.newdream.net/wiki/Ceph.conf -> 
filestore_journal_parallel)

- Thomas

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: pg 0.xxxx on [] is laggy
  2010-06-30 12:05   ` Sébastien Paolacci
  2010-06-30 15:45     ` Thomas Mueller
@ 2010-06-30 18:06     ` Sage Weil
  2010-06-30 21:06       ` Sébastien Paolacci
  1 sibling, 1 reply; 10+ messages in thread
From: Sage Weil @ 2010-06-30 18:06 UTC (permalink / raw)
  To: Sébastien Paolacci; +Cc: ceph-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 783 bytes --]

Hi,

On Wed, 30 Jun 2010, Sébastien Paolacci wrote:
> It actually works better when removing the old deb pacakge;)... thanks!

Good to hear!

> I however get the a bunch of metadata sync warnings on the cfuse side
> "10.06.30_13:16:20.556187 7f2bc5817720 client4100 fsync - not syncing
> metadata yet.. implement me"

Added this to the issue tracker.

> On a different topic (sorry for the mix-up), pushing the journal on a
> seperate partition give me the following crash:

Hmm.. is this reproducible?  I just pushed something to ceph.git unstable 
that prints out useful debugging info when that happens, but the question 
remains what is causing it to happen in the first place.  Do you mind 
giving the latest a go and seeing what it tells us?

Thanks!
sage

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: pg 0.xxxx on [] is laggy
  2010-06-30 15:45     ` Thomas Mueller
@ 2010-06-30 20:57       ` Sébastien Paolacci
  0 siblings, 0 replies; 10+ messages in thread
From: Sébastien Paolacci @ 2010-06-30 20:57 UTC (permalink / raw)
  To: Thomas Mueller; +Cc: ceph-devel

On Wed, Jun 30, 2010 at 5:45 PM, Thomas Mueller <thomas@chaschperli.ch> wrote:
>
> give the kclient a try? git://ceph.newdream.net/ceph-client-
> standalone.git branch unstable-backport for use with "older" kernels.

It's working like a charm (albeit abit more difficult to debug...), thanks.

> why not let ceph manage the whole partition without a fs inbetween ? like
>
> osd journal = /dev/sdX1
>
> (and remove "journal size")
>
>>        filestore journal writeahead = true
>
> seems to be true by default (http://ceph.newdream.net/wiki/Ceph.conf ->
> filestore_journal_parallel)
>
> - Thomas

Mainly because performance is (currently) not my very priority but it
actually works this way and bypass the issue (as a side note there's
indeed a huge performance benefit).

Thanks for the tips,
Sebastien
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: pg 0.xxxx on [] is laggy
  2010-06-30 18:06     ` Sage Weil
@ 2010-06-30 21:06       ` Sébastien Paolacci
  2010-06-30 22:06         ` Sage Weil
  0 siblings, 1 reply; 10+ messages in thread
From: Sébastien Paolacci @ 2010-06-30 21:06 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 6/30/10, Sage Weil <sage@newdream.net> wrote:
> Hmm.. is this reproducible?  I just pushed something to ceph.git unstable
> that prints out useful debugging info when that happens, but the question
> remains what is causing it to happen in the first place.  Do you mind
> giving the latest a go and seeing what it tells us?
>
Yes it's definitely reproducible. I gave a try to your last commit
(that is actually triggered), here is the log output:

10.06.30_22:49:37.214544 --- opened log /var/log/ceph/debian-vm1.11418 ---
ceph version 0.21~rc (3235abe91863b3bffe5638b2169603f7d89ea375)
10.06.30_22:49:37.214655 7fec9a01c720 ---- renamed symlink
/var/log/ceph/osd0 -> /var/log/ceph/osd0.0 ----
10.06.30_22:49:37.214689 7fec9a01c720 ---- created symlink
/var/log/ceph/osd0 -> debian-vm1.11418 ----
10.06.30_22:49:37.214747 7fec9a01c720 -- :/0 register_entity client?
10.06.30_22:49:37.214876 7fec9a01c720 -- :/0 register_entity client? at :/0
10.06.30_22:49:37.214953 7fec9a01c720 -- :/0 ready :/0
10.06.30_22:49:37.215019 7fec9a01c720 -- :/11418 messenger.start
10.06.30_22:49:37.215076 7fec9a01c720 -- :/11418 shutdown :/11418
10.06.30_22:49:37.215091 7fec9a01c720 -- :/11418 shutdown i am not
dispatch, setting stop flag and joining thread.
10.06.30_22:49:37.215107 7fec9a01c720 -- :/11418 wait: still active
10.06.30_22:49:37.215145 7fec9a01c720 -- :/11418 wait: woke up
10.06.30_22:49:37.215165 7fec9a01c720 -- :/11418 wait: everything stopped
10.06.30_22:49:37.215177 7fec9a01c720 -- :/11418 wait: closing pipes
10.06.30_22:49:37.215191 7fec9a01c720 -- :/11418 reaper
10.06.30_22:49:37.215202 7fec9a01c720 -- :/11418 reaper done
10.06.30_22:49:37.215214 7fec9a01c720 -- :/11418 wait: waiting for
pipes  to close
10.06.30_22:49:37.215225 7fec9a01c720 -- :/11418 wait: done.
10.06.30_22:49:37.215236 7fec9a01c720 -- :/11418 shutdown complete.
10.06.30_22:49:37.215398 7fec9a01c720 filestore(/data/osd0) mkfs in /data/osd0
10.06.30_22:49:37.215477 7fec9a01c720 filestore(/data/osd0) mkfs
removing old directory current
10.06.30_22:49:37.218588 7fec9a01c720 filestore(/data/osd0) mkfs
removing old file magic
10.06.30_22:49:37.218644 7fec9a01c720 filestore(/data/osd0) mkfs
removing old file fsid
10.06.30_22:49:37.218674 7fec9a01c720 filestore(/data/osd0) mkfs
removing old file ceph_fsid
10.06.30_22:49:37.218703 7fec9a01c720 filestore(/data/osd0) mkfs
removing old file whoami
10.06.30_22:49:37.219248 7fec9a01c720 filestore(/data/osd0) mkjournal
created journal on /ceph-journal/osd0
10.06.30_22:49:37.219279 7fec9a01c720 filestore(/data/osd0) mkfs done
in /data/osd0
10.06.30_22:49:37.219329 7fec9a01c720 filestore(/data/osd0) mount did
NOT detect btrfs
10.06.30_22:49:37.219376 7fec9a01c720 filestore(/data/osd0) mount found snaps <>
10.06.30_22:49:37.220478 7fec98932710 journal rebuild_page_aligned
failed, buffer::list(len=5120,
        buffer::ptr(0~5120 0xab3410 in raw 0xab3410 len 5120 nref 1)
)
os/FileJournal.cc: In function 'void FileJournal::write_bl(off64_t&,
ceph::bufferlist&)':
os/FileJournal.cc:508: FAILED assert((bl.length() & ~ceph::_page_mask) == 0)
 1: (FileJournal::do_write(ceph::buffer::list&)+0x2a2) [0x55b362]
 2: (FileJournal::write_thread_entry()+0x1fa) [0x55da6a]
 3: (FileJournal::Writer::entry()+0xd) [0x54f04d]
 4: (Thread::_entry_func(void*)+0x7) [0x46ca97]
 5: (()+0x68ba) [0x7fec999fe8ba]
 6: (clone()+0x6d) [0x7fec98c1901d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.

Would you prefer to have the core dump?

Sebastien

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: pg 0.xxxx on [] is laggy
  2010-06-30 21:06       ` Sébastien Paolacci
@ 2010-06-30 22:06         ` Sage Weil
  2010-07-01 23:42           ` Sébastien Paolacci
  0 siblings, 1 reply; 10+ messages in thread
From: Sage Weil @ 2010-06-30 22:06 UTC (permalink / raw)
  To: Sébastien Paolacci; +Cc: ceph-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 4158 bytes --]

On Wed, 30 Jun 2010, Sébastien Paolacci wrote:
> On 6/30/10, Sage Weil <sage@newdream.net> wrote:
> > Hmm.. is this reproducible?  I just pushed something to ceph.git unstable
> > that prints out useful debugging info when that happens, but the question
> > remains what is causing it to happen in the first place.  Do you mind
> > giving the latest a go and seeing what it tells us?
> >
> Yes it's definitely reproducible. I gave a try to your last commit
> (that is actually triggered), here is the log output:

Can you do the same, but with

	debug filestore = 20
	debug journal = 20
	debug osd = 20

I'm not sure where that length 5120 write is coming from.

Thanks!
sage

> 
> 10.06.30_22:49:37.214544 --- opened log /var/log/ceph/debian-vm1.11418 ---
> ceph version 0.21~rc (3235abe91863b3bffe5638b2169603f7d89ea375)
> 10.06.30_22:49:37.214655 7fec9a01c720 ---- renamed symlink
> /var/log/ceph/osd0 -> /var/log/ceph/osd0.0 ----
> 10.06.30_22:49:37.214689 7fec9a01c720 ---- created symlink
> /var/log/ceph/osd0 -> debian-vm1.11418 ----
> 10.06.30_22:49:37.214747 7fec9a01c720 -- :/0 register_entity client?
> 10.06.30_22:49:37.214876 7fec9a01c720 -- :/0 register_entity client? at :/0
> 10.06.30_22:49:37.214953 7fec9a01c720 -- :/0 ready :/0
> 10.06.30_22:49:37.215019 7fec9a01c720 -- :/11418 messenger.start
> 10.06.30_22:49:37.215076 7fec9a01c720 -- :/11418 shutdown :/11418
> 10.06.30_22:49:37.215091 7fec9a01c720 -- :/11418 shutdown i am not
> dispatch, setting stop flag and joining thread.
> 10.06.30_22:49:37.215107 7fec9a01c720 -- :/11418 wait: still active
> 10.06.30_22:49:37.215145 7fec9a01c720 -- :/11418 wait: woke up
> 10.06.30_22:49:37.215165 7fec9a01c720 -- :/11418 wait: everything stopped
> 10.06.30_22:49:37.215177 7fec9a01c720 -- :/11418 wait: closing pipes
> 10.06.30_22:49:37.215191 7fec9a01c720 -- :/11418 reaper
> 10.06.30_22:49:37.215202 7fec9a01c720 -- :/11418 reaper done
> 10.06.30_22:49:37.215214 7fec9a01c720 -- :/11418 wait: waiting for
> pipes  to close
> 10.06.30_22:49:37.215225 7fec9a01c720 -- :/11418 wait: done.
> 10.06.30_22:49:37.215236 7fec9a01c720 -- :/11418 shutdown complete.
> 10.06.30_22:49:37.215398 7fec9a01c720 filestore(/data/osd0) mkfs in /data/osd0
> 10.06.30_22:49:37.215477 7fec9a01c720 filestore(/data/osd0) mkfs
> removing old directory current
> 10.06.30_22:49:37.218588 7fec9a01c720 filestore(/data/osd0) mkfs
> removing old file magic
> 10.06.30_22:49:37.218644 7fec9a01c720 filestore(/data/osd0) mkfs
> removing old file fsid
> 10.06.30_22:49:37.218674 7fec9a01c720 filestore(/data/osd0) mkfs
> removing old file ceph_fsid
> 10.06.30_22:49:37.218703 7fec9a01c720 filestore(/data/osd0) mkfs
> removing old file whoami
> 10.06.30_22:49:37.219248 7fec9a01c720 filestore(/data/osd0) mkjournal
> created journal on /ceph-journal/osd0
> 10.06.30_22:49:37.219279 7fec9a01c720 filestore(/data/osd0) mkfs done
> in /data/osd0
> 10.06.30_22:49:37.219329 7fec9a01c720 filestore(/data/osd0) mount did
> NOT detect btrfs
> 10.06.30_22:49:37.219376 7fec9a01c720 filestore(/data/osd0) mount found snaps <>
> 10.06.30_22:49:37.220478 7fec98932710 journal rebuild_page_aligned
> failed, buffer::list(len=5120,
>         buffer::ptr(0~5120 0xab3410 in raw 0xab3410 len 5120 nref 1)
> )
> os/FileJournal.cc: In function 'void FileJournal::write_bl(off64_t&,
> ceph::bufferlist&)':
> os/FileJournal.cc:508: FAILED assert((bl.length() & ~ceph::_page_mask) == 0)
>  1: (FileJournal::do_write(ceph::buffer::list&)+0x2a2) [0x55b362]
>  2: (FileJournal::write_thread_entry()+0x1fa) [0x55da6a]
>  3: (FileJournal::Writer::entry()+0xd) [0x54f04d]
>  4: (Thread::_entry_func(void*)+0x7) [0x46ca97]
>  5: (()+0x68ba) [0x7fec999fe8ba]
>  6: (clone()+0x6d) [0x7fec98c1901d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
> 
> Would you prefer to have the core dump?
> 
> Sebastien
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: pg 0.xxxx on [] is laggy
  2010-06-30 22:06         ` Sage Weil
@ 2010-07-01 23:42           ` Sébastien Paolacci
  2010-07-01 23:58             ` Sage Weil
  0 siblings, 1 reply; 10+ messages in thread
From: Sébastien Paolacci @ 2010-07-01 23:42 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

2010/7/1 Sage Weil <sage@newdream.net>:
> Can you do the same, but with
>
>        debug filestore = 20
>        debug journal = 20
>        debug osd = 20
>
> I'm not sure where that length 5120 write is coming from.
>
Neither do I ;)

Sorry for not taking much more time to trace myself, here is however a
more detailed log output:

10.07.02_01:11:40.764748 --- opened log /var/log/ceph/debian-vm1.2181 ---
ceph version 0.21~rc (3235abe91863b3bffe5638b2169603f7d89ea375)
10.07.02_01:11:40.764859 7f17ea49a720 ---- renamed symlink
/var/log/ceph/osd0 -> /var/log/ceph/osd0.0 ----
10.07.02_01:11:40.764904 7f17ea49a720 ---- created symlink
/var/log/ceph/osd0 -> debian-vm1.2181 ----
10.07.02_01:11:40.764981 7f17ea49a720 -- :/0 register_entity client?
10.07.02_01:11:40.765015 7f17ea49a720 -- :/0 register_entity client? at :/0
10.07.02_01:11:40.765038 7f17ea49a720 -- :/0 ready :/0
10.07.02_01:11:40.765152 7f17ea49a720 -- :/2181 messenger.start
10.07.02_01:11:40.765276 7f17ea49a720 -- :/2181 shutdown :/2181
10.07.02_01:11:40.765304 7f17ea49a720 -- :/2181 shutdown i am not
dispatch, setting stop flag and joining thread.
10.07.02_01:11:40.765361 7f17ea49a720 -- :/2181 wait: still active
10.07.02_01:11:40.765564 7f17ea49a720 -- :/2181 wait: woke up
10.07.02_01:11:40.765596 7f17ea49a720 -- :/2181 wait: everything stopped
10.07.02_01:11:40.765608 7f17ea49a720 -- :/2181 wait: closing pipes
10.07.02_01:11:40.765620 7f17ea49a720 -- :/2181 reaper
10.07.02_01:11:40.765629 7f17ea49a720 -- :/2181 reaper done
10.07.02_01:11:40.765638 7f17ea49a720 -- :/2181 wait: waiting for
pipes  to close
10.07.02_01:11:40.765647 7f17ea49a720 -- :/2181 wait: done.
10.07.02_01:11:40.765656 7f17ea49a720 -- :/2181 shutdown complete.
10.07.02_01:11:40.765730 7f17ea49a720 filestore(/data/osd0) mkfs in /data/osd0
10.07.02_01:11:40.765795 7f17ea49a720 filestore(/data/osd0) mkfs
removing old file fsid
10.07.02_01:11:40.765845 7f17ea49a720 filestore(/data/osd0) mkfs fsid
is 29089128
10.07.02_01:11:40.765866 7f17ea49a720 filestore(/data/osd0)
BTRFS_IOC_SUBVOL_CREATE ioctl failed, trying mkdir /data/osd0/current
10.07.02_01:11:40.765901 7f17ea49a720 filestore(/data/osd0)
open_journal at /ceph-journal/osd0
10.07.02_01:11:40.765920 7f17ea49a720 journal create /ceph-journal/osd0
10.07.02_01:11:40.765953 7f17ea49a720 journal _open extending to 268435456 bytes
10.07.02_01:11:40.765973 7f17ea49a720 journal _open journal is not a
block device, NOT checking disk write cache on /ceph-journal/osd0
10.07.02_01:11:40.765984 7f17ea49a720 journal _open /ceph-journal/osd0
fd 5: 268435456 bytes, block size 1024 bytes, directio = 1
10.07.02_01:11:40.765995 7f17ea49a720 journal header: block_size 1024
alignment 1024 max_size 268435456
10.07.02_01:11:40.766003 7f17ea49a720 journal header: start 1024
10.07.02_01:11:40.766010 7f17ea49a720 journal  write_pos 0
10.07.02_01:11:40.771483 7f17ea49a720 journal create done
10.07.02_01:11:40.771511 7f17ea49a720 filestore(/data/osd0) mkjournal
created journal on /ceph-journal/osd0
10.07.02_01:11:40.771531 7f17ea49a720 filestore(/data/osd0) mkfs done
in /data/osd0
10.07.02_01:11:40.771543 7f17ea49a720 filestore(/data/osd0) basedir
/data/osd0 journal /ceph-journal/osd0
10.07.02_01:11:40.771580 7f17ea49a720 filestore(/data/osd0) mount did
NOT detect btrfs
10.07.02_01:11:40.771601 7f17ea49a720 filestore(/data/osd0) mount fsid
is 29089128
10.07.02_01:11:40.771623 7f17ea49a720 filestore(/data/osd0) mount found snaps <>
10.07.02_01:11:40.771648 7f17ea49a720 filestore(/data/osd0) mount op_seq is 0
10.07.02_01:11:40.771658 7f17ea49a720 filestore(/data/osd0)
open_journal at /ceph-journal/osd0
10.07.02_01:11:40.771668 7f17ea49a720 journal journal_replay fs op_seq 0
10.07.02_01:11:40.771677 7f17ea49a720 journal open /ceph-journal/osd0 next_seq 1
10.07.02_01:11:40.771687 7f17ea49a720 journal _open journal is not a
block device, NOT checking disk write cache on /ceph-journal/osd0
10.07.02_01:11:40.771696 7f17ea49a720 journal _open /ceph-journal/osd0
fd 9: 268435456 bytes, block size 1024 bytes, directio = 1
10.07.02_01:11:40.771703 7f17ea49a720 journal read_header
10.07.02_01:11:40.782354 7f17ea49a720 journal header: block_size 1024
alignment 1024 max_size 268435456
10.07.02_01:11:40.782443 7f17ea49a720 journal header: start 1024
10.07.02_01:11:40.782485 7f17ea49a720 journal  write_pos 1024
10.07.02_01:11:40.782640 7f17ea49a720 journal open header.fsid = 29089128
10.07.02_01:11:40.782721 7f17ea49a720 journal read_entry 1024 : bad
header magic, end of journal
10.07.02_01:11:40.782751 7f17ea49a720 journal open reached end of journal.
10.07.02_01:11:40.782805 7f17ea49a720 journal read_entry 1024 : bad
header magic, end of journal
10.07.02_01:11:40.782853 7f17ea49a720 journal journal_replay: end of
journal, done.
10.07.02_01:11:40.782924 7f17ea49a720 journal _open journal is not a
block device, NOT checking disk write cache on /ceph-journal/osd0

0.07.02_01:11:40.782978 7f17ea49a720 journal _open /ceph-journal/osd0
fd 9: 268435456 bytes, block size 1024 bytes, directio = 1
10.07.02_01:11:40.783121 7f17ea49a720 journal journal_start
10.07.02_01:11:40.784591 7f17ea49a720 filestore(/data/osd0) apply queued
10.07.02_01:11:40.784652 7f17ea49a720 filestore(/data/osd0)
queue_transactions (writeahead) 1 0x7fff1d91ce40
10.07.02_01:11:40.784721 7f17ea49a720 journal journal_transactions 1
10.07.02_01:11:40.784785 7f17ea49a720 journal submit_entry seq 1 len
206 (0x18e4610)
10.07.02_01:11:40.785023 7f17e7dae710 filestore(/data/osd0) sync_entry
waiting for max_interval 5.000000
10.07.02_01:11:40.785153 7f17e8db0710 journal write_thread_entry start
10.07.02_01:11:40.785228 7f17e8db0710 journal room 268434431 max_size
268435456 pos 1024 header.start 1024 top 1024
10.07.02_01:11:40.785277 7f17e8db0710 journal check_for_full at 1024 :
5120 < 268434431
10.07.02_01:11:40.785323 7f17e8db0710 journal prepare_single_write 1
will write 1024 : seq 1 len 206 -> 5120 (head 40 pre_pad 4056 ebl 206
post_pad 778 tail 40) (ebl alignment 0)
10.07.02_01:11:40.785401 7f17e8db0710 journal prepare_multi_write
queue_pos now 6144
10.07.02_01:11:40.785474 7f17e8db0710 journal do_write writing
1024~5120 + header
10.07.02_01:11:40.786492 7f17e65ab710 filestore(/data/osd0) flusher_entry start
10.07.02_01:11:40.786653 7f17e65ab710 filestore(/data/osd0)
flusher_entry sleeping
10.07.02_01:11:40.787092 7f17e8db0710 journal rebuild_page_aligned
failed, buffer::list(len=5120,
        buffer::ptr(0~5120 0x18ed410 in raw 0x18ed410 len 5120 nref 1)
)
os/FileJournal.cc: In function 'void FileJournal::write_bl(off64_t&,
ceph::bufferlist&)':
os/FileJournal.cc:508: FAILED assert((bl.length() & ~ceph::_page_mask) == 0)
 1: (FileJournal::do_write(ceph::buffer::list&)+0x2a2) [0x55b362]
 2: (FileJournal::write_thread_entry()+0x1fa) [0x55da6a]
 3: (FileJournal::Writer::entry()+0xd) [0x54f04d]
 4: (Thread::_entry_func(void*)+0x7) [0x46ca97]
 5: (()+0x68ba) [0x7f17e9e7c8ba]
 6: (clone()+0x6d) [0x7f17e909701d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.

Sebastien
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: pg 0.xxxx on [] is laggy
  2010-07-01 23:42           ` Sébastien Paolacci
@ 2010-07-01 23:58             ` Sage Weil
  0 siblings, 0 replies; 10+ messages in thread
From: Sage Weil @ 2010-07-01 23:58 UTC (permalink / raw)
  To: Sébastien Paolacci; +Cc: ceph-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 579 bytes --]

On Fri, 2 Jul 2010, Sébastien Paolacci wrote:
> 2010/7/1 Sage Weil <sage@newdream.net>:
> > Can you do the same, but with
> >
> >        debug filestore = 20
> >        debug journal = 20
> >        debug osd = 20
> >
> > I'm not sure where that length 5120 write is coming from.
> >
> Neither do I ;)
> 
> Sorry for not taking much more time to trace myself, here is however a
> more detailed log output:

I see the problem now. Your journal block size is 1KB and the page size is 
4KB.  Fixing the journal code to align with the max of the two.

Thanks!
sage

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2010-07-01 23:57 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-06-29 12:12 pg 0.xxxx on [] is laggy Sébastien Paolacci
2010-06-29 15:58 ` Sage Weil
2010-06-30 12:05   ` Sébastien Paolacci
2010-06-30 15:45     ` Thomas Mueller
2010-06-30 20:57       ` Sébastien Paolacci
2010-06-30 18:06     ` Sage Weil
2010-06-30 21:06       ` Sébastien Paolacci
2010-06-30 22:06         ` Sage Weil
2010-07-01 23:42           ` Sébastien Paolacci
2010-07-01 23:58             ` Sage Weil

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.