From mboxrd@z Thu Jan 1 00:00:00 1970 From: Josh Durgin Subject: Re: couldn't use rbd Date: Fri, 03 Feb 2012 09:05:43 -0800 Message-ID: <4F2C13E7.5000509@dreamhost.com> References: <4F2B37DB.4070506@dreamhost.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Return-path: Received: from mail.hq.newdream.net ([66.33.206.127]:49217 "EHLO mail.hq.newdream.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755448Ab2BCRFn (ORCPT ); Fri, 3 Feb 2012 12:05:43 -0500 In-Reply-To: Sender: ceph-devel-owner@vger.kernel.org List-ID: To: Masuko Tomoya Cc: ceph-devel@vger.kernel.org On 02/03/2012 12:51 AM, Masuko Tomoya wrote: > Hi Josh, > > Thank you for reply ! > >> This might mean the rbd image list object can't be read for some >> reason, or the rbd tool is doing something weird that the rados tool >> isn't. Can you share the output of 'ceph -s' and >> 'rbd ls --log-to-stderr --debug-ms 1 --debug-objecter 20 --debug-monc 20 >> --debug-auth 20'? > > The output of 'ceph -s' is below. > ------- > root@ceph01:~# ceph -s > 2012-02-03 17:01:47.881960 pg v33: 6 pgs: 6 active+clean+degraded; > 0 KB data, 1056 KB used, 15357 MB / 15358 MB avail > 2012-02-03 17:01:47.882583 mds e9: 1/1/1 up {0=0=up:creating} > 2012-02-03 17:01:47.882733 osd e21: 1 osds: 1 up, 1 in > 2012-02-03 17:01:47.883042 log 2012-02-03 16:35:11.183897 osd.0 > 10.68.119.191:6801/2912 12 : [WRN] map e19 wrongly marked me down or > wrong addr > 2012-02-03 17:01:47.883144 mon e1: 1 mons at {0=10.68.119.191:6789/0} > > The output of 'rbd ls --log-to-stderr --debug-ms 1 --debug-objecter 20 > --debug-monc 20 --debug-auth 20' is below. > ------ > root@ceph01:~# rbd ls --log-to-stderr --debug-ms 1 --debug-objecter 20 > --debug-monc 20 --debug-auth 20 > 2012-02-03 17:02:10.971391 7f88cbb91720 client.4106.objecter op_submit > oid rbd_directory @2 [read 0~0] tid 1 osd.0 > 2012-02-03 17:02:10.971465 7f88cbb91720 client.4106.objecter send_op 1 to osd.0 > 2012-02-03 17:02:10.971533 7f88cbb91720 -- 10.68.119.191:0/1003500 --> > 10.68.119.191:6801/2912 -- osd_op(client.4106.0:1 rbd_directory [read > 0~0] 2.30a98c1c) v1 -- ?+0 0x24664c0 con 0x24661b0 Everything above here is normal - the rbd tool connected to the monitors, got the monmap and osdmap, and sent a request to read the 'rbd_directory' object. > 2012-02-03 17:02:25.969338 7f88c7261710 client.4106.objecter tid 1 on > osd.0 is laggy This means the osd isn't responding to the read. Check the osd log for errors. If there's nothing obvious, add this to the osd section of your ceph.conf and restart the osd: debug osd = 20 debug ms = 1 debug filestore = 20 Then run 'rbd ls' and look at what happens after 'osd_op.*rbd_directory' appears in the osd log. > > I compared those logs and found there is differences. > 'rbd list' > 2012-02-03 17:02:10.971770 7f88c9366710 -- 10.68.119.191:0/1003500<== > mon.0 10.68.119.191:6789/0 10 ==== osd_map(21..21 src has 1..21) v2 > ==== 1284+0+0 (473305567 0 0) 0x24655a0 con 0x24603e0 > 2012-02-03 17:02:10.971789 7f88c9366710 client.4106.objecter > handle_osd_map ignoring epochs [21,21]<= 21 > 2012-02-03 17:02:10.971801 7f88c9366710 client.4106.objecter > dump_active .. 0 homeless > 2012-02-03 17:02:10.971815 7f88c9366710 client.4106.objecter 1 > 2.30a98c1c osd.0 rbd_directory [read 0~0] > --(snip)-- > > 'rados lspools' > 2012-02-03 17:11:52.866072 7f9c5764b710 -- 10.68.119.191:0/1003868<== > mon.0 10.68.119.191:6789/0 7 ==== osd_map(21..21 src has 1..21) v2 > ==== 1284+0+0 (473305567 0 0) 0x770a70 con 0x771440 > 2012-02-03 17:11:52.866103 7f9c5764b710 client.4107.objecter > handle_osd_map got epochs [21,21]> 0 > 2012-02-03 17:11:52.866111 7f9c5764b710 client.4107.objecter > handle_osd_map decoding full epoch 21 > 2012-02-03 17:11:52.866272 7f9c5764b710 client.4107.objecter > dump_active .. 0 homeless > data > metadata > rbd > --(snip)-- > > What do these logs mean ? The difference is that 'rbd ls' talks to the monitors and osds, while 'rados lspools' just needs to talk to the monitors. The objecter dump_active part is listing in-flight osd requests. > >>> *ceph cluster is configured on single server. >>> *server is ubuntu 10.10 maverick. >>> *ceph, librados2 and librbd1 packages are installed. >>> (version: 0.38-1maverick) >>> *apparmor is disable. >> >> >> apparmor shouldn't matter if you have libvirt 0.9.9 or newer. > > I use libvirt 0.8.3 (latest version for maverick), so I disabled apparmor. > > > > 2012/2/2 Josh Durgin: >> On 02/02/2012 01:49 AM, Masuko Tomoya wrote: >>> >>> Hi, all. >>> >>> When I execute "rbd" command, it is not success. >>> >>> root@ceph01:~# rbd list >>> (no response) >>> >>> /var/log/ceph/mon.0.log >>> ----- >>> 2012-02-02 17:58:19.801762 7ff4bbfb1710 -- 10.68.119.191:6789/0<== >>> client.? 10.68.119.191:0/1002580 1 ==== auth(proto 0 30 bytes) v1 ==== >>> 56+0+0 (625540289 0 0) 0x1619a00 con 0x1615a00 >>> 2012-02-02 17:58:19.801919 7ff4bbfb1710 -- 10.68.119.191:6789/0 --> >>> 10.68.119.191:0/1002580 -- auth_reply(proto 2 0 Success) v1 -- ?+0 >>> 0x1619c00 con 0x1615a00 >>> 2012-02-02 17:58:19.802505 7ff4bbfb1710 -- 10.68.119.191:6789/0<== >>> client.? 10.68.119.191:0/1002580 2 ==== auth(proto 2 32 bytes) v1 ==== >>> 58+0+0 (346146289 0 0) 0x161fc00 con 0x1615a00 >>> 2012-02-02 17:58:19.802673 7ff4bbfb1710 -- 10.68.119.191:6789/0 --> >>> 10.68.119.191:0/1002580 -- auth_reply(proto 2 0 Success) v1 -- ?+0 >>> 0x1619a00 con 0x1615a00 >>> 2012-02-02 17:58:19.803473 7ff4bbfb1710 -- 10.68.119.191:6789/0<== >>> client.? 10.68.119.191:0/1002580 3 ==== auth(proto 2 165 bytes) v1 >>> ==== 191+0+0 (3737796417 0 0) 0x1619600 con 0x1615a00 >>> 2012-02-02 17:58:19.803745 7ff4bbfb1710 -- 10.68.119.191:6789/0 --> >>> 10.68.119.191:0/1002580 -- auth_reply(proto 2 0 Success) v1 -- ?+0 >>> 0x161fc00 con 0x1615a00 >>> 2012-02-02 17:58:19.804425 7ff4bbfb1710 -- 10.68.119.191:6789/0<== >>> client.? 10.68.119.191:0/1002580 4 ==== mon_subscribe({monmap=0+}) v2 >>> ==== 23+0+0 (1620593354 0 0) 0x1617380 con 0x1615a00 >>> 2012-02-02 17:58:19.804488 7ff4bbfb1710 -- 10.68.119.191:6789/0 --> >>> 10.68.119.191:0/1002580 -- mon_map v1 -- ?+0 0x1635700 con 0x1615a00 >>> 2012-02-02 17:58:19.804517 7ff4bbfb1710 -- 10.68.119.191:6789/0 --> >>> client.? 10.68.119.191:0/1002580 -- mon_subscribe_ack(300s) v1 -- ?+0 >>> 0x163d780 >>> 2012-02-02 17:58:19.804550 7ff4bbfb1710 -- 10.68.119.191:6789/0<== >>> client.4112 10.68.119.191:0/1002580 5 ==== >>> mon_subscribe({monmap=0+,osdmap=0}) v2 ==== 42+0+0 (982583713 0 0) >>> 0x1617a80 con 0x1615a00 >>> 2012-02-02 17:58:19.804578 7ff4bbfb1710 -- 10.68.119.191:6789/0 --> >>> 10.68.119.191:0/1002580 -- mon_map v1 -- ?+0 0x1617380 con 0x1615a00 >>> 2012-02-02 17:58:19.804656 7ff4bbfb1710 -- 10.68.119.191:6789/0 --> >>> client.? 10.68.119.191:0/1002580 -- osd_map(3..3 src has 1..3) v1 -- >>> ?+0 0x1619600 >>> 2012-02-02 17:58:19.804744 7ff4bbfb1710 -- 10.68.119.191:6789/0 --> >>> client.4112 10.68.119.191:0/1002580 -- mon_subscribe_ack(300s) v1 -- >>> ?+0 0x163d900 >>> 2012-02-02 17:58:19.804778 7ff4bbfb1710 -- 10.68.119.191:6789/0<== >>> client.4112 10.68.119.191:0/1002580 6 ==== >>> mon_subscribe({monmap=0+,osdmap=0}) v2 ==== 42+0+0 (982583713 0 0) >>> 0x16178c0 con 0x1615a00 >>> 2012-02-02 17:58:19.804811 7ff4bbfb1710 -- 10.68.119.191:6789/0 --> >>> 10.68.119.191:0/1002580 -- mon_map v1 -- ?+0 0x1617a80 con 0x1615a00 >>> 2012-02-02 17:58:19.804855 7ff4bbfb1710 -- 10.68.119.191:6789/0 --> >>> client.? 10.68.119.191:0/1002580 -- osd_map(3..3 src has 1..3) v1 -- >>> ?+0 0x1619400 >>> 2012-02-02 17:58:19.804884 7ff4bbfb1710 -- 10.68.119.191:6789/0 --> >>> client.4112 10.68.119.191:0/1002580 -- mon_subscribe_ack(300s) v1 -- >>> ?+0 0x161d300 >>> ----- >> >> >> No problems there. >> >> >>> >>> BTW, I could execute "rados lspools". >>> >>> root@ceph01:~# rados lspools >>> data >>> metadata >>> rbd >> >> >> This might mean the rbd image list object can't be read for some >> reason, or the rbd tool is doing something weird that the rados tool >> isn't. Can you share the output of 'ceph -s' and >> 'rbd ls --log-to-stderr --debug-ms 1 --debug-objecter 20 --debug-monc 20 >> --debug-auth 20'? >> >> You can run 'rados lspools' with those options as well and compare. >> >> >>> I would like to use rbd volume and attach it as virtual device for VM >>> guest on KVM. >>> >>> Could you advice to me ? >>> >>> >>> My environment is below. >>> >>> *ceph cluster is configured on single server. >>> *server is ubuntu 10.10 maverick. >>> *ceph, librados2 and librbd1 packages are installed. >>> (version: 0.38-1maverick) >>> *apparmor is disable. >> >> >> apparmor shouldn't matter if you have libvirt 0.9.9 or newer. >> >> >>> *root@ceph01:/# ls -l /etc/ceph >>> total 16 >>> -rw-r--r-- 1 root root 340 2012-02-02 17:28 ceph.conf >>> -rw------- 1 root root 92 2012-02-02 17:28 client.admin.keyring >>> -rw------- 1 root root 85 2012-02-02 17:28 mds.0.keyring >>> -rw------- 1 root root 85 2012-02-02 17:28 osd.0.keyring >>> */var/lib/ceph/tmp is exists. >>> root@ceph01:/var/log# ls -l /var/lib/ceph/ >>> total 4 >>> drwxrwxrwx 2 root root 4096 2011-11-11 09:28 tmp >>> >>> */etc/ceph/ceph.conf >>> [global] >>> auth supported = cephx >>> keyring = /etc/ceph/$name.keyring >>> [mon] >>> mon data = /data/data/mon$id >>> debug ms = 1 >>> [mon.0] >>> host = ceph01 >>> mon addr = 10.68.119.191:6789 >>> [mds] >>> >>> [mds.0] >>> host = ceph01 >>> [osd] >>> osd data = /data/osd$id >>> osd journal = /data/osd$id/journal >>> osd journal size = 512 >>> osd class tmp = /var/lib/ceph/tmp >>> [osd.0] >>> host = ceph01 >>> btrfs devs = /dev/sdb1 >>> >>> >>> Waiting for your reply, >>> >>> Tomoya. >> >> >>