All of lore.kernel.org
 help / color / mirror / Atom feed
* RadosGW problems with copy in s3
@ 2012-02-28 11:43 Sławomir Skowron
  2012-02-28 17:18 ` Yehuda Sadeh Weinraub
  0 siblings, 1 reply; 12+ messages in thread
From: Sławomir Skowron @ 2012-02-28 11:43 UTC (permalink / raw)
  To: ceph-devel

[-- Attachment #1: Type: text/plain, Size: 2894 bytes --]

After some parallel copy command via botto for many files everything,
going to slow down, and eventualy got timeout from nginx@radosgw.

# ceph -s
2012-02-28 12:16:57.818566    pg v20743: 8516 pgs: 8516 active+clean;
2154 MB data, 53807 MB used, 20240 GB / 21379 GB avail
2012-02-28 12:16:57.845274   mds e1: 0/0/1 up
2012-02-28 12:16:57.845307   osd e719: 78 osds: 78 up, 78 in
2012-02-28 12:16:57.845512   log 2012-02-28 12:14:41.578889 osd.24
10.177.64.4:6839/2063 20 : [WRN] old request
osd_op(client.62138.0:1162 3_.archive/1330427624.13/37828,serwisy.html
[delete,create 0~0,setxattr user.rgw.acl (62),setxattr
user.rgw.content_type (10),setxattr user.rgw.etag (33),setxattr
user.rgw.idtag,setxattr user.rgw.x-amz-copy-source (33),setxattr
user.rgw.x-amz-meta-checksum (33),setxattr
user.rgw.x-amz-metadata-directive (5),setxattr
user.rgw.x-amz-storage-class (9),setxattr user.rgw.idtag (32),setxattr
user.rgw.shadow_name (74),clonerange 0~17913 from
3_.archive/1330427624.13/37828,serwisy.html_g8DHGgLP7Dhi7YSAzYtu_FGT_96NsiF/head
offset 0] 7.645e07f5) v4 received at 2012-02-28 12:14:10.683511
currently waiting for sub ops
2012-02-28 12:16:57.845622   mon e3: 3 mons at
{0=10.177.64.4:6789/0,1=10.177.64.6:6789/0,2=10.177.64.8:6789/0}

and other one from ceph -s

2012-02-28 12:32:16.697642    pg v21010: 8516 pgs: 8516 active+clean;
2161 MB data, 53839 MB used, 20240 GB / 21379 GB avail
2012-02-28 12:32:16.722796   mds e1: 0/0/1 up
2012-02-28 12:32:16.722938   osd e719: 78 osds: 78 up, 78 in
2012-02-28 12:32:16.723204   log 2012-02-28 12:28:30.015814 osd.24
10.177.64.4:6839/2063 78 : [WRN] old request
osd_op(client.62020.0:3344 .dir.3 [call rgw.bucket_complete_op]
7.ccb26a35) v4 received at 2012-02-28 12:27:59.458121 currently
waiting for sub ops
2012-02-28 12:32:16.723327   mon e3: 3 mons at
{0=10.177.64.4:6789/0,1=10.177.64.6:6789/0,2=10.177.64.8:6789/0}

from nginx timeout after 33 seconds.

10.177.8.12 - - - [28/Feb/2012:12:14:19 +0100] "PUT
/www.onet.pl/.archive/1330427624.13/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif
HTTP/1.1" rlength: 528 bsent: 0 rtime: 33.001 urtime: - status: 499
bbsent: 0 httpref: "-" useragent: "Boto/2.2.2 (linux2)"

Inside attachment logs from radosgw in debug mode.

In normal put, get, head etc, everything works OK. Only parallel copy
going to be mess i think :).

I see some in radosgw log:

2012-02-28 12:27:34.774107 7ff99bfef700 can't clone object
www.onet.pl:.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif
to shadow object, tag/shadow_obj haven't been set

and

2012-02-28 12:27:52.798272 7ff99bfef700 prepare_atomic_for_write_impl:
state is not atomic. state=0x7ff93c02cd58

I use ceph 0.42 with ext4 as local filesystems, maybe xattr in ext4
make that problem ??.

-- 
-----
Pozdrawiam

Sławek Skowron

[-- Attachment #2: radosgw_debug.txt --]
[-- Type: text/plain, Size: 13175 bytes --]

2012-02-28 12:27:34.767514 7ff99bfef700 dequeued request req=0x7ff93c08fb10
2012-02-28 12:27:34.767531 7ff99bfef700 RGWWQ: empty
2012-02-28 12:27:34.767544 7ff99bfef700 ====== starting new request req=0x7ff93c08fb10 =====
2012-02-28 12:27:34.767593 7ff99bfef700 req 650:0.000047::::initializing
2012-02-28 12:27:34.767601 7ff99bfef700 in url_decode with /test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif
2012-02-28 12:27:34.767606 7ff99bfef700 src=/test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif
2012-02-28 12:27:34.767612 7ff99bfef700 dest=/test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif
2012-02-28 12:27:34.767630 7ff99bfef700 in url_decode with
2012-02-28 12:27:34.767636 7ff99bfef700 src=
2012-02-28 12:27:34.767640 7ff99bfef700 dest=
2012-02-28 12:27:34.767646 7ff99bfef700 parsed: name= val=
2012-02-28 12:27:34.767662 7ff99bfef700 s->object=.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif s->bucket=test
2012-02-28 12:27:34.767675 7ff99bfef700 meta>> HTTP_X_AMZ_STORAGE_CLASS=STANDARD
2012-02-28 12:27:34.767690 7ff99bfef700 meta>> HTTP_X_AMZ_COPY_SOURCE=test/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif
2012-02-28 12:27:34.767704 7ff99bfef700 meta>> HTTP_X_AMZ_METADATA_DIRECTIVE=COPY
2012-02-28 12:27:34.767717 7ff99bfef700 x>> x-amz-copy-source:test/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif
2012-02-28 12:27:34.767723 7ff99bfef700 x>> x-amz-metadata-directive:COPY
2012-02-28 12:27:34.767727 7ff99bfef700 x>> x-amz-storage-class:STANDARD
2012-02-28 12:27:34.767736 7ff99bfef700 FCGI_ROLE=RESPONDER
2012-02-28 12:27:34.767741 7ff99bfef700 SCRIPT_FILENAME=/var/www/radosgw.fcgi
2012-02-28 12:27:34.767745 7ff99bfef700 QUERY_STRING=
2012-02-28 12:27:34.767750 7ff99bfef700 REQUEST_METHOD=PUT
2012-02-28 12:27:34.767755 7ff99bfef700 CONTENT_TYPE=
2012-02-28 12:27:34.767759 7ff99bfef700 CONTENT_LENGTH=0
2012-02-28 12:27:34.767764 7ff99bfef700 SCRIPT_NAME=/test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif
2012-02-28 12:27:34.767777 7ff99bfef700 REQUEST_URI=/test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif
2012-02-28 12:27:34.767782 7ff99bfef700 DOCUMENT_URI=/test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif
2012-02-28 12:27:34.767787 7ff99bfef700 DOCUMENT_ROOT=/var/www
2012-02-28 12:27:34.767791 7ff99bfef700 SERVER_PROTOCOL=HTTP/1.1
2012-02-28 12:27:34.767795 7ff99bfef700 GATEWAY_INTERFACE=CGI/1.1
2012-02-28 12:27:34.767800 7ff99bfef700 SERVER_SOFTWARE=nginx/1.0.5
2012-02-28 12:27:34.767804 7ff99bfef700 REMOTE_ADDR=10.177.8.11
2012-02-28 12:27:34.767808 7ff99bfef700 REMOTE_PORT=47279
2012-02-28 12:27:34.767813 7ff99bfef700 SERVER_ADDR=10.177.64.8
2012-02-28 12:27:34.767817 7ff99bfef700 SERVER_PORT=80
2012-02-28 12:27:34.767822 7ff99bfef700 SERVER_NAME=
2012-02-28 12:27:34.767826 7ff99bfef700 REDIRECT_STATUS=200
2012-02-28 12:27:34.767830 7ff99bfef700 HTTP_HOST=10.177.0.3
2012-02-28 12:27:34.767835 7ff99bfef700 HTTP_ACCEPT_ENCODING=identity
2012-02-28 12:27:34.767839 7ff99bfef700 HTTP_CONTENT_LENGTH=0
2012-02-28 12:27:34.767844 7ff99bfef700 HTTP_X_AMZ_STORAGE_CLASS=STANDARD
2012-02-28 12:27:34.767848 7ff99bfef700 HTTP_USER_AGENT=Boto/2.2.2 (linux2)
2012-02-28 12:27:34.767852 7ff99bfef700 HTTP_X_AMZ_COPY_SOURCE=test/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif
2012-02-28 12:27:34.767857 7ff99bfef700 HTTP_DATE=Tue, 28 Feb 2012 11:27:32 GMT
2012-02-28 12:27:34.767861 7ff99bfef700 HTTP_AUTHORIZATION=AWS 73VF66Q2JH5PT5K7QJ3A:Colo8TeOgpkMWG6g763152hie2A=
2012-02-28 12:27:34.767866 7ff99bfef700 HTTP_X_AMZ_METADATA_DIRECTIVE=COPY
2012-02-28 12:27:34.767870 7ff99bfef700 HTTP_X_FORWARDED_FOR=10.177.57.86
2012-02-28 12:27:34.767875 7ff99bfef700 HTTP_CONNECTION=close
2012-02-28 12:27:34.767883 7ff99bfef700 req 650:0.000339:s3:PUT /test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif::getting op
2012-02-28 12:27:34.767892 7ff99bfef700 req 650:0.000348:s3:PUT /test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif:copy_obj:authorizing
2012-02-28 12:27:34.767909 7ff99bfef700 get_obj_state: rctx=0x7ff93c0971e0 obj=.users:73VF66Q2JH5PT5K7QJ3A state=0x7ff93c097298 s->prefetch_data=0
2012-02-28 12:27:34.767920 7ff99bfef700 moving .users+73VF66Q2JH5PT5K7QJ3A to cache LRU end
2012-02-28 12:27:34.767926 7ff99bfef700 cache get: name=.users+73VF66Q2JH5PT5K7QJ3A : hit
2012-02-28 12:27:34.767934 7ff99bfef700 get_obj_state: s->obj_tag was set empty
2012-02-28 12:27:34.767942 7ff99bfef700 moving .users+73VF66Q2JH5PT5K7QJ3A to cache LRU end
2012-02-28 12:27:34.767948 7ff99bfef700 cache get: name=.users+73VF66Q2JH5PT5K7QJ3A : hit
2012-02-28 12:27:34.767993 7ff99bfef700 get_canon_resource(): dest=/test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif
2012-02-28 12:27:34.768003 7ff99bfef700 auth_hdr:
2012-02-28 12:27:34.768024 7ff99bfef700 hmac=0a8968f1378e82990c586ea0efadf5e768627b60
2012-02-28 12:27:34.768041 7ff99bfef700 b64=Colo8TeOgpkMWG6g763152hie2A=
2012-02-28 12:27:34.768052 7ff99bfef700 auth_sign=Colo8TeOgpkMWG6g763152hie2A=
2012-02-28 12:27:34.768061 7ff99bfef700 compare=0
2012-02-28 12:27:34.768072 7ff99bfef700 req 650:0.000527:s3:PUT /test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif:copy_obj:reading permissions
2012-02-28 12:27:34.768088 7ff99bfef700 get_obj_state: rctx=0x7ff93c091850 obj=.rgw:test state=0x7ff93c097628 s->prefetch_data=0
2012-02-28 12:27:34.768096 7ff99bfef700 moving .rgw+test to cache LRU end
2012-02-28 12:27:34.768102 7ff99bfef700 cache get: name=.rgw+test : hit
2012-02-28 12:27:34.768111 7ff99bfef700 get_obj_state: s->obj_tag was set empty
2012-02-28 12:27:34.768126 7ff99bfef700 moving .rgw+test to cache LRU end
2012-02-28 12:27:34.768132 7ff99bfef700 cache get: name=.rgw+test : hit
2012-02-28 12:27:34.768144 7ff99bfef700 rgw_get_bucket_info: bucket=test(@.rgw.buckets[3]) owner 0
2012-02-28 12:27:34.768158 7ff99bfef700 get_obj_state: rctx=0x7ff93c091850 obj=test: state=0x7ff93c097628 s->prefetch_data=0
2012-02-28 12:27:34.768174 7ff99bfef700 Read AccessControlPolicy<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>0</ID><DisplayName>ocdn</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>0</ID><DisplayName>ocdn</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy>
2012-02-28 12:27:34.768183 7ff99bfef700 req 650:0.000639:s3:PUT /test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif:copy_obj:verifying op permissions
2012-02-28 12:27:34.768191 7ff99bfef700 in url_decode with test/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif
2012-02-28 12:27:34.768196 7ff99bfef700 src=test/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif
2012-02-28 12:27:34.768202 7ff99bfef700 dest=test/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif
2012-02-28 12:27:34.768207 7ff99bfef700 decoded obj=test/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif
2012-02-28 12:27:34.768218 7ff99bfef700 get_obj_state: rctx=0x7ff93c091850 obj=.rgw:test state=0x7ff93c097628 s->prefetch_data=0
2012-02-28 12:27:34.768226 7ff99bfef700 moving .rgw+test to cache LRU end
2012-02-28 12:27:34.768231 7ff99bfef700 cache get: name=.rgw+test : hit
2012-02-28 12:27:34.768242 7ff99bfef700 rgw_get_bucket_info: bucket=test(@.rgw.buckets[3]) owner 0
2012-02-28 12:27:34.768256 7ff99bfef700 get_obj_state: rctx=0x7ff93c091850 obj=test:__d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif state=0x7ff93c097ce8 s->prefetch_data=0
2012-02-28 12:27:34.769338 7ff99bfef700 get_obj_state: s->obj_tag was set empty
2012-02-28 12:27:34.769449 7ff99bfef700 Read AccessControlPolicy<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>0</ID><DisplayName>ocdn</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>0</ID><DisplayName>ocdn</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="Group"><URI>http://acs.amazonaws.com/groups/global/AllUsers</URI></Grantee><Permission>READ</Permission></Grant></AccessControlList></AccessControlPolicy>
2012-02-28 12:27:34.769471 7ff99bfef700 Searching permissions for uid=0 mask=1
2012-02-28 12:27:34.769485 7ff99bfef700 Found permission: 15
2012-02-28 12:27:34.769496 7ff99bfef700  uid=0 requested perm (type)=1, policy perm=1, user_perm_mask=15, acl perm=1
2012-02-28 12:27:34.769511 7ff99bfef700 get_obj_state: rctx=0x7ff93c091850 obj=test: state=0x7ff93c097628 s->prefetch_data=0
2012-02-28 12:27:34.769528 7ff99bfef700 Read AccessControlPolicy<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>0</ID><DisplayName>ocdn</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>0</ID><DisplayName>ocdn</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy>
2012-02-28 12:27:34.769536 7ff99bfef700 Searching permissions for uid=0 mask=2
2012-02-28 12:27:34.769542 7ff99bfef700 Found permission: 15
2012-02-28 12:27:34.769548 7ff99bfef700  uid=0 requested perm (type)=2, policy perm=2, user_perm_mask=15, acl perm=2
2012-02-28 12:27:34.769597 7ff99bfef700 req 650:0.002052:s3:PUT /test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif:copy_obj:verifying op params
2012-02-28 12:27:34.769610 7ff99bfef700 req 650:0.002065:s3:PUT /test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif:copy_obj:executing
2012-02-28 12:27:34.769668 7ff99bfef700 x>> x-amz-copy-source:test/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif
2012-02-28 12:27:34.769743 7ff99bfef700 x>> x-amz-metadata-directive:COPY
2012-02-28 12:27:34.769786 7ff99bfef700 x>> x-amz-storage-class:STANDARD
2012-02-28 12:27:34.769895 7ff99bfef700 Copy object test(@.rgw.buckets[3]):__d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif => test(@.rgw.buckets[3]):.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif
2012-02-28 12:27:34.769973 7ff99bfef700 get_obj_state: rctx=0x7ff93c091850 obj=test:__d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif state=0x7ff93c097ce8 s->prefetch_data=0
2012-02-28 12:27:34.769992 7ff99bfef700 Read xattr: user.rgw.acl
2012-02-28 12:27:34.770004 7ff99bfef700 Read xattr: user.rgw.content_type
2012-02-28 12:27:34.770015 7ff99bfef700 Read xattr: user.rgw.etag
2012-02-28 12:27:34.770027 7ff99bfef700 Read xattr: user.rgw.idtag
2012-02-28 12:27:34.770038 7ff99bfef700 Read xattr: user.rgw.x-amz-meta-checksum
2012-02-28 12:27:34.770054 7ff99bfef700 get_obj_state: rctx=0x7ff93c091850 obj=test:__d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif state=0x7ff93c097ce8 s->prefetch_data=0
2012-02-28 12:27:34.770061 7ff99bfef700 state->obj_tag is empty, not appending atomic test
2012-02-28 12:27:34.770067 7ff99bfef700 rados->read ofs=0 len=3346
2012-02-28 12:27:34.770856 7ff99bfef700 rados->read r=0 bl.length=3346
2012-02-28 12:27:34.773301 7ff99bfef700 get_obj_state: rctx=0x7ff93c091850 obj=test:.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif state=0x7ff93c099d58 s->prefetch_data=0
2012-02-28 12:27:34.774107 7ff99bfef700 can't clone object test:.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif to shadow object, tag/shadow_obj haven't been set
2012-02-28 12:27:34.774178 7ff99bfef700 calling op.clone_range(dst_ofs=0, src.object=.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif_U-nYhwkjVZT_hzxpSaZG1c9WjqeDFjI range.src_ofs=0 range.len=3346
2012-02-28 12:27:52.797247 7ff99bfef700 get_obj_state: rctx=0x7ff93c091850 obj=test:.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif_U-nYhwkjVZT_hzxpSaZG1c9WjqeDFjI state=0x7ff93c02cd58 s->prefetch_data=0
2012-02-28 12:27:52.798252 7ff99bfef700 get_obj_state: s->obj_tag was set empty
2012-02-28 12:27:52.798272 7ff99bfef700 prepare_atomic_for_write_impl: state is not atomic. state=0x7ff93c02cd58
2012-02-28 12:27:52.798389 7ff99bfef700 --> Status: 200
2012-02-28 12:27:52.798426 7ff99bfef700 --> Content-type: binary/octet-stre
2012-02-28 12:27:52.798537 7ff99bfef700 req 650:18.030990:s3:PUT /test/.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif:copy_obj:http status=200
2012-02-28 12:27:52.798696 7ff99bfef700 ====== req done req=0x7ff93c08fb10 http_status=200 ======

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RadosGW problems with copy in s3
  2012-02-28 11:43 RadosGW problems with copy in s3 Sławomir Skowron
@ 2012-02-28 17:18 ` Yehuda Sadeh Weinraub
       [not found]   ` <CAMwB3ThZ4jPis_Bb9tsf=HTnh3gLd6CNkn2XMr4xo09bO3hEiQ@mail.gmail.com>
  0 siblings, 1 reply; 12+ messages in thread
From: Yehuda Sadeh Weinraub @ 2012-02-28 17:18 UTC (permalink / raw)
  To: Sławomir Skowron; +Cc: ceph-devel

On Tue, Feb 28, 2012 at 3:43 AM, Sławomir Skowron
<slawomir.skowron@gmail.com> wrote:
> After some parallel copy command via botto for many files everything,
> going to slow down, and eventualy got timeout from nginx@radosgw.
>
> # ceph -s
> 2012-02-28 12:16:57.818566    pg v20743: 8516 pgs: 8516 active+clean;
> 2154 MB data, 53807 MB used, 20240 GB / 21379 GB avail
> 2012-02-28 12:16:57.845274   mds e1: 0/0/1 up
> 2012-02-28 12:16:57.845307   osd e719: 78 osds: 78 up, 78 in
> 2012-02-28 12:16:57.845512   log 2012-02-28 12:14:41.578889 osd.24
> 10.177.64.4:6839/2063 20 : [WRN] old request
> osd_op(client.62138.0:1162 3_.archive/1330427624.13/37828,serwisy.html
> [delete,create 0~0,setxattr user.rgw.acl (62),setxattr
> user.rgw.content_type (10),setxattr user.rgw.etag (33),setxattr
> user.rgw.idtag,setxattr user.rgw.x-amz-copy-source (33),setxattr
> user.rgw.x-amz-meta-checksum (33),setxattr
> user.rgw.x-amz-metadata-directive (5),setxattr
> user.rgw.x-amz-storage-class (9),setxattr user.rgw.idtag (32),setxattr
> user.rgw.shadow_name (74),clonerange 0~17913 from
> 3_.archive/1330427624.13/37828,serwisy.html_g8DHGgLP7Dhi7YSAzYtu_FGT_96NsiF/head
> offset 0] 7.645e07f5) v4 received at 2012-02-28 12:14:10.683511
> currently waiting for sub ops
> 2012-02-28 12:16:57.845622   mon e3: 3 mons at
> {0=10.177.64.4:6789/0,1=10.177.64.6:6789/0,2=10.177.64.8:6789/0}
>
> and other one from ceph -s
>
> 2012-02-28 12:32:16.697642    pg v21010: 8516 pgs: 8516 active+clean;
> 2161 MB data, 53839 MB used, 20240 GB / 21379 GB avail
> 2012-02-28 12:32:16.722796   mds e1: 0/0/1 up
> 2012-02-28 12:32:16.722938   osd e719: 78 osds: 78 up, 78 in
> 2012-02-28 12:32:16.723204   log 2012-02-28 12:28:30.015814 osd.24
> 10.177.64.4:6839/2063 78 : [WRN] old request
> osd_op(client.62020.0:3344 .dir.3 [call rgw.bucket_complete_op]
> 7.ccb26a35) v4 received at 2012-02-28 12:27:59.458121 currently
> waiting for sub ops

Looks like osd op takes too long to complete. After how many copies
did this happen?

> 2012-02-28 12:32:16.723327   mon e3: 3 mons at
> {0=10.177.64.4:6789/0,1=10.177.64.6:6789/0,2=10.177.64.8:6789/0}
>
> from nginx timeout after 33 seconds.
>
> 10.177.8.12 - - - [28/Feb/2012:12:14:19 +0100] "PUT
> /www.onet.pl/.archive/1330427624.13/_d/dodatki/887184387c048cb30c88f22169f0f74d%2Cwyslij_belka.gif
> HTTP/1.1" rlength: 528 bsent: 0 rtime: 33.001 urtime: - status: 499
> bbsent: 0 httpref: "-" useragent: "Boto/2.2.2 (linux2)"
>
> Inside attachment logs from radosgw in debug mode.
>
> In normal put, get, head etc, everything works OK. Only parallel copy
> going to be mess i think :).
>
> I see some in radosgw log:
>
> 2012-02-28 12:27:34.774107 7ff99bfef700 can't clone object
> www.onet.pl:.archive/1330428452.43/_d/dodatki/887184387c048cb30c88f22169f0f74d,wyslij_belka.gif
> to shadow object, tag/shadow_obj haven't been set

That's normal.

>
> and
>
> 2012-02-28 12:27:52.798272 7ff99bfef700 prepare_atomic_for_write_impl:
> state is not atomic. state=0x7ff93c02cd58

That's normal too.
>
> I use ceph 0.42 with ext4 as local filesystems, maybe xattr in ext4
> make that problem ??.
>

Don't think so. Looking above, the used xattrs size are much less than
that. Can you reproduce with 'debug ms = 1', and provide the osd logs
too?

Thanks,
Yehuda
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RadosGW problems with copy in s3
       [not found]     ` <CAC-hyiHB0aDsEO8ES8pqhcXB38MR1QNZLMB=jyRm2p_Yp7V8Aw@mail.gmail.com>
@ 2012-02-28 22:41       ` Yehuda Sadeh Weinraub
  2012-02-29 13:06         ` Sławomir Skowron
  0 siblings, 1 reply; 12+ messages in thread
From: Yehuda Sadeh Weinraub @ 2012-02-28 22:41 UTC (permalink / raw)
  To: Sławomir Skowron; +Cc: ceph-devel

(resending to list)

On Tue, Feb 28, 2012 at 11:53 AM, Sławomir Skowron
<slawomir.skowron@gmail.com> wrote:
>
> 2012/2/28 Yehuda Sadeh Weinraub <yehuda.sadeh@dreamhost.com>:
> > On Tue, Feb 28, 2012 at 3:43 AM, Sławomir Skowron
> > <slawomir.skowron@gmail.com> wrote:
> >> After some parallel copy command via botto for many files everything,
> >> going to slow down, and eventualy got timeout from nginx@radosgw.


Note that you're overwriting the same object. Is that intentional?

>
> Reproduced logs in attachment.
>

I was able to recreate the issue. The problem is specifically related
to the fact that you're overwriting the same object from 10s of
parallel threads. What happens is that our race-detection code (that
is related to the radosgw atomic object write) detects that the
underlying object has been written and it needs to reread its header
before overwriting it. This works well when there are a few writers to
the same object, but doesn't scale very well. I opened issue #2120 to
track the issue.

Yehuda
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RadosGW problems with copy in s3
  2012-02-28 22:41       ` Yehuda Sadeh Weinraub
@ 2012-02-29 13:06         ` Sławomir Skowron
  2012-02-29 19:44           ` Yehuda Sadeh Weinraub
  0 siblings, 1 reply; 12+ messages in thread
From: Sławomir Skowron @ 2012-02-29 13:06 UTC (permalink / raw)
  To: Yehuda Sadeh Weinraub; +Cc: ceph-devel

Ok, it's intentional.

We are checking meta info about files, then, checking md5 of file
content. In parallel, updating object that have change, and then
archiving this objects in another key, and last thing is deleting
objects that expires.

This happens over and over, because, this site is changing many times.

Now i don't have any idea, how to workaround this problem, without
shutdown this app :(

Another question is, why data from radosgw writing almost everything
only on one osd, and copies to 2 others, and only those. Is anyone can
explain this to me ??

/dev/sdu              275G  605M  260G   1% /vol0/data/osd.18
/dev/sdw              275G  608M  260G   1% /vol0/data/osd.21
/dev/sdz              275G  638M  260G   1% /vol0/data/osd.24
/dev/sde              275G  605M  260G   1% /vol0/data/osd.3
/dev/sdr              275G  605M  260G   1% /vol0/data/osd.16
/dev/sdaa             275G  696M  260G   1% /vol0/data/osd.25
/dev/sdp              275G  605M  260G   1% /vol0/data/osd.14
/dev/sdd              275G  605M  260G   1% /vol0/data/osd.2
/dev/sdk              275G  605M  260G   1% /vol0/data/osd.8
/dev/sdh              275G  608M  260G   1% /vol0/data/osd.6
/dev/sds              275G  605M  260G   1% /vol0/data/osd.17
/dev/sdf              275G  638M  260G   1% /vol0/data/osd.4
/dev/sdj              275G  637M  260G   1% /vol0/data/osd.9
/dev/sdc              275G  604M  260G   1% /vol0/data/osd.1
/dev/sdv              275G  2.7G  258G   2% /vol0/data/osd.20
/dev/sdn              275G  607M  260G   1% /vol0/data/osd.12
/dev/sdo              275G  605M  260G   1% /vol0/data/osd.13
/dev/sdg              275G  605M  260G   1% /vol0/data/osd.5
/dev/sdy              275G  633M  260G   1% /vol0/data/osd.23
/dev/sdm              275G  605M  260G   1% /vol0/data/osd.11
/dev/sdx              275G  605M  260G   1% /vol0/data/osd.22
/dev/sdb              275G  608M  260G   1% /vol0/data/osd.0
/dev/sdq              275G  605M  260G   1% /vol0/data/osd.15
/dev/sdl              275G  605M  260G   1% /vol0/data/osd.10
/dev/sdi              275G  605M  260G   1% /vol0/data/osd.7
/dev/sdt              275G  604M  260G   1% /vol0/data/osd.19


2012/2/28 Yehuda Sadeh Weinraub <yehuda.sadeh@dreamhost.com>:
> (resending to list)
>
> On Tue, Feb 28, 2012 at 11:53 AM, Sławomir Skowron
> <slawomir.skowron@gmail.com> wrote:
>>
>> 2012/2/28 Yehuda Sadeh Weinraub <yehuda.sadeh@dreamhost.com>:
>> > On Tue, Feb 28, 2012 at 3:43 AM, Sławomir Skowron
>> > <slawomir.skowron@gmail.com> wrote:
>> >> After some parallel copy command via botto for many files everything,
>> >> going to slow down, and eventualy got timeout from nginx@radosgw.
>
>
> Note that you're overwriting the same object. Is that intentional?
>
>>
>> Reproduced logs in attachment.
>>
>
> I was able to recreate the issue. The problem is specifically related
> to the fact that you're overwriting the same object from 10s of
> parallel threads. What happens is that our race-detection code (that
> is related to the radosgw atomic object write) detects that the
> underlying object has been written and it needs to reread its header
> before overwriting it. This works well when there are a few writers to
> the same object, but doesn't scale very well. I opened issue #2120 to
> track the issue.

You join this with Tasks #1956, and when this task go to development ??

> Yehuda

-- 
-----
Pozdrawiam

Sławek "sZiBis" Skowron
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RadosGW problems with copy in s3
  2012-02-29 13:06         ` Sławomir Skowron
@ 2012-02-29 19:44           ` Yehuda Sadeh Weinraub
       [not found]             ` <CAMwB3TjuDhRfTheJvTKx7yiTy7JSiTp2h795fuNS-mjFeJP=Fg@mail.gmail.com>
  0 siblings, 1 reply; 12+ messages in thread
From: Yehuda Sadeh Weinraub @ 2012-02-29 19:44 UTC (permalink / raw)
  To: Sławomir Skowron; +Cc: ceph-devel

On Wed, Feb 29, 2012 at 5:06 AM, Sławomir Skowron
<slawomir.skowron@gmail.com> wrote:
>
> Ok, it's intentional.
>
> We are checking meta info about files, then, checking md5 of file
> content. In parallel, updating object that have change, and then
> archiving this objects in another key, and last thing is deleting
> objects that expires.
>
> This happens over and over, because, this site is changing many times.
>
> Now i don't have any idea, how to workaround this problem, without
> shutdown this app :(

I looked at your osd log again, and there are other things that don't
look right. I'll also need you to turn on 'debug osd = 20' and 'debug
filestore = 20'.
Other than that, I just pushed a workaround that might improve things.
It's on the wip-rgw-atomic-no-retry branch on github (based on
0.42.2), so you might want to give it a spin and let us know whether
it actually improved things.


>
> Another question is, why data from radosgw writing almost everything
> only on one osd, and copies to 2 others, and only those. Is anyone can
> explain this to me ??
>
> /dev/sdu              275G  605M  260G   1% /vol0/data/osd.18
> /dev/sdw              275G  608M  260G   1% /vol0/data/osd.21
> /dev/sdz              275G  638M  260G   1% /vol0/data/osd.24
> /dev/sde              275G  605M  260G   1% /vol0/data/osd.3
> /dev/sdr              275G  605M  260G   1% /vol0/data/osd.16
> /dev/sdaa             275G  696M  260G   1% /vol0/data/osd.25
> /dev/sdp              275G  605M  260G   1% /vol0/data/osd.14
> /dev/sdd              275G  605M  260G   1% /vol0/data/osd.2
> /dev/sdk              275G  605M  260G   1% /vol0/data/osd.8
> /dev/sdh              275G  608M  260G   1% /vol0/data/osd.6
> /dev/sds              275G  605M  260G   1% /vol0/data/osd.17
> /dev/sdf              275G  638M  260G   1% /vol0/data/osd.4
> /dev/sdj              275G  637M  260G   1% /vol0/data/osd.9
> /dev/sdc              275G  604M  260G   1% /vol0/data/osd.1
> /dev/sdv              275G  2.7G  258G   2% /vol0/data/osd.20
> /dev/sdn              275G  607M  260G   1% /vol0/data/osd.12
> /dev/sdo              275G  605M  260G   1% /vol0/data/osd.13
> /dev/sdg              275G  605M  260G   1% /vol0/data/osd.5
> /dev/sdy              275G  633M  260G   1% /vol0/data/osd.23
> /dev/sdm              275G  605M  260G   1% /vol0/data/osd.11
> /dev/sdx              275G  605M  260G   1% /vol0/data/osd.22
> /dev/sdb              275G  608M  260G   1% /vol0/data/osd.0
> /dev/sdq              275G  605M  260G   1% /vol0/data/osd.15
> /dev/sdl              275G  605M  260G   1% /vol0/data/osd.10
> /dev/sdi              275G  605M  260G   1% /vol0/data/osd.7
> /dev/sdt              275G  604M  260G   1% /vol0/data/osd.19
>
That's probably because you're overwriting the same object over and
over. When you do that the old object is still being kept so that any
pending readers may complete reading it. You should run a maintenance
utility in order to remove those (radosgw-admin temp remove --date=..
to remove all temp objects that were created before a specific date).

Yehuda
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RadosGW problems with copy in s3
       [not found]             ` <CAMwB3TjuDhRfTheJvTKx7yiTy7JSiTp2h795fuNS-mjFeJP=Fg@mail.gmail.com>
@ 2012-03-05 10:23               ` Sławomir Skowron
  2012-03-05 18:59                 ` Yehuda Sadeh Weinraub
  0 siblings, 1 reply; 12+ messages in thread
From: Sławomir Skowron @ 2012-03-05 10:23 UTC (permalink / raw)
  To: Yehuda Sadeh Weinraub; +Cc: ceph-devel

2012/3/1 Sławomir Skowron <slawomir.skowron@gmail.com>:
> 2012/2/29 Yehuda Sadeh Weinraub <yehuda.sadeh@dreamhost.com>:
>> On Wed, Feb 29, 2012 at 5:06 AM, Sławomir Skowron
>> <slawomir.skowron@gmail.com> wrote:
>>>
>>> Ok, it's intentional.
>>>
>>> We are checking meta info about files, then, checking md5 of file
>>> content. In parallel, updating object that have change, and then
>>> archiving this objects in another key, and last thing is deleting
>>> objects that expires.
>>>
>>> This happens over and over, because, this site is changing many times.
>>>
>>> Now i don't have any idea, how to workaround this problem, without
>>> shutdown this app :(
>>
>> I looked at your osd log again, and there are other things that don't
>> look right. I'll also need you to turn on 'debug osd = 20' and 'debug
>> filestore = 20'.
>
> osd.24 almost 10 minutes of log in debug, as above in attachment.
>
>> Other than that, I just pushed a workaround that might improve things.
>> It's on the wip-rgw-atomic-no-retry branch on github (based on
>> 0.42.2), so you might want to give it a spin and let us know whether
>> it actually improved things.
>
> Ok i will try, and let you know soon.

Unfortunately, no improvment after upgrade for this version.

Ningx reports growing time for radosgw backend, and after increase
this time to 33 seconds, changing into timeout.

10.177.8.11 - - - [05/Mar/2012:10:14:10 +0100] "PUT
/test/_x/sidebar2/_d/widgets/sym/button_bw.png HTTP/1.1" rlength: 1049
bsent: 232 rtime: 2.487 urtime: 2.485 status: 200 bbsent: 25 httpref:
"-" useragent: "Boto/2.2.2 (linux2)"
10.177.8.12 - - - [05/Mar/2012:10:14:10 +0100] "PUT
/test/_x/sidebar2/_d/widgets/sym/button_bw.png?acl HTTP/1.1" rlength:
353 bsent: 157 rtime: 0.013 urtime: 0.013 status: 200 bbsent: 5
httpref: "-" useragent: "Boto/2.2.2 (linux2)"
10.177.8.11 - - - [05/Mar/2012:10:14:13 +0100] "PUT
/test/_x/sidebar2/_d/widgets/newmail/list.png?acl HTTP/1.1" rlength:
352 bsent: 157 rtime: 0.006 urtime: 0.006 status: 200 bbsent: 5
httpref: "-" useragent: "Boto/2.2.2 (linux2)"
10.177.8.11 - - - [05/Mar/2012:10:14:17 +0100] "PUT
/test/_x/sidebar2/_d/widgets/zumisearch/button.png?acl HTTP/1.1"
rlength: 357 bsent: 157 rtime: 0.006 urtime: 0.006 status: 200 bbsent:
5 httpref: "-" useragent: "Boto/2.2.2 (linux2)"
10.177.8.11 - - - [05/Mar/2012:10:14:21 +0100] "PUT
/test/_g/sidebar/_d/widgets/globaltime/city_point_green.gif HTTP/1.1"
rlength: 1297 bsent: 232 rtime: 13.427 urtime: 13.426 status: 200
bbsent: 25 httpref: "-" useragent: "Boto/2.2.2 (linux2)"
10.177.8.11 - - - [05/Mar/2012:10:14:26 +0100] "PUT
/test/_x/sidebar2/_d/widgets/zum/zumikombw.png HTTP/1.1" rlength: 4278
bsent: 232 rtime: 18.384 urtime: 18.383 status: 200 bbsent: 25
httpref: "-" useragent: "Boto/2.2.2 (linux2)"
10.177.8.12 - - - [05/Mar/2012:10:14:29 +0100] "PUT
/test/_x/sidebar2/_d/widgets/allegro/allegro.png HTTP/1.1" rlength:
4129 bsent: 232 rtime: 21.963 urtime: 21.962 status: 200 bbsent: 25
httpref: "-" useragent: "Boto/2.2.2 (linux2)"
10.177.8.11 - - - [05/Mar/2012:10:14:32 +0100] "PUT
/test/_g/sidebar/_d/popover/mdcbtn.gif HTTP/1.1" rlength: 1532 bsent:
232 rtime: 25.105 urtime: 25.104 status: 200 bbsent: 25 httpref: "-"
useragent: "Boto/2.2.2 (linux2)"
10.177.8.12 - - - [05/Mar/2012:10:14:40 +0100] "PUT
/test/40622%2Cserwisy.html HTTP/1.1" rlength: 18178 bsent: 232 rtime:
32.651 urtime: 32.648 status: 200 bbsent: 25 httpref: "-" useragent:
"Boto/2.2.2 (linux2)"
10.177.8.12 - - - [05/Mar/2012:10:14:40 +0100] "PUT
/test/_x/sidebar/cssb/popover/btn-nrm.gif HTTP/1.1" rlength: 539
bsent: 25 rtime: 33.003 urtime: - status: 499 bbsent: 25 httpref: "-"
useragent: "Boto/2.2.2 (linux2)"
10.177.8.11 - - - [05/Mar/2012:10:14:40 +0100] "PUT
/test/DaneAjax%2CWiadomosciLokalne%2Cajax.json%3Fmiasto%3Dslask%26z%3D0%26v%3D201203051014
HTTP/1.1" rlength: 1572 bsent: 25 rtime: 33.002 urtime: - status: 499
bbsent: 25 httpref: "-" useragent: "Boto/2.2.2 (linux2)"
10.177.8.12 - - - [05/Mar/2012:10:14:40 +0100] "PUT
/test/37828%2Cserwisy.html HTTP/1.1" rlength: 18363 bsent: 25 rtime:
33.003 urtime: - status: 499 bbsent: 25 httpref: "-" useragent:
"Boto/2.2.2 (linux2)"
10.177.8.11 - - - [05/Mar/2012:10:14:40 +0100] "PUT
/test/DaneAjax%2CWiadomosciLokalne%2Cajax.json%3Fmiasto%3Dpoznan%26z%3D0%26v%3D201203051014
HTTP/1.1" rlength: 1602 bsent: 25 rtime: 33.004 urtime: - status: 499
bbsent: 25 httpref: "-" useragent: "Boto/2.2.2 (linux2)"
10.177.8.11 - - - [05/Mar/2012:10:14:40 +0100] "PUT
/test/DaneAjax%2CWiadomosciLokalne%2Cajax.json%3Fmiasto%3Dbydgoszcz%26z%3D0%26v%3D201203051014
HTTP/1.1" rlength: 1738 bsent: 25 rtime: 33.005 urtime: - status: 499
bbsent: 25 httpref: "-" useragent: "Boto/2.2.2 (linux2)"
10.177.8.11 - - - [05/Mar/2012:10:14:40 +0100] "PUT
/test/_x/sidebar2/_d/widgets/sym/button.png HTTP/1.1" rlength: 1398
bsent: 25 rtime: 33.001 urtime: - status: 499 bbsent: 25 httpref: "-"
useragent: "Boto/2.2.2 (linux2)"

>
>>
>>
>>>
>>> Another question is, why data from radosgw writing almost everything
>>> only on one osd, and copies to 2 others, and only those. Is anyone can
>>> explain this to me ??
>>>
>>> /dev/sdu              275G  605M  260G   1% /vol0/data/osd.18
>>> /dev/sdw              275G  608M  260G   1% /vol0/data/osd.21
>>> /dev/sdz              275G  638M  260G   1% /vol0/data/osd.24
>>> /dev/sde              275G  605M  260G   1% /vol0/data/osd.3
>>> /dev/sdr              275G  605M  260G   1% /vol0/data/osd.16
>>> /dev/sdaa             275G  696M  260G   1% /vol0/data/osd.25
>>> /dev/sdp              275G  605M  260G   1% /vol0/data/osd.14
>>> /dev/sdd              275G  605M  260G   1% /vol0/data/osd.2
>>> /dev/sdk              275G  605M  260G   1% /vol0/data/osd.8
>>> /dev/sdh              275G  608M  260G   1% /vol0/data/osd.6
>>> /dev/sds              275G  605M  260G   1% /vol0/data/osd.17
>>> /dev/sdf              275G  638M  260G   1% /vol0/data/osd.4
>>> /dev/sdj              275G  637M  260G   1% /vol0/data/osd.9
>>> /dev/sdc              275G  604M  260G   1% /vol0/data/osd.1
>>> /dev/sdv              275G  2.7G  258G   2% /vol0/data/osd.20
>>> /dev/sdn              275G  607M  260G   1% /vol0/data/osd.12
>>> /dev/sdo              275G  605M  260G   1% /vol0/data/osd.13
>>> /dev/sdg              275G  605M  260G   1% /vol0/data/osd.5
>>> /dev/sdy              275G  633M  260G   1% /vol0/data/osd.23
>>> /dev/sdm              275G  605M  260G   1% /vol0/data/osd.11
>>> /dev/sdx              275G  605M  260G   1% /vol0/data/osd.22
>>> /dev/sdb              275G  608M  260G   1% /vol0/data/osd.0
>>> /dev/sdq              275G  605M  260G   1% /vol0/data/osd.15
>>> /dev/sdl              275G  605M  260G   1% /vol0/data/osd.10
>>> /dev/sdi              275G  605M  260G   1% /vol0/data/osd.7
>>> /dev/sdt              275G  604M  260G   1% /vol0/data/osd.19
>>>
>> That's probably because you're overwriting the same object over and
>> over. When you do that the old object is still being kept so that any
>> pending readers may complete reading it. You should run a maintenance
>> utility in order to remove those (radosgw-admin temp remove --date=..
>> to remove all temp objects that were created before a specific date).
>
> Nice, useful advice, thanks.
>
>>
>> Yehuda
>
> --
> -----
> Pozdrawiam
>
> Sławek Skowron



-- 
-----
Pozdrawiam

Sławek "sZiBis" Skowron
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RadosGW problems with copy in s3
  2012-03-05 10:23               ` Sławomir Skowron
@ 2012-03-05 18:59                 ` Yehuda Sadeh Weinraub
  2012-03-05 21:21                   ` Sławomir Skowron
  0 siblings, 1 reply; 12+ messages in thread
From: Yehuda Sadeh Weinraub @ 2012-03-05 18:59 UTC (permalink / raw)
  To: Sławomir Skowron; +Cc: ceph-devel

On Mon, Mar 5, 2012 at 2:23 AM, Sławomir Skowron
<slawomir.skowron@gmail.com> wrote:
> 2012/3/1 Sławomir Skowron <slawomir.skowron@gmail.com>:
>> 2012/2/29 Yehuda Sadeh Weinraub <yehuda.sadeh@dreamhost.com>:
>>> On Wed, Feb 29, 2012 at 5:06 AM, Sławomir Skowron
>>> <slawomir.skowron@gmail.com> wrote:
>>>>
>>>> Ok, it's intentional.
>>>>
>>>> We are checking meta info about files, then, checking md5 of file
>>>> content. In parallel, updating object that have change, and then
>>>> archiving this objects in another key, and last thing is deleting
>>>> objects that expires.
>>>>
>>>> This happens over and over, because, this site is changing many times.
>>>>
>>>> Now i don't have any idea, how to workaround this problem, without
>>>> shutdown this app :(
>>>
>>> I looked at your osd log again, and there are other things that don't
>>> look right. I'll also need you to turn on 'debug osd = 20' and 'debug
>>> filestore = 20'.
>>
>> osd.24 almost 10 minutes of log in debug, as above in attachment.
>>
>>> Other than that, I just pushed a workaround that might improve things.
>>> It's on the wip-rgw-atomic-no-retry branch on github (based on
>>> 0.42.2), so you might want to give it a spin and let us know whether
>>> it actually improved things.
>>
>> Ok i will try, and let you know soon.
>
> Unfortunately, no improvment after upgrade for this version.
>
It looks like an issue with updating the bucket index, but I'm having
trouble confirming it, as the log provided (of osd.24) doesn't contain
any relevant operations. If you could provide a log from the relevant
osd it may be very helpful.

You can find the relevant osd by looking at an operation that took too
long, and look for a request like the following:

2012-02-28 20:20:10.944859 7fb1affb7700 -- 10.177.64.6:0/1020439 -->
10.177.64.4:6839/7954 -- osd_op(client.65007.0:587 .dir.3 [call
rgw.bucket_prepare_op] 7.ccb26a35) v4 -- ?+0 0xf25270 con 0xbcd1c0

It would be easiest looking for the reply to that request as it will
contain the osd id (search for a line that contains osd_op_reply and
the client.65007.0:587 request id).

In the mean time, I created issue #2139 for a probable culprit. Having
the relevant logs will allow us to verify whether you're hitting that
or another issue.

Thanks,
Yehuda
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RadosGW problems with copy in s3
  2012-03-05 18:59                 ` Yehuda Sadeh Weinraub
@ 2012-03-05 21:21                   ` Sławomir Skowron
  2012-03-06 10:08                     ` Sławomir Skowron
  0 siblings, 1 reply; 12+ messages in thread
From: Sławomir Skowron @ 2012-03-05 21:21 UTC (permalink / raw)
  To: Yehuda Sadeh Weinraub; +Cc: ceph-devel@vger.kernel.org

On 5 mar 2012, at 19:59, Yehuda Sadeh Weinraub
<yehuda.sadeh@dreamhost.com> wrote:

> On Mon, Mar 5, 2012 at 2:23 AM, Sławomir Skowron
> <slawomir.skowron@gmail.com> wrote:
>> 2012/3/1 Sławomir Skowron <slawomir.skowron@gmail.com>:
>>> 2012/2/29 Yehuda Sadeh Weinraub <yehuda.sadeh@dreamhost.com>:
>>>> On Wed, Feb 29, 2012 at 5:06 AM, Sławomir Skowron
>>>> <slawomir.skowron@gmail.com> wrote:
>>>>>
>>>>> Ok, it's intentional.
>>>>>
>>>>> We are checking meta info about files, then, checking md5 of file
>>>>> content. In parallel, updating object that have change, and then
>>>>> archiving this objects in another key, and last thing is deleting
>>>>> objects that expires.
>>>>>
>>>>> This happens over and over, because, this site is changing many times.
>>>>>
>>>>> Now i don't have any idea, how to workaround this problem, without
>>>>> shutdown this app :(
>>>>
>>>> I looked at your osd log again, and there are other things that don't
>>>> look right. I'll also need you to turn on 'debug osd = 20' and 'debug
>>>> filestore = 20'.
>>>
>>> osd.24 almost 10 minutes of log in debug, as above in attachment.
>>>
>>>> Other than that, I just pushed a workaround that might improve things.
>>>> It's on the wip-rgw-atomic-no-retry branch on github (based on
>>>> 0.42.2), so you might want to give it a spin and let us know whether
>>>> it actually improved things.
>>>
>>> Ok i will try, and let you know soon.
>>
>> Unfortunately, no improvment after upgrade for this version.
>>
> It looks like an issue with updating the bucket index, but I'm having
> trouble confirming it, as the log provided (of osd.24) doesn't contain
> any relevant operations. If you could provide a log from the relevant
> osd it may be very helpful.
>
> You can find the relevant osd by looking at an operation that took too
> long, and look for a request like the following:
>
> 2012-02-28 20:20:10.944859 7fb1affb7700 -- 10.177.64.6:0/1020439 -->
> 10.177.64.4:6839/7954 -- osd_op(client.65007.0:587 .dir.3 [call
> rgw.bucket_prepare_op] 7.ccb26a35) v4 -- ?+0 0xf25270 con 0xbcd1c0
>
> It would be easiest looking for the reply to that request as it will
> contain the osd id (search for a line that contains osd_op_reply and
> the client.65007.0:587 request id).
>
> In the mean time, I created issue #2139 for a probable culprit. Having
> the relevant logs will allow us to verify whether you're hitting that
> or another issue.
>
> Thanks,
> Yehuda

Ok, because of time difference between as i will try too find this on
the morning in job. If there will be insufficient verbosity of logs i
will try too start all OSD in debug, as you write earlier, and then
generate, the problem again.
I try to send logs as soon as possible.

Regards
Slawomir Skowron
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RadosGW problems with copy in s3
  2012-03-05 21:21                   ` Sławomir Skowron
@ 2012-03-06 10:08                     ` Sławomir Skowron
  2012-03-06 17:53                       ` Yehuda Sadeh Weinraub
  0 siblings, 1 reply; 12+ messages in thread
From: Sławomir Skowron @ 2012-03-06 10:08 UTC (permalink / raw)
  To: Yehuda Sadeh Weinraub; +Cc: ceph-devel@vger.kernel.org

2012/3/5 Sławomir Skowron <szibis@gmail.com>:
> On 5 mar 2012, at 19:59, Yehuda Sadeh Weinraub
> <yehuda.sadeh@dreamhost.com> wrote:
>
>> On Mon, Mar 5, 2012 at 2:23 AM, Sławomir Skowron
>> <slawomir.skowron@gmail.com> wrote:
>>> 2012/3/1 Sławomir Skowron <slawomir.skowron@gmail.com>:
>>>> 2012/2/29 Yehuda Sadeh Weinraub <yehuda.sadeh@dreamhost.com>:
>>>>> On Wed, Feb 29, 2012 at 5:06 AM, Sławomir Skowron
>>>>> <slawomir.skowron@gmail.com> wrote:
>>>>>>
>>>>>> Ok, it's intentional.
>>>>>>
>>>>>> We are checking meta info about files, then, checking md5 of file
>>>>>> content. In parallel, updating object that have change, and then
>>>>>> archiving this objects in another key, and last thing is deleting
>>>>>> objects that expires.
>>>>>>
>>>>>> This happens over and over, because, this site is changing many times.
>>>>>>
>>>>>> Now i don't have any idea, how to workaround this problem, without
>>>>>> shutdown this app :(
>>>>>
>>>>> I looked at your osd log again, and there are other things that don't
>>>>> look right. I'll also need you to turn on 'debug osd = 20' and 'debug
>>>>> filestore = 20'.
>>>>
>>>> osd.24 almost 10 minutes of log in debug, as above in attachment.
>>>>
>>>>> Other than that, I just pushed a workaround that might improve things.
>>>>> It's on the wip-rgw-atomic-no-retry branch on github (based on
>>>>> 0.42.2), so you might want to give it a spin and let us know whether
>>>>> it actually improved things.
>>>>
>>>> Ok i will try, and let you know soon.
>>>
>>> Unfortunately, no improvment after upgrade for this version.
>>>
>> It looks like an issue with updating the bucket index, but I'm having
>> trouble confirming it, as the log provided (of osd.24) doesn't contain
>> any relevant operations. If you could provide a log from the relevant
>> osd it may be very helpful.
>>
>> You can find the relevant osd by looking at an operation that took too
>> long, and look for a request like the following:
>>
>> 2012-02-28 20:20:10.944859 7fb1affb7700 -- 10.177.64.6:0/1020439 -->
>> 10.177.64.4:6839/7954 -- osd_op(client.65007.0:587 .dir.3 [call
>> rgw.bucket_prepare_op] 7.ccb26a35) v4 -- ?+0 0xf25270 con 0xbcd1c0
>>
>> It would be easiest looking for the reply to that request as it will
>> contain the osd id (search for a line that contains osd_op_reply and
>> the client.65007.0:587 request id).
>>
>> In the mean time, I created issue #2139 for a probable culprit. Having
>> the relevant logs will allow us to verify whether you're hitting that
>> or another issue.
>>
>> Thanks,
>> Yehuda
>
> Ok, because of time difference between as i will try too find this on
> the morning in job. If there will be insufficient verbosity of logs i
> will try too start all OSD in debug, as you write earlier, and then
> generate, the problem again.
> I try to send logs as soon as possible.
>
> Regards
> Slawomir Skowron

All logs from osd.24, osd.62, and osd.36 with osd debug =20 and
filestore debug = 20 from 2012-03-06 10:25 and more.

http://217.144.195.170/ceph/osd.24.log.tar.gz   (348MB) - machine 1 - rack1
http://217.144.195.170/ceph/osd.36.log.tar.gz   (26MB) - machine 2 - rack 2
http://217.144.195.170/ceph/osd.62.log.tar.gz   (23MB) - machine3 - rack 3

-- 
-----
Pozdrawiam

Sławek "sZiBis" Skowron
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RadosGW problems with copy in s3
  2012-03-06 10:08                     ` Sławomir Skowron
@ 2012-03-06 17:53                       ` Yehuda Sadeh Weinraub
  2012-03-06 18:20                         ` Sławomir Skowron
  0 siblings, 1 reply; 12+ messages in thread
From: Yehuda Sadeh Weinraub @ 2012-03-06 17:53 UTC (permalink / raw)
  To: Sławomir Skowron; +Cc: ceph-devel@vger.kernel.org

On Tue, Mar 6, 2012 at 2:08 AM, Sławomir Skowron <szibis@gmail.com> wrote:

> All logs from osd.24, osd.62, and osd.36 with osd debug =20 and
> filestore debug = 20 from 2012-03-06 10:25 and more.
>
> http://217.144.195.170/ceph/osd.24.log.tar.gz   (348MB) - machine 1 - rack1
> http://217.144.195.170/ceph/osd.36.log.tar.gz   (26MB) - machine 2 - rack 2
> http://217.144.195.170/ceph/osd.62.log.tar.gz   (23MB) - machine3 - rack 3
>

I looked at the logs, and it seems that you hit the tmap scaling
issue. Luckily we're fixing that for 0.44, so it's going to be fixed
in the next version.

Yehuda
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RadosGW problems with copy in s3
  2012-03-06 17:53                       ` Yehuda Sadeh Weinraub
@ 2012-03-06 18:20                         ` Sławomir Skowron
  2012-03-26 13:05                           ` Sławomir Skowron
  0 siblings, 1 reply; 12+ messages in thread
From: Sławomir Skowron @ 2012-03-06 18:20 UTC (permalink / raw)
  To: Yehuda Sadeh Weinraub; +Cc: ceph-devel@vger.kernel.org

On 6 mar 2012, at 18:53, Yehuda Sadeh Weinraub
<yehuda.sadeh@dreamhost.com> wrote:

> On Tue, Mar 6, 2012 at 2:08 AM, Sławomir Skowron <szibis@gmail.com> wrote:
>
>> All logs from osd.24, osd.62, and osd.36 with osd debug =20 and
>> filestore debug = 20 from 2012-03-06 10:25 and more.
>>
>> http://217.144.195.170/ceph/osd.24.log.tar.gz   (348MB) - machine 1 - rack1
>> http://217.144.195.170/ceph/osd.36.log.tar.gz   (26MB) - machine 2 - rack 2
>> http://217.144.195.170/ceph/osd.62.log.tar.gz   (23MB) - machine3 - rack 3
>>
>
> I looked at the logs, and it seems that you hit the tmap scaling
> issue. Luckily we're fixing that for 0.44, so it's going to be fixed
> in the next version.
>
> Yehuda

That's wonderful news. Thanks for everything.

Regards
Slawomir Skowron
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RadosGW problems with copy in s3
  2012-03-06 18:20                         ` Sławomir Skowron
@ 2012-03-26 13:05                           ` Sławomir Skowron
  0 siblings, 0 replies; 12+ messages in thread
From: Sławomir Skowron @ 2012-03-26 13:05 UTC (permalink / raw)
  To: Yehuda Sadeh Weinraub; +Cc: ceph-devel@vger.kernel.org

After some tests. PUT/GET/DELETE via Radosgw now in version 0.44 works
much better.

End of this topic.

Thanks.

2012/3/6 Sławomir Skowron <szibis@gmail.com>:
> On 6 mar 2012, at 18:53, Yehuda Sadeh Weinraub
> <yehuda.sadeh@dreamhost.com> wrote:
>
>> On Tue, Mar 6, 2012 at 2:08 AM, Sławomir Skowron <szibis@gmail.com> wrote:
>>
>>> All logs from osd.24, osd.62, and osd.36 with osd debug =20 and
>>> filestore debug = 20 from 2012-03-06 10:25 and more.
>>>
>>> http://217.144.195.170/ceph/osd.24.log.tar.gz   (348MB) - machine 1 - rack1
>>> http://217.144.195.170/ceph/osd.36.log.tar.gz   (26MB) - machine 2 - rack 2
>>> http://217.144.195.170/ceph/osd.62.log.tar.gz   (23MB) - machine3 - rack 3
>>>
>>
>> I looked at the logs, and it seems that you hit the tmap scaling
>> issue. Luckily we're fixing that for 0.44, so it's going to be fixed
>> in the next version.
>>
>> Yehuda
>
> That's wonderful news. Thanks for everything.
>
> Regards
> Slawomir Skowron



-- 
-----
Pozdrawiam

Sławek "sZiBis" Skowron
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-03-26 13:05 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-02-28 11:43 RadosGW problems with copy in s3 Sławomir Skowron
2012-02-28 17:18 ` Yehuda Sadeh Weinraub
     [not found]   ` <CAMwB3ThZ4jPis_Bb9tsf=HTnh3gLd6CNkn2XMr4xo09bO3hEiQ@mail.gmail.com>
     [not found]     ` <CAC-hyiHB0aDsEO8ES8pqhcXB38MR1QNZLMB=jyRm2p_Yp7V8Aw@mail.gmail.com>
2012-02-28 22:41       ` Yehuda Sadeh Weinraub
2012-02-29 13:06         ` Sławomir Skowron
2012-02-29 19:44           ` Yehuda Sadeh Weinraub
     [not found]             ` <CAMwB3TjuDhRfTheJvTKx7yiTy7JSiTp2h795fuNS-mjFeJP=Fg@mail.gmail.com>
2012-03-05 10:23               ` Sławomir Skowron
2012-03-05 18:59                 ` Yehuda Sadeh Weinraub
2012-03-05 21:21                   ` Sławomir Skowron
2012-03-06 10:08                     ` Sławomir Skowron
2012-03-06 17:53                       ` Yehuda Sadeh Weinraub
2012-03-06 18:20                         ` Sławomir Skowron
2012-03-26 13:05                           ` Sławomir Skowron

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.