From: Loic Dachary
Subject: Re: teuthology timeout error
Date: Thu, 21 May 2015 11:30:48 +0200
Message-ID: <555DA5C8.5040703@dachary.org>
References: <870DE8DBB716524BAE51B2D499EC81E40AAF9237@g01jpexmbyt24> <555C3C67.6080905@dachary.org> <870DE8DBB716524BAE51B2D499EC81E40AAFDE74@g01jpexmbyt24>
In-Reply-To: <870DE8DBB716524BAE51B2D499EC81E40AAFDE74@g01jpexmbyt24>
Sender: ceph-devel-owner@vger.kernel.org
To: "Miyamae, Takeshi", Ceph Development
Cc: "Kawaguchi, Shotaro", "Imai, Hiroki", "Nakao, Takanori", "Shiozawa, Kensuke"

On 21/05/2015 10:32, Miyamae, Takeshi wrote:
> Hi Loic,
>
>> Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests
>> so I can try to reproduce / diagnose the problem ?
>
> https://github.com/kawaguchi-s/teuthology/tree/wip-10886
> https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886
>

When compared against master they show differences that indicate it would be good to rebase:

https://github.com/ceph/teuthology/compare/master...kawaguchi-s:wip-10886
https://github.com/ceph/ceph-qa-suite/compare/master...kawaguchi-s:wip-10886

I think the teuthology commit on top of wip-10886 is a mistake

> Here are our teuthology/ceph-qa-suite repositories. Thanks in advance.
>
> Best regards,
> Takeshi Miyamae
>
> -----Original Message-----
> From: Loic Dachary [mailto:loic@dachary.org]
> Sent: Wednesday, May 20, 2015 4:49 PM
> To: Miyamae, Takeshi/宮前 剛; Ceph Development
> Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
> Subject: Re: teuthology timeout error
>
> Hi,
>
> On 20/05/2015 04:20, Miyamae, Takeshi wrote:
>> Hi Loic,
>>
>> When we fixed our own issue and restarted teuthology,
>
> Great !
>
>> we encountered another issue (timeout error) which occurs in case of LRC as well.
>> Do you have any information about that ?
>
> Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests so I can try to reproduce / diagnose the problem ?
>
> Thanks
>
>>
>> [error messages (in case of LRC pool)]
>>
>> 2015-04-28 12:38:54,128.128 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
>> 2015-04-28 12:38:54,516.516 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
>> 2015-04-28 12:38:54,516.516 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>     return func(self)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>     timeout=self.config.get('timeout')
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>     'failed to recover before timeout expired'
>> AssertionError: failed to recover before timeout expired
>>
>> Traceback (most recent call last):
>>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>>     result = self._run(*self.args, **self.kwargs)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>     return func(self)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>     timeout=self.config.get('timeout')
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>     'failed to recover before timeout expired'
>> AssertionError: failed to recover before timeout expired
>> <Greenlet at 0x2a7d550: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x2bd12d8>>> failed with AssertionError
>>
>> [ceph version]
>> 0.93-952-gfe28daa
>>
>> [teuthology, ceph-qa-suite]
>> newest version at 3/25/2015
>>
>> [configurations]
>> check-locks: false
>> overrides:
>>   ceph:
>>     conf:
>>       global:
>>         ms inject socket failures: 5000
>>       osd:
>>         osd heartbeat use min delay socket: true
>>         osd sloppy crc: true
>>     fs: xfs
>> roles:
>> - - mon.a
>>   - osd.0
>>   - osd.4
>>   - osd.8
>>   - osd.12
>> - - mon.b
>>   - osd.1
>>   - osd.5
>>   - osd.9
>>   - osd.13
>> - - mon.c
>>   - osd.2
>>   - osd.6
>>   - osd.10
>>   - osd.14
>> - - osd.3
>>   - osd.7
>>   - osd.11
>>   - osd.15
>>   - client.0
>> targets:
>>   ubuntu@RX35-1.primary.ceph-poc.fsc.net:
>>   ubuntu@RX35-2.primary.ceph-poc.fsc.net:
>>   ubuntu@RX35-3.primary.ceph-poc.fsc.net:
>>   ubuntu@RX35-4.primary.ceph-poc.fsc.net:
>> tasks:
>> - ceph:
>>     conf:
>>       osd:
>>         osd debug reject backfill probability: 0.3
>>         osd max backfills: 1
>>         osd scrub max interval: 120
>>         osd scrub min interval: 60
>>     log-whitelist:
>>     - wrongly marked me down
>>     - objects unfound and apparently lost
>> - thrashosds:
>>     chance_pgnum_grow: 1
>>     chance_pgpnum_fix: 1
>>     min_in: 4
>>     timeout: 1200
>> - rados:
>>     clients:
>>     - client.0
>>     ec_pool: true
>>     erasure_code_profile:
>>       k: 4
>>       l: 3
>>       m: 2
>>       name: lrcprofile
>>       plugin: lrc
>>       ruleset-failure-domain: osd
>>     objects: 50
>>     op_weights:
>>       append: 100
>>       copy_from: 50
>>       delete: 50
>>       read: 100
>>       rmattr: 25
>>       rollback: 50
>>       setattr: 25
>>       snap_create: 50
>>       snap_remove: 50
>>       write: 0
>>     ops: 190000
>>
>> Best regards,
>> Takeshi Miyamae
>>
>

-- 
Loïc Dachary, Artisan Logiciel Libre