From: Loic Dachary <loic@dachary.org>
To: "Miyamae, Takeshi" <miyamae.takeshi@jp.fujitsu.com>,
	Ceph Development <ceph-devel@vger.kernel.org>
Cc: "Kawaguchi, Shotaro" <kawaguchi.s@jp.fujitsu.com>,
	"Imai, Hiroki" <imai.hiroki@jp.fujitsu.com>,
	"Nakao, Takanori" <nakao.takanori@jp.fujitsu.com>,
	"Shiozawa, Kensuke" <shiozawa.kennsu@jp.fujitsu.com>
Subject: Re: teuthology timeout error
Date: Tue, 26 May 2015 10:59:12 +0200
Message-ID: <556435E0.7030502@dachary.org>
In-Reply-To: <870DE8DBB716524BAE51B2D499EC81E40AB19A89@g01jpexmbyt24>

Hi Takeshi,

I'm trying to reproduce your problem at https://github.com/ceph/ceph-qa-suite/pull/445. To be continued :-)

Cheers

On 26/05/2015 04:39, Miyamae, Takeshi wrote:
> Hi Loic,
> 
> We rebased our teuthology/ceph-qa-suite branches and reran the LRC test against current master.
> Unfortunately, we got the same result as before (a timeout error).
> 
> [test conditions]
> Target : Ceph-9.0.0-971-gd49d816
> https://github.com/kawaguchi-s/teuthology
> https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886-lrc
> 
> [teuthology log]
> 
> 2015-05-25 10:18:23	# start
> 
> 2015-05-25 11:59:52,106.106 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
> 2015-05-25 11:59:52,564.564 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
> 2015-05-25 11:59:52,565.565 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 635, in wrapper
>     return func(self)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 668, in do_thrash
>     timeout=self.config.get('timeout')
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1569, in wait_for_recovery
>     'failed to recover before timeout expired'
> AssertionError: failed to recover before timeout expired
> 
> Traceback (most recent call last):
>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>     result = self._run(*self.args, **self.kwargs)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 635, in wrapper
>     return func(self)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 668, in do_thrash
>     timeout=self.config.get('timeout')
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1569, in wait_for_recovery
>     'failed to recover before timeout expired'
> AssertionError: failed to recover before timeout expired
> <Greenlet at 0x36cacd0: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x36df3f8>>> failed with AssertionError
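
For readers following the traceback: the assertion comes from a polling loop that waits for the cluster to recover. Below is a minimal sketch of that kind of loop, not the actual ceph_manager.py code. The signature and the is_recovered / is_making_progress callables are hypothetical stand-ins; only the two messages are taken from the log, and the idea that progress restarts the timeout window is an assumption inferred from the "keeping timeout for now" message.

```python
import time


def wait_for_recovery(is_recovered, is_making_progress, timeout,
                      poll_interval=3.0, clock=time.time, sleep=time.sleep,
                      log=print):
    """Poll until is_recovered() returns True, or fail once `timeout`
    seconds pass without progress. All parameter names are hypothetical."""
    start = clock()
    while not is_recovered():
        if timeout is not None:
            if is_making_progress():
                # Assumption: any progress restarts the timeout window.
                start = clock()
            else:
                log('no progress seen, keeping timeout for now')
                assert clock() - start < timeout, \
                    'failed to recover before timeout expired'
        sleep(poll_interval)
```

Under these assumptions, a job with "timeout: 1200" would hit the assertion only after 20 minutes in which no recovery progress is observed at all, which suggests looking at why recovery stalls rather than at the timeout value itself.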
> 
> Best regards,
> Takeshi Miyamae
> 
> -----Original Message-----
> From: Loic Dachary [mailto:loic@dachary.org] 
> Sent: Thursday, May 21, 2015 6:38 PM
> To: Miyamae, Takeshi/宮前 剛; Ceph Development
> Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
> Subject: Re: teuthology timeout error
> 
> Hi,
> 
> [sorry the previous mail was sent by accident, here is the full mail]
> 
> On 21/05/2015 10:32, Miyamae, Takeshi wrote:
>> Hi Loic,
>>
>>> Could you please share the teuthology/ceph-qa-suite repository you 
>>> are using to run these tests so I can try to reproduce / diagnose the problem ?
>>
>> https://github.com/kawaguchi-s/teuthology/tree/wip-10886
>> https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886
>>
> 
> When compared against master they show differences that indicate it would be good to rebase:
> 
> https://github.com/ceph/teuthology/compare/master...kawaguchi-s:wip-10886
> https://github.com/ceph/ceph-qa-suite/compare/master...kawaguchi-s:wip-10886
> 
> I think the teuthology commit on top of wip-10886 is a mistake
> 
> https://github.com/kawaguchi-s/teuthology/commit/348e54931f89c9b0ae7a84eb931576f8414017b5
> 
> Do you really need to modify teuthology? It should only be necessary to use the latest master branch.
> 
> It looks like the
> 
> https://github.com/kawaguchi-s/ceph-qa-suite/commit/f2e3ca5d12ceef742eae2a9cf4057c436e9040c3
> 
> commit in your ceph-qa-suite is not what you intended. However
> 
> https://github.com/kawaguchi-s/ceph-qa-suite/commit/4b39d6d4862f9091a849d224e880795be406815d
> https://github.com/kawaguchi-s/ceph-qa-suite/commit/d16b4b058ae118931928541a2c8acd68f9703a44
> 
> look ok :-) Instead of naming the test 4nodes16osds3mons1client.yaml, it would be better to use the same kind of naming you see at https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados/thrash-erasure-code/workloads. That is, a file name made of the distinctive parameters for the shec plugin (parameters left at their default values can be omitted).
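
As an illustration of that naming idea, here is a small sketch that builds a workload file name from the non-default profile parameters. The "ec-rados-" prefix, the key=value layout, the parameter order and the defaults argument are assumptions made for this example; check the workloads directory linked above for the names actually in use.

```python
def workload_filename(plugin, params, defaults=None):
    """Build a YAML file name from the distinctive (non-default) parameters.

    `plugin` is the erasure code plugin name, `params` maps parameter names
    to values, and `defaults` (hypothetical) lists values that may be omitted.
    """
    defaults = defaults or {}
    parts = ['plugin={}'.format(plugin)]
    for key, value in params.items():  # dicts keep insertion order (Python 3.7+)
        if defaults.get(key) == value:
            continue  # skip parameters left at their default value
        parts.append('{}={}'.format(key, value))
    return 'ec-rados-' + '-'.join(parts) + '.yaml'


# The LRC profile from this thread: k=4, m=2, l=3.
name = workload_filename('lrc', {'k': 4, 'm': 2, 'l': 3})
```

A name built this way tells the reader which plugin and parameters the test exercises, which a name describing the cluster layout does not.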
> 
> Cheers
> 
>> Here are our teuthology/ceph-qa-suite repositories. Thanks in advance.
>>
>> Best regards,
>> Takeshi Miyamae
>>
>> -----Original Message-----
>> From: Loic Dachary [mailto:loic@dachary.org]
>> Sent: Wednesday, May 20, 2015 4:49 PM
>> To: Miyamae, Takeshi/宮前 剛; Ceph Development
>> Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
>> Subject: Re: teuthology timeout error
>>
>> Hi,
>>
>> On 20/05/2015 04:20, Miyamae, Takeshi wrote:
>>> Hi Loic,
>>>
>>> When we fixed our own issue and restarted teuthology,
>>
>> Great !
>>
>>> we encountered another issue (timeout error) which occurs in case of LRC as well.
>>> Do you have any information about that ?
>>
>> Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests so I can try to reproduce / diagnose the problem ?
>>
>> Thanks
>>
>>>
>>> [error messages (in case of LRC pool)]
>>>
>>> 2015-04-28 12:38:54,128.128 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
>>> 2015-04-28 12:38:54,516.516 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
>>> 2015-04-28 12:38:54,516.516 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>>     return func(self)
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>>     timeout=self.config.get('timeout')
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>>     'failed to recover before timeout expired'
>>> AssertionError: failed to recover before timeout expired
>>>
>>> Traceback (most recent call last):
>>>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>>>     result = self._run(*self.args, **self.kwargs)
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>>     return func(self)
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>>     timeout=self.config.get('timeout')
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>>     'failed to recover before timeout expired'
>>> AssertionError: failed to recover before timeout expired
>>> <Greenlet at 0x2a7d550: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x2bd12d8>>> failed with AssertionError
>>>
>>> [ceph version]
>>> 0.93-952-gfe28daa
>>>
>>> [teuthology, ceph-qa-suite]
>>> latest version as of 3/25/2015
>>>
>>> [configurations]
>>>   check-locks: false
>>>   overrides:
>>>     ceph:
>>>       conf:
>>>         global:
>>>           ms inject socket failures: 5000
>>>         osd:
>>>           osd heartbeat use min delay socket: true
>>>           osd sloppy crc: true
>>>       fs: xfs
>>>   roles:
>>>   - - mon.a
>>>     - osd.0
>>>     - osd.4
>>>     - osd.8
>>>     - osd.12
>>>   - - mon.b
>>>     - osd.1
>>>     - osd.5
>>>     - osd.9
>>>     - osd.13
>>>   - - mon.c
>>>     - osd.2
>>>     - osd.6
>>>     - osd.10
>>>     - osd.14
>>>   - - osd.3
>>>     - osd.7
>>>     - osd.11
>>>     - osd.15
>>>     - client.0
>>>   targets:
>>>     ubuntu@RX35-1.primary.ceph-poc.fsc.net:
>>>     ubuntu@RX35-2.primary.ceph-poc.fsc.net:
>>>     ubuntu@RX35-3.primary.ceph-poc.fsc.net:
>>>     ubuntu@RX35-4.primary.ceph-poc.fsc.net:
>>>   tasks:
>>>   - ceph:
>>>       conf:
>>>         osd:
>>>           osd debug reject backfill probability: 0.3
>>>           osd max backfills: 1
>>>           osd scrub max interval: 120
>>>           osd scrub min interval: 60
>>>       log-whitelist:
>>>       - wrongly marked me down
>>>       - objects unfound and apparently lost
>>>   - thrashosds:
>>>       chance_pgnum_grow: 1
>>>       chance_pgpnum_fix: 1
>>>       min_in: 4
>>>       timeout: 1200
>>>   - rados:
>>>       clients:
>>>       - client.0
>>>       ec_pool: true
>>>       erasure_code_profile:
>>>         k: 4
>>>         l: 3
>>>         m: 2
>>>         name: lrcprofile
>>>         plugin: lrc
>>>         ruleset-failure-domain: osd
>>>       objects: 50
>>>       op_weights:
>>>         append: 100
>>>         copy_from: 50
>>>         delete: 50
>>>         read: 100
>>>         rmattr: 25
>>>         rollback: 50
>>>         setattr: 25
>>>         snap_create: 50
>>>         snap_remove: 50
>>>         write: 0
>>>       ops: 190000
>>>
>>> Best regards,
>>> Takeshi Miyamae
>>>
>>
> 
> --
> Loïc Dachary, Artisan Logiciel Libre
> 
> 
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


Thread overview: 7+ messages
2015-05-20  2:20 teuthology timeout error Miyamae, Takeshi
2015-05-20  7:48 ` Loic Dachary
2015-05-21  8:32   ` Miyamae, Takeshi
2015-05-21  9:30     ` Loic Dachary
2015-05-21  9:37     ` Loic Dachary
2015-05-26  2:39       ` Miyamae, Takeshi
2015-05-26  8:59         ` Loic Dachary [this message]
