* Firefly upgrade tests
@ 2014-07-03 22:39 Loic Dachary
  2014-07-05 13:46 ` Loic Dachary
  0 siblings, 1 reply; 4+ messages in thread
From: Loic Dachary @ 2014-07-03 22:39 UTC (permalink / raw)
  To: Ceph Development

Hi Ceph,

The firefly-x upgrade test suite is designed to check that upgrading from Firefly to a newer version (master or a branch) works as expected. It was created by copying dumpling-x and can be browsed at https://github.com/ceph/ceph-qa-suite/tree/master/suites/upgrade/firefly-x

To establish a baseline, a run was scheduled to upgrade from firefly to firefly (i.e. no upgrade really ;-). It should therefore show that all is well when nothing changes. However, it fails in various ways, as can be seen here:

./virtualenv/bin/teuthology-suite --suite upgrade/firefly-x/stress-split --suite-dir ~/software/ceph/ceph-qa-suite --ceph firefly --machine-type vps --email loic@dachary.org

http://pulpito.ceph.com/loic-2014-07-02_23:05:05-upgrade:firefly-x:stress-split-firefly-testing-basic-vps/

* Command failed on vpm105 with status 1: 'sudo yum install -y http://gitbuilder.ceph.com/kernel-rpm-redhatenterpriseserver6-x86_64-basic/sha1/8102ce7556a99f6348067c60583320d308f36362/kernel.x86_64.rpm' 
  Does that mean kernels are not yet ready for this distribution and the tests should be skipped?
* Command failed on vpm058 with status 1: "SWIFT_TEST_CONFIG_FILE=/home/ubuntu/cephtest/archive/testswift.client.0.conf /home/ubuntu/cephtest/swift/virtualenv/bin/nosetests -w /home/ubuntu/cephtest/swift/test/functional -v -a '!fails_on_rgw'"
  http://pulpito.ceph.com/loic-2014-07-02_23:05:05-upgrade:firefly-x:stress-split-firefly-testing-basic-vps/338941

  Although it looks like http://tracker.ceph.com/issues/7808, which is a duplicate of http://tracker.ceph.com/issues/7799, it is slightly different, and http://tracker.ceph.com/issues/8735 was created to keep track of it.

* Command failed on vpm070 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f -i 1'
  http://pulpito.ceph.com/loic-2014-07-02_23:05:05-upgrade:firefly-x:stress-split-firefly-testing-basic-vps/338904/

  Although the root of the error seems to be that osd 1 cannot be killed by the thrasher, I don't see any meaningful error messages. http://tracker.ceph.com/issues/8736 was filed to keep track of this condition.

* timed out waiting for admin_socket to appear after osd.1 restart
  http://pulpito.ceph.com/loic-2014-07-02_23:05:05-upgrade:firefly-x:stress-split-firefly-testing-basic-vps/338908/

  It looks like a race: the osd is killed at the same time as it is restarted by the thrasher. http://tracker.ceph.com/issues/8737 was opened for this.

* hang on "INFO:teuthology.task.rados:joining rados"
  http://pulpito.ceph.com/loic-2014-07-02_23:05:05-upgrade:firefly-x:stress-split-firefly-testing-basic-vps/338915/

  It looks like a bug, and http://tracker.ceph.com/issues/8740 was filed.
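
To illustrate the admin_socket timeout above: waiting for the socket amounts to polling for a path until a deadline passes. The following is a minimal Python sketch under that assumption, not teuthology's actual code; wait_for_path, its parameters and the 30 second default are hypothetical. If the osd is killed while it is being restarted, as suspected in http://tracker.ceph.com/issues/8737, the path never appears and the loop gives up.

```python
import os
import time


def wait_for_path(path, timeout=30.0, interval=0.5):
    """Poll until `path` exists or `timeout` seconds have elapsed.

    Hypothetical helper for illustration only.
    Returns True if the path appeared, False if the deadline passed first.
    """
    # time.monotonic() is not affected by wall-clock adjustments.
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For osd.1 this could be called as wait_for_path('/var/run/ceph/ceph-osd.1.asok'), assuming the admin socket lives at the default location. A False return corresponds to the "timed out waiting for admin_socket" message.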

When the same suite is run to upgrade from firefly to master it gives http://pulpito.ceph.com/loic-2014-07-02_22:04:23-upgrade:firefly-x:stress-split-master-testing-basic-vps/ which shows the following errors:

* Command failed on vpm105 with status 1: 'sudo yum install -y http://gitbuilder.ceph.com/kernel-rpm-redhatenterpriseserver6-x86_64-basic/sha1/8102ce7556a99f6348067c60583320d308f36362/kernel.x86_64.rpm'   (same as above)

* Could not reconnect to ubuntu@vpm042.front.sepia.ceph.com: it looks like a transient timeout problem that can be ignored
  http://pulpito.ceph.com/loic-2014-07-02_22:04:23-upgrade:firefly-x:stress-split-master-testing-basic-vps/338891/
  2014-07-02T18:52:24.546 INFO:teuthology.orchestra.connection:{'username': u'ubuntu', 'hostname': u'vpm042.front.sepia.ceph.com', 'timeout': 60}

* Command failed on vpm017 with status 1: "SWIFT_TEST_CONFIG_FILE=/home/ubuntu/cephtest/archive/testswift.client.0.conf /home/ubuntu/cephtest/swift/virtualenv/bin/nosetests -w /home/ubuntu/cephtest/swift/test/functional -v -a '!fails_on_rgw'" 
  One of the failures looks exactly like http://tracker.ceph.com/issues/7799, which was re-opened.

* hang on "INFO:teuthology.task.rados:joining rados" (same as above)
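
The comparison between the two runs can be sketched as set operations in Python. The labels are shorthand written for this example from the two lists above, and treating the two swift failures as the same entry is a simplification, since the baseline one is slightly different (see http://tracker.ceph.com/issues/8735).

```python
# Failure labels transcribed by hand from the two runs described above.
# The label strings are shorthand invented for this example.
baseline = {  # firefly -> firefly
    "kernel rpm install",
    "swift nosetests",
    "daemon-helper kill ceph-osd",
    "admin_socket timeout",
    "joining rados hang",
}
upgrade = {  # firefly -> master
    "kernel rpm install",
    "could not reconnect",
    "swift nosetests",
    "joining rados hang",
}

# Failures seen in both runs cannot be attributed to the upgrade itself.
in_both = sorted(baseline & upgrade)
# Failures seen only when upgrading to master.
only_upgrade = sorted(upgrade - baseline)
# Failures seen only in the baseline run.
only_baseline = sorted(baseline - upgrade)

print("both runs:    ", in_both)
print("upgrade only: ", only_upgrade)
print("baseline only:", only_baseline)
```

Under these assumptions, three failures show up in both runs and so do not point at the upgrade to master, and the only failure unique to the upgrade run is the reconnect timeout, which looks transient.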
 
Cheers
-- 
Loïc Dachary, Artisan Logiciel Libre



* Re: Firefly upgrade tests
  2014-07-03 22:39 Firefly upgrade tests Loic Dachary
@ 2014-07-05 13:46 ` Loic Dachary
  2014-07-05 14:39   ` Yuri Weinstein
  0 siblings, 1 reply; 4+ messages in thread
From: Loic Dachary @ 2014-07-05 13:46 UTC (permalink / raw)
  To: Ceph Development


Hi,

It looks like there is a shortage of VPS for some reason:

http://pulpito.ceph.com/loic-2014-07-03_11:24:33-upgrade:firefly-x:stress-split-wip-8475-testing-basic-vps/

has a number of tests that have been scheduled for ~48h and are not making progress.

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre



* Re: Firefly upgrade tests
  2014-07-05 13:46 ` Loic Dachary
@ 2014-07-05 14:39   ` Yuri Weinstein
  2014-07-05 14:58     ` Loic Dachary
  0 siblings, 1 reply; 4+ messages in thread
From: Yuri Weinstein @ 2014-07-05 14:39 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Ceph Development

I killed several runs that had been running for 2-3 days; hopefully that
will speed up your runs.

Thx
YuriW


* Re: Firefly upgrade tests
  2014-07-05 14:39   ` Yuri Weinstein
@ 2014-07-05 14:58     ` Loic Dachary
  0 siblings, 0 replies; 4+ messages in thread
From: Loic Dachary @ 2014-07-05 14:58 UTC (permalink / raw)
  To: Yuri Weinstein; +Cc: Ceph Development

This is very kind of you. Hopefully no one will mind ;-)

-- 
Loïc Dachary, Artisan Logiciel Libre

