* Swift tests failing randomly
@ 2014-08-10 8:46 Loic Dachary
2014-08-11 16:21 ` Loic Dachary
0 siblings, 1 reply; 8+ messages in thread
From: Loic Dachary @ 2014-08-10 8:46 UTC (permalink / raw)
To: Yehuda Sadeh; +Cc: Ceph Development
Hi Yehuda,
Over the past few months the swift tests have been failing randomly, and I have unfortunately been unable to figure out why. Here are a few examples:
http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406944
http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406941
http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406946
http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406947
and it has happened on every upgrade test run for as long as I can remember. I fail to see a pattern and cannot figure out what the real problem is. It would be really great if you could take a look. Even a hunch or a tip would be greatly appreciated :-)
You can find more context in
http://tracker.ceph.com/issues/8988
http://tracker.ceph.com/issues/8016
http://tracker.ceph.com/issues/7799
and discussions at
http://www.spinics.net/lists/ceph-devel/msg19933.html
Cheers
--
Loïc Dachary, Artisan Logiciel Libre
* Re: Swift tests failing randomly
2014-08-10 8:46 Swift tests failing randomly Loic Dachary
@ 2014-08-11 16:21 ` Loic Dachary
2014-08-11 17:11 ` Yehuda Sadeh
0 siblings, 1 reply; 8+ messages in thread
From: Loic Dachary @ 2014-08-11 16:21 UTC (permalink / raw)
To: Yehuda Sadeh; +Cc: Ceph Development
Hi Yehuda,
It looks like increasing the rgw idle timeout makes the problem go away (https://github.com/ceph/ceph-qa-suite/pull/79 and http://tracker.ceph.com/issues/8988). It was previously 300 seconds, which already looks like a large value. Does this fix / workaround make sense to you?
Cheers
--
Loïc Dachary, Artisan Logiciel Libre
* Re: Swift tests failing randomly
2014-08-11 16:21 ` Loic Dachary
@ 2014-08-11 17:11 ` Yehuda Sadeh
2014-08-11 17:13 ` Sage Weil
0 siblings, 1 reply; 8+ messages in thread
From: Yehuda Sadeh @ 2014-08-11 17:11 UTC (permalink / raw)
To: Loic Dachary; +Cc: Ceph Development
Yeah, looking at these logs, it really seems that it's just that things
are going slow on these machines and it's hitting timeouts. The fix is
ok with me, although I'd rather have it adjusted per machine type
(somehow).
Yehuda
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Swift tests failing randomly
2014-08-11 17:11 ` Yehuda Sadeh
@ 2014-08-11 17:13 ` Sage Weil
2014-08-11 17:34 ` Yuri Weinstein
0 siblings, 1 reply; 8+ messages in thread
From: Sage Weil @ 2014-08-11 17:13 UTC (permalink / raw)
To: Yehuda Sadeh; +Cc: Loic Dachary, Ceph Development
On Mon, 11 Aug 2014, Yehuda Sadeh wrote:
> Yeah, looking at these logs, it really seems that it's just that things
> are going slow on these machines and it's hitting timeouts. The fix is
> ok with me, although I'd rather have it adjusted per machine type
> (somehow).
There is a vps.yaml that bumps up another timeout, so we could put it
there. Right now it lives on the teuthology machine
(~teuthworker/vps.yaml I think?), but perhaps we should stick it in
ceph-qa-suite.git somewhere ...
sage
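The per-machine-type adjustment discussed above could look roughly like the following. This is only an illustrative sketch, not teuthology or ceph-qa-suite code: the function name, the table, and the vps value are hypothetical, and only the 300-second default is taken from this thread.

```python
# Hypothetical sketch: choose an rgw idle timeout based on machine type.
# Only the 300-second default comes from the thread; everything else
# (names, the vps value) is an assumption made for illustration.
DEFAULT_IDLE_TIMEOUT = 300  # seconds, the previous value mentioned above

# Slow machine types get a larger timeout. The number is a placeholder,
# not the value used in the pull request.
IDLE_TIMEOUT_BY_MACHINE_TYPE = {
    "vps": 1200,
}


def idle_timeout_for(machine_type):
    """Return the idle timeout in seconds for the given machine type."""
    return IDLE_TIMEOUT_BY_MACHINE_TYPE.get(machine_type, DEFAULT_IDLE_TIMEOUT)
```

Machine types without an entry fall back to the default, so only the slow virtual machines would see the longer timeout.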
* Re: Swift tests failing randomly
2014-08-11 17:13 ` Sage Weil
@ 2014-08-11 17:34 ` Yuri Weinstein
2014-08-11 18:47 ` Loic Dachary
0 siblings, 1 reply; 8+ messages in thread
From: Yuri Weinstein @ 2014-08-11 17:34 UTC (permalink / raw)
To: Sage Weil; +Cc: Yehuda Sadeh, Loic Dachary, Ceph Development
Here is what we have in vps.yaml now:
overrides:
  ceph:
    conf:
      global:
        osd heartbeat grace: 40
What do we want to add?
~
* Re: Swift tests failing randomly
2014-08-11 17:34 ` Yuri Weinstein
@ 2014-08-11 18:47 ` Loic Dachary
2014-08-11 18:50 ` Yuri Weinstein
0 siblings, 1 reply; 8+ messages in thread
From: Loic Dachary @ 2014-08-11 18:47 UTC (permalink / raw)
To: Yuri Weinstein, Sage Weil; +Cc: Yehuda Sadeh, Ceph Development
On 11/08/2014 19:34, Yuri Weinstein wrote:
> Here is what we have in vps.yaml now:
>
> overrides:
>   ceph:
>     conf:
>       global:
>         osd heartbeat grace: 40
>
> What do we want to add?
I think the idle_timeout values at
https://github.com/ceph/ceph-qa-suite/pull/79/files
--
Loïc Dachary, Artisan Logiciel Libre
* Re: Swift tests failing randomly
2014-08-11 18:47 ` Loic Dachary
@ 2014-08-11 18:50 ` Yuri Weinstein
2014-08-11 18:55 ` Sage Weil
0 siblings, 1 reply; 8+ messages in thread
From: Yuri Weinstein @ 2014-08-11 18:50 UTC (permalink / raw)
To: Loic Dachary; +Cc: Sage Weil, Yehuda Sadeh, Ceph Development
I thought we could do the same at run time, for VPSes only.
Sage?
* Re: Swift tests failing randomly
2014-08-11 18:50 ` Yuri Weinstein
@ 2014-08-11 18:55 ` Sage Weil
0 siblings, 0 replies; 8+ messages in thread
From: Sage Weil @ 2014-08-11 18:55 UTC (permalink / raw)
To: Yuri Weinstein; +Cc: Loic Dachary, Yehuda Sadeh, Ceph Development
On Mon, 11 Aug 2014, Yuri Weinstein wrote:
> I thought we could do the same at run time, for VPSes only.
>
> Sage?
That's what I'm suggesting here: move the change in that pull request
into vps.yaml, and put vps.yaml inside ceph-qa-suite.git (maybe
machine_types/vps.yaml?) so that it doesn't live in ~teuthworker.
We need to confirm that the value can be set with an override, though ...
sage
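To make the override idea concrete, here is a rough sketch of how a per-machine-type file such as vps.yaml might be layered over a job's configuration with a recursive merge. It is not teuthology's implementation. The `rgw` / `idle_timeout` layout, the job-level values, and the 1200 figure are assumptions for illustration, and whether the timeout can actually be set through overrides is exactly the open question above.

```python
import copy


def deep_merge(base, overrides):
    """Recursively merge overrides into a copy of base; overrides win."""
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are mappings: merge them key by key.
            result[key] = deep_merge(result[key], value)
        else:
            # Scalars (or mismatched types) are simply replaced.
            result[key] = copy.deepcopy(value)
    return result


# Job-level configuration (hypothetical layout and values).
job_config = {
    "ceph": {"conf": {"global": {"osd heartbeat grace": 20}}},
    "rgw": {"idle_timeout": 300},
}

# The vps.yaml content quoted in this thread, plus a hypothetical
# idle_timeout entry with a placeholder value.
vps_overrides = {
    "ceph": {"conf": {"global": {"osd heartbeat grace": 40}}},
    "rgw": {"idle_timeout": 1200},
}

merged = deep_merge(job_config, vps_overrides)
print(merged["ceph"]["conf"]["global"]["osd heartbeat grace"])
print(merged["rgw"]["idle_timeout"])
```

Because the merge works on a copy, the job-level configuration stays untouched and the same job could run unchanged on machine types that have no override file.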