* Swift tests failing randomly
@ 2014-08-10 8:46 Loic Dachary
2014-08-11 16:21 ` Loic Dachary
0 siblings, 1 reply; 8+ messages in thread
From: Loic Dachary @ 2014-08-10 8:46 UTC (permalink / raw)
To: Yehuda Sadeh; +Cc: Ceph Development
Hi Yehuda,
Over the past few months the Swift tests have been failing randomly, and unfortunately I have been unable to figure out why. Here are a few examples:
http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406944
http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406941
http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406946
http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406947
and it has happened on every upgrade test run for as long as I can remember. I fail to see a pattern and cannot figure out what the real problem is. It would be really great if you could take a look. Even a hunch or a tip would be greatly appreciated :-)
You can find more context in
http://tracker.ceph.com/issues/8988
http://tracker.ceph.com/issues/8016
http://tracker.ceph.com/issues/7799
and discussions at
http://www.spinics.net/lists/ceph-devel/msg19933.html
Cheers
--
Loïc Dachary, Artisan Logiciel Libre
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Swift tests failing randomly
2014-08-10 8:46 Swift tests failing randomly Loic Dachary
@ 2014-08-11 16:21 ` Loic Dachary
2014-08-11 17:11 ` Yehuda Sadeh
0 siblings, 1 reply; 8+ messages in thread
From: Loic Dachary @ 2014-08-11 16:21 UTC (permalink / raw)
To: Yehuda Sadeh; +Cc: Ceph Development
Hi Yehuda,
It looks like increasing the rgw idle timeout makes the problem go away (https://github.com/ceph/ceph-qa-suite/pull/79 and http://tracker.ceph.com/issues/8988). It previously was 300 sec which looks like a large value already. Does this fix / workaround make sense to you?
Cheers
--
Loïc Dachary, Artisan Logiciel Libre
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Swift tests failing randomly
2014-08-11 16:21 ` Loic Dachary
@ 2014-08-11 17:11 ` Yehuda Sadeh
2014-08-11 17:13 ` Sage Weil
0 siblings, 1 reply; 8+ messages in thread
From: Yehuda Sadeh @ 2014-08-11 17:11 UTC (permalink / raw)
To: Loic Dachary; +Cc: Ceph Development
Yeah, looking at these logs, it really seems that things are just going slow on these machines and it's hitting timeouts. The fix is OK with me, although I'd rather have it adjusted per machine type (somehow).
Yehuda
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Swift tests failing randomly
2014-08-11 17:11 ` Yehuda Sadeh
@ 2014-08-11 17:13 ` Sage Weil
2014-08-11 17:34 ` Yuri Weinstein
0 siblings, 1 reply; 8+ messages in thread
From: Sage Weil @ 2014-08-11 17:13 UTC (permalink / raw)
To: Yehuda Sadeh; +Cc: Loic Dachary, Ceph Development
On Mon, 11 Aug 2014, Yehuda Sadeh wrote:
> Yeah, looking at these logs, it really seems that it's just that things
> are going slow on these machines and it's hitting timeouts. The fix is
> ok with me, although I'd rather have it adjusted per machine type
> (somehow).
There is a vps.yaml that bumps up another timeout, so we could put it there. Right now it lives on the teuthology machine (~teuthworker/vps.yaml I think?), but perhaps we should stick it in ceph-qa-suite.git somewhere ...
sage
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Swift tests failing randomly
2014-08-11 17:13 ` Sage Weil
@ 2014-08-11 17:34 ` Yuri Weinstein
2014-08-11 18:47 ` Loic Dachary
0 siblings, 1 reply; 8+ messages in thread
From: Yuri Weinstein @ 2014-08-11 17:34 UTC (permalink / raw)
To: Sage Weil; +Cc: Yehuda Sadeh, Loic Dachary, Ceph Development
Here is what we have in vps.yaml now:
overrides:
  ceph:
    conf:
      global:
        osd heartbeat grace: 40
What do we want to add?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Swift tests failing randomly
2014-08-11 17:34 ` Yuri Weinstein
@ 2014-08-11 18:47 ` Loic Dachary
2014-08-11 18:50 ` Yuri Weinstein
0 siblings, 1 reply; 8+ messages in thread
From: Loic Dachary @ 2014-08-11 18:47 UTC (permalink / raw)
To: Yuri Weinstein, Sage Weil; +Cc: Yehuda Sadeh, Ceph Development
On 11/08/2014 19:34, Yuri Weinstein wrote:
> What do we want to add?
I think we want to add the idle_timeout values from
https://github.com/ceph/ceph-qa-suite/pull/79/files
--
Loïc Dachary, Artisan Logiciel Libre
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Swift tests failing randomly
2014-08-11 18:47 ` Loic Dachary
@ 2014-08-11 18:50 ` Yuri Weinstein
2014-08-11 18:55 ` Sage Weil
0 siblings, 1 reply; 8+ messages in thread
From: Yuri Weinstein @ 2014-08-11 18:50 UTC (permalink / raw)
To: Loic Dachary; +Cc: Sage Weil, Yehuda Sadeh, Ceph Development
I thought we could do the same at run time, for VPSes only.
Sage?
On Mon, Aug 11, 2014 at 11:47 AM, Loic Dachary <loic@dachary.org> wrote:
> I think the idle_timeout values at
>
> https://github.com/ceph/ceph-qa-suite/pull/79/files
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Swift tests failing randomly
2014-08-11 18:50 ` Yuri Weinstein
@ 2014-08-11 18:55 ` Sage Weil
0 siblings, 0 replies; 8+ messages in thread
From: Sage Weil @ 2014-08-11 18:55 UTC (permalink / raw)
To: Yuri Weinstein; +Cc: Loic Dachary, Yehuda Sadeh, Ceph Development
On Mon, 11 Aug 2014, Yuri Weinstein wrote:
> I thought we could do the same in run-time for vps'es only.
>
> Sage?
That's what I'm suggesting here: move the change in that pull request into vps.yaml, and put vps.yaml inside ceph-qa-suite.git (maybe machine_types/vps.yaml?) so that it doesn't live in ~teuthworker. We need to confirm that that value can be set through overrides, though ...
sage
^ permalink raw reply [flat|nested] 8+ messages in thread
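The per-machine-type override idea discussed in this thread can be sketched in Python. This is only an illustration, assuming that entries under overrides are merged recursively into each task's configuration, with the override winning on conflicts. The deep_merge helper, the placement of idle_timeout under an rgw key, the value 1200, and the contents of job_tasks are all hypothetical; only the "osd heartbeat grace: 40" entry and the earlier 300 second timeout come from the messages above, and the thread itself leaves open whether the timeout can be set this way.

```python
import copy


def deep_merge(base, override):
    """Return a copy of base with override merged in recursively.

    Nested dicts are merged key by key; any other value in override
    replaces the value in base.
    """
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Content of the "overrides" section of vps.yaml as quoted in the thread,
# plus one hypothetical entry for the rgw idle timeout.
vps_overrides = {
    "ceph": {"conf": {"global": {"osd heartbeat grace": 40}}},
    # Assumption: key name, location, and value are illustrative only.
    "rgw": {"idle_timeout": 1200},
}

# Hypothetical task settings as a test suite might define them.
job_tasks = {
    "ceph": {"conf": {"global": {"osd heartbeat grace": 20}}},
    "rgw": {"idle_timeout": 300},  # 300 sec is the earlier value cited above
}

effective = deep_merge(job_tasks, vps_overrides)
print(effective["rgw"]["idle_timeout"])                            # 1200
print(effective["ceph"]["conf"]["global"]["osd heartbeat grace"])  # 40
```

Keeping such values in a machine-type file rather than in each suite means slower machines get longer timeouts without changing the tests that run on faster hardware.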
end of thread, other threads:[~2014-08-11 18:55 UTC | newest]
Thread overview: 8+ messages
2014-08-10 8:46 Swift tests failing randomly Loic Dachary
2014-08-11 16:21 ` Loic Dachary
2014-08-11 17:11 ` Yehuda Sadeh
2014-08-11 17:13 ` Sage Weil
2014-08-11 17:34 ` Yuri Weinstein
2014-08-11 18:47 ` Loic Dachary
2014-08-11 18:50 ` Yuri Weinstein
2014-08-11 18:55 ` Sage Weil