* endless flying slow requests @ 2012-11-14 9:12 Stefan Priebe - Profihost AG 2012-11-14 14:59 ` Sage Weil 0 siblings, 1 reply; 15+ messages in thread From: Stefan Priebe - Profihost AG @ 2012-11-14 9:12 UTC (permalink / raw) To: ceph-devel@vger.kernel.org Hello list, i see this several times. Endless flying slow requests. And they never stop until i restart the mentioned osd. 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; oldest blocked for > 31789.858457 secs 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds old, received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 currently delayed 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; oldest blocked for > 31790.858646 secs 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds old, received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 currently delayed 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; oldest blocked for > 31791.858827 secs 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds old, received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 currently delayed 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; oldest blocked for > 31792.858971 secs 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds old, received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 currently delayed 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; oldest blocked for > 31793.859124 secs 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds old, received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 currently delayed When i now restart osd 24 they go away and everything is fine again. Stefan ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-11-14 9:12 endless flying slow requests Stefan Priebe - Profihost AG @ 2012-11-14 14:59 ` Sage Weil 2012-11-15 8:09 ` Stefan Priebe - Profihost AG 0 siblings, 1 reply; 15+ messages in thread From: Sage Weil @ 2012-11-14 14:59 UTC (permalink / raw) To: Stefan Priebe - Profihost AG; +Cc: ceph-devel@vger.kernel.org Hi Stefan, I would be nice to confirm that no clients are waiting on replies for these requests; currently we suspect that the OSD request tracking is the buggy part. If you query the OSD admin socket you should be able to dump requests and see the client IP, and then query the client. Is it librbd? In that case you likely need to change the config so that it is listening on an admin socket ('admin socket = path'). Thanks! sage On Wed, 14 Nov 2012, Stefan Priebe - Profihost AG wrote: > Hello list, > > i see this several times. Endless flying slow requests. And they never stop > until i restart the mentioned osd. > > 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; > oldest blocked for > 31789.858457 secs > 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds old, > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > currently delayed > 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; > oldest blocked for > 31790.858646 secs > 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds old, > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > currently delayed > 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; > oldest blocked for > 31791.858827 secs > 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds old, > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > currently delayed > 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; > oldest blocked for > 31792.858971 secs > 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds old, > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > currently delayed > 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; > oldest blocked for > 31793.859124 secs > 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds old, > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > currently delayed > > When i now restart osd 24 they go away and everything is fine again. > > Stefan > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-11-14 14:59 ` Sage Weil @ 2012-11-15 8:09 ` Stefan Priebe - Profihost AG 2012-11-15 8:18 ` Josh Durgin 2012-11-15 14:58 ` Sage Weil 0 siblings, 2 replies; 15+ messages in thread From: Stefan Priebe - Profihost AG @ 2012-11-15 8:09 UTC (permalink / raw) To: Sage Weil; +Cc: ceph-devel@vger.kernel.org Am 14.11.2012 15:59, schrieb Sage Weil: > Hi Stefan, > > I would be nice to confirm that no clients are waiting on replies for > these requests; currently we suspect that the OSD request tracking is the > buggy part. If you query the OSD admin socket you should be able to dump > requests and see the client IP, and then query the client. > > Is it librbd? In that case you likely need to change the config so that > it is listening on an admin socket ('admin socket = path'). Yes it is. So i have to specify admin socket at the KVM host? How do i query the admin socket for requests? Stefan > On Wed, 14 Nov 2012, Stefan Priebe - Profihost AG wrote: > >> Hello list, >> >> i see this several times. Endless flying slow requests. And they never stop >> until i restart the mentioned osd. >> >> 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; >> oldest blocked for > 31789.858457 secs >> 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds old, >> received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> currently delayed >> 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; >> oldest blocked for > 31790.858646 secs >> 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds old, >> received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> currently delayed >> 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; >> oldest blocked for > 31791.858827 secs >> 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds old, >> received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> currently delayed >> 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; >> oldest blocked for > 31792.858971 secs >> 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds old, >> received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> currently delayed >> 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; >> oldest blocked for > 31793.859124 secs >> 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds old, >> received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> currently delayed >> >> When i now restart osd 24 they go away and everything is fine again. >> >> Stefan >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-11-15 8:09 ` Stefan Priebe - Profihost AG @ 2012-11-15 8:18 ` Josh Durgin 2012-11-15 14:58 ` Sage Weil 1 sibling, 0 replies; 15+ messages in thread From: Josh Durgin @ 2012-11-15 8:18 UTC (permalink / raw) To: Stefan Priebe - Profihost AG; +Cc: Sage Weil, ceph-devel@vger.kernel.org On 11/15/2012 12:09 AM, Stefan Priebe - Profihost AG wrote: > Am 14.11.2012 15:59, schrieb Sage Weil: >> Hi Stefan, >> >> I would be nice to confirm that no clients are waiting on replies for >> these requests; currently we suspect that the OSD request tracking is the >> buggy part. If you query the OSD admin socket you should be able to dump >> requests and see the client IP, and then query the client. >> >> Is it librbd? In that case you likely need to change the config so that >> it is listening on an admin socket ('admin socket = path'). > > Yes it is. So i have to specify admin socket at the KVM host? How do i > query the admin socket for requests? Yes, add 'admin socket = /path/to/admin/socket' to the [client] section in ceph.conf, and when a guest is running, show outstanding requests with: ceph --admin-daemon /path/to/admin/socket objecter_requests Josh ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-11-15 8:09 ` Stefan Priebe - Profihost AG 2012-11-15 8:18 ` Josh Durgin @ 2012-11-15 14:58 ` Sage Weil 2012-11-28 1:51 ` Sage Weil 1 sibling, 1 reply; 15+ messages in thread From: Sage Weil @ 2012-11-15 14:58 UTC (permalink / raw) To: Stefan Priebe - Profihost AG; +Cc: ceph-devel@vger.kernel.org On Thu, 15 Nov 2012, Stefan Priebe - Profihost AG wrote: > Am 14.11.2012 15:59, schrieb Sage Weil: > > Hi Stefan, > > > > I would be nice to confirm that no clients are waiting on replies for > > these requests; currently we suspect that the OSD request tracking is the > > buggy part. If you query the OSD admin socket you should be able to dump > > requests and see the client IP, and then query the client. > > > > Is it librbd? In that case you likely need to change the config so that > > it is listening on an admin socket ('admin socket = path'). > > Yes it is. So i have to specify admin socket at the KVM host? Right. IIRC the disk line is a ; (or \;) separated list of key/value pairs. > How do i query the admin socket for requests? ceph --admin-daemon /path/to/socket help ceph --admin-daemon /path/to/socket objecter_dump (i think) sage > > Stefan > > > > On Wed, 14 Nov 2012, Stefan Priebe - Profihost AG wrote: > > > > > Hello list, > > > > > > i see this several times. Endless flying slow requests. And they never > > > stop > > > until i restart the mentioned osd. > > > > > > 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; > > > oldest blocked for > 31789.858457 secs > > > 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds > > > old, > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > > > currently delayed > > > 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; > > > oldest blocked for > 31790.858646 secs > > > 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds > > > old, > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > > > currently delayed > > > 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; > > > oldest blocked for > 31791.858827 secs > > > 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds > > > old, > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > > > currently delayed > > > 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; > > > oldest blocked for > 31792.858971 secs > > > 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds > > > old, > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > > > currently delayed > > > 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; > > > oldest blocked for > 31793.859124 secs > > > 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds > > > old, > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > > > currently delayed > > > > > > When i now restart osd 24 they go away and everything is fine again. > > > > > > Stefan > > > -- > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > > -- > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-11-15 14:58 ` Sage Weil @ 2012-11-28 1:51 ` Sage Weil 2012-11-28 7:23 ` Stefan Priebe - Profihost AG 2012-11-28 7:47 ` Andrey Korolyov 0 siblings, 2 replies; 15+ messages in thread From: Sage Weil @ 2012-11-28 1:51 UTC (permalink / raw) To: Stefan Priebe - Profihost AG; +Cc: ceph-devel@vger.kernel.org Hi Stefan, On Thu, 15 Nov 2012, Sage Weil wrote: > On Thu, 15 Nov 2012, Stefan Priebe - Profihost AG wrote: > > Am 14.11.2012 15:59, schrieb Sage Weil: > > > Hi Stefan, > > > > > > I would be nice to confirm that no clients are waiting on replies for > > > these requests; currently we suspect that the OSD request tracking is the > > > buggy part. If you query the OSD admin socket you should be able to dump > > > requests and see the client IP, and then query the client. > > > > > > Is it librbd? In that case you likely need to change the config so that > > > it is listening on an admin socket ('admin socket = path'). > > > > Yes it is. So i have to specify admin socket at the KVM host? > > Right. IIRC the disk line is a ; (or \;) separated list of key/value > pairs. > > > How do i query the admin socket for requests? > > ceph --admin-daemon /path/to/socket help > ceph --admin-daemon /path/to/socket objecter_dump (i think) Were you able to reproduce this? Thanks! sage > > sage > > > > > Stefan > > > > > > > On Wed, 14 Nov 2012, Stefan Priebe - Profihost AG wrote: > > > > > > > Hello list, > > > > > > > > i see this several times. Endless flying slow requests. And they never > > > > stop > > > > until i restart the mentioned osd. > > > > > > > > 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; > > > > oldest blocked for > 31789.858457 secs > > > > 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds > > > > old, > > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > > > > currently delayed > > > > 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; > > > > oldest blocked for > 31790.858646 secs > > > > 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds > > > > old, > > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > > > > currently delayed > > > > 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; > > > > oldest blocked for > 31791.858827 secs > > > > 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds > > > > old, > > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > > > > currently delayed > > > > 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; > > > > oldest blocked for > 31792.858971 secs > > > > 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds > > > > old, > > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > > > > currently delayed > > > > 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; > > > > oldest blocked for > 31793.859124 secs > > > > 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds > > > > old, > > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > > > > currently delayed > > > > > > > > When i now restart osd 24 they go away and everything is fine again. > > > > > > > > Stefan > > > > -- > > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > > > > the body of a message to majordomo@vger.kernel.org > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > -- > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-11-28 1:51 ` Sage Weil @ 2012-11-28 7:23 ` Stefan Priebe - Profihost AG 2012-11-28 7:47 ` Andrey Korolyov 1 sibling, 0 replies; 15+ messages in thread From: Stefan Priebe - Profihost AG @ 2012-11-28 7:23 UTC (permalink / raw) To: Sage Weil; +Cc: ceph-devel@vger.kernel.org Am 28.11.2012 02:51, schrieb Sage Weil: > Hi Stefan, > >>> Yes it is. So i have to specify admin socket at the KVM host? >> >> Right. IIRC the disk line is a ; (or \;) separated list of key/value >> pairs. >> >>> How do i query the admin socket for requests? >> >> ceph --admin-daemon /path/to/socket help >> ceph --admin-daemon /path/to/socket objecter_dump (i think) > > Were you able to reproduce this? No not right now. But i've saved your answers and will make the tests when i see this again. Right now i'm just seeing extremely slow OSD recovery. Stefan ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-11-28 1:51 ` Sage Weil 2012-11-28 7:23 ` Stefan Priebe - Profihost AG @ 2012-11-28 7:47 ` Andrey Korolyov 2012-11-28 21:11 ` Samuel Just 1 sibling, 1 reply; 15+ messages in thread From: Andrey Korolyov @ 2012-11-28 7:47 UTC (permalink / raw) To: Sage Weil; +Cc: Stefan Priebe - Profihost AG, ceph-devel@vger.kernel.org On Wed, Nov 28, 2012 at 5:51 AM, Sage Weil <sage@inktank.com> wrote: > Hi Stefan, > > On Thu, 15 Nov 2012, Sage Weil wrote: >> On Thu, 15 Nov 2012, Stefan Priebe - Profihost AG wrote: >> > Am 14.11.2012 15:59, schrieb Sage Weil: >> > > Hi Stefan, >> > > >> > > I would be nice to confirm that no clients are waiting on replies for >> > > these requests; currently we suspect that the OSD request tracking is the >> > > buggy part. If you query the OSD admin socket you should be able to dump >> > > requests and see the client IP, and then query the client. >> > > >> > > Is it librbd? In that case you likely need to change the config so that >> > > it is listening on an admin socket ('admin socket = path'). >> > >> > Yes it is. So i have to specify admin socket at the KVM host? >> >> Right. IIRC the disk line is a ; (or \;) separated list of key/value >> pairs. >> >> > How do i query the admin socket for requests? >> >> ceph --admin-daemon /path/to/socket help >> ceph --admin-daemon /path/to/socket objecter_dump (i think) > > Were you able to reproduce this? > > Thanks! > sage Meanwhile, I did. :) Such requests will always be created if you have restarted or marked an osd out and then back in and scrub didn`t happen in the meantime (after such operation and before request arrival). What is more interesting, the hangup happens not exactly at the time of operation, but tens of minutes later. { "description": "osd_sub_op(client.1292013.0:45422 4.731 a384cf31\/rbd_data.1415fb1075f187.00000000000000a7\/head\/\/4 [] v 16444'21693 snapset=0=[]:[] snapc=0=[])", "received_at": "2012-11-28 03:54:43.094151", "age": "27812.942680", "duration": "2.676641", "flag_point": "started", "events": [ { "time": "2012-11-28 03:54:43.094222", "event": "waiting_for_osdmap"}, { "time": "2012-11-28 03:54:43.386890", "event": "reached_pg"}, { "time": "2012-11-28 03:54:43.386894", "event": "started"}, { "time": "2012-11-28 03:54:43.386973", "event": "commit_queued_for_journal_write"}, { "time": "2012-11-28 03:54:45.360049", "event": "write_thread_in_journal_buffer"}, { "time": "2012-11-28 03:54:45.586183", "event": "journaled_completion_queued"}, { "time": "2012-11-28 03:54:45.586262", "event": "sub_op_commit"}, { "time": "2012-11-28 03:54:45.770792", "event": "sub_op_applied"}]}]} > > >> >> sage >> >> > >> > Stefan >> > >> > >> > > On Wed, 14 Nov 2012, Stefan Priebe - Profihost AG wrote: >> > > >> > > > Hello list, >> > > > >> > > > i see this several times. Endless flying slow requests. And they never >> > > > stop >> > > > until i restart the mentioned osd. >> > > > >> > > > 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; >> > > > oldest blocked for > 31789.858457 secs >> > > > 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds >> > > > old, >> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> > > > currently delayed >> > > > 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; >> > > > oldest blocked for > 31790.858646 secs >> > > > 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds >> > > > old, >> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> > > > currently delayed >> > > > 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; >> > > > oldest blocked for > 31791.858827 secs >> > > > 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds >> > > > old, >> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> > > > currently delayed >> > > > 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; >> > > > oldest blocked for > 31792.858971 secs >> > > > 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds >> > > > old, >> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> > > > currently delayed >> > > > 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; >> > > > oldest blocked for > 31793.859124 secs >> > > > 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds >> > > > old, >> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> > > > currently delayed >> > > > >> > > > When i now restart osd 24 they go away and everything is fine again. >> > > > >> > > > Stefan >> > > > -- >> > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> > > > the body of a message to majordomo@vger.kernel.org >> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > > >> > > > >> > > -- >> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> > > the body of a message to majordomo@vger.kernel.org >> > > More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > >> > -- >> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> > the body of a message to majordomo@vger.kernel.org >> > More majordomo info at http://vger.kernel.org/majordomo-info.html >> > >> > >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-11-28 7:47 ` Andrey Korolyov @ 2012-11-28 21:11 ` Samuel Just 2012-11-28 21:12 ` Samuel Just 0 siblings, 1 reply; 15+ messages in thread From: Samuel Just @ 2012-11-28 21:11 UTC (permalink / raw) To: Andrey Korolyov Cc: Sage Weil, Stefan Priebe - Profihost AG, ceph-devel@vger.kernel.org Did you observe hung io along with that error? Both sub_op_commit and sub_op_applied have happened, so the sub_op_reply should have been sent back to the primary. This looks more like a leak. If you also observed hung io, then it's possible that the problem is occurring between the sub_op_applied event and the response. -Sam On Tue, Nov 27, 2012 at 11:47 PM, Andrey Korolyov <andrey@xdel.ru> wrote: > On Wed, Nov 28, 2012 at 5:51 AM, Sage Weil <sage@inktank.com> wrote: >> Hi Stefan, >> >> On Thu, 15 Nov 2012, Sage Weil wrote: >>> On Thu, 15 Nov 2012, Stefan Priebe - Profihost AG wrote: >>> > Am 14.11.2012 15:59, schrieb Sage Weil: >>> > > Hi Stefan, >>> > > >>> > > I would be nice to confirm that no clients are waiting on replies for >>> > > these requests; currently we suspect that the OSD request tracking is the >>> > > buggy part. If you query the OSD admin socket you should be able to dump >>> > > requests and see the client IP, and then query the client. >>> > > >>> > > Is it librbd? In that case you likely need to change the config so that >>> > > it is listening on an admin socket ('admin socket = path'). >>> > >>> > Yes it is. So i have to specify admin socket at the KVM host? >>> >>> Right. IIRC the disk line is a ; (or \;) separated list of key/value >>> pairs. >>> >>> > How do i query the admin socket for requests? >>> >>> ceph --admin-daemon /path/to/socket help >>> ceph --admin-daemon /path/to/socket objecter_dump (i think) >> >> Were you able to reproduce this? >> >> Thanks! >> sage > > Meanwhile, I did. :) > Such requests will always be created if you have restarted or marked > an osd out and then back in and scrub didn`t happen in the meantime > (after such operation and before request arrival). > What is more interesting, the hangup happens not exactly at the time > of operation, but tens of minutes later. > > { "description": "osd_sub_op(client.1292013.0:45422 4.731 > a384cf31\/rbd_data.1415fb1075f187.00000000000000a7\/head\/\/4 [] v > 16444'21693 snapset=0=[]:[] snapc=0=[])", > "received_at": "2012-11-28 03:54:43.094151", > "age": "27812.942680", > "duration": "2.676641", > "flag_point": "started", > "events": [ > { "time": "2012-11-28 03:54:43.094222", > "event": "waiting_for_osdmap"}, > { "time": "2012-11-28 03:54:43.386890", > "event": "reached_pg"}, > { "time": "2012-11-28 03:54:43.386894", > "event": "started"}, > { "time": "2012-11-28 03:54:43.386973", > "event": "commit_queued_for_journal_write"}, > { "time": "2012-11-28 03:54:45.360049", > "event": "write_thread_in_journal_buffer"}, > { "time": "2012-11-28 03:54:45.586183", > "event": "journaled_completion_queued"}, > { "time": "2012-11-28 03:54:45.586262", > "event": "sub_op_commit"}, > { "time": "2012-11-28 03:54:45.770792", > "event": "sub_op_applied"}]}]} > > >> >> >>> >>> sage >>> >>> > >>> > Stefan >>> > >>> > >>> > > On Wed, 14 Nov 2012, Stefan Priebe - Profihost AG wrote: >>> > > >>> > > > Hello list, >>> > > > >>> > > > i see this several times. Endless flying slow requests. And they never >>> > > > stop >>> > > > until i restart the mentioned osd. >>> > > > >>> > > > 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; >>> > > > oldest blocked for > 31789.858457 secs >>> > > > 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds >>> > > > old, >>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>> > > > currently delayed >>> > > > 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; >>> > > > oldest blocked for > 31790.858646 secs >>> > > > 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds >>> > > > old, >>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>> > > > currently delayed >>> > > > 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; >>> > > > oldest blocked for > 31791.858827 secs >>> > > > 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds >>> > > > old, >>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>> > > > currently delayed >>> > > > 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; >>> > > > oldest blocked for > 31792.858971 secs >>> > > > 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds >>> > > > old, >>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>> > > > currently delayed >>> > > > 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; >>> > > > oldest blocked for > 31793.859124 secs >>> > > > 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds >>> > > > old, >>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>> > > > currently delayed >>> > > > >>> > > > When i now restart osd 24 they go away and everything is fine again. >>> > > > >>> > > > Stefan >>> > > > -- >>> > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>> > > > the body of a message to majordomo@vger.kernel.org >>> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html >>> > > > >>> > > > >>> > > -- >>> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>> > > the body of a message to majordomo@vger.kernel.org >>> > > More majordomo info at http://vger.kernel.org/majordomo-info.html >>> > > >>> > -- >>> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>> > the body of a message to majordomo@vger.kernel.org >>> > More majordomo info at http://vger.kernel.org/majordomo-info.html >>> > >>> > >>> -- >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >>> >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-11-28 21:11 ` Samuel Just @ 2012-11-28 21:12 ` Samuel Just 2012-11-28 21:52 ` Moore, Shawn M 2012-11-28 22:04 ` Andrey Korolyov 0 siblings, 2 replies; 15+ messages in thread From: Samuel Just @ 2012-11-28 21:12 UTC (permalink / raw) To: Andrey Korolyov Cc: Sage Weil, Stefan Priebe - Profihost AG, ceph-devel@vger.kernel.org Also, these clusters aren't mixed argonaut and next, are they? (Not that that shouldn't work, but it would be a useful data point.) -Sam On Wed, Nov 28, 2012 at 1:11 PM, Samuel Just <sam.just@inktank.com> wrote: > Did you observe hung io along with that error? Both sub_op_commit and > sub_op_applied have happened, so the sub_op_reply should have been > sent back to the primary. This looks more like a leak. If you also > observed hung io, then it's possible that the problem is occurring > between the sub_op_applied event and the response. > -Sam > > On Tue, Nov 27, 2012 at 11:47 PM, Andrey Korolyov <andrey@xdel.ru> wrote: >> On Wed, Nov 28, 2012 at 5:51 AM, Sage Weil <sage@inktank.com> wrote: >>> Hi Stefan, >>> >>> On Thu, 15 Nov 2012, Sage Weil wrote: >>>> On Thu, 15 Nov 2012, Stefan Priebe - Profihost AG wrote: >>>> > Am 14.11.2012 15:59, schrieb Sage Weil: >>>> > > Hi Stefan, >>>> > > >>>> > > I would be nice to confirm that no clients are waiting on replies for >>>> > > these requests; currently we suspect that the OSD request tracking is the >>>> > > buggy part. If you query the OSD admin socket you should be able to dump >>>> > > requests and see the client IP, and then query the client. >>>> > > >>>> > > Is it librbd? In that case you likely need to change the config so that >>>> > > it is listening on an admin socket ('admin socket = path'). >>>> > >>>> > Yes it is. So i have to specify admin socket at the KVM host? >>>> >>>> Right. IIRC the disk line is a ; (or \;) separated list of key/value >>>> pairs. >>>> >>>> > How do i query the admin socket for requests? >>>> >>>> ceph --admin-daemon /path/to/socket help >>>> ceph --admin-daemon /path/to/socket objecter_dump (i think) >>> >>> Were you able to reproduce this? >>> >>> Thanks! >>> sage >> >> Meanwhile, I did. :) >> Such requests will always be created if you have restarted or marked >> an osd out and then back in and scrub didn`t happen in the meantime >> (after such operation and before request arrival). >> What is more interesting, the hangup happens not exactly at the time >> of operation, but tens of minutes later. >> >> { "description": "osd_sub_op(client.1292013.0:45422 4.731 >> a384cf31\/rbd_data.1415fb1075f187.00000000000000a7\/head\/\/4 [] v >> 16444'21693 snapset=0=[]:[] snapc=0=[])", >> "received_at": "2012-11-28 03:54:43.094151", >> "age": "27812.942680", >> "duration": "2.676641", >> "flag_point": "started", >> "events": [ >> { "time": "2012-11-28 03:54:43.094222", >> "event": "waiting_for_osdmap"}, >> { "time": "2012-11-28 03:54:43.386890", >> "event": "reached_pg"}, >> { "time": "2012-11-28 03:54:43.386894", >> "event": "started"}, >> { "time": "2012-11-28 03:54:43.386973", >> "event": "commit_queued_for_journal_write"}, >> { "time": "2012-11-28 03:54:45.360049", >> "event": "write_thread_in_journal_buffer"}, >> { "time": "2012-11-28 03:54:45.586183", >> "event": "journaled_completion_queued"}, >> { "time": "2012-11-28 03:54:45.586262", >> "event": "sub_op_commit"}, >> { "time": "2012-11-28 03:54:45.770792", >> "event": "sub_op_applied"}]}]} >> >> >>> >>> >>>> >>>> sage >>>> >>>> > >>>> > Stefan >>>> > >>>> > >>>> > > On Wed, 14 Nov 2012, Stefan Priebe - Profihost AG wrote: >>>> > > >>>> > > > Hello list, >>>> > > > >>>> > > > i see this several times. Endless flying slow requests. And they never >>>> > > > stop >>>> > > > until i restart the mentioned osd. >>>> > > > >>>> > > > 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; >>>> > > > oldest blocked for > 31789.858457 secs >>>> > > > 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds >>>> > > > old, >>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>> > > > currently delayed >>>> > > > 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; >>>> > > > oldest blocked for > 31790.858646 secs >>>> > > > 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds >>>> > > > old, >>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>> > > > currently delayed >>>> > > > 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; >>>> > > > oldest blocked for > 31791.858827 secs >>>> > > > 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds >>>> > > > old, >>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>> > > > currently delayed >>>> > > > 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; >>>> > > > oldest blocked for > 31792.858971 secs >>>> > > > 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds >>>> > > > old, >>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>> > > > currently delayed >>>> > > > 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; >>>> > > > oldest blocked for > 31793.859124 secs >>>> > > > 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds >>>> > > > old, >>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>> > > > currently delayed >>>> > > > >>>> > > > When i now restart osd 24 they go away and everything is fine again. >>>> > > > >>>> > > > Stefan >>>> > > > -- >>>> > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>> > > > the body of a message to majordomo@vger.kernel.org >>>> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> > > > >>>> > > > >>>> > > -- >>>> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>> > > the body of a message to majordomo@vger.kernel.org >>>> > > More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> > > >>>> > -- >>>> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>> > the body of a message to majordomo@vger.kernel.org >>>> > More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> > >>>> > >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> >>>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: endless flying slow requests 2012-11-28 21:12 ` Samuel Just @ 2012-11-28 21:52 ` Moore, Shawn M 2012-11-28 22:04 ` Andrey Korolyov 1 sibling, 0 replies; 15+ messages in thread From: Moore, Shawn M @ 2012-11-28 21:52 UTC (permalink / raw) To: Samuel Just, Andrey Korolyov Cc: Sage Weil, Stefan Priebe - Profihost AG, ceph-devel@vger.kernel.org I actually have this exact issue with hung io during these slow requests. The odd thing is when I restarted the osd that was being mentioned via "ceph -w" it didn't resume io. A bit later (unsure on exact time) another one started being listed in "ceph -w" and when I restarted that different osd io came back. I do believe these occur when I reboot nodes. I never reboot more than one at a time and I have 3 replicas in the "cluster". I have a pure "ceph version 0.53 (commit:2528b5ee105b16352c91af064af5c0b5a7d45d7c)" cluster. Regards, Shawn ________________________________________ From: ceph-devel-owner@vger.kernel.org [ceph-devel-owner@vger.kernel.org] on behalf of Samuel Just [sam.just@inktank.com] Sent: Wednesday, November 28, 2012 16:12 To: Andrey Korolyov Cc: Sage Weil; Stefan Priebe - Profihost AG; ceph-devel@vger.kernel.org Subject: Re: endless flying slow requests Also, these clusters aren't mixed argonaut and next, are they? (Not that that shouldn't work, but it would be a useful data point.) -Sam On Wed, Nov 28, 2012 at 1:11 PM, Samuel Just <sam.just@inktank.com> wrote: > Did you observe hung io along with that error? Both sub_op_commit and > sub_op_applied have happened, so the sub_op_reply should have been > sent back to the primary. This looks more like a leak. If you also > observed hung io, then it's possible that the problem is occurring > between the sub_op_applied event and the response. > -Sam > > On Tue, Nov 27, 2012 at 11:47 PM, Andrey Korolyov <andrey@xdel.ru> wrote: >> On Wed, Nov 28, 2012 at 5:51 AM, Sage Weil <sage@inktank.com> wrote: >>> Hi Stefan, >>> >>> On Thu, 15 Nov 2012, Sage Weil wrote: >>>> On Thu, 15 Nov 2012, Stefan Priebe - Profihost AG wrote: >>>> > Am 14.11.2012 15:59, schrieb Sage Weil: >>>> > > Hi Stefan, >>>> > > >>>> > > I would be nice to confirm that no clients are waiting on replies for >>>> > > these requests; currently we suspect that the OSD request tracking is the >>>> > > buggy part. If you query the OSD admin socket you should be able to dump >>>> > > requests and see the client IP, and then query the client. >>>> > > >>>> > > Is it librbd? In that case you likely need to change the config so that >>>> > > it is listening on an admin socket ('admin socket = path'). >>>> > >>>> > Yes it is. So i have to specify admin socket at the KVM host? >>>> >>>> Right. IIRC the disk line is a ; (or \;) separated list of key/value >>>> pairs. >>>> >>>> > How do i query the admin socket for requests? >>>> >>>> ceph --admin-daemon /path/to/socket help >>>> ceph --admin-daemon /path/to/socket objecter_dump (i think) >>> >>> Were you able to reproduce this? >>> >>> Thanks! >>> sage >> >> Meanwhile, I did. :) >> Such requests will always be created if you have restarted or marked >> an osd out and then back in and scrub didn`t happen in the meantime >> (after such operation and before request arrival). >> What is more interesting, the hangup happens not exactly at the time >> of operation, but tens of minutes later. >> >> { "description": "osd_sub_op(client.1292013.0:45422 4.731 >> a384cf31\/rbd_data.1415fb1075f187.00000000000000a7\/head\/\/4 [] v >> 16444'21693 snapset=0=[]:[] snapc=0=[])", >> "received_at": "2012-11-28 03:54:43.094151", >> "age": "27812.942680", >> "duration": "2.676641", >> "flag_point": "started", >> "events": [ >> { "time": "2012-11-28 03:54:43.094222", >> "event": "waiting_for_osdmap"}, >> { "time": "2012-11-28 03:54:43.386890", >> "event": "reached_pg"}, >> { "time": "2012-11-28 03:54:43.386894", >> "event": "started"}, >> { "time": "2012-11-28 03:54:43.386973", >> "event": "commit_queued_for_journal_write"}, >> { "time": "2012-11-28 03:54:45.360049", >> "event": "write_thread_in_journal_buffer"}, >> { "time": "2012-11-28 03:54:45.586183", >> "event": "journaled_completion_queued"}, >> { "time": "2012-11-28 03:54:45.586262", >> "event": "sub_op_commit"}, >> { "time": "2012-11-28 03:54:45.770792", >> "event": "sub_op_applied"}]}]} >> >> >>> >>> >>>> >>>> sage >>>> >>>> > >>>> > Stefan >>>> > >>>> > >>>> > > On Wed, 14 Nov 2012, Stefan Priebe - Profihost AG wrote: >>>> > > >>>> > > > Hello list, >>>> > > > >>>> > > > i see this several times. Endless flying slow requests. And they never >>>> > > > stop >>>> > > > until i restart the mentioned osd. >>>> > > > >>>> > > > 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; >>>> > > > oldest blocked for > 31789.858457 secs >>>> > > > 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds >>>> > > > old, >>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>> > > > currently delayed >>>> > > > 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; >>>> > > > oldest blocked for > 31790.858646 secs >>>> > > > 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds >>>> > > > old, >>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>> > > > currently delayed >>>> > > > 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; >>>> > > > oldest blocked for > 31791.858827 secs >>>> > > > 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds >>>> > > > old, >>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>> > > > currently delayed >>>> > > > 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; >>>> > > > oldest blocked for > 31792.858971 secs >>>> > > > 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds >>>> > > > old, >>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>> > > > currently delayed >>>> > > > 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; >>>> > > > oldest blocked for > 31793.859124 secs >>>> > > > 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds >>>> > > > old, >>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>> > > > currently delayed >>>> > > > >>>> > > > When i now restart osd 24 they go away and everything is fine again. >>>> > > > >>>> > > > Stefan >>>> > > > -- >>>> > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>> > > > the body of a message to majordomo@vger.kernel.org >>>> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> > > > >>>> > > > >>>> > > -- >>>> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>> > > the body of a message to majordomo@vger.kernel.org >>>> > > More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> > > >>>> > -- >>>> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>> > the body of a message to majordomo@vger.kernel.org >>>> > More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> > >>>> > >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> >>>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-11-28 21:12 ` Samuel Just 2012-11-28 21:52 ` Moore, Shawn M @ 2012-11-28 22:04 ` Andrey Korolyov 2012-12-01 5:06 ` Samuel Just 1 sibling, 1 reply; 15+ messages in thread From: Andrey Korolyov @ 2012-11-28 22:04 UTC (permalink / raw) To: Samuel Just Cc: Sage Weil, Stefan Priebe - Profihost AG, ceph-devel@vger.kernel.org On Thu, Nov 29, 2012 at 1:12 AM, Samuel Just <sam.just@inktank.com> wrote: > Also, these clusters aren't mixed argonaut and next, are they? (Not > that that shouldn't work, but it would be a useful data point.) > -Sam > > On Wed, Nov 28, 2012 at 1:11 PM, Samuel Just <sam.just@inktank.com> wrote: >> Did you observe hung io along with that error? Both sub_op_commit and >> sub_op_applied have happened, so the sub_op_reply should have been >> sent back to the primary. This looks more like a leak. If you also >> observed hung io, then it's possible that the problem is occurring >> between the sub_op_applied event and the response. >> -Sam >> It is relatively easy to check if one of client VMs has locked one or more cores to iowait or just hangs, so yes, these ops are related to real commit operations and they are hanged. I`m using all-new 0.54 cluster, without mixing of course. Does everyone who hit that bug readjusted cluster before bug shows itself(say, in a day-long distance)? >> On Tue, Nov 27, 2012 at 11:47 PM, Andrey Korolyov <andrey@xdel.ru> wrote: >>> On Wed, Nov 28, 2012 at 5:51 AM, Sage Weil <sage@inktank.com> wrote: >>>> Hi Stefan, >>>> >>>> On Thu, 15 Nov 2012, Sage Weil wrote: >>>>> On Thu, 15 Nov 2012, Stefan Priebe - Profihost AG wrote: >>>>> > Am 14.11.2012 15:59, schrieb Sage Weil: >>>>> > > Hi Stefan, >>>>> > > >>>>> > > I would be nice to confirm that no clients are waiting on replies for >>>>> > > these requests; currently we suspect that the OSD request tracking is the >>>>> > > buggy part. If you query the OSD admin socket you should be able to dump >>>>> > > requests and see the client IP, and then query the client. >>>>> > > >>>>> > > Is it librbd? In that case you likely need to change the config so that >>>>> > > it is listening on an admin socket ('admin socket = path'). >>>>> > >>>>> > Yes it is. So i have to specify admin socket at the KVM host? >>>>> >>>>> Right. IIRC the disk line is a ; (or \;) separated list of key/value >>>>> pairs. >>>>> >>>>> > How do i query the admin socket for requests? >>>>> >>>>> ceph --admin-daemon /path/to/socket help >>>>> ceph --admin-daemon /path/to/socket objecter_dump (i think) >>>> >>>> Were you able to reproduce this? >>>> >>>> Thanks! >>>> sage >>> >>> Meanwhile, I did. :) >>> Such requests will always be created if you have restarted or marked >>> an osd out and then back in and scrub didn`t happen in the meantime >>> (after such operation and before request arrival). >>> What is more interesting, the hangup happens not exactly at the time >>> of operation, but tens of minutes later. >>> >>> { "description": "osd_sub_op(client.1292013.0:45422 4.731 >>> a384cf31\/rbd_data.1415fb1075f187.00000000000000a7\/head\/\/4 [] v >>> 16444'21693 snapset=0=[]:[] snapc=0=[])", >>> "received_at": "2012-11-28 03:54:43.094151", >>> "age": "27812.942680", >>> "duration": "2.676641", >>> "flag_point": "started", >>> "events": [ >>> { "time": "2012-11-28 03:54:43.094222", >>> "event": "waiting_for_osdmap"}, >>> { "time": "2012-11-28 03:54:43.386890", >>> "event": "reached_pg"}, >>> { "time": "2012-11-28 03:54:43.386894", >>> "event": "started"}, >>> { "time": "2012-11-28 03:54:43.386973", >>> "event": "commit_queued_for_journal_write"}, >>> { "time": "2012-11-28 03:54:45.360049", >>> "event": "write_thread_in_journal_buffer"}, >>> { "time": "2012-11-28 03:54:45.586183", >>> "event": "journaled_completion_queued"}, >>> { "time": "2012-11-28 03:54:45.586262", >>> "event": "sub_op_commit"}, >>> { "time": "2012-11-28 03:54:45.770792", >>> "event": "sub_op_applied"}]}]} >>> >>> >>>> >>>> >>>>> >>>>> sage >>>>> >>>>> > >>>>> > Stefan >>>>> > >>>>> > >>>>> > > On Wed, 14 Nov 2012, Stefan Priebe - Profihost AG wrote: >>>>> > > >>>>> > > > Hello list, >>>>> > > > >>>>> > > > i see this several times. Endless flying slow requests. And they never >>>>> > > > stop >>>>> > > > until i restart the mentioned osd. >>>>> > > > >>>>> > > > 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; >>>>> > > > oldest blocked for > 31789.858457 secs >>>>> > > > 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds >>>>> > > > old, >>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>>> > > > currently delayed >>>>> > > > 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; >>>>> > > > oldest blocked for > 31790.858646 secs >>>>> > > > 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds >>>>> > > > old, >>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>>> > > > currently delayed >>>>> > > > 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; >>>>> > > > oldest blocked for > 31791.858827 secs >>>>> > > > 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds >>>>> > > > old, >>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>>> > > > currently delayed >>>>> > > > 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; >>>>> > > > oldest blocked for > 31792.858971 secs >>>>> > > > 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds >>>>> > > > old, >>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>>> > > > currently delayed >>>>> > > > 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; >>>>> > > > oldest blocked for > 31793.859124 secs >>>>> > > > 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds >>>>> > > > old, >>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>>> > > > currently delayed >>>>> > > > >>>>> > > > When i now restart osd 24 they go away and everything is fine again. >>>>> > > > >>>>> > > > Stefan >>>>> > > > -- >>>>> > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>>> > > > the body of a message to majordomo@vger.kernel.org >>>>> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> > > > >>>>> > > > >>>>> > > -- >>>>> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>>> > > the body of a message to majordomo@vger.kernel.org >>>>> > > More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> > > >>>>> > -- >>>>> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>>> > the body of a message to majordomo@vger.kernel.org >>>>> > More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> > >>>>> > >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> >>>>> >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> -- >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-11-28 22:04 ` Andrey Korolyov @ 2012-12-01 5:06 ` Samuel Just 2012-12-02 3:18 ` Sage Weil 0 siblings, 1 reply; 15+ messages in thread From: Samuel Just @ 2012-12-01 5:06 UTC (permalink / raw) To: Andrey Korolyov Cc: Sage Weil, Stefan Priebe - Profihost AG, ceph-devel@vger.kernel.org I've pushed a fix to next, 49f32cee647c5bd09f36ba7c9fd4f481a697b9d7. Let me know if the problem persists with this patch. -Sam On Wed, Nov 28, 2012 at 2:04 PM, Andrey Korolyov <andrey@xdel.ru> wrote: > On Thu, Nov 29, 2012 at 1:12 AM, Samuel Just <sam.just@inktank.com> wrote: >> Also, these clusters aren't mixed argonaut and next, are they? (Not >> that that shouldn't work, but it would be a useful data point.) >> -Sam >> >> On Wed, Nov 28, 2012 at 1:11 PM, Samuel Just <sam.just@inktank.com> wrote: >>> Did you observe hung io along with that error? Both sub_op_commit and >>> sub_op_applied have happened, so the sub_op_reply should have been >>> sent back to the primary. This looks more like a leak. If you also >>> observed hung io, then it's possible that the problem is occurring >>> between the sub_op_applied event and the response. >>> -Sam >>> > > It is relatively easy to check if one of client VMs has locked one or > more cores to iowait or just hangs, so yes, these ops are related to > real commit operations and they are hanged. > I`m using all-new 0.54 cluster, without mixing of course. Does > everyone who hit that bug readjusted cluster before bug shows > itself(say, in a day-long distance)? > >>> On Tue, Nov 27, 2012 at 11:47 PM, Andrey Korolyov <andrey@xdel.ru> wrote: >>>> On Wed, Nov 28, 2012 at 5:51 AM, Sage Weil <sage@inktank.com> wrote: >>>>> Hi Stefan, >>>>> >>>>> On Thu, 15 Nov 2012, Sage Weil wrote: >>>>>> On Thu, 15 Nov 2012, Stefan Priebe - Profihost AG wrote: >>>>>> > Am 14.11.2012 15:59, schrieb Sage Weil: >>>>>> > > Hi Stefan, >>>>>> > > >>>>>> > > I would be nice to confirm that no clients are waiting on replies for >>>>>> > > these requests; currently we suspect that the OSD request tracking is the >>>>>> > > buggy part. If you query the OSD admin socket you should be able to dump >>>>>> > > requests and see the client IP, and then query the client. >>>>>> > > >>>>>> > > Is it librbd? In that case you likely need to change the config so that >>>>>> > > it is listening on an admin socket ('admin socket = path'). >>>>>> > >>>>>> > Yes it is. So i have to specify admin socket at the KVM host? >>>>>> >>>>>> Right. IIRC the disk line is a ; (or \;) separated list of key/value >>>>>> pairs. >>>>>> >>>>>> > How do i query the admin socket for requests? >>>>>> >>>>>> ceph --admin-daemon /path/to/socket help >>>>>> ceph --admin-daemon /path/to/socket objecter_dump (i think) >>>>> >>>>> Were you able to reproduce this? >>>>> >>>>> Thanks! >>>>> sage >>>> >>>> Meanwhile, I did. :) >>>> Such requests will always be created if you have restarted or marked >>>> an osd out and then back in and scrub didn`t happen in the meantime >>>> (after such operation and before request arrival). >>>> What is more interesting, the hangup happens not exactly at the time >>>> of operation, but tens of minutes later. >>>> >>>> { "description": "osd_sub_op(client.1292013.0:45422 4.731 >>>> a384cf31\/rbd_data.1415fb1075f187.00000000000000a7\/head\/\/4 [] v >>>> 16444'21693 snapset=0=[]:[] snapc=0=[])", >>>> "received_at": "2012-11-28 03:54:43.094151", >>>> "age": "27812.942680", >>>> "duration": "2.676641", >>>> "flag_point": "started", >>>> "events": [ >>>> { "time": "2012-11-28 03:54:43.094222", >>>> "event": "waiting_for_osdmap"}, >>>> { "time": "2012-11-28 03:54:43.386890", >>>> "event": "reached_pg"}, >>>> { "time": "2012-11-28 03:54:43.386894", >>>> "event": "started"}, >>>> { "time": "2012-11-28 03:54:43.386973", >>>> "event": "commit_queued_for_journal_write"}, >>>> { "time": "2012-11-28 03:54:45.360049", >>>> "event": "write_thread_in_journal_buffer"}, >>>> { "time": "2012-11-28 03:54:45.586183", >>>> "event": "journaled_completion_queued"}, >>>> { "time": "2012-11-28 03:54:45.586262", >>>> "event": "sub_op_commit"}, >>>> { "time": "2012-11-28 03:54:45.770792", >>>> "event": "sub_op_applied"}]}]} >>>> >>>> >>>>> >>>>> >>>>>> >>>>>> sage >>>>>> >>>>>> > >>>>>> > Stefan >>>>>> > >>>>>> > >>>>>> > > On Wed, 14 Nov 2012, Stefan Priebe - Profihost AG wrote: >>>>>> > > >>>>>> > > > Hello list, >>>>>> > > > >>>>>> > > > i see this several times. Endless flying slow requests. And they never >>>>>> > > > stop >>>>>> > > > until i restart the mentioned osd. >>>>>> > > > >>>>>> > > > 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; >>>>>> > > > oldest blocked for > 31789.858457 secs >>>>>> > > > 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds >>>>>> > > > old, >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>>>> > > > currently delayed >>>>>> > > > 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; >>>>>> > > > oldest blocked for > 31790.858646 secs >>>>>> > > > 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds >>>>>> > > > old, >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>>>> > > > currently delayed >>>>>> > > > 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; >>>>>> > > > oldest blocked for > 31791.858827 secs >>>>>> > > > 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds >>>>>> > > > old, >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>>>> > > > currently delayed >>>>>> > > > 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; >>>>>> > > > oldest blocked for > 31792.858971 secs >>>>>> > > > 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds >>>>>> > > > old, >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>>>> > > > currently delayed >>>>>> > > > 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; >>>>>> > > > oldest blocked for > 31793.859124 secs >>>>>> > > > 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds >>>>>> > > > old, >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >>>>>> > > > currently delayed >>>>>> > > > >>>>>> > > > When i now restart osd 24 they go away and everything is fine again. >>>>>> > > > >>>>>> > > > Stefan >>>>>> > > > -- >>>>>> > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>>>> > > > the body of a message to majordomo@vger.kernel.org >>>>>> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>>> > > > >>>>>> > > > >>>>>> > > -- >>>>>> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>>>> > > the body of a message to majordomo@vger.kernel.org >>>>>> > > More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>>> > > >>>>>> > -- >>>>>> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>>>> > the body of a message to majordomo@vger.kernel.org >>>>>> > More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>>> > >>>>>> > >>>>>> -- >>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>>>> the body of a message to majordomo@vger.kernel.org >>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>>> >>>>>> >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-12-01 5:06 ` Samuel Just @ 2012-12-02 3:18 ` Sage Weil 2012-12-02 22:02 ` Samuel Just 0 siblings, 1 reply; 15+ messages in thread From: Sage Weil @ 2012-12-02 3:18 UTC (permalink / raw) To: Samuel Just Cc: Andrey Korolyov, Stefan Priebe - Profihost AG, ceph-devel@vger.kernel.org I pushed another fix that plugs a leak when requests race with peering that fixed the 'slow request' for osd_sub_op messages for me. It's in wip-osd-leak. Sam, when you get a chance, can you take a look and push it to next if it looks okay? Thanks! sage On Fri, 30 Nov 2012, Samuel Just wrote: > I've pushed a fix to next, 49f32cee647c5bd09f36ba7c9fd4f481a697b9d7. > Let me know if the problem persists with this patch. > -Sam > > On Wed, Nov 28, 2012 at 2:04 PM, Andrey Korolyov <andrey@xdel.ru> wrote: > > On Thu, Nov 29, 2012 at 1:12 AM, Samuel Just <sam.just@inktank.com> wrote: > >> Also, these clusters aren't mixed argonaut and next, are they? (Not > >> that that shouldn't work, but it would be a useful data point.) > >> -Sam > >> > >> On Wed, Nov 28, 2012 at 1:11 PM, Samuel Just <sam.just@inktank.com> wrote: > >>> Did you observe hung io along with that error? Both sub_op_commit and > >>> sub_op_applied have happened, so the sub_op_reply should have been > >>> sent back to the primary. This looks more like a leak. If you also > >>> observed hung io, then it's possible that the problem is occurring > >>> between the sub_op_applied event and the response. > >>> -Sam > >>> > > > > It is relatively easy to check if one of client VMs has locked one or > > more cores to iowait or just hangs, so yes, these ops are related to > > real commit operations and they are hanged. > > I`m using all-new 0.54 cluster, without mixing of course. Does > > everyone who hit that bug readjusted cluster before bug shows > > itself(say, in a day-long distance)? > > > >>> On Tue, Nov 27, 2012 at 11:47 PM, Andrey Korolyov <andrey@xdel.ru> wrote: > >>>> On Wed, Nov 28, 2012 at 5:51 AM, Sage Weil <sage@inktank.com> wrote: > >>>>> Hi Stefan, > >>>>> > >>>>> On Thu, 15 Nov 2012, Sage Weil wrote: > >>>>>> On Thu, 15 Nov 2012, Stefan Priebe - Profihost AG wrote: > >>>>>> > Am 14.11.2012 15:59, schrieb Sage Weil: > >>>>>> > > Hi Stefan, > >>>>>> > > > >>>>>> > > I would be nice to confirm that no clients are waiting on replies for > >>>>>> > > these requests; currently we suspect that the OSD request tracking is the > >>>>>> > > buggy part. If you query the OSD admin socket you should be able to dump > >>>>>> > > requests and see the client IP, and then query the client. > >>>>>> > > > >>>>>> > > Is it librbd? In that case you likely need to change the config so that > >>>>>> > > it is listening on an admin socket ('admin socket = path'). > >>>>>> > > >>>>>> > Yes it is. So i have to specify admin socket at the KVM host? > >>>>>> > >>>>>> Right. IIRC the disk line is a ; (or \;) separated list of key/value > >>>>>> pairs. > >>>>>> > >>>>>> > How do i query the admin socket for requests? > >>>>>> > >>>>>> ceph --admin-daemon /path/to/socket help > >>>>>> ceph --admin-daemon /path/to/socket objecter_dump (i think) > >>>>> > >>>>> Were you able to reproduce this? > >>>>> > >>>>> Thanks! > >>>>> sage > >>>> > >>>> Meanwhile, I did. :) > >>>> Such requests will always be created if you have restarted or marked > >>>> an osd out and then back in and scrub didn`t happen in the meantime > >>>> (after such operation and before request arrival). > >>>> What is more interesting, the hangup happens not exactly at the time > >>>> of operation, but tens of minutes later. > >>>> > >>>> { "description": "osd_sub_op(client.1292013.0:45422 4.731 > >>>> a384cf31\/rbd_data.1415fb1075f187.00000000000000a7\/head\/\/4 [] v > >>>> 16444'21693 snapset=0=[]:[] snapc=0=[])", > >>>> "received_at": "2012-11-28 03:54:43.094151", > >>>> "age": "27812.942680", > >>>> "duration": "2.676641", > >>>> "flag_point": "started", > >>>> "events": [ > >>>> { "time": "2012-11-28 03:54:43.094222", > >>>> "event": "waiting_for_osdmap"}, > >>>> { "time": "2012-11-28 03:54:43.386890", > >>>> "event": "reached_pg"}, > >>>> { "time": "2012-11-28 03:54:43.386894", > >>>> "event": "started"}, > >>>> { "time": "2012-11-28 03:54:43.386973", > >>>> "event": "commit_queued_for_journal_write"}, > >>>> { "time": "2012-11-28 03:54:45.360049", > >>>> "event": "write_thread_in_journal_buffer"}, > >>>> { "time": "2012-11-28 03:54:45.586183", > >>>> "event": "journaled_completion_queued"}, > >>>> { "time": "2012-11-28 03:54:45.586262", > >>>> "event": "sub_op_commit"}, > >>>> { "time": "2012-11-28 03:54:45.770792", > >>>> "event": "sub_op_applied"}]}]} > >>>> > >>>> > >>>>> > >>>>> > >>>>>> > >>>>>> sage > >>>>>> > >>>>>> > > >>>>>> > Stefan > >>>>>> > > >>>>>> > > >>>>>> > > On Wed, 14 Nov 2012, Stefan Priebe - Profihost AG wrote: > >>>>>> > > > >>>>>> > > > Hello list, > >>>>>> > > > > >>>>>> > > > i see this several times. Endless flying slow requests. And they never > >>>>>> > > > stop > >>>>>> > > > until i restart the mentioned osd. > >>>>>> > > > > >>>>>> > > > 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; > >>>>>> > > > oldest blocked for > 31789.858457 secs > >>>>>> > > > 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds > >>>>>> > > > old, > >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > >>>>>> > > > currently delayed > >>>>>> > > > 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; > >>>>>> > > > oldest blocked for > 31790.858646 secs > >>>>>> > > > 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds > >>>>>> > > > old, > >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > >>>>>> > > > currently delayed > >>>>>> > > > 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; > >>>>>> > > > oldest blocked for > 31791.858827 secs > >>>>>> > > > 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds > >>>>>> > > > old, > >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > >>>>>> > > > currently delayed > >>>>>> > > > 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; > >>>>>> > > > oldest blocked for > 31792.858971 secs > >>>>>> > > > 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds > >>>>>> > > > old, > >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > >>>>>> > > > currently delayed > >>>>>> > > > 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; > >>>>>> > > > oldest blocked for > 31793.859124 secs > >>>>>> > > > 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds > >>>>>> > > > old, > >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 > >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 > >>>>>> > > > currently delayed > >>>>>> > > > > >>>>>> > > > When i now restart osd 24 they go away and everything is fine again. > >>>>>> > > > > >>>>>> > > > Stefan > >>>>>> > > > -- > >>>>>> > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > >>>>>> > > > the body of a message to majordomo@vger.kernel.org > >>>>>> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > >>>>>> > > > > >>>>>> > > > > >>>>>> > > -- > >>>>>> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > >>>>>> > > the body of a message to majordomo@vger.kernel.org > >>>>>> > > More majordomo info at http://vger.kernel.org/majordomo-info.html > >>>>>> > > > >>>>>> > -- > >>>>>> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > >>>>>> > the body of a message to majordomo@vger.kernel.org > >>>>>> > More majordomo info at http://vger.kernel.org/majordomo-info.html > >>>>>> > > >>>>>> > > >>>>>> -- > >>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > >>>>>> the body of a message to majordomo@vger.kernel.org > >>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>>>>> > >>>>>> > >>>>> -- > >>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > >>>>> the body of a message to majordomo@vger.kernel.org > >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>>> -- > >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > >>>> the body of a message to majordomo@vger.kernel.org > >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: endless flying slow requests 2012-12-02 3:18 ` Sage Weil @ 2012-12-02 22:02 ` Samuel Just 0 siblings, 0 replies; 15+ messages in thread From: Samuel Just @ 2012-12-02 22:02 UTC (permalink / raw) To: Sage Weil Cc: Andrey Korolyov, Stefan Priebe - Profihost AG, ceph-devel@vger.kernel.org Looks good, pushed. -Sam On Sat, Dec 1, 2012 at 7:18 PM, Sage Weil <sage@inktank.com> wrote: > I pushed another fix that plugs a leak when requests race with peering > that fixed the 'slow request' for osd_sub_op messages for me. It's in > wip-osd-leak. > > Sam, when you get a chance, can you take a look and push it to next if it > looks okay? > > Thanks! > sage > > > On Fri, 30 Nov 2012, Samuel Just wrote: > >> I've pushed a fix to next, 49f32cee647c5bd09f36ba7c9fd4f481a697b9d7. >> Let me know if the problem persists with this patch. >> -Sam >> >> On Wed, Nov 28, 2012 at 2:04 PM, Andrey Korolyov <andrey@xdel.ru> wrote: >> > On Thu, Nov 29, 2012 at 1:12 AM, Samuel Just <sam.just@inktank.com> wrote: >> >> Also, these clusters aren't mixed argonaut and next, are they? (Not >> >> that that shouldn't work, but it would be a useful data point.) >> >> -Sam >> >> >> >> On Wed, Nov 28, 2012 at 1:11 PM, Samuel Just <sam.just@inktank.com> wrote: >> >>> Did you observe hung io along with that error? Both sub_op_commit and >> >>> sub_op_applied have happened, so the sub_op_reply should have been >> >>> sent back to the primary. This looks more like a leak. If you also >> >>> observed hung io, then it's possible that the problem is occurring >> >>> between the sub_op_applied event and the response. >> >>> -Sam >> >>> >> > >> > It is relatively easy to check if one of client VMs has locked one or >> > more cores to iowait or just hangs, so yes, these ops are related to >> > real commit operations and they are hanged. >> > I`m using all-new 0.54 cluster, without mixing of course. Does >> > everyone who hit that bug readjusted cluster before bug shows >> > itself(say, in a day-long distance)? >> > >> >>> On Tue, Nov 27, 2012 at 11:47 PM, Andrey Korolyov <andrey@xdel.ru> wrote: >> >>>> On Wed, Nov 28, 2012 at 5:51 AM, Sage Weil <sage@inktank.com> wrote: >> >>>>> Hi Stefan, >> >>>>> >> >>>>> On Thu, 15 Nov 2012, Sage Weil wrote: >> >>>>>> On Thu, 15 Nov 2012, Stefan Priebe - Profihost AG wrote: >> >>>>>> > Am 14.11.2012 15:59, schrieb Sage Weil: >> >>>>>> > > Hi Stefan, >> >>>>>> > > >> >>>>>> > > I would be nice to confirm that no clients are waiting on replies for >> >>>>>> > > these requests; currently we suspect that the OSD request tracking is the >> >>>>>> > > buggy part. If you query the OSD admin socket you should be able to dump >> >>>>>> > > requests and see the client IP, and then query the client. >> >>>>>> > > >> >>>>>> > > Is it librbd? In that case you likely need to change the config so that >> >>>>>> > > it is listening on an admin socket ('admin socket = path'). >> >>>>>> > >> >>>>>> > Yes it is. So i have to specify admin socket at the KVM host? >> >>>>>> >> >>>>>> Right. IIRC the disk line is a ; (or \;) separated list of key/value >> >>>>>> pairs. >> >>>>>> >> >>>>>> > How do i query the admin socket for requests? >> >>>>>> >> >>>>>> ceph --admin-daemon /path/to/socket help >> >>>>>> ceph --admin-daemon /path/to/socket objecter_dump (i think) >> >>>>> >> >>>>> Were you able to reproduce this? >> >>>>> >> >>>>> Thanks! >> >>>>> sage >> >>>> >> >>>> Meanwhile, I did. :) >> >>>> Such requests will always be created if you have restarted or marked >> >>>> an osd out and then back in and scrub didn`t happen in the meantime >> >>>> (after such operation and before request arrival). >> >>>> What is more interesting, the hangup happens not exactly at the time >> >>>> of operation, but tens of minutes later. >> >>>> >> >>>> { "description": "osd_sub_op(client.1292013.0:45422 4.731 >> >>>> a384cf31\/rbd_data.1415fb1075f187.00000000000000a7\/head\/\/4 [] v >> >>>> 16444'21693 snapset=0=[]:[] snapc=0=[])", >> >>>> "received_at": "2012-11-28 03:54:43.094151", >> >>>> "age": "27812.942680", >> >>>> "duration": "2.676641", >> >>>> "flag_point": "started", >> >>>> "events": [ >> >>>> { "time": "2012-11-28 03:54:43.094222", >> >>>> "event": "waiting_for_osdmap"}, >> >>>> { "time": "2012-11-28 03:54:43.386890", >> >>>> "event": "reached_pg"}, >> >>>> { "time": "2012-11-28 03:54:43.386894", >> >>>> "event": "started"}, >> >>>> { "time": "2012-11-28 03:54:43.386973", >> >>>> "event": "commit_queued_for_journal_write"}, >> >>>> { "time": "2012-11-28 03:54:45.360049", >> >>>> "event": "write_thread_in_journal_buffer"}, >> >>>> { "time": "2012-11-28 03:54:45.586183", >> >>>> "event": "journaled_completion_queued"}, >> >>>> { "time": "2012-11-28 03:54:45.586262", >> >>>> "event": "sub_op_commit"}, >> >>>> { "time": "2012-11-28 03:54:45.770792", >> >>>> "event": "sub_op_applied"}]}]} >> >>>> >> >>>> >> >>>>> >> >>>>> >> >>>>>> >> >>>>>> sage >> >>>>>> >> >>>>>> > >> >>>>>> > Stefan >> >>>>>> > >> >>>>>> > >> >>>>>> > > On Wed, 14 Nov 2012, Stefan Priebe - Profihost AG wrote: >> >>>>>> > > >> >>>>>> > > > Hello list, >> >>>>>> > > > >> >>>>>> > > > i see this several times. Endless flying slow requests. And they never >> >>>>>> > > > stop >> >>>>>> > > > until i restart the mentioned osd. >> >>>>>> > > > >> >>>>>> > > > 2012-11-14 10:11:57.513395 osd.24 [WRN] 1 slow requests, 1 included below; >> >>>>>> > > > oldest blocked for > 31789.858457 secs >> >>>>>> > > > 2012-11-14 10:11:57.513399 osd.24 [WRN] slow request 31789.858457 seconds >> >>>>>> > > > old, >> >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> >>>>>> > > > currently delayed >> >>>>>> > > > 2012-11-14 10:11:58.513584 osd.24 [WRN] 1 slow requests, 1 included below; >> >>>>>> > > > oldest blocked for > 31790.858646 secs >> >>>>>> > > > 2012-11-14 10:11:58.513586 osd.24 [WRN] slow request 31790.858646 seconds >> >>>>>> > > > old, >> >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> >>>>>> > > > currently delayed >> >>>>>> > > > 2012-11-14 10:11:59.513766 osd.24 [WRN] 1 slow requests, 1 included below; >> >>>>>> > > > oldest blocked for > 31791.858827 secs >> >>>>>> > > > 2012-11-14 10:11:59.513768 osd.24 [WRN] slow request 31791.858827 seconds >> >>>>>> > > > old, >> >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> >>>>>> > > > currently delayed >> >>>>>> > > > 2012-11-14 10:12:00.513909 osd.24 [WRN] 1 slow requests, 1 included below; >> >>>>>> > > > oldest blocked for > 31792.858971 secs >> >>>>>> > > > 2012-11-14 10:12:00.513916 osd.24 [WRN] slow request 31792.858971 seconds >> >>>>>> > > > old, >> >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> >>>>>> > > > currently delayed >> >>>>>> > > > 2012-11-14 10:12:01.514061 osd.24 [WRN] 1 slow requests, 1 included below; >> >>>>>> > > > oldest blocked for > 31793.859124 secs >> >>>>>> > > > 2012-11-14 10:12:01.514063 osd.24 [WRN] slow request 31793.859124 seconds >> >>>>>> > > > old, >> >>>>>> > > > received at 2012-11-14 01:22:07.654922: osd_op(client.30286.0:6719 >> >>>>>> > > > rbd_data.75c55bf2fdd7.0000000000001399 [write 282624~4096] 3.3f6d2373) v4 >> >>>>>> > > > currently delayed >> >>>>>> > > > >> >>>>>> > > > When i now restart osd 24 they go away and everything is fine again. >> >>>>>> > > > >> >>>>>> > > > Stefan >> >>>>>> > > > -- >> >>>>>> > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> >>>>>> > > > the body of a message to majordomo@vger.kernel.org >> >>>>>> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html >> >>>>>> > > > >> >>>>>> > > > >> >>>>>> > > -- >> >>>>>> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> >>>>>> > > the body of a message to majordomo@vger.kernel.org >> >>>>>> > > More majordomo info at http://vger.kernel.org/majordomo-info.html >> >>>>>> > > >> >>>>>> > -- >> >>>>>> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> >>>>>> > the body of a message to majordomo@vger.kernel.org >> >>>>>> > More majordomo info at http://vger.kernel.org/majordomo-info.html >> >>>>>> > >> >>>>>> > >> >>>>>> -- >> >>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> >>>>>> the body of a message to majordomo@vger.kernel.org >> >>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >>>>>> >> >>>>>> >> >>>>> -- >> >>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> >>>>> the body of a message to majordomo@vger.kernel.org >> >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >>>> -- >> >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> >>>> the body of a message to majordomo@vger.kernel.org >> >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2012-12-02 22:02 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-11-14 9:12 endless flying slow requests Stefan Priebe - Profihost AG 2012-11-14 14:59 ` Sage Weil 2012-11-15 8:09 ` Stefan Priebe - Profihost AG 2012-11-15 8:18 ` Josh Durgin 2012-11-15 14:58 ` Sage Weil 2012-11-28 1:51 ` Sage Weil 2012-11-28 7:23 ` Stefan Priebe - Profihost AG 2012-11-28 7:47 ` Andrey Korolyov 2012-11-28 21:11 ` Samuel Just 2012-11-28 21:12 ` Samuel Just 2012-11-28 21:52 ` Moore, Shawn M 2012-11-28 22:04 ` Andrey Korolyov 2012-12-01 5:06 ` Samuel Just 2012-12-02 3:18 ` Sage Weil 2012-12-02 22:02 ` Samuel Just
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.