* Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)
@ 2019-07-26 7:01 Ajitha Robert
From: Ajitha Robert @ 2019-07-26 7:01 UTC (permalink / raw)
To: ceph-users-idqoXFIVOFJgJs9I8MT0rw, ceph-users-Qp0mS5GaXlQ,
ceph-devel-u79uwXL29TY76Z2rM5mHXA
I have an RBD mirroring setup with the primary and secondary clusters
configured as peers, and a pool with mirroring enabled in image mode. In
this pool I created an RBD image with journaling enabled.
But whenever I enable mirroring on the image, I get errors in
rbdmirror.log and osd.log.
I have increased the timeouts, but nothing worked, and I could not trace
the cause of the error.
Please guide me in solving this error.
*Logs*
http://paste.openstack.org/show/754766/
--
*Regards,Ajitha R*
_______________________________________________
ceph-users mailing list
ceph-users-idqoXFIVOFJgJs9I8MT0rw@public.gmane.org
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
* Re: Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)
@ 2019-07-26 9:31 Mykola Golub
From: Mykola Golub @ 2019-07-26 9:31 UTC (permalink / raw)
To: Ajitha Robert
Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw, ceph-users-Qp0mS5GaXlQ,
ceph-devel-u79uwXL29TY76Z2rM5mHXA

On Fri, Jul 26, 2019 at 12:31:59PM +0530, Ajitha Robert wrote:
> I have an RBD mirroring setup with the primary and secondary clusters
> configured as peers, and a pool with mirroring enabled in image mode. In
> this pool I created an RBD image with journaling enabled.
> But whenever I enable mirroring on the image, I get errors in
> rbdmirror.log and osd.log.
> I have increased the timeouts, but nothing worked, and I could not trace
> the cause of the error.
> Please guide me in solving this error.
>
> *Logs*
> http://paste.openstack.org/show/754766/

What do you mean by "nothing worked"? According to the mirroring status
the image is mirroring: it is in the "up+stopped" state on the primary, as
expected, and in the "up+replaying" state on the secondary with 0 entries
behind master.

The "failed to get omap key" error in the osd log is harmless, and
just a week ago a fix was merged upstream to stop displaying it.

The cause of the "InstanceWatcher: ... resending after timeout" error in
the rbd-mirror log is not clear, but if it is not repeating it is
harmless too.

I see you were trying to map the image with krbd. It is expected to
fail, as krbd does not support the "journaling" feature, which is
necessary for mirroring. You can access those images only with librbd
(e.g. mapping with the rbd-nbd driver or via qemu).

--
Mykola Golub
* Re: Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)
@ 2019-07-26 11:10 Ajitha Robert
From: Ajitha Robert @ 2019-07-26 11:10 UTC (permalink / raw)
To: Mykola Golub
Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw, ceph-users-Qp0mS5GaXlQ,
ceph-devel-u79uwXL29TY76Z2rM5mHXA

Thank you for the clarification.

But I was trying this with openstack-cinder. When I load some data
(around 50 GB) into the volume, the image sync stops at 5% or somewhere
below 15%. What could be the reason?

On Fri, Jul 26, 2019 at 3:01 PM Mykola Golub <to.my.trociny-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, Jul 26, 2019 at 12:31:59PM +0530, Ajitha Robert wrote:
> > I have an RBD mirroring setup with the primary and secondary clusters
> > configured as peers, and a pool with mirroring enabled in image mode. In
> > this pool I created an RBD image with journaling enabled.
> > But whenever I enable mirroring on the image, I get errors in
> > rbdmirror.log and osd.log.
> > I have increased the timeouts, but nothing worked, and I could not trace
> > the cause of the error.
> > Please guide me in solving this error.
> >
> > *Logs*
> > http://paste.openstack.org/show/754766/
>
> What do you mean by "nothing worked"? According to the mirroring status
> the image is mirroring: it is in the "up+stopped" state on the primary, as
> expected, and in the "up+replaying" state on the secondary with 0 entries
> behind master.
>
> The "failed to get omap key" error in the osd log is harmless, and
> just a week ago a fix was merged upstream to stop displaying it.
>
> The cause of the "InstanceWatcher: ... resending after timeout" error in
> the rbd-mirror log is not clear, but if it is not repeating it is
> harmless too.
>
> I see you were trying to map the image with krbd. It is expected to
> fail, as krbd does not support the "journaling" feature, which is
> necessary for mirroring. You can access those images only with librbd
> (e.g. mapping with the rbd-nbd driver or via qemu).
>
> --
> Mykola Golub
>

--
*Regards,Ajitha R*
* Re: Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)
@ 2019-07-26 13:25 Mykola Golub
From: Mykola Golub @ 2019-07-26 13:25 UTC (permalink / raw)
To: Ajitha Robert
Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw, ceph-users-Qp0mS5GaXlQ,
ceph-devel-u79uwXL29TY76Z2rM5mHXA

On Fri, Jul 26, 2019 at 04:40:35PM +0530, Ajitha Robert wrote:
> Thank you for the clarification.
>
> But I was trying this with openstack-cinder. When I load some data
> (around 50 GB) into the volume, the image sync stops at 5% or somewhere
> below 15%. What could be the reason?

I suppose you see the image sync stop in the mirror status output? Could
you please provide an example? And I suppose you don't see any other
messages in the rbd-mirror log apart from what you have already posted?
Depending on the configuration, rbd-mirror might write to several logs.
Could you please try to find all its logs? `lsof |grep 'rbd-mirror.*log'`
may be useful for this.

BTW, what rbd-mirror version are you running?

--
Mykola Golub
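Editorial sketch of the `lsof |grep 'rbd-mirror.*log'` hint above, for readers who want the matching log paths as a list. It applies the same pattern to lines of default `lsof` output and keeps the last column (NAME). The function name and the sample lines are hypothetical, and the sketch assumes log paths contain no spaces.

```python
import re

# Same pattern as in the message above: lsof | grep 'rbd-mirror.*log'
LOG_RE = re.compile(r"rbd-mirror.*log")


def rbd_mirror_log_paths(lsof_lines):
    """Return unique file paths from lsof lines that match the pattern.

    Assumes default lsof output, where NAME (the path) is the last
    whitespace-separated column and the path itself has no spaces.
    """
    paths = []
    for line in lsof_lines:
        if not LOG_RE.search(line):
            continue
        fields = line.split()
        if not fields:
            continue
        path = fields[-1]
        if path not in paths:  # the same file is often open more than once
            paths.append(path)
    return paths


# Hypothetical lsof lines, for illustration only.
SAMPLE = [
    "rbd-mirro 1234 ceph 5w REG 8,1 1024 131 /var/log/ceph/ceph-client.rbd-mirror.a.log",
    "rbd-mirro 1234 ceph 6w REG 8,1 1024 131 /var/log/ceph/ceph-client.rbd-mirror.a.log",
    "rbd-mirro 1234 ceph 7w REG 8,1 2048 132 /var/log/ceph/rbd-mirror-extra.log",
    "sshd 987 root 3u IPv4 1111 0t0 TCP *:22 (LISTEN)",
]

if __name__ == "__main__":
    for p in rbd_mirror_log_paths(SAMPLE):
        print(p)
```

On a real host the input would be the lines printed by `lsof` itself, for example read from standard input.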
* Re: Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)
@ 2019-07-26 13:58 Jason Dillaman
From: Jason Dillaman @ 2019-07-26 13:58 UTC (permalink / raw)
To: Mykola Golub; +Cc: Ajitha Robert, ceph-users, ceph-users, ceph-devel

On Fri, Jul 26, 2019 at 9:26 AM Mykola Golub <to.my.trociny-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
> On Fri, Jul 26, 2019 at 04:40:35PM +0530, Ajitha Robert wrote:
> > Thank you for the clarification.
> >
> > But I was trying this with openstack-cinder. When I load some data
> > (around 50 GB) into the volume, the image sync stops at 5% or somewhere
> > below 15%. What could be the reason?
>
> I suppose you see the image sync stop in the mirror status output? Could
> you please provide an example? And I suppose you don't see any other
> messages in the rbd-mirror log apart from what you have already posted?
> Depending on the configuration, rbd-mirror might write to several logs.
> Could you please try to find all its logs? `lsof |grep 'rbd-mirror.*log'`
> may be useful for this.
>
> BTW, what rbd-mirror version are you running?

From the previous thread a few days ago (not sure why a new thread was
started on this same topic), it sounded to me like one or more OSDs
aren't reachable from the secondary site:

> > Scenario 2:
> > But when I create a 50 GB volume with another Glance image, the volume
> > gets created, and in the backend I can see the RBD images on both the
> > primary and the secondary.
> >
> > From the rbd mirror image status I found that the secondary cluster
> > starts copying, and syncing got stuck at around 14%. It stays at 14%
> > with no progress at all. Should I set any parameters for this, such as
> > a timeout?
> >
> > I manually ran rbd --cluster primary object-map check <object-name>.
> > No results came back for the objects and the command hung. That is why
> > I got worried about the failed to map object key log. I could not even
> > rebuild the object map.
>
> It sounds like one or more of your primary OSDs are not reachable from
> the secondary site. If you run w/ "debug rbd-mirror = 20" and "debug
> rbd = 20", you should be able to see the last object it attempted to
> copy. From that, you could use "ceph osd map" to figure out the
> primary OSD for that object.

> --
> Mykola Golub

--
Jason
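Editorial sketch of the last step suggested above: once the debug logs show the last object rbd-mirror attempted to copy, `ceph osd map <pool> <object>` reports where that object is placed. The helper names, the pool name, and the object name below are hypothetical, and reading the primary from an `acting_primary` key in the JSON output is an assumption about the output shape, so check it against what your Ceph version actually prints.

```python
import json
import subprocess


def build_osd_map_cmd(pool, object_name, cluster=None):
    """Build the argv for `ceph osd map <pool> <object>` with JSON output."""
    cmd = ["ceph"]
    if cluster:
        # e.g. cluster="primary", as in `rbd --cluster primary ...` above
        cmd += ["--cluster", cluster]
    cmd += ["osd", "map", pool, object_name, "--format", "json"]
    return cmd


def acting_primary(osd_map_json):
    """Return the acting primary OSD id from the JSON text, or None.

    Assumption: the JSON carries an "acting_primary" key.
    """
    return json.loads(osd_map_json).get("acting_primary")


def lookup_primary_osd(pool, object_name, cluster=None):
    """Run the command against a live cluster and return the primary OSD id."""
    result = subprocess.run(
        build_osd_map_cmd(pool, object_name, cluster),
        check=True,
        capture_output=True,
        text=True,
    )
    return acting_primary(result.stdout)


if __name__ == "__main__":
    # Hypothetical pool and object names; take the real ones from the debug log.
    print(" ".join(build_osd_map_cmd("volumes", "rbd_data.1234.0000000000000000", "primary")))
```

The OSD id returned this way is the one whose address and port should be reachable from the secondary site.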
* Re: Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)
@ 2019-07-27 12:38 Ajitha Robert
From: Ajitha Robert @ 2019-07-27 12:38 UTC (permalink / raw)
To: Mykola Golub
Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw, ceph-users-Qp0mS5GaXlQ,
ceph-devel-u79uwXL29TY76Z2rM5mHXA

Thanks for the clarification.

*1) Will there be any folder related to rbd-mirroring in /var/lib/ceph?*

*2) Is ceph rbd-mirror authentication mandatory?*

*3) Whenever I create any Cinder volume loaded with a Glance image, I get
the following error:*

2019-07-27 17:26:46.762571 7f93eb0a5780 20 librbd::api::Mirror: peer_list:
2019-07-27 17:27:07.541701 7f939d7fa700 0 rbd::mirror::ImageReplayer: 0x7f93c800e9e0 [19/b6656be7-6006-4246-ba93-a49a220e33ce] handle_shut_down: remote image no longer exists: scheduling deletion
2019-07-27 17:27:16.766199 7f93eb0a5780 20 librbd::api::Mirror: peer_list:
2019-07-27 17:27:22.568970 7f939d7fa700 0 rbd::mirror::ImageReplayer: 0x7f93c800e9e0 [19/b6656be7-6006-4246-ba93-a49a220e33ce] handle_shut_down: mirror image no longer exists
2019-07-27 17:27:46.769158 7f93eb0a5780 20 librbd::api::Mirror: peer_list:
2019

*4) Finally,*
*at times I am able to create a bootable Cinder volume apart from the
above errors,*
*but at certain times I face the following:*

For example, for a 50 GB volume, the local image gets created, but a
mirror image could not be created.

The image status showed as replaying, but please see the rbd-mirror log:
http://paste.openstack.org/show/754917/

rbd-mirror.log
http://paste.openstack.org/show/754916/

On Fri, Jul 26, 2019 at 6:55 PM Mykola Golub <to.my.trociny-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, Jul 26, 2019 at 04:40:35PM +0530, Ajitha Robert wrote:
> > Thank you for the clarification.
> >
> > But I was trying this with openstack-cinder. When I load some data
> > (around 50 GB) into the volume, the image sync stops at 5% or somewhere
> > below 15%. What could be the reason?
>
> I suppose you see the image sync stop in the mirror status output? Could
> you please provide an example? And I suppose you don't see any other
> messages in the rbd-mirror log apart from what you have already posted?
> Depending on the configuration, rbd-mirror might write to several logs.
> Could you please try to find all its logs? `lsof |grep 'rbd-mirror.*log'`
> may be useful for this.
>
> BTW, what rbd-mirror version are you running?
>
> --
> Mykola Golub
>

--
*Regards,Ajitha R*
* Re: Error in ceph rbd mirroring(rbd::mirror::InstanceWatcher: C_NotifyInstanceRequestfinish: resending after timeout)
@ 2019-07-29 16:30 Mykola Golub
From: Mykola Golub @ 2019-07-29 16:30 UTC (permalink / raw)
To: Ajitha Robert
Cc: ceph-users-idqoXFIVOFJgJs9I8MT0rw, ceph-users-Qp0mS5GaXlQ,
ceph-devel-u79uwXL29TY76Z2rM5mHXA

On Sat, Jul 27, 2019 at 06:08:58PM +0530, Ajitha Robert wrote:

> *1) Will there be any folder related to rbd-mirroring in /var/lib/ceph?*

No.

> *2) Is ceph rbd-mirror authentication mandatory?*

No. But why are you asking?

> *3) Whenever I create any Cinder volume loaded with a Glance image, I get
> the following error:*
>
> 2019-07-27 17:26:46.762571 7f93eb0a5780 20 librbd::api::Mirror: peer_list:
> 2019-07-27 17:27:07.541701 7f939d7fa700 0 rbd::mirror::ImageReplayer:
> 0x7f93c800e9e0 [19/b6656be7-6006-4246-ba93-a49a220e33ce] handle_shut_down:
> remote image no longer exists: scheduling deletion
> 2019-07-27 17:27:16.766199 7f93eb0a5780 20 librbd::api::Mirror: peer_list:
> 2019-07-27 17:27:22.568970 7f939d7fa700 0 rbd::mirror::ImageReplayer:
> 0x7f93c800e9e0 [19/b6656be7-6006-4246-ba93-a49a220e33ce] handle_shut_down:
> mirror image no longer exists
> 2019-07-27 17:27:46.769158 7f93eb0a5780 20 librbd::api::Mirror: peer_list:
> 2019

The log says that the primary image was deleted for some reason and
that rbd-mirror scheduled the deletion of the secondary (mirrored)
image. The logs do not show why the primary image was deleted. It might
be cinder, but I can't exclude some bug in the rbd-mirror running on the
primary cluster, though I don't recall any issues like this.

> *At times I am able to create a bootable Cinder volume apart from the
> above errors, but at certain times I face the following:*
>
> For example, for a 50 GB volume, the local image gets created, but a
> mirror image could not be created.

"Connection timed out" errors suggest you have a connectivity issue
between sites?

--
Mykola Golub
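Editorial sketch for spotting the `handle_shut_down` events discussed above in a longer rbd-mirror log. It matches only the line shape quoted in this thread. The function name and group names are hypothetical, and reading the bracketed prefix as "pool id / global image id" is an assumption.

```python
import re

# Matches ImageReplayer lines of the shape quoted in this thread, e.g.
#   <date> <time> <thread> <level> rbd::mirror::ImageReplayer: <ptr> [<pool>/<id>] handle_shut_down: <message>
SHUT_DOWN_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"\S+\s+\d+ rbd::mirror::ImageReplayer: \S+ "
    r"\[(?P<pool>[^/\]]+)/(?P<image_id>[^\]]+)\] "  # assumed: pool id / global image id
    r"handle_shut_down: (?P<message>.*)$"
)


def find_shut_down_events(lines):
    """Return one dict (ts, pool, image_id, message) per matching log line."""
    events = []
    for line in lines:
        match = SHUT_DOWN_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events


# The log lines posted in this thread, each on a single line.
SAMPLE_LOG = [
    "2019-07-27 17:26:46.762571 7f93eb0a5780 20 librbd::api::Mirror: peer_list:",
    "2019-07-27 17:27:07.541701 7f939d7fa700 0 rbd::mirror::ImageReplayer: 0x7f93c800e9e0 [19/b6656be7-6006-4246-ba93-a49a220e33ce] handle_shut_down: remote image no longer exists: scheduling deletion",
    "2019-07-27 17:27:16.766199 7f93eb0a5780 20 librbd::api::Mirror: peer_list:",
    "2019-07-27 17:27:22.568970 7f939d7fa700 0 rbd::mirror::ImageReplayer: 0x7f93c800e9e0 [19/b6656be7-6006-4246-ba93-a49a220e33ce] handle_shut_down: mirror image no longer exists",
]

if __name__ == "__main__":
    for event in find_shut_down_events(SAMPLE_LOG):
        print(event["ts"], event["image_id"], event["message"])
```

Grouping the events by `image_id` makes it easier to line each deletion up with whatever happened to the corresponding volume on the primary site at that time.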