* SIGTERM and osd close
From: Ramesh Chander @ 2016-06-30 15:45 UTC
To: ceph-devel@vger.kernel.org

Hi All,

When I use stop.sh without any argument, I suppose it calls pkill with SIGTERM on the OSDs as well as the other processes.

Does the OSD handle this signal and take care of closing all components?

I am specifically interested in whether it closes objectstore -> keyvaluedb.

I don't see my keyvaluedb shutdown/close code being called when I run ./stop.sh.

Is there any argument or other way to force this?

-Ramesh

PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).

* Re: SIGTERM and osd close
From: Piotr Dałek @ 2016-06-30 16:16 UTC
To: Ramesh Chander
Cc: ceph-devel@vger.kernel.org

On Thu, Jun 30, 2016 at 03:45:56PM +0000, Ramesh Chander wrote:
> Hi All,
>
> When I use stop.sh without any argument, I suppose it calls pkill with SIGTERM on osds as well as other processes.
>
> Does osd handle this signal and take care of closing all components?

Yes, it does a clean shutdown of the affected daemons. You can see this by measuring the time between sending the signal and the OSD daemon actually going away (usually a few seconds).

--
Piotr Dałek
branch@predictor.org.pl
http://blog.predictor.org.pl
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
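
The timing check Piotr describes can be tried on a toy process before pointing it at a real OSD. The sketch below is C++ and rests on several assumptions: `time_sigterm_shutdown` and `StopResult` are hypothetical names, the child process only imitates a daemon by waiting for SIGTERM and then sleeping for a pretend cleanup period, and nothing here is taken from Ceph or stop.sh. A stop that takes measurable time and ends in a normal exit is a hint, not proof, that a handler ran.

```cpp
#include <cassert>
#include <chrono>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Result of one stop attempt against the toy child process.
struct StopResult {
  bool exited_cleanly;   // true if the child left via _exit(0), not via the signal
  long long elapsed_ms;  // time from kill() until waitpid() returned
};

// Hypothetical helper: fork a child that waits for SIGTERM, spends
// `cleanup_ms` on pretend cleanup, then exits. The parent times the stop.
StopResult time_sigterm_shutdown(unsigned cleanup_ms) {
  sigset_t term, old;
  sigemptyset(&term);
  sigaddset(&term, SIGTERM);
  // Block SIGTERM before fork() so the child inherits the mask and the
  // default action cannot kill it before it reaches sigwait().
  sigprocmask(SIG_BLOCK, &term, &old);

  pid_t child = fork();
  if (child < 0) {
    sigprocmask(SIG_SETMASK, &old, nullptr);
    return {false, -1};
  }
  if (child == 0) {
    int sig = 0;
    sigwait(&term, &sig);       // toy daemon: wait for the stop request
    usleep(cleanup_ms * 1000);  // stand-in for real cleanup work
    _exit(0);
  }
  sigprocmask(SIG_SETMASK, &old, nullptr);  // restore the parent's mask

  auto start = std::chrono::steady_clock::now();
  kill(child, SIGTERM);
  int status = 0;
  waitpid(child, &status, 0);
  auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
      std::chrono::steady_clock::now() - start);
  bool clean = WIFEXITED(status) && WEXITSTATUS(status) == 0;
  return {clean, static_cast<long long>(elapsed.count())};
}
```

Calling `time_sigterm_shutdown(100)` should report a clean exit and an elapsed time of roughly the cleanup period. A process that dies from the signal's default action would instead show `exited_cleanly == false` and a near-zero time.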

* RE: SIGTERM and osd close
From: Somnath Roy @ 2016-06-30 18:44 UTC
To: Ramesh Chander, ceph-devel@vger.kernel.org

I guess you need to call it from BlueStore::umount() for the cleanup work.

* Re: SIGTERM and osd close
From: Brad Hubbard @ 2016-06-30 22:18 UTC
To: Somnath Roy
Cc: Ramesh Chander, ceph-devel@vger.kernel.org

On Fri, Jul 1, 2016 at 4:44 AM, Somnath Roy <Somnath.Roy@sandisk.com> wrote:
> You need to call it from BlueStore::umount() I guess for cleanup work..
>
> -----Original Message-----
> From: Ramesh Chander
> Subject: SIGTERM and osd close
>
> When I use stop.sh without any argument, I suppose it calls pkill with SIGTERM on osds as well as other processes.

 616   // install signal handlers
 617   init_async_signal_handler();
 618   register_async_signal_handler(SIGHUP, sighup_handler);
 619   register_async_signal_handler_oneshot(SIGINT, handle_osd_signal);
 620   register_async_signal_handler_oneshot(SIGTERM, handle_osd_signal);

  65 void handle_osd_signal(int signum)
  66 {
  67   if (osd)
  68     osd->handle_signal(signum);
  69 }

1735 void OSD::handle_signal(int signum)
1736 {
1737   assert(signum == SIGINT || signum == SIGTERM);
1738   derr << "*** Got signal " << sig_str(signum) << " ***" << dendl;
1739   shutdown();
1740 }

2598 int OSD::shutdown()
2599 {

OSD::shutdown() in src/osd/OSD.cc is a large function that does a lot of cleanup: draining and shutting down the thread pool work queues, shutting down messenger instances, un-registering admin commands, shutting down the PGs, flushing outstanding ops, updating the superblock and unmounting the object store (as Somnath mentioned, this might be where you want to look), shutting down the MON client and clearing the peering work queue. I have listed these in no particular order.

So there is no doubt that the OSD (and other daemons such as MON and MDS) intercept this signal and perform a graceful shutdown, including many housekeeping tasks.

HTH,
Brad
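
The chain Brad quotes (register a one-shot handler, then handle_osd_signal -> OSD::handle_signal -> shutdown()) can be modelled in a few lines. This is only a sketch under stated assumptions: `ToyDaemon` and `OneshotSignalWaiter` are hypothetical names, and the waiter uses a plain sigwait() thread so the handler runs as ordinary code instead of in signal context. It is not how Ceph's register_async_signal_handler_oneshot() is implemented; it only shows the same shape of control flow.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <signal.h>
#include <thread>
#include <unistd.h>

// Toy stand-in for the OSD; shutdown() only counts how often it ran.
struct ToyDaemon {
  std::atomic<int> shutdown_calls{0};
  int last_signal = 0;

  void handle_signal(int signum) {
    assert(signum == SIGINT || signum == SIGTERM);
    last_signal = signum;
    shutdown();
  }
  void shutdown() { ++shutdown_calls; }
};

// Hypothetical one-shot waiter: blocks `signum` in the calling thread (threads
// created afterwards inherit the mask), then a worker thread picks the signal
// up with sigwait() and runs `handler` once, outside signal context.
class OneshotSignalWaiter {
 public:
  OneshotSignalWaiter(int signum, std::function<void(int)> handler) {
    sigemptyset(&set_);
    sigaddset(&set_, signum);
    pthread_sigmask(SIG_BLOCK, &set_, nullptr);
    worker_ = std::thread([this, handler] {
      int received = 0;
      if (sigwait(&set_, &received) == 0) handler(received);
    });
  }
  void join() {
    if (worker_.joinable()) worker_.join();
  }
  ~OneshotSignalWaiter() { join(); }

 private:
  sigset_t set_;
  std::thread worker_;
};

// Send SIGTERM to this process and report how many times shutdown() ran.
int run_sigterm_demo() {
  ToyDaemon daemon;
  OneshotSignalWaiter waiter(SIGTERM,
                             [&daemon](int s) { daemon.handle_signal(s); });
  kill(getpid(), SIGTERM);
  waiter.join();
  return daemon.shutdown_calls.load();
}
```

`run_sigterm_demo()` should return 1: the signal is consumed by the worker thread and shutdown() runs exactly once. Build with thread support enabled (for example `-pthread`). Note that the sketch leaves SIGTERM blocked in the calling thread afterwards.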

* RE: SIGTERM and osd close
From: Ramesh Chander @ 2016-07-01 6:01 UTC
To: Brad Hubbard, Somnath Roy
Cc: ceph-devel@vger.kernel.org

Thank you all for the replies.

Brad,

I will trace the code path you pointed out.

-Regards,
Ramesh

* RE: SIGTERM and osd close
From: Sage Weil @ 2016-07-01 14:01 UTC
To: Ramesh Chander
Cc: Brad Hubbard, Somnath Roy, ceph-devel@vger.kernel.org

On Fri, 1 Jul 2016, Ramesh Chander wrote:
> Thank you all for reply,
>
> Brad,
>
> I should trace the code path you pointed out.

In this case, the important bit is BlueFS::umount(), which calls BlueFS::_stop_alloc(). BlueStore::_close_db() should be calling bluefs->umount(). Any of the unit tests should be triggering these code paths.

sage
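
The nesting Sage describes can be written down as a toy call chain that records each step, which is also a cheap way to check that a newly added close hook sits on the path. The sketch below is an illustration only: the `Toy*` classes and `run_umount_trace` are hypothetical, the method bodies are empty apart from tracing, and the step from BlueStore::umount() to _close_db() is an assumption (the thread only states that _close_db() should call bluefs->umount(), which calls _stop_alloc()).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which teardown steps run.
using Trace = std::vector<std::string>;

// Toy BlueFS: umount() releases the allocator state via _stop_alloc().
class ToyBlueFS {
 public:
  explicit ToyBlueFS(Trace& t) : trace_(t) {}
  void umount() {
    trace_.push_back("BlueFS::umount");
    _stop_alloc();
  }

 private:
  void _stop_alloc() { trace_.push_back("BlueFS::_stop_alloc"); }
  Trace& trace_;
};

// Toy BlueStore: umount() closes the key/value DB, which unmounts BlueFS.
// (That umount() reaches _close_db() is assumed here, not shown in the thread.)
class ToyBlueStore {
 public:
  explicit ToyBlueStore(Trace& t) : trace_(t), bluefs_(t) {}
  void umount() {
    trace_.push_back("BlueStore::umount");
    _close_db();
  }

 private:
  void _close_db() {
    trace_.push_back("BlueStore::_close_db");
    bluefs_.umount();
  }
  Trace& trace_;
  ToyBlueFS bluefs_;
};

// Run the toy teardown and return the recorded call order.
Trace run_umount_trace() {
  Trace trace;
  ToyBlueStore store(trace);
  store.umount();
  return trace;
}
```

In this model the trace has four entries, starting at BlueStore::umount and ending at BlueFS::_stop_alloc. A key/value DB close hook that never shows up in such a trace is either not on this path or the path is never entered, which is the distinction the rest of the thread goes on to investigate.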

* RE: SIGTERM and osd close
From: Ramesh Chander @ 2016-07-11 6:41 UTC
To: Sage Weil
Cc: Brad Hubbard, Somnath Roy, ceph-devel@vger.kernel.org

I tried to trace the shutdown call in the OSD on SIGTERM.

It seems the shutdown call never reaches BlueStore::umount().

Steps:

1. Start the OSD.
2. Attach gdb and set breakpoints:

   (gdb) info b
   Num     Type           Disp Enb Address            What
   1       breakpoint     keep y   0x00007f5f8ba5e8a0 in OSD::shutdown() at osd/OSD.cc:2599
   2       breakpoint     keep y   0x00007f5f8bd65160 in BlueStore::umount() at os/bluestore/BlueStore.cc:2686

3. Trigger stop.sh.

Breakpoint 1 is hit, but the second breakpoint is never hit. It gets stuck somewhere in this call:

   #0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
   #1  0x00007f5f8ba515b5 in WaitUntil (when=..., mutex=..., this=0x7f5f95428e20) at ./common/Cond.h:72
   #2  OSDService::prepare_to_stop (this=this@entry=0x7f5f954275c8) at osd/OSD.cc:1174
   #3  0x00007f5f8ba5e8cb in OSD::shutdown (this=this@entry=0x7f5f95426000) at osd/OSD.cc:2600
   #4  0x00007f5f8ba604d0 in OSD::handle_signal (this=0x7f5f95426000, signum=<optimized out>) at osd/OSD.cc:1739
   #5  0x00007f5f8c0209b7 in SignalHandler::entry (this=0x7f5f952a8560) at global/signal_handler.cc:252
   #6  0x00007f5f89f19182 in start_thread (arg=0x7f5f686d7700) at pthread_create.c:312
   #7  0x00007f5f87e2c00d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Any idea what is happening? I think it works with recovery in place but never does a graceful shutdown?

Or am I missing something here?

-Ramesh

* RE: SIGTERM and osd close
From: Somnath Roy @ 2016-07-11 7:14 UTC
To: Ramesh Chander, Sage Weil
Cc: Brad Hubbard, ceph-devel@vger.kernel.org

It seems communication with the monitor is not happening, and the OSD is stuck there. Is your monitor running?

I am not very familiar with vstart as I rarely use it, but if I remember correctly, stop.sh stops all the services, including the mon.

I can see that store->umount() is called from OSD::shutdown(), so it should clean up properly.

If you send a kill signal (without -9, of course) to the OSD directly instead of running stop.sh, it should execute the shutdown properly.

Thanks & Regards
Somnath
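
Somnath's reading is that the OSD is waiting to hear from the monitor. The thread does not show the code of OSDService::prepare_to_stop(), so the following is a sketch under an assumption: that the pthread_cond_timedwait frame in the backtrace is a bounded wait for an acknowledgement, after which shutdown continues either way. `StopGate` and `shutdown_got_ack` are hypothetical names, not Ceph identifiers.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical gate for "wait until the monitor acknowledges, but not forever".
class StopGate {
 public:
  // Called when the acknowledgement arrives.
  void ack() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      acked_ = true;
    }
    cond_.notify_all();
  }

  // Returns true if the ack arrived in time, false if the wait timed out.
  // Either way the caller would carry on with the rest of shutdown.
  bool wait_for_ack(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    return cond_.wait_for(lock, timeout, [this] { return acked_; });
  }

 private:
  std::mutex mutex_;
  std::condition_variable cond_;
  bool acked_ = false;
};

// Model one shutdown attempt; `monitor_reachable` decides whether an ack
// is delivered before the wait starts.
bool shutdown_got_ack(bool monitor_reachable,
                      std::chrono::milliseconds timeout) {
  StopGate gate;
  if (monitor_reachable) gate.ack();
  return gate.wait_for_ack(timeout);
}
```

With a reachable monitor the wait returns at once. With the monitor already gone, the caller sits in the timed wait for the full timeout. If something else terminates the process during that window, the later teardown steps never run, which would match a breakpoint in BlueStore::umount() never being hit.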
* RE: SIGTERM and osd close 2016-07-11 7:14 ` Somnath Roy @ 2016-07-11 7:54 ` Ramesh Chander 0 siblings, 0 replies; 9+ messages in thread From: Ramesh Chander @ 2016-07-11 7:54 UTC (permalink / raw) To: Somnath Roy, Sage Weil; +Cc: Brad Hubbard, ceph-devel@vger.kernel.org

You are right, using kill on the osd process does call shutdown -> umount.

I think there is some ordering problem in stop.sh.

-Regards,
Ramesh

> -----Original Message-----
> From: Somnath Roy
> Sent: Monday, July 11, 2016 12:44 PM
> To: Ramesh Chander; Sage Weil
> Cc: Brad Hubbard; ceph-devel@vger.kernel.org
> Subject: RE: SIGTERM and osd close
>
> Communication to the monitor is not happening, it seems; it is stuck there.
> Is your monitor running?
> I am not super familiar with vstart as I rarely use it, but, if I remember
> correctly, stop.sh stops all the services including mon.
> I can see store->umount is being called from OSD::shutdown(), so it should
> clean up properly.
> Without running stop.sh, if you send a kill signal (without -9, of course)
> to the osd, it should execute shutdown properly.
>
> Thanks & Regards
> Somnath
>
> -----Original Message-----
> From: Ramesh Chander
> Sent: Sunday, July 10, 2016 11:42 PM
> To: Sage Weil
> Cc: Brad Hubbard; Somnath Roy; ceph-devel@vger.kernel.org
> Subject: RE: SIGTERM and osd close
>
> I tried to trace down the shutdown call with signal SIGTERM in osd.
>
> It seems the shutdown call never reached BlueStore::umount.
>
> Steps:
>
> 1. Start osd.
> 2. Attach gdb and set breakpoints:
> (gdb) info b
> Num Type       Disp Enb Address            What
> 1   breakpoint keep y   0x00007f5f8ba5e8a0 in OSD::shutdown() at osd/OSD.cc:2599
> 2   breakpoint keep y   0x00007f5f8bd65160 in BlueStore::umount() at os/bluestore/BlueStore.cc:2686
>
> 3. Trigger stop.sh
>
> Breakpoint 1 is hit, but the second breakpoint is never hit.
> It gets stuck somewhere in this call:
>
> #0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
> #1 0x00007f5f8ba515b5 in WaitUntil (when=..., mutex=..., this=0x7f5f95428e20) at ./common/Cond.h:72
> #2 OSDService::prepare_to_stop (this=this@entry=0x7f5f954275c8) at osd/OSD.cc:1174
> #3 0x00007f5f8ba5e8cb in OSD::shutdown (this=this@entry=0x7f5f95426000) at osd/OSD.cc:2600
> #4 0x00007f5f8ba604d0 in OSD::handle_signal (this=0x7f5f95426000, signum=<optimized out>) at osd/OSD.cc:1739
> #5 0x00007f5f8c0209b7 in SignalHandler::entry (this=0x7f5f952a8560) at global/signal_handler.cc:252
> #6 0x00007f5f89f19182 in start_thread (arg=0x7f5f686d7700) at pthread_create.c:312
> #7 0x00007f5f87e2c00d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>
> Any idea what is happening? I think it is working with recovery in place
> but never does a graceful shutdown?
>
> Or am I missing anything here?
>
> -Ramesh
>
> > -----Original Message-----
> > From: Sage Weil [mailto:sweil@redhat.com]
> > Sent: Friday, July 01, 2016 7:32 PM
> > To: Ramesh Chander
> > Cc: Brad Hubbard; Somnath Roy; ceph-devel@vger.kernel.org
> > Subject: RE: SIGTERM and osd close
> >
> > On Fri, 1 Jul 2016, Ramesh Chander wrote:
> > > Thank you all for the replies,
> > >
> > > Brad,
> > >
> > > I should trace the code path you pointed out.
> >
> > In this case, the important bit is BlueFS::umount(), which calls
> > BlueFS::_stop_alloc(). BlueStore::_close_db() should be calling
> > bluefs->umount(). Any of the unit tests should be triggering these
> > bluefs code paths.
> >
> > sage
> >
> > >
> > > -Regards,
> > > Ramesh
> > >
> > > > -----Original Message-----
> > > > From: Brad Hubbard [mailto:bhubbard@redhat.com]
> > > > Sent: Friday, July 01, 2016 3:48 AM
> > > > To: Somnath Roy
> > > > Cc: Ramesh Chander; ceph-devel@vger.kernel.org
> > > > Subject: Re: SIGTERM and osd close
> > > >
> > > > On Fri, Jul 1, 2016 at 4:44 AM, Somnath Roy <Somnath.Roy@sandisk.com> wrote:
> > > > > You need to call it from BlueStore::umount() I guess for cleanup work..
> > > > >
> > > > > -----Original Message-----
> > > > > From: ceph-devel-owner@vger.kernel.org [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Ramesh Chander
> > > > > Sent: Thursday, June 30, 2016 8:46 AM
> > > > > To: ceph-devel@vger.kernel.org
> > > > > Subject: SIGTERM and osd close
> > > > >
> > > > > Hi All,
> > > > >
> > > > > When I use stop.sh without any argument, I suppose it calls pkill with
> > > > > SIGTERM on osds as well as other processes.
> > > >
> > > >  616   // install signal handlers
> > > >  617   init_async_signal_handler();
> > > >  618   register_async_signal_handler(SIGHUP, sighup_handler);
> > > >  619   register_async_signal_handler_oneshot(SIGINT, handle_osd_signal);
> > > >  620   register_async_signal_handler_oneshot(SIGTERM, handle_osd_signal);
> > > >
> > > >   65 void handle_osd_signal(int signum)
> > > >   66 {
> > > >   67   if (osd)
> > > >   68     osd->handle_signal(signum);
> > > >   69 }
> > > >
> > > > 1735 void OSD::handle_signal(int signum)
> > > > 1736 {
> > > > 1737   assert(signum == SIGINT || signum == SIGTERM);
> > > > 1738   derr << "*** Got signal " << sig_str(signum) << " ***" << dendl;
> > > > 1739   shutdown();
> > > > 1740 }
> > > >
> > > > 2598 int OSD::shutdown()
> > > > 2599 {
> > > >
> > > > OSD::shutdown() in src/osd/OSD.cc is quite a large function that
> > > > performs quite a bit of clean up such as draining and shutting
> > > > down thread pool work queues, shutting down messenger instances,
> > > > un-registering admin commands, shutting down the PGs, flushing
> > > > outstanding ops, updating the superblock and unmounting the
> > > > filestore (as Somnath mentioned this might be where you want to
> > > > look), shutting down the MON client and clearing the peering work
> > > > queue, in no particular order.
> > > >
> > > > So there is no doubt the OSD (and other daemons such as MON and MDS)
> > > > intercepts this signal and performs a graceful shutdown including
> > > > many housekeeping tasks.
> > > >
> > > > HTH,
> > > > Brad
> > > >
> > > > > Does osd handle this signal and take care of closing all components?
> > > > >
> > > > > I am specifically interested in if it closes objectstore -> keyvaluedb .
> > > > >
> > > > > I don't see my code of keyvaluedb shutdown/close being called
> > > > > when I do ./stop.sh
> > > > >
> > > > > Any argument or way to force this?
> > > > >
> > > > > -Ramesh
> > > >
> > > > --
> > > > Cheers,
> > > > Brad

^ permalink raw reply [flat|nested] 9+ messages in thread
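The backtrace in this thread shows OSD::shutdown() parked in a timed condition wait inside OSDService::prepare_to_stop(). A minimal sketch of that kind of wait follows, assuming (as Somnath's reply suggests, but the thread does not confirm) that the wait is for an acknowledgement from a peer such as the monitor. StopGate and both helper functions are hypothetical names, not Ceph code, and the timeouts are arbitrary.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical model of "ask a peer before stopping, but do not wait forever".
class StopGate {
 public:
  // Called by the peer to acknowledge the stop request.
  void ack() {
    {
      std::lock_guard<std::mutex> l(m_);
      acked_ = true;
    }
    cv_.notify_all();
  }

  // Returns true if the ack arrived within the timeout, false if the wait
  // ran to its deadline without one.
  bool prepare_to_stop(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> l(m_);
    return cv_.wait_for(l, timeout, [this] { return acked_; });
  }

 private:
  std::mutex m_;
  std::condition_variable cv_;
  bool acked_ = false;
};

// Peer still running: the ack arrives and the wait ends early.
bool stop_with_peer_alive() {
  StopGate gate;
  std::thread peer([&] { gate.ack(); });
  bool acked = gate.prepare_to_stop(std::chrono::milliseconds(2000));
  peer.join();
  return acked;
}

// Peer already stopped: nobody acks, so the caller sits in the wait until
// the timeout expires and only then can continue with the rest of shutdown.
bool stop_with_peer_gone() {
  StopGate gate;
  return gate.prepare_to_stop(std::chrono::milliseconds(50));
}
```

Under this assumption, stopping the peer before the waiting daemon makes the daemon spend the full timeout in the wait before it reaches its store umount, which would be consistent with an ordering problem in stop.sh. Whether that is what happens in the real script is not established in the thread.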
end of thread, other threads:[~2016-07-11 7:54 UTC | newest]

Thread overview: 9+ messages
2016-06-30 15:45 SIGTERM and osd close Ramesh Chander
2016-06-30 16:16 ` Piotr Dałek
2016-06-30 18:44 ` Somnath Roy
2016-06-30 22:18 ` Brad Hubbard
2016-07-01 6:01 ` Ramesh Chander
2016-07-01 14:01 ` Sage Weil
2016-07-11 6:41 ` Ramesh Chander
2016-07-11 7:14 ` Somnath Roy
2016-07-11 7:54 ` Ramesh Chander