* About Adding eventfd support for LibRBD
From: Haomai Wang @ 2015-07-07 15:18 UTC
To: Josh Durgin, ceph-devel@vger.kernel.org; +Cc: Jason Dillaman

Hi All,

Currently librbd supports aio_read/write with a specified callback (AioCompletion). That keeps simple caller logic simple, but it also has some problems:

1. Performance bottleneck: creating and freeing an AioCompletion, and having the librbd internal finisher thread run the "callback", is not a very lightweight job, especially when the callback needs to update some state while holding a lock.

2. Call logic: usually, as in the fio rbd engine, the caller maintains some state for each io, and the rbd callback isn't enough to finish all the work related to that io. For example, the caller has to naively re-check each queued io after the rbd callback has finished.

So maybe we could add a new API that supports eventfd, so the caller could add the eventfd to its event loop, reap finished io events in batches, and then update its state or do further work.

Any feedback is appreciated!

--
Best Regards,
Wheat
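The batching idea in the proposal above relies on how a Linux eventfd counter behaves: each write adds to the counter, and a single read returns the accumulated total and resets it. A minimal sketch of that behavior follows. The helper names `signal_completion` and `reap_completions` are hypothetical and are not part of librbd.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: signal one finished io by adding 1 to the
 * eventfd counter. Returns 0 on success, -1 on error. */
static int signal_completion(int efd)
{
    uint64_t one = 1;
    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Hypothetical helper: reap all pending signals with a single read.
 * In the default (non-semaphore) mode the read returns the whole
 * accumulated counter and resets it to zero. Returns the count,
 * 0 if nothing is pending on a nonblocking fd, or -1 on error. */
static int64_t reap_completions(int efd)
{
    uint64_t count = 0;
    ssize_t n = read(efd, &count, sizeof(count));
    if (n == (ssize_t)sizeof(count))
        return (int64_t)count;
    if (n < 0 && errno == EAGAIN)
        return 0;
    return -1;
}
```

With an fd created by `eventfd(0, EFD_NONBLOCK)`, three calls to `signal_completion` followed by one `reap_completions` yield 3, so a caller wakes up once for a batch of completions rather than once per io.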
* Re: About Adding eventfd support for LibRBD
From: Josh Durgin @ 2015-07-08 3:08 UTC
To: Haomai Wang, ceph-devel@vger.kernel.org; +Cc: Jason Dillaman

On 07/07/2015 08:18 AM, Haomai Wang wrote:
> So maybe we could add a new API that supports eventfd, so the caller
> could add the eventfd to its event loop, reap finished io events in
> batches, and then update its state or do further work.
>
> Any feedback is appreciated!

It seems like a good idea to me. I'm not sure how much overhead it avoids, but letting the callers check status from their own threads is much nicer in general.

I'd be curious how much overhead the callback + finisher add. If it's significant, it might make sense to add similar eventfd interfaces lower in the stack too.

Josh
* Re: About Adding eventfd support for LibRBD
From: Haomai Wang @ 2015-07-08 3:46 UTC
To: Josh Durgin; +Cc: ceph-devel@vger.kernel.org, Jason Dillaman

On Wed, Jul 8, 2015 at 11:08 AM, Josh Durgin <jdurgin@redhat.com> wrote:
> It seems like a good idea to me. I'm not sure how much overhead it
> avoids, but letting the callers check status from their own threads
> is much nicer in general.
>
> I'd be curious how much overhead the callback + finisher add. If it's
> significant, it might make sense to add similar eventfd interfaces
> lower in the stack too.

Intuitively, in a high-iodepth benchmark the non-callback approach could remove a lot of "extra callback latency", because the new approach can batch completions. Another performance benefit, I think, is on the caller side: the new approach lets complex io-completion work avoid the "callback lock" and cuts extra logic. Finally, a callback mostly just needs to wake up the caller thread to do the next thing, so it would be great if with the new approach librbd could do that via eventfd.

--
Best Regards,
Wheat
* Re: About Adding eventfd support for LibRBD
From: Haomai Wang @ 2015-07-10 3:16 UTC
To: Josh Durgin; +Cc: ceph-devel@vger.kernel.org, Jason Dillaman

I made a simple draft about adding async event notification support for librbd:

The initial idea is to avoid changing the existing APIs much. So we could add a new API like:

    struct {
        int result;
        void *userdata;
        ......
    } rbd_aio_event;

    int poll_io_events(ImageCtx *ictx, rbd_aio_event *events,
                       int numevents, struct timespec *timeout);

    int set_image_notification(ImageCtx *ictx, void *handler,
                               enum notification_type);

It is a little tricky: once the user has called "set_image_notification" successfully, the user can call aio_write/read with a specified userdata (the original callback argument pointer). When an io finishes, a librbd internal thread will post an async event to the "eventfd" using the specified mechanism (notification_type). For example, linux/bsd would use [eventfd](http://man7.org/linux/man-pages/man2/eventfd.2.html), solaris could use [port_send](http://docs.oracle.com/cd/E23823_01/html/816-5168/port-send-3c.html#scrolltoc), and windows could use the iocp method [PostQueuedCompletionStatus](https://msdn.microsoft.com/en-us/library/windows/desktop/aa365458(v=vs.85).aspx).

If a client uses rbd without calling "set_image_notification", a call to "poll_io_events" will return -EOPNOTSUPP.

--
Best Regards,
Wheat
* Re: About Adding eventfd support for LibRBD
From: Jason Dillaman @ 2015-07-13 13:52 UTC
To: Haomai Wang; +Cc: Josh Durgin, ceph-devel

I was originally thinking that you were just proposing to have librbd write to the eventfd descriptor when your AIO op completed so that you could hook librbd callbacks into an existing app poll loop. If librbd is doing the polling via poll_io_events, I guess I don't see why you would even need to use eventfd.

--
Jason Dillaman
Red Hat
dillaman@redhat.com
http://www.redhat.com
* Re: About Adding eventfd support for LibRBD
From: Haomai Wang @ 2015-07-13 17:14 UTC
To: Jason Dillaman; +Cc: Josh Durgin, ceph-devel@vger.kernel.org

On Mon, Jul 13, 2015 at 9:52 PM, Jason Dillaman <dillaman@redhat.com> wrote:
> I was originally thinking that you were just proposing to have librbd
> write to the eventfd descriptor when your AIO op completed so that you
> could hook librbd callbacks into an existing app poll loop. If librbd
> is doing the polling via poll_io_events, I guess I don't see why you
> would even need to use eventfd.

Sorry, I'm not following. Even if we have poll_io_events, we need to know when to call "poll_io_events".

I guess you mean we could notify the user's "fd" from the rbd callback. Yes, we could do this. But the extra rbd callback could be omitted if we embed standard notification methods: we could get a performance benefit from notifying inline, and maybe we could also reduce the internal completion structures (maybe?).

--
Best Regards,
Wheat
* Re: About Adding eventfd support for LibRBD
From: Jason Dillaman @ 2015-07-13 17:32 UTC
To: Haomai Wang; +Cc: Josh Durgin, ceph-devel

> Sorry, I'm not following. Even if we have poll_io_events, we need to
> know when to call "poll_io_events".
>
> I guess you mean we could notify the user's "fd" from the rbd callback.
> Yes, we could do this. But the extra rbd callback could be omitted if
> we embed standard notification methods: we could get a performance
> benefit from notifying inline, and maybe we could also reduce the
> internal completion structures (maybe?).

Perhaps if you could provide a full example of how you see this being used, it would be helpful. Are you proposing to just use eventfd (and its equivalents under other OSs) in lieu of a mutex/condition variable? I.e. when an AioCompletion finishes, it would add itself to a (potentially lock-free) list of completed AIO ops to be returned by the next "poll_io_events" invocation -- which is signaled via the eventfd mechanism instead of using a pipe or lock/condition variable.
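The arrangement Jason describes, a list of completed ops drained by a poll call and signaled through an eventfd, could look roughly like the sketch below. This is an illustration only and not librbd code: `struct aio_event`, `struct completion_queue`, `cq_init`, `cq_push` and `cq_poll` are hypothetical names, and a plain mutex stands in for the "potentially lock-free" list to keep the sketch short.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical completion record, loosely modeled on the draft
 * rbd_aio_event from earlier in the thread. */
struct aio_event {
    int result;
    void *userdata;
    struct aio_event *next;
};

/* Hypothetical queue of finished ops plus the eventfd that signals it. */
struct completion_queue {
    pthread_mutex_t lock;
    struct aio_event *head, *tail;
    int efd; /* the application adds this fd to its own event loop */
};

static int cq_init(struct completion_queue *cq)
{
    cq->head = cq->tail = NULL;
    cq->efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (cq->efd < 0)
        return -1;
    if (pthread_mutex_init(&cq->lock, NULL) != 0) {
        close(cq->efd);
        return -1;
    }
    return 0;
}

/* Library side: enqueue a finished op, then bump the eventfd.
 * Enqueueing before signaling means a wakeup is never lost. */
static int cq_push(struct completion_queue *cq, int result, void *userdata)
{
    struct aio_event *ev = malloc(sizeof(*ev));
    if (ev == NULL)
        return -1;
    ev->result = result;
    ev->userdata = userdata;
    ev->next = NULL;

    pthread_mutex_lock(&cq->lock);
    if (cq->tail != NULL)
        cq->tail->next = ev;
    else
        cq->head = ev;
    cq->tail = ev;
    pthread_mutex_unlock(&cq->lock);

    uint64_t one = 1;
    return write(cq->efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Application side: clear the eventfd, then copy out up to max events.
 * Returns the number of events copied, or -1 on error. */
static int cq_poll(struct completion_queue *cq, struct aio_event *out, int max)
{
    uint64_t counter;
    if (read(cq->efd, &counter, sizeof(counter)) < 0 && errno != EAGAIN)
        return -1;

    int n = 0;
    pthread_mutex_lock(&cq->lock);
    while (n < max && cq->head != NULL) {
        struct aio_event *ev = cq->head;
        cq->head = ev->next;
        if (cq->head == NULL)
            cq->tail = NULL;
        out[n] = *ev;
        out[n].next = NULL;
        n++;
        free(ev);
    }
    int more = cq->head != NULL;
    pthread_mutex_unlock(&cq->lock);

    if (more) {
        /* Re-arm so the event loop wakes up again for the leftovers. */
        uint64_t one = 1;
        if (write(cq->efd, &one, sizeof(one)) < 0) {
            /* Ignored in this sketch. */
        }
    }
    return n;
}
```

Because `cq_poll` clears the counter before draining, a push that races with it can leave the fd readable while the list is already empty. The next `cq_poll` then simply returns 0, which is harmless for an event loop.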
* Re: About Adding eventfd support for LibRBD
From: Milosz Tanski @ 2015-07-13 18:16 UTC
To: Jason Dillaman; +Cc: Haomai Wang, Josh Durgin, ceph-devel

On Mon, Jul 13, 2015 at 1:32 PM, Jason Dillaman <dillaman@redhat.com> wrote:
> Perhaps if you could provide a full example of how you see this being
> used, it would be helpful. Are you proposing to just use eventfd (and
> its equivalents under other OSs) in lieu of a mutex/condition
> variable? I.e. when an AioCompletion finishes, it would add itself to
> a (potentially lock-free) list of completed AIO ops to be returned by
> the next "poll_io_events" invocation -- which is signaled via the
> eventfd mechanism instead of using a pipe or lock/condition variable.

A lock-free list (SPSC) plus an eventcount to prevent busy waiting is probably the best balance between performance and not busy-spinning in the free / full case. But it doesn't provide an easily composable way of integrating waiting on other events in the application. eventfd is easy to embed in your (e)poll loop or any kind of event library (libev).

--
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016
p: 646-253-9055
e: milosz@adfin.com
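To illustrate the composability point, here is a minimal sketch of an application epoll loop that treats a completion eventfd as one more fd among its others. `watch_fd` and `loop_once` are hypothetical helper names. The epoll and eventfd calls are standard Linux APIs, and nothing here is an existing librbd interface.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Register fd with the epoll instance for readability events. */
static int watch_fd(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Run one iteration of the application's event loop. Returns the number
 * of completions signaled on efd during this iteration, 0 on timeout or
 * when only other fds were ready, and -1 on error. */
static int64_t loop_once(int epfd, int efd, int timeout_ms)
{
    struct epoll_event ready[8];
    int n = epoll_wait(epfd, ready, 8, timeout_ms);
    if (n < 0)
        return -1;

    int64_t completions = 0;
    for (int i = 0; i < n; i++) {
        if (ready[i].data.fd == efd) {
            /* One read collects every signal posted since the last read. */
            uint64_t count;
            if (read(efd, &count, sizeof(count)) == (ssize_t)sizeof(count))
                completions += (int64_t)count;
        } else {
            /* Other application events (sockets, timers, ...) would be
             * dispatched here in the same loop. */
        }
    }
    return completions;
}
```

The application owns the loop and the epoll instance, and the library only needs to write to the eventfd when ios finish.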
* Re: About Adding eventfd support for LibRBD
From: Jason Dillaman @ 2015-07-13 18:39 UTC
To: Milosz Tanski; +Cc: Haomai Wang, Josh Durgin, ceph-devel

> But it doesn't provide an easily composable way
> of integrating waiting on other events in the application. eventfd is
> easy to embed in your (e)poll loop or any kind of event library
> (libev).

Agreed -- which is why I asked about the proposed design since it appears (to me) that everything is hidden behind the librbd API and thus not embeddable within a generic app event loop. It might just be a misunderstanding on my part, which is why I asked for an example integration.

Jason
* Re: About Adding eventfd support for LibRBD
From: Sage Weil @ 2015-07-13 18:42 UTC
To: Jason Dillaman; +Cc: Milosz Tanski, Haomai Wang, Josh Durgin, ceph-devel

On Mon, 13 Jul 2015, Jason Dillaman wrote:
> > But it doesn't provide an easily composable way
> > of integrating waiting on other events in the application. eventfd is
> > easy to embed in your (e)poll loop or any kind of event library
> > (libev).
>
> Agreed -- which is why I asked about the proposed design since it
> appears (to me) that everything is hidden behind the librbd API and thus
> not embeddable within a generic app event loop. It might just be a
> misunderstanding on my part, which is why I asked for an example
> integration.

Bonus points if this helps out the qemu librbd usage, which (if memory serves) does some annoying stuff with a pipe in the callback to notify qemu of IO completion. Perhaps qemu can work with an eventfd more directly?

sage
* Re: About Adding eventfd support for LibRBD
From: Josh Durgin @ 2015-07-13 19:58 UTC
To: Sage Weil, Jason Dillaman; +Cc: Milosz Tanski, Haomai Wang, ceph-devel

On 07/13/2015 11:42 AM, Sage Weil wrote:
> Bonus points if this helps out the qemu librbd usage, which (if memory
> serves) does some annoying stuff with a pipe in the callback to notify
> qemu of IO completion. Perhaps qemu can work with an eventfd more
> directly?

This was fixed a little while back when qemu was converted to using coroutines and more than one thread, but it would make usage in other applications with simpler threading models easier. IIRC the xen blktap driver for rbd used the same pipe workaround to deal with librbd's current callback api.
* Re: About Adding eventfd support for LibRBD
From: Alexandre DERUMIER @ 2015-07-20 5:03 UTC
To: Josh Durgin; +Cc: Sage Weil, Jason Dillaman, Milosz Tanski, Haomai Wang, ceph-devel

Hi,

>>This was fixed a little while back when qemu was converted to using
>>coroutines and more than one thread, but it would make usage in other
>>applications with simpler threading models easier

AFAIK, qemu can currently use one thread per disk, with the qemu iothread feature. But this works with the virtio-blk driver only (virtio-scsi is not yet stable).

That can help to scale with multiple disks, but a single-disk workload is still limited.

(So, if the eventfd implementation could help to scale with a single disk, it could be a big speedup for qemu.)

Alexandre
* Re: About Adding eventfd support for LibRBD
From: Milosz Tanski @ 2015-07-13 19:36 UTC
To: Jason Dillaman; +Cc: Haomai Wang, Josh Durgin, ceph-devel

On Mon, Jul 13, 2015 at 2:39 PM, Jason Dillaman <dillaman@redhat.com> wrote:
> Agreed -- which is why I asked about the proposed design since it
> appears (to me) that everything is hidden behind the librbd API and
> thus not embeddable within a generic app event loop. It might just be
> a misunderstanding on my part, which is why I asked for an example
> integration.

Provided that the user never uses the file descriptor for anything but notification. As in, there's a separate (librbd) library function to drain that fd. You can use either an eventfd fd or the receiving fd of a pipe() for non-Linux operating systems. The non-eventfd case might end up less optimal (due to needing a drain loop) but the notification mechanism will be both portable and transparent.

--
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016
p: 646-253-9055
e: milosz@adfin.com
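The portable fallback Milosz mentions, handing the application the receiving end of a pipe() and draining it inside a library function, might be sketched as follows. The names `notify_pipe_init`, `notify_pipe_signal` and `notify_pipe_drain` are hypothetical. The sketch assumes only POSIX pipe(), fcntl(), read() and write().

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Create a nonblocking pipe. fds[0] is the receiving end handed to the
 * application's event loop; fds[1] is written by the library. */
static int notify_pipe_init(int fds[2])
{
    if (pipe(fds) < 0)
        return -1;
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(fds[i], F_GETFL);
        if (flags < 0 || fcntl(fds[i], F_SETFL, flags | O_NONBLOCK) < 0) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }
    return 0;
}

/* Library side: post one notification byte. A full pipe is not an
 * error here, because the reader is already guaranteed a wakeup. */
static int notify_pipe_signal(int wfd)
{
    char byte = 1;
    ssize_t n = write(wfd, &byte, 1);
    return (n == 1 || (n < 0 && errno == EAGAIN)) ? 0 : -1;
}

/* Drain loop: unlike an eventfd, a pipe may need several reads to
 * empty. Returns the number of bytes drained, or -1 on error. */
static long notify_pipe_drain(int rfd)
{
    char buf[256];
    long total = 0;
    for (;;) {
        ssize_t n = read(rfd, buf, sizeof(buf));
        if (n > 0) {
            total += n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && errno == EAGAIN)
            return total; /* pipe is empty */
        return n == 0 ? total : -1; /* n == 0: write end was closed */
    }
}
```

The loop in `notify_pipe_drain` is the extra cost compared with the single read an eventfd needs, which matches the "less optimal" caveat above.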