linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode
@ 2024-08-01 11:38 Andy Pan via B4 Relay
  2024-08-02  0:59 ` 潘少
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Andy Pan via B4 Relay @ 2024-08-01 11:38 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man, linux-api, mtk.manpages, Andy Pan, Andy Pan

From: Andy Pan <i@andypan.me>

For the moment, the edge-triggered epoll generates an event for each
receipt of a chunk of data, that is to say, epoll_wait() will return
and tell us a monitored file descriptor is ready whenever there is a
new activity on that FD since we were last informed about that FD.
This is not a real _edge_ implementation for epoll, but it's been
working this way for years and plenty of projects are relying on it
to eliminate the overhead of one system call of read(2) per wakeup event.

There are several renowned open-source projects relying on this feature
for notification function (with eventfd): register eventfd with EPOLLET
and avoid calling read(2) on the eventfd when there is wakeup event (eventfd being written).
Examples: nginx [1], netty [2], tokio [3], libevent [4], ect. [5]
These projects are widely used in today's Internet infrastructures.
Thus, changing this behavior of epoll ET will fundamentally break them
and cause a significant negative impact.
Linux has changed it for pipe before [6], breaking some Android libraries,
which had got "reverted" somehow. [7] [8]

Nevertheless, the paragraph in the manual pages describing this
characteristic of epoll ET seems ambiguous, I think a more explict
sentence should be used to clarify it. We're improving the notification
mechanism for libuv recently by exploiting this feature with eventfd,
which brings us a significant performance boost. [9]

Therefore, we (as well as the maintainers of nginx, netty, tokio, etc.)
would have a sense of security to build an enhanced notification function
based on this feature if there is a guarantee of retaining this implementation
of epoll ET for the backward compatibility in the man pages.

[1]: https://github.com/nginx/nginx/blob/efc6a217b92985a1ee211b6bb7337cd2f62deb90/src/event/modules/ngx_epoll_module.c#L386-L457
[2]: https://github.com/netty/netty/pull/9192
[3]: https://github.com/tokio-rs/mio/blob/309daae21ecb1d46203a7dbc0cf4c80310240cba/src/sys/unix/waker.rs#L111-L143
[4]: https://github.com/libevent/libevent/blob/525f5d0a14c9c103be750f2ca175328c25505ea4/event.c#L2597-L2614
[5]: https://github.com/libuv/libuv/pull/4400#issuecomment-2123798748
[6]: https://lkml.iu.edu/hypermail/linux/kernel/2010.1/04363.html
[7]: https://github.com/torvalds/linux/commit/3a34b13a88caeb2800ab44a4918f230041b37dd9
[8]: https://github.com/torvalds/linux/commit/3b844826b6c6affa80755254da322b017358a2f4
[9]: https://github.com/libuv/libuv/pull/4400#issuecomment-2103232402

Signed-off-by: Andy Pan <i@andypan.me>
---
Changes in v5:
- Update the sentence of clarifying the epoll ET
- Link to v4: https://lore.kernel.org/r/20240731-epoll-et-desc-v4-1-7eb819bdde0d@andypan.me

Changes in v4:
- Move the added sentence elsewhere to make more sense
- Link to v3: https://lore.kernel.org/r/20240730-epoll-et-desc-v3-1-6aa81b1c400d@andypan.me

Changes in v3:
- Updated the git commit description
- Link to v2: https://lore.kernel.org/r/20240727-epoll-et-desc-v2-1-c99b2ac66775@andypan.me

Changes in v2:
- Added the git commit description based on feedback
- Link to v1: https://lore.kernel.org/r/20240727-epoll-et-desc-v1-1-390bafc678b9@andypan.me
---
 man/man7/epoll.7 | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/man/man7/epoll.7 b/man/man7/epoll.7
index 951500131..86e5f8363 100644
--- a/man/man7/epoll.7
+++ b/man/man7/epoll.7
@@ -121,7 +121,8 @@ .SS Level-triggered and edge-triggered
 meanwhile the remote peer might be expecting a response based on the
 data it already sent.
 The reason for this is that edge-triggered mode
-delivers events only when changes occur on the monitored file descriptor.
+delivers events only when changes occur on the monitored file descriptor,
+that is, an event will be generated upon each receipt of a chunk of data.
 So, in step
 .B 5
 the caller might end up waiting for some data that is already present inside

---
base-commit: cbc0a111e4dceea2037c51098de33e6bc8c16a5c
change-id: 20240727-epoll-et-desc-04ea9a590f3b

Best regards,
-- 
Andy Pan <i@andypan.me>



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re:[PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode
  2024-08-01 11:38 [PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode Andy Pan via B4 Relay
@ 2024-08-02  0:59 ` 潘少
  2024-08-02  6:38 ` [PATCH " Alejandro Colomar
  2024-08-03 14:01 ` Alejandro Colomar
  2 siblings, 0 replies; 9+ messages in thread
From: 潘少 @ 2024-08-02  0:59 UTC (permalink / raw)
  To: i, Alejandro Colomar; +Cc: linux-man, linux-api, Michael Kerrisk, panjf2000

Hi folks,
I'm not trying to rush you, but please do consider reviewing this patch when you have a moment, thanks!
-----------
Best regards,
Andy Pan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode
  2024-08-01 11:38 [PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode Andy Pan via B4 Relay
  2024-08-02  0:59 ` 潘少
@ 2024-08-02  6:38 ` Alejandro Colomar
  2024-08-02 10:01   ` 潘少
  2024-08-03 11:49   ` 潘少
  2024-08-03 14:01 ` Alejandro Colomar
  2 siblings, 2 replies; 9+ messages in thread
From: Alejandro Colomar @ 2024-08-02  6:38 UTC (permalink / raw)
  To: i; +Cc: linux-man, linux-api, mtk.manpages, Andy Pan

[-- Attachment #1: Type: text/plain, Size: 4594 bytes --]

Hi Andy,

On Thu, Aug 01, 2024 at 11:38:38AM GMT, Andy Pan via B4 Relay wrote:
> From: Andy Pan <i@andypan.me>
> 
> For the moment, the edge-triggered epoll generates an event for each
> receipt of a chunk of data, that is to say, epoll_wait() will return
> and tell us a monitored file descriptor is ready whenever there is a
> new activity on that FD since we were last informed about that FD.
> This is not a real _edge_ implementation for epoll, but it's been
> working this way for years and plenty of projects are relying on it
> to eliminate the overhead of one system call of read(2) per wakeup event.
> 
> There are several renowned open-source projects relying on this feature
> for notification function (with eventfd): register eventfd with EPOLLET
> and avoid calling read(2) on the eventfd when there is wakeup event (eventfd being written).
> Examples: nginx [1], netty [2], tokio [3], libevent [4], ect. [5]
> These projects are widely used in today's Internet infrastructures.
> Thus, changing this behavior of epoll ET will fundamentally break them
> and cause a significant negative impact.
> Linux has changed it for pipe before [6], breaking some Android libraries,
> which had got "reverted" somehow. [7] [8]
> 
> Nevertheless, the paragraph in the manual pages describing this
> characteristic of epoll ET seems ambiguous, I think a more explict
> sentence should be used to clarify it. We're improving the notification
> mechanism for libuv recently by exploiting this feature with eventfd,
> which brings us a significant performance boost. [9]
> 
> Therefore, we (as well as the maintainers of nginx, netty, tokio, etc.)
> would have a sense of security to build an enhanced notification function
> based on this feature if there is a guarantee of retaining this implementation
> of epoll ET for the backward compatibility in the man pages.
> 
> [1]: https://github.com/nginx/nginx/blob/efc6a217b92985a1ee211b6bb7337cd2f62deb90/src/event/modules/ngx_epoll_module.c#L386-L457
> [2]: https://github.com/netty/netty/pull/9192
> [3]: https://github.com/tokio-rs/mio/blob/309daae21ecb1d46203a7dbc0cf4c80310240cba/src/sys/unix/waker.rs#L111-L143
> [4]: https://github.com/libevent/libevent/blob/525f5d0a14c9c103be750f2ca175328c25505ea4/event.c#L2597-L2614
> [5]: https://github.com/libuv/libuv/pull/4400#issuecomment-2123798748
> [6]: https://lkml.iu.edu/hypermail/linux/kernel/2010.1/04363.html
> [7]: https://github.com/torvalds/linux/commit/3a34b13a88caeb2800ab44a4918f230041b37dd9
> [8]: https://github.com/torvalds/linux/commit/3b844826b6c6affa80755254da322b017358a2f4
> [9]: https://github.com/libuv/libuv/pull/4400#issuecomment-2103232402
> 
> Signed-off-by: Andy Pan <i@andypan.me>
> ---
> Changes in v5:
> - Update the sentence of clarifying the epoll ET
> - Link to v4: https://lore.kernel.org/r/20240731-epoll-et-desc-v4-1-7eb819bdde0d@andypan.me
> 
> Changes in v4:
> - Move the added sentence elsewhere to make more sense
> - Link to v3: https://lore.kernel.org/r/20240730-epoll-et-desc-v3-1-6aa81b1c400d@andypan.me
> 
> Changes in v3:
> - Updated the git commit description
> - Link to v2: https://lore.kernel.org/r/20240727-epoll-et-desc-v2-1-c99b2ac66775@andypan.me
> 
> Changes in v2:
> - Added the git commit description based on feedback
> - Link to v1: https://lore.kernel.org/r/20240727-epoll-et-desc-v1-1-390bafc678b9@andypan.me
> ---
>  man/man7/epoll.7 | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/man/man7/epoll.7 b/man/man7/epoll.7
> index 951500131..86e5f8363 100644
> --- a/man/man7/epoll.7
> +++ b/man/man7/epoll.7
> @@ -121,7 +121,8 @@ .SS Level-triggered and edge-triggered
>  meanwhile the remote peer might be expecting a response based on the
>  data it already sent.
>  The reason for this is that edge-triggered mode
> -delivers events only when changes occur on the monitored file descriptor.
> +delivers events only when changes occur on the monitored file descriptor,
> +that is, an event will be generated upon each receipt of a chunk of data.

LGTM.  Thanks for the patch, and the detailed commit message!
I'll probably merge it later today (I'll let you know).

Have a lovely day!
Alex

>  So, in step
>  .B 5
>  the caller might end up waiting for some data that is already present inside
> 
> ---
> base-commit: cbc0a111e4dceea2037c51098de33e6bc8c16a5c
> change-id: 20240727-epoll-et-desc-04ea9a590f3b
> 
> Best regards,
> -- 
> Andy Pan <i@andypan.me>
> 
> 

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode
  2024-08-02  6:38 ` [PATCH " Alejandro Colomar
@ 2024-08-02 10:01   ` 潘少
  2024-08-03 11:49   ` 潘少
  1 sibling, 0 replies; 9+ messages in thread
From: 潘少 @ 2024-08-02 10:01 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man, linux-api, Michael Kerrisk, panjf2000

Thank you so much, Alejandro.

Have a good one!
-----------
Best regards,
Andy Pan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode
  2024-08-02  6:38 ` [PATCH " Alejandro Colomar
  2024-08-02 10:01   ` 潘少
@ 2024-08-03 11:49   ` 潘少
  1 sibling, 0 replies; 9+ messages in thread
From: 潘少 @ 2024-08-03 11:49 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man, linux-api, Michael Kerrisk, panjf2000

Hi Alejandro,
Sorry to bother you over the weekend, but I haven't seen the patch submitted yet.
Is there something wrong with it?
-----------
Best regards,
Andy Pan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode
  2024-08-01 11:38 [PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode Andy Pan via B4 Relay
  2024-08-02  0:59 ` 潘少
  2024-08-02  6:38 ` [PATCH " Alejandro Colomar
@ 2024-08-03 14:01 ` Alejandro Colomar
  2024-08-03 14:15   ` 潘少
  2 siblings, 1 reply; 9+ messages in thread
From: Alejandro Colomar @ 2024-08-03 14:01 UTC (permalink / raw)
  To: i; +Cc: linux-man, linux-api, mtk.manpages, Andy Pan

[-- Attachment #1: Type: text/plain, Size: 4802 bytes --]

Hi Andy,

On Thu, Aug 01, 2024 at 11:38:38AM GMT, Andy Pan via B4 Relay wrote:
> From: Andy Pan <i@andypan.me>
> 
> For the moment, the edge-triggered epoll generates an event for each
> receipt of a chunk of data, that is to say, epoll_wait() will return
> and tell us a monitored file descriptor is ready whenever there is a
> new activity on that FD since we were last informed about that FD.
> This is not a real _edge_ implementation for epoll, but it's been
> working this way for years and plenty of projects are relying on it
> to eliminate the overhead of one system call of read(2) per wakeup event.
> 
> There are several renowned open-source projects relying on this feature
> for notification function (with eventfd): register eventfd with EPOLLET
> and avoid calling read(2) on the eventfd when there is wakeup event (eventfd being written).
> Examples: nginx [1], netty [2], tokio [3], libevent [4], ect. [5]
> These projects are widely used in today's Internet infrastructures.
> Thus, changing this behavior of epoll ET will fundamentally break them
> and cause a significant negative impact.
> Linux has changed it for pipe before [6], breaking some Android libraries,
> which had got "reverted" somehow. [7] [8]
> 
> Nevertheless, the paragraph in the manual pages describing this
> characteristic of epoll ET seems ambiguous, I think a more explict
> sentence should be used to clarify it. We're improving the notification
> mechanism for libuv recently by exploiting this feature with eventfd,
> which brings us a significant performance boost. [9]
> 
> Therefore, we (as well as the maintainers of nginx, netty, tokio, etc.)
> would have a sense of security to build an enhanced notification function
> based on this feature if there is a guarantee of retaining this implementation
> of epoll ET for the backward compatibility in the man pages.
> 
> [1]: https://github.com/nginx/nginx/blob/efc6a217b92985a1ee211b6bb7337cd2f62deb90/src/event/modules/ngx_epoll_module.c#L386-L457
> [2]: https://github.com/netty/netty/pull/9192
> [3]: https://github.com/tokio-rs/mio/blob/309daae21ecb1d46203a7dbc0cf4c80310240cba/src/sys/unix/waker.rs#L111-L143
> [4]: https://github.com/libevent/libevent/blob/525f5d0a14c9c103be750f2ca175328c25505ea4/event.c#L2597-L2614
> [5]: https://github.com/libuv/libuv/pull/4400#issuecomment-2123798748
> [6]: https://lkml.iu.edu/hypermail/linux/kernel/2010.1/04363.html
> [7]: https://github.com/torvalds/linux/commit/3a34b13a88caeb2800ab44a4918f230041b37dd9
> [8]: https://github.com/torvalds/linux/commit/3b844826b6c6affa80755254da322b017358a2f4
> [9]: https://github.com/libuv/libuv/pull/4400#issuecomment-2103232402
> 
> Signed-off-by: Andy Pan <i@andypan.me>
> ---

Thanks for the patch.  I've applied it, and pushed to my branch:
<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=da20f7834879cf28ad1d4621f9f1bc421d9c04bf>.

I will take some days before pushing to master, as I'm waiting for a
review of another patch that's before in the queue.

Have a lovely day!
Alex

> Changes in v5:
> - Update the sentence of clarifying the epoll ET
> - Link to v4: https://lore.kernel.org/r/20240731-epoll-et-desc-v4-1-7eb819bdde0d@andypan.me
> 
> Changes in v4:
> - Move the added sentence elsewhere to make more sense
> - Link to v3: https://lore.kernel.org/r/20240730-epoll-et-desc-v3-1-6aa81b1c400d@andypan.me
> 
> Changes in v3:
> - Updated the git commit description
> - Link to v2: https://lore.kernel.org/r/20240727-epoll-et-desc-v2-1-c99b2ac66775@andypan.me
> 
> Changes in v2:
> - Added the git commit description based on feedback
> - Link to v1: https://lore.kernel.org/r/20240727-epoll-et-desc-v1-1-390bafc678b9@andypan.me
> ---
>  man/man7/epoll.7 | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/man/man7/epoll.7 b/man/man7/epoll.7
> index 951500131..86e5f8363 100644
> --- a/man/man7/epoll.7
> +++ b/man/man7/epoll.7
> @@ -121,7 +121,8 @@ .SS Level-triggered and edge-triggered
>  meanwhile the remote peer might be expecting a response based on the
>  data it already sent.
>  The reason for this is that edge-triggered mode
> -delivers events only when changes occur on the monitored file descriptor.
> +delivers events only when changes occur on the monitored file descriptor,
> +that is, an event will be generated upon each receipt of a chunk of data.
>  So, in step
>  .B 5
>  the caller might end up waiting for some data that is already present inside
> 
> ---
> base-commit: cbc0a111e4dceea2037c51098de33e6bc8c16a5c
> change-id: 20240727-epoll-et-desc-04ea9a590f3b
> 
> Best regards,
> -- 
> Andy Pan <i@andypan.me>
> 
> 

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode
  2024-08-03 14:01 ` Alejandro Colomar
@ 2024-08-03 14:15   ` 潘少
  2024-08-21 21:58     ` Alejandro Colomar
  0 siblings, 1 reply; 9+ messages in thread
From: 潘少 @ 2024-08-03 14:15 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man, linux-api, Michael Kerrisk, panjf2000

Thank you for the updates!
Please also update this email thread after merging this patch, thanks in advance.

Sorry again for interrupting your weekend. Have a great weekend!
-----------
Best regards,
Andy Pan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode
  2024-08-03 14:15   ` 潘少
@ 2024-08-21 21:58     ` Alejandro Colomar
  2024-08-22  1:09       ` 潘少
  0 siblings, 1 reply; 9+ messages in thread
From: Alejandro Colomar @ 2024-08-21 21:58 UTC (permalink / raw)
  To: 潘少; +Cc: linux-man, linux-api, Michael Kerrisk, panjf2000

[-- Attachment #1: Type: text/plain, Size: 368 bytes --]

Hi Andy,

On Sat, Aug 03, 2024 at 10:15:10PM GMT, 潘少 wrote:
> Thank you for the updates!
> Please also update this email thread after merging this patch, thanks in advance.

I've pushed everything now.

> 
> Sorry again for interrupting your weekend. Have a great weekend!

\:)

Have a lovely night!
Alex

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode
  2024-08-21 21:58     ` Alejandro Colomar
@ 2024-08-22  1:09       ` 潘少
  0 siblings, 0 replies; 9+ messages in thread
From: 潘少 @ 2024-08-22  1:09 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: linux-man, linux-api, Michael Kerrisk, panjf2000

Hi Alejandro,
Gratifying! Also, thanks for your efforts in reviewing and submitting this patch!
-----------
Best regards,
Andy Pan

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-08-22  1:10 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-01 11:38 [PATCH v5] epoll.7: clarify the event distribution under edge-triggered mode Andy Pan via B4 Relay
2024-08-02  0:59 ` 潘少
2024-08-02  6:38 ` [PATCH " Alejandro Colomar
2024-08-02 10:01   ` 潘少
2024-08-03 11:49   ` 潘少
2024-08-03 14:01 ` Alejandro Colomar
2024-08-03 14:15   ` 潘少
2024-08-21 21:58     ` Alejandro Colomar
2024-08-22  1:09       ` 潘少

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).