From: "Jong eon Park" <jongeon.park@samsung.com>
To: "'Jakub Kicinski'" <kuba@kernel.org>,
	"'Paolo Abeni'" <pabeni@redhat.com>
Cc: "'David S. Miller'" <davem@davemloft.net>,
	"'Eric Dumazet'" <edumazet@google.com>, <netdev@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	"'Dong ha Kang'" <dongha7.kang@samsung.com>
Subject: RE: [PATCH] netlink: introduce netlink poll to resolve fast return issue
Date: Tue, 7 Nov 2023 11:05:08 +0900
Message-ID: <25c501da111e$d527b010$7f771030$@samsung.com>
In-Reply-To: <20231106154812.14c470c2@kernel.org>



> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Tuesday, November 7, 2023 8:48 AM
> To: Jong eon Park <jongeon.park@samsung.com>; Paolo Abeni
> <pabeni@redhat.com>
> Cc: David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; netdev@vger.kernel.org;
> linux-kernel@vger.kernel.org; Dong ha Kang <dongha7.kang@samsung.com>
> Subject: Re: [PATCH] netlink: introduce netlink poll to resolve fast
> return issue
> 
> On Fri,  3 Nov 2023 16:22:09 +0900 Jong eon Park wrote:
> > In very rare cases, there was an issue where a user's poll function
> > waiting for a uevent would continuously return very quickly, causing
> > excessive CPU usage due to the following scenario.
> >
> > Once sk_rcvbuf becomes full, netlink_broadcast_deliver returns an error
> > and netlink_overrun is called. However, if netlink_overrun was called
> > in a context just before another context returns from the poll and
> > recv is invoked, emptying the rcvbuf, sk->sk_err = ENOBUFS is written
> > to the netlink socket belatedly and it enters the NETLINK_S_CONGESTED
> > state.
> > If the user does not check for POLLERR, they cannot consume and clean
> > sk_err and repeatedly enter the situation where they call poll again
> > but return immediately.
> >
> > To address this issue, I would like to introduce the following netlink
> > poll.
> >
> > After calling datagram_poll, the netlink poll checks the
> > NETLINK_S_CONGESTED status and the rcv queue, and this makes the
> > socket readable once more even if the user has already emptied the
> > rcv queue. This allows the user to consume the sk->sk_err value
> > through netlink_recvmsg, so the situation described above is avoided.
> 
> The explanation makes sense, but I'm not able to make the jump in
> understanding how this is a netlink problem. datagram_poll() returns
> EPOLLERR because sk_err is set, what makes netlink special?
> The fact that we can have an sk_err with nothing in the recv queue?
> 
> Paolo understands this better, maybe he can weigh in tomorrow...

Perhaps my explanation was not comprehensive enough.

The issue is that once this occurs, the user cannot escape from the
"busy running" situation. An application that does not handle EPOLLERR
properly ends up imposing a heavy burden on the entire system, which seems
quite a harsh penalty.
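
For reference, the user-side handling referred to above could look roughly
like the sketch below. It is only an illustration, not code from the patch
or from any particular application: take_socket_error() and wait_readable()
are hypothetical helper names, and the sketch assumes the pending error is
fetched (and thereby cleared) with getsockopt(SO_ERROR) when poll reports
POLLERR.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: fetch and clear the pending socket error.
 * Returns the error code (e.g. ENOBUFS), 0 if none was pending,
 * or -1 if getsockopt() itself failed. */
static int take_socket_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}

/* Hypothetical helper: wait until fd is readable.
 * Returns 1 if data can be read, 0 if nothing is readable (timeout or
 * only an error was reported), -1 if poll() failed. *overrun is set
 * to 1 when a pending ENOBUFS was consumed, so the caller knows that
 * messages were dropped and can resynchronize. */
static int wait_readable(int fd, int timeout_ms, int *overrun)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n;

    assert(overrun != NULL);
    *overrun = 0;

    n = poll(&pfd, 1, timeout_ms);
    if (n <= 0)
        return n;

    /* Without this branch the error stays pending and every later
     * poll() returns immediately with POLLERR set. */
    if ((pfd.revents & POLLERR) && take_socket_error(fd) == ENOBUFS)
        *overrun = 1;

    return (pfd.revents & POLLIN) ? 1 : 0;
}
```

A uevent listener would call wait_readable() in its loop and only call recv
when it returns 1. Applications that check only POLLIN skip the POLLERR
branch, which is the case that leads to the fast return.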

The reason for a separate netlink poll is related to the netlink state.
When a socket enters the NETLINK_S_CONGESTED state, sk can no longer receive
or deliver skbs, and the receive queue must be completely emptied to clear
the state. However, I found that the NETLINK_S_CONGESTED state was still set
even when the receive queue was empty, which is incorrect, and that is why I
implemented the handling in poll.

I don't consider this approach to be the best way, so if you have any
recommendations for a better solution, I would appreciate them.

Regards.
JE Park.



Thread overview: 7+ messages
2023-11-03  7:22 ` [PATCH] netlink: introduce netlink poll to resolve fast return issue Jong eon Park
2023-11-06 23:48   ` Jakub Kicinski
2023-11-07  2:05     ` Jong eon Park [this message]
2023-11-07 16:53       ` Jakub Kicinski
2023-11-10 14:54         ` Jong eon Park
2023-11-10 19:00           ` Jakub Kicinski
2023-11-13  3:50             ` Jong eon Park
