netdev.vger.kernel.org archive mirror
* [PATCH v2 net] net: do not leave an empty skb in write queue
From: Eric Dumazet @ 2023-10-19 11:24 UTC
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Soheil Hassas Yeganeh, Neal Cardwell, netdev, eric.dumazet,
	Eric Dumazet, Shakeel Butt

Under memory stress conditions, tcp_sendmsg_locked()
might call sk_stream_wait_memory(), thus releasing the socket lock.

If a fresh skb has been allocated prior to this,
we should not leave it in the write queue otherwise
tcp_write_xmit() could panic.

This apparently does not happen often, but a future change
in __sk_mem_raise_allocated() that Shakeel and others are
considering would increase chances of being hurt.

Under discussion is to remove this controversial part:

    /* Fail only if socket is _under_ its sndbuf.
     * In this case we cannot block, so that we have to fail.
     */
    if (sk->sk_wmem_queued + size >= sk->sk_sndbuf) {
        /* Force charge with __GFP_NOFAIL */
        if (memcg_charge && !charged) {
            mem_cgroup_charge_skmem(sk->sk_memcg, amt,
                gfp_memcg_charge() | __GFP_NOFAIL);
        }
        return 1;
    }

Fixes: fdfc5c8594c2 ("tcp: remove empty skb from write queue in error cases")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
---
v2: call tcp_remove_empty_skb() before tcp_push()

 net/ipv4/tcp.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index d3456cf840de35b28a6adb682e27d426b0a60f84..3d3a24f795734eecd60fc761f25f48b7a27714d4 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -927,10 +927,11 @@ int tcp_send_mss(struct sock *sk, int *size_goal, int flags)
 	return mss_now;
 }
 
-/* In some cases, both sendmsg() could have added an skb to the write queue,
- * but failed adding payload on it.  We need to remove it to consume less
+/* In some cases, sendmsg() could have added an skb to the write queue,
+ * but failed adding payload on it. We need to remove it to consume less
  * memory, but more importantly be able to generate EPOLLOUT for Edge Trigger
- * epoll() users.
+ * epoll() users. Another reason is that tcp_write_xmit() does not like
+ * finding an empty skb in the write queue.
  */
 void tcp_remove_empty_skb(struct sock *sk)
 {
@@ -1289,6 +1290,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 wait_for_space:
 		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
+		tcp_remove_empty_skb(sk);
 		if (copied)
 			tcp_push(sk, flags & ~MSG_MORE, mss_now,
 				 TCP_NAGLE_PUSH, size_goal);
-- 
2.42.0.655.g421f12c284-goog
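
To make the invariant concrete, here is a minimal userspace sketch, not kernel code. The names toy_skb, toy_queue, toy_queue_add(), toy_remove_empty_tail(), toy_queue_has_empty() and toy_queue_purge() are hypothetical stand-ins invented for illustration; the real tcp_remove_empty_skb() operates on struct sock and may check different conditions. The sketch only models the idea from the commit message: a zero-length skb sitting at the tail of the write queue should be unlinked before the sender goes to sleep, so that a later transmit pass never finds it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for an skb: only the payload length matters here. */
struct toy_skb {
	struct toy_skb *prev;	/* next-older entry in the queue */
	size_t len;		/* payload bytes; 0 means "empty" */
};

/* Toy write queue: `tail` is the most recently queued skb. */
struct toy_queue {
	struct toy_skb *tail;
	size_t count;
};

/* Append a new skb carrying `len` payload bytes; returns NULL on OOM. */
static struct toy_skb *toy_queue_add(struct toy_queue *q, size_t len)
{
	struct toy_skb *skb = malloc(sizeof(*skb));

	if (!skb)
		return NULL;
	skb->len = len;
	skb->prev = q->tail;
	q->tail = skb;
	q->count++;
	return skb;
}

/* Unlink and free the tail skb, but only if it carries no payload. */
static void toy_remove_empty_tail(struct toy_queue *q)
{
	struct toy_skb *skb = q->tail;

	if (skb && skb->len == 0) {
		assert(q->count > 0);
		q->tail = skb->prev;
		q->count--;
		free(skb);
	}
}

/* Returns 1 if a walk over the queue would meet an empty skb, else 0. */
static int toy_queue_has_empty(const struct toy_queue *q)
{
	const struct toy_skb *skb;

	for (skb = q->tail; skb; skb = skb->prev)
		if (skb->len == 0)
			return 1;
	return 0;
}

/* Free every skb left in the queue. */
static void toy_queue_purge(struct toy_queue *q)
{
	while (q->tail) {
		struct toy_skb *skb = q->tail;

		q->tail = skb->prev;
		free(skb);
	}
	q->count = 0;
}
```

In this model, a caller that queued a fresh skb but could not copy any payload into it would call toy_remove_empty_tail() before blocking, after which toy_queue_has_empty() reports 0 again. A tail that already carries payload is left untouched.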



* Re: [PATCH v2 net] net: do not leave an empty skb in write queue
From: Shakeel Butt @ 2023-10-19 18:01 UTC
  To: Eric Dumazet, Abel Wu
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni,
	Soheil Hassas Yeganeh, Neal Cardwell, netdev, eric.dumazet

+Abel Wu

On Thu, Oct 19, 2023 at 4:24 AM Eric Dumazet <edumazet@google.com> wrote:
>
> Under memory stress conditions, tcp_sendmsg_locked()
> might call sk_stream_wait_memory(), thus releasing the socket lock.
>
> If a fresh skb has been allocated prior to this,
> we should not leave it in the write queue otherwise
> tcp_write_xmit() could panic.
>
> This apparently does not happen often, but a future change
> in __sk_mem_raise_allocated() that Shakeel and others are
> considering would increase chances of being hurt.
>
> Under discussion is to remove this controversial part:
>
>     /* Fail only if socket is _under_ its sndbuf.
>      * In this case we cannot block, so that we have to fail.
>      */
>     if (sk->sk_wmem_queued + size >= sk->sk_sndbuf) {
>         /* Force charge with __GFP_NOFAIL */
>         if (memcg_charge && !charged) {
>             mem_cgroup_charge_skmem(sk->sk_memcg, amt,
>                 gfp_memcg_charge() | __GFP_NOFAIL);
>         }
>         return 1;
>     }
>
> Fixes: fdfc5c8594c2 ("tcp: remove empty skb from write queue in error cases")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Shakeel Butt <shakeelb@google.com>

Reviewed-by: Shakeel Butt <shakeelb@google.com>

> ---
> v2: call tcp_remove_empty_skb() before tcp_push()
>
>  net/ipv4/tcp.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index d3456cf840de35b28a6adb682e27d426b0a60f84..3d3a24f795734eecd60fc761f25f48b7a27714d4 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -927,10 +927,11 @@ int tcp_send_mss(struct sock *sk, int *size_goal, int flags)
>         return mss_now;
>  }
>
> -/* In some cases, both sendmsg() could have added an skb to the write queue,
> - * but failed adding payload on it.  We need to remove it to consume less
> +/* In some cases, sendmsg() could have added an skb to the write queue,
> + * but failed adding payload on it. We need to remove it to consume less
>   * memory, but more importantly be able to generate EPOLLOUT for Edge Trigger
> - * epoll() users.
> + * epoll() users. Another reason is that tcp_write_xmit() does not like
> + * finding an empty skb in the write queue.
>   */
>  void tcp_remove_empty_skb(struct sock *sk)
>  {
> @@ -1289,6 +1290,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
>
>  wait_for_space:
>                 set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> +               tcp_remove_empty_skb(sk);
>                 if (copied)
>                         tcp_push(sk, flags & ~MSG_MORE, mss_now,
>                                  TCP_NAGLE_PUSH, size_goal);
> --
> 2.42.0.655.g421f12c284-goog
>


* Re: [PATCH v2 net] net: do not leave an empty skb in write queue
From: Dmitry Kravkov @ 2023-10-19 19:13 UTC
  To: Shakeel Butt
  Cc: Eric Dumazet, Abel Wu, David S . Miller, Jakub Kicinski,
	Paolo Abeni, Soheil Hassas Yeganeh, Neal Cardwell, netdev,
	eric.dumazet

On Thu, Oct 19, 2023 at 9:01 PM Shakeel Butt <shakeelb@google.com> wrote:
>
> +Abel Wu
>
> On Thu, Oct 19, 2023 at 4:24 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > Under memory stress conditions, tcp_sendmsg_locked()
> > might call sk_stream_wait_memory(), thus releasing the socket lock.
> >
> > If a fresh skb has been allocated prior to this,
> > we should not leave it in the write queue otherwise
> > tcp_write_xmit() could panic.

Eric, do you happen to have a panic trace? Thanks



* Re: [PATCH v2 net] net: do not leave an empty skb in write queue
From: Eric Dumazet @ 2023-10-19 19:18 UTC
  To: Dmitry Kravkov
  Cc: Shakeel Butt, Abel Wu, David S . Miller, Jakub Kicinski,
	Paolo Abeni, Soheil Hassas Yeganeh, Neal Cardwell, netdev,
	eric.dumazet

On Thu, Oct 19, 2023 at 9:14 PM Dmitry Kravkov <dmitryk@qwilt.com> wrote:
>
> On Thu, Oct 19, 2023 at 9:01 PM Shakeel Butt <shakeelb@google.com> wrote:
> >
> > +Abel Wu
> >
> > On Thu, Oct 19, 2023 at 4:24 AM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > Under memory stress conditions, tcp_sendmsg_locked()
> > > might call sk_stream_wait_memory(), thus releasing the socket lock.
> > >
> > > If a fresh skb has been allocated prior to this,
> > > we should not leave it in the write queue otherwise
> > > tcp_write_xmit() could panic.
>
> Eric, do you happen to have a panic trace? Thanks

I have no panic trace yet. I think it would be a bit tricky to trigger,
but some clever fault injection could do it.


* Re: [PATCH v2 net] net: do not leave an empty skb in write queue
From: patchwork-bot+netdevbpf @ 2023-10-21  0:50 UTC
  To: Eric Dumazet
  Cc: davem, kuba, pabeni, soheil, ncardwell, netdev, eric.dumazet,
	shakeelb

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 19 Oct 2023 11:24:57 +0000 you wrote:
> Under memory stress conditions, tcp_sendmsg_locked()
> might call sk_stream_wait_memory(), thus releasing the socket lock.
> 
> If a fresh skb has been allocated prior to this,
> we should not leave it in the write queue otherwise
> tcp_write_xmit() could panic.
> 
> [...]

Here is the summary with links:
  - [v2,net] net: do not leave an empty skb in write queue
    https://git.kernel.org/netdev/net/c/72bf4f1767f0

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


