netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paolo Abeni <pabeni-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	"David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>,
	James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>,
	Trond Myklebust
	<trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>,
	Alexander Duyck <aduyck-nYU0QVwCCFFWk0Htik3J/w@public.gmane.org>,
	Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>,
	Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Tom Herbert <tom-BjP2VixgY4xUbtYUoyoikg@public.gmane.org>,
	Hannes Frederic Sowa
	<hannes-tFNcAqjVMyqKXQKiL6tip0B+6BGkLq7r@public.gmane.org>,
	Edward Cree <ecree-s/n/eUQHGBpZroRs9YW3xA@public.gmane.org>,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH net-next v4 2/3] udp: implement memory accounting helpers
Date: Thu, 20 Oct 2016 09:02:30 +0200	[thread overview]
Message-ID: <1476946950.5656.4.camel@redhat.com> (raw)
In-Reply-To: <1476938622.5650.111.camel-XN9IlZ5yJG9HTL0Zs8A6p+yfmBU6pStAUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>

On Wed, 2016-10-19 at 21:43 -0700, Eric Dumazet wrote:
> On Wed, 2016-10-19 at 14:47 +0200, Paolo Abeni wrote:
> > Avoid using the generic helpers.
> > Adds a new spinlock_t to the udp_sock structure and use
> > it to protect the memory accounting operation, both on
> > enqueue and on dequeue.
> > 
> > On dequeue perform partial memory reclaiming, trying to
> > leave a quantum of forward allocated memory.
> > 
> > On enqueue use a custom helper, to allow some optimizations:
> > - use a plain spin_lock() variant instead of the slightly
> >   costly spin_lock_irqsave(),
> > - avoid dst_force check, since the calling code has already
> >   dropped the skb dst
> > 
> > The above needs custom memory reclaiming on shutdown, provided
> > by the udp_destruct_sock().
> > 
> > v3 -> v4:
> >   - reworked memory accunting, simplifying the schema
> >   - provide an helper for both memory scheduling and enqueuing
> > 
> > v1 -> v2:
> >   - use a udp specific destrctor to perform memory reclaiming
> >   - remove a couple of helpers, unneeded after the above cleanup
> >   - do not reclaim memory on dequeue if not under memory
> >     pressure
> >   - reworked the fwd accounting schema to avoid potential
> >     integer overflow
> > 
> > Acked-by: Hannes Frederic Sowa <hannes-tFNcAqjVMyqKXQKiL6tip0B+6BGkLq7r@public.gmane.org>
> > Signed-off-by: Paolo Abeni <pabeni-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > ---
> >  include/linux/udp.h |   1 +
> >  include/net/udp.h   |   4 ++
> >  net/ipv4/udp.c      | 111 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 116 insertions(+)
> > 
> > diff --git a/include/linux/udp.h b/include/linux/udp.h
> > index d1fd8cd..8ea7f8b 100644
> > --- a/include/linux/udp.h
> > +++ b/include/linux/udp.h
> > @@ -66,6 +66,7 @@ struct udp_sock {
> >  #define UDPLITE_RECV_CC  0x4		/* set via udplite setsocktopt        */
> >  	__u8		 pcflag;        /* marks socket as UDP-Lite if > 0    */
> >  	__u8		 unused[3];
> > +	spinlock_t	 mem_lock;	/* protects memory accounting         */
> >  	/*
> >  	 * For encapsulation sockets.
> >  	 */
> > diff --git a/include/net/udp.h b/include/net/udp.h
> > index ea53a87..18f1e6b 100644
> > --- a/include/net/udp.h
> > +++ b/include/net/udp.h
> > @@ -246,6 +246,9 @@ static inline __be16 udp_flow_src_port(struct net *net, struct sk_buff *skb,
> >  }
> >  
> >  /* net/ipv4/udp.c */
> > +void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len);
> > +int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb);
> > +
> >  void udp_v4_early_demux(struct sk_buff *skb);
> >  int udp_get_port(struct sock *sk, unsigned short snum,
> >  		 int (*saddr_cmp)(const struct sock *,
> > @@ -258,6 +261,7 @@ int udp_get_port(struct sock *sk, unsigned short snum,
> >  void udp4_hwcsum(struct sk_buff *skb, __be32 src, __be32 dst);
> >  int udp_rcv(struct sk_buff *skb);
> >  int udp_ioctl(struct sock *sk, int cmd, unsigned long arg);
> > +int udp_init_sock(struct sock *sk);
> >  int udp_disconnect(struct sock *sk, int flags);
> >  unsigned int udp_poll(struct file *file, struct socket *sock, poll_table *wait);
> >  struct sk_buff *skb_udp_tunnel_segment(struct sk_buff *skb,
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index 7d96dc2..7047bb3 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -1172,6 +1172,117 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
> >  	return ret;
> >  }
> >  
> > +static void udp_rmem_release(struct sock *sk, int size, int partial)
> > +{
> > +	struct udp_sock *up = udp_sk(sk);
> > +	int amt;
> > +
> > +	atomic_sub(size, &sk->sk_rmem_alloc);
> > +
> > +	spin_lock_bh(&up->mem_lock);
> > +	sk->sk_forward_alloc += size;
> > +	amt = (sk->sk_forward_alloc - partial) & ~(SK_MEM_QUANTUM - 1);
> > +	sk->sk_forward_alloc -= amt;
> > +	spin_unlock_bh(&up->mem_lock);
> > +
> > +	if (amt)
> > +		__sk_mem_reduce_allocated(sk, amt >> SK_MEM_QUANTUM_SHIFT);
> > +}
> > +
> > +static void udp_rmem_free(struct sk_buff *skb)
> > +{
> > +	udp_rmem_release(skb->sk, skb->truesize, 1);
> > +}
> > +
> > +int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
> > +{
> > +	struct sk_buff_head *list = &sk->sk_receive_queue;
> > +	int rmem, delta, amt, err = -ENOMEM;
> > +	struct udp_sock *up = udp_sk(sk);
> > +	int size = skb->truesize;
> > +
> > +	/* try to avoid the costly atomic add/sub pair when the receive
> > +	 * queue is full
> > +	 */
> > +	if (atomic_read(&sk->sk_rmem_alloc) + size > sk->sk_rcvbuf)
> > +		goto drop;
> > +
> > +	rmem = atomic_add_return(size, &sk->sk_rmem_alloc);
> > +	if (rmem > sk->sk_rcvbuf)
> > +		goto uncharge_drop;
> 
> This is dangerous : We used to accept one packet in receive queue, even
> if sk_rcvbuf is too small. (Check __sock_queue_rcv_skb())
> 
> With your code we might drop all packets even if receive queue is empty,
> this might add regressions for some apps.

Thank you for reviewing this! 

I missed that point. I'll take care of it (and of your others comments)
in v5.

Regards,

Paolo

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2016-10-20  7:02 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-19 12:46 [PATCH net-next v4 0/3] udp: refactor memory accounting Paolo Abeni
     [not found] ` <cover.1476877189.git.pabeni-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2016-10-19 12:47   ` [PATCH net-next v4 1/3] net/socket: factor out helpers for memory and queue manipulation Paolo Abeni
2016-10-19 12:47   ` [PATCH net-next v4 3/3] udp: use it's own memory accounting schema Paolo Abeni
2016-10-20  1:54     ` Eric Dumazet
2016-10-19 12:47 ` [PATCH net-next v4 2/3] udp: implement memory accounting helpers Paolo Abeni
2016-10-20  4:43   ` Eric Dumazet
     [not found]     ` <1476938622.5650.111.camel-XN9IlZ5yJG9HTL0Zs8A6p+yfmBU6pStAUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>
2016-10-20  7:02       ` Paolo Abeni [this message]
2016-10-20 10:18       ` Paolo Abeni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1476946950.5656.4.camel@redhat.com \
    --to=pabeni-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
    --cc=aduyck-nYU0QVwCCFFWk0Htik3J/w@public.gmane.org \
    --cc=daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org \
    --cc=davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org \
    --cc=ecree-s/n/eUQHGBpZroRs9YW3xA@public.gmane.org \
    --cc=edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=hannes-tFNcAqjVMyqKXQKiL6tip0B+6BGkLq7r@public.gmane.org \
    --cc=jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org \
    --cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=tom-BjP2VixgY4xUbtYUoyoikg@public.gmane.org \
    --cc=trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).