From: Paolo Abeni <pabeni-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	"David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>,
	James Morris <jmorris-gx6/JNMH7DfYtjvyW6yDsg@public.gmane.org>,
	Trond Myklebust
	<trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>,
	Alexander Duyck <aduyck-nYU0QVwCCFFWk0Htik3J/w@public.gmane.org>,
	Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>,
	Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Tom Herbert <tom-BjP2VixgY4xUbtYUoyoikg@public.gmane.org>,
	Hannes Frederic Sowa
	<hannes-tFNcAqjVMyqKXQKiL6tip0B+6BGkLq7r@public.gmane.org>,
	Edward Cree <ecree-s/n/eUQHGBpZroRs9YW3xA@public.gmane.org>,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH net-next v3 2/3] udp: implement memory accounting helpers
Date: Thu, 29 Sep 2016 16:01:14 +0200
Message-ID: <1475157674.4676.52.camel@redhat.com>
In-Reply-To: <1475155472.28155.164.camel-XN9IlZ5yJG9HTL0Zs8A6p+yfmBU6pStAUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>

On Thu, 2016-09-29 at 06:24 -0700, Eric Dumazet wrote:
> On Thu, 2016-09-29 at 11:31 +0200, Paolo Abeni wrote:
> > On Wed, 2016-09-28 at 18:42 -0700, Eric Dumazet wrote:
> > > On Wed, 2016-09-28 at 12:52 +0200, Paolo Abeni wrote:
> > > 
> > > > +static void udp_rmem_release(struct sock *sk, int partial)
> > > > +{
> > > > +	struct udp_sock *up = udp_sk(sk);
> > > > +	int fwd, amt;
> > > > +
> > > > +	if (partial && !udp_under_memory_pressure(sk))
> > > > +		return;
> > > > +
> > > > +	/* we can have concurrent release; if we catch any conflict
> > > > +	 * we let only one of them do the work
> > > > +	 */
> > > > +	if (atomic_dec_if_positive(&up->can_reclaim) < 0)
> > > > +		return;
> > > > +
> > > > +	fwd = __udp_forward(up, atomic_read(&sk->sk_rmem_alloc));
> > > > +	if (fwd < SK_MEM_QUANTUM + partial) {
> > > > +		atomic_inc(&up->can_reclaim);
> > > > +		return;
> > > > +	}
> > > > +
> > > > +	amt = (fwd - partial) & ~(SK_MEM_QUANTUM - 1);
> > > > +	atomic_sub(amt, &up->mem_allocated);
> > > > +	atomic_inc(&up->can_reclaim);
> > > > +
> > > > +	__sk_mem_reduce_allocated(sk, amt >> SK_MEM_QUANTUM_SHIFT);
> > > > +	sk->sk_forward_alloc = fwd - amt;
> > > > +}
> > > 
> > > 
> > > This is racy... all these atomics make me nervous...
> > 
> > Ah, perhaps I got it: if we have a concurrent memory scheduling, we
> > could end up with a value of mem_allocated below the real need. 
> > 
> > That mismatch will not drift: at worst we can end up with mem_allocated
> > being a single SK_MEM_QUANTUM below what is strictly needed.
> > 
> > A possible alternative could be:
> > 
> > static void udp_rmem_release(struct sock *sk, int partial)
> > {
> > 	struct udp_sock *up = udp_sk(sk);
> > 	int fwd, amt, alloc_old, alloc;
> > 
> > 	if (partial && !udp_under_memory_pressure(sk))
> > 		return;
> > 
> > 	alloc = atomic_read(&up->mem_allocated);
> > 	fwd = alloc - atomic_read(&sk->sk_rmem_alloc);
> > 	if (fwd < SK_MEM_QUANTUM + partial)
> > 		return;
> > 
> > 	amt = (fwd - partial) & ~(SK_MEM_QUANTUM - 1);
> > 	alloc_old = atomic_cmpxchg(&up->mem_allocated, alloc, alloc - amt);
> > 	/* if a concurrent update is detected, just do nothing; if said update
> > 	 * is due to another memory release, that release takes care of
> > 	 * reclaiming the memory for us, too.
> > 	 * Otherwise we will be able to release on a later dequeue, since
> > 	 * we will eventually stop colliding with the writer once it has
> > 	 * consumed all the forward-allocated memory
> > 	 */
> > 	if (alloc_old != alloc)
> > 		return;
> > 
> > 	__sk_mem_reduce_allocated(sk, amt >> SK_MEM_QUANTUM_SHIFT);
> > 	sk->sk_forward_alloc = fwd - amt;
> 
> Can still be done from multiple cpus.
> 
> Add some ndelay() or udelay() before it to simulate the fact that the
> current cpu could be interrupted by an NMI handler (perf for example)...
> or a hard IRQ handler...
> 
> Then make sure your tests involve 16 concurrent cpus dealing with one
> udp socket...

Thank you again for reviewing this.

I'm working on this sort of test right now.

> 
> > }
> > 
> > which is even more lazy in reclaiming but should never underestimate the
> > needed forward allocation, and under pressure should eventually free the
> > needed memory.
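
To make the behaviour of the cmpxchg variant above concrete, here is a minimal user-space sketch using C11 atomics. It is not the kernel code: `struct fake_sock`, `rmem_release()` and the 4096-byte value of SK_MEM_QUANTUM are assumptions made for illustration only. It shows that the release backs off when mem_allocated changed under it, and that the forward_alloc store still happens outside the compare-and-swap.

```c
#include <assert.h>
#include <stdatomic.h>

#define SK_MEM_QUANTUM 4096 /* assumption: one 4 KiB page */

/* Hypothetical user-space stand-in for the socket fields used above. */
struct fake_sock {
	atomic_int mem_allocated; /* bytes reserved for this socket */
	atomic_int rmem_alloc;    /* bytes currently queued */
	int forward_alloc;        /* plain field, not covered by the cmpxchg */
};

/* Returns the number of bytes reclaimed, or 0 if nothing was done. */
static int rmem_release(struct fake_sock *sk, int partial)
{
	int alloc = atomic_load(&sk->mem_allocated);
	int fwd = alloc - atomic_load(&sk->rmem_alloc);
	int amt;

	assert(partial >= 0);
	if (fwd < SK_MEM_QUANTUM + partial)
		return 0;

	/* round the reclaimable amount down to a whole quantum */
	amt = (fwd - partial) & ~(SK_MEM_QUANTUM - 1);

	/* give up if mem_allocated changed since we read it */
	if (!atomic_compare_exchange_strong(&sk->mem_allocated, &alloc,
					    alloc - amt))
		return 0;

	/* second field, updated after the cmpxchg without any exclusion */
	sk->forward_alloc = fwd - amt;
	return amt;
}
```

With 12288 bytes allocated and 2000 bytes queued, fwd is 10288, so two quanta (8192 bytes) are reclaimed and 2096 bytes stay forward allocated; a second call finds less than one quantum and does nothing.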
> 
> 
> If this code is rarely used, why don't you simply use a real spinlock,
> so that we do not have to worry about all this ?
> 
> A spinlock acquisition/release is a _single_ locked operation.
> Faster than the 3 atomics you had in the last version.
> spinlock code (ticket or MCS) avoids starvation.

I'd like to avoid adding a lock, if possible, to avoid any possible
source of contention.
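
For comparison, the lock-based variant suggested above could look roughly like the following user-space sketch. Everything here is an assumption for illustration: `struct locked_sock`, the `sock_lock()`/`sock_unlock()` helpers built on a C11 `atomic_flag`, and the 4096-byte quantum are not kernel APIs, and rmem_alloc is simplified to a plain int read under the lock.

```c
#include <assert.h>
#include <stdatomic.h>

#define SK_MEM_QUANTUM 4096 /* assumption: one 4 KiB page */

/* Hypothetical stand-in; the atomic_flag plays the role of the spinlock. */
struct locked_sock {
	atomic_flag lock;
	int mem_allocated;
	int rmem_alloc;
	int forward_alloc;
};

static void sock_lock(struct locked_sock *sk)
{
	while (atomic_flag_test_and_set_explicit(&sk->lock,
						 memory_order_acquire))
		; /* spin until the holder clears the flag */
}

static void sock_unlock(struct locked_sock *sk)
{
	atomic_flag_clear_explicit(&sk->lock, memory_order_release);
}

/* Returns the number of bytes reclaimed, or 0 if nothing was done. */
static int rmem_release_locked(struct locked_sock *sk, int partial)
{
	int fwd, amt = 0;

	assert(partial >= 0);
	sock_lock(sk);
	fwd = sk->mem_allocated - sk->rmem_alloc;
	if (fwd >= SK_MEM_QUANTUM + partial) {
		amt = (fwd - partial) & ~(SK_MEM_QUANTUM - 1);
		/* both fields change inside the same critical section */
		sk->mem_allocated -= amt;
		sk->forward_alloc = fwd - amt;
	}
	sock_unlock(sk);
	return amt;
}
```

The point of the sketch is only that mem_allocated and forward_alloc are updated together under one exclusion; it says nothing about the contention cost, which is the concern stated above.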

> Then, you can safely update multiple fields in the socket.
> 
> And you get nice lockdep support as a bonus.
> 
> cmpxchg() is fine when a single field needs exclusion. But there you
> have multiple fields to update at once :
> 
> sk_memory_allocated_add() and sk_memory_allocated_sub() can work using 
> atomic_long_add_return() and atomic_long_sub() because their caller owns
> the socket lock and can safely update sk->sk_forward_alloc without
> additional locking, but UDP won't have this luxury after your patches.

When we reach __sk_mem_reduce_allocated() we are sure we can free the
specified amount of memory, so we only need to ensure that the
sk_prot->memory_allocated updates are consistent. The current atomic
operation is sufficient for that.
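
As a rough illustration of that last step, the sketch below converts a byte amount to quanta and subtracts it from a shared counter with a single atomic operation. It is a user-space approximation, not the kernel helper: `proto_memory_allocated`, `reduce_allocated()` and the shift value of 12 are assumptions, and the memory-pressure handling is left out.

```c
#include <assert.h>
#include <stdatomic.h>

#define SK_MEM_QUANTUM_SHIFT 12 /* assumption: 4 KiB quantum */

/* Hypothetical stand-in for the protocol-wide counter, in quanta. */
static atomic_long proto_memory_allocated;

/*
 * Give back amt bytes, which must be a whole number of quanta.
 * Returns the new protocol-wide total in quanta.
 */
static long reduce_allocated(int amt)
{
	long pages = amt >> SK_MEM_QUANTUM_SHIFT;

	assert((amt & ((1 << SK_MEM_QUANTUM_SHIFT) - 1)) == 0);
	/* one atomic read-modify-write keeps the shared counter consistent */
	return atomic_fetch_sub(&proto_memory_allocated, pages) - pages;
}
```

Starting from a total of 10 quanta, releasing 8192 bytes leaves 8.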

Paolo


