public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Alban Crequy <alban.crequy@collabora.co.uk>,
	Davide Libenzi <davidel@xmailserver.org>
Cc: "David S. Miller" <davem@davemloft.net>,
	Stephen Hemminger <shemminger@vyatta.com>,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Pauli Nieminen <pauli.nieminen@collabora.co.uk>,
	Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: [PATCH] af_unix: unix_write_space() use keyed wakeups
Date: Sat, 30 Oct 2010 08:44:44 +0200	[thread overview]
Message-ID: <1288421084.2680.717.camel@edumazet-laptop> (raw)
In-Reply-To: <1288380431.2680.3.camel@edumazet-laptop>

Le vendredi 29 octobre 2010 à 21:27 +0200, Eric Dumazet a écrit :
> Le vendredi 29 octobre 2010 à 19:18 +0100, Alban Crequy a écrit :
> > Hi,
> > 
> > When a process calls the poll or select, the kernel calls (struct
> > file_operations)->poll on every file descriptor and returns a mask of
> > events which are ready. If the process is only interested by POLLIN
> > events, the mask is still computed for POLLOUT and it can be expensive.
> > For example, on Unix datagram sockets, a process running poll() with
> > POLLIN will wakes-up when the remote end call read(). This is a
> > performance regression introduced when fixing another bug by
> > 3c73419c09a5ef73d56472dbfdade9e311496e9b and
> > ec0d215f9420564fc8286dcf93d2d068bb53a07e.
> > 

unix_write_space() doesn’t yet use the wake_up_interruptible_sync_poll()
to restrict wakeups to only POLLOUT | POLLWRNORM | POLLWRBAND interested
sleepers. Same for unix_dgram_recvmsg()

In your pathological case, each time the other process calls
unix_dgram_recvmsg(), it loops on 800 pollwake() /
default_wake_function() / try_to_wake_up(), which are obviously
expensive, as you pointed out with your test program, carefully designed
to show the false sharing and O(N^2) effect :)

Once do_select() thread can _really_ block, the false sharing problem
disappears for good.

We still loop on 800 items, on each wake_up_interruptible_sync_poll()
call, so maybe we want to optimize this later, adding a global key,
ORing all items keys. I dont think its worth the added complexity, given
the biased usage of your program (800 'listeners' to one event). Is it a
real life scenario ?

Thanks

[PATCH] af_unix: use keyed wakeups

Instead of wakeup all sleepers, use wake_up_interruptible_sync_poll() to
wakeup only ones interested into writing the socket.

This patch is a specialization of commit 37e5540b3c9d (epoll keyed
wakeups: make sockets use keyed wakeups).

On a test program provided by Alan Crequy :

Before:
real    0m3.101s
user    0m0.000s
sys     0m6.104s

After:

real	0m0.211s
user	0m0.000s
sys	0m0.208s

Reported-by: Alban Crequy <alban.crequy@collabora.co.uk>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
---
 net/unix/af_unix.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 3c95304..f33c595 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -316,7 +316,8 @@ static void unix_write_space(struct sock *sk)
 	if (unix_writable(sk)) {
 		wq = rcu_dereference(sk->sk_wq);
 		if (wq_has_sleeper(wq))
-			wake_up_interruptible_sync(&wq->wait);
+			wake_up_interruptible_sync_poll(&wq->wait,
+				POLLOUT | POLLWRNORM | POLLWRBAND);
 		sk_wake_async(sk, SOCK_WAKE_SPACE, POLL_OUT);
 	}
 	rcu_read_unlock();
@@ -1710,7 +1711,8 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
 		goto out_unlock;
 	}
 
-	wake_up_interruptible_sync(&u->peer_wait);
+	wake_up_interruptible_sync_poll(&u->peer_wait,
+					POLLOUT | POLLWRNORM | POLLWRBAND);
 
 	if (msg->msg_name)
 		unix_copy_addr(msg, skb->sk);



  parent reply	other threads:[~2010-10-30  6:44 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-29 18:18 [PATCH 0/1] RFC: poll/select performance on datagram sockets Alban Crequy
2010-10-29 18:21 ` [PATCH] " Alban Crequy
2010-10-29 19:27 ` [PATCH 0/1] RFC: " Eric Dumazet
2010-10-29 20:08   ` Davide Libenzi
2010-10-29 20:20     ` Eric Dumazet
2010-10-29 20:46     ` Davide Libenzi
2010-10-29 21:05       ` Eric Dumazet
2010-10-29 21:57         ` Davide Libenzi
2010-10-29 22:08           ` Eric Dumazet
2010-10-30  9:53       ` [PATCH] af_unix: optimize unix_dgram_poll() Eric Dumazet
2010-10-30 17:45         ` Davide Libenzi
2010-10-29 20:20   ` [PATCH 0/1] RFC: poll/select performance on datagram sockets Jesper Juhl
2010-10-29 20:40     ` David Miller
2010-10-29 20:45       ` Eric Dumazet
2010-10-30  6:44   ` Eric Dumazet [this message]
2010-10-30 15:03     ` [PATCH] af_unix: unix_write_space() use keyed wakeups Davide Libenzi
2010-11-08 21:44       ` David Miller
2010-10-30 21:36     ` Alban Crequy
     [not found]       ` <1290554876.2158.5.camel@Nokia-N900-51-1>
2010-11-24  0:20         ` Alban Crequy
2010-11-24  0:28           ` Eric Dumazet
2010-10-30 11:34   ` [PATCH 0/1] RFC: poll/select performance on datagram sockets Alban Crequy
2010-10-30 12:53     ` Eric Dumazet
2010-10-30 13:17       ` Eric Dumazet
     [not found]         ` <20101030224703.065e70f6@chocolatine.cbg.collabora.co.uk>
2010-10-31 15:36           ` [PATCH 1/2] af_unix: fix unix_dgram_poll() behavior for EPOLLOUT event Eric Dumazet
2010-10-31 19:07             ` Davide Libenzi
2010-11-08 21:44             ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1288421084.2680.717.camel@edumazet-laptop \
    --to=eric.dumazet@gmail.com \
    --cc=adobriyan@gmail.com \
    --cc=alban.crequy@collabora.co.uk \
    --cc=davem@davemloft.net \
    --cc=davidel@xmailserver.org \
    --cc=gorcunov@openvz.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pauli.nieminen@collabora.co.uk \
    --cc=rweikusat@mssgmbh.com \
    --cc=shemminger@vyatta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox