* [PATCH 0/2] net: fix race in the receive/select @ 2009-06-29 14:04 Jiri Olsa 2009-06-29 14:14 ` [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks Jiri Olsa 2009-06-29 14:15 ` [PATCH 2/2] memory barrier: adding smp_mb__after_lock Jiri Olsa 0 siblings, 2 replies; 14+ messages in thread From: Jiri Olsa @ 2009-06-29 14:04 UTC (permalink / raw) To: netdev Cc: linux-kernel, fbl, nhorman, davem, htejun, jarkao2, oleg, davidel, eric.dumazet This patchset fixies the race within the poll_wait call and the receive callbacks, by adding memory barrier. - 1/2 includes the actual fix - 2/2 includes optimization for the x86 arch It is discussed and described in the following discussions: http://lkml.org/lkml/2009/6/18/124 http://lkml.org/lkml/2009/6/25/188 Patchset is made on top of Linus'es tree 52989765629e7d182b4f146050ebba0abf2cb0b7. wbr, jirka --- arch/x86/include/asm/spinlock.h | 3 ++ include/linux/poll.h | 11 ++++++++- include/linux/spinlock.h | 5 ++++ include/net/sock.h | 40 +++++++++++++++++++++++++++++++++++++++ net/atm/common.c | 6 ++-- net/core/datagram.c | 2 +- net/core/sock.c | 8 +++--- net/dccp/output.c | 2 +- net/dccp/proto.c | 2 +- net/ipv4/tcp.c | 2 +- net/iucv/af_iucv.c | 4 +- net/rxrpc/af_rxrpc.c | 4 +- net/unix/af_unix.c | 8 +++--- 13 files changed, 76 insertions(+), 21 deletions(-) ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks 2009-06-29 14:04 [PATCH 0/2] net: fix race in the receive/select Jiri Olsa @ 2009-06-29 14:14 ` Jiri Olsa 2009-06-29 15:34 ` Davide Libenzi ` (2 more replies) 2009-06-29 14:15 ` [PATCH 2/2] memory barrier: adding smp_mb__after_lock Jiri Olsa 1 sibling, 3 replies; 14+ messages in thread From: Jiri Olsa @ 2009-06-29 14:14 UTC (permalink / raw) To: netdev Cc: linux-kernel, fbl, nhorman, davem, htejun, jarkao2, oleg, davidel, eric.dumazet Adding memory barrier after the poll_wait function, paired with receive callbacks. Adding fuctions sock_poll_wait and sock_has_sleeper to wrap the memory barrier. Without the memory barrier, following race can happen. The race fires, when following code paths meet, and the tp->rcv_nxt and __add_wait_queue updates stay in CPU caches. CPU1 CPU2 sys_select receive packet ... ... __add_wait_queue update tp->rcv_nxt ... ... tp->rcv_nxt check sock_def_readable ... { schedule ... if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) wake_up_interruptible(sk->sk_sleep) ... } If there was no cache the code would work ok, since the wait_queue and rcv_nxt are opposit to each other. Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already passed the tp->rcv_nxt check and sleeps, or will get the new value for tp->rcv_nxt and will return with new data mask. In both cases the process (CPU1) is being added to the wait queue, so the waitqueue_active (CPU2) call cannot miss and will wake up CPU1. The bad case is when the __add_wait_queue changes done by CPU1 stay in its cache, and so does the tp->rcv_nxt update on CPU2 side. The CPU1 will then endup calling schedule and sleep forever if there are no more data on the socket. Calls to poll_wait in following modules were ommited: net/bluetooth/af_bluetooth.c net/irda/af_irda.c net/irda/irnet/irnet_ppp.c net/mac80211/rc80211_pid_debugfs.c net/phonet/socket.c net/rds/af_rds.c net/rfkill/core.c net/sunrpc/cache.c net/sunrpc/rpc_pipe.c net/tipc/socket.c wbr, jirka Signed-off-by: Jiri Olsa <jolsa@redhat.com> --- include/linux/poll.h | 11 +++++++++-- include/net/sock.h | 40 ++++++++++++++++++++++++++++++++++++++++ net/atm/common.c | 6 +++--- net/core/datagram.c | 2 +- net/core/sock.c | 8 ++++---- net/dccp/output.c | 2 +- net/dccp/proto.c | 2 +- net/ipv4/tcp.c | 2 +- net/iucv/af_iucv.c | 4 ++-- net/rxrpc/af_rxrpc.c | 4 ++-- net/unix/af_unix.c | 8 ++++---- 11 files changed, 68 insertions(+), 21 deletions(-) diff --git a/include/linux/poll.h b/include/linux/poll.h index fa287f2..b2ea8ef 100644 --- a/include/linux/poll.h +++ b/include/linux/poll.h @@ -35,10 +35,17 @@ typedef struct poll_table_struct { unsigned long key; } poll_table; -static inline void poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p) +static inline void __poll_wait(struct file *filp, + wait_queue_head_t *wait_address, poll_table *p) +{ + p->qproc(filp, wait_address, p); +} + +static inline void poll_wait(struct file *filp, + wait_queue_head_t *wait_address, poll_table *p) { if (p && wait_address) - p->qproc(filp, wait_address, p); + __poll_wait(filp, wait_address, p); } static inline void init_poll_funcptr(poll_table *pt, poll_queue_proc qproc) diff --git a/include/net/sock.h b/include/net/sock.h index 352f06b..e9137ed 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -54,6 +54,7 @@ #include <linux/filter.h> #include <linux/rculist_nulls.h> +#include <linux/poll.h> #include <asm/atomic.h> #include <net/dst.h> @@ -1241,6 +1242,45 @@ static inline int sk_has_allocations(const struct sock *sk) return sk_wmem_alloc_get(sk) || sk_rmem_alloc_get(sk); } +/** + * sk_has_sleeper - check if there are any waiting processes + * @sk: socket + * + * Returns true if socket has waiting processes + */ +static inline int sk_has_sleeper(struct sock *sk) +{ + /* + * We need to be sure we are in sync with the + * add_wait_queue modifications to the wait queue. + * + * This memory barrier is paired in the sock_poll_wait. + */ + smp_mb(); + return sk->sk_sleep && waitqueue_active(sk->sk_sleep); +} + +/** + * sock_poll_wait - place memory barrier behind the __poll_wait call. + * @filp: file + * @sk: socket + * @p: poll_table + */ +static inline void sock_poll_wait(struct file *filp, struct sock *sk, + poll_table *p) +{ + if (p && sk->sk_sleep) { + __poll_wait(filp, sk->sk_sleep, p); + /* + * We need to be sure we are in sync with the + * socket flags modification. + * + * This memory barrier is paired in the sk_has_sleeper. + */ + smp_mb(); + } +} + /* * Queue a received datagram if it will fit. Stream and sequenced * protocols can't normally use this as they need to fit buffers in diff --git a/net/atm/common.c b/net/atm/common.c index c1c9793..8c4d843 100644 --- a/net/atm/common.c +++ b/net/atm/common.c @@ -92,7 +92,7 @@ static void vcc_sock_destruct(struct sock *sk) static void vcc_def_wakeup(struct sock *sk) { read_lock(&sk->sk_callback_lock); - if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) + if (sk_has_sleeper(sk)) wake_up(sk->sk_sleep); read_unlock(&sk->sk_callback_lock); } @@ -110,7 +110,7 @@ static void vcc_write_space(struct sock *sk) read_lock(&sk->sk_callback_lock); if (vcc_writable(sk)) { - if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) + if (sk_has_sleeper(sk)) wake_up_interruptible(sk->sk_sleep); sk_wake_async(sk, SOCK_WAKE_SPACE, POLL_OUT); @@ -594,7 +594,7 @@ unsigned int vcc_poll(struct file *file, struct socket *sock, poll_table *wait) struct atm_vcc *vcc; unsigned int mask; - poll_wait(file, sk->sk_sleep, wait); + sock_poll_wait(file, sk->sk_sleep, wait); mask = 0; vcc = ATM_SD(sock); diff --git a/net/core/datagram.c b/net/core/datagram.c index 58abee1..b0fe692 100644 --- a/net/core/datagram.c +++ b/net/core/datagram.c @@ -712,7 +712,7 @@ unsigned int datagram_poll(struct file *file, struct socket *sock, struct sock *sk = sock->sk; unsigned int mask; - poll_wait(file, sk->sk_sleep, wait); + sock_poll_wait(file, sk->sk_sleep, wait); mask = 0; /* exceptional events? */ diff --git a/net/core/sock.c b/net/core/sock.c index b0ba569..6354863 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1715,7 +1715,7 @@ EXPORT_SYMBOL(sock_no_sendpage); static void sock_def_wakeup(struct sock *sk) { read_lock(&sk->sk_callback_lock); - if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) + if (sk_has_sleeper(sk)) wake_up_interruptible_all(sk->sk_sleep); read_unlock(&sk->sk_callback_lock); } @@ -1723,7 +1723,7 @@ static void sock_def_wakeup(struct sock *sk) static void sock_def_error_report(struct sock *sk) { read_lock(&sk->sk_callback_lock); - if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) + if (sk_has_sleeper(sk)) wake_up_interruptible_poll(sk->sk_sleep, POLLERR); sk_wake_async(sk, SOCK_WAKE_IO, POLL_ERR); read_unlock(&sk->sk_callback_lock); @@ -1732,7 +1732,7 @@ static void sock_def_error_report(struct sock *sk) static void sock_def_readable(struct sock *sk, int len) { read_lock(&sk->sk_callback_lock); - if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) + if (sk_has_sleeper(sk)) wake_up_interruptible_sync_poll(sk->sk_sleep, POLLIN | POLLRDNORM | POLLRDBAND); sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN); @@ -1747,7 +1747,7 @@ static void sock_def_write_space(struct sock *sk) * progress. --DaveM */ if ((atomic_read(&sk->sk_wmem_alloc) << 1) <= sk->sk_sndbuf) { - if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) + if (sk_has_sleeper(sk)) wake_up_interruptible_sync_poll(sk->sk_sleep, POLLOUT | POLLWRNORM | POLLWRBAND); diff --git a/net/dccp/output.c b/net/dccp/output.c index c0e88c1..c96119f 100644 --- a/net/dccp/output.c +++ b/net/dccp/output.c @@ -196,7 +196,7 @@ void dccp_write_space(struct sock *sk) { read_lock(&sk->sk_callback_lock); - if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) + if (sk_has_sleeper(sk)) wake_up_interruptible(sk->sk_sleep); /* Should agree with poll, otherwise some programs break */ if (sock_writeable(sk)) diff --git a/net/dccp/proto.c b/net/dccp/proto.c index 314a1b5..94ca8ea 100644 --- a/net/dccp/proto.c +++ b/net/dccp/proto.c @@ -311,7 +311,7 @@ unsigned int dccp_poll(struct file *file, struct socket *sock, unsigned int mask; struct sock *sk = sock->sk; - poll_wait(file, sk->sk_sleep, wait); + sock_poll_wait(file, sk->sk_sleep, wait); if (sk->sk_state == DCCP_LISTEN) return inet_csk_listen_poll(sk); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 17b89c5..aa0ac36 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -339,7 +339,7 @@ unsigned int tcp_poll(struct file *file, struct socket *sock, poll_table *wait) struct sock *sk = sock->sk; struct tcp_sock *tp = tcp_sk(sk); - poll_wait(file, sk->sk_sleep, wait); + sock_poll_wait(file, sk->sk_sleep, wait); if (sk->sk_state == TCP_LISTEN) return inet_csk_listen_poll(sk); diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c index 6be5f92..49c15b4 100644 --- a/net/iucv/af_iucv.c +++ b/net/iucv/af_iucv.c @@ -306,7 +306,7 @@ static inline int iucv_below_msglim(struct sock *sk) static void iucv_sock_wake_msglim(struct sock *sk) { read_lock(&sk->sk_callback_lock); - if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) + if (sk_has_sleeper(sk)) wake_up_interruptible_all(sk->sk_sleep); sk_wake_async(sk, SOCK_WAKE_SPACE, POLL_OUT); read_unlock(&sk->sk_callback_lock); @@ -1256,7 +1256,7 @@ unsigned int iucv_sock_poll(struct file *file, struct socket *sock, struct sock *sk = sock->sk; unsigned int mask = 0; - poll_wait(file, sk->sk_sleep, wait); + sock_poll_wait(file, sk->sk_sleep, wait); if (sk->sk_state == IUCV_LISTEN) return iucv_accept_poll(sk); diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c index eac5e7b..bfe493e 100644 --- a/net/rxrpc/af_rxrpc.c +++ b/net/rxrpc/af_rxrpc.c @@ -63,7 +63,7 @@ static void rxrpc_write_space(struct sock *sk) _enter("%p", sk); read_lock(&sk->sk_callback_lock); if (rxrpc_writable(sk)) { - if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) + if (sk_has_sleeper(sk)) wake_up_interruptible(sk->sk_sleep); sk_wake_async(sk, SOCK_WAKE_SPACE, POLL_OUT); } @@ -588,7 +588,7 @@ static unsigned int rxrpc_poll(struct file *file, struct socket *sock, unsigned int mask; struct sock *sk = sock->sk; - poll_wait(file, sk->sk_sleep, wait); + sock_poll_wait(file, sk->sk_sleep, wait); mask = 0; /* the socket is readable if there are any messages waiting on the Rx diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 36d4e44..fc3ebb9 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -315,7 +315,7 @@ static void unix_write_space(struct sock *sk) { read_lock(&sk->sk_callback_lock); if (unix_writable(sk)) { - if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) + if (sk_has_sleeper(sk)) wake_up_interruptible_sync(sk->sk_sleep); sk_wake_async(sk, SOCK_WAKE_SPACE, POLL_OUT); } @@ -1985,7 +1985,7 @@ static unsigned int unix_poll(struct file *file, struct socket *sock, poll_table struct sock *sk = sock->sk; unsigned int mask; - poll_wait(file, sk->sk_sleep, wait); + sock_poll_wait(file, sk->sk_sleep, wait); mask = 0; /* exceptional events? */ @@ -2022,7 +2022,7 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock, struct sock *sk = sock->sk, *other; unsigned int mask, writable; - poll_wait(file, sk->sk_sleep, wait); + sock_poll_wait(file, sk->sk_sleep, wait); mask = 0; /* exceptional events? */ @@ -2053,7 +2053,7 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock, other = unix_peer_get(sk); if (other) { if (unix_peer(other) != sk) { - poll_wait(file, &unix_sk(other)->peer_wait, + sock_poll_wait(file, &unix_sk(other)->peer_wait, wait); if (unix_recvq_full(other)) writable = 0; ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks 2009-06-29 14:14 ` [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks Jiri Olsa @ 2009-06-29 15:34 ` Davide Libenzi 2009-06-29 17:32 ` Jarek Poplawski 2009-06-29 17:19 ` Jarek Poplawski 2009-06-29 17:47 ` Jarek Poplawski 2 siblings, 1 reply; 14+ messages in thread From: Davide Libenzi @ 2009-06-29 15:34 UTC (permalink / raw) To: Jiri Olsa Cc: netdev, Linux Kernel Mailing List, fbl, nhorman, davem, htejun, jarkao2, oleg, eric.dumazet On Mon, 29 Jun 2009, Jiri Olsa wrote: > -static inline void poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p) > +static inline void __poll_wait(struct file *filp, > + wait_queue_head_t *wait_address, poll_table *p) > +{ > + p->qproc(filp, wait_address, p); > +} > + > +static inline void poll_wait(struct file *filp, > + wait_queue_head_t *wait_address, poll_table *p) > { > if (p && wait_address) > - p->qproc(filp, wait_address, p); > + __poll_wait(filp, wait_address, p); > } > +static inline void sock_poll_wait(struct file *filp, struct sock *sk, > + poll_table *p) > +{ > + if (p && sk->sk_sleep) { > + __poll_wait(filp, sk->sk_sleep, p); > + /* > + * We need to be sure we are in sync with the > + * socket flags modification. > + * > + * This memory barrier is paired in the sk_has_sleeper. > + */ > + smp_mb(); > + } > +} I think Oleg already said this, but you can use directly poll_wait() without adding another abstraction, and the compiler will drop the double check for you: extern void foo(int, int, int); extern void mb(void); static inline void cfoo(int a, int b, int c) { if (b && c) foo(a, b, c); } void xxx(int a, int b, int c) { if (b && c) { cfoo(a, b, c); mb(); } } ----- xxx: subq $8, %rsp testl %esi, %esi je .L3 testl %edx, %edx je .L3 call foo addq $8, %rsp jmp mb .L3: addq $8, %rsp ret - Davide ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks 2009-06-29 15:34 ` Davide Libenzi @ 2009-06-29 17:32 ` Jarek Poplawski 2009-06-29 17:36 ` Davide Libenzi 0 siblings, 1 reply; 14+ messages in thread From: Jarek Poplawski @ 2009-06-29 17:32 UTC (permalink / raw) To: Davide Libenzi Cc: Jiri Olsa, netdev, Linux Kernel Mailing List, fbl, nhorman, davem, htejun, oleg, eric.dumazet On Mon, Jun 29, 2009 at 08:34:55AM -0700, Davide Libenzi wrote: > On Mon, 29 Jun 2009, Jiri Olsa wrote: > > > -static inline void poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p) > > +static inline void __poll_wait(struct file *filp, > > + wait_queue_head_t *wait_address, poll_table *p) > > +{ > > + p->qproc(filp, wait_address, p); > > +} > > + > > +static inline void poll_wait(struct file *filp, > > + wait_queue_head_t *wait_address, poll_table *p) > > { > > if (p && wait_address) > > - p->qproc(filp, wait_address, p); > > + __poll_wait(filp, wait_address, p); > > } > > > +static inline void sock_poll_wait(struct file *filp, struct sock *sk, > > + poll_table *p) > > +{ > > + if (p && sk->sk_sleep) { > > + __poll_wait(filp, sk->sk_sleep, p); > > + /* > > + * We need to be sure we are in sync with the > > + * socket flags modification. > > + * > > + * This memory barrier is paired in the sk_has_sleeper. > > + */ > > + smp_mb(); > > + } > > +} > > I think Oleg already said this, but you can use directly poll_wait() > without adding another abstraction, and the compiler will drop the double > check for you: I think Oleg told about cosmetics and let Jiri to choose. I'd only add it's not mainly about optimization, but easy showing the main difference, of course depending on taste. Jarek P. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks 2009-06-29 17:32 ` Jarek Poplawski @ 2009-06-29 17:36 ` Davide Libenzi 2009-06-29 18:04 ` Jarek Poplawski 2009-06-29 20:17 ` Jiri Olsa 0 siblings, 2 replies; 14+ messages in thread From: Davide Libenzi @ 2009-06-29 17:36 UTC (permalink / raw) To: Jarek Poplawski Cc: Jiri Olsa, netdev, Linux Kernel Mailing List, fbl, nhorman, davem, htejun, oleg, eric.dumazet On Mon, 29 Jun 2009, Jarek Poplawski wrote: > > I think Oleg already said this, but you can use directly poll_wait() > > without adding another abstraction, and the compiler will drop the double > > check for you: > > I think Oleg told about cosmetics and let Jiri to choose. I'd only > add it's not mainly about optimization, but easy showing the main > difference, of course depending on taste. We already have a universally used function to do that, and that's poll_wait(). That code (adding an extra __poll_wait()) was entirely about optimizations (otherwise why not use the existing poll_wait()?), so if the optimization does not actually take place, IMO it's better to not add an extra API. - Davide ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks 2009-06-29 17:36 ` Davide Libenzi @ 2009-06-29 18:04 ` Jarek Poplawski 2009-06-29 18:14 ` Jarek Poplawski 2009-06-29 20:17 ` Jiri Olsa 1 sibling, 1 reply; 14+ messages in thread From: Jarek Poplawski @ 2009-06-29 18:04 UTC (permalink / raw) To: Davide Libenzi Cc: Jiri Olsa, netdev, Linux Kernel Mailing List, fbl, nhorman, davem, htejun, oleg, eric.dumazet On Mon, Jun 29, 2009 at 10:36:30AM -0700, Davide Libenzi wrote: > On Mon, 29 Jun 2009, Jarek Poplawski wrote: > > > > I think Oleg already said this, but you can use directly poll_wait() > > > without adding another abstraction, and the compiler will drop the double > > > check for you: > > > > I think Oleg told about cosmetics and let Jiri to choose. I'd only > > add it's not mainly about optimization, but easy showing the main > > difference, of course depending on taste. > > We already have a universally used function to do that, and that's > poll_wait(). > That code (adding an extra __poll_wait()) was entirely about > optimizations (otherwise why not use the existing poll_wait()?), so if > the optimization does not actually take place, IMO it's better to not add > an extra API. OK, you're right, it is about optimization! But IMHO mainly about reading optimization... I simply guess me and probably Jiri too, after reading Oleg's variant thought about compiler, instead of the real difference. Btw., maybe I miss something but I guess Oleg proposed something in between: inlining __poll_wait(), which would save us 'extra API' and compiler doubts. (But I still prefer Jiri's choice. ;-) Jarek P. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks 2009-06-29 18:04 ` Jarek Poplawski @ 2009-06-29 18:14 ` Jarek Poplawski 2009-06-29 19:47 ` Jarek Poplawski 0 siblings, 1 reply; 14+ messages in thread From: Jarek Poplawski @ 2009-06-29 18:14 UTC (permalink / raw) To: Davide Libenzi Cc: Jiri Olsa, netdev, Linux Kernel Mailing List, fbl, nhorman, davem, htejun, oleg, eric.dumazet On Mon, Jun 29, 2009 at 08:04:31PM +0200, Jarek Poplawski wrote: ... > Btw., maybe I miss something but I guess Oleg proposed something in > between: inlining __poll_wait(), which would save us 'extra API' and > compiler doubts. (But I still prefer Jiri's choice. ;-) After re-reading I guess Oleg didn't proposed anything in between yet. Sorry, Jarek P. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks 2009-06-29 18:14 ` Jarek Poplawski @ 2009-06-29 19:47 ` Jarek Poplawski 0 siblings, 0 replies; 14+ messages in thread From: Jarek Poplawski @ 2009-06-29 19:47 UTC (permalink / raw) To: Davide Libenzi Cc: Jiri Olsa, netdev, Linux Kernel Mailing List, fbl, nhorman, davem, htejun, oleg, eric.dumazet On Mon, Jun 29, 2009 at 08:14:42PM +0200, Jarek Poplawski wrote: ... > > (But I still prefer Jiri's choice. ;-) ...Even if Jiri decides to change his mind, because it really wasn't intended for any persuasions. Jarek P. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks 2009-06-29 17:36 ` Davide Libenzi 2009-06-29 18:04 ` Jarek Poplawski @ 2009-06-29 20:17 ` Jiri Olsa 2009-06-29 20:20 ` Davide Libenzi 1 sibling, 1 reply; 14+ messages in thread From: Jiri Olsa @ 2009-06-29 20:17 UTC (permalink / raw) To: Davide Libenzi Cc: Jarek Poplawski, netdev, Linux Kernel Mailing List, fbl, nhorman, davem, htejun, oleg, eric.dumazet On Mon, Jun 29, 2009 at 10:36:30AM -0700, Davide Libenzi wrote: > On Mon, 29 Jun 2009, Jarek Poplawski wrote: > > > > I think Oleg already said this, but you can use directly poll_wait() > > > without adding another abstraction, and the compiler will drop the double > > > check for you: > > > > I think Oleg told about cosmetics and let Jiri to choose. I'd only > > add it's not mainly about optimization, but easy showing the main > > difference, of course depending on taste. > > We already have a universally used function to do that, and that's > poll_wait(). > That code (adding an extra __poll_wait()) was entirely about > optimizations (otherwise why not use the existing poll_wait()?), so if > the optimization does not actually take place, IMO it's better to not add > an extra API. > > > > - Davide > > my thinking was that both variants will endup in the same code anyway, so it'd be probably better if the more readable (subjective) got in.. however I dont have any strong preffering feelings about either of those choices, so I can convert easilly :) jirka ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks 2009-06-29 20:17 ` Jiri Olsa @ 2009-06-29 20:20 ` Davide Libenzi 0 siblings, 0 replies; 14+ messages in thread From: Davide Libenzi @ 2009-06-29 20:20 UTC (permalink / raw) To: Jiri Olsa Cc: Jarek Poplawski, netdev, Linux Kernel Mailing List, fbl, nhorman, davem, htejun, oleg, eric.dumazet On Mon, 29 Jun 2009, Jiri Olsa wrote: > my thinking was that both variants will endup in the same code anyway, > so it'd be probably better if the more readable (subjective) got in.. > > however I dont have any strong preffering feelings about either of those choices, > so I can convert easilly :) Please use the existing poll_wait() then, as there's no reason to add another API. - Davide ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks 2009-06-29 14:14 ` [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks Jiri Olsa 2009-06-29 15:34 ` Davide Libenzi @ 2009-06-29 17:19 ` Jarek Poplawski 2009-06-29 20:05 ` Jiri Olsa 2009-06-29 17:47 ` Jarek Poplawski 2 siblings, 1 reply; 14+ messages in thread From: Jarek Poplawski @ 2009-06-29 17:19 UTC (permalink / raw) To: Jiri Olsa Cc: netdev, linux-kernel, fbl, nhorman, davem, htejun, oleg, davidel, eric.dumazet On Mon, Jun 29, 2009 at 04:14:45PM +0200, Jiri Olsa wrote: ... > +static inline void sock_poll_wait(struct file *filp, struct sock *sk, ... > + sock_poll_wait(file, sk->sk_sleep, wait); -----------------------------^^^^^^^^^^^^ Is it something with my eyes or it wasn't compiled? ;-) Jarek P. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks 2009-06-29 17:19 ` Jarek Poplawski @ 2009-06-29 20:05 ` Jiri Olsa 0 siblings, 0 replies; 14+ messages in thread From: Jiri Olsa @ 2009-06-29 20:05 UTC (permalink / raw) To: Jarek Poplawski Cc: netdev, linux-kernel, fbl, nhorman, davem, htejun, oleg, davidel, eric.dumazet On Mon, Jun 29, 2009 at 07:19:26PM +0200, Jarek Poplawski wrote: > On Mon, Jun 29, 2009 at 04:14:45PM +0200, Jiri Olsa wrote: > ... > > +static inline void sock_poll_wait(struct file *filp, struct sock *sk, > ... > > + sock_poll_wait(file, sk->sk_sleep, wait); > > -----------------------------^^^^^^^^^^^^ > Is it something with my eyes or it wasn't compiled? ;-) > > Jarek P. your eyes are great, my brain is screwed... sry about that, I'll resend it was compiled actually.. must have ended up with warning I did not noticed... thanks! jirka ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks 2009-06-29 14:14 ` [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks Jiri Olsa 2009-06-29 15:34 ` Davide Libenzi 2009-06-29 17:19 ` Jarek Poplawski @ 2009-06-29 17:47 ` Jarek Poplawski 2 siblings, 0 replies; 14+ messages in thread From: Jarek Poplawski @ 2009-06-29 17:47 UTC (permalink / raw) To: Jiri Olsa Cc: netdev, linux-kernel, fbl, nhorman, davem, htejun, oleg, davidel, eric.dumazet On Mon, Jun 29, 2009 at 04:14:45PM +0200, Jiri Olsa wrote: ... > +/** > + * sk_has_sleeper - check if there are any waiting processes > + * @sk: socket > + * > + * Returns true if socket has waiting processes > + */ > +static inline int sk_has_sleeper(struct sock *sk) > +{ > + /* > + * We need to be sure we are in sync with the > + * add_wait_queue modifications to the wait queue. > + * > + * This memory barrier is paired in the sock_poll_wait. > + */ > + smp_mb(); > + return sk->sk_sleep && waitqueue_active(sk->sk_sleep); > +} > + Btw. I hope Jiri won't "listen" to me, but I can't stop to mention sock_waitqueue_active() looks to me quite naturally better paired with sock_poll_wait() than sk_has_sleeper(which otherwise is more conspicuous, sorry Eric.) Jarek P. > +/** > + * sock_poll_wait - place memory barrier behind the __poll_wait call. > + * @filp: file > + * @sk: socket > + * @p: poll_table > + */ > +static inline void sock_poll_wait(struct file *filp, struct sock *sk, > + poll_table *p) > +{ > + if (p && sk->sk_sleep) { > + __poll_wait(filp, sk->sk_sleep, p); > + /* > + * We need to be sure we are in sync with the > + * socket flags modification. > + * > + * This memory barrier is paired in the sk_has_sleeper. > + */ > + smp_mb(); > + } > +} ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/2] memory barrier: adding smp_mb__after_lock 2009-06-29 14:04 [PATCH 0/2] net: fix race in the receive/select Jiri Olsa 2009-06-29 14:14 ` [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks Jiri Olsa @ 2009-06-29 14:15 ` Jiri Olsa 1 sibling, 0 replies; 14+ messages in thread From: Jiri Olsa @ 2009-06-29 14:15 UTC (permalink / raw) To: netdev Cc: linux-kernel, fbl, nhorman, davem, htejun, jarkao2, oleg, davidel, eric.dumazet Adding smp_mb__after_lock define to be used as a smp_mb call after a lock. Making it nop for x86, since {read|write|spin}_lock() on x86 are full memory barriers. wbr, jirka Signed-off-by: Jiri Olsa <jolsa@redhat.com> --- arch/x86/include/asm/spinlock.h | 3 +++ include/linux/spinlock.h | 5 +++++ include/net/sock.h | 2 +- 3 files changed, 9 insertions(+), 1 deletions(-) diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h index b7e5db8..39ecc5f 100644 --- a/arch/x86/include/asm/spinlock.h +++ b/arch/x86/include/asm/spinlock.h @@ -302,4 +302,7 @@ static inline void __raw_write_unlock(raw_rwlock_t *rw) #define _raw_read_relax(lock) cpu_relax() #define _raw_write_relax(lock) cpu_relax() +/* The {read|write|spin}_lock() on x86 are full memory barriers. */ +#define smp_mb__after_lock() do { } while (0) + #endif /* _ASM_X86_SPINLOCK_H */ diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h index 252b245..ae053bd 100644 --- a/include/linux/spinlock.h +++ b/include/linux/spinlock.h @@ -132,6 +132,11 @@ do { \ #endif /*__raw_spin_is_contended*/ #endif +/* The lock does not imply full memory barrier. */ +#ifndef smp_mb__after_lock +#define smp_mb__after_lock() smp_mb() +#endif + /** * spin_unlock_wait - wait until the spinlock gets unlocked * @lock: the spinlock in question. diff --git a/include/net/sock.h b/include/net/sock.h index e9137ed..05fbf8b 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1277,7 +1277,7 @@ static inline void sock_poll_wait(struct file *filp, struct sock *sk, * * This memory barrier is paired in the sk_has_sleeper. */ - smp_mb(); + smp_mb__after_lock(); } } ^ permalink raw reply related [flat|nested] 14+ messages in thread
end of thread, other threads:[~2009-06-29 20:20 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-06-29 14:04 [PATCH 0/2] net: fix race in the receive/select Jiri Olsa 2009-06-29 14:14 ` [PATCH 1/2] net: adding memory barrier to the poll and receive callbacks Jiri Olsa 2009-06-29 15:34 ` Davide Libenzi 2009-06-29 17:32 ` Jarek Poplawski 2009-06-29 17:36 ` Davide Libenzi 2009-06-29 18:04 ` Jarek Poplawski 2009-06-29 18:14 ` Jarek Poplawski 2009-06-29 19:47 ` Jarek Poplawski 2009-06-29 20:17 ` Jiri Olsa 2009-06-29 20:20 ` Davide Libenzi 2009-06-29 17:19 ` Jarek Poplawski 2009-06-29 20:05 ` Jiri Olsa 2009-06-29 17:47 ` Jarek Poplawski 2009-06-29 14:15 ` [PATCH 2/2] memory barrier: adding smp_mb__after_lock Jiri Olsa
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).