Re: [PATCH] datagram: Extend the datagram queue MSG_PEEK-ing

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Pavel Emelyanov <xemul@parallels.com>
To: David Miller <davem@davemloft.net>,
	Eric Dumazet <eric.dumazet@gmail.com>
Cc: "tj@kernel.org" <tj@kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [PATCH] datagram: Extend the datagram queue MSG_PEEK-ing
Date: Mon, 20 Feb 2012 16:17:11 +0400	[thread overview]
Message-ID: <4F4239C7.6040905@parallels.com> (raw)
In-Reply-To: <20120215.155226.1446930776120577759.davem@davemloft.net>

> I'm still thinking about how I feel about this.
> 
> But meanwhile you can fix some thing up, for example I really
> feel that you should make sure that multiple bits aren't set
> at the same time.
> 
> You either mean MSG_PEEK_MORE or MSG_PEEK_AGAIN, so setting both
> makes no sense and should be flagged as an error.

Actually Eric has pointed out the critical issue with this -- if a task
(whose socket's queue we're about to peek) is currently peek-ing this
queue himself, we will peek only the queue's tail and will potentially 
break this task's peeking in case he uses the _MORE/_AGAIN macros :(


I was thinking about another option of doing the same, how about introducing
the peek offset member on sock (get/set via sockopt) which works like

* if == -1, then peek works as before
* if >= 0, then each peek/recvmsg will increase/decrease the value so that
  the next peek peeks next data

It's questionable what to do if the peek_off points into the middle of a
datagram however. Here's an example of how this looks for datagram sockets
(tested on pf_unix), for stream this requires more patching.

What do you think? Does it make sense to go on with this making other 
->recvmsg handlers support peeking offset?

---

diff --git a/include/asm-generic/socket.h b/include/asm-generic/socket.h
index 49c1704..832c270 100644
--- a/include/asm-generic/socket.h
+++ b/include/asm-generic/socket.h
@@ -66,5 +66,6 @@
 #define SO_RXQ_OVFL             40
 
 #define SO_WIFI_STATUS		41
+#define SO_PEEK_OFF		42
 #define SCM_WIFI_STATUS	SO_WIFI_STATUS
 #endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/include/net/sock.h b/include/net/sock.h
index 91c1c8b..94b0372 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -357,6 +357,7 @@ struct sock {
 	struct page		*sk_sndmsg_page;
 	struct sk_buff		*sk_send_head;
 	__u32			sk_sndmsg_off;
+	__s32			sk_peek_off;
 	int			sk_write_pending;
 #ifdef CONFIG_SECURITY
 	void			*sk_security;
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 68bbf9f..78e5147 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -180,21 +180,37 @@ struct sk_buff *__skb_recv_datagram(struct sock *sk, unsigned flags,
 		 * However, this function was correct in any case. 8)
 		 */
 		unsigned long cpu_flags;
+		long skip;
 
 		spin_lock_irqsave(&sk->sk_receive_queue.lock, cpu_flags);
-		skb = skb_peek(&sk->sk_receive_queue);
-		if (skb) {
+		skip = sk->sk_peek_off;
+		skb_queue_walk(&sk->sk_receive_queue, skb) {
 			*peeked = skb->peeked;
 			if (flags & MSG_PEEK) {
 				skb->peeked = 1;
 				atomic_inc(&skb->users);
-			} else
+				if (skip >= skb->len) {
+					skip -= skb->len;
+					continue;
+				}
+
+				if (sk->sk_peek_off >= 0)
+					sk->sk_peek_off += (skb->len - skip);
+			} else {
 				__skb_unlink(skb, &sk->sk_receive_queue);
-		}
-		spin_unlock_irqrestore(&sk->sk_receive_queue.lock, cpu_flags);
 
-		if (skb)
+				if (sk->sk_peek_off >= 0) {
+					if (sk->sk_peek_off >= skb->len)
+						sk->sk_peek_off -= skb->len;
+					else
+						sk->sk_peek_off = 0;
+				}
+			}
+
+			spin_unlock_irqrestore(&sk->sk_receive_queue.lock, cpu_flags);
 			return skb;
+		}
+		spin_unlock_irqrestore(&sk->sk_receive_queue.lock, cpu_flags);
 
 		/* User doesn't want to wait */
 		error = -EAGAIN;
diff --git a/net/core/sock.c b/net/core/sock.c
index 02f8dfe..5a5d581 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -792,7 +792,9 @@ set_rcvbuf:
 	case SO_WIFI_STATUS:
 		sock_valbool_flag(sk, SOCK_WIFI_STATUS, valbool);
 		break;
-
+	case SO_PEEK_OFF:
+		sk->sk_peek_off = val;
+		break;
 	default:
 		ret = -ENOPROTOOPT;
 		break;
@@ -1017,6 +1019,9 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
 	case SO_WIFI_STATUS:
 		v.val = !!sock_flag(sk, SOCK_WIFI_STATUS);
 		break;
+	case SO_PEEK_OFF:
+		v.val = sk->sk_peek_off;
+		break;
 
 	default:
 		return -ENOPROTOOPT;
@@ -2092,6 +2097,7 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 
 	sk->sk_sndmsg_page	=	NULL;
 	sk->sk_sndmsg_off	=	0;
+	sk->sk_peek_off		=	-1;
 
 	sk->sk_peer_pid 	=	NULL;
 	sk->sk_peer_cred	=	NULL;

next prev parent reply	other threads:[~2012-02-20 12:17 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-10 13:54 [PATCH] datagram: Extend the datagram queue MSG_PEEK-ing Pavel Emelyanov
2012-02-10 14:41 ` Eric Dumazet
2012-02-10 14:52   ` Pavel Emelyanov
2012-02-15 20:52 ` David Miller
2012-02-20 12:17   ` Pavel Emelyanov [this message]
2012-02-21  5:01     ` David Miller

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:49c1704 dfblob:832c270 dfblob:91c1c8b dfblob:94b0372
dfblob:68bbf9f dfblob:78e5147 dfblob:02f8dfe dfblob:5a5d581 )
 OR (
bs:"Re: [PATCH] datagram: Extend the datagram queue MSG_PEEK-ing" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F4239C7.6040905@parallels.com \
    --to=xemul@parallels.com \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.