* [PATCH net-next 0/6] Ability to peek full socket queue
@ 2012-02-21 17:30 Pavel Emelyanov
2012-02-21 17:30 ` [PATCH 1/6] datagram: Factor out sk queue referencing Pavel Emelyanov
` (5 more replies)
0 siblings, 6 replies; 19+ messages in thread
From: Pavel Emelyanov @ 2012-02-21 17:30 UTC (permalink / raw)
To: David Miller, Eric Dumazet, Linux Netdev List
Hi.
This is an attempt to implement the ability to read a socket's queue without
removing skbs from it. Using MSG_PEEK alone doesn't work for unix sockets, both
dgram and stream, since it always peeks from the head of the queue.
The proposal is to implement the SO_PEEK_OFF sockopt which specifies an
offset in bytes where to start peeking the data from.
I've already sent an example of how this can look, and since nobody objected
to the concept itself, here's the "official" v1. This includes all the unix
socket types and (hopefully) addresses the locking issues David pointed out.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
* [PATCH 1/6] datagram: Factor out sk queue referencing
2012-02-21 17:30 [PATCH net-next 0/6] Ability to peek full socket queue Pavel Emelyanov
@ 2012-02-21 17:30 ` Pavel Emelyanov
2012-02-21 17:39 ` Eric Dumazet
2012-02-21 17:30 ` [PATCH 2/6] datagram: Add offset argument to __skb_recv_datagram Pavel Emelyanov
` (4 subsequent siblings)
5 siblings, 1 reply; 19+ messages in thread
From: Pavel Emelyanov @ 2012-02-21 17:30 UTC (permalink / raw)
To: David Miller, Eric Dumazet, Linux Netdev List
This makes lines shorter and simplifies further patching.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
net/core/datagram.c | 9 +++++----
1 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 68bbf9f..6f54d0a 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -180,18 +180,19 @@ struct sk_buff *__skb_recv_datagram(struct sock *sk, unsigned flags,
* However, this function was correct in any case. 8)
*/
unsigned long cpu_flags;
+ struct sk_buff_head *queue = &sk->sk_receive_queue;
- spin_lock_irqsave(&sk->sk_receive_queue.lock, cpu_flags);
- skb = skb_peek(&sk->sk_receive_queue);
+ spin_lock_irqsave(&queue->lock, cpu_flags);
+ skb = skb_peek(queue);
if (skb) {
*peeked = skb->peeked;
if (flags & MSG_PEEK) {
skb->peeked = 1;
atomic_inc(&skb->users);
} else
- __skb_unlink(skb, &sk->sk_receive_queue);
+ __skb_unlink(skb, queue);
}
- spin_unlock_irqrestore(&sk->sk_receive_queue.lock, cpu_flags);
+ spin_unlock_irqrestore(&queue->lock, cpu_flags);
if (skb)
return skb;
--
1.5.5.6
* [PATCH 2/6] datagram: Add offset argument to __skb_recv_datagram
2012-02-21 17:30 [PATCH net-next 0/6] Ability to peek full socket queue Pavel Emelyanov
2012-02-21 17:30 ` [PATCH 1/6] datagram: Factor out sk queue referencing Pavel Emelyanov
@ 2012-02-21 17:30 ` Pavel Emelyanov
2012-02-21 17:40 ` Eric Dumazet
2012-02-21 17:31 ` [PATCH 3/6] skb: Add skb_peek_next helper Pavel Emelyanov
` (3 subsequent siblings)
5 siblings, 1 reply; 19+ messages in thread
From: Pavel Emelyanov @ 2012-02-21 17:30 UTC (permalink / raw)
To: David Miller, Eric Dumazet, Linux Netdev List
The new argument is only considered when MSG_PEEK is set; the value it points
to specifies the byte offset to start peeking from. If the offset happens to
point into the middle of the returned skb, the offset within that skb is
written back through the same argument.
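As a rough illustration (not part of the patch), the walk can be modelled in
userspace with an array of buffer lengths standing in for the queued skbs. The
name pick_buffer() and the array representation are assumptions made for this
sketch only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model of the queue walk: lens[] stands in for skb->len of
 * each queued buffer. Returns the index of the buffer that *off falls
 * into (or -1 if the offset lies at or past the end of the queued data)
 * and rewrites *off to the remaining offset inside that buffer.
 */
static int pick_buffer(const int *lens, size_t n, int *off)
{
	assert(lens != NULL && off != NULL && *off >= 0);
	for (size_t i = 0; i < n; i++) {
		if (*off >= lens[i]) {
			*off -= lens[i];	/* skip this buffer entirely */
			continue;
		}
		return (int)i;
	}
	return -1;
}
```

For buffers of 4, 6 and 3 bytes, an offset of 7 selects the second buffer and
leaves a residual offset of 3 inside it.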
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
include/linux/skbuff.h | 2 +-
net/core/datagram.c | 21 +++++++++++++--------
net/ipv4/udp.c | 4 ++--
net/ipv6/udp.c | 4 ++--
4 files changed, 18 insertions(+), 13 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2b7317f..f3cf43d 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2046,7 +2046,7 @@ static inline void skb_frag_add_head(struct sk_buff *skb, struct sk_buff *frag)
for (iter = skb_shinfo(skb)->frag_list; iter; iter = iter->next)
extern struct sk_buff *__skb_recv_datagram(struct sock *sk, unsigned flags,
- int *peeked, int *err);
+ int *peeked, int *off, int *err);
extern struct sk_buff *skb_recv_datagram(struct sock *sk, unsigned flags,
int noblock, int *err);
extern unsigned int datagram_poll(struct file *file, struct socket *sock,
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 6f54d0a..d3cf12f 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -132,6 +132,8 @@ out_noerr:
* __skb_recv_datagram - Receive a datagram skbuff
* @sk: socket
* @flags: MSG_ flags
+ * @off: an offset in bytes to peek skb from. Returns an offset
+ * within an skb where data actually starts
* @peeked: returns non-zero if this packet has been seen before
* @err: error code returned
*
@@ -158,7 +160,7 @@ out_noerr:
* the standard around please.
*/
struct sk_buff *__skb_recv_datagram(struct sock *sk, unsigned flags,
- int *peeked, int *err)
+ int *peeked, int *off, int *err)
{
struct sk_buff *skb;
long timeo;
@@ -183,19 +185,22 @@ struct sk_buff *__skb_recv_datagram(struct sock *sk, unsigned flags,
struct sk_buff_head *queue = &sk->sk_receive_queue;
spin_lock_irqsave(&queue->lock, cpu_flags);
- skb = skb_peek(queue);
- if (skb) {
+ skb_queue_walk(queue, skb) {
*peeked = skb->peeked;
if (flags & MSG_PEEK) {
+ if (*off >= skb->len) {
+ *off -= skb->len;
+ continue;
+ }
skb->peeked = 1;
atomic_inc(&skb->users);
} else
__skb_unlink(skb, queue);
- }
- spin_unlock_irqrestore(&queue->lock, cpu_flags);
- if (skb)
+ spin_unlock_irqrestore(&queue->lock, cpu_flags);
return skb;
+ }
+ spin_unlock_irqrestore(&queue->lock, cpu_flags);
/* User doesn't want to wait */
error = -EAGAIN;
@@ -215,10 +220,10 @@ EXPORT_SYMBOL(__skb_recv_datagram);
struct sk_buff *skb_recv_datagram(struct sock *sk, unsigned flags,
int noblock, int *err)
{
- int peeked;
+ int peeked, off = 0;
return __skb_recv_datagram(sk, flags | (noblock ? MSG_DONTWAIT : 0),
- &peeked, err);
+ &peeked, &off, err);
}
EXPORT_SYMBOL(skb_recv_datagram);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index cd99f1a..7c41ab8 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1167,7 +1167,7 @@ int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
struct sockaddr_in *sin = (struct sockaddr_in *)msg->msg_name;
struct sk_buff *skb;
unsigned int ulen, copied;
- int peeked;
+ int peeked, off = 0;
int err;
int is_udplite = IS_UDPLITE(sk);
bool slow;
@@ -1183,7 +1183,7 @@ int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
try_again:
skb = __skb_recv_datagram(sk, flags | (noblock ? MSG_DONTWAIT : 0),
- &peeked, &err);
+ &peeked, &off, &err);
if (!skb)
goto out;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 8aebf8f..37b0699 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -342,7 +342,7 @@ int udpv6_recvmsg(struct kiocb *iocb, struct sock *sk,
struct inet_sock *inet = inet_sk(sk);
struct sk_buff *skb;
unsigned int ulen, copied;
- int peeked;
+ int peeked, off = 0;
int err;
int is_udplite = IS_UDPLITE(sk);
int is_udp4;
@@ -359,7 +359,7 @@ int udpv6_recvmsg(struct kiocb *iocb, struct sock *sk,
try_again:
skb = __skb_recv_datagram(sk, flags | (noblock ? MSG_DONTWAIT : 0),
- &peeked, &err);
+ &peeked, &off, &err);
if (!skb)
goto out;
--
1.5.5.6
* [PATCH 3/6] skb: Add skb_peek_next helper
2012-02-21 17:30 [PATCH net-next 0/6] Ability to peek full socket queue Pavel Emelyanov
2012-02-21 17:30 ` [PATCH 1/6] datagram: Factor out sk queue referencing Pavel Emelyanov
2012-02-21 17:30 ` [PATCH 2/6] datagram: Add offset argument to __skb_recv_datagram Pavel Emelyanov
@ 2012-02-21 17:31 ` Pavel Emelyanov
2012-02-21 17:43 ` Eric Dumazet
2012-02-21 17:31 ` [PATCH 4/6] sock: Introduce the SO_PEEK_OFF sock option Pavel Emelyanov
` (2 subsequent siblings)
5 siblings, 1 reply; 19+ messages in thread
From: Pavel Emelyanov @ 2012-02-21 17:31 UTC (permalink / raw)
To: David Miller, Eric Dumazet, Linux Netdev List
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
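Illustration only (not part of the patch): the helper relies on the queue being
a circular list whose head acts as a sentinel, so "next is the head" means the
end of the queue was reached. A minimal userspace sketch of that idea follows;
struct node, list_init(), list_add_tail() and peek_next() are hypothetical
stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* A circular doubly linked list whose head is a sentinel node. */
struct node {
	struct node *next, *prev;
	int id;
};

static void list_init(struct node *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct node *head, struct node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Same idea as skb_peek_next(): NULL once the walk wraps to the head. */
static struct node *peek_next(struct node *n, const struct node *head)
{
	struct node *next = n->next;

	assert(n != head);
	return next == head ? NULL : next;
}
```

As with the real helper, no reference is taken on the returned element, so a
caller would need to hold whatever lock protects the list.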
include/linux/skbuff.h | 18 ++++++++++++++++++
1 files changed, 18 insertions(+), 0 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f3cf43d..c11a44e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -877,6 +877,24 @@ static inline struct sk_buff *skb_peek(const struct sk_buff_head *list_)
}
/**
+ * skb_peek_next - peek skb following the given one from a queue
+ * @skb: skb to start from
+ * @list_: list to peek at
+ *
+ * Returns %NULL when the end of the list is met or a pointer to the
+ * next element. The reference count is not incremented and the
+ * reference is therefore volatile. Use with caution.
+ */
+static inline struct sk_buff *skb_peek_next(struct sk_buff *skb,
+ const struct sk_buff_head *list_)
+{
+ struct sk_buff *next = skb->next;
+ if (next == (struct sk_buff *)list_)
+ next = NULL;
+ return next;
+}
+
+/**
* skb_peek_tail - peek at the tail of an &sk_buff_head
* @list_: list to peek at
*
--
1.5.5.6
* [PATCH 4/6] sock: Introduce the SO_PEEK_OFF sock option
2012-02-21 17:30 [PATCH net-next 0/6] Ability to peek full socket queue Pavel Emelyanov
` (2 preceding siblings ...)
2012-02-21 17:31 ` [PATCH 3/6] skb: Add skb_peek_next helper Pavel Emelyanov
@ 2012-02-21 17:31 ` Pavel Emelyanov
2012-02-21 17:45 ` Eric Dumazet
2012-02-21 17:31 ` [PATCH 5/6] unix: Support peeking offset for datagram and seqpacket sockets Pavel Emelyanov
2012-02-21 17:32 ` [PATCH 6/6] unix: Support peeking offset for stream sockets Pavel Emelyanov
5 siblings, 1 reply; 19+ messages in thread
From: Pavel Emelyanov @ 2012-02-21 17:31 UTC (permalink / raw)
To: David Miller, Eric Dumazet, Linux Netdev List
This one specifies where to start MSG_PEEK-ing queue data from. A negative
value means that MSG_PEEK works as usual, i.e. it always peeks from the
head of the queue.
When some bytes are peeked from the queue and the peeking offset is
non-negative, it is moved forward so that the next peek will return the next
portion of data.
When a non-peeking recvmsg occurs and the peeking offset is non-negative,
it is moved backward so that the next peek will still peek the proper
data (i.e. the data that would have been picked if there were no
non-peeking recv in between).
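The forward/backward bookkeeping described above can be sketched as a small
userspace model. This is an illustration under assumptions: struct peek_state
and the peek_*() names are hypothetical and only mirror the rules stated in
this message.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the socket's peek offset state. */
struct peek_state {
	int peek_off;	/* negative: offset peeking is disabled */
};

/* Offset a read starts at: only peeking reads honour the offset. */
static int peek_start(const struct peek_state *s, int peeking)
{
	return (peeking && s->peek_off >= 0) ? s->peek_off : 0;
}

/* After peeking n bytes: move the offset forward. */
static void peek_fwd(struct peek_state *s, int n)
{
	assert(n >= 0);
	if (s->peek_off >= 0)
		s->peek_off += n;
}

/* After consuming n bytes: move the offset backward, clamping at 0. */
static void peek_bwd(struct peek_state *s, int n)
{
	assert(n >= 0);
	if (s->peek_off >= 0)
		s->peek_off = s->peek_off >= n ? s->peek_off - n : 0;
}
```

Clamping at zero covers the case where a regular read consumes more data than
the offset had skipped, so the next peek starts at the new head of the queue.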
The offset is set using a per-proto operation to let the protocol handle
the locking issues and to check whether the peeking offset feature is
supported by the protocol the socket belongs to.
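A minimal sketch of that dispatch, for illustration only: the operation is an
optional function pointer, and its absence is reported as "not supported". The
types struct msock and struct mops and all function names here are
hypothetical userspace stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct msock;

/* Hypothetical per-protocol operations table. */
struct mops {
	/* NULL when the protocol does not support a peek offset */
	void (*set_peek_off)(struct msock *sk, int val);
};

struct msock {
	const struct mops *ops;
	int peek_off;
};

static void generic_set_peek_off(struct msock *sk, int val)
{
	/* a real protocol would take its own lock around this store */
	sk->peek_off = val;
}

static const struct mops peek_ops = { .set_peek_off = generic_set_peek_off };
static const struct mops plain_ops = { .set_peek_off = NULL };

/* Returns 0 on success, or -EOPNOTSUPP if the protocol lacks the hook. */
static int set_peek_off(struct msock *sk, int val)
{
	assert(sk != NULL && sk->ops != NULL);
	if (!sk->ops->set_peek_off)
		return -EOPNOTSUPP;
	sk->ops->set_peek_off(sk, val);
	return 0;
}
```

Routing the store through the protocol keeps the generic layer free of any
knowledge about which lock guards the receive path.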
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
include/asm-generic/socket.h | 1 +
include/linux/net.h | 1 +
include/net/sock.h | 25 +++++++++++++++++++++++++
net/core/sock.c | 13 +++++++++++++
4 files changed, 40 insertions(+), 0 deletions(-)
diff --git a/include/asm-generic/socket.h b/include/asm-generic/socket.h
index 49c1704..832c270 100644
--- a/include/asm-generic/socket.h
+++ b/include/asm-generic/socket.h
@@ -66,5 +66,6 @@
#define SO_RXQ_OVFL 40
#define SO_WIFI_STATUS 41
+#define SO_PEEK_OFF 42
#define SCM_WIFI_STATUS SO_WIFI_STATUS
#endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/include/linux/net.h b/include/linux/net.h
index b299230..be60c7f 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -206,6 +206,7 @@ struct proto_ops {
int offset, size_t size, int flags);
ssize_t (*splice_read)(struct socket *sock, loff_t *ppos,
struct pipe_inode_info *pipe, size_t len, unsigned int flags);
+ void (*set_peek_off)(struct sock *sk, int val);
};
#define DECLARE_SOCKADDR(type, dst, src) \
diff --git a/include/net/sock.h b/include/net/sock.h
index 91c1c8b..9c0553b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -357,6 +357,7 @@ struct sock {
struct page *sk_sndmsg_page;
struct sk_buff *sk_send_head;
__u32 sk_sndmsg_off;
+ __s32 sk_peek_off;
int sk_write_pending;
#ifdef CONFIG_SECURITY
void *sk_security;
@@ -373,6 +374,30 @@ struct sock {
void (*sk_destruct)(struct sock *sk);
};
+static inline int sk_peek_offset(struct sock *sk, int flags)
+{
+ if ((flags & MSG_PEEK) && (sk->sk_peek_off >= 0))
+ return sk->sk_peek_off;
+ else
+ return 0;
+}
+
+static inline void sk_peek_offset_bwd(struct sock *sk, int val)
+{
+ if (sk->sk_peek_off >= 0) {
+ if (sk->sk_peek_off >= val)
+ sk->sk_peek_off -= val;
+ else
+ sk->sk_peek_off = 0;
+ }
+}
+
+static inline void sk_peek_offset_fwd(struct sock *sk, int val)
+{
+ if (sk->sk_peek_off >= 0)
+ sk->sk_peek_off += val;
+}
+
/*
* Hashed lists helper routines
*/
diff --git a/net/core/sock.c b/net/core/sock.c
index 02f8dfe..19942d4 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -793,6 +793,12 @@ set_rcvbuf:
sock_valbool_flag(sk, SOCK_WIFI_STATUS, valbool);
break;
+ case SO_PEEK_OFF:
+ if (sock->ops->set_peek_off)
+ sock->ops->set_peek_off(sk, val);
+ else
+ ret = -EOPNOTSUPP;
+ break;
default:
ret = -ENOPROTOOPT;
break;
@@ -1018,6 +1024,12 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
v.val = !!sock_flag(sk, SOCK_WIFI_STATUS);
break;
+ case SO_PEEK_OFF:
+ if (!sock->ops->set_peek_off)
+ return -EOPNOTSUPP;
+
+ v.val = sk->sk_peek_off;
+ break;
default:
return -ENOPROTOOPT;
}
@@ -2092,6 +2104,7 @@ void sock_init_data(struct socket *sock, struct sock *sk)
sk->sk_sndmsg_page = NULL;
sk->sk_sndmsg_off = 0;
+ sk->sk_peek_off = -1;
sk->sk_peer_pid = NULL;
sk->sk_peer_cred = NULL;
--
1.5.5.6
* [PATCH 5/6] unix: Support peeking offset for datagram and seqpacket sockets
2012-02-21 17:30 [PATCH net-next 0/6] Ability to peek full socket queue Pavel Emelyanov
` (3 preceding siblings ...)
2012-02-21 17:31 ` [PATCH 4/6] sock: Introduce the SO_PEEK_OFF sock option Pavel Emelyanov
@ 2012-02-21 17:31 ` Pavel Emelyanov
2012-02-21 17:49 ` Eric Dumazet
2012-02-21 17:32 ` [PATCH 6/6] unix: Support peeking offset for stream sockets Pavel Emelyanov
5 siblings, 1 reply; 19+ messages in thread
From: Pavel Emelyanov @ 2012-02-21 17:31 UTC (permalink / raw)
To: David Miller, Eric Dumazet, Linux Netdev List
The sk_peek_off manipulations are protected with the unix_sk->readlock mutex.
This mutex is enough since all we need is to synchronize setting the offset
vs reading the queue head. The latter is fully covered by the mentioned lock.
The offset argument recently added to __skb_recv_datagram is used to pick the
skb to read the data from.
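Once the skb is picked, the amount of data copied depends on how much of the
datagram remains after the in-skb offset. A small userspace model of that
clamping follows, as a sketch only; dgram_copy_len() is a hypothetical name
and the truncation flag stands in for MSG_TRUNC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Model of a datagram read that starts skip bytes into a message of
 * msg_len bytes, with a user buffer of size bytes. Returns the number
 * of bytes copied; *truncated is set when the rest of the message did
 * not fit into the buffer.
 */
static size_t dgram_copy_len(size_t msg_len, size_t skip, size_t size,
			     bool *truncated)
{
	assert(skip <= msg_len && truncated != NULL);

	size_t avail = msg_len - skip;

	*truncated = size < avail;
	return size < avail ? size : avail;
}
```

For a 10-byte datagram peeked at offset 4, a 3-byte buffer yields 3 bytes with
truncation reported, while a 20-byte buffer yields the remaining 6 bytes.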
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
net/unix/af_unix.c | 30 +++++++++++++++++++++++++-----
1 files changed, 25 insertions(+), 5 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 85d3bb7..3d9481d 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -530,6 +530,16 @@ static int unix_seqpacket_sendmsg(struct kiocb *, struct socket *,
static int unix_seqpacket_recvmsg(struct kiocb *, struct socket *,
struct msghdr *, size_t, int);
+static void unix_set_peek_off(struct sock *sk, int val)
+{
+ struct unix_sock *u = unix_sk(sk);
+
+ mutex_lock(&u->readlock);
+ sk->sk_peek_off = val;
+ mutex_unlock(&u->readlock);
+}
+
+
static const struct proto_ops unix_stream_ops = {
.family = PF_UNIX,
.owner = THIS_MODULE,
@@ -570,6 +580,7 @@ static const struct proto_ops unix_dgram_ops = {
.recvmsg = unix_dgram_recvmsg,
.mmap = sock_no_mmap,
.sendpage = sock_no_sendpage,
+ .set_peek_off = unix_set_peek_off,
};
static const struct proto_ops unix_seqpacket_ops = {
@@ -591,6 +602,7 @@ static const struct proto_ops unix_seqpacket_ops = {
.recvmsg = unix_seqpacket_recvmsg,
.mmap = sock_no_mmap,
.sendpage = sock_no_sendpage,
+ .set_peek_off = unix_set_peek_off,
};
static struct proto unix_proto = {
@@ -1756,6 +1768,7 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
int noblock = flags & MSG_DONTWAIT;
struct sk_buff *skb;
int err;
+ int peeked, skip;
err = -EOPNOTSUPP;
if (flags&MSG_OOB)
@@ -1769,7 +1782,9 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
goto out;
}
- skb = skb_recv_datagram(sk, flags, noblock, &err);
+ skip = sk_peek_offset(sk, flags);
+
+ skb = __skb_recv_datagram(sk, flags, &peeked, &skip, &err);
if (!skb) {
unix_state_lock(sk);
/* Signal EOF on disconnected non-blocking SEQPACKET socket. */
@@ -1786,12 +1801,12 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
if (msg->msg_name)
unix_copy_addr(msg, skb->sk);
- if (size > skb->len)
- size = skb->len;
- else if (size < skb->len)
+ if (size > skb->len - skip)
+ size = skb->len - skip;
+ else if (size < skb->len - skip)
msg->msg_flags |= MSG_TRUNC;
- err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, size);
+ err = skb_copy_datagram_iovec(skb, skip, msg->msg_iov, size);
if (err)
goto out_free;
@@ -1808,6 +1823,8 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
if (!(flags & MSG_PEEK)) {
if (UNIXCB(skb).fp)
unix_detach_fds(siocb->scm, skb);
+
+ sk_peek_offset_bwd(sk, skb->len);
} else {
/* It is questionable: on PEEK we could:
- do not return fds - good, but too simple 8)
@@ -1821,6 +1838,9 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
clearly however!
*/
+
+ sk_peek_offset_fwd(sk, size);
+
if (UNIXCB(skb).fp)
siocb->scm->fp = scm_fp_dup(UNIXCB(skb).fp);
}
--
1.5.5.6
* [PATCH 6/6] unix: Support peeking offset for stream sockets
2012-02-21 17:30 [PATCH net-next 0/6] Ability to peek full socket queue Pavel Emelyanov
` (4 preceding siblings ...)
2012-02-21 17:31 ` [PATCH 5/6] unix: Support peeking offset for datagram and seqpacket sockets Pavel Emelyanov
@ 2012-02-21 17:32 ` Pavel Emelyanov
2012-02-21 17:51 ` Eric Dumazet
5 siblings, 1 reply; 19+ messages in thread
From: Pavel Emelyanov @ 2012-02-21 17:32 UTC (permalink / raw)
To: David Miller, Eric Dumazet, Linux Netdev List
The same here -- we can protect the sk_peek_off manipulations with
the unix_sk->readlock mutex.
The peeking of data from a stream socket is done in the datagram style,
i.e. even if there's enough room for more data in the user buffer, only
the head skb's data is copied in there. This feature is preserved when
peeking data from a given offset -- the data is read till the nearest
skb's boundary.
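The "read till the nearest skb's boundary" behaviour can be modelled in
userspace as follows. This is a sketch under assumptions: struct chunk and
stream_peek() are hypothetical, and returning 0 for an offset past the queued
data is a simplification of what a real socket would do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One queued buffer of stream data. */
struct chunk {
	const char *data;
	size_t len;
};

/*
 * Model of a stream peek at byte offset off: skip whole chunks, then
 * copy from the chunk containing the offset, stopping at that chunk's
 * end even if dst has room for more. Returns the bytes copied, or 0
 * if the offset is at or past the end of the queued data.
 */
static size_t stream_peek(const struct chunk *q, size_t n, size_t off,
			  char *dst, size_t size)
{
	assert(q != NULL && dst != NULL);
	for (size_t i = 0; i < n; i++) {
		if (off >= q[i].len) {
			off -= q[i].len;	/* offset lies beyond this chunk */
			continue;
		}

		size_t take = q[i].len - off;

		if (take > size)
			take = size;
		memcpy(dst, q[i].data + off, take);
		return take;
	}
	return 0;
}
```

A caller that wants more data simply peeks again: since the offset advances by
the number of bytes returned, the next peek starts at the following chunk.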
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
net/unix/af_unix.c | 20 ++++++++++++++++++--
1 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 3d9481d..0be4d24 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -559,6 +559,7 @@ static const struct proto_ops unix_stream_ops = {
.recvmsg = unix_stream_recvmsg,
.mmap = sock_no_mmap,
.sendpage = sock_no_sendpage,
+ .set_peek_off = unix_set_peek_off,
};
static const struct proto_ops unix_dgram_ops = {
@@ -1904,6 +1905,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
int target;
int err = 0;
long timeo;
+ int skip;
err = -EINVAL;
if (sk->sk_state != TCP_ESTABLISHED)
@@ -1933,12 +1935,15 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
goto out;
}
+ skip = sk_peek_offset(sk, flags);
+
do {
int chunk;
struct sk_buff *skb;
unix_state_lock(sk);
skb = skb_peek(&sk->sk_receive_queue);
+again:
if (skb == NULL) {
unix_sk(sk)->recursion_level = 0;
if (copied >= target)
@@ -1973,6 +1978,13 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
unix_state_unlock(sk);
break;
}
+
+ if (skip >= skb->len) {
+ skip -= skb->len;
+ skb = skb_peek_next(skb, &sk->sk_receive_queue);
+ goto again;
+ }
+
unix_state_unlock(sk);
if (check_creds) {
@@ -1992,8 +2004,8 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
sunaddr = NULL;
}
- chunk = min_t(unsigned int, skb->len, size);
- if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
+ chunk = min_t(unsigned int, skb->len - skip, size);
+ if (memcpy_toiovec(msg->msg_iov, skb->data + skip, chunk)) {
if (copied == 0)
copied = -EFAULT;
break;
@@ -2005,6 +2017,8 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
if (!(flags & MSG_PEEK)) {
skb_pull(skb, chunk);
+ sk_peek_offset_bwd(sk, chunk);
+
if (UNIXCB(skb).fp)
unix_detach_fds(siocb->scm, skb);
@@ -2022,6 +2036,8 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
if (UNIXCB(skb).fp)
siocb->scm->fp = scm_fp_dup(UNIXCB(skb).fp);
+ sk_peek_offset_fwd(sk, chunk);
+
break;
}
} while (size);
--
1.5.5.6
* Re: [PATCH 1/6] datagram: Factor out sk queue referencing
2012-02-21 17:30 ` [PATCH 1/6] datagram: Factor out sk queue referencing Pavel Emelyanov
@ 2012-02-21 17:39 ` Eric Dumazet
2012-02-21 20:04 ` David Miller
0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2012-02-21 17:39 UTC (permalink / raw)
To: Pavel Emelyanov; +Cc: David Miller, Linux Netdev List
Le mardi 21 février 2012 à 21:30 +0400, Pavel Emelyanov a écrit :
> This makes lines shorter and simplifies further patching.
>
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>
> ---
> net/core/datagram.c | 9 +++++----
> 1 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/net/core/datagram.c b/net/core/datagram.c
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
* Re: [PATCH 2/6] datagram: Add offset argument to __skb_recv_datagram
2012-02-21 17:30 ` [PATCH 2/6] datagram: Add offset argument to __skb_recv_datagram Pavel Emelyanov
@ 2012-02-21 17:40 ` Eric Dumazet
2012-02-21 20:04 ` David Miller
0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2012-02-21 17:40 UTC (permalink / raw)
To: Pavel Emelyanov; +Cc: David Miller, Linux Netdev List
Le mardi 21 février 2012 à 21:30 +0400, Pavel Emelyanov a écrit :
> This one is only considered for MSG_PEEK flag and the value pointed by
> it specifies where to start peeking bytes from. If the offset happens to
> point into the middle of the returned skb, the offset within this skb is
> put back to this very argument.
>
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>
> ---
> include/linux/skbuff.h | 2 +-
> net/core/datagram.c | 21 +++++++++++++--------
> net/ipv4/udp.c | 4 ++--
> net/ipv6/udp.c | 4 ++--
> 4 files changed, 18 insertions(+), 13 deletions(-)
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
* Re: [PATCH 3/6] skb: Add skb_peek_next helper
2012-02-21 17:31 ` [PATCH 3/6] skb: Add skb_peek_next helper Pavel Emelyanov
@ 2012-02-21 17:43 ` Eric Dumazet
2012-02-21 20:04 ` David Miller
0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2012-02-21 17:43 UTC (permalink / raw)
To: Pavel Emelyanov; +Cc: David Miller, Linux Netdev List
Le mardi 21 février 2012 à 21:31 +0400, Pavel Emelyanov a écrit :
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>
> ---
> include/linux/skbuff.h | 18 ++++++++++++++++++
> 1 files changed, 18 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index f3cf43d..c11a44e 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -877,6 +877,24 @@ static inline struct sk_buff *skb_peek(const struct sk_buff_head *list_)
> }
>
> /**
> + * skb_peek_next - peek skb following the given one from a queue
> + * @skb: skb to start from
> + * @list_: list to peek at
> + *
> + * Returns %NULL when the end of the list is met or a pointer to the
> + * next element. The reference count is not incremented and the
> + * reference is therefore volatile. Use with caution.
> + */
> +static inline struct sk_buff *skb_peek_next(struct sk_buff *skb,
> + const struct sk_buff_head *list_)
> +{
> + struct sk_buff *next = skb->next;
> + if (next == (struct sk_buff *)list_)
> + next = NULL;
> + return next;
> +}
> +
> +/**
> * skb_peek_tail - peek at the tail of an &sk_buff_head
> * @list_: list to peek at
> *
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
* Re: [PATCH 4/6] sock: Introduce the SO_PEEK_OFF sock option
2012-02-21 17:31 ` [PATCH 4/6] sock: Introduce the SO_PEEK_OFF sock option Pavel Emelyanov
@ 2012-02-21 17:45 ` Eric Dumazet
2012-02-21 20:05 ` David Miller
0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2012-02-21 17:45 UTC (permalink / raw)
To: Pavel Emelyanov; +Cc: David Miller, Linux Netdev List
Le mardi 21 février 2012 à 21:31 +0400, Pavel Emelyanov a écrit :
> This one specifies where to start MSG_PEEK-ing queue data from. When
> set to negative value means that MSG_PEEK works as usual -- peeks
> from the head of the queue always.
>
> When some bytes are peeked from queue and the peeking offset is non
> negative it is moved forward so that the next peek will return next
> portion of data.
>
> When non-peeking recvmsg occurs and the peeking offset is non negative
> it is moved backward so that the next peek will still peek the proper
> data (i.e. the one that would have been picked if there were no non
> peeking recv in between).
>
> The offset is set using per-proto operation to let the protocol handle
> the locking issues and to check whether the peeking offset feature is
> supported by the protocol the socket belongs to.
>
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>
> ---
> include/asm-generic/socket.h | 1 +
> include/linux/net.h | 1 +
> include/net/sock.h | 25 +++++++++++++++++++++++++
> net/core/sock.c | 13 +++++++++++++
> 4 files changed, 40 insertions(+), 0 deletions(-)
>
> diff --git a/include/asm-generic/socket.h b/include/asm-generic/socket.h
> index 49c1704..832c270 100644
> --- a/include/asm-generic/socket.h
> +++ b/include/asm-generic/socket.h
> @@ -66,5 +66,6 @@
> #define SO_RXQ_OVFL 40
>
> #define SO_WIFI_STATUS 41
> +#define SO_PEEK_OFF 42
> #define SCM_WIFI_STATUS SO_WIFI_STATUS
small note : should be moved down by one line, after SCM_WIFI_STATUS
Other than that,
* Re: [PATCH 5/6] unix: Support peeking offset for datagram and seqpacket sockets
2012-02-21 17:31 ` [PATCH 5/6] unix: Support peeking offset for datagram and seqpacket sockets Pavel Emelyanov
@ 2012-02-21 17:49 ` Eric Dumazet
2012-02-21 20:05 ` David Miller
0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2012-02-21 17:49 UTC (permalink / raw)
To: Pavel Emelyanov; +Cc: David Miller, Linux Netdev List
Le mardi 21 février 2012 à 21:31 +0400, Pavel Emelyanov a écrit :
> The sk_peek_off manipulations are protected with the unix_sk->readlock mutex.
> This mutex is enough since all we need is to synchronize setting the offset
> vs reading the queue head. The latter is fully covered with the mentioned lock.
>
> The recently added __skb_recv_datagram's offset is used to pick the skb to
> read the data from.
>
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>
> ---
> net/unix/af_unix.c | 30 +++++++++++++++++++++++++-----
> 1 files changed, 25 insertions(+), 5 deletions(-)
Very nice !
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
* Re: [PATCH 6/6] unix: Support peeking offset for stream sockets
2012-02-21 17:32 ` [PATCH 6/6] unix: Support peeking offset for stream sockets Pavel Emelyanov
@ 2012-02-21 17:51 ` Eric Dumazet
2012-02-21 20:05 ` David Miller
0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2012-02-21 17:51 UTC (permalink / raw)
To: Pavel Emelyanov; +Cc: David Miller, Linux Netdev List
Le mardi 21 février 2012 à 21:32 +0400, Pavel Emelyanov a écrit :
> The same here -- we can protect the sk_peek_off manipulations with
> the unix_sk->readlock mutex.
>
> The peeking of data from a stream socket is done in the datagram style,
> i.e. even if there's enough room for more data in the user buffer, only
> the head skb's data is copied in there. This feature is preserved when
> peeking data from a given offset -- the data is read till the nearest
> skb's boundary.
>
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>
> ---
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
* Re: [PATCH 1/6] datagram: Factor out sk queue referencing
2012-02-21 17:39 ` Eric Dumazet
@ 2012-02-21 20:04 ` David Miller
0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2012-02-21 20:04 UTC (permalink / raw)
To: eric.dumazet; +Cc: xemul, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 21 Feb 2012 18:39:19 +0100
> Le mardi 21 février 2012 à 21:30 +0400, Pavel Emelyanov a écrit :
>> This makes lines shorter and simplifies further patching.
>>
>> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>>
>> ---
>> net/core/datagram.c | 9 +++++----
>> 1 files changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/net/core/datagram.c b/net/core/datagram.c
>
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied.
* Re: [PATCH 2/6] datagram: Add offset argument to __skb_recv_datagram
2012-02-21 17:40 ` Eric Dumazet
@ 2012-02-21 20:04 ` David Miller
0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2012-02-21 20:04 UTC (permalink / raw)
To: eric.dumazet; +Cc: xemul, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 21 Feb 2012 18:40:32 +0100
> Le mardi 21 février 2012 à 21:30 +0400, Pavel Emelyanov a écrit :
>> This one is only considered for MSG_PEEK flag and the value pointed by
>> it specifies where to start peeking bytes from. If the offset happens to
>> point into the middle of the returned skb, the offset within this skb is
>> put back to this very argument.
>>
>> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>>
>> ---
>> include/linux/skbuff.h | 2 +-
>> net/core/datagram.c | 21 +++++++++++++--------
>> net/ipv4/udp.c | 4 ++--
>> net/ipv6/udp.c | 4 ++--
>> 4 files changed, 18 insertions(+), 13 deletions(-)
>
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied.
* Re: [PATCH 3/6] skb: Add skb_peek_next helper
2012-02-21 17:43 ` Eric Dumazet
@ 2012-02-21 20:04 ` David Miller
0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2012-02-21 20:04 UTC (permalink / raw)
To: eric.dumazet; +Cc: xemul, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 21 Feb 2012 18:43:53 +0100
> Le mardi 21 février 2012 à 21:31 +0400, Pavel Emelyanov a écrit :
>> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied.
* Re: [PATCH 4/6] sock: Introduce the SO_PEEK_OFF sock option
2012-02-21 17:45 ` Eric Dumazet
@ 2012-02-21 20:05 ` David Miller
0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2012-02-21 20:05 UTC (permalink / raw)
To: eric.dumazet; +Cc: xemul, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 21 Feb 2012 18:45:50 +0100
> Le mardi 21 février 2012 à 21:31 +0400, Pavel Emelyanov a écrit :
>> This one specifies where to start MSG_PEEK-ing queue data from. When
>> set to negative value means that MSG_PEEK works as usual -- peeks
>> from the head of the queue always.
>>
>> When some bytes are peeked from queue and the peeking offset is non
>> negative it is moved forward so that the next peek will return next
>> portion of data.
>>
>> When non-peeking recvmsg occurs and the peeking offset is non negative
>> it is moved backward so that the next peek will still peek the proper
>> data (i.e. the one that would have been picked if there were no non
>> peeking recv in between).
>>
>> The offset is set using per-proto operation to let the protocol handle
>> the locking issues and to check whether the peeking offset feature is
>> supported by the protocol the socket belongs to.
>>
>> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
...
>
> small note : should be moved down by one line, after SCM_WIFI_STATUS
>
> Other than that,
You also didn't update all of the arch/*/include/asm/socket.h files.
I took care of all of this, but what should have been a simple matter
of me typing "git am --signoff foo.mbox" and "make" turned into a
10 minute exercise in raking leaves. :-/
* Re: [PATCH 5/6] unix: Support peeking offset for datagram and seqpacket sockets
2012-02-21 17:49 ` Eric Dumazet
@ 2012-02-21 20:05 ` David Miller
0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2012-02-21 20:05 UTC (permalink / raw)
To: eric.dumazet; +Cc: xemul, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 21 Feb 2012 18:49:56 +0100
> Le mardi 21 février 2012 à 21:31 +0400, Pavel Emelyanov a écrit :
>> The sk_peek_off manipulations are protected with the unix_sk->readlock mutex.
>> This mutex is enough since all we need is to synchronize setting the offset
>> vs reading the queue head. The latter is fully covered with the mentioned lock.
>>
>> The recently added __skb_recv_datagram's offset is used to pick the skb to
>> read the data from.
>>
>> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>>
>> ---
>> net/unix/af_unix.c | 30 +++++++++++++++++++++++++-----
>> 1 files changed, 25 insertions(+), 5 deletions(-)
>
> Very nice !
>
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied.
* Re: [PATCH 6/6] unix: Support peeking offset for stream sockets
2012-02-21 17:51 ` Eric Dumazet
@ 2012-02-21 20:05 ` David Miller
0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2012-02-21 20:05 UTC (permalink / raw)
To: eric.dumazet; +Cc: xemul, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 21 Feb 2012 18:51:19 +0100
> Le mardi 21 février 2012 à 21:32 +0400, Pavel Emelyanov a écrit :
>> The same here -- we can protect the sk_peek_off manipulations with
>> the unix_sk->readlock mutex.
>>
>> The peeking of data from a stream socket is done in the datagram style,
>> i.e. even if there's enough room for more data in the user buffer, only
>> the head skb's data is copied in there. This feature is preserved when
>> peeking data from a given offset -- the data is read till the nearest
>> skb's boundary.
>>
>> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
>>
>> ---
>
> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Applied.