* [PATCH 1/1] net: Introduce recvmmsg socket syscall
@ 2009-10-12 16:20 Arnaldo Carvalho de Melo
2009-10-12 17:53 ` Nir Tzachar
2009-10-13 6:40 ` David Miller
0 siblings, 2 replies; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-10-12 16:20 UTC (permalink / raw)
To: David Miller
Cc: netdev, Arnaldo Carvalho de Melo, Caitlin Bestler, Chris Van Hoof,
Clark Williams, Neil Horman, Nir Tzachar, Nivedita Singhvi,
Paul Moore, Rémi Denis-Courmont, Steven Whitehouse
Meaning receive multiple messages, reducing the number of syscalls and
net stack entry/exit operations.
Next patches will introduce mechanisms where protocols that want to
optimize this operation will provide an unlocked_recvmsg operation.
This takes into account comments made by:
. Paul Moore: sock_recvmsg is called only for the first datagram,
sock_recvmsg_nosec is used for the rest.
. Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
works in the same fashion as the ppoll one.
If the underlying protocol returns a datagram with MSG_OOB set, this
will make recvmmsg return right away with as many datagrams (+ the OOB
one) it has received so far.
. Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
datagrams and then recvmsg returns an error, recvmmsg will return
the successfully received datagrams, store the error and return it
in the next call.
This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
where we will be able to acquire the lock only at batch start and end, not at
every underlying recvmsg call.
Cc: Caitlin Bestler <caitlin.bestler@gmail.com>
Cc: Chris Van Hoof <vanhoof@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Nir Tzachar <nir.tzachar@gmail.com>
Cc: Nivedita Singhvi <niv@us.ibm.com>
Cc: Paul Moore <paul.moore@hp.com>
Cc: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Cc: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
arch/alpha/kernel/systbls.S | 1 +
arch/arm/kernel/calls.S | 1 +
arch/avr32/kernel/syscall_table.S | 1 +
arch/blackfin/mach-common/entry.S | 1 +
arch/ia64/kernel/entry.S | 1 +
arch/microblaze/kernel/syscall_table.S | 1 +
arch/mips/kernel/scall32-o32.S | 1 +
arch/mips/kernel/scall64-64.S | 1 +
arch/mips/kernel/scall64-n32.S | 1 +
arch/mips/kernel/scall64-o32.S | 1 +
arch/sh/kernel/syscalls_64.S | 1 +
arch/sparc/kernel/systbls_64.S | 4 +-
arch/x86/ia32/ia32entry.S | 1 +
arch/x86/include/asm/unistd_32.h | 3 +-
arch/x86/include/asm/unistd_64.h | 2 +
arch/x86/kernel/syscall_table_32.S | 1 +
arch/xtensa/include/asm/unistd.h | 4 +-
include/linux/net.h | 1 +
include/linux/socket.h | 10 ++
include/linux/syscalls.h | 4 +
include/net/compat.h | 8 +
kernel/sys_ni.c | 2 +
net/compat.c | 33 +++++-
net/socket.c | 225 ++++++++++++++++++++++++++------
24 files changed, 260 insertions(+), 49 deletions(-)
diff --git a/arch/alpha/kernel/systbls.S b/arch/alpha/kernel/systbls.S
index 95c9aef..cda6b8b 100644
--- a/arch/alpha/kernel/systbls.S
+++ b/arch/alpha/kernel/systbls.S
@@ -497,6 +497,7 @@ sys_call_table:
.quad sys_signalfd
.quad sys_ni_syscall
.quad sys_eventfd
+ .quad sys_recvmmsg
.size sys_call_table, . - sys_call_table
.type sys_call_table, @object
diff --git a/arch/arm/kernel/calls.S b/arch/arm/kernel/calls.S
index fafce1b..f58c115 100644
--- a/arch/arm/kernel/calls.S
+++ b/arch/arm/kernel/calls.S
@@ -374,6 +374,7 @@
CALL(sys_pwritev)
CALL(sys_rt_tgsigqueueinfo)
CALL(sys_perf_event_open)
+/* 365 */ CALL(sys_recvmmsg)
#ifndef syscalls_counted
.equ syscalls_padding, ((NR_syscalls + 3) & ~3) - NR_syscalls
#define syscalls_counted
diff --git a/arch/avr32/kernel/syscall_table.S b/arch/avr32/kernel/syscall_table.S
index 7ee0057..e76bad1 100644
--- a/arch/avr32/kernel/syscall_table.S
+++ b/arch/avr32/kernel/syscall_table.S
@@ -295,4 +295,5 @@ sys_call_table:
.long sys_signalfd
.long sys_ni_syscall /* 280, was sys_timerfd */
.long sys_eventfd
+ .long sys_recvmmsg
.long sys_ni_syscall /* r8 is saturated at nr_syscalls */
diff --git a/arch/blackfin/mach-common/entry.S b/arch/blackfin/mach-common/entry.S
index 1e7cac2..4869272 100644
--- a/arch/blackfin/mach-common/entry.S
+++ b/arch/blackfin/mach-common/entry.S
@@ -1621,6 +1621,7 @@ ENTRY(_sys_call_table)
.long _sys_pwritev
.long _sys_rt_tgsigqueueinfo
.long _sys_perf_event_open
+ .long _sys_recvmmsg /* 370 */
.rept NR_syscalls-(.-_sys_call_table)/4
.long _sys_ni_syscall
diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S
index d0e7d37..d75b872 100644
--- a/arch/ia64/kernel/entry.S
+++ b/arch/ia64/kernel/entry.S
@@ -1806,6 +1806,7 @@ sys_call_table:
data8 sys_preadv
data8 sys_pwritev // 1320
data8 sys_rt_tgsigqueueinfo
+ data8 sys_recvmmsg
.org sys_call_table + 8*NR_syscalls // guard against failures to increase NR_syscalls
#endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */
diff --git a/arch/microblaze/kernel/syscall_table.S b/arch/microblaze/kernel/syscall_table.S
index ecec191..c1ab1dc 100644
--- a/arch/microblaze/kernel/syscall_table.S
+++ b/arch/microblaze/kernel/syscall_table.S
@@ -371,3 +371,4 @@ ENTRY(sys_call_table)
.long sys_ni_syscall
.long sys_rt_tgsigqueueinfo /* 365 */
.long sys_perf_event_open
+ .long sys_recvmmsg
diff --git a/arch/mips/kernel/scall32-o32.S b/arch/mips/kernel/scall32-o32.S
index fd2a9bb..17202bb 100644
--- a/arch/mips/kernel/scall32-o32.S
+++ b/arch/mips/kernel/scall32-o32.S
@@ -583,6 +583,7 @@ einval: li v0, -ENOSYS
sys sys_rt_tgsigqueueinfo 4
sys sys_perf_event_open 5
sys sys_accept4 4
+ sys sys_recvmmsg 5
.endm
/* We pre-compute the number of _instruction_ bytes needed to
diff --git a/arch/mips/kernel/scall64-64.S b/arch/mips/kernel/scall64-64.S
index 18bf7f3..a8a6c59 100644
--- a/arch/mips/kernel/scall64-64.S
+++ b/arch/mips/kernel/scall64-64.S
@@ -420,4 +420,5 @@ sys_call_table:
PTR sys_rt_tgsigqueueinfo
PTR sys_perf_event_open
PTR sys_accept4
+ PTR sys_recvmmsg
.size sys_call_table,.-sys_call_table
diff --git a/arch/mips/kernel/scall64-n32.S b/arch/mips/kernel/scall64-n32.S
index 6ebc079..5154e64 100644
--- a/arch/mips/kernel/scall64-n32.S
+++ b/arch/mips/kernel/scall64-n32.S
@@ -418,4 +418,5 @@ EXPORT(sysn32_call_table)
PTR compat_sys_rt_tgsigqueueinfo /* 5295 */
PTR sys_perf_event_open
PTR sys_accept4
+ PTR compat_sys_recvmmsg
.size sysn32_call_table,.-sysn32_call_table
diff --git a/arch/mips/kernel/scall64-o32.S b/arch/mips/kernel/scall64-o32.S
index 9bbf977..d0eff53 100644
--- a/arch/mips/kernel/scall64-o32.S
+++ b/arch/mips/kernel/scall64-o32.S
@@ -538,4 +538,5 @@ sys_call_table:
PTR compat_sys_rt_tgsigqueueinfo
PTR sys_perf_event_open
PTR sys_accept4
+ PTR compat_sys_recvmmsg
.size sys_call_table,.-sys_call_table
diff --git a/arch/sh/kernel/syscalls_64.S b/arch/sh/kernel/syscalls_64.S
index 5bfde6c..07d2aae 100644
--- a/arch/sh/kernel/syscalls_64.S
+++ b/arch/sh/kernel/syscalls_64.S
@@ -391,3 +391,4 @@ sys_call_table:
.long sys_pwritev
.long sys_rt_tgsigqueueinfo
.long sys_perf_event_open
+ .long sys_recvmmsg /* 365 */
diff --git a/arch/sparc/kernel/systbls_64.S b/arch/sparc/kernel/systbls_64.S
index 009825f..f37bef7 100644
--- a/arch/sparc/kernel/systbls_64.S
+++ b/arch/sparc/kernel/systbls_64.S
@@ -83,7 +83,7 @@ sys_call_table32:
/*310*/ .word compat_sys_utimensat, compat_sys_signalfd, sys_timerfd_create, sys_eventfd, compat_sys_fallocate
.word compat_sys_timerfd_settime, compat_sys_timerfd_gettime, compat_sys_signalfd4, sys_eventfd2, sys_epoll_create1
/*320*/ .word sys_dup3, sys_pipe2, sys_inotify_init1, sys_accept4, compat_sys_preadv
- .word compat_sys_pwritev, compat_sys_rt_tgsigqueueinfo, sys_perf_event_open
+ .word compat_sys_pwritev, compat_sys_rt_tgsigqueueinfo, sys_perf_event_open, compat_sys_recvmmsg
#endif /* CONFIG_COMPAT */
@@ -158,4 +158,4 @@ sys_call_table:
/*310*/ .word sys_utimensat, sys_signalfd, sys_timerfd_create, sys_eventfd, sys_fallocate
.word sys_timerfd_settime, sys_timerfd_gettime, sys_signalfd4, sys_eventfd2, sys_epoll_create1
/*320*/ .word sys_dup3, sys_pipe2, sys_inotify_init1, sys_accept4, sys_preadv
- .word sys_pwritev, sys_rt_tgsigqueueinfo, sys_perf_event_open
+ .word sys_pwritev, sys_rt_tgsigqueueinfo, sys_perf_event_open, sys_recvmmsg
diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 74619c4..11a6c79 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -832,4 +832,5 @@ ia32_sys_call_table:
.quad compat_sys_pwritev
.quad compat_sys_rt_tgsigqueueinfo /* 335 */
.quad sys_perf_event_open
+ .quad compat_sys_recvmmsg
ia32_syscall_end:
diff --git a/arch/x86/include/asm/unistd_32.h b/arch/x86/include/asm/unistd_32.h
index 6fb3c20..3baf379 100644
--- a/arch/x86/include/asm/unistd_32.h
+++ b/arch/x86/include/asm/unistd_32.h
@@ -342,10 +342,11 @@
#define __NR_pwritev 334
#define __NR_rt_tgsigqueueinfo 335
#define __NR_perf_event_open 336
+#define __NR_recvmmsg 337
#ifdef __KERNEL__
-#define NR_syscalls 337
+#define NR_syscalls 338
#define __ARCH_WANT_IPC_PARSE_VERSION
#define __ARCH_WANT_OLD_READDIR
diff --git a/arch/x86/include/asm/unistd_64.h b/arch/x86/include/asm/unistd_64.h
index 8d3ad0a..4843f7b 100644
--- a/arch/x86/include/asm/unistd_64.h
+++ b/arch/x86/include/asm/unistd_64.h
@@ -661,6 +661,8 @@ __SYSCALL(__NR_pwritev, sys_pwritev)
__SYSCALL(__NR_rt_tgsigqueueinfo, sys_rt_tgsigqueueinfo)
#define __NR_perf_event_open 298
__SYSCALL(__NR_perf_event_open, sys_perf_event_open)
+#define __NR_recvmmsg 299
+__SYSCALL(__NR_recvmmsg, sys_recvmmsg)
#ifndef __NO_STUBS
#define __ARCH_WANT_OLD_READDIR
diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S
index 0157cd2..70c2125 100644
--- a/arch/x86/kernel/syscall_table_32.S
+++ b/arch/x86/kernel/syscall_table_32.S
@@ -336,3 +336,4 @@ ENTRY(sys_call_table)
.long sys_pwritev
.long sys_rt_tgsigqueueinfo /* 335 */
.long sys_perf_event_open
+ .long sys_recvmmsg
diff --git a/arch/xtensa/include/asm/unistd.h b/arch/xtensa/include/asm/unistd.h
index c092c8f..4e55dc7 100644
--- a/arch/xtensa/include/asm/unistd.h
+++ b/arch/xtensa/include/asm/unistd.h
@@ -681,8 +681,10 @@ __SYSCALL(304, sys_signalfd, 3)
__SYSCALL(305, sys_ni_syscall, 0)
#define __NR_eventfd 306
__SYSCALL(306, sys_eventfd, 1)
+#define __NR_recvmmsg 307
+__SYSCALL(307, sys_recvmmsg, 5)
-#define __NR_syscall_count 307
+#define __NR_syscall_count 308
/*
* sysxtensa syscall handler
diff --git a/include/linux/net.h b/include/linux/net.h
index 529a093..b42bb60 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -41,6 +41,7 @@
#define SYS_SENDMSG 16 /* sys_sendmsg(2) */
#define SYS_RECVMSG 17 /* sys_recvmsg(2) */
#define SYS_ACCEPT4 18 /* sys_accept4(2) */
+#define SYS_RECVMMSG 19 /* sys_recvmmsg(2) */
typedef enum {
SS_FREE = 0, /* not allocated */
diff --git a/include/linux/socket.h b/include/linux/socket.h
index 3273a0c..59966f1 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -65,6 +65,12 @@ struct msghdr {
unsigned msg_flags;
};
+/* For recvmmsg/sendmmsg */
+struct mmsghdr {
+ struct msghdr msg_hdr;
+ unsigned msg_len;
+};
+
/*
* POSIX 1003.1g - ancillary data object information
* Ancillary data consits of a sequence of pairs of
@@ -312,6 +318,10 @@ extern int move_addr_to_user(struct sockaddr *kaddr, int klen, void __user *uadd
extern int move_addr_to_kernel(void __user *uaddr, int ulen, struct sockaddr *kaddr);
extern int put_cmsg(struct msghdr*, int level, int type, int len, void *data);
+struct timespec;
+
+extern int __sys_recvmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
+ unsigned int flags, struct timespec *timeout);
#endif
#endif /* not kernel and not glibc */
#endif /* _LINUX_SOCKET_H */
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index a990ace..714f063 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -25,6 +25,7 @@ struct linux_dirent64;
struct list_head;
struct msgbuf;
struct msghdr;
+struct mmsghdr;
struct msqid_ds;
struct new_utsname;
struct nfsctl_arg;
@@ -677,6 +678,9 @@ asmlinkage long sys_recv(int, void __user *, size_t, unsigned);
asmlinkage long sys_recvfrom(int, void __user *, size_t, unsigned,
struct sockaddr __user *, int __user *);
asmlinkage long sys_recvmsg(int fd, struct msghdr __user *msg, unsigned flags);
+asmlinkage long sys_recvmmsg(int fd, struct mmsghdr __user *msg,
+ unsigned int vlen, unsigned flags,
+ struct timespec __user *timeout);
asmlinkage long sys_socket(int, int, int);
asmlinkage long sys_socketpair(int, int, int, int __user *);
asmlinkage long sys_socketcall(int call, unsigned long __user *args);
diff --git a/include/net/compat.h b/include/net/compat.h
index 7c30028..9679f05 100644
--- a/include/net/compat.h
+++ b/include/net/compat.h
@@ -18,6 +18,11 @@ struct compat_msghdr {
compat_uint_t msg_flags;
};
+struct compat_mmsghdr {
+ struct compat_msghdr msg_hdr;
+ compat_uint_t msg_len;
+};
+
struct compat_cmsghdr {
compat_size_t cmsg_len;
compat_int_t cmsg_level;
@@ -35,6 +40,9 @@ extern int get_compat_msghdr(struct msghdr *, struct compat_msghdr __user *);
extern int verify_compat_iovec(struct msghdr *, struct iovec *, struct sockaddr *, int);
extern asmlinkage long compat_sys_sendmsg(int,struct compat_msghdr __user *,unsigned);
extern asmlinkage long compat_sys_recvmsg(int,struct compat_msghdr __user *,unsigned);
+extern asmlinkage long compat_sys_recvmmsg(int, struct compat_mmsghdr __user *,
+ unsigned, unsigned,
+ struct timespec __user *);
extern asmlinkage long compat_sys_getsockopt(int, int, int, char __user *, int __user *);
extern int put_cmsg_compat(struct msghdr*, int, int, int, void *);
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index e06d0b8..f050ba8 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -48,8 +48,10 @@ cond_syscall(sys_shutdown);
cond_syscall(sys_sendmsg);
cond_syscall(compat_sys_sendmsg);
cond_syscall(sys_recvmsg);
+cond_syscall(sys_recvmmsg);
cond_syscall(compat_sys_recvmsg);
cond_syscall(compat_sys_recvfrom);
+cond_syscall(compat_sys_recvmmsg);
cond_syscall(sys_socketcall);
cond_syscall(sys_futex);
cond_syscall(compat_sys_futex);
diff --git a/net/compat.c b/net/compat.c
index a407c3a..e13f525 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -727,10 +727,10 @@ EXPORT_SYMBOL(compat_mc_getsockopt);
/* Argument list sizes for compat_sys_socketcall */
#define AL(x) ((x) * sizeof(u32))
-static unsigned char nas[19]={AL(0),AL(3),AL(3),AL(3),AL(2),AL(3),
+static unsigned char nas[20]={AL(0),AL(3),AL(3),AL(3),AL(2),AL(3),
AL(3),AL(3),AL(4),AL(4),AL(4),AL(6),
AL(6),AL(2),AL(5),AL(5),AL(3),AL(3),
- AL(4)};
+ AL(4),AL(5)};
#undef AL
asmlinkage long compat_sys_sendmsg(int fd, struct compat_msghdr __user *msg, unsigned flags)
@@ -755,13 +755,36 @@ asmlinkage long compat_sys_recvfrom(int fd, void __user *buf, size_t len,
return sys_recvfrom(fd, buf, len, flags | MSG_CMSG_COMPAT, addr, addrlen);
}
+asmlinkage long compat_sys_recvmmsg(int fd, struct compat_mmsghdr __user *mmsg,
+ unsigned vlen, unsigned int flags,
+ struct timespec __user *timeout)
+{
+ int datagrams;
+ struct timespec ktspec;
+ struct compat_timespec __user *utspec =
+ (struct compat_timespec __user *)timeout;
+
+ if (get_user(ktspec.tv_sec, &utspec->tv_sec) ||
+ get_user(ktspec.tv_nsec, &utspec->tv_nsec))
+ return -EFAULT;
+
+ datagrams = __sys_recvmmsg(fd, (struct mmsghdr __user *)mmsg, vlen,
+ flags | MSG_CMSG_COMPAT, &ktspec);
+ if (datagrams > 0 &&
+ (put_user(ktspec.tv_sec, &utspec->tv_sec) ||
+ put_user(ktspec.tv_nsec, &utspec->tv_nsec)))
+ datagrams = -EFAULT;
+
+ return datagrams;
+}
+
asmlinkage long compat_sys_socketcall(int call, u32 __user *args)
{
int ret;
u32 a[6];
u32 a0, a1;
- if (call < SYS_SOCKET || call > SYS_ACCEPT4)
+ if (call < SYS_SOCKET || call > SYS_RECVMMSG)
return -EINVAL;
if (copy_from_user(a, args, nas[call]))
return -EFAULT;
@@ -823,6 +846,10 @@ asmlinkage long compat_sys_socketcall(int call, u32 __user *args)
case SYS_RECVMSG:
ret = compat_sys_recvmsg(a0, compat_ptr(a1), a[2]);
break;
+ case SYS_RECVMMSG:
+ ret = compat_sys_recvmmsg(a0, compat_ptr(a1), a[2], a[3],
+ compat_ptr(a[4]));
+ break;
case SYS_ACCEPT4:
ret = sys_accept4(a0, compat_ptr(a1), compat_ptr(a[2]), a[3]);
break;
diff --git a/net/socket.c b/net/socket.c
index 954f338..3dd03df 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -668,10 +668,9 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
EXPORT_SYMBOL_GPL(__sock_recv_timestamp);
-static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock,
- struct msghdr *msg, size_t size, int flags)
+static inline int __sock_recvmsg_nosec(struct kiocb *iocb, struct socket *sock,
+ struct msghdr *msg, size_t size, int flags)
{
- int err;
struct sock_iocb *si = kiocb_to_siocb(iocb);
si->sock = sock;
@@ -680,13 +679,17 @@ static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock,
si->size = size;
si->flags = flags;
- err = security_socket_recvmsg(sock, msg, size, flags);
- if (err)
- return err;
-
return sock->ops->recvmsg(iocb, sock, msg, size, flags);
}
+static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock,
+ struct msghdr *msg, size_t size, int flags)
+{
+ int err = security_socket_recvmsg(sock, msg, size, flags);
+
+ return err ?: __sock_recvmsg_nosec(iocb, sock, msg, size, flags);
+}
+
int sock_recvmsg(struct socket *sock, struct msghdr *msg,
size_t size, int flags)
{
@@ -702,6 +705,21 @@ int sock_recvmsg(struct socket *sock, struct msghdr *msg,
return ret;
}
+static int sock_recvmsg_nosec(struct socket *sock, struct msghdr *msg,
+ size_t size, int flags)
+{
+ struct kiocb iocb;
+ struct sock_iocb siocb;
+ int ret;
+
+ init_sync_kiocb(&iocb, NULL);
+ iocb.private = &siocb;
+ ret = __sock_recvmsg_nosec(&iocb, sock, msg, size, flags);
+ if (-EIOCBQUEUED == ret)
+ ret = wait_on_sync_kiocb(&iocb);
+ return ret;
+}
+
int kernel_recvmsg(struct socket *sock, struct msghdr *msg,
struct kvec *vec, size_t num, size_t size, int flags)
{
@@ -1968,22 +1986,15 @@ out:
return err;
}
-/*
- * BSD recvmsg interface
- */
-
-SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
- unsigned int, flags)
+static int __sys_recvmsg(struct socket *sock, struct msghdr __user *msg,
+ struct msghdr *msg_sys, unsigned flags, int nosec)
{
struct compat_msghdr __user *msg_compat =
(struct compat_msghdr __user *)msg;
- struct socket *sock;
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
- struct msghdr msg_sys;
unsigned long cmsg_ptr;
int err, iov_size, total_len, len;
- int fput_needed;
/* kernel mode address */
struct sockaddr_storage addr;
@@ -1993,27 +2004,23 @@ SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
int __user *uaddr_len;
if (MSG_CMSG_COMPAT & flags) {
- if (get_compat_msghdr(&msg_sys, msg_compat))
+ if (get_compat_msghdr(msg_sys, msg_compat))
return -EFAULT;
}
- else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr)))
+ else if (copy_from_user(msg_sys, msg, sizeof(struct msghdr)))
return -EFAULT;
- sock = sockfd_lookup_light(fd, &err, &fput_needed);
- if (!sock)
- goto out;
-
err = -EMSGSIZE;
- if (msg_sys.msg_iovlen > UIO_MAXIOV)
- goto out_put;
+ if (msg_sys->msg_iovlen > UIO_MAXIOV)
+ goto out;
/* Check whether to allocate the iovec area */
err = -ENOMEM;
- iov_size = msg_sys.msg_iovlen * sizeof(struct iovec);
- if (msg_sys.msg_iovlen > UIO_FASTIOV) {
+ iov_size = msg_sys->msg_iovlen * sizeof(struct iovec);
+ if (msg_sys->msg_iovlen > UIO_FASTIOV) {
iov = sock_kmalloc(sock->sk, iov_size, GFP_KERNEL);
if (!iov)
- goto out_put;
+ goto out;
}
/*
@@ -2021,46 +2028,47 @@ SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
* kernel msghdr to use the kernel address space)
*/
- uaddr = (__force void __user *)msg_sys.msg_name;
+ uaddr = (__force void __user *)msg_sys->msg_name;
uaddr_len = COMPAT_NAMELEN(msg);
if (MSG_CMSG_COMPAT & flags) {
- err = verify_compat_iovec(&msg_sys, iov,
+ err = verify_compat_iovec(msg_sys, iov,
(struct sockaddr *)&addr,
VERIFY_WRITE);
} else
- err = verify_iovec(&msg_sys, iov,
+ err = verify_iovec(msg_sys, iov,
(struct sockaddr *)&addr,
VERIFY_WRITE);
if (err < 0)
goto out_freeiov;
total_len = err;
- cmsg_ptr = (unsigned long)msg_sys.msg_control;
- msg_sys.msg_flags = flags & (MSG_CMSG_CLOEXEC|MSG_CMSG_COMPAT);
+ cmsg_ptr = (unsigned long)msg_sys->msg_control;
+ msg_sys->msg_flags = flags & (MSG_CMSG_CLOEXEC|MSG_CMSG_COMPAT);
if (sock->file->f_flags & O_NONBLOCK)
flags |= MSG_DONTWAIT;
- err = sock_recvmsg(sock, &msg_sys, total_len, flags);
+ err = (nosec ? sock_recvmsg_nosec : sock_recvmsg)(sock, msg_sys,
+ total_len, flags);
if (err < 0)
goto out_freeiov;
len = err;
if (uaddr != NULL) {
err = move_addr_to_user((struct sockaddr *)&addr,
- msg_sys.msg_namelen, uaddr,
+ msg_sys->msg_namelen, uaddr,
uaddr_len);
if (err < 0)
goto out_freeiov;
}
- err = __put_user((msg_sys.msg_flags & ~MSG_CMSG_COMPAT),
+ err = __put_user((msg_sys->msg_flags & ~MSG_CMSG_COMPAT),
COMPAT_FLAGS(msg));
if (err)
goto out_freeiov;
if (MSG_CMSG_COMPAT & flags)
- err = __put_user((unsigned long)msg_sys.msg_control - cmsg_ptr,
+ err = __put_user((unsigned long)msg_sys->msg_control - cmsg_ptr,
&msg_compat->msg_controllen);
else
- err = __put_user((unsigned long)msg_sys.msg_control - cmsg_ptr,
+ err = __put_user((unsigned long)msg_sys->msg_control - cmsg_ptr,
&msg->msg_controllen);
if (err)
goto out_freeiov;
@@ -2069,21 +2077,150 @@ SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
out_freeiov:
if (iov != iovstack)
sock_kfree_s(sock->sk, iov, iov_size);
-out_put:
+out:
+ return err;
+}
+
+/*
+ * BSD recvmsg interface
+ */
+
+SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
+ unsigned int, flags)
+{
+ int fput_needed, err;
+ struct msghdr msg_sys;
+ struct socket *sock = sockfd_lookup_light(fd, &err, &fput_needed);
+
+ if (!sock)
+ goto out;
+
+ err = __sys_recvmsg(sock, msg, &msg_sys, flags, 0);
+
fput_light(sock->file, fput_needed);
out:
return err;
}
-#ifdef __ARCH_WANT_SYS_SOCKETCALL
+/*
+ * Linux recvmmsg interface
+ */
+
+int __sys_recvmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
+ unsigned int flags, struct timespec *timeout)
+{
+ int fput_needed, err, datagrams;
+ struct socket *sock;
+ struct mmsghdr __user *entry;
+ struct msghdr msg_sys;
+ struct timespec end_time;
+
+ if (timeout &&
+ poll_select_set_timeout(&end_time, timeout->tv_sec,
+ timeout->tv_nsec))
+ return -EINVAL;
+
+ datagrams = 0;
+
+ sock = sockfd_lookup_light(fd, &err, &fput_needed);
+ if (!sock)
+ return err;
+
+ err = sock_error(sock->sk);
+ if (err)
+ goto out_put;
+
+ entry = mmsg;
+
+ while (datagrams < vlen) {
+ /*
+ * No need to ask LSM for more than the first datagram.
+ */
+ err = __sys_recvmsg(sock, (struct msghdr __user *)entry,
+ &msg_sys, flags, datagrams);
+ if (err < 0)
+ break;
+ err = put_user(err, &entry->msg_len);
+ if (err)
+ break;
+ ++entry;
+ ++datagrams;
+
+ if (timeout) {
+ ktime_get_ts(timeout);
+ *timeout = timespec_sub(end_time, *timeout);
+ if (timeout->tv_sec < 0) {
+ timeout->tv_sec = timeout->tv_nsec = 0;
+ break;
+ }
+
+ /* Timeout, return less than vlen datagrams */
+ if (timeout->tv_nsec == 0 && timeout->tv_sec == 0)
+ break;
+ }
+
+ /* Out of band data, return right away */
+ if (msg_sys.msg_flags & MSG_OOB)
+ break;
+ }
+
+out_put:
+ fput_light(sock->file, fput_needed);
+ if (err == 0)
+ return datagrams;
+
+ if (datagrams != 0) {
+ /*
+ * We may return less entries than requested (vlen) if the
+ * sock is non block and there aren't enough datagrams...
+ */
+ if (err != -EAGAIN) {
+ /*
+ * ... or if recvmsg returns an error after we
+ * received some datagrams, where we record the
+ * error to return on the next call or if the
+ * app asks about it using getsockopt(SO_ERROR).
+ */
+ sock->sk->sk_err = -err;
+ }
+
+ return datagrams;
+ }
+
+ return err;
+}
+
+SYSCALL_DEFINE5(recvmmsg, int, fd, struct mmsghdr __user *, mmsg,
+ unsigned int, vlen, unsigned int, flags,
+ struct timespec __user *, timeout)
+{
+ int datagrams;
+ struct timespec timeout_sys;
+
+ if (!timeout)
+ return __sys_recvmmsg(fd, mmsg, vlen, flags, NULL);
+
+ if (copy_from_user(&timeout_sys, timeout, sizeof(timeout_sys)))
+ return -EFAULT;
+
+ datagrams = __sys_recvmmsg(fd, mmsg, vlen, flags, &timeout_sys);
+
+ if (datagrams > 0 &&
+ copy_to_user(timeout, &timeout_sys, sizeof(timeout_sys)))
+ datagrams = -EFAULT;
+
+ return datagrams;
+}
+
+#ifdef __ARCH_WANT_SYS_SOCKETCALL
/* Argument list sizes for sys_socketcall */
#define AL(x) ((x) * sizeof(unsigned long))
-static const unsigned char nargs[19]={
+static const unsigned char nargs[20] = {
AL(0),AL(3),AL(3),AL(3),AL(2),AL(3),
AL(3),AL(3),AL(4),AL(4),AL(4),AL(6),
AL(6),AL(2),AL(5),AL(5),AL(3),AL(3),
- AL(4)
+ AL(4),AL(5)
};
#undef AL
@@ -2103,7 +2240,7 @@ SYSCALL_DEFINE2(socketcall, int, call, unsigned long __user *, args)
int err;
unsigned int len;
- if (call < 1 || call > SYS_ACCEPT4)
+ if (call < 1 || call > SYS_RECVMMSG)
return -EINVAL;
len = nargs[call];
@@ -2181,6 +2318,10 @@ SYSCALL_DEFINE2(socketcall, int, call, unsigned long __user *, args)
case SYS_RECVMSG:
err = sys_recvmsg(a0, (struct msghdr __user *)a1, a[2]);
break;
+ case SYS_RECVMMSG:
+ err = sys_recvmmsg(a0, (struct mmsghdr __user *)a1, a[2], a[3],
+ (struct timespec __user *)a[4]);
+ break;
case SYS_ACCEPT4:
err = sys_accept4(a0, (struct sockaddr __user *)a1,
(int __user *)a[2], a[3]);
--
1.5.5.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 1/1] net: Introduce recvmmsg socket syscall
2009-10-12 16:20 [PATCH 1/1] net: Introduce recvmmsg socket syscall Arnaldo Carvalho de Melo
@ 2009-10-12 17:53 ` Nir Tzachar
2009-10-13 1:56 ` Arnaldo Carvalho de Melo
2009-10-13 6:40 ` David Miller
1 sibling, 1 reply; 5+ messages in thread
From: Nir Tzachar @ 2009-10-12 17:53 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: David Miller, netdev, Arnaldo Carvalho de Melo, Caitlin Bestler,
Chris Van Hoof, Clark Williams, Neil Horman, Nivedita Singhvi,
Paul Moore, Rémi Denis-Courmont, Steven Whitehouse
Hi Arnaldo.
Do you have any plans on how we can further investigate the delays I
have seen with the second part of the patch? I have tried to simply
unlock/lock the socket's mutex every couple of iterations inside the
loop (to allow the system to process some backlog), but this seems to
have little to no effect.
Also, a way to enable/disable the no_lock version at runtime will
greatly help in testing. Maybe by first introducing a second syscall,
recvmmsg_no_lock, for testing purposes??
Cheers,
Nir.
On Mon, Oct 12, 2009 at 6:20 PM, Arnaldo Carvalho de Melo
<acme@ghostprotocols.net> wrote:
> Meaning receive multiple messages, reducing the number of syscalls and
> net stack entry/exit operations.
>
> Next patches will introduce mechanisms where protocols that want to
> optimize this operation will provide an unlocked_recvmsg operation.
>
> This takes into account comments made by:
>
> . Paul Moore: sock_recvmsg is called only for the first datagram,
> sock_recvmsg_nosec is used for the rest.
>
> . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
> works in the same fashion as the ppoll one.
>
> If the underlying protocol returns a datagram with MSG_OOB set, this
> will make recvmmsg return right away with as many datagrams (+ the OOB
> one) it has received so far.
>
> . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
> datagrams and then recvmsg returns an error, recvmmsg will return
> the successfully received datagrams, store the error and return it
> in the next call.
>
> This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
> where we will be able to acquire the lock only at batch start and end, not at
> every underlying recvmsg call.
>
> Cc: Caitlin Bestler <caitlin.bestler@gmail.com>
> Cc: Chris Van Hoof <vanhoof@redhat.com>
> Cc: Clark Williams <williams@redhat.com>
> Cc: Neil Horman <nhorman@tuxdriver.com>
> Cc: Nir Tzachar <nir.tzachar@gmail.com>
> Cc: Nivedita Singhvi <niv@us.ibm.com>
> Cc: Paul Moore <paul.moore@hp.com>
> Cc: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
> Cc: Steven Whitehouse <steve@chygwyn.com>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> ---
> arch/alpha/kernel/systbls.S | 1 +
> arch/arm/kernel/calls.S | 1 +
> arch/avr32/kernel/syscall_table.S | 1 +
> arch/blackfin/mach-common/entry.S | 1 +
> arch/ia64/kernel/entry.S | 1 +
> arch/microblaze/kernel/syscall_table.S | 1 +
> arch/mips/kernel/scall32-o32.S | 1 +
> arch/mips/kernel/scall64-64.S | 1 +
> arch/mips/kernel/scall64-n32.S | 1 +
> arch/mips/kernel/scall64-o32.S | 1 +
> arch/sh/kernel/syscalls_64.S | 1 +
> arch/sparc/kernel/systbls_64.S | 4 +-
> arch/x86/ia32/ia32entry.S | 1 +
> arch/x86/include/asm/unistd_32.h | 3 +-
> arch/x86/include/asm/unistd_64.h | 2 +
> arch/x86/kernel/syscall_table_32.S | 1 +
> arch/xtensa/include/asm/unistd.h | 4 +-
> include/linux/net.h | 1 +
> include/linux/socket.h | 10 ++
> include/linux/syscalls.h | 4 +
> include/net/compat.h | 8 +
> kernel/sys_ni.c | 2 +
> net/compat.c | 33 +++++-
> net/socket.c | 225 ++++++++++++++++++++++++++------
> 24 files changed, 260 insertions(+), 49 deletions(-)
>
> diff --git a/arch/alpha/kernel/systbls.S b/arch/alpha/kernel/systbls.S
> index 95c9aef..cda6b8b 100644
> --- a/arch/alpha/kernel/systbls.S
> +++ b/arch/alpha/kernel/systbls.S
> @@ -497,6 +497,7 @@ sys_call_table:
> .quad sys_signalfd
> .quad sys_ni_syscall
> .quad sys_eventfd
> + .quad sys_recvmmsg
>
> .size sys_call_table, . - sys_call_table
> .type sys_call_table, @object
> diff --git a/arch/arm/kernel/calls.S b/arch/arm/kernel/calls.S
> index fafce1b..f58c115 100644
> --- a/arch/arm/kernel/calls.S
> +++ b/arch/arm/kernel/calls.S
> @@ -374,6 +374,7 @@
> CALL(sys_pwritev)
> CALL(sys_rt_tgsigqueueinfo)
> CALL(sys_perf_event_open)
> +/* 365 */ CALL(sys_recvmmsg)
> #ifndef syscalls_counted
> .equ syscalls_padding, ((NR_syscalls + 3) & ~3) - NR_syscalls
> #define syscalls_counted
> diff --git a/arch/avr32/kernel/syscall_table.S b/arch/avr32/kernel/syscall_table.S
> index 7ee0057..e76bad1 100644
> --- a/arch/avr32/kernel/syscall_table.S
> +++ b/arch/avr32/kernel/syscall_table.S
> @@ -295,4 +295,5 @@ sys_call_table:
> .long sys_signalfd
> .long sys_ni_syscall /* 280, was sys_timerfd */
> .long sys_eventfd
> + .long sys_recvmmsg
> .long sys_ni_syscall /* r8 is saturated at nr_syscalls */
> diff --git a/arch/blackfin/mach-common/entry.S b/arch/blackfin/mach-common/entry.S
> index 1e7cac2..4869272 100644
> --- a/arch/blackfin/mach-common/entry.S
> +++ b/arch/blackfin/mach-common/entry.S
> @@ -1621,6 +1621,7 @@ ENTRY(_sys_call_table)
> .long _sys_pwritev
> .long _sys_rt_tgsigqueueinfo
> .long _sys_perf_event_open
> + .long _sys_recvmmsg /* 370 */
>
> .rept NR_syscalls-(.-_sys_call_table)/4
> .long _sys_ni_syscall
> diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S
> index d0e7d37..d75b872 100644
> --- a/arch/ia64/kernel/entry.S
> +++ b/arch/ia64/kernel/entry.S
> @@ -1806,6 +1806,7 @@ sys_call_table:
> data8 sys_preadv
> data8 sys_pwritev // 1320
> data8 sys_rt_tgsigqueueinfo
> + data8 sys_recvmmsg
>
> .org sys_call_table + 8*NR_syscalls // guard against failures to increase NR_syscalls
> #endif /* __IA64_ASM_PARAVIRTUALIZED_NATIVE */
> diff --git a/arch/microblaze/kernel/syscall_table.S b/arch/microblaze/kernel/syscall_table.S
> index ecec191..c1ab1dc 100644
> --- a/arch/microblaze/kernel/syscall_table.S
> +++ b/arch/microblaze/kernel/syscall_table.S
> @@ -371,3 +371,4 @@ ENTRY(sys_call_table)
> .long sys_ni_syscall
> .long sys_rt_tgsigqueueinfo /* 365 */
> .long sys_perf_event_open
> + .long sys_recvmmsg
> diff --git a/arch/mips/kernel/scall32-o32.S b/arch/mips/kernel/scall32-o32.S
> index fd2a9bb..17202bb 100644
> --- a/arch/mips/kernel/scall32-o32.S
> +++ b/arch/mips/kernel/scall32-o32.S
> @@ -583,6 +583,7 @@ einval: li v0, -ENOSYS
> sys sys_rt_tgsigqueueinfo 4
> sys sys_perf_event_open 5
> sys sys_accept4 4
> + sys sys_recvmmsg 5
> .endm
>
> /* We pre-compute the number of _instruction_ bytes needed to
> diff --git a/arch/mips/kernel/scall64-64.S b/arch/mips/kernel/scall64-64.S
> index 18bf7f3..a8a6c59 100644
> --- a/arch/mips/kernel/scall64-64.S
> +++ b/arch/mips/kernel/scall64-64.S
> @@ -420,4 +420,5 @@ sys_call_table:
> PTR sys_rt_tgsigqueueinfo
> PTR sys_perf_event_open
> PTR sys_accept4
> + PTR sys_recvmmsg
> .size sys_call_table,.-sys_call_table
> diff --git a/arch/mips/kernel/scall64-n32.S b/arch/mips/kernel/scall64-n32.S
> index 6ebc079..5154e64 100644
> --- a/arch/mips/kernel/scall64-n32.S
> +++ b/arch/mips/kernel/scall64-n32.S
> @@ -418,4 +418,5 @@ EXPORT(sysn32_call_table)
> PTR compat_sys_rt_tgsigqueueinfo /* 5295 */
> PTR sys_perf_event_open
> PTR sys_accept4
> + PTR compat_sys_recvmmsg
> .size sysn32_call_table,.-sysn32_call_table
> diff --git a/arch/mips/kernel/scall64-o32.S b/arch/mips/kernel/scall64-o32.S
> index 9bbf977..d0eff53 100644
> --- a/arch/mips/kernel/scall64-o32.S
> +++ b/arch/mips/kernel/scall64-o32.S
> @@ -538,4 +538,5 @@ sys_call_table:
> PTR compat_sys_rt_tgsigqueueinfo
> PTR sys_perf_event_open
> PTR sys_accept4
> + PTR compat_sys_recvmmsg
> .size sys_call_table,.-sys_call_table
> diff --git a/arch/sh/kernel/syscalls_64.S b/arch/sh/kernel/syscalls_64.S
> index 5bfde6c..07d2aae 100644
> --- a/arch/sh/kernel/syscalls_64.S
> +++ b/arch/sh/kernel/syscalls_64.S
> @@ -391,3 +391,4 @@ sys_call_table:
> .long sys_pwritev
> .long sys_rt_tgsigqueueinfo
> .long sys_perf_event_open
> + .long sys_recvmmsg /* 365 */
> diff --git a/arch/sparc/kernel/systbls_64.S b/arch/sparc/kernel/systbls_64.S
> index 009825f..f37bef7 100644
> --- a/arch/sparc/kernel/systbls_64.S
> +++ b/arch/sparc/kernel/systbls_64.S
> @@ -83,7 +83,7 @@ sys_call_table32:
> /*310*/ .word compat_sys_utimensat, compat_sys_signalfd, sys_timerfd_create, sys_eventfd, compat_sys_fallocate
> .word compat_sys_timerfd_settime, compat_sys_timerfd_gettime, compat_sys_signalfd4, sys_eventfd2, sys_epoll_create1
> /*320*/ .word sys_dup3, sys_pipe2, sys_inotify_init1, sys_accept4, compat_sys_preadv
> - .word compat_sys_pwritev, compat_sys_rt_tgsigqueueinfo, sys_perf_event_open
> + .word compat_sys_pwritev, compat_sys_rt_tgsigqueueinfo, sys_perf_event_open, compat_sys_recvmmsg
>
> #endif /* CONFIG_COMPAT */
>
> @@ -158,4 +158,4 @@ sys_call_table:
> /*310*/ .word sys_utimensat, sys_signalfd, sys_timerfd_create, sys_eventfd, sys_fallocate
> .word sys_timerfd_settime, sys_timerfd_gettime, sys_signalfd4, sys_eventfd2, sys_epoll_create1
> /*320*/ .word sys_dup3, sys_pipe2, sys_inotify_init1, sys_accept4, sys_preadv
> - .word sys_pwritev, sys_rt_tgsigqueueinfo, sys_perf_event_open
> + .word sys_pwritev, sys_rt_tgsigqueueinfo, sys_perf_event_open, sys_recvmmsg
> diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
> index 74619c4..11a6c79 100644
> --- a/arch/x86/ia32/ia32entry.S
> +++ b/arch/x86/ia32/ia32entry.S
> @@ -832,4 +832,5 @@ ia32_sys_call_table:
> .quad compat_sys_pwritev
> .quad compat_sys_rt_tgsigqueueinfo /* 335 */
> .quad sys_perf_event_open
> + .quad compat_sys_recvmmsg
> ia32_syscall_end:
> diff --git a/arch/x86/include/asm/unistd_32.h b/arch/x86/include/asm/unistd_32.h
> index 6fb3c20..3baf379 100644
> --- a/arch/x86/include/asm/unistd_32.h
> +++ b/arch/x86/include/asm/unistd_32.h
> @@ -342,10 +342,11 @@
> #define __NR_pwritev 334
> #define __NR_rt_tgsigqueueinfo 335
> #define __NR_perf_event_open 336
> +#define __NR_recvmmsg 337
>
> #ifdef __KERNEL__
>
> -#define NR_syscalls 337
> +#define NR_syscalls 338
>
> #define __ARCH_WANT_IPC_PARSE_VERSION
> #define __ARCH_WANT_OLD_READDIR
> diff --git a/arch/x86/include/asm/unistd_64.h b/arch/x86/include/asm/unistd_64.h
> index 8d3ad0a..4843f7b 100644
> --- a/arch/x86/include/asm/unistd_64.h
> +++ b/arch/x86/include/asm/unistd_64.h
> @@ -661,6 +661,8 @@ __SYSCALL(__NR_pwritev, sys_pwritev)
> __SYSCALL(__NR_rt_tgsigqueueinfo, sys_rt_tgsigqueueinfo)
> #define __NR_perf_event_open 298
> __SYSCALL(__NR_perf_event_open, sys_perf_event_open)
> +#define __NR_recvmmsg 299
> +__SYSCALL(__NR_recvmmsg, sys_recvmmsg)
>
> #ifndef __NO_STUBS
> #define __ARCH_WANT_OLD_READDIR
> diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S
> index 0157cd2..70c2125 100644
> --- a/arch/x86/kernel/syscall_table_32.S
> +++ b/arch/x86/kernel/syscall_table_32.S
> @@ -336,3 +336,4 @@ ENTRY(sys_call_table)
> .long sys_pwritev
> .long sys_rt_tgsigqueueinfo /* 335 */
> .long sys_perf_event_open
> + .long sys_recvmmsg
> diff --git a/arch/xtensa/include/asm/unistd.h b/arch/xtensa/include/asm/unistd.h
> index c092c8f..4e55dc7 100644
> --- a/arch/xtensa/include/asm/unistd.h
> +++ b/arch/xtensa/include/asm/unistd.h
> @@ -681,8 +681,10 @@ __SYSCALL(304, sys_signalfd, 3)
> __SYSCALL(305, sys_ni_syscall, 0)
> #define __NR_eventfd 306
> __SYSCALL(306, sys_eventfd, 1)
> +#define __NR_recvmmsg 307
> +__SYSCALL(307, sys_recvmmsg, 5)
>
> -#define __NR_syscall_count 307
> +#define __NR_syscall_count 308
>
> /*
> * sysxtensa syscall handler
> diff --git a/include/linux/net.h b/include/linux/net.h
> index 529a093..b42bb60 100644
> --- a/include/linux/net.h
> +++ b/include/linux/net.h
> @@ -41,6 +41,7 @@
> #define SYS_SENDMSG 16 /* sys_sendmsg(2) */
> #define SYS_RECVMSG 17 /* sys_recvmsg(2) */
> #define SYS_ACCEPT4 18 /* sys_accept4(2) */
> +#define SYS_RECVMMSG 19 /* sys_recvmmsg(2) */
>
> typedef enum {
> SS_FREE = 0, /* not allocated */
> diff --git a/include/linux/socket.h b/include/linux/socket.h
> index 3273a0c..59966f1 100644
> --- a/include/linux/socket.h
> +++ b/include/linux/socket.h
> @@ -65,6 +65,12 @@ struct msghdr {
> unsigned msg_flags;
> };
>
> +/* For recvmmsg/sendmmsg */
> +struct mmsghdr {
> + struct msghdr msg_hdr;
> + unsigned msg_len;
> +};
> +
> /*
> * POSIX 1003.1g - ancillary data object information
> * Ancillary data consits of a sequence of pairs of
> @@ -312,6 +318,10 @@ extern int move_addr_to_user(struct sockaddr *kaddr, int klen, void __user *uadd
> extern int move_addr_to_kernel(void __user *uaddr, int ulen, struct sockaddr *kaddr);
> extern int put_cmsg(struct msghdr*, int level, int type, int len, void *data);
>
> +struct timespec;
> +
> +extern int __sys_recvmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
> + unsigned int flags, struct timespec *timeout);
> #endif
> #endif /* not kernel and not glibc */
> #endif /* _LINUX_SOCKET_H */
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index a990ace..714f063 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -25,6 +25,7 @@ struct linux_dirent64;
> struct list_head;
> struct msgbuf;
> struct msghdr;
> +struct mmsghdr;
> struct msqid_ds;
> struct new_utsname;
> struct nfsctl_arg;
> @@ -677,6 +678,9 @@ asmlinkage long sys_recv(int, void __user *, size_t, unsigned);
> asmlinkage long sys_recvfrom(int, void __user *, size_t, unsigned,
> struct sockaddr __user *, int __user *);
> asmlinkage long sys_recvmsg(int fd, struct msghdr __user *msg, unsigned flags);
> +asmlinkage long sys_recvmmsg(int fd, struct mmsghdr __user *msg,
> + unsigned int vlen, unsigned flags,
> + struct timespec __user *timeout);
> asmlinkage long sys_socket(int, int, int);
> asmlinkage long sys_socketpair(int, int, int, int __user *);
> asmlinkage long sys_socketcall(int call, unsigned long __user *args);
> diff --git a/include/net/compat.h b/include/net/compat.h
> index 7c30028..9679f05 100644
> --- a/include/net/compat.h
> +++ b/include/net/compat.h
> @@ -18,6 +18,11 @@ struct compat_msghdr {
> compat_uint_t msg_flags;
> };
>
> +struct compat_mmsghdr {
> + struct compat_msghdr msg_hdr;
> + compat_uint_t msg_len;
> +};
> +
> struct compat_cmsghdr {
> compat_size_t cmsg_len;
> compat_int_t cmsg_level;
> @@ -35,6 +40,9 @@ extern int get_compat_msghdr(struct msghdr *, struct compat_msghdr __user *);
> extern int verify_compat_iovec(struct msghdr *, struct iovec *, struct sockaddr *, int);
> extern asmlinkage long compat_sys_sendmsg(int,struct compat_msghdr __user *,unsigned);
> extern asmlinkage long compat_sys_recvmsg(int,struct compat_msghdr __user *,unsigned);
> +extern asmlinkage long compat_sys_recvmmsg(int, struct compat_mmsghdr __user *,
> + unsigned, unsigned,
> + struct timespec __user *);
> extern asmlinkage long compat_sys_getsockopt(int, int, int, char __user *, int __user *);
> extern int put_cmsg_compat(struct msghdr*, int, int, int, void *);
>
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index e06d0b8..f050ba8 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -48,8 +48,10 @@ cond_syscall(sys_shutdown);
> cond_syscall(sys_sendmsg);
> cond_syscall(compat_sys_sendmsg);
> cond_syscall(sys_recvmsg);
> +cond_syscall(sys_recvmmsg);
> cond_syscall(compat_sys_recvmsg);
> cond_syscall(compat_sys_recvfrom);
> +cond_syscall(compat_sys_recvmmsg);
> cond_syscall(sys_socketcall);
> cond_syscall(sys_futex);
> cond_syscall(compat_sys_futex);
> diff --git a/net/compat.c b/net/compat.c
> index a407c3a..e13f525 100644
> --- a/net/compat.c
> +++ b/net/compat.c
> @@ -727,10 +727,10 @@ EXPORT_SYMBOL(compat_mc_getsockopt);
>
> /* Argument list sizes for compat_sys_socketcall */
> #define AL(x) ((x) * sizeof(u32))
> -static unsigned char nas[19]={AL(0),AL(3),AL(3),AL(3),AL(2),AL(3),
> +static unsigned char nas[20]={AL(0),AL(3),AL(3),AL(3),AL(2),AL(3),
> AL(3),AL(3),AL(4),AL(4),AL(4),AL(6),
> AL(6),AL(2),AL(5),AL(5),AL(3),AL(3),
> - AL(4)};
> + AL(4),AL(5)};
> #undef AL
>
> asmlinkage long compat_sys_sendmsg(int fd, struct compat_msghdr __user *msg, unsigned flags)
> @@ -755,13 +755,36 @@ asmlinkage long compat_sys_recvfrom(int fd, void __user *buf, size_t len,
> return sys_recvfrom(fd, buf, len, flags | MSG_CMSG_COMPAT, addr, addrlen);
> }
>
> +asmlinkage long compat_sys_recvmmsg(int fd, struct compat_mmsghdr __user *mmsg,
> + unsigned vlen, unsigned int flags,
> + struct timespec __user *timeout)
> +{
> + int datagrams;
> + struct timespec ktspec;
> + struct compat_timespec __user *utspec =
> + (struct compat_timespec __user *)timeout;
> +
> + if (get_user(ktspec.tv_sec, &utspec->tv_sec) ||
> + get_user(ktspec.tv_nsec, &utspec->tv_nsec))
> + return -EFAULT;
> +
> + datagrams = __sys_recvmmsg(fd, (struct mmsghdr __user *)mmsg, vlen,
> + flags | MSG_CMSG_COMPAT, &ktspec);
> + if (datagrams > 0 &&
> + (put_user(ktspec.tv_sec, &utspec->tv_sec) ||
> + put_user(ktspec.tv_nsec, &utspec->tv_nsec)))
> + datagrams = -EFAULT;
> +
> + return datagrams;
> +}
> +
> asmlinkage long compat_sys_socketcall(int call, u32 __user *args)
> {
> int ret;
> u32 a[6];
> u32 a0, a1;
>
> - if (call < SYS_SOCKET || call > SYS_ACCEPT4)
> + if (call < SYS_SOCKET || call > SYS_RECVMMSG)
> return -EINVAL;
> if (copy_from_user(a, args, nas[call]))
> return -EFAULT;
> @@ -823,6 +846,10 @@ asmlinkage long compat_sys_socketcall(int call, u32 __user *args)
> case SYS_RECVMSG:
> ret = compat_sys_recvmsg(a0, compat_ptr(a1), a[2]);
> break;
> + case SYS_RECVMMSG:
> + ret = compat_sys_recvmmsg(a0, compat_ptr(a1), a[2], a[3],
> + compat_ptr(a[4]));
> + break;
> case SYS_ACCEPT4:
> ret = sys_accept4(a0, compat_ptr(a1), compat_ptr(a[2]), a[3]);
> break;
> diff --git a/net/socket.c b/net/socket.c
> index 954f338..3dd03df 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -668,10 +668,9 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
>
> EXPORT_SYMBOL_GPL(__sock_recv_timestamp);
>
> -static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock,
> - struct msghdr *msg, size_t size, int flags)
> +static inline int __sock_recvmsg_nosec(struct kiocb *iocb, struct socket *sock,
> + struct msghdr *msg, size_t size, int flags)
> {
> - int err;
> struct sock_iocb *si = kiocb_to_siocb(iocb);
>
> si->sock = sock;
> @@ -680,13 +679,17 @@ static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock,
> si->size = size;
> si->flags = flags;
>
> - err = security_socket_recvmsg(sock, msg, size, flags);
> - if (err)
> - return err;
> -
> return sock->ops->recvmsg(iocb, sock, msg, size, flags);
> }
>
> +static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock,
> + struct msghdr *msg, size_t size, int flags)
> +{
> + int err = security_socket_recvmsg(sock, msg, size, flags);
> +
> + return err ?: __sock_recvmsg_nosec(iocb, sock, msg, size, flags);
> +}
> +
> int sock_recvmsg(struct socket *sock, struct msghdr *msg,
> size_t size, int flags)
> {
> @@ -702,6 +705,21 @@ int sock_recvmsg(struct socket *sock, struct msghdr *msg,
> return ret;
> }
>
> +static int sock_recvmsg_nosec(struct socket *sock, struct msghdr *msg,
> + size_t size, int flags)
> +{
> + struct kiocb iocb;
> + struct sock_iocb siocb;
> + int ret;
> +
> + init_sync_kiocb(&iocb, NULL);
> + iocb.private = &siocb;
> + ret = __sock_recvmsg_nosec(&iocb, sock, msg, size, flags);
> + if (-EIOCBQUEUED == ret)
> + ret = wait_on_sync_kiocb(&iocb);
> + return ret;
> +}
> +
> int kernel_recvmsg(struct socket *sock, struct msghdr *msg,
> struct kvec *vec, size_t num, size_t size, int flags)
> {
> @@ -1968,22 +1986,15 @@ out:
> return err;
> }
>
> -/*
> - * BSD recvmsg interface
> - */
> -
> -SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
> - unsigned int, flags)
> +static int __sys_recvmsg(struct socket *sock, struct msghdr __user *msg,
> + struct msghdr *msg_sys, unsigned flags, int nosec)
> {
> struct compat_msghdr __user *msg_compat =
> (struct compat_msghdr __user *)msg;
> - struct socket *sock;
> struct iovec iovstack[UIO_FASTIOV];
> struct iovec *iov = iovstack;
> - struct msghdr msg_sys;
> unsigned long cmsg_ptr;
> int err, iov_size, total_len, len;
> - int fput_needed;
>
> /* kernel mode address */
> struct sockaddr_storage addr;
> @@ -1993,27 +2004,23 @@ SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
> int __user *uaddr_len;
>
> if (MSG_CMSG_COMPAT & flags) {
> - if (get_compat_msghdr(&msg_sys, msg_compat))
> + if (get_compat_msghdr(msg_sys, msg_compat))
> return -EFAULT;
> }
> - else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr)))
> + else if (copy_from_user(msg_sys, msg, sizeof(struct msghdr)))
> return -EFAULT;
>
> - sock = sockfd_lookup_light(fd, &err, &fput_needed);
> - if (!sock)
> - goto out;
> -
> err = -EMSGSIZE;
> - if (msg_sys.msg_iovlen > UIO_MAXIOV)
> - goto out_put;
> + if (msg_sys->msg_iovlen > UIO_MAXIOV)
> + goto out;
>
> /* Check whether to allocate the iovec area */
> err = -ENOMEM;
> - iov_size = msg_sys.msg_iovlen * sizeof(struct iovec);
> - if (msg_sys.msg_iovlen > UIO_FASTIOV) {
> + iov_size = msg_sys->msg_iovlen * sizeof(struct iovec);
> + if (msg_sys->msg_iovlen > UIO_FASTIOV) {
> iov = sock_kmalloc(sock->sk, iov_size, GFP_KERNEL);
> if (!iov)
> - goto out_put;
> + goto out;
> }
>
> /*
> @@ -2021,46 +2028,47 @@ SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
> * kernel msghdr to use the kernel address space)
> */
>
> - uaddr = (__force void __user *)msg_sys.msg_name;
> + uaddr = (__force void __user *)msg_sys->msg_name;
> uaddr_len = COMPAT_NAMELEN(msg);
> if (MSG_CMSG_COMPAT & flags) {
> - err = verify_compat_iovec(&msg_sys, iov,
> + err = verify_compat_iovec(msg_sys, iov,
> (struct sockaddr *)&addr,
> VERIFY_WRITE);
> } else
> - err = verify_iovec(&msg_sys, iov,
> + err = verify_iovec(msg_sys, iov,
> (struct sockaddr *)&addr,
> VERIFY_WRITE);
> if (err < 0)
> goto out_freeiov;
> total_len = err;
>
> - cmsg_ptr = (unsigned long)msg_sys.msg_control;
> - msg_sys.msg_flags = flags & (MSG_CMSG_CLOEXEC|MSG_CMSG_COMPAT);
> + cmsg_ptr = (unsigned long)msg_sys->msg_control;
> + msg_sys->msg_flags = flags & (MSG_CMSG_CLOEXEC|MSG_CMSG_COMPAT);
>
> if (sock->file->f_flags & O_NONBLOCK)
> flags |= MSG_DONTWAIT;
> - err = sock_recvmsg(sock, &msg_sys, total_len, flags);
> + err = (nosec ? sock_recvmsg_nosec : sock_recvmsg)(sock, msg_sys,
> + total_len, flags);
> if (err < 0)
> goto out_freeiov;
> len = err;
>
> if (uaddr != NULL) {
> err = move_addr_to_user((struct sockaddr *)&addr,
> - msg_sys.msg_namelen, uaddr,
> + msg_sys->msg_namelen, uaddr,
> uaddr_len);
> if (err < 0)
> goto out_freeiov;
> }
> - err = __put_user((msg_sys.msg_flags & ~MSG_CMSG_COMPAT),
> + err = __put_user((msg_sys->msg_flags & ~MSG_CMSG_COMPAT),
> COMPAT_FLAGS(msg));
> if (err)
> goto out_freeiov;
> if (MSG_CMSG_COMPAT & flags)
> - err = __put_user((unsigned long)msg_sys.msg_control - cmsg_ptr,
> + err = __put_user((unsigned long)msg_sys->msg_control - cmsg_ptr,
> &msg_compat->msg_controllen);
> else
> - err = __put_user((unsigned long)msg_sys.msg_control - cmsg_ptr,
> + err = __put_user((unsigned long)msg_sys->msg_control - cmsg_ptr,
> &msg->msg_controllen);
> if (err)
> goto out_freeiov;
> @@ -2069,21 +2077,150 @@ SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
> out_freeiov:
> if (iov != iovstack)
> sock_kfree_s(sock->sk, iov, iov_size);
> -out_put:
> +out:
> + return err;
> +}
> +
> +/*
> + * BSD recvmsg interface
> + */
> +
> +SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
> + unsigned int, flags)
> +{
> + int fput_needed, err;
> + struct msghdr msg_sys;
> + struct socket *sock = sockfd_lookup_light(fd, &err, &fput_needed);
> +
> + if (!sock)
> + goto out;
> +
> + err = __sys_recvmsg(sock, msg, &msg_sys, flags, 0);
> +
> fput_light(sock->file, fput_needed);
> out:
> return err;
> }
>
> -#ifdef __ARCH_WANT_SYS_SOCKETCALL
> +/*
> + * Linux recvmmsg interface
> + */
> +
> +int __sys_recvmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
> + unsigned int flags, struct timespec *timeout)
> +{
> + int fput_needed, err, datagrams;
> + struct socket *sock;
> + struct mmsghdr __user *entry;
> + struct msghdr msg_sys;
> + struct timespec end_time;
> +
> + if (timeout &&
> + poll_select_set_timeout(&end_time, timeout->tv_sec,
> + timeout->tv_nsec))
> + return -EINVAL;
> +
> + datagrams = 0;
> +
> + sock = sockfd_lookup_light(fd, &err, &fput_needed);
> + if (!sock)
> + return err;
> +
> + err = sock_error(sock->sk);
> + if (err)
> + goto out_put;
> +
> + entry = mmsg;
> +
> + while (datagrams < vlen) {
> + /*
> + * No need to ask LSM for more than the first datagram.
> + */
> + err = __sys_recvmsg(sock, (struct msghdr __user *)entry,
> + &msg_sys, flags, datagrams);
> + if (err < 0)
> + break;
> + err = put_user(err, &entry->msg_len);
> + if (err)
> + break;
> + ++entry;
> + ++datagrams;
> +
> + if (timeout) {
> + ktime_get_ts(timeout);
> + *timeout = timespec_sub(end_time, *timeout);
> + if (timeout->tv_sec < 0) {
> + timeout->tv_sec = timeout->tv_nsec = 0;
> + break;
> + }
> +
> + /* Timeout, return less than vlen datagrams */
> + if (timeout->tv_nsec == 0 && timeout->tv_sec == 0)
> + break;
> + }
> +
> + /* Out of band data, return right away */
> + if (msg_sys.msg_flags & MSG_OOB)
> + break;
> + }
> +
> +out_put:
> + fput_light(sock->file, fput_needed);
>
> + if (err == 0)
> + return datagrams;
> +
> + if (datagrams != 0) {
> + /*
> + * We may return less entries than requested (vlen) if the
> + * sock is non block and there aren't enough datagrams...
> + */
> + if (err != -EAGAIN) {
> + /*
> + * ... or if recvmsg returns an error after we
> + * received some datagrams, where we record the
> + * error to return on the next call or if the
> + * app asks about it using getsockopt(SO_ERROR).
> + */
> + sock->sk->sk_err = -err;
> + }
> +
> + return datagrams;
> + }
> +
> + return err;
> +}
> +
> +SYSCALL_DEFINE5(recvmmsg, int, fd, struct mmsghdr __user *, mmsg,
> + unsigned int, vlen, unsigned int, flags,
> + struct timespec __user *, timeout)
> +{
> + int datagrams;
> + struct timespec timeout_sys;
> +
> + if (!timeout)
> + return __sys_recvmmsg(fd, mmsg, vlen, flags, NULL);
> +
> + if (copy_from_user(&timeout_sys, timeout, sizeof(timeout_sys)))
> + return -EFAULT;
> +
> + datagrams = __sys_recvmmsg(fd, mmsg, vlen, flags, &timeout_sys);
> +
> + if (datagrams > 0 &&
> + copy_to_user(timeout, &timeout_sys, sizeof(timeout_sys)))
> + datagrams = -EFAULT;
> +
> + return datagrams;
> +}
> +
> +#ifdef __ARCH_WANT_SYS_SOCKETCALL
> /* Argument list sizes for sys_socketcall */
> #define AL(x) ((x) * sizeof(unsigned long))
> -static const unsigned char nargs[19]={
> +static const unsigned char nargs[20] = {
> AL(0),AL(3),AL(3),AL(3),AL(2),AL(3),
> AL(3),AL(3),AL(4),AL(4),AL(4),AL(6),
> AL(6),AL(2),AL(5),AL(5),AL(3),AL(3),
> - AL(4)
> + AL(4),AL(5)
> };
>
> #undef AL
> @@ -2103,7 +2240,7 @@ SYSCALL_DEFINE2(socketcall, int, call, unsigned long __user *, args)
> int err;
> unsigned int len;
>
> - if (call < 1 || call > SYS_ACCEPT4)
> + if (call < 1 || call > SYS_RECVMMSG)
> return -EINVAL;
>
> len = nargs[call];
> @@ -2181,6 +2318,10 @@ SYSCALL_DEFINE2(socketcall, int, call, unsigned long __user *, args)
> case SYS_RECVMSG:
> err = sys_recvmsg(a0, (struct msghdr __user *)a1, a[2]);
> break;
> + case SYS_RECVMMSG:
> + err = sys_recvmmsg(a0, (struct mmsghdr __user *)a1, a[2], a[3],
> + (struct timespec __user *)a[4]);
> + break;
> case SYS_ACCEPT4:
> err = sys_accept4(a0, (struct sockaddr __user *)a1,
> (int __user *)a[2], a[3]);
> --
> 1.5.5.1
>
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/1] net: Introduce recvmmsg socket syscall
2009-10-12 17:53 ` Nir Tzachar
@ 2009-10-13 1:56 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-10-13 1:56 UTC (permalink / raw)
To: Nir Tzachar
Cc: David Miller, netdev, Caitlin Bestler, Chris Van Hoof,
Clark Williams, Neil Horman, Nivedita Singhvi, Paul Moore,
Rémi Denis-Courmont, Steven Whitehouse
Em Mon, Oct 12, 2009 at 07:53:43PM +0200, Nir Tzachar escreveu:
> Hi Arnaldo.
>
> Do you have any plans on how we can further investigate the delays I
> have seen with the second part of the patch? I have tried to simply
> unlock/lock the socket's mutex every couple of iterations inside the
Yeah, that is what tcp does, look at tcp_recvmsg (net/ipv4/tcp.c, line
1505), so I think we should do something along those lines, exactly when
and after which tests is a matter of experimentation.
I'll resume investigation tomorrow.
> loop (to allow the system to process some backlog), but this seems to
> have little to no effect.
> Also, a way to enable/disable the no_lock version at runtime will
> greatly help in testing. Maybe by first introducing a second syscall,
> recvmmsg_no_lock, for testing purposes??
I'll come up with a way for that to be possible.
- Arnaldo
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/1] net: Introduce recvmmsg socket syscall
2009-10-12 16:20 [PATCH 1/1] net: Introduce recvmmsg socket syscall Arnaldo Carvalho de Melo
2009-10-12 17:53 ` Nir Tzachar
@ 2009-10-13 6:40 ` David Miller
2009-10-13 13:14 ` Arnaldo Carvalho de Melo
1 sibling, 1 reply; 5+ messages in thread
From: David Miller @ 2009-10-13 6:40 UTC (permalink / raw)
To: acme
Cc: netdev, caitlin.bestler, vanhoof, williams, nhorman, nir.tzachar,
niv, paul.moore, remi.denis-courmont, steve
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Mon, 12 Oct 2009 13:20:40 -0300
> Meaning receive multiple messages, reducing the number of syscalls and
> net stack entry/exit operations.
>
> Next patches will introduce mechanisms where protocols that want to
> optimize this operation will provide an unlocked_recvmsg operation.
>
> This takes into account comments made by:
...
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
You missed the syscall entry addition for arch/sparc/kernel/systbls_32.S
but I fixed that up while applying this to net-next-2.6
Thanks.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/1] net: Introduce recvmmsg socket syscall
2009-10-13 6:40 ` David Miller
@ 2009-10-13 13:14 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-10-13 13:14 UTC (permalink / raw)
To: David Miller
Cc: netdev, caitlin.bestler, vanhoof, williams, nhorman, nir.tzachar,
niv, paul.moore, remi.denis-courmont, steve
Em Mon, Oct 12, 2009 at 11:40:46PM -0700, David Miller escreveu:
> You missed the syscall entry addition for arch/sparc/kernel/systbls_32.S
> but I fixed that up while applying this to net-next-2.6
Thanks a lot!
- Arnaldo
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2009-10-13 13:15 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-12 16:20 [PATCH 1/1] net: Introduce recvmmsg socket syscall Arnaldo Carvalho de Melo
2009-10-12 17:53 ` Nir Tzachar
2009-10-13 1:56 ` Arnaldo Carvalho de Melo
2009-10-13 6:40 ` David Miller
2009-10-13 13:14 ` Arnaldo Carvalho de Melo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).