* [RFC 0/2] New socket API: recvmmsg
@ 2009-05-20 23:06 Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo @ 2009-05-20 23:06 UTC
To: David Miller; +Cc: netdev, Chris Van Hoof, Clark Williams
[-- Attachment #1: Type: text/plain, Size: 1841 bytes --]
Hi,
The following two patches, which I cooked up today and haven't
properly benchmarked, implement a new socket syscall, recvmmsg, which
stands for "receive multiple messages" and does so in one call.
I implemented the attached program as a test case and to show
it in action, and lightly tested it using two clients (netcat) sending
big files, one from a machine with a 100 Mbit/s NIC and the other from
a machine with a 1 Gbit/s NIC, to a server running the patched kernel.
Output:
$ ./recvmmsg 5001 128
nr_datagrams received: 19
4352 bytes received from doppio.ghostprotocols.net in 17 datagrams
256 bytes received from filo.ghostprotocols.net in 1 datagrams
256 bytes received from doppio.ghostprotocols.net in 1 datagrams
nr_datagrams received: 14
2816 bytes received from doppio.ghostprotocols.net in 11 datagrams
256 bytes received from filo.ghostprotocols.net in 1 datagrams
512 bytes received from doppio.ghostprotocols.net in 2 datagrams
nr_datagrams received: 19
2304 bytes received from doppio.ghostprotocols.net in 9 datagrams
256 bytes received from filo.ghostprotocols.net in 1 datagrams
2304 bytes received from doppio.ghostprotocols.net in 9 datagrams
nr_datagrams received: 14
2816 bytes received from doppio.ghostprotocols.net in 11 datagrams
256 bytes received from filo.ghostprotocols.net in 1 datagrams
512 bytes received from doppio.ghostprotocols.net in 2 datagrams
nr_datagrams received: 19
4608 bytes received from doppio.ghostprotocols.net in 18 datagrams
256 bytes received from filo.ghostprotocols.net in 1 datagrams
filo is the machine with the 100 Mbit/s NIC, obviously :-)
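The per-peer lines in the output above are runs of consecutive
datagrams from the same source, which is why doppio can show up more
than once within a single batch. A minimal sketch of that run-length
grouping, using plain integer peer ids in place of socket addresses
(`struct run` and `group_runs` are hypothetical names for this
illustration, not part of the attached program):

```c
#include <assert.h>
#include <stddef.h>

/* One run of consecutive datagrams that came from the same peer. */
struct run {
	int peer;	/* hypothetical peer id standing in for a sockaddr */
	int count;	/* number of datagrams in the run */
	int bytes;	/* total payload bytes in the run */
};

/*
 * Collapse n (peer, len) pairs into runs of equal consecutive peers.
 * 'out' must have room for n entries. Returns the number of runs.
 */
static size_t group_runs(const int *peer, const int *len, size_t n,
			 struct run *out)
{
	size_t nr_runs = 0;
	size_t i;

	assert(peer && len && out);

	for (i = 0; i < n; ++i) {
		/* Same peer as the previous datagram: extend the run. */
		if (nr_runs > 0 && out[nr_runs - 1].peer == peer[i]) {
			out[nr_runs - 1].count++;
			out[nr_runs - 1].bytes += len[i];
			continue;
		}
		/* Different peer: start a new run. */
		out[nr_runs].peer = peer[i];
		out[nr_runs].count = 1;
		out[nr_runs].bytes = len[i];
		nr_runs++;
	}
	return nr_runs;
}
```

With peers {1, 1, 2, 1} and 256-byte datagrams this yields three runs:
peer 1 with 2 datagrams (512 bytes), peer 2 with 1 (256 bytes), then
peer 1 again with 1 (256 bytes).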
There are some things I will probably change, like perhaps
pushing it deeper, from the socket to the sock level, but I'd like to
hear the general feeling about the userspace interface, at least.
Best Regards,
- Arnaldo
[-- Attachment #2: recvmmsg.c --]
[-- Type: text/plain, Size: 3187 bytes --]
#include <stdlib.h>
#include <syscall.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <poll.h>
#include <string.h>

/* One entry per datagram: the usual msghdr plus the received length. */
struct mmsghdr {
	struct msghdr	msg_hdr;
	unsigned	msg_len;
};

/* __NR_recvmmsg is defined by the patched kernel tree. */
#if defined(__x86_64__) || defined(__i386__)
#include "linux-2.6-tip/arch/x86/include/asm/unistd.h"
#endif

/* Thin wrapper around the proposed syscall. */
static inline int recvmmsg(int fd, struct mmsghdr *mmsg,
			   unsigned vlen, unsigned flags)
{
	return syscall(__NR_recvmmsg, fd, mmsg, vlen, flags);
}

/* Print the totals for one run of consecutive datagrams from a peer. */
static void print_stats_peer(struct mmsghdr *datagram, int count, int bytes)
{
	char peer[1024];
	int err = getnameinfo(datagram->msg_hdr.msg_name,
			      datagram->msg_hdr.msg_namelen,
			      peer, sizeof(peer), NULL, 0, 0);

	if (err != 0) {
		fprintf(stderr, "error using getnameinfo: %s\n",
			gai_strerror(err));
		return;
	}

	printf(" %d bytes received from %s in %d datagrams\n",
	       bytes, peer, count);
}

int main(int argc, char *argv[])
{
	struct addrinfo *host;
	struct addrinfo hints = {
		.ai_family   = AF_INET,
		.ai_socktype = SOCK_DGRAM,
		.ai_protocol = IPPROTO_UDP,
		.ai_flags    = AI_PASSIVE,
	};
	const char *port = "5001";
	int batch_size = 8;
	int err, fd;
	int i;

	if (argc > 1)
		port = argv[1];
	if (argc > 2)
		batch_size = atoi(argv[2]);
	/* The arrays below are sized by batch_size, so it must be positive. */
	if (batch_size <= 0) {
		fprintf(stderr, "batch size must be a positive integer\n");
		return EXIT_FAILURE;
	}

	char buf[batch_size][256];
	struct iovec iovec[batch_size][1];
	struct sockaddr addr[batch_size];
	struct mmsghdr datagrams[batch_size];

	err = getaddrinfo(NULL, port, &hints, &host);
	if (err != 0) {
		fprintf(stderr, "error using getaddrinfo: %s\n",
			gai_strerror(err));
		err = EXIT_FAILURE;
		goto out;
	}

	fd = socket(host->ai_family, host->ai_socktype, host->ai_protocol);
	if (fd < 0) {
		perror("socket");
		err = EXIT_FAILURE;
		goto out_freeaddrinfo;
	}

	if (bind(fd, host->ai_addr, host->ai_addrlen) < 0) {
		perror("bind");
		err = EXIT_FAILURE;
		goto out_close_server;
	}

	/* Zero the headers so msg_control, msg_controllen and msg_flags are sane. */
	memset(datagrams, 0, sizeof(datagrams));
	for (i = 0; i < batch_size; ++i) {
		iovec[i][0].iov_base		 = buf[i];
		iovec[i][0].iov_len		 = sizeof(buf[i]);
		datagrams[i].msg_hdr.msg_iov	 = iovec[i];
		datagrams[i].msg_hdr.msg_iovlen	 = 1;
		datagrams[i].msg_hdr.msg_name	 = &addr[i];
		datagrams[i].msg_hdr.msg_namelen = sizeof(addr[i]);
	}

	struct pollfd pfds[1] = {
		[0] = {
			.fd	= fd,
			.events	= POLLIN,
		},
	};

	while (1) {
		if (poll(pfds, 1, -1) < 0) {
			perror("poll");
			err = EXIT_FAILURE;
			goto out_close_server;
		}

		/* msg_namelen is value-result: reset it before every call. */
		for (i = 0; i < batch_size; ++i)
			datagrams[i].msg_hdr.msg_namelen = sizeof(addr[i]);

		int nr_datagrams = recvmmsg(fd, datagrams, batch_size,
					    MSG_DONTWAIT);
		/* Errors are reported as a negative return value, not as 0. */
		if (nr_datagrams < 0) {
			perror("recvmmsg");
			err = EXIT_FAILURE;
			goto out_close_server;
		}
		if (nr_datagrams == 0)
			continue;

		printf("nr_datagrams received: %d\n", nr_datagrams);

		/* Group consecutive datagrams that came from the same peer. */
		int peer_count = 1;
		int peer_bytes = datagrams[0].msg_len;

		for (i = 1; i < nr_datagrams; ++i) {
			if (memcmp(datagrams[i - 1].msg_hdr.msg_name,
				   datagrams[i].msg_hdr.msg_name,
				   datagrams[i].msg_hdr.msg_namelen) == 0) {
				++peer_count;
				peer_bytes += datagrams[i].msg_len;
				continue;
			}
			print_stats_peer(&datagrams[i - 1],
					 peer_count, peer_bytes);
			peer_bytes = datagrams[i].msg_len;
			peer_count = 1;
		}
		print_stats_peer(&datagrams[nr_datagrams - 1],
				 peer_count, peer_bytes);
	}

out_close_server:
	close(fd);
out_freeaddrinfo:
	freeaddrinfo(host);
out:
	return err;
}
* Re: [RFC 0/2] New socket API: recvmmsg
From: David Miller @ 2009-05-21 0:30 UTC
To: acme; +Cc: netdev, vanhoof, williams
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Wed, 20 May 2009 20:06:42 -0300
> The following two patches, that I cooked today and haven't
> properly benchmarked, implements a new socket syscall, recvmmsg, that
> stands for receive multiple messages, in one call.
As I discussed with Arnaldo on IRC, I am OK with this kind of
interface.
And, also, I think we need to seriously consider the patches
posted by others a few weeks ago that allowed sending to
multiple receivers in one system call.
The old adage about syscalls being cheap no longer holds when
we're talking about traversing all the way into the protocol
stack socket code every call, taking the socket lock every
time, etc. So we really do need these batching variants of
I/O calls, or something similar.
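The batching argument above can be illustrated with a small sketch:
several datagrams are queued on a local socket pair and then drained
with a single system call. This is only an illustration under stated
assumptions, not the patch under discussion. It invokes the syscall as
it exists in current mainline kernels (`SYS_recvmmsg`, which takes a
fifth timeout argument that the RFC version does not have), and
`struct batch_msg`, `recv_batch` and `batch_demo` are hypothetical
names chosen for this example.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

#define BATCH_MAX	16
#define DGRAM_SIZE	64

/* Same layout as the RFC's mmsghdr; renamed to avoid clashing with libc. */
struct batch_msg {
	struct msghdr	msg_hdr;
	unsigned int	msg_len;
};

/*
 * Receive up to vlen (<= BATCH_MAX) datagrams from fd with one syscall.
 * Assumption: mainline signature with a trailing timeout (NULL = none).
 * Returns the number of datagrams received, or -1 on error.
 */
static int recv_batch(int fd, unsigned int vlen, unsigned int *total_bytes)
{
	static char buf[BATCH_MAX][DGRAM_SIZE];
	struct iovec iov[BATCH_MAX];
	struct batch_msg msgs[BATCH_MAX];
	unsigned int i;
	int nr;

	assert(vlen <= BATCH_MAX);
	memset(msgs, 0, sizeof(msgs));
	for (i = 0; i < vlen; ++i) {
		iov[i].iov_base = buf[i];
		iov[i].iov_len = sizeof(buf[i]);
		msgs[i].msg_hdr.msg_iov = &iov[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}

	/* One trip into the socket code for the whole batch. */
	nr = (int)syscall(SYS_recvmmsg, fd, msgs, vlen, MSG_DONTWAIT, NULL);

	*total_bytes = 0;
	for (i = 0; nr > 0 && i < (unsigned int)nr; ++i)
		*total_bytes += msgs[i].msg_len;
	return nr;
}

/* Queue nr_sent 32-byte datagrams on a socket pair, then drain them at once. */
static int batch_demo(int nr_sent, unsigned int *total_bytes)
{
	char payload[32];
	int sv[2];
	int i, nr = -1;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	memset(payload, 'x', sizeof(payload));
	for (i = 0; i < nr_sent; ++i)
		if (send(sv[0], payload, sizeof(payload), 0) < 0)
			goto out;

	nr = recv_batch(sv[1], BATCH_MAX, total_bytes);
out:
	close(sv[0]);
	close(sv[1]);
	return nr;
}
```

Calling `batch_demo(4, &bytes)` should return 4 with `bytes` set to
128, i.e. four datagrams received for the cost of a single system call.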