* O_NONBLOCK on files
@ 2001-09-19 6:46 Simon Kirby
2001-09-19 7:05 ` Eric W. Biederman
0 siblings, 1 reply; 7+ messages in thread
From: Simon Kirby @ 2001-09-19 6:46 UTC (permalink / raw)
To: linux-kernel

I've always wondered why it's not possible to do this:

  fd = open("an_actual_file",O_RDONLY);
  fcntl(fd,F_SETFL,O_NONBLOCK);
  r = read(fd,buf,4096);

And actually have read return -1 and errno == EWOULDBLOCK/EAGAIN if the
block requested is not already cached.

Wouldn't this be the ideal interface for daemons of all types that want
to stay single-threaded and still offer useful performance when the
working set doesn't fit in cache? It works with sockets, so why not
with files?

I see even TUX has to have I/O worker threads to work around this
limitation, which seems a bit silly.

Simon-
[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ sim@stormix.com ][ sim@netnation.com ]
[ Opinions expressed are not necessarily those of my employers. ]
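
For reference, the experiment described in the message above can be sketched as follows. This is a minimal illustration assuming Linux and a regular file on an ordinary filesystem; the helper name `nonblock_read` is hypothetical and does not come from the thread. On such a file the kernel accepts O_NONBLOCK, but read() still returns data (blocking on disk I/O if it has to) rather than failing with EAGAIN.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` read-only, set O_NONBLOCK on the
 * descriptor, and attempt a single read of up to `len` bytes.
 * Returns whatever read() returned; on any failure returns -1 with
 * errno describing the error.
 */
ssize_t nonblock_read(const char *path, void *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Keep the existing status flags and add O_NONBLOCK. */
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    /* For a regular file this does not fail with EAGAIN. */
    ssize_t r = read(fd, buf, len);
    int saved = errno;
    close(fd);
    errno = saved;
    return r;
}
```

Calling `nonblock_read("an_actual_file", buf, 4096)` therefore behaves like an ordinary blocking read, which is the limitation the message asks about.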

* Re: O_NONBLOCK on files
2001-09-19 6:46 O_NONBLOCK on files Simon Kirby
@ 2001-09-19 7:05 ` Eric W. Biederman
2001-09-19 7:24 ` Simon Kirby
2001-09-19 8:52 ` Erik Andersen
0 siblings, 2 replies; 7+ messages in thread
From: Eric W. Biederman @ 2001-09-19 7:05 UTC (permalink / raw)
To: Simon Kirby; +Cc: linux-kernel

Simon Kirby <sim@netnation.com> writes:

> I've always wondered why it's not possible to do this:
>
>   fd = open("an_actual_file",O_RDONLY);
>   fcntl(fd,F_SETFL,O_NONBLOCK);
>   r = read(fd,buf,4096);
>
> And actually have read return -1 and errno == EWOULDBLOCK/EAGAIN if the
> block requested is not already cached.
>
> Wouldn't this be the ideal interface for daemons of all types that want
> to stay single-threaded and still offer useful performance when the
> working set doesn't fit in cache? It works with sockets, so why not
> with files?

Besides the SUS or the POSIX specs...
What would cause the data to be read in if read just checks the caches?
With sockets the other side is clearly pushing or pulling the data. With
files there is no other side...

This is resolvable, just not currently implemented.

Eric
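
The question of what would trigger the read can be illustrated with two calls that exist on Linux, although neither is mentioned in the thread: `posix_fadvise()` with `POSIX_FADV_WILLNEED` asks the kernel to start bringing a range into the page cache without waiting for it, and `mincore()` on a mapping reports which pages are currently resident. The sketch below is only an assumption about how an application might combine them; the helper names `start_readahead` and `resident_pages` are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Ask the kernel to begin reading [off, off + len) into the page cache.
 * Returns 0 on success or an error number (posix_fadvise() reports
 * errors through its return value, not through errno).
 */
int start_readahead(int fd, off_t off, off_t len)
{
    return posix_fadvise(fd, off, len, POSIX_FADV_WILLNEED);
}

/*
 * Count how many pages covering the first `len` bytes of `fd` are
 * resident in memory. Returns -1 on error.
 */
long resident_pages(int fd, size_t len)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;
    long count = -1;

    if (npages == 0)
        return 0;

    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    unsigned char *vec = malloc(npages);
    if (vec != NULL && mincore(map, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;    /* low bit set means resident */
    }

    free(vec);
    munmap(map, len);
    return count;
}
```

A single-threaded server could call `start_readahead()` and return to its event loop, then check `resident_pages()` later. The residency answer is only a snapshot, so a subsequent read() can still block if the pages are evicted in between.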

* Re: O_NONBLOCK on files
2001-09-19 7:05 ` Eric W. Biederman
@ 2001-09-19 7:24 ` Simon Kirby
2001-09-24 20:47 ` Matti Aarnio
2001-09-19 8:52 ` Erik Andersen
1 sibling, 1 reply; 7+ messages in thread
From: Simon Kirby @ 2001-09-19 7:24 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: linux-kernel

On Wed, Sep 19, 2001 at 01:05:06AM -0600, Eric W. Biederman wrote:

> Besides the SUS or the POSIX specs...

Yeah, well, blah.

> What would cause the data to be read in if read just checks the caches?
> With sockets the other side is clearly pushing or pulling the data. With
> files there is no other side...

Hmm... Without even thinking about it, I assumed it would start a read and
select() or poll() or some later call would return readable when my
outstanding request was fulfilled. But yes, I guess you're right, this is
different behavior because there is no other side.

Reading a file would need a receive queue to make this work, I guess. :)

Simon-
[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ sim@stormix.com ][ sim@netnation.com ]
[ Opinions expressed are not necessarily those of my employers. ]

* Re: O_NONBLOCK on files
2001-09-19 7:24 ` Simon Kirby
@ 2001-09-24 20:47 ` Matti Aarnio
2001-09-24 21:05 ` Simon Kirby
0 siblings, 1 reply; 7+ messages in thread
From: Matti Aarnio @ 2001-09-24 20:47 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Simon Kirby, linux-kernel

On Wed, Sep 19, 2001 at 12:24:39AM -0700, Simon Kirby wrote:
> On Wed, Sep 19, 2001 at 01:05:06AM -0600, Eric W. Biederman wrote:
> > What would cause the data to be read in if read just checks the caches?
> > With sockets the other side is clearly pushing or pulling the data. With
> > files there is no other side...
>
> Hmm... Without even thinking about it, I assumed it would start a read and
> select() or poll() or some later call would return readable when my
> outstanding request was fulfilled. But yes, I guess you're right, this is
> different behavior because there is no other side.

  To push the idea to its ultimate: AIO

  You open the file, start the IO, and do other things while the machine
  is doing the IO. Doing open() asynchronously would be the ultimate,
  but alas, not particularly trivial.

  There are problems also in the AIO status rendezvous mechanisms.

  Your best choice could be (with moderation) to do synchronous
  operations in separate threads.

> Reading a file would need a receive queue to make this work, I guess. :)
> Simon-
> [ sim@stormix.com ][ sim@netnation.com ]

/Matti Aarnio

* Re: O_NONBLOCK on files
2001-09-24 20:47 ` Matti Aarnio
@ 2001-09-24 21:05 ` Simon Kirby
2001-09-24 21:30 ` Benjamin LaHaise
0 siblings, 1 reply; 7+ messages in thread
From: Simon Kirby @ 2001-09-24 21:05 UTC (permalink / raw)
To: Matti Aarnio; +Cc: Eric W. Biederman, linux-kernel

On Mon, Sep 24, 2001 at 11:47:17PM +0300, Matti Aarnio wrote:

> On Wed, Sep 19, 2001 at 12:24:39AM -0700, Simon Kirby wrote:
>
> > Hmm... Without even thinking about it, I assumed it would start a read and
> > select() or poll() or some later call would return readable when my
> > outstanding request was fulfilled. But yes, I guess you're right, this is
> > different behavior because there is no other side.
>
>   To push the idea to its ultimate: AIO
>
>   You open the file, start the IO, and do other things while the machine
>   is doing the IO. Doing open() asynchronously would be the ultimate,
>   but alas, not particularly trivial.
>
>   There are problems also in the AIO status rendezvous mechanisms.
>
>   Your best choice could be (with moderation) to do synchronous
>   operations in separate threads.

Yes, but this sucks. My whole intent was an interface design that would
never need to context switch. I'm not sure if this is even possible, but
if so it would be very nice from a userspace perspective.

Simon-
[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ sim@stormix.com ][ sim@netnation.com ]
[ Opinions expressed are not necessarily those of my employers. ]

* Re: O_NONBLOCK on files
2001-09-24 21:05 ` Simon Kirby
@ 2001-09-24 21:30 ` Benjamin LaHaise
0 siblings, 0 replies; 7+ messages in thread
From: Benjamin LaHaise @ 2001-09-24 21:30 UTC (permalink / raw)
To: Simon Kirby; +Cc: Matti Aarnio, Eric W. Biederman, linux-kernel

On Mon, Sep 24, 2001 at 02:05:34PM -0700, Simon Kirby wrote:
> Yes, but this sucks. My whole intent was an interface design that would
> never need to context switch. I'm not sure if this is even possible, but
> if so it would be very nice from a userspace perspective.

Actually, this was one of the design concerns that I had in coming up with
the async io interfaces. The interface was designed to make use of the
upcoming syscalls-in-userspace stubs that x86-64 implemented and will be
passed on to x86 in 2.5. All completion events are passed through a ring
buffer with minimal locking on the kernel side which only needs thread
locking in userspace. This way the whole userspace to kernel transition
can be avoided by the server under sufficiently high load.

-ben
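
The general idea of delivering completion events through a shared ring buffer can be sketched in plain C11. This is a generic single-producer, single-consumer ring, not the actual kernel AIO ring layout, which the message does not describe; the `completion` and `ring` structures and both functions are hypothetical. The producer stands in for the kernel and the consumer for the server.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u    /* must be a power of two */

/* Hypothetical completion record: which request, and its result. */
struct completion {
    uint64_t id;
    long result;
};

struct ring {
    _Atomic uint32_t head;              /* advanced only by the consumer */
    _Atomic uint32_t tail;              /* advanced only by the producer */
    struct completion ev[RING_SIZE];
};

/* Producer side: returns false if the ring is full. */
bool ring_push(struct ring *r, struct completion c)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == RING_SIZE)
        return false;
    r->ev[tail & (RING_SIZE - 1)] = c;
    /* Publish the slot only after its contents are written. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if no completion is pending. */
bool ring_pop(struct ring *r, struct completion *out)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head == tail)
        return false;
    *out = r->ev[head & (RING_SIZE - 1)];
    /* Hand the slot back to the producer. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

With one consumer thread, `ring_pop()` needs no lock and no system call, so a busy server can drain completions without entering the kernel. Several consumer threads would need to serialize calls to `ring_pop()` among themselves, which matches the remark about thread locking in userspace.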

* Re: O_NONBLOCK on files
2001-09-19 7:05 ` Eric W. Biederman
2001-09-19 7:24 ` Simon Kirby
@ 2001-09-19 8:52 ` Erik Andersen
1 sibling, 0 replies; 7+ messages in thread
From: Erik Andersen @ 2001-09-19 8:52 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Simon Kirby, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2219 bytes --]

On Wed Sep 19, 2001 at 01:05:06AM -0600, Eric W. Biederman wrote:
> Simon Kirby <sim@netnation.com> writes:
>
> > I've always wondered why it's not possible to do this:
> >
> >   fd = open("an_actual_file",O_RDONLY);
> >   fcntl(fd,F_SETFL,O_NONBLOCK);
> >   r = read(fd,buf,4096);
> >
> > And actually have read return -1 and errno == EWOULDBLOCK/EAGAIN if the
> > block requested is not already cached.
> >
> > Wouldn't this be the ideal interface for daemons of all types that want
> > to stay single-threaded and still offer useful performance when the
> > working set doesn't fit in cache? It works with sockets, so why not
> > with files?
>
> Besides the SUS or the POSIX specs...
> What would cause the data to be read in if read just checks the caches?
> With sockets the other side is clearly pushing or pulling the data. With
> files there is no other side...
>
> This is resolvable, just not currently implemented.

I was actually playing around with this a few days back, since this
would have the advantage of 1) maximizing utilization of the bottleneck
device and 2) leaving the app free to update a user interface etc. while
the kernel was busy reading/writing.

There are two problems with a dd-type app. First, the dd algorithm is
fundamentally serial -- read from A, stop, write to B, stop, rinse,
repeat. Second, it leaves the thread stuck in D state forever. For
example, try something like

    dd if=/dev/sda of=/dev/sdb bs=1M

(assuming these devices are not holding important stuff) and let it run
for a bit and then hit ^C. Then wait for a minute or two while the
kernel finishes flushing everything before dd finally exits...

So to get sane stop behavior, you have to minimize the buffer size (and
add many minutes to the copy time), while to maximize performance, you
have to maximize the buffer size (goodbye user friendly).

So are there no single threaded semantics by which one can copy from
device A to device B without blocking (thereby serializing the copy and
leaving the app unresponsive)? I'd really hate to have to use threads
to do this,

 -Erik

--
Erik B. Andersen   email: andersee@debian.org, formerly of Lineo
--This message was written using 73% post-consumer electrons--

[-- Attachment #2: scopy.c --]
[-- Type: text/x-csrc, Size: 3066 bytes --]

/* vi: set sw=4 ts=4: */
#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
#include <signal.h>
#include <dirent.h>
#include <unistd.h>
#include <utime.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/select.h>

static const char *const app_name = "scopy";

/* Set by the SIGINT handler, polled by the copy loop */
static volatile sig_atomic_t intr_flag = 0;

/* Print "scopy: <formatted message>" to stderr */
static void verror_msg(const char *s, va_list p)
{
    fflush(stdout);
    fprintf(stderr, "%s: ", app_name);
    vfprintf(stderr, s, p);
}

/* Like verror_msg(), followed by the text for the current errno */
static void vperror_msg(const char *s, va_list p)
{
    int err = errno;

    if (s == 0)
        s = "";
    verror_msg(s, p);
    if (*s)
        s = ": ";
    fprintf(stderr, "%s%s\n", s, strerror(err));
}

static void perror_msg_and_die(const char *s, ...)
{
    va_list p;

    va_start(p, s);
    vperror_msg(s, p);
    va_end(p);
    exit(EXIT_FAILURE);
}

/* fopen() that exits with a message on failure */
static FILE *xfopen(const char *path, const char *mode)
{
    FILE *fp;

    if ((fp = fopen(path, mode)) == NULL)
        perror_msg_and_die("%s", path);
    return fp;
}

static void show_usage(void)
{
    fprintf(stderr, "Usage: scopy source dest\n");
    exit(EXIT_FAILURE);
}

static void sig_int(int signo)
{
    (void) signo;
    intr_flag = 1;
}

int main(int argc, char **argv)
{
    int status = 0;
    int opt;
    char *sname, *dname;
    fd_set rfds, wfds;
    FILE *source, *dest;
    int source_fd, dest_fd, n;
    ssize_t rsize = 0, wsize = 0;
    ssize_t pending = 0, off = 0;   /* unwritten bytes in buf, and where they start */
    static char buf[1024 * 1024];
    ssize_t bufsiz = sizeof(buf);
    struct timeval tv;

    while ((opt = getopt(argc, argv, "adfipR")) != -1)
        switch (opt) {
        default:
            show_usage();
        }

    /* Exactly two arguments are required: source and dest */
    if (optind + 2 != argc)
        show_usage();

    /* Store filenames for later use */
    sname = argv[optind++];
    dname = argv[optind];

    source = xfopen(sname, "r");
    dest = xfopen(dname, "w");
    source_fd = fileno(source);
    dest_fd = fileno(dest);
    n = (source_fd > dest_fd) ? source_fd + 1 : dest_fd + 1;

    if (fcntl(source_fd, F_SETFL, O_NONBLOCK) == -1)
        perror_msg_and_die("Setting O_NONBLOCK on %s", sname);
    if (fcntl(dest_fd, F_SETFL, O_NONBLOCK) == -1)
        perror_msg_and_die("Setting O_NONBLOCK on %s", dname);

    signal(SIGINT, sig_int);

    /* Ok, do the deed */
    while (1) {
        if (intr_flag) {
            printf("interrupted... Please wait...\n");
            exit(EXIT_FAILURE);
        }

        /* Watch to see if things happen. select() overwrites the fd
         * sets and may modify the timeout, so rebuild both on every
         * pass. Wait to read only while the buffer is empty, and to
         * write only while it holds data. */
        FD_ZERO(&rfds);
        FD_ZERO(&wfds);
        if (pending == 0)
            FD_SET(source_fd, &rfds);
        else
            FD_SET(dest_fd, &wfds);
        tv.tv_sec = 0;
        tv.tv_usec = 250000;    /* 0.25 seconds */

        status = select(n, &rfds, &wfds, NULL, &tv);
        if (status < 0) {
            if (errno == EINTR)
                continue;       /* intr_flag is checked at the top */
            perror_msg_and_die("select");
        }

        if (FD_ISSET(source_fd, &rfds)) {
            rsize = read(source_fd, buf, bufsiz);
            if (rsize < 0) {
                if (errno == EAGAIN || errno == EINTR)
                    continue;
                perror_msg_and_die("reading %s", sname);
            }
            if (rsize == 0)
                exit(EXIT_SUCCESS);     /* end of input */
            pending = rsize;
            off = 0;
        }

        if (FD_ISSET(dest_fd, &wfds)) {
            /* A short write leaves the remainder for the next pass */
            wsize = write(dest_fd, buf + off, pending);
            if (wsize < 0) {
                if (errno == EAGAIN || errno == EINTR)
                    continue;
                perror_msg_and_die("writing %s", dname);
            }
            off += wsize;
            pending -= wsize;
        }
    }
    return 0;
}