idea: user to user pipe copy

* idea: user to user pipe copy
@ 2004-04-22 22:56 Mark Borgerding
  2004-04-23 11:46 ` Mark Borgerding
  2004-04-23 14:29 ` Jamie Lokier
  0 siblings, 2 replies; 6+ messages in thread
From: Mark Borgerding @ 2004-04-22 22:56 UTC (permalink / raw)
  To: linux-fsdevel

Would someone tell me why this
a) won't work?
b) shouldn't be done?
c) is the dumbest idea since Microsoft Bob?

Currently, piped data gets copied from user space to a kernel buffer 
then back out to user space. 

This happens regardless of whether there is already a reader who is 
blocked on that fd.

Instead ...

Why not keep track of blocked read()s on a pipe fd?

When the writer writes something to the pipe, data could be copied 
directly from one user process to another, rather than
calling copy_from_user then copy_to_user. 

This alleged speed increase would benefit all blocking pipes & fifos, 
roughly half the time (i.e. whenever the read happens before the write).
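
The case described above (a reader already blocked in read() when the writer calls write()) can be reproduced from user space. The following is only a sketch: the function name blocked_reader_roundtrip and the 50 ms pause are assumptions made for illustration, and the pause makes it likely, not certain, that the child is blocked before the parent writes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * The child blocks in read() on an empty pipe, then echoes what it received
 * through a second pipe so the parent can check it.
 * Returns the number of bytes echoed back, or -1 on error.
 */
ssize_t blocked_reader_roundtrip(const char *msg, char *out, size_t outlen)
{
    int to_child[2], to_parent[2];
    size_t len = strlen(msg);

    assert(len > 0 && len <= 256);      /* must fit the child's buffer */
    if (pipe(to_child) < 0)
        return -1;
    if (pipe(to_parent) < 0) {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char buf[256];
        close(to_child[1]);
        close(to_parent[0]);
        /* The pipe is empty, so this read() sleeps until the parent writes. */
        ssize_t n = read(to_child[0], buf, sizeof buf);
        if (n <= 0 || write(to_parent[1], buf, (size_t)n) != n)
            _exit(1);
        _exit(0);
    }

    close(to_child[0]);
    close(to_parent[1]);

    /* Heuristic: give the child time to reach read() before we write. */
    struct timespec pause = { 0, 50 * 1000 * 1000 };
    nanosleep(&pause, NULL);

    ssize_t n = -1;
    if (write(to_child[1], msg, len) == (ssize_t)len)
        n = read(to_parent[0], out, outlen);
    close(to_child[1]);
    close(to_parent[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

Calling blocked_reader_roundtrip("hello", out, sizeof out) should return 5 with "hello" in out. In today's kernel each of the two pipe transfers goes through a kernel buffer; the proposal would skip that buffer for the first transfer, since the child is already waiting.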

-- Mark Borgerding


* Re: idea: user to user pipe copy
  2004-04-22 22:56 idea: user to user pipe copy Mark Borgerding
@ 2004-04-23 11:46 ` Mark Borgerding
  2004-04-23 14:29 ` Jamie Lokier
  1 sibling, 0 replies; 6+ messages in thread
From: Mark Borgerding @ 2004-04-23 11:46 UTC (permalink / raw)
  To: linux-fsdevel

Mark Borgerding wrote:

> Would someone tell me why this
> a) won't work?
> b) shouldn't be done?
> c) is the dumbest idea since Microsoft Bob?
>
>
> Currently, piped data gets copied from user space to a kernel buffer 
> then back out to user space.
> This happens regardless of whether there is already a reader who is 
> blocked on that fd.
>
> Instead ...
>
> Why not keep track of blocked read()s on a pipe fd?
>
> When the writer writes something to the pipe, data could be copied 
> directly from one user process to another, rather than
> calling copy_from_user then copy_to_user.
> This alleged speed increase would benefit all blocking pipes & fifos, 
> roughly half the time (i.e. whenever the read happens before the write).
>
> -- Mark Borgerding


Here is a rough idea how I think it could be implemented in fs/pipe.c 
(diff from 2.6.5).  The patch is just comments to help me flesh out the 
concept.

I'd appreciate any suggestions on how to implement the user-to-user 
memory copy between processes. 

-- Mark Borgerding


@@ -156,7 +156,25 @@ pipe_readv(struct file *filp, const stru
                        wake_up_interruptible_sync(PIPE_WAIT(*inode));
                        kill_fasync(PIPE_FASYNC_WRITERS(*inode), SIGIO, POLL_OUT);
                }
+        /* MB-TODO
+         Put struct iovec* into a waiting_reader member of pipe_inode_info
+         so that the writer can write directly to this caller's buffer.
+
+         if ( inode->waiting_reader == NULL )
+            inode->waiting_reader = iov;
+        */
+
                pipe_wait(inode);
+        /* MB-TODO
+         Check the waiting_reader struct to see if a writer has changed it.
+         Adjust the  byte lengths accordingly.
+
+         if ( inode->waiting_reader == iov ) {
+            ret += total_len - iov_length(iov, nr_segs);
+            inode->waiting_reader = NULL;
+         }
+
+         */
        }
        up(PIPE_SEM(*inode));
        /* Signal writers asynchronously that there is more room.  */
@@ -224,13 +242,31 @@ pipe_writev(struct file *filp, const str
                        if (chars > free)
                                chars = free;

+            /*
+             MB-TODO
+             Check to see if there is a current waiting_reader
+             on this inode.  If so, call
+                pipe_iov_copy_user_to_user( TBD )
+             rather than
+                pipe_iov_copy_from_user
+
+             if ( inode->waiting_reader ) {
+                if( pipe_iov_copy_user_to_user( inode->waiting_reader , iov,chars) ) {
+                    if (!ret) ret = -EFAULT;
+                    break;
+                }
+             }
+
+            */
                        if (pipe_iov_copy_from_user(pipebuf, iov, chars)) {
                                if (!ret) ret = -EFAULT;
                                break;
                        }
                        ret += chars;
-
-                       PIPE_LEN(*inode) += chars;
+
+            /*  The PIPE_LEN does not increase for user-to-user copies
+            if ( ! inode->waiting_reader ) */
+                       PIPE_LEN(*inode) += chars;
                        total_len -= chars;
                        if (!total_len)
                                break;
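
The bookkeeping in the commented patch can be tried out in a user-space model where plain memcpy stands in for the kernel copy routines. This is a sketch under loud assumptions: struct pipe_model and the model_* functions are hypothetical names, not kernel APIs, and unlike the patch the writer (not the woken reader) clears waiting_reader. It only counts how many bytes get copied on each path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PIPE_SIZE 4096

/* Hypothetical user-space stand-in for the pipe state touched by the patch. */
struct pipe_model {
    char   buf[MODEL_PIPE_SIZE]; /* stands in for the kernel pipe buffer */
    size_t len;                  /* bytes queued in buf (like PIPE_LEN) */
    char  *waiting_reader;       /* blocked reader's buffer, or NULL */
    size_t waiting_cap;          /* capacity of that buffer */
    size_t delivered;            /* bytes handed straight to the reader */
    size_t bytes_copied;         /* total bytes moved between buffers */
};

/* Reader found the pipe empty and "blocks": remember where its buffer is. */
void model_reader_block(struct pipe_model *p, char *dst, size_t cap)
{
    assert(p->len == 0 && p->waiting_reader == NULL);
    p->waiting_reader = dst;
    p->waiting_cap = cap;
}

/* Writer: copy directly to a waiting reader if there is one, else queue. */
size_t model_write(struct pipe_model *p, const char *src, size_t n)
{
    size_t done = 0;

    if (p->waiting_reader && p->len == 0) {
        size_t chars = n < p->waiting_cap ? n : p->waiting_cap;
        memcpy(p->waiting_reader, src, chars);   /* the single copy */
        p->bytes_copied += chars;
        p->delivered += chars;
        p->waiting_reader = NULL;                /* reader would be woken */
        p->waiting_cap = 0;
        done = chars;                            /* len does not grow */
    }

    size_t rest = n - done;
    size_t free_space = MODEL_PIPE_SIZE - p->len;
    if (rest > free_space)
        rest = free_space;
    memcpy(p->buf + p->len, src + done, rest);   /* first of two copies */
    p->len += rest;
    p->bytes_copied += rest;
    return done + rest;
}

/* Reader that arrives after the data: the usual second copy. */
size_t model_read(struct pipe_model *p, char *dst, size_t cap)
{
    size_t chars = p->len < cap ? p->len : cap;
    memcpy(dst, p->buf, chars);
    memmove(p->buf, p->buf + chars, p->len - chars); /* compaction, not counted */
    p->len -= chars;
    p->bytes_copied += chars;
    return chars;
}
```

With a reader registered first, a 4-byte write moves 4 bytes in total; with no waiting reader, the same write followed by a read moves 8. The model says nothing about the hard part, which is reaching the reader's memory from the writer's context.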



* Re: idea: user to user pipe copy
  2004-04-22 22:56 idea: user to user pipe copy Mark Borgerding
  2004-04-23 11:46 ` Mark Borgerding
@ 2004-04-23 14:29 ` Jamie Lokier
  2004-04-23 16:27   ` Bryan Henderson
  1 sibling, 1 reply; 6+ messages in thread
From: Jamie Lokier @ 2004-04-23 14:29 UTC (permalink / raw)
  To: Mark Borgerding; +Cc: linux-fsdevel

Mark Borgerding wrote:
> Why not keep track of blocked read()s on a pipe fd?
> 
> When the writer writes something to the pipe, data could be copied 
> directly from one user process to another, rather than
> calling copy_from_user then copy_to_user. 
> 
> This alleged speed increase would benefit all blocking pipes & fifos, 
> roughly half the time (i.e. whenever the read happens before the write).

You only get page faults from one of the mm contexts, so
copy_from_user_to_other_user would have to do explicit page table
operations to find the pages of the other mm.  There may also be
a little data cache flushing required with SMP.

It seems feasible.

-- Jamie


* Re: idea: user to user pipe copy
  2004-04-23 14:29 ` Jamie Lokier
@ 2004-04-23 16:27   ` Bryan Henderson
  2004-04-23 19:26     ` Mark Borgerding
  0 siblings, 1 reply; 6+ messages in thread
From: Bryan Henderson @ 2004-04-23 16:27 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: linux-fsdevel, Mark Borgerding

>You only get page faults from one of the mm contexts, so
>copy_from_user_to_other_user would have to do explicit page table
>operations to find the pages of the other mm. 

That's kind of a roundabout way of stating the underlying problem:  You 
can have only one address space active at a time.  Copying from user 
memory to kernel memory and from kernel to user are both easy because the 
kernel is in every address space.  Copying from user to user requires you 
to do all the work yourself instead of simply executing a move instruction 
and relying on the hardware and page fault handlers.
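
A small user-space experiment shows why a buffer address saved by the reader means nothing by itself in the writer's context. In this sketch (the function and variable names are made up for illustration), a forked child changes a variable and reports its address; the parent sees the same address but still holds the old value, because each process resolves that address through its own address space.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* volatile so the compiler re-reads it after fork() */
static volatile int probe_value = 1;

/*
 * Returns 1 if parent and child use the same virtual address for
 * probe_value while the parent's copy is unaffected by the child's store,
 * 0 if not, and -1 on error.
 */
int same_address_different_memory(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        close(fds[0]);
        probe_value = 2;                       /* lands in the child's memory only */
        uintptr_t addr = (uintptr_t)&probe_value;
        ssize_t w = write(fds[1], &addr, sizeof addr);
        _exit(w == (ssize_t)sizeof addr ? 0 : 1);
    }

    close(fds[1]);
    uintptr_t child_addr = 0;
    ssize_t r = read(fds[0], &child_addr, sizeof child_addr);
    close(fds[0]);
    waitpid(pid, NULL, 0);                     /* child has finished its store */
    if (r != (ssize_t)sizeof child_addr)
        return -1;

    return child_addr == (uintptr_t)&probe_value && probe_value == 1;
}
```

The kernel faces the same situation in the writer's context: dereferencing the reader's buffer pointer there would go through the writer's page tables, not the reader's.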

As Jamie implies, you not only have to emulate a page fault in the 
reader's address space, but then pin the virtual memory, map the real 
memory that now backs the read buffer to a kernel virtual address, and 
then copy to the kernel virtual address.

Actually, the situation probably varies from one machine architecture to 
the next.

I would also be concerned about the waiting reader disappearing in the 
middle of the copy.  Via signal maybe?

--
Bryan Henderson                          IBM Almaden Research Center
San Jose CA                              Filesystems


* Re: idea: user to user pipe copy
  2004-04-23 16:27   ` Bryan Henderson
@ 2004-04-23 19:26     ` Mark Borgerding
  2004-04-23 20:40       ` Bryan Henderson
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Borgerding @ 2004-04-23 19:26 UTC (permalink / raw)
  To: Bryan Henderson; +Cc: Jamie Lokier, linux-fsdevel

Jamie & Bryan,

Thanks for your input on the user-to-user copy problem.  I won't pretend 
to understand all the issues of which you speak.

Are these issues insurmountable?  Is it worth it?  Would the extra 
complexity of the user-user copy add security holes and stability problems?

If someone were willing to work on copy_user_to_other_user, I could 
manage the pipe work.
Anyone feel like battling that windmill?
Eliminating one out of two buffer copies is a fine and noble goal.  
Something you can brag to your grandkids about ;)

BTW, it seems this could make unix sockets faster too.

-- Mark


* Re: idea: user to user pipe copy
  2004-04-23 19:26     ` Mark Borgerding
@ 2004-04-23 20:40       ` Bryan Henderson
  0 siblings, 0 replies; 6+ messages in thread
From: Bryan Henderson @ 2004-04-23 20:40 UTC (permalink / raw)
  To: Mark Borgerding; +Cc: Jamie Lokier, linux-fsdevel

>Are these issues insurmountable?  Is it worth it?  Would the extra 
>complexity of the user-user copy add security holes and stability 
>problems?

I don't think they're insurmountable, and I don't think it's a totally 
insane idea.  But since it's more than a trivial adjustment, I wonder if 
it would result in any noticeable gain.  Copying a block of memory isn't 
all that expensive and it might be unnoticeable in a typical piped 
application, with all the task switching going on.

Maybe before biting off something this big, we (you) should do some quick 
measurements of a real-world application to see how much time it spends 
doing that 2nd copy.  Maybe comment out the copy_to_user and see how much 
faster it goes or how much less CPU it uses.
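
Short of patching the kernel, a rough user-space proxy for that measurement is to time a pipe transfer and compare it with the time for one plain memcpy of the same volume. This is a sketch, not a substitute for profiling a real application: pipe_transfer and memcpy_seconds are hypothetical helper names, the pipe timing includes fork() and scheduling overhead, and a user-space memcpy only approximates the cost of copy_to_user.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Push `total` bytes through a pipe in `chunk`-sized writes from a child.
 * Returns the bytes the parent received (or -1) and stores the elapsed
 * wall-clock time in *seconds.
 */
long long pipe_transfer(size_t total, size_t chunk, double *seconds)
{
    int fds[2];
    assert(chunk > 0);
    char *buf = malloc(chunk);
    if (!buf)
        return -1;
    memset(buf, 'x', chunk);
    if (pipe(fds) < 0) {
        free(buf);
        return -1;
    }

    double start = now_seconds();
    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {                      /* child: the writer */
        close(fds[0]);
        size_t left = total;
        while (left > 0) {
            size_t n = left < chunk ? left : chunk;
            ssize_t w = write(fds[1], buf, n);
            if (w <= 0)
                _exit(1);
            left -= (size_t)w;
        }
        _exit(0);
    }

    close(fds[1]);                       /* parent: the reader */
    long long received = 0;
    ssize_t r;
    while ((r = read(fds[0], buf, chunk)) > 0)
        received += r;
    *seconds = now_seconds() - start;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    free(buf);
    return received;
}

/* Time one memcpy pass over the same volume, `chunk` bytes at a time. */
double memcpy_seconds(size_t total, size_t chunk)
{
    assert(chunk > 0);
    char *src = malloc(chunk), *dst = malloc(chunk);
    if (!src || !dst) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 'x', chunk);
    volatile char sink = 0;
    double start = now_seconds();
    for (size_t done = 0; done < total; done += chunk) {
        src[0] = (char)done;             /* vary the input between passes */
        memcpy(dst, src, chunk);
        sink ^= dst[0];                  /* keep the copy observable */
    }
    double elapsed = now_seconds() - start;
    free(src);
    free(dst);
    return elapsed;
}
```

If memcpy_seconds comes out as a small fraction of the pipe_transfer time for the same total and chunk, that would support the suspicion that the second copy is not where a piped application spends its time.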

--
Bryan Henderson                          IBM Almaden Research Center
San Jose CA                              Filesystems

