public inbox for linux-kernel@vger.kernel.org
* Inter-process send()/recv() using zero-copy ?
From: Xavier Roche @ 2009-09-23  6:01 UTC (permalink / raw)
  To: Linux Kernel

Hi folks,

Is there a way to get zero-copy send()/recv() when the socket is
connected to the local machine (to another process on the same machine,
for example)?

Such a feature would only be feasible with page-aligned blocks, from one
mmap'ed block to another, I guess.

Typical case:

Process #1 (uid A)
buff = mmap(0, size, ..) /* anonymous or not */
...
send(s, buff, size, 0)
munmap(buff, size)

Process #2 (uid B)
buff = mmap(0, size, .. | MAP_ANONYMOUS, ..)
recv(s, buff, size, 0)

In an ideal fantasy world, the first process would use send() to 
transmit the complete page-aligned memory block to the other side, and 
the second process would use recv() to get the memory block on a similar 
anonymously mmap'ed block, and the only operation the kernel would do 
would be to share the memory block between the two processes with 
copy-on-write.

In the real world, the same operation requires reading the whole memory
block (possibly partially on disk) and then writing it out completely
(possibly partially to disk, too), leaving two copies of the same memory
region at the end.

Two solutions can be used to emulate such a feature:

1. use a temporary mmap'ed file
- but requires a temporary file
- permissions for the file ? (not necessarily from the same UID)
- special case for local network block transmissions vs. machine-to-machine

2. use shared memory explicitly
- handling of permissions ? (ditto)
- special case for local network block transmissions vs. machine-to-machine
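
As a rough illustration of how the permission question in both options
might be sidestepped: an already-open descriptor can be passed to the peer
over an AF_UNIX socket with SCM_RIGHTS, so the receiver never has to open
the file by name. The sketch below assumes an unlinked temporary file under
/tmp as the backing store. send_fd(), recv_fd() and make_backing_fd() are
hypothetical helper names chosen for this example, not an existing API, and
the approach only covers the local case.

```c
#define _GNU_SOURCE            /* keeps mkstemp/ftruncate visible under strict -std modes */
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer large enough (and aligned) for one descriptor. */
union fd_ctrl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: pass an open descriptor over a connected AF_UNIX socket. */
static int send_fd(int sock, int fd)
{
    char dummy = 'x';                       /* at least one data byte must travel */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive a descriptor sent by send_fd(); returns -1 on error. */
static int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    int fd = -1;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Hypothetical helper: create an unlinked temporary file of `size` bytes. */
static int make_backing_fd(size_t size)
{
    char path[] = "/tmp/xferXXXXXX";        /* assumed location, adjust as needed */
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                           /* the name is gone, the descriptor stays valid */
    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Both sides would then mmap() the descriptor with MAP_SHARED, so the data is
shared rather than copied. Note that this gives plain sharing, not
copy-on-write, and it does nothing for the machine-to-machine case.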

splice() and friends do not appear to help in this case. Is there any
chance of doing this?



* Re: Inter-process send()/recv() using zero-copy ?
From: Arjan van de Ven @ 2009-09-23  6:43 UTC (permalink / raw)
  To: Xavier Roche; +Cc: Linux Kernel

On Wed, 23 Sep 2009 08:01:27 +0200
Xavier Roche <roche+kml2@exalead.com> wrote:

> Hi folks,
> 
> I was wondering if there was a way to have zero-copy send()/recv(),
> when the socket is connected to the local machine (to another process
> on the same machine, for example) ?
> 
> Such a feature would only be feasible with page-aligned blocks, from
> one mmap'ed block to another, I guess.
> 
> Typical case:
> 
> Process #1 (uid A)
> buff = mmap(0, size, ..) /* anonymous or not */
> ...
> send(s, buff, size, 0)
> munmap(buff, size)
> 
> Process #2 (uid B)
> buff = mmap(0, size, .. | MAP_ANONYMOUS, ..)
> recv(s, buff, size, 0)
> 
> In an ideal fantasy world, the first process would use send() to 
> transmit the complete page-aligned memory block to the other side,
> and the second process would use recv() to get the memory block on a
> similar anonymously mmap'ed block, and the only operation the kernel
> would do would be to share the memory block between the two processes
> with copy-on-write.
> 
> In the real world, the same operation requires a first read of the
> whole memory block (possibly partially on disk) and a complete write
> (possibly partially on disk, too) with two copies of the same memory
> region at the end.
> 
> Two solutions can be used to 

the problem you have is that
1) memory copies are cheap
   (say, 3000 cycles/page or less)
2) page table operations (mmap etc) are very expensive.

these two combined tend to mean that replacing simple copies with
complex pagetable tricks is not a win.
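
One way to check this trade-off on a given machine is a small
microbenchmark. The sketch below is only a rough proxy: it compares the
average cost of copying one page between two resident buffers with the
average cost of mapping, touching and unmapping one anonymous page.
copy_ns_per_page() and map_ns_per_page() are hypothetical names made up for
this example, the compiler barrier assumes GCC or Clang, and the numbers
will vary with CPU, kernel and cache state.

```c
#define _GNU_SOURCE            /* keeps MAP_ANONYMOUS and clock_gettime visible */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

static double elapsed_ns(struct timespec a, struct timespec b)
{
    return (double)(b.tv_sec - a.tv_sec) * 1e9 + (double)(b.tv_nsec - a.tv_nsec);
}

static void *map_anon(size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Average nanoseconds to memcpy one page; both buffers are faulted in first.
 * Returns a negative value on error. */
double copy_ns_per_page(size_t pages, int rounds)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = pages * page;
    struct timespec t0, t1;
    char *src = map_anon(len);
    char *dst = map_anon(len);

    if (src == MAP_FAILED || dst == MAP_FAILED) {
        if (src != MAP_FAILED) munmap(src, len);
        if (dst != MAP_FAILED) munmap(dst, len);
        return -1.0;
    }
    memset(src, 1, len);                    /* fault the pages in before timing */
    memset(dst, 2, len);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < rounds; i++) {
        memcpy(dst, src, len);
        __asm__ volatile("" ::: "memory");  /* keep the copies from being merged */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    munmap(src, len);
    munmap(dst, len);
    return elapsed_ns(t0, t1) / ((double)pages * (double)rounds);
}

/* Average nanoseconds to map, touch and unmap one anonymous page.
 * Returns a negative value on error. */
double map_ns_per_page(size_t pages, int rounds)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = pages * page;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < rounds; i++) {
        char *p = map_anon(len);
        if (p == MAP_FAILED)
            return -1.0;
        for (size_t j = 0; j < pages; j++)
            ((volatile char *)p)[j * page] = 1;   /* one page fault per page */
        munmap(p, len);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return elapsed_ns(t0, t1) / ((double)pages * (double)rounds);
}
```

Calling both functions with the same page count (for example 1024 pages and
100 rounds) and comparing the two results gives a first impression. It does
not model the cross-process remapping and TLB shootdowns that a real
zero-copy path would need.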

-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org


* Re: Inter-process send()/recv() using zero-copy ?
From: Nikita V. Youshchenko @ 2009-09-23  6:51 UTC (permalink / raw)
  To: Linux Kernel

> the problem you have is that
> 1) memory copies are cheap
>    (say, 3000 cycles/page or less)

What about L1 cache pollution? Doesn't that change the situation?

> 2) page table operations (mmap etc) are very expensive.
>
> these two combined tend to not make it a win to substitute simple
> copies with complex pagetable tricks.




* Re: Inter-process send()/recv() using zero-copy ?
From: Xavier Roche @ 2009-09-23  7:04 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Linux Kernel

Arjan van de Ven wrote:
> 1) memory copies are cheap
>    (say, 3000 cycles/page or less)

Yes, but this case would be more than useful for large memory blocks 
(typically memory_size/N, with N typically 2..10) -- something you 
generally have when you deal with mmap'ed blocks.
