From: Andi Kleen <andi@firstfloor.org>
To: Christoph Lameter <cl@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>,
David Miller <davem@davemloft.net>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Add PGM protocol support to the IP stack
Date: Mon, 22 Mar 2010 19:53:10 +0100
Message-ID: <20100322185310.GA20695@one.firstfloor.org>
In-Reply-To: <alpine.DEB.2.00.1003221300180.17230@router.home>
On Mon, Mar 22, 2010 at 01:07:37PM -0500, Christoph Lameter wrote:
> > > B. PGM over UDP
> > >
> > > fd = socket(AF_INET, SOCK_RDM, IPPROTO_UDP)
> > >
> > > C. PGM over SHM (?)
> > >
> > > fd = socket(AF_UNIX, SOCK_RDM, 0)
> >
> > Not sure how that should work.
>
> Multiple processes would communicate via shm segments. Maybe defer to the
> future but it's an important operation mode as the systems grow bigger and bigger.
> SHM segment would have to contain some sort of ring buffer that the
> receivers could tap into. But that mode has not really been thought
> through.
AF_UNIX is not SHM today.

Is the only point to avoid one copy (user1 -> kernel -> user2 becoming
user1 -> user2)? Not sure that is really worth it. Don't you need another
copy to the reliability buffer anyway?

Letting the kernel parse a data structure in user-defined memory is also
always somewhat tricky.

But in principle AF_INET over localhost should not be much less efficient
than AF_UNIX, so you can probably drop it for now (unless you need special
AF_UNIX features like credentials).
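
To make the copy path concrete, here is a minimal user-space sketch of
what AF_UNIX does today: the payload is copied from the sender's buffer
into the kernel and then from the kernel into the receiver's buffer. The
helper name unix_dgram_roundtrip() is made up for illustration, and both
ends live in one process only to keep the example short.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: push one datagram through a connected AF_UNIX
 * socket pair and read it back. Returns the number of bytes received,
 * or -1 on error.
 */
static ssize_t unix_dgram_roundtrip(const void *msg, size_t len,
                                    void *out, size_t outlen)
{
    int fds[2];
    ssize_t n;

    assert(msg != NULL && out != NULL);

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) < 0)
        return -1;

    /* Copy 1: sender's user buffer -> kernel socket buffer. */
    n = send(fds[0], msg, len, 0);
    if (n == (ssize_t)len) {
        /* Copy 2: kernel socket buffer -> receiver's user buffer. */
        n = recv(fds[1], out, outlen, 0);
    } else {
        n = -1;
    }

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

A shared-memory transport would replace those two copies with a single
one, which is the saving being questioned above.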
> > >
> > > Packet sizes are determined by the number of packets in a single sendmsg() unless
> >
> > Number of bytes surely?
>
> Sorry yes you are right.
>
> > > overridden by the RM_SET_MESSAGE_BOUNDARY socket option.
> >
> > That's unusual to have such an option (except the MTU). What is it good for?
>
> No idea why it was implemented. It can be used to use send() for portions
> of a message. Triggers the send() only when all bytes have been provided.
> Probably necessary if one wants to have very long (megabytes) messages.
Those could be a problem for kernel memory consumption. One would need
to be very careful to have a good memory management scheme for the socket
in place.
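
As a rough illustration of the concern, the sketch below accumulates
partial sends until the declared message boundary is reached and refuses
boundaries above a fixed cap. It is user-space C, not kernel code; the
names pgm_pending, pgm_set_boundary(), pgm_append() and the 1 MiB cap are
all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap on a single pending message (1 MiB, illustrative). */
#define PGM_MAX_PENDING_MSG ((size_t)1 << 20)

/* Hypothetical per-socket state for a message that is still being built. */
struct pgm_pending {
    unsigned char *buf;
    size_t expected;   /* total bytes declared via the message boundary */
    size_t filled;     /* bytes provided so far */
};

/* Declare the message length. Returns 0 on success, -1 if the length is
 * zero, exceeds the cap, or the allocation fails. */
static int pgm_set_boundary(struct pgm_pending *p, size_t len)
{
    if (len == 0 || len > PGM_MAX_PENDING_MSG)
        return -1;
    p->buf = malloc(len);
    if (p->buf == NULL)
        return -1;
    p->expected = len;
    p->filled = 0;
    return 0;
}

/* Append one partial send. Returns 1 when the message is complete and
 * may be transmitted, 0 when more bytes are needed, -1 if the caller
 * tries to write past the declared boundary. */
static int pgm_append(struct pgm_pending *p, const void *data, size_t len)
{
    assert(p->filled <= p->expected);
    if (len > p->expected - p->filled)
        return -1;
    memcpy(p->buf + p->filled, data, len);
    p->filled += len;
    return p->filled == p->expected;
}
```

The point of the cap is that a sender cannot pin an arbitrary amount of
memory just by announcing a huge boundary and then never completing the
message.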
> > >
> > > A. Setting the window size / rate.
> > >
> > > struct pgm_send_window x;
> > > x.RateKbitsPerSec = 56;
> > > x.WindowSizeInMsecs = 60000;
> > > x.WindowSizeinBytes = 10000000;
> > >
> > > setsockopt(fd, SOCK_RDM, RM_RATE_WINDOW_SIZE, &x, sizeof(x));
> > >
> > > Default is sending at 56Kbps with a buffer of 10 Megabytes and buffering for a minute.
> >
> > That's a very large buffer for a socket. It would be better to use the usual
> > auto shrinking/increasing mechanisms.
>
> Reliable multicast protocols have a defined time period / "reliability
> buffer" so that they can resend a message that was missed for a time
> period. It is customary to either specify a time period or define the size
> of the "reliability buffer".
One problem then is memory management. What happens when a process opens
100 of those sockets and fills them all?

I guess you would still need a suitable global limit like TCP has.
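
A minimal sketch of such accounting, assuming a per-socket window limit
plus one global byte counter shared by all sockets. The names pgm_sock,
pgm_charge(), pgm_uncharge() and the 64 MiB global limit are hypothetical,
and the code is single-threaded user-space C; a real implementation would
need locking or atomic accounting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical global cap across all sockets (64 MiB, illustrative). */
#define PGM_GLOBAL_MEM_LIMIT ((size_t)64 * 1024 * 1024)

/* Bytes currently held in reliability buffers by all sockets together. */
static size_t pgm_mem_in_use;

/* Hypothetical per-socket accounting state. */
struct pgm_sock {
    size_t window_limit;   /* per-socket cap, cf. WindowSizeinBytes */
    size_t charged;        /* bytes this socket currently holds */
};

/* Try to account for `bytes` more buffered data. Fails if either the
 * per-socket window or the global limit would be exceeded. */
static bool pgm_charge(struct pgm_sock *sk, size_t bytes)
{
    if (bytes > sk->window_limit - sk->charged)
        return false;
    if (bytes > PGM_GLOBAL_MEM_LIMIT - pgm_mem_in_use)
        return false;
    sk->charged += bytes;
    pgm_mem_in_use += bytes;
    return true;
}

/* Release previously charged bytes, e.g. when data leaves the window. */
static void pgm_uncharge(struct pgm_sock *sk, size_t bytes)
{
    assert(bytes <= sk->charged);
    sk->charged -= bytes;
    pgm_mem_in_use -= bytes;
}
```

With these numbers, 100 sockets that each try to fill a 10000000 byte
window stop succeeding once the global limit is reached, instead of
pinning a gigabyte of memory.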
> Never used it. I'd rather skip for now. Maybe later.
>
> >
> > > /* Socket API structures (established by M$DN) */
> > > struct pgm_receiver_stats {
> > > u64 NumODataPacketsReceived; /* Number of ODATA (original) sequences */
> >
> > It's difficult to maintain 64 bit counters on 32bit hosts on all targets.
> > But I guess it would be ok to only fill in 32bit in this case.
>
> 32 bit counters have the awful habit of overflowing.
There's just no portable atomic64_t. OK, maybe you can use the socket lock
to synchronize all the counts if they are only per socket.
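
Roughly like this, as a user-space sketch in which a pthread mutex stands
in for the socket lock. The struct and function names are hypothetical;
only the field name is taken from the quoted structure.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical per-socket statistics guarded by one lock. */
struct pgm_sock_stats {
    pthread_mutex_t lock;              /* stand-in for the socket lock */
    uint64_t NumODataPacketsReceived;  /* 64-bit even on 32-bit hosts */
};

static void pgm_stats_init(struct pgm_sock_stats *s)
{
    int rc = pthread_mutex_init(&s->lock, NULL);

    assert(rc == 0);
    (void)rc;
    s->NumODataPacketsReceived = 0;
}

/* Update under the lock, so the two 32-bit halves of the counter can
 * never be observed in an inconsistent state. */
static void pgm_stats_count_odata(struct pgm_sock_stats *s, uint64_t n)
{
    pthread_mutex_lock(&s->lock);
    s->NumODataPacketsReceived += n;
    pthread_mutex_unlock(&s->lock);
}

/* Read under the same lock to get a consistent 64-bit snapshot. */
static uint64_t pgm_stats_read_odata(struct pgm_sock_stats *s)
{
    uint64_t v;

    pthread_mutex_lock(&s->lock);
    v = s->NumODataPacketsReceived;
    pthread_mutex_unlock(&s->lock);
    return v;
}
```

This only works because the counters are per socket; anything shared
between sockets would need its own lock.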
-Andi
--
ak@linux.intel.com -- Speaking for myself only.