Re: [PATCH 3/3] Store socket owner with buffers

Linux Container Development
 help / color / mirror / Atom feed

From: Oren Laadan <orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
To: Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org
Subject: Re: [PATCH 3/3] Store socket owner with buffers
Date: Wed, 26 Aug 2009 16:34:18 -0400	[thread overview]
Message-ID: <4A959C4A.6050803@librato.com> (raw)
In-Reply-To: <87bpm2qy6q.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>

Dan Smith wrote:
> OL> I suspect this won't pass a test in which the sender of a dgram
> OL> has been closed already (and therefore is "dangling") ?
> 
> I had a test case for this, but it was passing because I was closing
> the wrong socket.  However, a few modifications to the code (and a
> corrected test case) have it working now.  I'll post that in the next
> go-round.
> 
> OL> I'm not sure why you removed that code completely.
> 
> OL> It can be executed, for example, by the caller of
> OL> unix_read_buffers(), before and after that call ?
> 
> You're right, I was confusing the behavior of sk_sndbuf with
> sk_wmem_alloc and thinking that I'd need to track it separately.
> 
>>> +		ret = ckpt_write_obj_type(ctx, NULL, sizeof(*h) + skb->len,
>>> CKPT_HDR_SOCKET_BUFFER);
>>> -		if (ret)
>>> +		if (ret < 0)
>>> +			goto end;
>>> +
>>> +		ret = ckpt_kwrite(ctx, h, sizeof(*h));
> 
> OL> The 'struct ckpt_hdr' part of @h was already written in the call
> OL> to ckpt_write_obj_type() above.
> 
> I'll admit that the increased complexity of header and object writing
> recently has me a bit unsure of what to do when, but I'm not sure how
> it is not failing to run if you're correct.  Certainly restore would
> get out of sync with the headers if there was an extra ckpt_hdr in the
> stream, no?

Lol - well, it worked because you wrote the ckpt_hdr twice at checkpoint
and then read it twice at restart...

> 
> Anyway, I've changed that bit to avoid the unnecessary appearance that
> the buffer data is actually part of the ckpt_hdr_socket_buffer
> structure.  It would never be optimal to read it that way anyway.
> 
> OL> I wonder if we need to consider this: With unix domain sockets,
> OL> when a connected, or unconnected dgram socket is (re)connected or
> OL> disconnected, IIRC, all the existing data in the receive buffers
> OL> is removed.
> 
> OL> I suppose this will automagically be enforced, because when you
> OL> connect the socket, old buffers will be discarded already, and if
> OL> it is already connected, then sending from other sockets (not
> OL> peer) will fail.
> 
> OL> So I think we're protected from such malicious data. Perhaps this
> OL> is worth a comment ?
> 
> While swizzling things to handle SOCK_DEAD, I explicitly avoid writing
> buffers for DEAD sockets (like LISTEN sockets).  That should make it
> clear enough, right?

I'm confused...

The concern I raised above is about manipulated checkpoint data that
attempts to make a connected socket to have data from another (not
peer) socket.

Then I realized that since you use the usual send functions, this
_should_ be covered by the existing checks in the unix networking
code itself.

So I wondered if it's worth a comment somewhere, just to let readers
of the code know that this scenario was considered. (Of course, after
verifying that my statement is indeed correct :)

Regarding DEAD sockets - the may still be needed if they are the
source of skb's, no ?  (of course, whatever they had in receive
buffer should have been discarded).

> 
> OL> Hmmm... I suspect this is plain wrong. Objects that are treated
> OL> via checkpoint_obj() and restore_obj() should only be added to the
> OL> objhash via these two functions.
> 
> OL> Because at checkpoint you used un->peer = checkpoint_obj(...), a
> OL> ckpt_obj_fetch() on @un->peer _must_ return the matching object,
> OL> or the image file data is invalid.
> 
> No, because the restore() may be a result of another restore() not yet
> completed, right?  If I'm restoring A, which restores B as a
> side-effect, the restore of B will expect to see its peer (A) as
> already in the objhash or it will fail.

Maybe I need to read the code again. My impression was that when you
checkpoint socket A, and you need B, then you call checkpoint_obj()
on B, from within the processing of checkpoint_obj(A).

restart-obj() - it gets called automatically when ckpt_read_obj() sees
an object of type CKPT_HDR_OBJREF; there is never an explicit call.
That's why I assumed that it will be called for A and then, while
processing A, it will be called for B, since A will read in data and
the ckpt_obj_read() will see the shared object.

> 
> OL> Unfortunately, neither restore_obj() nor ckpt_obj_insert() will
> OL> complain about adding the same objref twice...
> 
> That's going to be a problem for other scenarios where we have a
> circular dependency, right?
> 
> Assume A and B have buffers outstanding for each other.  When we
> restore A, we restore the first buffer and see that we need to restore
> B.  While restoring B's first buffer, we see think that we need to
> restore A because it's not in the objhash yet, and then we're stuck.

I wonder what would the checkpoint image look like ?
Reading the code, I expect the following to occur during checkpoint:

1) find A through the file pointers, call checkpoint_obj(S_a).
2) while in there, find B as peer, call checkpoint_obj(S_b)
3) write B's state, including B's buffers which will reference A
4) (back to A) write A's state, and buffers which will reference B

And at restart:

1) See ckpt_hdr_socket of A (start restart of A)
2) See ckpt_hdr_socket of B (start restart of B)
3) Read B's state in, including buffers. A will not have completed
at this point, so it won't be in the hash either...
...

So I guess it works because you explicitly ckpt_obj_insert() the
sockets to the objhash;  Then later restart_obj() inserts them
(or something) again, but the second instance is never actually
found in a lookup/fetch - the first instance is always returned.

Or am I missing something... ?

> 
> We could permit the restore function to insert itself (if necessary)
> and then we could check before re-insert it after ops->restore(),
> right?
> 

Hmmm... I need to think about it.

Oren.

next prev parent reply	other threads:[~2009-08-26 20:34 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-24 17:11 [RFC] Change socket checkpoint to retain DGRAM source addresses Dan Smith
     [not found] ` <1251133918-8117-1-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-24 17:11   ` [PATCH 1/3] Set the CHECKPOINTED flag on objects before calling checkpoint Dan Smith
     [not found]     ` <1251133918-8117-2-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-25  5:53       ` Oren Laadan
2009-08-24 17:11   ` [PATCH 2/3] Make sockets proper objhash objects and use checkpoint_obj() on them Dan Smith
     [not found]     ` <1251133918-8117-3-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-25  5:01       ` Oren Laadan
     [not found]         ` <4A937031.10300-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-08-25 14:52           ` Dan Smith
     [not found]             ` <87y6p8q728.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-08-25 17:55               ` Oren Laadan
     [not found]                 ` <4A94257C.5060702-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-08-26  2:53                   ` Matt Helsley
2009-08-26  2:47           ` Matt Helsley
2009-08-24 17:11   ` [PATCH 3/3] Store socket owner with buffers Dan Smith
     [not found]     ` <1251133918-8117-4-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-25  5:33       ` Oren Laadan
2009-08-26 17:31         ` Dan Smith
     [not found]           ` <87bpm2qy6q.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-08-26 20:34             ` Oren Laadan [this message]
     [not found]               ` <4A959C4A.6050803-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-08-26 21:44                 ` Dan Smith
2009-08-25  0:06   ` [RFC] Change socket checkpoint to retain DGRAM source addresses Serge E. Hallyn
2009-08-25  4:33   ` Oren Laadan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A959C4A.6050803@librato.com \
    --to=orenl-rdfvbdnroixbdgjk7y7tuq@public.gmane.org \
    --cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
    --cc=danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox