From: Dominique Martinet <asmadeus@codewreck.org>
To: David Miller <davem@davemloft.net>
Cc: doronrk@fb.com, tom@quantonium.net, davejwatson@fb.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] kcm: remove any offset before parsing messages
Date: Wed, 31 Oct 2018 03:56:57 +0100
Message-ID: <20181031025657.GA17861@nautica>
In-Reply-To: <20180918015723.GA26300@nautica>
Dominique Martinet wrote on Tue, Sep 18, 2018:
> David Miller wrote on Mon, Sep 17, 2018:
> > From: Dominique Martinet <asmadeus@codewreck.org>
> > Date: Wed, 12 Sep 2018 07:36:42 +0200
> > > Dominique Martinet wrote on Tue, Sep 11, 2018:
> > >> Hmm, while trying to benchmark this, I sometimes got hangs in
> > >> kcm_wait_data() for the last packet somehow?
> > >> The sender program was done (exited (zombie) so I assumed the sender
> > >> socket flushed), but the receiver was in kcm_wait_data in kcm_recvmsg
> > >> indicating it parsed a header but there was no skb to peek at?
> > >> But the sock is locked so this shouldn't be racy...
> > >>
> > >> I can get it fairly often with this patch and small messages with an
> > >> offset, but I think it's just because the pull changes some timing - I
> > >> can't hit it with just the clone, and I can hit it with a pull without
> > >> clone as well.... And I don't see how pulling a cloned skb can impact
> > >> the original socket, but I'm a bit fuzzy on this.
> > >
> > > This is weird, I cannot reproduce at all without that pull, even if I
> > > add another delay there instead of the pull, so it's not just timing...
> >
> > I really can't apply this patch until you resolve this.
> >
> > It is weird, given your description, though...
>
> Thanks for the reminder! I totally agree with you here and did not
> expect this to be merged as it is (in retrospect, I probably should have
> written something to that extent in the subject, "RFC"?)

I found the issue after some trouble reproducing it on another VM. Long
story short:
- I was blaming the sk_wait_data call in kcm_wait_data for waiting while
there was something in sk->sk_receive_queue, but after adding a fake
timeout and some debug messages I can see the receive queue is empty.
However, going back up from the kcm_sock to the kcm_mux to the kcm_psock,
there are things in the psock's socket's receive_queue... (If I'm
following the code correctly, that would be the underlying tcp socket.)
- That psock's strparser contains some hints: the interrupted and
stopped bits are set. strp->interrupted looks like it's only set if
kcm_parse_msg returns something < 0...
And sure enough, skb_pull returns NULL iff there's such a hang!
I might be tempted to send a patch to strparser to add a pr_debug
message in strp_abort_strp...
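
To make the sequence above concrete, here is a toy userspace model of it.
This is only a sketch, not the kernel code: the struct, the callback type
and the function names are made up for illustration, and only the
interrupted/stopped flag names echo the strparser fields mentioned above.
It shows why a single negative return from the parse callback leaves data
sitting on the lower socket while nothing more reaches the receiver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: NOT the kernel's struct strparser. */
struct toy_strparser {
	bool interrupted;   /* set when the parse callback fails */
	bool stopped;       /* set when the parser is aborted */
	size_t lower_queue; /* messages still queued on the lower (tcp) socket */
	size_t delivered;   /* messages handed up to the kcm receive queue */
};

/* Hypothetical parse callback: > 0 on success, < 0 on error
 * (for instance when a pull on the skb fails). */
typedef int (*toy_parse_cb)(size_t msg_index);

/* Drain the lower queue until it is empty or the parser stops. */
static void toy_strp_run(struct toy_strparser *strp, toy_parse_cb parse_cb)
{
	while (!strp->stopped && strp->lower_queue > 0) {
		if (parse_cb(strp->delivered) < 0) {
			/* A parse error marks the parser interrupted and
			 * stops it; the remaining data stays queued below. */
			strp->interrupted = true;
			strp->stopped = true;
			break;
		}
		strp->lower_queue--;
		strp->delivered++;
	}
}

/* Example callback that fails on the third message (index 2). */
static int toy_parse_fail_on_third(size_t msg_index)
{
	return msg_index == 2 ? -1 : 1;
}
```

With five messages queued and the callback above, two messages are
delivered, three stay on the lower queue, and any later call does nothing
because the parser is stopped. A receiver waiting for the rest would wait
forever, which matches the hang I see.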

Anyway, that probably explains why I have no problem with a bigger VM
(uselessly more memory available) or without KASAN (I guess there's
overhead?), but I'm sending at most 300k of data and the VM has 1.5GB
of RAM, so if there's an allocation failure there I think there's a
problem!

So, well, I'm not sure about the way forward. Add a bpf helper and
document that kcm users should mind the offset?
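
For what "minding the offset" would mean on the parser side, a minimal
userspace sketch follows. Everything in it is an assumption for
illustration: the 2-byte big-endian length header, the function names, and
the plain byte buffer standing in for the skb. The return convention
(positive message length, 0 for "need more data", negative for an error) is
how I understand strparser's parse_msg callback to work.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not kernel code: `buf` stands in for the skb data
 * and `offset` for where the current message starts inside it. */
static int parse_len_at_offset(const uint8_t *buf, size_t buf_len,
			       size_t offset)
{
	if (offset > buf_len)
		return -1;	/* offset outside the buffer: error */
	if (buf_len - offset < 2)
		return 0;	/* header incomplete: need more data */
	/* Assumed framing: 2-byte big-endian payload length, then payload.
	 * The returned value is the full message length, header included. */
	return 2 + ((buf[offset] << 8) | buf[offset + 1]);
}

/* What a parser that ignores the offset effectively does: it reads the
 * header of whatever message happens to sit at byte 0. */
static int parse_len_ignoring_offset(const uint8_t *buf, size_t buf_len)
{
	return parse_len_at_offset(buf, buf_len, 0);
}
```

With a buffer holding a 1-byte message followed by a 3-byte message, the
second message starts at offset 3. Reading at that offset yields 5, while
ignoring the offset yields 3, the length of the first message again.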
Thanks,
--
Dominique