From: Thomas Graf <tgraf-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Pravin Shelar <pshelar-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>,
Zoltan Kiss <zoltan.kiss-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
Cc: "dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org"
<dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org>,
kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
netdev <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
LKML <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
xen-devel-GuqFBffKawtpuQazS67q72D2FQJk+8+b@public.gmane.org
Subject: Re: [PATCH] openvswitch: Orphan frags before sending to userspace via Netlink to avoid guest stall
Date: Fri, 07 Mar 2014 16:58:06 +0100
Message-ID: <5319EC8E.2010606@redhat.com>
In-Reply-To: <CALnjE+rWc=n_F+1jSLQtPrgKSvvxONEkkYxWEHon2_KVNG9z3Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On 03/07/2014 05:46 AM, Pravin Shelar wrote:
> But I found bug in datapath user-space queue code. I am not sure how
> this can work with skb fragments and MMAP-netlink socket.
> Here is what happens, OVS allocates netlink skb and adds fragments to
> skb using skb_zerocopy(), then calls genlmsg_unicast().
> But if netlink sock is mmapped then netlink-send queues netlink
> allocated skb->head (linear data of skb) and ignores skb frags.
>
> Currently this is not problem with OVS vswitchd since it does not use
> netlink MMAP sockets. But if vswitchd starts using MMAP-netlink socket,
> it can break it.
The secret is out ;-)
I was very surprised too when I noticed that it worked. It's not just
OVS, it's nfqueue as well. The reason is that a netlink mmapped skb is
set up with a giant tailroom in netlink_ring_setup_skb():
	skb->end = skb->tail + size;
and skb_zerocopy() will consume whatever tailroom is available first:
	/* dont bother with small payloads */
	if (len <= skb_tailroom(to)) {
		skb_copy_bits(from, 0, skb_put(to, len), len);
		return;
	}
I was planning to fix this while adding GSO support to the upcall as
that is the moment when this bug would really surface.
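
To make the tailroom argument concrete, here is a minimal userspace sketch
of the two steps described above. It is a toy model, not kernel code:
struct model_skb and the model_* helpers are hypothetical names invented
for illustration, and the frag path is reduced to a counter instead of
attaching real page fragments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the linear part of an sk_buff (not the kernel struct). */
struct model_skb {
	unsigned char *head;	/* start of the linear buffer */
	size_t tail;		/* offset of the end of the data */
	size_t end;		/* offset of the end of the buffer */
	int nr_frags;		/* how many times the frag path was taken */
};

static inline size_t model_tailroom(const struct model_skb *skb)
{
	return skb->end - skb->tail;
}

/* Models "skb->end = skb->tail + size": the whole ring frame is tailroom. */
static inline void model_ring_setup(struct model_skb *skb,
				    unsigned char *frame, size_t size)
{
	assert(frame != NULL);
	skb->head = frame;
	skb->tail = 0;
	skb->end = skb->tail + size;
	skb->nr_frags = 0;
}

/*
 * Models the early-return branch quoted above: a payload that fits in the
 * tailroom is copied into the linear area. Returns 1 in that case, and 0
 * when the payload would have to be carried in frags instead.
 */
static inline int model_zerocopy(struct model_skb *to,
				 const unsigned char *from, size_t len)
{
	if (len <= model_tailroom(to)) {
		memcpy(to->head + to->tail, from, len);
		to->tail += len;
		return 1;
	}
	to->nr_frags++;	/* placeholder for attaching page frags */
	return 0;
}
```

With a 4096-byte frame, a 1500-byte payload takes the copy branch and never
touches frags, which is why the mmapped case happens to work today. Only a
payload larger than the frame, as a GSO-sized upcall presumably would be,
reaches the frag path that the mmapped send ignores.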