From: Jason Wang <jasowang@redhat.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: John Fastabend <john.fastabend@gmail.com>,
	davem@davemloft.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, mst@redhat.com,
	John Fastabend <john.r.fastabend@intel.com>,
	Vlad Yasevich <vyasevic@redhat.com>
Subject: Re: [PATCH net 1/2] macvlan: forbid L2 fowarding offload for macvtap
Date: Tue, 07 Jan 2014 11:10:01 +0800
Message-ID: <52CB7009.2030903@redhat.com>
In-Reply-To: <20140106122628.GA24280@hmsreliant.think-freely.org>

On 01/06/2014 08:26 PM, Neil Horman wrote:
> On Mon, Jan 06, 2014 at 03:54:21PM +0800, Jason Wang wrote:
>> On 01/06/2014 03:35 PM, John Fastabend wrote:
>>> On 01/05/2014 07:21 PM, Jason Wang wrote:
>>>> L2 forwarding offload will bypass the rx handler of the real device,
>>>> so packets cannot be forwarded to the macvtap device. Another problem
>>>> is that the dev_hard_start_xmit() call for macvtap does not have any
>>>> synchronization.
>>>>
>>>> Fix this by forbidding L2 forwarding for macvtap.
>>>>
>>>> Cc: John Fastabend <john.r.fastabend@intel.com>
>>>> Cc: Neil Horman <nhorman@tuxdriver.com>
>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> ---
>>>>   drivers/net/macvlan.c |    5 ++++-
>>>>   1 files changed, 4 insertions(+), 1 deletions(-)
>>>>
>>> I must be missing something.
>>>
>>> The lower layer device should set skb->dev to the correct macvtap
>>> device on receive so that in netif_receive_skb_core() the macvtap
>>> handler is hit. Skipping the macvlan receive handler should be OK
>>> because the switching was done by the hardware. If I read macvtap.c
>>> correctly macvlan_common_newlink() is called with 'dev' where 'dev'
>>> is the macvtap device. Any idea what I'm missing? I guess I'll need
>>> to setup a macvtap test case.
>> Unlike macvlan, macvtap depends on rx handler on the lower device to
>> work. In this case macvlan_handle_frame() will call macvtap_receive().
>> So doing netif_receive_skb_core() for macvtap device directly won't work
>> since we need to forward the packet to userspace instead of kernel.
>>
>> For net-next.git, it may work since commit
>> 6acf54f1cf0a6747bac9fea26f34cfc5a9029523 let macvtap device register an
>> rx handler for itself.
> I agree, this seems like it should already be fixed with the above commit.  With
> this the macvlan rx handler should effectively be a no-op as far as the
> reception of frames is concerned.  As long as the driver sets the dev correctly
> to the macvtap device (and it appears to), macvtap will get frames to user
> space, regardless of whether the software or hardware did the switching.  If
> that's the case though, I think the solution is moving that fix to -stable
> (pending testing of course), rather than coming up with a new fix.
>
>>> And what synchronization are you worried about on dev_hard_start_xmit()?
>>> In the L2 forwarding offload case macvlan_open() clears the NETIF_F_LLTX
>>> flag so HARD_TX_LOCK protects the driver txq. We might hit this warning
>>> in dev_queue_xmit() though,
>>>
>>>   net_crit_ratelimited("Virtual device %s asks to queue packet!\n",
>>>
>>> Perhaps we can remove it.
>> The problem is macvtap does not call dev_queue_xmit() for macvlan
>> device. It calls macvlan_start_xmit() directly from macvtap_get_user().
>> So HARD_TX_LOCK was not done for the txq.
> This also seems to be fixed by 6acf54f1cf0a6747bac9fea26f34cfc5a9029523.
> As of that commit, macvtap does use dev_queue_xmit() to transmit
> frames to the lower device.

Unfortunately not. That commit has the side effect of effectively
disabling multiqueue macvtap transmission, since all macvtap queues
will contend on a single qdisc lock.

For L2 forwarding offload itself, more issues need to be addressed for
multiqueue macvtap:

- ndo_dfwd_add_station() can only create queues per device at ndo_open,
but multiqueue macvtap allows the user to create and destroy queues at
will and at any time.
- It looks like ixgbe has an upper limit of 4 queues per station, but
macvtap currently allows up to 16 queues per device.

So more work needs to be done, and unless the three issues above are
addressed, this patch is really needed to make sure macvtap works.
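
To make the constraints concrete, here is a rough userspace sketch, not
the actual patch. Every sketch_* name is made up for illustration, and
the two limits are just the numbers quoted above rather than values read
from kernel headers. It models the check the patch is after (never ask
for an L2 forwarding station on behalf of macvtap) and the queue count
mismatch between the offload and macvtap.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constants: the numbers come from this thread, not from
 * any kernel header. */
enum {
	SKETCH_OFFLOAD_MAX_QUEUES = 4,	/* apparent ixgbe per-station limit */
	SKETCH_MACVTAP_MAX_QUEUES = 16,	/* current macvtap per-device limit */
};

/* Hypothetical tag for the kind of upper device sitting on the lower dev. */
enum sketch_upper_kind { SKETCH_MACVLAN, SKETCH_MACVTAP };

/* Return true if an L2 forwarding station may be requested at open time. */
static bool sketch_may_offload(enum sketch_upper_kind kind, bool lower_has_l2fw)
{
	if (!lower_has_l2fw)
		return false;
	/* The point of this patch: macvtap never takes the offload path. */
	return kind != SKETCH_MACVTAP;
}

/* Return true if a queue count that macvtap allows would also fit in a
 * single offloaded station. */
static bool sketch_queues_fit(unsigned int wanted)
{
	assert(wanted >= 1 && wanted <= SKETCH_MACVTAP_MAX_QUEUES);
	return wanted <= SKETCH_OFFLOAD_MAX_QUEUES;
}
```

With these assumptions, a macvtap device never gets the offload even when
the lower device advertises it, and any macvtap configuration using more
than 4 queues could not be mapped onto one station anyway.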

>
> Regards
> Neil

