From: John Fastabend <john.fastabend@gmail.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>,
	john.r.fastabend@intel.com, netdev@vger.kernel.org,
	alexei.starovoitov@gmail.com, daniel@iogearbox.net
Subject: Re: [net PATCH] net: virtio: cap mtu when XDP programs are running
Date: Mon, 9 Jan 2017 20:25:43 -0800
Message-ID: <58746247.604@gmail.com>
In-Reply-To: <20170110055023-mutt-send-email-mst@kernel.org>

On 17-01-09 07:55 PM, Michael S. Tsirkin wrote:
> On Mon, Jan 09, 2017 at 07:30:34PM -0800, John Fastabend wrote:
>> On 17-01-09 06:51 PM, Michael S. Tsirkin wrote:
>>> On Tue, Jan 10, 2017 at 10:29:39AM +0800, Jason Wang wrote:
>>>>
>>>>
>>>> On 2017-01-10 07:58, Michael S. Tsirkin wrote:
>>>>> On Mon, Jan 09, 2017 at 03:49:27PM -0800, John Fastabend wrote:
>>>>>> On 17-01-09 03:24 PM, Michael S. Tsirkin wrote:
>>>>>>> On Mon, Jan 09, 2017 at 03:13:15PM -0800, John Fastabend wrote:
>>>>>>>> On 17-01-09 03:05 PM, Michael S. Tsirkin wrote:
>>>>>>>>> On Thu, Jan 05, 2017 at 11:09:14AM +0800, Jason Wang wrote:
>>>>>>>>>> On 2017-01-05 02:57, John Fastabend wrote:
>>>>>>>>>>> [...]
>>>>>>>>>>>
>>>>>>>>>>>> On 2017-01-04 00:48, John Fastabend wrote:
>>>>>>>>>>>>> On 17-01-02 10:14 PM, Jason Wang wrote:
>>>>>>>>>>>>>> On 2017-01-03 06:30, John Fastabend wrote:
>>>>>>>>>>>>>>> XDP programs cannot consume multiple pages so we cap the MTU to
>>>>>>>>>>>>>>> avoid this case. Virtio-net however only checks the MTU at XDP
>>>>>>>>>>>>>>> program load and does not block MTU changes after the program
>>>>>>>>>>>>>>> has loaded.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This patch sets/clears the max_mtu value at XDP load/unload time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Signed-off-by: John Fastabend<john.r.fastabend@intel.com>
>>>>>>>>>>>>>>> ---
>>>>>>>>>>> [...]
>>>>>>>>>>>
>>>>>>>>>>>>> OK so this logic is a bit too simple. When it resets the max_mtu I guess it
>>>>>>>>>>>>> needs to read the mtu via
>>>>>>>>>>>>>
>>>>>>>>>>>>>       virtio_cread16(vdev, ...)
>>>>>>>>>>>>>
>>>>>>>>>>>>> or we may break the negotiated mtu.
>>>>>>>>>>>> Yes, this is a problem (even if we use ETH_MAX_MTU). We may need a method to notify
>>>>>>>>>>>> the device about the mtu in this case which is not supported by virtio now.
>>>>>>>>>>> Note this is not really an XDP-specific problem. The guest can change the MTU
>>>>>>>>>>> after init time even without XDP which I assume should ideally result in a
>>>>>>>>>>> notification if the MTU is negotiated.
>>>>>>>>>> Yes, Michael, do you think we need to add some mechanism to notify the host about
>>>>>>>>>> MTU change in this case?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>> Why does host care?
>>>>>>>>>
>>>>>>>> Well the guest will drop packets after mtu has been reduced.
>>>>>>> I didn't know. What place in code does this?
>>>>>>>
>>>>>> hmm in many of the drivers it is convention to use the mtu to set the rx
>>>>>> buffer sizes and a receive side max length filter. For example in the Intel
>>>>>> drivers if a packet with length greater than MTU + some headroom is received we
>>>>>> drop it. I guess in the networking stack RX path though nothing forces this and
>>>>>> virtio doesn't have any code to drop packets on rx size.
>>>>>>
>>>>>> In virtio I don't see any existing case currently. In the XDP case though we
>>>>>> need to ensure packets fit in a page for the time being which is why I was
>>>>>> looking at this code and generated this patch.
>>>>> I'd say just look at the hardware max mtu. Ignore the configured mtu.
>>>>>
>>>>>
>>>>
>>>> Does this work for small buffers, considering it always allocates an skb with
>>>> a size of GOOD_PACKET_LEN?
>>>
>>> Spec says hardware won't send in packets > max mtu in config space.
>>>
>>>> I think in any case, we should limit max_mtu to
>>>> GOOD_PACKET_LEN for small buffers.
>>>>
>>>> Thanks
>>>
>>> XDP seems to have a bunch of weird restrictions, I just
>>> do not like it that the logic spills out to all drivers.
>>> What if someone decides to extend it to two pages in the future?
>>> Recode it all in all drivers ...
>>>
>>> Why can't net core enforce mtu?
>>>
>>
>> OK, I agree. I'll put most of the logic in rtnetlink.c when the program is added
>> or removed.
>>
>> But I'm looking at the non-XDP receive_small path now and wondering how
>> multiple-buffer receives work (e.g. a packet larger than GOOD_PACKET_LEN)?
> 
> I don't understand the question. Look at add_recvbuf_small,
> it adds a tiny buffer for head and then the skb.
> 

Specifically, this seems to fail with mergeable buffers disabled:

On the host:

# ip link set dev tap0 mtu 9000
# ping 22.2 -s 2048

On the guest:

# insmod ./drivers/net/virtio_net.ko
# ip link set dev eth0 mtu 9000

With mergeable buffers enabled there are no problems; it works as I expect, at least.
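
To make the arithmetic behind the reproduction above concrete, here is a
rough user-space sketch (not driver code). GOOD_PACKET_LEN mirrors the
definition in virtio_net.c; the helper names ping_frame_len(),
fits_small_buffer() and bufs_needed() are made up for illustration, and
the model assumes IPv4 with no options, no fragmentation, and ignores
the virtio-net header.

```c
#include <assert.h>
#include <stddef.h>

/* Header sizes; GOOD_PACKET_LEN mirrors the definition in virtio_net.c. */
#define ETH_HLEN        14
#define VLAN_HLEN       4
#define ETH_DATA_LEN    1500
#define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)

#define IPV4_HLEN       20  /* assumption: no IP options */
#define ICMP_HLEN       8

/* Hypothetical helper: Ethernet frame length of an unfragmented
 * IPv4 echo request sent with "ping -s payload". */
static size_t ping_frame_len(size_t payload)
{
	return ETH_HLEN + IPV4_HLEN + ICMP_HLEN + payload;
}

/* Hypothetical helper: does the frame fit in the single buffer of
 * GOOD_PACKET_LEN bytes that the small receive path posts? */
static int fits_small_buffer(size_t frame_len)
{
	return frame_len <= GOOD_PACKET_LEN;
}

/* Hypothetical helper: number of receive buffers of buf_len bytes a
 * frame would span if it may be split across buffers, as in the
 * mergeable case. buf_len is a parameter because the real buffer
 * size is not modelled here. */
static size_t bufs_needed(size_t frame_len, size_t buf_len)
{
	assert(buf_len > 0);
	return (frame_len + buf_len - 1) / buf_len;
}
```

Under these assumptions ping_frame_len(2048) is larger than
GOOD_PACKET_LEN, so fits_small_buffer() returns 0 for it, while a frame
that may span buffers just needs bufs_needed() > 1 of them.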


> 
>> I think
>> this is what Jason is looking at as well? The mergeable case clearly looks at
>> num_bufs in the descriptor to construct multi-buffer packets but nothing like
>> that exists in the small_receive path as best I can tell.
>>
>> .John
> 
> There's always a single buffer there.
> BTW it was always a legacy path but if it's now important for people we
> should probably check ANY_LAYOUT and put header linearly with the packet
> if there.
> 

Thread overview: 21+ messages
2017-01-02 22:30 [net PATCH] net: virtio: cap mtu when XDP programs are running John Fastabend
2017-01-03  6:14 ` Jason Wang
2017-01-03 16:48   ` John Fastabend
2017-01-04  3:16     ` Jason Wang
2017-01-04 18:57       ` John Fastabend
2017-01-05  3:09         ` Jason Wang
2017-01-09 23:05           ` Michael S. Tsirkin
2017-01-09 23:13             ` John Fastabend
2017-01-09 23:24               ` Michael S. Tsirkin
2017-01-09 23:49                 ` John Fastabend
2017-01-09 23:58                   ` Michael S. Tsirkin
2017-01-10  2:29                     ` Jason Wang
2017-01-10  2:51                       ` Michael S. Tsirkin
2017-01-10  3:30                         ` John Fastabend
2017-01-10  3:55                           ` Michael S. Tsirkin
2017-01-10  4:25                             ` John Fastabend [this message]
2017-01-10  5:00                               ` Michael S. Tsirkin
2017-01-11  3:37                                 ` Jason Wang
2017-01-10  3:34                         ` Jason Wang
2017-01-10 14:51                         ` Michael S. Tsirkin
2017-01-03 16:48   ` John Fastabend
