* MTU on a virtio-net device?
@ 2008-10-23 8:34 Michael Tokarev
2008-10-23 12:09 ` Dor Laor
0 siblings, 1 reply; 5+ messages in thread
From: Michael Tokarev @ 2008-10-23 8:34 UTC
To: kvm
Right now (2.6.27), there's no way to change MTU of a
virtio-net interface, since the mtu-changing method is
not provided. Is there a simple way to add such a
beast?
I'm asking because I'm not familiar with the internals,
and because, I think, increasing MTU (so that the
resulting skb still fits in a single page) will increase
performance significantly, at least on an internal/virtual
network -- currently there are just way too many context
switches and the like while copying data from one guest
to another or between guest and host.
Thanks!
/mjt
* Re: MTU on a virtio-net device?
2008-10-23 8:34 MTU on a virtio-net device? Michael Tokarev
@ 2008-10-23 12:09 ` Dor Laor
2008-10-23 12:30 ` Michael Tokarev
0 siblings, 1 reply; 5+ messages in thread
From: Dor Laor @ 2008-10-23 12:09 UTC
To: Michael Tokarev; +Cc: kvm
Michael Tokarev wrote:
> Right now (2.6.27), there's no way to change MTU of a
> virtio-net interface, since the mtu-changing method is
> not provided. Is there a simple way to add such a
> beast?
>
It should be a nice easy patch for mtu < 4k.
You can just implement a 'change_mtu' handler like:
static int virtio_change_mtu(struct net_device *netdev, int new_mtu)
{
        if (new_mtu < ETH_ZLEN || new_mtu > PAGE_SIZE)
                return -EINVAL;
        netdev->mtu = new_mtu;
        return 0;
}
> I'm asking because I'm not familiar with the internals,
> and because, I think, increasing MTU (so that the
> resulting skb still fits in a single page) will increase
> performance significantly, at least on an internal/virtual
> network -- currently there are just way too many context
> switches and the like while copying data from one guest
> to another or between guest and host.
>
> Thanks!
>
> /mjt
* Re: MTU on a virtio-net device?
2008-10-23 12:09 ` Dor Laor
@ 2008-10-23 12:30 ` Michael Tokarev
[not found] ` <490073EA.5060009@redhat.com>
0 siblings, 1 reply; 5+ messages in thread
From: Michael Tokarev @ 2008-10-23 12:30 UTC
To: dor; +Cc: kvm
Dor Laor wrote:
> Michael Tokarev wrote:
>> Right now (2.6.27), there's no way to change MTU of a
>> virtio-net interface, since the mtu-changing method is
>> not provided. Is there a simple way to add such a
>> beast?
>>
> It should be a nice easy patch for mtu < 4k.
> You can just implement a 'change_mtu' handler like:
>
> static int virtio_change_mtu(struct net_device *netdev, int new_mtu)
> {
>         if (new_mtu < ETH_ZLEN || new_mtu > PAGE_SIZE)
>                 return -EINVAL;
>         netdev->mtu = new_mtu;
>         return 0;
> }
Well, this isn't enough I think. That is, new_mtu's upper cap should be
less than PAGE_SIZE due to various additional data structures. But it
is enough to start playing.
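A minimal userspace sketch of that tighter bound, assuming the Ethernet header shares the page with the payload. The names sketch_max_mtu() and sketch_mtu_valid(), the 4096-byte page, and the 68-byte lower bound (the IPv4 minimum, not the ETH_ZLEN used above) are assumptions made for illustration, not taken from the driver:

```c
#include <stdbool.h>

/* Stand-in values for a userspace sketch; in the kernel these come from
 * <linux/if_ether.h> and the architecture's page size. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_ETH_HLEN  14u   /* Ethernet header: dst + src + ethertype */
#define SKETCH_MIN_MTU   68u   /* assumption: IPv4 minimum MTU (RFC 791) */

/* Hypothetical helper: largest MTU whose frame, plus `extra` bytes of
 * other per-buffer overhead, still fits in a single page. */
static unsigned int sketch_max_mtu(unsigned int extra)
{
    return SKETCH_PAGE_SIZE - SKETCH_ETH_HLEN - extra;
}

/* Hypothetical bounds check a change_mtu handler could apply. */
static bool sketch_mtu_valid(unsigned int new_mtu, unsigned int extra)
{
    return new_mtu >= SKETCH_MIN_MTU && new_mtu <= sketch_max_mtu(extra);
}
```

With extra set to 0 the cap comes out at 4082, so an MTU of 3500 passes and 4096 is rejected. The extra parameter leaves room for any further header that turns out to live in the same buffer.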
I just added the above method, which allowed me to set MTU to 3500
(arbitrary). But it still does not work. In guest, I see the
following while pinging it from host with `ping -s2000':
16:26:57.952684 IP truncated-ip - 528 bytes missing! 81.13.33.145 > 81.13.33.150: ICMP echo request, id 12869, seq 19, length 2008
16:26:58.954133 IP truncated-ip - 528 bytes missing! 81.13.33.145 > 81.13.33.150: ICMP echo request, id 12869, seq 20, length 2008
...
So something else has to be changed for this to work, it seems.
That's why I wrote:
>> I'm asking because I'm not familiar with the internals,
[...]
;)
Thanks!
/mjt
* Re: MTU on a virtio-net device?
[not found] ` <490073EA.5060009@redhat.com>
@ 2008-10-23 13:19 ` Michael Tokarev
2008-10-23 13:27 ` Dor Laor
0 siblings, 1 reply; 5+ messages in thread
From: Michael Tokarev @ 2008-10-23 13:19 UTC
To: dor; +Cc: kvm
Dor Laor wrote:
> Michael Tokarev wrote:
>> Dor Laor wrote:
>>
>>> Michael Tokarev wrote:
>>>
>>>> Right now (2.6.27), there's no way to change MTU of a
>>>> virtio-net interface, since the mtu-changing method is
>>>> not provided. Is there a simple way to add such a
>>>> beast?
>>>>
>>> It should be a nice easy patch for mtu < 4k.
>>> You can just implement a 'change_mtu' handler like:
[]
>> Well, this isn't enough I think. That is, new_mtu's upper cap should be
>> less than PAGE_SIZE due to various additional data structures. But it
>> is enough to start playing.
>>
> The virtio header is in a separate ring entry so no prob.
virtio header is one thing. Ethernet frame is another. And
so on. From the last experiment (sending 2000-byte-payload
pings resulting in 2008 bytes total, and 528 bytes missing
with original mtu=1500), it seems like the necessary upper
cap is PAGE_SIZE-28. Or something similar.
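The 528 figure can be reproduced with simple arithmetic, which may help narrow down where the limit comes from. The sketch below assumes an IPv4 header without options (20 bytes) and a receive buffer of ETH_HLEN+ETH_DATA_LEN bytes, as in the driver's MAX_PACKET_LEN; the helper names are made up for illustration:

```c
#define ETH_HLEN      14    /* Ethernet header */
#define ETH_DATA_LEN  1500  /* default Ethernet payload size */
#define IPV4_HLEN     20    /* assumption: IPv4 header without options */
#define ICMP_HLEN     8     /* ICMP echo header */

/* On-wire frame length for `ping -s <ping_payload>`. */
static int frame_len(int ping_payload)
{
    return ETH_HLEN + IPV4_HLEN + ICMP_HLEN + ping_payload;
}

/* Bytes lost when the frame is cut to a receive buffer of buf_len bytes. */
static int missing_bytes(int ping_payload, int buf_len)
{
    int frame = frame_len(ping_payload);

    return frame > buf_len ? frame - buf_len : 0;
}
```

For `ping -s2000` the frame is 14 + 20 + 8 + 2000 = 2042 bytes, and 2042 - 1514 = 528, matching the tcpdump output. That is consistent with the packet being cut at the 1514-byte buffer rather than at a page boundary.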
Also see receive_skb() routine:
receive_skb(struct net_device *dev, struct sk_buff *skb, unsigned len)
{
        if (unlikely(len < sizeof(struct virtio_net_hdr) + ETH_HLEN)) {
                /* drop */
        }
        len -= sizeof(struct virtio_net_hdr);
        if (len <= MAX_PACKET_LEN) {
        ...
So it seems that virtio_net_hdr is in here, just like
ethernet header.
[]
>> So something else has to be changed for this to work, it seems.
> You're right, this one needs to be changed too:
> /* FIXME: MTU in config. */
> #define MAX_PACKET_LEN (ETH_HLEN+ETH_DATA_LEN)
>
> You can change it to PAGE_SIZE or have the current mtu.
so s/MAX_PACKET_LEN/dev->mtu/g for the whole driver, it
seems. Plus/minus sizeof(virtio_net_hdr) - checking this now.
This constant is used in 3 places:
  receive_skb():    if (len <= MAX_PACKET_LEN) {
    (this one seems to be wrong, but again I don't know much
    about the internals of all this stuff)
    here, dev->mtu is what we want.

  try_fill_recv():  skb = netdev_alloc_skb(vi->dev, MAX_PACKET_LEN);
    here, we don't have dev, but have vi->dev, should be ok too.

  try_fill_recv():  skb_put(skb, MAX_PACKET_LEN);
    ditto
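One caveat with a plain s/MAX_PACKET_LEN/dev->mtu/: the constant includes the Ethernet header, so a direct equivalent would be ETH_HLEN plus the MTU. A userspace sketch of that substitution follows; struct sketch_dev and both helpers are hypothetical stand-ins, not the driver's own types:

```c
#define ETH_HLEN 14  /* Ethernet header */

/* Hypothetical stand-in for the one net_device field used here. */
struct sketch_dev {
    unsigned int mtu;
};

/* Per-device replacement for the MAX_PACKET_LEN constant. */
static unsigned int sketch_max_packet_len(const struct sketch_dev *dev)
{
    return ETH_HLEN + dev->mtu;
}

/* Mirrors the `len <= MAX_PACKET_LEN` test in receive_skb(), where len
 * has already had the virtio header subtracted. */
static int sketch_len_ok(const struct sketch_dev *dev, unsigned int len)
{
    return len <= sketch_max_packet_len(dev);
}
```

With mtu=1500 this gives the original 1514; with mtu=3500 it gives 3514, enough for the 2042-byte frame from the ping test. The same value would be passed to both the allocation and the skb_put() in try_fill_recv().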
And by the way, what is "big_packets" here?
Ok, so I changed MAX_PACKET_LEN to be PAGE_SIZE (current MTU
seems to be more appropriate but PAGE_SIZE is enough for
testing anyway). It seems to be working, and network
speed increased significantly with MTU=3500 compared with
former 1500 - it seems it's about 2 times faster (which is
to be expected, since there are roughly half as many context
switches, transmissions and the like).
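A rough packet count supports that estimate. The sketch below assumes a bulk TCP transfer over IPv4 with 40 bytes of headers per packet and no options; packets_needed() is a made-up helper and ignores ACKs and any offloads:

```c
#define TCPIP_HLEN 40u  /* assumption: IPv4 + TCP headers, no options */

/* Packets needed to carry `bytes` of payload at a given MTU
 * (rounded up). Returns 0 if the MTU cannot hold any payload. */
static unsigned long packets_needed(unsigned long bytes, unsigned int mtu)
{
    unsigned long payload;

    if (mtu <= TCPIP_HLEN)
        return 0;
    payload = mtu - TCPIP_HLEN;
    return (bytes + payload - 1) / payload;
}
```

Moving 1 MiB takes 719 packets at MTU 1500 and 304 at MTU 3500, about 2.4 times fewer. If per-packet cost dominates, a speedup in the region of 2x is what one would expect.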
>>>> I'm asking because I'm not familiar with the internals,
Still... ;)
Thanks!
/mjt
* Re: MTU on a virtio-net device?
2008-10-23 13:19 ` Michael Tokarev
@ 2008-10-23 13:27 ` Dor Laor
0 siblings, 0 replies; 5+ messages in thread
From: Dor Laor @ 2008-10-23 13:27 UTC
To: Michael Tokarev; +Cc: kvm
Michael Tokarev wrote:
> Dor Laor wrote:
>
>> Michael Tokarev wrote:
>>
>>> Dor Laor wrote:
>>>
>>>
>>>> Michael Tokarev wrote:
>>>>
>>>>
>>>>> Right now (2.6.27), there's no way to change MTU of a
>>>>> virtio-net interface, since the mtu-changing method is
>>>>> not provided. Is there a simple way to add such a
>>>>> beast?
>>>>>
>>>>>
>>>> It should be a nice easy patch for mtu < 4k.
>>>> You can just implement a 'change_mtu' handler like:
>>>>
> []
>
>>> Well, this isn't enough I think. That is, new_mtu's upper cap should be
>>> less than PAGE_SIZE due to various additional data structures. But it
>>> is enough to start playing.
>>>
>>>
>> The virtio header is in a separate ring entry so no prob.
>>
>
> virtio header is one thing. Ethernet frame is another. And
> so on. From the last experiment (sending 2000-byte-payload
> pings resulting in 2008 bytes total, and 528 bytes missing
> with original mtu=1500), it seems like the necessary upper
> cap is PAGE_SIZE-28. Or something similar.
>
> Also see receive_skb() routine:
>
> receive_skb(struct net_device *dev, struct sk_buff *skb, unsigned len)
> {
>         if (unlikely(len < sizeof(struct virtio_net_hdr) + ETH_HLEN)) {
>                 /* drop */
>         }
>         len -= sizeof(struct virtio_net_hdr);
>         if (len <= MAX_PACKET_LEN) {
>         ...
>
> So it seems that virtio_net_hdr is in here, just like
> ethernet header.
>
> []
>
>>> So something else has to be changed for this to work, it seems.
>>>
>> You're right, this one needs to be changed too:
>> /* FIXME: MTU in config. */
>> #define MAX_PACKET_LEN (ETH_HLEN+ETH_DATA_LEN)
>>
>> You can change it to PAGE_SIZE or have the current mtu.
>>
>
> so s/MAX_PACKET_LEN/dev->mtu/g for the whole driver, it
> seems. Plus/minus sizeof(virtio_net_hdr) - checking this now.
> This constant is used in 3 places:
>
>   receive_skb():    if (len <= MAX_PACKET_LEN) {
>     (this one seems to be wrong, but again I don't know much
>     about the internals of all this stuff)
>     here, dev->mtu is what we want.
>
>   try_fill_recv():  skb = netdev_alloc_skb(vi->dev, MAX_PACKET_LEN);
>     here, we don't have dev, but have vi->dev, should be ok too.
>
>   try_fill_recv():  skb_put(skb, MAX_PACKET_LEN);
>     ditto
>
>
I was too lazy to write a complete patch.
> And by the way, what is "big_packets" here?
>
It's a bit harder here, IIRC qemu also has a 4k limit.
Not something that can be done in a short period.
Anyway you can use GSO and achieve similar performance.
> Ok, so I changed MAX_PACKET_LEN to be PAGE_SIZE (current MTU
> seems to be more appropriate but PAGE_SIZE is enough for
> testing anyway). It seems to be working, and network
> speed increased significantly with MTU=3500 compared with
> former 1500 - it seems it's about 2 times faster (which is
> to be expected, since there are roughly half as many context
> switches, transmissions and the like).
>
>
>>>>> I'm asking because I'm not familiar with the internals,
>>>>>
>
> Still... ;)
>
> Thanks!
>
> /mjt
>
>
You seem to be a fast learner :)