qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Jason Wang <jasowang@redhat.com>
To: Dmitry Fleytman <dmitry@daynix.com>
Cc: Leonid Bloch <leonid.bloch@ravellosystems.com>,
	Leonid Bloch <leonid@daynix.com>,
	qemu-devel@nongnu.org,
	Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Subject: Re: [Qemu-devel] [RFC PATCH 0/5] Introduce Intel 82574 GbE Controller Emulation (e1000e)
Date: Tue, 3 Nov 2015 13:44:45 +0800	[thread overview]
Message-ID: <563849CD.5010402@redhat.com> (raw)
In-Reply-To: <5941010E-D55E-4692-B128-BDD9D7B086FE@daynix.com>



On 11/02/2015 03:49 PM, Dmitry Fleytman wrote:
>
>> On 2 Nov 2015, at 05:35 AM, Jason Wang <jasowang@redhat.com
>> <mailto:jasowang@redhat.com>> wrote:
>>
>>
>>
>> On 10/31/2015 01:52 PM, Dmitry Fleytman wrote:
>>> Hello Jason,
>>>
>>> Thanks for reviewing. See my answers inline.
>>>
>>>
>>>> On 30 Oct 2015, at 07:28 AM, Jason Wang <jasowang@redhat.com
>>>> <mailto:jasowang@redhat.com>
>>>> <mailto:jasowang@redhat.com>> wrote:
>>>>
>>>>
>>>>
>>>> On 10/28/2015 01:44 PM, Jason Wang wrote:
>>>>>
>>>>> On 10/26/2015 01:00 AM, Leonid Bloch wrote:
>>>>>> Hello qemu-devel,
>>>>>>
>>>>>> This patch series is an RFC for the new networking device emulation
>>>>>> we're developing for QEMU.
>>>>>>
>>>>>> This new device emulates the Intel 82574 GbE Controller and works
>>>>>> with unmodified Intel e1000e drivers from the Linux/Windows kernels.
>>>>>>
>>>>>> The status of the current series is "Functional Device Ready, work
>>>>>> on Extended Features in Progress".
>>>>>>
>>>>>> More precisely, these patches represent a functional device, which
>>>>>> is recognized by the standard Intel drivers, and is able to transfer
>>>>>> TX/RX packets with CSO/TSO offloads, according to the spec.
>>>>>>
>>>>>> Extended features not supported yet (work in progress):
>>>>>> 1. TX/RX Interrupt moderation mechanisms
>>>>>> 2. RSS
>>>>>> 3. Full-featured multi-queue (use of multiqueued network backend)
>>>>>>
>>>>>> Also, there will be some code refactoring and performance
>>>>>> optimization efforts.
>>>>>>
>>>>>> This series was tested on Linux (Fedora 22) and Windows (2012R2)
>>>>>> guests, using Iperf, with TX/RX and TCP/UDP streams, and various
>>>>>> packet sizes.
>>>>>>
>>>>>> More thorough testing, including data streams with different MTU
>>>>>> sizes, and Microsoft Certification (HLK) tests, are pending missing
>>>>>> features' development.
>>>>>>
>>>>>> See commit messages (esp. "net: Introduce e1000e device emulation")
>>>>>> for more information about the development approaches and the
>>>>>> architecture options chosen for this device.
>>>>>>
>>>>>> This series is based upon v2.3.0 tag of the upstream QEMU repository,
>>>>>> and it will be rebased to latest before the final submission.
>>>>>>
>>>>>> Please share your thoughts - any feedback is highly welcomed :)
>>>>>>
>>>>>> Best Regards,
>>>>>> Dmitry Fleytman.
>>>>> Thanks for the series. Will go through this in next few days.
>>>>
>>>> Have a quick glance at the series, got the following questions:
>>>>
>>>> - Though e1000e differs from e1000 in many places, I still see lots of
>>>> code duplications. We need consider to reuse e1000.c (or at least part
>>>> of). I believe we don't want to fix a bug twice in two places in the
>>>> future and I expect hundreds of lines could be saved through this way.
>>>
>>> That’s a good question :)
>>>
>>> This is how we started, we had a common “core” code base meant to
>>> implement all common logic (this split is still present in the patches
>>> - there are e1000e_core.c and e1000e.c files).
>>> Unfortunately at some point it turned out that there are more
>>> differences that commons. We noticed that the code becomes filled with
>>> many minor differences handling.
>>> This also made the code base more complicated and harder to follow.
>>>
>>> So at some point of time it was decided to split the code base and
>>> revert all changes done to the e1000 device (except a few
>>> fixes/improvements Leonid submitted a few days ago).
>>>
>>> Although there was common code between devices, total SLOC of e1000
>>> and e1000e devices became smaller after the split.
>>>
>>> Amount of code that may be shared between devices will be even smaller
>>> after we complete the implementation which still misses a few features
>>> (see cover letter) that will change many things.
>>>
>>> Still after the device implementation is done, we plan to review code
>>> similarities again to see if there are possibilities for code sharing.
>>
>> I see, but if we can try to re-use or unify the codes from beginning, it
>> would be a little bit easier. Looks like the differences were mainly:
>>
>> 1) MSI-X support
>> 2) offloading support through virtio-net header
>> 3) trace points
>> 4) other new functions through e1000e specific registers
>>
>> So we could first unify the code through implementing the support of 2
>> and 3 for e1000. For MSI-X and other e1000e specific new functions, it
>> could be done through:
>>
>> 1) model specific callbacks, e.g realize, transmission and reception
>> 2) A new register flags e.g PHY_RW_E1000E which means the register is
>> for e1000e only. Or even model specific wirteops and readops
>> 3) For other subtle differences, it could be done in the code by
>> checking the model explicitly.
>>
>> What's your opinion? (A good example of code sharing is freebsd's e1000
>> driver which covers both 8254x and 8257x).
>
>
> Hi Jason,
>
> This is exactly how we started.
>
> Issues that made us split the code base were following:
>
> 1. The majority of registers are different. Even same registers in
> many cases have different bits and require different processing logic.
> This introduced too much if’s into the code;

Then we can probably have different writeops and readops. This way, we
can at least save the codes of common registers.

> 2. The data path is totally different not just because of virtio
> headers but also because these devices use different descriptor
> layouts and require different logic in order to parse and fill those.
> There are legacy descriptors that look almost the same but of course
> we must support all descriptor types described by spec;

Yes, but looks like the only extend rx/tx descriptors is 8257x specific.
And 8257x fully supports both legacy and context descriptor of 8254x.
This give us another chance to reuse the code.

> 3. Interrupt handling logic is different because of MSI support;

Right, but this is not hard to address, probably a new helper.

> 4. Mutli-queue and RSS make the code even less similar to the old device;

Yes, this could be in 8275x specific file.

> 5. Changes required to isolate shared code base required changes in
> migration tables and fishy tricks to preserve compatibility with
> current implementation;

Since 8257x is a totally new device, it can has its own vmstate if it's
simpler to be implemented and we don't even need to care migration
compatibility.

> 6. e1000 code suffered from massive changes which are very hard to
> verify because spec is big and there are no drivers that use all
> device features.

Then we can try to change e1000 as mini as possible. E.g just factor out
the common logic to helpers to reuse it in e1000e.

>
> Overall, code for handling differences in device behaviours was bigger
> and more complicated then the device logic itself. The difference is
> not subtle when it comes to the full featured device implementation.
> As for FreeBSD’s driver, I’m not deeply familiar with its code but I
> suspect it works with device in legacy mode which is pretty similar to
> an old device indeed. Since we must support all modes our situation is
> different.

Yes, it does not use extended descriptor format.

>
> Again, I’m totally into shared code and would like to have some common
> code base anyway. Currently it looks like the best way to achieve this
> is to finish with all device features and then see what parts of logic
> may be shared between the old and the new devices. It’s better to have
> slight code duplication and smaller shared code base than to have
> bloated and tricky shared code that will introduce its own problems to
> both devices.

The code duplication is not slight in this rfc :). So the code has the
possibility to be unified. But I'm ok to evaluate this after all
features were developed.

Thanks

>
> Best Regards,
> Dmitry
>
>>
>>>
>>>> - For e1000e it self, since it was a new device, so no need to care
>>>> about compatibility stuffs (e.g auto negotiation and mit). We can just
>>>> enable them forever.
>>>
>>> Yes, we have this in plans.
>>>
>>>> - And for the generic packet abstraction layer, what's the
>>>> advantages of
>>>> this? If it has lot, maybe we can use it in other nic model (e.g
>>>> virtio-net)?
>>>
>>> These abstractions were initially developed by me as a part of vmxnet3
>>> device to be generic and re-usable. Their main advantage is support
>>> for virtio headers for virtio-enabled backends and emulation of
>>> network offloads in software for backends that do not support virtio.
>>>
>>> Of course they may be re-used by virtio, however I’m not sure if it
>>> will be really useful because virtio has feature negotiation
>>> facilities and do not require SW emulation for network task offloads.
>>>
>>> For other devices they are useful because each and every device that
>>> requires SW offloads implementation need to do exactly the same things
>>> and it doesn’t make sense to have a few implementations for this.
>>>
>>> Best Regards,
>>> Dmitry
>>
>> Ok, thanks for the explanation.
>>
>>>
>>>>
>>>> Thanks
>>>>
>>>>>
>>>>> Since 2.5 is in soft freeze, this looks a 2.6 material.
>

  reply	other threads:[~2015-11-03  5:45 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-25 17:00 [Qemu-devel] [RFC PATCH 0/5] Introduce Intel 82574 GbE Controller Emulation (e1000e) Leonid Bloch
2015-10-25 17:00 ` [Qemu-devel] [RFC PATCH 1/5] net: Add macros for ETH address tracing Leonid Bloch
2015-10-25 17:00 ` [Qemu-devel] [RFC PATCH 2/5] net_pkt: Name vmxnet3 packet abstractions more generic Leonid Bloch
2015-10-25 17:00 ` [Qemu-devel] [RFC PATCH 3/5] net_pkt: Extend packet abstraction as requied by e1000e functionality Leonid Bloch
2015-10-25 17:00 ` [Qemu-devel] [RFC PATCH 4/5] e1000_regs: Add definitions for Intel 82574-specific bits Leonid Bloch
2015-10-25 17:00 ` [Qemu-devel] [RFC PATCH 5/5] net: Introduce e1000e device emulation Leonid Bloch
2015-10-28  5:44 ` [Qemu-devel] [RFC PATCH 0/5] Introduce Intel 82574 GbE Controller Emulation (e1000e) Jason Wang
2015-10-28  6:11   ` Dmitry Fleytman
2015-10-30  5:28   ` Jason Wang
2015-10-31  5:52     ` Dmitry Fleytman
2015-11-02  3:35       ` Jason Wang
2015-11-02  7:49         ` Dmitry Fleytman
2015-11-03  5:44           ` Jason Wang [this message]
2015-11-03  9:17             ` Dmitry Fleytman
2016-01-13  4:43 ` Prem Mallappa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=563849CD.5010402@redhat.com \
    --to=jasowang@redhat.com \
    --cc=dmitry@daynix.com \
    --cc=leonid.bloch@ravellosystems.com \
    --cc=leonid@daynix.com \
    --cc=qemu-devel@nongnu.org \
    --cc=shmulik.ladkani@ravellosystems.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).