From: Jason Wang <jasowang@redhat.com>
To: Dmitry Fleytman <dmitry@daynix.com>
Cc: Leonid Bloch <leonid.bloch@ravellosystems.com>,
Leonid Bloch <leonid@daynix.com>,
qemu-devel@nongnu.org,
Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Subject: Re: [Qemu-devel] [RFC PATCH 0/5] Introduce Intel 82574 GbE Controller Emulation (e1000e)
Date: Tue, 3 Nov 2015 13:44:45 +0800 [thread overview]
Message-ID: <563849CD.5010402@redhat.com> (raw)
In-Reply-To: <5941010E-D55E-4692-B128-BDD9D7B086FE@daynix.com>
On 11/02/2015 03:49 PM, Dmitry Fleytman wrote:
>
>> On 2 Nov 2015, at 05:35 AM, Jason Wang <jasowang@redhat.com
>> <mailto:jasowang@redhat.com>> wrote:
>>
>>
>>
>> On 10/31/2015 01:52 PM, Dmitry Fleytman wrote:
>>> Hello Jason,
>>>
>>> Thanks for reviewing. See my answers inline.
>>>
>>>
>>>> On 30 Oct 2015, at 07:28 AM, Jason Wang <jasowang@redhat.com
>>>> <mailto:jasowang@redhat.com>
>>>> <mailto:jasowang@redhat.com>> wrote:
>>>>
>>>>
>>>>
>>>> On 10/28/2015 01:44 PM, Jason Wang wrote:
>>>>>
>>>>> On 10/26/2015 01:00 AM, Leonid Bloch wrote:
>>>>>> Hello qemu-devel,
>>>>>>
>>>>>> This patch series is an RFC for the new networking device emulation
>>>>>> we're developing for QEMU.
>>>>>>
>>>>>> This new device emulates the Intel 82574 GbE Controller and works
>>>>>> with unmodified Intel e1000e drivers from the Linux/Windows kernels.
>>>>>>
>>>>>> The status of the current series is "Functional Device Ready, work
>>>>>> on Extended Features in Progress".
>>>>>>
>>>>>> More precisely, these patches represent a functional device, which
>>>>>> is recognized by the standard Intel drivers, and is able to transfer
>>>>>> TX/RX packets with CSO/TSO offloads, according to the spec.
>>>>>>
>>>>>> Extended features not supported yet (work in progress):
>>>>>> 1. TX/RX Interrupt moderation mechanisms
>>>>>> 2. RSS
>>>>>> 3. Full-featured multi-queue (use of multiqueued network backend)
>>>>>>
>>>>>> Also, there will be some code refactoring and performance
>>>>>> optimization efforts.
>>>>>>
>>>>>> This series was tested on Linux (Fedora 22) and Windows (2012R2)
>>>>>> guests, using Iperf, with TX/RX and TCP/UDP streams, and various
>>>>>> packet sizes.
>>>>>>
>>>>>> More thorough testing, including data streams with different MTU
>>>>>> sizes, and Microsoft Certification (HLK) tests, are pending missing
>>>>>> features' development.
>>>>>>
>>>>>> See commit messages (esp. "net: Introduce e1000e device emulation")
>>>>>> for more information about the development approaches and the
>>>>>> architecture options chosen for this device.
>>>>>>
>>>>>> This series is based upon v2.3.0 tag of the upstream QEMU repository,
>>>>>> and it will be rebased to latest before the final submission.
>>>>>>
>>>>>> Please share your thoughts - any feedback is highly welcomed :)
>>>>>>
>>>>>> Best Regards,
>>>>>> Dmitry Fleytman.
>>>>> Thanks for the series. Will go through this in next few days.
>>>>
>>>> Have a quick glance at the series, got the following questions:
>>>>
>>>> - Though e1000e differs from e1000 in many places, I still see lots of
>>>> code duplications. We need consider to reuse e1000.c (or at least part
>>>> of). I believe we don't want to fix a bug twice in two places in the
>>>> future and I expect hundreds of lines could be saved through this way.
>>>
>>> That’s a good question :)
>>>
>>> This is how we started, we had a common “core” code base meant to
>>> implement all common logic (this split is still present in the patches
>>> - there are e1000e_core.c and e1000e.c files).
>>> Unfortunately at some point it turned out that there are more
>>> differences that commons. We noticed that the code becomes filled with
>>> many minor differences handling.
>>> This also made the code base more complicated and harder to follow.
>>>
>>> So at some point of time it was decided to split the code base and
>>> revert all changes done to the e1000 device (except a few
>>> fixes/improvements Leonid submitted a few days ago).
>>>
>>> Although there was common code between devices, total SLOC of e1000
>>> and e1000e devices became smaller after the split.
>>>
>>> Amount of code that may be shared between devices will be even smaller
>>> after we complete the implementation which still misses a few features
>>> (see cover letter) that will change many things.
>>>
>>> Still after the device implementation is done, we plan to review code
>>> similarities again to see if there are possibilities for code sharing.
>>
>> I see, but if we can try to re-use or unify the codes from beginning, it
>> would be a little bit easier. Looks like the differences were mainly:
>>
>> 1) MSI-X support
>> 2) offloading support through virtio-net header
>> 3) trace points
>> 4) other new functions through e1000e specific registers
>>
>> So we could first unify the code through implementing the support of 2
>> and 3 for e1000. For MSI-X and other e1000e specific new functions, it
>> could be done through:
>>
>> 1) model specific callbacks, e.g realize, transmission and reception
>> 2) A new register flags e.g PHY_RW_E1000E which means the register is
>> for e1000e only. Or even model specific wirteops and readops
>> 3) For other subtle differences, it could be done in the code by
>> checking the model explicitly.
>>
>> What's your opinion? (A good example of code sharing is freebsd's e1000
>> driver which covers both 8254x and 8257x).
>
>
> Hi Jason,
>
> This is exactly how we started.
>
> Issues that made us split the code base were following:
>
> 1. The majority of registers are different. Even same registers in
> many cases have different bits and require different processing logic.
> This introduced too much if’s into the code;
Then we can probably have different writeops and readops. This way, we
can at least save the codes of common registers.
> 2. The data path is totally different not just because of virtio
> headers but also because these devices use different descriptor
> layouts and require different logic in order to parse and fill those.
> There are legacy descriptors that look almost the same but of course
> we must support all descriptor types described by spec;
Yes, but looks like the only extend rx/tx descriptors is 8257x specific.
And 8257x fully supports both legacy and context descriptor of 8254x.
This give us another chance to reuse the code.
> 3. Interrupt handling logic is different because of MSI support;
Right, but this is not hard to address, probably a new helper.
> 4. Mutli-queue and RSS make the code even less similar to the old device;
Yes, this could be in 8275x specific file.
> 5. Changes required to isolate shared code base required changes in
> migration tables and fishy tricks to preserve compatibility with
> current implementation;
Since 8257x is a totally new device, it can has its own vmstate if it's
simpler to be implemented and we don't even need to care migration
compatibility.
> 6. e1000 code suffered from massive changes which are very hard to
> verify because spec is big and there are no drivers that use all
> device features.
Then we can try to change e1000 as mini as possible. E.g just factor out
the common logic to helpers to reuse it in e1000e.
>
> Overall, code for handling differences in device behaviours was bigger
> and more complicated then the device logic itself. The difference is
> not subtle when it comes to the full featured device implementation.
> As for FreeBSD’s driver, I’m not deeply familiar with its code but I
> suspect it works with device in legacy mode which is pretty similar to
> an old device indeed. Since we must support all modes our situation is
> different.
Yes, it does not use extended descriptor format.
>
> Again, I’m totally into shared code and would like to have some common
> code base anyway. Currently it looks like the best way to achieve this
> is to finish with all device features and then see what parts of logic
> may be shared between the old and the new devices. It’s better to have
> slight code duplication and smaller shared code base than to have
> bloated and tricky shared code that will introduce its own problems to
> both devices.
The code duplication is not slight in this rfc :). So the code has the
possibility to be unified. But I'm ok to evaluate this after all
features were developed.
Thanks
>
> Best Regards,
> Dmitry
>
>>
>>>
>>>> - For e1000e it self, since it was a new device, so no need to care
>>>> about compatibility stuffs (e.g auto negotiation and mit). We can just
>>>> enable them forever.
>>>
>>> Yes, we have this in plans.
>>>
>>>> - And for the generic packet abstraction layer, what's the
>>>> advantages of
>>>> this? If it has lot, maybe we can use it in other nic model (e.g
>>>> virtio-net)?
>>>
>>> These abstractions were initially developed by me as a part of vmxnet3
>>> device to be generic and re-usable. Their main advantage is support
>>> for virtio headers for virtio-enabled backends and emulation of
>>> network offloads in software for backends that do not support virtio.
>>>
>>> Of course they may be re-used by virtio, however I’m not sure if it
>>> will be really useful because virtio has feature negotiation
>>> facilities and do not require SW emulation for network task offloads.
>>>
>>> For other devices they are useful because each and every device that
>>> requires SW offloads implementation need to do exactly the same things
>>> and it doesn’t make sense to have a few implementations for this.
>>>
>>> Best Regards,
>>> Dmitry
>>
>> Ok, thanks for the explanation.
>>
>>>
>>>>
>>>> Thanks
>>>>
>>>>>
>>>>> Since 2.5 is in soft freeze, this looks a 2.6 material.
>
next prev parent reply other threads:[~2015-11-03 5:45 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-25 17:00 [Qemu-devel] [RFC PATCH 0/5] Introduce Intel 82574 GbE Controller Emulation (e1000e) Leonid Bloch
2015-10-25 17:00 ` [Qemu-devel] [RFC PATCH 1/5] net: Add macros for ETH address tracing Leonid Bloch
2015-10-25 17:00 ` [Qemu-devel] [RFC PATCH 2/5] net_pkt: Name vmxnet3 packet abstractions more generic Leonid Bloch
2015-10-25 17:00 ` [Qemu-devel] [RFC PATCH 3/5] net_pkt: Extend packet abstraction as requied by e1000e functionality Leonid Bloch
2015-10-25 17:00 ` [Qemu-devel] [RFC PATCH 4/5] e1000_regs: Add definitions for Intel 82574-specific bits Leonid Bloch
2015-10-25 17:00 ` [Qemu-devel] [RFC PATCH 5/5] net: Introduce e1000e device emulation Leonid Bloch
2015-10-28 5:44 ` [Qemu-devel] [RFC PATCH 0/5] Introduce Intel 82574 GbE Controller Emulation (e1000e) Jason Wang
2015-10-28 6:11 ` Dmitry Fleytman
2015-10-30 5:28 ` Jason Wang
2015-10-31 5:52 ` Dmitry Fleytman
2015-11-02 3:35 ` Jason Wang
2015-11-02 7:49 ` Dmitry Fleytman
2015-11-03 5:44 ` Jason Wang [this message]
2015-11-03 9:17 ` Dmitry Fleytman
2016-01-13 4:43 ` Prem Mallappa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=563849CD.5010402@redhat.com \
--to=jasowang@redhat.com \
--cc=dmitry@daynix.com \
--cc=leonid.bloch@ravellosystems.com \
--cc=leonid@daynix.com \
--cc=qemu-devel@nongnu.org \
--cc=shmulik.ladkani@ravellosystems.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.