* [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
@ 2018-04-02 19:06 Sinan Kaya
2018-04-03 2:59 ` Sinan Kaya
0 siblings, 1 reply; 7+ messages in thread
From: Sinan Kaya @ 2018-04-02 19:06 UTC (permalink / raw)
To: jeffrey.t.kirsher
Cc: netdev, timur, sulrich, linux-arm-msm, linux-arm-kernel,
Sinan Kaya
Code includes wmb() followed by writel() in multiple places. writel()
already has a barrier on some architectures like arm64.
This ends up CPU observing two barriers back to back before executing the
register write.
Since code already has an explicit barrier call, changing writel() to
writel_relaxed().
I did a regex search for wmb() followed by writel() in each drivers
directory. I scrubbed the ones I care about in this series.
I considered "ease of change", "popular usage" and "performance critical
path" as the determining criteria for my filtering.
We used relaxed API heavily on ARM for a long time but
it did not exist on other architectures. For this reason, relaxed
architectures have been paying double penalty in order to use the common
drivers.
Now that relaxed API is present on all architectures, we can go and scrub
all drivers to see what needs to change and what can remain.
We start with mostly used ones and hope to increase the coverage over time.
It will take a while to cover all drivers.
Feel free to apply patches individually.
Change since 7:
* API clarification with Linus and several architecture folks regarding
writel()
https://www.mail-archive.com/netdev@vger.kernel.org/msg225806.html
* removed wmb() in front of writel() at several places.
* remove old IA64 comments regarding ordering.
Sinan Kaya (7):
i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs
ixgbe: eliminate duplicate barriers on weakly-ordered archs
igbvf: eliminate duplicate barriers on weakly-ordered archs
igb: eliminate duplicate barriers on weakly-ordered archs
fm10k: Eliminate duplicate barriers on weakly-ordered archs
ixgbevf: keep writel() closer to wmb()
ixgbevf: eliminate duplicate barriers on weakly-ordered archs
drivers/net/ethernet/intel/fm10k/fm10k_main.c | 15 ++-----------
drivers/net/ethernet/intel/i40e/i40e_txrx.c | 22 ++++++++----------
drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 8 +------
drivers/net/ethernet/intel/igb/igb_main.c | 14 ++----------
drivers/net/ethernet/intel/igbvf/netdev.c | 16 +++++---------
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 23 ++-----------------
drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 5 -----
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 27 ++++++++---------------
8 files changed, 30 insertions(+), 100 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
2018-04-02 19:06 Sinan Kaya
@ 2018-04-03 2:59 ` Sinan Kaya
2018-04-03 17:47 ` Alexander Duyck
0 siblings, 1 reply; 7+ messages in thread
From: Sinan Kaya @ 2018-04-03 2:59 UTC (permalink / raw)
To: jeffrey.t.kirsher, Alexander Duyck
Cc: netdev, Timur Tabi, sulrich, linux-arm-msm, linux-arm-kernel
Alex,
On 4/2/2018 3:06 PM, Sinan Kaya wrote:
> Code includes wmb() followed by writel() in multiple places. writel()
> already has a barrier on some architectures like arm64.
>
> This ends up CPU observing two barriers back to back before executing the
> register write.
>
> Since code already has an explicit barrier call, changing writel() to
> writel_relaxed().
>
> I did a regex search for wmb() followed by writel() in each drivers
> directory. I scrubbed the ones I care about in this series.
>
> I considered "ease of change", "popular usage" and "performance critical
> path" as the determining criteria for my filtering.
>
> We used relaxed API heavily on ARM for a long time but
> it did not exist on other architectures. For this reason, relaxed
> architectures have been paying double penalty in order to use the common
> drivers.
>
> Now that relaxed API is present on all architectures, we can go and scrub
> all drivers to see what needs to change and what can remain.
>
> We start with mostly used ones and hope to increase the coverage over time.
> It will take a while to cover all drivers.
>
> Feel free to apply patches individually.
>
> Change since 7:
>
> * API clarification with Linus and several architecture folks regarding
> writel()
>
> https://www.mail-archive.com/netdev@vger.kernel.org/msg225806.html
>
> * removed wmb() in front of writel() at several places.
> * remove old IA64 comments regarding ordering.
>
What do you think about this version? Did I miss any SMP barriers?
> Sinan Kaya (7):
> i40e/i40evf: Eliminate duplicate barriers on weakly-ordered archs
> ixgbe: eliminate duplicate barriers on weakly-ordered archs
> igbvf: eliminate duplicate barriers on weakly-ordered archs
> igb: eliminate duplicate barriers on weakly-ordered archs
> fm10k: Eliminate duplicate barriers on weakly-ordered archs
> ixgbevf: keep writel() closer to wmb()
> ixgbevf: eliminate duplicate barriers on weakly-ordered archs
>
> drivers/net/ethernet/intel/fm10k/fm10k_main.c | 15 ++-----------
> drivers/net/ethernet/intel/i40e/i40e_txrx.c | 22 ++++++++----------
> drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 8 +------
> drivers/net/ethernet/intel/igb/igb_main.c | 14 ++----------
> drivers/net/ethernet/intel/igbvf/netdev.c | 16 +++++---------
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 23 ++-----------------
> drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 5 -----
> drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 27 ++++++++---------------
> 8 files changed, 30 insertions(+), 100 deletions(-)
>
Sinan
--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
2018-04-03 2:59 ` Sinan Kaya
@ 2018-04-03 17:47 ` Alexander Duyck
2018-04-03 17:50 ` Sinan Kaya
0 siblings, 1 reply; 7+ messages in thread
From: Alexander Duyck @ 2018-04-03 17:47 UTC (permalink / raw)
To: Sinan Kaya, intel-wired-lan
Cc: Jeff Kirsher, Netdev, Timur Tabi, sulrich, linux-arm-msm,
linux-arm-kernel
On Mon, Apr 2, 2018 at 7:59 PM, Sinan Kaya <okaya@codeaurora.org> wrote:
> Alex,
>
> On 4/2/2018 3:06 PM, Sinan Kaya wrote:
>> Code includes wmb() followed by writel() in multiple places. writel()
>> already has a barrier on some architectures like arm64.
>>
>> This ends up CPU observing two barriers back to back before executing the
>> register write.
>>
>> Since code already has an explicit barrier call, changing writel() to
>> writel_relaxed().
>>
>> I did a regex search for wmb() followed by writel() in each drivers
>> directory. I scrubbed the ones I care about in this series.
>>
>> I considered "ease of change", "popular usage" and "performance critical
>> path" as the determining criteria for my filtering.
>>
>> We used relaxed API heavily on ARM for a long time but
>> it did not exist on other architectures. For this reason, relaxed
>> architectures have been paying double penalty in order to use the common
>> drivers.
>>
>> Now that relaxed API is present on all architectures, we can go and scrub
>> all drivers to see what needs to change and what can remain.
>>
>> We start with mostly used ones and hope to increase the coverage over time.
>> It will take a while to cover all drivers.
>>
>> Feel free to apply patches individually.
>>
>> Change since 7:
>>
>> * API clarification with Linus and several architecture folks regarding
>> writel()
>>
>> https://www.mail-archive.com/netdev@vger.kernel.org/msg225806.html
>>
>> * removed wmb() in front of writel() at several places.
>> * remove old IA64 comments regarding ordering.
>>
>
> What do you think about this version? Did I miss any SMP barriers?
I would say we should probably just drop the whole set and start over.
If we don't need the wmb() we are going to need to go through and
clean up all of these paths and consider the barriers when updating
the layout of the code.
For example I have been thinking about it and in the case of x86 we
are probably better off not bothering with the wmb() and
writel_relaxed() and instead switch over to the smp_wmb() and writel()
since in the case of a strongly ordered system like x86 or sparc this
ends up translating out to a couple of compile barriers.
I will also need time to reevaluate the Rx paths since dropping the
wmb() may have other effects which I need to verify.
Thanks.
- Alex
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
2018-04-03 17:47 ` Alexander Duyck
@ 2018-04-03 17:50 ` Sinan Kaya
2018-04-03 18:26 ` Jeff Kirsher
0 siblings, 1 reply; 7+ messages in thread
From: Sinan Kaya @ 2018-04-03 17:50 UTC (permalink / raw)
To: Alexander Duyck, intel-wired-lan
Cc: Jeff Kirsher, Netdev, Timur Tabi, sulrich, linux-arm-msm,
linux-arm-kernel
On 4/3/2018 1:47 PM, Alexander Duyck wrote:
> On Mon, Apr 2, 2018 at 7:59 PM, Sinan Kaya <okaya@codeaurora.org> wrote:
>> Alex,
>>
>> On 4/2/2018 3:06 PM, Sinan Kaya wrote:
>>> Code includes wmb() followed by writel() in multiple places. writel()
>>> already has a barrier on some architectures like arm64.
>>>
>>> This ends up CPU observing two barriers back to back before executing the
>>> register write.
>>>
>>> Since code already has an explicit barrier call, changing writel() to
>>> writel_relaxed().
>>>
>>> I did a regex search for wmb() followed by writel() in each drivers
>>> directory. I scrubbed the ones I care about in this series.
>>>
>>> I considered "ease of change", "popular usage" and "performance critical
>>> path" as the determining criteria for my filtering.
>>>
>>> We used relaxed API heavily on ARM for a long time but
>>> it did not exist on other architectures. For this reason, relaxed
>>> architectures have been paying double penalty in order to use the common
>>> drivers.
>>>
>>> Now that relaxed API is present on all architectures, we can go and scrub
>>> all drivers to see what needs to change and what can remain.
>>>
>>> We start with mostly used ones and hope to increase the coverage over time.
>>> It will take a while to cover all drivers.
>>>
>>> Feel free to apply patches individually.
>>>
>>> Change since 7:
>>>
>>> * API clarification with Linus and several architecture folks regarding
>>> writel()
>>>
>>> https://www.mail-archive.com/netdev@vger.kernel.org/msg225806.html
>>>
>>> * removed wmb() in front of writel() at several places.
>>> * remove old IA64 comments regarding ordering.
>>>
>>
>> What do you think about this version? Did I miss any SMP barriers?
>
> I would say we should probably just drop the whole set and start over.
> If we don't need the wmb() we are going to need to go through and
> clean up all of these paths and consider the barriers when updating
> the layout of the code.
>
> For example I have been thinking about it and in the case of x86 we
> are probably better off not bothering with the wmb() and
> writel_relaxed() and instead switch over to the smp_wmb() and writel()
> since in the case of a strongly ordered system like x86 or sparc this
> ends up translating out to a couple of compile barriers.
>
> I will also need time to reevaluate the Rx paths since dropping the
> wmb() may have other effects which I need to verify.
Sounds good, I'll let you work on it.
@Jeff Kirsher: can you drop this version from your development branch until
Alex posts the next version?
>
> Thanks.
>
> - Alex
>
--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
2018-04-03 17:50 ` Sinan Kaya
@ 2018-04-03 18:26 ` Jeff Kirsher
0 siblings, 0 replies; 7+ messages in thread
From: Jeff Kirsher @ 2018-04-03 18:26 UTC (permalink / raw)
To: Sinan Kaya, Alexander Duyck, intel-wired-lan
Cc: Netdev, Timur Tabi, sulrich, linux-arm-msm, linux-arm-kernel
[-- Attachment #1: Type: text/plain, Size: 1148 bytes --]
On Tue, 2018-04-03 at 13:50 -0400, Sinan Kaya wrote:
> > > What do you think about this version? Did I miss any SMP
> > > barriers?
> >
> > I would say we should probably just drop the whole set and start
> > over.
> > If we don't need the wmb() we are going to need to go through and
> > clean up all of these paths and consider the barriers when updating
> > the layout of the code.
> >
> > For example I have been thinking about it and in the case of x86 we
> > are probably better off not bothering with the wmb() and
> > writel_relaxed() and instead switch over to the smp_wmb() and
> > writel()
> > since in the case of a strongly ordered system like x86 or sparc
> > this
> > ends up translating out to a couple of compile barriers.
> >
> > I will also need time to reevaluate the Rx paths since dropping the
> > wmb() may have other effects which I need to verify.
>
> Sounds good, I'll let you work on it.
>
> @Jeff Kirsher: can you drop this version from your development branch
> until
> Alex posts the next version?
Already on it, I will work with Alex on any possible future versions of
these patches.
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
[not found] ` <CAKgT0UcbSfgg3mpv0Qr2-w8iHhbQMsyM3LcrM4YwAv8yTiq+Vw@mail.gmail.com>
@ 2018-07-30 19:20 ` Sinan Kaya
2018-07-30 20:06 ` Alexander Duyck
0 siblings, 1 reply; 7+ messages in thread
From: Sinan Kaya @ 2018-07-30 19:20 UTC (permalink / raw)
To: Alexander Duyck, netdev
+netdev,
On 7/30/2018 9:45 AM, Alexander Duyck wrote:
> I haven't had a chance to work on this much myself. My understanding
> is that igb has had the barriers updated, but I don't think any of the
> other drivers have been worked over yet.
Unfortunately, I have recently changed jobs and I no longer have the
hardware to test my changes. I thought that you wanted to handle this
yourself.
I haven't seen any follow ups. I wanted to double check.
I worked with several architecture leads on 4.17. All architectures
support the updated barrier semantics now. It is time to clean up the
network drivers.
92d7223a7423 alpha: io: reorder barriers to guarantee writeX() and
iowriteX() ordering #2
a1cc7034e33d MIPS: io: Add barrier after register read in readX()
f6b7aeee8f16 MIPS: io: Prevent compiler reordering writeX()
a71e7c44ffb7 io: change writeX_relaxed() to remove barriers
8875c5543761 io: change readX_relaxed() to remove barriers
cd0e00c10672 alpha: io: reorder barriers to guarantee writeX() and
iowriteX() ordering
87fe2d543f81 io: change inX() to have their own IO barrier overrides
a7851aa54c0c io: change outX() to have their own IO barrier overrides
755bd04aaf4b io: define stronger ordering for the default writeX()
implementation
032d59e1cde9 io: define stronger ordering for the default readX()
implementation
64e2c6738b4d io: define several IO & PIO barrier types for the
asm-generic version
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs
2018-07-30 19:20 ` [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
@ 2018-07-30 20:06 ` Alexander Duyck
0 siblings, 0 replies; 7+ messages in thread
From: Alexander Duyck @ 2018-07-30 20:06 UTC (permalink / raw)
To: Sinan Kaya; +Cc: Netdev
On Mon, Jul 30, 2018 at 12:20 PM, Sinan Kaya <okaya@kernel.org> wrote:
> +netdev,
>
> On 7/30/2018 9:45 AM, Alexander Duyck wrote:
>>
>> I haven't had a chance to work on this much myself. My understanding
>> is that igb has had the barriers updated, but I don't think any of the
>> other drivers have been worked over yet.
>
>
> Unfortunately, I have recently changed jobs and I no longer have the
> hardware to test my changes. I thought that you wanted to handle this
> yourself.
>
> I haven't seen any follow ups. I wanted to double check.
As I said so far igb has been the only one updated, and that was done
by a third party:
73017f4e051c8 igb: Use dma_wmb() instead of wmb() before doorbell writes
> I worked with several architecture leads on 4.17. All architectures
> support the updated barrier semantics now. It is time to clean up the
> network drivers.
Thanks for that update. As I said for now igb has the barriers
updated. The idea being that igb is the test vehicle for this so if we
go a kernel version or so without triggering any issues then we can
follow up with the other drivers.
The other thing we have to keep in mind is that unlike many other NICs
we have to also deal with emulations of our devices (e1000 and e1000e)
that may rely on certain barriers being used to enforce things like
SMP synchronization between CPUs, so we have to be careful as we roll
this out.
Thanks.
- Alex
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2018-07-30 21:43 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <173f1319-f030-4a3b-d730-86c7e113a747@kernel.org>
[not found] ` <CAKgT0UcbSfgg3mpv0Qr2-w8iHhbQMsyM3LcrM4YwAv8yTiq+Vw@mail.gmail.com>
2018-07-30 19:20 ` [PATCH v8 0/7] netdev: intel: Eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
2018-07-30 20:06 ` Alexander Duyck
2018-04-02 19:06 Sinan Kaya
2018-04-03 2:59 ` Sinan Kaya
2018-04-03 17:47 ` Alexander Duyck
2018-04-03 17:50 ` Sinan Kaya
2018-04-03 18:26 ` Jeff Kirsher
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).