* Re: [PATCH iwl-next v3] e1000e: Fix real-time violations on link up
2024-12-14 19:16 [PATCH iwl-next v3] e1000e: Fix real-time violations on link up Gerhard Engleder
@ 2024-12-16 11:16 ` Lifshits, Vitaly
2024-12-16 19:23 ` Gerhard Engleder
2024-12-18 15:08 ` [Intel-wired-lan] " Avigail Dahan
2024-12-18 15:23 ` Alexander Lobakin
2 siblings, 1 reply; 9+ messages in thread
From: Lifshits, Vitaly @ 2024-12-16 11:16 UTC (permalink / raw)
To: Gerhard Engleder, intel-wired-lan, netdev, linux-pci
Cc: anthony.l.nguyen, przemyslaw.kitszel, andrew+netdev, davem, kuba,
edumazet, pabeni, bhelgaas, pmenzel, Gerhard Engleder
On 12/14/2024 9:16 PM, Gerhard Engleder wrote:
> From: Gerhard Engleder <eg@keba.com>
>
> Link down and up triggers an update of the MTA table. This update
> executes many PCIe writes and a final flush. Thus, PCIe will be
> blocked until all writes are flushed. As a result, DMA transfers of
> other targets suffer delays in the range of 50us. This results in
> timing violations on real-time systems during link down and up of
> e1000e in combination with an Intel i3-2310E Sandy Bridge CPU.
>
> The i3-2310E is quite old; it was launched by Intel in 2011 but is
> still in use as a robot controller. The exact root cause of the
> problem is unclear, and this situation won't change, as Intel support
> for this CPU ended years ago. Our experience is that the number of
> posted PCIe writes needs to be limited, at least for real-time
> systems. Posted PCIe writes can generate much higher throughput than
> PCIe reads, which cannot be posted, so the load on the interconnect is
> much higher. Additionally, a PCIe read waits until all posted PCIe
> writes are done. Therefore, a PCIe read can block the CPU for much
> more than 10us if many PCIe writes were posted before it. Both issues
> are why we limit the number of posted PCIe writes in a row in general
> on our real-time systems, not only in this driver.
>
> A flush after a low enough number of posted PCIe writes eliminates the
> delay but also increases the time needed for MTA table update. The
> following measurements were done on i3-2310E with e1000e for 128 MTA
> table entries:
>
> Single flush after all writes: 106us
> Flush after every write: 429us
> Flush after every 2nd write: 266us
> Flush after every 4th write: 180us
> Flush after every 8th write: 141us
> Flush after every 16th write: 121us
>
> A flush after every 8th write delays the link up by 35us and the
> negative impact to DMA transfers of other targets is still tolerable.
>
> Execute a flush after every 8th write. This prevents overloading the
> interconnect with posted writes.
>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> CC: Vitaly Lifshits <vitaly.lifshits@intel.com>
> Link: https://lore.kernel.org/netdev/f8fe665a-5e6c-4f95-b47a-2f3281aa0e6c@lunn.ch/T/
> Signed-off-by: Gerhard Engleder <eg@keba.com>
> ---
> v3:
> - mention problematic platform explicitly (Bjorn Helgaas)
> - improve comment (Paul Menzel)
>
> v2:
> - remove PREEMPT_RT dependency (Andrew Lunn, Przemek Kitszel)
> ---
> drivers/net/ethernet/intel/e1000e/mac.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
> index d7df2a0ed629..0174c16bbb43 100644
> --- a/drivers/net/ethernet/intel/e1000e/mac.c
> +++ b/drivers/net/ethernet/intel/e1000e/mac.c
> @@ -331,8 +331,15 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
> }
>
> /* replace the entire MTA table */
> - for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
> + for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
> E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
> +
> + /* do not queue up too many posted writes to prevent increased
> + * latency for other devices on the interconnect
> + */
> + if ((i % 8) == 0 && i != 0)
> + e1e_flush();
I would prefer to avoid adding this code to all devices, particularly
those that don't operate on real-time systems. Implementing this code
will introduce three additional MMIO transactions, which will increase
the driver start time in various flows (up, probe, etc.).

Is there a specific reason not to use if (IS_ENABLED(CONFIG_PREEMPT_RT)),
as Andrew initially suggested?
> + }
> e1e_flush();
> }
>
^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH iwl-next v3] e1000e: Fix real-time violations on link up
2024-12-16 11:16 ` Lifshits, Vitaly
@ 2024-12-16 19:23 ` Gerhard Engleder
2024-12-18 8:36 ` Przemek Kitszel
0 siblings, 1 reply; 9+ messages in thread
From: Gerhard Engleder @ 2024-12-16 19:23 UTC (permalink / raw)
To: Lifshits, Vitaly, intel-wired-lan, netdev, linux-pci
Cc: anthony.l.nguyen, przemyslaw.kitszel, andrew+netdev, davem, kuba,
edumazet, pabeni, bhelgaas, pmenzel, Gerhard Engleder
>> @@ -331,8 +331,15 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
>> }
>> /* replace the entire MTA table */
>> - for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
>> + for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
>> E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
>> +
>> + /* do not queue up too many posted writes to prevent increased
>> + * latency for other devices on the interconnect
>> + */
>> + if ((i % 8) == 0 && i != 0)
>> + e1e_flush();
>
>
> I would prefer to avoid adding this code to all devices, particularly
> those that don't operate on real-time systems. Implementing this code
> will introduce three additional MMIO transactions which will increase
> the driver start time in various flows (up, probe, etc.).
>
> Is there a specific reason not to use if (IS_ENABLED(CONFIG_PREEMPT_RT))
> as Andrew initially suggested?
Andrew made two suggestions: IS_ENABLED(CONFIG_PREEMPT_RT), which I used
in the first version after the RFC, and checking for a compromise
between RT and non-RT performance, as some distros might enable
PREEMPT_RT in the future.
Przemek suggested removing the PREEMPT_RT check, as "this change sounds
reasonable also for the standard kernel", after the first version with
IS_ENABLED(CONFIG_PREEMPT_RT).

I used the PREEMPT_RT dependency to limit the effects to real-time
systems, to not make non-real-time systems slower. But I can also
follow the reasoning of Andrew and Przemek. With that said, I have no
problem adding IS_ENABLED(CONFIG_PREEMPT_RT) again.
Gerhard
* Re: [PATCH iwl-next v3] e1000e: Fix real-time violations on link up
2024-12-16 19:23 ` Gerhard Engleder
@ 2024-12-18 8:36 ` Przemek Kitszel
2024-12-18 19:21 ` Gerhard Engleder
0 siblings, 1 reply; 9+ messages in thread
From: Przemek Kitszel @ 2024-12-18 8:36 UTC (permalink / raw)
To: Gerhard Engleder, Lifshits, Vitaly
Cc: anthony.l.nguyen, andrew+netdev, davem, kuba, linux-pci, edumazet,
pabeni, bhelgaas, pmenzel, Gerhard Engleder, netdev,
intel-wired-lan
On 12/16/24 20:23, Gerhard Engleder wrote:
>>> @@ -331,8 +331,15 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
>>> }
>>> /* replace the entire MTA table */
>>> - for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
>>> + for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
>>> E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
>>> +
>>> + /* do not queue up too many posted writes to prevent increased
>>> + * latency for other devices on the interconnect
>>> + */
>>> + if ((i % 8) == 0 && i != 0)
>>> + e1e_flush();
>>
>>
>> I would prefer to avoid adding this code to all devices, particularly
>> those that don't operate on real-time systems. Implementing this code
>> will introduce three additional MMIO transactions which will increase
>> the driver start time in various flows (up, probe, etc.).
>>
>> Is there a specific reason not to use if
>> (IS_ENABLED(CONFIG_PREEMPT_RT)) as Andrew initially suggested?
>
> Andrew made two suggestions: IS_ENABLED(CONFIG_PREEMPT_RT), which I used
> in the first version after the RFC, and checking for a compromise
> between RT and non-RT performance, as some distros might enable
> PREEMPT_RT in the future.
> Przemek suggested removing the PREEMPT_RT check, as "this change sounds
> reasonable also for the standard kernel", after the first version with
> IS_ENABLED(CONFIG_PREEMPT_RT).
>
> I used the PREEMPT_RT dependency to limit the effects to real-time
> systems, to not make non-real-time systems slower. But I can also
> follow the reasoning of Andrew and Przemek. With that said, I have no
> problem adding IS_ENABLED(CONFIG_PREEMPT_RT) again.
>
> Gerhard
I'm also fine with limiting the change to RT kernels.
* Re: [PATCH iwl-next v3] e1000e: Fix real-time violations on link up
2024-12-18 8:36 ` Przemek Kitszel
@ 2024-12-18 19:21 ` Gerhard Engleder
0 siblings, 0 replies; 9+ messages in thread
From: Gerhard Engleder @ 2024-12-18 19:21 UTC (permalink / raw)
To: Przemek Kitszel, Lifshits, Vitaly
Cc: anthony.l.nguyen, andrew+netdev, davem, kuba, linux-pci, edumazet,
pabeni, bhelgaas, pmenzel, Gerhard Engleder, netdev,
intel-wired-lan
On 18.12.24 09:36, Przemek Kitszel wrote:
> On 12/16/24 20:23, Gerhard Engleder wrote:
>>>> @@ -331,8 +331,15 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
>>>> }
>>>> /* replace the entire MTA table */
>>>> - for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
>>>> + for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
>>>> E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
>>>> +
>>>> + /* do not queue up too many posted writes to prevent increased
>>>> + * latency for other devices on the interconnect
>>>> + */
>>>> + if ((i % 8) == 0 && i != 0)
>>>> + e1e_flush();
>>>
>>>
>>> I would prefer to avoid adding this code to all devices, particularly
>>> those that don't operate on real-time systems. Implementing this code
>>> will introduce three additional MMIO transactions which will increase
>>> the driver start time in various flows (up, probe, etc.).
>>>
>>> Is there a specific reason not to use if
>>> (IS_ENABLED(CONFIG_PREEMPT_RT)) as Andrew initially suggested?
>>
>> Andrew made two suggestions: IS_ENABLED(CONFIG_PREEMPT_RT), which I used
>> in the first version after the RFC, and checking for a compromise
>> between RT and non-RT performance, as some distros might enable
>> PREEMPT_RT in the future.
>> Przemek suggested removing the PREEMPT_RT check, as "this change sounds
>> reasonable also for the standard kernel", after the first version with
>> IS_ENABLED(CONFIG_PREEMPT_RT).
>>
>> I used the PREEMPT_RT dependency to limit the effects to real-time
>> systems, to not make non-real-time systems slower. But I can also
>> follow the reasoning of Andrew and Przemek. With that said, I have no
>> problem adding IS_ENABLED(CONFIG_PREEMPT_RT) again.
>>
>> Gerhard
>
> I'm also fine with limiting the change to RT kernels.
I will add IS_ENABLED(CONFIG_PREEMPT_RT).
Thanks!
Gerhard
* Re: [Intel-wired-lan] [PATCH iwl-next v3] e1000e: Fix real-time violations on link up
2024-12-14 19:16 [PATCH iwl-next v3] e1000e: Fix real-time violations on link up Gerhard Engleder
2024-12-16 11:16 ` Lifshits, Vitaly
@ 2024-12-18 15:08 ` Avigail Dahan
2024-12-18 19:21 ` Gerhard Engleder
2024-12-18 15:23 ` Alexander Lobakin
2 siblings, 1 reply; 9+ messages in thread
From: Avigail Dahan @ 2024-12-18 15:08 UTC (permalink / raw)
To: Gerhard Engleder, intel-wired-lan, netdev, linux-pci
Cc: anthony.l.nguyen, przemyslaw.kitszel, andrew+netdev, davem, kuba,
edumazet, pabeni, bhelgaas, pmenzel, Gerhard Engleder,
Vitaly Lifshits
On 14/12/2024 21:16, Gerhard Engleder wrote:
> From: Gerhard Engleder <eg@keba.com>
>
> Link down and up triggers an update of the MTA table. This update
> executes many PCIe writes and a final flush. Thus, PCIe will be
> blocked until all writes are flushed. As a result, DMA transfers of
> other targets suffer delays in the range of 50us. This results in
> timing violations on real-time systems during link down and up of
> e1000e in combination with an Intel i3-2310E Sandy Bridge CPU.
>
> The i3-2310E is quite old; it was launched by Intel in 2011 but is
> still in use as a robot controller. The exact root cause of the
> problem is unclear, and this situation won't change, as Intel support
> for this CPU ended years ago. Our experience is that the number of
> posted PCIe writes needs to be limited, at least for real-time
> systems. Posted PCIe writes can generate much higher throughput than
> PCIe reads, which cannot be posted, so the load on the interconnect is
> much higher. Additionally, a PCIe read waits until all posted PCIe
> writes are done. Therefore, a PCIe read can block the CPU for much
> more than 10us if many PCIe writes were posted before it. Both issues
> are why we limit the number of posted PCIe writes in a row in general
> on our real-time systems, not only in this driver.
>
> A flush after a low enough number of posted PCIe writes eliminates the
> delay but also increases the time needed for MTA table update. The
> following measurements were done on i3-2310E with e1000e for 128 MTA
> table entries:
>
> Single flush after all writes: 106us
> Flush after every write: 429us
> Flush after every 2nd write: 266us
> Flush after every 4th write: 180us
> Flush after every 8th write: 141us
> Flush after every 16th write: 121us
>
> A flush after every 8th write delays the link up by 35us and the
> negative impact to DMA transfers of other targets is still tolerable.
>
> Execute a flush after every 8th write. This prevents overloading the
> interconnect with posted writes.
>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> CC: Vitaly Lifshits <vitaly.lifshits@intel.com>
> Link: https://lore.kernel.org/netdev/f8fe665a-5e6c-4f95-b47a-2f3281aa0e6c@lunn.ch/T/
> Signed-off-by: Gerhard Engleder <eg@keba.com>
> ---
> v3:
> - mention problematic platform explicitly (Bjorn Helgaas)
> - improve comment (Paul Menzel)
>
> v2:
> - remove PREEMPT_RT dependency (Andrew Lunn, Przemek Kitszel)
> ---
> drivers/net/ethernet/intel/e1000e/mac.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
* Re: [Intel-wired-lan] [PATCH iwl-next v3] e1000e: Fix real-time violations on link up
2024-12-18 15:08 ` [Intel-wired-lan] " Avigail Dahan
@ 2024-12-18 19:21 ` Gerhard Engleder
0 siblings, 0 replies; 9+ messages in thread
From: Gerhard Engleder @ 2024-12-18 19:21 UTC (permalink / raw)
To: Avigail Dahan, intel-wired-lan, netdev, linux-pci
Cc: anthony.l.nguyen, przemyslaw.kitszel, andrew+netdev, davem, kuba,
edumazet, pabeni, bhelgaas, pmenzel, Gerhard Engleder,
Vitaly Lifshits
On 18.12.24 16:08, Avigail Dahan wrote:
>
>
> On 14/12/2024 21:16, Gerhard Engleder wrote:
>> From: Gerhard Engleder <eg@keba.com>
>>
>> Link down and up triggers an update of the MTA table. This update
>> executes many PCIe writes and a final flush. Thus, PCIe will be
>> blocked until all writes are flushed. As a result, DMA transfers of
>> other targets suffer delays in the range of 50us. This results in
>> timing violations on real-time systems during link down and up of
>> e1000e in combination with an Intel i3-2310E Sandy Bridge CPU.
>>
>> The i3-2310E is quite old; it was launched by Intel in 2011 but is
>> still in use as a robot controller. The exact root cause of the
>> problem is unclear, and this situation won't change, as Intel support
>> for this CPU ended years ago. Our experience is that the number of
>> posted PCIe writes needs to be limited, at least for real-time
>> systems. Posted PCIe writes can generate much higher throughput than
>> PCIe reads, which cannot be posted, so the load on the interconnect is
>> much higher. Additionally, a PCIe read waits until all posted PCIe
>> writes are done. Therefore, a PCIe read can block the CPU for much
>> more than 10us if many PCIe writes were posted before it. Both issues
>> are why we limit the number of posted PCIe writes in a row in general
>> on our real-time systems, not only in this driver.
>>
>> A flush after a low enough number of posted PCIe writes eliminates the
>> delay but also increases the time needed for MTA table update. The
>> following measurements were done on i3-2310E with e1000e for 128 MTA
>> table entries:
>>
>> Single flush after all writes: 106us
>> Flush after every write: 429us
>> Flush after every 2nd write: 266us
>> Flush after every 4th write: 180us
>> Flush after every 8th write: 141us
>> Flush after every 16th write: 121us
>>
>> A flush after every 8th write delays the link up by 35us and the
>> negative impact to DMA transfers of other targets is still tolerable.
>>
>> Execute a flush after every 8th write. This prevents overloading the
>> interconnect with posted writes.
>>
>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>> CC: Vitaly Lifshits <vitaly.lifshits@intel.com>
>> Link:
>> https://lore.kernel.org/netdev/f8fe665a-5e6c-4f95-b47a-2f3281aa0e6c@lunn.ch/T/
>> Signed-off-by: Gerhard Engleder <eg@keba.com>
>> ---
>> v3:
>> - mention problematic platform explicitly (Bjorn Helgaas)
>> - improve comment (Paul Menzel)
>>
>> v2:
>> - remove PREEMPT_RT dependency (Andrew Lunn, Przemek Kitszel)
>> ---
>> drivers/net/ethernet/intel/e1000e/mac.c | 9 ++++++++-
>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>
> Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Thank you for the test!
Gerhard
* Re: [PATCH iwl-next v3] e1000e: Fix real-time violations on link up
2024-12-14 19:16 [PATCH iwl-next v3] e1000e: Fix real-time violations on link up Gerhard Engleder
2024-12-16 11:16 ` Lifshits, Vitaly
2024-12-18 15:08 ` [Intel-wired-lan] " Avigail Dahan
@ 2024-12-18 15:23 ` Alexander Lobakin
2024-12-18 19:43 ` Gerhard Engleder
2 siblings, 1 reply; 9+ messages in thread
From: Alexander Lobakin @ 2024-12-18 15:23 UTC (permalink / raw)
To: Gerhard Engleder
Cc: intel-wired-lan, netdev, linux-pci, anthony.l.nguyen,
przemyslaw.kitszel, andrew+netdev, davem, kuba, edumazet, pabeni,
bhelgaas, pmenzel, Gerhard Engleder, Vitaly Lifshits
From: Gerhard Engleder <gerhard@engleder-embedded.com>
Date: Sat, 14 Dec 2024 20:16:23 +0100
> From: Gerhard Engleder <eg@keba.com>
>
> Link down and up triggers an update of the MTA table. This update
> executes many PCIe writes and a final flush. Thus, PCIe will be
> blocked until all writes are flushed. As a result, DMA transfers of
> other targets suffer delays in the range of 50us. This results in
> timing violations on real-time systems during link down and up of
> e1000e in combination with an Intel i3-2310E Sandy Bridge CPU.
>
> The i3-2310E is quite old; it was launched by Intel in 2011 but is
> still in use as a robot controller. The exact root cause of the
> problem is unclear, and this situation won't change, as Intel support
> for this CPU ended years ago. Our experience is that the number of
> posted PCIe writes needs to be limited, at least for real-time
> systems. Posted PCIe writes can generate much higher throughput than
> PCIe reads, which cannot be posted, so the load on the interconnect is
> much higher. Additionally, a PCIe read waits until all posted PCIe
> writes are done. Therefore, a PCIe read can block the CPU for much
> more than 10us if many PCIe writes were posted before it. Both issues
> are why we limit the number of posted PCIe writes in a row in general
> on our real-time systems, not only in this driver.
>
> A flush after a low enough number of posted PCIe writes eliminates the
> delay but also increases the time needed for MTA table update. The
> following measurements were done on i3-2310E with e1000e for 128 MTA
> table entries:
>
> Single flush after all writes: 106us
> Flush after every write: 429us
> Flush after every 2nd write: 266us
> Flush after every 4th write: 180us
> Flush after every 8th write: 141us
> Flush after every 16th write: 121us
>
> A flush after every 8th write delays the link up by 35us and the
> negative impact to DMA transfers of other targets is still tolerable.
>
> Execute a flush after every 8th write. This prevents overloading the
> interconnect with posted writes.
>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> CC: Vitaly Lifshits <vitaly.lifshits@intel.com>
> Link: https://lore.kernel.org/netdev/f8fe665a-5e6c-4f95-b47a-2f3281aa0e6c@lunn.ch/T/
> Signed-off-by: Gerhard Engleder <eg@keba.com>
> ---
> v3:
> - mention problematic platform explicitly (Bjorn Helgaas)
> - improve comment (Paul Menzel)
>
> v2:
> - remove PREEMPT_RT dependency (Andrew Lunn, Przemek Kitszel)
> ---
> drivers/net/ethernet/intel/e1000e/mac.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
> index d7df2a0ed629..0174c16bbb43 100644
> --- a/drivers/net/ethernet/intel/e1000e/mac.c
> +++ b/drivers/net/ethernet/intel/e1000e/mac.c
> @@ -331,8 +331,15 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
> }
>
> /* replace the entire MTA table */
> - for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
> + for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
> E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
> +
> + /* do not queue up too many posted writes to prevent increased
> + * latency for other devices on the interconnect
> + */
I think a multi-line comment should start with a capital letter and have
a '.' at the end of the sentence.
+ netdev code doesn't have the special rule for multi-line comments;
they should look the same way as in the rest of the kernel:
/*
* Do not queue up ...
* latency ...
*/
> + if ((i % 8) == 0 && i != 0)
> + e1e_flush();
IIRC explicit `== 0` / `!= 0` are considered redundant.
if (!(i % 8) && i)
I'd also mention in the comment above that this means "flush each 8th
write" and why exactly 8.
> + }
> e1e_flush();
> }
Thanks,
Olek
* Re: [PATCH iwl-next v3] e1000e: Fix real-time violations on link up
2024-12-18 15:23 ` Alexander Lobakin
@ 2024-12-18 19:43 ` Gerhard Engleder
0 siblings, 0 replies; 9+ messages in thread
From: Gerhard Engleder @ 2024-12-18 19:43 UTC (permalink / raw)
To: Alexander Lobakin
Cc: intel-wired-lan, netdev, linux-pci, anthony.l.nguyen,
przemyslaw.kitszel, andrew+netdev, davem, kuba, edumazet, pabeni,
bhelgaas, pmenzel, Gerhard Engleder, Vitaly Lifshits
On 18.12.24 16:23, Alexander Lobakin wrote:
> From: Gerhard Engleder <gerhard@engleder-embedded.com>
> Date: Sat, 14 Dec 2024 20:16:23 +0100
>
>> From: Gerhard Engleder <eg@keba.com>
>>
>> Link down and up triggers an update of the MTA table. This update
>> executes many PCIe writes and a final flush. Thus, PCIe will be
>> blocked until all writes are flushed. As a result, DMA transfers of
>> other targets suffer delays in the range of 50us. This results in
>> timing violations on real-time systems during link down and up of
>> e1000e in combination with an Intel i3-2310E Sandy Bridge CPU.
>>
>> The i3-2310E is quite old; it was launched by Intel in 2011 but is
>> still in use as a robot controller. The exact root cause of the
>> problem is unclear, and this situation won't change, as Intel support
>> for this CPU ended years ago. Our experience is that the number of
>> posted PCIe writes needs to be limited, at least for real-time
>> systems. Posted PCIe writes can generate much higher throughput than
>> PCIe reads, which cannot be posted, so the load on the interconnect is
>> much higher. Additionally, a PCIe read waits until all posted PCIe
>> writes are done. Therefore, a PCIe read can block the CPU for much
>> more than 10us if many PCIe writes were posted before it. Both issues
>> are why we limit the number of posted PCIe writes in a row in general
>> on our real-time systems, not only in this driver.
>>
>> A flush after a low enough number of posted PCIe writes eliminates the
>> delay but also increases the time needed for MTA table update. The
>> following measurements were done on i3-2310E with e1000e for 128 MTA
>> table entries:
>>
>> Single flush after all writes: 106us
>> Flush after every write: 429us
>> Flush after every 2nd write: 266us
>> Flush after every 4th write: 180us
>> Flush after every 8th write: 141us
>> Flush after every 16th write: 121us
>>
>> A flush after every 8th write delays the link up by 35us and the
>> negative impact to DMA transfers of other targets is still tolerable.
>>
>> Execute a flush after every 8th write. This prevents overloading the
>> interconnect with posted writes.
>>
>> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
>> CC: Vitaly Lifshits <vitaly.lifshits@intel.com>
>> Link: https://lore.kernel.org/netdev/f8fe665a-5e6c-4f95-b47a-2f3281aa0e6c@lunn.ch/T/
>> Signed-off-by: Gerhard Engleder <eg@keba.com>
>> ---
>> v3:
>> - mention problematic platform explicitly (Bjorn Helgaas)
>> - improve comment (Paul Menzel)
>>
>> v2:
>> - remove PREEMPT_RT dependency (Andrew Lunn, Przemek Kitszel)
>> ---
>> drivers/net/ethernet/intel/e1000e/mac.c | 9 ++++++++-
>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c
>> index d7df2a0ed629..0174c16bbb43 100644
>> --- a/drivers/net/ethernet/intel/e1000e/mac.c
>> +++ b/drivers/net/ethernet/intel/e1000e/mac.c
>> @@ -331,8 +331,15 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw,
>> }
>>
>> /* replace the entire MTA table */
>> - for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
>> + for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
>> E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]);
>> +
>> + /* do not queue up too many posted writes to prevent increased
>> + * latency for other devices on the interconnect
>> + */
>
> I think a multi-line comment should start with a capital letter and have
> a '.' at the end of the sentence.
>
> + netdev code doesn't have the special rule for multi-line comments,
> they should look the same way as in the rest of the kernel:
>
> /*
> * Do not queue up ...
> * latency ...
> */
Oh, the preferred style changed; I missed that. Will be done.
>> + if ((i % 8) == 0 && i != 0)
>> + e1e_flush();
>
> IIRC explicit `== 0` / `!= 0` are considered redundant.
>
> if (!(i % 8) && i)
You are right; it will be changed.
>
> I'd also mention in the comment above that this means "flush each 8th
> write" and why exactly 8.
I will add that information to the comment.
Thank you for the review!
Gerhard