From: Marc Zyngier <maz@kernel.org>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Thierry Reding <thierry.reding@gmail.com>,
	Matteo Croce <mcroce@linux.microsoft.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-riscv@lists.infradead.org,
	Giuseppe Cavallaro <peppe.cavallaro@st.com>,
	Alexandre Torgue <alexandre.torgue@foss.st.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Drew Fustini <drew@beagleboard.org>,
	Emil Renner Berthing <kernel@esmil.dk>,
	Jon Hunter <jonathanh@nvidia.com>, Will Deacon <will@kernel.org>
Subject: Re: [PATCH net-next] stmmac: align RX buffers
Date: Wed, 11 Aug 2021 15:16:18 +0100
Message-ID: <87o8a49idp.wl-maz@kernel.org>
In-Reply-To: <202417ef-f8ae-895d-4d07-1f9f3d89b4a4@gmail.com>

On Wed, 11 Aug 2021 13:53:59 +0100,
Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> On 8/11/21 12:28 PM, Thierry Reding wrote:
> > On Tue, Aug 10, 2021 at 08:07:47PM +0100, Marc Zyngier wrote:
> >> Hi all,
> >>
> >> [adding Thierry, Jon and Will to the fun]
> >>
> >> On Mon, 14 Jun 2021 03:25:04 +0100,
> >> Matteo Croce <mcroce@linux.microsoft.com> wrote:
> >>>
> >>> From: Matteo Croce <mcroce@microsoft.com>
> >>>
> >>> On RX an SKB is allocated and the received buffer is copied into it.
> >>> But on some architectures, the memcpy() needs the source and destination
> >>> buffers to have the same alignment to be efficient.
> >>>
> >>> This is not our case, because the SKB data pointer is misaligned by two
> >>> bytes to compensate for the Ethernet header.
> >>>
> >>> Align the RX buffer the same way as the SKB one, so the copy is faster.
> >>> An iperf3 RX test gives a decent improvement on a RISC-V machine:
> >>>
> >>> before:
> >>> [ ID] Interval           Transfer     Bitrate         Retr
> >>> [  5]   0.00-10.00  sec   733 MBytes   615 Mbits/sec   88             sender
> >>> [  5]   0.00-10.01  sec   730 MBytes   612 Mbits/sec                  receiver
> >>>
> >>> after:
> >>> [ ID] Interval           Transfer     Bitrate         Retr
> >>> [  5]   0.00-10.00  sec  1.10 GBytes   942 Mbits/sec    0             sender
> >>> [  5]   0.00-10.00  sec  1.09 GBytes   940 Mbits/sec                  receiver
> >>>
> >>> And the memcpy() overhead during the RX drops dramatically.
> >>>
> >>> before:
> >>> Overhead  Shared O  Symbol
> >>>   43.35%  [kernel]  [k] memcpy
> >>>   33.77%  [kernel]  [k] __asm_copy_to_user
> >>>    3.64%  [kernel]  [k] sifive_l2_flush64_range
> >>>
> >>> after:
> >>> Overhead  Shared O  Symbol
> >>>   45.40%  [kernel]  [k] __asm_copy_to_user
> >>>   28.09%  [kernel]  [k] memcpy
> >>>    4.27%  [kernel]  [k] sifive_l2_flush64_range
> >>>
> >>> Signed-off-by: Matteo Croce <mcroce@microsoft.com>
> >>
> >> This patch completely breaks my Jetson TX2 system, composed of 2
> >> Nvidia Denver and 4 Cortex-A57, in a very "funny" way.
> >>
> >> Any significant amount of traffic results in all sorts of corruption
> >> (ssh connections get dropped, Debian packages downloaded have the
> >> wrong checksums) if any Denver core is involved in any significant way
> >> (packet processing, interrupt handling). And it is all triggered by
> >> this very change.
> >>
> >> The only way I have to make it work on a Denver core is to route the
> >> interrupt to that particular core and taskset the workload to it. Any
> >> other configuration involving a Denver CPU results in some sort of
> >> corruption. On their own, the A57s are fine.
> >>
> >> This smells of memory ordering going really wrong, which this change
> >> would expose. I haven't had a chance to dig into the driver yet (it
> >> took me long enough to bisect it), but if someone points me at what is
> >> supposed to synchronise the DMA when receiving an interrupt, I'll have
> >> a look.
> > 
> > I recall that Jon was looking into a similar issue recently, though I
> > think the failure mode was slightly different. I also vaguely recall
> > that CPU frequency was impacting this to some degree (lower CPU
> > frequencies would increase the chances of this happening).
> > 
> > Jon's currently out of office, but let me try and dig up the details
> > on this.
> > 
> > Thierry
> > 
> >>
> >> Thanks,
> >>
> >> 	M.
> >>
> >>> ---
> >>>  drivers/net/ethernet/stmicro/stmmac/stmmac.h | 4 ++--
> >>>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> >>> index b6cd43eda7ac..04bdb3950d63 100644
> >>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> >>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> >>> @@ -338,9 +338,9 @@ static inline bool stmmac_xdp_is_enabled(struct stmmac_priv *priv)
> >>>  static inline unsigned int stmmac_rx_offset(struct stmmac_priv *priv)
> >>>  {
> >>>  	if (stmmac_xdp_is_enabled(priv))
> >>> -		return XDP_PACKET_HEADROOM;
> >>> +		return XDP_PACKET_HEADROOM + NET_IP_ALIGN;
> >>>  
> >>> -	return 0;
> >>> +	return NET_SKB_PAD + NET_IP_ALIGN;
> >>>  }
> >>>  
> >>>  void stmmac_disable_rx_queue(struct stmmac_priv *priv, u32 queue);
> >>> -- 
> >>> 2.31.1
> >>>
> >>>
> >>
> >> -- 
> >> Without deviation from the norm, progress is not possible.
> 
> Are you sure you do not need to adjust stmmac_set_bfsize(),
> stmmac_rx_buf1_len() and stmmac_rx_buf2_len()?
> 
> Presumably DEFAULT_BUFSIZE also wants to be increased by NET_SKB_PAD.
> 
> Patch for stmmac_rx_buf1_len() :
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 7b8404a21544cf29668e8a14240c3971e6bce0c3..041a74e7efca3436bfe3e17f972dd156173957a9 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -4508,12 +4508,12 @@ static unsigned int stmmac_rx_buf1_len(struct stmmac_priv *priv,
>  
>         /* First descriptor, not last descriptor and not split header */
>         if (status & rx_not_ls)
> -               return priv->dma_buf_sz;
> +               return priv->dma_buf_sz - NET_SKB_PAD - NET_IP_ALIGN;
>  
>         plen = stmmac_get_rx_frame_len(priv, p, coe);
>  
>         /* First descriptor and last descriptor and not split header */
> -       return min_t(unsigned int, priv->dma_buf_sz, plen);
> +       return min_t(unsigned int, priv->dma_buf_sz - NET_SKB_PAD - NET_IP_ALIGN, plen);
>  }
>  
>  static unsigned int stmmac_rx_buf2_len(struct stmmac_priv *priv,

Feels like a major deficiency of the original patch. Happy to test a
more complete patch if/when you have one.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
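
For readers following the arithmetic behind Eric's question, here is a
minimal standalone sketch, not driver code. It assumes the DMA engine
starts writing at the offset returned by stmmac_rx_offset() inside a
buffer of dma_buf_sz bytes. All ex_* names and the numeric values
(64, 2, 1536) are hypothetical stand-ins chosen for illustration; the
real NET_SKB_PAD, NET_IP_ALIGN and buffer size depend on architecture
and configuration.

```c
#include <assert.h>

/* Illustrative values only; the kernel's real values vary by arch/config. */
#define EX_NET_SKB_PAD   64u    /* stand-in for NET_SKB_PAD */
#define EX_NET_IP_ALIGN  2u     /* stand-in for NET_IP_ALIGN */
#define EX_DMA_BUF_SZ    1536u  /* stand-in for priv->dma_buf_sz */

/* Headroom reserved in front of the frame, as in the non-XDP case. */
static unsigned int ex_rx_offset(void)
{
	return EX_NET_SKB_PAD + EX_NET_IP_ALIGN;
}

/* Bytes left for the device once the headroom has been reserved. */
static unsigned int ex_rx_usable_len(unsigned int buf_sz, unsigned int offset)
{
	assert(offset <= buf_sz);
	return buf_sz - offset;
}

/* Buffer 1 length for a frame of plen bytes held in a single descriptor. */
static unsigned int ex_rx_buf1_len(unsigned int buf_sz, unsigned int offset,
				   unsigned int plen)
{
	unsigned int usable = ex_rx_usable_len(buf_sz, offset);

	return plen < usable ? plen : usable;
}

/* Nonzero if writing len bytes starting at offset runs past the buffer. */
static int ex_overruns(unsigned int buf_sz, unsigned int offset,
		       unsigned int len)
{
	return offset + len > buf_sz;
}
```

With these stand-in numbers, ex_overruns(EX_DMA_BUF_SZ, ex_rx_offset(),
EX_DMA_BUF_SZ) is nonzero: a length of dma_buf_sz no longer fits once
the write starts 66 bytes in, which is the gap the stmmac_rx_buf1_len()
change above is aimed at. Whether this explains the corruption seen on
the Denver cores is not established in this message.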



