[net-next 07/14] i40e/i40evf: bump tail only in multiples of 8

Netdev List
 help / color / mirror / Atom feed

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
To: davem@davemloft.net
Cc: Jacob Keller <jacob.e.keller@intel.com>,
	netdev@vger.kernel.org, nhorman@redhat.com, sassmann@redhat.com,
	jogreene@redhat.com, Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Subject: [net-next 07/14] i40e/i40evf: bump tail only in multiples of 8
Date: Mon,  9 Oct 2017 15:38:34 -0700	[thread overview]
Message-ID: <20171009223841.2557-8-jeffrey.t.kirsher@intel.com> (raw)
In-Reply-To: <20171009223841.2557-1-jeffrey.t.kirsher@intel.com>

From: Jacob Keller <jacob.e.keller@intel.com>

Hardware only fetches descriptors on cachelines of 8, essentially
ignoring the lower 3 bits of the tail register. Thus, it is pointless to
bump tail by an unaligned access as the hardware will ignore some of the
new descriptors we allocated. Thus, it's ideal if we can ensure tail
writes are always aligned to 8.

At first, it seems like we'd already do this, since we allocate
descriptors in batches which are a multiple of 8. Since we'd always
increment by a multiple of 8, it seems like the value should always be
aligned.

However, this ignores allocation failures. If we fail to allocate
a buffer, our tail register will become unaligned. Once it has become
unaligned it will essentially be stuck unaligned until a buffer
allocation happens to fail at the exact amount necessary to re-align it.

We can do better, by simply rounding down the number of buffers we're
about to allocate (cleaned_count) such that "next_to_clean
+ cleaned_count" is rounded to the nearest multiple of 8.

We do this by calculating how far off that value is and subtracting it
from the cleaned_count. This essentially defers allocation of buffers if
they're going to be ignored by hardware anyways, and re-aligns our
next_to_use and tail values after a failure to allocate a descriptor.

This calculation ensures that we always align the tail writes in a way
the hardware expects and don't unnecessarily allocate buffers which
won't be fetched immediately.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 9 +++++++++
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 9 +++++++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 616abf79253e..a23306f04e00 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1372,6 +1372,15 @@ bool i40e_alloc_rx_buffers(struct i40e_ring *rx_ring, u16 cleaned_count)
 	union i40e_rx_desc *rx_desc;
 	struct i40e_rx_buffer *bi;
 
+	/* Hardware only fetches new descriptors in cache lines of 8,
+	 * essentially ignoring the lower 3 bits of the tail register. We want
+	 * to ensure our tail writes are aligned to avoid unnecessary work. We
+	 * can't simply round down the cleaned count, since we might fail to
+	 * allocate some buffers. What we really want is to ensure that
+	 * next_to_used + cleaned_count produces an aligned value.
+	 */
+	cleaned_count -= (ntu + cleaned_count) & 0x7;
+
 	/* do nothing if no valid netdev defined */
 	if (!rx_ring->netdev || !cleaned_count)
 		return false;
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index fe817e2b6fef..6806ada11490 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -711,6 +711,15 @@ bool i40evf_alloc_rx_buffers(struct i40e_ring *rx_ring, u16 cleaned_count)
 	union i40e_rx_desc *rx_desc;
 	struct i40e_rx_buffer *bi;
 
+	/* Hardware only fetches new descriptors in cache lines of 8,
+	 * essentially ignoring the lower 3 bits of the tail register. We want
+	 * to ensure our tail writes are aligned to avoid unnecessary work. We
+	 * can't simply round down the cleaned count, since we might fail to
+	 * allocate some buffers. What we really want is to ensure that
+	 * next_to_used + cleaned_count produces an aligned value.
+	 */
+	cleaned_count -= (ntu + cleaned_count) & 0x7;
+
 	/* do nothing if no valid netdev defined */
 	if (!rx_ring->netdev || !cleaned_count)
 		return false;
-- 
2.14.2

next prev parent reply	other threads:[~2017-10-09 22:39 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-09 22:38 [net-next 00/14][pull request] 40GbE Intel Wired LAN Driver Updates 2017-10-09 Jeff Kirsher
2017-10-09 22:38 ` [net-next 01/14] i40e: fix flags declaration Jeff Kirsher
2017-10-09 22:38 ` [net-next 02/14] i40e: use the safe hash table iterator when deleting mac filters Jeff Kirsher
2017-10-09 22:38 ` [net-next 03/14] i40evf: fix mac filter removal timing issue Jeff Kirsher
2017-10-09 22:38 ` [net-next 04/14] i40e/i40evf: fix incorrect default ITR values on driver load Jeff Kirsher
2017-10-09 22:38 ` [net-next 05/14] i40e/i40evf: always set the CLEARPBA flag when re-enabling interrupts Jeff Kirsher
2017-10-10 13:07   ` David Laight
2017-10-09 22:38 ` [net-next 06/14] i40e: reduce lrxqthresh from 2 to 1 Jeff Kirsher
2017-10-09 22:38 ` Jeff Kirsher [this message]
2017-10-09 22:38 ` [net-next 08/14] i40e/i40evf: bundle more descriptors when allocating buffers Jeff Kirsher
2017-10-09 22:38 ` [net-next 09/14] i40e: allow XPS with QoS enabled Jeff Kirsher
2017-10-09 22:38 ` [net-next 10/14] i40e: add check for return from find_first_bit call Jeff Kirsher
2017-10-09 22:38 ` [net-next 11/14] i40e: Retry AQC GetPhyAbilities to overcome I2CRead hangs Jeff Kirsher
2017-10-09 22:38 ` [net-next 12/14] i40e: use a local variable instead of calculating multiple times Jeff Kirsher
2017-10-09 22:38 ` [net-next 13/14] i40e: fix a typo Jeff Kirsher
2017-10-09 22:38 ` [net-next 14/14] i40e: Avoid some useless variables and initializers in NVM functions Jeff Kirsher
2017-10-10  1:12 ` [net-next 00/14][pull request] 40GbE Intel Wired LAN Driver Updates 2017-10-09 David Miller

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:616abf79253 dfblob:a23306f04e0 dfblob:fe817e2b6fe
dfblob:6806ada1149 )
 OR (
bs:"[net-next 07/14] i40e/i40evf: bump tail only in multiples of 8" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171009223841.2557-8-jeffrey.t.kirsher@intel.com \
    --to=jeffrey.t.kirsher@intel.com \
    --cc=davem@davemloft.net \
    --cc=jacob.e.keller@intel.com \
    --cc=jogreene@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=nhorman@redhat.com \
    --cc=sassmann@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox