From: Alexander Duyck <alexander.duyck@gmail.com>
To: intel-wired-lan@osuosl.org
Subject: [Intel-wired-lan] [next-queue 13/17] fm10k: Update adaptive ITR algorithm
Date: Wed, 14 Oct 2015 11:35:56 -0700
Message-ID: <561EA08C.8090705@gmail.com>
In-Reply-To: <1444779554-20464-13-git-send-email-jacob.e.keller@intel.com>
On 10/13/2015 04:39 PM, Jacob Keller wrote:
> The existing adaptive ITR algorithm is overly restrictive. It throttles
> incorrectly for various traffic rates, and does not produce good
> performance. The algorithm now allows for more interrupts per second,
> and does some calculation to help improve for smaller packet loads. In
> addition, take into account the new itr_scale from the hardware which
> indicates how much to scale due to PCIe link speed.
>
> A single thread of receiving TCP_STREAM in netperf:
> - Before: 450 Mbps
> - After: 20,000 Mbps
>
> Reported-by: Matthew Vick <matthew.vick@intel.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> ---
> drivers/net/ethernet/intel/fm10k/fm10k.h | 1 +
> drivers/net/ethernet/intel/fm10k/fm10k_main.c | 29 +++++++++++++++++++++++----
> drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 6 ++++--
> 3 files changed, 30 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/fm10k/fm10k.h b/drivers/net/ethernet/intel/fm10k/fm10k.h
> index c40f50737d17..ceddf39d7cec 100644
> --- a/drivers/net/ethernet/intel/fm10k/fm10k.h
> +++ b/drivers/net/ethernet/intel/fm10k/fm10k.h
> @@ -164,6 +164,7 @@ struct fm10k_ring_container {
> unsigned int total_packets; /* total packets processed this int */
> u16 work_limit; /* total work allowed per interrupt */
> u16 itr; /* interrupt throttle rate value */
> + u8 itr_scale; /* ITR adjustment scaler based on PCI speed */
> u8 count; /* total number of rings in vector */
> };
>
> diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
> index babde3e4b2bb..cae6b4e309a9 100644
> --- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
> +++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
> @@ -1386,11 +1386,30 @@ static void fm10k_update_itr(struct fm10k_ring_container *ring_container)
> if (avg_wire_size > 3000)
> avg_wire_size = 3000;
>
> - /* Give a little boost to mid-size frames */
> - if ((avg_wire_size > 300) && (avg_wire_size < 1200))
> - avg_wire_size /= 3;
> + /* Throttle rate management based on average wire size, attempting to
> + * slightly boost small and medium packet loads. Divide the average
> + * wire size by a small factor to calculate the minimum time until the
> + * next interrupt in microseconds. Save some cycles by using a
> + * multiply then a shift, which also accounts for difference due to
> + * PCIe link speed.
> + */
> +#define FM10K_ITR_SCALE_SMALL 6
> +#define FM10K_ITR_SCALE_MEDIUM 5
> +#define FM10K_ITR_SCALE_LARGE 4
> +
> + if (avg_wire_size < 300)
> + avg_wire_size *= FM10K_ITR_SCALE_SMALL;
> + else if ((avg_wire_size >= 300) && (avg_wire_size < 1200))
> + avg_wire_size *= FM10K_ITR_SCALE_MEDIUM;
> else
> - avg_wire_size /= 2;
> + avg_wire_size *= FM10K_ITR_SCALE_LARGE;
> +

Where did these scaling values originate? Just looking through the
values, I am not sure that breaking them out like this provides much
value.

What I am getting at is that the input is a packet size, and the output
is a value somewhere between 2 and 47. (By the way, I think that 47 is
still a bit high and should probably be something more like 25, which I
believe you established as the minimum Tx interrupt rate in a later patch.)
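
To make the range concrete, here is a minimal standalone sketch of the
mapping in the quoted hunk. It covers only the clamp, the scale step, and
the round-up shift shown above, not the rest of fm10k_update_itr(). The
helper name patch_itr_usecs() is hypothetical; the constants are copied
from the patch.

```c
#include <assert.h>

/* Scale factors copied from the quoted patch. */
#define FM10K_ITR_SCALE_SMALL	6
#define FM10K_ITR_SCALE_MEDIUM	5
#define FM10K_ITR_SCALE_LARGE	4

/* Hypothetical standalone helper, not a function in the driver. */
static unsigned int patch_itr_usecs(unsigned int avg_wire_size,
				    unsigned int itr_scale)
{
	unsigned int shift = itr_scale + 8;

	assert(shift < 32);	/* keep the shift well defined */

	/* same upper clamp as the context lines of the hunk */
	if (avg_wire_size > 3000)
		avg_wire_size = 3000;

	if (avg_wire_size < 300)
		avg_wire_size *= FM10K_ITR_SCALE_SMALL;
	else if (avg_wire_size < 1200)
		avg_wire_size *= FM10K_ITR_SCALE_MEDIUM;
	else
		avg_wire_size *= FM10K_ITR_SCALE_LARGE;

	/* round up, then shift, as in the patch */
	avg_wire_size += (1u << shift) - 1;
	return avg_wire_size >> shift;
}
```

With itr_scale at 0, a 3000-byte average gives 47. The result also drops
at each boundary: 299 bytes gives 8 while 300 gives 6, and 1199 gives 24
while 1200 gives 19.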

What you may want to do is look at pulling in the upper limit for
avg_wire_size to something more reasonable like 1536, and then simplify
this logic a bit. Specifically, what are you trying to accomplish by
tweaking the scale factor this way? I assume you want to approximate a
curve. If so, you might want to consider including an offset value so
that you can smooth out the points where your segments intersect.

For example, you may want to consider using a multiplication factor for
small, an addition value for medium, and for large simply capping it at
a certain value.
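
A rough sketch of that shape, under stated assumptions: every SKETCH_*
constant and the function name sketch_itr_usecs() are placeholders
invented for illustration, not tuned or proposed values. The offset is
chosen so the small and medium segments meet at the 256-byte boundary,
and the large case is handled by clamping the input at 1536.

```c
#include <assert.h>

/* Placeholder values for illustration only, not tuned for hardware. */
#define SKETCH_SMALL_LIMIT	256	/* below this: multiply */
#define SKETCH_SMALL_MULT	16
#define SKETCH_MEDIUM_OFFSET	3840	/* 256 * 16 - 256, joins the segments */
#define SKETCH_SIZE_CAP		1536	/* at or above this: fixed value */

/* Hypothetical helper, not a function in the driver. */
static unsigned int sketch_itr_usecs(unsigned int avg_wire_size,
				     unsigned int itr_scale)
{
	unsigned int shift = itr_scale + 8;

	assert(shift < 32);	/* keep the shift well defined */

	/* large: clamp the input, which caps the output */
	if (avg_wire_size > SKETCH_SIZE_CAP)
		avg_wire_size = SKETCH_SIZE_CAP;

	if (avg_wire_size < SKETCH_SMALL_LIMIT)
		avg_wire_size *= SKETCH_SMALL_MULT;	/* small: multiply */
	else
		avg_wire_size += SKETCH_MEDIUM_OFFSET;	/* medium: add offset */

	/* round up, then shift, as in the patch */
	avg_wire_size += (1u << shift) - 1;
	return avg_wire_size >> shift;
}
```

With these placeholders and itr_scale at 0, 255 and 256 bytes both map
to 16, so there is no step at the boundary, and anything at or above
1536 bytes maps to 21.
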
> + /* Round up average wire size, then perform bit shift, to ensure that
> + * the calculation will never get below 1. Account for changes in ITR
> + * value due to PCIe link speed.
> + */
> + avg_wire_size += (1 << (ring_container->itr_scale + 8)) - 1;
> + avg_wire_size >>= ring_container->itr_scale + 8;
>

You might want to store the value of itr_scale + 8 in a local variable.
That would likely save a cycle or two, especially if the compiler thinks
it has to read itr_scale twice.
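
A minimal sketch of that change, assuming a cut-down stand-in for the
ring container (struct ring_container_sketch and round_up_shift() are
hypothetical names; only the itr_scale field comes from the patch):

```c
#include <assert.h>

/* Stand-in for struct fm10k_ring_container, reduced to one field. */
struct ring_container_sketch {
	unsigned char itr_scale;	/* u8 in the patch */
};

/* Hypothetical helper: round up, then shift, reading itr_scale once. */
static unsigned int round_up_shift(const struct ring_container_sketch *rc,
				   unsigned int scaled_size)
{
	/* single load of itr_scale, reused for both operations */
	const unsigned int shift = rc->itr_scale + 8u;

	assert(shift < 32);	/* keep the shift well defined */

	scaled_size += (1u << shift) - 1;
	return scaled_size >> shift;
}
```

Whether this actually saves a load depends on what the compiler can prove
about aliasing; the local mainly makes the single read explicit.
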
> /* write back value and retain adaptive flag */
> ring_container->itr = avg_wire_size | FM10K_ITR_ADAPTIVE;
> @@ -1608,6 +1627,7 @@ static int fm10k_alloc_q_vector(struct fm10k_intfc *interface,
> q_vector->tx.ring = ring;
> q_vector->tx.work_limit = FM10K_DEFAULT_TX_WORK;
> q_vector->tx.itr = interface->tx_itr;
> + q_vector->tx.itr_scale = interface->hw.mac.itr_scale;
> q_vector->tx.count = txr_count;
>
> while (txr_count) {
> @@ -1636,6 +1656,7 @@ static int fm10k_alloc_q_vector(struct fm10k_intfc *interface,
> /* save Rx ring container info */
> q_vector->rx.ring = ring;
> q_vector->rx.itr = interface->rx_itr;
> + q_vector->rx.itr_scale = interface->hw.mac.itr_scale;
> q_vector->rx.count = rxr_count;
>
> while (rxr_count) {
> diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> index 9ad9f9164d91..cbf38da0ada7 100644
> --- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> +++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
> @@ -880,7 +880,8 @@ static irqreturn_t fm10k_msix_mbx_vf(int __always_unused irq, void *data)
>
> /* re-enable mailbox interrupt and indicate 20us delay */
> fm10k_write_reg(hw, FM10K_VFITR(FM10K_MBX_VECTOR),
> - FM10K_ITR_ENABLE | FM10K_MBX_INT_DELAY);
> + FM10K_ITR_ENABLE | (FM10K_MBX_INT_DELAY >>
> + hw->mac.itr_scale));
>
> /* service upstream mailbox */
> if (fm10k_mbx_trylock(interface)) {
> @@ -1111,7 +1112,8 @@ static irqreturn_t fm10k_msix_mbx_pf(int __always_unused irq, void *data)
>
> /* re-enable mailbox interrupt and indicate 20us delay */
> fm10k_write_reg(hw, FM10K_ITR(FM10K_MBX_VECTOR),
> - FM10K_ITR_ENABLE | FM10K_MBX_INT_DELAY);
> + FM10K_ITR_ENABLE | (FM10K_MBX_INT_DELAY >>
> + hw->mac.itr_scale));
>
> return IRQ_HANDLED;
> }
>