From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Paul Menzel <pmenzel@molgen.mpg.de>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
Larysa Zaremba <larysa.zaremba@intel.com>,
<netdev@vger.kernel.org>, Alexander Duyck <alexanderduyck@fb.com>,
"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
<linux-kernel@vger.kernel.org>,
Yunsheng Lin <linyunsheng@huawei.com>,
Michal Kubiak <michal.kubiak@intel.com>,
<intel-wired-lan@lists.osuosl.org>,
"David Christensen" <drc@linux.vnet.ibm.com>
Subject: Re: [Intel-wired-lan] [PATCH RFC net-next v4 6/9] iavf: switch to Page Pool
Date: Thu, 6 Jul 2023 18:56:21 +0200
Message-ID: <52963031-76be-b215-052e-a200f01d7130@intel.com>
In-Reply-To: <CAKgT0Ud4h32UFwiUhcpLxSrPRMhbKYSDncL2YiursWgS7Qg7Ug@mail.gmail.com>
From: Alexander Duyck <alexander.duyck@gmail.com>
Date: Thu, 6 Jul 2023 08:26:00 -0700
> On Wed, Jul 5, 2023 at 8:58 AM Alexander Lobakin
> <aleksander.lobakin@intel.com> wrote:
>>
>> Now that the IAVF driver simply uses dev_alloc_page() + free_page() with
>> no custom recycling logic, it can easily be switched to using Page
>> Pool / libie API instead.
>> This allows removing the whole dance around headroom, HW buffer
>> size, and page order. All DMA-for-device is now done in the PP core,
>> for-CPU -- in the libie helper.
>> Use skb_mark_for_recycle() to bring back the recycling and restore the
>> performance. Speaking of performance: on par with the baseline and
>> faster with the PP optimization series applied. But the memory usage for
>> 1500b MTU is now almost 2x lower (x86_64) thanks to allocating a page
>> every second descriptor.
>>
>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>
> One thing I am noticing is that there seems to be a bunch of cleanup
> changes in here as well. Things like moving around values within
> structures which I am assuming are to fill holes. You may want to look
> at breaking some of those out as it makes it a bit harder to review
> this since they seem like unrelated changes.
min_mtu and watchdog are unrelated, I'll drop those.
Moving tail pointer around was supposed to land in a different commit,
not this one, as I wrote 10 minutes ago already :s
[...]
>> - bi_size = sizeof(struct iavf_rx_buffer) * rx_ring->count;
>> - memset(rx_ring->rx_bi, 0, bi_size);
>> -
>> - /* Zero out the descriptor ring */
>> - memset(rx_ring->desc, 0, rx_ring->size);
>> -
>
> I have some misgivings about not clearing these. We may want to double
> check to verify the code paths are resilient enough that it won't
> cause any issues w/ repeated up/down testing on the interface. The
> general idea is to keep things consistent w/ the state after
> setup_rx_descriptors. If we don't need this then we don't need to be
> calling the zalloc or calloc version of things in
> setup_rx_descriptors.
Both arrays will be freed a couple of instructions below, so why zero them?
>
>
>> rx_ring->next_to_clean = 0;
>> rx_ring->next_to_use = 0;
>> }
[...]
>> struct net_device *netdev; /* netdev ring maps to */
>> union {
>> + struct libie_rx_buffer *rx_bi;
>> struct iavf_tx_buffer *tx_bi;
>> - struct iavf_rx_buffer *rx_bi;
>> };
>> DECLARE_BITMAP(state, __IAVF_RING_STATE_NBITS);
>> + u8 __iomem *tail;
>> u16 queue_index; /* Queue number of ring */
>> u8 dcb_tc; /* Traffic class of ring */
>> - u8 __iomem *tail;
>>
>> /* high bit set means dynamic, use accessors routines to read/write.
>> * hardware only supports 2us resolution for the ITR registers.
>
> I'm assuming "tail" was moved here since it is a pointer and fills a hole?
(see above)
>
>> @@ -329,9 +264,8 @@ struct iavf_ring {
>> */
>> u16 itr_setting;
>>
>> - u16 count; /* Number of descriptors */
>> u16 reg_idx; /* HW register index of the ring */
>> - u16 rx_buf_len;
>> + u16 count; /* Number of descriptors */
>
> Why move count down here? It is moving the constant value that is
> read-mostly into an area that will be updated more often.
With ::tail put in a different slot, ::count was landing in a
different cacheline. I wanted to avoid this. But now I feel like I was
just lazy and should've tested both variants to see whether this move
affects performance. I'll play with this one in the next rev.
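
To illustrate the cacheline concern, here is a minimal userspace sketch of
how moving fields around can push a u16 into the next cacheline. Everything
in it is made up for illustration: struct demo_ring is NOT the real
struct iavf_ring layout, CACHELINE_SIZE assumes 64-byte lines, and
CACHELINE_OF() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cachelines, as on typical x86_64 parts. */
#define CACHELINE_SIZE 64

/* Hypothetical helper: index of the cacheline a member starts in. */
#define CACHELINE_OF(type, member) (offsetof(type, member) / CACHELINE_SIZE)

/*
 * Hypothetical, heavily trimmed stand-in for a ring structure.
 * The field names echo the thread, but the layout is invented.
 */
struct demo_ring {
	uint64_t ptrs[7];	/* 56 bytes of pointer-sized fields */
	uint16_t queue_index;	/* offset 56 */
	uint8_t dcb_tc;		/* offset 58, then 1 byte of padding */
	uint16_t itr_setting;	/* offset 60 */
	uint16_t reg_idx;	/* offset 62, last slot of cacheline 0 */
	uint16_t count;		/* offset 64, first slot of cacheline 1 */
	uint16_t next_to_use;	/* offset 66, frequently written */
};
```

In this made-up layout, the read-mostly count ends up sharing a cacheline
with next_to_use rather than with reg_idx. For the real structure, pahole
output on the built object is the thing to look at.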
>
>> /* used in interrupt processing */
>> u16 next_to_use;
>> @@ -398,17 +332,6 @@ struct iavf_ring_container {
>> #define iavf_for_each_ring(pos, head) \
>> for (pos = (head).ring; pos != NULL; pos = pos->next)
>>
>> -static inline unsigned int iavf_rx_pg_order(struct iavf_ring *ring)
>> -{
>> -#if (PAGE_SIZE < 8192)
>> - if (ring->rx_buf_len > (PAGE_SIZE / 2))
>> - return 1;
>> -#endif
>> - return 0;
>> -}
>> -
>> -#define iavf_rx_pg_size(_ring) (PAGE_SIZE << iavf_rx_pg_order(_ring))
>> -
>
> All this code probably could have been removed in an earlier patch
> since I don't think we need the higher order pages once we did away
> with the recycling. Odds are we can probably move this into the
> recycling code removal.
This ended up here because I merged the "always use order 0" commit with
"switch to Page Pool". In general, IIRC removing all of that stuff at once
in one commit (#2) was less readable than the current version, but I'll
double-check.
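
For reference, a standalone sketch of what the removed helpers computed.
This restates the quoted iavf_rx_pg_order() / iavf_rx_pg_size() logic in
userspace; the function names and taking the page size as a parameter
(instead of the PAGE_SIZE macro) are assumptions made so it builds outside
the kernel.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Same decision as the removed iavf_rx_pg_order(): on systems with pages
 * smaller than 8k, a buffer larger than half a page needs an order-1 page.
 * page_size is passed in here; the driver used PAGE_SIZE at build time.
 */
static unsigned int demo_rx_pg_order(size_t page_size, size_t rx_buf_len)
{
	if (page_size < 8192 && rx_buf_len > page_size / 2)
		return 1;

	return 0;
}

/* Same as the removed iavf_rx_pg_size() macro: bytes allocated per buffer. */
static size_t demo_rx_pg_size(size_t page_size, size_t rx_buf_len)
{
	return page_size << demo_rx_pg_order(page_size, rx_buf_len);
}
```

With 4k pages, a 2048-byte buffer stays at order 0, while a 3072-byte one
would get an 8k allocation.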
>
>> bool iavf_alloc_rx_buffers(struct iavf_ring *rxr, u16 cleaned_count);
>> netdev_tx_t iavf_xmit_frame(struct sk_buff *skb, struct net_device *netdev);
>> int iavf_setup_tx_descriptors(struct iavf_ring *tx_ring);
[...]
>> @@ -309,9 +310,7 @@ void iavf_configure_queues(struct iavf_adapter *adapter)
>> vqpi->rxq.ring_len = adapter->rx_rings[i].count;
>> vqpi->rxq.dma_ring_addr = adapter->rx_rings[i].dma;
>> vqpi->rxq.max_pkt_size = max_frame;
>> - vqpi->rxq.databuffer_size =
>> - ALIGN(adapter->rx_rings[i].rx_buf_len,
>> - BIT_ULL(IAVF_RXQ_CTX_DBUFF_SHIFT));
>
> Is this rendered redundant by something? Seems like you should be
> guaranteeing somewhere that you are still aligned to this.
See the previous commit, the place where I calculate max_len for the PP
params. The 128-byte alignment is an Intel-wide HW requirement, so it
lives there now.
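
A minimal sketch of the kind of guarantee being discussed, not the actual
code from the previous patch. DEMO_RX_BUF_LEN_ALIGN and both helpers are
hypothetical names, and rounding down (so the length never exceeds the
available buffer space) is an assumption here; the real calculation lives
where max_len is set up for the PP params.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name; 128 bytes is the HW granularity mentioned above. */
#define DEMO_RX_BUF_LEN_ALIGN 128u

/* Round a buffer length down to the HW granularity (power of two). */
static uint32_t demo_rx_buf_len_align_down(uint32_t len)
{
	return len & ~(DEMO_RX_BUF_LEN_ALIGN - 1);
}

/* Non-zero when len is already a multiple of the HW granularity. */
static int demo_rx_buf_len_is_aligned(uint32_t len)
{
	return (len & (DEMO_RX_BUF_LEN_ALIGN - 1)) == 0;
}
```

If the length is aligned once at that point, a value written to
databuffer_size needs no further ALIGN() at the call site.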
>
>
>> + vqpi->rxq.databuffer_size = max_len;
>> vqpi++;
Thanks,
Olek
Thread overview: 33+ messages
2023-07-05 15:55 [PATCH RFC net-next v4 0/9] net: intel: start The Great Code Dedup + Page Pool for iavf Alexander Lobakin
2023-07-05 15:55 ` [PATCH RFC net-next v4 1/9] net: intel: introduce Intel Ethernet common library Alexander Lobakin
2023-07-14 14:17 ` [Intel-wired-lan] " Przemek Kitszel
2023-07-05 15:55 ` [PATCH RFC net-next v4 2/9] iavf: kill "legacy-rx" for good Alexander Lobakin
2023-07-14 14:17 ` [Intel-wired-lan] " Przemek Kitszel
2023-07-05 15:55 ` [PATCH RFC net-next v4 3/9] iavf: drop page splitting and recycling Alexander Lobakin
2023-07-06 14:47 ` [Intel-wired-lan] " Alexander Duyck
2023-07-06 16:45 ` Alexander Lobakin
2023-07-06 17:06 ` Alexander Duyck
2023-07-10 13:13 ` Alexander Lobakin
2023-07-05 15:55 ` [PATCH RFC net-next v4 4/9] net: page_pool: add DMA-sync-for-CPU inline helpers Alexander Lobakin
2023-07-05 15:55 ` [PATCH RFC net-next v4 5/9] libie: add Rx buffer management (via Page Pool) Alexander Lobakin
2023-07-06 12:47 ` Yunsheng Lin
2023-07-06 16:28 ` Alexander Lobakin
2023-07-09 5:16 ` Yunsheng Lin
2023-07-10 13:25 ` Alexander Lobakin
2023-07-11 11:39 ` Yunsheng Lin
2023-07-11 16:37 ` Alexander Lobakin
2023-07-12 11:13 ` Yunsheng Lin
2023-07-05 15:55 ` [PATCH RFC net-next v4 6/9] iavf: switch to Page Pool Alexander Lobakin
2023-07-06 12:47 ` Yunsheng Lin
2023-07-06 16:38 ` Alexander Lobakin
2023-07-09 5:16 ` Yunsheng Lin
2023-07-10 13:34 ` Alexander Lobakin
2023-07-11 11:47 ` Yunsheng Lin
2023-07-18 13:56 ` Alexander Lobakin
2023-07-06 15:26 ` [Intel-wired-lan] " Alexander Duyck
2023-07-06 16:56 ` Alexander Lobakin [this message]
2023-07-06 17:28 ` Alexander Duyck
2023-07-10 13:18 ` Alexander Lobakin
2023-07-05 15:55 ` [PATCH RFC net-next v4 7/9] libie: add common queue stats Alexander Lobakin
2023-07-05 15:55 ` [PATCH RFC net-next v4 8/9] libie: add per-queue Page Pool stats Alexander Lobakin
2023-07-05 15:55 ` [PATCH RFC net-next v4 9/9] iavf: switch queue stats to libie Alexander Lobakin