From: Simon Horman <horms@kernel.org>
To: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Mina Almasry <almasrymina@google.com>,
nex.sw.ncis.osdt.itp.upstreaming@intel.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Eric Dumazet <edumazet@google.com>,
Tony Nguyen <anthony.l.nguyen@intel.com>,
Jakub Kicinski <kuba@kernel.org>,
intel-wired-lan@lists.osuosl.org, Paolo Abeni <pabeni@redhat.com>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [Intel-wired-lan] [PATCH iwl-next 03/12] idpf: split &idpf_queue into 4 strictly-typed queue structures
Date: Sat, 15 Jun 2024 08:32:56 +0100 [thread overview]
Message-ID: <20240615073256.GZ8447@kernel.org> (raw)
In-Reply-To: <b110726e-d496-4975-8089-57a4931da47d@intel.com>
On Thu, Jun 13, 2024 at 01:03:00PM +0200, Alexander Lobakin wrote:
> From: Simon Horman <horms@kernel.org>
> Date: Sat, 1 Jun 2024 09:53:08 +0100
>
> > On Tue, May 28, 2024 at 03:48:37PM +0200, Alexander Lobakin wrote:
> >> Currently, sizeof(struct idpf_queue) is 32 Kb.
> >> This is due to the 12-bit hashtable declaration at the end of the queue.
> >> This HT is needed only for Tx queues when the flow scheduling mode is
> >> enabled. But &idpf_queue is unified for all of the queue types,
> >> provoking excessive memory usage.
> >> The unified structure in general makes the code less effective via
> >> suboptimal fields placement. You can't avoid that unless you make unions
> >> each 2 fields. Even then, different field alignment etc., doesn't allow
> >> you to optimize things to the limit.
> >> Split &idpf_queue into 4 structures corresponding to the queue types:
> >> RQ (Rx queue), SQ (Tx queue), FQ (buffer queue), and CQ (completion
> >> queue). Place only needed fields there and shortcuts handy for hotpath.
> >> Allocate the abovementioned hashtable dynamically and only when needed,
> >> keeping &idpf_tx_queue relatively short (192 bytes, same as Rx). This HT
> >> is used only for OOO completions, which aren't really hotpath anyway.
> >> Note that this change must be done atomically, otherwise it's really
> >> easy to get lost and miss something.
> >>
> >> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> >
> > ...
> >
> >> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> >
> > ...
> >
> >> @@ -1158,20 +1325,22 @@ static void idpf_rxq_set_descids(struct idpf_vport *vport, struct idpf_queue *q)
> >> */
> >> static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
> >> {
> >> - bool flow_sch_en;
> >> - int err, i;
> >> + bool split, flow_sch_en;
> >> + int i;
> >>
> >> vport->txq_grps = kcalloc(vport->num_txq_grp,
> >> sizeof(*vport->txq_grps), GFP_KERNEL);
> >> if (!vport->txq_grps)
> >> return -ENOMEM;
> >>
> >> + split = idpf_is_queue_model_split(vport->txq_model);
> >> flow_sch_en = !idpf_is_cap_ena(vport->adapter, IDPF_OTHER_CAPS,
> >> VIRTCHNL2_CAP_SPLITQ_QSCHED);
> >>
> >> for (i = 0; i < vport->num_txq_grp; i++) {
> >> struct idpf_txq_group *tx_qgrp = &vport->txq_grps[i];
> >> struct idpf_adapter *adapter = vport->adapter;
> >> + struct idpf_txq_stash *stashes;
> >> int j;
> >>
> >> tx_qgrp->vport = vport;
> >> @@ -1180,45 +1349,62 @@ static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
> >> for (j = 0; j < tx_qgrp->num_txq; j++) {
> >> tx_qgrp->txqs[j] = kzalloc(sizeof(*tx_qgrp->txqs[j]),
> >> GFP_KERNEL);
> >> - if (!tx_qgrp->txqs[j]) {
> >> - err = -ENOMEM;
> >> + if (!tx_qgrp->txqs[j])
> >> goto err_alloc;
> >> - }
> >> + }
> >> +
> >> + if (split && flow_sch_en) {
> >> + stashes = kcalloc(num_txq, sizeof(*stashes),
> >> + GFP_KERNEL);
> >
> > Hi Alexander,
> >
> > Here stashes is assigned a memory allocation and
> > then then assigned to tx_qgrp->stashes a few lines below...
> >
> >> + if (!stashes)
> >> + goto err_alloc;
> >> +
> >> + tx_qgrp->stashes = stashes;
> >> }
> >>
> >> for (j = 0; j < tx_qgrp->num_txq; j++) {
> >> - struct idpf_queue *q = tx_qgrp->txqs[j];
> >> + struct idpf_tx_queue *q = tx_qgrp->txqs[j];
> >>
> >> q->dev = &adapter->pdev->dev;
> >> q->desc_count = vport->txq_desc_count;
> >> q->tx_max_bufs = idpf_get_max_tx_bufs(adapter);
> >> q->tx_min_pkt_len = idpf_get_min_tx_pkt_len(adapter);
> >> - q->vport = vport;
> >> + q->netdev = vport->netdev;
> >> q->txq_grp = tx_qgrp;
> >> - hash_init(q->sched_buf_hash);
> >>
> >> - if (flow_sch_en)
> >> - set_bit(__IDPF_Q_FLOW_SCH_EN, q->flags);
> >> + if (!split) {
> >> + q->clean_budget = vport->compln_clean_budget;
> >> + idpf_queue_assign(CRC_EN, q,
> >> + vport->crc_enable);
> >> + }
> >> +
> >> + if (!flow_sch_en)
> >> + continue;
> >> +
> >> + if (split) {
> >
> > ... but here elements of stashes seem to be assigned to q->stash
> > without stashes having being initialised.
> >
> > Flagged by Smatch
>
> Hi! Yes, I saw the report, but isn't it a false positive?
>
> Allocation happens when `split && flow_sch_en`, and here we have
>
> if (!flow_sch_en)
> continue;
>
> if (split)
> // assign
>
> IOW the assignment can't happen without the allocation?
Thanks, and sorry for missing the points you highlight above.
I agree that this is a false positive.
>
> >
> >> + q->stash = &stashes[j];
> >> + hash_init(q->stash->sched_buf_hash);
> >> + }
> >> +
> >> + idpf_queue_set(FLOW_SCH_EN, q);
>
> Thanks,
> Olek
>
WARNING: multiple messages have this Message-ID (diff)
From: Simon Horman <horms@kernel.org>
To: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: intel-wired-lan@lists.osuosl.org,
Tony Nguyen <anthony.l.nguyen@intel.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Mina Almasry <almasrymina@google.com>,
nex.sw.ncis.osdt.itp.upstreaming@intel.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH iwl-next 03/12] idpf: split &idpf_queue into 4 strictly-typed queue structures
Date: Sat, 15 Jun 2024 08:32:56 +0100 [thread overview]
Message-ID: <20240615073256.GZ8447@kernel.org> (raw)
In-Reply-To: <b110726e-d496-4975-8089-57a4931da47d@intel.com>
On Thu, Jun 13, 2024 at 01:03:00PM +0200, Alexander Lobakin wrote:
> From: Simon Horman <horms@kernel.org>
> Date: Sat, 1 Jun 2024 09:53:08 +0100
>
> > On Tue, May 28, 2024 at 03:48:37PM +0200, Alexander Lobakin wrote:
> >> Currently, sizeof(struct idpf_queue) is 32 Kb.
> >> This is due to the 12-bit hashtable declaration at the end of the queue.
> >> This HT is needed only for Tx queues when the flow scheduling mode is
> >> enabled. But &idpf_queue is unified for all of the queue types,
> >> provoking excessive memory usage.
> >> The unified structure in general makes the code less effective via
> >> suboptimal fields placement. You can't avoid that unless you make unions
> >> each 2 fields. Even then, different field alignment etc., doesn't allow
> >> you to optimize things to the limit.
> >> Split &idpf_queue into 4 structures corresponding to the queue types:
> >> RQ (Rx queue), SQ (Tx queue), FQ (buffer queue), and CQ (completion
> >> queue). Place only needed fields there and shortcuts handy for hotpath.
> >> Allocate the abovementioned hashtable dynamically and only when needed,
> >> keeping &idpf_tx_queue relatively short (192 bytes, same as Rx). This HT
> >> is used only for OOO completions, which aren't really hotpath anyway.
> >> Note that this change must be done atomically, otherwise it's really
> >> easy to get lost and miss something.
> >>
> >> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> >
> > ...
> >
> >> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> >
> > ...
> >
> >> @@ -1158,20 +1325,22 @@ static void idpf_rxq_set_descids(struct idpf_vport *vport, struct idpf_queue *q)
> >> */
> >> static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
> >> {
> >> - bool flow_sch_en;
> >> - int err, i;
> >> + bool split, flow_sch_en;
> >> + int i;
> >>
> >> vport->txq_grps = kcalloc(vport->num_txq_grp,
> >> sizeof(*vport->txq_grps), GFP_KERNEL);
> >> if (!vport->txq_grps)
> >> return -ENOMEM;
> >>
> >> + split = idpf_is_queue_model_split(vport->txq_model);
> >> flow_sch_en = !idpf_is_cap_ena(vport->adapter, IDPF_OTHER_CAPS,
> >> VIRTCHNL2_CAP_SPLITQ_QSCHED);
> >>
> >> for (i = 0; i < vport->num_txq_grp; i++) {
> >> struct idpf_txq_group *tx_qgrp = &vport->txq_grps[i];
> >> struct idpf_adapter *adapter = vport->adapter;
> >> + struct idpf_txq_stash *stashes;
> >> int j;
> >>
> >> tx_qgrp->vport = vport;
> >> @@ -1180,45 +1349,62 @@ static int idpf_txq_group_alloc(struct idpf_vport *vport, u16 num_txq)
> >> for (j = 0; j < tx_qgrp->num_txq; j++) {
> >> tx_qgrp->txqs[j] = kzalloc(sizeof(*tx_qgrp->txqs[j]),
> >> GFP_KERNEL);
> >> - if (!tx_qgrp->txqs[j]) {
> >> - err = -ENOMEM;
> >> + if (!tx_qgrp->txqs[j])
> >> goto err_alloc;
> >> - }
> >> + }
> >> +
> >> + if (split && flow_sch_en) {
> >> + stashes = kcalloc(num_txq, sizeof(*stashes),
> >> + GFP_KERNEL);
> >
> > Hi Alexander,
> >
> > Here stashes is assigned a memory allocation and
> > then then assigned to tx_qgrp->stashes a few lines below...
> >
> >> + if (!stashes)
> >> + goto err_alloc;
> >> +
> >> + tx_qgrp->stashes = stashes;
> >> }
> >>
> >> for (j = 0; j < tx_qgrp->num_txq; j++) {
> >> - struct idpf_queue *q = tx_qgrp->txqs[j];
> >> + struct idpf_tx_queue *q = tx_qgrp->txqs[j];
> >>
> >> q->dev = &adapter->pdev->dev;
> >> q->desc_count = vport->txq_desc_count;
> >> q->tx_max_bufs = idpf_get_max_tx_bufs(adapter);
> >> q->tx_min_pkt_len = idpf_get_min_tx_pkt_len(adapter);
> >> - q->vport = vport;
> >> + q->netdev = vport->netdev;
> >> q->txq_grp = tx_qgrp;
> >> - hash_init(q->sched_buf_hash);
> >>
> >> - if (flow_sch_en)
> >> - set_bit(__IDPF_Q_FLOW_SCH_EN, q->flags);
> >> + if (!split) {
> >> + q->clean_budget = vport->compln_clean_budget;
> >> + idpf_queue_assign(CRC_EN, q,
> >> + vport->crc_enable);
> >> + }
> >> +
> >> + if (!flow_sch_en)
> >> + continue;
> >> +
> >> + if (split) {
> >
> > ... but here elements of stashes seem to be assigned to q->stash
> > without stashes having being initialised.
> >
> > Flagged by Smatch
>
> Hi! Yes, I saw the report, but isn't it a false positive?
>
> Allocation happens when `split && flow_sch_en`, and here we have
>
> if (!flow_sch_en)
> continue;
>
> if (split)
> // assign
>
> IOW the assignment can't happen without the allocation?
Thanks, and sorry for missing the points you highlight above.
I agree that this is a false positive.
>
> >
> >> + q->stash = &stashes[j];
> >> + hash_init(q->stash->sched_buf_hash);
> >> + }
> >> +
> >> + idpf_queue_set(FLOW_SCH_EN, q);
>
> Thanks,
> Olek
>
next prev parent reply other threads:[~2024-06-15 7:33 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-28 13:48 [Intel-wired-lan] [PATCH iwl-next 00/12] idpf: XDP chapter I: convert Rx to libeth Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-05-28 13:48 ` [Intel-wired-lan] [PATCH iwl-next 01/12] libeth: add cacheline / struct alignment helpers Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-05-30 1:34 ` [Intel-wired-lan] " Jakub Kicinski
2024-05-30 1:34 ` Jakub Kicinski
2024-06-12 10:07 ` [Intel-wired-lan] " Przemek Kitszel
2024-06-12 10:07 ` Przemek Kitszel
2024-06-12 20:55 ` [Intel-wired-lan] " Jakub Kicinski
2024-06-12 20:55 ` Jakub Kicinski
2024-06-13 10:47 ` [Intel-wired-lan] " Alexander Lobakin
2024-06-13 10:47 ` Alexander Lobakin
2024-06-13 13:47 ` [Intel-wired-lan] " Jakub Kicinski
2024-06-13 13:47 ` Jakub Kicinski
2024-05-28 13:48 ` [Intel-wired-lan] [PATCH iwl-next 02/12] idpf: stop using macros for accessing queue descriptors Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-05-28 13:48 ` [Intel-wired-lan] [PATCH iwl-next 03/12] idpf: split &idpf_queue into 4 strictly-typed queue structures Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-06-01 8:53 ` [Intel-wired-lan] " Simon Horman
2024-06-01 8:53 ` Simon Horman
2024-06-13 11:03 ` [Intel-wired-lan] " Alexander Lobakin
2024-06-13 11:03 ` Alexander Lobakin
2024-06-15 7:32 ` Simon Horman [this message]
2024-06-15 7:32 ` Simon Horman
2024-05-28 13:48 ` [Intel-wired-lan] [PATCH iwl-next 04/12] idpf: avoid bloating &idpf_q_vector with big %NR_CPUS Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-05-29 0:36 ` [Intel-wired-lan] " Jacob Keller
2024-05-28 13:48 ` [Intel-wired-lan] [PATCH iwl-next 05/12] idpf: strictly assert cachelines of queue and queue vector structures Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-05-29 0:43 ` [Intel-wired-lan] " Jacob Keller
2024-06-12 13:03 ` Alexander Lobakin
2024-06-12 13:03 ` Alexander Lobakin
2024-06-12 13:08 ` Alexander Lobakin
2024-06-12 13:08 ` Alexander Lobakin
2024-06-12 22:42 ` Jacob Keller
2024-06-12 22:42 ` Jacob Keller
2024-06-12 22:40 ` Jacob Keller
2024-06-12 22:40 ` Jacob Keller
2024-05-28 13:48 ` [Intel-wired-lan] [PATCH iwl-next 06/12] idpf: merge singleq and splitq &net_device_ops Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-05-29 0:44 ` [Intel-wired-lan] " Jacob Keller
2024-05-28 13:48 ` [Intel-wired-lan] [PATCH iwl-next 07/12] idpf: compile singleq code only under default-n CONFIG_IDPF_SINGLEQ Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-05-29 0:47 ` [Intel-wired-lan] " Jacob Keller
2024-06-12 13:15 ` Alexander Lobakin
2024-06-12 22:43 ` Jacob Keller
2024-05-28 13:48 ` [Intel-wired-lan] [PATCH iwl-next 08/12] idpf: reuse libeth's definitions of parsed ptype structures Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-05-29 0:49 ` [Intel-wired-lan] " Jacob Keller
2024-05-28 13:48 ` [Intel-wired-lan] [PATCH iwl-next 09/12] idpf: remove legacy Page Pool Ethtool stats Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-05-29 0:52 ` [Intel-wired-lan] " Jacob Keller
2024-05-28 13:48 ` [Intel-wired-lan] [PATCH iwl-next 10/12] libeth: support different types of buffers for Rx Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-05-28 13:48 ` [Intel-wired-lan] [PATCH iwl-next 11/12] idpf: convert header split mode to libeth + napi_build_skb() Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-05-30 1:40 ` [Intel-wired-lan] " Jakub Kicinski
2024-05-30 1:40 ` Jakub Kicinski
2024-06-13 10:58 ` [Intel-wired-lan] " Alexander Lobakin
2024-06-13 10:58 ` Alexander Lobakin
2024-05-30 13:46 ` [Intel-wired-lan] " Willem de Bruijn
2024-05-30 13:46 ` Willem de Bruijn
2024-06-17 11:06 ` [Intel-wired-lan] " Alexander Lobakin
2024-06-17 11:06 ` Alexander Lobakin
2024-06-17 18:13 ` [Intel-wired-lan] " Willem de Bruijn
2024-06-17 18:13 ` Willem de Bruijn
2024-06-20 12:46 ` [Intel-wired-lan] " Alexander Lobakin
2024-06-20 12:46 ` Alexander Lobakin
2024-06-20 16:29 ` [Intel-wired-lan] " Willem de Bruijn
2024-06-20 16:29 ` Willem de Bruijn
2024-05-28 13:48 ` [Intel-wired-lan] [PATCH iwl-next 12/12] idpf: use libeth Rx buffer management for payload buffer Alexander Lobakin
2024-05-28 13:48 ` Alexander Lobakin
2024-06-01 9:08 ` [Intel-wired-lan] " Simon Horman
2024-06-01 9:08 ` Simon Horman
2024-06-13 11:05 ` [Intel-wired-lan] " Alexander Lobakin
2024-06-13 11:05 ` Alexander Lobakin
2024-06-15 7:35 ` [Intel-wired-lan] " Simon Horman
2024-06-15 7:35 ` Simon Horman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240615073256.GZ8447@kernel.org \
--to=horms@kernel.org \
--cc=aleksander.lobakin@intel.com \
--cc=almasrymina@google.com \
--cc=anthony.l.nguyen@intel.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=intel-wired-lan@lists.osuosl.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=nex.sw.ncis.osdt.itp.upstreaming@intel.com \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.