netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Larysa Zaremba <larysa.zaremba@intel.com>
Cc: <intel-wired-lan@lists.osuosl.org>,
	Tony Nguyen <anthony.l.nguyen@intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Jacob Keller" <jacob.e.keller@intel.com>,
	Eric Dumazet <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	<netdev@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<bpf@vger.kernel.org>, <magnus.karlsson@intel.com>,
	Michal Kubiak <michal.kubiak@intel.com>,
	Wojciech Drewek <wojciech.drewek@intel.com>,
	Amritha Nambiar <amritha.nambiar@intel.com>,
	Chandan Kumar Rout <chandanx.rout@intel.com>
Subject: Re: [PATCH iwl-net v3 1/6] ice: move netif_queue_set_napi to rtnl-protected sections
Date: Tue, 20 Aug 2024 15:26:52 +0200	[thread overview]
Message-ID: <ZsSZnLt9k7zn8G6T@boxer> (raw)
In-Reply-To: <ZsSQY14RCWlG/Bxd@lzaremba-mobl.ger.corp.intel.com>

On Tue, Aug 20, 2024 at 02:47:31PM +0200, Larysa Zaremba wrote:
> On Tue, Aug 20, 2024 at 02:31:51PM +0200, Maciej Fijalkowski wrote:
> > On Mon, Aug 19, 2024 at 12:05:38PM +0200, Larysa Zaremba wrote:
> > > Currently, netif_queue_set_napi() is called from ice_vsi_rebuild() that is
> > > not rtnl-locked when called from the reset. This creates the need to take
> > > the rtnl_lock just for a single function and complicates the
> > > synchronization with .ndo_bpf. At the same time, there no actual need to
> > > fill napi-to-queue information at this exact point.
> > > 
> > > Fill napi-to-queue information when opening the VSI and clear it when the
> > > VSI is being closed. Those routines are already rtnl-locked.
> > > 
> > > Also, rewrite napi-to-queue assignment in a way that prevents inclusion of
> > > XDP queues, as this leads to out-of-bounds writes, such as one below.
> > > 
> > > [  +0.000004] BUG: KASAN: slab-out-of-bounds in netif_queue_set_napi+0x1c2/0x1e0
> > > [  +0.000012] Write of size 8 at addr ffff889881727c80 by task bash/7047
> > > [  +0.000006] CPU: 24 PID: 7047 Comm: bash Not tainted 6.10.0-rc2+ #2
> > > [  +0.000004] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0014.082620210524 08/26/2021
> > > [  +0.000003] Call Trace:
> > > [  +0.000003]  <TASK>
> > > [  +0.000002]  dump_stack_lvl+0x60/0x80
> > > [  +0.000007]  print_report+0xce/0x630
> > > [  +0.000007]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
> > > [  +0.000007]  ? __virt_addr_valid+0x1c9/0x2c0
> > > [  +0.000005]  ? netif_queue_set_napi+0x1c2/0x1e0
> > > [  +0.000003]  kasan_report+0xe9/0x120
> > > [  +0.000004]  ? netif_queue_set_napi+0x1c2/0x1e0
> > > [  +0.000004]  netif_queue_set_napi+0x1c2/0x1e0
> > > [  +0.000005]  ice_vsi_close+0x161/0x670 [ice]
> > > [  +0.000114]  ice_dis_vsi+0x22f/0x270 [ice]
> > > [  +0.000095]  ice_pf_dis_all_vsi.constprop.0+0xae/0x1c0 [ice]
> > > [  +0.000086]  ice_prepare_for_reset+0x299/0x750 [ice]
> > > [  +0.000087]  pci_dev_save_and_disable+0x82/0xd0
> > > [  +0.000006]  pci_reset_function+0x12d/0x230
> > > [  +0.000004]  reset_store+0xa0/0x100
> > > [  +0.000006]  ? __pfx_reset_store+0x10/0x10
> > > [  +0.000002]  ? __pfx_mutex_lock+0x10/0x10
> > > [  +0.000004]  ? __check_object_size+0x4c1/0x640
> > > [  +0.000007]  kernfs_fop_write_iter+0x30b/0x4a0
> > > [  +0.000006]  vfs_write+0x5d6/0xdf0
> > > [  +0.000005]  ? fd_install+0x180/0x350
> > > [  +0.000005]  ? __pfx_vfs_write+0x10/0xA10
> > > [  +0.000004]  ? do_fcntl+0x52c/0xcd0
> > > [  +0.000004]  ? kasan_save_track+0x13/0x60
> > > [  +0.000003]  ? kasan_save_free_info+0x37/0x60
> > > [  +0.000006]  ksys_write+0xfa/0x1d0
> > > [  +0.000003]  ? __pfx_ksys_write+0x10/0x10
> > > [  +0.000002]  ? __x64_sys_fcntl+0x121/0x180
> > > [  +0.000004]  ? _raw_spin_lock+0x87/0xe0
> > > [  +0.000005]  do_syscall_64+0x80/0x170
> > > [  +0.000007]  ? _raw_spin_lock+0x87/0xe0
> > > [  +0.000004]  ? __pfx__raw_spin_lock+0x10/0x10
> > > [  +0.000003]  ? file_close_fd_locked+0x167/0x230
> > > [  +0.000005]  ? syscall_exit_to_user_mode+0x7d/0x220
> > > [  +0.000005]  ? do_syscall_64+0x8c/0x170
> > > [  +0.000004]  ? do_syscall_64+0x8c/0x170
> > > [  +0.000003]  ? do_syscall_64+0x8c/0x170
> > > [  +0.000003]  ? fput+0x1a/0x2c0
> > > [  +0.000004]  ? filp_close+0x19/0x30
> > > [  +0.000004]  ? do_dup2+0x25a/0x4c0
> > > [  +0.000004]  ? __x64_sys_dup2+0x6e/0x2e0
> > > [  +0.000002]  ? syscall_exit_to_user_mode+0x7d/0x220
> > > [  +0.000004]  ? do_syscall_64+0x8c/0x170
> > > [  +0.000003]  ? __count_memcg_events+0x113/0x380
> > > [  +0.000005]  ? handle_mm_fault+0x136/0x820
> > > [  +0.000005]  ? do_user_addr_fault+0x444/0xa80
> > > [  +0.000004]  ? clear_bhb_loop+0x25/0x80
> > > [  +0.000004]  ? clear_bhb_loop+0x25/0x80
> > > [  +0.000002]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > [  +0.000005] RIP: 0033:0x7f2033593154
> > > 
> > > Fixes: 080b0c8d6d26 ("ice: Fix ASSERT_RTNL() warning during certain scenarios")
> > 
> > Shouldn't you include:
> > Fixes: 91fdbce7e8d6 ("ice: Add support in the driver for associating queue with napi")
> > 
> > as we were iterating over XDP rings that were attached to q_vectors from
> > the very beginning?
> >
> 
> I probably should have done this.

Maybe this could be included while applying or sending the pull request by
Tony. I'll go through the rest of set today to see if I have any comments,
if there won't be anything outstanding then it won't make sense to send
next revision just to fix the fixes tags.

>  
> > > Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
> > > Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
> > > Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com>
> > > Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
> > > Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
> > > ---
> > >  drivers/net/ethernet/intel/ice/ice_base.c |  11 +-
> > >  drivers/net/ethernet/intel/ice/ice_lib.c  | 129 ++++++----------------
> > >  drivers/net/ethernet/intel/ice/ice_lib.h  |  10 +-
> > >  drivers/net/ethernet/intel/ice/ice_main.c |  17 ++-
> > >  4 files changed, 49 insertions(+), 118 deletions(-)
> > > 
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
> > > index f448d3a84564..c158749a80e0 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_base.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_base.c
> > > @@ -190,16 +190,11 @@ static void ice_free_q_vector(struct ice_vsi *vsi, int v_idx)
> > >  	}
> > >  	q_vector = vsi->q_vectors[v_idx];
> > >  
> > > -	ice_for_each_tx_ring(tx_ring, q_vector->tx) {
> > > -		ice_queue_set_napi(vsi, tx_ring->q_index, NETDEV_QUEUE_TYPE_TX,
> > > -				   NULL);
> > > +	ice_for_each_tx_ring(tx_ring, vsi->q_vectors[v_idx]->tx)
> > >  		tx_ring->q_vector = NULL;
> > > -	}
> > > -	ice_for_each_rx_ring(rx_ring, q_vector->rx) {
> > > -		ice_queue_set_napi(vsi, rx_ring->q_index, NETDEV_QUEUE_TYPE_RX,
> > > -				   NULL);
> > > +
> > > +	ice_for_each_rx_ring(rx_ring, vsi->q_vectors[v_idx]->rx)
> > >  		rx_ring->q_vector = NULL;
> > > -	}
> > >  
> > >  	/* only VSI with an associated netdev is set up with NAPI */
> > >  	if (vsi->netdev)
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
> > > index 03c4df4ed585..5f2ddcaf7031 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_lib.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_lib.c
> > > @@ -2286,9 +2286,6 @@ static int ice_vsi_cfg_def(struct ice_vsi *vsi)
> > >  
> > >  		ice_vsi_map_rings_to_vectors(vsi);
> > >  
> > > -		/* Associate q_vector rings to napi */
> > > -		ice_vsi_set_napi_queues(vsi);
> > > -
> > >  		vsi->stat_offsets_loaded = false;
> > >  
> > >  		/* ICE_VSI_CTRL does not need RSS so skip RSS processing */
> > > @@ -2621,6 +2618,7 @@ void ice_vsi_close(struct ice_vsi *vsi)
> > >  	if (!test_and_set_bit(ICE_VSI_DOWN, vsi->state))
> > >  		ice_down(vsi);
> > >  
> > > +	ice_vsi_clear_napi_queues(vsi);
> > >  	ice_vsi_free_irq(vsi);
> > >  	ice_vsi_free_tx_rings(vsi);
> > >  	ice_vsi_free_rx_rings(vsi);
> > > @@ -2687,120 +2685,55 @@ void ice_dis_vsi(struct ice_vsi *vsi, bool locked)
> > >  }
> > >  
> > >  /**
> > > - * __ice_queue_set_napi - Set the napi instance for the queue
> > > - * @dev: device to which NAPI and queue belong
> > > - * @queue_index: Index of queue
> > > - * @type: queue type as RX or TX
> > > - * @napi: NAPI context
> > > - * @locked: is the rtnl_lock already held
> > > - *
> > > - * Set the napi instance for the queue. Caller indicates the lock status.
> > > - */
> > > -static void
> > > -__ice_queue_set_napi(struct net_device *dev, unsigned int queue_index,
> > > -		     enum netdev_queue_type type, struct napi_struct *napi,
> > > -		     bool locked)
> > > -{
> > > -	if (!locked)
> > > -		rtnl_lock();
> > > -	netif_queue_set_napi(dev, queue_index, type, napi);
> > > -	if (!locked)
> > > -		rtnl_unlock();
> > > -}
> > > -
> > > -/**
> > > - * ice_queue_set_napi - Set the napi instance for the queue
> > > - * @vsi: VSI being configured
> > > - * @queue_index: Index of queue
> > > - * @type: queue type as RX or TX
> > > - * @napi: NAPI context
> > > + * ice_vsi_set_napi_queues
> > > + * @vsi: VSI pointer
> > >   *
> > > - * Set the napi instance for the queue. The rtnl lock state is derived from the
> > > - * execution path.
> > > + * Associate queue[s] with napi for all vectors.
> > > + * The caller must hold rtnl_lock.
> > >   */
> > > -void
> > > -ice_queue_set_napi(struct ice_vsi *vsi, unsigned int queue_index,
> > > -		   enum netdev_queue_type type, struct napi_struct *napi)
> > > +void ice_vsi_set_napi_queues(struct ice_vsi *vsi)
> > 
> > this appears to be called only in ice_main.c. It should be moved there and
> > made a static function instead of having it in ice_lib.c.
> > 
> > Unless I overlooked something...
> >
> 
> You are not missing anything, but I think there is more value in keeping the 
> set-clear functions together and ice_lib.c is a good place for that.

Yeah I realized after sending the comment that clear func indeed has to be
exported so I'd say we can live with that.

> 
> > >  {
> > > -	struct ice_pf *pf = vsi->back;
> > > +	struct net_device *netdev = vsi->netdev;
> > > +	int q_idx, v_idx;
> > >  
> > > -	if (!vsi->netdev)
> > > +	if (!netdev)
> > >  		return;
> > >  
> > > -	if (current_work() == &pf->serv_task ||
> > > -	    test_bit(ICE_PREPARED_FOR_RESET, pf->state) ||
> > > -	    test_bit(ICE_DOWN, pf->state) ||
> > > -	    test_bit(ICE_SUSPENDED, pf->state))
> > > -		__ice_queue_set_napi(vsi->netdev, queue_index, type, napi,
> > > -				     false);
> > > -	else
> > > -		__ice_queue_set_napi(vsi->netdev, queue_index, type, napi,
> > > -				     true);
> > > -}
> > > +	ice_for_each_rxq(vsi, q_idx)
> > > +		netif_queue_set_napi(netdev, q_idx, NETDEV_QUEUE_TYPE_RX,
> > > +				     &vsi->rx_rings[q_idx]->q_vector->napi);
> > >  
> > > -/**

(...)

  reply	other threads:[~2024-08-20 13:27 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-19 10:05 [PATCH iwl-net v3 0/6] ice: fix synchronization between .ndo_bpf() and reset Larysa Zaremba
2024-08-19 10:05 ` [PATCH iwl-net v3 1/6] ice: move netif_queue_set_napi to rtnl-protected sections Larysa Zaremba
2024-08-20 12:31   ` Maciej Fijalkowski
2024-08-20 12:47     ` Larysa Zaremba
2024-08-20 13:26       ` Maciej Fijalkowski [this message]
2024-08-21 21:20         ` Tony Nguyen
2024-08-19 10:05 ` [PATCH iwl-net v3 2/6] ice: protect XDP configuration with a mutex Larysa Zaremba
2024-08-22 11:39   ` Maciej Fijalkowski
2024-08-22 13:05     ` Larysa Zaremba
2024-08-19 10:05 ` [PATCH iwl-net v3 3/6] ice: check for XDP rings instead of bpf program when unconfiguring Larysa Zaremba
2024-08-22 11:36   ` Maciej Fijalkowski
2024-08-19 10:05 ` [PATCH iwl-net v3 4/6] ice: check ICE_VSI_DOWN under rtnl_lock when preparing for reset Larysa Zaremba
2024-08-22 11:34   ` Maciej Fijalkowski
2024-08-22 12:56     ` Larysa Zaremba
2024-08-22 14:42       ` Maciej Fijalkowski
2024-08-22 17:18         ` Larysa Zaremba
2024-08-19 10:05 ` [PATCH iwl-net v3 5/6] ice: remove ICE_CFG_BUSY locking from AF_XDP code Larysa Zaremba
2024-08-22 11:43   ` Maciej Fijalkowski
2024-08-22 13:07     ` Larysa Zaremba
2024-08-19 10:05 ` [PATCH iwl-net v3 6/6] ice: do not bring the VSI up, if it was down before the XDP setup Larysa Zaremba
2024-08-22 11:35   ` Maciej Fijalkowski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZsSZnLt9k7zn8G6T@boxer \
    --to=maciej.fijalkowski@intel.com \
    --cc=amritha.nambiar@intel.com \
    --cc=anthony.l.nguyen@intel.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=chandanx.rout@intel.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=hawk@kernel.org \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=jacob.e.keller@intel.com \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=larysa.zaremba@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=magnus.karlsson@intel.com \
    --cc=michal.kubiak@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=wojciech.drewek@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).