Netdev List
 help / color / mirror / Atom feed
* Re: [syzbot] [net?] WARNING in tls_err_abort
From: Sabrina Dubroca @ 2026-06-16 21:00 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: syzbot, davem, edumazet, horms, john.fastabend, linux-kernel,
	netdev, pabeni, syzkaller-bugs
In-Reply-To: <20260616082816.4dd0f035@kernel.org>

2026-06-16, 08:28:16 -0700, Jakub Kicinski wrote:
> On Tue, 16 Jun 2026 17:19:22 +0200 Sabrina Dubroca wrote:
> > I suspect err==0, and sock_error() consumed sk_err in between (the
> > alternative would be err > 0).
> > 
> > Something like this?
> 
> Makes sense, but what's eating sk_err?

The 2 remaining sock_error() in tls_rx_rec_wait()? [1]

> Don't we depend on it being set
> to avoid further state transitions once we hit a crypto error?

I kind of thought so too.

> I thought that's why we don't consume sk_err in recvmsg and sendmsg in
> the first place (we are not calling sock_error() anywhere)

Umm...
[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/net/tls/tls_sw.c#n1095

-- 
Sabrina

^ permalink raw reply

* Re: [PATCH net-next v2 2/4] udmabuf: emit one sg entry per pinned folio
From: Bobby Eshleman @ 2026-06-16 20:57 UTC (permalink / raw)
  To: Kasireddy, Vivek
  Cc: Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Simon Horman, Andrew Lunn, Gerd Hoffmann,
	Sumit Semwal, Christian König, Shuah Khan, Jason Gunthorpe,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org, linux-kselftest@vger.kernel.org,
	sdf@fomichev.me, razor@blackwall.org, daniel@iogearbox.net,
	almasrymina@google.com, matttbe@kernel.org, skhawaja@google.com,
	dw@davidwei.uk, Bobby Eshleman
In-Reply-To: <IA0PR11MB71852246277F773AC41DAAA3F8E52@IA0PR11MB7185.namprd11.prod.outlook.com>

On Tue, Jun 16, 2026 at 06:04:03AM +0000, Kasireddy, Vivek wrote:
> Adding Jason to this discussion.
> 
> Hi Bobby,
> 
> > Subject: [PATCH net-next v2 2/4] udmabuf: emit one sg entry per pinned
> > folio
> > 
> > From: Bobby Eshleman <bobbyeshleman@meta.com>
> > 
> > get_sg_table() emitted one PAGE_SIZE sg entry per page even when the
> > underlying folio was larger.
> > 
> > Instead, walk folios[] and emit one sg entry per folio. When folios
> We have recently merged a patch (that will make it into 7.2) from Jason that
> replaced sg_set_folio() with sg_alloc_table_from_pages() in udmabuf driver:
> https://gitlab.freedesktop.org/drm/tip/-/commit/5bf888673e0dda5a53220fa0c4956271a46c353c
> 
> Since you are relying on sg_set_folio(), the core argument against its usage
> in udmabuf is that it doesn't work well with offsets > PAGE_SIZE, resulting
> in a malformed scatterlist. Not sure if this can be fixed easily.
> 
> > represent large pages (as is for MFD_HUGETLB), each sg entry is a large
> > page. Normal PAGE_SIZE sg tables are unchanged.
> > 
> > This is helpful for importers like net/core/devmem that expect dmabuf sg
> IMO, udmabuf needs to detect whether importers can handle segments that
> are > PAGE_SIZE and set the entries appropriately. Please look into how the
> GPU drivers and other dmabuf exporters/importers handle this situation, so
> that we can adopt best practices to address this issue.
> 
> Thanks,
> Vivek

Hey Vivek,

It sounds looks like that patch might solve my problem. I'll apply and
troubleshoot from there.

Thanks!

Best,
Bobby

> 
> > entries to be size and length aligned. Prior to this patch udmabuf
> > handed over one PAGE_SIZE sg entry per page, so devmem only saw
> > PAGE_SIZE chunks regardless of the underlying folio size.
> > 
> > dma_map_sgtable() does not always merge contiguous pages for us, so we
> > do this internally before exporting.
> > 
> > Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
> > ---
> >  drivers/dma-buf/udmabuf.c | 52
> > ++++++++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 47 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
> > index 94b8ecb892bb..9b751dd98b12 100644
> > --- a/drivers/dma-buf/udmabuf.c
> > +++ b/drivers/dma-buf/udmabuf.c
> > @@ -141,26 +141,68 @@ static void vunmap_udmabuf(struct dma_buf
> > *buf, struct iosys_map *map)
> >  	vm_unmap_ram(map->vaddr, ubuf->pagecount);
> >  }
> > 
> > +/* Return the number of contiguous pages backed by the folio at @i.
> > + * A udmabuf may map only part of a folio, or reference the same folio
> > + * in multiple non-contiguous runs, so folio_nr_pages() can't be used.
> > + */
> > +static pgoff_t udmabuf_folio_nr_pages(struct udmabuf *ubuf, pgoff_t i)
> > +{
> > +	struct folio *f = ubuf->folios[i];
> > +	pgoff_t j;
> > +
> > +	for (j = 1; i + j < ubuf->pagecount; j++) {
> > +		if (ubuf->folios[i + j] != f)
> > +			break;
> > +		/* Same folio, but not a sequential offset within it. */
> > +		if (ubuf->offsets[i + j] != ubuf->offsets[i] + j * PAGE_SIZE)
> > +			break;
> > +	}
> > +	return j;
> > +}
> > +
> > +/* Count the contiguous folio runs in @ubuf, one sg entry per run.
> > + *
> > + * Coalescing folios into a single sg entry up front lets importers actually
> > + * see large chunks. We can't rely on dma_map_sgtable() to do this for us
> > as
> > + * the dma_map_direct() path preserves the input scatterlist lengths
> > verbatim.
> > + */
> > +static unsigned int udmabuf_sg_nents(struct udmabuf *ubuf)
> > +{
> > +	unsigned int nents = 0;
> > +	pgoff_t i;
> > +
> > +	for (i = 0; i < ubuf->pagecount; i += udmabuf_folio_nr_pages(ubuf,
> > i))
> > +		nents++;
> > +	return nents;
> > +}
> > +
> >  static struct sg_table *get_sg_table(struct device *dev, struct dma_buf
> > *buf,
> >  				     enum dma_data_direction direction)
> >  {
> >  	struct udmabuf *ubuf = buf->priv;
> > -	struct sg_table *sg;
> >  	struct scatterlist *sgl;
> > -	unsigned int i = 0;
> > +	struct sg_table *sg;
> > +	pgoff_t i, run;
> > +	unsigned int nents;
> >  	int ret;
> > 
> > +	nents = udmabuf_sg_nents(ubuf);
> > +
> >  	sg = kzalloc_obj(*sg);
> >  	if (!sg)
> >  		return ERR_PTR(-ENOMEM);
> > 
> > -	ret = sg_alloc_table(sg, ubuf->pagecount, GFP_KERNEL);
> > +	ret = sg_alloc_table(sg, nents, GFP_KERNEL);
> >  	if (ret < 0)
> >  		goto err_alloc;
> > 
> > -	for_each_sg(sg->sgl, sgl, ubuf->pagecount, i)
> > -		sg_set_folio(sgl, ubuf->folios[i], PAGE_SIZE,
> > +	sgl = sg->sgl;
> > +	for (i = 0; i < ubuf->pagecount; i += run) {
> > +		run = udmabuf_folio_nr_pages(ubuf, i);
> > +		sg_set_folio(sgl, ubuf->folios[i], run << PAGE_SHIFT,
> >  			     ubuf->offsets[i]);
> > +		sgl = sg_next(sgl);
> > +	}
> > 
> >  	ret = dma_map_sgtable(dev, sg, direction, 0);
> >  	if (ret < 0)
> > 
> > --
> > 2.53.0-Meta
> 

^ permalink raw reply

* Re: [PATCH net] octeontx2-af: Validate NIX maximum LFs correctly
From: Subbaraya Sundeep @ 2026-06-16 20:51 UTC (permalink / raw)
  To: Simon Horman
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2, netdev, linux-kernel
In-Reply-To: <20260603170759.GG3766816@horms.kernel.org>

Hi Simon,

On 2026-06-03 at 22:37:59, Simon Horman (horms@kernel.org) wrote:
> On Wed, Jun 03, 2026 at 05:59:46PM +0100, Simon Horman wrote:
> > From: 'Simon Horman' <horms@kernel.org>
> > 
> > This is an AI-generated review of your patch. The human sending this
> > email has considered the AI review valid, or at least plausible.
> > Full review at: https://sashiko.dev
> 
> Sorry, my tooling isn't helping me much today.
> There was supposed to be some commentary in my previous email.
> Let me add it.
> 
> > ---
> > octeontx2-af: Validate NIX maximum LFs correctly
> > 
> > This patch changes the validation condition for setting the maximum
> > number of NIX LFs via devlink. It replaces the check for assigned MCAM
> > entries with a check for assigned NIX LFs to determine if resizing
> > is allowed.
> > 
> > > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
> > > index 6494a9ee2f0dd..8ba4b5ba9d2c7 100644
> > > --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
> > > +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_devlink.c
> > [ ... ]
> > > @@ -1521,11 +1523,18 @@ static int rvu_af_dl_nix_maxlf_validate(struct devlink *devlink, u32 id,
> > >  	/* Do not allow user to modify maximum NIX LFs while mcam entries
> > >  	 * have already been assigned.
> > >  	 */
> > > -	mcam = &rvu->hw->mcam;
> > > -	if (mcam->bmap_fcnt < mcam->bmap_entries) {
> > > -		NL_SET_ERR_MSG_MOD(extack,
> > > -				   "mcam entries have already been assigned, can't resize");
> > > -		return -EPERM;
> > 
> 
> The issue below looks like it should be looked at in the context of this
> patch.
> 
> > [Severity: High]
> > Does removing the check for MCAM entries allow the devlink parameter
> > set operation to silently destroy active MCAM tracking state?
> > 
> > The .set callback unconditionally calls npc_mcam_rsrcs_deinit() and
> > npc_mcam_rsrcs_init(), which frees and reallocates the mcam->bmap. Since
> > MCAM entries can be allocated independently of NIX LFs (e.g. during
> > rvu_switch_enable() or default switchdev/PTP rules), could this cause
> > resource leaks and use-after-free bugs when untracked rules are later
> > accessed?
>
Some background:
 Say NIX hardware supports 128 LFs then
 1. 128 MCAM entries are always reserved by AF to receive unicast traffic to each LF
 2. PFs/VFs request a LF attachment during their probe via mbox message to AF
 3. devlink command to set NIX LFs is added later upon requests from customers/users
    so that if their system uses only 2 PFs then they can use the rest 126 mcam
    entries for additional traffic steering
 4. The only execption is user has to load PF and VF drivers as modules after
    setting NIX LFs as first command via AF devlink.
> > 
> > > +	blkaddr = rvu_get_next_nix_blkaddr(rvu, blkaddr);
> > > +	while (blkaddr) {
> > > +		block = &rvu->hw->block[blkaddr];
> 
> While the one below seems like it can be assessed in the context
> of possible follow-up.
> 
> > [Severity: High]
> > This is a pre-existing issue, but does this locklessly read the resource
> > states?
> > 
> > It appears that the devlink validate and set callbacks execute without
> > holding the driver's resource locks (rvu->rsrc_lock or mcam->lock).
> > Could a concurrent mailbox request allocate an LF immediately after this
> > lockless validation check succeeds, leading to a race where the .set
> > callback frees structures that are actively being used?
> > 
As from above background this is not possible.
> > > +
> > > +		free_lfs = rvu_rsrc_free_count(&block->lf);
> > > +		if (free_lfs != block->lf.max) {
> > > +			NL_SET_ERR_MSG_MOD(extack,
> > > +					   "mcam entries have already been assigned, can't resize");
> > 
> 
> The one below seems less important. But perhaps it would be good
> to address if there is a v2 of this patch.
> 
> > [Severity: Medium]
> > Since the validation logic was updated to evaluate NIX LF allocations,
> > should this error message and the comment above be updated as well? The
> > message still states that "mcam entries have already been assigned", which
> > might confuse users since it's actually evaluating NIX LFs.
> > 
Sure will modify the message

Thanks,
Sundeep

> > > +			return -EPERM;
> > > +		}
> > > +
> > > +		blkaddr = rvu_get_next_nix_blkaddr(rvu, blkaddr);
> > >  	}

^ permalink raw reply

* Re: [PATCH v1 bpf-next 0/2] bpf: bpf_redirect_peer egress redirection
From: Jordan Rife @ 2026-06-16 20:49 UTC (permalink / raw)
  To: Paul Chaignon
  Cc: bpf, netdev, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Stanislav Fomichev
In-Reply-To: <ajAXF8Nvg91xU4f2@mail.gmail.com>

> IMO, calling it BPF_F_EGRESS would be less confusing. It's a shame we
> can't have the same flag API between bpf_redirect() and
> bpf_redirect_peer(), but this is creating inconsistent semantics for
> the terms egress/ingress across the two helpers.

Yeah, one annoying thing about BPF_F_EGRESS is that it would only
apply to bpf_redirect_peer, so you still have inconsistencies across
helpers. Perhaps this is less weird than having BPF_F_INGRESS perform
an egress redirection though.

Jordan

^ permalink raw reply

* Re: [PATCH net v2 1/2] iov_iter: export iov_iter_restore
From: Jens Axboe @ 2026-06-16 20:47 UTC (permalink / raw)
  To: Octavian Purdila, netdev
  Cc: Alexander Viro, Andrew Morton, Arseniy Krasnov, David S. Miller,
	Eric Dumazet, Eugenio Pérez, Jakub Kicinski, Jason Wang, kvm,
	linux-block, linux-fsdevel, linux-kernel, Michael S. Tsirkin,
	Paolo Abeni, Simon Horman, Stefan Hajnoczi, Stefano Garzarella,
	virtualization, Xuan Zhuo
In-Reply-To: <20260613000953.467473-2-tavip@google.com>

On 6/12/26 6:09 PM, Octavian Purdila wrote:
> Export iov_iter_restore so that it can be used by modules.
> 
> This is needed by the virtio vsock transport (which can be built as a
> module) to restore the msg_iter state when transmission fails.
> 
> Signed-off-by: Octavian Purdila <tavip@google.com>
> ---
>  lib/iov_iter.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 243662af1af73..067e745f9ef53 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -1491,6 +1491,7 @@ void iov_iter_restore(struct iov_iter *i, struct iov_iter_state *state)
>  		i->__iov -= state->nr_segs - i->nr_segs;
>  	i->nr_segs = state->nr_segs;
>  }
> +EXPORT_SYMBOL(iov_iter_restore);

I don't have a problem exporting this to modules, but any new export
should be _GPL. So please change it to that.

-- 
Jens Axboe

^ permalink raw reply

* [PATCH] net: faraday: ftmac100: convert to devm resource management
From: Jack Lee @ 2026-06-16 20:32 UTC (permalink / raw)
  To: davem, kuba
  Cc: andrew+netdev, edumazet, pabeni, netdev, linux-kernel, Jack Lee

Replace manual resource management with device-managed alternatives:
- alloc_etherdev() -> devm_alloc_etherdev()
- request_mem_region() + ioremap() -> devm_platform_ioremap_resource()

This simplifies error handling by removing manual cleanup in error
paths and the remove function, and eliminates the risk of resource
leaks.

Signed-off-by: Jack Lee <skunkolee@gmail.com>
---
 drivers/net/ethernet/faraday/ftmac100.c | 47 +++++--------------------
 1 file changed, 9 insertions(+), 38 deletions(-)

diff --git a/drivers/net/ethernet/faraday/ftmac100.c b/drivers/net/ethernet/faraday/ftmac100.c
index 5803a382f0ba..adb318925f44 100644
--- a/drivers/net/ethernet/faraday/ftmac100.c
+++ b/drivers/net/ethernet/faraday/ftmac100.c
@@ -49,7 +49,6 @@ struct ftmac100_descs {
 };
 
 struct ftmac100 {
-	struct resource *res;
 	void __iomem *base;
 	int irq;
 
@@ -1137,11 +1136,9 @@ static int ftmac100_probe(struct platform_device *pdev)
 		return irq;
 
 	/* setup net_device */
-	netdev = alloc_etherdev(sizeof(*priv));
-	if (!netdev) {
-		err = -ENOMEM;
-		goto err_alloc_etherdev;
-	}
+	netdev = devm_alloc_etherdev(&pdev->dev, sizeof(*priv));
+	if (!netdev)
+		return -ENOMEM;
 
 	SET_NETDEV_DEV(netdev, &pdev->dev);
 	netdev->ethtool_ops = &ftmac100_ethtool_ops;
@@ -1150,7 +1147,7 @@ static int ftmac100_probe(struct platform_device *pdev)
 
 	err = platform_get_ethdev_address(&pdev->dev, netdev);
 	if (err == -EPROBE_DEFER)
-		goto defer_get_mac;
+		return err;
 
 	platform_set_drvdata(pdev, netdev);
 
@@ -1165,20 +1162,9 @@ static int ftmac100_probe(struct platform_device *pdev)
 	netif_napi_add(netdev, &priv->napi, ftmac100_poll);
 
 	/* map io memory */
-	priv->res = request_mem_region(res->start, resource_size(res),
-				       dev_name(&pdev->dev));
-	if (!priv->res) {
-		dev_err(&pdev->dev, "Could not reserve memory region\n");
-		err = -ENOMEM;
-		goto err_req_mem;
-	}
-
-	priv->base = ioremap(res->start, resource_size(res));
-	if (!priv->base) {
-		dev_err(&pdev->dev, "Failed to ioremap ethernet registers\n");
-		err = -EIO;
-		goto err_ioremap;
-	}
+	priv->base = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(priv->base))
+		return PTR_ERR(priv->base);
 
 	priv->irq = irq;
 
@@ -1208,32 +1194,17 @@ static int ftmac100_probe(struct platform_device *pdev)
 	return 0;
 
 err_register_netdev:
-	iounmap(priv->base);
-err_ioremap:
-	release_resource(priv->res);
-err_req_mem:
 	netif_napi_del(&priv->napi);
-defer_get_mac:
-	free_netdev(netdev);
-err_alloc_etherdev:
 	return err;
 }
 
 static void ftmac100_remove(struct platform_device *pdev)
 {
-	struct net_device *netdev;
-	struct ftmac100 *priv;
-
-	netdev = platform_get_drvdata(pdev);
-	priv = netdev_priv(netdev);
+	struct net_device *netdev = platform_get_drvdata(pdev);
+	struct ftmac100 *priv = netdev_priv(netdev);
 
 	unregister_netdev(netdev);
-
-	iounmap(priv->base);
-	release_resource(priv->res);
-
 	netif_napi_del(&priv->napi);
-	free_netdev(netdev);
 }
 
 static const struct of_device_id ftmac100_of_ids[] = {
-- 
2.54.0


^ permalink raw reply related

* Re: [PATCH] ice: retry reading NVM if admin queue returns EBUSY
From: kernel test robot @ 2026-06-16 20:18 UTC (permalink / raw)
  To: Robert Malz, anthony.l.nguyen, przemyslaw.kitszel
  Cc: oe-kbuild-all, intel-wired-lan, netdev
In-Reply-To: <20260616104521.1545053-1-robert.malz@canonical.com>

Hi Robert,

kernel test robot noticed the following build errors:

[auto build test ERROR on tnguy-next-queue/dev-queue]
[also build test ERROR on tnguy-net-queue/dev-queue linus/master v7.1 next-20260616]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Robert-Malz/ice-retry-reading-NVM-if-admin-queue-returns-EBUSY/20260616-185349
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue.git dev-queue
patch link:    https://lore.kernel.org/r/20260616104521.1545053-1-robert.malz%40canonical.com
patch subject: [PATCH] ice: retry reading NVM if admin queue returns EBUSY
config: x86_64-rhel-9.4 (https://download.01.org/0day-ci/archive/20260616/202606162237.EIrFZKip-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260616/202606162237.EIrFZKip-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202606162237.EIrFZKip-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/net/ethernet/intel/ice/ice_nvm.c: In function 'ice_read_flat_nvm':
>> drivers/net/ethernet/intel/ice/ice_nvm.c:101:58: error: 'ICE_AQ_RC_EBUSY' undeclared (first use in this function); did you mean 'LIBIE_AQ_RC_EBUSY'?
     101 |                         if (hw->adminq.sq_last_status != ICE_AQ_RC_EBUSY ||
         |                                                          ^~~~~~~~~~~~~~~
         |                                                          LIBIE_AQ_RC_EBUSY
   drivers/net/ethernet/intel/ice/ice_nvm.c:101:58: note: each undeclared identifier is reported only once for each function it appears in


vim +101 drivers/net/ethernet/intel/ice/ice_nvm.c

    48	
    49	/**
    50	 * ice_read_flat_nvm - Read portion of NVM by flat offset
    51	 * @hw: pointer to the HW struct
    52	 * @offset: offset from beginning of NVM
    53	 * @length: (in) number of bytes to read; (out) number of bytes actually read
    54	 * @data: buffer to return data in (sized to fit the specified length)
    55	 * @read_shadow_ram: if true, read from shadow RAM instead of NVM
    56	 *
    57	 * Reads a portion of the NVM, as a flat memory space. This function correctly
    58	 * breaks read requests across Shadow RAM sectors and ensures that no single
    59	 * read request exceeds the maximum 4KB read for a single AdminQ command.
    60	 *
    61	 * Returns a status code on failure. Note that the data pointer may be
    62	 * partially updated if some reads succeed before a failure.
    63	 */
    64	int
    65	ice_read_flat_nvm(struct ice_hw *hw, u32 offset, u32 *length, u8 *data,
    66			  bool read_shadow_ram)
    67	{
    68		u32 inlen = *length;
    69		u32 bytes_read = 0;
    70		int retry_cnt = 0;
    71		bool last_cmd;
    72		int status;
    73	
    74		*length = 0;
    75	
    76		/* Verify the length of the read if this is for the Shadow RAM */
    77		if (read_shadow_ram && ((offset + inlen) > (hw->flash.sr_words * 2u))) {
    78			ice_debug(hw, ICE_DBG_NVM, "NVM error: requested offset is beyond Shadow RAM limit\n");
    79			return -EINVAL;
    80		}
    81	
    82		do {
    83			u32 read_size, sector_offset;
    84	
    85			/* ice_aq_read_nvm cannot read more than 4KB at a time.
    86			 * Additionally, a read from the Shadow RAM may not cross over
    87			 * a sector boundary. Conveniently, the sector size is also
    88			 * 4KB.
    89			 */
    90			sector_offset = offset % ICE_AQ_MAX_BUF_LEN;
    91			read_size = min_t(u32, ICE_AQ_MAX_BUF_LEN - sector_offset,
    92					  inlen - bytes_read);
    93	
    94			last_cmd = !(bytes_read + read_size < inlen);
    95	
    96			status = ice_aq_read_nvm(hw, ICE_AQC_NVM_START_POINT,
    97						 offset, read_size,
    98						 data + bytes_read, last_cmd,
    99						 read_shadow_ram, NULL);
   100			if (status) {
 > 101				if (hw->adminq.sq_last_status != ICE_AQ_RC_EBUSY ||
   102				    retry_cnt > ICE_SQ_SEND_MAX_EXECUTE)
   103					break;
   104				ice_debug(hw, ICE_DBG_NVM,
   105					  "NVM read EBUSY error, retry %d\n",
   106					  retry_cnt + 1);
   107				last_cmd = false;
   108				ice_release_nvm(hw);
   109				msleep(ICE_SQ_SEND_DELAY_TIME_MS);
   110				status = ice_acquire_nvm(hw, ICE_RES_READ);
   111				if (status)
   112					break;
   113				retry_cnt++;
   114			} else {
   115				bytes_read += read_size;
   116				offset += read_size;
   117				retry_cnt = 0;
   118			}
   119		} while (!last_cmd);
   120	
   121		*length = bytes_read;
   122		return status;
   123	}
   124	

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply

* Re: [PATCH net-next 2/2] udp: convert udp_lib_getsockopt to sockopt_t
From: David Laight @ 2026-06-16 20:16 UTC (permalink / raw)
  To: Breno Leitao
  Cc: Stanislav Fomichev, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Willem de Bruijn, Shuah Khan, netdev,
	linux-kernel, linux-kselftest, kernel-team
In-Reply-To: <ajF4Odi_L28LdIXC@gmail.com>

On Tue, 16 Jun 2026 09:22:52 -0700
Breno Leitao <leitao@debian.org> wrote:

> On Fri, Jun 12, 2026 at 07:10:15PM -0700, Stanislav Fomichev wrote:
> > On 06/12, Breno Leitao wrote:  
> 
> > >  int udp_lib_getsockopt(struct sock *sk, int level, int optname,
> > > -		       char __user *optval, int __user *optlen)
> > > +		       sockopt_t *opt)
> > >  {
> > >  	struct udp_sock *up = udp_sk(sk);
> > >  	int val, len;
> > >  
> > > -	if (get_user(len, optlen))
> > > -		return -EFAULT;  
> > 
> > [..]
> >   
> > > -	if (len < 0)
> > > -		return -EINVAL;  
> > 
> > I see this part now in sockopt_init_user, but you mention that it's a
> > transitional helper. When we drop it, will we loose this <0 check?
> > Maybe keep `if ((int)opt->optlen < 0))` here for backwards
> > compatibility?  
> 
> Good idea. I will do it and respin (once net-next reopens).

The best place for the negative length check is in the syscall wrapper code.
Pass an unsigned length through to all the protocol code.
No need to require every function to do the test.

Note that the length check was actually broken in many protocols
going way back well before git.
There has pretty much always been an unsigned min() check that converted
negative values to small(ish) positive ones before the check for it being
negative.
(That predates min() being a #define.)

The recent change to actually error optlen < 0 might actually have broken
some applications that passed uninitialised stack that was always negative!

-- David

> 
> Thanks for the review,
> --breno
> 


^ permalink raw reply

* Landlock: LANDLOCK_ACCESS_NET_CONNECT_TCP bypass via TCP Fast Open
From: Bryam Vargas @ 2026-06-16 20:16 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Günther Noack, Matthieu Buffet, Paul Moore, Eric Dumazet,
	Neal Cardwell, linux-security-module, netdev, linux-kernel

Hello Mickaël, and Landlock folks,

A task confined by a Landlock ruleset that handles
LANDLOCK_ACCESS_NET_CONNECT_TCP and is denied connecting to a given port can
still establish a TCP connection to that port by using TCP Fast Open, i.e.
sendto(fd, ..., MSG_FASTOPEN, &dst, dstlen) on a fresh stream socket. The
network-egress confinement for TCP connect is silently bypassed.

Affected
--------
Any kernel with CONFIG_SECURITY_LANDLOCK=y and Landlock enabled that supports
the TCP network access rights (Landlock ABI >= 4, since Linux 6.7). Confirmed by
source inspection on mainline (v7.1-rc7) and reproduced on Linux 7.0.11
(Landlock ABI 8). No CONFIG beyond Landlock + IPv4/IPv6 TCP; TCP Fast Open client
is enabled by the per-netns default (net.ipv4.tcp_fastopen has TFO_CLIENT_ENABLE
set), so no sysctl change and no setsockopt are required.

Root cause
----------
LANDLOCK_ACCESS_NET_CONNECT_TCP is enforced only by the socket_connect LSM hook
(hook_socket_connect -> current_check_access_socket). security_socket_connect()
has exactly one call site in the tree, net/socket.c (the connect(2) syscall).

TCP Fast Open performs an implicit connect inside sendmsg:

  tcp_sendmsg_locked()            net/ipv4/tcp.c  (MSG_FASTOPEN branch)
   -> tcp_sendmsg_fastopen()      net/ipv4/tcp.c
   -> __inet_stream_connect(..., is_sendmsg=1)  net/ipv4/af_inet.c
   -> sk->sk_prot->connect()      net/ipv4/af_inet.c  -> tcp_v4_connect()

This path establishes the connection to the address taken from msg_name but
never calls security_socket_connect(). The only LSM hook fired on the sendmsg
path is security_socket_sendmsg(), and Landlock registers no socket_sendmsg
hook, so LANDLOCK_ACCESS_NET_CONNECT_TCP is never re-checked. __inet_stream_connect()
itself carries no LSM hook (only the cgroup-BPF pre_connect, a different
mechanism).

Notably the kernel already mediates the analogous AF_UNIX implicit-connect on the
send path via the unix_may_send hook, which Landlock does register
(hook_unix_may_send) -- so the sendmsg-implies-connect pattern is recognized, but
the TCP Fast Open case has no equivalent coverage. The MPTCP fast-open path
(mptcp_sendmsg_fastopen -> __inet_stream_connect) is a second producer of the
same unmediated connect (by source inspection; not separately reproduced).

Reproducer
----------
A self-contained, fully unprivileged PoC is available on request. It forks an
unconfined TFO-capable loopback listener, then in a child applies a Landlock
ruleset handling LANDLOCK_ACCESS_NET_CONNECT_TCP with no allow rule
(landlock_create_ruleset() with handled_access_net =
LANDLOCK_ACCESS_NET_CONNECT_TCP, no landlock_add_rule(), then
landlock_restrict_self(); every TCP connect is denied) and tries the forbidden
port two ways:

  (1) connect(fd, &dst)                 -> -EACCES   (Landlock enforces CONNECT_TCP)
  (2) sendto(fd2, buf, len, MSG_FASTOPEN, &dst, dstlen)
                                        -> succeeds; the listener accepts the
                                           connection and reads the payload.

Observed on Linux 7.0.11 (Landlock ABI 8):

  [1] connect(2)            -> ret=-1 errno=13 (Permission denied)
  [2] sendto(MSG_FASTOPEN)  -> ret=14 errno=0 (OK/queued)
  [+] listener ACCEPTED the confined child's connection; payload="..."

connect(2) to the port is denied while sendto(MSG_FASTOPEN) reaches the identical
port and delivers data.

Impact
------
A sandbox that uses LANDLOCK_ACCESS_NET_CONNECT_TCP to restrict outbound TCP
(e.g. to keep a confined component from reaching an internal service or a
metadata endpoint) can be escaped by an unprivileged, self-confined task with no
CAP and no namespace transition -- for any destination port, since the
implicit-connect path never consults the connect hook regardless of address (the
run above shows one port). It is an integrity
bypass of the network-confinement property; no memory safety is involved.
I score it CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:C/C:N/I:H/A:N (6.5 Medium) -- the
confined task escapes the policy authority that defined its sandbox, a scope
change; 5.5 if you treat the Landlock boundary as the same authority (S:U).

Note on the in-flight UDP series
--------------------------------
The "landlock: Add UDP access control support" series (v5, Matthieu Buffet,
https://lore.kernel.org/r/20260611162107.49278-3-matthieu@buffet.re) adds a
socket_sendmsg hook, hook_socket_sendmsg(), but it returns 0 for non-UDP
sockets:

    if (sk_is_udp(sock->sk))
            access_request = LANDLOCK_ACCESS_NET_CONNECT_SEND_UDP;
    else
            return 0;

so a TCP socket using MSG_FASTOPEN still bypasses LANDLOCK_ACCESS_NET_CONNECT_TCP
even after that series lands. It may be most convenient to fix this there.

Suggested direction
-------------------
Re-check LANDLOCK_ACCESS_NET_CONNECT_TCP on the implicit-connect path: either have
the socket_sendmsg hook evaluate CONNECT_TCP for stream sockets when the call
performs an implicit connect (mirroring the AF_UNIX unix_may_send handling), or
place the check inside __inet_stream_connect() so a single chokepoint covers
connect(2), TCP Fast Open, and the MPTCP fast-open sibling.

I am happy to send a patch for this if you would like me to.

Best regards,

Bryam Vargas
Independent security researcher, HEXLAB S.A.S., Cali, Colombia
hexlabsecurity@proton.me


^ permalink raw reply

* Re: [PATCH v27 4/5] sfc: obtain and map cxl range using devm_cxl_probe_mem
From: Dan Williams (nvidia) @ 2026-06-16 19:51 UTC (permalink / raw)
  To: Alejandro Lucero Palau, Dan Williams (nvidia),
	alejandro.lucero-palau, linux-cxl, netdev, edward.cree, davem,
	kuba, pabeni, edumazet, dave.jiang
In-Reply-To: <50d8e423-8248-4e26-901b-010d14d22e67@amd.com>

Alejandro Lucero Palau wrote:
> 
> On 6/10/26 14:56, Alejandro Lucero Palau wrote:
> >
> > On 6/10/26 07:10, Alejandro Lucero Palau wrote:
> >>
> >> On 6/10/26 00:30, Dan Williams (nvidia) wrote:
> >>> alejandro.lucero-palau@ wrote:
> >>>> From: Alejandro Lucero <alucerop@amd.com>
> >>>>
> >>>> Use core API for safely obtain the CXL range linked to an HDM 
> >>>> committed
> >>>> by the BIOS. Map such a range for being used as the ctpio buffer.
> >>>>
> >>>> A potential user space action through sysfs unbinding or core cxl
> >>>> modules remove will trigger sfc driver device detachment, with that 
> >>>> case
> >>>> not racing with this mapping as this is done during driver probe and
> >>>> therefore protected with device lock against those user space actions.
> >>>>
> >>>> Signed-off-by: Alejandro Lucero <alucerop@amd.com>
> >>>> ---
> >>>>   drivers/net/ethernet/sfc/efx.c     |  1 +
> >>>>   drivers/net/ethernet/sfc/efx_cxl.c | 24 ++++++++++++++++++++++++
> >>>>   drivers/net/ethernet/sfc/efx_cxl.h |  3 +++
> >>>>   3 files changed, 28 insertions(+)
> >>>>
> >>>> diff --git a/drivers/net/ethernet/sfc/efx.c 
> >>>> b/drivers/net/ethernet/sfc/efx.c
> >>>> index 90ccbe310386..578054c21e79 100644
> >>>> --- a/drivers/net/ethernet/sfc/efx.c
> >>>> +++ b/drivers/net/ethernet/sfc/efx.c
> >>>> @@ -984,6 +984,7 @@ static void efx_pci_remove(struct pci_dev 
> >>>> *pci_dev)
> >>>>       efx_fini_io(efx);
> >>>>         probe_data = container_of(efx, struct efx_probe_data, efx);
> >>>> +    efx_cxl_exit(probe_data);
> >>>>         pci_dbg(efx->pci_dev, "shutdown successful\n");
> >>>>   diff --git a/drivers/net/ethernet/sfc/efx_cxl.c 
> >>>> b/drivers/net/ethernet/sfc/efx_cxl.c
> >>>> index 4d55c08cf2a1..d5766a40e2cf 100644
> >>>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
> >>>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
> >>>> @@ -18,6 +18,7 @@ int efx_cxl_init(struct efx_probe_data *probe_data)
> >>>>   {
> >>>>       struct efx_nic *efx = &probe_data->efx;
> >>>>       struct pci_dev *pci_dev = efx->pci_dev;
> >>>> +    struct range cxl_pio_range;
> >>>>       struct efx_cxl *cxl;
> >>>>       u16 dvsec;
> >>>>       int rc;
> >>>> @@ -75,9 +76,32 @@ int efx_cxl_init(struct efx_probe_data *probe_data)
> >>>>           return -ENODEV;
> >>>>       }
> >>>>   +    cxl->cxlmd = devm_cxl_probe_mem(&cxl->cxlds, &cxl_pio_range);
> >>>> +    if (IS_ERR(cxl->cxlmd)) {
> >>>> +        pci_err(pci_dev, "CXL accel memdev creation failed\n");
> >>>> +        return PTR_ERR(cxl->cxlmd);
> >>>> +    }
> >>>> +
> >>>> +    cxl->ctpio_cxl = ioremap_wc(cxl_pio_range.start,
> >>>> +                    range_len(&cxl_pio_range));
> >>>> +    if (!cxl->ctpio_cxl) {
> >>>> +        pci_err(pci_dev, "CXL ioremap region (%pra) failed\n",
> >>>> +            &cxl_pio_range);
> >>>> +        return -ENOMEM;
> >>> Dave caught the iounmap leak, but another concern is since you want to
> >>> continue operation if efx_cxl_init() fails then you probably also want
> >>> to release the successful attachment to the CXL domain if this happens.
> >>
> >>
> >> I will do that.
> >>
> >
> > Looking at this issue, I think an error when creating the memdev or 
> > during the region attach triggers the memdev removal, but ...
> >
> >
> >>
> >>> Minor since something else is likely to fail if ioremap is not 
> >>> reliable.
> >
> >
> > .. if we want to specifically do that with an unlikely (but possible) 
> > ioremap error something else needs to be exported like 
> > cxl_memdev_unregister(). Are you happy with that approach?
> >
> 
> I have just tested with this:
> 
> +void cxl_memdev_remove(void *_cxlmd)
> +{
> +       struct cxl_memdev *cxlmd = _cxlmd;
> +       struct device *dev = &cxlmd->dev;
> +
> +       devm_remove_action_nowarn(cxlmd->cxlds->dev, cxl_memdev_unregister,
> +                                 cxlmd);
> +
> +       cdev_device_del(&cxlmd->cdev, dev);
> +       cxl_memdev_shutdown(dev);
> +       put_device(dev);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_memdev_remove, "CXL");
> 
> 
> only called if the ioremap fails.
> 
> 
> Please, let me know if you like this approach before sending another 
> version.

A devres group can automatically cleanup after devm_cxl_memdev_probe()
in the error path with no new exports needed from the CXL core.
Something like:

        void *group = devres_open_group(cxl->cxlds.dev, NULL, GFP_KERNEL);
        int rc = 0;

        if (!group)
                return -ENOMEM;
        
        cxl->cxlmd = devm_cxl_probe_mem(&cxl->cxlds, &cxl_pio_range);
        if (IS_ERR(cxl->cxlmd)) {
                pci_err(pci_dev, "CXL accel memdev creation failed\n");
                rc = PTR_ERR(cxl->cxlmd);
                goto out;
        }

        cxl->ctpio_cxl =
                ioremap_wc(cxl_pio_range.start, range_len(&cxl_pio_range));
        if (!cxl->ctpio_cxl) {
                pci_err(pci_dev, "CXL ioremap region (%pra) failed\n",
                        &cxl_pio_range);
                rc = -ENOMEM;
        }

out:
        if (rc)
                devres_release_group(group);
        else
                devres_remove_group(group);
        return rc;

^ permalink raw reply

* Re: [PATCH net-next v6 2/2] dinghai: add hardware register access and PCI? capability scanning
From: Andrew Lunn @ 2026-06-16 19:49 UTC (permalink / raw)
  To: han.junyang
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, horms, linux-kernel,
	netdev, ran.ming, han.chengfei, zhang.yanze
In-Reply-To: <20260616213550502kLzSZF2DiQyd9Dl0Dv0Gz@zte.com.cn>

> +int zxdh_pf_common_cfg_init(struct dh_core_dev *dh_dev)
> +{
> +	struct zxdh_pf_device *pf_dev = dh_dev->priv;
> +	struct pci_dev *pdev = dh_dev->pdev;
> +	int common;
> +
> +	/* check for a common config: if not, use legacy mode (bar 0). */
> +	common = zxdh_pf_pci_find_capability(pdev, ZXDH_PCI_CAP_COMMON_CFG,
> +					     IORESOURCE_IO | IORESOURCE_MEM,
> +					     &pf_dev->modern_bars);
> +	if (common == 0) {
> +		dev_err(dh_dev->device,
> +			"missing capabilities %i, leaving for legacy driver\n",
> +			common);

That looks double odd. Normally you would use !common. Also, you know
common is 0, so why use "%i", when it could be just '0'.

> +int zxdh_pf_notify_cfg_init(struct dh_core_dev *dh_dev)
> +{
> +	struct zxdh_pf_device *pf_dev = dh_dev->priv;
> +	struct pci_dev *pdev = dh_dev->pdev;
> +	u32 notify_length;
> +	u32 notify_offset;
> +	int notify;
> +
> +	/* If common is there, these should be too... */
> +	notify = zxdh_pf_pci_find_capability(pdev, ZXDH_PCI_CAP_NOTIFY_CFG,
> +					     IORESOURCE_IO | IORESOURCE_MEM,
> +					     &pf_dev->modern_bars);
> +	if (notify == 0) {
> +		dev_err(dh_dev->device, "missing capabilities %i\n", notify);
> +		return -EINVAL;
> +	}
> +

Same again.

    Andrew

---
pw-bot: cr

^ permalink raw reply

* Re: [PATCH net-next v6 1/2] dinghai: add ZTE network driver support
From: Andrew Lunn @ 2026-06-16 19:39 UTC (permalink / raw)
  To: han.junyang
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, horms, linux-kernel,
	netdev, ran.ming, han.chengfei, zhang.yanze
In-Reply-To: <20260616213057452I2KLm3mVgWYl_SUTy_YYS@zte.com.cn>

> +++ b/drivers/net/ethernet/zte/dinghai/en_pf.h
> +static inline void *dh_core_alloc_priv(struct dh_core_dev *dh_dev,
> +				       size_t size)
> +{
> +	void *priv = kzalloc(size, GFP_KERNEL);
> +
> +	if (priv)
> +		dh_dev->priv = priv;
> +	return priv;
> +}
> +
> +static inline void dh_core_free_priv(struct dh_core_dev *dh_dev)
> +{
> +	kfree(dh_dev->priv);
> +}

It is unusual for these to be inline functions in a header. Why is
this?

	Andrew

^ permalink raw reply

* Re: [PATCH bpf-next 1/2] bpf: Guard conntrack opts error writes
From: Alexei Starovoitov @ 2026-06-16 19:36 UTC (permalink / raw)
  To: Yiyang Chen, bpf, netfilter-devel
  Cc: pablo, fw, phil, davem, edumazet, kuba, pabeni, horms, andrii,
	eddyz87, ast, daniel, memxor, martin.lau, song, yonghong.song,
	jolsa, emil, shuah, kartikey406, coreteam, netdev, linux-kernel,
	linux-kselftest
In-Reply-To: <70aeec0ab762aebe65129cf6052e132c7329edc2.1781586477.git.chenyy23@mails.tsinghua.edu.cn>

On Mon Jun 15, 2026 at 10:42 PM PDT, Yiyang Chen wrote:
> The conntrack lookup and allocation kfuncs take an opts pointer
> together with an opts__sz argument. The verifier checks only the memory
> range described by opts__sz, but the wrappers unconditionally write
> opts->error whenever the internal lookup or allocation helper returns an
> error.
>
> For an invalid size smaller than the end of opts->error, that write can
> land outside the verifier-checked range. Keep returning NULL for invalid
> arguments, but only report the error through opts->error when the
> supplied size includes the field.
>
> This preserves error reporting for the supported 12-byte and 16-byte
> layouts, and for other invalid sizes that still include opts->error.
>
> Fixes: b4c2b9593a1c ("net/netfilter: Add unstable CT lookup helpers for XDP and TC-BPF")
> Fixes: d7e79c97c00c ("net: netfilter: Add kfuncs to allocate and insert CT")
> Signed-off-by: Yiyang Chen <chenyy23@mails.tsinghua.edu.cn>
> ---
>  net/netfilter/nf_conntrack_bpf.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/net/netfilter/nf_conntrack_bpf.c b/net/netfilter/nf_conntrack_bpf.c
> index 40c261cd0af38..3c182024ec509 100644
> --- a/net/netfilter/nf_conntrack_bpf.c
> +++ b/net/netfilter/nf_conntrack_bpf.c
> @@ -65,6 +65,11 @@ enum {
>  	NF_BPF_CT_OPTS_SZ = 16,
>  };
>  
> +static bool bpf_ct_opts_has_error(u32 opts_len)
> +{
> +	return opts_len >= offsetofend(struct bpf_ct_opts, error);
> +}
> +
>  static int bpf_nf_ct_tuple_parse(struct bpf_sock_tuple *bpf_tuple,
>  				 u32 tuple_len, u8 protonum, u8 dir,
>  				 struct nf_conntrack_tuple *tuple)
> @@ -298,7 +303,8 @@ bpf_xdp_ct_alloc(struct xdp_md *xdp_ctx, struct bpf_sock_tuple *bpf_tuple,
>  	nfct = __bpf_nf_ct_alloc_entry(dev_net(ctx->rxq->dev), bpf_tuple, tuple__sz,
>  				       opts, opts__sz, 10);
>  	if (IS_ERR(nfct)) {
> -		opts->error = PTR_ERR(nfct);
> +		if (bpf_ct_opts_has_error(opts__sz))
> +			opts->error = PTR_ERR(nfct);

LLMs have no taste.

Above two lines could have been one helper
   bpf_ct_opts_set_error(opts, opts__sz, PTR_ERR(nfct));

Or we can do a step further and simplify the code more.
Turn this:
   if (IS_ERR(nfct)) {
           opts->error = PTR_ERR(nfct);
           return NULL;
   }
   return (struct nf_conn___init *)nfct;
into:
   return (struct nf_conn___init *)bpf_ct_opts_result(opts, opts__sz, nfct);

static void *bpf_ct_opts_result(struct bpf_ct_opts *opts, u32 opts__sz, void *ret)
{
  if (!IS_ERR(ret))
    return ret;
  if (opts__sz >= offsetofend(struct bpf_ct_opts, error))
    opts->error = PTR_ERR(ret);
  return NULL;
}

This kind of small improvements should be obvious to any human developer.
Please do NOT send us patches straight out of LLM.
Review it first and think how to improve it.

pw-bot: cr

^ permalink raw reply

* Re: [PATCH v27 3/5] cxl/sfc: Initialize dpa without a mailbox
From: Dan Williams (nvidia) @ 2026-06-16 19:35 UTC (permalink / raw)
  To: Alejandro Lucero Palau, Dan Williams (nvidia),
	alejandro.lucero-palau, linux-cxl, netdev, edward.cree, davem,
	kuba, pabeni, edumazet, dave.jiang
  Cc: Dan Williams, Ben Cheatham, Jonathan Cameron
In-Reply-To: <17b68fb1-768e-49f6-884d-49e0952621b8@amd.com>

Alejandro Lucero Palau wrote:
> 
> On 6/10/26 00:24, Dan Williams (nvidia) wrote:
> > alejandro.lucero-palau@ wrote:
> >> From: Alejandro Lucero <alucerop@amd.com>
> >>
> >> Type3 relies on mailbox CXL_MBOX_OP_IDENTIFY command for initializing
> >> memdev state params which end up being used for DPA initialization.
> >>
> >> Allow a Type2 driver to initialize DPA simply by giving the size of its
> >> volatile hardware partition.
> >>
> >> Move related functions to memdev.
> > The code movement is not strictly necessary. Just add cxl_set_capacity()
> > and we can consider a move later if mbox.o and memdev.o are ever not
> > both included in cxl_core.o by default.
> 
> 
> I think it is the right thing to do as the new function uses add_part() 
> (moved) and the other add_part() client is the other function moved, 
> cxl_mem_dpa_fetch().
> 
> Note cxl_mem_get_partition() used by cxl_mem_dpa_fetch() is the one 
> working with mbox commands and it remains in the same place inside 
> core/mbox.c and the only cxl_mem_dpa_fetch() client is cxl/pci.c
> 
> 
> This was reviewed and accepted so no reason for not doing it ...

Sure, I am ok to let it go as is.

^ permalink raw reply

* Re: [Bug] incompatibility between 'e1000e' and Aruba AOS-CX switches (too small inter-packet gap)
From: Andrew Lunn @ 2026-06-16 19:34 UTC (permalink / raw)
  To: Philippe Andersson; +Cc: netdev, Ludovic Calmant, Fabian Noël
In-Reply-To: <457d1617-bd7f-44c5-a9af-7ba8aa9250f4@iba-group.com>

> A support ticket has already been opened with Aruba, but it's unclear at
> this stage that the problem is on their side.

How easy is it to reproduce? Can you run a git bisect from the last
known good kernel version to the first known bad version?

      Andrew

^ permalink raw reply

* [PATCH v1 net-next] ipv4: fib_rule: Move fib4_rules_exit() to ->exit().
From: Kuniyuki Iwashima @ 2026-06-16 19:13 UTC (permalink / raw)
  To: David Ahern, Ido Schimmel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev,
	syzbot+965506b59a2de0b6905c

syzbot reported use-after-free of net->ipv4.rules_ops. [0]

It can be reproduced with these commands:

  while true; do
  	ip netns add ns1
  	ip -n ns1 link set dev lo up
  	ip -n ns1 address add 192.0.2.1/24 dev lo
  	ip -n ns1 link add name dummy1 up type dummy
  	ip -n ns1 address add 198.51.100.1/24 dev dummy1
  	ip -n ns1 rule add ipproto tcp sport 12345 table 12345
  	ip -n ns1 fou add port 5555 ipproto 47 local 192.0.2.1 peer 198.51.100.2 peer_port 54321
  	ip netns del ns1
  done

The cited commit moved fib4_rules_exit() earlier to ->exit_rtnl(),
but the kernel socket destroyed in ->exit() could eventually reach
__fib_lookup().

I left fib4_rules_exit() in ->exit_rtnl() because fib4_rule_delete()
calls fib_unmerge(), which requires RTNL.

However, when ->delete() is called, ->configure() has already been
called, thus fib_unmerge() in ->delete() has no effect.

Let's remove fib_unmerge() in fib4_rule_delete() and move
fib4_rules_exit() to ->exit().

Many thanks to Ido Schimmel for providing the nice repro very quickly.

Note that we can make fib_rules_ops.delete() return void once
net-next opens.

[0]:
BUG: KASAN: slab-use-after-free in fib_rules_lookup+0x15e/0xeb0 net/core/fib_rules.c:321
Read of size 8 at addr ffff88804ec4c680 by task kworker/u8:21/12641

CPU: 0 UID: 0 PID: 12641 Comm: kworker/u8:21 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/09/2026
Workqueue: netns cleanup_net
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 print_address_description+0x55/0x1e0 mm/kasan/report.c:378
 print_report+0x58/0x70 mm/kasan/report.c:482
 kasan_report+0x117/0x150 mm/kasan/report.c:595
 fib_rules_lookup+0x15e/0xeb0 net/core/fib_rules.c:321
 __fib_lookup+0x106/0x210 net/ipv4/fib_rules.c:96
 ip_route_output_key_hash_rcu+0x294/0x2720 net/ipv4/route.c:2811
 ip_route_output_key_hash+0x18d/0x2a0 net/ipv4/route.c:2702
 __ip_route_output_key include/net/route.h:169 [inline]
 ip_route_output_flow+0x2a/0x150 net/ipv4/route.c:2929
 ip4_datagram_release_cb+0x89d/0xbe0 net/ipv4/datagram.c:118
 release_sock+0x206/0x260 net/core/sock.c:3861
 inet_shutdown+0x2b1/0x390 net/ipv4/af_inet.c:950
 udp_tunnel_sock_release+0x6d/0x80 net/ipv4/udp_tunnel_core.c:197
 fou_release net/ipv4/fou_core.c:562 [inline]
 fou_exit_net+0x17d/0x1f0 net/ipv4/fou_core.c:1230
 ops_exit_list net/core/net_namespace.c:199 [inline]
 ops_undo_list+0x43d/0x8d0 net/core/net_namespace.c:252
 cleanup_net+0x572/0x810 net/core/net_namespace.c:702
 process_one_work kernel/workqueue.c:3314 [inline]
 process_scheduled_works+0xa8e/0x14e0 kernel/workqueue.c:3397
 worker_thread+0xa47/0xfb0 kernel/workqueue.c:3478
 kthread+0x389/0x470 kernel/kthread.c:436
 ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>

Fixes: 759923cf03b0 ("ipv4: fib: Convert fib_net_exit_batch() to ->exit_rtnl().")
Reported-by: syzbot+965506b59a2de0b6905c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6a315824.b0403584.28d0ff.0000.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv4/fib_frontend.c | 10 ++++++----
 net/ipv4/fib_rules.c    | 11 ++---------
 2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index c7d1f31650d7..42212970d735 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -1612,10 +1612,6 @@ static void ip_fib_net_exit(struct net *net)
 			fib_free_table(tb);
 		}
 	}
-
-#ifdef CONFIG_IP_MULTIPLE_TABLES
-	fib4_rules_exit(net);
-#endif
 }
 
 static int __net_init fib_net_init(struct net *net)
@@ -1652,6 +1648,9 @@ static int __net_init fib_net_init(struct net *net)
 	ip_fib_net_exit(net);
 	rtnl_net_unlock(net);
 
+#ifdef CONFIG_IP_MULTIPLE_TABLES
+	fib4_rules_exit(net);
+#endif
 	kfree(net->ipv4.fib_table_hash);
 	fib4_notifier_exit(net);
 	goto out;
@@ -1671,6 +1670,9 @@ static void __net_exit fib_net_exit_rtnl(struct net *net,
 
 static void __net_exit fib_net_exit(struct net *net)
 {
+#ifdef CONFIG_IP_MULTIPLE_TABLES
+	fib4_rules_exit(net);
+#endif
 	kfree(net->ipv4.fib_table_hash);
 	fib4_notifier_exit(net);
 	fib4_semantics_exit(net);
diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c
index 51f0193092f0..e068a5bace73 100644
--- a/net/ipv4/fib_rules.c
+++ b/net/ipv4/fib_rules.c
@@ -352,24 +352,17 @@ static int fib4_rule_configure(struct fib_rule *rule, struct sk_buff *skb,
 static int fib4_rule_delete(struct fib_rule *rule)
 {
 	struct net *net = rule->fr_net;
-	int err;
-
-	/* split local/main if they are not already split */
-	err = fib_unmerge(net);
-	if (err)
-		goto errout;
 
 #ifdef CONFIG_IP_ROUTE_CLASSID
 	if (((struct fib4_rule *)rule)->tclassid)
 		atomic_dec(&net->ipv4.fib_num_tclassid_users);
 #endif
-	net->ipv4.fib_has_custom_rules = true;
 
 	if (net->ipv4.fib_rules_require_fldissect &&
 	    fib_rule_requires_fldissect(rule))
 		net->ipv4.fib_rules_require_fldissect--;
-errout:
-	return err;
+
+	return 0;
 }
 
 static int fib4_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh,
-- 
2.54.0.1136.gdb2ca164c4-goog


^ permalink raw reply related

* [PATCH 6.1] net: gro: don't merge zcopy skbs
From: Alexander Martyniuk @ 2026-06-16 22:00 UTC (permalink / raw)
  To: stable, Greg Kroah-Hartman
  Cc: Alexander Martyniuk, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Sasha Levin, Sabrina Dubroca,
	Hyunwoo Kim, Pavel Begunkov, netdev, linux-kernel, lvc-project,
	Huzaifa Sidhpurwala, Willem de Bruijn

From: Sabrina Dubroca <sd@queasysnail.net>

commit 4db79a322db8c97f7b73b8a347395ef4d685eb40 upstream.

skb_gro_receive() can currently copy frags between the source and GRO
skb, without checking the zerocopy status, and in particular the
SKBFL_MANAGED_FRAG_REFS flag.

When SKBFL_MANAGED_FRAG_REFS is set, the skb doesn't hold a reference
on the pages in shinfo->frags. Appending those frags to another skb's
frags without fixing up the page refcount can lead to UAF.

When either the last skb in the GRO chain (the one we would append
frags to) or the source skb is zerocopy, don't merge the skbs.

Fixes: 753f1ca4e1e5 ("net: introduce managed frags infrastructure")
Reported-by: Huzaifa Sidhpurwala <huzaifas@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/c3b7f906bbfcbdfd7b4fa9d6c18a438870df85be.1779307748.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexander Martyniuk <alexevgmart@gmail.com>
---
Backport fix for CVE-2026-46323
 net/core/gro.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/gro.c b/net/core/gro.c
index ea6571c01faa..c5a9733d929a 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -171,6 +171,9 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
 	if (p->pp_recycle != skb->pp_recycle)
 		return -ETOOMANYREFS;
 
+	if (skb_zcopy(p) || skb_zcopy(skb))
+		return -ETOOMANYREFS;
+
 	/* pairs with WRITE_ONCE() in netif_set_gro_max_size() */
 	gro_max_size = READ_ONCE(p->dev->gro_max_size);
 
-- 
2.30.2


^ permalink raw reply related

* [PATCH] octeontx2-pf: Clear stats of all resources when freeing resources
From: Subbaraya Sundeep @ 2026-06-16 19:00 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2, rkannoth
  Cc: netdev, linux-kernel, Subbaraya Sundeep
In-Reply-To: <1781636420-19816-1-git-send-email-sbhatta@marvell.com>

When all MCS resources mapped to a PF are being freed then clear
stats of all those resources too.

Fixes: 815debbbf7b5 ("octeontx2-pf: mcs: Clear stats before freeing resource")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
---
 drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
index 4d3a7f4be962..9524d38f1582 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
@@ -182,6 +182,7 @@ static void cn10k_mcs_free_rsrc(struct otx2_nic *pfvf, enum mcs_direction dir,
 	clear_req->id = hw_rsrc_id;
 	clear_req->type = type;
 	clear_req->dir = dir;
+	clear_req->all = all;
 
 	req = otx2_mbox_alloc_msg_mcs_free_resources(mbox);
 	if (!req)
-- 
2.48.1


^ permalink raw reply related

* [net PATCH v2] octeontx2-af: mcs: Fix unsupported secy stats read
From: Subbaraya Sundeep @ 2026-06-16 19:00 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2, rkannoth
  Cc: netdev, linux-kernel, Subbaraya Sundeep

From: Geetha sowjanya <gakula@marvell.com>

Secy control stats counter doesn't exist for CNF10KB platform.
Skip reading this respective register for CNF10KB silicon while
fetching secy stats.

Fixes: 9312150af8da ("octeontx2-af: cn10k: mcs: Support for stats collection")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
---
v2 changes:
 Fixed AI review by modifying debugfs also NOT to access
 Secy control stats counter

 drivers/net/ethernet/marvell/octeontx2/af/mcs.c         | 6 +++---
 drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/mcs.c b/drivers/net/ethernet/marvell/octeontx2/af/mcs.c
index c1775bd01c2b..a07e0b3d8d00 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/mcs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/mcs.c
@@ -120,13 +120,13 @@ void mcs_get_rx_secy_stats(struct mcs *mcs, struct mcs_secy_stats *stats, int id
 	reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYUNTAGGEDX(id);
 	stats->pkt_untaged_cnt = mcs_reg_read(mcs, reg);
 
-	reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYCTLX(id);
-	stats->pkt_ctl_cnt = mcs_reg_read(mcs, reg);
-
 	if (mcs->hw->mcs_blks > 1) {
 		reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYNOTAGX(id);
 		stats->pkt_notag_cnt = mcs_reg_read(mcs, reg);
+		return;
 	}
+	reg = MCSX_CSE_RX_MEM_SLAVE_INPKTSSECYCTLX(id);
+	stats->pkt_ctl_cnt = mcs_reg_read(mcs, reg);
 }
 
 void mcs_get_flowid_stats(struct mcs *mcs, struct mcs_flowid_stats *stats,
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
index fa461489acdd..ca2704b188a5 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
@@ -482,10 +482,11 @@ static int rvu_dbg_mcs_rx_secy_stats_display(struct seq_file *filp, void *unused
 		seq_printf(filp, "secy%d: Tagged ctrl pkts: %lld\n", secy_id,
 			   stats.pkt_tagged_ctl_cnt);
 		seq_printf(filp, "secy%d: Untaged pkts: %lld\n", secy_id, stats.pkt_untaged_cnt);
-		seq_printf(filp, "secy%d: Ctrl pkts: %lld\n", secy_id, stats.pkt_ctl_cnt);
 		if (mcs->hw->mcs_blks > 1)
 			seq_printf(filp, "secy%d: pkts notag: %lld\n", secy_id,
 				   stats.pkt_notag_cnt);
+		else
+			seq_printf(filp, "secy%d: Ctrl pkts: %lld\n", secy_id, stats.pkt_ctl_cnt);
 	}
 	mutex_unlock(&mcs->stats_lock);
 	return 0;
-- 
2.48.1


^ permalink raw reply related

* [net PATCH v2] octeontx2-pf: mcs: Fix mcs resources free on PF shutdown
From: Subbaraya Sundeep @ 2026-06-16 19:00 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
	bbhushan2, rkannoth
  Cc: netdev, linux-kernel, Subbaraya Sundeep
In-Reply-To: <1781636420-19816-1-git-send-email-sbhatta@marvell.com>

From: Geetha sowjanya <gakula@marvell.com>

On PF shutdown, the current driver free mcs hardware
resources though mcs resources are not allocated to it.
This patch checks the mcs resources status and if resources
are allocated then only sends mailbox message to free them.

Fixes: c54ffc73601c ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
---
v2 changes:
 Fixed AI review so that pfvf->macsec_cfg is freed correctly

 .../net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c    | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
index 2cc1bdfd9b2e..4d3a7f4be962 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c
@@ -1776,11 +1776,16 @@ int cn10k_mcs_init(struct otx2_nic *pfvf)
 
 void cn10k_mcs_free(struct otx2_nic *pfvf)
 {
+	struct cn10k_mcs_cfg *cfg = pfvf->macsec_cfg;
+
 	if (!test_bit(CN10K_HW_MACSEC, &pfvf->hw.cap_flag))
 		return;
 
-	cn10k_mcs_free_rsrc(pfvf, MCS_TX, MCS_RSRC_TYPE_SECY, 0, true);
-	cn10k_mcs_free_rsrc(pfvf, MCS_RX, MCS_RSRC_TYPE_SECY, 0, true);
+	if (!list_empty(&cfg->txsc_list)) {
+		cn10k_mcs_free_rsrc(pfvf, MCS_TX, MCS_RSRC_TYPE_SECY, 0, true);
+		cn10k_mcs_free_rsrc(pfvf, MCS_RX, MCS_RSRC_TYPE_SECY, 0, true);
+	}
+
 	kfree(pfvf->macsec_cfg);
 	pfvf->macsec_cfg = NULL;
 }
-- 
2.48.1


^ permalink raw reply related

* Re: [PATCH bpf] bpf, sockmap: fix lock inversion between stab->lock and sk_callback_lock
From: Sechang Lim @ 2026-06-16 18:40 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: John Fastabend, Jakub Sitnicki, Alexei Starovoitov,
	Daniel Borkmann, Eric Dumazet, Kuniyuki Iwashima, Paolo Abeni,
	Willem de Bruijn, David S . Miller, Jakub Kicinski, Simon Horman,
	netdev, bpf, linux-kernel
In-Reply-To: <575a878e-6d37-4337-a821-4883d3dd3a63@linux.dev>

On Tue, Jun 16, 2026 at 06:17:48PM +0800, Jiayuan Chen wrote:
>
>On 6/16/26 5:11 PM, Sechang Lim wrote:
>>sock_map_update_common() and __sock_map_delete() hold stab->lock and call
>>sock_map_unref() -> sock_map_del_link() under it. sock_map_del_link() takes
>>sk_callback_lock for write to stop the strparser and verdict, giving the
>>lock order stab->lock -> sk_callback_lock.
>>
>>The opposite order comes from an SK_SKB stream parser. On RX,
>>sk_psock_strp_data_ready() holds sk_callback_lock for read while running
>>the parser. The verdict redirects the skb to egress, where a sched_cls
>
>
>The commit message is wrong. A verdict does not redirect to egress
>synchronously — sk_psock_skb_redirect() only queues the skb and
>schedule_delayed_work()s sk_psock_backlog, so egress runs in workqueue
>context, not under sk_callback_lock.
>

Thanks, you're right. it's the inline ACK, not the redirect. Sorry for
the misleading changelog, I'll fix it in v2.

>
>>program calls bpf_map_delete_elem() on a sockmap, which takes stab->lock:
>>
>>   WARNING: possible circular locking dependency detected
>>   7.1.0-rc6 Not tainted
>>   ------------------------------------------------------
>>   syz.9.8824 is trying to acquire lock:
>>   (&stab->lock){+.-.}-{3:3}, at: __sock_map_delete net/core/sock_map.c:421
>>   but task is already holding lock:
>>   (clock-AF_INET){++.-}-{3:3}, at: sk_psock_strp_data_ready net/core/skmsg.c:1173
>>
>>   -> #1 (clock-AF_INET){++.-}-{3:3}:
>>          _raw_write_lock_bh
>>          sock_map_del_link net/core/sock_map.c:167
>>          sock_map_unref net/core/sock_map.c:184
>>          sock_map_update_common net/core/sock_map.c:509
>>          sock_map_update_elem_sys net/core/sock_map.c:588
>>          map_update_elem kernel/bpf/syscall.c:1805
>>
>>   -> #0 (&stab->lock){+.-.}-{3:3}:
>>          _raw_spin_lock_bh
>>          __sock_map_delete net/core/sock_map.c:421
>>          sock_map_delete_elem net/core/sock_map.c:452
>>          bpf_prog_06044d24140080b6
>>          tcx_run net/core/dev.c:4451
>>          sch_handle_egress net/core/dev.c:4541
>>          __dev_queue_xmit net/core/dev.c:4808
>>          ...
>>          tcp_bpf_strp_read_sock net/ipv4/tcp_bpf.c:701
>
>
>I guess it is an ACK. What is the actual purpose of a sched_cls 
>program calling
>
>sockmap delete on the TX path of an ACK? If there is no real use case 
>for it, this is
>
>just broken BPF usage, not a kernel bug worth this change.
>
>

I don't have a real use case for that exact program. But the verifier
allows sockmap delete from tc, and it deadlocks when the strparser's
socket is concurrently removed from the same map. The fix only moves
sock_map_unref() out from under stab->lock.

Best,
Sechang

^ permalink raw reply

* [PATCH nf-next v3 0/4] netfilter: replace u_int*_t with kernel int types
From: Carlos Grillet @ 2026-06-16 18:29 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Florian Westphal, Phil Sutter
  Cc: netfilter-devel, coreteam, linux-kernel, netdev

Hi all! This is my first patch series of many, I hope :)
I'd like to start contributing by helping out with janitor work,
standardizing code and cleaning up.

This patch series replaces POSIX u_int8_t/u_int16_t with the preferred
kernel types u8/u16 across several netfilter files.

u_int*_t appears in many other files, but I wanted to keep this series
small, unless advised otherwise.

No functional changes.

Changes in v3:
- dropping changes to nf_log and xt_DSCP (need deeper understanding of the
  subsystem before converting these correctly)
- link to v2: https://lore.kernel.org/all/20260615133835.51273-1-carlos@carlosgrillet.me

Changes in v2:
- addresses sashiko comments https://sashiko.dev/#/patchset/32368
  - nf_sockopt: update function prototypes and struct definitions
  - nf_log: update the corresponding function declarations and the
    nf_logfn typedef
- link to v1: https://lore.kernel.org/all/20260612125146.75672-1-carlos@carlosgrillet.me

Carlos Grillet (4):
  netfilter: nf_nat_ftp: replace u_int16_t with u16
  netfilter: nf_nat_irc: replace u_int16_t with u16
  netfilter: nf_sockopt: replace u_int8_t with u8
  netfilter: xt_TCPOPTSTRIP: replace u_int8_t and u_int16_t with u8 and u16

 include/linux/netfilter.h      | 6 +++---
 net/netfilter/nf_nat_ftp.c     | 2 +-
 net/netfilter/nf_nat_irc.c     | 2 +-
 net/netfilter/nf_sockopt.c     | 8 ++++----
 net/netfilter/xt_TCPOPTSTRIP.c | 8 ++++----
 5 files changed, 13 insertions(+), 13 deletions(-)

-- 
2.54.0


^ permalink raw reply

* [PATCH nf-next v3 2/4] netfilter: nf_nat_irc: replace u_int16_t with u16
From: Carlos Grillet @ 2026-06-16 18:29 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Florian Westphal, Phil Sutter
  Cc: netfilter-devel, coreteam, netdev, linux-kernel
In-Reply-To: <20260616182948.96865-1-carlos@carlosgrillet.me>

Replace POSIX u_int16_t with preferred kernel type u16

No functional changes.

Signed-off-by: Carlos Grillet <carlos@carlosgrillet.me>
---
 net/netfilter/nf_nat_irc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nf_nat_irc.c b/net/netfilter/nf_nat_irc.c
index 19c4fcc60c50..14b79cb0171b 100644
--- a/net/netfilter/nf_nat_irc.c
+++ b/net/netfilter/nf_nat_irc.c
@@ -39,7 +39,7 @@ static unsigned int help(struct sk_buff *skb,
 	char buffer[sizeof("4294967296 65635")];
 	struct nf_conn *ct = exp->master;
 	union nf_inet_addr newaddr;
-	u_int16_t port;
+	u16 port;
 
 	/* Reply comes from server. */
 	newaddr = ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u3;
-- 
2.54.0


^ permalink raw reply related

* [PATCH nf-next v3 4/4] netfilter: xt_TCPOPTSTRIP: replace u_int8_t and u_int16_t with u8 and u16
From: Carlos Grillet @ 2026-06-16 18:29 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Florian Westphal, Phil Sutter
  Cc: netfilter-devel, coreteam, netdev, linux-kernel
In-Reply-To: <20260616182948.96865-1-carlos@carlosgrillet.me>

Replace POSIX u_int8_t/u_int16_t with preferred kernel types u8/u16

No functional changes.

Signed-off-by: Carlos Grillet <carlos@carlosgrillet.me>
---
 net/netfilter/xt_TCPOPTSTRIP.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/xt_TCPOPTSTRIP.c b/net/netfilter/xt_TCPOPTSTRIP.c
index 93f064306901..265d21697847 100644
--- a/net/netfilter/xt_TCPOPTSTRIP.c
+++ b/net/netfilter/xt_TCPOPTSTRIP.c
@@ -16,7 +16,7 @@
 #include <linux/netfilter/x_tables.h>
 #include <linux/netfilter/xt_TCPOPTSTRIP.h>
 
-static inline unsigned int optlen(const u_int8_t *opt, unsigned int offset)
+static inline unsigned int optlen(const u8 *opt, unsigned int offset)
 {
 	/* Beware zero-length options: make finite progress */
 	if (opt[offset] <= TCPOPT_NOP || opt[offset+1] == 0)
@@ -33,8 +33,8 @@ tcpoptstrip_mangle_packet(struct sk_buff *skb,
 	const struct xt_tcpoptstrip_target_info *info = par->targinfo;
 	struct tcphdr *tcph, _th;
 	unsigned int optl, i, j;
-	u_int16_t n, o;
-	u_int8_t *opt;
+	u16 n, o;
+	u8 *opt;
 	int tcp_hdrlen;
 
 	/* This is a fragment, no TCP header is available */
@@ -97,7 +97,7 @@ tcpoptstrip_tg6(struct sk_buff *skb, const struct xt_action_param *par)
 {
 	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
 	int tcphoff;
-	u_int8_t nexthdr;
+	u8 nexthdr;
 	__be16 frag_off;
 
 	nexthdr = ipv6h->nexthdr;
-- 
2.54.0


^ permalink raw reply related

* [PATCH nf-next v3 3/4] netfilter: nf_sockopt: replace u_int8_t with u8
From: Carlos Grillet @ 2026-06-16 18:29 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Florian Westphal, Phil Sutter
  Cc: netfilter-devel, coreteam, linux-kernel, netdev
In-Reply-To: <20260616182948.96865-1-carlos@carlosgrillet.me>

Replace POSIX u_int8_t with preferred kernel type u8, update prototype
and struct definition.

No functional changes.

Signed-off-by: Carlos Grillet <carlos@carlosgrillet.me>
---
 include/linux/netfilter.h  | 6 +++---
 net/netfilter/nf_sockopt.c | 8 ++++----
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index efbbfa770d66..91b68bdba3f5 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -181,7 +181,7 @@ static inline void nf_hook_state_init(struct nf_hook_state *p,
 struct nf_sockopt_ops {
 	struct list_head list;
 
-	u_int8_t pf;
+	u8 pf;
 
 	/* Non-inclusive ranges: use 0/0/NULL to never get called. */
 	int set_optmin;
@@ -357,9 +357,9 @@ NF_HOOK_LIST(uint8_t pf, unsigned int hook, struct net *net, struct sock *sk,
 }
 
 /* Call setsockopt() */
-int nf_setsockopt(struct sock *sk, u_int8_t pf, int optval, sockptr_t opt,
+int nf_setsockopt(struct sock *sk, u8 pf, int optval, sockptr_t opt,
 		  unsigned int len);
-int nf_getsockopt(struct sock *sk, u_int8_t pf, int optval, char __user *opt,
+int nf_getsockopt(struct sock *sk, u8 pf, int optval, char __user *opt,
 		  int *len);
 
 struct flowi;
diff --git a/net/netfilter/nf_sockopt.c b/net/netfilter/nf_sockopt.c
index 34afcd03b6f6..19a1d028158c 100644
--- a/net/netfilter/nf_sockopt.c
+++ b/net/netfilter/nf_sockopt.c
@@ -59,8 +59,8 @@ void nf_unregister_sockopt(struct nf_sockopt_ops *reg)
 }
 EXPORT_SYMBOL(nf_unregister_sockopt);
 
-static struct nf_sockopt_ops *nf_sockopt_find(struct sock *sk, u_int8_t pf,
-		int val, int get)
+static struct nf_sockopt_ops *nf_sockopt_find(struct sock *sk, u8 pf,
+					      int val, int get)
 {
 	struct nf_sockopt_ops *ops;
 
@@ -89,7 +89,7 @@ static struct nf_sockopt_ops *nf_sockopt_find(struct sock *sk, u_int8_t pf,
 	return ops;
 }
 
-int nf_setsockopt(struct sock *sk, u_int8_t pf, int val, sockptr_t opt,
+int nf_setsockopt(struct sock *sk, u8 pf, int val, sockptr_t opt,
 		  unsigned int len)
 {
 	struct nf_sockopt_ops *ops;
@@ -104,7 +104,7 @@ int nf_setsockopt(struct sock *sk, u_int8_t pf, int val, sockptr_t opt,
 }
 EXPORT_SYMBOL(nf_setsockopt);
 
-int nf_getsockopt(struct sock *sk, u_int8_t pf, int val, char __user *opt,
+int nf_getsockopt(struct sock *sk, u8 pf, int val, char __user *opt,
 		  int *len)
 {
 	struct nf_sockopt_ops *ops;
-- 
2.54.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox