* [PATCH net-next v3] netdevice: define and allocate &net_device _properly_
@ 2024-07-10 11:30 Breno Leitao
2024-07-10 13:44 ` Simon Horman
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Breno Leitao @ 2024-07-10 11:30 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Kees Cook, Gustavo A. R. Silva
Cc: keescook, horms, linux-hardening, Alexander Lobakin,
Przemek Kitszel, Jiri Pirko, Sebastian Andrzej Siewior,
Daniel Borkmann, Lorenzo Bianconi, Johannes Berg,
open list:NETWORKING [GENERAL], open list
From: Alexander Lobakin <aleksander.lobakin@intel.com>
In fact, this structure contains a flexible array at the end, but
historically its size, alignment etc., is calculated manually.
There are several instances of the structure embedded into other
structures, but also there's ongoing effort to remove them and we
could in the meantime declare &net_device properly.
Declare the array explicitly, use struct_size() and store the array
size inside the structure, so that __counted_by() can be applied.
Don't use PTR_ALIGN(), as SLUB itself tries its best to ensure the
allocated buffer is aligned to what the user expects.
Also, change its alignment from %NETDEV_ALIGN to the cacheline size
as per several suggestions on the netdev ML.
bloat-o-meter for vmlinux:
free_netdev 445 440 -5
netdev_freemem 24 - -24
alloc_netdev_mqs 1481 1450 -31
On x86_64 with several NICs of different vendors, I was never able to
get a &net_device pointer not aligned to the cacheline size after the
change.
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kees Cook <kees@kernel.org>
---
Changelog
v3:
* Fix kernel-doc documentation for the new fields (Simon)
v2:
* Rebased Alexander's patch on top of f750dfe825b90 ("ethtool: provide
customized dim profile management").
* Removed the ALIGN() of SMP_CACHE_BYTES for sizeof_priv.
v1:
* https://lore.kernel.org/netdev/90fd7cd7-72dc-4df6-88ec-fbc8b64735ad@intel.com
include/linux/netdevice.h | 15 +++++++++------
net/core/dev.c | 30 ++++++------------------------
net/core/net-sysfs.c | 2 +-
3 files changed, 16 insertions(+), 31 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 93558645c6d0..9ba9552e4af6 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1819,7 +1819,8 @@ enum netdev_reg_state {
* @priv_flags: Like 'flags' but invisible to userspace,
* see if.h for the definitions
* @gflags: Global flags ( kept as legacy )
- * @padded: How much padding added by alloc_netdev()
+ * @priv_len: Size of the ->priv flexible array
+ * @priv: Flexible array containing private data
* @operstate: RFC2863 operstate
* @link_mode: Mapping policy to operstate
* @if_port: Selectable AUI, TP, ...
@@ -2199,10 +2200,10 @@ struct net_device {
unsigned short neigh_priv_len;
unsigned short dev_id;
unsigned short dev_port;
- unsigned short padded;
+ int irq;
+ u32 priv_len;
spinlock_t addr_list_lock;
- int irq;
struct netdev_hw_addr_list uc;
struct netdev_hw_addr_list mc;
@@ -2406,7 +2407,10 @@ struct net_device {
/** @irq_moder: dim parameters used if IS_ENABLED(CONFIG_DIMLIB). */
struct dim_irq_moder *irq_moder;
-};
+
+ u8 priv[] ____cacheline_aligned
+ __counted_by(priv_len);
+} ____cacheline_aligned;
#define to_net_dev(d) container_of(d, struct net_device, dev)
/*
@@ -2596,7 +2600,7 @@ void dev_net_set(struct net_device *dev, struct net *net)
*/
static inline void *netdev_priv(const struct net_device *dev)
{
- return (char *)dev + ALIGN(sizeof(struct net_device), NETDEV_ALIGN);
+ return (void *)dev->priv;
}
/* Set the sysfs physical device reference for the network logical device
@@ -3127,7 +3131,6 @@ static inline void unregister_netdevice(struct net_device *dev)
int netdev_refcnt_read(const struct net_device *dev);
void free_netdev(struct net_device *dev);
-void netdev_freemem(struct net_device *dev);
void init_dummy_netdev(struct net_device *dev);
struct net_device *netdev_get_xmit_slave(struct net_device *dev,
diff --git a/net/core/dev.c b/net/core/dev.c
index 73e5af6943c3..6ea1d20676fb 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11006,13 +11006,6 @@ void netdev_sw_irq_coalesce_default_on(struct net_device *dev)
}
EXPORT_SYMBOL_GPL(netdev_sw_irq_coalesce_default_on);
-void netdev_freemem(struct net_device *dev)
-{
- char *addr = (char *)dev - dev->padded;
-
- kvfree(addr);
-}
-
/**
* alloc_netdev_mqs - allocate network device
* @sizeof_priv: size of private data to allocate space for
@@ -11032,8 +11025,6 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
unsigned int txqs, unsigned int rxqs)
{
struct net_device *dev;
- unsigned int alloc_size;
- struct net_device *p;
BUG_ON(strlen(name) >= sizeof(dev->name));
@@ -11047,21 +11038,12 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
return NULL;
}
- alloc_size = sizeof(struct net_device);
- if (sizeof_priv) {
- /* ensure 32-byte alignment of private area */
- alloc_size = ALIGN(alloc_size, NETDEV_ALIGN);
- alloc_size += sizeof_priv;
- }
- /* ensure 32-byte alignment of whole construct */
- alloc_size += NETDEV_ALIGN - 1;
-
- p = kvzalloc(alloc_size, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);
- if (!p)
+ dev = kvzalloc(struct_size(dev, priv, sizeof_priv),
+ GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);
+ if (!dev)
return NULL;
- dev = PTR_ALIGN(p, NETDEV_ALIGN);
- dev->padded = (char *)dev - (char *)p;
+ dev->priv_len = sizeof_priv;
ref_tracker_dir_init(&dev->refcnt_tracker, 128, name);
#ifdef CONFIG_PCPU_DEV_REFCNT
@@ -11148,7 +11130,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
free_percpu(dev->pcpu_refcnt);
free_dev:
#endif
- netdev_freemem(dev);
+ kvfree(dev);
return NULL;
}
EXPORT_SYMBOL(alloc_netdev_mqs);
@@ -11203,7 +11185,7 @@ void free_netdev(struct net_device *dev)
/* Compatibility with error handling in drivers */
if (dev->reg_state == NETREG_UNINITIALIZED ||
dev->reg_state == NETREG_DUMMY) {
- netdev_freemem(dev);
+ kvfree(dev);
return;
}
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 4c27a360c294..0e2084ce7b75 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -2028,7 +2028,7 @@ static void netdev_release(struct device *d)
* device is dead and about to be freed.
*/
kfree(rcu_access_pointer(dev->ifalias));
- netdev_freemem(dev);
+ kvfree(dev);
}
static const void *net_namespace(const struct device *d)
--
2.43.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH net-next v3] netdevice: define and allocate &net_device _properly_
2024-07-10 11:30 [PATCH net-next v3] netdevice: define and allocate &net_device _properly_ Breno Leitao
@ 2024-07-10 13:44 ` Simon Horman
2024-07-10 14:01 ` Alexander Lobakin
2024-07-12 1:20 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 5+ messages in thread
From: Simon Horman @ 2024-07-10 13:44 UTC (permalink / raw)
To: Breno Leitao
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Kees Cook, Gustavo A. R. Silva, keescook, linux-hardening,
Alexander Lobakin, Przemek Kitszel, Jiri Pirko,
Sebastian Andrzej Siewior, Daniel Borkmann, Lorenzo Bianconi,
Johannes Berg, open list:NETWORKING [GENERAL], open list
On Wed, Jul 10, 2024 at 04:30:28AM -0700, Breno Leitao wrote:
> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>
> In fact, this structure contains a flexible array at the end, but
> historically its size, alignment etc., is calculated manually.
> There are several instances of the structure embedded into other
> structures, but also there's ongoing effort to remove them and we
> could in the meantime declare &net_device properly.
> Declare the array explicitly, use struct_size() and store the array
> size inside the structure, so that __counted_by() can be applied.
> Don't use PTR_ALIGN(), as SLUB itself tries its best to ensure the
> allocated buffer is aligned to what the user expects.
> Also, change its alignment from %NETDEV_ALIGN to the cacheline size
> as per several suggestions on the netdev ML.
>
> bloat-o-meter for vmlinux:
>
> free_netdev 445 440 -5
> netdev_freemem 24 - -24
> alloc_netdev_mqs 1481 1450 -31
>
> On x86_64 with several NICs of different vendors, I was never able to
> get a &net_device pointer not aligned to the cacheline size after the
> change.
>
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Reviewed-by: Kees Cook <kees@kernel.org>
>
> ---
> Changelog
>
> v3:
> * Fix kernel-doc documentation for the new fields (Simon)
Thanks for the update, LGTM.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net-next v3] netdevice: define and allocate &net_device _properly_
2024-07-10 11:30 [PATCH net-next v3] netdevice: define and allocate &net_device _properly_ Breno Leitao
2024-07-10 13:44 ` Simon Horman
@ 2024-07-10 14:01 ` Alexander Lobakin
2024-07-10 16:29 ` Kees Cook
2024-07-12 1:20 ` patchwork-bot+netdevbpf
2 siblings, 1 reply; 5+ messages in thread
From: Alexander Lobakin @ 2024-07-10 14:01 UTC (permalink / raw)
To: Breno Leitao
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Kees Cook, Gustavo A. R. Silva, keescook, horms, linux-hardening,
Przemek Kitszel, Jiri Pirko, Sebastian Andrzej Siewior,
Daniel Borkmann, Lorenzo Bianconi, Johannes Berg,
open list:NETWORKING [GENERAL], open list
From: Breno Leitao <leitao@debian.org>
Date: Wed, 10 Jul 2024 04:30:28 -0700
> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>
> In fact, this structure contains a flexible array at the end, but
> historically its size, alignment etc., is calculated manually.
> There are several instances of the structure embedded into other
> structures, but also there's ongoing effort to remove them and we
> could in the meantime declare &net_device properly.
> Declare the array explicitly, use struct_size() and store the array
> size inside the structure, so that __counted_by() can be applied.
> Don't use PTR_ALIGN(), as SLUB itself tries its best to ensure the
> allocated buffer is aligned to what the user expects.
> Also, change its alignment from %NETDEV_ALIGN to the cacheline size
> as per several suggestions on the netdev ML.
>
> bloat-o-meter for vmlinux:
>
> free_netdev 445 440 -5
> netdev_freemem 24 - -24
> alloc_netdev_mqs 1481 1450 -31
>
> On x86_64 with several NICs of different vendors, I was never able to
> get a &net_device pointer not aligned to the cacheline size after the
> change.
>
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Reviewed-by: Kees Cook <kees@kernel.org>
You did a great job converting embedded &net_devices, thanks a lot!
I hope SLUB won't return you a non-cacheline-aligned pointer after that
you removed SMP_CACHE_ALIGN(sizeof_priv), right?
Olek
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net-next v3] netdevice: define and allocate &net_device _properly_
2024-07-10 14:01 ` Alexander Lobakin
@ 2024-07-10 16:29 ` Kees Cook
0 siblings, 0 replies; 5+ messages in thread
From: Kees Cook @ 2024-07-10 16:29 UTC (permalink / raw)
To: Alexander Lobakin
Cc: Breno Leitao, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Gustavo A. R. Silva, horms, linux-hardening,
Przemek Kitszel, Jiri Pirko, Sebastian Andrzej Siewior,
Daniel Borkmann, Lorenzo Bianconi, Johannes Berg,
open list:NETWORKING [GENERAL], open list
On Wed, Jul 10, 2024 at 04:01:35PM +0200, Alexander Lobakin wrote:
> From: Breno Leitao <leitao@debian.org>
> Date: Wed, 10 Jul 2024 04:30:28 -0700
>
> > From: Alexander Lobakin <aleksander.lobakin@intel.com>
> >
> > In fact, this structure contains a flexible array at the end, but
> > historically its size, alignment etc., is calculated manually.
> > There are several instances of the structure embedded into other
> > structures, but also there's ongoing effort to remove them and we
> > could in the meantime declare &net_device properly.
> > Declare the array explicitly, use struct_size() and store the array
> > size inside the structure, so that __counted_by() can be applied.
> > Don't use PTR_ALIGN(), as SLUB itself tries its best to ensure the
> > allocated buffer is aligned to what the user expects.
> > Also, change its alignment from %NETDEV_ALIGN to the cacheline size
> > as per several suggestions on the netdev ML.
> >
> > bloat-o-meter for vmlinux:
> >
> > free_netdev 445 440 -5
> > netdev_freemem 24 - -24
> > alloc_netdev_mqs 1481 1450 -31
> >
> > On x86_64 with several NICs of different vendors, I was never able to
> > get a &net_device pointer not aligned to the cacheline size after the
> > change.
> >
> > Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
> > Signed-off-by: Breno Leitao <leitao@debian.org>
> > Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> > Reviewed-by: Eric Dumazet <edumazet@google.com>
> > Reviewed-by: Kees Cook <kees@kernel.org>
>
> You did a great job converting embedded &net_devices, thanks a lot!
>
> I hope SLUB won't return you a non-cacheline-aligned pointer after that
> you removed SMP_CACHE_ALIGN(sizeof_priv), right?
Currently the slab will do power-of-2 alignment (i.e. aligned to the
bucket size), so this should be fine. In the future I'm trying to make
the slab more aware of the required alignments so that it can still
provide needed alignment without having to do maximal (power-of-2)
alignments.
--
Kees Cook
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net-next v3] netdevice: define and allocate &net_device _properly_
2024-07-10 11:30 [PATCH net-next v3] netdevice: define and allocate &net_device _properly_ Breno Leitao
2024-07-10 13:44 ` Simon Horman
2024-07-10 14:01 ` Alexander Lobakin
@ 2024-07-12 1:20 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-07-12 1:20 UTC (permalink / raw)
To: Breno Leitao
Cc: davem, edumazet, kuba, pabeni, kees, gustavoars, keescook, horms,
linux-hardening, aleksander.lobakin, przemyslaw.kitszel, jiri,
bigeasy, daniel, lorenzo, johannes.berg, netdev, linux-kernel
Hello:
This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Wed, 10 Jul 2024 04:30:28 -0700 you wrote:
> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>
> In fact, this structure contains a flexible array at the end, but
> historically its size, alignment etc., is calculated manually.
> There are several instances of the structure embedded into other
> structures, but also there's ongoing effort to remove them and we
> could in the meantime declare &net_device properly.
> Declare the array explicitly, use struct_size() and store the array
> size inside the structure, so that __counted_by() can be applied.
> Don't use PTR_ALIGN(), as SLUB itself tries its best to ensure the
> allocated buffer is aligned to what the user expects.
> Also, change its alignment from %NETDEV_ALIGN to the cacheline size
> as per several suggestions on the netdev ML.
>
> [...]
Here is the summary with links:
- [net-next,v3] netdevice: define and allocate &net_device _properly_
https://git.kernel.org/netdev/net-next/c/13cabc47f8ae
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-07-12 1:20 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-10 11:30 [PATCH net-next v3] netdevice: define and allocate &net_device _properly_ Breno Leitao
2024-07-10 13:44 ` Simon Horman
2024-07-10 14:01 ` Alexander Lobakin
2024-07-10 16:29 ` Kees Cook
2024-07-12 1:20 ` patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).