* [PATCH net-2.6.26 1/2] Shrink size of net_device by filling alignment holes in it.
@ 2008-04-07 16:25 Pavel Emelyanov
2008-04-07 17:04 ` Eric Dumazet
0 siblings, 1 reply; 6+ messages in thread
From: Pavel Emelyanov @ 2008-04-07 16:25 UTC (permalink / raw)
To: Linux Netdev List; +Cc: Stephen Hemminger, Patrick McHardy
I've found a much easier way to shrink the net_device structure
rather that moving all the operations out of it. However, since
the net_device may grow further, moving the operations into a
separate place may look reasonable.
The pahole tool showed, that there are a 124 and 80 bytes holes
before the queue_lock and the _xmit_lock respectively. Moving most
of the devices callbacks into the 2nd hole makes the sizeof of the
structure be 1024 bytes.
The hard_start_xmit callback is not moved to keep it in previous
cacheline.
I think it's OK to make such a reordering, since all these hooks
are a) read-only and b) not called on fast paths, so their place
within the structure looks not very important.
Unfortunately diff makes the patch look like moving other fields,
rater than the ops :)
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
---
include/linux/netdevice.h | 47 +++++++++++++++++++++++----------------------
1 files changed, 24 insertions(+), 23 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 8b17ed4..3397919 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -483,9 +483,6 @@ struct net_device
struct list_head napi_list;
#endif
- /* The device initialization function. Called only once. */
- int (*init)(struct net_device *dev);
-
/* ------- Fields preinitialized in Space.c finish here ------- */
/* Net device features */
@@ -641,27 +638,9 @@ struct net_device
int watchdog_timeo; /* used by dev_watchdog() */
struct timer_list watchdog_timer;
-/*
- * refcnt is a very hot point, so align it on SMP
- */
- /* Number of references to this device */
- atomic_t refcnt ____cacheline_aligned_in_smp;
-
- /* delayed register/unregister */
- struct list_head todo_list;
- /* device index hash chain */
- struct hlist_node index_hlist;
-
- struct net_device *link_watch_next;
-
- /* register/unregister state machine */
- enum { NETREG_UNINITIALIZED=0,
- NETREG_REGISTERED, /* completed register_netdevice */
- NETREG_UNREGISTERING, /* called unregister_netdevice */
- NETREG_UNREGISTERED, /* completed unregister todo */
- NETREG_RELEASED, /* called free_netdev */
- } reg_state;
+ /* The device initialization function. Called only once. */
+ int (*init)(struct net_device *dev);
/* Called after device is detached from network. */
void (*uninit)(struct net_device *dev);
/* Called after last user reference disappears. */
@@ -703,6 +682,28 @@ struct net_device
unsigned short vid);
int (*neigh_setup)(struct net_device *dev, struct neigh_parms *);
+
+/*
+ * refcnt is a very hot point, so align it on SMP
+ */
+ /* Number of references to this device */
+ atomic_t refcnt ____cacheline_aligned_in_smp;
+
+ /* delayed register/unregister */
+ struct list_head todo_list;
+ /* device index hash chain */
+ struct hlist_node index_hlist;
+
+ struct net_device *link_watch_next;
+
+ /* register/unregister state machine */
+ enum { NETREG_UNINITIALIZED=0,
+ NETREG_REGISTERED, /* completed register_netdevice */
+ NETREG_UNREGISTERING, /* called unregister_netdevice */
+ NETREG_UNREGISTERED, /* completed unregister todo */
+ NETREG_RELEASED, /* called free_netdev */
+ } reg_state;
+
#ifdef CONFIG_NETPOLL
struct netpoll_info *npinfo;
#endif
--
1.5.3.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH net-2.6.26 1/2] Shrink size of net_device by filling alignment holes in it.
2008-04-07 16:25 [PATCH net-2.6.26 1/2] Shrink size of net_device by filling alignment holes in it Pavel Emelyanov
@ 2008-04-07 17:04 ` Eric Dumazet
2008-04-07 17:17 ` Eric Dumazet
0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2008-04-07 17:04 UTC (permalink / raw)
To: Pavel Emelyanov; +Cc: Linux Netdev List, Stephen Hemminger, Patrick McHardy
Pavel Emelyanov a écrit :
> I've found a much easier way to shrink the net_device structure
> rather that moving all the operations out of it. However, since
> the net_device may grow further, moving the operations into a
> separate place may look reasonable.
>
> The pahole tool showed, that there are a 124 and 80 bytes holes
> before the queue_lock and the _xmit_lock respectively. Moving most
> of the devices callbacks into the 2nd hole makes the sizeof of the
> structure be 1024 bytes.
>
>
On 32 bits platform and CONFIG_X86_L1_CACHE_SHIFT=7
I presume :)
Could you check if x86_64 machines with X86_L1_CACHE_SHIFT = 7 or 8
dont suffer from this patch ?
At first glance I would say it seems OK, but this net_device is really
touchy for SMP performance :)
> The hard_start_xmit callback is not moved to keep it in previous
> cacheline.
>
> I think it's OK to make such a reordering, since all these hooks
> are a) read-only and b) not called on fast paths, so their place
> within the structure looks not very important.
>
> Unfortunately diff makes the patch look like moving other fields,
> rater than the ops :)
>
> Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
>
> ---
> include/linux/netdevice.h | 47 +++++++++++++++++++++++----------------------
> 1 files changed, 24 insertions(+), 23 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 8b17ed4..3397919 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -483,9 +483,6 @@ struct net_device
> struct list_head napi_list;
> #endif
>
> - /* The device initialization function. Called only once. */
> - int (*init)(struct net_device *dev);
> -
> /* ------- Fields preinitialized in Space.c finish here ------- */
>
> /* Net device features */
> @@ -641,27 +638,9 @@ struct net_device
> int watchdog_timeo; /* used by dev_watchdog() */
> struct timer_list watchdog_timer;
>
> -/*
> - * refcnt is a very hot point, so align it on SMP
> - */
> - /* Number of references to this device */
> - atomic_t refcnt ____cacheline_aligned_in_smp;
> -
> - /* delayed register/unregister */
> - struct list_head todo_list;
> - /* device index hash chain */
> - struct hlist_node index_hlist;
> -
> - struct net_device *link_watch_next;
> -
> - /* register/unregister state machine */
> - enum { NETREG_UNINITIALIZED=0,
> - NETREG_REGISTERED, /* completed register_netdevice */
> - NETREG_UNREGISTERING, /* called unregister_netdevice */
> - NETREG_UNREGISTERED, /* completed unregister todo */
> - NETREG_RELEASED, /* called free_netdev */
> - } reg_state;
>
> + /* The device initialization function. Called only once. */
> + int (*init)(struct net_device *dev);
> /* Called after device is detached from network. */
> void (*uninit)(struct net_device *dev);
> /* Called after last user reference disappears. */
> @@ -703,6 +682,28 @@ struct net_device
> unsigned short vid);
>
> int (*neigh_setup)(struct net_device *dev, struct neigh_parms *);
> +
> +/*
> + * refcnt is a very hot point, so align it on SMP
> + */
> + /* Number of references to this device */
> + atomic_t refcnt ____cacheline_aligned_in_smp;
> +
> + /* delayed register/unregister */
> + struct list_head todo_list;
> + /* device index hash chain */
> + struct hlist_node index_hlist;
> +
> + struct net_device *link_watch_next;
> +
> + /* register/unregister state machine */
> + enum { NETREG_UNINITIALIZED=0,
> + NETREG_REGISTERED, /* completed register_netdevice */
> + NETREG_UNREGISTERING, /* called unregister_netdevice */
> + NETREG_UNREGISTERED, /* completed unregister todo */
> + NETREG_RELEASED, /* called free_netdev */
> + } reg_state;
> +
> #ifdef CONFIG_NETPOLL
> struct netpoll_info *npinfo;
> #endif
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net-2.6.26 1/2] Shrink size of net_device by filling alignment holes in it.
2008-04-07 17:04 ` Eric Dumazet
@ 2008-04-07 17:17 ` Eric Dumazet
2008-04-07 17:47 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2008-04-07 17:17 UTC (permalink / raw)
To: Eric Dumazet
Cc: Pavel Emelyanov, Linux Netdev List, Stephen Hemminger,
Patrick McHardy
Eric Dumazet a écrit :
> Pavel Emelyanov a écrit :
>> I've found a much easier way to shrink the net_device structure
>> rather that moving all the operations out of it. However, since
>> the net_device may grow further, moving the operations into a
>> separate place may look reasonable.
>>
>> The pahole tool showed, that there are a 124 and 80 bytes holes
>> before the queue_lock and the _xmit_lock respectively. Moving most
>> of the devices callbacks into the 2nd hole makes the sizeof of the
>> structure be 1024 bytes.
>>
>>
> On 32 bits platform and CONFIG_X86_L1_CACHE_SHIFT=7
> I presume :)
>
> Could you check if x86_64 machines with X86_L1_CACHE_SHIFT = 7 or 8
> dont suffer from this patch ?
>
> At first glance I would say it seems OK, but this net_device is really
> touchy for SMP performance :)
I meant :
X86_L1_CACHE_SHIFT = 6 (MK8 | MCORE2) or 7 (others)
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net-2.6.26 1/2] Shrink size of net_device by filling alignment holes in it.
2008-04-07 17:17 ` Eric Dumazet
@ 2008-04-07 17:47 ` Arnaldo Carvalho de Melo
2008-04-08 8:41 ` Pavel Emelyanov
0 siblings, 1 reply; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2008-04-07 17:47 UTC (permalink / raw)
To: Eric Dumazet
Cc: Pavel Emelyanov, Linux Netdev List, Stephen Hemminger,
Patrick McHardy
Em Mon, Apr 07, 2008 at 07:17:11PM +0200, Eric Dumazet escreveu:
> Eric Dumazet a écrit :
>> Pavel Emelyanov a écrit :
>>> I've found a much easier way to shrink the net_device structure rather
>>> that moving all the operations out of it. However, since
>>> the net_device may grow further, moving the operations into a
>>> separate place may look reasonable.
>>>
>>> The pahole tool showed, that there are a 124 and 80 bytes holes
>>> before the queue_lock and the _xmit_lock respectively. Moving most
>>> of the devices callbacks into the 2nd hole makes the sizeof of the
>>> structure be 1024 bytes.
>>>
>>>
>> On 32 bits platform and CONFIG_X86_L1_CACHE_SHIFT=7
>> I presume :)
>>
>> Could you check if x86_64 machines with X86_L1_CACHE_SHIFT = 7 or 8 dont
>> suffer from this patch ?
>>
>> At first glance I would say it seems OK, but this net_device is really
>> touchy for SMP performance :)
> I meant :
>
> X86_L1_CACHE_SHIFT = 6 (MK8 | MCORE2) or 7 (others)
Info _before_ Pavel's patch, not that big holes.
[acme@doppio linux-2.6]$ getconf LEVEL1_DCACHE_LINESIZE
64
model name : Intel(R) Core(TM)2 CPU T7200 @ 2.00GHz
[acme@doppio linux-2.6]$ grep X86_L1_CACHE_SHIFT /boot/config-`uname -r`
CONFIG_X86_L1_CACHE_SHIFT=6
[acme@doppio linux-2.6]$ pahole -C net_device ../build/linux-2.6/doppio/net/core/dev.o
struct net_device {
char name[16]; /* 0 16 */
struct hlist_node name_hlist; /* 16 16 */
long unsigned int mem_end; /* 32 8 */
long unsigned int mem_start; /* 40 8 */
long unsigned int base_addr; /* 48 8 */
unsigned int irq; /* 56 4 */
unsigned char if_port; /* 60 1 */
unsigned char dma; /* 61 1 */
/* XXX 2 bytes hole, try to pack */
/* --- cacheline 1 boundary (64 bytes) --- */
long unsigned int state; /* 64 8 */
struct list_head dev_list; /* 72 16 */
struct list_head napi_list; /* 88 16 */
int (*init)(struct net_device *); /* 104 8 */
long unsigned int features; /* 112 8 */
struct net_device * next_sched; /* 120 8 */
/* --- cacheline 2 boundary (128 bytes) --- */
int ifindex; /* 128 4 */
int iflink; /* 132 4 */
struct net_device_stats * (*get_stats)(struct net_device *); /* 136 8 */
struct net_device_stats stats; /* 144 184 */
/* --- cacheline 5 boundary (320 bytes) was 8 bytes ago --- */
const struct iw_handler_def * wireless_handlers; /* 328 8 */
struct iw_public_data * wireless_data; /* 336 8 */
const struct ethtool_ops * ethtool_ops; /* 344 8 */
const struct header_ops * header_ops; /* 352 8 */
unsigned int flags; /* 360 4 */
short unsigned int gflags; /* 364 2 */
short unsigned int priv_flags; /* 366 2 */
short unsigned int padded; /* 368 2 */
unsigned char operstate; /* 370 1 */
unsigned char link_mode; /* 371 1 */
unsigned int mtu; /* 372 4 */
short unsigned int type; /* 376 2 */
short unsigned int hard_header_len; /* 378 2 */
/* XXX 4 bytes hole, try to pack */
/* --- cacheline 6 boundary (384 bytes) --- */
struct net_device * master; /* 384 8 */
unsigned char perm_addr[32]; /* 392 32 */
unsigned char addr_len; /* 424 1 */
/* XXX 1 byte hole, try to pack */
short unsigned int dev_id; /* 426 2 */
/* XXX 4 bytes hole, try to pack */
struct dev_addr_list * uc_list; /* 432 8 */
int uc_count; /* 440 4 */
int uc_promisc; /* 444 4 */
/* --- cacheline 7 boundary (448 bytes) --- */
struct dev_addr_list * mc_list; /* 448 8 */
int mc_count; /* 456 4 */
int promiscuity; /* 460 4 */
int allmulti; /* 464 4 */
/* XXX 4 bytes hole, try to pack */
void * atalk_ptr; /* 472 8 */
void * ip_ptr; /* 480 8 */
void * dn_ptr; /* 488 8 */
void * ip6_ptr; /* 496 8 */
void * ec_ptr; /* 504 8 */
/* --- cacheline 8 boundary (512 bytes) --- */
void * ax25_ptr; /* 512 8 */
struct wireless_dev * ieee80211_ptr; /* 520 8 */
long unsigned int last_rx; /* 528 8 */
unsigned char dev_addr[32]; /* 536 32 */
unsigned char broadcast[32]; /* 568 32 */
/* --- cacheline 9 boundary (576 bytes) was 24 bytes ago --- */
spinlock_t ingress_lock; /* 600 4 */
/* XXX 4 bytes hole, try to pack */
struct Qdisc * qdisc_ingress; /* 608 8 */
/* XXX 24 bytes hole, try to pack */
/* --- cacheline 10 boundary (640 bytes) --- */
spinlock_t queue_lock; /* 640 4 */
/* XXX 4 bytes hole, try to pack */
struct Qdisc * qdisc; /* 648 8 */
struct Qdisc * qdisc_sleeping; /* 656 8 */
struct list_head qdisc_list; /* 664 16 */
long unsigned int tx_queue_len; /* 680 8 */
struct sk_buff * gso_skb; /* 688 8 */
/* XXX 8 bytes hole, try to pack */
/* --- cacheline 11 boundary (704 bytes) --- */
spinlock_t _xmit_lock; /* 704 4 */
int xmit_lock_owner; /* 708 4 */
void * priv; /* 712 8 */
int (*hard_start_xmit)(struct sk_buff *, struct net_device *); /* 720 8 */
long unsigned int trans_start; /* 728 8 */
int watchdog_timeo; /* 736 4 */
/* XXX 4 bytes hole, try to pack */
struct timer_list watchdog_timer; /* 744 80 */
/* XXX last struct has 4 bytes of padding */
/* XXX 8 bytes hole, try to pack */
/* --- cacheline 13 boundary (832 bytes) --- */
atomic_t refcnt; /* 832 4 */
/* XXX 4 bytes hole, try to pack */
struct list_head todo_list; /* 840 16 */
struct hlist_node index_hlist; /* 856 16 */
struct net_device * link_watch_next; /* 872 8 */
enum {
NETREG_UNINITIALIZED = 0,
NETREG_REGISTERED = 1,
NETREG_UNREGISTERING = 2,
NETREG_UNREGISTERED = 3,
NETREG_RELEASED = 4,
} reg_state; /* 880 4 */
/* XXX 4 bytes hole, try to pack */
void (*uninit)(struct net_device *); /* 888 8 */
/* --- cacheline 14 boundary (896 bytes) --- */
void (*destructor)(struct net_device *); /* 896 8 */
int (*open)(struct net_device *); /* 904 8 */
int (*stop)(struct net_device *); /* 912 8 */
void (*change_rx_flags)(struct net_device *, int); /* 920 8 */
void (*set_rx_mode)(struct net_device *); /* 928 8 */
void (*set_multicast_list)(struct net_device *); /* 936 8 */
int (*set_mac_address)(struct net_device *, void *); /* 944 8 */
int (*validate_addr)(struct net_device *); /* 952 8 */
/* --- cacheline 15 boundary (960 bytes) --- */
int (*do_ioctl)(struct net_device *, struct ifreq *, int); /* 960 8 */
int (*set_config)(struct net_device *, struct ifmap *); /* 968 8 */
int (*change_mtu)(struct net_device *, int); /* 976 8 */
void (*tx_timeout)(struct net_device *); /* 984 8 */
void (*vlan_rx_register)(struct net_device *, struct vlan_group *); /* 992 8 */
void (*vlan_rx_add_vid)(struct net_device *, short unsigned int); /* 1000 8 */
void (*vlan_rx_kill_vid)(struct net_device *, short unsigned int); /* 1008 8 */
int (*neigh_setup)(struct net_device *, struct neigh_parms *); /* 1016 8 */
/* --- cacheline 16 boundary (1024 bytes) --- */
struct netpoll_info * npinfo; /* 1024 8 */
void (*poll_controller)(struct net_device *); /* 1032 8 */
struct net * nd_net; /* 1040 8 */
struct net_bridge_port * br_port; /* 1048 8 */
struct macvlan_port * macvlan_port; /* 1056 8 */
struct device dev; /* 1064 584 */
/* --- cacheline 25 boundary (1600 bytes) was 48 bytes ago --- */
struct attribute_group * sysfs_groups[3]; /* 1648 24 */
/* --- cacheline 26 boundary (1664 bytes) was 8 bytes ago --- */
const struct rtnl_link_ops * rtnl_link_ops; /* 1672 8 */
unsigned int egress_subqueue_count; /* 1680 4 */
/* XXX 4 bytes hole, try to pack */
struct net_device_subqueue egress_subqueue[1]; /* 1688 8 */
/* size: 1728, cachelines: 27 */
/* sum members: 1617, holes: 14, sum holes: 79 */
/* padding: 32 */
/* paddings: 1, sum paddings: 4 */
};
Humm, I need to implement a way to inform pahole about fields that are
cacheline aligned, that together with --cacheline and --word_size would
allow us to try all these combinations, even --reorganize could use that
info...
- Arnaldo
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net-2.6.26 1/2] Shrink size of net_device by filling alignment holes in it.
2008-04-07 17:47 ` Arnaldo Carvalho de Melo
@ 2008-04-08 8:41 ` Pavel Emelyanov
2008-04-16 9:12 ` David Miller
0 siblings, 1 reply; 6+ messages in thread
From: Pavel Emelyanov @ 2008-04-08 8:41 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Eric Dumazet
Cc: Linux Netdev List, Stephen Hemminger, Patrick McHardy
Ok, here are the results (summary):
i386, no patch, L1_CACHE_SHIFT=6:
/* size: 1024, cachelines: 16 */
/* sum members: 897, holes: 5, sum holes: 115 */
/* padding: 12 */
i386, no patch, L1_CACHE_SHIFT=7:
/* size: 1280, cachelines: 20 */
/* sum members: 897, holes: 5, sum holes: 307 */
/* padding: 76 */
i386, with patch, L1_CACHE_SHIFT=6:
/* size: 960, cachelines: 15 */
/* sum members: 897, holes: 4, sum holes: 47 */
/* padding: 16 */
i386, with patch, L1_CACHE_SHIFT=7:
/* size: 1024, cachelines: 16 */
/* sum members: 897, holes: 4, sum holes: 111 */
/* padding: 16 */
x86_64, no patch, L1_CACHE_SHIFT=7:
/* size: 1792, cachelines: 28 */
/* sum members: 1589, holes: 13, sum holes: 155 */
/* padding: 48 */
/* paddings: 1, sum paddings: 4 */
x86_64, with patch, L1_CACHE_SHIFT=7:
/* size: 1920, cachelines: 30 */
/* sum members: 1589, holes: 13, sum holes: 275 */
/* padding: 56 */
/* paddings: 1, sum paddings: 4 */
The +128 bytes on x86_64 is due to enlarged hole before
the refcnt field :(
So, on the i386 the structure shrinks, but on x86_64 - grows, but
on i386 it can be allocated from a twice smaller cache, while on
x86_64 nothing changes - it still fits the size-2048.
BTW, L1_CACHE_SHIFT=7 is set by default for distributions such as
rhel5, suse10 and fedora8...
So, does it worth trying to push them to Dave? :)
Thanks,
Pavel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net-2.6.26 1/2] Shrink size of net_device by filling alignment holes in it.
2008-04-08 8:41 ` Pavel Emelyanov
@ 2008-04-16 9:12 ` David Miller
0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2008-04-16 9:12 UTC (permalink / raw)
To: xemul; +Cc: acme, dada1, netdev, shemminger, kaber
From: Pavel Emelyanov <xemul@openvz.org>
Date: Tue, 08 Apr 2008 12:41:18 +0400
> Ok, here are the results (summary):
...
> So, does it worth trying to push them to Dave? :)
I think if you can at least make it such that it stays
the same in all of the stated situations, I'm happy to
apply this.
Get creative with the rearrangement :-)
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2008-04-16 9:12 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-07 16:25 [PATCH net-2.6.26 1/2] Shrink size of net_device by filling alignment holes in it Pavel Emelyanov
2008-04-07 17:04 ` Eric Dumazet
2008-04-07 17:17 ` Eric Dumazet
2008-04-07 17:47 ` Arnaldo Carvalho de Melo
2008-04-08 8:41 ` Pavel Emelyanov
2008-04-16 9:12 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).