From: Paul Moore <pmoore@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>,
David Miller <davem@davemloft.net>,
netdev@vger.kernel.org, mvadkert@redhat.com,
selinux@tycho.nsa.gov, linux-security-module@vger.kernel.org
Subject: Re: [PATCH] tcp: assign the sock correctly to an outgoing SYNACK packet
Date: Tue, 09 Apr 2013 11:05:55 -0400 [thread overview]
Message-ID: <5061322.VOeJrtYPaJ@sifl> (raw)
In-Reply-To: <7718638.lBZi8geXkP@sifl>
On Tuesday, April 09, 2013 10:52:17 AM Paul Moore wrote:
> On Tuesday, April 09, 2013 07:31:04 AM Eric Dumazet wrote:
> > On Tue, 2013-04-09 at 10:19 -0400, Paul Moore wrote:
> > > On Tuesday, April 09, 2013 07:00:22 AM Eric Dumazet wrote:
> > > > On Tue, 2013-04-09 at 09:19 -0400, Paul Moore wrote:
> > > > > As Casey already mentioned, if this isn't acceptable please help me
> > > > > understand why.
> > > >
> > > > You see something which is not the reality. If you do such analysis,
> > > > better do it properly, because any change you are going to submit will
> > > > be doubly checked by people who really care.
> > >
> > > I am attempting to do it properly, I simply made a mistake. Ben also
> > > pointed it out. As you wrote yesterday, "Lets go forward".
> > >
> > > After fixing the BITS_PER_LONG problem I looked at it again and it
> > > appears
> > > that by simply replacing the "secmark" field with a blob we retain the
> > > size of the sk_buff as well as the cacheline positions of all the
> > > fields,
> > > e.g. dma_cookie no longer moves cachelines. Thoughts?
> >
> > If you take a look at recent history of changes on sk_buff, you can see
> > we added very recently fields for encapsulation support. These were
> > absolutely wanted for modern operations at datacenter level.
> >
> > This effort might still need new room, so I prefer not filling sk_buff
> > right now.
>
> Has anyone proposed any additional encapsulation patches which need
> additional fields in the sk_buff? Are you aware of any additional
> encapsulation patches which are in progress? When would you consider it
> "safe"?
Another thought.
If we move the "protocol" field after the flags2 bitfield, remove "secmark",
and add the blob before/after "destructor" then we preserve the existing four
byte hole in cacheline #3 for potential use by the encapsulation folks and
move the dma_cookie into cacheline #3 as well which may actually be a good
thing (corrections welcome on this comment). The overall size remains the
same at 256 bytes.
struct sk_buff_test {
struct sk_buff * next; /* 0 8 */
struct sk_buff * prev; /* 8 8 */
ktime_t tstamp; /* 16 8 */
struct sock * sk; /* 24 8 */
struct net_device * dev; /* 32 8 */
char cb[48]; /* 40 48 */
/* --- cacheline 1 boundary (64 bytes) was 24 bytes ago --- */
long unsigned int _skb_refdst; /* 88 8 */
struct sec_path * sp; /* 96 8 */
unsigned int len; /* 104 4 */
unsigned int data_len; /* 108 4 */
__u16 mac_len; /* 112 2 */
__u16 hdr_len; /* 114 2 */
union {
__wsum csum; /* 4 */
struct {
__u16 csum_start; /* 116 2 */
__u16 csum_offset; /* 118 2 */
}; /* 4 */
}; /* 116 4 */
__u32 priority; /* 120 4 */
int flags1_begin[0]; /* 124 0 */
__u8 local_df:1; /* 124: 7 1 */
__u8 cloned:1; /* 124: 6 1 */
__u8 ip_summed:2; /* 124: 4 1 */
__u8 nohdr:1; /* 124: 3 1 */
__u8 nfctinfo:3; /* 124: 0 1 */
__u8 pkt_type:3; /* 125: 5 1 */
__u8 fclone:2; /* 125: 3 1 */
__u8 ipvs_property:1; /* 125: 2 1 */
__u8 peeked:1; /* 125: 1 1 */
__u8 nf_trace:1; /* 125: 0 1 */
/* XXX 2 bytes hole, try to pack */
/* --- cacheline 2 boundary (128 bytes) --- */
int flags1_end[0]; /* 128 0 */
void * security; /* 128 8 */
void (*destructor)(struct sk_buff *); /* 136
8 */
struct nf_conntrack * nfct; /* 144 8 */
struct sk_buff * nfct_reasm; /* 152 8 */
struct nf_bridge_info * nf_bridge; /* 160 8 */
int skb_iif; /* 168 4 */
__u32 rxhash; /* 172 4 */
__u16 vlan_tci; /* 176 2 */
__u16 tc_index; /* 178 2 */
__u16 tc_verd; /* 180 2 */
__u16 queue_mapping; /* 182 2 */
int flags2_begin[0]; /* 184 0 */
__u8 ndisc_nodetype:2; /* 184: 6 1 */
__u8 pfmemalloc:1; /* 184: 5 1 */
__u8 ooo_okay:1; /* 184: 4 1 */
__u8 l4_rxhash:1; /* 184: 3 1 */
__u8 wifi_acked_valid:1; /* 184: 2 1 */
__u8 wifi_acked:1; /* 184: 1 1 */
__u8 no_fcs:1; /* 184: 0 1 */
__u8 head_frag:1; /* 185: 7 1 */
__u8 encapsulation:1; /* 185: 6 1 */
/* XXX 6 bits hole, try to pack */
/* XXX 2 bytes hole, try to pack */
int flags2_end[0]; /* 188 0 */
__be16 protocol; /* 188 2 */
/* XXX 2 bytes hole, try to pack */
/* --- cacheline 3 boundary (192 bytes) --- */
dma_cookie_t dma_cookie; /* 192 4 */
union {
__u32 mark; /* 4 */
__u32 dropcount; /* 4 */
__u32 reserved_tailroom; /* 4 */
}; /* 196 4 */
sk_buff_data_t inner_transport_header; /* 200 4 */
sk_buff_data_t inner_network_header; /* 204 4 */
sk_buff_data_t transport_header; /* 208 4 */
sk_buff_data_t network_header; /* 212 4 */
sk_buff_data_t mac_header; /* 216 4 */
sk_buff_data_t tail; /* 220 4 */
sk_buff_data_t end; /* 224 4 */
/* XXX 4 bytes hole, try to pack */
unsigned char * head; /* 232 8 */
unsigned char * data; /* 240 8 */
unsigned int truesize; /* 248 4 */
atomic_t users; /* 252 4 */
/* --- cacheline 4 boundary (256 bytes) --- */
/* size: 256, cachelines: 4, members: 62 */
/* sum members: 246, holes: 4, sum holes: 10 */
/* bit holes: 1, sum bit holes: 6 bits */
};
--
paul moore
security and virtualization @ redhat
--
This message was distributed to subscribers of the selinux mailing list.
If you no longer wish to subscribe, send mail to majordomo@tycho.nsa.gov with
the words "unsubscribe selinux" without quotes as the message.
WARNING: multiple messages have this Message-ID (diff)
From: Paul Moore <pmoore@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>,
David Miller <davem@davemloft.net>,
netdev@vger.kernel.org, mvadkert@redhat.com,
selinux@tycho.nsa.gov, linux-security-module@vger.kernel.org
Subject: Re: [PATCH] tcp: assign the sock correctly to an outgoing SYNACK packet
Date: Tue, 09 Apr 2013 11:05:55 -0400 [thread overview]
Message-ID: <5061322.VOeJrtYPaJ@sifl> (raw)
In-Reply-To: <7718638.lBZi8geXkP@sifl>
On Tuesday, April 09, 2013 10:52:17 AM Paul Moore wrote:
> On Tuesday, April 09, 2013 07:31:04 AM Eric Dumazet wrote:
> > On Tue, 2013-04-09 at 10:19 -0400, Paul Moore wrote:
> > > On Tuesday, April 09, 2013 07:00:22 AM Eric Dumazet wrote:
> > > > On Tue, 2013-04-09 at 09:19 -0400, Paul Moore wrote:
> > > > > As Casey already mentioned, if this isn't acceptable please help me
> > > > > understand why.
> > > >
> > > > You see something which is not the reality. If you do such analysis,
> > > > better do it properly, because any change you are going to submit will
> > > > be doubly checked by people who really care.
> > >
> > > I am attempting to do it properly, I simply made a mistake. Ben also
> > > pointed it out. As you wrote yesterday, "Lets go forward".
> > >
> > > After fixing the BITS_PER_LONG problem I looked at it again and it
> > > appears
> > > that by simply replacing the "secmark" field with a blob we retain the
> > > size of the sk_buff as well as the cacheline positions of all the
> > > fields,
> > > e.g. dma_cookie no longer moves cachelines. Thoughts?
> >
> > If you take a look at recent history of changes on sk_buff, you can see
> > we added very recently fields for encapsulation support. These were
> > absolutely wanted for modern operations at datacenter level.
> >
> > This effort might still need new room, so I prefer not filling sk_buff
> > right now.
>
> Has anyone proposed any additional encapsulation patches which need
> additional fields in the sk_buff? Are you aware of any additional
> encapsulation patches which are in progress? When would you consider it
> "safe"?
Another thought.
If we move the "protocol" field after the flags2 bitfield, remove "secmark",
and add the blob before/after "destructor" then we preserve the existing four
byte hole in cacheline #3 for potential use by the encapsulation folks and
move the dma_cookie into cacheline #3 as well which may actually be a good
thing (corrections welcome on this comment). The overall size remains the
same at 256 bytes.
struct sk_buff_test {
struct sk_buff * next; /* 0 8 */
struct sk_buff * prev; /* 8 8 */
ktime_t tstamp; /* 16 8 */
struct sock * sk; /* 24 8 */
struct net_device * dev; /* 32 8 */
char cb[48]; /* 40 48 */
/* --- cacheline 1 boundary (64 bytes) was 24 bytes ago --- */
long unsigned int _skb_refdst; /* 88 8 */
struct sec_path * sp; /* 96 8 */
unsigned int len; /* 104 4 */
unsigned int data_len; /* 108 4 */
__u16 mac_len; /* 112 2 */
__u16 hdr_len; /* 114 2 */
union {
__wsum csum; /* 4 */
struct {
__u16 csum_start; /* 116 2 */
__u16 csum_offset; /* 118 2 */
}; /* 4 */
}; /* 116 4 */
__u32 priority; /* 120 4 */
int flags1_begin[0]; /* 124 0 */
__u8 local_df:1; /* 124: 7 1 */
__u8 cloned:1; /* 124: 6 1 */
__u8 ip_summed:2; /* 124: 4 1 */
__u8 nohdr:1; /* 124: 3 1 */
__u8 nfctinfo:3; /* 124: 0 1 */
__u8 pkt_type:3; /* 125: 5 1 */
__u8 fclone:2; /* 125: 3 1 */
__u8 ipvs_property:1; /* 125: 2 1 */
__u8 peeked:1; /* 125: 1 1 */
__u8 nf_trace:1; /* 125: 0 1 */
/* XXX 2 bytes hole, try to pack */
/* --- cacheline 2 boundary (128 bytes) --- */
int flags1_end[0]; /* 128 0 */
void * security; /* 128 8 */
void (*destructor)(struct sk_buff *); /* 136
8 */
struct nf_conntrack * nfct; /* 144 8 */
struct sk_buff * nfct_reasm; /* 152 8 */
struct nf_bridge_info * nf_bridge; /* 160 8 */
int skb_iif; /* 168 4 */
__u32 rxhash; /* 172 4 */
__u16 vlan_tci; /* 176 2 */
__u16 tc_index; /* 178 2 */
__u16 tc_verd; /* 180 2 */
__u16 queue_mapping; /* 182 2 */
int flags2_begin[0]; /* 184 0 */
__u8 ndisc_nodetype:2; /* 184: 6 1 */
__u8 pfmemalloc:1; /* 184: 5 1 */
__u8 ooo_okay:1; /* 184: 4 1 */
__u8 l4_rxhash:1; /* 184: 3 1 */
__u8 wifi_acked_valid:1; /* 184: 2 1 */
__u8 wifi_acked:1; /* 184: 1 1 */
__u8 no_fcs:1; /* 184: 0 1 */
__u8 head_frag:1; /* 185: 7 1 */
__u8 encapsulation:1; /* 185: 6 1 */
/* XXX 6 bits hole, try to pack */
/* XXX 2 bytes hole, try to pack */
int flags2_end[0]; /* 188 0 */
__be16 protocol; /* 188 2 */
/* XXX 2 bytes hole, try to pack */
/* --- cacheline 3 boundary (192 bytes) --- */
dma_cookie_t dma_cookie; /* 192 4 */
union {
__u32 mark; /* 4 */
__u32 dropcount; /* 4 */
__u32 reserved_tailroom; /* 4 */
}; /* 196 4 */
sk_buff_data_t inner_transport_header; /* 200 4 */
sk_buff_data_t inner_network_header; /* 204 4 */
sk_buff_data_t transport_header; /* 208 4 */
sk_buff_data_t network_header; /* 212 4 */
sk_buff_data_t mac_header; /* 216 4 */
sk_buff_data_t tail; /* 220 4 */
sk_buff_data_t end; /* 224 4 */
/* XXX 4 bytes hole, try to pack */
unsigned char * head; /* 232 8 */
unsigned char * data; /* 240 8 */
unsigned int truesize; /* 248 4 */
atomic_t users; /* 252 4 */
/* --- cacheline 4 boundary (256 bytes) --- */
/* size: 256, cachelines: 4, members: 62 */
/* sum members: 246, holes: 4, sum holes: 10 */
/* bit holes: 1, sum bit holes: 6 bits */
};
--
paul moore
security and virtualization @ redhat
next prev parent reply other threads:[~2013-04-09 15:06 UTC|newest]
Thread overview: 81+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-08 15:45 [PATCH] tcp: assign the sock correctly to an outgoing SYNACK packet Paul Moore
2013-04-08 16:14 ` David Miller
2013-04-08 17:22 ` Paul Moore
2013-04-08 17:36 ` Eric Dumazet
2013-04-08 17:40 ` Paul Moore
2013-04-08 17:47 ` Eric Dumazet
2013-04-08 18:01 ` Eric Dumazet
2013-04-08 18:12 ` Paul Moore
2013-04-08 18:21 ` Eric Dumazet
2013-04-08 18:26 ` Paul Moore
2013-04-08 18:34 ` Eric Dumazet
2013-04-08 18:30 ` Eric Dumazet
2013-04-08 20:37 ` Paul Moore
2013-04-08 20:44 ` David Miller
2013-04-08 20:53 ` Paul Moore
2013-04-08 20:55 ` Eric Dumazet
2013-04-08 21:09 ` Paul Moore
2013-04-08 21:14 ` David Miller
2013-04-08 21:17 ` Eric Dumazet
2013-04-09 3:58 ` [PATCH] selinux: add a skb_owned_by() hook Eric Dumazet
2013-04-09 4:29 ` Casey Schaufler
2013-04-09 4:41 ` David Miller
2013-04-09 5:14 ` Casey Schaufler
2013-04-09 11:39 ` Paul Moore
2013-04-09 6:24 ` Eric Dumazet
2013-04-09 11:45 ` Paul Moore
2013-04-09 7:38 ` James Morris
2013-04-09 12:06 ` Paul Moore
2013-04-09 17:23 ` David Miller
2013-04-08 18:32 ` [PATCH] tcp: assign the sock correctly to an outgoing SYNACK packet Paul Moore
2013-04-08 18:32 ` Paul Moore
2013-04-08 21:10 ` Paul Moore
2013-04-08 21:10 ` Paul Moore
2013-04-08 21:15 ` David Miller
2013-04-08 21:24 ` Paul Moore
2013-04-08 21:24 ` Paul Moore
2013-04-08 21:33 ` David Miller
2013-04-08 22:01 ` Paul Moore
2013-04-08 22:01 ` Paul Moore
2013-04-08 22:08 ` David Miller
2013-04-08 23:40 ` Casey Schaufler
2013-04-08 23:40 ` Casey Schaufler
2013-04-09 0:33 ` Eric Dumazet
2013-04-09 0:59 ` Casey Schaufler
2013-04-09 0:59 ` Casey Schaufler
2013-04-09 1:09 ` Eric Dumazet
2013-04-09 1:24 ` Casey Schaufler
2013-04-09 1:24 ` Casey Schaufler
2013-04-09 13:19 ` Paul Moore
2013-04-09 13:19 ` Paul Moore
2013-04-09 13:33 ` Paul Moore
2013-04-09 13:33 ` Paul Moore
2013-04-09 14:00 ` Eric Dumazet
2013-04-09 14:19 ` Paul Moore
2013-04-09 14:19 ` Paul Moore
2013-04-09 14:31 ` Eric Dumazet
2013-04-09 14:52 ` Paul Moore
2013-04-09 14:52 ` Paul Moore
2013-04-09 15:05 ` Paul Moore [this message]
2013-04-09 15:05 ` Paul Moore
2013-04-09 15:07 ` Eric Dumazet
2013-04-09 15:17 ` Paul Moore
2013-04-09 15:17 ` Paul Moore
2013-04-09 15:32 ` Eric Dumazet
2013-04-09 15:57 ` Paul Moore
2013-04-09 15:57 ` Paul Moore
2013-04-09 16:11 ` Casey Schaufler
2013-04-09 16:11 ` Casey Schaufler
2013-04-09 16:56 ` David Miller
2013-04-09 17:00 ` Paul Moore
2013-04-09 17:00 ` Paul Moore
2013-04-09 17:09 ` David Miller
2013-04-09 17:10 ` David Miller
2013-04-09 14:05 ` Ben Hutchings
2013-04-09 14:10 ` Paul Moore
2013-04-09 14:10 ` Paul Moore
2013-04-08 21:34 ` Ben Hutchings
2013-04-08 19:25 ` David Miller
2013-04-08 16:19 ` Eric Dumazet
2013-04-08 18:03 ` Sergei Shtylyov
2013-04-08 18:12 ` Paul Moore
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5061322.VOeJrtYPaJ@sifl \
--to=pmoore@redhat.com \
--cc=casey@schaufler-ca.com \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=linux-security-module@vger.kernel.org \
--cc=mvadkert@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=selinux@tycho.nsa.gov \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.