* [PATCH]: Fix sk_buff page offsets and lengths.
From: David Miller @ 2007-07-31 1:50 UTC (permalink / raw)
To: netdev; +Cc: sfr, shemminger
Stephen Rothwell pointed out to me that the skb_frag_struct
is broken on platforms using 64K or larger page sizes, it
even generates warnings when (for example) the myri10ge driver
tries to assign PAGE_SIZE into frag->size.
I've thus increased page offset and size to __u32 in the patch below.
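To make the reported breakage concrete, here is a minimal userspace sketch (not kernel code) of what happens when a 64K page size is stored in a 16-bit field versus a 32-bit one. `DEMO_PAGE_SIZE_64K`, `store_u16` and `store_u32` are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PAGE_SIZE on a 64K-page platform. */
#define DEMO_PAGE_SIZE_64K 65536u

/* Store a length in a 16-bit field, as the old frag->size did. */
static uint16_t store_u16(uint32_t len)
{
	return (uint16_t)len;	/* keeps only the low 16 bits */
}

/* Store a length in a 32-bit field, as the patched frag->size does. */
static uint32_t store_u32(uint32_t len)
{
	return len;		/* no bits are lost */
}
```

With these helpers, `store_u16(DEMO_PAGE_SIZE_64K)` wraps to 0 while `store_u32(DEMO_PAGE_SIZE_64K)` keeps the full value, which is presumably the truncation the compiler warning was pointing at.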
I made this change much to even my own chagrin, but this is the
most direct fix and the ifdefs we could put here are both ugly
and also not something that we do with struct scatterlist so
no reason to do it in a place like this.
Actually, the cost on 64-bit is zero because there existed 4 bytes of
alignment padding for skb_frag_struct because of the page pointer.
On 32-bit the cost is up to 64-bytes :-/
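The padding argument can be checked with a rough userspace sketch. It models the page pointer with fixed-width integers so that both cases can be compared on one machine; the struct names are hypothetical, and the 64-bit result assumes an ABI where a 64-bit integer is 8-byte aligned (true on common 64-bit targets, but an assumption nonetheless).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustration only: "page" stands in for a 32-bit or 64-bit pointer. */
struct old_frag32 { uint32_t page; uint16_t page_offset; uint16_t size; };
struct new_frag32 { uint32_t page; uint32_t page_offset; uint32_t size; };
struct old_frag64 { uint64_t page; uint16_t page_offset; uint16_t size; };
struct new_frag64 { uint64_t page; uint32_t page_offset; uint32_t size; };

/* Per-fragment growth in bytes with a 32-bit page pointer. */
static size_t growth32(void)
{
	return sizeof(struct new_frag32) - sizeof(struct old_frag32);
}

/* Per-fragment growth in bytes with a 64-bit page pointer. */
static size_t growth64(void)
{
	return sizeof(struct new_frag64) - sizeof(struct old_frag64);
}
```

Under those assumptions the 64-bit layout already carried 4 bytes of tail padding that the wider fields simply fill, so `growth64()` is 0, while the 32-bit layout grows by 4 bytes per fragment.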
Stephen, this opens up the doors a bit for the scatterlist work
you wanted to do in sk_buff.
commit 051c14dbc588590e0a165dda0305c7c1b9ce7fb0
Author: David S. Miller <davem@sunset.davemloft.net>
Date: Mon Jul 30 18:47:03 2007 -0700
[NET]: Page offsets and lengths need to be __u32.
Based upon a report from Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index ce25643..93c27f7 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -134,8 +134,8 @@ typedef struct skb_frag_struct skb_frag_t;
 struct skb_frag_struct {
 	struct page *page;
-	__u16 page_offset;
-	__u16 size;
+	__u32 page_offset;
+	__u32 size;
 };
 /* This data is invariant across clones and lives at
* Re: [PATCH]: Fix sk_buff page offsets and lengths.
From: Eric Dumazet @ 2007-07-31 8:52 UTC (permalink / raw)
To: David Miller; +Cc: netdev, sfr, shemminger
On Mon, 30 Jul 2007 18:50:28 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:
>
> Stephen Rothwell pointed out to me that the skb_frag_struct
> is broken on platforms using 64K or larger page sizes, it
> even generates warnings when (for example) the myri10ge driver
> tries to assign PAGE_SIZE into frag->size.
>
> I've thus increased page offset and size to __u32 in the patch below.
>
> I made this change much to even my own chagrin, but this is the
> most direct fix and the ifdefs we could put here are both ugly
> and also not something that we do with struct scatterlist so
> no reason to do it in a place like this.
>
> Actually, the cost on 64-bit is zero because there existed 4 bytes of
> alignment padding for skb_frag_struct because of the page pointer.
> On 32-bit the cost is up to 64-bytes :-/
>
> Stephen, this opens up the doors a bit for the scatterlist work
> you wanted to do in sk_buff.
>
Ouch...
sizeof(struct skb_shared_info) is enlarged by 18*4 bytes on i386, a little bit more than 64 bytes :(
I understand ifdefs are ugly, but in the common case (PAGE_SIZE<64K), this change seems very unfortunate.
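The 18*4 figure can be reproduced with a small sketch. It assumes the fragment array length is computed as 65536/PAGE_SIZE + 2 (my recollection of how MAX_SKB_FRAGS was defined at the time, so treat the formula as an assumption) and that each fragment grows by 4 bytes on 32-bit. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed formula for the fragment-array length (MAX_SKB_FRAGS). */
static size_t demo_max_frags(size_t page_size)
{
	return 65536 / page_size + 2;
}

/* Total growth of the fragment array, given the growth of one entry. */
static size_t demo_array_growth(size_t page_size, size_t per_frag_growth)
{
	return demo_max_frags(page_size) * per_frag_growth;
}
```

For 4K pages this gives 18 fragments, and `demo_array_growth(4096, 4)` comes to 72 bytes, matching the "a little more than 64 bytes" remark above.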
* Re: [PATCH]: Fix sk_buff page offsets and lengths.
From: Evgeniy Polyakov @ 2007-07-31 10:20 UTC (permalink / raw)
To: David Miller; +Cc: netdev, sfr, shemminger
On Mon, Jul 30, 2007 at 06:50:28PM -0700, David Miller (davem@davemloft.net) wrote:
>
> Stephen Rothwell pointed out to me that the skb_frag_struct
> is broken on platforms using 64K or larger page sizes, it
> even generates warnings when (for example) the myri10ge driver
> tries to assign PAGE_SIZE into frag->size.
>
> I've thus increased page offset and size to __u32 in the patch below.
Maybe wrap it into
#if PAGE_OFFSET > 12
#endif
or something like that?
I'm not sure actually why drivers would want to have list of 64k pages,
instead driver could call give_me_pages(size) instead of alloc_pages
and per-arch allocator would return one page or set of pages. This is a
handwaving for now...
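A sketch of what such a conditional might look like, written as standalone C rather than kernel code. It assumes the condition is meant to test the page shift (PAGE_OFFSET normally names the kernel's base virtual address, so the quoted `#if` is presumably shorthand), and it picks the wider type at a shift of 16 because a length of exactly 65536 no longer fits in 16 bits. `DEMO_PAGE_SHIFT`, `demo_frag_len_t` and `struct demo_frag` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#ifndef DEMO_PAGE_SHIFT
#define DEMO_PAGE_SHIFT 12	/* hypothetical default: 4K pages */
#endif

/* Use the narrow type only while a full page length fits in 16 bits. */
#if DEMO_PAGE_SHIFT >= 16
typedef uint32_t demo_frag_len_t;
#else
typedef uint16_t demo_frag_len_t;
#endif

struct demo_frag {
	void *page;
	demo_frag_len_t page_offset;
	demo_frag_len_t size;
};

/* Largest length the selected type can represent. */
static uint32_t demo_frag_len_max(void)
{
	return (demo_frag_len_t)~(demo_frag_len_t)0;
}
```

Building with `-DDEMO_PAGE_SHIFT=16` would switch both fields to 32 bits, which is the trade-off under discussion: smaller structures in the common case at the price of a preprocessor conditional in the header.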
--
Evgeniy Polyakov
* Re: [PATCH]: Fix sk_buff page offsets and lengths.
From: Christoph Hellwig @ 2007-07-31 12:23 UTC (permalink / raw)
To: Evgeniy Polyakov; +Cc: David Miller, netdev, sfr, shemminger
On Tue, Jul 31, 2007 at 02:20:51PM +0400, Evgeniy Polyakov wrote:
> On Mon, Jul 30, 2007 at 06:50:28PM -0700, David Miller (davem@davemloft.net) wrote:
> >
> > Stephen Rothwell pointed out to me that the skb_frag_struct
> > is broken on platforms using 64K or larger page sizes, it
> > even generates warnings when (for example) the myri10ge driver
> > tries to assign PAGE_SIZE into frag->size.
> >
> > I've thus increased page offset and size to __u32 in the patch below.
>
> Maybe wrap it into
> #if PAGE_OFFSET > 12
> #endif
>
> or something like that?
>
> I'm not sure actually why drivers would want to have list of 64k pages,
> instead driver could call give_me_pages(size) instead of alloc_pages
> and per-arch allocator would return one page or set of pages. This is a
> handwaving for now...
What about sendfile/splice on hugetlbfs?
* Re: [PATCH]: Fix sk_buff page offsets and lengths.
From: Evgeniy Polyakov @ 2007-07-31 14:00 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: David Miller, netdev, sfr, shemminger
On Tue, Jul 31, 2007 at 01:23:29PM +0100, Christoph Hellwig (hch@infradead.org) wrote:
> On Tue, Jul 31, 2007 at 02:20:51PM +0400, Evgeniy Polyakov wrote:
> > On Mon, Jul 30, 2007 at 06:50:28PM -0700, David Miller (davem@davemloft.net) wrote:
> > >
> > > Stephen Rothwell pointed out to me that the skb_frag_struct
> > > is broken on platforms using 64K or larger page sizes, it
> > > even generates warnings when (for example) the myri10ge driver
> > > tries to assign PAGE_SIZE into frag->size.
> > >
> > > I've thus increased page offset and size to __u32 in the patch below.
> >
> > Maybe wrap it into
> > #if PAGE_OFFSET > 12
> > #endif
> >
> > or something like that?
> >
> > I'm not sure actually why drivers would want to have list of 64k pages,
> > instead driver could call give_me_pages(size) instead of alloc_pages
> > and per-arch allocator would return one page or set of pages. This is a
> > handwaving for now...
>
> What about sendfile/splice on hugetlbfs?
offset in tcp_sendpage() ends up being poffset % PAGE_SIZE,
page is
struct page *page = pages[poffset / PAGE_SIZE];
So, as far as I understand, it will split bigpage into PAGE_SIZEd
chunks.
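The splitting described above can be sketched in plain C. This only illustrates the divide/modulo arithmetic and is not the actual tcp_sendpage() code; `struct demo_chunk` and `demo_split` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Where a byte offset lands: which page, and how far into it. */
struct demo_chunk {
	size_t page_index;
	size_t offset;
};

/* Split a linear byte offset into (page index, in-page offset). */
static struct demo_chunk demo_split(size_t poffset, size_t page_size)
{
	struct demo_chunk c;

	c.page_index = poffset / page_size;
	c.offset = poffset % page_size;	/* always less than page_size */
	return c;
}
```

For example, `demo_split(10000, 4096)` lands in page 2 at offset 1808. Because the in-page offset is always below the page size, a large page handled this way would be consumed in page-sized pieces.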
--
Evgeniy Polyakov
* Re: [PATCH]: Fix sk_buff page offsets and lengths.
From: David Miller @ 2007-08-01 20:44 UTC (permalink / raw)
To: dada1; +Cc: netdev, sfr, shemminger
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Tue, 31 Jul 2007 10:52:15 +0200
> I understand ifdefs are ugly, but in the common case
> (PAGE_SIZE<64K), this change seems very unfortunate.
If this bothers you so much start where the real problems are and
advocate on linux-kernel for decreasing the type size of "offset" and
"length" in struct scatterlist. It's a bit on the hypocritical side
to complain about this skb_frag_t change when I've never seen a peep
out of you about scatterlist doing the same thing :-)
Otherwise, we're better off with the types being the same so that we
can use scatterlist in skb_frag_t and therefore kill serious
performance problems with DMA mapping of sk_buff objects that exists
today on platforms such as powerpc.