* Who's allowed to set a skb destructor?
From: Brice Goglin @ 2007-07-04 8:04 UTC
To: netdev

Hi,

I am trying to understand whether I can set up a skb destructor in my code (which is basically a protocol above dev_queue_xmit() and co). From what I see in many parts of the current kernel code, the "protocol" (I mean, the one who actually creates the skb) may set up a destructor. However, I also see some places where low-level drivers might be using a destructor too, apparently without checking whether an upper layer already uses one. For instance, write_ofld_wr() in cxgb3/sge.c.

I found some old threads about adding support for multiple destructors, but I don't see anything like this in the current kernel.

So, I'd like to have a clear statement about who's allowed to use a destructor :)

Thanks,
Brice

* Re: Who's allowed to set a skb destructor?
From: Evgeniy Polyakov @ 2007-07-04 9:38 UTC
To: Brice Goglin; +Cc: netdev

On Wed, Jul 04, 2007 at 10:04:54AM +0200, Brice Goglin (Brice.Goglin@ens-lyon.org) wrote:
> So, I'd like to have a clear statement about who's allowed to use a
> destructor :)

The one who allocates the skb: if it is the socket layer, it sets its own socket destructor; netlink has its own too, and so on.

> Thanks,
> Brice

--
Evgeniy Polyakov

* Re: Who's allowed to set a skb destructor?
From: Andi Kleen @ 2007-07-05 10:08 UTC
To: Brice Goglin; +Cc: netdev

Brice Goglin <Brice.Goglin@ens-lyon.org> writes:

> I am trying to understand whether I can setup a skb destructor in my
> code (which is basically a protocol above dev_queue_xmit() and co). From
> what I see in many parts in the current kernel code, the "protocol" (I
> mean, the one who actually creates the skb) may setup a destructor.

The socket layer generally needs it for its own accounting. Unless you never pass it up you can't use it.

> However, I also see some places where some low-level drivers might be
> using a destructor too, without apparently checking whether an upper
> layer already uses one. For instance, write_ofld_wr() in cxgb3/sge.c.

Likely a bug. Normally that should not slip past code review.

> found some old threads about adding support for multiple destructors but
> I don't see anything like this in the current kernel.
>
> So, I'd like to have a clear statement about who's allowed to use a
> destructor :)

The traditional standpoint was that having your own large skb pools is not recommended, because you won't interact well with the rest of the system when it is running low on memory and you are tying up memory.

Essentially you would recreate all the problems traditional Unix systems have with fixed-size mbuf pools. Linux always used a more dynamic and flexible allocate-only-as-you-need approach, even when it can have a little more overhead in managing IOMMUs etc.

These days there are shrinker callbacks that would in theory allow you to handle this, but it would likely still be hard to implement correctly.

-Andi

* Re: Who's allowed to set a skb destructor?
From: Divy Le Ray @ 2007-07-05 11:07 UTC
To: Andi Kleen; +Cc: Brice Goglin, netdev

Andi Kleen wrote:
> Brice Goglin <Brice.Goglin@ens-lyon.org> writes:
>
>> I am trying to understand whether I can setup a skb destructor in my
>> code (which is basically a protocol above dev_queue_xmit() and co). From
>> what I see in many parts in the current kernel code, the "protocol" (I
>> mean, the one who actually creates the skb) may setup a destructor.
>
> The socket layer generally needs it for its own accounting.
> Unless you never pass it up you can't use it.
>
>> However, I also see some places where some low-level drivers might be
>> using a destructor too, without apparently checking whether an upper
>> layer already uses one. For instance, write_ofld_wr() in cxgb3/sge.c.
>
> Likely a bug. Normally that should not slip past code review.

Andi,

The destructor method is set and used for skbs originating from the RDMA driver sitting above cxgb3. The patch introducing this code was discussed at the time.
http://marc.info/?l=linux-netdev&m=117029329230969&w=2

Cheers,
Divy

* Re: Who's allowed to set a skb destructor?
From: Andi Kleen @ 2007-07-05 13:07 UTC
To: Divy Le Ray; +Cc: Andi Kleen, Brice Goglin, netdev

> The destructor method is set and used for skbs originating from the RDMA
> driver sitting above cxgb3.

If these skbs never reach the normal sockets-based stack, it might be ok.

-Andi
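
The ownership rule from the replies above (the allocator owns the destructor, and the socket layer relies on it for accounting) can be illustrated with a small userspace model. This is only a sketch: `model_sock`, `model_skb` and every function below are made-up stand-ins, not the kernel's `struct sock`, `struct sk_buff` or their helpers. In the kernel itself the corresponding step is done by `skb_set_owner_w()`, which installs `sock_wfree()` as the destructor. The model shows why a lower layer that overwrites the pointer without checking leaves the socket's send-buffer charge stuck.

```c
/* Userspace model only: these types mimic, but are NOT, the kernel's
 * struct sock and struct sk_buff. All names are illustrative. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct model_sock {
    size_t wmem_alloc;              /* bytes currently charged to the socket */
};

struct model_skb {
    struct model_sock *sk;          /* owning socket, or NULL */
    size_t truesize;                /* bytes charged to the owner */
    void (*destructor)(struct model_skb *skb);
};

/* Owner's destructor: return the charged bytes to the socket. */
static void model_sock_wfree(struct model_skb *skb)
{
    skb->sk->wmem_alloc -= skb->truesize;
}

/* Allocator side: charge the socket and install the owner's destructor. */
static struct model_skb *model_alloc_owned(struct model_sock *sk, size_t size)
{
    struct model_skb *skb = calloc(1, sizeof(*skb));

    if (!skb)
        return NULL;
    skb->sk = sk;
    skb->truesize = size;
    skb->destructor = model_sock_wfree;
    sk->wmem_alloc += size;
    return skb;
}

/* Free path: run whichever destructor is installed, then release. */
static void model_kfree_skb(struct model_skb *skb)
{
    if (skb->destructor)
        skb->destructor(skb);
    free(skb);
}

/* A lower layer's callback that knows nothing about socket accounting. */
static void model_driver_done(struct model_skb *skb)
{
    (void)skb;
}

/* Returns the bytes still charged to the socket after the skb is freed. */
size_t model_leaked_bytes(int driver_overwrites)
{
    struct model_sock sk = { 0 };
    struct model_skb *skb = model_alloc_owned(&sk, 256);

    assert(skb != NULL);
    if (driver_overwrites)
        skb->destructor = model_driver_done;    /* owner's callback is lost */
    model_kfree_skb(skb);
    return sk.wmem_alloc;
}
```

In this model, `model_leaked_bytes(0)` leaves nothing charged, while `model_leaked_bytes(1)` leaves the full 256 bytes charged forever. That is the failure a driver-set destructor would cause for skbs that do come from the socket layer, and why Andi's "might be ok" is conditional on the RDMA skbs never reaching it.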

* Re: Who's allowed to set a skb destructor?
From: Jarek Poplawski @ 2007-07-05 12:28 UTC
To: Andi Kleen; +Cc: Brice Goglin, netdev, Evgeniy Polyakov, Divy Le Ray

On 05-07-2007 12:08, Andi Kleen wrote:
...
> The traditional standpoint was that having your own large skb pools
> is not recommended because you won't interact well with the
> rest of the system running low on memory and you tying up
> memory.
>
> Essentially you would recreate all the problems traditional Unix
> systems have with fixed size mbuf pools. Linux always used a more
> dynamic and flexible allocate-only-as-you-need approach even when it
> can have a little more overhead in managing IOMMUs etc.

I wonder if it's very unsound to think about a one-way list of destructors. Of course, non-owners could only clean up their private allocations. Wouldn't this save some skb cloning, copying or adding new fields for private info?

Regards,
Jarek P.

* Re: Who's allowed to set a skb destructor?
From: Evgeniy Polyakov @ 2007-07-05 12:28 UTC
To: Jarek Poplawski; +Cc: Andi Kleen, Brice Goglin, netdev, Divy Le Ray

Hi, Jarek.

On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski (jarkao2@o2.pl) wrote:
> I wonder if it's very unsound to think about a one way list
> of destructors. Of course, not owners could only clean their
> private allocations. Wouldn't this save some skb cloning,
> copying or adding new fields for private infos?

There should not be any additional allocations, since they are very slow; that part of mbuf is really horrible for performance. OpenBSD hackers removed the additional allocation of the mbuf tag in the PF code during the last hackathon, which doubled its performance. That is why an skb has only one control structure and one data area, which incorporates additional control information, so there is no need for multiple destructors.

> Regards,
> Jarek P.

--
Evgeniy Polyakov

* Re: Who's allowed to set a skb destructor?
From: Jarek Poplawski @ 2007-07-05 13:00 UTC
To: Evgeniy Polyakov; +Cc: Andi Kleen, Brice Goglin, netdev, Divy Le Ray

On Thu, Jul 05, 2007 at 04:28:47PM +0400, Evgeniy Polyakov wrote:
> Hi, Jarek.
>
> On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski (jarkao2@o2.pl) wrote:
> > I wonder if it's very unsound to think about a one way list
> > of destructors. Of course, not owners could only clean their
> > private allocations. Wouldn't this save some skb cloning,
> > copying or adding new fields for private infos?
>
> There should not be any additional allocations, since they are very
> slow, that part of mbuf is really horrible for performance - openbsd
> hackers removed additional allocation of mbuf tag in PF code during the
> last hackathon, which doubled its performance, that is why skb has only
> one control structure and data area, which incorporates additional
> control information, thus there is no need for multiple destructors.

Of course, my knowledge of this is far from sufficient, and maybe I got this reversed, but from Andi's words I understood that Linux prefers another (mixed) approach, so I thought such a list would be a consequence...

Thanks,
Jarek P.

* Re: Who's allowed to set a skb destructor?
From: Jarek Poplawski @ 2007-07-06 9:08 UTC
To: Evgeniy Polyakov; +Cc: Andi Kleen, Brice Goglin, netdev, Divy Le Ray

On Thu, Jul 05, 2007 at 04:28:47PM +0400, Evgeniy Polyakov wrote:
> Hi, Jarek.
>
> On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski (jarkao2@o2.pl) wrote:
> > I wonder if it's very unsound to think about a one way list
> > of destructors. Of course, not owners could only clean their
> > private allocations. Wouldn't this save some skb cloning,
> > copying or adding new fields for private infos?
>
> There should not be any additional allocations, since they are very
> slow, that part of mbuf is really horrible for performance - openbsd
> hackers removed additional allocation of mbuf tag in PF code during the
> last hackathon, which doubled its performance, that is why skb has only
> one control structure and data area, which incorporates additional
> control information, thus there is no need for multiple destructors.

I'd like to add a few words about performance-oriented thinking. Some time ago I mainly read networking/admin lists. One of the most frequent questions was: which should I choose, Linux or BSD? Very often BSD was praised for better performance, but Linux was almost always advised as more universal (even by people who said they used both). BSDs were sometimes recommended for specific jobs like mail etc., but usually Linux fitted the needs better. Linux looked especially good as an internet gateway/router/firewall/antispam box, and the main reasons were netfilter with additional, unofficial patches, e.g. l-7 filtering and imq. BSD was no option here.

Some time later, reading this list, I found that many people almost hate netfilter for its performance. You can imagine how l-7 adds to this "performance". IMQ isn't even mentioned here; it looks like some dirty word (the lack of programmers affects its quality and doesn't help Linux either). But it's nowhere near performance either. I can also remember quite a lot of questions like: how can I avoid tc/ip and do this with netfilter only?

Probably most of the readers/writers were admins of small or medium networks (though quite often servicing hundreds or thousands of boxes too), probably not always advanced enough, but you know what, they made up 99% of those interested.

So, I understand something could have changed with VoIP, and there are high-performance Linux servers too (their admins have never heard of imq), and probably thinking about them could pay off better, but there could be some cost to such thinking too.

Regards,
Jarek P.

PS: in my opinion, lack of Linux performance wasn't even the second most frequently asked question there; it was rather this: why does my new & beautiful Linux box sometimes lock up?

* Re: Who's allowed to set a skb destructor?
From: Jarek Poplawski @ 2007-07-06 9:44 UTC
To: Evgeniy Polyakov; +Cc: Andi Kleen, Brice Goglin, netdev, Divy Le Ray

On Fri, Jul 06, 2007 at 11:08:35AM +0200, Jarek Poplawski wrote:
...
> BSDs were sometimes recommended for specific jobs like mail etc.
> but usually linux better fitted the needs. Especially well linux
> appeared for an internet gateway/router/firewall/antispam thing,
> and the main reasons were: netfilter with additional, unofficial
> patches e.g. l-7 filtering and imq. BSD was no option here.

I forgot to mention two other "performance boosters" which very often completed these solutions: htb or hfsc.

Jarek P.

* Re: Who's allowed to set a skb destructor?
From: Andi Kleen @ 2007-07-05 13:06 UTC
To: Jarek Poplawski; +Cc: Andi Kleen, Brice Goglin, netdev, Evgeniy Polyakov, Divy Le Ray

On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski wrote:
> I wonder if it's very unsound to think about a one way list
> of destructors. Of course, not owners could only clean their
> private allocations. Wouldn't this save some skb cloning,
> copying or adding new fields for private infos?

skb cloning isn't very expensive when you need it. And skbs have a little private area you can use for your own stuff while you have them queued (skb->cb).

As a historical note, one of the big changes during the Linux 2.0 and 2.1 TCP rewrite was that TCP was changed to always clone for the retransmit queue. This cleaned up the code greatly and fixed many problems. Cloning was also especially optimized for this. When TCP, which is about one of the most performance-critical protocols around, can afford it, other code likely can too.

-Andi
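
As a rough illustration of the two mechanisms Andi mentions, here is a userspace model of a per-buffer scratch area plus a clone that copies only the header and shares the payload. Everything in it is an assumption for illustration: `model_skb`, `model_data`, `my_proto_cb`, the 48-byte size and the helper names are invented and are not the kernel API. Kernel code normally reaches its private state by casting `skb->cb` directly (as `TCP_SKB_CB()` does); the model uses `memcpy()` instead so that it stays valid standard C outside the kernel.

```c
/* Userspace model only; not the kernel's struct sk_buff. */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_CB_SIZE 48            /* assumed size of the scratch area */

struct model_data {
    int refcnt;                     /* number of headers sharing the payload */
    unsigned char payload[64];
};

struct model_skb {
    char cb[MODEL_CB_SIZE];         /* per-header private scratch area */
    struct model_data *data;        /* shared between a buffer and its clones */
};

/* Hypothetical per-protocol state kept in the scratch area. */
struct my_proto_cb {
    uint32_t seq;
    uint32_t retries;
};

static_assert(sizeof(struct my_proto_cb) <= MODEL_CB_SIZE,
              "private state must fit in the scratch area");

void my_proto_cb_store(struct model_skb *skb, const struct my_proto_cb *v)
{
    memcpy(skb->cb, v, sizeof(*v));
}

struct my_proto_cb my_proto_cb_load(const struct model_skb *skb)
{
    struct my_proto_cb v;

    memcpy(&v, skb->cb, sizeof(v));
    return v;
}

struct model_skb *model_skb_alloc(void)
{
    struct model_skb *skb = calloc(1, sizeof(*skb));

    if (!skb)
        return NULL;
    skb->data = calloc(1, sizeof(*skb->data));
    if (!skb->data) {
        free(skb);
        return NULL;
    }
    skb->data->refcnt = 1;
    return skb;
}

/* Clone: new header with its own copy of cb, payload shared by refcount. */
struct model_skb *model_skb_clone(const struct model_skb *skb)
{
    struct model_skb *n = malloc(sizeof(*n));

    if (!n)
        return NULL;
    memcpy(n->cb, skb->cb, sizeof(n->cb));
    n->data = skb->data;
    n->data->refcnt++;
    return n;
}

/* Drop one header; the payload goes away with the last reference. */
void model_skb_free(struct model_skb *skb)
{
    if (--skb->data->refcnt == 0)
        free(skb->data);
    free(skb);
}
```

The point of the model is that a clone costs one header allocation and no payload copy, and that each header carries its own scratch area, so a layer holding a clone on its queue can keep a few integers of state without touching the original owner's destructor.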

* Re: Who's allowed to set a skb destructor?
From: Jarek Poplawski @ 2007-07-05 13:51 UTC
To: Andi Kleen; +Cc: Brice Goglin, netdev, Evgeniy Polyakov, Divy Le Ray

On Thu, Jul 05, 2007 at 03:06:40PM +0200, Andi Kleen wrote:
> On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski wrote:
> > I wonder if it's very unsound to think about a one way list
> > of destructors. Of course, not owners could only clean their
> > private allocations. Wouldn't this save some skb cloning,
> > copying or adding new fields for private infos?
>
> skb cloning isn't very expensive when you need it. And they
> got a little private area you can use for your own stuff
> while you have it queued (skb->cb)

Not expensive in speed, but allocating a whole skb's worth of memory when you need, e.g., 2 or 3 integers looks a little expensive.

> As a historical note one of the big changes during the Linux 2.0
> and 2.1 TCP rewrite was that TCP was changed to always clone for the
> retransmit queue. This cleaned up the code greatly and fixed
> many problems. Cloning was also especially optimized for this. When TCP
> which is about one of the most performance critical protocols around can
> afford it likely other code can too.

I've read opinions that the current skb structure is far from optimal. So it seems cloning wasn't enough in many situations, and fields were added. Of course, that's only part of the story: some other clients couldn't count on the structure being changed for them, so they probably did it another, more expensive way?

Jarek P.

* Re: Who's allowed to set a skb destructor?
From: Jarek Poplawski @ 2007-07-06 7:47 UTC
To: Andi Kleen; +Cc: Brice Goglin, netdev, Evgeniy Polyakov, Divy Le Ray

On Thu, Jul 05, 2007 at 03:06:40PM +0200, Andi Kleen wrote:
> On Thu, Jul 05, 2007 at 02:28:50PM +0200, Jarek Poplawski wrote:
> > I wonder if it's very unsound to think about a one way list
> > of destructors. Of course, not owners could only clean their
> > private allocations. Wouldn't this save some skb cloning,
> > copying or adding new fields for private infos?
>
> skb cloning isn't very expensive when you need it. And they
> got a little private area you can use for your own stuff
> while you have it queued (skb->cb)
>
> As a historical note one of the big changes during the Linux 2.0
> and 2.1 TCP rewrite was that TCP was changed to always clone for the
> retransmit queue. This cleaned up the code greatly and fixed
> many problems. Cloning was also especially optimized for this. When TCP
> which is about one of the most performance critical protocols around can
> afford it likely other code can too.

I've thought about this a bit more and, if I'm not missing something, there is a possibility to use these things together. Let's imagine a simplified API like this:

- a driver which needs a bit of space to track skbs in a few places registers itself with some function, telling it the size, maybe a callback/destructor and maybe a protocol id; some index is returned;
- if this is the first one registered, the API allocates new space using skb cloning or some similar slab pool to get blank space, and reserves space for this driver according to the index (internally mapped to some offset); from this moment every new skb is automatically 'cloned', and the driver can read/write its place using the API to map the requests;
- drivers registered later use the same 'clone' unless there is no more space, in which case further 'clones' are generated;
- the lifetime of such 'clones' is controlled similarly to the 'real clones'; with the most basic version, destructors could be avoided;
- some indexes could be made public constants to allow sharing.

Is this wrong?

Regards,
Jarek P.
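
The index-to-offset part of the proposal above can be sketched in a few lines of userspace C. This is purely hypothetical: no such API exists in the kernel as far as this thread shows, and `side_registry`, `side_area`, `side_register()`, `side_slot()` and the 64-byte and 8-user limits are all invented for the sketch. It also simplifies the proposal: there is no callback/destructor or protocol id, and registration simply fails when the area is full instead of generating a further 'clone'.

```c
/* Hypothetical sketch of the proposed registration scheme; not a kernel API. */
#include <assert.h>
#include <stddef.h>

#define SIDE_AREA_SIZE 64           /* assumed size of one shared side area */
#define SIDE_MAX_USERS 8            /* assumed limit on registered drivers */
#define SIDE_ALIGN     8            /* every slot starts on an 8-byte boundary */

static_assert(SIDE_AREA_SIZE % SIDE_ALIGN == 0,
              "area size must be a multiple of the slot alignment");

/* Global bookkeeping: which index maps to which offset. */
struct side_registry {
    size_t offset[SIDE_MAX_USERS];
    size_t size[SIDE_MAX_USERS];
    int nr_users;
    size_t used;                    /* bytes of the area already reserved */
};

/* One such area would travel with each buffer. */
struct side_area {
    _Alignas(SIDE_ALIGN) unsigned char bytes[SIDE_AREA_SIZE];
};

/* Reserve 'size' bytes for a driver; returns its index, or -1 if full. */
int side_register(struct side_registry *reg, size_t size)
{
    size_t rounded;

    if (size == 0 || size > SIDE_AREA_SIZE)
        return -1;
    rounded = (size + SIDE_ALIGN - 1) & ~(size_t)(SIDE_ALIGN - 1);
    if (reg->nr_users == SIDE_MAX_USERS ||
        rounded > SIDE_AREA_SIZE - reg->used)
        return -1;

    reg->offset[reg->nr_users] = reg->used;
    reg->size[reg->nr_users] = size;
    reg->used += rounded;
    return reg->nr_users++;
}

/* Map a driver's index to its private slot inside one buffer's area. */
void *side_slot(const struct side_registry *reg, struct side_area *area,
                int idx)
{
    if (idx < 0 || idx >= reg->nr_users)
        return NULL;
    return area->bytes + reg->offset[idx];
}
```

Even in this reduced form the sketch shows where the cost lands: every buffer carries the whole side area whether or not any registered driver touches it, which is the kind of extra per-skb allocation Evgeniy argued against earlier in the thread.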