Netdev List


* Re: loosing IPMI-card by loading netconsole
From: "Brandeburg, Jesse" @ 2010-05-14 17:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ronciak, John, Henning Fehrmann, Kirsher, Jeffrey T,
	Allan, Bruce W, Waskiewicz Jr, Peter P, netdev@vger.kernel.org,
	Matt Mackall, Carsten Aulbert, e1000-devel
In-Reply-To: <4BED79EB.1000204@kernel.org>

On Fri, 2010-05-14 at 09:27 -0700, Tejun Heo wrote:
> Hello, John.
> 
> As Henning seems offline, I'll try to fill in.
> 
> On 05/14/2010 04:51 PM, Ronciak, John wrote:
> > Sorry to hear about the problem you are having Henning.  What do you
> > mean when you say "it disappears"?
> 
> It stops responding to IPMI requests.

We've actually had quite a few problems like this over the years, so I'm
not quite so surprised to hear about something like this.

It's easy to break the reception of IPMI packets: there are a couple of
registers that must be correctly configured at all points of the driver
lifetime (probe only, administratively down, up).


> > Can both eth0 and eth1 ping (or be pinged)?  Do all the networking
> > devices still show up in the system when you do an 'lspci'?
> 
> Yeah, everything other than IPMI works just fine.
> 
> > What happens if you down and then up the interface you are having
> > problems with?  Does 'rmmod' do the same thing as your removal
> > method?
> 
> Haven't tried these but well I think rmmoding should achieve about the
> same thing.
> 
> > Is there anything in the system logs saying anything about the
> > interfaces?
> 
> Nope.

One thing that would really help us is to see the stats from ethtool -S
ethX when the interface is up and not receiving IPMI traffic.
 
The other "smoking gun" indicator is the output of the register dump
tool called ethregs that we have posted at sourceforge.  Please gather
registers for the card in question before and after loading netconsole.

http://prdownloads.sf.net/e1000/ethregs-1.7.2.tar.gz

> > We have not had reports of this so this is a bit unusual.  Please let us know.
> > 
> > Does this happen on other systems as well or just one particular system?
> 
> Yeah, it happens on at least several hundred machines, so not an
> isolated hardware issue at all.
> 
> To sum up.
> 
> On 2.6.27.39, netconsole + IPMI works fine.  On 2.6.32.7, as soon as
> netconsole is loaded, IPMI stops working.  Unloading netconsole
> doesn't revive IPMI but detaching the driver from the controller does.
> In both cases, usual networking works fine.

I think that "loading netconsole" means bringing the interface "UP" in
this case; is this correct?  To ask another way: Is network traffic
active on the interface in question before netconsole is loaded?

Jesse



* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Patrick McHardy @ 2010-05-14 17:26 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Arnd Bergmann, davem, netdev, chrisw
In-Reply-To: <C812D414.3180C%scofeldm@cisco.com>

Scott Feldman wrote:
> On 5/14/10 9:42 AM, "Patrick McHardy" <kaber@trash.net> wrote:
> 
>> Arnd Bergmann wrote:
>>> Maybe a better structure would be to separate the two cases, also allowing
>>> a port profile to be associated with both the PF and with each of its VFs?
>>>
>>> Something like this:
>>>
>>> [IFLA_NUM_VF]
>>> [IFLA_VF_PORTS]
>>>   [IFLA_VF_PORT]
>>>     [IFLA_VF_PORT_*], ...
>>>   [IFLA_VF_PORT]
>>>     [IFLA_VF_PORT_*], ...
>>> [IFLA_PORT_SELF]
>>>   [IFLA_VF_PORT_*], ...
>> That would also be fine.
> 
> I want to make sure I've got this right before starting on ver8 of patch:
> 
>     - we'll use the layout listed above
> 
>     - RTM_SETLINK msg includes the full nested layout
> 
>         - contains IFLA_VF_PORTs for all VFs of a PF
>         - OR, contains IFLA_PORT_SELF if PF is its own VF
> 
>         - it's up to the receiver to compare for changes for each VF
> 
>     - RTM_GETLINK msg includes the full nested layout
> 
>         - same rules as RTM_SETLINK above
> 
> I think we should redo the other IFLA_VF_xxx msgs in the same style.  I'm
> not going to tackle that for IFLA_VF_PORTS patch, but it would be a good
> followup patch.

Agreed.

> Do we have a plan?

That sounds good to me. If you have any netlink related questions,
just let me know.
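
For readers following the thread, here is a minimal userspace sketch of the
nested layout agreed on above. It is only an illustration: the SK_* attribute
numbers and the sk_* helpers are hypothetical names made up for this sketch,
not the values in if_link.h or the kernel's nla_* API, and the "p0" payload is
a placeholder. It builds the "AND" case, where a PF carries both a VF_PORTS
nest and a PORT_SELF nest.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute numbers for illustration only; these are NOT
 * the values from if_link.h. */
enum {
	SK_NUM_VF = 1,
	SK_VF_PORTS,
	SK_VF_PORT,
	SK_PORT_SELF,
	SK_PORT_PROFILE,
};

/* Same shape as struct nlattr: total length (header + payload), type. */
struct sk_attr {
	uint16_t len;
	uint16_t type;
};

/* Attributes are padded to a 4-byte boundary, as in netlink. */
#define SK_ALIGN(n) (((n) + 3u) & ~(size_t)3u)

struct sk_buf {
	unsigned char data[256];
	size_t used;
};

/* Append one attribute and return the offset of its header.
 * No bounds checking: the caller keeps the message under 256 bytes. */
static size_t sk_put(struct sk_buf *b, uint16_t type,
		     const void *payload, size_t plen)
{
	size_t off = b->used;
	struct sk_attr hdr;

	hdr.len = (uint16_t)(sizeof(hdr) + plen);
	hdr.type = type;
	memcpy(b->data + off, &hdr, sizeof(hdr));
	if (plen)
		memcpy(b->data + off + sizeof(hdr), payload, plen);
	b->used = off + SK_ALIGN(sizeof(hdr) + plen);
	return off;
}

/* A nest is an attribute whose payload is more attributes. */
static size_t sk_nest_start(struct sk_buf *b, uint16_t type)
{
	return sk_put(b, type, NULL, 0);
}

/* Fix up the nest length once all children have been appended. */
static void sk_nest_end(struct sk_buf *b, size_t off)
{
	struct sk_attr hdr;

	memcpy(&hdr, b->data + off, sizeof(hdr));
	hdr.len = (uint16_t)(b->used - off);
	memcpy(b->data + off, &hdr, sizeof(hdr));
}

/* Read back the length field of the attribute at a given offset. */
static uint16_t sk_len_at(const struct sk_buf *b, size_t off)
{
	struct sk_attr hdr;

	memcpy(&hdr, b->data + off, sizeof(hdr));
	return hdr.len;
}

/* Build NUM_VF, one VF_PORT nest per VF inside VF_PORTS, then PORT_SELF.
 * Returns the message length, or 0 if num_vf is too large for the buffer. */
static size_t sk_build(struct sk_buf *b, uint32_t num_vf)
{
	static const char profile[] = "p0";	/* placeholder payload */
	size_t ports, port, self;
	uint32_t i;

	if (num_vf > 8)
		return 0;
	memset(b, 0, sizeof(*b));
	sk_put(b, SK_NUM_VF, &num_vf, sizeof(num_vf));

	ports = sk_nest_start(b, SK_VF_PORTS);
	for (i = 0; i < num_vf; i++) {
		port = sk_nest_start(b, SK_VF_PORT);
		sk_put(b, SK_PORT_PROFILE, profile, sizeof(profile));
		sk_nest_end(b, port);
	}
	sk_nest_end(b, ports);

	self = sk_nest_start(b, SK_PORT_SELF);
	sk_put(b, SK_PORT_PROFILE, profile, sizeof(profile));
	sk_nest_end(b, self);

	return b->used;
}
```

The receiver would walk the VF_PORTS nest and compare each VF_PORT against its
current state, as described in the plan above.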


* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Arnd Bergmann @ 2010-05-14 17:29 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Patrick McHardy, davem, netdev, chrisw
In-Reply-To: <C812D414.3180C%scofeldm@cisco.com>

On Friday 14 May 2010 19:19:00 Scott Feldman wrote:
> I want to make sure I've got this right before starting on ver8 of patch:
> 
>     - we'll use the layout listed above
> 
>     - RTM_SETLINK msg includes the full nested layout
> 
>         - contains IFLA_VF_PORTs for all VFs of a PF
>         - OR, contains IFLA_PORT_SELF if PF is its own VF
> 
>         - it's up to the receiver to compare for changes for each VF
> 
>     - RTM_GETLINK msg includes the full nested layout
> 
>         - same rules as RTM_SETLINK above

I was thinking that a device could have both IFLA_VF_PORTS and IFLA_PORT_SELF,
but you know more about the IOV specifics. If an adapter having multiple
VFs always gets configured as VF 0 itself, that would be fine as well, otherwise
we could have an extra argument to the two device driver callbacks to
differentiate VF/SELF. As long as this does not impact the user ABI, we
could do either.

> I think we should redo the other IFLA_VF_xxx msgs in the same style.  I'm
> not going to tackle that for IFLA_VF_PORTS patch, but it would be a good
> followup patch.

I fear it's too late for that now. While we have not yet released 2.6.34
and 2.6.33 does not contain the broken message, it's extremely late in the
stabilization phase of v2.6.34, so I doubt that there is still a chance for
that at this point.

	Arnd


* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Chris Wright @ 2010-05-14 17:35 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Patrick McHardy, Arnd Bergmann, davem, netdev, chrisw
In-Reply-To: <C812D414.3180C%scofeldm@cisco.com>

* Scott Feldman (scofeldm@cisco.com) wrote:
> On 5/14/10 9:42 AM, "Patrick McHardy" <kaber@trash.net> wrote:
> 
> > Arnd Bergmann wrote:
> >> Maybe a better structure would be to separate the two cases, also allowing
> >> a port profile to be associated with both the PF and with each of its VFs?
> >> 
> >> Something like this:
> >> 
> >> [IFLA_NUM_VF]
> >> [IFLA_VF_PORTS]
> >>   [IFLA_VF_PORT]
> >>     [IFLA_VF_PORT_*], ...
> >>   [IFLA_VF_PORT]
> >>     [IFLA_VF_PORT_*], ...
> >> [IFLA_PORT_SELF]
> >>   [IFLA_VF_PORT_*], ...
> > 
> > That would also be fine.
> 
> I want to make sure I've got this right before starting on ver8 of patch:
> 
>     - we'll use the layout listed above
> 
>     - RTM_SETLINK msg includes the full nested layout
> 
>         - contains IFLA_VF_PORTs for all VFs of a PF
>         - OR, contains IFLA_PORT_SELF if PF is its own VF
> 
>         - it's up to the receiver to compare for changes for each VF
> 
>     - RTM_GETLINK msg includes the full nested layout
> 
>         - same rules as RTM_SETLINK above
> 
> I think we should redo the other IFLA_VF_xxx msgs in the same style.  I'm
> not going to tackle that for IFLA_VF_PORTS patch, but it would be a good
> followup patch.

Patrick laid out some nice details before.  Here's the link:

http://thread.gmane.org/gmane.linux.network/151605/focus=151738

thanks,
-chris


* Re: loosing IPMI-card by loading netconsole
From: Matt Mackall @ 2010-05-14 17:37 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ronciak, John, Henning Fehrmann, Kirsher, Jeffrey T,
	Brandeburg, Jesse, Allan, Bruce W, Waskiewicz Jr, Peter P,
	netdev@vger.kernel.org, Carsten Aulbert
In-Reply-To: <4BED79EB.1000204@kernel.org>

On Fri, 2010-05-14 at 18:27 +0200, Tejun Heo wrote:
> Hello, John.
> 
> As Henning seems offline, I'll try to fill in.
> 
> On 05/14/2010 04:51 PM, Ronciak, John wrote:
> > Sorry to hear about the problem you are having Henning.  What do you
> > mean when you say "it disappears"?
> 
> It stops responding to IPMI requests.
> 
> > Can both eth0 and eth1 ping (or be pinged)?  Do all the networking
> > devices still show up in the system when you do an 'lspci'?
> 
> Yeah, everything other than IPMI works just fine.
> 
> > What happens if you down and then up the interface you are having
> > problems with?  Does 'rmmod' do the same thing as your removal
> > method?
> 
> Haven't tried these but well I think rmmoding should achieve about the
> same thing.
> 
> > Is there anything in the system logs saying anything about the
> > interfaces?
> 
> Nope.
> 
> > We have not had reports of this so this is a bit unusual.  Please let us know.
> > 
> > Does this happen on other systems as well or just one particular system?
> 
> Yeah, it happens on at least several hundred machines, so not an
> isolated hardware issue at all.
> 
> To sum up.
> 
> On 2.6.27.39, netconsole + IPMI works fine.  On 2.6.32.7, as soon as
> netconsole is loaded, IPMI stops working.  Unloading netconsole
> doesn't revive IPMI but detaching the driver from the controller does.
> In both cases, usual networking works fine.

Looks like a job for bisect.

-- 
Mathematics is the supreme nostalgia of our time.




* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Scott Feldman @ 2010-05-14 17:46 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Patrick McHardy, davem, netdev, chrisw
In-Reply-To: <201005141929.41534.arnd@arndb.de>

On 5/14/10 10:29 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:

> On Friday 14 May 2010 19:19:00 Scott Feldman wrote:
>> I want to make sure I've got this right before starting on ver8 of patch:
>> 
>>     - we'll use the layout listed above
>> 
>>     - RTM_SETLINK msg includes the full nested layout
>> 
>>         - contains IFLA_VF_PORTs for all VFs of a PF
>>         - OR, contains IFLA_PORT_SELF if PF is its own VF
>> 
>>         - it's up to the receiver to compare for changes for each VF
>> 
>>     - RTM_GETLINK msg includes the full nested layout
>> 
>>         - same rules as RTM_SETLINK above
> 
> I was thinking that a device could have both IFLA_VF_PORTS and IFLA_PORT_SELF,
> but you know more about the IOV specifics. If an adapter having multiple
> VFs always gets configured as VF 0 itself, that would be fine as well,
> otherwise
> we could have an extra argument to the two device driver callbacks to
> differentiate VF/SELF. As long as this does not impact the user ABI, we
> could do either.

I think you're right.  I should have said AND/OR.  I would rather not have
an extra argument to the driver callbacks.
  
>> I think we should redo the other IFLA_VF_xxx msgs in the same style.  I'm
>> not going to tackle that for IFLA_VF_PORTS patch, but it would be a good
>> followup patch.
> 
> I fear it's too late for that now. While we have not yet released 2.6.34
> and 2.6.33 does not contain the broken message, it's extremely late in the
> stabilization phase of v2.6.34, so I doubt that there is still a chance for
> that at this point.

That's too bad.  I wish Patrick's objections were honored and then we
wouldn't have followed that broken model!  Can the broken msgs be disabled
somehow for 2.6.34?  Keep the definitions in if_link.h but fail the SET/GET
actions in rtnetlink.c?

-scott



* Re: [PATCH v3 3/3] ptp: Added a clock that uses the eTSEC found on the MPC85xx.
From: Scott Wood @ 2010-05-14 17:46 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev, devicetree-discuss, linuxppc-dev
In-Reply-To: <ee6c3edca3ee6aa86565e59da999375f79c9de1b.1273855017.git.richard.cochran@omicron.at>

On 05/14/2010 11:46 AM, Richard Cochran wrote:
> diff --git a/Documentation/powerpc/dts-bindings/fsl/tsec.txt b/Documentation/powerpc/dts-bindings/fsl/tsec.txt
> index edb7ae1..b09ba66 100644
> --- a/Documentation/powerpc/dts-bindings/fsl/tsec.txt
> +++ b/Documentation/powerpc/dts-bindings/fsl/tsec.txt
> @@ -74,3 +74,59 @@ Example:
>   		interrupt-parent =<&mpic>;
>   		phy-handle =<&phy0>
>   	};
> +
> +* Gianfar PTP clock nodes
> +
> +General Properties:
> +
> +  - device_type  Should be "ptp_clock"

Device_type is deprecated in most contexts for flat device trees.

> +  - model        Model of the device.  Must be "eTSEC"

Model, while abused by the current gianfar binding code, is not supposed 
to be something that is ordinarily used to bind on.  It is supposed to 
be a freeform field for indicating the specific model of hardware, 
mainly for human consumption or as a last resort for working around 
problems.

Get rid of both device_type and model, and specify a compatible string 
instead (e.g. "fsl,etsec-ptp").  Or perhaps this should just be some 
additional properties on the existing gianfar nodes, rather than 
presenting it as a separate device?  How do you associate a given ptp 
block with the corresponding gianfar node?  If there are differences in 
ptp implementation between different versions of etsec, can the ptp 
driver see the etsec version register?

> +  - reg          Offset and length of the register set for the device
> +  - interrupts   There should be at least two and as many as four
> +                 PTP related interrupts
> +
> +Clock Properties:
> +
> +  - tclk_period  Timer reference clock period in nanoseconds.
> +  - tmr_prsc     Prescaler, divides the output clock.
> +  - tmr_add      Frequency compensation value.
> +  - cksel        0= external clock, 1= eTSEC system clock, 3= RTC clock input.
> +                 Currently the driver only supports choice "1".
> +  - tmr_fiper1   Fixed interval period pulse generator.
> +  - tmr_fiper2   Fixed interval period pulse generator.

Dashes are more typical in OF names than underscores, and it's generally 
better to be a little more verbose -- these aren't local loop iterators.

They should probably have an "fsl,ptp-" prefix as well.

> +  These properties set the operational parameters for the PTP
> +  clock. You must choose these carefully for the clock to work right.

Do these values describe the way the hardware is, or how it's been 
configured by firmware, or a set of values that are clearly optimal for 
this particular board?  If it's just configuration for the Linux driver 
that could reasonably differ based on what a given user or OS will want, 
the device tree probably isn't the right place for it.

> diff --git a/arch/powerpc/boot/dts/p2020ds.dts b/arch/powerpc/boot/dts/p2020ds.dts
> index 1101914..f72353a 100644
> --- a/arch/powerpc/boot/dts/p2020ds.dts
> +++ b/arch/powerpc/boot/dts/p2020ds.dts
> @@ -336,6 +336,20 @@
>   			phy_type = "ulpi";
>   		};
>
> +		ptp_clock@24E00 {
> +			device_type = "ptp_clock";
> +			model = "eTSEC";
> +			reg = <0x24E00 0xB0>;
> +			interrupts = <68 2 69 2 70 2>;
> +			interrupt-parent = < &mpic >;
> +			tclk_period = <5>;
> +			tmr_prsc = <200>;
> +			tmr_add = <0xCCCCCCCD>;
> +			cksel = <1>;
> +			tmr_fiper1 = <0x3B9AC9FB>;
> +			tmr_fiper2 = <0x0001869B>;
> +		};
> +

This one has 3 interrupts?  The driver supports only two.

> +/* Private globals */
> +static struct ptp_clock *gianfar_clock;

Do you not support more than one of these?

> +static struct etsects the_clock;

"The" clock?  As opposed to the "other" clock one line above? :-)

> +static irqreturn_t isr(int irq, void *priv)
> +{
> +	struct etsects *etsects = priv;
> +	struct ptp_clock_event event;
> +	u64 ns;
> +	u32 ack=0, lo, hi, mask, val;
> +
> +	val = gfar_read(&etsects->regs->tmr_tevent);
> +
> +	if (val & ETS1) {
> +		ack |= ETS1;
> +		hi = gfar_read(&etsects->regs->tmr_etts1_h);
> +		lo = gfar_read(&etsects->regs->tmr_etts1_l);
> +		event.type = PTP_CLOCK_EXTTS;
> +		event.index = 0;
> +		event.timestamp = ((u64) hi) << 32;
> +		event.timestamp |= lo;
> +		ptp_clock_event(gianfar_clock, &event);
> +	}
> +
> +	if (val & ETS2) {
> +		ack |= ETS2;
> +		hi = gfar_read(&etsects->regs->tmr_etts2_h);
> +		lo = gfar_read(&etsects->regs->tmr_etts2_l);
> +		event.type = PTP_CLOCK_EXTTS;
> +		event.index = 1;
> +		event.timestamp = ((u64) hi) << 32;
> +		event.timestamp |= lo;
> +		ptp_clock_event(gianfar_clock, &event);
> +	}
> +
> +	if (val & ALM2) {
> +		ack |= ALM2;
> +		if (etsects->alarm_value) {
> +			event.type = PTP_CLOCK_ALARM;
> +			event.index = 0;
> +			event.timestamp = etsects->alarm_value;
> +			ptp_clock_event(gianfar_clock, &event);
> +		}
> +		if (etsects->alarm_interval) {
> +			ns = etsects->alarm_value + etsects->alarm_interval;
> +			hi = ns >> 32;
> +			lo = ns & 0xffffffff;
> +			spin_lock(&register_lock);
> +			gfar_write(&etsects->regs->tmr_alarm2_l, lo);
> +			gfar_write(&etsects->regs->tmr_alarm2_h, hi);
> +			spin_unlock(&register_lock);
> +			etsects->alarm_value = ns;
> +		} else {
> +			gfar_write(&etsects->regs->tmr_tevent, ALM2);
> +			spin_lock(&register_lock);
> +			mask = gfar_read(&etsects->regs->tmr_temask);
> +			mask &= ~ALM2EN;
> +			gfar_write(&etsects->regs->tmr_temask, mask);
> +			spin_unlock(&register_lock);
> +			etsects->alarm_value = 0;
> +			etsects->alarm_interval = 0;
> +		}
> +	}
> +
> +	gfar_write(&etsects->regs->tmr_tevent, ack);
> +
> +	return IRQ_HANDLED;

Should only return IRQ_HANDLED if you found an event.
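
A minimal userspace sketch of that point, with made-up names: the SK_* bit
values and the sk_isr() helper are hypothetical and only model the decision
logic, not the real TMR_TEVENT layout or the kernel's irqreturn_t. The handler
accumulates an ack mask for the events it recognizes and reports "handled"
only when that mask is non-zero.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only; the real event
 * register layout comes from the hardware manual. */
#define SK_ETS1 (1u << 0)
#define SK_ETS2 (1u << 1)
#define SK_ALM2 (1u << 2)

/* Stand-in for the kernel's IRQ_NONE / IRQ_HANDLED. */
enum sk_irqreturn {
	SK_IRQ_NONE = 0,
	SK_IRQ_HANDLED = 1,
};

/* Decide which events to acknowledge from a snapshot of the event
 * register, and whether the interrupt was ours at all. */
static enum sk_irqreturn sk_isr(uint32_t tevent, uint32_t *ack_out)
{
	uint32_t ack = 0;

	if (tevent & SK_ETS1)
		ack |= SK_ETS1;	/* external timestamp 1 */
	if (tevent & SK_ETS2)
		ack |= SK_ETS2;	/* external timestamp 2 */
	if (tevent & SK_ALM2)
		ack |= SK_ALM2;	/* alarm */

	*ack_out = ack;

	/* Nothing recognized: let the core know this IRQ was not handled,
	 * so spurious or shared interrupts can be detected. */
	return ack ? SK_IRQ_HANDLED : SK_IRQ_NONE;
}
```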

> +	if (get_of_u32(node, "tclk_period", &etsects->tclk_period) ||
> +	    get_of_u32(node, "tmr_prsc", &etsects->tmr_prsc) ||
> +	    get_of_u32(node, "tmr_add", &etsects->tmr_add) ||
> +	    get_of_u32(node, "cksel", &etsects->cksel) ||
> +	    get_of_u32(node, "tmr_fiper1", &etsects->tmr_fiper1) ||
> +	    get_of_u32(node, "tmr_fiper2", &etsects->tmr_fiper2))
> +		return -ENODEV;

Might want to print an error so the user knows what's missing.
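
One way this could look, as a rough userspace sketch: walk a table of required
property names and report the first one that is absent. Everything here is a
stand-in; struct sk_prop, sk_get_u32() and sk_read_all() are hypothetical
helpers that replace the device tree lookup, and fprintf() stands in for
pr_err().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the properties found on a device tree node. */
struct sk_prop {
	const char *name;
	uint32_t value;
};

/* Returns 0 and stores the value if the property exists, -1 otherwise. */
static int sk_get_u32(const struct sk_prop *props, size_t n,
		      const char *name, uint32_t *out)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(props[i].name, name) == 0) {
			*out = props[i].value;
			return 0;
		}
	}
	return -1;
}

/* Property names as they appear in the patch under review. */
static const char *const sk_required[] = {
	"tclk_period", "tmr_prsc", "tmr_add",
	"cksel", "tmr_fiper1", "tmr_fiper2",
};
#define SK_NREQ (sizeof(sk_required) / sizeof(sk_required[0]))

/* Read every required property into out[] in table order.
 * Returns NULL on success, or the name of the first missing property
 * after printing an error that names it. */
static const char *sk_read_all(const struct sk_prop *props, size_t n,
			       uint32_t out[SK_NREQ])
{
	size_t i;

	for (i = 0; i < SK_NREQ; i++) {
		if (sk_get_u32(props, n, sk_required[i], &out[i])) {
			fprintf(stderr, "ptp sketch: missing property \"%s\"\n",
				sk_required[i]);
			return sk_required[i];
		}
	}
	return NULL;
}
```

A table also keeps the error message and the lookup in sync if properties are
renamed later.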

> +	for (i = 0; i < N_IRQS; i++) {
> +
> +		etsects->irq[i] = irq_of_parse_and_map(node, i);
> +
> +		if (etsects->irq[i] == NO_IRQ) {
> +			pr_err("irq[%d] not in device tree", i);
> +			return -ENODEV;
> +		}
> +
> +		if (request_irq(etsects->irq[i], isr, 0, DRIVER, etsects)) {
> +			pr_err("request_irq failed irq %d", etsects->irq[i]);
> +			return -ENODEV;
> +		}

You've got two IRQs, with the same handler, and the same dev_id?  From 
the manual it looks like there's one PTP interrupt per eTSEC (which 
would explain 3 interrupts on p2020).

> +static struct of_device_id match_table[] = {
> +	{ .type = "ptp_clock" },
> +	{},
> +};

This driver controls every possible PTP implementation?

-Scott


* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Scott Feldman @ 2010-05-14 17:54 UTC (permalink / raw)
  To: Chris Wright; +Cc: Patrick McHardy, Arnd Bergmann, davem, netdev
In-Reply-To: <20100514173530.GI5798@x200.localdomain>

On 5/14/10 10:35 AM, "Chris Wright" <chrisw@redhat.com> wrote:

> Patrick laid out some nice details before.  Here's the link:
> 
> http://thread.gmane.org/gmane.linux.network/151605/focus=151738

Double drats, it looks like that one was caught too late.  So we're
collectively agreeing to let a known bad netlink msg in?  I guess it can be
fixed up later with an IFLA_VF_INFOS nest, and move away from the broken
msgs.

-scott


* [PATCHv2] netfilter: Remove skb_is_nonlinear check from nf_conntrack_sip
From: Jason Gunthorpe @ 2010-05-14 18:01 UTC (permalink / raw)
  To: netfilter-devel, netdev, Patrick McHardy

At least the XEN net front driver always produces non linear skbs,
so the SIP module does nothing at all when used with that NIC.

Unconditionally linearize the skb..

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
 net/netfilter/nf_conntrack_sip.c |    9 +++------
 1 files changed, 3 insertions(+), 6 deletions(-)

Patrick/Jan, thanks.. This is what I wanted to do in the first place,
but I couldn't convince myself it was safe, as no other nf code does
this..

Unfortunately I can no longer test it :(
 
diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index 4b57216..02d0b59 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -1275,13 +1275,10 @@ static int sip_help(struct sk_buff *skb,
 
 	nf_ct_refresh(ct, skb, sip_timeout * HZ);
 
-	if (!skb_is_nonlinear(skb))
-		dptr = skb->data + dataoff;
-	else {
-		pr_debug("Copy of skbuff not supported yet.\n");
-		return NF_ACCEPT;
-	}
+	if (unlikely(skb_linearize(skb)))
+		return NF_DROP;
 
+	dptr = skb->data + dataoff;
 	datalen = skb->len - dataoff;
 	if (datalen < strlen("SIP/2.0 200"))
 		return NF_ACCEPT;
-- 
1.6.0.4



* [PATCH] net_sched: sch_hfsc: fix classification loops
From: kaber @ 2010-05-14 18:08 UTC (permalink / raw)
  To: davem; +Cc: netdev

From: Patrick McHardy <kaber@trash.net>

When attaching filters to a class pointing to a class higher up in the
hierarchy, classification may enter an endless loop. Currently this is
prevented for filters that are already resolved, but not for filters
resolved at runtime.

Only allow filters to point downwards in the hierarchy, similar to what
CBQ does.

Reported-by: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 net/sched/sch_hfsc.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index b38b39c..a435cf1 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -1155,7 +1155,7 @@ static struct hfsc_class *
 hfsc_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
 {
 	struct hfsc_sched *q = qdisc_priv(sch);
-	struct hfsc_class *cl;
+	struct hfsc_class *head, *cl;
 	struct tcf_result res;
 	struct tcf_proto *tcf;
 	int result;
@@ -1166,6 +1166,7 @@ hfsc_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
 			return cl;
 
 	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
+	head = &q->root;
 	tcf = q->root.filter_list;
 	while (tcf && (result = tc_classify(skb, tcf, &res)) >= 0) {
 #ifdef CONFIG_NET_CLS_ACT
@@ -1180,6 +1181,8 @@ hfsc_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
 		if ((cl = (struct hfsc_class *)res.class) == NULL) {
 			if ((cl = hfsc_find_class(res.classid, sch)) == NULL)
 				break; /* filter selected invalid classid */
+			if (cl->level >= head->level)
+				break; /* filter may only point downwards */
 		}
 
 		if (cl->level == 0)
@@ -1187,6 +1190,7 @@ hfsc_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
 
 		/* apply inner filter chain */
 		tcf = cl->filter_list;
+		head = cl;
 	}
 
 	/* classification failed, try default class */
-- 
1.7.0.4
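
To see why the level check is enough to prevent the loop, here is a small
userspace model. It is a sketch under simplifying assumptions: struct sk_class
and sk_classify() are hypothetical, each class has at most one filter, and that
filter's result is modeled as a single target pointer. Because every step must
move to a strictly lower level, the walk ends after at most "root level" steps.

```c
#include <stddef.h>

/* Simplified class: its level in the hierarchy (0 = leaf) and the class
 * that its single, hypothetical filter resolves to (NULL = no match). */
struct sk_class {
	int level;
	struct sk_class *target;
};

/* Walk the filter chain from the root. Returns the selected leaf, or NULL
 * when classification fails and the default class should be tried. */
static const struct sk_class *sk_classify(const struct sk_class *root)
{
	const struct sk_class *head = root;
	const struct sk_class *cl;

	while ((cl = head->target) != NULL) {
		/* Filters may only point downwards; a target at the same
		 * level or higher would allow a cycle, so give up here. */
		if (cl->level >= head->level)
			return NULL;

		if (cl->level == 0)
			return cl;	/* reached a leaf class */

		/* Apply the inner class's filter next. */
		head = cl;
	}
	return NULL;
}
```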



* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Patrick McHardy @ 2010-05-14 18:09 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Arnd Bergmann, davem, netdev, chrisw
In-Reply-To: <C812DA73.31839%scofeldm@cisco.com>

Scott Feldman wrote:
> On 5/14/10 10:29 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:
> 
>> I was thinking that a device could have both IFLA_VF_PORTS and IFLA_PORT_SELF,
>> but you know more about the IOV specifics. If an adapter having multiple
>> VFs always gets configured as VF 0 itself, that would be fine as well,
>> otherwise
>> we could have an extra argument to the two device driver callbacks to
>> differentiate VF/SELF. As long as this does not impact the user ABI, we
>> could do either.
> 
> I think you're right.  I should have said AND/OR.  I would rather not have
> an extra argument to the driver callbacks.
>   
>>> I think we should redo the other IFLA_VF_xxx msgs in the same style.  I'm
>>> not going to tackle that for IFLA_VF_PORTS patch, but it would be a good
>>> followup patch.
>> I fear it's too late for that now. While we have not yet released 2.6.34
>> and 2.6.33 does not contain the broken message, it's extremely late in the
>> stabilization phase of v2.6.34, so I doubt that there is still a chance for
>> that at this point.
> 
> That's too bad.  I wish Patrick's objections were honored and then we
> wouldn't have followed that broken model!  Can the broken msgs be disabled
> somehow for 2.6.34?  Keep the definitions in if_link.h but fail the SET/GET
> actions in rtnetlink.c?

That would be a possibility. Unfortunately I don't think we can fix
this in a backwards compatible way.


* Re: [PATCHv2] netfilter: Remove skb_is_nonlinear check from nf_conntrack_sip
From: Patrick McHardy @ 2010-05-14 18:13 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: netfilter-devel, netdev
In-Reply-To: <20100514180138.GF15969@obsidianresearch.com>

[-- Attachment #1: Type: text/plain, Size: 667 bytes --]

Jason Gunthorpe wrote:
> At least the XEN net front driver always produces non linear skbs,
> so the SIP module does nothing at all when used with that NIC.
> 
> Unconditionally linearize the skb..
> 
> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
> ---
>  net/netfilter/nf_conntrack_sip.c |    9 +++------
>  1 files changed, 3 insertions(+), 6 deletions(-)
> 
> Patrick/Jan, thanks.. This is what I wanted to do in the first place,
> but I couldn't convince myself it was safe, as no other nf code does
> this..

Your patch is based on an old version, the current version also
supports TCP. I'll commit this patch to my tree after some testing.

[-- Attachment #2: x --]
[-- Type: text/plain, Size: 881 bytes --]

diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index b20f427..45750cc 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -1393,10 +1393,8 @@ static int sip_help_tcp(struct sk_buff *skb, unsigned int protoff,
 
 	nf_ct_refresh(ct, skb, sip_timeout * HZ);
 
-	if (skb_is_nonlinear(skb)) {
-		pr_debug("Copy of skbuff not supported yet.\n");
+	if (unlikely(skb_linearize(skb)))
 		return NF_ACCEPT;
-	}
 
 	dptr = skb->data + dataoff;
 	datalen = skb->len - dataoff;
@@ -1455,10 +1453,8 @@ static int sip_help_udp(struct sk_buff *skb, unsigned int protoff,
 
 	nf_ct_refresh(ct, skb, sip_timeout * HZ);
 
-	if (skb_is_nonlinear(skb)) {
-		pr_debug("Copy of skbuff not supported yet.\n");
+	if (unlikely(skb_linearize(skb)))
 		return NF_ACCEPT;
-	}
 
 	dptr = skb->data + dataoff;
 	datalen = skb->len - dataoff;


* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Chris Wright @ 2010-05-14 18:25 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Scott Feldman, Arnd Bergmann, davem, netdev, chrisw
In-Reply-To: <4BED91D0.5020407@trash.net>

* Patrick McHardy (kaber@trash.net) wrote:
> Scott Feldman wrote:
> > On 5/14/10 10:29 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:
> > 
> >> I was thinking that a device could have both IFLA_VF_PORTS and IFLA_PORT_SELF,
> >> but you know more about the IOV specifics. If an adapter having multiple
> >> VFs always gets configured as VF 0 itself, that would be fine as well,
> >> otherwise
> >> we could have an extra argument to the two device driver callbacks to
> >> differentiate VF/SELF. As long as this does not impact the user ABI, we
> >> could do either.
> > 
> > I think you're right.  I should have said AND/OR.  I would rather not have
> > an extra argument to the driver callbacks.
> >   
> >>> I think we should redo the other IFLA_VF_xxx msgs in the same style.  I'm
> >>> not going to tackle that for IFLA_VF_PORTS patch, but it would be a good
> >>> followup patch.
> >> I fear it's too late for that now. While we have not yet released 2.6.34
> >> and 2.6.33 does not contain the broken message, it's extremely late in the
> >> stabilization phase of v2.6.34, so I doubt that there is still a chance for
> >> that at this point.
> > 
> > That's too bad.  I wish Patrick's objections were honored and then we
> > wouldn't have followed that broken model!  Can the broken msgs be disabled
> > somehow for 2.6.34?  Keep the definitions in if_link.h but fail the SET/GET
> > actions in rtnetlink.c?
> 
> That would be a possibility. Unfortunately I don't think we can fix
> this in a backwards compatible way.

$ git describe --contains ebc08a6f47ee76ecad8e9f26c26e6ec9b46ca659
v2.6.34-rc1~233^2~336

It's not released yet?


* Re: [PATCHv2] netfilter: Remove skb_is_nonlinear check from nf_conntrack_sip
From: Jason Gunthorpe @ 2010-05-14 18:26 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netfilter-devel, netdev
In-Reply-To: <4BED92AF.50704@trash.net>

On Fri, May 14, 2010 at 08:13:03PM +0200, Patrick McHardy wrote:
> Your patch is based on an old version, the current version also
> supports TCP. I'll commit this patch to my tree after some testing.

Thanks!

> diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
> index b20f427..45750cc 100644
> +++ b/net/netfilter/nf_conntrack_sip.c
> @@ -1393,10 +1393,8 @@ static int sip_help_tcp(struct sk_buff *skb, unsigned int protoff,
>  
>  	nf_ct_refresh(ct, skb, sip_timeout * HZ);
>  
> -	if (skb_is_nonlinear(skb)) {
> -		pr_debug("Copy of skbuff not supported yet.\n");
> +	if (unlikely(skb_linearize(skb)))
>  		return NF_ACCEPT;
> -	}

Should this be NF_DROP? As I understand it skb_linearize only fails
if it runs out of memory, which probably means dropping is OK. But
passing a packet that might need rewriting could be harmful..

Jason


* Re: [PATCHv2] netfilter: Remove skb_is_nonlinear check from nf_conntrack_sip
From: Jan Engelhardt @ 2010-05-14 18:33 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Jason Gunthorpe, netfilter-devel, netdev
In-Reply-To: <4BED92AF.50704@trash.net>


On Friday 2010-05-14 20:13, Patrick McHardy wrote:
>Jason Gunthorpe wrote:
>> At least the XEN net front driver always produces non linear skbs,
>> so the SIP module does nothing at all when used with that NIC.
>> 
>> Unconditionally linearize the skb..
>> 
>> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
>> ---
>>  net/netfilter/nf_conntrack_sip.c |    9 +++------
>>  1 files changed, 3 insertions(+), 6 deletions(-)
>> 
>> Patrick/Jan, thanks.. This is what I wanted to do in the first place,
>> but I couldn't convince myself it was safe, as no other nf code does
>> this..
>
>Your patch is based on an old version, the current version also
>supports TCP. I'll commit this patch to my tree after some testing.

nf_defrag defragments the packets, but then they're still non-linear?
I'm clearly missing something, could someone elaborate?


* Re: [PATCHv2] netfilter: Remove skb_is_nonlinear check from nf_conntrack_sip
From: Patrick McHardy @ 2010-05-14 18:42 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: netfilter-devel, netdev
In-Reply-To: <20100514182601.GJ15969@obsidianresearch.com>

Jason Gunthorpe wrote:
> On Fri, May 14, 2010 at 08:13:03PM +0200, Patrick McHardy wrote:
>> Your patch is based on an old version, the current version also
>> supports TCP. I'll commit this patch to my tree after some testing.
> 
> Thanks!
> 
>> diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
>> index b20f427..45750cc 100644
>> +++ b/net/netfilter/nf_conntrack_sip.c
>> @@ -1393,10 +1393,8 @@ static int sip_help_tcp(struct sk_buff *skb, unsigned int protoff,
>>  
>>  	nf_ct_refresh(ct, skb, sip_timeout * HZ);
>>  
>> -	if (skb_is_nonlinear(skb)) {
>> -		pr_debug("Copy of skbuff not supported yet.\n");
>> +	if (unlikely(skb_linearize(skb)))
>>  		return NF_ACCEPT;
>> -	}
> 
> Should this be NF_DROP? As I understand it skb_linearize only fails
> if it runs out of memory, which probably means dropping is OK. But
> passing a packet that might need rewriting could be harmful..

We so far also didn't rewrite the packet. But agreed, it's
a corner case and dropping it is the safer choice.


* Re: [PATCHv2] netfilter: Remove skb_is_nonlinear check from nf_conntrack_sip
From: Patrick McHardy @ 2010-05-14 18:45 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Jason Gunthorpe, netfilter-devel, netdev
In-Reply-To: <alpine.LSU.2.01.1005142032270.13800@obet.zrqbmnf.qr>

Jan Engelhardt wrote:
> On Friday 2010-05-14 20:13, Patrick McHardy wrote:
>> Jason Gunthorpe wrote:
>>> At least the XEN net front driver always produces non linear skbs,
>>> so the SIP module does nothing at all when used with that NIC.
>>>
>>> Unconditionally linearize the skb..
>>>
>>> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
>>> ---
>>>  net/netfilter/nf_conntrack_sip.c |    9 +++------
>>>  1 files changed, 3 insertions(+), 6 deletions(-)
>>>
>>> Patrick/Jan, thanks.. This is what I wanted to do in the first place,
>>> but I couldn't convince myself it was safe, as no other nf code does
>>> this..
>> Your patch is based on an old version, the current version also
>> supports TCP. I'll commit this patch to my tree after some testing.
> 
> nf_defrag defragments the packets, but then they're still non-linear?
> I'm clearly missing something, could someone elaborate?

We're talking about packets with non-linear data, which is unrelated
to fragments. Reassembled fragments are non-linear as well though.
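
To illustrate the distinction, here is a minimal userspace sketch of what "non-linear" means and what linearizing does. It is only a model under stated assumptions: `struct model_buf`, `model_is_nonlinear()` and `model_linearize()` are hypothetical names invented for this example and are not the kernel's `struct sk_buff` or `skb_linearize()`. The head area is assumed to be heap-allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model: payload split between a head area and extra chunks. */
struct model_frag {
	const unsigned char *data;
	size_t len;
};

struct model_buf {
	unsigned char *head;        /* contiguous (linear) area, malloc'ed */
	size_t head_len;            /* bytes stored in the linear area */
	struct model_frag frags[4]; /* payload living outside the linear area */
	size_t nr_frags;
};

/* Total payload length: linear area plus all chunks. */
static size_t model_len(const struct model_buf *b)
{
	size_t total = b->head_len, i;

	for (i = 0; i < b->nr_frags; i++)
		total += b->frags[i].len;
	return total;
}

/* Non-linear means some payload is not reachable through head alone. */
static int model_is_nonlinear(const struct model_buf *b)
{
	return model_len(b) != b->head_len;
}

/* Pull every chunk into the linear area. Returns 0, or -1 if out of memory. */
static int model_linearize(struct model_buf *b)
{
	size_t total = model_len(b), off = b->head_len, i;
	unsigned char *p;

	assert(b->nr_frags <= 4);
	if (!model_is_nonlinear(b))
		return 0;

	p = realloc(b->head, total);
	if (!p)
		return -1;	/* the only failure mode in this model */

	for (i = 0; i < b->nr_frags; i++) {
		memcpy(p + off, b->frags[i].data, b->frags[i].len);
		off += b->frags[i].len;
	}
	b->head = p;
	b->head_len = total;
	b->nr_frags = 0;
	return 0;
}
```

In this model a parser that only walks `head` for `head_len` bytes sees a truncated message until `model_linearize()` has run, which mirrors why the helper previously gave up on such packets.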

^ permalink raw reply

* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Patrick McHardy @ 2010-05-14 18:46 UTC (permalink / raw)
  To: Chris Wright; +Cc: Scott Feldman, Arnd Bergmann, davem, netdev
In-Reply-To: <20100514182526.GO5798@x200.localdomain>

Chris Wright wrote:
> * Patrick McHardy (kaber@trash.net) wrote:
>> Scott Feldman wrote:
>>> On 5/14/10 10:29 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:
>>>
>>>>> I think we should redo the other IFLA_VF_xxx msgs in the same style.  I'm
>>>>> not going to tackle that for IFLA_VF_PORTS patch, but it would be a good
>>>>> followup patch.
>>>> I fear it's too late for that now. While we have not yet released 2.6.34
>>>> and 2.6.33 does not contain the broken message, it's extremely late in the
>>>> stabilization phase of v2.6.34, so I doubt that there is still a chance for
>>>> that at this point.
>>> That's too bad.  I wish Patrick's objections were honored and then we
>>> wouldn't have followed that broken model!  Can the broken msgs be disabled
>>> somehow for 2.6.34?  Keep the definitions in if_link.h but fail the SET/GET
>>> actions in rtnetlink.c?
>> That would be a possibility. Unfortunately I don't think we can fix
>> this in a backwards compatible way.
> 
> $ git describe --contains ebc08a6f47ee76ecad8e9f26c26e6ec9b46ca659
> v2.6.34-rc1~233^2~336
> 
> It's not released yet?

Correct, it was added in 2.6.34-rc.

^ permalink raw reply

* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Chris Wright @ 2010-05-14 18:48 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Chris Wright, Scott Feldman, Arnd Bergmann, davem, shemminger,
	netdev
In-Reply-To: <4BED9A85.4080507@trash.net>

* Patrick McHardy (kaber@trash.net) wrote:
> Chris Wright wrote:
> > * Patrick McHardy (kaber@trash.net) wrote:
> >> Scott Feldman wrote:
> >>> On 5/14/10 10:29 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:
> >>>
> >>>>> I think we should redo the other IFLA_VF_xxx msgs in the same style.  I'm
> >>>>> not going to tackle that for IFLA_VF_PORTS patch, but it would be a good
> >>>>> followup patch.
> >>>> I fear it's too late for that now. While we have not yet released 2.6.34
> >>>> and 2.6.33 does not contain the broken message, it's extremely late in the
> >>>> stabilization phase of v2.6.34, so I doubt that there is still a chance for
> >>>> that at this point.
> >>> That's too bad.  I wish Patrick's objections were honored and then we
> >>> wouldn't have followed that broken model!  Can the broken msgs be disabled
> >>> somehow for 2.6.34?  Keep the definitions in if_link.h but fail the SET/GET
> >>> actions in rtnetlink.c?
> >> That would be a possibility. Unfortunately I don't think we can fix
> >> this in a backwards compatible way.
> > 
> > $ git describe --contains ebc08a6f47ee76ecad8e9f26c26e6ec9b46ca659
> > v2.6.34-rc1~233^2~336
> > 
> > It's not released yet?
> 
> Correct, it was added in 2.6.34-rc.

AFAICT iproute2 hasn't been released either w/ that support.
So, I'll prepare patches to fix it (or disable as Scott mentioned).
What do you think?

thanks,
-chris

^ permalink raw reply

* Re: [net-next-2.6 V7 PATCH 1/2] Add netlink support for virtual port management (was iovnl)
From: Patrick McHardy @ 2010-05-14 18:50 UTC (permalink / raw)
  To: Chris Wright; +Cc: Scott Feldman, Arnd Bergmann, davem, shemminger, netdev
In-Reply-To: <20100514184803.GP5798@x200.localdomain>

Chris Wright wrote:
> * Patrick McHardy (kaber@trash.net) wrote:
>> Chris Wright wrote:
>>> * Patrick McHardy (kaber@trash.net) wrote:
>>>> Scott Feldman wrote:
>>>>> On 5/14/10 10:29 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:
>>>>>
>>>>>>> I think we should redo the other IFLA_VF_xxx msgs in the same style.  I'm
>>>>>>> not going to tackle that for IFLA_VF_PORTS patch, but it would be a good
>>>>>>> followup patch.
>>>>>> I fear it's too late for that now. While we have not yet released 2.6.34
>>>>>> and 2.6.33 does not contain the broken message, it's extremely late in the
>>>>>> stabilization phase of v2.6.34, so I doubt that there is still a chance for
>>>>>> that at this point.
>>>>> That's too bad.  I wish Patrick's objections were honored and then we
>>>>> wouldn't have followed that broken model!  Can the broken msgs be disabled
>>>>> somehow for 2.6.34?  Keep the definitions in if_link.h but fail the SET/GET
>>>>> actions in rtnetlink.c?
>>>> That would be a possibility. Unfortunately I don't think we can fix
>>>> this in a backwards compatible way.
>>> $ git describe --contains ebc08a6f47ee76ecad8e9f26c26e6ec9b46ca659
>>> v2.6.34-rc1~233^2~336
>>>
>>> It's not released yet?
>> Correct, it was added in 2.6.34-rc.
> 
> AFAICT iproute2 hasn't been released either w/ that support.
> So, I'll prepare patches to fix it (or disable as Scott mentioned).
> What do you think?

That would be great, otherwise we'll probably have to support it
forever.

^ permalink raw reply

* Re: [PATCHv2] netfilter: Remove skb_is_nonlinear check from nf_conntrack_sip
From: Patrick McHardy @ 2010-05-14 19:26 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: netfilter-devel, netdev
In-Reply-To: <4BED99A3.2050404@trash.net>

[-- Attachment #1: Type: text/plain, Size: 1184 bytes --]

Patrick McHardy wrote:
> Jason Gunthorpe wrote:
>> On Fri, May 14, 2010 at 08:13:03PM +0200, Patrick McHardy wrote:
>>> Your patch is based on an old version, the current version also
>>> supports TCP. I'll commit this patch to my tree after some testing.
>> Thanks!
>>
>>> diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
>>> index b20f427..45750cc 100644
>>> +++ b/net/netfilter/nf_conntrack_sip.c
>>> @@ -1393,10 +1393,8 @@ static int sip_help_tcp(struct sk_buff *skb, unsigned int protoff,
>>>  
>>>  	nf_ct_refresh(ct, skb, sip_timeout * HZ);
>>>  
>>> -	if (skb_is_nonlinear(skb)) {
>>> -		pr_debug("Copy of skbuff not supported yet.\n");
>>> +	if (unlikely(skb_linearize(skb)))
>>>  		return NF_ACCEPT;
>>> -	}
>> Should this be NF_DROP? As I understand it skb_linearize only fails
>> if it runs out of memory, which probably means dropping is OK. But
>> passing a packet that might need rewriting could be harmful..
> 
> We so far also didn't rewrite the packet. But agreed, it's
> a corner case and dropping it is the safer choice.

This is what I've added to my tree. Tested with asterisk and TSO
enabled NIC, which fails without this patch.

[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1462 bytes --]

commit a1d7c1b4b8dfbc5ecadcff9284d64bb6ad4c0196
Author: Patrick McHardy <kaber@trash.net>
Date:   Fri May 14 21:18:17 2010 +0200

    netfilter: nf_ct_sip: handle non-linear skbs
    
    Handle non-linear skbs by linearizing them instead of silently failing.
    Long term the helper should be fixed to either work with non-linear skbs
    directly by using the string search API or work on a copy of the data.
    
    Based on patch by Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
    Signed-off-by: Patrick McHardy <kaber@trash.net>

diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index b20f427..53d8922 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -1393,10 +1393,8 @@ static int sip_help_tcp(struct sk_buff *skb, unsigned int protoff,
 
 	nf_ct_refresh(ct, skb, sip_timeout * HZ);
 
-	if (skb_is_nonlinear(skb)) {
-		pr_debug("Copy of skbuff not supported yet.\n");
-		return NF_ACCEPT;
-	}
+	if (unlikely(skb_linearize(skb)))
+		return NF_DROP;
 
 	dptr = skb->data + dataoff;
 	datalen = skb->len - dataoff;
@@ -1455,10 +1453,8 @@ static int sip_help_udp(struct sk_buff *skb, unsigned int protoff,
 
 	nf_ct_refresh(ct, skb, sip_timeout * HZ);
 
-	if (skb_is_nonlinear(skb)) {
-		pr_debug("Copy of skbuff not supported yet.\n");
-		return NF_ACCEPT;
-	}
+	if (unlikely(skb_linearize(skb)))
+		return NF_DROP;
 
 	dptr = skb->data + dataoff;
 	datalen = skb->len - dataoff;

^ permalink raw reply related

* Re: [PATCHv2] netfilter: Remove skb_is_nonlinear check from nf_conntrack_sip
From: Jan Engelhardt @ 2010-05-14 19:33 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Jason Gunthorpe, netfilter-devel, netdev
In-Reply-To: <4BEDA3CA.4030407@trash.net>


On Friday 2010-05-14 21:26, Patrick McHardy wrote:
>>> Should this be NF_DROP? As I understand it skb_linearize only fails
>>> if it runs out of memory, which probably means dropping is OK. But
>>> passing a packet that might need rewriting could be harmful..
>> 
>> We so far also didn't rewrite the packet. But agreed, it's
>> a corner case and dropping it is the safer choice.
>
>This is what I've added to my tree. Tested with asterisk and TSO
>enabled NIC, which fails without this patch.
>

[..patch..]

Shouldn't we do this for the other nf_conntrack_xyz too?
That would mean getting rid of the size-limited locked packet
buffer.

^ permalink raw reply

* Re: [PATCHv2] netfilter: Remove skb_is_nonlinear check from nf_conntrack_sip
From: Patrick McHardy @ 2010-05-14 19:41 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: Jason Gunthorpe, netfilter-devel, netdev
In-Reply-To: <alpine.LSU.2.01.1005142133001.16437@obet.zrqbmnf.qr>

Jan Engelhardt wrote:
> On Friday 2010-05-14 21:26, Patrick McHardy wrote:
>>>> Should this be NF_DROP? As I understand it skb_linearize only fails
>>>> if it runs out of memory, which probably means dropping is OK. But
>>>> passing a packet that might need rewriting could be harmful..
>>> We so far also didn't rewrite the packet. But agreed, it's
>>> a corner case and dropping it is the safer choice.
>> This is what I've added to my tree. Tested with asterisk and TSO
>> enabled NIC, which fails without this patch.
>>
> 
> [..patch..]
> 
> Shouldn't we do this for the other nf_conntrack_xyz too?
> That would mean getting rid of the size-limited locked packet
> buffer.

Those got introduced to avoid the linearization. SIP is kind of special
because it's the only helper mangling packets multiple times with
variable sizes and is currently unable to use the data copying scheme.
Amanda does this as well, but uses the string search API, which works
fine.

The best fix to get rid of the copying in other helpers would be to
convert them to the string search API. I started doing that a few
years ago, but never finished it. One related improvement we could
add would be to make only those parts of the skb writable that are
actually written to when mangling packets, at least for non-linear
non-paged skbs (using frag_list).
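
As a rough illustration of searching without linearizing, the sketch below looks for a pattern across several non-contiguous chunks, with no copy into one buffer. It is a userspace model under stated assumptions: `struct model_chunk`, `model_chunks_at()` and `model_chunks_find()` are hypothetical names for this example, and the naive byte-by-byte comparison stands in for the kernel's string search API, which this sketch does not reproduce.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of payload scattered over several chunks. */
struct model_chunk {
	const char *data;
	size_t len;
};

/* Sum of all chunk lengths. */
static size_t model_chunks_total(const struct model_chunk *c, size_t n)
{
	size_t total = 0, i;

	for (i = 0; i < n; i++)
		total += c[i].len;
	return total;
}

/* Byte at logical offset pos; the caller guarantees pos < total length. */
static char model_chunks_at(const struct model_chunk *c, size_t n, size_t pos)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (pos < c[i].len)
			return c[i].data[pos];
		pos -= c[i].len;
	}
	return 0;	/* not reached when the precondition holds */
}

/*
 * Naive search that may match across chunk boundaries.
 * Returns the logical offset of the first match, or -1 if there is none.
 */
static long model_chunks_find(const struct model_chunk *c, size_t n,
			      const char *pat, size_t plen)
{
	size_t total = model_chunks_total(c, n), start, j;

	assert(pat != NULL);
	if (plen == 0 || plen > total)
		return -1;

	for (start = 0; start + plen <= total; start++) {
		for (j = 0; j < plen; j++)
			if (model_chunks_at(c, n, start + j) != pat[j])
				break;
		if (j == plen)
			return (long)start;
	}
	return -1;
}
```

The point of the design is that the caller gets a logical offset into the whole payload while the data stays where it is; a real implementation would use a smarter matching algorithm and would keep per-chunk state instead of re-walking the chunk list for every byte.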

^ permalink raw reply

* Re: [PATCHv2] netfilter: Remove skb_is_nonlinear check from nf_conntrack_sip
From: Jason Gunthorpe @ 2010-05-14 19:56 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netfilter-devel, netdev
In-Reply-To: <4BED99A3.2050404@trash.net>

On Fri, May 14, 2010 at 08:42:43PM +0200, Patrick McHardy wrote:

> > Should this be NF_DROP? As I understand it skb_linearize only fails
> > if it runs out of memory, which probably means dropping is OK. But
> > passing a packet that might need rewriting could be harmful..
> 
> > We so far also didn't rewrite the packet. But agreed, it's
> a corner case and dropping it is the safer choice.

I was just thinking that, say, a request goes out, gets rewritten but
the reply comes back and does not get rewritten = bad. Better to drop.

Looks OK to me..

Jason

^ permalink raw reply

* RE: why get different number of MSI-X vector for broadcom bnx2x every time
From: Dmitry Kravkov @ 2010-05-14 20:02 UTC (permalink / raw)
  To: Jon Zhou, netdev
In-Reply-To: <4A6A2125329CFD4D8CC40C9E8ABCAB9F2497DED2BE@MILEXCH2.ds.jdsu.net>

Hi

Your system (from the log below) allowed bnx2x to use only 4 MSI-X vectors instead of 16 required by the driver.

Regards,
Dmitry

-----Original Message-----
From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Jon Zhou
Sent: Friday, May 14, 2010 11:02 AM
To: netdev
Subject: why get different number of MSI-X vector for broadcom bnx2x every time

hi there:

bnx2x_enable_msix :

...
rc = pci_enable_msix(bp->pdev, &bp->msix_table[0],
			     BNX2X_NUM_QUEUES(bp) + offset);

	/* 
	 * reconfigure number of tx/rx queues according to available
	 * MSI-X vectors
	 */
	if (rc >= BNX2X_MIN_MSIX_VEC_CNT) {
		/* vectors available for FP */
		int fp_vec = rc - BNX2X_MSIX_VEC_FP_START;


Sometimes I can bring up the driver with 4 queues, but most of the time I only get 2 queues.
Why?

May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_set_num_queues:8053(eth5)]set number of queues to 15
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7544(eth5)]msix_table[0].entry = 0 (slowpath)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7549(eth5)]msix_table[1].entry = 1 (CNIC)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[2].entry = 2 (fastpath #0)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[3].entry = 3 (fastpath #1)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[4].entry = 4 (fastpath #2)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[5].entry = 5 (fastpath #3)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[6].entry = 6 (fastpath #4)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[7].entry = 7 (fastpath #5)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[8].entry = 8 (fastpath #6)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[9].entry = 9 (fastpath #7)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[10].entry = 10 (fastpath #8)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[11].entry = 11 (fastpath #9)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[12].entry = 12 (fastpath #10)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[13].entry = 13 (fastpath #11)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[14].entry = 14 (fastpath #12)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[15].entry = 15 (fastpath #13)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7556(eth5)]msix_table[16].entry = 16 (fastpath #14)
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7571(eth5)]Trying to use less MSI-X vectors: 4
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_enable_msix:7584(eth5)]New queue configuration set: 2
May 14 01:40:16 ibm-bc-54 kernel: bnx2x: eth5: using MSI-X  IRQs: sp 4321  fp[0] 4319 ... fp[1] 4318
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_nic_init:6067(eth5)]queue[0]:  bnx2x_init_sb(ffff8803e6810780,ffff8803f9c4f000)  cl_id 0  sb 1  cos 0
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_nic_init:6067(eth5)]queue[1]:  bnx2x_init_sb(ffff8803e6810780,ffff8803fb006000)  cl_id 1  sb 2  cos 0
May 14 01:40:16 ibm-bc-54 kernel: [bnx2x_init_rx_rings:5305(eth5)]mtu 1500  rx_buf_size 1650
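
The arithmetic behind the log above can be sketched as follows. This is a simplified model, not the bnx2x driver code: `model_num_queues()` and the `MODEL_*` constants are hypothetical names for this example. The value 2 for the first fastpath index is read off the log (entry 0 is slowpath, entry 1 is CNIC, fastpath starts at entry 2); the minimum vector count is an assumption made for illustration.

```c
#include <assert.h>

/* From the log: entries 0 (slowpath) and 1 (CNIC) precede the fastpath ones. */
#define MODEL_FP_START	2
/* Assumption for this sketch: at least one fastpath vector is needed. */
#define MODEL_MIN_VEC	(MODEL_FP_START + 1)

/*
 * Number of queues usable with MSI-X, given how many queues the driver
 * wanted and how many vectors the system can actually provide.
 * Returns 0 when MSI-X cannot be used at all in this model.
 */
static int model_num_queues(int wanted_queues, int available_vectors)
{
	int fp_vec;

	assert(wanted_queues > 0);
	if (available_vectors < MODEL_MIN_VEC)
		return 0;

	/* vectors left over for fastpath queues */
	fp_vec = available_vectors - MODEL_FP_START;
	return fp_vec < wanted_queues ? fp_vec : wanted_queues;
}
```

With 15 queues wanted and 4 vectors available this gives 4 - 2 = 2 queues, matching "New queue configuration set: 2". Under the same model, a run that ends up with 4 queues would correspond to 6 vectors being available at that moment.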
 

^ permalink raw reply
