Netdev List


* Re: [PATCH] net: fix non-ANSI function declaration warning
From: David Miller @ 2012-07-08  0:25 UTC (permalink / raw)
  To: bhutchings
  Cc: emilgoode, edumazet, mirq-linux, jpirko, therbert, netdev,
	kernel-janitors
In-Reply-To: <1341706563.25597.170.camel@deadeye.wl.decadent.org.uk>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Sun, 8 Jul 2012 01:16:03 +0100

> On Sat, 2012-07-07 at 16:12 -0700, David Miller wrote:
>> From: Ben Hutchings <bhutchings@solarflare.com>
>> Date: Sat, 7 Jul 2012 20:57:29 +0100
>> 
>> > On Sat, 2012-07-07 at 20:47 +0200, Emil Goode wrote:
>> >> Sparse is warning about non-ANSI function declaration.
>> >> Add void to the parameterless function.
>> >> 
>> >> net/core/dev.c:1804:38: warning:
>> >> 	non-ANSI function declaration of function
>> >> 	'netif_get_num_default_rss_queues'
>> > 
>> > I also posted a patch for this (and another instance I found).
>> 
>> But you were asked to fix up the comment formatting in one of those
>> patches so you need to fix that up and resubmit the entire set.
> 
> You have got to be kidding.  I fixed one thing, so I have to fix
> another?

You're fixing up a comment, fix it fully.

^ permalink raw reply

* Re: [PATCH] net: fix non-ANSI function declaration warning
From: Ben Hutchings @ 2012-07-08  0:16 UTC (permalink / raw)
  To: David Miller
  Cc: emilgoode, edumazet, mirq-linux, jpirko, therbert, netdev,
	kernel-janitors
In-Reply-To: <20120707.161235.1181930264623743070.davem@davemloft.net>

On Sat, 2012-07-07 at 16:12 -0700, David Miller wrote:
> From: Ben Hutchings <bhutchings@solarflare.com>
> Date: Sat, 7 Jul 2012 20:57:29 +0100
> 
> > On Sat, 2012-07-07 at 20:47 +0200, Emil Goode wrote:
> >> Sparse is warning about non-ANSI function declaration.
> >> Add void to the parameterless function.
> >> 
> >> net/core/dev.c:1804:38: warning:
> >> 	non-ANSI function declaration of function
> >> 	'netif_get_num_default_rss_queues'
> > 
> > I also posted a patch for this (and another instance I found).
> 
> But you were asked to fix up the comment formatting in one of those
> patches so you need to fix that up and resubmit the entire set.

You have got to be kidding.  I fixed one thing, so I have to fix
another?

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: pull-request: can-next 2012-07-04
From: David Miller @ 2012-07-07 23:29 UTC (permalink / raw)
  To: mkl; +Cc: netdev, linux-can
In-Reply-To: <4FF45AA1.2040907@pengutronix.de>

From: Marc Kleine-Budde <mkl@pengutronix.de>
Date: Wed, 04 Jul 2012 17:00:49 +0200

> our third pull request for upcoming v3.6 net-next. First, Oliver and I
> fix some sparse warnings, then 3 patches by Hui Wang and Shawn Guo
> improve flexcan support, and finally the patch by Rostislav Lisovy
> adds a CAN frame classifier.

Pulled, thanks.

^ permalink raw reply

* Re: [PATCH net-next] asix: avoid copies in tx path
From: David Miller @ 2012-07-07 23:27 UTC (permalink / raw)
  To: ming.lei; +Cc: eric.dumazet, netdev, gregkh, allan, trond, grundler
In-Reply-To: <CACVXFVOYgVG5cZE8YPn2z6_xu5akLvUURJ3_pE4Stbh7U8L_AA@mail.gmail.com>

From: Ming Lei <ming.lei@canonical.com>
Date: Fri, 6 Jul 2012 09:16:32 +0800

> On Thu, Jul 5, 2012 at 10:31 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> From: Eric Dumazet <edumazet@google.com>
>>
>> I noticed excess calls to skb_copy_expand() or memmove() in asix driver.
>>
>> This driver needs to push 4 bytes in front of frame (packet_len)
>> and maybe add 4 bytes after the end (if padlen is 4)
>>
>> So it should set needed_headroom & needed_tailroom to avoid
>> copies. But it's not enough, because many packets are cloned
>> before entering asix_tx_fixup() and this driver uses skb_cloned()
>> as a lazy way to check if it can push and put additional bytes in frame.
>>
>> Avoid skb_copy_expand() expensive call, using following rules :
>>
>> - We are allowed to push 4 bytes in headroom if skb_header_cloned()
>>   is false (and if we have 4 bytes of headroom)
>>
>> - We are allowed to put 4 bytes at tail if skb_cloned()
>>   is false (and if we have 4 bytes of tailroom)
>>
>> TCP packets for example are cloned, but skb_header_release()
>> was called in tcp stack, allowing us to use headroom for our needs.
>>
>> Signed-off-by: Eric Dumazet <edumazet@google.com>
 ...
> Tested-by: Ming Lei <ming.lei@canonical.com>

Applied, thanks Eric.

^ permalink raw reply
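
The two rules in the quoted commit message can be sketched as a small userspace model. This is only an illustration of the decision, not the code from the asix patch: struct skb_model, ASIX_HDR_LEN and can_fixup_in_place() are hypothetical names, and the boolean fields stand in for what skb_header_cloned() and skb_cloned() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the skb state the rules refer to. */
struct skb_model {
	bool header_cloned;	/* stands in for skb_header_cloned() */
	bool cloned;		/* stands in for skb_cloned() */
	size_t headroom;	/* free bytes in front of the frame */
	size_t tailroom;	/* free bytes after the frame */
};

#define ASIX_HDR_LEN 4	/* the 4-byte packet_len word pushed in front */

/*
 * Returns true when the frame can be fixed up in place and false when
 * an expensive copy would still be needed.
 */
static bool can_fixup_in_place(const struct skb_model *skb, size_t padlen)
{
	bool can_push, can_put;

	assert(padlen == 0 || padlen == 4);

	/* Rule 1: pushing needs a private header and 4 bytes of headroom. */
	can_push = !skb->header_cloned && skb->headroom >= ASIX_HDR_LEN;

	/* Rule 2: putting needs an unshared skb and enough tailroom. */
	can_put = !skb->cloned && skb->tailroom >= padlen;

	return can_push && (padlen == 0 || can_put);
}
```

A TCP-like frame (cloned, but with its header released) passes when no padding is needed, which is the case the commit message calls out; the same frame with padlen 4 still falls back to a copy in this model.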

* Re: [PATCH net-next v2 0/4] 6lowpan: Various bug fixes
From: David Miller @ 2012-07-07 23:25 UTC (permalink / raw)
  To: tony.cheneau; +Cc: netdev, alex.bluesman.smirnov
In-Reply-To: <1341550225-13112-1-git-send-email-tony.cheneau@amnesiak.org>


You've provided no signoffs in your commit messages, please fix this up
and resubmit this entire series.

^ permalink raw reply

* Re: [PATCH net-next V1 00/10] net/mlx4: Add flow-steering support
From: David Miller @ 2012-07-07 23:24 UTC (permalink / raw)
  To: ogerlitz; +Cc: roland, yevgenyp, oren, netdev, amirv, hadarh
In-Reply-To: <1341497030-1818-1-git-send-email-ogerlitz@mellanox.com>

From: Or Gerlitz <ogerlitz@mellanox.com>
Date: Thu,  5 Jul 2012 17:03:40 +0300

> This patch series from Hadar adds code to manage L2/L3/L4 network 
> flow steering rules, a feature which is supported by the ConnectX-3 device.
> 
> The series is built as follows:
> 
> The first two patches deal with SRIOV resource tracker, whose mechanism 
> is changed to use red-black tree instead of radix tree. The reason for 
> this change is that the coming steering patches use flow IDs which are 64 
> bits in size, where radix tree keys can't be 64bit on 32bit architecture, 
> while RB tree can do that.
> 
> Patch #3 is a small redesign of the Ethernet driver multicast attachment 
> flow to make it more efficient and robust.
> 
> The fourth patch does a re-org of the checks that deal with the current 
> "older" steering modes such that we can easily add soon the new steering 
> mode and the code remains easy to manage.
> 
> Patch #5 adds the firmware commands for the new steering mode, which is 
> called "device managed flow steering".
> 
> Patch 6 is the main patch of this series. It adds support for device-managed flow 
> steering all across the place. We had to have this patch also to touch the mlx4 
> IB driver, since the steering mode is global to the HCA -- so when being enabled, 
> multicast attachment calls done by the IB driver into the mlx4 core driver, 
> are now routed to the flow steering firmware commands whose API is a bit different, 
> something that the IB driver had to be aware of. Following that, the 7th patch 
> adds resource tracking for device-managed flow steering rules.
> 
> The 8th patch adds promiscuous mode support under device-managed flow steering,
> next, the 9th patch adds implementation for the ethtool APIs for attaching 
> L2/L3/L4 based flow steering rules, and the last patch adds support for drop 
> action through ethtool.

All applied, thanks.

^ permalink raw reply
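
The reason given for moving from a radix tree to a red-black tree can be illustrated in userspace. The sketch below assumes a 32-bit index type in place of the unsigned long key a 32-bit kernel would use; index32_t, as_index32(), ids_collide() and cmp_flow_id() are hypothetical names and are not taken from the mlx4 patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a radix tree index on a 32-bit architecture. */
typedef uint32_t index32_t;

/* Squeezing a 64-bit flow ID into a 32-bit index drops the upper half. */
static index32_t as_index32(uint64_t flow_id)
{
	return (index32_t)flow_id;
}

/* Two distinct flow IDs collide if they differ only in the upper 32 bits. */
static bool ids_collide(uint64_t a, uint64_t b)
{
	return a != b && as_index32(a) == as_index32(b);
}

/*
 * A comparison-based tree only needs an ordering on the full key,
 * so all 64 bits take part. Returns -1, 0 or 1.
 */
static int cmp_flow_id(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}
```

With IDs such as 0x100000001 and 0x200000001 the truncated index is the same, while the full-width comparison still tells them apart.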

* Re: [PATCH 00/18] netfilter updates for net-next (upcoming 3.6), batch 5
From: David Miller @ 2012-07-07 23:23 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <1341573428-3204-1-git-send-email-pablo@netfilter.org>

From: pablo@netfilter.org
Date: Fri,  6 Jul 2012 13:16:50 +0200

> The following patchset includes Netfilter updates for your net-next tree,
> more specifically:
> 
> * Updates to clean-up the sysctl namespace support for nf_conntrack
>   from Gao Feng and a couple of patches from myself. After these, we
>   can prepare follow-up patches to reduce ifdef pollution regarding
>   sysctl support in nf_conntrack_proto_*.c files.
> 
> * Check for invalid flags set via NFQA_CFG_FLAGS in nfnetlink_queue
>   from Krishna Kumar.
> 
> * Allow obtaining conntrack statistics via ctnetlink, from myself. This
>   supersedes /proc/net/stat/nf_conntrack and
>   /proc/sys/net/netfilter/nf_conntrack_count.
> 
> * Don't crash if we send a message to nfnetlink and no callback is
>   defined to handle such a message. Instead, nfnetlink returns -EINVAL, from
>   Tomasz Bursztyka. This one does not really fix anything now, that's
>   why I'm passing this via net-next.
> 
> You can pull these changes from:
> 
> git://1984.lsi.us.es/nf-next master
> 

Pulled, thanks Pablo.

^ permalink raw reply

* Re: [PATCH] r6040: remove duplicate call to the pci_set_drvdata
From: David Miller @ 2012-07-07 23:16 UTC (permalink / raw)
  To: devendra.aaru; +Cc: florian, netdev
In-Reply-To: <1341641255-26429-1-git-send-email-devendra.aaru@gmail.com>

From: Devendra Naga <devendra.aaru@gmail.com>
Date: Sat,  7 Jul 2012 11:37:35 +0530

> pci_set_drvdata is called twice at the remove path of driver,
> call it once.
> 
> Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH next-next] ppp: change default for incoming protocol filter to NPMODE_DROP
From: David Miller @ 2012-07-07 23:15 UTC (permalink / raw)
  To: bcrl; +Cc: netdev, linux-ppp
In-Reply-To: <20120706172800.GC19462@kvack.org>

From: Benjamin LaHaise <bcrl@kvack.org>
Date: Fri, 6 Jul 2012 13:28:00 -0400

> How about the following addition instead to provide a list of
> protocols to disable?

The userspace programs must accommodate all existing kernels, so
the addition of this feature is rather pointless.

^ permalink raw reply

* Re: [PATCH] net: fix non-ANSI function declaration warning
From: David Miller @ 2012-07-07 23:12 UTC (permalink / raw)
  To: bhutchings
  Cc: emilgoode, edumazet, mirq-linux, jpirko, therbert, netdev,
	kernel-janitors
In-Reply-To: <1341691049.25597.169.camel@deadeye.wl.decadent.org.uk>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Sat, 7 Jul 2012 20:57:29 +0100

> On Sat, 2012-07-07 at 20:47 +0200, Emil Goode wrote:
>> Sparse is warning about non-ANSI function declaration.
>> Add void to the parameterless function.
>> 
>> net/core/dev.c:1804:38: warning:
>> 	non-ANSI function declaration of function
>> 	'netif_get_num_default_rss_queues'
> 
> I also posted a patch for this (and another instance I found).

But you were asked to fix up the comment formatting in one of those
patches so you need to fix that up and resubmit the entire set.

^ permalink raw reply

* Re: [PATCH] net: fix non-ANSI function declaration warning
From: Ben Hutchings @ 2012-07-07 19:57 UTC (permalink / raw)
  To: Emil Goode
  Cc: edumazet, mirq-linux, jpirko, therbert, netdev, kernel-janitors
In-Reply-To: <1341686871-21822-1-git-send-email-emilgoode@gmail.com>

On Sat, 2012-07-07 at 20:47 +0200, Emil Goode wrote:
> Sparse is warning about non-ANSI function declaration.
> Add void to the parameterless function.
> 
> net/core/dev.c:1804:38: warning:
> 	non-ANSI function declaration of function
> 	'netif_get_num_default_rss_queues'

I also posted a patch for this (and another instance I found).

Ben.

> Signed-off-by: Emil Goode <emilgoode@gmail.com>
> ---
>  net/core/dev.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 07c1251..fc6fbce 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1801,7 +1801,7 @@ EXPORT_SYMBOL(netif_set_real_num_rx_queues);
>   * This routine should set an upper limit on the number of RSS queues
>   * used by default by multiqueue devices.
>   */
> -int netif_get_num_default_rss_queues()
> +int netif_get_num_default_rss_queues(void)
>  {
>  	return min_t(int, DEFAULT_MAX_NUM_RSS_QUEUES, num_online_cpus());
>  }

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH] smsc95xx: support ethtool get_regs
From: Ben Hutchings @ 2012-07-07 19:55 UTC (permalink / raw)
  To: Émeric Vigier
  Cc: Steve Glendinning, steve glendinning, netdev, Nancy Lin
In-Reply-To: <203906907.2074.1341669484945.JavaMail.root@mail.savoirfairelinux.com>

On Sat, 2012-07-07 at 09:58 -0400, Émeric Vigier wrote:
[...]
> > > +	for (i = 0; i <= PHY_SPECIAL; i++)
> > > +		data[j++] = smsc95xx_mdio_read(netdev, dev->mii.phy_id, i);
> > > +}
> > 
> > Again, why use PHY_SPECIAL (+ 1) here as opposed to 32 in the
> > calculation of the length?
> 
> 32 was OK, but I hesitated between defining a SMSC95XX_PHY_END and using the last defined register.
> Is a 32-register PHY generic to most devices? I mean, could 32 be used widely?

Yes, the address space for the original MDIO protocol ('clause 22')
allows for 32 registers.  Perhaps that number should be named in
<linux/mii.h>.

As another reviewer commented, though, MDIO PHY registers should be
accessible with SIOCGMIIREG and mii-tool so it's not really necessary to
duplicate them here.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply
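
One way to keep the length calculation and the dump loops consistent, as the review asks, is to derive both from the same named constants. The sketch below is a userspace illustration only: the ID_REV and COE_CR values are assumptions that should be checked against smsc95xx.h, C22_NUM_REGS is a hypothetical name for the 32 clause 22 registers, and the register reads are replaced by placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; verify against smsc95xx.h before relying on them. */
#define ID_REV		0x00u
#define COE_CR		0x130u

/* Hypothetical name for the clause 22 register count. */
#define C22_NUM_REGS	32u

/* Number of 32-bit device registers in [ID_REV, COE_CR]. */
static size_t mac_reg_count(void)
{
	return (COE_CR - ID_REV) / sizeof(uint32_t) + 1;
}

/* Buffer length in bytes: one u32 slot per device and PHY register. */
static size_t regs_len(void)
{
	return (mac_reg_count() + C22_NUM_REGS) * sizeof(uint32_t);
}

/* Fills data[] using the same bounds as regs_len(); returns slots used. */
static size_t fill_regs(uint32_t *data)
{
	size_t j = 0;
	uint32_t reg;
	unsigned int i;

	for (reg = ID_REV; reg <= COE_CR; reg += sizeof(uint32_t))
		data[j++] = reg;	/* placeholder for a register read */

	for (i = 0; i < C22_NUM_REGS; i++)
		data[j++] = i & 0xffffu; /* placeholder for a 16-bit MDIO read */

	return j;
}
```

Because both functions use ID_REV, COE_CR and C22_NUM_REGS, the number of slots written always matches the length reported, whatever the real constant values turn out to be.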

* [PATCH] net: fix non-ANSI function declaration warning
From: Emil Goode @ 2012-07-07 18:47 UTC (permalink / raw)
  To: edumazet, mirq-linux, jpirko, therbert
  Cc: netdev, kernel-janitors, Emil Goode

Sparse is warning about non-ANSI function declaration.
Add void to the parameterless function.

net/core/dev.c:1804:38: warning:
	non-ANSI function declaration of function
	'netif_get_num_default_rss_queues'

Signed-off-by: Emil Goode <emilgoode@gmail.com>
---
 net/core/dev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 07c1251..fc6fbce 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1801,7 +1801,7 @@ EXPORT_SYMBOL(netif_set_real_num_rx_queues);
  * This routine should set an upper limit on the number of RSS queues
  * used by default by multiqueue devices.
  */
-int netif_get_num_default_rss_queues()
+int netif_get_num_default_rss_queues(void)
 {
 	return min_t(int, DEFAULT_MAX_NUM_RSS_QUEUES, num_online_cpus());
 }
-- 
1.7.10.4

^ permalink raw reply related
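
For readers unfamiliar with the warning: before C23, a declaration with empty parentheses leaves the parameter list unspecified, so the compiler cannot check call sites, whereas "(void)" is a real prototype for a function that takes no arguments. GCC's -Wstrict-prototypes reports the same pattern. The following is a userspace sketch, not kernel code: DEFAULT_MAX_QUEUES, online_cpus and get_num_default_queues() are hypothetical stand-ins for DEFAULT_MAX_NUM_RSS_QUEUES, num_online_cpus() and the patched function, and the value 8 is an assumption.

```c
#include <assert.h>

#define DEFAULT_MAX_QUEUES 8	/* assumed stand-in value */

int online_cpus = 4;		/* hypothetical stand-in for num_online_cpus() */

/*
 * Old style, which sparse flags:
 *
 *	int get_num_default_queues();
 *
 * says nothing about the parameters, so a call such as
 * get_num_default_queues(1, 2) would not be rejected before C23.
 */

/* Prototype style: "(void)" states that the function takes no arguments. */
int get_num_default_queues(void);

int get_num_default_queues(void)
{
	assert(online_cpus > 0);
	return online_cpus < DEFAULT_MAX_QUEUES ? online_cpus
						: DEFAULT_MAX_QUEUES;
}
```

With the prototype in place, passing an argument to get_num_default_queues() becomes a compile-time error rather than silently accepted code.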

* Re: [iproute2] display vlan configuration
From: Fabien C. @ 2012-07-07 16:03 UTC (permalink / raw)
  To: John Fastabend; +Cc: netdev
In-Reply-To: <4FF62146.9070702@intel.com>

On 06/07/2012 01:20, John Fastabend wrote:
> Here you need to show the details,
> #ip -d link show dev eth2.333

Thanks for your answer! 

The man page doesn't document this option (-d), but "ip help" does, so I guess I didn't search enough. 

Fabien 

^ permalink raw reply

* Re: pre-fetching skb for delayed send
From: Benjamin LaHaise @ 2012-07-07 14:18 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev
In-Reply-To: <4FF7B7E1.4000602@candelatech.com>

On Fri, Jul 06, 2012 at 09:15:29PM -0700, Ben Greear wrote:
> Well, to start with..I at least know the next skb to transmit,
> so I figured I'd prefetch it before starting tx of the current
> skb.

Prefetching data you're just about to immediately access doesn't actually 
help improve performance -- it's better to just access the data.  Prefetching 
subsequent skbs should be of more benefit.

> My question is more basic though:  Given an skb, how do you prefetch
> it...do you just prefetch the skb pointer, or do you need to dig into
> the guts of the skb?

See prefetch.h for details.  Just pass the pointer to the cacheline you want 
to trigger prefetch on to prefetch() or prefetchw(), or use prefetch_range() 
(probably useful for skbs given that they're larger than one cacheline).  
For an skb, you may have to prefetch the frag list as well.

		-ben
-- 
"Thought is the essence of where you are now."

^ permalink raw reply
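
The idea of prefetching an object that spans more than one cacheline can be sketched in userspace with the GCC/Clang builtin. This is an analogue of the range-style prefetch described above, not the kernel's prefetch.h implementation: prefetch_span() is a hypothetical helper and CACHELINE_BYTES is an assumed value, since the real line size depends on the architecture.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE_BYTES 64	/* assumption; architecture-specific in practice */

/*
 * Issue one read prefetch per cacheline-sized step over [addr, addr + len).
 * Returns the number of prefetch hints issued.
 */
static size_t prefetch_span(const void *addr, size_t len)
{
	const char *p = addr;
	size_t off, issued = 0;

	assert(addr != NULL || len == 0);

	for (off = 0; off < len; off += CACHELINE_BYTES) {
		/* 0 = prefetch for read, 3 = high temporal locality */
		__builtin_prefetch(p + off, 0, 3);
		issued++;
	}
	return issued;
}
```

A prefetch is only a hint, so the helper has no visible effect beyond the count it returns; as noted in the reply, it is more likely to pay off when issued for a later skb than for data that is about to be touched anyway.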

* Re: [PATCH] smsc95xx: support ethtool get_regs
From: Émeric Vigier @ 2012-07-07 14:13 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Steve Glendinning, netdev, Nancy Lin
In-Reply-To: <20120706221102.GA14276@electric-eye.fr.zoreil.com>



----- Original Message -----
> Émeric Vigier <emeric.vigier@savoirfairelinux.com> :
> [...]
> > Yes, they are 16 bits wide according to smsc95xx.h.
> > But other smsc drivers define 32-bit-wide PHY regs. I had convinced myself
> > that smsc would use the same PHY for each Ethernet chip.
> 
> SMSC people would surely answer before I find the relevant datasheet.
> 
> Anyway the PHY registers are accessed indirectly through the
> MII_{ADDR, DATA} registers and MII_DATA r/w mask is limited to the
> lower 16 bits.
> 
> > So would something like s/32 * sizeof(u32)/PHY_SPECIAL *
> > sizeof(u16)/ solve the issue here?
> 
> You would have to pack data[] as well. Or use u16 *.

I will check this out next week.

> 
> > Concerning the ioctl, I found ethtool much easier to use. And I believe
> > smsc9514 is a very popular chipset, so this could help others debugging it.
> 
> # mii-tool -vv e1000
> Using SIOCGMIIPHY=0x8947
> e1000: no autonegotiation, 10baseT-HD, link ok
>   registers for MII PHY 0:
>     1140 796d 0141 0c30 0de1 0021 0004 0000
>     0000 0200 0000 0000 0000 0000 0000 3000
>     0000 0000 0000 0000 0174 0000 0000 0000
>     4100 0000 000d 000f 0000 0000 0000 0000
>   product info: vendor 00:50:43, model 3 rev 0
>   basic mode:   autonegotiation enabled
>   basic status: autonegotiation complete, link ok
>   capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
>   advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
>   link partner: 10baseT-HD
> 
> It is not that bad for the first 32 PHY registers.

I didn't know about mii-tool. Thanks.

> 
> [...]
> > Do you mean LTT? I am not familiar with it, I should have a look.
> 
> Documentation/trace/ftrace.txt

ok

> 
> [...]
> > I should change that in previous "for" loop as well I suppose?
> 
> You may.

Thanks for your patience.

> 
> --
> Ueimor
> 

-- 
Emeric

^ permalink raw reply

* Re: [PATCH] smsc95xx: support ethtool get_regs
From: Émeric Vigier @ 2012-07-07 13:58 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Steve Glendinning, steve glendinning, netdev, Nancy Lin
In-Reply-To: <1341620651.2923.49.camel@bwh-desktop.uk.solarflarecom.com>



----- Original Message -----
> On Fri, 2012-07-06 at 14:15 -0400, Émeric Vigier wrote:
> > From: Emeric Vigier <emeric.vigier@savoirfairelinux.com>
> > 
> > Inspired by implementation in smsc911x.c and smsc9420.c
> > Tested on ARM/pandaboard rev A3
> > 
> > Signed-off-by: Emeric Vigier <emeric.vigier@savoirfairelinux.com>
> > ---
> >  drivers/net/usb/smsc95xx.c |   37 +++++++++++++++++++++++++++++++++++++
> >  1 files changed, 37 insertions(+), 0 deletions(-)
> > 
> > diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
> > index b1112e7..bce14f6 100644
> > --- a/drivers/net/usb/smsc95xx.c
> > +++ b/drivers/net/usb/smsc95xx.c
> > @@ -578,6 +578,41 @@ static int smsc95xx_ethtool_set_eeprom(struct net_device *netdev,
> >  	return smsc95xx_write_eeprom(dev, ee->offset, ee->len, data);
> >  }
> >  
> > +
> > +static int smsc95xx_ethtool_getregslen(struct net_device *dev)
> > +{
> > +	/* all smsc95xx registers plus all phy registers */
> > +	return COE_CR - ID_REV + 1 + 32 * sizeof(u32);
> > +}
> > +
> > +static void
> > +smsc95xx_ethtool_getregs(struct net_device *netdev, struct ethtool_regs *regs,
> > +			 void *buf)
> > +{
> > +	struct usbnet *dev = netdev_priv(netdev);
> > +	unsigned int i, j = 0, retval;
> > +	u32 *data = buf;
> > +
> > +	netif_dbg(dev, hw, dev->net, "ethtool_getregs\n");
> > +
> > +	retval = smsc95xx_read_reg(dev, ID_REV, &regs->version);
> > +	if (retval < 0) {
> > +		netdev_warn(dev->net, "REGS: cannot read ID_REV\n");
> > +		return;
> > +	}
> > +
> > +	for (i = 0; i <= COE_CR; i += (sizeof(u32))) {
> > +		retval = smsc95xx_read_reg(dev, i, &data[j++]);
> > +		if (retval < 0) {
> > +			netdev_warn(dev->net, "REGS: cannot read reg[%x]\n", i);
> > +			return;
> > +		}
> > +	}
> 
> Why does this start with i = 0 whereas the calculation of the length
> uses ID_REV as the starting point?  Maybe ID_REV == 0, but you should
> be consistent in whether you use the name or literal number.

You are right. I will use ID_REV throughout.

> 
> > +	for (i = 0; i <= PHY_SPECIAL; i++)
> > +		data[j++] = smsc95xx_mdio_read(netdev, dev->mii.phy_id, i);
> > +}
> 
> Again, why use PHY_SPECIAL (+ 1) here as opposed to 32 in the
> calculation of the length?

32 was OK, but I hesitated between defining a SMSC95XX_PHY_END and using the last defined register.
Is a 32-register PHY generic to most devices? I mean, could 32 be used widely?

> 
> Ben.
> 
> >  static const struct ethtool_ops smsc95xx_ethtool_ops = {
> >  	.get_link	= usbnet_get_link,
> >  	.nway_reset	= usbnet_nway_reset,
> > @@ -589,6 +624,8 @@ static const struct ethtool_ops smsc95xx_ethtool_ops = {
> >  	.get_eeprom_len	= smsc95xx_ethtool_get_eeprom_len,
> >  	.get_eeprom	= smsc95xx_ethtool_get_eeprom,
> >  	.set_eeprom	= smsc95xx_ethtool_set_eeprom,
> > +	.get_regs_len	= smsc95xx_ethtool_getregslen,
> > +	.get_regs	= smsc95xx_ethtool_getregs,
> >  };
> >  
> >  static int smsc95xx_ioctl(struct net_device *netdev, struct ifreq *rq, int cmd)
> 
> --
> Ben Hutchings, Staff Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
> 
> 

-- 
Emeric

^ permalink raw reply

* Kernel Oops
From: RuanZhijie @ 2012-07-07 12:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, skinsbursky


Hi, all.

Mr. Stanislav Kinsbursky suggested that I send you a report about an oops I encountered in the past few days.

A few days ago, I tested some VMs with NAT enabled under KVM and libvirt, and the kernel crashed when I shut down these VMs, though the issue did not occur every time. I searched and found a web page (http://www.spinics.net/lists/netdev/msg193846.html) in which Simon reported a similar issue.

The operating system I use is gentoo-amd64 with no-multilib profile, kernel version is 3.4.0, libvirt-0.9.13 with USE flag "qemu virt-network" enabled and qemu-kvm-1.0.1-r1. Here are the steps to reproduce:

1. Define one operation as starting a VM with NAT enabled under KVM and libvirt and then shutting it down immediately.
2. Repeat the operation several times.

I also did 3 tests:

Test 1: 
The host machine ran a regular linux 3.4.0 kernel, and the VM had NAT enabled. The kernel crashed after 2, 7 and 13 operations.

Test 2:
The host machine ran a regular linux 3.4.0 kernel, and the VM had no network access. No crash occurred after 100 operations.

Test 3:
The host machine ran a linux 3.4.0 kernel, but drivers/net/tun.c was reverted to the version just before commit 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d (https://github.com/torvalds/linux/commit/1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d#drivers/net/tun.c) (or you can use a tun.c from a 3.2.0 kernel, according to Simon's report), and the VM had NAT enabled. No crash occurred after 100 operations.

Moreover, I observe that a virtual interface is created to handle network access when a VM with NAT enabled starts, and the virtual interface is removed when the VM is shut down. Crashes usually occur at the time the virtual interface is removed.

Finally, 3 types of kernel crash traces were observed; and thanks to rsyslog, they are all recorded:

Type 1:
2012-07-06T11:44:31.513203+08:00 timemars NetworkManager[1761]: <warn> /sys/devices/virtual/net/vnet0: couldn't determine device driver; ignoring...
2012-07-06T11:44:31.523305+08:00 timemars kernel: device vnet0 entered promiscuous mode
2012-07-06T11:44:31.532555+08:00 timemars kernel: virbr0: topology change detected, propagating
2012-07-06T11:44:31.532591+08:00 timemars kernel: virbr0: port 1(vnet0) entered forwarding state
2012-07-06T11:44:31.532599+08:00 timemars kernel: virbr0: port 1(vnet0) entered forwarding state
2012-07-06T11:44:33.019292+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-06T11:44:33.021282+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-06T11:44:33.021305+08:00 timemars kernel: device vnet0 left promiscuous mode
2012-07-06T11:44:33.021308+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-06T11:44:33.352293+08:00 timemars kernel: BUG: unable to handle kernel paging request at 00001fff813e1b10
2012-07-06T11:44:33.352452+08:00 timemars kernel: IP: [<ffffffff810bcaed>] __pfn_to_section+0x9/0x28
2012-07-06T11:44:33.352509+08:00 timemars kernel: PGD 0 
2012-07-06T11:44:33.352562+08:00 timemars kernel: Oops: 0000 [#1] SMP 
2012-07-06T11:44:33.352613+08:00 timemars kernel: CPU 1 
2012-07-06T11:44:33.352665+08:00 timemars kernel: Modules linked in:
2012-07-06T11:44:33.352716+08:00 timemars kernel: 
2012-07-06T11:44:33.352770+08:00 timemars kernel: Pid: 2076, comm: libvirtd Not tainted 3.4.0 #1 Dell Inc. Inspiron 1440                   /0K138P
2012-07-06T11:44:33.352826+08:00 timemars kernel: RIP: 0010:[<ffffffff810bcaed>]  [<ffffffff810bcaed>] __pfn_to_section+0x9/0x28
2012-07-06T11:44:33.352878+08:00 timemars kernel: RSP: 0018:ffff8800aacc5d40  EFLAGS: 00010246
2012-07-06T11:44:33.352931+08:00 timemars kernel: RAX: 0000000000000000 RBX: ffffe780281e6600 RCX: fffffe780281e660
2012-07-06T11:44:33.352983+08:00 timemars kernel: RDX: 0000000000003434 RSI: 0000000000000207 RDI: 000003fffff9e00a
2012-07-06T11:44:33.353035+08:00 timemars kernel: RBP: ffff8800a0799820 R08: dead000000100100 R09: dead000000200200
2012-07-06T11:44:33.353053+08:00 timemars kernel: R10: ffff88011fd10b40 R11: ffff88011fd10b40 R12: ffff8800a0799800
2012-07-06T11:44:33.353061+08:00 timemars kernel: R13: ffff8800948ef800 R14: 0000000000000000 R15: ffff8800948ef000
2012-07-06T11:44:33.353094+08:00 timemars kernel: FS:  00007ff98fdf1700(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000
2012-07-06T11:44:33.353103+08:00 timemars kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
2012-07-06T11:44:33.353110+08:00 timemars kernel: CR2: 00001fff813e1b10 CR3: 00000000aaceb000 CR4: 00000000000407e0
2012-07-06T11:44:33.353117+08:00 timemars kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2012-07-06T11:44:33.353143+08:00 timemars kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
2012-07-06T11:44:33.353153+08:00 timemars kernel: Process libvirtd (pid: 2076, threadinfo ffff8800aacc4000, task ffff8800aeaff200)
2012-07-06T11:44:33.353160+08:00 timemars kernel: Stack:
2012-07-06T11:44:33.353169+08:00 timemars kernel: ffffffff810bcb2b ffff8800a0799820 ffffffff810bc004 ffff880118cfc920
2012-07-06T11:44:33.353176+08:00 timemars kernel: ffff8800a2368f00 0000000200005058 0000000000000002 ffff880104aa8618
2012-07-06T11:44:33.353183+08:00 timemars kernel: ffffffff81608dc0 0000000000000000 0000000000000000 0000000200000005
2012-07-06T11:44:33.353190+08:00 timemars kernel: Call Trace:
2012-07-06T11:44:33.353198+08:00 timemars kernel: [<ffffffff810bcb2b>] ? lookup_page_cgroup+0x1f/0x28
2012-07-06T11:44:33.353206+08:00 timemars kernel: [<ffffffff810bc004>] ? mem_cgroup_force_empty+0x1c1/0x496
2012-07-06T11:44:33.353213+08:00 timemars kernel: [<ffffffff810d318d>] ? mntput_no_expire+0x1f/0xf4
2012-07-06T11:44:33.353222+08:00 timemars kernel: [<ffffffff8105f2ef>] ? should_resched+0x5/0x23
2012-07-06T11:44:33.353230+08:00 timemars kernel: [<ffffffff81079d92>] ? cgroup_rmdir+0x9d/0x39c
2012-07-06T11:44:33.353237+08:00 timemars kernel: [<ffffffff8105a4e8>] ? add_wait_queue+0x3c/0x3c
2012-07-06T11:44:33.353244+08:00 timemars kernel: [<ffffffff8105f2ef>] ? should_resched+0x5/0x23
2012-07-06T11:44:33.353250+08:00 timemars kernel: [<ffffffff810c859e>] ? vfs_rmdir+0x67/0xab
2012-07-06T11:44:33.353275+08:00 timemars kernel: [<ffffffff810c8f4b>] ? do_rmdir+0xad/0x101
2012-07-06T11:44:33.353285+08:00 timemars kernel: [<ffffffff810d318d>] ? mntput_no_expire+0x1f/0xf4
2012-07-06T11:44:33.353293+08:00 timemars kernel: [<ffffffff810bd095>] ? filp_close+0x57/0x5f
2012-07-06T11:44:33.353321+08:00 timemars kernel: [<ffffffff813eaf62>] ? system_call_fastpath+0x16/0x1b
2012-07-06T11:44:33.353333+08:00 timemars kernel: Code: 8b bd 28 01 00 00 e8 fc c8 ff ff eb 03 45 31 ff 48 83 c4 68 4c 89 f8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 89 f9 48 c1 ef 16 31 c0 <48> 8b 14 fd c0 1a 6f 81 48 c1 e9 0f 48 85 d2 74 0d 48 89 c8 83 
2012-07-06T11:44:33.353341+08:00 timemars kernel: RIP  [<ffffffff810bcaed>] __pfn_to_section+0x9/0x28
2012-07-06T11:44:33.353366+08:00 timemars kernel: RSP <ffff8800aacc5d40>
2012-07-06T11:44:33.353374+08:00 timemars kernel: CR2: 00001fff813e1b10
2012-07-06T11:44:33.353398+08:00 timemars kernel: ---[ end trace 239af6a79d1fdbe3 ]---

Type 2:
2012-07-06T12:46:13.772228+08:00 timemars NetworkManager[1684]: <warn> /sys/devices/virtual/net/vnet0: couldn't determine device driver; ignoring...
2012-07-06T12:46:13.782523+08:00 timemars kernel: device vnet0 entered promiscuous mode
2012-07-06T12:46:13.792507+08:00 timemars kernel: virbr0: topology change detected, propagating
2012-07-06T12:46:13.792539+08:00 timemars kernel: virbr0: port 1(vnet0) entered forwarding state
2012-07-06T12:46:13.792543+08:00 timemars kernel: virbr0: port 1(vnet0) entered forwarding state
2012-07-06T12:46:15.097601+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-06T12:46:15.097628+08:00 timemars kernel: device vnet0 left promiscuous mode
2012-07-06T12:46:15.097632+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-06T12:46:15.112429+08:00 timemars kernel: BUG: unable to handle kernel paging request at ffffff816d9f715f
2012-07-06T12:46:15.112456+08:00 timemars kernel: IP: [<ffffffff810a9bc6>] filp_close+0x30/0x5f
2012-07-06T12:46:15.112459+08:00 timemars kernel: PGD 15a1067 PUD 0 
2012-07-06T12:46:15.112477+08:00 timemars kernel: Oops: 0000 [#1] SMP 
2012-07-06T12:46:15.112480+08:00 timemars kernel: CPU 0 
2012-07-06T12:46:15.112483+08:00 timemars kernel: Modules linked in:
2012-07-06T12:46:15.112486+08:00 timemars kernel: 
2012-07-06T12:46:15.112489+08:00 timemars kernel: Pid: 2868, comm: qemu-system-x86 Not tainted 3.4.0 #1 Dell Inc. Inspiron 1440                   /0K138P
2012-07-06T12:46:15.112494+08:00 timemars kernel: RIP: 0010:[<ffffffff810a9bc6>]  [<ffffffff810a9bc6>] filp_close+0x30/0x5f
2012-07-06T12:46:15.112497+08:00 timemars kernel: RSP: 0018:ffff8800a676bcc8  EFLAGS: 00010286
2012-07-06T12:46:15.112500+08:00 timemars kernel: RAX: ffffff816d9f70ff RBX: ffff8800a53bafff RCX: 000000000000000f
2012-07-06T12:46:15.112503+08:00 timemars kernel: RDX: 0000000000000000 RSI: ffff88011b26d080 RDI: ffff8800a53bafff
2012-07-06T12:46:15.112506+08:00 timemars kernel: RBP: ffff88011b26d080 R08: ffff8800a40de000 R09: ffff88009bd0f800
2012-07-06T12:46:15.112510+08:00 timemars kernel: R10: ffffffff81130d8d R11: ffffffff812f0aa6 R12: 0000000000000000
2012-07-06T12:46:15.112513+08:00 timemars kernel: R13: 0000000000000001 R14: ffff88009bcc3c80 R15: 0000000000000004
2012-07-06T12:46:15.112516+08:00 timemars kernel: FS:  00007fa1d2654700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
2012-07-06T12:46:15.112519+08:00 timemars kernel: CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
2012-07-06T12:46:15.112522+08:00 timemars kernel: CR2: ffffff816d9f715f CR3: 000000000159f000 CR4: 00000000000427e0
2012-07-06T12:46:15.112525+08:00 timemars kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2012-07-06T12:46:15.112528+08:00 timemars kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
2012-07-06T12:46:15.112532+08:00 timemars kernel: Process qemu-system-x86 (pid: 2868, threadinfo ffff8800a676a000, task ffff88009bc9cec0)
2012-07-06T12:46:15.112542+08:00 timemars kernel: Stack:
2012-07-06T12:46:15.112546+08:00 timemars kernel: ffff88011b26d080 0000000000000000 00000000000fdfbf ffffffff81048e0d
2012-07-06T12:46:15.112548+08:00 timemars kernel: ffffffff81130d8d ffff88009bc9cec0 0000000000000000 00007ffffffff000
2012-07-06T12:46:15.112551+08:00 timemars kernel: ffff88009bc9cec0 ffff88009bc9cec0 0000000000000001 ffffffff810490e7
2012-07-06T12:46:15.112554+08:00 timemars kernel: Call Trace:
2012-07-06T12:46:15.112557+08:00 timemars kernel: [<ffffffff81048e0d>] ? put_files_struct+0x60/0xb9
2012-07-06T12:46:15.112575+08:00 timemars kernel: [<ffffffff81130d8d>] ? exit_sem+0x1e8/0x1f7
2012-07-06T12:46:15.112579+08:00 timemars kernel: [<ffffffff810490e7>] ? do_exit+0x204/0x6df
2012-07-06T12:46:15.112582+08:00 timemars kernel: [<ffffffff8104983e>] ? do_group_exit+0x70/0x9a
2012-07-06T12:46:15.112585+08:00 timemars kernel: [<ffffffff810516ff>] ? get_signal_to_deliver+0x40d/0x42f
2012-07-06T12:46:15.112588+08:00 timemars kernel: [<ffffffff81027796>] ? do_signal+0x38/0x431
2012-07-06T12:46:15.112591+08:00 timemars kernel: [<ffffffff81051a9f>] ? copy_siginfo_to_user+0x5c/0x1bb
2012-07-06T12:46:15.112594+08:00 timemars kernel: [<ffffffff810715a5>] ? sys_futex+0x138/0x147
2012-07-06T12:46:15.112597+08:00 timemars kernel: [<ffffffff81027bc5>] ? do_notify_resume+0x25/0x50
2012-07-06T12:46:15.112600+08:00 timemars kernel: [<ffffffff8105f152>] ? should_resched+0x5/0x23
2012-07-06T12:46:15.112603+08:00 timemars kernel: [<ffffffff813d511b>] ? _cond_resched+0x6/0x1a
2012-07-06T12:46:15.112606+08:00 timemars kernel: [<ffffffff813d6628>] ? int_signal+0x12/0x17
2012-07-06T12:46:15.112610+08:00 timemars kernel: Code: f5 53 48 89 fb 48 8b 47 30 48 85 c0 75 11 48 c7 c7 ec 6d 50 81 45 31 e4 e8 1f 67 32 00 eb 33 48 8b 47 20 45 31 e4 48 85 c0 74 0e <48> 8b 40 60 48 85 c0 74 05 ff d0 41 89 c4 f6 43 3d 40 75 0b 48 
2012-07-06T12:46:15.112613+08:00 timemars kernel: RIP  [<ffffffff810a9bc6>] filp_close+0x30/0x5f
2012-07-06T12:46:15.112616+08:00 timemars kernel: RSP <ffff8800a676bcc8>
2012-07-06T12:46:15.112624+08:00 timemars kernel: CR2: ffffff816d9f715f
2012-07-06T12:46:15.179496+08:00 timemars kernel: ---[ end trace deec135ba51c758d ]---
2012-07-06T12:46:15.179516+08:00 timemars kernel: Fixing recursive fault but reboot is needed!

Type 3:
2012-07-07T19:51:52.532199+08:00 timemars NetworkManager[1778]: <warn> /sys/devices/virtual/net/vnet0: couldn't determine device driver; ignoring...
2012-07-07T19:51:52.539805+08:00 timemars kernel: device vnet0 entered promiscuous mode
2012-07-07T19:51:52.550668+08:00 timemars kernel: virbr0: topology change detected, propagating
2012-07-07T19:51:52.550704+08:00 timemars kernel: virbr0: port 1(vnet0) entered forwarding state
2012-07-07T19:51:52.550713+08:00 timemars kernel: virbr0: port 1(vnet0) entered forwarding state
2012-07-07T19:51:54.245653+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-07T19:51:54.245680+08:00 timemars kernel: device vnet0 left promiscuous mode
2012-07-07T19:51:54.245684+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-07T19:51:54.252041+08:00 timemars kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
2012-07-07T19:51:54.252071+08:00 timemars kernel: IP: [<ffffffff810d04f2>] iput+0x3e/0x191
2012-07-07T19:51:54.252074+08:00 timemars kernel: PGD 0 
2012-07-07T19:51:54.252078+08:00 timemars kernel: Oops: 0000 [#1] SMP 
2012-07-07T19:51:54.252080+08:00 timemars kernel: CPU 1 
2012-07-07T19:51:54.252085+08:00 timemars kernel: Modules linked in:
2012-07-07T19:51:54.252088+08:00 timemars kernel: 
2012-07-07T19:51:54.252091+08:00 timemars kernel: Pid: 2608, comm: qemu-system-x86 Not tainted 3.4.0 #1 Dell Inc. Inspiron 1440                   /0K138P
2012-07-07T19:51:54.252095+08:00 timemars kernel: RIP: 0010:[<ffffffff810d04f2>]  [<ffffffff810d04f2>] iput+0x3e/0x191
2012-07-07T19:51:54.252099+08:00 timemars kernel: RSP: 0018:ffff880102fede58  EFLAGS: 00010246
2012-07-07T19:51:54.252102+08:00 timemars kernel: RAX: 0000000000000001 RBX: ffff8800ac78ef20 RCX: ffff88011fd00000
2012-07-07T19:51:54.252105+08:00 timemars kernel: RDX: ffff88011fd00000 RSI: ffff8800ac78ef88 RDI: ffff8800ac78ef88
2012-07-07T19:51:54.252108+08:00 timemars kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff8160c4a0
2012-07-07T19:51:54.252111+08:00 timemars kernel: R10: dead000000200200 R11: ffff880118eb3400 R12: 00000000fffcfaf8
2012-07-07T19:51:54.252115+08:00 timemars kernel: R13: 0000000000000000 R14: ffff880102fede88 R15: 00000000fffcfaf8
2012-07-07T19:51:54.252118+08:00 timemars kernel: FS:  00007f51766358c0(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000
2012-07-07T19:51:54.252121+08:00 timemars kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
2012-07-07T19:51:54.252124+08:00 timemars kernel: CR2: 0000000000000030 CR3: 0000000118d41000 CR4: 00000000000427f0
2012-07-07T19:51:54.252139+08:00 timemars kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2012-07-07T19:51:54.252142+08:00 timemars kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
2012-07-07T19:51:54.252145+08:00 timemars kernel: Process qemu-system-x86 (pid: 2608, threadinfo ffff880102fec000, task ffff8800a5f3da00)
2012-07-07T19:51:54.252148+08:00 timemars kernel: Stack:
2012-07-07T19:51:54.252151+08:00 timemars kernel: ffff880118eb3400 ffff8800ac78e800 00000000fffcfaf8 ffffffff81307563
2012-07-07T19:51:54.252163+08:00 timemars kernel: ffff8800ac78ec00 ffffffff813169ef ffff880102fede88 ffff880102fede88
2012-07-07T19:51:54.252166+08:00 timemars kernel: dead000000100100 ffff8801174bc2a0 ffff8800ac78e800 ffff8800ac78ee80
2012-07-07T19:51:54.252169+08:00 timemars kernel: Call Trace:
2012-07-07T19:51:54.252172+08:00 timemars kernel: [<ffffffff81307563>] ? sk_release_kernel+0x28/0x47
2012-07-07T19:51:54.252175+08:00 timemars kernel: [<ffffffff813169ef>] ? netdev_run_todo+0x1c9/0x1f3
2012-07-07T19:51:54.252178+08:00 timemars kernel: [<ffffffff81244bb3>] ? tun_chr_close+0x4c/0x99
2012-07-07T19:51:54.252180+08:00 timemars kernel: [<ffffffff810bf948>] ? fput+0xf9/0x1ea
2012-07-07T19:51:54.252192+08:00 timemars kernel: [<ffffffff810bd095>] ? filp_close+0x57/0x5f
2012-07-07T19:51:54.252195+08:00 timemars kernel: [<ffffffff810bd111>] ? sys_close+0x74/0xb1
2012-07-07T19:51:54.252198+08:00 timemars kernel: [<ffffffff813eaf62>] ? system_call_fastpath+0x16/0x1b
2012-07-07T19:51:54.252210+08:00 timemars kernel: Code: 00 00 00 40 74 02 0f 0b 48 8d 77 68 48 8d bf 00 01 00 00 e8 29 ef 08 00 85 c0 0f 84 59 01 00 00 48 8b 6b 18 f6 83 80 00 00 00 08 <4c> 8b 65 30 74 11 be 61 05 00 00 48 c7 c7 45 27 52 81 e8 da 5a 
2012-07-07T19:51:54.252214+08:00 timemars kernel: RIP  [<ffffffff810d04f2>] iput+0x3e/0x191
2012-07-07T19:51:54.252217+08:00 timemars kernel: RSP <ffff880102fede58>
2012-07-07T19:51:54.252219+08:00 timemars kernel: CR2: 0000000000000030
2012-07-07T19:51:54.298648+08:00 timemars kernel: ---[ end trace 23837b1c67685f78 ]---

Best wishes,

Zhijie

^ permalink raw reply

* [PATCH] ipvs: fix oops on NAT reply in br_nf context
From: Lin Ming @ 2012-07-07 10:26 UTC (permalink / raw)
  To: Simon Horman, Julian Anastasov
  Cc: Massimo Cetra, Eric Dumazet, David S. Miller, netdev

IPVS should not reset skb->nf_bridge in FORWARD hook
by calling nf_reset for NAT replies. It triggers oops in
br_nf_forward_finish.

[  579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[  579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[  579.781792] PGD 218f9067 PUD 0 
[  579.781865] Oops: 0000 [#1] SMP 
[  579.781945] CPU 0 
[  579.781983] Modules linked in:
[  579.782047] 
[  579.782080] 
[  579.782114] Pid: 4644, comm: qemu Tainted: G        W    3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard  /30E8
[  579.782300] RIP: 0010:[<ffffffff817b1ca5>]  [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[  579.782455] RSP: 0018:ffff88007b003a98  EFLAGS: 00010287
[  579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
[  579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
[  579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
[  579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
[  579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
[  579.783177] FS:  0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
[  579.783306] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
[  579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
[  579.783919] Stack:
[  579.783959]  ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
[  579.784110]  ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
[  579.784260]  ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
[  579.784477] Call Trace:
[  579.784523]  <IRQ> 
[  579.784562] 
[  579.784603]  [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
[  579.784707]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[  579.784797]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[  579.784906]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[  579.784995]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[  579.785175]  [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
[  579.785179]  [<ffffffff817ac417>] __br_forward+0x97/0xa2
[  579.785179]  [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
[  579.785179]  [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
[  579.785179]  [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
[  579.785179]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[  579.785179]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[  579.785179]  [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
[  579.785179]  [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
[  579.785179]  [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
[  579.785179]  [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
[  579.785179]  [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
[  579.785179]  [<ffffffff816e6800>] net_rx_action+0xdf/0x242
[  579.785179]  [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
[  579.785179]  [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
[  579.785179]  [<ffffffff8188812c>] call_softirq+0x1c/0x30

The steps to reproduce are as follows:

1. On Host1, set up bridge br0 (192.168.1.106)
2. Boot a KVM guest (192.168.1.105) on Host1 and start httpd
3. Start the IPVS service on Host1
   ipvsadm -A -t 192.168.1.106:80 -s rr
   ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
4. Run Apache benchmark on Host2 (192.168.1.101)
   ab -n 1000 http://192.168.1.106/

ip_vs_reply4
  ip_vs_out
    handle_response
      ip_vs_notrack
        nf_reset()
        {
          skb->nf_bridge = NULL;
        }

In this case IPVS only wants to replace nfct with the
untracked version. So replace the nf_reset(skb) call in
ip_vs_notrack() with an nf_conntrack_put(skb->nfct) call.

Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
---
 include/net/ip_vs.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index d6146b4..95374d1 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1425,7 +1425,7 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
 	if (!ct || !nf_ct_is_untracked(ct)) {
-		nf_reset(skb);
+		nf_conntrack_put(skb->nfct);
 		skb->nfct = &nf_ct_untracked_get()->ct_general;
 		skb->nfctinfo = IP_CT_NEW;
 		nf_conntrack_get(skb->nfct);
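
To illustrate why the one-line change matters, here is a minimal
userspace sketch of the two behaviours. Everything in it is a
hypothetical stand-in (the mock_* names, the two-field skb, the plain
integer refcounts); it is not the kernel code, only a model built on
the assumption that nf_reset() drops both the conntrack reference and
the bridge info, while the fix drops only the conntrack reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * relevant to this bug are modelled. */
struct mock_nfct { int use; };        /* reference count */
struct mock_nf_bridge { int use; };   /* reference count */

struct mock_skb {
	struct mock_nfct *nfct;
	struct mock_nf_bridge *nf_bridge;
};

/* Stand-in for the shared "untracked" conntrack entry. */
static struct mock_nfct mock_untracked = { .use = 1 };

static void mock_conntrack_get(struct mock_nfct *nfct)
{
	if (nfct)
		nfct->use++;
}

static void mock_conntrack_put(struct mock_nfct *nfct)
{
	if (nfct)
		nfct->use--;
}

/* Models nf_reset(): releases conntrack state AND bridge state. */
static void mock_nf_reset(struct mock_skb *skb)
{
	mock_conntrack_put(skb->nfct);
	skb->nfct = NULL;
	if (skb->nf_bridge)
		skb->nf_bridge->use--;
	skb->nf_bridge = NULL;
}

/* Behaviour before the patch: nf_bridge is lost as a side effect. */
static void mock_notrack_old(struct mock_skb *skb)
{
	assert(skb != NULL);
	mock_nf_reset(skb);
	skb->nfct = &mock_untracked;
	mock_conntrack_get(skb->nfct);
}

/* Behaviour after the patch: only the conntrack reference is swapped,
 * so a later bridge hook still finds skb->nf_bridge. */
static void mock_notrack_new(struct mock_skb *skb)
{
	assert(skb != NULL);
	mock_conntrack_put(skb->nfct);
	skb->nfct = &mock_untracked;
	mock_conntrack_get(skb->nfct);
}
```

In this model, an skb passed through mock_notrack_old() comes back
with nf_bridge == NULL, which is the state br_nf_forward_finish
tripped over; mock_notrack_new() leaves nf_bridge untouched.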

^ permalink raw reply related

* Re: [PATCH v2] bridge: netfilter: fix skb->nf_bridge NULL panic in br_nf_forward_finish
From: Julian Anastasov @ 2012-07-07 10:27 UTC (permalink / raw)
  To: Lin Ming
  Cc: Massimo Cetra, Eric Dumazet, netdev, Stephen Hemminger,
	David S. Miller, Simon Horman
In-Reply-To: <1341655223.7993.3.camel@chief-river-32>


	Hello,

On Sat, 7 Jul 2012, Lin Ming wrote:

> On Sat, 2012-07-07 at 12:48 +0300, Julian Anastasov wrote:
> > 
> > 	Very good. Thanks for tracking and fixing this bug.
> > Can you send a copy to Simon Horman <horms@verge.net.au>
> > with the correct Subject? As this change can go to stable
> > kernels, you can also improve the comments, for example:
> > 
> > ipvs: fix oops on NAT reply in br_nf context
> > 
> > 	IPVS should not reset skb->nf_bridge in FORWARD hook
> > by calling nf_reset for NAT replies. It triggers oops in
> > br_nf_forward_finish.
> > 
> > [here follows your corrected description including
> > the stack trace]
> 
> How about below? Can I have your ACK?
> I'll resend this patch in another mail.

	Very good. You can add my

Signed-off-by: Julian Anastasov <ja@ssi.bg>

> ===
> 
> Subject: [PATCH] ipvs: fix oops on NAT reply in br_nf context
> 
> IPVS should not reset skb->nf_bridge in FORWARD hook
> by calling nf_reset for NAT replies. It triggers oops in
> br_nf_forward_finish.
> 
> [  579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> [  579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
> [  579.781792] PGD 218f9067 PUD 0 
> [  579.781865] Oops: 0000 [#1] SMP 
> [  579.781945] CPU 0 
> [  579.781983] Modules linked in:
> [  579.782047] 
> [  579.782080] 
> [  579.782114] Pid: 4644, comm: qemu Tainted: G        W    3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard  /30E8
> [  579.782300] RIP: 0010:[<ffffffff817b1ca5>]  [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
> [  579.782455] RSP: 0018:ffff88007b003a98  EFLAGS: 00010287
> [  579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
> [  579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
> [  579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
> [  579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
> [  579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
> [  579.783177] FS:  0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
> [  579.783306] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
> [  579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
> [  579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
> [  579.783919] Stack:
> [  579.783959]  ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
> [  579.784110]  ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
> [  579.784260]  ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
> [  579.784477] Call Trace:
> [  579.784523]  <IRQ> 
> [  579.784562] 
> [  579.784603]  [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
> [  579.784707]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
> [  579.784797]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
> [  579.784906]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
> [  579.784995]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
> [  579.785175]  [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
> [  579.785179]  [<ffffffff817ac417>] __br_forward+0x97/0xa2
> [  579.785179]  [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
> [  579.785179]  [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
> [  579.785179]  [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
> [  579.785179]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
> [  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
> [  579.785179]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
> [  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
> [  579.785179]  [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
> [  579.785179]  [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
> [  579.785179]  [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
> [  579.785179]  [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
> [  579.785179]  [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
> [  579.785179]  [<ffffffff816e6800>] net_rx_action+0xdf/0x242
> [  579.785179]  [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
> [  579.785179]  [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
> [  579.785179]  [<ffffffff8188812c>] call_softirq+0x1c/0x30
> 
> The steps to reproduce are as follows:
> 
> 1. On Host1, set up bridge br0 (192.168.1.106)
> 2. Boot a KVM guest (192.168.1.105) on Host1 and start httpd
> 3. Start the IPVS service on Host1
>    ipvsadm -A -t 192.168.1.106:80 -s rr
>    ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
> 4. Run Apache benchmark on Host2 (192.168.1.101)
>    ab -n 1000 http://192.168.1.106/
> 
> ip_vs_reply4
>   ip_vs_out
>     handle_response
>       ip_vs_notrack
>         nf_reset()
>         {
>           skb->nf_bridge = NULL;
>         }
> 
> In this case IPVS only wants to replace nfct with the
> untracked version. So replace the nf_reset(skb) call in
> ip_vs_notrack() with an nf_conntrack_put(skb->nfct) call.
> 
> Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
> ---
>  include/net/ip_vs.h |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
> index d6146b4..95374d1 100644
> --- a/include/net/ip_vs.h
> +++ b/include/net/ip_vs.h
> @@ -1425,7 +1425,7 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
>  	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
>  
>  	if (!ct || !nf_ct_is_untracked(ct)) {
> -		nf_reset(skb);
> +		nf_conntrack_put(skb->nfct);
>  		skb->nfct = &nf_ct_untracked_get()->ct_general;
>  		skb->nfctinfo = IP_CT_NEW;
>  		nf_conntrack_get(skb->nfct);
> 

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [PATCH v2] bridge: netfilter: fix skb->nf_bridge NULL panic in br_nf_forward_finish
From: Lin Ming @ 2012-07-07 10:00 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: Massimo Cetra, Eric Dumazet, netdev, Stephen Hemminger,
	David S. Miller, Simon Horman
In-Reply-To: <alpine.LFD.2.00.1207071229010.1595@ja.ssi.bg>

On Sat, 2012-07-07 at 12:48 +0300, Julian Anastasov wrote:
> 
> 	Very good. Thanks for tracking and fixing this bug.
> Can you send a copy to Simon Horman <horms@verge.net.au>
> with the correct Subject? As this change can go to stable
> kernels, you can also improve the comments, for example:
> 
> ipvs: fix oops on NAT reply in br_nf context
> 
> 	IPVS should not reset skb->nf_bridge in FORWARD hook
> by calling nf_reset for NAT replies. It triggers oops in
> br_nf_forward_finish.
> 
> [here follows your corrected description including
> the stack trace]

How about below? Can I have your ACK?
I'll resend this patch in another mail.
===

Subject: [PATCH] ipvs: fix oops on NAT reply in br_nf context

IPVS should not reset skb->nf_bridge in FORWARD hook
by calling nf_reset for NAT replies. It triggers oops in
br_nf_forward_finish.

[  579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[  579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[  579.781792] PGD 218f9067 PUD 0 
[  579.781865] Oops: 0000 [#1] SMP 
[  579.781945] CPU 0 
[  579.781983] Modules linked in:
[  579.782047] 
[  579.782080] 
[  579.782114] Pid: 4644, comm: qemu Tainted: G        W    3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard  /30E8
[  579.782300] RIP: 0010:[<ffffffff817b1ca5>]  [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[  579.782455] RSP: 0018:ffff88007b003a98  EFLAGS: 00010287
[  579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
[  579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
[  579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
[  579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
[  579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
[  579.783177] FS:  0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
[  579.783306] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
[  579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
[  579.783919] Stack:
[  579.783959]  ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
[  579.784110]  ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
[  579.784260]  ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
[  579.784477] Call Trace:
[  579.784523]  <IRQ> 
[  579.784562] 
[  579.784603]  [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
[  579.784707]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[  579.784797]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[  579.784906]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[  579.784995]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[  579.785175]  [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
[  579.785179]  [<ffffffff817ac417>] __br_forward+0x97/0xa2
[  579.785179]  [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
[  579.785179]  [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
[  579.785179]  [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
[  579.785179]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[  579.785179]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[  579.785179]  [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
[  579.785179]  [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
[  579.785179]  [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
[  579.785179]  [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
[  579.785179]  [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
[  579.785179]  [<ffffffff816e6800>] net_rx_action+0xdf/0x242
[  579.785179]  [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
[  579.785179]  [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
[  579.785179]  [<ffffffff8188812c>] call_softirq+0x1c/0x30

The steps to reproduce are as follows:

1. On Host1, set up bridge br0 (192.168.1.106)
2. Boot a KVM guest (192.168.1.105) on Host1 and start httpd
3. Start the IPVS service on Host1
   ipvsadm -A -t 192.168.1.106:80 -s rr
   ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
4. Run Apache benchmark on Host2 (192.168.1.101)
   ab -n 1000 http://192.168.1.106/

ip_vs_reply4
  ip_vs_out
    handle_response
      ip_vs_notrack
        nf_reset()
        {
          skb->nf_bridge = NULL;
        }

In this case IPVS only wants to replace nfct with the
untracked version. So replace the nf_reset(skb) call in
ip_vs_notrack() with an nf_conntrack_put(skb->nfct) call.

Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
---
 include/net/ip_vs.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index d6146b4..95374d1 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1425,7 +1425,7 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
 	if (!ct || !nf_ct_is_untracked(ct)) {
-		nf_reset(skb);
+		nf_conntrack_put(skb->nfct);
 		skb->nfct = &nf_ct_untracked_get()->ct_general;
 		skb->nfctinfo = IP_CT_NEW;
 		nf_conntrack_get(skb->nfct);

^ permalink raw reply related

* Re: [PATCH v2] bridge: netfilter: fix skb->nf_bridge NULL panic in br_nf_forward_finish
From: Julian Anastasov @ 2012-07-07  9:48 UTC (permalink / raw)
  To: Lin Ming
  Cc: Massimo Cetra, Eric Dumazet, netdev, Stephen Hemminger,
	David S. Miller, Simon Horman
In-Reply-To: <1341622087.4004.2.camel@chief-river-32>


	Hello,

On Sat, 7 Jul 2012, Lin Ming wrote:

> The panic below was triggered when testing IPVS.
> 
> [  579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> [  579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
> [  579.781792] PGD 218f9067 PUD 0
> [  579.781865] Oops: 0000 [#1] SMP
> [  579.781945] CPU 0
> [  579.781983] Modules linked in:
> [  579.782047]
> [  579.782080]
> [  579.782114] Pid: 4644, comm: qemu Tainted: G        W    3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard  /30E8
> [  579.782300] RIP: 0010:[<ffffffff817b1ca5>]  [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
> [  579.782455] RSP: 0018:ffff88007b003a98  EFLAGS: 00010287
> [  579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
> [  579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
> [  579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
> [  579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
> [  579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
> [  579.783177] FS:  0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
> [  579.783306] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
> [  579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
> [  579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
> [  579.783919] Stack:
> [  579.783959]  ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
> [  579.784110]  ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
> [  579.784260]  ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
> [  579.784477] Call Trace:
> [  579.784523]  <IRQ>
> [  579.784562]
> [  579.784603]  [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
> [  579.784707]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
> [  579.784797]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
> [  579.784906]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
> [  579.784995]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
> [  579.785175]  [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
> [  579.785179]  [<ffffffff817ac417>] __br_forward+0x97/0xa2
> [  579.785179]  [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
> [  579.785179]  [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
> [  579.785179]  [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
> [  579.785179]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
> [  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
> [  579.785179]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
> [  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
> [  579.785179]  [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
> [  579.785179]  [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
> [  579.785179]  [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
> [  579.785179]  [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
> [  579.785179]  [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
> [  579.785179]  [<ffffffff816e6800>] net_rx_action+0xdf/0x242
> [  579.785179]  [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
> [  579.785179]  [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
> [  579.785179]  [<ffffffff8188812c>] call_softirq+0x1c/0x30
> 
> The steps to reproduce are as follows:
> 
> 1. On Host1, set up bridge br0 (192.168.1.106)
> 2. Boot a KVM guest (192.168.1.105) on Host1 and start httpd
> 3. Start the IPVS service on Host1
>    ipvsadm -A -t 192.168.1.106:80 -s rr
>    ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
> 4. Run Apache benchmark on Host2 (192.168.1.101)
>    ab -n 1000 http://192.168.1.106/
> 
> The panic happened in br_nf_forward_finish because skb->nf_bridge is NULL.
> skb->nf_bridge was set to NULL in ip_vs_reply4 hook.
> 
> br_nf_forward_ip():
>   NF_HOOK(pf, NF_INET_FORWARD, skb, brnf_get_logical_dev(skb, in), parent,
>                 br_nf_forward_finish);
> 
> This calls IPVS hook ip_vs_reply4.
> 
> ip_vs_reply4
>   ip_vs_out
>     handle_response
>       ip_vs_notrack
>         nf_reset()
>         {
>           skb->nf_bridge = NULL;
>         }
> 
> Julian said:
>     Actually, IPVS wants in this case just to replace nfct
>     with the untracked version. Maybe it is better to replace
>     the nf_reset(skb) call in ip_vs_notrack() with an
>     nf_conntrack_put(skb->nfct) call.
> 
> This patch does what Julian suggested and it fixes the panic.

	Very good. Thanks for tracking and fixing this bug.
Can you send a copy to Simon Horman <horms@verge.net.au>
with the correct Subject? As this change can go to stable
kernels, you can also improve the comments, for example:

ipvs: fix oops on NAT reply in br_nf context

	IPVS should not reset skb->nf_bridge in FORWARD hook
by calling nf_reset for NAT replies. It triggers oops in
br_nf_forward_finish.

[here follows your corrected description including
the stack trace]

> Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
> ---
>  include/net/ip_vs.h |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
> index d6146b4..95374d1 100644
> --- a/include/net/ip_vs.h
> +++ b/include/net/ip_vs.h
> @@ -1425,7 +1425,7 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
>  	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
>  
>  	if (!ct || !nf_ct_is_untracked(ct)) {
> -		nf_reset(skb);
> +		nf_conntrack_put(skb->nfct);
>  		skb->nfct = &nf_ct_untracked_get()->ct_general;
>  		skb->nfctinfo = IP_CT_NEW;
>  		nf_conntrack_get(skb->nfct);
> -- 
> 1.7.2.5

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [PATCH 2/2] irda/pxa:
From: David Miller @ 2012-07-07  9:41 UTC (permalink / raw)
  To: arnd; +Cc: rmk+kernel, linux-arm-kernel, samuel, netdev
In-Reply-To: <201207070855.15431.arnd@arndb.de>

From: Arnd Bergmann <arnd@arndb.de>
Date: Sat, 7 Jul 2012 08:55:15 +0000

> After c00184f9ab4 "ARM: sa11x0/pxa: convert OS timer registers to IOMEM",
> magician_defconfig and a few others fail to build because the OSCR
> register is accessed by drivers/net/irda/pxaficp_ir.c but has turned
> into a pointer that needs to be read using readl.
> 
> There are other registers in the same driver that eventually should
> be converted, and it's unclear whether we would want a better interface
> to access the OSCR from a device driver.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> This patch should be applied to Russell's ARM tree which contains the
> patch that broke it, Cc to netdev for information and Acks.

Acked-by: David S. Miller <davem@davemloft.net>
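
For readers unfamiliar with the breakage described above, a rough
userspace sketch of the pattern follows. It is not the driver code:
mock_readl(), fake_timer_reg, FAKE_OSCR and ticks_since() are all
hypothetical names made up for illustration, and the only assumption
carried over from the thread is that the register is now a pointer
which has to be read through an accessor rather than used as a value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for an MMIO read accessor: one 32-bit volatile
 * load from the given address. */
static uint32_t mock_readl(const volatile void *addr)
{
	assert(addr != NULL);
	return *(const volatile uint32_t *)addr;
}

/* Fake backing storage for a free-running timer register. */
static volatile uint32_t fake_timer_reg;

/* The register is exposed as a pointer, not as a value, so a plain
 * "FAKE_OSCR - start" would be pointer arithmetic and not a read. */
#define FAKE_OSCR ((const volatile void *)&fake_timer_reg)

/* Ticks elapsed since 'start'; unsigned subtraction keeps the result
 * correct across a 32-bit counter wrap. */
static uint32_t ticks_since(uint32_t start)
{
	return mock_readl(FAKE_OSCR) - start;
}
```

The point of the sketch is only the access pattern: once the register
macro evaluates to a pointer, every use has to go through the read
accessor, which is what the quoted patch description says the driver
was missing.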

^ permalink raw reply
