Netdev List

* pre-fetching skb for delayed send
From: Ben Greear @ 2012-07-06 23:10 UTC (permalink / raw)
  To: netdev

Suppose one has a bridge-like thing that may store up packets
for a bit (100+ms) before sending them....

We notice that performance is very good if we can run with (near) zero
delay, but at higher delay, performance goes down significantly when attempting 10G
speeds.  I assume this is because we are sending skbs that are no longer in the cache...

So, is there an easy way to pre-fetch an skb and its related pages?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: pre-fetching skb for delayed send
From: Benjamin LaHaise @ 2012-07-06 23:52 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev
In-Reply-To: <4FF77070.7040705@candelatech.com>

On Fri, Jul 06, 2012 at 04:10:40PM -0700, Ben Greear wrote:
> So, is there an easy way to pre-fetch an skb and its related pages?

One trick would be to store the pointers to the skbs in an array rather than 
relying on a list of skbs.  That would enable you to issue prefetches for 
multiple skbs from a single cache line of the array, rather than relying 
upon the slow chasing of pointers guaranteed to cache miss in a list.

		-ben
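
A minimal user-space sketch of the array idea above, assuming GCC or Clang
(for __builtin_prefetch). struct pkt, drain_queue() and PREFETCH_AHEAD are
hypothetical names for illustration; kernel code would use struct sk_buff
and the kernel's own prefetch helper instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a queued packet; not the kernel's struct sk_buff. */
struct pkt {
	uint32_t len;
	unsigned char *data;
};

#define PREFETCH_AHEAD 4	/* assumed lookahead distance; tune by measurement */

/*
 * Walk an array of packet pointers, prefetching the entry a few slots
 * ahead before touching the current one. Returns the total byte count.
 */
static uint64_t drain_queue(struct pkt **queue, size_t n)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (i + PREFETCH_AHEAD < n)
			__builtin_prefetch(queue[i + PREFETCH_AHEAD], 0, 3);
		total += queue[i]->len;	/* stands in for the real transmit */
	}
	return total;
}
```

Because the pointers sit next to each other in the array, the address of a
later packet is available without first loading the current one, which is
what a linked list cannot offer.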
-- 
"Thought is the essence of where you are now."


* Re: [RFC PATCH 01/10] net: Split core bits of dev_pick_tx into __dev_pick_tx
From: Ben Hutchings @ 2012-07-07  0:03 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: netdev, davem, jeffrey.t.kirsher, edumazet, therbert,
	alexander.duyck
In-Reply-To: <20120630001618.29939.26996.stgit@gitlad.jf.intel.com>

On Fri, 2012-06-29 at 17:16 -0700, Alexander Duyck wrote:
> This change splits the core bits of dev_pick_tx into a separate function.
> The main idea behind this is to make this code accessible to select queue
> functions when they decide to process the standard path instead of their
> own custom path in their select queue routine.
[...]

I like this.  Uninlining that code is going to cost some cycles, but
hopefully not enough to worry about.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* Re: [PATCH net,1/1] hyperv: Add support for setting MAC from within guests
From: Ben Hutchings @ 2012-07-07  0:19 UTC (permalink / raw)
  To: Haiyang Zhang; +Cc: davem, netdev, kys, olaf, linux-kernel, devel
In-Reply-To: <1341609932-18971-1-git-send-email-haiyangz@microsoft.com>

On Fri, 2012-07-06 at 14:25 -0700, Haiyang Zhang wrote:
> This adds support for setting synthetic NIC MAC address from within Linux
> guests. Before using this feature, the option "spoofing of MAC address"
> should be enabled at the Hyper-V manager / Settings of the synthetic
> NIC.
[...]
> +int rndis_filter_set_device_mac(struct hv_device *hdev, char *mac)
> +{
[...]
> +	t = wait_for_completion_timeout(&request->wait_event, 5*HZ);
> +	if (t == 0) {
> +		netdev_err(ndev, "timeout before we got a set response...\n");
> +		/*
> +		 * can't put_rndis_request, since we may still receive a
> +		 * send-completion.
> +		 */
> +		return -EBUSY;
> +	} else {
> +		set_complete = &request->response_msg.msg.set_complete;
> +		if (set_complete->status != RNDIS_STATUS_SUCCESS)
> +			ret = -EINVAL;
[...]

Is there a specific error code that indicates the hypervisor is
configured not to allow MAC address changes?  If so, shouldn't that be
translated to return EPERM rather than EINVAL?

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
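
A sketch of the translation asked about above, written as plain C rather
than driver code. EXAMPLE_STATUS_MAC_CHANGE_DENIED and its value are
hypothetical: the thread does not establish that the host reports a
distinct status for this case.

```c
#include <errno.h>
#include <stdint.h>

#define RNDIS_STATUS_SUCCESS	0x00000000u
/* Hypothetical value: the thread does not say which status, if any, the
 * host returns when MAC spoofing is disabled. */
#define EXAMPLE_STATUS_MAC_CHANGE_DENIED	0xC0DE0001u

/* Map a set-request completion status to a negative errno value. */
static int set_mac_status_to_errno(uint32_t status)
{
	switch (status) {
	case RNDIS_STATUS_SUCCESS:
		return 0;
	case EXAMPLE_STATUS_MAC_CHANGE_DENIED:
		return -EPERM;	/* refused by policy */
	default:
		return -EINVAL;	/* any other failure */
	}
}
```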


* Re: [PATCH] smsc95xx: support ethtool get_regs
From: Ben Hutchings @ 2012-07-07  0:24 UTC (permalink / raw)
  To: Émeric Vigier
  Cc: Steve Glendinning, steve glendinning, netdev, Nancy Lin
In-Reply-To: <1847398984.224080.1341598531284.JavaMail.root@mail.savoirfairelinux.com>

On Fri, 2012-07-06 at 14:15 -0400, Émeric Vigier wrote:
> From: Emeric Vigier <emeric.vigier@savoirfairelinux.com>
> 
> Inspired by implementation in smsc911x.c and smsc9420.c
> Tested on ARM/pandaboard rev A3
> 
> Signed-off-by: Emeric Vigier <emeric.vigier@savoirfairelinux.com>
> ---
>  drivers/net/usb/smsc95xx.c |   37 +++++++++++++++++++++++++++++++++++++
>  1 files changed, 37 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
> index b1112e7..bce14f6 100644
> --- a/drivers/net/usb/smsc95xx.c
> +++ b/drivers/net/usb/smsc95xx.c
> @@ -578,6 +578,41 @@ static int smsc95xx_ethtool_set_eeprom(struct net_device *netdev,
>  	return smsc95xx_write_eeprom(dev, ee->offset, ee->len, data);
>  }
>  
> +
> +static int smsc95xx_ethtool_getregslen(struct net_device *dev)
> +{
> +	/* all smsc95xx registers plus all phy registers */
> +	return COE_CR - ID_REV + 1 + 32 * sizeof(u32);
> +}
> +
> +static void
> +smsc95xx_ethtool_getregs(struct net_device *netdev, struct ethtool_regs *regs,
> +			 void *buf)
> +{
> +	struct usbnet *dev = netdev_priv(netdev);
> +	unsigned int i, j = 0, retval;
> +	u32 *data = buf;
> +
> +	netif_dbg(dev, hw, dev->net, "ethtool_getregs\n");
> +
> +	retval = smsc95xx_read_reg(dev, ID_REV, &regs->version);
> +	if (retval < 0) {
> +		netdev_warn(dev->net, "REGS: cannot read ID_REV\n");
> +		return;
> +	}
> +
> +	for (i = 0; i <= COE_CR; i += (sizeof(u32))) {
> +		retval = smsc95xx_read_reg(dev, i, &data[j++]);
> +		if (retval < 0) {
> +			netdev_warn(dev->net, "REGS: cannot read reg[%x]\n", i);
> +			return;
> +		}
> +	}

Why does this start with i = 0 whereas the calculation of the length
uses ID_REV as the starting point?  Maybe ID_REV == 0, but you should be
consistent in whether you use the name or literal number.

> +	for (i = 0; i <= PHY_SPECIAL; i++)
> +		data[j++] = smsc95xx_mdio_read(netdev, dev->mii.phy_id, i);
> +}

Again, why use PHY_SPECIAL (+ 1) here as opposed to 32 in the
calculation of the length?

Ben.

>  static const struct ethtool_ops smsc95xx_ethtool_ops = {
>  	.get_link	= usbnet_get_link,
>  	.nway_reset	= usbnet_nway_reset,
> @@ -589,6 +624,8 @@ static const struct ethtool_ops smsc95xx_ethtool_ops = {
>  	.get_eeprom_len	= smsc95xx_ethtool_get_eeprom_len,
>  	.get_eeprom	= smsc95xx_ethtool_get_eeprom,
>  	.set_eeprom	= smsc95xx_ethtool_set_eeprom,
> +	.get_regs_len	= smsc95xx_ethtool_getregslen,
> +	.get_regs	= smsc95xx_ethtool_getregs,
>  };
>  
>  static int smsc95xx_ioctl(struct net_device *netdev, struct ifreq *rq, int cmd)

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
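
One way to address both review comments is to derive the length and the
loop bounds from the same named constants. The sketch below is plain C,
not the driver; the register offsets are assumed values for illustration,
and regs_len() and regs_words_written() are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration; the real ones live in the driver header. */
#define ID_REV		0x00u
#define COE_CR		0x130u
#define PHY_SPECIAL	31u

/* Number of 32-bit MAC registers in [ID_REV, COE_CR], one every 4 bytes. */
#define MAC_REG_COUNT	((COE_CR - ID_REV) / sizeof(uint32_t) + 1)
/* Number of PHY registers in [0, PHY_SPECIAL]. */
#define PHY_REG_COUNT	(PHY_SPECIAL + 1)

/* Buffer size in bytes that the dump needs. */
static size_t regs_len(void)
{
	return (MAC_REG_COUNT + PHY_REG_COUNT) * sizeof(uint32_t);
}

/* Count the words the two dump loops would write, using the same bounds. */
static size_t regs_words_written(void)
{
	size_t j = 0;
	unsigned int i;

	for (i = ID_REV; i <= COE_CR; i += sizeof(uint32_t))
		j++;
	for (i = 0; i <= PHY_SPECIAL; i++)
		j++;
	return j;
}
```

With the length counted in whole 32-bit words, the number of bytes reported
and the number of words the loops store cannot drift apart.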


* [PATCH v2] bridge: netfilter: fix skb->nf_bridge NULL panic in br_nf_forward_finish
From: Lin Ming @ 2012-07-07  0:48 UTC (permalink / raw)
  To: Massimo Cetra, Eric Dumazet, Julian Anastasov
  Cc: netdev, Stephen Hemminger, David S. Miller

The panic below was triggered when testing IPVS.

[  579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[  579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[  579.781792] PGD 218f9067 PUD 0
[  579.781865] Oops: 0000 [#1] SMP
[  579.781945] CPU 0
[  579.781983] Modules linked in:
[  579.782047]
[  579.782080]
[  579.782114] Pid: 4644, comm: qemu Tainted: G        W    3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard  /30E8
[  579.782300] RIP: 0010:[<ffffffff817b1ca5>]  [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[  579.782455] RSP: 0018:ffff88007b003a98  EFLAGS: 00010287
[  579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
[  579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
[  579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
[  579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
[  579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
[  579.783177] FS:  0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
[  579.783306] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
[  579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
[  579.783919] Stack:
[  579.783959]  ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
[  579.784110]  ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
[  579.784260]  ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
[  579.784477] Call Trace:
[  579.784523]  <IRQ>
[  579.784562]
[  579.784603]  [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
[  579.784707]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[  579.784797]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[  579.784906]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[  579.784995]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[  579.785175]  [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
[  579.785179]  [<ffffffff817ac417>] __br_forward+0x97/0xa2
[  579.785179]  [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
[  579.785179]  [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
[  579.785179]  [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
[  579.785179]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[  579.785179]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[  579.785179]  [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
[  579.785179]  [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
[  579.785179]  [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
[  579.785179]  [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
[  579.785179]  [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
[  579.785179]  [<ffffffff816e6800>] net_rx_action+0xdf/0x242
[  579.785179]  [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
[  579.785179]  [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
[  579.785179]  [<ffffffff8188812c>] call_softirq+0x1c/0x30

The steps to reproduce are as follows:

1. On Host1, set up bridge br0 (192.168.1.106)
2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
3. Start IPVS service on Host1
   ipvsadm -A -t 192.168.1.106:80 -s rr
   ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
4. Run apache benchmark on Host2(192.168.1.101)
   ab -n 1000 http://192.168.1.106/

The panic happened in br_nf_forward_finish because skb->nf_bridge is NULL.
skb->nf_bridge was set to NULL in ip_vs_reply4 hook.

br_nf_forward_ip():
  NF_HOOK(pf, NF_INET_FORWARD, skb, brnf_get_logical_dev(skb, in), parent,
                br_nf_forward_finish);

This calls IPVS hook ip_vs_reply4.

ip_vs_reply4
  ip_vs_out
    handle_response
      ip_vs_notrack
        nf_reset()
        {
          skb->nf_bridge = NULL;
        }

Julian said,
    Actually, IPVS wants in this case just to replace nfct
    with the untracked version. Maybe it is better to replace
    the nf_reset(skb) call in ip_vs_notrack() with a
    nf_conntrack_put(skb->nfct) call.

This patch does what Julian suggested and it fixes the panic.

Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
---
 include/net/ip_vs.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index d6146b4..95374d1 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1425,7 +1425,7 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
 	if (!ct || !nf_ct_is_untracked(ct)) {
-		nf_reset(skb);
+		nf_conntrack_put(skb->nfct);
 		skb->nfct = &nf_ct_untracked_get()->ct_general;
 		skb->nfctinfo = IP_CT_NEW;
 		nf_conntrack_get(skb->nfct);
-- 
1.7.2.5
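
The difference between the two calls can be illustrated with a toy model.
Everything below (toy_ref, toy_skb, toy_reset_all, toy_replace_ct) is
hypothetical user-space C, not the kernel's structures or helpers; it only
mirrors the behaviour described in this message, where the reset clears
nf_bridge as a side effect and the replacement touches nfct alone.

```c
#include <stddef.h>

/* Toy model only: these are not the kernel's structures or helpers. */
struct toy_ref { int refcnt; };

struct toy_skb {
	struct toy_ref *nfct;		/* conntrack reference */
	struct toy_ref *nf_bridge;	/* bridge-netfilter state */
};

static void toy_put(struct toy_ref *r)
{
	if (r)
		r->refcnt--;
}

/* Models the old behaviour: drop and clear both pieces of state. */
static void toy_reset_all(struct toy_skb *skb)
{
	toy_put(skb->nfct);
	skb->nfct = NULL;
	toy_put(skb->nf_bridge);
	skb->nf_bridge = NULL;
}

/* Models the patched behaviour: swap only the conntrack reference. */
static void toy_replace_ct(struct toy_skb *skb, struct toy_ref *untracked)
{
	toy_put(skb->nfct);
	skb->nfct = untracked;
	untracked->refcnt++;
}
```

After toy_replace_ct() the nf_bridge pointer is still valid, so a later
stage that dereferences it (as br_nf_forward_finish does in the trace
above) would not see NULL.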


* Question: eth0 and wlan0 traffic load
From: Lin Ming @ 2012-07-07  3:28 UTC (permalink / raw)
  To: netdev; +Cc: Eric Dumazet

On Host1
  eth0: 192.168.1.102
wlan0: 192.168.1.103

On Host2, scp a file to Host1 via wlan0
# scp tmp.file 192.168.1.103:/tmp

But ifstat shows all traffic goes to eth0.
Why does kernel choose eth0 although I use wlan0's address?
Is this an intended behavior?

       eth0               wlan0
 KB/s in  KB/s out   KB/s in  KB/s out
11989.26    286.05      0.00      0.00
11998.90    286.00      0.00      0.00
 6910.18    170.63      0.00      0.00
10260.05    246.82      0.00      0.00
 8295.90    202.12      0.00      0.00
11979.90    282.41      0.00      0.00
11618.39    274.71      0.00      0.00
 7004.78    170.79      0.00      0.00
12003.33    286.21      0.00      0.00
 7775.28    190.59      0.00      0.00
11980.18    280.76      0.00      0.00
11998.32    286.29      0.00      0.00
11995.61    286.08      0.00      0.00
11996.57    285.88      0.00      0.00
 6938.20    162.98      0.00      0.00
 6736.30    169.22      0.04      0.00
12000.92    284.83      0.00      0.00
12005.41    286.13      0.00      0.00
12000.58    286.27      0.00      0.00
11963.65    284.13      0.00      0.00
11935.24    281.32      0.00      0.00
12002.83    286.27      0.00      0.00
11997.59    285.68      0.00      0.00

Thanks,
Lin Ming


* Re: pre-fetching skb for delayed send
From: Ben Greear @ 2012-07-07  4:15 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: netdev
In-Reply-To: <20120706235258.GD19462@kvack.org>

On 07/06/2012 04:52 PM, Benjamin LaHaise wrote:
> On Fri, Jul 06, 2012 at 04:10:40PM -0700, Ben Greear wrote:
>> So, is there an easy way to pre-fetch an skb and its related pages?
>
> One trick would be to store the pointers to the skbs in an array rather than
> relying on a list of skbs.  That would enable you to issue prefetches for
> multiple skbs from a single cache line of the array, rather than relying
> upon the slow chasing of pointers guaranteed to cache miss in a list.

Well, to start with... I at least know the next skb to transmit,
so I figured I'd prefetch it before starting tx of the current
skb.

My question is more basic though:  Given an skb, how do you prefetch
it...do you just prefetch the skb pointer, or do you need to dig into
the guts of the skb?

Thanks,
Ben

>
> 		-ben
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
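
The question above separates into two stages, sketched here in user-space
C with GCC or Clang builtins. struct pkt and both helpers are hypothetical
and far simpler than struct sk_buff; the sketch only shows why the packet
data cannot be requested until the descriptor itself has been read.

```c
#include <stdint.h>

/* Hypothetical, heavily simplified packet descriptor (not struct sk_buff). */
struct pkt {
	struct pkt *next;
	unsigned char *data;	/* start of the packet bytes */
	uint32_t len;
};

/*
 * Stage 1: request the descriptor itself. This is safe to issue as soon
 * as the pointer is known, because a prefetch does not dereference it.
 */
static void pkt_prefetch_desc(const struct pkt *p)
{
	__builtin_prefetch(p, 0, 3);
}

/*
 * Stage 2: request the first line of packet data. Reading p->data is a
 * real load, so it stalls unless stage 1 had time to complete.
 */
static void pkt_prefetch_data(const struct pkt *p)
{
	__builtin_prefetch(p->data, 0, 3);
}
```

A plausible arrangement, under these assumptions, is to issue stage 1 for
packet N+2 and stage 2 for packet N+1 while transmitting packet N.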


* Re: Question: eth0 and wlan0 traffic load
From: Ben Greear @ 2012-07-07  4:16 UTC (permalink / raw)
  To: Lin Ming; +Cc: netdev, Eric Dumazet
In-Reply-To: <CAF1ivSaY3s-Vjbe7_vziCRCEsaXduug3+2G9R55SUBQ8HPT6hQ@mail.gmail.com>

On 07/06/2012 08:28 PM, Lin Ming wrote:
> On Host1
>    eth0: 192.168.1.102
> wlan0: 192.168.1.103
>
> On Host2, scp a file to Host1 via wlan0
> # scp tmp.file 192.168.1.103:/tmp
>
> But ifstat shows all traffic goes to eth0.
> Why does kernel choose eth0 although I use wlan0's address?
> Is this an intended behavior?

You have to be clever with routing rules, arp-filtering, and such
to ensure that works as you want.

Thanks,
Ben

>
>         eth0               wlan0
>   KB/s in  KB/s out   KB/s in  KB/s out
> 11989.26    286.05      0.00      0.00
> 11998.90    286.00      0.00      0.00
>   6910.18    170.63      0.00      0.00
> 10260.05    246.82      0.00      0.00
>   8295.90    202.12      0.00      0.00
> 11979.90    282.41      0.00      0.00
> 11618.39    274.71      0.00      0.00
>   7004.78    170.79      0.00      0.00
> 12003.33    286.21      0.00      0.00
>   7775.28    190.59      0.00      0.00
> 11980.18    280.76      0.00      0.00
> 11998.32    286.29      0.00      0.00
> 11995.61    286.08      0.00      0.00
> 11996.57    285.88      0.00      0.00
>   6938.20    162.98      0.00      0.00
>   6736.30    169.22      0.04      0.00
> 12000.92    284.83      0.00      0.00
> 12005.41    286.13      0.00      0.00
> 12000.58    286.27      0.00      0.00
> 11963.65    284.13      0.00      0.00
> 11935.24    281.32      0.00      0.00
> 12002.83    286.27      0.00      0.00
> 11997.59    285.68      0.00      0.00
>
> Thanks,
> Lin Ming
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: Question: eth0 and wlan0 traffic load
From: Lin Ming @ 2012-07-07  4:55 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev, Eric Dumazet
In-Reply-To: <4FF7B826.4010007@candelatech.com>

On Sat, Jul 7, 2012 at 12:16 PM, Ben Greear <greearb@candelatech.com> wrote:
> On 07/06/2012 08:28 PM, Lin Ming wrote:
>>
>> On Host1
>>    eth0: 192.168.1.102
>> wlan0: 192.168.1.103
>>
>> On Host2, scp a file to Host1 via wlan0
>> # scp tmp.file 192.168.1.103:/tmp
>>
>> But ifstat shows all traffic goes to eth0.
>> Why does kernel choose eth0 although I use wlan0's address?
>> Is this an intended behavior?
>
>
> You have to be clever with routing rules, arp-filtering, and such
> to ensure that works as you want.

Ah, right.
The arp cache on Host2 for .102 and .103 points to Host1's eth0.

Host2 # arp -a
chief-river-32.local (192.168.1.102) at 00:1c:c4:a4:33:9e [ether] on eth0
chief-river-32.local (192.168.1.103) at 00:1c:c4:a4:33:9e [ether] on eth0

Thanks,
Lin Ming


* RE: speed specific port cost calculation in br_if.c and how to make it generic based on 802.1d Table-17-3?
From: Parav.Pandit @ 2012-07-07  5:54 UTC (permalink / raw)
  To: netdev; +Cc: Parav.Pandit, buytenh

Can anyone point me to how to go about this?
Parav

> -----Original Message-----
> From: Pandit, Parav
> Sent: Tuesday, July 03, 2012 6:10 PM
> To: 'netdev@vger.kernel.org'
> Cc: Parav Pandit
> Subject: speed specific port cost calculation in br_if.c and how to make it
> generic based on 802.1d Table-17-3?
> 
> Hi,
> 
> I am trying to extend the bridging portion of the stack to support 40G and
> 100G speeds.
> In net/bridge/br_if.c, port_cost() has hard-coded values of 2, 4, 19 and 100
> for speeds of 10G, 1G, 100M and 10Mbps respectively.
> The comment says these are based on the 802.1d standard.
> 
> I am referring to 802.1d-2004, Table 17-3 (Port Path Cost values).
> It lists port cost values of 2000, 20000 and 200000 for speeds of
> 10G, 1G and 100Mbps respectively.
> This makes sense to me, as the port cost value is inversely proportional to,
> and a scalar function of, the speed.
> scalar function of its speed.
> 
> Can anyone please guide me on
> 1. how the current path/port cost calculation is done, so that I can
> extend it to other speeds in a generic way if possible?
> 2. how I can incorporate other speed settings in a generic way based on
> the 802.1d-2004 spec, Table 17-3?
> 
> Current code snippet:
> /* Determine initial path cost based on speed.
>  * using recommendations from 802.1d standard
>  *
>  * Since driver might sleep need to not be holding any locks.
>  */
> static int port_cost(struct net_device *dev) {
>         struct ethtool_cmd ecmd;
> 
>         if (!__ethtool_get_settings(dev, &ecmd)) {
>                 switch (ethtool_cmd_speed(&ecmd)) {
>                 case SPEED_10000:
>                         return 2;
>                 case SPEED_1000:
>                         return 4;
>                 case SPEED_100:
>                         return 19;
>                 case SPEED_10:
>                         return 100;
>                 }
>         }
> 
>         /* Old silly heuristics based on name */
>         if (!strncmp(dev->name, "lec", 3))
>                 return 7;
> 
>         if (!strncmp(dev->name, "plip", 4))
>                 return 2500;
> 
>         return 100;     /* assume old 10Mbps */
> }
> 
> Regards,
> Parav Pandit
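
The 2004 values quoted above (2000, 20000, 200000 for 10G, 1G, 100M) follow
cost = 20,000,000 / speed in Mb/s, so a generic helper could look like the
sketch below. port_cost_2004() is a hypothetical name, not an existing
kernel function, and the zero-speed fallback simply mirrors the "assume old
10Mbps" default in the quoted code.

```c
#include <stdint.h>

/*
 * Path cost as a function of link speed, following the pattern of
 * IEEE 802.1D-2004 Table 17-3: cost = 20,000,000 / speed in Mb/s.
 * Hypothetical helper, not the kernel's port_cost().
 */
static uint32_t port_cost_2004(uint32_t speed_mbps)
{
	uint32_t cost;

	if (speed_mbps == 0)
		return 2000000;	/* unknown speed: fall back to the 10 Mb/s cost */

	cost = 20000000u / speed_mbps;
	return cost ? cost : 1;	/* path cost must stay at least 1 */
}
```

Under this formula 40G would get 500 and 100G would get 200. These numbers
are on a different scale from the 2/4/19/100 values in the current switch
statement, so the two sets should not be mixed in one table.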


* [PATCH] r6040: remove duplicate call to the pci_set_drvdata
From: Devendra Naga @ 2012-07-07  6:07 UTC (permalink / raw)
  To: Florian Fainelli, netdev; +Cc: Devendra Naga

pci_set_drvdata is called twice in the remove path of the driver;
call it only once.

Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
---
 drivers/net/ethernet/rdc/r6040.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/rdc/r6040.c b/drivers/net/ethernet/rdc/r6040.c
index d1827e8..9acc026 100644
--- a/drivers/net/ethernet/rdc/r6040.c
+++ b/drivers/net/ethernet/rdc/r6040.c
@@ -1256,7 +1256,6 @@ static void __devexit r6040_remove_one(struct pci_dev *pdev)
 	kfree(lp->mii_bus->irq);
 	mdiobus_free(lp->mii_bus);
 	netif_napi_del(&lp->napi);
-	pci_set_drvdata(pdev, NULL);
 	pci_iounmap(pdev, lp->base);
 	pci_release_regions(pdev);
 	free_netdev(dev);
-- 
1.7.9.5


* Re: [PATCH net-next 3/5] net: Fix non-kernel-doc comments with kernel-doc start marker
From: Antonio Quartulli @ 2012-07-07  7:04 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: David Miller, netdev
In-Reply-To: <1341614742.2923.18.camel@bwh-desktop.uk.solarflarecom.com>


On Fri, Jul 06, 2012 at 11:45:42PM +0100, Ben Hutchings wrote:
> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
> ---
>  net/ceph/pagelist.c |    6 +++---
>  net/dcb/dcbnl.c     |    2 +-
>  net/ipv4/tcp.c      |    2 +-
>  net/tipc/socket.c   |    2 +-
>  4 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/net/ceph/pagelist.c b/net/ceph/pagelist.c
> index 13cb409..693852f 100644
> --- a/net/ceph/pagelist.c
> +++ b/net/ceph/pagelist.c
> @@ -95,7 +95,7 @@ int ceph_pagelist_reserve(struct ceph_pagelist *pl, size_t space)
>  }
>  EXPORT_SYMBOL(ceph_pagelist_reserve);
>  
> -/**
> +/*
>   * Free any pages that have been preallocated.
>   */


Am I wrong, or should this kind of comment be:

/* Free any pages that have been preallocated. */


> -/**
> +/*
>   * Truncate a pagelist to the given point. Move extra pages to reserve.
>   * This won't sleep.
>   * Returns: 0 on success,

And this one should also start like:

/* Truncate a pagelist to the given point. Move extra pages to reserve.
 * .....

?

Cheers,

-- 
Antonio Quartulli

..each of us alone is worth nothing..
Ernesto "Che" Guevara



* Re: [PATCH 1/4] asix: Fix checkpatch warnings
From: Eric Dumazet @ 2012-07-07  8:36 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Christian Riesch, netdev, Oliver Neukum, Eric Dumazet, Allan Chou,
	Mark Lord, Ming Lei, Michael Riesch
In-Reply-To: <CANEJEGvEUYNpWd6FPFnONk4AUFqLsD-oXG7ZRqibqhqxfUYKZA@mail.gmail.com>

On Fri, 2012-07-06 at 14:43 -0700, Grant Grundler wrote:

> Christian is clearly running checkpatch.pl as suggested in
> Documentation/SubmittingPatches. He missed the part about "You should
> be able to justify all violations that remain in your patch." and/or
> didn't know about "fixing existing code is a waste of time".
> 
> Given the extent of the changes Christian is making (factoring out
> asix common code from model specific code), it's helpful to have clean
> checkpatch runs. I don't believe it's a waste of time to apply this
> patch. Is it conflicting with any other code changes that are in
> flight now?

It was a waste of time for me, at least (since I was CCed on the
patch), so I just gave my personal opinion on checkpatch-generated
patches.

Splitting a perfectly good line :

netdev_err(dev->net, "Error reading PHYID register: %02x\n", ret);

into

netdev_err(dev->net, "Error reading PHYID register: %02x\n",
	   ret);

is a clear sign of how stupid checkpatch is.

And the fact that we can spend time on discussions about checkpatch is
astonishing.

Automatic tools should be smart and ease people's tasks, not
slow them down.

Note that Christian's patch series is good in itself; I don't want to block
it at all.


* [PATCH 2/2] irda/pxa:
From: Arnd Bergmann @ 2012-07-07  8:55 UTC (permalink / raw)
  To: Russell King; +Cc: linux-arm-kernel, Samuel Ortiz, netdev
In-Reply-To: <201207070848.30706.arnd@arndb.de>

After c00184f9ab4 "ARM: sa11x0/pxa: convert OS timer registers to IOMEM",
magician_defconfig and a few others fail to build because the OSCR
register is accessed by drivers/net/irda/pxaficp_ir.c but has turned
into a pointer that needs to be read using readl.

There are other registers in the same driver that eventually should
be converted, and it's unclear whether we would want a better interface
to access the OSCR from a device driver.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
This patch should be applied to Russell's ARM tree which contains the
patch that broke it, Cc to netdev for information and Acks.

diff --git a/drivers/net/irda/pxaficp_ir.c b/drivers/net/irda/pxaficp_ir.c
index ff16daf..8d54767 100644
--- a/drivers/net/irda/pxaficp_ir.c
+++ b/drivers/net/irda/pxaficp_ir.c
@@ -289,7 +289,7 @@ static irqreturn_t pxa_irda_sir_irq(int irq, void *dev_id)
 			}
 			lsr = STLSR;
 		}
-		si->last_oscr = OSCR;
+		si->last_oscr = readl_relaxed(OSCR);
 		break;
 
 	case 0x04: /* Received Data Available */
@@ -300,7 +300,7 @@ static irqreturn_t pxa_irda_sir_irq(int irq, void *dev_id)
 		    dev->stats.rx_bytes++;
 	            async_unwrap_char(dev, &dev->stats, &si->rx_buff, STRBR);
 	  	} while (STLSR & LSR_DR);
-		si->last_oscr = OSCR;
+		si->last_oscr = readl_relaxed(OSCR);
 	  	break;
 
 	case 0x02: /* Transmit FIFO Data Request */
@@ -316,7 +316,7 @@ static irqreturn_t pxa_irda_sir_irq(int irq, void *dev_id)
                         /* We need to ensure that the transmitter has finished. */
 			while ((STLSR & LSR_TEMT) == 0)
 				cpu_relax();
-			si->last_oscr = OSCR;
+			si->last_oscr = readl_relaxed(OSCR);
 
 			/*
 		 	* Ok, we've finished transmitting.  Now enable
@@ -370,7 +370,7 @@ static void pxa_irda_fir_dma_tx_irq(int channel, void *data)
 
 	while (ICSR1 & ICSR1_TBY)
 		cpu_relax();
-	si->last_oscr = OSCR;
+	si->last_oscr = readl_relaxed(OSCR);
 
 	/*
 	 * HACK: It looks like the TBY bit is dropped too soon.
@@ -470,7 +470,7 @@ static irqreturn_t pxa_irda_fir_irq(int irq, void *dev_id)
 
 	/* stop RX DMA */
 	DCSR(si->rxdma) &= ~DCSR_RUN;
-	si->last_oscr = OSCR;
+	si->last_oscr = readl_relaxed(OSCR);
 	icsr0 = ICSR0;
 
 	if (icsr0 & (ICSR0_FRE | ICSR0_RAB)) {
@@ -546,7 +546,7 @@ static int pxa_irda_hard_xmit(struct sk_buff *skb, struct net_device *dev)
 		skb_copy_from_linear_data(skb, si->dma_tx_buff, skb->len);
 
 		if (mtt)
-			while ((unsigned)(OSCR - si->last_oscr)/4 < mtt)
+			while ((unsigned)(readl_relaxed(OSCR) - si->last_oscr)/4 < mtt)
 				cpu_relax();
 
 		/* stop RX DMA,  disable FICP */
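
The last hunk relies on unsigned arithmetic to compare two readings of a
free-running counter. A user-space sketch of that comparison follows;
ticks_since() and mtt_elapsed() are hypothetical helpers, and the divide
by 4 is copied from the driver without assuming a particular timer rate.

```c
#include <stdint.h>

/*
 * Ticks elapsed between two readings of a free-running 32-bit counter.
 * Unsigned subtraction wraps modulo 2^32, so the result stays correct
 * across a single counter rollover.
 */
static uint32_t ticks_since(uint32_t now, uint32_t last)
{
	return now - last;
}

/* Mirrors the driver's test: has the minimum turnaround time elapsed? */
static int mtt_elapsed(uint32_t now, uint32_t last, uint32_t mtt)
{
	return ticks_since(now, last) / 4 >= mtt;
}
```

For example, with last = 0xFFFFFFFB and now = 5 the counter has rolled
over, yet ticks_since() still yields 10.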


* Re: [PATCH 2/2] irda/pxa:
From: David Miller @ 2012-07-07  9:41 UTC (permalink / raw)
  To: arnd; +Cc: rmk+kernel, linux-arm-kernel, samuel, netdev
In-Reply-To: <201207070855.15431.arnd@arndb.de>

From: Arnd Bergmann <arnd@arndb.de>
Date: Sat, 7 Jul 2012 08:55:15 +0000

> After c00184f9ab4 "ARM: sa11x0/pxa: convert OS timer registers to IOMEM",
> magician_defconfig and a few others fail to build because the OSCR
> register is accessed by drivers/net/irda/pxaficp_ir.c but has turned
> into a pointer that needs to be read using readl.
> 
> There are other registers in the same driver that eventually should
> be converted, and it's unclear whether we would want a better interface
> to access the OSCR from a device driver.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> This patch should be applied to Russell's ARM tree which contains the
> patch that broke it, Cc to netdev for information and Acks.

Acked-by: David S. Miller <davem@davemloft.net>


* Re: [PATCH v2] bridge: netfilter: fix skb->nf_bridge NULL panic in br_nf_forward_finish
From: Julian Anastasov @ 2012-07-07  9:48 UTC (permalink / raw)
  To: Lin Ming
  Cc: Massimo Cetra, Eric Dumazet, netdev, Stephen Hemminger,
	David S. Miller, Simon Horman
In-Reply-To: <1341622087.4004.2.camel@chief-river-32>


	Hello,

On Sat, 7 Jul 2012, Lin Ming wrote:

> The panic below was triggered when testing IPVS.
> 
> [  579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> [  579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
> [  579.781792] PGD 218f9067 PUD 0
> [  579.781865] Oops: 0000 [#1] SMP
> [  579.781945] CPU 0
> [  579.781983] Modules linked in:
> [  579.782047]
> [  579.782080]
> [  579.782114] Pid: 4644, comm: qemu Tainted: G        W    3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard  /30E8
> [  579.782300] RIP: 0010:[<ffffffff817b1ca5>]  [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
> [  579.782455] RSP: 0018:ffff88007b003a98  EFLAGS: 00010287
> [  579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
> [  579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
> [  579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
> [  579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
> [  579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
> [  579.783177] FS:  0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
> [  579.783306] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
> [  579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
> [  579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
> [  579.783919] Stack:
> [  579.783959]  ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
> [  579.784110]  ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
> [  579.784260]  ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
> [  579.784477] Call Trace:
> [  579.784523]  <IRQ>
> [  579.784562]
> [  579.784603]  [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
> [  579.784707]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
> [  579.784797]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
> [  579.784906]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
> [  579.784995]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
> [  579.785175]  [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
> [  579.785179]  [<ffffffff817ac417>] __br_forward+0x97/0xa2
> [  579.785179]  [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
> [  579.785179]  [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
> [  579.785179]  [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
> [  579.785179]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
> [  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
> [  579.785179]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
> [  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
> [  579.785179]  [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
> [  579.785179]  [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
> [  579.785179]  [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
> [  579.785179]  [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
> [  579.785179]  [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
> [  579.785179]  [<ffffffff816e6800>] net_rx_action+0xdf/0x242
> [  579.785179]  [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
> [  579.785179]  [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
> [  579.785179]  [<ffffffff8188812c>] call_softirq+0x1c/0x30
> 
> The steps to reproduce are as follows:
> 
> 1. On Host1, set up bridge br0 (192.168.1.106)
> 2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
> 3. Start IPVS service on Host1
>    ipvsadm -A -t 192.168.1.106:80 -s rr
>    ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
> 4. Run apache benchmark on Host2(192.168.1.101)
>    ab -n 1000 http://192.168.1.106/
> 
> The panic happened in br_nf_forward_finish because skb->nf_bridge is NULL.
> skb->nf_bridge was set to NULL in the ip_vs_reply4 hook.
> 
> br_nf_forward_ip():
>   NF_HOOK(pf, NF_INET_FORWARD, skb, brnf_get_logical_dev(skb, in), parent,
>                 br_nf_forward_finish);
> 
> This calls IPVS hook ip_vs_reply4.
> 
> ip_vs_reply4
>   ip_vs_out
>     handle_response
>       ip_vs_notrack
>         nf_reset()
>         {
>           skb->nf_bridge = NULL;
>         }
> 
> Julian said,
>     Actually, in this case IPVS just wants to replace nfct
>     with the untracked version. Maybe it is better to replace
>     the nf_reset(skb) call in ip_vs_notrack() with a
>     nf_conntrack_put(skb->nfct) call.
> 
> This patch does what Julian suggested and it fixes the panic.

	Very good. Thanks for tracking and fixing this bug.
Can you send a copy to Simon Horman <horms@verge.net.au>
with the correct Subject? As this change can go to stable
kernels, you can also improve the comments, for example:

ipvs: fix oops on NAT reply in br_nf context

	IPVS should not reset skb->nf_bridge in the FORWARD hook
by calling nf_reset for NAT replies. Doing so triggers an oops in
br_nf_forward_finish.

[here follows your corrected description including
the stack trace]

> Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
> ---
>  include/net/ip_vs.h |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
> index d6146b4..95374d1 100644
> --- a/include/net/ip_vs.h
> +++ b/include/net/ip_vs.h
> @@ -1425,7 +1425,7 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
>  	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
>  
>  	if (!ct || !nf_ct_is_untracked(ct)) {
> -		nf_reset(skb);
> +		nf_conntrack_put(skb->nfct);
>  		skb->nfct = &nf_ct_untracked_get()->ct_general;
>  		skb->nfctinfo = IP_CT_NEW;
>  		nf_conntrack_get(skb->nfct);
> -- 
> 1.7.2.5

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [PATCH v2] bridge: netfilter: fix skb->nf_bridge NULL panic in br_nf_forward_finish
From: Lin Ming @ 2012-07-07 10:00 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: Massimo Cetra, Eric Dumazet, netdev, Stephen Hemminger,
	David S. Miller, Simon Horman
In-Reply-To: <alpine.LFD.2.00.1207071229010.1595@ja.ssi.bg>

On Sat, 2012-07-07 at 12:48 +0300, Julian Anastasov wrote:
> 
> 	Very good. Thanks for tracking and fixing this bug.
> Can you send a copy to Simon Horman <horms@verge.net.au>
> with the correct Subject? As this change can go to stable
> kernels, you can also improve the comments, for example:
> 
> ipvs: fix oops on NAT reply in br_nf context
> 
> 	IPVS should not reset skb->nf_bridge in the FORWARD hook
> by calling nf_reset for NAT replies. Doing so triggers an oops in
> br_nf_forward_finish.
> 
> [here follows your corrected description including
> the stack trace]

How about the version below? Can I have your ACK?
I'll resend this patch in another mail.
===

Subject: [PATCH] ipvs: fix oops on NAT reply in br_nf context

IPVS should not reset skb->nf_bridge in the FORWARD hook
by calling nf_reset for NAT replies. Doing so triggers an oops in
br_nf_forward_finish.

[  579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[  579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[  579.781792] PGD 218f9067 PUD 0 
[  579.781865] Oops: 0000 [#1] SMP 
[  579.781945] CPU 0 
[  579.781983] Modules linked in:
[  579.782047] 
[  579.782080] 
[  579.782114] Pid: 4644, comm: qemu Tainted: G        W    3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard  /30E8
[  579.782300] RIP: 0010:[<ffffffff817b1ca5>]  [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[  579.782455] RSP: 0018:ffff88007b003a98  EFLAGS: 00010287
[  579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
[  579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
[  579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
[  579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
[  579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
[  579.783177] FS:  0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
[  579.783306] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
[  579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
[  579.783919] Stack:
[  579.783959]  ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
[  579.784110]  ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
[  579.784260]  ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
[  579.784477] Call Trace:
[  579.784523]  <IRQ> 
[  579.784562] 
[  579.784603]  [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
[  579.784707]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[  579.784797]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[  579.784906]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[  579.784995]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[  579.785175]  [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
[  579.785179]  [<ffffffff817ac417>] __br_forward+0x97/0xa2
[  579.785179]  [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
[  579.785179]  [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
[  579.785179]  [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
[  579.785179]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[  579.785179]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[  579.785179]  [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
[  579.785179]  [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
[  579.785179]  [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
[  579.785179]  [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
[  579.785179]  [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
[  579.785179]  [<ffffffff816e6800>] net_rx_action+0xdf/0x242
[  579.785179]  [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
[  579.785179]  [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
[  579.785179]  [<ffffffff8188812c>] call_softirq+0x1c/0x30

The steps to reproduce are as follows:

1. On Host1, set up bridge br0(192.168.1.106)
2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
3. Start IPVS service on Host1
   ipvsadm -A -t 192.168.1.106:80 -s rr
   ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
4. Run apache benchmark on Host2(192.168.1.101)
   ab -n 1000 http://192.168.1.106/

ip_vs_reply4
  ip_vs_out
    handle_response
      ip_vs_notrack
        nf_reset()
        {
          skb->nf_bridge = NULL;
        }

In this case IPVS only wants to replace nfct with the
untracked version, so replace the nf_reset(skb) call
in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call.

Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
---
 include/net/ip_vs.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index d6146b4..95374d1 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1425,7 +1425,7 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
 	if (!ct || !nf_ct_is_untracked(ct)) {
-		nf_reset(skb);
+		nf_conntrack_put(skb->nfct);
 		skb->nfct = &nf_ct_untracked_get()->ct_general;
 		skb->nfctinfo = IP_CT_NEW;
 		nf_conntrack_get(skb->nfct);
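
A note on reading the oops above: the fault address (CR2) is
0000000000000004, which is what one would expect when code reads a member
that sits 4 bytes into a structure through a NULL pointer. The sketch below
shows only that address arithmetic. struct example_info and fault_address()
are hypothetical names made up for illustration, not the kernel's
definitions, and the layout assumes a 4-byte int.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-member structure; NOT the kernel's nf_bridge_info. */
struct example_info {
	int use;           /* stand-in for a reference count, offset 0 */
	unsigned int mask; /* stand-in for a flags word, offset 4 with 4-byte int */
};

/*
 * Address the CPU would try to access when reading a member at
 * member_offset through a pointer whose value is base.
 * Nothing is dereferenced here; this is plain arithmetic.
 */
static uintptr_t fault_address(uintptr_t base, size_t member_offset)
{
	return base + member_offset;
}
```

With base 0 (a NULL pointer) and offsetof(struct example_info, mask), the
computed address is 4 on such a platform, matching the shape of the CR2
value in the trace.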

^ permalink raw reply related

* Re: [PATCH v2] bridge: netfilter: fix skb->nf_bridge NULL panic in br_nf_forward_finish
From: Julian Anastasov @ 2012-07-07 10:27 UTC (permalink / raw)
  To: Lin Ming
  Cc: Massimo Cetra, Eric Dumazet, netdev, Stephen Hemminger,
	David S. Miller, Simon Horman
In-Reply-To: <1341655223.7993.3.camel@chief-river-32>


	Hello,

On Sat, 7 Jul 2012, Lin Ming wrote:

> On Sat, 2012-07-07 at 12:48 +0300, Julian Anastasov wrote:
> > 
> > 	Very good. Thanks for tracking and fixing this bug.
> > Can you send a copy to Simon Horman <horms@verge.net.au>
> > with the correct Subject? As this change can go to stable
> > kernels, you can also improve the comments, for example:
> > 
> > ipvs: fix oops on NAT reply in br_nf context
> > 
> > 	IPVS should not reset skb->nf_bridge in the FORWARD hook
> > by calling nf_reset for NAT replies. Doing so triggers an oops in
> > br_nf_forward_finish.
> > 
> > [here follows your corrected description including
> > the stack trace]
> 
> How about the version below? Can I have your ACK?
> I'll resend this patch in another mail.

	Very good. You can add my

Signed-off-by: Julian Anastasov <ja@ssi.bg>

> ===
> 
> Subject: [PATCH] ipvs: fix oops on NAT reply in br_nf context
> 
> IPVS should not reset skb->nf_bridge in the FORWARD hook
> by calling nf_reset for NAT replies. Doing so triggers an oops in
> br_nf_forward_finish.
> 
> [  579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
> [  579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
> [  579.781792] PGD 218f9067 PUD 0 
> [  579.781865] Oops: 0000 [#1] SMP 
> [  579.781945] CPU 0 
> [  579.781983] Modules linked in:
> [  579.782047] 
> [  579.782080] 
> [  579.782114] Pid: 4644, comm: qemu Tainted: G        W    3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard  /30E8
> [  579.782300] RIP: 0010:[<ffffffff817b1ca5>]  [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
> [  579.782455] RSP: 0018:ffff88007b003a98  EFLAGS: 00010287
> [  579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
> [  579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
> [  579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
> [  579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
> [  579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
> [  579.783177] FS:  0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
> [  579.783306] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
> [  579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
> [  579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
> [  579.783919] Stack:
> [  579.783959]  ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
> [  579.784110]  ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
> [  579.784260]  ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
> [  579.784477] Call Trace:
> [  579.784523]  <IRQ> 
> [  579.784562] 
> [  579.784603]  [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
> [  579.784707]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
> [  579.784797]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
> [  579.784906]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
> [  579.784995]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
> [  579.785175]  [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
> [  579.785179]  [<ffffffff817ac417>] __br_forward+0x97/0xa2
> [  579.785179]  [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
> [  579.785179]  [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
> [  579.785179]  [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
> [  579.785179]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
> [  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
> [  579.785179]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
> [  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
> [  579.785179]  [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
> [  579.785179]  [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
> [  579.785179]  [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
> [  579.785179]  [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
> [  579.785179]  [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
> [  579.785179]  [<ffffffff816e6800>] net_rx_action+0xdf/0x242
> [  579.785179]  [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
> [  579.785179]  [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
> [  579.785179]  [<ffffffff8188812c>] call_softirq+0x1c/0x30
> 
> The steps to reproduce are as follows:
> 
> 1. On Host1, set up bridge br0(192.168.1.106)
> 2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
> 3. Start IPVS service on Host1
>    ipvsadm -A -t 192.168.1.106:80 -s rr
>    ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
> 4. Run apache benchmark on Host2(192.168.1.101)
>    ab -n 1000 http://192.168.1.106/
> 
> ip_vs_reply4
>   ip_vs_out
>     handle_response
>       ip_vs_notrack
>         nf_reset()
>         {
>           skb->nf_bridge = NULL;
>         }
> 
> In this case IPVS only wants to replace nfct with the
> untracked version, so replace the nf_reset(skb) call
> in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call.
> 
> Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
> ---
>  include/net/ip_vs.h |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
> index d6146b4..95374d1 100644
> --- a/include/net/ip_vs.h
> +++ b/include/net/ip_vs.h
> @@ -1425,7 +1425,7 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
>  	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
>  
>  	if (!ct || !nf_ct_is_untracked(ct)) {
> -		nf_reset(skb);
> +		nf_conntrack_put(skb->nfct);
>  		skb->nfct = &nf_ct_untracked_get()->ct_general;
>  		skb->nfctinfo = IP_CT_NEW;
>  		nf_conntrack_get(skb->nfct);
> 

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* [PATCH] ipvs: fix oops on NAT reply in br_nf context
From: Lin Ming @ 2012-07-07 10:26 UTC (permalink / raw)
  To: Simon Horman, Julian Anastasov
  Cc: Massimo Cetra, Eric Dumazet, David S. Miller, netdev

IPVS should not reset skb->nf_bridge in the FORWARD hook
by calling nf_reset for NAT replies. Doing so triggers an oops in
br_nf_forward_finish.

[  579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[  579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[  579.781792] PGD 218f9067 PUD 0 
[  579.781865] Oops: 0000 [#1] SMP 
[  579.781945] CPU 0 
[  579.781983] Modules linked in:
[  579.782047] 
[  579.782080] 
[  579.782114] Pid: 4644, comm: qemu Tainted: G        W    3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard  /30E8
[  579.782300] RIP: 0010:[<ffffffff817b1ca5>]  [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[  579.782455] RSP: 0018:ffff88007b003a98  EFLAGS: 00010287
[  579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
[  579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
[  579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
[  579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
[  579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
[  579.783177] FS:  0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
[  579.783306] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
[  579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
[  579.783919] Stack:
[  579.783959]  ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
[  579.784110]  ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
[  579.784260]  ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
[  579.784477] Call Trace:
[  579.784523]  <IRQ> 
[  579.784562] 
[  579.784603]  [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
[  579.784707]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[  579.784797]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[  579.784906]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[  579.784995]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[  579.785175]  [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
[  579.785179]  [<ffffffff817ac417>] __br_forward+0x97/0xa2
[  579.785179]  [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
[  579.785179]  [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
[  579.785179]  [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
[  579.785179]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[  579.785179]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[  579.785179]  [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
[  579.785179]  [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
[  579.785179]  [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
[  579.785179]  [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
[  579.785179]  [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
[  579.785179]  [<ffffffff816e6800>] net_rx_action+0xdf/0x242
[  579.785179]  [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
[  579.785179]  [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
[  579.785179]  [<ffffffff8188812c>] call_softirq+0x1c/0x30

The steps to reproduce are as follows:

1. On Host1, set up bridge br0(192.168.1.106)
2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
3. Start IPVS service on Host1
   ipvsadm -A -t 192.168.1.106:80 -s rr
   ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
4. Run apache benchmark on Host2(192.168.1.101)
   ab -n 1000 http://192.168.1.106/

ip_vs_reply4
  ip_vs_out
    handle_response
      ip_vs_notrack
        nf_reset()
        {
          skb->nf_bridge = NULL;
        }

In this case IPVS only wants to replace nfct with the
untracked version, so replace the nf_reset(skb) call
in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call.

Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
---
 include/net/ip_vs.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index d6146b4..95374d1 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1425,7 +1425,7 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
 
 	if (!ct || !nf_ct_is_untracked(ct)) {
-		nf_reset(skb);
+		nf_conntrack_put(skb->nfct);
 		skb->nfct = &nf_ct_untracked_get()->ct_general;
 		skb->nfctinfo = IP_CT_NEW;
 		nf_conntrack_get(skb->nfct);
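
To illustrate the difference between the two calls outside the kernel, here
is a minimal userspace sketch. All mock_* names are hypothetical stand-ins
and not kernel APIs; mock_reset() models only the behaviour described in
this thread (nf_reset dropping both nfct and nf_bridge), and the reference
counting is reduced to a plain integer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted object standing in for nfct / nf_bridge. */
struct mock_ref {
	int use;
};

/* Hypothetical cut-down packet; the real sk_buff has many more fields. */
struct mock_skb {
	struct mock_ref *nfct;      /* conntrack reference */
	struct mock_ref *nf_bridge; /* bridge-netfilter state */
};

static void mock_get(struct mock_ref *r)
{
	if (r)
		r->use++;
}

static void mock_put(struct mock_ref *r)
{
	if (r) {
		assert(r->use > 0); /* sanity check: never drop below zero */
		r->use--;
	}
}

/* Models the effect attributed to nf_reset(): both pointers are released. */
static void mock_reset(struct mock_skb *skb)
{
	mock_put(skb->nfct);
	skb->nfct = NULL;
	mock_put(skb->nf_bridge);
	skb->nf_bridge = NULL;
}

/* Shape of the code before the patch: the bridge state is lost as a side effect. */
static void mock_notrack_old(struct mock_skb *skb, struct mock_ref *untracked)
{
	mock_reset(skb);
	skb->nfct = untracked;
	mock_get(skb->nfct);
}

/* Shape of the code after the patch: only the conntrack reference is swapped. */
static void mock_notrack_new(struct mock_skb *skb, struct mock_ref *untracked)
{
	mock_put(skb->nfct);
	skb->nfct = untracked;
	mock_get(skb->nfct);
}
```

After mock_notrack_old() the packet's nf_bridge pointer is NULL, which is the
state br_nf_forward_finish trips over in the trace; after mock_notrack_new()
the nf_bridge pointer and its count are untouched.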

^ permalink raw reply related

* Kernel Oops
From: RuanZhijie @ 2012-07-07 12:54 UTC (permalink / raw)
  To: davem; +Cc: netdev, skinsbursky


Hi, all.

Mr. Stanislav Kinsbursky suggested that I send you a report about an oops I encountered in the past few days.

A few days ago, I tested some VMs with NAT enabled under KVM and libvirt, and the kernel crashed when I shut down these VMs, although not every time. I did some searching and found a web page (http://www.spinics.net/lists/netdev/msg193846.html) in which Simon reported a similar issue.

The operating system I use is gentoo-amd64 with the no-multilib profile, kernel version 3.4.0, libvirt-0.9.13 with the USE flags "qemu virt-network" enabled, and qemu-kvm-1.0.1-r1. Here are the steps to reproduce:

1. Define one operation as starting a VM with NAT enabled under KVM and libvirt and then shutting it down immediately.
2. Repeat the operation several times.

I also did 3 tests:

Test 1: 
The host machine ran a regular Linux 3.4.0 kernel, and the VM had NAT enabled. The kernel crashed after 2, 7 and 13 operations.

Test 2:
The host machine ran a regular Linux 3.4.0 kernel, and the VM had no network access. No crash occurred after 100 operations.

Test 3:
The host machine ran a Linux 3.4.0 kernel, but with drivers/net/tun.c reverted to its state just before commit 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d (https://github.com/torvalds/linux/commit/1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d#drivers/net/tun.c) (or you can use a tun.c from a 3.2.0 kernel, according to Simon's report), and the VM had NAT enabled. No crash occurred after 100 operations.

Moreover, I observed that a virtual interface is created to handle network access when a VM with NAT enabled starts, and that the virtual interface is removed when the VM is shut down. Crashes usually occur at the time the virtual interface is removed.

Finally, three types of kernel crash traces were observed; thanks to rsyslog, they were all recorded:

Type 1:
2012-07-06T11:44:31.513203+08:00 timemars NetworkManager[1761]: <warn> /sys/devices/virtual/net/vnet0: couldn't determine device driver; ignoring...
2012-07-06T11:44:31.523305+08:00 timemars kernel: device vnet0 entered promiscuous mode
2012-07-06T11:44:31.532555+08:00 timemars kernel: virbr0: topology change detected, propagating
2012-07-06T11:44:31.532591+08:00 timemars kernel: virbr0: port 1(vnet0) entered forwarding state
2012-07-06T11:44:31.532599+08:00 timemars kernel: virbr0: port 1(vnet0) entered forwarding state
2012-07-06T11:44:33.019292+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-06T11:44:33.021282+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-06T11:44:33.021305+08:00 timemars kernel: device vnet0 left promiscuous mode
2012-07-06T11:44:33.021308+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-06T11:44:33.352293+08:00 timemars kernel: BUG: unable to handle kernel paging request at 00001fff813e1b10
2012-07-06T11:44:33.352452+08:00 timemars kernel: IP: [<ffffffff810bcaed>] __pfn_to_section+0x9/0x28
2012-07-06T11:44:33.352509+08:00 timemars kernel: PGD 0 
2012-07-06T11:44:33.352562+08:00 timemars kernel: Oops: 0000 [#1] SMP 
2012-07-06T11:44:33.352613+08:00 timemars kernel: CPU 1 
2012-07-06T11:44:33.352665+08:00 timemars kernel: Modules linked in:
2012-07-06T11:44:33.352716+08:00 timemars kernel: 
2012-07-06T11:44:33.352770+08:00 timemars kernel: Pid: 2076, comm: libvirtd Not tainted 3.4.0 #1 Dell Inc. Inspiron 1440                   /0K138P
2012-07-06T11:44:33.352826+08:00 timemars kernel: RIP: 0010:[<ffffffff810bcaed>]  [<ffffffff810bcaed>] __pfn_to_section+0x9/0x28
2012-07-06T11:44:33.352878+08:00 timemars kernel: RSP: 0018:ffff8800aacc5d40  EFLAGS: 00010246
2012-07-06T11:44:33.352931+08:00 timemars kernel: RAX: 0000000000000000 RBX: ffffe780281e6600 RCX: fffffe780281e660
2012-07-06T11:44:33.352983+08:00 timemars kernel: RDX: 0000000000003434 RSI: 0000000000000207 RDI: 000003fffff9e00a
2012-07-06T11:44:33.353035+08:00 timemars kernel: RBP: ffff8800a0799820 R08: dead000000100100 R09: dead000000200200
2012-07-06T11:44:33.353053+08:00 timemars kernel: R10: ffff88011fd10b40 R11: ffff88011fd10b40 R12: ffff8800a0799800
2012-07-06T11:44:33.353061+08:00 timemars kernel: R13: ffff8800948ef800 R14: 0000000000000000 R15: ffff8800948ef000
2012-07-06T11:44:33.353094+08:00 timemars kernel: FS:  00007ff98fdf1700(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000
2012-07-06T11:44:33.353103+08:00 timemars kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
2012-07-06T11:44:33.353110+08:00 timemars kernel: CR2: 00001fff813e1b10 CR3: 00000000aaceb000 CR4: 00000000000407e0
2012-07-06T11:44:33.353117+08:00 timemars kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2012-07-06T11:44:33.353143+08:00 timemars kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
2012-07-06T11:44:33.353153+08:00 timemars kernel: Process libvirtd (pid: 2076, threadinfo ffff8800aacc4000, task ffff8800aeaff200)
2012-07-06T11:44:33.353160+08:00 timemars kernel: Stack:
2012-07-06T11:44:33.353169+08:00 timemars kernel: ffffffff810bcb2b ffff8800a0799820 ffffffff810bc004 ffff880118cfc920
2012-07-06T11:44:33.353176+08:00 timemars kernel: ffff8800a2368f00 0000000200005058 0000000000000002 ffff880104aa8618
2012-07-06T11:44:33.353183+08:00 timemars kernel: ffffffff81608dc0 0000000000000000 0000000000000000 0000000200000005
2012-07-06T11:44:33.353190+08:00 timemars kernel: Call Trace:
2012-07-06T11:44:33.353198+08:00 timemars kernel: [<ffffffff810bcb2b>] ? lookup_page_cgroup+0x1f/0x28
2012-07-06T11:44:33.353206+08:00 timemars kernel: [<ffffffff810bc004>] ? mem_cgroup_force_empty+0x1c1/0x496
2012-07-06T11:44:33.353213+08:00 timemars kernel: [<ffffffff810d318d>] ? mntput_no_expire+0x1f/0xf4
2012-07-06T11:44:33.353222+08:00 timemars kernel: [<ffffffff8105f2ef>] ? should_resched+0x5/0x23
2012-07-06T11:44:33.353230+08:00 timemars kernel: [<ffffffff81079d92>] ? cgroup_rmdir+0x9d/0x39c
2012-07-06T11:44:33.353237+08:00 timemars kernel: [<ffffffff8105a4e8>] ? add_wait_queue+0x3c/0x3c
2012-07-06T11:44:33.353244+08:00 timemars kernel: [<ffffffff8105f2ef>] ? should_resched+0x5/0x23
2012-07-06T11:44:33.353250+08:00 timemars kernel: [<ffffffff810c859e>] ? vfs_rmdir+0x67/0xab
2012-07-06T11:44:33.353275+08:00 timemars kernel: [<ffffffff810c8f4b>] ? do_rmdir+0xad/0x101
2012-07-06T11:44:33.353285+08:00 timemars kernel: [<ffffffff810d318d>] ? mntput_no_expire+0x1f/0xf4
2012-07-06T11:44:33.353293+08:00 timemars kernel: [<ffffffff810bd095>] ? filp_close+0x57/0x5f
2012-07-06T11:44:33.353321+08:00 timemars kernel: [<ffffffff813eaf62>] ? system_call_fastpath+0x16/0x1b
2012-07-06T11:44:33.353333+08:00 timemars kernel: Code: 8b bd 28 01 00 00 e8 fc c8 ff ff eb 03 45 31 ff 48 83 c4 68 4c 89 f8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 89 f9 48 c1 ef 16 31 c0 <48> 8b 14 fd c0 1a 6f 81 48 c1 e9 0f 48 85 d2 74 0d 48 89 c8 83 
2012-07-06T11:44:33.353341+08:00 timemars kernel: RIP  [<ffffffff810bcaed>] __pfn_to_section+0x9/0x28
2012-07-06T11:44:33.353366+08:00 timemars kernel: RSP <ffff8800aacc5d40>
2012-07-06T11:44:33.353374+08:00 timemars kernel: CR2: 00001fff813e1b10
2012-07-06T11:44:33.353398+08:00 timemars kernel: ---[ end trace 239af6a79d1fdbe3 ]---

Type 2:
2012-07-06T12:46:13.772228+08:00 timemars NetworkManager[1684]: <warn> /sys/devices/virtual/net/vnet0: couldn't determine device driver; ignoring...
2012-07-06T12:46:13.782523+08:00 timemars kernel: device vnet0 entered promiscuous mode
2012-07-06T12:46:13.792507+08:00 timemars kernel: virbr0: topology change detected, propagating
2012-07-06T12:46:13.792539+08:00 timemars kernel: virbr0: port 1(vnet0) entered forwarding state
2012-07-06T12:46:13.792543+08:00 timemars kernel: virbr0: port 1(vnet0) entered forwarding state
2012-07-06T12:46:15.097601+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-06T12:46:15.097628+08:00 timemars kernel: device vnet0 left promiscuous mode
2012-07-06T12:46:15.097632+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-06T12:46:15.112429+08:00 timemars kernel: BUG: unable to handle kernel paging request at ffffff816d9f715f
2012-07-06T12:46:15.112456+08:00 timemars kernel: IP: [<ffffffff810a9bc6>] filp_close+0x30/0x5f
2012-07-06T12:46:15.112459+08:00 timemars kernel: PGD 15a1067 PUD 0 
2012-07-06T12:46:15.112477+08:00 timemars kernel: Oops: 0000 [#1] SMP 
2012-07-06T12:46:15.112480+08:00 timemars kernel: CPU 0 
2012-07-06T12:46:15.112483+08:00 timemars kernel: Modules linked in:
2012-07-06T12:46:15.112486+08:00 timemars kernel: 
2012-07-06T12:46:15.112489+08:00 timemars kernel: Pid: 2868, comm: qemu-system-x86 Not tainted 3.4.0 #1 Dell Inc. Inspiron 1440                   /0K138P
2012-07-06T12:46:15.112494+08:00 timemars kernel: RIP: 0010:[<ffffffff810a9bc6>]  [<ffffffff810a9bc6>] filp_close+0x30/0x5f
2012-07-06T12:46:15.112497+08:00 timemars kernel: RSP: 0018:ffff8800a676bcc8  EFLAGS: 00010286
2012-07-06T12:46:15.112500+08:00 timemars kernel: RAX: ffffff816d9f70ff RBX: ffff8800a53bafff RCX: 000000000000000f
2012-07-06T12:46:15.112503+08:00 timemars kernel: RDX: 0000000000000000 RSI: ffff88011b26d080 RDI: ffff8800a53bafff
2012-07-06T12:46:15.112506+08:00 timemars kernel: RBP: ffff88011b26d080 R08: ffff8800a40de000 R09: ffff88009bd0f800
2012-07-06T12:46:15.112510+08:00 timemars kernel: R10: ffffffff81130d8d R11: ffffffff812f0aa6 R12: 0000000000000000
2012-07-06T12:46:15.112513+08:00 timemars kernel: R13: 0000000000000001 R14: ffff88009bcc3c80 R15: 0000000000000004
2012-07-06T12:46:15.112516+08:00 timemars kernel: FS:  00007fa1d2654700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
2012-07-06T12:46:15.112519+08:00 timemars kernel: CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
2012-07-06T12:46:15.112522+08:00 timemars kernel: CR2: ffffff816d9f715f CR3: 000000000159f000 CR4: 00000000000427e0
2012-07-06T12:46:15.112525+08:00 timemars kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2012-07-06T12:46:15.112528+08:00 timemars kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
2012-07-06T12:46:15.112532+08:00 timemars kernel: Process qemu-system-x86 (pid: 2868, threadinfo ffff8800a676a000, task ffff88009bc9cec0)
2012-07-06T12:46:15.112542+08:00 timemars kernel: Stack:
2012-07-06T12:46:15.112546+08:00 timemars kernel: ffff88011b26d080 0000000000000000 00000000000fdfbf ffffffff81048e0d
2012-07-06T12:46:15.112548+08:00 timemars kernel: ffffffff81130d8d ffff88009bc9cec0 0000000000000000 00007ffffffff000
2012-07-06T12:46:15.112551+08:00 timemars kernel: ffff88009bc9cec0 ffff88009bc9cec0 0000000000000001 ffffffff810490e7
2012-07-06T12:46:15.112554+08:00 timemars kernel: Call Trace:
2012-07-06T12:46:15.112557+08:00 timemars kernel: [<ffffffff81048e0d>] ? put_files_struct+0x60/0xb9
2012-07-06T12:46:15.112575+08:00 timemars kernel: [<ffffffff81130d8d>] ? exit_sem+0x1e8/0x1f7
2012-07-06T12:46:15.112579+08:00 timemars kernel: [<ffffffff810490e7>] ? do_exit+0x204/0x6df
2012-07-06T12:46:15.112582+08:00 timemars kernel: [<ffffffff8104983e>] ? do_group_exit+0x70/0x9a
2012-07-06T12:46:15.112585+08:00 timemars kernel: [<ffffffff810516ff>] ? get_signal_to_deliver+0x40d/0x42f
2012-07-06T12:46:15.112588+08:00 timemars kernel: [<ffffffff81027796>] ? do_signal+0x38/0x431
2012-07-06T12:46:15.112591+08:00 timemars kernel: [<ffffffff81051a9f>] ? copy_siginfo_to_user+0x5c/0x1bb
2012-07-06T12:46:15.112594+08:00 timemars kernel: [<ffffffff810715a5>] ? sys_futex+0x138/0x147
2012-07-06T12:46:15.112597+08:00 timemars kernel: [<ffffffff81027bc5>] ? do_notify_resume+0x25/0x50
2012-07-06T12:46:15.112600+08:00 timemars kernel: [<ffffffff8105f152>] ? should_resched+0x5/0x23
2012-07-06T12:46:15.112603+08:00 timemars kernel: [<ffffffff813d511b>] ? _cond_resched+0x6/0x1a
2012-07-06T12:46:15.112606+08:00 timemars kernel: [<ffffffff813d6628>] ? int_signal+0x12/0x17
2012-07-06T12:46:15.112610+08:00 timemars kernel: Code: f5 53 48 89 fb 48 8b 47 30 48 85 c0 75 11 48 c7 c7 ec 6d 50 81 45 31 e4 e8 1f 67 32 00 eb 33 48 8b 47 20 45 31 e4 48 85 c0 74 0e <48> 8b 40 60 48 85 c0 74 05 ff d0 41 89 c4 f6 43 3d 40 75 0b 48 
2012-07-06T12:46:15.112613+08:00 timemars kernel: RIP  [<ffffffff810a9bc6>] filp_close+0x30/0x5f
2012-07-06T12:46:15.112616+08:00 timemars kernel: RSP <ffff8800a676bcc8>
2012-07-06T12:46:15.112624+08:00 timemars kernel: CR2: ffffff816d9f715f
2012-07-06T12:46:15.179496+08:00 timemars kernel: ---[ end trace deec135ba51c758d ]---
2012-07-06T12:46:15.179516+08:00 timemars kernel: Fixing recursive fault but reboot is needed!

Type 3:
2012-07-07T19:51:52.532199+08:00 timemars NetworkManager[1778]: <warn> /sys/devices/virtual/net/vnet0: couldn't determine device driver; ignoring...
2012-07-07T19:51:52.539805+08:00 timemars kernel: device vnet0 entered promiscuous mode
2012-07-07T19:51:52.550668+08:00 timemars kernel: virbr0: topology change detected, propagating
2012-07-07T19:51:52.550704+08:00 timemars kernel: virbr0: port 1(vnet0) entered forwarding state
2012-07-07T19:51:52.550713+08:00 timemars kernel: virbr0: port 1(vnet0) entered forwarding state
2012-07-07T19:51:54.245653+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-07T19:51:54.245680+08:00 timemars kernel: device vnet0 left promiscuous mode
2012-07-07T19:51:54.245684+08:00 timemars kernel: virbr0: port 1(vnet0) entered disabled state
2012-07-07T19:51:54.252041+08:00 timemars kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
2012-07-07T19:51:54.252071+08:00 timemars kernel: IP: [<ffffffff810d04f2>] iput+0x3e/0x191
2012-07-07T19:51:54.252074+08:00 timemars kernel: PGD 0 
2012-07-07T19:51:54.252078+08:00 timemars kernel: Oops: 0000 [#1] SMP 
2012-07-07T19:51:54.252080+08:00 timemars kernel: CPU 1 
2012-07-07T19:51:54.252085+08:00 timemars kernel: Modules linked in:
2012-07-07T19:51:54.252088+08:00 timemars kernel: 
2012-07-07T19:51:54.252091+08:00 timemars kernel: Pid: 2608, comm: qemu-system-x86 Not tainted 3.4.0 #1 Dell Inc. Inspiron 1440                   /0K138P
2012-07-07T19:51:54.252095+08:00 timemars kernel: RIP: 0010:[<ffffffff810d04f2>]  [<ffffffff810d04f2>] iput+0x3e/0x191
2012-07-07T19:51:54.252099+08:00 timemars kernel: RSP: 0018:ffff880102fede58  EFLAGS: 00010246
2012-07-07T19:51:54.252102+08:00 timemars kernel: RAX: 0000000000000001 RBX: ffff8800ac78ef20 RCX: ffff88011fd00000
2012-07-07T19:51:54.252105+08:00 timemars kernel: RDX: ffff88011fd00000 RSI: ffff8800ac78ef88 RDI: ffff8800ac78ef88
2012-07-07T19:51:54.252108+08:00 timemars kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff8160c4a0
2012-07-07T19:51:54.252111+08:00 timemars kernel: R10: dead000000200200 R11: ffff880118eb3400 R12: 00000000fffcfaf8
2012-07-07T19:51:54.252115+08:00 timemars kernel: R13: 0000000000000000 R14: ffff880102fede88 R15: 00000000fffcfaf8
2012-07-07T19:51:54.252118+08:00 timemars kernel: FS:  00007f51766358c0(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000
2012-07-07T19:51:54.252121+08:00 timemars kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
2012-07-07T19:51:54.252124+08:00 timemars kernel: CR2: 0000000000000030 CR3: 0000000118d41000 CR4: 00000000000427f0
2012-07-07T19:51:54.252139+08:00 timemars kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2012-07-07T19:51:54.252142+08:00 timemars kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
2012-07-07T19:51:54.252145+08:00 timemars kernel: Process qemu-system-x86 (pid: 2608, threadinfo ffff880102fec000, task ffff8800a5f3da00)
2012-07-07T19:51:54.252148+08:00 timemars kernel: Stack:
2012-07-07T19:51:54.252151+08:00 timemars kernel: ffff880118eb3400 ffff8800ac78e800 00000000fffcfaf8 ffffffff81307563
2012-07-07T19:51:54.252163+08:00 timemars kernel: ffff8800ac78ec00 ffffffff813169ef ffff880102fede88 ffff880102fede88
2012-07-07T19:51:54.252166+08:00 timemars kernel: dead000000100100 ffff8801174bc2a0 ffff8800ac78e800 ffff8800ac78ee80
2012-07-07T19:51:54.252169+08:00 timemars kernel: Call Trace:
2012-07-07T19:51:54.252172+08:00 timemars kernel: [<ffffffff81307563>] ? sk_release_kernel+0x28/0x47
2012-07-07T19:51:54.252175+08:00 timemars kernel: [<ffffffff813169ef>] ? netdev_run_todo+0x1c9/0x1f3
2012-07-07T19:51:54.252178+08:00 timemars kernel: [<ffffffff81244bb3>] ? tun_chr_close+0x4c/0x99
2012-07-07T19:51:54.252180+08:00 timemars kernel: [<ffffffff810bf948>] ? fput+0xf9/0x1ea
2012-07-07T19:51:54.252192+08:00 timemars kernel: [<ffffffff810bd095>] ? filp_close+0x57/0x5f
2012-07-07T19:51:54.252195+08:00 timemars kernel: [<ffffffff810bd111>] ? sys_close+0x74/0xb1
2012-07-07T19:51:54.252198+08:00 timemars kernel: [<ffffffff813eaf62>] ? system_call_fastpath+0x16/0x1b
2012-07-07T19:51:54.252210+08:00 timemars kernel: Code: 00 00 00 40 74 02 0f 0b 48 8d 77 68 48 8d bf 00 01 00 00 e8 29 ef 08 00 85 c0 0f 84 59 01 00 00 48 8b 6b 18 f6 83 80 00 00 00 08 <4c> 8b 65 30 74 11 be 61 05 00 00 48 c7 c7 45 27 52 81 e8 da 5a 
2012-07-07T19:51:54.252214+08:00 timemars kernel: RIP  [<ffffffff810d04f2>] iput+0x3e/0x191
2012-07-07T19:51:54.252217+08:00 timemars kernel: RSP <ffff880102fede58>
2012-07-07T19:51:54.252219+08:00 timemars kernel: CR2: 0000000000000030
2012-07-07T19:51:54.298648+08:00 timemars kernel: ---[ end trace 23837b1c67685f78 ]---

Best wishes,

Zhijie

^ permalink raw reply

* Re: [PATCH] smsc95xx: support ethtool get_regs
From: Émeric Vigier @ 2012-07-07 13:58 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Steve Glendinning, steve glendinning, netdev, Nancy Lin
In-Reply-To: <1341620651.2923.49.camel@bwh-desktop.uk.solarflarecom.com>



----- Original Message -----
> On Fri, 2012-07-06 at 14:15 -0400, Émeric Vigier wrote:
> > From: Emeric Vigier <emeric.vigier@savoirfairelinux.com>
> > 
> > Inspired by implementation in smsc911x.c and smsc9420.c
> > Tested on ARM/pandaboard rev A3
> > 
> > Signed-off-by: Emeric Vigier <emeric.vigier@savoirfairelinux.com>
> > ---
> >  drivers/net/usb/smsc95xx.c |   37 +++++++++++++++++++++++++++++++++++++
> >  1 files changed, 37 insertions(+), 0 deletions(-)
> > 
> > diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
> > index b1112e7..bce14f6 100644
> > --- a/drivers/net/usb/smsc95xx.c
> > +++ b/drivers/net/usb/smsc95xx.c
> > @@ -578,6 +578,41 @@ static int smsc95xx_ethtool_set_eeprom(struct net_device *netdev,
> >  	return smsc95xx_write_eeprom(dev, ee->offset, ee->len, data);
> >  }
> >  
> > +
> > +static int smsc95xx_ethtool_getregslen(struct net_device *dev)
> > +{
> > +	/* all smsc95xx registers plus all phy registers */
> > +	return COE_CR - ID_REV + 1 + 32 * sizeof(u32);
> > +}
> > +
> > +static void
> > +smsc95xx_ethtool_getregs(struct net_device *netdev, struct ethtool_regs *regs,
> > +			 void *buf)
> > +{
> > +	struct usbnet *dev = netdev_priv(netdev);
> > +	unsigned int i, j = 0, retval;
> > +	u32 *data = buf;
> > +
> > +	netif_dbg(dev, hw, dev->net, "ethtool_getregs\n");
> > +
> > +	retval = smsc95xx_read_reg(dev, ID_REV, &regs->version);
> > +	if (retval < 0) {
> > +		netdev_warn(dev->net, "REGS: cannot read ID_REV\n");
> > +		return;
> > +	}
> > +
> > +	for (i = 0; i <= COE_CR; i += (sizeof(u32))) {
> > +		retval = smsc95xx_read_reg(dev, i, &data[j++]);
> > +		if (retval < 0) {
> > +			netdev_warn(dev->net, "REGS: cannot read reg[%x]\n", i);
> > +			return;
> > +		}
> > +	}
> 
> Why does this start with i = 0 whereas the calculation of the length
> uses ID_REV as the starting point?  Maybe ID_REV == 0, but you should be
> consistent in whether you use the name or literal number.

You are right. I will use ID_REV consistently.

> 
> > +	for (i = 0; i <= PHY_SPECIAL; i++)
> > +		data[j++] = smsc95xx_mdio_read(netdev, dev->mii.phy_id, i);
> > +}
> 
> Again, why use PHY_SPECIAL (+ 1) here as opposed to 32 in the
> calculation of the length?

32 was OK, but I hesitated between defining a SMSC95XX_PHY_END and using the last defined register.
Is a 32-register PHY generic to most devices? I mean, could 32 be used widely?
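
For what it's worth, here is a rough userspace sketch of how the length
calculation and the fill loops could share the same named bounds, so the
two cannot drift apart. The register offsets are assumptions modelled on
smsc95xx.h and should be checked against the real header. PHY_REG_COUNT
and the fake_* accessors are hypothetical names used only so the sketch
runs on its own. The count of 32 comes from the 5-bit register address
of a Clause 22 MII management frame, not from anything specific to this
chip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, modelled on smsc95xx.h; verify against the real header. */
#define ID_REV        0x00   /* offset of the first MAC register, in bytes */
#define COE_CR        0x130  /* offset of the last MAC register, in bytes */
#define PHY_REG_COUNT 32     /* hypothetical name; 5-bit MII register address */

/* Number of 32-bit MAC registers in [ID_REV, COE_CR], both ends included. */
static size_t mac_reg_count(void)
{
	return (COE_CR - ID_REV) / sizeof(uint32_t) + 1;
}

/* Buffer size in bytes when every value is stored as a u32. */
static size_t regs_len(void)
{
	return (mac_reg_count() + PHY_REG_COUNT) * sizeof(uint32_t);
}

/* Hypothetical register accessors, stubbed so the sketch runs anywhere. */
static uint32_t fake_read_reg(uint32_t offset) { return offset; }
static uint16_t fake_mdio_read(unsigned int idx) { return (uint16_t)idx; }

/* Fill buf using the same constants as regs_len(); returns bytes written. */
static size_t fill_regs(uint32_t *buf)
{
	size_t j = 0;

	assert(buf != NULL);

	for (uint32_t off = ID_REV; off <= COE_CR; off += sizeof(uint32_t))
		buf[j++] = fake_read_reg(off);
	for (unsigned int i = 0; i < PHY_REG_COUNT; i++)
		buf[j++] = fake_mdio_read(i);

	return j * sizeof(uint32_t);
}
```

Counting the values the loops write, rather than subtracting byte
offsets directly, makes it easy to check that fill_regs() never writes
more than regs_len() bytes.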

> 
> Ben.
> 
> >  static const struct ethtool_ops smsc95xx_ethtool_ops = {
> >  	.get_link	= usbnet_get_link,
> >  	.nway_reset	= usbnet_nway_reset,
> > @@ -589,6 +624,8 @@ static const struct ethtool_ops smsc95xx_ethtool_ops = {
> >  	.get_eeprom_len	= smsc95xx_ethtool_get_eeprom_len,
> >  	.get_eeprom	= smsc95xx_ethtool_get_eeprom,
> >  	.set_eeprom	= smsc95xx_ethtool_set_eeprom,
> > +	.get_regs_len	= smsc95xx_ethtool_getregslen,
> > +	.get_regs	= smsc95xx_ethtool_getregs,
> >  };
> >  
> >  static int smsc95xx_ioctl(struct net_device *netdev, struct ifreq *rq, int cmd)
> 
> --
> Ben Hutchings, Staff Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
> 
> 

-- 
Emeric

^ permalink raw reply

* Re: [PATCH] smsc95xx: support ethtool get_regs
From: Émeric Vigier @ 2012-07-07 14:13 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Steve Glendinning, netdev, Nancy Lin
In-Reply-To: <20120706221102.GA14276@electric-eye.fr.zoreil.com>



----- Original Message -----
> Émeric Vigier <emeric.vigier@savoirfairelinux.com> :
> [...]
> > Yes, they are 16 bits wide according to smsc95xx.h.
> > But other smsc drivers define 32-bit wide PHY regs. I had assumed
> > that smsc would use the same PHY for each ethernet chip.
> 
> SMSC people would surely answer before I find the relevant datasheet.
> 
> Anyway the PHY registers are accessed indirectly through the MII_{ADDR, DATA}
> registers and MII_DATA r/w mask is limited to the lower 16 bits.
> 
> > So would something like s/32 * sizeof(u32)/PHY_SPECIAL *
> > sizeof(u16)/ solve the issue here?
> 
> You would have to pack data[] as well. Or use u16 *.

I will check this out next week.
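
In the meantime, a minimal userspace sketch of what packing could look
like: the MAC registers stay 32 bits wide and each PHY value is masked
to its lower 16 bits and stored as a u16 right after them.
MAC_REG_COUNT is deliberately tiny and hypothetical, and the fake_*
accessors are stand-ins for the real register and MDIO reads, so none
of this reflects the actual driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_REG_COUNT 4    /* hypothetical, kept small for illustration */
#define PHY_REG_COUNT 32   /* 5-bit MII register address */

/* Hypothetical accessors; the MDIO one returns int, low 16 bits valid. */
static uint32_t fake_read_reg(unsigned int idx) { return 0xA0000000u + idx; }
static int fake_mdio_read(unsigned int idx) { return 0x1000 + (int)idx; }

/* Length of the packed dump: u32 MAC registers followed by u16 PHY registers. */
static size_t packed_regs_len(void)
{
	return MAC_REG_COUNT * sizeof(uint32_t) +
	       PHY_REG_COUNT * sizeof(uint16_t);
}

/* Write the packed dump into buf; returns the number of bytes written. */
static size_t fill_packed(void *buf)
{
	unsigned char *p = buf;

	assert(buf != NULL);

	for (unsigned int i = 0; i < MAC_REG_COUNT; i++) {
		uint32_t v = fake_read_reg(i);

		memcpy(p, &v, sizeof(v));
		p += sizeof(v);
	}
	for (unsigned int i = 0; i < PHY_REG_COUNT; i++) {
		/* Keep only the 16 bits the MII data register can carry. */
		uint16_t v = (uint16_t)(fake_mdio_read(i) & 0xffff);

		memcpy(p, &v, sizeof(v));
		p += sizeof(v);
	}
	return (size_t)(p - (unsigned char *)buf);
}
```

Using memcpy() sidesteps alignment questions when u16 values follow u32
values in the same buffer. Casting the tail of the buffer to u16 *
would also work as long as the offset stays 2-byte aligned.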

> 
> > Concerning the ioctl, I found ethtool much easier to use. And I believe
> > smsc9514 is a very popular chipset, so this could help others
> > debugging it.
> 
> # mii-tool -vv e1000
> Using SIOCGMIIPHY=0x8947
> e1000: no autonegotiation, 10baseT-HD, link ok
>   registers for MII PHY 0:
>     1140 796d 0141 0c30 0de1 0021 0004 0000
>     0000 0200 0000 0000 0000 0000 0000 3000
>     0000 0000 0000 0000 0174 0000 0000 0000
>     4100 0000 000d 000f 0000 0000 0000 0000
>   product info: vendor 00:50:43, model 3 rev 0
>   basic mode:   autonegotiation enabled
>   basic status: autonegotiation complete, link ok
>   capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
>   advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
>   link partner: 10baseT-HD
> 
> It is not that bad for the first 32 PHY registers.

I didn't know about mii-tool. Thanks.

> 
> [...]
> > Do you mean LTT? I am not familiar with it, I should have a look.
> 
> Documentation/trace/ftrace.txt

ok

> 
> [...]
> > I should change that in previous "for" loop as well I suppose?
> 
> You may.

Thanks for your patience.

> 
> --
> Ueimor
> 

-- 
Emeric

^ permalink raw reply

* Re: pre-fetching skb for delayed send
From: Benjamin LaHaise @ 2012-07-07 14:18 UTC (permalink / raw)
  To: Ben Greear; +Cc: netdev
In-Reply-To: <4FF7B7E1.4000602@candelatech.com>

On Fri, Jul 06, 2012 at 09:15:29PM -0700, Ben Greear wrote:
> Well, to start with..I at least know the next skb to transmit,
> so I figured I'd prefetch it before starting tx of the current
> skb.

Prefetching data you're just about to immediately access doesn't actually 
help improve performance -- it's better to just access the data.  Prefetching 
subsequent skbs should be of more benefit.

> My question is more basic though:  Given an skb, how do you prefetch
> it...do you just prefetch the skb pointer, or do you need to dig into
> the guts of the skb?

See prefetch.h for details.  Just pass the pointer to the cacheline you want 
to trigger prefetch on to prefetch() or prefetchw(), or use prefetch_range() 
(probably useful for skbs given that they're larger than one cacheline).  
For an skb, you may have to prefetch the frag list as well.
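
As a rough userspace illustration of the pattern (not kernel code),
the sketch below keeps queued packets in an array of pointers and
prefetches entries a few slots ahead of the one being sent. struct pkt
is a hypothetical stand-in for an skb, prefetch_span() only imitates
what prefetch_range() does, and __builtin_prefetch() is the GCC/Clang
builtin rather than the kernel's prefetch(). The cache line size and
both look-ahead distances are assumptions that would need tuning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE  64   /* assumed cache line size in bytes */
#define HDR_AHEAD  4    /* hypothetical look-ahead for packet headers */
#define DATA_AHEAD 2    /* hypothetical look-ahead for packet payloads */

/* Hypothetical stand-in for an skb: a small header pointing at payload. */
struct pkt {
	uint32_t len;
	unsigned char *data;
};

/* Prefetch every cache line in [addr, addr + len). */
static void prefetch_span(const void *addr, size_t len)
{
	const char *p = addr;
	const char *end = p + len;

	for (; p < end; p += CACHELINE)
		__builtin_prefetch(p, 0 /* read */, 3 /* high locality */);
}

/* "Send" all queued packets; returns a byte sum so the work is observable. */
static uint64_t send_all(struct pkt **queue, size_t n)
{
	uint64_t sum = 0;

	assert(queue != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* Warm later packets, not the one about to be touched. */
		if (i + HDR_AHEAD < n)
			prefetch_span(queue[i + HDR_AHEAD], sizeof(struct pkt));
		/* The nearer header was requested earlier, so reading it is cheap. */
		if (i + DATA_AHEAD < n)
			prefetch_span(queue[i + DATA_AHEAD]->data,
				      queue[i + DATA_AHEAD]->len);

		for (uint32_t j = 0; j < queue[i]->len; j++)
			sum += queue[i]->data[j];
	}
	return sum;
}
```

The two distances are staggered because the payload address lives in
the header: the header has to arrive before its data pointer can be
followed without stalling.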

		-ben
-- 
"Thought is the essence of where you are now."

^ permalink raw reply
