Netdev List
* Re: [PATCH] IPVS: replace sprintf to snprintf to avoid stack buffer overflow
From: Patrick McHardy @ 2010-04-08 11:37 UTC (permalink / raw)
  To: Simon Horman
  Cc: wzt.wzt, linux-kernel, Wensong Zhang, Julian Anastasov, netdev,
	lvs-devel
In-Reply-To: <20100407223445.GA15810@verge.net.au>

Simon Horman wrote:
> On Wed, Apr 07, 2010 at 06:09:54PM +0200, Patrick McHardy wrote:
>> Simon Horman wrote:
>>> On Tue, Apr 06, 2010 at 10:50:20AM +0800, wzt.wzt@gmail.com wrote:
>>>> IPVS does not check the length of pp->name, so using sprintf can cause a stack buffer overflow.
>>>> struct ip_vs_protocol{} declares name as char *, so if a protocol is registered as:
>>>> struct ip_vs_protocol ip_vs_test = {
>>>>         .name =			"aaaaaaaa....128...aaa",
>>>> 	.debug_packet =         ip_vs_tcpudp_debug_packet,
>>>> };
>>>>
>>>> then when ip_vs_tcpudp_debug_packet() is called, sprintf(buf, "%s TRUNCATED", pp->name);
>>>> will cause a stack buffer overflow.
>>>>
>>>> Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
>>> I think that the simple answer is, don't do that.
>> Indeed.
>>
>>> But your patch seems entirely reasonable to me.
>>>
>>> Acked-by: Simon Horman <horms@verge.net.au>
>>>
>>> Patrick, please consider merging this.
>> I think this fix is a bit silly, we can simply print the name in
>> the pr_debug() statement and avoid both the potential overflow
>> and truncation.
>>
>> How does this look?
> 
> Looks good to me:
> 
> Acked-by: Simon Horman <horms@verge.net.au>

Thanks, I've applied the patch.
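
A minimal user-space sketch of the overflow discussed in this thread, not taken from the applied patch (which, per the message above, prints the name directly in pr_debug() instead). The helper name format_truncated_msg() and the buffer sizes are assumptions for illustration only; it shows how snprintf() bounds the write and reports truncation where sprintf() would run past the buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Format "<name> TRUNCATED" into buf without writing past buflen bytes.
 * Returns 1 if the whole message fit, 0 if it had to be cut short.
 * Hypothetical helper, not a kernel function. */
int format_truncated_msg(char *buf, size_t buflen, const char *name)
{
	int needed;

	assert(buf != NULL && buflen > 0 && name != NULL);

	/* snprintf() stores at most buflen bytes, NUL included, and returns
	 * the length the complete string would have needed. */
	needed = snprintf(buf, buflen, "%s TRUNCATED", name);
	return needed >= 0 && (size_t)needed < buflen;
}
```

With a 16-byte buffer, a short name such as "TCP" fits, while a 128-character name is cut to 15 characters plus the terminating NUL rather than overflowing the stack.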


* Re: Crashes in xfrm_lookup
From: Mark Brown @ 2010-04-08 11:31 UTC (permalink / raw)
  To: Timo Teräs; +Cc: netdev
In-Reply-To: <4BBDBCF9.5060906@iki.fi>

On Thu, Apr 08, 2010 at 02:24:41PM +0300, Timo Teräs wrote:

> Probably the same as http://marc.info/?t=127071006600005&r=1&w=2

> Happens because CONFIG_XFRM_SUB_POLICY is not enabled, and one of
> the helper functions I used did unexpected things in that case.

> Try the following:

Yes, that seems to make things much happier - thanks!  I do have one
patch I came up with while I was looking at the code (though I'm not
sure it's correct), I'll post that once I've had lunch.


* Re: Crashes in xfrm_lookup
From: Timo Teräs @ 2010-04-08 11:24 UTC (permalink / raw)
  To: Mark Brown; +Cc: netdev
In-Reply-To: <20100408111441.GA14241@sirena.org.uk>

Mark Brown wrote:
> With -next as of today I'm experiencing crashes in the xfrm_lookup code
> when attempting to boot from an NFS root on a SMDK6410 (an ARM based
> development board).  I'm currently investigating what's gone wrong, but
> thought it was better to report early in case it's obvious to someone
> familiar with the code or there's a fix already.

Probably the same as http://marc.info/?t=127071006600005&r=1&w=2

Happens because CONFIG_XFRM_SUB_POLICY is not enabled, and one of
the helper functions I used did unexpected things in that case.

Try the following:

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 625dd61..cccb049 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -735,19 +735,12 @@ static inline void xfrm_pol_put(struct xfrm_policy *policy)
 		xfrm_policy_destroy(policy);
 }
 
-#ifdef CONFIG_XFRM_SUB_POLICY
 static inline void xfrm_pols_put(struct xfrm_policy **pols, int npols)
 {
 	int i;
 	for (i = npols - 1; i >= 0; --i)
 		xfrm_pol_put(pols[i]);
 }
-#else
-static inline void xfrm_pols_put(struct xfrm_policy **pols, int npols)
-{
-	xfrm_pol_put(pols[0]);
-}
-#endif
 
 extern void __xfrm_state_destroy(struct xfrm_state *);
 


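A user-space sketch of why the loop form is the safer helper. The reading that the removed #else variant released pols[0] even when npols was 0 is an assumption drawn from the diff, since the message only says the helper "did unexpected things". struct policy and pol_put() are hypothetical stand-ins for struct xfrm_policy and xfrm_pol_put().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xfrm_policy: only a reference count. */
struct policy {
	int refcnt;
};

/* Drop one reference; stand-in for xfrm_pol_put(). */
static void pol_put(struct policy *p)
{
	assert(p != NULL);	/* the real helper would dereference p here */
	p->refcnt--;
}

/* Loop form kept by the patch: it touches exactly npols entries, so an
 * empty array (npols == 0) is never dereferenced. */
void pols_put(struct policy **pols, int npols)
{
	int i;

	for (i = npols - 1; i >= 0; --i)
		pol_put(pols[i]);
}
```

Calling pols_put() with npols == 0 is a no-op here, whereas a variant that unconditionally releases pols[0] would pass an unset pointer to pol_put().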

* Re: [PATCH v3 0/4] xfrm: add x86 CONFIG_COMPAT support
From: Patrick McHardy @ 2010-04-08 11:24 UTC (permalink / raw)
  To: David Miller; +Cc: fw, netdev, johannes
In-Reply-To: <20100408.025412.247148013.davem@davemloft.net>

David Miller wrote:
> From: Patrick McHardy <kaber@trash.net>
> Date: Thu, 08 Apr 2010 11:44:43 +0200
> 
>> Either the kernel or the userspace programs have to be updated
>> either way.
> 
> The only case in which we only have to change one side is if we add
> the full set of compat support to the kernel.
> 
> If we take any other option (new XFRM numbers and new datastructures,
> or only convert sendmsg() to do compat translations), it requires both
> the kernel and userspace to change.

You're right of course.

> And the currently existing 32-bit binaries don't work on 64-bit
> kernels because of something that cannot be classified any other way
> than as being a kernel bug.

Agreed, but since it's not a new bug, I think we have some flexibility
in how to fix it. In the wireless case we had no real choice but to
add the COMPAT_NETLINK thing, but it's a bit sad to add more of this
crap to netlink.

Anyways, I guess there's no reason why we couldn't do both and
transition to a fixed API in the long term.


* Crashes in xfrm_lookup
From: Mark Brown @ 2010-04-08 11:14 UTC (permalink / raw)
  To: netdev; +Cc: Timo Teräs

With -next as of today I'm experiencing crashes in the xfrm_lookup code
when attempting to boot from an NFS root on a SMDK6410 (an ARM based
development board).  I'm currently investigating what's gone wrong, but
thought it was better to report early in case it's obvious to someone
familiar with the code or there's a fix already.

Sample backtrace below:

[    6.540000] Unable to handle kernel NULL pointer dereference at virtual addrc
[    6.540000] pgd = c0004000                                                   
[    6.550000] [0000003c] *pgd=00000000                                         
[    6.550000] Internal error: Oops: 5 [#1]                                     
[    6.550000] last sysfs file:                                                 
[    6.550000] Modules linked in:                                               
[    6.550000] CPU: 0    Not tainted (2.6.34-rc3-next-20100408-00011-g3d8ee7a-)
[    6.550000] PC is at __xfrm_lookup+0x308/0x3a8                               
[    6.550000] LR is at ip_route_output_flow+0x7c/0x210                         

...

[    6.550000] [<c02c1394>] (__xfrm_lookup+0x308/0x3a8) from [<c02889a4>] (ip_r)
[    6.550000] [<c02889a4>] (ip_route_output_flow+0x7c/0x210) from [<c02ade9c>])
[    6.550000] [<c02ade9c>] (udp_sendmsg+0x320/0x5ec) from [<c02b3b20>] (inet_s)
[    6.550000] [<c02b3b20>] (inet_sendmsg+0x58/0x64) from [<c02612d8>] (sock_se)
[    6.550000] [<c02612d8>] (sock_sendmsg+0x88/0xa8) from [<c0261338>] (kernel_)
[    6.550000] [<c0261338>] (kernel_sendmsg+0x40/0x7c) from [<c02d0dd0>] (xs_se)
[    6.550000] [<c02d0dd0>] (xs_send_kvec+0x94/0xa4) from [<c02d0e70>] (xs_send)
[    6.550000] [<c02d0e70>] (xs_sendpages+0x90/0x204) from [<c02d1268>] (xs_udp)
[    6.550000] [<c02d1268>] (xs_udp_send_request+0x3c/0x120) from [<c02cf254>] )
[    6.550000] [<c02cf254>] (xprt_transmit+0x158/0x25c) from [<c02cc9c0>] (call)
[    6.550000] [<c02cc9c0>] (call_transmit+0x204/0x284) from [<c02d380c>] (__rp)
[    6.550000] [<c02d380c>] (__rpc_execute+0x94/0x2c4) from [<c02cd35c>] (rpc_r)
[    6.550000] [<c02cd35c>] (rpc_run_task+0x60/0x6c) from [<c02cd4a0>] (rpc_cal)
[    6.550000] [<c02cd4a0>] (rpc_call_sync+0x58/0x84) from [<c02cd51c>] (rpc_pi)
[    6.550000] [<c02cd51c>] (rpc_ping+0x50/0x70) from [<c02cded0>] (rpc_create+)
[    6.550000] [<c02cded0>] (rpc_create+0x3e4/0x498) from [<c013c194>] (nfs_mou)
[    6.550000] [<c013c194>] (nfs_mount+0xd8/0x1c4) from [<c0016d14>] (nfs_root_)
[    6.550000] [<c0016d14>] (nfs_root_data+0x2cc/0x398) from [<c0008fb8>] (moun)
[    6.550000] [<c0008fb8>] (mount_root+0x1c/0x104) from [<c0009204>] (prepare_)
[    6.550000] [<c0009204>] (prepare_namespace+0x164/0x1c8) from [<c0008478>] ()
[    6.550000] [<c0008478>] (kernel_init+0x108/0x148) from [<c002df5c>] (kernel)



* HTB - What's the minimal value for 'rate' parameter?
From: Antonio Almeida @ 2010-04-08 11:07 UTC (permalink / raw)
  To: netdev, jarkao2, kaber, davem, devik

Hi!
I've been using HTB for a while, and we've already exchanged some e-mails
while resolving the HTB accuracy issue.
When using HTB, I realised that for some configurations the rate limit
doesn't work.
I suspect that the problem is the minimum value of the rate parameter,
which I can't figure out.

A simple configuration that turns out to misbehave is as follows: The
root (1:1) gets the link bandwidth configuration; the second (1:2) is
set to 4096Kbit; then I have two branches (1:10 and 1:11) with rate
1024Kbit and ceil 4096Kbit; and finally a leaf class in each branch
(1:111 below 1:11, and 1:101 below 1:10) with rate 8bit and ceil
4096Kbit, and the same priority.
I don't want to have a sustained rate, and since I must configure the 'rate'
parameter I decided to set it to 8bit, which is the minimal accepted
value. My suspicion falls on the 'rate' parameter: if I set it to
1Kbit, for instance, the problem disappears and the shaping is done
perfectly.
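
To give a sense of the scale involved, here is a small arithmetic sketch. It does not establish the cause of the problem and makes no claim about HTB internals. It assumes that tc's "bit" means bits per second, that "Kbit" means 1000 bits per second, and that a full-size frame is 1514 bytes (the sfq "allot" value in the output below). usecs_to_send() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Microseconds needed to serialise pkt_bytes at rate_bps bits per second,
 * rounded down. Hypothetical helper for illustration. */
uint64_t usecs_to_send(uint64_t pkt_bytes, uint64_t rate_bps)
{
	assert(rate_bps > 0);
	return pkt_bytes * 8 * 1000000ULL / rate_bps;
}
```

Under those assumptions one 1514-byte frame takes 1,514,000,000 microseconds (about 25 minutes) at 8bit, roughly 12.1 seconds at 1Kbit, and about 2.96 milliseconds at 4096Kbit. Whether a per-packet time as large as the 8bit case can be represented by the shaper is exactly the open question in this message.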

So, I'm looking for help to find out if the problem is actually in
this parameter configuration or if it's just coincidence and I'll get
the same problem ahead :(
What's the minimal value for 'rate' parameter using HTB qdisc?

Here's the tc command output, using leaves rate set to 8bit:

# tc -s class list dev eth1
class htb 1:101 parent 1:10 leaf 101: prio 3 rate 8bit ceil 4096Kbit burst 225b cburst 3655b
 Sent 42305702 bytes 27943 pkt (dropped 23031, overlimits 0 requeues 0)
 rate 4036Kbit 333pps backlog 0b 126p requeues 0
 lended: 27817 borrowed: 0 giants: 0
 tokens: 1250000000 ctokens: -39387

class htb 1:11 parent 1:2 rate 1024Kbit ceil 4096Kbit burst 2113b cburst 3655b
 Sent 42170956 bytes 27854 pkt (dropped 0, overlimits 0 requeues 0)
 rate 4035Kbit 333pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: -937499999 ctokens: -42881

class htb 1:10 parent 1:2 rate 1024Kbit ceil 4096Kbit burst 2113b cburst 3655b
 Sent 42114938 bytes 27817 pkt (dropped 0, overlimits 0 requeues 0)
 rate 4035Kbit 333pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: -937499999 ctokens: -39387

class htb 1:1 root rate 1000Mbit ceil 1000Mbit burst 503375b cburst 503375b
 Sent 84285894 bytes 55671 pkt (dropped 0, overlimits 0 requeues 0)
 rate 8071Kbit 666pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: 62750 ctokens: 62750

class htb 1:111 parent 1:11 leaf 111: prio 3 rate 8bit ceil 4096Kbit burst 225b cburst 3655b
 Sent 42363234 bytes 27981 pkt (dropped 23064, overlimits 0 requeues 0)
 rate 4035Kbit 333pps backlog 0b 127p requeues 0
 lended: 27854 borrowed: 0 giants: 0
 tokens: 1250000000 ctokens: -42881

class htb 1:2 parent 1:1 rate 4096Kbit ceil 4096Kbit burst 3655b cburst 3655b
 Sent 84285894 bytes 55671 pkt (dropped 0, overlimits 0 requeues 0)
 rate 8071Kbit 666pps backlog 0b 0p requeues 0
 lended: 0 borrowed: 0 giants: 0
 tokens: -937499999 ctokens: -937499999

class sfq 111:16 parent 111:
 (dropped 0, overlimits 0 requeues 0)
 backlog 0b 127p requeues 0
 allot 1514

class sfq 101:252 parent 101:
 (dropped 0, overlimits 0 requeues 0)
 backlog 0b 126p requeues 0
 allot 1514


Regards
  Antonio Almeida


* Re: dhcp client packet sniffing...
From: David Miller @ 2010-04-08 11:01 UTC (permalink / raw)
  To: nhorman; +Cc: netdev, herbert
In-Reply-To: <20100408105951.GA16868@hmsreliant.think-freely.org>

From: Neil Horman <nhorman@tuxdriver.com>
Date: Thu, 8 Apr 2010 06:59:51 -0400

> Dave, sorry to be late to the discussion, but can you refresh us on
> why dhcp has to sniff every packet?

It wants to see DHCP packets from the server.

And it can't use ipv4 RAW sockets due to some limitations wrt. getting
at the link layer headers when using that interface.


* Re: dhcp client packet sniffing...
From: Neil Horman @ 2010-04-08 10:59 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, herbert
In-Reply-To: <20100408.035049.177640912.davem@davemloft.net>

On Thu, Apr 08, 2010 at 03:50:49AM -0700, David Miller wrote:
> 
> This is an old topic, but looking at traces tonight I was reminded
> about it.
> 
> dhcp clients sniff every packet in the system. Why they do this, and
> what we can do so that they no longer need to, has been
> discussed before.  Actually, I don't remember where we got to with
> that, or whether we were able to make it such that the dhcp client
> doesn't have to do this any more.  Herbert?
> 
Dave, sorry to be late to the discussion, but can you refresh us on why dhcp has
to sniff every packet?
Regards
Neil



* dhcp client packet sniffing...
From: David Miller @ 2010-04-08 10:50 UTC (permalink / raw)
  To: netdev; +Cc: herbert


This is an old topic, but looking at traces tonight I was reminded
about it.

dhcp clients sniff every packet in the system. Why they do this, and
what we can do so that they no longer need to, has been
discussed before.  Actually, I don't remember where we got to with
that, or whether we were able to make it such that the dhcp client
doesn't have to do this any more.  Herbert?

But, in any event, the fact of the matter is that currently it still
does on many machines.

This means every packet in the machine gets sniffed.

The DHCP client at least installs a socket filter that only accepts
the packets that the DHCP client is actually interested in.

The problem is that we clone the SKB and do some other operations
before running the socket filter.

I was thinking, what if we simply move the sk_filter() call up to
dev_queue_xmit_nit()?  And if sk_filter() rejects we don't even need
to clone the packet.
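
The ordering idea can be modelled in user space. This is only a sketch of "filter before copying", not of the kernel's tap path: struct pkt, dhcp_filter() and both delivery functions are hypothetical, a struct copy stands in for skb_clone(), and the question of whether sk_filter() can safely run at that point in dev_queue_xmit_nit() is not addressed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet: only the UDP destination port matters here. */
struct pkt {
	unsigned short udp_dport;
};

/* Stand-in for the socket filter a DHCP client installs: accept only
 * packets addressed to the DHCP client port (68). */
static int dhcp_filter(const struct pkt *p)
{
	return p->udp_dport == 68;
}

/* Copy every packet first, filter afterwards. Returns copies made. */
size_t deliver_clone_then_filter(const struct pkt *pkts, size_t n,
				 size_t *accepted)
{
	size_t i, clones = 0;

	assert(accepted != NULL && (pkts != NULL || n == 0));
	*accepted = 0;
	for (i = 0; i < n; i++) {
		struct pkt copy = pkts[i];	/* models skb_clone() */

		clones++;
		*accepted += dhcp_filter(&copy);
	}
	return clones;
}

/* Filter on the original, copy only what passes. Returns copies made. */
size_t deliver_filter_then_clone(const struct pkt *pkts, size_t n,
				 size_t *accepted)
{
	size_t i, clones = 0;

	assert(accepted != NULL && (pkts != NULL || n == 0));
	*accepted = 0;
	for (i = 0; i < n; i++) {
		struct pkt copy;

		if (!dhcp_filter(&pkts[i]))
			continue;	/* rejected: no copy is ever made */
		copy = pkts[i];		/* models skb_clone() */
		clones++;
		*accepted += dhcp_filter(&copy);
	}
	return clones;
}
```

Both variants accept the same packets; the second one only pays for a copy when the filter passes, which is the saving the proposal is after.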


* FEC driver: rcv is not +last
From: Matthias Kaehlcke @ 2010-04-08 10:40 UTC (permalink / raw)
  To: netdev; +Cc: Sascha Hauer

hi,

i have problems with the FEC on an i.MX25 3-Stack board. the kernel is
v2.6.34-rc2 plus the following patch:
http://patchwork.ozlabs.org/patch/41235/

the following traces are generated at boot time:

FEC Ethernet Driver
fec: PHY @ 0x1, ID 0x20005ce1 -- unknown PHY!
...
eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
...
FEC ENET: rcv is not +last
FEC ENET: rcv is not +last
FEC ENET: rcv is not +last
FEC ENET: rcv is not +last
...

the PHY of the board is a DP83840, which is not supported by the
driver. could this be the problem? i tried to make the kernel think
the DP83840 is a DP83848, which is supported, but the behaviour is the
same except the 'unknown PHY' warning.

any idea what could be wrong?

-- 
Matthias Kaehlcke
Embedded Linux Developer
Barcelona

              You can't separate peace from freedom because no
               one can be at peace unless he has his freedom
                              (Malcolm X)
                                                                 .''`.
    using free software / Debian GNU/Linux | http://debian.org  : :'  :
                                                                `. `'`
gpg --keyserver pgp.mit.edu --recv-keys 47D8E5D4                  `-


* Re: IP/UDP encapsulation
From: ZioPRoTo (Saverio Proto) @ 2010-04-08 10:39 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Mark Smith, Gustavo F. Padovan, netdev, marco bonola,
	Behling Mario, L. Aaron Kaplan
In-Reply-To: <1270722573.2215.47.camel@edumazet-laptop>

>> I'm a bit confused. How can tunnelling IP in UDP in IP be faster than IP in IP?
>>
>
> Maybe the 'gateway' doesn't handle IPIP at all ;)

Yes, that's the point. IP in UDP is more widely supported. It has a much
easier time when your traffic travels over the Internet and may
have to pass through some NAT. Some NATs will not handle IP in IP
packets at all.

IP in IP of course has less overhead, but you can run into problems
in many network setups. This is why we at Freifunk are thinking of
developing this kernel module.
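
The overhead difference mentioned above is small and easy to quantify. The sketch below assumes an IPv4 outer header without options (20 bytes), the fixed 8-byte UDP header, and no additional tunnel header of the module's own; the function names are hypothetical.

```c
#include <assert.h>

enum {
	IPV4_HDR_MIN = 20,	/* IPv4 header without options */
	UDP_HDR      = 8	/* fixed UDP header */
};

/* Bytes added per packet by IP-in-IP: one outer IPv4 header. */
int ipip_overhead(void)
{
	return IPV4_HDR_MIN;
}

/* Bytes added per packet by IP-in-UDP: outer IPv4 plus UDP header. */
int ip_in_udp_overhead(void)
{
	return IPV4_HDR_MIN + UDP_HDR;
}

/* Largest inner packet that still fits in link_mtu once encapsulated. */
int inner_mtu(int link_mtu, int overhead)
{
	assert(link_mtu > overhead);
	return link_mtu - overhead;
}
```

On a 1500-byte link this gives an inner MTU of 1480 bytes for IP-in-IP and 1472 bytes for IP-in-UDP, so the price of NAT friendliness is 8 bytes per packet under these assumptions.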

Regards

Saverio Proto


* Re: IP/UDP encapsulation
From: Eric Dumazet @ 2010-04-08 10:29 UTC (permalink / raw)
  To: Mark Smith
  Cc: Gustavo F. Padovan, netdev, marco bonola,
	ZioPRoTo (Saverio Proto), Behling Mario, L. Aaron Kaplan
In-Reply-To: <20100408191831.08cd8d7b@opy.nosense.org>

On Thursday 08 April 2010 at 19:18 +0930, Mark Smith wrote:
> On Thu, 8 Apr 2010 04:42:47 -0300
> "Gustavo F. Padovan" <gustavo@padovan.org> wrote:
> 
> > Hi,
> > 
> > I'm looking for some advice on that work. The Freifunk organization is
> > planning work on the IP/UDP encapsulation kernel module as a GSoC
> > project. The idea is to create an IP-in-UDP tunnel like we do for
> > IP-in-IP or IP-in-GRE tunnels. The only way to do that today is to use
> > some VPN software.
> > 
> > The module will export its virtual interface through sockets and will
> > have support for the standard syscalls like the other encapsulation
> > modules.
> > 
> > It will improve the performance of mesh networks, which will be able
> > to use IP-in-UDP rather than IP-in-IP.
> 
> I'm a bit confused. How can tunnelling IP in UDP in IP be faster than IP in IP?
> 

Maybe the 'gateway' doesn't handle IPIP at all ;)

Until 2.6.32, IPIP tunnels were not as scalable as UDP (which is RCU enabled):
ipip_rcv() was hitting a global rwlock.

git describe 8f95dd63a2ab6fe7243c4f0bd2c3266e3a5525ab
v2.6.32-rc3-468-g8f95dd6





* Re: [PATCH v3 0/4] xfrm: add x86 CONFIG_COMPAT support
From: David Miller @ 2010-04-08  9:54 UTC (permalink / raw)
  To: kaber; +Cc: fw, netdev, johannes
In-Reply-To: <4BBDA58B.8090301@trash.net>

From: Patrick McHardy <kaber@trash.net>
Date: Thu, 08 Apr 2010 11:44:43 +0200

> Either the kernel or the userspace programs have to be updated
> either way.

The only case in which we only have to change one side is if we add
the full set of compat support to the kernel.

If we take any other option (new XFRM numbers and new datastructures,
or only convert sendmsg() to do compat translations), it requires both
the kernel and userspace to change.

And the currently existing 32-bit binaries don't work on 64-bit
kernels because of something that cannot be classified any other way
than as being a kernel bug.
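
For readers unfamiliar with why a 32-bit binary can disagree with a 64-bit kernel about a message layout, here is a generic sketch. It is an assumption that this class of problem (64-bit fields aligned to 4 bytes on i386 but to 8 bytes on x86_64) is what is at issue; the messages here do not name the affected structures. The structs are hypothetical, not real XFRM definitions, and the reduced-alignment typedef relies on a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit integer with 4-byte alignment, as the i386 ABI lays it out
 * inside structs. GCC/Clang extension. */
typedef uint64_t __attribute__((aligned(4))) u64_align4;

/* Hypothetical message layouts, not real XFRM structures. */
struct msg_native {
	uint32_t id;
	uint64_t bytes;		/* naturally aligned on the build host */
};

struct msg_compat {
	uint32_t id;
	u64_align4 bytes;	/* packed right after id, as on i386 */
};

size_t compat_bytes_offset(void)
{
	return offsetof(struct msg_compat, bytes);
}

size_t compat_msg_size(void)
{
	return sizeof(struct msg_compat);
}

/* Bytes by which the two layouts disagree about the message size. */
size_t layout_size_gap(void)
{
	assert(sizeof(struct msg_native) >= sizeof(struct msg_compat));
	return sizeof(struct msg_native) - sizeof(struct msg_compat);
}
```

On a typical 64-bit ABI the native struct is 16 bytes with the 64-bit field at offset 8, while the 32-bit style layout is 12 bytes with the field at offset 4, so a receiver that is not told which layout it was given reads the wrong bytes.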


* Re: IP/UDP encapsulation
From: Mark Smith @ 2010-04-08  9:48 UTC (permalink / raw)
  To: Gustavo F. Padovan
  Cc: netdev, marco bonola, ZioPRoTo (Saverio Proto), Behling Mario,
	L. Aaron Kaplan
In-Reply-To: <20100408074247.GA19798@vigoh>

On Thu, 8 Apr 2010 04:42:47 -0300
"Gustavo F. Padovan" <gustavo@padovan.org> wrote:

> Hi,
> 
> I'm looking for some advice on that work. The Freifunk organization is
> planning work on the IP/UDP encapsulation kernel module as a GSoC
> project. The idea is to create an IP-in-UDP tunnel like we do for
> IP-in-IP or IP-in-GRE tunnels. The only way to do that today is to use
> some VPN software.
> 
> The module will export its virtual interface through sockets and will
> have support for the standard syscalls like the other encapsulation
> modules.
> 
> It will improve the performance of mesh networks, which will be able
> to use IP-in-UDP rather than IP-in-IP.

I'm a bit confused. How can tunnelling IP in UDP in IP be faster than IP in IP?


> So, instead of pushing all data to
> a local gateway in the mesh, all the data can be tunneled to a faster
> server and from there to the Internet, with all the data exiting with
> the same IP address (the fast server's IP). That will improve bandwidth,
> especially for upload.
> 
> Is such module acceptable for merge into the Linux Kernel?
> 
> Any comments or suggestions to the module architecture and
> implementation? If you want more information about the module I can
> provide that.
> 
> Regards,
> 
> -- 
> Gustavo F. Padovan
> http://padovan.org


* Re: [PATCH v3 0/4] xfrm: add x86 CONFIG_COMPAT support
From: Patrick McHardy @ 2010-04-08  9:44 UTC (permalink / raw)
  To: David Miller; +Cc: fw, netdev, johannes
In-Reply-To: <20100407.164842.54065324.davem@davemloft.net>

David Miller wrote:
> From: Patrick McHardy <kaber@trash.net>
> Date: Wed, 07 Apr 2010 15:45:51 +0200
> 
>> Florian Westphal wrote:
>>> David Miller <davem@davemloft.net> wrote:
>>>> From: Florian Westphal <fw@strlen.de>
>>>> Date: Tue,  6 Apr 2010 00:27:07 +0200
>>> [..]
>>>
>>>>> I sent a patch that solved this by adding a sys_compat_write syscall
>>>>> and a ->compat_aio_write() to struct file_operations to the
>>>>> vfs mailing list, but that patch was ignored by the vfs people,
>>>>> and the x86 folks did not exactly like the idea either.
>>>>>
>>>>> So this leaves three alternatives:
>>>>> 1 - drop the whole idea and keep the current status.
>>>>> 2 - Add new structure definitions (with new numbering) that would work
>>>>>     everywhere, keep the old ones for backwards compatibility (This
>>>>>     was suggested by Arnd Bergmann).
>> Given that there is only a quite small number of users of this
>> interface, that would in my opinion be the best way.
> 
> Can you explain that line of reasoning?
> 
> It's not that there are only "3 or 4 tools" using these interfaces,
> it's the fact that 32-bit binaries of those tools are on millions and
> millions of systems out there.

Yeah, but they're currently not working if used on 64-bit hosts,
so I don't think it's unreasonable to consider just the number of
tools that need to be fixed. Either the kernel or the userspace
programs have to be updated either way.


* Re: [PATCH] tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb
From: Joe Perches @ 2010-04-08  9:20 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, herbert
In-Reply-To: <20100408.012617.98660541.davem@davemloft.net>

On Thu, 2010-04-08 at 01:26 -0700, David Miller wrote:
> Fix this by setting skb->ip_summed in the common non-data packet
> constructor.  It already is setting skb->csum to zero.
> Signed-off-by: David S. Miller <davem@davemloft.net>
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index f181b78..00afbb0 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -349,6 +349,7 @@ static inline void TCP_ECN_send(struct sock *sk, struct sk_buff *skb,
>   */
>  static void tcp_init_nondata_skb(struct sk_buff *skb, u32 seq, u8 flags)
>  {
> +	skb->ip_summed = CHECKSUM_PARTIAL;
>  	skb->csum = 0;
>  
>  	TCP_SKB_CB(skb)->flags = flags;

There might be trivial value in using the
struct layout order for the sets avoiding
crossing cachelines.

from:

static void tcp_init_nondata_skb(struct sk_buff *skb, u32 seq, u8 flags)
{
	skb->ip_summed = CHECKSUM_PARTIAL;
	skb->csum = 0;

	TCP_SKB_CB(skb)->flags = flags;
	TCP_SKB_CB(skb)->sacked = 0;

	skb_shinfo(skb)->gso_segs = 1;
	skb_shinfo(skb)->gso_size = 0;
	skb_shinfo(skb)->gso_type = 0;

	TCP_SKB_CB(skb)->seq = seq;
	if (flags & (TCPCB_FLAG_SYN | TCPCB_FLAG_FIN))
		seq++;
	TCP_SKB_CB(skb)->end_seq = seq;
}

to:

static void tcp_init_nondata_skb(struct sk_buff *skb, u32 seq, u8 flags)
{
	skb->ip_summed = CHECKSUM_PARTIAL; 
	skb->csum = 0;

	TCP_SKB_CB(skb)->seq = seq;
	if (flags & (TCPCB_FLAG_SYN | TCPCB_FLAG_FIN))
		seq++;
	TCP_SKB_CB(skb)->end_seq = seq;
	TCP_SKB_CB(skb)->sacked = 0;
	TCP_SKB_CB(skb)->flags = flags;

	skb_shinfo(skb)->gso_size = 0;
	skb_shinfo(skb)->gso_segs = 1;
	skb_shinfo(skb)->gso_type = 0;
}
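
The suggestion can be checked mechanically with offsetof(). The sketch below uses a hypothetical control block, not the kernel's struct tcp_skb_cb, whose real field order should be checked in the tree rather than inferred from this example. DEMO_FLAG_SYN and DEMO_FLAG_FIN are likewise stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FLAG_FIN 0x01
#define DEMO_FLAG_SYN 0x02

/* Hypothetical control block; NOT the kernel's struct tcp_skb_cb. */
struct demo_cb {
	uint32_t seq;
	uint32_t end_seq;
	uint8_t  sacked;
	uint8_t  flags;
};

/* Returns 1 when the fields are declared in the same order in which
 * demo_cb_init() below writes them. */
int demo_cb_layout_is_ascending(void)
{
	return offsetof(struct demo_cb, seq) < offsetof(struct demo_cb, end_seq) &&
	       offsetof(struct demo_cb, end_seq) < offsetof(struct demo_cb, sacked) &&
	       offsetof(struct demo_cb, sacked) < offsetof(struct demo_cb, flags);
}

/* Store each field once, walking the struct from low to high offsets. */
void demo_cb_init(struct demo_cb *cb, uint32_t seq, uint8_t flags)
{
	assert(cb != NULL);

	cb->seq = seq;
	if (flags & (DEMO_FLAG_SYN | DEMO_FLAG_FIN))
		seq++;		/* SYN and FIN each consume a sequence number */
	cb->end_seq = seq;
	cb->sacked = 0;
	cb->flags = flags;
}
```

Whether the reordering is measurable is a separate question; the message itself only claims "trivial value".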




* Re: [PATCH v1 2/3] Provides multiple submits and asynchronous notifications.
From: xiaohui.xin @ 2010-04-08  9:07 UTC (permalink / raw)
  To: mst; +Cc: netdev, kvm, linux-kernel, jdike, Xin Xiaohui
In-Reply-To: <20100407081800.GC9550@redhat.com>

From: Xin Xiaohui <xiaohui.xin@intel.com>

---
Michael,
This is a small patch for the write logging issue with the async queue.
I have added a __vhost_get_vq_desc() function that can compute the log
info for any valid buffer index. It is factored out of the code in
vhost_get_vq_desc().
I use it to recompute the log info when logging is enabled.

Thanks
Xiaohui

 drivers/vhost/net.c   |   27 ++++++++---
 drivers/vhost/vhost.c |  115 ++++++++++++++++++++++++++++---------------------
 drivers/vhost/vhost.h |    5 ++
 3 files changed, 90 insertions(+), 57 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 2aafd90..00a45ef 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -115,7 +115,8 @@ static void handle_async_rx_events_notify(struct vhost_net *net,
 	struct kiocb *iocb = NULL;
 	struct vhost_log *vq_log = NULL;
 	int rx_total_len = 0;
-	int log, size;
+	unsigned int head, log, in, out;
+	int size;
 
 	if (vq->link_state != VHOST_VQ_LINK_ASYNC)
 		return;
@@ -130,14 +131,25 @@ static void handle_async_rx_events_notify(struct vhost_net *net,
 				iocb->ki_pos, iocb->ki_nbytes);
 		log = (int)iocb->ki_user_data;
 		size = iocb->ki_nbytes;
+		head = iocb->ki_pos;
 		rx_total_len += iocb->ki_nbytes;
 
 		if (iocb->ki_dtor)
 			iocb->ki_dtor(iocb);
 		kmem_cache_free(net->cache, iocb);
 
-		if (unlikely(vq_log))
+		/* when log is enabled, recomputing the log info is needed,
+		 * since these buffers are in async queue, and may not get
+		 * the log info before.
+		 */
+		if (unlikely(vq_log)) {
+			if (!log)
+				__vhost_get_vq_desc(&net->dev, vq, vq->iov,
+						    ARRAY_SIZE(vq->iov),
+						    &out, &in, vq_log,
+						    &log, head);
 			vhost_log_write(vq, vq_log, log, size);
+		}
 		if (unlikely(rx_total_len >= VHOST_NET_WEIGHT)) {
 			vhost_poll_queue(&vq->poll);
 			break;
@@ -313,14 +325,13 @@ static void handle_rx(struct vhost_net *net)
 	vhost_disable_notify(vq);
 	hdr_size = vq->hdr_size;
 
-	/* In async cases, for write logging, the simple way is to get
-	 * the log info always, and really logging is decided later.
-	 * Thus, when logging enabled, we can get log, and when logging
-	 * disabled, we can get log disabled accordingly.
+	/* In async cases, when write log is enabled, in case the submitted
+	 * buffers did not get log info before the log enabling, so we'd
+	 * better recompute the log info when needed. We do this in
+	 * handle_async_rx_events_notify().
 	 */
 
-	vq_log = unlikely(vhost_has_feature(&net->dev, VHOST_F_LOG_ALL)) |
-		(vq->link_state == VHOST_VQ_LINK_ASYNC) ?
+	vq_log = unlikely(vhost_has_feature(&net->dev, VHOST_F_LOG_ALL)) ?
 		vq->log : NULL;
 
 	handle_async_rx_events_notify(net, vq);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 97233d5..53dab80 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -715,66 +715,21 @@ static unsigned get_indirect(struct vhost_dev *dev, struct vhost_virtqueue *vq,
 	return 0;
 }
 
-/* This looks in the virtqueue and for the first available buffer, and converts
- * it to an iovec for convenient access.  Since descriptors consist of some
- * number of output then some number of input descriptors, it's actually two
- * iovecs, but we pack them into one and note how many of each there were.
- *
- * This function returns the descriptor number found, or vq->num (which
- * is never a valid descriptor number) if none was found. */
-unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
+unsigned __vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
 			   struct iovec iov[], unsigned int iov_size,
 			   unsigned int *out_num, unsigned int *in_num,
-			   struct vhost_log *log, unsigned int *log_num)
+			   struct vhost_log *log, unsigned int *log_num,
+			   unsigned int head)
 {
 	struct vring_desc desc;
-	unsigned int i, head, found = 0;
-	u16 last_avail_idx;
+	unsigned int i = head, found = 0;
 	int ret;
 
-	/* Check it isn't doing very strange things with descriptor numbers. */
-	last_avail_idx = vq->last_avail_idx;
-	if (get_user(vq->avail_idx, &vq->avail->idx)) {
-		vq_err(vq, "Failed to access avail idx at %p\n",
-		       &vq->avail->idx);
-		return vq->num;
-	}
-
-	if ((u16)(vq->avail_idx - last_avail_idx) > vq->num) {
-		vq_err(vq, "Guest moved used index from %u to %u",
-		       last_avail_idx, vq->avail_idx);
-		return vq->num;
-	}
-
-	/* If there's nothing new since last we looked, return invalid. */
-	if (vq->avail_idx == last_avail_idx)
-		return vq->num;
-
-	/* Only get avail ring entries after they have been exposed by guest. */
-	rmb();
-
-	/* Grab the next descriptor number they're advertising, and increment
-	 * the index we've seen. */
-	if (get_user(head, &vq->avail->ring[last_avail_idx % vq->num])) {
-		vq_err(vq, "Failed to read head: idx %d address %p\n",
-		       last_avail_idx,
-		       &vq->avail->ring[last_avail_idx % vq->num]);
-		return vq->num;
-	}
-
-	/* If their number is silly, that's an error. */
-	if (head >= vq->num) {
-		vq_err(vq, "Guest says index %u > %u is available",
-		       head, vq->num);
-		return vq->num;
-	}
-
 	/* When we start there are none of either input nor output. */
 	*out_num = *in_num = 0;
 	if (unlikely(log))
 		*log_num = 0;
 
-	i = head;
 	do {
 		unsigned iov_count = *in_num + *out_num;
 		if (i >= vq->num) {
@@ -833,8 +788,70 @@ unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
 			*out_num += ret;
 		}
 	} while ((i = next_desc(&desc)) != -1);
+	return head;
+}
+
+/* This looks in the virtqueue and for the first available buffer, and converts
+ * it to an iovec for convenient access.  Since descriptors consist of some
+ * number of output then some number of input descriptors, it's actually two
+ * iovecs, but we pack them into one and note how many of each there were.
+ *
+ * This function returns the descriptor number found, or vq->num (which
+ * is never a valid descriptor number) if none was found. */
+unsigned vhost_get_vq_desc(struct vhost_dev *dev, struct vhost_virtqueue *vq,
+			   struct iovec iov[], unsigned int iov_size,
+			   unsigned int *out_num, unsigned int *in_num,
+			   struct vhost_log *log, unsigned int *log_num)
+{
+	struct vring_desc desc;
+	unsigned int i, head, found = 0;
+	u16 last_avail_idx;
+	unsigned int ret;
+
+	/* Check it isn't doing very strange things with descriptor numbers. */
+	last_avail_idx = vq->last_avail_idx;
+	if (get_user(vq->avail_idx, &vq->avail->idx)) {
+		vq_err(vq, "Failed to access avail idx at %p\n",
+		       &vq->avail->idx);
+		return vq->num;
+	}
+
+	if ((u16)(vq->avail_idx - last_avail_idx) > vq->num) {
+		vq_err(vq, "Guest moved used index from %u to %u",
+		       last_avail_idx, vq->avail_idx);
+		return vq->num;
+	}
+
+	/* If there's nothing new since last we looked, return invalid. */
+	if (vq->avail_idx == last_avail_idx)
+		return vq->num;
+
+	/* Only get avail ring entries after they have been exposed by guest. */
+	rmb();
+
+	/* Grab the next descriptor number they're advertising, and increment
+	 * the index we've seen. */
+	if (get_user(head, &vq->avail->ring[last_avail_idx % vq->num])) {
+		vq_err(vq, "Failed to read head: idx %d address %p\n",
+		       last_avail_idx,
+		       &vq->avail->ring[last_avail_idx % vq->num]);
+		return vq->num;
+	}
+
+	/* If their number is silly, that's an error. */
+	if (head >= vq->num) {
+		vq_err(vq, "Guest says index %u > %u is available",
+		       head, vq->num);
+		return vq->num;
+	}
+
+	ret = __vhost_get_vq_desc(dev, vq, iov, iov_size,
+				  out_num, in_num,
+				  log, log_num, head);
 
 	/* On success, increment avail index. */
+	if (ret == vq->num)
+		return ret;
 	vq->last_avail_idx++;
 	return head;
 }
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index cffe39a..a74a6d4 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -132,6 +132,11 @@ unsigned vhost_get_vq_desc(struct vhost_dev *, struct vhost_virtqueue *,
 			   struct iovec iov[], unsigned int iov_count,
 			   unsigned int *out_num, unsigned int *in_num,
 			   struct vhost_log *log, unsigned int *log_num);
+unsigned __vhost_get_vq_desc(struct vhost_dev *, struct vhost_virtqueue *,
+			   struct iovec iov[], unsigned int iov_count,
+			   unsigned int *out_num, unsigned int *in_num,
+			   struct vhost_log *log, unsigned int *log_num,
+			   unsigned int head);
 void vhost_discard_vq_desc(struct vhost_virtqueue *);
 
 int vhost_add_used(struct vhost_virtqueue *, unsigned int head, int len);
-- 
1.5.4.4
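
A side note on the avail index check in the hunk above. The cast to u16
is what keeps the comparison correct when the 16-bit index wraps around.
Below is a minimal standalone sketch of that arithmetic. The helper names
ring_pending() and ring_index_sane() are made up for illustration and are
not part of vhost.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of entries the guest has published since
 * we last looked. The subtraction is done modulo 2^16, so it stays
 * correct when avail_idx has wrapped past 65535. */
static uint16_t ring_pending(uint16_t avail_idx, uint16_t last_avail_idx)
{
	return (uint16_t)(avail_idx - last_avail_idx);
}

/* Hypothetical helper: nonzero when the distance fits in a ring of
 * num entries, i.e. the same condition the patch accepts. */
static int ring_index_sane(uint16_t avail_idx, uint16_t last_avail_idx,
			   unsigned int num)
{
	assert(num > 0);
	return ring_pending(avail_idx, last_avail_idx) <= num;
}
```

With last_avail_idx at 65534 and avail_idx at 2, ring_pending() reports
four new entries rather than a huge negative jump.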

^ permalink raw reply related

* Re: [net-next-2.6 PATCH 4/4] vxge: Version update.
From: David Miller @ 2010-04-08  8:49 UTC (permalink / raw)
  To: Sreenivasa.Honnur; +Cc: netdev, support
In-Reply-To: <Pine.GSO.4.10.11004080444280.365-100000@guinness>

From: Sreenivasa Honnur <Sreenivasa.Honnur@neterion.com>
Date: Thu, 8 Apr 2010 04:45:07 -0400 (EDT)

> - Version update.
>  
> Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@exar.com>
> Signed-off-by: Ramkrishna Vepa <ram.vepa@exar.com>

Also applied, thanks.

^ permalink raw reply

* Re: [net-next-2.6 PATCH 3/4] vxge: Pass correct number of VFs value to pci_sriov_enable().
From: David Miller @ 2010-04-08  8:49 UTC (permalink / raw)
  To: Sreenivasa.Honnur; +Cc: netdev, support
In-Reply-To: <Pine.GSO.4.10.11004080443390.365-100000@guinness>

From: Sreenivasa Honnur <Sreenivasa.Honnur@neterion.com>
Date: Thu, 8 Apr 2010 04:44:26 -0400 (EDT)

> - The max_config_dev loadable parameter is set to 0xFF by default. Pass the
>   correct number of VFs to pci_enable_sriov() if max_config_dev is set to
>   its default value.
>  
> Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@exar.com>
> Signed-off-by: Ramkrishna Vepa <ram.vepa@exar.com>

Applied.

^ permalink raw reply

* Re: [net-next-2.6 PATCH 2/4] vxge: Allow driver load for all enumerated pci functions.
From: David Miller @ 2010-04-08  8:48 UTC (permalink / raw)
  To: Sreenivasa.Honnur; +Cc: netdev, support
In-Reply-To: <Pine.GSO.4.10.11004080442170.365-100000@guinness>

From: Sreenivasa Honnur <Sreenivasa.Honnur@neterion.com>
Date: Thu, 8 Apr 2010 04:43:37 -0400 (EDT)

> - Allow all instances of the driver to be loaded when multiple PCI functions
> are enumerated. The max_config_dev driver loadable option limits the number
> of driver load instances if required. The X3100's function configuration
> (single/multi function, SR and MR IOV) lets the user select the number of
> PCI functions.
>  
> Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@exar.com>
> Signed-off-by: Ramkrishna Vepa <ram.vepa@exar.com>

Applied.

^ permalink raw reply

* [net-next-2.6 PATCH 4/4] vxge: Version update.
From: Sreenivasa Honnur @ 2010-04-08  8:45 UTC (permalink / raw)
  To: davem; +Cc: netdev, support

- Version update.
 
Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@exar.com>
Signed-off-by: Ramkrishna Vepa <ram.vepa@exar.com>
---
diff -urpN patch3/drivers/net/vxge/vxge-version.h patch4/drivers/net/vxge/vxge-version.h
--- patch3/drivers/net/vxge/vxge-version.h	2010-04-01 12:13:17.000000000 +0530
+++ patch4/drivers/net/vxge/vxge-version.h	2010-04-07 11:39:50.000000000 +0530
@@ -17,7 +17,7 @@
 
 #define VXGE_VERSION_MAJOR	"2"
 #define VXGE_VERSION_MINOR	"0"
-#define VXGE_VERSION_FIX	"7"
-#define VXGE_VERSION_BUILD	"20144"
+#define VXGE_VERSION_FIX	"8"
+#define VXGE_VERSION_BUILD	"20182"
 #define VXGE_VERSION_FOR	"k"
 #endif


^ permalink raw reply

* Re: [net-next-2.6 PATCH 1/4] vxge: Fix a possible memory leak in vxge_hw_device_initialize().
From: David Miller @ 2010-04-08  8:44 UTC (permalink / raw)
  To: Sreenivasa.Honnur; +Cc: netdev, support
In-Reply-To: <Pine.GSO.4.10.11004080441240.365-100000@guinness>

From: Sreenivasa Honnur <Sreenivasa.Honnur@neterion.com>
Date: Thu, 8 Apr 2010 04:42:02 -0400 (EDT)

> - Fix a possible memory leak in vxge_hw_device_initialize(). Free hldev if
> vxge_hw_device_reg_addr_get() fails.
>  
> Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@exar.com>
> Signed-off-by: Ramkrishna Vepa <ram.vepa@exar.com>

Applied, thanks.

^ permalink raw reply

* [net-next-2.6 PATCH 3/4] vxge: Pass correct number of VFs value to pci_sriov_enable().
From: Sreenivasa Honnur @ 2010-04-08  8:44 UTC (permalink / raw)
  To: davem; +Cc: netdev, support

- The max_config_dev loadable parameter is set to 0xFF by default. Pass the
  correct number of VFs to pci_enable_sriov() if max_config_dev is set to
  its default value.
 
Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@exar.com>
Signed-off-by: Ramkrishna Vepa <ram.vepa@exar.com>
---
diff -urpN patch2/drivers/net/vxge/vxge-config.h patch3/drivers/net/vxge/vxge-config.h
--- patch2/drivers/net/vxge/vxge-config.h	2010-04-01 12:06:55.000000000 +0530
+++ patch3/drivers/net/vxge/vxge-config.h	2010-04-01 14:20:19.000000000 +0530
@@ -764,10 +764,18 @@ struct vxge_hw_device_hw_info {
 #define VXGE_HW_SR_VH_VIRTUAL_FUNCTION				6
 #define VXGE_HW_VH_NORMAL_FUNCTION				7
 	u64		function_mode;
-#define VXGE_HW_FUNCTION_MODE_MULTI_FUNCTION			0
-#define VXGE_HW_FUNCTION_MODE_SINGLE_FUNCTION			1
+#define VXGE_HW_FUNCTION_MODE_SINGLE_FUNCTION			0
+#define VXGE_HW_FUNCTION_MODE_MULTI_FUNCTION			1
 #define VXGE_HW_FUNCTION_MODE_SRIOV				2
 #define VXGE_HW_FUNCTION_MODE_MRIOV				3
+#define VXGE_HW_FUNCTION_MODE_MRIOV_8				4
+#define VXGE_HW_FUNCTION_MODE_MULTI_FUNCTION_17			5
+#define VXGE_HW_FUNCTION_MODE_SRIOV_8				6
+#define VXGE_HW_FUNCTION_MODE_SRIOV_4				7
+#define VXGE_HW_FUNCTION_MODE_MULTI_FUNCTION_2			8
+#define VXGE_HW_FUNCTION_MODE_MULTI_FUNCTION_4			9
+#define VXGE_HW_FUNCTION_MODE_MRIOV_4				10
+
 	u32		func_id;
 	u64		vpath_mask;
 	struct vxge_hw_device_version fw_version;
@@ -2265,4 +2273,6 @@ enum vxge_hw_status vxge_hw_vpath_rts_rt
 	struct vxge_hw_rth_hash_types *hash_type,
 	u16 bucket_size);
 
+enum vxge_hw_status
+__vxge_hw_device_is_privilaged(u32 host_type, u32 func_id);
 #endif
diff -urpN patch2/drivers/net/vxge/vxge-main.c patch3/drivers/net/vxge/vxge-main.c
--- patch2/drivers/net/vxge/vxge-main.c	2010-04-01 12:12:50.000000000 +0530
+++ patch3/drivers/net/vxge/vxge-main.c	2010-04-01 14:33:30.000000000 +0530
@@ -3965,6 +3965,36 @@ static void vxge_io_resume(struct pci_de
 	netif_device_attach(netdev);
 }
 
+static inline u32 vxge_get_num_vfs(u64 function_mode)
+{
+	u32 num_functions = 0;
+
+	switch (function_mode) {
+	case VXGE_HW_FUNCTION_MODE_MULTI_FUNCTION:
+	case VXGE_HW_FUNCTION_MODE_SRIOV_8:
+		num_functions = 8;
+		break;
+	case VXGE_HW_FUNCTION_MODE_SINGLE_FUNCTION:
+		num_functions = 1;
+		break;
+	case VXGE_HW_FUNCTION_MODE_SRIOV:
+	case VXGE_HW_FUNCTION_MODE_MRIOV:
+	case VXGE_HW_FUNCTION_MODE_MULTI_FUNCTION_17:
+		num_functions = 17;
+		break;
+	case VXGE_HW_FUNCTION_MODE_SRIOV_4:
+		num_functions = 4;
+		break;
+	case VXGE_HW_FUNCTION_MODE_MULTI_FUNCTION_2:
+		num_functions = 2;
+		break;
+	case VXGE_HW_FUNCTION_MODE_MRIOV_8:
+		num_functions = 8; /* TODO */
+		break;
+	}
+	return num_functions;
+}
+
 /**
  * vxge_probe
  * @pdev : structure containing the PCI related information of the device.
@@ -3992,14 +4022,19 @@ vxge_probe(struct pci_dev *pdev, const s
 	u8 *macaddr;
 	struct vxge_mac_addrs *entry;
 	static int bus = -1, device = -1;
+	u32 host_type;
 	u8 new_device = 0;
+	enum vxge_hw_status is_privileged;
+	u32 function_mode;
+	u32 num_vfs = 0;
 
 	vxge_debug_entryexit(VXGE_TRACE, "%s:%d", __func__, __LINE__);
 	attr.pdev = pdev;
 
-	if (bus != pdev->bus->number)
-		new_device = 1;
-	if (device != PCI_SLOT(pdev->devfn))
+	/* In SRIOV-17 mode, functions of the same adapter
+	 * can be deployed on different buses */
+	if ((!pdev->is_virtfn) && ((bus != pdev->bus->number) ||
+		(device != PCI_SLOT(pdev->devfn))))
 		new_device = 1;
 
 	bus = pdev->bus->number;
@@ -4133,6 +4168,11 @@ vxge_probe(struct pci_dev *pdev, const s
 		"%s:%d  Vpath mask = %llx", __func__, __LINE__,
 		(unsigned long long)vpath_mask);
 
+	function_mode = ll_config.device_hw_info.function_mode;
+	host_type = ll_config.device_hw_info.host_type;
+	is_privileged = __vxge_hw_device_is_privilaged(host_type,
+		ll_config.device_hw_info.func_id);
+
 	/* Check how many vpaths are available */
 	for (i = 0; i < VXGE_HW_MAX_VIRTUAL_PATHS; i++) {
 		if (!((vpath_mask) & vxge_mBIT(i)))
@@ -4140,14 +4180,18 @@ vxge_probe(struct pci_dev *pdev, const s
 		max_vpath_supported++;
 	}
 
+	if (new_device)
+		num_vfs = vxge_get_num_vfs(function_mode) - 1;
+
 	/* Enable SRIOV mode, if firmware has SRIOV support and if it is a PF */
-	if ((VXGE_HW_FUNCTION_MODE_SRIOV ==
-		ll_config.device_hw_info.function_mode) &&
-		(max_config_dev > 1) && (pdev->is_physfn)) {
-			ret = pci_enable_sriov(pdev, max_config_dev - 1);
-			if (ret)
-				vxge_debug_ll_config(VXGE_ERR,
-					"Failed to enable SRIOV: %d\n", ret);
+	if (is_sriov(function_mode) && (max_config_dev > 1) &&
+		(ll_config.intr_type != INTA) &&
+		(is_privileged == VXGE_HW_OK)) {
+		ret = pci_enable_sriov(pdev, ((max_config_dev - 1) < num_vfs)
+			? (max_config_dev - 1) : num_vfs);
+		if (ret)
+			vxge_debug_ll_config(VXGE_ERR,
+				"Failed in enabling SRIOV mode: %d\n", ret);
 	}
 
 	/*
diff -urpN patch2/drivers/net/vxge/vxge-main.h patch3/drivers/net/vxge/vxge-main.h
--- patch2/drivers/net/vxge/vxge-main.h	2010-04-01 12:06:55.000000000 +0530
+++ patch3/drivers/net/vxge/vxge-main.h	2010-04-01 13:52:56.000000000 +0530
@@ -90,6 +90,11 @@
 
 #define VXGE_LL_MAX_FRAME_SIZE(dev) ((dev)->mtu + VXGE_HW_MAC_HEADER_MAX_SIZE)
 
+#define is_sriov(function_mode) \
+	((function_mode == VXGE_HW_FUNCTION_MODE_SRIOV) || \
+	(function_mode == VXGE_HW_FUNCTION_MODE_SRIOV_8) || \
+	(function_mode == VXGE_HW_FUNCTION_MODE_SRIOV_4))
+
 enum vxge_reset_event {
 	/* reset events */
 	VXGE_LL_VPATH_RESET	= 0,
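
For readers following the VF count logic: the probe hunk passes the
smaller of (max_config_dev - 1) and num_vfs to pci_enable_sriov(). Below
is a minimal standalone sketch of that clamp. The helper name
vfs_to_enable() is hypothetical, and the guard against a zero function
count is an addition for illustration, not something the patch does
(vxge_get_num_vfs() as posted returns 0 for any mode not listed in its
switch, and the caller then subtracts 1 from it).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many VFs to request, given the module
 * parameter and the number of functions the mode provides. */
static uint32_t vfs_to_enable(uint32_t max_config_dev, uint32_t num_functions)
{
	uint32_t num_vfs;
	uint32_t wanted;

	/* The patch only enables SR-IOV when max_config_dev > 1. */
	assert(max_config_dev > 1);

	/* One function is the PF. Assumption for this sketch: treat an
	 * unknown mode (0 functions) as "no VFs" instead of letting
	 * 0 - 1 wrap around in unsigned arithmetic. */
	num_vfs = num_functions ? num_functions - 1 : 0;
	wanted = max_config_dev - 1;

	return wanted < num_vfs ? wanted : num_vfs;
}
```

With the default max_config_dev of 0xFF and a 17-function mode this
requests 16 VFs; with max_config_dev set to 4 it requests 3.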


^ permalink raw reply

* [net-next-2.6 PATCH 2/4] vxge: Allow driver load for all enumerated pci functions.
From: Sreenivasa Honnur @ 2010-04-08  8:43 UTC (permalink / raw)
  To: davem; +Cc: netdev, support

- Allow all instances of the driver to be loaded when multiple PCI functions
are enumerated. The max_config_dev driver loadable option limits the number
of driver load instances if required. The X3100's function configuration
(single/multi function, SR and MR IOV) lets the user select the number of
PCI functions.
 
Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@exar.com>
Signed-off-by: Ramkrishna Vepa <ram.vepa@exar.com>
---
diff -urpN patch1/drivers/net/vxge/vxge-main.c patch2/drivers/net/vxge/vxge-main.c
--- patch1/drivers/net/vxge/vxge-main.c	2010-04-01 12:04:00.000000000 +0530
+++ patch2/drivers/net/vxge/vxge-main.c	2010-04-01 12:12:50.000000000 +0530
@@ -4016,9 +4016,11 @@ vxge_probe(struct pci_dev *pdev, const s
 				driver_config->total_dev_cnt);
 		driver_config->config_dev_cnt = 0;
 		driver_config->total_dev_cnt = 0;
-		driver_config->g_no_cpus = 0;
 	}
-
+	/* Make the CPU-based calculation of the number of vpaths
+	 * apply to individual functions as well.
+	 */
+	driver_config->g_no_cpus = 0;
 	driver_config->vpath_per_dev = max_config_vpath;
 
 	driver_config->total_dev_cnt++;


^ permalink raw reply

* [net-next-2.6 PATCH 1/4] vxge: Fix a possible memory leak in vxge_hw_device_initialize().
From: Sreenivasa Honnur @ 2010-04-08  8:42 UTC (permalink / raw)
  To: davem; +Cc: netdev, support

- Fix a possible memory leak in vxge_hw_device_initialize(). Free hldev if
vxge_hw_device_reg_addr_get() fails.
 
Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@exar.com>
Signed-off-by: Ramkrishna Vepa <ram.vepa@exar.com>
---
diff -urpN orig/drivers/net/vxge/vxge-config.c patch1/drivers/net/vxge/vxge-config.c
--- orig/drivers/net/vxge/vxge-config.c	2010-04-01 12:03:33.000000000 +0530
+++ patch1/drivers/net/vxge/vxge-config.c	2010-04-01 12:05:55.000000000 +0530
@@ -634,8 +634,10 @@ vxge_hw_device_initialize(
 	__vxge_hw_device_pci_e_init(hldev);
 
 	status = __vxge_hw_device_reg_addr_get(hldev);
-	if (status != VXGE_HW_OK)
+	if (status != VXGE_HW_OK) {
+		vfree(hldev);
 		goto exit;
+	}
 	__vxge_hw_device_id_get(hldev);
 
 	__vxge_hw_device_host_info_get(hldev);
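
The pattern behind this fix, releasing the allocation on every early
error exit, can be shown in isolation. This is only a userspace sketch
with made-up names (struct dev, reg_addr_get(), device_initialize()) and
calloc()/free() standing in for the kernel allocator. It is not the
driver code.

```c
#include <assert.h>
#include <stdlib.h>

struct dev {
	int regs_ok;
};

/* Stand-in for the register lookup step. The fail flag is a test hook
 * that forces the error path. */
static int reg_addr_get(struct dev *d, int fail)
{
	if (fail)
		return -1;
	d->regs_ok = 1;
	return 0;
}

/* Allocates a device and hands it back through out. On any failure
 * the allocation is freed before returning, so nothing leaks. */
static int device_initialize(struct dev **out, int fail)
{
	struct dev *d;
	int status;

	assert(out != NULL);
	*out = NULL;

	d = calloc(1, sizeof(*d));
	if (!d)
		return -1;

	status = reg_addr_get(d, fail);
	if (status != 0) {
		free(d);	/* the step the original code was missing */
		return status;
	}

	*out = d;
	return 0;
}
```

On failure the caller sees a NULL pointer and owns nothing. On success
it owns the returned structure and is responsible for freeing it.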


^ permalink raw reply

