Netdev List
* Re: unsubscribe netdev
From: Peter Portante @ 2012-07-23 17:26 UTC (permalink / raw)
  To: netdev
In-Reply-To: <C8E995C2-7D48-451F-8AEA-69BF2DA59F1F@redhat.com>

unsubscribe netdev

^ permalink raw reply

* Re: [PATCH net/for-next V1 1/1] IB/ipoib: break linkage to neighbouring system
From: Eric Dumazet @ 2012-07-23 17:17 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: roland, davem, Christoph Lameter, linux-rdma, erezsh,
	Shlomo Pongratz, netdev
In-Reply-To: <500D82AD.2040508@mellanox.com>

On Mon, 2012-07-23 at 19:58 +0300, Or Gerlitz wrote:

> Sorry for the possible spam, but resending (last time I forgot to put Eric
> on the "to" list so he might have missed it; I also added Christoph) - this is
> a fix for a very long-standing bug in IPoIB and we do want it to be reviewed
> and hopefully accepted, or if needed get feedback and fix/change.
> 
> So, any further feedback? There was one comment on V0, not to use a read
> lock for the RCU-protected hash table lookup, and it was addressed in V1.
> 

I have no idea what you are talking about; I don't have the patch or a
copy of it ;)

^ permalink raw reply

* RE: New commands to configure IOV features
From: Rose, Gregory V @ 2012-07-23 17:06 UTC (permalink / raw)
  To: Chris Friesen, Don Dutile
  Cc: Ben Hutchings, David Miller, yuvalmin@broadcom.com,
	netdev@vger.kernel.org, linux-pci@vger.kernel.org
In-Reply-To: <500D6932.8090306@genband.com>

> -----Original Message-----
> From: Chris Friesen [mailto:chris.friesen@genband.com]
> Sent: Monday, July 23, 2012 8:10 AM
> To: Don Dutile
> Cc: Ben Hutchings; David Miller; yuvalmin@broadcom.com; Rose, Gregory V;
> netdev@vger.kernel.org; linux-pci@vger.kernel.org
> Subject: Re: New commands to configure IOV features
> 
> On 07/23/2012 08:03 AM, Don Dutile wrote:
> > On 07/20/2012 07:42 PM, Chris Friesen wrote:
> >>
> >> I actually have a use-case where the guest needs to be able to modify
> >> the MAC addresses of network devices that are actually VFs.
> >>
> >> The guest is bonding the network devices together, so the bonding
> >> driver in the guest expects to be able to set all the slaves to the
> >> same MAC address.
> >>
> >> As I read the ixgbe driver, this should be possible as long as the
> >> host hasn't explicitly set the MAC address of the VF. Is that correct?
> >>
> >> Chris
> >
> > Interesting tug of war: hypervisors will want to set the macaddrs for
> > security reasons,
> >                         some guests may want to set macaddr for
> > (valid?) config reasons.
> >
> 
> In our case we have control over both guest and host anyways, so it's
> less of a security issue.  In the general case though I could see it
> being an interesting problem.
> 
> Back to the original discussion though--has anyone got any ideas about
> the best way to trigger runtime creation of VFs?  I don't know what the
> binary APIs look like, but via sysfs I could see something like
> 
> echo number_of_new_vfs_to_create >
> /sys/bus/pci/devices/<address>/create_vfs

The original proposals for creation and management of virtual functions were very much along these lines.  However, at the time most of the distributions that used virtualization were based upon the 2.6.18 kernel.  Red Hat Enterprise Linux 5.x and Citrix Xen Server were both using kernels derived from the 2.6.18 kernel.  In order to implement a sysfs based management approach we would have had to break the 2.6.18 kernel ABI, which is of course a non-starter.  We were able to implement an SR-IOV solution without breaking the ABI but it required use of a module parameter to inform the PF driver of how many VFs it should create.

This was fine during the first couple of years in which not many folks were using SR-IOV outside of a lab and the number of platforms that supported SR-IOV was very limited.  As an experimental solution it has worked pretty well.

Over the last year and a half or so we have seen SR-IOV go from an experimental technology to one that is increasingly deployed in real-world applications, and now the limitations of that original approach are more apparent.

> 
> Something else that occurred to me--is there buy-in from driver
> maintainers?

That's a good question.  I've seen a lot of resistance to using sysfs-based interfaces in drivers, but usually that is when a driver wants to implement a private interface that other drivers wouldn't want or be able to use.  I'm less sure about a sysfs solution that could be deployed by all SR-IOV capable devices of any type, be they Ethernet controllers, SCSI controllers, etc.  I'm not sure what the objection would be in the case of a general-purpose solution; however, I've been told that in general sysfs-based solutions are often frowned upon by kernel maintainers.  Perhaps that is because many of them are not generic solutions to a well-defined problem.

> I know the Intel ethernet drivers (what I'm most familiar
> with) would need to be substantially modified to support on-the-fly
> addition of new vfs.

Actually, it wouldn't be that bad.  But yes, there'd have to be some way for a driver to register a callback routine with the PCI interface so that it could be notified when changes have been made to the SR-IOV configuration of the device.  This would require a new API, and some driver changes.  In the case of the Intel drivers it wouldn't be too intrusive and it would definitely help us to meet some customer requirements.  The current model using a module parameter forces all ports controlled by a PF driver to use the same number of VFs per function.  This is clunky and there are a lot of users that would like the ability to assign differing numbers of VFs to respective physical functions.

I should also note that we need to be careful about what we mean by the phrase "support on-the-fly addition of new vfs".

You cannot just add VFs without first tearing down the VFs that are currently allocated.  This is a limitation of the PCIe SR-IOV spec IIRC, and in any case it is true of Intel SR-IOV capable devices.  The NumVFs field in the SR-IOV capability structure is not writable while the VF Enable bit is set.  To change that value you must first clear the VF Enable bit, and when you do that all your current VFs cease to exist.  You can then write a different number of VFs and re-enable them, but during a short interval the VFs that were already there are destroyed, and they are completely reset when they come back on-line.
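
The sequence described above can be sketched as a small user-space model. This is only an illustration of the ordering constraint, not driver or PCI core code: the struct, the field names and both helper functions are hypothetical, and the TotalVFs bound is included on the assumption that a device advertises a maximum VF count.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the SR-IOV capability fields discussed above (names are hypothetical). */
struct sriov_cap {
	bool     vf_enable;   /* VF Enable bit */
	uint16_t num_vfs;     /* NumVFs field */
	uint16_t total_vfs;   /* assumed upper bound advertised by the device */
};

/* A write to NumVFs is only honoured while VF Enable is clear. */
int sriov_write_num_vfs(struct sriov_cap *cap, uint16_t n)
{
	if (cap->vf_enable || n > cap->total_vfs)
		return -1;
	cap->num_vfs = n;
	return 0;
}

/*
 * Change the VF count in the order the mail describes: clear VF Enable
 * (every existing VF disappears), write the new count, then re-enable.
 * Returns how many VFs were torn down on the way, or -1 on a bad count.
 */
int sriov_resize(struct sriov_cap *cap, uint16_t n)
{
	int destroyed = cap->vf_enable ? cap->num_vfs : 0;

	if (n > cap->total_vfs)
		return -1;
	cap->vf_enable = false;          /* all current VFs cease to exist */
	if (sriov_write_num_vfs(cap, n))
		return -1;
	cap->vf_enable = n > 0;          /* VFs come back freshly reset */
	return destroyed;
}
```

In this model, going from 4 to 8 VFs reports 4 destroyed VFs, which is the point of the caveat: "adding" VFs at runtime is really a teardown followed by a re-creation.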

- Greg

^ permalink raw reply

* Re: [PATCH net/for-next V1 1/1] IB/ipoib: break linkage to neighbouring system
From: Or Gerlitz @ 2012-07-23 16:58 UTC (permalink / raw)
  To: Eric Dumazet, roland-DgEjT+Ai2ygdnm+yROfE0A,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q, Christoph Lameter
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, erezsh-VPRAkNaXOzVWk0Htik3J/w,
	Shlomo Pongratz, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <CAJZOPZ+kRcBjJgB_HaMqeuB5E-SLSqskgoaLZ_hvVx4KffHgpA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On 20/07/2012 18:49, Or Gerlitz wrote:
> On Thu, Jul 19, 2012 at 4:18 PM, Or Gerlitz<ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>  wrote:
>> From: Shlomo Pongratz<shlomop-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
>>
>> Dave Miller<davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>  provided a detailed description of why the
>> way IPoIB is using neighbours for its own ipoib_neigh struct is buggy:
> [...]
>
>> This patch aims to solve the race conditions found in the IPoIB driver.
>>
>> The patch breaks the connection between the core networking neighbour structure
>> and the ipoib_neigh structure. Except for avoiding the race, it allows to in
>> under a setup where SKBs carrying IP packets that don't have any associated
>> neighbour are transmitted through IPoIB.
>>
>> We add an ipoib_neigh hash table with 1024 buckets. The hash table key is the destination
>> hardware address. Thus the ipoib_neigh is fetched from the hash table and not
>> dereferenced from the stashed location at the neighbour structure. The hash table uses
>> both RCU and reference count mechanisms to guarantee that no ipoib_neigh instance is
>> ever deleted while in use.
>>
>> Fetching the ipoib_neigh structure instance from the hash also makes the special
>> code in ipoib_start_xmit that handles remote and local bonding failover redundant.
>>
>> Aged ipoib_neigh instances are deleted by a garbage collection task that runs every
>> 30 seconds and deletes every ipoib_neigh instance that was idle for at least 60
>> seconds. The deletion is safe since the ipoib_neigh instances are protected
>> using RCU and reference count mechanisms.
>
> Hi Dave, Roland, Eric
>
> So how does this look? Is it in the right direction? Anything that needs to be fixed?

Sorry for the possible spam, but resending (last time I forgot to put Eric
on the "to" list so he might have missed it; I also added Christoph) - this is
a fix for a very long-standing bug in IPoIB and we do want it to be reviewed
and hopefully accepted, or if needed get feedback and fix/change.

So, any further feedback? There was one comment on V0, not to use a read
lock for the RCU-protected hash table lookup, and it was addressed in V1.

Or.
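
For readers without the patch at hand, the table described in the quoted commit message can be sketched in user space roughly as follows. This is not the patch: the RCU and reference counting are left out entirely, the hash function is an arbitrary choice, and all names are hypothetical. Only the 1024 buckets, the hardware-address key (20 bytes for IPoIB) and the 60 second idle limit come from the description.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HW_ADDR_LEN 20      /* IPoIB hardware address length */
#define NBUCKETS    1024    /* bucket count from the patch description */
#define MAX_IDLE    60      /* seconds of idleness before an entry is collected */

struct neigh_entry {
	unsigned char addr[HW_ADDR_LEN];
	long last_used;               /* timestamp in seconds, supplied by the caller */
	struct neigh_entry *next;
};

struct neigh_table {
	struct neigh_entry *buckets[NBUCKETS];
};

/* FNV-1a over the address bytes; an arbitrary choice for this sketch. */
static unsigned int hash_addr(const unsigned char *addr)
{
	uint32_t h = 2166136261u;

	for (int i = 0; i < HW_ADDR_LEN; i++) {
		h ^= addr[i];
		h *= 16777619u;
	}
	return h & (NBUCKETS - 1);
}

/* Look up the entry for addr, creating it if needed, and mark it as used. */
struct neigh_entry *neigh_get(struct neigh_table *t, const unsigned char *addr, long now)
{
	unsigned int b = hash_addr(addr);
	struct neigh_entry *n;

	for (n = t->buckets[b]; n; n = n->next) {
		if (!memcmp(n->addr, addr, HW_ADDR_LEN)) {
			n->last_used = now;
			return n;
		}
	}
	n = calloc(1, sizeof(*n));
	if (!n)
		return NULL;
	memcpy(n->addr, addr, HW_ADDR_LEN);
	n->last_used = now;
	n->next = t->buckets[b];
	t->buckets[b] = n;
	return n;
}

/* Garbage collection pass: free every entry idle for at least MAX_IDLE seconds. */
int neigh_gc(struct neigh_table *t, long now)
{
	int freed = 0;

	for (int b = 0; b < NBUCKETS; b++) {
		struct neigh_entry **pp = &t->buckets[b];

		while (*pp) {
			struct neigh_entry *n = *pp;

			if (now - n->last_used >= MAX_IDLE) {
				*pp = n->next;
				free(n);
				freed++;
			} else {
				pp = &n->next;
			}
		}
	}
	return freed;
}
```

In the real driver the free in the collection pass is the part that needs RCU plus a reference count, since a transmit path may still hold the entry; the sketch assumes a single thread.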
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* RE: New commands to configure IOV features
From: Rose, Gregory V @ 2012-07-23 16:37 UTC (permalink / raw)
  To: Don Dutile, Chris Friesen
  Cc: Ben Hutchings, David Miller, yuvalmin@broadcom.com,
	netdev@vger.kernel.org, linux-pci@vger.kernel.org
In-Reply-To: <500D59BF.9040006@redhat.com>

> -----Original Message-----
> From: Don Dutile [mailto:ddutile@redhat.com]
> Sent: Monday, July 23, 2012 7:04 AM
> To: Chris Friesen
> Cc: Ben Hutchings; David Miller; yuvalmin@broadcom.com; Rose, Gregory V;
> netdev@vger.kernel.org; linux-pci@vger.kernel.org
> Subject: Re: New commands to configure IOV features
> 
> On 07/20/2012 07:42 PM, Chris Friesen wrote:
> > On 07/20/2012 02:01 PM, Ben Hutchings wrote:
> >> On Fri, 2012-07-20 at 13:29 -0600, Chris Friesen wrote:
> >
> >>> Once the device exists, then domain-specific APIs would be used to
> >>> configure it the same way that they would configure a physical device.
> >>
> >> To an extent, but not entirely.
> >>
> >> Currently, the assigned MAC address and (optional) VLAN tag for each
> >> networking VF are configured via the PF net device (though this is
> >> done through the rtnetlink API rather than ethtool).
> >
> > I actually have a use-case where the guest needs to be able to modify
> > the MAC addresses of network devices that are actually VFs.
> >
> > The guest is bonding the network devices together, so the bonding driver
> > in the guest expects to be able to set all the slaves to the same MAC
> > address.
> >
> > As I read the ixgbe driver, this should be possible as long as the host
> > hasn't explicitly set the MAC address of the VF. Is that correct?
> >
> > Chris
> 
> Interesting tug of war: hypervisors will want to set the macaddrs for
> security reasons,
>                          some guests may want to set macaddr for (valid?)
> config reasons.

It is a matter of trust.  The ability to set your own MAC address filters is a potential security issue, so host administrators have the ability to determine whether they trust the VF (and implicitly, the domain in which the VF resides).  There is also a sort of half-way solution.  By turning off anti-spoofing you can allow the VF to use source MAC addresses that are not actually assigned in HW filters.  This was done to support some bonding scenarios where the VF will need to transmit with a different source address.

Many applications using SR-IOV are embedded devices such as switches, edge relay devices, IP forwarding/filtering appliances, routers, etc.  More often than not the host administrator can trust domains that the VFs are assigned to because those domains are completely under the control of the local host.  In those cases the VFs are trusted and can be allowed to set their own MAC filters and use any source MAC address they please.

Other applications might need to assign VF devices to non-trusted domains.  Perhaps a service provider has leased a virtual machine domain to a subscriber who has purchased QoS levels that can only be met with the performance levels available with SR-IOV VF devices.  Other scenarios exist.  In these cases it is worthwhile to be able to restrict the VF's ability to set MAC filters and use source MAC addresses not assigned to it.

Rather than a tug of war I just view it as balancing security concerns with levels of additional capability and functionality.  That goes on all the time.
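
The two knobs discussed above (trusting a VF to set its own MAC filters, and anti-spoof checking of the source address) can be pictured as a per-VF policy. The following is a toy model with hypothetical names, assuming one assigned MAC per VF; it is not how any particular PF driver stores or enforces this.

```c
#include <stdbool.h>
#include <string.h>

#define MAC_LEN 6

/* Per-VF policy as seen by the host administrator (hypothetical layout). */
struct vf_policy {
	bool trusted;                     /* may the VF program its own MAC filters? */
	bool spoof_check;                 /* drop frames whose source MAC is not assigned */
	unsigned char assigned[MAC_LEN];  /* MAC address the host assigned to the VF */
};

/* Only a trusted VF is allowed to set its own MAC filters. */
bool vf_may_set_mac_filter(const struct vf_policy *p)
{
	return p->trusted;
}

/* With anti-spoofing off, any source MAC is accepted on transmit. */
bool vf_may_transmit_from(const struct vf_policy *p, const unsigned char *src)
{
	if (!p->spoof_check)
		return true;
	return memcmp(p->assigned, src, MAC_LEN) == 0;
}
```

The bonding case from the thread corresponds to an untrusted VF with spoof checking turned off: it cannot add filters, but it may transmit with the bond's MAC address.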

- Greg

^ permalink raw reply

* RE: [E1000-devel] [PATCH net-next 3/4] e1000e: advertise transmit time stamping
From: Allan, Bruce W @ 2012-07-23 16:07 UTC (permalink / raw)
  To: Richard Cochran, netdev@vger.kernel.org
  Cc: e1000-devel@lists.sourceforge.net, Willem de Bruijn, David Miller
In-Reply-To: <e0e31f0229b5bfda1684122498b4d6fb3195cf26.1342976654.git.richardcochran@gmail.com>

> -----Original Message-----
> From: Richard Cochran [mailto:richardcochran@gmail.com]
> Sent: Sunday, July 22, 2012 10:16 AM
> To: netdev@vger.kernel.org
> Cc: e1000-devel@lists.sourceforge.net; Willem de Bruijn; David Miller
> Subject: [E1000-devel] [PATCH net-next 3/4] e1000e: advertise transmit
> time stamping
> 
> This driver now offers software transmit time stamping, so it should
> advertise that fact via ethtool. Compile tested only.
> 
> Signed-off-by: Richard Cochran <richardcochran@gmail.com>
> 
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Cc: e1000-devel@lists.sourceforge.net
> ---
>  drivers/net/ethernet/intel/e1000e/ethtool.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/ethtool.c
> b/drivers/net/ethernet/intel/e1000e/ethtool.c
> index 105d554..0349e24 100644
> --- a/drivers/net/ethernet/intel/e1000e/ethtool.c
> +++ b/drivers/net/ethernet/intel/e1000e/ethtool.c
> @@ -2061,6 +2061,7 @@ static const struct ethtool_ops e1000_ethtool_ops
> = {
>  	.get_coalesce		= e1000_get_coalesce,
>  	.set_coalesce		= e1000_set_coalesce,
>  	.get_rxnfc		= e1000_get_rxnfc,
> +	.get_ts_info		= ethtool_op_get_ts_info,
>  };
> 
>  void e1000e_set_ethtool_ops(struct net_device *netdev)

Thanks Richard.  I have a similar patch amongst the PTP 1588 work I'm currently doing
for e1000e, but this one is good for now.

Acked-by: Bruce Allan <bruce.w.allan@intel.com>

^ permalink raw reply

* Re: [PATCH] ppp: add 64 bit stats
From: Eric Dumazet @ 2012-07-23 15:59 UTC (permalink / raw)
  To: Kevin Groeneveld; +Cc: netdev
In-Reply-To: <CABF+-6WHqzXhvv9etdAqsamZirbWxWJEsnU-CQ1H8tvynCTOZA@mail.gmail.com>

On Mon, 2012-07-23 at 11:25 -0400, Kevin Groeneveld wrote:

> I am curious if you can elaborate on what is racy about the patch, I
> am still trying to learn.  I assumed (possibly incorrectly) that
> because I was using percpu variables that the stats updates didn't
> need any extra synchronization as any concurrent updates would be on
> different cpus.
> 

ppp paths (xmit versus receive) are reentrant.

Therefore several cpus might call
u64_stats_update_begin(&stats->syncp) at the same moment: one increment
could be lost forever, making all readers loop forever in
u64_stats_fetch_begin_bh()
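
The lost increment can be replayed deterministically in user space. The sketch below is a simplified model with hypothetical names, assuming a configuration where the sync point is backed by a plain sequence counter that the writer bumps with a non-atomic "seq++" on begin and on end; it is not the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

struct stats_sync {
	unsigned int seq;      /* even: stable, odd: a write is in progress */
	uint64_t     packets;
};

/* A reader only accepts a snapshot taken while seq is even. */
bool reader_would_finish(const struct stats_sync *s)
{
	return (s->seq & 1u) == 0;
}

/* Two writers that are properly serialized, e.g. under a lock. */
void two_writers_serialized(struct stats_sync *s)
{
	for (int i = 0; i < 2; i++) {
		s->seq++;          /* begin */
		s->packets++;
		s->seq++;          /* end */
	}
}

/* Replay of one bad interleaving of the plain "seq++" done in begin. */
void two_writers_racing(struct stats_sync *s)
{
	unsigned int a = s->seq;   /* writer A loads seq */
	unsigned int b = s->seq;   /* writer B loads the same value */

	s->seq = a + 1;            /* A stores */
	s->seq = b + 1;            /* B stores the same value: one increment is lost */
	s->packets += 2;           /* both counter updates happen */
	s->seq++;                  /* A: end */
	s->seq++;                  /* B: end */
}
```

Starting from seq == 0, the serialized case ends at 4 (even) and the racing case ends at 3 (odd). Nothing ever makes it even again until another race happens, so a reader that waits for an even value keeps spinning.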

include/linux/u64_stats_sync.h

* 3) Write side must ensure mutual exclusion or one seqcount update could
 *    be lost, thus blocking readers forever.
 *    If this synchronization point is not a mutex, but a spinlock or
 *    spinlock_bh() or disable_bh() :


> > I really doubt ppp is performance sensitive, so it doesn't need percpu
> > counters.
> >
> > If you really want 64bits stats on ppp, use proper synchronization
> > around u64 counters (but shared ones)
> 
> I will work on an updated patch without the percpu variables.  I
> didn't really think about servers with many cpus and many ppp sessions
> when I created the patch, I was mainly thinking about my Linksys
> router and other simple clients.  Many of the other network drivers
> use percpu variables for their stats so I just followed along.

Because there is one loopback device only, not thousands ;)

> 
> Would proper synchronization in this case just be wrapping the updates
> in a spin_lock/spin_unlock?

Would be fine (if the proper BH safe variant is used), or you could also
use atomic64_t.
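
As a rough user-space analogue of the atomic64_t option, shared 64-bit counters can be kept with C11 atomics. This is only a sketch of the idea with hypothetical names; in-kernel code would use atomic64_t and its accessors, or a BH-safe spinlock around plain u64 fields.

```c
#include <stdatomic.h>
#include <stdint.h>

/* One shared set of counters per device instead of one set per cpu. */
struct link_stats {
	_Atomic uint64_t rx_bytes;
	_Atomic uint64_t tx_bytes;
};

void stats_init(struct link_stats *s)
{
	atomic_init(&s->rx_bytes, 0);
	atomic_init(&s->tx_bytes, 0);
}

/* Safe against concurrent callers: the add is a single atomic operation. */
void stats_add_rx(struct link_stats *s, uint64_t len)
{
	atomic_fetch_add_explicit(&s->rx_bytes, len, memory_order_relaxed);
}

void stats_add_tx(struct link_stats *s, uint64_t len)
{
	atomic_fetch_add_explicit(&s->tx_bytes, len, memory_order_relaxed);
}

/* Readers never see a torn 64-bit value, even on a 32-bit machine. */
uint64_t stats_read_rx(struct link_stats *s)
{
	return atomic_load_explicit(&s->rx_bytes, memory_order_relaxed);
}

uint64_t stats_read_tx(struct link_stats *s)
{
	return atomic_load_explicit(&s->tx_bytes, memory_order_relaxed);
}
```

The trade-off is the one raised in the thread: an atomic add per packet costs more than a percpu increment, but the memory use stays constant per device regardless of the cpu count.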

^ permalink raw reply

* Re: [PATCH] ppp: add 64 bit stats
From: Kevin Groeneveld @ 2012-07-23 15:25 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1343020585.2626.10054.camel@edumazet-glaptop>

Hi Eric,

On Mon, Jul 23, 2012 at 1:16 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Hmm. This patch adds races, but also adds a good amount of memory on
> these servers with thousands of ppp devices, and 64 cpus.

I am curious if you can elaborate on what is racy about the patch, I
am still trying to learn.  I assumed (possibly incorrectly) that
because I was using percpu variables that the stats updates didn't
need any extra synchronization as any concurrent updates would be on
different cpus.

> I really doubt ppp is performance sensitive, so it doesn't need percpu
> counters.
>
> If you really want 64bits stats on ppp, use proper synchronization
> around u64 counters (but shared ones)

I will work on an updated patch without the percpu variables.  I
didn't really think about servers with many cpus and many ppp sessions
when I created the patch, I was mainly thinking about my Linksys
router and other simple clients.  Many of the other network drivers
use percpu variables for their stats so I just followed along.

Would proper synchronization in this case just be wrapping the updates
in a spin_lock/spin_unlock?


Kevin

^ permalink raw reply

* Re: New commands to configure IOV features
From: Chris Friesen @ 2012-07-23 15:09 UTC (permalink / raw)
  To: Don Dutile
  Cc: Ben Hutchings, David Miller, yuvalmin, gregory.v.rose, netdev,
	linux-pci
In-Reply-To: <500D59BF.9040006@redhat.com>

On 07/23/2012 08:03 AM, Don Dutile wrote:
> On 07/20/2012 07:42 PM, Chris Friesen wrote:
>>
>> I actually have a use-case where the guest needs to be able to modify 
>> the MAC addresses of network devices that are actually VFs.
>>
>> The guest is bonding the network devices together, so the bonding 
>> driver in the guest expects to be able to set all the slaves to the 
>> same MAC address.
>>
>> As I read the ixgbe driver, this should be possible as long as the 
>> host hasn't explicitly set the MAC address of the VF. Is that correct?
>>
>> Chris
>
> Interesting tug of war: hypervisors will want to set the macaddrs for 
> security reasons,
>                         some guests may want to set macaddr for 
> (valid?) config reasons.
>

In our case we have control over both guest and host anyways, so it's
less of a security issue.  In the general case though I could see it
being an interesting problem.

Back to the original discussion though--has anyone got any ideas about
the best way to trigger runtime creation of VFs?  I don't know what the
binary APIs look like, but via sysfs I could see something like

echo number_of_new_vfs_to_create >  
/sys/bus/pci/devices/<address>/create_vfs

Something else that occurred to me--is there buy-in from driver 
maintainers?  I know the Intel ethernet drivers (what I'm most familiar 
with) would need to be substantially modified to support on-the-fly 
addition of new vfs.  Currently they assume that the number of vfs is 
known at module init time.

Chris

^ permalink raw reply

* Re: [3.5 regression / bridge] constantly toggeling between disabled and forwarding
From: Stephen Hemminger @ 2012-07-23 15:02 UTC (permalink / raw)
  To: Michael Leun; +Cc: netdev, bridge, linux-kernel
In-Reply-To: <20120723091504.2d035d28@xenia.leun.net>

On Mon, 23 Jul 2012 09:15:04 +0200
Michael Leun <lkml20120218@newton.leun.net> wrote:

> Hi,
> 
> when I use my usb ethernet adapter
> 
> # > lsusb
> [...]
> Bus 002 Device 009: ID 9710:7830 MosChip Semiconductor MCS7830 10/100 Mbps Ethernet adapter
> [...]
> 
> as a port of a bridge
> 
> > # brctl addbr br0
> > # brctl addif br0 eth0
> > # brctl addif br0 ue5
> > # ifconfig ue5 up
> > # ifconfig br0 up
> 
> (Also does happen when eth0 is not part of the bridge, but the logs I
> had available were from that situation...)
> 
> I constantly get messages showing the interface toggling between
> disabled and forwarding state:
> 
> Jul 23 07:40:50 elektra kernel: [ 1539.497337] br0: port 2(ue5) entered disabled state
> Jul 23 07:40:50 elektra kernel: [ 1539.554992] br0: port 2(ue5) entered forwarding state
> Jul 23 07:40:50 elektra kernel: [ 1539.555005] br0: port 2(ue5) entered forwarding state
> Jul 23 07:40:51 elektra kernel: [ 1540.496242] br0: port 2(ue5) entered disabled state
> Jul 23 07:40:51 elektra kernel: [ 1540.552534] br0: port 2(ue5) entered forwarding state
> Jul 23 07:40:51 elektra kernel: [ 1540.552548] br0: port 2(ue5) entered forwarding state
> Jul 23 07:40:52 elektra kernel: [ 1541.550413] br0: port 2(ue5) entered forwarding state
> Jul 23 07:40:53 elektra kernel: [ 1542.529672] br0: port 2(ue5) entered disabled state
> Jul 23 07:40:53 elektra kernel: [ 1542.587162] br0: port 2(ue5) entered forwarding state
> Jul 23 07:40:53 elektra kernel: [ 1542.587175] br0: port 2(ue5) entered forwarding state
> Jul 23 07:40:54 elektra kernel: [ 1543.585309] br0: port 2(ue5) entered forwarding state
> Jul 23 07:41:00 elektra kernel: [ 1549.360600] br0: port 2(ue5) entered disabled state
> Jul 23 07:41:00 elektra kernel: [ 1549.442998] br0: port 2(ue5) entered forwarding state
> Jul 23 07:41:00 elektra kernel: [ 1549.443011] br0: port 2(ue5) entered forwarding state
> Jul 23 07:41:01 elektra kernel: [ 1550.357686] br0: port 2(ue5) entered disabled state
> Jul 23 07:41:01 elektra kernel: [ 1550.408208] br0: port 2(ue5) entered forwarding state
> Jul 23 07:41:01 elektra kernel: [ 1550.408222] br0: port 2(ue5) entered forwarding state
> Jul 23 07:41:02 elektra kernel: [ 1551.407656] br0: port 2(ue5) entered forwarding state
> Jul 23 07:41:03 elektra kernel: [ 1552.401578] br0: port 2(ue5) entered disabled state
> Jul 23 07:41:03 elektra kernel: [ 1552.474773] br0: port 2(ue5) entered forwarding state
> Jul 23 07:41:03 elektra kernel: [ 1552.474786] br0: port 2(ue5) entered forwarding state
> Jul 23 07:41:04 elektra kernel: [ 1553.472487] br0: port 2(ue5) entered forwarding state
> Jul 23 07:41:05 elektra kernel: [ 1554.356138] br0: port 2(ue5) entered disabled state
> [...]
> 
> This does (in the same situation, nothing else than the kernel changed)
> not happen with 3.4.5.
> 
> Does anybody have an idea what the issue might be or do I need to bisect?

Probably not a bridge issue, but rather an issue with link status reporting
on the device. The bridge changes state when the attached ethernet
device raises/lowers carrier.

An independent way to observe link changes is to use a tool like:
  ip monitor
which will show carrier up/down events.
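
Another independent check is to poll the carrier attribute that sysfs exposes per interface. A minimal sketch with hypothetical function names; note that reading the attribute fails while the interface is administratively down, which is reported here as -1.

```c
#include <stdio.h>

/* Returns 1 (carrier up), 0 (carrier down) or -1 if the file cannot be read. */
int read_carrier_file(const char *path)
{
	FILE *f = fopen(path, "r");
	int c;

	if (!f)
		return -1;
	c = fgetc(f);
	fclose(f);

	if (c == '1')
		return 1;
	if (c == '0')
		return 0;
	return -1;
}

/* Convenience wrapper for /sys/class/net/<ifname>/carrier. */
int read_carrier(const char *ifname)
{
	char path[128];
	int n = snprintf(path, sizeof(path), "/sys/class/net/%s/carrier", ifname);

	if (n < 0 || n >= (int)sizeof(path))
		return -1;
	return read_carrier_file(path);
}
```

Calling read_carrier("ue5") in a loop and logging every change would show whether the adapter itself is reporting link flaps, independent of the bridge.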

^ permalink raw reply

* Re: [PATCH v5] sctp: Implement quick failover draft from tsvwg
From: Flavio Leitner @ 2012-07-23 14:28 UTC (permalink / raw)
  To: David Miller; +Cc: nhorman, netdev, vyasevich, sri, linux-sctp, joe
In-Reply-To: <20120720.123109.718847527715087868.davem@davemloft.net>

On Fri, 20 Jul 2012 12:31:09 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:
[...]
> Just quote the commit message or similar.

Sure thing. I was going to clean up when I accidentally sent that email.
The previous ones were fine though :)
Sorry about that.
fbl

^ permalink raw reply

* [PATCH] ipv4: Remove redundant assignment
From: Lin Ming @ 2012-07-23 14:11 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

It is redundant to set no_addr and accept_local to 0 and then set them
with other values just after that.

Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
---
 net/ipv4/fib_frontend.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index f277cf0..8732cc7 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -258,7 +258,6 @@ static int __fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst,
 	fl4.flowi4_tos = tos;
 	fl4.flowi4_scope = RT_SCOPE_UNIVERSE;
 
-	no_addr = accept_local = 0;
 	no_addr = idev->ifa_list == NULL;
 
 	accept_local = IN_DEV_ACCEPT_LOCAL(idev);

^ permalink raw reply related

* Re: New commands to configure IOV features
From: Don Dutile @ 2012-07-23 14:03 UTC (permalink / raw)
  To: Chris Friesen
  Cc: Ben Hutchings, David Miller, yuvalmin, gregory.v.rose, netdev,
	linux-pci
In-Reply-To: <5009ECDF.4090305@genband.com>

On 07/20/2012 07:42 PM, Chris Friesen wrote:
> On 07/20/2012 02:01 PM, Ben Hutchings wrote:
>> On Fri, 2012-07-20 at 13:29 -0600, Chris Friesen wrote:
>
>>> Once the device exists, then domain-specific APIs would be used to
>>> configure it the same way that they would configure a physical device.
>>
>> To an extent, but not entirely.
>>
>> Currently, the assigned MAC address and (optional) VLAN tag for each
>> networking VF are configured via the PF net device (though this is done
>> through the rtnetlink API rather than ethtool).
>
> I actually have a use-case where the guest needs to be able to modify the MAC addresses of network devices that are actually VFs.
>
> The guest is bonding the network devices together, so the bonding driver in the guest expects to be able to set all the slaves to the same MAC address.
>
> As I read the ixgbe driver, this should be possible as long as the host hasn't explicitly set the MAC address of the VF. Is that correct?
>
> Chris

Interesting tug of war: hypervisors will want to set the macaddrs for security reasons,
                         some guests may want to set macaddr for (valid?) config reasons.

^ permalink raw reply

* Re: [PATCH] mlx4: Add support for EEH error recovery
From: Or Gerlitz @ 2012-07-23 13:45 UTC (permalink / raw)
  To: Kleber Sacilotto de Souza
  Cc: David Miller, netdev, jackm, yevgenyp, cascardo, brking, shlomop
In-Reply-To: <500D4F31.9020408@linux.vnet.ibm.com>

On 7/23/2012 4:18 PM, Kleber Sacilotto de Souza wrote:
> Exactly. The callbacks implemented are from standard PCI error recovery
> (Documentation/PCI/pci-error-recovery.txt) and the changes don't
> assume any specific platform. The code was tested only on powerpc
> systems [...]

So how did you test that? Using the kernel-provided error injection
support and a user space tool (which one?), or in another way? We quickly
tried here to inject errors using /sbin/aer-inject from
ras-utils-6.1-1.el6.x86_64 on a kernel built with

CONFIG_PCIEAER=y
CONFIG_PCIEAER_INJECT=m

and it failed to inject errors; see the details below.

Or.
> since I don't have any mlx4 card on other platforms; however,
> these changes shouldn't make error recovery any worse than the
> current state.

> # lspci | grep 08.00.1
> 08:00.1 Ethernet controller: Intel Corporation 82575EB Gigabit Network 
> Connection (rev 02)

> # cat /tmp/intel.aer
> AER
> BUS 8 DEV 0 FN 1
> COR_STATUS BAD_TLP
> HEADER_LOG 0 1 2 3

> # /sbin/aer-inject < /tmp/intel.aer
> Error: Failed to write, Invalid argument



> # strace -F -f /sbin/aer-inject < /tmp/intel.aer
> [...]

> open("/dev/aer_inject", O_WRONLY)       = 3
> write(3, "\10\0\1\0\0\0\0\0@\0\0\0\0\0\0\0\1\0\0\0\2\0\0\0\3\0\0\0", 
> 28) = -1 EINVAL (Invalid argument)
> write(2, "Error: ", 7Error: )                  = 7
> write(2, "Failed to write", 15Failed to write)         = 15
> write(2, ", Invalid argument\n", 19, Invalid argument
> )    = 19
> exit_group(-1)                          = ?
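
The 28 bytes in the failing write() above can be decoded by hand to confirm that the tool passed what the input file asked for. The sketch below assumes the record layout is three one-byte fields (bus, device, function), one byte of padding, then six little-endian 32-bit words (uncorrectable status, correctable status, four header log words); that layout and all names here are assumptions made for illustration, not taken from the thread.

```c
#include <stdint.h>

/* Assumed shape of the record written to /dev/aer_inject. */
struct aer_record {
	uint8_t  bus, dev, fn;
	uint32_t uncor_status;
	uint32_t cor_status;
	uint32_t header_log[4];
};

/* The buffer shown by strace, written out byte by byte. */
const unsigned char strace_sample[28] = {
	0x08, 0x00, 0x01, 0x00,   /* bus 8, dev 0, fn 1, padding */
	0x00, 0x00, 0x00, 0x00,   /* uncorrectable status */
	0x40, 0x00, 0x00, 0x00,   /* correctable status ('@' is 0x40) */
	0x00, 0x00, 0x00, 0x00,   /* header log word 0 */
	0x01, 0x00, 0x00, 0x00,   /* header log word 1 */
	0x02, 0x00, 0x00, 0x00,   /* header log word 2 */
	0x03, 0x00, 0x00, 0x00,   /* header log word 3 */
};

static uint32_t get_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decode a 28-byte record; returns 0 on success, -1 on a wrong length. */
int aer_decode(const unsigned char *buf, unsigned int len, struct aer_record *r)
{
	if (len != 28)
		return -1;
	r->bus = buf[0];
	r->dev = buf[1];
	r->fn = buf[2];                       /* buf[3] is assumed padding */
	r->uncor_status = get_le32(buf + 4);
	r->cor_status = get_le32(buf + 8);
	for (int i = 0; i < 4; i++)
		r->header_log[i] = get_le32(buf + 12 + 4 * i);
	return 0;
}
```

Under that assumed layout the buffer decodes to bus 8, device 0, function 1, correctable status 0x40 and header log 0 1 2 3, matching /tmp/intel.aer if 0x40 is the bad-TLP bit. That would suggest the EINVAL comes from the kernel side rather than from a malformed request, though the trace alone does not prove it.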

^ permalink raw reply

* Re: [PATCH net-next 4/4] forcedeth: advertise transmit time stamping
From: Willem de Bruijn @ 2012-07-23 13:27 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev, David Miller
In-Reply-To: <d95c3b5bf30e3a7d056ad7f9fb774bbf572dc217.1342976654.git.richardcochran@gmail.com>

On Sun, Jul 22, 2012 at 1:15 PM, Richard Cochran
<richardcochran@gmail.com> wrote:
> This driver now offers software transmit time stamping, so it should
> advertise that fact via ethtool. Compile tested only.
>
> Signed-off-by: Richard Cochran <richardcochran@gmail.com>

Acked-by: Willem de Bruijn <willemb@google.com>

> Cc: Willem de Bruijn <willemb@google.com>
> ---
>  drivers/net/ethernet/nvidia/forcedeth.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
> index 67edc2e..f45def0 100644
> --- a/drivers/net/ethernet/nvidia/forcedeth.c
> +++ b/drivers/net/ethernet/nvidia/forcedeth.c
> @@ -5182,6 +5182,7 @@ static const struct ethtool_ops ops = {
>         .get_ethtool_stats = nv_get_ethtool_stats,
>         .get_sset_count = nv_get_sset_count,
>         .self_test = nv_self_test,
> +       .get_ts_info = ethtool_op_get_ts_info,
>  };
>
>  /* The mgmt unit and driver use a semaphore to access the phy during init */
> --
> 1.7.2.5
>

^ permalink raw reply

* Re: [PATCH] mlx4: Add support for EEH error recovery
From: Kleber Sacilotto de Souza @ 2012-07-23 13:18 UTC (permalink / raw)
  To: David Miller; +Cc: ogerlitz, netdev, jackm, yevgenyp, cascardo, brking, shlomop
In-Reply-To: <20120722.171553.2139258607165498367.davem@davemloft.net>

On 07/22/2012 09:15 PM, David Miller wrote:

> From: Or Gerlitz <ogerlitz@mellanox.com>
> Date: Sun, 22 Jul 2012 13:26:32 +0300
> 
>> is there anything in the code you added which maybe implicitly
>> assumes PPC arch?
> 
> He implemented support for a standard PCI API in the kernel, he
> happened to test it on a particular platform, and I think that's
> the long and short of it.
> 


Exactly. The callbacks implemented are from standard PCI error recovery
(Documentation/PCI/pci-error-recovery.txt) and the changes don't
assume any specific platform. The code was tested only on powerpc
systems since I don't have any mlx4 card on other platforms; however,
these changes shouldn't make error recovery any worse than the
current state.

-- 
Kleber Sacilotto de Souza
IBM Linux Technology Center

^ permalink raw reply

* Re: [PATCH net-next V1 6/9] net/eipoib: Add sysfs support
From: Or Gerlitz @ 2012-07-23 12:55 UTC (permalink / raw)
  To: davem; +Cc: roland, netdev, ali, sean.hefty, shlomop, Erez Shitrit
In-Reply-To: <1342609202-32427-7-git-send-email-ogerlitz@mellanox.com>

On 7/18/2012 1:59 PM, Or Gerlitz wrote:
> The management interface for the driver uses sysfs entries. Via these sysfs entries the driver gets details on new VIFs to manage. The driver can enslave a new VIF (IPoIB cloned interface) or detach from it. Here are a few sysfs commands that are used to manage the driver, according to a few scenarios:
>
> 1. create new clone of IPoIB interface:
> 	$ echo .Y > /sys/class/net/ibX/create_child
> create new clone ibX.Y with the same pkey as ibX, for example:
> 	$ echo .1 > /sys/class/net/ib0/create_child
> will create new interface ib0.1
>
> 2. notify parent interface on new VIF to enslave:
> 	$ echo +ibX.Y > /sys/class/net/ethZ/eth/slaves
> where ethZ is the driver interface, for example:
> 	$ echo +ib0.1 > /sys/class/net/eth4/eth/slaves
> will enslave ib0.1 to eth4
>
> 3. notify parent interface on VIF details (mac and vlan)
> 	$ echo +ibX.Y <MAC address> > /sys/class/net/ethZ/eth/vifs
> for example:
> 	$ echo +ib0.1 00:02:c9:43:3b:f1 > /sys/class/net/eth4/eth/vifs

Hi Dave,

Following the comment you made on patch 1/9, we are modifying the operations
as follows:

#1 - create/delete a clone of an IPoIB device -- changed to use rtnl_link_ops

#2 - enslave/un-enslave an IPoIB device clone to an eIPoIB device -- changed
to support ndo_add_slave/ndo_del_slave on eIPoIB

Re #3, which creates an association (which we call a VIF) within the
eIPoIB driver between an IPoIB slave and a mac and vlan: we used sysfs, as
you can see above, and I wanted to ask about the correct way to do that.

One option we are considering is to add a new ndo operation, ndo_add_vif, to
be supported by eIPoIB, and to call it from a new netlink channel,
ifla_vif_mac_vlan. Does that make sense?

Or.
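
For illustration only (not part of the posted patch): a minimal userspace
sketch of how the "(+|-)ifname [mac]" command format used by the vifs entry
could be parsed. The names parse_vif_cmd, struct vif_cmd, IFNAME_MAX and
MAC_LEN are hypothetical; the sketch assumes a MAC is mandatory for '+' and
optional for '-', and it bounds every string conversion with a field width.

```c
#include <stdio.h>
#include <string.h>

#define IFNAME_MAX 16	/* hypothetical, mirrors the kernel's IFNAMSIZ */
#define MAC_LEN    6	/* hypothetical, mirrors ETH_ALEN */

struct vif_cmd {
	char op;			/* '+' to add, '-' to remove */
	char ifname[IFNAME_MAX];	/* slave name, e.g. "ib0.1" */
	unsigned char mac[MAC_LEN];	/* valid only when has_mac is set */
	int has_mac;
};

/*
 * Parse "(+|-)ifname [mac]". Returns 0 on success, -1 on malformed input.
 * The field widths keep each conversion inside its destination buffer.
 */
int parse_vif_cmd(const char *buf, struct vif_cmd *cmd)
{
	char mac_str[MAC_LEN * 3] = { 0 };	/* "xx:xx:xx:xx:xx:xx" + NUL */
	char tail;
	int n;

	memset(cmd, 0, sizeof(*cmd));

	/* %15[...] leaves room for the terminating NUL in ifname[16] */
	n = sscanf(buf, "%c%15[^ \t\n] %17s", &cmd->op, cmd->ifname, mac_str);
	if (n < 2 || (cmd->op != '+' && cmd->op != '-'))
		return -1;

	if (n == 3) {
		/* the trailing %c must NOT match, or there is junk after the MAC */
		if (sscanf(mac_str, "%2hhx:%2hhx:%2hhx:%2hhx:%2hhx:%2hhx%c",
			   &cmd->mac[0], &cmd->mac[1], &cmd->mac[2],
			   &cmd->mac[3], &cmd->mac[4], &cmd->mac[5],
			   &tail) != 6)
			return -1;
		cmd->has_mac = 1;
	}

	/* adding a VIF needs a MAC address; removing one does not */
	if (cmd->op == '+' && !cmd->has_mac)
		return -1;

	return 0;
}
```

An over-long interface name spills into the MAC field and is rejected there,
so it cannot be silently truncated.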



>
>
> 4. notify parent to release VIF:
>
> 	$ echo -ibX.Y > /sys/class/net/ethZ/eth/slaves
>
> where ethZ is the driver interface, for example:
>
>          $ echo -ib0.1 > /sys/class/net/eth4/eth/slaves
>
> will release ib0.1 from eth4
>
> 5. see the list of ipoib interfaces enslaved under eipoib interface,
>
> 	$ cat /sys/class/net/ethX/eth/vifs
>
> for example:
>
> 	$ cat /sys/class/net/eth4/eth/vifs
>
> 	SLAVE=ib0.1      MAC=9a:c2:1f:d7:3b:63 VLAN=N/A
> 	SLAVE=ib0.2      MAC=52:54:00:60:55:88 VLAN=N/A
> 	SLAVE=ib0.3      MAC=52:54:00:60:55:89 VLAN=N/A
>
> Signed-off-by: Erez Shitrit <erezsh@mellanox.co.il>
> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
> ---
>   drivers/net/eipoib/eth_ipoib_sysfs.c |  640 ++++++++++++++++++++++++++++++++++
>   1 files changed, 640 insertions(+), 0 deletions(-)
>   create mode 100644 drivers/net/eipoib/eth_ipoib_sysfs.c
>
> diff --git a/drivers/net/eipoib/eth_ipoib_sysfs.c b/drivers/net/eipoib/eth_ipoib_sysfs.c
> new file mode 100644
> index 0000000..c3fc121
> --- /dev/null
> +++ b/drivers/net/eipoib/eth_ipoib_sysfs.c
> @@ -0,0 +1,640 @@
> +/*
> + * Copyright (c) 2012 Mellanox Technologies. All rights reserved
> + *
> + * This software is available to you under a choice of one of two
> + * licenses.  You may choose to be licensed under the terms of the GNU
> + * General Public License (GPL) Version 2, available from the file
> + * COPYING in the main directory of this source tree, or the
> + * openfabric.org BSD license below:
> + *
> + *     Redistribution and use in source and binary forms, with or
> + *     without modification, are permitted provided that the following
> + *     conditions are met:
> + *
> + *      - Redistributions of source code must retain the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer.
> + *
> + *      - Redistributions in binary form must reproduce the above
> + *        copyright notice, this list of conditions and the following
> + *        disclaimer in the documentation and/or other materials
> + *        provided with the distribution.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/sched.h>
> +#include <linux/fs.h>
> +#include <linux/types.h>
> +#include <linux/string.h>
> +#include <linux/netdevice.h>
> +#include <linux/inetdevice.h>
> +#include <linux/in.h>
> +#include <linux/sysfs.h>
> +#include <linux/ctype.h>
> +#include <linux/inet.h>
> +#include <linux/rtnetlink.h>
> +#include <linux/etherdevice.h>
> +#include <net/net_namespace.h>
> +
> +#include "eth_ipoib.h"
> +
> +#define to_dev(obj)	container_of(obj, struct device, kobj)
> +#define to_parent(cd)	((struct parent *)(netdev_priv(to_net_dev(cd))))
> +#define MOD_NA_STRING		"N/A"
> +
> +#define _sprintf(p, buf, format, arg...)				\
> +((PAGE_SIZE - (int)(p - buf)) <= 0 ? 0 :				\
> +	scnprintf(p, PAGE_SIZE - (int)(p - buf), format, ## arg))\
> +
> +#define _end_of_line(_p, _buf)					\
> +do { if (_p - _buf) /* eat the leftover space */			\
> +		buf[_p - _buf - 1] = '\n';				\
> +} while (0)
> +
> +/* helper functions */
> +static int get_emac(u8 *mac, char *s)
> +{
> +	if (sscanf(s, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
> +		   mac + 0, mac + 1, mac + 2, mac + 3, mac + 4,
> +		   mac + 5) != 6)
> +		return -1;
> +
> +	return 0;
> +}
> +
> +static int get_imac(u8 *mac, char *s)
> +{
> +	if (sscanf(s, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx:%hhx:%hhx:"
> +		   "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx:%hhx:%hhx:"
> +		   "%hhx:%hhx:%hhx:%hhx",
> +		   mac + 0, mac + 1, mac + 2, mac + 3, mac + 4,
> +		   mac + 5, mac + 6, mac + 7, mac + 8, mac + 9,
> +		   mac + 10, mac + 11, mac + 12, mac + 13,
> +		   mac + 14, mac + 15, mac + 16, mac + 17,
> +		   mac + 18, mac + 19) != 20)
> +		return -1;
> +
> +	return 0;
> +}
> +
> +/* show/store functions per module (CLASS_ATTR) */
> +static ssize_t show_parents(struct class *cls, struct class_attribute *attr,
> +			    char *buf)
> +{
> +	char *p = buf;
> +	struct parent *parent;
> +
> +	rtnl_lock(); /* because of parent_dev_list */
> +
> +	list_for_each_entry(parent, &parent_dev_list, parent_list) {
> +		p += _sprintf(p, buf, "%s over IB port: %s\n",
> +			      parent->dev->name,
> +			      parent->ipoib_main_interface);
> +	}
> +	_end_of_line(p, buf);
> +
> +	rtnl_unlock();
> +	return (ssize_t)(p - buf);
> +}
> +
> +/* show/store functions per parent (DEVICE_ATTR) */
> +static ssize_t parent_show_neighs(struct device *d,
> +				  struct device_attribute *attr, char *buf)
> +{
> +	struct slave *slave;
> +	struct neigh *neigh;
> +	struct parent *parent = to_parent(d);
> +	char *p = buf;
> +
> +	read_lock_bh(&parent->lock);
> +	parent_for_each_slave(parent, slave) {
> +		list_for_each_entry(neigh, &slave->neigh_list, list) {
> +			p += _sprintf(p, buf, "SLAVE=%-10s EMAC=%pM IMAC=%pM:%pM:%pM:%.2x:%.2x\n",
> +				      slave->dev->name,
> +				      neigh->emac,
> +				      neigh->imac, neigh->imac + 6, neigh->imac + 12,
> +				      neigh->imac[18], neigh->imac[19]);
> +		}
> +	}
> +
> +	read_unlock_bh(&parent->lock);
> +
> +	_end_of_line(p, buf);
> +
> +	return (ssize_t)(p - buf);
> +}
> +
> +struct neigh *parent_get_neigh_cmd(char op,
> +				   char *ifname, u8 *remac, u8 *rimac)
> +{
> +	struct neigh *neigh_cmd;
> +
> +	neigh_cmd = kzalloc(sizeof *neigh_cmd, GFP_ATOMIC);
> +	if (!neigh_cmd) {
> +		pr_err("%s cannot allocate neigh struct\n", ifname);
> +		goto out;
> +	}
> +
> +	/*
> +	 * populate emac field so it can be used easily
> +	 * in neigh_cmd_find_by_mac()
> +	 */
> +	memcpy(neigh_cmd->emac, remac, ETH_ALEN);
> +	memcpy(neigh_cmd->imac, rimac, INFINIBAND_ALEN);
> +
> +	/* prepare the command as a string */
> +	sprintf(neigh_cmd->cmd, "%c%s %pM %pM:%pM:%pM:%.2x:%.2x",
> +		op, ifname, remac, rimac, rimac + 6, rimac + 12, rimac[18], rimac[19]);
> +out:
> +	return neigh_cmd;
> +}
> +
> +/* write_lock_bh(&parent->lock) must be held */
> +ssize_t __parent_store_neighs(struct device *d,
> +			      struct device_attribute *attr,
> +			      const char *buffer, size_t count)
> +{
> +	char command[IFNAMSIZ + 1] = { 0, };
> +	char emac_str[ETH_ALEN * 3] = { 0, };
> +	u8 emac[ETH_ALEN];
> +	char imac_str[INFINIBAND_ALEN * 3] = { 0, };
> +	u8 imac[INFINIBAND_ALEN];
> +	char *ifname;
> +	int found = 0, ret = count;
> +	struct slave *slave = NULL, *slave_tmp;
> +	struct neigh *neigh;
> +	struct parent *parent = to_parent(d);
> +
> +	sscanf(buffer, "%16s %17s %59s", command, emac_str, imac_str);
> +
> +	/* check ifname */
> +	ifname = command + 1;
> +	if ((strlen(command) <= 1) || !dev_valid_name(ifname) ||
> +	    (command[0] != '+' && command[0] != '-'))
> +		goto err_no_cmd;
> +
> +	/* check if ifname exist */
> +	parent_for_each_slave(parent, slave_tmp) {
> +		if (!strcmp(slave_tmp->dev->name, ifname)) {
> +			found = 1;
> +			slave = slave_tmp;
> +		}
> +	}
> +
> +	if (!found) {
> +		pr_err("%s could not find slave\n", ifname);
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	if (get_emac(emac, emac_str)) {
> +		pr_err("%s bad emac %s\n", ifname, emac_str);
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	if (get_imac(imac, imac_str)) {
> +		pr_err("%s bad imac %s\n", ifname, imac_str);
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	/* process command */
> +	if (command[0] == '+') {
> +		found = 0;
> +		list_for_each_entry(neigh, &slave->neigh_list, list) {
> +			if (!memcmp(neigh->emac, emac, ETH_ALEN))
> +				found = 1;
> +		}
> +
> +		if (found) {
> +			pr_err("%s: cannot update neigh, slave already has "
> +			       "this neigh mac %pM\n",
> +			       slave->dev->name, emac);
> +			ret = -EINVAL;
> +			goto out;
> +		}
> +
> +		neigh = kzalloc(sizeof *neigh, GFP_ATOMIC);
> +		if (!neigh) {
> +			pr_err("%s cannot allocate neigh struct\n",
> +			       slave->dev->name);
> +			ret = -ENOMEM;
> +			goto out;
> +		}
> +
> +		/* ready to go */
> +		pr_info("%s: slave %s neigh mac is set to %pM\n",
> +			ifname, parent->dev->name, emac);
> +		memcpy(neigh->emac, emac, ETH_ALEN);
> +		memcpy(neigh->imac, imac, INFINIBAND_ALEN);
> +
> +		list_add_tail(&neigh->list, &slave->neigh_list);
> +
> +		goto out;
> +	}
> +
> +	if (command[0] == '-') {
> +		found = 0;
> +		list_for_each_entry(neigh, &slave->neigh_list, list) {
> +			if (!memcmp(neigh->emac, emac, ETH_ALEN))
> +				found = 1;
> +		}
> +
> +		if (!found) {
> +			pr_err("%s cannot delete neigh mac %pM\n",
> +			       ifname, emac);
> +			ret = -EINVAL;
> +			goto out;
> +		}
> +
> +		list_del(&neigh->list);
> +		kfree(neigh);
> +
> +		goto out;
> +	}
> +
> +err_no_cmd:
> +	pr_err("%s USAGE: (-|+)ifname emac imac\n", DRV_NAME);
> +	ret = -EPERM;
> +
> +out:
> +	return ret;
> +}
> +
> +static ssize_t parent_store_neighs(struct device *d,
> +				   struct device_attribute *attr,
> +				   const char *buffer, size_t count)
> +{
> +	struct parent *parent = to_parent(d);
> +	ssize_t rc;
> +
> +	write_lock_bh(&parent->lock);
> +	rc = __parent_store_neighs(d, attr, buffer, count);
> +	write_unlock_bh(&parent->lock);
> +
> +	return rc;
> +}
> +
> +static DEVICE_ATTR(neighs, S_IRUGO | S_IWUSR, parent_show_neighs,
> +		   parent_store_neighs);
> +
> +static ssize_t parent_show_vifs(struct device *d,
> +				struct device_attribute *attr, char *buf)
> +{
> +	struct slave *slave;
> +	struct parent *parent = to_parent(d);
> +	char *p = buf;
> +
> +	read_lock_bh(&parent->lock);
> +	parent_for_each_slave(parent, slave) {
> +		if (is_zero_ether_addr(slave->emac)) {
> +			p += _sprintf(p, buf, "SLAVE=%-10s MAC=%-17s "
> +				      "VLAN=%s\n", slave->dev->name,
> +				      MOD_NA_STRING, MOD_NA_STRING);
> +		} else if (slave->vlan == VLAN_N_VID) {
> +			p += _sprintf(p, buf, "SLAVE=%-10s MAC=%pM VLAN=%s\n",
> +				      slave->dev->name,
> +				      slave->emac,
> +				      MOD_NA_STRING);
> +		} else {
> +			p += _sprintf(p, buf, "SLAVE=%-10s MAC=%pM VLAN=%d\n",
> +				      slave->dev->name,
> +				      slave->emac,
> +				      slave->vlan);
> +		}
> +	}
> +	read_unlock_bh(&parent->lock);
> +
> +	_end_of_line(p, buf);
> +
> +	return (ssize_t)(p - buf);
> +}
> +
> +static ssize_t parent_store_vifs(struct device *d,
> +				 struct device_attribute *attr,
> +				 const char *buffer, size_t count)
> +{
> +	char command[IFNAMSIZ + 1] = { 0, };
> +	char mac_str[ETH_ALEN * 3] = { 0, };
> +	char *ifname;
> +	u8 mac[ETH_ALEN];
> +	int found = 0, ret = count;
> +	struct slave *slave = NULL, *slave_tmp;
> +	struct parent *parent = to_parent(d);
> +
> +	sscanf(buffer, "%16s %17s", command, mac_str);
> +
> +	write_lock_bh(&parent->lock);
> +
> +	/* check ifname */
> +	ifname = command + 1;
> +	if ((strlen(command) <= 1) || !dev_valid_name(ifname) ||
> +	    (command[0] != '+' && command[0] != '-'))
> +		goto err_no_cmd;
> +
> +	/* check if ifname exist */
> +	parent_for_each_slave(parent, slave_tmp) {
> +		if (!strcmp(slave_tmp->dev->name, ifname)) {
> +			found = 1;
> +			slave = slave_tmp;
> +		}
> +	}
> +
> +	if (!found) {
> +		pr_err("%s could not find slave\n", ifname);
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	/* process command */
> +	if (command[0] == '+') {
> +		if (get_emac(mac, mac_str) || !is_valid_ether_addr(mac)) {
> +			pr_err("%s invalid mac input\n", ifname);
> +			ret = -EINVAL;
> +			goto out;
> +		}
> +
> +		if (!is_zero_ether_addr(slave->emac)) {
> +			pr_err("%s slave %s mac already set to %pM\n",
> +			       ifname, slave->dev->name, slave->emac);
> +			ret = -EINVAL;
> +			goto out;
> +		}
> +
> +		/* check another slave has this mac/vlan */
> +		found = 0;
> +		parent_for_each_slave(parent, slave_tmp) {
> +			if (!memcmp(slave_tmp->emac, mac, ETH_ALEN) &&
> +			    slave_tmp->vlan == slave->vlan) {
> +				pr_err("cannot update %s, slave %s already has"
> +				       " vlan 0x%x mac %pM\n",
> +				       parent->dev->name, slave->dev->name,
> +				       slave_tmp->vlan,
> +				       mac);
> +				ret = -EINVAL;
> +				goto out;
> +			}
> +		}
> +
> +		/* ready to go */
> +		pr_info("slave %s mac is set to %pM\n",
> +			ifname, mac);
> +
> +		memcpy(slave->emac, mac, ETH_ALEN);
> +		goto out;
> +	}
> +
> +	if (command[0] == '-') {
> +		if (is_zero_ether_addr(slave->emac)) {
> +			pr_err("%s slave mac already unset %pM\n",
> +			       ifname, slave->emac);
> +			ret = -EINVAL;
> +			goto out;
> +		}
> +
> +		pr_info("slave %s mac is unset (was %pM)\n",
> +			ifname, slave->emac);
> +
> +		goto out;
> +	}
> +
> +err_no_cmd:
> +	pr_err("%s USAGE: (-|+)ifname [mac]\n", DRV_NAME);
> +	ret = -EPERM;
> +
> +out:
> +	write_unlock_bh(&parent->lock);
> +
> +	return ret;
> +}
> +
> +static DEVICE_ATTR(vifs, S_IRUGO | S_IWUSR, parent_show_vifs,
> +		   parent_store_vifs);
> +
> +static ssize_t parent_show_slaves(struct device *d,
> +				  struct device_attribute *attr, char *buf)
> +{
> +	struct slave *slave;
> +	struct parent *parent = to_parent(d);
> +	char *p = buf;
> +
> +	read_lock_bh(&parent->lock);
> +	parent_for_each_slave(parent, slave)
> +		p += _sprintf(p, buf, "%s\n", slave->dev->name);
> +	read_unlock_bh(&parent->lock);
> +
> +	_end_of_line(p, buf);
> +
> +	return (ssize_t)(p - buf);
> +}
> +
> +static ssize_t parent_store_slaves(struct device *d,
> +				   struct device_attribute *attr,
> +				   const char *buffer, size_t count)
> +{
> +	char command[IFNAMSIZ + 1] = { 0, };
> +	char *ifname;
> +	int res, ret = count;
> +	struct slave *slave;
> +	struct net_device *dev = NULL;
> +	struct parent *parent = to_parent(d);
> +
> +	/* Quick sanity check -- is the parent interface up? */
> +	if (!(parent->dev->flags & IFF_UP)) {
> +		pr_warn("%s: doing slave updates when "
> +			"interface is down.\n", parent->dev->name);
> +	}
> +
> +	if (!rtnl_trylock()) /* because __dev_get_by_name */
> +		return restart_syscall();
> +
> +	sscanf(buffer, "%16s", command);
> +
> +	ifname = command + 1;
> +	if ((strlen(command) <= 1) || !dev_valid_name(ifname))
> +		goto err_no_cmd;
> +
> +	if (command[0] == '+') {
> +		/* Got a slave name in ifname. Is it already in the list? */
> +		dev = __dev_get_by_name(&init_net, ifname);
> +		if (!dev) {
> +			pr_warn("%s: Interface %s does not exist!\n",
> +				parent->dev->name, ifname);
> +			ret = -EINVAL;
> +			goto out;
> +		}
> +
> +		read_lock_bh(&parent->lock);
> +		parent_for_each_slave(parent, slave) {
> +			if (slave->dev == dev) {
> +				pr_err("%s ERR- Interface %s is already enslaved!\n",
> +				       parent->dev->name, dev->name);
> +				ret = -EPERM;
> +			}
> +		}
> +		read_unlock_bh(&parent->lock);
> +
> +		if (ret < 0)
> +			goto out;
> +
> +		pr_info("%s: adding slave %s\n",
> +			parent->dev->name, ifname);
> +
> +		res = parent_enslave(parent->dev, dev);
> +		if (res)
> +			ret = res;
> +
> +		goto out;
> +	}
> +
> +	if (command[0] == '-') {
> +		dev = NULL;
> +		parent_for_each_slave(parent, slave)
> +			if (strnicmp(slave->dev->name, ifname, IFNAMSIZ) == 0) {
> +				dev = slave->dev;
> +				break;
> +			}
> +
> +		if (dev) {
> +			pr_info("%s: removing slave %s\n",
> +				parent->dev->name, dev->name);
> +			res = parent_release_slave(parent->dev, dev);
> +			if (res) {
> +				ret = res;
> +				goto out;
> +			}
> +		} else {
> +			pr_warn("%s: unable to remove non-existent "
> +				"slave for parent %s.\n",
> +				ifname, parent->dev->name);
> +			ret = -ENODEV;
> +		}
> +		goto out;
> +	}
> +
> +err_no_cmd:
> +	pr_err("%s USAGE: (-|+)ifname\n", DRV_NAME);
> +	ret = -EPERM;
> +
> +out:
> +	rtnl_unlock();
> +	return ret;
> +}
> +
> +static DEVICE_ATTR(slaves, S_IRUGO | S_IWUSR, parent_show_slaves,
> +		   parent_store_slaves);
> +
> +/* sysfs create/destroy functions */
> +static struct attribute *per_parent_attrs[] = {
> +	&dev_attr_slaves.attr, /* DEVICE_ATTR(slaves..) */
> +	&dev_attr_vifs.attr,
> +	&dev_attr_neighs.attr,
> +	NULL,
> +};
> +
> +/* namespace support */
> +static const void *eipoib_namespace(struct class *cls,
> +				    const struct class_attribute *attr)
> +{
> +	const struct eipoib_net *eipoib_n =
> +		container_of(attr,
> +			     struct eipoib_net, class_attr_eipoib_interfaces);
> +	return eipoib_n->net;
> +}
> +
> +static struct attribute_group parent_group = {
> +	/* per parent sysfs files under: /sys/class/net/<IF>/eth/.. */
> +	.name = "eth",
> +	.attrs = per_parent_attrs
> +};
> +
> +int create_slave_symlinks(struct net_device *master,
> +			  struct net_device *slave)
> +{
> +	char linkname[IFNAMSIZ+7];
> +	int ret = 0;
> +
> +	ret = sysfs_create_link(&(slave->dev.kobj), &(master->dev.kobj),
> +				"eth_parent");
> +	if (ret)
> +		return ret;
> +
> +	sprintf(linkname, "slave_%s", slave->name);
> +	ret = sysfs_create_link(&(master->dev.kobj), &(slave->dev.kobj),
> +				linkname);
> +	return ret;
> +
> +}
> +
> +void destroy_slave_symlinks(struct net_device *master,
> +			    struct net_device *slave)
> +{
> +	char linkname[IFNAMSIZ+7];
> +
> +	sysfs_remove_link(&(slave->dev.kobj), "eth_parent");
> +	sprintf(linkname, "slave_%s", slave->name);
> +	sysfs_remove_link(&(master->dev.kobj), linkname);
> +}
> +
> +static struct class_attribute class_attr_eth_ipoib_interfaces = {
> +	.attr = {
> +		.name = "eth_ipoib_interfaces",
> +		.mode = S_IWUSR | S_IRUGO,
> +	},
> +	.show = show_parents,
> +	.namespace = eipoib_namespace,
> +};
> +
> +/* per module sysfs file under: /sys/class/net/eth_ipoib_interfaces */
> +int mod_create_sysfs(struct eipoib_net *eipoib_n)
> +{
> +	int rc;
> +	/* defined in CLASS_ATTR(eth_ipoib_interfaces..) */
> +	eipoib_n->class_attr_eipoib_interfaces =
> +		class_attr_eth_ipoib_interfaces;
> +
> +	sysfs_attr_init(&eipoib_n->class_attr_eipoib_interfaces.attr);
> +
> +	rc = netdev_class_create_file(&eipoib_n->class_attr_eipoib_interfaces);
> +	if (rc)
> +		pr_err("%s failed to create sysfs (rc %d)\n",
> +		       eipoib_n->class_attr_eipoib_interfaces.attr.name, rc);
> +
> +	return rc;
> +}
> +
> +void mod_destroy_sysfs(struct eipoib_net *eipoib_n)
> +{
> +	netdev_class_remove_file(&eipoib_n->class_attr_eipoib_interfaces);
> +}
> +
> +int parent_create_sysfs_entry(struct parent *parent)
> +{
> +	struct net_device *dev = parent->dev;
> +	int rc;
> +
> +	rc = sysfs_create_group(&(dev->dev.kobj), &parent_group);
> +	if (rc)
> +		pr_info("failed to create sysfs group\n");
> +
> +	return rc;
> +}
> +
> +void parent_destroy_sysfs_entry(struct parent *parent)
> +{
> +	struct net_device *dev = parent->dev;
> +
> +	sysfs_remove_group(&(dev->dev.kobj), &parent_group);
> +}

^ permalink raw reply

* Re: [PATCH] sctp: Make "Invalid Stream Identifier" ERROR follows SACK when bundling
From: Neil Horman @ 2012-07-23 12:14 UTC (permalink / raw)
  To: xufeng zhang
  Cc: xufengzhang.main, vyasevich, sri, davem, linux-sctp, netdev,
	linux-kernel
In-Reply-To: <500CB74A.4040300@windriver.com>

On Mon, Jul 23, 2012 at 10:30:34AM +0800, xufeng zhang wrote:
> On 07/23/2012 08:49 AM, Neil Horman wrote:
> >
> >Not sure I understand how you came into this error.  If we get an invalid
> >stream, we issue an SCTP_REPORT_TSN side effect, followed by an SCTP_CMD_REPLY
> >which sends the error chunk.  The reply goes through
> >sctp_outq_tail->sctp_outq_chunk->sctp_outq_transmit_chunk->sctp_outq_append_chunk.
> >That last function checks to see if a sack is already part of the packet, and if
> >there isn't one, appends one, using the updated tsn map.
> Yes, you are right, but consider the invalid stream identifier's
> DATA chunk is the first
> DATA chunk in the association which will need SACK immediately.
> Here is what I thought of the scenario:
>     sctp_sf_eat_data_6_2()
>         -->sctp_eat_data()
>             -->sctp_make_op_error()
>             -->sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(err))
>             -->sctp_outq_tail()          /* First enqueue ERROR chunk */
>         -->sctp_add_cmd_sf(commands, SCTP_CMD_GEN_SACK, SCTP_FORCE())
>             -->sctp_gen_sack()
>                 -->sctp_make_sack()
>                 -->sctp_add_cmd_sf(commands, SCTP_CMD_REPLY,
> SCTP_CHUNK(sack))
>                 -->sctp_outq_tail()          /* Then enqueue SACK chunk */
> 
> So SACK chunk is enqueued after ERROR chunk.
Ah, I see.  Since the ERROR and SACK chunks are both control chunks, and since
we explicitly add the SACK to the control queue instead of going through the
bundle path in sctp_packet_append_chunk, the ordering comes out wrong.

Ok, so the problem makes sense.  I think the solution could be a lot easier
though.  IIRC SACK chunks always live at the head of a packet, so why not just
special case it in sctp_outq_tail?  I.e. instead of doing a list_add_tail, in
the else clause of sctp_outq_tail check the chunk_hdr->type to see if it's
SCTP_CID_SACK.  If it is, use list_add (which inserts at the head) rather than
list_add_tail.  I think that will fix up both the COOKIE_ECHO and ESTABLISHED
cases, won't it?  And then you won't have to keep track of extra state in the
packet configuration.

Regards
Neil
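
For illustration only: a minimal userspace sketch of the special case
suggested above, where a SACK is inserted at the head of the control queue and
every other control chunk is appended at the tail. The list helpers are
simplified stand-ins with the same semantics as the kernel's list_add() and
list_add_tail(); struct chunk, chunk_of and control_queue_add are hypothetical
names, not the SCTP stack's own.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_insert(struct list_head *entry,
			struct list_head *prev, struct list_head *next)
{
	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
}

/* Inserts right after the head, i.e. at the front of the queue. */
void list_add(struct list_head *entry, struct list_head *head)
{
	list_insert(entry, head, head->next);
}

/* Inserts right before the head, i.e. at the back of the queue. */
void list_add_tail(struct list_head *entry, struct list_head *head)
{
	list_insert(entry, head->prev, head);
}

/* Chunk type values as assigned by RFC 4960. */
enum chunk_type { CHUNK_SACK = 3, CHUNK_ERROR = 9 };

struct chunk {
	struct list_head list;	/* must stay first: chunk_of() relies on it */
	enum chunk_type type;
};

struct chunk *chunk_of(struct list_head *entry)
{
	return (struct chunk *)entry;
}

/* Hypothetical enqueue helper: a SACK jumps the queue, the rest append. */
void control_queue_add(struct list_head *queue, struct chunk *chunk)
{
	if (chunk->type == CHUNK_SACK)
		list_add(&chunk->list, queue);
	else
		list_add_tail(&chunk->list, queue);
}
```

With this rule, enqueueing the ERROR first and the SACK second still leaves
the SACK in front, which is the ordering the original report asks for.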

^ permalink raw reply

* [PATCH] USB: plusb: Add support for PL-2501
From: kyak @ 2012-07-23 11:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: Sergei Shtylyov, linux-usb, netdev, Greg Kroah-Hartman

From: Mikhail Peselnik <peselnik@gmail.com>

This patch adds support for PL-2501 by adding the appropriate USB
IDs. This chip is used in several USB 'Easy Transfer' cables.

Signed-off-by: Mikhail Peselnik <peselnik@gmail.com>
Tested-by: Mikhail Peselnik <peselnik@gmail.com>
---
Now with proper sign-offs and right people in cc and unwrapped lines.
plusb driver (drivers/net/usb/plusb.c) doesn't recognize PL2501 chip.
Since PL2501 uses the same code as PL2301/PL2302 (PL2501 works in
compatibility mode with PL2301/PL2302), the fix is trivial and
attached as a patch.

Just to note: the patch is not mine; it can be found here and there on
the Internet.
I've tested the patch and it works great.

Thank you.

--- linux-3.5/drivers/net/usb/plusb.c.orig	2012-07-22 21:06:41.905802795 +0400
+++ linux-3.5/drivers/net/usb/plusb.c	2012-07-22 21:07:00.345552404 +0400
@@ -107,7 +107,7 @@ static int pl_reset(struct usbnet *dev)
  }

  static const struct driver_info	prolific_info = {
-	.description =	"Prolific PL-2301/PL-2302/PL-25A1",
+	.description =	"Prolific PL-2301/PL-2302/PL-25A1/PL-2501",
  	.flags =	FLAG_POINTTOPOINT | FLAG_NO_SETINT,
  		/* some PL-2302 versions seem to fail usb_set_interface() */
  	.reset =	pl_reset,
@@ -139,6 +139,9 @@ static const struct usb_device_id	produc
  }, {
  	USB_DEVICE(0x050d, 0x258a),     /* Belkin F5U258/F5U279 (PL-25A1) */
  	.driver_info =  (unsigned long) &prolific_info,
+}, {
+	USB_DEVICE(0x067b, 0x2501),     /* PL-2501 */
+	.driver_info =  (unsigned long) &prolific_info,
  },

  	{ },		// END
@@ -158,5 +161,5 @@ static struct usb_driver plusb_driver =
  module_usb_driver(plusb_driver);

  MODULE_AUTHOR("David Brownell");
-MODULE_DESCRIPTION("Prolific PL-2301/2302/25A1 USB Host to Host Link Driver");
+MODULE_DESCRIPTION("Prolific PL-2301/2302/25A1/2501 USB Host to Host Link Driver");
  MODULE_LICENSE("GPL");
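
For illustration only: a minimal userspace sketch of the kind of table lookup
that the new USB_DEVICE() entry enables. It lists only the two vendor/product
pairs visible in this diff; struct usb_id and plusb_lookup are hypothetical
names, and the real matching is done by the USB core, not by the driver.

```c
#include <stddef.h>
#include <stdint.h>

struct usb_id {
	uint16_t vendor;
	uint16_t product;
	const char *note;	/* NULL marks the end of the table */
};

/* Only the IDs that appear in the diff above. */
static const struct usb_id plusb_ids[] = {
	{ 0x050d, 0x258a, "Belkin F5U258/F5U279 (PL-25A1)" },
	{ 0x067b, 0x2501, "PL-2501" },
	{ 0, 0, NULL }
};

/* Returns the matching entry, or NULL if the device is not listed. */
const struct usb_id *plusb_lookup(uint16_t vendor, uint16_t product)
{
	const struct usb_id *id;

	for (id = plusb_ids; id->note != NULL; id++) {
		if (id->vendor == vendor && id->product == product)
			return id;
	}
	return NULL;
}
```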

^ permalink raw reply

* Re: [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list
From: Neil Horman @ 2012-07-23 11:40 UTC (permalink / raw)
  To: Gao feng
  Cc: Srivatsa S. Bhat, eric.dumazet, davem, linux-kernel, netdev,
	mark.d.rustad, john.r.fastabend, lizefan
In-Reply-To: <500CA599.6030907@cn.fujitsu.com>

On Mon, Jul 23, 2012 at 09:15:05AM +0800, Gao feng wrote:
> On 2012-07-20 00:27, Srivatsa S. Bhat wrote:
> > After commit ef209f15 (net: cgroup: fix access the unallocated memory in
> > netprio cgroup), boot fails with the following NULL pointer dereference:
> > 
> > Initializing cgroup subsys devices
> > Initializing cgroup subsys freezer
> > Initializing cgroup subsys net_cls
> > Initializing cgroup subsys blkio
> > Initializing cgroup subsys perf_event
> > Initializing cgroup subsys net_prio
> > BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
> > IP: [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> > PGD 0
> > Oops: 0000 [#1] SMP
> > CPU 0
> > Modules linked in:
> > 
> > Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7-mandeep #1 IBM IBM System x -[7870C4Q]-/68Y8033
> > RIP: 0010:[<ffffffff8145e8d6>]  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> > RSP: 0000:ffffffff81a01ea8  EFLAGS: 00010213
> > RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81aa70a0
> > RBP: ffffffff81a01ed8 R08: 0000000000000000 R09: 0000000000000000
> > R10: ffff8808ff8641c0 R11: 6e697a696c616974 R12: 0000000000000001
> > R13: ffff8808ff8641c0 R14: 0000000000000000 R15: 0000000000093970
> > FS:  0000000000000000(0000) GS:ffff8808ffc00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > CR2: 0000000000000698 CR3: 0000000001a0b000 CR4: 00000000000006b0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
> > Stack:
> >  ffffffff81a01eb8 ffffffff818060ff ffffffff81d75ec8 ffffffff81aa8960
> >  ffffffff81aa8960 ffffffff81b4c2c0 ffffffff81a01ef8 ffffffff81b1cb78
> >  0000000000000018 0000000000000048 ffffffff81a01f18 ffffffff81b1ce13
> > Call Trace:
> >  [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
> >  [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
> >  [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
> >  [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
> >  [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
> >  [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
> > Code: 01 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 75 1b eb 73 0f 1f 00 48 8b 83 f0 00 00 00 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 74 5a <48> 8b 83 88 07 00 00 48 85 c0 74 de 44 3b 60 10 76 d8 44 89 e6
> > RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
> >  RSP <ffffffff81a01ea8>
> > CR2: 0000000000000698
> > ---[ end trace a7919e7f17c0a725 ]---
> > Kernel panic - not syncing: Attempted to kill the idle task!
> > 
> > The code corresponds to:
> > 
> > update_netdev_tables():
> >         for_each_netdev(&init_net, dev) {
> >                 map = rtnl_dereference(dev->priomap);  <---- HERE
> > 
> > 
> > The list head is initialized in netdev_init(), which is called much
> > later than cgrp_create(). So the problem is that we are calling
> > update_netdev_tables() way too early (in cgrp_create()), which will
> > end up traversing the not-yet-circular linked list. So at some point,
> > the dev pointer will become NULL and hence dev->priomap becomes an
> > invalid access.
> > 
> > To fix this, just remove the update_netdev_tables() function entirely,
> > since it appears that write_update_netdev_table() will handle things
> > just fine.
> 
> The reason I added update_netdev_tables in cgrp_create is to avoid additional
> bounds checking when we access dev->priomap.priomap.
> 
> Eric, can we revert commit 91c68ce2b26319248a32d7baa1226f819d283758 now?
> I think it's safe enough to access priomap without a bounds check.
> 
> Thanks
> 

I think it's probably safe, yes, but let's leave it there for just a bit.  It's not
hurting anything, and I'd like to look into getting Srivatsa's patch in first.
Neil
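
For illustration only: a minimal userspace sketch of the failure mode described
in the quoted report, where a list head that was never initialized is still
zero-filled rather than circular, so walking it follows a NULL next pointer.
This matches the oops: the faulting instruction reads 0x788(%rbx) with RBX =
ffffffffffffff10 (that is, -0xf0), which gives the reported CR2 of 0x698.
The guard below is a hypothetical way to show the condition; the fix proposed
in this thread removes update_netdev_tables() instead of adding a check.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

/* An initialized, empty list points back at itself. */
void list_head_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

void list_head_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->next = head;
	entry->prev = head->prev;
	head->prev->next = entry;
	head->prev = entry;
}

/* A zero-filled head (e.g. static storage before init) has next == NULL. */
int list_head_ready(const struct list_head *head)
{
	return head->next != NULL;
}

/*
 * Count the entries on the list. Returns -1 instead of walking a head
 * that was never initialized, where the loop would dereference NULL.
 */
int list_count(const struct list_head *head)
{
	const struct list_head *pos;
	int n = 0;

	if (!list_head_ready(head))
		return -1;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;

	return n;
}
```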

^ permalink raw reply

* Re: [patch net-next] rtnl: do not include num_rx_queues into msg when CONFIG_RPS is not set
From: Jiri Pirko @ 2012-07-23 11:03 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, sfr
In-Reply-To: <20120722.122746.1524911782331447848.davem@davemloft.net>

Sun, Jul 22, 2012 at 09:27:46PM CEST, davem@davemloft.net wrote:
>From: Jiri Pirko <jiri@resnulli.us>
>Date: Sun, 22 Jul 2012 09:26:42 +0200
>
>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>
>Jiri, I've been patiently waiting for you to ACK Mark A. Greear's
>patch fixing this bug, did you not see it at all?

Oh I missed that. That email somehow skipped my inbox and went only to
ml folder...

>
>http://patchwork.ozlabs.org/patch/172393/
>
>I'm just going to apply it since you're not watching for fixes
>for code you've changed.

^ permalink raw reply

* Re: [PATCH] USB: plusb: Add support for PL-2501
From: Sergei Shtylyov @ 2012-07-23 10:25 UTC (permalink / raw)
  To: kyak
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, Greg Kroah-Hartman,
	linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <alpine.LNX.2.02.1207222138410.1183@bas>

Hello.

On 22-07-2012 21:42, kyak wrote:

> From: Mikhail Peselnik <peselnik-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

> This patch adds support for PL-2501 by adding the appropriate USB
> IDs. This chip is used in several USB 'Easy Transfer' cables.

> Signed-off-by: Mikhail Peselnik <peselnik-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Tested-by: Mikhail Peselnik <peselnik-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
> Now with proper sign-offs and right people in cc.
> plusb driver (drivers/net/usb/plusb.c) doesn't recognize PL2501 chip.
> Since PL2501 uses the same code as PL2301/PL2302 (PL2501 works in
> compatibility mode with PL2301/PL2302), the fix is trivial and
> attached as a patch.

> Just to note: the patch is not mine, it can be found here and there on
> Internet.
> I've tested the patch and it works great.

> Thank you.

> --- linux-3.5/drivers/net/usb/plusb.c.orig      2012-07-22
> 21:06:41.905802795 +0400
> +++ linux-3.5/drivers/net/usb/plusb.c   2012-07-22 21:07:00.345552404 +0400
[...]
> @@ -158,5 +161,5 @@ static struct usb_driver plusb_driver =
>   module_usb_driver(plusb_driver);
>
>   MODULE_AUTHOR("David Brownell");
> -MODULE_DESCRIPTION("Prolific PL-2301/2302/25A1 USB Host to Host Link Driver");
> +MODULE_DESCRIPTION("Prolific PL-2301/2302/25A1/2501 USB Host to Host
> Link Driver");

    Your patch is line wrapped. Seems easy to fix though...

WBR, Sergei

^ permalink raw reply

* Re: flush cache according to 'preferred life time'
From: Gao feng @ 2012-07-23  9:44 UTC (permalink / raw)
  To: BALAKUMARAN KANNAN; +Cc: netdev@vger.kernel.org
In-Reply-To: <4A71D24947E78D43BC584A7CD4391A41017DE4FB@SIXPRD0410MB359.apcprd04.prod.outlook.com>

On 2012-07-23 16:36, BALAKUMARAN KANNAN wrote:
> Thank you, Gao-san. Your patch is helpful; I will try it. I am also facing another problem: test case 145 in the ND section of the TAHI test suite fails if gc_interval is 30. The test case is as follows:
>  * The tester node (tn) sends an RA with curhoplimit 64.
>  * tn sends an ICMP_REQUEST and checks that the ICMP_REPLY from the nut has hoplimit 64. (This is fine in my case.)
>  * Then tn sends an RA with curhoplimit 0. (It should be ignored.)
>  * Then tn sends an ICMP_REQUEST again and checks whether the hoplimit in the ICMP_REPLY from the nut remains 64. (It changes to 255 if a cache entry is present, but once I changed gc_interval to 1, this test case passes.)
> Can you please explain the reason?
> 

Hi

Can you test with the latest kernel?
Maybe this bug has already been fixed.

Thanks

^ permalink raw reply

* RE: [PATCH] Crash in tun
From: David Laight @ 2012-07-23  9:43 UTC (permalink / raw)
  To: Al Viro, David Miller
  Cc: mikulas, eric.dumazet, maxk, vtun, netdev, Nicholas A. Bellinger,
	linux-sctp
In-Reply-To: <20120721075518.GY31729@ZenIV.linux.org.uk>

> 	BTW, speaking of struct file treatment related to sockets -
> there's this piece of code in iscsi:
>         /*
>          * The SCTP stack needs struct socket->file.
>          */
>         if ((np->np_network_transport == ISCSI_SCTP_TCP) ||
>             (np->np_network_transport == ISCSI_SCTP_UDP)) {
>                 if (!new_sock->file) {
>                         new_sock->file = kzalloc(
>                                         sizeof(struct file), GFP_KERNEL);
> 
> For one thing, as far as I can see it's not true - sctp does *not*
> depend on socket->file being non-NULL; it does, in one place,
> check socket->file->f_flags for O_NONBLOCK, but there it treats
> NULL socket->file as "flag not set".

The SCTP code has certainly looked at file->f_flags unconditionally;
we had to allocate a 'struct file' for our in-kernel socket code.
We set sock->file = NULL before the sock_release() call, so
hopefully we don't suffer the 'side effects'.
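
To make the difference concrete, here is a minimal sketch of the NULL-tolerant check described in the quoted text, where a missing file is treated as "O_NONBLOCK not set". The types `struct file_stub` and `struct socket_stub` and the helper `sock_is_nonblocking()` are hypothetical stand-ins for illustration; this is not the actual SCTP code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <fcntl.h>   /* O_NONBLOCK */

/* Hypothetical stand-ins for the kernel's struct file / struct socket. */
struct file_stub {
    unsigned int f_flags;
};

struct socket_stub {
    struct file_stub *file;  /* may be NULL for an in-kernel socket */
};

/*
 * Return true only if the socket has a file attached and that file
 * has O_NONBLOCK set. A NULL file is treated as "flag not set"
 * instead of being dereferenced.
 */
static bool sock_is_nonblocking(const struct socket_stub *sock)
{
    return sock->file != NULL && (sock->file->f_flags & O_NONBLOCK) != 0;
}
```

An unconditional `sock->file->f_flags` access, by contrast, would need the dummy 'struct file' allocation discussed above.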

	David

^ permalink raw reply

* Re: [net-next RFC V5 4/5] virtio_net: multiqueue support
From: Sasha Levin @ 2012-07-23  9:28 UTC (permalink / raw)
  To: Jason Wang
  Cc: krkumar2, habanero, mashirle, kvm, Michael S. Tsirkin, netdev,
	linux-kernel, virtualization, edumazet, tahm, jwhan, davem, sri
In-Reply-To: <500CE72B.2040101@redhat.com>

On 07/23/2012 07:54 AM, Jason Wang wrote:
> On 07/21/2012 08:02 PM, Sasha Levin wrote:
>> On 07/20/2012 03:40 PM, Michael S. Tsirkin wrote:
>>>> -    err = init_vqs(vi);
>>>>> +    if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
>>>>> +        vi->has_cvq = true;
>>>>> +
>>> How about we disable multiqueue if there's no cvq?
>>> Will make logic a bit simpler, won't it?
>> Multiqueue doesn't really depend on cvq. Does this added complexity really justify adding an artificial limit?
>>
> 
> Yes, it does not depend on cvq. The cvq is just used to negotiate the number of queues a guest wishes to use, which is really useful (at least for now). Since multiqueue cannot outperform a single queue in every kind of workload or benchmark, we want the guest driver to use a single queue by default even when multiqueue is enabled by management software, and let the user enable it through ethtool. That way the user will not see a regression when switching to a multiqueue-capable driver and backend.

Why would you limit it to a single vq if the user has specified a different number of vqs (>1) in the virtio-net device config?

^ permalink raw reply

