* RE: [PATCH 00/13] drivers: hv: kvp
From: KY Srinivasan @ 2012-07-03 15:24 UTC (permalink / raw)
To: Ben Hutchings
Cc: Olaf Hering, Greg KH, apw@canonical.com,
devel@linuxdriverproject.org, virtualization@lists.osdl.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <20120702195721.GE1894@decadent.org.uk>
> -----Original Message-----
> From: Ben Hutchings [mailto:ben@decadent.org.uk]
> Sent: Monday, July 02, 2012 3:57 PM
> To: KY Srinivasan
> Cc: Olaf Hering; Greg KH; apw@canonical.com; devel@linuxdriverproject.org;
> virtualization@lists.osdl.org; linux-kernel@vger.kernel.org;
> netdev@vger.kernel.org
> Subject: Re: [PATCH 00/13] drivers: hv: kvp
>
> On Mon, Jul 02, 2012 at 03:22:25PM +0000, KY Srinivasan wrote:
> >
> >
> > > -----Original Message-----
> > > From: Olaf Hering [mailto:olaf@aepfle.de]
> > > Sent: Thursday, June 28, 2012 10:24 AM
> > > To: KY Srinivasan
> > > Cc: Greg KH; apw@canonical.com; devel@linuxdriverproject.org;
> > > virtualization@lists.osdl.org; linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH 00/13] drivers: hv: kvp
> > >
> > > On Tue, Jun 26, KY Srinivasan wrote:
> > >
> > > > > From: Greg KH [mailto:gregkh@linuxfoundation.org]
> > > > > The fact that it was Red Hat specific was the main part, this should be
> > > > > done in a standard way, with standard tools, right?
> > > >
> > > > The reason I asked this question was to make sure I address these
> > > > issues in addition to whatever I am debugging now. I use the standard
> > > > tools and calls to retrieve all the IP configuration. As I look at
> > > > each distribution the files they keep persistent IP configuration
> > > > Information is different and that is the reason I chose to start with
> > > > RedHat. If there is a standard way to store the configuration, I will
> > > > do that.
> > >
> > >
> > > KY,
> > >
> > > instead of using system() in kvp_get_ipconfig_info and kvp_set_ip_info,
> > > wouldnt it be easier to call an external helper script which does all
> > > the distribution specific work? Just define some API to pass values to
> > > the script, and something to read values collected by the script back
> > > into the daemon.
> >
> > On the "Get" side I mostly use standard commands/APIs to get all the information:
> >
> > 1) IP address information and subnet mask: getifaddrs()
> > 2) DNS information: Parsing /etc/resolv.conf
> > 3) /sbin/ip command for all the routing information
>
> If you're interested in the *current* configuration then (1) and (3)
> are OK but you should really use the rtnetlink API.
>
> However, I suspect that Hyper-V assumes that current and persistent
> configuration are the same thing, which is obviously not true in
> general on Linux. But if NetworkManager is running then you can
> assume they are.
I am only interested in the currently active information. Why do you
recommend the use of the rtnetlink API over the "ip" command? If I am not
mistaken, the ip command uses netlink to get the information.
>
> > 4) Parse /etc/sysconfig/network-scripts/ifcfg-ethx for boot protocol
This is the only information that requires parsing a distro-specific configuration file. Do
you have any suggestion on how I may get this information in a distro-independent way?
> >
> > As you can see, all but the boot protocol is gathered using the "standard distro
> > independent mechanisms. I was looking at NetworkManager cli and it looks
> > like I could gather all the information except the boot protocol information. I am
> > not sure how to gather the boot protocol information in a distro independent
> > fashion.
> >
> > On the SET side, I need to persistently store the settings in an appropriate
> > configuration file and flush these settings down so that the interface is
> > appropriately configured. It is here that I am struggling to find a distro
> > independent way of doing things. It would be great if I can use NetworkManager
> > cli (nmcli) to accomplish this. Any help here would be greatly appreciated.
> [...]
>
> What was wrong with the NetworkManager D-Bus API I pointed you at?
> I don't see how it makes sense to use nmcli as an API.
I saw some documentation that claimed that nmcli could be used to accomplish
everything that can be done with the GUI. I am looking for a portable way
to configure an interface. If nmcli can do that, I would use it. With
regard to the D-Bus API, I have only taken a cursory look at it. I am still
evaluating my options.
Regards,
K. Y
^ permalink raw reply
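For item 1 of the "Get" list above (IP address and subnet mask via getifaddrs()), a minimal user-space sketch might look like the following. It is not taken from kvp_daemon; the function name print_ipv4_config is made up for illustration, and only IPv4 entries are handled.

```c
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: print "<ifname> <address> <netmask>" for every IPv4
 * address reported by getifaddrs(). Returns the number of entries printed,
 * or -1 if getifaddrs() fails. */
static int print_ipv4_config(FILE *out)
{
	struct ifaddrs *ifaddr, *ifa;
	int count = 0;

	if (getifaddrs(&ifaddr) == -1)
		return -1;

	for (ifa = ifaddr; ifa != NULL; ifa = ifa->ifa_next) {
		char addr[INET_ADDRSTRLEN], mask[INET_ADDRSTRLEN];

		/* Entries without an address or netmask carry nothing to report. */
		if (ifa->ifa_addr == NULL || ifa->ifa_netmask == NULL)
			continue;
		if (ifa->ifa_addr->sa_family != AF_INET)
			continue;

		inet_ntop(AF_INET,
			  &((struct sockaddr_in *)ifa->ifa_addr)->sin_addr,
			  addr, sizeof(addr));
		inet_ntop(AF_INET,
			  &((struct sockaddr_in *)ifa->ifa_netmask)->sin_addr,
			  mask, sizeof(mask));
		fprintf(out, "%s %s %s\n", ifa->ifa_name, addr, mask);
		count++;
	}

	freeifaddrs(ifaddr);
	return count;
}
```

Calling print_ipv4_config(stdout) lists the currently active addresses only; it says nothing about the persistent configuration or the boot protocol, which is the part the thread is still debating.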
* Re: [PATCH] etherdevice: introduce broadcast_ether_addr
From: Johannes Berg @ 2012-07-03 15:16 UTC (permalink / raw)
To: Joe Perches; +Cc: netdev, linux-wireless
In-Reply-To: <1341328402.2164.3.camel@joe2Laptop>
On Tue, 2012-07-03 at 08:13 -0700, Joe Perches wrote:
> On Tue, 2012-07-03 at 12:16 +0200, Johannes Berg wrote:
> > From: Johannes Berg <johannes.berg@intel.com>
> >
> > A lot of code has either the memset or an
> > inefficient copy from a static array that
> > contains the all-ones broadcast address.
> > Introduce broadcast_ether_addr() to fill
> > an address with all ones, making the code
> > clearer and allowing us to get rid of the
> > various constant arrays.
> []
> > diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
> []
> > +static inline void broadcast_ether_addr(u8 *addr)
> > +{
> > + memset(addr, 0xff, ETH_ALEN);
> > +}
>
> I think this sort of patch should come as the first
> patch in a series with some example conversions.
>
> It might be too easy to confuse is_broadcast_ether_addr
> with this function name too. Maybe set_broadcast_ether_addr
> might be better.
Well, it's void so that'd be a compiler error :-)
Also, it's more like random_ether_addr()
johannes
^ permalink raw reply
* Re: [PATCH] etherdevice: introduce broadcast_ether_addr
From: Joe Perches @ 2012-07-03 15:13 UTC (permalink / raw)
To: Johannes Berg; +Cc: netdev, linux-wireless
In-Reply-To: <1341310587.5131.2.camel@jlt3.sipsolutions.net>
On Tue, 2012-07-03 at 12:16 +0200, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
>
> A lot of code has either the memset or an
> inefficient copy from a static array that
> contains the all-ones broadcast address.
> Introduce broadcast_ether_addr() to fill
> an address with all ones, making the code
> clearer and allowing us to get rid of the
> various constant arrays.
[]
> diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
[]
> +static inline void broadcast_ether_addr(u8 *addr)
> +{
> + memset(addr, 0xff, ETH_ALEN);
> +}
I think this sort of patch should come as the first
patch in a series with some example conversions.
It might be too easy to confuse is_broadcast_ether_addr
with this function name too. Maybe set_broadcast_ether_addr
might be better.
I really don't see an issue with using memset though.
Everyone already knows what that does.
^ permalink raw reply
* Re: [PATCH 00/13] drivers: hv: kvp
From: Stephen Hemminger @ 2012-07-03 15:03 UTC (permalink / raw)
To: Olaf Hering; +Cc: Greg KH, apw, devel, linux-kernel, netdev, KY Srinivasan
In-Reply-To: <20120703132049.GA10663@aepfle.de>
> On Mon, Jul 02, KY Srinivasan wrote:
>
> > While I toyed with your proposal, I feel it just pushes the problem
> > out of the daemon code - we would still need to write distro specific
> > scripts. If this approach is something that everybody is comfortable
> > with, I can take a stab at implementing that.
>
> Until NetworkManager is feature complete and until every distro is using
> NetworkManager per default the kvp_daemon needs distro specific code to
> get and set network related settings.
> Doing it with an external script will simplify debugging and changes to
> the code.
Although Network Manager is a good tool for what it does,
it is not appropriate for every distro. It is overkill
in embedded systems, and its GUI dependency makes it unmanageable
on servers.
^ permalink raw reply
* Re: [PATCH] sctp: refactor sctp_packet_append_chunk and clenup some memory leaks
From: Vlad Yasevich @ 2012-07-03 14:39 UTC (permalink / raw)
To: Neil Horman; +Cc: netdev, David S. Miller, linux-sctp
In-Reply-To: <1341259164-7396-1-git-send-email-nhorman@tuxdriver.com>
On 07/02/2012 03:59 PM, Neil Horman wrote:
> While doing some recent work on sctp sack bundling I noted that
> sctp_packet_append_chunk was pretty inefficient. Specifially, it was called
> recursively while trying to bundle auth and sack chunks. Because of that we
> call sctp_packet_bundle_sack and sctp_packet_bundle_auth a total of 4 times for
> every call to sctp_packet_append_chunk, knowing that at least 3 of those calls
> will do nothing.
>
> So lets refactor sctp_packet_bundle_auth to have an outer part that does the
> attempted bundling, and an inner part that just does the chunk appends. This
> saves us several calls per iteration that we just don't need.
>
> Also, noticed that the auth and sack bundling fail to free the chunks they
> allocate if the append fails, so make sure we add that in
>
> Signed-off-by: Neil Horman<nhorman@tuxdriver.com>
> CC: Vlad Yasevich<vyasevich@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
-vlad
> CC: "David S. Miller"<davem@davemloft.net>
> CC: linux-sctp@vger.kernel.org
> ---
> net/sctp/output.c | 80 +++++++++++++++++++++++++++++++++++------------------
> 1 files changed, 53 insertions(+), 27 deletions(-)
>
> diff --git a/net/sctp/output.c b/net/sctp/output.c
> index 0de6cd5..0b62f6c 100644
> --- a/net/sctp/output.c
> +++ b/net/sctp/output.c
> @@ -64,6 +64,8 @@
> #include<net/sctp/checksum.h>
>
> /* Forward declarations for private helpers. */
> +static sctp_xmit_t __sctp_packet_append_chunk(struct sctp_packet *packet,
> + struct sctp_chunk *chunk);
> static sctp_xmit_t sctp_packet_can_append_data(struct sctp_packet *packet,
> struct sctp_chunk *chunk);
> static void sctp_packet_append_data(struct sctp_packet *packet,
> @@ -224,7 +226,10 @@ static sctp_xmit_t sctp_packet_bundle_auth(struct sctp_packet *pkt,
> if (!auth)
> return retval;
>
> - retval = sctp_packet_append_chunk(pkt, auth);
> + retval = __sctp_packet_append_chunk(pkt, auth);
> +
> + if (retval != SCTP_XMIT_OK)
> + sctp_chunk_free(auth);
>
> return retval;
> }
> @@ -256,48 +261,31 @@ static sctp_xmit_t sctp_packet_bundle_sack(struct sctp_packet *pkt,
> asoc->a_rwnd = asoc->rwnd;
> sack = sctp_make_sack(asoc);
> if (sack) {
> - retval = sctp_packet_append_chunk(pkt, sack);
> + retval = __sctp_packet_append_chunk(pkt, sack);
> + if (retval != SCTP_XMIT_OK) {
> + sctp_chunk_free(sack);
> + goto out;
> + }
> asoc->peer.sack_needed = 0;
> if (del_timer(timer))
> sctp_association_put(asoc);
> }
> }
> }
> +out:
> return retval;
> }
>
> +
> /* Append a chunk to the offered packet reporting back any inability to do
> * so.
> */
> -sctp_xmit_t sctp_packet_append_chunk(struct sctp_packet *packet,
> - struct sctp_chunk *chunk)
> +static sctp_xmit_t __sctp_packet_append_chunk(struct sctp_packet *packet,
> + struct sctp_chunk *chunk)
> {
> sctp_xmit_t retval = SCTP_XMIT_OK;
> __u16 chunk_len = WORD_ROUND(ntohs(chunk->chunk_hdr->length));
>
> - SCTP_DEBUG_PRINTK("%s: packet:%p chunk:%p\n", __func__, packet,
> - chunk);
> -
> - /* Data chunks are special. Before seeing what else we can
> - * bundle into this packet, check to see if we are allowed to
> - * send this DATA.
> - */
> - if (sctp_chunk_is_data(chunk)) {
> - retval = sctp_packet_can_append_data(packet, chunk);
> - if (retval != SCTP_XMIT_OK)
> - goto finish;
> - }
> -
> - /* Try to bundle AUTH chunk */
> - retval = sctp_packet_bundle_auth(packet, chunk);
> - if (retval != SCTP_XMIT_OK)
> - goto finish;
> -
> - /* Try to bundle SACK chunk */
> - retval = sctp_packet_bundle_sack(packet, chunk);
> - if (retval != SCTP_XMIT_OK)
> - goto finish;
> -
> /* Check to see if this chunk will fit into the packet */
> retval = sctp_packet_will_fit(packet, chunk, chunk_len);
> if (retval != SCTP_XMIT_OK)
> @@ -339,6 +327,44 @@ finish:
> return retval;
> }
>
> +/* Append a chunk to the offered packet reporting back any inability to do
> + * so.
> + */
> +sctp_xmit_t sctp_packet_append_chunk(struct sctp_packet *packet,
> + struct sctp_chunk *chunk)
> +{
> + sctp_xmit_t retval = SCTP_XMIT_OK;
> + __u16 chunk_len = WORD_ROUND(ntohs(chunk->chunk_hdr->length));
> +
> + SCTP_DEBUG_PRINTK("%s: packet:%p chunk:%p\n", __func__, packet,
> + chunk);
> +
> + /* Data chunks are special. Before seeing what else we can
> + * bundle into this packet, check to see if we are allowed to
> + * send this DATA.
> + */
> + if (sctp_chunk_is_data(chunk)) {
> + retval = sctp_packet_can_append_data(packet, chunk);
> + if (retval != SCTP_XMIT_OK)
> + goto finish;
> + }
> +
> + /* Try to bundle AUTH chunk */
> + retval = sctp_packet_bundle_auth(packet, chunk);
> + if (retval != SCTP_XMIT_OK)
> + goto finish;
> +
> + /* Try to bundle SACK chunk */
> + retval = sctp_packet_bundle_sack(packet, chunk);
> + if (retval != SCTP_XMIT_OK)
> + goto finish;
> +
> + retval = __sctp_packet_append_chunk(packet, chunk);
> +
> +finish:
> + return retval;
> +}
> +
> /* All packets are sent to the network through this function from
> * sctp_outq_tail().
> *
^ permalink raw reply
* [PATCH 1/1] atl1c: fix issue of transmit queue 0 timed out
From: Ren, Cloud @ 2012-07-03 13:27 UTC (permalink / raw)
To: davem, netdev, linux-kernel; +Cc: qca-linux-team, nic-devel, Cloud Ren
From: Cloud Ren <cjren@qca.qualcomm.com>
Some people report that atl1c can cause a system hang with the following
kernel trace info:
---------------------------------------
WARNING: at.../net/sched/sch_generic.c:258 dev_watchdog+0x1db/0x1d0()
...
NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out
...
---------------------------------------
This is caused by calling netif_stop_queue when the cable link is down.
So remove the netif_stop_queue call, because link_watch will take care of it.
Signed-off-by: xiong <xiong@qca.qualcomm.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Cloud Ren <cjren@qca.qualcomm.com>
---
drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index 85717cb..7901831 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -261,7 +261,6 @@ static void atl1c_check_link_status(struct atl1c_adapter *adapter)
if ((phy_data & BMSR_LSTATUS) == 0) {
/* link down */
netif_carrier_off(netdev);
- netif_stop_queue(netdev);
hw->hibernate = true;
if (atl1c_reset_mac(hw) != 0)
if (netif_msg_hw(adapter))
--
1.7.7
^ permalink raw reply related
* Re: [PATCH 00/13] drivers: hv: kvp
From: Olaf Hering @ 2012-07-03 13:20 UTC (permalink / raw)
To: KY Srinivasan; +Cc: Greg KH, apw, devel, linux-kernel, netdev
In-Reply-To: <426367E2313C2449837CD2DE46E7EAF9155EF399@SN2PRD0310MB382.namprd03.prod.outlook.com>
On Mon, Jul 02, KY Srinivasan wrote:
> While I toyed with your proposal, I feel it just pushes the problem
> out of the daemon code - we would still need to write distro specific
> scripts. If this approach is something that everybody is comfortable
> with, I can take a stab at implementing that.
Until NetworkManager is feature complete and until every distro is using
NetworkManager per default the kvp_daemon needs distro specific code to
get and set network related settings.
Doing it with an external script will simplify debugging and changes to
the code.
Olaf
^ permalink raw reply
* RE: [PATCH 1/1] atl1c: fix issue of transmit queue 0 timed out
From: Huang, Xiong @ 2012-07-03 13:04 UTC (permalink / raw)
To: Ren, Cloud, davem@davemloft.net, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Cc: qca-linux-team, nic-devel
In-Reply-To: <1341322056-26582-1-git-send-email-cjren@qca.qualcomm.com>
Cloud, why does your patch contain 'From: Cloud Ren <cjren@qca.qualcomm.com>'?
I don't find it in other people's patches.
Doesn't David Miller think your time is wrong?
> -----Original Message-----
> From: Ren, Cloud
> Sent: Tuesday, July 03, 2012 21:28
> To: davem@davemloft.net; netdev@vger.kernel.org; linux-
> kernel@vger.kernel.org
> Cc: qca-linux-team; nic-devel; Ren, Cloud
> Subject: [PATCH 1/1] atl1c: fix issue of transmit queue 0 timed out
>
> From: Cloud Ren <cjren@qca.qualcomm.com>
>
> some people report atl1c could cause system hang with following kernel trace
> info:
> ---------------------------------------
> WARNING: at.../net/sched/sch_generic.c:258 dev_watchdog+0x1db/0x1d0() ...
> NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out ...
> ---------------------------------------
> This is caused by netif_stop_queue calling when cable Link is down.
> So remove netif_stop_queue, because link_watch will take it over.
>
> Signed-off-by: xiong <xiong@qca.qualcomm.com>
> Cc: stable <stable@vger.kernel.org>
> Signed-off-by: Cloud Ren <cjren@qca.qualcomm.com>
> ---
> drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 1 -
> 1 files changed, 0 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> index 85717cb..7901831 100644
> --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> @@ -261,7 +261,6 @@ static void atl1c_check_link_status(struct
> atl1c_adapter *adapter)
> if ((phy_data & BMSR_LSTATUS) == 0) {
> /* link down */
> netif_carrier_off(netdev);
> - netif_stop_queue(netdev);
> hw->hibernate = true;
> if (atl1c_reset_mac(hw) != 0)
> if (netif_msg_hw(adapter))
> --
> 1.7.7
^ permalink raw reply
* speed specific port cost calculation in br_if.c and how to make it generic based on 802.1d Table-17-3?
From: Parav.Pandit @ 2012-07-03 12:40 UTC (permalink / raw)
To: netdev; +Cc: Parav.Pandit
Hi,
I am trying to add support for 40G and 100G speeds to the bridging portion of the stack.
In net/bridge/br_if.c, the function port_cost() has hard-coded values of 2, 4, 19 and 100 for speeds of 10G, 1G, 100M and 10 Mbps respectively.
The comment says these are based on the 802.1d standard.
I am referring to 802.1d-2004, Table 17-3 (Port Path Cost values).
It gives port cost values of 2000, 20000 and 200000 for speeds of 10G, 1G and 100 Mbps respectively.
This makes sense to me, as the port cost is inversely proportional to, and a scalar function of, the speed.
Can anyone please guide me on:
1. How is the current path/port cost calculated, so that I can extend it to other speeds in a generic way if possible?
2. How can I incorporate other speed settings in a generic way based on the 802.1d-2004 spec, Table 17-3?
Current code snippet:
/* Determine initial path cost based on speed.
 * using recommendations from 802.1d standard
 *
 * Since driver might sleep need to not be holding any locks.
 */
static int port_cost(struct net_device *dev)
{
	struct ethtool_cmd ecmd;

	if (!__ethtool_get_settings(dev, &ecmd)) {
		switch (ethtool_cmd_speed(&ecmd)) {
		case SPEED_10000:
			return 2;
		case SPEED_1000:
			return 4;
		case SPEED_100:
			return 19;
		case SPEED_10:
			return 100;
		}
	}

	/* Old silly heuristics based on name */
	if (!strncmp(dev->name, "lec", 3))
		return 7;

	if (!strncmp(dev->name, "plip", 4))
		return 2500;

	return 100;	/* assume old 10Mbps */
}
Regards,
Parav Pandit
^ permalink raw reply
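The 802.1d-2004 values quoted in the message above (2000, 20000 and 200000 for 10G, 1G and 100 Mbps) fit the rule cost = 20,000,000 / speed in Mb/s, so a generic calculation could be sketched as below. This is only an illustration under that assumption: stp_port_cost_2004 is a hypothetical name, not a kernel function, and the fallback for an unknown speed simply mirrors the "assume old 10Mbps" behaviour of port_cost().

```c
#include <stdint.h>

/* Hypothetical helper, not the kernel's port_cost(): derive a path cost from
 * the link speed in Mb/s, assuming the 802.1d-2004 Table 17-3 values follow
 * cost = 20,000,000 / speed_in_Mbps. */
static uint32_t stp_port_cost_2004(uint32_t speed_mbps)
{
	const uint64_t ref = 20000000ULL;	/* cost x speed, in Mb/s */
	uint64_t cost;

	if (speed_mbps == 0)
		return 2000000;	/* unknown speed: treat as 10 Mb/s */

	cost = ref / speed_mbps;
	if (cost < 1)
		cost = 1;	/* never return a zero cost */

	return (uint32_t)cost;
}
```

Under this rule 40G maps to 500 and 100G to 200. Note that these costs are on a different scale from the legacy 2/4/19/100 values in the snippet, so switching scales would change the costs that existing bridges compute for their ports.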
* Re: [PATCH net-next 09/15] net: bus: Add garbage collector for AF_BUS sockets.
From: Alban Crequy @ 2012-07-03 12:11 UTC (permalink / raw)
To: Ben Hutchings
Cc: Vincent Sanders, netdev, linux-kernel, David S. Miller,
Javier Martinez Canillas
In-Reply-To: <1341251063.2590.5.camel@bwh-desktop.uk.solarflarecom.com>
Mon, 2 Jul 2012 18:44:23 +0100,
Ben Hutchings <bhutchings@solarflare.com> wrote :
> On Fri, 2012-06-29 at 17:45 +0100, Vincent Sanders wrote:
> > From: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
> >
> > This patch adds a garbage collector for AF_BUS sockets.
> [...]
> > +struct sock *bus_get_socket(struct file *filp)
> > +{
> > + struct sock *u_sock = NULL;
> > + struct inode *inode = filp->f_path.dentry->d_inode;
> > +
> > + /*
> > + * Socket ?
> > + */
> > + if (S_ISSOCK(inode->i_mode) && !(filp->f_mode &
> > FMODE_PATH)) {
> > + struct socket *sock = SOCKET_I(inode);
> > + struct sock *s = sock->sk;
> > +
> > + /*
> > + * PF_BUS ?
> > + */
> > + if (s && sock->ops && sock->ops->family == PF_BUS)
> > + u_sock = s;
> > + }
> > + return u_sock;
> > +}
> [...]
>
> What about references cycles involving both AF_BUS and AF_UNIX
> sockets? I think you must either specifically prevent passing AF_UNIX
> sockets through AF_BUS sockets, or make a single garbage collector
> handle them both.
Indeed. Thanks for the feedback.
As far as I know, the current users of fd passing in D-Bus are Bluez
and Ofono and they pass AF_BLUETOOTH sockets. There might be
others, I am not sure what is in the wild. Passing AF_UNIX sockets in
D-Bus would be useful for Telepathy (for Tubes and File Transfer).
So I would like to be able to pass AF_UNIX sockets through AF_BUS
sockets.
I wrote the following small program to test this bug based on the
previous code <https://lkml.org/lkml/2010/11/25/8> referred by commit
25888e30:
http://people.collabora.com/~alban/d/2012/07/fd-passing/cmsg.c
The effect of the bug is to reach the file limit
in /proc/sys/fs/file-max and print "VFS: file-max limit 101771 reached"
even when the maximum of file descriptors per process ("ulimit -n") is
small.
I am not sure what is the best way to fix this. The easiest could be to
move the garbage collector related fields (recursion_level,
gc_candidate, etc.) from struct unix_sock and struct bus_sock to
struct sock and make a generic garbage collector for all sockets.
Best regards,
Alban
^ permalink raw reply
* Re: [PATCH net-next 06/10] {NET,IB}/mlx4: Add device managed flow steering firmware API
From: Or Gerlitz @ 2012-07-03 11:10 UTC (permalink / raw)
To: David Miller
Cc: bhutchings, roland, yevgenyp, oren, netdev, hadarh, Amir Vadai
In-Reply-To: <20120702.180458.2296165153479998120.davem@davemloft.net>
On 7/3/2012 4:04 AM, David Miller wrote:
>
> Just in case you guys _really_ and _truly_ are so unable to think
> outside the box that you can't come up with something reasonable, I'll
> get you started with two ideas:
>
> 1) A special "chipset" dummy netdev that a special class of ethtool
> commands can run on to set these things.
>
> 2) A "chipset" genetlink family with suitable operations and
> attributes.
Dave,
Thanks for trying to address the need here. As I wrote you, we've removed
the module param from the patch-set and will submit V1 without this. Once
the comments are over and hopefully the patches are accepted, we'll see
what can/need to be done for allowing that flexibility.
Or.
>
>
> In both cases appropriate mechanisms are added to make for keys that
> are used for chipset matching, and device drivers simply register
> a notifier handler that is called on two occaisions:
>
> 1) When settings are changed.
>
> 2) Upon initial handler registry, to acquire the initial settings.
^ permalink raw reply
* Re: [PATCH v2 net-next 1/2] r8169: support RTL8106E
From: Francois Romieu @ 2012-07-03 11:01 UTC (permalink / raw)
To: Hayes Wang; +Cc: netdev, linux-kernel
In-Reply-To: <1341221002-1522-1-git-send-email-hayeswang@realtek.com>
Hayes Wang <hayeswang@realtek.com> :
> Support the new chip RTL8106E.
>
> Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Almost-Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Simple bidirectional traffic worked fine (rsync + wget + ping) both
with and without tx checksumming offload.
Wol g or u worked as well. Link was ok and traffic flowing after a
suspend to ram.
No problem with module removal / insertion / device up / traffic loop.
> +#define FIRMWARE_8106E_1 "rtl_nic/rtl8106e-1.fw"
I did not notice it. Was it submitted ?
It was obviously not required for testing :o)
> @@ -1933,6 +1941,8 @@ static void rtl8169_get_mac_version(struct rtl8169_private *tp,
> { 0x7c800000, 0x30000000, RTL_GIGA_MAC_VER_11 },
>
> /* 8101 family. */
> + { 0x7cf00000, 0x44900000, RTL_GIGA_MAC_VER_39 },
> + { 0x7c800000, 0x44800000, RTL_GIGA_MAC_VER_39 },
> { 0x7c800000, 0x44000000, RTL_GIGA_MAC_VER_37 },
> { 0x7cf00000, 0x40b00000, RTL_GIGA_MAC_VER_30 },
> { 0x7cf00000, 0x40a00000, RTL_GIGA_MAC_VER_30 },
Realtek's 1.022.00 8101 driver only maps { 0x7c800000; 0x44800000 } to
a generic device - if at all - and it maps { 0x7cf00000; 0x44800000 }
to a different chipset (namely CFG_METHOD_15 where RTL_GIGA_MAC_VER_39
is CFG_METHOD_16).
Why should both drivers diverge ?
--
Ueimor
^ permalink raw reply
* Re: [PATCH 0/5] rtcache remove respin
From: David Miller @ 2012-07-03 10:56 UTC (permalink / raw)
To: netdev
In-Reply-To: <20120701.051556.541026724350825709.davem@davemloft.net>
From: David Miller <davem@davemloft.net>
Date: Sun, 01 Jul 2012 05:15:56 -0700 (PDT)
> From: David Miller <davem@davemloft.net>
> Date: Sun, 01 Jul 2012 05:02:43 -0700 (PDT)
>
>> On a SPARC T3-1:
>>
>> 1) Output route lookup: ~2800 cycles
>> 2) Input route lookups: ~3000 cycles (rpfilter=0)
>> ~4300 cycles (rpfilter=1)
>
> Out of curiosity I got rid of the local table and made all routes
> go into the main table and those numbers above become:
>
> 1) Output route lookup: ~2500 cycles
> 2) Input route lookups: ~2800 cycles (rpfilter=0)
> ~4100 cycles (rpfilter=1)
>
And with the neighbour patches I posted tonight these numbers are:
1) Output route lookup: ~2150 cycles
2) Input route lookups: ~2600 cycles (rpfilter=0)
~3700 cycles (rpfilter=1)
^ permalink raw reply
* Re: [PATCH 1/1] atl1c: fix issue of transmit queue 0 timed out
From: David Miller @ 2012-07-03 10:23 UTC (permalink / raw)
To: cjren; +Cc: netdev, linux-kernel, qca-linux-team, nic-devel
In-Reply-To: <1341322056-26582-1-git-send-email-cjren@qca.qualcomm.com>
From: "Ren, Cloud" <cjren@qca.qualcomm.com>
Date: Tue, 3 Jul 2012 10:27:36 -0300
Please fix whatever you are using to set the dates in your patch emails.
Your date here is in the future compared to all of the other patches
posted in the hours since you posted yours.
This screws up the patch queue in patchwork and therefore I really
need you to correct this.
^ permalink raw reply
* [PATCH] etherdevice: introduce broadcast_ether_addr
From: Johannes Berg @ 2012-07-03 10:16 UTC (permalink / raw)
To: netdev; +Cc: linux-wireless
From: Johannes Berg <johannes.berg@intel.com>
A lot of code has either the memset or an
inefficient copy from a static array that
contains the all-ones broadcast address.
Introduce broadcast_ether_addr() to fill
an address with all ones, making the code
clearer and allowing us to get rid of the
various constant arrays.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
include/linux/etherdevice.h | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
index 3d406e0..6da05bb 100644
--- a/include/linux/etherdevice.h
+++ b/include/linux/etherdevice.h
@@ -138,6 +138,17 @@ static inline void random_ether_addr(u8 *addr)
}
/**
+ * broadcast_ether_addr - Assign broadcast address
+ * @addr: Pointer to a six-byte array containing the Ethernet address
+ *
+ * Assign the broadcast address to the given address array.
+ */
+static inline void broadcast_ether_addr(u8 *addr)
+{
+ memset(addr, 0xff, ETH_ALEN);
+}
+
+/**
* eth_hw_addr_random - Generate software assigned random Ethernet and
* set device flag
* @dev: pointer to net_device structure
--
1.7.10
^ permalink raw reply related
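To make the two approaches from the patch description concrete, here is a small user-space sketch: one function fills an address with memset() as the proposed helper does, and one compares against a constant all-ones array of the kind the patch wants to make unnecessary for assignment. The names fill_broadcast and is_broadcast are stand-ins for illustration, not the kernel functions, and ETH_ALEN is redefined locally with the same value as the kernel constant.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6	/* length of an Ethernet address, as in the kernel */

/* Stand-in for the proposed helper: set all six bytes to 0xff. */
static void fill_broadcast(uint8_t *addr)
{
	memset(addr, 0xff, ETH_ALEN);
}

/* Stand-in check: compare against a constant all-ones address. */
static int is_broadcast(const uint8_t *addr)
{
	static const uint8_t bcast[ETH_ALEN] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff
	};

	return memcmp(addr, bcast, ETH_ALEN) == 0;
}
```

The setter returns void while the check returns a truth value, which is the distinction behind the remark in the thread that mixing the two up would be caught by the compiler.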
* RFC: (now non Base64) replace packets already in queue
From: Erdt, Ralph @ 2012-07-03 10:02 UTC (permalink / raw)
To: netdev@vger.kernel.org; +Cc: Eric Dumazet, Nicolas de Pesloüan, Rick Jones
In-Reply-To: <FB112703C4930F4ABEBB5B763F96491139379DC7@MAILSERV2A.lorien.fkie.fgan.de>
Hello,
I found that my email program sent the email as UTF-8 / Base64. I'm sorry for the inconvenience. I hope it's OK that I repeat the content so that everyone can follow it. I will condense the whole discussion (thanks to Eric Dumazet, Rick Jones and Nicolas de Pesloüan):
-
I'm writing a kernel module (net/sched) which replaces packets in the queue. I would be glad to hear your opinion.
Background:
We are working with a proprietary wireless network (not 802.11). The specifications:
Range: Kilometers.
Speed: <=9,600 bps (no Megs, Gigs or Kilo!) (*).
RTT (standard ping): idle network: 1.5 *seconds*; network under load: minutes
Other: shared.
Connection: loosely to a Linux box, but without Linux driver.
Driver: a user-space program, which opens a virtual network device and sends
the packet from the virtual network device to the device.
Queue: Neither the device nor the user-space program has a queue. The program waits
for the ACK from the device before sending the next packet. This is
possible, because the wireless is so lame...
(*you remember the good ol' times with modems over telephone lines? When the
internet was called BBS? And how it suddenly felt when the BBS started using
ANSI? That was comfortable compared to our problem..)
With such a very low bandwidth network it's hard (rather: impossible) to get all
packets sent. (TCP isn't possible, so we are sending everything as UDP.)
Some of the packets contain information which becomes obsolete over time, e.g.
(GPS) positions, which are sent periodically. If the application sends a new
packet while an old position packet is still in the queue, the old packet is
obsolete and can be dropped. But just dropping the old packet and queuing the
new packet at the tail would result in never sending a packet of this type.
So I've written a tc-qdisc scheduler module which replaces packets in the queue
on enqueuing when these properties hold:
- UDPv4
- not fragmented
- (TOS & bitmask) = givenCompare; (bitmask and compare are adjustable)
- same source IP
- same destination IP
- same destination port
- same TOS
So the packet does get sent eventually, but with the current information.
(The code is tiny - just a file (and Kconfig, etc.))
Why not use "codel"? Because codel will drop packets "randomly" ("random" not for the
protocol, but for the application). It's made to reduce the flow speed.
But we don't have a flow, only periodic sensor data.
My qdisc won't drop random packets. It will reduce the traffic by intelligently
replacing packets in the queue. Surely the application must handle this. But
in such a network an administrator has to configure the queues, and he knows the
applications.
Why not add a tun/tap interface and do everything in user space?
As mentioned, the device "driver" is a user-space program which creates a tun/tap
interface. We CAN mix the code in there, but separating the work has a few benefits:
- separating the work with clear interfaces is always a good idea.
- having a qdisc will allow complex TC rules using this qdisc - it's fully compatible.
(In fact, we have been given guidelines. We can handle all of the guidelines with the
existing TC classes, except the replace thing.)
- administrators who know TC can work with this qdisc without any problems
My question is: Should I do the work to create and release a kernel patch and
improve it over time, or is this just our internal code, which I can leave in
its current state? I know our module won't be used widely (too special), but I'm
sure there are a few people out there who would be thankful for this.
Do you have any other comments?
Greetings
Ralph Erdt
^ permalink raw reply
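The replacement conditions listed in the RFC above can be written down as a small predicate. The sketch below is not the module's code, which was not posted in this message; struct pkt_key and both function names are hypothetical, and a real qdisc would read these fields from the skb's IP and UDP headers rather than from a flattened struct.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flattened view of the header fields the RFC lists. */
struct pkt_key {
	bool     is_udp4;	/* IPv4 packet carrying UDP */
	bool     is_fragment;	/* any IP fragment */
	uint8_t  tos;		/* IPv4 TOS byte */
	uint32_t saddr;		/* source IP */
	uint32_t daddr;		/* destination IP */
	uint16_t dport;		/* UDP destination port */
};

/* A packet takes part in replacement only if it is unfragmented UDPv4
 * and its TOS matches the configured (TOS & bitmask) == compare rule. */
static bool pkt_replaceable(const struct pkt_key *k, uint8_t mask, uint8_t cmp)
{
	return k->is_udp4 && !k->is_fragment && (k->tos & mask) == cmp;
}

/* True if 'incoming' should replace 'queued': both are replaceable and
 * agree on source IP, destination IP, destination port and TOS. */
static bool pkt_replaces(const struct pkt_key *queued,
			 const struct pkt_key *incoming,
			 uint8_t mask, uint8_t cmp)
{
	return pkt_replaceable(queued, mask, cmp) &&
	       pkt_replaceable(incoming, mask, cmp) &&
	       queued->saddr == incoming->saddr &&
	       queued->daddr == incoming->daddr &&
	       queued->dport == incoming->dport &&
	       queued->tos == incoming->tos;
}
```

On a match, the enqueue path would put the new packet in the old packet's position in the queue, so the slot keeps its place while its payload is refreshed, which is the behaviour the RFC describes.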
* Re: [PATCH] netem: fix rate extension and drop accounting
From: Eric Dumazet @ 2012-07-03 9:54 UTC (permalink / raw)
To: David Miller
Cc: netdev, Hagen Paul Pfeifer, Yuchung Cheng, Andreas Terzis,
Mark Gordon
In-Reply-To: <1341307524.2583.115.camel@edumazet-glaptop>
On Tue, 2012-07-03 at 11:25 +0200, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> commit 7bc0f28c7a0c (netem: rate extension) did wrong maths when packet
> is enqueued while queue is not empty.
>
> Result is unexpected cumulative delays
>
> # tc qd add dev eth0 root est 1sec 4sec netem delay 200ms rate 100kbit
> # ping -i 0.1 172.30.42.18
> PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
> 64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=208 ms
> 64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=424 ms
> 64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=838 ms
> 64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=1142 ms
> 64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=1335 ms
> 64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=1949 ms
> 64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=2450 ms
> 64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=2840 ms
> 64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=3121 ms
> 64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=3291 ms
> 64 bytes from 172.30.42.18: icmp_req=11 ttl=64 time=3784 ms
>
> This patch also fixes a double drop accounting in case packet is dropped
> in tfifo_enqueue()
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Yuchung Cheng <ycheng@google.com>
> Cc: Andreas Terzis <aterzis@google.com>
> Cc: Mark Gordon <msg@google.com>
> Cc: Hagen Paul Pfeifer <hagen@jauu.net>
> ---
Hmm, I'll send a v2
^ permalink raw reply
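For context on the ping output quoted in the netem report above, a small worked calculation may help. It assumes the rate extension charges each packet its serialisation time at the configured rate, that tc's "kbit" means 1000 bits per second, and that an 84-byte ICMP echo is accounted as 98 bytes once the 14-byte Ethernet header is included. The helper name tx_time_us is made up for this sketch.

```c
#include <stdint.h>

/* Time in microseconds needed to serialise len_bytes at rate_bps
 * bits per second (integer arithmetic, rounds down). */
static uint64_t tx_time_us(uint64_t len_bytes, uint64_t rate_bps)
{
    return len_bytes * 8 * 1000000 / rate_bps;
}
```

Under those assumptions tx_time_us(98, 100000) is 7840, i.e. about 7.8 ms, so each reply would be expected at roughly 200 ms + 8 ms. That matches the first reply (208 ms) in the quoted output, while the later replies show the delay accumulating.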
* Re: [PATCH] qlge: fix endian issue
From: David Miller @ 2012-07-03 9:47 UTC (permalink / raw)
To: roy.qing.li; +Cc: netdev, ron.mercer
In-Reply-To: <CAJFZqHz7tn6AduKguFe7rRP6adiMo+VKe-mVvO5nbAM6QsoBHg@mail.gmail.com>
From: RongQing Li <roy.qing.li@gmail.com>
Date: Tue, 3 Jul 2012 17:44:58 +0800
> I maybe should not add the dot.
That's what I'm trying to say.
^ permalink raw reply
* [PATCH 19/19] net: Kill dst->_neighbour, accessors, and final uses.
From: David Miller @ 2012-07-03 9:47 UTC (permalink / raw)
To: netdev
No longer used.
Signed-off-by: David S. Miller <davem@davemloft.net>
---
include/net/dst.h | 17 +----------------
net/core/dst.c | 18 ------------------
2 files changed, 1 insertion(+), 34 deletions(-)
diff --git a/include/net/dst.h b/include/net/dst.h
index 295a705..b2634e4 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -42,7 +42,7 @@ struct dst_entry {
struct dst_entry *from;
};
struct dst_entry *path;
- struct neighbour __rcu *_neighbour;
+ void *__pad0;
#ifdef CONFIG_XFRM
struct xfrm_state *xfrm;
#else
@@ -96,21 +96,6 @@ struct dst_entry {
};
};
-static inline struct neighbour *dst_get_neighbour_noref(struct dst_entry *dst)
-{
- return rcu_dereference(dst->_neighbour);
-}
-
-static inline struct neighbour *dst_get_neighbour_noref_raw(struct dst_entry *dst)
-{
- return rcu_dereference_raw(dst->_neighbour);
-}
-
-static inline void dst_set_neighbour(struct dst_entry *dst, struct neighbour *neigh)
-{
- rcu_assign_pointer(dst->_neighbour, neigh);
-}
-
extern u32 *dst_cow_metrics_generic(struct dst_entry *dst, unsigned long old);
extern const u32 dst_default_metrics[RTAX_MAX];
diff --git a/net/core/dst.c b/net/core/dst.c
index a6e19a2..07bacff 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -171,7 +171,6 @@ void *dst_alloc(struct dst_ops *ops, struct net_device *dev,
dst_init_metrics(dst, dst_default_metrics, true);
dst->expires = 0UL;
dst->path = dst;
- RCU_INIT_POINTER(dst->_neighbour, NULL);
#ifdef CONFIG_XFRM
dst->xfrm = NULL;
#endif
@@ -225,19 +224,12 @@ EXPORT_SYMBOL(__dst_free);
struct dst_entry *dst_destroy(struct dst_entry * dst)
{
struct dst_entry *child;
- struct neighbour *neigh;
smp_rmb();
again:
- neigh = rcu_dereference_protected(dst->_neighbour, 1);
child = dst->child;
- if (neigh) {
- RCU_INIT_POINTER(dst->_neighbour, NULL);
- neigh_release(neigh);
- }
-
if (!(dst->flags & DST_NOCOUNT))
dst_entries_add(dst->ops, -1);
@@ -361,19 +353,9 @@ static void dst_ifdown(struct dst_entry *dst, struct net_device *dev,
if (!unregister) {
dst->input = dst->output = dst_discard;
} else {
- struct neighbour *neigh;
-
dst->dev = dev_net(dst->dev)->loopback_dev;
dev_hold(dst->dev);
dev_put(dev);
- rcu_read_lock();
- neigh = dst_get_neighbour_noref(dst);
- if (neigh && neigh->dev == dev) {
- neigh->dev = dst->dev;
- dev_hold(dst->dev);
- dev_put(dev);
- }
- rcu_read_unlock();
}
}
--
1.7.10
^ permalink raw reply related
* [PATCH 18/19] xfrm: No need to copy generic neighbour pointer.
From: David Miller @ 2012-07-03 9:47 UTC (permalink / raw)
To: netdev
Nobody reads it any longer.
Signed-off-by: David S. Miller <davem@davemloft.net>
---
net/xfrm/xfrm_policy.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index a28a3f9..6e97855 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1500,9 +1500,6 @@ static struct dst_entry *xfrm_bundle_create(struct xfrm_policy *policy,
if (!dev)
goto free_dst;
- /* Copy neighbour for reachability confirmation */
- dst_set_neighbour(dst0, neigh_clone(dst_get_neighbour_noref(dst)));
-
xfrm_init_path((struct xfrm_dst *)dst0, dst, nfheader_len);
xfrm_init_pmtu(dst_prev);
--
1.7.10
^ permalink raw reply related
* [PATCH 17/19] ipv4: No need to set generic neighbour pointer.
From: David Miller @ 2012-07-03 9:47 UTC (permalink / raw)
To: netdev
Nobody reads it any longer.
Signed-off-by: David S. Miller <davem@davemloft.net>
---
net/ipv4/route.c | 62 +++---------------------------------------------------
1 file changed, 3 insertions(+), 59 deletions(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 7453dfc..72e88c2 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1111,16 +1111,6 @@ static struct neighbour *ipv4_neigh_lookup(const struct dst_entry *dst,
return neigh_create(&arp_tbl, pkey, dev);
}
-static int rt_bind_neighbour(struct rtable *rt)
-{
- struct neighbour *n = ipv4_neigh_lookup(&rt->dst, NULL, &rt->rt_gateway);
- if (IS_ERR(n))
- return PTR_ERR(n);
- dst_set_neighbour(&rt->dst, n);
-
- return 0;
-}
-
static struct rtable *rt_intern_hash(unsigned int hash, struct rtable *rt,
struct sk_buff *skb, int ifindex)
{
@@ -1129,7 +1119,6 @@ static struct rtable *rt_intern_hash(unsigned int hash, struct rtable *rt,
unsigned long now;
u32 min_score;
int chain_length;
- int attempts = !in_softirq();
restart:
chain_length = 0;
@@ -1156,15 +1145,6 @@ restart:
*/
rt->dst.flags |= DST_NOCACHE;
- if (rt->rt_type == RTN_UNICAST || rt_is_output_route(rt)) {
- int err = rt_bind_neighbour(rt);
- if (err) {
- net_warn_ratelimited("Neighbour table failure & not caching routes\n");
- ip_rt_put(rt);
- return ERR_PTR(err);
- }
- }
-
goto skip_hashing;
}
@@ -1247,40 +1227,6 @@ restart:
}
}
- /* Try to bind route to arp only if it is output
- route or unicast forwarding path.
- */
- if (rt->rt_type == RTN_UNICAST || rt_is_output_route(rt)) {
- int err = rt_bind_neighbour(rt);
- if (err) {
- spin_unlock_bh(rt_hash_lock_addr(hash));
-
- if (err != -ENOBUFS) {
- rt_drop(rt);
- return ERR_PTR(err);
- }
-
- /* Neighbour tables are full and nothing
- can be released. Try to shrink route cache,
- it is most likely it holds some neighbour records.
- */
- if (attempts-- > 0) {
- int saved_elasticity = ip_rt_gc_elasticity;
- int saved_int = ip_rt_gc_min_interval;
- ip_rt_gc_elasticity = 1;
- ip_rt_gc_min_interval = 0;
- rt_garbage_collect(&ipv4_dst_ops);
- ip_rt_gc_min_interval = saved_int;
- ip_rt_gc_elasticity = saved_elasticity;
- goto restart;
- }
-
- net_warn_ratelimited("Neighbour table overflow\n");
- rt_drop(rt);
- return ERR_PTR(-ENOBUFS);
- }
- }
-
rt->dst.rt_next = rt_hash_table[hash].chain;
/*
@@ -1388,26 +1334,24 @@ static void check_peer_redir(struct dst_entry *dst, struct inet_peer *peer)
{
struct rtable *rt = (struct rtable *) dst;
__be32 orig_gw = rt->rt_gateway;
- struct neighbour *n, *old_n;
+ struct neighbour *n;
dst_confirm(&rt->dst);
rt->rt_gateway = peer->redirect_learned.a4;
n = ipv4_neigh_lookup(&rt->dst, NULL, &rt->rt_gateway);
- if (IS_ERR(n)) {
+ if (!n) {
rt->rt_gateway = orig_gw;
return;
}
- old_n = xchg(&rt->dst._neighbour, n);
- if (old_n)
- neigh_release(old_n);
if (!(n->nud_state & NUD_VALID)) {
neigh_event_send(n, NULL);
} else {
rt->rt_flags |= RTCF_REDIRECTED;
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
}
+ neigh_release(n);
}
/* called in rcu_read_lock() section */
--
1.7.10
^ permalink raw reply related
* [PATCH 16/19] ipv6: Store route neighbour in rt6_info struct.
From: David Miller @ 2012-07-03 9:47 UTC (permalink / raw)
To: netdev
This makes for a simplified conversion away from dst_get_neighbour*().
All code outside of ipv6 will use neigh lookups via dst_neigh_lookup*().
Signed-off-by: David S. Miller <davem@davemloft.net>
---
include/net/ip6_fib.h | 2 ++
net/ipv6/ip6_output.c | 8 ++++++--
net/ipv6/route.c | 42 ++++++++++++++++++++++++++----------------
net/ipv6/xfrm6_policy.c | 1 +
4 files changed, 35 insertions(+), 18 deletions(-)
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index a192f78..0fedbd8 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -86,6 +86,8 @@ struct fib6_table;
struct rt6_info {
struct dst_entry dst;
+ struct neighbour *n;
+
/*
* Tail elements of dst_entry (__refcnt etc.)
* and these elements (rarely used in hot path) are in
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index c94e4aa..6d9c0ab 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -88,6 +88,7 @@ static int ip6_finish_output2(struct sk_buff *skb)
struct dst_entry *dst = skb_dst(skb);
struct net_device *dev = dst->dev;
struct neighbour *neigh;
+ struct rt6_info *rt;
skb->protocol = htons(ETH_P_IPV6);
skb->dev = dev;
@@ -123,7 +124,8 @@ static int ip6_finish_output2(struct sk_buff *skb)
}
rcu_read_lock();
- neigh = dst_get_neighbour_noref(dst);
+ rt = (struct rt6_info *) dst;
+ neigh = rt->n;
if (neigh) {
int res = dst_neigh_output(dst, neigh, skb);
@@ -944,6 +946,7 @@ static int ip6_dst_lookup_tail(struct sock *sk,
struct net *net = sock_net(sk);
#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
struct neighbour *n;
+ struct rt6_info *rt;
#endif
int err;
@@ -972,7 +975,8 @@ static int ip6_dst_lookup_tail(struct sock *sk,
* dst entry of the nexthop router
*/
rcu_read_lock();
- n = dst_get_neighbour_noref(*dst);
+ rt = (struct rt6_info *) *dst;
+ n = rt->n;
if (n && !(n->nud_state & NUD_VALID)) {
struct inet6_ifaddr *ifp;
struct flowi6 fl_gw6;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 24034d7..b4ecea0 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -155,7 +155,7 @@ static int rt6_bind_neighbour(struct rt6_info *rt, struct net_device *dev)
if (IS_ERR(n))
return PTR_ERR(n);
}
- dst_set_neighbour(&rt->dst, n);
+ rt->n = n;
return 0;
}
@@ -285,6 +285,9 @@ static void ip6_dst_destroy(struct dst_entry *dst)
struct rt6_info *rt = (struct rt6_info *)dst;
struct inet6_dev *idev = rt->rt6i_idev;
+ if (rt->n)
+ neigh_release(rt->n);
+
if (!(rt->dst.flags & DST_HOST))
dst_destroy_metrics_generic(dst);
@@ -335,12 +338,19 @@ static void ip6_dst_ifdown(struct dst_entry *dst, struct net_device *dev,
struct net_device *loopback_dev =
dev_net(dev)->loopback_dev;
- if (dev != loopback_dev && idev && idev->dev == dev) {
- struct inet6_dev *loopback_idev =
- in6_dev_get(loopback_dev);
- if (loopback_idev) {
- rt->rt6i_idev = loopback_idev;
- in6_dev_put(idev);
+ if (dev != loopback_dev) {
+ if (idev && idev->dev == dev) {
+ struct inet6_dev *loopback_idev =
+ in6_dev_get(loopback_dev);
+ if (loopback_idev) {
+ rt->rt6i_idev = loopback_idev;
+ in6_dev_put(idev);
+ }
+ }
+ if (rt->n && rt->n->dev == dev) {
+ rt->n->dev = loopback_dev;
+ dev_hold(loopback_dev);
+ dev_put(dev);
}
}
}
@@ -430,7 +440,7 @@ static void rt6_probe(struct rt6_info *rt)
* to no more than one per minute.
*/
rcu_read_lock();
- neigh = rt ? dst_get_neighbour_noref(&rt->dst) : NULL;
+ neigh = rt ? rt->n : NULL;
if (!neigh || (neigh->nud_state & NUD_VALID))
goto out;
read_lock_bh(&neigh->lock);
@@ -477,7 +487,7 @@ static inline int rt6_check_neigh(struct rt6_info *rt)
int m;
rcu_read_lock();
- neigh = dst_get_neighbour_noref(&rt->dst);
+ neigh = rt->n;
if (rt->rt6i_flags & RTF_NONEXTHOP ||
!(rt->rt6i_flags & RTF_GATEWAY))
m = 1;
@@ -824,7 +834,7 @@ static struct rt6_info *rt6_alloc_clone(struct rt6_info *ort,
if (rt) {
rt->rt6i_flags |= RTF_CACHE;
- dst_set_neighbour(&rt->dst, neigh_clone(dst_get_neighbour_noref_raw(&ort->dst)));
+ rt->n = neigh_clone(ort->n);
}
return rt;
}
@@ -858,7 +868,7 @@ restart:
dst_hold(&rt->dst);
read_unlock_bh(&table->tb6_lock);
- if (!dst_get_neighbour_noref_raw(&rt->dst) && !(rt->rt6i_flags & RTF_NONEXTHOP))
+ if (!rt->n && !(rt->rt6i_flags & RTF_NONEXTHOP))
nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
else if (!(rt->dst.flags & DST_HOST))
nrt = rt6_alloc_clone(rt, &fl6->daddr);
@@ -1178,7 +1188,7 @@ struct dst_entry *icmp6_dst_alloc(struct net_device *dev,
rt->dst.flags |= DST_HOST;
rt->dst.output = ip6_output;
- dst_set_neighbour(&rt->dst, neigh);
+ rt->n = neigh;
atomic_set(&rt->dst.__refcnt, 1);
rt->rt6i_dst.addr = fl6->daddr;
rt->rt6i_dst.plen = 128;
@@ -1715,7 +1725,7 @@ void rt6_redirect(const struct in6_addr *dest, const struct in6_addr *src,
dst_confirm(&rt->dst);
/* Duplicate redirect: silently ignore. */
- old_neigh = dst_get_neighbour_noref_raw(&rt->dst);
+ old_neigh = rt->n;
if (neigh == old_neigh)
goto out;
@@ -1728,7 +1738,7 @@ void rt6_redirect(const struct in6_addr *dest, const struct in6_addr *src,
nrt->rt6i_flags &= ~RTF_GATEWAY;
nrt->rt6i_gateway = *(struct in6_addr *)neigh->primary_key;
- dst_set_neighbour(&nrt->dst, neigh_clone(neigh));
+ nrt->n = neigh_clone(neigh);
if (ip6_ins_rt(nrt))
goto out;
@@ -2441,7 +2451,7 @@ static int rt6_fill_node(struct net *net,
goto nla_put_failure;
rcu_read_lock();
- n = dst_get_neighbour_noref(&rt->dst);
+ n = rt->n;
if (n) {
if (nla_put(skb, RTA_GATEWAY, 16, &n->primary_key) < 0) {
rcu_read_unlock();
@@ -2665,7 +2675,7 @@ static int rt6_info_route(struct rt6_info *rt, void *p_arg)
seq_puts(m, "00000000000000000000000000000000 00 ");
#endif
rcu_read_lock();
- n = dst_get_neighbour_noref(&rt->dst);
+ n = rt->n;
if (n) {
seq_printf(m, "%pi6", n->primary_key);
} else {
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index d749484..bb02038 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -103,6 +103,7 @@ static int xfrm6_fill_dst(struct xfrm_dst *xdst, struct net_device *dev,
/* Sheit... I remember I did this right. Apparently,
* it was magically lost, so this code needs audit */
+ xdst->u.rt6.n = neigh_clone(rt->n);
xdst->u.rt6.rt6i_flags = rt->rt6i_flags & (RTF_ANYCAST |
RTF_LOCAL);
xdst->u.rt6.rt6i_metric = rt->rt6i_metric;
--
1.7.10
^ permalink raw reply related
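The casts from struct dst_entry * to struct rt6_info * in the patch above work because dst is the first member of struct rt6_info. A minimal user-space sketch of that pattern follows; the *_sketch types and both helper functions are hypothetical stand-ins, not kernel definitions.

```c
#include <stddef.h>

struct dst_entry_sketch { int flags; };
struct neighbour_sketch { int nud_state; };

struct rt6_info_sketch {
    struct dst_entry_sketch dst;  /* must stay the first member */
    struct neighbour_sketch *n;   /* per-route neighbour, as added by the patch */
};

/* A pointer to the embedded first member has the same address as the
 * containing structure, so the cast recovers the outer object. */
static struct neighbour_sketch *route_neigh(struct dst_entry_sketch *dst)
{
    struct rt6_info_sketch *rt = (struct rt6_info_sketch *)dst;
    return rt->n;
}

/* When the caller holds a pointer to a pointer, it has to be
 * dereferenced once before the cast is applied. */
static struct neighbour_sketch *route_neigh_pp(struct dst_entry_sketch **dstp)
{
    return route_neigh(*dstp);
}
```

The second helper mirrors functions that receive a struct dst_entry **, such as the removed dst_get_neighbour_noref(*dst) call in ip6_dst_lookup_tail().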
* [PATCH 15/19] cxgb3: Convert t3_l2t_get() over to dst_neigh_lookup().
From: David Miller @ 2012-07-03 9:46 UTC (permalink / raw)
To: netdev
This means passing in a suitable destination address.
Signed-off-by: David S. Miller <davem@davemloft.net>
---
drivers/infiniband/hw/cxgb3/iwch_cm.c | 5 +++--
drivers/net/ethernet/chelsio/cxgb3/l2t.c | 6 ++++--
drivers/net/ethernet/chelsio/cxgb3/l2t.h | 2 +-
drivers/scsi/cxgbi/cxgb3i/cxgb3i.c | 3 ++-
4 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/hw/cxgb3/iwch_cm.c b/drivers/infiniband/hw/cxgb3/iwch_cm.c
index 740dcc0..77b6b18 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_cm.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_cm.c
@@ -1374,7 +1374,7 @@ static int pass_accept_req(struct t3cdev *tdev, struct sk_buff *skb, void *ctx)
goto reject;
}
dst = &rt->dst;
- l2t = t3_l2t_get(tdev, dst, NULL);
+ l2t = t3_l2t_get(tdev, dst, NULL, &req->peer_ip);
if (!l2t) {
printk(KERN_ERR MOD "%s - failed to allocate l2t entry!\n",
__func__);
@@ -1942,7 +1942,8 @@ int iwch_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param)
goto fail3;
}
ep->dst = &rt->dst;
- ep->l2t = t3_l2t_get(ep->com.tdev, ep->dst, NULL);
+ ep->l2t = t3_l2t_get(ep->com.tdev, ep->dst, NULL,
+ &cm_id->remote_addr.sin_addr.s_addr);
if (!ep->l2t) {
printk(KERN_ERR MOD "%s - cannot alloc l2e.\n", __func__);
err = -ENOMEM;
diff --git a/drivers/net/ethernet/chelsio/cxgb3/l2t.c b/drivers/net/ethernet/chelsio/cxgb3/l2t.c
index 3fa3c88..8d53438 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/l2t.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/l2t.c
@@ -299,7 +299,7 @@ static inline void reuse_entry(struct l2t_entry *e, struct neighbour *neigh)
}
struct l2t_entry *t3_l2t_get(struct t3cdev *cdev, struct dst_entry *dst,
- struct net_device *dev)
+ struct net_device *dev, const void *daddr)
{
struct l2t_entry *e = NULL;
struct neighbour *neigh;
@@ -311,7 +311,7 @@ struct l2t_entry *t3_l2t_get(struct t3cdev *cdev, struct dst_entry *dst,
int smt_idx;
rcu_read_lock();
- neigh = dst_get_neighbour_noref(dst);
+ neigh = dst_neigh_lookup(dst, daddr);
if (!neigh)
goto done_rcu;
@@ -360,6 +360,8 @@ struct l2t_entry *t3_l2t_get(struct t3cdev *cdev, struct dst_entry *dst,
done_unlock:
write_unlock_bh(&d->lock);
done_rcu:
+ if (neigh)
+ neigh_release(neigh);
rcu_read_unlock();
return e;
}
diff --git a/drivers/net/ethernet/chelsio/cxgb3/l2t.h b/drivers/net/ethernet/chelsio/cxgb3/l2t.h
index c4e8643..8cffcdf 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/l2t.h
+++ b/drivers/net/ethernet/chelsio/cxgb3/l2t.h
@@ -110,7 +110,7 @@ static inline void set_arp_failure_handler(struct sk_buff *skb,
void t3_l2e_free(struct l2t_data *d, struct l2t_entry *e);
void t3_l2t_update(struct t3cdev *dev, struct neighbour *neigh);
struct l2t_entry *t3_l2t_get(struct t3cdev *cdev, struct dst_entry *dst,
- struct net_device *dev);
+ struct net_device *dev, const void *daddr);
int t3_l2t_send_slow(struct t3cdev *dev, struct sk_buff *skb,
struct l2t_entry *e);
void t3_l2t_send_event(struct t3cdev *dev, struct l2t_entry *e);
diff --git a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
index 36739da..49692a1 100644
--- a/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
+++ b/drivers/scsi/cxgbi/cxgb3i/cxgb3i.c
@@ -966,7 +966,8 @@ static int init_act_open(struct cxgbi_sock *csk)
csk->saddr.sin_addr.s_addr = chba->ipv4addr;
csk->rss_qid = 0;
- csk->l2t = t3_l2t_get(t3dev, dst, ndev);
+ csk->l2t = t3_l2t_get(t3dev, dst, ndev,
+ &csk->daddr.sin_addr.s_addr);
if (!csk->l2t) {
pr_err("NO l2t available.\n");
return -EINVAL;
--
1.7.10
^ permalink raw reply related
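The t3_l2t_get() change above adds a neigh_release() on the exit path, which suggests that the looked-up neighbour now comes with a reference the caller must drop. The following user-space sketch shows that get/put discipline in isolation. All names (struct entry, lookup_get, entry_put, use_entry) are invented for illustration and do not model the real neighbour code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object such as a neighbour. */
struct entry {
    int refcnt;
    int key;
};

/* Each table slot holds one long-lived reference. */
static struct entry table[2] = { { 1, 10 }, { 1, 20 } };

/* Returns the entry with an extra reference held, or NULL if not found. */
static struct entry *lookup_get(int key)
{
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (table[i].key == key) {
            table[i].refcnt++;
            return &table[i];
        }
    }
    return NULL;
}

/* Drops one reference taken by lookup_get(). */
static void entry_put(struct entry *e)
{
    e->refcnt--;
}

/* Single exit label: the reference is dropped on every path that got one. */
static int use_entry(int key)
{
    struct entry *e = lookup_get(key);
    int ret = -1;

    if (!e)
        goto done;
    ret = e->key * 2; /* placeholder for real work on the entry */
done:
    if (e)
        entry_put(e);
    return ret;
}
```

After use_entry() returns, the entry's count is back to its initial value whether or not the lookup succeeded, which is the property the added release in the patch is meant to preserve.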
* [PATCH 14/19] net: Pass neighbours into the NETEVENT_REDIRECT event.
From: David Miller @ 2012-07-03 9:46 UTC (permalink / raw)
To: netdev
Signed-off-by: David S. Miller <davem@davemloft.net>
---
drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c | 23 ++++++++------------
include/net/netevent.h | 3 +++
net/ipv6/route.c | 6 ++++-
3 files changed, 17 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c
index 55cf72a..633c602 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c
@@ -62,7 +62,8 @@ static const unsigned int MAX_ATIDS = 64 * 1024;
static const unsigned int ATID_BASE = 0x10000;
static void cxgb_neigh_update(struct neighbour *neigh);
-static void cxgb_redirect(struct dst_entry *old, struct dst_entry *new);
+static void cxgb_redirect(struct dst_entry *old, struct neighbour *old_neigh,
+ struct dst_entry *new, struct neighbour *new_neigh);
static inline int offload_activated(struct t3cdev *tdev)
{
@@ -968,8 +969,9 @@ static int nb_callback(struct notifier_block *self, unsigned long event,
}
case (NETEVENT_REDIRECT):{
struct netevent_redirect *nr = ctx;
- cxgb_redirect(nr->old, nr->new);
- cxgb_neigh_update(dst_get_neighbour_noref(nr->new));
+ cxgb_redirect(nr->old, nr->old_neigh,
+ nr->new, nr->new_neigh);
+ cxgb_neigh_update(nr->new_neigh);
break;
}
default:
@@ -1107,10 +1109,10 @@ static void set_l2t_ix(struct t3cdev *tdev, u32 tid, struct l2t_entry *e)
tdev->send(tdev, skb);
}
-static void cxgb_redirect(struct dst_entry *old, struct dst_entry *new)
+static void cxgb_redirect(struct dst_entry *old, struct neighbour *old_neigh,
+ struct dst_entry *new, struct neighbour *new_neigh)
{
struct net_device *olddev, *newdev;
- struct neighbour *n;
struct tid_info *ti;
struct t3cdev *tdev;
u32 tid;
@@ -1118,15 +1120,8 @@ static void cxgb_redirect(struct dst_entry *old, struct dst_entry *new)
struct l2t_entry *e;
struct t3c_tid_entry *te;
- n = dst_get_neighbour_noref(old);
- if (!n)
- return;
- olddev = n->dev;
-
- n = dst_get_neighbour_noref(new);
- if (!n)
- return;
- newdev = n->dev;
+ olddev = old_neigh->dev;
+ newdev = new_neigh->dev;
if (!is_offloading(olddev))
return;
diff --git a/include/net/netevent.h b/include/net/netevent.h
index 086f8a5..56e9862 100644
--- a/include/net/netevent.h
+++ b/include/net/netevent.h
@@ -12,10 +12,13 @@
*/
struct dst_entry;
+struct neighbour;
struct netevent_redirect {
struct dst_entry *old;
+ struct neighbour *old_neigh;
struct dst_entry *new;
+ struct neighbour *new_neigh;
};
enum netevent_notif_type {
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4b581c6..24034d7 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1687,6 +1687,7 @@ void rt6_redirect(const struct in6_addr *dest, const struct in6_addr *src,
struct rt6_info *rt, *nrt = NULL;
struct netevent_redirect netevent;
struct net *net = dev_net(neigh->dev);
+ struct neighbour *old_neigh;
rt = ip6_route_redirect(dest, src, saddr, neigh->dev);
@@ -1714,7 +1715,8 @@ void rt6_redirect(const struct in6_addr *dest, const struct in6_addr *src,
dst_confirm(&rt->dst);
/* Duplicate redirect: silently ignore. */
- if (neigh == dst_get_neighbour_noref_raw(&rt->dst))
+ old_neigh = dst_get_neighbour_noref_raw(&rt->dst);
+ if (neigh == old_neigh)
goto out;
nrt = ip6_rt_copy(rt, dest);
@@ -1732,7 +1734,9 @@ void rt6_redirect(const struct in6_addr *dest, const struct in6_addr *src,
goto out;
netevent.old = &rt->dst;
+ netevent.old_neigh = old_neigh;
netevent.new = &nrt->dst;
+ netevent.new_neigh = neigh;
call_netevent_notifiers(NETEVENT_REDIRECT, &netevent);
if (rt->rt6i_flags & RTF_CACHE) {
--
1.7.10
^ permalink raw reply related