Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next] net: introduce SO_INCOMING_CPU
From: Andy Lutomirski @ 2014-11-14 20:16 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Eric Dumazet, Michael Kerrisk, David Miller, netdev, Ying Cai,
	Willem de Bruijn, Neal Cardwell, Linux API
In-Reply-To: <CA+mtBx_DyUCxE6BqL_xXcqKPUDu7b3jLShV+AL6L+gCzJLriPg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Fri, Nov 14, 2014 at 11:52 AM, Tom Herbert <therbert-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> On Fri, Nov 14, 2014 at 11:33 AM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> On Fri, 2014-11-14 at 09:17 -0800, Andy Lutomirski wrote:
>>
>>> As a heavy user of RFS (and finder of bugs in it, too), here's my
>>> question about this API:
>>>
>>> How does an application tell whether the socket represents a
>>> non-actively-steered flow?  If the flow is subject to RFS, then moving
>>> the application handling to the socket's CPU seems problematic, as the
>>> socket's CPU might move as well.  The current implementation in this
>>> patch seems to tell me which CPU the most recent packet came in on,
>>> which is not necessarily very useful.
>>
>> Its the cpu that hit the TCP stack, bringing dozens of cache lines in
>> its cache. This is all that matters,
>>
>>>
>>> Some possibilities:
>>>
>>> 1. Let SO_INCOMING_CPU fail if RFS or RPS are in play.
>>
>> Well, idea is to not use RFS at all. Otherwise, it is useless.

Sure, but how do I know that it'll be the same CPU next time?

>>
> Bear in mind this is only an interface to report RX CPU and in itself
> doesn't provide any functionality for changing scheduling, there is
> obviously logic needed in user space that would need to do something.
>
> If we track the interrupting CPU in skb, the interface could be easily
> extended to provide the interrupting CPU, the RPS CPU (calculated at
> reported time), and the CPU processing transport (post steering which
> is what is currently returned). That would provide the complete
> picture to control scheduling a flow from userspace, and an interface
> to selectively turn off RFS for a socket would make sense then.

I think that a turn-off-RFS interface would also want a way to figure
out where the flow would go without RFS.  Can the network stack do
that (e.g. evaluate the rx indirection hash or whatever happens these
days)?

>
>> RFS is the other way around : You want the flow to follow your thread.
>>
>> RPS wont be a problem if you have sensible RPS settings.
>>
>>>
>>> 2. Change the interface a bit to report the socket's preferred CPU
>>> (where it would go without RFS, for example) and then let the
>>> application use setsockopt to tell the socket to stay put (i.e. turn
>>> off RFS and RPS for that flow).
>>>
>>> 3. Report the preferred CPU as in (2) but let the application ask for
>>> something different.
>>>
>>> For example, I have flows for which I know which CPU I want.  A nice
>>> API to put the flow there would be quite useful.
>>>
>>>
>>> Also, it may be worth changing the naming to indicate that these are
>>> about the rx cpu (they are, right?).  For some applications (sparse,
>>> low-latency flows, for example), it can be useful to keep the tx
>>> completion handling on a different CPU.
>>
>> SO_INCOMING_CPU is rx, like incoming ;)
>>
>>

Duh :)

>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply

* Re: [PATCH] ixgbe: Look up MAC address in Open Firmware
From: Martin K. Petersen @ 2014-11-14 20:20 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: Martin K. Petersen, netdev, linux.nics
In-Reply-To: <54666106.9000609@cogentembedded.com>

>>>>> "Sergei" == Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> writes:

>> +#ifdef CONFIG_OF
>> + ixgbe_of_mac_addr(adapter);
>> +#endif

Sergei>    Eww... why not define the following above instead:

Sergei> #else
Sergei> static inline void ixgbe_of_mac_addr(struct ixgbe_adapter
Sergei> *adapter) {}
Sergei> #endif

I don't care much either way. But we might as well do this, then, and
shave off an ifdef...

commit 01e25f145972563ee87ebf85b7cb02a4ff8fce3b
Author: Martin K. Petersen <martin.petersen@oracle.com>
Date:   Wed Nov 12 20:47:42 2014 -0500

    ixgbe: Look up MAC address in Open Firmware
    
    Attempt to look up the MAC address in Open Firmware on systems that
    support it. If the "local-mac-address" property is not valid resort to
    using the IDPROM value.
    
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index d2df4e3d1032..0e45a43172eb 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -50,6 +50,11 @@
 #include <linux/prefetch.h>
 #include <scsi/fc/fc_fcoe.h>
 
+#ifdef CONFIG_OF
+#include <asm/idprom.h>
+#include <asm/prom.h>
+#endif
+
 #include "ixgbe.h"
 #include "ixgbe_common.h"
 #include "ixgbe_dcb_82599.h"
@@ -7960,6 +7965,31 @@ int ixgbe_wol_supported(struct ixgbe_adapter *adapter, u16 device_id,
 }
 
 /**
+ * ixgbe_of_mac_addr - Look up MAC address in Open Firmware
+ * @adapter: Pointer to adapter struct
+ */
+static void ixgbe_of_mac_addr(struct ixgbe_adapter *adapter)
+{
+#ifdef CONFIG_OF
+	struct device_node *dp = pci_device_to_OF_node(adapter->pdev);
+	struct ixgbe_hw *hw = &adapter->hw;
+	const unsigned char *addr;
+	int len;
+
+	addr = of_get_property(dp, "local-mac-address", &len);
+	if (addr && len == 6) {
+		e_dev_info("Using OpenPROM MAC address\n");
+		memcpy(hw->mac.perm_addr, addr, 6);
+	}
+
+	if (!is_valid_ether_addr(hw->mac.perm_addr)) {
+		e_dev_info("Using IDPROM MAC address\n");
+		memcpy(hw->mac.perm_addr, idprom->id_ethaddr, 6);
+	}
+#endif
+}
+
+/**
  * ixgbe_probe - Device Initialization Routine
  * @pdev: PCI device information struct
  * @ent: entry in ixgbe_pci_tbl
@@ -8223,6 +8253,8 @@ skip_sriov:
 		goto err_sw_init;
 	}
 
+	ixgbe_of_mac_addr(adapter);
+
 	memcpy(netdev->dev_addr, hw->mac.perm_addr, netdev->addr_len);
 
 	if (!is_valid_ether_addr(netdev->dev_addr)) {

^ permalink raw reply related

* Re: [PATCH net-next] net: introduce SO_INCOMING_CPU
From: Tom Herbert @ 2014-11-14 20:25 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Eric Dumazet, Michael Kerrisk, David Miller, netdev, Ying Cai,
	Willem de Bruijn, Neal Cardwell, Linux API
In-Reply-To: <CALCETrUuChgxXcNrKjYzpe1ry0hx3kf19EfAbY1N1YnG7NFZUw@mail.gmail.com>

On Fri, Nov 14, 2014 at 12:16 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Fri, Nov 14, 2014 at 11:52 AM, Tom Herbert <therbert@google.com> wrote:
>> On Fri, Nov 14, 2014 at 11:33 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> On Fri, 2014-11-14 at 09:17 -0800, Andy Lutomirski wrote:
>>>
>>>> As a heavy user of RFS (and finder of bugs in it, too), here's my
>>>> question about this API:
>>>>
>>>> How does an application tell whether the socket represents a
>>>> non-actively-steered flow?  If the flow is subject to RFS, then moving
>>>> the application handling to the socket's CPU seems problematic, as the
>>>> socket's CPU might move as well.  The current implementation in this
>>>> patch seems to tell me which CPU the most recent packet came in on,
>>>> which is not necessarily very useful.
>>>
>>> Its the cpu that hit the TCP stack, bringing dozens of cache lines in
>>> its cache. This is all that matters,
>>>
>>>>
>>>> Some possibilities:
>>>>
>>>> 1. Let SO_INCOMING_CPU fail if RFS or RPS are in play.
>>>
>>> Well, idea is to not use RFS at all. Otherwise, it is useless.
>
> Sure, but how do I know that it'll be the same CPU next time?
>
>>>
>> Bear in mind this is only an interface to report RX CPU and in itself
>> doesn't provide any functionality for changing scheduling, there is
>> obviously logic needed in user space that would need to do something.
>>
>> If we track the interrupting CPU in skb, the interface could be easily
>> extended to provide the interrupting CPU, the RPS CPU (calculated at
>> reported time), and the CPU processing transport (post steering which
>> is what is currently returned). That would provide the complete
>> picture to control scheduling a flow from userspace, and an interface
>> to selectively turn off RFS for a socket would make sense then.
>
> I think that a turn-off-RFS interface would also want a way to figure
> out where the flow would go without RFS.  Can the network stack do
> that (e.g. evaluate the rx indirection hash or whatever happens these
> days)?
>
Yes,. We need the rxhash and the CPU that packets are received on from
the device for the socket. The former we already have, the latter
might be done by adding a field to skbuff to set received CPU. Given
the L4 hash and interrupting CPU we can calculated the RPS CPU which
is where packet would have landed with RFS off.

>>
>>> RFS is the other way around : You want the flow to follow your thread.
>>>
>>> RPS wont be a problem if you have sensible RPS settings.
>>>
>>>>
>>>> 2. Change the interface a bit to report the socket's preferred CPU
>>>> (where it would go without RFS, for example) and then let the
>>>> application use setsockopt to tell the socket to stay put (i.e. turn
>>>> off RFS and RPS for that flow).
>>>>
>>>> 3. Report the preferred CPU as in (2) but let the application ask for
>>>> something different.
>>>>
>>>> For example, I have flows for which I know which CPU I want.  A nice
>>>> API to put the flow there would be quite useful.
>>>>
>>>>
>>>> Also, it may be worth changing the naming to indicate that these are
>>>> about the rx cpu (they are, right?).  For some applications (sparse,
>>>> low-latency flows, for example), it can be useful to keep the tx
>>>> completion handling on a different CPU.
>>>
>>> SO_INCOMING_CPU is rx, like incoming ;)
>>>
>>>
>
> Duh :)
>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC

^ permalink raw reply

* Re: [PATCH 1/1 net-next] openvswitch: use PTR_ERR_OR_ZERO
From: Pravin Shelar @ 2014-11-14 20:34 UTC (permalink / raw)
  To: Fabian Frederick; +Cc: LKML, David S. Miller, dev@openvswitch.org, netdev
In-Reply-To: <1415989978-28559-1-git-send-email-fabf@skynet.be>

On Fri, Nov 14, 2014 at 10:32 AM, Fabian Frederick <fabf@skynet.be> wrote:
> Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Pravin B Shelar <pshelar@nicira.com>

> ---
>  net/openvswitch/flow_netlink.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index c0d066d..c8fccdd 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
> @@ -1425,10 +1425,8 @@ static int add_action(struct sw_flow_actions **sfa, int attrtype,
>         struct nlattr *a;
>
>         a = __add_action(sfa, attrtype, data, len, log);
> -       if (IS_ERR(a))
> -               return PTR_ERR(a);
>
> -       return 0;
> +       return PTR_ERR_OR_ZERO(a);
>  }
>
>  static inline int add_nested_action_start(struct sw_flow_actions **sfa,
> --
> 1.9.3
>

^ permalink raw reply

* Re: [PATCH net-next] net: introduce SO_INCOMING_CPU
From: Andy Lutomirski @ 2014-11-14 20:34 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Eric Dumazet, Michael Kerrisk, David Miller, netdev, Ying Cai,
	Willem de Bruijn, Neal Cardwell, Linux API
In-Reply-To: <CA+mtBx_tNW_Bz0hovZ90XxcLK6pL8d6RX=v5Rg8utfKkaFoJ+Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Fri, Nov 14, 2014 at 12:25 PM, Tom Herbert <therbert-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> On Fri, Nov 14, 2014 at 12:16 PM, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
>> On Fri, Nov 14, 2014 at 11:52 AM, Tom Herbert <therbert-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>>> On Fri, Nov 14, 2014 at 11:33 AM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>>> On Fri, 2014-11-14 at 09:17 -0800, Andy Lutomirski wrote:
>>>>
>>>>> As a heavy user of RFS (and finder of bugs in it, too), here's my
>>>>> question about this API:
>>>>>
>>>>> How does an application tell whether the socket represents a
>>>>> non-actively-steered flow?  If the flow is subject to RFS, then moving
>>>>> the application handling to the socket's CPU seems problematic, as the
>>>>> socket's CPU might move as well.  The current implementation in this
>>>>> patch seems to tell me which CPU the most recent packet came in on,
>>>>> which is not necessarily very useful.
>>>>
>>>> Its the cpu that hit the TCP stack, bringing dozens of cache lines in
>>>> its cache. This is all that matters,
>>>>
>>>>>
>>>>> Some possibilities:
>>>>>
>>>>> 1. Let SO_INCOMING_CPU fail if RFS or RPS are in play.
>>>>
>>>> Well, idea is to not use RFS at all. Otherwise, it is useless.
>>
>> Sure, but how do I know that it'll be the same CPU next time?
>>
>>>>
>>> Bear in mind this is only an interface to report RX CPU and in itself
>>> doesn't provide any functionality for changing scheduling, there is
>>> obviously logic needed in user space that would need to do something.
>>>
>>> If we track the interrupting CPU in skb, the interface could be easily
>>> extended to provide the interrupting CPU, the RPS CPU (calculated at
>>> reported time), and the CPU processing transport (post steering which
>>> is what is currently returned). That would provide the complete
>>> picture to control scheduling a flow from userspace, and an interface
>>> to selectively turn off RFS for a socket would make sense then.
>>
>> I think that a turn-off-RFS interface would also want a way to figure
>> out where the flow would go without RFS.  Can the network stack do
>> that (e.g. evaluate the rx indirection hash or whatever happens these
>> days)?
>>
> Yes,. We need the rxhash and the CPU that packets are received on from
> the device for the socket. The former we already have, the latter
> might be done by adding a field to skbuff to set received CPU. Given
> the L4 hash and interrupting CPU we can calculated the RPS CPU which
> is where packet would have landed with RFS off.

Hmm.  I think this would be useful for me.  It would *definitely* be
useful for me if I could pin an RFS flow to a cpu of my choice.

With SO_INCOMING_CPU as described, I'm worried that people will write
programs that perform very well if RFS is off, but that once that code
runs with RFS on, weird things could happen.

(On a side note: the RFS flow hash stuff seems to be rather buggy.
Some Solarflare engineers know about this, but a fix seems to be
rather slow in the works.  I think that some of the bugs are in core
code, though.)

--Andy

>
>>>
>>>> RFS is the other way around : You want the flow to follow your thread.
>>>>
>>>> RPS wont be a problem if you have sensible RPS settings.
>>>>
>>>>>
>>>>> 2. Change the interface a bit to report the socket's preferred CPU
>>>>> (where it would go without RFS, for example) and then let the
>>>>> application use setsockopt to tell the socket to stay put (i.e. turn
>>>>> off RFS and RPS for that flow).
>>>>>
>>>>> 3. Report the preferred CPU as in (2) but let the application ask for
>>>>> something different.
>>>>>
>>>>> For example, I have flows for which I know which CPU I want.  A nice
>>>>> API to put the flow there would be quite useful.
>>>>>
>>>>>
>>>>> Also, it may be worth changing the naming to indicate that these are
>>>>> about the rx cpu (they are, right?).  For some applications (sparse,
>>>>> low-latency flows, for example), it can be useful to keep the tx
>>>>> completion handling on a different CPU.
>>>>
>>>> SO_INCOMING_CPU is rx, like incoming ;)
>>>>
>>>>
>>
>> Duh :)
>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>>
>> --
>> Andy Lutomirski
>> AMA Capital Management, LLC



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply

* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
From: Hannes Frederic Sowa @ 2014-11-14 20:35 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: David Miller, eric.dumazet, netdev, ogerlitz, pshelar, jesse,
	discuss
In-Reply-To: <17658.1415996115@famine>

On Fr, 2014-11-14 at 12:15 -0800, Jay Vosburgh wrote:
> Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
> 
> >On Fr, 2014-11-14 at 13:38 -0500, David Miller wrote:
> >> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> >> Date: Fri, 14 Nov 2014 16:46:18 +0100
> >> 
> >> > I would still like to see the current proposed fix getting applied and
> >> > we can do this on-top. The inline call after this patch reassembles a
> >> > direct function call, so besides the long list of clobbers, it should
> >> > still be pretty fast.
> >> 
> >> I would rather revert the change entirely until it is implemented
> >> properly.
> >> 
> >> Also, I am strongly of the opinion that this is a mis-use of the
> >> alternative call interface.  It was never intended to be used for
> >> things that can make real function calls.
> >
> >I tend to disagree. Grepping e.g. shows
> >
> >        alternative_call_2(copy_user_generic_unrolled,
> >                         copy_user_generic_string,
> >                         X86_FEATURE_REP_GOOD,
> >                         copy_user_enhanced_fast_string,
> >                         X86_FEATURE_ERMS,
> >                         ASM_OUTPUT2("=a" (ret), "=D" (to), "=S" (from),
> >                                     "=d" (len)),
> >                         "1" (to), "2" (from), "3" (len)
> >                         : "memory", "rcx", "r8", "r9", "r10", "r11");
> >
> >
> >(it has a few less clobbers because it has more output operands)
> 
> 	As those functions (copy_user_generic_unrolled, et al) are all
> in assembly language, presumably the list of clobbered registers can be
> had via inspection.
> 
> 	For the arch_fast_hash2 case, the functions (__intel_crc4_2_hash
> and __jash2) are both written in C, so how would the clobber list be
> created?

I created it via the function calling convention documented in
arch/x86/include/asm/calling.h, so I specified each register which a
function is allowed to clobber with.

I currently cannot see how I can resolve the invalid constraints error
easily. :(

So either go with my first patch, which I puts the alternative_call
switch point into its own function without ever inlining or the patch
needs to be reverted. :/

Bye,
Hannes

^ permalink raw reply

* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
From: Hannes Frederic Sowa @ 2014-11-14 20:42 UTC (permalink / raw)
  To: Cong Wang
  Cc: David Miller, Eric Dumazet, netdev, Or Gerlitz, Pravin B Shelar,
	Jesse Gross, jay.vosburgh, discuss
In-Reply-To: <CAHA+R7NKrwE7F4k4wiaZRJ67QAhB9bUe8SdyuT8Ky3-wAou6cA@mail.gmail.com>

On Fr, 2014-11-14 at 11:02 -0800, Cong Wang wrote:
> 
> I am wondering, compare with alternative call, how slower is just
> testing
> cpu_has_xmm4_2?

I can test, but cpu_has_xmm4_2 expands into quite some code. I don't
know if indirect function call or a test of this flag is faster. I'll
test this.

Bye,
Hannes

^ permalink raw reply

* [PATCH 1/1 net-next] wireless: remove unnecessary sizeof(u8)
From: Fabian Frederick @ 2014-11-14 20:55 UTC (permalink / raw)
  To: linux-kernel
  Cc: Fabian Frederick, Stefano Brivio, John W. Linville, Johannes Berg,
	Emmanuel Grumbach, Intel Linux Wireless, Larry Finger,
	Chaoming Li, linux-wireless, b43-dev, netdev

sizeof(u8) is always 1.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
---
 drivers/net/wireless/b43/ppr.c            |  2 +-
 drivers/net/wireless/iwlwifi/pcie/trans.c |  2 +-
 drivers/net/wireless/rtlwifi/efuse.c      | 16 ++++++++--------
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/wireless/b43/ppr.c b/drivers/net/wireless/b43/ppr.c
index 9a77027..6bc1c6f 100644
--- a/drivers/net/wireless/b43/ppr.c
+++ b/drivers/net/wireless/b43/ppr.c
@@ -28,7 +28,7 @@ void b43_ppr_clear(struct b43_wldev *dev, struct b43_ppr *ppr)
 	memset(ppr, 0, sizeof(*ppr));
 
 	/* Compile-time PPR check */
-	BUILD_BUG_ON(sizeof(struct b43_ppr) != B43_PPR_RATES_NUM * sizeof(u8));
+	BUILD_BUG_ON(sizeof(struct b43_ppr) != B43_PPR_RATES_NUM);
 }
 
 void b43_ppr_add(struct b43_wldev *dev, struct b43_ppr *ppr, int diff)
diff --git a/drivers/net/wireless/iwlwifi/pcie/trans.c b/drivers/net/wireless/iwlwifi/pcie/trans.c
index 836725e..f016824 100644
--- a/drivers/net/wireless/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/iwlwifi/pcie/trans.c
@@ -1157,7 +1157,7 @@ static void iwl_trans_pcie_configure(struct iwl_trans *trans,
 		trans_pcie->n_no_reclaim_cmds = trans_cfg->n_no_reclaim_cmds;
 	if (trans_pcie->n_no_reclaim_cmds)
 		memcpy(trans_pcie->no_reclaim_cmds, trans_cfg->no_reclaim_cmds,
-		       trans_pcie->n_no_reclaim_cmds * sizeof(u8));
+		       trans_pcie->n_no_reclaim_cmds);
 
 	trans_pcie->rx_buf_size_8k = trans_cfg->rx_buf_size_8k;
 	if (trans_pcie->rx_buf_size_8k)
diff --git a/drivers/net/wireless/rtlwifi/efuse.c b/drivers/net/wireless/rtlwifi/efuse.c
index 0b4082c..a3135c5 100644
--- a/drivers/net/wireless/rtlwifi/efuse.c
+++ b/drivers/net/wireless/rtlwifi/efuse.c
@@ -251,8 +251,8 @@ void read_efuse(struct ieee80211_hw *hw, u16 _offset, u16 _size_byte, u8 *pbuf)
 	}
 
 	/* allocate memory for efuse_tbl and efuse_word */
-	efuse_tbl = kzalloc(rtlpriv->cfg->maps[EFUSE_HWSET_MAX_SIZE] *
-			    sizeof(u8), GFP_ATOMIC);
+	efuse_tbl = kzalloc(rtlpriv->cfg->maps[EFUSE_HWSET_MAX_SIZE],
+			    GFP_ATOMIC);
 	if (!efuse_tbl)
 		return;
 	efuse_word = kzalloc(EFUSE_MAX_WORD_UNIT * sizeof(u16 *), GFP_ATOMIC);
@@ -733,8 +733,8 @@ static int efuse_pg_packet_read(struct ieee80211_hw *hw, u8 offset, u8 *data)
 	if (offset > 15)
 		return false;
 
-	memset(data, 0xff, PGPKT_DATA_SIZE * sizeof(u8));
-	memset(tmpdata, 0xff, PGPKT_DATA_SIZE * sizeof(u8));
+	memset(data, 0xff, PGPKT_DATA_SIZE);
+	memset(tmpdata, 0xff, PGPKT_DATA_SIZE);
 
 	while (continual && (efuse_addr < EFUSE_MAX_SIZE)) {
 		if (readstate & PG_STATE_HEADER) {
@@ -772,7 +772,7 @@ static void efuse_write_data_case1(struct ieee80211_hw *hw, u16 *efuse_addr,
 	struct rtl_priv *rtlpriv = rtl_priv(hw);
 	struct pgpkt_struct tmp_pkt;
 	int dataempty = true;
-	u8 originaldata[8 * sizeof(u8)];
+	u8 originaldata[8];
 	u8 badworden = 0x0F;
 	u8 match_word_en, tmp_word_en;
 	u8 tmpindex;
@@ -881,7 +881,7 @@ static void efuse_write_data_case2(struct ieee80211_hw *hw, u16 *efuse_addr,
 	struct pgpkt_struct tmp_pkt;
 	u8 pg_header;
 	u8 tmp_header;
-	u8 originaldata[8 * sizeof(u8)];
+	u8 originaldata[8];
 	u8 tmp_word_cnts;
 	u8 badworden = 0x0F;
 
@@ -904,7 +904,7 @@ static void efuse_write_data_case2(struct ieee80211_hw *hw, u16 *efuse_addr,
 
 		tmp_word_cnts = efuse_calculate_word_cnts(tmp_pkt.word_en);
 
-		memset(originaldata, 0xff,  8 * sizeof(u8));
+		memset(originaldata, 0xff,  8);
 
 		if (efuse_pg_packet_read(hw, tmp_pkt.offset, originaldata)) {
 			badworden = enable_efuse_data_write(hw,
@@ -962,7 +962,7 @@ static int efuse_pg_packet_write(struct ieee80211_hw *hw,
 	target_pkt.offset = offset;
 	target_pkt.word_en = word_en;
 
-	memset(target_pkt.data, 0xFF,  8 * sizeof(u8));
+	memset(target_pkt.data, 0xFF,  8);
 
 	efuse_word_enable_data_read(word_en, data, target_pkt.data);
 	target_word_cnts = efuse_calculate_word_cnts(target_pkt.word_en);
-- 
1.9.3

^ permalink raw reply related

* Re: [PATCH] ixgbe: Look up MAC address in Open Firmware
From: Jesse Brandeburg @ 2014-11-14 21:11 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: Sergei Shtylyov, netdev, linux.nics, jesse.brandeburg
In-Reply-To: <yq1mw7tv8jq.fsf@sermon.lab.mkp.net>

On Fri, 14 Nov 2014 15:20:09 -0500
"Martin K. Petersen" <martin.petersen@oracle.com> wrote:
> commit 01e25f145972563ee87ebf85b7cb02a4ff8fce3b
> Author: Martin K. Petersen <martin.petersen@oracle.com>
> Date:   Wed Nov 12 20:47:42 2014 -0500
> 
>     ixgbe: Look up MAC address in Open Firmware
>     
>     Attempt to look up the MAC address in Open Firmware on systems that
>     support it. If the "local-mac-address" property is not valid resort to
>     using the IDPROM value.
>     
>     Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

seems fine, but see below...

> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index d2df4e3d1032..0e45a43172eb 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -50,6 +50,11 @@
>  #include <linux/prefetch.h>
>  #include <scsi/fc/fc_fcoe.h>
>  
> +#ifdef CONFIG_OF
> +#include <asm/idprom.h>
> +#include <asm/prom.h>
> +#endif
> +
>  #include "ixgbe.h"
>  #include "ixgbe_common.h"
>  #include "ixgbe_dcb_82599.h"
> @@ -7960,6 +7965,31 @@ int ixgbe_wol_supported(struct ixgbe_adapter *adapter, u16 device_id,
>  }
>  
>  /**
> + * ixgbe_of_mac_addr - Look up MAC address in Open Firmware
> + * @adapter: Pointer to adapter struct
> + */
> +static void ixgbe_of_mac_addr(struct ixgbe_adapter *adapter)
> +{
> +#ifdef CONFIG_OF
> +	struct device_node *dp = pci_device_to_OF_node(adapter->pdev);
> +	struct ixgbe_hw *hw = &adapter->hw;
> +	const unsigned char *addr;
> +	int len;
> +
> +	addr = of_get_property(dp, "local-mac-address", &len);
> +	if (addr && len == 6) {
> +		e_dev_info("Using OpenPROM MAC address\n");
> +		memcpy(hw->mac.perm_addr, addr, 6);
> +	}
> +
> +	if (!is_valid_ether_addr(hw->mac.perm_addr)) {
> +		e_dev_info("Using IDPROM MAC address\n");

There really isn't much point to print this message, just delete this
e_dev_info line.

^ permalink raw reply

* Re: [PATCH] ixgbe: Look up MAC address in Open Firmware
From: David Miller @ 2014-11-14 21:25 UTC (permalink / raw)
  To: martin.petersen; +Cc: netdev, linux.nics
In-Reply-To: <yq1r3x5vbh9.fsf@sermon.lab.mkp.net>

From: "Martin K. Petersen" <martin.petersen@oracle.com>
Date: Fri, 14 Nov 2014 14:16:50 -0500

> +#ifdef CONFIG_OF
> +#include <asm/idprom.h>
> +#include <asm/prom.h>
> +#endif

CONFIG_OF doesn't imply the presence of idprom.

idprom is completely sparc/m68k specific, whereas OF exists on many
platforms other than sparc/m68k.

^ permalink raw reply

* Re: [PATCH Iproute2 next] ip link: Add ipvlan support to the iproute2/ip util
From: Mahesh Bandewar @ 2014-11-14 21:25 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: netdev, Stephen Hemminger, Eric Dumazet, Maciej Zenczykowski,
	Laurent Chavey, Tim Hockin, David Miller, Brandon Philips,
	Pavel Emelianov
In-Reply-To: <20141114100959.GA19965@unicorn.suse.cz>

On Fri, Nov 14, 2014 at 2:09 AM, Michal Kubecek <mkubecek@suse.cz> wrote:
> On Thu, Nov 13, 2014 at 10:56:51PM -0800, Mahesh Bandewar wrote:
>> Adding basic support to create virtual devices using 'ip'
>> utility. Following is the syntax -
>>
>>       ip link add link <master> <virtual> mode [ l2 | l3 ]
>>       e.g. ip link add link eth0 ipvl0 mode l3
>
> I suppose "type ipvlan" is missing on both lines.
>
yes it does, damn!

> Michal Kubecek
>

^ permalink raw reply

* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
From: David Miller @ 2014-11-14 21:35 UTC (permalink / raw)
  To: hannes
  Cc: cwang, eric.dumazet, netdev, ogerlitz, pshelar, jesse,
	jay.vosburgh, discuss
In-Reply-To: <1415997733.15154.62.camel@localhost>

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Fri, 14 Nov 2014 21:42:13 +0100

> On Fr, 2014-11-14 at 11:02 -0800, Cong Wang wrote:
>> 
>> I am wondering, compare with alternative call, how slower is just
>> testing
>> cpu_has_xmm4_2?
> 
> I can test, but cpu_has_xmm4_2 expands into quite some code. I don't
> know if indirect function call or a test of this flag is faster. I'll
> test this.

I'm going to apply Jay's revert, and then you guys can take your time
sorting this out.

^ permalink raw reply

* Re: [PATCH net-next] net: introduce SO_INCOMING_CPU
From: Tom Herbert @ 2014-11-14 21:36 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Eric Dumazet, Michael Kerrisk, David Miller, netdev, Ying Cai,
	Willem de Bruijn, Neal Cardwell, Linux API
In-Reply-To: <CALCETrWYrR2X9u7CAr5xkSg8kLH_O=sdk=JF+ZW+FxCTNTFBZg@mail.gmail.com>

On Fri, Nov 14, 2014 at 12:34 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Fri, Nov 14, 2014 at 12:25 PM, Tom Herbert <therbert@google.com> wrote:
>> On Fri, Nov 14, 2014 at 12:16 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>> On Fri, Nov 14, 2014 at 11:52 AM, Tom Herbert <therbert@google.com> wrote:
>>>> On Fri, Nov 14, 2014 at 11:33 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>>>> On Fri, 2014-11-14 at 09:17 -0800, Andy Lutomirski wrote:
>>>>>
>>>>>> As a heavy user of RFS (and finder of bugs in it, too), here's my
>>>>>> question about this API:
>>>>>>
>>>>>> How does an application tell whether the socket represents a
>>>>>> non-actively-steered flow?  If the flow is subject to RFS, then moving
>>>>>> the application handling to the socket's CPU seems problematic, as the
>>>>>> socket's CPU might move as well.  The current implementation in this
>>>>>> patch seems to tell me which CPU the most recent packet came in on,
>>>>>> which is not necessarily very useful.
>>>>>
>>>>> Its the cpu that hit the TCP stack, bringing dozens of cache lines in
>>>>> its cache. This is all that matters,
>>>>>
>>>>>>
>>>>>> Some possibilities:
>>>>>>
>>>>>> 1. Let SO_INCOMING_CPU fail if RFS or RPS are in play.
>>>>>
>>>>> Well, idea is to not use RFS at all. Otherwise, it is useless.
>>>
>>> Sure, but how do I know that it'll be the same CPU next time?
>>>
>>>>>
>>>> Bear in mind this is only an interface to report RX CPU and in itself
>>>> doesn't provide any functionality for changing scheduling, there is
>>>> obviously logic needed in user space that would need to do something.
>>>>
>>>> If we track the interrupting CPU in skb, the interface could be easily
>>>> extended to provide the interrupting CPU, the RPS CPU (calculated at
>>>> reported time), and the CPU processing transport (post steering which
>>>> is what is currently returned). That would provide the complete
>>>> picture to control scheduling a flow from userspace, and an interface
>>>> to selectively turn off RFS for a socket would make sense then.
>>>
>>> I think that a turn-off-RFS interface would also want a way to figure
>>> out where the flow would go without RFS.  Can the network stack do
>>> that (e.g. evaluate the rx indirection hash or whatever happens these
>>> days)?
>>>
>> Yes,. We need the rxhash and the CPU that packets are received on from
>> the device for the socket. The former we already have, the latter
>> might be done by adding a field to skbuff to set received CPU. Given
>> the L4 hash and interrupting CPU we can calculated the RPS CPU which
>> is where packet would have landed with RFS off.
>
> Hmm.  I think this would be useful for me.  It would *definitely* be
> useful for me if I could pin an RFS flow to a cpu of my choice.
>
Andy, can you elaborate a little more on your use case. I've thought
several times about an interface to program the flow table from
userspace, but never quite came up with a compelling use case and
there is the security concern that a user could "steal" cycles from
arbitrary CPUs.

> With SO_INCOMING_CPU as described, I'm worried that people will write
> programs that perform very well if RFS is off, but that once that code
> runs with RFS on, weird things could happen.
>
> (On a side note: the RFS flow hash stuff seems to be rather buggy.
> Some Solarflare engineers know about this, but a fix seems to be
> rather slow in the works.  I think that some of the bugs are in core
> code, though.)

This is problems with accelerated RFS or just getting the flow hash for packets?

Thanks,
Tom

>
> --Andy
>
>>
>>>>
>>>>> RFS is the other way around : You want the flow to follow your thread.
>>>>>
>>>>> RPS wont be a problem if you have sensible RPS settings.
>>>>>
>>>>>>
>>>>>> 2. Change the interface a bit to report the socket's preferred CPU
>>>>>> (where it would go without RFS, for example) and then let the
>>>>>> application use setsockopt to tell the socket to stay put (i.e. turn
>>>>>> off RFS and RPS for that flow).
>>>>>>
>>>>>> 3. Report the preferred CPU as in (2) but let the application ask for
>>>>>> something different.
>>>>>>
>>>>>> For example, I have flows for which I know which CPU I want.  A nice
>>>>>> API to put the flow there would be quite useful.
>>>>>>
>>>>>>
>>>>>> Also, it may be worth changing the naming to indicate that these are
>>>>>> about the rx cpu (they are, right?).  For some applications (sparse,
>>>>>> low-latency flows, for example), it can be useful to keep the tx
>>>>>> completion handling on a different CPU.
>>>>>
>>>>> SO_INCOMING_CPU is rx, like incoming ;)
>>>>>
>>>>>
>>>
>>> Duh :)
>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>>
>>> --
>>> Andy Lutomirski
>>> AMA Capital Management, LLC
>
>
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC

^ permalink raw reply

* Re: [PATCH net-next] Revert "fast_hash: avoid indirect function calls"
From: David Miller @ 2014-11-14 21:36 UTC (permalink / raw)
  To: jay.vosburgh
  Cc: hannes, eric.dumazet, netdev, ogerlitz, pshelar, jesse, discuss
In-Reply-To: <16481.1415991906@famine>

From: Jay Vosburgh <jay.vosburgh@canonical.com>
Date: Fri, 14 Nov 2014 11:05:06 -0800

> 
> This reverts commit e5a2c899957659cd1a9f789bc462f9c0b35f5150.
> 
> 	Commit e5a2c899 introduced an alternative_call, arch_fast_hash2,
> that selects between __jhash2 and __intel_crc4_2_hash based on the
> X86_FEATURE_XMM4_2.
> 
> 	Unfortunately, the alternative_call system does not appear to be
> suitable for use with C functions, as register usage is not handled
> properly for the called functions.  The __jhash2 function in particular
> clobbers registers that are not preserved when called via
> alternative_call, resulting in a panic for direct callers of
> arch_fast_hash2 on older CPUs lacking sse4_2.  It is possible that
> __intel_crc4_2_hash works merely by chance because it uses fewer
> registers.
> 
> 	This commit was suggested as the source of the problem by Jesse
> Gross <jesse@nicira.com>.
> 
> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>

Applied, thanks Jay.

^ permalink raw reply

* Re: [linux-nics] [PATCH] ixgbe: Look up MAC address in Open Firmware
From: Jeff Kirsher @ 2014-11-14 21:40 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Sergei Shtylyov, David Miller, Jesse Brandeburg, netdev,
	linux.nics
In-Reply-To: <yq1mw7tv8jq.fsf@sermon.lab.mkp.net>

[-- Attachment #1: Type: text/plain, Size: 1298 bytes --]

On Fri, 2014-11-14 at 15:20 -0500, Martin K. Petersen wrote:
> >>>>> "Sergei" == Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> writes:
> 
> >> +#ifdef CONFIG_OF
> >> + ixgbe_of_mac_addr(adapter);
> >> +#endif
> 
> Sergei>    Eww... why not define the following above instead:
> 
> Sergei> #else
> Sergei> static inline void ixgbe_of_mac_addr(struct ixgbe_adapter
> Sergei> *adapter) {}
> Sergei> #endif
> 
> I don't care much either way. But we might as well do this, then, and
> shave off an ifdef...
> 
> commit 01e25f145972563ee87ebf85b7cb02a4ff8fce3b
> Author: Martin K. Petersen <martin.petersen@oracle.com>
> Date:   Wed Nov 12 20:47:42 2014 -0500
> 
>     ixgbe: Look up MAC address in Open Firmware
>     
>     Attempt to look up the MAC address in Open Firmware on systems
> that
>     support it. If the "local-mac-address" property is not valid
> resort to
>     using the IDPROM value.
>     
>     Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Was this meant to be a v2 patch???  Yikes!

Based on the feedback from Segei, Dave and Jesse.  Dropping this patch
from my queue and will wait for a properly formatted v2 of this patch.

Please CC me on any future patches against any changes to
drivers/net/ethernet/intel/*

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply

* Re: [PATCH net-next] Revert "fast_hash: avoid indirect function calls"
From: Hannes Frederic Sowa @ 2014-11-14 21:43 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: David Miller, eric.dumazet, netdev, ogerlitz, pshelar, jesse,
	discuss
In-Reply-To: <16481.1415991906@famine>

On Fr, 2014-11-14 at 11:05 -0800, Jay Vosburgh wrote:
> This reverts commit e5a2c899957659cd1a9f789bc462f9c0b35f5150.
> 
> 	Commit e5a2c899 introduced an alternative_call, arch_fast_hash2,
> that selects between __jhash2 and __intel_crc4_2_hash based on the
> X86_FEATURE_XMM4_2.
> 
> 	Unfortunately, the alternative_call system does not appear to be
> suitable for use with C functions, as register usage is not handled
> properly for the called functions.  The __jhash2 function in particular
> clobbers registers that are not preserved when called via
> alternative_call, resulting in a panic for direct callers of
> arch_fast_hash2 on older CPUs lacking sse4_2.  It is possible that
> __intel_crc4_2_hash works merely by chance because it uses fewer
> registers.
> 
> 	This commit was suggested as the source of the problem by Jesse
> Gross <jesse@nicira.com>.
> 
> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>

I am totally fine to revert this and try to come up with a better
solution.

Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Bye,
Hannes

^ permalink raw reply

* Re: [PATCH] inetdevice: fixed signed integer overflow
From: David Miller @ 2014-11-14 22:09 UTC (permalink / raw)
  To: vincent.benayoun; +Cc: jiri, ja, nicolas.dichtel, linux-kernel, netdev
In-Reply-To: <1e488fa546724140b55c8d4bd6b5a9d6@S1688.EX1688.lan>

From: Vincent Benayoun <vincent.benayoun@trust-in-soft.com>
Date: Thu, 13 Nov 2014 15:04:57 +0000

> From f25ff0f2645f9763552147d480f86973b7041e26 Mon Sep 17 00:00:00 2001
> From: Vincent BENAYOUN <vincent.benayoun@trust-in-soft.com>
> Date: Thu, 13 Nov 2014 13:47:26 +0100
> Subject: [PATCH] inetdevice: fixed signed integer overflow
> 
> There could be a signed overflow in the following code.
> 
> The expression, (32-logmask) is comprised between 0 and 31 included.
> It may be equal to 31.
> In such a case the left shift will produce a signed integer overflow.
> According to the C99 Standard, this is an undefined behavior.
> A simple fix is to replace the signed int 1 with the unsigned int 1U.
> 
> Signed-off-by: Vincent BENAYOUN <vincent.benayoun@trust-in-soft.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
From: Jay Vosburgh @ 2014-11-14 22:10 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: David Miller, eric.dumazet, netdev, ogerlitz, pshelar, jesse,
	discuss
In-Reply-To: <1415997309.15154.59.camel@localhost>

Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
[...]
>I created it via the function calling convention documented in
>arch/x86/include/asm/calling.h, so I specified each register which a
>function is allowed to clobber with.
>
>I currently cannot see how I can resolve the invalid constraints error
>easily. :(
>
>So either go with my first patch, which I puts the alternative_call
>switch point into its own function without ever inlining or the patch
>needs to be reverted. :/

	As a data point, I tested the first patch as well, and the
system does not panic with it in place.  Inspection shows that it's
using %r14 in place of %r8 in the prior (crashing) implementation.

	Disassembly of the call site (on the non-sse4_1 system) in
ovs_flow_tbl_insert with the first patch applied looks like this:

0xffffffffa00b6bb9 <ovs_flow_tbl_insert+0xb9>:	mov    %r15,0x348(%r14)
0xffffffffa00b6bc0 <ovs_flow_tbl_insert+0xc0>:	movzwl 0x28(%r15),%ecx
0xffffffffa00b6bc5 <ovs_flow_tbl_insert+0xc5>:	movzwl 0x2a(%r15),%esi
0xffffffffa00b6bca <ovs_flow_tbl_insert+0xca>:	movzwl %cx,%eax
0xffffffffa00b6bcd <ovs_flow_tbl_insert+0xcd>:	sub    %ecx,%esi
0xffffffffa00b6bcf <ovs_flow_tbl_insert+0xcf>:	lea    0x38(%r14,%rax,1),%rdi
0xffffffffa00b6bd4 <ovs_flow_tbl_insert+0xd4>:	sar    $0x2,%esi
0xffffffffa00b6bd7 <ovs_flow_tbl_insert+0xd7>:	callq  0xffffffff813a7810 <__jhash2>
0xffffffffa00b6bdc <ovs_flow_tbl_insert+0xdc>:	mov    %eax,0x30(%r14)
0xffffffffa00b6be0 <ovs_flow_tbl_insert+0xe0>:	mov    (%rbx),%r13
0xffffffffa00b6be3 <ovs_flow_tbl_insert+0xe3>:	mov    %r14,%rsi
0xffffffffa00b6be6 <ovs_flow_tbl_insert+0xe6>:	mov    %r13,%rdi
0xffffffffa00b6be9 <ovs_flow_tbl_insert+0xe9>:	callq  0xffffffffa00b61a0 <table_instance_insert>

	Compared to the panicking version's function:

0xffffffffa01a55c9 <ovs_flow_tbl_insert+0xb9>:  mov    %r15,0x348(%r8)
0xffffffffa01a55d0 <ovs_flow_tbl_insert+0xc0>:  movzwl 0x28(%r15),%ecx
0xffffffffa01a55d5 <ovs_flow_tbl_insert+0xc5>:  movzwl 0x2a(%r15),%esi
0xffffffffa01a55da <ovs_flow_tbl_insert+0xca>:  movzwl %cx,%eax
0xffffffffa01a55dd <ovs_flow_tbl_insert+0xcd>:  sub    %ecx,%esi
0xffffffffa01a55df <ovs_flow_tbl_insert+0xcf>:  lea    0x38(%r8,%rax,1),%rdi
0xffffffffa01a55e4 <ovs_flow_tbl_insert+0xd4>:  sar    $0x2,%esi
0xffffffffa01a55e7 <ovs_flow_tbl_insert+0xd7>:  callq  0xffffffff813a75c0 <__jhash2>
0xffffffffa01a55ec <ovs_flow_tbl_insert+0xdc>:  mov    %eax,0x30(%r8)
0xffffffffa01a55f0 <ovs_flow_tbl_insert+0xe0>:  mov    (%rbx),%r13
0xffffffffa01a55f3 <ovs_flow_tbl_insert+0xe3>:  mov    %r8,%rsi
0xffffffffa01a55f6 <ovs_flow_tbl_insert+0xe6>:  mov    %r13,%rdi
0xffffffffa01a55f9 <ovs_flow_tbl_insert+0xe9>:  callq  0xffffffffa01a4ba0 <table_instance_insert>

	It appears to generate the same instructions, but allocates
registers differently (using %r14 instead of %r8).

	The __jhash2 disassembly appears to be unchanged between the two
versions.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply

* Re: pull request: wireless 2014-11-13
From: David Miller @ 2014-11-14 22:10 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20141113212815.GA12075@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Thu, 13 Nov 2014 16:28:16 -0500

> Please pull this set of a few more wireless fixes intended for the
> 3.18 stream...

Pulled, thanks John.

^ permalink raw reply

* Re: [PATCH net-next] net: introduce SO_INCOMING_CPU
From: Andy Lutomirski @ 2014-11-14 22:10 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Eric Dumazet, Michael Kerrisk, David Miller, netdev, Ying Cai,
	Willem de Bruijn, Neal Cardwell, Linux API
In-Reply-To: <CA+mtBx9dSNOD5Z8GinU8F1mj_PgU0sTC5g8S9vRuSnSUzAoD=g@mail.gmail.com>

On Fri, Nov 14, 2014 at 1:36 PM, Tom Herbert <therbert@google.com> wrote:
> On Fri, Nov 14, 2014 at 12:34 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Fri, Nov 14, 2014 at 12:25 PM, Tom Herbert <therbert@google.com> wrote:
>>> On Fri, Nov 14, 2014 at 12:16 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>> On Fri, Nov 14, 2014 at 11:52 AM, Tom Herbert <therbert@google.com> wrote:
>>>>> On Fri, Nov 14, 2014 at 11:33 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>>>>> On Fri, 2014-11-14 at 09:17 -0800, Andy Lutomirski wrote:
>>>>>>
>>>>>>> As a heavy user of RFS (and finder of bugs in it, too), here's my
>>>>>>> question about this API:
>>>>>>>
>>>>>>> How does an application tell whether the socket represents a
>>>>>>> non-actively-steered flow?  If the flow is subject to RFS, then moving
>>>>>>> the application handling to the socket's CPU seems problematic, as the
>>>>>>> socket's CPU might move as well.  The current implementation in this
>>>>>>> patch seems to tell me which CPU the most recent packet came in on,
>>>>>>> which is not necessarily very useful.
>>>>>>
>>>>>> Its the cpu that hit the TCP stack, bringing dozens of cache lines in
>>>>>> its cache. This is all that matters,
>>>>>>
>>>>>>>
>>>>>>> Some possibilities:
>>>>>>>
>>>>>>> 1. Let SO_INCOMING_CPU fail if RFS or RPS are in play.
>>>>>>
>>>>>> Well, idea is to not use RFS at all. Otherwise, it is useless.
>>>>
>>>> Sure, but how do I know that it'll be the same CPU next time?
>>>>
>>>>>>
>>>>> Bear in mind this is only an interface to report RX CPU and in itself
>>>>> doesn't provide any functionality for changing scheduling, there is
>>>>> obviously logic needed in user space that would need to do something.
>>>>>
>>>>> If we track the interrupting CPU in skb, the interface could be easily
>>>>> extended to provide the interrupting CPU, the RPS CPU (calculated at
>>>>> reported time), and the CPU processing transport (post steering which
>>>>> is what is currently returned). That would provide the complete
>>>>> picture to control scheduling a flow from userspace, and an interface
>>>>> to selectively turn off RFS for a socket would make sense then.
>>>>
>>>> I think that a turn-off-RFS interface would also want a way to figure
>>>> out where the flow would go without RFS.  Can the network stack do
>>>> that (e.g. evaluate the rx indirection hash or whatever happens these
>>>> days)?
>>>>
>>> Yes,. We need the rxhash and the CPU that packets are received on from
>>> the device for the socket. The former we already have, the latter
>>> might be done by adding a field to skbuff to set received CPU. Given
>>> the L4 hash and interrupting CPU we can calculated the RPS CPU which
>>> is where packet would have landed with RFS off.
>>
>> Hmm.  I think this would be useful for me.  It would *definitely* be
>> useful for me if I could pin an RFS flow to a cpu of my choice.
>>
> Andy, can you elaborate a little more on your use case. I've thought
> several times about an interface to program the flow table from
> userspace, but never quite came up with a compelling use case and
> there is the security concern that a user could "steal" cycles from
> arbitrary CPUs.

I have a bunch of threads that are pinned to various CPUs or groups of
CPUs.  Each thread is responsible for a fixed set of flows.  I'd like
those flows to go to those CPUs.

RFS will eventually do it, but it would be nice if I could
deterministically ask for a flow to be routed to the right CPU.  Also,
if my thread bounces temporarily to another CPU, I don't really need
the flow to follow it -- I'd like it to stay put.

This has a significant benefit over using automatic steering: with
automatic steering, I have to make all of the hash tables have a size
around the square of the total number of the flows in order to make it
reliable.

Something like SO_STEER_TO_THIS_CPU would be fine, as long as it
reported whether it worked (for my diagnostics).

>
>> With SO_INCOMING_CPU as described, I'm worried that people will write
>> programs that perform very well if RFS is off, but that once that code
>> runs with RFS on, weird things could happen.
>>
>> (On a side note: the RFS flow hash stuff seems to be rather buggy.
>> Some Solarflare engineers know about this, but a fix seems to be
>> rather slow in the works.  I think that some of the bugs are in core
>> code, though.)
>
> This is problems with accelerated RFS or just getting the flow hash for packets?

Accelerated RFS.

Digging through my email, I thought that
net.core.rps_sock_flow_entries != the per-queue rps_flow_cnt would
make no sense, although I haven't configured it that way.

More importantly, though, I think that some of the has table stuff is
problematic.  My understanding is:

get_rps_cpu may call set_rps_cpu, passing rflow =
flow_table->flows[hash & flow_table->mask];

set_rps_cpu will compute flow_id = hash & flow_table->mask, which
looks to me like it has the property that rflow == flow_table[hash]
(unless we race with a hash table resize).

Now set_rps_cpu tries to steer the new flow to the right CPU (all good
so far), but then it gets weird.  We have a very high probability of
old_flow == rflow.  rflow->filter gets overwritten with the filter id,
the if condition doesn't execute, and nothing gets set to
RPS_NO_FILTER.

This is technically all correct, but if there are two active flows
with the same hash, they'll each keep getting steered to the same
place.  This wastes cycles and seems to anger the sfc driver (the
latter is presumably an sfc bug).  It also means that some of the
filters are likely to get expired for no good reason.

Also, shouldn't ndo_rx_flow_steer be passed the full hash, not just
the index into the hash table?  What's the point of cutting off the
high bits?

--Andy

--Andy

^ permalink raw reply

* [PATCH] brcmfmac: fix error handling of irq_of_parse_and_map
From: Dmitry Torokhov @ 2014-11-14 22:12 UTC (permalink / raw)
  To: John W. Linville
  Cc: Brett Rudley, Arend van Spriel, Franky (Zhenhui) Lin,
	Hante Meuleman, Pieter-Paul Giesberts, linux-wireless,
	brcm80211-dev-list, netdev, linux-kernel

Return value of irq_of_parse_and_map() is unsigned int, with 0
indicating failure, so testing for negative result never works.

Signed-off-by: Dmitry Torokhov <dtor@chromium.org>
---

Not tested, found by casual code inspection.

 drivers/net/wireless/brcm80211/brcmfmac/of.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/brcm80211/brcmfmac/of.c b/drivers/net/wireless/brcm80211/brcmfmac/of.c
index eb3fce82..c824570 100644
--- a/drivers/net/wireless/brcm80211/brcmfmac/of.c
+++ b/drivers/net/wireless/brcm80211/brcmfmac/of.c
@@ -40,8 +40,8 @@ void brcmf_of_probe(struct brcmf_sdio_dev *sdiodev)
 		return;
 
 	irq = irq_of_parse_and_map(np, 0);
-	if (irq < 0) {
-		brcmf_err("interrupt could not be mapped: err=%d\n", irq);
+	if (!irq) {
+		brcmf_err("interrupt could not be mapped\n");
 		devm_kfree(dev, sdiodev->pdata);
 		return;
 	}
-- 
2.1.0.rc2.206.gedb03e5


-- 
Dmitry

^ permalink raw reply related

* Re: [PATCHv2 net 0/4] Implement ndo_gso_check() for vxlan nics
From: David Miller @ 2014-11-14 22:13 UTC (permalink / raw)
  To: joestringer
  Cc: netdev, sathya.perla, shahed.shaikh, amirv, Dept-GELinuxNICDev,
	therbert, gerlitz.or, linux-kernel
In-Reply-To: <1415925495-59312-1-git-send-email-joestringer@nicira.com>

From: Joe Stringer <joestringer@nicira.com>
Date: Thu, 13 Nov 2014 16:38:11 -0800

> Most NICs that report NETIF_F_GSO_UDP_TUNNEL support VXLAN, and not other
> UDP-based encapsulation protocols where the format and size of the header may
> differ. This patch series implements a generic ndo_gso_check() for detecting
> VXLAN, then reuses it for these NICs.
> 
> Implementation shamelessly stolen from Tom Herbert (with minor fixups):
> http://thread.gmane.org/gmane.linux.network/332428/focus=333111
> 
> v2: Drop i40e/fm10k patches (code diverged; handling separately).
>     Refactor common code into vxlan_gso_check() helper.
>     Minor style fixes.

Series applied, thanks Joe.

^ permalink raw reply

* Re: [PATCH net-next] net: introduce SO_INCOMING_CPU
From: Eric Dumazet @ 2014-11-14 22:16 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Tom Herbert, Michael Kerrisk, David Miller, netdev, Ying Cai,
	Willem de Bruijn, Neal Cardwell, Linux API
In-Reply-To: <CALCETrUuChgxXcNrKjYzpe1ry0hx3kf19EfAbY1N1YnG7NFZUw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Fri, 2014-11-14 at 12:16 -0800, Andy Lutomirski wrote:

> Sure, but how do I know that it'll be the same CPU next time?

Because the NIC always use same RX queue for a given flow.

So if you setup your IRQ affinities properly, the same CPU will drain
packets from this RX queue. And since RFS is off, you have the guarantee
the same CPU will be used to process packets in TCP stack.

This SO_INCOMING_CPU info is a hint, there is no guarantee eg if you use
bonding and some load balancer or switch decides to send packets on
different links.

Most NIC use Toeplitz hash, so given the 4-tuple, and rss key (40
bytes), you can actually compute the hash in software and know on which
RX queue traffic should land.

^ permalink raw reply

* Re: [PATCH net-next] net: introduce SO_INCOMING_CPU
From: Andy Lutomirski @ 2014-11-14 22:18 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tom Herbert, Michael Kerrisk, David Miller, netdev, Ying Cai,
	Willem de Bruijn, Neal Cardwell, Linux API
In-Reply-To: <1416003371.17262.66.camel-XN9IlZ5yJG9HTL0Zs8A6p/gx64E7kk8eUsxypvmhUTTZJqsBc5GL+g@public.gmane.org>

On Fri, Nov 14, 2014 at 2:16 PM, Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, 2014-11-14 at 12:16 -0800, Andy Lutomirski wrote:
>
>> Sure, but how do I know that it'll be the same CPU next time?
>
> Because the NIC always use same RX queue for a given flow.
>
> So if you setup your IRQ affinities properly, the same CPU will drain
> packets from this RX queue. And since RFS is off, you have the guarantee
> the same CPU will be used to process packets in TCP stack.

Right.  My concern is that if RFS is off, everything works fine, but
if RFS is on (and the app has no good way to know), then silly results
will happen.  I think I'd rather have the getsockopt fail if RFS is on
or at least give some indication so that the app can react
accordingly.

--Andy

>
> This SO_INCOMING_CPU info is a hint, there is no guarantee eg if you use
> bonding and some load balancer or switch decides to send packets on
> different links.
>
> Most NIC use Toeplitz hash, so given the 4-tuple, and rss key (40
> bytes), you can actually compute the hash in software and know on which
> RX queue traffic should land.
>
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply

* Re: [PATCH] ixgbe: Look up MAC address in Open Firmware
From: Florian Fainelli @ 2014-11-14 22:23 UTC (permalink / raw)
  To: Martin K. Petersen, netdev; +Cc: linux.nics
In-Reply-To: <yq1r3x5vbh9.fsf@sermon.lab.mkp.net>

On 11/14/2014 11:16 AM, Martin K. Petersen wrote:

[snip]

> +#ifdef CONFIG_OF
> +/**
> + * ixgbe_of_mac_addr - Look up MAC address in Open Firmware
> + * @adapter: Pointer to adapter struct
> + */
> +static void ixgbe_of_mac_addr(struct ixgbe_adapter *adapter)
> +{
> +	struct device_node *dp = pci_device_to_OF_node(adapter->pdev);
> +	struct ixgbe_hw *hw = &adapter->hw;
> +	const unsigned char *addr;
> +	int len;
> +
> +	addr = of_get_property(dp, "local-mac-address", &len);
> +	if (addr && len == 6) {
> +		e_dev_info("Using OpenPROM MAC address\n");
> +		memcpy(hw->mac.perm_addr, addr, 6);
> +	}

Cannot you use of_get_mac_address() here which does iterate through all
the OF standard ways to specify a MAC address: mac-address,
local-mac-address and address? Benefit is that this will work for all
DT-enabled platforms.

> +
> +	if (!is_valid_ether_addr(hw->mac.perm_addr)) {
> +		e_dev_info("Using IDPROM MAC address\n");
> +		memcpy(hw->mac.perm_addr, idprom->id_ethaddr, 6);
> +	}
> +}
> +#endif
> +
>  /**
>   * ixgbe_probe - Device Initialization Routine
>   * @pdev: PCI device information struct
> @@ -8223,6 +8253,10 @@ skip_sriov:
>  		goto err_sw_init;
>  	}
>  
> +#ifdef CONFIG_OF
> +	ixgbe_of_mac_addr(adapter);
> +#endif
> +
>  	memcpy(netdev->dev_addr, hw->mac.perm_addr, netdev->addr_len);
>  
>  	if (!is_valid_ether_addr(netdev->dev_addr)) {
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox