Netdev List
 help / color / mirror / Atom feed
* Re: brcmsmac: phy_lcn: remove duplicate code
From: Kalle Valo @ 2018-04-25  8:24 UTC (permalink / raw)
  To: Gustavo A. R. Silva
  Cc: Arend van Spriel, Franky Lin, Hante Meuleman, Chi-Hsien Lin,
	Wright Feng, linux-wireless, brcm80211-dev-list.pdl,
	brcm80211-dev-list, netdev, linux-kernel, Gustavo A. R. Silva
In-Reply-To: <20180405000944.GA22743@embeddedor.com>

"Gustavo A. R. Silva" <gustavo@embeddedor.com> wrote:

> Remove and refactor some code in order to avoid having identical code
> for different branches.
> 
> Notice that this piece of code hasn't been modified since 2011.
> 
> Addresses-Coverity-ID: 1226756 ("Identical code for different branches")
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

Patch applied to wireless-drivers-next.git, thanks.

863683cfbbfc brcmsmac: phy_lcn: remove duplicate code

-- 
https://patchwork.kernel.org/patch/10323665/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* [PATCH bpf-next] bpf: Allow bpf_jit_enable = 2 with BPF_JIT_ALWAYS_ON config
From: Leo Yan @ 2018-04-25  8:18 UTC (permalink / raw)
  To: David S. Miller, Alexei Starovoitov, Daniel Borkmann,
	Kirill Tkhai, netdev, linux-kernel
  Cc: Leo Yan

After enabled BPF_JIT_ALWAYS_ON config, bpf_jit_enable always equals to
1; it is impossible to set 'bpf_jit_enable = 2' and the kernel has no
chance to call bpf_jit_dump().

This patch relaxes bpf_jit_enable range to [1..2] when kernel config
BPF_JIT_ALWAYS_ON is enabled so can invoke jit dump.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 net/core/sysctl_net_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index b1a2c5e..6a39b22 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -371,7 +371,7 @@ static int proc_dointvec_minmax_bpf_enable(struct ctl_table *table, int write,
 		.proc_handler	= proc_dointvec_minmax_bpf_enable,
 # ifdef CONFIG_BPF_JIT_ALWAYS_ON
 		.extra1		= &one,
-		.extra2		= &one,
+		.extra2		= &two,
 # else
 		.extra1		= &zero,
 		.extra2		= &two,
-- 
1.9.1

^ permalink raw reply related

* Re: [1/2] net: wireless: zydas: Replace mdelay with msleep in zd1201_probe
From: Kalle Valo @ 2018-04-25  8:15 UTC (permalink / raw)
  To: Jia-Ju Bai
  Cc: davem, stephen, arvind.yadav.cs, johannes.berg, linux-wireless,
	netdev, linux-kernel, Jia-Ju Bai
In-Reply-To: <1523367004-31935-1-git-send-email-baijiaju1990@gmail.com>

Jia-Ju Bai <baijiaju1990@gmail.com> wrote:

> zd1201_probe() is never called in atomic context.
> 
> zd1201_probe() is only set as ".probe" in struct usb_driver.
> 
> Despite never getting called from atomic context, zd1201_probe()
> calls mdelay() to busily wait.
> This is not necessary and can be replaced with msleep() to
> avoid busy waiting.
> 
> This is found by a static analysis tool named DCNS written by myself.
> And I also manually check it.
> 
> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>

I need a review from someone else before I'm willing to take this.

-- 
https://patchwork.kernel.org/patch/10333189/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH bpf-next 0/4] nfp: bpf: optimize negative sums
From: Daniel Borkmann @ 2018-04-25  7:58 UTC (permalink / raw)
  To: Jakub Kicinski, alexei.starovoitov; +Cc: oss-drivers, netdev
In-Reply-To: <20180425042239.27869-1-jakub.kicinski@netronome.com>

On 04/25/2018 06:22 AM, Jakub Kicinski wrote:
> Hi!
> 
> This set adds an optimization run to the NFP jit to turn ADD and SUB
> instructions with negative immediate into the opposite operation with
> a positive immediate.  NFP can fit small immediates into the instructions
> but it can't ever fit negative immediates.  Addition of small negative
> immediates is quite common in BPF programs for stack address calculations,
> therefore this optimization gives us non-negligible savings in instruction
> count (up to 4%).

Applied to bpf-next, thanks Jakub!

^ permalink raw reply

* Re: [PATCH bpf-next] bpf: clear the ip_tunnel_info.
From: Daniel Borkmann @ 2018-04-25  7:54 UTC (permalink / raw)
  To: William Tu, netdev
In-Reply-To: <1524638819-31626-1-git-send-email-u9012063@gmail.com>

On 04/25/2018 08:46 AM, William Tu wrote:
> The percpu metadata_dst might carry the stale ip_tunnel_info
> and cause incorrect behavior.  When mixing tests using ipv4/ipv6
> bpf vxlan and geneve tunnel, the ipv6 tunnel info incorrectly uses
> ipv4's src ip addr as its ipv6 src address, because the previous
> tunnel info does not clean up.  The patch zeros the fields in
> ip_tunnel_info.
> 
> Signed-off-by: William Tu <u9012063@gmail.com>
> Reported-by: Yifeng Sun <pkusunyifeng@gmail.com>

Since this is a fix, I've applied this to bpf, thanks William!

^ permalink raw reply

* Re: [PATCH] qtnfmac: fix qtnf_netdev_hard_start_xmit()'s return type
From: Sergey Matyukevich @ 2018-04-25  7:49 UTC (permalink / raw)
  To: Luc Van Oostenryck
  Cc: linux-kernel, Igor Mitsyanko, Avinash Patil, Sergey Matyukevich,
	Kalle Valo, Kees Cook, linux-wireless, netdev
In-Reply-To: <20180424131810.4963-1-luc.vanoostenryck@gmail.com>

Hi Luc and all,

> The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
> which is a typedef for an enum type, but the implementation in this
> driver returns an 'int'.
> 
> Fix this by returning 'netdev_tx_t' in this driver too.
> 
> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
> ---
>  drivers/net/wireless/quantenna/qtnfmac/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/quantenna/qtnfmac/core.c b/drivers/net/wireless/quantenna/qtnfmac/core.c
> index cf26c15a8..b3bfb4faa 100644
> --- a/drivers/net/wireless/quantenna/qtnfmac/core.c
> +++ b/drivers/net/wireless/quantenna/qtnfmac/core.c
> @@ -76,7 +76,7 @@ static int qtnf_netdev_close(struct net_device *ndev)
> 
>  /* Netdev handler for data transmission.
>   */
> -static int
> +static netdev_tx_t
>  qtnf_netdev_hard_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>  {
>         struct qtnf_vif *vif;

Previous ACK from Igor slipped through the cracks due to
outlook/exchange issues. So here is another one.

Reviewed-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>

Thanks for the fix !

Regards,
Sergey

^ permalink raw reply

* Re: [RFC bpf] bpf, x64: fix JIT emission for dead code
From: Daniel Borkmann @ 2018-04-25  7:48 UTC (permalink / raw)
  To: Gianluca Borello, netdev; +Cc: ast
In-Reply-To: <20180425054216.48961-1-g.borello@gmail.com>

On 04/25/2018 07:42 AM, Gianluca Borello wrote:
> Commit 2a5418a13fcf ("bpf: improve dead code sanitizing") replaced dead
> code with a series of ja-1 instructions, for safety. That made JIT
> compilation much more complex for some BPF programs. One instance of such
> programs is, for example:
[...]
> A possible approach to mitigate this behavior consists into noticing that
> for ja-1 instructions we don't really need to rely on the estimated size
> of the previous and current instructions, we know that a -1 BPF jump
> offset can be safely translated into a 0xEB instruction with a jump offset
> of -2.
> 
> Such fix brings the BPF program in the previous example to complete again
> in ~9 passes.
> 
> Fixes: 2a5418a13fcf ("bpf: improve dead code sanitizing")
> Signed-off-by: Gianluca Borello <g.borello@gmail.com>

Thanks for reporting, Gianluca. The approach your fix takes looks good to me!

^ permalink raw reply

* Re: [PATCH v3] ath9k: dfs: Remove VLA usage
From: Kalle Valo @ 2018-04-25  7:26 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andreas Christoforou, Rosen Penev, Eric Dumazet, Joe Perches,
	linux-wireless, netdev, QCA ath9k Development, kernel-hardening,
	linux-kernel
In-Reply-To: <20180424235752.GA37317@beast>

Kees Cook <keescook@chromium.org> writes:

> In the quest to remove all stack VLA usage from the kernel[1], this
> redefines FFT_NUM_SAMPLES as a #define instead of const int, which still
> triggers gcc's VLA checking pass.
>
> [1] https://lkml.org/lkml/2018/3/7/621
>
> Co-developed-by: Andreas Christoforou <andreaschristofo@gmail.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
> v3: replace FFT_NUM_SAMPLES as a #define (Joe)
> ---
>  drivers/net/wireless/ath/ath9k/dfs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath9k/dfs.c
> b/drivers/net/wireless/ath/ath9k/dfs.c
> index 6fee9a464cce..e6e56a925121 100644
> --- a/drivers/net/wireless/ath/ath9k/dfs.c
> +++ b/drivers/net/wireless/ath/ath9k/dfs.c
> @@ -40,8 +40,8 @@ static const int BIN_DELTA_MIN		= 1;
>  static const int BIN_DELTA_MAX		= 10;
>  
>  /* we need at least 3 deltas / 4 samples for a reliable chirp detection */
> -#define NUM_DIFFS 3
> -static const int FFT_NUM_SAMPLES	= (NUM_DIFFS + 1);
> +#define NUM_DIFFS	3
> +#define FFT_NUM_SAMPLES	(NUM_DIFFS + 1)

I have already applied an almost identical patch:

ath9k: dfs: remove accidental use of stack VLA

https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=ath-next&id=9c27489a34548913baaaf3b2776e05d4a9389e3e

-- 
Kalle Valo

^ permalink raw reply

* Re: Boot failures with net-next after rebase to v4.17.0-rc1
From: Jesper Dangaard Brouer @ 2018-04-25  7:16 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: netdev@vger.kernel.org, LKML, David Miller,
	Toke Høiland-Jørgensen, Paul E. McKenney, David Ahern,
	brouer, Ingo Molnar
In-Reply-To: <CA+55aFx-V8L602X-EHgrJPNA49v_ZiBZW=+v=tcSQK5QZOXYJw@mail.gmail.com>

On Tue, 24 Apr 2018 13:04:23 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Apr 24, 2018 at 12:54 PM, Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> > Hi all,
> >
> > I'm experiencing boot failures with net-next git-tree after it got
> > rebased/merged with Linus'es tree at v4.17.0-rc1.  
> 
> I suspect it's the global bit stuff that came in very late in the
> merge window, and had been developed and tested for a while before,
> but showed some problems under some configs.
> 
> The fix is currently in the x86/pti tree in -tip, see:
> 
>    x86/pti: Fix boot problems from Global-bit setting
> 
> and I expect it will percolate upstream soon.
> 
> In the meantime, it would be good to verify that merging that x86/pti
> branch fixes it for you?

Thanks for spotting this so quickly!
I have verified that this DOES solve the issue for me :-)))

If others are hit by this, and cannot wait for Linus to pull the tip
tree, this is the pull command:

 git pull git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/pti


> There is another candidate for boot problems - do you happen to have
> CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled? That can under certain
> circumstances get a percpu setup page fault because memory hadn't been
> initialized sufficiently.

CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set

> The fix there is to move the mm_init() call one step earlier in
> init_main(): start_kernel() (to before trap_init()).
> 
> And if it's neither of the above, I think you'll need to help bisect it.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* RE: [PATCH  1/8] net: ethernet: stmmac: add adaptation for stm32mp157c.
From: Christophe ROULLIER @ 2018-04-25  7:12 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: mark.rutland@arm.com, devicetree@vger.kernel.org,
	Alexandre TORGUE, netdev@vger.kernel.org,
	mcoquelin.stm32@gmail.com, Peppe CAVALLARO,
	linux-arm-kernel@lists.infradead.org
In-Reply-To: <20180424153953.GB2360@lunn.ch>

Hi Andrew,

For moment, I've only tested with PHY RGMII, RMII, MII, GMII, I do not have other kind of PHY interface.
Normally there is no impact in my glue, the value of syscfg register will be the same for RGMII/ID/TXID/RXID.
Do you think that I should add these interfaces in my case ?

	case PHY_INTERFACE_MODE_RGMII:
> +	case PHY_INTERFACE_MODE_RGMII_ID:
> +	case PHY_INTERFACE_MODE_RGMII_RXID:
> +	case PHY_INTERFACE_MODE_RGMII_TXID:
		val = SYSCFG_PMCR_ETH_SEL_RGMII;
		if (dwmac->int_phyclk)
			val |= SYSCFG_PMCR_ETH_CLK_SEL;
		pr_debug("SYSCFG init : PHY_INTERFACE_MODE_RGMII\n");
		break;

Christophe.

-----Original Message-----
From: Andrew Lunn [mailto:andrew@lunn.ch] 
Sent: mardi 24 avril 2018 17:40
To: Christophe ROULLIER <christophe.roullier@st.com>
Cc: mark.rutland@arm.com; mcoquelin.stm32@gmail.com; Alexandre TORGUE <alexandre.torgue@st.com>; Peppe CAVALLARO <peppe.cavallaro@st.com>; devicetree@vger.kernel.org; linux-arm-kernel@lists.infradead.org; netdev@vger.kernel.org
Subject: Re: [PATCH 1/8] net: ethernet: stmmac: add adaptation for stm32mp157c.

On Tue, Apr 24, 2018 at 05:01:53PM +0200, Christophe Roullier wrote:

> +	case PHY_INTERFACE_MODE_RGMII:
> +		val = SYSCFG_PMCR_ETH_SEL_RGMII;
> +		if (dwmac->int_phyclk)
> +			val |= SYSCFG_PMCR_ETH_CLK_SEL;
> +		pr_debug("SYSCFG init : PHY_INTERFACE_MODE_RGMII\n");
> +		break;

Hi Christophe

What about PHY_INTERFACE_MODE_RGMII_ID, PHY_INTERFACE_MODE_RGMII_RXID and PHY_INTERFACE_MODE_RGMII_TXID.

    Andrew

^ permalink raw reply

* Re: [PATCH] can: xilinx: fix xcan_start_xmit()'s return type
From: Michal Simek @ 2018-04-25  7:10 UTC (permalink / raw)
  To: Luc Van Oostenryck, linux-kernel
  Cc: Wolfgang Grandegger, Marc Kleine-Budde, Michal Simek, linux-can,
	netdev, linux-arm-kernel
In-Reply-To: <20180424131614.3357-1-luc.vanoostenryck@gmail.com>

On 24.4.2018 15:16, Luc Van Oostenryck wrote:
> The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
> which is a typedef for an enum type, but the implementation in this
> driver returns an 'int'.
> 
> Fix this by returning 'netdev_tx_t' in this driver too.
> 
> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
> ---
>  drivers/net/can/xilinx_can.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
> index 89aec07c2..a19648606 100644
> --- a/drivers/net/can/xilinx_can.c
> +++ b/drivers/net/can/xilinx_can.c
> @@ -386,7 +386,7 @@ static int xcan_do_set_mode(struct net_device *ndev, enum can_mode mode)
>   *
>   * Return: 0 on success and failure value on error
>   */
> -static int xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> +static netdev_tx_t xcan_start_xmit(struct sk_buff *skb, struct net_device *ndev)
>  {
>  	struct xcan_priv *priv = netdev_priv(ndev);
>  	struct net_device_stats *stats = &ndev->stats;
> 

Can you please also align kernel-doc format above?
I see that the whole function is already returning proper enum values.

Thanks,
Michal

^ permalink raw reply

* Re: [PATCH 1/1] IB/rxe: avoid double kfree_skb
From: Yanjun Zhu @ 2018-04-25  6:56 UTC (permalink / raw)
  To: Doug Ledford, monis, jgg, linux-rdma; +Cc: netdev
In-Reply-To: <0b87f416-eee1-fd6c-c386-27469d6db143@oracle.com>

Hi, all

rxe_send [rdma_rxe]
     ip_local_out
         __ip_local_out
         ip_output
             ip_finish_output
                 ip_finish_output2
                     dev_queue_xmit
                         __dev_queue_xmit
                             dev_hard_start_xmit
                                 e1000_xmit_frame [e1000]

When skb is sent, it will pass the above functions. I checked all the 
above functions. If error occurs in the above functions after 
ip_local_out, kfree_skb will be called.
So when ip_local_out returns an error, skb should be freed. It is not 
necessary to call kfree_skb in soft roce module again.

If I am wrong, please correct me.

Zhu Yanjun
On 2018/4/24 16:34, Yanjun Zhu wrote:
> Hi, all
>
> rxe_send
>     ip_local_out
>         __ip_local_out
>             nf_hook_slow
>
> In the above call process, nf_hook_slow drops and frees skb, then 
> -EPERM is returned when iptables rules(iptables -I OUTPUT -p udp 
> --dport 4791 -j DROP) is set.
>
> If skb->users is not changed in softroce, kfree_skb should not be 
> called in this module.
>
> I will make further investigations about other error handler after 
> ip_local_out.
> If I am wrong, please correct me.
>
> Any reply is appreciated.
>
> Zhu Yanjun
> On 2018/4/20 13:46, Yanjun Zhu wrote:
>>
>>
>> On 2018/4/20 10:19, Doug Ledford wrote:
>>> On Thu, 2018-04-19 at 10:01 -0400, Zhu Yanjun wrote:
>>>> When skb is dropped by iptables rules, the skb is freed at the same 
>>>> time
>>>> -EPERM is returned. So in softroce, it is not necessary to free skb 
>>>> again.
>>>> Or else, crash will occur.
>>>>
>>>> The steps to reproduce:
>>>>
>>>>       server                       client
>>>>      ---------                    ---------
>>>>      |1.1.1.1|<----rxe-channel--->|1.1.1.2|
>>>>      ---------                    ---------
>>>>
>>>> On server: rping -s -a 1.1.1.1 -v -C 10000 -S 512
>>>> On client: rping -c -a 1.1.1.1 -v -C 10000 -S 512
>>>>
>>>> The kernel configs CONFIG_DEBUG_KMEMLEAK and
>>>> CONFIG_DEBUG_OBJECTS are enabled on both server and client.
>>>>
>>>> When rping runs, run the following command in server:
>>>>
>>>> iptables -I OUTPUT -p udp  --dport 4791 -j DROP
>>>>
>>>> Without this patch, crash will occur.
>>>>
>>>> CC: Srinivas Eeda <srinivas.eeda@oracle.com>
>>>> CC: Junxiao Bi <junxiao.bi@oracle.com>
>>>> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
>>>> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
>>> I have no reason to doubt your analysis, but if there are a bunch of
>>> error paths for net_xmit and they all return with your skb still being
>>> valid and holding a reference, and then one oddball that returns with
>>> your skb already gone, that just sounds like a mistake waiting to 
>>> happen
>>> (not to mention a bajillion special cases sprinkled everywhere to deal
>>> with this apparent inconsistency).
>>>
>>> Can we get a netdev@ confirmation on this being the right solution?
>> Yes. I agree with you.
>> After iptables rule "iptables -I OUTPUT -p udp  --dport 4791 -j 
>> DROP", the skb is freed in this function
>>
>> /* Returns 1 if okfn() needs to be executed by the caller,
>>  * -EPERM for NF_DROP, 0 otherwise.  Caller must hold rcu_read_lock. */
>> int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state,
>>                  const struct nf_hook_entries *e, unsigned int s)
>> {
>>         unsigned int verdict;
>>         int ret;
>>
>>         for (; s < e->num_hook_entries; s++) {
>>                 verdict = nf_hook_entry_hookfn(&e->hooks[s], skb, 
>> state);
>>                 switch (verdict & NF_VERDICT_MASK) {
>>                 case NF_ACCEPT:
>>                         break;
>>                 case NF_DROP:
>> kfree_skb(skb);                               <----here, skb is freed
>>                         ret = NF_DROP_GETERR(verdict);
>>                         if (ret == 0)
>>                                 ret = -EPERM;
>>                         return ret;
>>                 case NF_QUEUE:
>>                         ret = nf_queue(skb, state, e, s, verdict);
>>                         if (ret == 1)
>>                                 continue;
>>                         return ret;
>>                 default:
>>                         /* Implicit handling for NF_STOLEN, as well 
>> as any other
>>                          * non conventional verdicts.
>>                          */
>>                         return 0;
>>                 }
>>         }
>>
>>         return 1;
>> }
>> EXPORT_SYMBOL(nf_hook_slow);
>>
>> If I am wrong, please correct me.
>>
>> And my test environment is still there, any solution can be verified 
>> in it.
>>
>> Zhu Yanjun
>>>
>>>> ---
>>>>   drivers/infiniband/sw/rxe/rxe_net.c  | 3 +++
>>>>   drivers/infiniband/sw/rxe/rxe_req.c  | 5 +++--
>>>>   drivers/infiniband/sw/rxe/rxe_resp.c | 9 ++++++---
>>>>   3 files changed, 12 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_net.c 
>>>> b/drivers/infiniband/sw/rxe/rxe_net.c
>>>> index 9da6e37..2094434 100644
>>>> --- a/drivers/infiniband/sw/rxe/rxe_net.c
>>>> +++ b/drivers/infiniband/sw/rxe/rxe_net.c
>>>> @@ -511,6 +511,9 @@ int rxe_send(struct rxe_pkt_info *pkt, struct 
>>>> sk_buff *skb)
>>>>            if (unlikely(net_xmit_eval(err))) {
>>>>                  pr_debug("error sending packet: %d\n", err);
>>>> +               /* -EPERM means the skb is dropped and freed. */
>>>> +               if (err == -EPERM)
>>>> +                       return -EPERM;
>>>>                  return -EAGAIN;
>>>>          }
>>>>   diff --git a/drivers/infiniband/sw/rxe/rxe_req.c 
>>>> b/drivers/infiniband/sw/rxe/rxe_req.c
>>>> index 7bdaf71..9d2efec 100644
>>>> --- a/drivers/infiniband/sw/rxe/rxe_req.c
>>>> +++ b/drivers/infiniband/sw/rxe/rxe_req.c
>>>> @@ -727,8 +727,9 @@ int rxe_requester(void *arg)
>>>>                    rollback_state(wqe, qp, &rollback_wqe, 
>>>> rollback_psn);
>>>>   -               if (ret == -EAGAIN) {
>>>> -                       kfree_skb(skb);
>>>> +               if ((ret == -EAGAIN) || (ret == -EPERM)) {
>>>> +                       if (ret == -EAGAIN)
>>>> +                               kfree_skb(skb);
>>>>                          rxe_run_task(&qp->req.task, 1);
>>>>                          goto exit;
>>>>                  }
>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c 
>>>> b/drivers/infiniband/sw/rxe/rxe_resp.c
>>>> index a65c996..6bdf9b2 100644
>>>> --- a/drivers/infiniband/sw/rxe/rxe_resp.c
>>>> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
>>>> @@ -742,7 +742,8 @@ static enum resp_states read_reply(struct 
>>>> rxe_qp *qp,
>>>>          err = rxe_xmit_packet(rxe, qp, &ack_pkt, skb);
>>>>          if (err) {
>>>>                  pr_err("Failed sending RDMA reply.\n");
>>>> -               kfree_skb(skb);
>>>> +               if (err != -EPERM)
>>>> +                       kfree_skb(skb);
>>>>                  return RESPST_ERR_RNR;
>>>>          }
>>>>   @@ -956,7 +957,8 @@ static int send_ack(struct rxe_qp *qp, struct 
>>>> rxe_pkt_info *pkt,
>>>>          err = rxe_xmit_packet(rxe, qp, &ack_pkt, skb);
>>>>          if (err) {
>>>>                  pr_err_ratelimited("Failed sending ack\n");
>>>> -               kfree_skb(skb);
>>>> +               if (err != -EPERM)
>>>> +                       kfree_skb(skb);
>>>>          }
>>>>     err1:
>>>> @@ -1141,7 +1143,8 @@ static enum resp_states 
>>>> duplicate_request(struct rxe_qp *qp,
>>>>                          if (rc) {
>>>>                                  pr_err("Failed resending result. 
>>>> This flow is not handled - skb ignored\n");
>>>>                                  rxe_drop_ref(qp);
>>>> -                               kfree_skb(skb_copy);
>>>> +                               if (rc != -EPERM)
>>>> +                                       kfree_skb(skb_copy);
>>>>                                  rc = RESPST_CLEANUP;
>>>>                                  goto out;
>>>>                          }
>>>> -- 
>>>> 2.7.4
>>>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply

* e1000e I219 timestamping oops related to TSYNCRXCTL read
From: Benjamin Poirier @ 2018-04-25  6:52 UTC (permalink / raw)
  To: Bruce Allan, Yanir Lubetkin, Jacob Keller, Neftin, Sasha
  Cc: Alexander Duyck, Jeff Kirsher, Achim Mildenberger, olouvignes,
	jayanth, ehabkost, postmodern.mod3, Bart.VanAssche,
	intel-wired-lan, netdev

In the following openSUSE bug report
https://bugzilla.suse.com/show_bug.cgi?id=1075876
Achim reported an oops related to e1000e timestamping:
kernel: RIP: 0010:[<ffffffff8110303f>] timecounter_read+0xf/0x50
[...]
kernel: Call Trace:
kernel:  [<ffffffffa0806b0f>] e1000e_phc_gettime+0x2f/0x60 [e1000e]
kernel:  [<ffffffffa0806c5d>] e1000e_systim_overflow_work+0x1d/0x80 [e1000e]
kernel:  [<ffffffff810992c5>] process_one_work+0x155/0x440
kernel:  [<ffffffff81099e16>] worker_thread+0x116/0x4b0
kernel:  [<ffffffff8109f422>] kthread+0xd2/0xf0
kernel:  [<ffffffff8163184f>] ret_from_fork+0x3f/0x70

It always occurs 4 hours after boot but not on every boot. It is most
likely the same problem reported here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1668356
http://lkml.iu.edu/hypermail/linux/kernel/1506.2/index.html#02530
https://bugzilla.redhat.com/show_bug.cgi?id=1463882
https://bugzilla.redhat.com/show_bug.cgi?id=1431863

This occurs with MAC: 12, e1000_pch_spt/I219. The reporter has
reproduced it on a v4.16 derivative.

We've traced it to the fact that e1000e_systim_reset() skips the
timecounter_init() call if e1000e_get_base_timinca() returns -EINVAL,
which leads to a null deref in timecounter_read() (see comment 8 of the
suse bugzilla for more details.)

In commit 83129b37ef35 ("e1000e: fix systim issues", v4.2-rc1) Yanir
reworked e1000e_get_base_timinca() in such a way that it can return
-EINVAL for e1000_pch_spt if the SYSCFI bit is not set in TSYNCRXCTL.
This is also the commit that was identified by bisection in the second
link above (lkml).

What we've observed (in comment 14) is that TSYNCRXCTL reads sometimes
don't have the SYSCFI bit set. Retrying the read shortly after finds the
bit to be set. This was observed at boot (probe) but also link up and
link down.

I have a few questions:

What's the purpose of the SYSCFI bit in TSYNCRXCTL ("Reserved" in the
datasheet)?

Why does it look like subsequent reads of TSYNCRXCTL sometimes have the
SYSCFI bit set/not set on I219?

Is it right to check the SYSCFI bit in e1000e_get_base_timinca() for
_spt and return -EINVAL if it's not set? Could we just remove that
check?

The patch in comment 13 of the suse bugzilla works around the problem by
retrying TSYNCRXCTL reads, maybe we could instead remove that read
altogether or move the timecounter_init() call to at least avoid the
oops. The best approach to take seems to depend on the behavior expected
of TSYNCRXCTL reads on I219 so I'm hoping that you could provide more
info on that.

Thanks,
-Benjamin

^ permalink raw reply

* [PATCH bpf-next] bpf: clear the ip_tunnel_info.
From: William Tu @ 2018-04-25  6:46 UTC (permalink / raw)
  To: netdev

The percpu metadata_dst might carry the stale ip_tunnel_info
and cause incorrect behavior.  When mixing tests using ipv4/ipv6
bpf vxlan and geneve tunnel, the ipv6 tunnel info incorrectly uses
ipv4's src ip addr as its ipv6 src address, because the previous
tunnel info does not clean up.  The patch zeros the fields in
ip_tunnel_info.

Signed-off-by: William Tu <u9012063@gmail.com>
Reported-by: Yifeng Sun <pkusunyifeng@gmail.com>
---
 net/core/filter.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index 8e45c6c7ab08..d3781daa26ab 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3281,6 +3281,7 @@ BPF_CALL_4(bpf_skb_set_tunnel_key, struct sk_buff *, skb,
 	skb_dst_set(skb, (struct dst_entry *) md);
 
 	info = &md->u.tun_info;
+	memset(info, 0, sizeof(*info));
 	info->mode = IP_TUNNEL_INFO_TX;
 
 	info->key.tun_flags = TUNNEL_KEY | TUNNEL_CSUM | TUNNEL_NOCACHE;
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH] can: janz-ican3: fix ican3_xmit()'s return type
From: Marc Kleine-Budde @ 2018-04-25  6:47 UTC (permalink / raw)
  To: Luc Van Oostenryck, linux-kernel; +Cc: Wolfgang Grandegger, linux-can, netdev
In-Reply-To: <20180424131609.3258-1-luc.vanoostenryck@gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 717 bytes --]

On 04/24/2018 03:16 PM, Luc Van Oostenryck wrote:
> The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
> which is a typedef for an enum type, but the implementation in this
> driver returns an 'int'.
> 
> Fix this by returning 'netdev_tx_t' in this driver too.
> 
> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
> ---

Can you send a series fixing the return type in all CAN drivers?

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* [PATCH v1 net-next] lan78xx: Lan7801 Support for Fixed PHY
From: Raghuram Chary J @ 2018-04-25  6:43 UTC (permalink / raw)
  To: davem; +Cc: netdev, unglinuxdriver, woojung.huh, raghuramchary.jallipalli

Adding Fixed PHY support to the lan78xx driver.

Signed-off-by: Raghuram Chary J <raghuramchary.jallipalli@microchip.com>
---
v0->v1:
   * Remove driver version #define
   * Modify netdev_info to netdev_dbg
   * Move lan7801 specific to new routine and add switch case
   * Minor cleanup
---
 drivers/net/usb/Kconfig   |   1 +
 drivers/net/usb/lan78xx.c | 113 ++++++++++++++++++++++++++++++++--------------
 2 files changed, 79 insertions(+), 35 deletions(-)

diff --git a/drivers/net/usb/Kconfig b/drivers/net/usb/Kconfig
index f28bd74ac275..418b0904cecb 100644
--- a/drivers/net/usb/Kconfig
+++ b/drivers/net/usb/Kconfig
@@ -111,6 +111,7 @@ config USB_LAN78XX
 	select MII
 	select PHYLIB
 	select MICROCHIP_PHY
+	select FIXED_PHY
 	help
 	  This option adds support for Microchip LAN78XX based USB 2
 	  & USB 3 10/100/1000 Ethernet adapters.
diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 0867f7275852..47fa34a2d09f 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -36,13 +36,12 @@
 #include <linux/irq.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/microchipphy.h>
-#include <linux/phy.h>
+#include <linux/phy_fixed.h>
 #include "lan78xx.h"
 
 #define DRIVER_AUTHOR	"WOOJUNG HUH <woojung.huh@microchip.com>"
 #define DRIVER_DESC	"LAN78XX USB 3.0 Gigabit Ethernet Devices"
 #define DRIVER_NAME	"lan78xx"
-#define DRIVER_VERSION	"1.0.6"
 
 #define TX_TIMEOUT_JIFFIES		(5 * HZ)
 #define THROTTLE_JIFFIES		(HZ / 8)
@@ -402,6 +401,7 @@ struct lan78xx_net {
 	struct statstage	stats;
 
 	struct irq_domain_data	domain_data;
+	struct phy_device	*fixedphy;
 };
 
 /* define external phy id */
@@ -1477,7 +1477,6 @@ static void lan78xx_get_drvinfo(struct net_device *net,
 	struct lan78xx_net *dev = netdev_priv(net);
 
 	strncpy(info->driver, DRIVER_NAME, sizeof(info->driver));
-	strncpy(info->version, DRIVER_VERSION, sizeof(info->version));
 	usb_make_path(dev->udev, info->bus_info, sizeof(info->bus_info));
 }
 
@@ -2003,52 +2002,91 @@ static int ksz9031rnx_fixup(struct phy_device *phydev)
 	return 1;
 }
 
-static int lan78xx_phy_init(struct lan78xx_net *dev)
+static int lan7801_phy_init(struct phy_device **phydev,
+			    struct lan78xx_net *dev)
 {
+	u32 buf;
 	int ret;
-	u32 mii_adv;
-	struct phy_device *phydev;
-
-	phydev = phy_find_first(dev->mdiobus);
-	if (!phydev) {
-		netdev_err(dev->net, "no PHY found\n");
-		return -EIO;
-	}
-
-	if ((dev->chipid == ID_REV_CHIP_ID_7800_) ||
-	    (dev->chipid == ID_REV_CHIP_ID_7850_)) {
-		phydev->is_internal = true;
-		dev->interface = PHY_INTERFACE_MODE_GMII;
-
-	} else if (dev->chipid == ID_REV_CHIP_ID_7801_) {
-		if (!phydev->drv) {
+	struct fixed_phy_status fphy_status = {
+		.link = 1,
+		.speed = SPEED_1000,
+		.duplex = DUPLEX_FULL,
+	};
+
+	if (!*phydev) {
+		netdev_dbg(dev->net, "PHY Not Found!! Registering Fixed PHY\n");
+		*phydev = fixed_phy_register(PHY_POLL, &fphy_status, -1,
+					     NULL);
+		if (IS_ERR(*phydev)) {
+			netdev_err(dev->net, "No PHY/fixed_PHY found\n");
+			return -ENODEV;
+		}
+		netdev_dbg(dev->net, "Registered FIXED PHY\n");
+		dev->interface = PHY_INTERFACE_MODE_RGMII;
+		dev->fixedphy = *phydev;
+		ret = lan78xx_write_reg(dev, MAC_RGMII_ID,
+					MAC_RGMII_ID_TXC_DELAY_EN_);
+		ret = lan78xx_write_reg(dev, RGMII_TX_BYP_DLL, 0x3D00);
+		ret = lan78xx_read_reg(dev, HW_CFG, &buf);
+		buf |= HW_CFG_CLK125_EN_;
+		buf |= HW_CFG_REFCLK25_EN_;
+		ret = lan78xx_write_reg(dev, HW_CFG, buf);
+	} else {
+		if (!(*phydev)->drv) {
 			netdev_err(dev->net, "no PHY driver found\n");
 			return -EIO;
 		}
-
 		dev->interface = PHY_INTERFACE_MODE_RGMII;
-
 		/* external PHY fixup for KSZ9031RNX */
 		ret = phy_register_fixup_for_uid(PHY_KSZ9031RNX, 0xfffffff0,
 						 ksz9031rnx_fixup);
 		if (ret < 0) {
-			netdev_err(dev->net, "fail to register fixup\n");
+			netdev_err(dev->net, "fail to register fixup PHY_KSZ9031RNX\n");
 			return ret;
 		}
 		/* external PHY fixup for LAN8835 */
 		ret = phy_register_fixup_for_uid(PHY_LAN8835, 0xfffffff0,
 						 lan8835_fixup);
 		if (ret < 0) {
-			netdev_err(dev->net, "fail to register fixup\n");
+			netdev_err(dev->net, "fail to register fixup for PHY_LAN8835\n");
 			return ret;
 		}
 		/* add more external PHY fixup here if needed */
 
-		phydev->is_internal = false;
-	} else {
-		netdev_err(dev->net, "unknown ID found\n");
-		ret = -EIO;
-		goto error;
+		(*phydev)->is_internal = false;
+	}
+	return 0;
+}
+
+static int lan78xx_phy_init(struct lan78xx_net *dev)
+{
+	int ret;
+	u32 mii_adv;
+	struct phy_device *phydev;
+
+	phydev = phy_find_first(dev->mdiobus);
+	switch (dev->chipid) {
+	case ID_REV_CHIP_ID_7801_:
+		ret = lan7801_phy_init(&phydev, dev);
+		if (ret < 0) {
+			netdev_err(dev->net, "lan7801: PHY Init Failed");
+			return ret;
+		}
+		break;
+
+	case ID_REV_CHIP_ID_7800_:
+	case ID_REV_CHIP_ID_7850_:
+		if (!phydev) {
+			netdev_err(dev->net, "no PHY found\n");
+			return -EIO;
+		}
+		phydev->is_internal = true;
+		dev->interface = PHY_INTERFACE_MODE_GMII;
+		break;
+
+	default:
+		netdev_err(dev->net, "Unknown CHIP ID found\n");
+		return -EIO;
 	}
 
 	/* if phyirq is not set, use polling mode in phylib */
@@ -2067,6 +2105,12 @@ static int lan78xx_phy_init(struct lan78xx_net *dev)
 	if (ret) {
 		netdev_err(dev->net, "can't attach PHY to %s\n",
 			   dev->mdiobus->id);
+		if (dev->chipid == ID_REV_CHIP_ID_7801_) {
+			phy_unregister_fixup_for_uid(PHY_KSZ9031RNX,
+						     0xfffffff0);
+			phy_unregister_fixup_for_uid(PHY_LAN8835,
+						     0xfffffff0);
+		}
 		return -EIO;
 	}
 
@@ -2084,12 +2128,6 @@ static int lan78xx_phy_init(struct lan78xx_net *dev)
 	dev->fc_autoneg = phydev->autoneg;
 
 	return 0;
-
-error:
-	phy_unregister_fixup_for_uid(PHY_KSZ9031RNX, 0xfffffff0);
-	phy_unregister_fixup_for_uid(PHY_LAN8835, 0xfffffff0);
-
-	return ret;
 }
 
 static int lan78xx_set_rx_max_frame_length(struct lan78xx_net *dev, int size)
@@ -3501,6 +3539,11 @@ static void lan78xx_disconnect(struct usb_interface *intf)
 
 	phy_disconnect(net->phydev);
 
+	if (dev->fixedphy) {
+		fixed_phy_unregister(dev->fixedphy);
+		dev->fixedphy = NULL;
+	}
+
 	unregister_netdev(net);
 
 	cancel_delayed_work_sync(&dev->wq);
-- 
2.16.2

^ permalink raw reply related

* Re: [PATCH net-next 3/4] nfp: flower: support offloading multiple rules with same cookie
From: Or Gerlitz @ 2018-04-25  6:31 UTC (permalink / raw)
  To: Jakub Kicinski, John Hurley
  Cc: David Miller, Linux Netdev List, oss-drivers, ASAP_Direct_Dev
In-Reply-To: <20180425041704.26882-4-jakub.kicinski@netronome.com>

On Wed, Apr 25, 2018 at 7:17 AM, Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
> From: John Hurley <john.hurley@netronome.com>
>
> When multiple netdevs are attached to a tc offload block and register for
> callbacks, a rule added to the block will be propogated to all netdevs.
> Previously these were detected as duplicates (based on cookie) and
> rejected. Modify the rule nfp lookup function to optionally include an
> ingress netdev and a host context along with the cookie value when
> searching for a rule. When a new rule is passed to the driver, the netdev
> the rule is to be attached to is considered when searching for dublicates.

so if the same rule (cookie) is provided to the driver through multiple ingress
devices you will not reject it -- what is the use case for that, is it
block sharing?

^ permalink raw reply

* Re: [PATCH net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive
From: Christoph Hellwig @ 2018-04-25  6:28 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, netdev, Andy Lutomirski, linux-kernel, linux-mm,
	Soheil Hassas Yeganeh, Eric Dumazet
In-Reply-To: <20180425052722.73022-2-edumazet@google.com>

On Tue, Apr 24, 2018 at 10:27:21PM -0700, Eric Dumazet wrote:
> When adding tcp mmap() implementation, I forgot that socket lock
> had to be taken before current->mm->mmap_sem. syzbot eventually caught
> the bug.
> 
> Since we can not lock the socket in tcp mmap() handler we have to
> split the operation in two phases.
> 
> 1) mmap() on a tcp socket simply reserves VMA space, and nothing else.
>   This operation does not involve any TCP locking.
> 
> 2) setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) implements
>  the transfert of pages from skbs to one VMA.
>   This operation only uses down_read(&current->mm->mmap_sem) after
>   holding TCP lock, thus solving the lockdep issue.
> 
> This new implementation was suggested by Andy Lutomirski with great details.

Thanks, this looks much more sensible to me.

^ permalink raw reply

* [RFC bpf] bpf, x64: fix JIT emission for dead code
From: Gianluca Borello @ 2018-04-25  5:42 UTC (permalink / raw)
  To: netdev; +Cc: daniel, ast, Gianluca Borello

Commit 2a5418a13fcf ("bpf: improve dead code sanitizing") replaced dead
code with a series of ja-1 instructions, for safety. That made JIT
compilation much more complex for some BPF programs. One instance of such
programs is, for example:

bool flag = false
...
/* A bunch of other code */
...
if (flag)
        do_something()

In some cases llvm is not able to remove at compile time the code for
do_something(), so the generated BPF program ends up with a large amount
of dead instructions. In one specific real life example, there are two
series of ~500 and ~1000 dead instructions in the program. When the
verifier replaces them with a series of ja-1 instructions, it causes an
interesting behavior at JIT time.

During the first pass, since all the instructions are estimated at 64
bytes, the ja-1 instructions end up being translated as 5 bytes JMP
instructions (0xE9), since the jump offsets become increasingly large (>
127) as each instruction gets discovered to be 5 bytes instead of the
estimated 64.

Starting from the second pass, the first N instructions of the ja-1
sequence get translated into 2 bytes JMPs (0xEB) because the jump offsets
become <= 127 this time. In particular, N is defined as roughly 127 / (5
- 2) ~= 42. So, each further pass will make the subsequent N JMP
instructions shrink from 5 to 2 bytes, making the image shrink every time.
This means that in order to have the entire program converge, there need
to be, in the real example above, at least ~1000 / 42 ~= 24 passes just
for translating the dead code. If we add this number to the passes needed
to translate the other non dead code, it brings such program to 40+
passes, and JIT doesn't complete. Ultimately the userspace loader fails
because such BPF program was supposed to be part of a prog array owner
being JITed.

While it is certainly possible to try to refactor such programs to help
the compiler remove dead code, the behavior is not really intuitive and it
puts further burden on the BPF developer who is not expecting such
behavior. To make things worse, such programs are working just fine in all
the kernel releases prior to the ja-1 fix.

A possible approach to mitigate this behavior consists into noticing that
for ja-1 instructions we don't really need to rely on the estimated size
of the previous and current instructions, we know that a -1 BPF jump
offset can be safely translated into a 0xEB instruction with a jump offset
of -2.

Such fix brings the BPF program in the previous example to complete again
in ~9 passes.

Fixes: 2a5418a13fcf ("bpf: improve dead code sanitizing")
Signed-off-by: Gianluca Borello <g.borello@gmail.com>
---
Hi

Posting this as RFC since I just wanted to report this potential bug and
propose a possible fix, although I'm not sure if:

1) There might be a better fix on the JIT side
2) We might want to replace the ja-1 instructions with something else
3) We might want to ignore this issue

If we choose option 3, I'd just like to point out that this caused a real
regression on a couple BPF programs that are part of a larger collection
of programs that used to work fine on 4.15 and I recently found broken 
in 4.16, so there would be some value in somehow addressing this.

Thanks

 arch/x86/net/bpf_jit_comp.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index b725154182cc..abce27ceb411 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1027,7 +1027,17 @@ xadd:			if (is_imm8(insn->off))
 			break;
 
 		case BPF_JMP | BPF_JA:
-			jmp_offset = addrs[i + insn->off] - addrs[i];
+			if (insn->off == -1)
+				/* -1 jmp instructions will always jump
+				 * backwards two bytes. Explicitly handling
+				 * this case avoids wasting too many passes
+				 * when there are long sequences of replaced
+				 * dead code.
+				 */
+				jmp_offset = -2;
+			else
+				jmp_offset = addrs[i + insn->off] - addrs[i];
+
 			if (!jmp_offset)
 				/* optimize out nop jumps */
 				break;
-- 
2.17.0

^ permalink raw reply related

* [PATCH net-next 2/2] selftests: net: tcp_mmap must use TCP_ZEROCOPY_RECEIVE
From: Eric Dumazet @ 2018-04-25  5:27 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Andy Lutomirski, linux-kernel, linux-mm,
	Soheil Hassas Yeganeh, Eric Dumazet, Eric Dumazet
In-Reply-To: <20180425052722.73022-1-edumazet@google.com>

After prior kernel change, mmap() on TCP socket only reserves VMA.

We have to use setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...)
to perform the transfert of pages from skbs in TCP receive queue into such VMA.

struct tcp_zerocopy_receive {
	__u64 address;		/* in: address of mapping */
	__u32 length;		/* in/out: number of bytes to map/mapped */
	__u32 recv_skip_hint;	/* out: amount of bytes to skip */
};

After a successful setsockopt(...TCP_ZEROCOPY_RECEIVE...), @length contains
number of bytes that were mapped, and @recv_skip_hint contains number of bytes
that should be read using conventional read()/recv()/recvmsg() system calls,
to skip a sequence of bytes that can not be mapped, because not properly page
aligned.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
---
 tools/testing/selftests/net/tcp_mmap.c | 63 +++++++++++++++-----------
 1 file changed, 36 insertions(+), 27 deletions(-)

diff --git a/tools/testing/selftests/net/tcp_mmap.c b/tools/testing/selftests/net/tcp_mmap.c
index dea342fe6f4e88b5709d2ac37b2fc9a2a320bf44..5b381cdbdd6319556ba4e3dad530fae8f13f5a9b 100644
--- a/tools/testing/selftests/net/tcp_mmap.c
+++ b/tools/testing/selftests/net/tcp_mmap.c
@@ -76,9 +76,10 @@
 #include <time.h>
 #include <sys/time.h>
 #include <netinet/in.h>
-#include <netinet/tcp.h>
 #include <arpa/inet.h>
 #include <poll.h>
+#include <linux/tcp.h>
+#include <assert.h>
 
 #ifndef MSG_ZEROCOPY
 #define MSG_ZEROCOPY    0x4000000
@@ -134,11 +135,12 @@ void hash_zone(void *zone, unsigned int length)
 void *child_thread(void *arg)
 {
 	unsigned long total_mmap = 0, total = 0;
+	struct tcp_zerocopy_receive zc;
 	unsigned long delta_usec;
 	int flags = MAP_SHARED;
 	struct timeval t0, t1;
 	char *buffer = NULL;
-	void *oaddr = NULL;
+	void *addr = NULL;
 	double throughput;
 	struct rusage ru;
 	int lu, fd;
@@ -153,41 +155,45 @@ void *child_thread(void *arg)
 		perror("malloc");
 		goto error;
 	}
+	if (zflg) {
+		addr = mmap(NULL, chunk_size, PROT_READ, flags, fd, 0);
+		if (addr == (void *)-1)
+			zflg = 0;
+	}
 	while (1) {
 		struct pollfd pfd = { .fd = fd, .events = POLLIN, };
 		int sub;
 
 		poll(&pfd, 1, 10000);
 		if (zflg) {
-			void *naddr;
+			int res;
 
-			naddr = mmap(oaddr, chunk_size, PROT_READ, flags, fd, 0);
-			if (naddr == (void *)-1) {
-				if (errno == EAGAIN) {
-					/* That is if SO_RCVLOWAT is buggy */
-					usleep(1000);
-					continue;
-				}
-				if (errno == EINVAL) {
-					flags = MAP_SHARED;
-					oaddr = NULL;
-					goto fallback;
-				}
-				if (errno != EIO)
-					perror("mmap()");
+			zc.address = (__u64)addr;
+			zc.length = chunk_size;
+			zc.recv_skip_hint = 0;
+			res = setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE,
+					 &zc, sizeof(zc));
+			if (res == -1)
 				break;
+
+			if (zc.length) {
+				assert(zc.length <= chunk_size);
+				total_mmap += zc.length;
+				if (xflg)
+					hash_zone(addr, zc.length);
+				total += zc.length;
 			}
-			total_mmap += chunk_size;
-			if (xflg)
-				hash_zone(naddr, chunk_size);
-			total += chunk_size;
-			if (!keepflag) {
-				flags |= MAP_FIXED;
-				oaddr = naddr;
+			if (zc.recv_skip_hint) {
+				assert(zc.recv_skip_hint <= chunk_size);
+				lu = read(fd, buffer, zc.recv_skip_hint);
+				if (lu > 0) {
+					if (xflg)
+						hash_zone(buffer, lu);
+					total += lu;
+				}
 			}
 			continue;
 		}
-fallback:
 		sub = 0;
 		while (sub < chunk_size) {
 			lu = read(fd, buffer + sub, chunk_size - sub);
@@ -228,6 +234,8 @@ void *child_thread(void *arg)
 error:
 	free(buffer);
 	close(fd);
+	if (zflg)
+		munmap(addr, chunk_size);
 	pthread_exit(0);
 }
 
@@ -371,7 +379,8 @@ int main(int argc, char *argv[])
 		setup_sockaddr(cfg_family, host, &listenaddr);
 
 		if (mss &&
-		    setsockopt(fdlisten, SOL_TCP, TCP_MAXSEG, &mss, sizeof(mss)) == -1) {
+		    setsockopt(fdlisten, IPPROTO_TCP, TCP_MAXSEG,
+			       &mss, sizeof(mss)) == -1) {
 			perror("setsockopt TCP_MAXSEG");
 			exit(1);
 		}
@@ -402,7 +411,7 @@ int main(int argc, char *argv[])
 	setup_sockaddr(cfg_family, host, &addr);
 
 	if (mss &&
-	    setsockopt(fd, SOL_TCP, TCP_MAXSEG, &mss, sizeof(mss)) == -1) {
+	    setsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss)) == -1) {
 		perror("setsockopt TCP_MAXSEG");
 		exit(1);
 	}
-- 
2.17.0.484.g0c8726318c-goog

^ permalink raw reply related

* [PATCH net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive
From: Eric Dumazet @ 2018-04-25  5:27 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Andy Lutomirski, linux-kernel, linux-mm,
	Soheil Hassas Yeganeh, Eric Dumazet, Eric Dumazet
In-Reply-To: <20180425052722.73022-1-edumazet@google.com>

When adding tcp mmap() implementation, I forgot that socket lock
had to be taken before current->mm->mmap_sem. syzbot eventually caught
the bug.

Since we can not lock the socket in tcp mmap() handler we have to
split the operation in two phases.

1) mmap() on a tcp socket simply reserves VMA space, and nothing else.
  This operation does not involve any TCP locking.

2) setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) implements
 the transfert of pages from skbs to one VMA.
  This operation only uses down_read(&current->mm->mmap_sem) after
  holding TCP lock, thus solving the lockdep issue.

This new implementation was suggested by Andy Lutomirski with great details.

Benefits are :

- Better scalability, in case multiple threads reuse VMAS
   (without mmap()/munmap() calls) since mmap_sem wont be write locked.

- Better error recovery.
   The previous mmap() model had to provide the expected size of the
   mapping. If for some reason one part could not be mapped (partial MSS),
   the whole operation had to be aborted.
   With the tcp_zerocopy_receive struct, kernel can report how
   many bytes were successfuly mapped, and how many bytes should
   be read to skip the problematic sequence.

- No more memory allocation to hold an array of page pointers.
  16 MB mappings needed 32 KB for this array, potentially using vmalloc() :/

- skbs are freed while mmap_sem has been released

Following patch makes the change in tcp_mmap tool to demonstrate
one possible use of mmap() and setsockopt(... TCP_ZEROCOPY_RECEIVE ...)

Note that memcg might require additional changes.

Fixes: 93ab6cc69162 ("tcp: implement mmap() for zero copy receive")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Cc: linux-mm@kvack.org
Cc: Soheil Hassas Yeganeh <soheil@google.com>
---
 include/uapi/linux/tcp.h |   8 ++
 net/ipv4/tcp.c           | 186 ++++++++++++++++++++-------------------
 2 files changed, 103 insertions(+), 91 deletions(-)

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 379b08700a542d49bbce9b4b49b17879d00b69bb..e9e8373b34b9ddc735329341b91f455bf5c0b17c 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -122,6 +122,7 @@ enum {
 #define TCP_MD5SIG_EXT		32	/* TCP MD5 Signature with extensions */
 #define TCP_FASTOPEN_KEY	33	/* Set the key for Fast Open (cookie) */
 #define TCP_FASTOPEN_NO_COOKIE	34	/* Enable TFO without a TFO cookie */
+#define TCP_ZEROCOPY_RECEIVE	35
 
 struct tcp_repair_opt {
 	__u32	opt_code;
@@ -276,4 +277,11 @@ struct tcp_diag_md5sig {
 	__u8	tcpm_key[TCP_MD5SIG_MAXKEYLEN];
 };
 
+/* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */
+
+struct tcp_zerocopy_receive {
+	__u64 address;		/* in: address of mapping */
+	__u32 length;		/* in/out: number of bytes to map/mapped */
+	__u32 recv_skip_hint;	/* out: amount of bytes to skip */
+};
 #endif /* _UAPI_LINUX_TCP_H */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index dfd090ea54ad47112fc23c61180b5bf8edd2c736..a28eca97df9465a0aa522a1833a171b53c237b1f 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1726,118 +1726,108 @@ int tcp_set_rcvlowat(struct sock *sk, int val)
 }
 EXPORT_SYMBOL(tcp_set_rcvlowat);
 
-/* When user wants to mmap X pages, we first need to perform the mapping
- * before freeing any skbs in receive queue, otherwise user would be unable
- * to fallback to standard recvmsg(). This happens if some data in the
- * requested block is not exactly fitting in a page.
- *
- * We only support order-0 pages for the moment.
- * mmap() on TCP is very strict, there is no point
- * trying to accommodate with pathological layouts.
- */
+static const struct vm_operations_struct tcp_vm_ops = {
+};
+
 int tcp_mmap(struct file *file, struct socket *sock,
 	     struct vm_area_struct *vma)
 {
-	unsigned long size = vma->vm_end - vma->vm_start;
-	unsigned int nr_pages = size >> PAGE_SHIFT;
-	struct page **pages_array = NULL;
-	u32 seq, len, offset, nr = 0;
-	struct sock *sk = sock->sk;
-	const skb_frag_t *frags;
+	if (vma->vm_flags & (VM_WRITE | VM_EXEC))
+		return -EPERM;
+	vma->vm_flags &= ~(VM_MAYWRITE | VM_MAYEXEC);
+
+	/* Instruct vm_insert_page() to not down_read(mmap_sem) */
+	vma->vm_flags |= VM_MIXEDMAP;
+
+	vma->vm_ops = &tcp_vm_ops;
+	return 0;
+}
+EXPORT_SYMBOL(tcp_mmap);
+
+static int tcp_zerocopy_receive(struct sock *sk,
+				struct tcp_zerocopy_receive *zc)
+{
+	unsigned long address = (unsigned long)zc->address;
+	const skb_frag_t *frags = NULL;
+	u32 length = 0, seq, offset;
+	struct vm_area_struct *vma;
+	struct sk_buff *skb = NULL;
 	struct tcp_sock *tp;
-	struct sk_buff *skb;
 	int ret;
 
-	if (vma->vm_pgoff || !nr_pages)
+	if (address & (PAGE_SIZE - 1) || address != zc->address)
 		return -EINVAL;
 
-	if (vma->vm_flags & VM_WRITE)
-		return -EPERM;
-	/* TODO: Maybe the following is not needed if pages are COW */
-	vma->vm_flags &= ~VM_MAYWRITE;
-
-	lock_sock(sk);
-
-	ret = -ENOTCONN;
 	if (sk->sk_state == TCP_LISTEN)
-		goto out;
+		return -ENOTCONN;
 
 	sock_rps_record_flow(sk);
 
-	if (tcp_inq(sk) < size) {
-		ret = sock_flag(sk, SOCK_DONE) ? -EIO : -EAGAIN;
+	down_read(&current->mm->mmap_sem);
+
+	ret = -EINVAL;
+	vma = find_vma(current->mm, address);
+	if (!vma || vma->vm_start > address || vma->vm_ops != &tcp_vm_ops)
 		goto out;
-	}
+	zc->length = min_t(unsigned long, zc->length, vma->vm_end - address);
+
 	tp = tcp_sk(sk);
 	seq = tp->copied_seq;
-	/* Abort if urgent data is in the area */
-	if (unlikely(tp->urg_data)) {
-		u32 urg_offset = tp->urg_seq - seq;
+	zc->length = min_t(u32, zc->length, tcp_inq(sk));
 
-		ret = -EINVAL;
-		if (urg_offset < size)
-			goto out;
-	}
-	ret = -ENOMEM;
-	pages_array = kvmalloc_array(nr_pages, sizeof(struct page *),
-				     GFP_KERNEL);
-	if (!pages_array)
-		goto out;
-	skb = tcp_recv_skb(sk, seq, &offset);
-	ret = -EINVAL;
-skb_start:
-	/* We do not support anything not in page frags */
-	offset -= skb_headlen(skb);
-	if ((int)offset < 0)
-		goto out;
-	if (skb_has_frag_list(skb))
-		goto out;
-	len = skb->data_len - offset;
-	frags = skb_shinfo(skb)->frags;
-	while (offset) {
-		if (frags->size > offset)
-			goto out;
-		offset -= frags->size;
-		frags++;
-	}
-	while (nr < nr_pages) {
-		if (len) {
-			if (len < PAGE_SIZE)
-				goto out;
-			if (frags->size != PAGE_SIZE || frags->page_offset)
-				goto out;
-			pages_array[nr++] = skb_frag_page(frags);
-			frags++;
-			len -= PAGE_SIZE;
-			seq += PAGE_SIZE;
-			continue;
-		}
-		skb = skb->next;
-		offset = seq - TCP_SKB_CB(skb)->seq;
-		goto skb_start;
-	}
-	/* OK, we have a full set of pages ready to be inserted into vma */
-	for (nr = 0; nr < nr_pages; nr++) {
-		ret = vm_insert_page(vma, vma->vm_start + (nr << PAGE_SHIFT),
-				     pages_array[nr]);
-		if (ret)
-			goto out;
-	}
-	/* operation is complete, we can 'consume' all skbs */
-	tp->copied_seq = seq;
-	tcp_rcv_space_adjust(sk);
-
-	/* Clean up data we have read: This will do ACK frames. */
-	tcp_recv_skb(sk, seq, &offset);
-	tcp_cleanup_rbuf(sk, size);
+	zap_page_range(vma, address, zc->length);
 
+	zc->recv_skip_hint = 0;
 	ret = 0;
+	while (length + PAGE_SIZE <= zc->length) {
+		if (zc->recv_skip_hint < PAGE_SIZE) {
+			if (skb) {
+				skb = skb->next;
+				offset = seq - TCP_SKB_CB(skb)->seq;
+			} else {
+				skb = tcp_recv_skb(sk, seq, &offset);
+			}
+
+			zc->recv_skip_hint = skb->len - offset;
+			offset -= skb_headlen(skb);
+			if ((int)offset < 0 || skb_has_frag_list(skb))
+				break;
+			frags = skb_shinfo(skb)->frags;
+			while (offset) {
+				if (frags->size > offset)
+					goto out;
+				offset -= frags->size;
+				frags++;
+			}
+		}
+		if (frags->size != PAGE_SIZE || frags->page_offset)
+			break;
+		ret = vm_insert_page(vma, address + length,
+				     skb_frag_page(frags));
+		if (ret)
+			break;
+		length += PAGE_SIZE;
+		seq += PAGE_SIZE;
+		zc->recv_skip_hint -= PAGE_SIZE;
+		frags++;
+	}
 out:
-	release_sock(sk);
-	kvfree(pages_array);
+	up_read(&current->mm->mmap_sem);
+	if (length) {
+		tp->copied_seq = seq;
+		tcp_rcv_space_adjust(sk);
+
+		/* Clean up data we have read: This will do ACK frames. */
+		tcp_recv_skb(sk, seq, &offset);
+		tcp_cleanup_rbuf(sk, length);
+		ret = 0;
+	} else {
+		if (!zc->recv_skip_hint && sock_flag(sk, SOCK_DONE))
+			ret = -EIO;
+	}
+	zc->length = length;
 	return ret;
 }
-EXPORT_SYMBOL(tcp_mmap);
 
 static void tcp_update_recv_tstamps(struct sk_buff *skb,
 				    struct scm_timestamping *tss)
@@ -2738,6 +2728,20 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 
 		return tcp_fastopen_reset_cipher(net, sk, key, sizeof(key));
 	}
+	case TCP_ZEROCOPY_RECEIVE: {
+		struct tcp_zerocopy_receive zc;
+
+		if (optlen != sizeof(zc))
+			return -EINVAL;
+		if (copy_from_user(&zc, optval, optlen))
+			return -EFAULT;
+		lock_sock(sk);
+		err = tcp_zerocopy_receive(sk, &zc);
+		release_sock(sk);
+		if (!err && copy_to_user(optval, &zc, optlen))
+			err = -EFAULT;
+		return err;
+	}
 	default:
 		/* fallthru */
 		break;
-- 
2.17.0.484.g0c8726318c-goog

^ permalink raw reply related

* [PATCH net-next 0/2] tcp: mmap: rework zerocopy receive
From: Eric Dumazet @ 2018-04-25  5:27 UTC (permalink / raw)
  To: David S . Miller
  Cc: netdev, Andy Lutomirski, linux-kernel, linux-mm,
	Soheil Hassas Yeganeh, Eric Dumazet, Eric Dumazet

syzbot reported a lockdep issue caused by tcp mmap() support.

I implemented Andy Lutomirski nice suggestions to resolve the
issue and increase scalability as well.

First patch is adding a new setsockopt() operation and changes mmap()
behavior.

Second patch changes tcp_mmap reference program.

Eric Dumazet (2):
  tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive
  selftests: net: tcp_mmap must use TCP_ZEROCOPY_RECEIVE

 include/uapi/linux/tcp.h               |   8 ++
 net/ipv4/tcp.c                         | 186 +++++++++++++------------
 tools/testing/selftests/net/tcp_mmap.c |  63 +++++----
 3 files changed, 139 insertions(+), 118 deletions(-)

-- 
2.17.0.484.g0c8726318c-goog

^ permalink raw reply

* RE: [PATCH net-next] lan78xx: Lan7801 Support for Fixed PHY
From: RaghuramChary.Jallipalli @ 2018-04-25  5:21 UTC (permalink / raw)
  To: andrew; +Cc: davem, netdev, UNGLinuxDriver, Woojung.Huh
In-Reply-To: <20180423124203.GC25919@lunn.ch>

Hi Andrew,

> > +#define DRIVER_VERSION	"1.0.7"
> 
> Hi Raghuram
> 
> Driver version strings a pretty pointless. You might want to remove it.
>
OK, will remove it.
 
> > +			netdev_info(dev->net, "Registered FIXED PHY\n");
> 
> There are too many detdev_info() messages here. Maybe make them both
> netdev_dbg().
>

OK. Will modify to netdev_dbg()
 
> > +			dev->interface = PHY_INTERFACE_MODE_RGMII;
> > +			dev->fixedphy = phydev;
> 
> You can use
> 
> if (!phy_is_pseudo_fixed_link(phydev))
> 
> to determine is a PHY is a fixed phy. I think you can then do without
> dev->fixedphy.
> 
dev->fixedphy stores the fixed phydev, which will be passed to the 
fixed_phy_unregister routine , so I think phy_is_pseudo_fixed_link check is not necessary.

> > +			ret = lan78xx_write_reg(dev, HW_CFG, buf);
> > +			goto phyinit;
> 
> Please don't use a goto like this. Maybe turn this into a switch statement?
>
OK. Will modify.
 
> > static int lan78xx_phy_init(struct lan78xx_net *dev)
> >  		goto error;
> 
> Please take a look at what happens at error: It does not look correct.
> Probably now is a good time to refactor the whole of lan78xx_phy_init()
> 

OK. Will take care.

Thanks,
-Raghu

^ permalink raw reply

* KASAN: stack-out-of-bounds Write in compat_copy_entries
From: syzbot @ 2018-04-25  5:19 UTC (permalink / raw)
  To: bridge, coreteam, davem, fw, kadlec, linux-kernel, netdev,
	netfilter-devel, pablo, stephen, syzkaller-bugs

Hello,

syzbot hit the following crash on upstream commit
24cac7009cb1b211f1c793ecb6a462c03dc35818 (Tue Apr 24 21:16:40 2018 +0000)
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
syzbot dashboard link:  
https://syzkaller.appspot.com/bug?extid=4e42a04e0bc33cb6c087

So far this crash happened 3 times on upstream.
syzkaller reproducer:  
https://syzkaller.appspot.com/x/repro.syz?id=4827027970457600
Raw console output:  
https://syzkaller.appspot.com/x/log.txt?id=6212733133389824
Kernel config:  
https://syzkaller.appspot.com/x/.config?id=7043958930931867332
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
user-space arch: i386

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+4e42a04e0bc33cb6c087@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for  
details.
If you forward the report, please keep this part and the footer.

random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
IPVS: ftp: loaded support on port[0] = 21
==================================================================
BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300  
[inline]
BUG: KASAN: stack-out-of-bounds in compat_mtw_from_user  
net/bridge/netfilter/ebtables.c:1957 [inline]
BUG: KASAN: stack-out-of-bounds in ebt_size_mwt  
net/bridge/netfilter/ebtables.c:2059 [inline]
BUG: KASAN: stack-out-of-bounds in size_entry_mwt  
net/bridge/netfilter/ebtables.c:2155 [inline]
BUG: KASAN: stack-out-of-bounds in compat_copy_entries+0x96c/0x14a0  
net/bridge/netfilter/ebtables.c:2194
Write of size 33 at addr ffff8801b0abf888 by task syz-executor0/4504

CPU: 0 PID: 4504 Comm: syz-executor0 Not tainted 4.17.0-rc2+ #40
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
  print_address_description+0x6c/0x20b mm/kasan/report.c:256
  kasan_report_error mm/kasan/report.c:354 [inline]
  kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
  check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
  memcpy+0x37/0x50 mm/kasan/kasan.c:303
  strlcpy include/linux/string.h:300 [inline]
  compat_mtw_from_user net/bridge/netfilter/ebtables.c:1957 [inline]
  ebt_size_mwt net/bridge/netfilter/ebtables.c:2059 [inline]
  size_entry_mwt net/bridge/netfilter/ebtables.c:2155 [inline]
  compat_copy_entries+0x96c/0x14a0 net/bridge/netfilter/ebtables.c:2194
  compat_do_replace+0x483/0x900 net/bridge/netfilter/ebtables.c:2285
  compat_do_ebt_set_ctl+0x2ac/0x324 net/bridge/netfilter/ebtables.c:2367
  compat_nf_sockopt net/netfilter/nf_sockopt.c:144 [inline]
  compat_nf_setsockopt+0x9b/0x140 net/netfilter/nf_sockopt.c:156
  compat_ip_setsockopt+0xff/0x140 net/ipv4/ip_sockglue.c:1279
  inet_csk_compat_setsockopt+0x97/0x120 net/ipv4/inet_connection_sock.c:1041
  compat_tcp_setsockopt+0x49/0x80 net/ipv4/tcp.c:2901
  compat_sock_common_setsockopt+0xb4/0x150 net/core/sock.c:3050
  __compat_sys_setsockopt+0x1ab/0x7c0 net/compat.c:403
  __do_compat_sys_setsockopt net/compat.c:416 [inline]
  __se_compat_sys_setsockopt net/compat.c:413 [inline]
  __ia32_compat_sys_setsockopt+0xbd/0x150 net/compat.c:413
  do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
  do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
  entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7fb3cb9
RSP: 002b:00000000fff0c26c EFLAGS: 00000282 ORIG_RAX: 000000000000016e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000000000
RDX: 0000000000000080 RSI: 0000000020000300 RDI: 00000000000005f4
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

The buggy address belongs to the page:
page:ffffea0006c2afc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x2fffc0000000000()
raw: 02fffc0000000000 0000000000000000 0000000000000000 00000000ffffffff
raw: 0000000000000000 ffffea0006c20101 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff8801b0abf780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff8801b0abf800: 00 00 00 00 00 f1 f1 f1 f1 00 00 f2 f2 f2 f2 f2
> ffff8801b0abf880: f2 00 00 00 07 f3 f3 f3 f3 00 00 00 00 00 00 00
                                ^
  ffff8801b0abf900: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00
  ffff8801b0abf980: 00 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00
==================================================================


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzkaller@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is  
merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug  
report.
Note: all commands must start from beginning of the line in the email body.

^ permalink raw reply

* [RFC v3 5/5] virtio_ring: enable packed ring
From: Tiwei Bie @ 2018-04-25  5:15 UTC (permalink / raw)
  To: mst, jasowang, virtualization, linux-kernel, netdev; +Cc: wexu
In-Reply-To: <20180425051550.24342-1-tiwei.bie@intel.com>

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
---
 drivers/virtio/virtio_ring.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index b1039c2985b9..9a3d13e1e2ba 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1873,6 +1873,8 @@ void vring_transport_features(struct virtio_device *vdev)
 			break;
 		case VIRTIO_F_IOMMU_PLATFORM:
 			break;
+		case VIRTIO_F_RING_PACKED:
+			break;
 		default:
 			/* We don't understand this bit. */
 			__virtio_clear_bit(vdev, i);
-- 
2.11.0

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox