Netdev List
* [PATCH net] VSOCK: check sk state before receive
From: Hangbin Liu @ 2018-05-27  1:02 UTC (permalink / raw)
  To: netdev; +Cc: Stefan Hajnoczi, Jorgen Hansen, David S. Miller, Hangbin Liu

Since vmci_transport_recv_dgram_cb is a callback function and we access the
socket struct without holding the lock here, there is a possibility that
sk has been released and we use it again. This may cause a NULL pointer
dereference later, while receiving. Here is the call trace:

[  389.486319] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[  389.494148] PGD 0 P4D 0
[  389.496687] Oops: 0000 [#1] SMP PTI
[  389.500170] Modules linked in: vhost_net vmw_vsock_vmci_transport tun vsock vhost vmw_vmci tap iptable_security iptable_raw iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_s
[  389.510984] Failed to add new resource (handle=0x2:0x2711), error: -22
[  389.543309] Failed to add new resource (handle=0x2:0x2711), error: -22
[  389.570936]  ttm drm crc32c_intel mptsas scsi_transport_sas serio_raw ata_piix mptscsih libata i2c_core mptbase bnx2 dm_mirror dm_region_hash dm_log dm_mod
[  389.597899] CPU: 3 PID: 113 Comm: kworker/3:2 Tainted: G          I       4.17.0-rc6.latest+ #25
[  389.606673] Hardware name: Dell Inc. PowerEdge R710/0XDX06, BIOS 6.1.0 10/18/2011
[  389.614158] Workqueue: events dg_delayed_dispatch [vmw_vmci]
[  389.619820] RIP: 0010:selinux_socket_sock_rcv_skb+0x46/0x270
[  389.625475] RSP: 0018:ffffbcb5416b7ce0 EFLAGS: 00010293
[  389.630698] RAX: 0000000000000000 RBX: 0000000000000028 RCX: 0000000000000007
[  389.637825] RDX: 0000000000000000 RSI: ffff94a29feec500 RDI: ffffbcb5416b7d18
[  389.644953] RBP: ffff94a29bd9a640 R08: 0000000000000001 R09: ffff94a187c03080
[  389.652080] R10: ffffbcb5416b7d80 R11: 0000000000000000 R12: ffffbcb5416b7d18
[  389.659206] R13: ffff94a29feec500 R14: ffff94a2afda5e00 R15: 0ffff94a2afda5e0
[  389.666336] FS:  0000000000000000(0000) GS:ffff94a2afd80000(0000) knlGS:0000000000000000
[  389.674419] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  389.680160] CR2: 0000000000000010 CR3: 000000004320a003 CR4: 00000000000206e0
[  389.687283] Call Trace:
[  389.689738]  ? __alloc_skb+0xa0/0x230
[  389.693407]  security_sock_rcv_skb+0x32/0x60
[  389.697679]  ? __alloc_skb+0xa0/0x230
[  389.701343]  sk_filter_trim_cap+0x4e/0x1f0
[  389.705442]  __sk_receive_skb+0x32/0x290
[  389.709372]  vmci_transport_recv_dgram_cb+0xa7/0xd0 [vmw_vsock_vmci_transport]
[  389.716593]  dg_delayed_dispatch+0x22/0x50 [vmw_vmci]
[  389.721648]  process_one_work+0x1f2/0x4a0
[  389.725662]  worker_thread+0x38/0x4c0
[  389.729329]  ? process_one_work+0x4a0/0x4a0
[  389.733512]  kthread+0x12f/0x150
[  389.736743]  ? kthread_create_worker_on_cpu+0x90/0x90
[  389.741796]  ret_from_fork+0x35/0x40
[  389.745370] Code: 8b 04 25 28 00 00 00 48 89 44 24 70 31 c0 e8 42 15 db ff 0f b7 5d 10 48 8b 85 70 02 00 00 4c 8d 64 24 38 b9 07 00 00 00 4c 89 e7 <44> 8b 70 10 31 c0 41 89 df 41 83 e7 f7
[  389.764342] RIP: selinux_socket_sock_rcv_skb+0x46/0x270 RSP: ffffbcb5416b7ce0
[  389.771467] CR2: 0000000000000010
[  389.774784] ---[ end trace e83d65291a15ae6a ]---

Fix it by checking sk state before using it.

Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 net/vmw_vsock/vmci_transport.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index a7a73ff..0d26040 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -612,6 +612,13 @@ static int vmci_transport_recv_dgram_cb(void *data, struct vmci_datagram *dg)
 	if (!vmci_transport_allow_dgram(vsk, dg->src.context))
 		return VMCI_ERROR_NO_ACCESS;
 
+	bh_lock_sock(sk);
+	if (sk->sk_state == TCP_CLOSE) {
+		bh_unlock_sock(sk);
+		return VMCI_ERROR_DATAGRAM_FAILED;
+	}
+	bh_unlock_sock(sk);
+
 	size = VMCI_DG_SIZE(dg);
 
 	/* Attach the packet to the socket's receive queue as an sk_buff. */
-- 
1.8.3.1


* Re: [PATCH net] sctp: not allow to set rto_min with a value below 200 msecs
From: Neil Horman @ 2018-05-27  1:01 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Michael Tuexen, Xin Long, network dev, linux-sctp, David Miller,
	David Ahern, Eric Dumazet, Marcelo Ricardo Leitner, syzkaller
In-Reply-To: <CACT4Y+YozRSfcoUoKHOWy5wujhVdks38vcfNGhwNj-REWcd-hw@mail.gmail.com>

On Sat, May 26, 2018 at 05:50:39PM +0200, Dmitry Vyukov wrote:
> On Sat, May 26, 2018 at 5:42 PM, Michael Tuexen
> <michael.tuexen@lurchi.franken.de> wrote:
> >> On 25. May 2018, at 21:13, Neil Horman <nhorman@tuxdriver.com> wrote:
> >>
> >> On Sat, May 26, 2018 at 01:41:02AM +0800, Xin Long wrote:
> >>> syzbot reported a rcu_sched self-detected stall on CPU which is caused
> >>> by too small value set on rto_min with SCTP_RTOINFO sockopt. With this
> >>> value, hb_timer will get stuck there, as in its timer handler it starts
> >>> this timer again with this value, then goes to the timer handler again.
> >>>
> >>> This problem is there since very beginning, and thanks to Eric for the
> >>> reproducer shared from a syzbot mail.
> >>>
> >>> This patch fixes it by not allowing to set rto_min with a value below
> >>> 200 msecs, which is based on TCP's, by either setsockopt or sysctl.
> >>>
> >>> Reported-by: syzbot+3dcd59a1f907245f891f@syzkaller.appspotmail.com
> >>> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> >>> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> >>> ---
> >>> include/net/sctp/constants.h |  1 +
> >>> net/sctp/socket.c            | 10 +++++++---
> >>> net/sctp/sysctl.c            |  3 ++-
> >>> 3 files changed, 10 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h
> >>> index 20ff237..2ee7a7b 100644
> >>> --- a/include/net/sctp/constants.h
> >>> +++ b/include/net/sctp/constants.h
> >>> @@ -277,6 +277,7 @@ enum { SCTP_MAX_GABS = 16 };
> >>> #define SCTP_RTO_INITIAL     (3 * 1000)
> >>> #define SCTP_RTO_MIN         (1 * 1000)
> >>> #define SCTP_RTO_MAX         (60 * 1000)
> >>> +#define SCTP_RTO_HARD_MIN   200
> >>>
> >>> #define SCTP_RTO_ALPHA          3   /* 1/8 when converted to right shifts. */
> >>> #define SCTP_RTO_BETA           2   /* 1/4 when converted to right shifts. */
> >>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> >>> index ae7e7c6..6ef12c7 100644
> >>> --- a/net/sctp/socket.c
> >>> +++ b/net/sctp/socket.c
> >>> @@ -3029,7 +3029,8 @@ static int sctp_setsockopt_nodelay(struct sock *sk, char __user *optval,
> >>>  * be changed.
> >>>  *
> >>>  */
> >>> -static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval, unsigned int optlen)
> >>> +static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval,
> >>> +                               unsigned int optlen)
> >>> {
> >>>      struct sctp_rtoinfo rtoinfo;
> >>>      struct sctp_association *asoc;
> >>> @@ -3056,10 +3057,13 @@ static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval, unsigne
> >>>      else
> >>>              rto_max = asoc ? asoc->rto_max : sp->rtoinfo.srto_max;
> >>>
> >>> -    if (rto_min)
> >>> +    if (rto_min) {
> >>> +            if (rto_min < SCTP_RTO_HARD_MIN)
> >>> +                    return -EINVAL;
> >>>              rto_min = asoc ? msecs_to_jiffies(rto_min) : rto_min;
> >>> -    else
> >>> +    } else {
> >>>              rto_min = asoc ? asoc->rto_min : sp->rtoinfo.srto_min;
> >>> +    }
> >>>
> >>>      if (rto_min > rto_max)
> >>>              return -EINVAL;
> >>> diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c
> >>> index 33ca5b7..7ec854a 100644
> >>> --- a/net/sctp/sysctl.c
> >>> +++ b/net/sctp/sysctl.c
> >>> @@ -52,6 +52,7 @@ static int rto_alpha_min = 0;
> >>> static int rto_beta_min = 0;
> >>> static int rto_alpha_max = 1000;
> >>> static int rto_beta_max = 1000;
> >>> +static int rto_hard_min = SCTP_RTO_HARD_MIN;
> >>>
> >>> static unsigned long max_autoclose_min = 0;
> >>> static unsigned long max_autoclose_max =
> >>> @@ -116,7 +117,7 @@ static struct ctl_table sctp_net_table[] = {
> >>>              .maxlen         = sizeof(unsigned int),
> >>>              .mode           = 0644,
> >>>              .proc_handler   = proc_sctp_do_rto_min,
> >>> -            .extra1         = &one,
> >>> +            .extra1         = &rto_hard_min,
> >>>              .extra2         = &init_net.sctp.rto_max
> >>>      },
> >>>      {
> >>> --
> >>> 2.1.0
> >>>
> >> Patch looks fine, you probably want to note this hard minimum in man(7) sctp as
> >> well
> >>
> > I'm aware of some signalling networks which use RTO.min of smaller values than 200ms.
> > So could this be reduced?
> 
> Hi Michael,
> 
> What value do they use?
> 
> Xin, Neil, is there more principled way of ensuring that a timer won't
> cause a hard CPU stall? There are slow machines and there are slow
> kernels (in particular syzbot kernel has tons of debug configs
> enabled). 200ms _should_ not cause problems because we did not see
> them with tcp. But it's hard to say what's the low limit as we are
> trying to put a hard upper bound on execution time of a complex
> section of code. Is there something like cond_resched for timers?
Unfortunately, there's not really a way to do conditional rescheduling of timers.
Additionally, we have a problem because the timer is reset as a side effect of
the SCTP state machine, and so the execution time between timer updates has a
significant amount of jitter (meaning it's a pretty hard value to calibrate,
unless you just select a 'safe' large value for the floor).

What we could do (though this might impact the protocol function) is change
the timer update side effects to simply set a flag, and consistently update the
timers on exit from sctp_do_sm, so they don't re-arm until all state machine
processing is complete.  Does anyone have any thoughts on that?
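
For illustration only, here is a minimal userspace sketch of the "set a flag,
re-arm on exit" idea. It is not SCTP code: struct toy_sm and every toy_*
function are hypothetical names, and toy_arm_timer() merely stands in for the
place where the kernel would re-arm a real timer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in net/sctp. */
enum { TOY_NUM_TIMERS = 3 };

struct toy_sm {
	bool pending[TOY_NUM_TIMERS];		/* re-arm requested during processing */
	unsigned long timeout[TOY_NUM_TIMERS];	/* most recently requested timeout */
	unsigned int arm_calls;			/* how often a timer was really armed */
};

/* Side effect during state machine processing: only record the request. */
static void toy_request_timer(struct toy_sm *sm, size_t idx,
			      unsigned long timeout)
{
	sm->pending[idx] = true;
	sm->timeout[idx] = timeout;
}

/* Stand-in for the real re-arm; here it only counts the call. */
static void toy_arm_timer(struct toy_sm *sm, size_t idx)
{
	(void)idx;
	sm->arm_calls++;
}

/* Called once on exit from the state machine: flush all recorded requests. */
static void toy_flush_timers(struct toy_sm *sm)
{
	for (size_t i = 0; i < TOY_NUM_TIMERS; i++) {
		if (sm->pending[i]) {
			sm->pending[i] = false;
			toy_arm_timer(sm, i);
		}
	}
}
```

In this model, several requests for the same timer within one pass collapse
into a single re-arm with the last requested timeout. Whether that collapsing
is acceptable for the protocol is exactly the open question above.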

Neil

> 


* Re: [PATCH 3/6] ravb: remove custom .set_link_ksettings from ethtool ops
From: Sergei Shtylyov @ 2018-05-26 19:50 UTC (permalink / raw)
  To: Vladimir Zapolskiy, David S. Miller; +Cc: netdev, linux-renesas-soc
In-Reply-To: <1527160318-10958-4-git-send-email-vladimir_zapolskiy@mentor.com>

On 05/24/2018 02:11 PM, Vladimir Zapolskiy wrote:

> The change replaces a custom implementation of .set_link_ksettings
> callback with a shared phy_ethtool_set_link_ksettings(), this fixes
> sleep in atomic context bug, which is encountered every time when link
> settings are changed by ethtool.

   Seeing it now...

> Now duplex mode setting is enforced in ravb_adjust_link() only, also
> now TX/RX is disabled when link is put down or modifications to E-MAC
> registers ECMR and GECMR are expected for both cases of checked and
> ignored link status pin state from E-MAC interrupt handler.
> 
> Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
> ---
>  drivers/net/ethernet/renesas/ravb_main.c | 58 +++++++++-----------------------
>  1 file changed, 15 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
> index 3d91caa44176..0d811c02ff34 100644
> --- a/drivers/net/ethernet/renesas/ravb_main.c
> +++ b/drivers/net/ethernet/renesas/ravb_main.c
> @@ -980,6 +980,13 @@ static void ravb_adjust_link(struct net_device *ndev)
>  	struct ravb_private *priv = netdev_priv(ndev);
>  	struct phy_device *phydev = ndev->phydev;
>  	bool new_state = false;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&priv->lock, flags);
> +
> +	/* Disable TX and RX right over here, if E-MAC change is ignored */
> +	if (priv->no_avb_link)
> +		ravb_rcv_snd_disable(ndev);
>  
>  	if (phydev->link) {
>  		if (phydev->duplex != priv->duplex) {
> @@ -997,18 +1004,21 @@ static void ravb_adjust_link(struct net_device *ndev)
>  			ravb_modify(ndev, ECMR, ECMR_TXF, 0);
>  			new_state = true;
>  			priv->link = phydev->link;
> -			if (priv->no_avb_link)
> -				ravb_rcv_snd_enable(ndev);
>  		}
>  	} else if (priv->link) {
>  		new_state = true;
>  		priv->link = 0;
>  		priv->speed = 0;
>  		priv->duplex = -1;
> -		if (priv->no_avb_link)
> -			ravb_rcv_snd_disable(ndev);
>  	}
>  
> +	/* Enable TX and RX right over here, if E-MAC change is ignored */
> +	if (priv->no_avb_link && phydev->link)
> +		ravb_rcv_snd_enable(ndev);
> +
> +	mmiowb();
> +	spin_unlock_irqrestore(&priv->lock, flags);
> +

   I like this part. :-)

>  	if (new_state && netif_msg_link(priv))
>  		phy_print_status(phydev);
>  }
> @@ -1096,44 +1106,6 @@ static int ravb_phy_start(struct net_device *ndev)
>  	return 0;
>  }
>  
> -static int ravb_set_link_ksettings(struct net_device *ndev,
> -				   const struct ethtool_link_ksettings *cmd)
> -{
> -	struct ravb_private *priv = netdev_priv(ndev);
> -	unsigned long flags;
> -	int error;
> -
> -	if (!ndev->phydev)
> -		return -ENODEV;
> -
> -	spin_lock_irqsave(&priv->lock, flags);
> -
> -	/* Disable TX and RX */
> -	ravb_rcv_snd_disable(ndev);
> -
> -	error = phy_ethtool_ksettings_set(ndev->phydev, cmd);
> -	if (error)
> -		goto error_exit;
> -
> -	if (cmd->base.duplex == DUPLEX_FULL)
> -		priv->duplex = 1;
> -	else
> -		priv->duplex = 0;
> -
> -	ravb_set_duplex(ndev);
> -
> -error_exit:
> -	mdelay(1);
> -
> -	/* Enable TX and RX */
> -	ravb_rcv_snd_enable(ndev);
> -
> -	mmiowb();
> -	spin_unlock_irqrestore(&priv->lock, flags);
> -
> -	return error;
> -}
> -

   But this part is clearly lumping it all together... 

[...]
> @@ -1357,7 +1329,7 @@ static const struct ethtool_ops ravb_ethtool_ops = {
>  	.set_ringparam		= ravb_set_ringparam,
>  	.get_ts_info		= ravb_get_ts_info,
>  	.get_link_ksettings	= phy_ethtool_get_link_ksettings,
> -	.set_link_ksettings	= ravb_set_link_ksettings,
> +	.set_link_ksettings	= phy_ethtool_set_link_ksettings,

   Should have been a part of the final patch in the fix/enhancement chain...

>  	.get_wol		= ravb_get_wol,
>  	.set_wol		= ravb_set_wol,
>  };

MBR, Sergei


* Re: [PATCH net] net: sched: check netif_xmit_frozen_or_stopped() in sch_direct_xmit()
From: John Fastabend @ 2018-05-26 19:43 UTC (permalink / raw)
  To: Song Liu, Song Liu, ast; +Cc: netdev, kernel-team, David S . Miller
In-Reply-To: <CAPhsuW53SQCrWX-y-Ewcs5GS80NrsF_skpiv7Qwvb7fWy72Cgg@mail.gmail.com>

On 05/25/2018 12:46 PM, Song Liu wrote:
> On Fri, May 25, 2018 at 11:11 AM, Song Liu <songliubraving@fb.com> wrote:
>> Summary:
>>
>> At the end of sch_direct_xmit(), we are in the else path of
>> !dev_xmit_complete(ret), which means ret == NETDEV_TX_OK. The following
>> condition will always fail and netif_xmit_frozen_or_stopped() is not
>> checked at all.
>>
>>     if (ret && netif_xmit_frozen_or_stopped(txq))
>>          return false;
>>
>> In this patch, this condition is fixed as:
>>
>>     if (netif_xmit_frozen_or_stopped(txq))
>>          return false;
>>
>> and further simplifies the code as:
>>
>>     return !netif_xmit_frozen_or_stopped(txq);
>>
>> Fixes: 29b86cdac00a ("net: sched: remove remaining uses for qdisc_qlen in xmit path")
>> Cc: John Fastabend <john.fastabend@gmail.com>
>> Cc: David S. Miller <davem@davemloft.net>
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
>>  net/sched/sch_generic.c | 5 +----
>>  1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>> index 39c144b..8261d48 100644
>> --- a/net/sched/sch_generic.c
>> +++ b/net/sched/sch_generic.c
>> @@ -346,10 +346,7 @@ bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
>>                 return false;
>>         }
>>
>> -       if (ret && netif_xmit_frozen_or_stopped(txq))
>> -               return false;
>> -
>> -       return true;
>> +       return !netif_xmit_frozen_or_stopped(txq);
>>  }
>>
>>  /*
>> --
>> 2.9.5
>>
> 
> Alexei and I discussed this offline. We would like to share our
> discussion here to clarify the motivation.
> 
> Before 29b86cdac00a, ret in the condition "if (ret &&
> netif_xmit_frozen_or_stopped())" was not the value from
> dev_hard_start_xmit(), because ret was overwritten by either
> qdisc_qlen() or dev_requeue_skb(). Therefore, 29b86cdac00a changed
> the behavior of this condition.
> 
> As for ret from dev_hard_start_xmit(), I dug into the function and
> found that it is the return value of ndo_start_xmit(). Per
> netdevice.h, ndo_start_xmit() should only return NETDEV_TX_OK or
> NETDEV_TX_BUSY. I surveyed many drivers, and they all follow the
> rule. The only exception is vlan.
> 
> Given that ret can only be NETDEV_TX_OK or NETDEV_TX_BUSY (ignoring
> vlan for now), if it fails the condition "if (!dev_xmit_complete(ret))",
> ret must be NETDEV_TX_OK == 0. So netif_xmit_frozen_or_stopped() will
> always be bypassed.
> 
> It is probably OK to ignore netif_xmit_frozen_or_stopped() and return
> true from sch_direct_xmit(), as I didn't see that break any
> functionality. But it is more like "correct by accident" to me. This
> is the motivation for my original patch.
> 
> Alexei pointed out that the following condition is closer to the
> original logic:
> 
>       if (qdisc_qlen(q) && netif_xmit_frozen_or_stopped(txq))
>             return false;
> 
> However, I think John would like to remove qdisc_qlen() from the tx
> path. I didn't see

Yep, qdisc_qlen() is not very friendly for lockless users. At
some point we will get around to writing a distributed rate
limiter qdisc and it will be nice to not have to work around
qdisc_qlen().

> any issue without the extra qdisc_qlen() check, so the patch is
> probably good AS-IS.
> 
> Please share your comments and feedback on this.
> 

Thanks for the detailed analysis. The above patch looks OK
to me. Actually, I'm debating whether we should just drop the check.
But there appears to be a case where drivers return NETDEV_TX_OK
and then stop the queue because it is nearly overrun. By putting
the check there we stop early instead of doing some extra work
before realizing the driver ring is full.

Still, this overrun case should be rare, so removing the check
should be OK. Plus, as you note, it has not been running anyway. My
current recommendation is to just remove the check altogether.
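
To make the three variants concrete, here is a small userspace sketch. It is
not kernel code: the toy_* names are hypothetical, and the two constants only
stand in for NETDEV_TX_OK and NETDEV_TX_BUSY. It models just the tail of the
function, where ret is already known to be NETDEV_TX_OK.

```c
#include <stdbool.h>

/* Toy constants standing in for the values discussed in the thread. */
enum {
	TOY_TX_OK   = 0x00,	/* stands in for NETDEV_TX_OK */
	TOY_TX_BUSY = 0x10,	/* stands in for NETDEV_TX_BUSY */
};

/* Old tail: with ret == 0 the queue state is never consulted. */
static bool toy_tail_old(int ret, bool queue_stopped)
{
	if (ret && queue_stopped)
		return false;
	return true;
}

/* Tail as proposed in the patch: the queue state alone decides. */
static bool toy_tail_patched(bool queue_stopped)
{
	return !queue_stopped;
}

/* Tail with the check dropped altogether. */
static bool toy_tail_dropped(void)
{
	return true;
}
```

With ret == TOY_TX_OK, toy_tail_old() and toy_tail_dropped() agree for every
queue state, which is why dropping the check keeps today's behavior, while
toy_tail_patched() is the only variant that reacts to a stopped queue.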

Thanks,
John 

> Thanks,
> Song
> 


* [PATCH v4 net] stmmac: 802.1ad tag stripping support fix
From: Elad Nachman @ 2018-05-26 19:24 UTC (permalink / raw)
  To: Toshiaki Makita, Jose Abreu, Florian Fainelli, David Miller
  Cc: netdev, peppe.cavallaro, alexandre.torgue, eladv6
In-Reply-To: <6946ddeb-757b-268c-0443-897c0742f58e@lab.ntt.co.jp>

The stmmac reception handler calls stmmac_rx_vlan() to strip the VLAN tag before calling napi_gro_receive().

The function assumes VLAN-tagged frames are always tagged with the 802.1Q protocol,
and assigns ETH_P_8021Q to the skb by hard-coding the parameter in the call to __vlan_hwaccel_put_tag().

This causes packets not to be passed to the VLAN slave if it was created with the 802.1AD protocol
(ip link add link eth0 eth0.100 type vlan proto 802.1ad id 100).

This fix passes the protocol from the VLAN header into __vlan_hwaccel_put_tag()
instead of using the hard-coded value of ETH_P_8021Q.
The NETIF_F_HW_VLAN_CTAG_RX check was removed to be in line with the driver's actual abilities.
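
For illustration only, the same idea on a plain byte buffer, outside the
kernel. toy_pop_vlan() and the TOY_* constants are hypothetical names for this
sketch and not part of the patch; the sketch also masks the TCI down to the
12-bit VLAN ID, which is a simplification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ALEN	6	/* bytes per MAC address */
#define TOY_VLAN_HLEN	4	/* TPID (2 bytes) + TCI (2 bytes) */
#define TOY_P_8021Q	0x8100	/* 802.1Q TPID */
#define TOY_P_8021AD	0x88A8	/* 802.1ad TPID */

/*
 * Pop the outer VLAN tag from an Ethernet frame held in a byte buffer.
 * Reports the tag protocol found in the header instead of assuming 802.1Q.
 * Returns the new start of the frame, or NULL if no VLAN tag is present.
 */
static uint8_t *toy_pop_vlan(uint8_t *frame, size_t len,
			     uint16_t *tpid, uint16_t *vid)
{
	const size_t off = 2 * TOY_ETH_ALEN;	/* tag follows both MACs */
	uint16_t proto, tci;

	if (len < off + TOY_VLAN_HLEN + 2)
		return NULL;

	/* Header fields are big-endian on the wire. */
	proto = (uint16_t)((frame[off] << 8) | frame[off + 1]);
	if (proto != TOY_P_8021Q && proto != TOY_P_8021AD)
		return NULL;

	tci = (uint16_t)((frame[off + 2] << 8) | frame[off + 3]);
	*tpid = proto;
	*vid = tci & 0x0FFF;

	/* Slide both MAC addresses forward over the 4-byte tag. */
	memmove(frame + TOY_VLAN_HLEN, frame, off);
	return frame + TOY_VLAN_HLEN;
}
```

A caller that always reported TOY_P_8021Q here would mislabel 802.1ad frames,
which is the userspace analogue of the hard-coded ETH_P_8021Q described above.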

Signed-off-by: Elad Nachman <eladn@gilat.com>

---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index b65e2d1..284e6a7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -3293,17 +3293,17 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 
 static void stmmac_rx_vlan(struct net_device *dev, struct sk_buff *skb)
 {
-	struct ethhdr *ehdr;
+	struct vlan_ethhdr *veth;
 	u16 vlanid;
+	__be16 vlan_proto;
 
-	if ((dev->features & NETIF_F_HW_VLAN_CTAG_RX) ==
-	    NETIF_F_HW_VLAN_CTAG_RX &&
-	    !__vlan_get_tag(skb, &vlanid)) {
+	if (!__vlan_get_tag(skb, &vlanid)) {
 		/* pop the vlan tag */
-		ehdr = (struct ethhdr *)skb->data;
-		memmove(skb->data + VLAN_HLEN, ehdr, ETH_ALEN * 2);
+		veth = (struct vlan_ethhdr *)skb->data;
+		vlan_proto = veth->h_vlan_proto;
+		memmove(skb->data + VLAN_HLEN, veth, ETH_ALEN * 2);
 		skb_pull(skb, VLAN_HLEN);
-		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlanid);
+		__vlan_hwaccel_put_tag(skb, vlan_proto, vlanid);
 	}
 }
 
-- 
2.7.4


* Re: [PATCH 4/7] x86: remove a stray reference to pci-nommu.c
From: Thomas Gleixner @ 2018-05-26 19:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ingo Molnar, Tony Luck, Fenghua Yu, Greg Kroah-Hartman, x86,
	iommu, linux-kernel, linux-ia64, netdev
In-Reply-To: <20180525143512.1466-5-hch@lst.de>

On Fri, 25 May 2018, Christoph Hellwig wrote:

Subject should be: Documentation/x86: Remove .....

please

> This is just the minimal workaround.  The file file is mostly either stale

file file?

> and/or duplicative of Documentation/admin-guide/kernel-parameters.txt,
> but that is much more work than I'm willing to do right now.

Yeah, this thing is on the todo list ...

> Signed-off-by: Christoph Hellwig <hch@lst.de>

Other than the above nits:

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

> ---
>  Documentation/x86/x86_64/boot-options.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt
> index b297c48389b9..153b3a57fba2 100644
> --- a/Documentation/x86/x86_64/boot-options.txt
> +++ b/Documentation/x86/x86_64/boot-options.txt
> @@ -187,9 +187,9 @@ PCI
>  
>  IOMMU (input/output memory management unit)
>  
> - Currently four x86-64 PCI-DMA mapping implementations exist:
> + Multiple x86-64 PCI-DMA mapping implementations exist, for example:
>  
> -   1. <arch/x86_64/kernel/pci-nommu.c>: use no hardware/software IOMMU at all
> +   1. <lib/dma-direct.c>: use no hardware/software IOMMU at all
>        (e.g. because you have < 3 GB memory).
>        Kernel boot message: "PCI-DMA: Disabling IOMMU"
>  
> -- 
> 2.17.0
> 
> 


* Re: [PATCH 4/6] sh_eth: remove custom .nway_reset from ethtool ops
From: Sergei Shtylyov @ 2018-05-26 19:22 UTC (permalink / raw)
  To: Vladimir Zapolskiy, David S. Miller; +Cc: netdev, linux-renesas-soc
In-Reply-To: <919f9993-1a31-5b37-76aa-6268c6e686ba@cogentembedded.com>

On 05/26/2018 09:46 PM, Sergei Shtylyov wrote:

>> The change fixes a sleep in atomic context issue, which can be
>> always triggered by running 'ethtool -r' command, because
>> phy_start_aneg() protects phydev fields by a mutex.
> 
>    Again, I'm unable to reproduce this BUG()...

   Now I can! I started to suspect this check needs to be specifically enabled
under the Kernel Hacking menu, and it turned out to be so...
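
   For illustration only, a toy userspace model of why the splat shows up only
with that check enabled (as far as I know the option is
CONFIG_DEBUG_ATOMIC_SLEEP). Everything named toy_* is hypothetical, and the
shape of toy_nway_reset_old() is an assumption based on the call trace in the
cover letter, not a copy of the driver code.

```c
#include <stdbool.h>

/* Hypothetical toy model of atomic context tracking. */
struct toy_ctx {
	int atomic_depth;		/* > 0 while a "spinlock" is held */
	bool debug_atomic_sleep;	/* models the debug option being on */
	int splats;			/* number of reported violations */
};

static void toy_spin_lock(struct toy_ctx *c)   { c->atomic_depth++; }
static void toy_spin_unlock(struct toy_ctx *c) { c->atomic_depth--; }

/* Report a violation only when the debug check is compiled in. */
static void toy_might_sleep(struct toy_ctx *c)
{
	if (c->debug_atomic_sleep && c->atomic_depth > 0)
		c->splats++;
}

/* A mutex may sleep, so taking it starts with the might-sleep check. */
static void toy_mutex_lock(struct toy_ctx *c)
{
	toy_might_sleep(c);
}

/* Assumed shape of the old callback: sleeping call under a spinlock. */
static void toy_nway_reset_old(struct toy_ctx *c)
{
	toy_spin_lock(c);
	toy_mutex_lock(c);	/* models phy_start_aneg() taking its mutex */
	toy_spin_unlock(c);
}
```

   With debug_atomic_sleep off, the toy records nothing even though the
locking is just as wrong, which matches what I first saw.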

>> Another note is that the change implicitly replaces phy_start_aneg()
>> with a newer phy_restart_aneg().
>>
>> Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
> [...]

MBR, Sergei


* Re: [PATCH net-next v12 2/5] netvsc: refactor notifier/event handling code to use the failover framework
From: Samudrala, Sridhar @ 2018-05-26 19:22 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: alexander.h.duyck, virtio-dev, mst, kubakici, netdev,
	virtualization, loseweigh, anjali.singhai, aaron.f.brown, davem
In-Reply-To: <20180526075156.GC4288@nanopsycho.orion>

On 5/26/2018 12:51 AM, Jiri Pirko wrote:
> Sat, May 26, 2018 at 09:22:18AM CEST, sridhar.samudrala@intel.com wrote:
>> On 5/25/2018 4:28 PM, Stephen Hemminger wrote:
>>> On Fri, 25 May 2018 16:11:47 -0700
>>> "Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:
>>>
>>>> On 5/25/2018 3:34 PM, Stephen Hemminger wrote:
>>>>> On Thu, 24 May 2018 09:55:14 -0700
>>>>> Sridhar Samudrala <sridhar.samudrala@intel.com> wrote:
>>>>>> --- a/drivers/net/hyperv/Kconfig
>>>>>> +++ b/drivers/net/hyperv/Kconfig
>>>>>> @@ -2,5 +2,6 @@ config HYPERV_NET
>>>>>>     	tristate "Microsoft Hyper-V virtual network driver"
>>>>>>     	depends on HYPERV
>>>>>>     	select UCS2_STRING
>>>>>> +	select FAILOVER
>>>>> When I take a working kernel config, add the patches then do
>>>>> make oldconfig
>>>>>
>>>>> It is not autoselecting FAILOVER, it prompts me for it. This means
>>>>> if user says no then a non-working netvsc device is made.
>>>> I see
>>>>       Generic failover module (FAILOVER) [M/y/?] (NEW)
>>>>
>>>> So the user is given an option to either build as a Module or part of the
>>>> kernel. 'n' is not an option.
>>> With most libraries there is no prompt at all.
>> Not sure what you meant by this.
>> Without any patches applied, i had a .config file with HYPERV_NET configured
>> as a module.
>> Then after applying the first 2 patches in this series, i did a
>>   make oldconfig
>> and i see the above prompt.
>>
>> Are you saying that on some distros, 'make oldconfig' creates a .config
>> file without any prompt and FAILOVER is not getting selected even when HYPERV_NET
>> is enabled?
>>
>>
> Well the thing is that for a user, it makes no sense to select
> "FAILOVER" by hand. It is a lib, so it should only be selected by
> its users. It makes no sense to have it turned on by hand with no lib user.
> You can achieve that by simply removing "help" for the Kconfig
> item. Same thing for "NET_FAILOVER".

I played around with the CONFIG options and I see that the FAILOVER options do
get selected correctly when virtio-net/netvsc are enabled.  Even if FAILOVER
is turned off by the user before the hyperv-net/virtio-net patches are applied,
it gets selected automatically when the hyperv-net/virtio-net patches are applied and
enabled in the config.

If we don't want to allow the user to see these options, then I think we need to
remove them from the Kconfig files.  Just removing "help" doesn't seem to make a
difference.

Can we address any config issues (I don't see any at this point) as a bug fix on top
of this series?



* Re: [PATCH 4/6] sh_eth: remove custom .nway_reset from ethtool ops
From: Sergei Shtylyov @ 2018-05-26 18:46 UTC (permalink / raw)
  To: Vladimir Zapolskiy, David S. Miller; +Cc: netdev, linux-renesas-soc
In-Reply-To: <1527160318-10958-5-git-send-email-vladimir_zapolskiy@mentor.com>

On 05/24/2018 02:11 PM, Vladimir Zapolskiy wrote:

> The change fixes a sleep in atomic context issue, which can be
> always triggered by running 'ethtool -r' command, because
> phy_start_aneg() protects phydev fields by a mutex.

   Again, I'm unable to reproduce this BUG()...

> Another note is that the change implicitly replaces phy_start_aneg()
> with a newer phy_restart_aneg().
> 
> Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
[...]

MBR, Sergei


* Re: [PATCH 1/6] ravb: remove custom .nway_reset from ethtool ops
From: Sergei Shtylyov @ 2018-05-26 17:51 UTC (permalink / raw)
  To: Vladimir Zapolskiy, David S. Miller; +Cc: netdev, linux-renesas-soc
In-Reply-To: <1527160318-10958-2-git-send-email-vladimir_zapolskiy@mentor.com>

On 05/24/2018 02:11 PM, Vladimir Zapolskiy wrote:

> The change fixes a sleep in atomic context issue, which can be
> always triggered by running 'ethtool -r' command, because
> phy_start_aneg() protects phydev fields by a mutex.

   BTW, I was unable to trigger the BUG() with 'ethtool -r eth0' where 'eth0'
is EtherAVB. What am I doing wrong? :-)

MBR, Sergei


* Re: [PATCH 0/6] ravb/sh_eth: fix sleep in atomic by reusing shared ethtool handlers
From: Sergei Shtylyov @ 2018-05-26 17:31 UTC (permalink / raw)
  To: Vladimir Zapolskiy, David S. Miller; +Cc: netdev, linux-renesas-soc
In-Reply-To: <22b370a7-0990-f1f5-196e-6e353a4e34ac@mentor.com>

On 05/25/2018 09:25 AM, Vladimir Zapolskiy wrote:

>>>> For ages trivial changes to RAVB and SuperH ethernet links by means of
>>>> standard 'ethtool' trigger a 'sleeping function called from invalid
>>>> context' bug, to visualize it on r8a7795 ULCB:
>>>>
>>>>   % ethtool -r eth0
>>>>   BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
>>>>   in_atomic(): 1, irqs_disabled(): 128, pid: 554, name: ethtool
>>>>   INFO: lockdep is turned off.
>>>>   irq event stamp: 0
>>>>   hardirqs last  enabled at (0): [<0000000000000000>]           (null)
>>>>   hardirqs last disabled at (0): [<ffff0000080e1d3c>] copy_process.isra.7.part.8+0x2cc/0x1918
>>>>   softirqs last  enabled at (0): [<ffff0000080e1d3c>] copy_process.isra.7.part.8+0x2cc/0x1918
>>>>   softirqs last disabled at (0): [<0000000000000000>]           (null)
>>>>   CPU: 5 PID: 554 Comm: ethtool Not tainted 4.17.0-rc4-arm64-renesas+ #33
>>>>   Hardware name: Renesas H3ULCB board based on r8a7795 ES2.0+ (DT)
>>>>   Call trace:
>>>>    dump_backtrace+0x0/0x198
>>>>    show_stack+0x24/0x30
>>>>    dump_stack+0xb8/0xf4
>>>>    ___might_sleep+0x1c8/0x1f8
>>>>    __might_sleep+0x58/0x90
>>>>    __mutex_lock+0x50/0x890
>>>>    mutex_lock_nested+0x3c/0x50
>>>>    phy_start_aneg_priv+0x38/0x180
>>>>    phy_start_aneg+0x24/0x30
>>>>    ravb_nway_reset+0x3c/0x68
>>>>    dev_ethtool+0x3dc/0x2338
>>>>    dev_ioctl+0x19c/0x490
>>>>    sock_do_ioctl+0xe0/0x238
>>>>    sock_ioctl+0x254/0x460
>>>>    do_vfs_ioctl+0xb0/0x918
>>>>    ksys_ioctl+0x50/0x80
>>>>    sys_ioctl+0x34/0x48
>>>>    __sys_trace_return+0x0/0x4
>>>>
>>>> The root cause is that an attempt to modify ECMR and GECMR registers
>>>> only when RX/TX function is disabled was too overcomplicated in its
>>>> original implementation, also processing of an optional Link Change
>>>> interrupt added even more complexity, as a result the implementation
>>>> was error prone.
>>>>
>>>> The new locking scheme is confirmed to be correct by dumping driver
>>>> specific and generic PHY framework function calls with aid of ftrace
>>>> while running more or less advanced tests.
>>>>
>>>> Please note that sh_eth patches from the series were built-tested only.
>>>>
>>>> On purpose I do not add Fixes tags, the reused PHY handlers were added
>>>> way later than the fixed problems were firstly found in the drivers.
>>>
>>>    I think you went one step too far with these fixes. At first glance,
>>> the real fixes are to remove grabbing/releasing the spinlock for the duration
>>> of the phylib calls. Am I right? If so, making use of the new phylib APIs
>>> would be a further enhancement, it's not needed for fixing the splats per se...
>>
>>    Note that I hadn't looked at the patches #3/#6 at the time of writing this;
>> those seem to be more complicated than the rest.
> 
> Right, the simplistic approach of just removing the held spinlock does
> not fit well into the overall lame locking model found in the driver.

   Yet you only try fixing it in the patches #3 and #6. I was talking about
the patches #1 and #4 mostly (#2 and #5 turned out to be non-fixes).

> The thing is that I would prefer to exhibit 'remove custom callbacks'
> side of the changes as it is done now, and fixing severe 'invalid context'
> bugs is left as a valuable side effect. I may attempt to find enough
> free time to follow your instructions, but frankly speaking I don't
> see it beneficial to split a single good all-sufficient change into
> three or more: removal of spinlocks, replacement of phy_start_aneg(),
> then a non-functional clean-up.

   Yes, I would prefer these step-by-step changes.

> Bikeshedding isn't my preference,

   This is not about bikeshedding. What you are trying to do clearly
violates the 2 basic principles of the kernel development: "don't mix
fixes and enhancements" and "do one thing per patch". 

> but a report about technical flaws related to the published changes
> is appreciated, otherwise let me ask you to accept the changes as is,
> secondary optimizations can be done on top of them.

   No, I'll certainly have to NAK patches #1/#3 in their current form.
I'm yet to review patches #3/#6... anyway, if you lack the time to do things
properly, I'll have to take this burden on my shoulders (giving you credits).
Yet I'm basically in the same situation as you -- I have to spend my copious
free time on the large patch sets (like yours) and I'm still having some cleanups
to sh_eth cooking here (which I'll most probably have to defer)...

> --
> With best wishes,
> Vladimir

MBR, Sergei

^ permalink raw reply

* Re: [PATCH 2/6] ravb: remove custom .get_link_ksettings from ethtool ops
From: Sergei Shtylyov @ 2018-05-26 17:07 UTC (permalink / raw)
  To: Vladimir Zapolskiy, David S. Miller; +Cc: netdev, linux-renesas-soc
In-Reply-To: <1527160318-10958-3-git-send-email-vladimir_zapolskiy@mentor.com>

On 05/24/2018 02:11 PM, Vladimir Zapolskiy wrote:

> The change replaces a custom implementation of .get_link_ksettings
> callback with a shared phy_ethtool_get_link_ksettings(), note that

> &priv->lock wrapping is not needed, because the lock does not
> serialize access to phydev fields.

   No BUG() here, AFAICT. But then this is not a fix but an enhancement.
And I would have done that in 2 steps: 1st removing the spinlock code
and the 2nd removing the custom method implementation. 

> Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
[...]

MBR, Sergei

^ permalink raw reply

* Re: [PATCH 1/6] ravb: remove custom .nway_reset from ethtool ops
From: Sergei Shtylyov @ 2018-05-26 16:56 UTC (permalink / raw)
  To: Vladimir Zapolskiy, David S. Miller; +Cc: netdev, linux-renesas-soc
In-Reply-To: <1527160318-10958-2-git-send-email-vladimir_zapolskiy@mentor.com>

Hello.

   A formal patch review this time...

On 05/24/2018 02:11 PM, Vladimir Zapolskiy wrote:

> The change fixes a sleep in atomic context issue, which can be
> always triggered by running 'ethtool -r' command, because
> phy_start_aneg() protects phydev fields by a mutex.

   OK so far...

> Another note is that the change implicitly replaces phy_start_aneg()
> with a newer phy_restart_aneg().

   Why? Is this necessary to fix the BUG()?

> Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
> ---
>  drivers/net/ethernet/renesas/ravb_main.c | 17 +----------------
>  1 file changed, 1 insertion(+), 16 deletions(-)
> 
> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
> index 68f122140966..4a043eb0e2aa 100644
> --- a/drivers/net/ethernet/renesas/ravb_main.c
> +++ b/drivers/net/ethernet/renesas/ravb_main.c
> @@ -1150,21 +1150,6 @@ static int ravb_set_link_ksettings(struct net_device *ndev,
>  	return error;
>  }
>  
> -static int ravb_nway_reset(struct net_device *ndev)
> -{
> -	struct ravb_private *priv = netdev_priv(ndev);
> -	int error = -ENODEV;
> -	unsigned long flags;
> -
> -	if (ndev->phydev) {
> -		spin_lock_irqsave(&priv->lock, flags);

   OK, removing spin_lock_irqsave() fixes the BUG()...
   Not sure what we protect against here anyway, MAC interrupts?

> -		error = phy_start_aneg(ndev->phydev);
> -		spin_unlock_irqrestore(&priv->lock, flags);
> -	}
> -
> -	return error;
> -}
> -
>  static u32 ravb_get_msglevel(struct net_device *ndev)
>  {
>  	struct ravb_private *priv = netdev_priv(ndev);
> @@ -1377,7 +1362,7 @@ static int ravb_set_wol(struct net_device *ndev, struct ethtool_wolinfo *wol)
>  }
>  
>  static const struct ethtool_ops ravb_ethtool_ops = {
> -	.nway_reset		= ravb_nway_reset,
> +	.nway_reset		= phy_ethtool_nway_reset,

   What does this fix?

>  	.get_msglevel		= ravb_get_msglevel,
>  	.set_msglevel		= ravb_set_msglevel,
>  	.get_link		= ethtool_op_get_link,

MBR, Sergei

^ permalink raw reply
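
The splat discussed in the review above comes from calling a function that may
sleep (phy_start_aneg() takes a mutex) while a spinlock is held. What follows
is only a userspace model of that pattern, not kernel code: every model_* name,
the splat counter, and both nway_reset_* helpers are hypothetical and exist
purely to show why dropping the spinlock is enough to make the warning go away.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace model; none of these are kernel symbols. */
static int atomic_depth;	/* > 0 while a modelled spinlock is held */
static int splats;		/* counts modelled "invalid context" warnings */

void model_spin_lock(void)
{
	atomic_depth++;
}

void model_spin_unlock(void)
{
	assert(atomic_depth > 0);	/* catch unbalanced unlocks */
	atomic_depth--;
}

/* Models a PHY call that takes a mutex: sleeping in atomic context is a bug. */
int model_phy_start_aneg(void)
{
	if (atomic_depth > 0)
		splats++;
	return 0;
}

int model_splat_count(void)
{
	return splats;
}

/* Shape of the removed ravb_nway_reset(): PHY call made under the spinlock. */
int nway_reset_locked(int have_phy)
{
	int error = -ENODEV;

	if (have_phy) {
		model_spin_lock();
		error = model_phy_start_aneg();
		model_spin_unlock();
	}
	return error;
}

/* Minimal fix in this model: the same call with the spinlock dropped. */
int nway_reset_unlocked(int have_phy)
{
	return have_phy ? model_phy_start_aneg() : -ENODEV;
}
```

In this model the locked variant bumps the splat counter on every call, while
the unlocked variant never does. That corresponds to the first of the
step-by-step changes asked for in the review; switching to the shared phylib
helper would be a separate change on top.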

* Re: [PATCH] crypto: chtls: generic handling of data and hdr
From: Herbert Xu @ 2018-05-26 16:25 UTC (permalink / raw)
  To: Atul Gupta; +Cc: linux-crypto, netdev, davem, Harsh Jain
In-Reply-To: <1526296298-30183-1-git-send-email-atul.gupta@chelsio.com>

On Mon, May 14, 2018 at 04:41:38PM +0530, Atul Gupta wrote:
> removed redundant check and made TLS PDU and header recv
> handling common as received from HW.
> Ensure that only tls header is read in cpl_rx_tls_cmp
> read-ahead and skb is freed when entire data is processed.
> 
> Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
> Signed-off-by: Harsh Jain <harsh@chelsio.com>

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH net] sctp: not allow to set rto_min with a value below 200 msecs
From: Dmitry Vyukov @ 2018-05-26 15:50 UTC (permalink / raw)
  To: Michael Tuexen
  Cc: Neil Horman, Xin Long, network dev, linux-sctp, David Miller,
	David Ahern, Eric Dumazet, Marcelo Ricardo Leitner, syzkaller
In-Reply-To: <71FD541D-25FE-4100-980B-C3A0CEAF6CD4@lurchi.franken.de>

On Sat, May 26, 2018 at 5:42 PM, Michael Tuexen
<michael.tuexen@lurchi.franken.de> wrote:
>> On 25. May 2018, at 21:13, Neil Horman <nhorman@tuxdriver.com> wrote:
>>
>> On Sat, May 26, 2018 at 01:41:02AM +0800, Xin Long wrote:
>>> syzbot reported a rcu_sched self-detected stall on CPU which is caused
>>> by too small value set on rto_min with SCTP_RTOINFO sockopt. With this
>>> value, hb_timer will get stuck there, as in its timer handler it starts
>>> this timer again with this value, then goes to the timer handler again.
>>>
>>> This problem is there since very beginning, and thanks to Eric for the
>>> reproducer shared from a syzbot mail.
>>>
>>> This patch fixes it by not allowing to set rto_min with a value below
>>> 200 msecs, which is based on TCP's, by either setsockopt or sysctl.
>>>
>>> Reported-by: syzbot+3dcd59a1f907245f891f@syzkaller.appspotmail.com
>>> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
>>> Signed-off-by: Xin Long <lucien.xin@gmail.com>
>>> ---
>>> include/net/sctp/constants.h |  1 +
>>> net/sctp/socket.c            | 10 +++++++---
>>> net/sctp/sysctl.c            |  3 ++-
>>> 3 files changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h
>>> index 20ff237..2ee7a7b 100644
>>> --- a/include/net/sctp/constants.h
>>> +++ b/include/net/sctp/constants.h
>>> @@ -277,6 +277,7 @@ enum { SCTP_MAX_GABS = 16 };
>>> #define SCTP_RTO_INITIAL     (3 * 1000)
>>> #define SCTP_RTO_MIN         (1 * 1000)
>>> #define SCTP_RTO_MAX         (60 * 1000)
>>> +#define SCTP_RTO_HARD_MIN   200
>>>
>>> #define SCTP_RTO_ALPHA          3   /* 1/8 when converted to right shifts. */
>>> #define SCTP_RTO_BETA           2   /* 1/4 when converted to right shifts. */
>>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
>>> index ae7e7c6..6ef12c7 100644
>>> --- a/net/sctp/socket.c
>>> +++ b/net/sctp/socket.c
>>> @@ -3029,7 +3029,8 @@ static int sctp_setsockopt_nodelay(struct sock *sk, char __user *optval,
>>>  * be changed.
>>>  *
>>>  */
>>> -static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval, unsigned int optlen)
>>> +static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval,
>>> +                               unsigned int optlen)
>>> {
>>>      struct sctp_rtoinfo rtoinfo;
>>>      struct sctp_association *asoc;
>>> @@ -3056,10 +3057,13 @@ static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval, unsigne
>>>      else
>>>              rto_max = asoc ? asoc->rto_max : sp->rtoinfo.srto_max;
>>>
>>> -    if (rto_min)
>>> +    if (rto_min) {
>>> +            if (rto_min < SCTP_RTO_HARD_MIN)
>>> +                    return -EINVAL;
>>>              rto_min = asoc ? msecs_to_jiffies(rto_min) : rto_min;
>>> -    else
>>> +    } else {
>>>              rto_min = asoc ? asoc->rto_min : sp->rtoinfo.srto_min;
>>> +    }
>>>
>>>      if (rto_min > rto_max)
>>>              return -EINVAL;
>>> diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c
>>> index 33ca5b7..7ec854a 100644
>>> --- a/net/sctp/sysctl.c
>>> +++ b/net/sctp/sysctl.c
>>> @@ -52,6 +52,7 @@ static int rto_alpha_min = 0;
>>> static int rto_beta_min = 0;
>>> static int rto_alpha_max = 1000;
>>> static int rto_beta_max = 1000;
>>> +static int rto_hard_min = SCTP_RTO_HARD_MIN;
>>>
>>> static unsigned long max_autoclose_min = 0;
>>> static unsigned long max_autoclose_max =
>>> @@ -116,7 +117,7 @@ static struct ctl_table sctp_net_table[] = {
>>>              .maxlen         = sizeof(unsigned int),
>>>              .mode           = 0644,
>>>              .proc_handler   = proc_sctp_do_rto_min,
>>> -            .extra1         = &one,
>>> +            .extra1         = &rto_hard_min,
>>>              .extra2         = &init_net.sctp.rto_max
>>>      },
>>>      {
>>> --
>>> 2.1.0
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>> Patch looks fine, you probably want to note this hard minimum in man(7) sctp as
>> well
>>
> I'm aware of some signalling networks which use RTO.min of smaller values than 200ms.
> So could this be reduced?

Hi Michael,

What value do they use?

Xin, Neil, is there more principled way of ensuring that a timer won't
cause a hard CPU stall? There are slow machines and there are slow
kernels (in particular syzbot kernel has tons of debug configs
enabled). 200ms _should_ not cause problems because we did not see
them with tcp. But it's hard to say what's the low limit as we are
trying to put a hard upper bound on execution time of a complex
section of code. Is there something like cond_resched for timers?

^ permalink raw reply
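
For reference, the bounds check that the quoted patch adds to the setsockopt
path can be modelled in plain userspace C roughly as below. This is a sketch
under assumptions: struct rto_cfg and set_rto_min() are hypothetical names,
all values are kept in milliseconds (the jiffies conversion done by the real
code is left out), and only the 200 ms constant is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Value taken from the quoted patch; the unit is milliseconds. */
#define SCTP_RTO_HARD_MIN 200

/* Hypothetical stand-in for the per-socket RTO settings (not a kernel type). */
struct rto_cfg {
	unsigned int rto_min_ms;
	unsigned int rto_max_ms;
};

/*
 * Userspace model of the check: 0 means "keep the current value";
 * any other value must be at least the hard minimum and at most rto_max.
 */
int set_rto_min(struct rto_cfg *cfg, unsigned int rto_min_ms)
{
	assert(cfg != NULL);

	if (rto_min_ms) {
		if (rto_min_ms < SCTP_RTO_HARD_MIN)
			return -EINVAL;
	} else {
		rto_min_ms = cfg->rto_min_ms;
	}

	if (rto_min_ms > cfg->rto_max_ms)
		return -EINVAL;

	cfg->rto_min_ms = rto_min_ms;
	return 0;
}
```

With this model, a request for 100 ms is rejected with -EINVAL and the stored
value is left alone, which is exactly the behaviour the signalling-network use
case raised in this thread would run into.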

* Re: [PATCH net] sctp: not allow to set rto_min with a value below 200 msecs
From: Michael Tuexen @ 2018-05-26 15:42 UTC (permalink / raw)
  To: Neil Horman
  Cc: Xin Long, network dev, linux-sctp, davem, David Ahern,
	Eric Dumazet, Marcelo Ricardo Leitner, syzkaller
In-Reply-To: <20180525191319.GB392@hmswarspite.think-freely.org>

> On 25. May 2018, at 21:13, Neil Horman <nhorman@tuxdriver.com> wrote:
> 
> On Sat, May 26, 2018 at 01:41:02AM +0800, Xin Long wrote:
>> syzbot reported a rcu_sched self-detected stall on CPU which is caused
>> by too small value set on rto_min with SCTP_RTOINFO sockopt. With this
>> value, hb_timer will get stuck there, as in its timer handler it starts
>> this timer again with this value, then goes to the timer handler again.
>> 
>> This problem is there since very beginning, and thanks to Eric for the
>> reproducer shared from a syzbot mail.
>> 
>> This patch fixes it by not allowing to set rto_min with a value below
>> 200 msecs, which is based on TCP's, by either setsockopt or sysctl.
>> 
>> Reported-by: syzbot+3dcd59a1f907245f891f@syzkaller.appspotmail.com
>> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
>> Signed-off-by: Xin Long <lucien.xin@gmail.com>
>> ---
>> include/net/sctp/constants.h |  1 +
>> net/sctp/socket.c            | 10 +++++++---
>> net/sctp/sysctl.c            |  3 ++-
>> 3 files changed, 10 insertions(+), 4 deletions(-)
>> 
>> diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h
>> index 20ff237..2ee7a7b 100644
>> --- a/include/net/sctp/constants.h
>> +++ b/include/net/sctp/constants.h
>> @@ -277,6 +277,7 @@ enum { SCTP_MAX_GABS = 16 };
>> #define SCTP_RTO_INITIAL	(3 * 1000)
>> #define SCTP_RTO_MIN		(1 * 1000)
>> #define SCTP_RTO_MAX		(60 * 1000)
>> +#define SCTP_RTO_HARD_MIN	200
>> 
>> #define SCTP_RTO_ALPHA          3   /* 1/8 when converted to right shifts. */
>> #define SCTP_RTO_BETA           2   /* 1/4 when converted to right shifts. */
>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
>> index ae7e7c6..6ef12c7 100644
>> --- a/net/sctp/socket.c
>> +++ b/net/sctp/socket.c
>> @@ -3029,7 +3029,8 @@ static int sctp_setsockopt_nodelay(struct sock *sk, char __user *optval,
>>  * be changed.
>>  *
>>  */
>> -static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval, unsigned int optlen)
>> +static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval,
>> +				   unsigned int optlen)
>> {
>> 	struct sctp_rtoinfo rtoinfo;
>> 	struct sctp_association *asoc;
>> @@ -3056,10 +3057,13 @@ static int sctp_setsockopt_rtoinfo(struct sock *sk, char __user *optval, unsigne
>> 	else
>> 		rto_max = asoc ? asoc->rto_max : sp->rtoinfo.srto_max;
>> 
>> -	if (rto_min)
>> +	if (rto_min) {
>> +		if (rto_min < SCTP_RTO_HARD_MIN)
>> +			return -EINVAL;
>> 		rto_min = asoc ? msecs_to_jiffies(rto_min) : rto_min;
>> -	else
>> +	} else {
>> 		rto_min = asoc ? asoc->rto_min : sp->rtoinfo.srto_min;
>> +	}
>> 
>> 	if (rto_min > rto_max)
>> 		return -EINVAL;
>> diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c
>> index 33ca5b7..7ec854a 100644
>> --- a/net/sctp/sysctl.c
>> +++ b/net/sctp/sysctl.c
>> @@ -52,6 +52,7 @@ static int rto_alpha_min = 0;
>> static int rto_beta_min = 0;
>> static int rto_alpha_max = 1000;
>> static int rto_beta_max = 1000;
>> +static int rto_hard_min = SCTP_RTO_HARD_MIN;
>> 
>> static unsigned long max_autoclose_min = 0;
>> static unsigned long max_autoclose_max =
>> @@ -116,7 +117,7 @@ static struct ctl_table sctp_net_table[] = {
>> 		.maxlen		= sizeof(unsigned int),
>> 		.mode		= 0644,
>> 		.proc_handler	= proc_sctp_do_rto_min,
>> -		.extra1         = &one,
>> +		.extra1         = &rto_hard_min,
>> 		.extra2         = &init_net.sctp.rto_max
>> 	},
>> 	{
>> -- 
>> 2.1.0
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
> Patch looks fine, you probably want to note this hard minimum in man(7) sctp as
> well
> 
I'm aware of some signalling networks which use RTO.min of smaller values than 200ms.
So could this be reduced?

Best regards
Michael
> Acked-by: Neil Horman <nhorman@tuxdriver.com>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: INFO: rcu detected stall in skb_free_head
From: Dmitry Vyukov @ 2018-05-26 15:38 UTC (permalink / raw)
  To: syzbot
  Cc: Andrei Vagin, David Miller, Kirill Tkhai, LKML, netdev,
	syzkaller-bugs
In-Reply-To: <000000000000799e6c056aff4a6f@google.com>

On Sun, Apr 29, 2018 at 6:33 PM, syzbot
<syzbot+cac7c17ec0aca89d3c45@syzkaller.appspotmail.com> wrote:
> Hello,
>
> syzbot hit the following crash on upstream commit
> a27fc14219f2e3c4a46ba9177b04d9b52c875532 (Mon Apr 16 21:07:39 2018 +0000)
> Merge branch 'parisc-4.17-3' of
> git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
> syzbot dashboard link:
> https://syzkaller.appspot.com/bug?extid=cac7c17ec0aca89d3c45
>
> Unfortunately, I don't have any reproducer for this crash yet.
> Raw console output:
> https://syzkaller.appspot.com/x/log.txt?id=6517400396627968
> Kernel config:
> https://syzkaller.appspot.com/x/.config?id=-5914490758943236750
> compiler: gcc (GCC) 8.0.1 20180413 (experimental)
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+cac7c17ec0aca89d3c45@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
>
> INFO: rcu_sched self-detected stall on CPU
>         1-...!: (117917 ticks this GP) idle=036/1/4611686018427387906
> softirq=114416/114416 fqs=32
>          (t=125000 jiffies g=60712 c=60711 q=345938)
> rcu_sched kthread starved for 124847 jiffies! g60712 c60711 f0x2
> RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
> RCU grace-period kthread stack dump:
> rcu_sched       R  running task    23592     9      2 0x80000000
> Call Trace:
>  context_switch kernel/sched/core.c:2848 [inline]
>  __schedule+0x801/0x1e30 kernel/sched/core.c:3490
>  schedule+0xef/0x430 kernel/sched/core.c:3549
>  schedule_timeout+0x138/0x240 kernel/time/timer.c:1801
>  rcu_gp_kthread+0x6b5/0x1940 kernel/rcu/tree.c:2231
>  kthread+0x345/0x410 kernel/kthread.c:238
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
> NMI backtrace for cpu 1
> CPU: 1 PID: 24 Comm: kworker/1:1 Not tainted 4.17.0-rc1+ #6
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Workqueue: events rht_deferred_worker
> Call Trace:
>  <IRQ>
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>  nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
>  nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
>  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
>  trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline]
>  rcu_dump_cpu_stacks+0x175/0x1c2 kernel/rcu/tree.c:1376
>  print_cpu_stall kernel/rcu/tree.c:1525 [inline]
>  check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593
>  __rcu_pending kernel/rcu/tree.c:3356 [inline]
>  rcu_pending kernel/rcu/tree.c:3401 [inline]
>  rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763
>  update_process_times+0x2d/0x70 kernel/time/timer.c:1636
>  tick_sched_handle+0x9f/0x180 kernel/time/tick-sched.c:173
>  tick_sched_timer+0x45/0x130 kernel/time/tick-sched.c:1283
>  __run_hrtimer kernel/time/hrtimer.c:1386 [inline]
>  __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1448
>  hrtimer_interrupt+0x286/0x650 kernel/time/hrtimer.c:1506
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
>  smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050
>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
> RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:783
> [inline]
> RIP: 0010:kfree+0x124/0x260 mm/slab.c:3814
> RSP: 0018:ffff8801db105450 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
> RAX: 0000000000000007 RBX: ffff88006c118040 RCX: 1ffff1003b3059e7
> RDX: 0000000000000000 RSI: ffff8801d982cf90 RDI: 0000000000000286
> RBP: ffff8801db105470 R08: ffff8801d982ce78 R09: 0000000000000002
> R10: ffff8801d982c640 R11: 0000000000000000 R12: 0000000000000286
> R13: ffff8801dac00ac0 R14: ffffffff85bd7b69 R15: ffff88006c0f8180
>  skb_free_head+0x99/0xc0 net/core/skbuff.c:550
>  skb_release_data+0x690/0x860 net/core/skbuff.c:570
>  skb_release_all+0x4a/0x60 net/core/skbuff.c:627
>  __kfree_skb net/core/skbuff.c:641 [inline]
>  kfree_skb+0x195/0x560 net/core/skbuff.c:659
>  enqueue_to_backlog+0x2fc/0xc90 net/core/dev.c:3968
>  netif_rx_internal+0x14d/0xae0 net/core/dev.c:4181
>  netif_rx+0xba/0x400 net/core/dev.c:4206
>  loopback_xmit+0x283/0x741 drivers/net/loopback.c:91
>  __netdev_start_xmit include/linux/netdevice.h:4087 [inline]
>  netdev_start_xmit include/linux/netdevice.h:4096 [inline]
>  xmit_one net/core/dev.c:3053 [inline]
>  dev_hard_start_xmit+0x264/0xc10 net/core/dev.c:3069
>  __dev_queue_xmit+0x2724/0x34c0 net/core/dev.c:3584
>  dev_queue_xmit+0x17/0x20 net/core/dev.c:3617
>  neigh_hh_output include/net/neighbour.h:472 [inline]
>  neigh_output include/net/neighbour.h:480 [inline]
>  ip_finish_output2+0x1046/0x1840 net/ipv4/ip_output.c:229
>  ip_finish_output+0x828/0xf80 net/ipv4/ip_output.c:317
>  NF_HOOK_COND include/linux/netfilter.h:277 [inline]
>  ip_output+0x21b/0x850 net/ipv4/ip_output.c:405
>  dst_output include/net/dst.h:444 [inline]
>  ip_local_out+0xc5/0x1b0 net/ipv4/ip_output.c:124
>  ip_queue_xmit+0x9d7/0x1f70 net/ipv4/ip_output.c:504
>  sctp_v4_xmit+0x108/0x140 net/sctp/protocol.c:983
>  sctp_packet_transmit+0x26f6/0x3ba0 net/sctp/output.c:650
>  sctp_outq_flush+0x1373/0x4370 net/sctp/outqueue.c:1197
>  sctp_outq_uncork+0x6a/0x80 net/sctp/outqueue.c:776
>  sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1820 [inline]
>  sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
>  sctp_do_sm+0x596/0x7160 net/sctp/sm_sideeffect.c:1191
>  sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406

#syz fix: sctp: not allow to set rto_min with a value below 200 msecs


>  call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
>  expire_timers kernel/time/timer.c:1363 [inline]
>  __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
>  run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
>  __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
>  do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1046
>  </IRQ>
>  do_softirq.part.17+0x14d/0x190 kernel/softirq.c:329
>  do_softirq arch/x86/include/asm/preempt.h:23 [inline]
>  __local_bh_enable_ip+0x1ec/0x230 kernel/softirq.c:182
>  __raw_spin_unlock_bh include/linux/spinlock_api_smp.h:176 [inline]
>  _raw_spin_unlock_bh+0x30/0x40 kernel/locking/spinlock.c:200
>  spin_unlock_bh include/linux/spinlock.h:355 [inline]
>  rhashtable_rehash_chain lib/rhashtable.c:292 [inline]
>  rhashtable_rehash_table lib/rhashtable.c:333 [inline]
>  rht_deferred_worker+0x1058/0x1fb0 lib/rhashtable.c:432
>  process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
>  worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
>  kthread+0x345/0x410 kernel/kthread.c:238
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
> BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 126s!
> BUG: workqueue lockup - pool cpus=0-1 flags=0x4 nice=0 stuck for 126s!
> Showing busy workqueues and worker pools:
> workqueue events: flags=0x0
>   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=8/256
>     in-flight: 24:rht_deferred_worker
>     pending: rht_deferred_worker, defense_work_handler,
> defense_work_handler, perf_sched_delayed, defense_work_handler,
> switchdev_deferred_process_work, cache_reap
>   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=11/256
>     pending: jump_label_update_timeout, defense_work_handler,
> defense_work_handler, defense_work_handler, defense_work_handler,
> defense_work_handler, defense_work_handler, check_corruption,
> key_garbage_collector, vmstat_shepherd, cache_reap
> workqueue events_long: flags=0x0
>   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=3/256
>     pending: br_fdb_cleanup, br_fdb_cleanup, br_fdb_cleanup
>   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=5/256
>     pending: br_fdb_cleanup, br_fdb_cleanup, br_fdb_cleanup, br_fdb_cleanup,
> br_fdb_cleanup
> workqueue events_unbound: flags=0x2
>   pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/512
>     in-flight: 10434:fsnotify_connector_destroy_workfn
> workqueue events_power_efficient: flags=0x80
>   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>     pending: neigh_periodic_work
>   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=4/256
>     pending: check_lifetime, gc_worker, do_cache_clean, neigh_periodic_work
> workqueue rcu_gp: flags=0x8
>   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>     pending: process_srcu
> workqueue mm_percpu_wq: flags=0x8
>   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
>     pending: vmstat_update
>   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
>     pending: vmstat_update
> workqueue writeback: flags=0x4e
>   pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/256
>     in-flight: 13940:wb_workfn
> workqueue kblockd: flags=0x18
>   pwq 1: cpus=0 node=0 flags=0x0 nice=-20 active=2/256
>     pending: blk_mq_timeout_work, blk_mq_timeout_work
> workqueue dm_bufio_cache: flags=0x8
>   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
>     pending: work_fn
> workqueue ipv6_addrconf: flags=0x40008
>   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/1
>     pending: addrconf_verify_work
> workqueue krdsd: flags=0xe000a
>   pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/1
>     in-flight: 301:rds_connect_worker
> pool 2: cpus=1 node=0 flags=0x0 nice=0 hung=127s workers=3 idle: 4831 4860
> pool 4: cpus=0-1 flags=0x4 nice=0 hung=0s workers=8 idle: 301 22 6 7437 6882
> 50
>
>
> ---
> This bug is generated by a dumb bot. It may contain errors.
> See https://goo.gl/tpsmEJ for details.
> Direct all questions to syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report.
> If you forgot to add the Reported-by tag, once the fix for this bug is
> merged
> into any tree, please reply to this email with:
> #syz fix: exact-commit-title
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid
> Note: if the crash happens again, it will cause creation of a new bug
> report.
> Note: all commands must start from beginning of the line in the email body.
>
> --
> You received this message because you are subscribed to the Google Groups
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/syzkaller-bugs/000000000000799e6c056aff4a6f%40google.com.
> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply

* Re: INFO: rcu detected stall in kmem_cache_alloc_node_trace
From: Dmitry Vyukov @ 2018-05-26 15:38 UTC (permalink / raw)
  To: syzbot
  Cc: David Miller, LKML, linux-sctp, netdev, Neil Horman,
	syzkaller-bugs, Vladislav Yasevich
In-Reply-To: <00000000000060115b056b14b1fb@google.com>

On Mon, Apr 30, 2018 at 8:05 PM, syzbot
<syzbot+deec965c578bb9b81613@syzkaller.appspotmail.com> wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    17dec0a94915 Merge branch 'userns-linus' of
> git://git.kerne...
> git tree:       net-next
> console output: https://syzkaller.appspot.com/x/log.txt?id=6093051722203136
> kernel config:
> https://syzkaller.appspot.com/x/.config?id=-2735707888269579554
> dashboard link: https://syzkaller.appspot.com/bug?extid=deec965c578bb9b81613
> compiler:       gcc (GCC) 8.0.1 20180301 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+deec965c578bb9b81613@syzkaller.appspotmail.com
>
> sctp: [Deprecated]: syz-executor3 (pid 10218) Use of int in max_burst socket
> option.
> Use struct sctp_assoc_value instead
> sctp: [Deprecated]: syz-executor3 (pid 10218) Use of int in max_burst socket
> option.
> Use struct sctp_assoc_value instead
> random: crng init done
> INFO: rcu_sched self-detected stall on CPU
>         0-....: (120712 ticks this GP) idle=ac6/1/4611686018427387908
> softirq=31693/31693 fqs=31173
>          (t=125001 jiffies g=17039 c=17038 q=303419)
> NMI backtrace for cpu 0
> CPU: 0 PID: 10218 Comm: syz-executor3 Not tainted 4.16.0+ #1
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  <IRQ>
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x1b9/0x29f lib/dump_stack.c:53
>  nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
>  nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
>  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
>  trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline]
>  rcu_dump_cpu_stacks+0x175/0x1c2 kernel/rcu/tree.c:1376
>  print_cpu_stall kernel/rcu/tree.c:1525 [inline]
>  check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593
>  __rcu_pending kernel/rcu/tree.c:3356 [inline]
>  rcu_pending kernel/rcu/tree.c:3401 [inline]
>  rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763
>  update_process_times+0x2d/0x70 kernel/time/timer.c:1636
>  tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:162
>  tick_sched_timer+0x42/0x130 kernel/time/tick-sched.c:1170
>  __run_hrtimer kernel/time/hrtimer.c:1349 [inline]
>  __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1411
>  hrtimer_interrupt+0x2f3/0x750 kernel/time/hrtimer.c:1469
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
>  smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050
>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
> RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:783
> [inline]
> RIP: 0010:lock_is_held_type+0x18b/0x210 kernel/locking/lockdep.c:3960
> RSP: 0018:ffff8801db006400 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff12
> RAX: dffffc0000000000 RBX: 0000000000000282 RCX: 0000000000000000
> RDX: 1ffffffff1162e55 RSI: ffffffff88b90c60 RDI: 0000000000000282
> RBP: ffff8801db006420 R08: ffffed003b6046c3 R09: ffffed003b6046c2
> R10: ffffed003b6046c2 R11: ffff8801db023613 R12: ffff8801b2f623c0
> R13: 0000000000000000 R14: ffff88009932bb00 R15: 00000000ffffffff
>  lock_is_held include/linux/lockdep.h:344 [inline]
>  rcu_read_lock_sched_held+0x108/0x120 kernel/rcu/update.c:117
>  trace_kmalloc_node include/trace/events/kmem.h:100 [inline]
>  kmem_cache_alloc_node_trace+0x34e/0x770 mm/slab.c:3652
>  __do_kmalloc_node mm/slab.c:3669 [inline]
>  __kmalloc_node_track_caller+0x33/0x70 mm/slab.c:3684
>  __kmalloc_reserve.isra.38+0x3a/0xe0 net/core/skbuff.c:137
>  __alloc_skb+0x14d/0x780 net/core/skbuff.c:205
>  alloc_skb include/linux/skbuff.h:987 [inline]
>  sctp_packet_transmit+0x45e/0x3ba0 net/sctp/output.c:585
>  sctp_outq_flush+0x1373/0x4370 net/sctp/outqueue.c:1197
>  sctp_outq_uncork+0x6a/0x80 net/sctp/outqueue.c:776
>  sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1820 [inline]
>  sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
>  sctp_do_sm+0x596/0x7160 net/sctp/sm_sideeffect.c:1191
>  sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406

#syz fix: sctp: not allow to set rto_min with a value below 200 msecs


>  call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
>  expire_timers kernel/time/timer.c:1363 [inline]
>  __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
>  run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
>  __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
>  invoke_softirq kernel/softirq.c:365 [inline]
>  irq_exit+0x1d1/0x200 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:525 [inline]
>  smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
>  </IRQ>
> RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:783
> [inline]
> RIP: 0010:console_unlock+0xcdf/0x1100 kernel/printk/printk.c:2403
> RSP: 0018:ffff8801946eec00 EFLAGS: 00000212 ORIG_RAX: ffffffffffffff12
> RAX: 0000000000040000 RBX: 0000000000000200 RCX: ffffc90002ee8000
> RDX: 0000000000004461 RSI: ffffffff815f3446 RDI: 0000000000000212
> RBP: ffff8801946eed68 R08: ffff8801b2f62c38 R09: 0000000000000006
> R10: ffff8801b2f623c0 R11: 0000000000000000 R12: 0000000000000000
> R13: ffffffff84b84430 R14: 0000000000000001 R15: dffffc0000000000
>  vprintk_emit+0x6ad/0xdd0 kernel/printk/printk.c:1907
>  vprintk_default+0x28/0x30 kernel/printk/printk.c:1947
>  vprintk_func+0x7a/0xe7 kernel/printk/printk_safe.c:379
>  printk+0x9e/0xba kernel/printk/printk.c:1980
>  sctp_getsockopt_maxburst net/sctp/socket.c:6265 [inline]
>  sctp_getsockopt.cold.34+0x11d/0x14c net/sctp/socket.c:7240
>  sock_common_getsockopt+0x9a/0xe0 net/core/sock.c:2998
>  __sys_getsockopt+0x1a5/0x370 net/socket.c:1940
>  SYSC_getsockopt net/socket.c:1951 [inline]
>  SyS_getsockopt+0x34/0x50 net/socket.c:1948
>  do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> RIP: 0033:0x455279
> RSP: 002b:00007f5c0c0f2c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000037
> RAX: ffffffffffffffda RBX: 00007f5c0c0f36d4 RCX: 0000000000455279
> RDX: 0000000000000014 RSI: 0000000000000084 RDI: 0000000000000014
> RBP: 000000000072bea0 R08: 0000000020000240 R09: 0000000000000000
> R10: 0000000020000140 R11: 0000000000000246 R12: 00000000ffffffff
> R13: 000000000000012b R14: 00000000006f4ca8 R15: 0000000000000000
> INFO: rcu_sched detected stalls on CPUs/tasks:
>         0-....: (120712 ticks this GP) idle=ac6/1/4611686018427387908
> softirq=31693/31693 fqs=31173
>         (detected by 1, t=125002 jiffies, g=17039, c=17038, q=303419)
> Sending NMI from CPU 1 to CPUs 0:
> NMI backtrace for cpu 0
> CPU: 0 PID: 10218 Comm: syz-executor3 Not tainted 4.16.0+ #1
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> RIP: 0010:__lock_acquire+0xa78/0x5130 kernel/locking/lockdep.c:3434
> RSP: 0018:ffff8801db005b40 EFLAGS: 00000046
> RAX: dffffc0000000000 RBX: 00000000858813a6 RCX: 0000000000000001
> RDX: 1ffff100365ec585 RSI: ffff8801b2f62c38 RDI: ffff8801b2f62cf9
> RBP: ffff8801db005ed0 R08: 0000000000000008 R09: 0000000000000004
> R10: ffff8801b2f62cd8 R11: ffff8801b2f623c0 R12: 00000000be6c7baf
> R13: 0000000000000000 R14: 2ca977e0cccfd81f R15: 0000000000000000
> FS:  00007f5c0c0f3700(0000) GS:ffff8801db000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007ffe57df8fa8 CR3: 00000001d7124000 CR4: 00000000001406f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <IRQ>
>  lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
>  rcu_lock_acquire include/linux/rcupdate.h:246 [inline]
>  rcu_read_lock include/linux/rcupdate.h:632 [inline]
>  is_bpf_text_address+0x3b/0x170 kernel/bpf/core.c:478
>  kernel_text_address+0x79/0xf0 kernel/extable.c:152
>  __kernel_text_address+0xd/0x40 kernel/extable.c:107
>  unwind_get_return_address+0x61/0xa0 arch/x86/kernel/unwind_frame.c:18
>  __save_stack_trace+0x7e/0xd0 arch/x86/kernel/stacktrace.c:45
>  save_stack_trace+0x1a/0x20 arch/x86/kernel/stacktrace.c:60
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:520
>  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:527
>  __cache_free mm/slab.c:3486 [inline]
>  kmem_cache_free+0x86/0x2d0 mm/slab.c:3744
>  kfree_skbmem+0x13c/0x210 net/core/skbuff.c:582
>  __kfree_skb net/core/skbuff.c:642 [inline]
>  consume_skb+0x193/0x550 net/core/skbuff.c:701
>  sctp_chunk_destroy net/sctp/sm_make_chunk.c:1477 [inline]
>  sctp_chunk_put+0x2c0/0x440 net/sctp/sm_make_chunk.c:1504
>  sctp_chunk_free+0x53/0x60 net/sctp/sm_make_chunk.c:1491
>  sctp_packet_pack net/sctp/output.c:489 [inline]
>  sctp_packet_transmit+0x142e/0x3ba0 net/sctp/output.c:610
>  sctp_outq_flush+0x1373/0x4370 net/sctp/outqueue.c:1197
>  sctp_outq_uncork+0x6a/0x80 net/sctp/outqueue.c:776
>  sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1820 [inline]
>  sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
>  sctp_do_sm+0x596/0x7160 net/sctp/sm_sideeffect.c:1191
>  sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406
>  call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
>  expire_timers kernel/time/timer.c:1363 [inline]
>  __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
>  run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
>  __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
>  invoke_softirq kernel/softirq.c:365 [inline]
>  irq_exit+0x1d1/0x200 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:525 [inline]
>  smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
>  </IRQ>
> RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:783
> [inline]
> RIP: 0010:console_unlock+0xcdf/0x1100 kernel/printk/printk.c:2403
> RSP: 0018:ffff8801946eec00 EFLAGS: 00000212 ORIG_RAX: ffffffffffffff12
> RAX: 0000000000040000 RBX: 0000000000000200 RCX: ffffc90002ee8000
> RDX: 0000000000004461 RSI: ffffffff815f3446 RDI: 0000000000000212
> RBP: ffff8801946eed68 R08: ffff8801b2f62c38 R09: 0000000000000006
> R10: ffff8801b2f623c0 R11: 0000000000000000 R12: 0000000000000000
> R13: ffffffff84b84430 R14: 0000000000000001 R15: dffffc0000000000
>  vprintk_emit+0x6ad/0xdd0 kernel/printk/printk.c:1907
>  vprintk_default+0x28/0x30 kernel/printk/printk.c:1947
>  vprintk_func+0x7a/0xe7 kernel/printk/printk_safe.c:379
>  printk+0x9e/0xba kernel/printk/printk.c:1980
>  sctp_getsockopt_maxburst net/sctp/socket.c:6265 [inline]
>  sctp_getsockopt.cold.34+0x11d/0x14c net/sctp/socket.c:7240
>  sock_common_getsockopt+0x9a/0xe0 net/core/sock.c:2998
>  __sys_getsockopt+0x1a5/0x370 net/socket.c:1940
>  SYSC_getsockopt net/socket.c:1951 [inline]
>  SyS_getsockopt+0x34/0x50 net/socket.c:1948
>  do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> RIP: 0033:0x455279
> RSP: 002b:00007f5c0c0f2c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000037
> RAX: ffffffffffffffda RBX: 00007f5c0c0f36d4 RCX: 0000000000455279
> RDX: 0000000000000014 RSI: 0000000000000084 RDI: 0000000000000014
> RBP: 000000000072bea0 R08: 0000000020000240 R09: 0000000000000000
> R10: 0000000020000140 R11: 0000000000000246 R12: 00000000ffffffff
> R13: 000000000000012b R14: 00000000006f4ca8 R15: 0000000000000000
> Code: 0f 85 dc 32 00 00 8b 0d b7 9e 8b 07 85 c9 0f 84 62 f7 ff ff 48 b8 00
> 00 00 00 00 fc ff df 48 8b 54 24 78 48 c1 ea 03 80 3c 02 00 <0f> 85 0b 31 00
> 00 48 8b 94 24 88 00 00 00 4d 89 b3 68 08 00 00
> INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 3.007
> msecs
> INFO: task kworker/u4:4:688 blocked for more than 120 seconds.
>       Not tainted 4.16.0+ #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kworker/u4:4    D19560   688      2 0x80000000
> Workqueue: events_unbound fsnotify_mark_destroy_workfn
> Call Trace:
>  context_switch kernel/sched/core.c:2848 [inline]
>  __schedule+0x807/0x1e40 kernel/sched/core.c:3490
>  schedule+0xef/0x430 kernel/sched/core.c:3549
>  schedule_timeout+0x1b5/0x240 kernel/time/timer.c:1777
>  do_wait_for_common kernel/sched/completion.c:83 [inline]
>  __wait_for_common kernel/sched/completion.c:104 [inline]
>  wait_for_common kernel/sched/completion.c:115 [inline]
>  wait_for_completion+0x3e7/0x870 kernel/sched/completion.c:136
>  __synchronize_srcu+0x189/0x240 kernel/rcu/srcutree.c:924
>  synchronize_srcu+0x408/0x54f kernel/rcu/srcutree.c:1002
>  fsnotify_mark_destroy_workfn+0x1aa/0x530 fs/notify/mark.c:759
>  process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
>  worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
>  kthread+0x345/0x410 kernel/kthread.c:238
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411
>
> Showing all locks held in the system:
> 2 locks held by kworker/u4:4/688:
>  #0: 000000007b6e92d0 ((wq_completion)"events_unbound"){+.+.}, at:
> __write_once_size include/linux/compiler.h:215 [inline]
>  #0: 000000007b6e92d0 ((wq_completion)"events_unbound"){+.+.}, at:
> arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: 000000007b6e92d0 ((wq_completion)"events_unbound"){+.+.}, at:
> atomic64_set include/asm-generic/atomic-instrumented.h:40 [inline]
>  #0: 000000007b6e92d0 ((wq_completion)"events_unbound"){+.+.}, at:
> atomic_long_set include/asm-generic/atomic-long.h:57 [inline]
>  #0: 000000007b6e92d0 ((wq_completion)"events_unbound"){+.+.}, at:
> set_work_data kernel/workqueue.c:617 [inline]
>  #0: 000000007b6e92d0 ((wq_completion)"events_unbound"){+.+.}, at:
> set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline]
>  #0: 000000007b6e92d0 ((wq_completion)"events_unbound"){+.+.}, at:
> process_one_work+0xaef/0x1b50 kernel/workqueue.c:2116
>  #1: 000000004c7e11cf ((reaper_work).work){+.+.}, at:
> process_one_work+0xb46/0x1b50 kernel/workqueue.c:2120
> 2 locks held by khungtaskd/878:
>  #0: 000000001cc267e2 (rcu_read_lock){....}, at:
> check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
>  #0: 000000001cc267e2 (rcu_read_lock){....}, at: watchdog+0x1ff/0xf60
> kernel/hung_task.c:249
>  #1: 000000002f71223f (tasklist_lock){.+.+}, at:
> debug_show_all_locks+0xde/0x34a kernel/locking/lockdep.c:4470
> 2 locks held by kworker/1:2/1980:
>  #0: 00000000dc6681fd ((wq_completion)"events"){+.+.}, at: __write_once_size
> include/linux/compiler.h:215 [inline]
>  #0: 00000000dc6681fd ((wq_completion)"events"){+.+.}, at: arch_atomic64_set
> arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: 00000000dc6681fd ((wq_completion)"events"){+.+.}, at: atomic64_set
> include/asm-generic/atomic-instrumented.h:40 [inline]
>  #0: 00000000dc6681fd ((wq_completion)"events"){+.+.}, at: atomic_long_set
> include/asm-generic/atomic-long.h:57 [inline]
>  #0: 00000000dc6681fd ((wq_completion)"events"){+.+.}, at: set_work_data
> kernel/workqueue.c:617 [inline]
>  #0: 00000000dc6681fd ((wq_completion)"events"){+.+.}, at:
> set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline]
>  #0: 00000000dc6681fd ((wq_completion)"events"){+.+.}, at:
> process_one_work+0xaef/0x1b50 kernel/workqueue.c:2116
>  #1: 000000008e69c2e7 (xfrm_state_gc_work){+.+.}, at:
> process_one_work+0xb46/0x1b50 kernel/workqueue.c:2120
> 1 lock held by rsyslogd/4365:
>  #0: 00000000ef321630 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x1a9/0x1e0
> fs/file.c:766
> 2 locks held by getty/4456:
>  #0: 0000000044e43f49 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: 0000000093b079e0 (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
> 2 locks held by getty/4457:
>  #0: 000000005b25b55f (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: 0000000099955ee5 (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
> 2 locks held by getty/4458:
>  #0: 0000000050f5738d (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: 00000000cccc0402 (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
> 2 locks held by getty/4459:
>  #0: 00000000ddecfb5c (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: 000000003c071f3a (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
> 2 locks held by getty/4460:
>  #0: 000000003bec706e (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: 00000000dd28453c (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
> 2 locks held by getty/4461:
>  #0: 00000000313fc55a (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: 00000000e9766156 (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
> 2 locks held by getty/4462:
>  #0: 000000003212bb13 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
> drivers/tty/tty_ldsem.c:365
>  #1: 0000000065fa2f73 (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
>
> =============================================
>
> NMI backtrace for cpu 1
> CPU: 1 PID: 878 Comm: khungtaskd Not tainted 4.16.0+ #1
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x1b9/0x29f lib/dump_stack.c:53
>  nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
>  nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
>  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
>  trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline]
>  check_hung_task kernel/hung_task.c:132 [inline]
>  check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline]
>  watchdog+0xc10/0xf60 kernel/hung_task.c:249
>  kthread+0x345/0x410 kernel/kthread.c:238
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411
> Sending NMI from CPU 1 to CPUs 0:
> NMI backtrace for cpu 0
> CPU: 0 PID: 10218 Comm: syz-executor3 Not tainted 4.16.0+ #1
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> RIP: 0010:debug_lockdep_rcu_enabled.part.1+0xb/0x60 kernel/rcu/update.c:298
> RSP: 0018:ffff8801db0062e8 EFLAGS: 00000202
> RAX: dffffc0000000000 RBX: 1ffff1003b600c61 RCX: ffffffff86583d15
> RDX: 0000000000000004 RSI: ffffffff86583d65 RDI: ffffffff88e6cc40
> RBP: ffff8801db0062f8 R08: ffff8801b2f623c0 R09: ffffed003b6046c2
> R10: ffffed003b6046c2 R11: ffff8801db023613 R12: ffff8801c3d740c0
> R13: 0000000000000001 R14: 0000000000010000 R15: ffff88019cc0a900
> FS:  00007f5c0c0f3700(0000) GS:ffff8801db000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007ffe57df8fa8 CR3: 00000001d7124000 CR4: 00000000001406f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <IRQ>
>  rcu_read_unlock include/linux/rcupdate.h:684 [inline]
>  ip6_mtu+0x36a/0x590 net/ipv6/route.c:2420
>  dst_mtu include/net/dst.h:210 [inline]
>  ip6_xmit+0xb42/0x23f0 net/ipv6/ip6_output.c:262
>  sctp_v6_xmit+0x4a5/0x6b0 net/sctp/ipv6.c:225
>  sctp_packet_transmit+0x26f6/0x3ba0 net/sctp/output.c:642
>  sctp_outq_flush+0x1373/0x4370 net/sctp/outqueue.c:1197
>  sctp_outq_uncork+0x6a/0x80 net/sctp/outqueue.c:776
>  sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1820 [inline]
>  sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
>  sctp_do_sm+0x596/0x7160 net/sctp/sm_sideeffect.c:1191
>  sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406
>  call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
>  expire_timers kernel/time/timer.c:1363 [inline]
>  __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
>  run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
>  __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
>  invoke_softirq kernel/softirq.c:365 [inline]
>  irq_exit+0x1d1/0x200 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:525 [inline]
>  smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
>  </IRQ>
> RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:783
> [inline]
> RIP: 0010:console_unlock+0xcdf/0x1100 kernel/printk/printk.c:2403
> RSP: 0018:ffff8801946eec00 EFLAGS: 00000212 ORIG_RAX: ffffffffffffff12
> RAX: 0000000000040000 RBX: 0000000000000200 RCX: ffffc90002ee8000
> RDX: 0000000000004461 RSI: ffffffff815f3446 RDI: 0000000000000212
> RBP: ffff8801946eed68 R08: ffff8801b2f62c38 R09: 0000000000000006
> R10: ffff8801b2f623c0 R11: 0000000000000000 R12: 0000000000000000
> R13: ffffffff84b84430 R14: 0000000000000001 R15: dffffc0000000000
>  vprintk_emit+0x6ad/0xdd0 kernel/printk/printk.c:1907
>  vprintk_default+0x28/0x30 kernel/printk/printk.c:1947
>  vprintk_func+0x7a/0xe7 kernel/printk/printk_safe.c:379
>  printk+0x9e/0xba kernel/printk/printk.c:1980
>  sctp_getsockopt_maxburst net/sctp/socket.c:6265 [inline]
>  sctp_getsockopt.cold.34+0x11d/0x14c net/sctp/socket.c:7240
>  sock_common_getsockopt+0x9a/0xe0 net/core/sock.c:2998
>  __sys_getsockopt+0x1a5/0x370 net/socket.c:1940
>  SYSC_getsockopt net/socket.c:1951 [inline]
>  SyS_getsockopt+0x34/0x50 net/socket.c:1948
>  do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> RIP: 0033:0x455279
> RSP: 002b:00007f5c0c0f2c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000037
> RAX: ffffffffffffffda RBX: 00007f5c0c0f36d4 RCX: 0000000000455279
> RDX: 0000000000000014 RSI: 0000000000000084 RDI: 0000000000000014
> RBP: 000000000072bea0 R08: 0000000020000240 R09: 0000000000000000
> R10: 0000000020000140 R11: 0000000000000246 R12: 00000000ffffffff
> R13: 000000000000012b R14: 00000000006f4ca8 R15: 0000000000000000
> Code: e9 c3 fd ff ff 48 8b 7d c8 e8 52 6c 50 00 e9 9c fd ff ff 0f 1f 00 66
> 2e 0f 1f 84 00 00 00 00 00 48 b8 00 00 00 00 00 fc ff df 55 <48> 89 e5 53 65
> 48 8b 1c 25 c0 ed 01 00 48 8d bb 74 08 00 00 48
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report.
> If you forgot to add the Reported-by tag, once the fix for this bug is
> merged
> into any tree, please reply to this email with:
> #syz fix: exact-commit-title
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid
> Note: if the crash happens again, it will cause creation of a new bug
> report.
> Note: all commands must start from beginning of the line in the email body.
>
> --
> You received this message because you are subscribed to the Google Groups
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/syzkaller-bugs/00000000000060115b056b14b1fb%40google.com.
> For more options, visit https://groups.google.com/d/optout.

* Re: INFO: rcu detected stall in sctp_generate_heartbeat_event
From: Dmitry Vyukov @ 2018-05-26 15:36 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: syzbot, David Miller, LKML, linux-sctp, netdev, Neil Horman,
	syzkaller-bugs, Vladislav Yasevich
In-Reply-To: <20180508120643.GM4977@localhost.localdomain>

On Tue, May 8, 2018 at 2:06 PM, Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
> On Tue, May 08, 2018 at 12:35:02AM -0700, syzbot wrote:
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit:    90278871d4b0 Merge git://git.kernel.org/pub/scm/linux/kern..
>> git tree:       net-next
>> console output: https://syzkaller.appspot.com/x/log.txt?x=119a7237800000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=aea320d3af5ef99d
>> dashboard link: https://syzkaller.appspot.com/bug?extid=e4a5bbd54260c93014f9
>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>
>> Unfortunately, I don't have any reproducer for this crash yet.
>
> A reproducer would be welcome. With just these traces, I don't think
> we have enough information.


#syz fix: sctp: not allow to set rto_min with a value below 200 msecs

* Re: INFO: rcu detected stall in kfree_skbmem
From: Dmitry Vyukov @ 2018-05-26 15:34 UTC (permalink / raw)
  To: Xin Long
  Cc: Neil Horman, Marcelo Ricardo Leitner, syzbot, Vladislav Yasevich,
	linux-sctp, Andrei Vagin, David Miller, Kirill Tkhai, LKML,
	netdev, syzkaller-bugs
In-Reply-To: <CADvbK_crGBk-q_910r-xdh2p=xnxiV=1EExLSX-ecddFwMag6w@mail.gmail.com>

On Mon, May 14, 2018 at 8:04 PM, Xin Long <lucien.xin@gmail.com> wrote:
> On Mon, May 14, 2018 at 9:34 PM, Neil Horman <nhorman@tuxdriver.com> wrote:
>> On Fri, May 11, 2018 at 12:00:38PM +0200, Dmitry Vyukov wrote:
>>> On Mon, Apr 30, 2018 at 8:09 PM, syzbot
>>> <syzbot+fc78715ba3b3257caf6a@syzkaller.appspotmail.com> wrote:
>>> > Hello,
>>> >
>>> > syzbot found the following crash on:
>>> >
>>> > HEAD commit:    5d1365940a68 Merge
>>> > git://git.kernel.org/pub/scm/linux/kerne...
>>> > git tree:       net-next
>>> > console output: https://syzkaller.appspot.com/x/log.txt?id=5667997129637888
>>> > kernel config:
>>> > https://syzkaller.appspot.com/x/.config?id=-5947642240294114534
>>> > dashboard link: https://syzkaller.appspot.com/bug?extid=fc78715ba3b3257caf6a
>>> > compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>> >
>>> > Unfortunately, I don't have any reproducer for this crash yet.
>>>
>>> This looks sctp-related, +sctp maintainers.
>>>
>> Looking at the entire trace, it appears that we are getting caught in the
>> kfree_skb that is getting triggered in enqueue_to_backlog which occurs when our
>> rx backlog list grows over netdev_max_backlog packets.  That suggests to me that
> It might be a long skb->frag_list that made kfree_skb slow when packing
> lots of small chunks to go through the lo device?
>
>> whatever test(s) is/are causing this trace are queuing up a large number of
>> frames to be sent over the loopback interface, and are never/rarely getting
>> received.  Looking up higher in the stack, in the sctp_generate_heartbeat_event
>> function, we (in addition to the rcu_read_lock in sctp_v6_xmit) also hold the
>> socket lock during the entirety of the xmit operation.  Is it possible that we
>> are just enqueuing so many frames for xmit that we are blocking progress of
>> other threads using the same socket that we cross the RCU self detected stall
>> boundary?  While it's not a fix per se, it might be a worthwhile test to limit
>> the number of frames we flush in a single pass.
>>
>> Neil
>>
>>> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> > Reported-by: syzbot+fc78715ba3b3257caf6a@syzkaller.appspotmail.com
>>> >
>>> > INFO: rcu_sched self-detected stall on CPU
>>> >         1-...!: (1 GPs behind) idle=a3e/1/4611686018427387908
>>> > softirq=71980/71983 fqs=33
>>> >          (t=125000 jiffies g=39438 c=39437 q=958)
>>> > rcu_sched kthread starved for 124829 jiffies! g39438 c39437 f0x0
>>> > RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
>>> > RCU grace-period kthread stack dump:
>>> > rcu_sched       R  running task    23768     9      2 0x80000000
>>> > Call Trace:
>>> >  context_switch kernel/sched/core.c:2848 [inline]
>>> >  __schedule+0x801/0x1e30 kernel/sched/core.c:3490
>>> >  schedule+0xef/0x430 kernel/sched/core.c:3549
>>> >  schedule_timeout+0x138/0x240 kernel/time/timer.c:1801
>>> >  rcu_gp_kthread+0x6b5/0x1940 kernel/rcu/tree.c:2231
>>> >  kthread+0x345/0x410 kernel/kthread.c:238
>>> >  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:411
>>> > NMI backtrace for cpu 1
>>> > CPU: 1 PID: 20560 Comm: syz-executor4 Not tainted 4.16.0+ #1
>>> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>>> > Google 01/01/2011
>>> > Call Trace:
>>> >  <IRQ>
>>> >  __dump_stack lib/dump_stack.c:77 [inline]
>>> >  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>>> >  nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
>>> >  nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
>>> >  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
>>> >  trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline]
>>> >  rcu_dump_cpu_stacks+0x175/0x1c2 kernel/rcu/tree.c:1376
>>> >  print_cpu_stall kernel/rcu/tree.c:1525 [inline]
>>> >  check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593
>>> >  __rcu_pending kernel/rcu/tree.c:3356 [inline]
>>> >  rcu_pending kernel/rcu/tree.c:3401 [inline]
>>> >  rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763
>>> >  update_process_times+0x2d/0x70 kernel/time/timer.c:1636
>>> >  tick_sched_handle+0x9f/0x180 kernel/time/tick-sched.c:173
>>> >  tick_sched_timer+0x45/0x130 kernel/time/tick-sched.c:1283
>>> >  __run_hrtimer kernel/time/hrtimer.c:1386 [inline]
>>> >  __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1448
>>> >  hrtimer_interrupt+0x286/0x650 kernel/time/hrtimer.c:1506
>>> >  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
>>> >  smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050
>>> >  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
>>> > RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:783
>>> > [inline]
>>> > RIP: 0010:kmem_cache_free+0xb3/0x2d0 mm/slab.c:3757
>>> > RSP: 0018:ffff8801db105228 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
>>> > RAX: 0000000000000007 RBX: ffff8800b055c940 RCX: 1ffff1003b2345a5
>>> > RDX: 0000000000000000 RSI: ffff8801d91a2d80 RDI: 0000000000000282
>>> > RBP: ffff8801db105248 R08: ffff8801d91a2cb8 R09: 0000000000000002
>>> > R10: ffff8801d91a2480 R11: 0000000000000000 R12: ffff8801d9848e40
>>> > R13: 0000000000000282 R14: ffffffff85b7f27c R15: 0000000000000000
>>> >  kfree_skbmem+0x13c/0x210 net/core/skbuff.c:582
>>> >  __kfree_skb net/core/skbuff.c:642 [inline]
>>> >  kfree_skb+0x19d/0x560 net/core/skbuff.c:659
>>> >  enqueue_to_backlog+0x2fc/0xc90 net/core/dev.c:3968
>>> >  netif_rx_internal+0x14d/0xae0 net/core/dev.c:4181
>>> >  netif_rx+0xba/0x400 net/core/dev.c:4206
>>> >  loopback_xmit+0x283/0x741 drivers/net/loopback.c:91
>>> >  __netdev_start_xmit include/linux/netdevice.h:4087 [inline]
>>> >  netdev_start_xmit include/linux/netdevice.h:4096 [inline]
>>> >  xmit_one net/core/dev.c:3053 [inline]
>>> >  dev_hard_start_xmit+0x264/0xc10 net/core/dev.c:3069
>>> >  __dev_queue_xmit+0x2724/0x34c0 net/core/dev.c:3584
>>> >  dev_queue_xmit+0x17/0x20 net/core/dev.c:3617
>>> >  neigh_hh_output include/net/neighbour.h:472 [inline]
>>> >  neigh_output include/net/neighbour.h:480 [inline]
>>> >  ip6_finish_output2+0x134e/0x2810 net/ipv6/ip6_output.c:120
>>> >  ip6_finish_output+0x5fe/0xbc0 net/ipv6/ip6_output.c:154
>>> >  NF_HOOK_COND include/linux/netfilter.h:277 [inline]
>>> >  ip6_output+0x227/0x9b0 net/ipv6/ip6_output.c:171
>>> >  dst_output include/net/dst.h:444 [inline]
>>> >  NF_HOOK include/linux/netfilter.h:288 [inline]
>>> >  ip6_xmit+0xf51/0x23f0 net/ipv6/ip6_output.c:277
>>> >  sctp_v6_xmit+0x4a5/0x6b0 net/sctp/ipv6.c:225
>>> >  sctp_packet_transmit+0x26f6/0x3ba0 net/sctp/output.c:650
>>> >  sctp_outq_flush+0x1373/0x4370 net/sctp/outqueue.c:1197
>>> >  sctp_outq_uncork+0x6a/0x80 net/sctp/outqueue.c:776
>>> >  sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1820 [inline]
>>> >  sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
>>> >  sctp_do_sm+0x596/0x7160 net/sctp/sm_sideeffect.c:1191
>>> >  sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406


#syz fix: sctp: not allow to set rto_min with a value below 200 msecs


>>> >  call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
>>> >  expire_timers kernel/time/timer.c:1363 [inline]
>>> >  __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
>>> >  run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
>>> >  __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
>>> >  invoke_softirq kernel/softirq.c:365 [inline]
>>> >  irq_exit+0x1d1/0x200 kernel/softirq.c:405
>>> >  exiting_irq arch/x86/include/asm/apic.h:525 [inline]
>>> >  smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
>>> >  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:862
>>> >  </IRQ>
>>> > RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:783
>>> > [inline]
>>> > RIP: 0010:lock_release+0x4d4/0xa10 kernel/locking/lockdep.c:3942
>>> > RSP: 0018:ffff8801971ce7b0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
>>> > RAX: dffffc0000000000 RBX: 1ffff10032e39cfb RCX: 1ffff1003b234595
>>> > RDX: 1ffffffff11630ed RSI: 0000000000000002 RDI: 0000000000000282
>>> > RBP: ffff8801971ce8e0 R08: 1ffff10032e39cff R09: ffffed003b6246c2
>>> > R10: 0000000000000003 R11: 0000000000000001 R12: ffff8801d91a2480
>>> > R13: ffffffff88b8df60 R14: ffff8801d91a2480 R15: ffff8801971ce7f8
>>> >  rcu_lock_release include/linux/rcupdate.h:251 [inline]
>>> >  rcu_read_unlock include/linux/rcupdate.h:688 [inline]
>>> >  __unlock_page_memcg+0x72/0x100 mm/memcontrol.c:1654
>>> >  unlock_page_memcg+0x2c/0x40 mm/memcontrol.c:1663
>>> >  page_remove_file_rmap mm/rmap.c:1248 [inline]
>>> >  page_remove_rmap+0x6f2/0x1250 mm/rmap.c:1299
>>> >  zap_pte_range mm/memory.c:1337 [inline]
>>> >  zap_pmd_range mm/memory.c:1441 [inline]
>>> >  zap_pud_range mm/memory.c:1470 [inline]
>>> >  zap_p4d_range mm/memory.c:1491 [inline]
>>> >  unmap_page_range+0xeb4/0x2200 mm/memory.c:1512
>>> >  unmap_single_vma+0x1a0/0x310 mm/memory.c:1557
>>> >  unmap_vmas+0x120/0x1f0 mm/memory.c:1587
>>> >  exit_mmap+0x265/0x570 mm/mmap.c:3038
>>> >  __mmput kernel/fork.c:962 [inline]
>>> >  mmput+0x251/0x610 kernel/fork.c:983
>>> >  exit_mm kernel/exit.c:544 [inline]
>>> >  do_exit+0xe98/0x2730 kernel/exit.c:852
>>> >  do_group_exit+0x16f/0x430 kernel/exit.c:968
>>> >  get_signal+0x886/0x1960 kernel/signal.c:2469
>>> >  do_signal+0x98/0x2040 arch/x86/kernel/signal.c:810
>>> >  exit_to_usermode_loop+0x28a/0x310 arch/x86/entry/common.c:162
>>> >  prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
>>> >  syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
>>> >  do_syscall_64+0x792/0x9d0 arch/x86/entry/common.c:292
>>> >  entry_SYSCALL_64_after_hwframe+0x42/0xb7
>>> > RIP: 0033:0x455319
>>> > RSP: 002b:00007fa346e81ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
>>> > RAX: fffffffffffffe00 RBX: 000000000072bf80 RCX: 0000000000455319
>>> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000072bf80
>>> > RBP: 000000000072bf80 R08: 0000000000000000 R09: 000000000072bf58
>>> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>>> > R13: 0000000000a3e81f R14: 00007fa346e829c0 R15: 0000000000000001
>>> >
>>> >
>>> > ---
>>> > This bug is generated by a bot. It may contain errors.
>>> > See https://goo.gl/tpsmEJ for more information about syzbot.
>>> > syzbot engineers can be reached at syzkaller@googlegroups.com.
>>> >
>>> > syzbot will keep track of this bug report.
>>> > If you forgot to add the Reported-by tag, once the fix for this bug is merged
>>> > into any tree, please reply to this email with:
>>> > #syz fix: exact-commit-title
>>> > To mark this as a duplicate of another syzbot report, please reply with:
>>> > #syz dup: exact-subject-of-another-report
>>> > If it's a one-off invalid bug report, please reply with:
>>> > #syz invalid
>>> > Note: if the crash happens again, it will cause creation of a new bug report.
>>> > Note: all commands must start from beginning of the line in the email body.
>>> >
>>> > --
>>> > You received this message because you are subscribed to the Google Groups
>>> > "syzkaller-bugs" group.
>>> > To unsubscribe from this group and stop receiving emails from it, send an
>>> > email to syzkaller-bugs+unsubscribe@googlegroups.com.
>>> > To view this discussion on the web visit
>>> > https://groups.google.com/d/msgid/syzkaller-bugs/000000000000a9b0e3056b14bfb2%40google.com.
>>> > For more options, visit https://groups.google.com/d/optout.
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>

^ permalink raw reply

* Re: INFO: rcu detected stall in sctp_packet_transmit
From: Dmitry Vyukov @ 2018-05-26 15:34 UTC (permalink / raw)
  To: Xin Long
  Cc: syzbot, davem, LKML, linux-sctp, Marcelo Ricardo Leitner,
	network dev, Neil Horman, syzkaller-bugs, Vlad Yasevich
In-Reply-To: <CACT4Y+aG0CdZOe8+Y5+OWsghV2o36UpgAXFE4cCm-Mmv6Cq0oA@mail.gmail.com>

On Wed, May 16, 2018 at 2:12 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Wed, May 16, 2018 at 1:02 PM, Xin Long <lucien.xin@gmail.com> wrote:
>>>> <syzbot+ff0b569fb5111dcd1a36@syzkaller.appspotmail.com> wrote:
>>>>> Hello,
>>>>>
>>>>> syzbot found the following crash on:
>>>>>
>>>>> HEAD commit:    961423f9fcbc Merge branch 'sctp-Introduce-sctp_flush_ctx'
>>>>> git tree:       net-next
>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1366aea7800000
>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=51fb0a6913f757db
>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=ff0b569fb5111dcd1a36
>>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>>>
>>>>> Unfortunately, I don't have any reproducer for this crash yet.
>>>>>
>>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>>> Reported-by: syzbot+ff0b569fb5111dcd1a36@syzkaller.appspotmail.com
>>>>>
>>>>> INFO: rcu_sched self-detected stall on CPU
>>>>>         0-....: (1 GPs behind) idle=dae/1/4611686018427387908 softirq=93090/93091 fqs=30902
>>>>>          (t=125000 jiffies g=51107 c=51106 q=972)
>>>>> NMI backtrace for cpu 0
>>>>> CPU: 0 PID: 24668 Comm: syz-executor6 Not tainted 4.17.0-rc4+ #44
>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>>>> Call Trace:
>>>>>  <IRQ>
>>>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>>>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>>>>>  nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
>>>>>  nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
>>>>>  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
>>>>>  trigger_single_cpu_backtrace include/linux/nmi.h:156 [inline]
>>>>>  rcu_dump_cpu_stacks+0x175/0x1c2 kernel/rcu/tree.c:1376
>>>>>  print_cpu_stall kernel/rcu/tree.c:1525 [inline]
>>>>>  check_cpu_stall.isra.61.cold.80+0x36c/0x59a kernel/rcu/tree.c:1593
>>>>>  __rcu_pending kernel/rcu/tree.c:3356 [inline]
>>>>>  rcu_pending kernel/rcu/tree.c:3401 [inline]
>>>>>  rcu_check_callbacks+0x21b/0xad0 kernel/rcu/tree.c:2763
>>>>>  update_process_times+0x2d/0x70 kernel/time/timer.c:1636
>>>>>  tick_sched_handle+0x9f/0x180 kernel/time/tick-sched.c:164
>>>>>  tick_sched_timer+0x45/0x130 kernel/time/tick-sched.c:1274
>>>>>  __run_hrtimer kernel/time/hrtimer.c:1398 [inline]
>>>>>  __hrtimer_run_queues+0x3e3/0x10a0 kernel/time/hrtimer.c:1460
>>>>>  hrtimer_interrupt+0x2f3/0x750 kernel/time/hrtimer.c:1518
>>>>>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1025 [inline]
>>>>>  smp_apic_timer_interrupt+0x15d/0x710 arch/x86/kernel/apic/apic.c:1050
>>>>>  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
>>>>> RIP: 0010:sctp_v6_xmit+0x259/0x6b0 net/sctp/ipv6.c:219
>>>>> RSP: 0018:ffff8801dae068e8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
>>>>> RAX: 0000000000000007 RBX: ffff8801bb7ec800 RCX: ffffffff86f1b345
>>>>> RDX: 0000000000000000 RSI: ffffffff86f1b381 RDI: ffff8801b73d97c4
>>>>> RBP: ffff8801dae06988 R08: ffff88019505c300 R09: ffffed003b5c46c2
>>>>> R10: ffffed003b5c46c2 R11: ffff8801dae23613 R12: ffff88011fd57300
>>>>> R13: ffff8801bb7ecec8 R14: 0000000000000029 R15: 0000000000000002
>>>>>  sctp_packet_transmit+0x26f6/0x3ba0 net/sctp/output.c:642
>>>>>  sctp_outq_flush_transports net/sctp/outqueue.c:1164 [inline]
>>>>>  sctp_outq_flush+0x5f5/0x3430 net/sctp/outqueue.c:1212
>>>>>  sctp_outq_uncork+0x6a/0x80 net/sctp/outqueue.c:776
>>>>>  sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1820 [inline]
>>>>>  sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
>>>>>  sctp_do_sm+0x596/0x7160 net/sctp/sm_sideeffect.c:1191
>>>>>  sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406
>>>> Shocks, this timer event again. Can we try to minimize the repo.syz and
>>>> get a short script? It is not necessary to reproduce the issue 100%; we
>>>> need to know what it was doing when this happened.
>>>>
>>>> Thanks.
>>>
>>> It's possible to replay the whole log from console output following
>>> these instructions:
>>> https://github.com/google/syzkaller/blob/master/docs/executing_syzkaller_programs.md
>> Thanks, it's running now.
>> Usually how long will it take to finish running this 5000-line log?
>
> If you run with -repeat=0 then it will run indefinitely, repeating the
> log again and again. If you see:
>
> parsed 1000 programs
> ...
> executed 5000 programs
>
> then it looped 5 times already. You can run with -repeat=10.
>
> syzbot has tried replaying the log, but for some reason it wasn't able
> to reproduce the crash (maybe accumulated state, or maybe it crashed
> in a different way). You can also try logs from other sctp hangs.


#syz fix: sctp: not allow to set rto_min with a value below 200 msecs

^ permalink raw reply

