* Re: [PATCH net 2/2] geneve, vxlan: Don't set exceptions if skb->len < mtu
From: Stefano Brivio @ 2018-10-15 8:27 UTC (permalink / raw)
To: Xin Long; +Cc: davem, Sabrina Dubroca, network dev
In-Reply-To: <CADvbK_fVyZA-MzmESYOQmp_pes+X61iftnYtNNU4Y_uqSg2LhQ@mail.gmail.com>
On Mon, 15 Oct 2018 15:01:31 +0900
Xin Long <lucien.xin@gmail.com> wrote:
> On Sat, Oct 13, 2018 at 6:54 AM Stefano Brivio <sbrivio@redhat.com> wrote:
> >
> > We shouldn't abuse exceptions: if the destination MTU is already higher
> > than what we're transmitting, no exception should be created.
> makes sense, shouldn't ip(6) tunnels also do this?
I should probably have mentioned this in the cover letter: in theory
yes, but I'm doing this as preparation for ICMP handling in UDP
tunnels, and those will get selftests soon (once I'm done).
Writing extensive selftests for IP tunnels will take significantly
longer, so I'm not confident enough to change this right now. I'd prefer
to address that at a later time.
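For readers following along, here is a minimal userspace sketch of the comparison that skb_tunnel_check_pmtu() in this patch performs. The helper name tunnel_needs_pmtu_update() and the numbers in the note below are assumptions for illustration only; this is not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace mirror of the comparison in skb_tunnel_check_pmtu().
 * Returns true when the packet would no longer fit once `headroom` bytes of
 * encapsulation are added, i.e. when a PMTU exception would be set.
 */
static bool tunnel_needs_pmtu_update(uint32_t skb_len, uint32_t encap_mtu,
                                     uint32_t headroom)
{
    /* Assumption of this sketch: the tunnel headers fit inside the outer
     * MTU, so the unsigned subtraction below cannot wrap around. */
    assert(headroom <= encap_mtu);

    return skb_len > encap_mtu - headroom;
}
```

With an assumed outer MTU of 1500 and 50 bytes of headroom, a 1450-byte packet leaves the route untouched, while a 1451-byte packet would trigger the update.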
--
Stefano
* Re: [RFC] VSOCK: The performance problem of vhost_vsock.
From: jiangyiwen @ 2018-10-15 6:12 UTC (permalink / raw)
To: Jason Wang, stefanha; +Cc: kvm, virtualization, netdev
In-Reply-To: <30d7c370-b206-cdac-dc85-53e9be1e1c63@redhat.com>
On 2018/10/15 10:33, Jason Wang wrote:
>
>
> On 2018/10/15 09:43, jiangyiwen wrote:
>> Hi Stefan & All:
>>
>> Now I find vhost-vsock has two performance problems, even though it
>> is not designed for performance.
>>
>> First, I thought vhost-vsock should be faster than vhost-net because it
>> has no TCP/IP stack, but in my tests vhost-net is 5~10 times faster
>> than vhost-vsock. I am currently looking for the reason.
>
> TCP/IP is not a must for vhost-net.
>
> How do you test and compare the performance?
>
> Thanks
>
I tested the performance with my own test tool, as follows:

Server                          Client
socket()
bind()
listen()
                                socket(AF_VSOCK) or socket(AF_INET)
accept() <--------------------> connect()
          *====== Start Record Time ======*
                                sendfile() syscall
recv()
                                Send end
Receive end
send(file_size)
                                recv(file_size)
          *====== End Record Time ======*

The result: vhost-vsock reaches about 500 MB/s, and vhost-net about 2500 MB/s.
By the way, vhost-net uses a single queue.
Thanks.
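As a rough sketch of how figures like these can be derived from the two record points, the helpers below turn a byte count and a pair of timestamps into MB/s. The function names, and the choice of 1 MB = 10^6 bytes, are assumptions; the mail does not say how the original tool computes its numbers.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Elapsed seconds between two timestamps, e.g. taken with
 * clock_gettime(CLOCK_MONOTONIC, ...) at the start and end record points. */
static double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec) +
           (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Throughput in MB/s, taking 1 MB = 10^6 bytes (an assumption). */
static double throughput_mb_per_s(uint64_t bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)bytes / 1e6 / seconds;
}
```

Under these assumptions, moving a 1 GB file in 2 seconds corresponds to 500 MB/s, and the same file in 0.4 seconds to 2500 MB/s.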
>> Second, vhost-vsock only supports two vqs (tx and rx), which means
>> that multiple sockets in the guest all use the same vq to transmit
>> messages and receive responses. So if there are multiple applications
>> in the guest, we should support a "Multiqueue" feature for virtio-vsock.
>>
>> Stefan, have you encountered these problems?
>>
>> Thanks,
>> Yiwen.
>>
>
>
> .
>
* Re: [PATCH net 2/2] geneve, vxlan: Don't set exceptions if skb->len < mtu
From: Xin Long @ 2018-10-15 6:01 UTC (permalink / raw)
To: Stefano Brivio; +Cc: davem, Sabrina Dubroca, network dev
In-Reply-To: <97f93a69894ed49e6b914a8211ec813cc1bbc433.1539381018.git.sbrivio@redhat.com>
On Sat, Oct 13, 2018 at 6:54 AM Stefano Brivio <sbrivio@redhat.com> wrote:
>
> We shouldn't abuse exceptions: if the destination MTU is already higher
> than what we're transmitting, no exception should be created.
makes sense, shouldn't ip(6) tunnels also do this?
>
> Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path")
> Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path")
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
> ---
> drivers/net/geneve.c | 7 +++----
> drivers/net/vxlan.c | 4 ++--
> include/net/dst.h | 10 ++++++++++
> 3 files changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
> index 61c4bfbeb41c..493cd382b8aa 100644
> --- a/drivers/net/geneve.c
> +++ b/drivers/net/geneve.c
> @@ -830,8 +830,8 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
> if (IS_ERR(rt))
> return PTR_ERR(rt);
>
> - skb_dst_update_pmtu(skb, dst_mtu(&rt->dst) -
> - GENEVE_IPV4_HLEN - info->options_len);
> + skb_tunnel_check_pmtu(skb, &rt->dst,
> + GENEVE_IPV4_HLEN + info->options_len);
>
> sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
> if (geneve->collect_md) {
> @@ -872,8 +872,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
> if (IS_ERR(dst))
> return PTR_ERR(dst);
>
> - skb_dst_update_pmtu(skb, dst_mtu(dst) -
> - GENEVE_IPV6_HLEN - info->options_len);
> + skb_tunnel_check_pmtu(skb, dst, GENEVE_IPV6_HLEN + info->options_len);
>
> sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
> if (geneve->collect_md) {
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index 22e0ce592e07..27bd586b94b0 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -2194,7 +2194,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
> }
>
> ndst = &rt->dst;
> - skb_dst_update_pmtu(skb, dst_mtu(ndst) - VXLAN_HEADROOM);
> + skb_tunnel_check_pmtu(skb, ndst, VXLAN_HEADROOM);
>
> tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
> ttl = ttl ? : ip4_dst_hoplimit(&rt->dst);
> @@ -2231,7 +2231,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
> goto out_unlock;
> }
>
> - skb_dst_update_pmtu(skb, dst_mtu(ndst) - VXLAN6_HEADROOM);
> + skb_tunnel_check_pmtu(skb, ndst, VXLAN6_HEADROOM);
>
> tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
> ttl = ttl ? : ip6_dst_hoplimit(ndst);
> diff --git a/include/net/dst.h b/include/net/dst.h
> index 7f735e76ca73..6cf0870414c7 100644
> --- a/include/net/dst.h
> +++ b/include/net/dst.h
> @@ -527,4 +527,14 @@ static inline void skb_dst_update_pmtu(struct sk_buff *skb, u32 mtu)
> dst->ops->update_pmtu(dst, NULL, skb, mtu);
> }
>
> +static inline void skb_tunnel_check_pmtu(struct sk_buff *skb,
> + struct dst_entry *encap_dst,
> + int headroom)
> +{
> + u32 encap_mtu = dst_mtu(encap_dst);
> +
> + if (skb->len > encap_mtu - headroom)
> + skb_dst_update_pmtu(skb, encap_mtu - headroom);
> +}
> +
> #endif /* _NET_DST_H */
> --
> 2.19.1
>
* Re: [rtnetlink] Potential bug in Linux (rt)netlink code
From: Henning Rogge @ 2018-10-15 5:25 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20181012115159.7ead2f97@xeon-e3>
On 12.10.2018 at 20:51, Stephen Hemminger wrote:
> On Fri, 12 Oct 2018 09:30:40 +0200
> Henning Rogge <henning.rogge@fkie.fraunhofer.de> wrote:
>
>> Hi,
>>
>> I am working on a self-written routing agent
>> (https://github.com/OLSR/OONF) and am stuck on a problem with netlink
>> that I cannot explain with a userspace error.
>>
>> I am using a netlink socket for setting routes
>> (RTM_NEWROUTE/RTM_DELROUTE), querying the kernel for the current routes
>> in the database (via a RTM_GETROUTE dump) and for getting multicast
>> messages for ongoing routing changes.
>>
>> After a few netlink messages I get to the point where the kernel just
>> does not respond to a RTM_NEWROUTE. No error, no answer (despite the
>> NLM_F_ACK flag being set)... but sometimes when (during shutdown of the
>> routing agent) the program sends another route command (most times a
>> RTM_DELROUTE) I get a single netlink packet with a "successful" response
>> for both the "missing" RTM_NEWROUTE and one for the new RTM_DELROUTE
>> sequence number.
>>
>> I am testing two routing agents, each of them in a systemd-nspawn based
>> container connected over a bridge on the host system on a current Debian
>> Testing (kernel 4.18.0-1-amd64).
>>
>> I am directly using the netlink sockets, without any other userspace
>> library in between.
>>
>> I have checked the hexdumps of a couple of netlink messages (including
>> the ones just before the bug happens) by hand and they seem to be okay.
>>
>> When I tried to add a "netlink listener" socket for further debugging (ip
>> link add nlmon0 type nlmon) the problem vanished until I removed the
>> listener socket again.
>>
>> Any ideas how to debug this problem? Unfortunately I have no short
>> example program to trigger the bug... I have rarely seen the problem for
>> years (once every couple of months), but until a few days ago I never
>> managed to reproduce it.
>>
>> Henning Rogge
>
> Are you reading the responses to your requests? If you don't read
> the response, the socket will get flow blocked.
Yes, I do...
all netlink sockets the program uses are constantly watched for traffic
coming from the kernel (with an epoll()-based event loop, no edge-trigger).
I even have a rate limitation towards the kernel, only sending a
"pagesize" full of netlink data towards the kernel, then waiting for the
reply before sending more (I had the blocking problem a few years ago
when experimenting with LOTS of routes).
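The rate limitation described above can be sketched as a small bookkeeping structure: a byte budget (for example one page) that is consumed when a request is sent and released when the matching ACK arrives. All names here (struct nl_window, nl_window_try_send(), nl_window_ack(), MAX_PENDING) are hypothetical and are not taken from the OONF sources; the sketch only tracks sequence numbers and sizes and does not touch a real netlink socket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PENDING 64              /* arbitrary cap for this sketch */

struct nl_window {
    size_t   limit;                 /* byte budget, e.g. the page size */
    size_t   in_flight;             /* bytes sent but not yet acknowledged */
    uint32_t seq[MAX_PENDING];      /* sequence numbers awaiting an ACK */
    size_t   len[MAX_PENDING];      /* size of each pending request */
    size_t   count;                 /* number of pending requests */
};

static void nl_window_init(struct nl_window *w, size_t limit)
{
    assert(w && limit > 0);
    memset(w, 0, sizeof(*w));
    w->limit = limit;
}

/* Returns true and records the request if it fits into the budget;
 * returns false if the caller has to wait for ACKs first. */
static bool nl_window_try_send(struct nl_window *w, uint32_t seq, size_t len)
{
    assert(len > 0 && len <= w->limit);
    if (w->count == MAX_PENDING || w->in_flight + len > w->limit)
        return false;
    w->seq[w->count] = seq;
    w->len[w->count] = len;
    w->count++;
    w->in_flight += len;
    return true;
}

/* Call when the kernel's ACK (an NLMSG_ERROR message with error code 0)
 * for `seq` arrives. Returns false if no such request was pending. */
static bool nl_window_ack(struct nl_window *w, uint32_t seq)
{
    for (size_t i = 0; i < w->count; i++) {
        if (w->seq[i] != seq)
            continue;
        w->in_flight -= w->len[i];
        w->count--;
        w->seq[i] = w->seq[w->count];   /* fill the hole with the last entry */
        w->len[i] = w->len[w->count];
        return true;
    }
    return false;
}
```

A structure like this also makes the symptom from the original report easy to spot: a sequence number that stays in the pending list while later ones are acknowledged is a request the kernel never answered.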
Henning Rogge
--
Diplom-Informatiker Henning Rogge , Fraunhofer-Institut für
Kommunikation, Informationsverarbeitung und Ergonomie FKIE
Kommunikationssysteme (KOM)
Zanderstrasse 5, 53177 Bonn, Germany
Telefon +49 228 50212-469
mailto:henning.rogge@fkie.fraunhofer.de http://www.fkie.fraunhofer.de
* [PATCH net,stable 1/1] net: fec: don't dump RX FIFO register when not available
From: Andy Duan @ 2018-10-15 5:19 UTC (permalink / raw)
To: davem@davemloft.net; +Cc: netdev@vger.kernel.org, tremyfr@gmail.com, Andy Duan
From: Fugang Duan <fugang.duan@nxp.com>
Commit db65f35f50e0 ("net: fec: add support of ethtool get_regs") introduced
the ethtool "--register-dump" interface to dump all FEC registers.
But not all silicon implementations of the Freescale FEC hardware module
have the FRBR (FIFO Receive Bound Register) and FRSR (FIFO Receive Start
Register) registers, so we should not try to dump them on those that
don't.
To fix this, create a quirk flag, FEC_QUIRK_HAS_FRREG, and check it before
dumping those RX FIFO registers.
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
---
drivers/net/ethernet/freescale/fec.h | 4 ++++
drivers/net/ethernet/freescale/fec_main.c | 16 ++++++++++++----
2 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index 4778b66..bf80855 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -452,6 +452,10 @@ struct bufdesc_ex {
* initialisation.
*/
#define FEC_QUIRK_MIB_CLEAR (1 << 15)
+/* Only i.MX25/i.MX27/i.MX28 controller supports FRBR,FRSR registers,
+ * those FIFO receive registers are resolved in other platforms.
+ */
+#define FEC_QUIRK_HAS_FRREG (1 << 16)
struct bufdesc_prop {
int qid;
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index a17cc97..6db69ba 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -91,14 +91,16 @@
.driver_data = 0,
}, {
.name = "imx25-fec",
- .driver_data = FEC_QUIRK_USE_GASKET | FEC_QUIRK_MIB_CLEAR,
+ .driver_data = FEC_QUIRK_USE_GASKET | FEC_QUIRK_MIB_CLEAR |
+ FEC_QUIRK_HAS_FRREG,
}, {
.name = "imx27-fec",
- .driver_data = FEC_QUIRK_MIB_CLEAR,
+ .driver_data = FEC_QUIRK_MIB_CLEAR | FEC_QUIRK_HAS_FRREG,
}, {
.name = "imx28-fec",
.driver_data = FEC_QUIRK_ENET_MAC | FEC_QUIRK_SWAP_FRAME |
- FEC_QUIRK_SINGLE_MDIO | FEC_QUIRK_HAS_RACC,
+ FEC_QUIRK_SINGLE_MDIO | FEC_QUIRK_HAS_RACC |
+ FEC_QUIRK_HAS_FRREG,
}, {
.name = "imx6q-fec",
.driver_data = FEC_QUIRK_ENET_MAC | FEC_QUIRK_HAS_GBIT |
@@ -2162,7 +2164,13 @@ static void fec_enet_get_regs(struct net_device *ndev,
memset(buf, 0, regs->len);
for (i = 0; i < ARRAY_SIZE(fec_enet_register_offset); i++) {
- off = fec_enet_register_offset[i] / 4;
+ off = fec_enet_register_offset[i];
+
+ if ((off == FEC_R_BOUND || off == FEC_R_FSTART) &&
+ !(fep->quirks & FEC_QUIRK_HAS_FRREG))
+ continue;
+
+ off >>= 2;
buf[off] = readl(&theregs[off]);
}
}
--
1.9.1
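The loop changed in fec_enet_get_regs() can be modelled in userspace as follows. This is only a sketch: dump_regs() is a hypothetical stand-in, plain array reads replace readl(), and the two byte offsets are assumed values chosen for illustration (the real definitions live in fec.h). Only the bit position of the quirk flag is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUIRK_HAS_FRREG (1u << 16)  /* same bit as FEC_QUIRK_HAS_FRREG above */

/* Assumed byte offsets, for illustration only. */
#define REG_R_BOUND  0x14c
#define REG_R_FSTART 0x150

/* Copies the registers listed in `offsets` (byte offsets) from `regs` into
 * `buf`, skipping the RX FIFO registers unless the quirk flag is set.
 * Returns the number of registers actually copied. */
static size_t dump_regs(const uint32_t *regs, uint32_t *buf,
                        const uint32_t *offsets, size_t n, uint32_t quirks)
{
    size_t dumped = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t off = offsets[i];

        assert(off % 4 == 0);       /* registers are 32 bits wide */

        if ((off == REG_R_BOUND || off == REG_R_FSTART) &&
            !(quirks & QUIRK_HAS_FRREG))
            continue;

        off >>= 2;                  /* byte offset -> word index */
        buf[off] = regs[off];
        dumped++;
    }
    return dumped;
}
```

Note that the comparison has to happen on the byte offset, before the shift, which is why the patch moves the division by four after the quirk check.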
* Re: [PATCH 2/8] usbnet: smsc95xx: add kconfig for turbo mode
From: Bjørn Mork @ 2018-10-15 12:48 UTC (permalink / raw)
To: Ben Dooks; +Cc: netdev, oneukum, davem, linux-usb, linux-kernel, linux-kernel
In-Reply-To: <20181012083405.19246-3-ben.dooks@codethink.co.uk>
Ben Dooks <ben.dooks@codethink.co.uk> writes:
> Add a configuration option for the default state of turbo mode
> on the smsc95xx networking driver. Some systems it is better
> to default this to off as it causes significant increases in
> soft-irq load.
So there is already a module option allowing you to change this, using
e.g. the kernel command line or kmod config files. It's even writable,
taking effect on the next netdev open, so you can change it at runtime
without reloading the module.
What good does this new build-time setting do, except causing confusion
wrt driver defaults?
Note also that the smsc95xx and smsc75xx drivers are pretty similar.
Both have the same turbo_mode setting. If you change the defaults, then
they should at least be kept in sync to cause as little confusion as
possible.
Bjørn
* Re: net/wan: hostess_sv11 + z85230 problems
From: Krzysztof Hałasa @ 2018-10-15 12:41 UTC (permalink / raw)
To: Alan Cox; +Cc: Randy Dunlap, netdev@vger.kernel.org, LKML
In-Reply-To: <20181015112910.4504cd3a@alans-desktop>
Alan Cox <gnomes@lxorguk.ukuu.org.uk> writes:
>> BTW Hostess SV11 is apparently an ISA card, with all those problems.
>
> Actually it worked perfectly well of old but people kept changing it
> who didn't have hardware.
Right, that's the same with all of this hardware.
In fact I meant all the ISA problems (in comparison to e.g. PCI),
not of any particular design.
--
Krzysztof Halasa
Industrial Research Institute for Automation and Measurements PIAP
Al. Jerozolimskie 202, 02-486 Warsaw, Poland
* Re: [PATCH net-next v4] net/ncsi: Add NCSI Broadcom OEM command
From: Samuel Mendoza-Jonas @ 2018-10-15 3:51 UTC (permalink / raw)
To: Vijay Khemka, David S. Miller, netdev, linux-kernel; +Cc: linux-aspeed, openbmc
In-Reply-To: <69cae9d44cbebb2cd4f468dc710d6a97210af835.camel@mendozajonas.com>
On Mon, 2018-10-15 at 13:08 +1100, Samuel Mendoza-Jonas wrote:
> On Fri, 2018-10-12 at 11:20 -0700, Vijay Khemka wrote:
> > This patch adds OEM Broadcom commands and response handling. It also
> > defines OEM Get MAC Address handler to get and configure the device.
> >
> > ncsi_oem_gma_handler_bcm: This handler send NCSI broadcom command for
> > getting mac address.
> > ncsi_rsp_handler_oem_bcm: This handles response received for all
> > broadcom OEM commands.
> > ncsi_rsp_handler_oem_bcm_gma: This handles get mac address response and
> > set it to device.
> >
> > Signed-off-by: Vijay Khemka <vijaykhemka@fb.com>
> > ---
> > v4: updated as per comment from Sam, I was just wondering if I can remove
> > NCSI_OEM_CMD_GET_MAC config option and let this code be valid always and
> > it will configure mac address if there is get mac address handler for given
> > manufacture id.
>
> Hi Vijay,
>
> We can look at handling this a different way, but I don't think we want
> to unconditionally set the system's MAC address based on the OEM GMA
> command. If the user wants to set a custom MAC address, or in the case of
> OpenBMC for example who have their MAC address saved in flash, this will
> override that value with whatever the Network Controller has saved. In
> particular as it is set up it will override any MAC address every time a
> channel is configured, such as during a failover event.
>
> We *could* always send the GMA command if it is available and move the
> decision whether to use the resulting address or not into the response
> handler. That would simplify the ncsi_configure_channel() logic a bit.
> Another idea may be to have a Netlink command to tell NCSI to ignore the
> GMA result; then we could drop the config option and the system can
> safely change the address if desired.
>
> Any thoughts? I'll also ping some of the OpenBMC people and see what
> their expectations are.
After a bit of a think and an ask around, to quote a colleague:
> I think we'd want it handled (overall) like any other net device; the MAC
> address in the device's ROM provides a default, and is overridden by anything
> specified by userspace
Which describes what I was thinking pretty well.
So if we can have it such that the NCSI driver only sets the MAC address
_once_, and after that does not update it again, we should be able to call
the OEM GMA command without hiding it behind a config option. So the first time
a channel is configured we store and set the MAC address given, but on
later configure events we don't continue to update it. What do you think?
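To illustrate the "set once" idea, here is a small userspace sketch. The structure fake_ncsi_dev, the gma_applied flag and both functions are hypothetical names invented for this example; nothing here is existing NCSI driver code, and how userspace changes would really reach the driver is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

struct fake_ncsi_dev {
    uint8_t mac[MAC_LEN];
    bool    gma_applied;    /* hypothetical: set after the first GMA response */
};

/* Applies the address from an OEM GMA response only the first time.
 * Returns true if the address was applied, false if it was ignored. */
static bool handle_gma_response(struct fake_ncsi_dev *dev,
                                const uint8_t gma_mac[MAC_LEN])
{
    assert(dev && gma_mac);

    if (dev->gma_applied)
        return false;

    memcpy(dev->mac, gma_mac, MAC_LEN);
    dev->gma_applied = true;
    return true;
}

/* Stand-in for userspace overriding the address, as for any other netdev. */
static void set_mac_from_userspace(struct fake_ncsi_dev *dev,
                                   const uint8_t mac[MAC_LEN])
{
    assert(dev && mac);
    memcpy(dev->mac, mac, MAC_LEN);
}
```

With this shape, a later channel configuration (a failover event, say) can still send the GMA command, but the response no longer overwrites an address that userspace has set in the meantime.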
Cheers,
Sam
>
> > +#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC)
> > +
> > +/* NCSI OEM Command APIs */
> > +static void ncsi_oem_gma_handler_bcm(struct ncsi_cmd_arg *nca)
> > +{
> > + unsigned char data[NCSI_OEM_BCM_CMD_GMA_LEN];
> > + int ret = 0;
> > +
> > + nca->payload = NCSI_OEM_BCM_CMD_GMA_LEN;
> > +
> > + memset(data, 0, NCSI_OEM_BCM_CMD_GMA_LEN);
> > + *(unsigned int *)data = ntohl(NCSI_OEM_MFR_BCM_ID);
> > + data[5] = NCSI_OEM_BCM_CMD_GMA;
> > +
> > + nca->data = data;
> > +
> > + ret = ncsi_xmit_cmd(nca);
> > + if (ret)
> > + netdev_err(nca->ndp->ndev.dev,
> > + "NCSI: Failed to transmit cmd 0x%x during configure\n",
> > + nca->type);
> > +}
>
> As a side note while unlikely we probably want to propagate the return
> value of ncsi_xmit_cmd() from here; otherwise we'll miss a failure and
> the configure process will stall.
>
> Regards,
> Sam
>
* [PATCH -next] fore200e: fix missing unlock on error in bsq_audit()
From: Wei Yongjun @ 2018-10-15 3:07 UTC (permalink / raw)
To: Chas Williams, Christoph Hellwig
Cc: Wei Yongjun, linux-atm-general, netdev, kernel-janitors
Add the missing unlock before return from function bsq_audit()
in the error handling case.
Fixes: 1d9d8be91788 ("fore200e: check for dma mapping failures")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
---
drivers/atm/fore200e.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/atm/fore200e.c b/drivers/atm/fore200e.c
index 2b5dc8f..ffc07ab 100644
--- a/drivers/atm/fore200e.c
+++ b/drivers/atm/fore200e.c
@@ -1606,6 +1606,7 @@ int bsq_audit(int where, struct host_bsq* bsq, int scheme, int magn)
if (dma_mapping_error(fore200e->dev, tpd->tsd[0].buffer)) {
if (tx_copy)
kfree(data);
+ spin_unlock_irqrestore(&fore200e->q_lock, flags);
return -ENOMEM;
}
tpd->tsd[ 0 ].length = tx_len;
* [PATCH net-next] rxrpc: Add /proc/net/rxrpc/peers to display peer list
From: David Howells @ 2018-10-15 10:31 UTC (permalink / raw)
To: netdev; +Cc: dhowells, linux-afs, linux-kernel
Add /proc/net/rxrpc/peers to display the list of peers currently active.
Signed-off-by: David Howells <dhowells@redhat.com>
---
net/rxrpc/ar-internal.h | 1
net/rxrpc/net_ns.c | 3 +
net/rxrpc/proc.c | 126 +++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 130 insertions(+)
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 8cee7644965c..0a7c49e8e053 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -1062,6 +1062,7 @@ void rxrpc_put_peer(struct rxrpc_peer *);
*/
extern const struct seq_operations rxrpc_call_seq_ops;
extern const struct seq_operations rxrpc_connection_seq_ops;
+extern const struct seq_operations rxrpc_peer_seq_ops;
/*
* recvmsg.c
diff --git a/net/rxrpc/net_ns.c b/net/rxrpc/net_ns.c
index 417d80867c4f..fd7eba8467fa 100644
--- a/net/rxrpc/net_ns.c
+++ b/net/rxrpc/net_ns.c
@@ -102,6 +102,9 @@ static __net_init int rxrpc_init_net(struct net *net)
proc_create_net("conns", 0444, rxnet->proc_net,
&rxrpc_connection_seq_ops,
sizeof(struct seq_net_private));
+ proc_create_net("peers", 0444, rxnet->proc_net,
+ &rxrpc_peer_seq_ops,
+ sizeof(struct seq_net_private));
return 0;
err_proc:
diff --git a/net/rxrpc/proc.c b/net/rxrpc/proc.c
index 9805e3b85c36..c7d976859d40 100644
--- a/net/rxrpc/proc.c
+++ b/net/rxrpc/proc.c
@@ -212,3 +212,129 @@ const struct seq_operations rxrpc_connection_seq_ops = {
.stop = rxrpc_connection_seq_stop,
.show = rxrpc_connection_seq_show,
};
+
+/*
+ * generate a list of extant virtual peers in /proc/net/rxrpc/peers
+ */
+static int rxrpc_peer_seq_show(struct seq_file *seq, void *v)
+{
+ struct rxrpc_peer *peer;
+ time64_t now;
+ char lbuff[50], rbuff[50];
+
+ if (v == SEQ_START_TOKEN) {
+ seq_puts(seq,
+ "Proto Local "
+ " Remote "
+ " Use CW MTU LastUse RTT Rc\n"
+ );
+ return 0;
+ }
+
+ peer = list_entry(v, struct rxrpc_peer, hash_link);
+
+ sprintf(lbuff, "%pISpc", &peer->local->srx.transport);
+
+ sprintf(rbuff, "%pISpc", &peer->srx.transport);
+
+ now = ktime_get_seconds();
+ seq_printf(seq,
+ "UDP %-47.47s %-47.47s %3u"
+ " %3u %5u %6llus %12llu %2u\n",
+ lbuff,
+ rbuff,
+ atomic_read(&peer->usage),
+ peer->cong_cwnd,
+ peer->mtu,
+ now - peer->last_tx_at,
+ peer->rtt,
+ peer->rtt_cursor);
+
+ return 0;
+}
+
+static void *rxrpc_peer_seq_start(struct seq_file *seq, loff_t *_pos)
+ __acquires(rcu)
+{
+ struct rxrpc_net *rxnet = rxrpc_net(seq_file_net(seq));
+ unsigned int bucket, n;
+ unsigned int shift = 32 - HASH_BITS(rxnet->peer_hash);
+ void *p;
+
+ rcu_read_lock();
+
+ if (*_pos >= UINT_MAX)
+ return NULL;
+
+ n = *_pos & ((1U << shift) - 1);
+ bucket = *_pos >> shift;
+ for (;;) {
+ if (bucket >= HASH_SIZE(rxnet->peer_hash)) {
+ *_pos = UINT_MAX;
+ return NULL;
+ }
+ if (n == 0) {
+ if (bucket == 0)
+ return SEQ_START_TOKEN;
+ *_pos += 1;
+ n++;
+ }
+
+ p = seq_hlist_start_rcu(&rxnet->peer_hash[bucket], n - 1);
+ if (p)
+ return p;
+ bucket++;
+ n = 1;
+ *_pos = (bucket << shift) | n;
+ }
+}
+
+static void *rxrpc_peer_seq_next(struct seq_file *seq, void *v, loff_t *_pos)
+{
+ struct rxrpc_net *rxnet = rxrpc_net(seq_file_net(seq));
+ unsigned int bucket, n;
+ unsigned int shift = 32 - HASH_BITS(rxnet->peer_hash);
+ void *p;
+
+ if (*_pos >= UINT_MAX)
+ return NULL;
+
+ bucket = *_pos >> shift;
+
+ p = seq_hlist_next_rcu(v, &rxnet->peer_hash[bucket], _pos);
+ if (p)
+ return p;
+
+ for (;;) {
+ bucket++;
+ n = 1;
+ *_pos = (bucket << shift) | n;
+
+ if (bucket >= HASH_SIZE(rxnet->peer_hash)) {
+ *_pos = UINT_MAX;
+ return NULL;
+ }
+ if (n == 0) {
+ *_pos += 1;
+ n++;
+ }
+
+ p = seq_hlist_start_rcu(&rxnet->peer_hash[bucket], n - 1);
+ if (p)
+ return p;
+ }
+}
+
+static void rxrpc_peer_seq_stop(struct seq_file *seq, void *v)
+ __releases(rcu)
+{
+ rcu_read_unlock();
+}
+
+
+const struct seq_operations rxrpc_peer_seq_ops = {
+ .start = rxrpc_peer_seq_start,
+ .next = rxrpc_peer_seq_next,
+ .stop = rxrpc_peer_seq_stop,
+ .show = rxrpc_peer_seq_show,
+};
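The iterator above packs a hash bucket number and an in-bucket counter into the seq_file position. A standalone sketch of that encoding is given below; SKETCH_HASH_BITS = 10 is an assumed value (the patch derives the width from HASH_BITS(rxnet->peer_hash)), and struct seq_pos and both helper names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HASH_BITS 10     /* assumed width of the peer hash table */

struct seq_pos {
    unsigned int bucket;        /* hash bucket index */
    unsigned int n;             /* 0 = header token, k = (k-1)th bucket entry */
};

/* pos = (bucket << shift) | n, with shift = 32 - hash bits, as in the patch. */
static uint32_t seq_pos_encode(unsigned int bucket, unsigned int n)
{
    const unsigned int shift = 32 - SKETCH_HASH_BITS;

    assert(bucket < (1u << SKETCH_HASH_BITS));
    assert(n < (1u << shift));
    return ((uint32_t)bucket << shift) | n;
}

static struct seq_pos seq_pos_decode(uint32_t pos)
{
    const unsigned int shift = 32 - SKETCH_HASH_BITS;
    struct seq_pos p;

    p.bucket = pos >> shift;
    p.n = pos & ((1u << shift) - 1);
    return p;
}
```

The top bits select the bucket and the low bits count entries within it, so advancing to the next bucket amounts to incrementing the bucket field and resetting n to 1.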
* Re: [PATCH net-next V2 6/8] vhost: packed ring support
From: Michael S. Tsirkin @ 2018-10-15 10:25 UTC (permalink / raw)
To: Jason Wang
Cc: Tiwei Bie, kvm, virtualization, netdev, linux-kernel, wexu,
jfreimann, maxime.coquelin
In-Reply-To: <ee3785be-b309-c4ae-e959-737018c21464@redhat.com>
On Mon, Oct 15, 2018 at 10:51:06AM +0800, Jason Wang wrote:
>
>
> On 2018/10/15 10:43, Michael S. Tsirkin wrote:
> > On Mon, Oct 15, 2018 at 10:22:33AM +0800, Jason Wang wrote:
> > >
> > > On 2018/10/13 01:23, Michael S. Tsirkin wrote:
> > > > On Fri, Oct 12, 2018 at 10:32:44PM +0800, Tiwei Bie wrote:
> > > > > On Mon, Jul 16, 2018 at 11:28:09AM +0800, Jason Wang wrote:
> > > > > [...]
> > > > > > @@ -1367,10 +1397,48 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
> > > > > > vq->last_avail_idx = s.num;
> > > > > > /* Forget the cached index value. */
> > > > > > vq->avail_idx = vq->last_avail_idx;
> > > > > > + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) {
> > > > > > + vq->last_avail_wrap_counter = wrap_counter;
> > > > > > + vq->avail_wrap_counter = vq->last_avail_wrap_counter;
> > > > > > + }
> > > > > > break;
> > > > > > case VHOST_GET_VRING_BASE:
> > > > > > s.index = idx;
> > > > > > s.num = vq->last_avail_idx;
> > > > > > + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
> > > > > > + s.num |= vq->last_avail_wrap_counter << 31;
> > > > > > + if (copy_to_user(argp, &s, sizeof(s)))
> > > > > > + r = -EFAULT;
> > > > > > + break;
> > > > > > + case VHOST_SET_VRING_USED_BASE:
> > > > > > + /* Moving base with an active backend?
> > > > > > + * You don't want to do that.
> > > > > > + */
> > > > > > + if (vq->private_data) {
> > > > > > + r = -EBUSY;
> > > > > > + break;
> > > > > > + }
> > > > > > + if (copy_from_user(&s, argp, sizeof(s))) {
> > > > > > + r = -EFAULT;
> > > > > > + break;
> > > > > > + }
> > > > > > + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) {
> > > > > > + wrap_counter = s.num >> 31;
> > > > > > + s.num &= ~(1 << 31);
> > > > > > + }
> > > > > > + if (s.num > 0xffff) {
> > > > > > + r = -EINVAL;
> > > > > > + break;
> > > > > > + }
> > > > > Do we want to put wrap_counter at bit 15?
> > > > I think I second that - seems to be consistent with
> > > > e.g. event suppression structure and the proposed
> > > > extension to driver notifications.
> > > Ok, I assumed packed virtqueues support 64K entries, but it looks like
> > > they do not. I can change it to bit 15, and GET_VRING_BASE needs to be
> > > changed as well.
> > >
> > > >
> > > > > If put wrap_counter at bit 31, the check (s.num > 0xffff)
> > > > > won't be able to catch the illegal index 0x8000~0xffff for
> > > > > packed ring.
> > > > >
> > > Do we need to clarify this in the spec?
> > Isn't this all internal vhost stuff?
>
> I meant the illegal index 0x8000-0xffff.
It does say packed virtqueues support up to 2**15 entries each.
But yes we can add a requirement that devices do not expose
larger rings. Split does not support 2**16 either, right?
With 2**16 entries the avail index becomes 0 and the ring looks empty.
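To make the bit 15 versus bit 31 discussion concrete, here is a userspace sketch of both layouts. The helper names and macros are hypothetical and this is not vhost code; it only assumes, as discussed in this thread, that a packed ring index is at most 0x7fff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PACKED_IDX_MASK 0x7fffu     /* 2**15 entries -> indices 0..0x7fff */
#define PACKED_WRAP_BIT 15

/* Bit-15 layout: the wrap counter sits directly above the 15-bit index. */
static uint32_t vring_base_pack(uint16_t idx, bool wrap)
{
    assert(idx <= PACKED_IDX_MASK);
    return (uint32_t)idx | ((uint32_t)wrap << PACKED_WRAP_BIT);
}

/* With the bit-15 layout a single "num > 0xffff" check is enough, because
 * every value up to 0xffff decodes to a legal index plus a wrap bit. */
static bool vring_base_unpack(uint32_t num, uint16_t *idx, bool *wrap)
{
    if (num > 0xffff)
        return false;
    *wrap = (num >> PACKED_WRAP_BIT) & 1;
    *idx = (uint16_t)(num & PACKED_IDX_MASK);
    return true;
}

/* Bit-31 layout as in the quoted patch: strip bit 31, then apply the same
 * range check. Returns true if the value would be accepted. */
static bool bit31_layout_accepts(uint32_t num)
{
    num &= ~(1u << 31);
    return num <= 0xffff;
}
```

The bit-31 variant accepts 0x8000 (and 0x80008000) even though 0x8000 is not a legal index for a ring of at most 2**15 entries, which is the gap pointed out in the review.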
>
> >
> > > > > > + vq->last_used_idx = s.num;
> > > > > > + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
> > > > > > + vq->last_used_wrap_counter = wrap_counter;
> > > > > > + break;
> > > > > > + case VHOST_GET_VRING_USED_BASE:
> > > > > Do we need the new VHOST_GET_VRING_USED_BASE and
> > > > > VHOST_SET_VRING_USED_BASE ops?
> > > > >
> > > > > We are going to merge below series in DPDK:
> > > > >
> > > > > http://patches.dpdk.org/patch/45874/
> > > > >
> > > > > We may need to reach an agreement first.
> > > If we agree that 64K virtqueue won't be supported, I'm ok with either.
> > Well the spec says right at the beginning:
> >
> > Packed virtqueues support up to 2**15 entries each.
>
> Ok. I get it.
>
> Then I can change vhost to match what dpdk did.
>
> Thanks
>
> >
> >
> > > Btw the code assumes used_wrap_counter is equal to avail_wrap_counter which
> > > looks wrong?
> > >
> > > Thanks
> > >
> > > > > > + s.index = idx;
> > > > > > + s.num = vq->last_used_idx;
> > > > > > + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
> > > > > > + s.num |= vq->last_used_wrap_counter << 31;
> > > > > > if (copy_to_user(argp, &s, sizeof s))
> > > > > > r = -EFAULT;
> > > > > > break;
> > > > > [...]
* Re: [RFC] VSOCK: The performance problem of vhost_vsock.
From: Jason Wang @ 2018-10-15 2:33 UTC (permalink / raw)
To: jiangyiwen, stefanha; +Cc: kvm, virtualization, netdev
In-Reply-To: <5BC3F0D4.60409@huawei.com>
On 2018/10/15 09:43, jiangyiwen wrote:
> Hi Stefan & All:
>
> Now I find vhost-vsock has two performance problems, even though it
> is not designed for performance.
>
> First, I thought vhost-vsock should be faster than vhost-net because it
> has no TCP/IP stack, but in my tests vhost-net is 5~10 times faster
> than vhost-vsock. I am currently looking for the reason.
TCP/IP is not a must for vhost-net.
How do you test and compare the performance?
Thanks
> Second, vhost-vsock only supports two vqs (tx and rx), which means
> that multiple sockets in the guest all use the same vq to transmit
> messages and receive responses. So if there are multiple applications
> in the guest, we should support a "Multiqueue" feature for virtio-vsock.
>
> Stefan, have you encountered these problems?
>
> Thanks,
> Yiwen.
>
* Re: [PATCH] virtio_net: enable tx after resuming from suspend
From: ake @ 2018-10-15 10:08 UTC (permalink / raw)
To: Jason Wang
Cc: Michael S. Tsirkin, David S. Miller, virtualization, netdev,
linux-kernel
In-Reply-To: <1aff0ad2-9d63-6d38-6b25-5c681eafdfb2@igel.co.jp>
On 2018/10/12 18:18, ake wrote:
>
>
> On 2018/10/12 17:23, Jason Wang wrote:
>>
>>
>> On 2018/10/12 12:30, ake wrote:
>>>
>>> On 2018/10/11 22:06, Jason Wang wrote:
>>>>
>>>> On 2018/10/11 18:22, ake wrote:
>>>>> On 2018/10/11 18:44, Jason Wang wrote:
>>>>>> On 2018/10/11 15:51, Ake Koomsin wrote:
>>>>>>> commit 713a98d90c5e ("virtio-net: serialize tx routine during reset")
>>>>>>> disabled the virtio tx before going to suspend to avoid a use after
>>>>>>> free.
>>>>>>> However, after resuming, it causes the virtio_net device to lose its
>>>>>>> network connectivity.
>>>>>>>
>>>>>>> To solve the issue, we need to enable tx after resuming.
>>>>>>>
>>>>>>> Fixes commit 713a98d90c5e ("virtio-net: serialize tx routine during
>>>>>>> reset")
>>>>>>> Signed-off-by: Ake Koomsin <ake@igel.co.jp>
>>>>>>> ---
>>>>>>> drivers/net/virtio_net.c | 1 +
>>>>>>> 1 file changed, 1 insertion(+)
>>>>>>>
>>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>>>> index dab504ec5e50..3453d80f5f81 100644
>>>>>>> --- a/drivers/net/virtio_net.c
>>>>>>> +++ b/drivers/net/virtio_net.c
>>>>>>> @@ -2256,6 +2256,7 @@ static int virtnet_restore_up(struct
>>>>>>> virtio_device *vdev)
>>>>>>> }
>>>>>>> netif_device_attach(vi->dev);
>>>>>>> + netif_start_queue(vi->dev);
>>>>>> I believe this is duplicated with netif_tx_wake_all_queues() in
>>>>>> netif_device_attach() above?
>>>>> Thank you for your review.
>>>>>
>>>>> If both netif_tx_wake_all_queues() and netif_start_queue() result in
>>>>> clearing __QUEUE_STATE_DRV_XOFF, then is it possible that some
>>>>> conditions in netif_device_attach() are not satisfied?
>>>> Yes, maybe. One case I can see now is when the device is down, in this
>>>> case netif_device_attach() won't try to wakeup the queue.
>>>>
>>>>> Without
>>>>> netif_start_queue(), the virtio_net device does not resume properly
>>>>> after waking up.
>>>> How do you trigger the issue? Just do suspend/resume?
>>> Yes, simply suspend and resume.
>>>
>>> Here is how I trigger the issue:
>>>
>>> 1) Start the Virtual Machine Manager GUI program.
>>> 2) Create a guest Linux OS. Make sure that the guest OS kernel is
>>> >= 4.12. Make sure that it uses virtio_net as its network device.
>>> In addition, make sure that the video adapter is VGA. Otherwise,
>>> waking up with the virtual power button does not work.
>>> 3) After installing the guest OS, log in, and test the network
>>> connectivity by pinging the host machine.
>>> 4) Suspend. After this, the screen is blank.
>>> 5) Resume by hitting the virtual power button. The login screen
>>> appears again.
>>> 6) Log in again. The guest loses its network connection.
>>>
>>> In my test:
>>> Guest: Ubuntu 16.04/18.04 with kernel 4.15.0-36-generic
>>> Host: Ubuntu 16.04 with kernel 4.15.0-36-generic/4.4.0-137-generic
>>
>> I can not reproduce this issue if virtio-net interface is up in guest
>> before the suspend. I'm using net-next.git and qemu master. But I do
>> reproduce when virtio-net interface is down in guest before suspend,
>> after resume, even if I make it up, the network is still lost.
>>
>> I think the interface is up in your case, but please confirm this.
>
> If you mean the interface state before I hit the suspend button,
> the answer is yes. The interface is up before I suspend the guest
> machine.
>
> Note that my current QEMU version is QEMU emulator version 2.5.0
> (Debian 1:2.5+dfsg-5ubuntu10.32).
>
> I will try with net-next.git and qemu master later and see if I can
> reproduce the issue.
Update. I tried with net-next and qemu master. Interestingly, the result
is different from yours. The network is lost even if the virtio_net
interface is up before suspending.
Host: Ubuntu 16.04 with net-next kernel (default configuration)
Guest: Ubuntu 18.04 with net-next kernel (default configuration)
Qemu: master
Qemu command:
qemu-system-x86_64 -cpu host -m 2048 -enable-kvm \
-bios /usr/share/OVMF/OVMF_CODE.fd \
-drive file=/var/lib/libvirt/images/virtio_test.qcow2,if=virtio \
-netdev user,id=hostnet0 \
-device virtio-net-pci,netdev=hostnet0 \
-device VGA,id=video0,vgamem_mb=16 \
-global PIIX4_PM.disable_s3=1 \
-global PIIX4_PM.disable_s4=1 -monitor stdio
>>>
>>>>> Is it better to report this as a bug first?
>>>> Nope, you're very welcome to post patch directly.
>>>>
>>>>> If I am to do more
>>>>> investigation, what areas should I look into?
>>>> As you've figured out, you can start with why netif_tx_wake_all_queues()
>>>> was not executed.
>>>>
>>>> (Btw, does the issue disappear if you move netif_tx_disable() under the
>>>> check of netif_running() in virtnet_freeze_down()?)
>>> The issue disappears if I move netif_tx_disable() under the check of
>>> netif_running() in virtnet_freeze_down(). Moving netif_tx_disable()
>>> is probably better as its logic is consistent with
>>> netif_device_attach() implementation. If you are OK with this idea,
>>> I will submit another patch.
>>
>> I think it helps for the case when the interface is down before suspend.
>> But it's still unclear why it helps even if the interface is up
>> (netif_running() is true).
>>
>> Please submit a patch, but we should figure out why it helps for an up
>> interface as well.
>>
I will think about the proper reason first.
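To make the asymmetry under discussion concrete, here is a small userspace
sketch. It is a toy model only, not kernel code: struct toy_netdev and all
toy_* functions are hypothetical stand-ins, and it assumes that
netif_device_detach()/netif_device_attach() touch the TX queues only when
netif_running() is true, while netif_tx_disable() in virtnet_freeze_down()
currently runs unconditionally. It also assumes that bringing the interface
up later does not restart the queues by itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct net_device: only the state this thread is about. */
struct toy_netdev {
	bool running;    /* models netif_running(): interface is up */
	bool present;    /* models the "device present" link state bit */
	bool tx_stopped; /* models "all TX queues are stopped" */
};

/* Models netif_device_attach(): queues are woken only if running. */
static void toy_device_attach(struct toy_netdev *dev)
{
	assert(dev != NULL);
	if (!dev->present) {
		dev->present = true;
		if (dev->running)
			dev->tx_stopped = false;
	}
}

/* Models the freeze path. With guard_tx_disable == false the TX disable
 * is unconditional (current behaviour as described in this thread); with
 * true it sits under the running check (the proposed change). */
static void toy_freeze_down(struct toy_netdev *dev, bool guard_tx_disable)
{
	assert(dev != NULL);
	/* models netif_device_detach() */
	if (dev->present) {
		dev->present = false;
		if (dev->running)
			dev->tx_stopped = true;
	}
	/* models netif_tx_disable() */
	if (!guard_tx_disable || dev->running)
		dev->tx_stopped = true;
}

/* Suspend, resume, then bring the interface up if it was down.
 * Returns true if TX is usable afterwards. */
static bool toy_tx_works_after_cycle(bool up_before_suspend,
				     bool guard_tx_disable)
{
	struct toy_netdev dev = {
		.running = up_before_suspend,
		.present = true,
		.tx_stopped = false,
	};

	toy_freeze_down(&dev, guard_tx_disable);
	toy_device_attach(&dev);
	dev.running = true; /* user brings the link up after resume */
	return !dev.tx_stopped;
}
```

In this model the unguarded variant loses TX only when the interface was
down before suspend, which matches Jason's reproduction. It does not
explain why the change also helps when the interface was up, so that part
still needs a proper explanation.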
>> Thanks
>>
>>>
>>>> Thanks
>>>>
>>>>> Best Regards
>>>>> Ake Koomsin
>>>>>
>>> Best Regards
>>
Best Regards
^ permalink raw reply
* [RFC] VSOCK: The performance problem of vhost_vsock.
From: jiangyiwen @ 2018-10-15 1:43 UTC (permalink / raw)
To: stefanha; +Cc: kvm, virtualization, netdev
Hi Stefan & All:
Now I find vhost-vsock has two performance problems, even though it
is not designed for performance.
First, I expected vhost-vsock to be faster than vhost-net because it
has no TCP/IP stack, but in my tests vhost-net outperforms vhost-vsock
by 5~10 times. I am currently looking for the reason.
Second, vhost-vsock only supports two vqs (tx and rx), which means
that multiple sockets in the guest all use the same vq to transmit
messages and get responses. So if there are multiple applications
in the guest, we should support a "Multiqueue" feature for virtio-vsock.
Stefan, have you encountered these problems?
Thanks,
Yiwen.
^ permalink raw reply
* Re: [PATCH net] r8169: Enable MSI-X on RTL8106e
From: Jian-Hong Pan @ 2018-10-15 8:51 UTC (permalink / raw)
To: David S. Miller
Cc: Heiner Kallweit, Realtek linux nic maintainers, Linux Netdev List,
Linux Kernel, Kai-Heng Feng, Linux Upstreaming Team
In-Reply-To: <CAPpJ_eduPv5RDNgUOw3K+XprZBfUVn_sSOLbDWjNtaGRssUgLA@mail.gmail.com>
2018-10-02 13:57 GMT+08:00 Jian-Hong Pan <jian-hong@endlessm.com>:
> David Miller <davem@davemloft.net> wrote on Tue, Oct 2, 2018 at 1:51 PM:
>>
>> From: Jian-Hong Pan <jian-hong@endlessm.com>
>> Date: Thu, 27 Sep 2018 12:09:48 +0800
>>
>> > However, there is a commit which resolves the drivers getting nothing in
>> > PCI BAR=4 after system resumes. It is 04cb3ae895d7 "PCI: Reprogram
>> > bridge prefetch registers on resume" by Daniel Drake.
>>
>> I don't see this upstream yet.
>
> It is in linux-next repository:
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=04cb3ae895d7efdc60f0fe17182b200a3da20f09
The commit has also been backported into Linux kernel 4.19-rc7 now.
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=083874549fdfefa629dfa752785e20427dde1511
Regards,
Jian-Hong Pan
^ permalink raw reply
* Re: net/wan: hostess_sv11 + z85230 problems
From: Krzysztof Hałasa @ 2018-10-15 8:20 UTC (permalink / raw)
To: Randy Dunlap; +Cc: netdev@vger.kernel.org, LKML, Alan Cox
In-Reply-To: <1468d110-4f92-da7a-21b5-36afcec3c94e@infradead.org>
Hi,
Randy Dunlap <rdunlap@infradead.org> writes:
> kernel 4.19-rc7, on i386, with NO wan/hdlc/hostess/z85230 hardware:
>
> modprobe hostess_sv11 + autoload of z85230 give:
BTW Hostess SV11 is apparently an ISA card, with all those problems.
> [ 3162.511877] Call Trace:
> [ 3162.511877] <IRQ>
> [ 3162.511877] dump_stack+0x58/0x7d
> [ 3162.511877] register_lock_class+0x4a3/0x4b0
> [ 3162.511877] ? native_sched_clock+0x2f/0x110
> [ 3162.511877] __lock_acquire.isra.26+0x46/0x770
> [ 3162.511877] ? sched_clock+0x8/0x10
> [ 3162.511877] lock_acquire+0x5c/0x80
> [ 3162.511877] ? z8530_interrupt+0x35/0x180 [z85230]
> [ 3162.511877] _raw_spin_lock+0x28/0x70
> [ 3162.511877] ? z8530_interrupt+0x35/0x180 [z85230]
> [ 3162.511877] z8530_interrupt+0x35/0x180 [z85230]
> [ 3162.511877] __handle_irq_event_percpu+0x35/0xe0
> [ 3162.511877] handle_irq_event_percpu+0x26/0x70
> [ 3162.511877] handle_irq_event+0x29/0x42
> [ 3162.511877] handle_fasteoi_irq+0x9b/0x170
> [ 3162.511877] ? handle_simple_irq+0x90/0x90
> [ 3162.511877] handle_irq+0xc3/0xee
> [ 3162.511877] </IRQ>
> [ 3162.511877] do_IRQ+0x51/0xe0
> [ 3162.511877] common_interrupt+0xe7/0xec
Looks like something triggered an IRQ, and the z8530 driver got
confused given the lack of z8530 hardware.
> [ 3162.511877] EAX: 00000246 EBX: 00000246 ECX: 00000002 EDX: 00000058
> [ 3162.511877] ESI: f400d8e4 EDI: 00000000 EBP: efbf9d74 ESP: efbf9d6c
> [ 3162.511877] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000246
> [ 3162.511877] __setup_irq+0x2f0/0x620
> [ 3162.511877] request_threaded_irq+0xcd/0x170
> [ 3162.511877] init_module+0xb0/0x270 [hostess_sv11]
I think the IRQ came as soon as it was requested (enabled).
Now the code does:
static struct z8530_dev *sv11_init(int iobase, int irq)
{
...
if (request_irq(irq, z8530_interrupt, 0,
"Hostess SV11", sv) < 0) {
pr_warn("IRQ %d already in use\n", irq);
goto err_irq;
}
...
disable_irq(irq);
and only then:
if (z8530_init(sv)) {
pr_err("Z8530 series device not found\n");
enable_irq(irq);
goto free_dma; /* this path also calls free_irq() */
}
Not sure about z8530 internals (driver and hw), but I guess the sv11
driver should initialize the hw first, and only then request_irq().
Perhaps there should be no "default address" either? The user would
have to provide the hardware parameters explicitly.
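The ordering argument can be illustrated with a small userspace sketch.
This is a toy model under stated assumptions, not driver code: struct
toy_card and the toy_* functions are hypothetical, the "interrupt" is just
a function call, and the only thing modelled is whether our handler can
run before the chip has been detected and initialised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_card {
	bool hw_present;        /* is there really a chip at the I/O base? */
	bool initialised;       /* set once the (toy) chip init succeeded */
	bool handler_installed; /* models a successful request_irq() */
	int bad_irqs;           /* handler runs seen before init finished */
};

/* Models an interrupt arriving on the line, whatever its source. */
static void toy_raise_irq(struct toy_card *c)
{
	assert(c != NULL);
	if (c->handler_installed && !c->initialised)
		c->bad_irqs++;
}

/* Models z8530_init(): succeeds only if the hardware exists. */
static bool toy_chip_init(struct toy_card *c)
{
	assert(c != NULL);
	c->initialised = c->hw_present;
	return c->initialised;
}

/* Current order: install the handler first, then probe the chip. */
static bool toy_init_irq_first(struct toy_card *c)
{
	c->handler_installed = true;
	toy_raise_irq(c); /* IRQ fires as soon as it is enabled */
	if (!toy_chip_init(c)) {
		c->handler_installed = false; /* models free_irq() */
		return false;
	}
	return true;
}

/* Proposed order: probe the chip first, install the handler last. */
static bool toy_init_hw_first(struct toy_card *c)
{
	toy_raise_irq(c); /* stray IRQ: our handler is not installed yet */
	if (!toy_chip_init(c))
		return false;
	c->handler_installed = true;
	return true;
}
```

With no hardware present, the first variant runs the handler against an
uninitialised device once, while the second never installs it at all.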
How about this (totally untested):
Fix the Hostess SV11 driver trying to use the hardware before its
existence is detected.
Signed-off-by: Krzysztof Halasa <khalasa@piap.pl>
diff --git a/drivers/net/wan/hostess_sv11.c b/drivers/net/wan/hostess_sv11.c
index 4de0737fbf8a..e8808449c9e5 100644
--- a/drivers/net/wan/hostess_sv11.c
+++ b/drivers/net/wan/hostess_sv11.c
@@ -216,15 +216,6 @@ static struct z8530_dev *sv11_init(int iobase, int irq)
outb(0, iobase + 4); /* DMA off */
- /* We want a fast IRQ for this device. Actually we'd like an even faster
- IRQ ;) - This is one driver RtLinux is made for */
-
- if (request_irq(irq, z8530_interrupt, 0,
- "Hostess SV11", sv) < 0) {
- pr_warn("IRQ %d already in use\n", irq);
- goto err_irq;
- }
-
sv->irq = irq;
sv->chanA.private = sv;
sv->chanA.dev = sv;
@@ -246,17 +237,12 @@ static struct z8530_dev *sv11_init(int iobase, int irq)
goto err_rxdma;
}
- /* Kill our private IRQ line the hostess can end up chattering
- until the configuration is set */
- disable_irq(irq);
-
/*
* Begin normal initialise
*/
if (z8530_init(sv)) {
pr_err("Z8530 series device not found\n");
- enable_irq(irq);
goto free_dma;
}
z8530_channel_load(&sv->chanB, z8530_dead_port);
@@ -265,12 +251,6 @@ static struct z8530_dev *sv11_init(int iobase, int irq)
else
z8530_channel_load(&sv->chanA, z8530_hdlc_kilostream_85230);
- enable_irq(irq);
-
- /*
- * Now we can take the IRQ
- */
-
sv->chanA.netdevice = netdev = alloc_hdlcdev(sv);
if (!netdev)
goto free_dma;
@@ -288,9 +268,21 @@ static struct z8530_dev *sv11_init(int iobase, int irq)
}
z8530_describe(sv, "I/O", iobase);
+
+ /* We want a fast IRQ for this device. Actually we'd like an even faster
+ IRQ ;) - This is one driver RtLinux is made for */
+
+ if (request_irq(irq, z8530_interrupt, 0,
+ "Hostess SV11", sv) < 0) {
+ pr_warn("IRQ %d already in use\n", irq);
+ goto err_irq;
+ }
+
sv->active = 1;
return sv;
+err_irq:
+ unregister_hdlc_device(netdev);
free_dma:
if (dma == 1)
free_dma(sv->chanA.rxdma);
@@ -298,8 +290,6 @@ static struct z8530_dev *sv11_init(int iobase, int irq)
if (dma)
free_dma(sv->chanA.txdma);
err_txdma:
- free_irq(irq, sv);
-err_irq:
kfree(sv);
err_kzalloc:
release_region(iobase, 8);
--
Krzysztof Halasa
Industrial Research Institute for Automation and Measurements PIAP
Al. Jerozolimskie 202, 02-486 Warsaw, Poland
^ permalink raw reply related
* INFO: task hung in genl_rcv_msg
From: syzbot @ 2018-10-15 8:06 UTC (permalink / raw)
To: davem, gregkh, keescook, kstewart, ktkhai, linux-kernel, netdev,
nicolas.dichtel, pombredanne, syzkaller-bugs
Hello,
syzbot found the following crash on:
HEAD commit: bab5c80b2110 Merge tag 'armsoc-fixes-4.19' of git://git.ke..
git tree: net
console output: https://syzkaller.appspot.com/x/log.txt?x=15462c41400000
kernel config: https://syzkaller.appspot.com/x/.config?x=88e9a8a39dc0be2d
dashboard link: https://syzkaller.appspot.com/bug?extid=c3b90a95b2d6bd4f29b1
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=135e41a5400000
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+c3b90a95b2d6bd4f29b1@syzkaller.appspotmail.com
IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
8021q: adding VLAN 0 to HW filter on device team0
8021q: adding VLAN 0 to HW filter on device team0
8021q: adding VLAN 0 to HW filter on device team0
INFO: task syz-executor0:6925 blocked for more than 140 seconds.
Not tainted 4.19.0-rc7+ #140
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor0 D24952 6925 5376 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2825 [inline]
__schedule+0x86c/0x1ed0 kernel/sched/core.c:3473
schedule+0xfe/0x460 kernel/sched/core.c:3517
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:3575
__mutex_lock_common kernel/locking/mutex.c:1002 [inline]
__mutex_lock+0xbe7/0x1700 kernel/locking/mutex.c:1072
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
genl_lock net/netlink/genetlink.c:33 [inline]
genl_rcv_msg+0x13a/0x168 net/netlink/genetlink.c:624
netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
genl_rcv+0x28/0x40 net/netlink/genetlink.c:637
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x5a5/0x760 net/netlink/af_netlink.c:1343
netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:631
___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
__sys_sendmsg+0x11d/0x280 net/socket.c:2154
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg net/socket.c:2161 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: Bad RIP value.
RSP: 002b:00007f805f83dc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457569
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000004
RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f805f83e6d4
R13: 00000000004c387d R14: 00000000004d56d0 R15: 00000000ffffffff
INFO: task syz-executor5:6923 blocked for more than 140 seconds.
Not tainted 4.19.0-rc7+ #140
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor5 D24952 6923 5384 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2825 [inline]
__schedule+0x86c/0x1ed0 kernel/sched/core.c:3473
schedule+0xfe/0x460 kernel/sched/core.c:3517
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:3575
__mutex_lock_common kernel/locking/mutex.c:1002 [inline]
__mutex_lock+0xbe7/0x1700 kernel/locking/mutex.c:1072
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
genl_lock net/netlink/genetlink.c:33 [inline]
genl_rcv_msg+0x13a/0x168 net/netlink/genetlink.c:624
netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
genl_rcv+0x28/0x40 net/netlink/genetlink.c:637
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x5a5/0x760 net/netlink/af_netlink.c:1343
netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:631
___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
__sys_sendmsg+0x11d/0x280 net/socket.c:2154
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg net/socket.c:2161 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: Bad RIP value.
RSP: 002b:00007f00193a4c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457569
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000004
RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f00193a56d4
R13: 00000000004c387d R14: 00000000004d56d0 R15: 00000000ffffffff
INFO: task syz-executor1:6930 blocked for more than 140 seconds.
Not tainted 4.19.0-rc7+ #140
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor1 D24952 6930 5377 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2825 [inline]
__schedule+0x86c/0x1ed0 kernel/sched/core.c:3473
schedule+0xfe/0x460 kernel/sched/core.c:3517
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:3575
__mutex_lock_common kernel/locking/mutex.c:1002 [inline]
__mutex_lock+0xbe7/0x1700 kernel/locking/mutex.c:1072
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
genl_lock net/netlink/genetlink.c:33 [inline]
genl_rcv_msg+0x13a/0x168 net/netlink/genetlink.c:624
netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
genl_rcv+0x28/0x40 net/netlink/genetlink.c:637
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x5a5/0x760 net/netlink/af_netlink.c:1343
netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:631
___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
__sys_sendmsg+0x11d/0x280 net/socket.c:2154
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg net/socket.c:2161 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: Bad RIP value.
RSP: 002b:00007fc5ec984c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457569
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000004
RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc5ec9856d4
R13: 00000000004c387d R14: 00000000004d56d0 R15: 00000000ffffffff
INFO: task syz-executor2:6940 blocked for more than 140 seconds.
Not tainted 4.19.0-rc7+ #140
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor2 D24952 6940 5382 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2825 [inline]
__schedule+0x86c/0x1ed0 kernel/sched/core.c:3473
schedule+0xfe/0x460 kernel/sched/core.c:3517
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:3575
__mutex_lock_common kernel/locking/mutex.c:1002 [inline]
__mutex_lock+0xbe7/0x1700 kernel/locking/mutex.c:1072
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
genl_lock net/netlink/genetlink.c:33 [inline]
genl_rcv_msg+0x13a/0x168 net/netlink/genetlink.c:624
netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
genl_rcv+0x28/0x40 net/netlink/genetlink.c:637
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x5a5/0x760 net/netlink/af_netlink.c:1343
netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:631
___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
__sys_sendmsg+0x11d/0x280 net/socket.c:2154
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg net/socket.c:2161 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: Bad RIP value.
RSP: 002b:00007f988ec35c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457569
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000004
RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f988ec366d4
R13: 00000000004c387d R14: 00000000004d56d0 R15: 00000000ffffffff
INFO: task syz-executor3:6942 blocked for more than 140 seconds.
Not tainted 4.19.0-rc7+ #140
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor3 D23080 6942 5380 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2825 [inline]
__schedule+0x86c/0x1ed0 kernel/sched/core.c:3473
schedule+0xfe/0x460 kernel/sched/core.c:3517
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:3575
__mutex_lock_common kernel/locking/mutex.c:1002 [inline]
__mutex_lock+0xbe7/0x1700 kernel/locking/mutex.c:1072
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
genl_lock net/netlink/genetlink.c:33 [inline]
genl_rcv_msg+0x13a/0x168 net/netlink/genetlink.c:624
netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
genl_rcv+0x28/0x40 net/netlink/genetlink.c:637
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x5a5/0x760 net/netlink/af_netlink.c:1343
netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:631
___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
__sys_sendmsg+0x11d/0x280 net/socket.c:2154
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg net/socket.c:2161 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: Bad RIP value.
RSP: 002b:00007f8ef4a35c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457569
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000004
RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f8ef4a366d4
R13: 00000000004c387d R14: 00000000004d56d0 R15: 00000000ffffffff
Showing all locks held in the system:
1 lock held by khungtaskd/982:
#0: 000000000bcea75c (rcu_read_lock){....}, at:
debug_show_all_locks+0xd0/0x424 kernel/locking/lockdep.c:4435
1 lock held by rsyslogd/5240:
#0: 00000000baacc0ba (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x1bb/0x200
fs/file.c:766
2 locks held by getty/5329:
#0: 00000000da399000 (&tty->ldisc_sem){++++}, at:
ldsem_down_read+0x32/0x40 drivers/tty/tty_ldsem.c:353
#1: 00000000078f6766 (&ldata->atomic_read_lock){+.+.}, at:
n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by getty/5330:
#0: 000000006658d2ab (&tty->ldisc_sem){++++}, at:
ldsem_down_read+0x32/0x40 drivers/tty/tty_ldsem.c:353
#1: 00000000a4e55645 (&ldata->atomic_read_lock){+.+.}, at:
n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by getty/5331:
#0: 00000000f3b48bf4 (&tty->ldisc_sem){++++}, at:
ldsem_down_read+0x32/0x40 drivers/tty/tty_ldsem.c:353
#1: 00000000e44b615e (&ldata->atomic_read_lock){+.+.}, at:
n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by getty/5332:
#0: 00000000005368f9 (&tty->ldisc_sem){++++}, at:
ldsem_down_read+0x32/0x40 drivers/tty/tty_ldsem.c:353
#1: 00000000717606d2 (&ldata->atomic_read_lock){+.+.}, at:
n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by getty/5333:
#0: 000000008f6095e9 (&tty->ldisc_sem){++++}, at:
ldsem_down_read+0x32/0x40 drivers/tty/tty_ldsem.c:353
#1: 00000000cbfa8653 (&ldata->atomic_read_lock){+.+.}, at:
n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by getty/5334:
#0: 00000000a7e4496b (&tty->ldisc_sem){++++}, at:
ldsem_down_read+0x32/0x40 drivers/tty/tty_ldsem.c:353
#1: 00000000e3981645 (&ldata->atomic_read_lock){+.+.}, at:
n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
2 locks held by getty/5335:
#0: 000000009ff3ef1e (&tty->ldisc_sem){++++}, at:
ldsem_down_read+0x32/0x40 drivers/tty/tty_ldsem.c:353
#1: 000000004c38513e (&ldata->atomic_read_lock){+.+.}, at:
n_tty_read+0x335/0x1ce0 drivers/tty/n_tty.c:2140
4 locks held by syz-executor4/6919:
2 locks held by syz-executor0/6925:
#0: 00000000455a9b7b (cb_lock){++++}, at: genl_rcv+0x19/0x40
net/netlink/genetlink.c:636
#1: 00000000f78598ee (genl_mutex){+.+.}, at: genl_lock
net/netlink/genetlink.c:33 [inline]
#1: 00000000f78598ee (genl_mutex){+.+.}, at: genl_rcv_msg+0x13a/0x168
net/netlink/genetlink.c:624
2 locks held by syz-executor5/6923:
#0: 00000000455a9b7b (cb_lock){++++}, at: genl_rcv+0x19/0x40
net/netlink/genetlink.c:636
#1: 00000000f78598ee (genl_mutex){+.+.}, at: genl_lock
net/netlink/genetlink.c:33 [inline]
#1: 00000000f78598ee (genl_mutex){+.+.}, at: genl_rcv_msg+0x13a/0x168
net/netlink/genetlink.c:624
2 locks held by syz-executor1/6930:
#0: 00000000455a9b7b (cb_lock){++++}, at: genl_rcv+0x19/0x40
net/netlink/genetlink.c:636
#1: 00000000f78598ee (genl_mutex){+.+.}, at: genl_lock
net/netlink/genetlink.c:33 [inline]
#1: 00000000f78598ee (genl_mutex){+.+.}, at: genl_rcv_msg+0x13a/0x168
net/netlink/genetlink.c:624
2 locks held by syz-executor2/6940:
#0: 00000000455a9b7b (cb_lock){++++}, at: genl_rcv+0x19/0x40
net/netlink/genetlink.c:636
#1: 00000000f78598ee (genl_mutex){+.+.}, at: genl_lock
net/netlink/genetlink.c:33 [inline]
#1: 00000000f78598ee (genl_mutex){+.+.}, at: genl_rcv_msg+0x13a/0x168
net/netlink/genetlink.c:624
2 locks held by syz-executor3/6942:
#0: 00000000455a9b7b (cb_lock){++++}, at: genl_rcv+0x19/0x40
net/netlink/genetlink.c:636
#1: 00000000f78598ee (genl_mutex){+.+.}, at: genl_lock
net/netlink/genetlink.c:33 [inline]
#1: 00000000f78598ee (genl_mutex){+.+.}, at: genl_rcv_msg+0x13a/0x168
net/netlink/genetlink.c:624
=============================================
NMI backtrace for cpu 1
CPU: 1 PID: 982 Comm: khungtaskd Not tainted 4.19.0-rc7+ #140
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
nmi_cpu_backtrace.cold.3+0x63/0xa2 lib/nmi_backtrace.c:101
nmi_trigger_cpumask_backtrace+0x1b3/0x1ed lib/nmi_backtrace.c:62
arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
trigger_all_cpu_backtrace include/linux/nmi.h:144 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:204 [inline]
watchdog+0xb3e/0x1050 kernel/hung_task.c:265
kthread+0x35a/0x420 kernel/kthread.c:246
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413
Sending NMI from CPU 1 to CPUs 0:
INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 1.662
msecs
NMI backtrace for cpu 0
CPU: 0 PID: 6919 Comm: syz-executor4 Not tainted 4.19.0-rc7+ #140
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:__rhashtable_lookup include/linux/rhashtable.h:481 [inline]
RIP: 0010:rhashtable_lookup include/linux/rhashtable.h:516 [inline]
RIP: 0010:rhashtable_lookup_fast include/linux/rhashtable.h:542 [inline]
RIP: 0010:tipc_sk_lookup+0x99e/0xff0 net/tipc/socket.c:2698
Code: 85 2b 06 00 00 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 03 4b
06 fa 0f b6 05 9c 06 76 02 31 ff 89 c6 88 85 d0 fd ff ff <e8> bd 4b 06 fa
0f b6 85 d0 fd ff ff 84 c0 0f 85 74 fc ff ff e8 d9
RSP: 0018:ffff8801cecde8c8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff8801cecdeb18 RCX: ffffffff87788ad8
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff8801cecdeb40 R08: ffff8801ce884040 R09: 1ffffffff1273965
R10: ffffed003b5c4732 R11: ffff8801dae23993 R12: ffff8801c89b2a00
R13: dffffc0000000000 R14: 0000000000000092 R15: 0000000000000001
FS: 00007fc2cfef9700(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffff600400 CR3: 00000001bc63a000 CR4: 00000000001406f0
Call Trace:
tipc_nl_publ_dump+0x22d/0xf9c net/tipc/socket.c:3502
__tipc_nl_compat_dumpit.isra.11+0x25d/0xb50 net/tipc/netlink_compat.c:196
tipc_nl_compat_publ_dump net/tipc/netlink_compat.c:925 [inline]
tipc_nl_compat_sk_dump+0x88e/0xc50 net/tipc/netlink_compat.c:973
__tipc_nl_compat_dumpit.isra.11+0x389/0xb50 net/tipc/netlink_compat.c:205
tipc_nl_compat_dumpit+0x1f4/0x440 net/tipc/netlink_compat.c:270
tipc_nl_compat_handle net/tipc/netlink_compat.c:1147 [inline]
tipc_nl_compat_recv+0x12b3/0x19a0 net/tipc/netlink_compat.c:1210
genl_family_rcv_msg+0x8a9/0x1140 net/netlink/genetlink.c:601
genl_rcv_msg+0xc6/0x168 net/netlink/genetlink.c:626
netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
genl_rcv+0x28/0x40 net/netlink/genetlink.c:637
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x5a5/0x760 net/netlink/af_netlink.c:1343
netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:631
___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
__sys_sendmsg+0x11d/0x280 net/socket.c:2154
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg net/socket.c:2161 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457569
Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fc2cfef8c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457569
RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000006
RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc2cfef96d4
R13: 00000000004c387d R14: 00000000004d56d0 R15: 00000000ffffffff
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches
^ permalink raw reply
* Re: KASAN: use-after-free Write in skb_release_data (2)
From: Dmitry Vyukov @ 2018-10-15 7:49 UTC (permalink / raw)
To: syzbot
Cc: David Miller, Alexey Kuznetsov, LKML, netdev, syzkaller-bugs,
Hideaki YOSHIFUJI
In-Reply-To: <0000000000003ba80905783e9189@google.com>
On Mon, Oct 15, 2018 at 8:30 AM, syzbot
<syzbot+580be3953ed99133804f@syzkaller.appspotmail.com> wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: e40a826a6cbc qed: Add support for virtual link.
> git tree: net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=168954a5400000
> kernel config: https://syzkaller.appspot.com/x/.config?x=7e7e2279c0020d5f
> dashboard link: https://syzkaller.appspot.com/bug?extid=580be3953ed99133804f
> compiler: gcc (GCC) 8.0.1 20180413 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+580be3953ed99133804f@syzkaller.appspotmail.com
>
> ==================================================================
> BUG: KASAN: use-after-free in atomic_sub_return
> include/asm-generic/atomic-instrumented.h:305 [inline]
> BUG: KASAN: use-after-free in skb_release_data+0x1a3/0x880
> net/core/skbuff.c:559
> Write of size 4 at addr ffff8801d8cb4120 by task syz-executor0/8395
>
> CPU: 0 PID: 8395 Comm: syz-executor0 Not tainted 4.19.0-rc6+ #254
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
> print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256
> kasan_report_error mm/kasan/report.c:354 [inline]
> kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
> check_memory_region_inline mm/kasan/kasan.c:260 [inline]
> check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
> kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278
> atomic_sub_return include/asm-generic/atomic-instrumented.h:305 [inline]
> skb_release_data+0x1a3/0x880 net/core/skbuff.c:559
> skb_release_all+0x4a/0x60 net/core/skbuff.c:627
> __kfree_skb net/core/skbuff.c:641 [inline]
> kfree_skb+0x1bb/0x580 net/core/skbuff.c:659
> sit_tunnel_xmit+0x173/0x30d0 net/ipv6/sit.c:1044
> __netdev_start_xmit include/linux/netdevice.h:4328 [inline]
> netdev_start_xmit include/linux/netdevice.h:4337 [inline]
> xmit_one net/core/dev.c:3219 [inline]
> dev_hard_start_xmit+0x295/0xc90 net/core/dev.c:3235
> __dev_queue_xmit+0x2f0d/0x3950 net/core/dev.c:3805
> dev_queue_xmit+0x17/0x20 net/core/dev.c:3838
> packet_sendmsg_spkt+0xf44/0x16f0 net/packet/af_packet.c:1975
> sock_sendmsg_nosec net/socket.c:621 [inline]
> sock_sendmsg+0xd5/0x120 net/socket.c:631
> ___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
> __sys_sendmsg+0x11d/0x280 net/socket.c:2154
> __do_sys_sendmsg net/socket.c:2163 [inline]
> __se_sys_sendmsg net/socket.c:2161 [inline]
> __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
> do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x457519
> Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff
> 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:00007f2a52d30c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457519
> RDX: 0000000000000000 RSI: 00000000200003c0 RDI: 0000000000000003
> RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2a52d316d4
> R13: 00000000004c34d5 R14: 00000000004d52c8 R15: 00000000ffffffff
>
> Allocated by task 8395:
> save_stack+0x43/0xd0 mm/kasan/kasan.c:448
> set_track mm/kasan/kasan.c:460 [inline]
> kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
> __do_kmalloc_node mm/slab.c:3682 [inline]
> __kmalloc_node_track_caller+0x47/0x70 mm/slab.c:3696
> __kmalloc_reserve.isra.39+0x41/0xe0 net/core/skbuff.c:137
> __alloc_skb+0x155/0x770 net/core/skbuff.c:205
> alloc_skb include/linux/skbuff.h:997 [inline]
> sock_wmalloc+0x16d/0x1f0 net/core/sock.c:1934
> packet_sendmsg_spkt+0x48a/0x16f0 net/packet/af_packet.c:1922
> sock_sendmsg_nosec net/socket.c:621 [inline]
> sock_sendmsg+0xd5/0x120 net/socket.c:631
> ___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
> __sys_sendmsg+0x11d/0x280 net/socket.c:2154
> __do_sys_sendmsg net/socket.c:2163 [inline]
> __se_sys_sendmsg net/socket.c:2161 [inline]
> __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
> do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> Freed by task 8395:
> save_stack+0x43/0xd0 mm/kasan/kasan.c:448
> set_track mm/kasan/kasan.c:460 [inline]
> __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
> kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
> __cache_free mm/slab.c:3498 [inline]
> kfree+0xcf/0x230 mm/slab.c:3813
> skb_free_head+0x99/0xc0 net/core/skbuff.c:550
> skb_release_data+0x6a4/0x880 net/core/skbuff.c:570
> skb_release_all+0x4a/0x60 net/core/skbuff.c:627
> __kfree_skb net/core/skbuff.c:641 [inline]
> consume_skb+0x1ae/0x570 net/core/skbuff.c:701
> packet_rcv+0x172/0x1820 net/packet/af_packet.c:2139
> dev_queue_xmit_nit+0x8ae/0xb30 net/core/dev.c:2020
> xmit_one net/core/dev.c:3215 [inline]
> dev_hard_start_xmit+0x186/0xc90 net/core/dev.c:3235
> __dev_queue_xmit+0x2f0d/0x3950 net/core/dev.c:3805
> dev_queue_xmit+0x17/0x20 net/core/dev.c:3838
> packet_sendmsg_spkt+0xf44/0x16f0 net/packet/af_packet.c:1975
> sock_sendmsg_nosec net/socket.c:621 [inline]
> sock_sendmsg+0xd5/0x120 net/socket.c:631
> ___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
> __sys_sendmsg+0x11d/0x280 net/socket.c:2154
> __do_sys_sendmsg net/socket.c:2163 [inline]
> __se_sys_sendmsg net/socket.c:2161 [inline]
> __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
> do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> The buggy address belongs to the object at ffff8801d8cb4040
> which belongs to the cache kmalloc-512 of size 512
> The buggy address is located 224 bytes inside of
> 512-byte region [ffff8801d8cb4040, ffff8801d8cb4240)
> The buggy address belongs to the page:
> page:ffffea0007632d00 count:1 mapcount:0 mapping:ffff8801da800940
> index:0xffff8801d8cb42c0
> flags: 0x2fffc0000000100(slab)
> raw: 02fffc0000000100 ffffea00070baf08 ffffea000736a788 ffff8801da800940
> raw: ffff8801d8cb42c0 ffff8801d8cb4040 0000000100000005 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
> ffff8801d8cb4000: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
> ffff8801d8cb4080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > ffff8801d8cb4100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ^
> ffff8801d8cb4180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff8801d8cb4200: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> ==================================================================
FWIW from the log this was triggered on this program:
05:22:33 executing program 0:
r0 = socket(0x11, 0xa, 0x0)
sendmsg(r0, &(0x7f00000003c0)={&(0x7f00000000c0)=@pptp={0x18, 0x2,
{0x0, @dev}}, 0x80, &(0x7f0000000380), 0x0, &(0x7f0000001880)}, 0x0)
Looks like a race with a very short inconsistency window since it was
triggered only once.
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
> syzbot.
^ permalink raw reply
* KASAN: use-after-free Write in skb_release_data (2)
From: syzbot @ 2018-10-15 6:30 UTC (permalink / raw)
To: davem, kuznet, linux-kernel, netdev, syzkaller-bugs, yoshfuji
Hello,
syzbot found the following crash on:
HEAD commit: e40a826a6cbc qed: Add support for virtual link.
git tree: net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=168954a5400000
kernel config: https://syzkaller.appspot.com/x/.config?x=7e7e2279c0020d5f
dashboard link: https://syzkaller.appspot.com/bug?extid=580be3953ed99133804f
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
Unfortunately, I don't have any reproducer for this crash yet.
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+580be3953ed99133804f@syzkaller.appspotmail.com
==================================================================
BUG: KASAN: use-after-free in atomic_sub_return
include/asm-generic/atomic-instrumented.h:305 [inline]
BUG: KASAN: use-after-free in skb_release_data+0x1a3/0x880
net/core/skbuff.c:559
Write of size 4 at addr ffff8801d8cb4120 by task syz-executor0/8395
CPU: 0 PID: 8395 Comm: syz-executor0 Not tainted 4.19.0-rc6+ #254
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
check_memory_region_inline mm/kasan/kasan.c:260 [inline]
check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278
atomic_sub_return include/asm-generic/atomic-instrumented.h:305 [inline]
skb_release_data+0x1a3/0x880 net/core/skbuff.c:559
skb_release_all+0x4a/0x60 net/core/skbuff.c:627
__kfree_skb net/core/skbuff.c:641 [inline]
kfree_skb+0x1bb/0x580 net/core/skbuff.c:659
sit_tunnel_xmit+0x173/0x30d0 net/ipv6/sit.c:1044
__netdev_start_xmit include/linux/netdevice.h:4328 [inline]
netdev_start_xmit include/linux/netdevice.h:4337 [inline]
xmit_one net/core/dev.c:3219 [inline]
dev_hard_start_xmit+0x295/0xc90 net/core/dev.c:3235
__dev_queue_xmit+0x2f0d/0x3950 net/core/dev.c:3805
dev_queue_xmit+0x17/0x20 net/core/dev.c:3838
packet_sendmsg_spkt+0xf44/0x16f0 net/packet/af_packet.c:1975
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:631
___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
__sys_sendmsg+0x11d/0x280 net/socket.c:2154
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg net/socket.c:2161 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457519
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2a52d30c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457519
RDX: 0000000000000000 RSI: 00000000200003c0 RDI: 0000000000000003
RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2a52d316d4
R13: 00000000004c34d5 R14: 00000000004d52c8 R15: 00000000ffffffff
Allocated by task 8395:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
__do_kmalloc_node mm/slab.c:3682 [inline]
__kmalloc_node_track_caller+0x47/0x70 mm/slab.c:3696
__kmalloc_reserve.isra.39+0x41/0xe0 net/core/skbuff.c:137
__alloc_skb+0x155/0x770 net/core/skbuff.c:205
alloc_skb include/linux/skbuff.h:997 [inline]
sock_wmalloc+0x16d/0x1f0 net/core/sock.c:1934
packet_sendmsg_spkt+0x48a/0x16f0 net/packet/af_packet.c:1922
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:631
___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
__sys_sendmsg+0x11d/0x280 net/socket.c:2154
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg net/socket.c:2161 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 8395:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
__cache_free mm/slab.c:3498 [inline]
kfree+0xcf/0x230 mm/slab.c:3813
skb_free_head+0x99/0xc0 net/core/skbuff.c:550
skb_release_data+0x6a4/0x880 net/core/skbuff.c:570
skb_release_all+0x4a/0x60 net/core/skbuff.c:627
__kfree_skb net/core/skbuff.c:641 [inline]
consume_skb+0x1ae/0x570 net/core/skbuff.c:701
packet_rcv+0x172/0x1820 net/packet/af_packet.c:2139
dev_queue_xmit_nit+0x8ae/0xb30 net/core/dev.c:2020
xmit_one net/core/dev.c:3215 [inline]
dev_hard_start_xmit+0x186/0xc90 net/core/dev.c:3235
__dev_queue_xmit+0x2f0d/0x3950 net/core/dev.c:3805
dev_queue_xmit+0x17/0x20 net/core/dev.c:3838
packet_sendmsg_spkt+0xf44/0x16f0 net/packet/af_packet.c:1975
sock_sendmsg_nosec net/socket.c:621 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:631
___sys_sendmsg+0x7fd/0x930 net/socket.c:2116
__sys_sendmsg+0x11d/0x280 net/socket.c:2154
__do_sys_sendmsg net/socket.c:2163 [inline]
__se_sys_sendmsg net/socket.c:2161 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The buggy address belongs to the object at ffff8801d8cb4040
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 224 bytes inside of
512-byte region [ffff8801d8cb4040, ffff8801d8cb4240)
The buggy address belongs to the page:
page:ffffea0007632d00 count:1 mapcount:0 mapping:ffff8801da800940
index:0xffff8801d8cb42c0
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffffea00070baf08 ffffea000736a788 ffff8801da800940
raw: ffff8801d8cb42c0 ffff8801d8cb4040 0000000100000005 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8801d8cb4000: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
ffff8801d8cb4080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff8801d8cb4100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8801d8cb4180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801d8cb4200: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
==================================================================
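The "located 224 bytes inside of" line can be checked by hand: it is the faulting address minus the start of the slab object. A minimal C sketch of that arithmetic follows; the helper names are hypothetical and the constants in the usage note are copied from the report above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of a faulting address inside its
 * slab object, i.e. the N in "located N bytes inside of". */
static uint64_t object_offset(uint64_t fault_addr, uint64_t object_start)
{
    assert(fault_addr >= object_start);
    return fault_addr - object_start;
}

/* Hypothetical helper: nonzero if fault_addr lies in
 * [object_start, object_start + size). */
static int inside_object(uint64_t fault_addr, uint64_t object_start,
                         uint64_t size)
{
    return fault_addr >= object_start &&
           fault_addr - object_start < size;
}
```

Calling object_offset(0xffff8801d8cb4120, 0xffff8801d8cb4040) gives 0xe0, which is the 224 bytes printed in the report, and the address falls inside the 512-byte object.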
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
^ permalink raw reply
* KASAN: use-after-free Read in bpf_prog_kallsyms_del
From: syzbot @ 2018-10-15 6:28 UTC (permalink / raw)
To: ast, daniel, linux-kernel, netdev, syzkaller-bugs
Hello,
syzbot found the following crash on:
HEAD commit: 67e89ac32828 bpf: Fix dev pointer dereference from sk_skb
git tree: bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=11381531400000
kernel config: https://syzkaller.appspot.com/x/.config?x=7e7e2279c0020d5f
dashboard link: https://syzkaller.appspot.com/bug?extid=10cffda23c81a3ff1088
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
Unfortunately, I don't have any reproducer for this crash yet.
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+10cffda23c81a3ff1088@syzkaller.appspotmail.com
==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0xf1/0xf3
lib/list_debug.c:51
Read of size 8 at addr ffff8801d18fd820 by task syz-executor1/1917
CPU: 0 PID: 1917 Comm: syz-executor1 Not tainted 4.19.0-rc6+ #120
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
__asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
__list_del_entry_valid+0xf1/0xf3 lib/list_debug.c:51
__list_del_entry include/linux/list.h:117 [inline]
list_del_rcu include/linux/rculist.h:130 [inline]
bpf_prog_ksym_node_del kernel/bpf/core.c:467 [inline]
bpf_prog_kallsyms_del+0x1e7/0x410 kernel/bpf/core.c:498
bpf_prog_kallsyms_del_all+0x1d/0x20 kernel/bpf/core.c:364
__bpf_prog_put+0xd7/0x150 kernel/bpf/syscall.c:1135
bpf_prog_put kernel/bpf/syscall.c:1143 [inline]
bpf_prog_release+0x3c/0x50 kernel/bpf/syscall.c:1151
__fput+0x385/0xa30 fs/file_table.c:278
____fput+0x15/0x20 fs/file_table.c:309
task_work_run+0x1e8/0x2a0 kernel/task_work.c:113
tracehook_notify_resume include/linux/tracehook.h:193 [inline]
exit_to_usermode_loop+0x318/0x380 arch/x86/entry/common.c:166
prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x411021
Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 34 19 00 00 c3 48
83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48
89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007fff25880b30 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000005 RCX: 0000000000411021
RDX: 0000000000000000 RSI: 0000000000732b80 RDI: 0000000000000004
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 00007fff25880a60 R11: 0000000000000293 R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000000311 R15: 0000000000000001
Allocated by task 1907:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
kmem_cache_alloc_trace+0x152/0x750 mm/slab.c:3620
kmalloc include/linux/slab.h:513 [inline]
kzalloc include/linux/slab.h:707 [inline]
bpf_prog_alloc+0x328/0x3e0 kernel/bpf/core.c:89
bpf_prog_load+0x435/0x1cb0 kernel/bpf/syscall.c:1402
__do_sys_bpf kernel/bpf/syscall.c:2409 [inline]
__se_sys_bpf kernel/bpf/syscall.c:2371 [inline]
__x64_sys_bpf+0x36c/0x510 kernel/bpf/syscall.c:2371
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 2682:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
__cache_free mm/slab.c:3498 [inline]
kfree+0xcf/0x230 mm/slab.c:3813
__bpf_prog_free kernel/bpf/core.c:146 [inline]
bpf_prog_unlock_free include/linux/filter.h:740 [inline]
bpf_prog_free_deferred+0x2a4/0x420 kernel/bpf/core.c:1740
process_one_work+0xc90/0x1b90 kernel/workqueue.c:2153
worker_thread+0x17f/0x1390 kernel/workqueue.c:2296
kthread+0x35a/0x420 kernel/kthread.c:246
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413
The buggy address belongs to the object at ffff8801d18fd7c0
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 96 bytes inside of
512-byte region [ffff8801d18fd7c0, ffff8801d18fd9c0)
The buggy address belongs to the page:
page:ffffea0007463f40 count:1 mapcount:0 mapping:ffff8801da800940 index:0x0
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffffea000706cf08 ffffea00075d1b88 ffff8801da800940
raw: 0000000000000000 ffff8801d18fd040 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8801d18fd700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff8801d18fd780: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
> ffff8801d18fd800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8801d18fd880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801d18fd900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
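For readers decoding the memory-state dump: in generic KASAN of this era each shadow byte describes 8 bytes of memory, so a 16-byte row covers 128 bytes; 0xfb marks freed slab memory and 0xfc a slab redzone. The sketch below, with hypothetical helper names, finds which column of a row corresponds to a faulting address.

```c
#include <assert.h>
#include <stdint.h>

#define SHADOW_SCALE_SHIFT   3   /* one shadow byte per 8 bytes of memory */
#define SHADOW_BYTES_PER_ROW 16  /* shadow bytes printed per dump row */

/* Hypothetical helper: column (0..15) of the shadow byte that covers
 * addr, given the memory address printed at the start of the row. */
static unsigned int shadow_column(uint64_t addr, uint64_t row_base)
{
    assert(addr >= row_base);
    assert(addr - row_base < (SHADOW_BYTES_PER_ROW << SHADOW_SCALE_SHIFT));
    return (unsigned int)((addr - row_base) >> SHADOW_SCALE_SHIFT);
}

/* Hypothetical helper: rough meaning of the shadow values seen in the
 * dumps in this thread; every other value is lumped into "other". */
static const char *shadow_meaning(uint8_t shadow)
{
    switch (shadow) {
    case 0x00: return "accessible";
    case 0xfb: return "freed slab object";
    case 0xfc: return "slab redzone";
    default:   return "other";
    }
}
```

For the address ffff8801d18fd820 in the row starting at ffff8801d18fd800 this yields column 4, a byte shown as fb (freed) in the dump above.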
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
^ permalink raw reply
* Re: Bug in MACSec - stops passing traffic after approx 5TB
From: Josh Coombs @ 2018-10-14 20:52 UTC (permalink / raw)
To: sd; +Cc: netdev
In-Reply-To: <20181014202510.GA19253@bistromath.localdomain>
On Sun, Oct 14, 2018 at 4:24 PM Sabrina Dubroca <sd@queasysnail.net> wrote:
>
> 2018-10-14, 10:59:31 -0400, Josh Coombs wrote:
> > I initially mistook this for a traffic control issue, but after
> > stripping the test beds down to just the MACSec component, I can still
> > replicate the issue. After approximately 5TB of transfer / 4 billion
> > packets over a MACSec link it stops passing traffic.
>
> I think you're just hitting packet number exhaustion. After 2^32
> packets, the packet number would wrap to 0 and start being reused,
> which breaks the crypto used by macsec. Before this point, you have to
> add a new SA, and tell the macsec device to switch to it.
I had not considered that. I naively thought that as long as I didn't
specify a replay window, it'd roll the PN over on its own and life
would be good. I'll test that theory tomorrow; it should be easy to
prove out.
> That's why you should be using wpa_supplicant. It will monitor the
> growth of the packet number, and handle the rekey for you.
Thank you for the heads up, I'll read up on this as well.
Josh C
^ permalink raw reply
* Re: Bug in MACSec - stops passing traffic after approx 5TB
From: Sabrina Dubroca @ 2018-10-14 20:25 UTC (permalink / raw)
To: Josh Coombs; +Cc: netdev
In-Reply-To: <CACcUnf8uKOFjJQq6AAroKjArUTwOuLx4HhteZg8y0wDT9NdZDA@mail.gmail.com>
2018-10-14, 10:59:31 -0400, Josh Coombs wrote:
> I initially mistook this for a traffic control issue, but after
> stripping the test beds down to just the MACSec component, I can still
> replicate the issue. After approximately 5TB of transfer / 4 billion
> packets over a MACSec link it stops passing traffic.
I think you're just hitting packet number exhaustion. After 2^32
packets, the packet number would wrap to 0 and start being reused,
which breaks the crypto used by macsec. Before this point, you have to
add a new SA, and tell the macsec device to switch to it.
That's why you should be using wpa_supplicant. It will monitor the
growth of the packet number, and handle the rekey for you.
If you start with a PN already close to exhaustion (say, 4294967000),
you should hit the "bug" very quickly.
> # Bring up macsec:
> echo "* Enable MACSec"
> modprobe macsec
> ip link add link "$dif" "$eif" type macsec
> ip macsec add "$eif" tx sa 0 pn 1 on key 02 "$txkey"
Keep the rest of the configuration, and replace that one with:
ip macsec add "$eif" tx sa 0 pn 4294967000 on key 02 "$txkey"
to trigger the issue faster.
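A back-of-the-envelope check of the numbers in this thread, sketched in C. It assumes a 32-bit packet number that must not wrap and reads "5TB" as 5 * 10^12 bytes; the helper names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define PN_SPACE (1ULL << 32)   /* size of a 32-bit packet number space */

/* Illustrative helper: how many packets can be sent, starting at
 * start_pn, before the packet number would reach 2^32 and wrap. */
static uint64_t packets_until_wrap(uint64_t start_pn)
{
    assert(start_pn < PN_SPACE);
    return PN_SPACE - start_pn;
}

/* Illustrative helper: average packet size (integer division) if
 * total_bytes were spread over the whole 2^32 packet number space. */
static uint64_t avg_bytes_per_packet(uint64_t total_bytes)
{
    return total_bytes / PN_SPACE;
}
```

Starting at pn 4294967000 leaves only 296 packet numbers, so the stall should show up almost immediately. Spreading 5 * 10^12 bytes over 2^32 packets gives roughly 1164 bytes per packet, which is plausible for the traffic described in the report.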
--
Sabrina
^ permalink raw reply
* Re: [PATCH net-next V2 6/8] vhost: packed ring support
From: Jason Wang @ 2018-10-15 2:51 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Tiwei Bie, kvm, virtualization, netdev, linux-kernel, wexu,
jfreimann, maxime.coquelin
In-Reply-To: <20181014224208-mutt-send-email-mst@kernel.org>
On 2018-10-15 10:43, Michael S. Tsirkin wrote:
> On Mon, Oct 15, 2018 at 10:22:33AM +0800, Jason Wang wrote:
>>
>> On 2018-10-13 01:23, Michael S. Tsirkin wrote:
>>> On Fri, Oct 12, 2018 at 10:32:44PM +0800, Tiwei Bie wrote:
>>>> On Mon, Jul 16, 2018 at 11:28:09AM +0800, Jason Wang wrote:
>>>> [...]
>>>>> @@ -1367,10 +1397,48 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
>>>>> vq->last_avail_idx = s.num;
>>>>> /* Forget the cached index value. */
>>>>> vq->avail_idx = vq->last_avail_idx;
>>>>> + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) {
>>>>> + vq->last_avail_wrap_counter = wrap_counter;
>>>>> + vq->avail_wrap_counter = vq->last_avail_wrap_counter;
>>>>> + }
>>>>> break;
>>>>> case VHOST_GET_VRING_BASE:
>>>>> s.index = idx;
>>>>> s.num = vq->last_avail_idx;
>>>>> + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
>>>>> + s.num |= vq->last_avail_wrap_counter << 31;
>>>>> + if (copy_to_user(argp, &s, sizeof(s)))
>>>>> + r = -EFAULT;
>>>>> + break;
>>>>> + case VHOST_SET_VRING_USED_BASE:
>>>>> + /* Moving base with an active backend?
>>>>> + * You don't want to do that.
>>>>> + */
>>>>> + if (vq->private_data) {
>>>>> + r = -EBUSY;
>>>>> + break;
>>>>> + }
>>>>> + if (copy_from_user(&s, argp, sizeof(s))) {
>>>>> + r = -EFAULT;
>>>>> + break;
>>>>> + }
>>>>> + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) {
>>>>> + wrap_counter = s.num >> 31;
>>>>> + s.num &= ~(1 << 31);
>>>>> + }
>>>>> + if (s.num > 0xffff) {
>>>>> + r = -EINVAL;
>>>>> + break;
>>>>> + }
>>>> Do we want to put wrap_counter at bit 15?
>>> I think I second that - seems to be consistent with
>>> e.g. event suppression structure and the proposed
>>> extension to driver notifications.
>> OK, I assumed packed virtqueues supported 64K entries, but it looks like
>> they do not. I can change it to bit 15, and GET_VRING_BASE needs to be
>> changed as well.
>>
>>>
>>>> If we put wrap_counter at bit 31, the check (s.num > 0xffff)
>>>> won't be able to catch the illegal index 0x8000~0xffff for
>>>> packed ring.
>>>>
>> Do we need to clarify this in the spec?
> Isn't this all internal vhost stuff?
I meant the illegal index 0x8000-0xffff.
>
>>>>> + vq->last_used_idx = s.num;
>>>>> + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
>>>>> + vq->last_used_wrap_counter = wrap_counter;
>>>>> + break;
>>>>> + case VHOST_GET_VRING_USED_BASE:
>>>> Do we need the new VHOST_GET_VRING_USED_BASE and
>>>> VHOST_SET_VRING_USED_BASE ops?
>>>>
>>>> We are going to merge below series in DPDK:
>>>>
>>>> http://patches.dpdk.org/patch/45874/
>>>>
>>>> We may need to reach an agreement first.
>> If we agree that 64K virtqueue won't be supported, I'm ok with either.
> Well the spec says right at the beginning:
>
> Packed virtqueues support up to 2^15 entries each.
Ok. I get it.
Then I can change vhost to match what dpdk did.
Thanks
>
>
>> Btw the code assumes used_wrap_counter is equal to avail_wrap_counter which
>> looks wrong?
>>
>> Thanks
>>
>>>>> + s.index = idx;
>>>>> + s.num = vq->last_used_idx;
>>>>> + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
>>>>> + s.num |= vq->last_used_wrap_counter << 31;
>>>>> if (copy_to_user(argp, &s, sizeof s))
>>>>> r = -EFAULT;
>>>>> break;
>>>> [...]
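To illustrate the bit-15 layout discussed above: a minimal sketch, not the vhost or DPDK implementation, assuming a ring index below 2^15 with the wrap counter carried in bit 15 of the same 16-bit value. The function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WRAP_COUNTER_SHIFT 15
#define RING_IDX_MASK      ((1u << WRAP_COUNTER_SHIFT) - 1)

/* Hypothetical helper: pack a ring index (< 2^15) and a one-bit wrap
 * counter into the value exchanged with userspace. */
static uint32_t pack_vring_base(uint16_t idx, unsigned int wrap_counter)
{
    assert(idx <= RING_IDX_MASK);
    assert(wrap_counter <= 1);
    return idx | (wrap_counter << WRAP_COUNTER_SHIFT);
}

/* Hypothetical helper: split the value again. Returns 0 on success and
 * -1 if any bit above bit 15 is set. */
static int unpack_vring_base(uint32_t num, uint16_t *idx,
                             unsigned int *wrap_counter)
{
    if (num > 0xffff)
        return -1;
    *wrap_counter = num >> WRAP_COUNTER_SHIFT;
    *idx = num & RING_IDX_MASK;
    return 0;
}
```

With this layout a single num > 0xffff check rejects anything outside the 16-bit encoding, and the decoded index can never exceed 0x7fff.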
^ permalink raw reply
* Re: [RFC PATCH 2/2] net/ncsi: Configure multi-package, multi-channel modes with failover
From: Samuel Mendoza-Jonas @ 2018-10-15 2:44 UTC (permalink / raw)
To: Justin.Lee1, netdev; +Cc: davem, linux-kernel, openbmc
In-Reply-To: <d648d156e08c4b2a8135f54abe0e9474@AUSX13MPS302.AMER.DELL.COM>
On Fri, 2018-10-12 at 19:16 +0000, Justin.Lee1@Dell.com wrote:
> > > > > > +
> > > > > > + NCSI_FOR_EACH_CHANNEL(np, channel) {
> > > > > > + ncm = &channel->modes[NCSI_MODE_TX_ENABLE];
> > > > > > + /* Another channel is already Tx */
> > > > > > + if (ncm->enable)
> > > > > > + return false;
> > > > > > + }
>
> As we don't suspend the old channel when we call the ncsi_stop_dev() function,
> this will always be false and we will not set it to the right channel.
> If multi_channel is enabled, suppose that we only need to send TX enable/disable
> when the link is changed.
Ah, good point. I was working on improving the ncsi_stop_dev/ncsi_start_dev
interactions in a separate patch but we're going to need to fix it for
multi_channel to work properly. I'll look into that and include it in this
series.
<snip>
> > > > > > - if (!found) {
> > > > > > + if (!with_link && found) {
> > > > > > + netdev_info(ndp->ndev.dev,
> > > > > > + "NCSI: No channel with link found, configuring channel %u\n",
> > > > > > + found->id);
> > > > > > + spin_lock_irqsave(&ndp->lock, flags);
> > > > > > + list_add_tail_rcu(&found->link, &ndp->channel_queue);
> > > > > > + spin_unlock_irqrestore(&ndp->lock, flags);
>
> If multi-channel is enabled and without the link, the last found channel would be added again.
Yep, I've fixed this up by checking whether anything has been added to the
channel queue instead.
Thanks,
Sam
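The Tx check discussed above can be modelled in a few lines. This is a simplified sketch with hypothetical types and names (struct chan, tx_allowed), not the NCSI code: if the previously active channel is never marked Tx-disabled, the scan always finds it and refuses Tx for every other candidate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for a channel's Tx-enable mode. */
struct chan {
    unsigned int id;
    bool tx_enabled;
};

/* True only if no channel in the package currently has Tx enabled. */
static bool tx_allowed(const struct chan *chans, size_t n)
{
    assert(chans != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (chans[i].tx_enabled)
            return false;   /* another channel is already Tx */
    return true;
}
```

In this model, clearing tx_enabled on the old channel before the scan (the equivalent of suspending it) is what lets a new channel be chosen.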
^ permalink raw reply
* Re: [PATCH net-next V2 6/8] vhost: packed ring support
From: Michael S. Tsirkin @ 2018-10-15 2:43 UTC (permalink / raw)
To: Jason Wang
Cc: Tiwei Bie, kvm, virtualization, netdev, linux-kernel, wexu,
jfreimann, maxime.coquelin
In-Reply-To: <447f47fa-32dd-a408-dd81-13a9839e0748@redhat.com>
On Mon, Oct 15, 2018 at 10:22:33AM +0800, Jason Wang wrote:
>
>
> On 2018-10-13 01:23, Michael S. Tsirkin wrote:
> > On Fri, Oct 12, 2018 at 10:32:44PM +0800, Tiwei Bie wrote:
> > > On Mon, Jul 16, 2018 at 11:28:09AM +0800, Jason Wang wrote:
> > > [...]
> > > > @@ -1367,10 +1397,48 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg
> > > > vq->last_avail_idx = s.num;
> > > > /* Forget the cached index value. */
> > > > vq->avail_idx = vq->last_avail_idx;
> > > > + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) {
> > > > + vq->last_avail_wrap_counter = wrap_counter;
> > > > + vq->avail_wrap_counter = vq->last_avail_wrap_counter;
> > > > + }
> > > > break;
> > > > case VHOST_GET_VRING_BASE:
> > > > s.index = idx;
> > > > s.num = vq->last_avail_idx;
> > > > + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
> > > > + s.num |= vq->last_avail_wrap_counter << 31;
> > > > + if (copy_to_user(argp, &s, sizeof(s)))
> > > > + r = -EFAULT;
> > > > + break;
> > > > + case VHOST_SET_VRING_USED_BASE:
> > > > + /* Moving base with an active backend?
> > > > + * You don't want to do that.
> > > > + */
> > > > + if (vq->private_data) {
> > > > + r = -EBUSY;
> > > > + break;
> > > > + }
> > > > + if (copy_from_user(&s, argp, sizeof(s))) {
> > > > + r = -EFAULT;
> > > > + break;
> > > > + }
> > > > + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) {
> > > > + wrap_counter = s.num >> 31;
> > > > + s.num &= ~(1 << 31);
> > > > + }
> > > > + if (s.num > 0xffff) {
> > > > + r = -EINVAL;
> > > > + break;
> > > > + }
> > > Do we want to put wrap_counter at bit 15?
> > I think I second that - seems to be consistent with
> > e.g. event suppression structure and the proposed
> > extension to driver notifications.
>
> OK, I assumed packed virtqueues supported 64K entries, but it looks like
> they do not. I can change it to bit 15, and GET_VRING_BASE needs to be
> changed as well.
>
> >
> >
> > > If we put wrap_counter at bit 31, the check (s.num > 0xffff)
> > > won't be able to catch the illegal index 0x8000~0xffff for
> > > packed ring.
> > >
>
> Do we need to clarify this in the spec?
Isn't this all internal vhost stuff?
> > > > + vq->last_used_idx = s.num;
> > > > + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
> > > > + vq->last_used_wrap_counter = wrap_counter;
> > > > + break;
> > > > + case VHOST_GET_VRING_USED_BASE:
> > > Do we need the new VHOST_GET_VRING_USED_BASE and
> > > VHOST_SET_VRING_USED_BASE ops?
> > >
> > > We are going to merge below series in DPDK:
> > >
> > > http://patches.dpdk.org/patch/45874/
> > >
> > > We may need to reach an agreement first.
>
> If we agree that 64K virtqueue won't be supported, I'm ok with either.
Well the spec says right at the beginning:
Packed virtqueues support up to 2^15 entries each.
> Btw the code assumes used_wrap_counter is equal to avail_wrap_counter which
> looks wrong?
>
> Thanks
>
> > >
> > > > + s.index = idx;
> > > > + s.num = vq->last_used_idx;
> > > > + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED))
> > > > + s.num |= vq->last_used_wrap_counter << 31;
> > > > if (copy_to_user(argp, &s, sizeof s))
> > > > r = -EFAULT;
> > > > break;
> > > [...]
^ permalink raw reply