* [Qemu-devel] dropped pkts with Qemu on tap interace (RX)
From: Stefan Priebe - Profihost AG @ 2018-01-02 11:17 UTC
To: qemu-devel

Hello,

I'm currently trying to fix a problem where we have "random" missing
packets.

We're doing an ssh connect from machine a to machine b every 5 minutes
via rsync and ssh.

Sometimes we get this cron message:
"Connection to 192.168.0.2 closed by remote host.
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.2]
ssh: connect to host 192.168.0.2 port 22: Connection refused"

The tap devices of the target VM show dropped RX packets on BOTH tap
interfaces, strangely with the same number of packets.

# ifconfig tap317i0; ifconfig tap317i1
tap317i0  Link encap:Ethernet  HWaddr 6e:cb:65:94:bb:bf
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:2238445 errors:0 dropped:13159 overruns:0 frame:0
          TX packets:9655853 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:177991267 (169.7 MiB)  TX bytes:910412749 (868.2 MiB)

tap317i1  Link encap:Ethernet  HWaddr 96:f8:b5:d0:9a:07
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:1516085 errors:0 dropped:13159 overruns:0 frame:0
          TX packets:1446964 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1597564313 (1.4 GiB)  TX bytes:3517734365 (3.2 GiB)

Any ideas how to inspect this issue?

Greets,
Stefan
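As a starting point for the "how to inspect" question, the counters in the ifconfig output above can be pulled into a structure that is easy to log and compare over time. The following is only a minimal sketch: it assumes the old net-tools ifconfig layout shown in the report, and the names `parse_ifconfig` and `rx_drop_ratio` are made up for illustration. The sample text is an abbreviated copy of the output from the report.

```python
import re

# Abbreviated copy of the ifconfig output from the report (net-tools format).
SAMPLE = """\
tap317i0  Link encap:Ethernet  HWaddr 6e:cb:65:94:bb:bf
          RX packets:2238445 errors:0 dropped:13159 overruns:0 frame:0
          TX packets:9655853 errors:0 dropped:0 overruns:0 carrier:0
tap317i1  Link encap:Ethernet  HWaddr 96:f8:b5:d0:9a:07
          RX packets:1516085 errors:0 dropped:13159 overruns:0 frame:0
          TX packets:1446964 errors:0 dropped:0 overruns:0 carrier:0
"""

# An interface block starts with the interface name at column 0.
IFACE_RE = re.compile(r"^(\S+)\s+Link encap:")
# Counter lines look like "RX packets:N errors:N dropped:N ...".
COUNTER_RE = re.compile(r"^\s*(RX|TX) packets:(\d+) errors:(\d+) dropped:(\d+)")


def parse_ifconfig(text):
    """Return {iface: {"RX": {...}, "TX": {...}}} from old-style ifconfig output."""
    stats = {}
    current = None
    for line in text.splitlines():
        m = IFACE_RE.match(line)
        if m:
            current = m.group(1)
            stats[current] = {}
            continue
        m = COUNTER_RE.match(line)
        if m and current is not None:
            direction, packets, errors, dropped = m.groups()
            stats[current][direction] = {
                "packets": int(packets),
                "errors": int(errors),
                "dropped": int(dropped),
            }
    return stats


def rx_drop_ratio(counters):
    """Dropped RX packets relative to the RX packet counter (0.0 if it is zero)."""
    rx = counters["RX"]
    return rx["dropped"] / rx["packets"] if rx["packets"] else 0.0


if __name__ == "__main__":
    for iface, counters in parse_ifconfig(SAMPLE).items():
        print(f"{iface}: rx_dropped={counters['RX']['dropped']} "
              f"ratio={rx_drop_ratio(counters):.4%}")
```

Run periodically (for example from cron next to the rsync job), the parsed values would show whether the drop counters grow at the times the ssh connection fails.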
* Re: [Qemu-devel] dropped pkts with Qemu on tap interace (RX)
From: Wei Xu @ 2018-01-02 14:20 UTC
To: Stefan Priebe - Profihost AG; +Cc: qemu-devel

On Tue, Jan 02, 2018 at 12:17:29PM +0100, Stefan Priebe - Profihost AG wrote:
> Sometimes we get this cron message:
> "Connection to 192.168.0.2 closed by remote host.
> rsync: connection unexpectedly closed (0 bytes received so far) [sender]
> rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.2]
> ssh: connect to host 192.168.0.2 port 22: Connection refused"

Hi Stefan,
What kind of virtio-net backend are you using? Can you paste your qemu
command line here?

'Connection refused' usually means that the client got a TCP reset rather
than lost packets, so this might be unrelated to the dropped packets.

You can also run tcpdump on both guests and see what happened to the SSH
packets (tcpdump -i tapXXX port 22).

> The tap devices of the target VM show dropped RX packets on BOTH tap
> interfaces, strangely with the same number of packets.
> [...]
> Any ideas how to inspect this issue?

Both tap interfaces seem to lose RX packets. Dropped RX packets mean that
the host (backend) can't receive packets from the guest as fast as the
guest sends them.

Are you running some symmetrical test on both guests?

Wei
* Re: [Qemu-devel] dropped pkts with Qemu on tap interace (RX)
From: Stefan Priebe - Profihost AG @ 2018-01-02 15:24 UTC
To: Wei Xu; +Cc: qemu-devel

Hi,

On 02.01.2018 at 15:20, Wei Xu wrote:
> Hi Stefan,
> What kind of virtio-net backend are you using? Can you paste your qemu
> command line here?

Sure, the netdev part:
-netdev type=tap,id=net0,ifname=tap317i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on
-device virtio-net-pci,mac=EA:37:42:5C:F3:33,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300
-netdev type=tap,id=net1,ifname=tap317i1,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on,queues=4
-device virtio-net-pci,mac=6A:8E:74:45:1A:0B,netdev=net1,bus=pci.0,addr=0x13,id=net1,vectors=10,mq=on,bootindex=301

> 'Connection refused' usually means that the client got a TCP reset rather
> than lost packets, so this might be unrelated to the dropped packets.

Hmm, so you mean these might be two separate issues?

> You can also run tcpdump on both guests and see what happened to the SSH
> packets (tcpdump -i tapXXX port 22).

Sadly not, as there's too much traffic on that part because rsync is
syncing every 5 minutes through ssh.

> Both tap interfaces seem to lose RX packets. Dropped RX packets mean that
> the host (backend) can't receive packets from the guest as fast as the
> guest sends them.

Inside the guest I see no dropped packets at all. It's only on the host,
and strangely both taps show the same value, even though they are
connected to completely different networks.

> Are you running some symmetrical test on both guests?

No.

Stefan
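One way to check whether the two drop counters really move together, rather than just happening to match at one moment, is to sample them repeatedly and compare the increments. The sketch below reads the kernel's per-interface `rx_dropped` file under `/sys/class/net/<iface>/statistics/`. The helper names (`read_rx_dropped`, `sample`, `counter_deltas`, `lockstep`) and the sampling interval are illustrative assumptions, not something taken from the thread.

```python
import os
import time


def read_rx_dropped(iface, sysfs_root="/sys/class/net"):
    """Read the kernel's rx_dropped counter for one interface from sysfs."""
    path = os.path.join(sysfs_root, iface, "statistics", "rx_dropped")
    with open(path) as f:
        return int(f.read().strip())


def sample(ifaces, count, interval, sysfs_root="/sys/class/net"):
    """Take `count` snapshots of rx_dropped, `interval` seconds apart."""
    snapshots = []
    for i in range(count):
        snapshots.append({n: read_rx_dropped(n, sysfs_root) for n in ifaces})
        if i + 1 < count:
            time.sleep(interval)
    return snapshots


def counter_deltas(snapshots):
    """Turn a list of {iface: counter} snapshots into per-interval increments."""
    return [
        {iface: cur[iface] - prev[iface] for iface in cur}
        for prev, cur in zip(snapshots, snapshots[1:])
    ]


def lockstep(deltas):
    """True if, in every interval, all interfaces gained the same number of drops."""
    return all(len(set(d.values())) == 1 for d in deltas)


# Example use on the host (interface names taken from the report):
#   snaps = sample(["tap317i0", "tap317i1"], count=10, interval=30)
#   print(counter_deltas(snaps), lockstep(counter_deltas(snaps)))
```

If the increments are identical in every interval, the drops are probably caused by traffic that reaches both taps at once; if they differ, the matching totals in the report may be a coincidence of when the counters were read.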
* Re: [Qemu-devel] dropped pkts with Qemu on tap interace (RX)
From: Wei Xu @ 2018-01-02 17:04 UTC
To: Stefan Priebe - Profihost AG; +Cc: qemu-devel

On Tue, Jan 02, 2018 at 04:24:33PM +0100, Stefan Priebe - Profihost AG wrote:
> Sure, the netdev part:
> -netdev type=tap,id=net0,ifname=tap317i0,[...],vhost=on
> -device virtio-net-pci,[...],netdev=net0,[...]
> -netdev type=tap,id=net1,ifname=tap317i1,[...],vhost=on,queues=4
> -device virtio-net-pci,[...],netdev=net1,[...],vectors=10,mq=on,bootindex=301

From what you have described, the traffic is not heavy for the guests, so
drops shouldn't happen in the regular case.

What is your hardware platform? Which versions are you using for the
guest and host kernels and for qemu? Are there other VMs on the same host?

> > 'Connection refused' usually means that the client got a TCP reset rather
> > than lost packets, so this might be unrelated to the dropped packets.
>
> Hmm, so you mean these might be two separate issues?

Yes.

> > You can also run tcpdump on both guests and see what happened to the SSH
> > packets (tcpdump -i tapXXX port 22).
>
> Sadly not, as there's too much traffic on that part because rsync is
> syncing every 5 minutes through ssh.

If the traffic is not overloaded, you can capture the entire traffic with
tcpdump on both the guest and the host and compare the captures to see
what kind of packets are dropped.

Wei
* Re: [Qemu-devel] dropped pkts with Qemu on tap interace (RX)
From: Stefan Priebe - Profihost AG @ 2018-01-02 21:17 UTC
To: Wei Xu; +Cc: qemu-devel

On 02.01.2018 at 18:04, Wei Xu wrote:
> From what you have described, the traffic is not heavy for the guests, so
> drops shouldn't happen in the regular case.

The average traffic is around 300kb/s.

> What is your hardware platform?

Dual Intel Xeon E5-2680 v4

> Which versions are you using for the guest and host kernels

Kernel v4.4.103

> and for qemu?

2.9.1

> Are there other VMs on the same host?

Yes.

> If the traffic is not overloaded, you can capture the entire traffic with
> tcpdump on both the guest and the host and compare the captures to see
> what kind of packets are dropped.

Are you sure? I don't get why the same number and the same kind of
packets should be received by both taps, which are connected to different
bridges, different hardware and different physical interfaces.

Stefan
* Re: [Qemu-devel] dropped pkts with Qemu on tap interace (RX)
From: Wei Xu @ 2018-01-03 3:57 UTC
To: Stefan Priebe - Profihost AG; +Cc: qemu-devel

On Tue, Jan 02, 2018 at 10:17:25PM +0100, Stefan Priebe - Profihost AG wrote:
> > Are there other VMs on the same host?
>
> Yes.

What about the CPU load?

> Are you sure? I don't get why the same number and the same kind of
> packets should be received by both taps, which are connected to different
> bridges, different hardware and different physical interfaces.

Exactly. This is more likely a host or guest kernel bug than a qemu
issue, because you are using the vhost kernel backend and the two
counters are independent. You might have to check what is happening
inside the traffic.

Wei
* Re: [Qemu-devel] dropped pkts with Qemu on tap interace (RX)
From: Stefan Priebe - Profihost AG @ 2018-01-03 15:07 UTC
To: Wei Xu; +Cc: qemu-devel

On 03.01.2018 at 04:57, Wei Xu wrote:
> What about the CPU load?

Host:
80-90% Idle
LoadAvg: 6-7

VM:
97%-99% Idle

> Exactly. This is more likely a host or guest kernel bug than a qemu
> issue, because you are using the vhost kernel backend and the two
> counters are independent. You might have to check what is happening
> inside the traffic.

What do you mean by inside the traffic?

Stefan
* Re: [Qemu-devel] dropped pkts with Qemu on tap interace (RX)
From: Wei Xu @ 2018-01-04 3:09 UTC
To: Stefan Priebe - Profihost AG; +Cc: qemu-devel

On Wed, Jan 03, 2018 at 04:07:44PM +0100, Stefan Priebe - Profihost AG wrote:
> > What about the CPU load?
>
> Host:
> 80-90% Idle
> LoadAvg: 6-7
>
> VM:
> 97%-99% Idle

OK, then this shouldn't be a concern.

> > You might have to check what is happening inside the traffic.
>
> What do you mean by inside the traffic?

You might need to figure out what kind of packets are dropped on the host
tap interface: are they random packets or specific packets?

Besides triaging the traffic, there are a few other tests that can help
to see what happened, or you can try alternative tests that suit your
test bed.

1). Upgrade the host and guest kernels to the latest kernel and see if
the problem still shows up; you can use the net-next tree.
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git

2). Run some throughput tests (netperf, iperf, etc.) on both guests
(traffic from guest to host if the guests are isolated, as your comments
suggest) and check the statistics.

Wei
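The "random or specific packets" question can be approached by capturing on both sides and listing the packets that appear in one capture but not in the other. The sketch below works on text summaries, one packet per line, such as tcpdump prints when timestamps are suppressed (`-t`) and name resolution is off (`-n`). The function names and the sample lines are invented for illustration, and whether a dropped packet is visible in a given capture depends on where in the path it is dropped, so this is a rough triage aid rather than a definitive method.

```python
from collections import Counter


def missing_packets(sent_lines, seen_lines):
    """Return a Counter of packet summaries present in sent_lines but not in seen_lines.

    Duplicates are respected: a summary that occurs three times in
    sent_lines and twice in seen_lines is reported once.
    """
    sent = Counter(line.strip() for line in sent_lines if line.strip())
    seen = Counter(line.strip() for line in seen_lines if line.strip())
    return sent - seen


def by_protocol(missing):
    """Group missing packets by the first token of the summary (e.g. IP, ARP)."""
    groups = Counter()
    for summary, count in missing.items():
        groups[summary.split()[0].rstrip(",")] += count
    return groups


if __name__ == "__main__":
    # Invented sample summaries, shaped like tcpdump one-line output.
    guest_side = [
        "IP 192.168.0.1.40000 > 192.168.0.2.22: Flags [S]",
        "IP 192.168.0.1.40000 > 192.168.0.2.22: Flags [S]",
        "ARP, Request who-has 192.168.0.2 tell 192.168.0.1, length 28",
    ]
    host_side = [
        "IP 192.168.0.1.40000 > 192.168.0.2.22: Flags [S]",
    ]
    gone = missing_packets(guest_side, host_side)
    print(by_protocol(gone))
```

If the missing packets cluster in one group (for example only broadcast or multicast frames), that points to specific traffic being discarded; an even spread would suggest random loss.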
* Re: [Qemu-devel] dropped pkts with Qemu on tap interace (RX)
From: Alexandre DERUMIER @ 2018-01-03 8:14 UTC
To: Stefan Priebe, Profihost AG; +Cc: qemu-devel

Hi Stefan,

>>The tap devices of the target VM show dropped RX packets on BOTH tap
>>interfaces, strangely with the same number of packets.

That's strange indeed.
If you tcpdump the tap interfaces, do you see incoming traffic on only
one interface, or randomly on both?

(Can you provide the network configuration in the guest for both
interfaces?)

I see that you have enabled multiqueue on one of the interfaces. Have you
set up the multiqueue part correctly inside the guest?
Do you have enough vCPUs to handle all the queues?
* Re: [Qemu-devel] dropped pkts with Qemu on tap interace (RX)
From: Stefan Priebe - Profihost AG @ 2018-01-03 15:10 UTC
To: Alexandre DERUMIER; +Cc: qemu-devel

On 03.01.2018 at 09:14, Alexandre DERUMIER wrote:
> That's strange indeed.
> If you tcpdump the tap interfaces, do you see incoming traffic on only
> one interface, or randomly on both?

Completely independent, random traffic, as it should be.

> (Can you provide the network configuration in the guest for both
> interfaces?)

Inside the guest? Where the drop counter stays at 0?

auto eth0
iface eth0 inet dhcp

auto eth1
iface eth1 inet static
    address 192.168.0.2
    netmask 255.255.255.0

That's it.

> I see that you have enabled multiqueue on one of the interfaces. Have you
> set up the multiqueue part correctly inside the guest?

Uh oh? What is needed inside the guest?

> Do you have enough vCPUs to handle all the queues?

Yes.

Stefan
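On the open question of what multiqueue needs inside the guest: the thread itself does not answer it. As far as I know, the commonly documented virtio-net recipe is to give the device 2N+2 MSI-X vectors for N queue pairs on the QEMU side (which matches `queues=4` and `vectors=10` in the command line above) and to enable the extra queues in the guest with `ethtool -L <iface> combined N`. The sketch below only encodes that arithmetic and builds the command. The helper names are hypothetical, and the assumption that the multiqueue device shows up as `eth1` in the guest is mine, not confirmed anywhere in the thread.

```python
def msix_vectors(queue_pairs):
    """MSI-X vectors for a multiqueue virtio-net device: 2 per queue pair plus 2."""
    if queue_pairs < 1:
        raise ValueError("queue_pairs must be at least 1")
    return 2 * queue_pairs + 2


def guest_enable_command(iface, queue_pairs):
    """Build the ethtool invocation that enables the queue pairs in the guest."""
    return ["ethtool", "-L", iface, "combined", str(queue_pairs)]


if __name__ == "__main__":
    queues = 4  # from "queues=4" on the net1 -netdev line in the thread
    print("vectors =", msix_vectors(queues))
    # "eth1" is an assumed guest interface name for the multiqueue device.
    print(" ".join(guest_enable_command("eth1", queues)))
```

Inside the guest, `ethtool -l <iface>` should show the current and maximum channel counts, which is a quick way to see whether the extra queues are actually in use.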