From: Sasha Levin <sasha.levin@oracle.com>
To: Dmitry Vyukov <dvyukov@google.com>, syzkaller@googlegroups.com
Cc: Sasha Levin <levinsasha928@gmail.com>,
Pekka Enberg <penberg@kernel.org>,
Asias He <asias.hejun@gmail.com>,
penberg@cs.helsinki.fi, Cyrill Gorcunov <gorcunov@gmail.com>,
Will Deacon <will.deacon@arm.com>,
matt@ozlabs.org, Michael Ellerman <michael@ellerman.id.au>,
Prasad Joshi <prasadjoshi124@gmail.com>,
marc.zyngier@arm.com,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
mingo@elte.hu, gorcunov@openvz.org, kvm@vger.kernel.org,
Kostya Serebryany <kcc@google.com>,
Evgenii Stepanov <eugenis@google.com>,
Alexey Samsonov <samsonov@google.com>,
Alexander Potapenko <glider@google.com>
Subject: Re: Network hangs when communicating with host
Date: Mon, 19 Oct 2015 10:20:41 -0400
Message-ID: <5624FC39.2060708@oracle.com>
In-Reply-To: <CACT4Y+bxk7p8aekCc=jvQRH+viEZ-Y22LLqCO9JLqFbFewA3Qg@mail.gmail.com>
On 10/19/2015 05:28 AM, Dmitry Vyukov wrote:
> On Mon, Oct 19, 2015 at 11:22 AM, Andre Przywara <andre.przywara@arm.com> wrote:
>> Hi Dmitry,
>>
>> On 19/10/15 10:05, Dmitry Vyukov wrote:
>>> On Fri, Oct 16, 2015 at 7:25 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
>>>> On 10/15/2015 04:20 PM, Dmitry Vyukov wrote:
>>>>> Hello,
>>>>>
>>>>> I am trying to run a program in lkvm sandbox so that it communicates
>>>>> with a program on host. I run lkvm as:
>>>>>
>>>>> ./lkvm sandbox --disk sandbox-test --mem=2048 --cpus=4 --kernel
>>>>> /arch/x86/boot/bzImage --network mode=user -- /my_prog
>>>>>
>>>>> /my_prog then connects to a program on host over a tcp socket.
>>>>> I see that host receives some data, sends some data back, but then
>>>>> my_prog hangs on network read.
>>>>>
>>>>> To localize this I wrote 2 programs (attached). ping is run on host
>>>>> and pong is run from lkvm sandbox. They successfully establish tcp
>>>>> connection, but after some iterations both hang on read.
>>>>>
>>>>> Networking code in the Go runtime has been there for more than 3 years,
>>>>> is widely used in production and does not have any known bugs. However,
>>>>> it uses epoll edge-triggered readiness notifications, which are known to
>>>>> be tricky.
>>>>> Is it possible that lkvm contains some networking bug? Can it be
>>>>> related to the data races in lkvm I reported earlier today?
>>
>> Just to let you know:
>> I think we have seen networking issues in the past - root over NFS had
>> issues IIRC. Will spent some time on debugging this and it looked like a
>> race condition in kvmtool's virtio implementation. I think pinning
>> kvmtool's virtio threads to one host core made this go away. However
>> although he tried hard (even by Will's standards!) he couldn't find
>> the real root cause or a fix at the time he looked at it and we found
>> other ways to work around the issues (using virtio-blk or initrd's).
>>
>> So it's quite possible that there are issues. I haven't had time yet to
>> look at your sanitizer reports, but it looks like a promising approach
>> to find the root cause.
>
>
> Thanks, Andre!
>
> ping/pong does not hang within at least 5 minutes when I run lkvm
> under taskset 1.
>
> And, yeah, this pretty strongly suggests a data race. ThreadSanitizer
> can point you to the bug within a minute, so you just need to say
> "aha! it is here". Or maybe not. There are no guarantees. But if you
> already spent significant time on this, then checking the reports
> definitely looks like a good idea.

Okay, that's good to know.

I have a few busy days, but I'll definitely try to clear up these reports
as they seem to be pointing to real issues.

Thanks,
Sasha
Thread overview: 8+ messages
2015-10-15 20:20 Network hangs when communicating with host Dmitry Vyukov
2015-10-16 17:25 ` Sasha Levin
[not found] ` <CACT4Y+bG3gZv7eBUg5hv=5CEasdGUHwYEe6Bae6OVMK3bZe1Rw@mail.gmail.com>
2015-10-19 9:22 ` Andre Przywara
2015-10-19 9:28 ` Dmitry Vyukov
2015-10-19 14:20 ` Sasha Levin [this message]
2015-10-20 13:42 ` Dmitry Vyukov
2015-10-20 13:58 ` Sasha Levin
2015-10-27 9:31 ` Will Deacon