From: Sasha Levin <sasha.levin@oracle.com>
To: Dmitry Vyukov <dvyukov@google.com>, syzkaller@googlegroups.com
Cc: Sasha Levin <levinsasha928@gmail.com>,
	Pekka Enberg <penberg@kernel.org>,
	Asias He <asias.hejun@gmail.com>,
	penberg@cs.helsinki.fi, Cyrill Gorcunov <gorcunov@gmail.com>,
	Will Deacon <will.deacon@arm.com>,
	matt@ozlabs.org, Michael Ellerman <michael@ellerman.id.au>,
	Prasad Joshi <prasadjoshi124@gmail.com>,
	marc.zyngier@arm.com,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	mingo@elte.hu, gorcunov@openvz.org, kvm@vger.kernel.org,
	Kostya Serebryany <kcc@google.com>,
	Evgenii Stepanov <eugenis@google.com>,
	Alexey Samsonov <samsonov@google.com>,
	Alexander Potapenko <glider@google.com>
Subject: Re: Network hangs when communicating with host
Date: Mon, 19 Oct 2015 10:20:41 -0400
Message-ID: <5624FC39.2060708@oracle.com>
In-Reply-To: <CACT4Y+bxk7p8aekCc=jvQRH+viEZ-Y22LLqCO9JLqFbFewA3Qg@mail.gmail.com>

On 10/19/2015 05:28 AM, Dmitry Vyukov wrote:
> On Mon, Oct 19, 2015 at 11:22 AM, Andre Przywara <andre.przywara@arm.com> wrote:
>> Hi Dmitry,
>>
>> On 19/10/15 10:05, Dmitry Vyukov wrote:
>>> On Fri, Oct 16, 2015 at 7:25 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
>>>> On 10/15/2015 04:20 PM, Dmitry Vyukov wrote:
>>>>> Hello,
>>>>>
>>>>> I am trying to run a program in lkvm sandbox so that it communicates
>>>>> with a program on host. I run lkvm as:
>>>>>
>>>>> ./lkvm sandbox --disk sandbox-test --mem=2048 --cpus=4 --kernel
>>>>> /arch/x86/boot/bzImage --network mode=user -- /my_prog
>>>>>
>>>>> /my_prog then connects to a program on host over a tcp socket.
>>>>> I see that host receives some data, sends some data back, but then
>>>>> my_prog hangs on network read.
>>>>>
>>>>> To localize this I wrote 2 programs (attached). ping is run on host
>>>>> and pong is run from lkvm sandbox. They successfully establish tcp
>>>>> connection, but after some iterations both hang on read.
>>>>>
>>>>> Networking code in Go runtime is there for more than 3 years, widely
>>>>> used in production and does not have any known bugs. However, it uses
>>>>> epoll edge-triggered readiness notifications that are known to be tricky.
>>>>> Is it possible that lkvm contains some networking bug? Can it be
>>>>> related to the data races in lkvm I reported earlier today?
>>
>> Just to let you know:
>> I think we have seen networking issues in the past - root over NFS had
>> issues IIRC. Will spent some time on debugging this and it looked like a
>> race condition in kvmtool's virtio implementation. I think pinning
>> kvmtool's virtio threads to one host core made this go away. However
>> although he tried hard (even by Will's standards!) he couldn't find
>> the real root cause or a fix at the time he looked at it and we found
>> other ways to work around the issues (using virtio-blk or initrd's).
>>
>> So it's quite possible that there are issues. I haven't had time yet to
>> look at your sanitizer reports, but it looks like a promising approach
>> to find the root cause.
> 
> 
> Thanks, Andre!
> 
> ping/pong does not hang within at least 5 minutes when I run lkvm
> under taskset 1.
> 
> And, yeah, this pretty strongly suggests a data race. ThreadSanitizer
> can point you to the bug within a minute, so you just need to say
> "aha! it is here". Or maybe not. There are no guarantees. But if you
> already spent significant time on this, then checking the reports
> definitely looks like a good idea.

Okay, that's good to know.

I have a few busy days, but I'll definitely try to clear up these reports
as they seem to be pointing to real issues.


Thanks,
Sasha
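
The ping and pong attachments mentioned in the quoted report are not part
of this archived message. As a rough illustration of the kind of test
described there, the following is a minimal Python sketch, not the original
programs (which apparently relied on the Go runtime's networking code).
It assumes both ends run in one process over loopback TCP, whereas in the
thread ping ran on the host and pong inside the lkvm sandbox. The names
run_ping_pong and recv_exact are hypothetical. Read timeouts turn a silent
hang into a visible shortfall in completed round trips.

```python
import socket
import threading


def recv_exact(sock, n):
    """Read exactly n bytes, since TCP may deliver a message in pieces."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def run_ping_pong(iterations=1000, timeout=5.0, payload=b"ping"):
    """Run `iterations` request/echo round trips over loopback TCP.

    Returns the number of completed round trips. A value below
    `iterations` means a read timed out, i.e. the exchange stalled.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    server.listen(1)
    addr = server.getsockname()

    def pong():
        # Echo side: send back every message until the peer goes away.
        conn, _ = server.accept()
        with conn:
            conn.settimeout(timeout)
            try:
                while True:
                    conn.sendall(recv_exact(conn, len(payload)))
            except OSError:
                # Covers timeouts and connection errors alike.
                pass

    echo_thread = threading.Thread(target=pong, daemon=True)
    echo_thread.start()

    completed = 0
    with socket.create_connection(addr, timeout=timeout) as client:
        try:
            for _ in range(iterations):
                client.sendall(payload)
                recv_exact(client, len(payload))
                completed += 1
        except OSError:
            # A timeout here is the "both sides hang on read" symptom.
            pass

    echo_thread.join(timeout)
    server.close()
    return completed


if __name__ == "__main__":
    total = 1000
    done = run_ping_pong(iterations=total)
    print(f"{done}/{total} round trips completed")
```

To approximate the setup from the thread, the two halves would have to be
split into separate programs, with the echo side started inside the guest.
Running the sketch on loopback only checks the test harness itself, not
lkvm's virtio networking.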


Thread overview: 8+ messages
2015-10-15 20:20 Network hangs when communicating with host Dmitry Vyukov
2015-10-16 17:25 ` Sasha Levin
     [not found]   ` <CACT4Y+bG3gZv7eBUg5hv=5CEasdGUHwYEe6Bae6OVMK3bZe1Rw@mail.gmail.com>
2015-10-19  9:22     ` Andre Przywara
2015-10-19  9:28       ` Dmitry Vyukov
2015-10-19 14:20         ` Sasha Levin [this message]
2015-10-20 13:42           ` Dmitry Vyukov
2015-10-20 13:58             ` Sasha Levin
2015-10-27  9:31               ` Will Deacon
