From: Gopakumar Choorakkot Edakkunni <gopakumar.c.e@gmail.com>
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: dev@dpdk.org
Subject: Re: virtio "how to restart applications" - //dpdk.org/doc/virtio-net-pmd
Date: Sat, 18 Mar 2017 14:37:56 -0700
Message-ID: <CABK1yFDFbsZRSDb7tiAnZ6OFACKBHAFM9618N1phREmYYq4FEQ@mail.gmail.com>
In-Reply-To: <CABK1yFBsa_tg9tCL75_jbHQZkTUUGEr1jD+AKurZjBn=hf=vAA@mail.gmail.com>

I mean vtpci_reset() is called from rte_eal_pci_probe(), which is the *last*
thing in rte_eal_init(), *after* hugepage init. So if I can somehow get
that done *before* hugepage init, maybe all will be well (because I can't do
anything to fix the host side).
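
The ordering concern can be illustrated with a small toy model in C. This is
only a sketch and does not use DPDK: toy_memory_init(), toy_device_reset() and
toy_host_sees_zeroed_ring() are hypothetical stand-ins for hugepage init
(which zeroes memory), vtpci_reset(), and the host vhost side reading the
ring. Whether a reset can really be issued before EAL memory init is not
established in this thread; the model only shows why the order would matter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_RING_SIZE 4

/* Hypothetical stand-in for guest hugepage memory holding a vring. */
struct toy_ring {
    unsigned int desc_len[TOY_RING_SIZE];
};

/* Hypothetical stand-in for the device state as the host sees it. */
struct toy_device {
    bool active;            /* true while the host still uses the ring */
    struct toy_ring *ring;  /* ring left over from the previous run */
};

/* Models hugepage init zeroing memory when the application starts. */
static void toy_memory_init(struct toy_ring *ring)
{
    assert(ring != NULL);
    memset(ring, 0, sizeof(*ring));
}

/* Models a device reset: the host stops using the old ring. */
static void toy_device_reset(struct toy_device *dev)
{
    dev->active = false;
    dev->ring = NULL;
}

/* Returns true if the host would observe a zeroed descriptor. */
static bool toy_host_sees_zeroed_ring(const struct toy_device *dev)
{
    if (!dev->active || dev->ring == NULL)
        return false;
    for (size_t i = 0; i < TOY_RING_SIZE; i++) {
        if (dev->ring->desc_len[i] == 0)
            return true;
    }
    return false;
}

/* Simulates one guest restart; reset_first selects the proposed order. */
static bool toy_restart(bool reset_first)
{
    struct toy_ring ring = { .desc_len = { 64, 64, 64, 64 } };
    struct toy_device dev = { .active = true, .ring = &ring };
    bool host_saw_zeroes;

    if (reset_first)
        toy_device_reset(&dev);     /* proposed: reset before memory init */
    toy_memory_init(&ring);
    host_saw_zeroes = toy_host_sees_zeroed_ring(&dev);
    if (!reset_first)
        toy_device_reset(&dev);     /* current: reset only after memory init */
    return host_saw_zeroes;
}
```

In this model toy_restart(false) returns true (the host can see the zeroed
ring before the reset happens), while toy_restart(true) returns false.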

Rgds,
Gopa.

On Sat, Mar 18, 2017 at 2:32 PM, Gopakumar Choorakkot Edakkunni <
gopakumar.c.e@gmail.com> wrote:

> Hi Yuan,
>
> As a "hack"/"workaround", in rte_eal_init(), if I can call vtpci_reset()
> just before rte_eal_memory_init(), that should take care of the host-side
> problem caused by zeroing out hugepages, right? As of today vtpci_reset() is
> called in rte_eal_dev_init(), which comes *after* rte_eal_memory_init().
>
> Rgds,
> Gopa.
>
> On Thu, Mar 16, 2017 at 10:50 PM, Gopakumar Choorakkot Edakkunni <
> gopakumar.c.e@gmail.com> wrote:
>
>> Thanks again Yuanhan, you are the true expert!!
>>
>> Rgds,
>> Gopa.
>>
>> On Thu, Mar 16, 2017 at 10:40 PM, Yuanhan Liu <
>> yuanhan.liu@linux.intel.com> wrote:
>>
>>> On Thu, Mar 16, 2017 at 10:30:09PM -0700, Gopakumar Choorakkot Edakkunni
>>> wrote:
>>> > Thanks for the confirmation, glad I reached the person who knows the nuts and
>>> > bolts of virtio :-). So if the host is not in our control (i.e. if I am just
>>> > running as a VM on a host provided by a third-party vendor), is there any
>>> > workaround I can do from the guest side to prevent problems from happening
>>> > on a guest restart?
>>>
>>> Not too much. You might want to hack the guest DPDK EAL memory initialization
>>> part though, to not reset the hugepage memory on start. But that's so hacky
>>> that I would not recommend doing so!
>>>
>>> > And if there are no workarounds at all and the host has to change, instead of
>>> > asking the third-party vendor to do a wholesale upgrade to 16.04, are there
>>> > one or a few commits that can be added to the host ovs-dpdk to take care of
>>> > this guest-restart, virtio-reset-before-opening case?
>>>
>>> Yes, backporting the commits I have mentioned should be able to fix it.
>>> But please note that I did some code refactorings before those fixes: it
>>> won't apply cleanly to DPDK v2.2.
>>>
>>> And if you want to upgrade, I'd suggest upgrading to v16.11, which is an
>>> LTS release.
>>>
>>>         --yliu
>>> >
>>> > Rgds,
>>> > Gopa.
>>> >
>>> > On Thu, Mar 16, 2017 at 10:24 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
>>> > wrote:
>>> >
>>> >     On Thu, Mar 16, 2017 at 10:20:30PM -0700, Gopakumar Choorakkot Edakkunni
>>> >     wrote:
>>> >     > >> When I was saying dpdk version, I meant the DPDK version with OVS.
>>> >     >
>>> >     > Oh I see! My apologies for the misunderstanding. The dpdk version used by
>>> >     > the host ovs should be dpdk2.2; the guest process uses dpdk16.07. The OVS
>>> >     > process is not getting restarted; what is getting restarted is the guest
>>> >     > process using dpdk16.07. So the above clarifications you had about virtio
>>> >     > being reset-before-opened on guest restart: do they still hold good, or
>>> >     > does that need the HOST side dpdk to be 16.04 or above?
>>> >
>>> >     Yes, the HOST dpdk should be >= v16.04.
>>> >
>>> >             --yliu
>>> >     >
>>> >     > >> And yes, the fixes are not included in the DPDK required for OVS 2.4.
>>> >     >
>>> >     > Thanks for the info.
>>> >     >
>>> >     > Rgds,
>>> >     > Gopa.
>>> >     >
>>> >     > On Thu, Mar 16, 2017 at 10:13 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
>>> >     > wrote:
>>> >     >
>>> >     >     On Thu, Mar 16, 2017 at 09:56:01PM -0700, Gopakumar Choorakkot Edakkunni
>>> >     >     wrote:
>>> >     >     > Hi Yuanhan,
>>> >     >     >
>>> >     >     > Thanks for the confirmation about not having to do anything special to
>>> >     >     > close the ports on dpdk going down or coming up.
>>> >     >     >
>>> >     >     > As for the question about whether I met any issue of ovs getting stuck:
>>> >     >     > yes, my guest process runs dpdk 16.07 as I mentioned earlier, and if I
>>> >     >     > kill my guest process, then the OVS-dpdk on the host reports a stall!
>>> >     >     > The OVS-dpdk and qemu versions I use are as below. But maybe that is
>>> >     >     > because of the ovs missing the fixes you mentioned?
>>> >     >
>>> >     >     When I was saying dpdk version, I meant the DPDK version with OVS.
>>> >     >
>>> >     >     > ~# ovs-vswitchd --version
>>> >     >     > ovs-vswitchd (Open vSwitch) 2.4.1
>>> >     >
>>> >     >     And yes, the fixes are not included in the DPDK required for OVS 2.4.
>>> >     >
>>> >     >             --yliu
>>> >     >
>>> >     >     > Compiled Nov 14 2016 06:53:31
>>> >     >     > # kvm --version
>>> >     >     > QEMU emulator version 2.2.0, Copyright (c) 2003-2008 Fabrice Bellard
>>> >     >     > ~#
>>> >     >     >
>>> >     >     >
>>> >     >     > Rgds,
>>> >     >     > Gopa.
>>> >     >     >
>>> >     >     > On Thu, Mar 16, 2017 at 9:35 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
>>> >     >     > wrote:
>>> >     >     >
>>> >     >     >     On Thu, Mar 16, 2017 at 07:48:28PM -0700, Gopakumar Choorakkot Edakkunni
>>> >     >     >     wrote:
>>> >     >     >     > Thanks a lot for the response Yuanhan. I am using dpdk v16.07. So what
>>> >     >     >     > you are saying is that in 16.07, we don't really need to call
>>> >     >     >     > rte_eth_dev_close() on exit,
>>> >     >     >
>>> >     >     >     It's not about "don't really need", it's more like "it's hard to". Just
>>> >     >     >     think that it may crash at any time.
>>> >     >     >
>>> >     >     >     > because dpdk will ensure that it will do virtio reset before init when
>>> >     >     >     > it comes up, right?
>>> >     >     >
>>> >     >     >     No, it just handles the abnormal case well when the guest APP restarts.
>>> >     >     >
>>> >     >     >     > Regarding the vhost commits you mentioned - do we still need those fixes
>>> >     >     >     > if we have the "virtio reset before init" mechanism?
>>> >     >     >
>>> >     >     >     Yes, we still need them: just think that some malicious guest may also
>>> >     >     >     forge data like that.
>>> >     >     >
>>> >     >     >     I'm a bit confused then. Have you actually met any issue (like getting
>>> >     >     >     stuck) with DPDK v16.07?
>>> >     >     >
>>> >     >     >             --yliu
>>> >     >     >
>>> >     >     >     > Or is that a separate problem
>>> >     >     >     > altogether (and hence we would need those fixes)?
>>> >     >     >     >
>>> >     >     >     > Rgds,
>>> >     >     >     > Gopa.
>>> >     >     >     >
>>> >     >     >     > On Thu, Mar 16, 2017 at 7:06 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
>>> >     >     >     > wrote:
>>> >     >     >     >
>>> >     >     >     >     On Thu, Mar 16, 2017 at 12:39:16PM -0700, Gopakumar Choorakkot Edakkunni
>>> >     >     >     >     wrote:
>>> >     >     >     >     > So the doc says we should call rte_eth_dev_close() *before* going down.
>>> >     >     >     >     > And I know that especially with dpdk-virtionet in the guest + ovs-dpdk in
>>> >     >     >     >     > the host, the ovs ends up getting stalled/stuck (!!) if I don't close the
>>> >     >     >     >     > port before starting it when the guest dpdk process comes back up.
>>> >     >     >     >
>>> >     >     >     >     I'm assuming you were using an old version, something like dpdk v2.2?
>>> >     >     >     >     IIRC, DPDK v16.04 should have fixed your issue.
>>> >     >     >     >
>>> >     >     >     >     > Considering that this, if not done properly, can screw up the HOST ovs, and I
>>> >     >     >     >     > want to do everything possible to avoid that, I want to be 200% sure that I
>>> >     >     >     >     > call close even if my process gets a kill -9. So obviously the only way of
>>> >     >     >     >     > doing that is to close the port when the dpdk process comes back up and
>>> >     >     >     >     > *before* we init the port. rte_eth_dev_close() is not capable of doing that,
>>> >     >     >     >     > as it expects the port parameters to be initialized etc. before it can be
>>> >     >     >     >     > called.
>>> >     >     >     >
>>> >     >     >     >     We do virtio reset before init, which is basically what rte_eth_dev_close()
>>> >     >     >     >     mainly does. So I see no big issue here.
>>> >     >     >     >
>>> >     >     >     >     The stuck issue is due to the hugepage reset by the guest DPDK application,
>>> >     >     >     >     leading to all virtio vring elements being zeroed. The old vhost doesn't
>>> >     >     >     >     handle it well; as a result, it got stuck. And here are some relevant
>>> >     >     >     >     commits:
>>> >     >     >     >
>>> >     >     >     >         a436f53 vhost: avoid dead loop chain
>>> >     >     >     >         c687b0b vhost: check for ring descriptors overflow
>>> >     >     >     >         623bc47 vhost: do sanity check for ring descriptor length
>>> >     >     >     >
>>> >     >     >     >             --yliu
>>> >     >     >     >
>>> >     >     >     >     > Any other
>>> >     >     >     >     > suggestions on what can be done to close on restart rather than close on
>>> >     >     >     >     > going down? Thought of bouncing this by the alias before I add a version of
>>> >     >     >     >     > close myself that can do this close-on-restart.
>>> >     >     >     >
>>> >     >     >     >
>>> >     >     >
>>> >     >     >
>>> >     >
>>> >     >
>>> >
>>> >
>>>
>>
>>
>
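
The kind of defensive checks named by the vhost commit titles quoted above
(dead loop chain, descriptor overflow, descriptor length) can be sketched in
plain C as follows. This is not the DPDK vhost code: struct toy_desc only
mirrors the split-ring descriptor layout from the virtio specification, and
toy_chain_len() and its min_len parameter are hypothetical names chosen for
illustration. The point is that a ring zeroed by the guest is rejected
instead of being walked blindly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_DESC_F_NEXT 1u  /* same value as VRING_DESC_F_NEXT in the virtio spec */

/* Mirrors the split-ring descriptor layout: addr, len, flags, next. */
struct toy_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

/*
 * Walks the descriptor chain starting at 'head' and returns its total
 * length in bytes, or -1 if the chain looks corrupt. 'min_len' is a
 * hypothetical lower bound on each descriptor's length.
 */
static int64_t toy_chain_len(const struct toy_desc *ring, uint16_t ring_size,
                             uint16_t head, uint32_t min_len)
{
    int64_t total = 0;
    uint32_t seen = 0;
    uint16_t idx = head;

    assert(ring != NULL);
    for (;;) {
        if (idx >= ring_size)           /* index points outside the ring */
            return -1;
        if (++seen > ring_size)         /* longer than the ring: must be a loop */
            return -1;
        if (ring[idx].len < min_len)    /* e.g. a zeroed descriptor has len 0 */
            return -1;
        total += ring[idx].len;
        if (!(ring[idx].flags & TOY_DESC_F_NEXT))
            return total;               /* end of chain */
        idx = ring[idx].next;
    }
}
```

With min_len set to 1, a fully zeroed ring fails the length check on the
first descriptor, a descriptor whose next field points back at itself is
caught by the loop bound, and an out-of-range next index is caught by the
bounds check.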
