All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Ohad Ben-Cohen <ohad@wizery.com>
Cc: linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	virtualization@lists.linux-foundation.org
Subject: Re: [RFC] virtio: use mandatory barriers for remote processor vdevs
Date: Thu, 1 Dec 2011 01:43:18 +0200	[thread overview]
Message-ID: <20111130234318.GC31069@redhat.com> (raw)
In-Reply-To: <CAK=WgbbZH3ZFzbwwyVBGEvwHPbogDW6+Kb_1pRam2jj5oJHuDA@mail.gmail.com>

On Thu, Dec 01, 2011 at 01:27:10AM +0200, Ohad Ben-Cohen wrote:
> On Wed, Nov 30, 2011 at 6:24 PM, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> > On Wed, Nov 30, 2011 at 6:15 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> How are the rings mapped? normal memory, right?
> >
> > No, device memory.
> 
> Ok, I have more info.
> 
> Originally remoteproc was mapping the rings using ioremap, and that
> meant ARM Device memory.
> 
> Recently, though, we moved to CMA (allocating memory for the rings via
> dma_alloc_coherent), and that isn't Device memory anymore: it's
> uncacheable Normal memory (on ARM v6+).

And these accesses need to be ordered with DSB? Or DMB?

> We still require mandatory barriers though: one very reproducible
> problem I personally face is that the avail index doesn't get updated
> before the kick.

Aha! The *kick* really is MMIO. So I think we do need a mandatory barrier
before the kick.  Maybe we need it for virtio-pci as well
(not on kvm, naturally :) Off-hand this seems to belong in the transport
layer but need to think about it.

> As a result, the remote processor misses a buffer
> that was just added (the kick wakes it up only to find that the avail
> index wasn't changed yet). In this case, it probably happens because
> the mailbox, used to kick the remote processor, is mapped as Device
> memory, and therefore the kick can be reordered before the updates to
> the ring  can be observed.
> 
> I did get two additional reports about reordering issues, on different
> setups than my own, and which I can't personally reproduce: the one
> I've described earlier (avail index gets updated before the avail
> array) and one in the receive path (reading a used buffer which we
> already read). I couldn't personally verify those, but both issues
> were reported to be gone when mandatory barriers were used.

Hmm. So it's a hint that something is wrong with memory
but not what's wrong exactly.

> I expect those reports only to increase: the diversity of platforms
> that are now looking into adopting virtio for this kind of
> inter-process communication is quite huge, with several different
> architectures and even more hardware implementations on the way (not
> only ARM).
> 
> Thanks,
> Ohad.

Right. We need to be very careful with memory,
it's a tricky field. One known problem with virtio
is its insistance on using native endian-ness
for some fields. If power is used, we'll have to fix this.

-- 
MST

WARNING: multiple messages have this Message-ID (diff)
From: mst@redhat.com (Michael S. Tsirkin)
To: linux-arm-kernel@lists.infradead.org
Subject: [RFC] virtio: use mandatory barriers for remote processor vdevs
Date: Thu, 1 Dec 2011 01:43:18 +0200	[thread overview]
Message-ID: <20111130234318.GC31069@redhat.com> (raw)
In-Reply-To: <CAK=WgbbZH3ZFzbwwyVBGEvwHPbogDW6+Kb_1pRam2jj5oJHuDA@mail.gmail.com>

On Thu, Dec 01, 2011 at 01:27:10AM +0200, Ohad Ben-Cohen wrote:
> On Wed, Nov 30, 2011 at 6:24 PM, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> > On Wed, Nov 30, 2011 at 6:15 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> How are the rings mapped? normal memory, right?
> >
> > No, device memory.
> 
> Ok, I have more info.
> 
> Originally remoteproc was mapping the rings using ioremap, and that
> meant ARM Device memory.
> 
> Recently, though, we moved to CMA (allocating memory for the rings via
> dma_alloc_coherent), and that isn't Device memory anymore: it's
> uncacheable Normal memory (on ARM v6+).

And these accesses need to be ordered with DSB? Or DMB?

> We still require mandatory barriers though: one very reproducible
> problem I personally face is that the avail index doesn't get updated
> before the kick.

Aha! The *kick* really is MMIO. So I think we do need a mandatory barrier
before the kick.  Maybe we need it for virtio-pci as well
(not on kvm, naturally :) Off-hand this seems to belong in the transport
layer but need to think about it.

> As a result, the remote processor misses a buffer
> that was just added (the kick wakes it up only to find that the avail
> index wasn't changed yet). In this case, it probably happens because
> the mailbox, used to kick the remote processor, is mapped as Device
> memory, and therefore the kick can be reordered before the updates to
> the ring  can be observed.
> 
> I did get two additional reports about reordering issues, on different
> setups than my own, and which I can't personally reproduce: the one
> I've described earlier (avail index gets updated before the avail
> array) and one in the receive path (reading a used buffer which we
> already read). I couldn't personally verify those, but both issues
> were reported to be gone when mandatory barriers were used.

Hmm. So it's a hint that something is wrong with memory
but not what's wrong exactly.

> I expect those reports only to increase: the diversity of platforms
> that are now looking into adopting virtio for this kind of
> inter-process communication is quite huge, with several different
> architectures and even more hardware implementations on the way (not
> only ARM).
> 
> Thanks,
> Ohad.

Right. We need to be very careful with memory,
it's a tricky field. One known problem with virtio
is its insistance on using native endian-ness
for some fields. If power is used, we'll have to fix this.

-- 
MST

WARNING: multiple messages have this Message-ID (diff)
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Ohad Ben-Cohen <ohad@wizery.com>
Cc: virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	Rusty Russell <rusty@rustcorp.com.au>
Subject: Re: [RFC] virtio: use mandatory barriers for remote processor vdevs
Date: Thu, 1 Dec 2011 01:43:18 +0200	[thread overview]
Message-ID: <20111130234318.GC31069@redhat.com> (raw)
In-Reply-To: <CAK=WgbbZH3ZFzbwwyVBGEvwHPbogDW6+Kb_1pRam2jj5oJHuDA@mail.gmail.com>

On Thu, Dec 01, 2011 at 01:27:10AM +0200, Ohad Ben-Cohen wrote:
> On Wed, Nov 30, 2011 at 6:24 PM, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> > On Wed, Nov 30, 2011 at 6:15 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> How are the rings mapped? normal memory, right?
> >
> > No, device memory.
> 
> Ok, I have more info.
> 
> Originally remoteproc was mapping the rings using ioremap, and that
> meant ARM Device memory.
> 
> Recently, though, we moved to CMA (allocating memory for the rings via
> dma_alloc_coherent), and that isn't Device memory anymore: it's
> uncacheable Normal memory (on ARM v6+).

And these accesses need to be ordered with DSB? Or DMB?

> We still require mandatory barriers though: one very reproducible
> problem I personally face is that the avail index doesn't get updated
> before the kick.

Aha! The *kick* really is MMIO. So I think we do need a mandatory barrier
before the kick.  Maybe we need it for virtio-pci as well
(not on kvm, naturally :) Off-hand this seems to belong in the transport
layer but need to think about it.

> As a result, the remote processor misses a buffer
> that was just added (the kick wakes it up only to find that the avail
> index wasn't changed yet). In this case, it probably happens because
> the mailbox, used to kick the remote processor, is mapped as Device
> memory, and therefore the kick can be reordered before the updates to
> the ring  can be observed.
> 
> I did get two additional reports about reordering issues, on different
> setups than my own, and which I can't personally reproduce: the one
> I've described earlier (avail index gets updated before the avail
> array) and one in the receive path (reading a used buffer which we
> already read). I couldn't personally verify those, but both issues
> were reported to be gone when mandatory barriers were used.

Hmm. So it's a hint that something is wrong with memory
but not what's wrong exactly.

> I expect those reports only to increase: the diversity of platforms
> that are now looking into adopting virtio for this kind of
> inter-process communication is quite huge, with several different
> architectures and even more hardware implementations on the way (not
> only ARM).
> 
> Thanks,
> Ohad.

Right. We need to be very careful with memory,
it's a tricky field. One known problem with virtio
is its insistance on using native endian-ness
for some fields. If power is used, we'll have to fix this.

-- 
MST

  reply	other threads:[~2011-11-30 23:43 UTC|newest]

Thread overview: 111+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-29 12:31 [RFC] virtio: use mandatory barriers for remote processor vdevs Ohad Ben-Cohen
2011-11-29 12:31 ` Ohad Ben-Cohen
2011-11-29 13:11 ` Michael S. Tsirkin
2011-11-29 13:11   ` Michael S. Tsirkin
2011-11-29 13:11   ` Michael S. Tsirkin
2011-11-29 13:57   ` Ohad Ben-Cohen
2011-11-29 13:57     ` Ohad Ben-Cohen
2011-11-29 13:57     ` Ohad Ben-Cohen
2011-11-29 15:16     ` Michael S. Tsirkin
2011-11-29 15:16       ` Michael S. Tsirkin
2011-11-29 15:16       ` Michael S. Tsirkin
2011-11-30 11:45       ` Ohad Ben-Cohen
2011-11-30 11:45         ` Ohad Ben-Cohen
2011-11-30 11:45         ` Ohad Ben-Cohen
2011-11-30 14:59         ` Michael S. Tsirkin
2011-11-30 14:59           ` Michael S. Tsirkin
2011-11-30 14:59           ` Michael S. Tsirkin
2011-11-30 16:04           ` Ohad Ben-Cohen
2011-11-30 16:04             ` Ohad Ben-Cohen
2011-11-30 16:15             ` Michael S. Tsirkin
2011-11-30 16:15               ` Michael S. Tsirkin
2011-11-30 16:15               ` Michael S. Tsirkin
2011-11-30 16:24               ` Ohad Ben-Cohen
2011-11-30 16:24                 ` Ohad Ben-Cohen
2011-11-30 16:24                 ` Ohad Ben-Cohen
2011-11-30 23:27                 ` Ohad Ben-Cohen
2011-11-30 23:27                   ` Ohad Ben-Cohen
2011-11-30 23:27                   ` Ohad Ben-Cohen
2011-11-30 23:43                   ` Michael S. Tsirkin [this message]
2011-11-30 23:43                     ` Michael S. Tsirkin
2011-11-30 23:43                     ` Michael S. Tsirkin
2011-12-01  6:20                     ` Ohad Ben-Cohen
2011-12-01  6:20                       ` Ohad Ben-Cohen
2011-12-01  6:20                       ` Ohad Ben-Cohen
2011-11-30 16:04           ` Ohad Ben-Cohen
2011-11-29 15:19     ` Michael S. Tsirkin
2011-11-29 15:19       ` Michael S. Tsirkin
2011-11-29 15:19       ` Michael S. Tsirkin
2011-11-30 11:55       ` Ohad Ben-Cohen
2011-11-30 11:55         ` Ohad Ben-Cohen
2011-11-30 11:55         ` Ohad Ben-Cohen
2011-11-30 14:50         ` Michael S. Tsirkin
2011-11-30 14:50           ` Michael S. Tsirkin
2011-11-30 14:50           ` Michael S. Tsirkin
2011-11-30 22:43           ` Ohad Ben-Cohen
2011-11-30 22:43             ` Ohad Ben-Cohen
2011-11-30 22:43             ` Ohad Ben-Cohen
2011-11-30 23:13             ` Michael S. Tsirkin
2011-11-30 23:13               ` Michael S. Tsirkin
2011-11-30 23:13               ` Michael S. Tsirkin
2011-12-01  2:28               ` Rusty Russell
2011-12-01  2:28                 ` Rusty Russell
2011-12-01  2:28                 ` Rusty Russell
2011-12-01  7:15                 ` Ohad Ben-Cohen
2011-12-01  7:15                   ` Ohad Ben-Cohen
2011-12-01  7:15                   ` Ohad Ben-Cohen
2011-12-01  8:12                 ` Michael S. Tsirkin
2011-12-01  8:12                   ` Michael S. Tsirkin
2011-12-01  8:12                   ` Michael S. Tsirkin
2011-12-02  0:26                   ` Rusty Russell
2011-12-02  0:26                     ` Rusty Russell
2011-12-02  0:26                     ` Rusty Russell
2011-12-01  6:14               ` Ohad Ben-Cohen
2011-12-01  6:14                 ` Ohad Ben-Cohen
2011-12-01  6:14                 ` Ohad Ben-Cohen
2011-12-01  9:09                 ` Michael S. Tsirkin
2011-12-01  9:09                   ` Michael S. Tsirkin
2011-12-01  9:09                   ` Michael S. Tsirkin
2011-12-02 23:09 ` Benjamin Herrenschmidt
2011-12-02 23:09   ` Benjamin Herrenschmidt
2011-12-02 23:09   ` Benjamin Herrenschmidt
2011-12-03  5:14   ` Rusty Russell
2011-12-03  5:14     ` Rusty Russell
2011-12-03  5:14     ` Rusty Russell
2011-12-11 12:25     ` Michael S. Tsirkin
2011-12-11 12:25       ` Michael S. Tsirkin
2011-12-11 12:25       ` Michael S. Tsirkin
2011-12-11 22:27       ` Benjamin Herrenschmidt
2011-12-11 22:27         ` Benjamin Herrenschmidt
2011-12-11 22:27         ` Benjamin Herrenschmidt
2011-12-12  3:06         ` Amos Kong
2011-12-12  3:06           ` Amos Kong
2011-12-12  3:06           ` Amos Kong
2011-12-12  5:12           ` Rusty Russell
2011-12-12  5:12             ` Rusty Russell
2011-12-12  5:12             ` Rusty Russell
2011-12-12 23:56             ` Amos Kong
2011-12-12 23:56               ` Amos Kong
2011-12-12 23:56               ` Amos Kong
2011-12-19  2:35               ` Rusty Russell
2011-12-19  2:35                 ` Rusty Russell
2011-12-19  2:35                 ` Rusty Russell
2011-12-19  2:19             ` Amos Kong
2011-12-19  2:19               ` Amos Kong
2011-12-19  2:19               ` Amos Kong
2011-12-19  2:41               ` Benjamin Herrenschmidt
2011-12-19  2:41                 ` Benjamin Herrenschmidt
2011-12-19  2:41                 ` Benjamin Herrenschmidt
2011-12-19  7:21                 ` Amos Kong
2011-12-19  7:21                   ` Amos Kong
2011-12-19  7:21                   ` Amos Kong
2011-12-19  2:50               ` Amos Kong
2011-12-19  2:50                 ` Amos Kong
2011-12-19  2:50                 ` Amos Kong
2011-12-19  8:37                 ` Rusty Russell
2011-12-19  8:37                   ` Rusty Russell
2011-12-19  8:37                   ` Rusty Russell
2011-12-03  6:01   ` Ohad Ben-Cohen
2011-12-03  6:01     ` Ohad Ben-Cohen
2011-12-03  6:01     ` Ohad Ben-Cohen
  -- strict thread matches above, loose matches on Subject: below --
2011-11-29 12:31 Ohad Ben-Cohen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111130234318.GC31069@redhat.com \
    --to=mst@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ohad@wizery.com \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.