All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Babis Chalios <bchalios@amazon.es>
Cc: virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org, "Cali,
	Marco" <xmarcalx@amazon.co.uk>, "Graf (AWS),
	Alexander" <graf@amazon.de>,
	"Jason A. Donenfeld" <Jason@zx2c4.com>,
	aams@amazon.de
Subject: [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
Date: Thu, 2 Nov 2023 07:25:10 -0400	[thread overview]
Message-ID: <20231102072055-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <c0c5b159-997f-4ddf-ab3e-1c749ad93e10@amazon.es>

On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> 
> 
> On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > 
> > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > However, this just
> > > > > > > > > > decreases
> > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > leak event happens it
> > > > > > > > > > might not
> > > > > > > > > > find any buffers to use:
> > > > > > > > > > 
> > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 3. poll queue X
> > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > use buffers in X
> > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > use buffers in 1-X
> > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > >                                      <- leak event 3 (it
> > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > 9. goto 3
> > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > actually add the
> > > > > > > > buffers in X and make
> > > > > > > > them visible to the device?
> > > > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > 
> > > > > So the scenario I have in mind is the following:
> > > > > 
> > > > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > > > maintained by random.c
> > > > > that changes every time a leak event happens).
> > > > > 
> > > > > 1. add buffers to 1-X
> > > > > 2. add buffers to X
> > > > > 3. poll queue X
> > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > 5. Device: First snapshot, uses buffers in X
> > > > > 6. vcpu 1: sees used buffers
> > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > > > has not yet finished adding new buffers).
> > > > > 10. vcpu 1 adds new buffer in X
> > > > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > > > 
> > > > > 
> > > > > In this succession of events, when the third snapshot will happen, the
> > > > > device won't find
> > > > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > > > any entropy
> > > > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > > > something?
> > > > > 
> > > > > Cheers,
> > > > > Babis
> > > > > 
> > > > Yes but notice how this is followed by:
> > > > 
> > > > 12. vcpu 1: sees used buffers in 1-X
> > > > 
> > > > Driver can notify getrandom I guess?
> > > It could, but then we have the exact race condition that VMGENID had,
> > > userspace has already consumed stale entropy and there's nothing we
> > > can do about that.
> > > 
> > > Although this is indeed a corner case, it feels like it beats the purpose
> > > of having the hardware update directly userspace (via copy on leak).
> > > 
> > > How do you feel about the proposal a couple of emails back? It looks to
> > > me that it avoids completely the race condition.
> > > 
> > > Cheers,
> > > Babis
> > It does. The problem of course is that this means that e.g.
> > taking a snapshot of a guest that is stuck won't work well.
> 
> That is true, but does it matter? The intention of the proposal
> is that if it is not safe to take snapshots (i.e. no buffers in the
> queue) don't take snapshots.

OK. Basically I think if there's a way for device to detect that
guest is stuck and not refilling the queue in a timely
manner, then we are ok - host will make its own decisions
on whether to snapshot or not.

However, I feel in that case we need a way to create a big
backlog of buffers for guest to fill such that this
ring empty condition is very unlikely.
One or even 2 queues does not seem enough then.

For example, I can see a "stop" command that will
tell device: "stop consuming buffers" and device
will stop consuming buffers until the next leak event.




> > I have been thinking of adding MAP/UNMAP descriptors for
> > a while now. Thus it will be possible to modify
> > userspace memory without consuming buffers.
> > Would something like this solve the problem?
> 
> I am not familiar with MAP/UNMAP descriptors. Is there
> a link where I can read about them?
> 
> Cheers,
> Babis


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


WARNING: multiple messages have this Message-ID (diff)
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Babis Chalios <bchalios@amazon.es>
Cc: virtio-comment@lists.oasis-open.org,
	virtio-dev@lists.oasis-open.org, "Cali,
	Marco" <xmarcalx@amazon.co.uk>, "Graf (AWS),
	Alexander" <graf@amazon.de>,
	"Jason A. Donenfeld" <Jason@zx2c4.com>,
	aams@amazon.de
Subject: Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
Date: Thu, 2 Nov 2023 07:25:10 -0400	[thread overview]
Message-ID: <20231102072055-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <c0c5b159-997f-4ddf-ab3e-1c749ad93e10@amazon.es>

On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> 
> 
> On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > 
> > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > However, this just
> > > > > > > > > > decreases
> > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > leak event happens it
> > > > > > > > > > might not
> > > > > > > > > > find any buffers to use:
> > > > > > > > > > 
> > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 3. poll queue X
> > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > use buffers in X
> > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > use buffers in 1-X
> > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > >                                      <- leak event 3 (it
> > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > 9. goto 3
> > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > actually add the
> > > > > > > > buffers in X and make
> > > > > > > > them visible to the device?
> > > > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > 
> > > > > So the scenario I have in mind is the following:
> > > > > 
> > > > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > > > maintained by random.c
> > > > > that changes every time a leak event happens).
> > > > > 
> > > > > 1. add buffers to 1-X
> > > > > 2. add buffers to X
> > > > > 3. poll queue X
> > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > 5. Device: First snapshot, uses buffers in X
> > > > > 6. vcpu 1: sees used buffers
> > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > > > has not yet finished adding new buffers).
> > > > > 10. vcpu 1 adds new buffer in X
> > > > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > > > 
> > > > > 
> > > > > In this succession of events, when the third snapshot will happen, the
> > > > > device won't find
> > > > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > > > any entropy
> > > > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > > > something?
> > > > > 
> > > > > Cheers,
> > > > > Babis
> > > > > 
> > > > Yes but notice how this is followed by:
> > > > 
> > > > 12. vcpu 1: sees used buffers in 1-X
> > > > 
> > > > Driver can notify getrandom I guess?
> > > It could, but then we have the exact race condition that VMGENID had,
> > > userspace has already consumed stale entropy and there's nothing we
> > > can do about that.
> > > 
> > > Although this is indeed a corner case, it feels like it beats the purpose
> > > of having the hardware update directly userspace (via copy on leak).
> > > 
> > > How do you feel about the proposal a couple of emails back? It looks to
> > > me that it avoids completely the race condition.
> > > 
> > > Cheers,
> > > Babis
> > It does. The problem of course is that this means that e.g.
> > taking a snapshot of a guest that is stuck won't work well.
> 
> That is true, but does it matter? The intention of the proposal
> is that if it is not safe to take snapshots (i.e. no buffers in the
> queue) don't take snapshots.

OK. Basically I think if there's a way for device to detect that
guest is stuck and not refilling the queue in a timely
manner, then we are ok - host will make its own decisions
on whether to snapshot or not.

However, I feel in that case we need a way to create a big
backlog of buffers for guest to fill such that this
ring empty condition is very unlikely.
One or even 2 queues does not seem enough then.

For example, I can see a "stop" command that will
tell device: "stop consuming buffers" and device
will stop consuming buffers until the next leak event.




> > I have been thinking of adding MAP/UNMAP descriptors for
> > a while now. Thus it will be possible to modify
> > userspace memory without consuming buffers.
> > Would something like this solve the problem?
> 
> I am not familiar with MAP/UNMAP descriptors. Is there
> a link where I can read about them?
> 
> Cheers,
> Babis


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


  parent reply	other threads:[~2023-11-02 11:25 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-21 16:30 [virtio-comment] [PATCH RFC 0/3] virtio-rng based entropy leak reporting Michael S. Tsirkin
2022-11-21 16:30 ` [virtio-comment] [PATCH RFC 1/3] rng: move to a file of its own Michael S. Tsirkin
2022-11-21 16:30 ` [virtio-comment] [PATCH RFC 2/3] rng: be specific about the virtqueue Michael S. Tsirkin
2022-11-21 16:30 ` [virtio-dev] [PATCH RFC 3/3] rng: leak detection support Michael S. Tsirkin
2022-11-25 12:41   ` [virtio-dev] " Babis Chalios
2022-12-12 10:10     ` Babis Chalios
2023-01-11 13:57   ` Babis Chalios
2023-08-31 10:16   ` [virtio-dev] " Babis Chalios
2023-09-12 21:05     ` [virtio-comment] " Michael S. Tsirkin
2023-09-12 21:05       ` Michael S. Tsirkin
2023-09-13  9:32       ` Babis Chalios
2023-09-13  9:37         ` [virtio-comment] " Michael S. Tsirkin
2023-09-13  9:37           ` Michael S. Tsirkin
2023-09-13 11:19           ` Babis Chalios
2023-09-18 11:14             ` Babis Chalios
2023-09-18 12:41             ` [virtio-comment] " Michael S. Tsirkin
2023-09-18 12:41               ` Michael S. Tsirkin
2023-09-18 13:00               ` Babis Chalios
2023-09-18 13:58                 ` [virtio-comment] " Michael S. Tsirkin
2023-09-18 13:58                   ` Michael S. Tsirkin
2023-09-18 14:02                   ` Babis Chalios
2023-09-18 14:05                     ` [virtio-comment] " Michael S. Tsirkin
2023-09-18 14:05                       ` Michael S. Tsirkin
2023-09-18 16:30                       ` Babis Chalios
2023-09-19  7:32                         ` Babis Chalios
2023-09-19 10:01                           ` [virtio-comment] " Michael S. Tsirkin
2023-09-19 10:01                             ` Michael S. Tsirkin
2023-09-19 10:11                             ` Babis Chalios
2023-09-22 12:30                               ` Babis Chalios
2023-09-22 15:06                               ` [virtio-comment] " Michael S. Tsirkin
2023-09-22 15:06                                 ` Michael S. Tsirkin
2023-09-22 15:40                                 ` Babis Chalios
2023-09-22 16:01                                   ` [virtio-comment] " Michael S. Tsirkin
2023-09-22 16:01                                     ` Michael S. Tsirkin
2023-09-27 10:43                                     ` Babis Chalios
2023-09-27 21:47                                       ` [virtio-comment] " Michael S. Tsirkin
2023-09-27 21:47                                         ` Michael S. Tsirkin
2023-09-28 18:16                                         ` Babis Chalios
2023-10-13  7:49                                           ` Babis Chalios
2023-10-13 13:38                                             ` [virtio-comment] " Michael S. Tsirkin
2023-10-13 13:38                                               ` Michael S. Tsirkin
2023-11-02 11:20                                           ` [virtio-comment] " Michael S. Tsirkin
2023-11-02 11:20                                             ` Michael S. Tsirkin
2023-11-02 11:38                                             ` Babis Chalios
2023-11-02 11:51                                               ` [virtio-comment] " Michael S. Tsirkin
2023-11-02 11:51                                                 ` Michael S. Tsirkin
2023-11-02 13:42                                                 ` Babis Chalios
2023-11-02 11:25                                   ` Michael S. Tsirkin [this message]
2023-11-02 11:25                                     ` Michael S. Tsirkin
2023-11-02 11:51                                     ` Babis Chalios
2023-01-12  7:02 ` [virtio-dev] Re: [PATCH RFC 0/3] virtio-rng based entropy leak reporting Michael S. Tsirkin
2023-01-16 11:39   ` Babis Chalios
     [not found]     ` <CAHmME9ry2fss2gsbPs2zVJkY=8Cdeae0XFD9FzCVnW67Xy3thA@mail.gmail.com>
2023-01-16 18:11       ` [virtio-comment] " Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231102072055-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=Jason@zx2c4.com \
    --cc=aams@amazon.de \
    --cc=bchalios@amazon.es \
    --cc=graf@amazon.de \
    --cc=virtio-comment@lists.oasis-open.org \
    --cc=virtio-dev@lists.oasis-open.org \
    --cc=xmarcalx@amazon.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.