qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>,
	Wei Wang <wei.w.wang@intel.com>,
	Leonardo Bras <leobras@redhat.com>
Subject: Re: [PATCH v4 8/8] migration: Add a wrapper to cleanup migration files
Date: Wed, 16 Aug 2023 18:20:58 -0300	[thread overview]
Message-ID: <87edk24rb9.fsf@suse.de> (raw)
In-Reply-To: <ZN0lgcI3Ieksdbv/@x1n>

Peter Xu <peterx@redhat.com> writes:

> On Wed, Aug 16, 2023 at 03:35:24PM -0400, Peter Xu wrote:
>> On Wed, Aug 16, 2023 at 03:47:26PM -0300, Fabiano Rosas wrote:
>> > Peter Xu <peterx@redhat.com> writes:
>> > 
>> > > On Wed, Aug 16, 2023 at 11:25:10AM -0300, Fabiano Rosas wrote:
>> > >> @@ -2003,6 +1980,8 @@ static int open_return_path_on_source(MigrationState *ms)
>> > >>          return -1;
>> > >>      }
>> > >>  
>> > >> +    migration_ioc_register_yank(qemu_file_get_ioc(ms->rp_state.from_dst_file));
>> > >
>> > > I think I didn't really get why it wasn't paired before yesterday.  My
>> > > fault.
>> > >
>> > > Registering from_dst_file, afaict, will register two identical yank objects
>> > > because the ioc is the same.
>> > >
>> > 
>> > Why do we have two QEMUFiles for the same fd again?
>> 
>> Because qemufile has a "direction" (either read / write)?
>> 
>> > 
>> > We're bound to crash at some point by trying to qemu_fclose() the two
>> > QEMUFiles at the same time.
>> 
>> Even with each qemufile holding a reference on the ioc object?  I thought
>> it won't crash, but if it will please point that out; or fix it would be
>> even better.

You're right, it wouldn't crash. But it's still the same ioc object. If
qio_channel_close() is called twice, then we could potentially close the
fd twice. Which would either error out or close a reused fd. The window
is small though, so probably unlikely to ever happen.

>> 
>> > 
>> > > Should we make migration_file_release() not handle the unregister of
>> > > yank(), but leave that to callers?  Then we keep the rule of only register
>> > > yank for each ioc once.
>> > >
>> > 
>> > We need the unregister to be at migration_file_release() so that it
>> > takes benefit of the locking while checking the file for NULL. If it
>> > moves out then the caller will have to do locking as well. Which
>> > defeats the purpose of the patch.
>> > 
>> > I don't understand why you moved the unregister out of channel_close in
>> > commit 39675ffffb ("migration: Move the yank unregister of channel_close
>> > out"). You called it a "hack" at the time, but looking at the current
>> > situation, it seems like a reasonable thing to do: unregister the yank
>> > when the ioc refcount drops to 1.
>> > 
>> > I would go even further and say that qemu_fclose should also avoid
>> > calling qio_channel_close if the ioc refcnt is elevated.
>> 
>> I'd rather not; I still think it's a hack, always open to be corrected.

It's hard to figure out what you mean by hack at times. Even more when
reading a years-old commit message.

>> 
>> I think the problem is yank can register anything so it's separate from
>> iochannels.  If one would like to have ioc close() automatically
>> unregister, then one should also register yank transparently without the
>> ioc user even aware of yank's existance.

Ok, fair point.

>> 
>> Now the condition is the caller register yank itself, then I think the
>> caller should unreg it.. not iochannel itself, silently.

I think the issue is that we're linking the yank with the QEMUFile for
no reason. The migration_yank_iochannel() performs a
qio_channel_shutdown() which is an operation on the fd. The QEMUFile
just happens to hold a pointer to the ioc.

>
> I just noticed this is not really copying the list.. let me add the cc list
> back, assuming it was just forgotten.

I'm sorry, I hit the wrong key while replying.

> One more thing to mention is, now I kind of agree probably we should
> register yank over each qemufile, as you raised the concern in the other
> thread that otherwise qmp_yank() won't set error for the qemufile, which
> seems to be unexpected.

I haven't made up my mind yet, but I think I'd rather stop setting that
error instead of doing it from other places. A shutdown() is mostly a
benign operation intended to end the connection. The fact that we use it
in some cases to kick the thread out of a possible hang doesn't seem
compelling enough to set -EIO.

Of course we currently have no other way to indicate that the file was
shutdown, so the -EIO will have to stay and that's a discussion for
another day.




  reply	other threads:[~2023-08-16 21:21 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-16 14:25 [PATCH v4 0/8] Fix segfault on migration return path Fabiano Rosas
2023-08-16 14:25 ` [PATCH v4 1/8] migration: Fix possible race when setting rp_state.error Fabiano Rosas
2023-08-16 14:25 ` [PATCH v4 2/8] migration: Fix possible races when shutting down the return path Fabiano Rosas
2023-08-16 15:05   ` Peter Xu
2023-08-16 14:25 ` [PATCH v4 3/8] migration: Fix possible race when shutting down to_dst_file Fabiano Rosas
2023-08-16 14:25 ` [PATCH v4 4/8] migration: Remove redundant cleanup of postcopy_qemufile_src Fabiano Rosas
2023-08-16 14:25 ` [PATCH v4 5/8] migration: Consolidate return path closing code Fabiano Rosas
2023-08-16 14:25 ` [PATCH v4 6/8] migration: Replace the return path retry logic Fabiano Rosas
2023-08-16 14:25 ` [PATCH v4 7/8] migration: Move return path cleanup to main migration thread Fabiano Rosas
2023-08-16 14:25 ` [PATCH v4 8/8] migration: Add a wrapper to cleanup migration files Fabiano Rosas
2023-08-16 15:18   ` Peter Xu
     [not found]     ` <87leealt8h.fsf@suse.de>
     [not found]       ` <ZN0k/DFQQIeEpgjl@x1n>
2023-08-16 19:37         ` Peter Xu
2023-08-16 21:20           ` Fabiano Rosas [this message]
2023-08-16 21:55             ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87edk24rb9.fsf@suse.de \
    --to=farosas@suse.de \
    --cc=leobras@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=wei.w.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).