* [Qemu-devel] migration: QEMU aborts migration of "none" machine
@ 2017-01-25 10:01 Ashijeet Acharya
  2017-01-25 10:53 ` Thomas Huth
  0 siblings, 1 reply; 7+ messages in thread
From: Ashijeet Acharya @ 2017-01-25 10:01 UTC (permalink / raw)
  To: Thomas Huth; +Cc: dgilbert@redhat.com, QEMU Developers, Juan Quintela

Hi Thomas,

I was able to reproduce the bug you encountered while trying to migrate a
"none" machine.

After debugging the problem with gdb, I found that bitmap_new() is the
function that crashes QEMU, because it is always called with zero bits.

I had a discussion with Dave on IRC yesterday, and he suggested that the
crash is caused by the absence of RAM (since the machine is "none") while
attempting the migration.

I am not sure what the ideal solution would be in this case. Do any of the
maintainers have suggestions, or even a starting point that I can look
into?

Thanks
Ashijeet
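
A minimal, self-contained sketch of the failure mode described above. It
is a model, not QEMU's code: every model_* name is hypothetical. It
assumes that the underlying "try" allocator returns NULL for a zero-byte
request and that the bitmap constructor treats NULL as fatal. Under those
assumptions, a request for zero bits always ends in an abort.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical allocator: returns NULL for a zero-byte request (assumption). */
static void *model_try_malloc0(size_t len)
{
    return len == 0 ? NULL : calloc(1, len);
}

/* Round nbits up to whole longs, then allocate zeroed storage. */
static unsigned long *model_bitmap_try_new(long nbits)
{
    size_t nlongs = ((size_t)nbits + MODEL_BITS_PER_LONG - 1)
                    / MODEL_BITS_PER_LONG;
    return model_try_malloc0(nlongs * sizeof(unsigned long));
}

/*
 * Reports whether a constructor that aborts on NULL would abort for this
 * size. The real abort is replaced by a boolean so the model can be tested.
 */
static bool model_bitmap_new_would_abort(long nbits)
{
    unsigned long *bitmap = model_bitmap_try_new(nbits);
    bool fatal = (bitmap == NULL);

    free(bitmap);
    return fatal;
}
```

With zero bits the rounded-up length is zero bytes, so the model reports
an abort; any positive bit count allocates at least one long.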


* Re: [Qemu-devel] migration: QEMU aborts migration of "none" machine
  2017-01-25 10:01 [Qemu-devel] migration: QEMU aborts migration of "none" machine Ashijeet Acharya
@ 2017-01-25 10:53 ` Thomas Huth
  2017-01-25 11:07   ` Daniel P. Berrange
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Huth @ 2017-01-25 10:53 UTC (permalink / raw)
  To: Ashijeet Acharya; +Cc: dgilbert@redhat.com, QEMU Developers, Juan Quintela

On 25.01.2017 11:01, Ashijeet Acharya wrote:
> Hi Thomas,
> 
> I was able to reproduce the bug you encountered while trying to migrate
> a "none" machine.
> 
> After digging deep and debugging the problem with gdb, I found that
> bitmap_new() is the function that crashes qemu as it always has
> zero bits being passed to it.
> 
> I had a discussion with Dave on the IRC yesterday and he suggested the
> reason of crash being the absence of RAM ( since machine is "none")
> while attempting the migration.
> 
> I am not sure what the ideal solution will be in this case; Do any of
> the maintainers have any suggestions or even a starting point which I
> can ponder upon?

I'm not that familiar with the migration code either, but maybe you
could simply check for last_ram_offset() == 0 at the beginning of
ram_save_setup() and return with an error in that case?

 Thomas
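
The check suggested above could look roughly like the following sketch.
It is a standalone model, not a patch: model_ram_size,
model_last_ram_offset() and model_ram_save_setup() are hypothetical
stand-ins, and the real ram_save_setup() has a different signature and
error-reporting path. The choice of -EINVAL is also an assumption.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the amount of guest RAM that is registered. */
static uint64_t model_ram_size;

/* Hypothetical stand-in for last_ram_offset(). */
static uint64_t model_last_ram_offset(void)
{
    return model_ram_size;
}

/* Refuse to set up RAM migration when there is no RAM to migrate. */
static int model_ram_save_setup(void)
{
    if (model_last_ram_offset() == 0) {
        fprintf(stderr, "migration: machine has no RAM, refusing to migrate\n");
        return -EINVAL;
    }

    /* Bitmap allocation and the rest of the setup would follow here. */
    return 0;
}
```

The point of checking first is that the zero-sized bitmap allocation is
never reached, so the caller sees an error return instead of an abort.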


* Re: [Qemu-devel] migration: QEMU aborts migration of "none" machine
  2017-01-25 10:53 ` Thomas Huth
@ 2017-01-25 11:07   ` Daniel P. Berrange
  2017-01-25 11:16     ` Juan Quintela
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel P. Berrange @ 2017-01-25 11:07 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Ashijeet Acharya, Juan Quintela, dgilbert@redhat.com,
	QEMU Developers

On Wed, Jan 25, 2017 at 11:53:06AM +0100, Thomas Huth wrote:
> On 25.01.2017 11:01, Ashijeet Acharya wrote:
> > Hi Thomas,
> > 
> > I was able to reproduce the bug you encountered while trying to migrate
> > a "none" machine.
> > 
> > After digging deep and debugging the problem with gdb, I found that
> > bitmap_new() is the function that crashes qemu as it always has
> > zero bits being passed to it.
> > 
> > I had a discussion with Dave on the IRC yesterday and he suggested the
> > reason of crash being the absence of RAM ( since machine is "none")
> > while attempting the migration.
> > 
> > I am not sure what the ideal solution will be in this case; Do any of
> > the maintainers have any suggestions or even a starting point which I
> > can ponder upon?
> 
> I'm not that familiar with the migration code either, but maybe you
> could simply check for last_ram_offset() == 0 at the beginning of
> ram_save_setup() and return with an error in that case?

Do we actually need/want migration to succeed when RAM is 0 or vCPU
count is 0 ?  If not, then we could just register a migration blocker
during machine init

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
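
The blocker idea above can be sketched as follows. This is a toy model
that does not use QEMU's actual migration blocker interface; all model_*
names and the fixed-size table are assumptions made for illustration. The
machine init path records a reason once, and every later migration
attempt is refused with that reason.

```c
#include <stddef.h>
#include <stdio.h>

#define MODEL_MAX_BLOCKERS 8

/* Hypothetical registry of reasons that forbid migration. */
static const char *model_blockers[MODEL_MAX_BLOCKERS];
static size_t model_nblockers;

/* Record a reason; returns -1 if the table is full. */
static int model_add_blocker(const char *reason)
{
    if (model_nblockers == MODEL_MAX_BLOCKERS) {
        return -1;
    }
    model_blockers[model_nblockers++] = reason;
    return 0;
}

/* Machine init: block migration when there is no RAM or no vCPU. */
static void model_machine_init(unsigned long ram_bytes, unsigned int vcpus)
{
    if (ram_bytes == 0 || vcpus == 0) {
        model_add_blocker("machine has no RAM or no vCPUs");
    }
}

/* Migration entry point: fail cleanly if any blocker is registered. */
static int model_migrate_start(void)
{
    if (model_nblockers > 0) {
        fprintf(stderr, "migration blocked: %s\n", model_blockers[0]);
        return -1;
    }
    return 0;
}
```

Compared with a check inside the RAM setup code, this moves the decision
to machine init, so the refusal happens before any migration state is
created.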


* Re: [Qemu-devel] migration: QEMU aborts migration of "none" machine
  2017-01-25 11:07   ` Daniel P. Berrange
@ 2017-01-25 11:16     ` Juan Quintela
  2017-01-25 12:58       ` Ashijeet Acharya
  0 siblings, 1 reply; 7+ messages in thread
From: Juan Quintela @ 2017-01-25 11:16 UTC (permalink / raw)
  To: Daniel P. Berrange
  Cc: Thomas Huth, Ashijeet Acharya, dgilbert@redhat.com,
	QEMU Developers

"Daniel P. Berrange" <berrange@redhat.com> wrote:
> On Wed, Jan 25, 2017 at 11:53:06AM +0100, Thomas Huth wrote:
>> On 25.01.2017 11:01, Ashijeet Acharya wrote:
>> > Hi Thomas,
>> > 
>> > I was able to reproduce the bug you encountered while trying to migrate
>> > a "none" machine.
>> > 
>> > After digging deep and debugging the problem with gdb, I found that
>> > bitmap_new() is the function that crashes qemu as it always has
>> > zero bits being passed to it.
>> > 
>> > I had a discussion with Dave on the IRC yesterday and he suggested the
>> > reason of crash being the absence of RAM ( since machine is "none")
>> > while attempting the migration.
>> > 
>> > I am not sure what the ideal solution will be in this case; Do any of
>> > the maintainers have any suggestions or even a starting point which I
>> > can ponder upon?
>> 
>> I'm not that familiar with the migration code either, but maybe you
>> could simply check for last_ram_offset() == 0 at the beginning of
>> ram_save_setup() and return with an error in that case?
>
> Do we actually need/want migration to succeed when RAM is 0 or vCPU
> count is 0 ?  If not, then we could just register a migration blocker
> during machine init

The only use I can see for that is to test that we have set things up
correctly, but for that we could also use a stopped machine.  So I
don't know if it really makes sense.

If the only problem is that bitmap_new(0) returns 0, we can try to fix
that, no?

Later, Juan.


* Re: [Qemu-devel] migration: QEMU aborts migration of "none" machine
  2017-01-25 11:16     ` Juan Quintela
@ 2017-01-25 12:58       ` Ashijeet Acharya
  2017-01-25 16:10         ` Thomas Huth
  0 siblings, 1 reply; 7+ messages in thread
From: Ashijeet Acharya @ 2017-01-25 12:58 UTC (permalink / raw)
  To: Juan Quintela
  Cc: Daniel P. Berrange, Thomas Huth, dgilbert@redhat.com,
	QEMU Developers

On Wed, Jan 25, 2017 at 4:46 PM, Juan Quintela <quintela@redhat.com> wrote:
> "Daniel P. Berrange" <berrange@redhat.com> wrote:
>> On Wed, Jan 25, 2017 at 11:53:06AM +0100, Thomas Huth wrote:
>>> On 25.01.2017 11:01, Ashijeet Acharya wrote:
>>> > Hi Thomas,
>>> >
>>> > I was able to reproduce the bug you encountered while trying to migrate
>>> > a "none" machine.
>>> >
>>> > After digging deep and debugging the problem with gdb, I found that
>>> > bitmap_new() is the function that crashes qemu as it always has
>>> > zero bits being passed to it.
>>> >
>>> > I had a discussion with Dave on the IRC yesterday and he suggested the
>>> > reason of crash being the absence of RAM ( since machine is "none")
>>> > while attempting the migration.
>>> >
>>> > I am not sure what the ideal solution will be in this case; Do any of
>>> > the maintainers have any suggestions or even a starting point which I
>>> > can ponder upon?
>>>
>>> I'm not that familiar with the migration code either, but maybe you
>>> could simply check for last_ram_offset() == 0 at the beginning of
>>> ram_save_setup() and return with an error in that case?
>>
>> Do we actually need/want migration to succeed when RAM is 0 or vCPU
>> count is 0 ?  If not, then we could just register a migration blocker
>> during machine init
>
> Only use that I can see for that is to test that we have setup things
> correctly, but for that we could also use an stopped machine.  So, I
> don't know if it really makes sense.
>
> If the only problem is that bitmap_new(0) returns 0, we can try to fix
> that, no?

Yes, bitmap_new(0) aborts in this case. The only fix I can see is to
return an error rather than simply crashing.
Adding a migration blocker is another option, although I am not aware of
the use cases of a "none" machine.

Ashijeet
>
> Later, Juan.


* Re: [Qemu-devel] migration: QEMU aborts migration of "none" machine
  2017-01-25 12:58       ` Ashijeet Acharya
@ 2017-01-25 16:10         ` Thomas Huth
  2017-01-25 16:34           ` Ashijeet Acharya
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Huth @ 2017-01-25 16:10 UTC (permalink / raw)
  To: Ashijeet Acharya, Juan Quintela
  Cc: Daniel P. Berrange, dgilbert@redhat.com, QEMU Developers

On 25.01.2017 13:58, Ashijeet Acharya wrote:
> On Wed, Jan 25, 2017 at 4:46 PM, Juan Quintela <quintela@redhat.com> wrote:
>> "Daniel P. Berrange" <berrange@redhat.com> wrote:
>>> On Wed, Jan 25, 2017 at 11:53:06AM +0100, Thomas Huth wrote:
>>>> On 25.01.2017 11:01, Ashijeet Acharya wrote:
>>>>> Hi Thomas,
>>>>>
>>>>> I was able to reproduce the bug you encountered while trying to migrate
>>>>> a "none" machine.
>>>>>
>>>>> After digging deep and debugging the problem with gdb, I found that
>>>>> bitmap_new() is the function that crashes qemu as it always has
>>>>> zero bits being passed to it.
>>>>>
>>>>> I had a discussion with Dave on the IRC yesterday and he suggested the
>>>>> reason of crash being the absence of RAM ( since machine is "none")
>>>>> while attempting the migration.
>>>>>
>>>>> I am not sure what the ideal solution will be in this case; Do any of
>>>>> the maintainers have any suggestions or even a starting point which I
>>>>> can ponder upon?
>>>>
>>>> I'm not that familiar with the migration code either, but maybe you
>>>> could simply check for last_ram_offset() == 0 at the beginning of
>>>> ram_save_setup() and return with an error in that case?
>>>
>>> Do we actually need/want migration to succeed when RAM is 0 or vCPU
>>> count is 0 ?  If not, then we could just register a migration blocker
>>> during machine init
>>
>> Only use that I can see for that is to test that we have setup things
>> correctly, but for that we could also use an stopped machine.  So, I
>> don't know if it really makes sense.
>>
>> If the only problem is that bitmap_new(0) returns 0, we can try to fix
>> that, no?
> 
> Yes, bitmap_new(0) aborts in this case. The only way I see to fix this
> would be to error out rather than simply crashing?
> Adding a migration blocker is one way too; although I am not aware of
> the use cases of a "none" machine.

The "none" machine is - so far - only used for the testing framework
(see the tests/ directory in the QEMU sources). With the upcoming QEMU
2.9, it should also be possible to use it as a simple instruction set
simulation board. But in both cases, we do not need migration for this
machine, i.e. simply printing an error message (and refusing to migrate)
instead of crashing should IMHO be OK here.

 Thomas


* Re: [Qemu-devel] migration: QEMU aborts migration of "none" machine
  2017-01-25 16:10         ` Thomas Huth
@ 2017-01-25 16:34           ` Ashijeet Acharya
  0 siblings, 0 replies; 7+ messages in thread
From: Ashijeet Acharya @ 2017-01-25 16:34 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Juan Quintela, Daniel P. Berrange, dgilbert@redhat.com,
	QEMU Developers

On Wed, Jan 25, 2017 at 9:40 PM, Thomas Huth <thuth@redhat.com> wrote:
> On 25.01.2017 13:58, Ashijeet Acharya wrote:
>> On Wed, Jan 25, 2017 at 4:46 PM, Juan Quintela <quintela@redhat.com> wrote:
>>> "Daniel P. Berrange" <berrange@redhat.com> wrote:
>>>> On Wed, Jan 25, 2017 at 11:53:06AM +0100, Thomas Huth wrote:
>>>>> On 25.01.2017 11:01, Ashijeet Acharya wrote:
>>>>>> Hi Thomas,
>>>>>>
>>>>>> I was able to reproduce the bug you encountered while trying to migrate
>>>>>> a "none" machine.
>>>>>>
>>>>>> After digging deep and debugging the problem with gdb, I found that
>>>>>> bitmap_new() is the function that crashes qemu as it always has
>>>>>> zero bits being passed to it.
>>>>>>
>>>>>> I had a discussion with Dave on the IRC yesterday and he suggested the
>>>>>> reason of crash being the absence of RAM ( since machine is "none")
>>>>>> while attempting the migration.
>>>>>>
>>>>>> I am not sure what the ideal solution will be in this case; Do any of
>>>>>> the maintainers have any suggestions or even a starting point which I
>>>>>> can ponder upon?
>>>>>
>>>>> I'm not that familiar with the migration code either, but maybe you
>>>>> could simply check for last_ram_offset() == 0 at the beginning of
>>>>> ram_save_setup() and return with an error in that case?
>>>>
>>>> Do we actually need/want migration to succeed when RAM is 0 or vCPU
>>>> count is 0 ?  If not, then we could just register a migration blocker
>>>> during machine init
>>>
>>> Only use that I can see for that is to test that we have setup things
>>> correctly, but for that we could also use an stopped machine.  So, I
>>> don't know if it really makes sense.
>>>
>>> If the only problem is that bitmap_new(0) returns 0, we can try to fix
>>> that, no?
>>
>> Yes, bitmap_new(0) aborts in this case. The only way I see to fix this
>> would be to error out rather than simply crashing?
>> Adding a migration blocker is one way too; although I am not aware of
>> the use cases of a "none" machine.
>
> The "none" machine is - so far - only used for the testing framework
> (see the tests/ directory in the QEMU sources). With the upcoming QEMU
> 2.9, it should also be possible to use it as a simple instruction set
> simulation board. But in both cases, we do not need migration for this
> machine, i.e. simply printing an error message (and refusing to migrate)
> instead of crashing should IMHO be OK here.
>

Okay, I will write a patch to print an error message and if more
suggestions come during that time, I will make the changes
accordingly.

Ashijeet
>  Thomas
>

