* [Qemu-devel] NBD server for QEMU images
@ 2006-12-12 12:48 Salvador Fandiño
2006-12-12 13:37 ` Martin Guy
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: Salvador Fandiño @ 2006-12-12 12:48 UTC (permalink / raw)
To: qemu-devel
Hi,
The patch available from http://qemu-forum.ipi.fi/viewtopic.php?t=2718 adds a new utility, qemu-nbds, that implements an NBD server (see http://nbd.sf.net) for QEMU images.
Using this utility, it is possible to mount images in any format supported by QEMU.
Unfortunately, only read access works (locally) due to a limitation in the Linux kernel :-(
BTW, only tested on Linux!
Regards,
- Salvador
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Qemu-devel] NBD server for QEMU images
2006-12-12 12:48 [Qemu-devel] NBD server for QEMU images Salvador Fandiño
@ 2006-12-12 13:37 ` Martin Guy
2006-12-12 17:00 ` [Qemu-devel] " Salvador Fandino
2006-12-12 15:09 ` Anthony Liguori
2006-12-13 16:58 ` [Qemu-devel] " Mulyadi Santosa
2 siblings, 1 reply; 32+ messages in thread
From: Martin Guy @ 2006-12-12 13:37 UTC (permalink / raw)
To: qemu-devel
> The patch available from http://qemu-forum.ipi.fi/viewtopic.php?t=2718
> adds a new utility, qemu-nbds, that implements a NBD server
I have been using nbd volumes mounted from inside qemu for filestore
and for swap, both read-write, served from files and from partitions,
with the unmodified standard nbd-server (debian testing version) for
intensive work and it has been faster and more reliable than NFS (not
that that's saying much).
The only thing that doesn't work is the -swap option, which just
hangs, but that proves not to be necessary when swapping onto nbd host
volume from qemu-land, even when stress-testing it.
What problem is solved by a specially modified nbd server?
M
^ permalink raw reply [flat|nested] 32+ messages in thread
* [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 13:37 ` Martin Guy
@ 2006-12-12 17:00 ` Salvador Fandino
2006-12-12 16:58 ` Paul Brook
0 siblings, 1 reply; 32+ messages in thread
From: Salvador Fandino @ 2006-12-12 17:00 UTC (permalink / raw)
To: qemu-devel
Martin Guy wrote:
>> The patch available from http://qemu-forum.ipi.fi/viewtopic.php?t=2718
>> adds a new utility, qemu-nbds, that implements a NBD server
>
> I have been using nbd volumes mounted from inside qemu for filestore
> and for swap, both read-write, served from files and from partitions,
> with the unmodified standard nbd-server (debian testing version) for
> intensive work and it has been faster and more reliable than NFS (not
> that that's saying much).
>
> The only thing that doesn't work is the -swap option, which just
> hangs, but that proves not to be necessary when swapping onto nbd host
> volume from qemu-land, even when stress-testing it.
>
> What problem is solved by a specially modified nbd server?
It serves disk images in any format QEMU can handle, for instance, qcow
images.
It's mostly intended to be used for accessing the files inside QEMU disk
images locally, without having to launch a virtual machine and access
them from there.
For instance, if you use QEMU to run Windows, and at some point you need
to get some file from your emulated windows disk, you can do it as follows:
$ qemu-nbds windows.qcow -p 8001 -o 32256 &
# modprobe nbd
# nbd-client localhost 8001 /dev/nbd0
# mount -o ro /dev/nbd0 /mnt/windows
$ cp /mnt/windows/FOO.txt ~/
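The 32256 passed to -o in the example above is presumably the byte offset of the first partition: 63 sectors of 512 bytes, the traditional DOS layout. A minimal sketch of that arithmetic in C; the helper name is hypothetical and not part of qemu-nbds, and the 63-sector start is an assumption about the image:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of qemu-nbds): convert a partition's
 * starting sector (LBA) into the byte offset to pass to "-o". */
uint64_t partition_byte_offset(uint32_t start_lba, uint32_t sector_size)
{
    assert(sector_size > 0);   /* 512 for typical PC disk images */
    return (uint64_t)start_lba * sector_size;
}
```

With start_lba = 63 and sector_size = 512 this gives 32256, matching the command line above. For other images, the start sector reported by fdisk can be fed in instead.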
Cheers,
- Salva
Cheers,
- Salvador.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 17:00 ` [Qemu-devel] " Salvador Fandino
@ 2006-12-12 16:58 ` Paul Brook
2006-12-12 17:13 ` Daniel Jacobowitz
0 siblings, 1 reply; 32+ messages in thread
From: Paul Brook @ 2006-12-12 16:58 UTC (permalink / raw)
To: qemu-devel; +Cc: Salvador Fandino
On Tuesday 12 December 2006 17:00, Salvador Fandino wrote:
> Martin Guy wrote:
> >> The patch available from http://qemu-forum.ipi.fi/viewtopic.php?t=2718
> >> adds a new utility, qemu-nbds, that implements a NBD server
> >
> > I have been using nbd volumes mounted from inside qemu for filestore
> > and for swap, both read-write, served from files and from partitions,
> > with the unmodified standard nbd-server (debian testing version) for
> > intensive work and it has been faster and more reliable than NFS (not
> > that that's saying much).
> >
> > The only thing that doesn't work is the -swap option, which just
> > hangs, but that proves not to be necessary when swapping onto nbd host
> > volume from qemu-land, even when stress-testing it.
> >
> > What problem is solved by a specially modified nbd server?
>
> It serves disk images in any format QEMU can handle, for instance, qcow
> images.
>
> It's mostly intended to be used for accessing the files inside QEMU disk
> images locally, without having to launch a virtual machine and accessing
> them from there.
mount -o loop does this.
Paul
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 16:58 ` Paul Brook
@ 2006-12-12 17:13 ` Daniel Jacobowitz
2006-12-12 17:33 ` RE : " Sylvain Petreolle
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: Daniel Jacobowitz @ 2006-12-12 17:13 UTC (permalink / raw)
To: qemu-devel; +Cc: Salvador Fandino
On Tue, Dec 12, 2006 at 04:58:32PM +0000, Paul Brook wrote:
> On Tuesday 12 December 2006 17:00, Salvador Fandino wrote:
> > It serves disk images in any format QEMU can handle, for instance, qcow
> > images.
> >
> > It's mostly intended to be used for accessing the files inside QEMU disk
> > images locally, without having to launch a virtual machine and accessing
> > them from there.
>
> mount -o loop does this.
How is everybody missing the point? :-) mount -o loop doesn't mount
qcow images.
--
Daniel Jacobowitz
CodeSourcery
^ permalink raw reply [flat|nested] 32+ messages in thread
* RE : Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 17:13 ` Daniel Jacobowitz
@ 2006-12-12 17:33 ` Sylvain Petreolle
2006-12-12 17:39 ` Paul Brook
` (3 more replies)
2006-12-12 17:45 ` [Qemu-devel] " Mark Williamson
2006-12-12 19:30 ` Christian MICHON
2 siblings, 4 replies; 32+ messages in thread
From: Sylvain Petreolle @ 2006-12-12 17:33 UTC (permalink / raw)
To: qemu-devel
> > > It's mostly intended to be used for accessing the files inside QEMU disk
> > > images locally, without having to launch a virtual machine and accessing
> > > them from there.
> >
> > mount -o loop does this.
>
> How is everybody missing the point? :-) mount -o loop doesn't mount
> qcow images.
>
Would it be that difficult to write a qcow fs module?
Kind regards,
Sylvain Petreolle (aka Usurp)
--- --- --- --- --- --- --- --- --- --- --- --- ---
Run your favorite Windows apps with free ReactOS : http://www.reactos.org
Listen to non-DRMised Music: http://www.jamendo.com
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 17:33 ` RE : " Sylvain Petreolle
@ 2006-12-12 17:39 ` Paul Brook
2006-12-12 18:54 ` Anthony Liguori
2006-12-12 17:41 ` RE : " Johannes Schindelin
` (2 subsequent siblings)
3 siblings, 1 reply; 32+ messages in thread
From: Paul Brook @ 2006-12-12 17:39 UTC (permalink / raw)
To: qemu-devel, spetreolle
> > > mount -o loop does this.
> >
> > How is everybody missing the point? :-) mount -o loop doesn't mount
> > qcow images.
>
> Would it be that difficult to write a qcow fs module?
qcow is an image format, not a filesystem.
I'd guess it should be possible to use the device-mapper framework to do this.
I've no idea how hard this is in practice.
Paul
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: RE : Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 17:33 ` RE : " Sylvain Petreolle
2006-12-12 17:39 ` Paul Brook
@ 2006-12-12 17:41 ` Johannes Schindelin
2006-12-12 17:42 ` Daniel Jacobowitz
2006-12-12 19:00 ` Salvador Fandino
3 siblings, 0 replies; 32+ messages in thread
From: Johannes Schindelin @ 2006-12-12 17:41 UTC (permalink / raw)
To: Sylvain Petreolle; +Cc: qemu-devel
Hi,
On Tue, 12 Dec 2006, Sylvain Petreolle wrote:
> > > > It's mostly intended to be used for accessing the files inside QEMU disk
> > > > images locally, without having to launch a virtual machine and accessing
> > > > them from there.
> > >
> > > mount -o loop does this.
> >
> > How is everybody missing the point? :-) mount -o loop doesn't mount
> > qcow images.
> >
> Would it be that difficult to write a qcow fs module?
It would be _more_ difficult. Although I would have done it as a FUSE
module, just to learn how to do it.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: RE : Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 17:33 ` RE : " Sylvain Petreolle
2006-12-12 17:39 ` Paul Brook
2006-12-12 17:41 ` RE : " Johannes Schindelin
@ 2006-12-12 17:42 ` Daniel Jacobowitz
2006-12-12 18:41 ` [Qemu-devel] Re: RE : " Salvador Fandino
2006-12-12 19:00 ` Salvador Fandino
3 siblings, 1 reply; 32+ messages in thread
From: Daniel Jacobowitz @ 2006-12-12 17:42 UTC (permalink / raw)
To: spetreolle, qemu-devel
On Tue, Dec 12, 2006 at 06:33:22PM +0100, Sylvain Petreolle wrote:
> > > > It's mostly intended to be used for accessing the files inside QEMU disk
> > > > images locally, without having to launch a virtual machine and accessing
> > > > them from there.
> > >
> > > mount -o loop does this.
> >
> > How is everybody missing the point? :-) mount -o loop doesn't mount
> > qcow images.
> >
> Would it be that difficult to write a qcow fs module?
Probably not, but I think using nbd for it is much nicer. I think
there would be trouble with partitionable devices, though.
--
Daniel Jacobowitz
CodeSourcery
^ permalink raw reply [flat|nested] 32+ messages in thread
* [Qemu-devel] Re: RE : Re: Re: NBD server for QEMU images
2006-12-12 17:42 ` Daniel Jacobowitz
@ 2006-12-12 18:41 ` Salvador Fandino
2006-12-13 12:23 ` Jan Marten Simons
0 siblings, 1 reply; 32+ messages in thread
From: Salvador Fandino @ 2006-12-12 18:41 UTC (permalink / raw)
To: qemu-devel
Daniel Jacobowitz wrote:
> On Tue, Dec 12, 2006 at 06:33:22PM +0100, Sylvain Petreolle wrote:
>>>>> It's mostly intended to be used for accessing the files inside QEMU disk
>>>>> images locally, without having to launch a virtual machine and accessing
>>>>> them from there.
>>>> mount -o loop does this.
>>> How is everybody missing the point? :-) mount -o loop doesn't mount
>>> qcow images.
>>>
>> Would it be that difficult to write a qcow fs module?
>
> Probably not, but I think using nbd for it is much nicer. I think
> there would be trouble with partitionable devices, though.
Right now, you can use "-o offset" and "-s size" to serve a partition
inside a partitioned disk image. And you can use fdisk or a similar tool
to examine the partition table (they work on /dev/nbd0).
I am also looking for some working code to parse the MBR to incorporate
it in qemu-nbds (something like libparted but simpler), so it would be
possible to just indicate the partition number to serve.
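As an illustration of the MBR parsing described above, here is a minimal sketch in C that decodes one primary partition entry from a 512-byte MBR sector, from which the "-o" and "-s" values could be derived. It assumes 512-byte sectors and the classic MBR layout (four 16-byte entries at byte 446, signature 0x55 0xAA at byte 510). The struct and function names are hypothetical and are not taken from qemu-nbds:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE      512
#define MBR_TABLE_OFFSET 446   /* four 16-byte entries start here */
#define MBR_ENTRY_SIZE   16

struct mbr_part {
    uint8_t  type;         /* partition type byte; 0 means unused */
    uint32_t start_lba;    /* first sector of the partition */
    uint32_t num_sectors;  /* length in sectors */
};

/* Read a 32-bit little-endian value without alignment assumptions. */
uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode primary entry `index` (0..3) from an MBR sector.
 * Returns 0 on success, -1 on bad signature, bad index or empty entry. */
int mbr_get_partition(const uint8_t sector[SECTOR_SIZE], int index,
                      struct mbr_part *out)
{
    const uint8_t *e;

    assert(sector != NULL && out != NULL);
    if (index < 0 || index > 3)
        return -1;
    if (sector[510] != 0x55 || sector[511] != 0xAA)
        return -1;                       /* no boot signature */
    e = sector + MBR_TABLE_OFFSET + index * MBR_ENTRY_SIZE;
    if (e[4] == 0)
        return -1;                       /* unused slot */
    out->type = e[4];
    out->start_lba = le32(e + 8);        /* bytes 8..11: start LBA */
    out->num_sectors = le32(e + 12);     /* bytes 12..15: sector count */
    return 0;
}
```

The byte offset of the partition is then start_lba * 512, and its length is num_sectors * 512.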
- Salva
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Qemu-devel] Re: RE : Re: Re: NBD server for QEMU images
2006-12-12 18:41 ` [Qemu-devel] Re: RE : " Salvador Fandino
@ 2006-12-13 12:23 ` Jan Marten Simons
2006-12-13 19:03 ` Salvador Fandino
0 siblings, 1 reply; 32+ messages in thread
From: Jan Marten Simons @ 2006-12-13 12:23 UTC (permalink / raw)
To: qemu-devel
Salvador Fandino schrieb:
> right now, you can use "-o offset" and "-s size" to serve a partition
> inside a partitioned disk image. And you can use fdisk or a similar tool
> to examine the partition table (they work on /dev/nbd0).
>
> I am also looking for some working code to parse the MBR to incorporate
> it in qemu-nbds (something like libparted but simpler), so it would be
> possible to just indicate the partition number to serve.
>
> - Salva
>
The code of lomount might be what you're looking for. Lomount allows one
to mount partitions (via loop) from a raw disk image.
- Jan
^ permalink raw reply [flat|nested] 32+ messages in thread
* [Qemu-devel] Re: RE : Re: Re: NBD server for QEMU images
2006-12-13 12:23 ` Jan Marten Simons
@ 2006-12-13 19:03 ` Salvador Fandino
2006-12-13 20:03 ` Jim C. Brown
0 siblings, 1 reply; 32+ messages in thread
From: Salvador Fandino @ 2006-12-13 19:03 UTC (permalink / raw)
To: qemu-devel
Jan Marten Simons wrote:
> Salvador Fandino schrieb:
>> right now, you can use "-o offset" and "-s size" to serve a partition
>> inside a partitioned disk image. And you can use fdisk or a similar tool
>> to examine the partition table (they work on /dev/nbd0).
>>
>> I am also looking for some working code to parse the MBR to incorporate
>> it in qemu-nbds (something like libparted but simpler), so it would be
>> possible to just indicate the partition number to serve.
>>
>> - Salva
>>
> The code of lomount might be what you're looking for. Lomount allows one
> to mount partitions (via loop) from a raw disk image.
That was my intention, but I have found that lomount's handling of EBRs and
logical partitions is not correct: it behaves as if an EBR were
structured like the MBR, which is wrong!
Anyway, I have implemented the partition table parsing from scratch and
uploaded a new version of qemu-nbds.c to the QEMU forum.
Cheers,
- Salva
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Qemu-devel] Re: RE : Re: Re: NBD server for QEMU images
2006-12-13 19:03 ` Salvador Fandino
@ 2006-12-13 20:03 ` Jim C. Brown
2006-12-13 22:07 ` Salvador Fandino
0 siblings, 1 reply; 32+ messages in thread
From: Jim C. Brown @ 2006-12-13 20:03 UTC (permalink / raw)
To: qemu-devel
On Wed, Dec 13, 2006 at 08:03:13PM +0100, Salvador Fandino wrote:
> > The code of lomount might be what you're looking for. Lomount allows one
> > to mount partitions (via loop) from a raw disk image.
>
> That was my intention, but I have found that lomount's handling of EBRs and
> logical partitions is not correct: it behaves as if an EBR were
> structured like the MBR, which is wrong!
>
> Cheers,
>
> - Salva
>
How is it incorrect? What needs to be fixed?
My understanding is that the extended partition has a partition table
set up with the first partition entry pointing to the logical partition,
the second entry pointing to a partition table that exists immediately
after the logical partition, and then the 3rd and 4th entries are not
used. The second partition table is structured the same way, so you
essentially have a linked list of extended partitions. (Unlike the MBR,
there are no boot sectors associated with these partition tables.)
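The linked-list layout described above can be sketched in C as follows. This is a hedged illustration, not the qemu-nbds or lomount code: it walks an in-memory copy of the disk, assumes 512-byte sectors, and assumes the usual convention that the first entry's start is relative to its own EBR while the second entry's start is relative to the beginning of the extended partition. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE  512
#define TABLE_OFFSET 446   /* partition entries start here */
#define ENTRY_SIZE   16

struct logical_part {
    uint8_t  type;         /* partition type byte */
    uint32_t start_lba;    /* absolute first sector */
    uint32_t num_sectors;  /* length in sectors */
};

/* Read a 32-bit little-endian value without alignment assumptions. */
uint32_t ebr_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Follow the EBR chain starting at ext_start (LBA of the extended
 * partition). Fills out[] with up to max logical partitions and returns
 * how many were found, or -1 on a malformed chain. */
int ebr_list_logical(const uint8_t *image, uint64_t image_sectors,
                     uint32_t ext_start, struct logical_part *out, int max)
{
    uint32_t ebr_lba = ext_start;
    int n = 0;

    assert(image != NULL && out != NULL && max > 0);
    while (n < max) {
        const uint8_t *sec, *e1, *e2;
        uint32_t next;

        if (ebr_lba >= image_sectors)
            return -1;                      /* link points past the image */
        sec = image + (size_t)ebr_lba * SECTOR_SIZE;
        if (sec[510] != 0x55 || sec[511] != 0xAA)
            return -1;                      /* not a valid EBR */
        e1 = sec + TABLE_OFFSET;            /* entry 1: the logical partition */
        e2 = e1 + ENTRY_SIZE;               /* entry 2: link to the next EBR */
        /* Entries 3 and 4 are unused and deliberately ignored. */
        if (e1[4] != 0) {
            out[n].type = e1[4];
            out[n].start_lba = ebr_lba + ebr_le32(e1 + 8);
            out[n].num_sectors = ebr_le32(e1 + 12);
            n++;
        }
        if (e2[4] == 0 || ebr_le32(e2 + 8) == 0)
            break;                          /* end of the chain */
        next = ext_start + ebr_le32(e2 + 8);
        if (next <= ebr_lba)
            return -1;  /* simplification: require ascending EBRs, avoids loops */
        ebr_lba = next;
    }
    return n;
}
```

Only the first two entries of each EBR are consulted, which is the key difference from parsing it as if it were an MBR with four independent partitions.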
--
Infinite complexity begets infinite beauty.
Infinite precision begets infinite perfection.
^ permalink raw reply [flat|nested] 32+ messages in thread
* [Qemu-devel] Re: RE : Re: Re: NBD server for QEMU images
2006-12-13 20:03 ` Jim C. Brown
@ 2006-12-13 22:07 ` Salvador Fandino
2006-12-13 22:55 ` Jim C. Brown
0 siblings, 1 reply; 32+ messages in thread
From: Salvador Fandino @ 2006-12-13 22:07 UTC (permalink / raw)
To: qemu-devel
Jim C. Brown wrote:
> On Wed, Dec 13, 2006 at 08:03:13PM +0100, Salvador Fandino wrote:
>>> The code of lomount might be what you're looking for. Lomount allows one
>>> to mount partitions (via loop) from a raw disk image.
>> That was my intention, but I have found that lomount's handling of EBRs and
>> logical partitions is not correct: it behaves as if an EBR were
>> structured like the MBR, which is wrong!
>>
>> Cheers,
>>
>> - Salva
>>
>
> How is it incorrect? What needs to be fixed?
>
> My understanding is that the extended partition has a partition table
> set up with the first partition entry pointing to the logical partition,
> the second entry pointing to a partition table that exists immediately
> after the logical partition, and then the 3rd and 4th entries are not
> used. The second partition table is structured the same way, so you
> essentially have a linked list of extended partitions. (Unlike the MBR,
> there are no boot sectors associated with these partition tables.)
>
Yes, that's right, but it's not what lomount does. It parses the data in
the EBR in the same way as the MBR, reading 4 partition entries from it.
EBRs are explained here: http://en.wikipedia.org/wiki/Extended_Boot_Record
I believe that the implementation in the last version of qemu-nbds I
have uploaded to the forum is correct.
Cheers,
- Salva
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Qemu-devel] Re: RE : Re: Re: NBD server for QEMU images
2006-12-13 22:07 ` Salvador Fandino
@ 2006-12-13 22:55 ` Jim C. Brown
2006-12-14 8:37 ` Salvador Fandino
0 siblings, 1 reply; 32+ messages in thread
From: Jim C. Brown @ 2006-12-13 22:55 UTC (permalink / raw)
To: Salvador Fandino; +Cc: qemu-devel
On Wed, Dec 13, 2006 at 11:07:54PM +0100, Salvador Fandino wrote:
> Jim C. Brown wrote:
> > On Wed, Dec 13, 2006 at 08:03:13PM +0100, Salvador Fandino wrote:
> >>> The code of lomount might be what you're looking for. Lomount allows one
> >>> to mount partitions (via loop) from a raw disk image.
> >> That was my intention, but I have found that lomount's handling of EBRs and
> >> logical partitions is not correct: it behaves as if an EBR were
> >> structured like the MBR, which is wrong!
> >>
> >> Cheers,
> >>
> >> - Salva
> >>
> >
> > How is it incorrect? What needs to be fixed?
> >
> > My understanding is that the extended partition has a partition table
> > set up with the first partition entry pointing to the logical partition,
> > the second entry pointing to a partition table that exists immediately
> > after the logical partition, and then the 3rd and 4th entries are not
> > used. The second partition table is structured the same way, so you
> > essentially have a linked list of extended partitions. (Unlike the MBR,
> > there are no boot sectors associated with these partition tables.)
> >
>
> Yes, that's right, but it's not what lomount does. It parses the data in
> the EBR in the same way as the MBR, reading 4 partition entries from it.
>
It only uses the first two. It reads in the rest but ignores them.
> Cheers,
>
> - Salva
--
Infinite complexity begets infinite beauty.
Infinite precision begets infinite perfection.
^ permalink raw reply [flat|nested] 32+ messages in thread
* [Qemu-devel] Re: RE : Re: Re: NBD server for QEMU images
2006-12-12 17:33 ` RE : " Sylvain Petreolle
` (2 preceding siblings ...)
2006-12-12 17:42 ` Daniel Jacobowitz
@ 2006-12-12 19:00 ` Salvador Fandino
3 siblings, 0 replies; 32+ messages in thread
From: Salvador Fandino @ 2006-12-12 19:00 UTC (permalink / raw)
To: qemu-devel
Sylvain Petreolle wrote:
>>>> It's mostly intended to be used for accessing the files inside QEMU disk
>>>> images locally, without having to launch a virtual machine and accessing
>>>> them from there.
>>> mount -o loop does this.
>> How is everybody missing the point? :-) mount -o loop doesn't mount
>> qcow images.
>>
> Would it be that difficult to write a qcow fs module?
well, it would mean adapting the qemu disk image handling code to run in
kernel mode (or just reimplementing the required functionality), and
wrapping it inside a block device driver similar to 'loop'.
My solution is much simpler because it's just a small adapter (600 lines
of C) that links with the unmodified qemu source and it runs in user space.
Cheers,
- Salva
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 17:13 ` Daniel Jacobowitz
2006-12-12 17:33 ` RE : " Sylvain Petreolle
@ 2006-12-12 17:45 ` Mark Williamson
2006-12-12 19:30 ` Christian MICHON
2 siblings, 0 replies; 32+ messages in thread
From: Mark Williamson @ 2006-12-12 17:45 UTC (permalink / raw)
To: qemu-devel; +Cc: Salvador Fandino
> > > It's mostly intended to be used for accessing the files inside QEMU
> > > disk images locally, without having to launch a virtual machine and
> > > accessing them from there.
> >
> > mount -o loop does this.
>
> How is everybody missing the point? :-) mount -o loop doesn't mount
> qcow images.
Using dm-userspace (a device mapper with mappings generated by a userspace
daemon instead of a kernel module) I believe it is possible to mount all
kinds of weird and wonderful things - including things like qcow.
The patches for dm-userspace are floating around, I think on the device mapper
and Xen developer's mailing lists.
Of course, this is a Linux-specific solution so an NBD server is probably
still useful (can other OSes mount NBD? I assume so...?).
In principle, you could use the NBD server to host storage for physical
machines too, right? For instance you could opt for a fairly "thin" setup
where all user disks are stored separately in qcow format to save space.
This might be nice for some users of centralised storage systems...
Cheers,
Mark
--
Dave: Just a question. What use is a unicyle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 17:13 ` Daniel Jacobowitz
2006-12-12 17:33 ` RE : " Sylvain Petreolle
2006-12-12 17:45 ` [Qemu-devel] " Mark Williamson
@ 2006-12-12 19:30 ` Christian MICHON
2 siblings, 0 replies; 32+ messages in thread
From: Christian MICHON @ 2006-12-12 19:30 UTC (permalink / raw)
To: qemu-devel
On 12/12/06, Daniel Jacobowitz <drow@false.org> wrote:
> How is everybody missing the point? :-) mount -o loop doesn't mount
> qcow images.
>
you could also mount it through a samba tunnel
--
Christian
^ permalink raw reply [flat|nested] 32+ messages in thread
* [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 12:48 [Qemu-devel] NBD server for QEMU images Salvador Fandiño
2006-12-12 13:37 ` Martin Guy
@ 2006-12-12 15:09 ` Anthony Liguori
2006-12-12 17:32 ` Salvador Fandino
2006-12-13 16:58 ` [Qemu-devel] " Mulyadi Santosa
2 siblings, 1 reply; 32+ messages in thread
From: Anthony Liguori @ 2006-12-12 15:09 UTC (permalink / raw)
To: qemu-devel
Salvador Fandiño wrote:
> Hi,
>
> The patch available from http://qemu-forum.ipi.fi/viewtopic.php?t=2718 adds a new utility, qemu-nbds, that implements an NBD server (see http://nbd.sf.net) for QEMU images.
>
> Using this utility, it is possible to mount images in any format supported by QEMU.
>
> Unfortunately, only read access works (locally) due to a limitation in the Linux kernel :-(
http://hg.codemonkey.ws/qemu-nbd/
And write access works for me. What's this limitation you speak of?
Regards,
Anthony Liguori
> BTW, only tested on Linux!
>
> Regards,
>
> - Salvador
^ permalink raw reply [flat|nested] 32+ messages in thread
* [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 15:09 ` Anthony Liguori
@ 2006-12-12 17:32 ` Salvador Fandino
2006-12-12 20:13 ` Anthony Liguori
0 siblings, 1 reply; 32+ messages in thread
From: Salvador Fandino @ 2006-12-12 17:32 UTC (permalink / raw)
To: qemu-devel
Anthony Liguori wrote:
> Salvador Fandiño wrote:
>> Hi,
>>
>> The patch available from http://qemu-forum.ipi.fi/viewtopic.php?t=2718
>> adds a new utility, qemu-nbds, that implements a NBD server (see
>> http://nbd.sf.net) for QEMU images.
>>
>> Using this utility, it is possible to mount images in any format
>> supported by QEMU.
>>
>> Unfortunately, only read access works (locally) due to a limitation
>> in the Linux kernel :-(
>
> http://hg.codemonkey.ws/qemu-nbd/
>
> And write access works for me. What's this limitation you speak of?
Mounting a partition being served on the same host as read-write can
cause deadlocks. From nbd-2.9.0 README file:
"When you write something to a block device, the kernel will not
immediately write that to the physical block device; instead, your
changes are written to a cache, which is periodically flushed by a
kernel thread, 'kblockd'. If you're using a single-processor system,
then you'll have only one kblockd, meaning, the kernel can't write to
more than one block device at the same time.
If, while your kblockd is emptying the NBD buffer cache, the kernel
decides that the cache of the block device your nbd-server is writing to
needs to be emptied, then you've got a deadlock."
Regards,
- Salva
^ permalink raw reply [flat|nested] 32+ messages in thread
* [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 17:32 ` Salvador Fandino
@ 2006-12-12 20:13 ` Anthony Liguori
2006-12-13 2:14 ` Mark Williamson
2006-12-13 11:37 ` Avi Kivity
0 siblings, 2 replies; 32+ messages in thread
From: Anthony Liguori @ 2006-12-12 20:13 UTC (permalink / raw)
To: qemu-devel
Salvador Fandino wrote:
> Anthony Liguori wrote:
>> Salvador Fandiño wrote:
>>> Hi,
>>>
>>> The patch available from http://qemu-forum.ipi.fi/viewtopic.php?t=2718
>>> adds a new utility, qemu-nbds, that implements a NBD server (see
>>> http://nbd.sf.net) for QEMU images.
>>>
>>> Using this utility, it is possible to mount images in any format
>>> supported by QEMU.
>>>
>>> Unfortunately, only read access works (locally) due to a limitation
>>> in the Linux kernel :-(
>> http://hg.codemonkey.ws/qemu-nbd/
>>
>> And write access works for me. What's this limitation you speak of?
>
> Mounting a partition being served on the same host as read-write can
> cause deadlocks. From nbd-2.9.0 README file:
This text is pretty old. Is this still valid? This would imply that
things like loop can result in deadlocks. I don't see why flushing one
device would depend on the completion of another device. Otherwise, if
you had two disk adapters, they would always be operating in lock step.
As I've said, I've never seen a problem doing writes with nbd on localhost.
Regards,
Anthony Liguori
> "When you write something to a block device, the kernel will not
> immediately write that to the physical block device; instead, your
> changes are written to a cache, which is periodically flushed by a
> kernel thread, 'kblockd'. If you're using a single-processor system,
> then you'll have only one kblockd, meaning, the kernel can't write to
> more than one block device at the same time.
>
> If, while your kblockd is emptying the NBD buffer cache, the kernel
> decides that the cache of the block device your nbd-server is writing to
> needs to be emptied, then you've got a deadlock."
>
> Regards,
>
> - Salva
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 20:13 ` Anthony Liguori
@ 2006-12-13 2:14 ` Mark Williamson
2006-12-13 11:37 ` Avi Kivity
1 sibling, 0 replies; 32+ messages in thread
From: Mark Williamson @ 2006-12-13 2:14 UTC (permalink / raw)
To: qemu-devel; +Cc: Anthony Liguori
> >> And write access works for me. What's this limitation you speak of?
> >
> > Mounting a partition being served on the same host as read-write can
> > cause deadlocks. From nbd-2.9.0 README file:
>
> This text is pretty old. Is this still valid? This would imply that
> things like loop can result in deadlocks. I don't see why flushing one
> device would depend on the completion of another device. Otherwise, if
> you had two disk adapters, they would always be operating in lock step.
In the right kind of low memory condition, I guess they might...
> As I've said, I've never seen a problem doing writes with nbd on localhost.
If the NBD device is read-write, this implies it can have associated dirty
pages. If you're going to flush those, the kernel is going to have to talk
to the userspace NBD server. This is going to require the allocation of
book-keeping data structures, skbufs, etc and possibly trigger some flushes
of other dirty data and / or swapping.
I guess you could perhaps get into a loop of needing to flush dirty data to
make space for data structures needed to flush dirty data? Which would
deadlock you quite effectively, but not necessarily be all *that* probable
under moderate use...
Anybody have any more information on this?
Cheers,
Mark
>
> Regards,
>
> Anthony Liguori
>
> > "When you write something to a block device, the kernel will not
> > immediately write that to the physical block device; instead, your
> > changes are written to a cache, which is periodically flushed by a
> > kernel thread, 'kblockd'. If you're using a single-processor system,
> > then you'll have only one kblockd, meaning, the kernel can't write to
> > more than one block device at the same time.
> >
> > If, while your kblockd is emptying the NBD buffer cache, the kernel
> > decides that the cache of the block device your nbd-server is writing to
> > needs to be emptied, then you've got a deadlock."
> >
> > Regards,
> >
> > - Salva
--
Dave: Just a question. What use is a unicyle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-12 20:13 ` Anthony Liguori
2006-12-13 2:14 ` Mark Williamson
@ 2006-12-13 11:37 ` Avi Kivity
2006-12-13 13:19 ` Martin Guy
1 sibling, 1 reply; 32+ messages in thread
From: Avi Kivity @ 2006-12-13 11:37 UTC (permalink / raw)
To: qemu-devel
Anthony Liguori wrote:
>>
>> Mounting a partition being served on the same host as read-write can
>> cause deadlocks. From nbd-2.9.0 README file:
>
> This text is pretty old. Is this still valid? This would imply that
> things like loop can result in deadlocks. I don't see why flushing
> one device would depend on the completion of another device.
> Otherwise, if you had two disk adapters, they would always be
> operating in lock step.
>
> As I've said, I've never seen a problem doing writes with nbd on
> localhost.
A deadlock can happen under heavy I/O load:
- write tons of data to nbd device, data ends up in pagecache
- memory gets low, kswapd wakes up, calls nbd device to actually write
the data
- nbd issues a request, which ends up on the nbd server on the same machine
- the nbd server allocates memory
- memory allocation hangs waiting for kswapd
I submitted a patch/workaround for this some years ago (for a similar
problem with a local mount to a userspace nfs server).
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-13 11:37 ` Avi Kivity
@ 2006-12-13 13:19 ` Martin Guy
2006-12-13 13:29 ` Avi Kivity
0 siblings, 1 reply; 32+ messages in thread
From: Martin Guy @ 2006-12-13 13:19 UTC (permalink / raw)
To: qemu-devel
[-- Attachment #1: Type: text/plain, Size: 930 bytes --]
> - write tons of data to nbd device, data ends up in pagecache
> - memory gets low, kswapd wakes up, calls nbd device to actually write
> the data
> - nbd issues a request, which ends up on the nbd server on the same machine
> - the nbd server allocates memory
> - memory allocation hangs waiting for kswapd
In other words, it can deadlock only if you are swapping to an nbd
device that is served by nbd-server running on the same machine and
kernel. In the case of a qemu system swapping over nbd to a server on
the host machine, it is the guest kernel that waits on the host kernel
paging the nbd server in from the host's separate swap space, so no
deadlock is possible.
Practice bears this out; if you wanna stress-test it, here's a program
that creates a low memory condition by saturating the VM.
Of course, this has nothing to do with the original patch, which just
lets nbd-server interpret qemu image files ;)
M
[-- Attachment #2: thrash.c --]
[-- Type: text/x-csrc, Size: 2062 bytes --]
/*
* thrash.c
*
* A standalone Unix command-line program to
* make the machine thrash, ie go into permanent swapping,
* by using VM >= RAM size and accessing all pages repeatedly
*
* Usage: thrash [-v] N
* where N is the number of megabytes to thrash.
* A good choice for N is the number of megabytes of physical RAM
* that the machine has.
*
* Reason:
* to force a machine to use its swap space,
* to flush all unused pages out to swap and so free RAM for other purposes
* or to see how a system behaves under extreme duress.
*
* It currently *writes* to all pages, but could be made to read them
* as an alternative, or as well.
*
* Martin Guy, 9 November 2006
*/
#include <ctype.h>   /* for isdigit() */
#include <stdint.h>  /* for intptr_t */
#include <stdlib.h>  /* for exit(), atoi() */
#include <stdio.h>
#include <unistd.h>  /* for getpagesize(), sbrk() */

int
main(int argc, char **argv)
{
    int megabytes = 0;   /* MB of VM to thrash, from command-line argument.
                          * 0 means uninitialised */
    char *buf;           /* Huge VM buffer */
    intptr_t bufsize;    /* Size of buffer in bytes */
    long pagesize;       /* size of VM page */
    int i;               /* index into argv */
    int verbose = 0;     /* Print a dot for every pass through VM? */

    pagesize = getpagesize();

    for (i=1; i<argc; i++) {
        if (argv[i][0] == '-') {
            switch (argv[i][1]) {
            case 'v': verbose = 1; break;
            default: goto usage;
            }
        } else if (isdigit((unsigned char) argv[i][0])) {
            megabytes = atoi(argv[i]);
            /* Sanity check comes later */
        } else {
usage:      fputs("Usage: thrash [-v] N\n", stderr);
            fputs("-v\tPrint a dot for each pass through memory\n", stderr);
            fputs("N\tNumber of megabytes of VM to thrash\n", stderr);
            exit(1);
        }
    }

    /* Sanity checks */
    if (megabytes <= 0) goto usage;

    /* Compute the size in intptr_t, the type sbrk() takes */
    bufsize = (intptr_t) megabytes * (intptr_t)(1024 * 1024);
    buf = (char *) sbrk(bufsize);
    if (buf == (char *)-1) {
        perror("thrash: Failed to allocate VM");
        exit(1);
    }

    /* Write every page repeatedly */
    for (;;) {
        char *p;
        intptr_t n;   /* pages left in this pass */

        for (p=buf, n=bufsize/pagesize; n>0; p+=pagesize, n--)
            *p=(char) n;
        if (verbose) { putchar('.'); fflush(stdout); }
    }
    /* NOTREACHED */
}
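
The header comment notes that the program could read the pages instead of writing them. A minimal sketch of that variant follows; `touch_pages_read` is a hypothetical helper, not part of the attachment. One caveat, stated as an assumption: pages of a fresh `sbrk()` region that have never been written may all be backed by a shared zero page on some kernels, so a read-only pass would probably need one initial write pass to create real memory pressure.

```c
#include <stddef.h>

/*
 * Read one byte from every page of buf and return the sum of the bytes read.
 * The volatile access stops the compiler from optimising the loads away.
 * pagesize must be non-zero.
 */
static unsigned long touch_pages_read(const char *buf, size_t bufsize,
                                      size_t pagesize)
{
    const volatile char *p = buf;
    unsigned long sum = 0;
    size_t off;

    for (off = 0; off < bufsize; off += pagesize)
        sum += (unsigned char) p[off];
    return sum;
}
```

In the main loop it would replace the inner `for` that stores to `*p`, with the return value ignored.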
* Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-13 13:19 ` Martin Guy
@ 2006-12-13 13:29 ` Avi Kivity
2006-12-13 19:14 ` Salvador Fandino
0 siblings, 1 reply; 32+ messages in thread
From: Avi Kivity @ 2006-12-13 13:29 UTC (permalink / raw)
To: qemu-devel
Martin Guy wrote:
>> - write tons of data to nbd device, data ends up in pagecache
>> - memory gets low, kswapd wakes up, calls nbd device to actually write
>> the data
>> - nbd issues a request, which ends up on the nbd server on the same
>> machine
>> - the nbd server allocates memory
>> - memory allocation hangs waiting for kswapd
>
> In other words, it can deadlock only if you are swapping to an nbd
> device that is served by nbd-server running on the same machine and
> kernel.
No. It is possible if you issue non-O_SYNC writes.
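For illustration, the difference is in the flags passed to `open()`. With `O_SYNC`, each `write()` returns only once the data has been pushed out to the underlying storage, so dirty pages do not pile up in the page cache waiting for kswapd. The sketch below is an assumption about how a writer might do this; `write_all_sync` is a hypothetical helper and the path is whatever device or file is being written.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes from buf to path using synchronous writes.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
static int write_all_sync(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0600);
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, buf, len);   /* returns after the data is written out */
        if (n < 0) {
            if (errno == EINTR)
                continue;                  /* interrupted, retry */
            close(fd);
            return -1;
        }
        buf += n;
        len -= (size_t) n;
    }
    return close(fd);
}
```

Dropping `O_SYNC` from the flags gives the ordinary buffered behaviour, where `write()` returns as soon as the data is in the page cache.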
> In the case of a qemu system swapping over nbd to a server on
> the host machine, it is the guest kernel that waits on the host kernel
> paging the nbd server in from the host's separate swap space, so no
> deadlock is possible.
>
> Practice bears this out; if you wanna stress-test it, here's a program
> that creates a low memory condition by saturating the VM.
It isn't enough to thrash the guest; you need to exhaust host memory as
well. You also need to do it faster than kswapd can write it out.
>
> Of course, this has nothing to do with the original patch, which just
> lets nbd-server interpret qemu image files ;)
>
Agreed. But mounting nbd from localhost is dangerous.
--
error compiling committee.c: too many arguments to function
* [Qemu-devel] Re: NBD server for QEMU images
2006-12-13 13:29 ` Avi Kivity
@ 2006-12-13 19:14 ` Salvador Fandino
2006-12-14 8:34 ` Avi Kivity
0 siblings, 1 reply; 32+ messages in thread
From: Salvador Fandino @ 2006-12-13 19:14 UTC (permalink / raw)
To: qemu-devel
Avi Kivity wrote:
> Martin Guy wrote:
>>> - write tons of data to nbd device, data ends up in pagecache
>>> - memory gets low, kswapd wakes up, calls nbd device to actually write
>>> the data
>>> - nbd issues a request, which ends up on the nbd server on the same
>>> machine
>>> - the nbd server allocates memory
>>> - memory allocation hangs waiting for kswapd
>>
>> In other words, it can deadlock only if you are swapping to an nbd
>> device that is served by nbd-server running on the same machine and
>> kernel.
>
> No. It is possible if you issue non-O_SYNC writes.
I have run some tests and found that it's easy to cause a deadlock just
by untarring a file onto an nbd device being served from localhost (using
the standard nbd-server or my own, it doesn't matter).
Another interesting finding is that when the deadlock happens, qemu-nbds
is inside a read() call, waiting for new nbd requests to arrive over the
socket, so it is not trying to allocate memory or write to disk.
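For context, an NBD server's main loop spends its idle time in exactly this kind of blocking `read()`, waiting for the next fixed-size request header. The sketch below assumes the classic NBD request layout (magic `0x25609513`, type, 8-byte handle, 64-bit offset, 32-bit length, all big-endian, 28 bytes in total); `read_request`, `read_full` and `struct request` are hypothetical names, and this is not the code of qemu-nbds or nbd-server.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>   /* ntohl() */

#define NBD_REQUEST_MAGIC 0x25609513u
#define NBD_REQUEST_SIZE  28

struct request {           /* decoded, host byte order */
    uint32_t type;         /* 0 = read, 1 = write, 2 = disconnect */
    char     handle[8];    /* opaque, echoed back in the reply */
    uint64_t from;         /* byte offset into the export */
    uint32_t len;          /* number of bytes */
};

/* Read exactly n bytes; returns 0 on success, -1 on EOF or error. */
static int read_full(int fd, void *dst, size_t n)
{
    unsigned char *p = dst;
    while (n > 0) {
        ssize_t r = read(fd, p, n);   /* blocks here while the client is idle */
        if (r <= 0)
            return -1;
        p += r;
        n -= (size_t) r;
    }
    return 0;
}

/* Returns 0 and fills *req on success, -1 on EOF, error or bad magic. */
static int read_request(int fd, struct request *req)
{
    unsigned char raw[NBD_REQUEST_SIZE];
    uint32_t be32, hi, lo;

    if (read_full(fd, raw, sizeof raw) < 0)
        return -1;
    memcpy(&be32, raw, 4);
    if (ntohl(be32) != NBD_REQUEST_MAGIC)
        return -1;
    memcpy(&be32, raw + 4, 4);
    req->type = ntohl(be32);
    memcpy(req->handle, raw + 8, 8);
    memcpy(&hi, raw + 16, 4);          /* upper half of the 64-bit offset */
    memcpy(&lo, raw + 20, 4);          /* lower half */
    req->from = ((uint64_t) ntohl(hi) << 32) | ntohl(lo);
    memcpy(&be32, raw + 24, 4);
    req->len = ntohl(be32);
    return 0;
}
```

A server sitting in `read_full()` is idle from its own point of view, which matches the observation above: the process itself is not allocating anything at that moment.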
BTW, I am using Debian unstable with kernel 2.6.18-1-686
Regards,
- Salva
* Re: [Qemu-devel] Re: NBD server for QEMU images
2006-12-13 19:14 ` Salvador Fandino
@ 2006-12-14 8:34 ` Avi Kivity
0 siblings, 0 replies; 32+ messages in thread
From: Avi Kivity @ 2006-12-14 8:34 UTC (permalink / raw)
To: qemu-devel
Salvador Fandino wrote:
> I have run some tests and found that it's easy to cause a deadlock just
> untarring a file over an nbd device being served from localhost (using
> the standard nbd-server or my own, it doesn't matter).
>
> Another interesting finding is that when the deadlock happens, qemu-nbds
> is inside a read() call, waiting for new nbd requests to arrive over the
> socket, and so, not trying to allocate memory or writing to disk.
>
>
If you use sysrq-t and a serial console, you will see exactly where it's
waiting for memory.
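On a Linux kernel built with magic SysRq support, the same task dump can also be requested without a keyboard by writing `t` to `/proc/sysrq-trigger` as root. The output goes to the kernel log, which is why a serial console helps once the machine is wedged. A minimal sketch, where `trigger_sysrq` is a hypothetical helper and the trigger path is passed in so it can be tried against an ordinary file first:

```c
#include <stdio.h>

/*
 * Write a single SysRq command character (e.g. 't' for a task-state dump)
 * to the given trigger file, normally /proc/sysrq-trigger.
 * Returns 0 on success, -1 on failure.
 */
static int trigger_sysrq(const char *trigger_path, char cmd)
{
    FILE *f = fopen(trigger_path, "w");
    if (f == NULL)
        return -1;
    if (fputc(cmd, f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;   /* fclose flushes the byte to the file */
}
```

Typical use would be `trigger_sysrq("/proc/sysrq-trigger", 't')`, run before the system becomes completely unresponsive.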
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: [Qemu-devel] NBD server for QEMU images
2006-12-12 12:48 [Qemu-devel] NBD server for QEMU images Salvador Fandiño
2006-12-12 13:37 ` Martin Guy
2006-12-12 15:09 ` Anthony Liguori
@ 2006-12-13 16:58 ` Mulyadi Santosa
2 siblings, 0 replies; 32+ messages in thread
From: Mulyadi Santosa @ 2006-12-13 16:58 UTC (permalink / raw)
To: qemu-devel, Salvador Fandiño
Hi Salvador...
> The patch available from http://qemu-forum.ipi.fi/viewtopic.php?t=2718 adds
> a new utility, qemu-nbds, that implements a NBD server (see
> http://nbd.sf.net) for QEMU images.
>
> Using this utility it is possible to mount images in any format supported by
> QEMU.
Good work IMHO! Although I am sure there are many ways to do this, yours is a
nice idea and I think it should be pretty easy to maintain too. As the Perl
motto goes: there is more than one way to do it :)
regards,
Mulyadi
* RE: RE : Re: [Qemu-devel] Re: NBD server for QEMU images
@ 2006-12-12 17:48 Paul Robinson
0 siblings, 0 replies; 32+ messages in thread
From: Paul Robinson @ 2006-12-12 17:48 UTC (permalink / raw)
To: qemu-devel
> > > > > It's mostly intended to be used for accessing the files inside
> > > > > QEMU disk images locally, without having to launch a virtual
> > > > > machine and accessing them from there.
> > > >
> > > > mount -o loop does this.
> > >
> > > How is everybody missing the point? :-) mount -o loop doesn't mount
> > > qcow images.
> > >
> > Would it be that difficult to write a qcow fs module?
>
> It would be _more_ difficult. Although I would have done it as a FUSE
> module, just to learn how to do it.
Fuse can do one half of the job and you might want to look at
http://www.smallworks.com/~jim/fsimage/ dated 23-Feb-2005
It's a program to copy files from various types of disk image. (I
haven't tried it).
I found it at
http://www.kidsquid.com/cgi-bin/moin.cgi/FrequentlyAskedQuestions
Cheers,
Paul R.