* [Qemu-devel] [PATCH V12 0/6] add-cow file format
@ 2012-08-10 15:39 Dong Xu Wang
  2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 1/6] docs: document for " Dong Xu Wang
                   ` (6 more replies)
  0 siblings, 7 replies; 25+ messages in thread
From: Dong Xu Wang @ 2012-08-10 15:39 UTC
To: qemu-devel; +Cc: kwolf, Dong Xu Wang

This series introduces a new file format: add-cow.

add-cow can benefit from other available functions, such as path_has_protocol
and qed_read_string, so we will make them public.

add-cow still uses QEMUOptionParameter, not QemuOpts; I will send a separate
patch series to convert it.

snapshot_blkdev is not supported for add-cow yet; after converting
QEMUOptionParameter to QemuOpts, I will add the related code.

v11->v12:
1) Removed un-used feature bit.
2) Share cache code with qcow2.c.
3) Remove snapshot_blkdev support, will add it in another patch.
5) COW Bitmap field in add-cow file will be multiple of 65536.
6) Fix grammar and typos.

Dong Xu Wang (6):
  docs: document for add cow file format
  make path_has_protocol non-static
  qed_read_string to bdrv_read_string
  rename qcow2-cache.c to block-cache.c
  add-cow file format
  qemu-iotests

 block.c                      |   29 ++-
 block.h                      |    6 +
 block/Makefile.objs          |    4 +-
 block/add-cow.c              |  613 ++++++++++++++++++++++++++++++++++++++++++
 block/add-cow.h              |   85 ++++++
 block/qcow2-cache.c          |  323 ----------------------
 block/qcow2-cluster.c        |   66 +++--
 block/qcow2-refcount.c       |   66 +++--
 block/qcow2.c                |   36 ++--
 block/qcow2.h                |   24 +--
 block/qed.c                  |   29 +--
 block_int.h                  |    2 +
 docs/specs/add-cow.txt       |  123 +++++++++
 tests/qemu-iotests/017       |    2 +-
 tests/qemu-iotests/020       |    2 +-
 tests/qemu-iotests/check     |    4 +-
 tests/qemu-iotests/common    |    6 +
 tests/qemu-iotests/common.rc |   19 ++
 trace-events                 |   13 +-
 19 files changed, 994 insertions(+), 458 deletions(-)
 create mode 100644 block/add-cow.c
 create mode 100644 block/add-cow.h
 delete mode 100644 block/qcow2-cache.c
 create mode 100644 docs/specs/add-cow.txt
* [Qemu-devel] [PATCH V12 1/6] docs: document for add-cow file format
  2012-08-10 15:39 [Qemu-devel] [PATCH V12 0/6] add-cow file format Dong Xu Wang
@ 2012-08-10 15:39 ` Dong Xu Wang
  2012-09-06 17:27   ` Michael Roth
  2012-09-10 15:23   ` Kevin Wolf
  2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 2/6] make path_has_protocol non-static Dong Xu Wang
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: Dong Xu Wang @ 2012-08-10 15:39 UTC
To: qemu-devel; +Cc: kwolf, Dong Xu Wang

Document for add-cow format, the usage and spec of add-cow are introduced.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
 docs/specs/add-cow.txt |  123 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 123 insertions(+), 0 deletions(-)
 create mode 100644 docs/specs/add-cow.txt

diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
new file mode 100644
index 0000000..d5a7a68
--- /dev/null
+++ b/docs/specs/add-cow.txt
@@ -0,0 +1,123 @@
+== General ==
+
+The raw file format does not support backing files or copy on write feature.
+The add-cow image format makes it possible to use backing files with raw
+image by keeping a separate .add-cow metadata file. Once all sectors
+have been written into the raw image it is safe to discard the .add-cow
+and backing files, then we can use the raw image directly.
+
+An example usage of add-cow would look like::
+(ubuntu.img is a disk image which has been installed OS.)
+    1)  Create a raw image with the same size of ubuntu.img
+            qemu-img create -f raw test.raw 8G
+    2)  Create an add-cow image which will store dirty bitmap
+            qemu-img create -f add-cow test.add-cow \
+                -o backing_file=ubuntu.img,image_file=test.raw
+    3)  Run qemu with add-cow image
+            qemu -drive if=virtio,file=test.add-cow
+
+test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
+will be calculated from the size of test.raw.
+
+=Specification=
+
+The file format looks like this:
+
+ +---------------+-------------+-----------------+
+ |     Header    |   Reserved  |    COW bitmap   |
+ +---------------+-------------+-----------------+
+
+All numbers in add-cow are stored in Little Endian byte order.
+
+== Header ==
+
+The Header is included in the first bytes:
+(#define HEADER_SIZE (4096 * header_pages_size))
+    Byte    0 -  7:     magic
+                        add-cow magic string ("ADD_COW\xff").
+
+            8 - 11:     version
+                        Version number (only valid value is 1 now).
+
+            12 - 15:    backing file name offset
+                        Offset in the add-cow file at which the backing file
+                        name is stored (NB: The string is not nul-terminated).
+                        If backing file name does NOT exist, this field will be
+                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
+                        must be at least 1 byte).
+
+            16 - 19:    backing file name size
+                        Length of the backing file name in bytes. It will be 0
+                        if the backing file name offset is 0. If backing file
+                        name offset is non-zero, then it must be non-zero. Must
+                        be less than [HEADER_SIZE - 80] to fit in the reserved
+                        part of the header.
+
+            20 - 23:    image file name offset
+                        Offset in the add-cow file at which the image file name
+                        is stored (NB: The string is not null terminated). It
+                        must be between 80 and [HEADER_SIZE - 2].
+
+            24 - 27:    image file name size
+                        Length of the image file name in bytes.
+                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
+                        part of the header.
+
+            28 - 35:    features
+                        Currently only 1 feature bit is used:
+                        Feature bits:
+                            * ADD_COW_F_All_ALLOCATED   = 0x01.
+
+            36 - 43:    optional features
+                        Not used now. Reserved for future use. It must be set to 0.
+
+            44 - 47:    header pages size
+                        The header field is variable-sized. This field indicates
+                        how many pages(4k) will be used to store add-cow header.
+                        In add-cow v1, it is fixed to 1, so the header size will
+                        be 4k * 1 = 4096 bytes.
+
+            48 - 63:    backing file format
+                        format of backing file. It will be filled with 0 if
+                        backing file name offset is 0. If backing file name
+                        offset is non-zero, it must be non-zero. It is coded
+                        in free-form ASCII, and is not NUL-terminated.
+
+            64 - 79:    image file format
+                        format of image file. It must be non-zero. It is coded
+                        in free-form ASCII, and is not NUL-terminated.
+
+            80 - [HEADER_SIZE - 1]:
+                        It is used to make sure COW bitmap field starts at the
+                        HEADER_SIZE byte, backing file name and image file name
+                        will be stored here. The bytes that is not pointing to
+                        backing file and image file names will bet set to 0.
+
+== COW bitmap ==
+
+The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
+backing file and image file. The bitmap will track whether the sector in
+backing file is dirty or not.
+
+Each bit in the bitmap indicates one cluster's status. One cluster includes 128
+sectors, then each bit indicates 512 * 128 = 64k bytes. the size of bitmap is
+calculated according to virtual size of image file, and it also should be multipe
+of 65536, the bits not used will be set to 0. Within each byte, the least
+significant bit covers the first cluster. Bit orders in one byte look like:
+ +----+----+----+----+----+----+----+----+
+ | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
+ +----+----+----+----+----+----+----+----+
+
+If the bit is 0, indicates the sector has not been allocated in image file, data
+should be loaded from backing file while reading; if the bit is 1, indicates the
+related sector has been dirty, should be loaded from image file while reading.
+Writing to a sector causes the corresponding bit to be set to 1.
+
+If raw image is not an even multiple of cluster bytes, bits that correspond to
+bytes beyond the raw file size in add-cow will be 0.
+
+Image file name and backing file name must NOT be the same, we prevent this
+while creating add-cow files.
+
+Image file and backing file are interpreted relative to the qcow2 file, not
+to the current working directory of the process that opened the qcow2 file.
--
1.7.1
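The header layout described in the patch above can be illustrated with a small parser. This is only a sketch written from the byte offsets in the spec text, not the code in block/add-cow.c (which is not shown in this thread). The names `AddCowHeaderSketch`, `add_cow_parse_header_sketch` and `name_range_ok` are hypothetical. The "name must fit inside the header" and "name size >= 1" checks are assumptions layered on top of the ranges the spec states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADD_COW_MAGIC     "ADD_COW\xff"   /* 8 bytes, as given in the spec */
#define ADD_COW_PAGE_SIZE 4096u           /* "pages(4k)" in the spec */
#define ADD_COW_FIXED_LEN 80u             /* fixed fields occupy bytes 0..79 */

/* Hypothetical in-memory view of the fixed header fields. */
typedef struct {
    uint32_t version;
    uint32_t backing_offset, backing_size;
    uint32_t image_offset, image_size;
    uint64_t features, optional_features;
    uint32_t header_pages;
} AddCowHeaderSketch;

/* All numbers are stored little endian, so decode byte by byte. */
static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static uint64_t rd_le64(const uint8_t *p)
{
    return (uint64_t)rd_le32(p) | (uint64_t)rd_le32(p + 4) << 32;
}

/* Spec: offset in [80, HEADER_SIZE - 2], size < HEADER_SIZE - 80.
 * Assumption: size >= 1 and the name must end inside the header. */
static int name_range_ok(uint32_t off, uint32_t size, uint64_t header_size)
{
    return off >= ADD_COW_FIXED_LEN && off <= header_size - 2 &&
           size >= 1 && size < header_size - ADD_COW_FIXED_LEN &&
           (uint64_t)off + size <= header_size;
}

/* Returns 0 if buf holds a plausible v1 header, -1 otherwise. */
static int add_cow_parse_header_sketch(const uint8_t *buf, size_t len,
                                       AddCowHeaderSketch *h)
{
    uint64_t header_size;

    assert(buf != NULL && h != NULL);
    if (len < ADD_COW_FIXED_LEN || memcmp(buf, ADD_COW_MAGIC, 8) != 0) {
        return -1;
    }
    h->version           = rd_le32(buf + 8);
    h->backing_offset    = rd_le32(buf + 12);
    h->backing_size      = rd_le32(buf + 16);
    h->image_offset      = rd_le32(buf + 20);
    h->image_size        = rd_le32(buf + 24);
    h->features          = rd_le64(buf + 28);
    h->optional_features = rd_le64(buf + 36);
    h->header_pages      = rd_le32(buf + 44);

    if (h->version != 1 || h->header_pages == 0) {
        return -1;
    }
    header_size = (uint64_t)ADD_COW_PAGE_SIZE * h->header_pages;
    if (len < header_size) {
        return -1;                      /* caller must pass the whole header */
    }
    /* The image file name and format are mandatory. */
    if (!name_range_ok(h->image_offset, h->image_size, header_size) ||
        buf[64] == 0) {
        return -1;
    }
    /* The backing file is optional: offset 0 means "no backing file". */
    if (h->backing_offset == 0) {
        return h->backing_size == 0 ? 0 : -1;
    }
    return name_range_ok(h->backing_offset, h->backing_size, header_size)
           ? 0 : -1;
}
```

A caller would read the first 4096 bytes of the .add-cow file into a buffer and pass it in; the file names themselves are then the `image_size` bytes at `buf + image_offset` (not NUL-terminated, per the spec).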
* Re: [Qemu-devel] [PATCH V12 1/6] docs: document for add-cow file format
  2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 1/6] docs: document for " Dong Xu Wang
@ 2012-09-06 17:27   ` Michael Roth
  2012-09-10  1:48     ` Dong Xu Wang
  2012-09-10 15:23   ` Kevin Wolf
  1 sibling, 1 reply; 25+ messages in thread
From: Michael Roth @ 2012-09-06 17:27 UTC
To: Dong Xu Wang; +Cc: kwolf, qemu-devel

On Fri, Aug 10, 2012 at 11:39:40PM +0800, Dong Xu Wang wrote:
> Document for add-cow format, the usage and spec of add-cow are introduced.
>
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> ---
>  docs/specs/add-cow.txt |  123 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 123 insertions(+), 0 deletions(-)
>  create mode 100644 docs/specs/add-cow.txt
>
> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
> new file mode 100644
> index 0000000..d5a7a68
> --- /dev/null
> +++ b/docs/specs/add-cow.txt
> @@ -0,0 +1,123 @@
> +== General ==
> +
> +The raw file format does not support backing files or copy on write feature.
> +The add-cow image format makes it possible to use backing files with raw
> +image by keeping a separate .add-cow metadata file. Once all sectors
> +have been written into the raw image it is safe to discard the .add-cow
> +and backing files, then we can use the raw image directly.
> +
> +An example usage of add-cow would look like::
> +(ubuntu.img is a disk image which has been installed OS.)
> +    1)  Create a raw image with the same size of ubuntu.img
> +            qemu-img create -f raw test.raw 8G
> +    2)  Create an add-cow image which will store dirty bitmap
> +            qemu-img create -f add-cow test.add-cow \
> +                -o backing_file=ubuntu.img,image_file=test.raw
> +    3)  Run qemu with add-cow image
> +            qemu -drive if=virtio,file=test.add-cow
> +
> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
> +will be calculated from the size of test.raw.
> +
> +=Specification=
> +
> +The file format looks like this:
> +
> + +---------------+-------------+-----------------+
> + |     Header    |   Reserved  |    COW bitmap   |
> + +---------------+-------------+-----------------+
> +
> +All numbers in add-cow are stored in Little Endian byte order.
> +
> +== Header ==
> +
> +The Header is included in the first bytes:
> +(#define HEADER_SIZE (4096 * header_pages_size))
> +    Byte    0 -  7:     magic
> +                        add-cow magic string ("ADD_COW\xff").
> +
> +            8 - 11:     version
> +                        Version number (only valid value is 1 now).
> +
> +            12 - 15:    backing file name offset
> +                        Offset in the add-cow file at which the backing file
> +                        name is stored (NB: The string is not nul-terminated).
> +                        If backing file name does NOT exist, this field will be
> +                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
> +                        must be at least 1 byte).
> +
> +            16 - 19:    backing file name size
> +                        Length of the backing file name in bytes. It will be 0
> +                        if the backing file name offset is 0. If backing file
> +                        name offset is non-zero, then it must be non-zero. Must
> +                        be less than [HEADER_SIZE - 80] to fit in the reserved
> +                        part of the header.
> +
> +            20 - 23:    image file name offset
> +                        Offset in the add-cow file at which the image file name
> +                        is stored (NB: The string is not null terminated). It
> +                        must be between 80 and [HEADER_SIZE - 2].
> +
> +            24 - 27:    image file name size
> +                        Length of the image file name in bytes.
> +                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
> +                        part of the header.
> +
> +            28 - 35:    features
> +                        Currently only 1 feature bit is used:
> +                        Feature bits:
> +                            * ADD_COW_F_All_ALLOCATED   = 0x01.
> +
> +            36 - 43:    optional features
> +                        Not used now. Reserved for future use. It must be set to 0.
> +
> +            44 - 47:    header pages size
> +                        The header field is variable-sized. This field indicates
> +                        how many pages(4k) will be used to store add-cow header.
> +                        In add-cow v1, it is fixed to 1, so the header size will
> +                        be 4k * 1 = 4096 bytes.
> +
> +            48 - 63:    backing file format
> +                        format of backing file. It will be filled with 0 if
> +                        backing file name offset is 0. If backing file name
> +                        offset is non-zero, it must be non-zero. It is coded
> +                        in free-form ASCII, and is not NUL-terminated.
> +
> +            64 - 79:    image file format
> +                        format of image file. It must be non-zero. It is coded
> +                        in free-form ASCII, and is not NUL-terminated.
> +
> +            80 - [HEADER_SIZE - 1]:
> +                        It is used to make sure COW bitmap field starts at the
> +                        HEADER_SIZE byte, backing file name and image file name
> +                        will be stored here. The bytes that is not pointing to
> +                        backing file and image file names will bet set to 0.
> +
> +== COW bitmap ==
> +
> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
> +backing file and image file. The bitmap will track whether the sector in
> +backing file is dirty or not.
> +
> +Each bit in the bitmap indicates one cluster's status. One cluster includes 128
> +sectors, then each bit indicates 512 * 128 = 64k bytes. the size of bitmap is
> +calculated according to virtual size of image file, and it also should be multipe
> +of 65536, the bits not used will be set to 0. Within each byte, the least
> +significant bit covers the first cluster. Bit orders in one byte look like:
> + +----+----+----+----+----+----+----+----+
> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
> + +----+----+----+----+----+----+----+----+
> +
> +If the bit is 0, indicates the sector has not been allocated in image file, data
> +should be loaded from backing file while reading; if the bit is 1, indicates the
> +related sector has been dirty, should be loaded from image file while reading.
> +Writing to a sector causes the corresponding bit to be set to 1.
> +
> +If raw image is not an even multiple of cluster bytes, bits that correspond to
> +bytes beyond the raw file size in add-cow will be 0.
> +
> +Image file name and backing file name must NOT be the same, we prevent this
> +while creating add-cow files.
> +
> +Image file and backing file are interpreted relative to the qcow2 file, not

Relative to the add-cow file?

> +to the current working directory of the process that opened the qcow2 file.
> --
> 1.7.1
>
>
* Re: [Qemu-devel] [PATCH V12 1/6] docs: document for add-cow file format
  2012-09-06 17:27   ` Michael Roth
@ 2012-09-10  1:48     ` Dong Xu Wang
  0 siblings, 0 replies; 25+ messages in thread
From: Dong Xu Wang @ 2012-09-10 1:48 UTC
To: Michael Roth; +Cc: kwolf, qemu-devel

On Fri, Sep 7, 2012 at 1:27 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> On Fri, Aug 10, 2012 at 11:39:40PM +0800, Dong Xu Wang wrote:
>> Document for add-cow format, the usage and spec of add-cow are introduced.
>>
>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>> ---
>>  docs/specs/add-cow.txt |  123 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 123 insertions(+), 0 deletions(-)
>>  create mode 100644 docs/specs/add-cow.txt
>>
>> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
>> new file mode 100644
>> index 0000000..d5a7a68
>> --- /dev/null
>> +++ b/docs/specs/add-cow.txt
>> @@ -0,0 +1,123 @@
>> +== General ==
>> +
>> +The raw file format does not support backing files or copy on write feature.
>> +The add-cow image format makes it possible to use backing files with raw
>> +image by keeping a separate .add-cow metadata file. Once all sectors
>> +have been written into the raw image it is safe to discard the .add-cow
>> +and backing files, then we can use the raw image directly.
>> +
>> +An example usage of add-cow would look like::
>> +(ubuntu.img is a disk image which has been installed OS.)
>> +    1)  Create a raw image with the same size of ubuntu.img
>> +            qemu-img create -f raw test.raw 8G
>> +    2)  Create an add-cow image which will store dirty bitmap
>> +            qemu-img create -f add-cow test.add-cow \
>> +                -o backing_file=ubuntu.img,image_file=test.raw
>> +    3)  Run qemu with add-cow image
>> +            qemu -drive if=virtio,file=test.add-cow
>> +
>> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
>> +will be calculated from the size of test.raw.
>> +
>> +=Specification=
>> +
>> +The file format looks like this:
>> +
>> + +---------------+-------------+-----------------+
>> + |     Header    |   Reserved  |    COW bitmap   |
>> + +---------------+-------------+-----------------+
>> +
>> +All numbers in add-cow are stored in Little Endian byte order.
>> +
>> +== Header ==
>> +
>> +The Header is included in the first bytes:
>> +(#define HEADER_SIZE (4096 * header_pages_size))
>> +    Byte    0 -  7:     magic
>> +                        add-cow magic string ("ADD_COW\xff").
>> +
>> +            8 - 11:     version
>> +                        Version number (only valid value is 1 now).
>> +
>> +            12 - 15:    backing file name offset
>> +                        Offset in the add-cow file at which the backing file
>> +                        name is stored (NB: The string is not nul-terminated).
>> +                        If backing file name does NOT exist, this field will be
>> +                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
>> +                        must be at least 1 byte).
>> +
>> +            16 - 19:    backing file name size
>> +                        Length of the backing file name in bytes. It will be 0
>> +                        if the backing file name offset is 0. If backing file
>> +                        name offset is non-zero, then it must be non-zero. Must
>> +                        be less than [HEADER_SIZE - 80] to fit in the reserved
>> +                        part of the header.
>> +
>> +            20 - 23:    image file name offset
>> +                        Offset in the add-cow file at which the image file name
>> +                        is stored (NB: The string is not null terminated). It
>> +                        must be between 80 and [HEADER_SIZE - 2].
>> +
>> +            24 - 27:    image file name size
>> +                        Length of the image file name in bytes.
>> +                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
>> +                        part of the header.
>> +
>> +            28 - 35:    features
>> +                        Currently only 1 feature bit is used:
>> +                        Feature bits:
>> +                            * ADD_COW_F_All_ALLOCATED   = 0x01.
>> +
>> +            36 - 43:    optional features
>> +                        Not used now. Reserved for future use. It must be set to 0.
>> +
>> +            44 - 47:    header pages size
>> +                        The header field is variable-sized. This field indicates
>> +                        how many pages(4k) will be used to store add-cow header.
>> +                        In add-cow v1, it is fixed to 1, so the header size will
>> +                        be 4k * 1 = 4096 bytes.
>> +
>> +            48 - 63:    backing file format
>> +                        format of backing file. It will be filled with 0 if
>> +                        backing file name offset is 0. If backing file name
>> +                        offset is non-zero, it must be non-zero. It is coded
>> +                        in free-form ASCII, and is not NUL-terminated.
>> +
>> +            64 - 79:    image file format
>> +                        format of image file. It must be non-zero. It is coded
>> +                        in free-form ASCII, and is not NUL-terminated.
>> +
>> +            80 - [HEADER_SIZE - 1]:
>> +                        It is used to make sure COW bitmap field starts at the
>> +                        HEADER_SIZE byte, backing file name and image file name
>> +                        will be stored here. The bytes that is not pointing to
>> +                        backing file and image file names will bet set to 0.
>> +
>> +== COW bitmap ==
>> +
>> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
>> +backing file and image file. The bitmap will track whether the sector in
>> +backing file is dirty or not.
>> +
>> +Each bit in the bitmap indicates one cluster's status. One cluster includes 128
>> +sectors, then each bit indicates 512 * 128 = 64k bytes. the size of bitmap is
>> +calculated according to virtual size of image file, and it also should be multipe
>> +of 65536, the bits not used will be set to 0. Within each byte, the least
>> +significant bit covers the first cluster. Bit orders in one byte look like:
>> + +----+----+----+----+----+----+----+----+
>> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
>> + +----+----+----+----+----+----+----+----+
>> +
>> +If the bit is 0, indicates the sector has not been allocated in image file, data
>> +should be loaded from backing file while reading; if the bit is 1, indicates the
>> +related sector has been dirty, should be loaded from image file while reading.
>> +Writing to a sector causes the corresponding bit to be set to 1.
>> +
>> +If raw image is not an even multiple of cluster bytes, bits that correspond to
>> +bytes beyond the raw file size in add-cow will be 0.
>> +
>> +Image file name and backing file name must NOT be the same, we prevent this
>> +while creating add-cow files.
>> +
>> +Image file and backing file are interpreted relative to the qcow2 file, not
>
> Relative to the add-cow file?

Ah, yes..

>
>> +to the current working directory of the process that opened the qcow2 file.
>> --
>> 1.7.1
>>
>>
>
* Re: [Qemu-devel] [PATCH V12 1/6] docs: document for add-cow file format
  2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 1/6] docs: document for " Dong Xu Wang
  2012-09-06 17:27   ` Michael Roth
@ 2012-09-10 15:23   ` Kevin Wolf
  2012-09-11  2:12     ` Dong Xu Wang
  1 sibling, 1 reply; 25+ messages in thread
From: Kevin Wolf @ 2012-09-10 15:23 UTC
To: Dong Xu Wang; +Cc: qemu-devel

On 10.08.2012 17:39, Dong Xu Wang wrote:
> Document for add-cow format, the usage and spec of add-cow are introduced.
>
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> ---
>  docs/specs/add-cow.txt |  123 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 123 insertions(+), 0 deletions(-)
>  create mode 100644 docs/specs/add-cow.txt
>
> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
> new file mode 100644
> index 0000000..d5a7a68
> --- /dev/null
> +++ b/docs/specs/add-cow.txt
> @@ -0,0 +1,123 @@
> +== General ==
> +
> +The raw file format does not support backing files or copy on write feature.
> +The add-cow image format makes it possible to use backing files with raw
> +image by keeping a separate .add-cow metadata file. Once all sectors
> +have been written into the raw image it is safe to discard the .add-cow
> +and backing files, then we can use the raw image directly.
> +
> +An example usage of add-cow would look like::
> +(ubuntu.img is a disk image which has been installed OS.)
> +    1)  Create a raw image with the same size of ubuntu.img
> +            qemu-img create -f raw test.raw 8G
> +    2)  Create an add-cow image which will store dirty bitmap
> +            qemu-img create -f add-cow test.add-cow \
> +                -o backing_file=ubuntu.img,image_file=test.raw
> +    3)  Run qemu with add-cow image
> +            qemu -drive if=virtio,file=test.add-cow
> +
> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow
> +will be calculated from the size of test.raw.
> +
> +=Specification=
> +
> +The file format looks like this:
> +
> + +---------------+-------------+-----------------+
> + |     Header    |   Reserved  |    COW bitmap   |
> + +---------------+-------------+-----------------+
> +
> +All numbers in add-cow are stored in Little Endian byte order.
> +
> +== Header ==
> +
> +The Header is included in the first bytes:
> +(#define HEADER_SIZE (4096 * header_pages_size))
> +    Byte    0 -  7:     magic
> +                        add-cow magic string ("ADD_COW\xff").
> +
> +            8 - 11:     version
> +                        Version number (only valid value is 1 now).
> +
> +            12 - 15:    backing file name offset
> +                        Offset in the add-cow file at which the backing file
> +                        name is stored (NB: The string is not nul-terminated).
> +                        If backing file name does NOT exist, this field will be
> +                        0. Must be between 80 and [HEADER_SIZE - 2](a file name
> +                        must be at least 1 byte).
> +
> +            16 - 19:    backing file name size
> +                        Length of the backing file name in bytes. It will be 0
> +                        if the backing file name offset is 0. If backing file
> +                        name offset is non-zero, then it must be non-zero. Must
> +                        be less than [HEADER_SIZE - 80] to fit in the reserved
> +                        part of the header.
> +
> +            20 - 23:    image file name offset
> +                        Offset in the add-cow file at which the image file name
> +                        is stored (NB: The string is not null terminated). It
> +                        must be between 80 and [HEADER_SIZE - 2].
> +
> +            24 - 27:    image file name size
> +                        Length of the image file name in bytes.
> +                        Must be less than [HEADER_SIZE - 80] to fit in the reserved
> +                        part of the header.
> +
> +            28 - 35:    features
> +                        Currently only 1 feature bit is used:

What happens when opening a file with an unknown bit set? How must
unknown bits be initialised?

> +                        Feature bits:
> +                            * ADD_COW_F_All_ALLOCATED   = 0x01.

What does this flag mean, and is it required to be set on that
condition? Also, please use ALL_CAPS.

> +
> +            36 - 43:    optional features
> +                        Not used now. Reserved for future use. It must be set to 0.

And must be ignored when reading.

> +
> +            44 - 47:    header pages size
> +                        The header field is variable-sized. This field indicates
> +                        how many pages(4k) will be used to store add-cow header.
> +                        In add-cow v1, it is fixed to 1, so the header size will
> +                        be 4k * 1 = 4096 bytes.

Why arbitrarily defined "pages" instead of bytes or at least clusters?

> +
> +            48 - 63:    backing file format
> +                        format of backing file. It will be filled with 0 if
> +                        backing file name offset is 0. If backing file name
> +                        offset is non-zero, it must be non-zero. It is coded
> +                        in free-form ASCII, and is not NUL-terminated.

Zero padded on the right, I guess?

Also defining that a string must be "non-zero" looks odd, should
probably be "non-empty".

> +
> +            64 - 79:    image file format
> +                        format of image file. It must be non-zero. It is coded
> +                        in free-form ASCII, and is not NUL-terminated.

Same here.

> +
> +            80 - [HEADER_SIZE - 1]:
> +                        It is used to make sure COW bitmap field starts at the
> +                        HEADER_SIZE byte, backing file name and image file name
> +                        will be stored here. The bytes that is not pointing to
> +                        backing file and image file names will bet set to 0.

"will be set to 0" describes the behaviour of qemu. A spec should
describe the file format, not a specific implementation. Make it "must"
or "should".

> +
> +== COW bitmap ==
> +
> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
> +backing file and image file. The bitmap will track whether the sector in
> +backing file is dirty or not.
> +
> +Each bit in the bitmap indicates one cluster's status. One cluster includes 128
> +sectors, then each bit indicates 512 * 128 = 64k bytes.

Should we make the cluster size configurable?

> the size of bitmap is
> +calculated according to virtual size of image file, and it also should be multipe

Typo: multiple

Sure you mean "should", or should it be "must"?

> +of 65536, the bits not used will be set to 0. Within each byte, the least
> +significant bit covers the first cluster. Bit orders in one byte look like:
> + +----+----+----+----+----+----+----+----+
> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
> + +----+----+----+----+----+----+----+----+
> +
> +If the bit is 0, indicates the sector has not been allocated in image file, data
> +should be loaded from backing file while reading; if the bit is 1, indicates the
> +related sector has been dirty, should be loaded from image file while reading.
> +Writing to a sector causes the corresponding bit to be set to 1.
> +
> +If raw image is not an even multiple of cluster bytes, bits that correspond to
> +bytes beyond the raw file size in add-cow will be 0.

"must be written as 0 and must be ignored when reading" or something
like that.

> +Image file name and backing file name must NOT be the same, we prevent this
> +while creating add-cow files.

What we do is irrelevant for a spec.

> +Image file and backing file are interpreted relative to the qcow2 file, not
> +to the current working directory of the process that opened the qcow2 file.

Kevin
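The bitmap rules under review (one bit per 128-sector cluster, least significant bit first, bitmap length rounded up to a multiple of 65536) can be written down as a few helpers. This is a minimal sketch, not code from block/add-cow.c. All function and macro names are hypothetical, and reading "multiple of 65536" as a byte count is an assumption, since the spec text does not name the unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE          512u
#define SECTORS_PER_CLUSTER  128u
#define CLUSTER_SIZE         (SECTOR_SIZE * SECTORS_PER_CLUSTER)  /* 64k */
#define BITMAP_ALIGN         65536u   /* assumption: unit is bytes */

/* Size of the COW bitmap in bytes for a given virtual disk size:
 * one bit per (possibly partial) cluster, rounded up to whole bytes,
 * then rounded up to a multiple of BITMAP_ALIGN. */
static uint64_t add_cow_bitmap_bytes(uint64_t virtual_size)
{
    uint64_t clusters = (virtual_size + CLUSTER_SIZE - 1) / CLUSTER_SIZE;
    uint64_t bytes = (clusters + 7) / 8;

    return (bytes + BITMAP_ALIGN - 1) / BITMAP_ALIGN * BITMAP_ALIGN;
}

/* 1 if the cluster containing sector_num is marked as present in the
 * image file, 0 if reads must fall through to the backing file. */
static int add_cow_sector_allocated(const uint8_t *bitmap, uint64_t sector_num)
{
    uint64_t cluster = sector_num / SECTORS_PER_CLUSTER;

    assert(bitmap != NULL);
    /* Within each byte, bit 0 covers the first cluster. */
    return (bitmap[cluster / 8] >> (cluster % 8)) & 1;
}

/* Record a write: set the bit of the cluster containing sector_num. */
static void add_cow_mark_allocated(uint8_t *bitmap, uint64_t sector_num)
{
    uint64_t cluster = sector_num / SECTORS_PER_CLUSTER;

    assert(bitmap != NULL);
    bitmap[cluster / 8] |= (uint8_t)(1u << (cluster % 8));
}
```

For the 8G example from the spec, 8G / 64k gives 131072 clusters, which need 16384 bitmap bytes; under the byte-unit assumption this rounds up to a single 65536-byte bitmap. On disk, the byte for a cluster would sit at HEADER_SIZE + cluster / 8.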
* Re: [Qemu-devel] [PATCH V12 1/6] docs: document for add-cow file format 2012-09-10 15:23 ` Kevin Wolf @ 2012-09-11 2:12 ` Dong Xu Wang 0 siblings, 0 replies; 25+ messages in thread From: Dong Xu Wang @ 2012-09-11 2:12 UTC (permalink / raw) To: Kevin Wolf; +Cc: qemu-devel On Mon, Sep 10, 2012 at 11:23 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 10.08.2012 17:39, schrieb Dong Xu Wang: >> Document for add-cow format, the usage and spec of add-cow are introduced. >> >> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> >> --- >> docs/specs/add-cow.txt | 123 ++++++++++++++++++++++++++++++++++++++++++++++++ >> 1 files changed, 123 insertions(+), 0 deletions(-) >> create mode 100644 docs/specs/add-cow.txt >> >> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt >> new file mode 100644 >> index 0000000..d5a7a68 >> --- /dev/null >> +++ b/docs/specs/add-cow.txt >> @@ -0,0 +1,123 @@ >> +== General == >> + >> +The raw file format does not support backing files or copy on write feature. >> +The add-cow image format makes it possible to use backing files with raw >> +image by keeping a separate .add-cow metadata file. Once all sectors >> +have been written into the raw image it is safe to discard the .add-cow >> +and backing files, then we can use the raw image directly. >> + >> +An example usage of add-cow would look like:: >> +(ubuntu.img is a disk image which has been installed OS.) >> + 1) Create a raw image with the same size of ubuntu.img >> + qemu-img create -f raw test.raw 8G >> + 2) Create an add-cow image which will store dirty bitmap >> + qemu-img create -f add-cow test.add-cow \ >> + -o backing_file=ubuntu.img,image_file=test.raw >> + 3) Run qemu with add-cow image >> + qemu -drive if=virtio,file=test.add-cow >> + >> +test.raw may be larger than ubuntu.img, in that case, the size of test.add-cow >> +will be calculated from the size of test.raw. 
>> + >> +=Specification= >> + >> +The file format looks like this: >> + >> + +---------------+-------------+-----------------+ >> + | Header | Reserved | COW bitmap | >> + +---------------+-------------+-----------------+ >> + >> +All numbers in add-cow are stored in Little Endian byte order. >> + >> +== Header == >> + >> +The Header is included in the first bytes: >> +(#define HEADER_SIZE (4096 * header_pages_size)) >> + Byte 0 - 7: magic >> + add-cow magic string ("ADD_COW\xff"). >> + >> + 8 - 11: version >> + Version number (only valid value is 1 now). >> + >> + 12 - 15: backing file name offset >> + Offset in the add-cow file at which the backing file >> + name is stored (NB: The string is not nul-terminated). >> + If backing file name does NOT exist, this field will be >> + 0. Must be between 80 and [HEADER_SIZE - 2](a file name >> + must be at least 1 byte). >> + >> + 16 - 19: backing file name size >> + Length of the backing file name in bytes. It will be 0 >> + if the backing file name offset is 0. If backing file >> + name offset is non-zero, then it must be non-zero. Must >> + be less than [HEADER_SIZE - 80] to fit in the reserved >> + part of the header. >> + >> + 20 - 23: image file name offset >> + Offset in the add-cow file at which the image file name >> + is stored (NB: The string is not null terminated). It >> + must be between 80 and [HEADER_SIZE - 2]. >> + >> + 24 - 27: image file name size >> + Length of the image file name in bytes. >> + Must be less than [HEADER_SIZE - 80] to fit in the reserved >> + part of the header. >> + >> + 28 - 35: features >> + Currently only 1 feature bit is used: > > What happens when opening a file with an unknown bit set? How must > unknown bits be initialised? Okay, I will code as qcow2, report report_unsupported_feature error. And I will update the spec file. > >> + Feature bits: >> + * ADD_COW_F_All_ALLOCATED = 0x01. > > What does this flag mean, and is it required to be set on that > condition? 
Also, please use ALL_CAPS. This feature bit will used as: qemu-img create -f add-cow -o image_file=t.raw t.add-cow. While creating add-cow and without backing_file, this feature can avoid reading/updating bitmap. I think it can let the code be more faster. And also, maybe, I can implement add_cow_check, check if the feature bit should be set. How do you think, Kevin? > >> + >> + 36 - 43: optional features >> + Not used now. Reserved for future use. It must be set to 0. > > And must be ignored when reading. > Okay. >> + >> + 44 - 47: header pages size >> + The header field is variable-sized. This field indicates >> + how many pages(4k) will be used to store add-cow header. >> + In add-cow v1, it is fixed to 1, so the header size will >> + be 4k * 1 = 4096 bytes. > > Why arbitrarily defined "pages" instead of bytes or at least clusters? Okay, next version I will just caclulate it by bytes. > >> + >> + 48 - 63: backing file format >> + format of backing file. It will be filled with 0 if >> + backing file name offset is 0. If backing file name >> + offset is non-zero, it must be non-zero. It is coded >> + in free-form ASCII, and is not NUL-terminated. > > Zero padded on the right, I guess? Yes, will update. > > Also defining that a string must be "non-zero" looks odd, should > probably be "non-empty". > Okay. >> + >> + 64 - 79: image file format >> + format of image file. It must be non-zero. It is coded >> + in free-form ASCII, and is not NUL-terminated. > > Same here. Okay. > >> + >> + 80 - [HEADER_SIZE - 1]: >> + It is used to make sure COW bitmap field starts at the >> + HEADER_SIZE byte, backing file name and image file name >> + will be stored here. The bytes that is not pointing to >> + backing file and image file names will bet set to 0. > > "will be set to 0" describes the behaviour of qemu. A spec should > describe the file format, not a specific implementation. Make it "must" > or "should". Okay. 
>
>> +
>> +== COW bitmap ==
>> +
>> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap related to
>> +backing file and image file. The bitmap will track whether the sector in
>> +backing file is dirty or not.
>> +
>> +Each bit in the bitmap indicates one cluster's status. One cluster includes 128
>> +sectors, then each bit indicates 512 * 128 = 64k bytes.
>
> Should we make the cluster size configurable?
>
>> the size of bitmap is
>> +calculated according to virtual size of image file, and it also should be multipe
>
> Typo: multiple
>
> Sure you mean "should", or should it be "must"?

Okay.

>
>> +of 65536, the bits not used will be set to 0. Within each byte, the least
>> +significant bit covers the first cluster. Bit orders in one byte look like:
>> + +----+----+----+----+----+----+----+----+
>> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
>> + +----+----+----+----+----+----+----+----+
>> +
>> +If the bit is 0, indicates the sector has not been allocated in image file, data
>> +should be loaded from backing file while reading; if the bit is 1, indicates the
>> +related sector has been dirty, should be loaded from image file while reading.
>> +Writing to a sector causes the corresponding bit to be set to 1.
>> +
>> +If raw image is not an even multiple of cluster bytes, bits that correspond to
>> +bytes beyond the raw file size in add-cow will be 0.
>
> "must be written as 0 and must be ignored when reading" or something
> like that.

Okay.

>
>> +Image file name and backing file name must NOT be the same, we prevent this
>> +while creating add-cow files.
>
> What we do is irrelevant for a spec.

Okay.

>
>> +Image file and backing file are interpreted relative to the qcow2 file, not
>> +to the current working directory of the process that opened the qcow2 file.
>
> Kevin
>

Thank you, Kevin.

^ permalink raw reply [flat|nested] 25+ messages in thread
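To make the bitmap rules from the quoted spec concrete, here is a minimal C sketch of the lookup: one bit per 128-sector cluster, with the least significant bit of each byte covering the first cluster. The helper names (add_cow_cluster_index, add_cow_bit_is_set, add_cow_set_bit) and the choice to treat out-of-range clusters as unallocated are assumptions made for illustration; they are not taken from block/add-cow.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* From the quoted spec: one cluster is 128 sectors of 512 bytes (64k). */
#define ADD_COW_CLUSTER_SECTORS 128u

/* Hypothetical helper: index of the cluster containing sector_num. */
static uint64_t add_cow_cluster_index(uint64_t sector_num)
{
    return sector_num / ADD_COW_CLUSTER_SECTORS;
}

/*
 * Hypothetical helper: returns 1 if the cluster holding sector_num is
 * marked allocated (read from the image file), 0 otherwise (read from
 * the backing file). "bitmap" is an in-memory copy of the COW bitmap,
 * so bitmap[0] corresponds to file offset HEADER_SIZE.
 * Assumption: clusters beyond the bitmap are treated as unallocated.
 */
static int add_cow_bit_is_set(const uint8_t *bitmap, size_t bitmap_len,
                              uint64_t sector_num)
{
    uint64_t cluster = add_cow_cluster_index(sector_num);
    uint64_t byte = cluster / 8;
    unsigned bit = (unsigned)(cluster % 8); /* b0 covers the first cluster */

    assert(bitmap != NULL);
    if (byte >= bitmap_len) {
        return 0;
    }
    return (bitmap[byte] >> bit) & 1;
}

/*
 * Hypothetical helper: mark the cluster holding sector_num as allocated,
 * as a write to that sector would. Returns 0 on success, -1 if the
 * cluster lies beyond the bitmap.
 */
static int add_cow_set_bit(uint8_t *bitmap, size_t bitmap_len,
                           uint64_t sector_num)
{
    uint64_t cluster = add_cow_cluster_index(sector_num);
    uint64_t byte = cluster / 8;
    unsigned bit = (unsigned)(cluster % 8);

    assert(bitmap != NULL);
    if (byte >= bitmap_len) {
        return -1;
    }
    bitmap[byte] |= (uint8_t)(1u << bit);
    return 0;
}
```

With this layout, sector 0 maps to bit b0 of byte 0, and any sector in cluster 9 maps to bit b1 of byte 1. Whether a reader should reject or silently ignore out-of-range clusters is left open by the spec text under review.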
* [Qemu-devel] [PATCH V12 2/6] make path_has_protocol non-static 2012-08-10 15:39 [Qemu-devel] [PATCH V12 0/6] add-cow file format Dong Xu Wang 2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 1/6] docs: document for " Dong Xu Wang @ 2012-08-10 15:39 ` Dong Xu Wang 2012-09-06 17:27 ` Michael Roth 2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 3/6] qed_read_string to bdrv_read_string Dong Xu Wang ` (4 subsequent siblings) 6 siblings, 1 reply; 25+ messages in thread From: Dong Xu Wang @ 2012-08-10 15:39 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, Dong Xu Wang We will use path_has_protocol outside block.c, so just make it public. Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> --- block.c | 2 +- block.h | 1 + 2 files changed, 2 insertions(+), 1 deletions(-) diff --git a/block.c b/block.c index 24323c1..c13d803 100644 --- a/block.c +++ b/block.c @@ -196,7 +196,7 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs, } /* check if the path starts with "<protocol>:" */ -static int path_has_protocol(const char *path) +int path_has_protocol(const char *path) { const char *p; diff --git a/block.h b/block.h index 650d872..54e61c9 100644 --- a/block.h +++ b/block.h @@ -307,6 +307,7 @@ char *bdrv_snapshot_dump(char *buf, int buf_size, QEMUSnapshotInfo *sn); char *get_human_readable_size(char *buf, int buf_size, int64_t size); int path_is_absolute(const char *path); +int path_has_protocol(const char *path); void path_combine(char *dest, int dest_size, const char *base_path, const char *filename); -- 1.7.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
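For readers without block.c at hand, the following is a rough, self-contained sketch of what a "starts with <protocol>:" check can look like. Only the comment and the final `return *p == ':'` are visible in the patch context; the strcspn() scan and the name sketch_path_has_protocol are assumptions for illustration, and any Windows drive-letter handling in the real function is deliberately left out.

```c
#include <assert.h>
#include <string.h>

/*
 * Sketch only: returns non-zero if path starts with "<protocol>:".
 * Assumption: the scan stops at the first ':' or '/', so a colon that
 * appears after a directory separator does not count as a protocol.
 */
static int sketch_path_has_protocol(const char *path)
{
    const char *p;

    assert(path != NULL);
    /* Skip everything up to the first ':' or '/' (or the end). */
    p = path + strcspn(path, ":/");

    return *p == ':';
}
```

Under these assumptions, "nbd:localhost:10809" is treated as having a protocol prefix, while "/tmp/disk.img" and "dir/a:b" are not.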
* Re: [Qemu-devel] [PATCH V12 2/6] make path_has_protocol non-static 2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 2/6] make path_has_protocol non-static Dong Xu Wang @ 2012-09-06 17:27 ` Michael Roth 0 siblings, 0 replies; 25+ messages in thread From: Michael Roth @ 2012-09-06 17:27 UTC (permalink / raw) To: Dong Xu Wang; +Cc: kwolf, qemu-devel On Fri, Aug 10, 2012 at 11:39:41PM +0800, Dong Xu Wang wrote: > We will use path_has_protocol outside block.c, so just make it public. > > Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> > --- > block.c | 2 +- > block.h | 1 + > 2 files changed, 2 insertions(+), 1 deletions(-) > > diff --git a/block.c b/block.c > index 24323c1..c13d803 100644 > --- a/block.c > +++ b/block.c > @@ -196,7 +196,7 @@ static void bdrv_io_limits_intercept(BlockDriverState *bs, > } > > /* check if the path starts with "<protocol>:" */ > -static int path_has_protocol(const char *path) > +int path_has_protocol(const char *path) > { > const char *p; > > diff --git a/block.h b/block.h > index 650d872..54e61c9 100644 > --- a/block.h > +++ b/block.h > @@ -307,6 +307,7 @@ char *bdrv_snapshot_dump(char *buf, int buf_size, QEMUSnapshotInfo *sn); > > char *get_human_readable_size(char *buf, int buf_size, int64_t size); > int path_is_absolute(const char *path); > +int path_has_protocol(const char *path); > void path_combine(char *dest, int dest_size, > const char *base_path, > const char *filename); > -- > 1.7.1 > > ^ permalink raw reply [flat|nested] 25+ messages in thread
* [Qemu-devel] [PATCH V12 3/6] qed_read_string to bdrv_read_string 2012-08-10 15:39 [Qemu-devel] [PATCH V12 0/6] add-cow file format Dong Xu Wang 2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 1/6] docs: document for " Dong Xu Wang 2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 2/6] make path_has_protocol non-static Dong Xu Wang @ 2012-08-10 15:39 ` Dong Xu Wang 2012-09-06 17:32 ` Michael Roth 2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 4/6] rename qcow2-cache.c to block-cache.c Dong Xu Wang ` (3 subsequent siblings) 6 siblings, 1 reply; 25+ messages in thread From: Dong Xu Wang @ 2012-08-10 15:39 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, Dong Xu Wang Make qed_read_string function to a common interface, so move it to block.c. Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> --- block.c | 27 +++++++++++++++++++++++++++ block.h | 2 ++ block/qed.c | 29 +---------------------------- 3 files changed, 30 insertions(+), 28 deletions(-) diff --git a/block.c b/block.c index c13d803..d906b35 100644 --- a/block.c +++ b/block.c @@ -213,6 +213,33 @@ int path_has_protocol(const char *path) return *p == ':'; } +/** + * Read a string of known length from the image file + * + * @bs: Image file + * @offset: File offset to start of string, in bytes + * @n: String length in bytes + * @buf: Destination buffer + * @buflen: Destination buffer length in bytes + * @ret: 0 on success, -errno on failure + * + * The string is NUL-terminated. 
+ */ +int bdrv_read_string(BlockDriverState *bs, uint64_t offset, size_t n, + char *buf, size_t buflen) +{ + int ret; + if (n >= buflen) { + return -EINVAL; + } + ret = bdrv_pread(bs, offset, buf, n); + if (ret < 0) { + return ret; + } + buf[n] = '\0'; + return 0; +} + int path_is_absolute(const char *path) { #ifdef _WIN32 diff --git a/block.h b/block.h index 54e61c9..e5dfcd7 100644 --- a/block.h +++ b/block.h @@ -154,6 +154,8 @@ int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset, const void *buf, int count); int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov); +int bdrv_read_string(BlockDriverState *bs, uint64_t offset, size_t n, + char *buf, size_t buflen); int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov); int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num, diff --git a/block/qed.c b/block/qed.c index 5f3eefa..311c589 100644 --- a/block/qed.c +++ b/block/qed.c @@ -217,33 +217,6 @@ static bool qed_is_image_size_valid(uint64_t image_size, uint32_t cluster_size, } /** - * Read a string of known length from the image file - * - * @file: Image file - * @offset: File offset to start of string, in bytes - * @n: String length in bytes - * @buf: Destination buffer - * @buflen: Destination buffer length in bytes - * @ret: 0 on success, -errno on failure - * - * The string is NUL-terminated. 
- */ -static int qed_read_string(BlockDriverState *file, uint64_t offset, size_t n, - char *buf, size_t buflen) -{ - int ret; - if (n >= buflen) { - return -EINVAL; - } - ret = bdrv_pread(file, offset, buf, n); - if (ret < 0) { - return ret; - } - buf[n] = '\0'; - return 0; -} - -/** * Allocate new clusters * * @s: QED state @@ -437,7 +410,7 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags) return -EINVAL; } - ret = qed_read_string(bs->file, s->header.backing_filename_offset, + ret = bdrv_read_string(bs->file, s->header.backing_filename_offset, s->header.backing_filename_size, bs->backing_file, sizeof(bs->backing_file)); if (ret < 0) { -- 1.7.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
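The contract of bdrv_read_string() shown in the patch (reject n >= buflen, read n bytes, then NUL-terminate) can be exercised outside QEMU with a small stand-in. In this sketch, mem_pread() and mem_read_string() are hypothetical names that replace bdrv_pread() and bdrv_read_string() with an in-memory buffer; the -EIO returned for short reads is an assumption of the stand-in, not a statement about bdrv_pread().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for bdrv_pread(): read from an in-memory image. */
static int mem_pread(const char *image, size_t image_len, size_t offset,
                     void *buf, size_t n)
{
    if (offset > image_len || n > image_len - offset) {
        return -EIO; /* assumption: out-of-range reads fail with -EIO */
    }
    memcpy(buf, image + offset, n);
    return (int)n;
}

/*
 * Same control flow as the bdrv_read_string() in the patch: the string
 * stored in the image is not NUL-terminated, so the destination needs
 * room for n bytes plus the terminator added here.
 */
static int mem_read_string(const char *image, size_t image_len,
                           size_t offset, size_t n,
                           char *buf, size_t buflen)
{
    int ret;

    assert(image != NULL && buf != NULL);
    if (n >= buflen) {
        return -EINVAL; /* no room for the trailing NUL */
    }
    ret = mem_pread(image, image_len, offset, buf, n);
    if (ret < 0) {
        return ret;
    }
    buf[n] = '\0';
    return 0;
}
```

The n >= buflen check is what lets callers pass a fixed-size field such as bs->backing_file together with a length taken from an untrusted header: an oversized length is rejected before any bytes are copied.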
* Re: [Qemu-devel] [PATCH V12 3/6] qed_read_string to bdrv_read_string 2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 3/6] qed_read_string to bdrv_read_string Dong Xu Wang @ 2012-09-06 17:32 ` Michael Roth 2012-09-10 1:49 ` Dong Xu Wang 0 siblings, 1 reply; 25+ messages in thread From: Michael Roth @ 2012-09-06 17:32 UTC (permalink / raw) To: Dong Xu Wang; +Cc: kwolf, qemu-devel On Fri, Aug 10, 2012 at 11:39:42PM +0800, Dong Xu Wang wrote: > Make qed_read_string function to a common interface, so move it to block.c. > > Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> > --- > block.c | 27 +++++++++++++++++++++++++++ > block.h | 2 ++ > block/qed.c | 29 +---------------------------- > 3 files changed, 30 insertions(+), 28 deletions(-) > > diff --git a/block.c b/block.c > index c13d803..d906b35 100644 > --- a/block.c > +++ b/block.c > @@ -213,6 +213,33 @@ int path_has_protocol(const char *path) > return *p == ':'; > } > > +/** > + * Read a string of known length from the image file > + * > + * @bs: Image file > + * @offset: File offset to start of string, in bytes > + * @n: String length in bytes > + * @buf: Destination buffer > + * @buflen: Destination buffer length in bytes > + * @ret: 0 on success, -errno on failure > + * > + * The string is NUL-terminated. 
> + */ > +int bdrv_read_string(BlockDriverState *bs, uint64_t offset, size_t n, > + char *buf, size_t buflen) Small alignment issue ^ > +{ > + int ret; > + if (n >= buflen) { > + return -EINVAL; > + } > + ret = bdrv_pread(bs, offset, buf, n); > + if (ret < 0) { > + return ret; > + } > + buf[n] = '\0'; > + return 0; > +} > + > int path_is_absolute(const char *path) > { > #ifdef _WIN32 > diff --git a/block.h b/block.h > index 54e61c9..e5dfcd7 100644 > --- a/block.h > +++ b/block.h > @@ -154,6 +154,8 @@ int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset, > const void *buf, int count); > int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num, > int nb_sectors, QEMUIOVector *qiov); > +int bdrv_read_string(BlockDriverState *bs, uint64_t offset, size_t n, > + char *buf, size_t buflen); Another one here ^ > int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs, > int64_t sector_num, int nb_sectors, QEMUIOVector *qiov); > int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num, > diff --git a/block/qed.c b/block/qed.c > index 5f3eefa..311c589 100644 > --- a/block/qed.c > +++ b/block/qed.c > @@ -217,33 +217,6 @@ static bool qed_is_image_size_valid(uint64_t image_size, uint32_t cluster_size, > } > > /** > - * Read a string of known length from the image file > - * > - * @file: Image file > - * @offset: File offset to start of string, in bytes > - * @n: String length in bytes > - * @buf: Destination buffer > - * @buflen: Destination buffer length in bytes > - * @ret: 0 on success, -errno on failure > - * > - * The string is NUL-terminated. 
> - */ > -static int qed_read_string(BlockDriverState *file, uint64_t offset, size_t n, > - char *buf, size_t buflen) > -{ > - int ret; > - if (n >= buflen) { > - return -EINVAL; > - } > - ret = bdrv_pread(file, offset, buf, n); > - if (ret < 0) { > - return ret; > - } > - buf[n] = '\0'; > - return 0; > -} > - > -/** > * Allocate new clusters > * > * @s: QED state > @@ -437,7 +410,7 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags) > return -EINVAL; > } > > - ret = qed_read_string(bs->file, s->header.backing_filename_offset, > + ret = bdrv_read_string(bs->file, s->header.backing_filename_offset, > s->header.backing_filename_size, bs->backing_file, > sizeof(bs->backing_file)); Here too ^ Looks good otherwise. > if (ret < 0) { > -- > 1.7.1 > > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH V12 3/6] qed_read_string to bdrv_read_string 2012-09-06 17:32 ` Michael Roth @ 2012-09-10 1:49 ` Dong Xu Wang 0 siblings, 0 replies; 25+ messages in thread From: Dong Xu Wang @ 2012-09-10 1:49 UTC (permalink / raw) To: Michael Roth; +Cc: kwolf, qemu-devel On Fri, Sep 7, 2012 at 1:32 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote: > On Fri, Aug 10, 2012 at 11:39:42PM +0800, Dong Xu Wang wrote: >> Make qed_read_string function to a common interface, so move it to block.c. >> >> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> >> --- >> block.c | 27 +++++++++++++++++++++++++++ >> block.h | 2 ++ >> block/qed.c | 29 +---------------------------- >> 3 files changed, 30 insertions(+), 28 deletions(-) >> >> diff --git a/block.c b/block.c >> index c13d803..d906b35 100644 >> --- a/block.c >> +++ b/block.c >> @@ -213,6 +213,33 @@ int path_has_protocol(const char *path) >> return *p == ':'; >> } >> >> +/** >> + * Read a string of known length from the image file >> + * >> + * @bs: Image file >> + * @offset: File offset to start of string, in bytes >> + * @n: String length in bytes >> + * @buf: Destination buffer >> + * @buflen: Destination buffer length in bytes >> + * @ret: 0 on success, -errno on failure >> + * >> + * The string is NUL-terminated. 
>> + */ >> +int bdrv_read_string(BlockDriverState *bs, uint64_t offset, size_t n, >> + char *buf, size_t buflen) > > Small alignment issue ^ > >> +{ >> + int ret; >> + if (n >= buflen) { >> + return -EINVAL; >> + } >> + ret = bdrv_pread(bs, offset, buf, n); >> + if (ret < 0) { >> + return ret; >> + } >> + buf[n] = '\0'; >> + return 0; >> +} >> + >> int path_is_absolute(const char *path) >> { >> #ifdef _WIN32 >> diff --git a/block.h b/block.h >> index 54e61c9..e5dfcd7 100644 >> --- a/block.h >> +++ b/block.h >> @@ -154,6 +154,8 @@ int bdrv_pwrite_sync(BlockDriverState *bs, int64_t offset, >> const void *buf, int count); >> int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num, >> int nb_sectors, QEMUIOVector *qiov); >> +int bdrv_read_string(BlockDriverState *bs, uint64_t offset, size_t n, >> + char *buf, size_t buflen); > > Another one here ^ > >> int coroutine_fn bdrv_co_copy_on_readv(BlockDriverState *bs, >> int64_t sector_num, int nb_sectors, QEMUIOVector *qiov); >> int coroutine_fn bdrv_co_writev(BlockDriverState *bs, int64_t sector_num, >> diff --git a/block/qed.c b/block/qed.c >> index 5f3eefa..311c589 100644 >> --- a/block/qed.c >> +++ b/block/qed.c >> @@ -217,33 +217,6 @@ static bool qed_is_image_size_valid(uint64_t image_size, uint32_t cluster_size, >> } >> >> /** >> - * Read a string of known length from the image file >> - * >> - * @file: Image file >> - * @offset: File offset to start of string, in bytes >> - * @n: String length in bytes >> - * @buf: Destination buffer >> - * @buflen: Destination buffer length in bytes >> - * @ret: 0 on success, -errno on failure >> - * >> - * The string is NUL-terminated. 
>> - */ >> -static int qed_read_string(BlockDriverState *file, uint64_t offset, size_t n, >> - char *buf, size_t buflen) >> -{ >> - int ret; >> - if (n >= buflen) { >> - return -EINVAL; >> - } >> - ret = bdrv_pread(file, offset, buf, n); >> - if (ret < 0) { >> - return ret; >> - } >> - buf[n] = '\0'; >> - return 0; >> -} >> - >> -/** >> * Allocate new clusters >> * >> * @s: QED state >> @@ -437,7 +410,7 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags) >> return -EINVAL; >> } >> >> - ret = qed_read_string(bs->file, s->header.backing_filename_offset, >> + ret = bdrv_read_string(bs->file, s->header.backing_filename_offset, >> s->header.backing_filename_size, bs->backing_file, >> sizeof(bs->backing_file)); > > Here too ^ > > Looks good otherwise. > >> if (ret < 0) { >> -- >> 1.7.1 >> >> > Thank you Michael . ^ permalink raw reply [flat|nested] 25+ messages in thread
* [Qemu-devel] [PATCH V12 4/6] rename qcow2-cache.c to block-cache.c 2012-08-10 15:39 [Qemu-devel] [PATCH V12 0/6] add-cow file format Dong Xu Wang ` (2 preceding siblings ...) 2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 3/6] qed_read_string to bdrv_read_string Dong Xu Wang @ 2012-08-10 15:39 ` Dong Xu Wang 2012-09-06 17:52 ` Michael Roth 2012-09-11 8:41 ` Kevin Wolf 2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 5/6] add-cow file format Dong Xu Wang ` (2 subsequent siblings) 6 siblings, 2 replies; 25+ messages in thread From: Dong Xu Wang @ 2012-08-10 15:39 UTC (permalink / raw) To: qemu-devel; +Cc: kwolf, Dong Xu Wang The add-cow and qcow2 file formats will share the same cache code, so rename qcow2-cache.c to block-cache.c. The related structures and qcow2 code are changed accordingly. Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> --- block.h | 3 + block/Makefile.objs | 3 +- block/qcow2-cache.c | 323 ------------------------------------------------ block/qcow2-cluster.c | 66 ++++++---- block/qcow2-refcount.c | 66 ++++++----- block/qcow2.c | 36 +++--- block/qcow2.h | 24 +--- trace-events | 13 +- 8 files changed, 109 insertions(+), 425 deletions(-) delete mode 100644 block/qcow2-cache.c diff --git a/block.h b/block.h index e5dfcd7..c325661 100644 --- a/block.h +++ b/block.h @@ -401,6 +401,9 @@ typedef enum { BLKDBG_CLUSTER_ALLOC_BYTES, BLKDBG_CLUSTER_FREE, + BLKDBG_ADD_COW_UPDATE, + BLKDBG_ADD_COW_LOAD, + BLKDBG_EVENT_MAX, } BlkDebugEvent; diff --git a/block/Makefile.objs b/block/Makefile.objs index b5754d3..23bdfc8 100644 --- a/block/Makefile.objs +++ b/block/Makefile.objs @@ -1,7 +1,8 @@ block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o -block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o +block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o block-obj-y += qed-check.o +block-obj-y += block-cache.o block-obj-y 
+= parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o block-obj-y += stream.o block-obj-$(CONFIG_WIN32) += raw-win32.o diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c deleted file mode 100644 index 2d4322a..0000000 --- a/block/qcow2-cache.c +++ /dev/null @@ -1,323 +0,0 @@ -/* - * L2/refcount table cache for the QCOW2 format - * - * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com> - * - * Permission is hereby granted, free of charge, to any person obtaining a copy - * of this software and associated documentation files (the "Software"), to deal - * in the Software without restriction, including without limitation the rights - * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell - * copies of the Software, and to permit persons to whom the Software is - * furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice shall be included in - * all copies or substantial portions of the Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN - * THE SOFTWARE. 
- */ - -#include "block_int.h" -#include "qemu-common.h" -#include "qcow2.h" -#include "trace.h" - -typedef struct Qcow2CachedTable { - void* table; - int64_t offset; - bool dirty; - int cache_hits; - int ref; -} Qcow2CachedTable; - -struct Qcow2Cache { - Qcow2CachedTable* entries; - struct Qcow2Cache* depends; - int size; - bool depends_on_flush; -}; - -Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables) -{ - BDRVQcowState *s = bs->opaque; - Qcow2Cache *c; - int i; - - c = g_malloc0(sizeof(*c)); - c->size = num_tables; - c->entries = g_malloc0(sizeof(*c->entries) * num_tables); - - for (i = 0; i < c->size; i++) { - c->entries[i].table = qemu_blockalign(bs, s->cluster_size); - } - - return c; -} - -int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c) -{ - int i; - - for (i = 0; i < c->size; i++) { - assert(c->entries[i].ref == 0); - qemu_vfree(c->entries[i].table); - } - - g_free(c->entries); - g_free(c); - - return 0; -} - -static int qcow2_cache_flush_dependency(BlockDriverState *bs, Qcow2Cache *c) -{ - int ret; - - ret = qcow2_cache_flush(bs, c->depends); - if (ret < 0) { - return ret; - } - - c->depends = NULL; - c->depends_on_flush = false; - - return 0; -} - -static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i) -{ - BDRVQcowState *s = bs->opaque; - int ret = 0; - - if (!c->entries[i].dirty || !c->entries[i].offset) { - return 0; - } - - trace_qcow2_cache_entry_flush(qemu_coroutine_self(), - c == s->l2_table_cache, i); - - if (c->depends) { - ret = qcow2_cache_flush_dependency(bs, c); - } else if (c->depends_on_flush) { - ret = bdrv_flush(bs->file); - if (ret >= 0) { - c->depends_on_flush = false; - } - } - - if (ret < 0) { - return ret; - } - - if (c == s->refcount_block_cache) { - BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_UPDATE_PART); - } else if (c == s->l2_table_cache) { - BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE); - } - - ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table, - 
s->cluster_size); - if (ret < 0) { - return ret; - } - - c->entries[i].dirty = false; - - return 0; -} - -int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c) -{ - BDRVQcowState *s = bs->opaque; - int result = 0; - int ret; - int i; - - trace_qcow2_cache_flush(qemu_coroutine_self(), c == s->l2_table_cache); - - for (i = 0; i < c->size; i++) { - ret = qcow2_cache_entry_flush(bs, c, i); - if (ret < 0 && result != -ENOSPC) { - result = ret; - } - } - - if (result == 0) { - ret = bdrv_flush(bs->file); - if (ret < 0) { - result = ret; - } - } - - return result; -} - -int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, - Qcow2Cache *dependency) -{ - int ret; - - if (dependency->depends) { - ret = qcow2_cache_flush_dependency(bs, dependency); - if (ret < 0) { - return ret; - } - } - - if (c->depends && (c->depends != dependency)) { - ret = qcow2_cache_flush_dependency(bs, c); - if (ret < 0) { - return ret; - } - } - - c->depends = dependency; - return 0; -} - -void qcow2_cache_depends_on_flush(Qcow2Cache *c) -{ - c->depends_on_flush = true; -} - -static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c) -{ - int i; - int min_count = INT_MAX; - int min_index = -1; - - - for (i = 0; i < c->size; i++) { - if (c->entries[i].ref) { - continue; - } - - if (c->entries[i].cache_hits < min_count) { - min_index = i; - min_count = c->entries[i].cache_hits; - } - - /* Give newer hits priority */ - /* TODO Check how to optimize the replacement strategy */ - c->entries[i].cache_hits /= 2; - } - - if (min_index == -1) { - /* This can't happen in current synchronous code, but leave the check - * here as a reminder for whoever starts using AIO with the cache */ - abort(); - } - return min_index; -} - -static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c, - uint64_t offset, void **table, bool read_from_disk) -{ - BDRVQcowState *s = bs->opaque; - int i; - int ret; - - trace_qcow2_cache_get(qemu_coroutine_self(), c == s->l2_table_cache, - offset, 
read_from_disk); - - /* Check if the table is already cached */ - for (i = 0; i < c->size; i++) { - if (c->entries[i].offset == offset) { - goto found; - } - } - - /* If not, write a table back and replace it */ - i = qcow2_cache_find_entry_to_replace(c); - trace_qcow2_cache_get_replace_entry(qemu_coroutine_self(), - c == s->l2_table_cache, i); - if (i < 0) { - return i; - } - - ret = qcow2_cache_entry_flush(bs, c, i); - if (ret < 0) { - return ret; - } - - trace_qcow2_cache_get_read(qemu_coroutine_self(), - c == s->l2_table_cache, i); - c->entries[i].offset = 0; - if (read_from_disk) { - if (c == s->l2_table_cache) { - BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD); - } - - ret = bdrv_pread(bs->file, offset, c->entries[i].table, s->cluster_size); - if (ret < 0) { - return ret; - } - } - - /* Give the table some hits for the start so that it won't be replaced - * immediately. The number 32 is completely arbitrary. */ - c->entries[i].cache_hits = 32; - c->entries[i].offset = offset; - - /* And return the right table */ -found: - c->entries[i].cache_hits++; - c->entries[i].ref++; - *table = c->entries[i].table; - - trace_qcow2_cache_get_done(qemu_coroutine_self(), - c == s->l2_table_cache, i); - - return 0; -} - -int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, - void **table) -{ - return qcow2_cache_do_get(bs, c, offset, table, true); -} - -int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, - void **table) -{ - return qcow2_cache_do_get(bs, c, offset, table, false); -} - -int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table) -{ - int i; - - for (i = 0; i < c->size; i++) { - if (c->entries[i].table == *table) { - goto found; - } - } - return -ENOENT; - -found: - c->entries[i].ref--; - *table = NULL; - - assert(c->entries[i].ref >= 0); - return 0; -} - -void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table) -{ - int i; - - for (i = 0; i < c->size; i++) { - if (c->entries[i].table == table) { - goto 
found; - } - } - abort(); - -found: - c->entries[i].dirty = true; -} diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index e179211..335dc7a 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -28,6 +28,7 @@ #include "block_int.h" #include "block/qcow2.h" #include "trace.h" +#include "block-cache.h" int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size) { @@ -69,7 +70,8 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size) return new_l1_table_offset; } - ret = qcow2_cache_flush(bs, s->refcount_block_cache); + ret = block_cache_flush(bs, s->refcount_block_cache, + BLOCK_TABLE_REF, s->cluster_size); if (ret < 0) { goto fail; } @@ -119,7 +121,8 @@ static int l2_load(BlockDriverState *bs, uint64_t l2_offset, BDRVQcowState *s = bs->opaque; int ret; - ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, (void**) l2_table); + ret = block_cache_get(bs, s->l2_table_cache, l2_offset, + (void **) l2_table, BLOCK_TABLE_L2, s->cluster_size); return ret; } @@ -180,7 +183,8 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) return l2_offset; } - ret = qcow2_cache_flush(bs, s->refcount_block_cache); + ret = block_cache_flush(bs, s->refcount_block_cache, + BLOCK_TABLE_REF, s->cluster_size); if (ret < 0) { goto fail; } @@ -188,7 +192,8 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) /* allocate a new entry in the l2 cache */ trace_qcow2_l2_allocate_get_empty(bs, l1_index); - ret = qcow2_cache_get_empty(bs, s->l2_table_cache, l2_offset, (void**) table); + ret = block_cache_get_empty(bs, s->l2_table_cache, l2_offset, + (void **) table, BLOCK_TABLE_L2, s->cluster_size); if (ret < 0) { return ret; } @@ -203,16 +208,17 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) /* if there was an old l2 table, read it from the disk */ BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_COW_READ); - ret = qcow2_cache_get(bs, s->l2_table_cache, + 
ret = block_cache_get(bs, s->l2_table_cache, old_l2_offset & L1E_OFFSET_MASK, - (void**) &old_table); + (void **) &old_table, BLOCK_TABLE_L2, s->cluster_size); if (ret < 0) { goto fail; } memcpy(l2_table, old_table, s->cluster_size); - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &old_table); + ret = block_cache_put(bs, s->l2_table_cache, + (void **) &old_table, BLOCK_TABLE_L2); if (ret < 0) { goto fail; } @@ -222,8 +228,9 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE); trace_qcow2_l2_allocate_write_l2(bs, l1_index); - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); - ret = qcow2_cache_flush(bs, s->l2_table_cache); + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); + ret = block_cache_flush(bs, s->l2_table_cache, + BLOCK_TABLE_L2, s->cluster_size); if (ret < 0) { goto fail; } @@ -242,7 +249,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) fail: trace_qcow2_l2_allocate_done(bs, l1_index, ret); - qcow2_cache_put(bs, s->l2_table_cache, (void**) table); + block_cache_put(bs, s->l2_table_cache, (void **) table, BLOCK_TABLE_L2); s->l1_table[l1_index] = old_l2_offset; return ret; } @@ -475,7 +482,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset, abort(); } - qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); + block_cache_put(bs, s->l2_table_cache, (void **) &l2_table, BLOCK_TABLE_L2); nb_available = (c * s->cluster_sectors); @@ -584,13 +591,15 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs, * allocated. 
*/ cluster_offset = be64_to_cpu(l2_table[l2_index]); if (cluster_offset & L2E_OFFSET_MASK) { - qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); + block_cache_put(bs, s->l2_table_cache, + (void **) &l2_table, BLOCK_TABLE_L2); return 0; } cluster_offset = qcow2_alloc_bytes(bs, compressed_size); if (cluster_offset < 0) { - qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); + block_cache_put(bs, s->l2_table_cache, + (void **) &l2_table, BLOCK_TABLE_L2); return 0; } @@ -605,9 +614,10 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs, /* compressed clusters never have the copied flag */ BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE_COMPRESSED); - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); l2_table[l2_index] = cpu_to_be64(cluster_offset); - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); + ret = block_cache_put(bs, s->l2_table_cache, + (void **) &l2_table, BLOCK_TABLE_L2); if (ret < 0) { return 0; } @@ -659,18 +669,16 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m) * handled. 
*/ if (cow) { - qcow2_cache_depends_on_flush(s->l2_table_cache); + block_cache_depends_on_flush(s->l2_table_cache); } - if (qcow2_need_accurate_refcounts(s)) { - qcow2_cache_set_dependency(bs, s->l2_table_cache, - s->refcount_block_cache); - } + block_cache_set_dependency(bs, s->l2_table_cache, BLOCK_TABLE_L2, + s->refcount_block_cache, s->cluster_size); ret = get_cluster_table(bs, m->offset, &l2_table, &l2_index); if (ret < 0) { goto err; } - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); for (i = 0; i < m->nb_clusters; i++) { /* if two concurrent writes happen to the same unallocated cluster @@ -687,7 +695,8 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m) } - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); + ret = block_cache_put(bs, s->l2_table_cache, + (void **) &l2_table, BLOCK_TABLE_L2); if (ret < 0) { goto err; } @@ -913,7 +922,8 @@ again: * request to complete. If we still had the reference, we could use up the * whole cache with sleeping requests. 
*/ - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); + ret = block_cache_put(bs, s->l2_table_cache, + (void **) &l2_table, BLOCK_TABLE_L2); if (ret < 0) { return ret; } @@ -1077,14 +1087,15 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset, } /* First remove L2 entries */ - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); l2_table[l2_index + i] = cpu_to_be64(0); /* Then decrease the refcount */ qcow2_free_any_clusters(bs, old_offset, 1); } - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); + ret = block_cache_put(bs, s->l2_table_cache, + (void **) &l2_table, BLOCK_TABLE_L2); if (ret < 0) { return ret; } @@ -1154,7 +1165,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset, old_offset = be64_to_cpu(l2_table[l2_index + i]); /* Update L2 entries */ - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); if (old_offset & QCOW_OFLAG_COMPRESSED) { l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO); qcow2_free_any_clusters(bs, old_offset, 1); @@ -1163,7 +1174,8 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset, } } - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); + ret = block_cache_put(bs, s->l2_table_cache, + (void **) &l2_table, BLOCK_TABLE_L2); if (ret < 0) { return ret; } diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index 5e3f915..728bfc1 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -25,6 +25,7 @@ #include "qemu-common.h" #include "block_int.h" #include "block/qcow2.h" +#include "block-cache.h" static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size); static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, @@ -71,8 +72,8 @@ static int load_refcount_block(BlockDriverState *bs, int ret; BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_LOAD); - ret = 
qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset, - refcount_block); + ret = block_cache_get(bs, s->refcount_block_cache, refcount_block_offset, + refcount_block, BLOCK_TABLE_REF, s->cluster_size); return ret; } @@ -98,8 +99,8 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index) if (!refcount_block_offset) return 0; - ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset, - (void**) &refcount_block); + ret = block_cache_get(bs, s->refcount_block_cache, refcount_block_offset, + (void **) &refcount_block, BLOCK_TABLE_REF, s->cluster_size); if (ret < 0) { return ret; } @@ -108,8 +109,8 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index) ((1 << (s->cluster_bits - REFCOUNT_SHIFT)) - 1); refcount = be16_to_cpu(refcount_block[block_index]); - ret = qcow2_cache_put(bs, s->refcount_block_cache, - (void**) &refcount_block); + ret = block_cache_put(bs, s->refcount_block_cache, + (void **) &refcount_block, BLOCK_TABLE_REF); if (ret < 0) { return ret; } @@ -201,7 +202,8 @@ static int alloc_refcount_block(BlockDriverState *bs, *refcount_block = NULL; /* We write to the refcount table, so we might depend on L2 tables */ - qcow2_cache_flush(bs, s->l2_table_cache); + block_cache_flush(bs, s->l2_table_cache, + BLOCK_TABLE_L2, s->cluster_size); /* Allocate the refcount block itself and mark it as used */ int64_t new_block = alloc_clusters_noref(bs, s->cluster_size); @@ -217,8 +219,8 @@ static int alloc_refcount_block(BlockDriverState *bs, if (in_same_refcount_block(s, new_block, cluster_index << s->cluster_bits)) { /* Zero the new refcount block before updating it */ - ret = qcow2_cache_get_empty(bs, s->refcount_block_cache, new_block, - (void**) refcount_block); + ret = block_cache_get_empty(bs, s->refcount_block_cache, new_block, + (void **) refcount_block, BLOCK_TABLE_REF, s->cluster_size); if (ret < 0) { goto fail_block; } @@ -241,8 +243,8 @@ static int alloc_refcount_block(BlockDriverState *bs, /* 
Initialize the new refcount block only after updating its refcount, * update_refcount uses the refcount cache itself */ - ret = qcow2_cache_get_empty(bs, s->refcount_block_cache, new_block, - (void**) refcount_block); + ret = block_cache_get_empty(bs, s->refcount_block_cache, new_block, + (void **) refcount_block, BLOCK_TABLE_REF, s->cluster_size); if (ret < 0) { goto fail_block; } @@ -252,8 +254,9 @@ static int alloc_refcount_block(BlockDriverState *bs, /* Now the new refcount block needs to be written to disk */ BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_ALLOC_WRITE); - qcow2_cache_entry_mark_dirty(s->refcount_block_cache, *refcount_block); - ret = qcow2_cache_flush(bs, s->refcount_block_cache); + block_cache_entry_mark_dirty(s->refcount_block_cache, *refcount_block); + ret = block_cache_flush(bs, s->refcount_block_cache, + BLOCK_TABLE_REF, s->cluster_size); if (ret < 0) { goto fail_block; } @@ -273,7 +276,8 @@ static int alloc_refcount_block(BlockDriverState *bs, return 0; } - ret = qcow2_cache_put(bs, s->refcount_block_cache, (void**) refcount_block); + ret = block_cache_put(bs, s->refcount_block_cache, + (void **) refcount_block, BLOCK_TABLE_REF); if (ret < 0) { goto fail_block; } @@ -406,7 +410,8 @@ fail_table: g_free(new_table); fail_block: if (*refcount_block != NULL) { - qcow2_cache_put(bs, s->refcount_block_cache, (void**) refcount_block); + block_cache_put(bs, s->refcount_block_cache, + (void **) refcount_block, BLOCK_TABLE_REF); } return ret; } @@ -432,8 +437,8 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, } if (addend < 0) { - qcow2_cache_set_dependency(bs, s->refcount_block_cache, - s->l2_table_cache); + block_cache_set_dependency(bs, s->refcount_block_cache, BLOCK_TABLE_REF, + s->l2_table_cache, s->cluster_size); } start = offset & ~(s->cluster_size - 1); @@ -449,8 +454,8 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, /* Load the refcount block and allocate it if needed */ if (table_index != 
old_table_index) { if (refcount_block) { - ret = qcow2_cache_put(bs, s->refcount_block_cache, - (void**) &refcount_block); + ret = block_cache_put(bs, s->refcount_block_cache, + (void **) &refcount_block, BLOCK_TABLE_REF); if (ret < 0) { goto fail; } @@ -463,7 +468,7 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, } old_table_index = table_index; - qcow2_cache_entry_mark_dirty(s->refcount_block_cache, refcount_block); + block_cache_entry_mark_dirty(s->refcount_block_cache, refcount_block); /* we can update the count and save it */ block_index = cluster_index & @@ -486,8 +491,8 @@ fail: /* Write last changed block to disk */ if (refcount_block) { int wret; - wret = qcow2_cache_put(bs, s->refcount_block_cache, - (void**) &refcount_block); + wret = block_cache_put(bs, s->refcount_block_cache, + (void **) &refcount_block, BLOCK_TABLE_REF); if (wret < 0) { return ret < 0 ? ret : wret; } @@ -763,8 +768,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, old_l2_offset = l2_offset; l2_offset &= L1E_OFFSET_MASK; - ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, - (void**) &l2_table); + ret = block_cache_get(bs, s->l2_table_cache, l2_offset, + (void **) &l2_table, BLOCK_TABLE_L2, s->cluster_size); if (ret < 0) { goto fail; } @@ -811,16 +816,18 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, } if (offset != old_offset) { if (addend > 0) { - qcow2_cache_set_dependency(bs, s->l2_table_cache, - s->refcount_block_cache); + block_cache_set_dependency(bs, s->l2_table_cache, + BLOCK_TABLE_L2, s->refcount_block_cache, + s->cluster_size); } l2_table[j] = cpu_to_be64(offset); - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); } } } - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); + ret = block_cache_put(bs, s->l2_table_cache, + (void **) &l2_table, BLOCK_TABLE_L2); if (ret < 0) { goto fail; } @@ -847,7 +854,8 @@ int 
qcow2_update_snapshot_refcount(BlockDriverState *bs, ret = 0; fail: if (l2_table) { - qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); + block_cache_put(bs, s->l2_table_cache, + (void **) &l2_table, BLOCK_TABLE_L2); } /* Update L1 only if it isn't deleted anyway (addend = -1) */ diff --git a/block/qcow2.c b/block/qcow2.c index fd5e214..b89d312 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -30,6 +30,7 @@ #include "qemu-error.h" #include "qerror.h" #include "trace.h" +#include "block-cache.h" /* Differences with QCOW: @@ -415,8 +416,9 @@ static int qcow2_open(BlockDriverState *bs, int flags) } /* alloc L2 table/refcount block cache */ - s->l2_table_cache = qcow2_cache_create(bs, L2_CACHE_SIZE); - s->refcount_block_cache = qcow2_cache_create(bs, REFCOUNT_CACHE_SIZE); + s->l2_table_cache = block_cache_create(bs, L2_CACHE_SIZE, s->cluster_size); + s->refcount_block_cache = + block_cache_create(bs, REFCOUNT_CACHE_SIZE, s->cluster_size); s->cluster_cache = g_malloc(s->cluster_size); /* one more sector for decompressed data alignment */ @@ -500,7 +502,7 @@ static int qcow2_open(BlockDriverState *bs, int flags) qcow2_refcount_close(bs); g_free(s->l1_table); if (s->l2_table_cache) { - qcow2_cache_destroy(bs, s->l2_table_cache); + block_cache_destroy(bs, s->l2_table_cache, BLOCK_TABLE_L2); } g_free(s->cluster_cache); qemu_vfree(s->cluster_data); @@ -860,13 +862,13 @@ static void qcow2_close(BlockDriverState *bs) BDRVQcowState *s = bs->opaque; g_free(s->l1_table); - qcow2_cache_flush(bs, s->l2_table_cache); - qcow2_cache_flush(bs, s->refcount_block_cache); - + block_cache_flush(bs, s->l2_table_cache, + BLOCK_TABLE_L2, s->cluster_size); + block_cache_flush(bs, s->refcount_block_cache, + BLOCK_TABLE_REF, s->cluster_size); qcow2_mark_clean(bs); - - qcow2_cache_destroy(bs, s->l2_table_cache); - qcow2_cache_destroy(bs, s->refcount_block_cache); + block_cache_destroy(bs, s->l2_table_cache, BLOCK_TABLE_L2); + block_cache_destroy(bs, s->refcount_block_cache, 
BLOCK_TABLE_REF); g_free(s->unknown_header_fields); cleanup_unknown_header_ext(bs); @@ -1339,8 +1341,6 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options) options->value.s); return -EINVAL; } - } else if (!strcmp(options->name, BLOCK_OPT_LAZY_REFCOUNTS)) { - flags |= options->value.n ? BLOCK_FLAG_LAZY_REFCOUNTS : 0; } options++; } @@ -1537,18 +1537,18 @@ static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs) int ret; qemu_co_mutex_lock(&s->lock); - ret = qcow2_cache_flush(bs, s->l2_table_cache); + ret = block_cache_flush(bs, s->l2_table_cache, + BLOCK_TABLE_L2, s->cluster_size); if (ret < 0) { qemu_co_mutex_unlock(&s->lock); return ret; } - if (qcow2_need_accurate_refcounts(s)) { - ret = qcow2_cache_flush(bs, s->refcount_block_cache); - if (ret < 0) { - qemu_co_mutex_unlock(&s->lock); - return ret; - } + ret = block_cache_flush(bs, s->refcount_block_cache, + BLOCK_TABLE_REF, s->cluster_size); + if (ret < 0) { + qemu_co_mutex_unlock(&s->lock); + return ret; } qemu_co_mutex_unlock(&s->lock); diff --git a/block/qcow2.h b/block/qcow2.h index b4eb654..cb6fd7a 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -27,6 +27,7 @@ #include "aes.h" #include "qemu-coroutine.h" +#include "block-cache.h" //#define DEBUG_ALLOC //#define DEBUG_ALLOC2 @@ -94,8 +95,6 @@ typedef struct QCowSnapshot { uint64_t vm_clock_nsec; } QCowSnapshot; -struct Qcow2Cache; -typedef struct Qcow2Cache Qcow2Cache; typedef struct Qcow2UnknownHeaderExtension { uint32_t magic; @@ -146,8 +145,8 @@ typedef struct BDRVQcowState { uint64_t l1_table_offset; uint64_t *l1_table; - Qcow2Cache* l2_table_cache; - Qcow2Cache* refcount_block_cache; + BlockCache *l2_table_cache; + BlockCache *refcount_block_cache; uint8_t *cluster_cache; uint8_t *cluster_data; @@ -316,21 +315,4 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name); void qcow2_free_snapshots(BlockDriverState *bs); int qcow2_read_snapshots(BlockDriverState *bs); - -/* qcow2-cache.c 
functions */ -Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables); -int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c); - -void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table); -int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c); -int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, - Qcow2Cache *dependency); -void qcow2_cache_depends_on_flush(Qcow2Cache *c); - -int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, - void **table); -int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, - void **table); -int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table); - #endif diff --git a/trace-events b/trace-events index 6b12f83..52b6438 100644 --- a/trace-events +++ b/trace-events @@ -439,12 +439,13 @@ qcow2_l2_allocate_write_l2(void *bs, int l1_index) "bs %p l1_index %d" qcow2_l2_allocate_write_l1(void *bs, int l1_index) "bs %p l1_index %d" qcow2_l2_allocate_done(void *bs, int l1_index, int ret) "bs %p l1_index %d ret %d" -qcow2_cache_get(void *co, int c, uint64_t offset, bool read_from_disk) "co %p is_l2_cache %d offset %" PRIx64 " read_from_disk %d" -qcow2_cache_get_replace_entry(void *co, int c, int i) "co %p is_l2_cache %d index %d" -qcow2_cache_get_read(void *co, int c, int i) "co %p is_l2_cache %d index %d" -qcow2_cache_get_done(void *co, int c, int i) "co %p is_l2_cache %d index %d" -qcow2_cache_flush(void *co, int c) "co %p is_l2_cache %d" -qcow2_cache_entry_flush(void *co, int c, int i) "co %p is_l2_cache %d index %d" +# block/block-cache.c +block_cache_get(void *co, int c, uint64_t offset, bool read_from_disk) "co %p is_l2_cache %d offset %" PRIx64 " read_from_disk %d" +block_cache_get_replace_entry(void *co, int c, int i) "co %p is_l2_cache %d index %d" +block_cache_get_read(void *co, int c, int i) "co %p is_l2_cache %d index %d" +block_cache_get_done(void *co, int c, int i) "co %p is_l2_cache %d index %d" +block_cache_flush(void *co, int c) 
"co %p is_l2_cache %d" +block_cache_entry_flush(void *co, int c, int i) "co %p is_l2_cache %d index %d" # block/qed-l2-cache.c qed_alloc_l2_cache_entry(void *l2_cache, void *entry) "l2_cache %p entry %p" -- 1.7.1
* Re: [Qemu-devel] [PATCH V12 4/6] rename qcow2-cache.c to block-cache.c
2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 4/6] rename qcow2-cache.c to block-cache.c Dong Xu Wang
@ 2012-09-06 17:52 ` Michael Roth
2012-09-10 2:14 ` Dong Xu Wang
2012-09-11 8:41 ` Kevin Wolf
From: Michael Roth @ 2012-09-06 17:52 UTC
To: Dong Xu Wang; +Cc: kwolf, qemu-devel

On Fri, Aug 10, 2012 at 11:39:43PM +0800, Dong Xu Wang wrote:
> add-cow and qcow2 file format will share the same cache code, so rename
> block-cache.c to block-cache.c. And related structure and qcow2 code also

"qcow2-cache.c to block-cache.c"

But I've scanned through the rest of your patches and can't seem to find where block-cache.c gets introduced. Did you forget to git add it?

> are changed.
>
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> ---
> block.h | 3 +
> block/Makefile.objs | 3 +-
> block/qcow2-cache.c | 323 ------------------------------------------------
> block/qcow2-cluster.c | 66 ++++++----
> block/qcow2-refcount.c | 66 ++++++-----
> block/qcow2.c | 36 +++---
> block/qcow2.h | 24 +---
> trace-events | 13 +-
> 8 files changed, 109 insertions(+), 425 deletions(-)
> delete mode 100644 block/qcow2-cache.c
>
> diff --git a/block.h b/block.h
> index e5dfcd7..c325661 100644
> --- a/block.h
> +++ b/block.h
> @@ -401,6 +401,9 @@ typedef enum {
> BLKDBG_CLUSTER_ALLOC_BYTES,
> BLKDBG_CLUSTER_FREE,
>
> + BLKDBG_ADD_COW_UPDATE,
> + BLKDBG_ADD_COW_LOAD,
> +
> BLKDBG_EVENT_MAX,
> } BlkDebugEvent;
>
> diff --git a/block/Makefile.objs b/block/Makefile.objs
> index b5754d3..23bdfc8 100644
> --- a/block/Makefile.objs
> +++ b/block/Makefile.objs
> @@ -1,7 +1,8 @@
> block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o
> -block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
> +block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
> block-obj-y += qed.o qed-gencb.o
qed-l2-cache.o qed-table.o qed-cluster.o > block-obj-y += qed-check.o > +block-obj-y += block-cache.o > block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o > block-obj-y += stream.o > block-obj-$(CONFIG_WIN32) += raw-win32.o > diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c > deleted file mode 100644 > index 2d4322a..0000000 > --- a/block/qcow2-cache.c > +++ /dev/null > @@ -1,323 +0,0 @@ > -/* > - * L2/refcount table cache for the QCOW2 format > - * > - * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com> > - * > - * Permission is hereby granted, free of charge, to any person obtaining a copy > - * of this software and associated documentation files (the "Software"), to deal > - * in the Software without restriction, including without limitation the rights > - * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell > - * copies of the Software, and to permit persons to whom the Software is > - * furnished to do so, subject to the following conditions: > - * > - * The above copyright notice and this permission notice shall be included in > - * all copies or substantial portions of the Software. > - * > - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, > - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN > - * THE SOFTWARE. 
> - */ > - > -#include "block_int.h" > -#include "qemu-common.h" > -#include "qcow2.h" > -#include "trace.h" > - > -typedef struct Qcow2CachedTable { > - void* table; > - int64_t offset; > - bool dirty; > - int cache_hits; > - int ref; > -} Qcow2CachedTable; > - > -struct Qcow2Cache { > - Qcow2CachedTable* entries; > - struct Qcow2Cache* depends; > - int size; > - bool depends_on_flush; > -}; > - > -Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables) > -{ > - BDRVQcowState *s = bs->opaque; > - Qcow2Cache *c; > - int i; > - > - c = g_malloc0(sizeof(*c)); > - c->size = num_tables; > - c->entries = g_malloc0(sizeof(*c->entries) * num_tables); > - > - for (i = 0; i < c->size; i++) { > - c->entries[i].table = qemu_blockalign(bs, s->cluster_size); > - } > - > - return c; > -} > - > -int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c) > -{ > - int i; > - > - for (i = 0; i < c->size; i++) { > - assert(c->entries[i].ref == 0); > - qemu_vfree(c->entries[i].table); > - } > - > - g_free(c->entries); > - g_free(c); > - > - return 0; > -} > - > -static int qcow2_cache_flush_dependency(BlockDriverState *bs, Qcow2Cache *c) > -{ > - int ret; > - > - ret = qcow2_cache_flush(bs, c->depends); > - if (ret < 0) { > - return ret; > - } > - > - c->depends = NULL; > - c->depends_on_flush = false; > - > - return 0; > -} > - > -static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i) > -{ > - BDRVQcowState *s = bs->opaque; > - int ret = 0; > - > - if (!c->entries[i].dirty || !c->entries[i].offset) { > - return 0; > - } > - > - trace_qcow2_cache_entry_flush(qemu_coroutine_self(), > - c == s->l2_table_cache, i); > - > - if (c->depends) { > - ret = qcow2_cache_flush_dependency(bs, c); > - } else if (c->depends_on_flush) { > - ret = bdrv_flush(bs->file); > - if (ret >= 0) { > - c->depends_on_flush = false; > - } > - } > - > - if (ret < 0) { > - return ret; > - } > - > - if (c == s->refcount_block_cache) { > - BLKDBG_EVENT(bs->file, 
BLKDBG_REFBLOCK_UPDATE_PART); > - } else if (c == s->l2_table_cache) { > - BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE); > - } > - > - ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table, > - s->cluster_size); > - if (ret < 0) { > - return ret; > - } > - > - c->entries[i].dirty = false; > - > - return 0; > -} > - > -int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c) > -{ > - BDRVQcowState *s = bs->opaque; > - int result = 0; > - int ret; > - int i; > - > - trace_qcow2_cache_flush(qemu_coroutine_self(), c == s->l2_table_cache); > - > - for (i = 0; i < c->size; i++) { > - ret = qcow2_cache_entry_flush(bs, c, i); > - if (ret < 0 && result != -ENOSPC) { > - result = ret; > - } > - } > - > - if (result == 0) { > - ret = bdrv_flush(bs->file); > - if (ret < 0) { > - result = ret; > - } > - } > - > - return result; > -} > - > -int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, > - Qcow2Cache *dependency) > -{ > - int ret; > - > - if (dependency->depends) { > - ret = qcow2_cache_flush_dependency(bs, dependency); > - if (ret < 0) { > - return ret; > - } > - } > - > - if (c->depends && (c->depends != dependency)) { > - ret = qcow2_cache_flush_dependency(bs, c); > - if (ret < 0) { > - return ret; > - } > - } > - > - c->depends = dependency; > - return 0; > -} > - > -void qcow2_cache_depends_on_flush(Qcow2Cache *c) > -{ > - c->depends_on_flush = true; > -} > - > -static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c) > -{ > - int i; > - int min_count = INT_MAX; > - int min_index = -1; > - > - > - for (i = 0; i < c->size; i++) { > - if (c->entries[i].ref) { > - continue; > - } > - > - if (c->entries[i].cache_hits < min_count) { > - min_index = i; > - min_count = c->entries[i].cache_hits; > - } > - > - /* Give newer hits priority */ > - /* TODO Check how to optimize the replacement strategy */ > - c->entries[i].cache_hits /= 2; > - } > - > - if (min_index == -1) { > - /* This can't happen in current synchronous code, but leave the check 
> - * here as a reminder for whoever starts using AIO with the cache */ > - abort(); > - } > - return min_index; > -} > - > -static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c, > - uint64_t offset, void **table, bool read_from_disk) > -{ > - BDRVQcowState *s = bs->opaque; > - int i; > - int ret; > - > - trace_qcow2_cache_get(qemu_coroutine_self(), c == s->l2_table_cache, > - offset, read_from_disk); > - > - /* Check if the table is already cached */ > - for (i = 0; i < c->size; i++) { > - if (c->entries[i].offset == offset) { > - goto found; > - } > - } > - > - /* If not, write a table back and replace it */ > - i = qcow2_cache_find_entry_to_replace(c); > - trace_qcow2_cache_get_replace_entry(qemu_coroutine_self(), > - c == s->l2_table_cache, i); > - if (i < 0) { > - return i; > - } > - > - ret = qcow2_cache_entry_flush(bs, c, i); > - if (ret < 0) { > - return ret; > - } > - > - trace_qcow2_cache_get_read(qemu_coroutine_self(), > - c == s->l2_table_cache, i); > - c->entries[i].offset = 0; > - if (read_from_disk) { > - if (c == s->l2_table_cache) { > - BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD); > - } > - > - ret = bdrv_pread(bs->file, offset, c->entries[i].table, s->cluster_size); > - if (ret < 0) { > - return ret; > - } > - } > - > - /* Give the table some hits for the start so that it won't be replaced > - * immediately. The number 32 is completely arbitrary. 
*/ > - c->entries[i].cache_hits = 32; > - c->entries[i].offset = offset; > - > - /* And return the right table */ > -found: > - c->entries[i].cache_hits++; > - c->entries[i].ref++; > - *table = c->entries[i].table; > - > - trace_qcow2_cache_get_done(qemu_coroutine_self(), > - c == s->l2_table_cache, i); > - > - return 0; > -} > - > -int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, > - void **table) > -{ > - return qcow2_cache_do_get(bs, c, offset, table, true); > -} > - > -int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, > - void **table) > -{ > - return qcow2_cache_do_get(bs, c, offset, table, false); > -} > - > -int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table) > -{ > - int i; > - > - for (i = 0; i < c->size; i++) { > - if (c->entries[i].table == *table) { > - goto found; > - } > - } > - return -ENOENT; > - > -found: > - c->entries[i].ref--; > - *table = NULL; > - > - assert(c->entries[i].ref >= 0); > - return 0; > -} > - > -void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table) > -{ > - int i; > - > - for (i = 0; i < c->size; i++) { > - if (c->entries[i].table == table) { > - goto found; > - } > - } > - abort(); > - > -found: > - c->entries[i].dirty = true; > -} > diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c > index e179211..335dc7a 100644 > --- a/block/qcow2-cluster.c > +++ b/block/qcow2-cluster.c > @@ -28,6 +28,7 @@ > #include "block_int.h" > #include "block/qcow2.h" > #include "trace.h" > +#include "block-cache.h" > > int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size) > { > @@ -69,7 +70,8 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size) > return new_l1_table_offset; > } > > - ret = qcow2_cache_flush(bs, s->refcount_block_cache); > + ret = block_cache_flush(bs, s->refcount_block_cache, > + BLOCK_TABLE_REF, s->cluster_size); > if (ret < 0) { > goto fail; > } > @@ -119,7 +121,8 @@ static int 
l2_load(BlockDriverState *bs, uint64_t l2_offset, > BDRVQcowState *s = bs->opaque; > int ret; > > - ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, (void**) l2_table); > + ret = block_cache_get(bs, s->l2_table_cache, l2_offset, > + (void **) l2_table, BLOCK_TABLE_L2, s->cluster_size); > > return ret; > } > @@ -180,7 +183,8 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) > return l2_offset; > } > > - ret = qcow2_cache_flush(bs, s->refcount_block_cache); > + ret = block_cache_flush(bs, s->refcount_block_cache, > + BLOCK_TABLE_REF, s->cluster_size); > if (ret < 0) { > goto fail; > } > @@ -188,7 +192,8 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) > /* allocate a new entry in the l2 cache */ > > trace_qcow2_l2_allocate_get_empty(bs, l1_index); > - ret = qcow2_cache_get_empty(bs, s->l2_table_cache, l2_offset, (void**) table); > + ret = block_cache_get_empty(bs, s->l2_table_cache, l2_offset, > + (void **) table, BLOCK_TABLE_L2, s->cluster_size); > if (ret < 0) { > return ret; > } > @@ -203,16 +208,17 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) > > /* if there was an old l2 table, read it from the disk */ > BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_COW_READ); > - ret = qcow2_cache_get(bs, s->l2_table_cache, > + ret = block_cache_get(bs, s->l2_table_cache, > old_l2_offset & L1E_OFFSET_MASK, > - (void**) &old_table); > + (void **) &old_table, BLOCK_TABLE_L2, s->cluster_size); > if (ret < 0) { > goto fail; > } > > memcpy(l2_table, old_table, s->cluster_size); > > - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &old_table); > + ret = block_cache_put(bs, s->l2_table_cache, > + (void **) &old_table, BLOCK_TABLE_L2); > if (ret < 0) { > goto fail; > } > @@ -222,8 +228,9 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) > BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE); > > trace_qcow2_l2_allocate_write_l2(bs, l1_index); > - 
qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); > - ret = qcow2_cache_flush(bs, s->l2_table_cache); > + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); > + ret = block_cache_flush(bs, s->l2_table_cache, > + BLOCK_TABLE_L2, s->cluster_size); > if (ret < 0) { > goto fail; > } > @@ -242,7 +249,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) > > fail: > trace_qcow2_l2_allocate_done(bs, l1_index, ret); > - qcow2_cache_put(bs, s->l2_table_cache, (void**) table); > + block_cache_put(bs, s->l2_table_cache, (void **) table, BLOCK_TABLE_L2); > s->l1_table[l1_index] = old_l2_offset; > return ret; > } > @@ -475,7 +482,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset, > abort(); > } > > - qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); > + block_cache_put(bs, s->l2_table_cache, (void **) &l2_table, BLOCK_TABLE_L2); > > nb_available = (c * s->cluster_sectors); > > @@ -584,13 +591,15 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs, > * allocated. 
*/ > cluster_offset = be64_to_cpu(l2_table[l2_index]); > if (cluster_offset & L2E_OFFSET_MASK) { > - qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); > + block_cache_put(bs, s->l2_table_cache, > + (void **) &l2_table, BLOCK_TABLE_L2); > return 0; > } > > cluster_offset = qcow2_alloc_bytes(bs, compressed_size); > if (cluster_offset < 0) { > - qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); > + block_cache_put(bs, s->l2_table_cache, > + (void **) &l2_table, BLOCK_TABLE_L2); > return 0; > } > > @@ -605,9 +614,10 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs, > /* compressed clusters never have the copied flag */ > > BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE_COMPRESSED); > - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); > + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); > l2_table[l2_index] = cpu_to_be64(cluster_offset); > - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); > + ret = block_cache_put(bs, s->l2_table_cache, > + (void **) &l2_table, BLOCK_TABLE_L2); > if (ret < 0) { > return 0; > } > @@ -659,18 +669,16 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m) > * handled. 
> */ > if (cow) { > - qcow2_cache_depends_on_flush(s->l2_table_cache); > + block_cache_depends_on_flush(s->l2_table_cache); > } > > - if (qcow2_need_accurate_refcounts(s)) { > - qcow2_cache_set_dependency(bs, s->l2_table_cache, > - s->refcount_block_cache); > - } > + block_cache_set_dependency(bs, s->l2_table_cache, BLOCK_TABLE_L2, > + s->refcount_block_cache, s->cluster_size); > ret = get_cluster_table(bs, m->offset, &l2_table, &l2_index); > if (ret < 0) { > goto err; > } > - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); > + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); > > for (i = 0; i < m->nb_clusters; i++) { > /* if two concurrent writes happen to the same unallocated cluster > @@ -687,7 +695,8 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m) > } > > > - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); > + ret = block_cache_put(bs, s->l2_table_cache, > + (void **) &l2_table, BLOCK_TABLE_L2); > if (ret < 0) { > goto err; > } > @@ -913,7 +922,8 @@ again: > * request to complete. If we still had the reference, we could use up the > * whole cache with sleeping requests. 
> */ > - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); > + ret = block_cache_put(bs, s->l2_table_cache, > + (void **) &l2_table, BLOCK_TABLE_L2); > if (ret < 0) { > return ret; > } > @@ -1077,14 +1087,15 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset, > } > > /* First remove L2 entries */ > - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); > + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); > l2_table[l2_index + i] = cpu_to_be64(0); > > /* Then decrease the refcount */ > qcow2_free_any_clusters(bs, old_offset, 1); > } > > - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); > + ret = block_cache_put(bs, s->l2_table_cache, > + (void **) &l2_table, BLOCK_TABLE_L2); > if (ret < 0) { > return ret; > } > @@ -1154,7 +1165,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset, > old_offset = be64_to_cpu(l2_table[l2_index + i]); > > /* Update L2 entries */ > - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); > + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); > if (old_offset & QCOW_OFLAG_COMPRESSED) { > l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO); > qcow2_free_any_clusters(bs, old_offset, 1); > @@ -1163,7 +1174,8 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset, > } > } > > - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); > + ret = block_cache_put(bs, s->l2_table_cache, > + (void **) &l2_table, BLOCK_TABLE_L2); > if (ret < 0) { > return ret; > } > diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c > index 5e3f915..728bfc1 100644 > --- a/block/qcow2-refcount.c > +++ b/block/qcow2-refcount.c > @@ -25,6 +25,7 @@ > #include "qemu-common.h" > #include "block_int.h" > #include "block/qcow2.h" > +#include "block-cache.h" > > static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size); > static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, > @@ -71,8 +72,8 @@ static int 
load_refcount_block(BlockDriverState *bs, > int ret; > > BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_LOAD); > - ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset, > - refcount_block); > + ret = block_cache_get(bs, s->refcount_block_cache, refcount_block_offset, > + refcount_block, BLOCK_TABLE_REF, s->cluster_size); > > return ret; > } > @@ -98,8 +99,8 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index) > if (!refcount_block_offset) > return 0; > > - ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset, > - (void**) &refcount_block); > + ret = block_cache_get(bs, s->refcount_block_cache, refcount_block_offset, > + (void **) &refcount_block, BLOCK_TABLE_REF, s->cluster_size); > if (ret < 0) { > return ret; > } > @@ -108,8 +109,8 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index) > ((1 << (s->cluster_bits - REFCOUNT_SHIFT)) - 1); > refcount = be16_to_cpu(refcount_block[block_index]); > > - ret = qcow2_cache_put(bs, s->refcount_block_cache, > - (void**) &refcount_block); > + ret = block_cache_put(bs, s->refcount_block_cache, > + (void **) &refcount_block, BLOCK_TABLE_REF); > if (ret < 0) { > return ret; > } > @@ -201,7 +202,8 @@ static int alloc_refcount_block(BlockDriverState *bs, > *refcount_block = NULL; > > /* We write to the refcount table, so we might depend on L2 tables */ > - qcow2_cache_flush(bs, s->l2_table_cache); > + block_cache_flush(bs, s->l2_table_cache, > + BLOCK_TABLE_L2, s->cluster_size); > > /* Allocate the refcount block itself and mark it as used */ > int64_t new_block = alloc_clusters_noref(bs, s->cluster_size); > @@ -217,8 +219,8 @@ static int alloc_refcount_block(BlockDriverState *bs, > > if (in_same_refcount_block(s, new_block, cluster_index << s->cluster_bits)) { > /* Zero the new refcount block before updating it */ > - ret = qcow2_cache_get_empty(bs, s->refcount_block_cache, new_block, > - (void**) refcount_block); > + ret = block_cache_get_empty(bs, 
s->refcount_block_cache, new_block, > + (void **) refcount_block, BLOCK_TABLE_REF, s->cluster_size); > if (ret < 0) { > goto fail_block; > } > @@ -241,8 +243,8 @@ static int alloc_refcount_block(BlockDriverState *bs, > > /* Initialize the new refcount block only after updating its refcount, > * update_refcount uses the refcount cache itself */ > - ret = qcow2_cache_get_empty(bs, s->refcount_block_cache, new_block, > - (void**) refcount_block); > + ret = block_cache_get_empty(bs, s->refcount_block_cache, new_block, > + (void **) refcount_block, BLOCK_TABLE_REF, s->cluster_size); > if (ret < 0) { > goto fail_block; > } > @@ -252,8 +254,9 @@ static int alloc_refcount_block(BlockDriverState *bs, > > /* Now the new refcount block needs to be written to disk */ > BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_ALLOC_WRITE); > - qcow2_cache_entry_mark_dirty(s->refcount_block_cache, *refcount_block); > - ret = qcow2_cache_flush(bs, s->refcount_block_cache); > + block_cache_entry_mark_dirty(s->refcount_block_cache, *refcount_block); > + ret = block_cache_flush(bs, s->refcount_block_cache, > + BLOCK_TABLE_REF, s->cluster_size); > if (ret < 0) { > goto fail_block; > } > @@ -273,7 +276,8 @@ static int alloc_refcount_block(BlockDriverState *bs, > return 0; > } > > - ret = qcow2_cache_put(bs, s->refcount_block_cache, (void**) refcount_block); > + ret = block_cache_put(bs, s->refcount_block_cache, > + (void **) refcount_block, BLOCK_TABLE_REF); > if (ret < 0) { > goto fail_block; > } > @@ -406,7 +410,8 @@ fail_table: > g_free(new_table); > fail_block: > if (*refcount_block != NULL) { > - qcow2_cache_put(bs, s->refcount_block_cache, (void**) refcount_block); > + block_cache_put(bs, s->refcount_block_cache, > + (void **) refcount_block, BLOCK_TABLE_REF); > } > return ret; > } > @@ -432,8 +437,8 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, > } > > if (addend < 0) { > - qcow2_cache_set_dependency(bs, s->refcount_block_cache, > - s->l2_table_cache); > + 
block_cache_set_dependency(bs, s->refcount_block_cache, BLOCK_TABLE_REF, > + s->l2_table_cache, s->cluster_size); > } > > start = offset & ~(s->cluster_size - 1); > @@ -449,8 +454,8 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, > /* Load the refcount block and allocate it if needed */ > if (table_index != old_table_index) { > if (refcount_block) { > - ret = qcow2_cache_put(bs, s->refcount_block_cache, > - (void**) &refcount_block); > + ret = block_cache_put(bs, s->refcount_block_cache, > + (void **) &refcount_block, BLOCK_TABLE_REF); > if (ret < 0) { > goto fail; > } > @@ -463,7 +468,7 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, > } > old_table_index = table_index; > > - qcow2_cache_entry_mark_dirty(s->refcount_block_cache, refcount_block); > + block_cache_entry_mark_dirty(s->refcount_block_cache, refcount_block); > > /* we can update the count and save it */ > block_index = cluster_index & > @@ -486,8 +491,8 @@ fail: > /* Write last changed block to disk */ > if (refcount_block) { > int wret; > - wret = qcow2_cache_put(bs, s->refcount_block_cache, > - (void**) &refcount_block); > + wret = block_cache_put(bs, s->refcount_block_cache, > + (void **) &refcount_block, BLOCK_TABLE_REF); > if (wret < 0) { > return ret < 0 ? 
ret : wret; > } > @@ -763,8 +768,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, > old_l2_offset = l2_offset; > l2_offset &= L1E_OFFSET_MASK; > > - ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, > - (void**) &l2_table); > + ret = block_cache_get(bs, s->l2_table_cache, l2_offset, > + (void **) &l2_table, BLOCK_TABLE_L2, s->cluster_size); > if (ret < 0) { > goto fail; > } > @@ -811,16 +816,18 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, > } > if (offset != old_offset) { > if (addend > 0) { > - qcow2_cache_set_dependency(bs, s->l2_table_cache, > - s->refcount_block_cache); > + block_cache_set_dependency(bs, s->l2_table_cache, > + BLOCK_TABLE_L2, s->refcount_block_cache, > + s->cluster_size); > } > l2_table[j] = cpu_to_be64(offset); > - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); > + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); > } > } > } > > - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); > + ret = block_cache_put(bs, s->l2_table_cache, > + (void **) &l2_table, BLOCK_TABLE_L2); > if (ret < 0) { > goto fail; > } > @@ -847,7 +854,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, > ret = 0; > fail: > if (l2_table) { > - qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); > + block_cache_put(bs, s->l2_table_cache, > + (void **) &l2_table, BLOCK_TABLE_L2); > } > > /* Update L1 only if it isn't deleted anyway (addend = -1) */ > diff --git a/block/qcow2.c b/block/qcow2.c > index fd5e214..b89d312 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -30,6 +30,7 @@ > #include "qemu-error.h" > #include "qerror.h" > #include "trace.h" > +#include "block-cache.h" > > /* > Differences with QCOW: > @@ -415,8 +416,9 @@ static int qcow2_open(BlockDriverState *bs, int flags) > } > > /* alloc L2 table/refcount block cache */ > - s->l2_table_cache = qcow2_cache_create(bs, L2_CACHE_SIZE); > - s->refcount_block_cache = qcow2_cache_create(bs, REFCOUNT_CACHE_SIZE); > + 
s->l2_table_cache = block_cache_create(bs, L2_CACHE_SIZE, s->cluster_size); > + s->refcount_block_cache = > + block_cache_create(bs, REFCOUNT_CACHE_SIZE, s->cluster_size); > > s->cluster_cache = g_malloc(s->cluster_size); > /* one more sector for decompressed data alignment */ > @@ -500,7 +502,7 @@ static int qcow2_open(BlockDriverState *bs, int flags) > qcow2_refcount_close(bs); > g_free(s->l1_table); > if (s->l2_table_cache) { > - qcow2_cache_destroy(bs, s->l2_table_cache); > + block_cache_destroy(bs, s->l2_table_cache, BLOCK_TABLE_L2); > } > g_free(s->cluster_cache); > qemu_vfree(s->cluster_data); > @@ -860,13 +862,13 @@ static void qcow2_close(BlockDriverState *bs) > BDRVQcowState *s = bs->opaque; > g_free(s->l1_table); > > - qcow2_cache_flush(bs, s->l2_table_cache); > - qcow2_cache_flush(bs, s->refcount_block_cache); > - > + block_cache_flush(bs, s->l2_table_cache, > + BLOCK_TABLE_L2, s->cluster_size); > + block_cache_flush(bs, s->refcount_block_cache, > + BLOCK_TABLE_REF, s->cluster_size); > qcow2_mark_clean(bs); > - > - qcow2_cache_destroy(bs, s->l2_table_cache); > - qcow2_cache_destroy(bs, s->refcount_block_cache); > + block_cache_destroy(bs, s->l2_table_cache, BLOCK_TABLE_L2); > + block_cache_destroy(bs, s->refcount_block_cache, BLOCK_TABLE_REF); > > g_free(s->unknown_header_fields); > cleanup_unknown_header_ext(bs); > @@ -1339,8 +1341,6 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options) > options->value.s); > return -EINVAL; > } > - } else if (!strcmp(options->name, BLOCK_OPT_LAZY_REFCOUNTS)) { > - flags |= options->value.n ? 
BLOCK_FLAG_LAZY_REFCOUNTS : 0; > } > options++; > } > @@ -1537,18 +1537,18 @@ static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs) > int ret; > > qemu_co_mutex_lock(&s->lock); > - ret = qcow2_cache_flush(bs, s->l2_table_cache); > + ret = block_cache_flush(bs, s->l2_table_cache, > + BLOCK_TABLE_L2, s->cluster_size); > if (ret < 0) { > qemu_co_mutex_unlock(&s->lock); > return ret; > } > > - if (qcow2_need_accurate_refcounts(s)) { > - ret = qcow2_cache_flush(bs, s->refcount_block_cache); > - if (ret < 0) { > - qemu_co_mutex_unlock(&s->lock); > - return ret; > - } > + ret = block_cache_flush(bs, s->refcount_block_cache, > + BLOCK_TABLE_REF, s->cluster_size); > + if (ret < 0) { > + qemu_co_mutex_unlock(&s->lock); > + return ret; > } > qemu_co_mutex_unlock(&s->lock); > > diff --git a/block/qcow2.h b/block/qcow2.h > index b4eb654..cb6fd7a 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -27,6 +27,7 @@ > > #include "aes.h" > #include "qemu-coroutine.h" > +#include "block-cache.h" > > //#define DEBUG_ALLOC > //#define DEBUG_ALLOC2 > @@ -94,8 +95,6 @@ typedef struct QCowSnapshot { > uint64_t vm_clock_nsec; > } QCowSnapshot; > > -struct Qcow2Cache; > -typedef struct Qcow2Cache Qcow2Cache; > > typedef struct Qcow2UnknownHeaderExtension { > uint32_t magic; > @@ -146,8 +145,8 @@ typedef struct BDRVQcowState { > uint64_t l1_table_offset; > uint64_t *l1_table; > > - Qcow2Cache* l2_table_cache; > - Qcow2Cache* refcount_block_cache; > + BlockCache *l2_table_cache; > + BlockCache *refcount_block_cache; > > uint8_t *cluster_cache; > uint8_t *cluster_data; > @@ -316,21 +315,4 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name); > > void qcow2_free_snapshots(BlockDriverState *bs); > int qcow2_read_snapshots(BlockDriverState *bs); > - > -/* qcow2-cache.c functions */ > -Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables); > -int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c); > - > -void 
qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table); > -int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c); > -int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, > - Qcow2Cache *dependency); > -void qcow2_cache_depends_on_flush(Qcow2Cache *c); > - > -int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, > - void **table); > -int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, > - void **table); > -int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table); > - > #endif > diff --git a/trace-events b/trace-events > index 6b12f83..52b6438 100644 > --- a/trace-events > +++ b/trace-events > @@ -439,12 +439,13 @@ qcow2_l2_allocate_write_l2(void *bs, int l1_index) "bs %p l1_index %d" > qcow2_l2_allocate_write_l1(void *bs, int l1_index) "bs %p l1_index %d" > qcow2_l2_allocate_done(void *bs, int l1_index, int ret) "bs %p l1_index %d ret %d" > > -qcow2_cache_get(void *co, int c, uint64_t offset, bool read_from_disk) "co %p is_l2_cache %d offset %" PRIx64 " read_from_disk %d" > -qcow2_cache_get_replace_entry(void *co, int c, int i) "co %p is_l2_cache %d index %d" > -qcow2_cache_get_read(void *co, int c, int i) "co %p is_l2_cache %d index %d" > -qcow2_cache_get_done(void *co, int c, int i) "co %p is_l2_cache %d index %d" > -qcow2_cache_flush(void *co, int c) "co %p is_l2_cache %d" > -qcow2_cache_entry_flush(void *co, int c, int i) "co %p is_l2_cache %d index %d" > +# block/block-cache.c > +block_cache_get(void *co, int c, uint64_t offset, bool read_from_disk) "co %p is_l2_cache %d offset %" PRIx64 " read_from_disk %d" > +block_cache_get_replace_entry(void *co, int c, int i) "co %p is_l2_cache %d index %d" > +block_cache_get_read(void *co, int c, int i) "co %p is_l2_cache %d index %d" > +block_cache_get_done(void *co, int c, int i) "co %p is_l2_cache %d index %d" > +block_cache_flush(void *co, int c) "co %p is_l2_cache %d" > +block_cache_entry_flush(void *co, int c, int i) "co %p 
is_l2_cache %d index %d" > > # block/qed-l2-cache.c > qed_alloc_l2_cache_entry(void *l2_cache, void *entry) "l2_cache %p entry %p" > -- > 1.7.1 > >
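For readers following the thread without the tree at hand: the qcow2-cache.c code that this patch deletes picks a replacement victim by hit count and decays those counts on every scan. A minimal standalone sketch of that policy, modeled on qcow2_cache_find_entry_to_replace() and qcow2_cache_do_get() as quoted in this thread, might look like the following. The Toy* names and TOY_CACHE_SIZE are hypothetical and exist only for this illustration; disk I/O, dirty tracking and the dependency flush are left out, and the get function returns a slot index instead of a table pointer.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define TOY_CACHE_SIZE 4    /* hypothetical size, not a QEMU constant */

typedef struct ToyEntry {
    uint64_t offset;        /* 0 marks an unused slot, as in the quoted code */
    int cache_hits;
    int ref;                /* number of outstanding get() calls */
} ToyEntry;

typedef struct ToyCache {
    ToyEntry entries[TOY_CACHE_SIZE];
} ToyCache;

/* Pick the unreferenced entry with the fewest hits. Every unreferenced
 * entry has its hit count halved during the scan, so older hits decay
 * and newer hits carry more weight. */
int toy_cache_find_victim(ToyCache *c)
{
    int min_count = INT_MAX;
    int min_index = -1;
    int i;

    for (i = 0; i < TOY_CACHE_SIZE; i++) {
        if (c->entries[i].ref) {
            continue;       /* pinned by a caller, cannot be replaced */
        }
        if (c->entries[i].cache_hits < min_count) {
            min_index = i;
            min_count = c->entries[i].cache_hits;
        }
        c->entries[i].cache_hits /= 2;
    }

    /* -1 means every slot is pinned; the quoted code calls abort() here */
    return min_index;
}

/* Look up offset; on a miss, take over a victim slot. Returns the slot
 * index with its reference count raised, or -1 if no slot is free. */
int toy_cache_get(ToyCache *c, uint64_t offset)
{
    int i;

    for (i = 0; i < TOY_CACHE_SIZE; i++) {
        if (c->entries[i].offset == offset) {
            goto found;
        }
    }

    i = toy_cache_find_victim(c);
    if (i < 0) {
        return -1;
    }

    /* Head start so a new entry is not evicted immediately; the quoted
     * code describes the value 32 as completely arbitrary. */
    c->entries[i].cache_hits = 32;
    c->entries[i].offset = offset;

found:
    c->entries[i].cache_hits++;
    c->entries[i].ref++;
    return i;
}

/* Drop one reference taken by toy_cache_get(). */
void toy_cache_put(ToyCache *c, int i)
{
    c->entries[i].ref--;
    assert(c->entries[i].ref >= 0);
}
```

In this sketch, as in the quoted code, an entry stays pinned until every get has been matched by a put, which is why the patch pairs each block_cache_get() or block_cache_get_empty() call with a block_cache_put() on both the success and the failure paths.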
* Re: [Qemu-devel] [PATCH V12 4/6] rename qcow2-cache.c to block-cache.c 2012-09-06 17:52 ` Michael Roth @ 2012-09-10 2:14 ` Dong Xu Wang 0 siblings, 0 replies; 25+ messages in thread From: Dong Xu Wang @ 2012-09-10 2:14 UTC (permalink / raw) To: Michael Roth; +Cc: kwolf, qemu-devel On Fri, Sep 7, 2012 at 1:52 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote: > On Fri, Aug 10, 2012 at 11:39:43PM +0800, Dong Xu Wang wrote: >> add-cow and qcow2 file format will share the same cache code, so rename >> block-cache.c to block-cache.c. And related structure and qcow2 code also > > "qcow2-cache.c to block-cache.c" > > But I've scanned through the rest of your patches and can't seem to find > where block-cache.c gets introduced. Did you forget to git add it? Really sorry for that, I forgot to add block-cache.c; I will add it in v13. > >> are changed. >> >> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> >> --- >> block.h | 3 + >> block/Makefile.objs | 3 +- >> block/qcow2-cache.c | 323 ------------------------------------------------ >> block/qcow2-cluster.c | 66 ++++++---- >> block/qcow2-refcount.c | 66 ++++++----- >> block/qcow2.c | 36 +++--- >> block/qcow2.h | 24 +--- >> trace-events | 13 +- >> 8 files changed, 109 insertions(+), 425 deletions(-) >> delete mode 100644 block/qcow2-cache.c >> >> diff --git a/block.h b/block.h >> index e5dfcd7..c325661 100644 >> --- a/block.h >> +++ b/block.h >> @@ -401,6 +401,9 @@ typedef enum { >> BLKDBG_CLUSTER_ALLOC_BYTES, >> BLKDBG_CLUSTER_FREE, >> >> + BLKDBG_ADD_COW_UPDATE, >> + BLKDBG_ADD_COW_LOAD, >> + >> BLKDBG_EVENT_MAX, >> } BlkDebugEvent; >> >> diff --git a/block/Makefile.objs b/block/Makefile.objs >> index b5754d3..23bdfc8 100644 >> --- a/block/Makefile.objs >> +++ b/block/Makefile.objs >> @@ -1,7 +1,8 @@ >> block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o >> -block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o >> +block-obj-y += qcow2.o 
qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o >> block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o >> block-obj-y += qed-check.o >> +block-obj-y += block-cache.o >> block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o >> block-obj-y += stream.o >> block-obj-$(CONFIG_WIN32) += raw-win32.o >> diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c >> deleted file mode 100644 >> index 2d4322a..0000000 >> --- a/block/qcow2-cache.c >> +++ /dev/null >> @@ -1,323 +0,0 @@ >> -/* >> - * L2/refcount table cache for the QCOW2 format >> - * >> - * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com> >> - * >> - * Permission is hereby granted, free of charge, to any person obtaining a copy >> - * of this software and associated documentation files (the "Software"), to deal >> - * in the Software without restriction, including without limitation the rights >> - * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell >> - * copies of the Software, and to permit persons to whom the Software is >> - * furnished to do so, subject to the following conditions: >> - * >> - * The above copyright notice and this permission notice shall be included in >> - * all copies or substantial portions of the Software. >> - * >> - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR >> - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >> - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL >> - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER >> - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, >> - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN >> - * THE SOFTWARE. 
>> - */ >> - >> -#include "block_int.h" >> -#include "qemu-common.h" >> -#include "qcow2.h" >> -#include "trace.h" >> - >> -typedef struct Qcow2CachedTable { >> - void* table; >> - int64_t offset; >> - bool dirty; >> - int cache_hits; >> - int ref; >> -} Qcow2CachedTable; >> - >> -struct Qcow2Cache { >> - Qcow2CachedTable* entries; >> - struct Qcow2Cache* depends; >> - int size; >> - bool depends_on_flush; >> -}; >> - >> -Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables) >> -{ >> - BDRVQcowState *s = bs->opaque; >> - Qcow2Cache *c; >> - int i; >> - >> - c = g_malloc0(sizeof(*c)); >> - c->size = num_tables; >> - c->entries = g_malloc0(sizeof(*c->entries) * num_tables); >> - >> - for (i = 0; i < c->size; i++) { >> - c->entries[i].table = qemu_blockalign(bs, s->cluster_size); >> - } >> - >> - return c; >> -} >> - >> -int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c) >> -{ >> - int i; >> - >> - for (i = 0; i < c->size; i++) { >> - assert(c->entries[i].ref == 0); >> - qemu_vfree(c->entries[i].table); >> - } >> - >> - g_free(c->entries); >> - g_free(c); >> - >> - return 0; >> -} >> - >> -static int qcow2_cache_flush_dependency(BlockDriverState *bs, Qcow2Cache *c) >> -{ >> - int ret; >> - >> - ret = qcow2_cache_flush(bs, c->depends); >> - if (ret < 0) { >> - return ret; >> - } >> - >> - c->depends = NULL; >> - c->depends_on_flush = false; >> - >> - return 0; >> -} >> - >> -static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i) >> -{ >> - BDRVQcowState *s = bs->opaque; >> - int ret = 0; >> - >> - if (!c->entries[i].dirty || !c->entries[i].offset) { >> - return 0; >> - } >> - >> - trace_qcow2_cache_entry_flush(qemu_coroutine_self(), >> - c == s->l2_table_cache, i); >> - >> - if (c->depends) { >> - ret = qcow2_cache_flush_dependency(bs, c); >> - } else if (c->depends_on_flush) { >> - ret = bdrv_flush(bs->file); >> - if (ret >= 0) { >> - c->depends_on_flush = false; >> - } >> - } >> - >> - if (ret < 0) { >> - return ret; 
>> - } >> - >> - if (c == s->refcount_block_cache) { >> - BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_UPDATE_PART); >> - } else if (c == s->l2_table_cache) { >> - BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE); >> - } >> - >> - ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table, >> - s->cluster_size); >> - if (ret < 0) { >> - return ret; >> - } >> - >> - c->entries[i].dirty = false; >> - >> - return 0; >> -} >> - >> -int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c) >> -{ >> - BDRVQcowState *s = bs->opaque; >> - int result = 0; >> - int ret; >> - int i; >> - >> - trace_qcow2_cache_flush(qemu_coroutine_self(), c == s->l2_table_cache); >> - >> - for (i = 0; i < c->size; i++) { >> - ret = qcow2_cache_entry_flush(bs, c, i); >> - if (ret < 0 && result != -ENOSPC) { >> - result = ret; >> - } >> - } >> - >> - if (result == 0) { >> - ret = bdrv_flush(bs->file); >> - if (ret < 0) { >> - result = ret; >> - } >> - } >> - >> - return result; >> -} >> - >> -int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, >> - Qcow2Cache *dependency) >> -{ >> - int ret; >> - >> - if (dependency->depends) { >> - ret = qcow2_cache_flush_dependency(bs, dependency); >> - if (ret < 0) { >> - return ret; >> - } >> - } >> - >> - if (c->depends && (c->depends != dependency)) { >> - ret = qcow2_cache_flush_dependency(bs, c); >> - if (ret < 0) { >> - return ret; >> - } >> - } >> - >> - c->depends = dependency; >> - return 0; >> -} >> - >> -void qcow2_cache_depends_on_flush(Qcow2Cache *c) >> -{ >> - c->depends_on_flush = true; >> -} >> - >> -static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c) >> -{ >> - int i; >> - int min_count = INT_MAX; >> - int min_index = -1; >> - >> - >> - for (i = 0; i < c->size; i++) { >> - if (c->entries[i].ref) { >> - continue; >> - } >> - >> - if (c->entries[i].cache_hits < min_count) { >> - min_index = i; >> - min_count = c->entries[i].cache_hits; >> - } >> - >> - /* Give newer hits priority */ >> - /* TODO Check how to optimize the 
replacement strategy */ >> - c->entries[i].cache_hits /= 2; >> - } >> - >> - if (min_index == -1) { >> - /* This can't happen in current synchronous code, but leave the check >> - * here as a reminder for whoever starts using AIO with the cache */ >> - abort(); >> - } >> - return min_index; >> -} >> - >> -static int qcow2_cache_do_get(BlockDriverState *bs, Qcow2Cache *c, >> - uint64_t offset, void **table, bool read_from_disk) >> -{ >> - BDRVQcowState *s = bs->opaque; >> - int i; >> - int ret; >> - >> - trace_qcow2_cache_get(qemu_coroutine_self(), c == s->l2_table_cache, >> - offset, read_from_disk); >> - >> - /* Check if the table is already cached */ >> - for (i = 0; i < c->size; i++) { >> - if (c->entries[i].offset == offset) { >> - goto found; >> - } >> - } >> - >> - /* If not, write a table back and replace it */ >> - i = qcow2_cache_find_entry_to_replace(c); >> - trace_qcow2_cache_get_replace_entry(qemu_coroutine_self(), >> - c == s->l2_table_cache, i); >> - if (i < 0) { >> - return i; >> - } >> - >> - ret = qcow2_cache_entry_flush(bs, c, i); >> - if (ret < 0) { >> - return ret; >> - } >> - >> - trace_qcow2_cache_get_read(qemu_coroutine_self(), >> - c == s->l2_table_cache, i); >> - c->entries[i].offset = 0; >> - if (read_from_disk) { >> - if (c == s->l2_table_cache) { >> - BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD); >> - } >> - >> - ret = bdrv_pread(bs->file, offset, c->entries[i].table, s->cluster_size); >> - if (ret < 0) { >> - return ret; >> - } >> - } >> - >> - /* Give the table some hits for the start so that it won't be replaced >> - * immediately. The number 32 is completely arbitrary. 
*/ >> - c->entries[i].cache_hits = 32; >> - c->entries[i].offset = offset; >> - >> - /* And return the right table */ >> -found: >> - c->entries[i].cache_hits++; >> - c->entries[i].ref++; >> - *table = c->entries[i].table; >> - >> - trace_qcow2_cache_get_done(qemu_coroutine_self(), >> - c == s->l2_table_cache, i); >> - >> - return 0; >> -} >> - >> -int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, >> - void **table) >> -{ >> - return qcow2_cache_do_get(bs, c, offset, table, true); >> -} >> - >> -int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, >> - void **table) >> -{ >> - return qcow2_cache_do_get(bs, c, offset, table, false); >> -} >> - >> -int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table) >> -{ >> - int i; >> - >> - for (i = 0; i < c->size; i++) { >> - if (c->entries[i].table == *table) { >> - goto found; >> - } >> - } >> - return -ENOENT; >> - >> -found: >> - c->entries[i].ref--; >> - *table = NULL; >> - >> - assert(c->entries[i].ref >= 0); >> - return 0; >> -} >> - >> -void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table) >> -{ >> - int i; >> - >> - for (i = 0; i < c->size; i++) { >> - if (c->entries[i].table == table) { >> - goto found; >> - } >> - } >> - abort(); >> - >> -found: >> - c->entries[i].dirty = true; >> -} >> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c >> index e179211..335dc7a 100644 >> --- a/block/qcow2-cluster.c >> +++ b/block/qcow2-cluster.c >> @@ -28,6 +28,7 @@ >> #include "block_int.h" >> #include "block/qcow2.h" >> #include "trace.h" >> +#include "block-cache.h" >> >> int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size) >> { >> @@ -69,7 +70,8 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size) >> return new_l1_table_offset; >> } >> >> - ret = qcow2_cache_flush(bs, s->refcount_block_cache); >> + ret = block_cache_flush(bs, s->refcount_block_cache, >> + BLOCK_TABLE_REF, s->cluster_size); >> if 
(ret < 0) { >> goto fail; >> } >> @@ -119,7 +121,8 @@ static int l2_load(BlockDriverState *bs, uint64_t l2_offset, >> BDRVQcowState *s = bs->opaque; >> int ret; >> >> - ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, (void**) l2_table); >> + ret = block_cache_get(bs, s->l2_table_cache, l2_offset, >> + (void **) l2_table, BLOCK_TABLE_L2, s->cluster_size); >> >> return ret; >> } >> @@ -180,7 +183,8 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) >> return l2_offset; >> } >> >> - ret = qcow2_cache_flush(bs, s->refcount_block_cache); >> + ret = block_cache_flush(bs, s->refcount_block_cache, >> + BLOCK_TABLE_REF, s->cluster_size); >> if (ret < 0) { >> goto fail; >> } >> @@ -188,7 +192,8 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) >> /* allocate a new entry in the l2 cache */ >> >> trace_qcow2_l2_allocate_get_empty(bs, l1_index); >> - ret = qcow2_cache_get_empty(bs, s->l2_table_cache, l2_offset, (void**) table); >> + ret = block_cache_get_empty(bs, s->l2_table_cache, l2_offset, >> + (void **) table, BLOCK_TABLE_L2, s->cluster_size); >> if (ret < 0) { >> return ret; >> } >> @@ -203,16 +208,17 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) >> >> /* if there was an old l2 table, read it from the disk */ >> BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_COW_READ); >> - ret = qcow2_cache_get(bs, s->l2_table_cache, >> + ret = block_cache_get(bs, s->l2_table_cache, >> old_l2_offset & L1E_OFFSET_MASK, >> - (void**) &old_table); >> + (void **) &old_table, BLOCK_TABLE_L2, s->cluster_size); >> if (ret < 0) { >> goto fail; >> } >> >> memcpy(l2_table, old_table, s->cluster_size); >> >> - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &old_table); >> + ret = block_cache_put(bs, s->l2_table_cache, >> + (void **) &old_table, BLOCK_TABLE_L2); >> if (ret < 0) { >> goto fail; >> } >> @@ -222,8 +228,9 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) >> 
BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_WRITE); >> >> trace_qcow2_l2_allocate_write_l2(bs, l1_index); >> - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); >> - ret = qcow2_cache_flush(bs, s->l2_table_cache); >> + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); >> + ret = block_cache_flush(bs, s->l2_table_cache, >> + BLOCK_TABLE_L2, s->cluster_size); >> if (ret < 0) { >> goto fail; >> } >> @@ -242,7 +249,7 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) >> >> fail: >> trace_qcow2_l2_allocate_done(bs, l1_index, ret); >> - qcow2_cache_put(bs, s->l2_table_cache, (void**) table); >> + block_cache_put(bs, s->l2_table_cache, (void **) table, BLOCK_TABLE_L2); >> s->l1_table[l1_index] = old_l2_offset; >> return ret; >> } >> @@ -475,7 +482,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset, >> abort(); >> } >> >> - qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); >> + block_cache_put(bs, s->l2_table_cache, (void **) &l2_table, BLOCK_TABLE_L2); >> >> nb_available = (c * s->cluster_sectors); >> >> @@ -584,13 +591,15 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs, >> * allocated. 
*/ >> cluster_offset = be64_to_cpu(l2_table[l2_index]); >> if (cluster_offset & L2E_OFFSET_MASK) { >> - qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); >> + block_cache_put(bs, s->l2_table_cache, >> + (void **) &l2_table, BLOCK_TABLE_L2); >> return 0; >> } >> >> cluster_offset = qcow2_alloc_bytes(bs, compressed_size); >> if (cluster_offset < 0) { >> - qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); >> + block_cache_put(bs, s->l2_table_cache, >> + (void **) &l2_table, BLOCK_TABLE_L2); >> return 0; >> } >> >> @@ -605,9 +614,10 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs, >> /* compressed clusters never have the copied flag */ >> >> BLKDBG_EVENT(bs->file, BLKDBG_L2_UPDATE_COMPRESSED); >> - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); >> + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); >> l2_table[l2_index] = cpu_to_be64(cluster_offset); >> - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); >> + ret = block_cache_put(bs, s->l2_table_cache, >> + (void **) &l2_table, BLOCK_TABLE_L2); >> if (ret < 0) { >> return 0; >> } >> @@ -659,18 +669,16 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m) >> * handled. 
>> */ >> if (cow) { >> - qcow2_cache_depends_on_flush(s->l2_table_cache); >> + block_cache_depends_on_flush(s->l2_table_cache); >> } >> >> - if (qcow2_need_accurate_refcounts(s)) { >> - qcow2_cache_set_dependency(bs, s->l2_table_cache, >> - s->refcount_block_cache); >> - } >> + block_cache_set_dependency(bs, s->l2_table_cache, BLOCK_TABLE_L2, >> + s->refcount_block_cache, s->cluster_size); >> ret = get_cluster_table(bs, m->offset, &l2_table, &l2_index); >> if (ret < 0) { >> goto err; >> } >> - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); >> + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); >> >> for (i = 0; i < m->nb_clusters; i++) { >> /* if two concurrent writes happen to the same unallocated cluster >> @@ -687,7 +695,8 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m) >> } >> >> >> - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); >> + ret = block_cache_put(bs, s->l2_table_cache, >> + (void **) &l2_table, BLOCK_TABLE_L2); >> if (ret < 0) { >> goto err; >> } >> @@ -913,7 +922,8 @@ again: >> * request to complete. If we still had the reference, we could use up the >> * whole cache with sleeping requests. 
>> */ >> - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); >> + ret = block_cache_put(bs, s->l2_table_cache, >> + (void **) &l2_table, BLOCK_TABLE_L2); >> if (ret < 0) { >> return ret; >> } >> @@ -1077,14 +1087,15 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset, >> } >> >> /* First remove L2 entries */ >> - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); >> + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); >> l2_table[l2_index + i] = cpu_to_be64(0); >> >> /* Then decrease the refcount */ >> qcow2_free_any_clusters(bs, old_offset, 1); >> } >> >> - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); >> + ret = block_cache_put(bs, s->l2_table_cache, >> + (void **) &l2_table, BLOCK_TABLE_L2); >> if (ret < 0) { >> return ret; >> } >> @@ -1154,7 +1165,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset, >> old_offset = be64_to_cpu(l2_table[l2_index + i]); >> >> /* Update L2 entries */ >> - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); >> + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); >> if (old_offset & QCOW_OFLAG_COMPRESSED) { >> l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO); >> qcow2_free_any_clusters(bs, old_offset, 1); >> @@ -1163,7 +1174,8 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset, >> } >> } >> >> - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); >> + ret = block_cache_put(bs, s->l2_table_cache, >> + (void **) &l2_table, BLOCK_TABLE_L2); >> if (ret < 0) { >> return ret; >> } >> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c >> index 5e3f915..728bfc1 100644 >> --- a/block/qcow2-refcount.c >> +++ b/block/qcow2-refcount.c >> @@ -25,6 +25,7 @@ >> #include "qemu-common.h" >> #include "block_int.h" >> #include "block/qcow2.h" >> +#include "block-cache.h" >> >> static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size); >> static int QEMU_WARN_UNUSED_RESULT 
update_refcount(BlockDriverState *bs, >> @@ -71,8 +72,8 @@ static int load_refcount_block(BlockDriverState *bs, >> int ret; >> >> BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_LOAD); >> - ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset, >> - refcount_block); >> + ret = block_cache_get(bs, s->refcount_block_cache, refcount_block_offset, >> + refcount_block, BLOCK_TABLE_REF, s->cluster_size); >> >> return ret; >> } >> @@ -98,8 +99,8 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index) >> if (!refcount_block_offset) >> return 0; >> >> - ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset, >> - (void**) &refcount_block); >> + ret = block_cache_get(bs, s->refcount_block_cache, refcount_block_offset, >> + (void **) &refcount_block, BLOCK_TABLE_REF, s->cluster_size); >> if (ret < 0) { >> return ret; >> } >> @@ -108,8 +109,8 @@ static int get_refcount(BlockDriverState *bs, int64_t cluster_index) >> ((1 << (s->cluster_bits - REFCOUNT_SHIFT)) - 1); >> refcount = be16_to_cpu(refcount_block[block_index]); >> >> - ret = qcow2_cache_put(bs, s->refcount_block_cache, >> - (void**) &refcount_block); >> + ret = block_cache_put(bs, s->refcount_block_cache, >> + (void **) &refcount_block, BLOCK_TABLE_REF); >> if (ret < 0) { >> return ret; >> } >> @@ -201,7 +202,8 @@ static int alloc_refcount_block(BlockDriverState *bs, >> *refcount_block = NULL; >> >> /* We write to the refcount table, so we might depend on L2 tables */ >> - qcow2_cache_flush(bs, s->l2_table_cache); >> + block_cache_flush(bs, s->l2_table_cache, >> + BLOCK_TABLE_L2, s->cluster_size); >> >> /* Allocate the refcount block itself and mark it as used */ >> int64_t new_block = alloc_clusters_noref(bs, s->cluster_size); >> @@ -217,8 +219,8 @@ static int alloc_refcount_block(BlockDriverState *bs, >> >> if (in_same_refcount_block(s, new_block, cluster_index << s->cluster_bits)) { >> /* Zero the new refcount block before updating it */ >> - ret = 
qcow2_cache_get_empty(bs, s->refcount_block_cache, new_block, >> - (void**) refcount_block); >> + ret = block_cache_get_empty(bs, s->refcount_block_cache, new_block, >> + (void **) refcount_block, BLOCK_TABLE_REF, s->cluster_size); >> if (ret < 0) { >> goto fail_block; >> } >> @@ -241,8 +243,8 @@ static int alloc_refcount_block(BlockDriverState *bs, >> >> /* Initialize the new refcount block only after updating its refcount, >> * update_refcount uses the refcount cache itself */ >> - ret = qcow2_cache_get_empty(bs, s->refcount_block_cache, new_block, >> - (void**) refcount_block); >> + ret = block_cache_get_empty(bs, s->refcount_block_cache, new_block, >> + (void **) refcount_block, BLOCK_TABLE_REF, s->cluster_size); >> if (ret < 0) { >> goto fail_block; >> } >> @@ -252,8 +254,9 @@ static int alloc_refcount_block(BlockDriverState *bs, >> >> /* Now the new refcount block needs to be written to disk */ >> BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_ALLOC_WRITE); >> - qcow2_cache_entry_mark_dirty(s->refcount_block_cache, *refcount_block); >> - ret = qcow2_cache_flush(bs, s->refcount_block_cache); >> + block_cache_entry_mark_dirty(s->refcount_block_cache, *refcount_block); >> + ret = block_cache_flush(bs, s->refcount_block_cache, >> + BLOCK_TABLE_REF, s->cluster_size); >> if (ret < 0) { >> goto fail_block; >> } >> @@ -273,7 +276,8 @@ static int alloc_refcount_block(BlockDriverState *bs, >> return 0; >> } >> >> - ret = qcow2_cache_put(bs, s->refcount_block_cache, (void**) refcount_block); >> + ret = block_cache_put(bs, s->refcount_block_cache, >> + (void **) refcount_block, BLOCK_TABLE_REF); >> if (ret < 0) { >> goto fail_block; >> } >> @@ -406,7 +410,8 @@ fail_table: >> g_free(new_table); >> fail_block: >> if (*refcount_block != NULL) { >> - qcow2_cache_put(bs, s->refcount_block_cache, (void**) refcount_block); >> + block_cache_put(bs, s->refcount_block_cache, >> + (void **) refcount_block, BLOCK_TABLE_REF); >> } >> return ret; >> } >> @@ -432,8 +437,8 @@ static int 
QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, >> } >> >> if (addend < 0) { >> - qcow2_cache_set_dependency(bs, s->refcount_block_cache, >> - s->l2_table_cache); >> + block_cache_set_dependency(bs, s->refcount_block_cache, BLOCK_TABLE_REF, >> + s->l2_table_cache, s->cluster_size); >> } >> >> start = offset & ~(s->cluster_size - 1); >> @@ -449,8 +454,8 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, >> /* Load the refcount block and allocate it if needed */ >> if (table_index != old_table_index) { >> if (refcount_block) { >> - ret = qcow2_cache_put(bs, s->refcount_block_cache, >> - (void**) &refcount_block); >> + ret = block_cache_put(bs, s->refcount_block_cache, >> + (void **) &refcount_block, BLOCK_TABLE_REF); >> if (ret < 0) { >> goto fail; >> } >> @@ -463,7 +468,7 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs, >> } >> old_table_index = table_index; >> >> - qcow2_cache_entry_mark_dirty(s->refcount_block_cache, refcount_block); >> + block_cache_entry_mark_dirty(s->refcount_block_cache, refcount_block); >> >> /* we can update the count and save it */ >> block_index = cluster_index & >> @@ -486,8 +491,8 @@ fail: >> /* Write last changed block to disk */ >> if (refcount_block) { >> int wret; >> - wret = qcow2_cache_put(bs, s->refcount_block_cache, >> - (void**) &refcount_block); >> + wret = block_cache_put(bs, s->refcount_block_cache, >> + (void **) &refcount_block, BLOCK_TABLE_REF); >> if (wret < 0) { >> return ret < 0 ? 
ret : wret; >> } >> @@ -763,8 +768,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, >> old_l2_offset = l2_offset; >> l2_offset &= L1E_OFFSET_MASK; >> >> - ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, >> - (void**) &l2_table); >> + ret = block_cache_get(bs, s->l2_table_cache, l2_offset, >> + (void **) &l2_table, BLOCK_TABLE_L2, s->cluster_size); >> if (ret < 0) { >> goto fail; >> } >> @@ -811,16 +816,18 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, >> } >> if (offset != old_offset) { >> if (addend > 0) { >> - qcow2_cache_set_dependency(bs, s->l2_table_cache, >> - s->refcount_block_cache); >> + block_cache_set_dependency(bs, s->l2_table_cache, >> + BLOCK_TABLE_L2, s->refcount_block_cache, >> + s->cluster_size); >> } >> l2_table[j] = cpu_to_be64(offset); >> - qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table); >> + block_cache_entry_mark_dirty(s->l2_table_cache, l2_table); >> } >> } >> } >> >> - ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); >> + ret = block_cache_put(bs, s->l2_table_cache, >> + (void **) &l2_table, BLOCK_TABLE_L2); >> if (ret < 0) { >> goto fail; >> } >> @@ -847,7 +854,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, >> ret = 0; >> fail: >> if (l2_table) { >> - qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table); >> + block_cache_put(bs, s->l2_table_cache, >> + (void **) &l2_table, BLOCK_TABLE_L2); >> } >> >> /* Update L1 only if it isn't deleted anyway (addend = -1) */ >> diff --git a/block/qcow2.c b/block/qcow2.c >> index fd5e214..b89d312 100644 >> --- a/block/qcow2.c >> +++ b/block/qcow2.c >> @@ -30,6 +30,7 @@ >> #include "qemu-error.h" >> #include "qerror.h" >> #include "trace.h" >> +#include "block-cache.h" >> >> /* >> Differences with QCOW: >> @@ -415,8 +416,9 @@ static int qcow2_open(BlockDriverState *bs, int flags) >> } >> >> /* alloc L2 table/refcount block cache */ >> - s->l2_table_cache = qcow2_cache_create(bs, L2_CACHE_SIZE); >> - s->refcount_block_cache = 
qcow2_cache_create(bs, REFCOUNT_CACHE_SIZE); >> + s->l2_table_cache = block_cache_create(bs, L2_CACHE_SIZE, s->cluster_size); >> + s->refcount_block_cache = >> + block_cache_create(bs, REFCOUNT_CACHE_SIZE, s->cluster_size); >> >> s->cluster_cache = g_malloc(s->cluster_size); >> /* one more sector for decompressed data alignment */ >> @@ -500,7 +502,7 @@ static int qcow2_open(BlockDriverState *bs, int flags) >> qcow2_refcount_close(bs); >> g_free(s->l1_table); >> if (s->l2_table_cache) { >> - qcow2_cache_destroy(bs, s->l2_table_cache); >> + block_cache_destroy(bs, s->l2_table_cache, BLOCK_TABLE_L2); >> } >> g_free(s->cluster_cache); >> qemu_vfree(s->cluster_data); >> @@ -860,13 +862,13 @@ static void qcow2_close(BlockDriverState *bs) >> BDRVQcowState *s = bs->opaque; >> g_free(s->l1_table); >> >> - qcow2_cache_flush(bs, s->l2_table_cache); >> - qcow2_cache_flush(bs, s->refcount_block_cache); >> - >> + block_cache_flush(bs, s->l2_table_cache, >> + BLOCK_TABLE_L2, s->cluster_size); >> + block_cache_flush(bs, s->refcount_block_cache, >> + BLOCK_TABLE_REF, s->cluster_size); >> qcow2_mark_clean(bs); >> - >> - qcow2_cache_destroy(bs, s->l2_table_cache); >> - qcow2_cache_destroy(bs, s->refcount_block_cache); >> + block_cache_destroy(bs, s->l2_table_cache, BLOCK_TABLE_L2); >> + block_cache_destroy(bs, s->refcount_block_cache, BLOCK_TABLE_REF); >> >> g_free(s->unknown_header_fields); >> cleanup_unknown_header_ext(bs); >> @@ -1339,8 +1341,6 @@ static int qcow2_create(const char *filename, QEMUOptionParameter *options) >> options->value.s); >> return -EINVAL; >> } >> - } else if (!strcmp(options->name, BLOCK_OPT_LAZY_REFCOUNTS)) { >> - flags |= options->value.n ? 
BLOCK_FLAG_LAZY_REFCOUNTS : 0; >> } >> options++; >> } >> @@ -1537,18 +1537,18 @@ static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs) >> int ret; >> >> qemu_co_mutex_lock(&s->lock); >> - ret = qcow2_cache_flush(bs, s->l2_table_cache); >> + ret = block_cache_flush(bs, s->l2_table_cache, >> + BLOCK_TABLE_L2, s->cluster_size); >> if (ret < 0) { >> qemu_co_mutex_unlock(&s->lock); >> return ret; >> } >> >> - if (qcow2_need_accurate_refcounts(s)) { >> - ret = qcow2_cache_flush(bs, s->refcount_block_cache); >> - if (ret < 0) { >> - qemu_co_mutex_unlock(&s->lock); >> - return ret; >> - } >> + ret = block_cache_flush(bs, s->refcount_block_cache, >> + BLOCK_TABLE_REF, s->cluster_size); >> + if (ret < 0) { >> + qemu_co_mutex_unlock(&s->lock); >> + return ret; >> } >> qemu_co_mutex_unlock(&s->lock); >> >> diff --git a/block/qcow2.h b/block/qcow2.h >> index b4eb654..cb6fd7a 100644 >> --- a/block/qcow2.h >> +++ b/block/qcow2.h >> @@ -27,6 +27,7 @@ >> >> #include "aes.h" >> #include "qemu-coroutine.h" >> +#include "block-cache.h" >> >> //#define DEBUG_ALLOC >> //#define DEBUG_ALLOC2 >> @@ -94,8 +95,6 @@ typedef struct QCowSnapshot { >> uint64_t vm_clock_nsec; >> } QCowSnapshot; >> >> -struct Qcow2Cache; >> -typedef struct Qcow2Cache Qcow2Cache; >> >> typedef struct Qcow2UnknownHeaderExtension { >> uint32_t magic; >> @@ -146,8 +145,8 @@ typedef struct BDRVQcowState { >> uint64_t l1_table_offset; >> uint64_t *l1_table; >> >> - Qcow2Cache* l2_table_cache; >> - Qcow2Cache* refcount_block_cache; >> + BlockCache *l2_table_cache; >> + BlockCache *refcount_block_cache; >> >> uint8_t *cluster_cache; >> uint8_t *cluster_data; >> @@ -316,21 +315,4 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name); >> >> void qcow2_free_snapshots(BlockDriverState *bs); >> int qcow2_read_snapshots(BlockDriverState *bs); >> - >> -/* qcow2-cache.c functions */ >> -Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables); >> -int 
qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c); >> - >> -void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table); >> -int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c); >> -int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, >> - Qcow2Cache *dependency); >> -void qcow2_cache_depends_on_flush(Qcow2Cache *c); >> - >> -int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, >> - void **table); >> -int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, >> - void **table); >> -int qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table); >> - >> #endif >> diff --git a/trace-events b/trace-events >> index 6b12f83..52b6438 100644 >> --- a/trace-events >> +++ b/trace-events >> @@ -439,12 +439,13 @@ qcow2_l2_allocate_write_l2(void *bs, int l1_index) "bs %p l1_index %d" >> qcow2_l2_allocate_write_l1(void *bs, int l1_index) "bs %p l1_index %d" >> qcow2_l2_allocate_done(void *bs, int l1_index, int ret) "bs %p l1_index %d ret %d" >> >> -qcow2_cache_get(void *co, int c, uint64_t offset, bool read_from_disk) "co %p is_l2_cache %d offset %" PRIx64 " read_from_disk %d" >> -qcow2_cache_get_replace_entry(void *co, int c, int i) "co %p is_l2_cache %d index %d" >> -qcow2_cache_get_read(void *co, int c, int i) "co %p is_l2_cache %d index %d" >> -qcow2_cache_get_done(void *co, int c, int i) "co %p is_l2_cache %d index %d" >> -qcow2_cache_flush(void *co, int c) "co %p is_l2_cache %d" >> -qcow2_cache_entry_flush(void *co, int c, int i) "co %p is_l2_cache %d index %d" >> +# block/block-cache.c >> +block_cache_get(void *co, int c, uint64_t offset, bool read_from_disk) "co %p is_l2_cache %d offset %" PRIx64 " read_from_disk %d" >> +block_cache_get_replace_entry(void *co, int c, int i) "co %p is_l2_cache %d index %d" >> +block_cache_get_read(void *co, int c, int i) "co %p is_l2_cache %d index %d" >> +block_cache_get_done(void *co, int c, int i) "co %p is_l2_cache %d index %d" >> 
+block_cache_flush(void *co, int c) "co %p is_l2_cache %d" >> +block_cache_entry_flush(void *co, int c, int i) "co %p is_l2_cache %d index %d" >> >> # block/qed-l2-cache.c >> qed_alloc_l2_cache_entry(void *l2_cache, void *entry) "l2_cache %p entry %p" >> -- >> 1.7.1 >> >> > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH V12 4/6] rename qcow2-cache.c to block-cache.c
  2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 4/6] rename qcow2-cache.c to block-cache.c Dong Xu Wang
  2012-09-06 17:52   ` Michael Roth
@ 2012-09-11  8:41   ` Kevin Wolf
  1 sibling, 0 replies; 25+ messages in thread
From: Kevin Wolf @ 2012-09-11  8:41 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: qemu-devel

On 10.08.2012 17:39, Dong Xu Wang wrote:
> add-cow and qcow2 file format will share the same cache code, so rename
> block-cache.c to block-cache.c. And related structure and qcow2 code also
> are changed.
>
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> ---
>  block.h                |    3 +
>  block/Makefile.objs    |    3 +-
>  block/qcow2-cache.c    |  323 ------------------------------------------------
>  block/qcow2-cluster.c  |   66 ++++++----
>  block/qcow2-refcount.c |   66 ++++++-----
>  block/qcow2.c          |   36 +++---
>  block/qcow2.h          |   24 +---
>  trace-events           |   13 +-
>  8 files changed, 109 insertions(+), 425 deletions(-)
>  delete mode 100644 block/qcow2-cache.c
>
> diff --git a/block.h b/block.h
> index e5dfcd7..c325661 100644
> --- a/block.h
> +++ b/block.h
> @@ -401,6 +401,9 @@ typedef enum {
>      BLKDBG_CLUSTER_ALLOC_BYTES,
>      BLKDBG_CLUSTER_FREE,
>
> +    BLKDBG_ADD_COW_UPDATE,
> +    BLKDBG_ADD_COW_LOAD,
> +

I don't think you should add new events; the existing ones should be
generic enough that you can reuse them. It's somewhat hard to see without
block-cache.c, though.

Can you make sure to have one patch with pure code motion, and a separate
one with the changes needed to make it work with add-cow? It will help
reviewers a lot.

>      BLKDBG_EVENT_MAX,
>  } BlkDebugEvent;
>
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index e179211..335dc7a 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -28,6 +28,7 @@
>  #include "block_int.h"
>  #include "block/qcow2.h"
>  #include "trace.h"
> +#include "block-cache.h"
>
>  int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
>  {
> @@ -69,7 +70,8 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
>          return new_l1_table_offset;
>      }
>
> -    ret = qcow2_cache_flush(bs, s->refcount_block_cache);
> +    ret = block_cache_flush(bs, s->refcount_block_cache,
> +        BLOCK_TABLE_REF, s->cluster_size);

I think it's better to pass s->cluster_size to the cache initialisation
instead of in each call of the cache function.

For the blkdebug events I guess it's possible as well to move this to the
initialisation, but I'd have to see the block-cache.c code to say
something specific about this.

> @@ -659,18 +669,16 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
>       * handled.
>       */
>      if (cow) {
> -        qcow2_cache_depends_on_flush(s->l2_table_cache);
> +        block_cache_depends_on_flush(s->l2_table_cache);
>      }
>
> -    if (qcow2_need_accurate_refcounts(s)) {
> -        qcow2_cache_set_dependency(bs, s->l2_table_cache,
> -                                   s->refcount_block_cache);
> -    }
> +    block_cache_set_dependency(bs, s->l2_table_cache, BLOCK_TABLE_L2,
> +        s->refcount_block_cache, s->cluster_size);

What happened with lazy refcounting? Is this a mismerge or did you
intentionally remove the condition? (There's a second place where you do
the same)

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread
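The review comment above, about handing the cluster size to the cache once at initialisation rather than on every call, could look roughly like the sketch below. This is only an illustration under stated assumptions: block-cache.c is not part of this message, so `ToyCache`, `ToyTableType`, `toy_cache_create`, `toy_cache_table` and `toy_cache_destroy` are hypothetical names and not the QEMU API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table types, standing in for BLOCK_TABLE_REF and friends. */
typedef enum {
    TOY_TABLE_REF,
    TOY_TABLE_L2,
    TOY_TABLE_BITMAP,
} ToyTableType;

/* The cache itself remembers the entry size and the table type, so the
 * get/put/flush style calls do not need them as extra arguments. */
typedef struct ToyCache {
    int num_tables;
    size_t table_size;      /* e.g. the cluster size, fixed at creation */
    ToyTableType type;
    uint8_t *tables;        /* num_tables * table_size bytes */
} ToyCache;

ToyCache *toy_cache_create(int num_tables, size_t table_size,
                           ToyTableType type)
{
    ToyCache *c;

    if (num_tables <= 0 || table_size == 0) {
        return NULL;
    }
    c = calloc(1, sizeof(*c));
    if (!c) {
        return NULL;
    }
    c->tables = calloc((size_t)num_tables, table_size);
    if (!c->tables) {
        free(c);
        return NULL;
    }
    c->num_tables = num_tables;
    c->table_size = table_size;
    c->type = type;
    return c;
}

/* Return the i-th table; the size comes from the cache, not the caller. */
void *toy_cache_table(ToyCache *c, int i)
{
    assert(c != NULL);
    if (i < 0 || i >= c->num_tables) {
        return NULL;
    }
    return c->tables + (size_t)i * c->table_size;
}

void toy_cache_destroy(ToyCache *c)
{
    if (c) {
        free(c->tables);
        free(c);
    }
}
```

With this shape, a caller would create the cache once, for example `toy_cache_create(16, 65536, TOY_TABLE_L2)`, and later calls would take only the cache pointer. Whether the blkdebug event selection can move to creation time in the same way depends on the real block-cache.c code, as noted in the review.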
* [Qemu-devel] [PATCH V12 5/6] add-cow file format
  2012-08-10 15:39 [Qemu-devel] [PATCH V12 0/6] add-cow file format Dong Xu Wang
                   ` (3 preceding siblings ...)
  2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 4/6] rename qcow2-cache.c to block-cache.c Dong Xu Wang
@ 2012-08-10 15:39 ` Dong Xu Wang
  2012-09-06 20:19   ` Michael Roth
  2012-09-11  9:40   ` Kevin Wolf
  2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 6/6] add-cow: add qemu-iotests support Dong Xu Wang
  2012-08-23  5:34 ` [Qemu-devel] [PATCH V12 0/6] add-cow file format Dong Xu Wang
  6 siblings, 2 replies; 25+ messages in thread
From: Dong Xu Wang @ 2012-08-10 15:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang

add-cow file format core code. It uses block-cache.c as its cache code.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
 block/Makefile.objs |    1 +
 block/add-cow.c     |  613 +++++++++++++++++++++++++++++++++++++++++++++++++++
 block/add-cow.h     |   85 +++++++
 block_int.h         |    2 +
 4 files changed, 701 insertions(+), 0 deletions(-)
 create mode 100644 block/add-cow.c
 create mode 100644 block/add-cow.h

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 23bdfc8..7ed5051 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -2,6 +2,7 @@ block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat
 block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
 block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
+block-obj-y += add-cow.o
 block-obj-y += block-cache.o
 block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
 block-obj-y += stream.o
diff --git a/block/add-cow.c b/block/add-cow.c
new file mode 100644
index 0000000..d4711d5
--- /dev/null
+++ b/block/add-cow.c
@@ -0,0 +1,613 @@
+/*
+ * QEMU ADD-COW Disk Format
+ *
+ * Copyright IBM, Corp. 2012
+ *
+ * Authors:
+ *  Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory. + * + */ + +#include "qemu-common.h" +#include "block_int.h" +#include "module.h" +#include "add-cow.h" + +static void add_cow_header_le_to_cpu(const AddCowHeader *le, AddCowHeader *cpu) +{ + cpu->magic = le64_to_cpu(le->magic); + cpu->version = le32_to_cpu(le->version); + + cpu->backing_filename_offset = le32_to_cpu(le->backing_filename_offset); + cpu->backing_filename_size = le32_to_cpu(le->backing_filename_size); + + cpu->image_filename_offset = le32_to_cpu(le->image_filename_offset); + cpu->image_filename_size = le32_to_cpu(le->image_filename_size); + + cpu->features = le64_to_cpu(le->features); + cpu->optional_features = le64_to_cpu(le->optional_features); + cpu->header_pages_size = le32_to_cpu(le->header_pages_size); +} + +static void add_cow_header_cpu_to_le(const AddCowHeader *cpu, AddCowHeader *le) +{ + le->magic = cpu_to_le64(cpu->magic); + le->version = cpu_to_le32(cpu->version); + + le->backing_filename_offset = cpu_to_le32(cpu->backing_filename_offset); + le->backing_filename_size = cpu_to_le32(cpu->backing_filename_size); + + le->image_filename_offset = cpu_to_le32(cpu->image_filename_offset); + le->image_filename_size = cpu_to_le32(cpu->image_filename_size); + + le->features = cpu_to_le64(cpu->features); + le->optional_features = cpu_to_le64(cpu->optional_features); + le->header_pages_size = cpu_to_le32(cpu->header_pages_size); +} + +static int add_cow_probe(const uint8_t *buf, int buf_size, const char *filename) +{ + const AddCowHeader *header = (const AddCowHeader *)buf; + + if (le64_to_cpu(header->magic) == ADD_COW_MAGIC && + le32_to_cpu(header->version) == ADD_COW_VERSION) { + return 100; + } else { + return 0; + } +} + +static int add_cow_create(const char *filename, QEMUOptionParameter *options) +{ + AddCowHeader header = { + .magic = ADD_COW_MAGIC, + .version = ADD_COW_VERSION, + .features = 0, + .optional_features = 0, + .header_pages_size = ADD_COW_DEFAULT_PAGE_SIZE, + }; + 
AddCowHeader le_header; + int64_t image_len = 0; + const char *backing_filename = NULL; + const char *backing_fmt = NULL; + const char *image_filename = NULL; + const char *image_format = NULL; + BlockDriverState *bs, *image_bs = NULL, *backing_bs = NULL; + BlockDriver *drv = bdrv_find_format("add-cow"); + BDRVAddCowState s; + int ret; + + while (options && options->name) { + if (!strcmp(options->name, BLOCK_OPT_SIZE)) { + image_len = options->value.n; + } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) { + backing_filename = options->value.s; + } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FMT)) { + backing_fmt = options->value.s; + } else if (!strcmp(options->name, BLOCK_OPT_IMAGE_FILE)) { + image_filename = options->value.s; + } else if (!strcmp(options->name, BLOCK_OPT_IMAGE_FORMAT)) { + image_format = options->value.s; + } + options++; + } + + if (backing_filename) { + header.backing_filename_offset = sizeof(header) + + sizeof(s.backing_file_format) + sizeof(s.image_file_format); + header.backing_filename_size = strlen(backing_filename); + + if (!backing_fmt) { + backing_bs = bdrv_new("image"); + ret = bdrv_open(backing_bs, backing_filename, BDRV_O_RDWR + | BDRV_O_CACHE_WB, NULL); + if (ret < 0) { + return ret; + } + backing_fmt = bdrv_get_format_name(backing_bs); + bdrv_delete(backing_bs); + } + } else { + header.features |= ADD_COW_F_All_ALLOCATED; + } + + if (image_filename) { + header.image_filename_offset = + sizeof(header) + sizeof(s.backing_file_format) + + sizeof(s.image_file_format) + header.backing_filename_size; + header.image_filename_size = strlen(image_filename); + } else { + error_report("Error: image_file should be given."); + return -EINVAL; + } + + if (backing_filename && !strcmp(backing_filename, image_filename)) { + error_report("Error: Trying to create an image with the " + "same backing file name as the image file name"); + return -EINVAL; + } + + if (!strcmp(filename, image_filename)) { + error_report("Error: Trying to 
create an image with the " + "same filename as the image file name"); + return -EINVAL; + } + + if (header.image_filename_offset + header.image_filename_size + > ADD_COW_PAGE_SIZE * ADD_COW_DEFAULT_PAGE_SIZE) { + error_report("image_file name or backing_file name too long."); + return -ENOSPC; + } + + ret = bdrv_file_open(&image_bs, image_filename, BDRV_O_RDWR); + if (ret < 0) { + return ret; + } + bdrv_delete(image_bs); + + ret = bdrv_create_file(filename, NULL); + if (ret < 0) { + return ret; + } + + ret = bdrv_file_open(&bs, filename, BDRV_O_RDWR); + if (ret < 0) { + return ret; + } + add_cow_header_cpu_to_le(&header, &le_header); + ret = bdrv_pwrite(bs, 0, &le_header, sizeof(le_header)); + if (ret < 0) { + bdrv_delete(bs); + return ret; + } + + ret = bdrv_pwrite(bs, sizeof(le_header), backing_fmt ? backing_fmt : "", + backing_fmt ? strlen(backing_fmt) : 0); + if (ret < 0) { + bdrv_delete(bs); + return ret; + } + + ret = bdrv_pwrite(bs, sizeof(le_header) + sizeof(s.backing_file_format), + image_format ? image_format : "raw", + image_format ? 
strlen(image_format) : sizeof("raw")); + if (ret < 0) { + bdrv_delete(bs); + return ret; + } + + if (backing_filename) { + ret = bdrv_pwrite(bs, header.backing_filename_offset, + backing_filename, header.backing_filename_size); + if (ret < 0) { + bdrv_delete(bs); + return ret; + } + } + + ret = bdrv_pwrite(bs, header.image_filename_offset, + image_filename, header.image_filename_size); + if (ret < 0) { + bdrv_delete(bs); + return ret; + } + + ret = bdrv_open(bs, filename, BDRV_O_RDWR | BDRV_O_NO_FLUSH, drv); + if (ret < 0) { + bdrv_delete(bs); + return ret; + } + + ret = bdrv_truncate(bs, image_len); + bdrv_delete(bs); + return ret; +} + +static int add_cow_open(BlockDriverState *bs, int flags) +{ + char image_filename[ADD_COW_FILE_LEN]; + char tmp_name[ADD_COW_FILE_LEN]; + BlockDriver *image_drv = NULL; + int ret; + int sector_per_byte; + BDRVAddCowState *s = bs->opaque; + AddCowHeader le_header; + + ret = bdrv_pread(bs->file, 0, &le_header, sizeof(le_header)); + if (ret != sizeof(s->header)) { + goto fail; + } + + add_cow_header_le_to_cpu(&le_header, &s->header); + + if (le64_to_cpu(s->header.magic) != ADD_COW_MAGIC) { + ret = -EINVAL; + goto fail; + } + + if (s->header.version != ADD_COW_VERSION) { + char version[64]; + snprintf(version, sizeof(version), "ADD-COW version %d", + s->header.version); + qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE, + bs->device_name, "add-cow", version); + ret = -ENOTSUP; + goto fail; + } + + if (s->header.features & ~ADD_COW_FEATURE_MASK) { + char buf[64]; + snprintf(buf, sizeof(buf), "%" PRIx64, + s->header.features & ~ADD_COW_FEATURE_MASK); + qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE, + bs->device_name, "add-cow", buf); + return -ENOTSUP; + } + + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { + ret = bdrv_read_string(bs->file, sizeof(s->header), + sizeof(s->backing_file_format) - 1, s->backing_file_format, + sizeof(s->backing_file_format)); + if (ret < 0) { + goto fail; + } + } + + ret = 
bdrv_read_string(bs->file, + sizeof(s->header) + sizeof(s->image_file_format), + sizeof(s->image_file_format) - 1, s->image_file_format, + sizeof(s->image_file_format)); + if (ret < 0) { + goto fail; + } + + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { + ret = bdrv_read_string(bs->file, s->header.backing_filename_offset, + s->header.backing_filename_size, bs->backing_file, + sizeof(bs->backing_file)); + if (ret < 0) { + goto fail; + } + } + + ret = bdrv_read_string(bs->file, s->header.image_filename_offset, + s->header.image_filename_size, tmp_name, + sizeof(tmp_name)); + if (ret < 0) { + goto fail; + } + + s->image_hd = bdrv_new(""); + if (path_has_protocol(image_filename)) { + pstrcpy(image_filename, sizeof(image_filename), tmp_name); + } else { + path_combine(image_filename, sizeof(image_filename), + bs->filename, tmp_name); + } + + ret = bdrv_open(s->image_hd, image_filename, flags, image_drv); + if (ret < 0) { + bdrv_delete(s->image_hd); + goto fail; + } + + bs->total_sectors = bdrv_getlength(s->image_hd) >> 9; + s->cluster_size = ADD_COW_CLUSTER_SIZE; + sector_per_byte = SECTORS_PER_CLUSTER * 8; + s->bitmap_size = + (bs->total_sectors + sector_per_byte - 1) / sector_per_byte; + s->bitmap_cache = + block_cache_create(bs, ADD_COW_CACHE_SIZE, ADD_COW_CACHE_ENTRY_SIZE); + + qemu_co_mutex_init(&s->lock); + return 0; +fail: + if (s->bitmap_cache) { + block_cache_destroy(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP); + } + return ret; +} + +static void add_cow_close(BlockDriverState *bs) +{ + BDRVAddCowState *s = bs->opaque; + block_cache_destroy(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP); + bdrv_delete(s->image_hd); +} + +static bool is_allocated(BlockDriverState *bs, int64_t sector_num) +{ + BDRVAddCowState *s = bs->opaque; + BlockCache *c = s->bitmap_cache; + int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER; + uint8_t *table = NULL; + uint64_t offset = ADD_COW_PAGE_SIZE * s->header.header_pages_size + + (offset_in_bitmap(sector_num) & 
(~(c->entry_size - 1))); + int ret = block_cache_get(bs, s->bitmap_cache, offset, + (void **)&table, BLOCK_TABLE_BITMAP, ADD_COW_CACHE_ENTRY_SIZE); + + if (ret < 0) { + return ret; + } + return table[cluster_num / 8 % ADD_COW_CACHE_ENTRY_SIZE] + & (1 << (cluster_num % 8)); +} + +static coroutine_fn int add_cow_is_allocated(BlockDriverState *bs, + int64_t sector_num, int nb_sectors, int *num_same) +{ + BDRVAddCowState *s = bs->opaque; + int changed; + + if (nb_sectors == 0) { + *num_same = 0; + return 0; + } + + if (s->header.features & ADD_COW_F_All_ALLOCATED) { + *num_same = nb_sectors - 1; + return 1; + } + changed = is_allocated(bs, sector_num); + + for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) { + if (is_allocated(bs, sector_num + *num_same) != changed) { + break; + } + } + return changed; +} + +static int add_cow_backing_read(BlockDriverState *bs, QEMUIOVector *qiov, + int64_t sector_num, int nb_sectors) +{ + int n1; + if ((sector_num + nb_sectors) <= bs->total_sectors) { + return nb_sectors; + } + if (sector_num >= bs->total_sectors) { + n1 = 0; + } else { + n1 = bs->total_sectors - sector_num; + } + + qemu_iovec_memset(qiov, BDRV_SECTOR_SIZE * n1, + 0, BDRV_SECTOR_SIZE * (nb_sectors - n1)); + + return n1; +} + +static coroutine_fn int add_cow_co_readv(BlockDriverState *bs, + int64_t sector_num, int remaining_sectors, QEMUIOVector *qiov) +{ + BDRVAddCowState *s = bs->opaque; + int cur_nr_sectors; + uint64_t bytes_done = 0; + QEMUIOVector hd_qiov; + int n, n1, ret = 0; + + qemu_iovec_init(&hd_qiov, qiov->niov); + qemu_co_mutex_lock(&s->lock); + while (remaining_sectors != 0) { + cur_nr_sectors = remaining_sectors; + if (add_cow_is_allocated(bs, sector_num, cur_nr_sectors, &n)) { + cur_nr_sectors = n; + qemu_iovec_reset(&hd_qiov); + qemu_iovec_concat(&hd_qiov, qiov, bytes_done, + cur_nr_sectors * BDRV_SECTOR_SIZE); + qemu_co_mutex_unlock(&s->lock); + ret = bdrv_co_readv(s->image_hd, sector_num, n, &hd_qiov); + qemu_co_mutex_lock(&s->lock); + if 
(ret < 0) { + goto fail; + } + } else { + cur_nr_sectors = n; + if (bs->backing_hd) { + qemu_iovec_reset(&hd_qiov); + qemu_iovec_concat(&hd_qiov, qiov, bytes_done, + cur_nr_sectors * BDRV_SECTOR_SIZE); + n1 = add_cow_backing_read(bs->backing_hd, &hd_qiov, + sector_num, cur_nr_sectors); + if (n1 > 0) { + qemu_co_mutex_unlock(&s->lock); + ret = bdrv_co_readv(bs->backing_hd, sector_num, + n, &hd_qiov); + qemu_co_mutex_lock(&s->lock); + if (ret < 0) { + goto fail; + } + } + } else { + qemu_iovec_memset(&hd_qiov, 0, 0, + BDRV_SECTOR_SIZE * cur_nr_sectors); + } + } + remaining_sectors -= cur_nr_sectors; + sector_num += cur_nr_sectors; + bytes_done += cur_nr_sectors * BDRV_SECTOR_SIZE; + } +fail: + qemu_co_mutex_unlock(&s->lock); + qemu_iovec_destroy(&hd_qiov); + return ret; +} + +static int coroutine_fn copy_sectors(BlockDriverState *bs, + int n_start, int n_end) +{ + BDRVAddCowState *s = bs->opaque; + QEMUIOVector qiov; + struct iovec iov; + int n, ret; + + n = n_end - n_start; + if (n <= 0) { + return 0; + } + + iov.iov_len = n * BDRV_SECTOR_SIZE; + iov.iov_base = qemu_blockalign(bs, iov.iov_len); + + qemu_iovec_init_external(&qiov, &iov, 1); + + ret = bdrv_co_readv(bs->backing_hd, n_start, n, &qiov); + if (ret < 0) { + goto out; + } + ret = bdrv_co_writev(s->image_hd, n_start, n, &qiov); + if (ret < 0) { + goto out; + } + + ret = 0; +out: + qemu_vfree(iov.iov_base); + return ret; +} + +static coroutine_fn int add_cow_co_writev(BlockDriverState *bs, + int64_t sector_num, int remaining_sectors, QEMUIOVector *qiov) +{ + BDRVAddCowState *s = bs->opaque; + BlockCache *c = s->bitmap_cache; + int ret = 0, i; + QEMUIOVector hd_qiov; + uint8_t *table; + uint64_t offset; + + qemu_co_mutex_lock(&s->lock); + qemu_iovec_init(&hd_qiov, qiov->niov); + ret = bdrv_co_writev(s->image_hd, + sector_num, + remaining_sectors, qiov); + + if (ret < 0) { + goto fail; + } + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { + /* Copy content of unmodified sectors */ + if 
(!is_cluster_head(sector_num) && !is_allocated(bs, sector_num)) { + ret = copy_sectors(bs, sector_num & ~(SECTORS_PER_CLUSTER - 1), + sector_num); + if (ret < 0) { + goto fail; + } + } + + if (!is_cluster_tail(sector_num + remaining_sectors - 1) + && !is_allocated(bs, sector_num + remaining_sectors - 1)) { + ret = copy_sectors(bs, sector_num + remaining_sectors, + ((sector_num + remaining_sectors) | (SECTORS_PER_CLUSTER - 1)) + 1); + if (ret < 0) { + goto fail; + } + } + + for (i = sector_num / SECTORS_PER_CLUSTER; + i <= (sector_num + remaining_sectors - 1) / SECTORS_PER_CLUSTER; + i++) { + offset = ADD_COW_PAGE_SIZE * s->header.header_pages_size + + (offset_in_bitmap(i * SECTORS_PER_CLUSTER) & (~(c->entry_size - 1))); + ret = block_cache_get(bs, s->bitmap_cache, offset, + (void **)&table, BLOCK_TABLE_BITMAP, ADD_COW_CACHE_ENTRY_SIZE); + if (ret < 0) { + goto fail; + } + if ((table[i / 8] & (1 << (i % 8))) == 0) { + table[i / 8] |= (1 << (i % 8)); + block_cache_entry_mark_dirty(s->bitmap_cache, table); + } + } + } + ret = 0; +fail: + qemu_co_mutex_unlock(&s->lock); + qemu_iovec_destroy(&hd_qiov); + return ret; +} + +static int bdrv_add_cow_truncate(BlockDriverState *bs, int64_t size) +{ + BDRVAddCowState *s = bs->opaque; + int sector_per_byte = SECTORS_PER_CLUSTER * 8; + int ret; + uint32_t bitmap_pos = s->header.header_pages_size * ADD_COW_PAGE_SIZE; + int64_t bitmap_size = + (size / BDRV_SECTOR_SIZE + sector_per_byte - 1) / sector_per_byte; + bitmap_size = (bitmap_size + ADD_COW_CACHE_ENTRY_SIZE - 1) + & (~(ADD_COW_CACHE_ENTRY_SIZE - 1)); + + ret = bdrv_truncate(bs->file, bitmap_pos + bitmap_size); + if (ret < 0) { + return ret; + } + return 0; +} + +static coroutine_fn int add_cow_co_flush(BlockDriverState *bs) +{ + BDRVAddCowState *s = bs->opaque; + int ret; + + qemu_co_mutex_lock(&s->lock); + ret = block_cache_flush(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP, + ADD_COW_CACHE_ENTRY_SIZE); + qemu_co_mutex_unlock(&s->lock); + return ret; +} + +static 
QEMUOptionParameter add_cow_create_options[] = { + { + .name = BLOCK_OPT_SIZE, + .type = OPT_SIZE, + .help = "Virtual disk size" + }, + { + .name = BLOCK_OPT_BACKING_FILE, + .type = OPT_STRING, + .help = "File name of a base image" + }, + { + .name = BLOCK_OPT_BACKING_FMT, + .type = OPT_STRING, + .help = "Image format of the base image" + }, + { + .name = BLOCK_OPT_IMAGE_FILE, + .type = OPT_STRING, + .help = "File name of a image file" + }, + { + .name = BLOCK_OPT_IMAGE_FORMAT, + .type = OPT_STRING, + .help = "Image format of the image file" + }, + { NULL } +}; + +static BlockDriver bdrv_add_cow = { + .format_name = "add-cow", + .instance_size = sizeof(BDRVAddCowState), + .bdrv_probe = add_cow_probe, + .bdrv_open = add_cow_open, + .bdrv_close = add_cow_close, + .bdrv_create = add_cow_create, + .bdrv_co_readv = add_cow_co_readv, + .bdrv_co_writev = add_cow_co_writev, + .bdrv_truncate = bdrv_add_cow_truncate, + .bdrv_co_is_allocated = add_cow_is_allocated, + + .create_options = add_cow_create_options, + .bdrv_co_flush_to_os = add_cow_co_flush, +}; + +static void bdrv_add_cow_init(void) +{ + bdrv_register(&bdrv_add_cow); +} + +block_init(bdrv_add_cow_init); diff --git a/block/add-cow.h b/block/add-cow.h new file mode 100644 index 0000000..f058376 --- /dev/null +++ b/block/add-cow.h @@ -0,0 +1,85 @@ +/* + * QEMU ADD-COW Disk Format + * + * Copyright IBM, Corp. 2012 + * + * Authors: + * Dong Xu Wang <wdongxu@linux.vnet.ibm.com> + * + * This work is licensed under the terms of the GNU LGPL, version 2 or later. + * See the COPYING.LIB file in the top-level directory. 
+ * + */ + +#ifndef BLOCK_ADD_COW_H +#define BLOCK_ADD_COW_H +#include "block-cache.h" + +enum { + ADD_COW_F_All_ALLOCATED = 0X01, + ADD_COW_FEATURE_MASK = ADD_COW_F_All_ALLOCATED, + + ADD_COW_MAGIC = (((uint64_t)'A' << 56) | ((uint64_t)'D' << 48) | \ + ((uint64_t)'D' << 40) | ((uint64_t)'_' << 32) | \ + ((uint64_t)'C' << 24) | ((uint64_t)'O' << 16) | \ + ((uint64_t)'W' << 8) | 0xFF), + ADD_COW_VERSION = 1, + ADD_COW_FILE_LEN = 1024, + ADD_COW_CACHE_SIZE = 16, + ADD_COW_CACHE_ENTRY_SIZE = 65536, + ADD_COW_CLUSTER_SIZE = 65536, + SECTORS_PER_CLUSTER = (ADD_COW_CLUSTER_SIZE / BDRV_SECTOR_SIZE), + ADD_COW_PAGE_SIZE = 4096, + ADD_COW_DEFAULT_PAGE_SIZE = 1, +}; + +typedef struct AddCowHeader { + uint64_t magic; + uint32_t version; + + uint32_t backing_filename_offset; + uint32_t backing_filename_size; + + uint32_t image_filename_offset; + uint32_t image_filename_size; + + uint64_t features; + uint64_t optional_features; + uint32_t header_pages_size; +} QEMU_PACKED AddCowHeader; + +typedef struct BDRVAddCowState { + BlockDriverState *image_hd; + CoMutex lock; + int cluster_size; + BlockCache *bitmap_cache; + uint64_t bitmap_size; + AddCowHeader header; + char backing_file_format[16]; + char image_file_format[16]; +} BDRVAddCowState; + +/* Convert sector_num to offset in bitmap */ +static inline int64_t offset_in_bitmap(int64_t sector_num) +{ + int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER; + return cluster_num / 8; +} + +static inline bool is_cluster_head(int64_t sector_num) +{ + return sector_num % SECTORS_PER_CLUSTER == 0; +} + +static inline bool is_cluster_tail(int64_t sector_num) +{ + return (sector_num + 1) % SECTORS_PER_CLUSTER == 0; +} + +BlockCache *add_cow_cache_create(BlockDriverState *bs, int num_tables); +int add_cow_cache_destroy(BlockDriverState *bs, BlockCache *c); +void add_cow_cache_entry_mark_dirty(BlockCache *c, void *table); +int add_cow_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset, + void **table); +int 
add_cow_cache_flush(BlockDriverState *bs, BlockCache *c); +#endif diff --git a/block_int.h b/block_int.h index 6c1d9ca..67954ec 100644 --- a/block_int.h +++ b/block_int.h @@ -53,6 +53,8 @@ #define BLOCK_OPT_SUBFMT "subformat" #define BLOCK_OPT_COMPAT_LEVEL "compat" #define BLOCK_OPT_LAZY_REFCOUNTS "lazy_refcounts" +#define BLOCK_OPT_IMAGE_FILE "image_file" +#define BLOCK_OPT_IMAGE_FORMAT "image_format" typedef struct BdrvTrackedRequest BdrvTrackedRequest; -- 1.7.1
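For readers following the cluster arithmetic in add_cow_co_writev() and bdrv_add_cow_truncate() above, here is a minimal standalone sketch of the same calculations. It is not code from the patch. The names SectorRange, bitmap_bytes(), head_copy_range() and tail_copy_range() are hypothetical, the constants copy the values in add-cow.h, and SECTOR_SIZE assumes BDRV_SECTOR_SIZE is 512 (consistent with the ">> 9" in add_cow_open()). The is_allocated() checks are left out, and tail_copy_range() derives the boundary from the last written sector, whereas the patch uses the sector after the write and relies on the is_cluster_tail() guard.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SECTOR_SIZE         = 512,     /* assumed value of BDRV_SECTOR_SIZE */
    CLUSTER_SIZE        = 65536,   /* ADD_COW_CLUSTER_SIZE in the patch */
    CACHE_ENTRY_SIZE    = 65536,   /* ADD_COW_CACHE_ENTRY_SIZE in the patch */
    SECTORS_PER_CLUSTER = CLUSTER_SIZE / SECTOR_SIZE,   /* 128 */
};

/* Half-open sector range [start, end); start == end means nothing to copy.
 * Hypothetical type, used only in this sketch. */
typedef struct SectorRange {
    int64_t start;
    int64_t end;
} SectorRange;

/* Bitmap bytes for a virtual disk of `size` bytes: one bit per cluster,
 * rounded up to a whole number of cache entries (a multiple of 65536). */
static int64_t bitmap_bytes(int64_t size)
{
    int64_t sectors_per_byte = (int64_t)SECTORS_PER_CLUSTER * 8;
    int64_t bytes;

    assert(size >= 0);
    bytes = (size / SECTOR_SIZE + sectors_per_byte - 1) / sectors_per_byte;
    return (bytes + CACHE_ENTRY_SIZE - 1) & ~(int64_t)(CACHE_ENTRY_SIZE - 1);
}

/* Sectors of the first touched cluster that lie before the write. */
static SectorRange head_copy_range(int64_t sector_num)
{
    SectorRange r;

    r.start = sector_num & ~(int64_t)(SECTORS_PER_CLUSTER - 1);
    r.end = sector_num;
    return r;
}

/* Sectors of the last touched cluster that lie after the write. */
static SectorRange tail_copy_range(int64_t sector_num, int64_t nb_sectors)
{
    SectorRange r;
    int64_t last;

    assert(nb_sectors > 0);
    last = sector_num + nb_sectors - 1;
    r.start = last + 1;
    r.end = (last | (SECTORS_PER_CLUSTER - 1)) + 1;
    return r;
}
```

In this sketch a write that starts on a cluster boundary gets an empty head range, and a write that ends on the last sector of a cluster gets an empty tail range, so only partially covered clusters need data from the backing file.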
* Re: [Qemu-devel] [PATCH V12 5/6] add-cow file format 2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 5/6] add-cow file format Dong Xu Wang @ 2012-09-06 20:19 ` Michael Roth 2012-09-10 2:25 ` Dong Xu Wang 2012-09-11 9:40 ` Kevin Wolf 1 sibling, 1 reply; 25+ messages in thread From: Michael Roth @ 2012-09-06 20:19 UTC (permalink / raw) To: Dong Xu Wang; +Cc: kwolf, qemu-devel On Fri, Aug 10, 2012 at 11:39:44PM +0800, Dong Xu Wang wrote: > add-cow file format core code. It use block-cache.c as cache code. > > Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> > --- > block/Makefile.objs | 1 + > block/add-cow.c | 613 +++++++++++++++++++++++++++++++++++++++++++++++++++ > block/add-cow.h | 85 +++++++ > block_int.h | 2 + > 4 files changed, 701 insertions(+), 0 deletions(-) > create mode 100644 block/add-cow.c > create mode 100644 block/add-cow.h > > diff --git a/block/Makefile.objs b/block/Makefile.objs > index 23bdfc8..7ed5051 100644 > --- a/block/Makefile.objs > +++ b/block/Makefile.objs > @@ -2,6 +2,7 @@ block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat > block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o > block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o > block-obj-y += qed-check.o > +block-obj-y += add-cow.o > block-obj-y += block-cache.o > block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o > block-obj-y += stream.o > diff --git a/block/add-cow.c b/block/add-cow.c > new file mode 100644 > index 0000000..d4711d5 > --- /dev/null > +++ b/block/add-cow.c > @@ -0,0 +1,613 @@ > +/* > + * QEMU ADD-COW Disk Format > + * > + * Copyright IBM, Corp. 2012 > + * > + * Authors: > + * Dong Xu Wang <wdongxu@linux.vnet.ibm.com> > + * > + * This work is licensed under the terms of the GNU LGPL, version 2 or later. > + * See the COPYING.LIB file in the top-level directory. 
> + * > + */ > + > +#include "qemu-common.h" > +#include "block_int.h" > +#include "module.h" > +#include "add-cow.h" > + > +static void add_cow_header_le_to_cpu(const AddCowHeader *le, AddCowHeader *cpu) > +{ > + cpu->magic = le64_to_cpu(le->magic); > + cpu->version = le32_to_cpu(le->version); > + > + cpu->backing_filename_offset = le32_to_cpu(le->backing_filename_offset); > + cpu->backing_filename_size = le32_to_cpu(le->backing_filename_size); > + > + cpu->image_filename_offset = le32_to_cpu(le->image_filename_offset); > + cpu->image_filename_size = le32_to_cpu(le->image_filename_size); > + > + cpu->features = le64_to_cpu(le->features); > + cpu->optional_features = le64_to_cpu(le->optional_features); > + cpu->header_pages_size = le32_to_cpu(le->header_pages_size); > +} > + > +static void add_cow_header_cpu_to_le(const AddCowHeader *cpu, AddCowHeader *le) > +{ > + le->magic = cpu_to_le64(cpu->magic); > + le->version = cpu_to_le32(cpu->version); > + > + le->backing_filename_offset = cpu_to_le32(cpu->backing_filename_offset); > + le->backing_filename_size = cpu_to_le32(cpu->backing_filename_size); > + > + le->image_filename_offset = cpu_to_le32(cpu->image_filename_offset); > + le->image_filename_size = cpu_to_le32(cpu->image_filename_size); > + > + le->features = cpu_to_le64(cpu->features); > + le->optional_features = cpu_to_le64(cpu->optional_features); > + le->header_pages_size = cpu_to_le32(cpu->header_pages_size); > +} > + > +static int add_cow_probe(const uint8_t *buf, int buf_size, const char *filename) > +{ > + const AddCowHeader *header = (const AddCowHeader *)buf; > + > + if (le64_to_cpu(header->magic) == ADD_COW_MAGIC && > + le32_to_cpu(header->version) == ADD_COW_VERSION) { > + return 100; > + } else { > + return 0; > + } > +} > + > +static int add_cow_create(const char *filename, QEMUOptionParameter *options) > +{ > + AddCowHeader header = { > + .magic = ADD_COW_MAGIC, > + .version = ADD_COW_VERSION, > + .features = 0, > + .optional_features = 0, > + 
.header_pages_size = ADD_COW_DEFAULT_PAGE_SIZE, > + }; > + AddCowHeader le_header; > + int64_t image_len = 0; > + const char *backing_filename = NULL; > + const char *backing_fmt = NULL; > + const char *image_filename = NULL; > + const char *image_format = NULL; > + BlockDriverState *bs, *image_bs = NULL, *backing_bs = NULL; > + BlockDriver *drv = bdrv_find_format("add-cow"); > + BDRVAddCowState s; > + int ret; > + > + while (options && options->name) { > + if (!strcmp(options->name, BLOCK_OPT_SIZE)) { > + image_len = options->value.n; > + } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) { > + backing_filename = options->value.s; > + } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FMT)) { > + backing_fmt = options->value.s; > + } else if (!strcmp(options->name, BLOCK_OPT_IMAGE_FILE)) { > + image_filename = options->value.s; > + } else if (!strcmp(options->name, BLOCK_OPT_IMAGE_FORMAT)) { > + image_format = options->value.s; > + } > + options++; > + } > + > + if (backing_filename) { > + header.backing_filename_offset = sizeof(header) > + + sizeof(s.backing_file_format) + sizeof(s.image_file_format); > + header.backing_filename_size = strlen(backing_filename); > + > + if (!backing_fmt) { > + backing_bs = bdrv_new("image"); > + ret = bdrv_open(backing_bs, backing_filename, BDRV_O_RDWR > + | BDRV_O_CACHE_WB, NULL); > + if (ret < 0) { > + return ret; > + } > + backing_fmt = bdrv_get_format_name(backing_bs); > + bdrv_delete(backing_bs); > + } > + } else { > + header.features |= ADD_COW_F_All_ALLOCATED; > + } > + > + if (image_filename) { > + header.image_filename_offset = > + sizeof(header) + sizeof(s.backing_file_format) > + + sizeof(s.image_file_format) + header.backing_filename_size; > + header.image_filename_size = strlen(image_filename); > + } else { > + error_report("Error: image_file should be given."); > + return -EINVAL; > + } > + > + if (backing_filename && !strcmp(backing_filename, image_filename)) { > + error_report("Error: Trying to create an 
image with the " > + "same backing file name as the image file name"); > + return -EINVAL; > + } > + > + if (!strcmp(filename, image_filename)) { > + error_report("Error: Trying to create an image with the " > + "same filename as the image file name"); > + return -EINVAL; > + } > + > + if (header.image_filename_offset + header.image_filename_size > + > ADD_COW_PAGE_SIZE * ADD_COW_DEFAULT_PAGE_SIZE) { > + error_report("image_file name or backing_file name too long."); > + return -ENOSPC; > + } > + > + ret = bdrv_file_open(&image_bs, image_filename, BDRV_O_RDWR); > + if (ret < 0) { > + return ret; > + } > + bdrv_delete(image_bs); > + > + ret = bdrv_create_file(filename, NULL); > + if (ret < 0) { > + return ret; > + } > + > + ret = bdrv_file_open(&bs, filename, BDRV_O_RDWR); > + if (ret < 0) { > + return ret; > + } > + add_cow_header_cpu_to_le(&header, &le_header); > + ret = bdrv_pwrite(bs, 0, &le_header, sizeof(le_header)); > + if (ret < 0) { > + bdrv_delete(bs); > + return ret; > + } > + > + ret = bdrv_pwrite(bs, sizeof(le_header), backing_fmt ? backing_fmt : "", > + backing_fmt ? strlen(backing_fmt) : 0); > + if (ret < 0) { > + bdrv_delete(bs); > + return ret; > + } > + > + ret = bdrv_pwrite(bs, sizeof(le_header) + sizeof(s.backing_file_format), > + image_format ? image_format : "raw", > + image_format ? 
strlen(image_format) : sizeof("raw")); > + if (ret < 0) { > + bdrv_delete(bs); > + return ret; > + } > + > + if (backing_filename) { > + ret = bdrv_pwrite(bs, header.backing_filename_offset, > + backing_filename, header.backing_filename_size); > + if (ret < 0) { > + bdrv_delete(bs); > + return ret; > + } > + } > + > + ret = bdrv_pwrite(bs, header.image_filename_offset, > + image_filename, header.image_filename_size); > + if (ret < 0) { > + bdrv_delete(bs); > + return ret; > + } > + > + ret = bdrv_open(bs, filename, BDRV_O_RDWR | BDRV_O_NO_FLUSH, drv); > + if (ret < 0) { > + bdrv_delete(bs); > + return ret; > + } > + > + ret = bdrv_truncate(bs, image_len); > + bdrv_delete(bs); > + return ret; > +} > + > +static int add_cow_open(BlockDriverState *bs, int flags) > +{ > + char image_filename[ADD_COW_FILE_LEN]; > + char tmp_name[ADD_COW_FILE_LEN]; > + BlockDriver *image_drv = NULL; > + int ret; > + int sector_per_byte; > + BDRVAddCowState *s = bs->opaque; > + AddCowHeader le_header; > + > + ret = bdrv_pread(bs->file, 0, &le_header, sizeof(le_header)); > + if (ret != sizeof(s->header)) { > + goto fail; > + } > + > + add_cow_header_le_to_cpu(&le_header, &s->header); > + > + if (le64_to_cpu(s->header.magic) != ADD_COW_MAGIC) { > + ret = -EINVAL; > + goto fail; > + } > + > + if (s->header.version != ADD_COW_VERSION) { > + char version[64]; > + snprintf(version, sizeof(version), "ADD-COW version %d", > + s->header.version); > + qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE, > + bs->device_name, "add-cow", version); > + ret = -ENOTSUP; > + goto fail; > + } > + > + if (s->header.features & ~ADD_COW_FEATURE_MASK) { > + char buf[64]; > + snprintf(buf, sizeof(buf), "%" PRIx64, > + s->header.features & ~ADD_COW_FEATURE_MASK); > + qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE, > + bs->device_name, "add-cow", buf); > + return -ENOTSUP; > + } > + > + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { > + ret = bdrv_read_string(bs->file, sizeof(s->header), > + 
sizeof(s->backing_file_format) - 1, s->backing_file_format, > + sizeof(s->backing_file_format)); > + if (ret < 0) { > + goto fail; > + } > + } > + > + ret = bdrv_read_string(bs->file, > + sizeof(s->header) + sizeof(s->image_file_format), > + sizeof(s->image_file_format) - 1, s->image_file_format, > + sizeof(s->image_file_format)); > + if (ret < 0) { > + goto fail; > + } > + > + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { > + ret = bdrv_read_string(bs->file, s->header.backing_filename_offset, > + s->header.backing_filename_size, bs->backing_file, > + sizeof(bs->backing_file)); > + if (ret < 0) { > + goto fail; > + } > + } > + > + ret = bdrv_read_string(bs->file, s->header.image_filename_offset, > + s->header.image_filename_size, tmp_name, > + sizeof(tmp_name)); > + if (ret < 0) { > + goto fail; > + } > + > + s->image_hd = bdrv_new(""); > + if (path_has_protocol(image_filename)) { > + pstrcpy(image_filename, sizeof(image_filename), tmp_name); > + } else { > + path_combine(image_filename, sizeof(image_filename), > + bs->filename, tmp_name); > + } > + > + ret = bdrv_open(s->image_hd, image_filename, flags, image_drv); > + if (ret < 0) { > + bdrv_delete(s->image_hd); > + goto fail; > + } > + > + bs->total_sectors = bdrv_getlength(s->image_hd) >> 9; > + s->cluster_size = ADD_COW_CLUSTER_SIZE; > + sector_per_byte = SECTORS_PER_CLUSTER * 8; > + s->bitmap_size = > + (bs->total_sectors + sector_per_byte - 1) / sector_per_byte; > + s->bitmap_cache = > + block_cache_create(bs, ADD_COW_CACHE_SIZE, ADD_COW_CACHE_ENTRY_SIZE); > + > + qemu_co_mutex_init(&s->lock); > + return 0; > +fail: > + if (s->bitmap_cache) { > + block_cache_destroy(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP); > + } > + return ret; > +} > + > +static void add_cow_close(BlockDriverState *bs) > +{ > + BDRVAddCowState *s = bs->opaque; > + block_cache_destroy(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP); > + bdrv_delete(s->image_hd); > +} > + > +static bool is_allocated(BlockDriverState *bs, int64_t 
sector_num) > +{ > + BDRVAddCowState *s = bs->opaque; > + BlockCache *c = s->bitmap_cache; > + int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER; > + uint8_t *table = NULL; > + uint64_t offset = ADD_COW_PAGE_SIZE * s->header.header_pages_size > + + (offset_in_bitmap(sector_num) & (~(c->entry_size - 1))); > + int ret = block_cache_get(bs, s->bitmap_cache, offset, > + (void **)&table, BLOCK_TABLE_BITMAP, ADD_COW_CACHE_ENTRY_SIZE); > + > + if (ret < 0) { > + return ret; > + } > + return table[cluster_num / 8 % ADD_COW_CACHE_ENTRY_SIZE] > + & (1 << (cluster_num % 8)); > +} > + > +static coroutine_fn int add_cow_is_allocated(BlockDriverState *bs, > + int64_t sector_num, int nb_sectors, int *num_same) > +{ > + BDRVAddCowState *s = bs->opaque; > + int changed; > + > + if (nb_sectors == 0) { > + *num_same = 0; > + return 0; > + } > + > + if (s->header.features & ADD_COW_F_All_ALLOCATED) { > + *num_same = nb_sectors - 1; > + return 1; > + } > + changed = is_allocated(bs, sector_num); > + > + for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) { > + if (is_allocated(bs, sector_num + *num_same) != changed) { > + break; > + } > + } > + return changed; > +} > + > +static int add_cow_backing_read(BlockDriverState *bs, QEMUIOVector *qiov, > + int64_t sector_num, int nb_sectors) > +{ > + int n1; > + if ((sector_num + nb_sectors) <= bs->total_sectors) { > + return nb_sectors; > + } > + if (sector_num >= bs->total_sectors) { > + n1 = 0; > + } else { > + n1 = bs->total_sectors - sector_num; > + } > + > + qemu_iovec_memset(qiov, BDRV_SECTOR_SIZE * n1, > + 0, BDRV_SECTOR_SIZE * (nb_sectors - n1)); > + > + return n1; > +} > + > +static coroutine_fn int add_cow_co_readv(BlockDriverState *bs, > + int64_t sector_num, int remaining_sectors, QEMUIOVector *qiov) > +{ > + BDRVAddCowState *s = bs->opaque; > + int cur_nr_sectors; > + uint64_t bytes_done = 0; > + QEMUIOVector hd_qiov; > + int n, n1, ret = 0; > + > + qemu_iovec_init(&hd_qiov, qiov->niov); > + 
qemu_co_mutex_lock(&s->lock); > + while (remaining_sectors != 0) { > + cur_nr_sectors = remaining_sectors; > + if (add_cow_is_allocated(bs, sector_num, cur_nr_sectors, &n)) { > + cur_nr_sectors = n; > + qemu_iovec_reset(&hd_qiov); > + qemu_iovec_concat(&hd_qiov, qiov, bytes_done, > + cur_nr_sectors * BDRV_SECTOR_SIZE); > + qemu_co_mutex_unlock(&s->lock); > + ret = bdrv_co_readv(s->image_hd, sector_num, n, &hd_qiov); > + qemu_co_mutex_lock(&s->lock); > + if (ret < 0) { > + goto fail; > + } > + } else { > + cur_nr_sectors = n; > + if (bs->backing_hd) { > + qemu_iovec_reset(&hd_qiov); > + qemu_iovec_concat(&hd_qiov, qiov, bytes_done, > + cur_nr_sectors * BDRV_SECTOR_SIZE); > + n1 = add_cow_backing_read(bs->backing_hd, &hd_qiov, > + sector_num, cur_nr_sectors); > + if (n1 > 0) { > + qemu_co_mutex_unlock(&s->lock); > + ret = bdrv_co_readv(bs->backing_hd, sector_num, > + n, &hd_qiov); > + qemu_co_mutex_lock(&s->lock); > + if (ret < 0) { > + goto fail; > + } > + } > + } else { > + qemu_iovec_memset(&hd_qiov, 0, 0, > + BDRV_SECTOR_SIZE * cur_nr_sectors); > + } > + } > + remaining_sectors -= cur_nr_sectors; > + sector_num += cur_nr_sectors; > + bytes_done += cur_nr_sectors * BDRV_SECTOR_SIZE; > + } > +fail: > + qemu_co_mutex_unlock(&s->lock); > + qemu_iovec_destroy(&hd_qiov); > + return ret; > +} > + > +static int coroutine_fn copy_sectors(BlockDriverState *bs, > + int n_start, int n_end) > +{ > + BDRVAddCowState *s = bs->opaque; > + QEMUIOVector qiov; > + struct iovec iov; > + int n, ret; > + > + n = n_end - n_start; > + if (n <= 0) { > + return 0; > + } > + > + iov.iov_len = n * BDRV_SECTOR_SIZE; > + iov.iov_base = qemu_blockalign(bs, iov.iov_len); > + > + qemu_iovec_init_external(&qiov, &iov, 1); > + > + ret = bdrv_co_readv(bs->backing_hd, n_start, n, &qiov); > + if (ret < 0) { > + goto out; > + } > + ret = bdrv_co_writev(s->image_hd, n_start, n, &qiov); > + if (ret < 0) { > + goto out; > + } > + > + ret = 0; > +out: > + qemu_vfree(iov.iov_base); > + return ret; > +} > + 
> +static coroutine_fn int add_cow_co_writev(BlockDriverState *bs, > + int64_t sector_num, int remaining_sectors, QEMUIOVector *qiov) > +{ > + BDRVAddCowState *s = bs->opaque; > + BlockCache *c = s->bitmap_cache; > + int ret = 0, i; > + QEMUIOVector hd_qiov; > + uint8_t *table; > + uint64_t offset; > + > + qemu_co_mutex_lock(&s->lock); > + qemu_iovec_init(&hd_qiov, qiov->niov); > + ret = bdrv_co_writev(s->image_hd, > + sector_num, > + remaining_sectors, qiov); alignment ^ or even at ^ if you prefer and have done in some places, just need to be consistent about it for better readability. > + > + if (ret < 0) { > + goto fail; > + } > + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { > + /* Copy content of unmodified sectors */ > + if (!is_cluster_head(sector_num) && !is_allocated(bs, sector_num)) { Why do we avoid a COW when writing to the first sector of a cluster? > + ret = copy_sectors(bs, sector_num & ~(SECTORS_PER_CLUSTER - 1), > + sector_num); > + if (ret < 0) { > + goto fail; > + } > + } > + > + if (!is_cluster_tail(sector_num + remaining_sectors - 1) > + && !is_allocated(bs, sector_num + remaining_sectors - 1)) { > + ret = copy_sectors(bs, sector_num + remaining_sectors, > + ((sector_num + remaining_sectors) | (SECTORS_PER_CLUSTER - 1)) + 1); > + if (ret < 0) { > + goto fail; > + } > + } > + > + for (i = sector_num / SECTORS_PER_CLUSTER; > + i <= (sector_num + remaining_sectors - 1) / SECTORS_PER_CLUSTER; > + i++) { > + offset = ADD_COW_PAGE_SIZE * s->header.header_pages_size > + + (offset_in_bitmap(i * SECTORS_PER_CLUSTER) & (~(c->entry_size - 1))); > + ret = block_cache_get(bs, s->bitmap_cache, offset, > + (void **)&table, BLOCK_TABLE_BITMAP, ADD_COW_CACHE_ENTRY_SIZE); > + if (ret < 0) { > + goto fail; > + } > + if ((table[i / 8] & (1 << (i % 8))) == 0) { > + table[i / 8] |= (1 << (i % 8)); > + block_cache_entry_mark_dirty(s->bitmap_cache, table); > + } > + } > + } > + ret = 0; > +fail: > + qemu_co_mutex_unlock(&s->lock); > + 
qemu_iovec_destroy(&hd_qiov); > + return ret; > +} > + > +static int bdrv_add_cow_truncate(BlockDriverState *bs, int64_t size) > +{ > + BDRVAddCowState *s = bs->opaque; > + int sector_per_byte = SECTORS_PER_CLUSTER * 8; > + int ret; > + uint32_t bitmap_pos = s->header.header_pages_size * ADD_COW_PAGE_SIZE; > + int64_t bitmap_size = > + (size / BDRV_SECTOR_SIZE + sector_per_byte - 1) / sector_per_byte; > + bitmap_size = (bitmap_size + ADD_COW_CACHE_ENTRY_SIZE - 1) > + & (~(ADD_COW_CACHE_ENTRY_SIZE - 1)); > + > + ret = bdrv_truncate(bs->file, bitmap_pos + bitmap_size); > + if (ret < 0) { > + return ret; > + } > + return 0; > +} > + > +static coroutine_fn int add_cow_co_flush(BlockDriverState *bs) > +{ > + BDRVAddCowState *s = bs->opaque; > + int ret; > + > + qemu_co_mutex_lock(&s->lock); > + ret = block_cache_flush(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP, > + ADD_COW_CACHE_ENTRY_SIZE); > + qemu_co_mutex_unlock(&s->lock); > + return ret; > +} > + > +static QEMUOptionParameter add_cow_create_options[] = { > + { > + .name = BLOCK_OPT_SIZE, > + .type = OPT_SIZE, > + .help = "Virtual disk size" > + }, > + { > + .name = BLOCK_OPT_BACKING_FILE, > + .type = OPT_STRING, > + .help = "File name of a base image" > + }, > + { > + .name = BLOCK_OPT_BACKING_FMT, > + .type = OPT_STRING, > + .help = "Image format of the base image" > + }, > + { > + .name = BLOCK_OPT_IMAGE_FILE, > + .type = OPT_STRING, > + .help = "File name of a image file" > + }, > + { > + .name = BLOCK_OPT_IMAGE_FORMAT, > + .type = OPT_STRING, > + .help = "Image format of the image file" > + }, > + { NULL } > +}; > + > +static BlockDriver bdrv_add_cow = { > + .format_name = "add-cow", > + .instance_size = sizeof(BDRVAddCowState), > + .bdrv_probe = add_cow_probe, > + .bdrv_open = add_cow_open, > + .bdrv_close = add_cow_close, > + .bdrv_create = add_cow_create, > + .bdrv_co_readv = add_cow_co_readv, > + .bdrv_co_writev = add_cow_co_writev, > + .bdrv_truncate = bdrv_add_cow_truncate, > + .bdrv_co_is_allocated = 
add_cow_is_allocated, > + > + .create_options = add_cow_create_options, > + .bdrv_co_flush_to_os = add_cow_co_flush, > +}; > + > +static void bdrv_add_cow_init(void) > +{ > + bdrv_register(&bdrv_add_cow); > +} > + > +block_init(bdrv_add_cow_init); > diff --git a/block/add-cow.h b/block/add-cow.h > new file mode 100644 > index 0000000..f058376 > --- /dev/null > +++ b/block/add-cow.h > @@ -0,0 +1,85 @@ > +/* > + * QEMU ADD-COW Disk Format > + * > + * Copyright IBM, Corp. 2012 > + * > + * Authors: > + * Dong Xu Wang <wdongxu@linux.vnet.ibm.com> > + * > + * This work is licensed under the terms of the GNU LGPL, version 2 or later. > + * See the COPYING.LIB file in the top-level directory. > + * > + */ > + > +#ifndef BLOCK_ADD_COW_H > +#define BLOCK_ADD_COW_H > +#include "block-cache.h" > + > +enum { > + ADD_COW_F_All_ALLOCATED = 0X01, Please use "ADD_COW_F_ALL_ALLOCATED" (all caps) was searching your patch for how this was used and was scratching my head when I wasn't seeing any matches :) > + ADD_COW_FEATURE_MASK = ADD_COW_F_All_ALLOCATED, > + > + ADD_COW_MAGIC = (((uint64_t)'A' << 56) | ((uint64_t)'D' << 48) | \ > + ((uint64_t)'D' << 40) | ((uint64_t)'_' << 32) | \ > + ((uint64_t)'C' << 24) | ((uint64_t)'O' << 16) | \ > + ((uint64_t)'W' << 8) | 0xFF), > + ADD_COW_VERSION = 1, > + ADD_COW_FILE_LEN = 1024, > + ADD_COW_CACHE_SIZE = 16, > + ADD_COW_CACHE_ENTRY_SIZE = 65536, > + ADD_COW_CLUSTER_SIZE = 65536, > + SECTORS_PER_CLUSTER = (ADD_COW_CLUSTER_SIZE / BDRV_SECTOR_SIZE), > + ADD_COW_PAGE_SIZE = 4096, > + ADD_COW_DEFAULT_PAGE_SIZE = 1, > +}; > + > +typedef struct AddCowHeader { > + uint64_t magic; > + uint32_t version; > + > + uint32_t backing_filename_offset; > + uint32_t backing_filename_size; > + > + uint32_t image_filename_offset; > + uint32_t image_filename_size; > + > + uint64_t features; > + uint64_t optional_features; > + uint32_t header_pages_size; > +} QEMU_PACKED AddCowHeader; You should avoid using packed structures for image format headers. 
Instead, I would either: a) re-order the fields so that 32/64-bit fields, respectively, fall on 32/64-bit boundaries (in your case, for instance, moving header_pages_size above features) like qed/qcow2 do, or b) read/write the fields individually rather than reading/writing directly into/from the header struct. The safest route is b). Adds a few lines of code, but you won't have to re-work things (or worry about introducing bugs) later if you were to add, say, a 32-bit value, and then a 64-bit value later. > + > +typedef struct BDRVAddCowState { > + BlockDriverState *image_hd; > + CoMutex lock; > + int cluster_size; > + BlockCache *bitmap_cache; > + uint64_t bitmap_size; > + AddCowHeader header; > + char backing_file_format[16]; > + char image_file_format[16]; > +} BDRVAddCowState; > + > +/* Convert sector_num to offset in bitmap */ > +static inline int64_t offset_in_bitmap(int64_t sector_num) > +{ > + int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER; > + return cluster_num / 8; > +} > + > +static inline bool is_cluster_head(int64_t sector_num) > +{ > + return sector_num % SECTORS_PER_CLUSTER == 0; > +} > + > +static inline bool is_cluster_tail(int64_t sector_num) > +{ > + return (sector_num + 1) % SECTORS_PER_CLUSTER == 0; > +} > + > +BlockCache *add_cow_cache_create(BlockDriverState *bs, int num_tables); > +int add_cow_cache_destroy(BlockDriverState *bs, BlockCache *c); > +void add_cow_cache_entry_mark_dirty(BlockCache *c, void *table); > +int add_cow_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset, > + void **table); > +int add_cow_cache_flush(BlockDriverState *bs, BlockCache *c); > +#endif > diff --git a/block_int.h b/block_int.h > index 6c1d9ca..67954ec 100644 > --- a/block_int.h > +++ b/block_int.h > @@ -53,6 +53,8 @@ > #define BLOCK_OPT_SUBFMT "subformat" > #define BLOCK_OPT_COMPAT_LEVEL "compat" > #define BLOCK_OPT_LAZY_REFCOUNTS "lazy_refcounts" > +#define BLOCK_OPT_IMAGE_FILE "image_file" > +#define BLOCK_OPT_IMAGE_FORMAT 
"image_format" > > typedef struct BdrvTrackedRequest BdrvTrackedRequest; > > -- > 1.7.1 > > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH V12 5/6] add-cow file format 2012-09-06 20:19 ` Michael Roth @ 2012-09-10 2:25 ` Dong Xu Wang 2012-09-11 9:44 ` Kevin Wolf 0 siblings, 1 reply; 25+ messages in thread From: Dong Xu Wang @ 2012-09-10 2:25 UTC (permalink / raw) To: Michael Roth; +Cc: kwolf, qemu-devel On Fri, Sep 7, 2012 at 4:19 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote: > On Fri, Aug 10, 2012 at 11:39:44PM +0800, Dong Xu Wang wrote: >> add-cow file format core code. It use block-cache.c as cache code. >> >> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> >> --- >> block/Makefile.objs | 1 + >> block/add-cow.c | 613 +++++++++++++++++++++++++++++++++++++++++++++++++++ >> block/add-cow.h | 85 +++++++ >> block_int.h | 2 + >> 4 files changed, 701 insertions(+), 0 deletions(-) >> create mode 100644 block/add-cow.c >> create mode 100644 block/add-cow.h >> >> diff --git a/block/Makefile.objs b/block/Makefile.objs >> index 23bdfc8..7ed5051 100644 >> --- a/block/Makefile.objs >> +++ b/block/Makefile.objs >> @@ -2,6 +2,7 @@ block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat >> block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o >> block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o >> block-obj-y += qed-check.o >> +block-obj-y += add-cow.o >> block-obj-y += block-cache.o >> block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o >> block-obj-y += stream.o >> diff --git a/block/add-cow.c b/block/add-cow.c >> new file mode 100644 >> index 0000000..d4711d5 >> --- /dev/null >> +++ b/block/add-cow.c >> @@ -0,0 +1,613 @@ >> +/* >> + * QEMU ADD-COW Disk Format >> + * >> + * Copyright IBM, Corp. 2012 >> + * >> + * Authors: >> + * Dong Xu Wang <wdongxu@linux.vnet.ibm.com> >> + * >> + * This work is licensed under the terms of the GNU LGPL, version 2 or later. >> + * See the COPYING.LIB file in the top-level directory. 
>> + * >> + */ >> + >> +#include "qemu-common.h" >> +#include "block_int.h" >> +#include "module.h" >> +#include "add-cow.h" >> + >> +static void add_cow_header_le_to_cpu(const AddCowHeader *le, AddCowHeader *cpu) >> +{ >> + cpu->magic = le64_to_cpu(le->magic); >> + cpu->version = le32_to_cpu(le->version); >> + >> + cpu->backing_filename_offset = le32_to_cpu(le->backing_filename_offset); >> + cpu->backing_filename_size = le32_to_cpu(le->backing_filename_size); >> + >> + cpu->image_filename_offset = le32_to_cpu(le->image_filename_offset); >> + cpu->image_filename_size = le32_to_cpu(le->image_filename_size); >> + >> + cpu->features = le64_to_cpu(le->features); >> + cpu->optional_features = le64_to_cpu(le->optional_features); >> + cpu->header_pages_size = le32_to_cpu(le->header_pages_size); >> +} >> + >> +static void add_cow_header_cpu_to_le(const AddCowHeader *cpu, AddCowHeader *le) >> +{ >> + le->magic = cpu_to_le64(cpu->magic); >> + le->version = cpu_to_le32(cpu->version); >> + >> + le->backing_filename_offset = cpu_to_le32(cpu->backing_filename_offset); >> + le->backing_filename_size = cpu_to_le32(cpu->backing_filename_size); >> + >> + le->image_filename_offset = cpu_to_le32(cpu->image_filename_offset); >> + le->image_filename_size = cpu_to_le32(cpu->image_filename_size); >> + >> + le->features = cpu_to_le64(cpu->features); >> + le->optional_features = cpu_to_le64(cpu->optional_features); >> + le->header_pages_size = cpu_to_le32(cpu->header_pages_size); >> +} >> + >> +static int add_cow_probe(const uint8_t *buf, int buf_size, const char *filename) >> +{ >> + const AddCowHeader *header = (const AddCowHeader *)buf; >> + >> + if (le64_to_cpu(header->magic) == ADD_COW_MAGIC && >> + le32_to_cpu(header->version) == ADD_COW_VERSION) { >> + return 100; >> + } else { >> + return 0; >> + } >> +} >> + >> +static int add_cow_create(const char *filename, QEMUOptionParameter *options) >> +{ >> + AddCowHeader header = { >> + .magic = ADD_COW_MAGIC, >> + .version = 
ADD_COW_VERSION, >> + .features = 0, >> + .optional_features = 0, >> + .header_pages_size = ADD_COW_DEFAULT_PAGE_SIZE, >> + }; >> + AddCowHeader le_header; >> + int64_t image_len = 0; >> + const char *backing_filename = NULL; >> + const char *backing_fmt = NULL; >> + const char *image_filename = NULL; >> + const char *image_format = NULL; >> + BlockDriverState *bs, *image_bs = NULL, *backing_bs = NULL; >> + BlockDriver *drv = bdrv_find_format("add-cow"); >> + BDRVAddCowState s; >> + int ret; >> + >> + while (options && options->name) { >> + if (!strcmp(options->name, BLOCK_OPT_SIZE)) { >> + image_len = options->value.n; >> + } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) { >> + backing_filename = options->value.s; >> + } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FMT)) { >> + backing_fmt = options->value.s; >> + } else if (!strcmp(options->name, BLOCK_OPT_IMAGE_FILE)) { >> + image_filename = options->value.s; >> + } else if (!strcmp(options->name, BLOCK_OPT_IMAGE_FORMAT)) { >> + image_format = options->value.s; >> + } >> + options++; >> + } >> + >> + if (backing_filename) { >> + header.backing_filename_offset = sizeof(header) >> + + sizeof(s.backing_file_format) + sizeof(s.image_file_format); >> + header.backing_filename_size = strlen(backing_filename); >> + >> + if (!backing_fmt) { >> + backing_bs = bdrv_new("image"); >> + ret = bdrv_open(backing_bs, backing_filename, BDRV_O_RDWR >> + | BDRV_O_CACHE_WB, NULL); >> + if (ret < 0) { >> + return ret; >> + } >> + backing_fmt = bdrv_get_format_name(backing_bs); >> + bdrv_delete(backing_bs); >> + } >> + } else { >> + header.features |= ADD_COW_F_All_ALLOCATED; >> + } >> + >> + if (image_filename) { >> + header.image_filename_offset = >> + sizeof(header) + sizeof(s.backing_file_format) >> + + sizeof(s.image_file_format) + header.backing_filename_size; >> + header.image_filename_size = strlen(image_filename); >> + } else { >> + error_report("Error: image_file should be given."); >> + return -EINVAL; >> + 
} >> + >> + if (backing_filename && !strcmp(backing_filename, image_filename)) { >> + error_report("Error: Trying to create an image with the " >> + "same backing file name as the image file name"); >> + return -EINVAL; >> + } >> + >> + if (!strcmp(filename, image_filename)) { >> + error_report("Error: Trying to create an image with the " >> + "same filename as the image file name"); >> + return -EINVAL; >> + } >> + >> + if (header.image_filename_offset + header.image_filename_size >> + > ADD_COW_PAGE_SIZE * ADD_COW_DEFAULT_PAGE_SIZE) { >> + error_report("image_file name or backing_file name too long."); >> + return -ENOSPC; >> + } >> + >> + ret = bdrv_file_open(&image_bs, image_filename, BDRV_O_RDWR); >> + if (ret < 0) { >> + return ret; >> + } >> + bdrv_delete(image_bs); >> + >> + ret = bdrv_create_file(filename, NULL); >> + if (ret < 0) { >> + return ret; >> + } >> + >> + ret = bdrv_file_open(&bs, filename, BDRV_O_RDWR); >> + if (ret < 0) { >> + return ret; >> + } >> + add_cow_header_cpu_to_le(&header, &le_header); >> + ret = bdrv_pwrite(bs, 0, &le_header, sizeof(le_header)); >> + if (ret < 0) { >> + bdrv_delete(bs); >> + return ret; >> + } >> + >> + ret = bdrv_pwrite(bs, sizeof(le_header), backing_fmt ? backing_fmt : "", >> + backing_fmt ? strlen(backing_fmt) : 0); >> + if (ret < 0) { >> + bdrv_delete(bs); >> + return ret; >> + } >> + >> + ret = bdrv_pwrite(bs, sizeof(le_header) + sizeof(s.backing_file_format), >> + image_format ? image_format : "raw", >> + image_format ? 
strlen(image_format) : sizeof("raw")); >> + if (ret < 0) { >> + bdrv_delete(bs); >> + return ret; >> + } >> + >> + if (backing_filename) { >> + ret = bdrv_pwrite(bs, header.backing_filename_offset, >> + backing_filename, header.backing_filename_size); >> + if (ret < 0) { >> + bdrv_delete(bs); >> + return ret; >> + } >> + } >> + >> + ret = bdrv_pwrite(bs, header.image_filename_offset, >> + image_filename, header.image_filename_size); >> + if (ret < 0) { >> + bdrv_delete(bs); >> + return ret; >> + } >> + >> + ret = bdrv_open(bs, filename, BDRV_O_RDWR | BDRV_O_NO_FLUSH, drv); >> + if (ret < 0) { >> + bdrv_delete(bs); >> + return ret; >> + } >> + >> + ret = bdrv_truncate(bs, image_len); >> + bdrv_delete(bs); >> + return ret; >> +} >> + >> +static int add_cow_open(BlockDriverState *bs, int flags) >> +{ >> + char image_filename[ADD_COW_FILE_LEN]; >> + char tmp_name[ADD_COW_FILE_LEN]; >> + BlockDriver *image_drv = NULL; >> + int ret; >> + int sector_per_byte; >> + BDRVAddCowState *s = bs->opaque; >> + AddCowHeader le_header; >> + >> + ret = bdrv_pread(bs->file, 0, &le_header, sizeof(le_header)); >> + if (ret != sizeof(s->header)) { >> + goto fail; >> + } >> + >> + add_cow_header_le_to_cpu(&le_header, &s->header); >> + >> + if (le64_to_cpu(s->header.magic) != ADD_COW_MAGIC) { >> + ret = -EINVAL; >> + goto fail; >> + } >> + >> + if (s->header.version != ADD_COW_VERSION) { >> + char version[64]; >> + snprintf(version, sizeof(version), "ADD-COW version %d", >> + s->header.version); >> + qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE, >> + bs->device_name, "add-cow", version); >> + ret = -ENOTSUP; >> + goto fail; >> + } >> + >> + if (s->header.features & ~ADD_COW_FEATURE_MASK) { >> + char buf[64]; >> + snprintf(buf, sizeof(buf), "%" PRIx64, >> + s->header.features & ~ADD_COW_FEATURE_MASK); >> + qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE, >> + bs->device_name, "add-cow", buf); >> + return -ENOTSUP; >> + } >> + >> + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) 
{ >> + ret = bdrv_read_string(bs->file, sizeof(s->header), >> + sizeof(s->backing_file_format) - 1, s->backing_file_format, >> + sizeof(s->backing_file_format)); >> + if (ret < 0) { >> + goto fail; >> + } >> + } >> + >> + ret = bdrv_read_string(bs->file, >> + sizeof(s->header) + sizeof(s->image_file_format), >> + sizeof(s->image_file_format) - 1, s->image_file_format, >> + sizeof(s->image_file_format)); >> + if (ret < 0) { >> + goto fail; >> + } >> + >> + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { >> + ret = bdrv_read_string(bs->file, s->header.backing_filename_offset, >> + s->header.backing_filename_size, bs->backing_file, >> + sizeof(bs->backing_file)); >> + if (ret < 0) { >> + goto fail; >> + } >> + } >> + >> + ret = bdrv_read_string(bs->file, s->header.image_filename_offset, >> + s->header.image_filename_size, tmp_name, >> + sizeof(tmp_name)); >> + if (ret < 0) { >> + goto fail; >> + } >> + >> + s->image_hd = bdrv_new(""); >> + if (path_has_protocol(image_filename)) { >> + pstrcpy(image_filename, sizeof(image_filename), tmp_name); >> + } else { >> + path_combine(image_filename, sizeof(image_filename), >> + bs->filename, tmp_name); >> + } >> + >> + ret = bdrv_open(s->image_hd, image_filename, flags, image_drv); >> + if (ret < 0) { >> + bdrv_delete(s->image_hd); >> + goto fail; >> + } >> + >> + bs->total_sectors = bdrv_getlength(s->image_hd) >> 9; >> + s->cluster_size = ADD_COW_CLUSTER_SIZE; >> + sector_per_byte = SECTORS_PER_CLUSTER * 8; >> + s->bitmap_size = >> + (bs->total_sectors + sector_per_byte - 1) / sector_per_byte; >> + s->bitmap_cache = >> + block_cache_create(bs, ADD_COW_CACHE_SIZE, ADD_COW_CACHE_ENTRY_SIZE); >> + >> + qemu_co_mutex_init(&s->lock); >> + return 0; >> +fail: >> + if (s->bitmap_cache) { >> + block_cache_destroy(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP); >> + } >> + return ret; >> +} >> + >> +static void add_cow_close(BlockDriverState *bs) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + block_cache_destroy(bs, 
s->bitmap_cache, BLOCK_TABLE_BITMAP); >> + bdrv_delete(s->image_hd); >> +} >> + >> +static bool is_allocated(BlockDriverState *bs, int64_t sector_num) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + BlockCache *c = s->bitmap_cache; >> + int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER; >> + uint8_t *table = NULL; >> + uint64_t offset = ADD_COW_PAGE_SIZE * s->header.header_pages_size >> + + (offset_in_bitmap(sector_num) & (~(c->entry_size - 1))); >> + int ret = block_cache_get(bs, s->bitmap_cache, offset, >> + (void **)&table, BLOCK_TABLE_BITMAP, ADD_COW_CACHE_ENTRY_SIZE); >> + >> + if (ret < 0) { >> + return ret; >> + } >> + return table[cluster_num / 8 % ADD_COW_CACHE_ENTRY_SIZE] >> + & (1 << (cluster_num % 8)); >> +} >> + >> +static coroutine_fn int add_cow_is_allocated(BlockDriverState *bs, >> + int64_t sector_num, int nb_sectors, int *num_same) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + int changed; >> + >> + if (nb_sectors == 0) { >> + *num_same = 0; >> + return 0; >> + } >> + >> + if (s->header.features & ADD_COW_F_All_ALLOCATED) { >> + *num_same = nb_sectors - 1; >> + return 1; >> + } >> + changed = is_allocated(bs, sector_num); >> + >> + for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) { >> + if (is_allocated(bs, sector_num + *num_same) != changed) { >> + break; >> + } >> + } >> + return changed; >> +} >> + >> +static int add_cow_backing_read(BlockDriverState *bs, QEMUIOVector *qiov, >> + int64_t sector_num, int nb_sectors) >> +{ >> + int n1; >> + if ((sector_num + nb_sectors) <= bs->total_sectors) { >> + return nb_sectors; >> + } >> + if (sector_num >= bs->total_sectors) { >> + n1 = 0; >> + } else { >> + n1 = bs->total_sectors - sector_num; >> + } >> + >> + qemu_iovec_memset(qiov, BDRV_SECTOR_SIZE * n1, >> + 0, BDRV_SECTOR_SIZE * (nb_sectors - n1)); >> + >> + return n1; >> +} >> + >> +static coroutine_fn int add_cow_co_readv(BlockDriverState *bs, >> + int64_t sector_num, int remaining_sectors, QEMUIOVector *qiov) >> +{ >> + 
BDRVAddCowState *s = bs->opaque; >> + int cur_nr_sectors; >> + uint64_t bytes_done = 0; >> + QEMUIOVector hd_qiov; >> + int n, n1, ret = 0; >> + >> + qemu_iovec_init(&hd_qiov, qiov->niov); >> + qemu_co_mutex_lock(&s->lock); >> + while (remaining_sectors != 0) { >> + cur_nr_sectors = remaining_sectors; >> + if (add_cow_is_allocated(bs, sector_num, cur_nr_sectors, &n)) { >> + cur_nr_sectors = n; >> + qemu_iovec_reset(&hd_qiov); >> + qemu_iovec_concat(&hd_qiov, qiov, bytes_done, >> + cur_nr_sectors * BDRV_SECTOR_SIZE); >> + qemu_co_mutex_unlock(&s->lock); >> + ret = bdrv_co_readv(s->image_hd, sector_num, n, &hd_qiov); >> + qemu_co_mutex_lock(&s->lock); >> + if (ret < 0) { >> + goto fail; >> + } >> + } else { >> + cur_nr_sectors = n; >> + if (bs->backing_hd) { >> + qemu_iovec_reset(&hd_qiov); >> + qemu_iovec_concat(&hd_qiov, qiov, bytes_done, >> + cur_nr_sectors * BDRV_SECTOR_SIZE); >> + n1 = add_cow_backing_read(bs->backing_hd, &hd_qiov, >> + sector_num, cur_nr_sectors); >> + if (n1 > 0) { >> + qemu_co_mutex_unlock(&s->lock); >> + ret = bdrv_co_readv(bs->backing_hd, sector_num, >> + n, &hd_qiov); >> + qemu_co_mutex_lock(&s->lock); >> + if (ret < 0) { >> + goto fail; >> + } >> + } >> + } else { >> + qemu_iovec_memset(&hd_qiov, 0, 0, >> + BDRV_SECTOR_SIZE * cur_nr_sectors); >> + } >> + } >> + remaining_sectors -= cur_nr_sectors; >> + sector_num += cur_nr_sectors; >> + bytes_done += cur_nr_sectors * BDRV_SECTOR_SIZE; >> + } >> +fail: >> + qemu_co_mutex_unlock(&s->lock); >> + qemu_iovec_destroy(&hd_qiov); >> + return ret; >> +} >> + >> +static int coroutine_fn copy_sectors(BlockDriverState *bs, >> + int n_start, int n_end) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + QEMUIOVector qiov; >> + struct iovec iov; >> + int n, ret; >> + >> + n = n_end - n_start; >> + if (n <= 0) { >> + return 0; >> + } >> + >> + iov.iov_len = n * BDRV_SECTOR_SIZE; >> + iov.iov_base = qemu_blockalign(bs, iov.iov_len); >> + >> + qemu_iovec_init_external(&qiov, &iov, 1); >> + >> + ret = 
bdrv_co_readv(bs->backing_hd, n_start, n, &qiov); >> + if (ret < 0) { >> + goto out; >> + } >> + ret = bdrv_co_writev(s->image_hd, n_start, n, &qiov); >> + if (ret < 0) { >> + goto out; >> + } >> + >> + ret = 0; >> +out: >> + qemu_vfree(iov.iov_base); >> + return ret; >> +} >> + >> +static coroutine_fn int add_cow_co_writev(BlockDriverState *bs, >> + int64_t sector_num, int remaining_sectors, QEMUIOVector *qiov) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + BlockCache *c = s->bitmap_cache; >> + int ret = 0, i; >> + QEMUIOVector hd_qiov; >> + uint8_t *table; >> + uint64_t offset; >> + >> + qemu_co_mutex_lock(&s->lock); >> + qemu_iovec_init(&hd_qiov, qiov->niov); >> + ret = bdrv_co_writev(s->image_hd, >> + sector_num, >> + remaining_sectors, qiov); > > alignment ^ > > or even at ^ if you prefer and have done in some places, just need to be > consistent about it for better readability. > >> + >> + if (ret < 0) { >> + goto fail; >> + } >> + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { >> + /* Copy content of unmodified sectors */ >> + if (!is_cluster_head(sector_num) && !is_allocated(bs, sector_num)) { > > Why do we avoid a COW when writing to the first sector of a cluster? Because if the write starts at the first sector of a cluster, there is nothing before it in that cluster to preserve, so we need not use copy_sectors; writing the data directly is enough. 
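The head/tail reasoning in this exchange can be sketched as below. This is only an illustration, not the patch's code: `CowRange`, `cow_head`, `cow_tail` and `SK_SECTORS_PER_CLUSTER` are hypothetical names, and the sketch assumes 512-byte sectors and 64 KiB clusters as in the quoted add-cow.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64 KiB clusters of 512-byte sectors (128 sectors per cluster). */
enum { SK_SECTORS_PER_CLUSTER = 65536 / 512 };

/* Hypothetical helper type: half-open sector range [start, end). */
typedef struct CowRange {
    int64_t start;
    int64_t end;
} CowRange;

/* Sectors that precede the write inside its first cluster.
 * The range is empty when the write starts on a cluster boundary. */
static CowRange cow_head(int64_t sector_num)
{
    CowRange r;

    assert(sector_num >= 0);
    r.start = sector_num - sector_num % SK_SECTORS_PER_CLUSTER;
    r.end = sector_num;
    return r;
}

/* Sectors that follow the write inside its last cluster.
 * The range is empty when the write ends on a cluster boundary. */
static CowRange cow_tail(int64_t sector_num, int nb_sectors)
{
    CowRange r;
    int64_t end;

    assert(sector_num >= 0 && nb_sectors > 0);
    end = sector_num + nb_sectors;          /* first sector past the write */
    r.start = end;
    r.end = (end + SK_SECTORS_PER_CLUSTER - 1)
            / SK_SECTORS_PER_CLUSTER * SK_SECTORS_PER_CLUSTER;
    return r;
}
```

An empty range means nothing has to be copied from the backing file on that side, which matches the answer above for a write that starts at a cluster head.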
> >> + ret = copy_sectors(bs, sector_num & ~(SECTORS_PER_CLUSTER - 1), >> + sector_num); >> + if (ret < 0) { >> + goto fail; >> + } >> + } >> + >> + if (!is_cluster_tail(sector_num + remaining_sectors - 1) >> + && !is_allocated(bs, sector_num + remaining_sectors - 1)) { >> + ret = copy_sectors(bs, sector_num + remaining_sectors, >> + ((sector_num + remaining_sectors) | (SECTORS_PER_CLUSTER - 1)) + 1); >> + if (ret < 0) { >> + goto fail; >> + } >> + } >> + >> + for (i = sector_num / SECTORS_PER_CLUSTER; >> + i <= (sector_num + remaining_sectors - 1) / SECTORS_PER_CLUSTER; >> + i++) { >> + offset = ADD_COW_PAGE_SIZE * s->header.header_pages_size >> + + (offset_in_bitmap(i * SECTORS_PER_CLUSTER) & (~(c->entry_size - 1))); >> + ret = block_cache_get(bs, s->bitmap_cache, offset, >> + (void **)&table, BLOCK_TABLE_BITMAP, ADD_COW_CACHE_ENTRY_SIZE); >> + if (ret < 0) { >> + goto fail; >> + } >> + if ((table[i / 8] & (1 << (i % 8))) == 0) { >> + table[i / 8] |= (1 << (i % 8)); >> + block_cache_entry_mark_dirty(s->bitmap_cache, table); >> + } >> + } >> + } >> + ret = 0; >> +fail: >> + qemu_co_mutex_unlock(&s->lock); >> + qemu_iovec_destroy(&hd_qiov); >> + return ret; >> +} >> + >> +static int bdrv_add_cow_truncate(BlockDriverState *bs, int64_t size) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + int sector_per_byte = SECTORS_PER_CLUSTER * 8; >> + int ret; >> + uint32_t bitmap_pos = s->header.header_pages_size * ADD_COW_PAGE_SIZE; >> + int64_t bitmap_size = >> + (size / BDRV_SECTOR_SIZE + sector_per_byte - 1) / sector_per_byte; >> + bitmap_size = (bitmap_size + ADD_COW_CACHE_ENTRY_SIZE - 1) >> + & (~(ADD_COW_CACHE_ENTRY_SIZE - 1)); >> + >> + ret = bdrv_truncate(bs->file, bitmap_pos + bitmap_size); >> + if (ret < 0) { >> + return ret; >> + } >> + return 0; >> +} >> + >> +static coroutine_fn int add_cow_co_flush(BlockDriverState *bs) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + int ret; >> + >> + qemu_co_mutex_lock(&s->lock); >> + ret = block_cache_flush(bs, 
s->bitmap_cache, BLOCK_TABLE_BITMAP, >> + ADD_COW_CACHE_ENTRY_SIZE); >> + qemu_co_mutex_unlock(&s->lock); >> + return ret; >> +} >> + >> +static QEMUOptionParameter add_cow_create_options[] = { >> + { >> + .name = BLOCK_OPT_SIZE, >> + .type = OPT_SIZE, >> + .help = "Virtual disk size" >> + }, >> + { >> + .name = BLOCK_OPT_BACKING_FILE, >> + .type = OPT_STRING, >> + .help = "File name of a base image" >> + }, >> + { >> + .name = BLOCK_OPT_BACKING_FMT, >> + .type = OPT_STRING, >> + .help = "Image format of the base image" >> + }, >> + { >> + .name = BLOCK_OPT_IMAGE_FILE, >> + .type = OPT_STRING, >> + .help = "File name of a image file" >> + }, >> + { >> + .name = BLOCK_OPT_IMAGE_FORMAT, >> + .type = OPT_STRING, >> + .help = "Image format of the image file" >> + }, >> + { NULL } >> +}; >> + >> +static BlockDriver bdrv_add_cow = { >> + .format_name = "add-cow", >> + .instance_size = sizeof(BDRVAddCowState), >> + .bdrv_probe = add_cow_probe, >> + .bdrv_open = add_cow_open, >> + .bdrv_close = add_cow_close, >> + .bdrv_create = add_cow_create, >> + .bdrv_co_readv = add_cow_co_readv, >> + .bdrv_co_writev = add_cow_co_writev, >> + .bdrv_truncate = bdrv_add_cow_truncate, >> + .bdrv_co_is_allocated = add_cow_is_allocated, >> + >> + .create_options = add_cow_create_options, >> + .bdrv_co_flush_to_os = add_cow_co_flush, >> +}; >> + >> +static void bdrv_add_cow_init(void) >> +{ >> + bdrv_register(&bdrv_add_cow); >> +} >> + >> +block_init(bdrv_add_cow_init); >> diff --git a/block/add-cow.h b/block/add-cow.h >> new file mode 100644 >> index 0000000..f058376 >> --- /dev/null >> +++ b/block/add-cow.h >> @@ -0,0 +1,85 @@ >> +/* >> + * QEMU ADD-COW Disk Format >> + * >> + * Copyright IBM, Corp. 2012 >> + * >> + * Authors: >> + * Dong Xu Wang <wdongxu@linux.vnet.ibm.com> >> + * >> + * This work is licensed under the terms of the GNU LGPL, version 2 or later. >> + * See the COPYING.LIB file in the top-level directory. 
>> + * >> + */ >> + >> +#ifndef BLOCK_ADD_COW_H >> +#define BLOCK_ADD_COW_H >> +#include "block-cache.h" >> + >> +enum { >> + ADD_COW_F_All_ALLOCATED = 0X01, > > Please use "ADD_COW_F_ALL_ALLOCATED" (all caps) Okay. > > was searching your patch for how this was used and was scratching my > head when I wasn't seeing any matches :) It will be used in a case such as: qemu-img create -f add-cow -o image_file=t.raw t.add-cow where we no longer need to read from backing_file. > >> + ADD_COW_FEATURE_MASK = ADD_COW_F_All_ALLOCATED, >> + >> + ADD_COW_MAGIC = (((uint64_t)'A' << 56) | ((uint64_t)'D' << 48) | \ >> + ((uint64_t)'D' << 40) | ((uint64_t)'_' << 32) | \ >> + ((uint64_t)'C' << 24) | ((uint64_t)'O' << 16) | \ >> + ((uint64_t)'W' << 8) | 0xFF), >> + ADD_COW_VERSION = 1, >> + ADD_COW_FILE_LEN = 1024, >> + ADD_COW_CACHE_SIZE = 16, >> + ADD_COW_CACHE_ENTRY_SIZE = 65536, >> + ADD_COW_CLUSTER_SIZE = 65536, >> + SECTORS_PER_CLUSTER = (ADD_COW_CLUSTER_SIZE / BDRV_SECTOR_SIZE), >> + ADD_COW_PAGE_SIZE = 4096, >> + ADD_COW_DEFAULT_PAGE_SIZE = 1, >> +}; >> + >> +typedef struct AddCowHeader { >> + uint64_t magic; >> + uint32_t version; >> + >> + uint32_t backing_filename_offset; >> + uint32_t backing_filename_size; >> + >> + uint32_t image_filename_offset; >> + uint32_t image_filename_size; >> + >> + uint64_t features; >> + uint64_t optional_features; >> + uint32_t header_pages_size; >> +} QEMU_PACKED AddCowHeader; > > You should avoid using packed structures for image format headers. > Instead, I would either: > > a) re-order the fields so that 32/64-bit fields, respectively, fall on > 32/64-bit boundaries (in your case, for instance, moving header_pages_size > above features) like qed/qcow2 do, or > > b) read/write the fields individually rather than reading/writing directly > into/from the header struct. > > The safest route is b). 
Adds a few lines of code, but you won't have to > re-work things (or worry about introducing bugs) later if you were to add, > say, a 32-bit value, and then a 64-bit value later. Well, Kevin's suggestion was to use PACKED, so ... > >> + >> +typedef struct BDRVAddCowState { >> + BlockDriverState *image_hd; >> + CoMutex lock; >> + int cluster_size; >> + BlockCache *bitmap_cache; >> + uint64_t bitmap_size; >> + AddCowHeader header; >> + char backing_file_format[16]; >> + char image_file_format[16]; >> +} BDRVAddCowState; >> + >> +/* Convert sector_num to offset in bitmap */ >> +static inline int64_t offset_in_bitmap(int64_t sector_num) >> +{ >> + int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER; >> + return cluster_num / 8; >> +} >> + >> +static inline bool is_cluster_head(int64_t sector_num) >> +{ >> + return sector_num % SECTORS_PER_CLUSTER == 0; >> +} >> + >> +static inline bool is_cluster_tail(int64_t sector_num) >> +{ >> + return (sector_num + 1) % SECTORS_PER_CLUSTER == 0; >> +} >> + >> +BlockCache *add_cow_cache_create(BlockDriverState *bs, int num_tables); >> +int add_cow_cache_destroy(BlockDriverState *bs, BlockCache *c); >> +void add_cow_cache_entry_mark_dirty(BlockCache *c, void *table); >> +int add_cow_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset, >> + void **table); >> +int add_cow_cache_flush(BlockDriverState *bs, BlockCache *c); >> +#endif >> diff --git a/block_int.h b/block_int.h >> index 6c1d9ca..67954ec 100644 >> --- a/block_int.h >> +++ b/block_int.h >> @@ -53,6 +53,8 @@ >> #define BLOCK_OPT_SUBFMT "subformat" >> #define BLOCK_OPT_COMPAT_LEVEL "compat" >> #define BLOCK_OPT_LAZY_REFCOUNTS "lazy_refcounts" >> +#define BLOCK_OPT_IMAGE_FILE "image_file" >> +#define BLOCK_OPT_IMAGE_FORMAT "image_format" >> >> typedef struct BdrvTrackedRequest BdrvTrackedRequest; >> >> -- >> 1.7.1 >> >> >
* Re: [Qemu-devel] [PATCH V12 5/6] add-cow file format 2012-09-10 2:25 ` Dong Xu Wang @ 2012-09-11 9:44 ` Kevin Wolf 0 siblings, 0 replies; 25+ messages in thread From: Kevin Wolf @ 2012-09-11 9:44 UTC (permalink / raw) To: Dong Xu Wang; +Cc: Michael Roth, qemu-devel On 10.09.2012 04:25, Dong Xu Wang wrote: > On Fri, Sep 7, 2012 at 4:19 AM, Michael Roth <mdroth@linux.vnet.ibm.com> wrote: >> On Fri, Aug 10, 2012 at 11:39:44PM +0800, Dong Xu Wang wrote: >>> +typedef struct AddCowHeader { >>> + uint64_t magic; >>> + uint32_t version; >>> + >>> + uint32_t backing_filename_offset; >>> + uint32_t backing_filename_size; >>> + >>> + uint32_t image_filename_offset; >>> + uint32_t image_filename_size; >>> + >>> + uint64_t features; >>> + uint64_t optional_features; >>> + uint32_t header_pages_size; >>> +} QEMU_PACKED AddCowHeader; >> >> You should avoid using packed structures for image format headers. >> Instead, I would either: >> >> a) re-order the fields so that 32/64-bit fields, respectively, fall on >> 32/64-bit boundaries (in your case, for instance, moving header_pages_size >> above features) like qed/qcow2 do, or >> >> b) read/write the fields individually rather than reading/writing directly >> into/from the header struct. >> >> The safest route is b). Adds a few lines of code, but you won't have to >> re-work things (or worry about introducing bugs) later if you were to add, >> say, a 32-bit value, and then a 64-bit value later. > > Well, Kevin's suggestion was to use PACKED, so ... Yes, I think QEMU_PACKED is fine, and it's the safest version. It would be nice to additionally do Michael's option a) if you like, but I don't think the header is accessed too often, so the optimisation isn't that important. Kevin
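To make the layout discussion concrete, here is a minimal sketch comparing the quoted field order under packing with Michael's option a). It assumes a GCC/Clang-style `__attribute__((packed))` (which is what QEMU_PACKED is taken to expand to on those compilers) and a C11 compiler; `PackedHeader` and `ReorderedHeader` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order as the quoted AddCowHeader, packed so that the compiler
 * inserts no padding (assumption: GCC/Clang attribute syntax). */
typedef struct PackedHeader {
    uint64_t magic;
    uint32_t version;
    uint32_t backing_filename_offset;
    uint32_t backing_filename_size;
    uint32_t image_filename_offset;
    uint32_t image_filename_size;
    uint64_t features;              /* lands on offset 28, not 8-byte aligned */
    uint64_t optional_features;
    uint32_t header_pages_size;
} __attribute__((packed)) PackedHeader;

/* Option a): header_pages_size moved above features, so every 64-bit field
 * falls on an 8-byte boundary without any packing attribute. */
typedef struct ReorderedHeader {
    uint64_t magic;
    uint32_t version;
    uint32_t backing_filename_offset;
    uint32_t backing_filename_size;
    uint32_t image_filename_offset;
    uint32_t image_filename_size;
    uint32_t header_pages_size;
    uint64_t features;              /* offset 32, naturally aligned */
    uint64_t optional_features;
} ReorderedHeader;

/* Both layouts occupy 48 bytes; only the position of 'features' differs. */
static_assert(sizeof(PackedHeader) == 48, "packed header is 48 bytes");
static_assert(offsetof(PackedHeader, features) == 28, "unaligned 64-bit field");
static_assert(sizeof(ReorderedHeader) == 48, "reordered header is 48 bytes");
static_assert(offsetof(ReorderedHeader, features) == 32, "aligned 64-bit field");
```

With the original field order and no packing attribute, the offset of `features` would depend on how the target ABI aligns `uint64_t`, which is the portability concern the packing (or the reordering) addresses.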
* Re: [Qemu-devel] [PATCH V12 5/6] add-cow file format 2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 5/6] add-cow file format Dong Xu Wang 2012-09-06 20:19 ` Michael Roth @ 2012-09-11 9:40 ` Kevin Wolf 2012-09-12 7:28 ` Dong Xu Wang 1 sibling, 1 reply; 25+ messages in thread From: Kevin Wolf @ 2012-09-11 9:40 UTC (permalink / raw) To: Dong Xu Wang; +Cc: qemu-devel On 10.08.2012 17:39, Dong Xu Wang wrote: > add-cow file format core code. It use block-cache.c as cache code. > > Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com> > --- > block/Makefile.objs | 1 + > block/add-cow.c | 613 +++++++++++++++++++++++++++++++++++++++++++++++++++ > block/add-cow.h | 85 +++++++ > block_int.h | 2 + > 4 files changed, 701 insertions(+), 0 deletions(-) > create mode 100644 block/add-cow.c > create mode 100644 block/add-cow.h > > diff --git a/block/Makefile.objs b/block/Makefile.objs > index 23bdfc8..7ed5051 100644 > --- a/block/Makefile.objs > +++ b/block/Makefile.objs > @@ -2,6 +2,7 @@ block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat > block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o > block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o > block-obj-y += qed-check.o > +block-obj-y += add-cow.o > block-obj-y += block-cache.o > block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o > block-obj-y += stream.o > diff --git a/block/add-cow.c b/block/add-cow.c > new file mode 100644 > index 0000000..d4711d5 > --- /dev/null > +++ b/block/add-cow.c > @@ -0,0 +1,613 @@ > +/* > + * QEMU ADD-COW Disk Format > + * > + * Copyright IBM, Corp. 2012 > + * > + * Authors: > + * Dong Xu Wang <wdongxu@linux.vnet.ibm.com> > + * > + * This work is licensed under the terms of the GNU LGPL, version 2 or later. > + * See the COPYING.LIB file in the top-level directory. 
> + * > + */ > + > +#include "qemu-common.h" > +#include "block_int.h" > +#include "module.h" > +#include "add-cow.h" > + > +static void add_cow_header_le_to_cpu(const AddCowHeader *le, AddCowHeader *cpu) > +{ > + cpu->magic = le64_to_cpu(le->magic); > + cpu->version = le32_to_cpu(le->version); > + > + cpu->backing_filename_offset = le32_to_cpu(le->backing_filename_offset); > + cpu->backing_filename_size = le32_to_cpu(le->backing_filename_size); > + > + cpu->image_filename_offset = le32_to_cpu(le->image_filename_offset); > + cpu->image_filename_size = le32_to_cpu(le->image_filename_size); > + > + cpu->features = le64_to_cpu(le->features); > + cpu->optional_features = le64_to_cpu(le->optional_features); > + cpu->header_pages_size = le32_to_cpu(le->header_pages_size); > +} > + > +static void add_cow_header_cpu_to_le(const AddCowHeader *cpu, AddCowHeader *le) > +{ > + le->magic = cpu_to_le64(cpu->magic); > + le->version = cpu_to_le32(cpu->version); > + > + le->backing_filename_offset = cpu_to_le32(cpu->backing_filename_offset); > + le->backing_filename_size = cpu_to_le32(cpu->backing_filename_size); > + > + le->image_filename_offset = cpu_to_le32(cpu->image_filename_offset); > + le->image_filename_size = cpu_to_le32(cpu->image_filename_size); > + > + le->features = cpu_to_le64(cpu->features); > + le->optional_features = cpu_to_le64(cpu->optional_features); > + le->header_pages_size = cpu_to_le32(cpu->header_pages_size); > +} > + > +static int add_cow_probe(const uint8_t *buf, int buf_size, const char *filename) > +{ > + const AddCowHeader *header = (const AddCowHeader *)buf; > + > + if (le64_to_cpu(header->magic) == ADD_COW_MAGIC && > + le32_to_cpu(header->version) == ADD_COW_VERSION) { > + return 100; > + } else { > + return 0; > + } > +} > + > +static int add_cow_create(const char *filename, QEMUOptionParameter *options) > +{ > + AddCowHeader header = { > + .magic = ADD_COW_MAGIC, > + .version = ADD_COW_VERSION, > + .features = 0, > + .optional_features = 0, > + 
.header_pages_size = ADD_COW_DEFAULT_PAGE_SIZE, > + }; > + AddCowHeader le_header; > + int64_t image_len = 0; > + const char *backing_filename = NULL; > + const char *backing_fmt = NULL; > + const char *image_filename = NULL; > + const char *image_format = NULL; > + BlockDriverState *bs, *image_bs = NULL, *backing_bs = NULL; > + BlockDriver *drv = bdrv_find_format("add-cow"); > + BDRVAddCowState s; > + int ret; > + > + while (options && options->name) { > + if (!strcmp(options->name, BLOCK_OPT_SIZE)) { > + image_len = options->value.n; > + } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) { > + backing_filename = options->value.s; > + } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FMT)) { > + backing_fmt = options->value.s; > + } else if (!strcmp(options->name, BLOCK_OPT_IMAGE_FILE)) { > + image_filename = options->value.s; > + } else if (!strcmp(options->name, BLOCK_OPT_IMAGE_FORMAT)) { > + image_format = options->value.s; > + } > + options++; > + } > + > + if (backing_filename) { > + header.backing_filename_offset = sizeof(header) > + + sizeof(s.backing_file_format) + sizeof(s.image_file_format); > + header.backing_filename_size = strlen(backing_filename); > + > + if (!backing_fmt) { > + backing_bs = bdrv_new("image"); > + ret = bdrv_open(backing_bs, backing_filename, BDRV_O_RDWR > + | BDRV_O_CACHE_WB, NULL); > + if (ret < 0) { > + return ret; > + } > + backing_fmt = bdrv_get_format_name(backing_bs); > + bdrv_delete(backing_bs); > + } > + } else { > + header.features |= ADD_COW_F_All_ALLOCATED; > + } > + > + if (image_filename) { > + header.image_filename_offset = > + sizeof(header) + sizeof(s.backing_file_format) > + + sizeof(s.image_file_format) + header.backing_filename_size; > + header.image_filename_size = strlen(image_filename); > + } else { > + error_report("Error: image_file should be given."); > + return -EINVAL; > + } > + > + if (backing_filename && !strcmp(backing_filename, image_filename)) { > + error_report("Error: Trying to create an 
image with the " > + "same backing file name as the image file name"); > + return -EINVAL; > + } > + > + if (!strcmp(filename, image_filename)) { > + error_report("Error: Trying to create an image with the " > + "same filename as the image file name"); > + return -EINVAL; > + } > + > + if (header.image_filename_offset + header.image_filename_size > + > ADD_COW_PAGE_SIZE * ADD_COW_DEFAULT_PAGE_SIZE) { > + error_report("image_file name or backing_file name too long."); > + return -ENOSPC; > + } > + > + ret = bdrv_file_open(&image_bs, image_filename, BDRV_O_RDWR); > + if (ret < 0) { > + return ret; > + } > + bdrv_delete(image_bs); > + > + ret = bdrv_create_file(filename, NULL); > + if (ret < 0) { > + return ret; > + } > + > + ret = bdrv_file_open(&bs, filename, BDRV_O_RDWR); > + if (ret < 0) { > + return ret; > + } > + add_cow_header_cpu_to_le(&header, &le_header); > + ret = bdrv_pwrite(bs, 0, &le_header, sizeof(le_header)); > + if (ret < 0) { > + bdrv_delete(bs); > + return ret; > + } > + > + ret = bdrv_pwrite(bs, sizeof(le_header), backing_fmt ? backing_fmt : "", > + backing_fmt ? strlen(backing_fmt) : 0); The spec requires zero padding, which you don't do here. > + if (ret < 0) { > + bdrv_delete(bs); > + return ret; > + } > + > + ret = bdrv_pwrite(bs, sizeof(le_header) + sizeof(s.backing_file_format), > + image_format ? image_format : "raw", > + image_format ? strlen(image_format) : sizeof("raw")); And here. 
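A minimal sketch of the zero padding asked for here, assuming the 16-byte format fields from the quoted BDRVAddCowState. `fill_format_field` and `SK_FORMAT_FIELD_SIZE` are hypothetical names, not part of the patch; the idea is to build the whole fixed-size field in memory and then write all 16 bytes in one go, instead of writing only strlen() bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: matches char backing_file_format[16] / image_file_format[16]. */
enum { SK_FORMAT_FIELD_SIZE = 16 };

/* Fill a fixed-size on-disk string field: copy the name and zero the rest.
 * A NULL name yields an all-zero field. Returns 0 on success, or -1 if the
 * name does not fit together with at least one terminating zero byte. */
static int fill_format_field(char field[SK_FORMAT_FIELD_SIZE], const char *name)
{
    size_t len = name ? strlen(name) : 0;

    assert(field != NULL);
    if (len >= SK_FORMAT_FIELD_SIZE) {
        return -1;
    }
    memset(field, 0, SK_FORMAT_FIELD_SIZE);   /* zero padding for the tail */
    if (len > 0) {
        memcpy(field, name, len);
    }
    return 0;
}
```

The caller would then pass the full buffer and SK_FORMAT_FIELD_SIZE as the length to the write call, so stale bytes never end up in the padding.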
> + if (ret < 0) { > + bdrv_delete(bs); > + return ret; > + } > + > + if (backing_filename) { > + ret = bdrv_pwrite(bs, header.backing_filename_offset, > + backing_filename, header.backing_filename_size); > + if (ret < 0) { > + bdrv_delete(bs); > + return ret; > + } > + } > + > + ret = bdrv_pwrite(bs, header.image_filename_offset, > + image_filename, header.image_filename_size); > + if (ret < 0) { > + bdrv_delete(bs); > + return ret; > + } > + > + ret = bdrv_open(bs, filename, BDRV_O_RDWR | BDRV_O_NO_FLUSH, drv); > + if (ret < 0) { > + bdrv_delete(bs); > + return ret; > + } > + > + ret = bdrv_truncate(bs, image_len); > + bdrv_delete(bs); > + return ret; > +} > + > +static int add_cow_open(BlockDriverState *bs, int flags) > +{ > + char image_filename[ADD_COW_FILE_LEN]; > + char tmp_name[ADD_COW_FILE_LEN]; > + BlockDriver *image_drv = NULL; > + int ret; > + int sector_per_byte; > + BDRVAddCowState *s = bs->opaque; > + AddCowHeader le_header; > + > + ret = bdrv_pread(bs->file, 0, &le_header, sizeof(le_header)); > + if (ret != sizeof(s->header)) { if (ret < 0) would be more consistent with the rest of the code. > + goto fail; > + } > + > + add_cow_header_le_to_cpu(&le_header, &s->header); > + > + if (le64_to_cpu(s->header.magic) != ADD_COW_MAGIC) { Isn't this one endianness conversion too many? s->header has already been converted from LE to host byte order. Did you test add-cow on a big-endian host? > + ret = -EINVAL; > + goto fail; > + } > + > + if (s->header.version != ADD_COW_VERSION) { > + char version[64]; > + snprintf(version, sizeof(version), "ADD-COW version %d", > + s->header.version); > + qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE, > + bs->device_name, "add-cow", version); > + ret = -ENOTSUP; > + goto fail; > + } > + > + if (s->header.features & ~ADD_COW_FEATURE_MASK) { > + char buf[64]; > + snprintf(buf, sizeof(buf), "%" PRIx64, > + s->header.features & ~ADD_COW_FEATURE_MASK); This message is a bit terse; most users will be confused by an error message that consists only of a hex number. 
Maybe better "Feature flags: %" PRIx64. > + qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE, > + bs->device_name, "add-cow", buf); > + return -ENOTSUP; > + } > + > + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { > + ret = bdrv_read_string(bs->file, sizeof(s->header), > + sizeof(s->backing_file_format) - 1, s->backing_file_format, > + sizeof(s->backing_file_format)); > + if (ret < 0) { > + goto fail; > + } > + } Would be great if this was not only read into memory, but actually used... It must end up in bs->backing_format in order to take effect. > + > + ret = bdrv_read_string(bs->file, > + sizeof(s->header) + sizeof(s->image_file_format), > + sizeof(s->image_file_format) - 1, s->image_file_format, > + sizeof(s->image_file_format)); > + if (ret < 0) { > + goto fail; > + } This one is unused, too. > + > + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { > + ret = bdrv_read_string(bs->file, s->header.backing_filename_offset, > + s->header.backing_filename_size, bs->backing_file, > + sizeof(bs->backing_file)); > + if (ret < 0) { > + goto fail; > + } > + } > + > + ret = bdrv_read_string(bs->file, s->header.image_filename_offset, > + s->header.image_filename_size, tmp_name, > + sizeof(tmp_name)); > + if (ret < 0) { > + goto fail; > + } > + > + s->image_hd = bdrv_new(""); > + if (path_has_protocol(image_filename)) { > + pstrcpy(image_filename, sizeof(image_filename), tmp_name); > + } else { > + path_combine(image_filename, sizeof(image_filename), > + bs->filename, tmp_name); > + } > + > + ret = bdrv_open(s->image_hd, image_filename, flags, image_drv); image_drv is always NULL. 
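On the magic check in add_cow_open() earlier in this review: the sketch below shows why a second le64_to_cpu() goes unnoticed on a little-endian host but fails on a big-endian one. It models the conversion on a big-endian host as a plain byte swap; `swap64`, `SK_MAGIC` and the two `magic_ok_*` helpers are hypothetical names for this illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* "ADD_COW\xff" as a 64-bit value, as built by the quoted ADD_COW_MAGIC. */
#define SK_MAGIC 0x4144445F434F57FFULL

/* Byte swap: what an LE-to-CPU conversion amounts to on a big-endian host.
 * On a little-endian host the conversion is a no-op instead. */
static uint64_t swap64(uint64_t v)
{
    uint64_t r = 0;
    int i;

    for (i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}

/* 'raw' is the magic field as a big-endian host loads it from the file. */
static int magic_ok_converted_once(uint64_t raw)
{
    return swap64(raw) == SK_MAGIC;
}

/* Converting the already converted value again swaps it back to 'raw'. */
static int magic_ok_converted_twice(uint64_t raw)
{
    return swap64(swap64(raw)) == SK_MAGIC;
}

/* Swapping twice is the identity, so the second conversion undoes the first. */
static void swap_roundtrip_check(void)
{
    assert(swap64(swap64(SK_MAGIC)) == SK_MAGIC);
}
```

Under these assumptions a valid image would pass the single-conversion check and be rejected by the double-conversion one on a big-endian host, while both behave the same where the conversion is a no-op.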
> + if (ret < 0) { > + bdrv_delete(s->image_hd); > + goto fail; > + } > + > + bs->total_sectors = bdrv_getlength(s->image_hd) >> 9; > + s->cluster_size = ADD_COW_CLUSTER_SIZE; > + sector_per_byte = SECTORS_PER_CLUSTER * 8; > + s->bitmap_size = > + (bs->total_sectors + sector_per_byte - 1) / sector_per_byte; > + s->bitmap_cache = > + block_cache_create(bs, ADD_COW_CACHE_SIZE, ADD_COW_CACHE_ENTRY_SIZE); > + > + qemu_co_mutex_init(&s->lock); > + return 0; > +fail: > + if (s->bitmap_cache) { > + block_cache_destroy(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP); > + } > + return ret; > +} > + > +static void add_cow_close(BlockDriverState *bs) > +{ > + BDRVAddCowState *s = bs->opaque; > + block_cache_destroy(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP); > + bdrv_delete(s->image_hd); > +} > + > +static bool is_allocated(BlockDriverState *bs, int64_t sector_num) > +{ > + BDRVAddCowState *s = bs->opaque; > + BlockCache *c = s->bitmap_cache; > + int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER; > + uint8_t *table = NULL; > + uint64_t offset = ADD_COW_PAGE_SIZE * s->header.header_pages_size > + + (offset_in_bitmap(sector_num) & (~(c->entry_size - 1))); > + int ret = block_cache_get(bs, s->bitmap_cache, offset, > + (void **)&table, BLOCK_TABLE_BITMAP, ADD_COW_CACHE_ENTRY_SIZE); No matching block_cache_put? > + > + if (ret < 0) { > + return ret; > + } > + return table[cluster_num / 8 % ADD_COW_CACHE_ENTRY_SIZE] > + & (1 << (cluster_num % 8)); > +} > + > +static coroutine_fn int add_cow_is_allocated(BlockDriverState *bs, > + int64_t sector_num, int nb_sectors, int *num_same) > +{ > + BDRVAddCowState *s = bs->opaque; > + int changed; > + > + if (nb_sectors == 0) { > + *num_same = 0; > + return 0; > + } > + > + if (s->header.features & ADD_COW_F_All_ALLOCATED) { > + *num_same = nb_sectors - 1; Why - 1? 
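A small sketch of the usual meaning of num_same, which is why "nb_sectors - 1" looks suspicious: it is the count of sectors, starting at sector_num, that share the allocation state of the first one. The bitmap here is a plain in-memory array with one bit per cluster; `sector_allocated`, `run_allocated` and `SK_SECTORS_PER_CLUSTER` are hypothetical names and the cache layer is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64 KiB clusters of 512-byte sectors (128 sectors per cluster). */
enum { SK_SECTORS_PER_CLUSTER = 65536 / 512 };

/* One bit per cluster: bit (cluster % 8) of byte (cluster / 8). */
static bool sector_allocated(const uint8_t *bitmap, int64_t sector_num)
{
    int64_t cluster = sector_num / SK_SECTORS_PER_CLUSTER;

    return (bitmap[cluster / 8] & (1u << (cluster % 8))) != 0;
}

/* Returns the state of the first sector and stores in *num_same how many
 * consecutive sectors (including the first) have that same state. */
static bool run_allocated(const uint8_t *bitmap, int64_t sector_num,
                          int nb_sectors, int *num_same)
{
    bool first;
    int n;

    assert(nb_sectors >= 0);
    if (nb_sectors == 0) {
        *num_same = 0;
        return false;
    }
    first = sector_allocated(bitmap, sector_num);
    for (n = 1; n < nb_sectors; n++) {
        if (sector_allocated(bitmap, sector_num + n) != first) {
            break;
        }
    }
    *num_same = n;
    return first;
}
```

With a fully set bitmap the loop runs to the end and reports nb_sectors, so a shortcut for the all-allocated case would be expected to report the same value.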
> + return 1; > + } > + changed = is_allocated(bs, sector_num); > + > + for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) { > + if (is_allocated(bs, sector_num + *num_same) != changed) { > + break; > + } > + } > + return changed; > +} > + > +static int add_cow_backing_read(BlockDriverState *bs, QEMUIOVector *qiov, > + int64_t sector_num, int nb_sectors) > +{ > + int n1; > + if ((sector_num + nb_sectors) <= bs->total_sectors) { > + return nb_sectors; > + } > + if (sector_num >= bs->total_sectors) { > + n1 = 0; > + } else { > + n1 = bs->total_sectors - sector_num; > + } > + > + qemu_iovec_memset(qiov, BDRV_SECTOR_SIZE * n1, > + 0, BDRV_SECTOR_SIZE * (nb_sectors - n1)); > + > + return n1; > +} > + > +static coroutine_fn int add_cow_co_readv(BlockDriverState *bs, > + int64_t sector_num, int remaining_sectors, QEMUIOVector *qiov) > +{ > + BDRVAddCowState *s = bs->opaque; > + int cur_nr_sectors; > + uint64_t bytes_done = 0; > + QEMUIOVector hd_qiov; > + int n, n1, ret = 0; > + > + qemu_iovec_init(&hd_qiov, qiov->niov); > + qemu_co_mutex_lock(&s->lock); > + while (remaining_sectors != 0) { > + cur_nr_sectors = remaining_sectors; > + if (add_cow_is_allocated(bs, sector_num, cur_nr_sectors, &n)) { > + cur_nr_sectors = n; One of n and cur_nr_sectors is redundant. 
> + qemu_iovec_reset(&hd_qiov); > + qemu_iovec_concat(&hd_qiov, qiov, bytes_done, > + cur_nr_sectors * BDRV_SECTOR_SIZE); > + qemu_co_mutex_unlock(&s->lock); > + ret = bdrv_co_readv(s->image_hd, sector_num, n, &hd_qiov); > + qemu_co_mutex_lock(&s->lock); > + if (ret < 0) { > + goto fail; > + } > + } else { > + cur_nr_sectors = n; > + if (bs->backing_hd) { > + qemu_iovec_reset(&hd_qiov); > + qemu_iovec_concat(&hd_qiov, qiov, bytes_done, > + cur_nr_sectors * BDRV_SECTOR_SIZE); > + n1 = add_cow_backing_read(bs->backing_hd, &hd_qiov, > + sector_num, cur_nr_sectors); > + if (n1 > 0) { > + qemu_co_mutex_unlock(&s->lock); > + ret = bdrv_co_readv(bs->backing_hd, sector_num, > + n, &hd_qiov); > + qemu_co_mutex_lock(&s->lock); > + if (ret < 0) { > + goto fail; > + } > + } > + } else { > + qemu_iovec_memset(&hd_qiov, 0, 0, > + BDRV_SECTOR_SIZE * cur_nr_sectors); > + } > + } > + remaining_sectors -= cur_nr_sectors; > + sector_num += cur_nr_sectors; > + bytes_done += cur_nr_sectors * BDRV_SECTOR_SIZE; > + } > +fail: > + qemu_co_mutex_unlock(&s->lock); > + qemu_iovec_destroy(&hd_qiov); > + return ret; > +} > + > +static int coroutine_fn copy_sectors(BlockDriverState *bs, > + int n_start, int n_end) > +{ > + BDRVAddCowState *s = bs->opaque; > + QEMUIOVector qiov; > + struct iovec iov; > + int n, ret; > + > + n = n_end - n_start; > + if (n <= 0) { > + return 0; > + } > + > + iov.iov_len = n * BDRV_SECTOR_SIZE; > + iov.iov_base = qemu_blockalign(bs, iov.iov_len); > + > + qemu_iovec_init_external(&qiov, &iov, 1); > + > + ret = bdrv_co_readv(bs->backing_hd, n_start, n, &qiov); > + if (ret < 0) { > + goto out; > + } > + ret = bdrv_co_writev(s->image_hd, n_start, n, &qiov); > + if (ret < 0) { > + goto out; > + } > + > + ret = 0; > +out: > + qemu_vfree(iov.iov_base); > + return ret; > +} > + > +static coroutine_fn int add_cow_co_writev(BlockDriverState *bs, > + int64_t sector_num, int remaining_sectors, QEMUIOVector *qiov) > +{ > + BDRVAddCowState *s = bs->opaque; > + BlockCache *c = 
s->bitmap_cache; > + int ret = 0, i; > + QEMUIOVector hd_qiov; > + uint8_t *table; > + uint64_t offset; > + > + qemu_co_mutex_lock(&s->lock); > + qemu_iovec_init(&hd_qiov, qiov->niov); > + ret = bdrv_co_writev(s->image_hd, > + sector_num, > + remaining_sectors, qiov); > + > + if (ret < 0) { > + goto fail; > + } > + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { > + /* Copy content of unmodified sectors */ > + if (!is_cluster_head(sector_num) && !is_allocated(bs, sector_num)) { > + ret = copy_sectors(bs, sector_num & ~(SECTORS_PER_CLUSTER - 1), > + sector_num); > + if (ret < 0) { > + goto fail; > + } > + } > + > + if (!is_cluster_tail(sector_num + remaining_sectors - 1) > + && !is_allocated(bs, sector_num + remaining_sectors - 1)) { > + ret = copy_sectors(bs, sector_num + remaining_sectors, > + ((sector_num + remaining_sectors) | (SECTORS_PER_CLUSTER - 1)) + 1); > + if (ret < 0) { > + goto fail; > + } > + } > + > + for (i = sector_num / SECTORS_PER_CLUSTER; > + i <= (sector_num + remaining_sectors - 1) / SECTORS_PER_CLUSTER; > + i++) { > + offset = ADD_COW_PAGE_SIZE * s->header.header_pages_size > + + (offset_in_bitmap(i * SECTORS_PER_CLUSTER) & (~(c->entry_size - 1))); The maths in this loop looks a bit confusing, but I think it's correct. > + ret = block_cache_get(bs, s->bitmap_cache, offset, > + (void **)&table, BLOCK_TABLE_BITMAP, ADD_COW_CACHE_ENTRY_SIZE); > + if (ret < 0) { > + goto fail; > + } > + if ((table[i / 8] & (1 << (i % 8))) == 0) { > + table[i / 8] |= (1 << (i % 8)); > + block_cache_entry_mark_dirty(s->bitmap_cache, table); > + } Missing block_cache_put again? 
> + } > + } > + ret = 0; > +fail: > + qemu_co_mutex_unlock(&s->lock); > + qemu_iovec_destroy(&hd_qiov); > + return ret; > +} > + > +static int bdrv_add_cow_truncate(BlockDriverState *bs, int64_t size) > +{ > + BDRVAddCowState *s = bs->opaque; > + int sector_per_byte = SECTORS_PER_CLUSTER * 8; > + int ret; > + uint32_t bitmap_pos = s->header.header_pages_size * ADD_COW_PAGE_SIZE; > + int64_t bitmap_size = > + (size / BDRV_SECTOR_SIZE + sector_per_byte - 1) / sector_per_byte; > + bitmap_size = (bitmap_size + ADD_COW_CACHE_ENTRY_SIZE - 1) > + & (~(ADD_COW_CACHE_ENTRY_SIZE - 1)); > + > + ret = bdrv_truncate(bs->file, bitmap_pos + bitmap_size); > + if (ret < 0) { > + return ret; > + } > + return 0; > +} So you don't truncate s->image_file? Does this work? > + > +static coroutine_fn int add_cow_co_flush(BlockDriverState *bs) > +{ > + BDRVAddCowState *s = bs->opaque; > + int ret; > + > + qemu_co_mutex_lock(&s->lock); > + ret = block_cache_flush(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP, > + ADD_COW_CACHE_ENTRY_SIZE); > + qemu_co_mutex_unlock(&s->lock); > + return ret; > +} What about flushing s->image_file? 
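For reference, the size arithmetic in bdrv_add_cow_truncate quoted above can be checked on its own. This is a minimal sketch rather than the patch's code: the constants mirror the values in add-cow.h, while the enum names and the helpers bitmap_bytes and metadata_file_len are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SECTOR_BYTES        = 512,    /* BDRV_SECTOR_SIZE in the patch */
    CLUSTER_BYTES       = 65536,  /* ADD_COW_CLUSTER_SIZE */
    CACHE_ENTRY_BYTES   = 65536,  /* ADD_COW_CACHE_ENTRY_SIZE */
    HEADER_PAGE_BYTES   = 4096,   /* ADD_COW_PAGE_SIZE */
    SECTORS_PER_CLUSTER = CLUSTER_BYTES / SECTOR_BYTES,
};

/*
 * Bitmap bytes needed for a virtual disk of `size` bytes: one bit per
 * cluster, rounded up to a whole cache entry (hypothetical helper).
 */
static int64_t bitmap_bytes(int64_t size)
{
    const int64_t sectors_per_byte = SECTORS_PER_CLUSTER * 8;
    int64_t sectors, bytes;

    assert(size >= 0);
    sectors = size / SECTOR_BYTES;
    bytes = (sectors + sectors_per_byte - 1) / sectors_per_byte;
    return (bytes + CACHE_ENTRY_BYTES - 1) & ~(int64_t)(CACHE_ENTRY_BYTES - 1);
}

/* Length of the add-cow metadata file: header pages, then the bitmap. */
static int64_t metadata_file_len(uint32_t header_pages, int64_t size)
{
    return (int64_t)header_pages * HEADER_PAGE_BYTES + bitmap_bytes(size);
}
```

With these constants a single 64 KiB cache entry describes 32 GiB of virtual disk, so every non-empty image up to that size gets a 65536-byte bitmap after the header page. Nothing in this calculation touches the data image, which is what the truncate question above is about.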
> + > +static QEMUOptionParameter add_cow_create_options[] = { > + { > + .name = BLOCK_OPT_SIZE, > + .type = OPT_SIZE, > + .help = "Virtual disk size" > + }, > + { > + .name = BLOCK_OPT_BACKING_FILE, > + .type = OPT_STRING, > + .help = "File name of a base image" > + }, > + { > + .name = BLOCK_OPT_BACKING_FMT, > + .type = OPT_STRING, > + .help = "Image format of the base image" > + }, > + { > + .name = BLOCK_OPT_IMAGE_FILE, > + .type = OPT_STRING, > + .help = "File name of a image file" > + }, > + { > + .name = BLOCK_OPT_IMAGE_FORMAT, > + .type = OPT_STRING, > + .help = "Image format of the image file" > + }, > + { NULL } > +}; > + > +static BlockDriver bdrv_add_cow = { > + .format_name = "add-cow", > + .instance_size = sizeof(BDRVAddCowState), > + .bdrv_probe = add_cow_probe, > + .bdrv_open = add_cow_open, > + .bdrv_close = add_cow_close, > + .bdrv_create = add_cow_create, > + .bdrv_co_readv = add_cow_co_readv, > + .bdrv_co_writev = add_cow_co_writev, > + .bdrv_truncate = bdrv_add_cow_truncate, > + .bdrv_co_is_allocated = add_cow_is_allocated, > + > + .create_options = add_cow_create_options, > + .bdrv_co_flush_to_os = add_cow_co_flush, > +}; > + > +static void bdrv_add_cow_init(void) > +{ > + bdrv_register(&bdrv_add_cow); > +} > + > +block_init(bdrv_add_cow_init); > diff --git a/block/add-cow.h b/block/add-cow.h > new file mode 100644 > index 0000000..f058376 > --- /dev/null > +++ b/block/add-cow.h > @@ -0,0 +1,85 @@ > +/* > + * QEMU ADD-COW Disk Format > + * > + * Copyright IBM, Corp. 2012 > + * > + * Authors: > + * Dong Xu Wang <wdongxu@linux.vnet.ibm.com> > + * > + * This work is licensed under the terms of the GNU LGPL, version 2 or later. > + * See the COPYING.LIB file in the top-level directory. 
> + * > + */ > + > +#ifndef BLOCK_ADD_COW_H > +#define BLOCK_ADD_COW_H > +#include "block-cache.h" > + > +enum { > + ADD_COW_F_All_ALLOCATED = 0X01, > + ADD_COW_FEATURE_MASK = ADD_COW_F_All_ALLOCATED, > + > + ADD_COW_MAGIC = (((uint64_t)'A' << 56) | ((uint64_t)'D' << 48) | \ > + ((uint64_t)'D' << 40) | ((uint64_t)'_' << 32) | \ > + ((uint64_t)'C' << 24) | ((uint64_t)'O' << 16) | \ > + ((uint64_t)'W' << 8) | 0xFF), > + ADD_COW_VERSION = 1, > + ADD_COW_FILE_LEN = 1024, > + ADD_COW_CACHE_SIZE = 16, > + ADD_COW_CACHE_ENTRY_SIZE = 65536, > + ADD_COW_CLUSTER_SIZE = 65536, > + SECTORS_PER_CLUSTER = (ADD_COW_CLUSTER_SIZE / BDRV_SECTOR_SIZE), > + ADD_COW_PAGE_SIZE = 4096, > + ADD_COW_DEFAULT_PAGE_SIZE = 1, > +}; > + > +typedef struct AddCowHeader { > + uint64_t magic; > + uint32_t version; > + > + uint32_t backing_filename_offset; > + uint32_t backing_filename_size; > + > + uint32_t image_filename_offset; > + uint32_t image_filename_size; > + > + uint64_t features; > + uint64_t optional_features; > + uint32_t header_pages_size; > +} QEMU_PACKED AddCowHeader; Why aren't backing/image_file_format part of the header here? They are in the spec. It would also simplify some offset calculation code. 
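If the two format names do live in fixed 16-byte slots next to the header, each slot has to be zero-padded when it is written. A minimal sketch of filling such a slot follows. The 16-byte width is taken from the char backing_file_format[16] member in the patch; the helper name set_format_field and the rule that a terminating NUL must always fit are assumptions, not something the spec is known to state.

```c
#include <assert.h>
#include <string.h>

#define FORMAT_FIELD_LEN 16  /* width of backing_file_format[] in the patch */

/*
 * Fill a fixed-size on-disk field with `name` and zero-pad the rest.
 * Returns 0 on success, -1 if the name plus a terminating NUL does not
 * fit (hypothetical helper, assumed policy).
 */
static int set_format_field(char field[FORMAT_FIELD_LEN], const char *name)
{
    size_t len;

    assert(field != NULL);
    if (name == NULL) {
        name = "";              /* no format given: all-zero field */
    }
    len = strlen(name);
    if (len >= FORMAT_FIELD_LEN) {
        return -1;
    }
    memset(field, 0, FORMAT_FIELD_LEN);
    memcpy(field, name, len);
    return 0;
}
```

Writing the whole 16-byte field in one go, instead of only strlen() bytes, is what guarantees that stale bytes never follow the name on disk.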
> +
> +typedef struct BDRVAddCowState {
> +    BlockDriverState *image_hd;
> +    CoMutex lock;
> +    int cluster_size;
> +    BlockCache *bitmap_cache;
> +    uint64_t bitmap_size;
> +    AddCowHeader header;
> +    char backing_file_format[16];
> +    char image_file_format[16];
> +} BDRVAddCowState;
> +
> +/* Convert sector_num to offset in bitmap */
> +static inline int64_t offset_in_bitmap(int64_t sector_num)
> +{
> +    int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER;
> +    return cluster_num / 8;
> +}
> +
> +static inline bool is_cluster_head(int64_t sector_num)
> +{
> +    return sector_num % SECTORS_PER_CLUSTER == 0;
> +}
> +
> +static inline bool is_cluster_tail(int64_t sector_num)
> +{
> +    return (sector_num + 1) % SECTORS_PER_CLUSTER == 0;
> +}
> +
> +BlockCache *add_cow_cache_create(BlockDriverState *bs, int num_tables);
> +int add_cow_cache_destroy(BlockDriverState *bs, BlockCache *c);
> +void add_cow_cache_entry_mark_dirty(BlockCache *c, void *table);
> +int add_cow_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset,
> +                      void **table);
> +int add_cow_cache_flush(BlockDriverState *bs, BlockCache *c);

These functions don't really exist any more, do they?

Kevin
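The sector-to-bit mapping that offset_in_bitmap and is_allocated rely on can be spelled out in one place. The sketch below is an illustration under assumptions: the constants are the ones from add-cow.h, while BitPos, bit_position, test_bit and set_bit are hypothetical names. It reduces the byte index modulo the cache entry size, the way is_allocated does. The write path in the patch indexes table[i / 8] without that reduction, which, as far as the quoted code shows, only gives the same result while the byte index stays below 65536.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SECTORS_PER_CLUSTER = 128,    /* 65536-byte cluster / 512-byte sector */
    CACHE_ENTRY_BYTES   = 65536,  /* ADD_COW_CACHE_ENTRY_SIZE */
    HEADER_PAGE_BYTES   = 4096,   /* ADD_COW_PAGE_SIZE */
};

typedef struct BitPos {
    uint64_t entry_offset;   /* file offset of the cache entry holding the bit */
    uint32_t byte_in_entry;  /* index into that entry's table */
    uint8_t  mask;           /* bit within the byte */
} BitPos;

/* Locate the allocation bit for the cluster containing sector_num. */
static BitPos bit_position(int64_t sector_num, uint32_t header_pages)
{
    BitPos pos;
    int64_t cluster, byte;

    assert(sector_num >= 0);
    cluster = sector_num / SECTORS_PER_CLUSTER;
    byte = cluster / 8;                      /* value of offset_in_bitmap() */

    pos.entry_offset = (uint64_t)header_pages * HEADER_PAGE_BYTES
        + (uint64_t)(byte & ~(int64_t)(CACHE_ENTRY_BYTES - 1));
    pos.byte_in_entry = (uint32_t)(byte % CACHE_ENTRY_BYTES);
    pos.mask = (uint8_t)(1u << (cluster % 8));
    return pos;
}

static int test_bit(const uint8_t *table, BitPos pos)
{
    return (table[pos.byte_in_entry] & pos.mask) != 0;
}

static void set_bit(uint8_t *table, BitPos pos)
{
    table[pos.byte_in_entry] |= pos.mask;
}
```

Keeping the entry offset, the index within the entry and the bit mask in one helper means the read path and the write path cannot drift apart in how they address the table.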
* Re: [Qemu-devel] [PATCH V12 5/6] add-cow file format
From: Dong Xu Wang @ 2012-09-12 7:28 UTC
To: Kevin Wolf; +Cc: qemu-devel

On Tue, Sep 11, 2012 at 5:40 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> On 10.08.2012 17:39, Dong Xu Wang wrote:
>> add-cow file format core code. It use block-cache.c as cache code.
>>
>> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>> ---
>>  block/Makefile.objs |    1 +
>>  block/add-cow.c     |  613 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  block/add-cow.h     |   85 +++++++
>>  block_int.h         |    2 +
>>  4 files changed, 701 insertions(+), 0 deletions(-)
>>  create mode 100644 block/add-cow.c
>>  create mode 100644 block/add-cow.h
>>
>> diff --git a/block/Makefile.objs b/block/Makefile.objs
>> index 23bdfc8..7ed5051 100644
>> --- a/block/Makefile.objs
>> +++ b/block/Makefile.objs
>> @@ -2,6 +2,7 @@ block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat
>>  block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
>>  block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
>>  block-obj-y += qed-check.o
>> +block-obj-y += add-cow.o
>>  block-obj-y += block-cache.o
>>  block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
>>  block-obj-y += stream.o
>> diff --git a/block/add-cow.c b/block/add-cow.c
>> new file mode 100644
>> index 0000000..d4711d5
>> --- /dev/null
>> +++ b/block/add-cow.c
>> @@ -0,0 +1,613 @@
>> +/*
>> + * QEMU ADD-COW Disk Format
>> + *
>> + * Copyright IBM, Corp. 2012
>> + *
>> + * Authors:
>> + *  Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
>> + *
>> + * This work is licensed under the terms of the GNU LGPL, version 2 or later.
>> + * See the COPYING.LIB file in the top-level directory.
>> + * >> + */ >> + >> +#include "qemu-common.h" >> +#include "block_int.h" >> +#include "module.h" >> +#include "add-cow.h" >> + >> +static void add_cow_header_le_to_cpu(const AddCowHeader *le, AddCowHeader *cpu) >> +{ >> + cpu->magic = le64_to_cpu(le->magic); >> + cpu->version = le32_to_cpu(le->version); >> + >> + cpu->backing_filename_offset = le32_to_cpu(le->backing_filename_offset); >> + cpu->backing_filename_size = le32_to_cpu(le->backing_filename_size); >> + >> + cpu->image_filename_offset = le32_to_cpu(le->image_filename_offset); >> + cpu->image_filename_size = le32_to_cpu(le->image_filename_size); >> + >> + cpu->features = le64_to_cpu(le->features); >> + cpu->optional_features = le64_to_cpu(le->optional_features); >> + cpu->header_pages_size = le32_to_cpu(le->header_pages_size); >> +} >> + >> +static void add_cow_header_cpu_to_le(const AddCowHeader *cpu, AddCowHeader *le) >> +{ >> + le->magic = cpu_to_le64(cpu->magic); >> + le->version = cpu_to_le32(cpu->version); >> + >> + le->backing_filename_offset = cpu_to_le32(cpu->backing_filename_offset); >> + le->backing_filename_size = cpu_to_le32(cpu->backing_filename_size); >> + >> + le->image_filename_offset = cpu_to_le32(cpu->image_filename_offset); >> + le->image_filename_size = cpu_to_le32(cpu->image_filename_size); >> + >> + le->features = cpu_to_le64(cpu->features); >> + le->optional_features = cpu_to_le64(cpu->optional_features); >> + le->header_pages_size = cpu_to_le32(cpu->header_pages_size); >> +} >> + >> +static int add_cow_probe(const uint8_t *buf, int buf_size, const char *filename) >> +{ >> + const AddCowHeader *header = (const AddCowHeader *)buf; >> + >> + if (le64_to_cpu(header->magic) == ADD_COW_MAGIC && >> + le32_to_cpu(header->version) == ADD_COW_VERSION) { >> + return 100; >> + } else { >> + return 0; >> + } >> +} >> + >> +static int add_cow_create(const char *filename, QEMUOptionParameter *options) >> +{ >> + AddCowHeader header = { >> + .magic = ADD_COW_MAGIC, >> + .version = 
ADD_COW_VERSION, >> + .features = 0, >> + .optional_features = 0, >> + .header_pages_size = ADD_COW_DEFAULT_PAGE_SIZE, >> + }; >> + AddCowHeader le_header; >> + int64_t image_len = 0; >> + const char *backing_filename = NULL; >> + const char *backing_fmt = NULL; >> + const char *image_filename = NULL; >> + const char *image_format = NULL; >> + BlockDriverState *bs, *image_bs = NULL, *backing_bs = NULL; >> + BlockDriver *drv = bdrv_find_format("add-cow"); >> + BDRVAddCowState s; >> + int ret; >> + >> + while (options && options->name) { >> + if (!strcmp(options->name, BLOCK_OPT_SIZE)) { >> + image_len = options->value.n; >> + } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) { >> + backing_filename = options->value.s; >> + } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FMT)) { >> + backing_fmt = options->value.s; >> + } else if (!strcmp(options->name, BLOCK_OPT_IMAGE_FILE)) { >> + image_filename = options->value.s; >> + } else if (!strcmp(options->name, BLOCK_OPT_IMAGE_FORMAT)) { >> + image_format = options->value.s; >> + } >> + options++; >> + } >> + >> + if (backing_filename) { >> + header.backing_filename_offset = sizeof(header) >> + + sizeof(s.backing_file_format) + sizeof(s.image_file_format); >> + header.backing_filename_size = strlen(backing_filename); >> + >> + if (!backing_fmt) { >> + backing_bs = bdrv_new("image"); >> + ret = bdrv_open(backing_bs, backing_filename, BDRV_O_RDWR >> + | BDRV_O_CACHE_WB, NULL); >> + if (ret < 0) { >> + return ret; >> + } >> + backing_fmt = bdrv_get_format_name(backing_bs); >> + bdrv_delete(backing_bs); >> + } >> + } else { >> + header.features |= ADD_COW_F_All_ALLOCATED; >> + } >> + >> + if (image_filename) { >> + header.image_filename_offset = >> + sizeof(header) + sizeof(s.backing_file_format) >> + + sizeof(s.image_file_format) + header.backing_filename_size; >> + header.image_filename_size = strlen(image_filename); >> + } else { >> + error_report("Error: image_file should be given."); >> + return -EINVAL; >> + 
} >> + >> + if (backing_filename && !strcmp(backing_filename, image_filename)) { >> + error_report("Error: Trying to create an image with the " >> + "same backing file name as the image file name"); >> + return -EINVAL; >> + } >> + >> + if (!strcmp(filename, image_filename)) { >> + error_report("Error: Trying to create an image with the " >> + "same filename as the image file name"); >> + return -EINVAL; >> + } >> + >> + if (header.image_filename_offset + header.image_filename_size >> + > ADD_COW_PAGE_SIZE * ADD_COW_DEFAULT_PAGE_SIZE) { >> + error_report("image_file name or backing_file name too long."); >> + return -ENOSPC; >> + } >> + >> + ret = bdrv_file_open(&image_bs, image_filename, BDRV_O_RDWR); >> + if (ret < 0) { >> + return ret; >> + } >> + bdrv_delete(image_bs); >> + >> + ret = bdrv_create_file(filename, NULL); >> + if (ret < 0) { >> + return ret; >> + } >> + >> + ret = bdrv_file_open(&bs, filename, BDRV_O_RDWR); >> + if (ret < 0) { >> + return ret; >> + } >> + add_cow_header_cpu_to_le(&header, &le_header); >> + ret = bdrv_pwrite(bs, 0, &le_header, sizeof(le_header)); >> + if (ret < 0) { >> + bdrv_delete(bs); >> + return ret; >> + } >> + >> + ret = bdrv_pwrite(bs, sizeof(le_header), backing_fmt ? backing_fmt : "", >> + backing_fmt ? strlen(backing_fmt) : 0); > > The spec requires zero padding, which you don't do here. Okay. > >> + if (ret < 0) { >> + bdrv_delete(bs); >> + return ret; >> + } >> + >> + ret = bdrv_pwrite(bs, sizeof(le_header) + sizeof(s.backing_file_format), >> + image_format ? image_format : "raw", >> + image_format ? strlen(image_format) : sizeof("raw")); > > And here. Okay. 
> >> + if (ret < 0) { >> + bdrv_delete(bs); >> + return ret; >> + } >> + >> + if (backing_filename) { >> + ret = bdrv_pwrite(bs, header.backing_filename_offset, >> + backing_filename, header.backing_filename_size); >> + if (ret < 0) { >> + bdrv_delete(bs); >> + return ret; >> + } >> + } >> + >> + ret = bdrv_pwrite(bs, header.image_filename_offset, >> + image_filename, header.image_filename_size); >> + if (ret < 0) { >> + bdrv_delete(bs); >> + return ret; >> + } >> + >> + ret = bdrv_open(bs, filename, BDRV_O_RDWR | BDRV_O_NO_FLUSH, drv); >> + if (ret < 0) { >> + bdrv_delete(bs); >> + return ret; >> + } >> + >> + ret = bdrv_truncate(bs, image_len); >> + bdrv_delete(bs); >> + return ret; >> +} >> + >> +static int add_cow_open(BlockDriverState *bs, int flags) >> +{ >> + char image_filename[ADD_COW_FILE_LEN]; >> + char tmp_name[ADD_COW_FILE_LEN]; >> + BlockDriver *image_drv = NULL; >> + int ret; >> + int sector_per_byte; >> + BDRVAddCowState *s = bs->opaque; >> + AddCowHeader le_header; >> + >> + ret = bdrv_pread(bs->file, 0, &le_header, sizeof(le_header)); >> + if (ret != sizeof(s->header)) { > > if (ret < 0) would be more consistent with the rest of the code. > Okay. >> + goto fail; >> + } >> + >> + add_cow_header_le_to_cpu(&le_header, &s->header); >> + >> + if (le64_to_cpu(s->header.magic) != ADD_COW_MAGIC) { > > Isn't this one endianess conversion too much? s->header is already LE. > > Did you test add-cow on a big endian host? My fault, will correct it in next version. 
> >> + ret = -EINVAL; >> + goto fail; >> + } >> + >> + if (s->header.version != ADD_COW_VERSION) { >> + char version[64]; >> + snprintf(version, sizeof(version), "ADD-COW version %d", >> + s->header.version); >> + qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE, >> + bs->device_name, "add-cow", version); >> + ret = -ENOTSUP; >> + goto fail; >> + } >> + >> + if (s->header.features & ~ADD_COW_FEATURE_MASK) { >> + char buf[64]; >> + snprintf(buf, sizeof(buf), "%" PRIx64, >> + s->header.features & ~ADD_COW_FEATURE_MASK); > > This message is a bit terse, most users will be confused with an error > message that only consists of a hex number. Maybe better "Feature flags: > %" PRIx64. > Okay. >> + qerror_report(QERR_UNKNOWN_BLOCK_FORMAT_FEATURE, >> + bs->device_name, "add-cow", buf); >> + return -ENOTSUP; >> + } >> + >> + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { >> + ret = bdrv_read_string(bs->file, sizeof(s->header), >> + sizeof(s->backing_file_format) - 1, s->backing_file_format, >> + sizeof(s->backing_file_format)); >> + if (ret < 0) { >> + goto fail; >> + } >> + } > > Would be great if this was not only read into memory, but actually > used... It must end up in bs->backing_format in order take effect. > >> + >> + ret = bdrv_read_string(bs->file, >> + sizeof(s->header) + sizeof(s->image_file_format), >> + sizeof(s->image_file_format) - 1, s->image_file_format, >> + sizeof(s->image_file_format)); >> + if (ret < 0) { >> + goto fail; >> + } > > This one is unused, too. > Okay. 
>> + >> + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { >> + ret = bdrv_read_string(bs->file, s->header.backing_filename_offset, >> + s->header.backing_filename_size, bs->backing_file, >> + sizeof(bs->backing_file)); >> + if (ret < 0) { >> + goto fail; >> + } >> + } >> + >> + ret = bdrv_read_string(bs->file, s->header.image_filename_offset, >> + s->header.image_filename_size, tmp_name, >> + sizeof(tmp_name)); >> + if (ret < 0) { >> + goto fail; >> + } >> + >> + s->image_hd = bdrv_new(""); >> + if (path_has_protocol(image_filename)) { >> + pstrcpy(image_filename, sizeof(image_filename), tmp_name); >> + } else { >> + path_combine(image_filename, sizeof(image_filename), >> + bs->filename, tmp_name); >> + } >> + >> + ret = bdrv_open(s->image_hd, image_filename, flags, image_drv); > > image_drv is always NULL. > >> + if (ret < 0) { >> + bdrv_delete(s->image_hd); >> + goto fail; >> + } >> + >> + bs->total_sectors = bdrv_getlength(s->image_hd) >> 9; >> + s->cluster_size = ADD_COW_CLUSTER_SIZE; >> + sector_per_byte = SECTORS_PER_CLUSTER * 8; >> + s->bitmap_size = >> + (bs->total_sectors + sector_per_byte - 1) / sector_per_byte; >> + s->bitmap_cache = >> + block_cache_create(bs, ADD_COW_CACHE_SIZE, ADD_COW_CACHE_ENTRY_SIZE); >> + >> + qemu_co_mutex_init(&s->lock); >> + return 0; >> +fail: >> + if (s->bitmap_cache) { >> + block_cache_destroy(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP); >> + } >> + return ret; >> +} >> + >> +static void add_cow_close(BlockDriverState *bs) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + block_cache_destroy(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP); >> + bdrv_delete(s->image_hd); >> +} >> + >> +static bool is_allocated(BlockDriverState *bs, int64_t sector_num) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + BlockCache *c = s->bitmap_cache; >> + int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER; >> + uint8_t *table = NULL; >> + uint64_t offset = ADD_COW_PAGE_SIZE * s->header.header_pages_size >> + + (offset_in_bitmap(sector_num) 
& (~(c->entry_size - 1))); >> + int ret = block_cache_get(bs, s->bitmap_cache, offset, >> + (void **)&table, BLOCK_TABLE_BITMAP, ADD_COW_CACHE_ENTRY_SIZE); > > No matching block_cache_put? > >> + >> + if (ret < 0) { >> + return ret; >> + } >> + return table[cluster_num / 8 % ADD_COW_CACHE_ENTRY_SIZE] >> + & (1 << (cluster_num % 8)); >> +} >> + >> +static coroutine_fn int add_cow_is_allocated(BlockDriverState *bs, >> + int64_t sector_num, int nb_sectors, int *num_same) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + int changed; >> + >> + if (nb_sectors == 0) { >> + *num_same = 0; >> + return 0; >> + } >> + >> + if (s->header.features & ADD_COW_F_All_ALLOCATED) { >> + *num_same = nb_sectors - 1; > > Why - 1? > >> + return 1; >> + } >> + changed = is_allocated(bs, sector_num); >> + >> + for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) { >> + if (is_allocated(bs, sector_num + *num_same) != changed) { >> + break; >> + } >> + } >> + return changed; >> +} >> + >> +static int add_cow_backing_read(BlockDriverState *bs, QEMUIOVector *qiov, >> + int64_t sector_num, int nb_sectors) >> +{ >> + int n1; >> + if ((sector_num + nb_sectors) <= bs->total_sectors) { >> + return nb_sectors; >> + } >> + if (sector_num >= bs->total_sectors) { >> + n1 = 0; >> + } else { >> + n1 = bs->total_sectors - sector_num; >> + } >> + >> + qemu_iovec_memset(qiov, BDRV_SECTOR_SIZE * n1, >> + 0, BDRV_SECTOR_SIZE * (nb_sectors - n1)); >> + >> + return n1; >> +} >> + >> +static coroutine_fn int add_cow_co_readv(BlockDriverState *bs, >> + int64_t sector_num, int remaining_sectors, QEMUIOVector *qiov) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + int cur_nr_sectors; >> + uint64_t bytes_done = 0; >> + QEMUIOVector hd_qiov; >> + int n, n1, ret = 0; >> + >> + qemu_iovec_init(&hd_qiov, qiov->niov); >> + qemu_co_mutex_lock(&s->lock); >> + while (remaining_sectors != 0) { >> + cur_nr_sectors = remaining_sectors; >> + if (add_cow_is_allocated(bs, sector_num, cur_nr_sectors, &n)) { >> + 
cur_nr_sectors = n; > > One of n and cur_nr_sectors is redundant. Okay. > >> + qemu_iovec_reset(&hd_qiov); >> + qemu_iovec_concat(&hd_qiov, qiov, bytes_done, >> + cur_nr_sectors * BDRV_SECTOR_SIZE); >> + qemu_co_mutex_unlock(&s->lock); >> + ret = bdrv_co_readv(s->image_hd, sector_num, n, &hd_qiov); >> + qemu_co_mutex_lock(&s->lock); >> + if (ret < 0) { >> + goto fail; >> + } >> + } else { >> + cur_nr_sectors = n; >> + if (bs->backing_hd) { >> + qemu_iovec_reset(&hd_qiov); >> + qemu_iovec_concat(&hd_qiov, qiov, bytes_done, >> + cur_nr_sectors * BDRV_SECTOR_SIZE); >> + n1 = add_cow_backing_read(bs->backing_hd, &hd_qiov, >> + sector_num, cur_nr_sectors); >> + if (n1 > 0) { >> + qemu_co_mutex_unlock(&s->lock); >> + ret = bdrv_co_readv(bs->backing_hd, sector_num, >> + n, &hd_qiov); >> + qemu_co_mutex_lock(&s->lock); >> + if (ret < 0) { >> + goto fail; >> + } >> + } >> + } else { >> + qemu_iovec_memset(&hd_qiov, 0, 0, >> + BDRV_SECTOR_SIZE * cur_nr_sectors); >> + } >> + } >> + remaining_sectors -= cur_nr_sectors; >> + sector_num += cur_nr_sectors; >> + bytes_done += cur_nr_sectors * BDRV_SECTOR_SIZE; >> + } >> +fail: >> + qemu_co_mutex_unlock(&s->lock); >> + qemu_iovec_destroy(&hd_qiov); >> + return ret; >> +} >> + >> +static int coroutine_fn copy_sectors(BlockDriverState *bs, >> + int n_start, int n_end) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + QEMUIOVector qiov; >> + struct iovec iov; >> + int n, ret; >> + >> + n = n_end - n_start; >> + if (n <= 0) { >> + return 0; >> + } >> + >> + iov.iov_len = n * BDRV_SECTOR_SIZE; >> + iov.iov_base = qemu_blockalign(bs, iov.iov_len); >> + >> + qemu_iovec_init_external(&qiov, &iov, 1); >> + >> + ret = bdrv_co_readv(bs->backing_hd, n_start, n, &qiov); >> + if (ret < 0) { >> + goto out; >> + } >> + ret = bdrv_co_writev(s->image_hd, n_start, n, &qiov); >> + if (ret < 0) { >> + goto out; >> + } >> + >> + ret = 0; >> +out: >> + qemu_vfree(iov.iov_base); >> + return ret; >> +} >> + >> +static coroutine_fn int 
add_cow_co_writev(BlockDriverState *bs, >> + int64_t sector_num, int remaining_sectors, QEMUIOVector *qiov) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + BlockCache *c = s->bitmap_cache; >> + int ret = 0, i; >> + QEMUIOVector hd_qiov; >> + uint8_t *table; >> + uint64_t offset; >> + >> + qemu_co_mutex_lock(&s->lock); >> + qemu_iovec_init(&hd_qiov, qiov->niov); >> + ret = bdrv_co_writev(s->image_hd, >> + sector_num, >> + remaining_sectors, qiov); >> + >> + if (ret < 0) { >> + goto fail; >> + } >> + if ((s->header.features & ADD_COW_F_All_ALLOCATED) == 0) { >> + /* Copy content of unmodified sectors */ >> + if (!is_cluster_head(sector_num) && !is_allocated(bs, sector_num)) { >> + ret = copy_sectors(bs, sector_num & ~(SECTORS_PER_CLUSTER - 1), >> + sector_num); >> + if (ret < 0) { >> + goto fail; >> + } >> + } >> + >> + if (!is_cluster_tail(sector_num + remaining_sectors - 1) >> + && !is_allocated(bs, sector_num + remaining_sectors - 1)) { >> + ret = copy_sectors(bs, sector_num + remaining_sectors, >> + ((sector_num + remaining_sectors) | (SECTORS_PER_CLUSTER - 1)) + 1); >> + if (ret < 0) { >> + goto fail; >> + } >> + } >> + >> + for (i = sector_num / SECTORS_PER_CLUSTER; >> + i <= (sector_num + remaining_sectors - 1) / SECTORS_PER_CLUSTER; >> + i++) { >> + offset = ADD_COW_PAGE_SIZE * s->header.header_pages_size >> + + (offset_in_bitmap(i * SECTORS_PER_CLUSTER) & (~(c->entry_size - 1))); > > The maths in this loop looks a bit confusing, but I think it's correct. > >> + ret = block_cache_get(bs, s->bitmap_cache, offset, >> + (void **)&table, BLOCK_TABLE_BITMAP, ADD_COW_CACHE_ENTRY_SIZE); >> + if (ret < 0) { >> + goto fail; >> + } >> + if ((table[i / 8] & (1 << (i % 8))) == 0) { >> + table[i / 8] |= (1 << (i % 8)); >> + block_cache_entry_mark_dirty(s->bitmap_cache, table); >> + } > > Missing block_cache_put again? 
> >> + } >> + } >> + ret = 0; >> +fail: >> + qemu_co_mutex_unlock(&s->lock); >> + qemu_iovec_destroy(&hd_qiov); >> + return ret; >> +} >> + >> +static int bdrv_add_cow_truncate(BlockDriverState *bs, int64_t size) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + int sector_per_byte = SECTORS_PER_CLUSTER * 8; >> + int ret; >> + uint32_t bitmap_pos = s->header.header_pages_size * ADD_COW_PAGE_SIZE; >> + int64_t bitmap_size = >> + (size / BDRV_SECTOR_SIZE + sector_per_byte - 1) / sector_per_byte; >> + bitmap_size = (bitmap_size + ADD_COW_CACHE_ENTRY_SIZE - 1) >> + & (~(ADD_COW_CACHE_ENTRY_SIZE - 1)); >> + >> + ret = bdrv_truncate(bs->file, bitmap_pos + bitmap_size); >> + if (ret < 0) { >> + return ret; >> + } >> + return 0; >> +} > > So you don't truncate s->image_file? Does this work? s->image_file should be truncated? Image file can have a larger virtual size than backing_file, my understanding is we should not truncate image file. > >> + >> +static coroutine_fn int add_cow_co_flush(BlockDriverState *bs) >> +{ >> + BDRVAddCowState *s = bs->opaque; >> + int ret; >> + >> + qemu_co_mutex_lock(&s->lock); >> + ret = block_cache_flush(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP, >> + ADD_COW_CACHE_ENTRY_SIZE); >> + qemu_co_mutex_unlock(&s->lock); >> + return ret; >> +} > > What about flushing s->image_file? 
> >> + >> +static QEMUOptionParameter add_cow_create_options[] = { >> + { >> + .name = BLOCK_OPT_SIZE, >> + .type = OPT_SIZE, >> + .help = "Virtual disk size" >> + }, >> + { >> + .name = BLOCK_OPT_BACKING_FILE, >> + .type = OPT_STRING, >> + .help = "File name of a base image" >> + }, >> + { >> + .name = BLOCK_OPT_BACKING_FMT, >> + .type = OPT_STRING, >> + .help = "Image format of the base image" >> + }, >> + { >> + .name = BLOCK_OPT_IMAGE_FILE, >> + .type = OPT_STRING, >> + .help = "File name of a image file" >> + }, >> + { >> + .name = BLOCK_OPT_IMAGE_FORMAT, >> + .type = OPT_STRING, >> + .help = "Image format of the image file" >> + }, >> + { NULL } >> +}; >> + >> +static BlockDriver bdrv_add_cow = { >> + .format_name = "add-cow", >> + .instance_size = sizeof(BDRVAddCowState), >> + .bdrv_probe = add_cow_probe, >> + .bdrv_open = add_cow_open, >> + .bdrv_close = add_cow_close, >> + .bdrv_create = add_cow_create, >> + .bdrv_co_readv = add_cow_co_readv, >> + .bdrv_co_writev = add_cow_co_writev, >> + .bdrv_truncate = bdrv_add_cow_truncate, >> + .bdrv_co_is_allocated = add_cow_is_allocated, >> + >> + .create_options = add_cow_create_options, >> + .bdrv_co_flush_to_os = add_cow_co_flush, >> +}; >> + >> +static void bdrv_add_cow_init(void) >> +{ >> + bdrv_register(&bdrv_add_cow); >> +} >> + >> +block_init(bdrv_add_cow_init); >> diff --git a/block/add-cow.h b/block/add-cow.h >> new file mode 100644 >> index 0000000..f058376 >> --- /dev/null >> +++ b/block/add-cow.h >> @@ -0,0 +1,85 @@ >> +/* >> + * QEMU ADD-COW Disk Format >> + * >> + * Copyright IBM, Corp. 2012 >> + * >> + * Authors: >> + * Dong Xu Wang <wdongxu@linux.vnet.ibm.com> >> + * >> + * This work is licensed under the terms of the GNU LGPL, version 2 or later. >> + * See the COPYING.LIB file in the top-level directory. 
>> + * >> + */ >> + >> +#ifndef BLOCK_ADD_COW_H >> +#define BLOCK_ADD_COW_H >> +#include "block-cache.h" >> + >> +enum { >> + ADD_COW_F_All_ALLOCATED = 0X01, >> + ADD_COW_FEATURE_MASK = ADD_COW_F_All_ALLOCATED, >> + >> + ADD_COW_MAGIC = (((uint64_t)'A' << 56) | ((uint64_t)'D' << 48) | \ >> + ((uint64_t)'D' << 40) | ((uint64_t)'_' << 32) | \ >> + ((uint64_t)'C' << 24) | ((uint64_t)'O' << 16) | \ >> + ((uint64_t)'W' << 8) | 0xFF), >> + ADD_COW_VERSION = 1, >> + ADD_COW_FILE_LEN = 1024, >> + ADD_COW_CACHE_SIZE = 16, >> + ADD_COW_CACHE_ENTRY_SIZE = 65536, >> + ADD_COW_CLUSTER_SIZE = 65536, >> + SECTORS_PER_CLUSTER = (ADD_COW_CLUSTER_SIZE / BDRV_SECTOR_SIZE), >> + ADD_COW_PAGE_SIZE = 4096, >> + ADD_COW_DEFAULT_PAGE_SIZE = 1, >> +}; >> + >> +typedef struct AddCowHeader { >> + uint64_t magic; >> + uint32_t version; >> + >> + uint32_t backing_filename_offset; >> + uint32_t backing_filename_size; >> + >> + uint32_t image_filename_offset; >> + uint32_t image_filename_size; >> + >> + uint64_t features; >> + uint64_t optional_features; >> + uint32_t header_pages_size; >> +} QEMU_PACKED AddCowHeader; > > Why aren't backing/image_file_format part of the header here? They are > in the spec. It would also simplify some offset calculation code. > Anthony said "It's far better to shrink the size of the header and use an offset/len pointer to the backing file string. Limiting backing files to 1023 is unacceptable" http://lists.gnu.org/archive/html/qemu-devel/2012-05/msg04110.html So I use offset and length instead of using string directly. 
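The offset/length scheme described here comes down to a small layout computation, which add_cow_create performs inline. A minimal sketch of it follows, under stated assumptions: HEADER_LEN is the sum of the packed AddCowHeader field sizes as quoted (48 bytes), the two 16-byte format fields follow the header, and the names NameLayout and layout_names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    HEADER_LEN       = 48,    /* packed AddCowHeader: 8+4+4+4+4+4+8+8+4 */
    FORMAT_FIELD_LEN = 16,    /* backing_file_format[], image_file_format[] */
    HEADER_AREA      = 4096,  /* ADD_COW_PAGE_SIZE * ADD_COW_DEFAULT_PAGE_SIZE */
};

typedef struct NameLayout {
    uint32_t backing_offset, backing_size;
    uint32_t image_offset, image_size;
} NameLayout;

/*
 * Place the backing and image file names back to back after the fixed
 * fields. Returns 0 and fills *out, or -1 if they overflow the header area.
 * `backing` may be NULL; its offset and size are then left at 0, as in the
 * zero-initialised header of the patch.
 */
static int layout_names(const char *backing, const char *image, NameLayout *out)
{
    size_t pos = HEADER_LEN + 2 * FORMAT_FIELD_LEN;
    size_t blen = backing ? strlen(backing) : 0;
    size_t ilen;

    assert(image != NULL && out != NULL);
    ilen = strlen(image);
    if (blen > HEADER_AREA || ilen > HEADER_AREA
        || pos + blen + ilen > HEADER_AREA) {
        return -1;
    }
    out->backing_offset = backing ? (uint32_t)pos : 0;
    out->backing_size = (uint32_t)blen;
    out->image_offset = (uint32_t)(pos + blen);
    out->image_size = (uint32_t)ilen;
    return 0;
}
```

Because only the two file names are variable-length, keeping the format names as fixed fields in front of them does not reintroduce a limit on file name length, which is the concern quoted from Anthony.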
>> +
>> +typedef struct BDRVAddCowState {
>> +    BlockDriverState *image_hd;
>> +    CoMutex lock;
>> +    int cluster_size;
>> +    BlockCache *bitmap_cache;
>> +    uint64_t bitmap_size;
>> +    AddCowHeader header;
>> +    char backing_file_format[16];
>> +    char image_file_format[16];
>> +} BDRVAddCowState;
>> +
>> +/* Convert sector_num to offset in bitmap */
>> +static inline int64_t offset_in_bitmap(int64_t sector_num)
>> +{
>> +    int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER;
>> +    return cluster_num / 8;
>> +}
>> +
>> +static inline bool is_cluster_head(int64_t sector_num)
>> +{
>> +    return sector_num % SECTORS_PER_CLUSTER == 0;
>> +}
>> +
>> +static inline bool is_cluster_tail(int64_t sector_num)
>> +{
>> +    return (sector_num + 1) % SECTORS_PER_CLUSTER == 0;
>> +}
>> +
>> +BlockCache *add_cow_cache_create(BlockDriverState *bs, int num_tables);
>> +int add_cow_cache_destroy(BlockDriverState *bs, BlockCache *c);
>> +void add_cow_cache_entry_mark_dirty(BlockCache *c, void *table);
>> +int add_cow_cache_get(BlockDriverState *bs, BlockCache *c, uint64_t offset,
>> +                      void **table);
>> +int add_cow_cache_flush(BlockDriverState *bs, BlockCache *c);
>
> These functions don't really exist any more, do they?

Right, sorry.

>
> Kevin
>

Thank you, Kevin.
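On the endianness point acknowledged in this reply: the header fields should be converted from little endian exactly once. A second le64_to_cpu() on an already converted field changes nothing on a little-endian host, but swaps the bytes again on a big-endian one. The sketch below shows a host-independent way to decode and check the magic and version. It is an illustration, not QEMU code: load_le64, load_le32 and looks_like_add_cow are hypothetical names, and only the magic and version values are taken from add-cow.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADD_COW_MAGIC ((((uint64_t)'A') << 56) | (((uint64_t)'D') << 48) | \
                       (((uint64_t)'D') << 40) | (((uint64_t)'_') << 32) | \
                       (((uint64_t)'C') << 24) | (((uint64_t)'O') << 16) | \
                       (((uint64_t)'W') << 8) | 0xFFu)
#define ADD_COW_VERSION 1u

/* Decode a little-endian 64-bit value from raw bytes, on any host. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;

    assert(p != NULL);
    for (i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Decode a little-endian 32-bit value from raw bytes, on any host. */
static uint32_t load_le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if buf starts with an add-cow magic and version, else 0. */
static int looks_like_add_cow(const uint8_t *buf, size_t len)
{
    if (len < 12) {
        return 0;
    }
    return load_le64(buf) == ADD_COW_MAGIC
        && load_le32(buf + 8) == ADD_COW_VERSION;
}
```

Decoding from bytes this way gives the same answer on little-endian and big-endian hosts, so a check written like this does not need a big-endian test machine to be trusted.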
* Re: [Qemu-devel] [PATCH V12 5/6] add-cow file format
  2012-09-12  7:28       ` Dong Xu Wang
@ 2012-09-12  7:50         ` Kevin Wolf
  0 siblings, 0 replies; 25+ messages in thread
From: Kevin Wolf @ 2012-09-12  7:50 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: qemu-devel

On 12.09.2012 09:28, Dong Xu Wang wrote:
>>> +static bool is_allocated(BlockDriverState *bs, int64_t sector_num)
>>> +{
>>> +    BDRVAddCowState *s = bs->opaque;
>>> +    BlockCache *c = s->bitmap_cache;
>>> +    int64_t cluster_num = sector_num / SECTORS_PER_CLUSTER;
>>> +    uint8_t *table = NULL;
>>> +    uint64_t offset = ADD_COW_PAGE_SIZE * s->header.header_pages_size
>>> +        + (offset_in_bitmap(sector_num) & (~(c->entry_size - 1)));
>>> +    int ret = block_cache_get(bs, s->bitmap_cache, offset,
>>> +        (void **)&table, BLOCK_TABLE_BITMAP, ADD_COW_CACHE_ENTRY_SIZE);
>>
>> No matching block_cache_put?
>>
>>> +
>>> +    if (ret < 0) {
>>> +        return ret;
>>> +    }
>>> +    return table[cluster_num / 8 % ADD_COW_CACHE_ENTRY_SIZE]
>>> +        & (1 << (cluster_num % 8));
>>> +}
>>> +
>>> +static coroutine_fn int add_cow_is_allocated(BlockDriverState *bs,
>>> +        int64_t sector_num, int nb_sectors, int *num_same)
>>> +{
>>> +    BDRVAddCowState *s = bs->opaque;
>>> +    int changed;
>>> +
>>> +    if (nb_sectors == 0) {
>>> +        *num_same = 0;
>>> +        return 0;
>>> +    }
>>> +
>>> +    if (s->header.features & ADD_COW_F_All_ALLOCATED) {
>>> +        *num_same = nb_sectors - 1;
>>
>> Why - 1?
>>
>>> +        return 1;
>>> +    }
>>> +    changed = is_allocated(bs, sector_num);
>>> +
>>> +    for (*num_same = 1; *num_same < nb_sectors; (*num_same)++) {
>>> +        if (is_allocated(bs, sector_num + *num_same) != changed) {
>>> +            break;
>>> +        }
>>> +    }
>>> +    return changed;
>>> +}

>>> +static int bdrv_add_cow_truncate(BlockDriverState *bs, int64_t size)
>>> +{
>>> +    BDRVAddCowState *s = bs->opaque;
>>> +    int sector_per_byte = SECTORS_PER_CLUSTER * 8;
>>> +    int ret;
>>> +    uint32_t bitmap_pos = s->header.header_pages_size * ADD_COW_PAGE_SIZE;
>>> +    int64_t bitmap_size =
>>> +        (size / BDRV_SECTOR_SIZE + sector_per_byte - 1) / sector_per_byte;
>>> +    bitmap_size = (bitmap_size + ADD_COW_CACHE_ENTRY_SIZE - 1)
>>> +        & (~(ADD_COW_CACHE_ENTRY_SIZE - 1));
>>> +
>>> +    ret = bdrv_truncate(bs->file, bitmap_pos + bitmap_size);
>>> +    if (ret < 0) {
>>> +        return ret;
>>> +    }
>>> +    return 0;
>>> +}
>>
>> So you don't truncate s->image_file? Does this work?
>
> s->image_file should be truncated? Image file can have a larger virtual size
> than backing_file, my understanding is we should not truncate image file.

I'm talking about s->image_hd, not bs->backing_hd. You are right that
the backing file should not be changed. But the associated raw image
should be resized, shouldn't it?

>>> +static coroutine_fn int add_cow_co_flush(BlockDriverState *bs)
>>> +{
>>> +    BDRVAddCowState *s = bs->opaque;
>>> +    int ret;
>>> +
>>> +    qemu_co_mutex_lock(&s->lock);
>>> +    ret = block_cache_flush(bs, s->bitmap_cache, BLOCK_TABLE_BITMAP,
>>> +        ADD_COW_CACHE_ENTRY_SIZE);
>>> +    qemu_co_mutex_unlock(&s->lock);
>>> +    return ret;
>>> +}
>>
>> What about flushing s->image_file?

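The bitmap sizing arithmetic in the quoted `bdrv_add_cow_truncate()` can be checked in isolation. The following sketch assumes 512-byte sectors and the constants from the quoted header; `add_cow_bitmap_bytes` is a hypothetical name, and the function covers only the size computation, not the resize of the associated raw image that the review asks about.

```c
#include <stdint.h>

enum {
    SIZING_SECTOR_SIZE = 512,        /* assumed value of BDRV_SECTOR_SIZE */
    SIZING_CLUSTER_SIZE = 65536,     /* ADD_COW_CLUSTER_SIZE */
    SIZING_CACHE_ENTRY_SIZE = 65536, /* ADD_COW_CACHE_ENTRY_SIZE */
};

/*
 * Hypothetical helper: number of bitmap bytes needed for an image of
 * virtual_size bytes, one bit per cluster, rounded up to a whole
 * number of cache entries as in the quoted truncate function.
 */
static int64_t add_cow_bitmap_bytes(int64_t virtual_size)
{
    const int64_t sectors_per_cluster =
        SIZING_CLUSTER_SIZE / SIZING_SECTOR_SIZE;
    const int64_t sectors_per_byte = sectors_per_cluster * 8;
    int64_t sectors = virtual_size / SIZING_SECTOR_SIZE;

    /* Round up to whole bitmap bytes (8 clusters each). */
    int64_t bytes = (sectors + sectors_per_byte - 1) / sectors_per_byte;

    /* Round up to a multiple of the cache entry size (a power of two). */
    return (bytes + SIZING_CACHE_ENTRY_SIZE - 1)
           & ~(int64_t)(SIZING_CACHE_ENTRY_SIZE - 1);
}
```

With these constants one bitmap byte covers 8 clusters, that is 512 KiB of guest data, so a single 64 KiB cache entry covers 32 GiB and most images need only one entry.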
>>> +typedef struct AddCowHeader {
>>> +    uint64_t magic;
>>> +    uint32_t version;
>>> +
>>> +    uint32_t backing_filename_offset;
>>> +    uint32_t backing_filename_size;
>>> +
>>> +    uint32_t image_filename_offset;
>>> +    uint32_t image_filename_size;
>>> +
>>> +    uint64_t features;
>>> +    uint64_t optional_features;
>>> +    uint32_t header_pages_size;
>>> +} QEMU_PACKED AddCowHeader;
>>
>> Why aren't backing/image_file_format part of the header here? They are
>> in the spec. It would also simplify some offset calculation code.
>>
>
> Anthony said "It's far better to shrink the size of the header and use
> an offset/len
> pointer to the backing file string. Limiting backing files to 1023 is
> unacceptable"
>
> http://lists.gnu.org/archive/html/qemu-devel/2012-05/msg04110.html
>
> So I use offset and length instead of using string directly.

I'm talking about the format, not the path.

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread
* [Qemu-devel] [PATCH V12 6/6] add-cow: add qemu-iotests support
  2012-08-10 15:39 [Qemu-devel] [PATCH V12 0/6] add-cow file format Dong Xu Wang
                   ` (4 preceding siblings ...)
  2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 5/6] add-cow file format Dong Xu Wang
@ 2012-08-10 15:39 ` Dong Xu Wang
  2012-09-11  9:55   ` Kevin Wolf
  2012-08-23  5:34 ` [Qemu-devel] [PATCH V12 0/6] add-cow file format Dong Xu Wang
  6 siblings, 1 reply; 25+ messages in thread
From: Dong Xu Wang @ 2012-08-10 15:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang

Add qemu-iotests support for add-cow.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
---
 tests/qemu-iotests/017       |    2 +-
 tests/qemu-iotests/020       |    2 +-
 tests/qemu-iotests/check     |    4 ++--
 tests/qemu-iotests/common    |    6 ++++++
 tests/qemu-iotests/common.rc |   19 +++++++++++++++++++
 5 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/017 b/tests/qemu-iotests/017
index 66951eb..d31432f 100755
--- a/tests/qemu-iotests/017
+++ b/tests/qemu-iotests/017
@@ -40,7 +40,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.pattern
 
 # Any format supporting backing files
-_supported_fmt qcow qcow2 vmdk qed
+_supported_fmt qcow qcow2 vmdk qed add-cow
 _supported_proto generic
 _supported_os Linux
 
diff --git a/tests/qemu-iotests/020 b/tests/qemu-iotests/020
index 2fb0ff8..3dbb495 100755
--- a/tests/qemu-iotests/020
+++ b/tests/qemu-iotests/020
@@ -42,7 +42,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.pattern
 
 # Any format supporting backing files
-_supported_fmt qcow qcow2 vmdk qed
+_supported_fmt qcow qcow2 vmdk qed add-cow
 _supported_proto generic
 _supported_os Linux
 
diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 432732c..122267b 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -243,7 +243,7 @@ do
             echo " - no qualified output"
             err=true
         else
-            if diff -w $seq.out $tmp.out >/dev/null 2>&1
+            if diff -w -I "^Formatting" $seq.out $tmp.out >/dev/null 2>&1
             then
                 echo ""
                 if $err
@@ -255,7 +255,7 @@ do
             else
                 echo " - output mismatch (see $seq.out.bad)"
                 mv $tmp.out $seq.out.bad
-                $diff -w $seq.out $seq.out.bad
+                $diff -w -I "^Formatting" $seq.out $seq.out.bad
                 err=true
             fi
         fi
diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
index 1f6fdf5..1c81b09 100644
--- a/tests/qemu-iotests/common
+++ b/tests/qemu-iotests/common
@@ -128,6 +128,7 @@ common options
 check options
     -raw                test raw (default)
     -cow                test cow
+    -add-cow            test add-cow
     -qcow               test qcow
     -qcow2              test qcow2
     -qed                test qed
@@ -163,6 +164,11 @@ testlist options
             xpand=false
             ;;
 
+        -add-cow)
+            IMGFMT=add-cow
+            xpand=false
+            ;;
+
         -qcow)
             IMGFMT=qcow
             xpand=false
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 7782808..ec5afd7 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -97,6 +97,18 @@ _make_test_img()
     fi
     if [ \( "$IMGFMT" = "qcow2" -o "$IMGFMT" = "qed" \) -a -n "$CLUSTER_SIZE" ]; then
         optstr=$(_optstr_add "$optstr" "cluster_size=$CLUSTER_SIZE")
+    elif [ "$IMGFMT" = "add-cow" ]; then
+        local BACKING="$TEST_IMG"".qcow2"
+        local IMG="$TEST_IMG"".raw"
+        if [ "$1" = "-b" ]; then
+            IMG="$IMG"".b"
+            $QEMU_IMG create -f raw $IMG $image_size>/dev/null
+            extra_img_options="-o image_file=$IMG $extra_img_options"
+        else
+            $QEMU_IMG create -f raw $IMG $image_size>/dev/null
+            $QEMU_IMG create -f qcow2 $BACKING $image_size>/dev/null
+            extra_img_options="-o backing_file=$BACKING,image_file=$IMG"
+        fi
     fi
 
     if [ -n "$optstr" ]; then
@@ -125,6 +137,13 @@ _cleanup_test_img()
             rm -f $TEST_DIR/t.$IMGFMT
             rm -f $TEST_DIR/t.$IMGFMT.orig
             rm -f $TEST_DIR/t.$IMGFMT.base
+            if [ "$IMGFMT" = "add-cow" ]; then
+                rm -f $TEST_DIR/t.$IMGFMT.qcow2
+                rm -f $TEST_DIR/t.$IMGFMT.raw
+                rm -f $TEST_DIR/t.$IMGFMT.raw.b
+                rm -f $TEST_DIR/t.$IMGFMT.ct.qcow2
+                rm -f $TEST_DIR/t.$IMGFMT.ct.raw
+            fi
             ;;
 
         rbd)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH V12 6/6] add-cow: add qemu-iotests support
  2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 6/6] add-cow: add qemu-iotests support Dong Xu Wang
@ 2012-09-11  9:55   ` Kevin Wolf
  0 siblings, 0 replies; 25+ messages in thread
From: Kevin Wolf @ 2012-09-11  9:55 UTC (permalink / raw)
  To: Dong Xu Wang; +Cc: qemu-devel

On 10.08.2012 17:39, Dong Xu Wang wrote:
> Add qemu-iotests support for add-cow.
>
> Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
> ---
>  tests/qemu-iotests/017       |    2 +-
>  tests/qemu-iotests/020       |    2 +-
>  tests/qemu-iotests/check     |    4 ++--
>  tests/qemu-iotests/common    |    6 ++++++
>  tests/qemu-iotests/common.rc |   19 +++++++++++++++++++
>  5 files changed, 29 insertions(+), 4 deletions(-)

> diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
> index 432732c..122267b 100755
> --- a/tests/qemu-iotests/check
> +++ b/tests/qemu-iotests/check
> @@ -243,7 +243,7 @@ do
>              echo " - no qualified output"
>              err=true
>          else
> -            if diff -w $seq.out $tmp.out >/dev/null 2>&1
> +            if diff -w -I "^Formatting" $seq.out $tmp.out >/dev/null 2>&1
>              then
>                  echo ""
>                  if $err
> @@ -255,7 +255,7 @@ do
>              else
>                  echo " - output mismatch (see $seq.out.bad)"
>                  mv $tmp.out $seq.out.bad
> -                $diff -w $seq.out $seq.out.bad
> +                $diff -w -I "^Formatting" $seq.out $seq.out.bad
>                  err=true
>              fi
>          fi

These two hunks don't look right. You probably want to amend the sed
command in _make_test_img().

> diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
> index 7782808..ec5afd7 100644
> --- a/tests/qemu-iotests/common.rc
> +++ b/tests/qemu-iotests/common.rc
> @@ -97,6 +97,18 @@ _make_test_img()
>      fi
>      if [ \( "$IMGFMT" = "qcow2" -o "$IMGFMT" = "qed" \) -a -n "$CLUSTER_SIZE" ]; then
>          optstr=$(_optstr_add "$optstr" "cluster_size=$CLUSTER_SIZE")
> +    elif [ "$IMGFMT" = "add-cow" ]; then
> +        local BACKING="$TEST_IMG"".qcow2"
> +        local IMG="$TEST_IMG"".raw"
> +        if [ "$1" = "-b" ]; then
> +            IMG="$IMG"".b"
> +            $QEMU_IMG create -f raw $IMG $image_size>/dev/null
> +            extra_img_options="-o image_file=$IMG $extra_img_options"
> +        else
> +            $QEMU_IMG create -f raw $IMG $image_size>/dev/null
> +            $QEMU_IMG create -f qcow2 $BACKING $image_size>/dev/null
> +            extra_img_options="-o backing_file=$BACKING,image_file=$IMG"
> +        fi

This looks a bit hackish... Doesn't it completely ignore the requested
backing file name? I'm not sure if this is a good idea.

Can't you just create the raw image file and then use _optstr_add to add
the right -o image_file=... option? It should automatically get the
backing file right.

>      fi
>
>      if [ -n "$optstr" ]; then
> @@ -125,6 +137,13 @@ _cleanup_test_img()
>              rm -f $TEST_DIR/t.$IMGFMT
>              rm -f $TEST_DIR/t.$IMGFMT.orig
>              rm -f $TEST_DIR/t.$IMGFMT.base
> +            if [ "$IMGFMT" = "add-cow" ]; then
> +                rm -f $TEST_DIR/t.$IMGFMT.qcow2
> +                rm -f $TEST_DIR/t.$IMGFMT.raw
> +                rm -f $TEST_DIR/t.$IMGFMT.raw.b
> +                rm -f $TEST_DIR/t.$IMGFMT.ct.qcow2
> +                rm -f $TEST_DIR/t.$IMGFMT.ct.raw

What are the .ct files?

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [Qemu-devel] [PATCH V12 0/6] add-cow file format
  2012-08-10 15:39 [Qemu-devel] [PATCH V12 0/6] add-cow file format Dong Xu Wang
                   ` (5 preceding siblings ...)
  2012-08-10 15:39 ` [Qemu-devel] [PATCH V12 6/6] add-cow: add qemu-iotests support Dong Xu Wang
@ 2012-08-23  5:34 ` Dong Xu Wang
  6 siblings, 0 replies; 25+ messages in thread
From: Dong Xu Wang @ 2012-08-23  5:34 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, Dong Xu Wang

Could anyone give me some comments? I would be very grateful.

On Fri, Aug 10, 2012 at 11:39 PM, Dong Xu Wang
<wdongxu@linux.vnet.ibm.com> wrote:
> This will introduce a new file format: add-cow.
>
> add-cow can benefit from other available functions, such as path_has_protocol and
> qed_read_string, so we will make them public.
>
> Now add-cow is still using QEMUOptionParameter, not QemuOpts, I will send a
> separate patch series to convert.
>
> snapshot_blkdev are not supported now for add-cow, after converting QEMUOptionParameter
> to QemuOpts, will add related code.
>
>
> v11->v12:
> 1) Removed un-used feature bit.
> 2) Share cache code with qcow2.c.
> 3) Remove snapshot_blkdev support, will add it in another patch.
> 5) COW Bitmap field in add-cow file will be multiple of 65536.
> 6) fix grammer and typo.
>
> Dong Xu Wang (6):
>   docs: document for add cow file format
>   make path_has_protocol non-static
>   qed_read_string to bdrv_read_string
>   rename qcow2-cache.c to block-cache.c
>   add-cow file format
>   qemu-iotests
>
>  block.c                      |   29 ++-
>  block.h                      |    6 +
>  block/Makefile.objs          |    4 +-
>  block/add-cow.c              |  613 ++++++++++++++++++++++++++++++++++++++++++
>  block/add-cow.h              |   85 ++++++
>  block/qcow2-cache.c          |  323 ----------------------
>  block/qcow2-cluster.c        |   66 +++--
>  block/qcow2-refcount.c       |   66 +++--
>  block/qcow2.c                |   36 ++--
>  block/qcow2.h                |   24 +--
>  block/qed.c                  |   29 +--
>  block_int.h                  |    2 +
>  docs/specs/add-cow.txt       |  123 +++++++++
>  tests/qemu-iotests/017       |    2 +-
>  tests/qemu-iotests/020       |    2 +-
>  tests/qemu-iotests/check     |    4 +-
>  tests/qemu-iotests/common    |    6 +
>  tests/qemu-iotests/common.rc |   19 ++
>  trace-events                 |   13 +-
>  19 files changed, 994 insertions(+), 458 deletions(-)
>  create mode 100644 block/add-cow.c
>  create mode 100644 block/add-cow.h
>  delete mode 100644 block/qcow2-cache.c
>  create mode 100644 docs/specs/add-cow.txt
>

^ permalink raw reply	[flat|nested] 25+ messages in thread