From: Peter Wu <peter@lekensteyn.nl>
To: John Snow <jsnow@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 08/10] block/dmg: fix sector data offset calculation
Date: Sat, 03 Jan 2015 13:47:11 +0100 [thread overview]
Message-ID: <1605759.uV2RWABCx2@al> (raw)
In-Reply-To: <54A73249.7090002@redhat.com>
On Friday 02 January 2015 19:05:29 John Snow wrote:
> On 12/27/2014 10:01 AM, Peter Wu wrote:
> > This patch addresses two issues:
> >
> > - The data fork offset was not taken into account, resulting in failure
> > to read an InstallESD.dmg file (5164763151 bytes) which had a
> > non-zero DataForkOffset field.
> > - The offset of the previous block ("partition") was unconditionally
> > added to the current block because older files would start the input
> > offset of a new block at zero. Newer files (including vlc-2.1.5.dmg,
> > tuxpaint-0.9.15-macosx.dmg and OS X Yosemite [MAS].dmg) failed in
> > reads because these files have a continuous input offset.
> >
>
> What does "continuous input offset" mean? This change is not as clear to
> me, see below.
By "continuous" I mean that the new files have absolute offsets while
the offsets in older files were relative to the previous block.
> > Signed-off-by: Peter Wu <peter@lekensteyn.nl>
> > ---
> > block/dmg.c | 16 +++++++++++++++-
> > 1 file changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/dmg.c b/block/dmg.c
> > index 984997f..93b597f 100644
> > --- a/block/dmg.c
> > +++ b/block/dmg.c
> > @@ -179,6 +179,7 @@ static int64_t dmg_find_koly_offset(BlockDriverState *file_bs)
> > typedef struct DmgHeaderState {
> > /* used internally by dmg_read_mish_block to remember offsets of blocks
> > * across calls */
> > + uint64_t data_fork_offset;
> > uint64_t last_in_offset;
> > uint64_t last_out_offset;
> > /* exported for dmg_open */
> > @@ -194,6 +195,7 @@ static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds,
> > size_t new_size;
> > uint32_t chunk_count;
> > int64_t offset = 0;
> > + uint64_t in_offset = ds->data_fork_offset;
> >
> > type = buff_read_uint32(buffer, offset);
> > /* skip data that is not a valid MISH block (invalid magic or too small) */
> > @@ -246,7 +248,12 @@ static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds,
> > }
> >
> > s->offsets[i] = buff_read_uint64(buffer, offset);
> > - s->offsets[i] += ds->last_in_offset;
> > + /* If this offset is below the previous chunk end, then assume that all
> > + * following offsets are after the previous chunks. */
> > + if (s->offsets[i] + in_offset < ds->last_in_offset) {
> > + in_offset = ds->last_in_offset;
> > + }
> > + s->offsets[i] += in_offset;
>
> I take it that all of the offsets referenced in the mish structures are
> relative to the start of the data fork block, which is why we are taking
> a value from the koly block and applying it to mish block values.
>
> correct?
Correct, the mish block describes the contents of the data fork.
http://newosxbook.com/DMG.html says:
typedef struct {
// ...
uint64_t CompressedOffset; // Start of chunk in data fork
uint64_t CompressedLength; // Count of bytes of chunk, in data fork
} __attribute__((__packed__)) BLKXChunkEntry;
> > offset += 8;
> >
> > s->lengths[i] = buff_read_uint64(buffer, offset);
> > @@ -400,6 +407,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
> > bs->read_only = 1;
> > s->n_chunks = 0;
> > s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
> > + ds.data_fork_offset = 0;
> > ds.last_in_offset = 0;
> > ds.last_out_offset = 0;
> > ds.max_compressed_size = 1;
> > @@ -412,6 +420,12 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
> > goto fail;
> > }
> >
> > + /* offset of data fork (DataForkOffset) */
> > + ret = read_uint64(bs, offset + 0x18, &ds.data_fork_offset);
> > + if (ret < 0) {
> > + goto fail;
> > + }
> > +
> > /* offset of resource fork (RsrcForkOffset) */
> > ret = read_uint64(bs, offset + 0x28, &rsrc_fork_offset);
> > if (ret < 0) {
> >
>
> A general question here:
>
> Are we ever reading the preamble of the mish block? I see we are reading
> the 'n' items of 40-byte chunk data, but is there a reason we skip the
> first 200 bytes of mish data, or have I misread the documents on DMG
> that exist?
We only use the Signature field to verify that we indeed have a BLKX
entry (required since the XML fork may yield other kind of data which
does not have this magic).
The UDIFChecksum field is 136 bytes (confirmed by a internet search).
Adding the other fields (version..reserved6 and NumberOfBlockChunks)
results in 200 (+4 for the Signature).
> It looks like there are some good fields here: SectorNumber,
> SectorCount, DataOffset, and BlockDescriptors -- can these not be used
> to provide a more explicit error-checking of offsets, allowing us to
> make less assumptions about where these blocks begin and end?
>
> Is there some reason they are unreliable?
As far as I know this is not checked because nobody added it. I am not
aware of incorrect values inside this block. Let's see:
- Version: could check this against '1' and fail if unknown?
- SectorNumber: looks like this can be taken as the absolute offset,
all entries seem to be relative to this one.
- SectorCount: looks like the number of sectors which should match the
entries (at least for the tuxpaint example dmg file).
- DataOffset: 0 in the tuxpaint example. Perhaps it should be added to
the data fork offset, but let's ignore it for now.
- 0x208 for 3/4 mish blocks, 0 for the last mish block which does not
seem to contain data.
- NumberOfBlockChunks: 0xFFFFFFFF, 0, 1, 2 respectively for the mish
blocks. No idea what this means. Probably the ordering, but we assume
that the offsets are sorted AFAIK.
I will try to make use of SectorNumber and SectorCount, the others will
be ignored for now.
--
Kind regards,
Peter
https://lekensteyn.nl
next prev parent reply other threads:[~2015-01-03 12:47 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-12-27 15:01 [Qemu-devel] [PATCH 00/10] block/dmg: (compatibility) fixes and bzip2 support Peter Wu
2014-12-27 15:01 ` [Qemu-devel] [PATCH 01/10] block/dmg: properly detect the UDIF trailer Peter Wu
2015-01-02 23:58 ` John Snow
2015-01-03 9:39 ` Peter Wu
2015-01-06 13:35 ` Stefan Hajnoczi
2014-12-27 15:01 ` [Qemu-devel] [PATCH 02/10] block/dmg: extract mish block decoding functionality Peter Wu
2015-01-02 23:59 ` John Snow
2015-01-03 11:05 ` Peter Wu
2015-01-06 13:42 ` Stefan Hajnoczi
2014-12-27 15:01 ` [Qemu-devel] [PATCH 03/10] block/dmg: extract processing of resource forks Peter Wu
2015-01-03 0:01 ` John Snow
2015-01-03 11:24 ` Peter Wu
2014-12-27 15:01 ` [Qemu-devel] [PATCH 04/10] block/dmg: process a buffer instead of reading ints Peter Wu
2015-01-03 0:01 ` John Snow
2014-12-27 15:01 ` [Qemu-devel] [PATCH 05/10] block/dmg: validate chunk size to avoid overflow Peter Wu
2015-01-03 0:02 ` John Snow
2014-12-27 15:01 ` [Qemu-devel] [PATCH 06/10] block/dmg: process XML plists Peter Wu
2015-01-03 0:04 ` John Snow
2015-01-03 11:54 ` Peter Wu
2015-01-05 16:46 ` John Snow
2015-01-05 16:54 ` John Snow
2014-12-27 15:01 ` [Qemu-devel] [PATCH 07/10] block/dmg: set virtual size to a non-zero value Peter Wu
2015-01-03 0:04 ` John Snow
2014-12-27 15:01 ` [Qemu-devel] [PATCH 08/10] block/dmg: fix sector data offset calculation Peter Wu
2015-01-03 0:05 ` John Snow
2015-01-03 12:47 ` Peter Wu [this message]
2014-12-27 15:01 ` [Qemu-devel] [PATCH 09/10] block/dmg: support bzip2 block entry types Peter Wu
2015-01-05 19:32 ` John Snow
2015-01-07 10:29 ` Paolo Bonzini
2015-01-07 10:31 ` Peter Wu
2015-01-07 10:53 ` Paolo Bonzini
2014-12-27 15:01 ` [Qemu-devel] [PATCH 10/10] block/dmg: improve zeroes handling Peter Wu
2015-01-05 19:48 ` John Snow
2015-01-06 0:21 ` Peter Wu
2015-01-02 14:14 ` [Qemu-devel] [PATCH 00/10] block/dmg: (compatibility) fixes and bzip2 support Stefan Hajnoczi
2015-01-02 16:31 ` John Snow
2015-01-02 18:46 ` Peter Wu
2015-01-02 18:58 ` John Snow
2015-01-02 21:49 ` Peter Wu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1605759.uV2RWABCx2@al \
--to=peter@lekensteyn.nl \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).