Re: [Qemu-devel] [PATCH 08/10] block/dmg: fix sector data offset calculation

From: Peter Wu <peter@lekensteyn.nl>
To: John Snow <jsnow@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 08/10] block/dmg: fix sector data offset calculation
Date: Sat, 03 Jan 2015 13:47:11 +0100
Message-ID: <1605759.uV2RWABCx2@al>
In-Reply-To: <54A73249.7090002@redhat.com>

On Friday 02 January 2015 19:05:29 John Snow wrote:
> On 12/27/2014 10:01 AM, Peter Wu wrote:
> > This patch addresses two issues:
> >
> >   - The data fork offset was not taken into account, resulting in failure
> >     to read an InstallESD.dmg file (5164763151 bytes) which had a
> >     non-zero DataForkOffset field.
> >   - The offset of the previous block ("partition") was unconditionally
> >     added to the current block because older files would start the input
> >     offset of a new block at zero. Newer files (including vlc-2.1.5.dmg,
> >     tuxpaint-0.9.15-macosx.dmg and OS X Yosemite [MAS].dmg) failed in
> >     reads because these files have a continuous input offset.
> >
> 
> What does "continuous input offset" mean? This change is not as clear to 
> me, see below.

By "continuous" I mean that the newer files store absolute offsets, while
the offsets in older files restart at zero for every block and are
therefore relative to the end of the previous block.
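
To illustrate the difference, here is a minimal standalone sketch of the
heuristic from the patch below. The function name rebase_offsets and its
parameters are made up for this example and do not exist in block/dmg.c;
"base" stands for the data fork offset and "last_end" for the end of the
last chunk of the previous mish block.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of block/dmg.c): turn the raw chunk
 * offsets of one mish block into absolute file offsets.
 *   base     - data fork offset taken from the koly trailer
 *   last_end - end of the last chunk of the previous mish block */
void rebase_offsets(uint64_t *offsets, size_t n,
                    uint64_t base, uint64_t last_end)
{
    uint64_t in_offset = base;
    size_t i;

    for (i = 0; i < n; i++) {
        /* An offset below the previous chunk end suggests an older file
         * that restarts counting per block, so continue after the
         * previous block from here on. */
        if (offsets[i] + in_offset < last_end) {
            in_offset = last_end;
        }
        offsets[i] += in_offset;
    }
}
```

With last_end = 500, an older-style block with raw offsets {0, 100} and a
newer-style block with raw offsets {500, 600} both end up as {500, 600}.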

> > Signed-off-by: Peter Wu <peter@lekensteyn.nl>
> > ---
> >   block/dmg.c | 16 +++++++++++++++-
> >   1 file changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/dmg.c b/block/dmg.c
> > index 984997f..93b597f 100644
> > --- a/block/dmg.c
> > +++ b/block/dmg.c
> > @@ -179,6 +179,7 @@ static int64_t dmg_find_koly_offset(BlockDriverState *file_bs)
> >   typedef struct DmgHeaderState {
> >       /* used internally by dmg_read_mish_block to remember offsets of blocks
> >        * across calls */
> > +    uint64_t data_fork_offset;
> >       uint64_t last_in_offset;
> >       uint64_t last_out_offset;
> >       /* exported for dmg_open */
> > @@ -194,6 +195,7 @@ static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds,
> >       size_t new_size;
> >       uint32_t chunk_count;
> >       int64_t offset = 0;
> > +    uint64_t in_offset = ds->data_fork_offset;
> >
> >       type = buff_read_uint32(buffer, offset);
> >       /* skip data that is not a valid MISH block (invalid magic or too small) */
> > @@ -246,7 +248,12 @@ static int dmg_read_mish_block(BDRVDMGState *s, DmgHeaderState *ds,
> >           }
> >
> >           s->offsets[i] = buff_read_uint64(buffer, offset);
> > -        s->offsets[i] += ds->last_in_offset;
> > +        /* If this offset is below the previous chunk end, then assume that all
> > +         * following offsets are after the previous chunks. */
> > +        if (s->offsets[i] + in_offset < ds->last_in_offset) {
> > +            in_offset = ds->last_in_offset;
> > +        }
> > +        s->offsets[i] += in_offset;
> 
> I take it that all of the offsets referenced in the mish structures are 
> relative to the start of the data fork block, which is why we are taking 
> a value from the koly block and applying it to mish block values.
> 
> correct?

Correct, the mish block describes the contents of the data fork.
http://newosxbook.com/DMG.html says:

typedef struct {
        // ...
        uint64_t CompressedOffset;  // Start of chunk in data fork
        uint64_t CompressedLength;  // Count of bytes of chunk, in data fork
} __attribute__((__packed__)) BLKXChunkEntry;
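
As a rough sketch of how these two fields could be pulled out of a raw
entry: the code below assumes that an entry is 40 bytes, that the two
fields above are its last 16 bytes, and that all values are stored
big-endian. The names read_be64 and chunk_in_range are invented for this
example, they are not the helpers used in block/dmg.c.

```c
#include <stdint.h>

/* Assumed size of one chunk entry in bytes. */
enum { CHUNK_ENTRY_SIZE = 40 };

/* Read a big-endian 64-bit value from an unaligned buffer. */
uint64_t read_be64(const uint8_t *p)
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Hypothetical helper: extract CompressedOffset and CompressedLength,
 * assumed to be the last two 8-byte fields of the entry. Both describe
 * a byte range inside the data fork, not inside the whole file. */
void chunk_in_range(const uint8_t *entry,
                    uint64_t *offset, uint64_t *length)
{
    *offset = read_be64(entry + CHUNK_ENTRY_SIZE - 16);
    *length = read_be64(entry + CHUNK_ENTRY_SIZE - 8);
}
```

The returned offset still needs the data fork offset added before it can
be used as a position in the image file.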

> >           offset += 8;
> >
> >           s->lengths[i] = buff_read_uint64(buffer, offset);
> > @@ -400,6 +407,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
> >       bs->read_only = 1;
> >       s->n_chunks = 0;
> >       s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
> > +    ds.data_fork_offset = 0;
> >       ds.last_in_offset = 0;
> >       ds.last_out_offset = 0;
> >       ds.max_compressed_size = 1;
> > @@ -412,6 +420,12 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
> >           goto fail;
> >       }
> >
> > +    /* offset of data fork (DataForkOffset) */
> > +    ret = read_uint64(bs, offset + 0x18, &ds.data_fork_offset);
> > +    if (ret < 0) {
> > +        goto fail;
> > +    }
> > +
> >       /* offset of resource fork (RsrcForkOffset) */
> >       ret = read_uint64(bs, offset + 0x28, &rsrc_fork_offset);
> >       if (ret < 0) {
> >
> 
> A general question here:
> 
> Are we ever reading the preamble of the mish block? I see we are reading 
> the 'n' items of 40-byte chunk data, but is there a reason we skip the 
> first 200 bytes of mish data, or have I misread the documents on DMG 
> that exist?

We only use the Signature field to verify that we indeed have a BLKX
entry (required since the XML fork may yield other kinds of data which
do not have this magic).

The UDIFChecksum field is 136 bytes (confirmed by an Internet search).
Adding the other fields (Version..reserved6 and NumberOfBlockChunks)
results in 200 (+4 for the Signature).
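
The arithmetic can be double-checked with a packed struct. This is only a
sketch: the field list follows my reading of the newosxbook.com page, the
BuffersNeeded field and the type name MishHeader are assumptions that do
not appear elsewhere in this thread, and the packed attribute requires
GCC or clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the mish (BLKX) preamble, for size checking only. */
typedef struct {
    uint32_t Signature;            /*   4 bytes, the 'mish' magic */
    uint32_t Version;              /*   4 */
    uint64_t SectorNumber;         /*   8 */
    uint64_t SectorCount;          /*   8 */
    uint64_t DataOffset;           /*   8 */
    uint32_t BuffersNeeded;        /*   4 (assumed field name) */
    uint32_t BlockDescriptors;     /*   4 */
    uint32_t reserved[6];          /*  24, reserved1..reserved6 */
    uint8_t  UDIFChecksum[136];    /* 136 */
    uint32_t NumberOfBlockChunks;  /*   4 */
} __attribute__((__packed__)) MishHeader;

/* 4 (Signature) + 200 (everything else) = 204 bytes before the entries. */
_Static_assert(sizeof(MishHeader) == 204, "unexpected mish header size");
_Static_assert(offsetof(MishHeader, NumberOfBlockChunks) == 200,
               "NumberOfBlockChunks should be the last 4 bytes");
```

If the layout assumption is wrong, the static assertions fail at compile
time rather than silently misparsing the block.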

> It looks like there are some good fields here: SectorNumber, 
> SectorCount, DataOffset, and BlockDescriptors -- can these not be used 
> to provide more explicit error-checking of offsets, allowing us to 
> make fewer assumptions about where these blocks begin and end?
> 
> Is there some reason they are unreliable?

As far as I know this is not checked because nobody added it. I am not
aware of incorrect values inside this block. Let's see:

 - Version: could check this against '1' and fail if unknown?
 - SectorNumber: looks like this can be taken as the absolute offset,
   all entries seem to be relative to this one.
 - SectorCount: looks like the number of sectors which should match the
   entries (at least for the tuxpaint example dmg file).
 - DataOffset: 0 in the tuxpaint example. Perhaps it should be added to
   the data fork offset, but let's ignore it for now.
 - 0x208 for 3/4 mish blocks, 0 for the last mish block which does not
   seem to contain data.
 - NumberOfBlockChunks: 0xFFFFFFFF, 0, 1, 2 respectively for the mish
   blocks. No idea what this means. Probably the ordering, but we assume
   that the offsets are sorted AFAIK.

I will try to make use of SectorNumber and SectorCount; the others will
be ignored for now.
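
A possible shape for the SectorCount check, as a sketch only: the
function name and parameters are hypothetical, and it assumes that the
per-chunk sector counts of a mish block should add up exactly to the
SectorCount in its header (which holds for the tuxpaint example, but is
not verified for other files).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check (not in block/dmg.c): does the sum of the chunk
 * sector counts match the SectorCount field of the mish header? */
bool mish_sectors_consistent(uint64_t header_sector_count,
                             const uint64_t *chunk_sector_counts,
                             size_t n_chunks)
{
    uint64_t total = 0;
    size_t i;

    for (i = 0; i < n_chunks; i++) {
        /* Reject inputs whose sum would overflow 64 bits. */
        if (chunk_sector_counts[i] > UINT64_MAX - total) {
            return false;
        }
        total += chunk_sector_counts[i];
    }
    return total == header_sector_count;
}
```

A mismatch could then make dmg_read_mish_block fail early instead of
relying on guesses about where a block begins and ends.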
-- 
Kind regards,
Peter
https://lekensteyn.nl