Re: [PATCH 2/2 v2] debugfs: dump a sparse file as a new sparse file

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Zheng Liu <gnehzuil.liu@gmail.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org, George Spelvin <linux@horizon.com>,
	Zheng Liu <wenqing.lz@taobao.com>
Subject: Re: [PATCH 2/2 v2] debugfs: dump a sparse file as a new sparse file
Date: Wed, 9 Jan 2013 23:34:24 +0800	[thread overview]
Message-ID: <20130109153424.GA2945@gmail.com> (raw)
In-Reply-To: <20130109145854.GA27630@thunk.org>

On Wed, Jan 09, 2013 at 09:58:54AM -0500, Theodore Ts'o wrote:
> On Sat, Jan 05, 2013 at 12:45:22PM +0800, Zheng Liu wrote:
> > Yes, some programs call ext2fs_file_read() with a 4k or 16k fixed size
> > buffer, and ext2fs_file_read() calls ext2fs_file_read2().  But it won't
> > skip the sparse blocks because when ext2fs_file_read2() is called in
> > ext2fs_file_read(), the last argument, namely 'seek', is 0.  That means
> > that in ext2fs_file_read2() 'flags' is 0.  Thus, in load_buffer()
> > 'flags' is not equal to SEEK, and EXT2_FILE_BUF_VALID is marked.  Then
> > we return back to ext2fs_file_read2() and all data in file->buf is
> > copied.  So I think the behavior of ext2fs_file_read() doesn't be
> > changed.
> 
> You're right; I had forgotten about that part of the change.
> 
> I still am a bit concerned about the interface, because if you specify
> a pointer to seek in ext2fs_file_read2(), you have to know what the
> file system blocksize is, because if you give a count which is larger
> than a single block, the value of the returned seek and the data which
> is returned in the buffer is impossible to interpret (consider a file
> where every other 1k block is sparse, and you try to read into a 4k
> buffer).
> 
> So what I would suggest is the following as a better, more efficient
> interface.
> 
> 1) Add a new flag which can be passed into ext2_file_open() which
> requests sparse-intelligent handling.
> 
> 2) If the sparse flag is set, then ext2_file_read() will stop the read
> when it runs into the first uninitialized or sparse block.  That is,
> consider the example file which has 8k of data, a 4k uninitialized
> block, and then 12k of data after that.  If the sparse flag is passed
> to ext2_file_open(), then ext2fs_file_read(fd, buf, 16384, &got) will
> read 8k of data into buf, and return with got set to 8192.
> 
> 3) To distinguish between EOF and a sparse block, if the current file
> offset is pointed at a sparse/uninitialized block, and the sparse flag
> was passed to ext2_file_open(), then in addition to *got getting set
> 0, ext2_file_read() will also return a new error code,
> EXT2_ET_READ_HOLE_FOUND.
> 
> 4) We also extend ext2_file_llseek() to also support EXT2_SEEK_HOLE
> and EXT2_SEEK_DATA, which works like SEEK_HOLE and SEEK_DATA flags to
> llseek().  This will allow the caller to efficiently find the next
> part of the file with valid data.
> 
> What I like about this interface is that we don't need to define a new
> ext2_file_read2(), and it is also more efficient for an application
> which is interested in reading multiple blocks at a time.
> 
> What do you think?

Thanks so much for your advices.  I will try to generate the latest
patches.

Regards,
                                                - Zheng

next prev parent reply	other threads:[~2013-01-09 15:20 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-01 12:30 [PATCH 0/2 v2] debugfs: dump a sparse file as a new sparse file Zheng Liu
2013-01-01 12:30 ` [PATCH 1/2 v2] debugfs: fixup the hard-coded buffer length in dump_file Zheng Liu
2013-01-01 20:14   ` Theodore Ts'o
2013-01-01 12:30 ` [PATCH 2/2 v2] debugfs: dump a sparse file as a new sparse file Zheng Liu
2013-01-01 20:38   ` Theodore Ts'o
2013-01-04  4:05     ` Zheng Liu
2013-01-04 19:37       ` Theodore Ts'o
2013-01-05  4:45         ` Zheng Liu
2013-01-09 14:58           ` Theodore Ts'o
2013-01-09 15:34             ` Zheng Liu [this message]
2013-01-10  0:42             ` George Spelvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130109153424.GA2945@gmail.com \
    --to=gnehzuil.liu@gmail.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux@horizon.com \
    --cc=tytso@mit.edu \
    --cc=wenqing.lz@taobao.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.