linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Sanidhya Solanki <jpage.lkml@gmail.com>,
	David Sterba <dsterba@suse.cz>,
	clm@fb.com, jbacik@fb.com
Cc: Christoph Anton Mitterer <calestyo@scientia.net>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] BTRFS: Adds an option to select RAID Stripe size
Date: Sun, 3 Jan 2016 09:37:04 +0800	[thread overview]
Message-ID: <56887B40.10105@gmx.com> (raw)
In-Reply-To: <20160102065207.4eec760a@gmail.com>



On 01/02/2016 07:52 PM, Sanidhya Solanki wrote:
> On Tue, 29 Dec 2015 19:06:44 +0100
> David Sterba <dsterba@suse.cz> wrote:
>
>> In theory this is possible with current on-disk data structures. The
>> stripe length is property of btrfs_chunk and changing it should be
>> possible the same way we do other raid transformations. The
>> implementation might be tricky at some places, but basically boils
>> down to the "read-" and "write-" stripe size. Reading chunks would
>> always respect the stored size, writing new data would use eg. the
>> superblock->stripesize or other value provided by the user.
>
> I was having misgivings about the conversion project, but after
> re-reading this part, I will try and get a patch in by Wednesday.
>
> I still have my reservations about the following two parts:
> - Checksumming: I have no experience with how the CRC implementation
>    would deal with the changed blocksizes. Would the checksum be
>    different just because the superblock size has been changed? This
>    would make confirming if the transformation was successful much more
>    difficult. Another way to deal with this would be ti read the data
>    instead and compare it directly, instead of using checksums.

Btrfs checksum are calculated in 3 different method:
1) Metadata: Per nodesize, stored in tree blocker header. (struct 
btrfs_header->csum)
2) Data: Per sectorsize, stored in csum tree.
3) Superblock: Per 4K (fixed), stored in its header (struct 
btrfs_super->csum)

I didn't the need to change any of them, as you are not changing any of 
the csum behavior.

Stripe size only affect how btrfs does IO, not the csum size.

>
> - Performance: Should it have a higher throughput by using larger data
>    sizes (which may reduce performance in scenarios such as databases and
>    video editing) or by having multiple transformations in parallel on
>    smaller data blocks. I am not sure if you can implement things such
>    as OpenMP in kernel space. Or spawn multiple kworkers in parallel to
>    deal with multiple streams of data.

IIRC, btrfs only need to pass bio to devices, and the 
parallel/merge/schedule are all done by kernel bio level.
So you don't really need to bother that much.

And since you are making the stripe size configurable, then user is 
responsible for any too large or too small stripe size setting.

Your only concern would be the default value, but IMHO current 64K 
stripe size is good enough as a default value.

Thanks,
Qu

>
> I am not too worried about dealing with crashes, as we can just
> implement something like a table that contains the addresses currently
> undergoing changes (which may further reduce throughput, but make it
> more space) or do it by using a serial transformation, which ensures a
> block was committed to storage before proceeding to the next
> transformation.
>
> Essentially a time vs. CPU usage vs. Memory usage trade-off.
> Please chime in with your thoughts, developers and administrators.
>
> Thanks.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

  reply	other threads:[~2016-01-03  1:37 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-28 12:24 [PATCH] BTRFS: Adds an option to select RAID Stripe size Sanidhya Solanki
2015-12-28 22:19 ` Christoph Anton Mitterer
2015-12-28 20:38   ` Sanidhya Solanki
2015-12-29  1:21     ` Christoph Anton Mitterer
2015-12-28 21:43       ` Sanidhya Solanki
2015-12-29  3:42         ` Christoph Anton Mitterer
2015-12-29  0:03           ` Sanidhya Solanki
2015-12-29  4:26             ` Christoph Anton Mitterer
2015-12-29  1:31               ` Sanidhya Solanki
2015-12-29  6:03                 ` Christoph Anton Mitterer
2015-12-29  2:23                   ` Sanidhya Solanki
2015-12-29 15:32                     ` Christoph Anton Mitterer
2015-12-29 16:44                       ` Duncan
2015-12-30  2:56                         ` Christoph Anton Mitterer
2015-12-29 18:06               ` David Sterba
2015-12-30 20:00                 ` Christoph Anton Mitterer
2015-12-30 21:02                   ` Duncan
2015-12-30 21:13                     ` Christoph Anton Mitterer
2016-01-02 11:52                 ` Sanidhya Solanki
2016-01-03  1:37                   ` Qu Wenruo [this message]
2016-01-03  2:26                     ` Christoph Anton Mitterer
2016-01-05 10:44                       ` David Sterba
2016-01-05 18:48                         ` Christoph Anton Mitterer
2016-01-10  3:11                     ` Sanidhya Solanki
2016-01-11  1:29                       ` Qu Wenruo
2016-01-11 15:43                       ` Christoph Anton Mitterer
2016-01-11 11:49                         ` Sanidhya Solanki
2016-01-11 15:57                           ` Christoph Anton Mitterer
2016-01-11 16:01                             ` Hugo Mills
2016-01-12 12:23                             ` Austin S. Hemmelgarn
2016-01-12 12:07                               ` Sanidhya Solanki
2015-12-29 13:39 ` David Sterba
2015-12-29 11:15   ` Sanidhya Solanki
2015-12-29 17:06     ` David Sterba
2015-12-29 21:32       ` Sanidhya Solanki
2015-12-30  6:39       ` Sanidhya Solanki
2015-12-30 11:59         ` Qu Wenruo
2015-12-30  9:54           ` Sanidhya Solanki
2015-12-30 14:10             ` Qu Wenruo
2015-12-30 11:15               ` Sanidhya Solanki
2015-12-30 15:58                 ` David Sterba
2015-12-30 21:19                   ` Sanidhya Solanki
2015-12-30 16:17               ` David Sterba
2015-12-30 21:21                 ` Sanidhya Solanki
2016-01-05 10:33                   ` David Sterba
2015-12-31  0:46                 ` Qu Wenruo
2016-01-05 10:16                   ` David Sterba
2015-12-30 19:48               ` Christoph Anton Mitterer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56887B40.10105@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=calestyo@scientia.net \
    --cc=clm@fb.com \
    --cc=dsterba@suse.cz \
    --cc=jbacik@fb.com \
    --cc=jpage.lkml@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).