linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Behrens <sbehrens@giantdisaster.de>
To: linux-btrfs@vger.kernel.org
Cc: David Sterba <dave@jikos.cz>
Subject: Re: [PATCH v3 3/3] Btrfs: read device stats on mount, write modified ones during commit
Date: Mon, 21 May 2012 16:57:00 +0200	[thread overview]
Message-ID: <4FBA57BC.9080407@giantdisaster.de> (raw)
In-Reply-To: <20120518115719.GH24394@twin.jikos.cz>

On Fri, 18 May 2012 13:57:19 +0200, David Sterba wrote:
> On Wed, May 16, 2012 at 06:50:47PM +0200, Stefan Behrens wrote:
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -823,6 +823,26 @@ struct btrfs_csum_item {
>>  	u8 csum;
>>  } __attribute__ ((__packed__));
>>  
>> +struct btrfs_device_stats_item {
>> +	/*
>> +	 * grow this item struct at the end for future enhancements and keep
>> +	 * the existing values unchanged
>> +	 */
>> +	__le64 cnt_write_io_errs; /* EIO or EREMOTEIO from lower layers */
>> +	__le64 cnt_read_io_errs; /* EIO or EREMOTEIO from lower layers */
>> +	__le64 cnt_flush_io_errs; /* EIO or EREMOTEIO from lower layers */
>> +
>> +	/* stats for indirect indications for I/O failures */
>> +	__le64 cnt_corruption_errs; /* checksum error, bytenr error or
>> +				     * contents is illegal: this is an
>> +				     * indication that the block was damaged
>> +				     * during read or write, or written to
>> +				     * wrong location or read from wrong
>> +				     * location */
>> +	__le64 cnt_generation_errs; /* an indication that blocks have not
>> +				     * been written */
> 
> A few spare u64 would come handy in the future. Currently there are 5,
> so add like 7 or 11. We might be interested in collecting more types of
> stats, ore more fine-grained.
> 
> I see the comment above about enhancing the structure, but then you need
> to version this stucture. Let's say this kernel at version 3.5 add this
> structre as you propose it now, and kernel 3.6 adds another item
> 'cnt_exploded'.

The size of the item is part of the information that is stored on disk.
This is used to downgrade and upgrade the item, and to support
downgrading and upgrading the kernel.

In btrfs_init_device_stats(), where the items are read from disk while
mounting, if the read item is too small, the missing statistic members
at the end are filled with zero. If the item is larger than expected,
the additional values are ignored.
And in update_device_stat_item(), where the items are written to disk
with each commit (only if they have been modified), if the disk item is
too small, it is removed and replaced by a larger one. If the disk item
is larger than expected, the additional values are ignored and not modified.

The ioctl() works similar. In the user mode, the max number of statistic
members is set before the ioctl(). The kernel code uses this parameter
to determine, how much values to write and at the end replaces it with
the number of values written. Then the btrfs tool prints all the values
that it knows by name, but at most the number of values that the kernel
has set.

If I compare this solution to the one that just pads the item size to a
well known size, the disadvantage are the extra lines of code. The
advantage is that no limit at all is set.

Rereading the code, I recognized that I will need to send a v4. The
ioctl structure needs to be padded, or copy_to_user(..., sizeof(struct
...)) would overwrite memory when the kernel struct is grown and the
btrfs tool is old.


> Accessing the 3.6-created image with a 3.5 will be ok (older kernel will
> not touch the new items).
> 
> Accessing the 3.5-created image with a 3.6 will be problematic, as the
> kernel would try to access ->cnt_exploded .
> 
> So, either the 3.6 kernel needs to know not to touch the missing item
> (ie. via reading the struct version from somewhere, stored on disk).
> 
> Or, there are spare items, which are zeroed in versions that do not use
> them and naturally used otherwise, but when new kernel uses old image,
> it finds zeros (and will be safe).
> 
>> +} __attribute__ ((__packed__));
>> +

      reply	other threads:[~2012-05-21 15:01 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-16 16:50 [PATCH v3 0/3] Btrfs: add IO error device stats Stefan Behrens
2012-05-16 16:50 ` [PATCH v3 1/3] Btrfs: add device counters for detected IO and checksum errors Stefan Behrens
2012-05-16 16:50 ` [PATCH v3 2/3] Btrfs: add ioctl to get and reset the device stats Stefan Behrens
2012-05-16 16:50 ` [PATCH v3 3/3] Btrfs: read device stats on mount, write modified ones during commit Stefan Behrens
2012-05-17  1:52   ` Liu Bo
2012-05-17  8:49     ` Stefan Behrens
2012-05-18 11:57   ` David Sterba
2012-05-21 14:57     ` Stefan Behrens [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FBA57BC.9080407@giantdisaster.de \
    --to=sbehrens@giantdisaster.de \
    --cc=dave@jikos.cz \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).