linux-ext4.vger.kernel.org archive mirror
* Re: CF Card wear optimalisation for ext4
       [not found] <5435661D.2040905@powercraft.nl>
@ 2014-10-10 19:02 ` Andreas Dilger
  2014-10-11 23:19   ` Theodore Ts'o
  0 siblings, 1 reply; 4+ messages in thread
From: Andreas Dilger @ 2014-10-10 19:02 UTC (permalink / raw)
  To: Jelle de Jong; +Cc: EXT3 Users, Ext4 Developers List

On Oct 8, 2014, at 10:28 AM, Jelle de Jong <jelledejong@powercraft.nl> wrote:
> Hello everyone,
> 
> I have been using CF cards with ext file systems on ALIX boards for
> more than 7 years now without any major problems.
>
> Last year I put 30 other systems into production with ext4, and the CF
> cards have been dropping out pretty fast. It may have been a bad batch,
> but I do want to look into it. I don't think the devices write a lot of
> I/O (is there a tool that can give me some useful numbers for, say, 24
> hours or a week? iotop, atop, and sysstat don't seem suited for
> long-term I/O write monitoring, but maybe I am misusing them and could
> use some help here).

You can see in the ext4 superblock the amount of data that has been
written to a filesystem over its lifetime:

# dumpe2fs -h /dev/vg_mookie/lv_home
dumpe2fs 1.42.7.wc2 (07-Nov-2013)
Filesystem volume name:   home
Last mounted on:          /home
:
:
Lifetime writes:          27 GB
:
:

Note that this number isn't wholly accurate; treat it as a guideline.
IIRC it is not updated on disk all the time, so some writes may go
uncounted.

You can also get this information from /sys/fs/ext4 including data
just for the current mount:

# grep . /sys/fs/ext4/*/*_write_kbytes 
/sys/fs/ext4/dm-0/lifetime_write_kbytes:77632360
/sys/fs/ext4/dm-0/session_write_kbytes:7124948
/sys/fs/ext4/dm-19/lifetime_write_kbytes:28081448
/sys/fs/ext4/dm-19/session_write_kbytes:16520
/sys/fs/ext4/dm-2/lifetime_write_kbytes:60847858
/sys/fs/ext4/dm-2/session_write_kbytes:7739388
/sys/fs/ext4/dm-7/lifetime_write_kbytes:22385952
/sys/fs/ext4/dm-7/session_write_kbytes:6379728
/sys/fs/ext4/sda1/lifetime_write_kbytes:835020
/sys/fs/ext4/sda1/session_write_kbytes:60848
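
To get the per-day or per-week figure asked about above, the same
counters can be sampled twice and subtracted. A minimal sketch, assuming
Python 3 is available on the system; the function names are made up for
this example, and only the /sys/fs/ext4/*/lifetime_write_kbytes path
comes from the listing above:

```python
import glob
import os
import time


def read_write_counters(pattern="/sys/fs/ext4/*/lifetime_write_kbytes"):
    """Return {device_name: kbytes} for every file matching the pattern."""
    counters = {}
    for path in glob.glob(pattern):
        # The directory name (e.g. "sda1" or "dm-0") identifies the filesystem.
        device = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            counters[device] = int(f.read().strip())
    return counters


def write_deltas(before, after):
    """kbytes written per device between two samples (devices in both)."""
    return {dev: after[dev] - before[dev] for dev in after if dev in before}


def report(interval_s):
    """Sample, sleep for interval_s seconds, sample again, print the deltas."""
    first = read_write_counters()
    time.sleep(interval_s)
    second = read_write_counters()
    for dev, kbytes in sorted(write_deltas(first, second).items()):
        print(f"{dev}: {kbytes} kbytes written in {interval_s} s")
```

Calling report(24 * 3600) blocks for a day and then prints one line per
filesystem. The counters reflect what the kernel sent to the device and
say nothing about write amplification inside the card.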

> I mount root with the following options:
> 
> /dev/disk/by-uuid/09a04c01-64c6-4600-9e22-525667bda3e3 on / type ext4
> (rw,noatime,user_xattr,barrier=1,data=ordered)
> 
> # dumpe2fs /dev/sda1
> http://paste.debian.net/hidden/e3f81f11/
> 
> Are there kernel options to avoid synchronous disk writes? As
> suggested here: http://www.pcengines.ch/cfwear.htm

If you increase the journal commit interval (e.g. to 30s with the
commit=30 mount option; the default is 5s) you can reduce the number of
times a block needs to be written to the journal.  The drawback is that
you also increase the amount of un-sync'd metadata that would be lost in
case of a crash.  This usually means the data would also be lost, unless
you are using a database-like workload that overwrites the same files
continuously.
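
As a rough illustration of the effect, the arithmetic below compares the
ceiling on timer-driven journal commits per day for two intervals. It is
only a sketch: the function name is invented, 5 seconds is assumed as
the baseline because it is the ext4 default, and commits forced by
fsync(2) or by a full transaction come on top of this ceiling.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_timer_commits_per_day(commit_interval_s):
    """Upper bound on commits triggered by the periodic commit timer alone."""
    return SECONDS_PER_DAY // commit_interval_s


for interval_s in (5, 30):
    ceiling = max_timer_commits_per_day(interval_s)
    print(f"commit={interval_s}: at most {ceiling} timer-driven commits per day")
```

With 5 seconds the ceiling is 17280 commits per day; with 30 seconds it
drops to 2880. An idle filesystem commits far less often than either
figure, since no commit is needed when nothing has changed.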

> Is there a list of other kernel options I can optimise to limit any cf
> wear? The devices don't use
> 
> Kind regards
> 
> Jelle de Jong
> 
> 
> _______________________________________________
> Ext3-users mailing list
> Ext3-users@redhat.com
> https://www.redhat.com/mailman/listinfo/ext3-users


Cheers, Andreas

* Re: CF Card wear optimalisation for ext4
  2014-10-10 19:02 ` CF Card wear optimalisation for ext4 Andreas Dilger
@ 2014-10-11 23:19   ` Theodore Ts'o
  2014-10-12 14:07     ` power loss protection Ivan Baldo
  0 siblings, 1 reply; 4+ messages in thread
From: Theodore Ts'o @ 2014-10-11 23:19 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Jelle de Jong, EXT3 Users, Ext4 Developers List

Something else that you might want to do is count the number of
journal commits that are taking place, via a command like this:

perf stat -e jbd2:jbd2_start_commit -a sleep 3600

This will count the number of jbd2 commits that are executed in 3600
seconds, i.e., an hour.
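
One way to read the resulting count, as a rough sketch: compare it with
what the periodic commit timer alone could have produced. The function
name and dictionary keys are invented for illustration, and the commit
interval is assumed to be the ext4 default of 5 seconds unless another
value is passed in.

```python
def commit_stats(commit_count, duration_s, commit_interval_s=5):
    """Summarise a jbd2_start_commit count measured over duration_s seconds."""
    # The periodic timer alone cannot fire more often than this.
    timer_ceiling = duration_s // commit_interval_s
    if commit_count:
        avg_gap = duration_s / commit_count
    else:
        avg_gap = None  # no commits at all during the measurement
    return {
        "avg_seconds_between_commits": avg_gap,
        "timer_ceiling": timer_ceiling,
        "commits_above_ceiling": max(0, commit_count - timer_ceiling),
    }
```

For example, commit_stats(7200, 3600) (a made-up count, not a
measurement) reports an average of 0.5 s between commits and 6480
commits beyond what a 5-second timer could account for, which would
point at fsync(2) calls or something similar forcing commits.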

If you are running some workload which is constantly calling fsync(2),
that will be forcing journal commits, and those turn into cache flush
commands that force all state to stable storage.  Now, if you are
using CF cards that aren't guaranteed to have power-loss protection
(hint: even most consumer grade SSD's do not have power loss
protection --- you have to pay $$$ for enterprise-grade SLC SSD's to
have power loss protection --- and I'm guessing most CF cards are so
cheap that they won't make guarantees that all of their flash metadata
are saved to stable store on a power loss event) the fact that you are
constantly using fsync(2) may not be providing you with the protection
you want after a power loss event.

Which might not be a problem if you have a handset with a
non-removable eMMC device and a non-removable battery that can't fly
out when you drop the phone, but for devices which can easily
have an unplanned power failure, it may very well be the case that
you're going to be badly burned across a power fail event anyway.

So the next question I would ask you is whether you care about
unplanned power failures.  If so, you probably want to test your CF
cards to make sure they actually will do the right thing across a
power failure --- and if they don't, you may need to replace your CF
card provider.

If you don't care (because you don't have a removable battery, and the
CF card is permanently sealed inside your device, for example), then
you might want to consider disabling barriers (e.g. by mounting with
barrier=0) so you're no longer forcing synchronous cache flush commands
to be sent to your CF card.
This trades off power failure safety versus increased performance and
decreased card wear --- but if you don't need power failure safety,
then it might be a good tradeoff.

And if you *do* need power fail protection, then it's a good thing to
test whether your hardware will actually provide it, so you don't find
out the hard way that you're paying the cost of decreased performance
and increased card wear, but you didn't get power fail protection
*anyway* because of hardware limitations.

Cheers,

					- Ted

* power loss protection
  2014-10-11 23:19   ` Theodore Ts'o
@ 2014-10-12 14:07     ` Ivan Baldo
  2014-10-12 17:53       ` squadra
  0 siblings, 1 reply; 4+ messages in thread
From: Ivan Baldo @ 2014-10-12 14:07 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, EXT3 Users

     Hello.

On 11/10/14 21:19, Theodore Ts'o wrote:
> If you are running some workload which is constantly calling fsync(2),
> that will be forcing journal commits, and those turn into cache flush
> commands that force all state to stable storage.  Now, if you are
> using CF cards that aren't guaranteed to have power-loss protection
> (hint: even most consumer grade SSD's do not have power loss
> protection --- you have to pay $$$ for enterprise-grade SLC SSD's to
> have power loss protection --- and I'm guessing most CF cards are so
> cheap that they won't make guarantees that all of their flash metadata
> are saved to stable store on a power loss event) the fact that you are
> constantly using fsync(2) may not be providing you with the protection
> you want after a power loss event.
>
>
     This got me worried!
     How can we test if a device really stores all the data safely after 
a barrier and sudden power loss?
     Is there a tool for that?
     I am thinking something along the lines of a tool that does writes 
with some barriers in between and then I unplug the device and run the 
same tool but in a "check mode" that tells me if the requested data 
before the barrier is really there.
     Something sysadmin friendly or maybe even user friendly, but not 
too hard to use.
     Thanks for your insight!

-- 
Ivan Baldo - ibaldo@adinet.com.uy - http://ibaldo.codigolibre.net/
 From Montevideo, Uruguay, at the south of South America.
Freelance programmer and GNU/Linux system administrator, hire me!
Alternatives: ibaldo@codigolibre.net - http://go.to/ibaldo

* Re: power loss protection
  2014-10-12 14:07     ` power loss protection Ivan Baldo
@ 2014-10-12 17:53       ` squadra
  0 siblings, 0 replies; 4+ messages in thread
From: squadra @ 2014-10-12 17:53 UTC (permalink / raw)
  To: Ivan Baldo; +Cc: Theodore Ts'o, Ext4 Developers List, EXT3 Users

dunno about any special tools, but misusing a mysql database could be
a good check for this. unplug / reset your device while inserts into
the db are ongoing (don't forget to use innodb for the tables), boot it
up again and take a look at the mysql log. there's a good chance that
innodb gets wrecked... sure, this is not perfect, but it could be an
impressive test if it ends like i think.

make sure your mysql instance is configured to be "safe":

http://dev.mysql.com/doc/refman/5.1/en/innodb-parameters.html#sysvar_innodb_flush_method
http://dev.mysql.com/doc/refman/5.1/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit

and enable binlogs + sync binlogs (sync_binlog=1)

or in other words: make it as slow as possible :p
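
The write-then-check tool Ivan asks about could be sketched roughly as
follows. This is an illustration under stated assumptions, not an
existing tool: the record layout, file path and function names are all
invented, Python 3 on Linux is assumed, and it only checks data appended
to one file through the filesystem, so a card that damages older,
unrelated blocks on power loss would go unnoticed.

```python
import os
import struct
import zlib

# Hypothetical on-disk record: 8-byte sequence number + 4-byte CRC32 of it.
RECORD = struct.Struct("<QI")


def encode_record(seq):
    body = struct.pack("<Q", seq)
    return body + struct.pack("<I", zlib.crc32(body))


def write_records(path, count):
    """Write `count` records, fsync()ing each one before reporting it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # Also sync the directory so the new file's entry itself is durable.
        dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
        for seq in range(1, count + 1):
            os.write(fd, encode_record(seq))
            os.fsync(fd)  # should return only once the record is on stable storage
            print(seq, flush=True)  # acknowledgement; capture it on another machine
    finally:
        os.close(fd)


def last_durable_record(path):
    """Highest sequence number in the unbroken run of valid records."""
    last = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break  # end of file or a torn final record
            seq, crc = RECORD.unpack(chunk)
            if seq != last + 1 or crc != zlib.crc32(chunk[:8]):
                break  # gap or corruption: stop counting here
            last = seq
    return last


def check(path, last_acknowledged):
    """True if every record acknowledged before the power cut survived."""
    return last_durable_record(path) >= last_acknowledged
```

The idea would be to run write_records() on the device under test with
its output going to another machine (over ssh, say), cut the power
mid-run, reboot, and call check() with the last number that machine saw.
A False result means a record was lost after fsync(2) had returned.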


-- 
Sent from the Delta quadrant using Borg technology!