public inbox for linux-mtd@lists.infradead.org
* Good stress test for UBIFS?
@ 2009-12-18  8:00 David Jander
  2009-12-28 10:40 ` Adrian Hunter
  0 siblings, 1 reply; 5+ messages in thread
From: David Jander @ 2009-12-18  8:00 UTC (permalink / raw)
  To: linux-mtd


Hi all,

What would be a good stress-test for the whole ubifs stack: ubifs, ubi, mtd, 
nand-flash-driver?

I have gone through this:
- Forced to use 2.6.24 which is provided by the chip manufacturer with some 
unknown version of mtd/ubi/ubifs, which is not the latest and not the one that 
originally shipped with 2.6.24. The git history is gone.
- Used this version for a time when suddenly some boards had a corrupt 
filesystem (unreadable /etc).
- Used mtd tests to check hardware and driver, and everything seems fine.
- Updated to latest ubi/ubifs for 2.6.24 from ubifs-v2.6.24.git

And now I need to accomplish two things:

1. Come up with a fairly reliable method to reproduce the corruption on 
the original version of ubi/ubifs.
2. Check that this problem indeed does not occur on the latest version, 
and if it does, post a bug report here.

For 1. I am looking for some kind of tool or method to stress-test ubi/ubifs 
preferably including also the nand-flash driver and hardware (you never know).

I thought of running something like bonnie++ while power-cycling the board at 
random times for a few days, or something similar, but since IMHO this must be 
a recurring issue for many here, I'd like to know what others have come up 
with, or whether there is a comprehensive stress-test tool.
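
To make the idea concrete, the power-cycle side could be scheduled roughly 
like this (just a sketch: the interval range is an arbitrary guess, and the 
actual power cut needs external hardware such as a controllable relay):

```shell
#!/bin/bash
# Sketch of randomized power-cycle timing (illustrative only; the relay
# step and the interval range are placeholders, not a real setup).
for cycle in 1 2 3; do
        # wait between 30 and 329 seconds before the next power cut
        delay=$(( (RANDOM % 300) + 30 ))
        echo "cycle $cycle: cutting power after ${delay}s"
        # sleep "$delay" && relay-toggle   # <- hardware-specific step
done
```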

Best regards,

-- 
David Jander
Protonic Holland.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Good stress test for UBIFS?
  2009-12-18  8:00 Good stress test for UBIFS? David Jander
@ 2009-12-28 10:40 ` Adrian Hunter
  2010-01-06 10:59   ` David Jander
  0 siblings, 1 reply; 5+ messages in thread
From: Adrian Hunter @ 2009-12-28 10:40 UTC (permalink / raw)
  To: David Jander; +Cc: linux-mtd

David Jander wrote:
> Hi all,
> 
> What would be a good stress-test for the whole ubifs stack: ubifs, ubi, mtd, 
> nand-flash-driver?
> 
> I have gone through this:
> - Forced to use 2.6.24 which is provided by the chip manufacturer with some 
> unknown version of mtd/ubi/ubifs, which is not the latest and not the one that 
> originally shipped with 2.6.24. The git history is gone.
> - Used this version for a time when suddenly some boards had a corrupt 
> filesystem (unreadable /etc).
> - Used mtd tests to check hardware and driver, and everything seems fine.
> - Updated to latest ubi/ubifs for 2.6.24 from ubifs-v2.6.24.git
> 
> And now I need to accomplish two things:
> 
> 1. Come up with a fairly reliable method to reproduce the corruption on 
> the original version of ubi/ubifs.
> 2. Check that this problem indeed does not occur on the latest version, 
> and if it does, post a bug report here.
> 
> For 1. I am looking for some kind of tool or method to stress-test ubi/ubifs 
> preferably including also the nand-flash driver and hardware (you never know).
> 
> I thought of running something like bonnie++ while power-cycling the board at 
> random times for a few days, or something similar, but since IMHO this must be 
> a recurring issue for many here, I'd like to know what others have come up 
> with, or whether there is a comprehensive stress-test tool.

Generally we test with debugging checks turned on because they will spot an
error the instant it happens.  On the other hand, you must also test the actual
configuration you will deploy.

There are two approaches.  We use both of them.  They are:

1. Set up a desktop machine with your kernel and test on nandsim.  This has the
advantage that it can do very many more operations than a small device.

You can simulate power-off-recovery by using UBIFS "failure mode".  Set UBIFS
debugging module parameter debug_tsts to 4.  There is a script I have used
for that below.

2. Run tests on the device.  There are tests in mtd-utils/tests/fs-tests but
LTP's fsstress is good for stressing the file system during power-off-recovery
testing.
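
For what it's worth, an on-device invocation might look something like this 
(the fsstress path, mount point, and option values are only examples, not 
fixed paths; run fsstress -h for the options your build supports):

```shell
#!/bin/bash
# Illustrative fsstress run against a mounted UBIFS volume.  The binary
# location and mount point are assumptions, not fixed paths.
FSSTRESS=${FSSTRESS:-/home/root/fsstress/fsstress}
MNT=${MNT:-/mnt/test_file_system}
if [ -x "$FSSTRESS" ]; then
        mkdir -p "$MNT/fsstress"
        # 5 processes, 1000 operations each, repeated for 10 loops
        "$FSSTRESS" -d "$MNT/fsstress" -p 5 -n 1000 -l 10
else
        echo "fsstress not found at $FSSTRESS"
fi
```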

Good luck!




#!/bin/bash

start()
{
        umount /mnt/test_file_system
        rmmod ubifs
        rmmod ubi
        rmmod nandsim
        modprobe nandsim second_id_byte=0x78 overridesize=9
        modprobe ubi mtd=0
        udevsettle
        ubimkvol /dev/ubi0 -Nvol -m
        udevsettle
        modprobe ubifs
        mount -t ubifs ubi0 /mnt/test_file_system
}

sleep_until_fail()
{
        while true; do
                sleep 1
        touch /mnt/test_file_system/ok > /dev/null 2> /dev/null || return
                rm /mnt/test_file_system/ok > /dev/null 2> /dev/null || return
        done
}

unmount()
{
        echo un-mounting
        umount /mnt/test_file_system && return
        sleep 1
        umount /mnt/test_file_system && return
        sleep 10
        umount /mnt/test_file_system && return
        sleep 30
        umount /mnt/test_file_system && return
        sleep 30
        umount /mnt/test_file_system && return
        pkill fsstress > /dev/null 2> /dev/null
        sleep 300
        umount /mnt/test_file_system && return
        echo Did not unmount
        exit 1
}

start

dmesg -n3

echo 4096 > /sys/module/ubifs/parameters/debug_msgs

mkdir -p /mnt/test_file_system/fsstress

rm -rf dmesg-*.txt

dmesg_count=0

while true; do
        #dmesg > dmesg-${dmesg_count}.txt
        dmesg_count=`expr $dmesg_count + 1`
        echo fsstress $dmesg_count
        /home/root/fsstress/fsstress -d /mnt/test_file_system/fsstress -p 5 -l 0 &
        sleep 28
        echo failure mode on
        echo 4 > /sys/module/ubifs/parameters/debug_tsts
        sleep_until_fail
        echo failed
        echo 0 > /sys/module/ubifs/parameters/debug_tsts
        pkill fsstress > /dev/null 2> /dev/null
        sleep 1
        unmount
        echo mounting
        mount -t ubifs ubi0 /mnt/test_file_system || exit 1
        echo removing files
        rm -rf /mnt/test_file_system/* 2> /dev/null || exit 1
done

* Re: Good stress test for UBIFS?
  2009-12-28 10:40 ` Adrian Hunter
@ 2010-01-06 10:59   ` David Jander
  2010-01-07  7:17     ` Artem Bityutskiy
  0 siblings, 1 reply; 5+ messages in thread
From: David Jander @ 2010-01-06 10:59 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: linux-mtd

On Monday 28 December 2009 11:40:43 am Adrian Hunter wrote:
>[...]
> Generally we test with debugging checks turned on because they will spot an
> error the instant it happens.  On the other hand, you must also test the
>  actual configuration you will deploy.
> 
> There are two approaches.  We use both of them.  They are:
> 
> 1. Set up a desktop machine with your kernel and test on nandsim.  This has
>  the advantage that it can do very many more operations than a small
>  device.
> 
> You can simulate power-off-recovery by using UBIFS "failure mode".  Set
>  UBIFS debugging module parameter debug_tsts to 4.  There is a script I
>  have used for that below.

Yes, but I did not consider this option, because it is a completely different 
processor architecture (little-endian vs. big-endian), also it won't test the 
hardware-driver, nor the nand-chip and interface which can potentially also be 
(part of) the problem. Here I am trying to reproduce a situation that has 
already occurred a few times in "real life", and I need to be sure it won't 
happen ever again with the latest ubi/ubifs.

> 2. Run tests on the device.  There are tests in mtd-utils/tests/fs-tests
>  but LTP's fsstress is good for stressing the file system during
>  power-off-recovery testing.

Thanks a lot. I will try fsstress.
I had already written my own test script, mimicking some suspicious scenarios, 
in the hope it would reproduce what had happened on three of our boards 
(corrupt fs), and eventually succeeded, but it took several days of running.
Now I am re-running the test with latest UBI/UBIFS to see if it's gone.
Hopefully fsstress will yield results more quickly.

Best regards,

-- 
David Jander
Protonic Holland.

* Re: Good stress test for UBIFS?
  2010-01-06 10:59   ` David Jander
@ 2010-01-07  7:17     ` Artem Bityutskiy
  2010-01-07 10:38       ` david
  0 siblings, 1 reply; 5+ messages in thread
From: Artem Bityutskiy @ 2010-01-07  7:17 UTC (permalink / raw)
  To: David Jander; +Cc: linux-mtd, Adrian Hunter

On Wed, 2010-01-06 at 11:59 +0100, David Jander wrote:
> On Monday 28 December 2009 11:40:43 am Adrian Hunter wrote:
> >[...]
> > Generally we test with debugging checks turned on because they will spot an
> > error the instant it happens.  On the other hand, you must also test the
> >  actual configuration you will deploy.
> > 
> > There are two approaches.  We use both of them.  They are:
> > 
> > 1. Set up a desktop machine with your kernel and test on nandsim.  This has
> >  the advantage that it can do very many more operations than a small
> >  device.
> > 
> > You can simulate power-off-recovery by using UBIFS "failure mode".  Set
> >  UBIFS debugging module parameter debug_tsts to 4.  There is a script I
> >  have used for that below.
> 
> Yes, but I did not consider this option, because it is a completely different 
> processor architecture (little-endian vs. big-endian), also it won't test the 
> hardware-driver, nor the nand-chip and interface which can potentially also be 
> (part of) the problem. Here I am trying to reproduce a situation that has 
> already occurred a few times in "real life", and I need to be sure it won't 
> happen ever again with the latest ubi/ubifs.
> 
> > 2. Run tests on the device.  There are tests in mtd-utils/tests/fs-tests
> >  but LTP's fsstress is good for stressing the file system during
> >  power-off-recovery testing.
> 
> Thanks a lot. I will try fsstress.
> I had already written my own test script, mimicking some suspicious scenarios, 
> in the hope it would reproduce what had happened on three of our boards 
> (corrupt fs), and eventually succeeded, but it took several days of running.
> Now I am re-running the test with latest UBI/UBIFS to see if it's gone.
> Hopefully fsstress will yield results more quickly.

Also, take a look at the MTD tests:

http://www.linux-mtd.infradead.org/doc/general.html#L_mtd_tests

E.g., by running the torture test for a week we once found a very rare
and subtle problem in our OneNAND driver related to the DMA transfers.
Also, it is nice to run this test for a few months and see how your NAND
HW behaves when you try to wear a few of its blocks out. And if you
manage to make real faulty blocks, you can test how UBI handles them.
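
If it helps, a minimal wrapper for such a long run could look like this 
(the module name and the dev= parameter are what I recall from the mtd-tests 
docs; treat the details as assumptions and check drivers/mtd/tests in your 
tree):

```shell
#!/bin/bash
# Sketch of one MTD torture-test cycle (module name/params assumed;
# needs root and a real or simulated MTD device, e.g. nandsim as mtd0).
DEV=${DEV:-0}
if modinfo mtd_torturetest > /dev/null 2>&1; then
        modprobe mtd_torturetest dev="$DEV"
        dmesg | tail -n 20        # results are reported via the kernel log
        rmmod mtd_torturetest
else
        echo "mtd_torturetest not available on this kernel"
fi
```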

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

* Re: Good stress test for UBIFS?
  2010-01-07  7:17     ` Artem Bityutskiy
@ 2010-01-07 10:38       ` david
  0 siblings, 0 replies; 5+ messages in thread
From: david @ 2010-01-07 10:38 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd, David Jander, Adrian Hunter

> On Wed, 2010-01-06 at 11:59 +0100, David Jander wrote:
>> On Monday 28 December 2009 11:40:43 am Adrian Hunter wrote:
>> >[...]
>> > Generally we test with debugging checks turned on because they will
>> > spot an error the instant it happens.  On the other hand, you must
>> > also test the actual configuration you will deploy.
>> >
>> > There are two approaches.  We use both of them.  They are:
>> >
>> > 1. Set up a desktop machine with your kernel and test on nandsim.
>> > This has the advantage that it can do very many more operations than
>> > a small device.
>> >
>> > You can simulate power-off-recovery by using UBIFS "failure mode".
>> > Set UBIFS debugging module parameter debug_tsts to 4.  There is a
>> > script I have used for that below.
>>
>> Yes, but I did not consider this option, because it is a completely
>> different processor architecture (little-endian vs. big-endian), also
>> it won't test the hardware-driver, nor the nand-chip and interface
>> which can potentially also be (part of) the problem. Here I am trying
>> to reproduce a situation that has already occurred a few times in
>> "real life", and I need to be sure it won't happen ever again with the
>> latest ubi/ubifs.
>>
>> > 2. Run tests on the device.  There are tests in
>> > mtd-utils/tests/fs-tests but LTP's fsstress is good for stressing
>> > the file system during power-off-recovery testing.
>>
>> Thanks a lot. I will try fsstress.
>> I had already written my own test script, mimicking some suspicious
>> scenarios, in the hope it would reproduce what had happened on three
>> of our boards (corrupt fs), and eventually succeeded, but it took
>> several days of running.
>> Now I am re-running the test with latest UBI/UBIFS to see if it's gone.
>> Hopefully fsstress will yield results more quickly.
>
> Also, take a look at the MTD tests:
>
> http://www.linux-mtd.infradead.org/doc/general.html#L_mtd_tests
>
> E.g., by running the torture test for a week we once found a very rare
> and subtle problem in our OneNAND driver related to the DMA transfers.
> Also, it is nice to run this test for a few months and see how your NAND
> HW behaves when you try to wear a few of its blocks out. And if you
> manage to make real faulty blocks, you can test how UBI handles them.

Yes, I have done that. I saw blocks go bad, and when ubinizing, they were
marked by ubi immediately. Very cool indeed!

Best regards,

-- 
David Jander
