* How does newbie find bugs in ext4?
From: JunChao Sun @ 2022-09-12 13:09 UTC
To: tytso; +Cc: linux-ext4

Hi Ted.

I am new to ext4; may I ask several questions about it?

I am very interested in the ext4 file system and have been reading and
debugging the ext4 source code for 2-3 months (just the basic
open(create)/close and read/write paths, not the advanced features).
I want to contribute to ext4, but that seems hard to do only by
reading and debugging on my own; I can't even find bugs. I have sent
only two patches so far.

The reason I cannot find bugs in ext4 may be that I cannot understand
the code deeply enough just by reading and debugging it... I also see
that many contributors fix bugs which they found at work, but my
company will not give me the opportunity to trace a bug in the kernel.
They just tell users "this is a bug in linux", and I cannot reproduce
it by myself...

Could you please provide some suggestions for people like me who want
to contribute to ext4? Any suggestions about how to start contributing
to ext4 step by step? I mean real bug fixes rather than documentation
corrections (those are also very important, and one of the patches I
have sent is a documentation correction, but I want to learn ext4 more
deeply).

I know that there is the xfstests project, which is used for testing
ext4/xfs, but I think the ext4 developers will pass all test cases
before releasing a new version, so is it necessary to retest ext4
using xfstests?

Best regards.

* Re: How does newbie find bugs in ext4?
From: Theodore Ts'o @ 2022-09-12 16:33 UTC
To: JunChao Sun; +Cc: linux-ext4

Hi,

So first of all, I would recommend that you learn how to use
kvm-xfstests. The reason for this is that kvm-xfstests is very useful
for testing any changes that you make. The same test appliance can be
used for testing file systems for Android and using Google Compute
Engine VMs (which is one of the best ways to use it). Please take a
look at these references:

    https://thunk.org/gce-xfstests
    https://github.com/tytso/xfstests-bld/blob/master/Documentation/what-is-xfstests.md
    https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-quickstart.md
    https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-xfstests.md

In addition to serving as a quick "playground" where you can test
patches, this can also be a good way to (for example) test syzbot
reports.

Another thing which you could potentially do is manual backporting of
ext4 patches which didn't automatically get applied because the patch
required some adjustments (or required backporting some additional
commits, etc.) to fix a particular problem. So for example, you could
try running xfstests using the latest 5.10.y or 5.15.y stable kernels,
since as we fix bugs, we often add tests to check for regressions.
For example, if you look at the header of the test ext4/058, you'll
find:

    # Set 256 blocks in a block group, then inject I/O pressure,
    # it will trigger off kernel BUG in ext4_mb_mark_diskspace_used
    #
    # Regression test for commit
    # a08f789d2ab5 ext4: fix bug_on ext4_mb_use_inode_pa

So if you find out that a particular test fails on an LTS kernel
(e.g., 5.15.y or 5.10.y), but it passes on upstream, it could be that
a missing commit needs to be backported.
We don't currently have anyone doing this on a regular basis for the
LTS kernels (I may do this once every few months, when I have time),
so this could be a good way for you to contribute and also learn more
about ext4 as you go.

Finally, I'll note that although I do run xfstests regularly, and will
reject patches that cause regressions, there are still some tests that
fail. For example, here is my latest test report:

TESTRUNID: ltm-20220912073217
KERNEL:    kernel 6.0.0-rc4-xfstests #760 SMP PREEMPT_DYNAMIC Mon Sep 12 07:23:13 EDT 2022 x86_64
CMDLINE:   full --kernel gs://gce-xfstests/kernel.deb
CPUS:      4
MEM:       7680

ext4/4k: 515 tests, 27 skipped, 4093 seconds
ext4/1k: 511 tests, 2 failures, 40 skipped, 5095 seconds
  Flaky: generic/475: 40% (2/5)   generic/476: 40% (2/5)
ext4/ext3: 507 tests, 115 skipped, 3514 seconds
ext4/encrypt: 493 tests, 3 failures, 129 skipped, 2583 seconds
  Failures: generic/681 generic/682 generic/691
ext4/nojournal: 510 tests, 4 failures, 94 skipped, 3610 seconds
  Failures: ext4/301 ext4/304 generic/455
  Flaky: generic/077: 40% (2/5)
ext4/ext3conv: 512 tests, 27 skipped, 3650 seconds
ext4/adv: 512 tests, 3 failures, 34 skipped, 3860 seconds
  Failures: generic/475 generic/477
  Flaky: generic/455: 80% (4/5)
ext4/dioread_nolock: 513 tests, 27 skipped, 4235 seconds
ext4/data_journal: 511 tests, 2 failures, 87 skipped, 3647 seconds
  Failures: generic/231 generic/455
ext4/bigalloc: 489 tests, 2 failures, 34 skipped, 3904 seconds
  Failures: generic/455 shared/298
ext4/bigalloc_1k: 488 tests, 2 failures, 51 skipped, 3826 seconds
  Failures: generic/455 shared/298
ext4/dax: 502 tests, 127 skipped, 2520 seconds
Totals: 6135 tests, 792 skipped, 80 failures, 0 errors, 44288s

(This was done by using gce-xfstests, which is a cloud VM variant of
kvm-xfstests. The equivalent would take roughly 12 to 24 hours using
kvm-xfstests; gce-xfstests gets run on multiple VMs, so the wall clock
time needed is perhaps two to two and a half hours.)

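As a rough aid for reading a report like the one above, the per-config
summary lines can be parsed mechanically. The sketch below is not part
of xfstests or gce-xfstests: the function name parse_summary and the
regular expression are assumptions based only on the line format shown
in this report, and the sample input is an excerpt copied from it. It
also converts the reported total of 44288 seconds into hours, which
comes out near the low end of the "12 to 24 hours" serial estimate.

```python
import re

# Excerpt of the per-config summary lines from the report in this thread.
REPORT = """\
ext4/4k: 515 tests, 27 skipped, 4093 seconds
ext4/1k: 511 tests, 2 failures, 40 skipped, 5095 seconds
ext4/encrypt: 493 tests, 3 failures, 129 skipped, 2583 seconds
ext4/dax: 502 tests, 127 skipped, 2520 seconds
"""

# Assumed line format: "<config>: N tests, [N failures, ][N skipped, ]N seconds"
LINE_RE = re.compile(
    r"^(?P<config>\S+): (?P<tests>\d+) tests, "
    r"(?:(?P<failures>\d+) failures, )?"
    r"(?:(?P<skipped>\d+) skipped, )?"
    r"(?P<seconds>\d+) seconds$"
)


def parse_summary(text):
    """Map each config name to its counts; lines that do not match are ignored."""
    results = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        results[m["config"]] = {
            "tests": int(m["tests"]),
            "failures": int(m["failures"] or 0),  # field is omitted when zero
            "skipped": int(m["skipped"] or 0),
            "seconds": int(m["seconds"]),
        }
    return results


summary = parse_summary(REPORT)

# Configs in the excerpt that ran without any failures.
clean = sorted(name for name, r in summary.items() if r["failures"] == 0)

# Total run time from the "Totals" line of the report, converted to hours.
TOTAL_SECONDS = 44288
serial_hours = TOTAL_SECONDS / 3600

print("failure-free configs:", clean)
print("serial run time: about %.1f hours" % serial_hours)
```

The "Failures:" and "Flaky:" detail lines are deliberately skipped here;
a fuller parser would need to attach them to the preceding config line.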
In general, I try very hard to make sure that ext4/4k (ext4 with the
default 4k block size) is free of failures when running the xfstests
"auto" group. However, you'll see that there are other configs where
there are failures, some of which have been around for a while. The
challenge is that these are often bugs that more senior ext4
developers have tried looking at for, say, an hour or two, and then
said, "I have higher priority fires to fight". So these might not be
the best test failures to ask an ext4 newbie to debug. That being
said, if you don't mind a bit (or a lot) of frustration, it could be
that you might be able to root cause some of these failed tests.

(But testing the LTS kernels might be a better place to start.)

Cheers,

					- Ted
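
The LTS workflow suggested in the message above (find tests that fail
on an LTS kernel but pass upstream, then look in the test header for
the commit being regression-tested) can be sketched roughly as follows.
This is only an illustration: the helper names backport_candidates and
regression_commit are hypothetical, the two failure lists are made-up
sample data rather than real results, and the header format is assumed
to match the ext4/058 excerpt quoted in the thread.

```python
import re

# Header of test ext4/058 as quoted in this thread.
HEADER = """\
# Set 256 blocks in a block group, then inject I/O pressure,
# it will trigger off kernel BUG in ext4_mb_mark_diskspace_used
#
# Regression test for commit
# a08f789d2ab5 ext4: fix bug_on ext4_mb_use_inode_pa
"""

# Assumed layout: a "Regression test for commit" comment line followed by
# a comment line holding an abbreviated commit hash and the subject.
COMMIT_RE = re.compile(
    r"^#\s*Regression test for commit\s*\n#\s*([0-9a-f]{7,40})\s+(.+)$",
    re.MULTILINE,
)


def regression_commit(header):
    """Return (commit_id, subject) named in a test header, or None."""
    m = COMMIT_RE.search(header)
    return (m.group(1), m.group(2)) if m else None


def backport_candidates(lts_failures, upstream_failures):
    """Tests that fail on the LTS kernel but not on upstream."""
    return sorted(set(lts_failures) - set(upstream_failures))


# Hypothetical sample data, NOT real test results.
lts = ["ext4/058", "generic/455"]
upstream = ["generic/455"]

for test in backport_candidates(lts, upstream):
    print(test, "fails only on LTS; header names commit:", regression_commit(HEADER))
```

A test that fails only on the LTS kernel is a hint, not proof, that the
named commit is missing there; checking the stable tree's history for
that commit is still needed before preparing a backport.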

* Re: How does newbie find bugs in ext4?
From: JunChao Sun @ 2022-09-13 12:57 UTC
To: Theodore Ts'o; +Cc: linux-ext4

Thanks a lot for your suggestions and patience. It is great guidance
for an ext4 newbie!

On Tue, Sep 13, 2022 at 12:33 AM Theodore Ts'o <tytso@mit.edu> wrote:
> [...]