From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Sender: Dmitry Monakhov
From: Dmitry Monakhov
Subject: Re: [PATCH] core: Actually EIO is a fatal error
In-Reply-To: <505C4EB1.4090800@kernel.dk>
References: <1348225456-21811-1-git-send-email-dmonakhov@openvz.org>
 <505C4EB1.4090800@kernel.dk>
Date: Fri, 21 Sep 2012 15:42:51 +0400
Message-ID: <87haqry538.fsf@openvz.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
To: Jens Axboe
Cc: fio@vger.kernel.org
List-ID:

On Fri, 21 Sep 2012 13:25:37 +0200, Jens Axboe wrote:
> On 09/21/2012 01:04 PM, Dmitry Monakhov wrote:
> > As soon as i understand this is just a mistype.
>
> It's not a typo. By that logic, EILSEQ is fatal too, since it is a
> verification failure of read data (so might as well have been an EIO).
> Fatal, in this context, means errors that fio can recover from and
> continue doing work.
Ohh, I meant to say that both errors are fatal, but the function is called
td_NON_fatal_error, and it returns true in case of EIO or EILSEQ. As a
result the continue_on_error logic is broken, because of this check at
io_u.c:1440:

	if (icd->error && td_non_fatal_error(icd->error) &&
	    (td->o.continue_on_error & td_error_type(io_u->ddir, icd->error))) {
		/*
		 * If there is a non_fatal error, then add to the error count
		 * and clear all the errors.
		 */
		update_error_count(td, icd->error);
		td_clear_error(td);
		icd->error = 0;
		io_u->error = 0;
	}

That's why I've inverted the result.
FYI: right after I changed this, my test, which continuously hits ENOSPC,
went forward and provoked a panic :)

WARNING: at lib/list_debug.c:62 __list_del_entry+0x1ee/0x250()
Hardware name:
list_del corruption.
next->prev should be ffff88022d5c1a30, but was ffff880231f3e558
Modules linked in: ext4 jbd2 cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd ext3 jbd mbcache sd_mod crc_t10dif aesni_intel ablk_helper cryptd aes_x86_64 aes_generic ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
Pid: 241, comm: kworker/u:3 Not tainted 3.6.0-rc1+ #62
Call Trace:
 [] warn_slowpath_common+0xc3/0xf0
 [] warn_slowpath_fmt+0x46/0x50
 [] __list_del_entry+0x1ee/0x250
 [] move_linked_works+0x4e/0xd0
 [] cwq_activate_first_delayed+0xf0/0x120
 [] ? process_one_work+0x619/0x770
 [] cwq_dec_nr_in_flight+0xa7/0x160
 [] ? process_one_work+0x619/0x770
 [] process_one_work+0x6c9/0x770
 [] ? process_one_work+0x341/0x770
 [] ? put_io_page+0x60/0x60 [ext4]
 [] worker_thread+0x1cc/0x330
 [] ? manage_workers+0x140/0x140
 [] kthread+0xc9/0xe0
 [] kernel_thread_helper+0x4/0x10
 [] ? retint_restore_args+0x13/0x13
 [] ? __init_kthread_worker+0x70/0x70
 [] ? gs_change+0x13/0x13
---[ end trace abc6d2e3c8581c4a ]---
------------[ cut here ]------------
WARNING: at lib/list_debug.c:33 __list_add+0xdc/0x180()
Hardware name:
list_add corruption. prev->next should be next (ffff880229a1e260), but was ffff880231f3e558. (prev=ffff880231f3e558).
Modules linked in: ext4 jbd2 cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd ext3 jbd mbcache sd_mod crc_t10dif aesni_intel ablk_helper cryptd aes_x86_64 aes_generic ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
Pid: 0, comm: swapper/3 Tainted: G W 3.6.0-rc1+ #62
Call Trace:
 [] warn_slowpath_common+0xc3/0xf0
 [] warn_slowpath_fmt+0x46/0x50
 [] ? __spin_lock_debug+0xae/0x110
 [] __list_add+0xdc/0x180
 [] insert_work+0x80/0xd0
 [] __queue_work+0x4d6/0x5a0
 [] ? ext4_add_complete_io+0x54/0xc0 [ext4]
 [] queue_work_on+0x32/0x40
 [] queue_work+0x38/0x50
 [] ext4_add_complete_io+0x84/0xc0 [ext4]
 [] ? _raw_spin_unlock_irqrestore+0x65/0x90
 [] ext4_end_io_dio+0xdd/0xf0 [ext4]
 [] dio_complete+0x125/0x1a0
 [] dio_bio_end_aio+0xaa/0x100
 [] ? mempool_free_slab+0x17/0x20
 [] bio_endio+0x76/0x80
 [] dec_pending+0x279/0x340 [dm_mod]
 [] clone_endio+0x12f/0x150 [dm_mod]
 [] bio_endio+0x76/0x80
 [] req_bio_endio+0x15c/0x180
 [] blk_update_request+0x216/0x630
 [] blk_update_bidi_request+0x35/0xf0
 [] blk_end_bidi_request+0x2c/0x90
 [] blk_end_request+0x10/0x20
 [] scsi_end_request+0x40/0xf0
 [] scsi_io_completion+0x32c/0x850
 [] scsi_finish_command+0x1bb/0x1e0
 [] scsi_softirq_done+0x158/0x1d0
 [] blk_done_softirq+0x8c/0xa0
 [] __do_softirq+0x1ba/0x3e0
 [] ? _raw_spin_unlock+0x2b/0x50
 [] call_softirq+0x1c/0x30
 [] do_softirq+0x94/0x1d0
 [] irq_exit+0x7a/0x140
 [] do_IRQ+0xd5/0x100
 [] common_interrupt+0x6f/0x6f
 [] ? intel_idle+0x19c/0x1f0
 [] ? intel_idle+0x198/0x1f0
 [] cpuidle_enter+0x19/0x20
 [] cpuidle_enter_state+0x17/0x60
 [] cpuidle_idle_call+0x2af/0x4e0
 [] ? rcu_idle_enter+0x19a/0x1d0
 [] cpu_idle+0xff/0x190
 [] ? cpu_idle+0xd/0x190
 [] start_secondary+0xcd/0xcf
---[ end trace abc6d2e3c8581c4b ]---
>
>
> --
> Jens Axboe
>
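
For illustration only, the effect of the predicate on the continue_on_error
check quoted above can be sketched as a small standalone C fragment. This is
not fio code: every name below (error_type, io_dir, error_type_of,
should_continue, and the two predicates) is hypothetical, and the mapping from
errno to error class is an assumption made only so the example is
self-contained. The one thing taken from the thread is that the predicate
returns true for EIO and EILSEQ, and that the patch inverts it.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical error classes standing in for the continue_on_error mask. */
enum error_type {
	ERROR_TYPE_READ   = 1 << 0,
	ERROR_TYPE_WRITE  = 1 << 1,
	ERROR_TYPE_VERIFY = 1 << 2,
};

enum io_dir { IO_READ, IO_WRITE };

/* Predicate as described in the thread: true only for EIO and EILSEQ. */
static bool non_fatal_error(int err)
{
	return err == EIO || err == EILSEQ;
}

/* Inverted predicate: everything except EIO and EILSEQ is non-fatal. */
static bool non_fatal_error_inverted(int err)
{
	return !non_fatal_error(err);
}

/* Assumed mapping from errno and I/O direction to an error class. */
static enum error_type error_type_of(enum io_dir dir, int err)
{
	if (err == EILSEQ)
		return ERROR_TYPE_VERIFY;
	return dir == IO_READ ? ERROR_TYPE_READ : ERROR_TYPE_WRITE;
}

/*
 * Same shape as the quoted condition: the error is cleared and the job
 * continues only if the predicate accepts it AND the user asked to
 * continue on that class of error.
 */
static bool should_continue(bool (*pred)(int), unsigned int mask,
			    enum io_dir dir, int err)
{
	return err != 0 && pred(err) && (mask & error_type_of(dir, err)) != 0;
}
```

With the predicate as described, should_continue(non_fatal_error,
ERROR_TYPE_WRITE, IO_WRITE, ENOSPC) is false, so a job that hits ENOSPC on
write stops even though continuing on write errors was requested. With
non_fatal_error_inverted the same call is true, which matches the observation
that the ENOSPC test went forward after the change.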