From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Graham Cobb <g.btrfs@cobb.uk.net>, linux-btrfs@vger.kernel.org
Subject: Re: Interrupted and resumed scrubs seem to have caused filesystem to go readonly (EFBIG error)
Date: Thu, 2 Jan 2020 09:26:50 +0800
Message-ID: <7798d1f5-d54d-e756-973c-f2ebfa456315@gmx.com>
In-Reply-To: <f15c0d2f-df61-17fc-667c-2b0eb5674be2@cobb.uk.net>
On 2020/1/2 7:35 AM, Graham Cobb wrote:
> I have a problem on one BTRFS filesystem. It is not a critical
> filesystem (it is used for backups) and I have not yet tried even
> unmounting and remounting, let alone a "btrfs check".
>
> The problem seems to be that after several iterations of running 'btrfs
> scrub' for 30 minutes, then pausing for a while, then resuming the
> scrub, I got a transaction aborted with an EFBIG error and a warning in
> the kernel log. The fs went readonly, and transid verify errors are now
> reported. The original log extract is available at
> http://www.cobb.uk.net/kern.log.bug-010120 but I have pasted the key
> part below.
EFBIG (errno 27, the -27 in your log) is very rare in btrfs, and can
only be caused by too many system chunks.
The most common cause is chunk pre-allocation for scrub, which also
matches your situation.
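
For reference, the numeric codes in the log can be decoded with a short
Python sketch. This is only an illustration using the standard errno and
os modules; describe_errno is a made-up helper name, not part of any
btrfs tool. On Linux, -27 maps to EFBIG and -30 to EROFS, the latter
being what scrub reports once the filesystem has been forced readonly.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME (message)' for a kernel-style errno.

    The kernel logs errors as negative numbers (e.g. -27), while the
    userspace tables are indexed by the positive value.
    """
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name} ({os.strerror(number)})"


# The two codes that appear in the quoted kernel log.
for code in (-27, -30):
    print(code, describe_errno(code))
```

The message text after the name comes from the local C library, so it
may differ between systems; the symbolic names are the stable part.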
There is already a fix for it, which will land in the v5.5 kernel.
It looks like we should backport it.
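
Until a backport exists, a rough way to tell whether a machine predates
the fix is to compare its release string against 5.5. The sketch below
is only an illustration: FIXED_IN, kernel_version and predates_fix are
hypothetical names, and the check assumes the fix first appears in v5.5
as stated above. A distro kernel may carry the patch under an older
version number, so treat the result as a hint, not proof.

```python
import re
from typing import Tuple

# Assumption taken from this thread: the fix first ships in v5.5.
FIXED_IN = (5, 5)


def kernel_version(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string like '5.3.0-2-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def predates_fix(release: str) -> bool:
    """True if the release is older than the version carrying the fix."""
    return kernel_version(release) < FIXED_IN


# The kernel from the report.
print(predates_fix("5.3.0-2-amd64"))
```

On a live system the release string can be taken from
platform.release() in the Python standard library.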
Thanks,
Qu
>
> The kernel is a Debian Testing kernel:
> Linux black 5.3.0-2-amd64 #1 SMP Debian 5.3.9-3 (2019-11-19) x86_64
> GNU/Linux
>
> I run this same script monthly, and I have not seen this problem before,
> so I cannot be certain it is caused by the scrub. I have not yet tried
> to reproduce it, or to investigate the filesystem (check, etc).
>
> Does anyone recognise this as a known/fixed problem? If not, is there
> any particular further information I could gather before or during my
> attempt to either recover the filesystem or just rebuild it?
>
> Here is the log (starting with the 7th resumed scrub):
>
>
> Jan 1 06:41:45 black kernel: [1930660.938782] BTRFS info (device sdc3):
> scrub: started on devid 1
> Jan 1 06:41:45 black kernel: [1930660.939195] BTRFS info (device sdc3):
> scrub: started on devid 4
> Jan 1 06:41:45 black kernel: [1930661.475557] ------------[ cut here
> ]------------
> Jan 1 06:41:45 black kernel: [1930661.475562] BTRFS: Transaction
> aborted (error -27)
> Jan 1 06:41:45 black kernel: [1930661.475667] WARNING: CPU: 0 PID:
> 771075 at fs/btrfs/extent-tree.c:8247
> btrfs_create_pending_block_groups+0x1db/0x230 [btrfs]
> Jan 1 06:41:45 black kernel: [1930661.475669] Modules linked in: fuse
> nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache bnep nf_tables
> snd_hrtimer snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq
> snd_seq_device cpufreq_userspace cpufreq_powersave
> cpufreq_conservative nfnetlink_queue nfnetlink_log nfnetlink bluetooth
> drbg ansi_cprng ecdh_generic ecc binfmt_misc hid_generic usbhid hid
> it87 hwmon_vid radeon edac_mce_amd kvm_amd eeepc_wmi ccp asus_wmi
> rng_core evdev sparse_keymap kvm snd_hda_codec_realtek rfkill
> irqbypass snd_hda_codec_generic ttm video wmi_bmof ledtrig_audio
> pcspkr snd_hda_codec_hdmi drm_kms_helper fam15h_power k10temp
> snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep sp5100_tco drm
> snd_pcm_oss snd_mixer_oss watchdog snd_pcm snd_timer sg snd soundcore
> i2c_algo_bit button acpi_cpufreq eeprom i2c_nforce2 firewire_sbp2
> firewire_core crc_itu_t psmouse nfsd parport_pc ppdev auth_rpcgss lp
> nfs_acl parport lockd grace sunrpc ip_tables x_tables autofs4 btrfs
> xor zstd_decompress zstd_compress raid6_pq libcrc32c
> Jan 1 06:41:45 black kernel: [1930661.475710] ext4 crc16 mbcache jbd2
> crc32c_generic sr_mod cdrom uas usb_storage sd_mod dm_crypt dm_mod
> crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel
> ohci_pci aesni_intel ahci libahci xhci_pci aes_x86_64 xhci_hcd
> crypto_simd libata ehci_pci ohci_hcd ehci_hcd cryptd glue_helper
> scsi_mod usbcore r8169 i2c_piix4 realtek libphy usb_common wmi
> Jan 1 06:41:45 black kernel: [1930661.475737] CPU: 0 PID: 771075 Comm:
> btrfs Not tainted 5.3.0-2-amd64 #1 Debian 5.3.9-3
> Jan 1 06:41:45 black kernel: [1930661.475739] Hardware name: To be
> filled by O.E.M. To be filled by O.E.M./M5A97, BIOS 0705 08/22/2011
> Jan 1 06:41:45 black kernel: [1930661.475767] RIP:
> 0010:btrfs_create_pending_block_groups+0x1db/0x230 [btrfs]
> Jan 1 06:41:45 black kernel: [1930661.475770] Code: e9 26 ff ff ff 48
> 8b 45 50 f0 48 0f ba a8 38 17 00 00 02 72 17 41 83 fc fb 74 2d
> 44 89 e6 48 c7 c7 50 2e 7a c0 e8 23 9d 19 e3 <0f> 0b 44 89 e1 ba 37 20
> 00 00 48 c7 c6 20 80 79 c0 48 89 ef e8 73
> Jan 1 06:41:45 black kernel: [1930661.475772] RSP:
> 0018:ffff9c69804cfb00 EFLAGS: 00010286
> Jan 1 06:41:45 black kernel: [1930661.475775] RAX: 0000000000000000
> RBX: ffff909444e7a520 RCX: 0000000000000006
> Jan 1 06:41:45 black kernel: [1930661.475777] RDX: 0000000000000007
> RSI: 0000000000000096 RDI: ffff90957aa17680
> Jan 1 06:41:45 black kernel: [1930661.475779] RBP: ffff90946c745d68
> R08: 0000000000010ec1 R09: 0000000000000007
> Jan 1 06:41:45 black kernel: [1930661.475781] R10: 0000000000000000
> R11: 0000000000000001 R12: 00000000ffffffe5
> Jan 1 06:41:45 black kernel: [1930661.475783] R13: ffff90946c745dc0
> R14: ffff909575d2e000 R15: ffff909574444000
> Jan 1 06:41:45 black kernel: [1930661.475786] FS:
> 00007ff2eb4c7700(0000) GS:ffff90957aa00000(0000) knlGS:0000000000000000
> Jan 1 06:41:45 black kernel: [1930661.475788] CS: 0010 DS: 0000 ES:
> 0000 CR0: 0000000080050033
> Jan 1 06:41:45 black kernel: [1930661.475790] CR2: 00005634edab7008
> CR3: 00000000bd0f2000 CR4: 00000000000406f0
> Jan 1 06:41:45 black kernel: [1930661.475792] Call Trace:
> Jan 1 06:41:45 black kernel: [1930661.475826]
> __btrfs_end_transaction+0x3f/0x1b0 [btrfs]
> Jan 1 06:41:45 black kernel: [1930661.475855]
> btrfs_inc_block_group_ro+0x10e/0x150 [btrfs]
> Jan 1 06:41:45 black kernel: [1930661.475891]
> scrub_enumerate_chunks+0x162/0x560 [btrfs]
> Jan 1 06:41:45 black kernel: [1930661.475900] ?
> remove_wait_queue+0x20/0x60
> Jan 1 06:41:45 black kernel: [1930661.475936]
> btrfs_scrub_dev+0x26b/0x590 [btrfs]
> Jan 1 06:41:45 black kernel: [1930661.475942] ? _cond_resched+0x15/0x30
> Jan 1 06:41:45 black kernel: [1930661.475946] ?
> __kmalloc_track_caller+0x16e/0x260
> Jan 1 06:41:45 black kernel: [1930661.475980] ?
> btrfs_ioctl+0x82f/0x2e10 [btrfs]
> Jan 1 06:41:45 black kernel: [1930661.475984] ?
> __check_object_size+0x136/0x147
> Jan 1 06:41:45 black kernel: [1930661.476019] btrfs_ioctl+0x87a/0x2e10
> [btrfs]
> Jan 1 06:41:45 black kernel: [1930661.476024] ?
> tomoyo_path_number_perm+0x66/0x1d0
> Jan 1 06:41:45 black kernel: [1930661.476030] ? do_vfs_ioctl+0x40e/0x670
> Jan 1 06:41:45 black kernel: [1930661.476033] do_vfs_ioctl+0x40e/0x670
> Jan 1 06:41:45 black kernel: [1930661.476036] ?
> create_task_io_context+0x95/0x100
> Jan 1 06:41:45 black kernel: [1930661.476040] ksys_ioctl+0x5e/0x90
> Jan 1 06:41:45 black kernel: [1930661.476044] __x64_sys_ioctl+0x16/0x20
> Jan 1 06:41:45 black kernel: [1930661.476048] do_syscall_64+0x53/0x140
> Jan 1 06:41:45 black kernel: [1930661.476052]
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> Jan 1 06:41:45 black kernel: [1930661.476055] RIP: 0033:0x7ff2eb5b95b7
> Jan 1 06:41:45 black kernel: [1930661.476058] Code: 00 00 90 48 8b 05
> d9 78 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84
> 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b
> 0d a9 78 0c 00 f7 d8 64 89 01 48
> Jan 1 06:41:45 black kernel: [1930661.476061] RSP:
> 002b:00007ff2eb4c6d38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> Jan 1 06:41:45 black kernel: [1930661.476064] RAX: ffffffffffffffda
> RBX: 000055eeaf2e94b0 RCX: 00007ff2eb5b95b7
> Jan 1 06:41:45 black kernel: [1930661.476066] RDX: 000055eeaf2e94b0
> RSI: 00000000c400941b RDI: 0000000000000003
> Jan 1 06:41:45 black kernel: [1930661.476067] RBP: 0000000000000000
> R08: 00007ff2eb4c7700 R09: 0000000000000000
> Jan 1 06:41:45 black kernel: [1930661.476069] R10: 00007ff2eb4c7700
> R11: 0000000000000246 R12: 00007ffc4cfa511e
> Jan 1 06:41:45 black kernel: [1930661.476071] R13: 00007ffc4cfa511f
> R14: 00007ff2eb4c7700 R15: 00007ff2eb4c6e40
> Jan 1 06:41:45 black kernel: [1930661.476075] ---[ end trace
> 6429c1bf293fecb8 ]---
> Jan 1 06:41:45 black kernel: [1930661.476079] BTRFS: error (device
> sdc3) in btrfs_create_pending_block_groups:8247: errno=-27 unknown
> Jan 1 06:41:45 black kernel: [1930661.476082] BTRFS info (device sdc3):
> forced readonly
> Jan 1 06:41:45 black kernel: [1930661.489816] BTRFS warning (device
> sdc3): failed setting block group ro: -30
> Jan 1 06:41:45 black kernel: [1930661.489821] BTRFS info (device sdc3):
> scrub: not finished on devid 1 with status: -30
> Jan 1 06:41:52 black kernel: [1930668.052295] BTRFS warning (device
> sdc3): failed setting block group ro: -30
> Jan 1 06:41:52 black kernel: [1930668.052301] BTRFS info (device sdc3):
> scrub: not finished on devid 4 with status: -30
> Jan 1 06:51:56 black kernel: [1931271.801468] BTRFS error (device
> sdc3): parent transid verify failed on 16216583520256 wanted 301800
> found 301756
> Jan 1 06:51:56 black kernel: [1931271.822215] BTRFS error (device
> sdc3): parent transid verify failed on 16216583520256 wanted 301800
> found 301756
> Jan 1 06:51:57 black kernel: [1931273.492798] BTRFS error (device
> sdc3): parent transid verify failed on 16216583520256 wanted 301800
> found 301756
> Jan 1 06:51:57 black kernel: [1931273.493041] BTRFS error (device
> sdc3): parent transid verify failed on 16216583520256 wanted 301800
> found 301756
>