linux-xfs.vger.kernel.org archive mirror
* repair: realloc(): invalid next size
@ 2018-10-08 10:52 Arkadiusz Miśkiewicz
  0 siblings, 0 replies; 7+ messages in thread
From: Arkadiusz Miśkiewicz @ 2018-10-08 10:52 UTC (permalink / raw)
  To: linux-xfs


Big fs, ton of small files, repair takes 36h until this happens:

rebuilding directory inode 30363993060
rebuilding directory inode 30398868604
rebuilding directory inode 30414474627
rebuilding directory inode 30425006954
rebuilding directory inode 30447937553
rebuilding directory inode 30529556616
rebuilding directory inode 30537494728
rebuilding directory inode 30569826838
rebuilding directory inode 31060721895
Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
xfs_repair: warning - iflush_int failed (-117)
Warning: recursive buffer locking at block 31060721776 detected
Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
xfs_repair: warning - iflush_int failed (-117)
Warning: recursive buffer locking at block 31060721776 detected
Metadata corruption detected at 0x41f980, inode 0x73b5d00e7 data fork
xfs_repair: warning - iflush_int failed (-117)
realloc(): invalid next size
Aborted


Fails somewhere in 0x41f9db <xfs_dir2_sf_verify+603>

Complete log at
https://ixion.pld-linux.org/~arekm/xfs-1/repair.txt

Test was done with xfs_repair 4.17.0 and 4.18.0 with the same result.

kernel 4.18.5

Running under gdb now.

Any ideas?

-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )


* repair: realloc(): invalid next size
@ 2018-10-08 14:03 Arkadiusz Miśkiewicz
  2018-10-08 14:26 ` Eric Sandeen
  2018-10-09 13:49 ` Arkadiusz Miśkiewicz
  0 siblings, 2 replies; 7+ messages in thread
From: Arkadiusz Miśkiewicz @ 2018-10-08 14:03 UTC (permalink / raw)
  To: linux-xfs


Big fs, ton of small files, repair takes 36h until this happens:

rebuilding directory inode 30363993060
rebuilding directory inode 30398868604
rebuilding directory inode 30414474627
rebuilding directory inode 30425006954
rebuilding directory inode 30447937553
rebuilding directory inode 30529556616
rebuilding directory inode 30537494728
rebuilding directory inode 30569826838
rebuilding directory inode 31060721895
Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
xfs_repair: warning - iflush_int failed (-117)
Warning: recursive buffer locking at block 31060721776 detected
Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
xfs_repair: warning - iflush_int failed (-117)
Warning: recursive buffer locking at block 31060721776 detected
Metadata corruption detected at 0x41f980, inode 0x73b5d00e7 data fork
xfs_repair: warning - iflush_int failed (-117)
realloc(): invalid next size
Aborted


Fails somewhere in 0x41f9db <xfs_dir2_sf_verify+603>

Complete log at
https://ixion.pld-linux.org/~arekm/xfs-1/repair.txt

Test was done with xfs_repair 4.17.0 and 4.18.0 with the same result.

kernel 4.18.5

Running under gdb now.

Any ideas?

-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )


* Re: repair: realloc(): invalid next size
  2018-10-08 14:03 repair: realloc(): invalid next size Arkadiusz Miśkiewicz
@ 2018-10-08 14:26 ` Eric Sandeen
  2018-10-08 14:42   ` Eric Sandeen
  2018-10-09 13:49 ` Arkadiusz Miśkiewicz
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Sandeen @ 2018-10-08 14:26 UTC (permalink / raw)
  To: Arkadiusz Miśkiewicz, linux-xfs



On 10/8/18 9:03 AM, Arkadiusz Miśkiewicz wrote:
> 
> Big fs, ton of small files, repair takes 36h until this happens:
> 
> rebuilding directory inode 30363993060
> rebuilding directory inode 30398868604
> rebuilding directory inode 30414474627
> rebuilding directory inode 30425006954
> rebuilding directory inode 30447937553
> rebuilding directory inode 30529556616
> rebuilding directory inode 30537494728
> rebuilding directory inode 30569826838
> rebuilding directory inode 31060721895
> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
> xfs_repair: warning - iflush_int failed (-117)
> Warning: recursive buffer locking at block 31060721776 detected
> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
> xfs_repair: warning - iflush_int failed (-117)
> Warning: recursive buffer locking at block 31060721776 detected
> Metadata corruption detected at 0x41f980, inode 0x73b5d00e7 data fork
> xfs_repair: warning - iflush_int failed (-117)
> realloc(): invalid next size
> Aborted
> 
> 
> Fails somewhere in 0x41f9db <xfs_dir2_sf_verify+603>
> 
> Complete log at
> https://ixion.pld-linux.org/~arekm/xfs-1/repair.txt
> 
> Test was done with xfs_repair 4.17.0 and 4.18.0 with the same result.
> 
> kernel 4.18.5
> 
> Running under gdb now.
> 
> Any ideas?

With such a big fs it's tough to share a metadump for a reproducer, I assume.

The earlier write verifiers failing for xfs_repair writes are troubling...

I'm not certain why it's rebuilding so many dir inodes; there are several cases
where that happens, but unfortunately repair doesn't always say which one or why.

Anyway, you eventually get to this inode (it's the same in decimal & hex
below):

rebuilding directory inode 360732305
Metadata corruption detected at 0x41f9db, inode 0x15805691 data fork
xfs_repair: warning - iflush_int failed (-117)

with lots of corruption during the writes, and this happens for a couple
other inodes, until finally:

rebuilding directory inode 31060721895
Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork

and this one ends up aborting in glibc's realloc():

realloc(): invalid next size

I /think/ that this indicates that memory has been corrupted during the repair
run.  :/  Running under valgrind would probably lead to a 72hr runtime or more :)

I wonder if it would save time in the long run to make a metadump and remove all
directory trees other than this inode (360732305) from it, and see if the same
failure occurs when running on the reduced fs image?

Out of curiosity, what happened to this filesystem to leave it in bad shape?

-Eric
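
As a quick cross-check of the numbers quoted in the message above, the following Python sketch confirms that each "rebuilding directory inode" number matches the hex inode in the "Metadata corruption detected" line that follows it, and looks up errno 117 from the iflush_int warning. On Linux, 117 is EUCLEAN ("Structure needs cleaning"), which XFS uses as EFSCORRUPTED; the errno lookup depends on the platform the script runs on, so treat that part as an assumption.

```python
import errno
import os

# (decimal inode from "rebuilding directory inode", hex inode from the
# "Metadata corruption detected" line), both copied from the log.
INODE_PAIRS = [
    (360732305, "0x15805691"),
    (31060721895, "0x73b5d00e7"),
]

def same_inode(decimal_ino, hex_ino):
    """True if the decimal and hex spellings name the same inode."""
    return decimal_ino == int(hex_ino, 16)

for dec, hx in INODE_PAIRS:
    print(dec, hex(dec), "matches" if same_inode(dec, hx) else "differs")

# iflush_int failed (-117): look up the positive errno value.
# The symbolic name is platform-dependent (EUCLEAN on Linux).
code = 117
print(code, errno.errorcode.get(code, "unknown"), os.strerror(code))
```

Both pairs match, so the failing write verifier and the rebuild message refer to the same directory inode in each case.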


* Re: repair: realloc(): invalid next size
  2018-10-08 14:26 ` Eric Sandeen
@ 2018-10-08 14:42   ` Eric Sandeen
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Sandeen @ 2018-10-08 14:42 UTC (permalink / raw)
  To: Arkadiusz Miśkiewicz, linux-xfs



On 10/8/18 9:26 AM, Eric Sandeen wrote:
> 
> 
> On 10/8/18 9:03 AM, Arkadiusz Miśkiewicz wrote:
>>
>> Big fs, ton of small files, repair takes 36h until this happens:
>>
>> rebuilding directory inode 30363993060
>> rebuilding directory inode 30398868604
>> rebuilding directory inode 30414474627
>> rebuilding directory inode 30425006954
>> rebuilding directory inode 30447937553
>> rebuilding directory inode 30529556616
>> rebuilding directory inode 30537494728
>> rebuilding directory inode 30569826838
>> rebuilding directory inode 31060721895
>> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
>> xfs_repair: warning - iflush_int failed (-117)
>> Warning: recursive buffer locking at block 31060721776 detected
>> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
>> xfs_repair: warning - iflush_int failed (-117)
>> Warning: recursive buffer locking at block 31060721776 detected
>> Metadata corruption detected at 0x41f980, inode 0x73b5d00e7 data fork
>> xfs_repair: warning - iflush_int failed (-117)
>> realloc(): invalid next size
>> Aborted
>>
>>
>> Fails somewhere in 0x41f9db <xfs_dir2_sf_verify+603>
>>
>> Complete log at
>> https://ixion.pld-linux.org/~arekm/xfs-1/repair.txt
>>
>> Test was done with xfs_repair 4.17.0 and 4.18.0 with the same result.
>>
>> kernel 4.18.5
>>
>> Running under gdb now.
>>
>> Any ideas?
> 
> With such a big fs it's tough to share a metadump for a reproducer, I assume.
> 
> The earlier write verifiers failing for xfs_repair writes are troubling...
> 
> I'm not certain why it's rebuilding so many dir inodes; there are several cases
> where that happens, but unfortunately repair doesn't always say which one or why.
> 
> Anyway, you eventually get to this inode (it's the same in decimal & hex
> below):
> 
> rebuilding directory inode 360732305
> Metadata corruption detected at 0x41f9db, inode 0x15805691 data fork
> xfs_repair: warning - iflush_int failed (-117)
> 
> with lots of corruption during the writes, and this happens for a couple
> other inodes, until finally:
> 
> rebuilding directory inode 31060721895
> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
> 
> and this one ends up aborting in glibc's realloc():
> 
> realloc(): invalid next size
> 
> I /think/ that this indicates that memory has been corrupted during the repair
> run.  :/  Running under valgrind would probably lead to a 72hr runtime or more :)
> 
> I wonder if it would save time in the long run to make a metadump and remove all
> directory trees other than this inode (360732305) from it, and see if the same
> failure occurs when running on the reduced fs image?

Actually if you try that, also leaving the directory trees in place for the other
inodes that reported issues would make sense.

-Eric
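
The metadump experiment suggested above could be scripted roughly as follows. This is only a dry-run sketch that prints the commands: the function name metadump_repro_steps and the file names fs.metadump and fs.img are hypothetical, and /dev/sdc1 is taken from the core file line elsewhere in this thread. The step that removes the unrelated directory trees from the restored image is deliberately left out, because the thread does not say how it would be done.

```python
import shlex

def metadump_repro_steps(device, dump_path, image_path):
    """Return the command lines for a metadump-based reproducer (not executed)."""
    return [
        # Capture metadata only; -g prints progress on large filesystems.
        ["xfs_metadump", "-g", device, dump_path],
        # Turn the metadump back into a filesystem image file.
        ["xfs_mdrestore", dump_path, image_path],
        # Run repair against the image; -f marks the target as a regular file.
        ["xfs_repair", "-f", image_path],
    ]

steps = metadump_repro_steps("/dev/sdc1", "fs.metadump", "fs.img")
for cmd in steps:
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

xfs_metadump should be run on a filesystem that is unmounted or mounted read-only. Repairing a restored image rather than the device keeps the original filesystem untouched while the failure is being narrowed down.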


* Re: repair: realloc(): invalid next size
  2018-10-08 14:03 repair: realloc(): invalid next size Arkadiusz Miśkiewicz
  2018-10-08 14:26 ` Eric Sandeen
@ 2018-10-09 13:49 ` Arkadiusz Miśkiewicz
  2018-10-09 18:48   ` Arkadiusz Miśkiewicz
  2018-10-12  7:52   ` Arkadiusz Miśkiewicz
  1 sibling, 2 replies; 7+ messages in thread
From: Arkadiusz Miśkiewicz @ 2018-10-09 13:49 UTC (permalink / raw)
  To: linux-xfs

On 08/10/2018 16:03, Arkadiusz Miśkiewicz wrote:
> 
> Big fs, ton of small files, repair takes 36h until this happens:
> 
> rebuilding directory inode 30363993060
> rebuilding directory inode 30398868604
> rebuilding directory inode 30414474627
> rebuilding directory inode 30425006954
> rebuilding directory inode 30447937553
> rebuilding directory inode 30529556616
> rebuilding directory inode 30537494728
> rebuilding directory inode 30569826838
> rebuilding directory inode 31060721895
> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
> xfs_repair: warning - iflush_int failed (-117)
> Warning: recursive buffer locking at block 31060721776 detected
> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
> xfs_repair: warning - iflush_int failed (-117)
> Warning: recursive buffer locking at block 31060721776 detected
> Metadata corruption detected at 0x41f980, inode 0x73b5d00e7 data fork
> xfs_repair: warning - iflush_int failed (-117)
> realloc(): invalid next size
> Aborted
> 
> 
> Fails somewhere in 0x41f9db <xfs_dir2_sf_verify+603>

Not much progress, traceback but without line numbers:
[New Thread 0x7ffff4588700 (LWP 16783)]
rebuilding directory inode 30299650439
rebuilding directory inode 30300818030
rebuilding directory inode 30317087573
rebuilding directory inode 30363993060
rebuilding directory inode 30398868604
rebuilding directory inode 30414474627
rebuilding directory inode 30425006954
rebuilding directory inode 30447937553
rebuilding directory inode 30529556616
rebuilding directory inode 30537494728
rebuilding directory inode 30569826838
rebuilding directory inode 31060721895
Metadata corruption detected at 0x486261, inode 0x73b5d00e7 data fork
xfs_repair: warning - iflush_int failed (-117)
Warning: recursive buffer locking at block 31060721776 detected
Metadata corruption detected at 0x486261, inode 0x73b5d00e7 data fork
xfs_repair: warning - iflush_int failed (-117)
and segfault

warning: Loadable section ".note.gnu.property" outside of ELF segments
Core was generated by `/sbin/xfs_repair -vvvv /dev/sdc1'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:525
525		VMOVU	-VEC_SIZE(%rcx), %VEC(1)
[Current thread is 1 (Thread 0x7ffff797d300 (LWP 31979))]
(gdb) bt
#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:525
#1  0x0000000000485e1e in xfs_dir2_sf_addname_hard ()
#2  0x000000000048599b in xfs_dir2_sf_addname ()
#3  0x00000000004773c2 in libxfs_dir_createname ()
#4  0x00000000004279f3 in longform_dir2_rebuild ()
#5  0x000000000042a61a in longform_dir2_entry_check ()
#6  0x000000000042b697 in process_dir_inode ()
#7  0x000000000042c3ca in traverse_function ()
#8  0x00000000004304a1 in prefetch_ag_range ()
#9  0x000000000043061f in do_inode_prefetch ()
#10 0x000000000042c49c in traverse_ags ()
#11 0x000000000042c752 in phase6 ()
#12 0x000000000043ea38 in main ()

gdb doesn't like my binary, not sure why yet

/sbin/xfs_repair: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), 
dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for 
GNU/Linux 3.2.0, BuildID[sha1]=710d79304cb58f8e415302572cc718e38f0f1aa4, 
with debug_info, not stripped

> 
> Complete log at
> https://ixion.pld-linux.org/~arekm/xfs-1/repair.txt
> 
> Test was done with xfs_repair 4.17.0 and 4.18.0 with the same result.
> 
> kernel 4.18.5
> 
> Running under gdb now.
> 
> Any ideas?
> 


* Re: repair: realloc(): invalid next size
  2018-10-09 13:49 ` Arkadiusz Miśkiewicz
@ 2018-10-09 18:48   ` Arkadiusz Miśkiewicz
  2018-10-12  7:52   ` Arkadiusz Miśkiewicz
  1 sibling, 0 replies; 7+ messages in thread
From: Arkadiusz Miśkiewicz @ 2018-10-09 18:48 UTC (permalink / raw)
  To: linux-xfs

On 09/10/2018 15:49, Arkadiusz Miśkiewicz wrote:
> On 08/10/2018 16:03, Arkadiusz Miśkiewicz wrote:
>>
>> Big fs, ton of small files, repair takes 36h until this happens:
>>
>> rebuilding directory inode 30363993060
>> rebuilding directory inode 30398868604
>> rebuilding directory inode 30414474627
>> rebuilding directory inode 30425006954
>> rebuilding directory inode 30447937553
>> rebuilding directory inode 30529556616
>> rebuilding directory inode 30537494728
>> rebuilding directory inode 30569826838
>> rebuilding directory inode 31060721895
>> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
>> xfs_repair: warning - iflush_int failed (-117)
>> Warning: recursive buffer locking at block 31060721776 detected
>> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
>> xfs_repair: warning - iflush_int failed (-117)
>> Warning: recursive buffer locking at block 31060721776 detected
>> Metadata corruption detected at 0x41f980, inode 0x73b5d00e7 data fork
>> xfs_repair: warning - iflush_int failed (-117)
>> realloc(): invalid next size
>> Aborted
>>
>>
>> Fails somewhere in 0x41f9db <xfs_dir2_sf_verify+603>
> 
> Not much progress, traceback but without line numbers:
> [New Thread 0x7ffff4588700 (LWP 16783)]
> rebuilding directory inode 30299650439
> rebuilding directory inode 30300818030
> rebuilding directory inode 30317087573
> rebuilding directory inode 30363993060
> rebuilding directory inode 30398868604
> rebuilding directory inode 30414474627
> rebuilding directory inode 30425006954
> rebuilding directory inode 30447937553
> rebuilding directory inode 30529556616
> rebuilding directory inode 30537494728
> rebuilding directory inode 30569826838
> rebuilding directory inode 31060721895
> Metadata corruption detected at 0x486261, inode 0x73b5d00e7 data fork
> xfs_repair: warning - iflush_int failed (-117)
> Warning: recursive buffer locking at block 31060721776 detected
> Metadata corruption detected at 0x486261, inode 0x73b5d00e7 data fork
> xfs_repair: warning - iflush_int failed (-117)
> and segfault
> 
> warning: Loadable section ".note.gnu.property" outside of ELF segments
> Core was generated by `/sbin/xfs_repair -vvvv /dev/sdc1'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:525
> 525        VMOVU    -VEC_SIZE(%rcx), %VEC(1)
> [Current thread is 1 (Thread 0x7ffff797d300 (LWP 31979))]
> (gdb) bt
> #0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:525
> #1  0x0000000000485e1e in xfs_dir2_sf_addname_hard ()
> #2  0x000000000048599b in xfs_dir2_sf_addname ()
> #3  0x00000000004773c2 in libxfs_dir_createname ()
> #4  0x00000000004279f3 in longform_dir2_rebuild ()
> #5  0x000000000042a61a in longform_dir2_entry_check ()
> #6  0x000000000042b697 in process_dir_inode ()
> #7  0x000000000042c3ca in traverse_function ()
> #8  0x00000000004304a1 in prefetch_ag_range ()
> #9  0x000000000043061f in do_inode_prefetch ()
> #10 0x000000000042c49c in traverse_ags ()
> #11 0x000000000042c752 in phase6 ()
> #12 0x000000000043ea38 in main ()
> 
> gdb doesn't like my binary, not sure why yet

but it looks to be the memcpy near the end of the function

(gdb) where
#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:525
#1  0x0000000000485e1e in xfs_dir2_sf_addname_hard ()
#2  0x000000000048599b in xfs_dir2_sf_addname ()
#3  0x00000000004773c2 in libxfs_dir_createname ()
#4  0x00000000004279f3 in longform_dir2_rebuild ()
#5  0x000000000042a61a in longform_dir2_entry_check ()
#6  0x000000000042b697 in process_dir_inode ()
#7  0x000000000042c3ca in traverse_function ()
#8  0x00000000004304a1 in prefetch_ag_range ()
#9  0x000000000043061f in do_inode_prefetch ()
#10 0x000000000042c49c in traverse_ags ()
#11 0x000000000042c752 in phase6 ()
#12 0x000000000043ea38 in main ()
(gdb) frame 1
#1  0x0000000000485e1e in xfs_dir2_sf_addname_hard ()
(gdb) disassemble 0x0000000000485e1e
Dump of assembler code for function xfs_dir2_sf_addname_hard:
    0x0000000000485b32 <+0>:	push   %rbp
    0x0000000000485b33 <+1>:	mov    %rsp,%rbp
    0x0000000000485b36 <+4>:	sub    $0x60,%rsp
    0x0000000000485b3a <+8>:	mov    %rdi,-0x58(%rbp)
    0x0000000000485b3e <+12>:	mov    %esi,-0x5c(%rbp)
    0x0000000000485b41 <+15>:	mov    %edx,-0x60(%rbp)
    0x0000000000485b44 <+18>:	mov    -0x58(%rbp),%rax
    0x0000000000485b48 <+22>:	mov    0x38(%rax),%rax
    0x0000000000485b4c <+26>:	mov    %rax,-0x18(%rbp)
    0x0000000000485b50 <+30>:	mov    -0x18(%rbp),%rax
    0x0000000000485b54 <+34>:	mov    0xb0(%rax),%rax
    0x0000000000485b5b <+41>:	mov    %rax,-0x20(%rbp)
    0x0000000000485b5f <+45>:	mov    -0x18(%rbp),%rax
    0x0000000000485b63 <+49>:	mov    0xe0(%rax),%rax
    0x0000000000485b6a <+56>:	mov    %eax,-0x24(%rbp)
    0x0000000000485b6d <+59>:	mov    -0x24(%rbp),%eax
    0x0000000000485b70 <+62>:	cltq
    0x0000000000485b72 <+64>:	mov    $0x1,%esi
    0x0000000000485b77 <+69>:	mov    %rax,%rdi
    0x0000000000485b7a <+72>:	callq  0x443186 <kmem_alloc>
    0x0000000000485b7f <+77>:	mov    %rax,-0x30(%rbp)
    0x0000000000485b83 <+81>:	mov    -0x30(%rbp),%rax
    0x0000000000485b87 <+85>:	mov    %rax,-0x38(%rbp)
    0x0000000000485b8b <+89>:	mov    -0x24(%rbp),%eax
    0x0000000000485b8e <+92>:	movslq %eax,%rdx
    0x0000000000485b91 <+95>:	mov    -0x20(%rbp),%rcx
    0x0000000000485b95 <+99>:	mov    -0x38(%rbp),%rax
    0x0000000000485b99 <+103>:	mov    %rcx,%rsi
    0x0000000000485b9c <+106>:	mov    %rax,%rdi
    0x0000000000485b9f <+109>:	callq  0x4033c0 <memcpy@plt>
    0x0000000000485ba4 <+114>:	mov    -0x18(%rbp),%rax
    0x0000000000485ba8 <+118>:	mov    0x130(%rax),%rax
    0x0000000000485baf <+125>:	mov    0x70(%rax),%eax
    0x0000000000485bb2 <+128>:	mov    %eax,-0x8(%rbp)
    0x0000000000485bb5 <+131>:	mov    -0x38(%rbp),%rax
    0x0000000000485bb9 <+135>:	mov    %rax,%rdi
    0x0000000000485bbc <+138>:	callq  0x48510e <xfs_dir2_sf_firstentry.lto_priv.614>
    0x0000000000485bc1 <+143>:	mov    %rax,-0x10(%rbp)
    0x0000000000485bc5 <+147>:	mov    -0x18(%rbp),%rax
    0x0000000000485bc9 <+151>:	mov    0x130(%rax),%rax
    0x0000000000485bd0 <+158>:	mov    0x40(%rax),%rax
    0x0000000000485bd4 <+162>:	mov    -0x58(%rbp),%rdx
    0x0000000000485bd8 <+166>:	mov    0x10(%rdx),%edx
    0x0000000000485bdb <+169>:	mov    %edx,%edi
    0x0000000000485bdd <+171>:	callq  0x4ac8cb <__x86_indirect_thunk_rax>
    0x0000000000485be2 <+176>:	mov    %eax,-0x3c(%rbp)
    0x0000000000485be5 <+179>:	mov    -0x24(%rbp),%eax
    0x0000000000485be8 <+182>:	movslq %eax,%rdx
    0x0000000000485beb <+185>:	mov    -0x30(%rbp),%rax
    0x0000000000485bef <+189>:	add    %rdx,%rax
    0x0000000000485bf2 <+192>:	cmp    %rax,-0x10(%rbp)
    0x0000000000485bf6 <+196>:	sete   %al
    0x0000000000485bf9 <+199>:	movzbl %al,%eax
    0x0000000000485bfc <+202>:	mov    %eax,-0x4(%rbp)
    0x0000000000485bff <+205>:	jmpq   0x485c8a <xfs_dir2_sf_addname_hard+344>
    0x0000000000485c04 <+210>:	mov    -0x10(%rbp),%rax
    0x0000000000485c08 <+214>:	mov    %rax,%rdi
    0x0000000000485c0b <+217>:	callq  0x4850c4 <xfs_dir2_sf_get_offset.lto_priv.619>
    0x0000000000485c10 <+222>:	mov    %eax,-0x40(%rbp)
    0x0000000000485c13 <+225>:	mov    -0x3c(%rbp),%edx
    0x0000000000485c16 <+228>:	mov    -0x8(%rbp),%eax
    0x0000000000485c19 <+231>:	add    %edx,%eax
    0x0000000000485c1b <+233>:	cmp    %eax,-0x40(%rbp)
    0x0000000000485c1e <+236>:	jae    0x485c94 <xfs_dir2_sf_addname_hard+354>
    0x0000000000485c20 <+238>:	mov    -0x18(%rbp),%rax
    0x0000000000485c24 <+242>:	mov    0x130(%rax),%rax
    0x0000000000485c2b <+249>:	mov    0x40(%rax),%rax
    0x0000000000485c2f <+253>:	mov    -0x10(%rbp),%rdx
    0x0000000000485c33 <+257>:	movzbl (%rdx),%edx
    0x0000000000485c36 <+260>:	movzbl %dl,%edx
    0x0000000000485c39 <+263>:	mov    %edx,%edi
    0x0000000000485c3b <+265>:	callq  0x4ac8cb <__x86_indirect_thunk_rax>
    0x0000000000485c40 <+270>:	mov    %eax,%edx
    0x0000000000485c42 <+272>:	mov    -0x40(%rbp),%eax
    0x0000000000485c45 <+275>:	add    %edx,%eax
    0x0000000000485c47 <+277>:	mov    %eax,-0x8(%rbp)
    0x0000000000485c4a <+280>:	mov    -0x18(%rbp),%rax
    0x0000000000485c4e <+284>:	mov    0x130(%rax),%rax
    0x0000000000485c55 <+291>:	mov    0x8(%rax),%rax
    0x0000000000485c59 <+295>:	mov    -0x10(%rbp),%rcx
    0x0000000000485c5d <+299>:	mov    -0x38(%rbp),%rdx
    0x0000000000485c61 <+303>:	mov    %rcx,%rsi
    0x0000000000485c64 <+306>:	mov    %rdx,%rdi
    0x0000000000485c67 <+309>:	callq  0x4ac8cb <__x86_indirect_thunk_rax>
    0x0000000000485c6c <+314>:	mov    %rax,-0x10(%rbp)
    0x0000000000485c70 <+318>:	mov    -0x24(%rbp),%eax
    0x0000000000485c73 <+321>:	movslq %eax,%rdx
    0x0000000000485c76 <+324>:	mov    -0x30(%rbp),%rax
    0x0000000000485c7a <+328>:	add    %rdx,%rax
    0x0000000000485c7d <+331>:	cmp    %rax,-0x10(%rbp)
    0x0000000000485c81 <+335>:	sete   %al
    0x0000000000485c84 <+338>:	movzbl %al,%eax
    0x0000000000485c87 <+341>:	mov    %eax,-0x4(%rbp)
    0x0000000000485c8a <+344>:	cmpl   $0x0,-0x4(%rbp)
    0x0000000000485c8e <+348>:	je     0x485c04 <xfs_dir2_sf_addname_hard+210>
    0x0000000000485c94 <+354>:	mov    -0x24(%rbp),%eax
    0x0000000000485c97 <+357>:	neg    %eax
    0x0000000000485c99 <+359>:	mov    %eax,%ecx
    0x0000000000485c9b <+361>:	mov    -0x18(%rbp),%rax
    0x0000000000485c9f <+365>:	mov    $0x0,%edx
    0x0000000000485ca4 <+370>:	mov    %ecx,%esi
    0x0000000000485ca6 <+372>:	mov    %rax,%rdi
    0x0000000000485ca9 <+375>:	callq  0x4918ee <libxfs_idata_realloc>
    0x0000000000485cae <+380>:	mov    -0x60(%rbp),%ecx
    0x0000000000485cb1 <+383>:	mov    -0x18(%rbp),%rax
    0x0000000000485cb5 <+387>:	mov    $0x0,%edx
    0x0000000000485cba <+392>:	mov    %ecx,%esi
    0x0000000000485cbc <+394>:	mov    %rax,%rdi
    0x0000000000485cbf <+397>:	callq  0x4918ee <libxfs_idata_realloc>
    0x0000000000485cc4 <+402>:	mov    -0x18(%rbp),%rax
    0x0000000000485cc8 <+406>:	mov    0xb0(%rax),%rax
    0x0000000000485ccf <+413>:	mov    %rax,-0x20(%rbp)
    0x0000000000485cd3 <+417>:	mov    -0x10(%rbp),%rax
    0x0000000000485cd7 <+421>:	sub    -0x38(%rbp),%rax
    0x0000000000485cdb <+425>:	mov    %eax,-0x44(%rbp)
    0x0000000000485cde <+428>:	mov    -0x44(%rbp),%eax
    0x0000000000485ce1 <+431>:	movslq %eax,%rdx
    0x0000000000485ce4 <+434>:	mov    -0x38(%rbp),%rcx
    0x0000000000485ce8 <+438>:	mov    -0x20(%rbp),%rax
    0x0000000000485cec <+442>:	mov    %rcx,%rsi
    0x0000000000485cef <+445>:	mov    %rax,%rdi
    0x0000000000485cf2 <+448>:	callq  0x4033c0 <memcpy@plt>
    0x0000000000485cf7 <+453>:	mov    -0x44(%rbp),%eax
    0x0000000000485cfa <+456>:	movslq %eax,%rdx
    0x0000000000485cfd <+459>:	mov    -0x20(%rbp),%rax
    0x0000000000485d01 <+463>:	add    %rdx,%rax
    0x0000000000485d04 <+466>:	mov    %rax,-0x50(%rbp)
    0x0000000000485d08 <+470>:	mov    -0x58(%rbp),%rax
    0x0000000000485d0c <+474>:	mov    0x10(%rax),%eax
    0x0000000000485d0f <+477>:	mov    %eax,%edx
    0x0000000000485d11 <+479>:	mov    -0x50(%rbp),%rax
    0x0000000000485d15 <+483>:	mov    %dl,(%rax)
    0x0000000000485d17 <+485>:	mov    -0x8(%rbp),%edx
    0x0000000000485d1a <+488>:	mov    -0x50(%rbp),%rax
    0x0000000000485d1e <+492>:	mov    %edx,%esi
    0x0000000000485d20 <+494>:	mov    %rax,%rdi
    0x0000000000485d23 <+497>:	callq  0x4850e5 <xfs_dir2_sf_put_offset.lto_priv.616>
    0x0000000000485d28 <+502>:	mov    -0x50(%rbp),%rax
    0x0000000000485d2c <+506>:	movzbl (%rax),%eax
    0x0000000000485d2f <+509>:	movzbl %al,%edx
    0x0000000000485d32 <+512>:	mov    -0x58(%rbp),%rax
    0x0000000000485d36 <+516>:	mov    0x8(%rax),%rax
    0x0000000000485d3a <+520>:	mov    -0x50(%rbp),%rcx
    0x0000000000485d3e <+524>:	add    $0x3,%rcx
    0x0000000000485d42 <+528>:	mov    %rax,%rsi
    0x0000000000485d45 <+531>:	mov    %rcx,%rdi
    0x0000000000485d48 <+534>:	callq  0x4033c0 <memcpy@plt>
    0x0000000000485d4d <+539>:	mov    -0x18(%rbp),%rax
    0x0000000000485d51 <+543>:	mov    0x130(%rax),%rax
    0x0000000000485d58 <+550>:	mov    0x28(%rax),%rax
    0x0000000000485d5c <+554>:	mov    -0x58(%rbp),%rdx
    0x0000000000485d60 <+558>:	mov    0x30(%rdx),%rdx
    0x0000000000485d64 <+562>:	mov    -0x50(%rbp),%rsi
    0x0000000000485d68 <+566>:	mov    -0x20(%rbp),%rcx
    0x0000000000485d6c <+570>:	mov    %rcx,%rdi
    0x0000000000485d6f <+573>:	callq  0x4ac8cb <__x86_indirect_thunk_rax>
    0x0000000000485d74 <+578>:	mov    -0x18(%rbp),%rax
    0x0000000000485d78 <+582>:	mov    0x130(%rax),%rax
    0x0000000000485d7f <+589>:	mov    0x18(%rax),%rax
    0x0000000000485d83 <+593>:	mov    -0x58(%rbp),%rdx
    0x0000000000485d87 <+597>:	movzbl 0x14(%rdx),%edx
    0x0000000000485d8b <+601>:	movzbl %dl,%ecx
    0x0000000000485d8e <+604>:	mov    -0x50(%rbp),%rdx
    0x0000000000485d92 <+608>:	mov    %ecx,%esi
    0x0000000000485d94 <+610>:	mov    %rdx,%rdi
    0x0000000000485d97 <+613>:	callq  0x4ac8cb <__x86_indirect_thunk_rax>
    0x0000000000485d9c <+618>:	mov    -0x20(%rbp),%rax
    0x0000000000485da0 <+622>:	movzbl (%rax),%eax
    0x0000000000485da3 <+625>:	lea    0x1(%rax),%edx
    0x0000000000485da6 <+628>:	mov    -0x20(%rbp),%rax
    0x0000000000485daa <+632>:	mov    %dl,(%rax)
    0x0000000000485dac <+634>:	mov    -0x58(%rbp),%rax
    0x0000000000485db0 <+638>:	mov    0x30(%rax),%rax
    0x0000000000485db4 <+642>:	mov    $0xffffffff,%edx
    0x0000000000485db9 <+647>:	cmp    %rdx,%rax
    0x0000000000485dbc <+650>:	jbe    0x485dd6 <xfs_dir2_sf_addname_hard+676>
    0x0000000000485dbe <+652>:	cmpl   $0x0,-0x5c(%rbp)
    0x0000000000485dc2 <+656>:	jne    0x485dd6 <xfs_dir2_sf_addname_hard+676>
    0x0000000000485dc4 <+658>:	mov    -0x20(%rbp),%rax
    0x0000000000485dc8 <+662>:	movzbl 0x1(%rax),%eax
    0x0000000000485dcc <+666>:	lea    0x1(%rax),%edx
    0x0000000000485dcf <+669>:	mov    -0x20(%rbp),%rax
    0x0000000000485dd3 <+673>:	mov    %dl,0x1(%rax)
    0x0000000000485dd6 <+676>:	cmpl   $0x0,-0x4(%rbp)
    0x0000000000485dda <+680>:	jne    0x485e1e <xfs_dir2_sf_addname_hard+748>
    0x0000000000485ddc <+682>:	mov    -0x18(%rbp),%rax
    0x0000000000485de0 <+686>:	mov    0x130(%rax),%rax
    0x0000000000485de7 <+693>:	mov    0x8(%rax),%rax
    0x0000000000485deb <+697>:	mov    -0x50(%rbp),%rcx
    0x0000000000485def <+701>:	mov    -0x20(%rbp),%rdx
    0x0000000000485df3 <+705>:	mov    %rcx,%rsi
    0x0000000000485df6 <+708>:	mov    %rdx,%rdi
    0x0000000000485df9 <+711>:	callq  0x4ac8cb <__x86_indirect_thunk_rax>
    0x0000000000485dfe <+716>:	mov    %rax,-0x50(%rbp)
    0x0000000000485e02 <+720>:	mov    -0x24(%rbp),%eax
    0x0000000000485e05 <+723>:	sub    -0x44(%rbp),%eax
    0x0000000000485e08 <+726>:	movslq %eax,%rdx
    0x0000000000485e0b <+729>:	mov    -0x10(%rbp),%rcx
    0x0000000000485e0f <+733>:	mov    -0x50(%rbp),%rax
    0x0000000000485e13 <+737>:	mov    %rcx,%rsi
    0x0000000000485e16 <+740>:	mov    %rax,%rdi
    0x0000000000485e19 <+743>:	callq  0x4033c0 <memcpy@plt>
=> 0x0000000000485e1e <+748>:	mov    -0x30(%rbp),%rax
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    0x0000000000485e22 <+752>:	mov    %rax,%rdi
    0x0000000000485e25 <+755>:	callq  0x484e23 <kmem_free.lto_priv.174>
    0x0000000000485e2a <+760>:	mov    -0x60(%rbp),%eax
    0x0000000000485e2d <+763>:	movslq %eax,%rdx
    0x0000000000485e30 <+766>:	mov    -0x18(%rbp),%rax
    0x0000000000485e34 <+770>:	mov    %rdx,0xe0(%rax)
    0x0000000000485e3b <+777>:	leaveq
    0x0000000000485e3c <+778>:	retq
End of assembler dump.
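
The addresses in this thread can be turned into function offsets with plain subtraction. The sketch below is only arithmetic on values quoted in the messages; the helper name offset_in_function is hypothetical. The two groups of addresses come from different builds of xfs_repair, and the second group assumes that 0x41f980 lies in the same function as 0x41f9db, which the log does not state.

```python
def offset_in_function(addr, func_start):
    """Return addr as a byte offset from the start of its function."""
    if addr < func_start:
        raise ValueError("address precedes function start")
    return addr - func_start

# Values from the disassembly of xfs_dir2_sf_addname_hard (LTO build).
ADDNAME_HARD_START = 0x485B32  # instruction at <+0>
MEMCPY_CALL = 0x485E19         # last "callq memcpy@plt" in the dump
RETURN_ADDR = 0x485E1E         # frame #1, the line marked "=>"

ret_off = offset_in_function(RETURN_ADDR, ADDNAME_HARD_START)
call_off = offset_in_function(MEMCPY_CALL, ADDNAME_HARD_START)
print(f"return address: +{ret_off}, call site: +{call_off}")

# Values from the first report: 0x41f9db <xfs_dir2_sf_verify+603>.
VERIFY_ADDR = 0x41F9DB
VERIFY_OFFSET = 603
verify_start = VERIFY_ADDR - VERIFY_OFFSET
second_off = offset_in_function(0x41F980, verify_start)
print(f"xfs_dir2_sf_verify starts at {verify_start:#x}; 0x41f980 is +{second_off}")
```

The return address sits 5 bytes after the call site, which is the length of the callq instruction, so frame #1 was inside the final memcpy of the function when the fault occurred.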

> 
> /sbin/xfs_repair: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for 
> GNU/Linux 3.2.0, BuildID[sha1]=710d79304cb58f8e415302572cc718e38f0f1aa4, 
> with debug_info, not stripped
> 
>>
>> Complete log at
>> https://ixion.pld-linux.org/~arekm/xfs-1/repair.txt
>>
>> Test was done with xfs_repair 4.17.0 and 4.18.0 with the same result.
>>
>> kernel 4.18.5
>>
>> Running under gdb now.
>>
>> Any ideas?
>>
> 
> 

-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )


* Re: repair: realloc(): invalid next size
  2018-10-09 13:49 ` Arkadiusz Miśkiewicz
  2018-10-09 18:48   ` Arkadiusz Miśkiewicz
@ 2018-10-12  7:52   ` Arkadiusz Miśkiewicz
  1 sibling, 0 replies; 7+ messages in thread
From: Arkadiusz Miśkiewicz @ 2018-10-12  7:52 UTC (permalink / raw)
  To: linux-xfs

On 09/10/2018 15:49, Arkadiusz Miśkiewicz wrote:
> On 08/10/2018 16:03, Arkadiusz Miśkiewicz wrote:
>>
>> Big fs, ton of small files, repair takes 36h until this happens:
>>
>> rebuilding directory inode 30363993060
>> rebuilding directory inode 30398868604
>> rebuilding directory inode 30414474627
>> rebuilding directory inode 30425006954
>> rebuilding directory inode 30447937553
>> rebuilding directory inode 30529556616
>> rebuilding directory inode 30537494728
>> rebuilding directory inode 30569826838
>> rebuilding directory inode 31060721895
>> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
>> xfs_repair: warning - iflush_int failed (-117)
>> Warning: recursive buffer locking at block 31060721776 detected
>> Metadata corruption detected at 0x41f9db, inode 0x73b5d00e7 data fork
>> xfs_repair: warning - iflush_int failed (-117)
>> Warning: recursive buffer locking at block 31060721776 detected
>> Metadata corruption detected at 0x41f980, inode 0x73b5d00e7 data fork
>> xfs_repair: warning - iflush_int failed (-117)
>> realloc(): invalid next size
>> Aborted
>>
>>
>> Fails somewhere in 0x41f9db <xfs_dir2_sf_verify+603>
[...]

> 
> gdb doesn't like my binary, not sure why yet

The LTO patch that was merged into xfsprogs makes the binary undebuggable with
gdb (when LTO is enabled, which happens automatically in my setup).

Now back to the main problem:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff7ba7541 in __GI_abort () at abort.c:79
#2  0x00007ffff7c00458 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7d0f400 "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007ffff7c06dea in malloc_printerr (str=str@entry=0x7ffff7d0d6ce "realloc(): invalid next size") at malloc.c:5336
#4  0x00007ffff7c0b00c in _int_realloc (av=av@entry=0x7ffff7d46c40 <main_arena>, oldp=oldp@entry=0x61930520, oldsize=oldsize@entry=96, nb=nb@entry=160) at malloc.c:4520
#5  0x00007ffff7c0c28b in __GI___libc_realloc (oldmem=0x61930530, bytes=140) at malloc.c:3214
#6  0x000000000044343a in kmem_realloc (ptr=0x61930530, new_size=140, flags=5) at kmem.c:90
#7  0x0000000000491dd7 in libxfs_idata_realloc (ip=0x50a95cd0, byte_diff=63, whichfork=0) at xfs_inode_fork.c:511
#8  0x0000000000485dc3 in xfs_dir2_sf_addname_easy (args=0x69ac0100, sfep=0x61930546, offset=44, new_isize=137) at xfs_dir2_sf.c:379
#9  0x0000000000485cff in xfs_dir2_sf_addname (args=0x69ac0100) at xfs_dir2_sf.c:339
#10 0x0000000000477705 in libxfs_dir_createname (tp=0x5b60a9d0, dp=0x50a95cd0, name=0x50ad5450, inum=31060721942, first=0x7fffffffdae8, dfops=0x7fffffffdaa0, total=45) at xfs_dir2.c:281
#11 0x0000000000427b1a in longform_dir2_rebuild (mp=0x7fffffffe180, ino=31060721895, ip=0x50a95cd0, irec=0x7fff2016f780, ino_offset=7, hashtab=0x8d9811e0) at phase6.c:1444
#12 0x000000000042a743 in longform_dir2_entry_check (mp=0x7fffffffe180, ino=31060721895, ip=0x50a95cd0, num_illegal=0x7fffffffdccc, need_dot=0x7fffffffdcd4, irec=0x7fff2016f780, ino_offset=7, hashtab=0x8d9811e0) at phase6.c:2481
#13 0x000000000042b7c7 in process_dir_inode (mp=0x7fffffffe180, agno=14, irec=0x7fff2016f780, ino_offset=7) at phase6.c:2983
#14 0x000000000042c4ff in traverse_function (wq=0x7fffffffde50, agno=14, arg=0x62c5ec20) at phase6.c:3254
#15 0x00000000004305fe in prefetch_ag_range (work=0x7fffffffde50, start_ag=0, end_ag=39, dirs_only=true, func=0x42c442 <traverse_function>) at prefetch.c:964
#16 0x000000000043077e in do_inode_prefetch (mp=0x7fffffffe180, stride=0, func=0x42c442 <traverse_function>, check_cache=false, dirs_only=true) at prefetch.c:1027
#17 0x000000000042c5d3 in traverse_ags (mp=0x7fffffffe180) at phase6.c:3284
#18 0x000000000042c88a in phase6 (mp=0x7fffffffe180) at phase6.c:3372
#19 0x000000000043ebd8 in main (argc=3, argv=0x7fffffffe688) at xfs_repair.c:949


more at

https://ixion.pld-linux.org/~arekm/xfs-1/gdb1.txt

I have a coredump, so I can look for more details.
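
The numbers in frames #4 and #5 of the backtrace above can be checked against how 64-bit glibc sizes heap chunks. The sketch below mirrors the request2size calculation from glibc's malloc.c; the constants SIZE_SZ, MALLOC_ALIGN_MASK and MINSIZE are the usual x86-64 values and are an assumption about this particular glibc build, not something shown in the trace.

```python
# Usual values in glibc's malloc.c on x86-64 (assumed for this build).
SIZE_SZ = 8
MALLOC_ALIGN_MASK = 2 * SIZE_SZ - 1
MINSIZE = 32

def request2size(req):
    """Chunk size glibc would allocate for a user request of req bytes."""
    padded = req + SIZE_SZ + MALLOC_ALIGN_MASK
    return MINSIZE if padded < MINSIZE else padded & ~MALLOC_ALIGN_MASK

# Values copied from frames #4 (_int_realloc) and #5 (__libc_realloc).
OLDMEM = 0x61930530   # user pointer passed to realloc
OLDP = 0x61930520     # chunk header address
OLDSIZE = 96          # size of the existing chunk
BYTES = 140           # requested new size

header_len = OLDMEM - OLDP       # distance from chunk header to user data
new_chunk = request2size(BYTES)  # should equal nb in frame #4
usable_old = OLDSIZE - SIZE_SZ   # bytes the old chunk can hold
next_chunk = OLDP + OLDSIZE      # header of the chunk that follows

print(f"header {header_len} bytes, nb {new_chunk}, old usable {usable_old}")
print(f"next chunk header at {next_chunk:#x}")
```

The computed request size agrees with nb=160 in the trace. glibc prints "realloc(): invalid next size" when the size field of the chunk following the one being resized looks implausible, so the abort suggests that the header at the computed next-chunk address was overwritten before this call, for example by a write past the 88 usable bytes of the old buffer.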


Thread overview: 7+ messages
2018-10-08 14:03 repair: realloc(): invalid next size Arkadiusz Miśkiewicz
2018-10-08 14:26 ` Eric Sandeen
2018-10-08 14:42   ` Eric Sandeen
2018-10-09 13:49 ` Arkadiusz Miśkiewicz
2018-10-09 18:48   ` Arkadiusz Miśkiewicz
2018-10-12  7:52   ` Arkadiusz Miśkiewicz
  -- strict thread matches above, loose matches on Subject: below --
2018-10-08 10:52 Arkadiusz Miśkiewicz
