* nilfs bad btree node and broken bmap
From: Rodrigo Severo @ 2017-06-28 19:19 UTC
To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA
Hi,
I have an 8 TB nilfs2 disk which started changing to read-only by itself.
My log files are full of the following error messages:
Jun 28 15:57:08 c1-df kernel: [ 1637.138168] NILFS (sdd1): bad btree node (ino=6069196, blocknr=446552325): level = 69, flags = 0x24, nchildren = 8203
Jun 28 15:57:08 c1-df kernel: [ 1637.139363] NILFS error (device sdd1): nilfs_bmap_lookup_contig: broken bmap (inode number=6069196)
I have already checked and there are no errors on the disk.
How can I fix this?
Regards,
Rodrigo Severo
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: nilfs bad btree node and broken bmap
From: Rodrigo Severo @ 2017-06-30 12:34 UTC
To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA
2017-06-28 16:19 GMT-03:00 Rodrigo Severo <rodrigo-sUXncmUOihszlUgHzu1iJkEOCMrvLtNR@public.gmane.org>:
> Hi,
>
>
> I have an 8 TB nilfs2 disk which started changing to read-only by itself.
>
> My log files are full of the following error messages:
>
> Jun 28 15:57:08 c1-df kernel: [ 1637.138168] NILFS (sdd1): bad btree node (ino=6069196, blocknr=446552325): level = 69, flags = 0x24, nchildren = 8203
> Jun 28 15:57:08 c1-df kernel: [ 1637.139363] NILFS error (device sdd1): nilfs_bmap_lookup_contig: broken bmap (inode number=6069196)
>
> I have already checked and there are no errors on the disk.
>
> How can I fix this?
No ideas? Nobody?
Rodrigo
* Re: nilfs bad btree node and broken bmap
From: Peter Grandi @ 2017-06-30 18:41 UTC
To: Linux fs NILFS
> I have an 8 TB nilfs2 disk which started changing to read-only by itself.
Highly unlikely.
> My log files are full of the following error messages:
> Jun 28 15:57:08 c1-df kernel: [ 1637.138168] NILFS (sdd1): bad btree node (ino=6069196, blocknr=446552325): level = 69, flags = 0x24, nchildren = 8203
> Jun 28 15:57:08 c1-df kernel: [ 1637.139363] NILFS error (device sdd1): nilfs_bmap_lookup_contig: broken bmap (inode number=6069196)
These are typical of some IO error, even a transient one, or
more simply the outcome of a crash on a system with poorly
implemented write barriers. I notice that the errors happen at a
time offset of "1637" seconds, that is, less than 30 minutes
after a reboot.
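The arithmetic behind that remark can be checked from the log prefix itself. What follows is only a minimal Python sketch, assuming the bracketed number in a kernel log line is the usual printk timestamp in seconds since boot; the helper name `uptime_minutes` is made up for illustration.

```python
import re

# One of the log lines quoted in this thread, joined onto a single line.
LOG_LINE = (
    "Jun 28 15:57:08 c1-df kernel: [ 1637.138168] NILFS (sdd1): bad btree node "
    "(ino=6069196, blocknr=446552325): level = 69, flags = 0x24, nchildren = 8203"
)


def uptime_minutes(line):
    """Return minutes since boot taken from the bracketed timestamp, or None.

    Assumption: the bracketed value is seconds since boot.
    """
    match = re.search(r"\[\s*(\d+\.\d+)\]", line)
    if match is None:
        return None
    return float(match.group(1)) / 60


minutes = uptime_minutes(LOG_LINE)
print(f"error logged about {minutes:.1f} minutes after boot")
```

For the line above this gives a little over 27 minutes, which matches the "less than 30 minutes" reading.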
> I have already checked and there are no errors on the disk.
How did you check?
> How can I fix this?
Mount earlier checkpoints until you find a "clean" one, then
delete later "unclean" checkpoints. If there is no earlier
"clean" checkpoint, some IO error has damaged existing data.
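As a rough illustration of that procedure, here is a dry-run Python sketch that only builds and prints the commands one might try for a single candidate checkpoint. It assumes the nilfs-utils tools `chcp` and `rmcp` and the `cp=` mount option behave as their man pages describe (a checkpoint has to be turned into a snapshot before it can be mounted, and `rmcp` accepts a `first..last` range). The mount point `/mnt/check`, the checkpoint numbers and both function names are hypothetical. Nothing is executed, and whether the tools cooperate on a file system that keeps going read-only is not something this sketch addresses.

```python
import shlex


def checkpoint_trial_commands(device, mountpoint, cno):
    """Commands to inspect one checkpoint read-only (dry run, nothing is executed)."""
    cno = str(int(cno))  # reject anything that is not a plain number
    return [
        # Assumption: a checkpoint must be a snapshot ("ss") to be mountable.
        ["chcp", "ss", device, cno],
        ["mount", "-t", "nilfs2", "-o", f"ro,cp={cno}", device, mountpoint],
        ["umount", mountpoint],
        # Turn it back into a plain checkpoint afterwards.
        ["chcp", "cp", device, cno],
    ]


def removal_command(device, first_bad, last_bad):
    """Command to delete a range of checkpoints judged unclean (dry run)."""
    return ["rmcp", device, f"{int(first_bad)}..{int(last_bad)}"]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for argv in checkpoint_trial_commands("/dev/sdd1", "/mnt/check", 1000):
        print(shlex.join(argv))
    print(shlex.join(removal_command("/dev/sdd1", 1001, 1010)))
```

Printing rather than running keeps the destructive `rmcp` step a deliberate, manual decision, which seems prudent when it is not yet clear which checkpoints are intact.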
* Re: nilfs bad btree node and broken bmap
From: Peter Grandi @ 2017-07-01 23:33 UTC
To: Linux fs NILFS
[ ... ]
>> Jun 28 15:57:08 c1-df kernel: [ 1637.138168] NILFS (sdd1): bad btree node (ino=6069196, blocknr=446552325): level = 69, flags = 0x24, nchildren = 8203
>> Jun 28 15:57:08 c1-df kernel: [ 1637.139363] NILFS error (device sdd1): nilfs_bmap_lookup_contig: broken bmap (inode number=6069196)
[ ... ]
>> How can I fix this?
> Mount earlier checkpoints until you find a "clean" one, then
> delete later "unclean" checkpoints. If there is no earlier
> "clean" checkpoint, some IO error has damaged existing data.
Further words of caution: the errors above can be merely
inconsistencies in very recent metadata due to a crash, or they
can be the result of some IO error damaging existing data.
In the latter case we know for sure that some NILFS2 metadata
has been corrupted, but we don't know whether some _data_ inside
existing files has been corrupted too, as NILFS2 does not do
consistency checks for data/payload.
As the documentation says, it actually checksums each "block",
but currently this is used only to detect incomplete metadata
after a crash, and no verification of the checksum is made for
data.
* Re: nilfs bad btree node and broken bmap
From: Rodrigo Severo @ 2017-07-03 19:12 UTC
To: Peter Grandi; +Cc: Linux fs NILFS
2017-06-30 15:41 GMT-03:00 Peter Grandi <pg-9Cpm1x5jwYpury8xqvH7Iip2UmYkHbXO@public.gmane.org>:
>> I have an 8 TB nilfs2 disk which started changing to read-only by itself.
>
> Highly unlikely.
Why do you say so? Do you think I'm requesting a read-only remount
myself? I can assure you I'm not. It's some error-detection routine
that triggers this read-only remount.
>> My log files are full of the following error messages:
>
>> Jun 28 15:57:08 c1-df kernel: [ 1637.138168] NILFS (sdd1): bad btree node (ino=6069196, blocknr=446552325): level = 69, flags = 0x24, nchildren = 8203
>> Jun 28 15:57:08 c1-df kernel: [ 1637.139363] NILFS error (device sdd1): nilfs_bmap_lookup_contig: broken bmap (inode number=6069196)
>
> These are typical of some IO error, even a transient one, or
> more simply the outcome of a crash on a system with poorly
> implemented write barriers. I notice that the errors happen at a
> time offset of "1637" seconds, that is, less than 30 minutes
> after a reboot.
Yes, these messages appeared just under 30 minutes after I rebooted
the machine to see whether a reboot (and the resulting fresh mount)
would fix the issue.
This device is automatically remounted read-only very soon after
being mounted.
>> I have already checked and there are no errors on the disk.
>
> How did you check?
smartctl -t long /dev/sdX
resulted in a "No error found" report log.
>> How can I fix this?
>
> Mount earlier checkpoints until you find a "clean" one, then
> delete later "unclean" checkpoints. If there is no earlier
> "clean" checkpoint, some IO error has damaged existing data.
While I was trying this, the checkpoint list suddenly shrank from the
thousands of entries it had to fewer than 25, and it also started to
show a few corrupted checkpoints at the beginning of the list:
# lscp /dev/sdd1
                 CNO  DATE                        TIME      MODE  FLG                BLKCNT                  ICNT
 9023380516061122833  172852864--1725596255-48    06:50:13  ss    i     4950690465199330146   2045974694022012474
 5296068189917196282  1778420707--1725596255-48   13:21:03  cp    -    14177017742416589113   7491717754959394247
 1157707752023265537  -1859323591--1725596255-48  16:23:13  ss    -    12280754577374290737  15446864286208281490
 1609204614858096249  1999416923--1725596255-48   21:08:38  ss    i     3600571124657065622  12156172216455180106
16492959802278112302  -1005506824--1725596255-48  05:37:43  ss    -    11759415802487770311  15182905133685067495
 3737236050226985035  -1235112073--1725596255-48  02:12:42  ss    -     7876034599887446652   4574778715896039703
 1897681074103846001  -993498507--1725596255-48   11:15:02  ss    -     4713367178533865852  12495665640457683873
 5629031927472857828  -1020115369--1725596255-48  22:25:13  ss    -     4617052897507868927  12971493170585062941
 2526176666379534699  -1001874240--1725596255-48  00:39:51  ss    -     3112984405336493942   6943670751485548929
              531055  2017-05-16                  09:38:15  cp    -              1524462858              38902090
              531056  2017-05-16                  09:38:20  cp    -              1524461428              38902076
              531057  2017-05-16                  09:38:26  cp    -              1524461427              38902075
              531218  2017-05-16                  09:52:12  cp    -              1524461427              38902075
              531219  2017-05-16                  09:52:17  cp    -              1524461427              38902075
              531223  2017-05-16                  09:52:39  cp    -              1524461427              38902075
              531228  2017-05-16                  09:53:10  cp    -              1524461427              38902075
              531229  2017-05-16                  09:53:17  cp    -              1524461427              38902075
              531230  2017-05-16                  09:53:23  cp    -              1524461427              38902075
              531242  2017-05-16                  09:54:29  cp    -              1524461427              38902075
              531323  2017-05-16                  10:01:25  cp    -              1524461427              38902075
              531443  2017-05-16                  10:11:40  cp    -              1524461427              38902075
              531455  2017-05-16                  10:12:41  cp    -              1524461427              38902076
              531456  2017-05-16                  10:12:45  cp    -              1524461484              38902076
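To separate the entries in a listing like this that at least look sane from the obviously corrupted ones, something along the following lines might help. It is only a Python sketch working on pasted text, assuming every entry has the seven fields named in the header (CNO DATE TIME MODE FLG BLKCNT ICNT) and that a real date has the form YYYY-MM-DD; the function names are made up. A plausible-looking row says nothing about whether the checkpoint behind it is actually intact.

```python
from datetime import datetime

FIELDS_PER_ROW = 7  # CNO DATE TIME MODE FLG BLKCNT ICNT


def split_rows(text):
    """Group whitespace-separated tokens into 7-field rows, skipping the header.

    Working on tokens rather than lines also copes with rows that were
    wrapped across two lines.
    """
    tokens = text.split()
    if tokens[:1] == ["CNO"]:
        tokens = tokens[FIELDS_PER_ROW:]
    usable = len(tokens) - len(tokens) % FIELDS_PER_ROW
    return [tokens[i:i + FIELDS_PER_ROW] for i in range(0, usable, FIELDS_PER_ROW)]


def looks_plausible(row):
    """True if the date and time parse and the mode is 'cp' or 'ss'."""
    cno, date, time, mode, _flg, _blkcnt, _icnt = row
    try:
        datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return False
    return cno.isdigit() and mode in ("cp", "ss")


# A few entries copied from the listing above.
sample = """\
CNO DATE TIME MODE FLG BLKCNT ICNT
9023380516061122833 172852864--1725596255-48 06:50:13 ss i
4950690465199330146 2045974694022012474
5296068189917196282 1778420707--1725596255-48 13:21:03 cp -
14177017742416589113 7491717754959394247
531055 2017-05-16 09:38:15 cp - 1524462858 38902090
531056 2017-05-16 09:38:20 cp - 1524461428 38902076
"""

rows = split_rows(sample)
plausible = [int(r[0]) for r in rows if looks_plausible(r)]
suspect = [r[0] for r in rows if not looks_plausible(r)]
print("plausible checkpoints:", plausible)
print("suspect checkpoints:", suspect)
```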
I wonder whether nilfs2 just isn't for me.
Thanks for your help and attention,
Rodrigo