* dm-cache refusing to come up again after a crash
@ 2013-12-06 15:49 Steinar H. Gunderson
2013-12-06 17:57 ` Joe Thornber
0 siblings, 1 reply; 9+ messages in thread
From: Steinar H. Gunderson @ 2013-12-06 15:49 UTC
To: dm-devel
Linux (3.12.0-rc5) hung, and on boot, I can't get the dm-cache up again:
(initramfs) echo 0 23440891904 cache /dev/cache/metadata /dev/cache/blocks /dev/md1 1024 1 writeback default 4 random_threshold 8 sequential_threshold 512 | dmsetup create cache -u CACHE-0a8bb56fc873c195bf7117af925c7f08
device-mapper: reload ioctl on cache failed: Input/output error
Command failed
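
For reference, the table line piped into dmsetup above appears to follow the dm-cache target layout from the kernel's cache target documentation: start, length, target name, metadata device, cache device, origin device, block size, feature count plus features, then policy name, policy argument count and key/value pairs. A minimal Python sketch of splitting it into those fields follows. The names CacheTable and parse_cache_table are hypothetical, and 512-byte sectors are an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List

SECTOR_BYTES = 512  # assumption: device-mapper sectors are 512 bytes


@dataclass
class CacheTable:
    """Fields of a dm-cache table line (hypothetical helper type)."""
    start: int
    length: int
    metadata_dev: str
    cache_dev: str
    origin_dev: str
    block_size: int
    features: List[str]
    policy: str
    policy_args: Dict[str, int]


def parse_cache_table(line: str) -> CacheTable:
    tok = line.split()
    if len(tok) < 10 or tok[2] != "cache":
        raise ValueError("not a dm-cache table line")
    n_features = int(tok[7])
    features = tok[8:8 + n_features]
    i = 8 + n_features                 # index of the policy name
    policy = tok[i]
    n_policy_args = int(tok[i + 1])
    args = tok[i + 2:i + 2 + n_policy_args]
    if len(features) != n_features or len(args) != n_policy_args or n_policy_args % 2:
        raise ValueError("argument counts do not match the tokens present")
    # Policy arguments come as alternating key / value tokens.
    policy_args = {k: int(v) for k, v in zip(args[0::2], args[1::2])}
    return CacheTable(int(tok[0]), int(tok[1]), tok[3], tok[4], tok[5],
                      int(tok[6]), features, policy, policy_args)


# The table line from the message above.
LINE = ("0 23440891904 cache /dev/cache/metadata /dev/cache/blocks /dev/md1 "
        "1024 1 writeback default 4 random_threshold 8 sequential_threshold 512")

table = parse_cache_table(LINE)
print(table)
print("origin size in bytes:", table.length * SECTOR_BYTES)
print("cache block size in bytes:", table.block_size * SECTOR_BYTES)
```

Read this way, the line asks for a writeback cache with 1024-sector blocks and the "default" policy with two tunables.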
The kernel complains with
[ 639.189756] attempt to access beyond end of device
[ 639.189761] dm-0: rw=0, want=18445688752888627208, limit=1048576
[ 639.189764] device-mapper: transaction manager: couldn't open metadata space map
[ 639.189767] device-mapper: cache metadata: tm_open_with_sm failed
[ 639.283130] device-mapper: table: 254:2: cache: Error creating metadata object
[ 639.283134] device-mapper: ioctl: error adding target to table
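
A quick bit of arithmetic on the "want" and "limit" values in that log, as a sketch only. It assumes both are counts of 512-byte sectors, which is how the block layer normally reports them. The variable names are made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: want/limit are in 512-byte sectors

want = 18445688752888627208   # sector the kernel tried to read on dm-0
limit = 1048576               # size of dm-0 in sectors

limit_bytes = limit * SECTOR_BYTES
gap_to_wrap = 2**64 - want    # distance from the top of the unsigned 64-bit range
as_signed = want - 2**64      # same bit pattern read as a signed 64-bit value

print("device size:", limit_bytes // (1024 * 1024), "MiB")
print("requested sector exceeds device by a factor of about", want // limit)
print("distance below 2**64:", gap_to_wrap)
print("as signed 64-bit:", as_signed)
```

The device size works out to 512 MiB, while the requested sector sits near the top of the 64-bit range. That looks more like a corrupted or wrapped block address than a plausible location on the metadata device, though the log alone does not say which on-disk structure produced it.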
Is there anything I can do short of nuking the metadata partition
and taking the loss of whatever wasn't written back?
/* Steinar */
--
Homepage: http://www.sesse.net/
* Re: dm-cache refusing to come up again after a crash
2013-12-06 15:49 dm-cache refusing to come up again after a crash Steinar H. Gunderson
@ 2013-12-06 17:57 ` Joe Thornber
2013-12-06 19:16 ` Steinar H. Gunderson
0 siblings, 1 reply; 9+ messages in thread
From: Joe Thornber @ 2013-12-06 17:57 UTC
To: device-mapper development
On Fri, Dec 06, 2013 at 04:49:14PM +0100, Steinar H. Gunderson wrote:
> Linux (3.12.0-rc5) hung, and on boot, I can't get the dm-cache up again:
>
> (initramfs) echo 0 23440891904 cache /dev/cache/metadata /dev/cache/blocks /dev/md1 1024 1 writeback
> default 4 random_threshold 8 sequential_threshold 512 | dmsetup create cache -u CACHE-0a8bb56fc873c195bf7117af925c7f08
> device-mapper: reload ioctl on cache failed: Input/output error
> Command failed
>
> The kernel complains with
>
> [ 639.189756] attempt to access beyond end of device
> [ 639.189761] dm-0: rw=0, want=18445688752888627208, limit=1048576
> [ 639.189764] device-mapper: transaction manager: couldn't open metadata space map
> [ 639.189767] device-mapper: cache metadata: tm_open_with_sm failed
> [ 639.283130] device-mapper: table: 254:2: cache: Error creating metadata object
> [ 639.283134] device-mapper: ioctl: error adding target to table
>
> Is there anything I can do short of nuking the metadata partition
> and taking the loss of whatever wasn't written back?
Yep, grab:
https://github.com/jthornber/thin-provisioning-tools
build, and then try cache_check on it (which should tell you what's
wrong). Other programs to play with are cache_dump, cache_restore and
cache_repair.
Let me know how it goes,
- Joe
* Re: dm-cache refusing to come up again after a crash
2013-12-06 17:57 ` Joe Thornber
@ 2013-12-06 19:16 ` Steinar H. Gunderson
2013-12-06 19:35 ` Steinar H. Gunderson
2013-12-09 10:28 ` Joe Thornber
0 siblings, 2 replies; 9+ messages in thread
From: Steinar H. Gunderson @ 2013-12-06 19:16 UTC
To: device-mapper development
On Fri, Dec 06, 2013 at 05:57:13PM +0000, Joe Thornber wrote:
> Yep, grab:
>
> https://github.com/jthornber/thin-provisioning-tools
>
> build, and then try cache_check on it (which should tell you what's
> wrong). Other programs to play with are cache_dump, cache_restore and
> cache_repair.
Well, first of all, it doesn't compile, since you use typename outside of
templates :-) Fixing that is easy, though. But afterwards:
root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/md1
examining superblock
superblock is corrupt
bad checksum in superblock
So where do I want to go from there? cache_dump doesn't want to play with the
superblock because the checksum is bad... do I want cache_repair, then? Do I
want to take a backup of anything first?
/* Steinar */
--
Homepage: http://www.sesse.net/
* Re: dm-cache refusing to come up again after a crash
2013-12-06 19:16 ` Steinar H. Gunderson
@ 2013-12-06 19:35 ` Steinar H. Gunderson
2013-12-06 19:53 ` Steinar H. Gunderson
2013-12-09 10:31 ` Joe Thornber
2013-12-09 10:28 ` Joe Thornber
1 sibling, 2 replies; 9+ messages in thread
From: Steinar H. Gunderson @ 2013-12-06 19:35 UTC
To: device-mapper development
On Fri, Dec 06, 2013 at 08:16:05PM +0100, Steinar H. Gunderson wrote:
> Well, first of all, it doesn't compile, since you use typename outside of
> templates :-) Fixing that is easy, though. But afterwards:
>
> root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/md1
> examining superblock
> superblock is corrupt
> bad checksum in superblock
Sorry, wrong device:
root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/cache/metadata
examining superblock
examining mapping array
no hint array present
examining discard bitset
root@ubuntu:~/thin-provisioning-tools# echo $?
0
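
The same check can be scripted. The sketch below runs a checker command, captures its output and treats exit status 0 as "no errors reported". The run_check helper is a hypothetical name, and a stand-in command is used so the example runs without the tools installed; on a real system the argument list would be something like ["./cache_check", "/dev/cache/metadata"].

```python
import subprocess
import sys
from typing import List, Tuple


def run_check(argv: List[str]) -> Tuple[bool, str]:
    """Run a checker; return (exit status was 0, combined stdout/stderr)."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
    return proc.returncode == 0, proc.stdout


# Stand-in for cache_check so the sketch is runnable anywhere.
stand_in = [sys.executable, "-c", "print('examining superblock')"]
ok, output = run_check(stand_in)
print("clean" if ok else "errors reported")
print(output, end="")
```

As the rest of this thread shows, a zero exit status only means the checker found nothing wrong in the structures it examined; the kernel may still refuse the metadata.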
Does that mean it ought to have worked better? :-)
/* Steinar */
--
Homepage: http://www.sesse.net/
* Re: dm-cache refusing to come up again after a crash
2013-12-06 19:35 ` Steinar H. Gunderson
@ 2013-12-06 19:53 ` Steinar H. Gunderson
2013-12-07 0:16 ` Steinar H. Gunderson
2013-12-09 10:31 ` Joe Thornber
1 sibling, 1 reply; 9+ messages in thread
From: Steinar H. Gunderson @ 2013-12-06 19:53 UTC
To: device-mapper development
On Fri, Dec 06, 2013 at 08:35:41PM +0100, Steinar H. Gunderson wrote:
> root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/cache/metadata
And I forgot:
root@ubuntu:~/thin-provisioning-tools# ./cache_dump /dev/cache/metadata
<superblock uuid="" block_size="512" nr_cache_blocks="865560" policy="cleaner" hint_width="4">
<mappings>
<mapping cache_block="0" origin_block="6118373" dirty="false"/>
<mapping cache_block="1" origin_block="6118275" dirty="false"/>
<mapping cache_block="2" origin_block="5934780" dirty="false"/>
[... lots of blocks, none of them dirty ...]
<mapping cache_block="505877" origin_block="889613" dirty="false"/>
<mapping cache_block="505878" origin_block="690575" dirty="false"/>
<mapping cache_block="505879" origin_block="875752" dirty="false"/>
</mappings>
<hints>
cache_dump: /usr/include/boost/smart_ptr/shared_ptr.hpp:418: T* boost::shared_ptr< <template-parameter-1-1> >::operator->() const [with T = caching::hint_array]: Assertion `px != 0' failed.
The cleaner policy is not the one I actually use, but I have used it in the
past, so I guess it's stuck somehow.
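
Since the dump above is cut off by the assertion and is therefore not well-formed XML, a line-oriented scan is one way to count mappings and dirty blocks without an XML parser. This is a sketch only: the regular expression is written against the attribute order shown in this output, and summarize_mappings is a hypothetical helper.

```python
import re
from typing import Tuple

# Matches mapping elements exactly as cache_dump printed them above.
MAPPING_RE = re.compile(
    r'<mapping cache_block="(\d+)" origin_block="(\d+)" dirty="(true|false)"/>'
)


def summarize_mappings(dump_text: str) -> Tuple[int, int]:
    """Return (number of mappings, number marked dirty)."""
    total = 0
    dirty = 0
    for match in MAPPING_RE.finditer(dump_text):
        total += 1
        if match.group(3) == "true":
            dirty += 1
    return total, dirty


# Excerpt of the output above, including the truncated tail.
SAMPLE = """\
<superblock uuid="" block_size="512" nr_cache_blocks="865560" policy="cleaner" hint_width="4">
<mappings>
<mapping cache_block="0" origin_block="6118373" dirty="false"/>
<mapping cache_block="1" origin_block="6118275" dirty="false"/>
<mapping cache_block="2" origin_block="5934780" dirty="false"/>
</mappings>
<hints>
"""

total, dirty = summarize_mappings(SAMPLE)
print("mappings:", total, "dirty:", dirty)
```

A dirty count of zero from a dump should be read with caution. The follow-up in this thread reports filesystem errors after the metadata was wiped on the strength of exactly that observation.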
/* Steinar */
--
Homepage: http://www.sesse.net/
* Re: dm-cache refusing to come up again after a crash
2013-12-06 19:53 ` Steinar H. Gunderson
@ 2013-12-07 0:16 ` Steinar H. Gunderson
0 siblings, 0 replies; 9+ messages in thread
From: Steinar H. Gunderson @ 2013-12-07 0:16 UTC
To: device-mapper development
On Fri, Dec 06, 2013 at 08:53:10PM +0100, Steinar H. Gunderson wrote:
> <superblock uuid="" block_size="512" nr_cache_blocks="865560" policy="cleaner" hint_width="4">
> <mappings>
> <mapping cache_block="0" origin_block="6118373" dirty="false"/>
> <mapping cache_block="1" origin_block="6118275" dirty="false"/>
> <mapping cache_block="2" origin_block="5934780" dirty="false"/>
> [... lots of blocks, none of them dirty ...]
> <mapping cache_block="505877" origin_block="889613" dirty="false"/>
> <mapping cache_block="505878" origin_block="690575" dirty="false"/>
> <mapping cache_block="505879" origin_block="875752" dirty="false"/>
> </mappings>
OK, so since all blocks were marked as non-dirty, I wiped the metadata
volume, which made the system boot just fine, but was seemingly a big
mistake; a lot of filesystems had more or less fatal errors.
I'm restoring from backup right now. (Yes, I have them.)
/* Steinar */
--
Homepage: http://www.sesse.net/
* Re: dm-cache refusing to come up again after a crash
2013-12-06 19:16 ` Steinar H. Gunderson
2013-12-06 19:35 ` Steinar H. Gunderson
@ 2013-12-09 10:28 ` Joe Thornber
2013-12-09 10:34 ` Steinar H. Gunderson
1 sibling, 1 reply; 9+ messages in thread
From: Joe Thornber @ 2013-12-09 10:28 UTC
To: device-mapper development
On Fri, Dec 06, 2013 at 08:16:05PM +0100, Steinar H. Gunderson wrote:
> On Fri, Dec 06, 2013 at 05:57:13PM +0000, Joe Thornber wrote:
> > Yep, grab:
> >
> > https://github.com/jthornber/thin-provisioning-tools
> >
> > build, and then try cache_check on it (which should tell you what's
> > wrong). Other programs to play with are cache_dump, cache_restore and
> > cache_repair.
>
> Well, first of all, it doesn't compile, since you use typename outside of
> templates :-) Fixing that is easy, though. But afterwards:
Grr, I thought that was fixed, what version of g++ are you using?
>
> root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/md1
> examining superblock
> superblock is corrupt
> bad checksum in superblock
>
> So where do I want to go from there? cache_dump doesn't want to play with the
> superblock because the checksum is bad... do I want cache_repair, then? Do I
> want to take a backup of anything first?
Ouch. Could you go through what happened please? Did dm-cache crash,
or did the machine die for some other reason?
- Joe
* Re: dm-cache refusing to come up again after a crash
2013-12-06 19:35 ` Steinar H. Gunderson
2013-12-06 19:53 ` Steinar H. Gunderson
@ 2013-12-09 10:31 ` Joe Thornber
1 sibling, 0 replies; 9+ messages in thread
From: Joe Thornber @ 2013-12-09 10:31 UTC
To: device-mapper development
On Fri, Dec 06, 2013 at 08:35:41PM +0100, Steinar H. Gunderson wrote:
> On Fri, Dec 06, 2013 at 08:16:05PM +0100, Steinar H. Gunderson wrote:
> > Well, first of all, it doesn't compile, since you use typename outside of
> > templates :-) Fixing that is easy, though. But afterwards:
> >
> > root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/md1
> > examining superblock
> > superblock is corrupt
> > bad checksum in superblock
>
> Sorry, wrong device:
>
> root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/cache/metadata
> examining superblock
> examining mapping array
> no hint array present
> examining discard bitset
> root@ubuntu:~/thin-provisioning-tools# echo $?
> 0
>
> Does that mean it ought to have worked better? :-)
Yes, this is good news. The damage is probably in the space maps
which get completely regenerated during a restore/repair.
I see from a later mail that you're having another issue with the
tools. I wonder if you could email me off list, and we'll work out
how I can get a copy of your metadata.
- Joe
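
One way the check/repair sequence mentioned above might be laid out, as a rough sketch rather than a tested procedure. It only builds the command lines and does not run anything. The -i and -o options for cache_repair are taken from the tool's current man page and are an assumption for the 2013 build discussed here; the file names and the plan_repair helper are hypothetical.

```python
from typing import List


def plan_repair(damaged: str, repaired: str) -> List[List[str]]:
    """Build (but do not run) a check / repair / re-check command sequence.

    The repair writes to a separate target so the damaged metadata is
    never modified in place.
    """
    if damaged == repaired:
        raise ValueError("repair output must differ from the damaged input")
    return [
        ["cache_check", damaged],
        # Assumption: -i/-o as in current thin-provisioning-tools.
        ["cache_repair", "-i", damaged, "-o", repaired],
        ["cache_check", repaired],
    ]


# Hypothetical paths: an image of the metadata volume and a fresh target.
for argv in plan_repair("metadata.img", "metadata-repaired.img"):
    print(" ".join(argv))
```

Writing the repaired metadata to a second target keeps the original available for inspection if the result turns out to be wrong.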
* Re: dm-cache refusing to come up again after a crash
2013-12-09 10:28 ` Joe Thornber
@ 2013-12-09 10:34 ` Steinar H. Gunderson
0 siblings, 0 replies; 9+ messages in thread
From: Steinar H. Gunderson @ 2013-12-09 10:34 UTC
To: device-mapper development
On Mon, Dec 09, 2013 at 10:28:11AM +0000, Joe Thornber wrote:
>> Well, first of all, it doesn't compile, since you use typename outside of
>> templates :-) Fixing that is easy, though. But afterwards:
> Grr, I thought that was fixed, what version of g++ are you using?
This is an Ubuntu 10.04 live CD, which was what I had handy.
It works fine in a Debian wheezy live CD (which I switched to later).
>> So where do I want to go from there? cache_dump doesn't want to play with the
>> superblock because the checksum is bad... do I want cache_repair, then? Do I
>> want to take a backup of anything first?
> Ouch. Could you go through what happened please? Did dm-cache crash,
> or did the machine die for some other reason?
The machine hung. I don't know entirely why (I don't have the logs).
I rebooted, and it refused to bring up the volume (this is what the original
post in this thread is about). After booting to a live CD and running
cache_check and cache_dump, I was convinced there were no dirty blocks,
so I nuked the entire metadata volume (using dd from /dev/zero).
This made the machine boot again, but with tons of filesystem errors on
anything I'd written to in the last few months, so I restored from backup
(thankfully I do have working backups!). I also upgraded to 3.13-rc3 in the
hopes of fixing whatever issue in 3.12 originally caused this; however, as
reported in the other thread, this was notoriously unstable, and after the
third crash, I was back into the “won't boot, but cache_check says everything
is fine” mode.
That's the current status; it's now standing in a live CD and not doing much
useful. I miss my machine :-) (And I hope I haven't lost data again.) Will it
help if I upload a dump of the 512MB metadata volume somewhere?
/* Steinar */
--
Homepage: http://www.sesse.net/
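
For taking a copy of a metadata volume before touching it, or for uploading it, a chunked copy that also computes a checksum is one option. The sketch below is illustrative only: copy_image is a hypothetical helper, and the demonstration runs on a small temporary file rather than a real block device such as /dev/cache/metadata.

```python
import hashlib
import os
import tempfile
from typing import Tuple


def copy_image(src_path: str, dst_path: str,
               chunk_size: int = 1024 * 1024) -> Tuple[int, str]:
    """Copy src to dst in chunks; return (bytes copied, SHA-256 hex digest)."""
    digest = hashlib.sha256()
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()


# Demonstration on a temporary file standing in for the metadata device.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "fake-metadata")
    image = os.path.join(workdir, "metadata.img")
    with open(source, "wb") as f:
        f.write(b"\x00" * 4096 + b"dm-cache demo" + b"\xff" * 1000)
    size, checksum = copy_image(source, image, chunk_size=1024)
    print("copied", size, "bytes, sha256", checksum)
```

Recording the checksum alongside the image lets the recipient confirm that the upload arrived intact.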
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel