* dm-cache refusing to come up again after a crash
From: Steinar H. Gunderson @ 2013-12-06 15:49 UTC
To: dm-devel

Linux (3.12.0-rc5) hung, and on boot, I can't get the dm-cache up again:

  (initramfs) echo 0 23440891904 cache /dev/cache/metadata /dev/cache/blocks /dev/md1 1024 1 writeback default 4 random_threshold 8 sequential_threshold 512 | dmsetup create cache -u CACHE-0a8bb56fc873c195bf7117af925c7f08
  device-mapper: reload ioctl on cache failed: Input/output error
  Command failed

The kernel complains with

  [ 639.189756] attempt to access beyond end of device
  [ 639.189761] dm-0: rw=0, want=18445688752888627208, limit=1048576
  [ 639.189764] device-mapper: transaction manager: couldn't open metadata space map
  [ 639.189767] device-mapper: cache metadata: tm_open_with_sm failed
  [ 639.283130] device-mapper: table: 254:2: cache: Error creating metadata object
  [ 639.283134] device-mapper: ioctl: error adding target to table

Is there anything I can do short of nuking the metadata partition
and taking the loss of whatever wasn't written back?

/* Steinar */
--
Homepage: http://www.sesse.net/
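Editor's sketch: the table line piped into dmsetup above packs a lot into one string, so it may help to see it split into fields. The snippet below assumes the cache target layout described in the kernel's device-mapper cache documentation (metadata device, cache device, origin device, block size in sectors, feature arguments, then policy and its arguments). The names `CacheTable` and `parse_cache_table` are made up for illustration and are not part of any tool.

```python
from dataclasses import dataclass

SECTOR_BYTES = 512  # device-mapper tables are expressed in 512-byte sectors

# The table line from the message above, copied verbatim.
TABLE = (
    "0 23440891904 cache /dev/cache/metadata /dev/cache/blocks /dev/md1 "
    "1024 1 writeback default 4 random_threshold 8 sequential_threshold 512"
)


@dataclass
class CacheTable:
    """Hypothetical container for the fields of a dm-cache table line."""
    start: int
    length: int
    metadata_dev: str
    cache_dev: str
    origin_dev: str
    block_sectors: int
    features: list
    policy: str
    policy_args: dict


def parse_cache_table(line: str) -> CacheTable:
    tok = line.split()
    if len(tok) < 10 or tok[2] != "cache":
        raise ValueError("not a dm-cache table line")
    n_feat = int(tok[7])               # number of feature arguments
    features = tok[8:8 + n_feat]
    pos = 8 + n_feat                   # index of the policy name
    policy = tok[pos]
    n_pol = int(tok[pos + 1])          # number of policy argument words
    args = tok[pos + 2:pos + 2 + n_pol]
    if len(args) != n_pol or n_pol % 2:
        raise ValueError("policy arguments are not complete key/value pairs")
    # Policy arguments come as "key value" pairs.
    policy_args = dict(zip(args[0::2], (int(v) for v in args[1::2])))
    return CacheTable(int(tok[0]), int(tok[1]), tok[3], tok[4], tok[5],
                      int(tok[6]), features, policy, policy_args)


table = parse_cache_table(TABLE)
print("origin size (bytes):", table.length * SECTOR_BYTES)
print("cache block size (KiB):", table.block_sectors * SECTOR_BYTES // 1024)
print("features:", table.features, "policy:", table.policy, table.policy_args)
```

Read this way, the line describes an origin of 23440891904 sectors on /dev/md1, cached in 1024-sector blocks, in writeback mode, with the default policy and two tunables.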
* Re: dm-cache refusing to come up again after a crash
From: Joe Thornber @ 2013-12-06 17:57 UTC
To: device-mapper development

On Fri, Dec 06, 2013 at 04:49:14PM +0100, Steinar H. Gunderson wrote:
> Linux (3.12.0-rc5) hung, and on boot, I can't get the dm-cache up again:
>
> (initramfs) echo 0 23440891904 cache /dev/cache/metadata /dev/cache/blocks /dev/md1 1024 1 writeback
> default 4 random_threshold 8 sequential_threshold 512 | dmsetup create cache -u CACHE-0a8bb56fc873c195bf7117af925c7f08
> device-mapper: reload ioctl on cache failed: Input/output error
> Command failed
>
> The kernel complains with
>
> [ 639.189756] attempt to access beyond end of device
> [ 639.189761] dm-0: rw=0, want=18445688752888627208, limit=1048576
> [ 639.189764] device-mapper: transaction manager: couldn't open metadata space map
> [ 639.189767] device-mapper: cache metadata: tm_open_with_sm failed
> [ 639.283130] device-mapper: table: 254:2: cache: Error creating metadata object
> [ 639.283134] device-mapper: ioctl: error adding target to table
>
> Is there anything I can do short of nuking the metadata partition
> and taking the loss of whatever wasn't written back?

Yep, grab:

https://github.com/jthornber/thin-provisioning-tools

build, and then try cache_check on it (which should tell you what's
wrong). Other programs to play with are cache_dump, cache_restore and
cache_repair.

Let me know how it goes,

- Joe
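Editor's sketch: the "attempt to access beyond end of device" line in the kernel log carries two numbers worth a quick sanity check. The snippet below only does arithmetic on that one log line. It assumes `want` and `limit` are both counted in 512-byte sectors, and the reading of `want` as a wrapped-around 64-bit value is an interpretation, not something the kernel message states. The function name `parse_beyond_end` is made up for illustration.

```python
import re

SECTOR_BYTES = 512  # assumption: both values are in 512-byte sectors

# The log line from the report, without the timestamp.
LOG_LINE = "dm-0: rw=0, want=18445688752888627208, limit=1048576"


def parse_beyond_end(line: str) -> tuple:
    """Extract (want, limit) from a 'beyond end of device' detail line."""
    m = re.search(r"want=(\d+), limit=(\d+)", line)
    if m is None:
        raise ValueError("no want=/limit= pair found")
    return int(m.group(1)), int(m.group(2))


want, limit = parse_beyond_end(LOG_LINE)

# Size of the device the kernel was reading from.
limit_mib = limit * SECTOR_BYTES // 2**20

# Reinterpret 'want' as a signed 64-bit number to see how close it is to 2**64.
as_signed = want - 2**64 if want >= 2**63 else want

print("device size (MiB):", limit_mib)
print("requested sector exceeds the device:", want > limit)
print("requested sector as signed 64-bit:", as_signed)
```

The device limit works out to 512 MiB, while the requested sector sits just below 2**64. That looks more like a garbage block pointer being followed than a device that is slightly too small, though the log alone does not prove it.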
* Re: dm-cache refusing to come up again after a crash
From: Steinar H. Gunderson @ 2013-12-06 19:16 UTC
To: device-mapper development

On Fri, Dec 06, 2013 at 05:57:13PM +0000, Joe Thornber wrote:
> Yep, grab:
>
> https://github.com/jthornber/thin-provisioning-tools
>
> build, and then try cache_check on it (which should tell you what's
> wrong). Other programs to play with are cache_dump, cache_restore and
> cache_repair.

Well, first of all, it doesn't compile, since you use typename outside of
templates :-) Fixing that is easy, though. But afterwards:

  root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/md1
  examining superblock
  superblock is corrupt
  bad checksum in superblock

So where do I want to go from there? cache_dump doesn't want to play with the
superblock because the checksum is bad... do I want cache_repair, then? Do I
want to take a backup of anything first?

/* Steinar */
--
Homepage: http://www.sesse.net/
* Re: dm-cache refusing to come up again after a crash
From: Steinar H. Gunderson @ 2013-12-06 19:35 UTC
To: device-mapper development

On Fri, Dec 06, 2013 at 08:16:05PM +0100, Steinar H. Gunderson wrote:
> Well, first of all, it doesn't compile, since you use typename outside of
> templates :-) Fixing that is easy, though. But afterwards:
>
> root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/md1
> examining superblock
> superblock is corrupt
> bad checksum in superblock

Sorry, wrong device:

  root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/cache/metadata
  examining superblock
  examining mapping array
  no hint array present
  examining discard bitset
  root@ubuntu:~/thin-provisioning-tools# echo $?
  0

Does that mean it ought to have worked better? :-)

/* Steinar */
--
Homepage: http://www.sesse.net/
* Re: dm-cache refusing to come up again after a crash
From: Steinar H. Gunderson @ 2013-12-06 19:53 UTC
To: device-mapper development

On Fri, Dec 06, 2013 at 08:35:41PM +0100, Steinar H. Gunderson wrote:
> root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/cache/metadata

And I forgot:

  root@ubuntu:~/thin-provisioning-tools# ./cache_dump /dev/cache/metadata
  <superblock uuid="" block_size="512" nr_cache_blocks="865560" policy="cleaner" hint_width="4">
  <mappings>
  <mapping cache_block="0" origin_block="6118373" dirty="false"/>
  <mapping cache_block="1" origin_block="6118275" dirty="false"/>
  <mapping cache_block="2" origin_block="5934780" dirty="false"/>
  [... lots of blocks, none of them dirty ...]
  <mapping cache_block="505877" origin_block="889613" dirty="false"/>
  <mapping cache_block="505878" origin_block="690575" dirty="false"/>
  <mapping cache_block="505879" origin_block="875752" dirty="false"/>
  </mappings>
  <hints>
  cache_dump: /usr/include/boost/smart_ptr/shared_ptr.hpp:418: T* boost::shared_ptr< <template-parameter-1-1> >::operator->() const [with T = caching::hint_array]: Assertion `px != 0' failed.

The cleaner policy is not the one I actually use, but I have used it in the
past, so I guess it's stuck somehow.

/* Steinar */
--
Homepage: http://www.sesse.net/
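Editor's sketch: with hundreds of thousands of mappings, "none of them dirty" is easier to confirm by counting than by eye. The snippet below assumes the cache_dump output was saved as text and that every mapping line has exactly the attribute layout shown in the excerpt above. It uses a regular expression rather than an XML parser because the dump was cut short by the assertion failure and so is not well-formed XML. The function name `summarize_mappings` is made up for illustration.

```python
import re

# Matches one mapping element in the attribute order shown in the excerpt.
MAPPING_RE = re.compile(
    r'<mapping cache_block="(\d+)" origin_block="(\d+)" dirty="(true|false)"/>'
)

# The six mapping lines visible in the message, plus the truncated tail.
SAMPLE = """\
<superblock uuid="" block_size="512" nr_cache_blocks="865560" policy="cleaner" hint_width="4">
<mappings>
<mapping cache_block="0" origin_block="6118373" dirty="false"/>
<mapping cache_block="1" origin_block="6118275" dirty="false"/>
<mapping cache_block="2" origin_block="5934780" dirty="false"/>
<mapping cache_block="505877" origin_block="889613" dirty="false"/>
<mapping cache_block="505878" origin_block="690575" dirty="false"/>
<mapping cache_block="505879" origin_block="875752" dirty="false"/>
</mappings>
<hints>
"""


def summarize_mappings(dump_text: str) -> tuple:
    """Return (total mappings, dirty mappings) found in a cache_dump text."""
    total = dirty = 0
    for match in MAPPING_RE.finditer(dump_text):
        total += 1
        if match.group(3) == "true":
            dirty += 1
    return total, dirty


total, dirty = summarize_mappings(SAMPLE)
print(f"{total} mappings, {dirty} dirty")
```

A dirty count of zero only says what the metadata records. As the rest of the thread shows, it did not turn out to mean that the cache could be discarded safely.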
* Re: dm-cache refusing to come up again after a crash
From: Steinar H. Gunderson @ 2013-12-07 0:16 UTC
To: device-mapper development

On Fri, Dec 06, 2013 at 08:53:10PM +0100, Steinar H. Gunderson wrote:
> <superblock uuid="" block_size="512" nr_cache_blocks="865560" policy="cleaner" hint_width="4">
> <mappings>
> <mapping cache_block="0" origin_block="6118373" dirty="false"/>
> <mapping cache_block="1" origin_block="6118275" dirty="false"/>
> <mapping cache_block="2" origin_block="5934780" dirty="false"/>
> [... lots of blocks, none of them dirty ...]
> <mapping cache_block="505877" origin_block="889613" dirty="false"/>
> <mapping cache_block="505878" origin_block="690575" dirty="false"/>
> <mapping cache_block="505879" origin_block="875752" dirty="false"/>
> </mappings>

OK, so since all blocks were marked as non-dirty, I wiped the metadata volume,
which made the system boot just fine, but was seemingly a big mistake; a lot
of filesystems had more or less fatal errors. I'm restoring from backup right
now. (Yes, I have them.)

/* Steinar */
--
Homepage: http://www.sesse.net/
* Re: dm-cache refusing to come up again after a crash
From: Joe Thornber @ 2013-12-09 10:31 UTC
To: device-mapper development

On Fri, Dec 06, 2013 at 08:35:41PM +0100, Steinar H. Gunderson wrote:
> On Fri, Dec 06, 2013 at 08:16:05PM +0100, Steinar H. Gunderson wrote:
> > Well, first of all, it doesn't compile, since you use typename outside of
> > templates :-) Fixing that is easy, though. But afterwards:
> >
> > root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/md1
> > examining superblock
> > superblock is corrupt
> > bad checksum in superblock
>
> Sorry, wrong device:
>
> root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/cache/metadata
> examining superblock
> examining mapping array
> no hint array present
> examining discard bitset
> root@ubuntu:~/thin-provisioning-tools# echo $?
> 0
>
> Does that mean it ought to have worked better? :-)

Yes, this is good news. The damage is probably in the space maps
which get completely regenerated during a restore/repair.

I see from a later mail that you're having another issue with the
tools. I wonder if you could email me off list, and we'll work out
how I can get a copy of your metadata.

- Joe
* Re: dm-cache refusing to come up again after a crash
From: Joe Thornber @ 2013-12-09 10:28 UTC
To: device-mapper development

On Fri, Dec 06, 2013 at 08:16:05PM +0100, Steinar H. Gunderson wrote:
> On Fri, Dec 06, 2013 at 05:57:13PM +0000, Joe Thornber wrote:
> > Yep, grab:
> >
> > https://github.com/jthornber/thin-provisioning-tools
> >
> > build, and then try cache_check on it (which should tell you what's
> > wrong). Other programs to play with are cache_dump, cache_restore and
> > cache_repair.
>
> Well, first of all, it doesn't compile, since you use typename outside of
> templates :-) Fixing that is easy, though. But afterwards:

Grr, I thought that was fixed, what version of g++ are you using?

>
> root@ubuntu:~/thin-provisioning-tools# ./cache_check /dev/md1
> examining superblock
> superblock is corrupt
> bad checksum in superblock
>
> So where do I want to go from there? cache_dump doesn't want to play with the
> superblock because the checksum is bad... do I want cache_repair, then? Do I
> want to take a backup of anything first?

Ouch. Could you go through what happened please? Did dm-cache crash,
or did the machine die for some other reason?

- Joe
* Re: dm-cache refusing to come up again after a crash
From: Steinar H. Gunderson @ 2013-12-09 10:34 UTC
To: device-mapper development

On Mon, Dec 09, 2013 at 10:28:11AM +0000, Joe Thornber wrote:
>> Well, first of all, it doesn't compile, since you use typename outside of
>> templates :-) Fixing that is easy, though. But afterwards:
> Grr, I thought that was fixed, what version of g++ are you using?

This is an Ubuntu 10.04 live CD, which was what I had handy.
It works fine in a Debian wheezy live CD (which I switched to later).

>> So where do I want to go from there? cache_dump doesn't want to play with the
>> superblock because the checksum is bad... do I want cache_repair, then? Do I
>> want to take a backup of anything first?
> Ouch. Could you go through what happened please? Did dm-cache crash,
> or did the machine die for some other reason?

The machine hung. I don't know entirely why (I don't have the logs).
I rebooted, and it refused to bring up the volume (this is what the original
post in this thread is about).

After booting to a live CD and running cache_check and cache_dump, I was
convinced there were no dirty blocks, so I nuked the entire metadata volume
(using dd from /dev/zero). This made the machine boot again, but with tons of
filesystem errors on anything I'd written to in the last few months, so I
restored from backup (thankfully I do have working backups!).

I also upgraded to 3.13-rc3 in the hopes of fixing whatever issue in 3.12
originally caused this; however, as reported in the other thread, this was
notoriously unstable, and after the third crash, I was back into the
“won't boot, but cache_check says everything is fine” mode.

That's the current status; it's now sitting in a live CD and not doing
much of use. I miss my machine :-) (And I hope I haven't lost data again.)

Will it help if I upload a dump of the 512MB metadata volume somewhere?

/* Steinar */
--
Homepage: http://www.sesse.net/

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel