linux-lvm.redhat.com archive mirror
* [linux-lvm] corruption on reattaching cache
From: Markus Mikkolainen @ 2016-06-09 20:24 UTC
  To: linux-lvm; +Cc: mark

I seem to have hit the same snag as Mark describes in his post.

https://www.redhat.com/archives/linux-lvm/2015-April/msg00025.html

With kernel 4.4.6 I detached (--splitcache) a writeback cache from a 
mounted LV; the cache was synchronized and then detached. I then reattached 
it and, shortly afterwards, detached it again. Interestingly, after the 
second detach it synchronized AGAIN, starting from 100%, and then I 
started getting filesystem errors. I immediately shut down and forced an 
fsck. I didn't lose much data, but there was still some damage to correct.

It looked to me as if a detached cache, when reattached, retains all the 
cached data on it, even though that data was supposed to have been written 
to the backing disk. Instead of being marked clean on attach, it continues 
serving old data from the cache.
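
One way to observe this kind of behaviour from the outside is to read the
cache's dirty block count before and after each split and reattach. The
following is only a minimal Python sketch, assuming an LVM version whose
`lvs` reports the `cache_dirty_blocks` field. The helper names and the
`vg0/data` names are hypothetical; the code builds the command and parses
its output, but does not run anything.

```python
import shlex


def dirty_blocks_command(vg, lv):
    """Build an `lvs` call that prints only the dirty block count of a cached LV."""
    return ["lvs", "--noheadings", "-o", "cache_dirty_blocks", f"{vg}/{lv}"]


def parse_dirty_blocks(lvs_output):
    """Parse the single-field output of the command above into an int."""
    text = lvs_output.strip()
    if not text:
        # Assumption: lvs leaves the field blank when no cache is attached.
        raise ValueError("no dirty block count reported; is the LV cached?")
    return int(text)


if __name__ == "__main__":
    # Hypothetical names; print the command instead of executing it.
    print(shlex.join(dirty_blocks_command("vg0", "data")))
```

A count that is non-zero right after reattaching a pool that was fully
synchronized before the split would match the symptom described above.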


* Re: [linux-lvm] corruption on reattaching cache
From: Zdenek Kabelac @ 2016-06-10  9:09 UTC
  To: LVM general discussion and development; +Cc: mark

On 9.6.2016 at 22:24, Markus Mikkolainen wrote:
> I seem to have hit the same snag as Mark describes in his post.
>
> https://www.redhat.com/archives/linux-lvm/2015-April/msg00025.html
>
> with kernel 4.4.6 I detached (--splitcache) a writeback cache from a mounted
> lv which was then synchronized and detached. Then I reattached it and shortly
> detached it again. What was interesting is that after the second detach it
> synchronized AGAIN starting from 100% , and then I started getting filesystem
> errors. I immediately shutdown, and forced an fsck , and didnt lose that much
> data, but still had some stuff to correct.
>
> It looked to me like a detached cache, being reattached will retain all cached
> data on it, even though it was supposed to be written to the backing disk, and
> then instead of marking it clean on attaching, it will continue serving old
> data from the cache.
>


Yes, this is a known issue; --splitcache is meant rather for 'debugging' purposes.
Use --uncache and create a new cache when needed.

A split cache needs to be cleared on reattachment, but that needs further 
code rework.
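
The suggested workaround can be sketched roughly as follows. This is a
minimal Python illustration, not a tested procedure: it only builds the
command lines in order and prints them. The function name, the volume
group `vg0`, the LV names `data` and `cpool`, the size `10G` and the
device `/dev/sdb1` are all hypothetical, and it assumes the origin LV
still has its cache attached.

```python
import shlex


def recache_commands(vg, origin, pool, size, fast_pv, cachemode="writethrough"):
    """Return the commands, in order, for replacing a cache instead of splitting it."""
    origin_lv = f"{vg}/{origin}"
    return [
        # Flush dirty blocks, then detach AND delete the old cache pool.
        ["lvconvert", "--uncache", origin_lv],
        # Create a brand-new, empty cache pool on the fast device.
        ["lvcreate", "--type", "cache-pool", "-L", size, "-n", pool, vg, fast_pv],
        # Attach the fresh pool to the origin LV.
        ["lvconvert", "--type", "cache", "--cachemode", cachemode,
         "--cachepool", f"{vg}/{pool}", origin_lv],
    ]


if __name__ == "__main__":
    # Hypothetical names; review the printed commands before running any of them.
    for cmd in recache_commands("vg0", "data", "cpool", "10G", "/dev/sdb1"):
        print(shlex.join(cmd))
```

Because the old pool is deleted rather than kept, there is no stale cache
content left to be served after the new pool is attached.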

The idea behind it is that we want to support 'offline' writeback of data, as at
the moment the cache target doesn't work well if there is any disk error, i.e. if
the cache is in writeback mode and has an 'error' sector, you can't clean such a cache...

Regards

Zdenek


* Re: [linux-lvm] corruption on reattaching cache
From: Markus Mikkolainen @ 2016-06-10 10:51 UTC
  To: LVM general discussion and development; +Cc: mark


Could there at least be some kind of warning if reattaching will 
result in destroying your filesystem? Or could "--splitcache" warn 
that it is a debug feature and that the result might be bad?

Basically NOTHING suggested to me that I might be doing something that 
could destroy my filesystem.


* Re: [linux-lvm] corruption on reattaching cache
From: Mark Hills @ 2016-06-11  9:24 UTC
  To: Zdenek Kabelac; +Cc: LVM general discussion and development

On Fri, 10 Jun 2016, Zdenek Kabelac wrote:

> Yes - known issue,  --splitcache is rather for 'debugging' purposes.
> Use --uncache  and create new cache when needed.
> 
> Splitted cache needs to be cleared on reattachment - but that needs further
> code rework.

It's ok that this is part of a wider picture.

If it is incomplete, it might be wise to block the user from doing the 
operation, or force them to confirm at the time of reattaching (along with 
a summary of the risk).

Or, if '--splitcache' is completely for debugging purposes, then it 
probably should be removed from the section "Cache removal" of the 
lvmcache(7) man page.

In my case I was following the instructions on the page, which state that 
the result is an "unused cache pool LV". I wrongly understood that to mean 
one which is the same as a newly-created one with the same parameters.

As with Markus, I also experienced data corruption which I was lucky to 
spot, and lucky to have a backup to restore from.
 
> The idea behind is - we want to support 'offline' writeback of data as ATM
> cache target doesn't work well if there is any disk error - i.e. cache is in
> writeback mode and has 'error' sector - you can't clean such cache...

Interesting... is there scope for long-term writeback caching in this 
design? My own personal use case is I would like to spin down the hard 
drive in the machine for the majority of the time.

Many thanks

-- 
Mark


* Re: [linux-lvm] corruption on reattaching cache
From: Zdenek Kabelac @ 2016-06-13  8:48 UTC
  To: linux-lvm

On 10.6.2016 at 12:51, Markus Mikkolainen wrote:
>
> could there atleast be some kind of a warning if the reattaching will result
> in you destroying your filesystem? or possibly make the "--splitcache" warn
> that this is a debug feature and the result might be bad?
>

I'll try to provide a fix so that reattaching without
zeroing requires passing -Zn to lvconvert.


> basically NOTHING suggested to me that i might be doing something that could
> destroy my filesystem.

Yep

