public inbox for linux-bcache@vger.kernel.org
* Linux 3.11-rc4 Writeback Cache Corruption
@ 2013-09-03 21:26 Zachary Palmer
  2013-09-03 21:38 ` Gabriel de Perthuis
       [not found] ` <522656EC.8040002@gmail.com>
  0 siblings, 2 replies; 4+ messages in thread
From: Zachary Palmer @ 2013-09-03 21:26 UTC (permalink / raw)
  To: linux-bcache-u79uwXL29TY76Z2rM5mHXA

Hello,

I wrote in just a couple days ago mentioning problems involving 
hibernating/suspending my laptop.  I'm using Debian Wheezy at the 
moment, so please accept my apologies for not having commit hashes 
offhand.  I'm curious about how to test a very recent patch (committed 
less than an hour ago); I think I might have the same bug as someone who 
posted a message three weeks ago.  My story so far:

     * I started by running my root filesystem over LVM over LUKS over 
bcache over an HDD partition; the bcache device is cached by a partition 
on an SSD.  I have /dev/bcache0 in writethrough mode for now to be safe; 
I'll switch to writeback once things seem stable.  I was using the 
Debian-packaged kernel in Wheezy backports: 
linux-image-3.10-0.bpo.2-686-pae=3.10.5-1~bpo70+1.  While bcache seemed 
to operate correctly, a bug prevents the bcache device from shutting 
down when I attempt to suspend, hibernate, or shut down. (I hope it's 
all the same bug.)
     * After reading about someone else who had a similar issue, I tried 
upgrading to a kernel in Debian's experimental repository: 
linux-image-3.11-rc4-686-pae=3.11~rc4-1~exp1.  This kernel appears to 
suspend/hibernate correctly.  Unfortunately, it is also plagued by some 
kind of caching bug.  While I agree with Kent's comment that it seems 
obscure in principle, I'm sure I encountered this problem numerous times 
in the space of less than a day.  Shortly after booting into this 
kernel, (1) my copy of the bitcoin blockchain seemed to have a bad 
index; (2) LibreOffice, which had not been updated and previously 
worked, complained that it could not load one of its .so files; and (3) 
the checksum failed on the APT packages list stored on my drive.  After 
reading 
http://article.gmane.org/gmane.linux.kernel.bcache.devel/1898/match=silent+data+corruption+3.11+rc4+writethrough+mode 
, I rebooted into the 3.10 kernel, detached the cache device, and 
reattached it.  Everything seems to be fine again (except I'm back to 
not being able to suspend or hibernate).
     * I've just discovered that Kent may have committed a patch for 
exactly this bug less than an hour ago.  I'm interested and willing to 
try this thing out.
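
For reference, the detach/reattach step mentioned above goes through
bcache's sysfs files.  A minimal sketch, assuming the device is bcache0;
the function names and the BCACHE_SYSFS override are made up for
illustration, and the cache set UUID has to be supplied by the caller
(it is the directory name under /sys/fs/bcache/):

```shell
# Sketch only: helper names and BCACHE_SYSFS are illustrative assumptions.
# Writing to the real sysfs files requires root.
BCACHE_DEV="${BCACHE_DEV:-bcache0}"
BCACHE_SYSFS="${BCACHE_SYSFS:-/sys/block/$BCACHE_DEV/bcache}"

# Detach the backing device from its cache set.
bcache_detach() {
    echo 1 > "$BCACHE_SYSFS/detach"
}

# Attach the backing device to a cache set.
# $1: cache set UUID (directory name under /sys/fs/bcache/)
bcache_attach() {
    echo "$1" > "$BCACHE_SYSFS/attach"
}

# Select the caching mode, e.g. writethrough or writeback.
# $1: mode name
bcache_set_mode() {
    echo "$1" > "$BCACHE_SYSFS/cache_mode"
}
```

Run as root, this would be `bcache_detach` followed by
`bcache_attach <cset-uuid>`; reading the `state` file in the same
directory should show whether a cache is currently attached.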

So here's the question: how would I best go about testing this patch?  
In looking through the git history, it doesn't seem as if the 
bcache-for-3.11 branch has been synced against the Linux git since 
3.10-rc7 (on June 22nd).  I was thinking I could

     * Pull the Linux kernel source
     * Add the bcache git as a remote
     * Merge the bcache-for-3.11 branch into the Linux 3.11 mainline 
branch myself and
     * Assuming that this works, compile and boot the resulting kernel 
using my Debian kernel .config
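
The merge steps listed above could be scripted roughly as follows.  This
is only a sketch: the function name, the remote name `bcache`, and the
scratch branch `temp` are illustrative, and the bcache repository URL is
left as an argument rather than guessed.

```shell
# Sketch only: merge a bcache branch into an existing kernel checkout.
# $1: path to the kernel git checkout (already on the desired base)
# $2: URL or path of the bcache git repository (caller-supplied)
# $3: branch to merge (defaults to bcache-for-3.11)
# The function body runs in a subshell so the caller's directory
# and variables are left untouched.
merge_bcache_branch() (
    repo="$1"
    url="$2"
    branch="${3:-bcache-for-3.11}"

    cd "$repo" || exit 1
    git remote add bcache "$url" &&   # register the bcache repo as a remote
    git fetch -q bcache &&            # fetch its branches
    git checkout -q -b temp &&        # do the merge on a scratch branch
    git merge -q --no-edit "bcache/$branch"
)
```

If the merge reports conflicts, the scratch branch can simply be
discarded and the base branch is unaffected.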

Does this sound reasonable?  Or is there a better way to do this? I'm 
pretty happy with whatever gives me at least the behavior of my mainline 
3.10 kernel and I'm looking forward to getting bcache and laptop power 
modes on the same machine.  :)

Thanks,

Zach


* Re: Linux 3.11-rc4 Writeback Cache Corruption
  2013-09-03 21:26 Linux 3.11-rc4 Writeback Cache Corruption Zachary Palmer
@ 2013-09-03 21:38 ` Gabriel de Perthuis
       [not found] ` <522656EC.8040002@gmail.com>
  1 sibling, 0 replies; 4+ messages in thread
From: Gabriel de Perthuis @ 2013-09-03 21:38 UTC (permalink / raw)
  To: linux-bcache-u79uwXL29TY76Z2rM5mHXA

On Tue, 03 Sep 2013 17:26:40 -0400, Zachary Palmer wrote:
> So here's the question: how would I best go about testing this patch?  
> In looking through the git history, it doesn't seem as if the 
> bcache-for-3.11 branch has been synced against the Linux git since 
> 3.10-rc7 (on June 22nd).  I was thinking I could
> 
>      * Pull the Linux kernel source
>      * Add the bcache git as an origin
>      * Merge the bcache-for-3.11 branch into the Linux 3.11 mainline 
> branch myself and
>      * Assuming that this works, compile and boot the resulting kernel 
> using my Debian kernel .config
> 
> Does this sound reasonable?  Or is there a better way to do this? I'm 
> pretty happy with whatever gives me at least the behavior of my mainline 
> 3.10 kernel and I'm looking forward to getting bcache and laptop power 
> modes on the same machine.  :)

Yeah, it'll merge cleanly.  You can reuse the .config and build with
`make deb-pkg -j -l6`, which is slowly replacing make-kpkg functionality.
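
A rough sketch of that build step, assuming the old .config is available
as a file (on Debian it is normally installed as /boot/config-<version>).
The function name is made up; `make olddefconfig` is used here to accept
the default for every new option, and the `make deb-pkg -j -l6`
invocation is the one quoted above.

```shell
# Sketch only: reuse an existing .config and build Debian packages.
# $1: kernel source tree
# $2: existing .config file to reuse
# The function body runs in a subshell so the caller's directory
# is left untouched.
build_kernel_deb() (
    src="$1"
    config="$2"

    cp "$config" "$src/.config" || exit 1
    cd "$src" || exit 1
    make olddefconfig &&    # new options silently take their defaults
    make deb-pkg -j -l6     # parallel build, throttled at load average 6
)
```

The resulting .deb files are written to the directory above the source
tree and can be installed with dpkg -i.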


* Re: Linux 3.11-rc4 Writeback Cache Corruption
       [not found]     ` <5227F74F.5000305-J5qI5MFTcs8@public.gmane.org>
@ 2013-09-05  3:16       ` Zachary Palmer
       [not found]         ` <5227F774.7010002-J5qI5MFTcs8@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Zachary Palmer @ 2013-09-05  3:16 UTC (permalink / raw)
  To: linux-bcache-u79uwXL29TY76Z2rM5mHXA

Whoops; forgot to send this e-mail to the whole list.  :)
> So I have some unfortunate test results regarding this corruption 
> issue.  I tested my laptop on two kernels I built today.  The 
> procedure was as follows:
>
>     1. Check out bcache-3.10-stable.  The kernel was built using the 
> .config I have for Debian package 
> linux-image-3.10-0.bpo.2-686-pae=3.10.5-1~bpo70+1.  All new options 
> were left at defaults.
>     2. Begin by checking out the bcache git repository at 
> bcache-for-3.11.  Next, add the Linux stable git as a remote.  Then, "git 
> br temp; git co temp; git merge linux-3.11.y".  The merge applies 
> automatically.  Build the resulting kernel, again using the above 
> .config with all new options left at defaults.
>
> My hope was that one of these two kernels would resolve both (a) the 
> cache corruption issue and (b) my hibernate/suspend problem. This did 
> not appear to be the case.  Using kernel #2 above (Linux 3.11.0), 
> cache corruption was immediately evident; both apt-cacher and MySQL 
> failed to start due to segfaults.  Cache corruption was resolved by 
> detaching and reattaching the cache device under a clean kernel.
>
> Using kernel #1 above, I get the same results as the Debian stock 
> kernel for Wheezy backports (the one from the package named above): 
> bcache seems to work just fine until the kernel attempts to stop 
> devices for suspend, hibernate, or shutdown; at this point, bcache 
> times out waiting for the device to stop and the laptop never changes 
> power states.
>
> For the time being, it is easier for me to live without 
> suspend/hibernate than it is for me to migrate back to a cacheless 
> layout; moving all of that data around is time-consuming and I really 
> want to use bcache.  :)  If there is any information I could collect 
> with my machine that would help in the debugging process, please let 
> me know!
>
> Thanks,
>
> Zach


* Re: Linux 3.11-rc4 Writeback Cache Corruption
       [not found]         ` <5227F774.7010002-J5qI5MFTcs8@public.gmane.org>
@ 2013-09-05  4:08           ` Josep Lladonosa
  0 siblings, 0 replies; 4+ messages in thread
From: Josep Lladonosa @ 2013-09-05  4:08 UTC (permalink / raw)
  To: Zachary Palmer; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On 5 September 2013 05:16, Zachary Palmer <zachary.palmer-J5qI5MFTcs8@public.gmane.org> wrote:
> Whoops; forgot to send this e-mail to the whole list.  :)
>>
>> So I have some unfortunate test results regarding this corruption issue.
>> I tested my laptop on two kernels I built today.  The procedure was as
>> follows:
>>
>>     1. Check out bcache-3.10-stable.  The kernel was built using the .config I
>> have for Debian package linux-image-3.10-0.bpo.2-686-pae=3.10.5-1~bpo70+1.
>> All new options were left at defaults.
>>     2. Begin by checking out the bcache git repository at bcache-for-3.11.
>> Next, add the Linux stable git as a remote.  Then, "git br temp; git co temp;
>> git merge linux-3.11.y".  The merge applies automatically.  Build the
>> resulting kernel, again using the above .config with all new options left at
>> defaults.
>>
>> My hope was that one of these two kernels would resolve both (a) the cache
>> corruption issue and (b) my hibernate/suspend problem. This did not appear
>> to be the case.  Using kernel #2 above (Linux 3.11.0), cache corruption was
>> immediately evident; both apt-cacher and MySQL failed to start due to
>> segfaults.  Cache corruption was resolved by detaching and reattaching the
>> cache device under a clean kernel.
>>
>> Using kernel #1 above, I get the same results as the Debian stock kernel
>> for Wheezy backports (the one from the package named above): bcache seems to
>> work just fine until the kernel attempts to stop devices for suspend,
>> hibernate, or shutdown; at this point, bcache times out waiting for the
>> device to stop and the laptop never changes power states.
>>
>> For the time being, it is easier for me to live without suspend/hibernate
>> than it is for me to migrate back to a cacheless layout; moving all of that
>> data around is time-consuming and I really want to use bcache.  :)  If there
>> is any information I could collect with my machine that would help in the
>> debugging process, please let me know!
>>

Hello Zach,

I am working now with kernel 3.11.0-rc7 (x86_64) and this issue is solved.

With some previous kernel releases that still had pending patches to be
applied, instead of merging, I replaced the contents of
(kernel)/drivers/md/bcache with the latest git for 3.11, and it compiled
and worked.
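
One way to do that kind of replacement with git, as a sketch only: the
function name is invented, and the ref passed in (for example a
remote-tracking branch such as bcache/bcache-for-3.11) is assumed to
have been fetched already.  Whether the result compiles depends on how
far the rest of the tree has moved.

```shell
# Sketch only: replace drivers/md/bcache with the version from another ref.
# $1: kernel git checkout
# $2: ref that holds the newer bcache code (must already be fetched)
# The function body runs in a subshell so the caller's directory
# is left untouched.
replace_bcache_dir() (
    cd "$1" || exit 1
    # Remove the current files first so that files deleted in the newer
    # ref do not linger, then take the directory as it exists in that ref.
    git rm -r -q drivers/md/bcache &&
    git checkout "$2" -- drivers/md/bcache
)
```

The change is left staged in the index, so `git diff --cached` shows
exactly what was swapped in before anything is committed or built.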

Josep






-- 
Regards...Josep


end of thread, other threads:[~2013-09-05  4:08 UTC | newest]

Thread overview: 4+ messages
2013-09-03 21:26 Linux 3.11-rc4 Writeback Cache Corruption Zachary Palmer
2013-09-03 21:38 ` Gabriel de Perthuis
     [not found] ` <522656EC.8040002@gmail.com>
     [not found]   ` <5227F74F.5000305@bahj.com>
     [not found]     ` <5227F74F.5000305-J5qI5MFTcs8@public.gmane.org>
2013-09-05  3:16       ` Zachary Palmer
     [not found]         ` <5227F774.7010002-J5qI5MFTcs8@public.gmane.org>
2013-09-05  4:08           ` Josep Lladonosa
