* Linux 3.11-rc4 Writeback Cache Corruption
@ 2013-09-03 21:26 Zachary Palmer
2013-09-03 21:38 ` Gabriel de Perthuis
[not found] ` <522656EC.8040002@gmail.com>
0 siblings, 2 replies; 4+ messages in thread
From: Zachary Palmer @ 2013-09-03 21:26 UTC (permalink / raw)
To: linux-bcache-u79uwXL29TY76Z2rM5mHXA
Hello,
I wrote in just a couple days ago mentioning problems involving
hibernating/suspending my laptop. I'm using Debian Wheezy at the
moment, so please accept my apologies for not having commit hashes
offhand. I'm curious about how to test a very recent patch (committed
less than an hour ago); I think I might have the same bug as someone who
posted a message three weeks ago. My story so far:
* I started by running my root filesystem over LVM over LUKS over
bcache over an HDD partition; the bcache device is cached by a partition
on an SSD. I have /dev/bcache0 in writethrough mode for now to be safe;
I'll switch to writeback once things seem stable. I was using the
Debian-packaged kernel in Wheezy backports:
linux-image-3.10-0.bpo.2-686-pae=3.10.5-1~bpo70+1. While bcache seemed
to operate correctly, a bug prevents the bcache device from shutting
down when I attempt to suspend, hibernate, or shut down. (I hope it's
all the same bug.)
* After reading about someone else who had a similar issue, I tried
upgrading to a kernel in Debian's experimental repository:
linux-image-3.11-rc4-686-pae=3.11~rc4-1~exp1. This kernel appears to
suspend/hibernate correctly. Unfortunately, it is also plagued by some
kind of caching bug. While I agree with Kent's comment that it seems
obscure in principle, I'm sure I encountered this problem numerous times
in the space of less than a day. Shortly after booting into this
kernel, (1) my copy of the bitcoin blockchain seemed to have a bad
index; (2) LibreOffice, which had not been updated and previously
worked, complained that it could not load one of its .so files; and (3)
the checksum failed on the APT packages list stored on my drive. After
reading
http://article.gmane.org/gmane.linux.kernel.bcache.devel/1898/match=silent+data+corruption+3.11+rc4+writethrough+mode
, I rebooted into the 3.10 kernel, detached the cache device, and
reattached it. Everything seems to be fine again (except I'm back to
not being able to suspend or hibernate).
* I've just discovered that Kent may have committed a patch for
exactly this bug less than an hour ago. I'm interested and willing to
try this thing out.
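A minimal sketch of the detach/reattach step, using the sysfs attributes described in the kernel's bcache documentation (cache_mode, detach, attach). The helper names and the BCACHE_SYSFS variable are made up for illustration, bcache0 is the device named above, and the cache set UUID is a placeholder; these are not necessarily the exact commands that were run.

```shell
# Sketch only: BCACHE_SYSFS and the function names are illustrative.
# Override BCACHE_SYSFS if the device is not bcache0.
BCACHE_SYSFS="${BCACHE_SYSFS:-/sys/block/bcache0/bcache}"

# Print the active cache mode; the kernel marks it with [brackets],
# e.g. "[writethrough] writeback writearound none".
bcache_current_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$BCACHE_SYSFS/cache_mode"
}

# Detach the backing device from its cache set.
bcache_detach() {
    echo 1 > "$BCACHE_SYSFS/detach"
}

# Reattach the backing device to a cache set, given the set's UUID
# (placeholder; the real UUID appears as a directory in /sys/fs/bcache/).
bcache_attach() {
    echo "$1" > "$BCACHE_SYSFS/attach"
}

# Example (as root):
#   bcache_current_mode
#   bcache_detach
#   bcache_attach <cache-set-uuid>
```

Keeping the sysfs path in a variable means the helpers can be tried against a scratch directory before touching a real device.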
So here's the question: how would I best go about testing this patch?
In looking through the git history, it doesn't seem as if the
bcache-for-3.11 branch has been synced against the Linux git since
3.10-rc7 (on June 22nd). I was thinking I could
* Pull the Linux kernel source
* Add the bcache git as an origin
* Merge the bcache-for-3.11 branch into the Linux 3.11 mainline
branch myself and
* Assuming that this works, compile and boot the resulting kernel
using my Debian kernel .config
Does this sound reasonable? Or is there a better way to do this? I'm
pretty happy with whatever gives me at least the behavior of my mainline
3.10 kernel and I'm looking forward to getting bcache and laptop power
modes on the same machine. :)
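The plan above could be sketched roughly as follows. This is only an outline under assumptions: merge_bcache_branch is a hypothetical helper, the remote name "bcache" and the local branch name "bcache-test" are arbitrary, and the bcache repository URL is deliberately left as a placeholder argument.

```shell
# Sketch only: merge a branch from a bcache repository into the
# currently checked-out kernel tree, on a throwaway branch.
#   $1 = path to an existing Linux kernel git tree
#   $2 = URL (or local path) of the bcache git repository (placeholder)
#   $3 = branch to merge, e.g. bcache-for-3.11
merge_bcache_branch() {
    git -C "$1" remote add bcache "$2" &&
    git -C "$1" fetch bcache &&
    # Work on a scratch branch so the original checkout stays untouched.
    git -C "$1" checkout -b bcache-test &&
    git -C "$1" merge --no-edit "bcache/$3"
}

# Example:
#   merge_bcache_branch ~/src/linux <bcache-git-url> bcache-for-3.11
```

If the merge reports conflicts, `git merge --abort` returns the scratch branch to its pre-merge state.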
Thanks,
Zach
* Re: Linux 3.11-rc4 Writeback Cache Corruption
2013-09-03 21:26 Linux 3.11-rc4 Writeback Cache Corruption Zachary Palmer
@ 2013-09-03 21:38 ` Gabriel de Perthuis
[not found] ` <522656EC.8040002@gmail.com>
1 sibling, 0 replies; 4+ messages in thread
From: Gabriel de Perthuis @ 2013-09-03 21:38 UTC (permalink / raw)
To: linux-bcache-u79uwXL29TY76Z2rM5mHXA
On Tue, 03 Sep 2013 17:26:40 -0400, Zachary Palmer wrote:
> So here's the question: how would I best go about testing this patch?
> In looking through the git history, it doesn't seem as if the
> bcache-for-3.11 branch has been synced against the Linux git since
> 3.10-rc7 (on June 22nd). I was thinking I could
>
> * Pull the Linux kernel source
> * Add the bcache git as an origin
> * Merge the bcache-for-3.11 branch into the Linux 3.11 mainline
> branch myself and
> * Assuming that this works, compile and boot the resulting kernel
> using my Debian kernel .config
>
> Does this sound reasonable? Or is there a better way to do this? I'm
> pretty happy with whatever gives me at least the behavior of my mainline
> 3.10 kernel and I'm looking forward to getting bcache and laptop power
> modes on the same machine. :)
Yeah, it'll merge cleanly. You can reuse the .config and build with
`make deb-pkg -j -l6`, which is slowly replacing make-kpkg functionality.
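A rough sketch of that build, assuming the existing Debian config is available as a file (for example under /boot). build_kernel_deb is a hypothetical helper name; `olddefconfig` is the kernel build target that takes defaults for options that are new relative to the supplied config, and the `deb-pkg -j -l6` invocation is the one quoted above.

```shell
# Sketch only: reuse an existing config and build Debian packages.
#   $1 = kernel source tree
#   $2 = existing config file, e.g. /boot/config-$(uname -r)
build_kernel_deb() {
    # Start from the known-good distribution config.
    cp "$2" "$1/.config" &&
    # Accept the default for every option the old config does not mention.
    make -C "$1" olddefconfig &&
    # Build .deb packages; -j with -l6 starts new jobs only while the
    # load average stays below 6.
    make -C "$1" deb-pkg -j -l6
}

# Example:
#   build_kernel_deb ~/src/linux "/boot/config-$(uname -r)"
```

The resulting packages are written next to the source tree and can be installed with dpkg.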
* Re: Linux 3.11-rc4 Writeback Cache Corruption
[not found] ` <5227F74F.5000305-J5qI5MFTcs8@public.gmane.org>
@ 2013-09-05 3:16 ` Zachary Palmer
[not found] ` <5227F774.7010002-J5qI5MFTcs8@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Zachary Palmer @ 2013-09-05 3:16 UTC (permalink / raw)
To: linux-bcache-u79uwXL29TY76Z2rM5mHXA
Whoops; forgot to send this e-mail to the whole list. :)
> So I have some unfortunate test results regarding this corruption
> issue. I tested my laptop on two kernels I built today. The
> procedure was as follows:
>
> 1. Check out bcache-3.10-stable. Kernel was built using the
> .config I have for Debian package
> linux-image-3.10-0.bpo.2-686-pae=3.10.5-1~bpo70+1. All new options
> were left at defaults.
> 2. Begin by checking out bcache git repository at
> bcache-for-3.11. Next, add Linux stable git as an origin. Then, "git
> br temp; git co temp; git merge linux-3.11.y". The merge applies
> automatically. Build the resulting kernel, again using the above
> .config with all new options left at defaults.
>
> My hope was that one of these two kernels would resolve both (a) the
> cache corruption issue and (b) my hibernate/suspend problem. This did
> not appear to be the case. Using kernel #2 above (Linux 3.11.0),
> cache corruption was immediately evident; both apt-cacher and MySQL
> failed to start due to segfaults. Cache corruption was resolved by
> detaching and reattaching the cache device under a clean kernel.
>
> Using kernel #1 above, I get the same results as the Debian stock
> kernel for Wheezy backports (the one from the package named above):
> bcache seems to work just fine until the kernel attempts to stop
> devices for suspend, hibernate, or shutdown; at this point, bcache
> times out waiting for the device to stop and the laptop never changes
> power states.
>
> For the time being, it is easier for me to live without
> suspend/hibernate than it is for me to migrate back to a cacheless
> layout; moving all of that data around is time-consuming and I really
> want to use bcache. :) If there is any information I could collect
> with my machine that would help in the debugging process, please let
> me know!
>
> Thanks,
>
> Zach
* Re: Linux 3.11-rc4 Writeback Cache Corruption
[not found] ` <5227F774.7010002-J5qI5MFTcs8@public.gmane.org>
@ 2013-09-05 4:08 ` Josep Lladonosa
0 siblings, 0 replies; 4+ messages in thread
From: Josep Lladonosa @ 2013-09-05 4:08 UTC (permalink / raw)
To: Zachary Palmer; +Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On 5 September 2013 05:16, Zachary Palmer <zachary.palmer-J5qI5MFTcs8@public.gmane.org> wrote:
> Whoops; forgot to send this e-mail to the whole list. :)
>>
>> So I have some unfortunate test results regarding this corruption issue.
>> I tested my laptop on two kernels I built today. The procedure was as
>> follows:
>>
>> 1. Check out bcache-3.10-stable. Kernel was built using the .config I
>> have for Debian package linux-image-3.10-0.bpo.2-686-pae=3.10.5-1~bpo70+1.
>> All new options were left at defaults.
>> 2. Begin by checking out bcache git repository at bcache-for-3.11.
>> Next, add Linux stable git as an origin. Then, "git br temp; git co temp;
>> git merge linux-3.11.y". The merge applies automatically. Build the
>> resulting kernel, again using the above .config with all new options left at
>> defaults.
>>
>> My hope was that one of these two kernels would resolve both (a) the cache
>> corruption issue and (b) my hibernate/suspend problem. This did not appear
>> to be the case. Using kernel #2 above (Linux 3.11.0), cache corruption was
>> immediately evident; both apt-cacher and MySQL failed to start due to
>> segfaults. Cache corruption was resolved by detaching and reattaching the
>> cache device under a clean kernel.
>>
>> Using kernel #1 above, I get the same results as the Debian stock kernel
>> for Wheezy backports (the one from the package named above): bcache seems to
>> work just fine until the kernel attempts to stop devices for suspend,
>> hibernate, or shutdown; at this point, bcache times out waiting for the
>> device to stop and the laptop never changes power states.
>>
>> For the time being, it is easier for me to live without suspend/hibernate
>> than it is for me to migrate back to a cacheless layout; moving all of that
>> data around is time-consuming and I really want to use bcache. :) If there
>> is any information I could collect with my machine that would help in the
>> debugging process, please let me know!
>>
Hello Zach,
I am working now with kernel 3.11.0-rc7 (x86_64) and this issue is solved.
In some previous kernel releases with pending patches to be applied,
instead of merging, I replaced the contents of
(kernel)/drivers/md/bcache with the latest git for 3.11, and it compiled
and worked.
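One way to do that replacement with git, as a sketch only: replace_bcache_dir is a hypothetical helper, and it assumes the bcache repository has already been fetched into the kernel tree so that the given ref (for example a remote-tracking branch for bcache-for-3.11) exists locally.

```shell
# Sketch only: swap drivers/md/bcache in a kernel tree for the version
# found at another ref, without merging anything else.
#   $1 = kernel source tree (a git checkout)
#   $2 = ref that carries the wanted bcache code (assumed already fetched)
replace_bcache_dir() {
    # Remove the current directory first so files that no longer exist
    # at the other ref do not linger in the working tree.
    rm -rf "$1/drivers/md/bcache" &&
    # Check out only that directory from the other ref.
    git -C "$1" checkout "$2" -- drivers/md/bcache
}

# Example:
#   replace_bcache_dir ~/src/linux bcache/bcache-for-3.11
```

This only works when the newer bcache code does not depend on changes elsewhere in the tree, so a successful compile is the first thing to check.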
Josep
--
--
Salutacions...Josep
--