reiserfs-devel.vger.kernel.org archive mirror
From: Edward Shishkin <edward.shishkin@gmail.com>
To: "Dušan Čolić" <dusanc@gmail.com>
Cc: reiserfs-devel <reiserfs-devel@vger.kernel.org>,
	Ivan Shapovalov <intelfx100@gmail.com>
Subject: Re: Reiser4 for 3.16.2 problem: mount: mount /dev/md125 on /mnt/backup failed: Cannot allocate memory
Date: Tue, 04 Nov 2014 13:49:09 +0100
Message-ID: <5458CB45.2050303@gmail.com>
In-Reply-To: <CADW=+3nSsTDABn4bWy0g8LVGmx4iJJ8YvdYuS2eokpi4RwdepQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 13060 bytes --]

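The attached patch prevents the panic: when an unprepped ctail cluster
is found, do_readpage_ctail() now prints a warning that suggests fsck
and returns -EIO instead of hitting BUG_ON().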

Thanks,
Edward.


On 11/03/2014 07:27 PM, Dušan Čolić wrote:
> OK, now with the same config, kernels 3.16, 3.14 and 3.10 all fail in
> the same spot. It fails around an rsync --delete command or something
> like it; I don't know how to find out exactly which one, I only looked
> at iotop.
> So let me reiterate:
> One automated operation that worked after fsck every day for months
> with kernel 3.10 started causing problems once I started using 3.16,
> BUT it causes a kernel Oops even if I go back to 3.10. The only funky
> thing I did was having txmod=wa in /etc/fstab for a few days, but that
> was before these problems. Could some corrupted file, unseen by fsck,
> be causing this?
> I can reformat this partition, but I suspect that the problematic file
> is on my /. And then we wouldn't have a test case to reproduce this
> oops :)
> Is there any way to record all IO requests to find out what exactly
> triggers this?
>
> This is the Oops from 3.14:
>
> Nov  3 18:57:40 krshina3 kernel: [  179.693947] ------------[ cut here
> ]------------
> Nov  3 18:57:40 krshina3 kernel: [  179.694212] kernel BUG at
> fs/reiser4/plugin/item/ctail.c:669!
> Nov  3 18:57:40 krshina3 kernel: [  179.694537] invalid opcode: 0000 [#1] SMP
> Nov  3 18:57:40 krshina3 kernel: [  179.694779] CPU: 1 PID: 3203 Comm:
> rsync Not tainted 3.14.14-gentoo #1
> Nov  3 18:57:40 krshina3 kernel: [  179.695148] Hardware name:
> Gigabyte Technology Co., Ltd. To be filled by O.E.M./B75-D3V, BIOS F5
> 07/04/2012
> Nov  3 18:57:40 krshina3 kernel: [  179.695702] task: ffff8800b6cf47b0
> ti: ffff8801db7f6000 task.ti: ffff8801db7f6000
> Nov  3 18:57:40 krshina3 kernel: [  179.696128] RIP:
> 0010:[<ffffffff811bc700>]  [<ffffffff811bc700>]
> do_readpage_ctail+0x2a0/0x400
> Nov  3 18:57:40 krshina3 kernel: [  179.696625] RSP:
> 0018:ffff8801db7f7a68  EFLAGS: 00010246
> Nov  3 18:57:40 krshina3 kernel: [  179.696925] RAX: 8000000000000021
> RBX: ffffea000664a8c8 RCX: ffff8800b9964800
> Nov  3 18:57:40 krshina3 kernel: [  179.697329] RDX: 0000000000000035
> RSI: 0000000000000000 RDI: ffff8800b9967800
> Nov  3 18:57:40 krshina3 kernel: [  179.697739] RBP: 0000000000000002
> R08: 0000000000000000 R09: 0000000000000001
> Nov  3 18:57:40 krshina3 kernel: [  179.698145] R10: ffffffff811ad0a0
> R11: 0000000000000000 R12: ffff8800841dfd58
> Nov  3 18:57:40 krshina3 kernel: [  179.698548] R13: ffff8801db7f7b58
> R14: 0000000000000001 R15: 0000000000001000
> Nov  3 18:57:40 krshina3 kernel: [  179.698952] FS:
> 00007f40176a0700(0000) GS:ffff88022e280000(0000)
> knlGS:0000000000000000
> Nov  3 18:57:40 krshina3 kernel: [  179.699410] CS:  0010 DS: 0000 ES:
> 0000 CR0: 0000000080050033
> Nov  3 18:57:40 krshina3 kernel: [  179.699735] CR2: 0000000002b30248
> CR3: 00000001de9d8000 CR4: 00000000001427e0
> Nov  3 18:57:40 krshina3 kernel: [  179.700145] DR0: 0000000000000045
> DR1: 0000000000000000 DR2: 0000000000000000
> Nov  3 18:57:40 krshina3 kernel: [  179.700555] DR3: 0000000000000005
> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Nov  3 18:57:40 krshina3 kernel: [  179.700966] Stack:
> Nov  3 18:57:40 krshina3 kernel: [  179.701080]  ffff8800841dfe98
> ffff8801db7f7b58 ffffea000664a8c8 0000000000000000
> Nov  3 18:57:40 krshina3 kernel: [  179.701523]  ffff8800b6cf47b0
> ffff8800841dfd58 0000000000000000 ffffffff811bc969
> Nov  3 18:57:40 krshina3 kernel: [  179.701963]  ffffea000664a8c8
> ffffea000664a8c8 ffff8801db7f7c90 ffff8800841dfe98
> Nov  3 18:57:40 krshina3 kernel: [  179.702402] Call Trace:
> Nov  3 18:57:40 krshina3 kernel: [  179.702540]  [<ffffffff811bc969>]
> ? ctail_readpages_filler+0x109/0x210
> Nov  3 18:57:40 krshina3 kernel: [  179.702911]  [<ffffffff811bc860>]
> ? do_readpage_ctail+0x400/0x400
> Nov  3 18:57:40 krshina3 kernel: [  179.703257]  [<ffffffff810e9b91>]
> ? read_cache_pages+0xb1/0x120
> Nov  3 18:57:40 krshina3 kernel: [  179.703592]  [<ffffffff811bccf5>]
> ? readpages_ctail+0x135/0x340
> Nov  3 18:57:40 krshina3 kernel: [  179.703928]  [<ffffffff811b0da6>]
> ? readpages_cryptcompress+0x46/0x90
> Nov  3 18:57:40 krshina3 kernel: [  179.704292]  [<ffffffff810e99c1>]
> ? __do_page_cache_readahead+0x1b1/0x260
> Nov  3 18:57:40 krshina3 kernel: [  179.704677]  [<ffffffff811ab0a0>]
> ? reiser4_write_dispatch+0x4d0/0x4d0
> Nov  3 18:57:40 krshina3 kernel: [  179.705047]  [<ffffffff810e9d1c>]
> ? ra_submit+0x1c/0x30
> Nov  3 18:57:40 krshina3 kernel: [  179.705343]  [<ffffffff810e09b6>]
> ? generic_file_aio_read+0x4d6/0x6f0
> Nov  3 18:57:40 krshina3 kernel: [  179.705708]  [<ffffffff8111bb4a>]
> ? do_sync_read+0x5a/0x90
> Nov  3 18:57:40 krshina3 kernel: [  179.706023]  [<ffffffff8109de30>]
> ? __dequeue_entity+0x40/0x50
> Nov  3 18:57:40 krshina3 kernel: [  179.706354]  [<ffffffff811b0e68>]
> ? read_cryptcompress+0x78/0xc0
> Nov  3 18:57:40 krshina3 kernel: [  179.706694]  [<ffffffff811ab244>]
> ? reiser4_read_dispatch+0x74/0x170
> Nov  3 18:57:40 krshina3 kernel: [  179.707056]  [<ffffffff8111c8a1>]
> ? vfs_read+0xa1/0x180
> Nov  3 18:57:40 krshina3 kernel: [  179.707351]  [<ffffffff8111cd9f>]
> ? SyS_read+0x4f/0xc0
> Nov  3 18:57:40 krshina3 kernel: [  179.707642]  [<ffffffff81670d62>]
> ? system_call_fastpath+0x16/0x1b
> Nov  3 18:57:40 krshina3 kernel: [  179.707992] Code: 80 0b 08 31 ed
> e9 b0 fe ff ff 90 48 89 df e8 18 2d f2 ff e9 62 fe ff ff 0f 1f 00 48
> 89 df e8 08 2d f2 ff e9 92 fe ff ff 0f 1f 00 <0f> 0b 48 8b 03 a8 08 0f
> 84 a2 00 00 00 49 8b bd 80 00 00 00 e8
> Nov  3 18:57:40 krshina3 kernel: [  179.709414] RIP
> [<ffffffff811bc700>] do_readpage_ctail+0x2a0/0x400
> Nov  3 18:57:40 krshina3 kernel: [  179.709774]  RSP <ffff8801db7f7a68>
> Nov  3 18:57:40 krshina3 kernel: [  179.821174] ---[ end trace
> d60466a8b91493b8 ]---
>
> This is from 3.10:
>
> Nov  3 19:08:36 krshina3 kernel: [  449.276796] ------------[ cut here
> ]------------
> Nov  3 19:08:36 krshina3 kernel: [  449.276825] kernel BUG at
> fs/reiser4/plugin/item/ctail.c:669!
> Nov  3 19:08:36 krshina3 kernel: [  449.276841] invalid opcode: 0000 [#1] SMP
> Nov  3 19:08:36 krshina3 kernel: [  449.276857] CPU: 1 PID: 3167 Comm:
> rsync Not tainted 3.10.6-gentoo #2
> Nov  3 19:08:36 krshina3 kernel: [  449.276875] Hardware name:
> Gigabyte Technology Co., Ltd. To be filled by O.E.M./B75-D3V, BIOS F5
> 07/04/2012
> Nov  3 19:08:36 krshina3 kernel: [  449.276900] task: ffff88022d2ac690
> ti: ffff880074f80000 task.ti: ffff880074f80000
> Nov  3 19:08:36 krshina3 kernel: [  449.276920] RIP:
> 0010:[<ffffffff811555e9>]  [<ffffffff811555e9>]
> do_readpage_ctail+0x2f3/0x3ed
> Nov  3 19:08:36 krshina3 kernel: [  449.276948] RSP:
> 0018:ffff880074f81a28  EFLAGS: 00010246
> Nov  3 19:08:36 krshina3 kernel: [  449.276962] RAX: 0000000000000000
> RBX: ffffea0005a31988 RCX: 00000000000354d4
> Nov  3 19:08:36 krshina3 kernel: [  449.276981] RDX: 0000000000000035
> RSI: 0000000000000000 RDI: ffff8801a139fcd8
> Nov  3 19:08:36 krshina3 kernel: [  449.276999] RBP: ffff880074f81b18
> R08: 0000000000000000 R09: 0000000000000000
> Nov  3 19:08:36 krshina3 kernel: [  449.277018] R10: 0000000000000866
> R11: 0000000000000866 R12: ffff8801a139fb98
> Nov  3 19:08:36 krshina3 kernel: [  449.277037] R13: 0000000000000002
> R14: 0000000000000000 R15: 0000000000001000
> Nov  3 19:08:36 krshina3 kernel: [  449.277056] FS:
> 00007fc4e9c01700(0000) GS:ffff88022e280000(0000)
> knlGS:0000000000000000
> Nov  3 19:08:36 krshina3 kernel: [  449.277077] CS:  0010 DS: 0000 ES:
> 0000 CR0: 0000000080050033
> Nov  3 19:08:36 krshina3 kernel: [  449.277092] CR2: 000000000331a248
> CR3: 0000000075069000 CR4: 00000000001427e0
> Nov  3 19:08:36 krshina3 kernel: [  449.277111] DR0: 0000000000000045
> DR1: 0000000000000000 DR2: 0000000000000000
> Nov  3 19:08:36 krshina3 kernel: [  449.277129] DR3: 0000000000000005
> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Nov  3 19:08:36 krshina3 kernel: [  449.277147] Stack:
> Nov  3 19:08:36 krshina3 kernel: [  449.277154]  ffff8801a139fcd8
> ffff880074f81b18 ffff8801a139fb98 ffffea0005a31988
> Nov  3 19:08:36 krshina3 kernel: [  449.277177]  0000000000000000
> 0000000000000000 ffff880074f81b18 ffffffff81155861
> Nov  3 19:08:36 krshina3 kernel: [  449.277200]  ffffea0005a31988
> ffffea0005a31988 ffff880074f81c78 ffff8801a139fcd8
> Nov  3 19:08:36 krshina3 kernel: [  449.277223] Call Trace:
> Nov  3 19:08:36 krshina3 kernel: [  449.277232]  [<ffffffff81155861>]
> ? ctail_readpages_filler+0x17e/0x1c2
> Nov  3 19:08:36 krshina3 kernel: [  449.277250]  [<ffffffff811556e3>]
> ? do_readpage_ctail+0x3ed/0x3ed
> Nov  3 19:08:36 krshina3 kernel: [  449.277268]  [<ffffffff810ad77c>]
> ? read_cache_pages+0x91/0x108
> Nov  3 19:08:36 krshina3 kernel: [  449.277286]  [<ffffffff8113a7d1>]
> ? reiser4_get_file_fsdata+0x33/0x8a
> Nov  3 19:08:36 krshina3 kernel: [  449.277304]  [<ffffffff81155c95>]
> ? readpages_ctail+0x2f2/0x2f9
> Nov  3 19:08:36 krshina3 kernel: [  449.277321]  [<ffffffff8114bef8>]
> ? readpages_cryptcompress+0x3f/0x6b
> Nov  3 19:08:36 krshina3 kernel: [  449.277339]  [<ffffffff810ad5b3>]
> ? __do_page_cache_readahead+0x11f/0x1c3
> Nov  3 19:08:36 krshina3 kernel: [  449.277357]  [<ffffffff810ad8c3>]
> ? ra_submit+0x1c/0x23
> Nov  3 19:08:36 krshina3 kernel: [  449.277372]  [<ffffffff810a6348>]
> ? generic_file_aio_read+0x269/0x5b6
> Nov  3 19:08:36 krshina3 kernel: [  449.277390]  [<ffffffff810d4399>]
> ? do_sync_read+0x6e/0x90
> Nov  3 19:08:36 krshina3 kernel: [  449.277406]  [<ffffffff8114bf8d>]
> ? read_cryptcompress+0x69/0x95
> Nov  3 19:08:36 krshina3 kernel: [  449.277423]  [<ffffffff8114723b>]
> ? reiser4_read_dispatch+0xc9/0x124
> Nov  3 19:08:36 krshina3 kernel: [  449.277441]  [<ffffffff810d4dc2>]
> ? vfs_read+0xac/0x146
> Nov  3 19:08:36 krshina3 kernel: [  449.277456]  [<ffffffff810d51da>]
> ? SyS_read+0x4e/0x78
> Nov  3 19:08:36 krshina3 kernel: [  449.277471]  [<ffffffff814c9312>]
> ? system_call_fastpath+0x16/0x1b
> Nov  3 19:08:36 krshina3 kernel: [  449.277487] Code: f9 00 00 00 44
> 8b ad 88 00 00 00 41 83 fd 02 74 1a 77 0c 41 83 fd 01 0f 85 e0 00 00
> 00 eb 61 41 83 fd 04 0f 87 d4 00 00 00 eb 02 <0f> 0b 65 48 8b 14 25 48
> b7 00 00 48 81 ea d8 1f 00 00 ff 42 1c
> Nov  3 19:08:36 krshina3 kernel: [  449.277612] RIP
> [<ffffffff811555e9>] do_readpage_ctail+0x2f3/0x3ed
> Nov  3 19:08:36 krshina3 kernel: [  449.277630]  RSP <ffff880074f81a28>
> Nov  3 19:08:36 krshina3 kernel: [  449.282221] ---[ end trace
> f14ff60ec68f3286 ]---
>
> On Mon, Nov 3, 2014 at 5:43 PM, Edward Shishkin
> <edward.shishkin@gmail.com> wrote:
>> On 11/03/2014 04:48 PM, Dušan Čolić wrote:
>>
>>
>> On Nov 3, 2014 4:43 PM, "Edward Shishkin" <edward.shishkin@gmail.com> wrote:
>>> On 11/03/2014 02:33 PM, Dušan Čolić wrote:
>>>> I forgot:
>>>> It was working for some time (deleting old directories etc.) and then
>>>> crashed.
>>>> Btw., why wasn't that partition (/dev/md125) remounted read-only on
>>>> error when I have that in /etc/fstab?
>>>
>>>
>>> What do you suggest doing when an IO error is encountered?
>>> There is another option: to oops. Would you like that better?
>>>
>> Maybe there's a misunderstanding... A long time ago you told me to add
>> onerror=remount-ro to fstab, but now when this error happened the system
>> continued to work and the partition stayed rw. Maybe I misunderstood the
>> option?
>>
>>
>>
>> Ah, "onerror" means "on IO error".
>> That is, the file system submitted a RW request, and the disk driver
>> returned an IO error for this request. If so, then your file system will
>> become read-only.
>>
>> As to your case: there are no IO errors. There is a kernel oops, which is
>> a different situation.
>>
>> Edward.
>>
>>
>>
>>> Edward.
>>>
>>>
>>>> On Mon, Nov 3, 2014 at 2:28 PM, Dušan Čolić <dusanc@gmail.com> wrote:
>>>>> After an hour or more, still nothing: one rsync became a zombie, the
>>>>> other is still in D state.
>>>>> I killed the main process and rebooted.
>>>>>
>>>>> krshina3 goran # ps -aux | grep rsync
>>>>> root      6655  0.0  0.1 101568  9328 pts/4    D    13:08   0:01
>>>>> /usr/bin/rsync -ax --delete --numeric-ids --relative --delete-excluded
>>>>> --exclude=/home/windows.qcow2
>>>>> --link-dest=/mnt/backup/daily.1/localhost/ /home
>>>>> /mnt/backup/daily.0/localhost/
>>>>> root      6656  0.0  0.0      0     0 pts/4    Z    13:08   0:00
>>>>> [rsync] <defunct>
>>>>>
>>>>>
>>>>> Now I tried the same command (rsnapshot -c /etc/rsnapshot.d/daily.conf
>>>>> daily) and the kernel BUGed with:
>>>>>
>>>>> krshina3 goran # rsnapshot -c /etc/rsnapshot.d/daily.conf daily
>>>>> rsync: writefd_unbuffered failed to write 5 bytes to socket
>>>>> [generator]: Broken pipe (32)
>>>>>
>>>>> ----------------------------------------------------------------------------
>>>>> rsnapshot encountered an error! The program was invoked with these
>>>>> options:
>>>>> /usr/bin/rsnapshot -c /etc/rsnapshot.d/daily.conf daily
>>>>>
>>>>> ----------------------------------------------------------------------------
>>>>> ERROR: /usr/bin/rsync returned 0.04296875 while processing /home/
>>>>> WARNING: Rolling back "localhost/"
>>>>> rsync error: error in rsync protocol data stream (code 12) at
>>>>> io.c(1532) [generator=3.0.9]
>>>>>
>>


[-- Attachment #2: reiser4-crc-fixups.patch --]
[-- Type: text/x-patch, Size: 678 bytes --]

Don't panic when unprepped ctail cluster is found.
Instead, return error and suggest to fsck.

Signed-off-by: Edward Shishkin <edward.shishkin@gmail.com>
---
 fs/reiser4/plugin/item/ctail.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/fs/reiser4/plugin/item/ctail.c
+++ b/fs/reiser4/plugin/item/ctail.c
@@ -666,7 +666,11 @@ int do_readpage_ctail(struct inode * ino
 
 	switch (clust->dstat) {
 	case UNPR_DISK_CLUSTER:
-		BUG_ON(1);
+		warning("edward-1632",
+			"Bad item cluster %lu (Inode %llu). Fsck?",
+			clust->index,
+			(unsigned long long)get_inode_oid(inode));
+		return RETERR(-EIO);
 	case TRNC_DISK_CLUSTER:
 		/*
 		 * Race with truncate!
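
For readers outside the kernel tree, the idea of the hunk above can be
sketched in user space as follows. This is only an illustration under
stated assumptions: the enum, its SK_-prefixed names and
read_cluster_sketch() are hypothetical stand-ins, not reiser4 code, and
fprintf() replaces the kernel's warning(). Only the message text and the
-EIO return value are taken from the patch.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for the disk cluster states (not the reiser4 enum). */
enum sk_dstat {
	SK_PREP_DISK_CLUSTER,	/* cluster is in a state that can be read */
	SK_UNPR_DISK_CLUSTER	/* unprepped cluster: unexpected on the read path */
};

/*
 * Sketch of the read-side check. Returns 0 when the cluster can be read,
 * or -EIO (after printing a warning) when it is in the unexpected state.
 */
int read_cluster_sketch(enum sk_dstat dstat, unsigned long index,
			unsigned long long oid)
{
	switch (dstat) {
	case SK_UNPR_DISK_CLUSTER:
		/* Before the patch this case was BUG_ON(1), i.e. a kernel oops. */
		fprintf(stderr, "Bad item cluster %lu (Inode %llu). Fsck?\n",
			index, oid);
		return -EIO;
	default:
		return 0;
	}
}
```

A caller that gets -EIO back can fail the single read() and carry on,
whereas BUG_ON(1) kills the calling task, which matches the rsync
processes left in zombie and D state earlier in the thread.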
