From: Nikolay Borisov <kernel@kyup.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-ext4 <linux-ext4@vger.kernel.org>,
Theodore Ts'o <tytso@mit.edu>, Jan Kara <jack@suse.com>,
SiteGround Operations <operations@siteground.com>
Subject: Re: ext4 crash in 4.4.10
Date: Mon, 4 Jul 2016 11:49:27 +0300
Message-ID: <577A2317.2070609@kyup.com>
In-Reply-To: <20160603091936.GA2470@quack2.suse.cz>
Hello again Jan,
On 06/03/2016 12:19 PM, Jan Kara wrote:
> Hi,
>
> On Fri 03-06-16 11:28:31, Nikolay Borisov wrote:
>> Recently the following crash was brought to my attention:
>>
[SNIP]
>
> Hum, this looks most likely like a memory corruption. The value
> ffffffffd9c01f11 doesn't look like a valid pointer to any dynamically
> allocated data (it is not aligned to multiple of 4, it does not point to
> data segment ffff88..........). It is close to a pointer to kernel code
> (modules start at ffffffffa.......) so if it really points to some kernel
> code it may be interesting to find out where. I have no clue how such
> number could get to ei->i_dquot[0]. Usually what I do in such cases is
> search kernel memory whether something unusual points to that place,
> whether previous struct members didn't get corrupted as well or whether
> that value is not also somewhere else in memory. But it's a search for a
> needle in a haystack.
>
> Honza
So I got this exact same crash on a different machine, with the exact
same bogus value (ffffffffd9c01f11 in RAX). This rules out random
corruption:
[2455521.848677] BUG: unable to handle kernel paging request at ffffffffd9c01fb1
[2455521.849025] IP: [<ffffffff81204b62>] dquot_free_inode+0xa2/0x230
[2455521.849315] PGD 1c0b067 PUD 1c0d067 PMD 0
[2455521.849720] Oops: 0000 [#1] SMP
[2455521.850062] Modules linked in: <OMITTED >
[2455521.856549] ipv6 [last unloaded: nf_conntrack_ftp]
[2455521.856904] CPU: 8 PID: 2955 Comm: rm Tainted: G O 4.4.10-clouder1 #73
[2455521.857286] Hardware name: Supermicro X10DRi/X10DRi, BIOS 2.0 12/28/2015
[2455521.857517] task: ffff883506658000 ti: ffff881d50198000 task.ti: ffff881d50198000
[2455521.857898] RIP: 0010:[<ffffffff81204b62>] [<ffffffff81204b62>] dquot_free_inode+0xa2/0x230
[2455521.858353] RSP: 0018:ffff881d5019bc48 EFLAGS: 00010286
[2455521.858581] RAX: ffffffffd9c01f11 RBX: ffff881d5019bc48 RCX: 000000000000fb20
[2455521.858962] RDX: ffff881d5019bc58 RSI: ffff880996894680 RDI: ffffffff81c09540
[2455521.859343] RBP: ffff881d5019bcc8 R08: 0000000000000001 R09: ffff881d5019bc58
[2455521.859724] R10: ffff881d5019bca0 R11: 0000000100000000 R12: ffff880996894680
[2455521.860105] R13: 0000000000000000 R14: 0000000000000008 R15: ffff881d5019be68
[2455521.860486] FS: 00007f6ad2fe9700(0000) GS:ffff881fffb00000(0000) knlGS:0000000000000000
[2455521.860868] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2455521.861096] CR2: ffffffffd9c01fb1 CR3: 0000000151007000 CR4: 00000000001406e0
[2455521.861476] Stack:
[2455521.861696] ffff881fa0388c00 ffff880996894368 0000000000000000 0000000000000000
[2455521.862335] 0000000000000000 ffffffff8123949c ffff881d5019bd28 ffffffff812351c8
[2455521.862972] ffff881d5019bcb8 ffff883fb9a4d800 ffff881ff093a810 ffff883fb9a4d800
[2455521.863611] Call Trace:
[2455521.863838] [<ffffffff8123949c>] ? ext4_evict_inode+0x26c/0x4c0
[2455521.864069] [<ffffffff812351c8>] ? ext4_mark_iloc_dirty+0x518/0x770
[2455521.864304] [<ffffffff812312e3>] ext4_free_inode+0x83/0x5a0
[2455521.864534] [<ffffffff8123949c>] ? ext4_evict_inode+0x26c/0x4c0
[2455521.864765] [<ffffffff8123673b>] ? ext4_mark_inode_dirty+0x7b/0x260
[2455521.864999] [<ffffffff812396e5>] ext4_evict_inode+0x4b5/0x4c0
[2455521.865233] [<ffffffff811ba616>] evict+0xc6/0x1c0
[2455521.865466] [<ffffffff811ba9dc>] iput+0x1ec/0x260
[2455521.865696] [<ffffffff811ab128>] ? vfs_unlink+0x128/0x130
[2455521.865928] [<ffffffff811ae766>] do_unlinkat+0x186/0x2c0
[2455521.866158] [<ffffffff811ae8e2>] SyS_unlinkat+0x22/0x40
[2455521.866390] [<ffffffff81635c57>] entry_SYSCALL_64_fastpath+0x12/0x6a
[2455521.866620] Code: 80 41 be 08 00 00 00 65 ff 0d cf 60 e0 7e e8 f6 0d 43 00 48 8d 53 10 4c 89 e6 4c 8d 55 d8 66 c7 02 00 00 48 8b 06 48 85 c0 74 61 <48> 8b 88 a0 00 00 00 4c 8d 80 a0 00 00 00 83 e1 08 0f 84 a5 00
[2455521.871376] RIP [<ffffffff81204b62>] dquot_free_inode+0xa2/0x230
[2455521.871674] RSP <ffff881d5019bc48>
[2455521.871897] CR2: ffffffffd9c01fb1
The crash again points to test_bit in info_idq_free. I followed
your advice to search for the address and here is what I got:
crash> search -m ffffffff00000000 d9c01f11
ffff88000181e030: d9c01927d9c01f11
ffff880996894680: ffffffffd9c01f11
ffff881d5019b858: ffffffffd9c01f11
ffff881d5019b998: ffffffffd9c01f11 - <stack frame of crash_kexec>
ffff881d5019bbe8: ffffffffd9c01f11 - <stack frame of page_fault>
ffffffff8181e030: d9c01927d9c01f11
So two of the values are in the stack frames of functions involved
in the crash, so I'd say they are of no interest. What's interesting
is that ffffffff8181e030 seems to be quota_magics:
readelf -s vmlinux-4.4.10-clouder1 | grep ffffffff8181e030
15605: ffffffff8181e030 12 OBJECT LOCAL DEFAULT 4 quota_magics.24849
#define V2_INITQMAGICS {\
0xd9c01f11, /* USRQUOTA */\
0xd9c01927, /* GRPQUOTA */\
0xd9c03f14, /* PRJQUOTA */\
}
So it seems that somehow the USRQUOTA magic value overwrites
the dquot pointer. Looking at the code, I'm not entirely
sure how this can happen, though.