From: "sunny.s.zhang" <sunny.s.zhang@oracle.com>
To: Nikolay Borisov <nborisov@suse.com>,
Duncan <1i5t5.duncan@cox.net>,
linux-btrfs@vger.kernel.org
Subject: Re: btrfs panic problem
Date: Tue, 25 Sep 2018 16:29:35 +0800
Message-ID: <4eafd6dd-814e-49fc-07d8-45a3bf8e7680@oracle.com>
In-Reply-To: <d98035c5-77fa-6a1c-0c4c-a8df138c4aaf@suse.com>
On 2018-09-20 00:12, Nikolay Borisov wrote:
> On 19.09.2018 02:53, sunny.s.zhang wrote:
>> Hi Duncan,
>>
>> Thank you for your advice. I understand what you mean. But I have
>> reviewed the latest btrfs code, and I think the issue still exists.
>>
>> Suppose btrfs_get_delayed_node executes line 71 and the CPU then
>> switches to another process, which executes line 1282 and goes on to
>> release the delayed node.
>>
>> When execution switches back to btrfs_get_delayed_node, it finds that
>> the node is not NULL and uses it as normal. That means we are using
>> freed memory.
>>
>> At some later point, this memory will be freed again.
>>
>> The latest code is below.
>>
>> 1278 void btrfs_remove_delayed_node(struct btrfs_inode *inode)
>> 1279 {
>> 1280 struct btrfs_delayed_node *delayed_node;
>> 1281
>> 1282 delayed_node = READ_ONCE(inode->delayed_node);
>> 1283 if (!delayed_node)
>> 1284 return;
>> 1285
>> 1286 inode->delayed_node = NULL;
>> 1287 btrfs_release_delayed_node(delayed_node);
>> 1288 }
>>
>>
>> 64 static struct btrfs_delayed_node *btrfs_get_delayed_node(
>> 65 struct btrfs_inode *btrfs_inode)
>> 66 {
>> 67 struct btrfs_root *root = btrfs_inode->root;
>> 68 u64 ino = btrfs_ino(btrfs_inode);
>> 69 struct btrfs_delayed_node *node;
>> 70
>> 71 node = READ_ONCE(btrfs_inode->delayed_node);
>> 72 if (node) {
>> 73 refcount_inc(&node->refs);
>> 74 return node;
>> 75 }
>> 76
>> 77 spin_lock(&root->inode_lock);
>> 78 node = radix_tree_lookup(&root->delayed_nodes_tree, ino);
>>
>>
> Your analysis is correct; however, it's missing one crucial point -
> btrfs_remove_delayed_node is called only from btrfs_evict_inode. And
> inodes are evicted when all other references have been dropped. Check
> the code in evict_inodes() - inodes are added to the dispose list when
> their i_count is 0, at which point there should be no references to this
> inode. This invalidates your analysis...
Thanks.
Yes, I know this, and I know that another process cannot use this inode
while the inode is in the I_FREEING state.
But Chris has fixed a bug that is similar to this one and was found in
production. That means this can occur under some conditions.
btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ec35e48b286959991cdbb886f1bdeda4575c80b4
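
For illustration only, here is a minimal userspace sketch of the "take a
reference only if the count is still non-zero" pattern that this kind of
refcount fix usually relies on. It is not a copy of the commit above. The
names struct node, node_get_unless_zero and node_put are hypothetical, and
C11 atomics stand in for the kernel's refcount_t. Note the assumption: this
guard only helps when the lookup itself is protected (for example by the
lock that also guards removal from the tree). It does not by itself make a
lockless read of inode->delayed_node safe, because the memory could already
be freed before the counter is touched.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct btrfs_delayed_node; only the refcount matters here. */
struct node {
	atomic_uint refs;
};

/* Take a reference only if the count has not already dropped to zero. */
static bool node_get_unless_zero(struct node *n)
{
	unsigned int old = atomic_load(&n->refs);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value and we retry. */
		if (atomic_compare_exchange_weak(&n->refs, &old, old + 1))
			return true;
	}
	return false;	/* count is zero: the node is on its way to being freed */
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool node_put(struct node *n)
{
	return atomic_fetch_sub(&n->refs, 1) == 1;
}
```

With this shape, a reader that loses the race sees node_get_unless_zero()
return false and falls back to the slow path instead of incrementing a
counter on a node that is about to be freed.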
>> On 2018-09-18 13:05, Duncan wrote:
>>> sunny.s.zhang posted on Tue, 18 Sep 2018 08:28:14 +0800 as excerpted:
>>>
>>>> My OS (4.1.12) panicked in kmem_cache_alloc, which is called by
>>>> btrfs_get_or_create_delayed_node.
>>>>
>>>> I found that the freelist of the slub is wrong.
>>> [Not a dev, just a btrfs list regular and user, myself. But here's a
>>> general btrfs list recommendations reply...]
>>>
>>> You appear to mean kernel 4.1.12 -- confirmed by the version reported in
>>> the posted dump: 4.1.12-112.14.13.el6uek.x86_64
>>>
>>> OK, so from the perspective of this forward-development-focused list,
>>> kernel 4.1 is pretty ancient history, but you do have a number of
>>> options.
>>>
>>> First let's consider the general situation. Most people choose an
>>> enterprise distro for supported stability, and that's certainly a valid
>>> thing to want. However, btrfs, while now reaching early maturity for the
>>> basics (single device in single or dup mode, and multi-device in single/
>>> raid0/1/10 modes, note that raid56 mode is newer and less mature),
>>> remains under quite heavy development, and keeping reasonably current is
>>> recommended for that reason.
>>>
>>> So you chose an enterprise distro presumably to lock in supported
>>> stability for several years, but you chose a filesystem, btrfs, that's
>>> still under heavy development, with reasonably current kernels and
>>> userspace recommended as tending to have the known bugs fixed. There's a
>>> bit of a conflict there, and the /general/ recommendation would thus be
>>> to consider whether one or the other of those choices is inappropriate
>>> for your use-case, because it's really quite likely that if you really
>>> want the stability of an enterprise distro and kernel, that btrfs isn't
>>> as stable a filesystem as you're likely to want to match with it.
>>> Alternatively, if you want something newer to match the still under heavy
>>> development btrfs, you very likely want a distro that's not focused on
>>> years-old stability just for the sake of it. One or the other is likely
>>> to be a poor match for your needs, and choosing something else that's a
>>> better match is likely to be a much better experience for you.
>>>
>>> But perhaps you do have reason to want to run the newer and not quite to
>>> traditional enterprise-distro level stability btrfs, on an otherwise
>>> older and very stable enterprise distro. That's fine, provided you know
>>> what you're getting yourself into, and are prepared to deal with it.
>>>
>>> In that case, for best support from the list, we'd recommend running one
>>> of the latest two kernels in either the current or mainline LTS tracks.
>>>
>>> For current track, with 4.18 being the latest kernel, that'd be 4.18 or
>>> 4.17, as available on kernel.org (tho 4.17 is already EOL, no further
>>> releases, at 4.17.19).
>>>
>>> For mainline-LTS track, 4.14 and 4.9 are the latest two LTS series
>>> kernels, tho IIRC 4.19 is scheduled to be this year's LTS (or was it 4.18
>>> and it's just not out of normal stable range yet so not yet marked LTS?),
>>> so it'll be coming up soon and 4.9 will then be dropping to third LTS
>>> series and thus out of our best recommended range. 4.4 was the previous
>>> LTS and while still in LTS support, is outside the two newest LTS series
>>> that this list recommends.
>>>
>>> And of course 4.1 is older than 4.4, so as I said, in btrfs development
>>> terms, it's quite ancient indeed... quite out of practical support range
>>> here, tho of course we'll still try, but in many cases the first question
>>> when any problem's reported is going to be whether it's reproducible on
>>> something closer to current.
>>>
>>> But... you ARE on an enterprise kernel, likely on an enterprise distro,
>>> and very possibly actually paying /them/ for support. So you're not
>>> without options if you prefer to stay with your supported enterprise
>>> kernel. If you're paying them for support, you might as well use it, and
>>> of course of the very many fixes since 4.1, they know what they've
>>> backported and what they haven't, so they're far better placed to provide
>>> that support in any case.
>>>
>>> Or, given what you posted, you appear to be reasonably able to do at
>>> least limited kernel-dev-level analysis yourself. Given that, you're
>>> already reasonably well placed to simply decide to stick with what you
>>> have and take the support you can get, diving into things yourself if
>>> necessary.
>>>
>>>
>>> So those are your kernel options. What about userspace btrfs-progs?
>>>
>>> Generally speaking, while the filesystem's running, it's the kernel code
>>> doing most of the work. If you have old userspace, it simply means you
>>> can't take advantage of some of the newer features as the old userspace
>>> doesn't know how to call for them.
>>>
>>> But the situation changes as soon as you have problems and can't mount,
>>> because it's userspace code that runs to try to fix that sort of problem,
>>> or failing that, it's userspace code that btrfs restore runs to try to
>>> grab what files can be grabbed off of the unmountable filesystem.
>>>
>>> So for routine operation, it's no big deal if userspace is a bit old, at
>>> least as long as it's new enough to have all the newer command formats,
>>> etc, that you need, and for comparing against others when posted. But
>>> once things go bad on you, you really want the newest btrfs-progs in
>>> order to give you the best chance at either fixing things, or worst-
>>> case, at least retrieving the files off the dead filesystem. So using
>>> the older distro btrfs-progs for routine running should be fine, but
>>> unless your backups are complete and frequent enough that if something
>>> goes wrong it's easiest to simply blow the bad version away with a fresh
>>> mkfs and start over, you'll probably want at least a reasonably current
>>> btrfs-progs on your rescue media. Since the userspace version
>>> numbers are synced to the kernel cycle, a good rule of thumb is keep your
>>> btrfs-progs version to at least that of the oldest recommended LTS kernel
>>> version, as well, so you'd want at least btrfs-progs 4.9 on your rescue
>>> media, for now, and 4.14, coming up, since when the new kernel goes LTS
>>> that'll displace 4.9 and 4.14 will then be the second-back LTS.
>>>