From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from aserp2120.oracle.com ([141.146.126.78]:40336 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728463AbeIYOgZ (ORCPT ); Tue, 25 Sep 2018 10:36:25 -0400
From: "sunny.s.zhang"
Subject: Re: btrfs panic problem
To: Nikolay Borisov, Duncan <1i5t5.duncan@cox.net>, linux-btrfs@vger.kernel.org
References: <2cce0d8b-0958-9fb9-bb88-09fbfbf94c9e@oracle.com> <8f6641aa-fc2e-a7b2-4dee-d69706ed8801@oracle.com>
Message-ID: <4eafd6dd-814e-49fc-07d8-45a3bf8e7680@oracle.com>
Date: Tue, 25 Sep 2018 16:29:35 +0800
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On 2018-09-20 00:12, Nikolay Borisov wrote:
> On 19.09.2018 02:53, sunny.s.zhang wrote:
>> Hi Duncan,
>>
>> Thank you for your advice. I understand what you mean. But I have
>> reviewed the latest btrfs code, and I think the issue still exists.
>>
>> Suppose btrfs_get_delayed_node has just executed line 71 and the CPU
>> then switches to another process, which runs past line 1282 and
>> releases the delayed node at the end.
>>
>> When we switch back to btrfs_get_delayed_node, it finds that the node
>> is not NULL and uses it as normal. That means we use freed memory.
>>
>> At some point, this memory will be freed again.
>>
>> The latest code is below.
>>
>> 1278 void btrfs_remove_delayed_node(struct btrfs_inode *inode)
>> 1279 {
>> 1280         struct btrfs_delayed_node *delayed_node;
>> 1281
>> 1282         delayed_node = READ_ONCE(inode->delayed_node);
>> 1283         if (!delayed_node)
>> 1284                 return;
>> 1285
>> 1286         inode->delayed_node = NULL;
>> 1287         btrfs_release_delayed_node(delayed_node);
>> 1288 }
>>
>>
>>   64 static struct btrfs_delayed_node *btrfs_get_delayed_node(
>>   65                 struct btrfs_inode *btrfs_inode)
>>   66 {
>>   67         struct btrfs_root *root = btrfs_inode->root;
>>   68         u64 ino = btrfs_ino(btrfs_inode);
>>   69         struct btrfs_delayed_node *node;
>>   70
>>   71         node = READ_ONCE(btrfs_inode->delayed_node);
>>   72         if (node) {
>>   73                 refcount_inc(&node->refs);
>>   74                 return node;
>>   75         }
>>   76
>>   77         spin_lock(&root->inode_lock);
>>   78         node = radix_tree_lookup(&root->delayed_nodes_tree, ino);
>>
>>
> Your analysis is correct, however it's missing one crucial point -
> btrfs_remove_delayed_node is called only from btrfs_evict_inode. And
> inodes are evicted when all other references have been dropped. Check
> the code in evict_inodes() - inodes are added to the dispose list when
> their i_count is 0 at which point there should be no references in this
> inode. This invalidates your analysis...

Thanks.
Yes, I know this, and I know that other processes cannot use this inode
once it is in the I_FREEING state.

But Chris has fixed a bug that is similar to this one and was found in
production. That means this can occur under some conditions.
btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ec35e48b286959991cdbb886f1bdeda4575c80b4

>> On 2018-09-18 13:05, Duncan wrote:
>>> sunny.s.zhang posted on Tue, 18 Sep 2018 08:28:14 +0800 as excerpted:
>>>
>>>> My OS (4.1.12) panicked in kmem_cache_alloc, which is called by
>>>> btrfs_get_or_create_delayed_node.
>>>>
>>>> I found that the freelist of the slub is wrong.
>>> [Not a dev, just a btrfs list regular and user, myself.  But here's a
>>> general btrfs list recommendations reply...]
>>>
>>> You appear to mean kernel 4.1.12 -- confirmed by the version reported in
>>> the posted dump:  4.1.12-112.14.13.el6uek.x86_64
>>>
>>> OK, so from the perspective of this forward-development-focused list,
>>> kernel 4.1 is pretty ancient history, but you do have a number of
>>> options.
>>>
>>> First let's consider the general situation.  Most people choose an
>>> enterprise distro for supported stability, and that's certainly a valid
>>> thing to want.  However, btrfs, while now reaching early maturity for the
>>> basics (single device in single or dup mode, and multi-device in
>>> single/raid0/1/10 modes, note that raid56 mode is newer and less mature),
>>> remains under quite heavy development, and keeping reasonably current is
>>> recommended for that reason.
>>>
>>> So you chose an enterprise distro presumably to lock in supported
>>> stability for several years, but you chose a filesystem, btrfs, that's
>>> still under heavy development, with reasonably current kernels and
>>> userspace recommended as tending to have the known bugs fixed.
>>> There's a bit of a conflict there, and the /general/ recommendation would
>>> thus be to consider whether one or the other of those choices is
>>> inappropriate for your use-case, because it's really quite likely that if
>>> you really want the stability of an enterprise distro and kernel, btrfs
>>> isn't as stable a filesystem as you're likely to want to match with it.
>>> Alternatively, if you want something newer to match the still under heavy
>>> development btrfs, you very likely want a distro that's not focused on
>>> years-old stability just for the sake of it.  One or the other is likely
>>> to be a poor match for your needs, and choosing something else that's a
>>> better match is likely to be a much better experience for you.
>>>
>>> But perhaps you do have reason to want to run the newer and not quite to
>>> traditional enterprise-distro level stability btrfs, on an otherwise
>>> older and very stable enterprise distro.  That's fine, provided you know
>>> what you're getting yourself into, and are prepared to deal with it.
>>>
>>> In that case, for best support from the list, we'd recommend running one
>>> of the latest two kernels in either the current or mainline LTS tracks.
>>>
>>> For current track, with 4.18 being the latest kernel, that'd be 4.18 or
>>> 4.17, as available on kernel.org (tho 4.17 is already EOL, no further
>>> releases, at 4.17.19).
>>>
>>> For mainline-LTS track, 4.14 and 4.9 are the latest two LTS series
>>> kernels, tho IIRC 4.19 is scheduled to be this year's LTS (or was it 4.18
>>> and it's just not out of normal stable range yet so not yet marked LTS?),
>>> so it'll be coming up soon and 4.9 will then be dropping to third LTS
>>> series and thus out of our best recommended range.  4.4 was the previous
>>> LTS and while still in LTS support, is outside the two newest LTS series
>>> that this list recommends.
>>>
>>> And of course 4.1 is older than 4.4, so as I said, in btrfs development
>>> terms, it's quite ancient indeed... quite out of practical support range
>>> here, tho of course we'll still try, but in many cases the first question
>>> when any problem's reported is going to be whether it's reproducible on
>>> something closer to current.
>>>
>>> But... you ARE on an enterprise kernel, likely on an enterprise distro,
>>> and very possibly actually paying /them/ for support.  So you're not
>>> without options if you prefer to stay with your supported enterprise
>>> kernel.  If you're paying them for support, you might as well use it, and
>>> of course of the very many fixes since 4.1, they know what they've
>>> backported and what they haven't, so they're far better placed to provide
>>> that support in any case.
>>>
>>> Or, given what you posted, you appear to be reasonably able to do at
>>> least limited kernel-dev-level analysis yourself.  Given that, you're
>>> already reasonably well placed to simply decide to stick with what you
>>> have and take the support you can get, diving into things yourself if
>>> necessary.
>>>
>>>
>>> So those are your kernel options.  What about userspace btrfs-progs?
>>>
>>> Generally speaking, while the filesystem's running, it's the kernel code
>>> doing most of the work.  If you have old userspace, it simply means you
>>> can't take advantage of some of the newer features as the old userspace
>>> doesn't know how to call for them.
>>>
>>> But the situation changes as soon as you have problems and can't mount,
>>> because it's userspace code that runs to try to fix that sort of problem,
>>> or failing that, it's userspace code that btrfs restore runs to try to
>>> grab what files can be grabbed off of the unmountable filesystem.
>>>
>>> So for routine operation, it's no big deal if userspace is a bit old, at
>>> least as long as it's new enough to have all the newer command formats,
>>> etc, that you need, and for comparing against others when posted.  But
>>> once things go bad on you, you really want the newest btrfs-progs in
>>> order to give you the best chance at either fixing things, or
>>> worst-case, at least retrieving the files off the dead filesystem.  So
>>> using the older distro btrfs-progs for routine running should be fine, but
>>> unless your backups are complete and frequent enough that if something
>>> goes wrong it's easiest to simply blow the bad version away with a fresh
>>> mkfs and start over, you'll probably want at least a reasonably current
>>> btrfs-progs on your rescue media.  Since the userspace version
>>> numbers are synced to the kernel cycle, a good rule of thumb is keep your
>>> btrfs-progs version to at least that of the oldest recommended LTS kernel
>>> version, as well, so you'd want at least btrfs-progs 4.9 on your rescue
>>> media, for now, and 4.14, coming up, since when the new kernel goes LTS
>>> that'll displace 4.9 and 4.14 will then be the second-back LTS.
>>>
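
For reference, the interleaving discussed at the top of this message can
be replayed in user space. This is only a sketch under stated assumptions,
not the btrfs code and not a description of what the commit above changes:
struct node, node_get_unchecked(), node_get_not_zero(), node_put() and
replay() are hypothetical names, and the object lives on the stack so that
nothing is really freed. It only shows why an unconditional increment after
the last reference is dropped leaves a live-looking count on a released
object, while an increment-if-not-zero (the pattern the kernel exposes as
refcount_inc_not_zero()) refuses the reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object; only the count matters. */
struct node {
    atomic_int refs;
};

/* Unconditional increment, similar in spirit to refcount_inc(). */
static void node_get_unchecked(struct node *n)
{
    atomic_fetch_add(&n->refs, 1);
}

/* Take a reference only while the count is still non-zero. */
static bool node_get_not_zero(struct node *n)
{
    int old = atomic_load(&n->refs);

    while (old != 0) {
        /* On failure the CAS reloads 'old', so the loop re-checks for zero. */
        if (atomic_compare_exchange_weak(&n->refs, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool node_put(struct node *n)
{
    int old = atomic_fetch_sub(&n->refs, 1);

    assert(old > 0); /* catch underflow in this sketch */
    return old == 1;
}

/*
 * Replays the interleaving in a single thread:
 *   1. the reader loads the cached pointer (line 71 in the quoted code)
 *   2. the eviction path drops the last reference; real code would free here
 *   3. the reader tries to take its reference
 * Returns the count after step 3 and reports via *got_ref whether the
 * reader believes it holds a reference.
 */
int replay(bool checked, bool *got_ref)
{
    struct node n;
    struct node *seen;

    atomic_init(&n.refs, 1);
    seen = &n;                 /* step 1 */
    (void)node_put(&n);        /* step 2: count goes 1 -> 0 */

    if (checked) {             /* step 3 */
        *got_ref = node_get_not_zero(seen);
    } else {
        node_get_unchecked(seen);
        *got_ref = true;
    }
    return atomic_load(&n.refs);
}
```

Calling replay(false, &got) leaves the reader holding a "reference" to an
object whose last reference was already dropped, which is the use-after-free
shape described above; replay(true, &got) leaves the count at zero and the
reader must fall back to another lookup.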