From: Ethan Zhao <haifeng.zhao@linux.intel.com>
To: Robin Murphy <robin.murphy@arm.com>, Ido Schimmel <idosch@idosch.org>
Cc: joro@8bytes.org, will@kernel.org, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org, zhangzekun11@huawei.com,
john.g.garry@oracle.com, dheerajkumar.srivastava@amd.com,
jsnitsel@redhat.com
Subject: Re: [PATCH v3 0/2] iommu/iova: Make the rcache depot properly flexible
Date: Tue, 9 Jan 2024 14:23:32 +0800 [thread overview]
Message-ID: <d5ad4801-6061-42ee-aafb-129a78e5a2b8@linux.intel.com> (raw)
In-Reply-To: <15a058ba-3b51-46f3-bb1c-23792d100b55@linux.intel.com>
On 1/9/2024 1:54 PM, Ethan Zhao wrote:
>
> On 1/9/2024 1:35 AM, Robin Murphy wrote:
>> On 2023-12-28 12:23 pm, Ido Schimmel wrote:
>>> On Tue, Sep 12, 2023 at 05:28:04PM +0100, Robin Murphy wrote:
>>>> v2:
>>>> https://lore.kernel.org/linux-iommu/cover.1692641204.git.robin.murphy@arm.com/
>>>>
>>>> Hi all,
>>>>
>>>> I hope this is good to go now, just fixed the locking (and threw
>>>> lockdep at it to confirm, which of course I should have done to begin
>>>> with...) and picked up tags.
>>>
>>> Hi,
>>>
>>> After pulling the v6.7 changes we started seeing the following memory
>>> leaks [1] of 'struct iova_magazine'. I'm not sure how to reproduce it,
>>> which is why I didn't perform bisection. However, looking at the
>>> mentioned code paths, they seem to have been changed in v6.7 as part of
>>> this patchset. I reverted both patches and didn't see any memory leaks
>>> when running a full regression (~10 hours), but I will repeat it to be
>>> sure.
>>>
>>> Any idea what could be the problem?
>>
>> Hmm, we've got what looks to be a set of magazines forming a
>> plausible depot list (or at least the tail end of one):
>>
>> ffff8881411f9000 -> ffff8881261c1000
>>
>> ffff8881261c1000 -> ffff88812be26400
>>
>> ffff88812be26400 -> ffff8881392ec000
>>
>> ffff8881392ec000 -> ffff8881a5301000
>>
>> ffff8881a5301000 -> NULL
>>
>> which I guess has somehow become detached from its rcache->depot
>> without being freed properly? However I'm struggling to see any
>> conceivable way that could happen which wouldn't already be more
>> severely broken in other ways as well (i.e. either general memory
>> corruption or someone somehow still trying to use the IOVA domain
>> while it's being torn down).
>>
>> Out of curiosity, does reverting just patch #2 alone make a
>> difference? And is your workload doing anything "interesting" in
>> relation to IOVA domain lifetimes, like creating and destroying
>> SR-IOV virtual functions, changing IOMMU domain types via sysfs, or
>> using that horrible vdpa thing, or are you seeing this purely from
>> regular driver DMA API usage?
>
> There is no lock held in free_iova_rcaches(); is it possible that
> free_iova_rcaches() races with the delayed work before
> cancel_delayed_work_sync() runs?
>
> I don't see why cancel_delayed_work_sync(&rcache->work) is not called
> first in free_iova_rcaches() to avoid a possible race.
>
Between the following function pairs, is a race possible if they are
called concurrently?

1. free_iova_rcaches() vs. iova_depot_work_func():
   free_iova_rcaches() holds no lock, while iova_depot_work_func() holds
   rcache->lock.
2. iova_cpuhp_dead() vs. iova_depot_work_func():
   iova_cpuhp_dead() holds the per-CPU cpu_rcache->lock, while
   iova_depot_work_func() holds rcache->lock.
3. iova_cpuhp_dead() vs. free_iova_rcaches():
   iova_cpuhp_dead() holds the per-CPU cpu_rcache->lock, while
   free_iova_rcaches() holds no lock.
4. iova_cpuhp_dead() vs. free_global_cached_iovas():
   iova_cpuhp_dead() holds the per-CPU cpu_rcache->lock, while
   free_global_cached_iovas() holds rcache->lock.
......
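To make the suggestion concrete, here is a rough, untested sketch of how
free_iova_rcaches() could be reordered so that cancel_delayed_work_sync()
runs before anything is freed. This is only an illustration based on my
reading of the v6.7 drivers/iommu/iova.c, not a submitted patch:

```c
/*
 * Untested sketch: cancel the delayed depot work for each rcache before
 * freeing anything, so iova_depot_work_func() cannot run concurrently
 * with the unlocked teardown below.
 */
static void free_iova_rcaches(struct iova_domain *iovad)
{
	struct iova_rcache *rcache;
	struct iova_cpu_rcache *cpu_rcache;
	unsigned int cpu;
	int i;

	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
		rcache = &iovad->rcaches[i];
		if (!rcache->cpu_rcaches)
			break;
		/* Wait for any queued depot trimming to finish first */
		cancel_delayed_work_sync(&rcache->work);
		for_each_possible_cpu(cpu) {
			cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
			iova_magazine_free(cpu_rcache->loaded);
			iova_magazine_free(cpu_rcache->prev);
		}
		free_percpu(rcache->cpu_rcaches);
		while (rcache->depot)
			iova_magazine_free(iova_depot_pop(rcache));
	}

	kfree(iovad->rcaches);
	iovad->rcaches = NULL;
}
```

Of course this only closes the window between the depot work and the
teardown path; it does not answer the other pairings above.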
Thanks,
Ethan
>
> Thanks,
>
> Ethan
>
>>
>> Thanks,
>> Robin.
>>
>>>
>>> Thanks
>>>
>>> [1]
>>> unreferenced object 0xffff8881a5301000 (size 1024):
>>> comm "softirq", pid 0, jiffies 4306297099 (age 462.991s)
>>> hex dump (first 32 bytes):
>>> 00 00 00 00 00 00 00 00 e7 7d 05 00 00 00 00 00 .........}......
>>> 0f b4 05 00 00 00 00 00 b4 96 05 00 00 00 00 00 ................
>>> backtrace:
>>> [<ffffffff819f5f08>] __kmem_cache_alloc_node+0x1e8/0x320
>>> [<ffffffff818a239a>] kmalloc_trace+0x2a/0x60
>>> [<ffffffff8231d31e>] free_iova_fast+0x28e/0x4e0
>>> [<ffffffff82310860>] fq_ring_free_locked+0x1b0/0x310
>>> [<ffffffff8231225d>] fq_flush_timeout+0x19d/0x2e0
>>> [<ffffffff813e95ba>] call_timer_fn+0x19a/0x5c0
>>> [<ffffffff813ea16b>] __run_timers+0x78b/0xb80
>>> [<ffffffff813ea5bd>] run_timer_softirq+0x5d/0xd0
>>> [<ffffffff82f1d915>] __do_softirq+0x205/0x8b5
>>>
>>> unreferenced object 0xffff8881392ec000 (size 1024):
>>> comm "softirq", pid 0, jiffies 4306326731 (age 433.359s)
>>> hex dump (first 32 bytes):
>>> 00 10 30 a5 81 88 ff ff 50 ff 0f 00 00 00 00 00 ..0.....P.......
>>> f3 99 05 00 00 00 00 00 87 b7 05 00 00 00 00 00 ................
>>> backtrace:
>>> [<ffffffff819f5f08>] __kmem_cache_alloc_node+0x1e8/0x320
>>> [<ffffffff818a239a>] kmalloc_trace+0x2a/0x60
>>> [<ffffffff8231d31e>] free_iova_fast+0x28e/0x4e0
>>> [<ffffffff82310860>] fq_ring_free_locked+0x1b0/0x310
>>> [<ffffffff8231225d>] fq_flush_timeout+0x19d/0x2e0
>>> [<ffffffff813e95ba>] call_timer_fn+0x19a/0x5c0
>>> [<ffffffff813ea16b>] __run_timers+0x78b/0xb80
>>> [<ffffffff813ea5bd>] run_timer_softirq+0x5d/0xd0
>>> [<ffffffff82f1d915>] __do_softirq+0x205/0x8b5
>>>
>>> unreferenced object 0xffff8881411f9000 (size 1024):
>>> comm "softirq", pid 0, jiffies 4306708887 (age 51.459s)
>>> hex dump (first 32 bytes):
>>> 00 10 1c 26 81 88 ff ff 2c 96 05 00 00 00 00 00 ...&....,.......
>>> ac fe 0f 00 00 00 00 00 a6 fe 0f 00 00 00 00 00 ................
>>> backtrace:
>>> [<ffffffff819f5f08>] __kmem_cache_alloc_node+0x1e8/0x320
>>> [<ffffffff818a239a>] kmalloc_trace+0x2a/0x60
>>> [<ffffffff8231d31e>] free_iova_fast+0x28e/0x4e0
>>> [<ffffffff82310860>] fq_ring_free_locked+0x1b0/0x310
>>> [<ffffffff8231225d>] fq_flush_timeout+0x19d/0x2e0
>>> [<ffffffff813e95ba>] call_timer_fn+0x19a/0x5c0
>>> [<ffffffff813ea16b>] __run_timers+0x78b/0xb80
>>> [<ffffffff813ea5bd>] run_timer_softirq+0x5d/0xd0
>>> [<ffffffff82f1d915>] __do_softirq+0x205/0x8b5
>>>
>>> unreferenced object 0xffff88812be26400 (size 1024):
>>> comm "softirq", pid 0, jiffies 4306710027 (age 50.319s)
>>> hex dump (first 32 bytes):
>>> 00 c0 2e 39 81 88 ff ff 32 ab 05 00 00 00 00 00 ...9....2.......
>>> e3 ac 05 00 00 00 00 00 1f b6 05 00 00 00 00 00 ................
>>> backtrace:
>>> [<ffffffff819f5f08>] __kmem_cache_alloc_node+0x1e8/0x320
>>> [<ffffffff818a239a>] kmalloc_trace+0x2a/0x60
>>> [<ffffffff8231d31e>] free_iova_fast+0x28e/0x4e0
>>> [<ffffffff82310860>] fq_ring_free_locked+0x1b0/0x310
>>> [<ffffffff8231225d>] fq_flush_timeout+0x19d/0x2e0
>>> [<ffffffff813e95ba>] call_timer_fn+0x19a/0x5c0
>>> [<ffffffff813ea16b>] __run_timers+0x78b/0xb80
>>> [<ffffffff813ea5bd>] run_timer_softirq+0x5d/0xd0
>>> [<ffffffff82f1d915>] __do_softirq+0x205/0x8b5
>>>
>>> unreferenced object 0xffff8881261c1000 (size 1024):
>>> comm "softirq", pid 0, jiffies 4306711547 (age 48.799s)
>>> hex dump (first 32 bytes):
>>> 00 64 e2 2b 81 88 ff ff c0 7c 05 00 00 00 00 00 .d.+.....|......
>>> 87 a5 05 00 00 00 00 00 0e 9a 05 00 00 00 00 00 ................
>>> backtrace:
>>> [<ffffffff819f5f08>] __kmem_cache_alloc_node+0x1e8/0x320
>>> [<ffffffff818a239a>] kmalloc_trace+0x2a/0x60
>>> [<ffffffff8231d31e>] free_iova_fast+0x28e/0x4e0
>>> [<ffffffff82310860>] fq_ring_free_locked+0x1b0/0x310
>>> [<ffffffff8231225d>] fq_flush_timeout+0x19d/0x2e0
>>> [<ffffffff813e95ba>] call_timer_fn+0x19a/0x5c0
>>> [<ffffffff813ea16b>] __run_timers+0x78b/0xb80
>>> [<ffffffff813ea5bd>] run_timer_softirq+0x5d/0xd0
>>> [<ffffffff82f1d915>] __do_softirq+0x205/0x8b5
>>
>
Thread overview: 27+ messages
2023-09-12 16:28 [PATCH v3 0/2] iommu/iova: Make the rcache depot properly flexible Robin Murphy
2023-09-12 16:28 ` [PATCH v3 1/2] iommu/iova: Make the rcache depot scale better Robin Murphy
2023-09-12 16:28 ` [PATCH v3 2/2] iommu/iova: Manage the depot list size Robin Murphy
2023-09-25 10:08 ` [PATCH v3 0/2] iommu/iova: Make the rcache depot properly flexible Joerg Roedel
2023-12-28 12:23 ` Ido Schimmel
2024-01-02 7:24 ` Ido Schimmel
2024-01-03 8:38 ` Joerg Roedel
2024-01-06 4:21 ` Ethan Zhao
2024-01-06 7:07 ` zhangzekun (A)
2024-01-06 7:33 ` Ethan Zhao
2024-01-06 4:03 ` Ethan Zhao
2024-01-08 3:13 ` Ethan Zhao
2024-01-08 17:35 ` Robin Murphy
2024-01-09 5:54 ` Ethan Zhao
2024-01-09 6:23 ` Ethan Zhao [this message]
2024-01-09 11:26 ` Robin Murphy
2024-01-10 0:52 ` Ethan Zhao
2024-01-09 17:21 ` Ido Schimmel
2024-01-10 12:48 ` Robin Murphy
2024-01-10 14:00 ` Ido Schimmel
2024-01-10 17:58 ` Catalin Marinas
2024-01-11 8:20 ` Ido Schimmel
2024-01-11 10:13 ` Catalin Marinas
2024-01-12 15:31 ` Ido Schimmel
2024-01-15 7:17 ` Ido Schimmel
2024-10-28 8:04 ` Ido Schimmel
2024-10-28 17:45 ` Catalin Marinas