* WARNING in shmem_evict_inode
@ 2015-11-09 8:55 Dmitry Vyukov
2015-11-23 8:30 ` Dmitry Vyukov
0 siblings, 1 reply; 4+ messages in thread
From: Dmitry Vyukov @ 2015-11-09 8:55 UTC (permalink / raw)
To: Hugh Dickins, Andrew Morton, linux-mm@kvack.org, LKML,
Sasha Levin
Cc: syzkaller, Kostya Serebryany, Alexander Potapenko, Eric Dumazet
Hello,
The following program:
// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include <syscall.h>
#include <unistd.h>     /* declares syscall() */
#include <string.h>
#include <stdint.h>
#include <pthread.h>

#define SYS_memfd_create 319

long fd;

void *thr(void *p)
{
        syscall(SYS_ftruncate, fd, 0x8ul, 0, 0, 0, 0);
        return 0;
}

int main()
{
        pthread_t th;

        syscall(SYS_mmap, 0x20000000ul, 0x10000ul, 0x3ul, 0x32ul,
                0xfffffffffffffffful, 0x0ul);
        memcpy((void*)0x20000f96, "\x23\x65\x6d\x31\x07\x2b\x27\x29\x00", 9);
        fd = syscall(SYS_memfd_create, 0x20000f96ul, 0x2ul, 0, 0, 0, 0);
        syscall(SYS_fallocate, fd, 0x0ul, 0x31d89288ul, 0x4ul, 0, 0);
        syscall(SYS_mmap, 0x20061000ul, 0xc00000ul,
                0x1a9d91e04768640bul, 0x11ul, fd, 0x0ul);
        pthread_create(&th, 0, thr, 0);
        syscall(SYS_fstat, fd, 0x20550fcful, 0, 0, 0, 0);
        pthread_join(th, 0);
        return 0;
}
triggers WARNING in shmem_evict_inode:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 10442 at mm/shmem.c:625 shmem_evict_inode+0x335/0x480()
Modules linked in:
CPU: 1 PID: 8944 Comm: executor Not tainted 4.3.0+ #39
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
00000000ffffffff ffff88006c6afab8 ffffffff81aad406 0000000000000000
ffff88006e39ac80 ffffffff83091660 ffff88006c6afaf8 ffffffff81100829
ffffffff814192e5 ffffffff83091660 0000000000000271 ffff88003d075aa8
Call Trace:
[<ffffffff81100a59>] warn_slowpath_null+0x29/0x30 kernel/panic.c:480
[<ffffffff814192e5>] shmem_evict_inode+0x335/0x480 mm/shmem.c:625
[<ffffffff8151560e>] evict+0x26e/0x580 fs/inode.c:542
[< inline >] iput_final fs/inode.c:1477
[<ffffffff81515f30>] iput+0x4a0/0x790 fs/inode.c:1504
[< inline >] dentry_iput fs/dcache.c:358
[<ffffffff8150667e>] __dentry_kill+0x4fe/0x700 fs/dcache.c:543
[< inline >] dentry_kill fs/dcache.c:587
[<ffffffff8150be7b>] dput+0x6ab/0x7a0 fs/dcache.c:796
[<ffffffff814c499b>] __fput+0x3fb/0x6e0 fs/file_table.c:226
[<ffffffff814c4d05>] ____fput+0x15/0x20 fs/file_table.c:244
[<ffffffff8115ab23>] task_work_run+0x163/0x1f0 kernel/task_work.c:115
[< inline >] exit_task_work include/linux/task_work.h:21
[<ffffffff81105049>] do_exit+0x7f9/0x2b80 kernel/exit.c:748
[<ffffffff8110b268>] do_group_exit+0x108/0x320 kernel/exit.c:878
[< inline >] SYSC_exit_group kernel/exit.c:889
[<ffffffff8110b49d>] SyS_exit_group+0x1d/0x20 kernel/exit.c:887
---[ end trace 43da88a03e29c2a5 ]---
Run the program in a loop, as the WARNING seems to be triggered by a race.
On commit d1e41ff11941784f469f17795a4d9425c2eb4b7a (Nov 5).
But I was also able to reproduce it on a 3.11-based kernel.
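One way to "run the program in a loop" is a small wrapper like the sketch below. It is not part of the original report: run_repeatedly() is a hypothetical helper that starts the given command once per iteration and counts the runs that exit cleanly. It does not detect the WARNING itself, so the kernel log still has to be checked separately.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original report): run the command in
 * argv[] `iterations` times, one child process per run, and return how
 * many of those runs exited normally with status 0.
 */
int run_repeatedly(char *const argv[], int iterations)
{
	int clean_exits = 0;

	for (int i = 0; i < iterations; i++) {
		pid_t pid = fork();

		if (pid < 0) {
			perror("fork");
			break;
		}
		if (pid == 0) {
			execv(argv[0], argv);
			_exit(127);	/* only reached if execv() failed */
		}

		int status;
		if (waitpid(pid, &status, 0) == pid &&
		    WIFEXITED(status) && WEXITSTATUS(status) == 0)
			clean_exits++;
	}
	return clean_exits;
}
```

Assuming the reproducer was built as ./repro (a made-up name), calling run_repeatedly() with an argv of { "./repro", NULL } and a few thousand iterations, then looking at dmesg, would match the procedure described above.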
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: WARNING in shmem_evict_inode
2015-11-09 8:55 WARNING in shmem_evict_inode Dmitry Vyukov
@ 2015-11-23 8:30 ` Dmitry Vyukov
2015-12-02 9:29 ` Hugh Dickins
0 siblings, 1 reply; 4+ messages in thread
From: Dmitry Vyukov @ 2015-11-23 8:30 UTC (permalink / raw)
To: Hugh Dickins, Andrew Morton, linux-mm@kvack.org, LKML,
Sasha Levin
Cc: syzkaller, Kostya Serebryany, Alexander Potapenko, Eric Dumazet

On Mon, Nov 9, 2015 at 9:55 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> Hello,
>
> The following program:
[snip]
> triggers WARNING in shmem_evict_inode:
[snip]
> Run the program in a loop, as the WARNING seems to be triggered by a race.
>
> On commit d1e41ff11941784f469f17795a4d9425c2eb4b7a (Nov 5).
> But I was also able to reproduce it on a 3.11-based kernel.

Hello,

This is still happening periodically for me. Is anybody looking at this?

* Re: WARNING in shmem_evict_inode
2015-11-23 8:30 ` Dmitry Vyukov
@ 2015-12-02 9:29 ` Hugh Dickins
2015-12-16 19:23 ` Holger Hoffstätte
0 siblings, 1 reply; 4+ messages in thread
From: Hugh Dickins @ 2015-12-02 9:29 UTC (permalink / raw)
To: Dmitry Vyukov
Cc: Hugh Dickins, Andrew Morton, linux-mm@kvack.org, LKML,
Sasha Levin, syzkaller, Kostya Serebryany, Alexander Potapenko,
Eric Dumazet, Greg Thelen

On Mon, 23 Nov 2015, Dmitry Vyukov wrote:
> On Mon, Nov 9, 2015 at 9:55 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
[snip]
> > Run the program in a loop, as the WARNING seems to be triggered by a race.
> >
> > On commit d1e41ff11941784f469f17795a4d9425c2eb4b7a (Nov 5).
> > But I was also able to reproduce it on a 3.11-based kernel.
>
> Hello,
>
> This is still happening periodically for me. Is anybody looking at this?

It was more interesting than I expected, thanks.
I believe you will find that this fixes it.

[PATCH] tmpfs: fix shmem_evict_inode warnings on i_blocks

Dmitry Vyukov provides a little program, autogenerated by syzkaller,
which races a fault on a mapping of a sparse memfd object, against
truncation of that object below the fault address: run repeatedly for
a few minutes, it reliably generates shmem_evict_inode()'s
WARN_ON(inode->i_blocks).

(But there's nothing specific to memfd here, nor to the fstat which it
happened to use to generate the fault: though that looked suspicious,
since a shmem_recalc_inode() had been added there recently.  The same
problem can be reproduced with open+unlink in place of memfd_create, and
with fstatfs in place of fstat.)

v3.7 commit 0f3c42f522dc ("tmpfs: change final i_blocks BUG to WARNING")
explains one cause of such a warning (a race with shmem_writepage to
swap), and possible solutions; but we never took it further, and this
syzkaller incident turns out to have a different cause.

shmem_getpage_gfp()'s error recovery, when a freshly allocated page is
then found to be beyond eof, looks plausible - decrementing the alloced
count that was incremented just before - but in fact can go wrong, if a
racing thread (the truncator, for example) gets its shmem_recalc_inode()
in just after our delete_from_page_cache().  delete_from_page_cache()
decrements nrpages; that shmem_recalc_inode() will then balance the books
by decrementing alloced itself; and then our decrement of alloced takes
it one too low: leading to the WARNING when the object is finally evicted.

Once the new page has been exposed in the page cache,
shmem_getpage_gfp() must leave it to shmem_recalc_inode() itself to get
the accounting right in all cases (and not fall through from "trunc:" to
"decused:").  Adjust that error recovery block; and the reinitialization
of info and sbinfo can be removed too.

While we're here, fix shmem_writepage() to avoid the original issue: it
will be safe against a racing shmem_recalc_inode(), if it merely
increments swapped before the shmem_delete_from_page_cache() which
decrements nrpages (but it must then do its own shmem_recalc_inode()
before that, while still in balance, instead of after).  (Aside: why do
we shmem_recalc_inode() here in the swap path?
Because its raison d'etre is to cope with clean sparse shmem pages being
reclaimed behind our back: so here when swapping is a good place to look
for that case.)  But I've not now managed to reproduce this bug, even
without the patch.

I don't see why I didn't do that earlier: perhaps inhibited by the
preference to eliminate shmem_recalc_inode() altogether.  Driven by this
incident, I do now have a patch to do so at last; but still want to sit
on it for a bit, there's a couple of questions yet to be resolved.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
---
Cc stable?  I don't think that's necessary, but might be proved wrong:
along with the warning, the bug does allow one page beyond the limit
to be allocated from a size-limited tmpfs mount.

 mm/shmem.c |   34 ++++++++++++++--------------------
 1 file changed, 14 insertions(+), 20 deletions(-)

--- 4.4-rc3/mm/shmem.c	2015-11-15 21:06:56.513752469 -0800
+++ linux/mm/shmem.c	2015-11-30 17:38:42.337790242 -0800
@@ -843,14 +843,14 @@ static int shmem_writepage(struct page *
 		list_add_tail(&info->swaplist, &shmem_swaplist);
 
 	if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) {
-		swap_shmem_alloc(swap);
-		shmem_delete_from_page_cache(page, swp_to_radix_entry(swap));
-
 		spin_lock(&info->lock);
-		info->swapped++;
 		shmem_recalc_inode(inode);
+		info->swapped++;
 		spin_unlock(&info->lock);
 
+		swap_shmem_alloc(swap);
+		shmem_delete_from_page_cache(page, swp_to_radix_entry(swap));
+
 		mutex_unlock(&shmem_swaplist_mutex);
 		BUG_ON(page_mapped(page));
 		swap_writepage(page, wbc);
@@ -1078,7 +1078,7 @@ repeat:
 	if (sgp != SGP_WRITE && sgp != SGP_FALLOC &&
 	    ((loff_t)index << PAGE_CACHE_SHIFT) >= i_size_read(inode)) {
 		error = -EINVAL;
-		goto failed;
+		goto unlock;
 	}
 
 	if (page && sgp == SGP_WRITE)
@@ -1246,11 +1246,15 @@ clear:
 	/* Perhaps the file has been truncated since we checked */
 	if (sgp != SGP_WRITE && sgp != SGP_FALLOC &&
 	    ((loff_t)index << PAGE_CACHE_SHIFT) >= i_size_read(inode)) {
+		if (alloced) {
+			ClearPageDirty(page);
+			delete_from_page_cache(page);
+			spin_lock(&info->lock);
+			shmem_recalc_inode(inode);
+			spin_unlock(&info->lock);
+		}
 		error = -EINVAL;
-		if (alloced)
-			goto trunc;
-		else
-			goto failed;
+		goto unlock;
 	}
 	*pagep = page;
 	return 0;
@@ -1258,23 +1262,13 @@ clear:
 	/*
 	 * Error recovery.
 	 */
-trunc:
-	info = SHMEM_I(inode);
-	ClearPageDirty(page);
-	delete_from_page_cache(page);
-	spin_lock(&info->lock);
-	info->alloced--;
-	inode->i_blocks -= BLOCKS_PER_PAGE;
-	spin_unlock(&info->lock);
 decused:
-	sbinfo = SHMEM_SB(inode->i_sb);
 	if (sbinfo->max_blocks)
 		percpu_counter_add(&sbinfo->used_blocks, -1);
 unacct:
 	shmem_unacct_blocks(info->flags, 1);
 failed:
-	if (swap.val && error != -EINVAL &&
-	    !shmem_confirm_swap(mapping, index, swap))
+	if (swap.val && !shmem_confirm_swap(mapping, index, swap))
 		error = -EEXIST;
 unlock:
 	if (page) {

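The two races described in the patch can be replayed deterministically in a small userspace model. This is only a sketch, not kernel code. It assumes that shmem_recalc_inode() treats alloced - swapped - nrpages as the number of pages freed behind its back and subtracts that from alloced when it is positive, which is how the description reads. struct model and every function name below are hypothetical, and locking, i_blocks and used_blocks are left out.

```c
/* Hypothetical stand-in for the three per-inode counters in the patch. */
struct model {
	long alloced;	/* pages charged to the inode */
	long swapped;	/* pages currently out on swap */
	long nrpages;	/* pages in the page cache */
};

/* Assumed balancing rule: drop whatever alloced no longer accounts for. */
static void recalc(struct model *m)
{
	long freed = m->alloced - m->swapped - m->nrpages;

	if (freed > 0)
		m->alloced -= freed;
}

/* State after one freshly allocated page was added to the page cache. */
static struct model fresh_page(void)
{
	struct model m = { .alloced = 1, .swapped = 0, .nrpages = 1 };
	return m;
}

/* Old getpage recovery: a racing recalc runs before our own decrement. */
long old_getpage_recovery(void)
{
	struct model m = fresh_page();

	m.nrpages--;	/* our delete_from_page_cache() */
	recalc(&m);	/* racing truncator: alloced 1 -> 0 */
	m.alloced--;	/* our decrement: alloced 0 -> -1 */
	return m.alloced;
}

/* Patched getpage recovery: we only rebalance through recalc(). */
long new_getpage_recovery(void)
{
	struct model m = fresh_page();

	m.nrpages--;	/* our delete_from_page_cache() */
	recalc(&m);	/* racing truncator: alloced 1 -> 0 */
	recalc(&m);	/* our own recalc: nothing left to free */
	return m.alloced;
}

/* Old writepage order: page leaves the cache before swapped is bumped. */
long old_writepage_order(void)
{
	struct model m = fresh_page();

	m.nrpages--;	/* shmem_delete_from_page_cache() */
	recalc(&m);	/* racing recalc frees a page that is only moving */
	m.swapped++;	/* too late: alloced is already 0 */
	return m.alloced - m.swapped - m.nrpages;	/* imbalance */
}

/* Patched writepage order: swapped is bumped before the page leaves. */
long new_writepage_order(void)
{
	struct model m = fresh_page();

	recalc(&m);	/* our recalc, while still in balance */
	m.swapped++;
	recalc(&m);	/* racing recalc before the delete: no change */
	m.nrpages--;	/* shmem_delete_from_page_cache() */
	recalc(&m);	/* racing recalc after the delete: no change */
	return m.alloced - m.swapped - m.nrpages;	/* imbalance */
}
```

In this model the old recovery path leaves alloced at -1 and the old writepage order leaves the books off by one, while both patched orderings end at 0. In the real kernel, any such imbalance would surface as the nonzero i_blocks that the WARN_ON reports at eviction.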
* Re: WARNING in shmem_evict_inode
2015-12-02 9:29 ` Hugh Dickins
@ 2015-12-16 19:23 ` Holger Hoffstätte
0 siblings, 0 replies; 4+ messages in thread
From: Holger Hoffstätte @ 2015-12-16 19:23 UTC (permalink / raw)
To: Hugh Dickins, Dmitry Vyukov
Cc: Andrew Morton, linux-mm@kvack.org, LKML, Sasha Levin, syzkaller,
Kostya Serebryany, Alexander Potapenko, Eric Dumazet, Greg Thelen

On 12/02/15 10:29, Hugh Dickins wrote:
> On Mon, 23 Nov 2015, Dmitry Vyukov wrote:
>> On Mon, Nov 9, 2015 at 9:55 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
[snip]
>>> triggers WARNING in shmem_evict_inode:
>>>
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 0 PID: 10442 at mm/shmem.c:625 shmem_evict_inode+0x335/0x480()
>>> Modules linked in:
>>> CPU: 1 PID: 8944 Comm: executor Not tainted 4.3.0+ #39
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>> 00000000ffffffff ffff88006c6afab8 ffffffff81aad406 0000000000000000
>>> ffff88006e39ac80 ffffffff83091660 ffff88006c6afaf8 ffffffff81100829
>>> ffffffff814192e5 ffffffff83091660 0000000000000271 ffff88003d075aa8
>>> Call Trace:
>>> [<ffffffff81100a59>] warn_slowpath_null+0x29/0x30 kernel/panic.c:480
>>> [<ffffffff814192e5>] shmem_evict_inode+0x335/0x480 mm/shmem.c:625
>>> [<ffffffff8151560e>] evict+0x26e/0x580 fs/inode.c:542
>>> [< inline >] iput_final fs/inode.c:1477
[snip]
> It was more interesting than I expected, thanks.
> I believe you will find that this fixes it.
>
> [PATCH] tmpfs: fix shmem_evict_inode warnings on i_blocks

Since I just saw this in Linus' tree, here's another retrospective bug
report and Thank You for fixing it. :-)

The problem is quite real, even though I'm probably the only other person
to ever report it, see:
http://www.spinics.net/lists/linux-fsdevel/msg83567.html

> Cc stable? I don't think that's necessary, but might be proved wrong:
> along with the warning, the bug does allow one page beyond the limit
> to be allocated from a size-limited tmpfs mount.

It applies and works fine, so it probably wouldn't hurt. I'm using it in
my 4.1++ tree as we speak, no problems.

-h