* vmsplice triggering bug in kfree.
@ 2012-06-07 2:51 Dave Jones
2012-06-07 4:27 ` Eric Dumazet
0 siblings, 1 reply; 7+ messages in thread
From: Dave Jones @ 2012-06-07 2:51 UTC (permalink / raw)
To: Linux Kernel; +Cc: axboe
kernel BUG at mm/slub.c:3474!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU 7
Modules linked in: ipt_ULOG tun fuse binfmt_misc nfnetlink caif_socket caif phonet bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key decnet rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode usb_debug serio_raw pcspkr i2c_i801 e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]
Pid: 21252, comm: trinity-child7 Not tainted 3.5.0-rc1+ #74
RIP: 0010:[<ffffffff811945ce>] [<ffffffff811945ce>] kfree+0x26e/0x270
RSP: 0018:ffff880104065c48 EFLAGS: 00010246
RAX: 0020000000000000 RBX: ffff880104065d18 RCX: 0000000000000000
RDX: ffffffff7fffffff RSI: ffff880104065cf0 RDI: ffff880104065d18
RBP: ffff880104065c78 R08: 00000000fffffff2 R09: 0000000000000000
R10: ffffffff821e2d00 R11: 0000000000000001 R12: 0000000000000ffc
R13: ffffea0004101940 R14: 0000000000000000 R15: ffff880104065d98
FS: 00007f5baafd3740(0000) GS:ffff880148a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000ffc CR3: 0000000107181000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process trinity-child7 (pid: 21252, threadinfo ffff880104064000, task ffff8801080acd60)
Stack:
0000000000000010 ffff880104065cf0 0000000000000ffc fffffffffffffff2
0000000000000000 ffff880104065d98 ffff880104065c98 ffffffff811dc9ef
0000000000000018 0000000000000161 ffff880104065ec8 ffffffff811dcc4c
Call Trace:
[<ffffffff811dc9ef>] splice_shrink_spd+0x1f/0x30
[<ffffffff811dcc4c>] vmsplice_to_pipe+0x24c/0x290
[<ffffffff811db920>] ? page_cache_pipe_buf_release+0x30/0x30
[<ffffffff810b1e7e>] ? put_lock_stats.isra.23+0xe/0x40
[<ffffffff8164dee8>] ? _raw_spin_unlock_irqrestore+0x38/0x80
[<ffffffff8108cd97>] ? local_clock+0x47/0x60
[<ffffffff81078daa>] ? __hrtimer_start_range_ns+0x14a/0x530
[<ffffffff810b1ac8>] ? trace_hardirqs_off_caller+0x28/0xc0
[<ffffffff81078daa>] ? __hrtimer_start_range_ns+0x14a/0x530
[<ffffffff810b1e7e>] ? put_lock_stats.isra.23+0xe/0x40
[<ffffffff8164dee8>] ? _raw_spin_unlock_irqrestore+0x38/0x80
[<ffffffff8108cd97>] ? local_clock+0x47/0x60
[<ffffffff81050e0c>] ? do_setitimer+0x1cc/0x310
[<ffffffff810b1ac8>] ? trace_hardirqs_off_caller+0x28/0xc0
[<ffffffff81086f91>] ? get_parent_ip+0x11/0x50
[<ffffffff81651919>] ? sub_preempt_count+0x79/0xd0
[<ffffffff811ad4da>] ? fget_light+0x3ca/0x500
[<ffffffff811dd90d>] sys_vmsplice+0x9d/0x210
[<ffffffff81655937>] ? sysret_check+0x1b/0x56
[<ffffffff81326f3e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff81655912>] system_call_fastpath+0x16/0x1b
Code: e8 58 ac fb ff e9 a8 fe ff ff 0f 0b 4d 8b 6d 30 e9 fe fd ff ff 4c 89 f1 48 89 da 4c 89 ee 4c 89 e7 e8 91 fd 4a 00 e9 87 fe ff ff <0f> 0b 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 89 fb 48 8b
RIP [<ffffffff811945ce>] kfree+0x26e/0x270
RSP <ffff880104065c48>
---[ end trace 77573bf4cc1dedea ]---
That's...
3473 if (unlikely(!PageSlab(page))) {
3474 BUG_ON(!PageCompound(page));
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: vmsplice triggering bug in kfree. 2012-06-07 2:51 vmsplice triggering bug in kfree Dave Jones @ 2012-06-07 4:27 ` Eric Dumazet 2012-06-07 4:40 ` Eric Dumazet 2012-06-07 10:36 ` [PATCH] splice: fix racy pipe->buffers uses Eric Dumazet 0 siblings, 2 replies; 7+ messages in thread From: Eric Dumazet @ 2012-06-07 4:27 UTC (permalink / raw) To: Dave Jones; +Cc: Linux Kernel, axboe On Wed, 2012-06-06 at 22:51 -0400, Dave Jones wrote: > kernel BUG at mm/slub.c:3474! > invalid opcode: 0000 [#1] PREEMPT SMP > CPU 7 > Modules linked in: ipt_ULOG tun fuse binfmt_misc nfnetlink caif_socket caif phonet bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key decnet rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode usb_debug serio_raw pcspkr i2c_i801 e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan] > Pid: 21252, comm: trinity-child7 Not tainted 3.5.0-rc1+ #74 > RIP: 0010:[<ffffffff811945ce>] [<ffffffff811945ce>] kfree+0x26e/0x270 > RSP: 0018:ffff880104065c48 EFLAGS: 00010246 > RAX: 0020000000000000 RBX: ffff880104065d18 RCX: 0000000000000000 > RDX: ffffffff7fffffff RSI: ffff880104065cf0 RDI: ffff880104065d18 > RBP: ffff880104065c78 R08: 00000000fffffff2 R09: 0000000000000000 > R10: ffffffff821e2d00 R11: 0000000000000001 R12: 0000000000000ffc > R13: ffffea0004101940 R14: 0000000000000000 R15: ffff880104065d98 > FS: 00007f5baafd3740(0000) GS:ffff880148a00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000000000ffc CR3: 0000000107181000 CR4: 00000000001407e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process trinity-child7 (pid: 21252, threadinfo ffff880104064000, task ffff8801080acd60) > Stack: > 0000000000000010 ffff880104065cf0 0000000000000ffc fffffffffffffff2 > 0000000000000000 ffff880104065d98 ffff880104065c98 ffffffff811dc9ef > 0000000000000018 0000000000000161 ffff880104065ec8 ffffffff811dcc4c > Call Trace: > [<ffffffff811dc9ef>] splice_shrink_spd+0x1f/0x30 > [<ffffffff811dcc4c>] vmsplice_to_pipe+0x24c/0x290 > [<ffffffff811db920>] ? page_cache_pipe_buf_release+0x30/0x30 > [<ffffffff810b1e7e>] ? put_lock_stats.isra.23+0xe/0x40 > [<ffffffff8164dee8>] ? _raw_spin_unlock_irqrestore+0x38/0x80 > [<ffffffff8108cd97>] ? local_clock+0x47/0x60 > [<ffffffff81078daa>] ? __hrtimer_start_range_ns+0x14a/0x530 > [<ffffffff810b1ac8>] ? trace_hardirqs_off_caller+0x28/0xc0 > [<ffffffff81078daa>] ? __hrtimer_start_range_ns+0x14a/0x530 > [<ffffffff810b1e7e>] ? put_lock_stats.isra.23+0xe/0x40 > [<ffffffff8164dee8>] ? _raw_spin_unlock_irqrestore+0x38/0x80 > [<ffffffff8108cd97>] ? local_clock+0x47/0x60 > [<ffffffff81050e0c>] ? do_setitimer+0x1cc/0x310 > [<ffffffff810b1ac8>] ? trace_hardirqs_off_caller+0x28/0xc0 > [<ffffffff81086f91>] ? get_parent_ip+0x11/0x50 > [<ffffffff81651919>] ? sub_preempt_count+0x79/0xd0 > [<ffffffff811ad4da>] ? fget_light+0x3ca/0x500 > [<ffffffff811dd90d>] sys_vmsplice+0x9d/0x210 > [<ffffffff81655937>] ? sysret_check+0x1b/0x56 > [<ffffffff81326f3e>] ? trace_hardirqs_on_thunk+0x3a/0x3f > [<ffffffff81655912>] system_call_fastpath+0x16/0x1b > Code: e8 58 ac fb ff e9 a8 fe ff ff 0f 0b 4d 8b 6d 30 e9 fe fd ff ff 4c 89 f1 48 89 da 4c 89 ee 4c 89 e7 e8 91 fd 4a 00 e9 87 fe ff ff <0f> 0b 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 89 fb 48 8b > RIP [<ffffffff811945ce>] kfree+0x26e/0x270 > RSP <ffff880104065c48> > ---[ end trace 77573bf4cc1dedea ]--- > > > That's... > > > 3473 if (unlikely(!PageSlab(page))) { > 3474 BUG_ON(!PageCompound(page)); > Thanks Dave, I'll take a look today on this report. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: vmsplice triggering bug in kfree. 2012-06-07 4:27 ` Eric Dumazet @ 2012-06-07 4:40 ` Eric Dumazet 2012-06-07 5:52 ` Eric Dumazet 2012-06-07 10:36 ` [PATCH] splice: fix racy pipe->buffers uses Eric Dumazet 1 sibling, 1 reply; 7+ messages in thread From: Eric Dumazet @ 2012-06-07 4:40 UTC (permalink / raw) To: Dave Jones; +Cc: Linux Kernel, axboe On Thu, 2012-06-07 at 06:27 +0200, Eric Dumazet wrote: > Thanks Dave, I'll take a look today on this report. > OK, trivial bug, I am testing a fix, thanks again. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: vmsplice triggering bug in kfree. 2012-06-07 4:40 ` Eric Dumazet @ 2012-06-07 5:52 ` Eric Dumazet 2012-06-07 7:47 ` Eric Dumazet 2012-06-08 18:04 ` Dave Jones 0 siblings, 2 replies; 7+ messages in thread From: Eric Dumazet @ 2012-06-07 5:52 UTC (permalink / raw) To: Dave Jones; +Cc: Linux Kernel, axboe, Tom Herbert On Thu, 2012-06-07 at 06:40 +0200, Eric Dumazet wrote: > On Thu, 2012-06-07 at 06:27 +0200, Eric Dumazet wrote: > > > Thanks Dave, I'll take a look today on this report. > > > > OK, trivial bug, I am testing a fix, thanks again. > Not sure if you can reproduce this bug easily, if so could you test following patch ? Problem is that bipe->buffers can change anytime if we dont lock the pipe mutex. So we must store in spd a new nr_pages_max field to shadow pipe->buffers for the duration of the syscall. Bug introduced in commit 35f3d14dbbc ( pipe: add support for shrinking and growing pipes ) Thanks fs/splice.c | 35 ++++++++++++++++++++--------------- include/linux/splice.h | 8 ++++---- kernel/relay.c | 5 +++-- kernel/trace/trace.c | 6 ++++-- mm/shmem.c | 3 ++- net/core/skbuff.c | 1 + 6 files changed, 34 insertions(+), 24 deletions(-) diff --git a/fs/splice.c b/fs/splice.c index c9f1318..7bf08fa 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -273,13 +273,16 @@ void spd_release_page(struct splice_pipe_desc *spd, unsigned int i) * Check if we need to grow the arrays holding pages and partial page * descriptions. */ -int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd) +int splice_grow_spd(const struct pipe_inode_info *pipe, struct splice_pipe_desc *spd) { - if (pipe->buffers <= PIPE_DEF_BUFFERS) + unsigned int buffers = ACCESS_ONCE(pipe->buffers); + + spd->nr_pages_max = buffers; + if (buffers <= PIPE_DEF_BUFFERS) return 0; - spd->pages = kmalloc(pipe->buffers * sizeof(struct page *), GFP_KERNEL); - spd->partial = kmalloc(pipe->buffers * sizeof(struct partial_page), GFP_KERNEL); + spd->pages = kmalloc(buffers * sizeof(struct page *), GFP_KERNEL); + spd->partial = kmalloc(buffers * sizeof(struct partial_page), GFP_KERNEL); if (spd->pages && spd->partial) return 0; @@ -289,10 +292,9 @@ int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd) return -ENOMEM; } -void splice_shrink_spd(struct pipe_inode_info *pipe, - struct splice_pipe_desc *spd) +void splice_shrink_spd(struct splice_pipe_desc *spd) { - if (pipe->buffers <= PIPE_DEF_BUFFERS) + if (spd->nr_pages_max <= PIPE_DEF_BUFFERS) return; kfree(spd->pages); @@ -315,6 +317,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos, struct splice_pipe_desc spd = { .pages = pages, .partial = partial, + .nr_pages_max = PIPE_DEF_BUFFERS, .flags = flags, .ops = &page_cache_pipe_buf_ops, .spd_release = spd_release_page, @@ -326,7 +329,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos, index = *ppos >> PAGE_CACHE_SHIFT; loff = *ppos & ~PAGE_CACHE_MASK; req_pages = (len + loff + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; - nr_pages = min(req_pages, pipe->buffers); + nr_pages = min(req_pages, spd.nr_pages_max); /* * Lookup the (hopefully) full range of pages we need. @@ -497,7 +500,7 @@ fill_it: if (spd.nr_pages) error = splice_to_pipe(pipe, &spd); - splice_shrink_spd(pipe, &spd); + splice_shrink_spd(&spd); return error; } @@ -598,6 +601,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos, struct splice_pipe_desc spd = { .pages = pages, .partial = partial, + .nr_pages_max = PIPE_DEF_BUFFERS, .flags = flags, .ops = &default_pipe_buf_ops, .spd_release = spd_release_page, @@ -608,8 +612,8 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos, res = -ENOMEM; vec = __vec; - if (pipe->buffers > PIPE_DEF_BUFFERS) { - vec = kmalloc(pipe->buffers * sizeof(struct iovec), GFP_KERNEL); + if (spd.nr_pages_max > PIPE_DEF_BUFFERS) { + vec = kmalloc(spd.nr_pages_max * sizeof(struct iovec), GFP_KERNEL); if (!vec) goto shrink_ret; } @@ -617,7 +621,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos, offset = *ppos & ~PAGE_CACHE_MASK; nr_pages = (len + offset + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; - for (i = 0; i < nr_pages && i < pipe->buffers && len; i++) { + for (i = 0; i < nr_pages && i < spd.nr_pages_max && len; i++) { struct page *page; page = alloc_page(GFP_USER); @@ -665,7 +669,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos, shrink_ret: if (vec != __vec) kfree(vec); - splice_shrink_spd(pipe, &spd); + splice_shrink_spd(&spd); return res; err: @@ -1614,6 +1618,7 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov, struct splice_pipe_desc spd = { .pages = pages, .partial = partial, + .nr_pages_max = PIPE_DEF_BUFFERS, .flags = flags, .ops = &user_page_pipe_buf_ops, .spd_release = spd_release_page, @@ -1629,13 +1634,13 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov, spd.nr_pages = get_iovec_page_array(iov, nr_segs, spd.pages, spd.partial, false, - pipe->buffers); + spd.nr_pages_max); if (spd.nr_pages <= 0) ret = spd.nr_pages; else ret = splice_to_pipe(pipe, &spd); - splice_shrink_spd(pipe, &spd); + splice_shrink_spd(&spd); return ret; } diff --git a/include/linux/splice.h b/include/linux/splice.h index 26e5b61..09a545a 100644 --- a/include/linux/splice.h +++ b/include/linux/splice.h @@ -51,7 +51,8 @@ struct partial_page { struct splice_pipe_desc { struct page **pages; /* page map */ struct partial_page *partial; /* pages[] may not be contig */ - int nr_pages; /* number of pages in map */ + int nr_pages; /* number of populated pages in map */ + unsigned int nr_pages_max; /* pages[] & partial[] arrays size */ unsigned int flags; /* splice flags */ const struct pipe_buf_operations *ops;/* ops associated with output pipe */ void (*spd_release)(struct splice_pipe_desc *, unsigned int); @@ -85,9 +86,8 @@ extern ssize_t splice_direct_to_actor(struct file *, struct splice_desc *, /* * for dynamic pipe sizing */ -extern int splice_grow_spd(struct pipe_inode_info *, struct splice_pipe_desc *); -extern void splice_shrink_spd(struct pipe_inode_info *, - struct splice_pipe_desc *); +extern int splice_grow_spd(const struct pipe_inode_info *, struct splice_pipe_desc *); +extern void splice_shrink_spd(struct splice_pipe_desc *); extern void spd_release_page(struct splice_pipe_desc *, unsigned int); extern const struct pipe_buf_operations page_cache_pipe_buf_ops; diff --git a/kernel/relay.c b/kernel/relay.c index ab56a17..e8cd202 100644 --- a/kernel/relay.c +++ b/kernel/relay.c @@ -1235,6 +1235,7 @@ static ssize_t subbuf_splice_actor(struct file *in, struct splice_pipe_desc spd = { .pages = pages, .nr_pages = 0, + .nr_pages_max = PIPE_DEF_BUFFERS, .partial = partial, .flags = flags, .ops = &relay_pipe_buf_ops, @@ -1302,8 +1303,8 @@ static ssize_t subbuf_splice_actor(struct file *in, ret += padding; out: - splice_shrink_spd(pipe, &spd); - return ret; + splice_shrink_spd(&spd); + return ret; } static ssize_t relay_file_splice_read(struct file *in, diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 68032c6..2884880 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3609,6 +3609,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp, .pages = pages_def, .partial = partial_def, .nr_pages = 0, /* This gets updated below. */ + .nr_pages_max = PIPE_DEF_BUFFERS, .flags = flags, .ops = &tracing_pipe_buf_ops, .spd_release = tracing_spd_release_pipe, @@ -3680,7 +3681,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp, ret = splice_to_pipe(pipe, &spd); out: - splice_shrink_spd(pipe, &spd); + splice_shrink_spd(&spd); return ret; out_err: @@ -4231,6 +4232,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, struct splice_pipe_desc spd = { .pages = pages_def, .partial = partial_def, + .nr_pages_max = PIPE_DEF_BUFFERS, .flags = flags, .ops = &buffer_pipe_buf_ops, .spd_release = buffer_spd_release, @@ -4318,7 +4320,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, } ret = splice_to_pipe(pipe, &spd); - splice_shrink_spd(pipe, &spd); + splice_shrink_spd(&spd); out: return ret; } diff --git a/mm/shmem.c b/mm/shmem.c index 585bd220..c244e93 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1577,6 +1577,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos, struct splice_pipe_desc spd = { .pages = pages, .partial = partial, + .nr_pages_max = PIPE_DEF_BUFFERS, .flags = flags, .ops = &page_cache_pipe_buf_ops, .spd_release = spd_release_page, @@ -1665,7 +1666,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos, if (spd.nr_pages) error = splice_to_pipe(pipe, &spd); - splice_shrink_spd(pipe, &spd); + splice_shrink_spd(&spd); if (error > 0) { *ppos += error; diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 016694d..bac3c57 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -1755,6 +1755,7 @@ int skb_splice_bits(struct sk_buff *skb, unsigned int offset, struct splice_pipe_desc spd = { .pages = pages, .partial = partial, + .nr_pages_max = MAX_SKB_FRAGS, .flags = flags, .ops = &sock_pipe_buf_ops, .spd_release = sock_spd_release, ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: vmsplice triggering bug in kfree. 2012-06-07 5:52 ` Eric Dumazet @ 2012-06-07 7:47 ` Eric Dumazet 2012-06-08 18:04 ` Dave Jones 1 sibling, 0 replies; 7+ messages in thread From: Eric Dumazet @ 2012-06-07 7:47 UTC (permalink / raw) To: Dave Jones; +Cc: Linux Kernel, axboe, Tom Herbert On Thu, 2012-06-07 at 07:52 +0200, Eric Dumazet wrote: > On Thu, 2012-06-07 at 06:40 +0200, Eric Dumazet wrote: > > On Thu, 2012-06-07 at 06:27 +0200, Eric Dumazet wrote: > > > > > Thanks Dave, I'll take a look today on this report. > > > > > > > OK, trivial bug, I am testing a fix, thanks again. > > > > Not sure if you can reproduce this bug easily, if so could you test > following patch ? By the way, following program triggers the bug instantly : #define __USE_GNU 1 #define _GNU_SOURCE #include <fcntl.h> #include <pthread.h> #include <sys/types.h> #include <unistd.h> #include <string.h> #include <stdlib.h> #include <stdio.h> #include <errno.h> int pfd[2]; void *worker(void *arg) { unsigned int page_size = 4096; while (1) { fcntl(pfd[1], F_SETPIPE_SZ, 16 * page_size); fcntl(pfd[1], F_SETPIPE_SZ, 64 * page_size); } } char buffer[1024*1024]; int main(int argc, char *argv[]) { pthread_t tid; int res, fdnull = open("/dev/null", O_WRONLY); if (pipe(pfd) == -1) { perror("pipe"); return 1; } res = pthread_create(&tid, NULL, worker, NULL); if (res) { errno = res; perror("pthread_create"); return 1; } while (1) { struct iovec iov[1]; int wr; iov[0].iov_base = buffer; iov[0].iov_len = 1024*1024; wr = vmsplice(pfd[1], iov, 1, SPLICE_F_GIFT); if (wr > 0) { wr = splice(pfd[0], NULL, fdnull, NULL, wr, SPLICE_F_MOVE | SPLICE_F_MORE); } } } ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: vmsplice triggering bug in kfree. 2012-06-07 5:52 ` Eric Dumazet 2012-06-07 7:47 ` Eric Dumazet @ 2012-06-08 18:04 ` Dave Jones 1 sibling, 0 replies; 7+ messages in thread From: Dave Jones @ 2012-06-08 18:04 UTC (permalink / raw) To: Eric Dumazet; +Cc: Linux Kernel, axboe, Tom Herbert On Thu, Jun 07, 2012 at 07:52:45AM +0200, Eric Dumazet wrote: > On Thu, 2012-06-07 at 06:40 +0200, Eric Dumazet wrote: > > On Thu, 2012-06-07 at 06:27 +0200, Eric Dumazet wrote: > > > > > Thanks Dave, I'll take a look today on this report. > > > > > > > OK, trivial bug, I am testing a fix, thanks again. > > > > Not sure if you can reproduce this bug easily, if so could you test > following patch ? haven't seen this reoccur since applying this patch, so it looks good to me. Tested-by: Dave Jones <davej@redhat.com> ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] splice: fix racy pipe->buffers uses 2012-06-07 4:27 ` Eric Dumazet 2012-06-07 4:40 ` Eric Dumazet @ 2012-06-07 10:36 ` Eric Dumazet 1 sibling, 0 replies; 7+ messages in thread From: Eric Dumazet @ 2012-06-07 10:36 UTC (permalink / raw) To: Dave Jones; +Cc: Linux Kernel, axboe, Tom Herbert, Alexander Viro From: Eric Dumazet <edumazet@google.com> Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered by splice_shrink_spd() called from vmsplice_to_pipe() commit 35f3d14dbbc5 (pipe: add support for shrinking and growing pipes) added capability to adjust pipe->buffers. Problem is some paths don't hold pipe mutex and assume pipe->buffers doesn't change for their duration. Fix this by adding nr_pages_max field in struct splice_pipe_desc, and use it in place of pipe->buffers where appropriate. splice_shrink_spd() loses its struct pipe_inode_info argument. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Tom Herbert <therbert@google.com> Cc: stable <stable@vger.kernel.org> # 2.6.35 --- fs/splice.c | 35 ++++++++++++++++++++--------------- include/linux/splice.h | 8 ++++---- kernel/relay.c | 5 +++-- kernel/trace/trace.c | 6 ++++-- mm/shmem.c | 3 ++- net/core/skbuff.c | 1 + 6 files changed, 34 insertions(+), 24 deletions(-) diff --git a/fs/splice.c b/fs/splice.c index c9f1318..7bf08fa 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -273,13 +273,16 @@ void spd_release_page(struct splice_pipe_desc *spd, unsigned int i) * Check if we need to grow the arrays holding pages and partial page * descriptions. */ -int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd) +int splice_grow_spd(const struct pipe_inode_info *pipe, struct splice_pipe_desc *spd) { - if (pipe->buffers <= PIPE_DEF_BUFFERS) + unsigned int buffers = ACCESS_ONCE(pipe->buffers); + + spd->nr_pages_max = buffers; + if (buffers <= PIPE_DEF_BUFFERS) return 0; - spd->pages = kmalloc(pipe->buffers * sizeof(struct page *), GFP_KERNEL); - spd->partial = kmalloc(pipe->buffers * sizeof(struct partial_page), GFP_KERNEL); + spd->pages = kmalloc(buffers * sizeof(struct page *), GFP_KERNEL); + spd->partial = kmalloc(buffers * sizeof(struct partial_page), GFP_KERNEL); if (spd->pages && spd->partial) return 0; @@ -289,10 +292,9 @@ int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd) return -ENOMEM; } -void splice_shrink_spd(struct pipe_inode_info *pipe, - struct splice_pipe_desc *spd) +void splice_shrink_spd(struct splice_pipe_desc *spd) { - if (pipe->buffers <= PIPE_DEF_BUFFERS) + if (spd->nr_pages_max <= PIPE_DEF_BUFFERS) return; kfree(spd->pages); @@ -315,6 +317,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos, struct splice_pipe_desc spd = { .pages = pages, .partial = partial, + .nr_pages_max = PIPE_DEF_BUFFERS, .flags = flags, .ops = &page_cache_pipe_buf_ops, .spd_release = spd_release_page, @@ -326,7 +329,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos, index = *ppos >> PAGE_CACHE_SHIFT; loff = *ppos & ~PAGE_CACHE_MASK; req_pages = (len + loff + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; - nr_pages = min(req_pages, pipe->buffers); + nr_pages = min(req_pages, spd.nr_pages_max); /* * Lookup the (hopefully) full range of pages we need. @@ -497,7 +500,7 @@ fill_it: if (spd.nr_pages) error = splice_to_pipe(pipe, &spd); - splice_shrink_spd(pipe, &spd); + splice_shrink_spd(&spd); return error; } @@ -598,6 +601,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos, struct splice_pipe_desc spd = { .pages = pages, .partial = partial, + .nr_pages_max = PIPE_DEF_BUFFERS, .flags = flags, .ops = &default_pipe_buf_ops, .spd_release = spd_release_page, @@ -608,8 +612,8 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos, res = -ENOMEM; vec = __vec; - if (pipe->buffers > PIPE_DEF_BUFFERS) { - vec = kmalloc(pipe->buffers * sizeof(struct iovec), GFP_KERNEL); + if (spd.nr_pages_max > PIPE_DEF_BUFFERS) { + vec = kmalloc(spd.nr_pages_max * sizeof(struct iovec), GFP_KERNEL); if (!vec) goto shrink_ret; } @@ -617,7 +621,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos, offset = *ppos & ~PAGE_CACHE_MASK; nr_pages = (len + offset + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; - for (i = 0; i < nr_pages && i < pipe->buffers && len; i++) { + for (i = 0; i < nr_pages && i < spd.nr_pages_max && len; i++) { struct page *page; page = alloc_page(GFP_USER); @@ -665,7 +669,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos, shrink_ret: if (vec != __vec) kfree(vec); - splice_shrink_spd(pipe, &spd); + splice_shrink_spd(&spd); return res; err: @@ -1614,6 +1618,7 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov, struct splice_pipe_desc spd = { .pages = pages, .partial = partial, + .nr_pages_max = PIPE_DEF_BUFFERS, .flags = flags, .ops = &user_page_pipe_buf_ops, .spd_release = spd_release_page, @@ -1629,13 +1634,13 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov, spd.nr_pages = get_iovec_page_array(iov, nr_segs, spd.pages, spd.partial, false, - pipe->buffers); + spd.nr_pages_max); if (spd.nr_pages <= 0) ret = spd.nr_pages; else ret = splice_to_pipe(pipe, &spd); - splice_shrink_spd(pipe, &spd); + splice_shrink_spd(&spd); return ret; } diff --git a/include/linux/splice.h b/include/linux/splice.h index 26e5b61..09a545a 100644 --- a/include/linux/splice.h +++ b/include/linux/splice.h @@ -51,7 +51,8 @@ struct partial_page { struct splice_pipe_desc { struct page **pages; /* page map */ struct partial_page *partial; /* pages[] may not be contig */ - int nr_pages; /* number of pages in map */ + int nr_pages; /* number of populated pages in map */ + unsigned int nr_pages_max; /* pages[] & partial[] arrays size */ unsigned int flags; /* splice flags */ const struct pipe_buf_operations *ops;/* ops associated with output pipe */ void (*spd_release)(struct splice_pipe_desc *, unsigned int); @@ -85,9 +86,8 @@ extern ssize_t splice_direct_to_actor(struct file *, struct splice_desc *, /* * for dynamic pipe sizing */ -extern int splice_grow_spd(struct pipe_inode_info *, struct splice_pipe_desc *); -extern void splice_shrink_spd(struct pipe_inode_info *, - struct splice_pipe_desc *); +extern int splice_grow_spd(const struct pipe_inode_info *, struct splice_pipe_desc *); +extern void splice_shrink_spd(struct splice_pipe_desc *); extern void spd_release_page(struct splice_pipe_desc *, unsigned int); extern const struct pipe_buf_operations page_cache_pipe_buf_ops; diff --git a/kernel/relay.c b/kernel/relay.c index ab56a17..e8cd202 100644 --- a/kernel/relay.c +++ b/kernel/relay.c @@ -1235,6 +1235,7 @@ static ssize_t subbuf_splice_actor(struct file *in, struct splice_pipe_desc spd = { .pages = pages, .nr_pages = 0, + .nr_pages_max = PIPE_DEF_BUFFERS, .partial = partial, .flags = flags, .ops = &relay_pipe_buf_ops, @@ -1302,8 +1303,8 @@ static ssize_t subbuf_splice_actor(struct file *in, ret += padding; out: - splice_shrink_spd(pipe, &spd); - return ret; + splice_shrink_spd(&spd); + return ret; } static ssize_t relay_file_splice_read(struct file *in, diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 68032c6..2884880 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3609,6 +3609,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp, .pages = pages_def, .partial = partial_def, .nr_pages = 0, /* This gets updated below. */ + .nr_pages_max = PIPE_DEF_BUFFERS, .flags = flags, .ops = &tracing_pipe_buf_ops, .spd_release = tracing_spd_release_pipe, @@ -3680,7 +3681,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp, ret = splice_to_pipe(pipe, &spd); out: - splice_shrink_spd(pipe, &spd); + splice_shrink_spd(&spd); return ret; out_err: @@ -4231,6 +4232,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, struct splice_pipe_desc spd = { .pages = pages_def, .partial = partial_def, + .nr_pages_max = PIPE_DEF_BUFFERS, .flags = flags, .ops = &buffer_pipe_buf_ops, .spd_release = buffer_spd_release, @@ -4318,7 +4320,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos, } ret = splice_to_pipe(pipe, &spd); - splice_shrink_spd(pipe, &spd); + splice_shrink_spd(&spd); out: return ret; } diff --git a/mm/shmem.c b/mm/shmem.c index 585bd220..c244e93 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1577,6 +1577,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos, struct splice_pipe_desc spd = { .pages = pages, .partial = partial, + .nr_pages_max = PIPE_DEF_BUFFERS, .flags = flags, .ops = &page_cache_pipe_buf_ops, .spd_release = spd_release_page, @@ -1665,7 +1666,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos, if (spd.nr_pages) error = splice_to_pipe(pipe, &spd); - splice_shrink_spd(pipe, &spd); + splice_shrink_spd(&spd); if (error > 0) { *ppos += error; diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 016694d..bac3c57 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -1755,6 +1755,7 @@ int skb_splice_bits(struct sk_buff *skb, unsigned int offset, struct splice_pipe_desc spd = { .pages = pages, .partial = partial, + .nr_pages_max = MAX_SKB_FRAGS, .flags = flags, .ops = &sock_pipe_buf_ops, .spd_release = sock_spd_release, ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-06-08 18:31 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-06-07 2:51 vmsplice triggering bug in kfree Dave Jones 2012-06-07 4:27 ` Eric Dumazet 2012-06-07 4:40 ` Eric Dumazet 2012-06-07 5:52 ` Eric Dumazet 2012-06-07 7:47 ` Eric Dumazet 2012-06-08 18:04 ` Dave Jones 2012-06-07 10:36 ` [PATCH] splice: fix racy pipe->buffers uses Eric Dumazet
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox