* vmsplice triggering bug in kfree.
@ 2012-06-07 2:51 Dave Jones
2012-06-07 4:27 ` Eric Dumazet
0 siblings, 1 reply; 7+ messages in thread
From: Dave Jones @ 2012-06-07 2:51 UTC (permalink / raw)
To: Linux Kernel; +Cc: axboe
kernel BUG at mm/slub.c:3474!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU 7
Modules linked in: ipt_ULOG tun fuse binfmt_misc nfnetlink caif_socket caif phonet bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key decnet rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode usb_debug serio_raw pcspkr i2c_i801 e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]
Pid: 21252, comm: trinity-child7 Not tainted 3.5.0-rc1+ #74
RIP: 0010:[<ffffffff811945ce>] [<ffffffff811945ce>] kfree+0x26e/0x270
RSP: 0018:ffff880104065c48 EFLAGS: 00010246
RAX: 0020000000000000 RBX: ffff880104065d18 RCX: 0000000000000000
RDX: ffffffff7fffffff RSI: ffff880104065cf0 RDI: ffff880104065d18
RBP: ffff880104065c78 R08: 00000000fffffff2 R09: 0000000000000000
R10: ffffffff821e2d00 R11: 0000000000000001 R12: 0000000000000ffc
R13: ffffea0004101940 R14: 0000000000000000 R15: ffff880104065d98
FS: 00007f5baafd3740(0000) GS:ffff880148a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000ffc CR3: 0000000107181000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process trinity-child7 (pid: 21252, threadinfo ffff880104064000, task ffff8801080acd60)
Stack:
0000000000000010 ffff880104065cf0 0000000000000ffc fffffffffffffff2
0000000000000000 ffff880104065d98 ffff880104065c98 ffffffff811dc9ef
0000000000000018 0000000000000161 ffff880104065ec8 ffffffff811dcc4c
Call Trace:
[<ffffffff811dc9ef>] splice_shrink_spd+0x1f/0x30
[<ffffffff811dcc4c>] vmsplice_to_pipe+0x24c/0x290
[<ffffffff811db920>] ? page_cache_pipe_buf_release+0x30/0x30
[<ffffffff810b1e7e>] ? put_lock_stats.isra.23+0xe/0x40
[<ffffffff8164dee8>] ? _raw_spin_unlock_irqrestore+0x38/0x80
[<ffffffff8108cd97>] ? local_clock+0x47/0x60
[<ffffffff81078daa>] ? __hrtimer_start_range_ns+0x14a/0x530
[<ffffffff810b1ac8>] ? trace_hardirqs_off_caller+0x28/0xc0
[<ffffffff81078daa>] ? __hrtimer_start_range_ns+0x14a/0x530
[<ffffffff810b1e7e>] ? put_lock_stats.isra.23+0xe/0x40
[<ffffffff8164dee8>] ? _raw_spin_unlock_irqrestore+0x38/0x80
[<ffffffff8108cd97>] ? local_clock+0x47/0x60
[<ffffffff81050e0c>] ? do_setitimer+0x1cc/0x310
[<ffffffff810b1ac8>] ? trace_hardirqs_off_caller+0x28/0xc0
[<ffffffff81086f91>] ? get_parent_ip+0x11/0x50
[<ffffffff81651919>] ? sub_preempt_count+0x79/0xd0
[<ffffffff811ad4da>] ? fget_light+0x3ca/0x500
[<ffffffff811dd90d>] sys_vmsplice+0x9d/0x210
[<ffffffff81655937>] ? sysret_check+0x1b/0x56
[<ffffffff81326f3e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff81655912>] system_call_fastpath+0x16/0x1b
Code: e8 58 ac fb ff e9 a8 fe ff ff 0f 0b 4d 8b 6d 30 e9 fe fd ff ff 4c 89 f1 48 89 da 4c 89 ee 4c 89 e7 e8 91 fd 4a 00 e9 87 fe ff ff <0f> 0b 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 89 fb 48 8b
RIP [<ffffffff811945ce>] kfree+0x26e/0x270
RSP <ffff880104065c48>
---[ end trace 77573bf4cc1dedea ]---
That's...
3473 if (unlikely(!PageSlab(page))) {
3474 BUG_ON(!PageCompound(page));
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: vmsplice triggering bug in kfree.
2012-06-07 2:51 vmsplice triggering bug in kfree Dave Jones
@ 2012-06-07 4:27 ` Eric Dumazet
2012-06-07 4:40 ` Eric Dumazet
2012-06-07 10:36 ` [PATCH] splice: fix racy pipe->buffers uses Eric Dumazet
0 siblings, 2 replies; 7+ messages in thread
From: Eric Dumazet @ 2012-06-07 4:27 UTC (permalink / raw)
To: Dave Jones; +Cc: Linux Kernel, axboe
On Wed, 2012-06-06 at 22:51 -0400, Dave Jones wrote:
> kernel BUG at mm/slub.c:3474!
> invalid opcode: 0000 [#1] PREEMPT SMP
> CPU 7
> Modules linked in: ipt_ULOG tun fuse binfmt_misc nfnetlink caif_socket caif phonet bluetooth rfkill can llc2 pppoe pppox ppp_generic slhc irda crc_ccitt rds af_key decnet rose x25 atm netrom appletalk ipx p8023 psnap p8022 llc ax25 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode usb_debug serio_raw pcspkr i2c_i801 e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan]
> Pid: 21252, comm: trinity-child7 Not tainted 3.5.0-rc1+ #74
> RIP: 0010:[<ffffffff811945ce>] [<ffffffff811945ce>] kfree+0x26e/0x270
> RSP: 0018:ffff880104065c48 EFLAGS: 00010246
> RAX: 0020000000000000 RBX: ffff880104065d18 RCX: 0000000000000000
> RDX: ffffffff7fffffff RSI: ffff880104065cf0 RDI: ffff880104065d18
> RBP: ffff880104065c78 R08: 00000000fffffff2 R09: 0000000000000000
> R10: ffffffff821e2d00 R11: 0000000000000001 R12: 0000000000000ffc
> R13: ffffea0004101940 R14: 0000000000000000 R15: ffff880104065d98
> FS: 00007f5baafd3740(0000) GS:ffff880148a00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000ffc CR3: 0000000107181000 CR4: 00000000001407e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process trinity-child7 (pid: 21252, threadinfo ffff880104064000, task ffff8801080acd60)
> Stack:
> 0000000000000010 ffff880104065cf0 0000000000000ffc fffffffffffffff2
> 0000000000000000 ffff880104065d98 ffff880104065c98 ffffffff811dc9ef
> 0000000000000018 0000000000000161 ffff880104065ec8 ffffffff811dcc4c
> Call Trace:
> [<ffffffff811dc9ef>] splice_shrink_spd+0x1f/0x30
> [<ffffffff811dcc4c>] vmsplice_to_pipe+0x24c/0x290
> [<ffffffff811db920>] ? page_cache_pipe_buf_release+0x30/0x30
> [<ffffffff810b1e7e>] ? put_lock_stats.isra.23+0xe/0x40
> [<ffffffff8164dee8>] ? _raw_spin_unlock_irqrestore+0x38/0x80
> [<ffffffff8108cd97>] ? local_clock+0x47/0x60
> [<ffffffff81078daa>] ? __hrtimer_start_range_ns+0x14a/0x530
> [<ffffffff810b1ac8>] ? trace_hardirqs_off_caller+0x28/0xc0
> [<ffffffff81078daa>] ? __hrtimer_start_range_ns+0x14a/0x530
> [<ffffffff810b1e7e>] ? put_lock_stats.isra.23+0xe/0x40
> [<ffffffff8164dee8>] ? _raw_spin_unlock_irqrestore+0x38/0x80
> [<ffffffff8108cd97>] ? local_clock+0x47/0x60
> [<ffffffff81050e0c>] ? do_setitimer+0x1cc/0x310
> [<ffffffff810b1ac8>] ? trace_hardirqs_off_caller+0x28/0xc0
> [<ffffffff81086f91>] ? get_parent_ip+0x11/0x50
> [<ffffffff81651919>] ? sub_preempt_count+0x79/0xd0
> [<ffffffff811ad4da>] ? fget_light+0x3ca/0x500
> [<ffffffff811dd90d>] sys_vmsplice+0x9d/0x210
> [<ffffffff81655937>] ? sysret_check+0x1b/0x56
> [<ffffffff81326f3e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [<ffffffff81655912>] system_call_fastpath+0x16/0x1b
> Code: e8 58 ac fb ff e9 a8 fe ff ff 0f 0b 4d 8b 6d 30 e9 fe fd ff ff 4c 89 f1 48 89 da 4c 89 ee 4c 89 e7 e8 91 fd 4a 00 e9 87 fe ff ff <0f> 0b 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 89 fb 48 8b
> RIP [<ffffffff811945ce>] kfree+0x26e/0x270
> RSP <ffff880104065c48>
> ---[ end trace 77573bf4cc1dedea ]---
>
>
> That's...
>
>
> 3473 if (unlikely(!PageSlab(page))) {
> 3474 BUG_ON(!PageCompound(page));
>
Thanks Dave, I'll take a look today on this report.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: vmsplice triggering bug in kfree.
2012-06-07 4:27 ` Eric Dumazet
@ 2012-06-07 4:40 ` Eric Dumazet
2012-06-07 5:52 ` Eric Dumazet
2012-06-07 10:36 ` [PATCH] splice: fix racy pipe->buffers uses Eric Dumazet
1 sibling, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2012-06-07 4:40 UTC (permalink / raw)
To: Dave Jones; +Cc: Linux Kernel, axboe
On Thu, 2012-06-07 at 06:27 +0200, Eric Dumazet wrote:
> Thanks Dave, I'll take a look today on this report.
>
OK, trivial bug, I am testing a fix, thanks again.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: vmsplice triggering bug in kfree.
2012-06-07 4:40 ` Eric Dumazet
@ 2012-06-07 5:52 ` Eric Dumazet
2012-06-07 7:47 ` Eric Dumazet
2012-06-08 18:04 ` Dave Jones
0 siblings, 2 replies; 7+ messages in thread
From: Eric Dumazet @ 2012-06-07 5:52 UTC (permalink / raw)
To: Dave Jones; +Cc: Linux Kernel, axboe, Tom Herbert
On Thu, 2012-06-07 at 06:40 +0200, Eric Dumazet wrote:
> On Thu, 2012-06-07 at 06:27 +0200, Eric Dumazet wrote:
>
> > Thanks Dave, I'll take a look today on this report.
> >
>
> OK, trivial bug, I am testing a fix, thanks again.
>
Not sure if you can reproduce this bug easily, if so could you test
following patch ?
Problem is that bipe->buffers can change anytime if we dont lock the
pipe mutex. So we must store in spd a new nr_pages_max field to shadow
pipe->buffers for the duration of the syscall.
Bug introduced in commit 35f3d14dbbc ( pipe: add support for shrinking
and growing pipes )
Thanks
fs/splice.c | 35 ++++++++++++++++++++---------------
include/linux/splice.h | 8 ++++----
kernel/relay.c | 5 +++--
kernel/trace/trace.c | 6 ++++--
mm/shmem.c | 3 ++-
net/core/skbuff.c | 1 +
6 files changed, 34 insertions(+), 24 deletions(-)
diff --git a/fs/splice.c b/fs/splice.c
index c9f1318..7bf08fa 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -273,13 +273,16 @@ void spd_release_page(struct splice_pipe_desc *spd, unsigned int i)
* Check if we need to grow the arrays holding pages and partial page
* descriptions.
*/
-int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
+int splice_grow_spd(const struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
{
- if (pipe->buffers <= PIPE_DEF_BUFFERS)
+ unsigned int buffers = ACCESS_ONCE(pipe->buffers);
+
+ spd->nr_pages_max = buffers;
+ if (buffers <= PIPE_DEF_BUFFERS)
return 0;
- spd->pages = kmalloc(pipe->buffers * sizeof(struct page *), GFP_KERNEL);
- spd->partial = kmalloc(pipe->buffers * sizeof(struct partial_page), GFP_KERNEL);
+ spd->pages = kmalloc(buffers * sizeof(struct page *), GFP_KERNEL);
+ spd->partial = kmalloc(buffers * sizeof(struct partial_page), GFP_KERNEL);
if (spd->pages && spd->partial)
return 0;
@@ -289,10 +292,9 @@ int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
return -ENOMEM;
}
-void splice_shrink_spd(struct pipe_inode_info *pipe,
- struct splice_pipe_desc *spd)
+void splice_shrink_spd(struct splice_pipe_desc *spd)
{
- if (pipe->buffers <= PIPE_DEF_BUFFERS)
+ if (spd->nr_pages_max <= PIPE_DEF_BUFFERS)
return;
kfree(spd->pages);
@@ -315,6 +317,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &page_cache_pipe_buf_ops,
.spd_release = spd_release_page,
@@ -326,7 +329,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
index = *ppos >> PAGE_CACHE_SHIFT;
loff = *ppos & ~PAGE_CACHE_MASK;
req_pages = (len + loff + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
- nr_pages = min(req_pages, pipe->buffers);
+ nr_pages = min(req_pages, spd.nr_pages_max);
/*
* Lookup the (hopefully) full range of pages we need.
@@ -497,7 +500,7 @@ fill_it:
if (spd.nr_pages)
error = splice_to_pipe(pipe, &spd);
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return error;
}
@@ -598,6 +601,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &default_pipe_buf_ops,
.spd_release = spd_release_page,
@@ -608,8 +612,8 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
res = -ENOMEM;
vec = __vec;
- if (pipe->buffers > PIPE_DEF_BUFFERS) {
- vec = kmalloc(pipe->buffers * sizeof(struct iovec), GFP_KERNEL);
+ if (spd.nr_pages_max > PIPE_DEF_BUFFERS) {
+ vec = kmalloc(spd.nr_pages_max * sizeof(struct iovec), GFP_KERNEL);
if (!vec)
goto shrink_ret;
}
@@ -617,7 +621,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
offset = *ppos & ~PAGE_CACHE_MASK;
nr_pages = (len + offset + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
- for (i = 0; i < nr_pages && i < pipe->buffers && len; i++) {
+ for (i = 0; i < nr_pages && i < spd.nr_pages_max && len; i++) {
struct page *page;
page = alloc_page(GFP_USER);
@@ -665,7 +669,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
shrink_ret:
if (vec != __vec)
kfree(vec);
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return res;
err:
@@ -1614,6 +1618,7 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov,
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &user_page_pipe_buf_ops,
.spd_release = spd_release_page,
@@ -1629,13 +1634,13 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov,
spd.nr_pages = get_iovec_page_array(iov, nr_segs, spd.pages,
spd.partial, false,
- pipe->buffers);
+ spd.nr_pages_max);
if (spd.nr_pages <= 0)
ret = spd.nr_pages;
else
ret = splice_to_pipe(pipe, &spd);
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return ret;
}
diff --git a/include/linux/splice.h b/include/linux/splice.h
index 26e5b61..09a545a 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -51,7 +51,8 @@ struct partial_page {
struct splice_pipe_desc {
struct page **pages; /* page map */
struct partial_page *partial; /* pages[] may not be contig */
- int nr_pages; /* number of pages in map */
+ int nr_pages; /* number of populated pages in map */
+ unsigned int nr_pages_max; /* pages[] & partial[] arrays size */
unsigned int flags; /* splice flags */
const struct pipe_buf_operations *ops;/* ops associated with output pipe */
void (*spd_release)(struct splice_pipe_desc *, unsigned int);
@@ -85,9 +86,8 @@ extern ssize_t splice_direct_to_actor(struct file *, struct splice_desc *,
/*
* for dynamic pipe sizing
*/
-extern int splice_grow_spd(struct pipe_inode_info *, struct splice_pipe_desc *);
-extern void splice_shrink_spd(struct pipe_inode_info *,
- struct splice_pipe_desc *);
+extern int splice_grow_spd(const struct pipe_inode_info *, struct splice_pipe_desc *);
+extern void splice_shrink_spd(struct splice_pipe_desc *);
extern void spd_release_page(struct splice_pipe_desc *, unsigned int);
extern const struct pipe_buf_operations page_cache_pipe_buf_ops;
diff --git a/kernel/relay.c b/kernel/relay.c
index ab56a17..e8cd202 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -1235,6 +1235,7 @@ static ssize_t subbuf_splice_actor(struct file *in,
struct splice_pipe_desc spd = {
.pages = pages,
.nr_pages = 0,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.partial = partial,
.flags = flags,
.ops = &relay_pipe_buf_ops,
@@ -1302,8 +1303,8 @@ static ssize_t subbuf_splice_actor(struct file *in,
ret += padding;
out:
- splice_shrink_spd(pipe, &spd);
- return ret;
+ splice_shrink_spd(&spd);
+ return ret;
}
static ssize_t relay_file_splice_read(struct file *in,
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 68032c6..2884880 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3609,6 +3609,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp,
.pages = pages_def,
.partial = partial_def,
.nr_pages = 0, /* This gets updated below. */
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &tracing_pipe_buf_ops,
.spd_release = tracing_spd_release_pipe,
@@ -3680,7 +3681,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp,
ret = splice_to_pipe(pipe, &spd);
out:
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return ret;
out_err:
@@ -4231,6 +4232,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
struct splice_pipe_desc spd = {
.pages = pages_def,
.partial = partial_def,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &buffer_pipe_buf_ops,
.spd_release = buffer_spd_release,
@@ -4318,7 +4320,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
}
ret = splice_to_pipe(pipe, &spd);
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
out:
return ret;
}
diff --git a/mm/shmem.c b/mm/shmem.c
index 585bd220..c244e93 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1577,6 +1577,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &page_cache_pipe_buf_ops,
.spd_release = spd_release_page,
@@ -1665,7 +1666,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
if (spd.nr_pages)
error = splice_to_pipe(pipe, &spd);
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
if (error > 0) {
*ppos += error;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 016694d..bac3c57 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1755,6 +1755,7 @@ int skb_splice_bits(struct sk_buff *skb, unsigned int offset,
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = MAX_SKB_FRAGS,
.flags = flags,
.ops = &sock_pipe_buf_ops,
.spd_release = sock_spd_release,
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: vmsplice triggering bug in kfree.
2012-06-07 5:52 ` Eric Dumazet
@ 2012-06-07 7:47 ` Eric Dumazet
2012-06-08 18:04 ` Dave Jones
1 sibling, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2012-06-07 7:47 UTC (permalink / raw)
To: Dave Jones; +Cc: Linux Kernel, axboe, Tom Herbert
On Thu, 2012-06-07 at 07:52 +0200, Eric Dumazet wrote:
> On Thu, 2012-06-07 at 06:40 +0200, Eric Dumazet wrote:
> > On Thu, 2012-06-07 at 06:27 +0200, Eric Dumazet wrote:
> >
> > > Thanks Dave, I'll take a look today on this report.
> > >
> >
> > OK, trivial bug, I am testing a fix, thanks again.
> >
>
> Not sure if you can reproduce this bug easily, if so could you test
> following patch ?
By the way, following program triggers the bug instantly :
#define __USE_GNU 1
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
int pfd[2];
void *worker(void *arg)
{
unsigned int page_size = 4096;
while (1) {
fcntl(pfd[1], F_SETPIPE_SZ, 16 * page_size);
fcntl(pfd[1], F_SETPIPE_SZ, 64 * page_size);
}
}
char buffer[1024*1024];
int main(int argc, char *argv[])
{
pthread_t tid;
int res, fdnull = open("/dev/null", O_WRONLY);
if (pipe(pfd) == -1) {
perror("pipe");
return 1;
}
res = pthread_create(&tid, NULL, worker, NULL);
if (res) {
errno = res;
perror("pthread_create");
return 1;
}
while (1) {
struct iovec iov[1];
int wr;
iov[0].iov_base = buffer;
iov[0].iov_len = 1024*1024;
wr = vmsplice(pfd[1], iov, 1, SPLICE_F_GIFT);
if (wr > 0) {
wr = splice(pfd[0], NULL,
fdnull, NULL,
wr, SPLICE_F_MOVE | SPLICE_F_MORE);
}
}
}
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] splice: fix racy pipe->buffers uses
2012-06-07 4:27 ` Eric Dumazet
2012-06-07 4:40 ` Eric Dumazet
@ 2012-06-07 10:36 ` Eric Dumazet
1 sibling, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2012-06-07 10:36 UTC (permalink / raw)
To: Dave Jones; +Cc: Linux Kernel, axboe, Tom Herbert, Alexander Viro
From: Eric Dumazet <edumazet@google.com>
Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered
by splice_shrink_spd() called from vmsplice_to_pipe()
commit 35f3d14dbbc5 (pipe: add support for shrinking and growing pipes)
added capability to adjust pipe->buffers.
Problem is some paths don't hold pipe mutex and assume pipe->buffers
doesn't change for their duration.
Fix this by adding nr_pages_max field in struct splice_pipe_desc, and
use it in place of pipe->buffers where appropriate.
splice_shrink_spd() loses its struct pipe_inode_info argument.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Tom Herbert <therbert@google.com>
Cc: stable <stable@vger.kernel.org> # 2.6.35
---
fs/splice.c | 35 ++++++++++++++++++++---------------
include/linux/splice.h | 8 ++++----
kernel/relay.c | 5 +++--
kernel/trace/trace.c | 6 ++++--
mm/shmem.c | 3 ++-
net/core/skbuff.c | 1 +
6 files changed, 34 insertions(+), 24 deletions(-)
diff --git a/fs/splice.c b/fs/splice.c
index c9f1318..7bf08fa 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -273,13 +273,16 @@ void spd_release_page(struct splice_pipe_desc *spd, unsigned int i)
* Check if we need to grow the arrays holding pages and partial page
* descriptions.
*/
-int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
+int splice_grow_spd(const struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
{
- if (pipe->buffers <= PIPE_DEF_BUFFERS)
+ unsigned int buffers = ACCESS_ONCE(pipe->buffers);
+
+ spd->nr_pages_max = buffers;
+ if (buffers <= PIPE_DEF_BUFFERS)
return 0;
- spd->pages = kmalloc(pipe->buffers * sizeof(struct page *), GFP_KERNEL);
- spd->partial = kmalloc(pipe->buffers * sizeof(struct partial_page), GFP_KERNEL);
+ spd->pages = kmalloc(buffers * sizeof(struct page *), GFP_KERNEL);
+ spd->partial = kmalloc(buffers * sizeof(struct partial_page), GFP_KERNEL);
if (spd->pages && spd->partial)
return 0;
@@ -289,10 +292,9 @@ int splice_grow_spd(struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
return -ENOMEM;
}
-void splice_shrink_spd(struct pipe_inode_info *pipe,
- struct splice_pipe_desc *spd)
+void splice_shrink_spd(struct splice_pipe_desc *spd)
{
- if (pipe->buffers <= PIPE_DEF_BUFFERS)
+ if (spd->nr_pages_max <= PIPE_DEF_BUFFERS)
return;
kfree(spd->pages);
@@ -315,6 +317,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &page_cache_pipe_buf_ops,
.spd_release = spd_release_page,
@@ -326,7 +329,7 @@ __generic_file_splice_read(struct file *in, loff_t *ppos,
index = *ppos >> PAGE_CACHE_SHIFT;
loff = *ppos & ~PAGE_CACHE_MASK;
req_pages = (len + loff + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
- nr_pages = min(req_pages, pipe->buffers);
+ nr_pages = min(req_pages, spd.nr_pages_max);
/*
* Lookup the (hopefully) full range of pages we need.
@@ -497,7 +500,7 @@ fill_it:
if (spd.nr_pages)
error = splice_to_pipe(pipe, &spd);
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return error;
}
@@ -598,6 +601,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &default_pipe_buf_ops,
.spd_release = spd_release_page,
@@ -608,8 +612,8 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
res = -ENOMEM;
vec = __vec;
- if (pipe->buffers > PIPE_DEF_BUFFERS) {
- vec = kmalloc(pipe->buffers * sizeof(struct iovec), GFP_KERNEL);
+ if (spd.nr_pages_max > PIPE_DEF_BUFFERS) {
+ vec = kmalloc(spd.nr_pages_max * sizeof(struct iovec), GFP_KERNEL);
if (!vec)
goto shrink_ret;
}
@@ -617,7 +621,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
offset = *ppos & ~PAGE_CACHE_MASK;
nr_pages = (len + offset + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
- for (i = 0; i < nr_pages && i < pipe->buffers && len; i++) {
+ for (i = 0; i < nr_pages && i < spd.nr_pages_max && len; i++) {
struct page *page;
page = alloc_page(GFP_USER);
@@ -665,7 +669,7 @@ ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
shrink_ret:
if (vec != __vec)
kfree(vec);
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return res;
err:
@@ -1614,6 +1618,7 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov,
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &user_page_pipe_buf_ops,
.spd_release = spd_release_page,
@@ -1629,13 +1634,13 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov,
spd.nr_pages = get_iovec_page_array(iov, nr_segs, spd.pages,
spd.partial, false,
- pipe->buffers);
+ spd.nr_pages_max);
if (spd.nr_pages <= 0)
ret = spd.nr_pages;
else
ret = splice_to_pipe(pipe, &spd);
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return ret;
}
diff --git a/include/linux/splice.h b/include/linux/splice.h
index 26e5b61..09a545a 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -51,7 +51,8 @@ struct partial_page {
struct splice_pipe_desc {
struct page **pages; /* page map */
struct partial_page *partial; /* pages[] may not be contig */
- int nr_pages; /* number of pages in map */
+ int nr_pages; /* number of populated pages in map */
+ unsigned int nr_pages_max; /* pages[] & partial[] arrays size */
unsigned int flags; /* splice flags */
const struct pipe_buf_operations *ops;/* ops associated with output pipe */
void (*spd_release)(struct splice_pipe_desc *, unsigned int);
@@ -85,9 +86,8 @@ extern ssize_t splice_direct_to_actor(struct file *, struct splice_desc *,
/*
* for dynamic pipe sizing
*/
-extern int splice_grow_spd(struct pipe_inode_info *, struct splice_pipe_desc *);
-extern void splice_shrink_spd(struct pipe_inode_info *,
- struct splice_pipe_desc *);
+extern int splice_grow_spd(const struct pipe_inode_info *, struct splice_pipe_desc *);
+extern void splice_shrink_spd(struct splice_pipe_desc *);
extern void spd_release_page(struct splice_pipe_desc *, unsigned int);
extern const struct pipe_buf_operations page_cache_pipe_buf_ops;
diff --git a/kernel/relay.c b/kernel/relay.c
index ab56a17..e8cd202 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -1235,6 +1235,7 @@ static ssize_t subbuf_splice_actor(struct file *in,
struct splice_pipe_desc spd = {
.pages = pages,
.nr_pages = 0,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.partial = partial,
.flags = flags,
.ops = &relay_pipe_buf_ops,
@@ -1302,8 +1303,8 @@ static ssize_t subbuf_splice_actor(struct file *in,
ret += padding;
out:
- splice_shrink_spd(pipe, &spd);
- return ret;
+ splice_shrink_spd(&spd);
+ return ret;
}
static ssize_t relay_file_splice_read(struct file *in,
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 68032c6..2884880 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -3609,6 +3609,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp,
.pages = pages_def,
.partial = partial_def,
.nr_pages = 0, /* This gets updated below. */
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &tracing_pipe_buf_ops,
.spd_release = tracing_spd_release_pipe,
@@ -3680,7 +3681,7 @@ static ssize_t tracing_splice_read_pipe(struct file *filp,
ret = splice_to_pipe(pipe, &spd);
out:
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
return ret;
out_err:
@@ -4231,6 +4232,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
struct splice_pipe_desc spd = {
.pages = pages_def,
.partial = partial_def,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &buffer_pipe_buf_ops,
.spd_release = buffer_spd_release,
@@ -4318,7 +4320,7 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
}
ret = splice_to_pipe(pipe, &spd);
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
out:
return ret;
}
diff --git a/mm/shmem.c b/mm/shmem.c
index 585bd220..c244e93 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1577,6 +1577,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = PIPE_DEF_BUFFERS,
.flags = flags,
.ops = &page_cache_pipe_buf_ops,
.spd_release = spd_release_page,
@@ -1665,7 +1666,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
if (spd.nr_pages)
error = splice_to_pipe(pipe, &spd);
- splice_shrink_spd(pipe, &spd);
+ splice_shrink_spd(&spd);
if (error > 0) {
*ppos += error;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 016694d..bac3c57 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1755,6 +1755,7 @@ int skb_splice_bits(struct sk_buff *skb, unsigned int offset,
struct splice_pipe_desc spd = {
.pages = pages,
.partial = partial,
+ .nr_pages_max = MAX_SKB_FRAGS,
.flags = flags,
.ops = &sock_pipe_buf_ops,
.spd_release = sock_spd_release,
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: vmsplice triggering bug in kfree.
2012-06-07 5:52 ` Eric Dumazet
2012-06-07 7:47 ` Eric Dumazet
@ 2012-06-08 18:04 ` Dave Jones
1 sibling, 0 replies; 7+ messages in thread
From: Dave Jones @ 2012-06-08 18:04 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Linux Kernel, axboe, Tom Herbert
On Thu, Jun 07, 2012 at 07:52:45AM +0200, Eric Dumazet wrote:
> On Thu, 2012-06-07 at 06:40 +0200, Eric Dumazet wrote:
> > On Thu, 2012-06-07 at 06:27 +0200, Eric Dumazet wrote:
> >
> > > Thanks Dave, I'll take a look today on this report.
> > >
> >
> > OK, trivial bug, I am testing a fix, thanks again.
> >
>
> Not sure if you can reproduce this bug easily, if so could you test
> following patch ?
haven't seen this reoccur since applying this patch, so it looks good to me.
Tested-by: Dave Jones <davej@redhat.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-06-08 18:31 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-06-07 2:51 vmsplice triggering bug in kfree Dave Jones
2012-06-07 4:27 ` Eric Dumazet
2012-06-07 4:40 ` Eric Dumazet
2012-06-07 5:52 ` Eric Dumazet
2012-06-07 7:47 ` Eric Dumazet
2012-06-08 18:04 ` Dave Jones
2012-06-07 10:36 ` [PATCH] splice: fix racy pipe->buffers uses Eric Dumazet
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.