From: Willy Tarreau <w@1wt.eu>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Christophe Leroy <christophe.leroy@c-s.fr>,
Jens Axboe <axboe@fb.com>, Ben Hutchings <ben@decadent.org.uk>,
Willy Tarreau <w@1wt.eu>
Subject: [PATCH 2.6.32 38/38] [PATCH 38/38] splice: sendfile() at once fails for big files
Date: Sun, 29 Nov 2015 22:47:40 +0100 [thread overview]
Message-ID: <20151129214704.500487474@1wt.eu> (raw)
In-Reply-To: <8acf8256ccc72771a80b7851061027bc@local>
2.6.32-longterm review patch. If anyone has any objections, please let me know.
------------------
commit 0ff28d9f4674d781e492bcff6f32f0fe48cf0fed upstream.
Using sendfile with below small program to get MD5 sums of some files,
it appear that big files (over 64kbytes with 4k pages system) get a
wrong MD5 sum while small files get the correct sum.
This program uses sendfile() to send a file to an AF_ALG socket
for hashing.
/* md5sum2.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <linux/if_alg.h>
int main(int argc, char **argv)
{
int sk = socket(AF_ALG, SOCK_SEQPACKET, 0);
struct stat st;
struct sockaddr_alg sa = {
.salg_family = AF_ALG,
.salg_type = "hash",
.salg_name = "md5",
};
int n;
bind(sk, (struct sockaddr*)&sa, sizeof(sa));
for (n = 1; n < argc; n++) {
int size;
int offset = 0;
char buf[4096];
int fd;
int sko;
int i;
fd = open(argv[n], O_RDONLY);
sko = accept(sk, NULL, 0);
fstat(fd, &st);
size = st.st_size;
sendfile(sko, fd, &offset, size);
size = read(sko, buf, sizeof(buf));
for (i = 0; i < size; i++)
printf("%2.2x", buf[i]);
printf(" %s\n", argv[n]);
close(fd);
close(sko);
}
exit(0);
}
Test below is done using official linux patch files. First result is
with a software based md5sum. Second result is with the program above.
root@vgoip:~# ls -l patch-3.6.*
-rw-r--r-- 1 root root 64011 Aug 24 12:01 patch-3.6.2.gz
-rw-r--r-- 1 root root 94131 Aug 24 12:01 patch-3.6.3.gz
root@vgoip:~# md5sum patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443 patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43 patch-3.6.3.gz
root@vgoip:~# ./md5sum2 patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443 patch-3.6.2.gz
5fd77b24e68bb24dcc72d6e57c64790e patch-3.6.3.gz
After investivation, it appears that sendfile() sends the files by blocks
of 64kbytes (16 times PAGE_SIZE). The problem is that at the end of each
block, the SPLICE_F_MORE flag is missing, therefore the hashing operation
is reset as if it was the end of the file.
This patch adds SPLICE_F_MORE to the flags when more data is pending.
With the patch applied, we get the correct sums:
root@vgoip:~# md5sum patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443 patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43 patch-3.6.3.gz
root@vgoip:~# ./md5sum2 patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443 patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43 patch-3.6.3.gz
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
(cherry picked from commit fcb2781782b61631db4ed31e98757795eacd31cb)
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
fs/splice.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/fs/splice.c b/fs/splice.c
index 1ef1c00..5c006c8b 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1123,7 +1123,7 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
long ret, bytes;
umode_t i_mode;
size_t len;
- int i, flags;
+ int i, flags, more;
/*
* We require the input being a regular file, as we don't want to
@@ -1166,6 +1166,7 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
* Don't block on output, we have to drain the direct pipe.
*/
sd->flags &= ~SPLICE_F_NONBLOCK;
+ more = sd->flags & SPLICE_F_MORE;
while (len) {
size_t read_len;
@@ -1179,6 +1180,15 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
sd->total_len = read_len;
/*
+ * If more data is pending, set SPLICE_F_MORE
+ * If this is the last data and SPLICE_F_MORE was not set
+ * initially, clears it.
+ */
+ if (read_len < len)
+ sd->flags |= SPLICE_F_MORE;
+ else if (!more)
+ sd->flags &= ~SPLICE_F_MORE;
+ /*
* NOTE: nonblocking mode only applies to the input. We
* must not do the output in nonblocking mode as then we
* could get stuck data in the internal pipe:
--
1.7.12.2.21.g234cd45.dirty
next prev parent reply other threads:[~2015-11-29 22:01 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <8acf8256ccc72771a80b7851061027bc@local>
2015-11-29 21:47 ` [PATCH 2.6.32 00/38] 2.6.32.69-longterm review Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 01/38] [PATCH 01/38] dcache: Handle escaped paths in prepend_path Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 03/38] [PATCH 03/38] md: use kzalloc() when bitmap is disabled Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 04/38] [PATCH 04/38] ipv6: addrconf: validate new MTU before applying it Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 05/38] [PATCH 05/38] virtio-net: drop NETIF_F_FRAGLIST Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 06/38] [PATCH 06/38] USB: whiteheat: fix potential null-deref at probe Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 07/38] [PATCH 07/38] ipc/sem.c: fully initialize sem_array before making it visible Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 08/38] [PATCH 08/38] Initialize msg/shm IPC objects before doing ipc_addid() Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 10/38] [PATCH 10/38] rds: fix an integer overflow test in rds_info_getsockopt() Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 11/38] [PATCH 11/38] net: Clone skb before setting peeked flag Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 12/38] [PATCH 12/38] net: Fix skb_set_peeked use-after-free bug Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 13/38] [PATCH 13/38] ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 14/38] [PATCH 14/38] devres: fix devres_get() Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 15/38] [PATCH 15/38] windfarm: decrement client count when unregistering Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 16/38] [PATCH 16/38] xfs: Fix xfs_attr_leafblock definition Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 17/38] [PATCH 17/38] SUNRPC: xs_reset_transport must mark the connection as disconnected Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 18/38] [PATCH 18/38] Input: evdev - do not report errors form flush() Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 19/38] [PATCH 19/38] pagemap: hide physical addresses from non-privileged users Willy Tarreau
2015-11-30 1:54 ` Ben Hutchings
2015-11-30 7:01 ` Willy Tarreau
2015-11-30 11:30 ` Willy Tarreau
2015-11-30 11:49 ` Konstantin Khlebnikov
2015-11-30 12:13 ` Willy Tarreau
2015-11-30 14:55 ` Ben Hutchings
2015-11-30 15:14 ` Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 20/38] [PATCH 20/38] hfs,hfsplus: cache pages correctly between bnode_create and bnode_free Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 21/38] [PATCH 21/38] hfs: fix B-tree corruption after insertion at position 0 Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 22/38] [PATCH 22/38] x86/paravirt: Replace the paravirt nop with a bona fide empty function Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 23/38] [PATCH 23/38] RDS: verify the underlying transport exists before creating a connection Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 24/38] [PATCH 24/38] net: Fix skb csum races when peeking Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 25/38] [PATCH 25/38] net: add length argument to skb_copy_and_csum_datagram_iovec Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 26/38] [PATCH 26/38] module: Fix locking in symbol_put_addr() Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 27/38] [PATCH 27/38] x86/process: Add proper bound checks in 64bit get_wchan() Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 28/38] [PATCH 28/38] mm: hugetlbfs: skip shared VMAs when unmapping private pages to satisfy a fault Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 29/38] [PATCH 29/38] tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 31/38] [PATCH 31/38] ethtool: Use kcalloc instead of kmalloc for ethtool_get_strings Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 32/38] [PATCH 32/38] HID: core: Avoid uninitialized buffer access Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 33/38] [PATCH 33/38] devres: fix a for loop bounds check Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 34/38] [PATCH 34/38] binfmt_elf: Dont clobber passed executables file header Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 35/38] [PATCH 35/38] RDS-TCP: Recover correctly from pskb_pull()/pksb_trim() failure in rds_tcp_data_recv Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 36/38] [PATCH 36/38] ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context Willy Tarreau
2015-11-29 21:47 ` [PATCH 2.6.32 37/38] [PATCH 37/38] net: avoid NULL deref in inet_ctl_sock_destroy() Willy Tarreau
2015-11-29 21:47 ` Willy Tarreau [this message]
2015-11-30 1:25 ` [PATCH 2.6.32 09/38] [PATCH 09/38] xhci: fix off by one error in TRB DMA address boundary check Willy Tarreau
2015-11-30 2:04 ` [PATCH 2.6.32 30/38] [PATCH 30/38] mvsas: Fix NULL pointer dereference in mvs_slot_task_free Willy Tarreau
2015-11-30 2:42 ` [PATCH 2.6.32 00/38] 2.6.32.69-longterm review Ben Hutchings
2015-11-30 6:51 ` Willy Tarreau
2015-11-30 11:23 ` Willy Tarreau
2015-11-30 14:43 ` Ben Hutchings
2015-11-30 15:10 ` Willy Tarreau
[not found] ` <20151129214702.957590241@1wt.eu>
2015-11-30 6:44 ` [PATCH 2.6.32 02/38] [PATCH 02/38] Failing to send a CLOSE if file is opened WRONLY and server reboots on a 4.x mount Willy Tarreau
[not found] ` <abcd326bbd6833b04b56738cd1734419@local>
2015-11-30 16:04 ` [PATCH 2.6.32 00/38] 2.6.32.69-longterm review Willy Tarreau
2015-11-30 16:04 ` [PATCH 2.6.32 39/38] vfs: Test for and handle paths that are unreachable from their mnt_root Willy Tarreau
2015-11-30 16:05 ` [PATCH 2.6.32 40/38] security: add cred argument to security_capable() Willy Tarreau
2015-11-30 16:05 ` [PATCH 2.6.32 19/38] pagemap: hide physical addresses from non-privileged users Willy Tarreau
2015-12-01 0:43 ` [PATCH 2.6.32 00/38] 2.6.32.69-longterm review Ben Hutchings
2015-12-01 6:57 ` Willy Tarreau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151129214704.500487474@1wt.eu \
--to=w@1wt.eu \
--cc=axboe@fb.com \
--cc=ben@decadent.org.uk \
--cc=christophe.leroy@c-s.fr \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).