From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: Justin Forbes <jmforbes@linuxtx.org>,
	Zwane Mwaikambo <zwane@arm.linux.org.uk>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Randy Dunlap <rdunlap@xenotime.net>,
	Dave Jones <davej@redhat.com>,
	Chuck Wolber <chuckw@quantumlinux.com>,
	Chris Wedgwood <reviews@ml.cw.f00f.org>,
	Michael Krufky <mkrufky@linuxtv.org>,
	Chuck Ebbert <cebbert@redhat.com>,
	Domenico Andreoli <cavokz@gmail.com>, Willy Tarreau <w@1wt.eu>,
	Rodrigo Rubira Branco <rbranco@la.checkpoint.com>,
	Jake Edge <jake@lwn.net>, Eugene Teo <eteo@redhat.com>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk, Eric Dumazet <dada1@cosmosbay.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [patch 36/48] tcp: splice as many packets as possible at once
Date: Fri, 13 Feb 2009 17:13:30 -0800
Message-ID: <20090214011330.GJ17706@kroah.com>
In-Reply-To: <20090214011208.GA17706@kroah.com>

[-- Attachment #1: tcp-splice-as-many-packets-as-possible-at-once.patch --]
[-- Type: text/plain, Size: 2303 bytes --]

2.6.28-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Willy Tarreau <w@1wt.eu>

[ Upstream commit 33966dd0e2f68f26943cd9ee93ec6abbc6547a8e ]

As spotted by Willy Tarreau, the current splice() from a TCP socket to a
pipe is not optimal: it processes at most one segment per call.
This results in low performance and very high overhead due to the syscall
rate when splicing from interfaces that do not support LRO.

Willy provided a patch inside tcp_splice_read(), but a better fix
is to let tcp_read_sock() process as many segments as possible, so
that tcp_rcv_space_adjust() and tcp_cleanup_rbuf() are called less
often.

With this change, splice() behaves like tcp_recvmsg() and can consume
many skbs in one system call. With a typical 1460 bytes of payload per
frame, that means a single splice(SPLICE_F_NONBLOCK) call can return
16*1460 = 23360 bytes.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 net/ipv4/tcp.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
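
For reviewers who want to check the arithmetic in the changelog, the
per-call byte budget that this patch carries in rd_desc->count can be
sketched in userspace. This is only a rough model under stated
assumptions, not kernel code: model_desc, model_splice_segment() and
model_read_sock() are hypothetical names, each array entry stands in for
one queued skb, and the loop is assumed to stop once the budget reaches
zero. Pipe capacity is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for read_descriptor_t: only the byte budget. */
struct model_desc {
	size_t count;	/* bytes still wanted, like rd_desc->count */
};

/*
 * Models the patched tcp_splice_data_recv(): take at most desc->count
 * bytes from one segment and charge them against the budget.
 */
static size_t model_splice_segment(struct model_desc *desc, size_t seg_len)
{
	size_t used = seg_len < desc->count ? seg_len : desc->count;

	desc->count -= used;
	return used;
}

/*
 * Models the caller's loop (assumption: it walks the queued segments and
 * stops when the queue is empty or the budget is exhausted).
 */
size_t model_read_sock(const size_t *segs, size_t nsegs, size_t want)
{
	struct model_desc desc = { .count = want };
	size_t total = 0;

	assert(segs != NULL || nsegs == 0);
	for (size_t i = 0; i < nsegs && desc.count > 0; i++)
		total += model_splice_segment(&desc, segs[i]);
	return total;
}
```

With sixteen queued segments of 1460 bytes and a large enough request,
model_read_sock() returns 23360, matching the figure in the changelog.
With a budget of zero, which is what an unset .count amounts to, the
model returns nothing at all, so it does not reproduce the old
one-segment-per-call behaviour.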

--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -520,8 +520,12 @@ static int tcp_splice_data_recv(read_des
 				unsigned int offset, size_t len)
 {
 	struct tcp_splice_state *tss = rd_desc->arg.data;
+	int ret;
 
-	return skb_splice_bits(skb, offset, tss->pipe, tss->len, tss->flags);
+	ret = skb_splice_bits(skb, offset, tss->pipe, rd_desc->count, tss->flags);
+	if (ret > 0)
+		rd_desc->count -= ret;
+	return ret;
 }
 
 static int __tcp_splice_read(struct sock *sk, struct tcp_splice_state *tss)
@@ -529,6 +533,7 @@ static int __tcp_splice_read(struct sock
 	/* Store TCP splice context information in read_descriptor_t. */
 	read_descriptor_t rd_desc = {
 		.arg.data = tss,
+		.count	  = tss->len,
 	};
 
 	return tcp_read_sock(sk, &rd_desc, tcp_splice_data_recv);
@@ -613,11 +618,13 @@ ssize_t tcp_splice_read(struct socket *s
 		tss.len -= ret;
 		spliced += ret;
 
+		if (!timeo)
+			break;
 		release_sock(sk);
 		lock_sock(sk);
 
 		if (sk->sk_err || sk->sk_state == TCP_CLOSE ||
-		    (sk->sk_shutdown & RCV_SHUTDOWN) || !timeo ||
+		    (sk->sk_shutdown & RCV_SHUTDOWN) ||
 		    signal_pending(current))
 			break;
 	}

