From: Willy Tarreau <w@1wt.eu>
To: Davide Libenzi <davidel@xmailserver.org>
Cc: Nikolai ZHUBR <zhubr@mail.ru>,
    Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: epoll'ing tcp sockets for reading
Date: Sun, 20 Dec 2009 17:14:22 +0100
Message-ID: <20091220161422.GH32739@1wt.eu>
In-Reply-To: <alpine.DEB.2.00.0912200750280.9679@makko.or.mcafeemobile.com>

Hi Davide,

On Sun, Dec 20, 2009 at 07:54:09AM -0800, Davide Libenzi wrote:
> On Sun, 20 Dec 2009, Nikolai ZHUBR wrote:
>
> > Sunday, December 20, 2009, 1:56:22 AM, Davide Libenzi wrote:
> > [trim]
> > > The kernel cannot make decisions based on something whose knowledge is
> > > userspace bound.
> > I didn't mean that. I just meant it would be usefull to let the caller
> > of epoll know also the size of data related to specific EPOLLIN event in
> > some "atomic" manner immediately, because the kernel probably knows this
> > size already.
> > The same thing can approximately be "emulated" by requesting FIOREAD for
> > all EPOLLIN-ready sockets just after epoll returns, before any other work.
> > It just would look not very elegant IMHO.
>
> No such a thing of "atomic matter", since by the time you read the event,
> more data might have come. It's just flawed, you see that?
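
For reference, the emulation mentioned in the quoted text can be sketched as below. This is only an illustration under assumptions: the ioctl is named FIONREAD on Linux (the quoted message spells it FIOREAD), a local socket pair stands in for a TCP connection, and the helper names `bytes_readable` and `demo_stale_size` are hypothetical. It shows why the reported size is only a snapshot: the count changes as soon as more data arrives.

```python
import fcntl
import socket
import struct
import termios


def bytes_readable(sock):
    """Return the kernel's current count of unread bytes on sock (FIONREAD)."""
    raw = fcntl.ioctl(sock.fileno(), termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


def demo_stale_size():
    """Query the readable size twice, with more data arriving in between."""
    # Assumption: a local socket pair stands in for a TCP connection.
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(b"hello")
        first = bytes_readable(receiver)   # size seen right after the wakeup
        sender.sendall(b"!!!")             # more data arrives before the read
        second = bytes_readable(receiver)  # the earlier answer is now stale
        return first, second
    finally:
        sender.close()
        receiver.close()
```

The two values differ, so any size reported together with the event could only ever be a lower bound on what a later read may return.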

I think that what Nikolai meant was the ability to wake up as soon as
there are *at least* XXX bytes ready. But while I can understand why
it would in theory save some code, in practice he would still have to
properly handle corner cases, which would defeat the original purpose
of his modification:

  - if he waits for more data than the socket buffer can hold, he
    will never wake up;

  - if my memory serves me right, the copy_and_cksum() code only knows
    whether a segment is correct during its transfer to userland, which
    means that epoll() could very well wake up with XXX apparent bytes
    ready, but the read would stop short of XXX due to an invalid
    checksum on an intermediate segment. So the code would still have
    to take care of that situation anyway.

The last point implies the complete implementation of the code he wants
to avoid anyway, and the first one implies it will be hard to know when
this would work and when it would not. This means that while at first
glance this behaviour could be useful, it would in practice be useless.
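
The code that cannot be avoided looks roughly like the following. It is a minimal sketch, not a definitive implementation: it assumes a non-blocking socket and a fixed message size, and the class name `ExactReader` and its method `on_readable` are hypothetical names introduced here. It keeps whatever arrived, returns nothing until the full amount is buffered, and reports a peer that closes early.

```python
import socket


class ExactReader:
    """Accumulate bytes from a non-blocking socket until `wanted` are buffered."""

    def __init__(self, sock, wanted):
        self.sock = sock
        self.wanted = wanted
        self.buf = bytearray()

    def on_readable(self):
        """Call on each readiness event.

        Returns the complete message as bytes, or None if more data is
        still needed. Raises ConnectionError if the peer closes first.
        """
        while len(self.buf) < self.wanted:
            try:
                chunk = self.sock.recv(self.wanted - len(self.buf))
            except BlockingIOError:
                # Fewer bytes than hoped for: keep them, wait for next event.
                return None
            if not chunk:
                raise ConnectionError("peer closed before message was complete")
            self.buf += chunk
        message = bytes(self.buf)
        self.buf.clear()
        return message
```

An event loop would call `on_readable()` each time the socket is reported readable. Since a short read has to be handled this way regardless, a wakeup threshold would not remove any of this logic.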

Regards,
Willy