From: "David S. Miller" <davem@davemloft.net>
To: Olivier Galibert <galibert@pobox.com>
Cc: avi@argo.co.il, linux-kernel@vger.kernel.org
Subject: Re: tcp_sendpage and page allocation lifetime vs. iscsi
Date: Mon, 25 Apr 2005 15:08:40 -0700
Message-ID: <20050425150840.5f27f77a.davem@davemloft.net>
In-Reply-To: <20050425220603.GA64842@dspnet.fr.eu.org>
On Tue, 26 Apr 2005 00:06:03 +0200
Olivier Galibert <galibert@pobox.com> wrote:
> Do you think it would be possible to extend the sendpage API to add some
> kind of "don't get the pages, copy them if you need them" flag?
No, not really.
Do you happen to run the scsi->done() function from iscsi
as soon as the write over the TCP socket returns success?
That is likely what is causing the problem.
When you call scsi->done(), the buffer is effectively released
and the scsi/st.c driver can legally reuse it once you've done
that.
tcp_sendpage() is really meant to be invoked for page cache
pages, or temporary pages cons'd up specifically for that
send call. Just look at what TCP sendmsg does, for example.
It carves up a per-socket cached PAGE to put the user's data
into.
You could do something similar in iSCSI and for now I highly
suggest that is what you do.
You could also:
1) set TCP_CORK to 1
2) tcp_sendmsg() the scsi tape data
3) tcp_sendpage() the remaining pages
4) set TCP_CORK to 0
so that tcp_sendmsg() does all the data copying for you.
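
The cork sequence above can be sketched from userspace, as an analogy only. This assumes a connected TCP socket on Linux, with send(2) standing in for the copying tcp_sendmsg() path and sendfile(2) standing in for the page-referencing tcp_sendpage() path. The helper names (set_cork, send_all, send_copied_then_pages) are hypothetical and do not come from the thread or from any iSCSI driver.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: set (1) or clear (0) TCP_CORK on a TCP socket. */
static int set_cork(int sock, int on)
{
    return setsockopt(sock, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Hypothetical helper: send the whole buffer through the copying path. */
static int send_all(int sock, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(sock, p, len, 0);
        if (n < 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * 1) cork, 2) copy the buffer the caller wants to reuse right away,
 * 3) send the file-backed data by reference, 4) uncork.
 * Returns 0 on success, -1 on any error.
 */
static int send_copied_then_pages(int sock,
                                  const void *reusable_buf, size_t reusable_len,
                                  int file_fd, off_t file_off, size_t file_len)
{
    int rc = -1;

    if (set_cork(sock, 1) < 0)
        return -1;

    if (send_all(sock, reusable_buf, reusable_len) == 0) {
        rc = 0;
        while (file_len > 0) {
            ssize_t n = sendfile(sock, file_fd, &file_off, file_len);
            if (n <= 0) {        /* error, or file shorter than expected */
                rc = -1;
                break;
            }
            file_len -= (size_t)n;
        }
    }

    /* Always uncork so anything already queued is pushed out. */
    if (set_cork(sock, 0) < 0)
        rc = -1;
    return rc;
}
```

Once send_all() returns, the kernel holds its own copy of reusable_buf, so the caller can hand that buffer back. Only the file-backed part is still referenced by the socket.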
Finally, you could also use the "SIOCOUTQ" ioctl to watch the
write buffer get released. Call it once before you do the send,
save that value, then after your send wait for it to drop back
to or below the old value you saved.
In short, you're using an API in a way it was never designed
to be used. We don't lock pages, and that is a deliberate
design decision. When we send pages over the wire using
TCP sendpage out of the page cache, the file contents _CAN_
change mid-send, but that's OK because the card calculates
the packet checksums, so neither data corruption nor
quality-of-implementation issues arise as a result.
Again, this behavior and these mechanics were deliberately
made to function this way.