From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org
To: linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: [Bug 15568] New: O_NONBLOCK is NOOP on block devices
Date: Wed, 17 Mar 2010 23:16:08 GMT [thread overview]
Message-ID: <bug-15568-11311@http.bugzilla.kernel.org/> (raw)
http://bugzilla.kernel.org/show_bug.cgi?id=15568
Summary: O_NONBLOCK is NOOP on block devices
Product: Documentation
Version: unspecified
Kernel Version: 2.6.18-2.6.32
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: man-pages
AssignedTo: documentation_man-pages-ztI5WcYan/vQLgFONoPN62D2FQJk+8+b@public.gmane.org
ReportedBy: mh-linux-kernel-AB0c8HnZplU@public.gmane.org
Regression: No
Created an attachment (id=25573)
--> (http://bugzilla.kernel.org/attachment.cgi?id=25573)
Patch to man page v. 3.25
Timing results indicate that the O_NONBLOCK flag produces no noticable
effect on read or writev to a Linux block device.
I always perform aligned ios which are a multiple of the sector size
which also allows the use of O_DIRECT if desired. For testing, I've
been using 2.6.22, 2.6.24 kernels, 2.6.32 kernels (fedora core and
ubuntu distros) on both x86_64 and 32 bit arm architectures and get
similar results on every variation of hardware and kernel tested.
To extract the following data, I used the following set of system
calls in a loop driven by poll, surrounding read and write calls
immediately with time checks.
fd = open( filename, O_RDWR | O_NONBLOCK | O_NOATIME );
gettimeofday( &time, 0 );
read( fd, pos, len );
writev( fd, iov, count );
poll( pfd, npfd, timeoutms );
Byte counts are displayed in hex. On my core 2 duo laptop, for
example, io to or from the buffer cache typically takes 100 to 125
micro seconds to transfer 64k.
----------------------------------------------------------------------
BUFFER CACHE NOT FULL, NONBLOCKING 64K WRITES AS EXPECTED
write fd:3 0.000117s bytes:10000 remain:0
write fd:3 0.000115s bytes:10000 remain:0
write fd:3 0.000116s bytes:10000 remain:0
write fd:3 0.000118s bytes:10000 remain:0
write fd:3 0.000125s bytes:10000 remain:0
write fd:3 0.000126s bytes:10000 remain:0
write fd:3 0.000101s bytes:10000 remain:0
----------------------------------------------------------------------
READING AND WRITING, BUFFER CACHE FULL
read fd:3 0.006351s bytes:10000 remain:0
write fd:3 0.001235s bytes:200 remain:0
write fd:3 0.002477s bytes:200 remain:0
read fd:3 0.005010s bytes:10000 remain:0
write fd:3 0.001243s bytes:200 remain:0
read fd:3 0.005028s bytes:10000 remain:0
write fd:3 0.000506s bytes:200 remain:0
write fd:3 0.000106s bytes:10000 remain:0
write fd:3 0.000812s bytes:200 remain:0
write fd:3 0.000108s bytes:10000 remain:0
write fd:3 0.000807s bytes:200 remain:0
write fd:3 0.002652s bytes:200 remain:0
write fd:3 0.000107s bytes:10000 remain:0
write fd:3 0.000141s bytes:10000 remain:0
write fd:3 0.002232s bytes:200 remain:0
These are not worst-case, but rather best case results! For an
example of more worse case results, using a usb flash device,
frequently (about once a second or so) under heavier load I see reads
or writes blocked for 500ms or more when vmstat and top report more
than 90% idle / wait. 500ms to perform a 512 byte "non blocking" io
with a nearly idle cpu is an eternity in computer time; more than
10,000 times longer than it should take to memcpy all or even a
portion of the data or return EAGAIN.
I discovered this because, even though they succeed, all of these
"non" blocking system calls are blocking so much so that they easily
choke process non blocking socket io.
I think this O_NONBLOCK behavior has aspects that could probably be
classified as both a documentation and a kernel defect depending upon
whether the existing open(2) man page documents the intended behavior
of read and write or not. Alan Cox suggested a man page patch. The
attached one correctly describes the existing behavior while reserving
future nonblocking semantics.
--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
reply other threads:[~2010-03-17 23:16 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bug-15568-11311@http.bugzilla.kernel.org/ \
--to=bugzilla-daemon-590eeb7gvniway/ihj7yzeb+6bgklq7r@public.gmane.org \
--cc=linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.