* open(2) vs fopen(3)
@ 2006-09-14 9:15 moreau francis
2006-09-14 10:52 ` Andy Whitcroft
2006-09-14 15:46 ` Linus Torvalds
0 siblings, 2 replies; 5+ messages in thread
From: moreau francis @ 2006-09-14 9:15 UTC (permalink / raw)
To: git
Hi GIT folks,
I'm reading the git source code and stumbled on this stupid question:
why is open(2) used in some places and fopen(3) preferred in others?
I'm sorry for this dumb question, but I have no clue.
thanks
Francis
* Re: open(2) vs fopen(3)
From: Andy Whitcroft @ 2006-09-14 10:52 UTC (permalink / raw)
To: moreau francis; +Cc: git
moreau francis wrote:
> Hi GIT folks,
>
> I'm reading the git source code and stumbled on this stupid question:
> why is open(2) used in some places and fopen(3) preferred in others?
> I'm sorry for this dumb question, but I have no clue.
From a quick random sampling, it looks very much like open() is used where
we are going to mmap the file for quick access, and fopen() otherwise.
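A minimal sketch of the open()-then-mmap() pattern described above. This is an illustration and not code taken from git; the helper name map_file_readonly() is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a whole file read-only.
 * Returns the mapping and stores its length in *len, or returns NULL
 * on failure with errno describing the problem.
 */
static void *map_file_readonly(const char *path, size_t *len)
{
	struct stat st;
	void *map = MAP_FAILED;
	int saved_errno;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) < 0)
		goto out;
	if (st.st_size == 0) {
		errno = EINVAL; /* mmap() rejects zero-length mappings */
		goto out;
	}
	map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
out:
	saved_errno = errno;
	close(fd); /* an established mapping stays valid after close() */
	errno = saved_errno;
	if (map == MAP_FAILED)
		return NULL;
	*len = (size_t)st.st_size;
	return map;
}
```

mmap() needs a file descriptor, which is why open() is the natural starting point here; the caller is expected to munmap() the result when done.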
-apw
* Re: open(2) vs fopen(3)
From: Linus Torvalds @ 2006-09-14 15:46 UTC (permalink / raw)
To: moreau francis; +Cc: git
On Thu, 14 Sep 2006, moreau francis wrote:
>
> I'm reading the git source code and stumbled on this stupid question:
> why is open(2) used in some places and fopen(3) preferred in others?
> I'm sorry for this dumb question, but I have no clue.
fopen() tends to result in easier usage, especially if the file in
question is a line-based ASCII file, and you can just use "fgets()" to
read it. So fopen is the simple alternative for simple problems.
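As a rough illustration of the "simple alternative", here is a sketch that reads a line-based text file with fopen() and fgets(). The helper name count_patterns() and the rule of skipping blank and '#' lines are assumptions made for the example, not git's actual parsing.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: count the non-blank, non-comment lines in a
 * line-based text file. Returns -1 if the file cannot be opened.
 * Lines longer than the buffer are seen as several chunks, which is
 * acceptable for a sketch.
 */
static int count_patterns(const char *path)
{
	char line[1024];
	int count = 0;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;
	while (fgets(line, sizeof(line), fp)) {
		line[strcspn(line, "\r\n")] = '\0'; /* strip the line ending */
		if (line[0] == '\0' || line[0] == '#')
			continue;
		count++;
	}
	fclose(fp);
	return count;
}
```

Note that the loop treats EOF and a read error the same way, which is exactly the kind of shortcut that makes stdio convenient for unimportant files.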
Using a direct open() means that you have to use the low-level IO
functions (I'm ignoring the use of "fdopen()"), but if done right, it has
a number of advantages:
- with proper use, it's potentially more efficient (but stdio is a
lot more efficient if you would otherwise do lots of small writes
without buffering them yourself)
- you can control the creation flags better (ie if you want to do an
exclusive open, you _have_ to use "open()" - there's no portable way to
say O_EXCL with "fopen()")
- error conditions are a lot more obvious and repeatable with the
low-level things, at least so I find personally. Error handling with
stdio routines is _possible_, but probably because almost nobody ever
does it, it's not something that people are conditioned to do, so it
ends up being "strange".
(So this third one is more a psychological issue than a really
technical issue - at least for me. I'd not use stdio for things I
might expect to do fsync() on, for example. It's _possible_, but very
non-intuitive, because that's not how people generally use stdio).
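The points about creation flags and fsync() can be sketched as follows. The helper name create_exclusive() and the 0644 mode are assumptions for the example. (C11 later added an "x" mode to fopen(), which postdates this thread.)

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` only if it does not exist yet,
 * write `len` bytes from `buf` and flush them to stable storage.
 * Returns 0 on success, -1 on failure with errno set (EEXIST when
 * somebody else created the file first).
 */
static int create_exclusive(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int saved_errno;
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);

	if (fd < 0)
		return -1;
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted by a signal: retry */
			goto fail;
		}
		p += n; /* short write: continue with the remainder */
		len -= (size_t)n;
	}
	if (fsync(fd) < 0)
		goto fail;
	return close(fd);
fail:
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return -1;
}
```

Every step has a single return value to check, and a second call for the same path fails cleanly with EEXIST instead of silently overwriting the file.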
So it boils down to the fact that people tend to do higher-level things
with stdio interfaces (fopen and friends), and lower-level things with the
raw system call ("unistd.h") interfaces.
In git, you'd expect to see code that actually works on the object
database or the refs using "open()" (both because it's low-level, and it
generally wants to use O_EXCL and friends), and then things that open the
".gitignore" file to use fopen() (because it's a line-based ASCII
interface, and it's not an "important" file in the sense that we don't
really care about some strange situation where it could give us an IO
error).
There might also be a difference in personality. I probably tend to use
the core unistd interfaces more than some other people would, and some
other people might end up using stdio for pretty much everything.
Linus
* Re: open(2) vs fopen(3)
From: Junio C Hamano @ 2006-09-14 16:37 UTC (permalink / raw)
To: moreau francis; +Cc: git, Linus Torvalds
Linus Torvalds <torvalds@osdl.org> writes:
> On Thu, 14 Sep 2006, moreau francis wrote:
>>
>> I'm reading the git source code and stumbled on this stupid question:
>> why is open(2) used in some places and fopen(3) preferred in others?
>> I'm sorry for this dumb question, but I have no clue.
>
> fopen() tends to result in easier usage, especially if the file in
> question is a line-based ASCII file, and you can just use "fgets()" to
> read it. So fopen is the simple alternative for simple problems.
>
> Using a direct open() means that you have to use the low-level IO
> functions (I'm ignoring the use of "fdopen()"), but if done right, it has
> a number of advantages:
>...
> - error conditions are a lot more obvious and repeatable with the
> low-level things, at least so I find personally. Error handling with
> stdio routines is _possible_, but probably because almost nobody ever
> does it, it's not something that people are conditioned to do, so it
> ends up being "strange".
Another issue related to this is that stdio implementations
tend to have unintuitive interaction with signals, one fine
example of it being the problem we fixed with commit fb7a653,
where on Solaris fgets(3) did not restart the underlying read(2)
upon SIGALRM.
Technically it was a bug on our part, not in Solaris, but it was
still something unexpected to see.
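For comparison, restarting an interrupted read(2) by hand takes only a few lines. This is a sketch of the general technique; read_retry() is a hypothetical name and this is not the code from the commit mentioned above.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: like read(2), but restarts the call when a
 * signal handler interrupts it before any data was transferred.
 */
static ssize_t read_retry(int fd, void *buf, size_t len)
{
	ssize_t n;

	do {
		n = read(fd, buf, len);
	} while (n < 0 && errno == EINTR);
	return n;
}
```

With a wrapper like this the caller sees only the usual three outcomes: a positive byte count, 0 for end of file, or -1 for a real error.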
* Re: open(2) vs fopen(3)
From: Linus Torvalds @ 2006-09-14 17:31 UTC (permalink / raw)
To: Junio C Hamano; +Cc: moreau francis, git
On Thu, 14 Sep 2006, Junio C Hamano wrote:
>
> Another issue related to this is that stdio implementations
> tend to have unintuitive interaction with signals, one fine
> example of it being the problem we fixed with commit fb7a653,
> where on Solaris fgets(3) did not restart the underlying read(2)
> upon SIGALRM.
Yeah. However, I think it's worth just posting the code in question to
explain _why_ error handling with stdio sucks so badly, and why nobody
does it..
Here's the snippet:
	if (!fgets(line, sizeof(line), stdin)) {
		if (feof(stdin))
			break;
		if (!ferror(stdin))
			die("fgets returned NULL, not EOF, not error!");
		if (errno != EINTR)
			die("fgets: %s", strerror(errno));
		clearerr(stdin);
		continue;
	}
so with the <stdio.h> functions, you have to check FOUR DIFFERENT THINGS
(1: return value, 2: feof() value, 3: ferror() value, and 4: errno) to get
things right, and to add insult to injury, you then have to do an explicit
clear.
In other words, the fundamental reason nobody bothers checking errors with
stdio is that stdio just makes it a damn pain in the ass to do so - by
having a million different things you have to do (and ordering actually
matters).
In contrast, the <unistd.h> interfaces are a paragon of clarity: you check
just two things - the return value, and possibly "errno".
Now, <unistd.h> isn't perfect either, and in the kernel we have simplified
things further, by getting rid of "errno", and just having the return
value contain errno too. Making things not only trivially thread-safe, but
also actually easier to code and understand, because you don't have
anything to be confused about: the return value is always the only thing
you need to look at in order to know what went wrong.
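A user-space imitation of that kernel convention might look like the sketch below. The wrapper name open_readonly() is hypothetical; the only idea borrowed from the text is "the return value contains errno too".

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical wrapper in the kernel style: the return value is
 * either a valid file descriptor (>= 0) or a negated errno value,
 * so the caller never has to consult the global errno.
 */
static int open_readonly(const char *path)
{
	int fd = open(path, O_RDONLY);

	return fd < 0 ? -errno : fd;
}
```

A caller would write something like "fd = open_readonly(path); if (fd < 0) die("open: %s", strerror(-fd));", with a single value carrying both success and the reason for failure.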
But unistd.h sure is a lot better than stdio in this area. Of course,
stdio.h is just a lot easier to use when you don't actually care about the
errors, which is also partly the _reason_ why caring about errors is so
hard (the whole separate clearerr() and ferror() interfaces exist exactly
_because_ people don't care about errors in many cases, and you're
supposed to maybe have some way to test at the end whether an error
happened or not).
So stdio.h is pretty much geared towards delayed error handling, which in
practice ends up often meaning "no error handling at all".
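The delayed style that stdio is geared towards looks roughly like this sketch, where individual writes are not checked and the sticky error flag is tested once at the end. The helper name write_lines() is hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write `count` numbered lines, ignoring errors
 * on the individual writes and checking the stream once at the end.
 * Returns 0 on success, -1 if opening, writing or flushing failed.
 */
static int write_lines(const char *path, int count)
{
	FILE *fp = fopen(path, "w");
	int i, failed;

	if (!fp)
		return -1;
	for (i = 0; i < count; i++)
		fprintf(fp, "line %d\n", i); /* result deliberately ignored */
	failed = ferror(fp); /* sticky: set if any earlier write failed */
	if (fclose(fp) != 0) /* fclose() flushes, so it can fail too */
		failed = 1;
	return failed ? -1 : 0;
}
```

Dropping the last few lines of the function gives the "no error handling at all" variant, and nothing in the rest of the code would look any different.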
Linus