* open(2) vs fopen(3)
From: moreau francis @ 2006-09-14 9:15 UTC
To: git

Hi GIT folks,

I'm reading the git source code and have stumbled on this stupid question:
why is open(2) used in some places and fopen(3) preferred in others?
I'm sorry for this dumb question, but I have no clue.

thanks

Francis

* Re: open(2) vs fopen(3)
From: Andy Whitcroft @ 2006-09-14 10:52 UTC
To: moreau francis; +Cc: git

moreau francis wrote:
> Hi GIT folks,
>
> I'm reading the git source code and have stumbled on this stupid question:
> why is open(2) used in some places and fopen(3) preferred in others?
> I'm sorry for this dumb question, but I have no clue.

From a quick random sampling, it looks very much as though open is used
where we are going to mmap the file for quick access, and fopen otherwise.

-apw

* Re: open(2) vs fopen(3)
From: Linus Torvalds @ 2006-09-14 15:46 UTC
To: moreau francis; +Cc: git

On Thu, 14 Sep 2006, moreau francis wrote:
>
> I'm reading the git source code and have stumbled on this stupid question:
> why is open(2) used in some places and fopen(3) preferred in others?
> I'm sorry for this dumb question, but I have no clue.

fopen() tends to result in easier usage, especially if the file in
question is a line-based ASCII file, and you can just use "fgets()" to
read it. So fopen is the simple alternative for simple problems.

Using a direct open() means that you have to use the low-level IO
functions (I'm ignoring the use of "fdopen()"), but if done right, it has
a number of advantages:

 - with the proper use, it's potentially more efficient (but stdio is a
   lot more efficient if you would otherwise do lots of small writes
   without buffering)

 - you can control the creation flags better (ie if you want to do an
   exclusive open, you _have_ to use "open()" - there's no portable way
   to say O_EXCL with "fopen()")

 - error conditions are a lot more obvious and repeatable with the
   low-level things, at least so I find personally. Error handling with
   stdio routines is _possible_, but probably because almost nobody ever
   does it, it's not something that people are conditioned to do, so it
   ends up being "strange".

(So this third one is more a psychological issue than a really technical
issue - at least for me. I'd not use stdio for things I might expect to
do fsync() on, for example. It's _possible_, but very non-intuitive,
because that's not how people generally use stdio).

So it boils down to the fact that people tend to do higher-level things
with stdio interfaces (fopen and friends), and lower-level things with
the raw system call ("unistd.h") interfaces.

In git, you'd expect to see code that actually works on the object
database or the refs using "open()" (both because it's low-level, and it
generally wants to use O_EXCL and friends), and then things that open the
".gitignore" file to use fopen() (because it's a line-based ASCII
interface, and it's not an "important" file in the sense that we don't
really care about some strange situation where it could give us an IO
error).

There might also be a difference in personality. I probably tend to use
the core unistd interfaces more than some other people would, and some
other people might end up using stdio for pretty much everything.

Linus

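To make the contrast in the message above concrete, here is a minimal sketch in C. It is not taken from the git source: the helper names write_new_file() and count_lines(), the 0666 mode, and the buffer size are all assumptions made for illustration. The first helper shows the low-level style (exclusive create with O_EXCL, explicit write loop, fsync()); the second shows the stdio style for a line-based text file.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from git): create `path` only if it does not
 * exist yet, write `len` bytes, and flush them to stable storage.
 * Returns 0 on success, or -1 with errno set by the failing call.
 */
int write_new_file(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int saved;
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0666);

	if (fd < 0)
		return -1;	/* errno is EEXIST if the file already exists */

	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal: retry */
			goto fail;
		}
		p += n;
		len -= (size_t)n;
	}
	if (fsync(fd) < 0)
		goto fail;
	return close(fd);

fail:
	saved = errno;		/* close() may overwrite errno */
	close(fd);
	errno = saved;
	return -1;
}

/*
 * Hypothetical helper: count newline-terminated lines with stdio.
 * Returns the count, or -1 if the file cannot be opened or read.
 */
long count_lines(const char *path)
{
	char line[1024];
	long n = 0;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;
	while (fgets(line, sizeof(line), fp))
		if (strchr(line, '\n'))
			n++;
	if (ferror(fp)) {
		fclose(fp);
		return -1;
	}
	fclose(fp);
	return n;
}
```

Calling write_new_file() twice with the same path should fail the second time with errno set to EEXIST, which is the guarantee that plain "fopen()" modes could not express portably at the time of this thread.
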
* Re: open(2) vs fopen(3)
From: Junio C Hamano @ 2006-09-14 16:37 UTC
To: moreau francis; +Cc: git, Linus Torvalds

Linus Torvalds <torvalds@osdl.org> writes:

> On Thu, 14 Sep 2006, moreau francis wrote:
>>
>> I'm reading the git source code and have stumbled on this stupid question:
>> why is open(2) used in some places and fopen(3) preferred in others?
>> I'm sorry for this dumb question, but I have no clue.
>
> fopen() tends to result in easier usage, especially if the file in
> question is a line-based ASCII file, and you can just use "fgets()" to
> read it. So fopen is the simple alternative for simple problems.
>
> Using a direct open() means that you have to use the low-level IO
> functions (I'm ignoring the use of "fdopen()"), but if done right, it has
> a number of advantages:
>...
> - error conditions are a lot more obvious and repeatable with the
>   low-level things, at least so I find personally. Error handling with
>   stdio routines is _possible_, but probably because almost nobody ever
>   does it, it's not something that people are conditioned to do, so it
>   ends up being "strange".

Another issue related to this is that stdio implementations tend to
interact with signals in unintuitive ways. One fine example is the
problem we fixed with commit fb7a653, where on Solaris fgets(3) did not
restart the underlying read(2) upon SIGALRM.

Technically it was a bug on our part, not in Solaris, but it was
unexpected to see.

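For comparison, handling an interrupted call at the read(2) level can be sketched as follows. This is an illustrative helper, not the fix from commit fb7a653; the name read_retry() is an assumption. Whether the kernel restarts an interrupted call on its own depends on how the signal handler was installed (for example with or without SA_RESTART), so portable code often retries explicitly.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: like read(2), but retries when the call is
 * interrupted by a signal before any data was transferred.
 * Returns the byte count, 0 at end of file, or -1 with errno set.
 */
ssize_t read_retry(int fd, void *buf, size_t len)
{
	ssize_t n;

	do {
		n = read(fd, buf, len);
	} while (n < 0 && errno == EINTR);

	return n;
}
```

The whole retry policy fits in one loop over one return value and errno, which is the point made about the low-level interfaces earlier in the thread.
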
* Re: open(2) vs fopen(3)
From: Linus Torvalds @ 2006-09-14 17:31 UTC
To: Junio C Hamano; +Cc: moreau francis, git

On Thu, 14 Sep 2006, Junio C Hamano wrote:
>
> Another issue related to this is that stdio implementations tend to
> interact with signals in unintuitive ways. One fine example is the
> problem we fixed with commit fb7a653, where on Solaris fgets(3) did not
> restart the underlying read(2) upon SIGALRM.

Yeah. However, I think it's worth just posting the code in question to
explain _why_ error handling with stdio sucks so badly, and why nobody
does it..

Here's the snippet:

	if (!fgets(line, sizeof(line), stdin)) {
		if (feof(stdin))
			break;
		if (!ferror(stdin))
			die("fgets returned NULL, not EOF, not error!");
		if (errno != EINTR)
			die("fgets: %s", strerror(errno));
		clearerr(stdin);

so with the <stdio.h> functions, you have to check FOUR DIFFERENT THINGS
(1: return value, 2: feof() value, 3: ferror() value, and 4: errno) to
get things right, and to add insult to injury, you then have to do an
explicit clear.

In other words, the fundamental reason nobody bothers checking errors
with stdio is that stdio just makes it a damn pain in the ass to do so -
by having a million different things you have to do (and ordering
actually matters).

In contrast, the <unistd.h> interfaces are a paragon of clarity: you
check just two things - the return value, and possibly "errno".

Now, <unistd.h> isn't perfect either, and in the kernel we have
simplified things further, by getting rid of "errno", and just having the
return value contain errno too. That makes things not only trivially
thread-safe, but also actually easier to code and understand, because you
don't have anything to be confused about: the return value is always the
only thing you need to look at in order to know what went wrong.

But unistd.h sure is a lot better than stdio in this area.

Of course, stdio.h is just a lot easier to use when you don't actually
care about the errors, which is also partly the _reason_ why caring about
errors is so hard (the whole separate clearerr() and ferror() interfaces
exist exactly _because_ people don't care about errors in many cases, and
you're supposed to maybe have some way to test at the end whether an
error happened or not).

So stdio.h is pretty much geared towards delayed error handling, which in
practice ends up often meaning "no error handling at all".

Linus

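The two error-handling styles discussed in this message can be put side by side in a small sketch. Neither function is from git or the kernel; the names read_lines_stdio() and read_fd(), and the choice to return -1 instead of calling die(), are assumptions made so the example is self-contained. The first function wraps the quoted fgets() checks in a complete loop; the second imitates the kernel convention of folding errno into the return value.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: consume `fp` with fgets() until end of file,
 * retrying when the call is interrupted by a signal. Returns the number
 * of successful fgets() calls, or -1 on any other failure.
 */
long read_lines_stdio(FILE *fp)
{
	char line[1024];
	long n = 0;

	for (;;) {
		if (!fgets(line, sizeof(line), fp)) {	/* 1: return value */
			if (feof(fp))			/* 2: end of file? */
				break;
			if (!ferror(fp))		/* 3: error flag set? */
				return -1;	/* NULL, not EOF, not error */
			if (errno != EINTR)		/* 4: which error? */
				return -1;
			clearerr(fp);	/* reset the sticky error flag */
			continue;	/* interrupted by a signal: retry */
		}
		n++;
	}
	return n;
}

/*
 * Hypothetical helper in the kernel's style: the return value is either
 * a byte count (>= 0) or a negated errno value (< 0), so the caller
 * never has to consult errno separately.
 */
ssize_t read_fd(int fd, void *buf, size_t len)
{
	ssize_t n;

	do {
		n = read(fd, buf, len);
	} while (n < 0 && errno == EINTR);

	return n < 0 ? -errno : n;
}
```

With read_fd(), a caller can test a single value, for example comparing the result against -EBADF, whereas the stdio loop needs four separate checks plus clearerr() to cover the same ground.
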