From: Eric Bambach <eric@cisu.net>
To: Richard Sammet <richard.sammet@sit.fraunhofer.de>
Cc: linux-c-programming@vger.kernel.org
Subject: Re: how to find the end of piped data?
Date: Tue, 14 Sep 2004 01:06:08 -0500
Message-ID: <200409140106.08674.eric@cisu.net>
In-Reply-To: <41458643.7060907@sit.fraunhofer.de>
On Monday 13 September 2004 06:36 am, you wrote:
> hey list,
>
> I wrote a small tool which gets data over a pipe from other tools (like:
> cat stuff | mytool).
>
> How can I find the end of this data stream?
>
> At the moment I'm looking for a newline to see if the input is finished,
> but that's not practicable.
>
> This is the routine for getting the data:
>
> void scanin()
> {
>     int tmpcnt=0;
>
>     while(sizeof(tmpkey) && tmpkey[tmpcnt-1] != 10)
>     {
>         tmpkey[tmpcnt]=getchar();
>         tmpcnt++;
>     }
> }
>
> I'm looking for a flag like EOF, but EndOfStream or something like this? ;)
>
> Does anybody have any idea?
Yes, here's a small program I wrote that works the same way, with piped
data. It scans stdin for a pattern in the input stream. If it finds the
pattern, it writes the message to /dev/null; if not, it writes it to stdout.
The end of the piped data is simply the point where read() returns 0. Notice
the read()/write() combo with a buffer: this is MUCH faster than the
getchar() method. Hope it helps.
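For the getchar() loop in the question, the same idea applies: getchar() and
getc() return EOF once the writer closes its end of the pipe. A minimal
sketch, assuming a caller-supplied buffer (scan_in and its parameters are
made-up names for illustration, not taken from the original tool):

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not from the original post: copy characters from
 * `in` into `buf` until end of input or until the buffer is full.
 * Returns the number of characters stored; buf is NUL-terminated
 * whenever cap > 0. */
size_t scan_in(FILE *in, char *buf, size_t cap)
{
    size_t n = 0;
    int c;      /* int, not char, so EOF stays distinguishable from data */

    if (cap == 0)
        return 0;
    /* Leave room for the terminating NUL; stop at end of input. */
    while (n + 1 < cap && (c = getc(in)) != EOF)
        buf[n++] = (char)c;
    buf[n] = '\0';
    return n;
}
```

With `cat stuff | mytool`, something like scan_in(stdin, tmpkey, sizeof tmpkey)
would then read until the pipe is closed rather than until the first newline,
assuming tmpkey is a char array.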
#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define EXIT_NOMATCH 0
#define EXIT_MATCH 1
//Should be large enough to scan most mail messages in one or two passes.
//Performance with a 4K buffer is .8s for a 73M message.
//Increasing to 32K only trims it to .6s with an unjustified increase in
//memory use.
//DO NOT SET THIS TOO SMALL. It probably won't catch the spam flag then, since
//it only scans the first BUFFERSIZE characters read in, then fast-copies the
//rest to stdout.
#define BUFFERSIZE 6144

//What we are checking for. Must be EXACT. Leave the newline in because it
//protects against the off chance that the string is in the message body
//somewhere. This way it will only match if it's at the beginning of a line.
#define CHECKSTRING "\nX-Spam-Flag: YES"
int main(void);
int dump_full_message();
int write_message(char *buffer,int len,int fd);
int read_message(char *buffer);
int scan_message(char *buffer,int len);
int main(void){
    //Our faithful buffer
    char buffer[BUFFERSIZE];
    //What file should we write to if it's spam.
    const char *spampipe = "/dev/null";
    int len,
        fd = STDOUT_FILENO,
        exit_status = EXIT_NOMATCH;

    //We only want to scan our message once. It's unlikely there are more than
    //BUFFERSIZE bytes of headers. By scanning once, we can trash the rest of
    //the output if it's spam. This also prevents scanning a huge non-matching
    //mail-message attachment.
    len = read_message(buffer);
    if (len){
        if( scan_message(buffer,len) == EXIT_MATCH){
            if( (fd = open(spampipe, O_WRONLY)) == -1){
                perror("Cannot open spam pipe...will write to stdout");
                fd = STDOUT_FILENO;
            }
            exit_status= EXIT_MATCH;
        }
    }
    write_message(buffer,len,fd);

    //Tight read/write loop for just piping the data. After the first
    //BUFFERSIZE characters we should already have what we need and just pass
    //the rest on. read_message() returns 0 at end of input, which ends the loop.
    do{
        len = read_message(buffer);
        if (len){
            write_message(buffer,len,fd);
        }
    }while(len > 0);

    close(fd);
    return exit_status;
}
int scan_message(char * buffer,int len){
    const char *spamflag = CHECKSTRING;
    const char *spamptr = spamflag;
    char *bufptr = buffer;
    int count;

    for(count = 0 ; count<len ; count++,bufptr++){
        //On a mismatch, restart the pattern. Restarting from the beginning is
        //enough here because the first character of CHECKSTRING ('\n') occurs
        //nowhere else in it.
        if (*bufptr != *spamptr)
            spamptr = spamflag;
        if (*bufptr == *spamptr)
            spamptr++;
        //We've hit a match
        if (*spamptr == '\0'){
            return EXIT_MATCH;
        }
    }
    return EXIT_NOMATCH;
}
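//Returns the number of bytes read; 0 means end of input (the writer closed
//its end of the pipe).
int read_message(char *buffer){
    int len;

    len = read(STDIN_FILENO,buffer,BUFFERSIZE-1);
    if (len < 0){
        perror("Read Error");
        exit(EXIT_NOMATCH);
    }
    return len;
}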
//Works almost like write() except that it checks for errors.
int write_message(char *buffer,int len,int fd){
    int ret;

    ret = write(fd,buffer,len);
    if (ret < 0){
        perror("Write Error");
        exit(EXIT_NOMATCH);
    }
    return ret;
}
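
One caveat about write_message(): write() may accept fewer bytes than it was
asked to, for example when stdout is itself a pipe that is nearly full. A
possible sketch of a wrapper that retries until everything has been written
(write_all is a hypothetical name, not part of the program above):

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `len` bytes have
 * been written. Returns 0 on success, or -1 on error with errno left as
 * set by write(). */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any byte was written: retry */
            return -1;
        }
        buf += n;           /* skip past what was accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

Under that assumption, write_message() could call write_all(fd, buffer, len)
in place of the single write() and keep its perror()/exit() handling for the
-1 case.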
--
-EB