linux-c-programming.vger.kernel.org archive mirror
* parsing with fscanf().
@ 2002-09-11 18:24 Elias Athanasopoulos
  2002-09-11 19:36 ` Glynn Clements
  2002-09-11 22:14 ` Richard Webb
  0 siblings, 2 replies; 6+ messages in thread
From: Elias Athanasopoulos @ 2002-09-11 18:24 UTC
  To: linux-c-programming

Hi,

A really newbie question, but I have no time for research (my deadline
is counted in hours).

I have to parse a text file, which has been exported from MS Excel (I
have no access to this thingie) using TABs as delimiters. The text
is like:

4	3	2.0
5		1.2
4	3	2.4

I use fscanf() to insert the data into my structures, but the "gap" (between
5 and 1.2) breaks the sequence. fscanf() places the 3rd column in the 2nd
field of my structure.

Is there a trivial way to bypass the problem using fscanf(), or do I have
to do the parsing using read()?

Elias

-- 
http://gnewtellium.sourceforge.net			MP3 is not a crime.	

* Re: parsing with fscanf().
  2002-09-11 18:24 parsing with fscanf() Elias Athanasopoulos
@ 2002-09-11 19:36 ` Glynn Clements
  2002-09-11 20:08   ` Elias Athanasopoulos
  2002-09-11 22:14 ` Richard Webb
  1 sibling, 1 reply; 6+ messages in thread
From: Glynn Clements @ 2002-09-11 19:36 UTC
  To: Elias Athanasopoulos; +Cc: linux-c-programming


Elias Athanasopoulos wrote:

> A really newbie question, but I have no time for research (my deadline
> is counted in hours).
> 
> I have to parse a text file, which has been exported from MS Excel (I
> have no access to this thingie) using TABs as delimiters. The text
> is like:
> 
> 4	3	2.0
> 5		1.2
> 4	3	2.4
> 
> I use fscanf() to insert the data into my structures, but the "gap" (between
> 5 and 1.2) breaks the sequence. fscanf() places the 3rd column in the 2nd
> field of my structure.
> 
> Is there a trivial way to bypass the problem using fscanf(), or do I have
> to do the parsing using read()?

fscanf() is worthless when the data doesn't adhere to a rigid format. 
One of the main problems is that, even when it isn't entirely
successful, it consumes some of the data from the stream.
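
(A minimal sketch of the partial-consumption problem, not from the
original post: the helper name read_two_ints() and the "%d %d" format are
assumptions chosen for illustration. It reports how many items fscanf()
converted and which character the stream is left at afterwards.)

```c
#include <stdio.h>

/* Try to read two ints from fp. On return, *next_ch holds the first
 * character that fscanf() did not consume (or EOF). */
int read_two_ints(FILE *fp, int *a, int *b, int *next_ch)
{
	int n = fscanf(fp, "%d %d", a, b);

	*next_ch = fgetc(fp);
	return n;	/* number of items converted, or EOF */
}
```

On a stream holding "12 abc", this should return 1 with the stream left at
'a': the first field has already been consumed and cannot be re-read with a
different format.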

A better solution is to read whole lines (e.g. with fgets()), then parse
each one with sscanf(). That way, if sscanf() fails, you can try again
with the same data. E.g.

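	/* needs <stdio.h> and <string.h>; fp, i1, i2, f and error() are yours */
	for (;;)
	{
		char buff[81];
		if (fgets(buff, sizeof(buff), fp) == NULL)
			break;		/* EOF or read error */
		/* Blanks in a scanf format match any run of whitespace, so
		 * "%d %d %f" would read "5\t\t1.2" as 5, 1, 0.2. Detect the
		 * empty middle column (two adjacent TABs) first. */
		if (strstr(buff, "\t\t") != NULL)
		{
			if (sscanf(buff, "%d %f", &i1, &f) == 2)
				continue;
		}
		else if (sscanf(buff, "%d %d %f", &i1, &i2, &f) == 3)
			continue;
		error();
	}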
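
A related caveat: blanks in a scanf() format match any run of whitespace,
so a pattern like "%d %d %f" cannot, on its own, tell an empty
TAB-separated column from a filled one. One alternative, sketched here
under assumptions, is to split each line on single TABs and convert the
fields with strtol()/strtod(). The struct row type, its has_b flag and the
parse_row() name are hypothetical, not part of the original code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type: has_b is 0 when the middle column is empty. */
struct row {
	int a;
	int b;
	int has_b;
	double c;
};

/* Parse one "a<TAB>b<TAB>c" line; the b column may be empty.
 * Returns 0 on success, -1 on a malformed line. */
int parse_row(const char *line, struct row *r)
{
	char buf[128];
	char *field[3];
	char *p, *end;
	int n = 0;

	if (strlen(line) >= sizeof(buf))
		return -1;
	strcpy(buf, line);
	buf[strcspn(buf, "\r\n")] = '\0';	/* drop the line terminator */

	/* Split on every single TAB so that empty fields are preserved
	 * (strtok() would merge adjacent delimiters). */
	p = buf;
	for (;;) {
		char *tab;

		if (n == 3)
			return -1;		/* too many columns */
		field[n++] = p;
		tab = strchr(p, '\t');
		if (tab == NULL)
			break;
		*tab = '\0';
		p = tab + 1;
	}
	if (n != 3)
		return -1;			/* too few columns */

	r->a = (int)strtol(field[0], &end, 10);
	if (end == field[0] || *end != '\0')
		return -1;

	r->has_b = (field[1][0] != '\0');
	r->b = 0;
	if (r->has_b) {
		r->b = (int)strtol(field[1], &end, 10);
		if (end == field[1] || *end != '\0')
			return -1;
	}

	r->c = strtod(field[2], &end);
	if (end == field[2] || *end != '\0')
		return -1;
	return 0;
}
```

Call parse_row() on each line returned by fgets(); for "5<TAB><TAB>1.2" it
should report a = 5, has_b = 0 and c = 1.2, while a line with only two
columns is rejected.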

If you ever need to write a real parser, learn lex/yacc.

-- 
Glynn Clements <glynn.clements@virgin.net>

* Re: parsing with fscanf().
  2002-09-11 19:36 ` Glynn Clements
@ 2002-09-11 20:08   ` Elias Athanasopoulos
  2002-09-12  7:25     ` Mohammed Khalid Ansari
  0 siblings, 1 reply; 6+ messages in thread
From: Elias Athanasopoulos @ 2002-09-11 20:08 UTC
  To: Glynn Clements; +Cc: Elias Athanasopoulos, linux-c-programming

On Wed, Sep 11, 2002 at 08:36:04PM +0100, Glynn Clements wrote:
> fscanf() is worthless when the data doesn't adhere to a rigid format. 
> One of the main problems is that, even when it isn't entirely
> successful, it consumes some of the data from the stream.
> 
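> A better solution is to read whole lines (e.g. with fgets()), then parse
> each one with sscanf(). That way, if sscanf() fails, you can try again
> with the same data. E.g.
> 
> 	/* needs <stdio.h> and <string.h>; fp, i1, i2, f and error() are yours */
> 	for (;;)
> 	{
> 		char buff[81];
> 		if (fgets(buff, sizeof(buff), fp) == NULL)
> 			break;		/* EOF or read error */
> 		/* Blanks in a scanf format match any run of whitespace, so
> 		 * "%d %d %f" would read "5\t\t1.2" as 5, 1, 0.2. Detect the
> 		 * empty middle column (two adjacent TABs) first. */
> 		if (strstr(buff, "\t\t") != NULL)
> 		{
> 			if (sscanf(buff, "%d %f", &i1, &f) == 2)
> 				continue;
> 		}
> 		else if (sscanf(buff, "%d %d %f", &i1, &i2, &f) == 3)
> 			continue;
> 		error();
> 	}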

Thank you. I was about to write something like that, but I wasn't sure and
was ready to give up and go the traditional way with read(). Anyway, thanks
for the above code, it helps a lot. :-)

> If you ever need to write a real parser, learn lex/yacc.

Nah... it is for a short report, a project on which I want to spend as
little time as I can.

Elias

-- 
http://gnewtellium.sourceforge.net			MP3 is not a crime.	

* Re: parsing with fscanf().
  2002-09-11 18:24 parsing with fscanf() Elias Athanasopoulos
  2002-09-11 19:36 ` Glynn Clements
@ 2002-09-11 22:14 ` Richard Webb
  1 sibling, 0 replies; 6+ messages in thread
From: Richard Webb @ 2002-09-11 22:14 UTC
  To: Elias Athanasopoulos; +Cc: linux-c-programming


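Using flex:

%%
[,\t]+$		;		/* drop a run at the end of a line */
[,\t]+		putchar(',');	/* any other run becomes one comma */

will replace each run of commas and tabs with a single comma, and drop a
run at the end of a line.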
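
For comparison, roughly the same transformation can be sketched in plain C.
squeeze_delims() is a hypothetical helper name, and unlike the flex rules
it also drops a run at the very end of the string. Note that collapsing
runs turns "5<TAB><TAB>1.2" into "5,1.2", so the position of an empty
column is lost; this suits data where gaps need not be tracked.

```c
#include <stdio.h>
#include <string.h>

/* Copy in to out, replacing each run of commas/TABs with one comma and
 * dropping a run that is followed by a newline or the end of the string.
 * out must have room for strlen(in) + 1 bytes. */
void squeeze_delims(const char *in, char *out)
{
	while (*in != '\0') {
		size_t run = strspn(in, ",\t");

		if (run > 0) {
			in += run;
			/* A trailing run is discarded instead of emitted. */
			if (*in != '\n' && *in != '\0')
				*out++ = ',';
		} else {
			*out++ = *in++;
		}
	}
	*out = '\0';
}
```

Applied to each line read with fgets(), "4<TAB>3<TAB>2.0" becomes
"4,3,2.0".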

R.

* Re: parsing with fscanf().
  2002-09-12  7:25     ` Mohammed Khalid Ansari
@ 2002-09-12  7:21       ` Carlos Fernández
  0 siblings, 0 replies; 6+ messages in thread
From: Carlos Fernández @ 2002-09-12  7:21 UTC
  To: Mohammed Khalid Ansari, Elias Athanasopoulos
  Cc: Glynn Clements, linux-c-programming

http://epaperpress.com/lexandyacc/

(happens to be the first result on google, btw)

----- Original Message -----
From: "Mohammed Khalid Ansari" <khalid@ncst.ernet.in>
To: "Elias Athanasopoulos" <eathan@otenet.gr>
Cc: "Glynn Clements" <glynn.clements@virgin.net>;
<linux-c-programming@vger.kernel.org>
Sent: Thursday, September 12, 2002 2:25 PM
Subject: Re: parsing with fscanf().


>
> Hello,
>
> Is there any site which gives an extensive tutorial on lex/yacc?

* Re: parsing with fscanf().
  2002-09-11 20:08   ` Elias Athanasopoulos
@ 2002-09-12  7:25     ` Mohammed Khalid Ansari
  2002-09-12  7:21       ` Carlos Fernández
  0 siblings, 1 reply; 6+ messages in thread
From: Mohammed Khalid Ansari @ 2002-09-12  7:25 UTC
  To: Elias Athanasopoulos; +Cc: Glynn Clements, linux-c-programming


Hello,

Is there any site which gives an extensive tutorial on lex/yacc?

with regards...

-- 

**************************************************************************

Mohammed Khalid Ansari                    Tel (res) : 0091-022-3051360
Assistant Manager II                          (off) : 0091-022-2024641
National Centre for Software Technology   Fax       : 0091-022-2049573 
8th flr,Air India Build. Nariman Point,   E-Mail    : khalid@ncst.ernet.in 	
Mumbai 400021.

Homepage : http://soochak.ncst.ernet.in/~khalid			  	  

**************************************************************************
