[Cialug] multiline regular expression
Dave Weis
djweis at internetsolver.com
Thu Jun 28 08:10:36 CDT 2007
Morris Dovey wrote:
> Dave Weis wrote:
> | I need to parse a text file that contains two lines per record in
> | this format:
> | 324 NOR 05/17/07 10:21A 0000000 999-999-9999
> | COLUMBUS OH 1 .9 .0700 .0000 .0000
> |
> | Y CURRENT 00:00:54 0000000 999-999-9999 DES
> | MOINES IA
> |
> | There are other lines in the file that are similar like
> | LATA USGE-GP DATE TIME DEST-CITY
> | --------DESTINATION-------- #RECS MINUTES AMOUNT-1 AMOUNT-2
> | VOL-AMT
> | ANI STATUS ACT-DUR ORIG-CITY
> | --------ORIGINATION-------- MISC-1 MISC-2
> | VOL-COD
> | that is the header.
> |
> | I'll be using the java regexp but if anyone can direct me on any
> | regexp setup I'll convert it myself.
>
> Hmm. It'd help to know more about the problem context - I'd be
> inclined to do the parsing in C using something like
> http://www.iedu.com/mrd/c/tokenize.c and, depending on the size of the
> file, something like http://www.iedu.com/mrd/c/tokfile.c to tokenize
> all the lines in the file in one shot...
It's a long-distance bill: a couple hundred thousand lines, and it will get
longer every month.
The final destination is a PostgreSQL database for rating and billing.
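
For reference, a rough sketch of the kind of two-line pattern the original question asks for, using java.util.regex with the MULTILINE flag. It makes several assumptions: that the extra wrapping in the sample above comes from the mail client and each record is really two physical lines; that the first line starts with a 3-digit LATA, a usage group, a date and a time; and that the second line starts with a flag, a status and an hh:mm:ss duration. The class name BillRecordParser and the grouping of the trailing columns into one "rest of line" field are made up for illustration, since the exact column layout can't be recovered from the wrapped sample.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original thread.
class BillRecordParser {
    // Field separator: spaces or tabs only, so a field never spans a newline.
    private static final String SP = "[ \\t]+";

    // Line 1: LATA, usage group, date, time, then the rest of the line.
    // Line 2: flag, status, duration (hh:mm:ss), then the rest of the line.
    // The header and dashed lines do not start with 3 digits plus a date,
    // so they are skipped without any special handling.
    static final Pattern RECORD = Pattern.compile(
          "^[ \\t]*(\\d{3})" + SP + "(\\S+)" + SP + "(\\d{2}/\\d{2}/\\d{2})" + SP
        + "(\\d{1,2}:\\d{2}[AP])" + SP + "(.*)\\r?\\n"
        + "[ \\t]*(\\S+)" + SP + "(\\S+)" + SP + "(\\d{2}:\\d{2}:\\d{2})" + SP + "(.*)$",
        Pattern.MULTILINE);

    // Returns one String[9] per record: 5 fields from line 1, 4 from line 2.
    static List<String[]> parse(CharSequence text) {
        List<String[]> records = new ArrayList<String[]>();
        Matcher m = RECORD.matcher(text);
        while (m.find()) {
            String[] fields = new String[m.groupCount()];
            for (int i = 0; i < fields.length; i++) {
                fields[i] = m.group(i + 1).trim();
            }
            records.add(fields);
        }
        return records;
    }

    public static void main(String[] args) {
        String sample =
              "LATA USGE-GP DATE TIME DEST-CITY\n"
            + "324 NOR 05/17/07 10:21A 0000000 999-999-9999 COLUMBUS OH 1 .9 .0700 .0000 .0000\n"
            + "Y CURRENT 00:00:54 0000000 999-999-9999 DES MOINES IA\n";
        for (String[] rec : parse(sample)) {
            System.out.println(String.join(" | ", rec));
        }
    }
}
```

The trailing columns on each line are left as a single group on purpose; once the real column positions are known they could be split further, or sliced by fixed offsets instead of a regex, before loading into the database.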
dave