[Cialug] Text Processing Choices
John Lengeling
John.Lengeling at radisys.com
Tue Jun 30 11:42:06 CDT 2009
> Except for perl, I'm with Daniel. If the command line tools don't work
> out I often move to a spreadsheet. For example I commonly need to get
> a bunch of data ready to put into a database so I will bring the raw
> ...
> ...
You usually end up having to use a mix of tools.
I don't like spreadsheets since they are pretty restrictive on the size
of data. There are limits on the number of rows and also on the maximum
size of a cell.
I work a lot with multiline text data, and spreadsheets just don't
handle multiline text well.
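To illustrate the multiline point, here is a minimal sketch of splitting text into blank-line-separated records so that each multiline record stays in one piece. It is written in Python purely for illustration and is not from the original post; the function name split_records and the sample data are hypothetical, and it assumes records are separated by one or more blank lines.

```python
import re


def split_records(text):
    """Split text into records separated by one or more blank lines."""
    # Strip the ends first so no empty record appears before or after the data.
    return [rec for rec in re.split(r"\n\s*\n", text.strip()) if rec]


# Hypothetical sample: two records, the first with a continuation line.
sample = """name: alpha
note: first line
  continued line

name: beta
note: single line
"""

records = split_records(sample)
for number, record in enumerate(records, 1):
    print(number, len(record.splitlines()), "lines")
```

Each record keeps its internal line breaks, which is the part a spreadsheet cell tends to handle poorly.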
UNIX tools like sed/awk/cut/etc. and perl don't have as many limits. I
use the UNIX tools for simpler stuff (stuff written in 10-20 lines) and
perl for more complicated stuff like data validation, data translation,
code page conversion, screen scraping, or XML formatting.
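As one example of the more complicated jobs listed above, a code page conversion can be sketched in a few lines. This is a Python illustration rather than the perl the post refers to; the function name convert_codepage and the choice of cp1252 as the source code page are assumptions made for the example.

```python
def convert_codepage(data, src="cp1252", dst="utf-8"):
    """Re-encode bytes from one code page to another.

    Decoding is strict by default, so bytes that are not valid in the
    source code page raise UnicodeDecodeError instead of passing silently.
    """
    return data.decode(src).encode(dst)


# Hypothetical input: "cafe" with an accented e, encoded in cp1252.
raw = "caf\u00e9".encode("cp1252")
converted = convert_codepage(raw)
print(raw, "->", converted)
```

The strict decode also acts as a basic validation step, since malformed input fails loudly rather than being converted into garbage.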
johnl