[Cialug] Columns of Data
jim kraai
jimgkraai at gmail.com
Fri Jul 31 16:15:33 UTC 2020
Fixed widths can help, if the CLI tool lets you specify them.

If the order of columns can be specified and only one column is
variable-length, put the variable column last for parsing.

If the tool options allow for a custom field separator, use something
weird, like '~~~', that won't normally appear in the data.

If the tool options allow for standard or field delimiters, use those.

Failing that, dive into the tool's source code and add whatever
separators or delimiters would be useful.
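
A rough sketch of the custom-separator idea, assuming the tool can be told to put '~~~' between fields. The sample records and the function name split_fields are made up for illustration; they are not output from any real tool.

```shell
# Hypothetical sample: three fields joined by '~~~'; the last one has spaces.
sample='9m49s~~~Updated~~~Updated machine oo-r6sr3-master-1
59s~~~Pulled~~~Container image already present on machine'

# Print field N of each '~~~'-separated record read from stdin.
split_fields() {
    awk -F'~~~' -v n="$1" '{ print $n }'
}

# Field 3 comes back whole, spaces and all.
printf '%s\n' "$sample" | split_fields 3
```

Because the separator never occurs in the data, the spaces inside the last field stop mattering, and the variable column no longer has to come last.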
On Fri, Jul 31, 2020 at 10:18 AM Todd Walton <tdwalton at gmail.com> wrote:
> Happy SysAdmin Day, everyone!
>
> Here is an example bit of text coughed up by a Kubernetes command-line
> tool:
>
> 9m49s Normal Updated machine/oo-r6sr3-worker-us-east-1d-nvfzh Updated machine oo-r6sr3-worker-us-east-1d-nvfzh
> 9m47s Normal Updated machine/oo-r6sr3-master-1 Updated machine oo-r6sr3-master-1
> 9m46s Normal Updated machine/oo-r6sr3-worker-us-east-1b-hsmsx Updated machine oo-r6sr3-worker-us-east-1b-hsmsx
> 9m46s Normal Updated machine/oo-r6sr3-worker-us-east-1a-wk5cs Updated machine oo-r6sr3-worker-us-east-1a-wk5cs
> 9m46s Normal Updated machine/oo-r6sr3-worker-us-east-1e-z9xlb Updated machine oo-r6sr3-worker-us-east-1e-z9xlb
> 9m44s Normal Updated machine/oo-r6sr3-master-0 Updated machine oo-r6sr3-master-0
> 9m43s Normal Updated machine/oo-r6sr3-master-2 Updated machine oo-r6sr3-master-2
> 9m43s Normal Updated machine/oo-r6sr3-worker-us-east-1d-tfg6x Updated machine oo-r6sr3-worker-us-east-1d-tfg6x
> 9m43s Normal Updated machine/oo-r6sr3-worker-us-east-1c-6l42j Updated machine oo-r6sr3-worker-us-east-1c-6l42j
> 59s Normal SuccessfulUpdate clusterautoscaler/default Updated ClusterAutoscaler deployment: machine-api/cluster-autoscaler-default
> 4m7s Normal Pulled pod/gateway-laravel-schedule-1296080-h43n6 Container image "dockerregistry:4567/group/gateway/master:alpine-nodejs-fpm" already present on machine
>
> For the purpose of this email, never mind the semantics. This could
> be anything. But do notice that the output is arranged into neat columns.
> The first four columns are strings of non-space characters. The fifth
> column, however, gives us trouble. It seeks to undermine the movement from
> within, throwing a wrench into the works. Fifth columns, amiright?
>
> Here's another example, this one taken from my /var/log/messages:
>
> Jun 28 02:50:42 ilm01-ll-ttwalto NetworkManager[2238]: <info> device (wlp1s0): set-hw-addr: set MAC address to 9:7:8:9:2:F (scanning)
> Jun 28 02:50:42 ilm01-ll-ttwalto kernel: IPv6: ADDRCONF(NETDEV_UP): wlp1s0: link is not ready
> Jun 28 02:50:42 ilm01-ll-ttwalto NetworkManager[2238]: <info> device (wlp1s0): supplicant interface state: inactive -> disabled
> Jun 28 02:50:42 ilm01-ll-ttwalto NetworkManager[2238]: <info> device (wlp1s0): supplicant interface state: disabled -> inactive
> Jun 28 02:55:57 ilm01-ll-ttwalto NetworkManager[2238]: <info> device (wlp1s0): set-hw-addr: set MAC address to 62:0F:6E:7A:B3:2C (scanning)
> Jun 28 02:55:57 ilm01-ll-ttwalto kernel: IPv6: ADDRCONF(NETDEV_UP): wlp1s0: link is not ready
> Jun 28 02:55:57 ilm01-ll-ttwalto NetworkManager[2238]: <info> device (wlp1s0): supplicant interface state: inactive -> disabled
>
> Here again the fifth column is making things difficult. Also the first
> three could certainly stand to be one column, but at least they're
> standard, predictable, and manipulable. Manipulable being what I'm looking
> for.
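
One way the log lines might be made more manipulable, as a sketch only: fold the three timestamp fields into one and re-emit each line with an explicit separator. The function name syslog_fields and the choice of '|' as separator are assumptions for illustration. It also assumes the traditional "Mon DD HH:MM:SS host tag: message" layout seen in the quoted lines.

```shell
# One line from the quoted log, rejoined onto a single line.
log='Jun 28 02:50:42 ilm01-ll-ttwalto kernel: IPv6: ADDRCONF(NETDEV_UP): wlp1s0: link is not ready'

# Emit timestamp, host, tag, and message separated by sep.
syslog_fields() {
    awk -v sep="$1" '{
        ts = $1 " " $2 " " $3
        host = $4
        tag = $5
        msg = $0
        # Strip the five leading fields; the message keeps its own spacing.
        for (i = 1; i <= 5; i++) sub(/^[ \t]*[^ \t]+[ \t]*/, "", msg)
        print ts sep host sep tag sep msg
    }'
}

printf '%s\n' "$log" | syslog_fields '|'
```

After that, something like cut -d'|' -f4 gets the whole message, provided the separator never shows up in the log text itself.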
>
> This happens frequently, where a command or log outputs text in columns
> that are not quite usefully arranged. How does one deal with columnar data
> like this? I can't use 'cut'. What would I cut on that would capture the
> first columns *and* keep the last one intact? I'm not sure how one would
> easily use awk for this. Is there something like '{ print $5- }'? Meaning,
> from column 5 onwards? I can't use "column -t" because that screws
> everything up royally.
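
awk has no '$5-' syntax, but "everything from field 5 on" can be approximated in a couple of ways. The sketch below shows two; the function names rest_with_read and rest_with_awk are made up here. (cut -d' ' -f5- also exists, but it only behaves when fields are separated by exactly one space, which padded column output usually is not.)

```shell
# One record from the quoted output, rejoined onto a single line.
line='9m47s Normal Updated machine/oo-r6sr3-master-1 Updated machine oo-r6sr3-master-1'

# Shell read: the last variable receives the rest of the line.
rest_with_read() {
    while read -r age type reason object message; do
        printf '%s\n' "$message"
    done
}

# awk: strip the first n whitespace-delimited fields from the record,
# leaving the spacing of the remainder untouched.
rest_with_awk() {
    awk -v n="$1" '{
        for (i = 1; i <= n; i++) sub(/^[ \t]*[^ \t]+[ \t]*/, "")
        print
    }'
}

printf '%s\n' "$line" | rest_with_read
printf '%s\n' "$line" | rest_with_awk 4
```

Both work only because the free-text column is the last one. The read version also leaves the first four columns in named variables inside the loop.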
>
> Another thing that trips me up. Sometimes I'll have a nice set of
> comma-separated values but there'll be a comma in one of the fields. The
> typical way of dealing with this in CSV files is to quote the entire field.
> But that doesn't help me, the bash scripter.
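
For quoted CSV, one option that stays in shell territory is to rewrite the file with a separator that can't collide with the data, then use the usual tools on the result. This is a minimal sketch, not a full CSV parser: the function name csv_resep is made up, and it assumes no field contains an embedded newline. gawk's FPAT variable or a real CSV library would be sturdier choices where available.

```shell
# Re-emit simple CSV (quoted fields allowed, no embedded newlines) with a
# different separator so later tools can split on it safely.
csv_resep() {
    awk -v sep="$1" '{
        out = ""; inq = 0
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "\"") {
                # A doubled quote inside a quoted field is a literal quote.
                if (inq && substr($0, i + 1, 1) == "\"") { out = out "\""; i++ }
                else inq = !inq
            } else if (c == "," && !inq) {
                out = out sep
            } else {
                out = out c
            }
        }
        print out
    }'
}

# The comma inside the quoted field survives; the field commas do not.
printf '%s\n' 'a,"b, c",d' | csv_resep '~~~' | awk -F'~~~' '{ print $2 }'
```

The quotes are dropped on the way through, so downstream commands see plain field text.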
>
> Any suggestions for how to deal with stuff like this?
>
> --
> Todd
> _______________________________________________
> Cialug mailing list
> Cialug at cialug.org
> https://www.cialug.org/cgi-bin/mailman/listinfo/cialug
>