[Cialug] Columns of Data
Todd Walton
tdwalton at gmail.com
Fri Jul 31 15:17:25 UTC 2020
Happy SysAdmin Day, everyone!
Here is an example bit of text coughed up by a Kubernetes command-line tool:
9m49s Normal Updated machine/oo-r6sr3-worker-us-east-1d-nvfzh Updated machine oo-r6sr3-worker-us-east-1d-nvfzh
9m47s Normal Updated machine/oo-r6sr3-master-1 Updated machine oo-r6sr3-master-1
9m46s Normal Updated machine/oo-r6sr3-worker-us-east-1b-hsmsx Updated machine oo-r6sr3-worker-us-east-1b-hsmsx
9m46s Normal Updated machine/oo-r6sr3-worker-us-east-1a-wk5cs Updated machine oo-r6sr3-worker-us-east-1a-wk5cs
9m46s Normal Updated machine/oo-r6sr3-worker-us-east-1e-z9xlb Updated machine oo-r6sr3-worker-us-east-1e-z9xlb
9m44s Normal Updated machine/oo-r6sr3-master-0 Updated machine oo-r6sr3-master-0
9m43s Normal Updated machine/oo-r6sr3-master-2 Updated machine oo-r6sr3-master-2
9m43s Normal Updated machine/oo-r6sr3-worker-us-east-1d-tfg6x Updated machine oo-r6sr3-worker-us-east-1d-tfg6x
9m43s Normal Updated machine/oo-r6sr3-worker-us-east-1c-6l42j Updated machine oo-r6sr3-worker-us-east-1c-6l42j
59s Normal SuccessfulUpdate clusterautoscaler/default Updated ClusterAutoscaler deployment: machine-api/cluster-autoscaler-default
4m7s Normal Pulled pod/gateway-laravel-schedule-1296080-h43n6 Container image "dockerregistry:4567/group/gateway/master:alpine-nodejs-fpm" already present on machine
For the purposes of this email, never mind the semantics. This could
be anything. But do notice that the output is arranged into neat columns.
The first four columns are strings of non-space characters. The fifth
column, however, gives us trouble. It seeks to undermine the movement from
within, throwing a wrench into the works. Fifth columns, amiright?
Here's another example, this one taken from my /var/log/messages:
Jun 28 02:50:42 ilm01-ll-ttwalto NetworkManager[2238]: <info> device (wlp1s0): set-hw-addr: set MAC address to 9:7:8:9:2:F (scanning)
Jun 28 02:50:42 ilm01-ll-ttwalto kernel: IPv6: ADDRCONF(NETDEV_UP): wlp1s0: link is not ready
Jun 28 02:50:42 ilm01-ll-ttwalto NetworkManager[2238]: <info> device (wlp1s0): supplicant interface state: inactive -> disabled
Jun 28 02:50:42 ilm01-ll-ttwalto NetworkManager[2238]: <info> device (wlp1s0): supplicant interface state: disabled -> inactive
Jun 28 02:55:57 ilm01-ll-ttwalto NetworkManager[2238]: <info> device (wlp1s0): set-hw-addr: set MAC address to 62:0F:6E:7A:B3:2C (scanning)
Jun 28 02:55:57 ilm01-ll-ttwalto kernel: IPv6: ADDRCONF(NETDEV_UP): wlp1s0: link is not ready
Jun 28 02:55:57 ilm01-ll-ttwalto NetworkManager[2238]: <info> device (wlp1s0): supplicant interface state: inactive -> disabled
Here again, the fifth column is making things difficult. Also, the first
three could certainly stand to be one column, but at least they're
standard, predictable, and manipulable. Manipulable being what I'm looking
for.
This happens frequently, where a command or log outputs text in columns
that are not quite usefully arranged. How does one deal with columnar data
like this? I can't use 'cut'. What would I cut on that would capture the
first columns *and* keep the last one intact? I'm not sure how one would
easily use awk for this. Is there something like '{ print $5- }'? Meaning,
from column 5 onwards? I can't use "column -t" because that screws
everything up royally.
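One possible angle on the "column 5 onwards" idea, as a rough sketch rather than a settled answer: the shell's own read builtin assigns one word to each variable it is given and puts whatever is left of the line into the last one. The function names below (from_col5, split_five) are made up for illustration, and the sketch assumes the leading columns never contain spaces.

```shell
# Hypothetical helper: print only the text from column 5 onwards.
# read splits on whitespace; the last variable receives the rest of the line.
from_col5() {
    while read -r c1 c2 c3 c4 rest; do
        printf '%s\n' "$rest"
    done
}

# Hypothetical helper: re-emit each line as five tab-separated fields,
# so that later tools can split on tabs (e.g. cut -f5).
split_five() {
    while read -r c1 c2 c3 c4 rest; do
        printf '%s\t%s\t%s\t%s\t%s\n' "$c1" "$c2" "$c3" "$c4" "$rest"
    done
}

# Example with one of the lines from above:
printf '%s\n' '9m47s Normal Updated machine/oo-r6sr3-master-1 Updated machine oo-r6sr3-master-1' | from_col5
```

The same pattern should carry over to the /var/log/messages lines by naming the variables differently, for instance "read -r mon day time host rest". One caveat: read trims leading and trailing whitespace from the remainder, which is usually harmless for this kind of output.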
Another thing that trips me up: sometimes I'll have a nice set of
comma-separated values, but there'll be a comma in one of the fields. The
typical way of dealing with this in CSV files is to quote the entire field.
But that doesn't help me, the bash scripter.
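A sketch of one way the quoted-field case might be handled from a shell script: walk each line character by character in plain awk, track whether we are inside double quotes, and turn only the unquoted commas into tabs. The name csv_to_tsv is hypothetical, and the sketch assumes fields contain no tabs or embedded newlines.

```shell
# Hypothetical helper: convert simple CSV on stdin to tab-separated text.
# Assumes no tabs or newlines inside fields.
csv_to_tsv() {
    awk '{
        out = ""; inq = 0
        n = length($0)
        for (i = 1; i <= n; i++) {
            ch = substr($0, i, 1)
            if (ch == "\"") {
                # A doubled quote inside a quoted field is a literal quote.
                if (inq && substr($0, i + 1, 1) == "\"") { out = out "\""; i++ }
                else inq = !inq      # opening or closing quote: toggle state
            } else if (ch == "," && !inq) {
                out = out "\t"       # a comma outside quotes separates fields
            } else {
                out = out ch         # ordinary character, or a comma inside quotes
            }
        }
        print out
    }'
}

# Example: the second field keeps its comma.
printf '%s\n' 'a,"b, c",d' | csv_to_tsv | cut -f2
```

Once the data is tab-separated, cut -f (which splits on tabs by default) can pick out fields without tripping over the embedded commas.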
Any suggestions for how to deal with stuff like this?
--
Todd