[Cialug] Regular Expression for Pathnames
Daniel A. Ramaley
daniel.ramaley at drake.edu
Mon Nov 24 13:43:58 CST 2014
I realized right after sending my reply that, the way your problem is
stated, you might want to match different numbers of slashes. To match
strings that have between 1 and 3 (inclusive) slashes, change the part
between the curly braces to "1,3" instead of just "3":
egrep '^(/[^/]+){1,3}$'
Explaining what this bit of line noise means:
^        Match the beginning of the line.
()       Group the enclosed pattern and treat it as one unit.
  /      Match a literal "/".
  []     Match a single character from the set inside the brackets.
         For example, "[abcde]" matches any one of the five lowercase
         letters a through e.
    ^    As the first character inside square brackets, this inverts
         the set of characters matched.
    /    A "/" inside the brackets. Because of the preceding "^" the
         set is inverted, so this means match anything *except* "/".
  +      Match the [] expression 1 or more times.
{}       Counter: how many times the preceding group should match.
  1,3    Match between 1 and 3 times, inclusive.
$        Match the end of the line.
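To see the pieces working together, here is a minimal sketch that runs
both forms of the pattern over the sample paths from the original
question. It assumes a grep that supports -E (equivalent to egrep); the
variable names are only for illustration.

```shell
# Sample paths from the original question, one per line.
paths='/var/log/folder1
/var/log/folder2
/home/todd/mydir
/var/log/folder1/fileh
/var/log/folder1/foldersub/fileh'

# Exactly 3 slashes: only the first three sample lines qualify.
exactly_three=$(printf '%s\n' "$paths" | grep -E '^(/[^/]+){3}$')

# Between 1 and 3 slashes: the same lines here, because no sample
# path has fewer than 3 slashes.
one_to_three=$(printf '%s\n' "$paths" | grep -E '^(/[^/]+){1,3}$')

printf '%s\n' "$exactly_three"
```

On this sample both patterns select the same three lines; they would
differ only if the list contained shorter paths such as /var or
/var/log.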
You can find a more precise explanation in the regex documentation; this
is just off the top of my head. The most important thing when reading
regexes is figuring out how they are grouped. The basic pattern here is
(foo){bar}, which just means "match foo, bar number of times".
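Since only the counter changes from one query to the next, the
(foo){bar} shape can be wrapped in a small helper. This is just a
sketch: depth_pattern is a hypothetical function name, not something
grep provides, and it assumes path components never contain a newline.

```shell
# depth_pattern is a hypothetical helper, not part of grep.
# It prints an extended regex matching paths that have between
# MIN ($1) and MAX ($2) slash-separated components.
depth_pattern() {
    printf '^(/[^/]+){%s,%s}$' "$1" "$2"
}

pattern=$(depth_pattern 4 5)

# Filter a few of the sample paths; only those with 4 or 5 slashes pass.
matches=$(printf '%s\n' \
    /var/log/folder1 \
    /var/log/folder1/fileh \
    /var/log/folder1/foldersub/fileh |
    grep -E "$pattern")

printf '%s\n' "$matches"
```

Here "foo" is the fixed /[^/]+ component and "bar" is whatever range is
passed in, so the same function covers "exactly 3" (pass 3 and 3) as
well as "1 to 3".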
On 2014-11-24 at 10:58:24 Ron Houk wrote:
> Wow. Your solution is a lot more elegant. I'm still trying to learn
> this stuff. :)
>
> On Nov 24, 2014 10:35 AM, "Daniel A. Ramaley"
> <daniel.ramaley at drake.edu> wrote:
> > This should work for your purposes if all the data looks like the
> > samples. Set the number between curly braces to whatever you need.
> > (Note that your sample data didn't have any matches for just 2
> > slashes, but does for 3 slashes.)
> >
> > egrep '^(/[^/]+){3}$'
> >
> > On 2014-11-24 at 10:23:07 Todd Walton wrote:
> > > If I have a text file full of pathnames, like:
> > >
> > > /var/log/folder1
> > > /var/log/folder2
> > > /home/todd/mydir
> > > /var/log/folder1/fileh
> > > /var/log/folder1/foldersub/fileh
> > >
> > > ...etc, what's the regular expression to find where a string has
> > > exactly two (or however many) forward slashes to the left of it?
> > > I have a 360,000 line list of path names, and I'd like to find
> > > where a certain string falls early in the path. I'm really only
> > > interested in paths where it's in the top three or four
> > > directories.
> > >
> > > --
> > > Todd
> > > _______________________________________________
> > > Cialug mailing list
> > > Cialug at cialug.org
> > > http://cialug.org/mailman/listinfo/cialug
> >
> > __
> > Daniel A. Ramaley | Network Engineer 2
> > Drake Technology Services (DTS) | Drake University
> >
> > T: +1 515 271-4540
> > F: +1 515 271-1938
> > E: daniel.ramaley at drake.edu
__
Daniel A. Ramaley | Network Engineer 2
Drake Technology Services (DTS) | Drake University
T: +1 515 271-4540
F: +1 515 271-1938
E: daniel.ramaley at drake.edu