<div dir="ltr">On Fri, Jul 25, 2008 at 4:39 PM, Todd Walton <span dir="ltr"><<a href="mailto:tdwalton@gmail.com">tdwalton@gmail.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="Ih2E3d">On Fri, Jul 25, 2008 at 3:07 PM, Chris Freeman <<a href="mailto:cwfreeman@gmail.com">cwfreeman@gmail.com</a>> wrote:<br>
> \b(|D|PP)\d{1,3}(,{0,1}\d{3})*<br>
<br>
</div>You'll miss the first 999 patents. I'm guessing that's not gonna<br>
matter, but couldn't you make character 13 above be a zero instead of<br>
a one?<br>
</blockquote><div><br>I don't see how it would miss the first 999 patents: \d{1,3} already accepts one to three digits, and it matches 88, 999, etc. just fine in my tests.<br><br>However, \b also matches at the boundary between a comma and a digit, and nothing anchors the end of the number, so you'd need something like:<br>(\s|^)(|D|PP)\d{1,3}(,{0,1}\d{3})*(\s|$)<br>
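<br>To make the difference concrete, here is a minimal sketch that runs the original \b pattern and the anchored pattern above against a few inputs. The helper name verdict and the choice of sample strings are only for illustration.<br>

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Original pattern (word-boundary start, no end anchor) versus the
# whitespace/line-anchored version proposed above.
my $boundary = qr/\b(|D|PP)\d{1,3}(,{0,1}\d{3})*/;
my $anchored = qr/(\s|^)(|D|PP)\d{1,3}(,{0,1}\d{3})*(\s|$)/;

# Returns "Match" or "No match" for a string against a compiled pattern.
# (Illustrative helper, not part of the original script.)
sub verdict {
    my ($re, $text) = @_;
    return $text =~ $re ? "Match" : "No match";
}

# Two short numbers, one badly grouped number, one with a trailing comma.
for my $text ("88", "999", "12,34,56", "12,345,678,") {
    printf "%-12s  \\b: %-8s  anchored: %s\n",
        $text, verdict($boundary, $text), verdict($anchored, $text);
}
```

Both patterns accept 88 and 999. Only the \b version accepts the last two inputs, because it is satisfied by a valid prefix such as "12" and never checks what follows it.<br>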
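<br>I'm using "\s" to require whitespace (or the start/end of the line) around the number, which may not be a valid assumption: a number followed directly by punctuation would be rejected.<br>However, it does correctly handle all of the examples set forth (including 1, 12, and 123).<br><br>Chris<br><br>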
$ cat tmp.pl<br>
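#!/usr/bin/perl<br>
use strict;<br>
use warnings;<br>
<br>
# Read candidate patent numbers from standard input, one per line.<br>
while(<STDIN>) {<br>
# The number must begin at line start or after whitespace, and end at whitespace or line end.<br>
if( $_ =~ /(\s|^)(|D|PP)\d{1,3}(,{0,1}\d{3})*(\s|$)/ ) {<br>
print "Match\n";<br>
} else {<br>
print "No match\n";<br>
}<br>
}<br><br>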
$ perl tmp.pl<br>
1<br>
Match<br>
A1<br>
No match<br>
D1<br>
Match<br>
PP1<br>
Match<br>
12<br>
Match<br>
A12<br>
No match<br>
D12<br>
Match<br>
PP12<br>
Match<br>
123<br>
Match<br>
A123<br>
No match<br>
D123<br>
Match<br>
PP123<br>
Match<br>
1234<br>
Match<br>
A1234<br>
No match<br>
D1234<br>
Match<br>
PP1234<br>
Match<br>
1,234<br>
Match<br>
A1,234<br>
No match<br>
D1,234<br>
Match<br>
PP1,234<br>
Match<br>
12,345,678<br>
Match<br>
A12,345,678<br>
No match<br>
D12,345,678<br>
Match<br>
PP12,345,678<br>
Match<br>
12,34,56<br>
No match<br>
12,345,678,<br>
No match<br>
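<br>For what it's worth, the manual session above can also be scripted as a small table-driven check. This is only a sketch: the is_patent helper and the two sentence inputs at the end are my own additions for illustration, and the expected verdicts for the other rows are copied from the session above.<br>

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as in tmp.pl above.
my $patent = qr/(\s|^)(|D|PP)\d{1,3}(,{0,1}\d{3})*(\s|$)/;

# True (1) if the text contains a whitespace-delimited patent number.
# (Hypothetical helper, named here for illustration.)
sub is_patent {
    my ($text) = @_;
    return $text =~ $patent ? 1 : 0;
}

# [ input, expected ]; 1 means "Match". The last two rows are made-up
# sentences that probe the "\s" assumption; they are not from the session.
my @cases = (
    [ "1", 1 ], [ "A1", 0 ], [ "D12", 1 ], [ "PP123", 1 ],
    [ "1234", 1 ], [ "A1,234", 0 ], [ "PP12,345,678", 1 ],
    [ "12,34,56", 0 ], [ "12,345,678,", 0 ],
    [ "See patent D1,234 for details", 1 ],
    [ "See patent D1,234.", 0 ],
);

my $mismatches = 0;
for my $case (@cases) {
    my ($text, $expected) = @$case;
    my $got = is_patent($text);
    $mismatches++ if $got != $expected;
    printf "%-30s %s\n", $text, $got ? "Match" : "No match";
}
print "Mismatches: $mismatches\n";
```

The final row shows the caveat about "\s": a number followed directly by a period is rejected, so if the input is running text rather than one number per line, the delimiters would need to be loosened.<br>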
<br>
<br></div></div><br></div>