D&C Lug - Home Page
Devon & Cornwall Linux Users' Group


Re: [LUG] regexp?



Adrian Midgley wrote:

NHS Numbers are 10 digits.
The canonical formatting of them is
999 999 9999

However, they are often presented as
9999999999

I am always amazed by the cleverness of people who can produce regexps such
as one that would enable me to grep all lines out of a big text (csv, values
in "") file which contain a specific NHS number regardless of which format it
is presented in...

Sometimes it is easier to normalise the number format before you
write the regular expression. What did our Perl guru say about
write-once code? ;)

#!/bin/bash
#
# Program expects a 10 digit number and greps for all occurrences of
# the same 10 digit number in double quotes.
# 
# Both input and found format may be 123 456 7890 or 1234567890
#
#  Simon Waters 2002
# 
if [ $# -ne 2 ] && [ $# -ne 4 ]
then
echo "Usage: $0 123 456 7890 myfile"
echo "   or: $0 1234567890 myfile"
exit 1
fi
if [ $# -eq 2 ]
then
FIRST=`expr "$1" : "\(...\)"`
SECOND=`expr "$1" : "...\(...\)"`
THIRD=`expr "$1" : "......\(....\)"`
FILE=$2
fi
if [ $# -eq 4 ]
then
FIRST=$1
SECOND=$2  
THIRD=$3
FILE=$4
fi
grep -E "\"$FIRST $SECOND $THIRD\"|\"$FIRST$SECOND$THIRD\"" "$FILE"
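
For comparison, the same idea can be squeezed into a single pattern by
making the spaces optional instead of listing both formats. This is only
a sketch: nhs_pattern is a made-up helper name, it relies on bash
substring expansion, and unlike the script above it would also match a
mixed form such as "123456 7890".

```shell
#!/bin/bash
# Sketch only: nhs_pattern is a hypothetical helper, not part of the
# script above. It joins its arguments, strips the spaces, checks that
# exactly 10 digits remain, and prints an extended regexp that matches
# the double-quoted number with or without the separating spaces.
nhs_pattern() {
    local digits
    digits=$(printf '%s' "$*" | tr -d ' ')
    # Reject anything containing a non-digit character.
    case "$digits" in
        *[!0-9]*) return 1 ;;
    esac
    # Reject anything that is not exactly 10 digits long.
    [ ${#digits} -eq 10 ] || return 1
    # " ?" in an extended regexp means "an optional space".
    printf '"%s ?%s ?%s"\n' "${digits:0:3}" "${digits:3:3}" "${digits:6:4}"
}
```

With that defined, something like
grep -E "$(nhs_pattern 123 456 7890)" myfile
should find the number in either format. Whether one pattern is easier
to read than the two explicit alternatives is a matter of taste.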

--
The Mailing List for the Devon & Cornwall LUG
Mail majordomo@xxxxxxxxxxxx with "unsubscribe list" in the
message body to unsubscribe.

