Re: [LUG] Mass editing text files?

 

On Fri, Jul 27, 2007 at 09:35:11AM +0100, Jonathan Roberts wrote:
> > The above can be made much simpler by use of the -pi flags which
> > allow for in-place editing of files with a backup copy made.  e.g.:
> >
> > perl -npi.bak -e 's#<head>#<head>\nYour extra stuff here#i;' *.html
> >
> This looks like the simplest approach - especially for a person like
> me who doesn't know much programming!
> 
> I have a question though: if I wanted the text to be inside the <head>
> section would it read:
> 
> perl -npi.bak -e 's#<head>\nExtra stuff here#i;\n' *.html

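The s###i part is a substitution operator: s#PATTERN#REPLACEMENT#i
replaces the first match of PATTERN on each line with REPLACEMENT,
and the trailing i makes the match case-insensitive.  I used '#' as
the separator because when dealing with HTML (which includes '/')
it seems a bit clearer than using '/'.  Your version only gives one
string between the separators, so you'd need to specify both which
string you're searching for and what to replace it with.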

> The result I'm looking for is:
> 
> <head>
> 
> Extra stuff
> 
> stuff that was there before
> 
> </head>

So that would be more like the following (with only one \n at the
end, because the newline that already follows <head> is kept):

perl -npi.bak -e 's#<head>#<head>\n\nExtra stuff here\n#i;' *.html

Assuming that you started with:

<head>
stuff that was here before
.
.
.

and wanted to end up with:

<head>

Extra stuff here

stuff that was here before
.
.
.

If you're making more complex changes then you will probably want
to do something more like Tom suggested, where you use a tool that
understands the DOM and can parse the tree that is an HTML document.
The simple text substitution approach can be fragile, as the source
text may not always be in a uniform format.

Cheers,
Andy

-- 
http://bitfolk.com/ -- No-nonsense VPS hosting
Encrypted mail welcome - keyid 0x604DE5DB


-- 
The Mailing List for the Devon & Cornwall LUG
http://mailman.dclug.org.uk/listinfo/list
FAQ: http://www.dcglug.org.uk/linux_adm/list-faq.html