D&C GLug - Home Page


Re: [LUG] Web pages manipulations server side was Re: Mass editing text files?

 

On Friday 27 July 2007 13:46, Simon Waters wrote:
> Tom Potts wrote:
> > I got a bit of a shock finding that there is so little DOM processing
> > stuff server side!
>
> CPAN has shed loads. As they say in Perl - there is more than one way to
> do it, and more than one DOM-like module, and more than one HTML parsing
> module.
They don't come up in searches for "html dom parser perl", which doesn't help!
>
> One of my current projects is using HTML::TokeParser, the clever stuff
> here (read not done by Simon) is using HTML::TreeBuilder which uses
> TokeParser underneath.
I did play with some things, like libxml2, but when your input HTML is
seriously wobbly and error-ridden it proved too strict. The thing was
initially written to convert the original web site, and that was so screwed
up ... The other bits just fell into/out of it, and there was nothing easily
automatable on the web that was free - no budget!
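
To illustrate the strictness problem, here is a rough sketch. It uses
Python's standard library as a stand-in, not libxml2 or the Perl modules
mentioned in this thread, and the sample markup and the names TagCollector,
strict_parse_ok and WOBBLY are made up for the example. A strict XML parser
rejects badly nested markup outright, while a tolerant token-level parser
(similar in spirit to a tokenizer such as HTML::TokeParser) still hands back
the tags and text it sees.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Tolerant tokenizer: records start tags and text, ignores bad nesting."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag, whether or not it is ever closed.
        self.tags.append(tag)

    def handle_data(self, data):
        # Keep only non-blank text runs.
        if data.strip():
            self.text.append(data.strip())


def strict_parse_ok(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical "wobbly" input: unclosed <p> and <b>, plus a stray </i>.
WOBBLY = "<html><body><p>First<p>Second <b>bold</i></body></html>"

collector = TagCollector()
collector.feed(WOBBLY)
collector.close()

print(strict_parse_ok(WOBBLY))  # strict parser gives up on the mismatched tags
print(collector.tags)           # tolerant tokenizer still lists the start tags
print(collector.text)           # ... and the text between them
```

The tokenizer does no repair of its own; anything that needs a proper tree
still has to decide how to close the dangling elements, which is the job a
tree builder layered on top of a tokenizer takes on.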
>
> Nick, out on the moor, offers folks this service, and similar, but he
> wrote his own software to do it in Apache years ago using libxml2 and
> friends. Check out Web Valet, and Accessibility proxy. He also did a lot
> of work with the W3 in this area, and reports he is doing a lot with
> Apache (not just writing books on the topic).
I can't find them.
>
> The Apache Modules Book: Application Development with Apache (Prentice
> Hall Open Source Software Development Series)  by Nick Kew (Paperback -
> Jan 26, 2007)
You should have mentioned that when I asked about books a couple of weeks ago!
>
> He didn't mention if writing books is a good way of getting business or
> not.
In government you have to read up on what they think they need and then make
the most ludicrous proposal possible. If you can do a job in two weeks and
get it right, they'll still give it to someone else who will take 4 years to
mess it up. If there's a working product at the end that performs to spec,
then the person who wrote the spec is in trouble!
I still remember looking at the online tax form - I couldn't work out how it
could take longer than 3 months to code and £1M in backend, and then later
found out it was £600M over budget!
Tom te tom te tom


-- 
The Mailing List for the Devon & Cornwall LUG
http://mailman.dclug.org.uk/listinfo/list
FAQ: http://www.dcglug.org.uk/linux_adm/list-faq.html