Re: [LUG] XML - was: sxi perl notes

To: list@xxxxxxxxxxxx
Subject: Re: [LUG] XML - was: sxi perl notes
From: Neil Williams <linux@xxxxxxxxxxxxxx>
Date: Tue, 18 Jun 2002 18:29:00 +0100
Reply-to: list@xxxxxxxxxxxx

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 16 June 2002 1:08 pm, Nick Kew wrote:

On Sat, 15 Jun 2002, Neil Williams wrote:

http://www.codehelp.co.uk/


Hmm - I'll try & find time to look at that when I'm next online.


Just don't expect the codehelp site to be an 'authority' on XML like W3C - it 
doesn't go into every aspect, it doesn't explain every section of code. It's 
there as an appetizer, a quick demo of what XSLT can do and a little pointer 
to creating a similar effect on other websites. (It's NOT a HOWTO.)

CodeHelp covers lots of areas, from XML to WML and PHP, and from Java and C++ 
to Pascal. Coverage in some areas is still a little thin (C) but it is only a 
hobby. When the XML program reaches 0.5 I expect to put some C++ code on the 
site and increase that section 2000%. (Not hard when you see how much is 
currently there.)

(BTW. Please see the 'About CodeHelp' page too.)

IE4 or later will load XML (actually it will load XSLT and render XHTML
but that is another issue),


mmmf .. yes, but with some serious bugs.


Come on, this is M$. What can you expect???

     other browsers will be directed to plain HTML pages
and some XHTML pages that can be viewed in all the browsers I have tested
(including Mozilla, Lynx, Konqueror, Galeon and Opera).


The Moz family do XML+XSLT; Konq and Opera[1] do XML but not XSLT.


I can't get Mozilla to do XSLT. Shown an XML file, Mozilla 0.9.4 just strips 
out the tags and displays the raw content without any formatting at all, as 
one long wrapped line. It ignores the CSS link too. Can you give me the URL 
for some XML / XSLT pages that Mozilla can read? One of the namespace URLs on 
my pages is probably out of date. (Written before Netscape6 was even in beta.)

Konqueror (KDE2) does put in the formatting as dictated by the CSS stylesheet 
and breaks the lines into readable chunks.

     (The HTML/XHTML was
simply created by writing XML and using XSLT to transform it into HTML.)
The site aims to explain the basics of XML, XSLT and DTD's. (However,
DTD's are old-hat and the new in-thing in XML is Schemas,


You've been reading the propaganda.  DTDs are still the best choice for
mainly-text documents; Schema are better suited to more highly structured
data whose primary content is not text.


Agreed. The irony didn't come across well in email. (Mental note: must use 
more smilies.)

:-(

     XPath and XLink.


These are different again.  XPath is the foundation for useful things like
XSLT; XPointer and XLink are more ivory-tower things with severe problems
in the real world (see for example the threads on the annotations and
wai-er fora at lists.w3.org).

     These are
not covered because (quite simply) there isn't a browser that has
implemented them yet!!)


Ahem .. any browser that supports XSLT must support XPath.


Partially. Full XPath functionality wasn't supported last time I looked, 
but I have not been concentrating on XML for a while, so it may have changed 
again. XLink (especially the many-to-many capability) still seems beyond the 
power of a browser. It's certainly beyond the scope of the codehelp site so I 
can't see myself getting into that for some time.

I have also written a C++ program that can parse one specific DTD
controlled set of XML files and produce output in Windows, xterm and KDE.


Have you written a new parser, or used an existing one (such as
libxml, expat or xerces)?


That's the plan. Are any of those cross-platform? I haven't tried them 
because I've got precious little documentation, and at the time it seemed a 
better idea to create a simple text parser that copes only with the one DTD 
structure, to make sure that I could get the rest of the program operating in 
a way that lets technophobes input the data. That was the development path 
from v0.0.1 to the current 0.2, and the plan extends to incorporating a full 
XML parser that runs on both Linux and Windows by version 0.5. 
What I've written so far is from the 'just-good-enough' school of programming 
and desperately needs to be expanded. Right now, if anyone edits the XML by 
hand, the parser can fail even though the XML is valid, because the data read 
loop relies on line-endings.

I use a PHP XML parser to speed up data export from the previous spreadsheets, 
so I know the benefits; it's just that I had a v.good text on PHP-XML and I 
haven't got one on C++-XML. I do tend to rely on texts, as it means I can 
plan ahead more easily. (Contrary to another thread here, I think a real 
library is far better than a virtual one - as long as the entire library is 
readily accessible, i.e. on my bookshelf!!!!) It's easier for me to program 
when the ideas are clear from being thought out on paper first. I make 
decision outlines (with the boxes, triangles and circles etc.) and plan the 
classes in my head, usually whilst doing something else entirely, like 
driving.
:-)))

(Can you tell I'm self-taught? My degree had NOTHING to do with computers, I 
was expected to type a 70,000 word dissertation on an Amstrad PCW 8256! Damn 
thing didn't even have a hard drive, I had to hawk multiple discs just to 
make sure my data was even partially protected!!! (The PCW had a habit of 
corrupting the datafile on the disc when writing large documents!))

     One engine,
three outputs. The XML itself is browseable using IE. So as well as XML
being cross-platform (ANY OS capable of parsing a text file can support
it - after all, if a mobile phone can do it (with a little help from the
gateway) so can the fridge!)


Naturally.


I take it from that that at least one of the parsers you mention can run on 
Linux and Windows. Correct? (Can't see a reason why not, to be honest.)

    I'm sticking to C++, I don't need to port this to a non


Likewise.  mod_xml uses C and (optionally[2]) C++, to offer a lower
overhead than traditional Java-based platforms for Web-based XML
applications and webservices.

An XSL stylesheet is for the style - fonts, positioning, colour, tables,


Aaargh!  No it isn't!

Well, it can be, but only by coincidence.  If you put that sort of
thing in XSLT when generating HTML, you are abusing HTML rather badly.


I know, XSL is more about organising data from the XML file into a usable 
output tree. However, in terms of using XML to reduce the total workload of 
web design (by eliminating duplicated HTML tags) it served as a means to an 
end. Have a read of the codehelp XML site, including the example form and 
help files that go with it, and see if you agree. XML is moving fast and some 
of the pages will probably feel out-of-date, but that's life. (CodeHelp has 
never pretended to be cutting-edge!)

Note also that CSS stylesheets can also be used for browsers like Opera
that can understand XML but not XSLT.


Yes, but not only them; you should use CSS for any HTML you want to
format, layout or prettify.


I do. The CodeHelp site goes into CSS extensively. Check out the glossary 
that covers CSS terms and design and the CSS pages. XML, DHTML, Javascript, 
CSS, WML and PHP make up the bulk of the pages on the site (roughly in that 
order).

     The main limitation here is not being able
to create HTML style links between browsed XML documents like you can
with XSLT.


Huh?  Of course you can include HTML links!  All you need to do is use
<a href...> in the XML.


Only if you use a different namespace. Besides, having to include the entire 
<a> tag every time defeated the purpose of the XML site - to reduce the 
amount of code that needed to be typed. I need the tags to be created from 
XSLT and until more browsers cope with that I'll continue to re-direct to 
XHTML, exported using XSLT by a little Windows utility. Having said that, I'm 
now using PHP exclusively when adding new content as it can export XML or 
XHTML dynamically depending on the browser - saves yet another step in the 
workload.
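
A minimal sketch of that kind of browser-dependent choice, written in C++ 
rather than the PHP actually used on the site. It assumes the decision is 
made from the User-Agent header and that only IE4 or later is sent XML; the 
names OutputFormat and chooseFormat() are hypothetical, and the real site 
logic may well differ.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical output choices: raw XML (styled by XSLT in the browser) or
// XHTML that was already transformed on the server.
enum OutputFormat { SEND_XML, SEND_XHTML };

// Assumption: only IE4 or later gets XML; every other browser gets XHTML.
OutputFormat chooseFormat(const std::string& userAgent) {
    // Opera can identify itself as MSIE, so rule it out first.
    if (userAgent.find("Opera") != std::string::npos) return SEND_XHTML;

    std::string::size_type pos = userAgent.find("MSIE ");
    if (pos == std::string::npos) return SEND_XHTML;

    // atoi() reads the leading digits of the version, e.g. "5.5;" gives 5.
    int major = std::atoi(userAgent.c_str() + pos + 5);
    return major >= 4 ? SEND_XML : SEND_XHTML;
}
```

Defaulting to XHTML for anything unrecognised keeps the pages readable in 
browsers such as Lynx that will never handle the XML directly.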

EVERY XML file will contain a reference to the stylesheet AND the DTD.
Just check the first four lines of each file.


No!  An XML file may or may not include either or both of those.
But if they are included, they will be at the beginning.


OK, I exaggerated.

[1] in the latest versions I've used - which are not up to date.
[2] depending on what XML library you build it with

So will it compile on Windows? Remember, the people I rely on for data entry 
do not have Linux available. 

Plus, some of the PCs in use are still running Win95, have 32MB of RAM and 
run at 90MHz or so. The emphasis is on low memory footprint and fast code. 
(Java is completely out-of-the-question for this project!) I'm looking for a 
library that adds no more than 60KB to the executable and less than 1MB to 
the memory footprint without reducing the execution speed any further. 
(Memory is already being swapped to disc occasionally, even with the current 
72KB executable and v.limited parser. After all, Win95 takes most of that 
32MB for itself!) At least 6 separate XML files must be parsed without 
causing noticeable delay in data display (both on application start and upon 
user key instructions). The Linux version will have a far more capable 
machine, but will also have more files to deal with, up to 36 at a time, as 
well as more calculations to make in the data-mining role.

- -- 

Neil Williams
=============
http://www.codehelp.co.uk
neil@xxxxxxxxxxxxxx
linux@xxxxxxxxxxxxxx
neil@xxxxxxxxxxxx

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9D23fiAEJSii8s+MRArW8AJsH2Dtf042EJRNcccViXKsL7KKQ7QCg46c7
JxsP0FDOeRorEbriOZs+Lbg=
=EbtR
-----END PGP SIGNATURE-----


--
The Mailing List for the Devon & Cornwall LUG
Mail majordomo@xxxxxxxxxxxx with "unsubscribe list" in the
message body to unsubscribe.
