D&C GLug - Home Page


Re: [LUG] Locales and ££££'s and Perl

 

On Wed, 30 May 2007 14:03:58 +0100
Simon Waters <simon@xxxxxxxxxxxxxx> wrote:

> After much pondering I set the locale manually for the current user
> (export LANG="en_GB.UTF-8"), and relevant things have changed.

Same as my default locale.
 
> vi test.pl
> #!/usr/bin/perl
> use strict;
> use warnings;
> use utf8;
> 
> binmode(STDOUT,":utf8");

Why set that?

> 
> print "£\n";

$ cat /tmp/simon.pl
#!/usr/bin/perl -w
binmode(STDOUT,":utf8");
my $sign = "£";

print "$sign\n";

$ perl simon.pl 
Â£

Yuk - so remove the binmode:

$ cat /tmp/simon.pl
#!/usr/bin/perl -w

my $sign = "£";

print "$sign\n";

$ perl simon.pl 
£

Lovely.
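
A possible explanation for the difference between those two runs, as a minimal sketch: it assumes the script is saved as UTF-8 and has no "use utf8", so perl sees the pound sign as two separate bytes and the :utf8 layer encodes each of them a second time. The byte string stands in for the literal, Encode imitates what the output layer would do, and the names hex_bytes, $once and $twice are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Render a string as space-separated hex values, one per character/byte.
sub hex_bytes {
    my ($string) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $string;
}

# UTF-8 encoding of the pound sign (U+00A3) is the two bytes C2 A3.
# Without "use utf8", a pound literal in a UTF-8 file is read as these bytes.
my $literal = "\xC2\xA3";

# No output layer: the two bytes pass through untouched.
my $once = $literal;

# A :utf8 layer treats each byte as a Latin-1 character and encodes it
# again; encode() reproduces that step here.
my $twice = encode('UTF-8', $literal);

print "without layer: ", hex_bytes($once),  "\n";
print "with layer:    ", hex_bytes($twice), "\n";
```

On a UTF-8 terminal the four bytes from the second case show up as a stray accented A in front of the pound sign.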

> In "vim" that looks like a £ sign ('cat' and 'less' want to use the
> hexagonal ? symbol).

cat and less both indicate '£' on this system.

Makes me wonder if you haven't actually got a pure UTF8 environment.
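
One way to check that from perl itself, as a rough sketch rather than a definitive test: ask the C library which codeset the current locale uses. POSIX::setlocale and I18N::Langinfo are standard modules; is_utf8_codeset is a hypothetical helper made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use I18N::Langinfo qw(langinfo CODESET);

# Hypothetical helper: true if a codeset name means UTF-8
# (accepts spellings such as "UTF-8" and "utf8").
sub is_utf8_codeset {
    my ($codeset) = @_;
    return 0 unless defined $codeset;
    return $codeset =~ /\Autf-?8\z/i ? 1 : 0;
}

# Adopt the locale from the environment (LC_ALL, LC_CTYPE, LANG).
setlocale(LC_CTYPE, '');
my $codeset = langinfo(CODESET);

print "codeset: $codeset\n";
print is_utf8_codeset($codeset) ? "UTF-8 locale\n" : "not a UTF-8 locale\n";
```

Running it once as the normal user and once with LANG=en_GB.UTF-8 set on the command line should show whether the exported value is actually reaching the process.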

> My first question is what is going wrong here? Is this the wrong way to
> do a Unicode string literal (pressing shift + "3")? I'm not so concerned
> with the "right way" or a "working way", but I wanted to understand what
> is going wrong (perl/vim/my brain(likely)/Debian).

Maybe perl isn't picking up the exported values - if your default locale
is not UTF8, perl will not start in UTF8.

Try:
$ LANG=en_GB.UTF-8 perl ./test.pl

(Although I cannot reproduce your errors by specifying non-UTF8 values
for LANG.)

> With Perl 5.8 (on Debian Sarge) I understood that "use utf8" should
> still be used to allow Unicode literals to be used.

I don't need to use it in a true UTF8 environment. I changed some time
ago due to problems with gnucash. 
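
To illustrate what "use utf8" changes, here is a small sketch that assumes the file itself is saved as UTF-8. The pragma is lexically scoped, so the same literal can be measured both ways in one script; $chars and $bytes are just names picked for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: this file is saved as UTF-8.
my ($chars, $bytes);
{
    use utf8;               # literals are decoded into characters
    $chars = length "£";    # one character, U+00A3
}
{
    no utf8;                # literals stay as the raw bytes in the file
    $bytes = length "£";    # two bytes, C2 A3
}

print "with use utf8: $chars, without: $bytes\n";
```

Without the pragma the two bytes are printed back exactly as they were typed, which would explain why a plain print looks right on a UTF-8 terminal even though perl never treated the literal as a single character.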

> We don't actually have Unicode literals in the Perl code, but it is
> going to have to handle them, and I was writing a new test case for one
> of the Perl template toolkit filters, and I couldn't get it to run at
> all because it needed some UTF-8 character (so it could be taught not to
> mangle them as badly as it does currently -- I suspect the real mangling
> is done in Javascript).

If you have a snippet I can test on this system.

I don't think it is the fault of perl at this stage if it affects cat
and less (which should display £ perfectly).

€ also works.

$ cat simon.pl 
#!/usr/bin/perl -w

my $sign = "€";

print "$sign\n";

$ perl simon.pl 
€

(on holiday but still browsing from time to time.)

Bizarrely, my email client complains about € symbols so I'm not too
sure how this will show up in your email client or in the list archive.
Here goes....

-- 


Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/


-- 
The Mailing List for the Devon & Cornwall LUG
http://mailman.dclug.org.uk/listinfo/list
FAQ: http://www.dcglug.org.uk/linux_adm/list-faq.html