I'm setting up the CPAN module for Perl on CentOS 5, and one of the questions is 'Does your terminal support UTF-8?' (paraphrased). How do I find out?
8 Answers
Type this in your terminal:
echo -e '\xe2\x82\xac'
If your terminal supports UTF-8 it will output the euro sign:
€
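A minimal sketch of the same test using printf with octal escapes, in case your shell's echo does not understand -e or \x. The function name euro is only an illustrative assumption, not part of any tool:

```shell
#!/usr/bin/env bash
# Octal escapes are part of POSIX printf, so this does not depend on echo -e.
# \342\202\254 are the same three bytes as \xe2\x82\xac (U+20AC, euro sign).
euro() {
    printf '\342\202\254\n'
}

euro
```

If the terminal decodes UTF-8 you should see a single euro sign; otherwise you will typically see two or three garbage characters.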
Really, the surefire way to test is to download a UTF-8 text file, cat it in the terminal, and see if everything looks OK.
Or, if you can, recompile the terminal with its Unicode option enabled (assuming it has one).
What do $TERM and $LANG look like?
The lamest way: run the following and check the output. It will be a capital O with circumflex (Ô) if the terminal displays UTF-8.
perl -le 'print "\x{c3}\x{94}"'
The most surefire way is to use the 'locale' command. It prints all the variables that determine which character set is used. For instance, this is my output on RHEL 5.3, which is set to use only UTF-8 by default:
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=
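If you want to check this from a script rather than by eye, here is a small sketch built on `locale charmap`, which prints the character encoding of the current locale. The helper name is_utf8_charmap is hypothetical. Note that this reports what the environment is configured for, so it assumes the terminal matches the locale rather than measuring the terminal itself:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the given charmap name denotes UTF-8.
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# `locale charmap` prints the encoding of the current locale, e.g. UTF-8.
if is_utf8_charmap "$(locale charmap 2>/dev/null)"; then
    echo "locale is UTF-8"
else
    echo "locale is not UTF-8"
fi
```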
curl http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt
or
wget -O - http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt
This obviously requires wget or curl.
You can test the terminal by moving the cursor to column 1 and outputting a multibyte Unicode character. If the cursor advances by more columns than the character's display width, the terminal is treating each byte as a separate character and does not support Unicode.
In this case we emit a 3-byte sequence that encodes a zero-width space, so if the cursor moves at all, the terminal cannot process Unicode:
IFS=$';\x1B[' read -p $'\r\xE2\x80\x8B\x1B[6n\r \r' -d R -rst 1 _ _ _ X _ </dev/tty 2>/dev/tty && test "$X" = 1
Here we output \r to return to column 1, emit the 3-byte sequence for a zero-width space, and then emit ESC [ 6n, which asks the terminal to report the cursor position. The trailing \r \r overwrites any junk that appears if the terminal handled each byte as a separate character.
Then we read the cursor position with a 1-second timeout and check whether the column is 1, which it will be if the terminal can process Unicode.
A better function is:
is-tty-unicode() {
    local X
    test -c /dev/tty &&
    if test -t 0
    then IFS=$';\x1B[' read -p $'\r\xE2\x80\x8B\x1B[6n\r \r' -d R -rst 1 _ _ _ X _ 2>&1
    fi <>/dev/tty && test "$X" = 1
}
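The reply parsing used by the probe can be tried without a terminal. The sketch below feeds a simulated cursor position report (ESC [ row ; col R) through the same IFS splitting. The function name cpr_column and the sample reply are assumptions made for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the column from a cursor position report
# of the form ESC [ row ; col R, using the same IFS trick as the probe.
cpr_column() {
    local _a _b _row col
    # ESC, '[' and ';' each end a field, so the reply splits into:
    # "" (before ESC), "" (between ESC and '['), row, col.
    IFS=$';\x1B[' read -r -d R _a _b _row col _ <<< "$1"
    printf '%s\n' "$col"
}

# Simulated reply for row 24, column 1.
cpr_column $'\x1B[24;1R'
```

With a real terminal the reply arrives on /dev/tty instead of a here-string, which is why the probe reads from there with a timeout.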
UTF=$(echo -e "\u263A")
if [[ ! "$UTF" =~ "A" ]] ; then
    echo -n "UNICODE here!"
fi