Original: http://www.perlmonks.org/?node_id=256728
ActivePerl on Windows does not seem to include File::BOM or File::MMagic.
The last method is to read the first two bytes of the file and decide from them.
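Example:
use strict;
use warnings;

# Open in raw mode so that no I/O layer alters the leading bytes.
open my $FH_i, "<:raw", "unicode.txt" or die "Cannot open unicode.txt: $!";
my $buf = "";
read $FH_i, $buf, 2;    # read only the first two bytes
close $FH_i;

if ($buf eq "\xFF\xFE") {
    print "This is a Unicode (UTF-16) little-endian file.\n";
} elsif ($buf eq "\xFE\xFF") {
    print "This is a Unicode (UTF-16) big-endian file.\n";
} else {
    # No UTF-16 BOM: the file may be ASCII or another encoding without a BOM.
    print "No UTF-16 BOM found; this may be an ASCII file.\n";
}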
How do I determine encoding format of a file?
Perl 5.8 has a module called "Encode::Guess", which might work well if you know the language involved and/or can provide some hints as to the likely candidates. (I haven't tried it yet, but it is admittedly limited and speculative at present.)
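As a rough sketch of how Encode::Guess might be called: the helper name guess_name and the sample byte string are made up for illustration. With no extra suspects listed, the module only tries ASCII, UTF-8 and BOM-marked UTF-16/32; for anything else you would pass candidate names such as "euc-jp" or "shiftjis" as hints.

```perl
use strict;
use warnings;
use Encode::Guess;    # ships with Perl 5.8+; exports guess_encoding()

# Hypothetical helper: return the guessed encoding name, or undef
# when Encode::Guess cannot settle on a single candidate.
sub guess_name {
    my ($bytes, @suspects) = @_;
    my $enc = guess_encoding($bytes, @suspects);
    # On failure guess_encoding() returns an error string, not an object.
    return ref($enc) ? $enc->name : undef;
}

# Sample input: a UTF-16 little-endian BOM followed by "hi".
my $name = guess_name("\xFF\xFEh\x00i\x00");
print defined $name ? "Guessed: $name\n" : "Could not guess\n";
```

Because the guess depends on the suspects supplied, a failed or ambiguous result is normal for short inputs and should be handled rather than treated as fatal.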
Answer contributed by idsfa: File::BOM provides get_encoding_from_filehandle and get_encoding_from_stream to identify the encoding of Unicode files.
Answer contributed by particle: Have a look at File::MMagic. It guesses the file type given a filename or a filehandle, and is quite configurable (you can add more file type descriptions based on regular expressions). It's a handy little module.
Answer contributed by donno20: Read the first two bytes of the file. The encodings and their corresponding hex codes are as follows:
unicode Little Endian = "\xFF\xFE"
unicode Big Endian = "\xFE\xFF"
utf8 = "\xEF\xBB" (the complete UTF-8 BOM is three bytes, "\xEF\xBB\xBF")
ASCII = no BOM; the content starts immediately
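The byte patterns listed above can be put into a small lookup table. The following is only a sketch using core Perl: the helper names bom_encoding and file_bom_encoding are hypothetical, and the UTF-32 entries are an addition not mentioned in the answer. It reads up to four bytes instead of two so that the three-byte UTF-8 BOM can be matched in full.

```perl
use strict;
use warnings;

# Known BOMs, longest first so that UTF-32LE (FF FE 00 00) is tested
# before UTF-16LE (FF FE), which is a prefix of it.
my @BOMS = (
    [ "UTF-32LE" => "\xFF\xFE\x00\x00" ],
    [ "UTF-32BE" => "\x00\x00\xFE\xFF" ],
    [ "UTF-8"    => "\xEF\xBB\xBF" ],
    [ "UTF-16LE" => "\xFF\xFE" ],
    [ "UTF-16BE" => "\xFE\xFF" ],
);

# Hypothetical helper: name the encoding implied by a leading BOM,
# or return undef when the bytes start with none of the known BOMs.
sub bom_encoding {
    my ($bytes) = @_;
    for my $entry (@BOMS) {
        my ($name, $bom) = @$entry;
        return $name if index($bytes, $bom) == 0;
    }
    return;
}

# Hypothetical helper: read up to four raw bytes from a file and classify them.
sub file_bom_encoding {
    my ($path) = @_;
    open my $fh, "<:raw", $path or die "Cannot open $path: $!";
    my $head = "";
    read $fh, $head, 4;
    close $fh;
    return bom_encoding($head);
}
```

An undef result only means that no BOM was found. The file may still be ASCII, UTF-8 without a BOM, or some legacy encoding, so a missing BOM should not be read as proof of ASCII.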