Character Encoding Support

wxHexEditor supports a large number of character encodings.

Note that encodings marked with a star (*) appear under the Experimental Character Encoding family in wxHexEditor.

This is because some of their characters break proper text display, either because the characters have different byte sizes or because the text runs right to left.
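
The two problems can be illustrated with a short sketch. It uses Python's built-in codecs purely for illustration and does not involve wxHexEditor's own decoder; the helper name `bytes_per_char` is hypothetical. In a single-byte code page every character occupies exactly one byte, so a text column can be aligned with the hex column, while in Shift JIS the width depends on the character.

```python
import unicodedata

def bytes_per_char(text, encoding):
    """Return the encoded size in bytes of each character of text."""
    return [len(ch.encode(encoding)) for ch in text]

# Single-byte code page: every character is one byte wide.
print(bytes_per_char("Ab1", "cp437"))

# Shift JIS: the ASCII letter takes one byte, each kanji takes two.
print(bytes_per_char("A\u65e5\u672c", "shift_jis"))

# Right-to-left scripts: HEBREW LETTER ALEF has bidirectional class "R",
# so a viewer that draws strictly left to right shows it in the wrong order.
print(unicodedata.bidirectional("\u05d0"))
```

A viewer that assumes one byte per cell can display the first case directly, but needs extra logic for the other two.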

wxHexEditor's supported Character Encoding Sets

Code for Information Interchange

    • ASCII - American Standard Code for Information Interchange
    • *ISCII - Indian Script Code for Information Interchange
    • VSCII - Vietnamese Standard Code for Information Interchange
    • *TSCII - Tamil Script Code for Information Interchange

DOS

    • OEM - IBM PC/DOS CP437 - MS-DOS Latin US
    • *PC/DOS CP720 - MS-DOS Arabic
    • PC/DOS CP737 - MS-DOS Greek
    • PC/DOS CP775 - MS-DOS Baltic Rim
    • PC/DOS CP850 - MS-DOS Latin 1
    • PC/DOS CP852 - MS-DOS Latin 2
    • PC/DOS CP855 - MS-DOS Cyrillic
    • *PC/DOS CP856 - Hebrew
    • PC/DOS CP857 - MS-DOS Turkish
    • PC/DOS CP858 - MS-DOS Latin 1 Update
    • PC/DOS CP860 - MS-DOS Portuguese
    • PC/DOS CP861 - MS-DOS Icelandic
    • *PC/DOS CP862 - MS-DOS Hebrew
    • PC/DOS CP863 - MS-DOS French Canada
    • *PC/DOS CP864 - MS-DOS Arabic 2
    • PC/DOS CP866 - MS-DOS Cyrillic Russian
    • PC/DOS CP869 - MS-DOS Greek 2
    • PC/DOS CP1006 - Arabic
    • PC/DOS KZ-1048 - Kazakhstan
    • PC/DOS MIK Code page
    • PC/DOS Kamenický Encoding
    • PC/DOS Mazovia Encoding
    • *PC/DOS Iran System Encoding Standard
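
As a rough illustration of why the choice of DOS code page matters, the sketch below decodes the same byte values with several of the code pages listed above. It relies on Python's standard codecs rather than on wxHexEditor, and the helper name `describe` is hypothetical. Bytes below 0x80 decode as ASCII in all four code pages, while bytes from 0x80 upward depend on the code page.

```python
import unicodedata

def describe(byte_value, codec):
    """Decode one byte with codec and return (code point, Unicode name)."""
    ch = bytes([byte_value]).decode(codec)
    return f"U+{ord(ch):04X}", unicodedata.name(ch, "?")

for codec in ("cp437", "cp850", "cp858", "cp866"):
    for byte_value in (0x41, 0x80, 0xD5):
        # Code points and names are printed instead of the characters
        # themselves so the output is safe on any terminal.
        print(codec, hex(byte_value), *describe(byte_value, codec))
```

Byte 0xD5 shows the single difference between CP850 and CP858: the dotless i was replaced by the euro sign.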

EBCDIC - Extended Binary Coded Decimal Interchange Code

    • EBCDIC 037 - IBM U.S. Canada
    • EBCDIC 285 - IBM Ireland U.K.
    • EBCDIC 424 - IBM Hebrew
    • EBCDIC 500 - IBM International
    • EBCDIC 875 - IBM Greek
    • EBCDIC 1026 - IBM Latin 5 Turkish
    • EBCDIC 1047 - IBM Latin 1
    • EBCDIC 1140 - IBM U.S. Canada with €
    • EBCDIC 1146 - IBM Ireland U.K. with €
    • EBCDIC 1148 - IBM International with €
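
EBCDIC differs from ASCII-based encodings even for plain Latin letters. The following minimal sketch, again using Python's standard codecs as a stand-in (nothing here is taken from wxHexEditor's source), decodes the same five bytes as EBCDIC 037 and shows how the euro variant EBCDIC 1140 differs from it.

```python
# The word "HELLO" as stored in EBCDIC 037.
raw = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])

as_ebcdic = raw.decode("cp037")
print(as_ebcdic)

# The alphabet is not contiguous in EBCDIC: there is a gap between I and J.
i_code = "I".encode("cp037")[0]
j_code = "J".encode("cp037")[0]
print(hex(i_code), hex(j_code))

# EBCDIC 1140 is EBCDIC 037 with the currency sign at 0x9F replaced by the euro sign.
plain = bytes([0x9F]).decode("cp037")
euro = bytes([0x9F]).decode("cp1140")
print(f"U+{ord(plain):04X}", f"U+{ord(euro):04X}")
```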

ISO / IEC

    • ISO/IEC 6937
    • ISO/IEC 8859-1 Latin-1 Western European
    • ISO/IEC 8859-2 Latin-2 Central European
    • ISO/IEC 8859-3 Latin-3 South European
    • ISO/IEC 8859-4 Latin-4 North European
    • ISO/IEC 8859-5 Latin/Cyrillic
    • *ISO/IEC 8859-6 Latin/Arabic
    • ISO/IEC 8859-7 Latin/Greek
    • *ISO/IEC 8859-8 Latin/Hebrew
    • ISO/IEC 8859-9 Latin/Turkish
    • ISO/IEC 8859-10 Latin/Nordic
    • *ISO/IEC 8859-11 Latin/Thai
    • ISO/IEC 8859-13 Latin-7 Baltic Rim
    • ISO/IEC 8859-14 Latin-8 Celtic
    • ISO/IEC 8859-15 Latin-9
    • ISO/IEC 8859-16 Latin-10 South-Eastern European

Industrial Standard

    • *JIS X 0201 - Japanese Industrial Standard
    • *Shift JIS
    • *TIS-620 - Thai Industrial Standard 620-2533
    • *ANSEL - American National Standard for Extended Latin

KOI

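    • KOI7 - Код Обмена Информацией, 7 бит (Code for Information Exchange, 7-bit)
    • KOI8-R - Код Обмена Информацией, 8 бит (Code for Information Exchange, 8-bit, Russian)
    • KOI8-U - Код Обмена Информацией, 8 бит (Code for Information Exchange, 8-bit, Ukrainian)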

Macintosh

    • Macintosh CP10000 - MacRoman
    • Macintosh CP10007 - MacCyrillic
    • Macintosh CP10006 - MacGreek
    • Macintosh CP10079 - MacIcelandic
    • Macintosh CP10029 - MacLatin2
    • Macintosh CP10081 - MacTurkish

The remaining Macintosh encodings are supported only in the OS X version of wxHexEditor:

    • Macintosh Arabic
    • Macintosh Celtic
    • Macintosh Central European
    • Macintosh Croatian
    • Macintosh Cyrillic
    • Macintosh Devanagari
    • Macintosh Dingbats
    • Macintosh Gaelic
    • Macintosh Greek
    • Macintosh Gujarati
    • Macintosh Gurmukhi
    • Macintosh Hebrew
    • Macintosh Icelandic
    • Macintosh Inuit
    • Macintosh Keyboard
    • Macintosh Roman
    • Macintosh Romanian
    • Macintosh Symbol
    • Macintosh Thai
    • Macintosh Tibetan
    • Macintosh Turkish
    • Macintosh Ukraine

UTF

    • UTF8 - Universal Character Set
    • UTF16 - Universal Character Set
    • UTF16LE - Universal Character Set
    • UTF16BE - Universal Character Set
    • UTF32 - Universal Character Set
    • UTF32LE - Universal Character Set
    • UTF32BE - Universal Character Set
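
The LE and BE variants differ only in byte order, which is easy to see in a hex view. The sketch below uses Python's standard codecs to show the bytes produced for the same two characters; the helper name `encode_all` is hypothetical. How wxHexEditor's plain UTF16 and UTF32 entries choose a byte order is not stated here, so the byte order mark (BOM) behaviour shown at the end is Python's and is only an assumption about what a tool might do.

```python
import codecs

def encode_all(text):
    """Encode text with each fixed-byte-order UTF codec."""
    names = ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be")
    return {name: text.encode(name) for name in names}

sample = "A\u20ac"  # LATIN CAPITAL LETTER A followed by EURO SIGN
encoded = encode_all(sample)
for name, data in encoded.items():
    print(f"{name:10} {data.hex(' ')}")

# Python's plain "utf-16" codec prepends a BOM in the machine's native order.
bom = sample.encode("utf-16")[:2]
print(bom in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
```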

Windows

    • *Windows CP874 - Thai
    • *Windows CP932 - Japanese (Shift JIS)
    • *Windows CP936 - Chinese Simplified (GBK)
    • *Windows CP949 - Korean (EUC-KR)
    • *Windows CP950 - Chinese Traditional (Big5)
    • Windows CP1250 - Central and Eastern European
    • Windows CP1251 - Cyrillic Script
    • Windows CP1252 - ANSI
    • Windows CP1253 - Greek Modern
    • Windows CP1254 - Turkish
    • *Windows CP1255 - Hebrew
    • *Windows CP1256 - Arabic
    • Windows CP1257 - Baltic
    • Windows CP1258 - Vietnamese

Other

    • DEC Multinational Character Set - VT220
    • Unicode

Experimental

    • *AtariST
    • *Big5
    • *GBK - GB2312 - Guojia Biaozhun (国家标准)
    • *EUC-JP Extended Unix Code for Japanese
    • *EUC-KR Extended Unix Code for Korean