- Doesn't need native code compilation. Works on Windows and in sandboxed environments like Cloud9.
- Used in popular projects like Express.js (body-parser), Grunt, Nodemailer, Yeoman and others.
- Faster than node-iconv (see below for performance comparison).
- Intuitive encode/decode API
- Streaming support for Node v0.10+
- [Deprecated] Can extend Node.js primitives (buffers, streams) to support all iconv-lite encodings.
- In-browser usage via Browserify (~180k gzip compressed with Buffer shim included).
- TypeScript type definition file included.
- React Native is supported (you need to explicitly `npm install` two more modules: `buffer` and `stream`).
- Transliteration option is available when either unidecode-plus or unidecode is added to your project.
- License: MIT.
var iconv = require('iconv-lite');
// Convert from an encoded buffer to js string.
str = iconv.decode(Buffer.from([0x68, 0x65, 0x6c, 0x6c, 0x6f]), 'win1251');
// Convert from js string to an encoded buffer.
buf = iconv.encode("Sample input string", 'win1251');
// Check if encoding is supported
iconv.encodingExists("us-ascii");
// Convert from js string to an encoded buffer, keeping accented characters like "é", but transliterating Chinese.
buf2 = iconv.encode("Café 北京", 'iso-8859-1', { transliterate: true });

// Decode stream (from binary stream to js strings)
http.createServer(function(req, res) {
    var converterStream = iconv.decodeStream('win1251');
    req.pipe(converterStream);
    converterStream.on('data', function(str) {
        console.log(str); // Do something with decoded strings, chunk-by-chunk.
    });
});
// Convert encoding streaming example
fs.createReadStream('file-in-win1251.txt')
    .pipe(iconv.decodeStream('win1251'))
    .pipe(iconv.encodeStream('ucs2'))
    .pipe(fs.createWriteStream('file-in-ucs2.txt'));
// Sugar: all encode/decode streams have .collect(cb) method to accumulate data.
http.createServer(function(req, res) {
    req.pipe(iconv.decodeStream('win1251')).collect(function(err, body) {
        assert(typeof body == 'string');
        console.log(body); // full request body string
    });
});

NOTE: Extending Node.js primitives, as shown in the following example, doesn't work on the latest Node versions. See details.
// After this call all Node basic primitives will understand iconv-lite encodings.
iconv.extendNodeEncodings();
// Examples:
buf = new Buffer(str, 'win1251');
buf.write(str, 'gbk');
str = buf.toString('latin1');
assert(Buffer.isEncoding('iso-8859-15'));
Buffer.byteLength(str, 'us-ascii');
http.createServer(function(req, res) {
    req.setEncoding('big5');
    req.collect(function(err, body) {
        console.log(body);
    });
});
fs.createReadStream("file.txt", "shift_jis");
// External modules are also supported (if they use Node primitives, which they probably do).
request = require('request');
request({
    url: "http://github.com/",
    encoding: "cp932"
});
// To remove extensions
iconv.undoExtendNodeEncodings();

- All Node.js native encodings: utf8, ucs2 / utf16-le, ascii, binary, base64, hex.
- Additional unicode encodings: utf16, utf16-be, utf-7, utf-7-imap, utf32, utf32-le, and utf32-be.
- All widespread single-byte encodings: Windows 125x family, ISO-8859 family, IBM/DOS codepages, Macintosh family, KOI8 family, and all others supported by the iconv library. Aliases like 'latin1' and 'us-ascii' are also supported.
- All widespread multi-byte encodings: CP932, CP936, CP949, CP950, GB2312, GBK, GB18030, Big5, Shift_JIS, EUC-JP.
See all supported encodings on the wiki.
Most single-byte encodings are generated automatically from node-iconv. Thank you Ben Noordhuis and libiconv authors!
Multi-byte encodings are generated from Unicode.org mappings and WHATWG Encoding Standard mappings. Thank you, respective authors!
Comparison with node-iconv module (1000x256kb, on MacBook Pro, Core i5/2.6 GHz, Node v0.12.0). Note: your results may vary, so please always check on your hardware.
operation             iconv@2.1.4   iconv-lite@0.4.7
----------------------------------------------------------
encode('win1251')     ~96 Mb/s      ~320 Mb/s
decode('win1251')     ~95 Mb/s      ~246 Mb/s
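Since results vary by machine, a rough way to check throughput on your own hardware is sketched below. This is not the project's benchmark script: `measureMbPerSec` is a hypothetical helper, and the example times Node's built-in `latin1` decoding only so that it runs without extra modules. Swap in whichever encode or decode call you want to compare.

```javascript
// Illustration only: a rough throughput timer, not the project's benchmark.
// measureMbPerSec is a hypothetical helper name.
function measureMbPerSec(fn, bytesPerCall, iterations) {
    const start = process.hrtime.bigint();
    for (let i = 0; i < iterations; i++) fn();
    const seconds = Number(process.hrtime.bigint() - start) / 1e9;
    return (bytesPerCall * iterations) / (1024 * 1024) / seconds;
}

// 256 KB of the letter "a", decoded 100 times with a built-in encoding.
const input = Buffer.alloc(256 * 1024, 0x61);
const rate = measureMbPerSec(() => input.toString('latin1'), input.length, 100);
console.log("decode('latin1'): ~" + rate.toFixed(0) + ' Mb/s');
```

A single short run like this is noisy; repeating it a few times and discarding the first result gives a steadier number.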
- Decoding: BOM is stripped by default, unless overridden by passing `stripBOM: false` in options (e.g. `iconv.decode(buf, enc, {stripBOM: false})`). A callback can also be given as the `stripBOM` parameter; it will be called if a BOM character was actually found.
- If you want to detect UTF-8 BOM when decoding other encodings, use the node-autodetect-decoder-stream module.
- Encoding: No BOM is added, unless overridden by the `addBOM: true` option.
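The BOM rules above can be pictured with a small sketch in plain Node.js. It does not use iconv-lite's internals: `decodeUtf8` and `encodeUtf8` are hypothetical helpers that mimic the documented `stripBOM` and `addBOM` behaviour for UTF-8 only, including the callback form of `stripBOM`.

```javascript
// Illustration only: hypothetical helpers, not iconv-lite's implementation.
const BOM = '\uFEFF';

// Mimics decoding with stripBOM: true (the default), false, or a callback.
function decodeUtf8(buf, options = {}) {
    const stripBOM = options.stripBOM === undefined ? true : options.stripBOM;
    let str = buf.toString('utf8'); // Node keeps a leading BOM as U+FEFF
    if (stripBOM && str.startsWith(BOM)) {
        str = str.slice(1);
        if (typeof stripBOM === 'function') stripBOM(); // a BOM was actually found
    }
    return str;
}

// Mimics encoding with addBOM: false (the default) or true.
function encodeUtf8(str, options = {}) {
    return Buffer.from((options.addBOM ? BOM : '') + str, 'utf8');
}

const withBom = Buffer.from([0xef, 0xbb, 0xbf, 0x68, 0x69]); // BOM + "hi"
let bomSeen = false;
const stripped = decodeUtf8(withBom);
const kept = decodeUtf8(withBom, { stripBOM: false });
const viaCallback = decodeUtf8(withBom, { stripBOM: () => { bomSeen = true; } });
const encoded = encodeUtf8('hi', { addBOM: true });
```

Here `stripped` and `viaCallback` hold the text without the BOM, `kept` still starts with U+FEFF, and `encoded` begins with the bytes `ef bb bf`.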
This library supports UTF-16LE, UTF-16BE and UTF-16 encodings. The first two are straightforward, but UTF-16 tries to be smart about endianness in the following ways:
- Decoding: uses BOM and the 'spaces heuristic' to determine input endianness. Default is UTF-16LE, but this can be overridden with the `defaultEncoding: 'utf-16be'` option. Strips BOM unless `stripBOM: false`.
- Encoding: uses UTF-16LE and writes BOM by default. Use `addBOM: false` to override.
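To make the decoding order concrete (BOM first, then a heuristic, then the default), here is a simplified sketch in plain Node.js. `detectUtf16` is a hypothetical helper, and the heuristic shown is only one plausible reading of the 'spaces heuristic': it counts how often a space (U+0020) appears in each byte order. The library's actual rules may differ.

```javascript
// Illustration only: a simplified guess at BOM + "spaces heuristic" detection.
// detectUtf16 is a hypothetical helper, not part of iconv-lite.
function detectUtf16(buf, defaultEncoding = 'utf-16le') {
    // 1. A BOM, if present, decides the endianness.
    if (buf.length >= 2) {
        if (buf[0] === 0xff && buf[1] === 0xfe) return 'utf-16le';
        if (buf[0] === 0xfe && buf[1] === 0xff) return 'utf-16be';
    }
    // 2. Otherwise count how a space (U+0020) would look in each byte order.
    let le = 0;
    let be = 0;
    for (let i = 0; i + 1 < buf.length; i += 2) {
        if (buf[i] === 0x20 && buf[i + 1] === 0x00) le++;
        if (buf[i] === 0x00 && buf[i + 1] === 0x20) be++;
    }
    if (le > be) return 'utf-16le';
    if (be > le) return 'utf-16be';
    // 3. No evidence either way: fall back to the configured default.
    return defaultEncoding;
}

const leBytes = Buffer.from('a b c', 'utf16le');
const beBytes = Buffer.from(leBytes).swap16(); // same text, big-endian byte order
console.log(detectUtf16(leBytes)); // utf-16le
console.log(detectUtf16(beBytes)); // utf-16be
```

When the input has neither a BOM nor any spaces, the sketch returns whatever default was passed in, which is the role `defaultEncoding` plays in the description above.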
This library supports UTF-32LE, UTF-32BE and UTF-32 encodings. Like the UTF-16 encoding above, UTF-32 defaults to UTF-32LE, but uses BOM and the 'spaces heuristic' to determine input endianness.
- The default of UTF-32LE can be overridden with the `defaultEncoding: 'utf-32be'` option. Strips BOM unless `stripBOM: false`.
- Encoding: uses UTF-32LE and writes BOM by default. Use `addBOM: false` to override. (`defaultEncoding: 'utf-32be'` can also be used here to change encoding.)
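Node.js has no built-in UTF-32 codec, so the following sketch shows by hand what BOM-driven UTF-32 decoding involves. `decodeUtf32` is a hypothetical helper, not the library's code, and it leaves out the heuristic step: it looks only at the BOM and otherwise uses the default passed in.

```javascript
// Illustration only: hypothetical helper, not iconv-lite's implementation.
function decodeUtf32(buf, defaultEncoding = 'utf-32le') {
    let bigEndian = defaultEncoding === 'utf-32be';
    let offset = 0;

    // A BOM, if present, overrides the default and is stripped.
    if (buf.length >= 4) {
        const first = buf.readUInt32LE(0);
        if (first === 0x0000feff) {          // bytes FF FE 00 00
            bigEndian = false;
            offset = 4;
        } else if (first === 0xfffe0000) {   // bytes 00 00 FE FF
            bigEndian = true;
            offset = 4;
        }
    }

    let str = '';
    for (let i = offset; i + 4 <= buf.length; i += 4) {
        const codePoint = bigEndian ? buf.readUInt32BE(i) : buf.readUInt32LE(i);
        // Values above U+10FFFF are not valid code points; use U+FFFD instead.
        str += codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : '\uFFFD';
    }
    return str;
}

const beWithBom = Buffer.from([0, 0, 0xfe, 0xff, 0, 0, 0, 0x68, 0, 0, 0, 0x69]);
const leNoBom = Buffer.from([0x68, 0, 0, 0, 0x69, 0, 0, 0]);
console.log(decodeUtf32(beWithBom)); // hi
console.log(decodeUtf32(leNoBom));   // hi
```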
If either unidecode-plus or unidecode is added to your project ("npm install unidecode-plus" or "npm install unidecode"), an option becomes available to transliterate characters which are not available in a particular encoding. The transliterations are always plain ASCII characters. Unlike using unidecode directly, which converts all non-ASCII characters into transliterations, iconv only transliterates characters which are not available in the target character encoding.
In this example:
buf = iconv.encode("Café 北京", 'iso-8859-1', { transliterate: true });
The output is <Buffer 43 61 66 e9 20 42 65 69 20 4a 69 6e 67 20>. Converted back into ISO-8859-1 text, this is "Café Bei Jing ", preserving the accented "é", and only transliterating the Chinese characters.
Transliteration to a string instead of a buffer can also be done directly, like this:
str = iconv.transliterate("Café 北京", 'iso-8859-1');
When encoding to create a buffer, the node-iconv style of requesting transliteration can also be used:
buf = iconv.encode("Café 北京", 'iso-8859-1//translit');
If you use unidecode-plus instead of unidecode, you get two additional transliteration options: german and smartSpacing.
The german option transliterates Ä, ä, Ö, ö, Ü, and ü to AE, ae, OE, oe, UE, and ue, respectively, instead of just removing the umlauts.
The smartSpacing option improves the formatting of transliterated text, removing some unnecessary spaces and adding others for clarity. For example, "Café 北京, 鞋 size 10½" becomes "Cafe Bei Jing, Xie size 10 1/2" using smartSpacing. Without it, you get "Cafe Bei Jing , Xie size 101/2". (See the unidecode-plus site for more detail.)
Please take note that transliteration only affects encoding, not decoding.
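The idea of transliterating only what the target encoding cannot represent can be sketched without any extra modules. This is an illustration, not how the library does it: `transliterateForLatin1` is a hypothetical helper, and the two-entry `TABLE` is hand-written sample data standing in for unidecode.

```javascript
// Illustration only: a tiny hand-written table stands in for unidecode.
const TABLE = { '\u5317': 'Bei ', '\u4eac': 'Jing ' }; // 北, 京 (sample data)

// Hypothetical helper: keep what ISO-8859-1 can represent, replace the rest.
function transliterateForLatin1(str) {
    let out = '';
    for (const ch of str) {
        if (ch.codePointAt(0) <= 0xff) {
            out += ch; // representable in ISO-8859-1, e.g. "é": keep as-is
        } else {
            out += TABLE[ch] !== undefined ? TABLE[ch] : '?'; // transliterate or '?'
        }
    }
    return out;
}

const text = transliterateForLatin1('Caf\u00e9 \u5317\u4eac'); // "Café 北京"
const bytes = Buffer.from(text, 'latin1');
console.log(bytes.toString('hex')); // 436166e920426569204a696e6720
```

The bytes match the buffer shown earlier in this section: the "é" survives as `e9`, and only the Chinese characters are replaced with ASCII.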
- When decoding, be sure to supply a Buffer to decode() method, otherwise bad things usually happen.
- Untranslatable characters are set to � or ? unless using transliteration.
- Node versions 0.10.31 and 0.11.13 are buggy; don't use them (see #65, #77).
$ git clone git@github.com:ashtuchkin/iconv-lite.git
$ cd iconv-lite
$ npm install
$ npm test
$ # To view performance:
$ node test/performance.js
$ # To view test coverage:
$ npm run coverage
$ open coverage/lcov-report/index.html