# String Decoder

<!--introduced_in=v0.10.0-->

> Stability: 2 - Stable

The `string_decoder` module provides an API for decoding `Buffer` objects into
strings in a manner that preserves encoded multi-byte UTF-8 and UTF-16
characters. It can be accessed using:

```js
const { StringDecoder } = require('string_decoder');
```

The following example shows the basic use of the `StringDecoder` class.

```js
const { StringDecoder } = require('string_decoder');
const decoder = new StringDecoder('utf8');

const cent = Buffer.from([0xC2, 0xA2]);
console.log(decoder.write(cent)); // Prints: ¢

const euro = Buffer.from([0xE2, 0x82, 0xAC]);
console.log(decoder.write(euro)); // Prints: €
```

When a `Buffer` instance is written to the `StringDecoder` instance, an
internal buffer is used to ensure that the decoded string does not contain
any incomplete multi-byte characters. The bytes of an incomplete character are
held in the internal buffer until the next call to `stringDecoder.write()` or
until `stringDecoder.end()` is called.

In the following example, the three UTF-8 encoded bytes of the European Euro
symbol (`€`) are written over three separate operations:

```js
const { StringDecoder } = require('string_decoder');
const decoder = new StringDecoder('utf8');

// The first two writes return '' because the character is still incomplete.
decoder.write(Buffer.from([0xE2]));
decoder.write(Buffer.from([0x82]));
console.log(decoder.end(Buffer.from([0xAC]))); // Prints: €
```
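
As a contrast, the sketch below decodes the same bytes twice: once by calling `buf.toString()` on each chunk independently, and once through a `StringDecoder`. The two-chunk split and the variable names (`chunks`, `naive`, `decoded`) are assumptions made for illustration only.

```js
const { StringDecoder } = require('string_decoder');

// The three UTF-8 bytes of '€', split across two chunks (illustrative split).
const chunks = [Buffer.from([0xE2, 0x82]), Buffer.from([0xAC])];

// Decoding each chunk on its own corrupts the character that spans the split.
const naive = chunks.map((chunk) => chunk.toString('utf8')).join('');

// A StringDecoder carries the partial character over to the next chunk.
const decoder = new StringDecoder('utf8');
const decoded = chunks.map((chunk) => decoder.write(chunk)).join('') +
  decoder.end();

console.log(naive === '€');   // Prints: false
console.log(decoded === '€'); // Prints: true
```

This is the situation a stream consumer typically faces, since chunk boundaries are not guaranteed to line up with character boundaries.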

## Class: new StringDecoder([encoding])
<!-- YAML
added: v0.1.99
-->

* `encoding` {string} The character encoding the `StringDecoder` will use.
  Defaults to `'utf8'`.

Creates a new `StringDecoder` instance.
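
The following sketch illustrates the `encoding` argument. It constructs one decoder with no argument, relying on the `'utf8'` default, and one with `'utf16le'`. The variable names are hypothetical, and the byte values assume that `€` (U+20AC) is encoded in UTF-16LE as `0xAC 0x20`.

```js
const { StringDecoder } = require('string_decoder');

// No argument: the decoder falls back to 'utf8'.
const defaultDecoder = new StringDecoder();
const cent = defaultDecoder.write(Buffer.from([0xC2, 0xA2]));

// Explicit 'utf16le': '€' (U+20AC) is the two bytes 0xAC 0x20.
const utf16Decoder = new StringDecoder('utf16le');
const firstHalf = utf16Decoder.write(Buffer.from([0xAC]));  // Half a code unit.
const secondHalf = utf16Decoder.write(Buffer.from([0x20])); // Completes it.

console.log(cent);                      // Prints: ¢
console.log(JSON.stringify(firstHalf)); // Prints: ""
console.log(secondHalf);                // Prints: €
```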

### stringDecoder.end([buffer])
<!-- YAML
added: v0.9.3
-->

* `buffer` {Buffer} A `Buffer` containing the bytes to decode.

Returns any remaining input stored in the internal buffer as a string. Bytes
representing incomplete UTF-8 and UTF-16 characters will be replaced with
substitution characters appropriate for the character encoding.

If the `buffer` argument is provided, one final call to `stringDecoder.write()`
is performed before returning the remaining input.
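
A minimal sketch of the substitution behavior described above: a lone lead byte is written, then `stringDecoder.end()` is called with no further input. The variable names are illustrative, and the check only assumes that the flushed string contains the Unicode replacement character (U+FFFD).

```js
const { StringDecoder } = require('string_decoder');

const decoder = new StringDecoder('utf8');

// 0xE2 starts a three-byte UTF-8 sequence, so nothing can be emitted yet.
const pending = decoder.write(Buffer.from([0xE2]));

// No more input will arrive, so the dangling byte is flushed as a
// substitution character instead of being silently dropped.
const flushed = decoder.end();

console.log(JSON.stringify(pending));    // Prints: ""
console.log(flushed.includes('\uFFFD')); // Prints: true
```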

### stringDecoder.write(buffer)
<!-- YAML
added: v0.1.99
changes:
  - version: v8.0.0
    pr-url: https://github.com/nodejs/node/pull/9618
    description: Each invalid character is now replaced by a single replacement
                 character instead of one for each individual byte.
-->

* `buffer` {Buffer} A `Buffer` containing the bytes to decode.

Returns a decoded string, ensuring that any incomplete multi-byte characters at
the end of the `Buffer` are omitted from the returned string and stored in an
internal buffer for the next call to `stringDecoder.write()` or
`stringDecoder.end()`.
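
To illustrate, the sketch below writes two buffers whose boundary falls in the middle of the `€` character. The complete characters in each buffer are returned immediately, while the trailing partial character is held back. The byte layout and variable names are assumptions chosen for the example.

```js
const { StringDecoder } = require('string_decoder');

const decoder = new StringDecoder('utf8');

// 'a' is complete; 0xE2 is only the first byte of '€' and is held back.
const first = decoder.write(Buffer.from([0x61, 0xE2]));

// The remaining bytes of '€' arrive, followed by a complete 'b'.
const second = decoder.write(Buffer.from([0x82, 0xAC, 0x62]));

console.log(first);  // Prints: a
console.log(second); // Prints: €b
```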