# bcp-47 [![Build][build-badge]][build] [![Coverage][coverage-badge]][coverage] [![Downloads][downloads-badge]][downloads] [![Size][size-badge]][size] Parse and stringify [BCP 47][spec] language tags. ## Install [npm][]: ```sh npm install bcp-47 ``` ## Contents * [Use](#use) * [API](#api) * [`bcp47.parse(tag[, options])`](#bcp47parsetag-options) * [`bcp47.stringify(schema)`](#bcp47stringifyschema) * [`schema`](#schema) * [`function warning(reason, code, offset)`](#function-warningreason-code-offset) * [Related](#related) * [License](#license) ## Use ```js var bcp47 = require('bcp-47') var schema = bcp47.parse('hy-Latn-IT-arevela') console.log(schema) console.log(bcp47.stringify(schema)) ``` Yields: ```js { language: 'hy', extendedLanguageSubtags: [], script: 'Latn', region: 'IT', variants: ['arevela'], extensions: [], privateuse: [], irregular: null, regular: null } 'hy-Latn-IT-arevela' ``` ## API ### `bcp47.parse(tag[, options])` Parse a BCP 47 tag into a language schema. Note that the algorithm is case-insensitive. ###### `options.normalize` Whether to normalize legacy tags when possible (`boolean`, default: `true`). For example, `i-klingon` does not match the BCP 47 language algorithm but is valid nonetheless. It is suggested to use `tlh` instead (the ISO 639-3 code for Klingon). When `normalize` is true, passing `i-klingon`, or other deprecated tags, is handled as if their suggested valid tag was given. ###### `options.forgiving` By default, when an error is encountered, an empty object is returned. When in forgiving mode, all found values up to the point of the error are included (`boolean`, default: `false`). So, for example, where by default `en-GB-abcdefghi` an empty object is returned (as the language variant is too long), in `forgiving` mode the `language` of `schema` is populated with `en` and the `region` is populated with `GB`. ###### `options.warning` When given, `warning` is invoked when an error is encountered ([`Function`][warning]). ###### Returns [`Schema`][schema] — Parsed BCP 47 language tag. ###### Throws When `tag` is `null` or `undefined`. ### `bcp47.stringify(schema)` Compile a [`schema`][schema] into a BCP 47 language tag. ###### Returns `string` — BCP 47 language tag. ### `schema` A schema may have the following properties. A schema is deemed empty when it has neither `language`, `irregular`, `regular`, nor `privateuse` (where an empty `privateuse` array is handled as no `privateuse`, too). ###### `schema.language` Two or three character [ISO 639][iso-639] language code, four character reserved language code, or 5 to 8 (inclusive) characters registered language subtag (`string`). For example, `en` (English) or `cmn` (Mandarin Chinese). ###### `schema.extendedLanguageSubtags` Selected three-character [ISO 639][iso-639] codes(`Array.`), such as `yue` in `zh-yue-HK` (Chinese, Cantonese, as used in Hong Kong SAR). ###### `schema.script` Four character [ISO 15924][iso-15924] script code (`string`), such as `Latn` in `hy-Latn-IT-arevela` (Eastern Armenian written in Latin script, as used in Italy). ###### `schema.region` Two alphabetical character [ISO 3166-1][iso-3166-1] code, or three digit [UN M.49][un-m49] code (`string`). For example, `CN` in `cmn-Hans-CN` (Mandarin Chinese, Simplified script, as used in China). ###### `schema.variants` 5 to 8 (inclusive) character language variants (`Array.`), such as both `rozaj` and `biske` in `sl-rozaj-biske` (San Giorgio dialect of Resian dialect of Slovenian). ###### `schema.extensions` List of extensions (`Array.`), each an object containing a one character `singleton`, and a list of `extensions` (`string`). `singleton` cannot be `x` (case-insensitive), and `extensions` must be between two and eight (inclusive) characters. For example, an extension would be `u-co-phonebk` in `de-DE-u-co-phonebk` (German, as used in Germany, using German phonebook sort order), where `u` is the `singleton` and `co` and `phonebk` are its extensions. ###### `schema.privateuse` List of private-use subtags (`Array.`), where each subtag must be between one and eight (inclusive) characters. ###### `schema.regular` One of the `regular` tags (`string`): tags which are seen as something different by the algorithm. Valid values are: * `art-lojban` * `cel-gaulish` * `no-bok` * `no-nyn` * `zh-guoyu` * `zh-hakka` * `zh-min` * `zh-min-nan` * `zh-xiang` ###### `schema.irregular` One of the `irregular` tags (`string`): tags which are seen as invalid by the algorithm). Valid values are: * `en-GB-oed` * `i-ami` * `i-bnn` * `i-default` * `i-enochian` * `i-hak` * `i-klingon` * `i-lux` * `i-mingo` * `i-navajo` * `i-pwn` * `i-tao` * `i-tay` * `i-tsu` * `sgn-BE-FR` * `sgn-BE-NL` * `sgn-CH-DE` ### `function warning(reason, code, offset)` Invoked when an error occurs. ###### Parameters * `reason` (`string`) — English reason for failure * `code` (`number`) — Code for failure * `offset` (`number`) — Index-based position of error ###### Warnings | code | reason | | :--- | :--------------------------------------------------------------------- | | 1 | Too long variant, expected at most 8 characters | | 2 | Too long extension, expected at most 8 characters | | 3 | Too many extended language subtags, expected at most 3 subtags | | 4 | Empty extension, extensions must have at least 2 characters of content | | 5 | Too long private-use area, expected at most 8 characters | | 6 | Found superfluous content after tag | ## Related * [`bcp-47-match`](https://github.com/wooorm/bcp-47-match) — Match BCP 47 language tags with language ranges per RFC 4647 * [`bcp-47-normalize`](https://github.com/wooorm/bcp-47-normalize) — Normalize, canonicalize, and format BCP 47 tags * [`iso-3166`](https://github.com/wooorm/iso-3166) — ISO 3166 codes * [`iso-639-2`](https://github.com/wooorm/iso-639-2) — ISO 639-2 codes * [`iso-639-3`](https://github.com/wooorm/iso-639-3) — ISO 639-3 codes * [`iso-15924`](https://github.com/wooorm/iso-15924) — ISO 15924 codes * [`un-m49`](https://github.com/wooorm/un-m49) — UN M49 codes ## License [MIT][license] © [Titus Wormer][author] [build-badge]: https://img.shields.io/travis/wooorm/bcp-47.svg [build]: https://travis-ci.org/wooorm/bcp-47 [coverage-badge]: https://img.shields.io/codecov/c/github/wooorm/bcp-47.svg [coverage]: https://codecov.io/github/wooorm/bcp-47 [downloads-badge]: https://img.shields.io/npm/dm/bcp-47.svg [downloads]: https://www.npmjs.com/package/bcp-47 [size-badge]: https://img.shields.io/bundlephobia/minzip/bcp-47.svg [size]: https://bundlephobia.com/result?p=bcp-47 [npm]: https://docs.npmjs.com/cli/install [license]: license [author]: https://wooorm.com [spec]: https://tools.ietf.org/html/bcp47 [warning]: #function-warningreason-code-offset [schema]: #schema [iso-639]: https://en.wikipedia.org/wiki/ISO_639 [iso-15924]: https://en.wikipedia.org/wiki/ISO_15924 [iso-3166-1]: https://en.wikipedia.org/wiki/ISO_3166-1 [un-m49]: https://en.wikipedia.org/wiki/UN_M.49