Commit e430964

Write documentation
1 parent ee1aa03 commit e430964

3 files changed: 106 additions and 99 deletions

README.md

Lines changed: 95 additions & 91 deletions
@@ -1,6 +1,6 @@
11
# level-transcoder
22

3-
**Encode data with built-in or custom encodings.** The (not yet official) successor to [`level-codec`][level-codec] that introduces "transcoders" to translate between encodings and internal data formats supported by a db. This allows a db to store keys and values in a format of its choice (Buffer, Uint8Array or String) with zero-effort support of all known encodings.
3+
**Encode data with built-in or custom encodings.** The (not yet official) successor to [`level-codec`][level-codec] that transcodes encodings from and to internal data formats supported by a database. This allows a database to store keys and values in a format of its choice (Buffer, Uint8Array or String) with zero-effort support of all known encodings.
44

55
[![level badge][level-badge]](https://github.com/Level/awesome)
66
[![Test](https://img.shields.io/github/workflow/status/Level/transcoder/Test?label=test)](https://github.com/Level/transcoder/actions/workflows/test.yml)
@@ -11,14 +11,19 @@
1111

1212
## Usage
1313

14+
Create a transcoder, passing a desired format:
15+
1416
```js
1517
const Transcoder = require('level-transcoder')
1618

17-
// Create a transcoder, passing a desired format
1819
const transcoder1 = new Transcoder(['view'])
1920
const transcoder2 = new Transcoder(['buffer'])
2021
const transcoder3 = new Transcoder(['utf8'])
22+
```
23+
24+
Then select an encoding and encode some data:
2125

26+
```js
2227
// Uint8Array(3) [ 49, 50, 51 ]
2328
console.log(transcoder1.encoding('json').encode(123))
2429

@@ -29,145 +34,148 @@ console.log(transcoder2.encoding('json').encode(123))
2934
console.log(transcoder3.encoding('json').encode(123))
3035
```
3136

32-
If given multiple formats (like how [`leveldown`][leveldown] can work with both Buffer and strings), the best fitting format is chosen. Not by magic, just hardcoded logic because we don't have that many formats to deal with.
33-
34-
For example, knowing that JSON is a UTF-8 string which matches the desired `utf8` format, the `json` encoding will return a string here:
37+
If the `Transcoder` constructor is given multiple formats then `Transcoder#encoding()` selects an encoding with the best fitting format. Consider a database like [`leveldown`][leveldown] which has the ability to return data as a Buffer or string. If an `encoding.decode(data)` function needs a string, we'll want to fetch that `data` from the database as a string. This avoids the cost of having to convert a Buffer to a string. So we'd use the following transcoder:
3538

3639
```js
37-
const transcoder4 = new Transcoder(['buffer', 'utf8'])
38-
39-
// '123'
40-
console.log(transcoder4.encoding('json').encode(123))
40+
const transcoder = new Transcoder(['buffer', 'utf8'])
4141
```
4242

43-
In contrast, the `view` encoding doesn't match either `buffer` or `utf8` so data encoded by the `view` encoding gets transcoded into Buffers:
43+
Then, knowing for example that the return value of `JSON.stringify(data)` is a UTF-8 string which matches one of the given formats, the `json` encoding will return a string here:
4444

4545
```js
46-
// <Buffer 31 32 33>
47-
console.log(transcoder4.encoding('view').encode(Uint8Array.from([49, 50, 51])))
46+
// '123'
47+
console.log(transcoder.encoding('json').encode(123))
4848
```
4949

50-
Copying of data is avoided where possible. That's true in the last example, because the underlying ArrayBuffer of the view can be passed to a Buffer constructor without a copy.
50+
In contrast, data encoded as a `view` (for now that just means Uint8Array) would get transcoded into `buffer`. Copying of data is avoided where possible, like how the underlying ArrayBuffer of a view can be passed to `Buffer.from(..)` without a copy.
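The zero-copy conversion mentioned above can be sketched with plain Node.js APIs. This is a minimal illustration of the underlying technique, not necessarily how `level-transcoder` implements it:

```js
// A Uint8Array view over three bytes: the character codes of '1', '2', '3'
const view = Uint8Array.from([49, 50, 51])

// Passing the underlying ArrayBuffer (plus offset and length) to Buffer.from()
// creates a Buffer that shares memory with the view instead of copying it
const buffer = Buffer.from(view.buffer, view.byteOffset, view.byteLength)

// A write through the view is therefore visible through the Buffer
view[0] = 52

// '423'
console.log(buffer.toString('utf8'))
```

Passing `byteOffset` and `byteLength` matters when the view covers only part of a larger ArrayBuffer.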
5151

52-
Lastly, the encoding returned by `Transcoder#encoding()` has a `format` property to be used to forward key- and valueEncoding options to an underlying store. This way, both the public and private API's of a db will be encoding-aware (somewhere in the future).
52+
Lastly, encodings returned by `Transcoder#encoding()` have a `format` property that can be used to forward information to an underlying store. For example, an input value of `{ x: 3 }` that uses the `json` encoding, which has a `format` of `utf8`, can be forwarded as the value `'{"x":3}'` with encoding `utf8`. The reverse applies to output.
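As a rough sketch of that forwarding, assume a hand-written `json` object standing in for the encoding returned by `Transcoder#encoding('json')` and a hypothetical in-memory `store` (neither is part of `level-transcoder`):

```js
// Hypothetical stand-in for a json encoding that has a format of 'utf8'
const json = {
  name: 'json',
  format: 'utf8',
  encode: (data) => JSON.stringify(data),
  decode: (data) => JSON.parse(data)
}

// Hypothetical underlying store that records what it was given
const store = {
  entries: new Map(),
  put (key, value, options) {
    this.entries.set(key, { value, valueEncoding: options.valueEncoding })
  },
  get (key) {
    return this.entries.get(key).value
  }
}

// Input: encode the value, then forward it along with the encoding's format
store.put('a', json.encode({ x: 3 }), { valueEncoding: json.format })

// { value: '{"x":3}', valueEncoding: 'utf8' }
console.log(store.entries.get('a'))

// Output: the reverse, decoding what the store returns
// { x: 3 }
console.log(json.decode(store.get('a')))
```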
5353

54-
For example, on `leveldown` a call like `db.put(key, { x: 3 }, { valueEncoding: 'json' })` will pass that value `{ x: 3 }` through a `json` encoding that has a `format` of `utf8`, which is then forwarded as `db._put(key, '{"x":3}', { valueEncoding: 'utf8' })`.
54+
## Encodings
5555

56-
## Compatible with
56+
### Built-in Encodings
5757

58-
Various modules in the ecosystem, in and outside of level, can be used with `level-transcoder`.
58+
These encodings can be used out of the box and are to be selected by name.
5959

60-
| Module | Format | Interface | Named |
61-
|:-------------------------------------------|:-------------|:------------------------------------|:------|
62-
| [`protocol-buffers`][protocol-buffers] | buffer | [`level-codec`][level-codec] ||
63-
| [`charwise`][charwise] | utf8 | [`level-codec`][level-codec] ||
64-
| [`bytewise`][bytewise] | buffer | [`level-codec`][level-codec] ||
65-
| [`lexicographic-integer-encoding`][lexint] | buffer, utf8 | [`level-codec`][level-codec] ||
66-
| [`codecs`][mafintosh-codecs] | buffer | [`codecs`][mafintosh-codecs] ||
67-
| [`abstract-encoding`][abstract-enc] | buffer | [`abstract-encoding`][abstract-enc] ||
68-
| [`multiformats`][js-multiformats] | view | [`multiformats`][blockcodec] ||
69-
| [`base32-codecs`][base32-codecs] | buffer | [`codecs`][mafintosh-codecs] ||
60+
In this table, the _input_ is what `encode()` accepts. The _format_ is what `encode()` returns as well as what `decode()` accepts. The _output_ is what `decode()` returns. The TypeScript typings of `level-transcoder` have generic type parameters with matching names: `TIn`, `TFormat` and `TOut`.
7061

71-
Common between the interfaces is that they have `encode()` and `decode()` methods. The terms "codec" and "encoding" are used interchangeably. Passing these encodings through `Transcoder#encoding()` (which is done implicitly when used in an `abstract-level` database) results in normalized encoding objects that follow [the interface](./lib/encoding.d.ts) of `level-transcoder`.
62+
| Name | Input | Format | Output |
63+
| :-------------------- | :------------------------- | :------------------ | :-------------- |
64+
| `buffer` <sup>1</sup> | Buffer, Uint8Array, String | `buffer` (Buffer) | Buffer |
65+
| `view` | Uint8Array, Buffer, String | `view` (Uint8Array) | Uint8Array |
66+
| `utf8` | String, Buffer, Uint8Array | `utf8` (String) | String |
67+
| `json` | Any JSON type | `utf8` (String) | Input |
68+
| `hex` | String (hex), Buffer | `buffer` (Buffer) | String (hex) |
69+
| `base64` | String (base64), Buffer | `buffer` (Buffer) | String (base64) |
7270

73-
If the format in the table above is buffer, then `encode()` is expected to return a Buffer. If utf8, then a string. If view, then a Uint8Array.
71+
<sup>1</sup> Aliased as `binary`. Use of this alias does not affect the ability to transcode.
7472

75-
Those marked as not named are modules that export or generate anonymous encodings that don't have a `name` property (or `type` as an alias) which means they can only be used as objects and not by name. Passing an anonymous encoding through `Transcoder#encoding()` does give it a `name` property for compatibility, but the value of `name` is not deterministic.
73+
### Transcoder Encodings
7674

77-
---
75+
It's not necessary to use or reference the encodings below directly. They're listed here as implementation notes and to show that input and output are the same; only the format differs.
7876

79-
**_Rest of README is not yet updated._**
77+
Custom encodings are transcoded in the same way and require no additional setup. For example: if a custom encoding has `{ name: 'example', format: 'utf8' }` then `level-transcoder` will create transcoder encodings on demand with names `example+buffer` and `example+view`.
8078

81-
## API
79+
| Name | Input | Format | Output |
80+
| :--------------------------| :------------------------- | :------- | :-------------- |
81+
| `buffer+view` | Buffer, Uint8Array, String | `view` | Buffer |
82+
| `view+buffer` | Uint8Array, Buffer, String | `buffer` | Uint8Array |
83+
| `utf8+view` | String, Buffer, Uint8Array | `view` | String |
84+
| `utf8+buffer` | String, Buffer, Uint8Array | `buffer` | String |
85+
| `json+view` | Any JSON type | `view` | Input |
86+
| `json+buffer` | Any JSON type | `buffer` | Input |
87+
| `hex+view` <sup>1</sup> | String (hex), Buffer | `view` | String (hex) |
88+
| `base64+view` <sup>1</sup> | String (base64), Buffer | `view` | String (base64) |
8289

83-
### `codec = Codec([opts])`
90+
<sup>1</sup> Unlike other encodings that transcode to `view`, these depend on Buffer at the moment and thus don't work in browsers if a [shim](https://github.com/feross/buffer) is not included by JavaScript bundlers like Webpack and Browserify.
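To illustrate what a transcoder encoding does for a custom encoding, here is a minimal sketch that wraps a `utf8` encoding so that it stores data as a `view`. The `toView()` helper is hypothetical and written for this example; `level-transcoder` creates such encodings internally and its implementation may differ.

```js
// TextEncoder and TextDecoder are globals in Node.js and browsers
const textEncoder = new TextEncoder()
const textDecoder = new TextDecoder()

// A custom encoding with a format of 'utf8'
const example = {
  name: 'example',
  format: 'utf8',
  encode: (data) => String(data),
  decode: (data) => Number(data)
}

// Hypothetical helper: wrap a utf8 encoding so that it stores data as a view
function toView (encoding) {
  return {
    name: encoding.name + '+view',
    format: 'view',
    encode: (data) => textEncoder.encode(encoding.encode(data)),
    decode: (data) => encoding.decode(textDecoder.decode(data))
  }
}

const transcoded = toView(example)
const stored = transcoded.encode(123)

// 'example+view'
console.log(transcoded.name)

// Uint8Array(3) [ 49, 50, 51 ]
console.log(stored)

// 123
console.log(transcoded.decode(stored))
```

Input (`123`) and output (`123`) are the same as with the original encoding. Only the stored format changes from a string to a Uint8Array.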
8491

85-
Create a new codec, with a global options object.
92+
### Ecosystem Encodings
8693

87-
### `codec.encodeKey(key[, opts])`
94+
Various modules in the ecosystem, in and outside of Level, can be used with `level-transcoder` although they follow different interfaces. Common between the interfaces is that they have `encode()` and `decode()` methods. The terms "codec" and "encoding" are used interchangeably in the ecosystem. Passing these encodings through `Transcoder#encoding()` (which is done implicitly when used in an `abstract-level` database) results in normalized encoding objects as described further below.
8895

89-
Encode `key` with given `opts`.
96+
| Module | Format | Interface | Named |
97+
| :----------------------------------------- | :--------------- | :---------------------------------- | :---- |
98+
| [`protocol-buffers`][protocol-buffers] | `buffer` | [`level-codec`][level-codec] ||
99+
| [`charwise`][charwise] | `utf8` | [`level-codec`][level-codec] ||
100+
| [`bytewise`][bytewise] | `buffer` | [`level-codec`][level-codec] ||
101+
| [`lexicographic-integer-encoding`][lexint] | `buffer`, `utf8` | [`level-codec`][level-codec] ||
102+
| [`codecs`][mafintosh-codecs] | `buffer` | [`codecs`][mafintosh-codecs] ||
103+
| [`abstract-encoding`][abstract-enc] | `buffer` | [`abstract-encoding`][abstract-enc] ||
104+
| [`multiformats`][js-multiformats] | `view` | [`multiformats`][blockcodec] ||
105+
| [`base32-codecs`][base32-codecs] | `buffer` | [`codecs`][mafintosh-codecs] ||
90106

91-
### `codec.encodeValue(value[, opts])`
107+
Those marked as not named are modules that export or generate encodings that don't have a `name` property (or `type` as an alias). We call these _anonymous encodings_. They can only be used as objects and not by name. Passing an anonymous encoding through `Transcoder#encoding()` does give it a `name` property for compatibility, but the value of `name` is not deterministic.
92108

93-
Encode `value` with given `opts`.
109+
## API
94110

95-
### `codec.encodeBatch(batch[, opts])`
111+
### `Transcoder`
96112

97-
Encode `batch` ops with given `opts`.
113+
#### `transcoder = new Transcoder(formats)`
98114

99-
### `codec.encodeLtgt(ltgt)`
115+
Create a new transcoder, providing the formats that are supported by a database (or another consumer). The `formats` argument must be an array containing one or more of `'buffer'`, `'view'`, `'utf8'`. The returned `transcoder` instance is stateful, in that it contains a set of cached encoding objects.
100116

101-
Encode the ltgt values of option object `ltgt`.
117+
#### `encoding = transcoder.encoding(encoding)`
102118

103-
### `codec.decodeKey(key[, opts])`
119+
Returns the given `encoding` argument as a normalized encoding object that follows the `level-transcoder` encoding interface. The `encoding` argument may be:
104120

105-
Decode `key` with given `opts`.
121+
- A string to select a known encoding by its name
122+
- An object that follows one of the following interfaces: [`level-transcoder`](#encoding-interface), [`level-codec`](https://github.com/Level/codec#encoding-format), [`codecs`][mafintosh-codecs], [`abstract-encoding`][abstract-enc], [`multiformats`][blockcodec]
123+
- A previously normalized encoding, such that `encoding(x)` equals `encoding(encoding(x))`.
106124

107-
### `codec.decodeValue(value[, opts])`
125+
Results are cached. If the `encoding` argument is an object and it has a name then subsequent calls can refer to that encoding by name.
108126

109-
Decode `value` with given `opts`.
127+
Depending on the `formats` provided to the `Transcoder` constructor, this method may return a _transcoder encoding_ that translates the desired encoding from and to a supported format. Its `encode()` method accepts the same input types and its `decode()` method returns the same output types as a non-transcoded encoding, but its `name` property will differ.
110128

111-
### `codec.createStreamDecoder([opts])`
129+
#### `encodings = transcoder.encodings()`
112130

113-
Create a function with signature `(key, value)`, that for each key-value pair returned from a levelup read stream returns the decoded value to be emitted.
131+
Get an array of encoding objects. This includes:
114132

115-
### `codec.keyAsBuffer([opts])`
133+
- Encodings for the `formats` that were passed to the `Transcoder` constructor
134+
- Custom encodings that were passed to `transcoder.encoding()`
135+
- Transcoder encodings for either.
116136

117-
Check whether `opts` and the global `opts` call for a binary key encoding.
137+
### `Encoding`
118138

119-
### `codec.valueAsBuffer([opts])`
139+
#### `data = encoding.encode(data)`
120140

121-
Check whether `opts` and the global `opts` call for a binary value encoding.
141+
Encode data.
122142

123-
### `codec.encodings`
143+
#### `data = encoding.decode(data)`
124144

125-
The builtin encodings as object of form
145+
Decode data.
126146

127-
```js
128-
{
129-
[type]: encoding
130-
}
131-
```
147+
#### `encoding.name`
132148

133-
See below for a list and the format of `encoding`.
149+
Unique name. A string.
134150

135-
## Builtin Encodings
151+
#### `encoding.commonName`
136152

137-
| Type | Input | Stored as | Output |
138-
| :---------------------------------------------------------------- | :--------------------------- | :--------------- | :-------- |
139-
| `utf8` | String or Buffer | String or Buffer | String |
140-
| `json` | Any JSON type | JSON string | Input |
141-
| `binary` | Buffer, string or byte array | Buffer | As stored |
142-
| `hex`<br>`ascii`<br>`base64`<br>`ucs2`<br>`utf16le`<br>`utf-16le` | String or Buffer | Buffer | String |
143-
| `none` a.k.a. `id` | Any type (bypass encoding) | Input\* | As stored |
153+
Common name, computed from `name`. If this encoding is a transcoder encoding, `name` will be for example `'json+view'` and `commonName` will be just `'json'`. Else `name` will equal `commonName`.
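One way such a computation could look is sketched below. The `commonName()` function is hypothetical and assumes that a `+` only appears in names of transcoder encodings:

```js
// Hypothetical helper: take the part of the name before the first '+'
function commonName (name) {
  return name.split('+')[0]
}

// 'json'
console.log(commonName('json+view'))

// 'json'
console.log(commonName('json'))
```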
144154

145-
<sup>\*</sup> Stores may have their own type coercion. Whether type information is preserved depends on the [`abstract-leveldown`][abstract-leveldown] implementation as well as the underlying storage (`LevelDB`, `IndexedDB`, etc).
155+
#### `encoding.format`
146156

147-
## Encoding Format
157+
Name of the (lower-level) encoding used by the return value of `encode()`. One of `'buffer'`, `'view'`, `'utf8'`. If `name` equals `format` then the encoding can be assumed to be idempotent, such that `encode(x)` equals `encode(encode(x))`.
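The idempotence property can be seen with two simplified, hand-written stand-ins (assumptions for this example, not the built-in encodings themselves). For the first, `name` equals `format`. For the second, it does not:

```js
// Simplified stand-in where name equals format
const utf8 = {
  name: 'utf8',
  format: 'utf8',
  encode: (data) => Buffer.isBuffer(data) ? data.toString('utf8') : String(data)
}

// Simplified stand-in where name differs from format
const json = {
  name: 'json',
  format: 'utf8',
  encode: (data) => JSON.stringify(data)
}

// true: encoding an already encoded string changes nothing
console.log(utf8.encode(utf8.encode('a')) === utf8.encode('a'))

// false: the second pass wraps the string in another layer of quotes
console.log(json.encode(json.encode('a')) === json.encode('a'))
```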
148158

149-
An encoding is an object of the form:
159+
## Encoding Interface
150160

151-
```js
152-
{
153-
encode: function (data) {
154-
return data
155-
},
156-
decode: function (data) {
157-
return data
158-
},
159-
buffer: Boolean,
160-
type: 'example'
161+
Custom encodings must implement the following interface:
162+
163+
```ts
164+
interface EncodingOptions<TIn, TFormat, TOut> {
165+
name: string
166+
format: 'buffer' | 'view' | 'utf8'
167+
encode: (data: TIn) => TFormat
168+
decode: (data: TFormat) => TOut
161169
}
162170
```
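As an illustration, a custom encoding that follows this interface might look like the sketch below. The `uint32` name and its big-endian layout are choices made for this example and are not part of `level-transcoder`:

```js
// Hypothetical custom encoding: unsigned 32-bit integers stored as 4 bytes
const uint32 = {
  name: 'uint32',
  format: 'buffer',
  encode (data) {
    const buffer = Buffer.alloc(4)
    buffer.writeUInt32BE(data, 0)
    return buffer
  },
  decode (data) {
    return data.readUInt32BE(0)
  }
}

// <Buffer 00 00 01 02>
console.log(uint32.encode(258))

// 258
console.log(uint32.decode(uint32.encode(258)))
```

Because `format` is `'buffer'`, `encode()` returns a Buffer and `decode()` expects one.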
163171

164-
All of these properties are required.
172+
## Install
165173

166-
The `buffer` boolean tells consumers whether to fetch data as a Buffer, before calling your `decode()` function on that data. If `buffer` is true, it is assumed that `decode()` takes a Buffer. If false, it is assumed that `decode` takes any other type (usually a string).
174+
With [npm](https://npmjs.org) do:
167175

168-
To explain this in the grand scheme of things, consider a store like [`leveldown`][leveldown] which has the ability to return either a Buffer or string, both sourced from the same byte array. Wrap this store with [`encoding-down`][encoding-down] and it'll select the most optimal data type based on the `buffer` property of the active encoding. If your `decode()` function needs a string (and the data can legitimately become a UTF8 string), you should set `buffer` to `false`. This avoids the cost of having to convert a Buffer to a string.
169-
170-
The `type` string should be a unique name.
176+
```
177+
npm install level-transcoder
178+
```
171179

172180
## Contributing
173181

@@ -189,10 +197,6 @@ Support us with a monthly donation on [Open Collective](https://opencollective.c
189197

190198
[level-codec]: https://github.com/Level/codec
191199

192-
[encoding-down]: https://github.com/Level/encoding-down
193-
194-
[abstract-leveldown]: https://github.com/Level/abstract-leveldown
195-
196200
[leveldown]: https://github.com/Level/leveldown
197201

198202
[protocol-buffers]: https://github.com/mafintosh/protocol-buffers

index.d.ts

Lines changed: 2 additions & 2 deletions
@@ -8,12 +8,12 @@ declare class Transcoder<T = any> {
88
constructor (formats: Array<'buffer'|'view'|'utf8'>)
99

1010
/**
11-
* Get supported encoding objects.
11+
* Get an array of supported encoding objects.
1212
*/
1313
encodings (): Array<Encoding<any, T, any>>
1414

1515
/**
16-
* Get the given encoding, creating a transcoder if necessary.
16+
* Get the given encoding, creating a transcoder encoding if necessary.
1717
* @param encoding Named encoding or encoding object.
1818
*/
1919
encoding<TIn, TFormat, TOut> (

lib/encoding.d.ts

Lines changed: 9 additions & 6 deletions
@@ -13,18 +13,19 @@ export abstract class Encoding<TIn, TFormat, TOut> {
1313
/** Decode data. */
1414
decode: (data: TFormat) => TOut
1515

16-
/** Unique name for this encoding. */
16+
/** Unique name. */
1717
name: string
1818

1919
/**
20-
* Common name. If this encoding is a transcoder, {@link name} will be for
21-
* example 'json+view' and {@link commonName} will be just 'json'. Else
22-
* {@link name} will equal {@link commonName}.
20+
* Common name, computed from {@link name}. If this encoding is a
21+
* transcoder encoding, {@link name} will be for example 'json+view'
22+
* and {@link commonName} will be just 'json'. Else {@link name}
23+
* will equal {@link commonName}.
2324
*/
2425
get commonName (): string
2526

2627
/**
27-
* The name of the (lower-level) encoding used by the return value of
28+
* Name of the (lower-level) encoding used by the return value of
2829
* {@link encode}. One of 'buffer', 'view', 'utf8'.
2930
*/
3031
format: 'buffer' | 'view' | 'utf8'
@@ -40,6 +41,8 @@ export abstract class Encoding<TIn, TFormat, TOut> {
4041
createBufferTranscoder (): BufferFormat<TIn, TOut>
4142
}
4243

44+
// TODO: split into multiple interfaces (and then union those) in order to
45+
// separate the main level-transcoder interface from compatibility options.
4346
export interface EncodingOptions<TIn, TFormat, TOut> {
4447
/**
4548
* Encode data.
@@ -52,7 +55,7 @@ export interface EncodingOptions<TIn, TFormat, TOut> {
5255
decode?: ((data: TFormat) => TOut) | undefined
5356

5457
/**
55-
* Unique name for this encoding.
58+
* Unique name.
5659
*/
5760
name?: string | undefined
5861
