README.md
+95 -91 (95 additions & 91 deletions)
@@ -1,6 +1,6 @@
1
1
# level-transcoder
2
2
3
-
**Encode data with built-in or custom encodings.** The (not yet official) successor to [`level-codec`][level-codec] that introduces "transcoders" to translate between encodings and internal data formats supported by a db. This allows a db to store keys and values in a format of its choice (Buffer, Uint8Array or String) with zero-effort support of all known encodings.
3
+
**Encode data with built-in or custom encodings.** The (not yet official) successor to [`level-codec`][level-codec] that transcodes encodings to and from the internal data formats supported by a database. This allows a database to store keys and values in a format of its choice (Buffer, Uint8Array or String) with zero-effort support for all known encodings.
If given multiple formats (like how [`leveldown`][leveldown] can work with both Buffer and strings), the best fitting format is chosen. Not by magic, just hardcoded logic because we don't have that many formats to deal with.
33
-
34
-
For example, knowing that JSON is a UTF-8 string which matches the desired `utf8` format, the `json` encoding will return a string here:
37
+
If the `Transcoder` constructor is given multiple formats then `Transcoder#encoding()` selects an encoding with the best fitting format. Consider a database like [`leveldown`][leveldown] which has the ability to return data as a Buffer or string. If an `encoding.decode(data)` function needs a string, we'll want to fetch that `data` from the database as a string. This avoids the cost of having to convert a Buffer to a string. So we'd use the following transcoder:
In contrast, the `view` encoding doesn't match either `buffer` or `utf8` so data encoded by the `view` encoding gets transcoded into Buffers:
43
+
Then, knowing for example that the return value of `JSON.stringify(data)` is a UTF-8 string which matches one of the given formats, the `json` encoding will return a string here:
Copying of data is avoided where possible. That's true in the last example, because the underlying ArrayBuffer of the view can be passed to a Buffer constructor without a copy.
50
+
In contrast, data encoded as a `view` (for now that just means Uint8Array) would get transcoded into `buffer`. Copying of data is avoided where possible, like how the underlying ArrayBuffer of a view can be passed to `Buffer.from(..)` without a copy.
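
The zero-copy behavior can be checked in plain Node.js, independently of `level-transcoder`. The following is a minimal sketch, assuming a Node.js runtime; the variable names are illustrative and not part of the library:

```js
// A Uint8Array "view" over three bytes
const view = new Uint8Array([1, 2, 3])

// Buffer.from(arrayBuffer, byteOffset, length) shares memory with the
// given ArrayBuffer instead of copying it
const buffer = Buffer.from(view.buffer, view.byteOffset, view.byteLength)

// A write through the Buffer is visible through the original view,
// which shows that both refer to the same memory
buffer[0] = 9
const sharesMemory = view[0] === 9
```

Passing `byteOffset` and `byteLength` matters when the view covers only part of a larger ArrayBuffer; without them the Buffer would span the whole ArrayBuffer.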
51
51
52
-
Lastly, the encoding returned by `Transcoder#encoding()` has a `format` property to be used to forward key- and valueEncoding options to an underlying store. This way, both the public and private API's of a db will be encoding-aware (somewhere in the future).
52
+
Lastly, encodings returned by `Transcoder#encoding()` have a `format` property to be used to forward information to an underlying store. For example: an input value of `{ x: 3 }` using the `json` encoding, which has a `format` of `utf8`, can be forwarded as value `'{"x":3}'` with encoding `utf8`. Vice versa for output.
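
As a rough illustration of that forwarding step, here is a self-contained sketch. It does not use `level-transcoder`; the `json` object and the `forward()` helper are hypothetical stand-ins that only mimic the shape of an encoding with a `format` property:

```js
// Hypothetical stand-in for a normalized `json` encoding
const json = {
  name: 'json',
  format: 'utf8',
  encode: (data) => JSON.stringify(data),
  decode: (data) => JSON.parse(data)
}

// Hypothetical helper: encode a value and report which lower-level
// encoding the underlying store should use for it
function forward (encoding, value) {
  return { value: encoding.encode(value), valueEncoding: encoding.format }
}

const forwarded = forward(json, { x: 3 })

// Vice versa for output: the store returns a utf8 string to decode
const output = json.decode(forwarded.value)
```

Here `forwarded.value` is the string `'{"x":3}'` and `forwarded.valueEncoding` is `'utf8'`, so the store never needs to know about JSON.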
53
53
54
-
For example, on `leveldown` a call like `db.put(key, { x: 3 }, { valueEncoding: 'json' })` will pass that value `{ x: 3 }` through a `json` encoding that has a `format` of `utf8`, which is then forwarded as `db._put(key, '{"x":3}', { valueEncoding: 'utf8' })`.
54
+
## Encodings
55
55
56
-
## Compatible with
56
+
### Built-in Encodings
57
57
58
-
Various modules in the ecosystem, in and outside of level, can be used with `level-transcoder`.
58
+
These encodings can be used out of the box and are to be selected by name.
In this table, the _input_ is what `encode()` accepts. The _format_ is what `encode()` returns as well as what `decode()` accepts. The _output_ is what `decode()` returns. The TypeScript typings of `level-transcoder` have generic type parameters with matching names: `TIn`, `TFormat` and `TOut`.
70
61
71
-
Common between the interfaces is that they have `encode()` and `decode()` methods. The terms "codec" and "encoding" are used interchangeably. Passing these encodings through `Transcoder#encoding()` (which is done implicitly when used in an `abstract-level` database) results in normalized encoding objects that follow [the interface](./lib/encoding.d.ts) of `level-transcoder`.
If the format in the table above is buffer, then `encode()` is expected to return a Buffer. If utf8, then a string. If view, then a Uint8Array.
71
+
<sup>1</sup> Aliased as `binary`. Use of this alias does not affect the ability to transcode.
74
72
75
-
Those marked as not named are modules that export or generate anonymous encodings that don't have a `name` property (or `type` as an alias) which means they can only be used as objects and not by name. Passing an anonymous encoding through `Transcoder#encoding()` does give it a `name` property for compatibility, but the value of `name` is not deterministic.
73
+
### Transcoder Encodings
76
74
77
-
---
75
+
It's not necessary to use or reference the encodings below directly. They're listed here for implementation notes and to show how input and output are the same; it's the format that differs.
78
76
79
-
**_Rest of README is not yet updated._**
77
+
Custom encodings are transcoded in the same way and require no additional setup. For example: if a custom encoding has `{ name: 'example', format: 'utf8' }` then `level-transcoder` will create transcoder encodings on demand with names `example+buffer` and `example+view`.
<sup>1</sup> Unlike other encodings that transcode to `view`, these depend on Buffer at the moment and thus don't work in browsers if a [shim](https://github.com/feross/buffer) is not included by JavaScript bundlers like Webpack and Browserify.
84
91
85
-
Create a new codec, with a global options object.
92
+
### Ecosystem Encodings
86
93
87
-
### `codec.encodeKey(key[, opts])`
94
+
Various modules in the ecosystem, in and outside of Level, can be used with `level-transcoder` although they follow different interfaces. Common between the interfaces is that they have `encode()` and `decode()` methods. The terms "codec" and "encoding" are used interchangeably in the ecosystem. Passing these encodings through `Transcoder#encoding()` (which is done implicitly when used in an `abstract-level` database) results in normalized encoding objects as described further below.
Those marked as not named are modules that export or generate encodings that don't have a `name` property (or `type` as an alias). We call these _anonymous encodings_. They can only be used as objects and not by name. Passing an anonymous encoding through `Transcoder#encoding()` does give it a `name` property for compatibility, but the value of `name` is not deterministic.
92
108
93
-
Encode `value` with given `opts`.
109
+
## API
94
110
95
-
### `codec.encodeBatch(batch[, opts])`
111
+
### `Transcoder`
96
112
97
-
Encode `batch` ops with given `opts`.
113
+
#### `transcoder = new Transcoder(formats)`
98
114
99
-
### `codec.encodeLtgt(ltgt)`
115
+
Create a new transcoder, providing the formats that are supported by a database (or other). The `formats` argument must be an array containing one or more of `'buffer'`, `'view'`, `'utf8'`. The returned `transcoder` instance is stateful, in that it contains a set of cached encoding objects.
100
116
101
-
Encode the ltgt values of option object `ltgt`.
117
+
#### `encoding = transcoder.encoding(encoding)`
102
118
103
-
### `codec.decodeKey(key[, opts])`
119
+
Returns the given `encoding` argument as a normalized encoding object that follows the `level-transcoder` encoding interface. The `encoding` argument may be:
104
120
105
-
Decode `key` with given `opts`.
121
+
- A string to select a known encoding by its name
122
+
- An object that follows one of the following interfaces: [`level-transcoder`](#encoding-interface), [`level-codec`](https://github.com/Level/codec#encoding-format), [`codecs`][mafintosh-codecs], [`abstract-encoding`][abstract-enc], [`multiformats`][blockcodec]
123
+
- A previously normalized encoding, such that `encoding(x)` equals `encoding(encoding(x))`.
106
124
107
-
### `codec.decodeValue(value[, opts])`
125
+
Results are cached. If the `encoding` argument is an object and it has a name then subsequent calls can refer to that encoding by name.
108
126
109
-
Decode `value` with given `opts`.
127
+
Depending on the `formats` provided to the `Transcoder` constructor, this method may return a _transcoder encoding_ that translates the desired encoding from / to a supported format. Its `encode()` and `decode()` methods will have respectively the same input and output types as a non-transcoded encoding, but its `name` property will differ.
110
128
111
-
### `codec.createStreamDecoder([opts])`
129
+
#### `encodings = transcoder.encodings()`
112
130
113
-
Create a function with signature `(key, value)`, that for each key-value pair returned from a levelup read stream returns the decoded value to be emitted.
131
+
Get an array of encoding objects. This includes:
114
132
115
-
### `codec.keyAsBuffer([opts])`
133
+
- Encodings for the `formats` that were passed to the `Transcoder` constructor
134
+
- Custom encodings that were passed to `transcoder.encoding()`
135
+
- Transcoder encodings for either.
116
136
117
-
Check whether `opts` and the global `opts` call for a binary key encoding.
137
+
### `Encoding`
118
138
119
-
### `codec.valueAsBuffer([opts])`
139
+
#### `data = encoding.encode(data)`
120
140
121
-
Check whether `opts` and the global `opts` call for a binary value encoding.
141
+
Encode data.
122
142
123
-
### `codec.encodings`
143
+
#### `data = encoding.decode(data)`
124
144
125
-
The builtin encodings as object of form
145
+
Decode data.
126
146
127
-
```js
128
-
{
129
-
[type]: encoding
130
-
}
131
-
```
147
+
#### `encoding.name`
132
148
133
-
See below for a list and the format of `encoding`.
|`utf8`| String or Buffer | String or Buffer | String |
140
-
|`json`| Any JSON type | JSON string | Input |
141
-
|`binary`| Buffer, string or byte array | Buffer | As stored |
142
-
|`hex`<br>`ascii`<br>`base64`<br>`ucs2`<br>`utf16le`<br>`utf-16le`| String or Buffer | Buffer | String |
143
-
|`none` a.k.a. `id`| Any type (bypass encoding) | Input\*| As stored |
153
+
Common name, computed from `name`. If this encoding is a transcoder encoding, `name` will be, for example, `'json+view'` and `commonName` will be just `'json'`. Otherwise `name` will equal `commonName`.
144
154
145
-
<sup>\*</sup> Stores may have their own type coercion. Whether type information is preserved depends on the [`abstract-leveldown`][abstract-leveldown] implementation as well as the underlying storage (`LevelDB`, `IndexedDB`, etc).
155
+
#### `encoding.format`
146
156
147
-
## Encoding Format
157
+
Name of the (lower-level) encoding used by the return value of `encode()`. One of `'buffer'`, `'view'`, `'utf8'`. If `name` equals `format` then the encoding can be assumed to be idempotent, such that `encode(x)` equals `encode(encode(x))`.
148
158
149
-
An encoding is an object of the form:
159
+
## Encoding Interface
150
160
151
-
```js
152
-
{
153
-
encode: function (data) {
154
-
return data
155
-
},
156
-
decode: function (data) {
157
-
return data
158
-
},
159
-
buffer: Boolean,
160
-
type: 'example'
161
+
Custom encodings must follow the following interface:
162
+
163
+
```ts
164
+
interface EncodingOptions<TIn, TFormat, TOut> {
165
+
  name: string
166
+
  format: 'buffer' | 'view' | 'utf8'
167
+
  encode: (data: TIn) => TFormat
168
+
  decode: (data: TFormat) => TOut
161
169
}
162
170
```
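
To make the interface more concrete, here is a minimal sketch of an object that would satisfy it. The `uint32-be` encoding is hypothetical and invented for illustration; it stores unsigned 32-bit integers as 4 big-endian bytes, so `TIn` and `TOut` are numbers and `TFormat` is a Uint8Array:

```js
// Hypothetical custom encoding with a `view` format
const uint32 = {
  name: 'uint32-be',
  format: 'view',
  encode (data) {
    const view = new Uint8Array(4)
    // false = big-endian byte order
    new DataView(view.buffer).setUint32(0, data, false)
    return view
  },
  decode (data) {
    // Respect byteOffset in case `data` is a slice of a larger ArrayBuffer
    const dv = new DataView(data.buffer, data.byteOffset, data.byteLength)
    return dv.getUint32(0, false)
  }
}

const encoded = uint32.encode(258) // bytes 0, 0, 1, 2
const decoded = uint32.decode(encoded) // 258
```

An object like this would presumably be passed to `transcoder.encoding()` rather than called directly; it is used standalone here only to show the `encode()` and `decode()` contract.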
163
171
164
-
All of these properties are required.
172
+
## Install
165
173
166
-
The `buffer` boolean tells consumers whether to fetch data as a Buffer, before calling your `decode()` function on that data. If `buffer` is true, it is assumed that `decode()` takes a Buffer. If false, it is assumed that `decode` takes any other type (usually a string).
174
+
With [npm](https://npmjs.org) do:
167
175
168
-
To explain this in the grand scheme of things, consider a store like [`leveldown`][leveldown] which has the ability to return either a Buffer or string, both sourced from the same byte array. Wrap this store with [`encoding-down`][encoding-down] and it'll select the most optimal data type based on the `buffer` property of the active encoding. If your `decode()` function needs a string (and the data can legitimately become a UTF8 string), you should set `buffer` to `false`. This avoids the cost of having to convert a Buffer to a string.
169
-
170
-
The `type` string should be a unique name.
176
+
```
177
+
npm install level-transcoder
178
+
```
171
179
172
180
## Contributing
173
181
@@ -189,10 +197,6 @@ Support us with a monthly donation on [Open Collective](https://opencollective.c