Corrupted file when a field has non-ASCII characters #6

When I create an mmdb containing non-ASCII characters, the file produced cannot be read. It's as if the offsets are wrong.

I think it's because the offset written to the file assumes that the Python string length is the same as the number of bytes produced when the string is encoded to UTF-8. That only holds for pure-ASCII strings; any other character encodes to two or more bytes.

Setting the length from the encoded string seems to produce the correct result at https://github.com/cloudflare/py-mmdb-encoder/blob/master/mmdbencoder/__init__.py#L346:

length = len(value.encode('utf-8'))
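
To illustrate why the mismatch corrupts everything that follows, here is a minimal sketch. It does not use the real MMDB layout or the mmdbencoder API: the one-byte length prefix and the helper names `encode_field`, `encode_field_buggy` and `decode_all` are assumptions made up for this example.

```python
import struct


def encode_field(value: str) -> bytes:
    """Hypothetical helper: 1-byte length prefix, then the UTF-8 payload."""
    payload = value.encode("utf-8")
    # Length is taken from the encoded bytes, as in the proposed fix.
    return struct.pack("B", len(payload)) + payload


def encode_field_buggy(value: str) -> bytes:
    """Same layout, but the prefix counts characters instead of bytes."""
    payload = value.encode("utf-8")
    return struct.pack("B", len(value)) + payload


def decode_all(buf: bytes) -> list:
    """Read consecutive length-prefixed fields until the buffer is exhausted."""
    fields = []
    pos = 0
    while pos < len(buf):
        size = buf[pos]
        pos += 1
        fields.append(buf[pos:pos + size].decode("utf-8", errors="replace"))
        pos += size
    return fields


name = "Zürich"
print(len(name), len(name.encode("utf-8")))  # character count vs. byte count

good = encode_field(name) + encode_field("CH")
bad = encode_field_buggy(name) + encode_field_buggy("CH")

print(decode_all(good))  # both fields come back intact
print(decode_all(bad))   # first field is truncated, the next one is misread
```

With the buggy prefix, the reader stops one byte short of the end of the first field and then treats the leftover byte as the next field's length, which matches the "offsets are wrong" symptom described above.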

