Specific prefixes can cause invalid codes to return a valid result #152

@Snowysauce

This issue is a bit esoteric, so it might take a bit of explaining.

I first noticed something was amiss when location sets in the NSI with codes like "de-by" (without the .geojson suffix) were being accepted as valid. However, country-coder interprets "de-by" as Belarus ("by"), not Bavaria; "de-bw" as Botswana ("bw"), not Baden-Württemberg; and so on.

Further testing and research showed the same behavior for each of the words (and, the, of, el, la, de) listed in this regex:

/(?=(?!^(and|the|of|el|la|de)$))(\b(and|the|of|el|la|de)\b)|[-_ .,'()&[\]/]/gi;

In other words, "el-by", "la-by", "of-by", etc., are all recognized as valid codes for Belarus when they should be considered invalid.

This behavior is present for all codes, regardless of their source index (e.g., "el-001", "el-us-ak", etc. all map to the code that follows "el-").
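
The behavior can be reproduced with the regex alone. The following is a minimal sketch, assuming that lookups normalize an identifier by deleting every match of the regex and upper-casing what remains. The `canonicalize` helper is a hypothetical name used for illustration and is not necessarily how country-coder implements it; the regex is copied verbatim from above.

```javascript
// Regex quoted in this issue, copied verbatim.
const idFilterRegex = /(?=(?!^(and|the|of|el|la|de)$))(\b(and|the|of|el|la|de)\b)|[-_ .,'()&[\]/]/gi;

// Hypothetical helper (an assumption for illustration): strip every regex
// match from the identifier, then upper-case the remainder.
function canonicalize(id) {
  return id.replace(idFilterRegex, '').toUpperCase();
}

// A filtered word is removed whenever it is not the entire input,
// so only the trailing code survives normalization.
for (const id of ['de-by', 'el-by', 'of-by', 'el-001', 'el-us-ak', 'de']) {
  console.log(`${id} -> ${canonicalize(id)}`);
}
// de-by -> BY
// el-by -> BY
// of-by -> BY
// el-001 -> 001
// el-us-ak -> USAK
// de -> DE
```

Under this assumption, the leading lookahead only protects a filtered word when it is the whole string (so "de" alone still normalizes to "DE"). In any longer input the word is deleted along with the separator, which would explain why "de-by" and "el-by" end up indistinguishable from "by".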

Labels: bug
