
Customise cell parsing for declarative decoder #55

Open
@HowlingEverett

Is your feature request related to a problem?

I am a fan of the concept of Codable declarative CSV parsing, but I'm running into its limits with my current use case. I'm parsing a nutrient database (a UK public health source), and each cell in their dataset holds either a floating-point value for the quantity of a nutrient or a special code in its place: "N" represents "significant but unmeasured quantity" and "Tr" represents "trace amounts".

Here's an example subset of an input:

Water (g),Protein (g),Fat (g),Carbohydrate (g),Energy (kJ) (kJ),Starch (g),Total sugars (g),Glucose (g)
76.7,2.9,15.2,0.8,625,Tr,0.8,0.1
9.7,1.3,1.2,Tr,67,0.0,Tr,0.0
84.2,0.2,0.1,Tr,7,0.0,Tr,0.0
93.4,4.0,0.7,0.4,100,Tr,0.3,0.1
8.5,6.1,8.7,N,N,N,N,N

In my use case, I'd basically like to ignore N or Tr values (defaulting them to 0 in the parsed type, maybe), but the parser throws an error and stops decoding when it encounters a value it cannot parse as a Double.

Describe the solution you'd like

Similar to the customisation point for the Decimal parser, it'd be great if we could customise the parsing for types such as Double to handle edge cases in our input data. In my case I'd convert values that aren't "N" or "Tr" to Double as usual, and return 0.0 for those two codes.
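For illustration, the conversion I have in mind is roughly the following. This is only a sketch: `parseNutrientValue` and its `placeholder` parameter are hypothetical names, not part of any existing API, and how such a function would be plugged into the decoder is exactly the open question of this request.

```swift
import Foundation

/// Hypothetical helper (not an existing library API): converts one CSV cell
/// into a Double, mapping the dataset's special codes to a default value.
/// Returns nil when the cell is neither a known code nor a parseable number.
func parseNutrientValue(_ cell: String, placeholder: Double = 0.0) -> Double? {
    let trimmed = cell.trimmingCharacters(in: .whitespaces)
    // "N" = significant but unmeasured quantity, "Tr" = trace amounts.
    if trimmed == "N" || trimmed == "Tr" {
        return placeholder
    }
    // Everything else goes through the standard Double initialiser.
    return Double(trimmed)
}
```

Returning nil rather than 0.0 for unknown text keeps genuinely malformed cells distinguishable from the two documented codes.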

Describe alternatives you've considered

I've been able to resolve my issues using the imperative parser, or by pre-processing the CSV whenever I parse it, but it ceases to be a nice declarative interface at that point (and requires loading the whole thing into memory, as my old SwiftCSV implementation did).

The Decimal parser option works, but results in Decimal values - in my case I want simple Doubles.
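Another possible alternative, sketched here with plain Codable only, is a wrapper type that does the lenient parsing in its own `init(from:)`. `LenientDouble` is a hypothetical name, and the sketch assumes the decoder hands the raw cell over as a String through a single-value container, which I have not verified against this library. It also leaks into the model (fields become `LenientDouble` rather than `Double`), which is why a decoder-level customisation point would still be nicer.

```swift
import Foundation

/// Hypothetical wrapper (not part of any library): a Double that tolerates
/// the dataset's "N" and "Tr" codes by decoding them as 0.0.
struct LenientDouble: Decodable, Equatable {
    let value: Double

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        // Assumption: the raw cell is available as a String.
        let raw = try container.decode(String.self)
        let trimmed = raw.trimmingCharacters(in: .whitespaces)

        if trimmed == "N" || trimmed == "Tr" {
            value = 0.0
        } else if let parsed = Double(trimmed) {
            value = parsed
        } else {
            // Anything else is still reported as a decoding error.
            throw DecodingError.dataCorruptedError(
                in: container,
                debugDescription: "Not a number or a known code: \(raw)"
            )
        }
    }
}
```

A row type would then declare, for example, `let starch: LenientDouble` and read `starch.value` after decoding.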

Labels

enhancement (New feature or request)
