evoludolab

GWT is a great tool to bring interactivity in ebooks to a whole new level. Unfortunately, the EPUB 3 specification is still based on XHTML, even though the newer bells and whistles of HTML5 are acceptable too.

With respect to GWT, issues arise solely due to the &nbsp; character entity, which is undefined in XHTML. Once it is replaced by a numeric character reference such as &#160;, GWT works like a charm inside any EPUB (and JavaScript-capable reader).

This PR does just that by replacing those characters but only where it matters, e.g. no replacements under the test or samples directories.
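To illustrate why the named entity is a problem for XML-based formats, here is a minimal sketch using the JDK's built-in XML parser. The class name `NbspEntityDemo` and its methods are hypothetical and not part of GWT or this PR, and `&#160;` is assumed as the replacement form. It only shows that a plain XML parser rejects `&nbsp;` when no DTD declares it, while the numeric reference parses fine.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of GWT.
final class NbspEntityDemo {

    // Swap the named entity for its numeric equivalent (assumed form: &#160;).
    static String useNumericNbsp(String markup) {
        return markup.replace("&nbsp;", "&#160;");
    }

    // Returns the text content of the root element, or null if the
    // input is not well-formed XML.
    static String parseText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException e) {
            return null; // e.g. reference to an undeclared entity
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String named = "<p>a&nbsp;b</p>";
        String numeric = useNumericNbsp(named);
        System.out.println(parseText(named) == null);   // named entity is rejected
        System.out.println(parseText(numeric) != null); // numeric reference parses
    }
}
```

Browsers parsing `text/html` know `&nbsp;` natively, which is why the issue only shows up in XML contexts such as EPUB content documents.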

Member

niloc132 left a comment

It doesn't increase compiled size at all, and improves compatibility... I think I'm in favor of it?

If we do identify a downside, we could extract this into a constant and let projects override it through a configuration property, rather than hard-coding a single value.

But as far as I can tell, using the # syntax should only make it more compatible, not less: every browser should support the non-special-cased XML escapes?
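The "extract a constant and make it configurable" idea could look roughly like the sketch below. Everything here is hypothetical: the class `NbspConfig`, the property name `example.nbspEntity`, and the use of a JVM system property as a stand-in for GWT's compile-time configuration properties. It is only meant to show the shape of the fallback, not how GWT would actually wire it up.

```java
// Hypothetical sketch; not GWT code.
final class NbspConfig {

    // Default proposed by the PR discussion: the numeric reference.
    static final String DEFAULT_NBSP = "&#160;";

    // Assumed property name, used here only for illustration.
    static final String PROPERTY = "example.nbspEntity";

    // Projects that prefer the named entity could override the default.
    static String nbsp() {
        return System.getProperty(PROPERTY, DEFAULT_NBSP);
    }

    // Example call site: pad a label with non-breaking spaces.
    static String pad(String label) {
        return nbsp() + label + nbsp();
    }

    public static void main(String[] args) {
        System.out.println(pad("OK"));
    }
}
```

Running with `-Dexample.nbspEntity="&nbsp;"` would switch the call sites back to the named entity without touching them.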

@jnehlmeier
Member

If you have control over the doctype you can also import the missing named entities. XHTML is modular and thus nearly all named entities have been moved to modules and you have to import them if required.

See: https://www.w3.org/TR/xhtml-modularization/dtd_module_defs.html#a_module_XHTML_Latin_1_Character_Entities

Every browser supports the numeric notation, so it should not be an issue for GWT. It is less readable, though, if you don't know what &#160; or &#xA0; represent.

I would favor &nbsp; and importing the XHTML module, but I assume EPUB readers don't like imports that much.
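As a rough illustration of the doctype route, the sketch below declares `nbsp` in an internal DTD subset and parses it with the JDK's XML parser. This is a simplification: it declares a single entity inline rather than importing the XHTML Latin-1 entity module, and the class name `DeclaredEntityDemo` is hypothetical. Whether a given EPUB reader honors such declarations is not something this sketch can show.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of GWT.
final class DeclaredEntityDemo {

    // Internal DTD subset that declares nbsp as U+00A0.
    static final String DOCTYPE = "<!DOCTYPE p [<!ENTITY nbsp \"&#160;\">]>";

    // Returns the root element's text content, or null if the
    // document is not well-formed.
    static String parseText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException e) {
            return null;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String body = "<p>a&nbsp;b</p>";
        System.out.println(parseText(body) == null);           // undeclared entity
        System.out.println(parseText(DOCTYPE + body) != null); // declared entity
    }
}
```

With the declaration in place the named entity stays readable in the source, at the cost of every document needing control over its doctype.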
