
What should be the default behaviour for decoding invalid entities?

Created by: jakubpawlowicz

We recently ran into an issue when parsing the following invalid entity:

Oga.parse_xml("<a>&#TAB;</a>")

which results in

ArgumentError: invalid value for Integer(): "TAB"
    from /app/lib/oga/xml/entities.rb:93:in `Integer'
    from /app/lib/oga/xml/entities.rb:93:in `block in decode'
    from /app/lib/oga/xml/entities.rb:92:in `gsub'
    from /app/lib/oga/xml/entities.rb:92:in `decode'

Oga.parse_xml("<a>&#x;</a>") raises the same exception.

Could Oga leave entities that cannot be decoded as they are, instead of raising an unrecoverable error?
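
To make the proposal concrete, here is a minimal sketch of the lenient behaviour I have in mind. It is not Oga's actual code: the `LenientEntities` module and its `decode` method are hypothetical names, and the sketch only handles numeric character references (`&#NN;` and `&#xHH;`). The assumption is that anything which does not parse to a valid Unicode code point is returned unchanged.

```ruby
# Hypothetical helper, not part of Oga: decodes numeric character
# references and leaves anything it cannot decode untouched.
module LenientEntities
  # Matches "&#...;" and "&#x...;" with an arbitrary (possibly empty) body,
  # so that malformed references reach the block instead of being skipped.
  NUMERIC_REFERENCE = /&#(x?)([^;&\s]*);/

  def self.decode(input)
    input.gsub(NUMERIC_REFERENCE) do |raw|
      hex    = !$1.empty?
      digits = $2

      # Validate the digits ourselves instead of calling Integer(),
      # which raises ArgumentError on input such as "TAB" or "".
      valid = hex ? digits.match?(/\A\h+\z/) : digits.match?(/\A\d+\z/)
      next raw unless valid

      codepoint = digits.to_i(hex ? 16 : 10)

      # Reject values outside Unicode as well as surrogate code points.
      next raw if codepoint > 0x10FFFF || (0xD800..0xDFFF).cover?(codepoint)

      [codepoint].pack('U')
    end
  end
end

puts LenientEntities.decode("<a>&#TAB;</a>") # invalid, left as-is
puts LenientEntities.decode("<a>&#x;</a>")   # invalid, left as-is
puts LenientEntities.decode("<a>&#x41;</a>") # valid, decoded
```

With this approach both inputs from this report would pass through unchanged, while valid references are still decoded. Whether silently keeping the raw text is the right default, as opposed to an opt-in strict mode, is the open question.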
