I should have added that quite a few of the options below were
adapted from http://perldoc.perl.org/Encode.html#Handling-Malformed-Data.
If you know of other transcoding implementations I should look at,
please tell me.

Regards,   Martin.

At 18:16 08/02/21, Martin Duerst wrote:
>I just committed a very first implementation of using a hash for
>additional options to String#encode (r15565). Matz told me where
>to copy code from, so the implementation was pretty straightforward.
>
>The functionality is currently extremely limited: It is possible to
>indicate that instead of producing an error, invalid input bytes
>should just be ignored (i.e. dropped). This is done as follows:
>String#encode(to_encoding, invalid: :ignore)
>
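A minimal sketch of the "drop invalid bytes" behavior described above. It assumes the option names found in released versions of Ruby (invalid: :replace combined with replace: ""), rather than relying on the invalid: :ignore spelling from r15565, so treat it as an illustration of the effect and not as the committed interface. The sample string and variable names are made up for the example.

```ruby
# Assumption: option names as in released Ruby (invalid:, replace:),
# not necessarily the names discussed in this thread.
# "\xFF" can never occur in well-formed UTF-8, so it serves as the invalid byte.
input = "abc\xFFdef".b.force_encoding(Encoding::UTF_8)

# Drop the invalid byte by substituting the empty string for it.
# The target differs from the source encoding so that transcoding really runs.
dropped = input.encode(Encoding::UTF_16BE, invalid: :replace, replace: "")
               .encode(Encoding::UTF_8)

# Without any option, the same conversion raises an exception.
begin
  input.encode(Encoding::UTF_16BE)
  raised = false
rescue Encoding::InvalidByteSequenceError
  raised = true
end

puts dropped   # prints "abcdef"
puts raised    # prints "true"
```

Going through UTF-16BE and back is only there to force a real conversion; a UTF-8 to UTF-8 encode may be treated as a no-op and leave the bad byte in place.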
>
>I'm now looking for comments on how to name these and further options.
>
>invalid: What to do for an invalid byte (sequence) in the input
>
>unknown: What to do if the target encoding doesn't include the character
>
>???: We may need a third option, to indicate a combination of invalid
>     and unknown.
>
>
>Values for each of the above options could include:
>
>:ignore - Ignore/drop the problem data.
>
>:substitute (or :subst or so to be shorter) - Use an
>          (encoding-dependent) substitution character.
>
>:warn   - Produce a warning, helpful for debugging.
>
>:error  - The current behavior, available just for completeness.
>
>:stop   - Stop transcoding; for encode! this will mean
>          losing the rest of the string.
>
>:x_escape - add problem data to the output using \x escapes
>
>:u_escape - add problem characters to the output using \u escapes
>            (unknown: only)
>
>:hex_ncr - add problem characters to the output using XML/HTML
>           hex escapes (&#xhhhh;, unknown: only)
>
>:dec_ncr - add problem characters to the output using XML/HTML
>           dec escapes (&#ddddd;, unknown: only)
>
>:uri_escape - add problem characters to the output using
>           UTF-8->URI %-encoding conversion (for IRI->URI
>           conversion and similar things, unknown: only)
>
>:block - Use result of block, with interface to be worked out
>         (only needed to indicate that a block is used for
>          one case but not for the other)
>
>'string' - Replace by string (have to work out details about
>           encoding,...)
>
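To make a few of the values listed above concrete for the unknown: case, here is a rough sketch using options that exist in released versions of Ruby. The mapping is my assumption, not something settled in this thread: undef: :replace stands in for :substitute, xml: :text for :hex_ncr, and a fallback: lambda for :u_escape (and, loosely, for :block). The sample text and variable names are hypothetical.

```ruby
# Assumption: released Ruby spells the "unknown:" case as undef:, and offers
# xml: and fallback: instead of the symbol values proposed in this thread.
text = "caf\u00E9"   # U+00E9 has no representation in US-ASCII

# Roughly :substitute, using the encoding-dependent substitution character.
substituted = text.encode(Encoding::US_ASCII, undef: :replace)

# Roughly :hex_ncr, emitting an XML/HTML hexadecimal character reference.
hex_ncr = text.encode(Encoding::US_ASCII, xml: :text)

# Roughly :u_escape, computed per character by a callable. The lambda gets
# the unconvertible character and returns its replacement text.
u_escaped = text.encode(Encoding::US_ASCII,
                        fallback: ->(ch) { format('\u%04X', ch.ord) })

puts substituted
puts hex_ncr
puts u_escaped
```

Note that xml: :text also escapes &, < and > in the rest of the string, which may or may not be wanted for a plain :hex_ncr option.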
>
>Regards,    Martin.
>
>
>#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
>#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst / it.aoyama.ac.jp     