[pyicu-dev] Subclassing Transliterators
Andi Vajda
vajda at osafoundation.org
Thu Mar 18 10:16:58 PDT 2010
On Thu, 18 Mar 2010, Christoph Burgmer wrote:
> On Saturday, 13 March 2010, Christoph Burgmer wrote:
> [...]
>> There is an issue though with PyICU.Transliterator.transliterate() for
>> UnicodeString objects:
>>
>> >>> import PyICU
>> >>> t = PyICU.Transliterator.createInstance('NumericPinyin-Latin',
>> ...     PyICU.UTransDirection.UTRANS_FORWARD)
>> >>> print t.transliterate('ni3hao3')
>> nǐhǎo
>> >>> m = PyICU.UnicodeString("ni3hao3")
>> >>> print m
>> ni3hao3
>> >>> t.transliterate(m)
>> u''
>> >>> print m
>> ni3hao3
>
> See the attached patch for a test case.
Indeed, there was a bug. Sorry for forgetting about your earlier report.
The bug was in the parsing of 'S' arguments (which allow a python str, unicode
or UnicodeString): when a UnicodeString was received, the wrong object (of the
*u, _u alternatives) was used.
The intent with supporting both python unicode/str (immutable) and
UnicodeString (mutable) is that the latter may be faster and more efficient
when used with PyICU since it doesn't incur the data conversion or copying
costs.
When an ICU API returns UnicodeString& or void (and takes the corresponding
UnicodeString& parameter for receiving results), the intent of the wrappers
is to allow both variants:
1. passing in no UnicodeString for the results and receiving a newly
allocated python unicode object in return
2. passing in a UnicodeString for the results and receiving the same object
in return, with the results in it
I fixed transliterate() to support both variants as well.
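To make the two variants concrete, here is a minimal pure-Python sketch of the calling convention. It is not PyICU code: MutableText and to_upper are hypothetical stand-ins (for UnicodeString and for a wrapped ICU call, respectively), chosen only so the example runs without ICU installed.

```python
class MutableText:
    """Hypothetical stand-in for a mutable string type such as UnicodeString."""

    def __init__(self, text=""):
        self.text = text

    def __str__(self):
        return self.text


def to_upper(source, result=None):
    """Hypothetical wrapped call that supports both calling variants."""
    converted = str(source).upper()
    if result is None:
        # Variant 1: no result object was passed in, so a newly
        # allocated (immutable) Python string is returned.
        return converted
    # Variant 2: the caller's object is filled in place and that
    # same object is returned.
    result.text = converted
    return result


# Variant 1: a new Python string comes back.
new_string = to_upper("ni3hao3")

# Variant 2: the object passed in comes back, holding the results.
buffer = MutableText()
same_object = to_upper("ni3hao3", buffer)
```

In the second call, same_object and buffer refer to the same object, so no new Python string has to be allocated for the result.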
Thanks for the bug report and test case (which I integrated).
Andi..