SOAPam Server and Client convert IPMs to SOAP messages, and vice versa, through processes called serialization and deserialization. For string fields, these processes also convert character data from one character encoding to another. By default, the SOAP message uses UTF-8 and the IPM uses ISO-8859-1. With the default encodings, these conversions usually go unnoticed. However, if your application uses characters that are not in ISO-8859-1, you must set the correct character encoding on the IPM string fields. If the correct encoding is not set, the conversion may produce the wrong character, or a single '?' character indicating that the conversion failed.
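The two failure modes described above can be reproduced outside SOAPam. The following is a minimal sketch in Python that uses Python's built-in codecs as a stand-in for SOAPam's converter. The sample string and variable names are illustrative assumptions and are not part of SOAPam, and the substitution behavior of Python's "replace" error handler is only an approximation of what SOAPam does.

```python
# Illustrative only: Python's codecs stand in for SOAPam's converter.
text = "Привет"  # Russian sample text; none of these characters exist in ISO-8859-1

# The SOAP side carries the string as UTF-8.
soap_bytes = text.encode("utf-8")

# Failure mode 1: converting to the default IPM encoding (ISO-8859-1).
# Each character that cannot be represented is replaced with '?'.
ipm_default = soap_bytes.decode("utf-8").encode("iso-8859-1", errors="replace")
print(ipm_default)

# Failure mode 2: the IPM bytes are really ISO-8859-5, but the field is
# treated as ISO-8859-1, so every byte maps to the wrong character.
ipm_cyrillic = text.encode("iso-8859-5")
misread = ipm_cyrillic.decode("iso-8859-1")
print(misread)
```

In both cases the original text is lost, which is why the encoding must be declared on the string field rather than left at the default.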
The character encoding for string fields can be set using the "encoding" attribute as in the following example:
<element name="aCyrillicString" type="string" size="80" encoding="ISO-8859-5" />
In this example, the encoding is set to ISO-8859-5, the Latin/Cyrillic character set, which contains the characters used in the Russian language. The "encoding" attribute can also be set on a <type/> element, which causes all child string elements to inherit the encoding value. You can also set the default encoding for a SOAPam Server instance using the -servicedefaultencoding startup option.
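To illustrate why declaring ISO-8859-5 on the field matters, the sketch below models the conversion in both directions in Python. The functions soap_to_ipm and ipm_to_soap are hypothetical helpers written for this example; they are not SOAPam APIs, and Python's codecs are used here in place of ICU.

```python
def soap_to_ipm(soap_bytes: bytes, ipm_encoding: str = "iso-8859-1") -> bytes:
    """Hypothetical helper: convert UTF-8 SOAP text to the IPM field encoding."""
    return soap_bytes.decode("utf-8").encode(ipm_encoding, errors="replace")


def ipm_to_soap(ipm_bytes: bytes, ipm_encoding: str = "iso-8859-1") -> bytes:
    """Hypothetical helper: convert IPM field bytes back to UTF-8 SOAP text."""
    return ipm_bytes.decode(ipm_encoding).encode("utf-8")


original = "Привет".encode("utf-8")

# With the field declared as ISO-8859-5, the text survives a round trip.
ipm_field = soap_to_ipm(original, "iso-8859-5")
round_trip = ipm_to_soap(ipm_field, "iso-8859-5")
print(round_trip == original)

# With the default encoding, the text is lost on the way into the IPM.
lossy_field = soap_to_ipm(original)
print(ipm_to_soap(lossy_field) == original)
```

The first comparison succeeds because every character in the sample has a single-byte representation in ISO-8859-5; the second fails because ISO-8859-1 has no Cyrillic characters.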
SOAPam uses the open-source library International Components for Unicode (ICU) for character conversion and supports all of the conversions provided by ICU. A list of the available conversions can be found at the ICU Converter Explorer.