CONVERTUTF


9.5B

 

The CONVERTUTF instruction performs conversions involving UTF-8, UTF-16, and UTF-32. Its basic operations convert a PLB character data stream to or from UTF-8, UTF-16, or UTF-32. In addition, an application can use this instruction to convert one UTF format directly to another. The instruction uses the following format:

 

 

[label]  CONVERTUTF  {source}{sep}{dest}[,{type}]

 

Where:

label    Optional. A Program Execution Label.

source   Required. A previously defined Character String Variable or Literal variable whose logical string contains the data to be converted.

sep      Required. A comma or one of the following prepositions: BY, TO, OF, FROM, USING, WITH, IN, or INTO.

dest     Required. A Character String Variable that receives the converted data.

type     Optional. A Numeric Variable, numeric literal, or decimal number that identifies the type of UTF conversion to be performed.
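As an illustration of the format above, the following minimal sketch shows the instruction written with a comma separator and with a preposition separator. The variable names PLBDATA and UTFDATA are hypothetical and chosen only for this example; the sizes are arbitrary.

```plb
. Hypothetical variable names, used for illustration only.
PLBDATA  INIT      "Hello"
UTFDATA  DIM       40
.
. Comma separator with {type} omitted.
         CONVERTUTF PLBDATA,UTFDATA
.
. Preposition separator with an explicit {type} of 0.
         CONVERTUTF PLBDATA TO UTFDATA,0
```

Both statements request the same conversion, assuming that an omitted {type} selects type 0 as the conversion type table indicates.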

Flags Affected: EOS, OVER, ZERO

Note the following:

  1. The ZERO flag is set if the conversion was successful.

  2. The OVER flag is set when a conversion error has occurred.

  3. The EOS flag is set to TRUE when the {dest} variable is too small to receive all of the converted data.

  4. If the logical string of the {source} operand is NULL, the flags are cleared and the instruction terminates immediately.

  5. After a conversion operation is executed, the S$ERROR$ variable contains one of the following character strings:

     String   Meaning
     T00      Conversion successful
     T01      Partial character in source
     T02      Insufficient room in target for conversion
     T03      Source data sequence is illegal/malformed
     T04      Invalid conversion mode specified
     T09      Insufficient memory to execute conversion
     T10      Invalid codepage value specified. The codepage value must be greater than 400 and less than 65536. (9.6)
     T11      Unable to execute the multi-byte to wide character conversion because an OS error occurred. Verify the source parameter string format. (9.6)
     T12      Unable to execute the wide character to multi-byte conversion because an OS error occurred. Verify the source parameter string format. (9.6)
     T13      Codepage values are not supported by the runtime being executed. (9.6)

  6. The {type} conversion values can be specified as follows:

     Type Value   Conversion Action
     0            PLB data to UTF-8 (default)
     1            PLB data to UTF-16
     2            PLB data to UTF-32
     3            UTF-8 to PLB data
     4            UTF-16 to PLB data
     5            UTF-32 to PLB data
     6            UTF-8 to UTF-16
     7            UTF-8 to UTF-32
     8            UTF-16 to UTF-8
     9            UTF-16 to UTF-32
     10           UTF-32 to UTF-8
     11           UTF-32 to UTF-16
     12           PLB multi-byte string to wide character string - Windows only. The {type} can optionally include a Windows Codepage value greater than 400. (9.6)
     13           Wide character string to PLB multi-byte string - Windows only. The {type} can optionally include a Windows Codepage value greater than 400. (9.6)

  7. A PLB multi-byte string is composed of 8-bit bytes stored in a DIM variable. The interpretation and presentation of these bytes depend on the Windows environment in which the PLB runtime is executing. See the Windows APIs named 'MultiByteToWideChar' and 'WideCharToMultiByte' for more details.

  8. A wide character is composed of 2 bytes in a DIM variable and is formatted as a UTF-16 character. See the Windows APIs named 'MultiByteToWideChar' and 'WideCharToMultiByte' for more details.

  9. When using conversion type 12 or 13, the {type} parameter can include a Windows Codepage identifier. When a Windows Codepage identifier is not specified, the conversion defaults to the Windows ANSI code page. Otherwise, the Windows Codepage value must be greater than 400 and less than 65536. When a Codepage value is included in the {type} parameter value, it must be multiplied by 100 and added to the conversion type value, using the following format:

     Format:

     {type} = ( {codepage} * 100 ) + {typevalue}

     Where:

     {codepage}
       Windows codepage identifier. The value must be greater than 400 and less than 65536. See the Windows documentation for details on valid codepage values. For example:

       1252 - Windows-1252 is described as "ANSI Latin 1; Western European (Windows)"
       1255 - Windows-1255 is described as "ANSI Hebrew; Hebrew (Windows)"

     {typevalue}
       One of the conversion values defined for the CONVERTUTF instruction. The values are limited to 00 to 99.

     Examples:

     TYPE12   FORM   "125512"   // Use codepage value 1255 with the conversion type 12.
     TYPE13   FORM   "125513"   // Use codepage value 1255 with the conversion type 13.
     TYPEXX   FORM   10         // Calculate codepage + type
              MOVE   ( ( 1255 * 100 ) + 12 ),TYPEXX
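The flag and S$ERROR$ notes above can be combined into a simple status check after the instruction. The following is a minimal sketch, not a definitive pattern. The variable names PLBDATA and UTFDATA are hypothetical, the destination is deliberately small so that the EOS case may be exercised, and it is assumed that S$ERROR$ can be referenced directly by the program.

```plb
. Hypothetical variable names, used for illustration only.
PLBDATA  INIT      "Sample text"
UTFDATA  DIM       8
.
         CONVERTUTF PLBDATA,UTFDATA,1          // type 1: PLB data to UTF-16
.
. All flags are tested before any other instruction is executed.
         IF EOS
           DISPLAY   "Destination too small: ",S$ERROR$
         ELSE
           IF OVER
             DISPLAY   "Conversion error: ",S$ERROR$
           ELSE
             DISPLAY   "Conversion status: ",S$ERROR$
           ENDIF
         ENDIF
```

The EOS test is placed first on the assumption that a truncated result is the most specific condition to report; an application could equally test ZERO first and treat everything else as a failure.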

 
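The codepage format can also be computed from variables and then passed to the instruction. This sketch assumes a Windows runtime, since types 12 and 13 are Windows only. The names CPVALUE, TYPEVAL, MBDATA, and WIDESTR are hypothetical and chosen only for this example.

```plb
. Hypothetical variable names, used for illustration only.
CPVALUE  FORM      "1255"                      // Windows-1255 codepage
TYPEVAL  FORM      "12"                        // multi-byte to wide character
TYPEXX   FORM      10
MBDATA   INIT      "abc"
WIDESTR  DIM       20
.
. ( 1255 * 100 ) + 12 gives 125512.
         MOVE      ( ( CPVALUE * 100 ) + TYPEVAL ),TYPEXX
         CONVERTUTF MBDATA,WIDESTR,TYPEXX
```

Keeping the codepage and the type value in separate variables makes it easier to switch between types 12 and 13 for the same codepage.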

 

See Also: Character String Instructions

 



PL/B Language Reference