    How to handle multilingual characters in QVExpressor



      I am exploring QVExpressor.

      I am trying to read data containing Chinese and Japanese characters from a file, using UTF-8 encoding in the schema that reads the file.

      I am using a UTF-8 schema for the output file as well, but the column that contains the special characters comes out as junk data in the output file.

      Can anyone please suggest which encoding and data type should be used?


      Thanks in advance.

          You have two issues to resolve.


          The first is to confirm that UTF-8 is the encoding being used in the file you are trying to read.  Hopefully, whoever gave you the file can tell you what encoding was used.  Can you read the file in a text editor?  If so, the editor's settings should help you determine the encoding used.
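
          If a text editor does not settle the question, a small script outside Expressor can check the raw bytes. The following is only a rough sketch in Python, not an Expressor feature: the function names are made up for illustration, and it checks only for a byte-order mark and for strict UTF-8 validity. A file that passes is very likely UTF-8, but this is not a general encoding detector.

```python
import codecs
from pathlib import Path


def describe_encoding(raw: bytes) -> str:
    """Make a rough guess about how `raw` is encoded.

    Only the byte-order mark (BOM) and strict UTF-8 validity are checked.
    """
    # Test UTF-32 before UTF-16: the UTF-32-LE BOM begins with the UTF-16-LE BOM.
    if raw.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "UTF-32 (BOM found)"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16 (BOM found)"
    if raw.startswith(codecs.BOM_UTF8):
        return "UTF-8 with BOM"
    try:
        raw.decode("utf-8")  # strict decoding raises on any invalid sequence
    except UnicodeDecodeError as exc:
        return f"not valid UTF-8 (first bad byte at offset {exc.start})"
    return "valid UTF-8 without BOM"


def describe_file(path: str) -> str:
    """Apply the same check to a file on disk (the path is supplied by the caller)."""
    return describe_encoding(Path(path).read_bytes())


if __name__ == "__main__":
    # Sample data built in memory so the sketch runs without any input file.
    print(describe_encoding("日本語,中文".encode("utf-8")))
    print(describe_encoding("日本語".encode("utf-16")))
    print(describe_encoding("日本語".encode("shift_jis")))
```

          Running the same check on both the input file and the output file shows whether the bytes were already wrong on the way in, or whether the output is valid UTF-8 that is merely being displayed with the wrong encoding.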


          Once data is in Expressor, UTF-8 encoding is employed, so the schema used by the output operator can use this encoding for the output file.


          Within Expressor, you will want to use the ustring functions rather than the string functions to perform any sort of transformations on the data.  These functions have been developed to handle text that includes Unicode characters.


          Note that while Expressor can process special characters as data, special characters in the names of fields in a file (or the names of columns in a database table) will present problems.  Special characters also cannot be used in the names of database tables.

              Hi John

              Thanks for your response


              The input file was extracted from a database that uses the UTF-8 character set, and I am able to read the file in Notepad.

              I have not used any string functions on the data; it is a direct mapping from source to stage. I am using

              the same schema for the input and output files, and the encoding is set to UTF-8.

              I have another question, about the data type to use for the column that contains special characters.

              We usually use NVARCHAR in the database, so is there a specific data type that has to be used in Expressor?

                  If you can read the contents of the file in Notepad and are using the same schema for input and output, then I can't think of a reason you should be having an issue.


                  You will need to send me an extract from the file and your current project so I can see for myself.


                  When you use Expressor to read from a database table, the conversion from table types (NVARCHAR) to an appropriate Expressor type is handled by Expressor.  Expressor uses UTF-8 encoding internally, so there should be no problems once the data is in Expressor.