Tuesday, June 8, 2010

DB2, move to UTF8 from IBM-1252

The only way to move a database from IBM-1252 to UTF8 is to create a new database with the UTF8 codeset, export the data from the IBM-1252 database, and load it into the new UTF8 database.
There are a few things to consider during the ETL exercise:
  • Length of CHAR columns – You might have to increase the width of CHARACTER columns in your UTF8 database, because a character that took one byte in IBM-1252 can take more than one byte in UTF8. Depending on the characters in your data, a value can need between 1 and 4 times its original width. Characters that come from code page 1252 need at most 3 bytes each in UTF8.
  • Use of string manipulation functions – Because characters no longer all have the same byte length, check that your application still works properly if it uses functions such as SUBSTR, which counts bytes rather than characters.
  • Collation – On UTF8 databases, DB2 supports binary collation as well as 3 different Unicode Collation Algorithm (UCA) based collations. You may need to investigate which one preserves the sort order your applications expect.
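To get a feel for the first two points before the load, you can measure the growth on sample data outside the database. The following is a minimal Python sketch, not a DB2 tool: the function names are made up for this post, and it assumes Python's cp1252 codec is a close enough stand-in for IBM-1252.

```python
def byte_lengths(text: str) -> tuple[int, int]:
    """Return (bytes in code page 1252, bytes in UTF-8) for text."""
    return len(text.encode("cp1252")), len(text.encode("utf-8"))


def required_utf8_width(values: list[str]) -> int:
    """Smallest byte width that holds every value once encoded as UTF-8."""
    return max((len(v.encode("utf-8")) for v in values), default=0)


def byte_prefix(text: str, nbytes: int) -> bytes:
    """First nbytes bytes of the UTF-8 form, the way a byte-based
    substring function would cut the value."""
    return text.encode("utf-8")[:nbytes]


# Hypothetical sample values: plain ASCII, an accented letter (e with acute),
# and the euro sign followed by digits.
samples = ["abc", "caf\u00e9", "\u20ac100"]

widths = [byte_lengths(s) for s in samples]   # per-value (1252, UTF-8) sizes
needed = required_utf8_width(samples)         # width a column would need
cut = byte_prefix("caf\u00e9", 4)             # ends in the middle of a character
```

Running required_utf8_width over an export of each character column gives a lower bound for the new column width. The byte_prefix call shows why byte-based substring logic needs a review: cutting the 5-byte value at 4 bytes leaves half of the last character behind.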
