db2 400 - How to recover original characters from data with broken characters? - Stack Overflow

We have a bunch of translations loaded in our Database which are showing broken characters where there should be accented latin characters like 'É'. I'm not sure where the encoding went wrong but we are now seeing data like 'CASQUE INTÃGRAL 100-SERIES 3.0' which obviously is not what we want.

I'm fairly certain the field is UTF-8, and I make sure to keep it that way when reading the data. I've tried a few other encoding combinations, but all of them show the broken characters.

My question is whether it is possible to get this string back to a state before the broken characters occurred. Is the data still there, just in the incorrect encoding or did saving it in the incorrect encoding destroy the original character data?

To be clear, is it possible to convert 'CASQUE INTÃGRAL 100-SERIES 3.0' back to 'CASQUE INTÉGRAL 100-SERIES 3.0', or will this data need to be reloaded with the correct encoding?

The database in question is Db2 on AS400.

Asked Mar 14 at 17:30 by Daniel Black; edited Mar 14 at 19:12.
  • 1 What is the CCSID on the table? DSPFD (file) and then search for CCSID. What is the System CCSID? DSPSYSVAL QCCSID. Do you have more than one CCSID loaded on your machine? – jmarkmurphy Commented Mar 14 at 18:58
  • 1 How are you viewing the data? Are you using a Green Screen app? or a SQL Client? – jmarkmurphy Commented Mar 14 at 19:00
  • 1 How was the data loaded? You say the table is UTF-8, which really isn't possible since encoding is at the column level. But you could have the column defined as UTF-8 aka CCSID(1208). – Charles Commented Mar 14 at 19:07
  • @jmarkmurphy not sure about your first comment, I'm admittedly not very familiar with the DB2 or the AS400, sorry. I've been trying to determine how to get the encoding of the columns but was coming up empty, saw some references to DB2 varchars being UTF-8 so that's what I was assuming. I only have query access to the DB2 and cannot run actual OS commands, is it possible for me to determine the CCSID via a query? – Daniel Black Commented Mar 14 at 19:10
  • 1 You can query the QSYS2.SYSCOLUMNS catalog view for the column definitions. You'll need to know the library (aka schema) that the table is in. – Charles Commented Mar 14 at 19:13

1 Answer

Short answer: yes, you're going to have to reload the data.

Long answer: in my experience it'd be unusual to have a Db2 for IBM i table with CCSID(1208) (aka UTF-8) columns, unless it's a relatively recently added table. Even now, my experience (granted, in the US) has been that many shops continue to use EBCDIC CCSIDs by default.

So it's likely that, however the table was loaded, the conversion from UTF-8 to the assigned EBCDIC CCSID wasn't handled properly. Judging by your comments, you're going to need some assistance from the system admins and/or RPG/COBOL developers.

UPDATE: CCSID(37) is US English
I believe that should be able to handle INTÉGRAL so you'll need to look at how the data was loaded.
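
To make the suspected failure mode concrete, here is a minimal Python sketch. It is only an illustration under assumptions: Python's `cp037` codec stands in for CCSID 37, Latin-1 stands in for whatever single-byte code page the loader may have assumed, and the helper names are made up. It says nothing certain about how your data was actually loaded.

```python
# Illustration only: "cp037" is Python's codec for IBM EBCDIC code page 037,
# used here as a stand-in for CCSID 37. Helper names are hypothetical.
ORIGINAL = "CASQUE INTÉGRAL 100-SERIES 3.0"


def misread_utf8_as_single_byte(text: str, codec: str = "latin-1") -> str:
    """Encode text as UTF-8, then wrongly decode it one byte per character."""
    return text.encode("utf-8").decode(codec)


def fits_in_cp037(text: str) -> bool:
    """Return True if every character can be encoded with the cp037 codec."""
    try:
        text.encode("cp037")
        return True
    except UnicodeEncodeError:
        return False


garbled = misread_utf8_as_single_byte(ORIGINAL)

# 'É' is two bytes in UTF-8 (0xC3 0x89). Read as Latin-1, they become 'Ã'
# followed by the control character U+0089, which most clients do not display.
print(repr(garbled))

# The correctly decoded text is representable in code page 037.
print(fits_in_cp037(ORIGINAL))
```

If the stored values show the same 'Ã' pattern, that would be consistent with UTF-8 bytes being treated as single-byte characters somewhere in the load path, which is the step to check with whoever loaded the data.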

This may be a useful presentation for you...
What's With These ASCII, EBCDIC, Unicode CCSIDs?
