CSV Character encoding for PS

Postby cwilkinson » Fri May 13, 2011 11:36 am

I'm having a problem where Photoshop isn't properly reading French accented characters such as é. When I apply datasets, the é is sometimes displayed as garbage, É, or sometimes even È. I'm definitely not bilingual, but I had a user tell me that no accent is better than the wrong accent over the e.

Can anyone shed some light on my character encoding issues?

Thanks!
cwilkinson
 
Posts: 17
Joined: Wed Dec 03, 2008 11:19 pm

Re: CSV Character encoding for PS

Postby cwilkinson » Tue May 17, 2011 2:12 pm

Maybe I'm the only one having the problem.

I guess it could be the method in which I am building the CSV from a database, too.
cwilkinson
 
Posts: 17
Joined: Wed Dec 03, 2008 11:19 pm

Re: CSV Character encoding for PS

Postby Mike Hale » Tue May 17, 2011 2:21 pm

This is just a wild guess, but when you say the accented chars are sometimes incorrectly displayed, are you using the same font?
Mike Hale
Site Admin
 
Posts: 4337
Joined: Fri Sep 30, 2005 10:52 pm
Location: USA

Re: CSV Character encoding for PS

Postby cwilkinson » Tue May 17, 2011 3:39 pm

Hmm. I've never tried a different font on the text replacement field because the font I'm using supports the chars that aren't working. I'm pretty sure the data is changing encoding somewhere along the way. If I knew which encoding Photoshop expects to see, I could start changing my code to use it.
Currently I'm using UTF-8.
cwilkinson
 
Posts: 17
Joined: Wed Dec 03, 2008 11:19 pm

Re: CSV Character encoding for PS

Postby cwilkinson » Wed Oct 26, 2011 1:20 pm

It's been a really long time since I last visited this problem; however, it still plagues me.

Photoshop still won't play nice with the French character é. It gets changed to È. Interestingly enough, unless I adjust the character encoding in OpenOffice Calc to something other than UTF-8, the é appears as a black diamond with a question mark (�).

There must be something I'm doing wrong somewhere along the line. It can't be that Photoshop doesn't support the French language haha!
cwilkinson
 
Posts: 17
Joined: Wed Dec 03, 2008 11:19 pm

Re: CSV Character encoding for PS

Postby xbytor » Wed Oct 26, 2011 2:11 pm

Adobe products typically support a magic encoding marker (signature) at the beginning of a text file.
For UTF8 files, that signature is the three bytes: 0xEF 0xBB 0xBF. For instance, the hex dump of
a UTF8 encoded text file with this signature with the content "This is a test." would be:

Code:
$ od -t x1 Source1.jsx
0000000    ef  bb  bf  54  68  69  73  20  69  73  20  61  20  74  65  73
0000020    74  2e                                                       
0000022


I believe this signature format is spec'd out in the XMP standard (at least). In theory, the Datasets feature
should recognize this signature. If not, I'm not sure there is another approach to take.
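
As a minimal sketch of how a file with this signature could be produced outside Photoshop, the Python snippet below writes the same test string and dumps its bytes. It assumes Python 3.8 or later; `utf-8-sig` is Python's codec name for UTF-8 with a leading signature, and `write_with_signature`, `hex_dump` and the scratch file name are hypothetical names used only for illustration, not anything from Photoshop.

```python
import os
import tempfile


def write_with_signature(path, text):
    """Write text as UTF-8 preceded by the EF BB BF signature."""
    # The "utf-8-sig" codec emits the three signature bytes before the text.
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        handle.write(text)


def hex_dump(path):
    """Return the file's bytes as space-separated hex pairs, like od -t x1."""
    with open(path, "rb") as handle:
        return handle.read().hex(" ")


# Hypothetical scratch file used only for this demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), "signature_demo.txt")
write_with_signature(demo_path, "This is a test.")
dump = hex_dump(demo_path)
print(dump)
```

The printed bytes should line up with the od output above: the three signature bytes first, then the ASCII text.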
xbytor
Site Admin
 
Posts: 2294
Joined: Thu May 19, 2005 12:11 pm
Location: In Limbo

Re: CSV Character encoding for PS

Postby cwilkinson » Wed Oct 26, 2011 4:34 pm

I've made some progress.

The file is now written with the UTF-8 BOM; however, instead of é becoming È, it's now √©. I'm not sure how this is progress, but I'm working on it.
cwilkinson
 
Posts: 17
Joined: Wed Dec 03, 2008 11:19 pm

Re: CSV Character encoding for PS

Postby xbytor » Wed Oct 26, 2011 6:07 pm

Verify that a file containing
Code:
éÈ√©


is encoded like this

Code:
$ od -t x1 Source1.jsx
0000000    ef  bb  bf  c3  a9  c3  88  e2  88  9a  c2  a9               
0000014


If your file encoding looks good, I can't think of what else to try except check with the general Adobe PS forums at adobe.com.
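
If od isn't handy, the same check can be sketched in Python. This is only an illustration, assuming Python 3; `inspect_bytes` is a hypothetical helper that reports whether the data starts with the signature and whether the rest decodes as strict UTF-8.

```python
import codecs


def inspect_bytes(data):
    """Report whether data starts with the UTF-8 signature and decodes as UTF-8."""
    has_signature = data.startswith(codecs.BOM_UTF8)
    payload = data[len(codecs.BOM_UTF8):] if has_signature else data
    try:
        text = payload.decode("utf-8")  # strict: raises on invalid sequences
    except UnicodeDecodeError:
        text = None
    return has_signature, text


# The bytes from the hex dump above, written out by hand.
expected = bytes.fromhex("efbbbf c3a9 c388 e2889a c2a9")
print(inspect_bytes(expected))

# The same e-acute stored as ISO-8859-1 (a single 0xE9 byte) is not valid UTF-8,
# so the decoded text comes back as None.
print(inspect_bytes("\u00e9".encode("iso-8859-1")))
```

A `None` in the second slot suggests the file is not UTF-8 at all, which points at the export step rather than at Photoshop.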
xbytor
Site Admin
 
Posts: 2294
Joined: Thu May 19, 2005 12:11 pm
Location: In Limbo

Re: CSV Character encoding for PS

Postby Mike Hale » Wed Oct 26, 2011 10:06 pm

For what it is worth, I don't think the BOM is the problem. As I understand it, UTF-8 does not need a BOM. The byte order is always the same. The number of bytes needed depends on the char being encoded. Some programs do add a BOM to UTF-8 files, but it is not needed and can cause problems.

More to the point, I just did a test using an English version of Photoshop CS5. The text file for the variable data is UTF-8 but does not have the BOM. When I import the dataset using Automatic encoding, the text replacement correctly imports accented characters. So the BOM is not needed by Photoshop when using datasets and variable data.
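
To illustrate the point about byte counts, here is a small Python sketch (assuming Python 3.8 or later; the sample characters are just the ones mentioned in this thread). It prints how many bytes each character takes in UTF-8 and shows that the signature is always the same three bytes.

```python
import codecs

# e, e-acute, E-grave, square root sign
samples = ["e", "\u00e9", "\u00c8", "\u221a"]

for char in samples:
    encoded = char.encode("utf-8")
    print(repr(char), len(encoded), encoded.hex(" "))

# The signature is the code point U+FEFF encoded as UTF-8; it is always
# the same three bytes, so it carries no byte-order information.
signature = "\ufeff".encode("utf-8")
print(signature.hex(" "), signature == codecs.BOM_UTF8)

byte_lengths = {char: len(char.encode("utf-8")) for char in samples}
```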
Mike Hale
Site Admin
 
Posts: 4337
Joined: Fri Sep 30, 2005 10:52 pm
Location: USA

Re: CSV Character encoding for PS

Postby cwilkinson » Thu Oct 27, 2011 12:35 am

I've returned to add my solution, slightly embarrassed.

In my workflow, I fixed the problem by converting the incoming CSV from ISO-8859-1 to macintosh (MacRoman). I assume this is the default encoding, based on the operating system charset, that Photoshop uses when it can't automatically detect the encoding.
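
For anyone wanting to do the same conversion, here is a rough Python sketch. It is not the code from my workflow; it assumes Python 3.8 or later and that the source file really is ISO-8859-1, and `convert_latin1_to_macroman` and the file names are hypothetical.

```python
import os
import tempfile


def convert_latin1_to_macroman(src_path, dst_path):
    """Re-encode a text file from ISO-8859-1 to MacRoman."""
    # newline="" keeps the CSV's original line endings untouched.
    with open(src_path, "r", encoding="iso-8859-1", newline="") as src:
        text = src.read()
    # The default strict error handling raises if a character has no
    # MacRoman equivalent, rather than silently writing a substitute.
    with open(dst_path, "w", encoding="mac_roman", newline="") as dst:
        dst.write(text)


# Hypothetical scratch files standing in for the real CSV.
work_dir = tempfile.mkdtemp()
src_path = os.path.join(work_dir, "latin1.csv")
dst_path = os.path.join(work_dir, "macroman.csv")

with open(src_path, "wb") as handle:
    # "name" header and one row containing "cafe" with an e-acute
    handle.write("name\r\ncaf\u00e9\r\n".encode("iso-8859-1"))

convert_latin1_to_macroman(src_path, dst_path)

with open(dst_path, "rb") as handle:
    converted = handle.read()
print(converted.hex(" "))  # e-acute is 0xE9 in ISO-8859-1 and 0x8E in MacRoman
```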

That said, I'm embarrassed to admit that the problem, for the entirety of this endeavor, came down to my leaving the Photoshop text encoding method set to automatic.

It wasn't until I returned here to post my success story, and read Mike's reply, that I remembered I have the option to change the encoding during the CSV import.

Sorry to have taken up your time, folks. Hopefully Google will index this and save someone else like me the time.

Cheers
cwilkinson
 
Posts: 17
Joined: Wed Dec 03, 2008 11:19 pm

Re: CSV Character encoding for PS

Postby xbytor » Thu Oct 27, 2011 9:21 am

cwilkinson wrote:In my workflow, I fixed the problem by converting the incoming CSV from ISO-8859-1 to macintosh (MacRoman). I assume this is the default encoding, based on the operating system charset, that Photoshop uses when it can't automatically detect the encoding.


It's also the (default) encoding for Script Listener log files on OSX. Win has a different encoding: CP-1252 IIRC. It took me about a dozen hours to figure out that this was a PS "feature" and not a problem in my code. Try parsing OSX SL log files with lots of German and French on a Win7 box and you'll get interesting failures. I had to tweak Stdlib.readFromFile() because of inconsistencies in character conversion error handling.
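
A quick way to see this kind of mismatch is to decode the same bytes under different encodings. The Python sketch below (Python 3 assumed; `decode_as` is a hypothetical helper) reproduces the È and √© from earlier in the thread. That the symptoms came from exactly these mismatches is an inference from the bytes, not something verified inside Photoshop.

```python
def decode_as(data, encoding):
    """Decode bytes strictly, returning None when the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


e_acute = "\u00e9"  # e-acute

# ISO-8859-1 e-acute (0xE9) read as MacRoman gives E-grave,
# matching the first symptom reported in the thread.
latin1_as_mac = decode_as(e_acute.encode("iso-8859-1"), "mac_roman")

# UTF-8 e-acute (0xC3 0xA9) read as MacRoman gives a square root sign
# followed by a copyright sign, matching the symptom seen after adding the BOM.
utf8_as_mac = decode_as(e_acute.encode("utf-8"), "mac_roman")

# MacRoman e-acute (0x8E) read as CP-1252 decodes, but to the wrong letter.
mac_e_as_cp1252 = decode_as(e_acute.encode("mac_roman"), "cp1252")

# MacRoman c-cedilla (0x8D) has no mapping in Python's cp1252 codec, so a
# strict decode fails outright instead of producing a wrong character.
mac_c_as_cp1252 = decode_as("\u00e7".encode("mac_roman"), "cp1252")

print(latin1_as_mac, utf8_as_mac, mac_e_as_cp1252, mac_c_as_cp1252)
```

The last two cases show why error handling matters: some mismatched bytes decode quietly to the wrong character, while others fail, so a reader has to decide what to do in both situations.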
xbytor
Site Admin
 
Posts: 2294
Joined: Thu May 19, 2005 12:11 pm
Location: In Limbo

