3/4/13

unicode mess

I think I may have messed up some unicode. Let's check the unicode pipeline.

First, I'll need some sample unicode text to save as utf-8. It needs a character that is definitely not ascii. How about something Chinese: "你好 world". I think that means "hello world".

Ok, let's put that in the Sublime Text editor, and try to save it as utf-8.. done.

Now let's try loading it and printing it to the console, to see if the Mac's console supports unicode.. hm.. it does. great.

Now let's try printing it out as json, just to see what it does. I think I'd prefer if it escaped the non-ascii characters with \u, but I don't think it will.. it does not escape them.
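
If I did want the \u escapes, something like this would probably do it. This is just a sketch: toAsciiJson is a name I made up, not something built into node, and it simply post-processes the output of JSON.stringify.

```javascript
// Hypothetical helper (not part of node or any library): stringify as usual,
// then replace every non-ascii UTF-16 code unit with a \uXXXX escape.
function toAsciiJson(value) {
  return JSON.stringify(value).replace(/[\u0080-\uffff]/g, function (ch) {
    return '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0');
  });
}

const plain = JSON.stringify({ greeting: '你好 world' });
const escaped = toAsciiJson({ greeting: '你好 world' });

console.log(plain);   // the Chinese characters are left as they are
console.log(escaped); // {"greeting":"\u4f60\u597d world"}
```

The escaped version is still valid json, so JSON.parse gives back the original string either way.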

Let's try saving it back in a file as json, and loading it again as utf-8.. success
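
For the record, the round trip looks roughly like this. The file name and the temp directory are just placeholders I picked for the sketch, not what I actually used.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical file name, written to the system temp directory.
const file = path.join(os.tmpdir(), 'unicode-test.json');
const original = { greeting: '你好 world' };

// Write the json as utf-8, then read it back as utf-8 and parse it.
fs.writeFileSync(file, JSON.stringify(original), 'utf8');
const loaded = JSON.parse(fs.readFileSync(file, 'utf8'));

console.log(loaded.greeting === original.greeting); // true
```

Passing 'utf8' to readFileSync matters: without an encoding it returns a Buffer of raw bytes instead of a string.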

Great, now let's try saving it in mongodb, and see what it looks like in the mongodb console.. that works too.. so far, so good..

Now let's try getting it back from mongodb in javascript.. works..

Hm.. let's see if node.js assumes that javascript files are utf-8.. we'll put "你好 world" as a literal inside a javascript file and save it as utf-8, and try running it.. works.. I guess the default is utf-8.

Now let's try sending the json to a webpage using ajax.. FAILURE: when I point the browser at the json directly, it shows "ä½ å¥½ world". That looks like the utf-8 bytes being read one byte per character, as if they were latin-1. I assume it would also be wrong if a page loaded it through javascript.
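
That garbage can be reproduced in node, which at least confirms what kind of mistake it is. This is a sketch that assumes the browser fell back to latin-1 (or windows-1252, which agrees with latin-1 for these particular bytes).

```javascript
// Encode the string as utf-8: 3 bytes per Chinese character, 6 for " world".
const bytes = Buffer.from('你好 world', 'utf8');
console.log(bytes.length); // 12

// Decode those same bytes the wrong way, one byte per character.
const misread = bytes.toString('latin1');
console.log(misread); // ä½ å¥½ world (the "space" after ½ is really U+00A0)

// The damage is reversible as long as nothing was lost along the way.
const roundTrip = Buffer.from(misread, 'latin1').toString('utf8');
console.log(roundTrip === '你好 world'); // true
```

So the bytes on the wire were fine. Only the interpretation was wrong.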

..how to fix this.. somehow the browser needs to be informed that this json is utf-8 encoded. I'm guessing this will be header information.. I was going to say I'm surprised, since I'm using an external library to form the json http response, but I'm not, which makes it even more likely that I'm just doing it wrong.. let's look it up..

I found my answer in this stackoverflow question: What does “Content-type: application/json; charset=utf-8” really mean? So the response needs a Content-Type header that includes charset=utf-8.
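
In node, setting that header looks something like the following. This is a sketch assuming the plain built-in http module. sendJson is a name I made up, and the port is arbitrary.

```javascript
const http = require('http');

// Hypothetical helper: send `value` as json and say which encoding it is in.
function sendJson(res, value) {
  const body = Buffer.from(JSON.stringify(value), 'utf8');
  res.writeHead(200, {
    'Content-Type': 'application/json; charset=utf-8',
    'Content-Length': body.length // length in bytes, not in characters
  });
  res.end(body);
}

const server = http.createServer(function (req, res) {
  sendJson(res, { greeting: '你好 world' });
});

// Uncomment to actually serve it (8080 is an arbitrary port):
// server.listen(8080);
```

The Content-Length is taken from the Buffer because the byte count and the character count differ as soon as there is a non-ascii character.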

Success!!

..and, sadness that I failed before. But mostly Success!!

Let's make sure that I can load that with ajax and put it into a div.. yup.
