
Ajax With Utf


justsomeguy

I'm having some problems submitting UTF-8 text over AJAX. Here's my sweet Chinese text:

欢迎来到六西格玛基础课程。我是萨

My first solution was to escape the data:

query += "&comments=" + escape(comments);

When I looked at the request in Firebug, it didn't even show a "comments" field. The processing page still picked the field up, but it replaced each character with a question mark. I'm having a hard time figuring out exactly where the encoding goes wrong, but if I write the raw value of the submitted data to a text file, I get the question marks.

The next solution was to use encodeURIComponent instead of escape:

query += "&comments2=" + encodeURIComponent(comments);

Firebug appreciated that much more:

comments2 欢迎来到六西格玛基础课程。我是萨

It even printed the exact characters, not the encoded version. If I send the encoded string to the console I see this:

%E6%AC%A2%E8%BF%8E%E6%9D%A5%E5%88%B0%E5%85%AD%E8%A5%BF%E6%A0%BC%E7%8E%9B%E5%9F%BA%E7%A1%80%E8%AF%BE%E7%A8%8B%E3%80%82%E6%88%91%E6%98%AF%E8%90%A8%0A

So that's the text that gets sent. I'm having a problem on the server side decoding that back to UTF-8 though. It seems to be treating each byte as its own character instead of reading 3 bytes per character, and gives me this as a result:

欢迎æ�¥åˆ°å…­è¥¿æ ¼çŽ›åŸºç¡€è¯¾ç¨‹ã€‚我是è�¨

I'm having a hard time decoding that. I'm using ASP/JavaScript, so here's my UTF8 object:

<%
//+ Jonas Raoni Soares Silva
//@ http://jsfromhell.com/geral/utf-8 [rev. #1]
UTF8 = {
	encode: function(s){
		for(var c, i = -1, l = (s = s.split("")).length, o = String.fromCharCode; ++i < l;
			s[i] = (c = s[i].charCodeAt(0)) >= 127 ? o(0xc0 | (c >>> 6)) + o(0x80 | (c & 0x3f)) : s[i]
		);
		return s.join("");
	},
	decode: function(s){
		for(var a, b, i = -1, l = (s = s.split("")).length, o = String.fromCharCode, c = "charCodeAt"; ++i < l;
			((a = s[i][c](0)) & 0x80) &&
			(s[i] = (a & 0xfc) == 0xc0 && ((b = s[i + 1][c](0)) & 0xc0) == 0x80 ?
			o(((a & 0x03) << 6) + (b & 0x3f)) : o(128), s[++i] = "")
		);
		return s.join("");
	}
};
%>

When I use UTF8.decode on the submitted text and then try to write the result to a file, the WriteLine method produces an error that says "Invalid procedure call or argument". So that's pretty cool, I can't even write it to a file to help debug. If I insert the text into SQL Server, it inserts several question marks, for both the raw data and the UTF8-decoded data.

God, I don't even know what my question is at this point. I guess I'll continue to research and try things. If anyone has any suggestions, please share.
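To make the difference between the two client-side calls concrete, here is a minimal sketch using one character from the sample text. The helper decodeAsSingleBytes is hypothetical (it is not in the original post); it only imitates a server that reads each percent-encoded byte as its own character, which is one plausible way to end up with the garbled result shown above.

```javascript
// Compare the two encodings for one character from the post.
var ch = "欢"; // U+6B22

var viaEscape = escape(ch);                // legacy %uXXXX form, not UTF-8
var viaComponent = encodeURIComponent(ch); // UTF-8 bytes, percent-encoded

// Hypothetical helper: turn every %XX into one character, the way a
// decoder assuming a single-byte code page would.
function decodeAsSingleBytes(encoded) {
    return encoded.replace(/%([0-9A-Fa-f]{2})/g, function (match, hex) {
        return String.fromCharCode(parseInt(hex, 16));
    });
}

var misread = decodeAsSingleBytes(viaComponent); // three characters, not one
var correct = decodeURIComponent(viaComponent);  // the original character
```

escape() emits %u6B22 here, which is not standard percent-encoding, while encodeURIComponent() emits the three UTF-8 bytes. Reading those three bytes one at a time yields three Latin-1 characters per Chinese character, so the fix probably belongs in how the server interprets the bytes rather than in the client call.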


I mentioned in the SQL forum that in our application, the page that responds to the AJAX request explicitly sets the Content-Type header to:

application/json; charset=utf-8
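A rough sketch of how that header might be set from ASP/JScript, assuming the standard Response properties CodePage, Charset and ContentType are available on the server in question. The function name sendJsonAsUtf8 is made up for illustration, and the response object is passed in as a parameter so the function can also be tried with a stand-in object outside ASP.

```javascript
// Sketch: send JSON as UTF-8 from classic ASP (JScript).
// `response` is expected to be the ASP Response object.
function sendJsonAsUtf8(response, jsonText) {
    response.CodePage = 65001;                 // 65001 is the UTF-8 code page
    response.Charset = "utf-8";                // appended to the Content-Type header
    response.ContentType = "application/json";
    response.Write(jsonText);
}

// In an ASP page this would be called as (assumption):
// sendJsonAsUtf8(Response, JSON.stringify(data));
```

Setting the code page as well as the charset matters because the header only tells the browser how to read the bytes, while the code page controls which bytes ASP actually writes.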

That didn't seem to help. This thing's a mess; it's one of the first times I used AJAX, a couple of years ago. It will be a lot of work, but I think I've got to rewrite all the AJAX parts, and it could use a different architecture anyway. I probably should have tested international characters in the first place, I just didn't expect that someone would try to submit Chinese text. Thanks for the help though.


Yeah, good luck. The only thing that I really see that is different is that you mentioned that the JSON.stringify script unicode-encodes the data before sending back the response. Maybe there's some weird double encoding happening. In our application, we just pass the strings around as-is with no extra encoding going on anywhere.


Not really, though I haven't broken apart the code to build a test page that just prints it to the page. I've got it going into a database, and that's where I see the question marks. I don't really trust SQL Server to show me what's actually there though; in fact, I don't really trust SQL Server to do much of anything. I tried the exact same Chinese text on my PHP/ExtJS/MySQL application and it worked just fine (and I never even specifically programmed it to handle UTF-8).

The ASP system is so old at this point that there is a ton of data in it about the company and its projects going back several years, so I don't really want to rewrite the entire timesheet/financial/project system to use PHP and MySQL. I'm looking around for an AJAX replacement I can try to drop in to the project system. It would be nice if SQL Server 2000 had native support for UTF-8 instead of using UCS-2, but hopefully I can find a way to get this to work without changing DBs.
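One way to check what is actually in a string, without relying on how a query tool or browser renders it, is to dump its character codes. This is only a sketch; toCodeUnits is a hypothetical helper name, and it assumes the value has already been read back into a JavaScript/JScript string.

```javascript
// Hypothetical debugging helper: list each UTF-16 code unit of a string
// as four uppercase hex digits, separated by spaces.
function toCodeUnits(text) {
    var parts = [];
    for (var i = 0; i < text.length; i++) {
        var hex = text.charCodeAt(i).toString(16).toUpperCase();
        while (hex.length < 4) {
            hex = "0" + hex;
        }
        parts.push(hex);
    }
    return parts.join(" ");
}
```

A real question mark shows up as 003F, which would mean the character was already lost before or during the insert. Values such as 00E6 00AC 00A2 would instead mean the UTF-8 bytes were stored as separate characters, and a value like 6B22 would mean the text is intact and only the display is wrong.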


Hmm, the problem may be in the UTF8.decode() method. Possibly the result isn't a string type and that's why WriteLine won't accept it as an argument.

Edited by Ingolme

I did add a line there to cast it as a string for the split method, I'm not sure what type it started out as (I always thought all request data was a string) but it was having an error until I converted it. The return from that function is definitely a string though.
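On the UTF8.decode() question: the posted decode function only accepts lead bytes that pass (a & 0xfc) == 0xc0, which means two-byte sequences for code points up to U+00FF. Every other high byte is replaced with character 128, so three-byte Chinese characters cannot survive it. Below is a minimal sketch of a decoder that handles one- to four-byte sequences. The name decodeUtf8Bytes is made up, the input is assumed to be a string holding one byte (code 0-255) per character, and the sketch does not reject overlong or surrogate encodings.

```javascript
// Sketch: decode a "one byte per character" string as UTF-8.
function decodeUtf8Bytes(byteString) {
    var REPLACEMENT = "\uFFFD"; // emitted for malformed input
    var out = [];
    var i = 0;
    while (i < byteString.length) {
        var lead = byteString.charCodeAt(i++);
        var extra, codePoint;
        if (lead < 0x80) { extra = 0; codePoint = lead; }
        else if (lead > 0xFF) { out.push(REPLACEMENT); continue; }
        else if ((lead & 0xE0) === 0xC0) { extra = 1; codePoint = lead & 0x1F; }
        else if ((lead & 0xF0) === 0xE0) { extra = 2; codePoint = lead & 0x0F; }
        else if ((lead & 0xF8) === 0xF0) { extra = 3; codePoint = lead & 0x07; }
        else { out.push(REPLACEMENT); continue; }

        var ok = true;
        for (var k = 0; k < extra; k++) {
            var next = byteString.charCodeAt(i); // NaN past the end
            if (next > 0xFF || (next & 0xC0) !== 0x80) { ok = false; break; }
            codePoint = (codePoint << 6) | (next & 0x3F);
            i++;
        }
        if (!ok) { out.push(REPLACEMENT); continue; }

        if (codePoint > 0xFFFF) {
            // Outside the BMP: emit a surrogate pair.
            codePoint -= 0x10000;
            out.push(String.fromCharCode(0xD800 + (codePoint >> 10),
                                         0xDC00 + (codePoint & 0x3FF)));
        } else {
            out.push(String.fromCharCode(codePoint));
        }
    }
    return out.join("");
}
```

If the server can be told to read the request as UTF-8 in the first place, a manual decoder like this should not be needed at all.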


I've switched out the AJAX stuff to use the new Ext Core 3 library, and I'm now seeing a disparity between the text getting printed directly to the page versus encoding to JSON. This test page has a couple things on it:

The text box on the top is to add a new record. The Add button will submit using a regular post request, and the AJAX button will submit using AJAX and then refresh the page. The table underneath is ASP printing the records directly to the page. The textarea under the table shows the JSON representation, which isn't used for anything else on that page other than to print into that textarea.

The last few records in the table are the only ones that matter at this point. In the table you can see that all of the characters in those records print correctly, but in the JSON array the third and final characters are not encoded correctly. You can see the records loaded using AJAX on this page, where the third and final characters of the last few records are not displayed correctly. The JSON test button on that page just encodes the JSON client-side. To do the encoding server-side, I saved this file as an ASP script, and then use JSON.stringify to encode:

http://www.json.org/json2.js

Since I'm seeing different results in printing the table directly to the page versus encoding server-side and sending it out, I'm thinking that the JScript implementation that ASP uses is not able to encode correctly. Encoding client-side seems to work correctly; it doesn't mess up the characters. From everything I can tell, the character sets are all set correctly.

I'm not sure what I'm asking, other than to see if anyone has done server-side JSON encoding using ASP/JScript and might have a solution for this. I don't think that the submission is the issue, because the records always print correctly on the first page when they get printed directly in the HTML. It seems to be the server-side JSON encoding that's causing problems.
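One workaround that might sidestep the server-side encoding question is to make the JSON output pure ASCII by escaping every non-ASCII character as \uXXXX after stringifying. This is a sketch, not a confirmed fix for this page: stringifyAscii is a hypothetical name, and it assumes a JSON.stringify is available (natively, or from json2.js as described above).

```javascript
// Sketch: JSON output that contains only ASCII characters.
// Non-ASCII characters can only occur inside JSON strings, so replacing
// them with \uXXXX escapes keeps the document valid.
function stringifyAscii(value) {
    return JSON.stringify(value).replace(/[\u0080-\uffff]/g, function (ch) {
        var hex = ch.charCodeAt(0).toString(16);
        while (hex.length < 4) {
            hex = "0" + hex;
        }
        return "\\u" + hex;
    });
}
```

Because the response then carries no bytes above 0x7F, it reads the same under any ASCII-compatible code page, and the client-side JSON parser turns the escapes back into the original characters.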
