encodeURIComponent URL to UTF-8


Greywacke

Hi there, I believe I might have found what should be a simple solution to the jumbled characters I get from the JavaScript encodeURIComponent function. I need to decode this in PHP 5.2.9. The only problem is that it does not convert to UTF-8 if ë was entered in the string...

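function UTF8URLDecode($value)
{
    if (is_array($value)) {
        // Decode each element of an array recursively.
        foreach ($value as $key => $val) {
            $value[$key] = UTF8URLDecode($val);
        }
    } else {
        // Replace each %XX escape with the byte it stands for.
        // The /e modifier evaluates the replacement string as PHP code.
        $value = preg_replace('/%([0-9a-f]{2})/ie', 'chr(hexdec($1))', (string) $value);
    }
    return $value;
}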

This code was adapted from http://weierophinney.net/matthew/archives/...ent-values.html. If I am getting problems with the first umlaut character I am testing this with, I assume the other characters unique to UTF-8 will also be ignored. I've been trying to understand regular expressions for 10 years, but I can't seem to find a logical explanation of them, either for JavaScript or PHP. Could someone please help here!

PS: The collation used in MySQL is utf8_unicode_ci, and the form these strings are read from is charset=utf-8. Why does this PHP function, which reportedly converts encodeURIComponent() tainted strings to UTF-8, not do so? (For instance, it does not turn an ë that arrives garbled as Ã« back into ë.)

Oh, I might also mention that the PHP functions meant to decode the URI encoding, such as urldecode, do not work here. I keep getting the ë replaced with Ã« through the encodeURIComponent function...

When you connect to the database, you must also set the charset to UTF-8, you need to output the page as UTF-8, and you should set the form's accept-charset attribute to UTF-8, like so:

<form accept-charset="UTF-8">

For the first, use mysql_set_charset(), and for the second:

header('Content-Type: text/html;charset=UTF-8');

It's being submitted via an AJAX POST request, hence the encodeURIComponent to keep the request valid. My problem is how it's received at the PHP AJAX handler...

Why encode the data BTW? If it's in the query string (a GET variable), PHP should automatically decode it. If it's in the POST body, you shouldn't need to do anything. The browser should automatically encode it, and PHP should automatically decode it.
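
For what it's worth, the jumbled characters are usually not encodeURIComponent's fault. Here is a minimal JavaScript sketch of what seems to be happening, assuming the symptom is Ã« showing up where ë was typed. The helper percentDecodeToBytes is made up for illustration and is not a browser or PHP API. encodeURIComponent emits the UTF-8 bytes of ë as %C3%AB, and those two bytes only look like Ã« when something later reads them as ISO-8859-1 instead of UTF-8.

```javascript
// Percent-encode a string the way the browser-side code in this thread does.
const original = "Ri\u00EBtte"; // "Riëtte"
const encoded = encodeURIComponent(original);

// Hypothetical helper: turn a percent-encoded string back into raw byte values.
function percentDecodeToBytes(s) {
  const bytes = [];
  for (let i = 0; i < s.length; i++) {
    if (s[i] === "%") {
      // The two characters after "%" are one byte written in hex.
      bytes.push(parseInt(s.slice(i + 1, i + 3), 16));
      i += 2;
    } else {
      // Unescaped characters in the encoded form are plain ASCII.
      bytes.push(s.charCodeAt(i));
    }
  }
  return bytes;
}

const bytes = percentDecodeToBytes(encoded);

// Reading the bytes as UTF-8 gives the original text back.
const asUtf8 = new TextDecoder("utf-8").decode(new Uint8Array(bytes));

// Reading each byte as one ISO-8859-1 character gives the garbled form.
const asLatin1 = String.fromCharCode(...bytes);

console.log(encoded);  // Ri%C3%ABtte
console.log(asUtf8);   // Riëtte
console.log(asLatin1); // RiÃ«tte
```

So the decoded bytes are already valid UTF-8. If Ã« shows up, the likely culprit is a page, header, or database connection that treats those bytes as ISO-8859-1.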

Okay, let's say it is the "form" that needs encoding. Where would I do this here? The returned XML response is already UTF-8.

function makeRequest(method, url, parameters) {
	var http_request = false;
	if (window.XMLHttpRequest) { // Mozilla, Safari,...
		http_request = new XMLHttpRequest();
		if (http_request.overrideMimeType) {
			// set type accordingly to anticipated content type
			http_request.overrideMimeType('text/xml');
			//http_request.overrideMimeType('text/html');
		}
	} else if (window.ActiveXObject) { // IE
		try {
			http_request = new ActiveXObject("Msxml2.XMLHTTP");
		} catch (e) {
			try {
				http_request = new ActiveXObject("Microsoft.XMLHTTP");
			} catch (e) {}
		}
	}
	if (!http_request) {
		alert('Cannot create XMLHTTP instance');
		return false;
	}
	http_request.onreadystatechange = function() {
		alertContents(http_request);
	};
	url += (method == "GET") ? parameters : "";
	http_request.open(method, url, true);
	if (method == "POST") {
		// Note: setCharacterEncoding() is not a standard XMLHttpRequest method;
		// the charset is normally declared in the Content-type header instead.
		http_request.setCharacterEncoding("UTF-8");
		http_request.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
		http_request.setRequestHeader("Content-length", parameters.length);
		http_request.setRequestHeader("Connection", "close");
	}
	http_request.send((method == "GET") ? null : parameters);
	ajaxloading++;
}

PS: If I don't encode it and a string value contains an &, it breaks the POST data. This is AJAX, not a regular form, I am afraid...

PPS: I gather it could be set in the POST headers, but what would this be? *googles this in the meantime*...

The solution I came up with should actually be on the JavaScript forum, but I have posted it in this related topic. All the sites said to adapt the backend, and I found one which had the option of adjusting the http_request object's character set, as well as the actual form's accept-charset="UTF-8". These are now both UTF-8, but I still get RiÃ«tte instead of Riëtte when sending encodeURIComponent output over to the PHP handlers to enter into the database. The function I have is an attempt at performing a decodeURIComponent in PHP.

The returned XML document's encoding is also UTF-8. I need to know how I can "untaint" the strings as they are received server side. The JavaScript encodeURIComponent function messes some of the characters up.

There are no more errors on encountering UTF-8 characters, but how do I translate encodeURIComponent output to the actual characters in PHP?
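
On the & problem mentioned above: a small sketch of building the POST body so that each name and value is escaped separately and the only raw & characters left are the separators between pairs. buildFormBody is a hypothetical helper name, and the field names are invented for the example.

```javascript
// Hypothetical helper: build an application/x-www-form-urlencoded body
// from a plain object, escaping every key and value on its own.
function buildFormBody(fields) {
  return Object.keys(fields)
    .map(function (key) {
      return encodeURIComponent(key) + "=" + encodeURIComponent(String(fields[key]));
    })
    .join("&");
}

// Example fields (invented): one value with a non-ASCII letter, one with & and =.
const body = buildFormBody({ name: "Ri\u00EBtte", note: "a&b=c" });

console.log(body); // name=Ri%C3%ABtte&note=a%26b%3Dc
```

A string like this can be passed as the parameters argument of a function such as makeRequest. As noted earlier in the thread, PHP decodes a form-encoded POST body on its own, so the values in $_POST should not need a second decoding pass.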

Basically, what I need to know is: how would I use (and possibly recreate) a PHP 5 version of the JavaScript decodeURIComponent() function? All the results that I get from a web search on "PHP decodeURIComponent", both on Yahoo and Google, are unhelpful.

Okay. A PHP function already exists: utf8_decode, and this was the solution to my problem. This issue has now been RESOLVED. :)
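
A note for later readers on why utf8_decode can appear to fix this: it converts a UTF-8 string to ISO-8859-1. That helps only if the page or database connection is actually being treated as ISO-8859-1, and any character outside that range comes out as "?". Below is a rough JavaScript model of that behavior. toLatin1Bytes is a hypothetical name, and this is only an approximation for well-formed input, not PHP's implementation.

```javascript
// Hypothetical model of what PHP's utf8_decode() does to well-formed text:
// each character becomes a single ISO-8859-1 byte, or "?" (0x3F) when the
// character has no ISO-8859-1 representation.
function toLatin1Bytes(text) {
  const out = [];
  for (const ch of text) { // iterates by code point, not by UTF-16 unit
    const codePoint = ch.codePointAt(0);
    out.push(codePoint <= 0xff ? codePoint : 0x3f);
  }
  return out;
}

// "ë" (U+00EB) fits in ISO-8859-1 and survives as the single byte 0xEB.
console.log(toLatin1Bytes("Ri\u00EBtte")); // [ 82, 105, 235, 116, 116, 101 ]

// "€" (U+20AC) does not fit and is replaced by "?".
console.log(toLatin1Bytes("\u20AC5")); // [ 63, 53 ]
```

If the whole chain (form, request, PHP output, and MySQL connection) is kept in UTF-8, this conversion should not be needed, and it loses any character outside ISO-8859-1.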
