Character sets


denny911

Hello,

I uploaded my site to test it and noticed that certain characters from my language don't appear correctly. Those characters are č, ć and đ.

When I retrieve records from my database that contain text with those characters, they are displayed as "?". For example, the word "skočiti" is displayed as "sko?iti".

I HAVE TO be able to use those characters, since there will be some records that start with one of them, and I want my users to be able to search by them.

I asked my hosting provider what might be the cause. I was told that they host many sites that display those characters correctly, and that the error must be somewhere in my code or in my use of the wrong character set. I had previously told them that I used the "windows-1250" charset, and they told me to try "utf-8" (it didn't work).

By the way, "windows-1250" is the only charset that can display those characters as they are (i.e. without typing, say, "&#269;" in the source code for "č"), but that only works if the text doesn't come from the database.

Also, when I try to update a record from my site (using a form, of course), replacing a plain "c" with "č" or "ć" for example, the UPDATE query doesn't work! INSERT doesn't work either if the text contains any of "č", "ć" or "đ". However, "ž" and "š" work correctly.

So, if anyone knows or can guess what the problem might be, feel free to post. It will be much appreciated.

Thanks in advance!

I just hate dealing with encodings. An encoding problem can have many different causes. Let's review the places where you need to make sure the encoding is consistent:

  • The saved PHP file (settable when saving the PHP file)
  • Internal PHP encoding (only settable by some objects for use within them)
  • The database (settable within the DB engine itself)
  • The PHP file output (settable with server configuration, the HTTP PHP extension and, I think, the iconv extension)
  • The rendering of the output in the browser (settable with a meta element or, in the case of XML, with the XML prolog)
  • If input is sent, its encoding (settable within the form element, I think)
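
To make the effect of a single mismatched layer concrete, here is a minimal sketch. It is written in Python rather than PHP purely for illustration, and it is only a guess at the mechanism, not a diagnosis of this particular site. It assumes that one layer in the chain (for example a database column or connection) only understands the Western European code page cp1252; the helper name through_layer is hypothetical.

```python
def through_layer(text: str, layer_charset: str) -> str:
    """Simulate passing text through a layer that can only store `layer_charset`.

    Characters that the charset cannot represent are replaced with '?',
    which is what many systems do when a conversion is lossy.
    """
    return text.encode(layer_charset, errors="replace").decode(layer_charset)


if __name__ == "__main__":
    # Assumption: the mismatched layer uses cp1252 (Western European).
    print(through_layer("skočiti", "cp1252"))  # sko?iti
    print(through_layer("šžčćđ", "cp1252"))    # šž???

    # Central European (cp1250) and UTF-8 can both hold all five letters.
    print(through_layer("šžčćđ", "cp1250"))    # šžčćđ
    print(through_layer("šžčćđ", "utf-8"))     # šžčćđ
```

Under that assumption the output matches the symptoms described in the question: cp1252 happens to contain š and ž but not č, ć or đ, so only the latter three are lost.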

A general rule of thumb is to use a single encoding everywhere, and to pick the most widely supported one available. I use UTF-8 everywhere, and since I only need Latin and Cyrillic characters, I have never needed anything else. If all else fails, recode everything to and from entities.

When they told you to try UTF-8, are you sure you understood them correctly? You need to save the file itself as UTF-8, not just declare UTF-8 as the encoding in a meta element or something like that.

As far as I'm aware, PHP does not convert the source file for you: it treats strings as plain sequences of bytes unless an extension or function is explicitly told otherwise, so string literals come out in whatever encoding the file was saved in.

If you don't mind the performance penalty, you could use iconv() to convert from one charset to another, or htmlentities() as a last resort to convert otherwise unrepresentable characters to HTML entities. Note that if you do the latter, you must do it before the point where the characters are turned into "?", which, as the list above shows, could be any one of several places.
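
The two fallbacks mentioned above (recoding between charsets, and falling back to entities) can be sketched roughly as follows. This is again Python used only for illustration; recode and to_entities are hypothetical helper names, and to_entities emits numeric character references, which is a related technique rather than a reproduction of what PHP's htmlentities() does.

```python
def recode(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from one charset to another (roughly the job iconv() does)."""
    return data.decode(source).encode(target)


def to_entities(text: str) -> str:
    """Replace every non-ASCII character with a numeric HTML entity."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # Bytes as a windows-1250 (cp1250) page or form might deliver them.
    raw = "skočiti".encode("cp1250")
    utf8 = recode(raw, "cp1250", "utf-8")
    print(utf8.decode("utf-8"))       # skočiti

    # Entity fallback applied while the character is still intact.
    print(to_entities("skočiti"))     # sko&#269;iti

    # Applied too late, after a lossy step has already produced '?':
    lossy = "skočiti".encode("cp1252", errors="replace").decode("cp1252")
    print(to_entities(lossy))         # sko?iti
```

The last case is the reason the order matters: once a lossy step has replaced the character with "?", no later conversion can bring it back.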

http://dev.mysql.com/tech-resources/articl....1/unicode.html

Take a look at that. If it doesn't help, search for "encoding" or "encodings" (whichever form suits your purposes) and look through the results. I'm sure that if you set the database to the right encoding, there shouldn't be any problem. Otherwise, I'm at a loss as to why this would be happening.
