
Character Encoding


cyber0ne


I need to build a simple little page that scrapes some Japanese text off of another page, feeds the text through BabelFish, and scrapes the response. Sounds simple enough, but I seem to be having trouble with the character encoding (a subject I'm just now beginning to learn).

The page I'm grabbing is mostly English; it just has some Japanese buried in there. Can anyone point me to some sample code that shows how to grab a properly encoded response from such a page, so that I can feed it through a regexp to parse out the non-English part?


Well, I had the same problem a while back, but with Chinese. My forum is in English, and my members were having problems with Chinese characters in titles. What I did was go to a Chinese forum, view its source, and copy its character encoding declaration into my forum. It worked well. I think you could try the same thing: find a Japanese forum or similar site and copy its character encoding. The encoding did not affect the English characters. Hope this helps :)
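Rather than reading the page source by eye, the declared encoding can also be pulled out in code. This is only a rough sketch: the function name `declared_charset()` and the sample markup are made up for illustration, and it only looks at `<meta>` tags, not at the HTTP `Content-Type` header, which may also carry a charset.

```php
<?php
// Hypothetical helper: find the charset a page declares in a <meta> tag.
// Returns null when no declaration is found.
function declared_charset(string $html): ?string
{
    // Matches both <meta charset="..."> and the older
    // <meta http-equiv="Content-Type" content="text/html; charset=...">.
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([\w-]+)/i', $html, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Made-up sample markup, for illustration only.
$sample = '<html><head><meta http-equiv="Content-Type" '
        . 'content="text/html; charset=Shift_JIS"></head></html>';
echo declared_charset($sample), "\n";
```

If the declared charset turns out not to be UTF-8 and the mbstring extension is available, `mb_convert_encoding($html, 'UTF-8', $charset)` is one way to convert the page before working with it.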


I was thinking of that this morning, ya. The output from the Japanese/English page is definitely using UTF-8, but I can't seem to find a way to scrape it properly in PHP; that's my problem.

I have the entire output of the page coming to me as a string. First I tried grepping out the Japanese part and then running that part through utf8_encode(), but it came out as nonsense (not properly encoded). Then I tried running the entire page through utf8_encode() and then grepping out the Japanese part, but it still wasn't properly encoded. I tried the same stuff with utf8_decode() (that might have been a silly change, or maybe the first one was silly, but like I said I'm just now learning about character encoding), with the same problem.

I guess what I need help with is not the character encoding itself, but just how to handle it properly in my PHP code.
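One likely cause of the nonsense output: utf8_encode() converts from ISO-8859-1 to UTF-8, so running it over bytes that are already UTF-8 encodes them a second time. If the page really is UTF-8, the string can probably be matched as-is with a Unicode-aware regexp. Here is a minimal sketch under that assumption, and also assuming the Japanese text consists of kana and kanji; the function name `extract_japanese()` and the sample string are hypothetical.

```php
<?php
// Hypothetical helper: pull runs of Japanese text out of a UTF-8 string.
// Assumes $html is already valid UTF-8, so neither utf8_encode() nor
// utf8_decode() is applied to it.
function extract_japanese(string $html): array
{
    // The /u modifier makes PCRE treat the pattern and subject as UTF-8.
    // \p{Hiragana}, \p{Katakana} and \p{Han} are Unicode script properties;
    // \x{30FC} is the katakana prolonged sound mark, added explicitly.
    $pattern = '/[\p{Hiragana}\p{Katakana}\p{Han}\x{30FC}]+/u';

    // preg_match_all() returns false on error, e.g. when the subject
    // is not valid UTF-8.
    if (preg_match_all($pattern, $html, $matches) === false) {
        return [];
    }
    return $matches[0];
}

// Made-up sample page, for illustration only.
$page = '<p>Hello</p><p>日本語のテキスト</p><p>World</p>';
print_r(extract_japanese($page));
```

Two things to watch with this approach: the PHP source file itself should be saved as UTF-8, and the page that displays the result should declare UTF-8 too, otherwise correctly extracted text can still show up garbled in the browser.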

