strange output from my form

WayneCa · May 23, 2020

I have a form that I use to send me data that includes the ½ value. Instead of %00BD, I am getting %C2%BD. Can anyone tell me how to correct this, or if it's even correctable?

Wayne

Ingolme · May 23, 2020

I think it looks correct. It is passing a two-byte UTF-8 encoded character. The %## encoding in a URL can only have two hexadecimal digits following the %, so each %## escape represents exactly one byte. If you have a multi-byte character, each byte will be encoded on its own. The server should be able to decode it without a problem as long as it is treating strings as UTF-8.
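For anyone who wants to check this locally, here is a minimal sketch in Python (not what the form itself runs, just an illustration). It assumes only the standard library; `quote` and `unquote` use UTF-8 by default, which matches what the form appears to be sending.

```python
from urllib.parse import quote, unquote

half = "\u00bd"  # the ½ character, code point U+00BD

# UTF-8 turns this single character into two bytes
utf8_bytes = half.encode("utf-8")
print(utf8_bytes.hex())  # c2bd

# Percent-encoding escapes each byte separately as %XX
encoded = quote(half)
print(encoded)  # %C2%BD

# Decoding as UTF-8 gives the original character back
decoded = unquote(encoded)
print(decoded == half)  # True
```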

WayneCa · May 23, 2020

OK. I can accept that answer. However, I am having the data sent to me via email, and when I look at it in my text editor the %C2 character is different from any character I am using. Why does the %C2 portion of the field get encoded to %C2 and not %00?

Ingolme · May 23, 2020

In UTF-8 encoding, any Unicode character with a code higher than 127 is split into multiple bytes. The first byte starts with a run of 1s equal to the total number of bytes in the character, and every following (continuation) byte starts with 10. UTF-8 encoded bytes have structures like the following:

0XXXXXXX
110XXXXX 10XXXXXX
1110XXXX 10XXXXXX 10XXXXXX

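Your character has to be split into two bytes because its Unicode value (U+00BD, decimal 189) is higher than 127. Its binary representation is 10111101. Padded with leading zeros to the 11 bits a two-byte sequence carries (00010 111101), these bits are split into the following two UTF-8 bytes: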

11000010 10111101.

The hexadecimal representation of the above bytes is C2 BD, which is why the form sends %C2%BD.
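The bit splitting above can be reproduced by hand. This is only a sketch of the two-byte case in Python; the variable names are made up for illustration, and the result is compared against Python's built-in encoder as a sanity check.

```python
code_point = 0xBD  # ½ is U+00BD, binary 10111101

# Two-byte form: 110xxxxx 10xxxxxx (room for 11 payload bits)
lead = 0b11000000 | (code_point >> 6)          # upper 5 bits go in the first byte
cont = 0b10000000 | (code_point & 0b00111111)  # lower 6 bits go in the second byte

manual = bytes([lead, cont])

print(format(lead, "08b"), format(cont, "08b"))  # 11000010 10111101
print(manual.hex().upper())                      # C2BD

# Same result as the built-in UTF-8 encoder
print(manual == "\u00bd".encode("utf-8"))        # True
```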

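If you're actually seeing the C2 character (in a reader that assumes Latin-1 or Windows-1252 it shows up as Â in front of the ½), it means that the software reading the bytes is not aware that they are UTF-8 encoded. The email has to contain a header indicating that it is using UTF-8 as the character encoding, for example Content-Type: text/plain; charset=UTF-8.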
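To illustrate both halves of that, here is a rough Python sketch. It is an assumption that a script builds the email; the thread does not say what actually sends it, and the addresses, subject, and sample text are placeholders. The first part shows what a reader sees when it treats the UTF-8 bytes as Latin-1, and the second builds a message whose Content-Type header declares UTF-8.

```python
from email.message import EmailMessage

text = "Quantity: \u00bd"  # sample form value containing ½

# A reader that assumes Latin-1 turns the two UTF-8 bytes into two characters
misread = text.encode("utf-8").decode("latin-1")
print(ascii(misread))  # 'Quantity: \xc2\xbd', i.e. Â followed by ½

# Hypothetical message; addresses and subject are placeholders
msg = EmailMessage()
msg["From"] = "form@example.com"
msg["To"] = "me@example.com"
msg["Subject"] = "Form submission"
msg.set_content(text, charset="utf-8")

# The header now tells the mail reader which encoding to use
print(msg["Content-Type"])
print(msg.get_content_charset())
```

Whatever tool sends the mail, the thing to look for in the raw message is the charset parameter on the Content-Type header.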
