Unicode attacks and test cases - Visual Spoofing, IDN homograph attacks, and the Confusables
29 Nov 2008
Let's face it, playing tricks that mess with people's perception can be fun. With Unicode, there are lots of fun tricks to be had. What's to stop someone from believing the following is what it appears to be:
www.аmazon.com
Looks like amazon.com of course, but it's not. The first 'a' is the Cyrillic small letter a (U+0430), not the Latin small letter 'a' (U+0061). The two look identical, but they come from two different scripts. Confused? Good. Now hover your mouse over the link above, but don't click it: I don't know where it goes, and it probably isn't nice. In your browser's status bar you should see the Punycode-encoded version of the domain name:
http://www.xn--mazon-3ve.com/
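To convince yourself that the two letters really are different characters, you can ask Python's standard `unicodedata` module for their code points and names. This is only a rough sketch: the helper names `describe` and `scripts_used` are made up for illustration, and taking the first word of a character's Unicode name is a crude stand-in for a real script lookup.

```python
import unicodedata


def describe(label):
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in label
    ]


def scripts_used(label):
    """Crude script guess: the first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }


spoofed = "\u0430mazon"  # Cyrillic small letter a, then Latin "mazon"
genuine = "amazon"       # all Latin

for row in describe(spoofed)[:2]:
    print(row)
# ('а', 'U+0430', 'CYRILLIC SMALL LETTER A')
# ('m', 'U+006D', 'LATIN SMALL LETTER M')

print(sorted(scripts_used(spoofed)))  # ['CYRILLIC', 'LATIN']
print(sorted(scripts_used(genuine)))  # ['LATIN']
```

A label that mixes scripts like this is not automatically malicious, but it is a cheap signal worth flagging.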
Because DNS does not support Unicode (only a subset of ASCII characters is allowed in host names), we have the IDN (Internationalized Domain Name) standards, which define how domain names containing Unicode characters should be encoded. Punycode is the name of the encoding mechanism.
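Here is a small sketch of that encoding step using the `idna` and `punycode` codecs that ship with Python. Keep in mind that the built-in `idna` codec follows the older IDNA 2003 rules, so a browser or registry applying newer rules may treat some names differently; the variable names are just for illustration.

```python
# The spoofed host name from above: the first 'a' is Cyrillic (U+0430).
spoofed = "www.\u0430mazon.com"

# The "idna" codec converts each non-ASCII label to its "xn--" form.
ascii_form = spoofed.encode("idna").decode("ascii")
print(ascii_form)  # www.xn--mazon-3ve.com

# The raw Punycode of a single label, without the "xn--" prefix.
label = "\u0430mazon".encode("punycode").decode("ascii")
print(label)  # mazon-3ve

# Decoding goes the other way and gives back the Unicode name.
round_trip = ascii_form.encode("ascii").decode("idna")
print(round_trip == spoofed)  # True
```

The ASCII characters of the label are kept up front ("mazon"), and everything after the last hyphen encodes which non-ASCII character goes where.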
The above is often referred to as an IDN homograph attack. Aside from spoofing with lookalike characters from completely different alphabets, we can do a bunch of spoofing just within our own alphabet. For example, certain fonts make some combinations of characters hard to tell apart. The letters 'r' and 'n' together can look like the letter 'm' (rn == m), a zero can look like the letter 'O', and the number 1 can look like a lowercase 'l'. So you wind up with lots of clever visual attacks:
- www.rnu11ets.com looks a lot like www.mullets.com
- www.rnu11ets.com looks a lot like www.mullets.com
- www.rnu11ets.com looks a lot like www.mullets.com
- www.rnu11ets.com looks a lot like www.mullets.com
- www.rnu11ets.com looks a lot like www.mullets.com
- www.rnu11ets.com looks a lot like www.mullets.com
- www.rnu11ets.com looks a lot like www.mullets.com
- www.rnu11ets.com looks a lot like www.mullets.com
I've listed the same text here in several different fonts, because in some fonts, you wouldn't be able to tell the visual difference between the two words. The visual appearance of characters has a lot to do with the fonts used to display the glyph, not just the alphabet.
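One way to catch this kind of same-alphabet lookalike is to fold every name into a "skeleton" and compare skeletons instead of raw strings. The sketch below uses a tiny hand-picked table covering only the three tricks mentioned above; the table and the names `LOOKALIKES`, `skeleton`, and `looks_like` are assumptions for illustration. A real check would use a much larger mapping, such as the Unicode Consortium's confusables data.

```python
# Hypothetical, hand-picked lookalike table; nowhere near complete.
# Each pair is (sequence that can be mistaken, what it is mistaken for).
LOOKALIKES = [
    ("rn", "m"),  # 'r' + 'n' can read as 'm'
    ("0", "o"),   # zero can read as the letter o
    ("1", "l"),   # one can read as a lowercase L
]


def skeleton(name):
    """Fold a name so that visually similar spellings compare equal."""
    folded = name.lower()
    for sequence, replacement in LOOKALIKES:
        folded = folded.replace(sequence, replacement)
    return folded


def looks_like(candidate, trusted):
    """True if candidate differs from trusted but folds to the same skeleton."""
    return candidate != trusted and skeleton(candidate) == skeleton(trusted)


print(skeleton("www.rnu11ets.com"))                       # www.mullets.com
print(looks_like("www.rnu11ets.com", "www.mullets.com"))  # True
print(looks_like("www.mullets.com", "www.mullets.com"))   # False
```

The skeleton is only a comparison key, not something to show to users: a perfectly innocent name containing "rn" gets folded too, which is fine as long as both sides of the comparison are folded the same way.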
https://www.dw-formmailer.de/index.php?action=convert
Interestingly, Microsoft Word 2007 will also do the conversions: type a character and press Alt+X to get its code point, or type the hexadecimal code point and press Alt+X to get the character.
E.g. entering 0430 and then hitting Alt+X gives the Cyrillic a (а).
-Michael
Too bad, so it seems to me any sort of IDN will have little hopes of doing it?