IDN spoofing tests
08 Dec 2008
Whole-script spoofing
www.аЬс.com using Cyrillic script for domain label
www.ігѕ.com using Cyrillic script for domain label
ᎳᎳᎳ.lookout.net using Cherokee script for subdomain label
ᗯᗯᗯ.lookout.net using Canadian Aboriginal script for subdomain label
www.lookout.ᎷᎬ using Cherokee script for TLD
www.lookout.сом using Cyrillic script for TLD
Mixed-script spoofing
www.oracᇉ.com // Mixed ASCII and Hangul script, U+11C9 looks like 'LE'
www.I♥NY.com using ♥ for domain which is designated Common script
www.Αᑭᑭle.com using Canadian Aboriginal script for letter 'p'
www.Α⍴⍴le.com using common script APL functional symbol for letter 'p'
www.faϲebook.com using Greek script for letter 'c'
Ꮃww.lookout.net using Cherokee script for subdomain label
ᗯww.lookout.net using Canadian Aboriginal script for subdomain label
www.lookout.сom using Cyrillic script for TLD
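The whole-script and mixed-script cases above can be told apart mechanically by looking at which scripts occur in a label. Python's standard unicodedata module does not expose the Unicode Script property, so the sketch below assumes that the first word of a character's Unicode name is a good-enough stand-in for its script. The names guess_script, label_scripts, classify_label and the KNOWN_SCRIPTS set are hypothetical and only for illustration; this is not how any particular browser decides.

```python
import unicodedata

# Assumption: an illustrative subset of scripts, keyed by the first word
# of the Unicode character name (e.g. "CYRILLIC SMALL LETTER A").
KNOWN_SCRIPTS = {"LATIN", "CYRILLIC", "GREEK", "CHEROKEE", "CANADIAN", "HANGUL"}


def guess_script(ch):
    """Guess a script from the character name; None for anything else."""
    name = unicodedata.name(ch, "")
    first_word = name.split(" ", 1)[0] if name else ""
    return first_word if first_word in KNOWN_SCRIPTS else None


def label_scripts(label):
    """Return the set of guessed scripts used in one domain label."""
    return {s for s in map(guess_script, label) if s is not None}


def classify_label(label):
    """Roughly sort a label into the categories used in the test list."""
    scripts = label_scripts(label)
    if len(scripts) > 1:
        return "mixed-script"
    if scripts and scripts != {"LATIN"}:
        return "whole-script (non-Latin)"
    return "latin-or-common"


if __name__ == "__main__":
    # Escapes are used so the code points are unambiguous.
    for label in ["\u0430\u042c\u0441", "\u13b3ww", "fa\u03f2ebook", "lookout"]:
        print(ascii(label), classify_label(label), sorted(label_scripts(label)))
```

Characters outside the listed scripts (symbols such as ♥ or ⍴, digits, hyphens) are ignored by this heuristic, so labels that mix Latin with Common-script symbols are not flagged; a real check would need the actual Script property and a confusables table.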
Single-script spoofing
sweet⒗com // using ⒗ Common script for domain and full stop
www․lookout.nͤͭ // using Inherited combining diacritical marks for TLD
www․lOOkout.net // using Latin capital O for o :)
www․looĸout.net // using Latin letter kra 'ĸ' for 'k'
ɯɯɯ․looĸout.net // using Latin letter turned 'm' for 'w' in subdomain and kra 'ĸ' for 'k' in domain
Normalization tests
www․lookout.net // using \u2024 one dot leader
www‥lookout.net // using \u2025 two dot leader
www.lookout‧net // using \u2027 hyphenation point
www…lookout.net // using \u2026 horizontal ellipsis
http://www.lookout.net⁄.test.com // using \u2044 fraction slash
www.lⓄⓄkout.сom using Latin Common script for 'oo' in domain label
www.lookout.nⓔt using Latin Common script for 'e' in TLD
www.lookout.net⩴777 // using \u2A74 which decomposes to ::=
㏂lookout.net using Latin Common script '㏂' which decomposes to a.m.
http://test.﹤.com // using \ufe64 small less than sign in domain label
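To see what the normalization cases above turn into, the following minimal sketch applies NFKC compatibility normalization to the individual code points using Python's standard unicodedata module. It is only an approximation of what an IDNA implementation does (real IDNA processing applies its own mapping tables on top of normalization), and nfkc_report and SAMPLES are hypothetical names introduced here.

```python
import unicodedata

# Code points taken from the normalization tests above.
SAMPLES = {
    "\u2024": "one dot leader",
    "\u2025": "two dot leader",
    "\u2026": "horizontal ellipsis",
    "\u2027": "hyphenation point",
    "\u2044": "fraction slash",
    "\u24c4": "circled Latin capital letter O",
    "\u24d4": "circled Latin small letter e",
    "\u2a74": "double colon equal",
    "\u33c2": "square am",
    "\ufe64": "small less-than sign",
}


def nfkc_report(text):
    """Return (NFKC form, whether normalization changed the text)."""
    normalized = unicodedata.normalize("NFKC", text)
    return normalized, normalized != text


if __name__ == "__main__":
    for ch, description in SAMPLES.items():
        normalized, changed = nfkc_report(ch)
        print(f"U+{ord(ch):04X} {description}: {normalized!r} changed={changed}")
```

Under NFKC alone, the hyphenation point and the fraction slash are left unchanged, while the dot leaders, ellipsis, circled letters, U+2A74, U+33C2 and U+FE64 expand to plain ASCII characters, which is what makes them interesting as host-name inputs.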
Prohibited code points tests
Test the prohibited characters from IETF RFC 3454 stringprep.
www .lookout.net // using non-ASCII space chars 00A0
www. lookout.net // using non-ASCII space chars 1680 (Ogham space mark)
www. lookout.net // using ASCII control chars 001F
www․lookout.net // using non-ASCII control chars 06DD; ARABIC END OF AYAH
www․lookout.net // using non-ASCII control chars 180E; MONGOLIAN VOWEL SEPARATOR
www․lookout.net // using non-ASCII control chars 2060; WORD JOINER
www․lookout.net // using non-ASCII control chars FEFF; ZERO WIDTH NO-BREAK SPACE
www․lookout.net // using Non-character code points 1FFFE [NONCHARACTER CODE POINTS]
www․lookout.net // using Surrogate codes D800-DFFF; [SURROGATE CODES]
www․lookout.net // using Surrogate codes D800-DFFF; [SURROGATE CODES]
www․lookout.net // using Inappropriate for plain text FFFA; INTERLINEAR ANNOTATION SEPARATOR
www․look�out.net // using Inappropriate for plain text FFFD; REPLACEMENT CHARACTER
www․look⿰out.net // using Inappropriate for canonical representation 2FF0-2FFB; [IDEOGRAPHIC DESCRIPTION CHARACTERS]
www.looḱout.net // using Change display properties or are deprecated 0341; COMBINING ACUTE TONE MARK
www.lookout.net // using Change display properties or are deprecated 202E; RIGHT-TO-LEFT OVERRIDE
www․lookout.net // using Change display properties or are deprecated 206B; ACTIVATE SYMMETRIC SWAPPING
www.look out.net // using Tagging characters E0001; LANGUAGE TAG
www.look out.net // using Tagging characters E0020-E007F; [TAGGING CHARACTERS]
www.look־out.net // using Characters with bidirectional property "R" or "AL" 05BE
www.lookˮout.net // using Characters with bidirectional property "L" 02EE
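The RFC 3454 categories named in the tests above can be checked with Python's standard stringprep module, which exposes the RFC's tables as in_table_* functions backed by Unicode 3.2 data. The sketch below is only an illustration of looking up which prohibition tables a code point falls into; prohibited_tables, bidi_table and the TABLES mapping are hypothetical names, and this is not a full nameprep implementation.

```python
import stringprep

# RFC 3454 Appendix C tables exercised by the test cases above.
TABLES = {
    "C.1.2": stringprep.in_table_c12,  # non-ASCII space characters
    "C.2.1": stringprep.in_table_c21,  # ASCII control characters
    "C.2.2": stringprep.in_table_c22,  # non-ASCII control characters
    "C.4": stringprep.in_table_c4,     # non-character code points
    "C.5": stringprep.in_table_c5,     # surrogate codes
    "C.6": stringprep.in_table_c6,     # inappropriate for plain text
    "C.7": stringprep.in_table_c7,     # inappropriate for canonical representation
    "C.8": stringprep.in_table_c8,     # change display properties or deprecated
    "C.9": stringprep.in_table_c9,     # tagging characters
}


def prohibited_tables(ch):
    """Return the ids of every prohibition table that contains ch."""
    return [table_id for table_id, check in TABLES.items() if check(ch)]


def bidi_table(ch):
    """Return 'D.1' (R or AL), 'D.2' (L), or None for other bidi classes."""
    if stringprep.in_table_d1(ch):
        return "D.1"
    if stringprep.in_table_d2(ch):
        return "D.2"
    return None


if __name__ == "__main__":
    for cp in [0x00A0, 0x1680, 0x001F, 0x06DD, 0x180E, 0x2060, 0xFEFF,
               0x1FFFE, 0xD800, 0xFFFA, 0xFFFD, 0x2FF0, 0x0341, 0x202E,
               0x206B, 0xE0001, 0x05BE, 0x02EE]:
        ch = chr(cp)
        print(f"U+{cp:04X}", prohibited_tables(ch), bidi_table(ch))
```

A code point can appear in more than one table (for example, 206B is listed under both C.2.2 and C.8), which is why the helper returns a list rather than a single id.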
UPDATE: June 2013
Including the same tests using the .ws TLD
Whole-script spoofing
ᎳᎳᎳ.lookout.ws using Cherokee script for subdomain label
ᗯᗯᗯ.lookout.ws using Canadian Aboriginal script for subdomain label
Mixed-script spoofing
Ꮃww.lookout.ws using Cherokee script for subdomain label
ᗯww.lookout.ws using Canadian Aboriginal script for subdomain label
Single-script spoofing
www․lOOkout.ws // using Latin capital O for o :)
www․looĸout.ws // using Latin letter kra 'ĸ' for 'k'
ɯɯɯ․looĸout.ws // using Latin letter turned 'm' for 'w' in subdomain and kra 'ĸ' for 'k' in domain
Normalization tests
www․lookout.ws // using \u2024 one dot leader
www‥lookout.ws // using \u2025 two dot leader
www.lookout‧ws // using \u2027 hyphenation point
www…lookout.ws // using \u2026 horizontal ellipsis
http://www.lookout.ws⁄.test.com // using \u2044 fraction slash
www.lⓄⓄkout.ws using Latin Common script for 'oo' in domain label
www.lookout.ws⩴777 // using \u2A74 which decomposes to ::=
㏂lookout.ws using Latin Common script '㏂' which decomposes to a.m.
Prohibited code points tests
Test the prohibited characters from IETF RFC 3454 stringprep.
www .lookout.ws // using non-ASCII space chars 00A0
www. lookout.ws // using non-ASCII space chars 1680 (Ogham space mark)
www. lookout.ws // using ASCII control chars 001F
www․lookout.ws // using non-ASCII control chars 06DD; ARABIC END OF AYAH
www․lookout.ws // using non-ASCII control chars 180E; MONGOLIAN VOWEL SEPARATOR
www․lookout.ws // using non-ASCII control chars 2060; WORD JOINER
www․lookout.ws // using non-ASCII control chars FEFF; ZERO WIDTH NO-BREAK SPACE
www․lookout.ws // using Non-character code points 1FFFE [NONCHARACTER CODE POINTS]
www․look�out.ws // using Surrogate codes D800-DFFF; [SURROGATE CODES]
www․look�out.ws // using Surrogate codes D800-DFFF; [SURROGATE CODES]
www․lookout.ws // using Inappropriate for plain text FFFA; INTERLINEAR ANNOTATION SEPARATOR
www․look�out.ws // using Inappropriate for plain text FFFD; REPLACEMENT CHARACTER
www․look⿰out.ws // using Inappropriate for canonical representation 2FF0-2FFB; [IDEOGRAPHIC DESCRIPTION CHARACTERS]
www.looḱout.ws // using Change display properties or are deprecated 0341; COMBINING ACUTE TONE MARK
www.lookout.ws // using Change display properties or are deprecated 202E; RIGHT-TO-LEFT OVERRIDE
www․lookout.ws // using Change display properties or are deprecated 206B; ACTIVATE SYMMETRIC SWAPPING
www.look out.ws // using Tagging characters E0001; LANGUAGE TAG
www.look out.ws // using Tagging characters E0020-E007F; [TAGGING CHARACTERS]
www.look־out.ws // using Characters with bidirectional property "R" or "AL" 05BE
www.lookˮout.ws // using Characters with bidirectional property "L" 02EE