IDNA2003, IDNA2008, domain and sub-domain registrations during the transitional period
08 Jul 2011
To continue the discussion about THE RISKS OF USING “ESZETT” OR “SHARP S” (“SS”) IN DOMAIN NAME: this character is just one of four deviation characters that will certainly cause mischief and mayhem in the coming years. Here's the deal: the registries and registrars are moving away from the initial specification that allowed Internationalized Domain Names (IDNs) to be registered. It was called IDNA and is now referred to as IDNA2003. They're moving to the new specification, called IDNA2008, although it didn't officially become a standard until 2010. Hang with me, because this may actually affect you, whether you're a registry like DENIC, a browser like Internet Explorer, a second-level registry like tumblr.com, or simply a customer who wants a domain name.
There's a problem with these two specs: they're incompatible, and most of the risk will surface during the "transitional period" when registries are upgrading. The incompatibility is on purpose, mind you, and was deemed the best choice for the decades ahead of us. Eventually we want to get rid of IDNA2003 completely and get the whole world on IDNA2008: registries, Web browsers, anything that processes IDNs.
A lot of characters will be handled differently under the new rules of IDNA2008. Four characters in particular, called the deviation characters, are poised to cause mayhem during the transitional period as domain name registries shift to IDNA2008. Why? Because for a period of time domain names using these characters may actually resolve to two different IP addresses - depending on which IDNA rules the client/Web browser has implemented.
But just what is a domain name registry anyway? To keep it simple, we can think of a registry as the overarching authority for a top-level domain (TLD). Most TLDs are well known, like .com, .net, and .org, and each has a single registry that enforces some rules and manages all domain name records. This also ensures the same domain name can't be registered by more than one party. Registrars, on the other hand, are sort of resellers, like GoDaddy, eNom, and DynDNS, who sell the domain names to customers. They still need to comply with the rules of the registry.
But we can also think of some domains as their own second-level registries. For example, blogspot.com, tumblr.com, and smugmug.com each have millions of customers with their own domain name, like http://google.blogspot.com/. In this way they're acting as registries providing subdomains, one per customer. So all of these second-level registries will also be affected by IDNA's transitional period, if they even decide to offer IDNs in their subdomains. Most don't currently.
The four deviation characters are of particular concern. They are so named because IDNA2003 and IDNA2008 handle them differently:
- U+200C ZERO WIDTH NON-JOINER
- U+200D ZERO WIDTH JOINER
- U+00DF ( ß ) LATIN SMALL LETTER SHARP S
- U+03C2 ( ς ) GREEK SMALL LETTER FINAL SIGMA
The two JOINERs get dropped under IDNA2003 but are valid in IDNA2008 under certain language contexts. The "ß" maps to "ss" under IDNA2003 but does not map under IDNA2008, and the "ς" maps to "σ" under IDNA2003 but does not map under IDNA2008. For a good visual of this see Table 1 of UTS46.
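To make the difference concrete, here is a minimal Python sketch that compares the name each kind of client would send to DNS. The IDNA2003 side uses Python's built-in "idna" codec, which implements the 2003 rules. The IDNA2008 side is only an approximation of my own: the function lookup_name_2008_sketch is a made-up helper that Punycode-encodes each non-ASCII label as-is, with no mapping and none of the validation a real IDNA2008 implementation performs, and it assumes the input is already lowercase.

```python
def lookup_name_2003(domain: str) -> str:
    """ASCII name an IDNA2003 client looks up (stdlib 'idna' codec)."""
    return domain.encode("idna").decode("ascii")


def lookup_name_2008_sketch(domain: str) -> str:
    """Rough stand-in for an IDNA2008 lookup: no mapping, no validation.

    Hypothetical helper for illustration only; assumes lowercase input.
    """
    labels = []
    for label in domain.split("."):
        if label.isascii():
            labels.append(label)
        else:
            # Keep deviation characters and Punycode-encode the label as-is.
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


for name in ("faß.de", "βόλος.com"):
    print(name, "|", lookup_name_2003(name), "|", lookup_name_2008_sketch(name))

# For "faß.de" the two clients ask DNS for different names:
# "fass.de" under IDNA2003 and "xn--fa-hia.de" under the 2008-style sketch.
```

If those two names belong to different owners, the same link lands on different servers depending on which rules the browser follows.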
What does it all mean?
In the end, if you currently have a domain containing "ss" or "σ", you may want to register the domain using the new character supported under IDNA2008, if it suits your market. That's not to say you should by any means. For example, "ssa.gov" probably does not care to register "ßa.gov" since its market is the United States. But a German bank named "Gießen Savings and Loan" that currently owns http://www.sparkasse-giessen.de will certainly want to register http://www.sparkasse-gießen.de.
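If you want to see which alternate spellings exist for a name you already own, a small sketch like the following can list them. The names FOLDED, variants, and deviation_candidates are hypothetical helpers of my own, not part of any IDNA library. The sketch deliberately over-generates: it swaps in "ς" for any "σ", even though final sigma only makes sense at the end of a word, so a human still has to pick the spellings that are real words in their market.

```python
from typing import Iterator, List

# Sequences that IDNA2003 produces by mapping, paired with the deviation
# character that IDNA2008 keeps distinct.
FOLDED = {"ss": "ß", "σ": "ς"}


def variants(label: str) -> Iterator[str]:
    """Yield every spelling of label, including the unchanged one."""
    if not label:
        yield ""
        return
    # Option 1: keep the first character as it is.
    for rest in variants(label[1:]):
        yield label[0] + rest
    # Option 2: substitute a deviation character at this position.
    for folded, deviation in FOLDED.items():
        if label.startswith(folded):
            for rest in variants(label[len(folded):]):
                yield deviation + rest


def deviation_candidates(label: str) -> List[str]:
    """Alternate spellings that only an IDNA2008 client treats as distinct."""
    return sorted(v for v in set(variants(label)) if v != label)


print(deviation_candidates("sparkasse-giessen"))
```

For the bank example this yields three candidates, one of which is "sparkasse-gießen". The recursion is exponential in the number of "ss" and "σ" occurrences, which is fine for ordinary labels but not meant for bulk processing.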
As for the JOINERs, that's a whole other story, but legitimate registrations should only be allowed for certain sequences of Arabic or Indic characters. The use cases are limited to those, and registries will be required to implement those restrictions. However, clients that perform IDNA2008 lookups are not required to implement those restrictions.
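As a rough idea of what such a restriction looks like, here is a sketch of just one of the contextual rules, the Indic one: a joiner is acceptable when the character right before it is a virama (canonical combining class 9). This is my own simplified illustration, with joiners_follow_virama being a hypothetical helper name. It leaves out the second rule that permits ZERO WIDTH NON-JOINER between certain Arabic-script letters, because that rule needs joining-type data that Python's unicodedata module does not expose, so this check would wrongly reject valid names in Arabic script.

```python
import unicodedata

ZWNJ, ZWJ = "\u200c", "\u200d"
VIRAMA_CCC = 9  # canonical combining class shared by virama characters


def joiners_follow_virama(label: str) -> bool:
    """True if every ZWJ/ZWNJ in label directly follows a virama.

    Partial check only: the Arabic joining-type rule for ZWNJ is omitted.
    """
    for i, ch in enumerate(label):
        if ch in (ZWNJ, ZWJ):
            # A joiner at the start of the label has nothing before it.
            if i == 0 or unicodedata.combining(label[i - 1]) != VIRAMA_CCC:
                return False
    return True


# Devanagari KA + VIRAMA + ZWJ + SSA passes; a joiner between Latin letters fails.
print(joiners_follow_virama("\u0915\u094d\u200d\u0937"))
print(joiners_follow_virama("a\u200db"))
```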
Is my registry IDNA2008 enabled?
I don't know, but I'd suggest checking with them. DENIC decided not to implement the bundling or blocking recommendations and instead gave its customers about 3 weeks to register the alternate domain, the one that resolves to them under IDNA2003 but not under IDNA2008. That seems like a short period of time to me; if you were on vacation you might have missed the chance. But the choice is up to the registry, so check with your registrar or the registry to find out where they are with their upgrade plans.