Unicode & Punycode Converter
Translate internationalized domain names (IDNs) between Unicode and Punycode.
What are Unicode and Punycode?
Unicode is a universal character encoding standard that assigns a unique number, called a code point, to each character it covers, allowing computers to consistently represent and manipulate text across writing systems. This includes characters from the Latin, Cyrillic, Arabic, Chinese, Japanese, and Korean scripts, among many others.
While Unicode allows for a vast array of characters, the traditional Domain Name System (DNS) was originally designed to handle only a limited set of ASCII characters in hostnames: the letters A-Z (case-insensitive), the digits 0-9, and the hyphen. This limitation meant that domain names could not directly contain characters from non-Latin alphabets, accented letters, or special symbols.
This is where Punycode comes in. Punycode is an encoding, defined in RFC 3492, that represents Unicode characters using only the limited ASCII character set that is compatible with the DNS. It allows internationalized domain names (IDNs) to be registered and resolved using the existing DNS infrastructure.
How Punycode Works:
- In a domain name, every Punycode-encoded label begins with the prefix xn--. This prefix tells IDN-aware software, such as browsers, that the characters that follow are Punycode-encoded; the DNS itself treats the label as ordinary ASCII.
- The part after xn-- is the ASCII representation of the label's Unicode characters.
- For example, the Unicode domain bücher.example becomes xn--bcher-kva.example in Punycode.
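The points above can be sketched in a few lines of Python using the standard library's punycode codec. This is a minimal illustration, not how this converter is implemented: the function names are hypothetical, and the sketch skips the normalization and validation steps that a full IDNA implementation performs before encoding.

```python
ACE_PREFIX = "xn--"  # prefix that marks an encoded label


def label_to_punycode(label: str) -> str:
    """Encode one label; pure-ASCII labels pass through unchanged."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def label_to_unicode(label: str) -> str:
    """Decode one label if it carries the xn-- prefix."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


def domain_to_punycode(domain: str) -> str:
    # Each dot-separated label is encoded on its own.
    return ".".join(label_to_punycode(part) for part in domain.split("."))


def domain_to_unicode(domain: str) -> str:
    return ".".join(label_to_unicode(part) for part in domain.split("."))


print(domain_to_punycode("bücher.example"))        # xn--bcher-kva.example
print(domain_to_unicode("xn--bcher-kva.example"))  # bücher.example
```

Only the label containing non-ASCII characters changes; example is already ASCII, so it is left as it is.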
Use Cases:
- Internationalized Domain Names (IDNs): Allows people to register and use domain names in their native languages and scripts.
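- Email Addresses: Enables the domain part of an email address (the part after the @) to contain non-ASCII characters. Punycode does not apply to the part before the @.
- URLs: Facilitates the use of non-ASCII characters in the host name of a web address. Other parts of a URL, such as the path, use percent-encoding instead.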
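For the email and URL cases, only the domain or host part is converted. The sketch below shows one way this might look in Python, assuming the built-in idna codec (which follows the older IDNA 2003 rules) is acceptable for the input. The helper names are hypothetical, and the URL helper deliberately ignores any user:password@ component.

```python
from urllib.parse import urlsplit, urlunsplit


def email_with_ascii_domain(address: str) -> str:
    """Convert only the domain part of an address; the local part is untouched."""
    local, _, domain = address.rpartition("@")
    return f"{local}@{domain.encode('idna').decode('ascii')}"


def url_with_ascii_host(url: str) -> str:
    """Convert only the host of a URL; scheme, port, path, and query are kept."""
    parts = urlsplit(url)
    host = parts.hostname.encode("idna").decode("ascii")
    # Assumption: no userinfo in the URL; it would be dropped here.
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


print(email_with_ascii_domain("info@bücher.example"))
print(url_with_ascii_host("https://bücher.example/katalog?q=1"))
```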
This converter is a handy tool for developers, system administrators, and anyone working with internationalized web content to quickly translate between the human-readable Unicode form and the DNS-compatible Punycode form.