
Character Encoding Standard

Last Updated 2025-08-07 UTC+8.

Unicode

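Unicode is a universal character set that assigns a unique number, called a code point, to each character across the world's writing systems. It covers a wide range of languages as well as symbols and emojis. Unicode itself only defines which number belongs to which character; how those numbers are stored as bytes is defined by an encoding such as UTF-8.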
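As a rough illustration of the idea that every character has its own number, the sketch below prints the code point of a few characters in the conventional `U+XXXX` notation. The helper name `code_point_label` is made up for this example; `ord` is Python's built-in for looking up a character's code point.

```python
def code_point_label(ch: str) -> str:
    """Return the Unicode code point of a single character as 'U+XXXX'."""
    # ord() gives the code point as an integer; format it as
    # uppercase hex, zero-padded to at least 4 digits.
    return f"U+{ord(ch):04X}"


if __name__ == "__main__":
    # Escapes are used so the sample does not depend on the file's encoding.
    samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]  # A, e-acute, CJK "middle", grinning face
    for ch in samples:
        print(ch, code_point_label(ch))
```

The same character always maps to the same code point, regardless of how the text is later stored as bytes.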

ASCII

ASCII, which stands for American Standard Code for Information Interchange, is a character encoding standard for digital communications. It defines 128 characters (English letters, digits, punctuation, and control codes), each mapped to a decimal number from 0 to 127, so every character fits in 7 bits.
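The mapping from characters to the numbers 0 to 127 can be checked with a small sketch like the one below. The function name `ascii_code` is hypothetical and chosen only for this example; it rejects anything outside the ASCII range.

```python
def ascii_code(ch: str) -> int:
    """Return the ASCII number (0-127) of a single character."""
    code = ord(ch)
    if code > 127:
        # Characters beyond 127 are not part of ASCII.
        raise ValueError(f"{ch!r} is not an ASCII character")
    return code


if __name__ == "__main__":
    for ch in ["A", "a", "0", " "]:
        print(repr(ch), ascii_code(ch))
```

Calling `ascii_code` with a character such as `"\u00e9"` (e with acute accent) raises `ValueError`, since that character lies outside the 128 ASCII values.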

UTF-8

UTF-8 is a variable-length encoding of Unicode that is backward compatible with ASCII: the 128 ASCII characters are encoded as the same single bytes, while every other character, including emojis and characters from other languages, uses 2 to 4 bytes. This makes UTF-8 a widely adopted choice for encoding text on websites, in emails, and across other digital platforms.
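To make the 1-to-4-byte range concrete, here is a minimal sketch that encodes a few characters with Python's built-in `str.encode` and counts the resulting bytes. The helper name `utf8_length` is an assumption made for this example.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes a character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    # Escapes are used so the sample does not depend on the file's encoding.
    samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]  # A, e-acute, CJK "middle", grinning face
    for ch in samples:
        encoded = ch.encode("utf-8")
        print(ch, utf8_length(ch), encoded.hex(" "))

    # ASCII compatibility: an ASCII character yields the same byte in both encodings.
    print("A".encode("utf-8") == "A".encode("ascii"))
```

The four samples take 1, 2, 3, and 4 bytes respectively, and the final comparison prints `True` because ASCII text is already valid UTF-8.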

References

  1. Daniel Duan. (2020, August 28). Unicode vs UTF-8. https://www.youtube.com/watch?v=Vy2r21kli0Q
  2. W3Schools. (n.d.). HTML Unicode (UTF-8) Reference. https://www.w3schools.com/charsets/ref_html_utf8.asp