UTF-8 Encoder/Decoder

Convert text to UTF-8 bytes and decode byte sequences

💡 Encode Mode: Convert regular text (like "Hello 🌍" or "Café") into UTF-8 byte sequences
💡 Decode Mode: Convert UTF-8 bytes (like "F0 9F 8C 8D" or "43 61 66 C3 A9") back into readable text

How to Use This Tool

🔄 Understanding Encode vs Decode

πŸ” ENCODE Mode

What it does: Converts regular text → byte sequences

Input example: Hello 🌍

Output example: 48 65 6C 6C 6F 20 F0 9F 8C 8D

Use when: You have text and want to see the bytes

🔓 DECODE Mode

What it does: Converts byte sequences → regular text

Input example: 48 65 6C 6C 6F 20 F0 9F 8C 8D

Output example: Hello 🌍

Use when: You have bytes and want to see the text
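
The two modes are inverse operations. As a rough illustration in Python (a sketch of the same round trip, not the tool's own implementation), the built-in str.encode and bytes.decode reproduce the example above:

```python
text = "Hello 🌍"

# Encode: text -> UTF-8 bytes, shown as space-separated uppercase hex
encoded = text.encode("utf-8")
hex_output = encoded.hex(" ").upper()  # bytes.hex(sep) needs Python 3.8+
print(hex_output)  # 48 65 6C 6C 6F 20 F0 9F 8C 8D

# Decode: hex string -> bytes -> text
decoded = bytes.fromhex(hex_output).decode("utf-8")
print(decoded)  # Hello 🌍
```

The first six bytes are the ASCII characters of "Hello " and the last four are the single globe emoji.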

Encoding Text to UTF-8 Bytes

  1. Select "Encode to UTF-8" mode (default)
  2. Enter your text in the input box (e.g., "Hello 世界" or "Café ☕")
  3. Choose your output format:
    • Hexadecimal - Most common for debugging (e.g., 48 65 6C 6C 6F)
    • Decimal - Numeric byte values (e.g., 72 101 108 108 111)
    • Binary - Binary representation (e.g., 01001000 01100101)
    • URL Encoded - Web-safe format (e.g., %48%65%6C%6C%6F)
  4. The tool automatically encodes as you type
  5. View the character breakdown to see how many bytes each character uses
  6. Click "Copy to Clipboard" to use the encoded bytes elsewhere
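
The four output formats in step 3 are different renderings of the same bytes. A minimal Python sketch follows; the function name format_bytes and the format keys are made up for illustration and are not part of the tool:

```python
def format_bytes(data: bytes, fmt: str) -> str:
    """Render bytes in one of the four output formats (hypothetical helper)."""
    if fmt == "hex":
        return " ".join(f"{b:02X}" for b in data)
    if fmt == "decimal":
        return " ".join(str(b) for b in data)
    if fmt == "binary":
        return " ".join(f"{b:08b}" for b in data)
    if fmt == "url":
        # Percent-encode every byte, matching the %48%65... example
        return "".join(f"%{b:02X}" for b in data)
    raise ValueError(f"unknown format: {fmt}")


data = "Hello".encode("utf-8")
for fmt in ("hex", "decimal", "binary", "url"):
    print(fmt, "->", format_bytes(data, fmt))
```

Note that the "url" branch escapes every byte, as in the example above. Standard URL encoders such as Python's urllib.parse.quote leave unreserved ASCII characters like letters and digits unescaped.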

Decoding UTF-8 Bytes to Text

  1. Click "Decode from UTF-8" mode
  2. Paste your byte sequence in the input box
  3. Select format or use Auto-detect:
    • Auto-detect - Tool will figure out the format (recommended)
    • Or manually select if you know the format
  4. The tool automatically decodes as you type
  5. See the decoded text appear in the output area
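
How the tool's Auto-detect works internally is not documented here. The sketch below is one possible heuristic, written in Python under stated assumptions: the function name parse_bytes and the order of the checks are hypothetical.

```python
import re
from urllib.parse import unquote_to_bytes


def parse_bytes(raw: str) -> bytes:
    """Guess the input format and return the raw bytes (heuristic sketch)."""
    raw = raw.strip()
    if "%" in raw:
        # URL encoded, e.g. %48%65
        return unquote_to_bytes(raw)
    tokens = raw.split()
    if not tokens:
        return b""
    if all(re.fullmatch(r"[01]{8}", t) for t in tokens):
        # Binary, e.g. 01001000 01100101
        return bytes(int(t, 2) for t in tokens)
    if all(re.fullmatch(r"[0-9A-Fa-f]{2}", t) for t in tokens):
        # Hexadecimal, e.g. 48 65
        return bytes(int(t, 16) for t in tokens)
    if all(re.fullmatch(r"[0-9]{1,3}", t) for t in tokens):
        # Decimal, e.g. 72 101 (bytes() raises ValueError above 255)
        return bytes(int(t) for t in tokens)
    raise ValueError("unrecognized byte format")


print(parse_bytes("F0 9F 8C 8D").decode("utf-8"))
print(parse_bytes("72 101 108 108 111").decode("utf-8"))
```

Any heuristic of this kind is ambiguous for some inputs: "72 10" is valid as both hexadecimal and decimal, and this sketch would read it as hexadecimal. That ambiguity is a good reason to select the format manually when you know it.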

Example Use Cases

Example 1: Encoding an emoji

Input: 🔐

Output (Hex): F0 9F 94 90

Why 4 bytes? This emoji is code point U+1F510, which lies above U+FFFF, and every code point in that range takes 4 bytes in UTF-8

Example 2: Debugging weird characters

Input: Café

Output (Hex): 43 61 66 C3 A9

Notice: The "é" character takes 2 bytes (C3 A9) while each ASCII character takes 1 byte
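
A per-character breakdown like this can be approximated in a few lines of Python. This is an illustrative sketch, and byte_breakdown is a hypothetical name rather than something the tool exposes:

```python
def byte_breakdown(text: str) -> list[tuple[str, str]]:
    """Pair each character with its UTF-8 bytes as uppercase hex."""
    return [(ch, ch.encode("utf-8").hex(" ").upper()) for ch in text]


# "\u00e9" is the precomposed e-acute (a single code point)
for ch, hex_bytes in byte_breakdown("Caf\u00e9"):
    print(f"{ch!r}: {hex_bytes}")
```

The result depends on how the text is normalized. If the accent arrives as a separate combining character (e followed by U+0301), the same visible word encodes as 43 61 66 65 CC 81 instead.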

Example 3: Decoding URL parameters

Input: %48%65%6C%6C%6F%20%57%6F%72%6C%64

Format: URL Encoded (or Auto-detect)

Output: Hello World
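
For comparison, the same decoding can be done with Python's standard library. urllib.parse.unquote turns percent escapes back into bytes and decodes them as UTF-8 by default:

```python
from urllib.parse import unquote

encoded = "%48%65%6C%6C%6F%20%57%6F%72%6C%64"
decoded = unquote(encoded, encoding="utf-8")
print(decoded)  # Hello World

# Multi-byte characters work the same way: %C3%A9 is the two-byte "é"
accented = unquote("Caf%C3%A9")
print(accented)  # Café
```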

About UTF-8 Encoding

What is UTF-8?

UTF-8 (Unicode Transformation Format - 8-bit) is a variable-width character encoding that can represent every Unicode code point. It uses 1 to 4 bytes per code point and is backward compatible with ASCII: any ASCII text is already valid UTF-8.

How It Works

  • 1-byte characters (ASCII): 0xxxxxxx (0-127) - English letters, numbers, basic symbols
  • 2-byte characters: 110xxxxx 10xxxxxx (128-2047) - European letters with accents, Greek, etc.
  • 3-byte characters: 1110xxxx 10xxxxxx 10xxxxxx (2048-65535) - Most Chinese, Japanese, and Korean characters, many symbols
  • 4-byte characters: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (65536-1114111) - Most emojis, rare characters
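
The bit patterns above can be turned into code directly. The following Python function is a teaching sketch (the name encode_code_point is hypothetical); in practice you would simply call str.encode("utf-8"):

```python
def encode_code_point(cp: int) -> bytes:
    """Encode one Unicode code point to UTF-8 by hand."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if cp < 0x80:      # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:     # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:   # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


# Compare against Python's built-in encoder for one character per length
for ch in ("A", "\u00e9", "\u20ac", "\U0001F512"):
    assert encode_code_point(ord(ch)) == ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {encode_code_point(ord(ch)).hex(' ').upper()}")
```

Each continuation byte carries 6 bits of the code point (the x positions after the 10 prefix), which is why the shifts step down by 6 and the mask is 0x3F.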

Common Problems This Tool Solves

  • Weird characters in web forms - See exactly how your text is encoded
  • Database encoding issues - Debug why special characters look broken
  • URL encoding problems - Convert between URL-safe and readable text
  • Understanding character storage - See why some text takes more space
  • Educational purposes - Learn how computers represent international text
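
A frequent cause of the "weird characters" and "broken" database text mentioned above is UTF-8 bytes being decoded with a different single-byte encoding. The Python sketch below reproduces the effect, assuming Latin-1 as the wrong encoding purely for illustration:

```python
original = "Caf\u00e9"  # Café

# Correct UTF-8 bytes, wrongly decoded as Latin-1: each byte becomes one character
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)  # CafÃ©

# Reversing the wrong step recovers the text, as long as no bytes were lost
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # Café
```

Seeing two odd characters where one accented letter should be (here C3 A9 shown as two symbols) is a strong hint that the bytes are valid UTF-8 and only the decoding step is wrong.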

Character Byte Examples

  • "A" (ASCII): 1 byte → 41 (hex) → 01000001 (binary)
  • "é" (Latin): 2 bytes → C3 A9 (hex) → 11000011 10101001 (binary)
  • "€" (Euro): 3 bytes → E2 82 AC (hex)
  • "🔒" (Lock emoji): 4 bytes → F0 9F 94 92 (hex)
  • "你" (Chinese): 3 bytes → E4 BD A0 (hex)