🔣 Unicode Character Lookup
By ToolNimba Editorial Team · Updated 2026-06-19
This Unicode character lookup breaks any text down into the individual characters that make it up, showing each one as a Unicode code point (U+XXXX), its decimal code, and the numeric HTML entity you can paste into a web page. Paste a string to inspect it character by character, or switch to reverse mode and type a code point to build the character it represents. Everything runs in your browser, so the text you paste never leaves your device.
What is the Unicode Character Lookup?
Every character you can type, from a plain letter to an emoji, has a number assigned to it by the Unicode standard. That number is called a code point, and it is usually written in the form U+ followed by four or more hexadecimal digits, so the capital letter A is U+0041 and the star ★ is U+2605. The same number can also be written in plain decimal (A is 65) or as a numeric HTML entity (&#65;), and this tool shows all three side by side for each character.
A subtle but important point is the difference between a character and a code unit. JavaScript strings store text as UTF-16, where most common characters take one 16-bit code unit but characters above U+FFFF (such as most emoji) take two, called a surrogate pair. A naive loop over a string counts those code units and would split an emoji in half. This tool uses codePointAt and iterates by code point (via Array.from), so an emoji like 😀 (U+1F600) is reported as a single character, not two broken pieces.
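The gap between code units and code points can be seen with built-in JavaScript alone. This is a minimal sketch, not the tool's actual source; the helper name countUnits is hypothetical.

```javascript
// Hypothetical helper: compare UTF-16 code units with code points.
function countUnits(text) {
  return {
    codeUnits: text.length,              // .length counts 16-bit code units
    codePoints: Array.from(text).length, // the string iterator walks by code point
  };
}

const sample = "A\u{1F600}"; // "A" followed by the grinning face emoji
console.log(countUnits(sample)); // { codeUnits: 3, codePoints: 2 }

// Index 1 is the first half of the surrogate pair.
console.log(sample.charCodeAt(1).toString(16));  // d83d  (high surrogate only)
console.log(sample.codePointAt(1).toString(16)); // 1f600 (the full code point)
```

Note that codePointAt still takes an index measured in code units, which is why iterating with Array.from or for...of is the safer way to walk a string.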
The reverse direction uses String.fromCodePoint, which turns a number back into the matching character. That is handy when you have a code point from a font chart, a spec, or another tool and want to see (and copy) the actual glyph. You can enter the code in whatever form you have it: the U+ notation, raw hexadecimal with a 0x prefix, or a plain decimal number, and you can list several at once to build a short string.
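One way the reverse direction could be written is sketched below. The function names parseCodePoint and buildString are hypothetical, and rejecting lone surrogates is an assumption of this sketch; the tool itself may treat such input differently.

```javascript
// Hypothetical parser for the three input forms: U+XXXX, 0xXXXX, or decimal.
function parseCodePoint(token) {
  const t = token.trim();
  const hex = /^(?:U\+|0x)([0-9a-f]{1,6})$/i.exec(t);
  let value;
  if (hex) {
    value = parseInt(hex[1], 16);
  } else if (/^[0-9]+$/.test(t)) {
    value = parseInt(t, 10);
  } else {
    throw new RangeError(`Unrecognized code point: ${token}`);
  }
  // Assumption: reject values beyond U+10FFFF and lone surrogates.
  if (value > 0x10ffff || (value >= 0xd800 && value <= 0xdfff)) {
    throw new RangeError(`Not a valid code point for a character: ${token}`);
  }
  return value;
}

// Split on spaces or commas, then build one string from all the code points.
function buildString(input) {
  const tokens = input.split(/[\s,]+/).filter(Boolean);
  return String.fromCodePoint(...tokens.map(parseCodePoint));
}

console.log(buildString("U+2605, 0x4F60 128512")); // ★你😀
```

String.fromCodePoint throws a RangeError on its own for values above 0x10FFFF; the explicit check here only makes the error message friendlier and adds the surrogate rule.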
When to use it
- Finding the exact code point of a symbol so you can reference it in code, CSS content, or documentation.
- Spotting invisible or look-alike characters (a non-breaking space, a smart quote, a zero-width joiner) that are breaking your text or a regex.
- Converting a code point from a font chart or specification back into the visible character it represents.
- Generating the numeric HTML entity (&#NNNN;) for a character so it renders reliably regardless of page encoding.
- Checking whether a string contains astral characters like emoji that take two UTF-16 code units.
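For the invisible and look-alike case, a scan along these lines can flag suspicious characters. It is only a sketch: the SUSPECTS list and the findSuspects name are hypothetical, and the list is deliberately short rather than exhaustive.

```javascript
// Hypothetical watch list of characters that often hide in pasted text.
const SUSPECTS = new Map([
  [0x00a0, "no-break space"],
  [0x200b, "zero width space"],
  [0x200d, "zero width joiner"],
  [0x2018, "left single quotation mark"],
  [0x2019, "right single quotation mark"],
  [0x201c, "left double quotation mark"],
  [0x201d, "right double quotation mark"],
  [0xfeff, "zero width no-break space (BOM)"],
]);

// Report each suspect with its position, counted in code points.
function findSuspects(text) {
  const hits = [];
  let index = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (SUSPECTS.has(cp)) {
      hits.push({ index, codePoint: cp, name: SUSPECTS.get(cp) });
    }
    index += 1;
  }
  return hits;
}

// "a", a no-break space, "b", a curly opening quote, "c"
console.log(findSuspects("a\u00A0b\u201Cc"));
```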
How to use the Unicode Character Lookup
- Keep the default Text to code points mode and paste or type the text you want to inspect.
- Read the table: each row shows the character, its U+XXXX code point, the decimal code, and the HTML entity.
- To go the other way, switch to Code point to character mode.
- Enter a code point as U+2605, 0x4F60, or a decimal like 9733 (list several separated by spaces or commas).
- Copy the resulting character or any of the codes straight from the result.
Formula & method
Worked examples
Look up the star symbol ★.
- The character ★ has codePointAt(0) = 9733 (decimal)
- 9733 in hexadecimal is 2605
- Pad to four digits and prefix U+, giving U+2605
- The numeric HTML entity uses the decimal value: &#9733;
Result: ★ = U+2605, decimal 9733, HTML entity &#9733;
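The steps in this example can be sketched as a small function. The name describe is hypothetical and the code is an illustration of the method, not the tool's own implementation.

```javascript
// Hypothetical helper: one row per code point, mirroring the result table.
function describe(text) {
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    return {
      char: ch,
      // Hexadecimal, upper case, padded to at least four digits.
      codePoint: "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
      decimal: cp,
      // The numeric HTML entity uses the decimal value.
      entity: `&#${cp};`,
    };
  });
}

const [star] = describe("\u2605"); // the black star
console.log(star.codePoint, star.decimal, star.entity); // U+2605 9733 &#9733;
```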
Reverse a code point: build the character for U+1F600.
- Read the code point 1F600 (hexadecimal)
- 1F600 in decimal is 128512
- String.fromCodePoint(128512) returns the character
- This value is above U+FFFF, so in UTF-16 it is stored as two code units (a surrogate pair) but is still one character
Result: U+1F600 = 😀 (grinning face), decimal 128512, HTML entity &#128512;
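To make the surrogate pair concrete, the two code units can be derived with the standard UTF-16 arithmetic. The function name toSurrogatePair is hypothetical; JavaScript does this internally, so the sketch is purely illustrative.

```javascript
// Split a code point above U+FFFF into its UTF-16 high and low surrogates.
function toSurrogatePair(cp) {
  if (cp < 0x10000 || cp > 0x10ffff) {
    throw new RangeError("Only code points U+10000 to U+10FFFF use a surrogate pair");
  }
  const offset = cp - 0x10000;            // 20-bit value
  const high = 0xd800 + (offset >> 10);   // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);  // bottom 10 bits
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1f600);
console.log(high.toString(16), low.toString(16)); // d83d de00

// The built-in conversion stores the same two code units.
const face = String.fromCodePoint(0x1f600);
console.log(face.length); // 2
```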
Common characters and their Unicode code points
| Character | Code point | Decimal | HTML entity |
|---|---|---|---|
| A | U+0041 | 65 | &#65; |
| a | U+0061 | 97 | &#97; |
| (space) | U+0020 | 32 | &#32; |
| © copyright | U+00A9 | 169 | &#169; |
| € euro sign | U+20AC | 8364 | &#8364; |
| ★ black star | U+2605 | 9733 | &#9733; |
| 😀 grinning face | U+1F600 | 128512 | &#128512; |
How a code point can be written
| Notation | Example for the euro sign | Where you see it |
|---|---|---|
| U+ hexadecimal | U+20AC | Unicode charts, specifications |
| 0x hexadecimal | 0x20AC | Source code, escapes |
| Decimal | 8364 | Plain numeric references |
| Numeric HTML entity | &#8364; | HTML and XML markup |
Common mistakes to avoid
- Counting code units instead of characters. A string length in JavaScript counts UTF-16 code units, not characters. An emoji such as 😀 has length 2 because it is a surrogate pair, even though it is one character. This tool iterates by code point so each emoji is reported once, and it also shows the code unit count for clarity.
- Mixing up hexadecimal and decimal. The U+XXXX form is hexadecimal, while the HTML entity &#NNNN; is decimal. U+0041 and &#65; are the same character (A), not different ones. Reading 0041 as a decimal number would point at the wrong character entirely.
- Confusing &#NNNN; with &#xNNNN;. A numeric HTML entity can be decimal (&#9733;) or hexadecimal with an x (&#x2605;). Both produce ★. Dropping the x but keeping the hex digits, as in &#2605;, points at a completely different character (ਭ, U+0A2D).
- Assuming look-alike characters are identical. A straight quote, a curly quote, and a prime all look similar but have different code points. Pasting text into this lookup reveals which one you actually have, which often explains a failing search or comparison.
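The decimal versus hexadecimal entity mix-up is easy to reproduce in code. The sketch below handles only numeric entities, and the function name decodeNumericEntity is hypothetical.

```javascript
// Decode a single numeric HTML entity such as &#9733; or &#x2605;.
function decodeNumericEntity(entity) {
  const match = /^&#(x[0-9a-f]+|[0-9]+);$/i.exec(entity);
  if (!match) {
    throw new SyntaxError(`Not a numeric HTML entity: ${entity}`);
  }
  const body = match[1];
  // A leading x (or X) marks hexadecimal; otherwise the digits are decimal.
  const cp = /^x/i.test(body) ? parseInt(body.slice(1), 16) : parseInt(body, 10);
  return String.fromCodePoint(cp);
}

console.log(decodeNumericEntity("&#9733;"));  // ★ (decimal 9733)
console.log(decodeNumericEntity("&#x2605;")); // ★ (hexadecimal 2605)
console.log(decodeNumericEntity("&#2605;"));  // ਭ (decimal 2605 is U+0A2D)
```

Comparing code points rather than glyphs also settles the look-alike case: a straight quote is U+0027, while a curly quote is U+2019 and a prime is U+2032.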
Glossary
- Unicode
- The standard that assigns a unique number, called a code point, to every character across the world's writing systems, symbols, and emoji.
- Code point
- The number Unicode assigns to a character, written as U+ followed by hexadecimal digits, for example U+0041 for the letter A.
- Code unit
- The fixed-size piece a string is stored in. In UTF-16 (used by JavaScript) each unit is 16 bits, and characters above U+FFFF use two units.
- Surrogate pair
- Two UTF-16 code units that together encode a single code point above U+FFFF, such as most emoji.
- HTML entity
- A way to write a character in HTML by its number, decimal as &#65; or hexadecimal as &#x41;, so it renders regardless of encoding.
- Hexadecimal
- A base-16 number system using digits 0 to 9 and letters A to F. Code points are conventionally written in hexadecimal.
Frequently asked questions
What is a Unicode code point?
A code point is the unique number Unicode assigns to a character. It is written as U+ followed by hexadecimal digits, for example U+0041 for the capital letter A or U+1F600 for the grinning face emoji. The same value can also be shown in decimal or as an HTML entity.
How do I find the code point of a character?
Paste the character into the Text to code points box. The tool reads it with codePointAt and shows the U+XXXX code point, the decimal code, and the numeric HTML entity for every character in your input.
How do I turn a code point back into a character?
Switch to Code point to character mode and enter the value as U+2605, 0x2605, or the decimal 9733. The tool uses String.fromCodePoint to build the matching character, and you can list several codes at once to make a short string.
Why does an emoji count as more than one in some tools?
Many tools count UTF-16 code units rather than characters. Emoji and other characters above U+FFFF are stored as two code units, called a surrogate pair, so a naive length counts them as two. This lookup counts by code point, so each emoji is treated as one character.
What is the difference between U+0041 and &#65;?
They represent the same character, the letter A. U+0041 is the hexadecimal code point notation, while &#65; is the decimal numeric HTML entity. 41 in hexadecimal equals 65 in decimal, so both point at the same character.
Is my text sent anywhere?
No. The lookup runs entirely in your browser using built-in JavaScript (codePointAt and String.fromCodePoint). Nothing you type or paste is uploaded or stored, so it is safe to inspect private or sensitive text.