Highly Compressed Emoji Shortcodes

This demo contains WebAssembly and JavaScript content, and may not work correctly on older browsers.

For a more in-depth explanation of what's going on here, check out the writeup on my blog, and the project's (very messy) source code on GitHub.

Pop open the sources tab, subtract the ~15 KB of wasm-pack overhead from the (uncompressed) .wasm file size, and you'll find that this shortcode-to-emoji lookup function only requires ~20 KB of static storage (code + data), with zero dynamic allocation.

For reference, a file of raw (shortcode, emoji) pairs would occupy ~32 KB, and would need to be parsed into a dynamically allocated map to be queried efficiently. Additionally, the hash map generated by the unmodified rust-phf takes up a whopping ~100 KB of .rodata and .data.rel.ro. Admittedly, much of that space is wasted on struct padding, which could be mitigated by refactoring the library to use two arrays-of-structs instead of a single array-of-tuples.
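To make the padding point concrete, here's a minimal sketch of how much space an array of tuples can waste compared to two parallel arrays. The `Entry` type is a made-up example (a 32-bit field next to an 8-bit field), not rust-phf's actual entry layout, and the function names are hypothetical.

```rust
use std::mem::size_of;

// Hypothetical entry, for illustration only: a 32-bit field paired with
// an 8-bit field. This is NOT the entry type rust-phf generates.
pub type Entry = (u32, u8);

/// Bytes used by `n` entries stored as a single array of tuples.
/// Each tuple is rounded up to a multiple of its 4-byte alignment.
pub fn tuple_array_bytes(n: usize) -> usize {
    n * size_of::<Entry>()
}

/// Bytes used by the same `n` entries split into two parallel arrays,
/// one per field, so no per-entry padding is needed.
pub fn split_arrays_bytes(n: usize) -> usize {
    n * size_of::<u32>() + n * size_of::<u8>()
}
```

With this toy layout, each tuple holds 5 bytes of payload but occupies 8, so the split version uses 5/8 of the space. The real savings depend entirely on the actual field types.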

Go on, type in some shortcodes! The colons are optional.
Some crowd favorites include: :+1:, :smile:, :poop:, :tada:

This lookup function is generated from the same dataset used by GitHub.

If you're feeling lucky, you can try spamming some input into the text box to find a false positive! This demo uses 16-bit hashes for the keys, so it ought to be pretty difficult to find any "reasonable" false positives (i.e., ones that are only a mistake or two away from being a valid input).
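Here's a rough sketch of why false positives can happen at all: if the table keeps only a 16-bit hash of each key rather than the key itself, any input whose hash matches a stored one gets accepted. Everything below is an assumption made for illustration: `hash16` is a folded FNV-1a stand-in rather than the demo's real hash, `Table` is a hypothetical type, and the linear scan (plus the `Vec`) replaces whatever slot selection the real, allocation-free lookup uses.

```rust
/// FNV-1a folded down to 16 bits. A stand-in chosen for brevity;
/// the demo's real hash function is not shown here.
pub fn hash16(s: &str) -> u16 {
    let mut h: u32 = 0x811c_9dc5;
    for b in s.bytes() {
        h ^= b as u32;
        h = h.wrapping_mul(0x0100_0193);
    }
    ((h >> 16) ^ (h & 0xffff)) as u16
}

/// Toy table that stores a 16-bit hash per key instead of the key text.
pub struct Table {
    entries: Vec<(u16, &'static str)>,
}

impl Table {
    /// Hashes every shortcode up front and discards the original string.
    pub fn new(pairs: &[(&str, &'static str)]) -> Self {
        Table {
            entries: pairs.iter().map(|&(k, v)| (hash16(k), v)).collect(),
        }
    }

    /// Returns the emoji whose stored hash equals the input's hash.
    /// Since the key text is gone, an unrelated string with the same
    /// 16-bit hash is indistinguishable from the real shortcode.
    pub fn lookup(&self, shortcode: &str) -> Option<&'static str> {
        let fp = hash16(shortcode);
        self.entries.iter().find(|(h, _)| *h == fp).map(|(_, v)| *v)
    }
}
```

Dropping the key text is where the space savings come from in this sketch, and it's also exactly what makes a false positive possible.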

That said, it's pretty easy to programmatically find false positives. In fact, every time you refresh this page, I'll go ahead and find another false positive by randomly generating a bunch of strings and seeing if one of them returns an output! Oh hey, looks like (still calculating...). Neat!
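The brute-force search described above could look something like this. It's a sketch under stated assumptions, not the page's actual script: `XorShift64`, `random_shortcode`, and `find_false_positive` are hypothetical names, the alphabet and length range are arbitrary, and the lookup function and the "is this a real shortcode?" check are passed in as closures so the sketch doesn't depend on the real table.

```rust
/// Tiny xorshift PRNG so the sketch needs no external crates.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        XorShift64(seed.max(1)) // xorshift state must be nonzero
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Builds a random string of 3 to 10 lowercase letters or underscores.
/// The alphabet and length range are arbitrary choices for this sketch.
pub fn random_shortcode(rng: &mut XorShift64) -> String {
    const ALPHABET: &[u8] = b"abcdefghijklmnopqrstuvwxyz_";
    let len = 3 + ((rng.next_u64() >> 33) % 8) as usize;
    (0..len)
        .map(|_| ALPHABET[((rng.next_u64() >> 33) % ALPHABET.len() as u64) as usize] as char)
        .collect()
}

/// Tries up to `max_tries` random strings and returns the first one that
/// `lookup` accepts even though `is_real_key` says it is not a real shortcode.
pub fn find_false_positive<F, G>(
    lookup: F,
    is_real_key: G,
    seed: u64,
    max_tries: usize,
) -> Option<(String, &'static str)>
where
    F: Fn(&str) -> Option<&'static str>,
    G: Fn(&str) -> bool,
{
    let mut rng = XorShift64::new(seed);
    for _ in 0..max_tries {
        let candidate = random_shortcode(&mut rng);
        if is_real_key(&candidate) {
            continue; // a genuine hit, not a false positive
        }
        if let Some(emoji) = lookup(&candidate) {
            return Some((candidate, emoji));
        }
    }
    None
}
```

The `is_real_key` check matters: without it, randomly stumbling onto a genuine shortcode would be misreported as a false positive.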

Oh, and if you're interested in stress-testing the function directly, you can access it by calling window.shortcode_lookup.