Base-62 conversion in Emacs-Lisp

#emacs

So while developing my website (which uses Emacs' org-publish), I decided I wanted base62 permalinks derived from my preexisting org IDs.

I didn't find any Elisp or shell utility for base62 encoding, only JavaScript and Go libraries. By luck, I found prior art in the Elisp function org-id-int-to-b36. In fact, its inverse, org-id-b36-to-int, is already base62-capable.

A quick edit gives us this all-purpose base62 generator:

(defun my-int-to-base62 (integer &optional length)
  "Convert an INTEGER to a base-62 number represented as a string.
If LENGTH is given, pad the string with leading zeroes as needed
so the result is always that long or longer."
  (let ((s "")
        (i integer))
    (while (> i 0)
      (setq s (concat (char-to-string
                       (my-int-to-base62-one-char (mod i 62))) s)
            i (/ i 62)))
    (setq length (max 1 (or length 1)))
    (if (< (length s) length)
        (setq s (concat (make-string (- length (length s)) ?0) s)))
    s))

;; Workhorse for `my-int-to-base62'
(defun my-int-to-base62-one-char (integer)
  "Convert INTEGER between 0 and 61 into one character 0..9, a..z, A..Z."
  ;; Uses chars ?0, ?A, ?a off the ASCII table.  Evaluate those character
  ;; literals and you see important gaps between the character sets:
  ;; 0-9 has codes 48 thru 57
  ;; A-Z has codes 65 thru 90
  ;; a-z has codes 97 thru 122
  ;; Why compose chars to construct the final base62 string?  It's either
  ;; that, or you make a lookup string "0123456789abcdefg...", so you're
  ;; looking something up anyway.  The ASCII table is faster.
  (cond
   ((< integer 10) (+ ?0 integer))
   ((< integer 36) (+ ?a integer -10))
   ((< integer 62) (+ ?A integer -36))
   (t (error "Input was larger than 61"))))

And voila! Evaluating

(my-int-to-base62 16777215)

gives "18owf".
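To double-check the encoder, it helps to have something going the other way. Here is a minimal sketch of a decoder, assuming the same digit order (0..9, a..z, A..Z). The name my-base62-to-int is my own invention for illustration, not something from org-id.

```elisp
;; Hypothetical inverse of `my-int-to-base62'; a sketch, not library code.
(defun my-base62-to-int (string)
  "Convert STRING, a base-62 number using 0..9, a..z, A..Z, to an integer."
  (let ((result 0))
    ;; Walk the digits left to right, shifting the running total by one
    ;; base-62 place before adding each digit's value.
    (dolist (char (string-to-list string) result)
      (setq result
            (+ (* result 62)
               (cond
                ((<= ?0 char ?9) (- char ?0))
                ((<= ?a char ?z) (+ 10 (- char ?a)))
                ((<= ?A char ?Z) (+ 36 (- char ?A)))
                (t (error "Not a base-62 digit: %c" char))))))))

(my-base62-to-int "18owf")  ; => 16777215
```

Feeding it the output of the encoder should give back the integer you started with.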

Bonus: Going from base16 to base62

Did you know string-to-number can parse hexadecimal numbers if you pass 16 as its optional BASE argument? It's so undiscoverable, we should ask upstream to add an alias like "hexa-to-decimal". Emacs is full of these hidden tools you find under the sofa cushions.

Say you have the hex string "FFFFFF" (which is the decimal number 16777215). By first turning it into decimal, you can then turn it into base62. Evaluating

(my-int-to-base62 (string-to-number "FFFFFF" 16))

gives "18owf"! Now we're sure it works correctly.
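For reference, here are a few more evaluations of string-to-number with a base, as a quick sketch. The helper name my-parse-examples is hypothetical and only exists to collect the results in one list.

```elisp
;; The optional second argument BASE must be an integer from 2 to 16.
(defun my-parse-examples ()
  "Return a list of sample `string-to-number' results in various bases."
  (list (string-to-number "FFFFFF" 16)   ; uppercase hex digits
        (string-to-number "ffffff" 16)   ; lowercase works too
        (string-to-number "777" 8)       ; octal
        (string-to-number "1010" 2)))    ; binary

(my-parse-examples)  ; => (16777215 16777215 511 10)
```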

Bonus: org-id directly to base62

Finally, the purpose in all this: my website is made from Org files and I want to let the org-id determine the page ID. My org-ids are the standard unwieldy UUID strings like "5de8af34-af54-40b8-bb23-aa2261722837", so it's better to convert them to base62.

(defun my-uuid-to-base62 (uuid)
  "Convert a 36-char UUID string to a 22-char base-62 string."
  ;; Needs Emacs 28+ for `string-replace', and bignums (Emacs 27+) because
  ;; a UUID is a 128-bit number.
  (let ((decimal (string-to-number (string-replace "-" "" uuid) 16)))
    (if (or (= 0 decimal) (/= 36 (length uuid)))
        (error "String should only contain a valid UUID 36 chars long: %s" uuid)
      ;; The highest UUID (ffffffff-ffff-ffff-ffff-ffffffffffff) makes
      ;; a base62 string 22 chars long.  Let's always return 22 chars.
      (my-int-to-base62 decimal 22))))

Now, evaluating

(my-uuid-to-base62 "5de8af34-af54-40b8-bb23-aa2261722837")

gives "2RcCEDX8RzFJ0RpojVlXhB".
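Where does the 22 come from? A small sketch that counts base-62 digits confirms it for the highest possible UUID, which is 2^128 - 1. The function my-base62-digit-count is a hypothetical helper written just for this check, and it assumes an Emacs with bignum support (27 or later).

```elisp
;; Hypothetical helper: count digits by repeated integer division by 62.
(defun my-base62-digit-count (integer)
  "Return how many base-62 digits a positive INTEGER needs."
  (let ((count 0))
    (while (> integer 0)
      (setq integer (/ integer 62)
            count (1+ count)))
    count))

;; The highest UUID, ffffffff-ffff-ffff-ffff-ffffffffffff, is 2^128 - 1.
(my-base62-digit-count (1- (expt 2 128)))  ; => 22
```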

If you're reading this on my website as of <2024-Feb-09>, the page ID is nothing like this because I decided base-21 is more practical.
