Showing 417 to 420 Newer

True destructive pop?

In #elisp, the pop macro is not really destructive in the sense that you can use it to manipulate a memory location regardless of scope. It's only as destructive as setf, i.e. it operates on "generalized places".

By contrast, destructive functions like delete make use of the lower-level setcar and setcdr internally.

That tells us basically how to make a "true" pop:

(defun true-pop (list)
  (setcar list (cadr list))
  (setcdr list (cddr list)))

But something happens when the list has only one element left…

(setq foo '(a))
(true-pop foo)
foo
=> (nil)

Compare with pop:

(setq foo '(a))
(pop foo)
foo
=> nil

There seems to be a fundamental limitation in Emacs Lisp: you can't take a memory address that points to a cons cell and make that cons cell be not a cons cell. Even delq ceases to have a side-effect when the list has one element left: (delq 'a foo) does not change foo.

Why it works with pop? Because you use it on a symbol that's in-scope, and it reassigns the symbol to a different memory address. Or that's how I understood it as of [2024-09-12 Thu].

Workaround

There is a trick, if you still need "true pop" behavior. Let's say there's a list that you need to consume destructively, but the list is not stored in a symbol in the global obarray, but in a hash table value. To look up the hash table location every time would cost compute. So here's what we do: access the hash table once, let-bind the value, and manipulate only the cdr cell of the value.

How the value might originally be stored:

(puthash KEY (cons t (make-list 100 "item")) TABLE)

Note that the car is just t, and we'll do nothing with it.

Now, consuming the list:

(let ((items (gethash KEY TABLE)))
  (while (cdr items)
    ... (DO-SOMETHING-WITH (cadr items)) ...
    (setcdr items (cddr items))))
Created (2 months ago)

Proposal: Make Org-mode fast for real

#emacs

First: in what way is org-mode slow? It's easiest to illustrate in terms of some things that have been developed as a reaction to org-mode's slowness:

I have barely used org-ql, but as I understand the main distinction between org-ql and the other two:

  • org-ql searches the current file in realtime. It waits for you to input a search query, then it uses that to search for a specific kind of thing. It's supposed to be very fast, and I believe it can also operate on all files mentioned in an Agenda buffer, so it even works as a limited multi-file search.
    • I say limited because you're not going to have that many files in org-agenda. Both org-agenda and org-ql (correct me here) would run into a performance tarpit if you add 100 or 1000 files to the list `org-agenda-files`.
      • That's the root of the issue. We'll get back to it later.
    • Even if org-ql would perform well with 1000 files, you have to know what to search for, beforehand. It is not an exploration tool.
      • Admittedly, perhaps a command could be written that uses org-ql to search for "any heading" in "any file" and make a minibuffer prompt out of that – the result would be a similar idea to org-roam's/org-node's minibuffer prompts. But there is probably no need to use a search engine like org-ql if you're going to write such a simplistic query.
  • org-node (and org-roam) visits all files at some point in time, to cache as much info as possible about those files.
    • It's like you wrote a few simplistic org-ql searches and cached the results. But not really.
      • I briefly wondered if I could design org-node to just run on top of org-ql, but they don't have the same tasks. Org-node has to correlate all those results so that it can do things like take some Org entry title and return what entries have that title and what's in their PROPERTIES drawers, all while operating purely off its cache.

        To run on org-ql would be a lot like running on ripgrep, an experiment I already tried. It's a mess of having to do many search passes and correlating different sets of results, and necessarily slower than just giving org-node its own parser.

To my proposal, what if Org itself did such caching?

That's actually the idea with the org-element-cache, but it is not ambitious enough (yet). It's still the case that most functions that work with Org have to open the relevant file, turn on org-mode, and then use the org-element functions to grab the info they need. But almost all the CPU cycles are burned at the "turn on org-mode" step.

That's why having 1000 org-agenda-files causes it to take several minutes to build the agenda. It has to turn on org-mode 1000 times.

I envision that a function should be able to just ask Org "hey, in that file, get me that piece of information" and Org will return the information without visiting that file at all.

Concretely, say the first time Org loads, it spins up an async process that visits every file in `org-agenda-files`, `org-id-locations`, `recentf-list` and other variables, and returns the org-element tree for each. Then Org has a nice set of hash tables it can just look up.

(Of course, remember each file's last file-modification time to know if it needs re-scanning.)

The end result might be a lot of commands are suddenly instant, and things like agenda and org-ql can cope with an unlimited number of files the same as if they were concatenated into one file.

Even the fulltext under each entry could be added to the org-element-cache, so that we have a fulltext search that competes with ripgrep and can be filtered by tag or any other metadata.

Created (3 months ago)

Don't expand-file-name, just error

An insight from learning to write fast #emacs-lisp: "just-in-case" code can make things slower.

Example situation: you want to ensure that a provided string is an absolute filename, so you wrap it in expand-file-name or file-truename. But these are potentially expensive. Instead, if you know it's usually going to be absolute, just assert that it is:

(unless (file-name-absolute-p PATH)
  (error "Expected absolute filename but got: %s" PATH))

… and then proceed without ever calling expand-file-name.

Bonus tip: when you need to concatenate a file and directory, use file-name-concat, it is much faster than expand-file-name.

Created (3 months ago)

Taking ownership of org-id

#emacs

Let's say most of your Org files sit in a folder /home/kept/notes/ but some others are outside, scattered here and there, plus you'd like to try not depending on Org-roam's M-x org-roam-update-org-id-locations all the time.

The challenges with org-id:

  1. The classic way to tell it where to look for IDs is adding the directories to org-agenda-files.
    • Unfortunately with thousands of files, this slows down the agenda something extreme. Not an option.
  2. An alternative way is to populate org-id-extra-files or org-agenda-text-search-extra-files.
    • See snippet A below.
      • Unfortunately with thousands of files, this slows down M-x customize-group for either org-id or org-agenda, to a grinding halt.
  3. To sidestep the small problem with #2, you could trust in org-id to keep itself updated, because it does that every time your Emacs creates or searches for an ID. You regenerate org-id-locations once (or well, once every time you wipe .emacs.d). See snippets B or C.
  4. org-id complains about duplicate IDs because it's also looking in e.g. the versioned backups generated by Logseq
    • So, you need some sort of exclusion ruleset.
      • For an elisp-only way, see snippets A or B.
      • A natural way is to obey .ignore or .gitignore, if you already keep such files. I've found no elisp gitignore parser, but see snippet C for a way to use ripgrep's builtin parser.
    • Why org-roam didn't give you this problem? If I'm reading the source code right, it has actually been suppressing org-id errors!

Snippet A

;; Populate `org-id-extra-files'
(dolist (file (mapcan (lambda (dir)
                        (directory-files-recursively dir "\\.org$"))
                      '(;; Example values
                        "/home/kept/notes/"
                        "/home/kept/project1/"
                        "/home/kept/project2/")))
  (or (string-search "/logseq/bak/" file)
      (string-search "/logseq/version-files/" file)
      (string-search ".sync-conflict-" file)
      (push file org-id-extra-files)))

;; Then either run M-x org-id-update-id-locations or restart Emacs.

Snippet B

;; Populate org-id without setting `org-id-extra-files'.  Only do it if
;; `org-id-locations' is gone.
(when (or (and (not (file-exists-p org-id-locations-file))
               (null org-id-locations))
          (if (null org-id-locations)
              (org-id-locations-load)
            (if (listp org-id-locations)
                (null org-id-locations)
              (hash-table-empty-p org-id-locations))))
  (org-id-update-id-locations
   (seq-remove (lambda (file)
                 (or (string-search "/logseq/bak/" file)
                     (string-search "/logseq/version-files/" file)
                     (string-search ".sync-conflict-" file)))
               (mapcan (lambda (dir)
                         (directory-files-recursively dir "\\.org$"))
                       '(;; Example values
                         "/home/kept/notes/"
                         "/home/kept/project1/"
                         "/home/kept/project2/"))))
  (org-id-locations-save))

Snippet C

;; Populate org-id without setting `org-id-extra-files'. Only do it if
;; `org-id-locations' is gone.
(when (or (and (not (file-exists-p org-id-locations-file))
               (null org-id-locations))
          (if (null org-id-locations)
                     (org-id-locations-load)
                   (if (listp org-id-locations)
                       (null org-id-locations)
                     (hash-table-empty-p org-id-locations))))
  (dolist (default-directory '(;; Example values
                               "/home/kept/notes/"
                               "/home/kept/project1/"
                               "/home/kept/project2/"))
    ;; Borrow ripgrep's ability to obey .ignore/.gitignore
    (org-id-update-id-locations
     (split-string (shell-command-to-string "rg -ilt org :ID:") "\n" t))
    (org-id-locations-save)))

Bonus snippet: full reset

;; FOR TESTING: wipe all records
;; You ONLY need to wipe if it won't shut up about duplicates!
(progn
  (delete-file org-id-locations-file)
  (setq org-id-locations nil)
  (setq org-id--locations-checksum nil)
  (setq org-agenda-text-search-extra-files nil)
  (setq org-id-files nil)
  (setq org-id-extra-files nil))

Sudden amnesia

Every time your Emacs quits unexpectedly, it can forget many ID locations! To ensure it remembers, either use a hook like

(add-hook 'after-save-hook
 (defun my-save-id-soon ()
   (run-with-idle-timer 10 nil #'org-id-locations-save)))

or enable persist-state and add the hook

(add-hook 'persist-state-saving-functions
          (lambda ()
            (when (fboundp 'org-id-locations-save)
              (org-id-locations-save))))

Final fix

All these issues go away now that I let org-node populate ID locations for me, via its user option org-node-extra-id-dirs.

Created (7 months ago)
Showing 417 to 420 Newer