#elisp
What links here
(Sorted by recent first)
- When to use when-let
- True destructive pop?
- Don't expand-file-name, just error
- Portals
In #elisp, the `pop` macro is not really destructive in the sense that you can use it to manipulate a memory location regardless of scope. It's only as destructive as `setf`, i.e. it operates on "generalized places".
By contrast, destructive functions like `delete` make use of the lower-level `setcar` and `setcdr` internally.
That tells us basically how to make a "true" `pop`:

```elisp
(defun true-pop (list)
  (setcar list (cadr list))
  (setcdr list (cddr list)))
```
But something happens when the list has only one element left…
```elisp
(setq foo '(a))
(true-pop foo)
foo
;; => (nil)
```
Compare with `pop`:

```elisp
(setq foo '(a))
(pop foo)
foo
;; => nil
```
There seems to be a fundamental limitation in Emacs Lisp: you can't take a memory address that points to a cons cell and make that cons cell be not a cons cell. Even `delq` ceases to have a side effect when the list has one element left: `(delq 'a foo)` does not change `foo`.
Why does it work with `pop`? Because you use it on a symbol that's in scope, and it reassigns the symbol to a different memory address. Or that's how I understood it as of .
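A minimal sketch of that limitation (the function and variable names here are invented for illustration): calling `pop` on a function argument only rebinds the local variable, so the caller's list is untouched.

```elisp
;; On a plain variable, `pop' expands to roughly (prog1 (car list)
;; (setq list (cdr list))): it rebinds the local LIST and never
;; mutates the cons cells themselves.
(defun try-to-pop (list)
  (pop list))

(setq foo '(a b c))
(try-to-pop foo)  ; returns a
foo               ; => (a b c), unchanged
```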
There is a trick, if you still need "true pop" behavior. Let's say there's a list that you need to consume destructively, but the list is not stored in a symbol in the global obarray, but in a hash table value. To look up the hash table location every time would cost compute. So here's what we do: access the hash table once, let-bind the value, and manipulate only the `cdr` cell of the value.
How the value might originally be stored:

```elisp
(puthash KEY (cons t (make-list 100 "item")) TABLE)
```
Note that the `car` is just `t`, and we'll do nothing with it.
Now, consuming the list:

```elisp
(let ((items (gethash KEY TABLE)))
  (while (cdr items)
    ...
    (DO-SOMETHING-WITH (cadr items))
    ...
    (setcdr items (cddr items))))
```
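For concreteness, here is a runnable version of the pattern above, with made-up table and item names:

```elisp
(defvar my-table (make-hash-table :test #'equal))
(puthash "queue" (cons t (list "a" "b" "c")) my-table)

(let ((items (gethash "queue" my-table)))  ; one hash lookup, total
  (while (cdr items)
    (message "Processing %s" (cadr items))
    (setcdr items (cddr items))))          ; "pop" via the cdr cell

(gethash "queue" my-table)  ; => (t): consumed in place, car untouched
```

Because every mutation goes through `setcdr` on cells reachable from the stored value, the hash table always sees the current state of the list without any further `gethash` or `puthash` calls.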
First: in what way is org-mode slow? It's easiest to illustrate in terms of some things that have been developed as a reaction to org-mode's slowness:
I have barely used org-ql, but as I understand it, the main distinction between org-ql and the other two:
I briefly wondered if I could design org-node to just run on top of org-ql, but they don't have the same tasks. Org-node has to correlate all those results so that it can do things like take some Org entry title and return what entries have that title and what's in their PROPERTIES drawers, all while operating purely off cache so you can write functions that loop over every entry in existence in less than 20ms.
To run on org-ql would be a lot like running on ripgrep, an experiment I already tried. It's a mess of having to do many search passes and then correlate different sets of results, and it's necessarily slower than just giving org-node its own parser.
So, my proposal: what if upstream Org did such caching?
That's actually the idea with the org-element-cache, but it is not ambitious enough (yet). It's still the case that most functions that work with Org have to visit the relevant file, turn on org-mode, and then use the org-element functions to grab the info they need. But almost all the CPU cycles are burned at the "visit and turn on org-mode" step.
That's why having 1000 org-agenda-files causes the agenda to take several minutes to build. It has to turn on org-mode 1000 times.
I envision that a function should be able to just ask Org "hey, in that file, get me that piece of information" and Org will return the information without visiting that file at all.
Concretely: say the first time Org loads, it spins up an async process that visits every file in `org-agenda-files`, `org-id-locations`, `recentf-list` and other variables, and returns the org-element tree for each. Then Org has a nice set of hash tables it can just look up.
(Of course, store each file's last-modification time to know if it needs re-scanning.)
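A hypothetical sketch of the lookup side of such a cache (all names here are invented, and the expensive parse would ideally be offloaded to the async process described above):

```elisp
(defvar my-org-tree-cache (make-hash-table :test #'equal)
  "Map of filename -> (MTIME . ORG-ELEMENT-TREE).")

(defun my-org-tree (file)
  "Return FILE's org-element tree, re-parsing only when FILE changed."
  (let ((mtime (file-attribute-modification-time (file-attributes file)))
        (hit (gethash file my-org-tree-cache)))
    (if (and hit (equal (car hit) mtime))
        (cdr hit)  ; cache hit: answer without visiting the file
      (cdr (puthash file
                    (cons mtime
                          (with-temp-buffer
                            (insert-file-contents file)
                            (delay-mode-hooks (org-mode))
                            (org-element-parse-buffer)))
                    my-org-tree-cache)))))
```

Even this naive version skips `find-file` and the mode hooks, which is where most of the cycles go; a warm cache answers from the hash table alone.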
The end result might be that a lot of commands are suddenly instant, and things like agenda and org-ql can cope with an unlimited number of files the same as if they were concatenated into one file.
If we further extend org-element-cache so that it even contains a copy of the full text of all entries, that would enable a full-text search that competes with ripgrep, and can be filtered by additional metadata in a way you can't do with ripgrep.
The codebases of org-node and org-roam could then shrink to 1/10 of their original LoC.
An insight from learning to write fast #elisp: "just-in-case" code can make things slower.
Example situation: you want to ensure that a provided string is an absolute filename, so you wrap it in `expand-file-name` or `file-truename`. But these are expensive. Instead, if you know it's usually going to be absolute, just assert that it is:

```elisp
(unless (file-name-absolute-p PATH)
  (error "Expected absolute filename but got: %s" PATH))
```

… and then proceed without ever calling `expand-file-name`.
Bonus tip: the other common use of `expand-file-name`, joining a directory to a relative name, is faster with `file-name-concat` instead.
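For example (assuming Emacs 28+, where `file-name-concat` was added), both calls below join a directory and a relative name, but the second skips the expansion machinery:

```elisp
(expand-file-name "notes.org" "/home/me/org")  ; => "/home/me/org/notes.org"
(file-name-concat "/home/me/org" "notes.org")  ; => "/home/me/org/notes.org"
```

The trade-off: `file-name-concat` does no expansion at all, so `~`, `..` and relative components pass through untouched.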
Alternatively, this is also good:

```elisp
(let (file-name-handler-alist)
  (expand-file-name PATH))
```