Compacting an Assocation List in Scheme

I must admit, I went way too long not knowing about association lists in Scheme. There's really nothing particularly special about them, they're just a list of pairs. The assq, assv, and assoc functions are what makes them useful. The aforementioned functions work by finding the first pair in the list whose first value matches what is being searched for (using different functions to test for equality). This means updating an association list is as simple as using cons to prepend a new pair. This works even if there is already a pair that already has the same "key" (the first element in the pair list).

While prepend and read options are trivial with association lists, things get slightly trickier when you need to come back and traverse the list and use the stored value part of each pair. Take the following association list: (define my-list '((100 3) (100 2) (100 1) (200 2) (200 1))). If you're building a frequency count of elements in another list you might end up with a structure not unlike this one. Each time you count a new occurrence of a number you just cons a new (value count) pair onto the association list. This means that operations like (assv '100 my-list) will return (100 3) and (assv '200 my-list) will return (200 2) as expected. But what about when we want to reduce the association list but only work with the "final" value for a given key? Naive attempts to use reduce on the association list will give you potentially very incorrect results depending on how many times you have "overwritten" a pair.

An easy way to overcome this problem without resorting to proper hash tables is compacting the association list:

(define (compact-alist alist)
  (fold-left (lambda (result pair)
    (if (assv (car pair) result)
      (cons (assv (car pair) alist) result)))
    '() alist))

This probably isn't the prettiest way of going about this but at least it's pretty clear what's going on. The association list is reduced with fold-left using a lambda that returns the accumulator result as is if the key has already been added to the result ((car pair) is necessary because assv just looks at the first element in each pair). If the key wasn't found then the first matching pair from the original association list is consed onto the result.

The result of (compact-alist my-list) would be ((100 3) (200 2)) which in some situations is much more useful and as far as assq, assv, and assoc are concerned is the effectively the same as the non-compacted association list.