Jan 02, 2017

Looking at Gabriel's 1985 book on the performance of Lisp systems (lispm posted a link in another thread: http://rpgpoet.com/Files/Timrep.pdf) we see many benchmarks using 0 GC time. His "Boyer" theorem-proving benchmark (p.133, 142/294) very commonly uses as much time in the "GC" column as in the "CPU" column (½ of the total), twice as much on SAIL (⅔ of the total), and almost never less than about half as much (⅓ of the total).

I'd like to claim that one of the four configurations of VAX Franz Lisp shone here, since it spent only about ⅐ of its total execution time in the GC. However, it spent roughly as much absolute time in the GC as the other three Franz Lisp configurations; it's just that it took 6× as long as normal to do the non-GC part of the benchmark; "normal", for context, was several times slower than SAIL. It's not clear that this counts as a performance improvement in any sense!

But, since many of his benchmarks use 0 GC time, I don't think I can justify the statement that GC just before the introduction of generational GC normally used 50% of a program's total run time; most programs did not consist primarily of evaluating tree-rewriting rules, which obviously are allocation-intensive. (Boyer isn't the worst, though. On Vaughan Pratt's Deriv symbolic differentiation benchmark, SAIL spends almost 90% of its total CPU time in the GC, though some other collectors handle it better.)

Reading through some of the other benchmarks, Franz Lisp commonly comes out rather GC-heavy, though less so than SAIL; on div2, for example, Franz spends 80% of its time in the GC. It may be significant that Wilensky was using Franz to write the book I got my original impression from.

I'm somewhat surprised that the FFT benchmark is written to use GC time, since normally I think of FFTs as not requiring any allocation at all, much less significant numbers of allocations of separate objects...

So, in conclusion, I do think it's justifiable to say that there were significant classes of programs for which all Lisp systems continued to spend ⅓ to ½ of their time in the GC, or 80% to 90% in bad cases, right up until the introduction of generational GC shortly after Gabriel's book. But it's probably not justifiable to apply that statement generally to all Lisp programs. Your 5%–10% number sounds quite reasonable as an overall average.

Jan 02, 2017

above the 90s were mentioned. SPARC systems shipped in the late 80s, 87something. At that time (90s) Lisp systems on stock hardware had already quite useable GCs, especially the big three commercial CL implementations on Unix (Lucid, Allegro and Lispworks). Even MCL on Mac OS had a useful ephemeral GC.

It's hard to say how many time was spent on GC in a Lisp program in the 70s. The installations were different, programs were different. How much time did a Multics Emacs written in Maclisp spend in GC? I have no idea. A macsyma running for some hours? AN early rule based system? Probably there were taken some benchmark numbers, but one would need check the literature to find them.

To see how different benchmark numbers were:

http://rpgpoet.com/Files/Timrep.pdf

There are also pathological cases where Memory is mostly full and time would be spent mostly in GC, or where a lot of virtual memory is used, and much time is spent in IO paging in/out memory.

Note also that the invention of generational GCs was triggered by the availability of larger virtual memory based systems and the programs using it - available to the market outside of the original research. Before that, most of them were quite a bit smaller or very rare. Systems like the Symbolics 3600 were extremely rare and expensive. When they first shipped, GCs were copying and compacting, but not generational. Systems like that did not exist on the market before.