how slow can you go?

Over the last couple of days I’ve finally made some decent progress with the C64 version of the tunnel:

My reflection code isn’t quite right yet, so its disabled here. The fact that this is a static image isn’t really hurting you much - if you were running it for real, you’d being watching it in all of its glorious three frames per second. Pretty woeful, really.

I haven’t really optimised the code much. I did spend some time on the plot routine this morning, which can now plot a single [x,y] point in 54-55 cycles. I feel like I should be able to shave perhaps ten cycles off, though I’m not seeing where yet. My clear routine, as usual, is the killer - its still a naive implementation that clears the entire bitmap. I attempted to make it a little smarter by having the plot routine record which page its writing to, and only clearing the dirty pages (via some hefty unrolled loops) which gained a me a single FPS, but its not really enough. I want to only clear the points that were plotted, but at a potential 4096 plots (64 rings x 64 plots per ring, since every eighth point is plotted), the cost would be prohibitive - it would take 8K to store all the memory pointers for those plots, which would at least triple the time taken to clear as the naive approach! I need to think on this more.

The real killer though, seems to be the fact that every point requires a bunch of table lookups, at least three: the ring offset indexes, then the ring offset itself, and the ring point position as well. I don’t see how I’m going to able to optimise this much. It is in a tight loop so even small gains should make a difference, but I’m hoping to get this thing at least to 25 FPS (50 seems a far off impossibility at this stage).

I do have another idea though. Since the whole thing is just drawing circles, it occurs to me that rather than using a [x,y] plotter, I could be better served by a [r,θ] plotter (ie one that uses polar coordinates rather than cartesian coordinates). This way I could do away with all the circle table, because the rings would just drawn by looping θ from 0 to 255. Each ring is achieved by looping the radius.

Of course, the plotter then needs to know how to convert r and θ into memory locations, but that should be easy enough with appropriately-constructed tables. The equations for converting polar coordinates to cartesian coordinates are:

x = r * cos(θ)
y = r * sin(θ)

That multiply is a problem, but I’ve recently found out about a super-fast multiplication technique based on logarithms. In essence:

f(x) = x^2/4
a*b = f(a+b) - f(a-b)

So we keep a table of squares which lets the multiplication happen very quickly. After that, the x and y offset are added and the point is converted to a memory location and plotted. I’m hoping that by spending some time working on the tables I can embed a good amount of the memory location stuff in the tables themselves, reducing the amount of work that needs to be done.

I’ve still got half an hour of bus trip left. To work!