writeback

I’ve been back working on fat.handler this weekend. I had to look at the code for something and actually found it kind of interesting, which I thought was long past.

First thing was to add 64-bit support, so that it can handle partitions larger than 4GB. This was pretty easy, just new code in the cache to probe the underlying device to see if it supports 64-bit extensions, and then later if a request comes in for data that is over the 4GB boundary, use a 64-bit read or write operation rather than the standard one (or error, if the probe didn’t find any 64-bit extensions). There are three commonly-used 64-bit extensions in the Amiga world - TD64, New-style TD64, and DirectSCSI. The first two are supported; DirectSCSI isn’t yet, but it shouldn’t be hard to add.
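To make the decision concrete, here is a rough sketch of the "which operation do I use" logic described above. It is not the actual fat.handler code: the enum names, `choose_io_op` and `LIMIT_32BIT` are all made up for illustration, and I'm assuming a request that ends exactly on the 4GB boundary is still fine for the standard operation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result of probing the underlying device at startup. */
enum ext64 { EXT64_NONE, EXT64_TD64, EXT64_NSD };

/* Hypothetical operation kinds; a real handler would map these to device commands. */
enum io_op { IO_STANDARD, IO_TD64, IO_NSD64, IO_ERROR };

/* First byte offset that no longer fits in 32 bits (4GB). */
#define LIMIT_32BIT ((uint64_t)1 << 32)

/* Decide how to issue a request covering bytes [offset, offset + length). */
enum io_op choose_io_op(enum ext64 ext, uint64_t offset, uint32_t length)
{
    assert(length > 0);

    /* Request lies entirely below the 4GB boundary: the standard operation will do. */
    if (offset + length <= LIMIT_32BIT)
        return IO_STANDARD;

    /* Otherwise use whichever 64-bit extension the probe found, if any. */
    switch (ext) {
    case EXT64_TD64: return IO_TD64;
    case EXT64_NSD:  return IO_NSD64;
    default:         return IO_ERROR;   /* no 64-bit support: fail the request */
    }
}
```

The point of probing once and deciding per request is that small partitions and old devices keep using exactly the same path they always did.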

I haven’t done any testing yet. It’s basically impossible to test in hosted, as fdsk.device doesn’t have 64-bit support, and adding it would mean that DOS would need 64-bit support too (since it’s a loopback-type device). ata.device for native has support, but that means needing a large FAT partition installed on a real box, or in VMware, and to do that I pretty much need to install an OS that uses it. So far I’ve tried FreeDOS, which crashed, and DR-DOS, which created the partition but couldn’t write the partition table for some reason. The next thing to try is Windows 98SE/ME/2000, all of which could use large FAT partitions. The code should be available in tonight’s nightly build, so if you want to test before I get a chance, let me know how it goes.

This morning I started implementing write-back caching. The concept here is pretty simple - when the handler asks the cache to write some data, the cache reports success immediately but just marks the data as “to be written”. Then at regular intervals (eg five seconds) it writes all of these “dirty” blocks out to disk in one go. This makes things feel faster for the user, and has the potential to reduce disk activity (== less wear and lower power consumption), at the risk of losing data in the event of a power failure or loss of the device (like pulling the disk out). Typically removable media uses write-through caching (ie write immediately), while fixed disks use write-back.
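A minimal sketch of the idea, assuming a tiny fixed-size cache. None of this is the real cache code: `wb_cache`, `wb_cache_write`, `wb_cache_flush` and the `demo_write_block` stand-in device are hypothetical names, and the periodic timer that would call the flush is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 512
#define NUM_BLOCKS 8

struct wb_cache {
    unsigned char data[NUM_BLOCKS][BLOCK_SIZE];
    bool dirty[NUM_BLOCKS];   /* true = "to be written" */
    /* Hypothetical callback doing the real device write; returns 0 on success. */
    int (*write_block)(size_t num, const unsigned char *buf);
};

/* Stand-in "device" for demonstration only: it just counts writes. */
size_t demo_writes = 0;
int demo_write_block(size_t num, const unsigned char *buf)
{
    (void)num; (void)buf;
    demo_writes++;
    return 0;
}

/* Write-back: copy into the cache, mark dirty, report success immediately. */
int wb_cache_write(struct wb_cache *c, size_t num, const unsigned char *buf)
{
    assert(c != NULL && buf != NULL);
    if (num >= NUM_BLOCKS)
        return -1;
    memcpy(c->data[num], buf, BLOCK_SIZE);
    c->dirty[num] = true;
    return 0;
}

/* Called at intervals by the flusher; returns how many blocks reached the device. */
size_t wb_cache_flush(struct wb_cache *c)
{
    size_t written = 0;
    assert(c != NULL && c->write_block != NULL);
    for (size_t i = 0; i < NUM_BLOCKS; i++) {
        if (!c->dirty[i])
            continue;
        /* A block stays dirty if the write fails, so it is retried next time. */
        if (c->write_block(i, c->data[i]) == 0) {
            c->dirty[i] = false;
            written++;
        }
    }
    return written;
}
```

Note that writing the same block twice before a flush costs only one device write, which is where the reduced disk activity comes from.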

Since this requires a separate task that sits and waits and flushes the dirty blocks when called, it means the cache needs locking. Locking will also be needed in the future if a filesystem wanted to be multi-threaded (and the cache is actually in a cache.library, available to all). I’ve partially implemented the locking - so far there is locking around cache operations, but not block operations.

I hate that there’s no way (in most locking schemes, not just AROS) to promote a read lock to a write lock. Usually you have to drop the original lock before taking the write lock, which means there’s a moment where you’re not holding any lock and someone can come and steal it out from under you. I have a workaround for POSIX threads that I’m using in production code, but it requires condition variables which we don’t currently have for AROS semaphores. I think for the cache it won’t be a problem, but I’m thinking carefully about it because deadlocks are just too easy.
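For illustration, here is one common way to build an upgradable read/write lock on POSIX threads with a mutex and a condition variable. To be clear, this is a generic sketch and not necessarily the workaround mentioned above; the `uplock` type and all its functions are hypothetical names. It assumes the caller of `uplock_upgrade` already holds a read lock, and that a caller whose upgrade is refused drops its read lock and retries.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct uplock {
    pthread_mutex_t mtx;
    pthread_cond_t  cond;
    int  readers;     /* threads currently holding a read lock */
    bool writer;      /* true while a write lock is held */
    bool upgrading;   /* true while one reader waits to be promoted */
};

void uplock_init(struct uplock *l)
{
    pthread_mutex_init(&l->mtx, NULL);
    pthread_cond_init(&l->cond, NULL);
    l->readers = 0;
    l->writer = false;
    l->upgrading = false;
}

void uplock_read(struct uplock *l)
{
    pthread_mutex_lock(&l->mtx);
    /* New readers queue behind a writer or a pending upgrade. */
    while (l->writer || l->upgrading)
        pthread_cond_wait(&l->cond, &l->mtx);
    l->readers++;
    pthread_mutex_unlock(&l->mtx);
}

void uplock_unlock_read(struct uplock *l)
{
    pthread_mutex_lock(&l->mtx);
    l->readers--;
    pthread_cond_broadcast(&l->cond);
    pthread_mutex_unlock(&l->mtx);
}

/* Promote a held read lock to a write lock. Returns false if another reader
 * is already upgrading; the caller then still holds its read lock and must
 * release it and retry, otherwise the two upgraders would deadlock. */
bool uplock_upgrade(struct uplock *l)
{
    pthread_mutex_lock(&l->mtx);
    assert(l->readers > 0);
    if (l->upgrading) {
        pthread_mutex_unlock(&l->mtx);
        return false;
    }
    l->upgrading = true;   /* keeps new readers and writers out from now on */
    l->readers--;          /* give up our read slot */
    while (l->readers > 0) /* wait for the remaining readers to drain */
        pthread_cond_wait(&l->cond, &l->mtx);
    l->upgrading = false;
    l->writer = true;
    pthread_mutex_unlock(&l->mtx);
    return true;
}

void uplock_unlock_write(struct uplock *l)
{
    pthread_mutex_lock(&l->mtx);
    l->writer = false;
    pthread_cond_broadcast(&l->cond);
    pthread_mutex_unlock(&l->mtx);
}
```

The `upgrading` flag is what closes the gap: the read slot is given up, but nobody else can take the lock in between. The price is that only one reader can be promoted at a time, so callers have to cope with a refused upgrade.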