For a while, I’ve wanted to write a Mega Drive emulator, which was the original goal of this project. Pretty quickly it became apparent that it would not be significantly more work to build a generic, system-agnostic emulator instead (either way, I was starting from scratch). The result is Emulashione: the emulation framework with the nonsensical name.
In essence, Emulashione itself serves as the glue – providing services such as timing, synchronization, debugging support, logging, standardized interfaces, and more – that binds various devices (implemented as loadable dynamic modules) together to form a complete system. The devices and interconnections between them in a system are loaded from human-readable TOML system definition files.
Since Emulashione should be system agnostic, it’s effectively split into two components. The first is the core library, which is shared among all systems. It provides features common to all systems: system definition parsing, bus and clock sources, synchronization, and so forth. The second component consists of the plugins, which actually implement devices that systems can instantiate.
In addition to serving as the glue to join devices together into a complete system, the core library also provides a collection of C functions that can be called by clients to create systems and perform emulation.
As mentioned earlier, entire systems are described in the form of TOML system definitions. These definitions contain metadata about the system, all clock sources, devices, and the busses connecting them, as well as the schedulers that control the emulation. Here’s what a simple 68000-based system might look like:
This is probably a little overwhelming at first glance, but the file really just defines clock sources, busses, and devices, along with their properties. The file format is documented here.
Clock sources are used as the basis of the timing of an emulated system; they establish a relationship between emulated time and real time, for synchronization purposes. In Emulashione, there are two kinds of clock sources: primary and derived.
Primary clock sources define one or more frequency variants directly. Each variant is just a name paired with a decimal frequency value (in Hertz). This lets a single device definition offer a choice of named clock frequencies, for example for different regions.
Derived clocks, as the name implies, take their frequencies from another clock. A clock divisor can be specified, and the derived clock’s output frequency is automatically updated when its source clock changes frequency.
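To make the relationship between the two kinds of clock concrete, here is a minimal sketch. The `PrimaryClock` and `DerivedClock` types are hypothetical and are not Emulashione’s real classes. As a simplification, the derived frequency is computed on demand rather than pushed out when the source changes.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical types for illustration only; not Emulashione's real classes.
struct PrimaryClock {
    std::map<std::string, double> variants; // variant name -> frequency in Hz
    std::string selected;                   // name of the active variant

    double frequency() const { return variants.at(selected); }
};

struct DerivedClock {
    const PrimaryClock *source; // clock this one takes its frequency from
    double divisor;             // must be greater than zero

    // Computed on demand, so a variant change on the source is visible at once.
    double frequency() const {
        assert(divisor > 0.0);
        return source->frequency() / divisor;
    }
};
```

With two variants named, say, `ntsc` and `pal` on the primary clock, switching `selected` is enough for every derived clock to report its new frequency.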
Each bus internally maintains a mapping from address ranges to devices, which is used to route incoming bus transactions to the appropriate device. Busses have fixed bit widths for both their address and data components.
In addition, each bit in the data bus can be specified as either pulled up or pulled down: this is used to simulate the result of reading “open bus,” or a situation where no device responds on access to a particular address.
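The routing and open-bus behaviour can be sketched roughly as follows. This assumes a 16-bit data bus and read-only access; `Mapping`, `Bus`, and `openBusValue` are made-up names, not part of the actual core library.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical sketch of a bus with a 16-bit data path; not Emulashione's API.
struct Mapping {
    uint32_t first;                         // first address of the range (inclusive)
    uint32_t last;                          // last address of the range (inclusive)
    std::function<uint16_t(uint32_t)> read; // device handler; receives offset into the range
};

struct Bus {
    std::vector<Mapping> mappings;
    // One bit per data line: 1 = pulled up, 0 = pulled down.
    uint16_t openBusValue = 0xFFFF;

    uint16_t read(uint32_t address) const {
        for (const auto &m : mappings) {
            if (address >= m.first && address <= m.last) {
                return m.read(address - m.first);
            }
        }
        // No device claimed the address, so the pull resistors decide the value.
        return openBusValue;
    }
};
```

A linear scan is plenty for a handful of mappings; a sorted table or interval tree would be the obvious next step for larger address maps.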
Last are devices, identified by a short name, which is registered during device class registration when a plugin is loaded. When the system definition is loaded, the comma-separated type string is used to attempt instantiating a device of the given type, falling back to subsequent types if this fails.
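The fallback behaviour for type strings might look something like this sketch. The registry layout, the `Factory` type, and the whitespace trimming are all assumptions made for illustration; the real lookup lives inside the core library.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical device and factory types; not Emulashione's real ones.
struct Device {
    std::string type;
};
using Factory = std::function<std::unique_ptr<Device>()>;

// Tries each comma-separated type name in order and returns the first device
// that could be created, or nullptr if every candidate failed.
std::unique_ptr<Device> instantiate(const std::map<std::string, Factory> &registry,
                                    const std::string &typeString) {
    std::istringstream stream(typeString);
    std::string name;
    while (std::getline(stream, name, ',')) {
        // Trim surrounding whitespace (an assumption about the format).
        const auto first = name.find_first_not_of(" \t");
        if (first == std::string::npos) continue;
        const auto last = name.find_last_not_of(" \t");
        name = name.substr(first, last - first + 1);

        const auto it = registry.find(name);
        if (it == registry.end()) continue; // unknown type: try the next one
        if (auto device = it->second()) return device;
    }
    return nullptr;
}
```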
Additional keys on a device instance are passed to the device’s constructor, so they act as configuration for that device.
Emulating a system requires keeping track of time and ensuring that all devices in the system execute at roughly the same rate. In reality, this is a much more complex problem than I can do justice to here (think devices that are allowed to run ahead of others, requiring “catch-ups” at particular moments or memory accesses), and it’s probably worthy of a post in its own right.
Scheduling is based around the idea of cooperative multithreading, a concept I introduced previously and wrote a little library to enable. Devices run in a main loop, where they periodically notify the scheduler that a certain number of clock ticks have elapsed.
Whenever devices call into the scheduler to update their time count, the scheduler figures out if any other devices are waiting to run, in which case a context switch to that device’s cothread is performed. Otherwise, the function returns without performing a context switch and the device can continue to execute.
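Here is a heavily simplified sketch of that decision. It only does the bookkeeping: a real implementation would switch cothread stacks, whereas this version just returns the id of the device that should run next. The policy shown (run whichever device is furthest behind in emulated time) is my assumption for the sake of the example, and `Scheduler`, `addDevice`, and `advance` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: tracks per-device emulated time and picks who runs next.
class Scheduler {
public:
    // Registers a device whose clock tick lasts `unitsPerTick` time units.
    std::size_t addDevice(uint64_t unitsPerTick) {
        periods.push_back(unitsPerTick);
        now.push_back(0);
        return now.size() - 1;
    }

    // Called by a device to report that `ticks` of its clock have elapsed.
    // Returns the device that should run next; the caller's own id means
    // no context switch is needed and it can simply keep executing.
    std::size_t advance(std::size_t id, uint64_t ticks) {
        now[id] += ticks * periods[id];
        std::size_t next = id;
        for (std::size_t i = 0; i < now.size(); ++i) {
            if (now[i] < now[next]) next = i; // strictly behind: it should catch up
        }
        return next;
    }

    uint64_t time(std::size_t id) const { return now[id]; }

private:
    std::vector<uint64_t> periods; // duration of one tick, per device
    std::vector<uint64_t> now;     // emulated time reached, per device
};
```

Converting ticks to a common time unit is what lets devices on different clocks be compared at all.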
For some good background information on what schedulers in emulators need to accomplish, check out this really solid article by Near, author of the higan emulator.
To make debugging simpler, both for the emulator itself and for programs running inside it, the core provides a standardized, low-overhead tracing system. Devices can create tracers, which output their data to memory or to files on disk, and stream whatever events they deem important to those tracers.
Memory tracing is used extensively by unit tests to verify the correct behaviour of devices such as processors by comparing the actual execution trace against an expected baseline.
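A baseline comparison of this kind could be sketched as below. The `TraceEvent` layout (a timestamp plus free-form text) and the `MemoryTracer` class are invented for the example and say nothing about the real trace format.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event record; the real trace format is not shown in this post.
struct TraceEvent {
    uint64_t timestamp;
    std::string text;

    bool operator==(const TraceEvent &other) const {
        return timestamp == other.timestamp && text == other.text;
    }
};

// Collects events in memory so a test can inspect them afterwards.
class MemoryTracer {
public:
    void emit(uint64_t timestamp, std::string text) {
        events.push_back({timestamp, std::move(text)});
    }
    const std::vector<TraceEvent> &log() const { return events; }

private:
    std::vector<TraceEvent> events;
};

// Returns the index of the first differing event, or -1 if the traces match.
std::ptrdiff_t firstMismatch(const std::vector<TraceEvent> &actual,
                             const std::vector<TraceEvent> &expected) {
    const std::size_t common = std::min(actual.size(), expected.size());
    for (std::size_t i = 0; i < common; ++i) {
        if (!(actual[i] == expected[i])) return static_cast<std::ptrdiff_t>(i);
    }
    if (actual.size() != expected.size()) return static_cast<std::ptrdiff_t>(common);
    return -1;
}
```

Reporting the first mismatching index, rather than a plain pass/fail, makes it much easier to see where an execution trace went off the rails.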
Since there’s not really a stable, compiler-agnostic C++ ABI[1], all interfaces provided by the core library are exported via plain C functions, which are defined as function pointers in interface structs. These structs can be looked up by predefined UUIDs, both by clients of the library and by plugins.
Most of the built-in features and device classes are exported through an interface. Their UUIDs are fixed and defined in the source of the core library.
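The general pattern looks something like the following sketch. `LoggerInterface`, its members, and `InterfaceRegistry` are hypothetical stand-ins; the real interface structs and their UUIDs are defined in the core library’s source.

```cpp
#include <array>
#include <cstdint>
#include <map>

using Uuid = std::array<uint8_t, 16>;

// Hypothetical interface struct: nothing but plain data and function pointers,
// so no C++ types or calling conventions cross the library boundary.
struct LoggerInterface {
    uint32_t version;
    int (*log)(int level, const char *message);
};

// Hypothetical registry mapping interface UUIDs to interface structs.
class InterfaceRegistry {
public:
    void add(const Uuid &id, const void *iface) { table[id] = iface; }

    // Returns the registered interface, or nullptr if the UUID is unknown.
    const void *lookup(const Uuid &id) const {
        const auto it = table.find(id);
        return it == table.end() ? nullptr : it->second;
    }

private:
    std::map<Uuid, const void *> table;
};
```

The caller casts the returned pointer to the struct type it asked for, which is why each UUID has to stay bound to exactly one struct layout.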
Rather logically, the core library is responsible for loading plugins and keeping a directory of what devices, interfaces, and other resources are exported by a particular plugin. Plugins take the form of dynamic libraries, which are loaded at runtime from disk.
Each plugin describes itself through a simple struct, exported as a symbol with a fixed name that the plugin loader looks up during the load process. If the struct validates, the plugin is considered loaded, and the load callback provided in the struct is invoked. From this callback, plugins can register the device types they implement.
When a plugin is unloaded, all devices exported by it are automatically removed from the core, so they can’t be instantiated anymore. (This does assume that you never try to unload a plugin while executing any systems, though…)
In addition to the services that are directly involved in emulation, the core library provides a few bonus services for use by plugins. Probably most important is logging – each plugin receives its own logger instance, allowing messages from both the core library and all plugins to be processed in one continuous stream.
The core library also exports generic key/value configuration stores (primarily used to let devices query their parameters from a system definition) and helper methods to retrieve file system paths.
Plugins are dynamic modules loaded at runtime, which export devices or other resources for use in an emulated system. The core library will load these during initialization (if explicitly specified in the configuration or if the plugin is located in the standard plugin directory) or manually later if requested to do so by a client application.
All plugins are required to export an information structure, which provides the plugin’s name, description, author information, UUID, and the callbacks invoked when it is loaded and unloaded:
This is parsed by the plugin manager in the core library to determine if the plugin is compatible with the version of the core library. The strings are collected to later identify the plugin to the user. The UUID is also stored so that the same plugin can’t be loaded multiple times, even if the path is different for each instance.
As the names imply, the init callback is executed once the plugin has been loaded and validated as supported, and the shutdown callback is executed immediately before the module is unloaded.
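As a rough illustration of that validation flow, consider the sketch below. Everything in it is hypothetical: the field layout, the `kMagic` and `kAbiVersion` constants, and the `PluginManager` class are made up to show the checks described above, and do not reflect the actual structure definition.

```cpp
#include <array>
#include <cstdint>
#include <map>

using Uuid = std::array<uint8_t, 16>;

// Hypothetical information structure; field names and layout are assumptions.
struct PluginInfo {
    uint32_t magic;       // marks the symbol as a plugin info struct
    uint32_t abiVersion;  // core library version the plugin was built for
    const char *name;
    const char *description;
    const char *author;
    Uuid uuid;
    int (*init)(void);      // returns 0 on success
    void (*shutdown)(void);
};

constexpr uint32_t kMagic = 0x504C5547; // arbitrary value for this sketch
constexpr uint32_t kAbiVersion = 1;

class PluginManager {
public:
    // Validates the struct, rejects duplicates by UUID, then runs init.
    bool load(const PluginInfo &info) {
        if (info.magic != kMagic || info.abiVersion != kAbiVersion) return false;
        if (info.init == nullptr || info.shutdown == nullptr) return false;
        if (loaded.count(info.uuid) != 0) return false; // same plugin, other path
        if (info.init() != 0) return false;
        loaded[info.uuid] = &info;
        return true;
    }

    // Runs the shutdown callback immediately before forgetting the plugin.
    bool unload(const Uuid &uuid) {
        const auto it = loaded.find(uuid);
        if (it == loaded.end()) return false;
        it->second->shutdown();
        loaded.erase(it);
        return true;
    }

private:
    std::map<Uuid, const PluginInfo *> loaded;
};
```

Keying the table on the UUID rather than the file path is what catches the same plugin being loaded twice from different locations.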
Since all of the interfaces exported by the core library are C structs and function pointers, they’re not too fun to work with. That’s where the plugin support library comes in: it’s a static library, linked into each plugin, that provides C++ wrappers for the exported interfaces and core library functionality, making them easier to work with.
Additionally, the support library provides base classes for devices, which implement much of the common boilerplate code required, as well as bridging the C++ methods to the C-style function pointers exported in a device descriptor during device class registration.
I figured a pretty cool demonstration of how this all fits together would be to build a super simple example that emulates just enough of the 68komputer hardware to get the boot ROM running: some ROM, RAM, and a 68681 (or compatible) DUART. Thankfully, that ended up not being particularly difficult, since only the parts of the DUART required for serial input/output need to be implemented.
Emulashione ships with generic RAM and ROM devices, which take care of those components. The DUART is not too difficult, but the 68000 core was a source of quite a few challenges, having been written entirely from scratch. (But more about that another time…)
This example omits error checking to make the code easier to follow; a real application would check the return values of all calls to ensure success.
In the hypothetical client program, a series of calls like the following would set up Emulashione, and create a system with the appropriate system definition, which is omitted for brevity here. This definition would specify the path to the boot ROM, but is otherwise very similar to the one presented above:
And… that’s it! This of course assumes the required plugins can be found by the core library; otherwise, a call to libemulashione_plugin_load() with the plugins’ path may be necessary.
To actually execute the system’s emulation loop, only a single call is necessary:
The call will block the calling thread until the system exits its emulation loop. A nonzero return value from this method indicates that the system terminated execution for some exceptional reason, usually indicating an error. For our simple example case, we’re okay with blocking the main thread.
When we’re done with the emulation context, it needs to be freed so that all devices and associated resources are released. Likewise, when we’re done with Emulashione, it needs to be deinitialized:
Putting it together
Adding a tiny bit more code around the snippets above, and with the cooperation of all of the required devices, we can execute the binary, connect to the socket representing UART A, and enjoy some BASIC:
Of course, the 68komputer is probably one of the simplest systems you could possibly emulate. There are no real timing requirements and no complex hardware interactions. But it nevertheless serves as a great test case for the emulator, since there are a metric shitload of moving pieces. And maybe you’ve noticed some jankiness in the output; the DUART/terminal emulation is truly minimal.
Plus, there’s just something Cool and Good(tm) about playing with the boot ROM for a computer you designed from scratch, in an emulator you wrote from scratch to emulate that hardware. The only way this could get any more cursed is if I managed to port it to kush-os[2]…
The source for Emulashione is available on GitLab, along with all of the plugins I’ve developed so far. There’s no real frontend to drive the emulation beyond a test command line program, however; a frontend would be provided by a different project linking against the core library.
Documentation for both the core library and the built-in plugins is available here, and is updated automatically with the latest development snapshot.
Emulashione is nowhere near complete. It really just contains the bare minimum to get going and run a few example programs. It is not yet capable of supporting more complicated systems with inter-device timing constraints, though all the pieces are in place to enable this down the road.
Currently, I’m working on finishing up the from-scratch 68000 core, and putting together some of the missing pieces in the core library to handle synchronization with real time. I’ll likely continue to expand support for the 68komputer since it’s relatively simple and I understand the hardware it’s emulating very well – I plan to use this emulator to work on the software for a later hardware revision before that hardware actually exists.
[1] This is mostly a Windows land problem, but it doesn’t make any sense to maintain two separate interfaces depending on the platform; most of this is abstracted away by helper classes.
[2] It’d probably make it a few emulation cycles before crashing… looking back, some of the design decisions in the kush-os kernel were not the greatest, and there are many unresolved bugs and race conditions. It’d be nice to work on it again, but it desperately needs a full rewrite of the kernel.