Several features of GameSynth are powered by state-of-the-art AI algorithms. This is especially true in the Repository and in the Modular model. In this post, we are examining some of these features, as well as giving a hint of things to come thanks to generative AI!
The Repository Map
The GameSynth repository offers the largest collection of procedural audio models to date, containing 1000+ patches carefully organized in categories and tagged.
In addition to the classic queries available to search this database by text, date, model, etc., the repository provides a map that displays the patches grouped by perceptual similarity. This means that patches generating similar sounds appear close to each other.
This is especially useful to find layering candidates (i.e., similar sounds that are mixed to create a more interesting result) or to find new ideas for patches. For example, browsing explosion patches on the map, you may notice that some are built around Noise generators as one would expect, while others use the Thunder module as a sound source, and you may decide to try this technique next time.
This feature is built around a self-organizing map, an unsupervised machine learning technique. Self-organizing maps produce low-dimensional representations of higher-dimensional data sets while preserving the topological structure of the data. In the case of the GameSynth repository, it creates a two-dimensional map of the more complex data associated with the procedural models, keeping patches that generate similar sounds close to each other.
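To give a feel for how this works, here is a minimal self-organizing map in Python using only NumPy. The grid size, training schedule, and feature vectors are illustrative assumptions, not GameSynth's actual implementation:

```python
import numpy as np

def train_som(data, grid_h=8, grid_w=8, epochs=100, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small self-organizing map on feature vectors (rows of `data`)."""
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    # One weight vector per map cell, initialized randomly.
    weights = rng.random((grid_h, grid_w, n_features))
    # Grid coordinates, used by the neighborhood function below.
    ys, xs = np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij")
    for epoch in range(epochs):
        lr = lr0 * np.exp(-epoch / epochs)        # decaying learning rate
        sigma = sigma0 * np.exp(-epoch / epochs)  # shrinking neighborhood
        for v in data:
            # Best-matching unit: the cell whose weights are closest to v.
            dists = np.linalg.norm(weights - v, axis=2)
            by, bx = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighborhood around the BMU on the 2D grid.
            grid_dist2 = (ys - by) ** 2 + (xs - bx) ** 2
            h = np.exp(-grid_dist2 / (2 * sigma ** 2))[:, :, None]
            # Pull the BMU and its neighbors toward the input vector.
            weights += lr * h * (v - weights)
    return weights

def map_position(weights, v):
    """Return the 2D cell where a feature vector lands on the trained map."""
    dists = np.linalg.norm(weights - v, axis=2)
    return np.unravel_index(np.argmin(dists), dists.shape)
```

Because nearby cells are updated together, inputs with similar features end up in nearby cells, which is exactly the property the repository map exploits: similar-sounding patches cluster together in 2D.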
Finding procedural models from samples
Another useful feature of the GameSynth repository is the ability to drop a wave file and immediately get a list of patches that can generate similar sounds, basically getting your initial patching work done for you!
Here again, a neural network was built to return the closest patches. It was trained on features of the audio signals generated by the patches in the repository, such as the Mel-Frequency Cepstral Coefficients (MFCCs), spectral density, centroid, flux, and many more. When a wave file is dropped, its features are calculated as well and fed to the neural network, which then returns the closest patches.
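The core idea, stripped down, is "extract features, then find the nearest feature vectors". Below is a toy sketch in Python: the three spectral features are a stand-in for the much richer set mentioned above, and the simple nearest-neighbor distance stands in for the trained network:

```python
import numpy as np

def spectral_features(signal, sr=48000):
    """Toy feature vector: spectral centroid, spread, and 85% rolloff.
    Illustrative only; the real system uses MFCCs, flux, and more."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), 1.0 / sr)
    power = spectrum ** 2
    total = power.sum() + 1e-12
    centroid = (freqs * power).sum() / total
    spread = np.sqrt(((freqs - centroid) ** 2 * power).sum() / total)
    cumulative = np.cumsum(power)
    rolloff = freqs[np.searchsorted(cumulative, 0.85 * total)]
    return np.array([centroid, spread, rolloff])

def closest_patches(query_features, patch_features, patch_names, k=3):
    """Return the k patches whose normalized features are nearest the query."""
    feats = np.vstack([query_features, patch_features])
    # Normalize each dimension so no single feature dominates the distance.
    mean, std = feats.mean(axis=0), feats.std(axis=0) + 1e-12
    q = (query_features - mean) / std
    p = (patch_features - mean) / std
    dists = np.linalg.norm(p - q, axis=1)
    order = np.argsort(dists)[:k]
    return [patch_names[i] for i in order]
```

In this sketch, dropping a wave file corresponds to calling `spectral_features` on it and looking up the nearest precomputed patch vectors.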
Generating new patches
The Modular model is GameSynth’s patching environment, in which you can combine 130+ synthesis, processing, logic and control modules to create advanced procedural audio models. The sheer number of possibilities can make it a bit intimidating! (Check our 100+ posts on the blog for some help.)
Thankfully, the random patch generator – accessible via the context menu or the Ctrl + P shortcut – uses AI to create unique random patches. The AI system ensures that the generated patches have a valid structure, and that the modules’ parameters combine to produce an audible signal.
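One way to guarantee both properties is to generate patches that are valid by construction: always start from a sound source and only draw parameters from "safe" audible ranges. The sketch below illustrates that idea; the module names and parameter ranges are invented for the example and are not GameSynth's actual catalog:

```python
import random

# Hypothetical module catalog: names and ranges are illustrative only.
GENERATORS = {
    "Noise": {"level": (0.2, 1.0)},
    "Oscillator": {"freq": (50.0, 2000.0), "level": (0.2, 1.0)},
}
PROCESSORS = {
    "Filter": {"cutoff": (100.0, 8000.0)},
    "Delay": {"time": (0.01, 0.5)},
}

def _draw(param_ranges, rng):
    # Sample each parameter uniformly inside its audible range.
    return {p: rng.uniform(lo, hi) for p, (lo, hi) in param_ranges.items()}

def random_patch(rng=random):
    """Generate a random patch that is structurally valid by construction:
    one sound source feeding a short chain of processors."""
    patch = []
    # Always start from a generator, so the patch produces a signal.
    name = rng.choice(list(GENERATORS))
    patch.append({"module": name, "params": _draw(GENERATORS[name], rng)})
    # Add 0-3 processors downstream of the source.
    for _ in range(rng.randrange(4)):
        name = rng.choice(list(PROCESSORS))
        patch.append({"module": name, "params": _draw(PROCESSORS[name], rng)})
    return patch
```

A constructive generator like this can never emit a silent or disconnected patch, which is the kind of guarantee the random patch generator provides.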
In all these examples, AI streamlines your workflow by providing the blueprints to create new content.
Welcome to the future
So, what comes next? Recently, generative AI has been the talk of the town, with popular tools such as ChatGPT, Midjourney, Stable Diffusion, and many others.
Although audio content generation may be a bit harder to crack, similar tools will no doubt eventually appear. However, the generated content will suffer from the same problems as the audio recordings it is based on: a lack of real-time control for interactive applications, large memory and storage footprints, no way to generate variations on the fly, etc.
Procedural audio already solves all these issues. Because Tsugi has the largest database of procedural audio models available today, and because these models are text-based, well structured (XML files in the case of GameSynth), and already tagged and categorized, a small language model or a GAN (Generative Adversarial Network) could be trained to produce new models at will.
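The key enabler is that patches are plain text, so any text-generation technique applies. As a deliberately tiny stand-in for a language model, here is a character-level Markov chain trained on patch-like XML strings; the XML format shown is invented for the example, not GameSynth's real schema:

```python
import random
from collections import defaultdict

def build_model(corpus, order=4):
    """Character-level Markov model: map each `order`-character context
    to the characters that followed it in the training corpus."""
    model = defaultdict(list)
    for text in corpus:
        for i in range(len(text) - order):
            model[text[i:i + order]].append(text[i + order])
    return model

def generate(model, seed_text, length=80, rng=random):
    """Extend seed_text one character at a time by sampling the model."""
    order = len(next(iter(model)))
    out = seed_text
    for _ in range(length):
        choices = model.get(out[-order:])
        if not choices:
            break  # unseen context: stop generating
        out += rng.choice(choices)
    return out
```

A real system would use a proper language model (or a GAN) and validate the output against the patch schema, but the pipeline is the same: train on the existing text-based models, then sample new ones.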
Stay tuned for exciting new AI features for audio content creation for games, animations, and movies in Tsugi products!