The idea: this project was born when I asked myself: what would happen if I connected a live audio source to a neural network? Well, this repository is the answer. I've since learned that, for many people, it resembles the holophonor instrument from Futurama, hence the name of this repository.
Discord community: https://discord.gg/YpamA34C3
Holophonor is a piece of software that generates images in real time. It leverages neural networks (a subset of the artificial intelligence domain) to do so. By default, it uses Google DeepMind's BigGAN, which covers 1000 subject classes. Since I wanted to use it with audio as the input source, I've also forked and edited Friture, a tool that analyzes audio in real time.
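For reference, here is a minimal sketch of generating a single image with the pytorch-pretrained-BigGAN package (credited below); the class name and truncation value are arbitrary examples:

```python
import torch
from pytorch_pretrained_biggan import (
    BigGAN, one_hot_from_names, truncated_noise_sample, save_as_images
)

# Load the pretrained 256x256 BigGAN (weights are downloaded on first run)
model = BigGAN.from_pretrained('biggan-deep-256')

truncation = 0.4  # lower values trade diversity for image quality
class_vector = torch.from_numpy(one_hot_from_names(['violin'], batch_size=1))
noise_vector = torch.from_numpy(truncated_noise_sample(truncation=truncation, batch_size=1))

with torch.no_grad():
    output = model(noise_vector, class_vector, truncation)

save_as_images(output)  # writes output_0.png to the working directory
```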
The software listens for data on a UDP connection. When data arrives, it parses it and feeds it to the AI model, which generates an image that is then displayed on screen. The process is repeated in real time, creating a dynamic and interactive visualization that responds to external input.
Specifically, the framework can be used to create visualizations that respond to audio input, such as music or environmental sounds. The visualizations can be customized by changing the LiuNet model, the input data source, and other parameters.
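To make the loop concrete, here is a minimal sketch of a UDP listener; the port number and message format are assumptions, as the actual protocol between Friture and Holophonor may differ:

```python
import socket

UDP_HOST, UDP_PORT = "127.0.0.1", 5005  # hypothetical address and port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind((UDP_HOST, UDP_PORT))

while True:
    data, _addr = sock.recvfrom(4096)
    # Assumed format: comma-separated spectrum magnitudes sent by Friture
    values = [float(v) for v in data.decode().split(",")]
    # `values` would then be mapped to the model's class/noise vectors
    # and the resulting frame rendered on screen
```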
Four videos of the software in action, using an audio source as input, are available here:
I've built executable versions of Holophonor and Friture to make them easier to use for non-developers. It's really easy: just download the executable and run it. You won't be able to train your own model, but you can use the Google DeepMind BigGAN model.
- Windows (ideally with an NVIDIA card)
- A microphone (if you want to connect an input audio source)
Disclaimer: this software doesn't access any local files other than the DeepMind GAN, which is automatically downloaded when the software starts. Making a .exe out of a Python project isn't very efficient; if anyone knows how to make it more lightweight, please let me know.
- If you have an NVIDIA video card: download and install the CUDA drivers to improve Holophonor's performance
- Download the latest release of Holophonor from here and the ad-hoc version of Friture from here
- Unzip the downloaded files
- Open the folder and double-click on Holophonor.exe
- If you want to use a microphone, open Friture and select the microphone you want to use
- Enjoy!
The software has been built to be as lightweight as possible, using as few libraries as possible. I've tested it with Python 3.12, but it should work with other versions of Python as well.
To install the necessary dependencies, run:
pip install -r requirements.txt
You will probably also need to download the Microsoft C++ Build Tools; the clearest guide I found for this is here
If you are using a Windows N edition, you may need to install the Windows Media Feature Pack; you can find how to install it here
If you have an NVIDIA graphics card, install the CUDA drivers to (drastically) improve Holophonor's performance
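To verify that PyTorch can actually see the GPU after installing the drivers, a quick check from a Python shell:

```python
import torch

# True means CUDA is available and the model can run on the GPU
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```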
The Holophonor class is the main class of the framework. LiuNet, instead, is a wrapper for neural networks: it lets you easily load a model and generate images based on input data. I made it this way so that different models can be incorporated into the framework without having to change the code too much.
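As a rough illustration of that separation, a hypothetical usage sketch (the module path, constructor arguments, and method names here are assumptions; check the actual class definitions in the repository):

```python
# Hypothetical sketch: real signatures may differ from the actual classes
from holophonor import Holophonor, LiuNet  # module path is an assumption

net = LiuNet(model_name="biggan-deep-256")   # wraps the underlying generator
holo = Holophonor(model=net, udp_port=5005)  # listens for input, renders frames
holo.run()                                   # start the real-time loop
```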
The performance of the framework is highly dependent on the hardware you are using.
The main question is: which specific hardware parameters affect the performance?
Device | Type | FPS | Latency | LiuNet model
---|---|---|---|---
RTX 4070S | GPU | 30 | 47 ms | BigGan256
RTX 4060 | GPU | 25.6 | 56 ms | BigGan256
RTX A1000 (mobile) | GPU | 24 | 42 ms | BigGan256
i7-8700K | CPU | 2 | 536 ms | BigGan256
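If you want to reproduce these numbers on your own hardware, here is a minimal timing loop; the model call is a stand-in for whatever generator your LiuNet wraps:

```python
import time
import torch

def benchmark(model, noise, class_vec, truncation=0.4, frames=100):
    """Average per-frame latency and FPS for a BigGAN-style generator."""
    with torch.no_grad():
        start = time.perf_counter()
        for _ in range(frames):
            model(noise, class_vec, truncation)
        if torch.cuda.is_available():
            torch.cuda.synchronize()  # flush queued GPU work before stopping the clock
        elapsed = time.perf_counter() - start
    latency_ms = elapsed / frames * 1000
    print(f"latency: {latency_ms:.0f} ms, fps: {1000 / latency_ms:.1f}")
```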
I think the Holophonor framework could be used in various applications, including:
- Live Performances: Enhance live music performances with real-time visualizations that respond to audio
- Medical Research: I wonder if it's possible to use this visualization to help people with hearing problems.
- Music Videos: Generate dynamic and unique visuals for music videos that sync with the audio track.
- Game Development: Create interactive visualizations for games that respond to player input.
- Musician Practice: Use the framework to create visualizations that help musicians practice their pieces through a visual representation of the sound.
Thanks to:
- Valtellina Petra, for the valuable insights from a professional musician's perspective.
- Dad, for the endless support and patience (and for the best omelette ever)
A special thanks to these projects:
- Friture
- Google Deepmind BigGAN
- pytorch-pretrained-BigGAN (a PyTorch implementation of the TensorFlow-based BigGAN)
- vsCode pets for the companionship during the development of this project.
Thanks to you for reading this README.md, I hope you enjoyed it :)
If you encounter any issues while using the framework, please open an issue on the Issues page. I will do my best to address them.
This repository is licensed under a dual license:
- GPL v3
- For commercial use, please contact me at [email protected].
While I would be happy to see this project used in various applications, I would appreciate it if you could credit me and provide a link to this repository if you use it in your projects.
Please consider giving a star ⭐ to this repository!
This project was developed by L. Gabriele, inspired by a love of math and sound. For any inquiries or contributions, please contact [email protected]. 04/02/2025