29 releases
Version | Released |
---|---|
0.3.22 (new) | Jan 9, 2025 |
0.3.21 | Jan 9, 2025 |
0.3.3 | May 21, 2024 |
0.2.1 | May 18, 2024 |
0.1.34 | May 15, 2024 |
#44 in Machine learning
647 downloads per month
1.5MB
4.5K SLoC
MusicGPT
Generate music based on natural language prompts using LLMs running locally.
https://github.com/gabotechs/MusicGPT/assets/45515538/f0276e7c-70e5-42fc-817a-4d9ee9095b4c
☝️ Turn up the volume!
Overview
MusicGPT is an application for running the latest music generation AI models locally and efficiently, on any platform, without installing heavy dependencies like Python or machine learning frameworks.
Right now it only supports MusicGen by Meta, but the plan is to support different music generation models transparently to the user.
The main milestones for the project are:
- Text conditioned music generation
- Melody conditioned music generation
- Indeterminately long / infinite music streams
Install
Mac and Linux
MusicGPT can be installed on Mac and Linux using brew:
brew install gabotechs/taps/musicgpt
Or by directly downloading the precompiled binaries from this link
Windows
On Windows, the executable file can be downloaded from this link.
Docker (Recommended for running with CUDA)
If you want to run MusicGPT with a CUDA-enabled GPU, this is the best way, as you only need to have the basic NVIDIA drivers installed on your system.
docker pull gabotechs/musicgpt
Once the image is downloaded, you can run it with:
docker run -it --gpus all -p 8642:8642 -v ~/.musicgpt:/root/.local/share/musicgpt gabotechs/musicgpt --gpu --ui-expose
With cargo
If you have the Rust toolchain installed on your system, you can install MusicGPT with cargo:
cargo install musicgpt
Usage
There are two ways of interacting with MusicGPT: the UI mode and the CLI mode.
UI mode
This mode will display a chat-like web application for exchanging prompts with the LLM. It will:
- store your chat history
- allow you to play the generated music samples whenever you want
- generate music samples in the background
- allow you to use the UI on a device other than the one executing the LLMs
You can run the UI by just executing the following command:
musicgpt
You can also choose different models for running inference, and whether to use a GPU or not, for example:
musicgpt --gpu --model medium
[!WARNING]
Most models require really powerful hardware for running inference
If you want to use a CUDA-enabled GPU, it's recommended that you run MusicGPT with Docker:
docker run -it --gpus all -p 8642:8642 -v ~/.musicgpt:/root/.local/share/musicgpt gabotechs/musicgpt --ui-expose --gpu
CLI mode
This mode generates and plays music directly in the terminal. You can provide multiple prompts, and audio is played as soon as it's generated. You can generate audio based on a prompt with the following command:
musicgpt "Create a relaxing LoFi song"
By default, it produces a sample of 10s, which can be configured up to 30s:
musicgpt "Create a relaxing LoFi song" --secs 30
There are multiple models available. MusicGPT uses the smallest one by default, but you can opt into a bigger model:
musicgpt "Create a relaxing LoFi song" --model medium
[!WARNING]
Most models require really powerful hardware for running inference
If you want to use a CUDA-enabled GPU, it's recommended that you run MusicGPT with Docker:
docker run -it --gpus all -v ~/.musicgpt:/root/.local/share/musicgpt gabotechs/musicgpt --gpu "Create a relaxing LoFi song"
You can review all the available options by running:
musicgpt --help
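When scripting several generations, it can help to assemble the CLI arguments programmatically. The following is a minimal Python sketch, not part of MusicGPT itself: `build_musicgpt_args` is a hypothetical helper, it uses only the `--secs`, `--model` and `--gpu` flags shown in this README, and it assumes those flags can be combined freely. The 30 second cap comes from the text above; the lower bound of 1 second is an assumption.

```python
import shlex
from typing import List, Optional

MAX_SECS = 30  # upper bound stated in this README


def build_musicgpt_args(
    prompt: str,
    secs: Optional[int] = None,
    model: Optional[str] = None,
    gpu: bool = False,
) -> List[str]:
    """Build an argv list for the musicgpt CLI (hypothetical helper)."""
    # The lower bound of 1 is an assumption; only the 30 s cap is documented.
    if secs is not None and not (1 <= secs <= MAX_SECS):
        raise ValueError(f"secs must be between 1 and {MAX_SECS}, got {secs}")
    args = ["musicgpt", prompt]
    if secs is not None:
        args += ["--secs", str(secs)]
    if model is not None:
        args += ["--model", model]
    if gpu:
        args.append("--gpu")
    return args


if __name__ == "__main__":
    argv = build_musicgpt_args("Create a relaxing LoFi song", secs=30, model="medium")
    # Print a shell-quoted version of the command for inspection.
    print(shlex.join(argv))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with prompts that contain spaces or quotes.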
Benchmarks
The following graph shows the inference time taken to generate 10 seconds of audio using different models on a Mac M1 Pro. For comparison, the time taken by the equivalent Python code, which uses https://github.com/huggingface/transformers, is also shown.
The command used for generating the 10 seconds of audio was:
musicgpt '80s pop track with bassy drums and synth'
This is the Python script used for generating the 10 seconds of audio:
import time

import scipy.io.wavfile  # import the submodule explicitly so wavfile.write is available
from transformers import AutoProcessor, MusicgenForConditionalGeneration

# Load the text processor and the smallest MusicGen checkpoint
processor = AutoProcessor.from_pretrained("facebook/musicgen-small")
model = MusicgenForConditionalGeneration.from_pretrained("facebook/musicgen-small")

inputs = processor(
    text=["80s pop track with bassy drums and synth"],
    padding=True,
    return_tensors="pt",
)

# Time only the generation step; 500 new tokens correspond to the 10 seconds benchmarked here
start = time.time()
audio_values = model.generate(**inputs, max_new_tokens=500)
print(time.time() - start)  # Log time taken in generation

# Write the first generated sample to a WAV file at the model's sampling rate
sampling_rate = model.config.audio_encoder.sampling_rate
scipy.io.wavfile.write("musicgen_out.wav", rate=sampling_rate, data=audio_values[0, 0].numpy())
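The script above requests 500 new tokens for 10 seconds of audio, which implies roughly 50 tokens per second. As a rough sketch under that assumption, the token budget for other durations could be estimated as follows. `tokens_for_seconds` and `FRAME_RATE_HZ` are hypothetical names introduced here for illustration and are not part of MusicGPT or transformers.

```python
# Assumption: about 50 audio tokens per second, inferred from the benchmark
# script using max_new_tokens=500 for 10 seconds of audio.
FRAME_RATE_HZ = 50


def tokens_for_seconds(seconds: float, frame_rate_hz: int = FRAME_RATE_HZ) -> int:
    """Estimate the max_new_tokens value needed for a clip of the given length."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return round(seconds * frame_rate_hz)


if __name__ == "__main__":
    for secs in (10, 30):
        print(secs, tokens_for_seconds(secs))
```

Under this assumption, reproducing the benchmark at the 30 second maximum would mean passing about three times as many tokens to `model.generate`.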
Storage
MusicGPT needs access to your storage in order to save downloaded models and generated audio, along with some metadata needed for the application to work properly. Assuming your username is foo, it stores the data in the following locations:
- Windows: C:\Users\foo\AppData\Roaming\gabotechs\musicgpt
- MacOS: /Users/foo/Library/Application\ Support/com.gabotechs.musicgpt
- Linux: /home/foo/.config/musicgpt
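For scripts that need to find or clean up this data, the locations listed above can be expressed as a small lookup. This is only a sketch that mirrors the paths in this README: `musicgpt_data_dir` is a hypothetical helper, it does not ask MusicGPT where it actually stores data, and it assumes home directories live in the default place for each OS. Note that the backslash in the macOS path above is shell escaping, so the directory name itself contains a plain space.

```python
from pathlib import PurePosixPath, PureWindowsPath
from typing import Union


def musicgpt_data_dir(system: str, username: str) -> Union[PurePosixPath, PureWindowsPath]:
    """Return the data directory listed in this README for a given OS.

    `system` uses the values returned by platform.system():
    "Windows", "Darwin" (macOS) or "Linux".
    """
    if system == "Windows":
        return (
            PureWindowsPath("C:/Users") / username
            / "AppData" / "Roaming" / "gabotechs" / "musicgpt"
        )
    if system == "Darwin":
        return (
            PurePosixPath("/Users") / username
            / "Library" / "Application Support" / "com.gabotechs.musicgpt"
        )
    if system == "Linux":
        return PurePosixPath("/home") / username / ".config" / "musicgpt"
    raise ValueError(f"unsupported system: {system}")


if __name__ == "__main__":
    for system in ("Windows", "Darwin", "Linux"):
        print(system, musicgpt_data_dir(system, "foo"))
```

Using pure paths keeps the sketch runnable on any OS, since no filesystem access takes place.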
License
The code is licensed under the MIT License, but the AI model weights that get downloaded at application startup are licensed under the CC-BY-NC-4.0 License, as they are generated based on the following repositories:
Dependencies
~37–72MB
~1.5M SLoC