Beginner’s Guide to Neural Speaker Diarization with pyannote
An open-source toolkit written in Python for speaker diarization
In this article, you will learn how to split an audio recording into segments according to who is speaking in each one. This process is known as speaker diarization.
This tutorial is based on the pyannote-audio Python package for speaker diarization. It comes with the following capabilities:
- speech activity detection
- speaker change detection
- overlapped speech detection
- speaker embedding
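To make the goal concrete, the result of diarization can be pictured as a list of time-stamped speaker turns. The snippet below is a toy illustration in plain Python, not pyannote-audio's API: the `Turn` class, the helper functions, the timestamps, and the speaker labels are all made up for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    """One stretch of speech attributed to a single speaker (hypothetical type)."""
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording
    speaker: str  # anonymous label; diarization does not know real names


def speaking_time(turns):
    """Sum the duration of all turns for each speaker label."""
    totals = defaultdict(float)
    for turn in turns:
        totals[turn.speaker] += turn.end - turn.start
    return dict(totals)


def overlap(a, b):
    """Return how many seconds two turns overlap (0.0 if they do not)."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


# Made-up example: two speakers, with the last two turns overlapping.
turns = [
    Turn(0.0, 4.5, "SPEAKER_00"),
    Turn(4.5, 9.0, "SPEAKER_01"),
    Turn(8.0, 12.0, "SPEAKER_00"),
]

totals = speaking_time(turns)
for speaker, seconds in sorted(totals.items()):
    print(f"{speaker}: {seconds:.1f} s")
print(f"Overlap between last two turns: {overlap(turns[1], turns[2]):.1f} s")
```

Roughly speaking, speech activity detection decides where turns exist at all, speaker change detection finds their boundaries, overlapped speech detection flags regions like the last second of the second turn, and speaker embeddings are what allow turns to be grouped under the same label.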
Let’s proceed to the next section for the setup and installation process.
Setup
It is highly recommended to create a new virtual environment before you continue with the installation.
PyTorch
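Run the following command to install PyTorch. The index URL selects the wheels built for CUDA 11.8, so adjust it if your system uses a different CUDA version or has no GPU: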
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118
Other Python packages
Run the following command to install pyannote-audio directly from its GitHub repository, together with its required dependencies:
pip install git+https://github.com/pyannote/pyannote-audio
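Once both commands have finished, a quick sanity check can confirm that the packages are visible to the interpreter. The script below is a minimal sketch rather than an official verification step: the helper names `is_installed` and `report` are made up for this example, and it only assumes the usual import names `torch`, `torchaudio`, and `pyannote.audio`.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the module can be located, without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. "pyannote") is itself missing
        # or when a module has a malformed spec.
        return False


def report(modules=("torch", "torchaudio", "pyannote.audio")):
    """Map each module name to whether it was found in this environment."""
    return {name: is_installed(name) for name in modules}


status = report()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")

# Only query the GPU if PyTorch is actually present.
if status["torch"]:
    import torch
    print("CUDA available:", torch.cuda.is_available())
```

If any package is reported as missing, make sure the virtual environment you created is activated and rerun the corresponding install command.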