This paper introduces ESPnet-SPK, an open-source toolkit for training and using speaker embedding extractors. Its modular architecture makes it straightforward to build models ranging from the x-vector to SKA-TDNN and to develop variants of them. Many tasks that rely on speaker embeddings still employ outdated extractors; the toolkit makes advanced speaker embeddings easy for the broader research community to adopt, with pre-trained extractors readily available for off-the-shelf use. It also supports integration with various self-supervised learning features. ESPnet-SPK features over 30 recipes: seven speaker verification recipes, including a reproducible WavLM-ECAPA recipe that achieves an equal error rate (EER) of 0.39% on the Vox1-O benchmark, and recipes for diverse downstream tasks, including text-to-speech and target speaker extraction. It also supports speaker similarity evaluation for singing voice synthesis, among other uses.