Creating Speaker-Labeled Transcripts With Google Colab

Jose Nicholas Francisco

Published on 05/25/23. Updated on 10/11/23.

(Note: If you’re one of those tinker-first, read-later sorts of folks, dive right into the Python notebook covered in this article.)

Are you creating a podcast? Do you have multi-person Zoom calls? Or perhaps even earnings calls to get to?

Well, that’s a lot of information to keep up with. And unless you have a very good notetaker on all of those calls, it becomes extremely important to keep a record of your discussions somewhere.

That’s where speech-to-text (STT) technology comes in. But be careful. Many STT resources out there are extremely limited. And most don’t even offer speaker-labeling as a feature.

That is, a shoddy STT application will only produce a transcript that looks like this:

Hey, did you know that elk meat tastes really good? Really? Oh! I heard about elk meat too. I think I heard that on a podcast once. Which podcast? We probably heard the same podcast. I think so. What podcast are you guys talking about? Lemme look it up

Instead of something that looks like this:

Speaker 1: Hey, did you know that elk meat tastes really good?

Speaker 2: Really?

Speaker 3: Oh! I heard about elk meat too. I think I heard that on a podcast once.

Speaker 2: Which podcast?

Speaker 1: We probably heard the same podcast.

Speaker 3: I think so.

Speaker 2: What podcast are you guys talking about?

Speaker 1: Lemme look it up
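To make that difference concrete, here is a minimal Python sketch of how word-level speaker labels could be turned into a labeled transcript like the one above. It assumes each word arrives as a dictionary with a `word` string and a zero-based `speaker` index; that shape and the `label_speakers` helper are illustrative assumptions, not necessarily what the notebook does under the hood.

```python
from itertools import groupby


def label_speakers(words):
    """Group consecutive words by speaker and format one line per turn.

    `words` is assumed to be a list of dicts with "word" and "speaker" keys
    (a hypothetical shape chosen for this illustration).
    """
    lines = []
    # groupby only merges *adjacent* items, which is what we want:
    # a new line starts every time the speaker changes.
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        text = " ".join(w["word"] for w in group)
        lines.append(f"Speaker {speaker + 1}: {text}")
    return "\n\n".join(lines)


sample_words = [
    {"word": "Which", "speaker": 1},
    {"word": "podcast?", "speaker": 1},
    {"word": "We", "speaker": 0},
    {"word": "probably", "speaker": 0},
    {"word": "heard", "speaker": 0},
    {"word": "the", "speaker": 0},
    {"word": "same", "speaker": 0},
    {"word": "podcast.", "speaker": 0},
    {"word": "I", "speaker": 2},
    {"word": "think", "speaker": 2},
    {"word": "so.", "speaker": 2},
]

print(label_speakers(sample_words))
```

Without the `speaker` field, all you could do is join every word into one long paragraph, which is exactly the unlabeled transcript shown earlier.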

Luckily, Deepgram is here to help! Not only do we offer top-notch speaker-labeling (aka “diarization”) services, but we also have a handy-dandy notebook to help you out. That way, you don’t have to worry about writing any code. You can just upload your audio files into the notebook and run the code that was already written for you.

Ready? Let’s go!

The Python notebook

All the instructions you need are inside the notebook itself: here.

However, it can be helpful to break things down piece by piece. So let’s do that here. The first cell you’ll run into is the “Dependencies” cell (shown below). By clicking the cell’s play button, you’ll install all the fancy-schmancy coding packages you need for the rest of the cells to run.

After all, you can’t transcribe audio with Deepgram’s AI models without first installing Deepgram itself.

! pip install requests ffmpeg-python
! pip install deepgram-sdk --upgrade
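If you'd like to double-check that the install worked before moving on, a quick sanity check along these lines can help. This is only a sketch: the `missing_modules` helper is a made-up name for illustration, and it relies on the fact that the `ffmpeg-python` and `deepgram-sdk` packages are imported as `ffmpeg` and `deepgram`, respectively.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from `module_names` that Python cannot find."""
    # find_spec returns None for a top-level module that isn't installed.
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Import names for the packages installed by the Dependencies cell.
required = ["requests", "ffmpeg", "deepgram"]

missing = missing_modules(required)
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All dependencies are installed.")
```

If anything shows up as missing, re-running the Dependencies cell is usually the first thing to try.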

Up next, we have a cell to remind you to upload the audio of your choice into the notebook. There is a menu on the left-hand side of the screen where you can upload any audio files you wish. To upload, simply click the icon of the paper with the upwards-facing arrow on it. It will take a few moments for the audio to appear, but once it does, move on to the next cell.
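If you're not sure whether the upload has finished, you can list the audio files the notebook can see. The snippet below is a minimal sketch: it assumes your uploads land in the notebook's current working directory, and both the `find_audio_files` helper and the list of extensions are illustrative choices rather than part of the notebook.

```python
from pathlib import Path

# File extensions treated as audio for this check (an illustrative list).
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}


def find_audio_files(directory="."):
    """Return the sorted names of audio files found directly in `directory`."""
    return sorted(
        path.name
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    )


print(find_audio_files())
```

An empty list means the upload hasn't completed yet (or the file landed somewhere else), so give it a moment and run the check again.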

And now, here’s the fun part: