Chapters from Audacity

Disclaimer: I am not a programmer/Dev of any kind. This is probably a very easy thing for someone to accomplish but this is just me trying to reason through a problem publicly. Jump to the bottom for the latest updates!

I am trying to find a way to automate turning chapters (labels) from Audacity into the proper JSON format for the Podcasting 2.0 namespace.

The label export from Audacity creates a three-column text file separated by tabs.

Columns 1 and 2 hold the label's start and end times in seconds. For point labels like mine the two values are identical.
- The values always have six digits after the decimal point.
Column 3 is the label text.
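
To make the column layout concrete, here is a rough Python sketch that splits each exported line on tabs. The function name `parse_labels` and the sample string are just illustrations, not anything Audacity provides, and it assumes every non-blank line has exactly the three tab-separated columns described above.

```python
def parse_labels(label_text: str) -> list[tuple[float, float, str]]:
    """Split each exported line into (start, end, title)."""
    rows = []
    for line in label_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # maxsplit=2 keeps any later tabs inside the title instead of failing
        start, end, title = line.split("\t", 2)
        rows.append((float(start), float(end), title.strip()))
    return rows


# Hypothetical two-line sample in the same shape as the export
sample = "22.296961\t22.296961\tIntroduction\n59.036735\t59.036735\tPodcast 2.0 Talk\n"
for start, end, title in parse_labels(sample):
    print(start, end, title)
```

A line with fewer than three columns would raise a ValueError here, which is probably what you want while testing.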

The cut command in Linux can be used to easily remove one of the two time columns:

cut -f 2,3 audacityLabels.txt > editedLabels.txt

The file goes from:

22.296961	22.296961	Introduction
59.036735	59.036735	Podcast 2.0 Talk
79.325170	79.325170	Value Tag
152.084898	152.084898	Chapters Tag
164.815238	164.815238	LIT tag
192.580499	192.580499	Transcripts 
232.010884	232.010884	Explain Break
257.860499	257.860499	Mini-Series
304.256871	304.256871	My Audio Sucks
357.250612	357.250612	Projects I've been working on
470.900680	470.900680	Homelab 2.0
646.536417	646.536417	Zorin OS
695.159002	695.159002	Closing
720.491973	720.491973	Get involved!!

to:

22.296961	Introduction
59.036735	Podcast 2.0 Talk
79.325170	Value Tag
152.084898	Chapters Tag
164.815238	LIT tag
192.580499	Transcripts 
232.010884	Explain Break
257.860499	Mini-Series
304.256871	My Audio Sucks
357.250612	Projects I've been working on
470.900680	Homelab 2.0
646.536417	Zorin OS
695.159002	Closing
720.491973	Get involved!!

What is needed now is to remove the decimal values from the timestamps and then loop through the chapters to build the JSON file with the proper formatting. It may be possible to use sed or awk to trim the time (either by dropping the decimals or by rounding), but I haven't found a conclusive method yet.

The awk command below rounds the timestamps to the nearest second, which is fine, but I don't know how to combine it with cut yet.

awk '{ print int($1 + 0.5) } ' numbers.txt
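
For comparison, the two options (dropping the decimals versus rounding) could look like this in Python. This is only a sketch: `truncate_seconds` and `round_seconds` are made-up helper names, and the rounding uses the same add-0.5-then-truncate trick as the awk line, which only behaves as expected for non-negative times.

```python
import math


def truncate_seconds(timestamp: str) -> int:
    """Drop everything after the decimal point."""
    return math.floor(float(timestamp))


def round_seconds(timestamp: str) -> int:
    """Round to the nearest second, like awk's int($1 + 0.5)."""
    return int(float(timestamp) + 0.5)


for ts in ["164.815238", "304.256871"]:
    print(ts, truncate_seconds(ts), round_seconds(ts))
# 164.815238 164 165
# 304.256871 304 304
```

Python's built-in round() would also work, but it rounds exact halves to the nearest even number, so it can differ from the awk result when a timestamp ends in exactly .5.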

A step up that prints the rounded number followed by the label text is as follows:

awk '{ print int($1 + 0.5),$3,$4,$5,$6,$7 } ' test.txt

This gives the expected result but is not ideal, because you need to know how many words follow the value. awk splits on whitespace by default, so every word of the label is its own field, and that is why it goes up to $7 in my example. I'm looking for a way to just use the rest of the line.

22 Introduction    
59 Podcast 2.0 Talk  
79 Value Tag   
152 Chapters Tag   
165 LIT tag   
193 Transcripts    
232 Explain Break   
258 Mini-Series    
304 My Audio Sucks  
357 Projects I've been working on
471 Homelab 2.0   
647 Zorin OS   
695 Closing    
720 Get involved!!
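
One way around the "how many words" problem is to split on the tab character only, so the whole label stays in one piece no matter how many words it has. The sketch below does that in Python. `round_label_line` is a hypothetical name, and it assumes the input is the original three-column export rather than the output of cut.

```python
def round_label_line(line: str) -> str:
    """Return 'SECONDS LABEL' for one line of the three-column export."""
    # Split on tabs only, so spaces inside the label are left alone
    start, _end, title = line.rstrip("\n").split("\t", 2)
    seconds = int(float(start) + 0.5)  # nearest second, as in the awk example
    return f"{seconds} {title.strip()}"


print(round_label_line("357.250612\t357.250612\tProjects I've been working on"))
# 357 Projects I've been working on
print(round_label_line("164.815238\t164.815238\tLIT tag"))
# 165 LIT tag
```

Because the label is stripped, this also avoids the trailing spaces that show up in the awk output above.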

Maybe something can be done with JS or Python to turn the text data into arrays and then put them into the pattern of the chapters JSON file. If the timestamp is X and the text string is Y, you would need a loop that writes each pair into the JSON, possibly using arrays for the data like this?

// times[i] is the start time (X) and titles[i] is the label text (Y)
const times = [22, 59];
const titles = ["Introduction", "Podcast 2.0 Talk"];
const chapters = [];
for (let i = 0; i < times.length; i++) {
    chapters.push({
        "startTime": times[i],
        "title": titles[i]
    });
}
console.log(JSON.stringify(chapters, null, 2));

Some shoddy HTML to get the idea across:

<!DOCTYPE html>
<html>
<body>
<p id="demo"></p>
<script>
const titles = ["intro", "chapt 1", "chap2", "CHAPTER 3"];
const time = ["5", "10", "15", "20"];
let text = "";
// Append one "startTime"/"title" pair per chapter
for (let i = 0; i < titles.length; i++) {
  text += "\"startTime\": " + time[i] + ",<br>" + "\"title\": \"" + titles[i] + "\",<br>";
}

document.getElementById("demo").innerHTML = text;
</script>
</body>
</html>
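
Putting the pieces together, a Python version of the whole idea might look like the sketch below. It reads the label text, rounds each start time, and lets the json module handle the quoting and commas. The function name `labels_to_chapters` is made up, and the outer `"version"` and `"chapters"` keys are my assumption about what the chapters file expects, so check them against the namespace documentation before relying on this.

```python
import json


def labels_to_chapters(label_text: str) -> str:
    """Convert Audacity label text (three tab-separated columns) to a JSON string."""
    chapters = []
    for line in label_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        start, _end, title = line.split("\t", 2)
        chapters.append({
            "startTime": int(float(start) + 0.5),  # nearest whole second
            "title": title.strip(),
        })
    # Assumed wrapper: the "version" value and "chapters" key are not from this post
    return json.dumps({"version": "1.2.0", "chapters": chapters}, indent=2)


# Hypothetical sample in the same shape as the export
sample_labels = (
    "22.296961\t22.296961\tIntroduction\n"
    "59.036735\t59.036735\tPodcast 2.0 Talk\n"
)
print(labels_to_chapters(sample_labels))
```

To run it on a real export you could read the file first, for example `labels_to_chapters(open("audacityLabels.txt").read())`, and write the result to a .json file.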

Edited Nov 28, 2022 by rastacalavera