Chapters from Audacity
Disclaimer: I am not a programmer or dev of any kind. This is probably very easy for someone else to accomplish, but this is just me trying to reason through a problem publicly. Jump to the bottom for the latest updates!
I am trying to find a way to automate turning chapters (labels) from Audacity into the proper JSON format for the Podcasting 2.0 namespace.
The export from Audacity creates a three-column text file separated by tabs.
- Columns 1 and 2 are the start and end time of the label in seconds. For point labels like mine they repeat each other.
- The values use a decimal point, and there are always 6 digits to the right of it.
- Column 3 is the label text
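For what it's worth, the three-column layout described above could be read in Python by splitting each line on the tab character. This is only a sketch: the function name parse_label_line is made up for illustration, and it assumes every line has exactly the start, end, and label columns.

```python
def parse_label_line(line):
    """Split one Audacity label line into (start, end, title)."""
    # Hypothetical helper; assumes tab-separated start, end, and label columns.
    # maxsplit=2 keeps the label whole even if it somehow contains a tab.
    start, end, title = line.rstrip("\n").split("\t", 2)
    return float(start), float(end), title


sample = "22.296961\t22.296961\tIntroduction\n"
print(parse_label_line(sample))
```

Splitting on tabs rather than on any whitespace matters here because the labels themselves contain spaces.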
The cut command in Linux can be used to easily remove one of the time columns:
cut -f 2,3 audacityLabels.txt > editedLabels.txt
Goes from
22.296961 22.296961 Introduction
59.036735 59.036735 Podcast 2.0 Talk
79.325170 79.325170 Value Tag
152.084898 152.084898 Chapters Tag
164.815238 164.815238 LIT tag
192.580499 192.580499 Transcripts
232.010884 232.010884 Explain Break
257.860499 257.860499 Mini-Series
304.256871 304.256871 My Audio Sucks
357.250612 357.250612 Projects I've been working on
470.900680 470.900680 Homelab 2.0
646.536417 646.536417 Zorin OS
695.159002 695.159002 Closing
720.491973 720.491973 Get involved!!
to
22.296961 Introduction
59.036735 Podcast 2.0 Talk
79.325170 Value Tag
152.084898 Chapters Tag
164.815238 LIT tag
192.580499 Transcripts
232.010884 Explain Break
257.860499 Mini-Series
304.256871 My Audio Sucks
357.250612 Projects I've been working on
470.900680 Homelab 2.0
646.536417 Zorin OS
695.159002 Closing
720.491973 Get involved!!
What is needed now is to remove the decimal values from the timestamps and then loop through the chapters to build the JSON file with the proper formatting.
It may be possible to use sed or awk to trim the time (either through removal or rounding), but I haven't found a conclusive method yet.
The awk command below rounds the timestamps to the nearest second, which is fine, but I don't know how to combine it with cut as of now.
awk '{ print int($1 + 0.5) } ' numbers.txt
A step up, which prints the rounded number followed by the words of the label, is as follows:
awk '{ print int($1 + 0.5),$3,$4,$5,$6,$7 } ' test.txt
This gives the expected result but is not ideal, because you need to know how many words follow the value. That is why it goes up to $7 in my example. I'm looking for a way to just use the rest of the line.
22 Introduction
59 Podcast 2.0 Talk
79 Value Tag
152 Chapters Tag
165 LIT tag
193 Transcripts
232 Explain Break
258 Mini-Series
304 My Audio Sucks
357 Projects I've been working on
471 Homelab 2.0
647 Zorin OS
695 Closing
720 Get involved!!
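One way around needing to know the number of words might be to split on the tab character only, so the label stays in one piece. Below is a minimal Python sketch of that idea, not a tested tool. The name round_label_line is hypothetical, and it assumes the original three-column export rather than the file already trimmed with cut. It uses int(x + 0.5) to mirror the awk one-liner instead of Python's round(), which rounds exact halves to the nearest even number.

```python
def round_label_line(line):
    """Return 'SECONDS TITLE' with the start time rounded to a whole second."""
    # Hypothetical helper; assumes the original three tab-separated columns.
    start, _end, title = line.rstrip("\n").split("\t", 2)
    # int(x + 0.5) rounds half up, matching the awk expression for positive times.
    return f"{int(float(start) + 0.5)} {title}"


lines = [
    "164.815238\t164.815238\tLIT tag\n",
    "357.250612\t357.250612\tProjects I've been working on\n",
]
for line in lines:
    print(round_label_line(line))
```

Because the label is taken as everything after the second tab, it no longer matters whether a title is one word or six.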
Maybe something can be done with JS or Python to turn the text data into an array and then put that into the pattern of the chapter JSON file. If the timestamp is X and the text string is Y, you would need a loop that puts the information into the JSON, possibly using an array for the data like this?
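// ArrayA holds the start times, ArrayB holds the titles (sample values)
const ArrayA = [22, 59];
const ArrayB = ["Introduction", "Podcast 2.0 Talk"];
const chapters = [];
for (let i = 0; i < ArrayA.length; i++) {
  chapters.push({
    "startTime": ArrayA[i],
    "title": ArrayB[i]
  });
}
console.log(JSON.stringify(chapters, null, 2));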
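The same loop could also be sketched in Python, letting the json module handle the quoting and commas. Treat this as a rough idea rather than a finished tool: build_chapters is a made-up name, and the outer "version" and "chapters" keys are my understanding of the chapters file layout, so they should be checked against the namespace documentation before relying on them.

```python
import json


def build_chapters(times, titles):
    """Pair each start time with its title and wrap them as a chapters document."""
    # Hypothetical helper; zip() walks both lists in step, like the indexed loop.
    chapters = [{"startTime": t, "title": name} for t, name in zip(times, titles)]
    # The "version" value is an assumption; check it against the chapters spec.
    return {"version": "1.2.0", "chapters": chapters}


times = [22, 59, 79]
titles = ["Introduction", "Podcast 2.0 Talk", "Value Tag"]
print(json.dumps(build_chapters(times, titles), indent=2))
```

Building plain dictionaries first and serializing at the end avoids gluing quotes and commas together by hand.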
Some shoddy HTML to get the idea across:
<!DOCTYPE html>
<html>
<p id="demo"></p>
<script>
// Sample data: chapter titles and their start times in seconds
const titles = ["intro", "chapt 1", "chap2", "CHAPTER 3"];
const time = [5, 10, 15, 20];
let text = "";
// Emit one "startTime"/"title" pair per chapter
for (let i = 0; i < titles.length; i++) {
  text += "\"startTime\": " + time[i] + ",<br>" +
          "\"title\": \"" + titles[i] + "\",<br>";
}
document.getElementById("demo").innerHTML = text;
</script>
</html>