Code Tip: Determining Ogg Audio Duration

Transliterating Go to Tcl with Improvements

While developing a couple of Tcl/Tk audio players, I wanted to be able to determine the duration (in seconds) of .ogg audio files.

An internet search turned up two articles containing source code. The first used Go and was based on another article’s source that used Java.

Since I’m more familiar with Go than Java, I first ported the Go version to Tcl. I noted three inefficiencies in the Go code. First, after finding the length, instead of breaking out of the loop that searched for the length, it just kept looping until it ran out of data. Second, it did the same thing when searching for the (sample) rate. Third, it read in the entire .ogg file (typically many megabytes) just to find two short ASCII markers (of 4 and 6 bytes) and to read a 32-bit unsigned integer and a 64-bit unsigned integer.

Go Source

func getOggDurationMs(reader io.Reader) (int64, error) {
    // For simplicity, we read the entire Ogg file into a byte slice
    data, err := io.ReadAll(reader)
    if err != nil {
        return 0, fmt.Errorf("error reading Ogg file: %w", err)
    }
    // Search for the "OggS" signature and calculate the length
    var length int64
    for i := len(data) - 14; i >= 0 && length == 0; i-- {
        if data[i] == 'O' && data[i+1] == 'g' && data[i+2] == 'g' && data[i+3] == 'S' {
            length = int64(readLittleEndianInt(data[i+6 : i+14]))
        }
    }
    // Search for the "vorbis" signature and calculate the rate
    var rate int64
    for i := 0; i < len(data)-14 && rate == 0; i++ {
        if data[i] == 'v' && data[i+1] == 'o' && data[i+2] == 'r' && data[i+3] == 'b' && data[i+4] == 'i' && data[i+5] == 's' {
            rate = int64(readLittleEndianInt(data[i+11 : i+15]))
        }
    }
    if length == 0 || rate == 0 {
        return 0, fmt.Errorf("could not find necessary information in Ogg file")
    }
    durationMs := length * 1000 / rate
    return durationMs, nil
}

func readLittleEndianInt(data []byte) int64 {
    return int64(uint32(data[0]) | uint32(data[1])<<8 | uint32(data[2])<<16 | uint32(data[3])<<24)
}

Go’s standard library has support for searching raw bytes (the bytes package), so neither of the loops shown here is necessary. (My guess is that this code is itself a straight transliteration from some even more naïve Java.) Nor is there any need for the readLittleEndianInt function, since Go’s encoding/binary package already provides suitable functions.
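As a rough sketch of that point, the two loops could be replaced by bytes.LastIndex and bytes.Index, and the helper by binary.LittleEndian. The function name oggDurationMs and the synthetic bytes in main are assumptions made for illustration; they are not from either article. Note that binary.LittleEndian.Uint64 reads all eight bytes of the length, whereas the quoted readLittleEndianInt combines only the first four bytes of the slice it is given.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// oggDurationMs is a hypothetical rewrite of the quoted getOggDurationMs.
// Like the original, it assumes the whole file is already in memory.
func oggDurationMs(data []byte) (int64, error) {
	// The last "OggS" page header holds the stream length in samples as a
	// 64-bit little-endian integer starting 6 bytes after the marker.
	i := bytes.LastIndex(data, []byte("OggS"))
	if i < 0 || i+14 > len(data) {
		return 0, errors.New("no complete OggS page header found")
	}
	length := binary.LittleEndian.Uint64(data[i+6 : i+14])

	// The first "vorbis" marker is followed, 11 bytes in, by the sample
	// rate as a 32-bit little-endian integer.
	j := bytes.Index(data, []byte("vorbis"))
	if j < 0 || j+15 > len(data) {
		return 0, errors.New("no complete vorbis header found")
	}
	rate := binary.LittleEndian.Uint32(data[j+11 : j+15])

	if length == 0 || rate == 0 {
		return 0, errors.New("zero length or sample rate")
	}
	return int64(length) * 1000 / int64(rate), nil
}

func main() {
	// Synthetic bytes laid out like the two headers, not a real Ogg file:
	// a 44100 Hz rate and a length of 441000 samples.
	data := []byte("\x01vorbis\x00\x00\x00\x00\x02\x44\xAC\x00\x00" +
		"OggS\x00\x04\xA8\xBA\x06\x00\x00\x00\x00\x00")
	ms, err := oggDurationMs(data)
	fmt.Println(ms, err) // prints: 10000 <nil>
}
```

This still needs the whole file in memory, so it only tidies the search and decoding; it does nothing about the cost of reading every byte.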

The moral might be to be very wary of the poor-quality code that turns up on the internet. In this case, however, I found the Ogg/Vorbis documentation so difficult to navigate that, poor as the Go and Java code are, they did tell me what data I needed and how to locate it in an .ogg file.

Here’s a naïve Tcl transliteration of the Go algorithm. Since I have all the data in memory, I simply search for the markers (vorbis and OggS) and use Tcl’s binary scan command to interpret the numbers. I return seconds rather than milliseconds, and return 0 on failure.

Tcl Transliteration Version (slow)

proc ogglen::secs filename {
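    # Only consider files with an .ogg or .oga extension.
    if {![regexp -nocase {\.(?:ogg|oga)$} $filename]} { return 0 }
    # Read the whole file into memory (this is what makes it slow).
    set data [readFile $filename binary]
    # The sample rate is a 32-bit little-endian unsigned integer that
    # starts 11 bytes after the first "vorbis" marker.
    set i [string first "vorbis" $data]
    if {$i == -1} { return 0 }
    binary scan [string range $data $i+11 $i+14] iu rate
    # The length in samples is a 64-bit little-endian unsigned integer
    # that starts 6 bytes after the last "OggS" marker.
    set i [string last "OggS" $data]
    if {$i == -1} { return 0 }
    binary scan [string range $data $i+6 $i+13] wu length
    # Avoid dividing by a zero sample rate.
    if {!$rate} { return 0 }
    expr {int(round($length / double($rate)))}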
}

Unfortunately, when reading thousands of files, this approach turns out to be very time-consuming. It also seems wasteful to read in the whole file when we need only a couple of very short byte sequences and a couple of integers.

So, here is the version I actually use.

Tcl Working Version (fast)

proc ogg::duration_in_secs filename {
    if {![regexp -nocase {\.(?:ogg|oga)$} $filename]} { return 0 }
    set rate 0
    set length 0
    set fh [open $filename rb]
    try {
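        # Scan forward in 4080-byte chunks for the first "vorbis" marker;
        # the 32-bit sample rate starts 11 bytes after it.
        while {1} {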
            set data [chan read $fh 4080]
            set size [string length $data]
            set i [string first "vorbis" $data]
            if {$i > -1 && $i+14 < $size} {
                binary scan [string range $data $i+11 $i+14] iu rate
                break
            }
            if {$size < 4080} { break }
            seek $fh -20 current
        }
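        # Scan backward in 4020-byte chunks for the last "OggS" marker;
        # the 64-bit length in samples starts 6 bytes after it. Positions
        # are clamped at 0 so that small files do not cause a seek error.
        seek $fh 0 end
        set pos [expr {max(0, [tell $fh] - 4020)}]
        while {1} {
            seek $fh $pos start
            set data [chan read $fh 4020]
            set size [string length $data]
            set i [string last "OggS" $data]
            if {$i > -1 && $i+13 < $size} {
                binary scan [string range $data $i+6 $i+13] wu length
                break
            }
            if {$pos == 0} { break }
            # Step back 3980 bytes so neighbouring chunks overlap by 40.
            set pos [expr {max(0, $pos - 3980)}]
        }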
    } finally {
        close $fh
    }
    if {!$rate || !$length} { return 0 }
    expr {int(round($length / double($rate)))}
}

In timings, the fast version has always been at least 200 times faster than the naïve approach shown earlier. For example, on a relatively old laptop, the Tcl Transliteration Version took about 23 minutes to read the durations of 3,265 .ogg files, whereas the fast Tcl Working Version took about 7 seconds.
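For comparison, the same chunked strategy can be sketched in Go: read a chunk from the start to find the rate, then read chunks backward from the end to find the length. This is only an illustration under stated assumptions. The names sampleRate, totalSamples, durationSecs and durationOfBytes, the 4096-byte chunk size and the 32-byte overlap are all hypothetical choices, and the input in main is synthetic rather than a real Ogg file.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

const chunk = 4096 // assumed chunk size (the Tcl version uses 4080 and 4020)
const overlap = 32 // bytes shared by neighbouring chunks so no header is split

// sampleRate scans forward for the first complete "vorbis" header.
func sampleRate(r io.ReadSeeker) uint32 {
	buf := make([]byte, chunk)
	for pos := int64(0); ; pos += chunk - overlap {
		if _, err := r.Seek(pos, io.SeekStart); err != nil {
			return 0
		}
		n, _ := io.ReadFull(r, buf)
		if n >= 15 {
			// Searching buf[:n-9] guarantees 15 bytes from the marker.
			if i := bytes.Index(buf[:n-9], []byte("vorbis")); i >= 0 {
				return binary.LittleEndian.Uint32(buf[i+11 : i+15])
			}
		}
		if n < chunk {
			return 0 // end of file reached without finding the header
		}
	}
}

// totalSamples scans backward for the last complete "OggS" page header.
func totalSamples(r io.ReadSeeker) uint64 {
	size, err := r.Seek(0, io.SeekEnd)
	if err != nil {
		return 0
	}
	buf := make([]byte, chunk)
	pos := size - chunk
	for {
		if pos < 0 {
			pos = 0
		}
		if _, err := r.Seek(pos, io.SeekStart); err != nil {
			return 0
		}
		n, _ := io.ReadFull(r, buf)
		if n >= 14 {
			// Searching buf[:n-10] guarantees 14 bytes from the marker.
			if i := bytes.LastIndex(buf[:n-10], []byte("OggS")); i >= 0 {
				return binary.LittleEndian.Uint64(buf[i+6 : i+14])
			}
		}
		if pos == 0 {
			return 0 // start of file reached without finding a page
		}
		pos -= chunk - overlap
	}
}

// durationSecs returns the duration rounded to whole seconds, or 0 on failure.
func durationSecs(r io.ReadSeeker) int64 {
	rate, length := uint64(sampleRate(r)), totalSamples(r)
	if rate == 0 || length == 0 {
		return 0
	}
	return int64((length + rate/2) / rate)
}

// durationOfBytes is a convenience wrapper for in-memory data.
func durationOfBytes(data []byte) int64 {
	return durationSecs(bytes.NewReader(data))
}

func main() {
	// Synthetic input: header bytes, padding, one page header, more padding.
	data := append([]byte("\x01vorbis\x00\x00\x00\x00\x02\x44\xAC\x00\x00"), make([]byte, 10000)...)
	data = append(data, []byte("OggS\x00\x04\xA8\xBA\x06\x00\x00\x00\x00\x00")...)
	data = append(data, make([]byte, 5000)...)
	fmt.Println(durationOfBytes(data)) // prints: 10
}
```

With a real file, an *os.File from os.Open satisfies io.ReadSeeker and could be passed to durationSecs directly.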
