Add functions for secure extraction of archives
Several projects at GitLab extract archive files, and some of their extraction implementations have been vulnerable to Zip Slip attacks in the past.
This MR introduces functions for safe extraction of zip, tar, tar.gz, and tar.bz2 archives:
- Zip Slip protection: If a filename within an archive contains directory traversals (`../`), the extraction is stopped and an error is returned.
- Symlink attack protection: Only extraction of directories and regular files is supported. All other file types, such as symbolic links, are ignored.
- Resource exhaustion protection: The extract functions are context-aware, so an extraction can be canceled or timed out. This provides some protection against Zip Bomb-like attacks, although it is not bulletproof.
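
To make the first two protections concrete, here is a minimal sketch of the kind of checks involved. It is an illustration only, not the code in this MR: `safeTarget`, `extractable`, and `errTraversal` are hypothetical names, and the actual implementation may differ in detail.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"path/filepath"
	"strings"
)

// errTraversal is a hypothetical sentinel error used only in this sketch.
var errTraversal = errors.New("archive entry contains a directory traversal")

// safeTarget is a hypothetical helper. It maps an archive entry name to a
// path below destDir and rejects any name that contains a ".." element.
func safeTarget(destDir, name string) (string, error) {
	// Entry names normally use forward slashes; backslashes are treated the
	// same way here to stay conservative.
	normalized := strings.ReplaceAll(name, `\`, "/")
	for _, elem := range strings.Split(normalized, "/") {
		if elem == ".." {
			return "", fmt.Errorf("%w: %q", errTraversal, name)
		}
	}

	target := filepath.Join(destDir, filepath.FromSlash(normalized))

	// Defense in depth: the cleaned target must still be inside destDir.
	rel, err := filepath.Rel(destDir, target)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("%w: %q", errTraversal, name)
	}
	return target, nil
}

// extractable reports whether an entry would be written to disk in this
// sketch: only directories and regular files qualify, so symbolic links,
// devices, and other special files are skipped.
func extractable(mode fs.FileMode) bool {
	return mode.IsDir() || mode.IsRegular()
}
```

In this sketch, an extractor would call `extractable` with the mode of each entry and silently skip anything that fails the check, then call `safeTarget` and abort the whole extraction on the first error.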
Zip example:
    package main

    import (
        "context"
        "os"

        "gitlab.com/gitlab-org/labkit/archive/zip"
    )

    func main() {
        f, err := os.Open("archive.zip")
        if err != nil {
            panic(err)
        }
        defer f.Close()

        // zip.Extract takes the archive size, so stat the file first.
        fi, err := f.Stat()
        if err != nil {
            panic(err)
        }

        // Extract the archive into ./out_dir.
        if err := zip.Extract(context.Background(), f, fi.Size(), "./out_dir"); err != nil {
            panic(err)
        }
    }

Tarball example:
    package main

    import (
        "context"
        "os"

        "gitlab.com/gitlab-org/labkit/archive/tar"
    )

    func main() {
        f, err := os.Open("archive.tar") // or archive.tar.gz / archive.tar.bz2
        if err != nil {
            panic(err)
        }
        defer f.Close()

        // Extract the tarball into ./out_dir.
        if err := tar.Extract(context.Background(), f, "./out_dir"); err != nil {
            panic(err)
        }
    }
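
Both examples pass `context.Background()`, which never expires. To benefit from the resource exhaustion protection, a caller would instead pass a context with a deadline or a cancel function as the first argument to `Extract`. The following sketch shows the general pattern of a context-aware extraction loop using only the standard library. The `entry` type and the `extractEntries` and `extractWithTimeout` functions are hypothetical stand-ins for illustration, not part of this MR.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// entry is a hypothetical stand-in for one archive member.
type entry struct {
	name string
	data []byte
}

// extractEntries is a hypothetical context-aware loop. It checks ctx before
// handling each entry and stops with the context's error once ctx is done.
func extractEntries(ctx context.Context, entries []entry, write func(entry) error) error {
	for _, e := range entries {
		if err := ctx.Err(); err != nil {
			return fmt.Errorf("extraction stopped before %q: %w", e.name, err)
		}
		if err := write(e); err != nil {
			return err
		}
	}
	return nil
}

// extractWithTimeout bounds the whole extraction with a deadline, so an
// archive that takes too long to process is abandoned instead of running
// indefinitely.
func extractWithTimeout(timeout time.Duration, entries []entry, write func(entry) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return extractEntries(ctx, entries, write)
}
```

A timeout limits how long an extraction runs, not how much it writes before the deadline, which is one reason this protection is only partial.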