Reviewing your artifact on NVIDIA GTX 1080
Hi Michel,
I managed to reproduce your artifact on my Ubuntu 16.04 server with an NVIDIA GTX 1080 card. I have attached all logs, the produced results, and the PDF.
The Docker version worked fine (though I didn't use it for the performance evaluation, and I don't have nvidia-docker installed). However, I have to admit that building and running your artifact natively was a challenge ;) ! Since it is a very impressive project with lots of software development, I hope that my feedback will be useful for improving the documentation/installation.
=============================
First of all, I had never used git-lfs before, so I would suggest providing instructions on how to install it directly in the artifact description rather than just linking to the tool. As it was, I had to search for how to install it on my Ubuntu (apt-get install didn't work, so I had to download and install a Debian package manually). The Debian package installation worked fine:
$ git version
git version 2.7.4
$ git lfs version
git-lfs/1.5.3 (GitHub; linux amd64; go 1.7.4)
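For reference, installing the downloaded Debian package boils down to something like the following (the .deb file name is just an example for the version I happened to use; adjust it as needed):
$ sudo dpkg -i git-lfs_1.5.3_amd64.deb   # install the manually downloaded package
$ git lfs install                        # set up the git-lfs filters/hooks for git
Including a short snippet like this in the artifact description would save other evaluators some searching.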
=============================
Git cloning worked fine too:
$ git clone https://gitlab.com/michel-steuwer/cgo_2017_artifact.git
Cloning into 'cgo_2017_artifact'...
remote: Counting objects: 2578, done.
remote: Compressing objects: 100% (128/128), done.
remote: Total 2578 (delta 56), reused 0 (delta 0)
Receiving objects: 100% (2578/2578), 36.57 MiB | 9.44 MiB/s, done.
Resolving deltas: 100% (893/893), done.
Checking connectivity... done.
Downloading amd.tar.bz2 (52.49 MB)
Username for 'https://gitlab.com':
Password for 'https://gitlab.com':
Downloading nvidia.tar.bz2 (51.65 MB)
Username for 'https://gitlab.com':
Password for 'https://gitlab.com':
Downloading rodinia_3.1.tar.bz2 (434.76 MB)
Username for 'https://gitlab.com':
Password for 'https://gitlab.com':
Checking out files: 100% (2109/2109), done.
=============================
Docker build and run worked fine (I needed sudo on my machine):
$ sudo docker build -t lift docker
$ sudo docker run -it --rm -v `pwd`:/lift lift
However, I was a bit confused about what the ROOT environment variable was. I only later found out that I should set it to /lift - I suggest mentioning this explicitly in the Docker description.
I also tried to run $ROOT/lift/scripts/Listing1, but it was not found. Again, I only later found that I need to run buildRunScripts.py before using these scripts (see the short sketch below). Maybe you should move the Docker notes to the end of the documentation?
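To make this explicit, the sequence that eventually worked for me inside the container was roughly the following (assuming the repository is mounted at /lift, as in the docker run command above):
$ export ROOT=/lift                       # ROOT points to the mounted repository
$ $ROOT/lift/scripts/buildRunScripts.py   # generates the run scripts (Listing1, ...)
$ $ROOT/lift/scripts/Listing1             # now the script exists and can be run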
After that, Listing1 worked fine and generated the graph (see attached files).
=============================
Non-Docker installation:
I installed all dependencies without problems:
$ sudo apt-get update
$ sudo apt-get install -y default-jdk cmake g++ wget unzip libclang-dev llvm libgl1-mesa-dev freeglut3-dev libglew-dev libz-dev libssl-dev opencl-headers zsh python finger r-base r-cran-ggplot2 apt-transport-https graphviz
However, I could not install SBT via your commands. I later found out that I need sudo to be able to install it, and the correct instructions on my machine are (note the sudo again):
$ echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823
$ sudo apt-get update
$ sudo apt-get install -y sbt
When I then tried to compile with sbt, it again would not work without sudo. In fact, I realized that I should add sudo to all commands in order to run the artifact on my machine (this is important - otherwise I got some very weird behaviour):
$ sudo sbt compile
=============================
I then tried to run $ROOT/lift/scripts/buildRunScripts.py
and waited for 20 minutes, but nothing was happening. I tried different things, but nothing worked until I ran it again with sudo, and then it completed in about 1 minute:
$ sudo $ROOT/lift/scripts/buildRunScripts.py
Maybe you should mention this in the Troubleshooting section; otherwise it is very frustrating to wait for 20 minutes without understanding what is happening (and, surprisingly, there were no errors reported - the script just got stuck) ...
=============================
Then I ran
$ sudo ./build.zsh
All builds were fine apart from "Stencil2D", which failed - see build.zsh.log.
Also, I suggest cleaning the 'build' directories before invoking cmake (rm -rf build); otherwise, if I have to stop a build, fix something, and try again, I get lots of complaints from cmake ;) ...
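For example, a clean out-of-source rebuild along these lines (assuming each benchmark uses its own build/ subdirectory) avoids the stale CMake cache:
$ rm -rf build            # drop the cache left over from an aborted build
$ mkdir build && cd build
$ cmake .. && make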
=============================
When running the baseline, I thought that N-Body was stuck - it took a very long time. Maybe it would be better to state the expected runtimes (see the small timing note below) ... But then it worked fine:
Running N-Body, Nvidia...
Running N-Body, AMD...
Running K-Means...
Running Nearest Neighbour...
Running Molecular Dynamics...
Running MRI-Q...
Running Convolution...
Running Matrix Multiplication...
Running GEMV N...
Running GEMV T...
Parsing results...
I have attached all logs and CSV files from the NVIDIA GTX 1080.
By the way, I guess the AMD results come from running OpenCL in CPU mode?
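As an aside, until the expected runtimes are documented, a cheap workaround is to wrap the run in time (the script name below is only a placeholder for whatever baseline script is actually invoked):
$ time sudo ./run_baseline.zsh   # placeholder name; reports the wall-clock time at the end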
=============================
OpenCL kernels were also generated fine:
Generating code for Convolution...
Generating code for Gemv...
Generating code for KMeans...
Generating code for MD...
Generating code for MMAMD...
Generating code for MRIQ...
Generating code for NBody...
Generating code for NN...
Generating code for MMNvidia...
=============================
Running the generated programs seemed to be fine (no errors - see the logs).
However, when parsing the results, I hit a few issues:
$ sudo ./parse_lift.zsh
Parsing results...
awk: fatal: cannot open file `mm_nvidia_no_simpl_lift_small.log' for reading (No such file or directory)
awk: fatal: cannot open file `mm_nvidia_no_opt_lift_small.log' for reading (No such file or directory)
awk: fatal: cannot open file `mm_nvidia_no_simpl_lift_large.log' for reading (No such file or directory)
awk: fatal: cannot open file `mm_nvidia_no_opt_lift_large.log' for reading (No such file or directory)
awk: fatal: cannot open file `convolution_row_no_opt_lift_small.log' for reading (No such file or directory)
awk: fatal: cannot open file `convolution_row_no_opt_lift_large.log' for reading (No such file or directory)
Maybe this is related to the failed build of Stencil2D?
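To double-check which runs never produced a log, a quick loop over the file names from the awk errors above (just a sketch, to be run in the results directory) shows what is missing:
for f in mm_nvidia_no_simpl_lift_small.log mm_nvidia_no_opt_lift_small.log \
         mm_nvidia_no_simpl_lift_large.log mm_nvidia_no_opt_lift_large.log \
         convolution_row_no_opt_lift_small.log convolution_row_no_opt_lift_large.log; do
    [ -f "$f" ] || echo "missing: $f"   # report logs that were never written
done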
Finally, the speedup graph shows similar trends to your Figure 8, but also some very high speedups (see the attached plot.pdf). It seems that the speedup of the best optimized version is always close to 1 or sometimes much higher (K-Means - I hope it's not a bug). May I ask you to check these logs and results and confirm that they are as expected, please? The differences may be due to the different GPU used (1080 vs 980) ...
Thanks a lot for participating in AE and making your very interesting work public!