Fix packaging and some code.

Joe Wass requested to merge github/fork/pwnall/package into master

Created by: pwnall

I read about this project today and think it's really cool. I'd like to work with academic papers in a toy project, and I'm sure many other PhD students would find this tool really useful.

I was very happy to see that this is written in Ruby, and I was hoping to use it inside a Ruby application. The code doesn't seem quite ready for that yet, so I made a few changes to get it closer.

  • I moved the files in lib/ to lib/pdf/extract. Gems should generally not pollute the global require namespace.
  • I updated the paths inside all the files to point to the new location.
  • I added the gems needed to run test/catalog.rb as development dependencies in the gemspec.
  • I created a simple Gemfile that references the gemspec, to make it easy to develop this gem and run its tests.
  • I moved assign.rb and train.rb out of bin/ and turned them into Rake tasks. It seems like they're used for development, so they shouldn't end up in the user's path when the gem is installed.
  • I updated the code to work with the newer pdf-reader API and updated the version number. This should fix #4 (closed).
  • I replaced libsvm-ruby-swig with rb-libsvm. The former crashed on my setup (Ruby 2.0.0 on OS X 10.9) and hasn't been updated in 2 years. The latter has been updated this year, bundles a newer libsvm and, most importantly, doesn't crash Ruby.
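
For illustration, the packaging changes above could look roughly like the gemspec sketched below. This is not the actual file from the branch: the gem name, the version and the summary are placeholders, and 'rake' stands in for the development dependencies, since the real list is whatever test/catalog.rb requires.

```ruby
# Sketch of a gemspec matching the layout described above.
# Gem name, version and the development dependency are assumptions.
spec = Gem::Specification.new do |s|
  s.name    = 'pdf-extract'   # assumed from bin/pdf-extract
  s.version = '0.0.0'         # placeholder, not the real version number
  s.summary = 'Extracts titles and references from PDF files'

  # Library code lives under lib/pdf/extract, so users require
  # namespaced paths instead of top-level file names.
  s.files = Dir['lib/**/*.rb']

  # Only the user-facing script is installed into the user's path;
  # assign.rb and train.rb are Rake tasks now.
  s.executables = ['pdf-extract']

  s.add_runtime_dependency 'pdf-reader'
  s.add_runtime_dependency 'rb-libsvm' # replaces libsvm-ruby-swig

  # Placeholder: the real entries are the gems test/catalog.rb needs.
  s.add_development_dependency 'rake'
end
```

With a gemspec like this, the Gemfile only needs a source 'https://rubygems.org' line followed by a gemspec line, and bundle install then picks up both the runtime and the development dependencies.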

I was able to run bin/pdf-extract to extract titles and references from a PDF, and I was able to run rake assign PDF=..... to build training data from a PDF file. I think this means that my libsvm code changes are correct.
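
As a minimal sketch of how a development script can become a Rake task that takes its input through PDF=, something like the following would work. The helper name pdf_argument and the task body are hypothetical; the real task runs the code that used to live in bin/assign.rb.

```ruby
require 'rake'

# Hypothetical helper: reads the PDF path that Rake copies into the
# environment from a PDF=... argument on the command line.
def pdf_argument(env = ENV)
  path = env['PDF'].to_s
  raise ArgumentError, 'usage: rake assign PDF=<file>' if path.empty?
  path
end

desc 'Build training data from a PDF file'
task :assign do
  pdf = pdf_argument
  # Placeholder body: the real task would call into the library here.
  puts "would build training data from #{pdf}"
end
```

Because Rake turns NAME=value arguments into environment variables, the task needs no argument parsing of its own, and nothing extra is installed into the user's path.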
