1. 26 Jul, 2019 2 commits
    • hjhornbeck authored · ecd1854f
      - Did a touch more tidying.
      - ArbKnotQuadRejSampler just barely passes the qualitative checks.
      Numeric accuracy may be the problem, which suggests cubic
      interpolation would be worse. It could also be because quadratic
      b-splines are non-interpolating.
    • hjhornbeck authored · c4fa2d1e
      - Caught a potential bug in ArbKnotQuadRejSampler where the maximum at
      the endpoint of one bin didn't propagate to the startpoint of the next
      bin. There's no evidence of the bug in action; however, that could be due
      to the choice of test VSFs.
      - Switched ArbKnotQuadRejSampler::generateEnvelope() to use doubles,
      instead of a pair of fp's, for an accumulator.
      - Did a lot of code and comment tidying.
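The endpoint issue in this commit can be illustrated with a minimal sketch. It assumes each bin holds a single quadratic piece a*x^2 + b*x + c on [lo, hi]; `QuadPiece`, `binMaximum` and `buildEnvelope` are hypothetical names for illustration, not the actual ArbKnotQuadRejSampler code. The per-bin maximum checks both endpoints as well as the interior vertex, so a peak sitting exactly on a shared knot is counted by both neighbouring bins. The envelope area is accumulated in a double, in the spirit of the accumulator change.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bin layout: one quadratic a*x^2 + b*x + c on [lo, hi].
struct QuadPiece {
    double a, b, c;
    double lo, hi;
};

double evalQuad(const QuadPiece& p, double x) {
    return (p.a * x + p.b) * x + p.c;  // Horner form
}

// Maximum of the piece over the closed interval [lo, hi].
// Both endpoints are always evaluated, so a maximum that lies on a
// knot shared with a neighbour shows up in both bins.
double binMaximum(const QuadPiece& p) {
    assert(p.lo <= p.hi);
    double best = std::max(evalQuad(p, p.lo), evalQuad(p, p.hi));
    if (p.a < 0.0) {  // concave piece: the vertex is a maximum
        const double vertex = -p.b / (2.0 * p.a);
        if (vertex > p.lo && vertex < p.hi)
            best = std::max(best, evalQuad(p, vertex));
    }
    return best;
}

// Flat envelope height for every bin, plus the total area under the
// envelope, accumulated in double precision.
std::vector<double> buildEnvelope(const std::vector<QuadPiece>& bins,
                                  double& totalArea) {
    std::vector<double> heights;
    heights.reserve(bins.size());
    totalArea = 0.0;
    for (const QuadPiece& p : bins) {
        const double h = binMaximum(p);
        heights.push_back(h);
        totalArea += h * (p.hi - p.lo);
    }
    return heights;
}
```

For two bins [0, 1] and [1, 2] sharing a peak at x = 1, both envelope heights come out equal to the peak value, which is the behaviour the bug fix is after.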
  2. 25 Jul, 2019 4 commits
  3. 24 Jul, 2019 3 commits
    • hjhornbeck authored · 6a80a79b
      - Forgot to adjust the rejection rate for each bin within
      ArbKnotQuadRejSampler; unless they're all equal, the sample will be
      biased.
      - Caught a bug where some copy-pasted code referenced an invalid
      variable in CUDA-specific code.
      - Caught a bug where the CUDA functor was being launched with the
      incorrect parameters.
      - Removed an unused variable in ArbKnotQuadJumpBiSampler and
      ArbKnotQuadRejSampler.
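For reference, one standard way to keep binned rejection sampling unbiased (not necessarily what this commit does) is to pick each bin with probability proportional to the area under its envelope and to redraw the bin after every rejection. The sketch below assumes flat per-bin envelopes; `EnvelopeBin`, `binWeights` and `sampleBinned` are hypothetical names, and the target `pdf` must never exceed the envelope height inside a bin.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

// Hypothetical bin: a flat envelope of the given height over [lo, hi].
struct EnvelopeBin {
    double lo, hi, height;
};

// Selection weight per bin, proportional to the area under its envelope.
std::vector<double> binWeights(const std::vector<EnvelopeBin>& bins) {
    std::vector<double> w;
    w.reserve(bins.size());
    for (const EnvelopeBin& b : bins)
        w.push_back(b.height * (b.hi - b.lo));
    return w;
}

// Draw one sample from an (unnormalized) pdf bounded by the envelope.
// Choosing bins by envelope area and then a uniform point under that
// bin's envelope is the same as choosing a uniform point under the
// whole piecewise-constant envelope, so plain accept/reject applies.
double sampleBinned(const std::vector<EnvelopeBin>& bins,
                    const std::function<double(double)>& pdf,
                    std::mt19937& rng) {
    assert(!bins.empty());
    // Rebuilt on every call for brevity; real code would cache this.
    const std::vector<double> w = binWeights(bins);
    std::discrete_distribution<std::size_t> pickBin(w.begin(), w.end());
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    for (;;) {
        const EnvelopeBin& b = bins[pickBin(rng)];  // redrawn after each rejection
        const double x = b.lo + (b.hi - b.lo) * unit(rng);
        if (unit(rng) * b.height <= pdf(x))
            return x;
    }
}
```

If bins were instead chosen by some other weight and the whole draw restarted on rejection, bins with looser envelopes would be under-represented, which is the kind of bias the commit message describes.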
    • hjhornbeck authored · e8cf485d
      - Caught a bug where the hints table in the arbQuadRej functor was
      incorrectly used.
      - ArbKnotQuadRejSampler compiles fine, but is in desperate need of testing.
    • hjhornbeck authored · 605b45bf
      - Coded up ArbKnotQuadRejSampler, which attempts to lessen the
      computational load of quadratic b-splines via binned rejection sampling.
      - Tweaked ArbKnotLinBiSampler::allocGPU() to be more friendly towards
      ArbKnotQuadRejSampler.
      - Added more helper functions to ArbKnotQuadRejSampler, as the setup
      function isn't time-critical.
      - Caught a bug where the destructor for ArbKnotQuadRejSampler wasn't
      correctly defined.
      - Thought "CDF" was better described as "cdf", as it's not given to the
      algorithm, so ran a regex on arbknot.cpp's functors.
      - Ditto "JUMP".
      - Caught a typo in rejection.cpp, where I declared variables I didn't
      need to.
  4. 01 Jul, 2019 1 commit
    • hjhornbeck authored · 5b69cad4
      - Caught a logic bug in the quadratic bisection algorithm where the integral was being calculated over the wrong interval.
      - "Caught" a loss of precision when evaluating the above. Didn't isolate
      it to a specific operation, but by whittling down the known-good math I
      was able to consolidate terms without losing much precision.
      - Caught a loss of precision in the above, as the full integral
      blows up if fed knots approximately equal to 1 or -1. The offset
      integral (a=0) doesn't have the same issue.
      - Sketched out the class for the version of quadratic which uses
      rejection binning.
  5. 30 Jun, 2019 1 commit
  6. 26 Jun, 2019 1 commit
  7. 25 Jun, 2019 2 commits
  8. 19 Jun, 2019 1 commit
  9. 18 Jun, 2019 4 commits
  10. 17 Jun, 2019 7 commits
  11. 16 Jun, 2019 7 commits
    • hjhornbeck authored · 0ea2ba23
      - Verified the arbitrary knot variant with a jump table on both CPU and
      GPU! The compressibility observation from last commit also applies here.
      Still don't have an answer for that.
    • hjhornbeck authored · 03cd0d49
      - Added the arbitrary knot sampler, and verified it works on both CPU
      and GPU!
      - .... buuut it's easier to compress than expected. Maybe because I ran
      it with an 8192-entry PDF, and slightly starved the algorithm of precision?
    • hjhornbeck authored · 7a7aa5bd
      - Forgot to mention: something seems off about GPU Linear LUT. It
      compresses MUCH better than the CPU variant. Worse, the output appears
      perfect.
      - Caught a bug where GPU data wasn't being properly initialized for partitioned
      equal-bin LUT.
      - Verified partitioned equal-bin on both CPU and GPU!
      - Verified rejection binning on both CPU and GPU! Performance is
      terrible, though, with the default settings.
      - Verified positive rejection sampling without scaling, both on CPU and
      GPU.
      - 1HG rejection sampling is failing, as I feared.
    • hjhornbeck authored · e5bce2aa
      - Verified both plain equal-bin and linearized LUT on both CPU and GPU!
      - Caught a bug where ApproxLUTsampler::createPartition() would run into
      precision issues and terminate early. On-the-fly updating LUTend worked
      with minimal modification.
      - Ran into uint/int conflicts within ApproxLUTsampler. Decided to switch
      to uints, as negative numbers don't seem to bring any advantage.
    • hjhornbeck authored · 7e8a9f7c
      - Verified MCMC and TOMS748 on both the CPU and GPU!
      - Caught a bug where the miss count pointer was getting mangled. Solved
      by refactoring the retrieval code.
    • hjhornbeck authored · eaacc9ca
      - Caught a bug where the constructor was being run on the CPU, yet the
      pointer it needed was only valid on the GPU. Solved it by shuffling that
      code into the main functor and flipping that function to be
      GPU-callable.
      - Made tweaks to TOMS748, related to the above bug.
      - Cleaned up shared.h a bit.
      - Caught a bug where unified_functors weren't being called.
      - Caught a bug where the root-finding algorithms weren't transferring all
      the LHG data to the GPU.
      - Tested bisection. It works, both CPU and GPU!
    • hjhornbeck authored · f5672b87
      - Caught a bug where CPU threads never started.
      - Renamed some variables in FineTiming, to reflect that they also apply
      to CPU calculations. Minor renaming of output text to match, too.
      - Caught an embarrassing spelling mistake in the help text.
  12. 15 Jun, 2019 2 commits
  13. 13 Jun, 2019 1 commit
  14. 12 Jun, 2019 2 commits
  15. 10 Jun, 2019 1 commit
    • hjhornbeck authored · aa792944
      - BROKEN. Another source file down, three more to go.
      - Outlined three more sampler classes.
      - Changed the output allocation routine to seed all samplers with a random
      value. Allows the LUT/search functions to merge their CPU and GPU
      samplers.
      - Caught a potential bug where one of the CPU threads uses the same RNG
      seed as the main thread.
      - Wrote the new code for generic Metropolis MCMC. Ditched the other
      variations, as they were no better and usually worse.
      - Caught a bug where the misses memory block wasn't being deleted.
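The shared-seed problem mentioned in this commit can be avoided by deriving every thread's seed from a base seed plus a distinct index. The sketch below is an illustration only, not the project's seeding code: `deriveSeeds` is a hypothetical helper, and it uses the SplitMix64 mixing function, which is a bijection on 64-bit integers, so distinct inputs always give distinct seeds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// SplitMix64 mixing step. Every operation (add, xor-shift, multiply by
// an odd constant) is invertible, so the whole function is a bijection.
std::uint64_t splitmix64(std::uint64_t x) {
    x += 0x9E3779B97F4A7C15ULL;
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
    return x ^ (x >> 31);
}

// Returns workerCount + 1 seeds: index 0 for the main thread,
// indices 1..workerCount for the worker threads.
std::vector<std::uint64_t> deriveSeeds(std::uint64_t baseSeed,
                                       std::size_t workerCount) {
    assert(workerCount + 1 > workerCount);  // guard against count overflow
    std::vector<std::uint64_t> seeds;
    seeds.reserve(workerCount + 1);
    for (std::size_t i = 0; i <= workerCount; ++i)
        seeds.push_back(splitmix64(baseSeed + i));
    return seeds;
}
```

Each thread would then construct its own engine, for example `std::mt19937_64 rng(seeds[threadIndex]);`, so that no worker repeats the main thread's stream.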
  16. 09 Jun, 2019 1 commit
    • hjhornbeck authored · fed27066
      - BROKEN. rejection.cpp now compiles, but that leaves several other
      algorithms in a broken state. BADLY needs debugging, too.
      - Added a new typedef to satisfy CUDA.
      - Caught a bug where calculateGandScale() was associated with a sister
      class instead of the parent.
      - Eliminated RejectionCompositeAdj::sample(); the parent class's routine
      works perfectly well.
      - Got rid of the bin count; we already know it via Parameters.binCount.
      - Forgot to add a destructor for RejectionLinearBinned.
      - Caught a LOT of compilation bugs in rejection.cpp, created due to the
      refactoring.
      - Added a new routine which applies calculateGandScale to an entire
      LinearHG. Should allow me to cut out some redundant code.
      - Minor algorithm tidying, too. What was startScale/endScale for, for
      instance? Got rid of scalars for LinearHG envelopes; that's handled by
      the weights.
      - Altered RejectionLinearBinned::setup() to equalize all rejection rates
      between bins. This *should* fix the bias in that algorithm.
      - TODO: more code consolidation. Left some comments hinting at how to do
      this.