1. 28 Apr, 2012 2 commits
  2. 26 Apr, 2012 2 commits
    • athena's avatar
      change revision to 3.3.2 · cb553a83
      athena authored
      cb553a83
    • athena's avatar
      Remove old aligned_main() hack. · 98229b0d
      athena authored
      On i386, in our benchmark program we used to manually aligned the
      stack to 16-byte boundary via asm trickery.  This was a good idea in
      1999 (and it was actually necessary to make things work) but the hack
      is now obsolete and it seems to break gcc-4.7.  So the hack is now
      gone.
      98229b0d
  3. 29 Mar, 2012 1 commit
  4. 20 Mar, 2012 1 commit
  5. 06 Mar, 2012 2 commits
  6. 02 Mar, 2012 3 commits
  7. 25 Feb, 2012 1 commit
  8. 21 Feb, 2012 2 commits
  9. 20 Feb, 2012 4 commits
  10. 09 Nov, 2011 2 commits
  11. 25 Sep, 2011 1 commit
  12. 18 Sep, 2011 1 commit
  13. 13 Sep, 2011 1 commit
  14. 03 Sep, 2011 2 commits
    • athena's avatar
    • athena's avatar
      shorten ESTIMATE planning time for certain weird sizes · f004d764
      athena authored
      FFTW includes a collection of "solvers" that apply to a subset of
      "problems".  Assume for simplicity that a "problem" is a single 1D
      complex transform of size N, even though real "problems" are much more
      general than that.  FFTW includes three "prime" solvers called
      "generic", "bluestein", and "rader", which implement different
      algorithms for prime sizes.
      
      Now, for a "problem" of size 13 (say) FFTW also includes special code
      that handles that size at high speed.  It would be a waste of time to
      measure the execution time of the prime solvers, since we know that
      the special code is way faster.  However, FFTW is modular and one may
      or may not include the special code for size 13, in which case we must
      resort to one of the "prime" solvers.  To address this issue, the
      "prime" solvers (and others) are proclaimed to be SLOW".  When
      planning, FFTW first tries to produce a plan ignoring all the SLOW
      solvers, and if this fails FFTW tries again allowing SLOW solvers.
      
      This heuristic works ok unless the sizes are too large.  For example
      for 1044000=2*2*2*2*2*3*3*5*5*5*29 FFTW explores a huge search tree of
      all zillion factorizations of 1044000/29, failing every time because
      29 is SLOW; then it finally allows SLOW solvers and finds a solution
      immediately.
      
      This patch proclaims solvers to be SLOW only for small values of N.
      For example, the "generic" solver implements an O(n^2) DFT algorithm;
      we say that it is SLOW only for N<=16.
      
      The side effects of this choice are as follows.  If one modifies FFTW to
      include a fast solver of size 17, then planning for N=17*K will be
      slower than today, because FFTW till try both the fast solver and the
      generic solver (which is SLOW today and therefore not tried, but is no
      longer SLOW after the patch).  If one removes a fast solver, of size say
      13, then he may still fall into the current exponential-search behavior
      for "problems" of size 13*HIGHLY_FACTORIZABLE_N.
      
      If somebody had compleined about transforms of size 1044000 ten years
      ago, "don't do that" would have been an acceptable answer.  I guess the
      bar is higher today, so I am going to include this patch in our 3.3.1
      release despite their side-effects for people who want to modify FFTW.
      f004d764
  15. 27 Aug, 2011 1 commit
  16. 26 Aug, 2011 1 commit
  17. 21 Aug, 2011 1 commit
  18. 19 Aug, 2011 1 commit
  19. 18 Aug, 2011 1 commit
  20. 14 Aug, 2011 1 commit
  21. 11 Aug, 2011 3 commits
  22. 08 Aug, 2011 1 commit
  23. 05 Aug, 2011 5 commits