1. 30 Jul, 2016 4 commits
      update AUTHORS · b4059944
      Matteo Frigo authored
      b4059944
      Fixes for Windows cross-compilation · 4d0c1894
      Matteo Frigo authored
      These days mingw by default produces binaries that depend on
      libgcc-sjlj-1.dll, which defeats the whole historical point of mingw
      (produce vanilla win32 binaries with no GNU stuff).
      
      Add a hack to link with -static-libgcc, which avoids the problem.
      4d0c1894
      Misc fixes. · a17d44ee
      Matteo Frigo authored
      * sed s/avx[_- ]128[-_ ]fma/avx-128-fma
      * avoid some signed/unsigned casts
      a17d44ee
      Fix SIMD autodetection · f3688be1
      Matteo Frigo authored
      * AVX was not testing for OSXSAVE support
      
      * AVX2 was broken (issuing XGETBV without checking for its presence,
        which fails on Atom)
      
      * AVX512 was broken in the same way as AVX2. I have guessed a fix, but
        I have no way to test it.
      f3688be1
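The ordering problem described in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit: the function name `avx_usable`, the callback type `read_xcr0_fn`, and the fake readers are hypothetical. The bit positions (OSXSAVE and AVX in CPUID leaf 1 ECX, SSE and AVX state in XCR0) follow the Intel architecture manuals. XCR0 is read through a callback so that the sketch stays portable and makes it visible that XGETBV is never reached when OSXSAVE is absent.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits. */
#define CPUID1_ECX_OSXSAVE (1u << 27) /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (1u << 28) /* CPU implements AVX */

/* XCR0 bits: the OS saves/restores XMM (SSE) and YMM (AVX) state. */
#define XCR0_SSE_STATE (1u << 1)
#define XCR0_AVX_STATE (1u << 2)

/* Hypothetical callback standing in for "execute XGETBV with ECX = 0". */
typedef uint64_t (*read_xcr0_fn)(void);

/* Returns 1 only if the CPU has AVX and the OS preserves YMM state.
   XCR0 is read only after OSXSAVE has been confirmed, because XGETBV
   faults on CPUs or OSes that do not provide it. */
static int avx_usable(uint32_t cpuid1_ecx, read_xcr0_fn read_xcr0)
{
    const uint64_t need = XCR0_SSE_STATE | XCR0_AVX_STATE;

    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
        return 0; /* must not touch XGETBV */
    if (!(cpuid1_ecx & CPUID1_ECX_AVX))
        return 0;
    return (read_xcr0() & need) == need;
}

/* Demo stand-ins for the real XGETBV instruction (assumptions). */
static int xcr0_reads = 0;
static uint64_t fake_xcr0_avx_enabled(void) { ++xcr0_reads; return 0x7; }
static uint64_t fake_xcr0_sse_only(void)    { ++xcr0_reads; return 0x3; }

/* Small self-check: without OSXSAVE, XCR0 is never read. */
static void avx_usable_selfcheck(void)
{
    xcr0_reads = 0;
    assert(avx_usable(CPUID1_ECX_AVX, fake_xcr0_avx_enabled) == 0);
    assert(xcr0_reads == 0);
    assert(avx_usable(CPUID1_ECX_OSXSAVE | CPUID1_ECX_AVX,
                      fake_xcr0_sse_only) == 0);
    assert(xcr0_reads == 1);
}
```

In a real build the callback would wrap the XGETBV instruction, and AVX2 or AVX512 checks would presumably add their own CPUID and XCR0 bits on top of the same guard.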
  2. 29 Jul, 2016 5 commits
  3. 05 Jun, 2016 3 commits
  4. 04 Jun, 2016 1 commit
      Cast Police · 29cee6cc
      Matteo Frigo authored
      Eliminate some useless (but harmless) int<->size_t conversions.
      29cee6cc
  5. 13 Mar, 2016 1 commit
  6. 20 Jan, 2016 3 commits
  7. 30 Sep, 2015 2 commits
  8. 08 Sep, 2015 2 commits
  9. 05 Aug, 2015 1 commit
  10. 26 May, 2015 3 commits
      Update VSX SIMD to avoid inline assembly · 8cd9bfa3
      Erik Lindahl authored
      Thanks to some help from Michael Gschwind of
      IBM, this removes the remaining inline assembly
      calls and replaces them with vector functions. This
      avoids interfering with the optimizer on both GCC
      and XLC, and gets us another 3-10% of performance
      when using VSX SIMD. Tested with GCC-4.9 and XLC-13.1
      in single and double precision on little-endian POWER8.
      8cd9bfa3
      Enable SSE2 automatically with AVX, AVX2, or AVX512. · 579cec9a
      Erik Lindahl authored
      256-bit AVX can be significantly slower than
      128-bit SIMD. Despite recommendations, many
      distributions appear to enable only AVX, but not
      SSE. This fixes the problem by also enabling
      SSE when we use the wider SIMD instructions.
      579cec9a
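The implication rule in this commit can be sketched as a small flag-normalization step. The real change presumably lives in the build configuration rather than in C, so everything here (the `SIMD_*` flag names and `imply_sse2`) is a hypothetical illustration of the rule "any wider SIMD option also turns on SSE2", not the project's code.

```c
#include <assert.h>

/* Hypothetical bit flags for the requested SIMD instruction sets. */
#define SIMD_SSE2   (1u << 0)
#define SIMD_AVX    (1u << 1)
#define SIMD_AVX2   (1u << 2)
#define SIMD_AVX512 (1u << 3)

/* If any of the wider instruction sets is requested, also enable SSE2 so
   that 128-bit code remains available where it turns out to be faster. */
static unsigned imply_sse2(unsigned requested)
{
    const unsigned wide = SIMD_AVX | SIMD_AVX2 | SIMD_AVX512;

    if (requested & wide)
        requested |= SIMD_SSE2;
    return requested;
}

/* Small self-check of the rule. */
static void imply_sse2_selfcheck(void)
{
    assert(imply_sse2(SIMD_AVX) == (SIMD_AVX | SIMD_SSE2));
    assert(imply_sse2(0u) == 0u);
}
```

A request without any wide option passes through unchanged, so builds that ask for no SIMD at all are not affected.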
      Turn AVX-128 into AMD-specific AVX-128-FMA · dd80210e
      Erik Lindahl authored
      The only platform where AVX-128 really matters
      is AMD (since the compute units can execute a
      single 256-bit or two 128-bit SIMD instructions),
      so now we only use it there, which means we can
      also enable FMA instructions.
      dd80210e
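One way to picture the vendor-specific choice described here is a small selection function. This is a sketch under stated assumptions, not the logic from the commit: the enum, the function `choose_avx`, and the fallback to 256-bit AVX on AMD parts without FMA are all hypothetical. The vendor string "AuthenticAMD" is the one AMD processors report through CPUID leaf 0.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical outcome of the selection. */
enum avx_choice { AVX_NONE, AVX_256, AVX_128_FMA };

/* Pick an AVX flavor from the CPUID vendor string and two feature flags.
   has_avx and has_fma are assumed to come from CPUID/XGETBV checks that
   the caller has already performed. */
static enum avx_choice choose_avx(const char *vendor, int has_avx, int has_fma)
{
    if (!has_avx)
        return AVX_NONE;
    /* Use the 128-bit FMA variant only on AMD, where a compute unit can
       run two 128-bit operations in place of one 256-bit operation. */
    if (strcmp(vendor, "AuthenticAMD") == 0 && has_fma)
        return AVX_128_FMA;
    return AVX_256; /* assumption: everything else uses 256-bit AVX */
}

/* Small self-check of the two main cases. */
static void choose_avx_selfcheck(void)
{
    assert(choose_avx("AuthenticAMD", 1, 1) == AVX_128_FMA);
    assert(choose_avx("GenuineIntel", 1, 1) == AVX_256);
}
```

Tying the 128-bit path to one vendor is what allows FMA to be assumed inside that path, which matches the reasoning given in the commit message.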
  11. 25 May, 2015 7 commits
  12. 13 May, 2015 1 commit
  13. 07 May, 2015 1 commit
      Separate routines to query 128-bit AVX support · cd2b27d1
      Erik Lindahl authored
      This also disables 256-bit AVX for current AMD processors
      that work better with 128-bit AVX. Note that this is not
      detected by the timing routines since the effect is only
      apparent when using multiple cores.
      cd2b27d1
  14. 20 Apr, 2015 2 commits
  15. 16 Apr, 2015 1 commit
  16. 14 Apr, 2015 1 commit
  17. 13 Apr, 2015 1 commit
  18. 12 Apr, 2015 1 commit