  1. 02 Mar, 2020 11 commits
  2. 27 Feb, 2020 1 commit
  3. 26 Feb, 2020 1 commit
  4. 22 Feb, 2020 1 commit
  5. 21 Feb, 2020 6 commits
  6. 20 Feb, 2020 1 commit
  7. 17 Jan, 2020 1 commit
  8. 08 Jan, 2020 1 commit
  9. 31 Dec, 2019 2 commits
  10. 19 Dec, 2019 1 commit
  11. 13 Dec, 2019 2 commits
  12. 06 Dec, 2019 1 commit
    • igzip: cleanup perf test related code · f430953f
      Zhiyuan Zhu authored
      
      
      This patch addresses some cppcheck issues and makes some minor
      changes to maintain code consistency.
      
      - Clean up cppcheck issues.
        [log][igzip/igzip_perf.c] (error) Shifting signed 32-bit value by 31 bits is undefined behaviour
        [log][igzip/igzip_hist_perf.c:132]: (error) Memory leak: outbuf
      
      - Some minor changes to maintain code consistency.
        igzip/igzip_build_hash_table_perf.c
        igzip/igzip_hist_perf.c
        igzip/igzip_semi_dyn_file_perf.c
      
      - Delete unused variables
        outbuf and outbuf_size from igzip/igzip_hist_perf.c
      
      Change-Id: Icbbd8f70de689931c8a844d89e457af8d97c6793
      Signed-off-by: Zhiyuan Zhu <zhiyuan.zhu@arm.com>
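The two cppcheck findings named in the commit above can be illustrated with a minimal C sketch. This is not the patch itself: `bit_mask` and `checksum_copy` are hypothetical helpers invented for illustration. Note also that the commit resolved the `outbuf` leak by deleting the unused variable, whereas this sketch shows the other common remedy of freeing the buffer on every exit path.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a mask with only bit n set (0 <= n <= 31).
 * Writing `1 << 31` shifts a signed int into its sign bit, which is
 * undefined behaviour in C. Shifting an unsigned 32-bit operand is
 * well defined for every n below 32. */
static uint32_t bit_mask(unsigned n)
{
	assert(n < 32);
	return (uint32_t)1 << n;
}

/* Hypothetical helper: copy the input into a scratch buffer, sum its
 * bytes, and release the buffer before returning.
 * Returns 0 on success, -1 if the allocation fails. */
static int checksum_copy(const uint8_t *in, size_t len, uint32_t *sum)
{
	uint8_t *outbuf = malloc(len ? len : 1);
	uint32_t acc = 0;
	size_t i;

	if (outbuf == NULL)
		return -1;	/* nothing was allocated, so nothing to free */

	memcpy(outbuf, in, len);
	for (i = 0; i < len; i++)
		acc += outbuf[i];

	*sum = acc;
	free(outbuf);		/* without this, cppcheck reports a leak */
	return 0;
}
```

A caller would check the return value of `checksum_copy` before reading `*sum`. If a buffer is never read, as with `outbuf` in `igzip/igzip_hist_perf.c`, removing it entirely is the simpler fix.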
  13. 29 Nov, 2019 1 commit
  14. 19 Nov, 2019 1 commit
  15. 13 Nov, 2019 1 commit
    • crc: arm64 implementation tweaks · 4785428d
      Samuel Lee authored
      
      
      + Utilise `pmull2` instruction in main loops of arm64 crc functions and
      avoid the need for `dup` to align multiplicands.
        + Use just 1 ASIMD register to hold both 64b p4 constants,
      appropriately aligned.
      + Interleave quadword `ldr` with `pmull{2}` to avoid unnecessary stalls
      on existing LITTLE uarch (which can only issue these instructions every
      other cycle).
      + Similarly interleave scalar instructions with ASIMD instructions to
      increase likelihood of instruction level parallelism on a variety of
      uarch.
      + Cut down on needless instructions in non-critical sections to help
      performance for small buffers.
      + Extract common instruction sequences into inner macros and move
      them into a shared header, crc_common_pmull.h
      + Use the same human-readable register aliases and register allocation
      in all 4 implementations; never refer to registers without using a
      human-readable alias.
        + Use #defines rather than .req to allow use of same names across
      several implementations
      + Reduce tail case size from 1024B to 64B
      
      + Phrase the `eor` instructions in the main loop to show more clearly
      that pairs of `eor` instructions can be replaced with a single `eor3`
      instruction in the presence of Armv8.2-SHA (should probably be an option
      in multibinary in future).
      
      Change-Id: I3688193ea4ad88b53cf47e5bd9a7fd5c2b4401e1
      Signed-off-by: Samuel Lee <samuel.lee@microsoft.com>
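The folding step described in the commit above can be modelled in portable C to show what the instructions compute. This is a sketch under stated assumptions, not the arm64 assembly from the patch: `clmul64`, `pmull_lo`, `pmull_hi`, `eor3` and `fold_step` are hypothetical names, the lane assignment of the constants is a placeholder, and the values used are not real CRC folding constants. `pmull` carry-less multiplies the low 64-bit lanes of its two operands and `pmull2` the high lanes, which is why both constants can sit in one register without a `dup`.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit value as two 64-bit halves (models one ASIMD register). */
typedef struct {
	uint64_t lo;
	uint64_t hi;
} u128_t;

/* Carry-less (polynomial) 64x64 -> 128-bit multiply over GF(2). */
static u128_t clmul64(uint64_t a, uint64_t b)
{
	u128_t r = { 0, 0 };
	unsigned i;

	for (i = 0; i < 64; i++) {
		if ((b >> i) & 1) {
			r.lo ^= a << i;
			if (i != 0)	/* avoid an out-of-range shift by 64 */
				r.hi ^= a >> (64 - i);
		}
	}
	return r;
}

/* Model of pmull: multiply the low lanes of x and k. */
static u128_t pmull_lo(u128_t x, u128_t k)
{
	return clmul64(x.lo, k.lo);
}

/* Model of pmull2: multiply the high lanes of x and k. */
static u128_t pmull_hi(u128_t x, u128_t k)
{
	return clmul64(x.hi, k.hi);
}

/* Model of eor3: a three-way XOR, equivalent to two chained eor. */
static u128_t eor3(u128_t a, u128_t b, u128_t c)
{
	u128_t r = { a.lo ^ b.lo ^ c.lo, a.hi ^ b.hi ^ c.hi };
	return r;
}

/* One fold: both products combined with the next 16 bytes of data. */
static u128_t fold_step(u128_t x, u128_t k, u128_t data)
{
	return eor3(pmull_lo(x, k), pmull_hi(x, k), data);
}

/* Self-check on hand-computed values (placeholder constants). */
static void fold_self_check(void)
{
	u128_t x = { 3, (uint64_t)1 << 63 };
	u128_t k = { 3, 2 };
	u128_t data = { 5, 1 };
	u128_t r = fold_step(x, k, data);

	/* (x+1)^2 = x^2+1 = 5, and x^63 * x = x^64, i.e. hi = 1 */
	assert(r.lo == 0 && r.hi == 0);
}
```

Because `eor3` takes three inputs, writing the two `eor` instructions of a fold next to each other, as the commit does, makes the substitution mechanical on cores that implement the instruction.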
  16. 01 Nov, 2019 7 commits
  17. 31 Oct, 2019 1 commit