1. 12 Mar, 2019 1 commit
  2. 19 Feb, 2019 1 commit
  3. 06 Feb, 2019 1 commit
    • Bug Fix · b1fbc24b
      harshad untwale authored
      Change-Id: Iffd9c6463686c5f8e10edc3ca7e00a9581910aa4
      b1fbc24b
  4. 25 Jan, 2019 1 commit
  5. 11 Dec, 2018 1 commit
  6. 26 Nov, 2018 1 commit
  7. 21 Nov, 2018 1 commit
    • Misc. sampler-related changes. · f623e3d0
      David Woo authored
      Expanded gather4 workaround to cover gather4_po as well.
      
      Changed ld_mcs definition to allow for different-sized
      coordinate types.
      
      Change-Id: I8bb5da3b131ca4e27d060514e84f7fc24846c5d6
      f623e3d0
  8. 14 Nov, 2018 1 commit
  9. 30 Oct, 2018 1 commit
    • Program scope global values are being tracked in metadata and · fcd8ea37
      xlei3 authored
      are being lost during LLVM optimizations, which causes illegal
      memory accesses when dereferencing the metadata. Add these global
      values to the "llvm.used" list so that they will not be
      removed. Duplicated the "appendToUsed" LLVM function for IGC
      use, since the original can only do non-addrspace pointer
      casts. The new function allows appending any pointer type.
      
      Change-Id: I6f5cb3e1ca8bd18f2d05b3c04533209667889969
      fcd8ea37
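The reasoning in this commit message can be illustrated with a toy model. This is a minimal sketch that does not use the LLVM API: `ToyModule`, `removeDeadGlobals` and `metadataIsValid` are hypothetical names, and the `used` set only stands in for the "llvm.used" list. It shows why a global that is named only in metadata disappears during dead-global removal unless it is also recorded as used.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a module; none of these names come from LLVM or IGC.
struct ToyModule {
    std::set<std::string> globals;      // program-scope globals that exist
    std::set<std::string> referenced;   // globals referenced by instructions
    std::set<std::string> used;         // stand-in for the "llvm.used" list
    std::vector<std::string> metadata;  // global names recorded in metadata
};

// Removes every global that is neither referenced by code nor listed as used.
// Metadata references deliberately do not keep a global alive in this model.
void removeDeadGlobals(ToyModule& m) {
    for (auto it = m.globals.begin(); it != m.globals.end();) {
        const bool keep = m.referenced.count(*it) != 0 || m.used.count(*it) != 0;
        it = keep ? ++it : m.globals.erase(it);
    }
}

// True when every name recorded in metadata still refers to an existing global.
bool metadataIsValid(const ToyModule& m) {
    for (const std::string& name : m.metadata) {
        assert(!name.empty());
        if (m.globals.count(name) == 0) return false;
    }
    return true;
}
```

In this model, adding the global's name to `used` before running `removeDeadGlobals` keeps the metadata entry valid, which mirrors the intent described above.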
  10. 25 Oct, 2018 1 commit
  11. 24 Oct, 2018 1 commit
  12. 27 Sep, 2018 1 commit
  13. 26 Sep, 2018 1 commit
  14. 19 Sep, 2018 1 commit
    • This is the recheckin, with additional code in Emu64OpsPass to handle i64 read_register intrinsic. · eab147c5
      pmistry authored
      This is the first change in a series that will move the gen intrinsics to a read_register-based framework.
      The idea is to use a metadata-based operand mechanism to prevent LLVM passes from optimizing intrinsics that we expect to have constant operands.
      The current change introduces a read_register-based framework and replaces the intrinsic GenISA_RunTimeValue(i32 const) with llvm.read_register(metadata i32 const).
      
      
      Change-Id: I557968ca9db6868ae4bad73771da46952e94320a
      eab147c5
  15. 13 Jul, 2018 2 commits
  16. 12 Jul, 2018 2 commits
    • Fixed a bug where ldraw and storeraw changes the types of the value being... · 4c0b74c6
      xlei3 authored
      Fixed a bug where ldraw and storeraw change the type of the value being stored when converting from load/store instructions.
      
      Since ldraw/storeraw now support values of any type, we don't need to cast them to float types.
      
      Change-Id: I342924a190e104dcb37a137b3f75578c93f0258a
      4c0b74c6
    • While tracing bindless pointers, instead of bailing out when we encounter a... · a34dcf7d
      xlei3 authored
      While tracing bindless pointers, instead of bailing out when we encounter a loop in one of the PHI node paths, we can continue to trace the other paths.
      
      The idea is that looping paths can be safely ignored, as long as non-looping paths all converge on the same source pointer.
      
      Change-Id: I75cba187a4aefb4cab500bb336d1f9daa34b9e6a
      a34dcf7d
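The tracing rule described in this commit can be sketched on a simplified value graph. This is a minimal sketch, not the IGC implementation: `Node`, `Graph`, `traceNode` and `findSource` are hypothetical names, and a node is assumed to be either a source pointer (no incoming values) or a PHI that merges other nodes. Paths that loop back to a node already being traced are skipped, and the trace succeeds only if all remaining paths agree on one source.

```cpp
#include <cassert>
#include <optional>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical value graph: an empty `incoming` list marks a source pointer,
// anything else is treated as a PHI node merging the listed nodes.
struct Node {
    std::vector<int> incoming;
};
using Graph = std::unordered_map<int, Node>;

struct Trace {
    bool conflict = false;      // two non-looping paths disagreed
    std::optional<int> source;  // unique source found so far, if any
};

// Depth-first trace. `onPath` holds the nodes currently being traced, so an
// incoming value that is already on the path is a looping path and is ignored.
Trace traceNode(const Graph& g, int id, std::unordered_set<int>& onPath) {
    const Node& n = g.at(id);
    if (n.incoming.empty()) return Trace{false, id};

    onPath.insert(id);
    Trace out;
    for (int in : n.incoming) {
        if (onPath.count(in) != 0) continue;  // looping path: skip it
        Trace t = traceNode(g, in, onPath);
        if (t.conflict) { out.conflict = true; break; }
        if (!t.source) continue;              // that branch only looped
        if (out.source && *out.source != *t.source) { out.conflict = true; break; }
        out.source = t.source;
    }
    onPath.erase(id);
    if (out.conflict) out.source.reset();
    return out;
}

// Returns the single source pointer all non-looping paths converge on,
// or std::nullopt when they disagree or no non-looping path exists.
std::optional<int> findSource(const Graph& g, int id) {
    assert(g.count(id) != 0);
    std::unordered_set<int> onPath;
    return traceNode(g, id, onPath).source;
}
```

With a PHI that merges a source and a value derived from the PHI itself, `findSource` returns that source. With a PHI that merges two different sources, it returns `std::nullopt`.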
  17. 11 Jul, 2018 1 commit
  18. 03 Jul, 2018 1 commit
  19. 28 Jun, 2018 1 commit
  20. 25 Jun, 2018 1 commit
  21. 01 Jun, 2018 1 commit
  22. 28 May, 2018 1 commit
  23. 25 May, 2018 1 commit
  24. 24 May, 2018 1 commit
  25. 23 May, 2018 1 commit
    • Changes in code. · 547d6437
      jgu authored
      Change-Id: Ica0acf9480067b9847c4c15daf8ce06dbf79ccba
      547d6437
  26. 18 May, 2018 1 commit
  27. 15 May, 2018 1 commit
    • 1. Added a new "any" type for GenISA intrinsics. This type can be overloaded... · ac21e314
      xlei3 authored
      1. Added a new "any" type for GenISA intrinsics. This type can be overloaded with a default value, e.g. "any:float". At function creation, the default value is used if no overloaded type is provided.
      2. Runtime Values can now return the "any" type. This is useful for bindless pointers, as we can get a pointer directly from the payload.
      3. Redesigned and simplified the bindless pointer tracing code due to the above changes.
      4. Since a RuntimeValue can now be any size, alignment needs to be ensured (e.g. 8-byte pointers are still treated as taking 2 RTV offsets).
      
      Change-Id: Ia9f80e6c006e904826996148fd80ebfe04ff8b39
      ac21e314
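Point 4 above can be made concrete with a little arithmetic. This is a minimal sketch under two assumptions that are not stated in the commit: one RTV offset covers 4 bytes (inferred from "8-byte pointers ... 2 RTV offsets"), and a value must start at an offset that is a multiple of the number of offsets it occupies. The names `kBytesPerRtvSlot`, `rtvSlotsFor` and `alignRtvOffset` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: one RuntimeValue (RTV) offset covers 4 bytes.
constexpr std::uint32_t kBytesPerRtvSlot = 4;

// Number of consecutive RTV offsets a value of `sizeInBytes` occupies.
constexpr std::uint32_t rtvSlotsFor(std::uint32_t sizeInBytes) {
    return (sizeInBytes + kBytesPerRtvSlot - 1) / kBytesPerRtvSlot;
}

// Rounds `offset` up to the next multiple of `slots`.
// `slots` must be a non-zero power of two (1, 2, 4, ...).
inline std::uint32_t alignRtvOffset(std::uint32_t offset, std::uint32_t slots) {
    assert(slots != 0 && (slots & (slots - 1)) == 0);
    return (offset + slots - 1) & ~(slots - 1);
}
```

Under these assumptions an 8-byte pointer needs `rtvSlotsFor(8) == 2` offsets, and a pointer requested at offset 3 would be placed at `alignRtvOffset(3, 2) == 4`.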
  28. 08 May, 2018 1 commit
  29. 27 Apr, 2018 2 commits
  30. 03 Apr, 2018 1 commit
    • Commit contains 108 changes. · dbb9f9f1
      xlei3 authored
      Change 1:
          Will check in another fix after opensource builds are back up
        made by: Xiao Lei
      
      Change 2:
          To make subroutine call for emulation functions be on by default.
        made by: Junjie Gu
      
      Change 3:
          [IGC Refactor][Builtins, IGC]: Update changes for Built-in Indexing
          Description for Open Source:
          these allow us to load only the builtins that we need into the user kernel
        made by:  hudson_server
      
      Change 4:
        made by: Michael Liao
      
      Change 5:
          As summary.
        made by: Wei Pan
      
      Change 6:
        made by: Alexander Paige
      
      Change 7:
          Automated integration from mainline to DEV_IGC
        made by:  IGC
      
      Change 8:
          Need to use zext on PHI when doing integer promotion. For example,
           phi i2 [1, %b0] [-1, %b1]
        made by: Junjie Gu
      
      Change 9:
          When icmp is signed, need to do sext (using shl + ashr)
        made by: Junjie Gu
      
      Change 10:
          Missing description
        made by: Junjie Gu
      
      Change 11:
          For CS, if simd8 is not the smallest allowed SIMD size, don't enable subroutines, as subroutines are simd8-only for now.
        made by: Junjie Gu
      
      Change 12:
        made by: Thomas F Raoux
      
      Change 13:
          As summary.
        made by: Wei Pan
      
      Change 14:
          To make subroutine call for emulation functions be on by default.
        made by:  hudson_server
      
      Change 15:
        made by: Po-yu Chen
      
      Change 16:
          add more acc restrictions
        made by: Weiyu Chen
      
      Change 17:
          Make sure the unneeded code is indeed removed.
        made by: Junjie Gu
      
      Change 18:
          Handle insertelementinst and waveshuffle specially
        made by: Junjie Gu
      
      Change 19:
          In FF, unconstrained variables can be assigned freely; RR helps the scheduling effort.
          Loosen the initial condition: RR is applied only to nodes in the first-time unconstrained list.
        made by:  hudson_server
      
      Change 20:
          Fix alignment in CreateBufferLoad() for types smaller than 4 bytes. Set correct address alignment in ldraw intrinsics.
        made by: Lukasz Gotszald
      
      Change 21:
        made by: Thomas F Raoux
      
      Change 22:
        made by: Thomas F Raoux
      
      Change 23:
      
        made by:  hudson_server
      
      Change 24:
        made by: Thomas F Raoux
      
      Change 25:
          Phi instructions always need to be at the beginning of a given basic block.
        made by:  hudson_server
      
      Change 26:
          [IGC Backout][IGC]: Revert CL#751879
          Description for Open Source:
          - due to performance regression
        made by:  hudson_server
      
      Change 27:
          Provide global and local variable splitting in single pass, and handled in the same way.
        made by:  hudson_server
      
      Change 28:
          [IGC Refactor][IGC]: Step2 to remove deprecated sampler intrinsics. Missing file in previous CL
          Description for Open Source:
        made by:  hudson_server
      
      Change 29:
          In FF, unconstrained variables can be assigned freely; RR helps the scheduling effort.
          Loosen the initial condition: RR is applied only to nodes in the first-time unconstrained list.
        made by: Bu Qi Cheng
      
      Change 30:
        made by: Thomas F Raoux
      
      Change 31:
          [IGC BugFix][DX9_FE]: Fix more regression due to refactoring
          Description for Open Source:
           Fix more regression due to refactoring
        made by:  hudson_server
      
      Change 32:
          Provide global and local variable splitting in single pass, and handled in the same way.
        made by: Bu Qi Cheng
      
      Change 33:
          [IGC Refactor][IGC]: Last step of legacy sample removal
          Description for Open Source:
          Remove legacy intrinsics
        made by:  hudson_server
      
      Change 34:
          Fix wrong size of string passing.
          This caused the last char of "options" string to be cut off, therefore module metadata was incorrect.
        made by: Andrzej Ratajewski
      
      Change 35:
          Back-out of a previous change.
        made by: Michael Liao
      
      Change 36:
        made by: Tomasz Bujewski
      
      Change 37:
        made by: Thomas F Raoux
      
      Change 38:
          Internal feature
        made by:  hudson_server
      
      Change 39:
          Unexpected problem has been detected in VulkanULT. It was not reproduced at Sanity, preETM nor locally.
        made by: Lukasz Gotszald
      
      Change 40:
          Phi instructions always need to be at the beginning of a given basic block.
        made by: Jacek Jankowski
      
      Change 41:
          This way IGC-NEO interface will be available in IGC when building IGC with Apple support (for non-apple OS).
        made by: Jaroslaw Chodor
      
      Change 42:
          Fix alignment in CreateBufferLoad() for types smaller than 4 bytes. Set correct address alignment in ldraw intrinsics.
        made by: Lukasz Gotszald
      
      Change 43:
        made by: Thomas F Raoux
      
      Change 44:
        made by: Thomas F Raoux
      
      Change 45:
      
        made by: Mariusz Merecki
      
      Change 46:
        made by: Junjie Gu
      
      Change 47:
        made by: Bu Qi Cheng
      
      Change 48:
          Emulation pass also emulates i64 div/mod. So, make sure it is for double emulation before enabling subroutine support.
          Code Review:
          trivial
        made by: Junjie Gu
      
      Change 49:
          add a missing check on whether mad src0 can be acc
        made by: Weiyu Chen
      
      Change 50:
          As summary.
        made by: Wei Pan
      
      Change 51:
          Enables stateless support on BDW by default
        made by: Xiao Lei
      
      Change 52:
          Added compiler support for GTPin
        made by: Xiao Lei
      
      Change 53:
          Refactored GTPin return data to contain the correct version and request status.
        made by: Xiao Lei
      
      Change 54:
        made by: Thomas F Raoux
      
      Change 55:
      
        made by: Thomas F Raoux
      
      Change 56:
          Back-out of a previous change.
        made by: Mariusz Merecki
      
      Change 57:
      
        made by:  IGC
      
      Change 58:
          Provide global and local variable splitting in single pass, and handled in the same way.
        made by:  IGC
      
      Change 59:
      
        made by: Mariusz Merecki
      
      Change 60:
          Provide global and local variable splitting in single pass, and handled in the same way.
        made by: Bu Qi Cheng
      
      Change 61:
          internal feature
        made by: Weiyu Chen
      
      Change 62:
          See if we can remove this seemingly useless check
        made by: Weiyu Chen
      
      Change 63:
        made by: Thomas F Raoux
      
      Change 64:
        made by: Thomas F Raoux
      
      Change 65:
        made by: Piotr Mochocki
      
      Change 66:
          Assert is unnecessary as it will be handled in other parts of the pass.
          Fixes a regression from CL751781
        made by: Xiao Lei
      
      Change 67:
        made by: Jose Santillan
      
      Change 68:
        made by: Michael Liao
      
      Change 69:
        made by: Pawel Jurek
      
      Change 70:
          In FF, unconstrained variables can be assigned freely; RR helps the scheduling effort.
        made by:  IGC
      
      Change 71:
          CPack: switch to component based packaging
          Group IGC artifacts into igc component. This change enables creation of separable
          IGC installation packages (deb,rpm,tgz) by top level build system.
        made by: Lukasz Filipkowski
      
      Change 72:
        made by: Michael Liao
      
      Change 73:
          Internal feature
        made by: Weiyu Chen
      
      Change 74:
          As summary.
        made by: Wei Pan
      
      Change 75:
          enable multi-acc replacement for certain platforms
        made by: Weiyu Chen
      
      Change 76:
        made by: Thomas F Raoux
      
      Change 77:
          Accidental Checkin - Backing out of previous CL
        made by: Xiao Lei
      
      Change 78:
          Enables GTPin input read and output writes
        made by: Xiao Lei
      
      Change 79:
          don't do acc replacement for dst without any use
        made by: Weiyu Chen
      
      Change 80:
          Added support in Legalization pass to handle store of illegal int types.
          Added support in PeepholeTypeLegalizer pass to handle PHI and Trunc instructions with illegal int types.
        made by: Xiao Lei
      
      Change 81:
          acc substitution relies on def-use on virtual declares, and removeReudundMov pass can break it
        made by: Weiyu Chen
      
      Change 82:
          - ensure all required values are present
        made by: Michael Liao
      
      Change 83:
          - limit to OpenCL shader only
        made by: Michael Liao
      
      Change 84:
        made by: Thomas F Raoux
      
      Change 85:
          Back-out of a previous change.
        made by: Michael Liao
      
      Change 86:
          To make subroutine call for emulation functions be on by default.
        made by: Junjie Gu
      
      Change 87:
          Turning full Half Promotion pass back on for OCL due to functional regressions.
        made by: Jacek Jankowski
      
      Change 88:
          Gen9 doesn't support bindless sampler heap. These changes follow DirectX approach to support a large number of samplers. The idea is to use sampler header to pass pointer to the sampler state.
          These changes are not complete as we also need corresponding UMD changes but we can safely check them in and enable later when UMD is ready.
          Proposed solution is:
          - UMD creates separate descriptor set heap just for samplers in the indirect state heap. So UMD has to manage two bindless heaps one for surfaces and one for samplers.
          - The assumption is that a descriptor set offset "n" has the same offset in both heaps. This will limit the number of offsets that need to go through constant registers
          - Additionally UMD will pass us the base offset of the emulated bindless heap (offset relative to the dynamic state base address)
          - Compiler will add this base offset to every sampler binding and IGC will program the sampler state offset in the header.
        made by: Mariusz Merecki
      
      Change 89:
        made by: Thomas F Raoux
      
      Change 90:
          Update SpirV OpAtomicIDecrement instruction to use the DEC not PREDEC HW operation.
        made by: Lukasz Gotszald
      
      Change 91:
          Backing out previous commit, as CI is down and checkins are halted, so no testing for this will occur.
        made by: Xiao Lei
      
      Change 92:
          Added support in Legalization pass to handle store of illegal int types.
          Added support in PeepholeTypeLegalizer pass to handle PHI and Trunc instructions with illegal int types.
        made by: Xiao Lei
      
      Change 93:
          Automated integration from mainline to DEV_IGC
        made by:  IGC
      
      Change 94:
          In FF, unconstrained variables can be assigned freely; RR helps the scheduling effort.
        made by: Bu Qi Cheng
      
      Change 95:
          revert CL#750441
        made by: Michael Liao
      
      Change 96:
          Change CS SIMD32 heuristics to consider loop stall cost, to allow more shaders to be compiled as SIMD32.
        made by:  hudson_server
      
      Change 97:
          --add check for add dst in VxH mode
          --add check for src modifier in mul inst
        made by: Weiyu Chen
      
      Change 98:
          GTPin data is sent to driver through patch token
        made by: Xiao Lei
      
      Change 99:
          add code to support acc on mad src0. currently disabled by default.
        made by: Weiyu Chen
      
      Change 100:
          Change CS SIMD32 heuristics to consider loop stall cost, to allow more shaders to be compiled as SIMD32.
        made by: Peng Guo
      
      Change 101:
          Added support to enabling subroutine for emulation functions. The default is off for now.
          Don't expect any functional changes.
        made by: Junjie Gu
      
      Change 102:
          When lexical scope information is present for -gline-tables-only, it leads to unexpected behavior when building the lexical scope DIE because of missing MCSymbol labels for instructions. This patch skips building the lexical scope DIE in such cases.
        made by: Pratik J Ashar
      
      Change 103:
          add platform checks on when to use multiAccSubstitution
        made by:  hudson_server
      
      Change 104:
          - 2nd attempt
        made by:  hudson_server
      
      Change 105:
          fix a bug in previous checkin where src2 was incorrectly applied acc
        made by:  hudson_server
      
      Change 106:
          Added support to 3DBuilder to build igc for debugging Metal Shaders and also to build IGC for open source
        made by: Juan1 Rodriguez
      
      Change 107:
          If the address fill move has a different exec size than kernel's simd size, NoMask must be used (e.g., (W) mov (1) a0.0 r1.0:w
        made by: Weiyu Chen
      
      Change 108:
          As Vulkan no longer needs FP64 emulation, do not set NeedFP64() for vulkan.
          This should have no functional change.
        made by: Junjie Gu
      
      Change-Id: I2fc60e3bdbf85c279446c0a3a9a9c8dc60e09cdd
      dbb9f9f1
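Changes 8 and 9 in the commit above distinguish zero extension from sign extension when a narrow integer such as i2 is promoted to a wider type. The following is a minimal sketch of that distinction on a 32-bit container. `zextFrom` and `sextFrom` are hypothetical helper names, not IGC functions, and the shl + ashr sequence is written as plain C++ shifts.

```cpp
#include <cassert>
#include <cstdint>

// Zero extension: keep only the low `bits` bits of the promoted value.
inline std::uint32_t zextFrom(std::uint32_t v, unsigned bits) {
    assert(bits >= 1 && bits <= 32);
    return bits == 32 ? v : (v & ((1u << bits) - 1u));
}

// Sign extension via shl + ashr: shift the narrow value's sign bit up to
// bit 31, then shift back with an arithmetic right shift.
// The left shift is done on an unsigned value to avoid undefined behavior.
// Right-shifting a negative signed value is arithmetic on mainstream
// compilers and guaranteed to be so from C++20 onward.
inline std::int32_t sextFrom(std::uint32_t v, unsigned bits) {
    assert(bits >= 1 && bits <= 32);
    const unsigned shift = 32 - bits;
    return static_cast<std::int32_t>(v << shift) >> shift;
}
```

For the i2 constant -1 from Change 8 (bit pattern 0b11), `zextFrom(0b11, 2)` yields 3 while `sextFrom(0b11, 2)` yields -1, which is why a signed comparison needs the sign-extended form.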
  31. 16 Mar, 2018 1 commit
  32. 09 Mar, 2018 1 commit
    • Commit contains 30 changes. · 479b29e5
      rishipal authored
      Change 1:
          Adding a regkey LimitConstantBuffersPushed to enable/disable limiting the number of constant buffers
        made by: Rishipal S Bhatia
      
      Change 2:
          Added a new pattern to the MatchRegisterRegion. In this case we are matching Shuffle(shr(laneid, const)).
        made by: Juan1 Rodriguez
      
      Change 3:
          Updated the cmake file
        made by: Pankaj Mistry
      
      Change 4:
          This is the IGC part of fix for issue described in: https://github.com/intel/compute-runtime/issues/21
        made by: Pawel Jurek
      
      Change 5:
          Added a new pattern to the MatchRegisterRegion. In this case we are matching Shuffle(shr(laneid, const)).
        made by:  hudson_server
      
      Change 6:
          N/A
        made by: Manohara Kariganur
      
      Change 7:
      
        made by: Thomas F Raoux
      
      Change 8:
          N/A
        made by: Manohara Kariganur
      
      Change 9:
          Fixing a couple of leaks reported by:
        made by: Juan1 Rodriguez
      
      Change 10:
          N/A
        made by: Manohara Kariganur
      
      Change 11:
          This CL refactors IGC code to read maxWorkGroupSize information from the metadata for compute shaders and choose the appropriate simd mode.
        made by: Rishipal S Bhatia
      
      Change 12:
          Added a new pattern to the MatchRegisterRegion. In this case we are matching Shuffle(shr(laneid, const)).
        made by: Juan1 Rodriguez
      
      Change 13:
          The previous perf regression was due to BB layout
        made by: Junjie Gu
      
      Change 14:
          Hoist loop invariant multiplies outside of loop, fp unsafe optimization.
        made by: Peng Guo
      
      Change 15:
          Michael Liao investigated performance regression in basemark_julia and found that the issue is related to Clang upgrade. The issue was related to vec3 handling and was fixed in Clang by adding optional -fpreserve-vec3-type option. It was added to our runtime, but wasn't added to CmakeLists responsible for built-in generation.
        made by: Pawel Jurek
      
      Change 16:
          Changed how libraries are moved from the temporary directory to the destination. Instead of copying, cmake will now make symlinks.
        made by: Lukasz Wesierski
      
      Change 17:
          BDW platform does not support write to cube texture through HDC.
          This workaround consists of the following parts:
          1. Adds a new field called cubeTo2DArrayWATable to the compute compiler output; its indices correspond to the location indices of textures.
          2. Analyses the shader inputs and, if it finds a cube texture with a write or read/write access qualifier, changes the corresponding element of cubeTo2DArrayWATable in the compute compiler output.
          3. Adds a flag that switches between the old and new workaround, so that whoever implements the driver side can test it. Once the driver changes are done, I plan to remove this flag and make the code clearer. For now the old WA is enabled; the new one will be enabled once the driver changes are done.
        made by: Andrzej Ratajewski
      
      Change 18:
          Reduce the time of split checking for interference graph building
        made by: Bu Qi Cheng
      
      Change 19:
          Compile time logging
        made by: Peng Guo
      
      Change 20:
          Back-out of a previous change.
        made by:  IGC
      
      Change 21:
          Back-out of a previous change.
        made by:  IGC
      
      Change 22:
          Back-out of a previous change.
        made by:  IGC
      
      Change 23:
          Spir-V instruction OpCompositeConstruct crashes the driver when vector operands are used. The fix extracts the vector elements before inserting them into the new composite object.
        made by: Lukasz Gotszald
      
      Change 24:
          HW swapping is only triggered by the first simd8 in a simd16, but both sources of the two simd8s will be swapped.
          Inter read suppression is not supported for simd16 instructions
        made by: Bu Qi Cheng
      
      Change 25:
          Apply rule: "elements within a `Width' cannot cross GRF boundaries"
        made by: Bu Qi Cheng
      
      Change 26:
          Fix input payload layouts
        made by: Jose Santillan
      
      Change 27:
          Init tables are already declared in wa_def.h. They are not needed here.
        made by: Anupama Chandrasekhar
      
      Change 28:
          Automated integration from mainline to DEV_IGC
        made by:  IGC
      
      Change 29:
          If two succs are empty BBs, select one based on some rules, rather than return the first succ all the time.
        made by: Junjie Gu
      
      Change 30:
        made by: Xiao Lei
      
      Change-Id: I13ae7da8467fcd9214ef24f07893bd979d06b10b
      479b29e5
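Change 2 in the commit above mentions matching Shuffle(shr(laneid, const)) as a register region. The following is a minimal sketch of why such a shuffle can be expressed as a strided read. It assumes the usual `<vstride; width, hstride>` region model, and it is an assumption that this is the region IGC emits. `shuffleShr` and `readRegion` are hypothetical helpers, not IGC functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Models Shuffle(x, laneid >> shift): lane i reads the value of lane (i >> shift).
std::vector<int> shuffleShr(const std::vector<int>& x, unsigned shift) {
    std::vector<int> out(x.size());
    for (std::size_t lane = 0; lane < x.size(); ++lane) {
        out[lane] = x[lane >> shift];
    }
    return out;
}

// Models a region read <vstride; width, hstride> over `lanes` lanes:
// lanes are grouped into rows of `width` elements, consecutive elements in a
// row are `hstride` apart, and consecutive rows start `vstride` apart.
std::vector<int> readRegion(const std::vector<int>& x, std::size_t lanes,
                            std::size_t vstride, std::size_t width,
                            std::size_t hstride) {
    assert(width != 0);
    std::vector<int> out;
    out.reserve(lanes);
    for (std::size_t lane = 0; lane < lanes; ++lane) {
        const std::size_t row = lane / width;
        const std::size_t col = lane % width;
        out.push_back(x.at(row * vstride + col * hstride));
    }
    return out;
}
```

In this model a shift by `c` matches the region `<1; 2^c, 0>`: for eight lanes, `shuffleShr(x, 1)` and `readRegion(x, 8, 1, 2, 0)` produce the same sequence.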
  33. 22 Feb, 2018 1 commit
    • Fixes to CISA.y · 6faf83d1
      Junjie Gu authored
      Change-Id: Ic5ba97b2d4ad21dcdd04b8c6bfd1862221d4eaf1
      6faf83d1
  34. 30 Jan, 2018 1 commit