Commits on Source (82)
-
GregF authored
The combinator initialization was only looking at the capabilities in the shader and not the inferred capabilities. Geometry and tessellation shaders were not setting the Shader capability which is inferred. So the combinator set was not initialized correctly causing problems for ADCE.
-
Alan Baker authored
* Added TypeManager::RebuildType
  * rebuilds the type and its constituent types in terms of memory owned by the manager.
  * Used by TypeManager::RegisterType to properly allocate memory
* Adding a unit test to expose the issue
* Added some tests to provide coverage of RebuildType
* Added an accessor to the target pointer for a forward pointer
-
David Neto authored
-
David Neto authored
-
David Neto authored
Disable use of Effcee and RE2 with MSVC compilers older than Visual Studio 2015 since RE2 doesn't support them.
-
Józef Kucia authored
Add pkg-config file for shared libraries
Properly build SPIRV-Tools DLL
Test C interface with shared library
Set PATH to shared library file for c_interface_shared test
  Otherwise, the test won't find SPIRV-Tools-shared.dll.
Do not use private functions when testing with shared library
Make all symbols hidden by default for shared library target
-
Andrey Tuganov authored
Added atomic instructions validation rules from https://www.khronos.org/registry/vulkan/specs/1.0/html/vkspec.html#spirvenv-module-validation
-
Andrey Tuganov authored
-
Steven Perron authored
Implementation of the simplification pass.
- Create pass that calls the instruction folder on each instruction and propagate instructions that fold to a copy. This will do copy propagation as well.
- Did not use the propagator engine because I want to modify the instruction as we go along.
- Change folding to not allocate new instructions, but make changes in place. This change had a big impact on compile time.
- Add simplification pass to the legalization passes in place of insert-extract elimination.
- Added test cases for new folding rules.
- Added tests for the simplification pass
- Added a method to the CFG to apply a function to the basic blocks in reverse post order.
Contributes to #1164.
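The reverse post order traversal mentioned in the last bullet can be sketched as follows. This is a minimal illustration, not the SPIRV-Tools implementation: the `Cfg` alias and the function names are hypothetical, and blocks are modelled as plain integer ids with a successor list.

```cpp
#include <cassert>
#include <functional>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a CFG: block id -> successor block ids.
using Cfg = std::unordered_map<int, std::vector<int>>;

// Depth-first walk that appends each block after all of its successors
// have been visited, which yields a post order.
static void PostOrder(const Cfg& cfg, int block,
                      std::unordered_set<int>* seen,
                      std::vector<int>* order) {
  if (!seen->insert(block).second) return;  // already visited
  auto it = cfg.find(block);
  if (it != cfg.end()) {
    for (int succ : it->second) PostOrder(cfg, succ, seen, order);
  }
  order->push_back(block);
}

// Applies `f` to every block reachable from `entry`, in reverse post order.
void ForEachBlockInReversePostOrder(const Cfg& cfg, int entry,
                                    const std::function<void(int)>& f) {
  std::unordered_set<int> seen;
  std::vector<int> order;
  PostOrder(cfg, entry, &seen, &order);
  for (auto it = order.rbegin(); it != order.rend(); ++it) f(*it);
}
```

In a reverse post order every block is visited before its successors, except along back edges, which is why it suits a pass that wants operands simplified before their uses.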
-
GregF authored
-
Alexander Johnston authored
-
Steven Perron authored
-
Józef Kucia authored
-
David Neto authored
-
Steven Perron authored
Create files for constant folding rules. Add the rules for OpConstantComposite and OpCompositeExtract.
-
Arseny Kapoulkine authored
This is important when SPIRV-Headers are not checked out to external/ folder and mirrors other places in the code where spirv.h is included.
-
Steven Perron authored
There seems to only be a single location where the def-use manager is used. It is to get information about a type. We can do that with the type manager instead. Fixes #1285
-
Diego Novillo authored
-
Alan Baker authored
* Undef now marked as varying in ccp
  * this prevents incorrect meet operations since phis were always not interesting
* added a test to catch the bug
-
Stephen McGroarty authored
This patch adds initial support for loop unrolling in the form of a series of utility classes which perform the unrolling. The pass can be run with the command spirv-opt --loop-unroll. This will unroll loops within the module which have the unroll hint set. The unroller imposes a number of requirements on the loops it can unroll. These are documented in the comments for the LoopUtils::CanPerformUnroll method in loop_utils.h. Some of the restrictions will be lifted in future patches.
-
Steven Perron authored
Adds the floating rules for FAdd, FDiv, FMul, and FSub. Contributes to #1164.
-
Steven Perron authored
Adding a map from an id to its set of OpName and OpMemberName instructions. This will be used in KillNameAndDecorates to kill the names for the ids that are being removed. In my test, the compile time for 50 shaders went from 1m57s to 55s. This was on Linux using the release build. Fixes #1290.
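A rough sketch of the idea, assuming a simplified instruction model: `NameInst` and `NameIndex` are hypothetical names invented for illustration and do not mirror the actual SPIRV-Tools classes. The point is that names for an id are found by direct lookup instead of a scan over every name instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical, simplified model of an OpName/OpMemberName instruction.
struct NameInst {
  uint32_t target_id;
  std::string name;
  bool killed;
};

// Hypothetical index from a target id to the name instructions that refer to it.
class NameIndex {
 public:
  void Add(NameInst* inst) { id_to_names_.emplace(inst->target_id, inst); }

  // Marks every name instruction for `id` as killed and forgets them.
  // Returns how many instructions were killed.
  std::size_t KillNamesFor(uint32_t id) {
    auto range = id_to_names_.equal_range(id);
    std::size_t count = 0;
    for (auto it = range.first; it != range.second; ++it) {
      it->second->killed = true;
      ++count;
    }
    id_to_names_.erase(range.first, range.second);
    return count;
  }

 private:
  std::unordered_multimap<uint32_t, NameInst*> id_to_names_;
};
```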
-
Arseny Kapoulkine authored
We can fold OpSelect into one of the operands in two cases:
- condition is constant
- both results are the same

Even if the original shader doesn't have either of these, if-conversion pass sometimes ends up generating instructions like

  %7127 = OpSelect %int %3220 %7058 %7058

And this optimization cleans them up.
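The two cases can be sketched like this. It is an illustration under simplifying assumptions: operands are reduced to ids, and `Condition` and `FoldSelect` are hypothetical names rather than the folder's real interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of what is known about the OpSelect condition.
enum class Condition { kNotConstant, kConstantTrue, kConstantFalse };

// Returns true and stores the surviving operand id in `folded_id` when the
// select can be replaced by one of its operands; returns false otherwise.
bool FoldSelect(Condition cond, uint32_t true_id, uint32_t false_id,
                uint32_t* folded_id) {
  if (true_id == false_id) {  // both results are the same
    *folded_id = true_id;
    return true;
  }
  if (cond == Condition::kConstantTrue) {  // condition is constant
    *folded_id = true_id;
    return true;
  }
  if (cond == Condition::kConstantFalse) {
    *folded_id = false_id;
    return true;
  }
  return false;
}
```

For the instruction quoted in the commit message, the condition %3220 is not constant but both value operands are %7058, so the first check applies.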
-
Lei Zhang authored
unordered_map is not POD. Using it as a static may cause problems when operator new() and operator delete() are customized. Also changed some function signatures to use const char* instead of std::string, which gives the caller the flexibility to avoid creating a std::string.
-
David Neto authored
Need to do this in case cmake is not on the path. This should fix the Android NDK build, as in when building the NDK itself.
-
Nerijus Baliūnas authored
-
Lei Zhang authored
-
Steven Perron authored
Fixes #1311
-
Steven Perron authored
In dead branch elimination, we already recognize unreachable continue blocks, and update OpPhi instruction accordingly. This change adds an extra check: if the head block has exactly 1 other incoming edge, then replace the OpPhi with the value from that edge. Fixes #1314.
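A minimal sketch of the extra check, assuming a simplified OpPhi model. `PhiIncoming` and `TryReplacePhi` are hypothetical names, and the sketch treats every unreachable predecessor alike rather than only the continue block.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified model of one (value, predecessor) pair of an OpPhi.
struct PhiIncoming {
  uint32_t value_id;
  uint32_t pred_block_id;
};

// Ignores incoming pairs whose predecessor is unreachable. If exactly one
// pair is left, stores its value in `replacement_id` and returns true,
// meaning the OpPhi can be replaced by that value. Otherwise returns false.
bool TryReplacePhi(const std::vector<PhiIncoming>& incoming,
                   const std::unordered_set<uint32_t>& unreachable_blocks,
                   uint32_t* replacement_id) {
  uint32_t live_count = 0;
  uint32_t live_value = 0;
  for (const PhiIncoming& in : incoming) {
    if (unreachable_blocks.count(in.pred_block_id) != 0) continue;
    ++live_count;
    live_value = in.value_id;
  }
  if (live_count != 1) return false;
  *replacement_id = live_value;
  return true;
}
```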
-
Arseny Kapoulkine authored
Registering a constant in constant manager establishes a relation between instruction that defined it and constant object. On complex shaders this could result in the constant definition getting removed as part of one of the DCE pass, and a subsequent simplification pass trying to use the defining instruction for the constant. To fix this, we now remove associated constant entries from constant manager when killing constant instructions; the constant object is still registered and can be remapped to a new instruction later. GetDefiningInstruction shouldn't ever return nullptr after this change so add an assertion to check for that.
-
Arseny Kapoulkine authored
This change handles all 6 regular comparison types in two variations, ordered (true if values are ordered *and* comparison is true) and unordered (true if values are unordered *or* comparison is true). Ordered comparison matches the default floating-point behavior on host but we use std::isnan to check ordering explicitly anyway. This change also slightly reworks the floating-point folding support code to make it possible to define a folding operation that returns boolean instead of floating point. These tests exhaustively test ordered/unordered comparisons for float/double. Since for NaN inputs the comparison result doesn't depend on the comparison function, we just test == and !=; NaN inputs result in true unordered comparisons and false ordered comparisons.
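The ordered and unordered variants described above can be sketched as small templates. The function names are hypothetical and only two of the six comparisons are shown; the pattern of an explicit std::isnan check follows the description in the commit message.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Ordered: true only if neither value is NaN *and* the comparison holds.
template <typename T>
bool OrdLessThan(T a, T b) {
  return !std::isnan(a) && !std::isnan(b) && a < b;
}

template <typename T>
bool OrdNotEqual(T a, T b) {
  return !std::isnan(a) && !std::isnan(b) && a != b;
}

// Unordered: true if either value is NaN *or* the comparison holds.
template <typename T>
bool UnordLessThan(T a, T b) {
  return std::isnan(a) || std::isnan(b) || a < b;
}

template <typename T>
bool UnordNotEqual(T a, T b) {
  return std::isnan(a) || std::isnan(b) || a != b;
}
```

With a NaN input the result depends only on the ordered/unordered flavour and not on the comparison itself, which matches the observation in the commit message that testing == and != is enough for NaN inputs.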
-
Steven Perron authored
The simplification pass works better after all of the dead branches are removed. So swapping them around in the legalization passes. Also adding the simplification pass to performance passes right after dead branch elimination. Added CCP to the legalization passes so we can propagate the constants into the branches, and remove as many branches as possible. CCP is designed to still get opportunities even if the branches are dead, so it is a good place for it. Fixes #1118
-
Andrew Woloszyn authored
Bitcasting FloatProxy<->uint_type was hitting a warning with g++8.0.1. Replace bitcasts with new casting traits for FloatProxy.
-
Alan Baker authored
* Now track propagation status and assert on bad statuses
* Added helper methods to access instruction propagation status
* Modified the phi meet operator to properly reflect the paper it is based on
* Modified SSA edge addition so that all edges are added, but only on state changes
* Fixed a bug in instruction simulation where interesting conditional branches would not mark the interesting edge as executed
* Added a test to catch this bug
* Added an ostream operator for SSAPropagator::PropStatus
-
Steven Perron authored
I mixed up two cases when folding an OpCompositeExtract that is fed by an OpCompositeInsert. The specific cases are demonstrated in the new test. I mixed up the conditions for the cases, and treated one like the other. Fixes #1323.
-
Lei Zhang authored
-
Diego Novillo authored
On some shader code we have in our testsuite, Phi insertion is showing massive compile time slowdowns, particularly during destruction. The specific shader I was looking at has about 600 variables to keep track of and around 3200 basic blocks. The algorithm is currently O(var x blocks), which means maps with around 2M entries. This was taking about 8 minutes of compile time. This patch changes the tracking of stored variables to be more sparse. Instead of having every basic block contain all the tracked variables in the map, they now have only the variables actually stored in that block. This speeds up deallocation, which brings down compile time to about 1m20s. Note that this is not the definite fix for this. I will re-write Phi insertion to use a standard SSA rewriting algorithm (https://github.com/KhronosGroup/SPIRV-Tools/issues/893). This contributes to https://github.com/KhronosGroup/SPIRV-Tools/issues/1328.
-
Steven Perron authored
Building the def-use chains is very expensive, so we do not want to invalidate them if it is not necessary. At the moment, it seems like most optimizations are good at not invalidating the def-use chains, but simplification does. This PR gets the simplification pass to keep the analyses valid. Contributes to #1328.
-
Steven Perron authored
This reverts commit ec3bbf09.
-
Arseny Kapoulkine authored
This change implements instruction folding for arithmetic operations that are redundant, specifically:

  x + 0 = 0 + x = x
  x - 0 = x
  0 - x = -x
  x * 0 = 0 * x = 0
  x * 1 = 1 * x = x
  0 / x = 0
  x / 1 = x
  mix(a, b, 0) = a
  mix(a, b, 1) = b

Cache ExtInst import id in feature manager

This allows us to avoid string lookups during optimization; for now we just cache GLSL std450 import id but I can imagine caching more sets as they become utilized by the optimizer.

Add tests for add/sub/mul/div/mix folding

The tests cover scalar float/double cases, and some vector cases. Since most of the code for floating point folding is shared, the tests for vector folding are not as exhaustive as scalar. To test sub->negate folding I had to implement a custom fixture.
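The add/sub/mul/div identities listed in this commit can be sketched as a small classifier. This is only an illustration: `Op`, `Fold` and `FoldRedundant` are hypothetical names, the mix() cases are left out, and the sketch takes the identities as given even though some of them (for example x * 0 = 0 when x is NaN) are not exact under strict IEEE 754 semantics.

```cpp
#include <cassert>

// Hypothetical operation and result kinds for this sketch.
enum class Op { kAdd, kSub, kMul, kDiv };
enum class Fold { kNone, kOtherOperand, kNegatedOtherOperand, kZero };

// Classifies `c op x` (constant_is_lhs == true) or `x op c` (false),
// where `c` is a constant operand and `x` is the non-constant operand.
Fold FoldRedundant(Op op, bool constant_is_lhs, double c) {
  if (c == 0.0) {
    switch (op) {
      case Op::kAdd:  // x + 0 = 0 + x = x
        return Fold::kOtherOperand;
      case Op::kSub:  // 0 - x = -x, x - 0 = x
        return constant_is_lhs ? Fold::kNegatedOtherOperand
                               : Fold::kOtherOperand;
      case Op::kMul:  // x * 0 = 0 * x = 0
        return Fold::kZero;
      case Op::kDiv:  // 0 / x = 0; x / 0 is left alone
        return constant_is_lhs ? Fold::kZero : Fold::kNone;
    }
  }
  if (c == 1.0) {
    if (op == Op::kMul) return Fold::kOtherOperand;  // x * 1 = 1 * x = x
    if (op == Op::kDiv && !constant_is_lhs) {
      return Fold::kOtherOperand;  // x / 1 = x
    }
  }
  return Fold::kNone;
}
```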
-
Steven Perron authored
Fixes #1326.
-
Steven Perron authored
When inlining a function call the instructions in the same basic block as the call get cloned. The clone is added to the set of new blocks containing the inlined code, and the original instructions are deleted. This PR will change this so that we simply move the instructions to the new blocks. This saves on the creation and deletion of the instructions. Contributes to #1328.
-
Alan Baker authored
* No longer assume the branch/switch condition must be bool or int constants (respectively)
* Added a couple unit tests for each case
-
GregF authored
This function now checks for side-effects before adding operand instructions to the dead instruction work list. Because this fix puts more pressure on IsCombinatorInstruction() to be correct, this commit adds all OpConstant* and OpType* instructions to combinator_ops_ set. Fixes #1341.
-
Lei Zhang authored
We already have VS2013 and VS2017, which should be good guards.
-
Steven Perron authored
Fixes a bug at the same time. In `UpdateDefUse`, if the definition already exists, we are not supposed to analyse it again. When you do, the entries for the definition are deleted, and we don't want that. The check for this was wrong.
-
Steven Perron authored
In some shaders there are a lot of very large and deeply nested structures. This creates a lot of work for scalar replacement. Also, since commit ca4457b4 we have been very aggressive at rewriting variables. This has caused a large increase in compile time in creating and then deleting the instructions.

To help lower the costs, I want to run a cleanup of some of the easy loads and stores to remove. This reduces the number of symbols sroa has to work on. It also reduces the amount of code the simplifier has to simplify because it was not generated by sroa.

To confirm the improvement, I ran numbers on three different sets of shaders:

Time to run --legalize-hlsl:
  Set #1: 55.89s -> 12.0s
  Set #2: 1m44s -> 1m40.5s
  Set #3: 6.8s -> 5.7s

Time to run -O:
  Set #1: 18.8s -> 10.9s
  Set #2: 5m44s -> 4m17s
  Set #3: 7.8s -> 7.8s

Contributes to #1328.
-
Stephen McGroarty authored
Support for multiple induction variables within a loop and support for loop condition operands <= and >=.
-
Victor Lomuller authored
It moves all conditional branching and switch whose conditions are loop invariant and uniform. Before performing the loop unswitch we check that the loop does not contain any instruction that would prevent it (barriers, group instructions etc.).
-
David Neto authored
-
Pierre Moreau authored
-
Pierre Moreau authored
Fixes: https://github.com/KhronosGroup/SPIRV-Tools/issues/1144
-
Pierre Moreau authored
Fixes: https://github.com/KhronosGroup/SPIRV-Tools/issues/1218
-
Stephen McGroarty authored
This change makes the IR builder use the type manager to generate OpTypeInts when creating OpConstants. This avoids dangling references being stored by the created OpConstants.
-
Alan Baker authored
Adding basis of arithmetic merging
* Refactored constant collection in ConstantManager
* New rules:
  * consecutive negates
  * negate of arithmetic op with a constant
  * consecutive muls
  * reciprocal of div
* Removed IRContext::CanFoldFloatingPoint
  * replaced by Instruction::IsFloatingPointFoldingAllowed
* Fixed some bad tests
* added some header comments

Added PerformIntegerOperation
* minor fixes to constants and tests
* fixed IntMultiplyBy1 to work with 64 bit ints
* added tests for integer mul merging

Adding test for vector integer multiply merging

Adding support for merging integer add and sub through negate
* Added tests

Adding rules to merge mult with preceding divide
* Has a couple tests, but needs more
* Added more comments

Fixed bug in integer division folding
* Will no longer merge through integer division if there would be a remainder in the division
* Added a bunch more tests

Adding rules to merge divide and multiply through divide
* Improved comments
* Added tests

Adding rules to handle mul or div of a negation
* Added tests

Changes for review
* Early exit if no constants are involved in more functions
* fixed some comments
* removed unused declaration
* clarified some logic

Adding new rules for add and subtract
* Fold adds of adds, subtracts or negates
* Fold subtracts of adds, subtracts or negates
* Added tests
-
David Neto authored
Use indirection through latest_version_spirv.h Also, when generating enum tables, use the unified1 JSON grammar since it now has FragmentFullyCoveredEXT but the other JSON grammars don't. They are starting to fall behind.
-
Steven Perron authored
The algorithm used in DCEInst to remove dead code is very slow. It is fine if you only want to remove a small number of instructions, but, if you need to remove a large number of instructions, then the algorithm in ADCE is much faster.

This PR removes the calls to DCEInst in the load-store removal passes and adds a pass of ADCE afterwards.

I tried a number of different orderings of the optimizations, and I believe this is the best I could find.

The results I have on 3 sets of shaders are:

Legalization:
  Set 1: 5.39 -> 5.01
  Set 2: 13.98 -> 8.38
  Set 3: 98.00 -> 96.26

Performance passes:
  Set 1: 6.90 -> 5.23
  Set 2: 10.11 -> 6.62
  Set 3: 253.69 -> 253.74

Size reduction passes:
  Set 1: 7.16 -> 7.25
  Set 2: 17.17 -> 16.81
  Set 3: 112.06 -> 107.71

Note that the third set's compile time is large because of the large number of basic blocks, not so much because of the number of instructions. That is why we don't see much gain there.
-
Victor Lomuller authored
-
Steven Perron authored
Adds a rule to fold OpVectorShuffle with constant inputs. Adds rules to fold OpCompositeExtract being fed by an OpVectorShuffle.
-
Alan Baker authored
* Removes merging of div with a div or mul for integers
* Updated tests
-
GregF authored
-
Alan Baker authored
* getFloatConstantKind() now handles OpConstantNull
* PerformOperation() now handles OpConstantNull for vectors
* Fixed some instances where we would attempt to merge a division by 0
* added tests
-
Arseny Kapoulkine authored
As per Vulkan spec, BuiltIn variables can't have Location or Component decorations. On some drivers, these can lead to driver crashing when compiling the shader pipeline; for example, NVidia/AMD desktop drivers: https://github.com/KhronosGroup/glslang/issues/1182. This change adds validation and tests to catch this.
-
Alan Baker authored
* Also mark function parameters as varying
* Conservatively mark assignment instructions as varying if any input is varying after attempting to fold
* Added a test to catch this case
-
David Neto authored
-
Alan Baker authored
* Handles OpConstantNull and vector types
* vector selects (except against a null) are converted to vector shuffles
* Added tests
-
David Neto authored
-
David Neto authored
-
David Neto authored
Some tokens are only showing up in the unified1 grammar. So enum string mappings have to be generated from that grammar, not the grammar from the (deprecated) include/spirv/1.2 grammar. Example: capabilities FragmentFullyCovered, Float16ImageAMD
-
Pierre Moreau authored
-
Steven Perron authored
When merging types, we do not remove other information related to the types. We simply leave it duplicated, and hope it is removed later. This is what happens with decorations. They are removed in the next phase of remove duplicates. However, for OpNames that is not the case. We end up with two different names for the same id, which does not make sense. The solution is to remove the names and decorations for the type being removed instead of rewriting them to refer to the other type. Note that it is possible that if the first type does not have a name, then the types will end up with no name. That is fine because the names should not have any semantic significance anyway. This was identified in issue #1372, but this does not fix that issue.
-
David Neto authored
This commit adds assembling, disassembling, and basic validation for two Google extensions to better support HLSL translation.
-
Alan Baker authored
* Added early returns to folding rules to prevent half attempts
* Added some tests
-
David Neto authored
The default target is SPIR-V 1.3.

For example, spirv-as will generate a SPIR-V 1.3 binary by default. Use command line option "--target-env spv1.0" if you want to make a SPIR-V 1.0 binary or validate against SPIR-V 1.0 rules.

Example:
    # Generate a SPIR-V 1.0 binary instead of SPIR-V 1.3
    spirv-as --target-env spv1.0 a.spvasm -o a.spv
    spirv-as --target-env vulkan1.0 a.spvasm -o a.spv

    # Validate as SPIR-V 1.0.
    spirv-val --target-env spv1.0 a.spv
    # Validate as Vulkan 1.0
    spirv-val --target-env vulkan1.0 a.spv
-
David Neto authored
-
Andrey Tuganov authored
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1375
Hardcoded float16 feature enabling if extension SPV_AMD_gpu_shader_half_float is present.
-
Alan Baker authored
* Added tests to catch the bug
-
David Neto authored
-
David Neto authored
-
Timo Aaltonen authored
-
Timo Aaltonen authored
-
Timo Aaltonen authored
cmake/SPIRV-Tools-shared.pc.in
0 → 100644