i965/vec4/dce: Don't narrow the write mask if the flags are used

In an instruction sequence like

    cmp(8).ge.f0.0 vgrf17:D, vgrf2.xxxx:D, vgrf9.xxxx:D
    (+f0.0) sel(8) vgrf1:UD, vgrf8.xyzw:UD, vgrf1.xyzw:UD

The other fields of vgrf17 may be unused, but the CMP still needs to
generate the other flag bits.

To my surprise, nothing in shader-db or any test suite appears to hit
this. However, I have a change to brw_vec4_cmod_propagation that
creates cases where this can happen. This fix prevents a couple dozen
regressions in that patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 5df88c20 ("i965/vec4: Rewrite dead code elimination to use live in/out.")
(cherry picked from commit 440c0513)
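
The rule described in the message can be sketched as a toy model. This is not the code from the patch: `ToyInst`, `narrowed_writemask`, and the liveness parameters are hypothetical names invented for illustration, and the real pass works on Mesa's own instruction and liveness structures.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only; every name here is hypothetical, not a Mesa type.
struct ToyInst {
    uint8_t writemask;   // bit i set => destination channel i (x, y, z, w) is written
    bool writes_flag;    // true for e.g. a CMP with a conditional modifier
};

// Returns the write mask a dead-code pass could leave on the instruction.
//   live_channels: destination channels that are read later
//   flag_live:     whether the flag bits written by inst are read later
inline uint8_t narrowed_writemask(const ToyInst &inst,
                                  uint8_t live_channels,
                                  bool flag_live)
{
    // If a later instruction consumes the flags, every enabled channel
    // still has to execute so that its flag bit is produced, even when
    // the destination channel itself is dead.
    if (inst.writes_flag && flag_live)
        return inst.writemask;

    // Otherwise only the channels that are actually read need to stay.
    return inst.writemask & live_channels;
}
```

In the sequence from the message, the CMP corresponds to `writes_flag == true` with `flag_live == true` (the predicated SEL reads f0.0), so its mask is left alone even though no channel of vgrf17 is read.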