Commits on Source (37)
......@@ -9,3 +9,6 @@ b6905438514ae4de0b7f85c861e3d811ddaadda9
# This isn't worth the effort to backport, as it only affects build with
# asserts enabled, which hopefully won't happen in a stable branch.
937b9055698be0dfdb7d2e0673a989e2ecc05912
# this is reverted, so just don't apply
973181c06cca3fe232c3a435abde31f2fc1b81ef
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.3.1 Release Notes / 2019-12-18</h1>
<p>
Mesa 19.3.1 is a bug fix release which fixes bugs found since the 19.3.0 release.
</p>
<p>
Mesa 19.3.1 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.3.1 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
cd951db69c56a97ff0570a7ab2c0e39e6c5323f4cd8f4eb8274723e033beae59 mesa-19.3.1.tar.xz
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>i965/iris: assert when destroy GL context with active query</li>
<li>Visuals without alpha bits are not sRGB-capable</li>
<li>radv secure compile feature breaks compilation of RADV on armhf EABI (19.3-rc1)</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Bas Nieuwenhuizen (2):</p>
<li> amd/common: Fix tcCompatible degradation on Stoney.</li>
<li> amd/common: Always use addrlib for HTILE tc-compat.</li>
<p></p>
<p>Dylan Baker (3):</p>
<li> docs/19.3.0: Add SHA256 sums</li>
<li> cherry-ignore: update for the 19.3.1 cycle</li>
<li> docs: remove new_features.txt from stable branch</li>
<p></p>
<p>Gert Wollny (1):</p>
<li> virgl: Increase the shader transfer buffer by doubling the size</li>
<p></p>
<p>Iván Briano (1):</p>
<li> anv: Export filter_minmax support only when it&#x27;s really supported</li>
<p></p>
<p>Kenneth Graunke (1):</p>
<li> iris: Default to X-tiling for scanout buffers without modifiers</li>
<p></p>
<p>Lionel Landwerlin (2):</p>
<li> anv: fix fence underlying primitive checks</li>
<li> mesa: avoid triggering assert in implementation</li>
<p></p>
<p>Luis Mendes (1):</p>
<li> radv: fix radv secure compile feature breaks compilation on armhf EABI and aarch64</li>
<p></p>
<p>Tapani Pälli (2):</p>
<li> dri: add __DRI_IMAGE_FORMAT_SXRGB8</li>
<li> i965: expose MESA_FORMAT_B8G8R8X8_SRGB visual</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 19.3.2 Release Notes / 2020-01-09</h1>
<p>
Mesa 19.3.2 is a bug fix release which fixes bugs found since the 19.3.1 release.
</p>
<p>
Mesa 19.3.2 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is <strong>only</strong> available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
</p>
<p>
Mesa 19.3.2 implements the Vulkan 1.1 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
</p>
<h2>SHA256 checksum</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<ul>
<li>None</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li>Rise of the Tomb Raider benchmark crash on Dell XPS 7390 2-in-1 w/ Iris Plus Graphics (Ice Lake 8x8 GT2)</li>
<li>Raven Ridge (2400G): Resident Evil 2 crashes my machine</li>
<li>Rocket League ingame artifacts</li>
<li>[radv] SteamVR direct mode no longer works</li>
<li>[RADV] [Navi] LOD artifacting in Halo - The Master Chief Collection (Halo Reach)</li>
<li>[ANV] unused create parameters not properly ignored</li>
<li>Blocky corruption in The Surge 2</li>
<li>radeonsi: Floating point exception on R9 270 gpu for a set of traces</li>
<li>[CTS] dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.r32g32b32_* fail on GFX6-GFX8</li>
<li>Vulkan: Please consider adding another sample count to sampledImageIntegerSampleCounts</li>
<li>Navi10: Bitrate based encoding with VAAPI/RadeonSI unusable</li>
<li>[GFX10] Glitch rendering Custom Avatars in Beat Saber</li>
<li>intel/fs: Check for 16-bit immediates in fs_visitor::lower_mul_dword_inst is too strict</li>
</ul>
<h2>Changes</h2>
<ul>
<p>Andrii Simiklit (3):</p>
<li> glsl: fix an incorrect max_array_access after optimization of ssbo/ubo</li>
<li> glsl: fix a binding points assignment for ssbo/ubo arrays</li>
<li> glsl/nir: do not change an element index to have correct block name</li>
<p></p>
<p>Bas Nieuwenhuizen (7):</p>
<li> radv: Limit workgroup size to 1024.</li>
<li> radv: Expose all sample counts for integer formats as well.</li>
<li> amd/common: Handle alignment of 96-bit formats.</li>
<li> nir: Add clone/hash/serialize support for non-uniform tex instructions.</li>
<li> spirv: Fix glsl type assert in spir2nir.</li>
<li> radv: Only use the gfx mipmap level offset/pitch for linear textures.</li>
<li> radv: Emit a BATCH_BREAK when changing pixel shaders or CB_TARGET_MASK.</li>
<p></p>
<p>Caio Marcelo de Oliveira Filho (4):</p>
<li> intel/fs: Lower 64-bit MOVs after lower_load_payload()</li>
<li> intel/fs: Fix lowering of dword multiplication by 16-bit constant</li>
<li> intel/vec4: Fix lowering of multiplication by 16-bit constant</li>
<li> anv: Ignore some CreateInfo structs when rasterization is disabled</li>
<p></p>
<p>Christian Gmeiner (1):</p>
<li> etnaviv: update resource status after flushing</li>
<p></p>
<p>Dylan Baker (2):</p>
<li> dcos: add releanse notes for 19.3.1</li>
<li> cherry-ignore: update for 19.3.2</li>
<p></p>
<p>Eric Engestrom (4):</p>
<li> util/format: remove left-over util_format_description_table declaration</li>
<li> amd: fix empty-body issues</li>
<li> nine: fix empty-body-issues</li>
<li> mesa: avoid returning a value in a void function</li>
<p></p>
<p>Gert Wollny (1):</p>
<li> r600: Fix maximum line width</li>
<p></p>
<p>Jason Ekstrand (2):</p>
<li> anv: Properly advertise sampledImageIntegerSampleCounts</li>
<li> intel/nir: Add a memory barrier before barrier()</li>
<p></p>
<p>Lionel Landwerlin (2):</p>
<li> loader: fix close on uninitialized file descriptor value</li>
<li> anv: don&#x27;t close invalid syncfd semaphore</li>
<p></p>
<p>Marek Olšák (2):</p>
<li> winsys/radeon: initialize pte_fragment_size</li>
<li> radeonsi: disable SDMA on gfx8 to fix corruption on RX 580</li>
<p></p>
<p>Pierre-Eric Pelloux-Prayer (2):</p>
<li> radeon/vcn2: enable rate control for hevc encoding</li>
<li> radeonsi: check ctx-&gt;sdma_cs before using it</li>
<p></p>
<p>Samuel Pitoiset (2):</p>
<li> radv/gfx10: fix the out-of-bounds check for vertex descriptors</li>
<li> radv: return the correct pitch for linear mipmaps on GFX10</li>
<p></p>
<p>Timur Kristóf (1):</p>
<li> aco: Fix uniform i2i64.</li>
<p></p>
<p>Yevhenii Kolesnikov (2):</p>
<li> meta: Cleanup function for DrawTex</li>
<li> main: allow external textures for BindImageTexture</li>
<p></p>
<p></p>
</ul>
</div>
</body>
</html>
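Both release notes say that the Vulkan version a driver reports comes from the `apiVersion` field of `VkPhysicalDeviceProperties`. As a minimal sketch of how that packed value is read, the helpers below reproduce the bit layout used by the Vulkan 1.1 `VK_MAKE_VERSION` / `VK_VERSION_MAJOR` / `VK_VERSION_MINOR` / `VK_VERSION_PATCH` macros. The function names are hypothetical and no Vulkan headers are required. Obtaining the value from a real device is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a version the way VK_MAKE_VERSION does:
 * 10 bits major, 10 bits minor, 12 bits patch. */
uint32_t make_api_version(uint32_t major, uint32_t minor, uint32_t patch)
{
    return (major << 22) | (minor << 12) | patch;
}

/* Unpack helpers; names are illustrative, not part of any Vulkan header. */
uint32_t api_version_major(uint32_t version)
{
    return version >> 22;
}

uint32_t api_version_minor(uint32_t version)
{
    return (version >> 12) & 0x3FFu;
}

uint32_t api_version_patch(uint32_t version)
{
    return version & 0xFFFu;
}
```

An application would typically compare the decoded major and minor numbers against the version it needs, rather than assuming that every driver built from this release reports 1.1.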
......@@ -64,7 +64,7 @@
#define ADDR_DBG_BREAK() { __debugbreak(); }
#endif
#else
#define ADDR_DBG_BREAK()
#define ADDR_DBG_BREAK() do {} while(0)
#endif
////////////////////////////////////////////////////////////////////////////////////////////////////
......@@ -143,15 +143,15 @@
#define ADDRDPF 1 ? (void)0 : (void)
#define ADDR_PRNT(a)
#define ADDR_PRNT(a) do {} while(0)
#define ADDR_DBG_BREAK()
#define ADDR_DBG_BREAK() do {} while(0)
#define ADDR_INFO(cond, a)
#define ADDR_INFO(cond, a) do {} while(0)
#define ADDR_WARN(cond, a)
#define ADDR_WARN(cond, a) do {} while(0)
#define ADDR_EXIT(cond, a)
#define ADDR_EXIT(cond, a) do {} while(0)
#endif // DEBUG
////////////////////////////////////////////////////////////////////////////////////////////////////
......
......@@ -28,7 +28,7 @@
#include <memcheck.h>
#define VG(x) x
#else
#define VG(x)
#define VG(x) ((void)0)
#endif
#include <inttypes.h>
......
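Several hunks in this diff replace a macro that expands to nothing with one that expands to `do {} while(0)` (or `((void)0)` for `VG`). The sketch below illustrates the likely motivation, matching the "fix empty-body issues" commit titles: when an empty macro is the only statement in an `if` or `else` branch, the branch collapses to a bare `;`, which GCC reports through `-Wempty-body` (part of `-Wextra`). The macro and function names here are hypothetical stand-ins, not Mesa identifiers.

```c
#include <assert.h>

/* Hypothetical stand-ins for a release-build debug macro; not Mesa names. */
#define TRACE_EMPTY(msg)                 /* old style: expands to nothing */
#define TRACE_NOOP(msg) do {} while (0)  /* new style: one real statement */

/* Returns 1 for negative input, 0 otherwise. */
int is_negative(int value)
{
    int result = 0;

    if (value < 0)
        result = 1;
    else
        TRACE_NOOP("non-negative"); /* the else branch holds a statement */

    /* With TRACE_EMPTY the branch above would reduce to `else ;`.
     * That still compiles, but it is the pattern -Wempty-body flags. */
    return result;
}
```

The `do {} while (0)` form also keeps the trailing semicolon mandatory, so a call site behaves the same way in debug and release builds.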
......@@ -207,6 +207,17 @@ static int gfx6_compute_level(ADDR_HANDLE addrlib,
AddrSurfInfoIn->width = align(AddrSurfInfoIn->width, alignment);
}
/* addrlib assumes the bytes/pixel is a divisor of 64, which is not
* true for r32g32b32 formats. */
if (AddrSurfInfoIn->bpp == 96) {
assert(config->info.levels == 1);
assert(AddrSurfInfoIn->tileMode == ADDR_TM_LINEAR_ALIGNED);
/* The least common multiple of 64 bytes and 12 bytes/pixel is
* 192 bytes, or 16 pixels. */
AddrSurfInfoIn->width = align(AddrSurfInfoIn->width, 16);
}
if (config->is_3d)
AddrSurfInfoIn->numSlices = u_minify(config->info.depth, level);
else if (config->is_cube)
......@@ -1052,8 +1063,10 @@ static int gfx9_compute_miptree(ADDR_HANDLE addrlib,
surf->surf_alignment = out.baseAlign;
if (in->swizzleMode == ADDR_SW_LINEAR) {
for (unsigned i = 0; i < in->numMipLevels; i++)
for (unsigned i = 0; i < in->numMipLevels; i++) {
surf->u.gfx9.offset[i] = mip_info[i].offset;
surf->u.gfx9.pitch[i] = mip_info[i].pitch;
}
}
if (in->flags.depth) {
......
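The 96-bit hunk in `gfx6_compute_level` aligns the width to 16 pixels because the least common multiple of 64 bytes and 12 bytes per pixel is 192 bytes. The arithmetic can be checked with a small sketch. The helper names are hypothetical, and `align_up` is a generic rounding helper rather than Mesa's `align`.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor by Euclid's algorithm. */
static uint32_t gcd_u32(uint32_t a, uint32_t b)
{
    while (b != 0) {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Smallest pixel count whose size in bytes is a multiple of align_bytes. */
uint32_t pixel_alignment(uint32_t align_bytes, uint32_t bytes_per_pixel)
{
    uint32_t lcm_bytes =
        align_bytes / gcd_u32(align_bytes, bytes_per_pixel) * bytes_per_pixel;
    return lcm_bytes / bytes_per_pixel;
}

/* Round width up to a multiple of alignment (any positive alignment). */
uint32_t align_up(uint32_t width, uint32_t alignment)
{
    return (width + alignment - 1) / alignment * alignment;
}
```

For 64-byte alignment and 12 bytes per pixel this gives 16 pixels, so a row of 100 pixels is padded to 112.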
......@@ -154,6 +154,8 @@ struct gfx9_surf_layout {
uint64_t surf_slice_size;
/* Mipmap level offset within the slice in bytes. Only valid for LINEAR. */
uint32_t offset[RADEON_SURF_MAX_LEVELS];
/* Mipmap level pitch in elements. Only valid for LINEAR. */
uint32_t pitch[RADEON_SURF_MAX_LEVELS];
uint64_t stencil_offset; /* separate stencil */
......
......@@ -1977,7 +1977,7 @@ void visit_alu_instr(isel_context *ctx, nir_alu_instr *instr)
case nir_op_i2i64: {
Temp src = get_alu_src(ctx, instr->src[0]);
if (src.regClass() == s1) {
Temp high = bld.sopc(aco_opcode::s_ashr_i32, bld.def(s1, scc), src, Operand(31u));
Temp high = bld.sop2(aco_opcode::s_ashr_i32, bld.def(s1), bld.def(s1, scc), src, Operand(31u));
bld.pseudo(aco_opcode::p_create_vector, Definition(dst), src, high);
} else if (src.regClass() == v1) {
Temp high = bld.vop2(aco_opcode::v_ashrrev_i32, bld.def(v1), Operand(31u), src);
......
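In the scalar `nir_op_i2i64` path above, the high dword is produced by an arithmetic shift right by 31 and then combined with the source into a 64-bit vector. The hunk changes only how the instruction is built: the new call passes a result definition in addition to the SCC definition. The sketch below models only the arithmetic, not ACO's builder API; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sign-extend a 32-bit integer by building the result from two dwords. */
int64_t sign_extend_i32(int32_t src)
{
    uint32_t low = (uint32_t)src;
    /* Same value as an arithmetic shift right by 31:
     * all ones when src is negative, zero otherwise. */
    uint32_t high = (src < 0) ? 0xFFFFFFFFu : 0u;
    uint64_t bits = ((uint64_t)high << 32) | low;

    /* Reinterpret the bit pattern without relying on
     * implementation-defined integer conversion. */
    int64_t result;
    memcpy(&result, &bits, sizeof result);
    return result;
}
```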
......@@ -1162,7 +1162,7 @@ bool validate_ra(Program* program, const struct radv_nir_compiler_options *optio
#ifndef NDEBUG
void perfwarn(bool cond, const char *msg, Instruction *instr=NULL);
#else
#define perfwarn(program, cond, msg, ...)
#define perfwarn(program, cond, msg, ...) do {} while(0)
#endif
void aco_print_instr(Instruction *instr, FILE *output);
......
......@@ -1138,6 +1138,33 @@ radv_emit_rbplus_state(struct radv_cmd_buffer *cmd_buffer)
cmd_buffer->state.context_roll_without_scissor_emitted = true;
}
static void
radv_emit_batch_break_on_new_ps(struct radv_cmd_buffer *cmd_buffer)
{
if (!cmd_buffer->device->pbb_allowed)
return;
struct radv_binning_settings settings =
radv_get_binning_settings(cmd_buffer->device->physical_device);
bool break_for_new_ps =
(!cmd_buffer->state.emitted_pipeline ||
cmd_buffer->state.emitted_pipeline->shaders[MESA_SHADER_FRAGMENT] !=
cmd_buffer->state.pipeline->shaders[MESA_SHADER_FRAGMENT]) &&
(settings.context_states_per_bin > 1 ||
settings.persistent_states_per_bin > 1);
bool break_for_new_cb_target_mask =
(!cmd_buffer->state.emitted_pipeline ||
cmd_buffer->state.emitted_pipeline->graphics.cb_target_mask !=
cmd_buffer->state.pipeline->graphics.cb_target_mask) &&
settings.context_states_per_bin > 1;
if (!break_for_new_ps && !break_for_new_cb_target_mask)
return;
radeon_emit(cmd_buffer->cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
radeon_emit(cmd_buffer->cs, EVENT_TYPE(V_028A90_BREAK_BATCH) | EVENT_INDEX(0));
}
static void
radv_emit_graphics_pipeline(struct radv_cmd_buffer *cmd_buffer)
{
......@@ -1170,6 +1197,8 @@ radv_emit_graphics_pipeline(struct radv_cmd_buffer *cmd_buffer)
cmd_buffer->state.context_roll_without_scissor_emitted = true;
}
radv_emit_batch_break_on_new_ps(cmd_buffer);
for (unsigned i = 0; i < MESA_SHADER_COMPUTE; i++) {
if (!pipeline->shaders[i])
continue;
......@@ -2428,8 +2457,12 @@ radv_flush_vertex_descriptors(struct radv_cmd_buffer *cmd_buffer,
S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W);
if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX10) {
/* OOB_SELECT chooses the out-of-bounds check:
* - 1: index >= NUM_RECORDS (Structured)
* - 3: offset >= NUM_RECORDS (Raw)
*/
desc[3] |= S_008F0C_FORMAT(V_008F0C_IMG_FORMAT_32_UINT) |
S_008F0C_OOB_SELECT(1) |
S_008F0C_OOB_SELECT(stride ? 1 : 3) |
S_008F0C_RESOURCE_LEVEL(1);
} else {
desc[3] |= S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_UINT) |
......
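The GFX10 vertex descriptor hunk selects the out-of-bounds check from the stride: a non-zero stride uses mode 1 (structured, `index >= NUM_RECORDS`) and a zero stride uses mode 3 (raw, `offset >= NUM_RECORDS`). The sketch below models only what the comment in the hunk states. The enum and function names are hypothetical, and the actual hardware behavior, including the units of `NUM_RECORDS`, is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the comment in the hunk; the names are hypothetical. */
enum oob_select {
    OOB_STRUCTURED = 1, /* compare index against NUM_RECORDS */
    OOB_RAW = 3         /* compare offset against NUM_RECORDS */
};

/* Mirrors the `stride ? 1 : 3` selection in the descriptor setup. */
enum oob_select choose_oob_select(uint32_t stride)
{
    return stride ? OOB_STRUCTURED : OOB_RAW;
}

/* Applies the check named by the selected mode. */
bool is_out_of_bounds(enum oob_select mode, uint32_t index,
                      uint32_t offset, uint32_t num_records)
{
    if (mode == OOB_STRUCTURED)
        return index >= num_records;
    return offset >= num_records;
}
```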
......@@ -1178,11 +1178,11 @@ void radv_GetPhysicalDeviceProperties(
.maxFragmentCombinedOutputResources = 8,
.maxComputeSharedMemorySize = 32768,
.maxComputeWorkGroupCount = { 65535, 65535, 65535 },
.maxComputeWorkGroupInvocations = 2048,
.maxComputeWorkGroupInvocations = 1024,
.maxComputeWorkGroupSize = {
2048,
2048,
2048
1024,
1024,
1024
},
.subPixelPrecisionBits = 8,
.subTexelPrecisionBits = 8,
......@@ -1215,7 +1215,7 @@ void radv_GetPhysicalDeviceProperties(
.framebufferNoAttachmentsSampleCounts = sample_counts,
.maxColorAttachments = MAX_RTS,
.sampledImageColorSampleCounts = sample_counts,
.sampledImageIntegerSampleCounts = VK_SAMPLE_COUNT_1_BIT,
.sampledImageIntegerSampleCounts = sample_counts,
.sampledImageDepthSampleCounts = sample_counts,
.sampledImageStencilSampleCounts = sample_counts,
.storageImageSampleCounts = pdevice->rad_info.chip_class >= GFX8 ? sample_counts : VK_SAMPLE_COUNT_1_BIT,
......
......@@ -1874,7 +1874,9 @@ void radv_GetImageSubresourceLayout(
struct radeon_surf *surface = &plane->surface;
if (device->physical_device->rad_info.chip_class >= GFX9) {
pLayout->offset = plane->offset + surface->u.gfx9.offset[level] + surface->u.gfx9.surf_slice_size * layer;
uint64_t level_offset = surface->is_linear ? surface->u.gfx9.offset[level] : 0;
pLayout->offset = plane->offset + level_offset + surface->u.gfx9.surf_slice_size * layer;
if (image->vk_format == VK_FORMAT_R32G32B32_UINT ||
image->vk_format == VK_FORMAT_R32G32B32_SINT ||
image->vk_format == VK_FORMAT_R32G32B32_SFLOAT) {
......@@ -1884,8 +1886,10 @@ void radv_GetImageSubresourceLayout(
*/
pLayout->rowPitch = surface->u.gfx9.surf_pitch * surface->bpe / 3;
} else {
uint32_t pitch = surface->is_linear ? surface->u.gfx9.pitch[level] : surface->u.gfx9.surf_pitch;
assert(util_is_power_of_two_nonzero(surface->bpe));
pLayout->rowPitch = surface->u.gfx9.surf_pitch * surface->bpe;
pLayout->rowPitch = pitch * surface->bpe;
}
pLayout->arrayPitch = surface->u.gfx9.surf_slice_size;
......
......@@ -3330,6 +3330,28 @@ radv_pipeline_generate_disabled_binning_state(struct radeon_cmdbuf *ctx_cs,
pipeline->graphics.binning.db_dfsm_control = db_dfsm_control;
}
struct radv_binning_settings
radv_get_binning_settings(const struct radv_physical_device *pdev)
{
struct radv_binning_settings settings;
if (pdev->rad_info.has_dedicated_vram) {
settings.context_states_per_bin = 1;
settings.persistent_states_per_bin = 1;
settings.fpovs_per_batch = 63;
} else {
/* The context states are affected by the scissor bug. */
settings.context_states_per_bin = 6;
/* 32 causes hangs for RAVEN. */
settings.persistent_states_per_bin = 16;
settings.fpovs_per_batch = 63;
}
if (pdev->rad_info.has_gfx9_scissor_bug)
settings.context_states_per_bin = 1;
return settings;
}
static void
radv_pipeline_generate_binning_state(struct radeon_cmdbuf *ctx_cs,
struct radv_pipeline *pipeline,
......@@ -3348,21 +3370,8 @@ radv_pipeline_generate_binning_state(struct radeon_cmdbuf *ctx_cs,
unreachable("Unhandled generation for binning bin size calculation");
if (pipeline->device->pbb_allowed && bin_size.width && bin_size.height) {
unsigned context_states_per_bin; /* allowed range: [1, 6] */
unsigned persistent_states_per_bin; /* allowed range: [1, 32] */
unsigned fpovs_per_batch; /* allowed range: [0, 255], 0 = unlimited */
if (pipeline->device->physical_device->rad_info.has_dedicated_vram) {
context_states_per_bin = 1;
persistent_states_per_bin = 1;
fpovs_per_batch = 63;
} else {
/* The context states are affected by the scissor bug. */
context_states_per_bin = pipeline->device->physical_device->rad_info.has_gfx9_scissor_bug ? 1 : 6;
/* 32 causes hangs for RAVEN. */
persistent_states_per_bin = 16;
fpovs_per_batch = 63;
}
struct radv_binning_settings settings =
radv_get_binning_settings(pipeline->device->physical_device);
bool disable_start_of_prim = true;
uint32_t db_dfsm_control = S_028060_PUNCHOUT_MODE(V_028060_FORCE_OFF);
......@@ -3383,10 +3392,10 @@ radv_pipeline_generate_binning_state(struct radeon_cmdbuf *ctx_cs,
S_028C44_BIN_SIZE_Y(bin_size.height == 16) |
S_028C44_BIN_SIZE_X_EXTEND(util_logbase2(MAX2(bin_size.width, 32)) - 5) |
S_028C44_BIN_SIZE_Y_EXTEND(util_logbase2(MAX2(bin_size.height, 32)) - 5) |
S_028C44_CONTEXT_STATES_PER_BIN(context_states_per_bin - 1) |
S_028C44_PERSISTENT_STATES_PER_BIN(persistent_states_per_bin - 1) |
S_028C44_CONTEXT_STATES_PER_BIN(settings.context_states_per_bin - 1) |
S_028C44_PERSISTENT_STATES_PER_BIN(settings.persistent_states_per_bin - 1) |
S_028C44_DISABLE_START_OF_PRIM(disable_start_of_prim) |
S_028C44_FPOVS_PER_BATCH(fpovs_per_batch) |
S_028C44_FPOVS_PER_BATCH(settings.fpovs_per_batch) |
S_028C44_OPTIMAL_BIN_SELECTION(1);
pipeline->graphics.binning.pa_sc_binner_cntl_0 = pa_sc_binner_cntl_0;
......
......@@ -266,7 +266,7 @@ void radv_logi_v(const char *format, va_list va);
fprintf(stderr, "%s:%d ASSERT: %s\n", __FILE__, __LINE__, #x); \
})
#else
#define radv_assert(x)
#define radv_assert(x) do {} while(0)
#endif
#define stub_return(v) \
......@@ -1676,6 +1676,15 @@ radv_graphics_pipeline_create(VkDevice device,
const VkAllocationCallbacks *alloc,
VkPipeline *pPipeline);
struct radv_binning_settings {
unsigned context_states_per_bin; /* allowed range: [1, 6] */
unsigned persistent_states_per_bin; /* allowed range: [1, 32] */
unsigned fpovs_per_batch; /* allowed range: [0, 255], 0 = unlimited */
};
struct radv_binning_settings
radv_get_binning_settings(const struct radv_physical_device *pdev);
struct vk_format_description;
uint32_t radv_translate_buffer_dataformat(const struct vk_format_description *desc,
int first_non_void);
......
......@@ -49,7 +49,6 @@ get_block_array_index(nir_builder *b, nir_deref_instr *deref,
if (nir_src_is_const(deref->arr.index)) {
unsigned arr_index = nir_src_as_uint(deref->arr.index);
arr_index = MIN2(arr_index, arr_size - 1);
/* We're walking the deref from the tail so prepend the array index */
block_name = ralloc_asprintf(b->shader, "[%u]%s", arr_index,
......
......@@ -103,6 +103,8 @@ process_arrays(void *mem_ctx, ir_dereference_array *ir,
if (*ub_array_ptr == NULL) {
*ub_array_ptr = rzalloc(mem_ctx, struct uniform_block_array_elements);
(*ub_array_ptr)->ir = ir;
(*ub_array_ptr)->aoa_size =
ir->array->type->arrays_of_arrays_size();
}
struct uniform_block_array_elements *ub_array = *ub_array_ptr;
......@@ -199,6 +201,7 @@ link_uniform_block_active_visitor::visit(ir_variable *var)
(*ub_array)->array_elements,
unsigned,
(*ub_array)->num_array_elements);
(*ub_array)->aoa_size = type->arrays_of_arrays_size();
for (unsigned i = 0; i < (*ub_array)->num_array_elements; i++) {
(*ub_array)->array_elements[i] = i;
......
......@@ -30,6 +30,15 @@
struct uniform_block_array_elements {
unsigned *array_elements;
unsigned num_array_elements;
/**
* Size of the array before array-trimming optimizations.
*
* Locations are only assigned to active array elements, but the location
* values are calculated as if all elements are active. The total number
* of elements in an array including the elements in arrays of arrays before
* inactive elements are removed is needed to perform that calculation.
*/
unsigned aoa_size;
ir_dereference_array *ir;
......
......@@ -222,7 +222,7 @@ static void process_block_array_leaf(const char *name, gl_uniform_block *blocks,
gl_uniform_buffer_variable *variables,
const struct link_uniform_block_active *const b,
unsigned *block_index,
unsigned *binding_offset,
unsigned binding_offset,
unsigned linearized_index,
struct gl_context *ctx,
struct gl_shader_program *prog);
......@@ -237,26 +237,28 @@ process_block_array(struct uniform_block_array_elements *ub_array, char **name,
size_t name_length, gl_uniform_block *blocks,
ubo_visitor *parcel, gl_uniform_buffer_variable *variables,
const struct link_uniform_block_active *const b,
unsigned *block_index, unsigned *binding_offset,
unsigned *block_index, unsigned binding_offset,
struct gl_context *ctx, struct gl_shader_program *prog,
unsigned first_index)
{
for (unsigned j = 0; j < ub_array->num_array_elements; j++) {
size_t new_length = name_length;
unsigned int element_idx = ub_array->array_elements[j];
/* Append the subscript to the current variable name */
ralloc_asprintf_rewrite_tail(name, &new_length, "[%u]",
ub_array->array_elements[j]);
ralloc_asprintf_rewrite_tail(name, &new_length, "[%u]", element_idx);
if (ub_array->array) {
unsigned binding_stride = binding_offset + (element_idx *
ub_array->array->aoa_size);
process_block_array(ub_array->array, name, new_length, blocks,
parcel, variables, b, block_index,
binding_offset, ctx, prog, first_index);
binding_stride, ctx, prog, first_index);
} else {
process_block_array_leaf(*name, blocks,
parcel, variables, b, block_index,
binding_offset, *block_index - first_index,
ctx, prog);
binding_offset + element_idx,
*block_index - first_index, ctx, prog);
}
}
}
......@@ -266,7 +268,7 @@ process_block_array_leaf(const char *name,
gl_uniform_block *blocks,
ubo_visitor *parcel, gl_uniform_buffer_variable *variables,
const struct link_uniform_block_active *const b,
unsigned *block_index, unsigned *binding_offset,
unsigned *block_index, unsigned binding_offset,
unsigned linearized_index,
struct gl_context *ctx, struct gl_shader_program *prog)
{
......@@ -283,7 +285,7 @@ process_block_array_leaf(const char *name,
* block binding and each subsequent element takes the next consecutive
* uniform block binding point.
*/
blocks[i].Binding = (b->has_binding) ? b->binding + *binding_offset : 0;
blocks[i].Binding = (b->has_binding) ? b->binding + binding_offset : 0;
blocks[i].UniformBufferSize = 0;
blocks[i]._Packing = glsl_interface_packing(type->interface_packing);
......@@ -307,7 +309,6 @@ process_block_array_leaf(const char *name,
(unsigned)(ptrdiff_t)(&variables[parcel->index] - blocks[i].Uniforms);
*block_index = *block_index + 1;
*binding_offset = *binding_offset + 1;
}
/* This function resizes the array types of the block so that later we can use
......@@ -370,7 +371,6 @@ create_buffer_blocks(void *mem_ctx, struct gl_context *ctx,
if ((create_ubo_blocks && !b->is_shader_storage) ||
(!create_ubo_blocks && b->is_shader_storage)) {
unsigned binding_offset = 0;
if (b->array != NULL) {
char *name = ralloc_strdup(NULL,
block_type->without_array()->name);
......@@ -378,12 +378,12 @@ create_buffer_blocks(void *mem_ctx, struct gl_context *ctx,
assert(b->has_instance_name);
process_block_array(b->array, &name, name_length, blocks, &parcel,
variables, b, &i, &binding_offset, ctx, prog,
variables, b, &i, 0, ctx, prog,
i);
ralloc_free(name);
} else {
process_block_array_leaf(block_type->name, blocks, &parcel,
variables, b, &i, &binding_offset,
variables, b, &i, 0,
0, ctx, prog);
}
}
......@@ -440,6 +440,7 @@ link_uniform_blocks(void *mem_ctx,
GLSL_INTERFACE_PACKING_PACKED)) {
b->type = resize_block_array(b->type, b->array);
b->var->type = b->type;
b->var->data.max_array_access = b->type->length - 1;
}
block_size.num_active_uniforms = 0;
......
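The `link_uniform_blocks` hunks stop incrementing a shared `binding_offset` counter once per visited leaf. Instead they derive each block's binding from its declared array indices, using `aoa_size` as the stride of an outer element. The sketch below assumes that `aoa_size` at each level equals the number of leaf blocks inside one element of the enclosing array. It is a standalone model with hypothetical names, not the linker code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: dims[0] is the outermost array size and indices[]
 * holds the declared subscripts of one block instance. */
unsigned block_binding(unsigned base_binding, const unsigned *dims,
                       const unsigned *indices, size_t rank)
{
    unsigned offset = 0;

    for (size_t level = 0; level < rank; level++) {
        /* Number of leaf blocks spanned by one element at this level. */
        unsigned stride = 1;
        for (size_t inner = level + 1; inner < rank; inner++)
            stride *= dims[inner];

        offset += indices[level] * stride;
    }
    return base_binding + offset;
}
```

Because the result depends only on the declared subscripts, an element keeps the same binding even when array trimming removes inactive elements before it. A per-leaf counter cannot guarantee that.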
......@@ -422,6 +422,9 @@ clone_tex(clone_state *state, const nir_tex_instr *tex)
ntex->texture_array_size = tex->texture_array_size;
ntex->sampler_index = tex->sampler_index;
ntex->texture_non_uniform = tex->texture_non_uniform;
ntex->sampler_non_uniform = tex->sampler_non_uniform;
return ntex;
}
......