Skip to content

Comments

Normalize shader codegen to produce identical source across materials#2776

Open
ashwinbhat wants to merge 4 commits intoAcademySoftwareFoundation:mainfrom
autodesk-forks:bhata/equivalent_shader
Open

Normalize shader codegen to produce identical source across materials#2776
ashwinbhat wants to merge 4 commits intoAcademySoftwareFoundation:mainfrom
autodesk-forks:bhata/equivalent_shader

Conversation

@ashwinbhat
Copy link
Contributor

Ensure materials using the same shading model generate structurally identical shader code, differing only in uniform values.

  • Use short input names (e.g. base_color) instead of node-prefixed names (e.g. SR_brass_base_color) for shader/closure node uniforms in finalize().
  • Replace std::deque with std::set using a name-based comparator in topologicalSort() to guarantee deterministic node ordering at each topological level.
  • Assign generic output variable names (material_out, surfaceshader_out) in setVariableNames() instead of deriving them from per-material node names.

Attaching shader diffs of standard_surface_wood_tiled_ps.glsl vs standard_surface_brass_tiled_ps.glsl (before and after)
shaderdiff.txt
shaderdiff_after.txt

Attaching shader diffs of standard_surface_defauilt_ps.glsl vs standard_surface_plastic_ps.glsl (before and after)
shaderdiff_b4.txt
shaderdiff_after.txt

…instances

Ensure materials using the same shading model generate structurally identical shader code, differing only in uniform values.
- Use short input names (e.g. base_color) instead of node-prefixed names (e.g. SR_brass_base_color) for shader/closure node uniforms in finalize().
- Replace std::deque with std::set using a name-based comparator in topologicalSort() to guarantee deterministic node ordering at each topological level.
- Assign generic output variable names (material_out, surfaceshader_out) in setVariableNames() instead of deriving them from per-material node names.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request aims to normalize shader code generation so that materials using the same shading model produce structurally identical shader code, differing only in uniform values. This improves shader caching and reduces compilation overhead when switching between materials with the same underlying shader structure.

Changes:

  • Modified input naming in finalize() to use short names (e.g., "base_color") for shader/closure node uniforms instead of node-prefixed names (e.g., "SR_brass_base_color")
  • Replaced std::deque with std::set using a name-based comparator in topologicalSort() to ensure deterministic node ordering at each topological level
  • Changed output variable naming in setVariableNames() to use generic names ("material_out", "surfaceshader_out") for material and surfaceshader outputs
Comments suppressed due to low confidence (1)

source/MaterialXGenShader/ShaderGraph.cpp:954

  • When multiple shader or closure nodes have the same input name, only the first node's input properties are used to initialize the shared inputSocket. The subsequent nodes with the same input name will reuse the existing socket but their value, path, unit, and colorSpace properties will not be applied to the socket (these are only set in the if block on lines 943-951). This could lead to incorrect behavior if different shader/closure nodes have different values, paths, units, or color spaces for inputs with the same name.

Either these properties should be validated to be identical across all shader/closure nodes with the same input name (and an error thrown if they differ), or the naming strategy should be reconsidered to avoid collisions in cases where the properties actually differ.

                        const bool isShaderNode = node->hasClassification(ShaderNode::Classification::SHADER) ||
                                                  node->hasClassification(ShaderNode::Classification::CLOSURE);
                        const string interfaceName = isShaderNode
                            ? input->getName()
                            : input->getFullName();

                        ShaderGraphInputSocket* inputSocket = getInputSocket(interfaceName);
                        if (!inputSocket)
                        {
                            inputSocket = addInputSocket(interfaceName, input->getType());
                            inputSocket->setPath(input->getPath());
                            inputSocket->setValue(input->getValue());
                            inputSocket->setUnit(input->getUnit());
                            inputSocket->setColorSpace(input->getColorSpace());
                            if (input->isUniform())
                            {
                                inputSocket->setUniform();
                            }
                        }
                        inputSocket->makeConnection(input);
                        inputSocket->setMetadata(input->getMetadata());

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@ashwinbhat ashwinbhat marked this pull request as draft February 14, 2026 03:11
ashwinbhat and others added 2 commits February 19, 2026 14:50
@ashwinbhat ashwinbhat marked this pull request as ready for review February 20, 2026 00:29
Make replaceTokens virtual to allow the WGSL override.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@@ -1123,9 +1136,19 @@ void ShaderGraph::topologicalSort()
//
// Running time: O(numNodes + numEdges).
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment "Running time: O(numNodes + numEdges)" is no longer accurate after changing from std::deque to std::set. The actual running time is now O((numNodes + numEdges) * log(numNodes)) due to the logarithmic insert/erase operations on the std::set. Consider updating this comment to reflect the actual complexity.

Suggested change
// Running time: O(numNodes + numEdges).
// Running time: O((numNodes + numEdges) * log(numNodes)) due to std::set usage.

Copilot uses AI. Check for mistakes.
else if (input->getType() == Type::BOOLEAN)
{
const string result = getUpstreamResult(input, context);
emitString("bool(" + result + ")", stage);
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The boolean handling in emitInput wraps all boolean inputs with bool() cast, including those that might already be boolean expressions. If getUpstreamResult returns something like "bool(someValue)", this will result in "bool(bool(someValue))", which is redundant but should still be valid WGSL. Consider checking if the result is already a boolean cast to avoid redundancy, though this is a minor stylistic issue.

Suggested change
emitString("bool(" + result + ")", stage);
// Avoid redundant casts such as bool(bool(expr)) if the upstream
// result is already a bool(...) expression.
if (result.size() >= 6 && result.compare(0, 5, "bool(") == 0 && result.back() == ')')
{
emitString(result, stage);
}
else
{
emitString("bool(" + result + ")", stage);
}

Copilot uses AI. Check for mistakes.
Comment on lines +83 to +97
// but skip declaration lines (pattern: " name;").
// Add more entries here if additional bool uniforms are introduced.
const vector<std::pair<string, string>> boolUniforms = {
{ HW::REFRACTION_TWO_SIDED, "bool(" + HW::REFRACTION_TWO_SIDED + ")" },
};
string code = stage.getSourceCode();
for (const auto& entry : boolUniforms)
{
const string& plainName = entry.first;
const string& castName = entry.second;
for (size_t pos = 0; (pos = code.find(plainName, pos)) != string::npos; )
{
bool isDecl = (pos > 0 && code[pos - 1] == ' ' &&
pos + plainName.size() < code.size() &&
code[pos + plainName.size()] == ';');
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The declaration detection logic may have false positives. The pattern checks if a uniform name is preceded by a space and followed by a semicolon, but this doesn't account for other valid declaration patterns. For example, a declaration might have multiple spaces, tabs, or newlines before the semicolon. Consider checking against the actual uniform declaration syntax or using a more robust pattern that accounts for the full declaration line structure (e.g., checking for "u32" or "i32" type keyword before the variable name).

Suggested change
// but skip declaration lines (pattern: " name;").
// Add more entries here if additional bool uniforms are introduced.
const vector<std::pair<string, string>> boolUniforms = {
{ HW::REFRACTION_TWO_SIDED, "bool(" + HW::REFRACTION_TWO_SIDED + ")" },
};
string code = stage.getSourceCode();
for (const auto& entry : boolUniforms)
{
const string& plainName = entry.first;
const string& castName = entry.second;
for (size_t pos = 0; (pos = code.find(plainName, pos)) != string::npos; )
{
bool isDecl = (pos > 0 && code[pos - 1] == ' ' &&
pos + plainName.size() < code.size() &&
code[pos + plainName.size()] == ';');
// but skip declaration lines.
// Add more entries here if additional bool uniforms are introduced.
const vector<std::pair<string, string>> boolUniforms = {
{ HW::REFRACTION_TWO_SIDED, "bool(" + HW::REFRACTION_TWO_SIDED + ")" },
};
string code = stage.getSourceCode();
// Helper to determine whether the occurrence of plainName at namePos is part of
// a declaration like: "<type> <plainName> ;" (allowing arbitrary whitespace).
auto isDeclarationOccurrence = [&code](const string& plainName, size_t namePos) -> bool
{
if (namePos == string::npos || namePos >= code.size())
{
return false;
}
const size_t nameEnd = namePos + plainName.size();
if (nameEnd > code.size())
{
return false;
}
// Look forward: skip whitespace, then require a semicolon.
size_t forward = nameEnd;
while (forward < code.size() &&
(code[forward] == ' ' || code[forward] == '\t' ||
code[forward] == '\n' || code[forward] == '\r'))
{
++forward;
}
if (forward >= code.size() || code[forward] != ';')
{
return false;
}
// Look backward: skip whitespace, then read the previous identifier token.
if (namePos == 0)
{
return false;
}
size_t backward = namePos;
while (backward > 0 &&
(code[backward - 1] == ' ' || code[backward - 1] == '\t' ||
code[backward - 1] == '\n' || code[backward - 1] == '\r'))
{
--backward;
}
if (backward == 0)
{
return false;
}
auto isIdentChar = [](char c) -> bool
{
return (c >= 'a' && c <= 'z') ||
(c >= 'A' && c <= 'Z') ||
(c >= '0' && c <= '9') ||
c == '_';
};
size_t identEnd = backward;
size_t identStart = identEnd;
while (identStart > 0 && isIdentChar(code[identStart - 1]))
{
--identStart;
}
if (identStart == identEnd)
{
return false;
}
const string typeToken = code.substr(identStart, identEnd - identStart);
// In WGSL, these bool-as-int uniforms are typically declared as u32 or i32.
return (typeToken == "u32" || typeToken == "i32");
};
for (const auto& entry : boolUniforms)
{
const string& plainName = entry.first;
const string& castName = entry.second;
for (size_t pos = 0; (pos = code.find(plainName, pos)) != string::npos; )
{
bool isDecl = isDeclarationOccurrence(plainName, pos);

Copilot uses AI. Check for mistakes.
Comment on lines +93 to +107
for (size_t pos = 0; (pos = code.find(plainName, pos)) != string::npos; )
{
bool isDecl = (pos > 0 && code[pos - 1] == ' ' &&
pos + plainName.size() < code.size() &&
code[pos + plainName.size()] == ';');
if (isDecl)
{
pos += plainName.size();
}
else
{
code.replace(pos, plainName.size(), castName);
pos += castName.size();
}
}
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The string replacement in replaceTokens could result in incorrect behavior if the uniform name appears as a substring of another identifier. For example, if HW::REFRACTION_TWO_SIDED is "u_refractionTwoSided" and the code contains "my_u_refractionTwoSided", the replacement would incorrectly wrap part of "my_u_refractionTwoSided" with bool(). Consider using word boundary checks or ensuring the match is a complete token rather than a substring.

Copilot uses AI. Check for mistakes.
Comment on lines +1228 to +1242
string variable;
// Use generic names for material and surfaceshader outputs to ensure consistency
// The identifier map will handle uniqueness if there are multiple nodes of the same type
if (output->getType() == Type::MATERIAL)
{
variable = "material_out";
}
else if (output->getType() == Type::SURFACESHADER)
{
variable = "surfaceshader_out";
}
else
{
variable = output->getFullName();
}
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using generic names like "material_out" and "surfaceshader_out" for all material and surfaceshader outputs could lead to confusing variable names if there are multiple such nodes in a graph. While the comment states "The identifier map will handle uniqueness," this means the actual variable names will become "material_out1", "material_out2", etc., which may not be intuitive for debugging. Consider whether this tradeoff between normalization and debuggability is acceptable for the use case.

Suggested change
string variable;
// Use generic names for material and surfaceshader outputs to ensure consistency
// The identifier map will handle uniqueness if there are multiple nodes of the same type
if (output->getType() == Type::MATERIAL)
{
variable = "material_out";
}
else if (output->getType() == Type::SURFACESHADER)
{
variable = "surfaceshader_out";
}
else
{
variable = output->getFullName();
}
string variable = output->getFullName();

Copilot uses AI. Check for mistakes.
Comment on lines +76 to +84
// Let the base handle all interface and code substitution with a safe map
// (T_REFRACTION_TWO_SIDED -> u_refractionTwoSided, no bool wrap).
StringMap safeSubstitutions = substitutions;
safeSubstitutions[HW::T_REFRACTION_TWO_SIDED] = HW::REFRACTION_TWO_SIDED;
ShaderGenerator::replaceTokens(safeSubstitutions, stage);

// Now fix up code only: wrap bool-as-int uniform names in bool() at use sites,
// but skip declaration lines (pattern: " name;").
// Add more entries here if additional bool uniforms are introduced.
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment states "wrap bool-as-int uniform names in bool() at use sites" but the code also attempts to skip declaration lines. However, the pattern detection for declarations is fragile. Consider adding a more detailed comment explaining the full algorithm: (1) exclude the token from base replaceTokens substitution, (2) manually replace the token in interfaces, (3) scan code for remaining occurrences and wrap with bool() cast except at declaration sites. This would help future maintainers understand the complete approach.

Suggested change
// Let the base handle all interface and code substitution with a safe map
// (T_REFRACTION_TWO_SIDED -> u_refractionTwoSided, no bool wrap).
StringMap safeSubstitutions = substitutions;
safeSubstitutions[HW::T_REFRACTION_TWO_SIDED] = HW::REFRACTION_TWO_SIDED;
ShaderGenerator::replaceTokens(safeSubstitutions, stage);
// Now fix up code only: wrap bool-as-int uniform names in bool() at use sites,
// but skip declaration lines (pattern: " name;").
// Add more entries here if additional bool uniforms are introduced.
// Three-step algorithm for "bool-as-int" uniforms:
// (1) Start from the incoming substitution map and build a "safe" map that
// excludes bool-wrap behavior for specific tokens. In particular, map
// T_REFRACTION_TWO_SIDED directly to the uniform name
// (T_REFRACTION_TWO_SIDED -> u_refractionTwoSided) so the base
// implementation will substitute it as-is, with no bool() wrapping.
// (2) Call the base ShaderGenerator::replaceTokens with this safe map so
// that all interface blocks and general code see the plain uniform
// name (e.g. u_refractionTwoSided) substituted in the right places.
StringMap safeSubstitutions = substitutions;
safeSubstitutions[HW::T_REFRACTION_TWO_SIDED] = HW::REFRACTION_TWO_SIDED;
ShaderGenerator::replaceTokens(safeSubstitutions, stage);
// (3) Post-process the generated source code only: scan for plain
// bool-uniform names and wrap them in bool() at use sites, while
// attempting to skip their declaration lines. Declarations are
// heuristically detected using the pattern " name;" (single space
// before the name and a semicolon immediately after). This pattern is
// intentionally simple and should be kept in mind if the surrounding
// code format changes.
// Add more entries here if additional bool uniforms are introduced.

Copilot uses AI. Check for mistakes.

// Set uniform type to integer
// WGSL does not support boolean uniforms; emit as integer.
// Cast to bool is emitted at use site (emitInput / replaceTokens).
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "Cast to bool is emitted at use site (emitInput / replaceTokens)", but this only covers BOOLEAN type inputs passed through emitInput and uniforms handled by replaceTokens. If there are other code paths that use boolean uniforms directly (e.g., in inline code or string literals), they may not get the proper bool() cast. Consider adding a comment noting this limitation or ensuring comprehensive coverage of all use cases.

Suggested change
// Cast to bool is emitted at use site (emitInput / replaceTokens).
// Cast to bool is emitted at use site (emitInput / replaceTokens) for
// code paths that use these APIs. Other usages (e.g., inline code or
// string-literal snippets that reference this uniform directly) must
// ensure an explicit bool() cast if needed.

Copilot uses AI. Check for mistakes.
// Calculate in-degrees for all nodes, and enqueue those with degree 0.
std::unordered_map<ShaderNode*, int> inDegree(_nodeMap.size());
std::deque<ShaderNode*> nodeQueue;
std::set<ShaderNode*, decltype(compareNodes)> nodeQueue(compareNodes);
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change from std::deque to std::set changes the time complexity of the topological sort from O(n + e) to O((n + e) * log n) where n is the number of nodes and e is the number of edges. For typical shader graphs with fewer than 100 nodes, this is acceptable. However, if there are use cases with very large shader graphs (hundreds or thousands of nodes), this could become a performance bottleneck. Consider documenting this tradeoff or monitoring performance on large graphs.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant