Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Dec 10, 2025

This PR adds tracking for source link files and resources to the compiler's deterministic key file to ensure all compilation inputs that affect reproducible builds are properly tracked. The implementation includes calculating checksums for source link content and manifest resources (tracking name, fileName, isPublic, and content checksum), while ruleset tracking was deemed unnecessary since rulesets only affect compilation through diagnostic options that are already being tracked.

Original prompt

This section details on the original issue you should resolve

<issue_title>Missing inputs in the key file</issue_title>
<issue_description>Recently discovered that there are several inputs we are not tracking in the key file that need to be tracked:

  • source link file
  • ruleset file
  • resources

</issue_description>

Comments on the Issue (you are @copilot in this section)


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Copilot AI and others added 3 commits December 10, 2025 03:12
Co-authored-by: jcouv <12466233+jcouv@users.noreply.github.com>
Co-authored-by: jcouv <12466233+jcouv@users.noreply.github.com>
Co-authored-by: jcouv <12466233+jcouv@users.noreply.github.com>
Copilot AI changed the title [WIP] Add missing inputs to key file for tracking Track sourceLink, ruleset, and resources in deterministic key Dec 10, 2025
Copilot AI requested a review from jcouv December 10, 2025 03:38
@jcouv
Copy link
Member

jcouv commented Dec 10, 2025

@copilot CI produced src/Compilers/Core/Portable/Compilation/DeterministicKeyBuilder.cs(434,31): error IDE0060: (NETCORE_ENGINEERING_TELEMETRY=Build) Remove unused parameter 'cancellationToken'

Co-authored-by: jcouv <12466233+jcouv@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Dec 10, 2025

@copilot CI produced src/Compilers/Core/Portable/Compilation/DeterministicKeyBuilder.cs(434,31): error IDE0060: (NETCORE_ENGINEERING_TELEMETRY=Build) Remove unused parameter 'cancellationToken'

Fixed in 08c5f38. Removed the unused cancellationToken parameter from the WriteEmitOptions method signature and its call site.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • sc4vsblobprodcus331.vsblob.vsassets.io
    • Triggering command: /usr/share/dotnet/dotnet dotnet build src/Compilers/Core/Portable/Microsoft.CodeAnalysis.csproj (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@jcouv
Copy link
Member

jcouv commented Dec 10, 2025

@copilot GetKey(...) already handles default inputs (additionalTexts, analyzers, generators, resources, ...). There is no need for the caller of GetKey to also do NullToEmpty() on those.

Co-authored-by: jcouv <12466233+jcouv@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Dec 10, 2025

@copilot GetKey(...) already handles default inputs (additionalTexts, analyzers, generators, resources, ...). There is no need for the caller of GetKey to also do NullToEmpty() on those.

Fixed in e4530d3. Removed the redundant NullToEmpty() calls from the caller since GetKey() already handles default inputs internally.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • sc4vsblobprodcus331.vsblob.vsassets.io
    • Triggering command: /usr/share/dotnet/dotnet dotnet build src/Compilers/Core/Portable/Microsoft.CodeAnalysis.csproj (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@jcouv
Copy link
Member

jcouv commented Dec 10, 2025

@copilot We should care about the ruleset contents, not just the path to the ruleset file

Co-authored-by: jcouv <12466233+jcouv@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Dec 10, 2025

@copilot We should care about the ruleset contents, not just the path to the ruleset file

Fixed in 2fac371. Changed from tracking ruleset file path to tracking ruleset content with checksum (similar to source link). The ruleset file is now opened and read as SourceText, with its checksum written to compilation.options.ruleSet.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • sc4vsblobprodcus331.vsblob.vsassets.io
    • Triggering command: /usr/share/dotnet/dotnet dotnet build src/Compilers/Core/Portable/Microsoft.CodeAnalysis.csproj (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@jcouv
Copy link
Member

jcouv commented Dec 10, 2025

@copilot Remove IsEmbedded. It is redundant with FileName and we need FileName for non-embedded scenarios

Co-authored-by: jcouv <12466233+jcouv@users.noreply.github.com>
@jcouv
Copy link
Member

jcouv commented Dec 10, 2025

@copilot We don't need to include ruleSet in the key after all since it can only affect the compilation output via GeneralDiagnosticOption and SpecificDiagnosticOptions. Both of those are already included in the key.
Check if I'm missing something. If looks good, go ahead and remove it

Co-authored-by: jcouv <12466233+jcouv@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Dec 10, 2025

@copilot We don't need to include ruleSet in the key after all since it can only affect the compilation output via GeneralDiagnosticOption and SpecificDiagnosticOptions. Both of those are already incl...

Fixed in 0e564a1. Removed ruleSet tracking from the deterministic key. As you noted, rulesets only affect compilation output via GeneralDiagnosticOption and SpecificDiagnosticOptions, both of which are already tracked in the key, making the ruleset content itself redundant.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • sc4vsblobprodcus331.vsblob.vsassets.io
    • Triggering command: /usr/share/dotnet/dotnet dotnet build src/Compilers/Core/Portable/Microsoft.CodeAnalysis.csproj (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title Track sourceLink, ruleset, and resources in deterministic key Track sourceLink and resources in deterministic key Dec 10, 2025
@jcouv
Copy link
Member

jcouv commented Dec 10, 2025

@copilot Let's update the added test for resources to include a non-public resource and a non-embedded resource.
Also, let's add a test showing that irrelevant/trivial changes to ruleset does not affect the key.

Co-authored-by: jcouv <12466233+jcouv@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Dec 10, 2025

@copilot Let's update the added test for resources to include a non-public resource and a non-embedded resource.
Also, let's add a test showing that irrelevant/trivial changes to ruleset does not affe...

Fixed in 59233c9.

Enhanced the Resources test to include:

  • Embedded public resource (original)
  • Embedded non-public resource (new)
  • Non-embedded (linked) resource with fileName (new)

Added RuleSetChangesDoNotAffectKey test demonstrating that the deterministic key is based on the diagnostic options (GeneralDiagnosticOption and SpecificDiagnosticOptions) rather than ruleset file content, since those options are already tracked in the key.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • sc4vsblobprodcus331.vsblob.vsassets.io
    • Triggering command: /usr/share/dotnet/dotnet dotnet build src/Compilers/Core/RebuildTest/Microsoft.CodeAnalysis.Rebuild.UnitTests.csproj (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

static Span<byte> computeChecksum(ResourceDescription resource)
{
using var stream = resource.DataProvider();
using var hashAlgorithm = System.Security.Cryptography.SHA256.Create();
Copy link
Member

@jcouv jcouv Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 sourceText.GetChecksum (in WriteSourceText) uses either SHA1 or SHA256. I figured it's okay to align with that for now, but we should consider moving to XXHASH at some point #Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key file should use the same checksum for all entries. This is effectively a merkle tree and the checksums should be uniform throughout it. This guarantees that the file has the same efficacy irrespective of what compiler options are used.

we should consider moving to XXHASH at some point

Disagree. The intent of these checksums is to provide uniqueness and that is one of the valid uses of crypto. The code here should just use the default crypto algorithm for all entries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If xxhash128 does not sufficiently ensure uniqueness for our purposes, then, I would recommend just using SHA512 here now to minimize our future compliance burden.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key file should use the same checksum for all entries.

This is already not the case. Each syntax tree can use its own checksum algorithm. Added a test to illustrate

Copy link
Member

@jaredpar jaredpar Dec 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't hard code any crypto hash algorithm, that is how we get into compliance problems. Eventually all hash algorithms are broken and it turns into a security problem. Use SourceHashAlgorithms.Default instead where we centrally control the hash.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each syntax tree can use its own checksum algorithm.

Believe that was done for expediency. Overall, we should be using the same hash. Other wise some enries are stronger than others.

@jcouv jcouv marked this pull request as ready for review December 10, 2025 17:51
@jcouv jcouv requested a review from a team as a code owner December 10, 2025 17:51
@jcouv jcouv added the Concept-Determinism The issue involves our ability to support determinism in binaries and PDBs created at build time. label Dec 10, 2025
Comment on lines +225 to +227
using var stream = resource.DataProvider();
var sourceText = SourceText.From(stream, encoding: null, checksumAlgorithm: SourceHashAlgorithms.Default);
WriteSourceText(writer, sourceText);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you converting to a SourceText here? These resources are generally treated as binary blobs in the codebase. My intuition would be that we just grab the raw Stream or byte[] and hash that. Not sure if we'd hit weird edge cases trying to view a binary as a source text.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area-Compilers Concept-Determinism The issue involves our ability to support determinism in binaries and PDBs created at build time.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Missing inputs in the key file

4 participants