fix: ensure integer bomref discrimination #926

qkaiser · 2025-12-15T11:58:02Z

The discriminator previously used str(random()) pieces, which could emit scientific notation (e.g. .756153012152602e-05) and generate invalid BOM ref fragments.

Switch to uuid4().int for each segment so we always produce integer-only values while maintaining uniqueness.
Add a regression test to lock the BomRef.. format.

uuid4() gives ~122 bits of entropy (non-crypto), compared to ~53 bits from random.random(), and the prior decimal-string slicing wasted some of those bits while allowing scientific notation.

Went with uuid4 after exploring in-memory addresses like the JavaScript CycloneDX library and the PHP CycloneDX library which uses a time based approach.

A generalized approach for BomRef emission would be nice I think. Both from a resulting format and generation standpoint. Specifically:

PHP library uses uniqid which is time based and results in a timestamp in microseconds expressed in hex suffixed by a random digit
JavaScript uses in-memory addresses but also Math.random() which emits two dot separated base32 encoded value
Python is currently using math.random() which emits two dot separated 16 digits integer, sometimes in scientific notation

Instead of having custom implementations for each library offering support for CycloneDX, they could all rely on unique identifiers such as uuid4 which is natively supported by each language. This would offer strong guarantees of uniqueness along with a common format.

The discriminator previously used str(random()) pieces, which could emit scientific notation (e.g. 4.1e-05) and generate invalid BOM ref fragments. Switch to uuid4().int for each segment so we always produce integer-only values while maintaining uniqueness. Add a regression test to lock the BomRef.<int>.<int> format. uuid4() gives ~122 bits of entropy (non-crypto), compared to ~53 bits from random.random(), and the prior decimal-string slicing wasted some of those bits while allowing scientific notation. Signed-off-by: Quentin Kaiser <quentin.kaiser@onekey.com>

jkowalleck · 2025-12-16T11:28:30Z

Thanks for the report. I do have some questiosn

[...] generate invalid BOM ref fragments

invalid to what?

Instead of having custom implementations for each library offering support for CycloneDX, they could all rely on unique identifiers such as uuid4 which is natively supported by each language. This would offer strong guarantees of uniqueness along with a common format.

why exactly?

I dont see any reason to streamline a thing that has basically no rule nor format.
I mean, BomRef have only one constraint: being unique in the CycloneDX context(bom) they are used in.

Please explain why you feel this needs changing.

qkaiser · 2025-12-17T09:02:22Z

Please explain why you feel this needs changing.

Some of our validators check BomRef fields to make sure they follow the format from this library which we naively assumed was dot separated digits. When casting from float to string leads to scientific notation, the presence of - and e breaks this validation.

I fully agree with you there's no standardized format for BomRef. It just feels logic to fix the value generation rather than relax a validation to accept scientific notation values in an identifier.

Feel free to close this MR if you want to keep things as they are.

jkowalleck · 2025-12-17T10:10:52Z

I fully agree with you there's no standardized format for BomRef.
[...]
Feel free to close this MR if you want to keep things as they are.

I’ll close this ticket as “Not Planned”, since there’s no existing standard to base further work on.

It just feels logic to fix the value generation rather than relax a validation to accept scientific notation values in an identifier.

You really should "relax"(fix!) this validator, since it was build on wrong assumptions, right?

qkaiser requested a review from a team as a code owner December 15, 2025 11:58

qkaiser force-pushed the fix/bom-ref-discriminator branch from fb05b8c to b0a9f50 Compare December 15, 2025 11:59

jkowalleck added the question Further information is requested label Dec 16, 2025

jkowalleck closed this Dec 17, 2025

jkowalleck added invalid This doesn't seem right and removed question Further information is requested labels Dec 17, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

fix: ensure integer bomref discrimination #926

fix: ensure integer bomref discrimination #926

Uh oh!

qkaiser commented Dec 15, 2025

Uh oh!

jkowalleck commented Dec 16, 2025

Uh oh!

qkaiser commented Dec 17, 2025

Uh oh!

jkowalleck commented Dec 17, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

fix: ensure integer bomref discrimination #926

fix: ensure integer bomref discrimination #926

Uh oh!

Conversation

qkaiser commented Dec 15, 2025

Uh oh!

jkowalleck commented Dec 16, 2025

Uh oh!

qkaiser commented Dec 17, 2025

Uh oh!

jkowalleck commented Dec 17, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants