test: monitor CVE5 conversion regressions through snapshot testing #4470
base: master
Conversation
```go
// TestSnapshotConversion runs the conversion process on a sample of CVEs and
// creates snapshots of the output for comparison. This is used for monitoring
// progressions and regressions when making changes to the converter.
func TestSnapshotConversion(t *testing.T) {
```
t.Parallel()
Can't parallelize this because we are grouping snapshots by CNA, so there would be race conditions causing snapshots to be non-deterministic.
Should be fine to t.Parallel here though? This just makes it run at the same time as snapshot_test.go, which should be independent of this, I believe.
If you mean the stats output test, then no, it can't be run in parallel: the rest of the files first need to be processed in order to have a stats outcome.
The files can't be parallelized easily either, because of the non-determinism from grouping them by CNA, and because they're all writing to the same stats output.
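To illustrate the non-determinism concern, here is a minimal sketch (all names invented for illustration, not from the PR): parallel workers appending converted records into per-CNA groups. A mutex prevents concurrent map writes, but the order of appends within a group still varies run to run, which is exactly why snapshots grouped by CNA can't be parallelized naively.

```go
package main

import (
	"fmt"
	"sync"
)

// groupByCNA is a hypothetical sketch of parallel conversion workers
// grouping results by CNA. The mutex makes the map writes safe, but the
// append order inside each group remains non-deterministic across runs.
func groupByCNA(ids []string) map[string][]string {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		groups = map[string][]string{} // CNA -> converted record IDs
	)
	for _, id := range ids {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			mu.Lock()
			groups["example-cna"] = append(groups["example-cna"], id)
			mu.Unlock()
		}(id)
	}
	wg.Wait()
	return groups
}

func main() {
	groups := groupByCNA([]string{"CVE-A", "CVE-B", "CVE-C"})
	// The count is stable, but the slice order is not deterministic.
	fmt.Println(len(groups["example-cna"])) // prints 3
}
```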
```go
keys := make([]string, 0, len(stats))
for k := range stats {
	keys = append(keys, k)
}
sort.Strings(keys)
```
Use the shorter form with maps.Keys.
maps.Keys isn't deterministic :/
I meant use slices.Sorted(maps.Keys(stats)), same as in the previous PR.
```go
// Sort keys for deterministic output
keys := make([]string, 0, len(stats))
for k := range stats {
	keys = append(keys, k)
}
sort.Strings(keys)

var statsOutput strings.Builder
statsOutput.WriteString("Conversion Outcomes:\n")
for _, k := range keys {
	statsOutput.WriteString(fmt.Sprintf("%s: %d\n", k, stats[k]))
}
snaps.WithConfig(snaps.Filename("conversion_outcomes")).MatchSnapshot(t, statsOutput.String())
```
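The aggregation above can be exercised outside the test harness as a standalone sketch (the outcome names and counts below are invented for illustration): sorting the outcome names first makes the rendered snapshot text deterministic regardless of map iteration order.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildStatsOutput mirrors the snippet above: sort the outcome names so
// the snapshot text is deterministic, then render one line per outcome.
func buildStatsOutput(stats map[string]int) string {
	keys := make([]string, 0, len(stats))
	for k := range stats {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString("Conversion Outcomes:\n")
	for _, k := range keys {
		b.WriteString(fmt.Sprintf("%s: %d\n", k, stats[k]))
	}
	return b.String()
}

func main() {
	fmt.Print(buildStatsOutput(map[string]int{"Successful": 120, "NoRanges": 3}))
	// prints:
	// Conversion Outcomes:
	// NoRanges: 3
	// Successful: 120
}
```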
Actually is this useful? The snapshot of each individual record would change (including the metrics) without this aggregate right? Since we are no longer putting every single record in here, it's probably not needed here.
Do we have conversion_outcomes aggregation being written normally and uploaded? If not we should do that.
Personally I'd find it useful because it gives a high-level overview of which direction the numbers are moving. If it were just the Successful count going up, I wouldn't need to look into it as much as NoRanges suddenly spiking. It's just slightly more useful for speedy triage.
We don't currently have conversion_outcomes being written and uploaded for the CVE5 logic, but there is an equivalent for NVD. The NVD one uses a CSV file that I suspect is being truncated at 5000 entries.
It'd be good to use this data though in the future for our metrics dashboard.
Problem is that we're running conversion in parallel, so we would either need to do some analysis using the .metrics.json files, or do icky stuff with mutexes.
I have the former solution in another WIP PR that I can put up after this one.
Hmm, my main concern is that I don't want the direction to be part of the tests. That'll work in the short term, but I feel it might get really annoying to update in the long term.
direction?
This PR is intended for regression monitoring, and should probably be set up to only run under a specific flag when the conversion code is changed, as it will be doing a lot of converting.
When the data is converted, a snapshot is saved per CNA so that if it changes we can see what exactly has changed.
We also keep a snapshot of the conversion stats, to monitor that on a higher level.
A follow up PR will include all of the sampled data we want to be testing.