Prepare inputs needed for downstream evaluation by taha-yassine · Pull Request #74 · OpenHands/codescout

taha-yassine · 2026-02-10T10:43:34Z

This PR adds a script that builds the dataset required to run the benchmark on downstream tasks with OpenHands/benchmarks. The script takes as input the search results produced by CodeScout, as a .jsonl file.
It also adds a custom prompt template, to be used with the dataset, that inserts the search results into the user message.
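
For readers unfamiliar with the shape of the inputs, here is a minimal sketch of the kind of join such a build step involves. It is not the script added in this PR. The field names `instance_id` and `search_results`, and the helper names, are assumptions made for illustration only.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of a .jsonl file."""
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def attach_search_results(instances, search_rows,
                          key="instance_id", field="search_results"):
    """Return copies of `instances` with the matching search results attached.

    `key` and `field` are hypothetical names, not taken from the PR.
    Instances without a matching search row get an empty list.
    """
    by_id = {row[key]: row.get(field, []) for row in search_rows}
    return [{**inst, field: by_id.get(inst[key], [])} for inst in instances]


def write_jsonl(rows, path):
    """Write one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
```

The merged rows can then be written out with `write_jsonl` and passed to the benchmark as the dataset file. Defaulting to an empty list keeps instances with no search output in the dataset, which may or may not match what the actual script does.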

Example command to run the benchmark:

uv run swebench-infer .llm_config/modal.json \
        --dataset path/to/dataset_with_search_results.jsonl \
        --split test \
        --max-iterations 100 \
        --workspace docker \
        --prompt-path path/to/default_with_search.j2

taha-yassine added 2 commits February 10, 2026 11:25

Add downstream dataset build script

27adcc9

Add custom prompt template for benchmark

34af62b

taha-yassine requested a review from adityasoni9998 February 10, 2026 10:43

taha-yassine changed the base branch from main to major-update February 10, 2026 10:44

adityasoni9998 mentioned this pull request Feb 16, 2026

Demonstrate the advantages of localization before fixing. #22

Open

taha-yassine wants to merge 2 commits into major-update from dataset-with-search
