
Cleanup codebase to make it readable and usable#73

Open
adityasoni9998 wants to merge 2 commits into major-update from soni/main

Conversation


adityasoni9998 (Collaborator) commented Feb 10, 2026

Fixes #71 (mostly, but not fully)

Summary of changes made:

  • Edit the code-search generator to fix masking bugs, remove length-based loss masking in the trainer, and fix the masking logic across all LLMs (see the sketches after this list)
  • Clean up dead code files, stale comments, unused datasets, submodules, and unused packages such as prime-rl
  • agent-sdk is now installed via pyproject.toml
  • Fix bugs in the Wandb metrics for total tokens in each rollout (see the second sketch below)
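To make the first bullet concrete, here is a minimal sketch (not this repo's actual implementation) of completion-only loss masking with no dependence on sequence length; the function name, tensor layout, and pad-token handling are assumptions for illustration:

```python
import torch

def build_loss_mask(input_ids: torch.Tensor, prompt_len: int, pad_token_id: int) -> torch.Tensor:
    """Train only on completion tokens: mask out the prompt and padding,
    with no length-based masking of the loss."""
    mask = torch.ones_like(input_ids, dtype=torch.bool)
    mask[:, :prompt_len] = False          # never compute loss on prompt tokens
    mask &= input_ids != pad_token_id     # never compute loss on padding
    return mask

# Toy usage: average the per-token loss over unmasked positions only.
input_ids = torch.tensor([[11, 12, 13, 14, 15, 0, 0]])   # 0 = pad
mask = build_loss_mask(input_ids, prompt_len=2, pad_token_id=0)
per_token_loss = torch.randn(input_ids.shape)             # stand-in for -logprob * advantage
loss = (per_token_loss * mask).sum() / mask.sum().clamp(min=1)
```

And a hedged sketch of the kind of per-rollout token metric the last bullet refers to; the `rollouts` structure, field name, and metric keys are hypothetical:

```python
import wandb

def log_rollout_token_stats(rollouts: list[dict], step: int) -> None:
    """Log total and mean generated tokens for a batch of rollouts to Wandb.
    Assumes each rollout dict carries a 'completion_ids' list (hypothetical field)."""
    total = sum(len(r["completion_ids"]) for r in rollouts)
    wandb.log(
        {
            "rollout/total_tokens": total,
            "rollout/mean_tokens": total / max(len(rollouts), 1),
        },
        step=step,
    )
```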

This code was tested by training the 1.7B RL'ed LLM checkpoint for a few more steps -- it runs out of the box, and the reward stays close to where it had plateaued. Full end-to-end validation (training a base LLM and verifying that rewards go up) is still a TODO.

(screenshot: reward curve from the resumed 1.7B run)

TODOs/Questions:

  • configs and src/prompts/templates contain many unused reward configs and prompt templates. Do we want to remove them?
  • Do we want to retain all the reward function implementations? We only end up using the simplest rewards in our work.
  • Train 4B Instruct with this code (with the exact same config as the 14B model?) and verify that it works (<12 hours of work on an 8xH100 machine).


yucc-leon commented Feb 10, 2026

Hi Aditya, thanks for the cleanup PR — this seems super helpful.

Some of the issues you fixed (especially the masking logic and the length-based loss masking) might explain why my earlier ablations on the major-update branch didn’t behave well (runs here: https://wandb.ai/leon_at_work/ablation_v2_4b). I likely ran into those bugs before they were cleaned up, or I used different configs or settings in the training scripts.

I’m happy to help with the TODO you mentioned — e.g. doing an end-to-end validation by training a 4B Instruct model using this branch and checking whether rewards improve as expected. Let me know if the cleanup branch reflects the intended setup for that.
