Independent Contractor (IC) - Research Software Engineer (Part-Time)

Princeton University – SAgE Research Group

About SAgE

The Science of Agent Evaluation (SAgE) group at Princeton studies the systematic evaluation of AI agents. Our work includes benchmark development, building open-source infrastructure for agent evaluations, and research on the impact of AI on science. Our recent projects include the Holistic Agent Leaderboard (HAL), CORE-bench, and research on the limits of inference scaling.

Your Role

As an IC Research Software Engineer, you will take a leading role in maintaining the Holistic Agent Leaderboard (HAL), including its backend infrastructure, evaluation harness, and public leaderboards. This position also offers the opportunity to work closely with our group on other ongoing projects aiming to shape emerging evaluations for AI systems.

Core responsibilities:

Integrate benchmarks and agents into HAL
Integrate open-source pull requests
Lead development of the HAL harness
Fix issues and maintain the HAL leaderboard and harness
Run evaluations on new agents
Attend SAgE research group meetings
Support ongoing research projects on AI evaluation

Time Commitment – We estimate the work will take around 20 hours a week, but you’re free to manage your own schedule and workload.

Remuneration

The contractor will be paid $100 per hour, which comes to approximately $8,000 per month at 20 hours per week (about 80 hours per month). If fewer than 20 hours are worked in any given week, payment for that month will be prorated accordingly.

Qualifications

Required: Strong programming skills (Python and web development)

Desired:

Familiarity with ML tooling (agent frameworks such as smolagents)
Interest in AI evaluation and research infrastructure

How to Apply

Email your resume, a link to your GitHub profile, and a brief statement of interest to sayashk AT princeton DOT edu. Please include [HAL application] in the subject line. Applications will be reviewed on a rolling basis, and we will close the search as soon as we find a suitable candidate.