
ARIA Spotlight: Rosa Chen – Department of Computer Science

Rosa Chen's Research Poster

My Arts Undergraduate Research Internship (ARIA) project is called “Improving Successor Feature Learning with Efficient Optimization Techniques,” supervised by Dr. Isabeau Prémont‑Schwarz. In simple terms, I worked on helping a learning program (an “agent”) learn useful patterns faster and reuse them across tasks, instead of starting from zero each time. This is practical because goals can change, and we want systems that adapt quickly without wasting compute.

I chose ARIA because I wanted hands‑on research: write code that runs, design clean experiments, and explain results in plain language. I focused on a method called “successor features” (SF). In plain terms, SF separates two things: general knowledge about the world and the current goal. With that separation, an agent can switch goals while keeping what it already knows, which can save training time.
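The separation described above can be sketched in a few lines. This is a hypothetical illustration, not the project's actual code: in the successor-feature view, the value of an action factors into task-agnostic features `psi` (general knowledge about the world) and a task vector `w` (the current goal), so switching goals only means swapping `w`.

```python
import numpy as np

# Hypothetical sketch of the successor-feature (SF) decomposition
# Q(s, a) = psi(s, a) . w, where psi is task-agnostic and w encodes the goal.
rng = np.random.default_rng(0)

n_states, n_actions, n_features = 4, 2, 3

# Pretend these successor features were already learned for some policy.
psi = rng.normal(size=(n_states, n_actions, n_features))

def q_values(psi, w):
    """Combine task-agnostic successor features with a task vector w."""
    return psi @ w  # shape: (n_states, n_actions)

# Switching goals means swapping w; psi is reused unchanged.
w_task_a = np.array([1.0, 0.0, 0.0])
w_task_b = np.array([0.0, 0.5, 0.5])

q_a = q_values(psi, w_task_a)  # values under goal A
q_b = q_values(psi, w_task_b)  # values under goal B, same psi
```

Because `psi` is reused unchanged, the agent keeps what it knows about the world and only re-learns (or is handed) the small goal vector when the task changes.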

My learning goals were simple. First, try small, practical changes that make training steadier and faster. Second, run fair tests: keep settings fixed, try multiple random seeds, and track not only scores but also the helper losses we add during training. Third, practice clear communication—short notes, readable plots, and simple takeaways.
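The testing discipline above can be made concrete with a small sketch. Everything here is a stand-in (the `run_experiment` function, the config fields, the metric names are my own illustration): the point is that the settings stay fixed, the seed is the only thing that varies, and both the main score and the helper losses are recorded.

```python
import random
import statistics

def run_experiment(seed, config):
    """Stand-in for one training run; a real run would train the agent."""
    rng = random.Random(seed)
    score = config["base_score"] + rng.gauss(0, 1)       # main metric
    aux_loss = abs(rng.gauss(config["aux_scale"], 0.1))  # helper loss
    return {"score": score, "aux_loss": aux_loss}

# Settings stay fixed across seeds so only randomness varies.
config = {"base_score": 10.0, "aux_scale": 0.5}
results = [run_experiment(seed, config) for seed in range(5)]

# Report the main score and the helper loss, not just one of them.
mean_score = statistics.mean(r["score"] for r in results)
std_score = statistics.stdev(r["score"] for r in results)
mean_aux = statistics.mean(r["aux_loss"] for r in results)
print(f"score: {mean_score:.2f} +/- {std_score:.2f}, aux loss: {mean_aux:.3f}")
```

Reporting a mean and spread over seeds, rather than a single lucky run, is what makes two settings comparable.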

I added two ideas to the training loop. The first was to mix a slow, steady signal from a target copy of the model with the live model’s features when computing helper losses. This keeps the model from overreacting and makes learning smoother. The second was to ask the model to predict the immediate reward from its own internal features. This nudges the model to learn features that are truly useful for decisions. Both ideas are lightweight and easy to tune.
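A minimal sketch of these two additions, under my own assumptions rather than the project's exact code: blend the target copy's features with the live features before computing the helper loss, attach a reward-prediction head to the blended features, and let the target copy drift slowly toward the live model.

```python
import numpy as np

rng = np.random.default_rng(1)
n_features = 8

live_feats = rng.normal(size=n_features)    # features from the live model
target_feats = rng.normal(size=n_features)  # features from the slow target copy

# Idea 1: blend live and target features for the helper loss. `mix` controls
# how much of the slow, steady target signal enters (mix=1.0 -> target only).
mix = 0.7
blended = mix * target_feats + (1.0 - mix) * live_feats

# Idea 2: reward-prediction helper loss. A linear head on the features should
# recover the immediate reward; squared error penalizes features that are not
# predictive of reward.
reward_head = rng.normal(size=n_features)  # hypothetical learned weights
true_reward = 1.0
pred_reward = blended @ reward_head
reward_loss = (pred_reward - true_reward) ** 2

# The target copy itself drifts slowly toward the live model (Polyak-style
# update), which is what makes its signal steady.
tau = 0.01
target_feats = tau * live_feats + (1 - tau) * target_feats
```

Both `mix` and the weight placed on `reward_loss` in the total objective are the small knobs referred to as "easy to tune."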

The main value was the process. I built a reliable training pipeline, learned to keep runs reproducible, and saw steadier learning when the slow signal was tuned well. The curves were not dramatically better in every case, but the system became easier to debug and understand. Plotting both the main score and the helper losses made it clearer why certain settings behaved better or worse.

At the beginning I was new to this codebase and to RL in practice. I made mistakes and spent time understanding how the parts fit together. When experiments stalled, it was discouraging, but I learned this is normal in research. I made progress by testing in smaller steps, changing one thing at a time, and keeping careful notes. I also tuned the strength of the helper loss and the slow‑signal mix to keep training stable.

ARIA gave me the full cycle—idea, build, test, explain. I learned that small, well‑motivated changes can improve stability without adding heavy complexity. This experience grew my interest in practical, reliable training methods, which I plan to carry into future research or an applied ML role. Next, I hope to turn this work into a short workshop paper or extended report with clearer ablations. I also plan to share the code and simple run scripts so that other students can easily reproduce the results.

Lastly, I would like to thank the ARIA program and, specifically, the Undergraduate Experiential Learning Opportunities Support Fund, because it covered my basic costs (especially rent), which allowed me to focus on research instead of taking extra jobs. That time and focus turned into better experiments and clearer results. I am grateful for the support.
