Internship | Curriculum learning for tactical reinforcement learning agents

Reinforcement learning holds many promises, but as always the devil is in the details, and in when and where rewards are presented to an agent.


Den Haag

Education type

university (wo)


Internship and graduation project

Hours a week

Fulltime – 40




What will you be doing?

At the Modelling, Simulation & Gaming department, we perform applied research, mostly for the safety domain, including behavioural simulations, terrain modelling and artificial intelligence. This research is mostly applied in training, for example in simulators, and in tools that support the situational awareness and understanding of operational planners.

For planners and operators in the safety domain, the stakes are high. The lives of their own colleagues and even civilians are often at risk. Choosing what to do, or even simply where to be present, can have large effects that cascade through an operation. That is why, before and during every operation, the mission is planned thoroughly by a team that formulates a plan (a Course of Action) by considering as many factors as possible, including the terrain, weather, local populace and media. This process can take hours for simple tasks, or days to weeks for larger missions, and even then only a very limited number of courses of action can be developed and evaluated.

Using AI techniques to help planners develop and analyse plans can be an enormous boon for a rapid and safe response to threats. That is why we are developing an AI framework that supports these mission planners and operators by enabling them to simulate a scenario, learn the best strategy given certain goals and restrictions (for example using deep reinforcement learning (DRL)), and use those results to build their awareness and understanding of the situation. The output might include prototype plans, heatmaps of strategic locations, and statistical data on which types of units work best against which threats.

In this project, you will investigate and prototype methods to achieve curriculum learning in our tactical AI framework, such that better and more stable tactical behaviours can be learned. Depending on time and progress, the assignment can be scaled from fully observable single-player settings all the way to challenging probabilistic asymmetric co-learning settings. Research questions can include a comparison of different approaches, how problems (scenarios) can be generated efficiently and their suitability measured, what methods can be used to express what has been learnt at what stage, or how well a curriculum can be transferred to train another behaviour in another environment (e.g., city vs. rural area).

What do we require of you?

You want to contribute to innovation in the military and security domain. You feel right at home working (and learning) in a multidisciplinary environment, partly due to your proactive attitude and communication skills. You are alert to opportunities and actively contribute to the team of which you are a part. You take responsibility for your own work and planning, of course under our supervision.

What else do you bring:
  • An academic background in artificial intelligence, computer science, or another relevant technical field;
  • A structured approach to software development;
  • Experience with modern programming languages (e.g., Java, Python, C++, C#, JavaScript) and software development environments.

What can you expect of your work situation?

At TNO we innovate for a healthier, safer and more sustainable life. And for a strong economy. Since 1932 we have been developing knowledge and technology for the common good. We find each other in wonder and ingenuity, and we are driven to push boundaries. There is plenty of room and support for your talent and ambition. You work with people who are daring, who inspire you and who want to learn from you. Our state-of-the-art facilities are there to make your vision a reality. What you do at TNO matters: impact makes the difference. Because with every innovation you contribute to the life of tomorrow.
Within TNO you will work in the Modelling, Simulation & Gaming research group: a varied team of approximately thirty highly trained experts. Together with other TNO research groups, government, industry and business, we create innovations that contribute to a safe world, both physically and digitally. Think of simulations to predict the route of a fugitive, or a game in which soldiers can practise for their next mission on the basis of real data. And that's just the tip of the iceberg.

What can TNO offer you?

A work placement is the first step in your career and gives you an opportunity to take a good look at a prospective employer. TNO goes a step further. It’s not just looking that interests us; you and your knowledge are essential to our innovation. That’s why we attach a great deal of value to your personal and professional development. You will, of course, be properly supervised during your work placement and be given the scope to get the best out of yourself. Naturally, we provide suitable work placement compensation.

Application process

This vacancy requires a security clearance, which the AIVD issues after conducting a security screening. For more information, please visit the AIVD website.

Has this vacancy sparked your interest?

Then please feel free to apply for this vacancy! For further questions, don’t hesitate to contact us.

Due to Covid-19 and the consequent uncertainties and restrictions, students who are not residing in the Netherlands may currently not be able to start an internship or graduation project at TNO.

Contact: Philip Kerbusch
Phone number: 08886 61020

Note that applications via email and third-party applications are not taken into consideration.




