Data science, Trustworthy AI
Artificial Intelligence (AI) is rapidly transforming our world, influencing everything from public services to industry. The goal of our Data Science expertise group is to create and support the trustworthy adoption of AI solutions. We do this by adding human knowledge and values to the AI systems.
The commitment to Trustworthy AI aligns with core values of the European Union, such as respect for human dignity, freedom, democracy, equality, the rule of law, and respect for human rights, and supports us in building solutions that comply with standards for ethics and social responsibility.
Our mission
AI systems sometimes hallucinate, giving answers that are incomplete, misleading, or plain wrong. AI systems are also often biased against certain groups of people, because the data they rely on are unevenly distributed. At Data Science we make AI more responsible by improving user interaction and by developing techniques that reduce the risk of hallucinations and biases. AI can support humans, but it can also inherit human bias and mistakes; at Data Science we aim to mitigate this as much as possible.
Hence our mission: Trustworthy AI, as human as it gets, and sometimes even better than that.
We therefore contribute to the development of European Generative AI and Large Language Models, providing a sovereign and ethical alternative to models from big tech companies. We collaborate with both public and private organizations to help them adopt innovative AI solutions that are not only effective but also aligned with societal values, thereby strengthening their success and competitiveness. Moreover, we design solutions that enable the secure and efficient sharing of data, making it easier for different stakeholders, across sectors and domains, to collaborate. This is essential for tackling complex challenges, such as optimizing the use of energy grids for a smarter and more sustainable future.
World overlaid with a thin layer of data
You can’t see it, but the world is covered by a wafer-thin layer of data. The temperature of the air, the age of your neighbour, the balance in your bank account. Our smartphones, laptops, smart watches, and other devices provide us with all these data whenever we want them. These devices can do a lot with data, but there’s still a lot they can’t do – or can’t yet do in a good and proper way.
For example, it isn’t always clear which data a decision, possibly an automated one, is based on. Such a decision can have a great impact on people’s lives: it may determine, for instance, whether or not you’ll be invited for a job interview. When questionable use of data leads to an invasion of privacy, we realise how dependent and vulnerable we are, especially when personal data are suddenly out in the open. With growing technological possibilities, the use of unexplainable systems is also increasing: systems that seem to think and decide autonomously. The huge amounts of data produced are helpful to us, so much so that we can hardly live without all the information. But the volume is so large that we sometimes risk losing control of it. The importance of handling data properly and giving it the right meaning will therefore only increase.
What does the Data Science expertise group do?
We work on solutions that enrich information systems and artificial intelligence (AI) with human knowledge and experience. In this way, we help devise and build meaningful, reliable, and explainable solutions for data sharing and data analysis. One method is to use natural language to enable data exchange. Or we can enable AI systems to learn from human knowledge and to use context awareness in reasoning. We call the combination of data-driven and knowledge-driven AI ‘Hybrid AI’. More data also makes privacy and data security extremely important. The impact on policy and legislation is also growing. We therefore work closely with colleagues who develop information security and privacy-enhancing technologies, as well as experts on policy and strategy making.
The Data Science expertise group aims to provide insightful and meaningful data in a responsible and comprehensible manner. Our primary focus is on Trustworthy AI, ensuring that data and AI models are shared and managed securely, and used to make better decisions that benefit both society and businesses. We strive to have a significant impact by supporting companies and public authorities in their data management, decision-making processes, and adoption of AI solutions. Discover what we do.
Technical know-how
Our expertise in Knowledge-Driven AI, Data-Driven AI, and Hybrid AI, in combination with our role as an independent research group, puts us in a unique position at national and European level to guide this process and keep adapting to the rapidly evolving world of AI.
Our expertise in Data-Driven AI includes traditional machine learning and advanced deep learning technologies. We specialize in natural language processing (NLP) and in developing reliable and compliant Large Language Models (LLMs). We apply these techniques across various domains, such as agriculture and health, keeping the focus on the trustworthy AI aspects of linguistic technology development.
Examples include developing chatbots tailored to specific sectors and using techniques like Generative AI (GenAI) and Large Language Models (LLMs) to summarize text and answer queries. We also generate synthetic datasets to protect privacy and address data scarcity.
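To illustrate the synthetic-data idea mentioned above, the sketch below generates an artificial column of values by sampling from a distribution fitted to real data, so that no real record appears in the released dataset. This is a minimal, hypothetical example (the function name and data are invented for illustration); production systems use far more sophisticated generators and privacy guarantees.

```python
import random
import statistics

def synthesize_ages(real_ages, n, seed=42):
    """Generate a synthetic age column by sampling from a normal
    distribution fitted to the real data, so no real record is reused."""
    rng = random.Random(seed)
    mu = statistics.mean(real_ages)
    sigma = statistics.stdev(real_ages)
    # Clamp at zero and round, since ages are non-negative integers.
    return [max(0, round(rng.gauss(mu, sigma))) for _ in range(n)]

real = [34, 45, 29, 52, 41, 38, 47, 33]
synthetic = synthesize_ages(real, 100)
```

The synthetic column preserves aggregate statistics (roughly the same mean and spread) while containing none of the original individuals, which is the core trade-off synthetic data aims for.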
We excel in building semantic models like ontologies and knowledge graphs for specific domains, such as the labour market and energy distribution. These models enable us to label raw data with metadata and reason with it in an open and context-aware manner. Notable applications include creating the Semantic Interoperability Framework, adopted by multiple companies, and ensuring interoperability between IoT devices and databases.
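As a toy illustration of how a knowledge graph supports reasoning (this is not the actual Semantic Interoperability Framework, and the entities are invented), the sketch below stores facts as subject-predicate-object triples and infers transitive type membership over an `isA` hierarchy:

```python
# A tiny knowledge graph as a set of (subject, predicate, object) triples.
triples = {
    ("SmartMeter", "isA", "IoTDevice"),
    ("IoTDevice", "isA", "Device"),
    ("meter42", "isA", "SmartMeter"),
    ("meter42", "measures", "EnergyConsumption"),
}

def types_of(entity, kg):
    """Return all types of an entity, following 'isA' links transitively."""
    found, frontier = set(), {entity}
    while frontier:
        nxt = {o for (s, p, o) in kg if p == "isA" and s in frontier}
        frontier = nxt - found
        found |= nxt
    return found

# meter42 is inferred to be a Device, even though that fact
# is never stated explicitly in the graph.
inferred = types_of("meter42", triples)
```

In practice such graphs are expressed in standard formats (e.g. RDF/OWL) and queried with SPARQL, but the principle is the same: explicit, machine-readable knowledge enables open and context-aware reasoning over raw data.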
Our Hybrid AI approach integrates data-driven methodologies with knowledge- and reasoning-based ones, focusing on data quality and trustworthiness. By combining these approaches, we create comprehensive AI systems that provide reliable and transparent outcomes. An example is tuning LLMs for specific applications, such as a speech-based interface for health-related questions, ensuring that AI-generated answers are valid and trustworthy.
Our commitment to Trustworthy AI ensures that our technologies conform to ethical standards and societal expectations, promoting fairness, transparency, and accountability.
Innovations and applications
Our research group has a long history in natural language processing (NLP), and we were quick to expand our work to include Large Language Models (LLMs) and Generative AI (GenAI). With the growth and popularisation of data-driven AI in the market, we observe a growing need for hybrid AI. For example, one limitation to broader adoption of LLMs is the lack of factual accuracy and robustness of the generated content; combining knowledge graphs with LLMs can mitigate this problem, resulting in more Trustworthy AI solutions.
We work in close collaboration with stakeholders to create relevant AI applications across various domains. Examples include:
- Energy: Optimisation of energy usage and timing via the Semantic Interoperability Framework, using the SAREF ontology for data sharing.
- Health: Personalized recommender systems for patients with type 2 diabetes.
- Agrifood: Chatbots for easy access to specialized agricultural knowledge.
- Government: AI tools for tax legislation accessibility and responsible oversight.
- National Security: Reasoning-based automated intelligence and decision-support systems for defence field operations.
Our technical expertise spans across numerous domains, ensuring solutions are both effective and trustworthy.
Projects and impact
Here you can read more about a selection of our projects.
This example shows how Data Science expertise in knowledge modelling and interoperability supports the transition towards a skills-based labour market, resulting in more job opportunities for citizens, better career mobility, reduced unemployment, and more personal fulfilment.
The labour market is tight and rapidly changing, and diplomas and CVs no longer suffice to identify the best candidate for a job. Employers and job seekers focus too much on education and experience, and not enough on skills. Better coordination between the labour market and vocational education is needed to ensure everyone finds the right job.
Understanding skills provides the best opportunities for jobseekers to adapt to the changing market, and for employers to find the perfect match for their positions.
In collaboration with UWV, SBB, CBS, and the Ministry of Social Affairs and Employment, we have applied our expertise in knowledge modelling, and our experience in connecting different stakeholders, to develop CompetentNL: a common language used to describe skills across professions and study programmes. CompetentNL is designed as a dynamic language, able to adapt to changes in the labour market, and it is kept aligned with existing European standards. Last, but definitely not least, specific attention has been dedicated to mitigating (gender) biases and providing fairer decision support.
CompetentNL aims to support the Netherlands in moving towards a skills-based labour market, resulting in more job opportunities for citizens, better career mobility, and reduced unemployment. The broad interest in the project, in both the public and the private sector, and a number of successful pilots demonstrate the positive impact this solution can have.
This example shows how semantic interoperability can improve citizens’ daily life and make smart energy management accessible in homes and buildings, resulting in higher energy efficiency, lower energy bills, increased awareness of energy usage, sustainable behavior, reduced power grid load, and easier connectivity of devices from different vendors.
Optimizing energy use is a complex problem whose solution requires effective communication and coordination between different systems, devices, and applications.
Semantic interoperability is the technology that enables this exchange of information between systems. Data Science has developed key AI-driven solutions: the Smart Applications REFerence ontology (SAREF), the first EU-standardized IoT ontology framework, and the Knowledge Engine (KE), an ontology-based open-source middleware for smart information integration across systems. Since 2014, SAREF and the KE have been used in multiple B2B projects in different domains (e.g. horticulture, agrifood, smart homes, energy, and defence), as well as in large pilots within the Horizon EU project InterConnect (2019-2024).
This example shows how data science supports public organisations in adopting AI solutions in a trustworthy and responsible way, resulting in better and fairer services for citizens, and improving society’s trust in governmental bodies.
While AI is increasingly important in our society, its development and integration into existing processes pose risks and raise doubts, especially within the public sector. The AI Oversight Lab aims to support organisations and public entities in guaranteeing that the adoption of AI solutions happens in a responsible way. This is made possible through an interdisciplinary approach that covers both technical and organisational aspects.
The AI Oversight Lab took its first steps in 2021, when the Data Science department collaborated with the municipality of Nissewaard to evaluate bias and robustness issues in an AI model used to assess fraud in social benefits; several issues were identified, which led to the withdrawal of the algorithm. This first experience led to many other collaborations, for example with the Immigration and Naturalisation Service (IND) and the Public Prosecution Service (OM). The Lab keeps growing and broadening its impact, publishing academic papers and facilitating the sharing of experiences and best practices across its community of committed stakeholders.
This example demonstrates how Data Science expertise in language technologies can lead to the creation of a sovereign and compliant Dutch Large Language Model.
The GPT-NL project aims to develop a large language model (LLM) tailored specifically to the Dutch language and culture, supportive of transparency and inclusivity, and in alignment with Dutch and European values. The project addresses challenges related to biases, ethical standards, and user data privacy. By working towards compliance with the General Data Protection Regulation (GDPR) and the EU AI Act, GPT-NL seeks to create a Dutch-language model that does not infringe on privacy or intellectual property rights. This approach aims to inspire other initiatives to innovate with generative AI while respecting legal boundaries and promoting sustainability.
The project is well known beyond TNO, SURF, and NFI. Team members have been invited as keynote speakers at Dutch conferences and as invited speakers and panellists at both Dutch and international (European) events, and have been interviewed by organizations such as NPO, iBestuur, and Computer Idee. Moreover, the team is continuously building an ecosystem, reaching out to and seeking to understand the concerns, needs, and hopes of potential data contributors, potential end users of the model, interested academics, supervisory authorities such as the Dutch privacy protection agencies, and policy makers.
Want to know more?
Are you keen to gain more insight into your data? Make AI systems more trustworthy or comprehensible? Or would you like to discuss with us how to organise your information sharing in a meaningful way? Then please feel free to contact us.