We’re constantly collecting more data, for example in the form of camera images and text documents. This data can contain valuable information, but it is not always stored in a structured manner, which makes that information difficult to retrieve. Natural Language Processing (NLP) is an AI technique that tackles this problem for textual data.
NLP combines techniques from statistics and machine learning. This makes it possible, for example, to extract keywords from a text and then use those keywords to classify it. TNO uses NLP to extract information from extensive, unstructured textual data in a more automated way.
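As an illustration of statistical keyword extraction, here is a minimal sketch that uses TF-IDF weighting, a common technique that may differ from the method TNO actually applies. The function names (`tokenize`, `top_keywords`) and the three example sentences are invented for this sketch.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(documents, doc_index, k=3):
    """Return the k highest-scoring TF-IDF terms for one document.

    Hypothetical helper: term frequency within the document is multiplied
    by the log of the inverse document frequency across all documents.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    counts = Counter(tokenized[doc_index])
    total = sum(counts.values())
    scores = {
        term: (count / total) * math.log(n_docs / doc_freq[term])
        for term, count in counts.items()
    }
    # Sort by descending score, then alphabetically so ties are stable.
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]


# Invented toy corpus, for illustration only.
documents = [
    "the radar sensor and the camera sensor record the road",
    "the report describes the policy and the road safety",
    "the camera images are stored with the report and the policy",
]
keywords = top_keywords(documents, 0, k=2)
print(keywords)
```

Words that occur in every document, such as "the", score zero, so the terms that remain are the ones that set a document apart from the rest of the collection. Those terms can then serve as features for a classifier.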
A shared vocabulary of domain terms (jargon), for example in the form of a taxonomy or ontology, helps to streamline and standardise processes. However, aligning the jargon used within a field is a time-consuming exercise. TNO uses NLP to identify important terms in a set of documents and determine how they relate to one another. We do this by combining syntactic information (sentence construction), keyword extraction, web sources and semantic embedding methods. The resulting taxonomy can then be used as input for an expert session.
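To give a feel for how relationships between terms can be estimated, the sketch below compares terms by the words they co-occur with. It uses simple co-occurrence counts as a stand-in for learned semantic embeddings, so it is an assumption-laden toy and not TNO's pipeline. The function names and sentences are invented.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences):
    """Map each term to counts of the other terms that share a sentence with it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        terms = set(sentence.lower().split())
        for term in terms:
            for other in terms - {term}:
                vectors[term][other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Invented toy sentences, for illustration only.
sentences = [
    "radar detects vehicles",
    "lidar detects vehicles",
    "policy guides funding",
]
vectors = cooccurrence_vectors(sentences)
related = cosine(vectors["radar"], vectors["lidar"])
unrelated = cosine(vectors["radar"], vectors["policy"])
print(related, unrelated)
```

Terms that appear in similar contexts ("radar" and "lidar") receive a high similarity and are candidates for the same branch of a taxonomy, while terms with no shared context score zero. Experts can then review and correct such candidate groupings.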
At TNO, we use our tools to automatically extract information from documents. We can also make predictions, for example in the foresight domain. Using the Horizon Scanner, we explore relevant websites, blogs and documents and extract information from them. This allows us to retrieve relevant information and to show trends. For example, trend analysis shows that the term "deep learning" is now mentioned much more frequently within the computer vision domain than it was ten years ago. In addition, we can classify the documents automatically, for example by topic or field. We can also use blogs to conduct sentiment analysis and find out whether terms are being described more positively or negatively.
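The trend analysis described above can be sketched as counting, per year, the share of documents that mention a term. This is a minimal sketch and not the Horizon Scanner itself. The function name `mention_share_by_year` and the document titles are invented, and the numbers it prints say nothing about real trends.

```python
from collections import defaultdict


def mention_share_by_year(documents, term):
    """Return, per year, the fraction of documents whose text mentions the term.

    `documents` is a list of (year, text) pairs; matching is a simple
    case-insensitive substring check.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for year, text in documents:
        totals[year] += 1
        if term.lower() in text.lower():
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Invented toy data, for illustration only.
documents = [
    (2012, "Edge detection for computer vision"),
    (2012, "Feature descriptors for image matching"),
    (2022, "Deep learning for object detection"),
    (2022, "Deep learning for image segmentation"),
]
share = mention_share_by_year(documents, "deep learning")
print(share)
```

Normalising by the number of documents per year matters here: the raw count of mentions tends to rise simply because more documents are collected, whereas the share shows whether a term is actually gaining ground.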