Semantic Analyser - Development of a software environment for algorithmic processing of Slovenian texts
DOI:
https://doi.org/10.31449/upinf.156
Keywords:
semantic analysis, data spaces, text mining, visual analytics, workflows
Abstract
Every day, civil servants and officials are confronted with a large number of voluminous documents that must be reviewed and applied according to the information requirements of a specific task. This is the case when making decisions, drafting and reviewing legislation and policies, assessing their impact, carrying out various analyses, describing data sources and services, and performing many other tasks. Since reviewing many documents and selecting the most relevant ones is time-consuming, we have developed an AI-based approach for the content-based review of large collections of texts. Semantic analysis of texts, combined with a comparison of content relatedness between individual texts in a collection, saves time and enables a comprehensive analysis of collections. In the paper, we present the results of a project to develop a general-purpose tool for analysing sets of textual documents. The project aims to select and implement semantic analysis building blocks that can be used to perform arbitrary types of document analyses, and to prototype analytical workflows that could support tasks and decision-making in public administration. The building blocks we have developed include components to access data repositories, embed documents in vector spaces, search for similar documents, visualize document maps, search for characteristic terms, rank documents according to their semantic similarity to selected terms, and arrange concepts into ontologies. In the paper, we present a use case that semantically links proposals submitted to the government with a collection of laws.
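To make the idea of embedding documents in a vector space and ranking them by similarity more concrete, the following is a minimal sketch and not the tool described in the paper. It assumes a simple TF-IDF bag-of-words representation with cosine similarity as a stand-in for whatever embedding the actual building blocks use. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_by_similarity`) and the sample texts are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (a dict term -> weight) per document."""
    term_counts = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(documents)
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())
    # Smoothed inverse document frequency; this weighting is an assumption.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in counts.items()} for counts in term_counts]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, collection):
    """Rank documents in `collection` by similarity to `query`, best first.

    Returns a list of (index_in_collection, score) pairs.
    """
    vectors = tfidf_vectors([query] + list(collection))
    query_vec, doc_vecs = vectors[0], vectors[1:]
    scores = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data: three law titles and one proposal.
laws = [
    "Act on waste management and recycling of packaging",
    "Act on personal income tax rates",
    "Act on road traffic safety and speed limits",
]
proposal = "Proposal to lower speed limits near schools to improve road safety"

ranking = rank_by_similarity(proposal, laws)
for index, score in ranking:
    print(f"{score:.3f}  {laws[index]}")
```

In this toy example the proposal shares vocabulary only with the road traffic act, so that law is ranked first and the other two receive a score of zero. A production workflow would more plausibly use learned multilingual or Slovenian-specific embeddings, which can also capture relatedness between texts that share no words.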