Semantic Analyser - Development of a software environment for algorithmic processing of Slovenian texts

Authors

  • Miha Jesenko, Ministry of Public Administration
  • Miro Lozej
  • Karmen Kern Pipan, Ministry of Public Administration
  • Primož Godec, University of Ljubljana, Faculty of Computer and Information Science
  • Vesna Tanko, University of Ljubljana, Faculty of Computer and Information Science
  • Lan Žagar, University of Ljubljana, Faculty of Computer and Information Science
  • Ajda Pretnar Žagar, University of Ljubljana, Faculty of Computer and Information Science
  • Nikola Đukić, University of Ljubljana, Faculty of Computer and Information Science
  • Blaž Zupan, University of Ljubljana, Faculty of Computer and Information Science

DOI:

https://doi.org/10.31449/upinf.156

Keywords:

semantic analysis, data spaces, text mining, visual analytics, workflows

Abstract

Civil servants and officials are confronted daily with large numbers of voluminous documents that must be reviewed and applied according to the information needs of a specific task. Such tasks include making decisions, drafting and reviewing legislation and policies, assessing their impact, carrying out analyses, and describing data sources and services. Because reviewing many documents and selecting the most relevant ones is time-consuming, we have developed an AI-based approach for the content-based review of large text collections. Semantic analysis of texts, combined with a comparison of content relatedness between individual texts in a collection, saves time and enables comprehensive analysis of collections. In this paper, we present the results of a project to develop a general-purpose tool for analysing sets of textual documents. The project aims to select and implement semantic analysis building blocks that can be combined to perform arbitrary types of document analysis, and to prototype analytical workflows that could support tasks and decision-making in public administration. The building blocks we have developed include components to access data repositories, embed documents in vector spaces, search for similar documents, visualize document maps, search for characteristic terms, rank documents by their semantic similarity to selected terms, and arrange concepts into ontologies. We also present a use case that semantically links proposals to the government with a collection of laws.
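To make the idea of embedding documents in a vector space and ranking them by similarity more concrete, the following is a minimal sketch and not the project's implementation. It assumes a simple TF-IDF bag-of-words representation as a stand-in for the semantic embeddings mentioned in the abstract, so it captures only lexical overlap. All function names and the example texts are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and keep only alphabetic tokens; a real pipeline for
    # Slovenian would also lemmatize and remove stop words (assumption).
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    # Represent each document as a sparse {term: weight} vector.
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: f * idf[t] for t, f in Counter(toks).items()} for toks in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, docs):
    # Return (score, document index) pairs, most similar first.
    vectors, idf = tfidf_vectors(docs)
    q_vec = {t: f * idf[t] for t, f in Counter(tokenize(query)).items() if t in idf}
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Hypothetical example: link one proposal to a small collection of laws.
laws = [
    "Act on personal data protection and privacy",
    "Act on road traffic safety and vehicles",
    "Act on public procurement procedures",
]
proposal = "Proposal to improve protection of personal data"
ranking = rank_by_similarity(proposal, laws)
```

Here `ranking` lists the laws ordered by their similarity to the proposal, so the first entry points to the law most closely related in wording. Replacing the TF-IDF vectors with embeddings from a language model would be the natural step towards the semantic similarity the abstract refers to.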

Published

2022-06-21

How to Cite

[1]
Jesenko, M., Lozej, M., Kern Pipan, K., Godec, P., Tanko, V., Žagar, L., Pretnar Žagar, A., Đukić, N. and Zupan, B. 2022. Semantic Analyser - Development of a software environment for algorithmic processing of Slovenian texts. Applied Informatics. 30, 2 (Jun. 2022). DOI:https://doi.org/10.31449/upinf.156.

Issue

Section

Professional papers