State-of-the-art analysis of technology stacks for the implementation of modern big data architectures
DOI:
https://doi.org/10.31449/upinf.194
Keywords:
IT architecture, Big Data, data warehouse, databases, wide-column databases, technology stack
Abstract
When implementing modern big data architectures, companies choose among different technology stacks, which are either open-source or offered by commercial vendors such as Google, Microsoft, Amazon and others. Several factors influence the choice of a stack, and the cost of using it and the available human resources are often the most important. Open-source technologies such as the Apache stack are a cost-efficient alternative, but they come with their own implications and compromises. Modern big data architectures also sometimes separate the storage and compute layers, using a different technology solution on each layer (or even within the same layer), with the ultimate goal of storing variously structured data and analysing it efficiently. In this paper, we present an example of a two-layered IT architecture optimised for big data storage and analysis, and we show the tools and solutions from three selected technology stacks (Google, Amazon, Apache) that can be used to implement such an architecture. We analyse the properties of the individual technology stacks and review the benefits and challenges of using each one.