Please use this identifier to cite or link to this item: https://scholarhub.balamand.edu.lb/handle/uob/6530
Title: Automatic SQL to HQL-NoSQL Querying using PostgreSQL and Integrated Hive-HBase
Authors: Saada, Ouday; Daba, Jihad S.
Affiliations: Department of Electrical Engineering 
Keywords: Automatic Query Language; Big data; HBase; HDFS; Hive; PostgreSQL; Relational Database; Sqoop
Issue Date: 2023-01-26
Part of: WSEAS Transactions on Information Science and Applications
Volume: 20
Start page: 16
End page: 27
Abstract: 
The amount of digital data is constantly growing in almost all fields. This data falls into two categories: structured and unstructured. Non-relational databases, known as NoSQL, have become one of the main areas of big data. Many companies still use relational databases such as PostgreSQL and MySQL, but with the rapid evolution and diversity of stored data, they find themselves obliged to adopt big data tools such as HBase or Hive. Big data is characterized by its capacity, speed, and ability to store diverse types of data. Data analysis and high storage capacity are the main reasons companies look for new database systems. Migrating data to a new system requires modifying existing data and applications, and the process is costly because new specialists must be brought in to handle the transition. Furthermore, because old systems draw on different data sources, e.g., real-time applications that continuously collect new data, companies cannot abandon relational databases. For this reason, we present a system, termed Automatic Query Language (AQL), for migrating data from PostgreSQL to integrated HBase/Hive databases. In addition, we provide a platform that allows any user to query PostgreSQL, Hive, and HBase databases automatically using SQL only. The platform directs each query to the tool that performs better for it. Once the platform was completed, we were able to insert and select data from both the relational database and the big data components. Join operations were not a problem because complex analytical queries were executed using Hive, which was integrated with HBase. Tests of the AQL system showed that HBase inserts data more efficiently than PostgreSQL and Hive, and that select queries perform better in Hive than in PostgreSQL for large data sizes, whereas PostgreSQL performs better for small data sizes.
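The routing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' AQL implementation: the names `Backend`, `choose_backend`, `estimated_rows`, and the `LARGE_TABLE_ROWS` cutoff are hypothetical, and the abstract gives no numeric threshold for "big" versus "small" data. The sketch only encodes the reported findings (inserts favor HBase, joins and large selects favor Hive, small selects favor PostgreSQL).

```python
from enum import Enum


class Backend(Enum):
    POSTGRESQL = "postgresql"
    HIVE = "hive"
    HBASE = "hbase"


# Hypothetical cutoff between "small" and "big" data; the abstract does not
# state a number, so this value is an assumption for illustration only.
LARGE_TABLE_ROWS = 1_000_000


def choose_backend(sql: str, estimated_rows: int) -> Backend:
    """Pick a backend for one SQL statement, following the abstract's findings."""
    tokens = sql.strip().lower().split()
    if not tokens:
        raise ValueError("empty SQL statement")

    verb = tokens[0]
    if verb == "insert":
        # Reported result: HBase inserts more efficiently than PostgreSQL and Hive.
        return Backend.HBASE
    if verb == "select":
        if "join" in tokens:
            # Complex analytical queries (joins) were executed through Hive.
            return Backend.HIVE
        if estimated_rows >= LARGE_TABLE_ROWS:
            # Reported result: Hive selects are faster for large data sizes.
            return Backend.HIVE
        # Reported result: PostgreSQL selects are faster for small data sizes.
        return Backend.POSTGRESQL
    # Assumption: any other statement type stays on the relational database.
    return Backend.POSTGRESQL
```

For example, `choose_backend("SELECT * FROM sales", 5_000_000)` would select Hive under the assumed cutoff, while the same query against a table of a few thousand rows would stay on PostgreSQL.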
URI: https://scholarhub.balamand.edu.lb/handle/uob/6530
DOI: 10.37394/23209.2023.20.3
Type: Journal Article
Appears in Collections: Department of Electrical Engineering

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.