Browsing by Author "Saddad, Emad"
Lake Data Warehouse Architecture for Big Data Solutions
(SAI, 2020) Saddad, Emad; Mokhtar, Hoda M. O.; El-Bastawissy, Ali; Hazman, Maryam
A traditional Data Warehouse is a multidimensional repository. It holds nonvolatile, subject-oriented, integrated, time-variant, and non-operational data gathered from multiple heterogeneous data sources. We need to adapt the traditional Data Warehouse architecture to deal with the new challenges imposed by the abundance of data and the current big data characteristics, including volume, value, variety, validity, volatility, visualization, variability, and venue. The new architecture also needs to handle existing drawbacks, including availability, scalability, and consequently query performance. This paper introduces a novel Data Warehouse architecture, named Lake Data Warehouse Architecture, to give the traditional Data Warehouse the capabilities to overcome these challenges. Lake Data Warehouse Architecture merges the traditional Data Warehouse architecture with big data technologies, such as the Hadoop framework and Apache Spark, providing a hybrid solution in which the two complement each other. The main advantage of the proposed architecture is that it combines the existing features of traditional Data Warehouses with the big data features gained by integrating the traditional Data Warehouse with the Hadoop and Spark ecosystems. Furthermore, it is tailored to handle a tremendous volume of data while maintaining availability, reliability, and scalability.

Towards an alternative Data Warehouses Architecture
(Trends in Innovative Computing, 2014) Saddad, Emad; El-Bastawissy, Ali; Hegazy, Osman; Hazman, Maryam
Data warehouses (DWs) are centralized data repositories that integrate data from various transactional, legacy, or external systems, applications, and sources.
A DW provides an environment separate from the operational systems and is designed entirely for decision support, analytical reporting, ad hoc queries, and data mining. Recently, the structure and the volume of data stored on computer systems have been growing at an accelerated rate. In current DW architectures based on n-ary relational DBMSs, data volumes keep increasing, and high disk space consumption, slow query response times, and complex database administration are common problems. Furthermore, a number of factors make developing and maintaining a data warehouse system a painful process: setting up a data warehouse can take a long time, over-provisioning can lead to high costs, organizations may lack the expertise needed to set up and maintain a data warehouse, and system crashes, downtime, or system overload can have numerous consequences for an organization. Also, DWs depend on a static number of external data sources that may be incomplete, may not use the same definitions, and are not always available. The lack of a proper data model and of an adequate architecture specifically targeted at these environments is the root cause of all these problems. This paper therefore tries to explain why we need an alternative DW architecture that keeps the benefits of the existing traditional DW architecture while solving its problems: one that works with modern environments such as cloud computing, efficiently handles next-generation databases (NoSQL), which already address some of these points, and copes with the scalability demands of web applications (such as big data).