Securing Distributed Data Storage In Cloud Computing Using Intelligent Algorithm

Problem Statement

One of the major challenges for big data in the cloud is data security, because data is transmitted to and stored on remote servers over the Internet. Distributed storage is among the significant technologies used in cloud computing, as it makes remote mass storage of data possible through offerings such as StaaS (storage as a service) (Esposito, Ficco, Palmieri & Castiglione, 2016). Distributed storage entails storing data on many nodes, usually in a replicated manner for redundancy (Stephens et al., 2015). Cloud service vendors such as Amazon, Google, and Microsoft offer massively scalable storage space to their clients. However, concerns over data security on the cloud side are a major obstacle to the adoption of StaaS (Rao & Selvamani, 2015; Kumar, 2016). Many cloud users are concerned about the security of their data because they lose physical access to and control over it (Ali, 2015); moreover, cloud service providers can access this data (Singh & Pasupuleti, 2016). As cloud computing use continues to grow unabated, the concept of MDS (mass distributed storage) is increasingly being explored as a means of scaling data storage (Gai, Qiu, & Zhao, 2016). This proposed research seeks to propose and test experimentally how distributed data storage in the cloud can be secured using an intelligent algorithm. The paper starts by stating the problem to be addressed before defining the purpose of the research. This is followed by an explanation of the significance of the study, after which the research questions and hypothesis are discussed. Next come a definition of the research variables and any assumptions and limitations of the project. Finally, the proposed methodology for undertaking the research is discussed, along with how the information and data will be analyzed.

Various approaches have been put forth to ensure the security of big data in the cloud. They include identity and authentication management (a basic way of managing cloud security, such as requiring a password before data in the cloud can be accessed), data encryption, information privacy and integrity, information availability, and secure management of information (Kalra & Sood, 2015). One area where cloud security needs to be improved is distributed data storage, which faces threats from a variety of sources because it exposes a larger attack surface. Distributed storage can invite attacks and malicious activity, including attacks while data is being transmitted. Attacks and unexpected operations can also occur on the cloud server side, which is further constrained by regulations and laws, especially for geographically distributed storage servers. Abuse by cloud operators is a major problem; service providers can also present challenges when clients want to transfer data from their cloud servers to other providers (Gai, Qiu, Thuraisingham & Tao, 2015). As such, it is important to enhance the security of big data in the cloud by providing an intelligent mechanism for securing distributed cloud data.

Purpose Statement

The purpose of this research is to extend the security protocols and methods currently in use so that they incorporate distributed data storage security. The proposed research seeks to improve security at the cloud server end in order to address the problem of cloud service providers abusing client data, especially given that distributed storage entails keeping copies of data in different geographic and political jurisdictions where laws can differ. This proposed research will put forth a novel approach to data security in distributed storage, using intelligent encryption that secures data both at rest and during transmission.

As cloud computing is increasingly used and organizations collect ever-increasing volumes of data, the need for larger storage capacities that are secure and redundant also increases. While organizations do their best to ensure their data is secure in the cloud, the realities of shared storage infrastructure and distributed storage, as well as the loss of physical access to and control over data, raise further concerns. This research will provide another option to consider for enhancing the security of data on distributed cloud storage platforms.

  • How can intelligent encryption algorithms enhance the security of data in distributed cloud storage platforms?
  • Will an approach to accessing user data that is based on searchable named data packets help improve big data security in the cloud?

The research hypothesizes that implementing an intelligent encryption algorithm based on named, searchable data packets is a novel method that will improve cloud data security for distributed storage architectures and will ensure that cloud service providers cannot abuse the client data they store.

The dependent variable in this proposed research is:

  • Increased security for cloud data stored in a distributed storage architecture

The independent variable is the implementation of a novel intelligent encryption scheme with named, searchable data packets.

The proposed research is based on the assumption that distributed cloud data storage is unique and requires additional security protocols. It also assumes that clients will have little or no control over the data they pass on to cloud service providers. The proposed research further assumes that cloud service providers that employ distributed storage do so in geographically and legally dispersed jurisdictions, and that the data stored in each distributed location is a replica of that in another storage node. The experimentation and the proposed algorithms to enhance distributed cloud data security are assumed to be independent of the platform used for storage and to work in any distributed cloud architecture. Further, it is assumed that the proposed intelligent encryption will work independently of the type of data in storage and will work with incremental data storage ‘on the fly’. The findings of the research will be subject to the inherent limitations and disadvantages of the research tools and the experimental research methodology adopted. Assumptions and research limitations are important because they affect the inferences that can be drawn from research (Sekaran & Bougie, 2016).

Significance of the Study

This research seeks to develop, using an experimental approach, a novel way of securing cloud data in a distributed cloud storage architecture. The proposed research seeks to shift from purely theoretical approaches to data security and instead employ a hybrid approach in which an algorithm is proposed and then tested experimentally as a ‘proof of concept’ to demonstrate that it can actually enhance security for data stored in a distributed cloud architecture. The section begins by defining the setting in which the research and experimental testing will be undertaken. The instruments to be used for the proposed method are also discussed, along with the procedures for conducting the research and evaluating performance data. The section also outlines the overall research design and how the data will be analyzed, before discussing the expected outcomes of the proposed data security approach.

The research will be undertaken by first developing a suitable data encryption algorithm to secure all forms of data stored in a distributed cloud storage architecture. Two cloud servers will be used and normal data assigned to them, with sensitive data assigned to each of the cloud servers. An algorithm will then be developed and applied to the servers.

The research will use an alternative data distribution algorithm, which is an extension and adaptation of the work done by Palomar and Chiang on alternative distributed algorithms for maximizing network utility. After this algorithm is applied to the data, the data will be split using the secure efficient data distribution algorithm. The splitting is meant to ensure that sensitive information does not leak on the cloud server side; the aim is also to achieve this at minimal cost. After encryption and splitting, the data needs to be retrieved safely and efficiently using an efficient data conflation algorithm. This approach is highly adaptable for organizations that seek to employ the StaaS model to store their data in the cloud while ensuring a high level of security for the stored data. The method is intended to enhance data security in the cloud by preventing cloud service providers from accessing, and therefore abusing, the data while it is on their storage servers. The proposed cryptography approach is shown in the accompanying diagram.
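The split-and-conflate idea described above can be illustrated with a minimal Python sketch. It is not the proposed algorithm itself: it assumes a simple XOR-based two-way split, and the function names `split` and `conflate` are hypothetical. Each share on its own is indistinguishable from random bytes, so a single provider holding one share learns nothing about the original record.

```python
import secrets
from typing import Tuple


def split(data: bytes) -> Tuple[bytes, bytes]:
    """Split data into two shares, one per cloud server (illustrative only)."""
    # Share A is a fresh random pad of the same length as the data.
    share_a = secrets.token_bytes(len(data))
    # Share B is the data XOR-masked with that pad.
    share_b = bytes(d ^ p for d, p in zip(data, share_a))
    return share_a, share_b


def conflate(share_a: bytes, share_b: bytes) -> bytes:
    """Recombine the two shares retrieved from the two servers."""
    if len(share_a) != len(share_b):
        raise ValueError("shares must have equal length")
    return bytes(a ^ b for a, b in zip(share_a, share_b))


# Hypothetical sensitive record used only for demonstration.
record = b"account=1234;balance=5000"
share_a, share_b = split(record)
restored = conflate(share_a, share_b)
```

In this sketch the storage cost doubles for sensitive data, which is one reason a scheme of this kind would normally be applied only to the units classified as sensitive.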

The questions to be administered will be carefully developed, taking the research questions and the research objectives into account. Suitable response options will then be developed for them, and the questions will be pretested on a small portion of the sample population and refined before being administered. The purpose of the research, the participants' obligations, and the confidential nature of the research will be explained to all prospective participants before they are formally requested to participate in the research.

Research Questions

Input data is partitioned into functional units before storage, such that the data is split into two parts that are encrypted for distributed storage in the cloud. The service provider and its employees therefore cannot access this data, while the requisite performance standards are maintained.

The security of the cloud data is to be achieved using a security-aware and efficient distributed data storage approach. The image below shows how the encryption and decryption process will be undertaken.

This proposed research will employ an experimental research design, a type of study in which one variable is manipulated and the other variables are randomized (Nassaji, 2015). The research will commence by developing an alternative data distribution algorithm that is compatible with big data frameworks such as Hadoop MapReduce. The data is split and grouped using name labels (named data packets and a pre-stored name list) and can be searched using fuzzy logic approaches. A secure efficient data distribution algorithm will also be developed and applied to the distributed data storage. An efficient data conflation algorithm will likewise be developed and used for securing the distributed stored data. Experimentation will then be carried out using the developed algorithms.
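As a rough illustration of looking up named data packets against a pre-stored name list, the sketch below uses approximate string matching from Python's standard `difflib` module. This is a simple stand-in for the fuzzy-logic search mentioned above, not an implementation of it; the packet names, the placeholder payloads, and the helper `fuzzy_find` are all hypothetical.

```python
import difflib
from typing import Dict, List


def fuzzy_find(query: str, name_list: List[str], cutoff: float = 0.6) -> List[str]:
    """Return up to three stored packet names that approximately match the query."""
    return difflib.get_close_matches(query, name_list, n=3, cutoff=cutoff)


# Hypothetical named packets; the values stand in for encrypted payloads.
packets: Dict[str, bytes] = {
    "payroll-2016": b"<encrypted share>",
    "customer-records": b"<encrypted share>",
    "server-logs": b"<encrypted share>",
}

# A misspelled query still resolves to the intended packet name.
matches = fuzzy_find("payrol-2016", list(packets))
```

Searching on names rather than on content means the lookup never needs to decrypt the stored payloads.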

The experiment will be set up to first evaluate the differences in execution time between the proposed encryption method and the widely used and accepted AES (Advanced Encryption Standard). Further evaluation will determine whether the encryption is affected by data size. A cloud setup will be simulated on hardware consisting of an 8-core server with 8 GB of RAM and a MongoDB database, with VMware used to virtualize a server-side workstation.
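A timing harness for this kind of comparison might look like the following sketch. It assumes that each method under test can be wrapped as a Python callable taking a byte string; `time_operation` and `placeholder_cipher` are hypothetical names, and the placeholder is not a real cipher. Python's standard library does not include AES, so an AES baseline would have to come from a third-party library and be passed to the same harness.

```python
import os
import statistics
import time
from typing import Callable, Dict, Iterable


def time_operation(operation: Callable[[bytes], object],
                   sizes: Iterable[int],
                   repeats: int = 5) -> Dict[int, float]:
    """Return the median wall-clock time in seconds for each input size."""
    results: Dict[int, float] = {}
    for size in sizes:
        payload = os.urandom(size)  # random test data of the requested size
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            operation(payload)
            samples.append(time.perf_counter() - start)
        results[size] = statistics.median(samples)
    return results


def placeholder_cipher(data: bytes) -> bytes:
    # Hypothetical stand-in for the method under test; NOT a secure cipher.
    return bytes(b ^ 0x5A for b in data)


timings = time_operation(placeholder_cipher, sizes=[1_024, 65_536, 1_048_576])
for size, seconds in timings.items():
    print(f"{size:>9} bytes: {seconds:.6f} s")
```

Taking the median over several repeats reduces the influence of one-off delays such as scheduler noise on a virtualized host.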

Different experimental setups will be used and the performance of each measured. The results will then be analyzed graphically, and the execution times calculated.
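One calculation that could be made on the measured execution times is the ratio of the proposed method's time to the baseline's time at each data size. The sketch below assumes the timings are held as dictionaries that map data size in bytes to seconds; the function name `relative_overhead` is hypothetical.

```python
from typing import Dict


def relative_overhead(proposed: Dict[int, float],
                      baseline: Dict[int, float]) -> Dict[int, float]:
    """Ratio of proposed to baseline execution time for each shared data size.

    A ratio above 1.0 means the proposed method was slower at that size.
    Sizes missing from either input, or with a zero baseline, are skipped.
    """
    shared_sizes = sorted(proposed.keys() & baseline.keys())
    return {
        size: proposed[size] / baseline[size]
        for size in shared_sizes
        if baseline[size] > 0
    }
```

A ratio that stays roughly constant as the data size grows would suggest that the proposed method scales with data size in the same way as the baseline.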

The proposed model is expected to enhance the security of sensitive data in a distributed cloud storage model in a cost-effective and efficient manner, such that encryption and decryption performance is not adversely affected at any data size. Execution times will be compared with those of the popular AES encryption standard.

References

Ali, H. (2015). Cloud Computing Security: An Investigation into the Security Issues and Challenges Associated with Cloud Computing, for both Data Storage and Virtual Applications. International Research Journal Of Electronics And Computer Engineering, 1(2), 15. https://dx.doi.org/10.24178/irjece.2015.1.2.15

Esposito, C., Ficco, M., Palmieri, F., & Castiglione, A. (2016). Smart Cloud Storage Service Selection Based on Fuzzy Logic, Theory of Evidence and Game Theory. IEEE Transactions On Computers, 65(8), 2348-2362. https://dx.doi.org/10.1109/tc.2015.2389952

Gai, K., Qiu, M., & Zhao, H. (2016). Security-Aware Efficient Mass Distributed Storage Approach for Cloud Systems in Big Data. 2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC) and IEEE International Conference on Intelligent Data and Security (IDS), 140-145.

Gai, K., Qiu, M., Thuraisingham, B., & Tao, L. (2015). Proactive attribute-based secure data schema for mobile cloud in financial industry. IEEE, 6(4). Retrieved from https://www.researchgate.net/profile/Keke_Gai/publication/283121673_Proactive_Attribute-based_Secure_Data_Schema_for_Mobile_Cloud_in_Financial_Industry/links/566c75ce08aea0892c4fcb63.pdf

Kalra, S., & Sood, S. (2015). Secure authentication scheme for IoT and cloud servers. Pervasive And Mobile Computing, 24, 210-223. https://dx.doi.org/10.1016/j.pmcj.2015.08.001

Kumar, R. (2016). Cloud computing and security issue. International Journal Of Engineering And Computer Science, 2(1). https://dx.doi.org/10.18535/ijecs/v5i11.18

Nassaji, H. (2015). Qualitative and descriptive research: Data type versus data analysis. Language Teaching Research, 19(2), 129-132. https://dx.doi.org/10.1177/1362168815572747

Sekaran, U., & Bougie, R. J. (2016). Research Methods For Business: A Skill Building Approach (7th ed.). John Wiley & Sons.

Singh, A., & Pasupuleti, S. (2016). Optimized Public Auditing and Data Dynamics for Data Storage Security in Cloud Computing. Procedia Computer Science, 93, 751-759. https://dx.doi.org/10.1016/j.procs.2016.07.286

Stephens, Z., Lee, S., Faghri, F., Campbell, R., Zhai, C., Efron, M., et al. (2015). Big Data: Astronomical or Genomical? PLOS Biology, 13(7), e1002195. https://dx.doi.org/10.1371/journal.pbio.1002195