Parallel Data Sorting and Deduplication in Distributed File Systems


TABLE OF CONTENTS

i. Title Page
ii. Declaration
iii. Certificate
iv. Acknowledgement
v. Table of Contents
vi. Abstract
1. Introduction
2. Literature Survey
3. Project Design
   3.1. Description of Modules
   3.2. Software and Hardware Specifications
   3.3. Model Diagrams and Design Specifications
   3.4. Individual Contributions
   3.5. Project Plans and Milestones
4. Achievements
   4.1. Analysis Completed
   4.2. Results
   4.3. Future Work
   4.4. Scope for Publishing
5. References
6. Appendix A
   6.1. Data Pre-Processing
   6.2. Parallel Deduplication, Sorting, Client File

ABSTRACT

This project, titled "Parallel Data Sorting and Deduplication", aims to build a distributed file system (DFS) in Python, distributed in nature much like the Hadoop Distributed File System (HDFS). In today's age of cloud computing and big data, where the speed of computing is as critical as the reliability of storage and data is huge in size, DFSs are the storage solution that will provide a long-lasting impact into the future. As mentioned, there are two critical factors, namely speed of computing and reliability of storage; these are the two cornerstones on which our distributed file system is built. For the first factor, speed, we have designed a parallel deduplication algorithm and a parallel sort algorithm. The latter is based on the sequential merge sort algorithm. It is a novel algorithm that makes use of the nature of the DFS designed, i.e. the master-slave architecture of HDFS. These operations have been chosen because this project serves as a basic demonstration of parallel architecture in DFSs, and sorting and deduplication are basic operations used in almost every advanced data storage and manipulation task. The Python language has been used for creating the DFS and the Map-Reduce paradigm. RPC system calls have been used for the DFS, implemented with the RPyC library in Python. Finally, the DFS has been shown to demonstrate speed, reliability of storage, and parallel data computation and processing.

1.  INTRODUCTION

In recent times, there has been a surge in high performance computing, especially distributed computing and parallel operations. Apart from these, there has also been a major rise in network-based computing. All of these types of computing have several demands from their architectures, namely speed (of computing and data storage/manipulation), reliability (of storage) and scaling capabilities (of the architecture used). We have attempted to create a distributed file system that fulfils all these criteria.


As an outcome of the progress of web and data technology, enormous amounts of data are produced in our everyday lives; large volumes of data, amounting to petabytes, are recorded each day. Beyond its sheer size, big data has other defining characteristics, such as variety and velocity. Accordingly, big data analysis using machine learning and data mining methods has become an important research problem.

Mining enormous information is difficult to oversee, especially when utilizing the present procedures and information mining programming instruments, because of their expansive size and unpredictability. At the end of the day, utilizing a PC to execute the information mining errand over vast scale datasets requires high computational expenses. It is important to utilize all the more ground-breaking registering situations to effectively process and investigate enormous information. The answers for the issue of mining extensive scale datasets can be founded on the parallel and distributed computing stages. On a fundamental level, parallel processing centers around partitioning the picked (huge) issue into smaller ones, every one of which is completed by one single processor exclusively, with the goal that a calculation made out of various computations is performed simultaneously in a distributed and parallel way.

In our distributed file system, we have used the master-slave architecture seen in the Hadoop Distributed File System together with the Map-Reduce paradigm. Following the standard nomenclature, the master node is called the NameNode, the slave/minion nodes are called DataNodes, and the data is replicated by a factor replication_factor specified in the config file. The data is first deduplicated in parallel, then sorted using the parallel mergesort algorithm. This data is then replicated as mentioned above and stored in the DataNodes. Communication between the NameNode (master), the DataNodes (minions) and the client node takes place using RPCs (Remote Procedure Calls), implemented with the RPyC library in Python.

While reading data, minion nodes are contacted in serial order. If the data for the corresponding file is present on a minion, it is streamed over RPC from the minion to the client and displayed on the screen. The data is replicated as a fail-safe against data corruption and power failures.

The project has immense scope for future work. Distributed file systems are the future of data storage and manipulation. Today, data is stored in data centres all over the globe, and this data must be constantly synced across all the centres. To make this possible, DFSs provide the ultimate solution, both in terms of speed and reliability. Additional parallel computations (deduplication and sorting) only increase the speed of operations. This work can therefore be carried forward and modified for larger data, for additional file system features such as deleting DFS files, and many more possibilities.

2.  LITERATURE SURVEY

Large-scale Distributed L-BFGS

M. M. Najafabadi et al. (2017) suggest a solution to the problem of examining large amounts of data, or extracting patterns from them, for training machine learning algorithms [1]. Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) is an optimization method used for parameter estimation. Since the resources of a single computer can be insufficient for running this algorithm, the paper presents a parallel implementation of L-BFGS on a distributed system, using HPCC as the distributed platform.

Scalable Distributed Implementation of a Biologically Inspired Parallel Model

G. Ciobanu (2015) introduces a distributed computing middleware [2]. The middleware is inspired by biological models and helps in solving various synchronization issues. The MapReduce algorithm is used to develop a parallel and scalable implementation which permits the division of a single task into several subtasks. These subtasks are executed in parallel and their results are aggregated into a final result. This model can provide solutions to NP-complete problems.

Efficient Parallel Spectral Clustering Algorithm Design for Large Data Sets under Cloud Computing Environment

R. Jin et al. (2013) improve the clustering speed of spectral clustering using MapReduce [3], by evaluating sparse matrix eigenvalues and finding distributed clusters. The paper concludes that as the amount of data processed increases, the clustering rate increases roughly linearly; the parallel spectral clustering algorithm thus scales well to large datasets in a cloud environment.

iiHadoop: An Asynchronous Distributed Framework for Incremental Iterative Computations

A. G. B. Saadon et al. (2017) introduce iiHadoop as an extension to the existing Hadoop framework [4]. Although MapReduce was introduced to handle computations over massive amounts of data, it does not provide a good solution for handling small amounts of incremental data. The proposed iiHadoop speeds up programs by performing the incremental computations only on the small fraction of affected data, and further improves performance by executing iterations asynchronously.

Trustworthy Group Making Algorithm in Distributed Systems

A. Aikebaier et al. (2011) introduce a distributed, scalable peer-to-peer system [5]. The security between group members (peers) is the primary concern: each peer has to be trustworthy, since the behaviour of one peer can affect the whole system. The trustworthiness of each peer is a ground variable for the distributed environment. The paper introduces a new approach for incrementally building a safe group in distributed system protocols.

Meta-MapReduce for Scalable Data Mining

X. Liu et al. (2015) tackle the time taken by the MapReduce algorithm to solve machine learning problems involving iterations [6]. The existing MapReduce framework suffers from a significant weakness in that it cannot support iterations well. A new algorithm, Meta-MapReduce, is introduced, which reduces the computational complexity of processing the training data as the number of nodes increases, and also obtains smaller error rates.

Lessons Learned from CPES Co-Simulation with Distributed, Heterogeneous Systems

C. Steinbrink et al. (2018) present a case study based on co-simulation with distributed, heterogeneous simulators [7]. Today there is increased integration of renewable energy sources into the conventional power grid, and very often this transforms the grid into a cyber-physical energy system. Although this provides options for stable, optimized control, it also introduces vulnerabilities through ignorance of certain setup characteristics. Through this paper, the authors present the system MOSAIK, which aims to bridge the gap between the requirements for special interfacing and high usability of such systems.

A Feasible MapReduce Peer-to-Peer Framework for Distributed Computing Applications

H. M. Tran et al. (2015) introduce a MapReduce peer-to-peer framework that brings MapReduce implementations to P2P networks [8]. This is useful for users who cannot afford dedicated clusters for occasional demands. Another advantage of the framework is that it allows Internet users to make use of large data on distributed systems. It also includes features to improve fault tolerance and to manage peer failures.

Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing

B. Cordes et al. (2009) present an implementation of backprojection for use in synthetic aperture radar (SAR), performed on a high-performance reconfigurable computing (HPRC) system [9]. Backprojection is an image synthesis algorithm that can be used as part of SAR. On HPRC systems, a novel approach combining general-purpose processors and FPGAs permits designers to exploit both fine-grained and coarse-grained parallelism, thereby reaching very high degrees of computational speedup.

3.  PROJECT DESIGN

3.1 Description of Modules

There are 5 major modules in this project:

Data Pre-processing: We have taken data from two sources: the Brown corpus present in NLTK, and the Wikipedia 100k dataset, which contains the 100,000 most used words. The second dataset contained repeated words as well as non-English words. While repetition was not a major concern, non-English words were, so they had to be removed. We finally obtained a large dataset containing 306,694 (repeating) words. All these operations were performed in a Jupyter notebook (see Appendix 6.1).

Master Node: This is the module pertaining to the master node, or more appropriately, the NameNode. This module handles everything pertaining to communication among all the other nodes over Remote Procedure Calls through the RPyC module. The master node is responsible for keeping communication channels open, for links between the client and the minion, and for maintaining the integrity of the data nodes.

Minion Node: This is the module that creates the minion nodes which store data according to the Map-Reduce paradigm. This module is responsible for ensuring the authenticity and integrity of data, and for making sure that data is appropriately replicated and split between the nodes.

Client: The client module is responsible for the parallel tasks, i.e. deduplicating the data in parallel and then sorting it using parallel mergesort. Functions in the client file call objects from the master and minion nodes to get information about the distributed file system and to carry out these tasks.

Configuration File: The config file contains basic information pertaining to the DFS, i.e. the block size and the replication factor. This information is used by the master and minion modules to create the DFS and forms the basic building blocks of the file system.
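The report does not reproduce the config file itself; a minimal sketch of what it might contain, assuming a standard configparser-style layout (the file name and the values are illustrative, not the project's actual settings), is:

# dfs.conf (hypothetical name and values)
[DEFAULT]
block_size = 1000
replication_factor = 2

# reading it from Python, e.g. in the master module:
import configparser

config = configparser.ConfigParser()
config.read("dfs.conf")
block_size = int(config["DEFAULT"]["block_size"])
replication_factor = int(config["DEFAULT"]["replication_factor"])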

3.2 Software and Hardware Specifications

The major software libraries used in this project are:

RPyC: This Python library offers convenient ways to use Remote Procedure Calls and to create network-based communication. RPyC is the means through which the master, minion and client communicate with each other and transfer data over the network (localhost in this case). Since we are using RPCs, the project can also be extended to run over physically distributed systems, so that even if the specifications of the different systems are not known, communication can still happen over remote procedure calls.
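As an illustration of how RPyC is used, the sketch below shows a minimal minion-style service and the matching client calls. The class names, port number and in-memory block store are illustrative assumptions, not the project's exact code; they only demonstrate the exposed-attribute pattern that the client listing in Appendix A relies on (con.root.Minion(), minion.put(), minion.get()).

import rpyc
from rpyc.utils.server import ThreadedServer

class MinionService(rpyc.Service):
    """Each data node runs one of these; 'exposed_' members are callable remotely."""

    class exposed_Minion(object):
        # class-level dict so blocks persist across separate client connections
        blocks = {}

        def exposed_put(self, block_uuid, data):
            # store a block of data under its UUID
            self.blocks[block_uuid] = data

        def exposed_get(self, block_uuid):
            # return the block, or None if this minion does not hold it
            return self.blocks.get(block_uuid)

if __name__ == "__main__":
    # a client would then do:
    #   con = rpyc.connect("localhost", port=8888)
    #   minion = con.root.Minion()
    #   minion.put("some-uuid", "hello")
    #   print(minion.get("some-uuid"))
    ThreadedServer(MinionService, port=8888).start()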

Multiprocessing: The multiprocessing library in Python is analogous to the OpenMP library in C. It facilitates spawning worker processes, creating process pools for parallel computing and much more. The number of processes can be set, and mutually exclusive or critical sections of code can be specified. This library has been used for parallel data deduplication and parallel data sorting.
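A sketch of how data deduplication can be parallelized with a multiprocessing Pool is shown below; the function and variable names are ours and the chunk-and-union approach is one possible strategy, not necessarily the project's exact routine.

import math
import multiprocessing

def dedup_chunk(chunk):
    # each worker deduplicates its own slice of the data
    return set(chunk)

def parallel_dedup(data):
    processes = multiprocessing.cpu_count()
    size = int(math.ceil(float(len(data)) / processes))
    chunks = [data[i * size:(i + 1) * size] for i in range(processes)]
    with multiprocessing.Pool(processes=processes) as pool:
        partial_sets = pool.map(dedup_chunk, chunks)
    # union the per-chunk sets to remove duplicates that span chunks
    return list(set().union(*partial_sets))

if __name__ == "__main__":
    print(sorted(parallel_dedup(["b", "a", "b", "c", "a"])))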

NLTK: The Natural Language Toolkit library (NLTK) has been used for the words from the Brown corpus to generate the dataset for English words.

The project has been developed on a Dell Inspiron 7560 notebook with 8GB of RAM and an Intel Core i5-7200U dual-core (four-thread) processor with a base clock of 2.5GHz and Turbo Boost. Apart from the standard Intel HD Graphics 620, there is also an NVIDIA GeForce 940MX GPU with 4GB of dedicated GDDR5 graphics memory. The primary storage is a 1TB SATA3 NTFS hard drive, with the operating system located on a Samsung 850 EVO M.2 solid state drive for faster access to applications and programs.

3.3 Model Diagrams and Design Specifications

Fig. 1: Model of the distributed file system

Fig. 2: Flowchart of tasks in the DFS

3.4 Individual Contributions

The individual contributions of the group members are listed below:

Yash Gupta (15BCE2073):

Literature Survey

Implement parallel sorting

Implement master node

Implement minion nodes

Perform integration testing

Sukriti Jain (15BCB0065):

Requirements analysis

Implement MapReduce

Perform unit tests

Implement client interaction

Gahana Agarwal (15BCE0552):

Find relevant research

Implement communication network

Parallel data deduplication

Final documentation

3.5 Project Plan and Milestones

Fig. 3: Gantt chart for project milestones

4.  ACHIEVEMENTS

4.1 Analysis Completed

We have successfully finished the implementation of the project ahead of schedule. We have also finished unit and integration tests on the project, and have analysed a comparison of the parallel operations with the corresponding sequential operations. Although the parallel operations take slightly more time than the sequential operations on our test data, we have observed that this gap shrinks as the size of the dataset increases. We therefore conclude that parallel operations are better suited when the dataset is huge, in the order of hundreds of millions to billions of values; for smaller datasets, the overhead of creating extra processes outweighs the benefit of sorting or deduplicating the data in parallel.
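The exact figures depend on the machine and dataset, so none are reproduced here. A sketch of the kind of timing harness that could be used for such a comparison is shown below; it assumes the Appendix A listing is saved as client.py so that parallel_sort can be imported, and the word generator and dataset sizes are purely illustrative.

import random
import string
import time

from client import parallel_sort  # assumed module name for the Appendix A listing

def random_word(length=8):
    # generate a random lowercase word of the given length
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))

if __name__ == "__main__":
    for n in (10_000, 100_000, 1_000_000):
        data = [random_word() for _ in range(n)]

        start = time.perf_counter()
        sorted(data)                       # sequential baseline
        sequential = time.perf_counter() - start

        start = time.perf_counter()
        parallel_sort(data)                # parallel mergesort from Appendix A
        parallel = time.perf_counter() - start

        print(f"n={n}: sequential={sequential:.3f}s parallel={parallel:.3f}s")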

4.2 Results

Fig. 4: Jupyter notebook for data pre-processing

Fig. 5: Reading data from source and storing in nodes

Fig. 6: Client reading data from the source

Fig. 7: Accessing the data stored in nodes

Fig. 8: Data after being sorted parallelly

Fig. 9: Storing the sorted data back into the nodes

Fig. 10: Reading already sorted data

4.3 Future Work

As mentioned earlier, distributed file systems represent the future of data storage and manipulation. Data centres across the world must now access data and be in synchronization at all times. That is where DFSs come into the picture, and with parallel operations, this can be made even faster. Thus, this project has a lot of future scope and applications. Moreover, more functions can be implemented such as deleting nodes, deletion of DFS variables that store data etc.

4.4 Scope for Publishing

This project has a strong scope for being published because the work done here is genuine, original and cutting-edge. DFSs are only emerging, and not a lot of research and development has gone into the area of distributed systems with parallel operations. Finally, after more extensive testing on larger datasets on more efficient systems, this work could be sent to journals to be published.

5.  REFERENCES

Najafabadi, M. M., Khoshgoftaar, T. M., Villanustre, F., & Holt, J. (2017). Large-scale distributed L-BFGS. Journal of Big Data, 4(1), 22.

Ciobanu, G. (2015). Scalable distributed implementation of a biologically inspired parallel model. Complex & Intelligent Systems, 1(1-4), 69-80.

Jin, R., Kou, C., Liu, R., & Li, Y. (2013). Efficient parallel spectral clustering algorithm design for large data sets under cloud computing environment. Journal of Cloud Computing: Advances, Systems and Applications, 2(1), 18.

Saadon, A. G. B., & Mokhtar, H. M. (2017). iiHadoop: an asynchronous distributed framework for incremental iterative computations. Journal of Big Data, 4(1), 24.

Aikebaier, A., Enokido, T., & Takizawa, M. (2011). Trustworthy group making algorithm in distributed systems. Human-centric computing and information sciences, 1(1), 6.

Liu, X., Wang, X., Matwin, S., & Japkowicz, N. (2015). Meta-MapReduce for scalable data mining. Journal of Big Data, 2(1), 14.

Steinbrink, C., Köhler, C., Siemonsmeier, M., & van Ellen, T. (2018). Lessons learned from CPES co-simulation with distributed, heterogeneous systems. Energy Informatics, 1(1), 38.

Tran, H. M., Ha, S. V. U., Huynh, T. K., & Le, S. T. (2015). A feasible MapReduce peer-to-peer framework for distributed computing applications. Vietnam Journal of Computer Science, 2(1), 57-66.

Cordes, B., & Leeser, M. (2009). Parallel backprojection: a case study in high-performance reconfigurable computing. EURASIP Journal on Embedded Systems, 2009, 1.

 

6.  APPENDIX A

6.1 Data Pre-Processing

from nltk.stem import LancasterStemmer
from nltk.stem import PorterStemmer
from nltk.corpus import words

# set of valid English words from the NLTK words corpus
eng_words = set(words.words())

# raw word list (Wikipedia 100k + Brown), one word per line
data_dirty = [line.rstrip("\n") for line in open("data_dirty.txt", encoding="utf-8")]
data_clean = []

porter_stemmer = PorterStemmer()
lancaster_stemmer = LancasterStemmer()

# keep a word if it (or one of its stemmed forms) is a valid English word
for word in data_dirty:
    if word.lower() in eng_words:
        data_clean.append(word)
    elif word in eng_words:
        data_clean.append(word.lower())
    elif porter_stemmer.stem(word.lower()) in eng_words:
        data_clean.append(porter_stemmer.stem(word))
    elif porter_stemmer.stem(word) in eng_words:
        data_clean.append(porter_stemmer.stem(word.lower()))
    elif lancaster_stemmer.stem(word.lower()) in eng_words:
        data_clean.append(lancaster_stemmer.stem(word))
    elif lancaster_stemmer.stem(word) in eng_words:
        data_clean.append(lancaster_stemmer.stem(word.lower()))

len(data_dirty), len(data_clean)

# add every word of the English corpus itself to the cleaned data
data_clean.extend(list(eng_words))
len(data_clean)

# write the cleaned dataset, one word per line
with open("data_clean.txt", "w") as f:
    for word in data_clean:
        f.write("%s\n" % word)

6.2 Parallel Deduplication, Sorting, Client File

import math
import multiprocessing
import os
import sys
import warnings

import rpyc

warnings.filterwarnings("ignore")

def merge(*args):
    # accept either merge(left, right) or merge((left, right)) so that the
    # function can be used directly with Pool.map
    left, right = args[0] if len(args) == 1 else args
    left_length, right_length = len(left), len(right)
    left_index, right_index = 0, 0
    merged = []
    # merge the two sorted lists element by element
    while left_index < left_length and right_index < right_length:
        if left[left_index] <= right[right_index]:
            merged.append(left[left_index])
            left_index += 1
        else:
            merged.append(right[right_index])
            right_index += 1
    # append whatever remains of the non-exhausted list
    if left_index == left_length:
        merged.extend(right[right_index:])
    else:
        merged.extend(left[left_index:])
    return merged

def merge_sort(data):
    # standard sequential merge sort, used by each worker on its own chunk
    length = len(data)
    if length <= 1:
        return data
    middle = length // 2
    left = merge_sort(data[:middle])
    right = merge_sort(data[middle:])
    return merge(left, right)

def parallel_sort(data):
    # one worker process per available CPU core
    processes = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=processes, maxtasksperchild=1)
    # split the data into roughly equal chunks and sort each chunk in parallel
    size = int(math.ceil(float(len(data)) / processes))
    data = [data[i * size:(i + 1) * size] for i in range(processes)]
    data = pool.map(merge_sort, data, chunksize=1)
    # repeatedly merge pairs of sorted chunks in parallel until one list remains
    while len(data) > 1:
        extra = data.pop() if len(data) % 2 == 1 else None
        data = [(data[i], data[i + 1]) for i in range(0, len(data), 2)]
        data = pool.map(merge, data) + ([extra] if extra else [])
    return data[0]
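# Example usage (hypothetical data): parallel_sort(["pear", "apple", "kiwi"])
# splits the list across one process per core, sorts each chunk with
# merge_sort, and merges the results pairwise to give ["apple", "kiwi", "pear"].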

def send_to_minion(block_uuid, data, minions):
    # forward a block to the first minion; that minion replicates it to the rest
    print("sending to: " + str(block_uuid) + str(minions))
    minion = minions[0]
    minions = minions[1:]
    host, port = minion
    con = rpyc.connect(host, port=port)
    minion = con.root.Minion()
    minion.put(block_uuid, data, minions)

def read_from_minion(block_uuid, minion):
    # fetch a single block from the given minion over RPC
    host, port = minion
    con = rpyc.connect(host, port=port)
    minion = con.root.Minion()
    return minion.get(block_uuid)

def get(master, fname):
    # look up the file's block table on the master
    file_table = master.get_file_table_entry(fname)
    if not file_table:
        print("404: File Not Found")
        return

    data_unsorted = ""
    print("\nData stored in nodes is:\n")
    for block in file_table:
        # try each minion that holds a replica of this block
        for m in [master.get_minions()[_] for _ in block[1]]:
            data = read_from_minion(block[0], m)
            if data:
                sys.stdout.write(data)
                data_unsorted += data
                break
        else:
            # no replica returned data for this block
            try:
                if os.path.getsize(os.getcwd() + "\\minion_nodes\\" + str(block[0])) != 0:
                    print("No blocks found. Possibly a corrupt file.")
            except OSError:
                print("No blocks found. Possibly a corrupt file.")

    data_unsorted = data_unsorted.split("\n")
    if data_unsorted == [""]:
        return
    if data_unsorted[-1] == "":
        data_unsorted = data_unsorted[:-1]

    # sort the retrieved words in parallel and write them back if needed
    data_sorted = parallel_sort(data_unsorted)
    if data_unsorted != data_sorted:
        print("\n\n\nData stored in nodes is now sorted:\n")
        with open("data_{}_sorted.txt".format(fname), "w") as f:
            for word in data_sorted:
                print(word)
                f.write("%s\n" % word)
        print()
        put(master, "data_{}_sorted.txt".format(fname), fname, "get")
    else:
        print("\nData is already sorted.")

def put(master, source, dest, src):
    if src == "main":
        # deduplicate the source file before storing it in the DFS
        data_dup = [line.rstrip("\n") for line in open(source)]
        data_dup = list(set(data_dup))
        source = "data_{}_deduplicated.txt".format(dest)
        with open(source, "w") as f:
            for word in data_dup:
                f.write("%s\n" % word)

    # ask the master to allocate blocks for the (deduplicated) file
    size = os.path.getsize(source)
    b_size = int(math.ceil(float(size) / master.get_block_size()))
    blocks = master.write(dest, size, src)

    # replicate only on the initial write, not when writing back sorted data
    rep = master.get_replication_factor() if src == "main" else 1

    for r in range(rep):
        with open(source) as f:
            for i in range(b_size):
                b = blocks[b_size * r + i]
                data = f.read(master.get_block_size())
                block_uuid = b[0]
                minions = [master.get_minions()[_] for _ in b[1]]
                send_to_minion(block_uuid, data, minions)

def main(args):
    # connect to the master (NameNode) service
    con = rpyc.connect("localhost", port=2131)
    master = con.root.Master()
    if len(args) != 0 and args[0] == "get":
        get(master, args[1])
    elif len(args) != 0 and args[0] == "put":
        put(master, args[1], args[2], "main")
    else:
        print("TRY 'put srcFile.txt destFile' OR 'get destFile'")

if __name__ == "__main__":
    main(sys.argv[1:])
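Assuming the master service is listening on localhost port 2131 (as in main() above) and this listing is saved as client.py, the client would be invoked along the lines of "python client.py put data_clean.txt words" to deduplicate and store a file, and "python client.py get words" to read it back and sort it; the file names used here are illustrative.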

Developing an Algorithm to Schedule Workflows onto Distributed Resources

Introduction

Distributed computing technologies such as cluster, grid, and now cloud computing have all aimed at allowing access to large amounts of computing power in a fully virtualized manner, by aggregating resources and offering a single system view. In addition, an important aim of these technologies has been delivering computing as a utility. Utility computing is a business model for providing services and computing power on demand; consumers pay providers based on usage ("pay-as-you-go"), similar to the way in which we currently obtain services from traditional public utilities such as water, electricity, gas, and telephony. Clouds are a large pool of easily usable and accessible virtualized resources (such as hardware, development platforms and/or services) [3].

Today's applications consist of an immense number of interacting tasks. These applications include scientific workflows and big data, often in the form of extremely large datasets in a broad range of domains such as astronomy, bioinformatics, climate science, and others [1, 2, 14]. These tasks generally require enormous processing power that is beyond the ability of a single machine; with the emergence of distributed and cloud computing, the computing power required for processing these large datasets has become available. A popular representation of a workflow application is the directed acyclic graph (DAG). Workflow scheduling is a general form of task scheduling in which tasks are mapped onto distributed resources for execution by a workflow management system [10]. The main challenge, however, is how to schedule dynamic workflows with heavy fluctuations onto distributed resources (e.g., the cloud) efficiently, as the underlying distributed resources are highly dynamic, failure-prone and heterogeneous [8].

Estimates of task runtime, disk space usage, and memory consumption are commonly used by scheduling and resource provisioning algorithms to support efficient and reliable workflow executions. Most of the proposed duplication heuristics assume the existence of accurate estimates of task resource requirements such as execution and communication times, disk space, or memory usage. In practice, however, it is not ordinarily possible to know the task execution requirements a priori, and one would at best have only probability distributions for the estimates. In this work, we introduce a different approach: we construct a prediction model based on observed correlations between input data size and task requirements [16, 17, 18, 19] for task duplication scheduling.


The aim of task duplication scheduling is to achieve a shorter makespan. However, it can complicate the scheduling problem: the scheduling algorithm needs to maintain task priorities, identify the tasks that should be duplicated, and detect idle time slots for placing the duplicated tasks. For that, we need to capture workflow profiling data such as runtime, I/O, CPU utilization, memory usage and data dependencies among tasks. This workflow information can be extracted using profiling and monitoring tools such as the Pegasus workflow management system [10, 11] (Pegasus is open source and available under the Apache V2 license) and Kickstart [17]; Pegasus uses Kickstart for monitoring the workflow. Alternatively, the information can be captured by instrumenting the tasks using Linux system calls. These tools capture profiling data such as runtime, CPU and memory usage, and I/O operations. We can evaluate the significance of each of these parameters using a feature selection process [16].
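As a simple illustration of the correlation-based prediction described above, a least-squares fit of observed runtime against input size could serve as a first model. The sketch below uses numpy.polyfit; the profiling numbers are purely hypothetical placeholders, not measurements from this work.

import numpy as np

# historical profiling records (hypothetical): input size in MB and the
# observed task runtime in seconds, e.g. extracted from Kickstart logs
input_sizes = np.array([10.0, 25.0, 40.0, 80.0, 160.0])
runtimes = np.array([4.1, 9.8, 16.0, 31.5, 63.2])

# fit a first-order model: runtime ~ a * size + b
a, b = np.polyfit(input_sizes, runtimes, 1)

def predict_runtime(size_mb):
    """Estimate a task's runtime from its input size using the fitted model."""
    return a * size_mb + b

print(predict_runtime(100.0))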

1.1.  Problem definition: Duplication-based workflow scheduling in distributed systems

In order to plan the execution of a workflow in a cloud environment, we need to consider two sub-problems: 1) resource provisioning, which includes selecting and provisioning the compute resources that will run the tasks; this means having heuristics in place that can determine how many VMs to lease, their type, and when to start them and shut them down; and 2) scheduling, or mapping tasks onto the computational resources, in which each task is assigned to the best-suited resource. The term scheduling is often used to refer to the combination of these two sub-problems. The main issue is to schedule parallel applications, represented as a directed acyclic graph (DAG), onto the processing elements of parallel and distributed systems efficiently and cost-effectively. The goals of the scheduling process are to utilize resources efficiently and to achieve the performance objectives of the application (e.g., to minimize the program's parallel execution time). The communication overhead in distributed systems remains an inevitable penalty; because of it, the parallel program speedup may be limited or may not scale well with the size of the system. This can be expressed as two primary optimization problems: minimizing cost under a deadline constraint, or minimizing the schedule length (makespan) under a budget constraint.

1.2.  Motivation

Many scientific areas such as astronomy, bioinformatics and climate science have adopted workflows as a way of expressing multi-step computational problems that can be efficiently processed and analyzed on distributed infrastructures. The ever-growing availability of data generated by powerful scientific instruments often requires enormous processing power to analyze, and this demands a distributed platform in order for relevant results to be achieved in a reasonable amount of time. The latest distributed computing paradigm, cloud computing, offers several key benefits for the deployment of large-scale scientific workflows. However, present scheduling algorithms suffer from the lack of precise information about the workflows and resources.

1.3.  Challenges

Dynamic and fluctuating workflows – the major challenge to elasticity in cloud computing is that the workloads are unpredictable.

Heterogeneous physical nodes in cloud datacentres – tasks are allocated (or scheduled) across available nodes which are widely distributed across different locations and vary in computational power, architecture, memory and even network performance. Different tasks perform differently on different nodes.

Highly dynamic, failure-prone cloud systems – while cloud service offerings present a simplistic view of IT, as in the case of IaaS, the underlying systems-level support challenges are huge and highly complex.

Aims

We aim to develop an efficient algorithm to schedule workflow onto distributed resources (e.g. Cloud).

We aim to present an analytical model to achieve auto-scaling in a resource provisioning scheme for scientific applications on a cloud platform. This can be expressed as two primary optimization problems: minimizing cost under a deadline constraint, or minimizing the schedule length (makespan) under a budget constraint.

We aim to develop an algorithm to predict task execution runtimes and to determine onto which resource a task should be allocated, using the previous execution history of the workflow process and resources, since in practice it is difficult to estimate these requirements accurately due to the highly dynamic nature of the underlying cloud systems and workflows.

Survey

Workflow Scheduling Approaches

We can classify workflow scheduling algorithms into three categories, based on the available information about workflows and resources and on the task-resource mapping: 1) static scheduling, 2) dynamic scheduling, and 3) hybrid scheduling, as shown in Fig. 1.

 

Fig. 1 workflow scheduling classification

Workflow applications require a scheduling strategy that should take into account the precedence constraints among their component tasks. The workflow scheduling heuristics are classified into the following general categories:

List scheduling algorithms

Clustering algorithms

Task duplication algorithms

Guided random search algorithms

2.1.1.     List scheduling

The basic idea of list scheduling is to make a scheduling list (a sequence of tasks for scheduling) by assigning them some priorities and sorting them according to their priorities, and then repeatedly execute the following two steps until all the tasks in the DAG are scheduled:

1. Task selection: select the first task from the scheduling list.

2. Resource selection: allocate the task to the selected resource.

Some of the most important list scheduling heuristics are modified critical path (MCP) [168], mapping heuristic (MH) [57], insertion scheduling heuristic [88] and earliest time first (ETF) [74], heterogeneous earliest-finish-time (HEFT).
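A minimal Python sketch of the two-step loop described above is given below. It ignores precedence constraints and communication delays for brevity, and all names and structures are illustrative; it is not an implementation of any of the cited heuristics.

# Minimal list-scheduling sketch: tasks are taken in priority order and each is
# mapped to the resource that gives it the earliest finish time.
def list_schedule(tasks, priorities, cost, resources):
    """tasks: list of task ids
       priorities: dict task -> priority (higher priority is scheduled earlier)
       cost: dict (task, resource) -> execution time
       resources: list of resource ids"""
    ready_time = {r: 0.0 for r in resources}   # when each resource becomes free
    schedule = {}                               # task -> (resource, start, finish)
    for task in sorted(tasks, key=lambda t: priorities[t], reverse=True):
        # resource selection: pick the resource with the earliest finish time
        best = min(resources, key=lambda r: ready_time[r] + cost[(task, r)])
        start = ready_time[best]
        finish = start + cost[(task, best)]
        schedule[task] = (best, start, finish)
        ready_time[best] = finish
    return schedule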

2.1.2.     Clustering scheduling

The main idea of clustering algorithms is the minimization of the communication cost between the tasks of a DAG, by grouping heavily communicating tasks into the same cluster and assigning all of the tasks in the cluster to the same processor.

2.1.3.     Duplication scheduling

Duplication-based scheduling is designed to optimize the communication time between data-dependent tasks. The duplication method is often used together with clustering scheduling or list scheduling. The main idea behind this type of scheduling is to duplicate a task onto the same resource as the tasks that depend on it, in order to avoid the transmission time between the duplicated task and the other tasks. Thus, the earliest start time of the tasks executing on that resource is reduced, which can result in a better makespan.

The duplication heuristic relies on the fact that some resources remain idle during different time intervals, since they are allocated to tasks that are waiting for the output of tasks assigned to other resources, even when an optimized task scheduling algorithm is used. Therefore, the main motivation of duplication-based scheduling is to effectively employ these idle resources by discovering the critical tasks and redundantly assigning them to these time slots. Thus, the overall parallel execution time of the tasks can be further decreased.

The following two concerns are essential when modelling an optimized task duplication-based algorithm:

Identifying tasks to be duplicated: this involves minimizing the start time of a child task by selecting which parent tasks to duplicate.

Discovering idle time slots: this involves locating an idle time slot on a resource in which to place the duplicated parent task.

In the literature [4, 28], DBS algorithms are classified into two categories according to the task duplication approach used: Scheduling with Partial Duplication (SPD) and Scheduling with Full Duplication (SFD). Full duplication algorithms attempt to duplicate all the parents of a join node and apply the task duplication algorithm to all the processors that have any of the parents of the join node. A join node is defined as a node with an in-degree greater than one (i.e., a node with more than one incoming edge). DSH [20], BTDH [21], LCTD [22], CPFD [23], TCSD [24] and PY [25] belong to this category. Partial duplication algorithms do not duplicate the parent of a join node unless the parent is critical. Instead, they try to find the critical parent, defined later in this paper as the immediate parent which gives the largest start time to the join node; the join node is then scheduled on the processor where the critical parent has been scheduled. Because of the limited task duplication, algorithms in this category have a low complexity but may not be appropriate for systems with high communication overhead. They typically provide good schedules for an input DAG where computation cost is strictly larger than communication cost. CPM [26], SDBS [27], DFRN [28], TDS [29], LWB [30], PLW [31] and FSS [32] belong to this category. SFD algorithms have a higher complexity but typically show better performance than SPD algorithms. A trade-off exists between algorithms in these two categories: performance (better application parallel execution time) versus time complexity (longer time to carry out the scheduling algorithm itself).
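A small sketch of how the critical parent of a join node could be identified from profiling estimates, following the definition above, is shown below; the field names are illustrative assumptions.

# The parent whose (finish time + communication time) arrives last determines
# the join node's earliest start, so duplicating it on the join node's
# processor removes that communication cost.
def critical_parent(join_node, parents, finish_time, comm_cost):
    """parents: list of immediate parents of join_node
       finish_time: dict parent -> finish time on its assigned processor
       comm_cost: dict (parent, child) -> data transfer time"""
    return max(parents,
               key=lambda p: finish_time[p] + comm_cost[(p, join_node)])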

As noted above, most of the proposed duplication heuristics assume accurate estimates of task resource requirements such as execution and communication times, disk space, or memory usage, whereas in practice one at best has probability distributions for these estimates. Moreover, some task duplication scheduling algorithms have been extensively explored in the context of Grid systems [20, 21, 22, 23, 24] without addressing the cost of resource utilization.

Table 1 summarizes the time complexity of duplication-based algorithms and indicates the class each belongs to (i.e., whether it is an SPD or SFD algorithm). Note that, for a DAG with v nodes, the SFD algorithms have complexities around O(v^4) while the SPD algorithms have complexities around O(v^2).

Scheduling Algorithm | Classification | Time Complexity | Description
TDS  | SPD | O(v^2)              | Only the critical parent is duplicated
PLW  | SPD | O(v(e + v log v))   | Lower bound of start time is approximated
LWB  | SPD | O(v^2)              | Lower bound of start time is approximated; node weights are strictly larger than any edge weight
DFRN | SPD | O(v^2)              | Duplication first and reduction next
CPM  | SPD | O(v^2)              |
SDBS | SPD | O(v^2)              |
FSS  | SPD | O(v^2)              |
LCTD | SFD | O(v^4)              | High time complexity; optimization of linear clustering
CPFD | SFD | O(e x v^2)          | High time complexity; tasks on the critical path are considered first
DSH  | SFD | O(v^4)              | High time complexity; considers only the idle time slot between the finish time of the last node scheduled on a processor and the earliest start time of the candidate node, so the degree of duplication is likely to be small and duplication may not always be effective
BTDH | SFD | O(v^4)              | High time complexity; does not indicate any preference as to which parent node should be considered for duplication; the duplication process does not stop even if the start time of the candidate node is increased
PY   | SFD | O(v^2(e + v log v)) | Lower bound of start time is approximated
TCSD | SFD | O(v^3 log v)        | Lower bound of start time is approximated

Table 1. Comparison of scheduling algorithms

In this research, we aim to introduce a new DBS algorithm that duplicates the parents of a join node as done in SFD algorithms, but with reduced time complexity. We select the critical node for duplication based on the predicted output size of the node, aiming to achieve the performance of SFD algorithms with a computational complexity close to that of SPD algorithms.

2.1.4.     Dynamic scheduling

Dynamic scheduling is designed to tackle the unavailability of scheduling information, and it aims to achieve load balancing between the available resource queues. However, it is difficult to properly determine the load of each queue. We can achieve better load balancing among resources using duplication-based scheduling, by identifying idle time slots of resources and estimating dependencies among tasks from historical data of earlier workflow executions. Sonmez et al. [15] introduced a taxonomy of dynamic scheduling policies based on two pieces of task information (task length and communication data size) and three pieces of resource information (status, processing speed and link speed). Dynamic scheduling is able to handle uncertainties but loses the global optimization advantage of static scheduling. Hybrid scheduling takes advantage of both static and dynamic scheduling: it makes a static plan for all tasks based on approximate estimates of task execution and communication times.

System Architecture and Modelling

Workflow Management System Architecture

Several scientific workflow management systems have been developed to help scientists, analysts, and developers in different scientific domains create and execute scientific workflows and analyses across broad areas of the scientific community. Kepler [39] is a free and open-source workflow management system which operates on data locally and globally in a variety of formats; using Kepler's GUI, users are able to create and execute scientific workflows. Pegasus [40] is a workflow management system which runs over a variety of hardware, including a laptop, a campus cluster, a grid, or a commercial or academic cloud environment such as Amazon EC2 [43] and Nimbus. Triana [42] is an environment for workflow and data analysis which provides a graphical user interface that helps users develop and run their own programs; Triana has been developed at Cardiff University, originating as part of the GEO600 gravitational wave detector software and more recently serving a wider range of users. Taverna [41] is a powerful, open-source, domain-independent tool for designing and executing workflows; it uses the textual language SCUFL as the mechanism for specifying Taverna workflows. A workflow management system architecture is shown in Fig. 2, which consists of workflow and clustering engines, workflow and resource monitoring tools, and a user interface.


Fig. 2 Workflow management system architecture

The workflow and resource engine is the core of the workflow management system. Its main responsibilities are scheduling, data management, task execution management and resource provisioning.

3.2.   Workflow modelling and definition

Task-system scheduling

The notation and terminology used here is identical to [38]. A task is a unit of computational activity in the sequencing problem; it might, for example, be a job, a program, or an instruction. A task is specified in terms of its external behaviour, e.g., the inputs it requires, the output it generates, its action or function, and its execution time. If T is a task, two events are associated with T: initiation T̅ and termination Ṯ. Let t denote the time of occurrence of an event. We assume that t(Ṯ) − t(T̅) is nonzero and finite as long as all resources required by T are available.

In particular, a resource is any (not necessarily physical) device which is used by tasks in performing certain functions. Resources include control and input/output devices (tapes, disk drives, card readers, …), processors, storage media (drums, disks, …), programs, procedures, and data files. For the initiation and termination events of tasks, sets of state transitions are defined reflecting the function of tasks as well as system capabilities. An initiation event corresponds to a state transition that reflects: 1) the acquisition and assignment of resources, 2) the initialization of the resource state, and 3) the reading of input values. A termination event corresponds to a state transition that reflects: 1) the release or discontinued use of resources, 2) the writing of output values, and 3) if values or internal states are associated with resources, the "saving" of the resource state.

Parallelism in task execution computational model

Let v = {T1, …, Tn} be a set of tasks, and let → be a partial order (precedence relation) on v. The pair C = (v, →) is called a task system. The partial order represents operational precedence; i.e., T → T' means that task T is to be completed before task T' is begun.

Graphical representation of task systems

The precedence graph corresponding to the task system C has v as its vertices and the following set of directed edges: the edge (T, T') from T to T' is in the graph if and only if T → T' and there exists no T'' such that T → T'' → T'. This definition ensures that there is no redundant specification of precedence in the graph.
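A short sketch of this construction is given below, assuming the partial order is supplied as a transitively closed set of pairs; the task names are illustrative.

# Build the precedence graph by keeping only the non-redundant edges, i.e.
# drop (T, T') whenever some T'' satisfies T -> T'' -> T'.
def precedence_edges(tasks, order):
    def reaches(a, b):
        # membership test suffices because a partial order is transitive
        return (a, b) in order
    edges = set()
    for t, t2 in order:
        redundant = any(reaches(t, mid) and reaches(mid, t2)
                        for mid in tasks if mid not in (t, t2))
        if not redundant:
            edges.add((t, t2))
    return edges

# Example (hypothetical tasks): with T1->T2, T2->T3 and the implied T1->T3 in
# the partial order, only (T1, T2) and (T2, T3) remain as edges.
print(precedence_edges({"T1", "T2", "T3"},
                       {("T1", "T2"), ("T2", "T3"), ("T1", "T3")}))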


Fig. 3 A precedence graph

A (directed) path (x1 x2)(x2 x3) … (xk−1 xk) passes through the vertices (tasks) x1, …, xk in the graph C. The length of this path is k, i.e., the number of vertices in the path. For i and j such that 1 ≤ i ≤ j ≤ k, xj is a successor of xi and xi is a predecessor of xj. If j = i + 1, we use the terms immediate successor and immediate predecessor, respectively. A task with no successor is an exit task, and a task with no predecessor is an entry task. If task T is neither a successor nor a predecessor of task T', then T and T' are independent. The critical path (CP) of a task graph is a set of nodes and edges, forming a path from an entry node to an exit node, for which the sum of computation cost and communication cost is the maximum [23].

Since we are interpreting → as a (temporal) precedence ordering on tasks, T and T' may be concurrent if and only if they are independent (otherwise, T and T' must occur temporally in some fixed order).

An execution sequence of an n-task system C = (v, →) is any string α = a1 a2 … a2n of task initiation and termination events satisfying the precedence constraints of C. Stated precisely:

1. For each T in v, the symbols T̅ and Ṯ appear exactly once in α.

2. If ai = T̅ and aj = Ṯ, then i < j.

3. If ai = Ṯ and aj = T̅', where T → T', then i < j.

Two valid execution sequences for Fig. 3 are

T̅1 Ṯ1 T̅2 Ṯ2 T̅3 Ṯ3 T̅4 Ṯ4 T̅5 Ṯ5 T̅6 Ṯ6

and

T̅1 Ṯ1 T̅2 T̅3 Ṯ3 T̅4 T̅5 Ṯ5 Ṯ4 Ṯ2 T̅6 Ṯ6
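These three conditions can be checked mechanically. The small sketch below uses an illustrative encoding of events as ("init", T) and ("term", T) pairs; the encoding and names are ours, chosen only to mirror the definition above.

def is_valid_sequence(events, tasks, precedence):
    """events: list of ("init", T) / ("term", T) pairs in order of occurrence
       tasks: set of task names; precedence: set of pairs (T, T') with T -> T'"""
    pos = {e: i for i, e in enumerate(events)}
    # 1. every initiation and termination appears exactly once
    if sorted(events) != sorted([("init", t) for t in tasks] +
                                [("term", t) for t in tasks]):
        return False
    # 2. each task initiates before it terminates
    if any(pos[("init", t)] > pos[("term", t)] for t in tasks):
        return False
    # 3. if T -> T', then T terminates before T' initiates
    return all(pos[("term", t)] < pos[("init", t2)] for t, t2 in precedence)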

3.3.  Proposed Framework

In this project, a novel framework is proposed for mapping workflows to the underlying resources in a distributed system, adopting a layered treatment. The main aim of the mapping is to accomplish auto-scaling in resource provisioning. The proposed framework, shown in Fig. 5, consists of two layers: a logical layer and a physical layer. In the first stage of the logical layer, the workflow process and the workflow resources are modelled, and a logical connection relation is then established between workflow process activities and distributed resources. The modelling is based on dynamic information, such as the set of available tasks, the location of data, and the state of the resources, as well as static information, such as task priorities computed from the whole task graph. The algorithm also exploits the previous execution history of the workflow process and resources in order to predict task execution runtimes and to determine into which idle time slot of a resource a work item should be duplicated. In the second stage of the logical layer (the running stage), the logical connections are searched to find the best scheduling solution. The algorithm learns from historical data in order to establish the connections faster and more accurately.
[Fig. 5 sketch: Logical layer — modelling the workflow process and the workflow resources; establishing connections (reasoning over workflow activities, allocating resources for workflow activities); searching connections. Physical layer — optimization and resource allocation.]

Fig. 5 Layered framework for workflow process and resource modelling

In the physical layer, based on the availability of resources, the workflow management system allocates resources to the corresponding workflow process. If more resources are available (i.e., budget remains), they can be used for duplicate execution, which allows provisioning more resources than the minimum necessary for meeting the deadline. In practice, it is difficult to estimate these requirements accurately, and existing approaches often do not take memory usage into account [6]. We can optimize resource provisioning by estimating the characteristics of the resources required for workflow execution, which can have a significant impact on cost and resource utilization, for example when using a cloud infrastructure.

Current Status of Research and Future Plan

At this stage, we have designed a framework for mapping workflows to the underlying resources of a heterogeneous and dynamic distributed environment (Section 3.3). For this project, we consider the Montage workflow, created by the NASA Infrared Processing and Analysis Center (IPAC) as an open-source toolkit that can be used to generate custom mosaics of astronomical images in the Flexible Image Transport System (FITS) format. Montage is an I/O- and data-intensive workflow, and therefore a suitable case study for our project.

The next stage of this work will be to carry out experimental studies of the proposed framework and of workflow and resource requirements such as task runtime, disk space usage, and memory consumption. We shall also incorporate additional considerations, such as multi-tenant workflows, into the model.

Table 2 outlines the schedule for writing the thesis proposal.

References

R. Ferreira da Silva, R. Filgueira, I. Pietri, M. Jiang, R. Sakellariou, and E. Deelman, “A Characterization of Workflow Management Systems for Extreme-Scale Applications,” Future Generation Computer Systems, vol. 75, p. 228–238, 2017.

E. Deelman, T. Peterka, I. Altintas, C. D. Carothers, K. K. van Dam, K. Moreland, M. Parashar, L. Ramakrishnan, M. Taufer, and J. Vetter, “The future of scientific workflows,” The International Journal of High Performance Computing Applications, vol. 32, iss. 1, p. 159–175, 2018. 

R. Buyya, J. Broberg, and A. Goscinski, Cloud computing: Principles and paradigms. Wiley, 2010.

F. Wu, Q. Wu, Y. Tan, Workflow scheduling in cloud: a survey, J. Supercomput. 71, 2015, pp. 3373–3418.

H. Topcuouglu, S. Hariri, M.-y. Wu, Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Trans. Parallel Distrib. Syst. 13 (3) (2002) 260–274.

R. F. da Silva, G. Juve, M. Rynge, E. Deelman, and M. Livny, “Online Task Resource Consumption Prediction for Scientific Workflows,” Parallel Process. Lett., vol. 25, no. 3, p. 1541003, Sep. 2015.

F. Zhang, J. Cao, W. Tan, S.U. Khan, K. Li, A.Y. Zomaya, Evolutionary scheduling of dynamic multitasking workloads for big-data analytics in elastic cloud, IEEE Trans. Emerg. Top. Comput. 2, 2014, pp. 338–351.

F. Zhang, J. Cao, K. Hwang, K. Li, S.U. Khan, Adaptive workflow scheduling on cloud computing platforms with iterative ordinal optimization, IEEE Trans. Cloud Comput. 3, 2015, pp. 156–168.

M. Maheswaran, S. Ali, H. J. Siegel, D. Hensgen, R. F. Freund, Dynamic matching and scheduling of a class of independent tasks onto heterogeneous computing systems, in: 8th Heterogeneous Computing Workshop, HCW ’99, 1999.

E. Deelman et al., “Pegasus, a workflow management system for science automation,” Futur. Gener. Comput. Syst., vol. 46, pp. 17–35, 2015.

A. Khan, X. Yan, S. Tao, and N. Anerousis, “Workload characterization and prediction in the cloud: A multiple time series approach,” in 2012 IEEE Network Operations and Management Symposium, 2012, pp. 1287–1294.

T. Shibata, S. Choi, and K. Taura, “File-access Patterns of Data-intensive Workflow Applications and Their Implications to Distributed Filesystems,” in Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010, pp. 746–755.

I. Guyon and A. Elisseeff, “An Introduction to Variable and Feature Selection,” J. Mach. Learn. Res., vol. 3, pp. 1157–1182, Mar. 2003.

G. Juve, A. Chervenak, E. Deelman, S. Bharathi, G. Mehta, and K. Vahi, “Characterizing and profiling scientific workflows,” Futur. Gener. Comput. Syst., vol. 29, no. 3, pp. 682–692, 2013.

Sonmez O, Yigitbasi N, Abrishami S, Iosup A, Epema D (2010) Performance analysis of dynamic workflow scheduling inmulticluster grids. In: Proceedings of the 19th ACM international symposium on high performance distributed computing, ACM, pp 49–60.

G. Juve, A. Chervenak, E. Deelman, S. Bharathi, G. Mehta, K. Vahi, Characterizing and profiling scientific workflows, Future Generation Computer Systems 29 (3) (2014) 682–692.

F. Nadeem, M. Yousaf, R. Prodan, T. Fahringer, Soft benchmarks-based application performance prediction using a minimum training set, in: 2nd IEEE International Conference on e-Science and Grid Computing, 2006.

W. Tang, J. Bischof, N. Desai, K. Mahadik, W. Gerlach, T. Harrison, A. Wilke, F. Meyer, Workload characterization for mg-rast metagenomic data analytics service in the cloud, in: IEEE International Conference on Big Data, 2014.

T. Shibata, S. Choi, K. Taura, File-access patterns of data-intensive workflow applications and their implications to distributed filesystems, in: 19th ACM International Symposium on High Performance Distributed Computing (HPDC), 2010.

Kruatrachue B, Lewis T (1988) Grain size determination for parallel processing. IEEE Softw 5(1):23–32

Chung YC, Ranka S (1992) Applications and performance analysis of a compile-time optimization approach for list scheduling algorithms on distributed memory multiprocessors. In: Proceedings of supercomputing ’92. IEEE, pp 512–521.

Chen H, Shirazi B, Marquis J (1993) Performance evaluation of a novel scheduling method: linear clustering with task duplication. In: Proceedings of the 2nd international conference on parallel and distributed systems.

Ahmad I, Kwok YK (1998) On exploiting task duplication in parallel program scheduling. IEEE Trans Parallel Distrib Syst 9(9):872–892.

Li G, Chen D, Wang D, Zhang D (2003) Task clustering and scheduling to multiprocessors with duplication. In: Proceedings of the parallel and distributed processing symposium, IEEE.

Papadimitriou CH, Yannakakis, M (1988) Towards an architecture-independent analysis of parallel algorithms. In: Proceedings of the twentieth annual ACM symposium on theory of computing, STOC ’88, ACM, New York.

J. Y. Colin and P. Chretienne, “C.P.M. Scheduling with Small Communication Delays and Task Duplication,” Operations Research, 1991, pp. 680-684.

S. Darbha and D. P. Agrawal, “SDBS: A task duplication based optimal scheduling algorithm,” Proc. of Scalable High Performance Computing Conf., May 1994, pp. 756-763.

Park GL, Shirazi B, Marquis J (1997) Dfrn: a new approach for duplication based scheduling for distributed memory multiprocessor systems. In: Proceedings of 11th international parallel processing symposium, pp 157–166.

Darbha S, Agrawal DP (1998) Optimal scheduling algorithm for distributed-memorymachines. IEEE Trans Parallel Distrib Syst 9(1):87–95.

Colin J, Chretienne P (1991) C.p.m. scheduling with small computation delays and task duplication. In: Proceedings of operations research, pp 680–684.

Palis MA, Liou JC, Wei DS (1996) Task clustering and scheduling for distributed memory parallel architectures. IEEE Trans Parallel Distrib Syst 7(1):46–55.

S. Darbha and D. P. Agrawal, “A Fast and Scalable Scheduling Algorithm for Distributed Memory Systems,” Proc. of Symp. On Parallel and Distributed Processing, Oct. 1995, pp. 60-63.

W. Cirne, F. Brasileiro,D. Paranhos, L.F.W. Go´es, and W. Voorsluys, ‘‘On the Efficacy, Efficiency and Emergent Behavior of Task Replication in Large Distributed Systems,’’ Parallel Comput., vol. 33, no. 3, pp. 213-234, Apr. 2007.

G. Kandaswamy, A. Mandal, and D.A. Reed, ‘‘Fault Tolerance and Recovery of Scientific Workflows on Computational Grids,’’ in Proc. 8th Int’l Symp. CCGrid, 2008, pp. 777-782.

R. Sirvent, R.M. Badia, and J. Labarta, ‘‘Graph-Based Task Replication for Workflow Applications,’’ in Proc. 11th Int’l Conf. HPCC, 2009, pp. 20-28.

M. Dobber, R. van der Mei, and G. Koole, ‘‘Dynamic Load Balancing and Job Replication in a Global-Scale Grid Environment: A Comparison,’’ IEEE Trans. Parallel Distrib. Syst., vol. 20, no. 2, pp. 207-218, Feb. 2009.

X. Tang, K. Li, G. Liao, and R. Li, ‘‘List Scheduling with Duplication for Heterogeneous Computing Systems,’’ J. Parallel Distrib. Comput., vol. 70, no. 4, pp. 323-329, Apr. 2010.

Coffman, E. G., and Denning, P. J. Operating Systems Theory. Prentice-Hall, Englewood Cliffs, N.J.

Kepler Scientific Workflow System. https://kepler-project.org/

Pegasus Workflow Management System. https://pegasus.isi.edu/

Apache Taverna. http://www.taverna.org.uk/

Triana Scientific Workflow. http://www.trianacode.org/

Amazon Web Services. https://aws.amazon.com/

Our proposed project aims to engineer intelligence and sustainability into the next generation of scientific workflow scheduling and management for distributed systems (i.e. cloud computing). These systems are characterized by scale, dynamism, uncertainty and hyper-connectivity in operations. The workflow is central to the scheduling paradigm: for example, there are unprecedented amounts of requests from the customers of Amazon and eBay during the holiday seasons, after which website traffic drops dramatically. Mapping performance requirements to the underlying resources in the cloud is the key problem [4]. Over-provisioning resources to assure performance requirements may result in unproductive instances and unnecessary costs, while under-provisioning resources will undoubtedly hurt performance [7].

Clouds typically face highly dynamic demands for resources with highly heterogeneous and dynamic workflows. For example, the workflows associated with an application can be quite dynamic, in terms of both the number of tasks processed and the computation requirements of each task. Furthermore, different applications may have very different and dynamic quality of service (QoS) requirements: one application may require high throughput, another may be constrained by a budget, and a third may have to balance both throughput and budget. The performance of a cloud service can also vary with these varying loads as well as with failures, network conditions, and so on, resulting in a different "QoS" for the application.

Workflow management and scheduling of these systems therefore require a high level of autonomy that enables dynamic application scale-out to address dynamic workflows, spikes in demand, and other extreme requirements, where scheduling should consider (i) the software world, including the functional requirements, logical flow, priorities and constraints of the workflow, and (ii) the physical world, including the computational resources, the dependability of the distributed resources and the end-user Quality of Experience (QoE), among others. Our project builds on a successful combination of research areas such as AI, machine learning, dynamic optimization and distributed computing to investigate the requirements of scientific workflow scheduling and management in distributed systems. It will develop novel intelligent algorithms which leverage learning from history [6, 11, 12, 13], situation, context and/or various interactions between the physical and software worlds to provide seamless, dependable and more effective scheduling. The comparative evaluation of our algorithm against related work will be based on metrics such as schedule length ratio, speedup, and the running time of the algorithm [5].
 

Does Distributed or Massed Practice Produce Better Learning?

Researchers have long studied practice distribution because of the central role it plays in motor learning. The work has focused mainly on how much rest should be scheduled between practice periods to create conditions for optimal learning. According to a number of researchers, distributed practice is more effective than massed practice. Massed practice, unlike distributed practice, is characterized by longer work periods or active practice with shorter rest. Distributed practice, by contrast, spreads a comparable amount of work or active practice across multiple sessions; each session is shorter than in a massed schedule, but the sessions extend over a longer period in order to reach the training objectives. The main concern for teachers, practice leaders, or physical therapists using practice distribution is how to use the time allotted between and within training sessions. Ideally, the principle of massed versus distributed practice argues that the same amount of study will produce better learning only if it is carried out in distinct shorter periods rather than in a single extended one. Consequently, distributed practice should improve learning for all individuals regardless of their racial background, age, where they live, or whether or not they attended school.


When people want to understand something better, practicing the activity or reviewing the material only once will not yield significant results. Practicing or reviewing the material at the right time is therefore essential for improving learning outcomes. In fact, reviewing material some time after the initial learning, rather than immediately afterwards, leads to better learning. The major advantage of distributed practice over its counterpart is the spacing effect. For these reasons, distributed practice is the better choice for learning and comprehending material.
In their article, Zechmeister and Shaughnessy made several interesting points describing the pros and cons of both approaches. Undergraduates rated the likelihood that they would later recall individual words presented for free learning. Predictions were made on a seven-point scale immediately after an item's presentation in the list. The rated items included words presented only once as well as words presented twice, in either a massed or a distributed manner. Twice-presented items were rated as more likely to be recalled than once-presented items, and they were indeed recalled more often. However, although massed-practice items were judged more likely to be recalled than distributed-practice items, they were not. The finding that participants overestimated how well they knew the massed items suggests why processing may be shallower for massed than for distributed presentations. The results support the attenuation-of-attention hypothesis regarding the spacing effect in free recall.
To best remediate academic deficiencies, teachers need to identify empirically validated interventions and be able to apply instructional changes that produce more efficient student learning. The study examined the effect of massed and distributed practice combined with an explicit timing intervention, to evaluate the extent to which these changes increase math fact fluency on basic addition problems. Forty-eight third-grade students were placed into one of three groups, with every group completing four 1-minute explicit timing procedures each day across 19 days. Group one completed all four 1-minute timings consecutively; group two completed two consecutive 1-minute timings in the morning and two in the afternoon; and group three completed one 1-minute independent timing four times distributed over the day. Growth curve modeling was used to examine progress over the course of the study. The results suggested that students in the distributed practice conditions, both four times per day and twice per day, showed significantly higher fluency growth rates than those practicing only once per day in a massed format. These results indicate that combining distributed practice with explicit timing procedures is a useful modification that enhances student learning without adding instructional time when targeting math fact fluency. Schutte, Duhon, Solomon, Poncy, and Moore discuss this comparison of massed and distributed practice on basic math fluency rates.
Lastly, to support my argument, the final research article I read, by Moss, describes an investigation of the extent to which type-of-practice strategies affected learning a verbal information task or an intellectual skill task for second- and fourth-grade students. One hundred and ninety students from eight second- and fourth-grade classrooms participated. Classrooms were randomly assigned to the two practice conditions, and all students took part in a 9-week integrated learning system intervention. The study found that intellectual skill tasks were learned slightly more effectively in a massed than in a distributed practice mode, although the difference was not statistically significant. Students also learned verbal information tasks somewhat more effectively in the massed practice mode, but again the difference was not statistically significant. The differences between the two practice conditions were smaller on verbal information tasks, and no statistically significant differences were found. Additional analyses, using the number of lesson units completed, showed that completing a greater number of math lessons had a positive effect on math test scores. These analyses suggest that a stronger treatment, or better adherence to the treatment, could have produced a statistically significant effect for massed practice in intellectual skill areas. Replication is needed to give this claim a more solid foundation. It was concluded from this study, given the moderate effect-size differences and the identical cost of implementing the two types of practice, that massed practice would be more appropriate for intellectual skill tasks, and that massed practice is also somewhat more effective in the higher-order verbal information area. Strong research inference supports the use of distributed practice for lower-level tasks, particularly in the verbal information areas. Further research is needed to find factors that limit or negate the spacing effect.
In conclusion, distributed practice is the better choice because the material is broken down into smaller lessons. Practicing or reviewing the material at the right time is essential for improving learning outcomes, and within these smaller lessons you can gradually learn and comprehend whatever you are trying to understand. Distributed practice, again, spreads a comparable amount of work or active practice across multiple sessions. This mirrors how we were taught in school and how, now at university, our professors break lessons down into sections; it is how we gain the knowledge we have today.
References

Schutte, G. M., Duhon, G. J., Solomon, B. G., Poncy, B. C., Moore, K., & Story, B. (2015). A comparative analysis of massed vs. distributed practice on basic math fact fluency growth rates. Journal of School Psychology, 53(2), 149-159. doi:10.1016/j.jsp.2014.12.003
Zechmeister, E. B., & Shaughnessy, J. J. (1980). When you know that you know and when you think that you know but you don’t. Bulletin of the Psychonomic Society, 15(1), 41-44. doi:10.3758/BF03329756
Moss, V. D. (1996). The efficacy of massed versus distributed practice as a function of desired learning outcomes and grade level of the student (Order No. AAM9603493). Available from PsycINFO. (618988241; 1996-95005-375).

 

Real World Distributed Applications

Topic 1: Give two examples of real world distributed applications that were not discussed in the class slides (Hospital Management system, Airline reservation system, Banking system). You should not only specify what the application does, but also provide at least 3 features of the system (hardware technology, software technology, integration features, number of nodes, network characteristics, etc.)
Answer:
Example 1: Immigrant VISA information System (IVIS)   
This is a computerized management information system used by the National Visa Center (NVC) to manage the processing of immigrant visa petitions received from the Department of Homeland Security (DHS), United States Citizenship and Immigration Services (USCIS) regional service centers and district offices. The information shared by IVIS is used for processing, auditing, and tracking of individual immigrant visa applications, as well as for tracking the number of immigrant visas assigned that are subject to numerical limitations based upon visa classification and country of chargeability.

The only internal organization that has access to IVIS data is the Bureau of Consular Affairs (CA). CA uses IVIS when issuing visas to foreign nationals and passports to U.S. citizens, and IVIS results are used as a data source for this processing at posts abroad and at domestic passport agencies.
Specifically, data is shared among the following CA applications:

DataShare/Interagency Data Exchange Application (IDEA) – This provides application case data from the petition. This data arrives daily and is manually loaded into IVIS. This data is automatically populated in IVIS when creating a new case.
Consular Consolidated Database (CCD) – Conduit for data exchange between IVIS and DataShare / IDEA.
Immigrant Visa Allocation Management System (IVAMS) – The Case Number, FSC, Post Code, and Visa Class were loaded into IVAMS for the purpose of immigrant visa tracking and reporting.
Diversity Visa Information System (DVIS) – Alien Numbers generated in IVIS are transferred to DVIS and the DV post systems.
Immigrant Visa Overseas (IVO) – data on immigrant visas, petitions, and allocations is sent to a post location and loaded into their IVO systems.
SharePoint – data and images on immigrant visas, petitions, and appointment information is shared with a post through a secure site.
Worldwide Refugee Admission Program System (WRAPS) – data on immigrant visa petitions is sent to the Refugee Processing Center’s WRAPS system.

Features of the Immigrant Visa Information System (IVIS):
Hardware:

Mainframe systems: government-operated computing platforms not shared by other business applications or technologies.
Fingerprint recognition and biometrics technology.
Intrusion detection systems.

Software:

DataShare is used to move data from the Consular Consolidated Database (CCD); it allows text files to be converted into Interagency Data Exchange Application (IDEA) format and transferred to USCIS. Encryption technology is used for all communications shared with external agencies.
Fingerprint reader/recognition and firewalls.
eDP (Electronic Data Processing) Web.
Data replication technology.

Networking:
This mainframe system has z/OS networking capability, which includes a fully featured communications server integrating SNA (Systems Network Architecture) and TCP/IP protocols, making it a large server capable of serving many worldwide clients simultaneously.
Example 2: Retail Management Information System at GS-Retail, South Korea.
GS-Retail is the largest retailer in South Korea. It uses a retail management information system (RMIS) to support its distributed stores by linking them together with distributed applications. Below are the features of GS-Retail's RMIS:

Information is exchanged instantly; store managers stay in contact to more effectively control profits for the whole company.
The system supports product management and also enables CRM (Customer Relationship Management) analysis.
It allows managers to set prices for variable time periods based on store location and to meet the needs of sales and inventory managers.
It provides the flexibility to use a mobile user interface.

It is an integrated, end-to-end platform solution (appliance) with the following components: an application module and the IBM Smart Analytics Solution (admin nodes, and data nodes with standby nodes).
Hardware stack: IBM System x3650 M3 servers, storage servers (DS3400) with SSDs (solid state drives), and SAN switches. This integrated platform (hardware and software with functional procedures) makes it possible to replace aging servers with a single implementation of an integrated enterprise data warehouse environment.
Software stack: DB2 Enterprise Server Edition, IBM Tivoli System Automation for Multiplatforms (TSA) with RSCT (Reliable Scalable Cluster Technology), IBM Cognos 8 Business Intelligence, IBM Systems Director, DS Storage Manager, and IBM Remote Support Manager (RSM) for Linux.
Integration features: Easily scalable and expandable solution where data nodes can be added to the existing cluster solution to expand the capacity of the system.
Number of nodes: 2 Application Nodes, 1 Management node, 1 Administration node, 4 Data Nodes and 1 Standby node
Network characteristics: The network is fault-tolerant and resilient. The system has two networks: a public network for external client communication, and a private FCM network used for internal communication between the data nodes. For the public network, two HBA adapters were provided and bonded together.
Network and switch failures are protected by hardware redundancy: single network port failures are handled by bonded networks, dual HBA adapters protect against HBA failures, and a stacked switch configuration for the FCM (Fiber Optic Communication Management) network protects against an FCM network switch failure.
 
Topic 2: Describe two similarities between road/highway networks and packet switching networks
A packet-switching network is a network that routes digital data in small pieces called “packets”, each of which proceeds through the network independently. This digital data is simply a bit stream with encoded information. A packet is not a physical object; packet-switched networks transport packets. Such a network is in many ways similar to the transportation network of roads, highways, and intersections, which transports vehicles that carry people and goods.


For example, consider a factory that needs to move a large amount of cargo to a destination warehouse located thousands of miles away. At the factory, the cargo is first segmented and loaded into a fleet of trucks. Each truck then travels independently through the network of intersections, roads, and highways to the destination warehouse, where the cargo is unloaded and grouped with the rest of the cargo arriving from the same shipment. Below are some similarities between a packet-switching network and the road/highway transportation network:

Packets are similar to trucks
Communication links are similar to highways and roads
Packet switches are similar to intersections
End systems are similar to buildings
Trucks take a path through the transportation network; packets take a path through the computer network

 
Retail management means running a store where merchandise is sold; retail management information systems use hardware, software, and procedures to manage activities like planning, inventory control, financial management, logistics, and point-of-sale transactions.
Distributed application name: CLAIMS 3, i.e. the Computer Linked Application Information Management System and associated systems.
CLAIMS 3 is the case management system used by USCIS that supports and maintains officer casework documentation and tracking for most benefit requests. USCIS oversees lawful immigration to the United States. It receives and adjudicates petitions, applications, and other requests for immigration benefits.
ICMS is a web-based front end to CLAIMS 3. ICMS can be used to review, modify, and track the adjudication of benefit request forms performed by USCIS personnel.
CLAIMS 3 functionality includes tracking the adjudication performed by USCIS personnel, archiving, card production, case history, case transfer, on-demand reports, electronic file tracking, image capture, production statistics, status updates, and electronic ingestion of benefit request form data captured through the Lockbox.
USCIS uses the Computer Linked Application Information Management System (CLAIMS 3) and associated systems to manage the adjudication process for most domestically filed, paper-based immigration benefit filings, with the exception of naturalization, intercountry adoption, and certain requests for asylum and refugee status.
USCIS uses different data systems to capture and store information provided by benefit requestors, including the Computer Linked Application Information Management System (CLAIMS 3), the Interim Case Management System (ICMS), and the Marriage Fraud Amendment System (MFAS), collectively referred to as “CLAIMS 3 and associated systems.”
Three features of the system (hardware technology, software technology, integration features, number of nodes, network characteristics, etc.): CLAIMS 3 and associated systems are old, legacy mainframe systems that do not have the capability to interface in real time with other systems or to generate reports, metrics, or aggregated statistics. CLAIMS 3 includes the Mainframe, the Local Application Network (LAN), ICMS, and MFAS.
However, CLAIMS 3 still serves as the authoritative case management system for certain benefit requests because so many other tools and systems point to it.
Software technology: Data replication technology is used to replicate data from CLAIMS 3 across many systems and tools within USCIS, owing to the technical limitations of CLAIMS 3 itself.
Integration features:
This system stores information related to:

Petitioner and beneficiary data
Processing of cases based on priority and cut-off dates
Creation and recording of correspondence with the beneficiary, petitioner, and/or agent
Transmittal of data to the Immigrant Visa Overseas (IVO) system at post for final processing

The IVIS application assists the NVC in tracking and processing immigrant visa petitions based on local necessities and requirements established by the State Department. The immigrant visa issuance process begins with the submission of a petition for immigration to USCIS. USCIS reviews and adjudicates the petition and forwards approved petitions to the State Department for visa processing.
The NVC performs several visa-processing activities that track petitions requesting immigration services from initial NVC receipt from USCIS through transfer to the posts. NVC processing includes:
Telecom Industry – fraud management
Reference:
http://searchitoperations.techtarget.com/definition/distributed-applications-distributed-apps
Distributed apps can communicate with multiple servers or devices on the same network from any geographical location. The distributed nature of the applications refers to data being spread out over more than one computer in a network.
Distributed applications are broken up into two separate programs: the client software and the server software. The client software or computer accesses the data from the server or cloud environment, while the server or cloud processes the data. Cloud computing can be used instead of servers or hardware to process a distributed application’s data or programs. If a distributed application component goes down, it can fail over to another component to continue running.
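As a minimal, hedged illustration of this client/server split, the sketch below uses Python's standard xmlrpc modules; the service and its square method are hypothetical stand-ins for whatever processing the server side of a real distributed application performs.

# Server half of a toy distributed application (hypothetical "square" service).
from xmlrpc.server import SimpleXMLRPCServer

def square(x):
    # The server (or cloud) side performs the processing on behalf of the client.
    return x * x

if __name__ == "__main__":
    server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
    server.register_function(square, "square")
    server.serve_forever()

The client half is equally small: it connects to the server and asks it to do the work.

# Client half: accesses the server's processing over the network.
from xmlrpc.client import ServerProxy

proxy = ServerProxy("http://localhost:8000/")
print(proxy.square(7))   # prints 49, computed on the server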
Distributed applications allow multiple users to access the apps at once. Many developers, IT professionals, and enterprises choose to store distributed apps in the cloud because of the cloud’s elasticity and scalability, as well as its ability to handle large applications or workloads.
Enterprises can choose to use container technology, such as Docker, to package and deploy distributed applications. The containers can build and run distributed applications, as well as separate distributed apps from other applications in a cloud or shared infrastructure.

Distributed Control System for Microgrid

Abstract
In this study, an example of a microgrid composed of a diesel generator and two uninterruptible power supply systems is considered. This microgrid is installed in three buildings of the Tallinn University of Technology. This paper deals with how to implement a distributed control and monitoring system based on an Ethernet network in the microgrid. The paper describes a control strategy to implement both grid-connected and islanded operation modes of the microgrid.
Keywords – Control system, diesel generator, microgrid
Introduction
Distributed generation (DG) is becoming an increasingly attractive approach to reduce greenhouse gas emissions, to improve power system efficiency and reliability, and to relieve today’s stress on power transmission and distribution infrastructure [1]. Distributed generation encompasses a wide range of prime mover technologies, such as internal combustion engines, gas turbines, microturbines, photovoltaic, fuel cells and windpower [32]. A better way to realize the emerging potential of DG is to take a system approach which views generation and associated loads as a microgrid [21].
A microgrid is a concept defining the operation of distributed generation, in which different microsources operate as a single controllable system that provides power and heat to a cluster of loads in the local area [3], [8] – [9].
A well designed microgrid should appear as an independent power system meeting the power quality and reliability requirements [3]. The primary goal of microgrid architectures is to significantly improve energy production and delivery to load customers, while facilitating a more stable electrical infrastructure with a measurable reduction in environmental emissions [10]. The most positive features of microgrids are the relatively short distances between generation and loads and low generation and distribution voltage level. The main function of a microgrid is to ensure stable operation during faults and various network disturbances.
The microgrid is a promising concept in several fronts because it [18]:

provides means to modernize today’s power grids by making it more reliable, secure, efficient, and de-centralized;
provides systematic approaches to utilize diverse and distributed energy sources for distributed generation;
provides uninterruptible power supply functions;
minimizes emissions and system losses.

Despite the many advantages of microgrids, there remain many technical challenges and difficulties in this new power industry area. One of them is the design, acceptance, and availability of low-cost technologies for installing and using microgrids [4]. The increased deployment of power electronic devices in alternative energy sources within microgrids requires effective monitoring and control systems for safe and stable operation while achieving optimal utilization of different energy sources [35]. Microgeneration suffers from a lack of experience, regulations, and norms. Because of specific characteristics of microgrids, such as the heavy involvement of control components and the large number of microsources with power electronic interfaces, many difficulties remain in controlling microgrids. Realization of the complicated control processes in microgrids requires a specific communication infrastructure and protocols. During the process of microgrid organization, many questions concerning protection and safety aspects emerge. It is also necessary to organize free access to the network and efficient allocation of network costs.
The predominant existing distributed generation is based on an internal combustion engine driving an electric generator [36]. To investigate various aspects of the integration of alternative energy sources such as conventional engine generators, this paper proposes a prototype of a microgrid for three academic buildings at the Tallinn University of Technology, which consists of a diesel generator and battery storage with a power electronic interface. The main goal of this work is to design an intelligent control system for the microgrid that is efficient enough to manage power balance on its own by making use of state-of-the-art communication technology. Moreover, the aim of this paper is to describe the control strategy of microgrid operation in both steady-state modes. This control system enables the microgrid to balance electric power demand and supply and to simultaneously control the state of the power network.


Microgrid Theoretical Background
A microgrid is described as a small (several MW or less in scale) power system with three primary components: distributed generators with optional storage capacity, autonomous load centers, and system capability to operate interconnected with or islanded from the larger utility electrical grid [10], [11]-[13]. According to [39], [22], multiple facility microgrids span multiple buildings or structures, with loads typically ranging between 2MW and 5MW. Examples include campuses (medical, academic, municipal, etc), military bases, industrial and commercial complexes, and building residential developments.
Microgrids include several basic components for operation [3], [4]. An example of a microgrid is illustrated in Fig.1.

Distributed Generation. Distributed generation units [1] are small sources of energy located at or near the point of use. There are two basic classes of microsources: one is a DC source (fuel cells, photovoltaic cells, etc.), the other is a high-frequency AC source (microturbines, reciprocating engine generators, wind generators), which needs to be rectified. An AC microgrid can be a single-phase or a three-phase system. It can be connected to low-voltage or medium-voltage power distribution networks.
Storage Devices. Distributed storage technologies are used in microgrid applications where the generation and loads of the microgrid cannot be exactly matched. Distributed storage provides a bridge in meeting the power and energy requirements of the microgrid. Distributed storage enhances the overall performance of microgrid systems in three ways. First, it stabilizes and permits DG units to run at a constant and stable output, despite load fluctuations. Second, it provides ride-through capability when there are dynamic variations of primary energy (such as those of sun, wind, and hydropower sources). Third, it permits DG to seamlessly operate as a dispatchable unit. Moreover, energy storage can benefit power systems by damping peak surges in electricity demand, countering momentary power disturbances, providing outage ride-through while backup generators respond, and reserving energy for future demand. There are several forms of energy storage, such as batteries, supercapacitors, and flywheels.
Interconnection Switch. The interconnection switch is the point of connection between the microgrid and the rest of the distribution system. New technologies in this area consolidate the various power and switching functions (power switching, protective relaying, metering, and communications) traditionally provided by relays, hardware, and other components at the utility interface into a single system with a digital signal processor. The interconnection switches are designed to meet grid interconnection standards.
Control System. The control system of a microgrid is designed to safely operate the system in grid-parallel and stand-alone modes. This system may be based on a central controller or embedded as autonomous parts of each distributed generator. When the utility is disconnected, the control system must control the local voltage and frequency, provide (or absorb) the instantaneous real power difference between generation and loads, provide the difference between generated reactive power and the actual reactive power consumed by the load, and protect the internal microgrid.

Structure of the Proposed Microgrid
The microgrid is installed in three buildings of the Tallinn University of Technology (TUT): the Faculty of Power Engineering, the TUT Library, and the School of Economics and Business Administration. Consequently, according to the classification given in [22], this power system can be defined as a multiple facility microgrid. Fig.2 illustrates the various components of the power system of the microgrid at TUT.
The structure of the microgrid for the TUT campus is proposed as follows. Fig.3 shows a schematic of the power system. The microgrid systems targeted in this study are autonomous areas with a power demand of several kilowatts, including a diesel generator, two uninterruptible power supply (UPS) systems with battery storage, and loads. They are connected to the power electronic interface, forming a local AC network at 230 V, 50 Hz.
The diesel generator is used as the main distributed energy resource in this microgrid. It has a nominal power of 176 kW/220 kVA, a voltage of 240 V/400 V, and a maximum current of 318 A. This generator is connected to the AC bus via automatic relay logic (ARL2). The ARL2 continuously observes both of its sides: the main grid and the microgrid. If there is a fault in the general grid, the ARL2 will disconnect the microgrid, creating an energy island.
The battery banks (E1 and E2) are used as the distributed energy storage devices in the microgrid to ensure continuous supply of the local load. They are interfaced to the electrical network through the two UPS systems: UPS1 (160 kVA) and UPS2 (240 kVA). Hence, the microgrid has two main possible operation modes: grid-connected and islanded.
The main customers of the microgrid are the computers and servers located in the laboratories and office rooms in the three buildings of TUT. The clients in the Library building (computers) are interfaced to the electrical network using ARL1. In addition, four experimental loads (Experimental loads 1..4) can be connected to the distribution shield located in the Laboratory of Electrical Drives. Nine intelligent sensors (P1..P9) are assigned to these loads. Their task is to measure electrical power and energy parameters of the network, such as voltage, current, power, energy, and power factor, and to transmit this information to the controller.
The microgrid is connected to the general city electricity grid using two two-section transformer substations (6000 V/400 V) located in the Faculty of Power Engineering and the School of Economics and Business Administration buildings.
Description of the Control System
Taking into account the configuration and features of the power network of the Tallinn University of Technology, the control system structure for the microgrid is designed with the following specifications:

the balance of electric power demand and supply of the power network is provided;
both the steady-state modes and the transient performance of the microgrid are achieved.

A block diagram of the hierarchical control system which is based on the multiagent technology [40], [41], is demonstrated in Fig.4. The design of the control system can be divided into hardware and software.
The control structure of the microgrid has three levels:

Operator console and application server;
Central controller (CC);
Local controllers (LC) and measuring devices.

The operator console is a computerized workstation with special software comprising supply and demand calculation units, monitoring units, control schemes, and dispatching units. The function block diagram of the software is shown in Fig.5. The operator console heads the hierarchical control system. Its main goals are: to keep track of the whole system by monitoring the status of the communication nodes and generating units; to collect data from the measuring devices; to calculate supply and demand of power; to visualize the information received; to display the basic modes of the microgrid; and to transfer control commands to the central controller. The application server is designed for archiving data received from the measuring devices.
The main interface between the operator console and the other communication nodes of the microgrid control system is the central controller. It is primarily responsible for the management of the microgrid and for the optimization of its operation. The central controller operates in real time. Its main functions are: connection and disconnection of the microgrid, the synchronization process, and the detachment of loads. In addition, the aims of the central controller are: to collect information from the measuring devices; to transfer data from the operator console and the application server; to manage the power supply switches; and to transmit the control commands to the local controllers.
The local controllers form the third hierarchical control level. They include the microsource controller, located at the distributed resources of the microgrid, which manages the active and reactive power production levels of the diesel generator. Moreover, the microsource controller is responsible for maintaining the desired steady-state and dynamic performance of the power network. The other local controllers are located in the two UPS systems; their main goal is to manage the charging of the battery storage.
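To make the three-level hierarchy more concrete, here is a minimal in-process sketch of the message flow from the operator console through the central controller to the local controllers; the class and method names are our own placeholders, not part of the actual TUT implementation.

# Toy model of the hierarchy: operator console -> central controller -> local controllers.
class LocalController:
    def __init__(self, name):
        self.name = name
        self.setpoint_kw = 0.0

    def apply_setpoint(self, kw):
        # A real LC would adjust the diesel generator or a UPS unit here.
        self.setpoint_kw = kw
        return f"{self.name}: set-point {kw} kW applied"

class CentralController:
    def __init__(self, local_controllers):
        self.local_controllers = local_controllers

    def dispatch(self, setpoints):
        # Forward operator commands to the appropriate local controllers.
        return [lc.apply_setpoint(setpoints[lc.name]) for lc in self.local_controllers]

class OperatorConsole:
    def __init__(self, central_controller):
        self.cc = central_controller

    def send_command(self, setpoints):
        return self.cc.dispatch(setpoints)

if __name__ == "__main__":
    lcs = [LocalController("diesel"), LocalController("ups1"), LocalController("ups2")]
    console = OperatorConsole(CentralController(lcs))
    for line in console.send_command({"diesel": 120.0, "ups1": 20.0, "ups2": 30.0}):
        print(line)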

Measuring process

The information required by the proposed monitoring and control system consists of voltage, current, power, energy, and power factor measurements. Real-time information is acquired through intelligent measuring devices located at the output of the energy source, at the input of each load, and at both UPS systems. In this system, the Allen-Bradley Powermonitor 3000 [25] is used to measure these instantaneous values. It implements real-time power monitoring with a selectable update rate of 50 ms. This operating information is displayed in real time for monitoring and energy management purposes.

Communication network

A communication infrastructure is needed between the central controller and the local controllers [23]. The short geographical span of the microgrid may aid establishing a communication infrastructure using low-cost communications. The adoption of standard protocols and open technologies allows designing and developing modular solutions using off-the-shelf, low-cost, widely available, and fully supported hardware and software components.

At present, many low-cost microcontrollers include at least an Ethernet controller, and cheap standalone controllers are also available. The main advantages of using Ethernet are the transition from centralized to distributed control and the reduction in wiring, since there is no need for point-to-point connections. This solution provides flexibility and scalability for low-cost implementations.
Taking this into account, the industrial Ethernet protocol has been chosen as the communication network for data transfer among all the control units in this microgrid. The data exchanged between network controllers mainly consists of messages containing set-points for the LCs, information requests sent by the MGCC to the LCs about active and reactive powers and voltage levels, and messages to control the microgrid switches. Each LC is responsible for collecting local information from the attached energy resource and takes some real-time decisions based on the control algorithm. The communication network of the control system is illustrated in Fig.6. Every communication node has to register with the master server. A node sends its information to the master server through its communication channel. Furthermore, this topology provides immediate control center access via remote consoles and web-based laptops so that any necessary actions can be taken.
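The following sketch shows, under stated assumptions, how a local controller node might register with the master server and push a measurement over plain TCP; the message format, port, and address are illustrative assumptions, not the protocol actually used on the TUT network.

# Hypothetical LC-side client: register with the master server, then push one reading.
import json
import socket

def send_message(host, port, message):
    # Open a TCP connection, send one JSON message, and return the server's reply.
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(json.dumps(message).encode() + b"\n")
        return sock.recv(4096).decode()

if __name__ == "__main__":
    master = ("192.0.2.10", 5020)   # placeholder address of the master server
    print(send_message(*master, {"type": "register", "node": "P1"}))
    reading = {"type": "measurement", "node": "P1",
               "voltage_v": 231.4, "current_a": 12.7, "power_kw": 2.9}
    print(send_message(*master, reading))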
To include new generation resources or storage devices in a flexible manner into the microgrid, multi-agent technologies [40] might be applied. The proposed hierarchical control scheme provides a flexible platform to make high level decisions.
Control Strategy of Operation of the Microgrid
A microgrid may operate either connected to the main grid or disconnected from it. There are two steady states of operation, grid-connected (Mode-G) and islanded (Mode-I). Furthermore, there are two transient modes of operation, transfer from Mode-G to Mode-I and transfer from Mode-I to Mode-G. The key issue of the control is how to maintain the voltage and frequency stability of the microgrid [20].
Grid-connected mode
In the grid-connected operation mode, the main function of a DG unit is to control the output real and reactive power. The real and reactive power generated by a DG can be controlled through current or voltage regulation, thus the DG output power control schemes can be generally categorized as current-based and voltage-based power flow control [43].
During Mode-G operation, the voltage and frequency of the microgrid are set by the main grid. The aim of the uninterruptible power supply systems is to store as much backup energy as possible, so during Mode-G operation the main grid, the microgrid, or both will charge the batteries [20].
In grid-connected mode the balance between the generation and the consumption as well as the control of the parameters of the system is guaranteed by the utility grid. Thus, generators are regulated with the criterion of optimized economic exploitation of the installation [23]. Concerning the programmable generator, the objective of the control is to optimize the microgrid performance.

Islanded mode

The MG operates autonomously, in a similar way to physical islands, when the disconnection from the main grid occurs [37].
When the grid is not present, the ARL2 disconnects the microgrid from the grid, starting the autonomous operation.
The instant at which intentional islanding occurs must be detected so that the inverter can change from grid-connected to intentional-island mode. The detection is achieved using an algorithm described in [23].
When the main distribution network is faulted, fault current will flow continuously from the microgrid into the main grid. At the same time, the circuit breaker of the microgrid should detect the frequency and voltage drop and open in time, which makes the microgrid disconnect automatically from the main grid and change to islanded operation mode. The diesel generator should adopt reasonable control strategies to ensure the stability of frequency and voltage in the microgrid [42].
When switching from Mode-G to Mode-I, the UPS system operates in voltage control mode, setting the voltage and frequency of the microgrid by absorbing or releasing energy.
In islanded mode, due to the unavailability of the utility grid, two requirements must be fulfilled: the power balance between generation and consumption, and the control of the main parameters of the installation (voltage amplitude and frequency). In synchronous islanded mode, the voltage reference is the same as the grid voltage; this mode is also called synchronization mode, and it necessarily precedes a reconnection with the grid. The control system is responsible for assuring the power balance. In case of excess energy, the management system can limit the output power of the diesel generator in order to avoid operation in extremely inefficient low-power generation modes. On the contrary, if all the available power is not enough to feed the local loads, the management system will detach non-critical loads. The control system is voltage controlled and regulates the main parameters of the system.


The UPS systems set the voltage and frequency of the islanded microgrid and maintain them within acceptable limits by injecting or absorbing active and reactive power as required. As soon as the presence of the mains is detected, the microgrid control system uses feedback information from the mains voltage to adjust the energy storage unit's voltage and frequency control loops to synchronize the microgrid voltage with the voltage of the main grid.

Transition from Grid-Connected to Islanded Mode

There are various islanding detection methods proposed for DG systems [44].
As mentioned above, there is a different control strategy when the laboratory-scale microgrid system operates in Mode-G or Mode-I. If there is a transition between these two modes, the control mode of the battery inverter will change. A switching circuit, as shown in Fig.7, is designed to realize this transition [20].
A load-voltage control strategy proposed by [23] is employed to provide the operation of the microgrid.
Disconnection of the microgrid from the grid can be provoked by many causes, such as unsatisfactory grid voltage (in terms of amplitude or waveform) or even economic aspects related to power price. In order to monitor the grid voltage characteristics, a Voltage monitoring module is required. This module continuously measures the rms grid voltage and compares it with a pre-established threshold value. When any of the phase voltages drops below the threshold value (0.9 pu in this case), the detection signal is activated. If, 20 ms after the first detection, this signal is still activated, the microgrid must be disconnected from the utility grid and pass to islanded operation mode; otherwise the microgrid remains connected to the utility grid. In this way unnecessary islandings are avoided and selectivity is respected. A 20 ms time window was chosen after verifying, through experimental tests and standards [47], that a personal computer (considered the most critical residential load in this microgrid) is not affected by a 20 ms voltage interruption. As soon as the microgrid is disconnected from the grid, the programmable generator controller passes from power control mode to voltage control mode. Microgrid power consumption is also continuously measured in order to detach non-critical loads if there is not enough local power available. In addition, if consumption or generation conditions change and it becomes possible to feed all the local loads, non-critical loads will be reconnected.
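To make the detection rule concrete, here is a minimal sketch of the 0.9 pu / 20 ms logic described above, assuming a fixed 1 ms sampling period; the function and constant names are our own and only approximate the behaviour of the Voltage monitoring module.

# Hedged sketch: island only if some phase stays below 0.9 pu for a full 20 ms window.
THRESHOLD_PU = 0.9
CONFIRM_S = 0.020    # 20 ms confirmation window
SAMPLE_S = 0.001     # assumed 1 ms sampling period

def should_island(phase_voltage_samples):
    # phase_voltage_samples: list of (v_a, v_b, v_c) rms values in pu, one tuple per sample.
    below_for = 0.0
    for v_a, v_b, v_c in phase_voltage_samples:
        if min(v_a, v_b, v_c) < THRESHOLD_PU:
            below_for += SAMPLE_S
            if below_for >= CONFIRM_S:
                return True      # fault persisted: disconnect and pass to islanded mode
        else:
            below_for = 0.0      # voltage recovered: avoid an unnecessary islanding
    return False

# A 15 ms dip does not trip the microgrid, while a 25 ms dip does.
dip_15ms = [(0.85, 1.0, 1.0)] * 15 + [(1.0, 1.0, 1.0)] * 10
dip_25ms = [(0.85, 1.0, 1.0)] * 25
print(should_island(dip_15ms), should_island(dip_25ms))   # False True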

Transition from Islanded to Grid-Connected Mode

When the grid-disconnection cause disappears, the transition from islanded to grid-connected mode can be started. To avoid hard transients in the reconnection, the diesel generator has to be synchronized with the grid voltage [23]. The DG is operated in synchronous island mode until both systems are synchronized. Once the voltage in the DG is synchronized with the utility voltage, the DG is reconnected to the grid and the controller will pass from voltage control mode to current control mode.
When the microgrid is working in islanded mode and the ARL2 detects that the voltage outside the microgrid (in the grid) is stable and fault-free, the microgrid has to be resynchronized to the frequency, amplitude, and phase of the grid in order to reconnect it seamlessly.
If the cause of the grid disconnection disappears and the grid voltage fulfils the desired requirements, the transition from islanded to grid-connected mode can be started. The grid voltage conditions are again monitored by the Voltage monitoring module: if the grid voltage exceeds the threshold value, the detection signal is deactivated. If, 20 ms after the first detection, the detection signal is still deactivated, it means that the utility grid has returned to normal operating conditions and the microgrid can reconnect to the grid. However, before the reconnection, the microgrid has to be synchronized with the grid voltage in order to avoid hard transients. To do so, the microgrid operates in synchronous islanded mode for 100 ms, with the aim of decoupling the reference variation from the physical grid reconnection transients. In this operating mode, the voltage in the microgrid is set to the characteristics of the grid voltage: frequency and phase. Once the voltage in the microgrid is synchronized with the utility voltage, the microgrid can be reconnected to the grid, and the programmable generator controller passes from voltage control mode to power control mode. In the same way, if non-critical loads were detached, they are also reconnected.
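The reconnection sequence can likewise be sketched as a simple timed procedure; the 20 ms and 100 ms figures follow the description above, while the helper-function names (grid_voltage_ok, synchronize, close_breaker) are hypothetical hooks rather than parts of the real controller.

# Hedged sketch of the islanded -> grid-connected transition described above.
import time

def reconnect_to_grid(grid_voltage_ok, synchronize, close_breaker):
    # grid_voltage_ok(): True while the grid voltage stays above the threshold.
    # synchronize(): nudges the microgrid voltage toward grid frequency and phase.
    # close_breaker(): physically reconnects the microgrid to the utility grid.

    # 1. Confirm that the grid has stayed healthy for at least 20 ms.
    start = time.monotonic()
    while time.monotonic() - start < 0.020:
        if not grid_voltage_ok():
            return False            # grid dropped out again: remain islanded
        time.sleep(0.001)

    # 2. Operate in synchronous islanded mode for 100 ms to match frequency and phase.
    sync_start = time.monotonic()
    while time.monotonic() - sync_start < 0.100:
        synchronize()
        time.sleep(0.001)

    # 3. Reconnect; the generator controller returns from voltage to power control mode.
    close_breaker()
    return True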

In the presence of unplanned events like faults, microgrid separation from the MV network must occur as fast as possible. However, the switching transient will have great impact on microgrid dynamics.
The microgrid functionalities, as well as its control methods, depend on the mode of operation [23]. Islanding of the MG can take place due to unplanned events, like faults in the MV network, or planned actions, like maintenance requirements. In this case, the local generation profile of the MG can be modified in order to reduce the imbalance between local load and generation and to reduce the disconnection transient [48].
Conclusions
In this paper, the microgrid system installed at the Tallinn University of Technology has been presented. The microgrid includes a diesel generator and battery storage with a power electronic interface.
The architecture of the microgrid for the Tallinn University of Technology and a control system structure for the microgrid were proposed. Design of a control and monitoring system for a microgrid is presented in this paper. A hierarchical control scheme is proposed.
This will enhance the reliability and stability of the microgrid on the one hand, and will make the microgrid an easy-to-use product on the other.
Acknowledgement
This paper was supported by the Project DAR8130 “Doctoral School of Energy and Geotechnology II”.
References

A.M.Borbely,J.F.Krieder, “Distributed generation: the power paradigm for the new millennium,” CRC Press, Boca Raton, Florida, 2001, 388p.
P.Nabuurs, “SmartGrids, European Technology platform,” Strategic Deployment Document for Europe’s Electricity Networks of the Future, September 2008, 68p.
R.Lasseter, “Microgrids,” Proceedings of 2002 IEEE Power Engineering Society Winter Meeting, vol.1, NewYork, NY, 2002, pp.305-308.
B.Kroposki,T.Basso,R.DeBlasio, “Microgrid Standards and Technologies,” Power and Energy Society General Meeting – Conversion and Delivery of Electrical Energy in the 21st Century, 2008, pp.1-4.
P.Mazza, “The Smart Energy Network: Electricity’s Third Great Revolution,” Jun. 2003. [online]. Available: http://www.microplanet.com/upload/pdf/SmartEnergy.pdf, 22p.
J.A.Momoh, “Smart Grid Design for Efficient and Flexible Power Networks Operation and Control,” IEEE Power & Energy Society Power Systems Conference and Exposition, Seattle, Washington, 2009, pp.1-8.
A.Mehrizini-Sani,R.Iravani, “Secondary Control for Microgrids Using Potential Functions: Modeling Issues,” Conference on Power Systems (CIGRECanada2009), Toronto, Canada, 2009, pp.1-9.
A.Mohamed, “Microgrid modeling and online management,” PhD thesis, Helsinki University of Technology, Helsinki, Finland, 2008, 169p.
D.Yubing,G.Yulei,L.Qingmin,W.Hui, “Modelling and Simulation of the Microsources Within a Microgrid,” Electrical Machines and Systems (ICEMS 2008), Jinan, China, 2008, pp.2667-2671.
C.M.Colson,M.H.Nehrir, “A Review of Challenges to Real-Time Power Management of Microgrids,” IEEE Power & Energy Society General Meeting, Calgary, Canada, 2009, pp.1-8.
C.M.Colson,M.H.Nehrir,C.Wang, “Ant Colony Optimization for Microgrid Multi-Objective Power Management,” IEEE Power & Energy Society Power Systems Conference and Exposition, Seattle, Washington, 2009, pp.1-7.
S.Ahn,S.Moon, “Economic Scheduling of Distributed Generators in a Microgrid Considering Various Constraints,” IEEE Power & Energy Society General Meeting, Calgary, Canada, 2009, pp.1-6.
C.A.Hernandez-Aramburo,T.C.Green,N.Mugniot, “Fuel Consumption Minimization of a Microgrid,” Industry Applications, IEEE Transactions, 2005, vol.41, no.3, pp.673-681.
A.Arulampalam,M.Barnes,A.Engler,A.Goodwin,N.Jenkins, “Control of power electronic interfaces in distributed generation Microgrids,” International Journal of Electronics, vol.91, no.9, London, GB, 2004, pp.503-524.
F.Pilo,G.Pisano,G.G.Soma, “Neural Implementation of MicroGrid Central Controllers,” IEEE International Conference on Industrial Informatics, New York, 2007, pp.1177-1182.
R.H.Lasseter,P.Piagi, “Control and Design of Microgrid Components,” Final Project Report – Power Systems Engineering Research Center (PSERC-06-03), 2006, p. 257.
P.Piagi,R.H.Lasseter, “Autonomous Control of Microgrids,” IEEE Power Engineering Society General Meeting, Montreal, Canada, 2006, pp.1-8.
F.Z.Peng,Y.W.Li,L.M.Tolbert, “Control and Protection of Power Electronics Interfaced Distributed Generation Systems in a Customer-Driven Microgrid,” IEEE Power & Energy Society General Meeting (PESGM 2009), Calgary, Canada, 2009, pp.1-8.
R.H.Lasseter,P.Piagi, “Microgrid: A Conceptual Solution,” IEEE 35th Power Electronics Specialists Conference (PESC2004), vol.6, Aachen, Germany, 2004, pp.4285-4290.
Y.Che,Z.Yang,K.W.EricCheng, “Construction, Operation and Control of a Laboratory-Scale Microgrid,” 3rd International Conference Power Electronics Systems and Applications, (PESA2009), 2009, pp.1-5.
R.Lasseter,A.Akhil,C.Marnay,J.Stephens,J.Dagle,R.Guttromson,A.S.Meliopoulous,R.Yinger,J.Eto, “The CERTS MicroGrid Concept,” CEC Consultant Report P500-03-089F. Sacramento, CA: California Energy Commission, 2003, 32p.
M.Adamiak,S.Bose,Y.Liu,J.Bahei-Eldin,J.DeBedout, “Tieline Controls in Microgrid Applications,” Bulk Power System Dynamics and Control – VII. Revitalizing Operational Reliability, 2007 REP Symposium, 2007, pp.1-9.
H.Gaztanaga,I.Etxeberria-Otadui,S.Bacha,D.Roye, “Real-Time Analysis of the Control Structure and Management Functions of a Hybrid Microgrid System,” IEEE 32nd Annual Conference Industrial Electronics, (IECON2006), 2006, pp.5137-5142.
A.Rööp(editor,reviser), “Annual Report 2008 Department of Electrical Drives and Power Electronics,” Tallinn: TUT Publishing, Estonia, 2009, 74p.
http://www.ab.com/PEMS/pm3000.html
http://www.rockwellautomation.com/rockwellsoftware/assetmgmt/energymetrix/sysreq.html
http://www.ab.com/programmablecontrol/pac/controllogix/
Design and Implementation of a Control System for a Microgrid involving a Fuel Cell Power Module
A. P. Agalgaonkar, S. V. Kulkarni, S. A. Khaparde, and S. A. Soman, “Placement and Penetration of Distributed Generation under Standard Market Design,” International Journal of Emerging Electric Power Systems, Volume 1, Issue 1 2004; Article 1004
TOWARDS A SMART NETWORK IN A BUSINESS DISTRICT. COMBINING DISPERSED UPS WITH DISTRIBUTED GENERATION
Designing the Optimal Stand alone Power System which uses Wind Power and Solar Radiation for Remote Area Object
Placement and Penetration of Distributed Generation under Standard Market Design
Off-Grid Diesel Power Plant Efficiency Optimization and Integration of Renewable Energy Sources
Model. Validation and Coordinated Operation of a Photovoltaic Array and a Diesel Power Plant for Distributed Generation
Distributed monitoring and control of future power systems via g

 

Replica Synchronization in Distributed File System

ABSTRACT – The MapReduce framework provides a scalable model for large-scale data-intensive computing and fault tolerance. In this paper, we propose an algorithm to improve the I/O performance of distributed file systems. The technique is used to reduce the communication bandwidth and increase performance in the distributed file system. These challenges are addressed in the proposed algorithm by using adaptive replica synchronization. Adaptive replica synchronization among the storage servers relies on a chunk list which holds the information about the relevant chunk replicas. The proposed algorithm improves the I/O data rate for write-intensive workloads. Our experiments show that the proposed algorithm achieves good I/O performance with fewer synchronization operations.
Index terms – Big data, distributed file system, MapReduce, adaptive replica synchronization

INTRODUCTION

A distributed file system is a file system that uses a distributed environment to improve performance and system scalability [1]. It consists of many I/O devices holding chunks of data files across the nodes. The client sends a request to the metadata server (MDS), which manages the whole system, to obtain permission to access a file. The client then accesses the corresponding storage server, which handles the data management, to perform the actual operation authorized by the MDS.
In the distributed file system, the MDS manages all the information about the chunk replicas, and replica synchronization is triggered when any one of the replicas has been updated [2]. When data are updated in the file system, the newly written data are stored on disk, which becomes a bottleneck. To solve this problem, we use adaptive replica synchronization in the MDS.


MapReduce is a programming primitive: the programmer maps the input set to intermediate outputs, and those outputs are sent to the reducer to obtain the final result. A MapReduce function is written as if for a single node, and it is parallelized and synchronized by the MapReduce framework [3]. The distributed programming model performs the work of data splitting, synchronization, and fault tolerance. The MapReduce framework is a programming model, with an associated implementation, for processing large data sets with a distributed and parallel algorithm on a cluster of nodes.
Hadoop MapReduce is a framework for developing applications that can process large amounts of data, up to multi-terabyte data sets, in parallel on large clusters of thousands of commodity nodes in a highly fault-tolerant and reliable manner. The input and output of a MapReduce job are stored in the Hadoop Distributed File System (HDFS).
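To ground the paradigm, the following is a minimal in-process imitation of the map and reduce phases in Python; it sketches only the programming model (word counting), not Hadoop's distributed implementation.

# Word count expressed as map and reduce phases, run sequentially for illustration.
from collections import defaultdict

def map_phase(document):
    # Emit (word, 1) pairs, as a Hadoop mapper would.
    return [(word, 1) for word in document.split()]

def reduce_phase(pairs):
    # Group by key and sum the counts, as a Hadoop reducer would.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

if __name__ == "__main__":
    docs = ["big data needs parallel processing", "parallel data sorting"]
    intermediate = [pair for doc in docs for pair in map_phase(doc)]
    print(reduce_phase(intermediate))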

RELATED WORKS

GPFS [4] allocates space for multiple copies of data on different storage servers, supports chunk replication, and writes updates to all locations. GPFS keeps track of which files have been updated, from the chunk replica back to the primary storage server. Ceph [5] has a similar replica synchronization mechanism: the newly written data is sent to all the replicas stored on different storage servers before responding to the client. In the Hadoop File System [6], large data is split into chunks, which are replicated and stored on storage servers; the copies of each stripe are stored on the storage servers and maintained by the MDS, so replica synchronization is handled by the MDS whenever new data is written to the replicas. In GFS [7], there are various chunk servers, and the MDS manages the data location and layout. For reliability, the chunks are replicated on multiple chunk servers, and replica synchronization is done by the MDS. The Lustre file system [8], a well-known parallel file system, also has a replication mechanism.
For better performance, MosaStore [9] uses dynamic replication for data reliability. When the application creates a new data block, the MosaStore client stores the block at one of the SSs, and the MDS replicates the new block to the other SSs to avoid a bottleneck at block creation. Replica synchronization is done in the MDS of MosaStore.
The Gfarm file system [10] uses a replication mechanism for data reliability and availability. In this distributed and parallel file system, the MDS controls data replication and sends the data to the storage servers, which puts pressure on the MDS. Data replication has the benefit of supporting better data access where the data is required and of providing data consistency. In the parallel file system of [11], data replication improves I/O throughput, data durability, and availability. In that proposed mechanism, access patterns are analysed and replication is performed according to a cost analysis, but replica synchronization is still done in the MDS.
In the PARTE file system [12], metadata file parts are replicated to the storage servers to improve metadata availability for high service availability. In detail, the PARTE file system distributes and replicates the corresponding metadata in chunks on the storage servers, and the client-side file system keeps some of the metadata requests that have been sent to the server. If the active MDS crashes for any reason, these client backup requests are used by the standby MDS to restore the metadata lost during the crash.
III. PROPOSED SYSTEM OVERVIEW
The adaptive replica synchronization mechanism is used to improve the I/O throughput, communication bandwidth, and overall performance of the distributed file system. The MDS manages the metadata in the distributed file system, in which large data files are split into chunks and replicated.
The main reason for using adaptive replica synchronization is that a single storage server cannot withstand a large number of concurrent read requests to a specific replica; adaptive replica synchronization is therefore triggered to propagate the up-to-date chunk data to the other related SSs in the Hadoop distributed file system [13][5]. Adaptive replica synchronization is performed to satisfy heavy concurrent reads when the access frequency to the target replica exceeds a predefined threshold. This mechanism, carried out among the SSs, is intended to enhance the performance of the I/O subsystem.

Fig. 1: Architecture of the replica synchronization mechanism
A. Big data Preparation and Distributed data Storage
Configure the storage servers in a distributed storage environment. The Hadoop distributed file system consists of the big data itself, metadata servers (MDS), a number of replicas, and storage servers (SS). The file system is configured from these components with proper communication among them. We prepare a social-network big data set consisting of each user's id, name, status, and updates. After preparation, the data set is stored on the distributed storage servers.
 
B. Data update in distributed storage
The user communicates with the distributed storage servers to access the big data through a storage server (SS). Based on the user's query, the big data is updated in the distributed storage database, and the updated data is written back to the storage server.
C. Chunk list replication to storage servers
The chunk list holds all the information about the replicas that belong to the same chunk file and are stored on the SSs. The primary storage server, which holds the newly updated chunk replica, conducts adaptive replica synchronization when a large number of read requests arrive concurrently within a short time, so that these requests can be satisfied with minimum overhead.
D. Adaptive replica synchronization
The proposed mechanism does not synchronize the other replicas immediately when one replica is modified. Adaptive replica synchronization improves I/O subsystem performance by reducing write latency, and it also improves the effectiveness of replica synchronization because the target chunk might be written again in the near future; in other words, the other replicas do not need to be updated until adaptive replica synchronization is triggered by the primary storage server.
In the distributed file system, adaptive replica synchronization is used to increase performance and reduce communication bandwidth under a large number of concurrent read requests. The mechanism works as follows. First, the chunk is created and saved on the storage servers. Second, a write request is sent to one of the replicas, after which its version and count are updated; the other SSs update the corresponding flag in their chunk lists and reply with an ACK. Next, read/write requests may still be directed to the other, now outdated, replicas; meanwhile, the primary SS handles all requests to the target chunk, increments a counter on every read operation, and computes the access frequency. In addition, the remaining replica synchronization for updated chunks that are not hot-spot objects after data modification is conducted while the SSs are not as busy as in working hours. As a result, better I/O bandwidth can be obtained with minimum synchronization overhead. The proposed procedure is shown in the algorithm below.
ALGORITHM: Adaptive replica synchronization
Precondition and Initialization:
1) MDS handles replica management without synchronization, such as creating a new replica;
2) Initialize [Replica Location], [Dirty], [cnt], and [ver] in the Chunk List when the relevant chunk replicas have been created.
Iteration:
1: while Storage server is active do
2: if An access request to the chunk then
3: /* Other replica has been updated */
4: if [Dirty] == 1 then
5: Return the latest Replica Status;
6: break;
7: end if
8: if Write request received then
9: [ver] ← I/O request ID;
10: Broadcast Update Chunk List Request;
11: Conduct write operation;
12: if Receiving ACK to Update Request then
13: Initialize read count
14: [cnt] ← 1;
15: else
16: /* Revoke content updates */
17: Undo the write operation;
18: Recover its own Chunk List;
19: end if
20: break;
21: end if
22: if Read request received then
23: Conduct read operation;
24: if [cnt] > 0 then
25: [cnt] ← [cnt] + 1;
26: Compute [Freq]
27: if [Freq] >= Configured Threshold then
28: Issue adaptive replica synchronization;
29: end if
30: end if
31: end if
32: else
33: if Update Chunk List Request received then
34: Update Chunk List and send ACK;
35: [Dirty] ← 1; break;
36: end if
37: if Synchronization Request received then
38: Conduct replica synchronization;
39: end if
40: end if
41: end while
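As a rough illustration of the trigger logic in the algorithm above, the following Python sketch models only the primary storage server's bookkeeping. The class, its attributes, and the definition of access frequency (reads per second since the last write) are assumptions made for illustration; the paper does not prescribe them, and a real implementation would issue RPCs to the MDS and the other SSs.

import time

# Simplified, single-process sketch of the primary SS trigger logic.
# Names and the frequency definition are illustrative assumptions.
class PrimaryChunkReplica:
    def __init__(self, sync_threshold=100.0):
        self.ver = 0              # [ver]: version of the primary replica
        self.cnt = 0              # [cnt]: reads served since the last write
        self.last_write = time.time()
        self.sync_threshold = sync_threshold  # reads/second that trigger sync
        self.peers_dirty = False  # True once peers hold an outdated copy

    def write(self, request_id, data):
        self.ver = request_id
        self.peers_dirty = True   # broadcast an Update Chunk List request here
        self.cnt = 1              # initialize the read count after the write
        self.last_write = time.time()
        # ... perform the actual write of `data` to local storage ...

    def read(self):
        if self.cnt > 0:
            self.cnt += 1
            elapsed = max(time.time() - self.last_write, 1e-6)
            freq = self.cnt / elapsed
            if freq >= self.sync_threshold and self.peers_dirty:
                self.synchronize_peers()
        # ... return the locally stored chunk data ...

    def synchronize_peers(self):
        # Push the up-to-date chunk to the other SSs (adaptive synchronization).
        self.peers_dirty = False

replica = PrimaryChunkReplica(sync_threshold=100.0)
replica.write(request_id=1, data=b"chunk bytes")
for _ in range(5):
    replica.read()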
IV. PERFORMANCE RESULTS
When the replica of the target chunk has been modified, the primary SS retransmits the update to the other relevant replicas. Write latency is the time required for each write; with the proposed adaptive replica synchronization mechanism, the write latency is measured for different written data sizes.

Fig. 2: Write latency
With adaptive replica synchronization we measure the read and write throughput of the file system. We evaluate both the I/O data rate and the processing time of metadata operations.

Fig. 3: I/O data throughput
V. CONCLUSION
In this paper we have presented an efficient algorithm for processing a large number of concurrent requests in a distributed file system, increasing performance and reducing I/O communication bandwidth. Our adaptive replica synchronization approach is applicable to distributed file systems and achieves a performance enhancement, improving I/O data bandwidth with less synchronization overhead. Furthermore, the main contribution is improved feasibility, efficiency, and applicability compared to other synchronization algorithms. In future work, we can extend the analysis by enhancing the robustness of the chunk list.
REFERENCES
[1] E. Dede, Z. Fadika, Madhusudhan, and L. Ramakrishnan, “Benchmarking MapReduce implementations under different application scenarios,” Grid and Cloud Computing Research Laboratory, Department of Computer Science, State University of New York (SUNY) at Binghamton, and Lawrence Berkeley National Laboratory.
[2] N. Nieuwejaar and D. Kotz, “The galley parallel file system,” Parallel Comput., vol. 23, no. 4/5, pp. 447–476, Jun. 1997.
[3] K. Shvachko, H. Kuang, S. Radia, and R. Chansler, “The Hadoop distributed file system,” in Proc. 26th IEEE Symp. MSST, 2010, pp. 1–10.
[4] MPI Forum, “MPI: A Message-Passing Interface Standard,” 1994.
[5] F. Schmuck and R. Haskin, “GPFS: A shared-disk file system for large computing clusters,” in Proc. Conf. FAST, 2002, pp. 231–244, USENIX Association.
[6] S. Weil, S. Brandt, E. Miller, D. Long, and C. Maltzahn, “Ceph: A scalable, high-performance distributed file system,” in Proc. 7th Symp. OSDI, 2006, pp. 307–320, USENIX Association.
[7] W. Tantisiriroj, S. Patil, G. Gibson, S. Son, and S. J. Lang, “On the duality of data-intensive file system design: Reconciling HDFS and PVFS,” in Proc. SC, 2011, p. 67.
[8] S. Ghemawat, H. Gobioff, and S. Leung, “The Google file system,” in Proc. 19th ACM SOSP, 2003, pp. 29–43.
[9] The Lustre file system. [Online]. Available: http://www.lustre.org
[10] E. Vairavanathan, S. AlKiswany, L. Costa, Z. Zhang, D. S. Katz, M. Wilde, and M. Ripeanu, “A workflow-aware storage system: An opportunity study,” in Proc. Int. Symp. CCGrid, Ottawa, ON, Canada, 2012, pp. 326–334.
[11] Gfarm File System. [Online]. Available: http://datafarm.apgrid.org/
[12] A. Gharaibeh and M. Ripeanu, “Exploring data reliability tradeoffs in replicated storage systems,” in Proc. HPDC, 2009, pp. 217–226.
[13] J. Liao and Y. Ishikawa, “Partial replication of metadata to achieve high metadata availability in parallel file systems,” in Proc. 41st ICPP, 2012, pp. 168–1.
 

A Survey on Parallel and Distributed Deep Learning

Abstract
Deep learning is considered as one of the most remarkable machine learning techniques in recent years. It has achieved great success in many applications, such as image analysis, speech recognition, and text understanding. It uses supervised and unsupervised methods for the tasks of classification and pattern recognition to learn multi-level representations and features in hierarchical architectures. The recent development in parallel and distributed technologies has enabled the processing of big data using deep learning. Although big data offers great opportunities for a wide range of areas including e-commerce, industrial control, and smart medicine, it poses many challenging issues in data mining and information processing due to its high size, wide variety and high speed. Deep learning has played a major role in big data applications over the past few years.
In this paper, we review the emerging research on deep learning models that use parallel and distributed tools, resources, algorithms, and techniques. Furthermore, we point out the remaining challenges of parallel and distributed deep learning and discuss future topics.
Index Terms – Deep learning, parallel, distributed, high-performance computing, GPU
I. INTRODUCTION
Machine learning and, in particular, deep learning are rapidly taking on a range of dimensions of our daily lives. Inspired by the integrated nature of the human brain, the Deep Neural Network (DNN) is at the core of deep learning. Properly trained, the expressiveness of DNNs offers precise solutions, purely by analyzing large amounts of data, to problems previously thought unsolvable. Deep learning has been successfully applied in a multitude of fields, ranging from image classification, speech recognition, and medical diagnosis to autonomous driving and defeating human players in complex games. As the size of data sets and the complexity of DNNs increase, the computation and storage requirements of deep learning increase proportionately. A high-performance computing cluster is required to train a DNN to today's competitive accuracy. Different aspects of DNN training and inference (evaluation) have been adapted to exploit the concurrency these systems offer.


Parallel processors such as GPUs have played an important role in the practical implementation of deep neural networks. The computations that arise when training and using deep neural networks lend themselves naturally to efficient parallel implementation. The performance of these parallel implementations enables researchers to test networks that are significantly larger and to train them on larger datasets. This has helped to greatly enhance accuracy on tasks such as speech recognition and object identification, among other things. A recognized drawback of the GPU approach is that the training speedup is limited when the model does not fit in GPU memory. Researchers therefore often reduce the size of the data or parameters so that the GPU is used efficiently and CPU-to-GPU transfers are not a major bottleneck. Although data and parameter reduction works well for small problems (e.g. acoustic modeling for speech recognition), it is less attractive for problems with a large number of examples and dimensions (e.g. high-resolution images).
Nonetheless, there are two major challenges in designing a distributed deep learning model. First, deep learning models have a large number of parameters that must be synchronized across nodes when modified, which causes substantial communication overhead. Consequently, scalability, in terms of the training time needed to reach a given accuracy, is a concern. Second, it is non-trivial for programmers to build and train models with deep and complex system structures. In addition, distributed training increases the burden on programmers, e.g. partitioning the data and the model, and handling communication over the network. This problem is compounded by the fact that many data scientists who work with deep learning models have little deep learning systems experience.
In this study, we discuss a variety of topics, ranging from vectorization to the efficient use of supercomputers, in the context of parallel and distributed deep learning. Parallel strategies are presented and evaluated across multiple research efforts and existing tools, as well as extensions to training algorithms and systems that support distributed environments.
II. RELATED WORKS
Some field studies focus on deep learning applications [1], neural networks and their history [2], [3], [4], [5], deep learning scaling [6], and DNN hardware architecture [7], [8], [9]. Specifically, three surveys [2], [4], [5] describe DNNs and the origins of deep learning methodologies from a historical perspective, as well as discuss the possible capabilities of DNNs w.r.t. training and representational energy. [4], [5] also describe in detail the methods of optimization and the methods of regularization. Bengio [6] discusses the scaling of deep learning from different perspectives, focusing on models, algorithms for optimization, and datasets. The paper also looks at some aspects of distributed computing, including sparse and asynchronous communication. Hardware design surveys focus primarily on the computational learning side and not on optimization. It includes a recent survey [9] which analyses DNN operator computing techniques (layer types) and mapping hardware computations, leveraging inherent parallelism. The survey also provides a discussion on increasing data representation (e.g. through quantization) to reduce the overall bandwidth of hardware memory. Other surveys focus on accelerators for traditional neural networks [7] and the use of FPGAs in deep learning [8].
III. TERMINOLOGY
This section sets out the theoretical conventions and naming used for the material presented in the survey. We first discuss the supervised learning problem class, followed by the relevant foundations of parallel programming.
A. Deep Neural Network
An artificial neural network (ANN) with multiple layers between the input and output layers is a deep neural network (DNN). The DNN finds the correct mathematical manipulation, whether a linear or a non-linear relationship, to turn the input into an output. The data passes through the layers, and the network computes the probability of each output. For example, a DNN trained to recognize dog breeds will go over the given image and determine the probability that the dog in the picture belongs to a certain breed. The user can review the results, select which probabilities the network should display (above a certain threshold, etc.), and return the proposed label. Each such mathematical manipulation is considered a layer, and complex DNNs have many layers, hence the name “deep” networks.
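As a toy illustration of the probability-thresholding step described above, the following Python snippet applies a softmax to made-up output scores and keeps only the labels whose probability clears a threshold; the class names and values are invented for the example.

import numpy as np

# Toy illustration of thresholding a network's output probabilities.
logits = np.array([2.1, 0.3, -1.0, 1.4])          # made-up raw outputs for four breeds
probs = np.exp(logits) / np.exp(logits).sum()      # softmax over the output layer
classes = ["labrador", "poodle", "beagle", "husky"]

threshold = 0.2
predictions = {c: float(p) for c, p in zip(classes, probs) if p >= threshold}
print(predictions)  # only breeds whose probability clears the threshold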
B. Parallel Computer Architecture
1) Multi-core Computing: A multi-core processor is a processor on the same chip that includes multiple processing units (known as “core”). From multiple instruction streams, a multi-core processor can issue multiple instructions per clock cycle. Each core in a multi-core processor can theoretically be superscalar, i.e. each core can issue multiple instructions from a single thread on every clock cycle.
2) Symmetric Multiprocessing: A symmetric multiprocessor (SMP) is a multi-processor computer system in which the processors share memory and are linked through a bus. Bus contention prevents bus architectures from scaling; as a result, SMPs generally have no more than 32 processors. Due to the small size of the processors and the significant reduction in bus bandwidth requirements achieved by large caches, such symmetric multiprocessors are extremely cost-effective, as long as there is enough memory bandwidth.
3) Distributed Computing: A distributed computer (also known as a distributed-memory multiprocessor) is a distributed computer system in which a network connects the processing elements. Distributed computers are highly scalable. There is a lot of overlap between the terms “concurrent computing,” “parallel computing,” and “distributed computing,” and there is no clear distinction between them. The same system can be characterized as “parallel” as well as “distributed;” the processors of a typical distributed system run concurrently.

Fig. 1: Single machine and distributed system structure
C. Parallel Programming
The programming techniques used on parallel computers to implement parallel learning algorithms depend on the target architecture. They range from simple, single-machine threaded implementations to OpenMP. Accelerators are usually programmed with special languages such as NVIDIA's CUDA, OpenCL, or, in the case of FPGAs, hardware design languages. However, the details are often hidden behind library calls (e.g. cuDNN or MKL-DNN) that implement the time-consuming primitives. On multiple machines with distributed memory, it is possible to use simple communication mechanisms such as TCP/IP or Remote Direct Memory Access (RDMA). It is also possible to use more convenient libraries such as the Message Passing Interface (MPI) or Apache Spark on distributed-memory machines. MPI is a low-level library that aims to deliver portable performance, while Spark is a higher-level framework that focuses more on programmer productivity.
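As a small illustration of the kind of communication primitive MPI provides, the following sketch averages a locally computed gradient across workers with an allreduce, using the mpi4py binding; the gradient values are synthetic, and the script is meant to be launched with mpirun.

import numpy as np
from mpi4py import MPI  # run with e.g. `mpirun -n 4 python this_file.py`

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Each worker computes a (synthetic) local gradient on its own data shard.
local_grad = np.full(4, float(rank))

# Sum the gradients across all workers, then divide to get the average.
global_grad = np.empty_like(local_grad)
comm.Allreduce(local_grad, global_grad, op=MPI.SUM)
global_grad /= size

if rank == 0:
    print("averaged gradient:", global_grad)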
IV. OPERATOR CONCURRENCY
There are several ways to parallelize the execution of a neural network layer. In most cases the computations (for example, in pooling operators) can be parallelized directly. For other types of operators, however, the computations need to be reshaped in order to expose parallelism. Below, we describe a concurrency analysis of three popular operators.
A. Performance Modeling
It is difficult to estimate the runtime of a single DNN operator, let alone a whole network, even with work and depth models. Nevertheless, with performance modeling, several works manage to approximate the runtime of a given DNN. Using measured values as a lookup table, it was possible to predict computation and backpropagation time with 5–19 percent error across many different batch sizes, even on GPU clusters with asynchronous communication [10]. The same was done for CPUs in a distributed system [11], using a similar approach. Paleo [12] derives a performance model from operation counts alone (with a prediction error of 10–30 percent), and Pervasive CNN [13] uses performance modeling to pick reduced-precision networks that meet users' real-time needs.
B. Fully Connected Layers
A fully connected layer can be expressed and modeled as a matrix multiplication of the weights and the neuron values (one column per batch sample). To that end, efficient linear algebra libraries such as cuBLAS [14] and MKL [15] can be used. [16] presents a variety of methods for further optimizing CPU implementations of fully connected layers; in particular, it shows efficient loop construction, vectorization, blocking, unrolling, and batching. The paper also demonstrates how weights can be quantized to use fixed-point instead of floating-point arithmetic.
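The following minimal NumPy sketch illustrates this formulation: the whole fully connected layer, over a batch of samples, reduces to a single matrix multiplication plus a bias and an elementwise activation. The shapes and values are illustrative.

import numpy as np

# A fully connected layer expressed as one matrix multiplication
# (one column per batch sample), as described above.
batch, n_in, n_out = 16, 128, 64
X = np.random.randn(n_in, batch)        # neuron values, one column per sample
W = np.random.randn(n_out, n_in)        # weights
b = np.random.randn(n_out, 1)           # bias

Y = W @ X + b                            # the whole layer is one GEMM plus bias
Y = np.maximum(Y, 0.0)                   # elementwise activation (ReLU)
print(Y.shape)                           # (64, 16)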
C. Convolution
The bulk of the computations involved in DNN training and inference are convolutions. As such, significant efforts have been made by the research community and industry to optimize their computation on all platforms. While a convolution operator can be computed directly from its definition, this does not make full use of the capabilities of vector processors and multi-core architectures, which are oriented towards many parallel multiply-accumulate operations. Utilization can be increased by ordering operations to maximize data reuse [17], by adding data redundancy, or through basis transformation. The first use of unrolled convolutions in CNNs [18] targeted both CPUs and GPUs for training. The approach was subsequently popularized by [19], reshaping images from 3D tensors to 2D matrices: each 1D row of the matrix contains an unrolled 2D patch that would normally be convolved (possibly overlapping with its neighbours), which introduces redundant data. The convolution kernels are then stored as a second 2D matrix, where each column represents one unrolled kernel (one convolution filter). Multiplying these two matrices yields a matrix containing the convolved tensor in 2D format, which can be reshaped to 3D for subsequent operations. Note that this operation generalizes to 4D tensors (a whole batch), converting it into a single matrix multiplication. DNN primitive libraries such as cuDNN [20] provide a variety of convolution algorithms and data layouts. To help users select an algorithm, these libraries include functions that choose the best-performing algorithm given the tensor sizes and memory constraints; the libraries can also run all methods internally and pick the fastest one.
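A minimal NumPy sketch of the unrolling (im2col) idea follows: each patch of a single-channel image is flattened into a row, so the convolution becomes one matrix product. It is a didactic illustration, not the layout or algorithm any particular library uses.

import numpy as np

# Minimal im2col-style convolution for a single-channel image: each row of
# `cols` is one unrolled patch, so the convolution becomes a matrix product.
def im2col_conv2d(image, kernel):
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    cols = np.empty((oh * ow, kh * kw))
    for i in range(oh):
        for j in range(ow):
            cols[i * ow + j] = image[i:i + kh, j:j + kw].ravel()
    out = cols @ kernel.ravel()          # one matrix product replaces the convolution
    return out.reshape(oh, ow)

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0           # simple averaging filter
print(im2col_conv2d(image, kernel))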
D. Recurrent Units
The complex gate systems within RNN units contain multiple operations, each involving a small matrix multiplication or an element-wise operation. For this reason, these layers have traditionally been implemented as a series of high-level operations, such as GEMMs. Nevertheless, these layers can be accelerated further. Moreover, since RNN units are usually chained together (forming consecutive recurrent layers), two types of concurrency can be considered: within the same layer and between consecutive layers. [21] describes several GPU optimizations. The first optimization fuses all computations (GEMMs and otherwise) into one function (kernel), avoiding scratch-pad memory for intermediate results. This both decreases kernel scheduling overhead and reduces round trips to global memory by using the massively parallel GPU's multi-level memory hierarchy. Further enhancements involve pre-transposing matrices and letting the GPU execute independent recurrent units concurrently on different multiprocessors. Concurrency between layers is achieved through a parallel pipeline, with which [21] implements stacked RNN unit computations, propagating through the next layer as soon as its data dependencies are met. Overall, these optimizations result in an 11x performance increase over the high-level implementation. From the memory-consumption perspective, dynamic programming [22] was proposed for RNNs to balance between caching intermediate results and recomputing forward inferences for backpropagation.
V. NETWORK CONCURRENCY
The high average parallelism of neural networks can be used not only to compute individual operators efficiently but also to evaluate the entire network concurrently along various dimensions. Below, we analyze three influential partitioning techniques, together with a hybrid variant.
A. Data Parallelism
A simple parallel approach is to split the work of a batch of samples between multiple computational resources (cores or devices). This approach (initially called pattern parallelism, because input samples were called patterns) dates back to the first practical implementations of artificial neural networks [23]. Multiple vector accelerator microprocessors (Spert-II) were used by [24] to parallelize error backpropagation for neural network training. The paper presents a version of delayed gradient updates called “bunch mode” to support data parallelism, in which the gradient is updated several times before the weights are updated. [25] performed one of the earliest mappings of DNN computations to data-parallel architectures (e.g., GPUs); when training Restricted Boltzmann Machines, the paper shows a speedup of up to 72.6x over a CPU. Today, the vast majority of deep learning frameworks support data parallelism, using a single GPU, multiple GPUs, or a cluster of multi-GPU nodes. Additional data-parallel methods have been suggested in the literature. In ParallelSGD [26], SGD is run (possibly with minibatches) k times in parallel, splitting the data set between the processors; after all SGD instances converge, the resulting weights are aggregated and averaged. ParallelSGD was developed using the MapReduce programming model [27]. With MapReduce it is easy to schedule parallel tasks on multiple processors and in distributed environments. While the MapReduce model initially succeeded in deep learning, its generality hindered DNN-specific optimizations. Current implementations therefore use high-performance communication interfaces (e.g., MPI) to implement fine-grained parallelism features.
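The following sketch imitates the ParallelSGD scheme in plain NumPy under simplifying assumptions: k “workers” (here just sequential function calls) each run SGD on their own shard of a synthetic linear-regression problem, and the resulting weights are averaged at the end; a real system would run the workers on separate nodes and communicate over the network.

import numpy as np

# Sketch of ParallelSGD-style data parallelism: each worker runs SGD on its
# own shard and the resulting weights are averaged at the end.
def sgd_worker(X, y, epochs=50, lr=0.1):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            grad = (xi @ w - yi) * xi        # gradient of the squared error
            w -= lr * grad
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5])           # synthetic linear target

k = 4                                        # number of "workers"
shards = zip(np.array_split(X, k), np.array_split(y, k))
weights = [sgd_worker(Xs, ys) for Xs, ys in shards]
w_avg = np.mean(weights, axis=0)             # aggregate and average
print(w_avg)                                  # close to [1.0, -2.0, 0.5]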
B. Model Parallelism
The second partitioning technique for DNN training is model parallelism, also known as network parallelism. This technique partitions the work according to the neurons in each layer. In this case, the sample batch is copied to all processors and different parts of the DNN are computed on different processors, which saves memory (because the entire network is not stored in one place) but incurs additional communication after each layer. [28] proposed incorporating redundant computations into neural networks in order to reduce communication costs in fully connected layers. In particular, the proposed method partitions an NN in such a way that each processor is responsible for twice the neurons (with overlap), thus requiring more computation but less communication. Another approach proposed to reduce communication in fully connected layers is to use Cannon's matrix multiplication algorithm, modified for DNNs [29]. The paper reports that Cannon's algorithm produces better efficiency and speedups than simple partitioning on small-scale multilayer fully connected networks.
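A toy NumPy sketch of splitting one fully connected layer by output neurons follows: the batch is copied to both “devices,” each computes its slice of the output, and the slices are gathered. The split into two parts is an assumption made for illustration.

import numpy as np

# Layer-wise model parallelism: the weight matrix of one fully connected layer
# is split by output neurons across two "devices"; each computes its slice.
n_in, n_out, batch = 8, 6, 4
X = np.random.randn(batch, n_in)              # the batch is copied to both devices
W = np.random.randn(n_in, n_out)
W_dev0, W_dev1 = np.hsplit(W, 2)              # each device owns half the neurons

y_dev0 = X @ W_dev0                            # computed "on device 0"
y_dev1 = X @ W_dev1                            # computed "on device 1"
Y = np.concatenate([y_dev0, y_dev1], axis=1)   # gather step (communication)
assert np.allclose(Y, X @ W)                   # same result as the unsplit layer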

Fig. 2: Neural Network Parallelism Schemes
C. Pipelining
In deep learning, pipelining can refer either to overlapping computations between one layer and the next (as data becomes ready), or to dividing the DNN by depth and assigning layers to specific processors. Pipelining can be viewed as a form of data parallelism, since elements (samples) are processed through the network in parallel, but also as model parallelism, since the DNN structure determines the length of the pipeline. The first type of pipelining can be used to overlap forward evaluation, backpropagation, and weight updates. This scheme is commonly used in training [30], [31], [32], and improves utilization by reducing processor idle time.
D. Hybrid Parallelism
Combining multiple parallelism schemes can overcome the drawbacks of each. Below we look at successful examples of such hybrids. One approach applies data parallelism to the convolutional layers and model parallelism to the fully connected portion. Using this hybrid approach, it is possible to achieve a speedup of up to 6.25x over a single GPU when using 8 GPUs, with less than 1 percent loss of accuracy (due to an increase in batch size). AMPNet [33] is an asynchronous implementation of CPU DNN training that uses an intermediate representation to implement fine-grained model parallelism. In particular, internal parallel tasks within and between layers are defined and scheduled asynchronously. In addition, asynchronous execution of dynamic control flow enables the tasks of forward evaluation, backpropagation, and weight updating to be pipelined. Finally, the DistBelief [34] distributed deep learning system combines all three parallelism strategies. Training is conducted simultaneously on multiple model replicas, where each replica is trained on different samples (data parallelism). Within each replica, the DNN is distributed both by neurons in the same layer (model parallelism) and across different layers (pipelining).
VI. TRAINING CONCURRENCY
So far we have addressed training algorithms in which there is only one copy of the model parameters, and all processors can see its up-to-date value directly. In distributed environments there may be multiple instances of training agents operating independently, and therefore the overall algorithm has to be adapted.
A. Model Consistency
Directly splitting computations between nodes produces a distributed form of data parallelism in which all nodes have to communicate their updates to the others before a new batch is retrieved. This imposes a large overhead on the overall system, hampering the scaling of training. The HOGWILD shared-memory algorithm [35] is a well-known instance of relaxed consistency, which enables training agents to read parameters and update gradients at will, overwriting existing progress. Stale-Synchronous Parallelism (SSP) [36] proposes a compromise between consistent and inconsistent models in order to provide accuracy guarantees despite asynchrony.
B. Parameter Distribution
For distributed deep learning, there are two general ways to conserve communication bandwidth: compressing parameters with efficient data representations, and avoiding sending unnecessary information entirely, resulting in sparse data structures being communicated. The methods in the former category are orthogonal to the network infrastructure, whereas the methods in the latter category differ when applied over centralized (parameter server) and decentralized topologies. A common data representation for gradient (or parameter) compression is quantization, i.e. mapping continuous values into buckets that represent sets of values (usually ranges). Parameter and gradient values have been shown to be narrowly distributed [37], so these approaches are successful at representing the working range with a reduced number of bits per parameter. Sparsification is another popular method used in parameter distribution: DNNs (especially CNNs) exhibit sparse gradients during parameter updates. The first application of gradient sparsification [38] prunes gradient values using a fixed threshold, below which values are not sent.
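The sketch below illustrates threshold-based sparsification in the spirit of [38], under the common assumption that values below the threshold are accumulated locally and carried into the next step; the threshold and sizes are arbitrary.

import numpy as np

# Threshold-based gradient sparsification: only entries whose magnitude exceeds
# a fixed threshold are sent; the rest are kept as a local residual and added
# to the next step's gradient.
def sparsify(grad, residual, threshold=0.01):
    grad = grad + residual                       # carry over what was not sent
    mask = np.abs(grad) >= threshold
    to_send = np.where(mask, grad, 0.0)          # sparse update to communicate
    new_residual = np.where(mask, 0.0, grad)     # kept locally for later
    return to_send, new_residual

rng = np.random.default_rng(1)
residual = np.zeros(10)
for step in range(3):
    grad = rng.normal(scale=0.02, size=10)
    to_send, residual = sparsify(grad, residual)
    print(f"step {step}: sent {np.count_nonzero(to_send)} of {to_send.size} values")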
VII. TOOLS AND SOFTWARE
A. cuDNN
cuDNN [20] is an efficient library of primitives for deep learning. Deep learning workloads are computationally intensive, and optimizing their kernels is difficult and time-consuming. Kernels need to be reoptimized as parallel architectures evolve, making it difficult to maintain codebases over time. Libraries such as the Basic Linear Algebra Subroutines (BLAS) [39] have long addressed similar issues in the HPC community, but no comparable library existed for deep learning. cuDNN is a BLAS-like library of optimized routines for deep learning workloads. The implementation provides GPU routines, although similar routines could be implemented for other platforms, as with the BLAS library. The library is easy to integrate into existing frameworks and provides efficient memory usage and performance.

Fig. 3: Comparative Performance between Caffe and cuDNN
For example, incorporating cuDNN into Caffe [32], a common framework for convolutional networks, improves performance on a standard model by 36 percent while also reducing memory consumption. The goal is to offer performance as close as possible to matrix multiplication while using no auxiliary memory. GPU memory has high bandwidth but limited capacity, and is therefore a scarce resource. Ideally, when training deep networks, GPU memory should be filled with data, parameters, and neuron responses, not with auxiliary data structures needed by the convolution algorithm. Figure 3 shows the performance of three convolution implementations on an NVIDIA Tesla K40: cuDNN, Caffe, and cuda-convnet2. Importantly, even with a small mini-batch size of just 16, cuDNN still reaches 86 percent of peak performance, which indicates that the implementation performs well across the convolution parameter space.
B. Torch
Torch7 [31] has been designed with efficiency in mind, leveraging SSE wherever possible and supporting two parallel approaches: OpenMP and CUDA. These techniques are heavily used by the Tensor library (exposed through the “torch” package in Lua). From the user's point of view, enabling CUDA and OpenMP can result in large speedups in any Lua script at zero implementation cost (because most packages rely on the Tensor library). For more specific uses not covered by the Tensor library, other packages (such as the “nn” package) also leverage OpenMP and CUDA. On most benchmarks, Torch is faster than Theano [40]. Interestingly, Theano lags for small architectures, which could be explained by the high overhead of Python. Another interesting observation is the strong performance of OpenMP compared to the GPU implementation: only the largest architectures benefit from the GPU.
C. TensorFlow
TensorFlow [30] is an interface for expressing machine learning algorithms and an implementation for executing them. A computation expressed using TensorFlow can be run with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is versatile and can be used to describe a wide range of algorithms, including training and inference algorithms for deep neural network models. Client programs interact with TensorFlow by creating a Session; a tensor, a typed multidimensional array, is the basic building block of the implementation. TensorBoard, a companion visualization tool included in the open-source release, helps users understand the structure of their computation graphs and the overall behavior of machine learning models.
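As a minimal sketch of the Session-based workflow described here (assuming a TensorFlow 1.x installation), the snippet below builds a tiny computation graph and evaluates it on a random batch; the shapes and names are illustrative.

import numpy as np
import tensorflow as tf  # assumes a TensorFlow 1.x installation

# Build a small computation graph, then ask a Session to evaluate it.
x = tf.placeholder(tf.float32, shape=[None, 3])     # input tensor
W = tf.Variable(tf.random_normal([3, 2]))           # trainable parameters
y = tf.matmul(x, W)                                  # graph node, not yet computed

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = np.random.randn(4, 3).astype(np.float32)
    print(sess.run(y, feed_dict={x: batch}))         # evaluates y, shape (4, 2)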

Fig. 4: Baseline throughput for synchronous replication with a null model. Sparse accesses enable TensorFlow to handle larger models, such as embedding matrices
VIII. CONCLUSION
The world of deep learning is full of concurrency. Nearly every aspect of training is inherently parallel, from computing convolutions to meta-optimizing DNN architectures. Even when an aspect is sequential, its consistency requirements can be relaxed, thanks to the robustness of nonlinear optimization, to increase concurrency while still achieving reasonable, if not better, accuracy. In this paper, we provided an overview of many of these aspects, the approaches documented in the literature, and an analysis of their concurrency. It is hard to predict what the future holds for this highly active research field (many have tried over the years), but we can expect parallel and distributed deep learning to play a major role in its progress. As research advances, DNN architectures are becoming deeper and more interconnected, both between consecutive and non-consecutive layers. Apart from accuracy, considerable effort is being made to reduce the memory footprint and the number of operations so that inference can run successfully on mobile devices. This also means that post-training compression of DNNs is likely to be studied further, and that learning compressible networks is desirable. Because mobile hardware is limited in memory capacity and must be energy-efficient, specialized DNN computational hardware is required.
REFERENCES
1 Najafabadi, M. M., Villanustre, F., Khoshgoftaar, T. M., Seliya, N., Wald, R., and Muharemagic, E., “Deep learning applications and challenges in big data analytics,” Journal of Big Data, vol. 2, no. 1, p. 1, 2015.
2 LeCun, Y., Bengio, Y., and Hinton, G., “Deep learning,” nature, vol. 521, no. 7553, pp. 436–444, 2015.
3 Li, Y., “Deep reinforcement learning: An overview,” arXiv preprint arXiv:1701.07274, 2017.
4 Schmidhuber, J., “Deep learning in neural networks: An overview,” Neural networks, vol. 61, pp. 85–117, 2015.
5 Wang, H., Raj, B., and Xing, E., “On the origin of deep learning,” arXiv preprint arXiv:1702.07800, 2017.
6 Bengio, Y., “Deep learning of representations: Looking forward,” in International Conference on Statistical Language and Speech Processing. Springer, 2013, pp. 1–37.
7 Ienne, P. et al., “Architectures for neuro-computers: review and performance evaluation,” Computer Science Department Technical Report, no. 93/21, 1993.
8 Lacey, G., Taylor, G. W., and Areibi, S., “Deep learning on fpgas: Past, present, and future,” arXiv preprint arXiv:1602.04283, 2016.
9 Sze, V., Chen, Y.-H., Yang, T.-J., and Emer, J. S., “Efficient processing of deep neural networks: A tutorial and survey,” Proceedings of the IEEE, vol. 105, no. 12, pp. 2295–2329, 2017.
10 Oyama, Y., Nomura, A., Sato, I., Nishimura, H., Tamatsu, Y., and Matsuoka, S., “Predicting statistics of asynchronous sgd parameters for a large-scale distributed deep learning system on gpu supercomputers,” in 2016 IEEE International Conference on Big Data (Big Data). IEEE, 2016, pp. 66–75.
11 Yan, F., Ruwase, O., He, Y., and Chilimbi, T., “Performance modeling and scalability optimization of distributed deep learning systems,” in Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2015, pp. 1355–1364.
12 Qi, H., Sparks, E. R., and Talwalkar, A., “Paleo: A performance model for deep neural networks,” 2016.
13 Song, M., Hu, Y., Chen, H., and Li, T., “Towards pervasive and user satisfactory cnn across gpu microarchitectures,” in 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 2017, pp. 1–12.
14 Nvidia, C., “Cublas library,” NVIDIA Corporation, Santa Clara, California, vol. 15, no. 27, p. 31, 2008.
15 Wang, E., Zhang, Q., Shen, B., Zhang, G., Lu, X., Wu, Q., and Wang, Y., “Intel math kernel library,” in High-Performance Computing on the Intel Xeon Phi. Springer, 2014, pp. 167–188.
16 Vanhoucke, V., Senior, A., and Mao, M. Z., “Improving the speed of neural networks on cpus,” 2011.
17 Demmel, J. and Dinh, G., “Communication-optimal convolutional neural nets,” arXiv preprint arXiv:1802.06905, 2018.
18 Chellapilla, K., Puri, S., and Simard, P., “High performance convolutional neural networks for document processing,” 2006.
19 Coates, A., Huval, B., Wang, T., Wu, D., Catanzaro, B., and Andrew, N., “Deep learning with cots hpc systems,” in International conference on machine learning, 2013, pp. 1337–1345.
20 Chetlur, S., Woolley, C., Vandermersch, P., Cohen, J., Tran, J., Catanzaro, B., and Shelhamer, E., “cudnn: Efficient primitives for deep learning,” arXiv preprint arXiv:1410.0759, 2014.
21 Appleyard, J., Kocisky, T., and Blunsom, P., “Optimizing performance of recurrent neural networks on gpus,” arXiv preprint arXiv:1604.01946, 2016.
22 Gruslys, A., Munos, R., Danihelka, I., Lanctot, M., and Graves, A., “Memory-efficient backpropagation through time,” in Advances in Neural Information Processing Systems, 2016, pp. 4125–4133.
23 Zhang, X., Mckenna, M., Mesirov, J. P., and Waltz, D. L., “An efficient implementation of the back-propagation algorithm on the connection machine cm-2,” in Advances in neural information processing systems, 1990, pp. 801–809.
24 Farber, P. and Asanovic, K., “Parallel neural network training on multispert,” in Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing. IEEE, 1997, pp. 659–666.
25 Raina, R., Madhavan, A., and Ng, A. Y., “Large-scale deep unsupervised learning using graphics processors,” in Proceedings of the 26th annual international conference on machine learning. ACM, 2009, pp. 873– 880.
26 Zinkevich, M., Weimer, M., Li, L., and Smola, A. J., “Parallelized stochastic gradient descent,” in Advances in neural information processing systems, 2010, pp. 2595–2603.
27 Dean, J. and Ghemawat, S., “Mapreduce: simplified data processing on large clusters,” Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
28 Muller, U. and Gunzinger, A., “Neural net simulation on parallel computers,” in Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN’94), vol. 6. IEEE, 1994, pp. 3961–3966.
29 Ericson, L. and Mbuvha, R., “On the performance of network parallel training in artificial neural networks,” arXiv preprint arXiv:1701.05130, 2017.
30 Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M. et al., “Tensorflow: Largescale machine learning on heterogeneous distributed systems,” arXiv preprint arXiv:1603.04467, 2016.
31 Collobert, R., Kavukcuoglu, K., and Farabet, C., “Torch7: A matlab-like environment for machine learning,” in BigLearn, NIPS workshop, no. CONF, 2011.
32 Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T., “Caffe: Convolutional architecture for fast feature embedding,” in Proceedings of the 22nd ACM international conference on Multimedia. ACM, 2014, pp. 675–678.
33 Gaunt, A. L., Johnson, M. A., Riechert, M., Tarlow, D., Tomioka, R., Vytiniotis, D., and Webster, S., “Ampnet: asynchronous model-parallel training for dynamic neural networks,” arXiv preprint arXiv:1705.09786, 2017.
34 Dean, J., Corrado, G., Monga, R., Chen, K., Devin, M., Mao, M., Ranzato, M., Senior, A., Tucker, P., Yang, K. et al., “Large scale distributed deep networks,” in Advances in neural information processing systems, 2012, pp. 1223–1231.
35 Recht, B., Re, C., Wright, S., and Niu, F., “Hogwild: A lock-free approach to parallelizing stochastic gradient descent,” in Advances in neural information processing systems, 2011, pp. 693–701.
36 Ho, Q., Cipar, J., Cui, H., Lee, S., Kim, J. K., Gibbons, P. B., Gibson, G. A., Ganger, G., and Xing, E. P., “More effective distributed ml via a stale synchronous parallel parameter server,” in Advances in neural information processing systems, 2013, pp. 1223–1231.
37 Köster, U., Webb, T., Wang, X., Nassar, M., Bansal, A. K., Constable, W., Elibol, O., Gray, S., Hall, S., Hornof, L. et al., “Flexpoint: An adaptive numerical format for efficient training of deep neural networks,” in Advances in neural information processing systems, 2017, pp. 1742–1752.
38 Strom, N., “Scalable distributed dnn training using commodity gpu cloud computing,” in Sixteenth Annual Conference of the International Speech Communication Association, 2015.
39 Lawson, C. L., Hanson, R. J., Kincaid, D. R., and Krogh, F. T., “Basic linear algebra subprograms for fortran usage,” 1977.
40 Bergstra, J., Breuleux, O., Bastien, F., Lamblin, P., Pascanu, R., Desjardins, G., Turian, J., Warde-Farley, D., and Bengio, Y., “Theano: a cpu and gpu math expression compiler,” in Proceedings of the Python for scientific computing conference (SciPy), vol. 4, no. 3. Austin, TX, 2010.
 

Distributed Data Processing Technology

Distributed database system technology is the union of what appear to be two diametrically opposed approaches to data processing: database systems and computer network technologies. Database systems have taken us from a paradigm of data processing in which each application defined and maintained its own data to one in which the data is defined and administered centrally. This new orientation results in data independence, whereby the application programs are immune to changes in the logical or physical organization of the data.


One of the major motivations behind the use of database systems is the desire to integrate the operational data of an enterprise and to provide centralized, thus controlled, access to that data. The technology of computer networks, on the other hand, promotes a mode of work that goes against all centralization efforts. At first glance it might be difficult to understand how these two contrasting approaches can possibly be synthesized to produce a technology that is more powerful and more promising than either one alone. The key to this understanding is the realization that the most important objective of database technology is integration, not centralization. It is important to realize that neither of these terms necessarily implies the other. It is possible to achieve integration without centralization, and that is exactly what distributed database technology attempts to achieve.
In this chapter we define the fundamental concepts and set the framework for discussing distributed databases. We start by examining distributed systems in general, in order to clarify the role of database technology within distributed data processing, and then move on to topics that are more directly related to DDBSs.
The term distributed processing is probably the most abused term in computer science of the last couple of years. It has been used to refer to such diverse systems as multiprocessing systems, distributed data processing, and computer networks. This abuse has gone on to such an extent that the term distributed processing has sometimes been called a concept in search of a definition and a name. Some of the other terms that have been used synonymously with distributed processing are distributed computers/multicomputers, satellite processing/satellite computers, back-end processing, dedicated/special-purpose computers, time-shared systems, and functionally modular systems.
Obviously, some degree of distributed processing goes on in any computer system, even on single-processor computers. Starting with second-generation computers, some functions (such as input/output) began to be delegated to separate units that operate in parallel with the central processing unit. However, it should be quite clear that what we would like to refer to as distributed processing, or distributed computing, has nothing to do with this form of distribution of function within a single-processor computer system.
A term that has caused so much confusion is obviously quite difficult to define precisely. There have been numerous attempts to define what distributed processing is, and almost every researcher has come up with a definition. The working definition we use for a distributed computing system is that it is a number of autonomous processing elements that are interconnected by a computer network and that cooperate in performing their assigned tasks. The processing element referred to in this definition is a computing device that can execute a program on its own.
One fundamental question that needs to be asked is: what is being distributed? One thing that might be distributed is the processing logic; in fact, the definition of a distributed computing system given above implicitly assumes that the processing logic or processing elements are distributed. Another possible distribution is according to function: various functions of a computer system could be delegated to various pieces of hardware or sites. Finally, control can be distributed: the control of the execution of various tasks might be distributed instead of being performed by one computer system. From the viewpoint of distributed database systems, these modes of distribution are all necessary and important.
Distributed computing systems can be classified with respect to a number of criteria, including the degree of coupling, the interconnection structure, the interdependence of components, and the synchronization between components. Degree of coupling refers to a measure of how closely the processing elements are connected; it can be measured as the ratio of the amount of data exchanged to the amount of local processing performed in executing a task. If the communication is done over a computer network, there is weak coupling among the processing elements; if components are shared, we talk about strong coupling. Shared components can be primary memory or secondary storage devices. As for the interconnection structure, one can distinguish cases that use a point-to-point interconnection channel from those that communicate over a common channel. The processing elements might depend on each other quite strongly in the execution of a task, or this interdependence might be as minimal as passing messages at the beginning of execution and reporting results at the end. Synchronization between processing elements might be maintained by synchronous or by asynchronous means. Note that some of these criteria are not entirely independent; for example, for the processing elements to be strongly interdependent, they would typically need to work in a strongly coupled fashion.
The fundamental reason behind distributed processing is to be better able to solve big and complicated problems by using a variation of the well-known divide-and-conquer approach. This approach has two fundamental advantages from an economic standpoint. First, distributed computing provides an economical method of harnessing more computing power by employing multiple processing elements optimally; this requires research in distributed processing as defined earlier as well as in parallel processing. The second economic reason is that by attacking these problems in a disciplined manner, the cost of software development can be contained. Indeed, it is well known that the cost of software has been increasing, in opposition to the cost trends of hardware.
Distributed database systems should also be viewed within this framework and treated as tools that can make distributed processing easier and more efficient. It is reasonable to draw an analogy between what distributed databases might offer the data processing world and what database technology has already provided. There is no doubt that their development will assist in the task of developing distributed software.
Distributed Database System
A distributed database system (DDBS) is a collection of multiple, logically interrelated databases distributed over a computer network. A distributed database management system (distributed DBMS) is the software that permits the management of the DDBS and makes the distribution transparent to the user. The two important terms in these definitions are logically interrelated and distributed over a computer network. They help eliminate certain cases that have sometimes been wrongly presented as a DDBS.
A DDBS is not simply a collection of files that are individually stored at each node of a computer network. To form a DDBS, files should not only be logically related; there should also be structure among the files, and access should be via a common interface. We should note that there has been much recent activity in providing DBMS functionality over semi-structured data stored in files on the Internet.
It is sometimes argued that the physical distribution of data is not the most significant issue. Proponents of this view would therefore feel comfortable labeling as a distributed database two databases that reside in the same computer system. However, the physical distribution of data is in fact very important: it creates problems that are not encountered when the databases reside in the same computer. This brings us to another point: whether multiprocessor systems should be regarded as DDBSs. A multiprocessor system is generally considered to be a system in which two or more processors share some form of storage, either primary memory, in which case the multiprocessor is called shared-memory (tightly coupled), or secondary storage, in which case it is called shared-disk.
The shared-nothing architecture is one in which each processor has its own primary and secondary memory as well as its own peripherals, and communicates with the other processors over a very high speed interconnect. However, there are differences between the interactions in multiprocessor architectures and the rather loose interaction that is common in distributed computing environments. The fundamental difference is the mode of operation: a multiprocessor system design is rather symmetrical, consisting of a number of identical processor and memory components controlled by one or more copies of the same operating system, which is responsible for strict control of the task assignment to each processor.
PROMISES OF DDBMSs
Advantages of DDBSs have been cited in the literature, ranging from technical to sociological reasons for decentralization. All of these can be distilled to four fundamentals, which may also be viewed as the promises of DDBS technology.
Fundamentals of Relational DBMSs
A relational DBMS is a software component supporting the relational model and a relational language. A DBMS is a reentrant program shared by multiple live entities, called transactions, that run database programs. When running on a general-purpose computer, a DBMS is interfaced with two other components: the communication subsystem and the operating system. The communication subsystem allows applications, such as the terminal monitor, to communicate with the DBMS.
DISTRIBUTED DBMS ARCHITECTURE
The architecture of a system defines its structure. This means that the components of the system are identified, the function of each component is specified, and the interrelationships and interactions among the components are defined. This general framework holds true for computer systems in general and for software systems in particular. The specification of the architecture of a software system requires identification of the various modules, with their interfaces and interrelationships, in terms of the data and control flow through the system. From a software engineering perspective, the task of developing the individual modules is called programming-in-the-small, whereas the task of integrating them into a complete system is referred to as programming-in-the-large.
There are three types of distributed DBMS architecture: client/server systems, peer-to-peer distributed DBMSs, and multidatabase systems. These are idealized views of a DBMS, from which many of the commercially available systems may deviate. A reference architecture is commonly created by standards developers, since it clearly defines the interfaces that need to be standardized.
DBMS STANDARDIZATION
Standardization efforts are relevant to DBMS architecture because of the close relationship between the architecture of a system and the reference model of that system, which is developed as a precursor to any standardization activity. For all practical purposes, the reference model can be thought of as an idealized architectural model of the system. It is defined as a conceptual framework whose purpose is to divide standardization work into manageable pieces and to show, at a general level, how these pieces are related to one another. A reference model can be described according to three different approaches:

Based on components. The components of the system are defined together with the interrelationships between them. Thus a DBMS consists of a number of components, each of which provides some functionality, and their orderly and well-defined interaction provides the overall system functionality. This approach is suitable when the objective is to design and implement the system under consideration; on the other hand, it is difficult to determine the functionality of a system by examining its components. An example is the DBMS standard proposal prepared by the Computer Corporation of America for the National Bureau of Standards.

Based on functions. The different classes of users are identified, and the functions that the system will perform for each class are defined. The system specifications within this category typically specify a hierarchical structure for the user classes, which results in a hierarchical system architecture with well-defined interfaces between the functionalities of different layers. The advantage of the functional approach is the clarity with which the objectives of the system are specified. However, it gives very little insight into how these objectives will be attained or into the level of complexity of the system.

Based on data. The different types of data are identified, and an architectural framework is specified which defines the functional units that will realize or use data according to these different views. The advantage of the data approach is the central importance it attaches to the data resource; this is significant from the DBMS viewpoint, since the fundamental resource that a DBMS manages is data. On the other hand, it is impossible to specify an architectural model fully unless the functional modules are also described. The ANSI/SPARC architecture discussed in the next section belongs to this category.

Even though three distinct approaches are identified, one should never lose sight of the interplay among them. As indicated in a report of the Database Architecture Framework Task Group of ANSI, all three approaches need to be used together to define an architectural model, with each point of view serving to focus attention on different aspects of the model.
A more important issue is the orthogonality of the foregoing classification schemes to the DBMS objectives. Regardless of how we choose to view a DBMS, these objectives have to be addressed within each functional unit. In the remainder of this section we concentrate on a reference architecture that has generated considerable interest and that serves as the basis of our reference model.