Literature review of data warehousing

Chapter 2 provides a literature review of data warehousing, OLAP, MDDB and data mining concepts. We review the concepts, characteristics, design and implementation approaches of each of the above-mentioned technologies to identify a suitable data warehouse framework. This framework will support the integration of OLAP MDDB and a data mining model.
Section 2.2 discusses the fundamentals of data warehousing, including data warehouse models and data processing techniques such as the extract, transform and load (ETL) process. A comparative study was done on the data warehouse models introduced by William Inmon (Inmon, 1999), Ralph Kimball (Kimball, 1996) and Matthias Nicola (Nicola, 2000) to identify a suitable model, design and characteristics. Section 2.3 introduces the OLAP model and architecture. We also discuss the concept of processing in OLAP-based MDDB, and MDDB schema design and implementation. Section 2.4 introduces data mining techniques, methods and processes for OLAP mining (OLAM), which is used to mine MDDB. Section 2.5 concludes the literature review, in particular the reasoning behind our decision to propose a new data warehouse model. Since we propose to use Microsoft® products to implement the proposed model, we also discuss a product comparison to justify why Microsoft® products were selected.


According to William Inmon, a data warehouse is a “subject-oriented, integrated, time-variant, and non-volatile collection of data in support of the management’s decision-making process” (Inmon, 1999). A data warehouse is a database containing data that usually represents the business history of an organization. This historical data is used for analysis that supports business decisions at many levels, from strategic planning to performance evaluation of a discrete organizational unit.
Data warehousing provides an effective integration of operational databases into an environment that enables the strategic use of data (Zhou, Hull, King and Franchitti, 1995). The enabling technologies include relational and MDDB management systems, client/server architecture, metadata modelling and repositories, graphical user interfaces and much more (Hammer, Garcia-Molina, Labio, Widom, and Zhuge, 1995; Harinarayan, Rajaraman, and Ullman, 1996).
The emergence of cross-discipline domains such as knowledge management in finance, health and e-commerce has shown that vast amounts of data need to be analysed. The evolution of data in the data warehouse can provide multiple dataset dimensions to solve various problems. Thus, critical decision making over such datasets needs a suitable data warehouse model (Barquin and Edelstein, 1996).
The main proponents of the data warehouse are William Inmon (Inmon, 1999) and Ralph Kimball (Kimball, 1996), but they have different perspectives on the data warehouse in terms of design and architecture. Inmon (Inmon, 1999) defined the data warehouse as a dependent data mart structure, while Kimball (Kimball, 1996) defined the data warehouse as a bus-based data mart structure. Table 2.1 discusses the differences in data warehouse structure between William Inmon and Ralph Kimball.
A data warehouse is a read-only data source where end users are not allowed to change the values or data elements. Inmon’s (Inmon, 1999) data warehouse architecture strategy is different from Kimball’s (Kimball, 1996). Inmon’s data warehouse model splits off data marts as copies, distributed as an interface between the data warehouse and end users. Kimball views the data warehouse as a union of data marts: the data warehouse is the collection of data marts combined into one central repository. Figure 2.1 illustrates the differences between Inmon’s and Kimball’s data warehouse architectures, adopted from (Mailvaganam, 2007).
Although Inmon and Kimball have different design views of the data warehouse, they do agree that successful implementation of a data warehouse depends on an effective collection of operational data and validation of data marts. The roles of database staging and ETL processes are inevitable components in both researchers’ data warehouse designs. Both believed that a dependent data warehouse architecture is necessary to fulfil the requirements of enterprise end users in terms of preciseness, timing and data relevancy.
Data warehouse architecture has a wide research scope and can be viewed from many perspectives. (Thilini and Hugh, 2005) and (Eckerson, 2003) provide some meaningful ways to view and analyse data warehouse architecture. Eckerson states that a successful data warehouse system depends on the database staging process, which derives data from different integrated Online Transactional Processing (OLTP) systems. In this case, the ETL process plays a crucial role in making the database staging process workable. A survey of factors that influenced the selection of data warehouse architecture by (Thilini, 2005) identifies five data warehouse architectures that are in common use, as shown in Table 2.2.

Independent Data Marts

Independent data marts are also known as localized or small-scale data warehouses. They are mainly used by departments or divisions of a company to provide individual operational databases. This type of data mart is simple, yet it takes different forms derived from multiple design structures and various inconsistent database designs, which complicates cross data mart analysis. Since every organizational unit tends to build its own database, which operates as an independent data mart ((Thilini and Hugh, 2005), citing the work of (Winsberg, 1996) and (Hoss, 2002)), it is best used as an ad-hoc data warehouse and as a prototype before building a real data warehouse.

Data Mart Bus Architecture

(Kimball, 1996) pioneered the design and architecture of the data warehouse as a union of data marts, known as the bus architecture or virtual data warehouse. The bus architecture allows data marts to be located not only on one server but also on different servers. This allows the data warehouse to function in a more virtual mode, combining all data marts and processing them as one data warehouse.

Hub-and-spoke architecture

(Inmon, 1999) developed the hub-and-spoke architecture. The hub is the central server taking care of information exchange, and the spokes handle data transformation for all regional operational data stores. Hub-and-spoke mainly focuses on building a scalable and maintainable infrastructure for the data warehouse.

Centralized Data Warehouse Architecture

The centralized data warehouse architecture is built on the hub-and-spoke architecture but without the dependent data mart component. This architecture copies and stores heterogeneous operational and external data in a single, consistent data warehouse. It has only one data model, which is consistent and complete across all data sources. According to (Inmon, 1999) and (Kimball, 1996), a central data warehouse should include database staging, also known as an operational data store, as an intermediate stage for operational processing of data integration before transformation into the data warehouse.

Federated Architecture

According to (Hackney, 2000), a federated data warehouse is an integration of multiple heterogeneous data marts, database staging or operational data stores, and a combination of analytical applications and reporting systems. The federated concept focuses on an integrated framework to make the data warehouse more reliable. (Jindal, 2004) concludes that the federated data warehouse is a practical approach, as it focuses on higher reliability and provides excellent value.
(Thilini and Hugh, 2005) conclude that the hub-and-spoke and centralized data warehouse architectures are similar. The centralized architecture is faster and easier to implement because no data marts are required, and it scored higher than hub-and-spoke where there was an urgent need for a relatively fast implementation approach.
In this work, it is very important to identify a data warehouse architecture that is robust and scalable for building and deploying enterprise-wide systems. (Laney, 2000) states that the selection of an appropriate data warehouse architecture must incorporate the successful characteristics of various data warehouse models. Two data warehouse architectures have proved popular, as shown by (Thilini and Hugh, 2005), (Eckerson, 2003) and (Mailvaganam, 2007): first, the hub-and-spoke architecture proposed by (Inmon, 1999), a data warehouse with dependent data marts; and second, the data mart bus architecture with dimensional data marts proposed by (Kimball, 1996). The new proposed model will use the hub-and-spoke data warehouse architecture, which can be used for MDDB modelling.
The data warehouse architecture process begins with the ETL process, which ensures the data passes the quality threshold. According to Evin (2001), it is essential to have the right dataset. ETL is an important component of the data warehouse environment, ensuring that datasets in the data warehouse are cleansed of defects inherited from the various OLTP systems. ETL is also responsible for running the scheduled tasks that extract data from OLTP systems. Typically, a data warehouse is populated with historical information from within a particular organization (Bunger, Colby, Cole, McKenna, Mulagund, and Wilhite, 2001). The complete process descriptions of ETL are discussed in Table 2.3.
A data warehouse database can be populated from a wide variety of data sources in different locations, so collecting all the different datasets and storing them in one central location is an extremely challenging task (Calvanese, Giacomo, Lenzerini, Nardi, and Rosati, 2001). However, ETL processes eliminate the complexity of data population via a simplified process, as depicted in Figure 2.2. The ETL process begins with data extraction from operational databases, where data cleansing and scrubbing are done to ensure all data are validated. The data are then transformed to meet the data warehouse standards before being loaded into the data warehouse.
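The extract, cleanse/transform and load sequence described above can be sketched as a minimal Python pipeline. This is only an illustrative sketch: the record layout, field names (patient_id, visit_date, charge) and validation rule are hypothetical, not taken from any system discussed in this review.

```python
# Minimal ETL sketch: extract rows from an "operational" source,
# cleanse/transform them, and load the valid rows into a "warehouse".
# Records and field names are hypothetical, for illustration only.

def extract():
    # In practice this would query one or more OLTP systems.
    return [
        {"patient_id": "P01", "visit_date": "2009-03-01", "charge": "120.50"},
        {"patient_id": "  P02 ", "visit_date": "2009-03-02", "charge": "80"},
        {"patient_id": "", "visit_date": "2009-03-03", "charge": "55.00"},  # invalid row
    ]

def transform(rows):
    # Cleansing and scrubbing: trim whitespace, drop invalid rows,
    # and coerce values to the warehouse's standard types.
    cleaned = []
    for row in rows:
        pid = row["patient_id"].strip()
        if not pid:  # reject rows that fail the quality threshold
            continue
        cleaned.append({
            "patient_id": pid,
            "visit_date": row["visit_date"],
            "charge": float(row["charge"]),
        })
    return cleaned

def load(rows, warehouse):
    # Append-only load: the warehouse is read-only for end users,
    # so history accumulates rather than being overwritten.
    warehouse.extend(rows)
    return warehouse

warehouse = load(transform(extract()), [])
print(len(warehouse))  # 2 valid rows loaded
```

The invalid third row is rejected during transformation, mirroring the cleansing step that precedes loading in the ETL process above.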
(Zhou et al., 1995) state that during the data integration process in a data warehouse, ETL can assist in the import and export of operational data between heterogeneous data sources using an Object Linking and Embedding Database (OLE-DB) based architecture, where the data are transformed so that all validated data populate the data warehouse.
Kimball’s (1996) data warehouse architecture, as depicted in Figure 2.3, focuses on three important modules: “the back room”, the “presentation server” and “the front room”. ETL processes are implemented in the back room, where the data staging services gather the operational databases of all source systems and extract data in different file formats from different systems and platforms. The second step is to run the transformation process, removing all inconsistencies to ensure data integrity. Finally, the data are loaded into data marts. The ETL processes are commonly executed from a job control via a scheduled task. The presentation server is the data warehouse, where data marts are stored and processed. Data are stored in a star schema consisting of dimension and fact tables. The data are then processed in the front room, where they are accessed by query services such as reporting tools, desktop tools, OLAP and data mining tools.
Although ETL processes prove to be an essential component for ensuring data integrity in the data warehouse, complexity and scalability play an important role in deciding the type of data warehouse architecture. One way to achieve a scalable, non-complex solution is to adopt a “hub-and-spoke” architecture for the ETL process. According to Evin (2001), ETL operates best in a hub-and-spoke architecture because of its flexibility and efficiency. A centralized data warehouse design also supports maintaining full access control over the ETL processes.
ETL processing in the hub-and-spoke data warehouse architecture is recommended by (Inmon, 1999) and (Kimball, 1996). The hub is the data warehouse, after data from the operational databases has been processed through the staging database, and the spokes are the data marts that distribute the data. Sherman (2005) states that the hub-and-spoke approach uses one-to-many interfaces from the data warehouse to the many data marts. One-to-many interfaces are simpler to implement, cost-effective in the long run, and ensure consistent dimensions; the many-to-many approach, by comparison, is more complicated and costly.
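The cost difference between the two interface styles can be made concrete with a simple count. In this sketch (the figures are hypothetical), a hub feeding n data marts needs only n interfaces, whereas wiring m source systems directly to n marts needs m × n:

```python
# Interface-count sketch for hub-and-spoke vs point-to-point integration.
# With one warehouse (hub) feeding n data marts, one-to-many needs n
# interfaces; letting m sources feed n marts directly needs m * n.

def one_to_many_interfaces(n_marts):
    return n_marts  # one interface per mart, all fed from the hub

def many_to_many_interfaces(n_sources, n_marts):
    return n_sources * n_marts  # every source wired to every mart

print(one_to_many_interfaces(5))      # 5
print(many_to_many_interfaces(4, 5))  # 20
```

With four sources and five marts, the point-to-point style needs four times as many interfaces to build and maintain, which is why the one-to-many approach is cheaper in the long run.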
Building a data warehouse is indeed a challenging task, as data warehouse projects inherit unique characteristics that may influence the overall reliability and robustness of the data warehouse. These factors can be addressed during the analysis, design and implementation phases to ensure a successful data warehouse system. The following section focuses on the factors that cause data warehouse projects to fail; the section after it discusses the success factors, that is, implementing the correct model to support a successful data warehouse project.

Data Warehouse Failure Factors

Studies by (Hayen, Rutashobya, and Vetter, 2007) show that implementing a data warehouse project is costly and risky, as a data warehouse project can cost over $1 million in the first year. It is estimated that two-thirds of attempts to set up data warehouse projects will eventually fail. (Hayen et al., 2007), citing the work of (Briggs, 2002) and (Vassiliadis, 2004), noted three factors in the failure of data warehouse projects, namely environment, project and technical factors, as shown in Table 2.4.
Environmental factors lead to organizational changes in terms of business, politics, mergers, takeovers and lack of top management support. These include human error, corporate culture, the decision-making process and poor change management (Watson, 2004; Hayen et al., 2007).
Poor technical knowledge of the requirements for data definitions and data quality across different organizational units may cause data warehouse failure. Incompetence and insufficient knowledge of data integration, and poor selection of the data warehouse model and of data warehouse analysis applications, may cause huge failures.
In spite of heavy investment in hardware, software and people, poor project management may lead to data warehouse project failure. For example, assigning a project manager who lacks knowledge and project experience in data warehousing may impede quantifying the return on investment (ROI) and achieving the project’s triple constraint (cost, scope, time).
Data ownership and accessibility are potential factors that may cause data warehouse project failure. Data are considered a vulnerable issue within the organization: one must not share or acquire someone else’s data, as this is considered losing authority over the data (Vassiliadis, 2004). Thus, any department is restricted from declaring total ownership of pure, clean and error-free data, which might otherwise cause problems over ownership of data rights.

Data Warehouse Success Factors

(Hwang M.I., 2007) stresses that data warehouse implementation is an important area of research and industrial practice, but only a few researchers have assessed the critical success factors for data warehouse implementations. He conducted a survey of six data warehouse studies (Watson & Haley, 1997; Chen et al., 2000; Wixom & Watson, 2001; Watson et al., 2001; Hwang & Cappel, 2002; Shin, 2003) on the success factors in a data warehouse project. He concluded the survey with a list of success factors that influence data warehouse implementation, as depicted in Figure 2.8, showing eight implementation factors which directly affect the six selected success variables.
The above-mentioned data warehouse success factors provide an important guideline for implementing successful data warehouse projects. (Hwang M.I., 2007) shows that an integrated selection of various factors, such as end-user participation, top management support, and acquisition of quality source data with profound and well-defined business needs, plays a crucial role in data warehouse implementation. Besides that, other factors highlighted by Hayen R.L. (2007), citing the work of Briggs (2002), Vassiliadis (2004) and Watson (2004), such as project, environmental and technical knowledge factors, also influence data warehouse implementation.
In this work, the new proposed model uses the hub-and-spoke architecture as a “central repository service”, as many scholars, including Inmon, Kimball, Evin, Sherman and Nicola, adopt this data warehouse architecture. This approach allows the hub (data warehouse) and spokes (data marts) to be located centrally or distributed across a local or wide area network, depending on business requirements. In designing the new proposed model, the hub-and-spoke architecture clearly identifies six important components that a data warehouse should have: ETL; a staging database or operational data store; data marts; MDDB; OLAP; and data mining end-user applications such as data query, reporting, analysis and statistical tools. However, this process may differ from organization to organization: depending on the ETL setup, some data warehouses may overwrite old data with new data, while others may maintain a history and audit trail of all changes to the data.
The OLAP Council (1997) defines OLAP as a group of decision support systems that facilitate fast, consistent and interactive access to information that has been reformulated, transformed and summarized from relational datasets, mainly from the data warehouse, into MDDB, allowing optimal data retrieval and the performance of trend analysis.
According to Chaudhuri (1997), Burdick, D. et al. (2006) and Vassiliadis, P. (1999), OLAP is an important concept for strategic database analysis. OLAP has the ability to analyze large amounts of data for the extraction of valuable information. Analytical development can be applied in the business, education or medical sectors. The technologies of the data warehouse, OLAP and analysis tools support that ability.
OLAP enables the discovery of patterns and relationships contained in business activity by querying tons of data from multiple database source systems at one time (Nigel, P., 2008). Processing database information using OLAP requires an OLAP server to organize and transform the data and build the MDDB. The MDDB is then separated into cubes for client OLAP tools to perform data analysis, aiming to discover new patterns and relationships between the cubes. Some popular OLAP server software vendors include Oracle, IBM and Microsoft.
Madeira (2003) supports the view that OLAP and the data warehouse are complementary technologies that blend together. The data warehouse stores and manages data, while OLAP transforms data warehouse datasets into strategic information. OLAP functions range from basic navigation and browsing (often known as “slice and dice”) to calculations and more serious analysis such as time series and complex modelling. As decision-makers adopt more advanced OLAP capabilities, they move from basic data access to the creation of information and the discovery of new knowledge.
In comparison to the data warehouse, which is usually based on relational technology, OLAP uses a multidimensional view of aggregated data to provide rapid access to strategic information for analysis. There are three types of OLAP architecture, based on the method by which they store multidimensional data and perform analysis operations on the dataset (Nigel, P., 2008): multidimensional OLAP (MOLAP), relational OLAP (ROLAP) and hybrid OLAP (HOLAP).
In MOLAP, as depicted in Diagram 2.11, datasets are stored and summarized in a multidimensional cube. The MOLAP architecture can perform faster than ROLAP and HOLAP. MOLAP cubes are designed and built for rapid data retrieval, enabling efficient slicing and dicing operations. MOLAP can perform complex calculations, which are pre-generated at cube creation. MOLAP processing is, however, restricted to the initial cube that was created and does not extend to additional data without rebuilding or replicating the cube.
In ROLAP, as depicted in Diagram 2.12, data and aggregations are stored in relational database tables to provide the OLAP slicing and dicing functionality. ROLAP is the slowest of the OLAP flavours. ROLAP relies on manipulating the data directly in the relational database to give the appearance of conventional OLAP slicing and dicing functionality: essentially, each slicing and dicing action is equivalent to adding a “WHERE” clause to the SQL statement.
ROLAP can manage large amounts of data and has no inherent limitations on data size. ROLAP can leverage the intrinsic functionality of the relational database. However, ROLAP is slow in performance, because each ROLAP activity is essentially an SQL query, or multiple SQL queries, against the relational database. The query time grows with the complexity and number of the SQL statements executed, and can become a bottleneck if the underlying dataset is large. Because ROLAP essentially depends on generating SQL statements to query the relational database, and SQL does not cater for all analytical needs, ROLAP technology is conventionally limited by what SQL functionality can offer.
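The point that a ROLAP slice is just a “WHERE” clause can be demonstrated with a small in-memory relational table. The sales table, its columns and its figures below are hypothetical, chosen only to illustrate the mechanism:

```python
import sqlite3

# ROLAP-style slicing sketch: each slice/dice is simply a WHERE clause
# issued against relational tables. Table and data are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
cur.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("North", 2008, 100.0), ("North", 2009, 150.0),
    ("South", 2008, 80.0),  ("South", 2009, 120.0),
])

# "Slice" on the year dimension: fix year = 2009 via a WHERE clause,
# then aggregate by region as an OLAP tool would.
cur.execute("SELECT region, SUM(amount) FROM sales "
            "WHERE year = 2009 GROUP BY region ORDER BY region")
print(cur.fetchall())  # [('North', 150.0), ('South', 120.0)]
```

Every further dice (e.g. also restricting region) simply appends another predicate to the WHERE clause, which is why ROLAP performance is bounded by the cost of the generated SQL.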
HOLAP, as depicted in Diagram 2.13, combines the technologies of MOLAP and ROLAP. Data are stored in ROLAP relational database tables, and the aggregations are stored in a MOLAP cube. HOLAP can drill down from the multidimensional cube into the underlying relational data. To acquire summary information, HOLAP leverages cube technology for faster performance; to retrieve detail information, HOLAP drills down from the cube into the underlying relational data.
In all OLAP architectures (MOLAP, ROLAP and HOLAP), the datasets are stored in a multidimensional format, involving the creation of multidimensional blocks called data cubes (Harinarayan, 1996). A cube in an OLAP architecture may have three or more axes (dimensions). Each axis represents a logical category of data: one axis may, for example, represent the geographic location of the data, while others may indicate a state of time or a specific school. Each of the categories, which will be described in the following section, can be broken down into successive levels, and it is possible to drill up or down between the levels.
Cabibo (1997) states that OLAP partitions are normally stored on an OLAP server, with the relational database frequently on a separate server. The OLAP server must then query across the network whenever it needs to access the relational tables to resolve a query, and the impact of doing so depends on the performance characteristics of the network itself. Even when the relational database is placed on the same server as the OLAP server, inter-process calls and the associated context switching are required to retrieve relational data. With an OLAP partition, calls to the relational database, whether local or over the network, do not occur during querying.
OLAP functionality offers dynamic multidimensional analysis, supporting end users with analytical activities including calculations and modelling applied across dimensions, trend analysis over time periods, slicing subsets for on-screen viewing, and drilling to deeper levels of records (OLAP Council, 1997). OLAP is implemented in a multi-user client/server environment and provides reliably fast responses to queries, in spite of database size and complexity. OLAP helps the end user integrate enterprise information through relative, customized viewing and analysis of historical and present data in various “what-if” data model scenarios. This is achieved through the use of an OLAP server, as depicted in Diagram 2.9.
OLAP functionality is provided by an OLAP server, whose design and data structures are optimized for fast information retrieval in any direction, and for flexible calculation and transformation of unprocessed data. The OLAP server may either physically stage the processed multidimensional information, to deliver consistent and fast response times to end users, or fill its data structures in real time from relational databases, or offer a choice of both.
Essentially, OLAP presents information in cube form, which allows more complex analysis than a relational database. OLAP analysis techniques employ ‘slice and dice’ and ‘drilling’ methods to segregate the data into manageable pieces of information according to given parameters. A slice identifies a single value for one or more dimensions, yielding a subset of the multidimensional array; the dice function applies the slice operation on two or more dimensions of the multidimensional cube. The drilling function allows the end user to traverse from condensed data down to the most precise data unit, as depicted in Diagram 2.10.
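The slice, dice and roll-up operations just described can be sketched on a tiny cube held as a plain Python dictionary. The dimension members (regions, years, products) and measures below are hypothetical, used only to make the operations concrete:

```python
# Sketch of slice / dice / roll-up on a tiny cube held as a Python dict.
# Keys are (region, year, product) coordinates; values are sales measures.
# All dimension members and values are hypothetical.
cube = {
    ("North", 2008, "Drug"): 10, ("North", 2009, "Drug"): 12,
    ("North", 2009, "Scan"): 7,  ("South", 2008, "Drug"): 5,
    ("South", 2009, "Scan"): 9,
}

def slice_cube(cube, year):
    # Slice: fix a single member on one dimension (here, year).
    return {k: v for k, v in cube.items() if k[1] == year}

def dice_cube(cube, regions, years):
    # Dice: restrict two or more dimensions to subsets of members.
    return {k: v for k, v in cube.items() if k[0] in regions and k[1] in years}

def roll_up_by_region(cube):
    # Drill-up (roll-up): aggregate away the year and product levels.
    totals = {}
    for (region, _year, _product), value in cube.items():
        totals[region] = totals.get(region, 0) + value
    return totals

print(slice_cube(cube, 2009))   # the three 2009 cells
print(roll_up_by_region(cube))  # {'North': 29, 'South': 14}
```

Drilling down is simply the reverse traversal: starting from the regional totals and expanding back out to the individual (region, year, product) cells.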
The base of every data warehouse system is a relational database built using a dimensional model. A dimensional model consists of fact and dimension tables, described as a star schema or snowflake schema (Kimball, 1999). A schema is a collection of database objects: tables, views and indexes (Inmon, 1996). To aid the understanding of dimensional data modelling, Table 2.10 defines some of the terms commonly used in this type of modelling:
In designing data models for the data warehouse, the most commonly used schema types are the star schema and the snowflake schema. In the star schema design, the fact table sits in the middle and is connected to the surrounding dimension tables like a star. A star schema can be simple or complex: a simple star consists of one fact table, while a complex star can have more than one fact table.
Most data warehouses use a star schema to represent the multidimensional data model. The database consists of a single fact table and a single table for each dimension. Each tuple in the fact table holds a pointer (foreign key) to each of the dimensions that provide its multidimensional coordinates, and stores the numeric measures for those coordinates. A tuple consists of a unit of data extracted from the cube over a range of members from one or more dimension tables. Each dimension table consists of columns that correspond to attributes of the dimension. Diagram 2.14 shows an example of a star schema for a medical informatics system.
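A minimal star schema of this kind can be sketched with an in-memory relational database. The table and column names below (dim_patient, dim_date, fact_visit, charge) are hypothetical, invented only to illustrate the fact/dimension structure and a typical star join, not taken from the medical informatics system in Diagram 2.14:

```python
import sqlite3

# Star schema sketch: one fact table with foreign keys into two
# dimension tables. All names and values are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE dim_patient (patient_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE fact_visit  (
    patient_key INTEGER REFERENCES dim_patient(patient_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    charge      REAL   -- the numeric measure
);
INSERT INTO dim_patient VALUES (1, 'Alice', 'North'), (2, 'Bob', 'South');
INSERT INTO dim_date    VALUES (1, 2009, 1), (2, 2009, 2);
INSERT INTO fact_visit  VALUES (1, 1, 120.5), (1, 2, 80.0), (2, 2, 55.0);
""")

# A typical star join: aggregate the measure by a dimension attribute.
cur.execute("""
    SELECT p.region, SUM(f.charge)
    FROM fact_visit f JOIN dim_patient p ON f.patient_key = p.patient_key
    GROUP BY p.region ORDER BY p.region
""")
print(cur.fetchall())  # [('North', 200.5), ('South', 55.0)]
```

Each fact row carries only foreign keys and measures; the descriptive attributes (name, region, year) live once in the dimension tables, which is the defining property of the star layout.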
Star schemas do not explicitly provide support for attribute hierarchies, which makes them less suitable for architectures such as MOLAP that require many hierarchies of dimension tables for efficient drilling of datasets.
Snowflake schemas provide a refinement of star schemas in which the dimensional hierarchy is explicitly represented by normalizing the dimension tables, as shown in Diagram 2.15. The main advantage of the snowflake schema is the improvement in query performance due to minimized disk storage requirements and joins on smaller lookup tables. The main disadvantage is the additional maintenance effort needed due to the increased number of lookup tables.
Levene, M. (2003) stresses that, in addition to the fact and dimension tables, data warehouses store selected summary tables containing pre-aggregated data. In the simplest cases, the pre-aggregated data corresponds to aggregating the fact table on one or more selected dimensions. Such pre-aggregated summary data can be represented in the database in at least two ways. Whether to use a star or a snowflake schema mainly depends on business needs.
2.3.2 OLAP Evaluation
As OLAP technology takes a prominent place in the data warehouse industry, there should be a suitable assessment tool to evaluate it. E.F. Codd not only coined the term OLAP but also provided a set of procedures known as the ‘Twelve Rules’ for OLAP product capability assessment, which include data manipulation, unlimited dimensions and aggregation levels, and flexible reporting, as shown in Table 2.8 (Codd, 1993):
Codd’s twelve rules of OLAP provide an essential tool to verify that the OLAP functions and OLAP models used are able to produce the desired result. Berson, A. (2001) stressed that a good OLAP system should also support complete database management tools, as a utility for integrated, centralized management permitting the distribution of databases within the enterprise. OLAP’s drilling mechanism within the MDDB should allow drill-down right to the source, or root, of the detail record level; this implies that the OLAP tool permits a smooth changeover from the MDDB to the detail record level of the source relational database. OLAP systems must also support incremental database refreshes. This is an important feature, preventing stability issues in operations and usability problems as the size of the database increases.
2.3.1 OLTP and OLAP
The design of OLAP for multidimensional cubes is entirely different from that of OLTP for databases. OLTP is implemented in relational databases to support daily processing in an organization. The OLTP system’s main function is to capture data into computers. OLTP allows effective manipulation and storage of data for daily operations, resulting in huge quantities of transactional data. Organizations build multiple OLTP systems to handle the huge quantities of daily operational transaction data in a short period of time.
OLAP is designed for data access and analysis to support the managerial user’s strategic decision-making process. OLAP technology focuses on aggregating datasets into a multidimensional view without hindering system performance. Han, J. (2001) describes OLTP systems as “customer oriented” and OLAP as “market oriented”. He summarized the major differences between OLTP and OLAP systems based on 17 key criteria, as shown in Table 2.7.
It is complicated to merge OLAP and OLTP into one centralized database system. The dimensional data design model used in OLAP is much more effective for querying than the relational model used in OLTP systems. OLAP may use one central database as its data source, while OLTP uses different data sources from different database sites. The dimensional design of OLAP is not suitable for OLTP systems, mainly due to redundancy and the loss of referential integrity of the data. Organizations therefore choose to have two separate information systems: one OLTP system and one OLAP system (Poe, V., 1997).
We can conclude that the purpose of OLTP systems is to get data into computers, whereas the purpose of OLAP is to get data or information out of computers.
Many data mining scholars (Fayyad, 1998; Freitas, 2002; Han, J. et al., 1996; Frawley, 1992) have defined data mining as the discovery of hidden patterns from historical datasets using pattern recognition, as it involves searching for specific, unknown information in a database. Chung, H. (1999) and Fayyad et al. (1996) refer to data mining as a step of knowledge discovery in databases: the process of analyzing data and extracting knowledge from a large database, also known as a data warehouse (Han, J., 2000), and turning it into useful information.
Freitas (2002) and Fayyad (1996) have recognized the advantageous tool of data mining for extracting knowledge from a da

This student written literature review is published as an example.

Data warehouse and data mining

Data mining and data warehousing are among the important issues in the corporate world today. The biggest challenge in a world that is full of information is searching through it to find connections and data that were not previously known. Dramatic advances in data development have made the roles of data mining and the data warehouse important for improving business operations in organizations. The importance of data mining and the data warehouse in organizations is seen in the process of accumulating and integrating vast and growing amounts of data in various formats and various databases. This paper discusses data warehousing and data mining: the concepts of data mining and the data warehouse, the tools and techniques of data mining, and the benefits of data mining and the data warehouse to organizations.
Keywords: data, data warehouse, data mining, data mart
Organizations tend to grow and prosper as they gain a better understanding of their environment. Typically, business managers must be able to track daily transactions to evaluate how the business is performing. By tapping into the operational database, management can develop strategies to meet organizational goals. Identifying trends and patterns in data is a key factor in accomplishing that. Moreover, the way an organization handles its operational data matters, because the reason for generating, storing, and managing data is to create information that becomes the basis for rational decision making. To facilitate the decision-making process, decision support systems (DSSs) were developed: arrangements of computerized tools used to assist managerial decision making within a business. Decision support is a methodology designed to extract information from data and to use that information as a basis for decision making. However, information requirements have become so complex that it is difficult for a DSS to extract all necessary information from the data structures typically found in an operational database. Therefore, data mining and data warehousing were developed and have become proactive methodologies for supporting managerial decision making in organizations.


Concept of Data Warehouse
A data warehouse is a firm's repository that stores and updates the historical business data of an organization, transforming the data into a multidimensional data model for efficient querying and analysis. The stored data are extracted from multiple operational systems in the organization and contain information about relevant activity that occurred in the past, in order to support organizational decision making. A data mart, on the other hand, is a subset of a data warehouse. It holds special information that has been grouped to help a business make better decisions; the data used here are usually derived from the data warehouse. The first organized use of such large databases started with OLAP (Online Analytical Processing), whose focus is analytical processing for the organization. The difference between a data mart and a data warehouse is only the size and scope of the problem being solved.
According to William H. Inmon (2005), a data warehouse is a "subject-oriented, integrated, time-variant, and non-volatile collection of data in support of the management's decision-making process". To understand that definition, its components are explained in more detail below:
Integrated. Provide a unified view of all data elements with a common definition and representation for all business units.
Subject-oriented. Data are stored with a subject orientation that facilitates multiple views of the data and facilitates decision making. For example, sales may be recorded by product, by division, by manager, or by region.
Time-variant. Dates are recorded with a historical perspective in mind. Therefore, a time dimension is added to facilitate data analysis and various time comparisons.
Non-volatile. Data cannot be changed. Data are added only periodically from historical systems. Once the data are properly stored, no changes are allowed. Therefore, the data environment is relatively static.
In summary, the data warehouse is usually a read-only database optimized for data analysis and query processing. Typically, data are extracted from various sources and are then transformed and integrated, in other words, passed through a data filter, before being loaded into the data warehouse. Users access the data warehouse via front-end tools and end-user application software to extract the data in usable form.
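As a rough illustration of that extract-transform-load flow, the sketch below (with invented source records and field names) filters and unifies raw rows before loading them into an in-memory "warehouse" list.

```python
# Minimal ETL sketch: the source records, field names, and the bad
# "n/a" row are all invented for illustration.
source_a = [{"cust": "Alice", "amount": "100.50", "date": "2010-01-03"}]
source_b = [{"cust": "BOB",   "amount": "80",     "date": "2010-01-04"},
            {"cust": "EVE",   "amount": "n/a",    "date": "2010-01-05"}]

def extract():
    # Extract: pull raw rows from each operational source.
    return source_a + source_b

def transform(rows):
    # Transform: drop rows that fail the data filter, then unify
    # types and representations (lower-case names, float amounts).
    clean = []
    for r in rows:
        try:
            amount = float(r["amount"])
        except ValueError:
            continue  # unparseable amount: filtered out
        clean.append({"cust": r["cust"].lower(),
                      "amount": amount,
                      "date": r["date"]})
    return clean

warehouse = []  # stands in for the read-only analytical store

def load(rows):
    # Load: append the integrated, filtered rows to the warehouse.
    warehouse.extend(rows)

load(transform(extract()))
print(warehouse)  # two clean rows; the "n/a" row was filtered out
```

Real ETL tools read from operational systems and write to a database rather than a list, but the extract/filter/unify/load sequence is the same.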
The Issues That Arise in Data Warehouse
Although the centralized and integrated data warehouse can be a very attractive proposition that yields many benefits, managers may be reluctant to embrace this strategy. Creating a data warehouse requires time, money, and considerable managerial effort. Therefore, it is not surprising that many companies begin their foray into warehousing by focusing on more manageable data sets that are targeted to meet the special needs of small groups within the organization. These smaller data warehouses are called data marts. A data mart is a small, single-subject data warehouse subset that provides decision support to a small group of people. Some organizations choose to implement data marts not only because of the lower cost and shorter implementation time, but also because of current technological advances and inevitable "people issues" that make data marts attractive. Powerful computers can provide a customized DSS to small groups in ways that might not be possible with a centralized system. Also, a company's culture may predispose its employees to resist major changes, but they might quickly embrace relatively minor changes that lead to demonstrably improved decision support. In addition, people at different organizational levels are likely to require data with different summarization, aggregation, and presentation formats. Data marts can serve as a test vehicle for companies exploring the potential benefits of data warehouses. By migrating gradually from data marts to data warehouses, a specific department's decision support needs can be addressed within a reasonable time frame (six months to one year), as compared to the longer time frame usually required to implement a data warehouse (one to three years). Information Technology (IT) departments also benefit from this approach because their personnel have the opportunity to learn the issues and develop the skills required to create a data warehouse.
Concept of Data Mining
Data mining comprises the forecasting techniques and analytical tools extensively used in industries and corporations to ensure effective decision making. Data mining tools analyze data, uncover problems or opportunities hidden in data relationships, form computer models based on their findings, and then use those models to predict business behavior, requiring minimal end-user intervention. They work by searching for valuable information in a huge amount of data collected over time and defining the patterns or relationships that the data present. In the business field, organizations use data mining to predict customer behavior in the business environment. The process of data mining starts by analyzing the data from different perspectives and summarizing it into useful information, from which knowledge is then created to address any number of business problems. For example, banks and credit card companies use knowledge-based analysis to detect fraud, thereby decreasing fraudulent transactions. In fact, data mining has proved very helpful in finding practical relationships among data that help define customer buying patterns, improve product development and acceptance, reduce healthcare fraud, analyze stock markets, and so on.
Data Mining in Historical Perspective
Over the last 25 years or so, there has been a gradual evolution from data processing to data mining. In the 1960s, businesses routinely collected data and processed it using database management techniques that allowed an orderly listing and tabulation of the data as well as some query activity. OLTP (Online Transaction Processing) became routine, data retrieval from stored data became faster and more efficient because of the availability of new and better storage devices, and data processing became quicker and more efficient because of advancements in computer technology. Database management advanced rapidly to include highly sophisticated query systems and became popular not only in business applications but also in scientific inquiries.
Approaches of Data Mining in Various Industries
With data mining, a retail store may find that certain products are sold more in one channel of distribution than in others, certain products are sold more in one geographical location than in others, and certain products are sold when a certain event occurs. With data mining, a financial analyst would like to know the characteristics of a successful prospective employee; credit card departments would like to know which potential customers are more likely to pay back their debt and, when a credit card is swiped, which transaction is fraudulent and which one is legitimate; direct marketers would like to know which customers purchase which types of products; booksellers like Amazon would like to know which customers purchase which types of books (fiction, detective stories, or any other kind); and so on. With this type of information available, decision makers will make better choices. Human resource people will hire the right individuals. Credit departments will target those prospective customers that are less prone to become delinquent or less likely to be involved in fraudulent activities. Direct marketers will target those customers that are likely to purchase their products. With the insight gained from data mining, businesses may wish to re-configure their product offering and emphasize specific features of a product. These are not the only uses of data mining. Police use this tool to determine when and where a crime is likely to occur, and what the nature of that crime would be. Organized stock exchanges detect fraudulent activities with data mining. Pharmaceutical companies mine data to predict the efficacy of compounds as well as to uncover new chemical entities that may be useful for a particular disease. The airline industry uses it to predict which flights are likely to be delayed (well before the flight is scheduled to depart). Weather analysts determine weather patterns with data mining to predict when there will be rain, sunshine, a hurricane, or snow.
Besides that, nonprofit organizations use data mining to predict the likelihood of individuals donating to a certain cause. The uses of data mining are far-reaching, and its benefits may be quite significant.
Data Mining Tools and Techniques
Data mining is a set of tools that learn from the data obtained and then use the resulting information for business forecasting. Data mining tools use and analyze data that exist in databases, data marts, and data warehouses. Data mining tools can be categorized into four groups: prediction tools, classification tools, clustering analysis tools, and association rules discovery. Below is an elaboration of each:
Prediction Tools
A prediction tool is a method derived from traditional statistical forecasting for predicting the value of a variable.
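A minimal example of such a tool is an ordinary least-squares trend line; the sketch below fits one to invented monthly sales figures and extrapolates the next value.

```python
# Ordinary least-squares trend line over invented monthly sales.
xs = [1, 2, 3, 4, 5]                  # month number
ys = [10.0, 12.0, 13.5, 16.0, 18.0]  # sales in that month

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

def predict(x):
    # Value of the fitted trend line at x.
    return intercept + slope * x

print(slope, intercept)  # approx. 2.0 and 7.9
print(predict(6))        # forecast for month 6, approx. 19.9
```

Commercial prediction tools wrap far richer statistical models, but the idea of fitting historical data and extrapolating forward is the same.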
Classification Tools
Classification tools attempt to distinguish between classes of objects or actions. For example, an advertiser may want to know which aspect of its promotion is most appealing to consumers. Is it the price, quality, or reliability of a product? Or maybe it is a special feature that is missing on competitive products. These tools help provide such information on all the products, making it possible to use the advertising budget in the most effective manner.
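One simple way such a tool can work is nearest-neighbour classification; the sketch below, with invented consumer features and labels, assigns a new consumer to the "price" or "quality" class of its closest training example.

```python
# 1-nearest-neighbour classifier over invented consumer data:
# (price_sensitivity, quality_sensitivity) -> most appealing aspect.
training = [
    ((0.9, 0.2), "price"),
    ((0.8, 0.3), "price"),
    ((0.2, 0.9), "quality"),
    ((0.3, 0.8), "quality"),
]

def classify(point):
    # Squared Euclidean distance between two feature tuples.
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    # Return the label of the nearest training example.
    _, label = min(training, key=lambda t: dist2(t[0], point))
    return label

print(classify((0.85, 0.25)))  # price
print(classify((0.25, 0.95)))  # quality
```

Real classification tools use decision trees, neural networks, and the like, but all share this pattern of learning class boundaries from labelled examples.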
Clustering Analysis Tools
These are very powerful tools for clustering products into groups that naturally fall together, where the groups are identified by the program. Most of the clusters discovered may not be useful for business decisions; however, analysts may find one or two that are extremely important, which the company can take advantage of. The most common use is market segmentation: in this process, a company divides its customer base into segments based on characteristics such as income and wealth. Each segment is then treated with a different marketing approach.
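A bare-bones version of this idea is one-dimensional k-means; the sketch below segments invented customer incomes into two groups by repeatedly reassigning each customer to the nearest centroid.

```python
# One-dimensional k-means sketch: segment customers by (invented)
# annual income, in thousands, into k=2 groups.
incomes = [21, 23, 25, 80, 85, 90]
centroids = [21.0, 90.0]  # initial guesses

for _ in range(10):  # a few refinement passes are plenty here
    clusters = {0: [], 1: []}
    for x in incomes:
        # Assign each income to its nearest centroid.
        nearest = min((0, 1), key=lambda i: abs(x - centroids[i]))
        clusters[nearest].append(x)
    # Recompute each centroid as the mean of its cluster.
    centroids = [sum(c) / len(c) for c in clusters.values()]

print(centroids)  # [23.0, 85.0]: a low-income and a high-income segment
```

Production clustering tools handle many features at once and choose the number of segments automatically, but the assign-and-recompute loop is the core of the technique.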
Association Rules Discovery
This tool discovers associations, such as what kinds of books certain groups of people read, what products certain groups of people purchase, and so on. Businesses use such information in targeting their markets. For instance, a company may recommend movies based on movies people have watched and rated in the past.
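The support-and-confidence counting behind such rules can be sketched in a few lines; the baskets below are invented, and the rule "bread → butter" is scored by how often the pair co-occurs relative to how often "bread" appears at all.

```python
from itertools import combinations
from collections import Counter

# Association-rule sketch over invented market baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "milk"},
]

item_count = Counter()
pair_count = Counter()
for b in baskets:
    item_count.update(b)
    # Count each unordered item pair appearing together in a basket.
    pair_count.update(frozenset(p) for p in combinations(sorted(b), 2))

# Confidence of the rule "bread -> butter": P(butter | bread).
rule = frozenset({"bread", "butter"})
confidence = pair_count[rule] / item_count["bread"]
print(round(confidence, 3))  # 0.667: butter appears in 2 of 3 bread baskets
```

Algorithms such as Apriori scale this counting to millions of baskets and longer itemsets, but the support/confidence arithmetic is the same.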
There are four general phases in data mining which are data preparation, data analysis and classification, knowledge acquisition and prognosis.
Data Preparation
In the data preparation phase, the main data sets to be used by the data mining operation are identified and cleaned of any data impurities. Because the data in the data warehouse are already integrated and filtered, the data warehouse usually is the target set for data mining operations.
Data Analysis
The data analysis and classification phase studies the data to identify common data characteristics or patterns. During this phase, the data mining tool applies specific algorithms to find:

Data groupings, classifications, clusters, or sequences.
Data dependencies, links, or relationships.
Data patterns, trends, and deviations.

Knowledge Acquisition
The knowledge-acquisition phase uses the results of the data analysis and classification phase. During the knowledge-acquisition phase, the data mining tool (with possible intervention by the end user) selects the appropriate modeling or knowledge-acquisition algorithms. The most common algorithms used in data mining are based on neural networks, decision trees, rule induction, genetic algorithms, classification and regression trees, memory-based reasoning, nearest neighbor, and data visualization. A data mining tool may use many of these algorithms in any combination to generate a computer model that reflects the behavior of the target data set.
Although many data mining tools stop at the knowledge-acquisition phase, others continue to the prognosis phase. In that phase, the data mining findings are used to predict future behavior and forecast business outcomes. Examples of data mining findings can be:

65% of customers who did not use a particular credit card in the last six months are 88% likely to cancel that account.
82% of customers who bought a 27-inch or larger TV are 90% likely to buy an entertainment center within the next four weeks.
If age

The complete set of findings can be represented in a decision tree, a neural net, a forecasting model, or a visual presentation interface that is used to project future events or results. For example, the prognosis phase might project the likely outcome of a new product rollout or a new marketing promotion.
The Benefits and Weaknesses of Data Warehouse to Organizations
The data warehouse is one of the powerful techniques applied in organizations to assist managerial decision making within a business, and this methodology has become a crucial asset in the modern business enterprise. It is designed to extract information from data and to use that information as a basis for decision making. An organization benefits from a data warehouse because it is a central repository that stores historical information: even though the data come from different locations and various points in time, all the relevant data are assembled in one location and organized in an efficient manner. Indirectly, this profits the company because it greatly reduces computing costs. One advantage of using a data warehouse is that it allows access to large volumes of information for solving problems that arise in the business. All the data from multiple sources located in the central repository can be analyzed to produce a choice of solutions.
However, there are also weaknesses to consider. The data warehousing process takes a long time, because data must be cleaned, extracted, and loaded before they can be stored in the warehouse. Maintaining the data is another problem, because it is not easy to handle. Compatibility may also be an issue when implementing a data warehouse, because a new transaction system may not work with the systems already in use. Besides that, users must be trained to use the system; without proper training, problems may arise. Furthermore, if the data warehouse can be accessed via the Internet, security might become an issue. The biggest problem related to the data warehouse is cost, which must be taken into consideration, especially for maintenance. Any organization that is considering using a data warehouse must decide whether the benefits outweigh the costs.
Successfully supporting managerial decision making depends significantly on the availability of integrated, high-quality information organized and presented in a timely and easily understood way. Data mining and data warehousing have emerged to meet this need. Their application will be a crucial element in organizations, helping management run operations smoothly while accomplishing business goals, because both techniques are the foundation of decision support systems. Today data mining and data warehousing are important tools, and more companies will begin using them in the future.



Chemist Warehouse Human Resource Strategy


Chemist Warehouse is a pharmaceutical company that operates a chain of retail pharmacies. The company is based in Australia, and it is the biggest pharmacy retailer in the country, with over 8,000 staff members. The company deals in pharmaceutical products and is known for offering discounted prices ("Chemist Warehouse workers strike over alleged sexual harassment, job casualization," 2019). However, the company's industrial relations have been a challenge: its employees from the warehouse in Preston and storehouses in Somerton, Victoria, went on strike in March 2019 over what they claimed was labor exploitation and sexual harassment (Valin et al., 2018). Chemist Warehouse is an example of a company that needs to implement strong human resources strategies to eliminate bad industrial relations.

Case Analysis

Industrial relations are a key determinant of the well-being of employees in any organization. Poor industrial relations involve bad treatment of employees in the workplace, which leads to low productivity because demotivated employees become less productive. Chemist Warehouse has been experiencing this challenge as a result of sexual harassment and exploitation of workers (Bacon et al., n.d.). Additionally, there has been work casualization of about seventy percent (Chidi & Okpala, 2012). According to ABC News, Chemist Warehouse workers said that they would receive messages as late as 10.00 pm indicating whether they would report to work the next day or not. This situation is similar to the time Fox wrote his book "Beyond Contract" (1974), when work relations in the United Kingdom were impoverished. In Fox's time, managers used their power to exploit employees for their own gain. As a result, Fox proposed that industrial relations can be viewed through three frames of reference: Unitism, Pluralism, and Radicalism.


Unitism

This was a descriptive and normative theory held by many managers. It is based on the perception in society and organizations that managers should be given the privilege of controlling every situation in an organization; this idea was theorized as the harmony of interests and the doctrine of common purpose. Such ideas supported the belief that lack of trust and conflict in work relations were a pathology resulting from sectional interest, aggravated by the political motivation of shop stewards and by poor leadership and managerial communications. This frame of reference sees an organization as a harmonious whole consisting of loyal workers and management teams (Berg & Farbenblum, 2017), and it assumes a common interest between workers and managers. In the Chemist Warehouse context, the management and the workers are not on good terms; hence, the organization is not harmonious. Management at Chemist Warehouse has an interest in exploiting the workers, which is against the will of most staff members and employees. The management is interested in getting good salaries and benefits from the company, while staff and casual workers are likewise interested in better salaries and increased benefits (Chidi & Okpala, 2012).


Pluralism

This frame of reference views an organization as consisting of many different groups, each with its own interests. It best fits the post-World War Two social democracy, which compromised between different classes holding pro-capitalist positions. The pluralist frame of reference focused on overcoming the limitations of Unitism by setting limits on coercion (Perelberg, 2017). According to Fox, Pluralism is a more realistic analysis of industrial relations.


Moreover, Fox suggested that Pluralism is an essential model for enhancing industrial relations in organizations made up of diverse groups with different interests. The pluralist view of organizations focuses on different but related group objectives and interests, which form the basis for the norms of an organization. According to Fox, these interests must be respected because they are the sources of leadership, which must be understood by every stakeholder of an organization (Kirk, 2017). This frame of reference best fits the Chemist Warehouse context because the company consists of different groups of people with different interests, including casual workers, staff, and the organization's management team. Each group has interests that are almost similar to those of the other groups.


Radicalism

This is the third frame of reference that Fox proposed, after the pluralist perspective. It takes into account the coercive duress that binds workers to the employment contract, and it is based on the alienation of particular workgroups that results from the exploitation of groups in the organization. In this perspective, a balance of power in an organization is almost impossible. People compete for scarce resources, and hence one group or more must be exploited in that competition (Loignon et al., 2017). In most cases, employees are forced to accept structural inequalities and are expected to submit to the employment contract despite its provisions, which are against the interests of the employees.

Human resource management Strategies to resolve issues

Employees are the largest capital of any organization, and hence they must be treated well. Proper management of human resources results in increased profitability for every organization. Human resource management strategies are ways in which employees can be managed efficiently and effectively in order to achieve organizational goals and objectives. Coming up with an effective employee management program is the first strategy that Chemist Warehouse needs to implement. This strategy will help determine relevant dimensions of performance among all employees by providing all the incentives required for better performance.

Another strategy that Chemist Warehouse should implement is a competitive compensation plan based on the payment packages of other companies. This package would include better employee benefits such as medical allowances, house allowances, insurance cover, and bonuses. A competitive compensation package will reduce employee turnover and eliminate strikes. Moreover, such benefits make employees much more motivated in the workplace, helping the organization achieve its goals and objectives.

These strategies are essential in ensuring that the organization is in a harmonious state, which results in a smooth flow of activities and improves group cohesion as well as employer-employee relations. A harmonious environment is vital for increased production and productivity in an organization.



Bacon, N., Blyton, P., Fiorito, J., Heery, E., Budd, J. W., & Bhave, D. Values, ideologies, and frames of reference in employment relations.

Berg, L., & Farbenblum, B. (2017). Wage theft in Australia: Findings of the national temporary migrant work survey. Available at SSRN 3140071.

Chemist Warehouse workers strike over alleged sexual harassment, job casualisation. (2019). Retrieved 15 August 2019, from

Chidi, C. O., & Okpala, O. P. (2012). Theoretical approaches to employment and industrial relations: a comparison of subsisting orthodoxies. Theoretical and Methodological Approaches to Social Sciences and Knowledge Management, 263-278.

Fox, A. (1974). Beyond contract: Work, power and trust relations. London: Faber and Faber.

Kirk, N. (2017). Labour and the politics of empire: Britain and Australia 1900 to the present.

Loignon, A. C., Woehr, D. J., Thomas, J. S., Loughry, M. L., Ohland, M. W., & Ferguson, D. M. (2017). Facilitating peer evaluation in team contexts: The impact of frame-of-reference rater training. Academy of Management Learning & Education, 16(4), 562-578.

Perelberg, R. J. (2017). A Multi-Dimensional Frame of Reference: The Independent tradition. In British Psychoanalysis (pp. 14-18). Routledge.

Valin, M., Proulx, C., Amiot, L. P., Falardeau, B., & Deslongchamps, J. (2018). U.S. Patent No. 9,901,405. Washington, DC: U.S. Patent and Trademark Office.


Warehouse Automation Efficiency Case Study

This paper examines warehouse automation efficiency using special equipment and software, through a case study of CCHBS. Businesses are interested in automation but uncertain about its cost effectiveness, since implementing such projects is expensive. However, the spending is offset by savings and by increased warehouse activity. Most previous works illustrated only one type or group of equipment, or described several briefly without comparing them. In this thesis, a comparative analysis is performed to define which equipment is preferable in terms of technical parameters and expected performance. The findings of this paper are expected to simplify the process of choosing warehouse automation facilities that allow a company to gain a competitive advantage in the market. Moreover, the method presented in this paper can be used by other companies whose warehouses need to be developed.

Keywords: warehouse logistics, warehouse automation


Nowadays Industry 4.0 (the fourth industrial revolution, characterized by the cyber-physical transformation of manufacturing) increases its impact on most parts of business, including logistics. Modern technologies can improve processes of production, transportation, warehousing, etc. This paper focuses on methods of automating the storage of an enterprise's finished products.

Finished products are an indispensable part of manufacturing, but it is hard to keep products properly without a suitable warehouse. Storing is a very old process; we will examine it from the beginning of the last century, when mass production appeared and logistics became a part of business.

The evolution of modern warehouse logistics is characterized by four main periods (Lukinskiy, Lukinskiy & Pletneva, 2016). The first period, called "Fragmentation," lasted from 1920 to 1950. The idea was to separate logistics activities to reduce costs, and warehousing was separated as an individual business process. During the next period, "Emerging" (1950-1970), there was intensive development of theories and practices in logistics, which affected logistics processes because storing and transportation were connected for the first time. Moreover, this concept was the basis for another period, Integrated Logistics. But before this step in the evolution of warehouse logistics, technology made a leap forward that influenced operational processes at storage facilities through automation.


From that time to the present, automation technologies have played a significant role in logistics, including storing. That is why most modern large warehouses are complex technical buildings consisting of many subsystems, such as the set of processed goods, the information technology infrastructure, etc., with elements of the structure combined to serve the functions of material flow transformation. Put simply, the warehouse is the starting and final point of all logistics functional areas of manufacturers and wholesalers. That is why logistics manages not the warehouse itself, but the material flow coming through the warehouse system.

Storage automation systems can be roughly divided into two groups: software and hardware. The latter term refers not to computer components, but to the mechanisms that improve the flow of goods, for example indoor warehouse vehicles.

There are a number of academic works on the subject. Dadzie, Johnston and Sadchev (2014) analyze three groups of automation technologies and suggest how they can influence an abstract warehouse, but give no information about the stored goods or the precise characteristics of the automation products. Hamberg and Verriet (2012) list certain technologies and their potential for improving operational processes, but do not compare them. Baker and Halim (2010) show the cost efficiency of voice and light picking systems integrated in a finished goods warehouse.

All previous works have focused only on illustrating current technologies or on the efficiency of one of them. However, how do different methods of warehouse automation affect business indicators? Despite the expense of implementation, automation provides opportunities for improving company activities because it increases warehouse efficiency and reduces logistics costs.

In this paper we will research automation of a warehouse complex using the example of Coca-Cola HBS Eurasia. At the current challenging moment, storage plays a significant role in reaching the company's previous leading position.

Warehouse automation is generating considerable interest as a means of reducing costs, because the volatility of the economic situation in Russia causes instability of demand for fast-moving consumer goods. Implemented technologies help to control the amount of finished goods and to manage material flows. Moreover, reducing the human factor decreases the number of mistakes, which moves the operation closer to the idea of 0 PPM, an indicator showing that the operational process has no more than one mistake per million operations. Automation also reduces logistics costs, by up to 35% (Petchina, 2018).
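The PPM indicator mentioned above is simply the error count scaled to a million operations. A minimal sketch (the figures are illustrative, not CCHBS data):

```python
def mistakes_ppm(mistakes: int, operations: int) -> float:
    """Mistakes per million operations (PPM)."""
    if operations <= 0:
        raise ValueError("operations must be positive")
    return mistakes / operations * 1_000_000

# Example: 3 picking errors across 1.2 million warehouse operations.
print(mistakes_ppm(3, 1_200_000))  # 2.5
```

An operation counted this way can be any unit the warehouse tracks (a pick, a shipment line, a put-away); the target of 0 PPM means driving this value below 1.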

In previous studies, researchers examined ways of decreasing risks and mistakes, methods of improving warehouses, and different automation technologies. However, these studies are scattered, and there is no comparative assessment of modern storage automation instruments.

In order to investigate the issue, it is necessary to explore three tasks:

review current warehouse automation technologies;

explore their technical efficiency;

analyze financial relevance of the implementation.

The first task will allow us to analyze and compare technologies that can be implemented in a warehouse. The second one will help to understand which of them are truly useful. The solution of the last one will show their appropriateness in a given situation.

Literature review

A considerable amount of valuable work on supply chain management in the economic and technical sciences has been done to revise the notion of "logistics". All these studies analyze material flow movements at a macro or micro level, but some of them focus on different aspects of the supply chain.

The definition of warehousing and stocking has evolved along with research papers and academic works. In the first half of the 20th century, the warehouse was first described not merely as a place for storing goods, but as a part of the company that can generate profit (Lukinskiy et al., 2016). Keeping raw materials and finished goods has a direct effect on logistics efficiency, especially its economic side.

The warehouse is one of the most important elements of the infrastructure of commodity markets and of the logistics systems actively developing in our country. In addition to transportation costs, the costs of storage, inventory management and cargo handling account for the lion's share of total logistics costs. With the introduction of various information innovation systems, these costs can be reduced by up to 30%.

Compared to the manufacturing sphere, automation has been adopted less rapidly in warehouse logistics. According to Coimbra (2013), the share of storage logistics companies using automation technologies lagged behind for years and only in the early 1990s reached that of manufacturing. Coimbra attributes this to two main problems: the lack of appropriate corporate IT tools and insufficient attention to warehousing as a business function.

Due to technological development, these problems seem to have been solved. Cost increases have made directors pay attention to warehouse processes, so automation has become an important option. That is why there were expectations of growth in adoption decisions, and many companies have started to improve this part of their logistics activities.

Hybrid lift trucks, horizontal transfer systems, and automated storage and retrieval systems were the most popular kinds of equipment for storage automation (Coimbra, 2013). Their main uses are horizontal movement, storage, and order item picking, which is why the main warehousing activities can be assisted by automation.

Researchers at the end of the 20th century noted that warehouses could not be fully automated, because manual phases cannot be avoided entirely. For example, Kerremans, Thunisse and Drury (1991) highlight the difficulty of distinguishing between automatic and manual processes in their survey. Baker and Naish (2004) have defined four levels of automation: manual, mechanically assisted, simple automation, and complex automation.

According to Coimbra (2013), a typical situation is that a company automates its storage processes only partly. However, there is a tendency to improve this situation among multinational companies (Dotoli, Epicoco, Falagario, Costantino & Turchiano, 2015).

There is a variety of benefits which may be achieved by storage automation. These include reduced labor costs, increased accuracy and process speed, and growth in storage capacity. According to Kerremans et al. (1991), labor cost reduction is the most popular reason for automation and its key performance indicator, cited by 89 percent of their respondents. Other arguments, which are less obvious and harder to measure, include a decrease in damaged units, better quality of service and fewer staff mistakes (Baker & Naish, 2004).

However, improvements always come with costs. The main disadvantage is the long period of implementation and commissioning (Baker & Naish, 2004). While the new equipment is being installed, the level of service may decrease significantly. Moreover, in the early period the flexibility of logistics processes may be lost. Both effects may significantly influence the company's supply chain. That is why automation must be carried out under a well-calibrated implementation plan.


It is known that a goal should be measurable to be achievable. Nevertheless, many automation effects are difficult to measure (Baker & Naish, 2004). That is why directors often have difficulties with automation project decisions. Baker and Naish argue that the main reasons for such project failures are undefined or unrealistic indicators. They claim that objectives ought to be set quantitatively, for example, as a target level of cost decrease. Unrealistic expectations about spending are the main financial problem in this situation.

Furthermore, the range of warehouse equipment and software is so large that choosing among them tends to be challenging. According to Speranza and Stahly (2012), automation oversees the following tasks, which need to be covered and controlled:

formation of all required reports and documentation;

registration of goods receipts to the warehouse and the issuance of products from the warehouse;

registration of product shipments from the warehouse, as well as returns from the customer;

inventory, calculation and adjustment of balances;

accounting of product balances on stock and quality control;

movement between warehouse units and placement of goods for storage;

return of products to production if a defect is found;

assembly, packaging, preparation of packing documents for goods, and formation of required licensed and certified sets.

Improving these warehouse processes is important for the whole business. Reducing the duration of each operation, the number of operations and the amount of staff involved positively affects the risks and efficiency of the company.


The goal of this research will be achieved by choosing appropriate warehouse equipment. To reach it, a recommendation model will be formed.

Before giving recommendations, the current situation and the offers from equipment supplier companies should be analyzed. Market analysis will provide information about the warehouse products available now. To create the system, it is necessary to understand which products will help in each case. For this reason, information about storage characteristics will be collected, such as:

warehouse volume and capacity;

number of SKUs;

cargo turnover;

type of cargo;

current equipment and software.

For equipment analysis, the collected information will also include:

average and maximum speed (for vehicles and some kinds of rack systems);

availability for integration into a WMS system.

For WMS systems, the analysis will also require:

current ERP system usage;

automated equipment existence;

server’s location.

Automation in general increases the efficiency of logistics operations. However, different companies have different purposes. The main part of collecting and analyzing data will be identifying the reasons for warehouse automation. To do this, the possible aspects to be improved will be analyzed, for example:

reduction of operational time;

decrease in the amount of staff;

operational process optimization.
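The characteristics listed above can be gathered into a simple data model before the comparison stage. The sketch below is illustrative only: the field names, units and example values are assumptions, not part of the proposal's specification:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class WarehouseProfile:
    """Storage characteristics collected during the analysis phase."""
    volume_m3: float                 # warehouse volume
    capacity_pallets: int            # storage capacity
    sku_count: int                   # number of SKUs
    cargo_turnover_pallets_day: float
    cargo_type: str
    current_equipment: list = field(default_factory=list)
    current_software: list = field(default_factory=list)

@dataclass
class EquipmentProfile:
    """Parameters collected for each candidate tool."""
    name: str
    avg_speed_m_s: Optional[float]   # vehicles / some rack systems only
    max_speed_m_s: Optional[float]
    wms_integration: bool            # can it connect to a WMS

# Hypothetical example, not actual CCHBS figures:
profile = WarehouseProfile(
    volume_m3=50_000, capacity_pallets=12_000, sku_count=350,
    cargo_turnover_pallets_day=900, cargo_type="finished beverages",
)
print(profile.sku_count)  # 350
```

Collecting the data in one structure makes the later matrix comparison mechanical: every candidate tool is evaluated against the same set of fields.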

The parameters outlined above will help to illustrate the expected efficiency of the processes, but there is another part of the research: the economic impact on the company after warehouse automation. Because of that, preliminary prices for each tool should be requested during the market analysis.

After the analysis, the information about equipment and warehouse parameters will be clustered. It will be presented as a table or matrix with parameter groups in the horizontal line and products in the vertical one. At each intersection there will be a binary value: 1 if the product is appropriate for the parameter and 0 if not.

Information about the warehouse will be formed as a table with only one horizontal line. The groups should be the same as in the first table. Comparing the required parameters of the warehouse with the capabilities of the chosen equipment will help to form a short list of equipment.
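The matrix comparison described above can be sketched directly. The parameter groups, product names and binary values below are invented for illustration; a product stays on the short list only if it covers every parameter the warehouse requires:

```python
# Binary compatibility matrix: rows = products, columns = parameter groups.
# 1 = product is appropriate for the parameter group, 0 = not.
groups = ["high turnover", "small SKU range", "heavy pallets", "WMS ready"]

equipment_matrix = {
    "AS/RS crane":       [1, 0, 1, 1],
    "Pick-to-light":     [1, 1, 0, 1],
    "Hybrid lift truck": [0, 1, 1, 0],
}

# Warehouse requirements: a single row over the same groups.
warehouse_row = [1, 0, 1, 1]

def shortlist(matrix, requirements):
    """Keep products whose capabilities cover every required parameter."""
    return [name for name, caps in matrix.items()
            if all(cap >= req for cap, req in zip(caps, requirements))]

print(shortlist(equipment_matrix, warehouse_row))  # ['AS/RS crane']
```

Using `cap >= req` rather than strict equality means a product offering more than the warehouse requires is not penalized at this filtering stage; trade-offs are resolved later by the expert scoring.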

The next step will be the comparison of their implementation effects using an expert method. Each financial and logistics parameter will be defined through the analysis. After that, each of them will get a weight for the subsequent comparison with equal values. The sum of a tool's values illustrates the amount of effect it may give to the warehouse system: the higher the sum, the better the effect.
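The expert weighting step amounts to a weighted sum per shortlisted tool. The parameters, weights and expert scores below are placeholders, not results of the actual expert survey:

```python
# Weights over financial and logistics parameters; they sum to 1.
weights = {"labor cost saving": 0.4, "speed gain": 0.3,
           "accuracy gain": 0.2, "storage capacity": 0.1}

# Expert scores per shortlisted tool, on a 1-10 scale (illustrative).
scores = {
    "AS/RS crane":   {"labor cost saving": 9, "speed gain": 7,
                      "accuracy gain": 8, "storage capacity": 9},
    "Pick-to-light": {"labor cost saving": 6, "speed gain": 8,
                      "accuracy gain": 9, "storage capacity": 3},
}

def weighted_total(tool_scores, weights):
    """Higher weighted sum means a larger expected effect on the warehouse."""
    return sum(weights[p] * s for p, s in tool_scores.items())

ranked = sorted(scores, key=lambda t: weighted_total(scores[t], weights),
                reverse=True)
print(ranked[0])  # AS/RS crane
```

Because all tools are scored against the same weighted parameters, the resulting totals are directly comparable, which is exactly what the proposal needs to rank the short list.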



Speaking about the future results of this research, it should be stated that at the stage of preparing the project proposal there are a number of uncertainty factors. However, the recommendation model will facilitate the process of choosing rack and picking systems, indoor vehicles and WMS systems. It will indicate the appropriate equipment in the current situation. Moreover, the anticipated effect on the warehouse and the company will be captured.

The analysis described in the previous part of the proposal will provide information about economic and technical efficiency. It is important both for the logistics department and for the director of the CCHBS Russia company, because it makes it possible to compare costs and profit against the payback period, which matters in the current economic situation.


Automation has already become a significant part of business for companies that want to maintain and improve their competitive advantage in the market. Sometimes the price of the equipment seems too high, but the savings after implementation might be among the best investments the company can make. Moreover, it highlights ways for further improvement by giving employees more time to solve complex rather than routine tasks.

The system may be used for different periods and types of warehouse as the company's needs or infrastructure change. Furthermore, it might be implemented at all the company's plants, providing synergy in warehousing across the company.


Lukinskiy, V., Lukinskiy, V., & Pletneva, N. (2016). Logistika i upravlenie tsepyami postavok [Logistics and Supply Chain Management]. Moscow: Urait.

Hamberg, R., & Verriet, J. (2012). Automation in Warehouse Development. London: Springer-Verlag.

Dadzie, K., Johnston, W., & Sadchev, H. (2015). Organizational Characteristics and the Adoption of Innovative Warehouse Automation Technologies. In Proceedings of the 1993 Academy of Marketing Science (AMS) Annual Conference (pp. 581-583). Springer, Cham.

Baker, P., & Halim, Z. (2010). Method and system for optimized logistics planning. Supply Chain Management: An International Journal, 12(2), 129-138.

Petchina, D. (2018). Metody snizheniya logisticheskih zatrat na proizvodstvennom predpriyatii v sovremennyh usloviyah [Methods of decreasing logistics costs at a manufacturing enterprise in the current context].

Kerremans, M., Thunisse, H., & Drury, C. (1991). Impact of Automation on Logistics Cost. Accounting and Business Research, 21(82), 147-155.

Baker, P., & Naish, S. (2004). Evaluating the efficiency of 3PL logistics operations. Transport & Logistics Focus, 6, 18-23.

Coimbra, E. A. (2013). Kaizen in logistics and supply chains. New York, NY: McGraw-Hill Education.

Speranza, M. G., & Stahly, P. (Eds.). (2012). New trends in distribution logistics (Vol. 480). Springer Science & Business Media.

Dotoli, M., Epicoco, N., Falagario, M., Costantino, N., & Turchiano, B. (2015). An integrated approach for warehouse analysis and optimization: A case study. Computers in Industry, 70, 56-69.