Privacy Policy And Data Analysis For Health Record And Australian Weather Dataset

My Health Record System and its Privacy Provisions

Privacy in health research has become an issue that attracts considerable attention. Designing relevant health care services requires research into health-related issues. Data and information are a vital asset of any organization, and the confidential information within that data needs special attention. The organization should focus on maintaining the confidentiality of its information. Securing confidential information by using different software and hardware is known as information security. It implies a combined internal and external system of operation in which collected information and data are kept protected. Recorded data and information play several important roles in the functioning of an organization. Information security serves several functions. The primary responsibility is to maintain the privacy of the collected data. A secure information system also protects the capacity of the organization to perform its assigned functions. Another important aspect is the security of the technology the organization uses. Organizations give special attention to protecting information because unauthorized access to confidential information adversely affects people directly or indirectly connected to the organization.


All health-related data in Australia are documented under the supervision of the Australian Digital Health Agency. With the increasing prevalence of various health issues, the volume of gathered information is growing rapidly. The agency gathers and maintains all this health-related information. Growing concern about various diseases encourages more people to seek health care services. The number of people seeking health care services far exceeds the number of available service providers ( 2018). Therefore, maintaining detailed information about each individual recipient has become extremely important. Information is collected on personal details and health status. Technological advances in the equipment and machinery used by health care service units increase the information available about each recipient. Data related to prenatal testing is the most easily accessible. Analysis of health-related risk factors is an important aspect in determining the continuation of required services. Maintaining proper data and information also provides protection against unjustified allegations or claims. In addition to direct health care, various other aspects are considered under primary health care service. It is the responsibility of service providers to document observations and instructions. Third-party involvement is often observed in the system, where relevant information is used to pay for the services used.

Exploratory Data Analysis of Loan Delinquency Dataset

Health care organizations today pay great attention to securing information. Given the large volume of health care data, security systems are designed to maintain the confidentiality of personal information. Without proper security, it would be very easy to access and misuse this information. Information is now recorded and shared electronically instead of through the earlier paper-based method of documentation (Dinev et al. 2016). The paper-based method of photocopying important documents is laborious and requires more time than storing data electronically.

Data is gathered from various sources and then combined and linked to other profiles. The electronic process makes it easier to explore the database within the built network and to extract data from different remote locations. Nevertheless, the system increases the chance of third-party access to the stored data. Making access easy without adequate security leaves the organization's data unprotected, and an individual may be able to access the data without leaving any trace. At the same time, the system allows service providers to understand trends in the data, which indicate the health conditions of the population. Service providers report easy access to and understanding of the information presented in the database (Van Cauteren et al. 2016). With advances in technology, the information is used to support electronic health care records for individuals. Mobile technology is also used to collect the required data on individuals. The major significance of the EHR is to support the activities of the industry. Thus, the continuous evolution and improvement of the system directly increases the quality of health care service.

There are different measures in which information on the status of stored data is kept electronically, which reduces processing errors. With these measures, the risk of malpractice in meeting reimbursement claims can be avoided. However, the current legal framework used in health care recording has drawbacks. Moreover, the obligations of health care providers cover both electronic and paper-based methods. Thus, confidentiality conditions may vary depending on the information holder.


The rapid growth in the health record burden has encouraged the use of electronic recording systems, which directly support service providers. Large variation across the nation has also driven strong support for the electronic health record. The electronic health record plays a significant role in major hospitals, as it allows the authorities to understand the history of patients. The national center has indicated that 75% of service providers are able to enhance the quality of patient care with the use of electronic data. The electronic recording system allows individuals to access information about patients and make adequate decisions during critical hours (Kim et al. 2017). The system also provides alerts about new medications and physicians that patients are considering for their health issues. Hence, digital health technology has undergone serious changes in recent years to support hospitals with information about patients. The structure of the health record system has also played a role in distributing information among different hospitals.

Decision Tree and Logistic Regression Model Building for Predicting Loan Delinquency

The personal information of Australian citizens is stored by the Australian Digital Health Agency. Since the data consists of citizens' personal and crucial documents, it needs to be stored and governed under regulatory acts such as the Privacy Act 1988. The health record system manages all the personal information of the organization in a more classified way. The personal data collected and stored is useful and remains available whenever it is needed for communication and management purposes. The “My Health Record” system has been widely used by the company and operators for reclassifying and arranging the data. Crucial information gathered about the health care products and services rendered is stored in a structured way and with privacy. Protection of the personal data gathered is a privacy matter for the company and should be regulated by the regulatory bodies through rules and guidelines. The regulatory body can take several steps, such as imposing penalties and fines and bringing regulatory and criminal proceedings against those involved in breaching the secured and private data of the organization (Zingg et al. 2015).

Data flows into the organization in several ways, such as recorded telephone conversations, emails and general letters, which may contain private data. Organizations also collect various kinds of employment-related data, which should likewise be managed and stored effectively. The process of data collection and processing is well managed by the company in terms of managing relations with its employees (Watanabe 2015). The management of the company should assess the situations and scenarios in which the collected data may be used for effective decision making. Such situations arise when the organization reviews its data management and uses the data in various processes, such as contracts, workforce management, meeting the obligations and rules of regulatory bodies, and relating goods to the market information available to management. The Department of Human Services sets several requirements for providing health data records and information, which the regulatory body enforces to ensure better data management and processing. Registrations are also taken from those interested in the digital health care system, and security for this is an important factor. Several steps need to be taken to protect the data of organizations (Booth et al. 2018).

Performance Evaluation of Final Decision Tree and Logistic Regression Models

Access to the data should be given to individuals only after careful verification of their identity. Parental responsibility should also be taken into account, and an authorized representative should be over the age of 18.

To explore loan delinquency at ACME bank, the first step is to import the data into RapidMiner. Exploration of the data shows that the first variable in the file is an identity variable. The identity variable was removed using the “Select Attributes” operator. The Correlation Matrix operator was then added and the main process was completed. The procedure is illustrated in Figure Task 2.1. Running the process produced the exploratory data analysis of the dataset together with the correlation matrix.



The data analysed is the information collected on loan delinquency. It covers 150,000 (1.5 lakh) customers of the bank. Information is present for all attributes except monthly earnings and number of dependents. Monthly earnings are missing for 29731 customers, and the number of dependents is missing for 3924 customers.
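
The scale of the missing data can be checked with a short calculation. The following Python sketch uses only the counts quoted above; the attribute name MonthlyIncome is an assumed label for the monthly earnings column and may differ in the actual file.

```python
# Counts quoted in the text; 1.5 lakh equals 150,000 customers.
TOTAL_CUSTOMERS = 150_000
MISSING_COUNTS = {
    "MonthlyIncome": 29_731,       # assumed column name for monthly earnings
    "NumberOfDependents": 3_924,
}


def missing_share(missing_count: int, total: int) -> float:
    """Return the fraction of rows in which a value is missing."""
    if total <= 0:
        raise ValueError("total must be positive")
    return missing_count / total


shares = {
    name: missing_share(count, TOTAL_CUSTOMERS)
    for name, count in MISSING_COUNTS.items()
}

for name, share in shares.items():
    print(f"{name}: {share:.2%} of customers have no value")
```

Roughly one customer in five has no recorded monthly earnings, so how these rows are handled (dropped or imputed) could noticeably affect any model built on that attribute.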


Loan delinquency is measured with the dichotomous variable SeriousDlqin2yrs. The variable indicates whether a customer has been 90 days or more past due on a loan. Only 6.7% of the customers show loan delinquency, while the remaining 93.3% do not.
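
The class shares matter for later model evaluation: a model that always predicts the larger class already reaches that class's share as its accuracy. A minimal Python sketch, using only the two percentages quoted above (the function names are illustrative):

```python
# The two class shares of SeriousDlqin2yrs quoted in the text.
CLASS_SHARES = (0.933, 0.067)


def majority_baseline_accuracy(shares):
    """Accuracy of a model that always predicts the most frequent class."""
    return max(shares)


def imbalance_ratio(shares):
    """How many times larger the biggest class is than the smallest."""
    return max(shares) / min(shares)


baseline = majority_baseline_accuracy(CLASS_SHARES)
ratio = imbalance_ratio(CLASS_SHARES)

print(f"Majority-class baseline accuracy: {baseline:.1%}")
print(f"Imbalance ratio: about {ratio:.1f} to 1")
```

With classes this unbalanced, accuracy alone says little, which is why precision and recall are also worth reporting.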


“RevolvingUtilizationOfUnsecuredLines” is treated as a continuous variable. Its minimum value is zero and its maximum value is 50708. The average value is 6.048, and the majority of customers make use of unsecured lines.

Tableau View of Rainfall and Evaporation in Australian Weather Dataset


The variable “Age” is treated as a continuous variable. The minimum and maximum customer ages are 0 and 109 respectively, and the average age is 52.9295. The histogram suggests that customer age is approximately normally distributed.

The variables “Number of Time 30-59 Days Past Due Not Worse”, “Number of Times 90 Days Late” and “Number Of Time 60-89 Days Past Due Not Worse” each have a minimum of 0 and a maximum of 98. The debt ratio ranges from 0 to 329664, with an average of 353.005.


“NumberOfOpenCreditLinesAndLoans” ranges from 0 to 58, with an average of 8.453.


“NumberOfDependents” ranges from 0 to 20, with an average of 0.757. Among customers with dependents, the largest group has one dependent, and fewer customers have higher numbers.
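
The minimum, maximum and average figures reported for each attribute can be reproduced outside RapidMiner with a few lines of code. The Python sketch below is illustrative only: the sample values are made up, and missing entries are assumed to be represented as None and ignored.

```python
def summarize(values):
    """Return (minimum, maximum, mean) over the non-missing entries."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no non-missing values to summarize")
    return min(present), max(present), sum(present) / len(present)


# Made-up sample standing in for an attribute such as NumberOfDependents.
dependents_sample = [0, 2, None, 1, 0, 3, None, 0]

low, high, mean = summarize(dependents_sample)
print(f"min={low}, max={high}, mean={mean:.3f}")
```

Whether missing values are skipped or treated as zero changes the mean, so the convention used should be stated alongside the statistics.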


To assess how loan delinquency relates to the variables, a correlation matrix of the variables is calculated, as shown in the figures. The correlation matrix shows that “NumberOfTime30-59DaysPastDueNotWorse”, “NumberOfTimes90DaysLate”, “NumberOfTime60-89DaysPastDueNotWorse”, “NumberOfDependents” and “age” have the greatest correlation with loan delinquency. Therefore, these five factors are selected to analyse loan delinquency.
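
Each cell of a correlation matrix is a pairwise correlation coefficient. As a rough illustration, the Python sketch below computes the Pearson coefficient between one attribute and the delinquency flag; the data are made up, and it is an assumption that the correlation matrix used here reports Pearson coefficients.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)


# Made-up toy data: times 90 days late, and the delinquency flag (0 or 1).
times_90_days_late = [0, 0, 1, 3, 0, 2, 0, 0]
delinquent = [0, 0, 1, 1, 0, 1, 0, 0]

r = pearson(times_90_days_late, delinquent)
print(f"correlation with delinquency: {r:.3f}")
```

Attributes would then be ranked by the absolute value of this coefficient against the target, which is one plausible reading of how the five variables above were chosen.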


                                                                         Figure 2.2 (a) Decision Tree Procedure


                                                                             Figure 2.2(b): Decision Tree


                                                                        Figure 2.2(c): Decision Tree Process

To prepare the decision tree in RapidMiner, the variables with the highest correlation are selected. Therefore, the 5 variables with the greatest correlation are used for determining loan delinquency. The Set Role operator is employed to select loan delinquency as the target variable. The “Decision Tree” operator is employed to build the tree, using the least square criterion and a maximum depth of 5. Pruning uses a minimal gain of 0.01.
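
To give a feel for how a least-squares split with a minimum gain behaves, the Python sketch below searches one numeric attribute for the threshold that most reduces squared error. It is a simplified illustration with made-up data and hypothetical function names, not RapidMiner's implementation; the per-example averaging of the gain is also an assumption.

```python
def sum_squared_error(values):
    """Sum of squared deviations of the values from their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(feature, target, min_gain=0.01):
    """Return (threshold, gain) of the best binary split, or None.

    Candidate thresholds are midpoints between consecutive distinct
    feature values; gain is the squared-error reduction per example.
    """
    parent_sse = sum_squared_error(target)
    distinct = sorted(set(feature))
    best = None
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [t for f, t in zip(feature, target) if f <= threshold]
        right = [t for f, t in zip(feature, target) if f > threshold]
        children_sse = sum_squared_error(left) + sum_squared_error(right)
        gain = (parent_sse - children_sse) / len(target)
        if gain >= min_gain and (best is None or gain > best[1]):
            best = (threshold, gain)
    return best


# Made-up toy data: times 90 days late, and the delinquency flag (0 or 1).
times_90_days_late = [0, 0, 0, 0, 0, 1, 2, 3]
delinquent = [0, 0, 0, 0, 1, 1, 1, 1]

split = best_split(times_90_days_late, delinquent)
print(split)
```

Because the attribute only takes whole-number values, the midpoints between them are 0.5, 1.5 and so on, which is why thresholds of that form are natural for count attributes.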

The decision tree first selects “NumberofTimes90DaysLate” and splits it into less than and greater than 0.500. Cases with “NumberofTimes90DaysLate” above 0.500 are split again into less than and greater than 1.500. As the decision tree shows, the remaining cases have “NumberofTimes90DaysLate” greater than 1.500. “NumberofTime30-59dayspastduenotworse” is divided into two sections, less than and greater than 0.500, and the age attribute is used to split the cases with “NumberofTime30-59dayspastduenotworse” greater than 0.500.

For cases with “NumberofTimes90DaysLate” less than 0.500, the attribute “NumberofTime30-59dayspastduenotworse” is again split into less than and greater than 0.500, and “NumberofTime60-89dayspastduenotworse” is used to split these cases further.

The decision tree shows that all five variables are used in predicting loan delinquency. The first-order splits use a threshold of 0.500, while the second-order split uses a threshold of 1.500. The variable “NumberofTime60-89dayspastduenotworse” is used to split further within “NumberofTime30-59dayspastduenotworse.”


                                                                  Figure 2.3 (a): Logistic Regression Model


                                                              Figure 2.3(b): Logistic Regression Output

The image shows the logistic regression procedure for determining loan delinquency. The process begins with importing the data. The numerical variables are converted into binomial variables, and the variables for predicting loan delinquency are chosen based on the correlation matrix. The Set Role operator is then used to mark loan delinquency as the dependent variable, so that the model estimates its association with the independent variables.

The association between the dependent and independent variables is stated below:

Loan delinquency = 5.961 * Age + 1.230 * NumberOfTime30-59DaysPastDueNotWorse + 1.938 * NumberOfTimes90DaysLate + 1.256 * NumberOfTime60-89DaysPastDueNotWorse + 0.224 * NumberOfDependents - 9.462

The equation indicates that all the independent variables have a positive effect on loan delinquency. Age has the largest coefficient, so the equation implies that loan delinquency rises with age. The number of dependents has the least effect on loan delinquency.

However, age is not statistically significant for loan delinquency at the 0.05 level of significance. All the other variables have a significant effect on loan delinquency.
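
In a logistic regression the linear equation is normally read as a log-odds score that is passed through the logistic function to obtain a probability. The Python sketch below applies that reading to the coefficients quoted above. It assumes the right-hand side of the equation is the log-odds, and that inputs are supplied on whatever scale the fitted model used (the text does not state it), so the outputs are illustrative only.

```python
from math import exp

# Coefficients and intercept as quoted in the equation in the text.
COEFFICIENTS = {
    "Age": 5.961,
    "NumberOfTime30-59DaysPastDueNotWorse": 1.230,
    "NumberOfTimes90DaysLate": 1.938,
    "NumberOfTime60-89DaysPastDueNotWorse": 1.256,
    "NumberOfDependents": 0.224,
}
INTERCEPT = -9.462


def linear_score(features):
    """Log-odds score; attributes that are not supplied count as zero."""
    return INTERCEPT + sum(
        weight * features.get(name, 0.0)
        for name, weight in COEFFICIENTS.items()
    )


def sigmoid(z):
    """Logistic function mapping a log-odds score to a probability."""
    return 1.0 / (1.0 + exp(-z))


# Hypothetical inputs: everything at zero, then one 90-days-late event.
baseline_probability = sigmoid(linear_score({}))
one_late_probability = sigmoid(linear_score({"NumberOfTimes90DaysLate": 1}))

print(f"all inputs zero: {baseline_probability:.6f}")
print(f"one 90-days-late event: {one_late_probability:.6f}")
```

Because every coefficient is positive, increasing any input raises the score and therefore the predicted probability, which matches the interpretation given above.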

Table 2.2: Model Accuracy, True Positive, False Positive and F Measure for the decision tree and logistic regression models

Table 2.2 provides a comparative analysis of the decision tree and logistic regression models. Cross-validation was employed to compare the models. Both models show the same accuracy of 93.51%. The precision of the logistic regression model is higher at 57.44%, compared with 55.29% for the decision tree model. The recall of the logistic model is 11.31%, compared with 15.04% for the decision tree model. The sensitivity of the decision tree model is better at 15.40%, while that of the logistic model is 11.31%. Under cross-validation, the accuracy of the two models is identical.
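
The F Measure combines precision and recall into a single figure. The Python sketch below applies the standard F1 formula (the harmonic mean of precision and recall) to the percentages quoted above, taking 15.04% as the decision tree recall; it is a cross-check under those assumptions, not output from RapidMiner.

```python
def f_measure(precision, recall):
    """F1 score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as quoted in the text.
REPORTED = {
    "logistic regression": (0.5744, 0.1131),
    "decision tree": (0.5529, 0.1504),
}

f_scores = {
    model: f_measure(precision, recall)
    for model, (precision, recall) in REPORTED.items()
}

for model, score in f_scores.items():
    print(f"{model}: F1 = {score:.4f}")
```

On these figures the decision tree has the higher F1 despite its slightly lower precision, because its recall is higher; both values are low, reflecting how few delinquent customers either model identifies.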


                                                                         Figure 3.1 : Daily rainfall at NorfolkIsland

The Tableau view shows the daily NorfolkIsland weather for June 2012. The bar chart for this weather station plots the days of June on the horizontal axis and rainfall on the vertical axis. The chart shows that the maximum rainfall occurred on 30th June. Rainfall was also very high from the 12th to the 15th, after which there was almost no rainfall until 29th June. There was no rainfall at the start of the month. Changing the location in the Tableau file gives the rainfall for other locations.


                                                                   Figure 3.2: Yearly rainfall at NorfolkIsland

For the yearly analysis we again study the rainfall at NorfolkIsland. The analysis shows that the rainfall follows a normal distribution from 2009 to 2018. The highest total rainfall occurred in 2011 and the lowest in 2017. The chart shows that rainfall decreased from 2011 to 2013 and was approximately equal from 2013 to 2015. Rainfall rose in 2016 but fell drastically in 2017.


                                                                          Figure 3.3: Monthly rainfall at NorfolkIsland

For the monthly evaluation of rainfall we have again selected the location of NorfolkIsland, and the year of analysis is 2012. The month variable (from date) is placed in columns and rainfall in rows. The rainfall measure is converted to a sum to present the total rainfall. The colour of the bars is changed to yellow. The year variable (from date) is placed in the filter, which allows 2012 to be selected as the year. The total monthly rainfall is represented as a bar chart, where the height of each bar represents the amount of rainfall. The chart shows that the highest rainfall occurred in January. The rainfall in 2012 followed a cyclic pattern: it declined from January until June, rose from June until November, and fell again in December.
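
The same aggregation (filter by year, group by month, sum the rainfall) can be expressed in a few lines of code. The Python sketch below mirrors the Tableau steps on a handful of made-up readings; the record layout and function name are assumptions, and missing readings are assumed to appear as None.

```python
from collections import defaultdict
from datetime import date

# Made-up (date, rainfall in mm) readings for a single location.
observations = [
    (date(2012, 1, 3), 12.0),
    (date(2012, 1, 17), 30.5),
    (date(2012, 6, 12), None),   # missing reading, skipped
    (date(2012, 6, 30), 45.0),
    (date(2011, 6, 30), 9.0),    # different year, filtered out
]


def monthly_totals(records, year):
    """Sum rainfall per month for one year, ignoring missing readings."""
    totals = defaultdict(float)
    for day, rainfall in records:
        if day.year == year and rainfall is not None:
            totals[day.month] += rainfall
    return dict(sorted(totals.items()))


totals_2012 = monthly_totals(observations, 2012)
print(totals_2012)
```

Changing the year argument plays the role of the year filter in Tableau, and grouping on the year instead of the month would give the yearly totals.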


                                                                                   Figure 3.4: Geomap of Rainfall

The geomap in Tableau is created by placing longitude in the columns and latitude in the rows. Tableau automatically creates a geomap from the given latitudes and longitudes. To identify the locations, the variable is placed on colour as a measure, and the locations are highlighted with a green gradient. The year 2010 is selected for assessing the rainfall. The lowest total rainfall in 2010 was 206 mm, while the highest was 2660 mm. Changing the year in the Tableau file gives the rainfall for a different year.


                                                                                   Figure 3.5: Tableau Dashboard


Booth, A., Moylan, A., Hodgson, J., Wright, K., Langworthy, K., Shimizu, N. and Maconochie, I., 2018. Resuscitation registers: how many active registers are there and how many collect data on paediatric cardiac arrests?. Resuscitation.

2018. Privacy – Australian Digital Health Agency. [online] Available at: [Accessed 6 Oct. 2018].

Dinev, T., Albano, V., Xu, H., D’Atri, A. and Hart, P., 2016. Individuals’ attitudes towards electronic health records: A privacy calculus perspective. In Advances in healthcare informatics and analytics (pp. 19-50). Springer, Cham.

John, A., Dennis, M., Kosnes, L., Gunnell, D., Scourfield, J., Ford, D.V. and Lloyd, K., 2014. Suicide Information Database-Cymru: a protocol for a population-based, routinely collected data linkage study to explore risks and patterns of healthcare contact prior to suicide to identify opportunities for intervention. BMJ open, 4(11), p.e006780.

Kim, Y.H., Han, K., Son, J.W., Lee, S.S., Oh, S.W., Kwon, H.S., Shin, S.A., Kim, Y.Y., Lee, W.Y. and Yoo, S.J., 2017. Data analytic process of a nationwide population-based study on obesity using the national health information database presented by the national health insurance service 2006-2015. Journal of Obesity & Metabolic Syndrome, 26(1), pp.23-27.

2018. Healthcare Identifiers Act 2010. [online] Available at: [Accessed 6 Oct. 2018].

2018. My Health Records Act 2012. [online] Available at: [Accessed 6 Oct. 2018].

2018. My Health Record. [online] Available at: [Accessed 6 Oct. 2018].

2018. My Health Record. Privacy Policy. [online] Available at: [Accessed 6 Oct. 2018].

Van Cauteren, D., Millon, L., De Valk, H. and Grenouillet, F., 2016. Retrospective study of human cystic echinococcosis over the past decade in France, using a nationwide hospital medical information database. Parasitology research, 115(11), pp.4261-4265.

Watanabe, K., Ricoh Co Ltd, 2015. Data management for hospital form auto filling system. U.S. Patent Application 14/194,365.

Zingg, W., Holmes, A., Dettenkofer, M., Goetting, T., Secci, F., Clack, L., Allegranzi, B., Magiorakos, A.P. and Pittet, D., 2015. Hospital organisation, management, and structure for prevention of health-care-associated infection: a systematic review and expert consensus. The Lancet Infectious Diseases, 15(2), pp.212-224.