
Background

· For this assignment, you will use the Cleansing_Week4.R script and the data.csv file.

· The code is written in the R programming language. Open RStudio, open the file Cleansing_Week4.R, follow the steps in the code, and answer each of the questions below.

· Some manipulation and rework of the code is required. The steps are explained in detail in the code.

· Complete Step 0 through Step 5, inclusive.

Instructions

You should complete all the steps provided in the code and answer the following questions in a report.

After you complete the readings and watch the provided videos (required), proceed with this implementation and report.

1.  Introduction

· Provide information about the language, GUI, and data file you are using in this assignment. Use references to support the importance of the language, its advantages and disadvantages, and how it relates to other languages used in Data Science.

· Provide the value stored in the variable Randomizer in your code, along with your Student ID, in this section. Take a screenshot of the output in your Console and paste it here.

2.  Data Presentation before Cleansing

Run Step 0 and answer the following questions.

A. What is the data file format, and which command did you use to read the data? Does the file have headers?

B. How many observations are there?

C. How many variables are in the data?

D. What is the purpose of the command str(df)? Take a screenshot of the output in your Console and paste it here.

E. What does summary(df) report? Find out what its output means and explain it in your paper.

F. Answer the following questions:

a. What types of variables does your file include?

b. What are the specific data types?

c. Are they read properly?

d. Are there any issues?

e. Does your file include both NAs and blanks? How did you identify them?

f. How many NAs do you have?

g. How many blanks do you have?
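The inspection questions above can be explored with a few base R calls. The following is a minimal sketch only, not the contents of Cleansing_Week4.R: the toy CSV, its column names (id, age, city), and the helper names na_counts and blank_counts are all hypothetical, since the real data.csv is not reproduced here.

```r
# Hypothetical stand-in for data.csv, written to a temporary file.
csv_lines <- c(
  "id,age,city",
  "1,34,Boston",
  "2,NA,",
  "3,29,Denver",
  "4,,Austin",
  "5,41,"
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# header = TRUE tells read.csv that the first line holds the column names.
df <- read.csv(tmp, header = TRUE, stringsAsFactors = FALSE)

nrow(df)     # number of observations
ncol(df)     # number of variables
str(df)      # compact view of each column's type and first values
summary(df)  # per-column summary; numeric columns also report their NA count

# NAs per column.
na_counts <- colSums(is.na(df))

# Blanks per column: empty strings that were not read as NA.
blank_counts <- sapply(df, function(x) sum(!is.na(x) & x == ""))

na_counts
blank_counts
```

With default settings, read.csv treats an empty field in a numeric column as NA but keeps it as an empty string in a character column, so it is worth counting NAs and blanks separately, as the sketch does.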

3.  Data Preprocessing

A. Summarize the steps of preprocessing you expect to complete before you run the previous steps in your code. Recommend methods of imputing NAs in each of the variables, and/or observations, when needed. Review the literature and suggest methods of imputation for categorical and numeric variables.

B. Run Step 1 in your code. How did this step affect the NAs and the blanks in your variables? (You can run summary(df) to determine this.) Take a screenshot of the output in your Console and paste it here.

C. For each of the numeric variables, record the mean and the median; for the categorical variables, record the counts. Present them in a table in your paper.

D. Run Steps 2, 3, and 4. How many observations include NAs? How many variables include NAs? What percentage of the rows and of the columns have NAs? If we were to eliminate those, what is the approximate size of the remaining dataset? Is this a proper method of dealing with the NAs?

E. Run Step 5 and answer the following questions:

a. What is the method of imputation that is described? What does linear interpolation mean? Research and discuss if this is an appropriate method. The above method of imputation has now changed some of the statistics of your variables.

· Run summary(df) and compare the results with the previous statistics. Take a screenshot of the output in your Console and paste it here.

· Do you observe any undesired changes? Explain in detail. How could you have avoided them?

· Are there any more NAs in your file?
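The row and column NA percentages and the linear interpolation discussed in this section can be illustrated with base R. This is a hedged sketch under stated assumptions, not the code in Cleansing_Week4.R: the toy data frame, the blank-to-NA cleanup, and the use of approx() are illustrative choices, and the script itself may use different functions.

```r
# Hypothetical data; the real data.csv is not reproduced here.
df <- data.frame(
  age  = c(30, NA, 50, NA, NA, 20),
  city = c("Boston", "", "Denver", "Austin", "", "Boston"),
  stringsAsFactors = FALSE
)

# Assumed cleanup step: treat blank strings as missing values.
df$city[df$city == ""] <- NA

# Rows and columns that contain at least one NA, as counts and percentages.
rows_with_na <- sum(!complete.cases(df))
cols_with_na <- sum(colSums(is.na(df)) > 0)
pct_rows <- 100 * rows_with_na / nrow(df)
pct_cols <- 100 * cols_with_na / ncol(df)

# Size of the dataset if every row containing an NA were dropped.
rows_left <- nrow(na.omit(df))

# Linear interpolation: each missing value is placed on the straight line
# between the nearest known values before and after it, by row position.
idx <- seq_along(df$age)
known <- !is.na(df$age)
age_interp <- approx(x = idx[known], y = df$age[known], xout = idx)$y

# Compare a statistic before and after imputation.
mean_before <- mean(df$age, na.rm = TRUE)
mean_after <- mean(age_interp)
```

Note that interpolating by row position assumes the rows have a meaningful order, as in a time series, and that the technique applies only to numeric columns. Comparing mean_before with mean_after is one way to see how imputation shifts the summary statistics.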

Length: This assignment must be 4-5 pages (excluding the title and reference pages).
