R Programming Assignment
- 1) Load a data set into R. This can be any data set that has at least two quantitative variables and one qualitative variable and can be used to complete the rest of this assignment.
- a) If you have an interest (chess, avocados, gender wage gap), this is a great opportunity to explore that topic. I’ve listed potential data sources at the end of this document. Look for data of the “.csv” file type. You must get my approval of your data set before you start working on it.
- b) If you do not have an interest in mind, you can use the default data set from the 2016 American Community Survey from Colorado. It is located on Canvas as “ACS_2016_CO.csv” with an accompanying codebook describing the variable values. NOTE: the default data set is nice to have, but a lot of the variables are top coded as 9999* because the information is missing. This isn’t valuable data when calculating statistics or creating plots and should not be used! (The command subset() in R is helpful for dealing with this.)
- 2) Describe the data set in words. How many variables are there? How many observations? What is the unit of observation for the data set (e.g. state, month, person-year)? Is this a cross-section, or multiple observations over time? Do we have repeated observations for the same subject? Describe a few key variables that you will use in your data set, including their units (feet, miles, $).
- 3) Summarize one of the quantitative variables for the full sample using sample statistics. Then, summarize the same quantitative variable for a subset of the observations that meet a specific condition. (E.g. report the average and standard deviation of the monthly price of avocados in 20 major cities in the US from 2010 to 2015, then repeat for all 20 cities only in the month of February.) Try to choose the subsample in a way that is meaningful. How do the summary statistics for the full sample and the subsample compare? What do you learn from this comparison? Include R code and output here with your interpretation.
- 4) Create a histogram of the variable that you summarize in part 3 with properly labeled axes and title. (Bonus points if you can create two histograms of the same variable, split by some other variable (e.g. gender), to show a striking difference.) Include R output here.
- 5) Calculate a confidence interval for a difference of means. Choose one variable (e.g. income) and create two subsamples of the data using another variable (e.g. gender). Calculate the means for the subsamples and create a confidence interval for the difference in means. Pick something interesting to you. Include your R code and output here. Interpret your results.
- 6) Formalize a hypothesis you wish to test with these data (e.g. is the average salary for men the same as the average salary for women?). You might not have all the knowledge to test the exact hypothesis you are interested in. Stick to a difference of means test or a test of whether the mean of a variable is equal to a specific value.
- 7) Conduct the hypothesis test at the α = 0.05 level of significance and interpret your results in a meaningful way. Include your R code and output here.
- 8) Visualize at least two variables from the data set using a scatter plot with an appropriate title, axis labels, and legend. Quantitative variables work best for this type of visual. The goal is for this image to tell a story that is clear to the reader. Include R code and output here. Interpret the plot.
- 9) Finally, think and write about who would be a good “consumer” of this information. Who would be interested in the facts you present here? How could you improve the analysis in the future by incorporating new data or using the existing data to answer a more interesting question?
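
The steps above could be sketched in R roughly as follows. This is a minimal illustration on made-up toy data, not a model answer: the column names `income`, `hours`, and `group`, the missing-value code `9999999`, and every value are assumptions chosen for the example. With a real file, a call such as `read.csv("ACS_2016_CO.csv")` would take the place of the toy data frame.

```r
# Toy data standing in for a real .csv file. All names and values are
# hypothetical and exist only to make the sketch runnable.
set.seed(1)
n <- 200
dat <- data.frame(
  income = round(rnorm(n, mean = 50000, sd = 12000)),
  hours  = round(rnorm(n, mean = 38, sd = 6)),
  group  = rep(c("A", "B"), each = n / 2)
)
dat$income[c(5, 50, 120)] <- 9999999  # assumed code for "missing"

# Step 1: drop rows that carry the missing-value code.
clean <- subset(dat, income != 9999999)

# Step 3: mean and standard deviation, full sample vs. one subsample.
full_stats <- c(mean = mean(clean$income), sd = sd(clean$income))
sub_a      <- subset(clean, group == "A")
sub_stats  <- c(mean = mean(sub_a$income), sd = sd(sub_a$income))
print(rbind(full = full_stats, group_A = sub_stats))

# Steps 5 to 7: two-sample t-test. The same call reports a 95% confidence
# interval for the difference in means (group A minus group B) and a
# p-value to compare against alpha = 0.05.
tt <- t.test(income ~ group, data = clean, conf.level = 0.95)
print(tt$conf.int)
print(tt$p.value)

# Step 4: histogram with a title and labeled axes.
hist(clean$income,
     main = "Income (toy data)",
     xlab = "Income ($)",
     ylab = "Frequency")

# Step 8: scatter plot of two quantitative variables, colored by group,
# with a legend.
cols <- c(A = "steelblue", B = "tomato")
plot(clean$hours, clean$income,
     col = cols[clean$group], pch = 19,
     main = "Income vs. hours (toy data)",
     xlab = "Hours worked per week",
     ylab = "Income ($)")
legend("topright", legend = names(cols), col = cols, pch = 19,
       title = "Group")
```

By default `t.test()` does not assume equal variances in the two groups (Welch's test); passing `var.equal = TRUE` would give the pooled-variance version instead, if that is the one covered in class.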