Paige Stapleton
AP Statistics Summer Assignment
Chapter 1: Exploring Data
1.1 Displaying Distributions with Graphs
1. In statistics, what is meant by individuals?
In statistics, individuals are the objects described by a set of data. Individuals may be people, animals, or things.
2. In statistics, what is meant by a variable?
In statistics, a variable is any characteristic of an individual.
3. What is meant by exploratory data analysis?
Exploratory data analysis is the use of statistical tools and ideas to examine data in order to describe its main features.
4. What is the difference between a categorical variable and a quantitative variable?
A categorical variable places an individual into one of several groups or categories. A quantitative
Which variable always appears on the horizontal axis of a scatterplot?
The explanatory variable always appears on the horizontal axis of a scatterplot.
5. Explain the difference between a positive association and a negative association.
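In a positive association, above-average values of one variable tend to accompany above-average values of the other, so the pattern on a scatterplot runs from the lower left to the upper right. In a negative association, above-average values of one variable tend to accompany below-average values of the other, so the pattern runs from the upper left to the lower right.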
3.2 Correlation
1. What does correlation measure?
Correlation measures the strength of the linear relationship between two quantitative variables.
2. Explain why two variables must both be quantitative in order to find the correlation between them.
Two variables must both be quantitative because correlation is calculated from the means and standard deviations of both variables, and that arithmetic only makes sense for numerical values. The resulting r always satisfies -1 ≤ r ≤ 1.
3. What is true about the relationship between two variables if the r-value is:
a. Near 0?
Very weak linear relationship
b. Near 1?
Strong, positive linear relationship
c. Near -1?
Strong, negative linear relationship
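The three cases above can be checked with a short calculation. This is a minimal sketch in Python, assuming the usual textbook formula for r (the average product of standardized values, dividing by n - 1). The function name `correlation` and all of the data values are made up for illustration; they do not come from the assignment.

```python
import math


def correlation(xs, ys):
    """Pearson correlation r, computed from standardized values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Average product of the standardized x and y values.
    products = (
        ((x - mean_x) / s_x) * ((y - mean_y) / s_y) for x, y in zip(xs, ys)
    )
    return sum(products) / (n - 1)


# Hypothetical data: one explanatory variable, three possible responses.
hours = [1, 2, 3, 4, 5, 6]
rising = [52, 55, 61, 64, 70, 73]   # climbs with hours
falling = [73, 70, 64, 61, 55, 52]  # drops with hours
flat = [60, 70, 55, 55, 70, 60]     # no linear trend

r_rising = correlation(hours, rising)
r_falling = correlation(hours, falling)
r_flat = correlation(hours, flat)

print(f"rising:  r = {r_rising:.3f}")
print(f"falling: r = {r_falling:.3f}")
print(f"flat:    r = {r_flat:.3f}")
```

The calculation relies on means and standard deviations, which is also why it cannot be carried out on categorical variables.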
4. Is correlation resistant to extreme observations? Explain.
Correlation isn’t resistant to extreme observations; outliers can have an effect on the
Measurement issues. Data, even numerically coded variables, can be at one of four levels: nominal, ordinal, interval, or ratio. It is important to identify the level of each variable, as this affects the kind of analysis we can do with the data. For example, descriptive statistics such as means can only be computed for interval or ratio level data. Please list, under each label, the variables in our data set that belong in each group.
3.5 Dealing with outliers
The graphical representations of data made possible by visualization can communicate trends and outliers much faster than tables
What is meant by ‘statistical infrequency’ as a definition of abnormality? [2 marks] Gavin describes his daily life. ‘I sometimes get gripped with the thought that my family is in danger. In particular, I worry about them being trapped in a house fire. I now find that I can only calm myself if I check that every plug socket is switched off so an electrical fire couldn’t start.
Week Two Reflective Assignment Mary Carnahan QN 320: Essential Statistical Thinking July 20, 2016 Week Two What I learned this week in “Data Description, Probability, and Counting Rules” Chapter three “Descriptive Analysis and Presentation of Bivariate Data” Bivariate data are the values of two different variables that are obtained from the same population element.
The people of Mesoamerica had many talents that were very advanced for their time. They used many tools to live, but their most valuable tool was their brain. The people knew exactly how to survive because they had to. Documents 1, 5, and 7 show us how they farmed, planted, and got water to where it needed to go.
Sierra Andreas Ms Scott 1 - 3 - 2023 AP US History Unit 5 Guiding Questions and Terms Economic development came after territorial expansion, and it made regional tensions between the North and South worse. In the North, economic development pushed towards industrialization, the creation of a market system, and a transport revolution. These economic developments flourished due to the large number of immigrants moving into the area. The South, however, was very different, as it continued its growth of slavery and the cotton economy. The white landowners of the South pushed for fewer laws surrounding slavery, and the African American slaves were forced to deal with hardships pertaining to their family, religion, and the suffering from slavery.
We learn from the individuals introduced so far in ‘Outliers’ that odd occurrences are not random. Whether it’s a Canadian hockey team’s high number of players born early in the year or a South Korean airline with a crash rate higher than its competitors, there’s a logical explanation for it. How about the Italian migrants of Roseto, PA, with above-average health, whose diets fared no better than those of their European counterparts in neighborhoods nearby? Further, the successes of Bill Gates, Bill Joy, and other tech moguls, while not obvious, are also explainable.
In the experiment that tested whether leaf litter played a role in poison frog decline, there were ten plots that had no change in leaf litter. The data were fairly consistent, except that plot number seven had sixteen frogs in it. This was more than twice the number of frogs in any of the other plots with no change in leaf litter. Clearly, this is an outlier in the data and could have thrown off the average number of strawberry poison dart frogs per plot. If 1,000 frogs had been tested, this outlier would not have had such an impact on the mean.
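The effect of sample size on an outlier's pull can be sketched numerically. The example below is only an illustration: the frog counts are hypothetical (the passage gives only the outlier of sixteen in plot seven and says it was more than twice any other count), and the larger sample is assumed to be 1,000 plots built by repeating the same typical counts.

```python
from statistics import mean

# Hypothetical counts for the ten unchanged plots; only the 16 in plot
# seven comes from the passage, the other nine values are invented.
small_sample = [5, 6, 4, 7, 5, 6, 16, 5, 4, 6]
typical = [count for count in small_sample if count != 16]

# Assumed larger study: 999 typical plots plus the same single outlier.
large_sample = typical * 111 + [16]

baseline = mean(typical)
shift_small = mean(small_sample) - baseline
shift_large = mean(large_sample) - baseline

print(f"mean without the outlier:     {baseline:.3f}")
print(f"shift in mean, 10 plots:      {shift_small:.3f}")
print(f"shift in mean, {len(large_sample)} plots:    {shift_large:.3f}")
```

With these made-up counts, the single outlier moves the mean by about one frog in the ten-plot sample but by only about a hundredth of a frog in the larger one.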
R² in this data set is 0.043, which means that 4.3% of the variation in salary is accounted for by the linear model relating salary to points scored by an NBA player. The equation of the least-squares regression line is ŷ=3,239,630+131.887·x. The slope for this set of data is 275.175, meaning that an NBA player's salary is predicted to go up by $275.18 for each additional point scored. The standard deviation of the residuals, the typical distance between actual and predicted salaries, is about $4,738,380, meaning predictions from this line are only somewhat accurate. The y-intercept of the data is $3,362,050, meaning if a player scored 0 points for a season, he would still be predicted to receive $3,362,050 for the season; this is not very accurate because although players are sometimes guaranteed a certain amount of money when drafted, the value varies for each
We have learned ever since we were introduced to statistics that outliers don’t just fit in. In Outliers by Malcolm Gladwell, these people gain a new definition: they do fit in. So much, in fact, that people shape their own lives to become an outlier. We idolize them and crave to be as successful as them, while they are really just the same as each one of us. What makes them true outliers is a combination of fate, fortune, and fervor.
The diagnostic test was quite a surprise to me. I knew more than I thought I did, but I didn’t know enough to pass the AP Exam. Out of the 55 questions, I missed 17, which is not too shabby for a guy who isn’t outstanding at complex multiple-choice questions. I was most distraught about the questions that could have multiple answers or required me to choose the best answer.
This level is used because the smaller the alpha level, the smaller the region in which you reject the null hypothesis. The alpha level depends on how confident you want to be that chance alone does not account for an improbable result. I would use the same alpha level because if an experimental result has less than this probability of occurring at random, then it can be called statistically significant. Making alpha smaller would make the null hypothesis harder to reject. To draw a better and more accurate conclusion, I would use a greater number of participants to perform the tasks and develop a larger sample of observations.
8. Rate Outliers on a scale of 1 to 5. Why did you give it that rating? 9. In Chapter 2
An application in which global outliers, contextual outliers, and collective outliers are all of interest is credit card fraud detection. If a student’s credit card charges during the semester range from $100 to $150 but the student’s statement has one charge of $500, then that charge is a global outlier. If the student’s charges during the summer holidays range from $400 to $700, then the $500 spent during the semester is a contextual outlier, because it is unusual only when the period in which it was spent is considered. Most of the student’s charges for the year fall between $100 and $150; hence the charges for the summer holidays, which range from $400 to $700, are collective outliers.
To illustrate this difference, use the example at the beginning of the chapter linking health and income. There is a correlation when we observe two things changing at the same time. A causal relationship goes further: a change in one factor leads to a corresponding change in another. Individuals with higher salaries tend to enjoy better general wellbeing.