Successful students will be able to analyze, interpret, compare, and compute with summary statistics for sets of data. Analysis of bivariate data includes determining and interpreting regression lines and correlation coefficients. Some important components of the study of data and statistics, such as sampling techniques, question formulation, and experiment design, are addressed when possible on this module of the Algebra II End-of-Course Exam, but those topics are expected to be assessed in more depth in the classroom. This module includes a variety of test item types, including some that cut across the objectives in this standard and require students to make connections and solve rich contextual problems.

- For univariate categorical data, use percentages and proportions (relative frequencies).
- For bivariate categorical data, use conditional (row or column) percentages or proportions.
- For univariate quantitative data, use measures of center (mean and median) and measures of spread (percentiles, quartiles, interquartile range, and standard deviation).
- Graphically represent measures of center and spread (variability) for quantitative data.
- Use box plots to compare key features of data distributions.
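As an illustration of the quantitative summaries listed above, the following is a minimal sketch using Python's standard `statistics` module. The scores are invented for the example, and the quartile convention (`method="inclusive"`) is an assumption; textbooks and calculators use several conventions that can give slightly different quartiles.

```python
import statistics

# Hypothetical quiz scores, invented for illustration only.
scores = [62, 70, 71, 75, 78, 80, 84, 88, 91, 97]

# Measures of center.
mean = statistics.mean(scores)
median = statistics.median(scores)

# Quartiles under the "inclusive" convention (an assumption; other
# conventions exist and may give slightly different cut points).
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

# Measures of spread.
iqr = q3 - q1
stdev = statistics.stdev(scores)  # sample standard deviation

# The five values a box plot displays: min, Q1, median, Q3, max.
five_number_summary = (min(scores), q1, median, q3, max(scores))

print(f"mean={mean}, median={median}")
print(f"Q1={q1}, Q3={q3}, IQR={iqr}, stdev={stdev:.2f}")
print("five-number summary:", five_number_summary)
```

The five-number summary is what a box plot draws, so two data sets can be compared by placing their summaries side by side.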

- Identify and interpret common instances of weighted averages.
- Analyze variation in weighted averages and distinguish change due to weighting from changes in the quantities measured.
- Contexts may include composite grades, stock market indexes, consumer price index, or unemployment rate.
Example: Suppose a company employed 100 women with average annual salaries of $20,000 and 500 men with average salaries of $40,000. After a change in management, they employed 200 women and 400 men. To correct past inequities, the new management increased women's salaries by 25% and men's salaries by 5%. Despite these increases, the company's average salary declined by almost 1%.
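The arithmetic behind this example can be checked with a short sketch. The helper name `weighted_mean` and the variable names are illustrative and do not come from the exam materials. The last calculation holds the original headcounts fixed to separate the effect of the raises from the effect of the changed weighting.

```python
def weighted_mean(values, weights):
    """Return the average of `values`, each counted `weights[i]` times."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Before: 100 women at $20,000 and 500 men at $40,000.
before = weighted_mean([20_000, 40_000], [100, 500])

# After: 200 women with a 25% raise and 400 men with a 5% raise.
new_salaries = [20_000 * 1.25, 40_000 * 1.05]  # $25,000 and $42,000
after = weighted_mean(new_salaries, [200, 400])

pct_change = (after - before) / before * 100

# Same raises, but with the ORIGINAL headcounts: isolates the change in
# the quantities measured from the change in weighting.
same_weights = weighted_mean(new_salaries, [100, 500])

print(f"before:       {before:,.2f}")        # about 36,666.67
print(f"after:        {after:,.2f}")         # about 36,333.33
print(f"change:       {pct_change:.2f}%")    # about -0.91%
print(f"same weights: {same_weights:,.2f}")  # about 39,166.67
```

With the headcounts held fixed the average rises, so the decline in the company-wide average comes entirely from the shift in weighting toward the lower-paid group.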

- Regression can be used to summarize bivariate quantitative data.
- Know that the least squares line passes through the point $(\bar{x}, \bar{y})$, where the mean of the *x*-coordinates of the data points is $\bar{x}$ and the mean of the *y*-coordinates of the data points is $\bar{y}$.
- Recognize correlation as a number between –1 and +1 that measures the direction and strength of linear association between two variables.
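Both facts can be checked numerically. The sketch below fits a least squares line using the standard sums-of-squares formulas; the data values and variable names are invented for illustration.

```python
import math

# Hypothetical paired data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Sums of squares and cross-products about the means.
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares slope and intercept.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Correlation coefficient; always lies between -1 and +1.
r = sxy / math.sqrt(sxx * syy)

# Evaluating the fitted line at the mean of x returns the mean of y,
# so the line passes through the point of means.
predicted_at_mean = intercept + slope * x_bar

print(f"line: y = {intercept:.3f} + {slope:.3f} x")
print(f"r = {r:.4f}")
print(f"line at x_bar: {predicted_at_mean:.3f}, y_bar: {y_bar:.3f}")
```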

- Common terms such as least squares line, regression line, and line of best or good fit will be used in the problems.
- Use the relationship among the standard deviation, correlation coefficient, and slope of the regression line to assess the strength of linear association suggested by an underlying scatter plot.
- Interpret the slope of a linear trend line in terms of the data being studied.
- Identify the effect of outliers on the position and slope of the regression line.
- Analyze a residual plot to informally assess whether a line provides a good model for a set of data.
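The relationships in this list can be explored with a small sketch. It checks that the slope equals the correlation coefficient times the ratio of the standard deviations, computes the residuals that a residual plot would display, and refits after adding one outlier. The helper `fit_line` and all data values are hypothetical.

```python
import math
import statistics

def fit_line(x, y):
    """Return (slope, intercept, r) for the least squares line of y on x."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar, sxy / math.sqrt(sxx * syy)

# Hypothetical, nearly linear data.
x = [1, 2, 3, 4, 5, 6]
y = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]

slope, intercept, r = fit_line(x, y)

# Slope expressed through r and the two standard deviations.
slope_from_r = r * statistics.stdev(y) / statistics.stdev(x)

# Residuals: observed minus predicted. A patternless scatter around zero
# suggests that a line is a reasonable model.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# One point far from the trend pulls the line toward itself.
slope_out, intercept_out, r_out = fit_line(x + [7], y + [2.0])

print(f"slope={slope:.3f}, r * s_y / s_x={slope_from_r:.3f}, r={r:.4f}")
print("residuals:", [round(e, 3) for e in residuals])
print(f"with outlier: slope={slope_out:.3f}, r={r_out:.4f}")
```

A single outlier at the edge of the *x*-range changes both the slope and the correlation noticeably, which is why inspecting the scatter plot and the residuals matters before interpreting the line.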

- Interpret, interpolate, and judiciously extrapolate from graphs and tables.
- Use the basic properties of confidence intervals to make simple predictions and answer questions about statistical data.
- State conclusions in terms of the question(s) being investigated.
- Use appropriate statistical language when reporting on plausible answers that go beyond the data actually observed.
- Interpret oral, written, graphic, pictorial, or multi-media reports on data.
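As one hedged illustration of using a confidence interval to answer a question about data, the sketch below computes an approximate 95% interval for a poll proportion. The normal-approximation formula, the multiplier 1.96, and the poll figures are assumptions chosen for the example; the benchmark itself does not prescribe a method.

```python
import math

# Hypothetical opinion poll: 520 of 1,000 respondents favor a proposal.
favorable = 520
n = 1000

p_hat = favorable / n

# Approximate 95% margin of error using the normal approximation
# (an assumption; reasonable when n is large and p_hat is not near 0 or 1).
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = 1.96 * standard_error

low, high = p_hat - margin, p_hat + margin

print(f"estimate: {p_hat:.3f}")
print(f"approximate 95% interval: ({low:.3f}, {high:.3f})")

# State the conclusion in terms of the question being investigated:
# does a majority of the population favor the proposal?
if low > 0.5:
    print("The data support the claim that a majority favors the proposal.")
else:
    print("A majority is plausible, but the interval includes 50%.")
```

Here the interval extends below 50%, so the appropriate statement is that majority support is plausible but not established by this sample.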

- Identify and explain misleading uses of data by considering the completeness and source of the data, the design of the study, and the way the data are analyzed and displayed.
- Recognize the difference between correlation or association and causation.

Example: Determine whether the height or area of a bar graph is being used to represent the data; evaluate whether the scales of a graph are consistent and appropriate or whether they are being adjusted to alter the perception conveyed by the information.

Task related to this benchmark: Click It

- Possible real-world applications include clinical trials in medicine, an opinion poll, or a report on the effect of smoking on health.
- Distinguish between random sampling from a population in sample surveys and random assignment of treatments to experimental units in an experiment:
Random sampling is how items are selected from a population so that the sample data can be used to estimate characteristics of the population; random assignment is how treatments are assigned to experimental units so that comparisons among the treatment groups can allow cause and effect conclusions to be made.
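The distinction can be made concrete with a short sketch using Python's standard `random` module. The population, the volunteers, and the group sizes are invented for illustration, and a fixed seed is used only so the run is repeatable.

```python
import random

rng = random.Random(0)  # fixed seed so the example is repeatable

# Random SAMPLING (sample survey): choose 50 people from a population of
# 1,000 so the sample can be used to estimate population characteristics.
population = [f"person_{i}" for i in range(1000)]
sample = rng.sample(population, 50)

# Random ASSIGNMENT (experiment): split 20 available experimental units
# into treatment and control groups by chance, so differences between
# the groups can be attributed to the treatment.
volunteers = [f"volunteer_{i}" for i in range(20)]
shuffled = volunteers[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:10], shuffled[10:]

print("survey sample size:", len(sample))
print("treatment group:", sorted(treatment))
print("control group:  ", sorted(control))
```

The volunteers in the second part need not be a random sample of any population; random assignment supports cause-and-effect comparisons between the groups, while generalizing to a population depends on how the units were selected.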

Task related to this benchmark: Click It