Data Analytics

# Problem 1 (5%)

## Please indicate whether the following statements are true or false:

1. A sample size should not exceed 100 observations, otherwise it will be called a population.
1. True
2. False
2. The difference between the midpoints of two consecutive classes is equal to the number of classes.
1. True
2. False
3. The line segments in a cumulative frequency polygon can be either increasing or decreasing depending on the given data.
1. True
2. False
4. The variance is considered the most accurate measure of dispersion for distribution comparison because it is calculated using the squared values.
1. True
2. False
5. In a group of 70 scores, if the largest score is increased by 20 points, the mean of the scores will increase by 3.5 points.
1. True
2. False
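The arithmetic behind statement 5 can be checked with a short sketch. When a single value in a group of n values changes by some amount, the mean changes by that amount divided by n. The function name `mean_change` and the made-up score list are illustrative assumptions, not part of the exam.

```python
def mean_change(n, delta):
    """Change in the mean when one of n values increases by delta."""
    return delta / n


# Hypothetical scores, used only to confirm the formula numerically.
scores = list(range(1, 71))      # 70 made-up scores
before = sum(scores) / len(scores)
scores[-1] += 20                 # raise the largest score by 20 points
after = sum(scores) / len(scores)

print(round(mean_change(70, 20), 4))
print(round(after - before, 4))
```

Both printed values agree, which shows that the shift in the mean depends only on the size of the change and the number of scores.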

# Problem 2 (15%)

1. Which of the following represents a sample?
1. Number of cups of coffee served at Starbucks Marbella
2. Total registered voters in Spain
3. All the Colombians working abroad
4. None of the above

1. Fifty mice were chosen from a shelter containing 500 animals to test a new [...]. What is the sample?
1. The 50 selected mice
2. The 500 animals in the shelter
3. The 550 animals
4. All the mice in the shelter

1. Which of the following is a discrete variable?
1. Depth of the pool measured in meters
2. Number of newborn kittens
3. Number of hours spent on social media
4. None of the above

1. The amount of “dollars” stuck in non-US banks is a:
1. Quantitative discrete variable
2. Qualitative discrete variable
3. Quantitative continuous variable
4. Qualitative continuous variable

1. Identify the scale of measurement for the following categorization of clothing: hat, shirt, shoes, pants.
1. Nominal level of data
2. Ordinal level of data
3. Ratio level of data
4. Interval level of data

1. As part of a test preparation course, students are asked to take a practice version of the Graduate Record Examination (GRE). This is a standardized test, and scores can range from 200 to 800. The appropriate scale of measurement is:
1. Nominal
2. Ordinal
3. Interval
4. Ratio

# Problem 3 (25%)

A sample of 20 women was asked about the symptoms they felt after taking the COVID-19 vaccine. Below are their responses:

Headaches, Stroke, Fever, Nausea, Tiredness, Nausea, Headaches, Tiredness, Cough, Fever, Tiredness, Cough, Skin Rash, Tiredness, Cough, Fever, Nausea, Tiredness, Cough, Headaches

1. “Symptoms” is a ______ variable, thus it should be organized into a ______.

1. Qualitative, frequency distribution
2. Qualitative, frequency table
3. Quantitative, frequency distribution
4. Quantitative, frequency table

1. Based on the above data, the relative frequency of “tiredness” is:
1. 4
2. 5
3. 2
4. 25

1. If two more women were added to the survey and if they both had a stroke after taking the vaccine, the relative frequency of this symptom would be:
1. 1
2. 15
3. 136
4. 09

1. Based on the above data, the angle that corresponds to the “Fever” category is:
1. 15
2. 54
3. 8
4. 58
2. The best graphical presentation for this data is:
1. Bar Graph
2. Histogram
3. Frequency polygon
4. Cumulative histogram or cumulative frequency polygon
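The quantities asked for in this problem (frequency, relative frequency, and pie-chart angle) follow from counting each category and dividing by the number of responses. The sketch below is one way to tabulate them. The helper name `frequency_table` is an assumption made for illustration, and the angle is computed as 360 degrees times the relative frequency.

```python
from collections import Counter

responses = [
    "Headaches", "Stroke", "Fever", "Nausea", "Tiredness",
    "Nausea", "Headaches", "Tiredness", "Cough", "Fever",
    "Tiredness", "Cough", "Skin Rash", "Tiredness", "Cough",
    "Fever", "Nausea", "Tiredness", "Cough", "Headaches",
]


def frequency_table(data):
    """Map each category to (count, relative frequency, pie angle in degrees)."""
    counts = Counter(data)
    n = len(data)
    return {cat: (c, c / n, 360 * c / n) for cat, c in counts.items()}


table = frequency_table(responses)
for cat, (count, rel, angle) in table.items():
    print(f"{cat:<10} {count:>2}  {rel:.3f}  {angle:6.1f}")

# Two additional respondents who both report a stroke.
extended = frequency_table(responses + ["Stroke", "Stroke"])
print(round(extended["Stroke"][1], 3))
```

Because adding respondents changes the total as well as the category count, the relative frequency has to be recomputed over the enlarged sample rather than adjusted by hand.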

# Problem 4 (25%)

The raw data below represent the hourly rates (in USD) of a sample of doctors in Paris. These data need to be organized into a frequency distribution.

113, 189, 186, 174, 103, 125, 41, 81, 47, 156, 37, 89, 90, 141, 126, 28, 58, 172, 75, 61

1. What interval for each class do you suggest?
1. 5
2. 30
3. 33
4. 32
2. The relative frequency of doctors who earn between 160 USD and 193 USD per hour is:
1. 2
2. 20%
3. 1
4. 25

1. The percentage of doctors who earn less than 127 USD per hour is:
1. 10%
2. 20%
3. 70%
4. 80%

1. The percentage of doctors who earn more than 160 USD per hour is:
1. 80%
2. 20%
3. 10%
4. 16

1. The first point of a cumulative frequency polygon that represents this data is:
1. X = 61 and Y = 5
2. X = 28 and Y = 5
3. X = 28 and Y = 0
4. X = 5 and Y = 0
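One common way to build the frequency distribution is sketched below. It assumes the "2 to the k" rule for the number of classes (the smallest k with 2^k at least n), a class width rounded up to the next whole number, and classes that start at the minimum and include their lower limit but not their upper limit. These conventions are assumptions for illustration; the course may prescribe different ones.

```python
import math
from itertools import accumulate

rates = [113, 189, 186, 174, 103, 125, 41, 81, 47, 156,
         37, 89, 90, 141, 126, 28, 58, 172, 75, 61]

n = len(rates)

# Assumed rule: smallest number of classes k such that 2**k >= n.
k = 1
while 2 ** k < n:
    k += 1

low, high = min(rates), max(rates)
width = math.ceil((high - low) / k)   # class interval, rounded up

# Count observations per class; each class is [lower, lower + width).
counts = [0] * k
for x in rates:
    counts[(x - low) // width] += 1

limits = [low + i * width for i in range(k + 1)]
relative = [c / n for c in counts]

# Cumulative frequency polygon: starts at the lowest limit with frequency 0.
cumulative = [0] + list(accumulate(counts))
polygon = list(zip(limits, cumulative))

print("classes:", k, "width:", width)
print("limits:", limits)
print("counts:", counts)
print("relative:", relative)
print("polygon:", polygon)
```

With these conventions the class limits include 127 and 160, which matches the cut-offs used in the questions above. The cumulative list can be read directly to answer "less than" questions.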

# Problem 5 (30%)

The numbers that follow represent the number of paint gallons (in thousands) produced each month by a sample of 10 companies.

7, 20, 10, 4, 18, 12, 7, 14, 6, 22

1. The mean number of paint gallons is:
1. 7
2. 12
3. 120
4. 33

1. The mode of this distribution is:
1. 15
2. 2
3. 7
4. There is no mode

1. The median of this distribution is:
1. 10
2. 11
3. 12
4. 15

1. The distribution of data for the number of paint gallons produced is:
1. Positively skewed
2. Negatively skewed
3. Symmetrical
4. Cannot be determined

1. The range is:
1. 26
2. 18
3. 15
4. 29

1. The variance of this distribution is:
1. 8
2. 98
3. 78
4. 31

1. The standard deviation of this distribution is:
1. 8
2. 98
3. 78
4. 31

1. Which of the dispersion measures is considered the most accurate for distribution comparison?
1. The range because it is the simplest
2. The standard deviation because it includes all
3. The variance because it is calculated using the squared values
4. All measures are equally accurate
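The descriptive measures asked for in this problem can be computed with Python's standard `statistics` module, as in the sketch below. Because the data are described as a sample, the sketch uses the sample formulas (dividing by n - 1) for variance and standard deviation. That choice is an assumption, so the population versions (dividing by n) are shown alongside for comparison.

```python
import statistics

gallons = [7, 20, 10, 4, 18, 12, 7, 14, 6, 22]  # thousands of gallons

mean = statistics.mean(gallons)
median = statistics.median(gallons)
mode = statistics.mode(gallons)
data_range = max(gallons) - min(gallons)

# Sample measures (divide by n - 1).
sample_var = statistics.variance(gallons)
sample_sd = statistics.stdev(gallons)

# Population measures (divide by n), for comparison.
pop_var = statistics.pvariance(gallons)
pop_sd = statistics.pstdev(gallons)

print("mean:", mean, "median:", median, "mode:", mode)
print("range:", data_range)
print("sample variance:", round(sample_var, 2), "sample sd:", round(sample_sd, 2))
print("population variance:", round(pop_var, 2), "population sd:", round(pop_sd, 2))
```

Comparing the mean, median, and mode gives a quick indication of skewness: a mean above the median, with the mode lower still, is commonly read as a sign of positive (right) skew.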