Descriptive Statistics Data Analysis

Dwight Wallace

STAT 200 – Assignment 2

April 21, 2021


 

Introduction

I am 43 years old and married with three children. My annual income is $98,000. The variables included were SE-Income, SE-Marital Status, SE-Family Size, and USD-Food.

Variable/Data Set              Description                   Variable Type
Variable 1 “Income”            Annual income (USD)           Quantitative
Variable 2 “Age”               Head of household age         Quantitative
Variable 3 “Size of family”    No. of people in household    Quantitative

Income is necessary for basic needs for the survival of the family.

Numerical Summary

For income, a quantitative variable, the median is the measure of central tendency. When the data contain outliers or are not normally distributed, the median is the better measure because, unlike the mean, it is not pulled toward extreme values.
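To illustrate why the median is preferred when outliers are present, the short Python sketch below compares the mean and median before and after one extreme value is added. The income figures are hypothetical and are not taken from the assignment data set.

```python
from statistics import mean, median

# Hypothetical annual incomes (USD); illustrative values only.
incomes = [40000, 45000, 50000, 55000, 60000]
with_outlier = incomes + [500000]

mean_before, median_before = mean(incomes), median(incomes)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(f"Without outlier: mean = {mean_before}, median = {median_before}")
print(f"With outlier:    mean = {mean_after}, median = {median_after}")
```

In this example the single extreme value more than doubles the mean, while the median moves only slightly.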

For income, the measure of dispersion is the sample standard deviation (SD), which was selected for several reasons. First, the data are a sample from a larger data set. Second, the sample SD is the most commonly used measure of dispersion. Third, the income variable is quantitative.
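The sample SD divides the sum of squared deviations by n - 1 rather than n, which is what distinguishes it from the population SD. The Python sketch below, using made-up values, computes it by hand and compares the result with the standard library's statistics.stdev. The function name sample_sd is only illustrative.

```python
import math
import statistics

def sample_sd(values):
    """Sample standard deviation: divide by n - 1, not n."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    avg = sum(values) / n
    squared_deviations = sum((x - avg) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

# Hypothetical values; not taken from the assignment data set.
values = [2, 4, 4, 4, 5, 5, 7, 9]

print(sample_sd(values))          # sample SD (n - 1 denominator)
print(statistics.stdev(values))   # same quantity from the standard library
print(statistics.pstdev(values))  # population SD (n denominator), smaller
```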

For income, a histogram will be used to show the shape of the distribution, including whether it is approximately normal, because a histogram is a good way to plot the distribution of quantitative data.

 

 

 

Table 3

Variable: “Age”

Values: 43, 42, 39, 44, 48, 30, 39, 43, 39, 33, 47, 45, 33, 47, 50, 42, 42, 46, 27, 43, 28, 40, 45, 28, 28, 36, 34

Measure of Central Tendency: Median = 47

Measure of Dispersion: SD = 8.27512
       

 

Description of findings

Mean = 36.3

Median = 39

Mode = 33, 41, 42

The graph was negatively skewed, that is, skewed to the left, which is consistent with the mean (36.3) falling below the median (39).
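The summary measures can be recomputed from the ages listed in Table 3 as a check. The Python sketch below uses only the standard statistics module and assumes that the 27 listed ages are the complete sample. If the reported figures were calculated from a different or larger set of values, the output will not match them.

```python
import statistics

# The 27 head-of-household ages listed in Table 3.
ages = [43, 42, 39, 44, 48, 30, 39, 43, 39, 33, 47, 45, 33, 47,
        50, 42, 42, 46, 27, 43, 28, 40, 45, 28, 28, 36, 34]

age_mean = statistics.mean(ages)
age_median = statistics.median(ages)
age_modes = statistics.multimode(ages)  # every value tied for most frequent
age_sd = statistics.stdev(ages)         # sample SD (n - 1 denominator)

print(f"n      = {len(ages)}")
print(f"mean   = {age_mean:.2f}")
print(f"median = {age_median}")
print(f"modes  = {sorted(age_modes)}")
print(f"SD     = {age_sd:.5f}")
```

Comparing the mean with the median in the output is also a quick check on the direction of skew: a mean below the median points to a negative (left) skew, and a mean above it points to a positive (right) skew.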

Variable 3 – Family Size

Another critical variable is family size because, as family size increases, additional expenses are incurred. For example, food, clothing, and other expenses for two people are much lower than for five.

Numerical Summary

For family size, a quantitative variable, the measure of central tendency is the median. When the data contain outliers or are not normally distributed, the median is considered the best measure of central tendency.

For family size, the measure of dispersion is the sample standard deviation (SD), which was selected for several reasons. First, the data are a sample from a larger data set. Second, the sample SD is the most commonly used measure of dispersion. Finally, the family size variable is quantitative.

For family size, the pie chart was replaced with a histogram to show whether the data are normally distributed.

For family size, I believe that a histogram is the best graph because it allows the different family sizes to be compared.

Table 4 – Descriptive Analysis for Variable 3

Variable: Variable 3 (family size)

Values (N = 29): 5, 1, 2, 2, 1, 3, 2, 5, 1, 3, 5, 5, 5, 2, 1, 5, 3, 2, 3, 2, 4, 4, 2, 3, 3, 4, 4, 1, 5

Measure of Central Tendency: Median = 3.1

Measure of Dispersion: SD = 1.24121

 

A histogram is considered the best graph for “Family Size” because it allows the frequency of each family size to be compared.
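Because family size takes only a few whole-number values, the bars of its histogram are simply the counts of households at each size. A minimal Python sketch of that tally, assuming the 29 values listed in Table 4 are the complete sample, prints a text version of the chart:

```python
from collections import Counter

# The 29 family sizes listed in Table 4.
family_sizes = [5, 1, 2, 2, 1, 3, 2, 5, 1, 3, 5, 5, 5, 2, 1,
                5, 3, 2, 3, 2, 4, 4, 2, 3, 3, 4, 4, 1, 5]

counts = Counter(family_sizes)

# One row per family size; the bar length is the number of households.
for size in sorted(counts):
    print(f"{size} | {'#' * counts[size]} ({counts[size]})")
```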

 

 


 

Mean = 2.9

Median = 3.3

Mode = 3

Variable 4 – Food Expenses

Food expense is a critical variable because the money spent on food depends on income and family size. Food is a necessity of life.

Numerical Summary

For food expenses, a quantitative variable, the measure of central tendency is the median. When the data contain outliers or are not normally distributed, the median is considered the best measure of central tendency.

For food expenses, the measure of dispersion is the sample standard deviation (SD), which was selected for several reasons. First, the data are a sample from a larger data set. Second, the sample SD is the most commonly used measure of dispersion. Third, the food expense variable is quantitative.

For food expenses, a histogram was used because it shows the distribution of the data. It also showed food expenses grouped into categories by number of family members.

Description of findings

Variable 5 – Other Expenses

This expense is important because the money spent on other expenses depends on income and family size. Even though these expenses are not considered necessities, they help individuals stay healthy.

Numerical Summary

For other expenses, a quantitative variable, the measure of central tendency is the median. When the data contain outliers or are not normally distributed, the median is the best measure of central tendency.

For other expenses, the measure of dispersion is the sample standard deviation (SD), which was selected for several reasons. First, the data are a sample from a larger data set. Second, the sample SD is the most commonly used measure of dispersion. Third, the other expenses variable is quantitative.

For other expenses, a histogram was used to show the distribution of the data, because a histogram is considered the best plot for showing the distribution of quantitative data.

Based on the results, food was the highest expense category and other expenses was the lowest. One recommendation for saving money is to reduce food expenses: annual food spending was high, and it appears that much of it was unnecessary.