# Celebrate The Big Data Problems – #3


How do we compute basic statistics (mean, median, standard deviation, variance, correlation, covariance) using the R language?

The Dataottam team has come up with a blog-sharing initiative called “Celebrate the Big Data Problems”. In this series of blogs, we will share our big data problems using the CPS (Context, Problem, Solutions) framework.

## Context:

In statistics, the mean, median, standard deviation, variance, correlation, and covariance are foundational measures. Everyone from data analysts to data scientists uses these basic statistics. They can be computed in many languages, but here we will use the R language.

### Mean

The mean is the average of the numbers. It is easy to calculate: add up all the numbers, then divide by how many numbers there are. In other words, it is the sum divided by the count.
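As a minimal sketch of "sum divided by count", the snippet below compares a manual calculation with R's built-in `mean()`. The vector `x` is hypothetical sample data chosen for illustration.

```r
# Hypothetical sample data (not from the original post)
x <- c(2, 4, 6, 8, 10)

# Manual calculation: the sum divided by the count
manual_mean <- sum(x) / length(x)

# Built-in function
builtin_mean <- mean(x)

print(manual_mean)   # 6
print(builtin_mean)  # 6
```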

### Median

The median is the middle value of a sorted list of numbers. To find the median, place the numbers in value order and pick the middle one; if the list has an even count, take the average of the two middle numbers.
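A small illustration, using made-up values, of how `median()` behaves for an odd and an even number of values. R sorts the values internally, so the input does not need to be ordered first.

```r
# Hypothetical sample data: one odd-length and one even-length vector
odd_values  <- c(7, 1, 5)
even_values <- c(7, 1, 5, 3)

# Sorted: 1 5 7, so the middle value is 5
median_odd <- median(odd_values)

# Sorted: 1 3 5 7, so the median is the average of 3 and 5
median_even <- median(even_values)

print(median_odd)   # 5
print(median_even)  # 4
```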

### Standard Deviation

The standard deviation (SD) is a measure of how spread out the numbers are. It is the square root of the variance, and its symbol is σ (sigma), a Greek letter.
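The sketch below, on hypothetical data, checks `sd()` against a manual calculation. Note that R's `sd()` computes the sample standard deviation, dividing by `n - 1` rather than `n`.

```r
# Hypothetical sample data
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# Manual sample standard deviation (denominator n - 1, as sd() uses)
manual_sd <- sqrt(sum((x - mean(x))^2) / (n - 1))

# Built-in function
builtin_sd <- sd(x)

# The SD is also the square root of the variance
sd_from_var <- sqrt(var(x))

print(all.equal(manual_sd, builtin_sd))   # TRUE
print(all.equal(builtin_sd, sd_from_var)) # TRUE
```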

### Variance

The variance is a measure of how spread out the numbers are; it is the average of the squared differences from the mean.
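One point worth illustrating: the "average of the squared differences" described above is the population variance, whereas R's `var()` returns the sample variance, which divides by `n - 1`. The sketch below, on hypothetical data, shows both and how they relate.

```r
# Hypothetical sample data
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# Population variance: the plain average of squared differences from the mean
population_var <- mean((x - mean(x))^2)

# Sample variance, as computed by R's var(): divides by n - 1
sample_var <- var(x)

# Rescaling the sample variance recovers the population variance
rescaled_var <- sample_var * (n - 1) / n

print(population_var)  # 4
print(sample_var)      # 32/7, about 4.571
print(all.equal(rescaled_var, population_var))  # TRUE
```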

### Correlation

When two sets of data are strongly linked together, we say they have a high correlation. Correlation is positive when the values increase together, and negative when one value decreases as the other increases.
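To make the positive and negative cases concrete, here is a small sketch with made-up vectors. By default `cor()` computes the Pearson correlation coefficient, which ranges from -1 to 1.

```r
# Hypothetical data: y_up rises with x, y_down falls as x rises
x      <- c(1, 2, 3, 4, 5)
y_up   <- 2 * x + 1    # 3 5 7 9 11
y_down <- 10 - 2 * x   # 8 6 4 2 0

# Perfect positive linear relationship: correlation of 1
cor_up <- cor(x, y_up)

# Perfect negative linear relationship: correlation of -1
cor_down <- cor(x, y_down)

print(cor_up)
print(cor_down)
```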

### Covariance

Covariance is a measure of how much two random variables change together.
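The following sketch, on hypothetical data, computes the sample covariance by hand and compares it with `cov()`. It also shows that dividing the covariance by the product of the two standard deviations gives the correlation.

```r
# Hypothetical sample data
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
n <- length(x)

# Manual sample covariance (denominator n - 1, as cov() uses)
manual_cov <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Built-in function
builtin_cov <- cov(x, y)

# Correlation is covariance scaled by both standard deviations
cor_from_cov <- builtin_cov / (sd(x) * sd(y))

print(manual_cov)   # 1.5
print(builtin_cov)  # 1.5
print(all.equal(cor_from_cov, cor(x, y)))  # TRUE
```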

## Problem:

How do we compute the basic statistics (mean, median, standard deviation, variance, correlation, and covariance) using the R language?

## Solutions:

Use the functions below, as applicable, assuming that x and y are numeric vectors.

• mean(x)
• median(x)
• sd(x)
• var(x)
• cor(x, y)
• cov(x, y)
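As a quick end-to-end sketch, the six functions can be called together and collected into one named vector. The vectors `x` and `y` and the name `stats` are hypothetical, chosen only for illustration.

```r
# Hypothetical sample data
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

# Collect all six basic statistics in one named numeric vector
stats <- c(
  mean   = mean(x),
  median = median(x),
  sd     = sd(x),
  var    = var(x),
  cor    = cor(x, y),
  cov    = cov(x, y)
)

# Rounded values: mean 3, median 3, sd 1.5811, var 2.5, cor 0.7746, cov 1.5
print(round(stats, 4))
```

Keeping the results in a named vector makes it easy to pick out a single value later, for example `stats["cor"]`.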
