
How to calculate simple statistics

Updated on November 13, 2008

Introduction

Among the most commonly used calculations in statistics are the mean, mode, median, variance, and standard deviation (std dev).

In this hub, I will cover the following:

  • sample size
  • population
  • mean
  • mode
  • median
  • variance
  • standard deviation

Sample Size and Population

Statistics begins with a set of observed numbers, which is called the sample.

The full set of values that the sample is drawn from is called the population.

Let's say that we ask 5 friends to rate a popular movie on the scale from 1 to 10.

Then, the sample size is 5 and the population is the set of all people who have seen or will see the movie.

Calculating the mean

So, we ask our 5 friends to rate the movie and here's what we get:

Fred: 6

Sally: 9

Michael: 8

Raul: 9

Elena: 2

To calculate the mean, you sum up all the numbers in the sample and then divide by the sample size. The sum is 6 + 9 + 8 + 9 + 2 = 34. Since the sample size is 5, the mean is 34/5 = 6.8.

This then is the average of the sample.
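
The same calculation can be sketched in a few lines of Python. This is only an illustration: the dictionary name `ratings` and the helper `mean` are names chosen for this example, not part of any library.

```python
# Ratings given by the five friends in the example.
ratings = {"Fred": 6, "Sally": 9, "Michael": 8, "Raul": 9, "Elena": 2}


def mean(values):
    """Sum the values, then divide by how many there are."""
    values = list(values)
    return sum(values) / len(values)


sample_mean = mean(ratings.values())
print(sample_mean)  # 6.8
```

Python's standard library also offers `statistics.mean`, which gives the same result for this data.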

Calculating the mode

The mode is the number that appears the most often in the sample.

To calculate the mode, we count the number of times each rating is made. So we have one 6, two 9's, one 8, and one 2. Since we have two 9's and one of everything else, 9 is the mode.

But what would happen if we have the following sequence: 2,2,8,9,9?

In this case, we would say that there is no unique mode. A mode is unique if and only if one number is more frequent than all others.
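
As a rough sketch, the counting can be done with `collections.Counter` from Python's standard library. The helper name `modes` is made up for this example; it returns every value tied for the highest count, so a list with more than one entry means there is no unique mode.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest count, in sorted order."""
    counts = Counter(values)          # how many times each rating appears
    highest = max(counts.values())    # the largest count
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([6, 9, 8, 9, 2]))  # [9]
print(modes([2, 2, 8, 9, 9]))  # [2, 9]
```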

Calculating the median

The median is the value we get when we order all of our numbers and then find the one in the middle.

If we order the numbers from smallest to largest, we get: 2, 6, 8, 9, 9

Since we have a sample size of 5, the number in the middle is 8.

But what happens if the sample size is even? In this case, we can add the two middle numbers and divide by 2.

So, if our numbers are: 2,6,8,9, then the median is (6+8)/2 = 7.
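
Both cases, odd and even sample sizes, can be sketched as one small Python function. The name `median` here is just a helper written for this example (the standard library's `statistics.median` behaves the same way on this data).

```python
def median(values):
    """Order the values and return the middle one.

    With an even count, return the average of the two middle values.
    """
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([6, 9, 8, 9, 2]))  # 8
print(median([2, 6, 8, 9]))     # 7.0
```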

Calculating Variance

The variance is a measure of how spread out the sample data is. The larger the variance, the more the answers differ from one another. Many people find standard deviation to be a more useful measure of variability.

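The method for calculating the variance is different depending on whether we are calculating the variance of a population (everyone) or the variance of a sample (some but not all). The steps below calculate the sample variance; for a population variance, step 5 divides by the population size instead of the sample size - 1.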

Here are the steps:

(1) Figure out the mean. This is the sum of the numbers given divided by the sample size (i.e. the average).

(6+ 9+ 8 + 9 + 2)/5 = 34/5 = 6.8

(2) Figure out the difference between each number and its mean so that we have:

(6 - 6.8), (9 - 6.8), (8 - 6.8), (9 - 6.8), (2 - 6.8) = -0.8, 2.2, 1.2, 2.2, -4.8

(3) Get the square of each difference in step #2 so that we have:

(-0.8)*(-0.8), (2.2)*(2.2), (1.2)*(1.2), (2.2)*(2.2), (-4.8)*(-4.8) = 0.64, 4.84, 1.44, 4.84, 23.04

(4) Get the sum of all the squares in step #3 so that we have:

sum of squares = 0.64 + 4.84 + 1.44 + 4.84 + 23.04 = 34.8

(5) Now, for the sample variance, we divide the sum in step #4 by the sample size - 1

Variance = 34.8/(5-1) = 34.8/4 = 8.7
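
The five steps can be sketched in Python as follows. The function name `variance` and its `sample` flag are choices made for this illustration. The `sample=False` branch assumes the usual population formula, which divides by the number of values rather than by one less.

```python
def variance(values, sample=True):
    """Follow the five steps: mean, differences, squares, sum, divide."""
    values = list(values)
    n = len(values)
    mean = sum(values) / n                              # step 1
    squared_diffs = [(x - mean) ** 2 for x in values]   # steps 2 and 3
    sum_of_squares = sum(squared_diffs)                 # step 4
    divisor = n - 1 if sample else n                    # step 5
    return sum_of_squares / divisor


ratings = [6, 9, 8, 9, 2]
print(round(variance(ratings), 2))                # 8.7
print(round(variance(ratings, sample=False), 2))  # 6.96
```

The results are rounded only for display, because floating-point arithmetic can leave a tiny error in the last digits.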

Calculating the Standard Deviation

The standard deviation, like variance, is a measure of how spread out the sample data is. The larger the standard deviation, the more the answers differ from one another. Standard deviation is more popular as a measure than variance.

The method for calculating the standard deviation is different depending on whether we are calculating the standard deviation of a population (everyone) or the standard deviation of a sample (some but not all). The method is the same as variance with one additional step.

Here are the steps:

(1) Figure out the mean. This is the sum of the numbers given divided by the sample size (i.e. the average).

(6+ 9+ 8 + 9 + 2)/5 = 34/5 = 6.8

(2) Figure out the difference between each number and its mean so that we have:

(6 - 6.8), (9 - 6.8), (8 - 6.8), (9 - 6.8), (2 - 6.8) = -0.8, 2.2, 1.2, 2.2, -4.8

(3) Get the square of each difference in step #2 so that we have:

(-0.8)*(-0.8), (2.2)*(2.2), (1.2)*(1.2), (2.2)*(2.2), (-4.8)*(-4.8) = 0.64, 4.84, 1.44, 4.84, 23.04

(4) Get the sum of all the squares in step #3 so that we have:

sum of squares = 0.64 + 4.84 + 1.44 + 4.84 + 23.04 = 34.8

(5)  We divide the sum in step #4 by the sample size - 1

34.8/(5-1) = 34.8/4 = 8.7

(6)  Last, we take the square root of the value in step #5.

Standard Deviation = sqrt(8.7) = roughly 2.95 
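
As a quick cross-check of the hand calculation, Python's standard `statistics` module can do the same work. In this sketch, `statistics.variance` and `statistics.stdev` are the sample versions (they divide by the sample size - 1), and the variable names are chosen for this example.

```python
import math
import statistics

ratings = [6, 9, 8, 9, 2]

# Steps 1 to 5: the sample variance (divides by n - 1).
sample_variance = statistics.variance(ratings)

# Step 6: the square root of the variance.
sample_std_dev = math.sqrt(sample_variance)

print(round(sample_std_dev, 2))              # 2.95
print(round(statistics.stdev(ratings), 2))   # 2.95, computed in one call
```

For a whole population, the module provides `statistics.pvariance` and `statistics.pstdev` instead.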

Interpreting Standard Deviation

A smaller standard deviation means that there is more agreement between the numbers (less variation), and a larger standard deviation means that there is less agreement (more variation).

If the observations follow a bell curve (a normal distribution), then we can use the standard deviation to make the following observations:

  • About 68% of the numbers lie within one standard deviation of the mean
  • About 95% of the numbers lie within two standard deviations of the mean

Now, movie ratings are, in theory, not random since they are based on the quality of a movie. Additionally, we know that 100% of the ratings are between 1 and 10 and are most likely whole numbers.

But what would it say for another movie if the mean were 5 and the standard deviation were 1, and we assume that ratings form a bell curve?

With this information, we can expect:

  • About 68% of all people will rate the movie between 4 and 6, since 4 = 5 - 1 and 6 = 5 + 1
  • About 95% of all people will rate the movie between 3 and 7, since 3 = 5 - 2*1 and 7 = 5 + 2*1
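
The ranges above, and the 68% and 95% figures themselves, can be checked with a short Python sketch. It assumes Python 3.8 or later for `statistics.NormalDist`, and it assumes the ratings really do follow a bell curve, which is an idealization. The helper name `band` is made up for this example.

```python
from statistics import NormalDist, mean, stdev


def band(center, std_dev, k):
    """Return the range that lies within k standard deviations of center."""
    return (center - k * std_dev, center + k * std_dev)


print(band(5, 1, 1))  # (4, 6)
print(band(5, 1, 2))  # (3, 7)

# Share of an ideal bell curve that falls inside each range.
curve = NormalDist(mu=5, sigma=1)
within_one = curve.cdf(6) - curve.cdf(4)
within_two = curve.cdf(7) - curve.cdf(3)
print(round(within_one, 3))  # 0.683
print(round(within_two, 3))  # 0.954

# Counting observations in a real sample: the five friends' ratings.
ratings = [6, 9, 8, 9, 2]
low, high = band(mean(ratings), stdev(ratings), 1)
inside = [r for r in ratings if low <= r <= high]
print(len(inside), "of", len(ratings))  # 4 of 5
```

With only five ratings, the share inside one standard deviation (4 of 5) need not match 68%; the bell-curve percentages describe large, normally distributed data sets.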

 

Comments


    • Guntur, 3 years ago

      Why are you using the sum function to find the mean.. When there is just a function specifically for finding the mean or the average? WAste of time.. I just want to know how to do standard..

    • Cristobal, 3 years ago

      Hi Arizona, First many thanks for checking out my video, I hope you found it useful. The Mean Deviation is not a commonly used statistic (at least I don't teach it in class). However, it is easy to do in Excel use the AVEDEV function (Formulas -> More Functions -> Statistical). All you need to do is select the data range and Excel will calculate the Mean Dev for you. Try it out with these data: 92, 97, 95, 90, 98. the Mean Dev should work out as 6. Hope this helps, Dr E.

    • Marylada, 3 years ago

      Grade A stuff. I'm unquestionably in your debt.

    • ou, 4 years ago

      good

    • Thien, 5 years ago

      Thanks for the last part.

    • VIJAYAKUMAR, 5 years ago

      Rest of "Interpreting Standard Deviation step, i have clear everything. Please let me have bit explanation of last point. How we get the 68% & 95%.

    • ann, 5 years ago

      my is not a question

    • Debbie, 6 years ago

      It helped me well, i nw undrstnd it well, thnk u a lot

    • zohaib noor, 6 years ago

      thank you very much in order to share that hub page

    • Elijah ibrahim, 6 years ago

      U people ar doing great pls kep it up

    • girlly, 6 years ago

      OMG, that was awesome. It saved me a lot of time. I couldn't understand the way my prof did it but I understand yours and i got the same result as her. Thank you sooooo very much

    • your face, 6 years ago

      how do you graph a statistic

    • Veronica Clark, 7 years ago

      Hi there!

      Can you please explain Z-scores and provide an example on how to calculate them? Thank you!

    • Horlah, 7 years ago from Oyo, Oyo, Nigeria

      Good one there. How do you explain the aspect of correlation and regression? Please help.

    • frogpaul77, 7 years ago

      need help with variance and stanard diviation for the sample of numbers 12,4,16,14,10?

    • Ruguru, 7 years ago

      Biostatistics jug my head that was a better way i have understood it now my lecturer made my life a hell. I knew can count on you nice job

    • stuck, 7 years ago

      big up. Keep up the good work

    • Stan, 7 years ago

    • Smurf, 7 years ago

      Excellent way of explaining how to understand the "end results" of stdev!!! Awsome!!!

    • radhhhhhhh, 8 years ago

      really good to see. its very useful for people who are learning stats fundamentals. keep it up guys

    • Manna in the wild, 8 years ago from Australia

      Math hubs will never go out of date !

    • pinkhawk, 8 years ago from Pearl of the Orient

      ...We are using softwares now but I still need to go back with the basics, it is a great help.... thank you for sharing this hub! ^.^

    • myClone, 8 years ago from The Land of Confusion

      Nicely explained! I am wondering though--what is a simple way of calculating the Total Sum of Squares (SST)? Also, it would be awesome to have a detailed tutorial on how to analyze an ANOVA table!

    • Gen R, 8 years ago

      Thank you!!!

    • Helna, 8 years ago

      nice work

    • beta, 8 years ago

      Great work, but I wish I get it, what is the shortest way?

    • Manna in the wild, 8 years ago from Australia

      You explained this well.

    • anothermathgeek (AUTHOR), 8 years ago from East Bay, California

      Hi D.S.,

      1) Figure out the mean

      2) Figure out the standard deviation

      3) Count the number of observations that lie between (mean - std dev) and (mean + std dev)

    • D.S., 8 years ago

      How do you calculate the number of observations that lie within one standard deviation of the mean, given a list of 44 observations/numbers?

    • Moon Daisy, 8 years ago from London

      Nice lesson, and very clearly explained! (Btw I'm a bit of a maths geek too!)

    • Anonymous1, 8 years ago

      Nice mate, thanks for this

    • Christenstock, 9 years ago from Mililani, HI & Rye, NY

      Supoerb Hub. Thanks MathGeek!
