Introduction to Statistics

Introduction to Statistics Petra Petrovics

Statistics
Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data. The word is used in three senses:
- Practical activity: analyzing data
- Set of data: the result of statistical activity
- Method: analyzing data and drawing conclusions

Hungarian Central Statistical Office (HCSO)
- Independent administrative organization
- Operates under the direct supervision of the government
Main tasks:
- designing and conducting surveys
- recording, processing and storing data
- data analysis and dissemination
- protection of individual data

Statistics
- Descriptive Statistics: the study of how data can be summarized effectively to describe the important aspects of large data sets. It turns data into information (data collection and analysis).
- Statistical Inference: used when tentative conclusions about a population are drawn on the basis of a sample.

Statistical Population
- All members of a specified group (N)
- The set of entities about which statistical inferences are to be drawn, often based on a random sample taken from the population
Types:
- discrete population
- continuous population (interval)

Statistical Variables
Statistical variable = a characteristic of a unit. Variables are classified in two ways:
(1) Quantitative, Qualitative, Temporal, Geographical
(2) Common, Differential

Quantitative vs. Qualitative
- Quantitative data measure either how much or how many of something, i.e. a set of observations where any single observation is a number that represents an amount or a count.
- Qualitative data provide labels, or names, for categories of like items, i.e. a set of observations where any single observation is a word or code that represents a class or category. Such a variable is also called a categorical variable.

Types of Quantitative Variables
- Continuous variables theoretically have an infinite number of gradations between two measurements, for example the body weight of individuals or the milk yield of cows or buffaloes. Most variables in biology are of the continuous type.
- Discrete variables have no continuous gradations; there is a definite gap between two measurements, i.e. they cannot be measured in fractions. Examples are the number of eggs laid by hens or the number of children in a family.

Scales of Measurement
From weakest to strongest:
- nominal scale
- ordinal scale
- interval scale
- ratio scale

1. Nominal Scale
- Numbers are labels of groups or classes
- Simple codes assigned to objects as labels
- For qualitative data, e.g. professional classification, geographic classification
Examples:
- blonde: 1, brown: 2, red: 3, black: 4 (a person with red hair does not possess more "hairness" than a person with blonde hair)
- female: 1, male: 2

2. Ordinal Scale
- Data elements may be ordered according to their relative size or quality
- The numbers assigned to objects or events represent the rank order (1st, 2nd, 3rd etc.)
- e.g. top lists of companies

3. Interval Scale
- Distances between any two observations are meaningful
- The "zero point" is arbitrary
- Negative values can be used
- Ratios between numbers on the scale are not meaningful, so operations such as multiplication and division cannot be carried out directly
- e.g. temperature on the Celsius scale

4. Ratio Scale
- Strongest scale of measurement
- Distances between observations and also the ratios of distances have a meaning
- Contains a meaningful zero
- e.g. mass, length, time; a salary of $50,000 is twice as large as a salary of $25,000
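The difference between the interval and the ratio scale can be checked with a few lines of arithmetic. The sketch below is only an illustration: the two temperatures and the conversion to Kelvin (a scale with a meaningful zero) are assumptions added here, not part of the slides.

```python
def celsius_to_kelvin(c):
    """Convert Celsius (arbitrary zero) to Kelvin (meaningful zero)."""
    return c + 273.15

c_low, c_high = 10.0, 20.0  # illustrative values, not from the slides

# Interval scale: differences are meaningful and survive a change of zero point.
diff_c = c_high - c_low
diff_k = celsius_to_kelvin(c_high) - celsius_to_kelvin(c_low)

# The ratio of Celsius readings depends on the arbitrary zero point,
# so "20 degrees is twice as warm as 10 degrees" is not a valid statement.
naive_ratio = c_high / c_low
kelvin_ratio = celsius_to_kelvin(c_high) / celsius_to_kelvin(c_low)

# Ratio scale: salary has a meaningful zero, so the ratio can be interpreted.
salary_ratio = 50_000 / 25_000

print(diff_c, round(diff_k, 2))
print(naive_ratio, round(kelvin_ratio, 4))
print(salary_ratio)
```

The difference is the same on both temperature scales, while the ratio changes as soon as the zero point moves; the salary ratio has no such problem.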

Statistical Rows & Columns
- Classes
- Frequencies

Data Set
1. Mass of numerical data (discrete values), e.g.:
   11.8, 3.6, 16.6, 13.5, 3.6, 8.3, 8.9, 9.1, 7.7, 2.3, 12.1, 6.1, 10.2, 8.0, 11.4, 6.8, 9.6, 19.5, 15.3, 12.3, 8.5, 15.9, 18.7, 11.7, 6.2, 11.2, 10.4, 7.2, 5.5, 14.5
2. Frequency distribution: a method of organising and presenting data
   - by score value
   - by interval of score values (classes)
   A statistical table records the number of observations in each class.

Frequency table with score values

Score value   Frequency
2.3           1
3.6           2
...           ...
19.5          1
Total         30

Class Intervals
Approximate class width: $\frac{\text{largest value} - \text{smallest value}}{\text{number of classes}} = \frac{20 - 2}{6} = 3$
Number of class intervals: $2^k > N$

Class limits   Class width   Frequency
2-5            3             3
5-8            3             6
8-11           3             8
11-14          3             7
14-17          3             4
17-20          3             2
Total                        30
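The class-interval table can be reproduced from the raw data set in a few lines. This is a minimal sketch: the variable names are illustrative, and treating each class as closed on the left and open on the right (so that 8.0 falls into 8-11) is an assumption that happens to match the frequencies on the slide.

```python
data = [11.8, 3.6, 16.6, 13.5, 3.6, 8.3, 8.9, 9.1, 7.7, 2.3,
        12.1, 6.1, 10.2, 8.0, 11.4, 6.8, 9.6, 19.5, 15.3, 12.3,
        8.5, 15.9, 18.7, 11.7, 6.2, 11.2, 10.4, 7.2, 5.5, 14.5]
N = len(data)

# Smallest k satisfying 2^k > N (lower bound on the number of classes).
k_min = 1
while 2 ** k_min <= N:
    k_min += 1

# The slide uses 6 classes over the rounded range 2..20.
lower, upper, n_classes = 2, 20, 6
width = (upper - lower) / n_classes

# Count observations per class; classes are assumed to be [lower, upper).
counts = [0] * n_classes
for x in data:
    idx = int((x - lower) // width)
    counts[idx] += 1

for i, c in enumerate(counts):
    lo = lower + i * width
    print(f"{lo:g}-{lo + width:g}: {c}")
print("Total:", sum(counts))
```

For N = 30 the rule gives at least 5 classes; the slide's choice of 6 satisfies it and yields a round class width of 3.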

Types of Statistical Rows
- Classifying: the main population and its partial populations; same measures
- Comparative: same types of data; generally the data cannot be added
- Descriptive: different types and measures of data
By type of variable, rows can be qualitative, quantitative, temporal or geographical.

Descriptive Rows

Name                           Data
Territory (thousand km²)       93.0
Population (million people)    10.04
GDP (billion euro)             105.8
CPI (%)                        106.1

Comparative Rows: Temporal

Time series referring to a point in time (discrete population); the values cannot be summed.

Year   Hungarian population (thousand persons)
1960    9 961
1970   10 322
1980   10 709
1990   10 709

Time series referring to a period (continuous population); the values can be summed.

Year   Number of marriages
2002   46 008
2003   46 398
2004   43 791
2005   44 234

Comparative Rows: Geographical

Country    GDP (%), 2001-2005
Hungary    4.2
Romania    5.7
Slovakia   4.6
Slovenia   3.4

Comparative Rows: Qualitative

Hungary, 2005
Sex     Life expectancy (years)
Men     68.6
Women   76.9
Source: Statistical Yearbook 2005

Classifying Rows
The classifying variable can be qualitative (product), temporal (period) or geographical (country).

Product / Period / Country   Turnover (thousand HUF)
A                             3 880
B                             4 020
C                             3 000
Total                        10 900

Types of Quantitative Rows
E.g.: water consumption in X village

Water consumption (m³)   Number of houses f   f'   g (%)   g' (%)   S (m³)   Z (%)
up to 15                  5                    5   10       10        50       3
15-25                    17                   22   34       44       340      24
25-35                    15                   37   30       74       450      32
35-45                     8                   45   16       90       320      23
45 and over               5                   50   10      100       250      18
Total                    50                    -  100        -      1410     100

f: frequency, f': cumulative frequency, g: relative frequency, g': cumulative relative frequency

Frequency (f): the number of times a value of the data occurs.
Cumulative Frequency (f'): the sum of the frequencies for all values that are less than or equal to the given value.

Water consumption (m³)   Number of houses f   f'   Calculation of f'
up to 15                  5                    5   5
15-25                    17                   22   5+17
25-35                    15                   37   5+17+15
35-45                     8                   45   5+17+15+8
45 and over               5                   50   5+17+15+8+5
Total                    50                    -

$f'_i = \sum_{j=1}^{i} f_j$, and for the last class $f'_k = \sum_{i=1}^{k} f_i = N$
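The cumulative frequencies are simply running totals of the frequency column. A minimal sketch using the water consumption frequencies (the variable names are illustrative):

```python
from itertools import accumulate

# Number of houses per water consumption class
f = [5, 17, 15, 8, 5]
N = sum(f)

# Running total: each entry adds the current frequency to all previous ones.
f_cum = list(accumulate(f))

print(f_cum)
print(f_cum[-1] == N)
```

The last cumulative frequency always equals the size of the population, which is a quick consistency check on any frequency table.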

Relative Frequency (g): the ratio of the number of times a value of the data occurs in the set of all outcomes to the number of all outcomes.
Cumulative Relative Frequency (g'): applies to a set of observations ordered from smallest to largest. It is the sum of the relative frequencies for all values that are less than or equal to the given value.

Water consumption (m³)   g (%)   g' (%)
up to 15                 10       10
15-25                    34       44
25-35                    30       74
35-45                    16       90
45 and over              10      100
Total                   100        -

$g_i = \frac{f_i}{N} \cdot 100\ (\%) \quad (i = 1, \dots, k)$
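The relative and cumulative relative frequencies follow directly from the frequency column. A short sketch, with illustrative variable names:

```python
from itertools import accumulate

# Number of houses per water consumption class
f = [5, 17, 15, 8, 5]
N = sum(f)

# Relative frequency in percent: share of each class in the population.
g = [100 * fi / N for fi in f]

# Cumulative relative frequency: running total of the shares.
g_cum = list(accumulate(g))

print(g)
print(g_cum)
```

Both columns end in 100%: the relative frequencies as their sum, the cumulative ones as their final entry.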

Sum of Values (S)

Water consumption (m³)   Number of houses (f)   x_i * f_i   S (m³)
up to 15                  5                     10*5          50
15-25                    17                     20*17        340
25-35                    15                     30*15        450
35-45                     8                     40*8         320
45 and over               5                     50*5         250
Total                    50                                 1410

x_i: discrete value or middle of the class:
$x_i = \frac{X_i^{\text{lower}} + X_i^{\text{upper}}}{2}$

$S = \sum_{i=1}^{k} S_i = f_1 x_1 + \dots + f_k x_k = \sum_{i=1}^{k} f_i x_i$
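As a rough sketch of the sum-of-values calculation, each class midpoint is multiplied by its frequency and the products are added up. The outer limits 5 and 55 of the two open-ended classes are an assumption made here, chosen only because they reproduce the midpoints 10 and 50 used in the table.

```python
# Class limits of the water consumption example (m³).
# The outer limits 5 and 55 are assumed; they give the midpoints 10 and 50.
class_limits = [(5, 15), (15, 25), (25, 35), (35, 45), (45, 55)]
f = [5, 17, 15, 8, 5]  # number of houses per class

# Midpoint of each class: (lower limit + upper limit) / 2
x = [(lo + hi) / 2 for lo, hi in class_limits]

# Sum of values per class and in total
S_i = [fi * xi for fi, xi in zip(f, x)]
S = sum(S_i)

print(x)
print(S_i)
print(S)
```

Because the midpoint stands in for every observation in its class, S is an estimate of the total consumption rather than an exact figure.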

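Relative Sum of Values (Z)

Water consumption (m³)   S (m³)   Z (%)
up to 15                  50       3
15-25                    340      24
25-35                    450      32
35-45                    320      23
45 and over              250      18
Total                 S=1410     100

$Z_i = \frac{S_i}{S} \cdot 100\ (\%) \quad (i = 1, \dots, k)$

Example: $Z_3 = \frac{450}{1410} \cdot 100 \approx 32$

$0 \le Z_i \le 1$ and $\sum_{i=1}^{k} Z_i = 1\ (100\%)$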
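The relative sums of values can be recomputed from the S column as a check. A minimal sketch with illustrative variable names:

```python
# Sum of values per water consumption class (m³)
S_i = [50, 340, 450, 320, 250]
S = sum(S_i)

# Relative sum of values in percent: share of each class in the total.
Z = [100 * s / S for s in S_i]

for s, z in zip(S_i, Z):
    print(f"{s:>4} m³ -> {z:.1f} %")
print("Total:", round(sum(Z), 6))
```

Rounded independently, the first class comes to about 4%, while the table prints 3%; presumably the printed figure was adjusted so that the rounded shares add up to exactly 100.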

Types of Statistical Tables
- Simple table: descriptive / comparative row combined with descriptive / comparative row
- Classifying table: classifying row combined with descriptive / comparative row
- Combined table: classifying row combined with classifying row

Statistical Table
Statistical table: a set of data arranged in rows and columns.
It is important to have:
- title
- source
- measurements
Signs:
- if we do not know the data:
- if there is not any data: 0

Example of a table with 1 dimension, with its title, measurements and source:

Data about Hungary (2008)
Name                           Data
Territory (thousand km²)       93.0
Population (million people)    10.04
GDP (billion euro)             105.8
CPI (%)                        106.1
Source: HCSO (KSH)

Example of a table with 2 dimensions:

Territory price index in Hungary (2008)
(Thousand people)
Territory               Employed people   Unemployed people
Central Hungary         1245.5             80.5
Central Transdanubia     441.5             40.5
Western Transdanubia     408.2             36.5
Southern Transdanubia    337.4             42.3
Northern Hungary         397.6             69.9
Northern Great Plain     492.1             78.7
Southern Great Plain     474.8             53.3
Total                   3797.1            401.7
Source: HCSO (KSH)

Graphs

The Graphic Presentation of Data
It allows important characteristics to be visualized.
Principles:
- Perspicuous
- Homogeneous
- Aim oriented
- Simple
- Reconstructable
- Scaled

The Graphic Presentation of Time Series I
Figure: Number of Accidents in Hungary (Source: HCSO)
Line chart: connects a series of data points with a line

The Graphic Presentation of Time Series II
Figure: Natural Gas Consumption
Area chart: represents cumulated totals, using numbers or percentages (stacked), over time; emphasizes a change in values

The Graphic Presentation of Time Series III
Figure: Change in the Number of Employments (Source: Statistical Yearbook, 2005)
Bar chart (stacked): used in the case of time periods (x-axis: interval)

The Graphic Presentation of Quantitative Rows
Figure: Population pyramid (Source: Statistical Yearbook, 2005)

The Graphic Presentation of Quantitative Rows
Figure: Scatter/dot plot (based on Word95.sav)
Scatter/dot plot: shows the distribution of data points along one or two dimensions

The Graphic Presentation of Frequency Distribution
Histogram: a bar chart of data grouped into a frequency distribution; it shows the quantity of points that fall within various numeric ranges

The Graphic Presentation of Frequency Distribution
Frequency polygon:
- Connects data points through straight lines or higher order graphs
- x-axis: midpoint of each interval
- y-axis: absolute frequency
Cumulative frequency distribution:
- Tends to flatten out

The Graphic Presentation According to Qualitative Variables
Pie chart:
- Shows proportional relationships at a point in time
- Shows percentage values as slices of a pie
- Compares parts of a whole at a given point in time

The Graphic Presentation According to Territory Variables
- Cartogram: a map showing quantitative information
- Pictogram

Thanks for your attention!