Examination of Multicollinearity in Linear Regression Models: Examination of PETRES' Red




University of Szeged
Faculty of Economics and Business Administration
Doctoral School in Economics

Examination of Multicollinearity in Linear Regression Models
Examination of PETRES' Red

Theses of PhD Dissertation

Written by: Péter Kovács
Consultant: Dr. Tibor Petres, associate professor

Szeged, 2008

I. Problem definition, aims and hypotheses of the research

I. 1. Definition of the problem

In today's globalizing world decision makers have an increased need for information. The great increase in the quantity of data is not automatically accompanied by an appropriate increase in information. The problem that decision makers face today is not the lack but the abundance of data, and this huge amount of data frequently has little information content, which means that redundancy is high. Redundancy means superfluous data that convey no new or noteworthy information for the purposes of the examination. For this reason the information content of metric data is an essential issue in empirical analyses. This is particularly true for the application of linear regression models, where multicollinearity can be interpreted as a type of redundancy.

In matrix notation the linear regression model can be written as

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},$$

where $\mathbf{y}$ is the n-component column vector of the dependent variable; $\mathbf{X}$ is the matrix of explanatory variables with n rows and (m+1) columns, whose first column is always the summing vector $\mathbf{x}_0$; $\boldsymbol{\beta}$ is the (m+1)-component column vector of the unknown model parameters; m is the number of explanatory variables; and $\boldsymbol{\varepsilon}$ is the n-component column vector of the error term.

The concept of multicollinearity is apparently uniform in the literature. Definitions usually differ from each other in a single word, but this entails significant changes in content. The expression multicollinearity was first used by RAGNAR FRISCH, who applied it to cases in which one variable was present in several relations. In his examinations he did not distinguish dependent variables from explanatory variables. He supposed that all variables were measured with error, and that the correlation between the actual values of the variables had to be estimated on this basis.

It is very superficial to define multicollinearity as the absence of independence of the explanatory variables. This definition is problematic because it does not state unambiguously what the independence of the explanatory variables means. Does it mean their linear independence, or their independence in the statistical sense?

One of the primary conditions of the standard linear regression model is the linear independence of the explanatory variables (KENNEDY). Therefore in certain sources multicollinearity is interpreted as the absence of linear independence of the explanatory variables. This approach can be regarded as a special case of multicollinearity, called extreme multicollinearity. This case does not pose special problems in practice as it is easily manageable. In empirical analyses, cases close to extreme multicollinearity are frequently encountered, in which the variances of individual estimated parameters are considerably inflated compared to the variance of the error term. The great majority of the literature on multicollinearity deals with this case. However, multicollinearity could also mean a much more general phenomenon, namely the covariance of the explanatory variables. Naturally, the special cases of this definition would convey the content meant by multicollinearity to everybody.

Recognizing multicollinearity and identifying its cause often present a serious problem in empirical examinations: on the one hand, the negative consequences of multicollinearity do not always occur, and on the other hand, multicollinearity can be caused not only by one variable but also by a group of variables. Thus it can be suspected that the indicators of multicollinearity do not always describe this phenomenon properly.

The interpretation of the indicators of multicollinearity is frequently quite subjective. Firstly, most of the indicators show how far the examined data are from ideal, that is, to what extent they deviate from the ideal case in which the explanatory variables are linearly independent of one another. For some indicators there is no definite threshold that marks a harmful extent of deviation.
Secondly, if the specification of the applied model is appropriate, multicollinearity is only the consequence of the lack of proper information. The success of the methods used for reducing or eliminating the negative effects of multicollinearity can largely depend on the exact recognition of multicollinearity. Although most of these methods decrease, or may decrease, the extent of the negative consequences of multicollinearity, this may be accompanied by other negative consequences, for example significant information loss or results that are hard to interpret properly.

The topicality of the subject is given by the fact that these problems are almost always encountered in economic analyses. This is particularly true if there is a strong trend in the explanatory variables, or if the information available is too little for examining the effect of the explanatory variables on the dependent variable.

In sum, in empirical analyses it frequently happens that not all the data have a useful content in respect of the examination; in other words, the database is redundant. In a multivariate linear regression calculation multicollinearity can be interpreted as a type of redundancy. Therefore during regression analysis it is essential to know the proportion of the data with a useful content in respect of the estimator $\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}$, but its proper measurement poses a problem. It is questionable what the indicators of multicollinearity indicate, and how the negative consequences of the presence of multicollinearity can be decreased.

I. 2. Aim of the dissertation

PETRES' Red is one possibility for measuring the proportion of data with a useful content in respect of the estimator $\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}$. PETRES' Red is a new possible indicator of redundancy and thus of multicollinearity. The Red indicator is defined by using the eigenvalues $\lambda_j$ ($j = 1, 2, \dots, m$) of the correlation matrix $\mathbf{R}$ of the explanatory variables.

The Red indicator is based on the following train of thought. If the database serving as the source of the explanatory variables is redundant in respect of the estimator $\hat{\boldsymbol{\beta}}$, that is, if the covariance of the data is considerable, not all the data will have a useful content. The smaller the proportion of the data with a useful content, the greater the extent of redundancy. The greater the dispersion of the eigenvalues, the greater the covariance of the explanatory variables in the database.
There are two extreme cases: either all the eigenvalues are equal to each other (each equals one), or all the eigenvalues except one equal zero (the remaining one then equals m, because the eigenvalues of a correlation matrix sum to m). The extent of dispersion can be quantified with the relative dispersion of the eigenvalues or with their dispersion; the two are equal in this case because the mean of the eigenvalues is one.
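The two extreme cases above can be illustrated numerically. The following is a minimal sketch in Python with NumPy, not code from the dissertation; the function name `eigenvalue_dispersion`, the choice of four variables, and the two example matrices are assumptions made for illustration. It computes the relative dispersion of the eigenvalues of a correlation matrix for uncorrelated and for perfectly correlated explanatory variables.

```python
import numpy as np

def eigenvalue_dispersion(R):
    """Relative dispersion (std / mean) of the eigenvalues of a
    symmetric correlation matrix R. Hypothetical helper name."""
    lam = np.linalg.eigvalsh(R)      # eigenvalues of a symmetric matrix
    return lam.std() / lam.mean()    # population standard deviation (ddof=0)

m = 4  # number of explanatory variables (illustrative choice)

# Extreme case 1: uncorrelated variables, R is the identity matrix,
# so every eigenvalue equals one.
no_redundancy = eigenvalue_dispersion(np.eye(m))

# Extreme case 2: perfectly correlated variables, R consists of ones,
# so one eigenvalue equals m and the others are zero.
max_redundancy = eigenvalue_dispersion(np.ones((m, m)))

print(no_redundancy, max_redundancy, np.sqrt(m - 1))
```

Under these assumptions the first value is zero and the second agrees with sqrt(m - 1) up to rounding, which is the largest relative dispersion that m nonnegative eigenvalues summing to m can have.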

Since the mean of the eigenvalues is $\bar{\lambda} = 1$, the relative dispersion $v_\lambda$ and the dispersion $\sigma_\lambda$ of the eigenvalues coincide:

$$v_\lambda = \frac{\sigma_\lambda}{\bar{\lambda}} = \frac{\sqrt{\frac{1}{m}\sum_{j=1}^{m}(\lambda_j - \bar{\lambda})^2}}{\frac{1}{m}\sum_{j=1}^{m}\lambda_j} = \sqrt{\frac{\sum_{j=1}^{m}(\lambda_j - \bar{\lambda})^2}{m}} = \sqrt{\frac{\sum_{j=1}^{m}(\lambda_j - 1)^2}{m}} = \sigma_\lambda$$

In order to make the redundancy of various databases comparable, the above indicator has to be normalized. As the eigenvalues are nonnegative, normalization is carried out with the value $\sqrt{m-1}$ because of the relationship $0 \le v_\lambda \le \sqrt{m-1}$ concerning relative dispersion. The indicator obtained in this way can be used to quantify the extent of redundancy, and the Red indicator is defined with its help as follows:

$$\mathrm{Red} = \frac{v_\lambda}{\sqrt{m-1}}$$

In the absence of redundancy the value of the above indicator is zero (zero percent), while in the case of maximum redundancy it is one (one hundred percent). The Red indicator measures the redundancy of the examined database of the given size. When the redundancies of two or more databases of different sizes are compared, the Red indicators can only be used to determine how redundant the individual databases are; one cannot make a direct statement as to which of them has more useful data.

I. 3. Structure of the dissertation

The aim of my dissertation is to examine the properties of the Red indicator and to compare it with other indicators in a multivariate linear regression model. In accordance with this aim, my dissertation is organized as follows.

In Chapter I the problem, the tasks and the aims of the dissertation are laid out. To this end, the fundamentals of regression analysis essential to understanding the dissertation are briefly summarized.

In Chapter II the literature on multicollinearity is surveyed. Several known and less-known indicators of multicollinearity, its methods of detection, its potential consequences and also the possibilities to decrease the negative effects thereof are discussed in this chapter.

Procedures of detection and indicators discussed:
- Examination of the correlation matrix of explanatory variables
- KLEIN's rule of thumb
- MASON's and PERREAULT's proposal
- M1 indicator
- M indicator
- FARRAR-GLAUBER test
- WILKS test
- Examination of the differences of the correlation coefficients and partial correlation coefficients
- FRISCH's bunch maps method
- VIF indicator
- BELSLEY's gamma indicator
- FELLMAN's L indicator
- MAHAYAN's and LAWLES's M1 indicator
- THISTED's mci (multicollinearity index) and pmci (predicted multicollinearity index)
- ISRM (Index of Stability of Relative Magnitudes)
- DEF indicator (Direct Effect Factor)

Procedures discussed for decreasing the adverse effects of multicollinearity:
- Omission of explanatory variables from the model
- Increasing the number of elements in the sample
- Use of external information
- Use of the MOORE-PENROSE inverse
- Principal component analysis
- Ridge regression
- Nested estimate procedure
- Examination of the orthogonality of explanatory variables

As the conclusion of the chapter, I illustrated the procedures and indicators mentioned with an example. After surveying the literature and on the basis of empirical examples, I made the following findings.

1. The increase in the variance of the estimated parameters is the most frequently mentioned negative consequence of multicollinearity, but what should be considered is not their absolute value but the extent of their inflation compared to the variance of the error term.
2. Several methods are known for detecting and measuring multicollinearity, but only a few of them are widely accepted, partly because it is often very difficult to detect multicollinearity, and partly because the interpretation of the majority of the indicators is quite subjective. Some of the indicators and procedures only detect multicollinearity without localizing the problem, generally because of their synthetic nature. In contrast, another group of indicators and procedures tries to localize multicollinearity, with more or less success.
3. The great disadvantage of indicators using the reciprocals of eigenvalues is that their interpretation is subjective, which means that there is no definitive threshold value indicating strong multicollinearity. The values of the indicators are not comparable with each other. Moreover, the values of these indicators depend mainly on the smallest eigenvalue.
4. The presented indicators describe multicollinearity from different aspects.
5. There is no generally valid procedure for decreasing the negative consequences of multicollinearity; each procedure may have adverse effects from different aspects.
6. As a summary of the described and applied indicators, ideas and algorithms, it can be stated that the indicators and procedures mentioned are not generally valid, in the sense that only in special cases do they describe or handle the phenomenon of multicollinearity properly.

In Chapter III the methods applied during my research and their results are presented. The major characteristics of the Red indicator are examined. The results of other, similar examination methods are also presented here and compared with my results.

The other chapters of my dissertation contain the evaluation of my research activity and results, the list of the literature, figures and tables used, the output of longer computerized analyses and the list of my publications.

I. 4. Research hypotheses

The problems and hypotheses detailed in the following are investigated in order to achieve the aim of my dissertation.

1. Calculation of the Red indicator in a different way. In accordance with its definition, the Red indicator can be calculated on the basis of the eigenvalues of the correlation matrix. The question may arise whether the value of the indicator can be calculated without knowing the eigenvalues, merely from the elements of the correlation matrix of the explanatory variables. I examined the following hypothesis in Chapter III.1.

Hypothesis 1: The Red indicator can be expressed without knowing the eigenvalues of the correlation matrix of the explanatory variables, merely from the pairwise correlation coefficients.

2. Generalization of the examination method of multicollinearity. In my opinion not only the covariance of the variable pairs but also the covariance of the variable groups may pose a problem during the examination of multicollinearity. However, no detailed methodology has been worked out for this yet. I think a possible solution to the problem could be the use of canonical correlation analysis, a special case of which can be examined with the help of the Red indicator. I examined the following hypothesis in Chapter III.1.

Hypothesis 2: The covariance of two groups of explanatory variables can be examined in special cases with the help of the Red indicator.

3. Examination of new modelling possibilities of multicollinearity. A possible method for modelling multicollinearity is to examine the orthogonality of the explanatory variables, that is, the stretching of the space of explanatory variables. The question rightly arises whether multicollinearity can be modelled in a different way. I examined the following hypothesis in Chapter III.2.

Hypothesis 3: The elliptical model of multicollinearity can be formulated on the basis of the Red indicator as a new approach.

4. Searching for a relationship between the variances of the estimated regression parameters and the Red indicator. As one of the most frequently mentioned negative consequences of multicollinearity is the increase in the variance of the estimated regression parameters, and their inflation, it is advisable to examine the relationship between the Red indicator and the variances of the estimated regression parameters. I examined the following hypothesis in Chapter III.3.

Hypothesis 4: A critical value of the Red indicator can be given as the precondition for the variances of the estimated parameters not to be infinite.

5. Examination of the distribution of the Red indicator. In Chapter III.4 I tried to prepare the empirical distribution function of the Red indicator and to determine its theoretical distribution.

6. Examination of the application possibilities of the Red indicator. An interesting question is in what fields the Red indicator can be used. I examined the following hypothesis in Chapter III.5.

Hypothesis 5: The KMO index used in factor analysis can be expressed on the basis of the Red indicator.

7. Identification of an indicator similar to the Red indicator. As the Red indicator is a normalized relative dispersion calculated from the eigenvalues of the correlation matrix of the explanatory variables, I think that multicollinearity can be measured with some other dispersion indicator of the eigenvalues, which is a conception similar to that of the Red indicator. I examined the following hypothesis in Chapter III.6.

Hypothesis 6: The GINI coefficient of the eigenvalues of the correlation matrix of the explanatory variables is an indicator of multicollinearity similar in conception to the definition of the Red indicator.

II. Research results and findings

Chapter III of the dissertation presents the new results of the paper. Part of the examinations is based on theoretical considerations, while for the other part various samples had to be created and their results had to be analysed. SPSS 13.0 and Microsoft Excel were used for the analyses. The geometric depiction was made with Derive 6.0. In sum, my dissertation contains the following theses.

Thesis 1: The Red indicator can be expressed without knowing the eigenvalues of the correlation matrix of the explanatory variables, merely as the quadratic mean of the pairwise correlation coefficients.

I could express the Red indicator without knowing the eigenvalues, as the quadratic mean of the elements outside the main diagonal of the correlation matrix of the explanatory variables. This means that this indicator shows not only the proportion of the data with a useful content in respect of the estimator $\hat{\boldsymbol{\beta}}$ but also the mean covariance of the explanatory variables. This result has been acknowledged at several international conferences and has also been cited in illustrious international journals.

Thesis 2: The examination of the covariance of two groups of explanatory variables is possible with the Red indicator in the case of groups with one element each, and with the harmonic mean of the VIF_j values in the case of groups with one and (m-1) elements.

I found that multicollinearity can be caused not only by variables but also by groups of variables. As the literature on this is not abundant, I am going to examine the effect of the covariance of the variable groups later on. I established that one special case of this can be measured with the Red indicator, and another special case with the help of the harmonic mean of the VIF_j values.

Thesis 3: The elliptical model of multicollinearity can be formulated on the basis of the Red indicator as a new approach.

As a new approach, I formulated the elliptical model of multicollinearity. As the mean covariance of the variables increases, the possible eigenvalues are situated on an m-dimensional sphere with a greater radius. The possible eigenvalues are situated on a segment of the m-dimensional sphere in such a way that, with a fixed Red value, they are located on an (m-1)-dimensional ellipsoid. Unfortunately, the higher the dimension of the model, the more conditions have to be given for determining and studying the range of possible eigenvalues. Therefore the detailed examination of this range and of the elliptical curves was carried out only for three explanatory variables. I determined the possible values of the Red indicator as a function of one eigenvalue, and I could give the possible values of each eigenvalue depending on the value of the Red indicator. I compared how the ellipses, and the lines along which the quotient of the highest and lowest eigenvalues is constant, move across the range of possible eigenvalues. Later I will try to improve the model and to extend the examination to higher dimensions.

Thesis 4: A critical value of the Red indicator can be given as the necessary precondition for the variances of the estimated parameters not to be infinite.
As the Red indicator is a synthetic indicator, it cannot be connected separately to the variances of the individual estimated regression parameters. I found that it is not the absolute value of the variances of the estimated regression parameters that is to be examined but their inflation compared to the variance of the error term. The sum and mean of these depend in turn on the reciprocal sum of the eigenvalues. I proved that the product of the harmonic mean of the eigenvalues and the arithmetic mean of the variances of the estimated regression parameters equals the variance of the error term, and also that the product of the harmonic mean of the eigenvalues and the arithmetic mean of the VIF_j values equals one.

After refuting a previous statement of mine, I gave a critical value of the Red indicator as the necessary precondition for the variances of the estimated parameters not to be infinite, and similarly critical values which are the preconditions for the number of zero eigenvalues to be lower than k. As this is of slight practical importance as such, further detailed examinations are necessary. However, I performed these examinations for three explanatory variables by using the elliptical model. I found that the reciprocal sum of the eigenvalues increases when moving farther from the lower boundary of the range of possible eigenvalues. Based on this, I determined the smallest and, where possible, the greatest extent of the inflation of the sum of the variances of the estimated parameters compared to the variance of the error term, as a function of the Red indicator. On this basis a critical value of the Red indicator can be given which is the precondition for the sum of the variances of the estimated parameters not to be inflated, compared to the variance of the error term, beyond an extent set beforehand.

In the course of the examination of the distribution of the Red indicator I prepared the empirical distribution function in a few dimensions. During the analysis I examined only existing correlation structures. I used an algorithm written by myself for generating the possible eigenvalues and for preparing the distribution of the Red indicator.
The essence of this algorithm is that all the possible eigenvalue combinations are generated with a given accuracy. The analysis was made more difficult by the number of generated eigenvalue combinations, which may be hundreds of thousands or even hundreds of millions even in a rough approximation. The identification of the distribution of the Red indicator was unsuccessful. High-performance computers would be needed for further examinations.
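Two of the identities stated above lend themselves to a quick numerical check: the claim of Thesis 1 that the Red indicator equals the quadratic mean of the off-diagonal correlation coefficients, and the statement that the harmonic mean of the eigenvalues times the arithmetic mean of the VIF_j values equals one. The sketch below is an illustration in Python with NumPy, not the author's computation (the dissertation reports using SPSS and Excel). The function names, the simulated data and the random seed are assumptions. Red is computed as the relative dispersion of the eigenvalues divided by sqrt(m - 1), and the VIF_j values are taken, following the standard definition, as the diagonal elements of the inverse correlation matrix.

```python
import numpy as np

def red_from_eigenvalues(R):
    """Red as the normalized relative dispersion of the eigenvalues of R."""
    m = R.shape[0]
    lam = np.linalg.eigvalsh(R)
    return np.sqrt(np.sum((lam - 1.0) ** 2) / (m * (m - 1)))

def red_from_correlations(R):
    """Red as the quadratic mean of the off-diagonal elements of R."""
    m = R.shape[0]
    off_diagonal = R[~np.eye(m, dtype=bool)]   # the m*(m-1) elements r_ij, i != j
    return np.sqrt(np.mean(off_diagonal ** 2))

def vif_values(R):
    """VIF_j as the diagonal of the inverse correlation matrix (standard definition)."""
    return np.diag(np.linalg.inv(R))

# Simulated explanatory variables (assumed data, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 3] += 0.8 * X[:, 0]                       # induce some collinearity
R = np.corrcoef(X, rowvar=False)

lam = np.linalg.eigvalsh(R)
harmonic_mean_lam = len(lam) / np.sum(1.0 / lam)
mean_vif = vif_values(R).mean()

print(red_from_eigenvalues(R), red_from_correlations(R))
print(harmonic_mean_lam * mean_vif)
```

The two Red values agree because the sum of squared eigenvalues of R equals the sum of all squared elements of R, and the product in the last line is one because the trace of the inverse of R equals the sum of the reciprocal eigenvalues.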

Thesis 5: The KMO index used in factor analysis can be expressed on the basis of the Red indicator.

I proposed an application possibility for the Red indicator. The KMO index used in factor analysis can be expressed on the basis of the Red indicator. Based on this I established that the mean covariance of the partial correlation coefficients cannot be smaller than the mean covariance of the correlation coefficients.

Thesis 6: The GINI coefficient of the eigenvalues of the correlation matrix of the explanatory variables is an indicator of multicollinearity similar in conception to the definition of the Red indicator.

I determined another possible indicator of multicollinearity based on a train of thought similar to that of the Red indicator. This indicator is the GINI coefficient of the eigenvalues. I identified an easily manageable way of calculating the indicator. I examined the behaviour of the indicator in the case of three explanatory variables over the range of possible eigenvalues. The behaviour of the indicator necessitates further detailed studies.

II. 1. Future research directions

As a conclusion to my dissertation, I summarize my planned future research directions in the order corresponding to the structure of the dissertation.

1. The possibility to decrease the negative consequences of multicollinearity is a very important practical problem. Therefore
   a. on the one hand, I would like to examine whether some optimal estimate can be made for the biasing parameter used in ridge regression on the basis of the value of the Red indicator;
   b. on the other hand, I would like to prepare a variable selection procedure based on the value of the Red indicator by also defining the indicator partially, for each explanatory variable, as the mean covariance of the given explanatory variable with all the other explanatory variables.

2. I would like to continue to examine the extension of multicollinearity, that is, how the covariance of a group consisting of two or an arbitrary number of explanatory variables can be measured, and what negative consequences the phenomenon has.
3. Later on I would like to reveal further properties of the elliptical model, both in the case of three explanatory variables and in higher dimensions.
4. I am planning to examine in more detail the relationship of the Red indicator, and of the partially defined Red indicator, with the inflation of the variances of the estimated regression parameters.
5. Determining the theoretical distribution and the empirical distribution of the Red indicator may pose an immense task in the future.
6. I would like to prepare a statistical test concerning the hypothetical value of the Red indicator.
7. I would like to extend the range of application of the Red indicator both in theoretical methods and in economic studies.

III. Publications and conference lectures

Reviewed scientific publications

[1] SZONDI I. KOVÁCS P. IDOVIKA B. [2002]: A családok helyzete Szeged város lakótelepein (The Situation of Families in the Housing Estates of the City of Szeged), ACTA JURIDICA ET POLITICA, Tomus LXII. Fasc. 18., Szeged, 30 pp.
[2] FÜLÖP V. SZONDI I. KOVÁCS P. [2003]: Lakáscélú állami támogatások és egyéb, a lakáshoz jutást segítő ellátási formák (State Housing Supports and Other Forms of Housing Provisions), PUBLICATIONES DOKTORANDUM JURIDICORUM, Tomus II. Fasc. 6., Szeged, 30 pp.
[3] KOVÁCS P. SZONDI I. [2003]: Úton az információs társadalom felé (On the Way to the Informational Society), ACTA JURIDICA ET POLITICA, Tomus LXIII. Fasc. 13., Szeged, 20 pp.
[4] KOVÁCS P. PETRES T. TÓTH L. [2004]: Adatállományok redundanciájának mérése (Measurement of the Redundancy of Databases), Statisztikai Szemle, Budapest, Vol. 82, No. 6-7, pp. 595-604.

[5] GYÉMÁNT R., PETRES T., KOVÁCS P. [2005]: A Szandzsák, az egyedülálló vallási régió (Sanjak, the Unique Region of Religion), Területi statisztika, Vol. 8 (45), No. 3, pp. 278-287.
[6] KOVÁCS P., PETRES T., TÓTH L. [2005]: A new measure of multicollinearity in linear regression models, International Statistical Review (ISR), Volume 73, Number 3, Voorburg, The Netherlands, pp. 405-412.
[7] KOVÁCS P., PETRES T., TÓTH L. [2006]: Válogatott fejezetek Statisztikából, Többváltozós statisztikai módszerek (Selected Chapters in Statistics, Multivariate Statistical Methods), JATEPress, Szeged, 167 pp.
[8] KOVÁCS P. [2008]: A multikollinearitás vizsgálata lineáris regressziós modellekben (Examination of Multicollinearity in Linear Regression Models), Statisztikai Szemle, Budapest, Vol. 86, No. 1, pp. 38-67.
[9] LUKOVICS M., KOVÁCS P. [2008]: Eljárás a területi versenyképesség mérésére (Procedure for the Measurement of Regional Competitiveness), Területi statisztika, KSH, Vol. 11 (48), 20 pp. (in press).
[10] VILMÁNYI M., KOVÁCS P. [2008]: Egyetemi-ipar együttműködések teljesítménye és lehetséges vizsgálati módszere (Performance and Possible Examination Method of University-Industrial Co-operations), Kérdőjelek a régiók gazdasági fejlődésében (eds. LENGYEL I., LUKOVICS M.), JATEPress, Szeged, 25 pp. (in press).
[11] KOVÁCS P. [2008]: Az információs társadalom szerinti területi egyenlőtlenségek mérése (Measurement of Regional Inequalities According to the Informational Society), Kérdőjelek a régiók gazdasági fejlődésében (eds. LENGYEL I., LUKOVICS M.), JATEPress, Szeged, 11 pp. (in press).
Educational materials, lecture notes
[1] KOVÁCS P., PETRES T. [2004]: Statisztika Feladatgyűjtemény (közgazdász hallgatók számára) [Collection of Exercises in Statistics (for students of economics)], SZTE GTK, 120 pp.
[2] KOVÁCS P., PETRES T. [2004]: Statisztika Feladatgyűjtemény (Collection of Exercises in Statistics), Dunaújvárosi Főiskola, Dunaújváros, 284 pp.

[3] KOVÁCS P., PETRES T. [2004]: Statisztika Képletgyűjtemény (Collection of Formulae in Statistics), Dunaújvárosi Főiskola, Dunaújváros, 50 pp.
[4] KATONA T., KOVÁCS P., PETRES T. [2006]: Általános statisztika, tankönyv (General Statistics, Coursebook), JATEPress, Szeged, 225 pp.
[5] KOVÁCS P., PETRES T. [2006]: Általános Statisztika Feladatgyűjtemény (joghallgatók részére) [Collection of Exercises in General Statistics (for students of law)], JATEPress, Szeged, 2005, 132 pp.
[6] KOVÁCS P. [2006]: Általános statisztikai alapismeretek, EU távoktatás elektronikus jegyzet (Fundamentals in General Statistics, electronic lecture notes for EU distance learning), 80 pp.
[7] KOVÁCS P., PETRES T. [2007]: Tanulási útmutató a főiskolák és egyetemek Általános statisztika című tantárgyához (Learning Guide for the Subject of General Statistics in Universities and Colleges), Dunaújvárosi Főiskola, Dunaújváros, 190 pp.
[8] KOVÁCS P., PETRES T. [2008]: Szoftverek alkalmazása az üzleti életben: statisztikai programok (Application of Software in Business Life: Statistical Programs), Dunaújvárosi Főiskola, Dunaújváros, 83 pp.
[9] KOVÁCS P., PETRES T. [2008]: Tanulási útmutató a Szoftverek alkalmazása az üzleti életben: statisztikai programok tantárgyához (Learning Guide for the Subject of Application of Software in Business Life: Statistical Programs), Dunaújvárosi Főiskola, Dunaújváros, 78 pp.
Foreign language conference publications
[1] KOVÁCS P., SZONDI I. [2006]: E-europe- E-Hungary, Ungarn auf der Schwelle in die EU, A Pólay Elemér Alapítvány Könyvtára, series editor: Balogh Elemér, Szeged, pp. 29-48.
[2] KOVÁCS P., PETRES T. [2006]: A New Measure of Multicollinearity in Linear Regression Models, International Conference Applied Statistics (2006, Ribno, Slovenia), Program and Abstract, Statistical Society of Slovenia, Ljubljana.
[3] KOVÁCS P., LUKOVICS M. [2006]: Classifying Hungarian sub-regions by their competitiveness, Globalization Impact on Regional and Urban Statistics, 25th SCORUS Conference on Regional and Urban Statistics Research, Wroclaw, Poland, http://www.scorus2006.ae.wroc.pl, 12 pp.
[4] KOVÁCS P., PETRES T. [2007]: Measure of Multicollinearity with a New, Original Indicator (PETRES Red) in Linear Regression Models, International Conference on Mathematics & Statistics, Athens Institute for Education Research, Athens (in press).
Hungarian language conference publications
[1] KOVÁCS P., LAMPERTNÉ A. I., PETRES T. [2005]: A multikollinearitás mérése lineáris regressziós modellekben (Measurement of Multicollinearity in Linear Regression Models), A Dunaújvárosi Főiskola Közleményei XXVI/II., Dunaújváros, pp. 355-365.
[2] KOVÁCS P. [2005]: Statisztikai mintákat generáló algoritmusok (Algorithms Generating Statistical Samples), A Dunaújvárosi Főiskola Közleményei XXVI/II., Dunaújváros, pp. 347-354.
[3] KOVÁCS P. [2005]: Az informatika alkalmazása a közgazdasági képzésben (Application of Informatics in Economist Education), CD supplement of the Informatika a felsőoktatásban 2005 conference, Debreceni Egyetem Informatikai Kar, Debrecen, 6 pp.
[4] KOVÁCS P. [2005]: Az informatika oktatása és lehetőségei a jogászképzésben (Teaching and Possibilities of Informatics in Law Education), CD supplement of the Informatika a felsőoktatásban 2005 conference, Debreceni Egyetem Informatikai Kar, Debrecen, 6 pp.
[5] KOVÁCS P., PETRES T. [2006]: A PETRES-féle Red-mutató eloszlásának vizsgálata (Examination of the Distribution of PETRES Red), A Dunaújvárosi Főiskola Közleményei XXVII/II., Dunaújváros, 2006, pp. 521-530.
[6] KOVÁCS P., PETRES T., LUKOVICS M. [2006]: A PETRES-féle Red-mutató alkalmazásának lehetőségei (Application Possibilities of PETRES Red), A Dunaújvárosi Főiskola Közleményei XXVIII., Dunaújváros, pp. 304-316.
Other studies

[1] KOVÁCS P. [2006]: A statisztika oktatásának és oktatásmódszertanának reformálása a saját gyakorlatomban (Reform of the Teaching and Teaching Methodology of Statistics in My Own Practice), A felsőoktatás szerkezeti és tartalmi fejlesztése tárgyú Humánerőforrás-fejlesztési Operatív Program (HEFOP 3.3.) Partnerközpontú önértékelési modell megalkotása és továbbképzések a felsőoktatási intézmények humánerőforrásainak fejlesztéséért tanulmányainak CD gyűjteménye, Dunaújváros, 52 pp.
Foreign language conference lectures
[1] KOVÁCS P., SZONDI I.: eeurope, ehungary, Társadalmi és gazdasági kihívások az Eu-csatlakozás küszöbén (Social and Economic Challenges upon the Imminent Accession to the European Union), English-language international conference lecture, Szeged, 12 June 2004.
[2] KOVÁCS P., LUKOVICS M.: Classifying Hungarian sub-regions by their competitiveness, Globalization Impact on Regional and Urban Statistics, 25th SCORUS Conference on Regional and Urban Statistics Research, Wroclaw, Poland, 30 August - 1 September 2006.
[3] KOVÁCS P., PETRES T.: A new measure of multicollinearity in linear regression models, Applied Statistics 2006 International Conference, Ribno (Bled), Slovenia, 17-20 September 2006.
[4] KOVÁCS P., PETRES T.: Measure of Multicollinearity with a New, Original Indicator (PETRES Red) in Linear Regression Models, International Conference on Mathematics & Statistics, ATINER, 11 June 2007.
Hungarian language conference lectures
[1] KOVÁCS P., LAMPERTNÉ A. I., PETRES T.: A multikollinearitás mérése lineáris regressziós modellekben (Measurement of Multicollinearity in Linear Regression Models), DUF Közgazdasági szimpózium, Dunaújváros, November 2004.
[2] KOVÁCS P.: Statisztikai mintákat generáló algoritmusok (Algorithms Generating Statistical Samples), DUF Informatikai szimpózium, Dunaújváros, November 2004.

[3] KOVÁCS P.: Az informatika alkalmazása a közgazdasági képzésben (Application of Informatics in Economist Education), Informatika a felsőoktatásban konferencia 2005, Section B, Debreceni Egyetem Informatikai Kar, Debrecen, 24-26 August 2005.
[4] KOVÁCS P.: Az informatika oktatása és lehetőségei a jogászképzésben (Teaching and Possibilities of Informatics in Law Education), Informatika a felsőoktatásban konferencia 2005, Section F, Debreceni Egyetem Informatikai Kar, Debrecen, 24-26 August 2005.
[5] KOVÁCS P., PETRES T.: A PETRES-féle Red-mutató eloszlásának vizsgálata (Examination of the Distribution of PETRES Red), Magyar Tudomány Hete a Dunaújvárosi Főiskolán Közgazdasági és menedzsment Konferencia, Dunaújváros, 22 November 2005.
[6] KOVÁCS P., PETRES T., LUKOVICS M.: A PETRES-féle Red-mutató alkalmazásának lehetőségei (Application Possibilities of PETRES Red), Magyar Tudomány Hete a Dunaújvárosi Főiskolán Közgazdasági és menedzsment Konferencia, Dunaújváros, 16 November 2006.
[7] KOVÁCS P., PETRES T.: A PETRES-féle Red-mutató ismertetése (Presentation of PETRES Red), VI. Természet-, Műszaki- és Gazdaságtudományok Alkalmazása Nemzetközi Konferencia, Szombathely, 18 May 2007.
[8] VILMÁNYI M., KOVÁCS P.: Egyetemi-ipar együttműködések teljesítménye és lehetséges vizsgálati módszere (Performance and Possible Examination Method of University-Industrial Co-operations), "Kérdőjelek a régiók gazdasági fejlődésében" Konferencia, Szeged, 12 November 2007.
[9] KOVÁCS P.: Az információs társadalom szerinti területi egyenlőtlenségek mérése (Measurement of Regional Inequalities According to the Informational Society), "Kérdőjelek a régiók gazdasági fejlődésében" Konferencia, Szeged, 13 November 2007.