Regression prediction is most useful when the covariates are. Refer to that chapter for in-depth coverage of multiple regression analysis. However, data depth has not been studied much in the context of multivariate regression. Analyzing data with robust multivariate methods and. Robust estimates based on projection pursuit or fast MCD estimators, efficient tools for residuals, and outlier detection with multiresponse data. This course will consider methods for making sense of data of this kind, with an emphasis on practical techniques.
Many of the most used estimators in statistics are semiparametric. It is an ideal resource for researchers, practitioners, and graduate students in statistics, engineering, computer science, and physical and social sciences. Robust Multivariate Analysis, Computational Geometry and Applications (DIMACS Series in Discrete Mathematics and Theoretical Computer Science). This concept is very important because it leads to a natural center-outward ordering of sample points in multivariate data sets.
Robust likelihood-based analysis of multivariate data with missing values, Roderick Little and Hyonggin An, University of Michigan. Abstract. Wilcoxon-Mann-Whitney-type test for infinite-dimensional data. Robustness of the regression estimate depends critically on the. The workshop brought together researchers from two different communities. The concept of data depth has many applications in the multivariate analysis field. In this paper, we propose robust depth-based statistical tools for the analysis of microarray data. Robust scaling and modified scaling are also applied in mvt and cdcilmvt. If the data were all independent columns, then the data would have no multivariate structure and we could just do univariate statistics on each variable column in turn. Regression analysis seeks to find the relationship between one or more independent variables and a dependent variable. The goal of robust statistics is to develop methods that are robust against the possibility that one or several.
We can compute covariances to evaluate the dependencies. We introduce a robust method for multivariate regression based on robust estimation of the joint location. With state-of-the-art robust methods, for example based. Influence functions for a general class of depth-based generalized quantile functions.
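To make the remark about computing covariances concrete, here is a minimal sketch of the classical sample covariance matrix in plain Python. The function name `covariance_matrix` and the toy data are illustrative assumptions, not taken from any of the sources quoted here. This is the non-robust estimator, so a single outlying row can distort it.

```python
from statistics import mean

def covariance_matrix(rows):
    """Classical sample covariance matrix (divisor n - 1).

    rows: list of equal-length numeric tuples, one tuple per observation.
    """
    n = len(rows)
    p = len(rows[0])
    means = [mean(row[j] for row in rows) for j in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(p):
            cov[j][k] = sum(
                (row[j] - means[j]) * (row[k] - means[k]) for row in rows
            ) / (n - 1)
    return cov

# Hypothetical toy data: the second column is roughly twice the first.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
cov = covariance_matrix(data)
print(cov)
```

A large positive off-diagonal entry indicates that the two columns vary together, which is the kind of dependence that univariate summaries of each column would miss.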
In order to provide a training opportunity that could compensate for this, we collaborated on an introductory, intensive workshop in multivariate analysis of ecological data, generously supported and hosted several times by the BBVA Foundation in Madrid, Spain. He (2003), On the limiting distributions of multivariate depth-based rank sum statistics and related tests. The concept of depth in statistics, Maria Raquel Neto. Abstract. The main idea unifying these two research areas turned out to be the notion of data depth, which is an important notion both in statistics and in the study of efficiency of algorithms used in computational geometry. An overview of the recently developed methods for multivariate data analysis, based on the minimum covariance determinant and least trimmed squares estimators for location, scatter and regression. According to the classical Tukey-Huber contamination model (THCM), a small fraction of rows can be contaminated and. Journal of Statistical Computation and Simulation, Volume 87, 2017, Issue 2. A quality index based on data depth and multivariate rank tests. Depth-based classification for functional data. Multivariate analogues of the univariate median have been successfully introduced based on data depth. Robust multivariate mixture regression models with incomplete data. The text is suitable for a first course in multivariate statistical analysis or a first course in robust statistics.
A family of kurtosis orderings for multivariate distributions. The book is a collection of some of the research presented at the workshop of the same name held in May 2003 at Rutgers University. For multivariate location and dispersion (MLD), the classical estimator is the. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. Most depth functions are robust and affine invariant making them. A PowerPoint presentation describing the code developed in the department for analysis based on the notion of data depth, presented in a DIMACS workshop on data depth, May 2003. For graduate and upper-level undergraduate marketing research courses. Multivariate analysis of process data using robust. The workshop was held from May 14 to 16, 2003, at Rutgers University in New Jersey, and it was sponsored by DIMACS with support from the National Science Foundation. Robust statistics aims to stimulate the use of robust methods as a powerful tool to increase the reliability and accuracy of statistical modelling and data analysis. In the 21st century, statisticians and data analysts typically work with data sets containing a large number of observations and many variables. On June 1, 2008, Mia Hubert and others published Data depth.
Selected recent and current professional service activities. Robust methods for multivariate functional data analysis. MVA can be as simple as analysing two variables, right up to millions. Spherical data depth and a multivariate median. For applying methods developed for the Euclidean geometry, the data first have to be transformed to the Euclidean space (ilr). PCA is a well-known multivariate technique, and detailed descriptions of the subject are available elsewhere (Chiang et al.). Data analysis and classification with the zonoid depth. Home page of Robert Serfling, University of Texas at Dallas. We propose a robust nonparametric classifier, which relies on the intuitively simple notion of EPD. Multivariate analysis can be complicated by the desire to include physics-based analysis to calculate the effects of variables for a hierarchical system-of-systems.
From the description of the iris data, it is known that the data contain three types of flowers. Computing robust measure of multivariate location: data depth. Multivariate functional data, robust eigenfunction, robust functional regression. The robustness studies of the general data depth induced estimators. The sample space of compositional data is the simplex. Recently, robust versions of these methods have been proposed by Croux and Haesbroeck (2000), Croux and Dehon (2001) and Pison et al. On scale curves for nonparametric description of dispersion. Multivariate analysis adds a much-needed toolkit when. The notion of data depth has long been in use to obtain robust location and scale estimates in a multivariate setting.
The depth of an observation is a measure of its centrality with respect to a data set or a distribution. Robust methods for multivariate functional data analysis, by Pallavi Sawant. A dissertation submitted to the graduate faculty of Auburn University in partial fulfillment. A nonparametric multivariate multisample test based on data depth. An extended projection data depth and its applications to.
Statistical depth function, robust data analysis, multivariate methods, R. Application of multivariate-rank-based techniques in. Chapter 308, Robust Regression. Introduction. Multiple regression analysis is documented in Chapter 305, Multiple Regression, so that information will not be repeated here. Multivariate statistics: old-school mathematical and methodological introduction to multivariate statistical analytics, including linear models, principal components, covariance structures, classification. Multivariate trimmed means based on data depth. This work aims at exploring the concept of depth in statistics and at showing its usefulness in practice. This article investigates the possible use of our newly defined extended projection depth (abbreviated to EPD) in nonparametric discriminant analysis. The proposed methodologies are illustrated through simulation studies and real data analysis.
Multivariate nonparametric tests, Oja, Hannu and Randles, Ronald H. Robust estimation of multivariate location and shape. Multivariate data consist of measurements made on each of several variables on each observational unit. On some parametric, nonparametric and semiparametric discrimination rules. Typically, MVA is used to address situations where multiple measurements are made on each experimental unit and the relations among these measurements and their structures are important. Rather, the compositional nature is an inherent data property. Robust multivariate analysis, computational geometry and applications. Use of this algorithm architecture can enable reliable, fast, robust estimation of heavily contaminated multivariate data in high dimension, even with large quantities of data. Robust multivariate estimation based on statistical data.
This book is a collection of some of the research work presented in the workshop on data depth. The proposed test is shown to be robust with respect to outliers and to have better power than some competitors for certain distributions with heavy. The idea of depth for multivariate data provides a way of measuring how representative or central an observation is within a sample. RSIMPLS: robust partial least squares regression (30/06/2003). CDQ: censored depth quantiles (26/07/2007). Predict: regression results for new data based on RPCR or RSIMPLS analysis (09/06/2008). Classical multivariate analysis and regression. Robust multivariate analysis for problem images, Jeremy M. This graduate text is also useful for people who are familiar with the traditional multivariate topics, but want to know more about handling data sets with outliers. Robust depth-based tools for the analysis of gene expression data. Two groups are clearly identified by CA, SVD, L1 depth and robust SVD analysis, but kernel-based depth indicates that three groups may be present in the iris data.
Like their univariate counterpart, these multivariate medians in general are quite robust but. Robust multivariate analysis, computational geometry. Introduction. Modern economics crucially depends on advances in applications of recent developments in statistics. After developing such a robust method, we further analyze these data in Section 5. Often, studies that wish to use multivariate analysis are stalled by the dimensionality of the problem. Robust principal component analysis for functional data. Depth-based estimators have been proposed and studied by Donoho and Gasko. Chapter 1, Robust location and scatter estimators in multivariate. A computer program implementing the algorithm is available from the authors. Take, for instance, the theory and practice of portfolio optimization, or the practice of credit scoring. Robust methods reduce or remove the effect of outlying data. For over 30 years, Multivariate Data Analysis has provided readers with the information they need to understand and apply multivariate data analysis. Robust statistical methods cannot "repair" an incorrect geometrical representation of the data.
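As a small univariate illustration of how robust methods reduce the effect of outlying data, the sketch below contrasts the mean with the median and flags outliers using the median absolute deviation (MAD). The function name `mad_outliers`, the cutoff of 3 and the readings are assumptions chosen for illustration; the constant 1.4826 is the usual factor that makes the MAD consistent with the standard deviation under normality.

```python
from statistics import mean, median

def mad_outliers(values, threshold=3.0):
    """Return the values whose robust z-score exceeds the threshold.

    The robust z-score is |v - median| / (1.4826 * MAD).
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    scale = 1.4826 * mad
    if scale == 0:
        # More than half of the values coincide; no scale can be estimated.
        return []
    return [v for v in values if abs(v - center) / scale > threshold]

# Hypothetical readings with one gross error (55.0).
readings = [9.8, 10.1, 10.0, 9.9, 10.2, 55.0]
print(mean(readings))          # pulled far away from the bulk by the outlier
print(median(readings))        # stays with the bulk of the data
print(mad_outliers(readings))  # flags only the gross error
```

The mean moves to 17.5 while the median stays near 10, which is the behaviour the multivariate medians and depth-based estimators mentioned above aim to reproduce in higher dimensions.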
Asymptotics of generalized depth-based spread processes and applications. In robust statistics, robust regression is a form of regression analysis designed to overcome some limitations of traditional parametric and nonparametric methods. Principal component analysis, canonical correlation analysis and factor analysis (Johnson and Wichern 1998) are three different methods for analyzing multivariate data. Title: Statistical depth functions for multivariate analysis. Multivariate Data Analysis For Dummies, CAMO Software. The data depths of a set of multivariate observations translate to a center-outward ordering of the data. Robust multivariate estimation based on statistical data depth filters. A new concept of quantiles for directional data and the angular Mahalanobis depth, Ley. The offered techniques may be successfully used in cases of lack of our knowledge on.
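The center-outward ordering induced by data depth can be sketched with Tukey's halfspace depth in two dimensions. The code below is an approximation under stated assumptions: it scans a fixed grid of directions (the hypothetical parameter `n_directions`) instead of computing the exact minimum, so the value it returns is an upper bound on the true halfspace depth. The sample points are made up for illustration.

```python
import math

def halfspace_depth(x, sample, n_directions=360):
    """Approximate Tukey halfspace depth of the 2-D point x within sample.

    For each direction u on a grid, count the sample points lying in the
    closed halfplane through x with normal u, and keep the smallest count.
    """
    best = len(sample)
    for k in range(n_directions):
        angle = 2.0 * math.pi * k / n_directions
        u0, u1 = math.cos(angle), math.sin(angle)
        count = sum(
            1 for p in sample
            if (p[0] - x[0]) * u0 + (p[1] - x[1]) * u1 >= 0.0
        )
        best = min(best, count)
    return best / len(sample)

# Hypothetical sample: a square around the origin plus one far outlier.
sample = [(0.0, 0.0), (1.0, 1.0), (1.0, -1.0),
          (-1.0, 1.0), (-1.0, -1.0), (10.0, 10.0)]
depths = {p: halfspace_depth(p, sample) for p in sample}

# Center-outward ordering: deepest (most central) point first.
ranking = sorted(sample, key=lambda p: depths[p], reverse=True)
print(ranking)
```

The deepest point plays the role of a multivariate median, and points with depth 1/n sit on the outside of the data cloud, which is where outliers end up.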
Multivariate statistics means we are interested in how the columns covary. Outliers may hamper proper classical multivariate analysis, and lead to incorrect conclusions. The EPD-based classifier assigns an observation to the population with respect to which it has the maximum EPD. On the limiting distributions of multivariate depth-based rank sum statistics and related tests, Zuo, Yijun and He, Xuming, The Annals of Statistics, 2006. Multivariate Analysis of Ecological Data. Exposure to statistical modelling. Multivariate analysis (MVA) is based on the principles of multivariate statistics, which involves observation and analysis of more than one statistical outcome variable at a time. Robust multivariate analysis, computational geometry and applications, by Liu, R. Linford, Department of Chemistry and Biochemistry, Brigham Young University, Provo, UT 84602. Robust methods, multivariate curve resolution (MCR). High-breakdown robust multivariate methods.
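The maximum-depth rule behind the EPD-based classifier can be sketched as follows. The extended projection depth itself is not defined in this text, so the sketch swaps in the spatial (L1) depth as a stand-in; only the "assign to the population with maximum depth" rule is taken from the passage above. The function names and the two training groups are hypothetical.

```python
import math

def spatial_depth(x, sample):
    """Spatial (L1) depth: 1 minus the norm of the average unit vector
    pointing from x to the sample points. Used here as a stand-in for EPD."""
    dim = len(x)
    total = [0.0] * dim
    for point in sample:
        diff = [a - b for a, b in zip(point, x)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0:  # points equal to x contribute a zero vector
            for j in range(dim):
                total[j] += diff[j] / norm
    return 1.0 - math.sqrt(sum(t * t for t in total)) / len(sample)

def max_depth_classify(x, groups):
    """Assign x to the group with respect to which it is deepest."""
    return max(groups, key=lambda label: spatial_depth(x, groups[label]))

# Hypothetical training samples for two populations.
groups = {
    "a": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "b": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)],
}
print(max_depth_classify((0.5, 0.5), groups))
print(max_depth_classify((5.4, 5.6), groups))
```

Any other depth function with the same call signature could be passed through the same rule, which is why depth-based classifiers are described as nonparametric: no distributional model is fitted to either population.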