Analysis of the most recent modelling techniques for big data. Whether it is stock data for individual companies or economic data used for macroeconomic modeling. This book presents the econometric foundations and applications of multidimensional panels, including modern methods of big data analysis. Big Data in Dynamic Predictive Econometric Modeling (Penn Arts). High-dimensional sparse framework: the framework, two examples. In this chapter we discuss conceptually high-dimensional sparse econometric models as well as estimation of these models using L1-penalization and post-L1-penalization methods. High dimensionality brings challenges as well as new insights into the advancement of econometric theory. Standard time-series dynamic econometric modeling covers VAR estimation, forecasting, and understanding, but new tools are required for big-data environments. Parameter estimates of these models without corrective measures may be inconsistent.
Estimating and understanding high-dimensional dynamic stochastic econometric models for volatility, derivatives, and more. The HDS regression model has a large number of regressors p, possibly much larger than the sample size n, but only a relatively small number s of them are relevant. This first part of the course introduces students to contemporary methods of microeconometric. Statistical significance in big data (Matthew Harding, Dec 28, 20): an interesting problem when analyzing big data is whether one should report the statistical significance of the estimated coefficients at the 1% level. High-Dimensional Sparse Econometric Models: An Introduction, Springer Lecture Notes, 2009, with A. Together with the recent developments in information technology that permit the collection of high-dimensional data, this special issue will focus on econometric model selection theories and applications concerning the econometric analysis of high-dimensional data. High-dimensional sparse models: we have a large number of parameters p, potentially larger than the sample size n. The idea is that a low-dimensional submodel captures very accurately all essential features of the full-dimensional model. Oracle Inequalities and Inference in High-Dimensional VAR Models. Conventional econometric models and techniques often work for many economic and financial data sets, but there are issues unique to big datasets that may require new tools. This article is about estimation and inference methods for high-dimensional sparse (HDS) regression models in econometrics.
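The L1-penalization and post-L1-penalization idea mentioned above can be sketched in a few lines. The following is a minimal illustration only, assuming a simulated design with p > n and s = 2 relevant regressors; the function names (`lasso_cd`, `post_lasso`, `simulate_sparse`), the penalty level `lam = 0.5`, and the simulated coefficients are hypothetical choices made for this sketch and do not come from any of the cited works. The lasso is solved by plain coordinate descent, and the post-lasso step refits ordinary least squares on the selected regressors.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b


def ols(X, y):
    """Least squares via the normal equations and Gauss-Jordan elimination."""
    n, k = len(X), len(X[0])
    A = [[sum(X[i][a] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[c][k] / A[c][c] for c in range(k)]


def post_lasso(X, y, lam):
    """Select regressors with the lasso, then refit OLS on the selected set."""
    b = lasso_cd(X, y, lam)
    support = [j for j, bj in enumerate(b) if bj != 0.0]
    refit = [0.0] * len(b)
    if support:
        Xs = [[row[j] for j in support] for row in X]
        for j, c in zip(support, ols(Xs, y)):
            refit[j] = c
    return b, support, refit


def simulate_sparse(n, p, seed=0):
    """Hypothetical sparse design: only the first two coefficients are non-zero."""
    rng = random.Random(seed)
    beta = [3.0, -2.0] + [0.0] * (p - 2)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [sum(xj * bj for xj, bj in zip(row, beta)) + rng.gauss(0, 0.5) for row in X]
    return X, y, beta


if __name__ == "__main__":
    X, y, beta = simulate_sparse(n=50, p=80)
    b, support, refit = post_lasso(X, y, lam=0.5)
    print("selected regressors:", support)
    print("lasso vs post-lasso on regressor 0:", b[0], refit[0])
```

The lasso estimate is shrunk toward zero by the penalty, so the refit on the selected regressors is one common way to reduce that shrinkage bias. In practice the penalty level would be chosen by theory or cross-validation rather than fixed by hand as it is here.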
Lopez (2019), "Monitoring Banking System Connectedness with Big Data," Journal of Econometrics, vol, pages. Dealing with high dimensionality in large data sets (QuantUniversity). Estimation of regression functions via penalization and selection.
Estimation methods for linear/nonparametric regression. High-Dimensional Sparse Econometric Models, 2010, Advances in. The event, hosted by the Wang Yanan Institute for Studies in Economics (WISE) at Xiamen University, focused on recent developments in econometric theory with applications. Estimation and inference on treatment effects (TE) in a general model; conclusion. Econometrics of big data. One simple example concerns the estimation of an average treatment effect in a high-dimensional regression model, where the econometrician has hundreds of. Estimating treatment effects with high-dimensional data.
Econometric models based on observational data are often endogenous due to measurement error, autocorrelated errors, simultaneity and omitted variables, nonrandom sampling, self-selection, etc. The R package BigVAR allows for the simultaneous estimation of high-dimensional time series by applying structured penalties to the conventional vector autoregression (VAR) and vector autoregression with exogenous variables (VARX) frameworks. Large k is effectively high dimensional because endogenizing the regressors in a large-k univariate. For instance, the availability of big datasets facilitates the applicability of nonlinear models and estimation methods, which normally require large sample sizes. Focusing on linear and nonparametric regression frameworks, we discuss various econometric examples, present basic theoretical results, and illustrate the concepts. This work is devoted to statistical methods for the analysis of economic data with a large number of variables. Estimation of regression functions via penalization and selection methods. Big data are characterized by high dimensionality and large sample size. The course introduces key concepts and tools demanded in the business environment. High-dimensional sparse econometric models (HDSM): models; motivating examples for linear/nonparametric regression.
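The point that endogenous regressors make uncorrected estimates inconsistent, and that a multistage estimator can repair this, can be illustrated with a small simulation. This is a sketch under stated assumptions: the data-generating process, the single instrument `z`, the true slope of 1.0, and all function names are hypothetical and chosen only for illustration. It compares the plain OLS slope with a two-stage least squares (2SLS) slope when an unobserved confounder `u` drives both the regressor and the outcome.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance (divisor n)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def ols_slope(x, y):
    """Slope of a simple regression of y on x with an intercept."""
    return cov(x, y) / cov(x, x)


def two_stage_least_squares(z, x, y):
    """2SLS with one regressor x and one instrument z."""
    # Stage 1: project the endogenous regressor on the instrument.
    pi = ols_slope(z, x)
    mz, mx = mean(z), mean(x)
    x_hat = [mx + pi * (zi - mz) for zi in z]
    # Stage 2: regress the outcome on the fitted values.
    return ols_slope(x_hat, y)


def simulate_endogenous(n, beta=1.0, seed=0):
    """Hypothetical design: u is unobserved and enters both x and y."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)          # instrument: moves x, unrelated to u
        u = rng.gauss(0, 1)           # omitted variable causing endogeneity
        xi = zi + u + rng.gauss(0, 1)
        yi = beta * xi + 2.0 * u + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


if __name__ == "__main__":
    z, x, y = simulate_endogenous(5000)
    print("OLS slope :", ols_slope(x, y))
    print("2SLS slope:", two_stage_least_squares(z, x, y))
```

In this design the OLS slope converges to 1 + 2/3 rather than 1, because the regressor is correlated with the omitted variable, while the 2SLS slope reduces to cov(z, y) / cov(z, x) and converges to the true value.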
High Dimensional Problems in Econometrics (ScienceDirect). Focusing on linear and nonparametric regression frameworks, we discuss various econometric examples, present basic theoretical results, and illustrate the concepts and methods with Monte Carlo simulations. Oracle inequalities for high-dimensional vector autoregressions. You may be interested in what I had to say on this topic back in 2011. Our methods can be utilized in many forecasting applications that make use of time-dependent data such as. Researchers and policymakers should thus pay close attention to recent developments in machine learning. It helps us quantify new trends and exploit new dimensions, giving timely answers on the impact of different events. Click through to find the program, copies of papers and slides, a participant list, and a few more photos.
High-dimensional data in economics and their robust analysis. Next month, the Cemmap center at University College London is organizing a very exciting workshop on high-dimensional econometric models. It is expected that all students will have taken intermediate level courses covering. Journal of the American Statistical Association, forthcoming. Analyzing a large panel of economic and financial data is. Econ 590: Big Data and Machine Learning in Econometrics, Spring. Ultrahigh-dimensional modeling is a more common task than before due to the emergence of ultrahigh-dimensional data sets in many fields such as economics, finance, genomics and health studies. Macroeconomic nowcasting and forecasting with big data. The following list of potential topics is provided to stimulate ideas. Econometric Analysis of Large Factor Models, Jushan Bai and Peng Wang, August 2015. Abstract: Large factor models use a few latent factors to characterize the comovement of economic variables in a high-dimensional data set. Fan, Jianqing, Wenyan Gong, and Ziwei Zhu (2019), "Generalized High-Dimensional Trace Regression via Nuclear Norm Regularization," Journal of Econometrics, vol, pages. Regularization to assist with variable selection in high-dimensional trade.
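The idea that a few latent factors can summarize the comovement of many series can be sketched with principal components. The example below is illustrative only, assuming a simulated panel with a single factor; the panel dimensions, the loading range, and the function names are hypothetical. It estimates the factor as the leading principal component of the demeaned panel using power iteration, which is one standard way to extract a leading component and is not claimed to be the estimator of any particular paper listed here.

```python
import math
import random


def simulate_panel(T, N, seed=0):
    """Hypothetical one-factor panel: x_it = lambda_i * f_t + e_it."""
    rng = random.Random(seed)
    f = [rng.gauss(0, 1) for _ in range(T)]
    loadings = [rng.uniform(0.5, 1.5) for _ in range(N)]
    X = [[loadings[i] * f[t] + rng.gauss(0, 1) for i in range(N)] for t in range(T)]
    return X, f


def first_factor(X, n_iter=100):
    """Leading principal component of the demeaned T x N panel, by power iteration."""
    T, N = len(X), len(X[0])
    means = [sum(X[t][i] for t in range(T)) / T for i in range(N)]
    Z = [[X[t][i] - means[i] for i in range(N)] for t in range(T)]
    v = [1.0] * N  # starting direction for the loading vector
    for _ in range(n_iter):
        # One multiplication by Z'Z, done as w = Z v followed by v = Z' w.
        w = [sum(Z[t][i] * v[i] for i in range(N)) for t in range(T)]
        v = [sum(Z[t][i] * w[t] for t in range(T)) for i in range(N)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    factor = [sum(Z[t][i] * v[i] for i in range(N)) for t in range(T)]
    return factor, v


def correlation(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


if __name__ == "__main__":
    X, f_true = simulate_panel(T=100, N=30)
    f_hat, v = first_factor(X)
    print("correlation with the true factor:", correlation(f_hat, f_true))
```

The estimated factor is identified only up to sign and scale, so the comparison with the simulated factor uses the correlation rather than the raw values.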
I recently had the opportunity to attend a conference held in honor of the great econometrician Dr. Appendix K gathers auxiliary results on algebra of covering entropies. Sparse high-dimensional regression: lasso estimation, application, motivation, trouble with large dimension, goals, important balance.
The Econometrics of Multidimensional Panels: Theory and. Many financial data sets are characterized by a large number of dimensions. Jun 26, 2011: In this chapter we discuss conceptually high-dimensional sparse econometric models as well as estimation of these models using L1-penalization and post-L1-penalization methods. In large data sets, however, machine learning methods shine. This reference gives a helicopter tour of various methods. Big Data in Dynamic Predictive Econometric Modeling, University of Pennsylvania. Endogenous econometric models and multistage estimation. May 22, 2017, Big data in econometric modeling: here's a speakers' photo from last week's Penn conference, Big Data in Dynamic Predictive Econometric Modeling. Existing variable selection methods can be computationally intensive and may not perform well; the conditions required for those methods are very.
Without a clear prior understanding of the underlying data. To understand problems related to the analysis of big data and the application of high-dimensional models, and to get acquainted with the relevant tools coming from statistical learning, machine learning and econometrics. As n gets very large we have "high-dimensional" data. High-dimensional sparse models arise in situations. Prakasa Rao, CR Rao Advanced Institute of Mathematics, Statistics and Computer Science (AIMSCS), University of Hyderabad Campus, Gachibowli, Hyderabad 500046. Econometric Methods for Cross Section and Panel Data, Fall 2018. How to use the glmnet and lars packages in R to implement lasso-type estimators. Prediction with a large number of covariates (big p): Varian, Hal R.
The book contains an introduction to and a summary of the actively developing field of statistical learning with sparse models. Big data lecture 2: high-dimensional regression with the. Over the last two decades or so, the use of panel data has become standard in many areas of economic analysis. Most of the large-data statistical and econometric literature attempts to reduce the data dimension by penalising the model for complexity. New analytic approaches are needed to make the most of big data in economics. Students will learn how to explore, visualize, and analyze high-dimensional datasets, build predictive models, and estimate causal effects.
Statistics for High-Dimensional Data: Methods, Theory and Applications. A growing and successful new branch of econometric literature asks how unbiased estimates of key structural parameters such as average treatment effects can be obtained in big data problems. Supplement to Program Evaluation and Causal Inference with High-Dimensional Data: this supplement contains 11 appendices with additional results and some omitted proofs. Outline for Econometric Theory of Big Data, Part I. Appendices F-J include additional results for Sections 2-7, respectively. Statistical significance in big data (Big Data Econometrics). L1-penalized quantile regression in high-dimensional sparse models, arXiv 2009, Annals of Statistics 2011, with A. More generally we refer to situations involving large n or k, or both, as high dimensional. Some results from applications to large macroeconomic data sets. Examples of techniques include an advanced overview of linear and logistic regression.
The best new econometric research on big data will be presented. What it actually is, however, appears to differ from field to field, and even among practitioners within fields. Program Evaluation and Causal Inference with High-Dimensional. Keywords: econometrics, high-dimensional data, dimensionality reduction, linear regression. Editorial: Big Data in Dynamic Predictive Econometric Modeling. Uniformly valid inference in high-dimensional models when the number of variables is larger than the number of parameters.