Partial Least Squares Modeling and Its Multi-collinearity Analysis
| Content Provider | Semantic Scholar |
|---|---|
| Author | Mao, Lixia |
| Copyright Year | 2018 |
| Abstract | In applied problems, multiple regression analysis often runs into multicollinearity, which makes the correlations among the explanatory variables severe. The problem is ubiquitous: multicollinearity distorts the estimates of the parameter values and enlarges the model error, thereby undermining the stability of the model, so eliminating multicollinearity becomes a critical issue. This paper models by partial least squares (PLS) regression, verifies the theory of partial least squares, screens the original independent variables in the least squares regression model, and establishes a model to solve practical problems. Partial least squares is a multivariate statistical analysis method and one of the newer data analysis methods; its object of study is the linear regression of multiple dependent variables on multiple independent variables and the modeling of those variables. In particular, when multicollinearity occurs among the variables, the partial least squares modeling method can handle it, and partial least squares regression can also deal with the case in which the number of samples is smaller than the number of variables.
1. Multicollinearity processing. A frequently used approach is to remove unimportant collinear variables, selecting variables through multiple regression analysis or stepwise regression; in theory, however, these screening methods are designed for data without collinearity, and when the multicollinearity is complicated there is still no fully suitable way to handle it. Deleting some of the collinear variables increases the model error and destroys information carried by the system itself, so the risk of damage keeps growing; increasing the sample size can reduce the harm caused by multicollinearity to some extent, but is often infeasible because of time or cost constraints. When multicollinearity is present, one can instead form a combination of the collinear variables and construct a new variable to replace the old ones in the regression equation, or convert the equation into differential form, which reduces the multicollinearity to a large extent. Ridge regression, principal component regression and partial least squares regression are further options, and partial least squares regression handles multicollinearity best [1]. Under severe multicollinearity, ridge regression, principal component regression and partial least squares regression are more effective than ordinary regression models in modeling and extracting components. The regression models are assessed in different ways: principal component regression finds components correlated with the variables within the independent-variable system alone, whereas partial least squares finds the components most strongly related to the dependent variables. Judged by the root-mean-square error, partial least squares regression is therefore better than the other two regressions (a minimal comparison sketch is given after the table).
2. Partial least squares modeling. To solve the multiple-correlation problem effectively, principal component regression extracts the required components from the independent variables. However, it cannot guarantee that the extracted components have the strongest explanatory power, because principal component regression does not consider the dependent variables when extracting components; partial least squares regression therefore solves the component-extraction problem better than principal component regression. Suppose there are n independent variables {p1, p2, ..., pn} and m dependent variables {q1, q2, ..., qm}, and we study the relationship between them. With R sample points we set up the independent-variable data table p and the dependent-variable data table q. The partial least squares regression method extracts a component a1 from p and a component b1 from q, where a1 is a linear combination of p1, p2, ..., pn and b1 is a linear combination of q1, q2, ..., qm. According to the requirements of regression analysis, two points matter when extracting components: a1 and b1 should each carry as much of the information in their own data tables as possible, and the degree of association between a1 and b1 should be maximized. After the first components a1 and b1 are extracted, partial least squares regression regresses p and q on these components and checks the accuracy. If the regression equation reaches a reasonable precision, the calculation stops; if not, the residual information from the regressions on a1 and b1 is used to extract a second pair of components, and the cycle repeats until satisfactory accuracy is achieved. If in the end r components a1, a2, a3, ..., ar are extracted from p, a partial least squares regression of qk (k = 1, 2, ..., m) on a1, a2, a3, ..., ar is performed, finally giving the regression equation of qk with respect to the original variables p1, p2, p3, ..., pn.
3. Partial least squares regression calculation. First the data are normalized; the standardized independent-variable data form the matrix E0 = (E01, ..., E0p), with one row per sample point and one column per independent variable, and the dependent-variable data are standardized in the same way. The first component a1 is extracted from E0 as a linear combination with a unit-norm weight vector w1, ||w1|| = 1, and the dependent-variable component is extracted with a unit-norm weight vector as well. In partial least squares regression this component extraction is posed as an optimization problem (a sketch of the first extraction step is given after the table). |
| File Format | PDF, HTM/HTML |
| Alternate Webpage(s) | https://webofproceedings.org/proceedings_series/ECS/ICCSE%202018/ICCSE054.pdf |
| Journal | ICCSE 2018 (2018 International Conference on Computational Science and Engineering) |
| DOI | 10.25236/iccse.18.054 |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
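
Section 1 of the abstract compares ridge regression, principal component regression and partial least squares regression on collinear data by the size of the root-mean-square error. The sketch below is a minimal illustration of that comparison, not code from the paper: the synthetic collinear data, the scikit-learn estimators, and the choices of alpha=1.0 and 3 components are assumptions made for the example.

```python
# Minimal sketch (not from the paper): compare ridge regression, principal
# component regression (PCR) and partial least squares (PLS) regression on
# deliberately collinear synthetic data, judged by cross-validated RMSE.
# The data, alpha=1.0 and n_components=3 are illustrative assumptions.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_samples, n_features = 60, 10
latent = rng.normal(size=(n_samples, 3))           # 3 hidden factors drive all predictors,
X = latent @ rng.normal(size=(3, n_features))      # so the 10 predictors are highly collinear
X += 0.05 * rng.normal(size=(n_samples, n_features))
y = latent @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=n_samples)

models = {
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "pcr":   make_pipeline(StandardScaler(), PCA(n_components=3), LinearRegression()),
    "pls":   make_pipeline(StandardScaler(), PLSRegression(n_components=3)),
}
for name, model in models.items():
    rmse = -cross_val_score(model, X, y, cv=5,
                            scoring="neg_root_mean_squared_error").mean()
    print(f"{name}: cross-validated RMSE = {rmse:.3f}")
```

Running the script prints one cross-validated RMSE per method, so the ranking asserted in the abstract can be checked directly on any given data set.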
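
Sections 2 and 3 of the abstract describe standardizing the two data tables and then extracting a first pair of components, a1 from the independent variables and b1 from the dependent variables, with unit-norm weight vectors chosen so that the association between a1 and b1 is as large as possible, followed by regression on the components and extraction from the residuals. The sketch below is an assumption-laden illustration of that first step, not the paper's own derivation: the data tables P and Q are randomly generated, and the unit-norm weights are obtained from the leading singular vectors of the cross-covariance matrix E0.T @ F0, which maximizes the covariance between the extracted components.

```python
# Minimal sketch (not the paper's code): first PLS component extraction from
# standardized data tables, using invented example data. Names E0, F0, w1, a1,
# b1 loosely follow the abstract's notation; c1 is the dependent-side weight.
import numpy as np

rng = np.random.default_rng(1)
P = rng.normal(size=(30, 5))                                      # independent-variable table p
Q = P @ rng.normal(size=(5, 2)) + 0.1 * rng.normal(size=(30, 2))  # dependent-variable table q

# Step 1: normalize (center and scale) both data tables.
E0 = (P - P.mean(axis=0)) / P.std(axis=0, ddof=1)
F0 = (Q - Q.mean(axis=0)) / Q.std(axis=0, ddof=1)

# Step 2: unit-norm weight vectors w1 (for E0) and c1 (for F0) that maximize
# the covariance between the extracted components are the leading singular
# vectors of the cross-covariance matrix E0.T @ F0.
U, s, Vt = np.linalg.svd(E0.T @ F0)
w1, c1 = U[:, 0], Vt[0, :]

# Step 3: the first pair of components.
a1 = E0 @ w1    # component extracted from the independent variables
b1 = F0 @ c1    # component extracted from the dependent variables

# Step 4: regress E0 and F0 on a1 and keep the residual tables; the next
# component pair would be extracted from E1 and F1 in exactly the same way.
p_load = E0.T @ a1 / (a1 @ a1)
r_load = F0.T @ a1 / (a1 @ a1)
E1 = E0 - np.outer(a1, p_load)
F1 = F0 - np.outer(a1, r_load)

print("covariance of the first components:", a1 @ b1 / (len(a1) - 1))
```

Using the singular value decomposition here is equivalent to the usual iterative (NIPALS-style) computation of the first component; later components come from repeating the same calculation on the deflated residual tables E1 and F1 until the accuracy criterion described in the abstract is met.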