VARIMAX rotation of PLS loadings

This post was kindly contributed by SAS Programming for Data Mining Applications - go there to comment and to read the full post.


Partial Least Squares (PLS) is one of several supervised dimension reduction techniques and has attracted attention in recent years. On the one hand, PLS generates a series of scores that maximize the linear correlation between the dependent and independent variables. On the other hand, the PLS loadings can be regarded as the counterpart of the loadings in factor analysis. Hence we can rotate the PLS loadings and thereby eliminate variables that contribute little to prediction.


%macro PLSRotate(Loading, TransMat, PatternOut, PatternShort, 
                 method=VARIMAX, threshold=0.25);
/* VARIMAX rotation of PLS loadings. Only variables having 
   large loadings after rotation will enter the final model. 

   Loading dataset contains XLoadings output from PROC PLS 
   and should have variable called NumberOfFactors
   TransMat is the generated Transformation matrix;
   PatternOut is the output Pattern after rotation;
   PatternShort is the output Pattern with selected variables
*/

%local covars;
proc sql noprint;
     select name into :covars separated by ' '
     from   sashelp.vcolumn
     where  libname="WORK" & memname=upcase("&Loading")
        &   upcase(name) NE "NUMBEROFFACTORS"
        &   type="num"
     ;
quit;
%put &covars;

data &Loading.(type=factor);
         set &Loading;
         _TYPE_='PATTERN';
         _NAME_=compress('factor'||_n_);
run;
ods select none;
ods output OrthRotFactPat=&PatternOut;
ods output OrthTrans=&TransMat; 
proc factor  data=&Loading   method=pattern  rotate=&method  simple; 
         var &covars;
run;
ods select all;

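data &PatternShort;
     set &PatternOut;
     array _f{*} factor:;
     _cntfac=0;
     do _j=1 to dim(_f);
        /* zero out loadings whose magnitude does not exceed the threshold */
        _f[_j]=_f[_j]*(abs(_f[_j])>&threshold);
        /* count surviving loadings of either sign, not only positive ones */
        _cntfac+(_f[_j] ne 0);
     end;
     /* keep a variable only if at least one loading survived */
     if _cntfac>0 then output;
     drop _cntfac _j;
run;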
%mend;
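
A minimal sketch of how the macro might be called. The dataset work.train, the response y, the predictors x1-x10 and the output name xload are hypothetical and used only for illustration. The XLoadings table is requested through ODS and needs the DETAILS option of PROC PLS. Note that the macro looks the loading dataset up in the WORK library and overwrites it with a TYPE=FACTOR copy.

```sas
/* Hypothetical input: work.train with response y and predictors x1-x10 */
ods output XLoadings=xload;
proc pls data=train method=pls nfac=3 details;
     model y = x1-x10;
run;

/* Rotate the loadings and keep variables with at least one
   rotated loading larger than 0.25 in absolute value */
%PLSRotate(xload, TransMat, PatternOut, PatternShort,
           method=VARIMAX, threshold=0.25);

proc print data=PatternShort noobs;
run;
```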

Here I try to replicate the case study in [1], which describes how to apply VARIMAX rotation to PLS loadings and discusses its properties. Even after various tweaks to the convergence criteria and singularity conditions, the PROC PLS output still differs slightly from the results reported in [1] for factors other than the leading one. Therefore, I will directly use the U=PS matrix given on p. 215.



data loading;
input factor1-factor3;
cards;
-0.9280  -0.0481  0.2750
0.0563  -0.8833  0.5306
-0.9296  -0.0450  0.2720
-0.7534  0.1705  -0.5945
0.5917  -0.0251  -0.6450
0.9082  0.3345    0.1118
-0.8086  0.4551  -0.3800
;
run;


proc transpose data=loading  out=loading2;
run;

data loading2(type=factor);
     retain _TYPE_ "PATTERN";
     set loading2;
run;


ods select none;
ods output OrthRotFactPat=OrthRotationOut;
ods output OrthTrans=OrthTrans; 
proc factor  data=Loading2   method=pattern  rotate=varimax  simple; 
         var col1-col7;
run;
ods select all;
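
Because the printed output is suppressed above, the rotated pattern and the transformation matrix are only available in the ODS output datasets. The following sketch prints them and then applies the same selection rule as the macro to the case-study result. The 0.25 cutoff is simply the macro default, not a value taken from [1], and the dataset name SelectedPattern is made up for illustration. As in the macro, the loading columns of OrthRotationOut are assumed to start with "factor".

```sas
proc print data=OrthTrans noobs;
run;

proc print data=OrthRotationOut noobs;
run;

/* Same rule as in the macro: zero out small loadings and keep a
   variable only if at least one loading survives */
data SelectedPattern;
     set OrthRotationOut;
     array _f{*} factor:;
     _cntfac=0;
     do _j=1 to dim(_f);
        if abs(_f[_j])<=0.25 then _f[_j]=0;
        else _cntfac+1;
     end;
     if _cntfac>0;
     drop _cntfac _j;
run;

proc print data=SelectedPattern noobs;
run;
```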

Reference:
[1] Huiwen Wang, Qiang Liu, Yongping Tu, “Interpretation of PLS Regression Models with VARIMAX Rotation”, Computational Statistics and Data Analysis, Vol. 48 (2005), pp. 207–219
