Printer Friendly Version | Back

Optimal Variable Subset Selection Problem in Regression Analysis is NP-Complete

Year: 2019       Vol.: 68       No.: 1      

Authors: Paolo Victor T. Redondo


Combinatorial and optimization problems are classified into different complexity classes, e.g., whether an algorithm that efficiently solve the problem exists or a hypothesized solution to the problem can be quickly verified. The optimal selection of subset variables in regression analysis is shown to belong to a complexity class called NP-hard (Welch, 1982) in which solutions to the problems in the same class may not be easily (in terms of computing speed) proven optimal. Variable selection in regression analysis based on correlations is shown to be NP-hard, i.e., a complexity class of problems with easily verifiable solutions.

Keywords: optimal variable selection, regression analysis, np-completeness


Back to top