A semiparametric probit model for high-dimensional clustered data and its estimation procedure are proposed. The model gains flexibility by combining a nonparametric formulation of the effect of the predictors on the dichotomous response with a parametric specification of the inherent heterogeneity due to clustering. The predictive ability of the model is investigated with respect to dimensionality, misspecification, clustering, and the distribution of the response. Simulation studies illustrate the advantages of the proposed model over the ordinary probit model, even in low-dimensional cases. High predictive ability is observed in high-dimensional cases, especially when the distribution of the response categories is balanced. Results show that the cluster distribution and the functional form of the response variable do not affect the performance of the model. The predictive ability of the model under the proposed estimation procedure also increases as the number of clusters increases. Under misspecification, the predictive ability of the model is slightly lower, yet it remains better than that of the ordinary probit model.