Expansive soils swell when wet and shrink when dry, causing ground instability. These volumetric changes can lead to structural damage, including foundation cracks, uneven floors, and compromised infrastructure. Addressing these issues requires proper soil evaluation and the implementation of stabilization techniques to ensure long-term safety and durability. Highly expansive, problematic soils are commonly stabilized with cement, bitumen, lime, and similar additives. This investigation predicts the unconfined compressive strength (UCS) of lime-treated soil using decision tree (DT), ensemble tree (ET), Gaussian process regression (GPR), support vector machine (SVM), and multilinear regression (MLR) models, and investigates the impact of dimensionality on these computational approaches. Model performance has been evaluated using the variance accounted for (VAF), correlation coefficient (R), mean absolute error (MAE), root mean square error (RMSE), and performance index (PI). The comparison reveals that model ET5 has predicted UCS with excellent performance in the testing (RMSE = 368.06 kPa, R = 0.9640, VAF = 91.60, PI = 1.8077) and validation (RMSE = 508.41 kPa, R = 0.9165, VAF = 83.89, PI = 1.6337) phases. Model ET5 has also achieved a better score (total = 90), area over the curve (testing = 8.98E-04, validation = 1.56E-03), computational cost (testing = 0.1772 s, validation = 0.1551 s), uncertainty rank (= 1), and overfitting (testing = 2.32, validation = 2.80), which identifies ET5 as the optimal performance model. The dimensionality analysis reveals that simple models such as MLR, SVM, GPR, and DT struggle with high-dimensional data (case 5), whereas the ET5 model achieves high performance and reliable, consistent prediction using consistency, compaction, and soil physical parameters. In addition, multicollinearity has been observed to affect the performance of the MLR, SVM, and DT models.
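As an illustration of the evaluation metrics named above, the following minimal sketch computes MAE, RMSE, R, and VAF from measured and predicted UCS values. It assumes the commonly used definitions (Pearson correlation for R, and VAF = (1 − var(error) / var(measured)) × 100); the abstract does not give the study's exact formulas, and PI is omitted because its definition is not stated. The function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, Pearson R, and VAF (%) for two equal-length sequences.

    Assumed definitions (not confirmed by the source):
      VAF = (1 - var(y_true - y_pred) / var(y_true)) * 100
      R   = Pearson correlation coefficient
    """
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    mean_e = sum(errors) / n

    # Sums of squared deviations; the 1/n factors cancel in R and VAF.
    ss_t = sum((t - mean_t) ** 2 for t in y_true)
    ss_p = sum((p - mean_p) ** 2 for p in y_pred)
    ss_e = sum((e - mean_e) ** 2 for e in errors)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))

    if ss_t == 0 or ss_p == 0:
        raise ValueError("R and VAF are undefined for constant inputs")

    r = cov / math.sqrt(ss_t * ss_p)
    vaf = (1.0 - ss_e / ss_t) * 100.0
    return {"MAE": mae, "RMSE": rmse, "R": r, "VAF": vaf}


# Hypothetical UCS values in kPa, for illustration only.
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(measured, predicted)
```

Under these definitions RMSE and MAE carry the units of UCS (kPa), while R and VAF are dimensionless, which is why the abstract reports only RMSE with a unit.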
In soil mechanics, liquefaction is the phenomenon in which saturated, cohesionless soils temporarily lose their strength and stiffness under cyclic loading, such as earthquake shaking. The present work identifies an optimal performance model for assessing liquefaction potential by comparing two baseline, thirty tree-based, thirty support vector classifier-based, and fifteen neural network-based models. For this purpose, one hundred and seventy cone penetration test results (liquefied and non-liquefied) have been compiled from the literature. Earthquake magnitude, vertical effective stress, mean grain size, cone tip resistance, and peak ground acceleration have been used as input parameters to predict the soil liquefaction potential for the first time. Training and testing performance has been measured using accuracy, area under the curve (AUC), precision, recall, and F1 score. The comparison of performance metrics reveals that the Runge-Kutta optimized extreme gradient boosting (RUN_XGB) model has assessed the liquefaction potential with an overall accuracy of 99%, AUC of 0.99, precision of 0.99, recall of 1, and F1 score of 1. Moreover, the RUN_XGB model has a true negative rate of 0.98, negative predictive value of 1, Matthews correlation coefficient of 0.98, and average classification accuracy of 0.99, all close to their ideal values, which demonstrates the robustness of the model. The RUN_XGB model has therefore been recognized as the optimal performance model for predicting the liquefaction potential. It has been noted that a low level of multicollinearity affects the prediction accuracy of models based on conventional soft computing techniques, i.e., logistic regression. This research will help researchers choose suitable hybrid algorithms and enhance the accuracy of seismic soil liquefaction potential models.
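The classification metrics listed above all derive from the four confusion-matrix counts. The sketch below shows the standard textbook definitions, treating "liquefied" as the positive class; that class convention, the function name `classification_metrics`, and the example counts are assumptions for illustration and are not taken from the study. AUC and average classification accuracy are omitted because they need predicted scores or a definition the abstract does not give.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute binary classification metrics from confusion-matrix counts.

    tp/fn: positive (assumed: liquefied) cases predicted correctly/incorrectly.
    tn/fp: negative (assumed: non-liquefied) cases predicted correctly/incorrectly.
    Ratios with a zero denominator are reported as 0.0 by convention here.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    total = tp + fp + tn + fn
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # true positive rate
    tnr = ratio(tn, tn + fp)             # true negative rate (specificity)
    npv = ratio(tn, tn + fn)             # negative predictive value
    f1 = ratio(2 * tp, 2 * tp + fp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)

    return {
        "accuracy": ratio(tp + tn, total),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "tnr": tnr,
        "npv": npv,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not the study's results).
example = classification_metrics(tp=40, fp=5, tn=45, fn=10)
```

A recall and negative predictive value of 1, as reported for RUN_XGB, both correspond to zero false negatives under these definitions, meaning no liquefied case was classified as non-liquefied.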