Hydrocolloids such as pectin, carrageenan, and alginate are widely employed in food, pharmaceutical, cosmetic, and hygiene applications due to their gelling, stabilizing, thickening, and texturizing functionalities. The industrial performance of these materials depends on strict control of physicochemical properties including viscosity, gel strength, pH, solids content, and structural stability. Ensuring batch-to-batch consistency requires extensive laboratory testing during production and formulation development, resulting in significant time, cost, and material consumption. This study presents a data-driven framework for predicting the physicochemical properties of finished hydrocolloid products using ingredient-level analytical data. Historical production and laboratory records are transformed into structured datasets suitable for machine learning modeling. Tree-based ensemble methods are evaluated under two complementary strategies: (i) family-level general models predicting multiple analyses simultaneously, and (ii) analysis-specific models combined with structural dimensionality reduction to address extreme feature dimensionality and heterogeneity. The results demonstrate that predictive modeling can achieve industrially acceptable error levels in several product families, significantly reducing reliance on experimental validation. The proposed framework establishes the feasibility of deploying machine learning systems in high-dimensional industrial manufacturing environments.