Inference is the process of using a trained machine learning model to make predictions on new, unseen data. In production, inference should be reliable, fast, and consistent, so that model outputs are trustworthy and reproducible: the same input, served by the same model version, should yield the same prediction.
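As a minimal sketch of what this means in code, the example below uses a hypothetical `LinearModel` class as a stand-in for a trained model. The class name, weights, and sample values are all assumptions made for illustration and do not come from any particular library. It shows inference as a function of fixed parameters applied to a new input, and checks that repeated calls on the same input give the same result.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class LinearModel:
    """Hypothetical stand-in for a trained model.

    The parameters are frozen, mirroring the idea that inference
    reads the learned weights but never updates them.
    """
    weights: Tuple[float, ...]
    bias: float

    def predict(self, features: Sequence[float]) -> float:
        # Reject malformed input early so failures are explicit.
        if len(features) != len(self.weights):
            raise ValueError(
                f"expected {len(self.weights)} features, got {len(features)}"
            )
        # Inference: apply the fixed parameters to the new input.
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias


# Illustrative parameters; a real model would load these from a training artifact.
model = LinearModel(weights=(0.5, -1.0, 2.0), bias=0.25)

# A new sample that was not part of training.
new_sample = [2.0, 1.0, 0.5]

first = model.predict(new_sample)
second = model.predict(new_sample)

# Consistency check: the same input must give the same output.
print(first, first == second)
```

Real models add sources of variation that this sketch leaves out, such as random sampling, dropout left enabled, or hardware-dependent floating-point behavior, so a production service would typically also pin the model version and disable or seed any randomness.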