After decades of study and significant data collection of time-varying swash on sandy beaches, there is no single deterministic prediction scheme for wave runup that eliminates prediction error – even bespoke, locally tuned predictors present scatter when compared to observations. Scatter in runup prediction is meaningful and can be used to create probabilistic predictions of runup for a given wave climate and beach slope. This contribution demonstrates this using a data-driven Gaussian process predictor; a probabilistic machine-learning technique. The runup predictor is developed using 1 year of hourly wave runup data (8328 observations) collected by a fixed lidar at Narrabeen Beach, Sydney, Australia. The Gaussian process predictor accurately predicts hourly wave runup elevation when tested on unseen data with a root-mean-squared error of 0.18 m and bias of 0.02 m. The uncertainty estimates output from the probabilistic GP predictor are then used practically in a deterministic numerical model of coastal dune erosion, which relies on a parameterization of wave runup, to generate ensemble predictions. When applied to a dataset of dune erosion caused by a storm event that impacted Narrabeen Beach in 2011, the ensemble approach reproduced ∼85 % of the observed variability in dune erosion along the 3.5 km beach and provided clear uncertainty estimates around these predictions. This work demonstrates how data-driven methods can be used with traditional deterministic models to develop ensemble predictions that provide more information and greater forecasting skill when compared to a single model using a deterministic parameterization – an idea that could be applied more generally to other numerical models of geomorphic systems.