What Is Systematic Bias?
Even after careful preprocessing, a GHI model may exhibit systematic bias, for example by consistently underestimating measured values (negative rMBE). This kind of offset is structural: the model is not wrong at random, it is wrong in the same direction every time. A simple linear-regression bias correction of the form Y = a·X + b can significantly improve predictions without rebuilding the model from scratch.
Detecting Bias with rmbe
Compute rmbe on the training set to detect and quantify systematic bias before applying any correction.
- Negative rMBE → the model systematically underestimates measured GHI.
- Positive rMBE → the model systematically overestimates measured GHI.
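As a minimal sketch of the sign convention above, the snippet below computes a relative mean bias error on a few synthetic values. The helper name `relative_mbe`, the definition used (mean of model minus measured, divided by the mean of the measurements, in percent), and the sample numbers are all assumptions for illustration; the workshop's own `rmbe` function may differ in signature or normalization.

```python
import numpy as np

def relative_mbe(model, measured):
    """Relative mean bias error in percent (assumed definition):
    100 * mean(model - measured) / mean(measured)."""
    model = np.asarray(model, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return 100.0 * np.mean(model - measured) / np.mean(measured)

# Synthetic illustration values in W/m^2 (not workshop data)
ghi_measured = np.array([200.0, 400.0, 600.0, 800.0])
ghi_model = 0.9 * ghi_measured  # a model that reads 10% low everywhere

bias = relative_mbe(ghi_model, ghi_measured)
print(f"rMBE = {bias:.2f} %")  # negative value: the model underestimates
```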
Fitting the Linear Correction
The correction model Y = a·X + b is fit directly with NumPy. X is the model output (GHImod) and Y is the measured GHI. The coefficients are estimated on the training set only. Applying the correction to the same data used to fit it measures only in-sample performance; the real test is the held-out test set.
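The fit described above might look like the following sketch. The training arrays are synthetic and the names `ghi_mod_train`, `ghi_meas_train`, and `correct` are hypothetical; only `numpy.polyfit` itself comes from the text.

```python
import numpy as np

# Synthetic training data (assumption): measured GHI and a model that is
# biased low by a scale factor and an offset, plus a little noise.
rng = np.random.default_rng(0)
ghi_meas_train = rng.uniform(50.0, 1000.0, 200)
ghi_mod_train = (ghi_meas_train - 20.0) / 1.1 + rng.normal(0.0, 5.0, 200)

# Least-squares fit of Y = a*X + b with X = model output, Y = measured GHI.
# For deg=1, polyfit returns the coefficients as [slope, intercept].
a, b = np.polyfit(ghi_mod_train, ghi_meas_train, deg=1)

def correct(ghi_mod):
    """Apply the fitted linear correction to model output."""
    return a * np.asarray(ghi_mod, dtype=float) + b

print(f"a = {a:.3f}, b = {b:.2f}")
```

The same `a` and `b` are then reused unchanged on the test set; refitting them there would defeat the purpose of holding that data out.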
The workshop uses NumPy for the bias correction fit. The scikit-learn library is not imported in the workshop notebooks.
numpy.polyfit (or equivalently scipy.stats.linregress) is sufficient for this single-predictor linear correction.
Evaluating the Improvement
Recompute metrics on both sets to measure the impact of the correction. The expected result is a bias close to zero and a reduction in RMSD on the test set.
Linear bias correction is a post-processing step, not a substitute for a well-calibrated model. It corrects a constant scale factor and a constant additive offset, but it cannot fix structural model errors such as systematic misrepresentation of cloud effects or incorrect aerosol assumptions. If those deeper problems exist, the corrected model will still underperform on data that differs significantly from the training period.
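The before-and-after comparison on both sets can be sketched as below. Everything here is illustrative: the data are synthetic, the 300/100 split is arbitrary, and the helpers `relative_mbe` and `relative_rmsd` use assumed definitions (normalized by the mean of the measurements, in percent) that may not match the workshop's metric functions.

```python
import numpy as np

def relative_mbe(model, measured):
    """Assumed definition: 100 * mean(model - measured) / mean(measured)."""
    return 100.0 * np.mean(model - measured) / np.mean(measured)

def relative_rmsd(model, measured):
    """Assumed definition: 100 * sqrt(mean((model - measured)^2)) / mean(measured)."""
    return 100.0 * np.sqrt(np.mean((model - measured) ** 2)) / np.mean(measured)

# Synthetic data (assumption): a model that underestimates measured GHI.
rng = np.random.default_rng(1)
ghi_meas = rng.uniform(50.0, 1000.0, 400)
ghi_mod = 0.9 * ghi_meas - 10.0 + rng.normal(0.0, 15.0, 400)

# Arbitrary split for illustration: first 300 points train, last 100 test.
splits = {"train": slice(0, 300), "test": slice(300, 400)}

# Fit on the training set only, then apply the same coefficients everywhere.
a, b = np.polyfit(ghi_mod[splits["train"]], ghi_meas[splits["train"]], deg=1)
ghi_corr = a * ghi_mod + b

results = {}
for name, idx in splits.items():
    results[name] = {
        "rmbe_before": relative_mbe(ghi_mod[idx], ghi_meas[idx]),
        "rmbe_after": relative_mbe(ghi_corr[idx], ghi_meas[idx]),
        "rrmsd_before": relative_rmsd(ghi_mod[idx], ghi_meas[idx]),
        "rrmsd_after": relative_rmsd(ghi_corr[idx], ghi_meas[idx]),
    }
    print(name, {k: round(v, 2) for k, v in results[name].items()})
```

On the training set the corrected bias is zero by construction, because a least-squares fit with an intercept leaves residuals that average to zero. The test-set numbers are the ones that indicate whether the correction generalizes.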