FOI data · 5 May 2026 · 4 min read

What FOI data reveals about how UK universities actually mark

After filing a hundred-plus Freedom of Information requests, we've seen a pattern emerge: UK marking is more variable, less standardised, and more knowable than universities admit.

Max Beech · Founder

UK universities like to give the impression that grading is something close to a science. Each module has learning outcomes. Each piece of coursework is marked against a rubric. The whole thing is moderated by a second marker and then again by an external. By the time a number lands on your transcript, it has supposedly survived a small bureaucratic inquisition.

What the FOI data shows is something rather more interesting.

We've been filing Freedom of Information requests with UK universities for module-level grade distributions for the last 18 months. Most universities comply — eventually — and disclose at least a partial dataset. Once you sit a hundred of these side by side and squint, four patterns become unavoidable.

1. Grade ceilings are real and consistent

The most striking thing in the data is that some modules simply don't produce many firsts. Year after year, cohort after cohort, a particular module will run a 2:1 average with maybe 10–15% in the first-class band — and another module on the same course, taught by the same kind of staff, will produce 35–45% firsts.

This is not random. It's structural. Some modules are designed (consciously or not) around assessment formats that compress marks into the middle of the distribution. Others have a high ceiling baked in from the start.

If you've been told "the marking is the same across the course", the data does not bear this out. Module choice is the dominant variable.
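To see how the ceiling shows up, here's a minimal sketch of the comparison we run. The module codes and band counts below are invented for illustration; real disclosures arrive as banded counts per module per year.

from collections import defaultdict

# (module, year) -> counts per degree band; values are illustrative only
disclosures = {
    ("ECON201", 2022): {"first": 11, "2:1": 48, "2:2": 29, "third": 7},
    ("ECON201", 2023): {"first": 13, "2:1": 51, "2:2": 27, "third": 5},
    ("ECON305", 2022): {"first": 38, "2:1": 40, "2:2": 15, "third": 4},
    ("ECON305", 2023): {"first": 42, "2:1": 37, "2:2": 14, "third": 3},
}

first_rates = defaultdict(list)
for (module, year), bands in sorted(disclosures.items()):
    total = sum(bands.values())
    first_rates[module].append(bands["first"] / total)

for module, rates in first_rates.items():
    lo, hi = min(rates), max(rates)
    print(f"{module}: first-class share {lo:.0%}-{hi:.0%} across years")

When the low end of one module's range sits well below the high end of another's, year after year, you're looking at a structural ceiling rather than a weak cohort.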

2. There's a "year-of-cohort" effect that nobody talks about

Mark distributions for the same module shift year on year, sometimes substantially. Sometimes you can trace it: a lecturer changes; a coursework brief is reworked; a global pandemic happens.

But often there's no obvious cause. A module averages 62 for four years, then 68, then back to 63. The cohort wasn't markedly different. The assessment was unchanged. Something has shifted — possibly second-marker calibration, possibly external-examiner mood, possibly nothing more than statistical noise around an unstable mean.

The lesson is to treat any single-year figure with appropriate suspicion. Look at year ranges. Three to five years of data tells you more than the most recent year — particularly for small cohorts.
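A rough sketch of that single-year check, with invented module means: treat a year as suspect when it sits well outside the multi-year median. The 3-mark tolerance is arbitrary, not a calibrated threshold.

from statistics import median

module_means = {
    "HIST210": [62.1, 61.8, 62.4, 68.0, 63.2],  # one outlier year
    "HIST230": [64.0, 64.5, 63.8, 64.2, 64.1],  # stable
}

for module, means in module_means.items():
    mid = median(means)
    outliers = [m for m in means if abs(m - mid) > 3.0]  # arbitrary tolerance
    spread = max(means) - min(means)
    print(f"{module}: median {mid:.1f}, spread {spread:.1f}, "
          f"{len(outliers)} year(s) outside tolerance")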

3. Coursework drift is more visible than exam drift

Exam-heavy modules tend to have stable distributions year-on-year. Coursework-heavy modules drift — usually upward — over time, sometimes by 5–8 percentage points across half a decade, with no apparent change in the syllabus.

The reason, as far as we can tell, is that coursework rubrics get refined through use. Each year markers see what the top end looks like and quietly reset their expectations. Exams, anchored by a question paper whose standard is much harder to shift, don't have the same dynamic.

This has practical implications: a module's historical coursework marks predict next year's coursework marks less reliably than its historical exam marks predict next year's exam marks.
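Here's a minimal sketch of how you'd quantify that drift, assuming you've extracted a per-year mean mark for each module (the figures below are invented). A least-squares slope in marks per year separates the drifters from the stable modules.

def slope(years, means):
    n = len(years)
    my, mm = sum(years) / n, sum(means) / n
    num = sum((y - my) * (m - mm) for y, m in zip(years, means))
    den = sum((y - my) ** 2 for y in years)
    return num / den

years = [2019, 2020, 2021, 2022, 2023]
coursework = [61.0, 62.2, 63.1, 64.8, 66.4]  # invented: drifting upward
exam       = [60.5, 61.0, 60.2, 60.8, 60.6]  # invented: stable

print(f"coursework drift: {slope(years, coursework):+.2f} marks/year")
print(f"exam drift:       {slope(years, exam):+.2f} marks/year")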

4. The FOI suppression rules are a feature

Universities are required, under FOIA s40(2), to suppress data where individuals could be re-identified. Most universities apply this at cohort sizes of 5–7 students. Some apply it more conservatively at 10. A few apply rounding to nearest 5 even above the threshold.

That's actually fine. We apply our own suppression at < 10 across the board, regardless of what the FOI reply discloses. The point of the dataset isn't to expose individuals; it's to surface structural patterns. Aggregated banded signals do that job perfectly well, and they don't undermine the privacy commitments universities make to their cohorts.
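In practice the blanket rule is a few lines. A sketch, assuming disclosures arrive as per-band counts:

SUPPRESSION_THRESHOLD = 10

def suppress(band_counts: dict[str, int]) -> dict[str, str]:
    """Return publishable values: counts at or above threshold, '<10' below."""
    return {
        band: str(n) if n >= SUPPRESSION_THRESHOLD else "<10"
        for band, n in band_counts.items()
    }

print(suppress({"first": 23, "2:1": 41, "2:2": 8, "third": 3}))
# {'first': '23', '2:1': '41', '2:2': '<10', 'third': '<10'}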

What the dataset is and isn't

It's a multi-year, multi-university record of module-level grade distributions. It is not (and shouldn't be) a leaderboard. It can't tell you whether a module is "good" — there's no entry in the data for "did the lecturer change my life?" or "did the coursework brief teach me how to think?". It can tell you what outcomes historical cohorts actually got, which is a different question, and a useful one when you're staring at a portal trying to make a decision.

We're publishing aggregated signals on public pages — bands, trend arrows, year ranges — and exact distributions to logged-in users only. The reasons are partly defensive (the moat is the dataset; we'd rather it didn't end up in someone else's training corpus) and partly principled (we've seen the cases where re-identification arguments could plausibly be made, and we'd rather not test them).
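Reducing an exact distribution to those public signals is deliberately lossy. A hypothetical sketch (the cut-offs and tolerance below are illustrative, not our actual ones):

def trend_arrow(means: list[float], tolerance: float = 1.0) -> str:
    """Compare the latest year with the mean of earlier years."""
    baseline = sum(means[:-1]) / len(means[:-1])
    delta = means[-1] - baseline
    if delta > tolerance:
        return "up"
    if delta < -tolerance:
        return "down"
    return "flat"

def band(mean: float) -> str:
    if mean >= 70: return "first-leaning"
    if mean >= 60: return "2:1-leaning"
    return "mixed"

means = [62.0, 63.1, 62.5, 66.2]  # invented multi-year module means
print(band(means[-1]), trend_arrow(means))  # 2:1-leaning up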

What's coming next

We're scaling up the FOI request flow and aiming to hit 40+ universities with multi-year coverage by the end of the academic year. The press piece — "What we learned filing 100 FOI requests to UK universities" — will land alongside that milestone, with the underlying methodology fully documented.

If you want to be first in line when each new university lands, the waitlist is here.