AI Performance Reviews: Risks, UK Legal Considerations and Practical Guardrails

Explore the risks, UK legal considerations, and practical guardrails for using AI in performance reviews.

Written By

Joshua
Reading time
» 6 minute read 🤓

AI evaluated my performance: what this Reddit post reveals about workplace AI

A Redditor describes a client running their interview transcripts and recordings through a leading large language model (LLM) to “evaluate” their performance and provide coaching. The AI’s feedback was detailed but missed interpersonal nuance — then got forwarded to the project sponsor. If that feels off to you, you’re not alone.

“The output… lacked the significant elements of human interactions and nuance.”

It’s a timely story for the UK. More organisations are experimenting with AI in HR, procurement and vendor management. Done badly, it risks unfair treatment, privacy violations and poor decisions. Done well, it can support better coaching and consistency — with strict safeguards.

Here’s what this case shows, the UK legal context, and practical guardrails you can adopt today.

What happened and why it matters

The consultant specialises in expert interview-based research. The client fed interviews into an LLM, asked it to evaluate performance, and circulated the AI’s verdict. Model, prompts and instructions: not disclosed.

The reaction is understandable. Human interviewing is high-context. It involves rapport, power dynamics, and strategic trade-offs that are difficult to judge from a transcript, and even harder from a summarised transcript. A generic LLM rubric can over-index on form (filler words, question length) while missing substance (relationship building, trust, and steering).

“Learn how to manage AI and don’t let AI manage you.”

LLMs are poor judges of conversations without careful design

LLMs are powerful pattern-matchers, not conversation psychologists. Several pitfalls show up when using them to evaluate people:

  • Instruction bias – they optimise for what you ask. Vague prompts produce confident, template-like judgements.
  • Missing context – transcript-only views miss tone, intent, and pre-existing relationships. Even recordings lose the “why” behind choices.
  • Hallucinations – models sometimes infer unfounded traits, especially when scoring soft skills.
  • Low reliability – small prompt changes can flip an assessment. Test-retest variance is common.
  • Mismatch with job outcomes – “neat questioning” is not the same as “right insights for a client problem”.

None of this means AI-generated feedback is useless. It just means it must be narrow, well-instrumented, and always reviewed by a human who understands the domain.

UK legal considerations: AI in performance reviews and vendor assessments

Whether you’re assessing employees or consultants, several UK laws and regulators are relevant:

Data protection and UK GDPR

  • Lawful basis and transparency – if you process personal data in AI tools, you need a lawful basis and clear privacy information. See the ICO’s AI guidance.
  • Automated decision-making – Article 22 of UK GDPR restricts solely automated decisions with legal or similarly significant effects. Pay, promotion, termination, and contract awards can fall in scope. Individuals have rights to get human review and to contest decisions. See the ICO’s overview of explaining AI decisions.
  • DPIA requirement – high-risk processing such as profiling workers typically needs a Data Protection Impact Assessment (DPIA).
  • Data minimisation – avoid feeding entire recordings where summaries will do; redact special category data (health, ethnicity, union membership) unless you have a valid condition.
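
Minimisation and redaction can start with a simple pre-processing pass before any transcript leaves your systems. The sketch below is illustrative only — the regex patterns are assumptions for common UK identifiers, and a real pipeline should use a dedicated PII detection or named-entity tool rather than hand-rolled patterns:

```python
import re

# Illustrative patterns only -- a production redaction pipeline should use
# a dedicated PII/NER tool; these regexes are assumptions for the sketch.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "UK_PHONE": re.compile(r"\b(?:\+44\s?|0)\d{4}\s?\d{6}\b"),
    "NI_NUMBER": re.compile(r"\b[A-Z]{2}\s?\d{2}\s?\d{2}\s?\d{2}\s?[A-D]\b"),
}

def redact(text: str) -> str:
    """Replace matches with bracketed placeholders before AI processing."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

transcript = "Call me on 07700 900123 or email jane.doe@example.co.uk."
print(redact(transcript))
# → Call me on [UK_PHONE REDACTED] or email [EMAIL REDACTED].
```

Pattern-based redaction will miss names and indirect identifiers, so treat it as one layer of minimisation, not a substitute for deciding whether the data should be processed at all.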

Employment and equality

  • Fairness and non-discrimination – the Equality Act 2010 prohibits discriminatory outcomes. If AI disadvantages protected groups, you have a problem, even if unintentional.
  • Worker relations – ACAS encourages consultation and transparency on AI use in the workplace. See ACAS guidance on AI at work.

Confidentiality and contractual duties

  • Client and supplier obligations – contracts may restrict sharing recordings/transcripts with third parties (including AI vendors). Check NDAs and data processing clauses.
  • International transfers – many AI tools process data overseas. Ensure appropriate safeguards and vendor due diligence.

If this scenario involved a UK worker or supplier, a “surprise” AI assessment shared internally could trigger multiple compliance gaps: no transparency, uncertain lawful basis, questionable fairness testing, and a lack of human-in-the-loop review.

Practical guardrails: how to use AI for reviews without burning trust

For organisations in the UK

  • Adopt a simple principle – AI-assisted feedback, human-owned decisions. No solely automated outcomes.
  • Be transparent – publish a short notice describing what data is used, for what purpose, which models, retention, and human oversight.
  • Run a DPIA – document risks, mitigations, and why AI is proportionate for the task.
  • Minimise and redact – strip names, demographics, health references; process only what’s necessary for the specific evaluation.
  • Use reliable rubrics – define clear, observable criteria linked to real outcomes. Avoid “vibes-based” ratings.
  • Test reliability and bias – check whether AI scores align with expert human ratings; monitor for demographic skews.
  • Keep prompts and versions – log prompts, model versions, and outputs for audit and explainability.
  • Choose vendors carefully – prefer enterprise offerings with DPA terms, regional processing options, and security attestations.
  • Offer challenge routes – allow individuals to see the inputs, understand the method, and contest errors.
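
The "keep prompts and versions" guardrail can be as simple as writing one structured record per AI-assisted evaluation. This is a hedged sketch, not a standard schema — the field names are my assumptions — but it shows how audit logging and data minimisation can coexist by hashing the inputs rather than storing them:

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(model: str, prompt: str, inputs: str,
                 output: str, reviewer: str) -> dict:
    """Build one auditable record per AI-assisted evaluation.

    Field names are illustrative assumptions, not a standard schema.
    """
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model,
        "prompt": prompt,
        # Hash the inputs instead of storing transcripts in the log
        # (data minimisation): the hash still proves what was processed.
        "input_sha256": hashlib.sha256(inputs.encode("utf-8")).hexdigest(),
        "output": output,
        "human_reviewer": reviewer,  # no solely automated outcomes
    }

record = audit_record("example-llm-2025-01", "Score against rubric v3",
                      "full transcript text...", "Meets expectations",
                      "j.smith")
print(json.dumps(record, indent=2))
```

A log like this makes challenge routes workable: when someone contests an assessment, you can show exactly which prompt, model version, and reviewer produced it.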

For individuals on the receiving end

  • Ask for the process – which model, prompts, data, and rubric were used? Was a human reviewer involved?
  • Request the inputs – your transcripts/recordings and the AI output that informed the assessment.
  • Query the legal basis – how was your personal data processed lawfully, and where is it stored?
  • Challenge specifics – refute incorrect claims with evidence (client outcomes, stakeholder feedback, meeting objectives).
  • Propose a fair method – suggest a joint review using a transparent rubric plus human observation of a live session.

Where AI coaching can help (with guardrails)

There are constructive, low-risk use cases:

  • Transcription and summarisation – speed up note-taking. Always verify summaries.
  • Self-coaching prompts – ask a model to identify missed follow-ups or to suggest neutral phrasing options.
  • Quantitative hygiene checks – track talk-time ratio, question length, and filler words as optional self-metrics, not performance scores.
  • Knowledge extraction – pull themes across many interviews to inform analysis, not to grade the interviewer.
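
The hygiene checks above don't even need an LLM. A minimal sketch, assuming a simple `SPEAKER: utterance` transcript format and an illustrative filler-word list (both assumptions — adapt to your own transcripts):

```python
import re
from collections import Counter

# Filler list and "SPEAKER: utterance" line format are assumptions
# for this sketch; adjust both to your transcript conventions.
FILLERS = {"um", "uh", "like", "basically"}

def hygiene_metrics(transcript: str, interviewer: str) -> dict:
    """Optional self-metrics -- not performance scores."""
    words = Counter()
    fillers = 0
    for line in transcript.strip().splitlines():
        speaker, _, utterance = line.partition(":")
        tokens = re.findall(r"[a-z']+", utterance.lower())
        words[speaker.strip()] += len(tokens)
        if speaker.strip() == interviewer:
            fillers += sum(1 for t in tokens if t in FILLERS)
    total = sum(words.values()) or 1
    return {
        "talk_time_ratio": round(words[interviewer] / total, 2),
        "filler_words": fillers,
    }

sample = """ALEX: Um, could you walk me through the launch?
EXPERT: Sure. We shipped in March after two pilot rounds."""
print(hygiene_metrics(sample, "ALEX"))
# → {'talk_time_ratio': 0.47, 'filler_words': 1}
```

Keeping metrics like these local and self-service avoids the Article 22 and fairness questions that arise the moment they feed into someone else's scoring of you.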

Keep anything evaluative strictly advisory and human-reviewed. If you’re automating data flows, be mindful of where recordings land. For example, if you connect LLMs to spreadsheets or dashboards, treat those as personal data stores and secure them accordingly. If you’re curious about safe automation patterns, I’ve covered a practical workflow here: using ChatGPT with Google Sheets.

Takeaway for UK readers

It’s legitimate to experiment with AI to improve feedback quality and efficiency. It’s risky to outsource judgement on complex human interactions to a generic model. The UK regulatory bar is clear: be transparent, minimise data, avoid solely automated decisions with significant effects, and design for fairness and human oversight.

If you’re leading teams or managing suppliers, establish policy now rather than discovering the boundaries by accident. And if you’re on the receiving end, you’re entitled to ask sensible questions and to challenge thin, AI-only verdicts — especially when reputation, pay, or contracts are at stake.

Source post: “My work performance was just evaluated by AI”.

Last Updated

October 19, 2025

