AI Performance Review Survey Questions: 2026 Template for Employees & Managers

By Jürgen Ulbrich

This AI performance review survey template systematically measures how employees and managers experience AI in performance reviews — from trust and fairness to data privacy and governance. It covers 84 Likert-scale statements, 6 overall ratings (0–10), and 12 open questions. Use it to spot transparency, quality, or compliance issues early, and to set clear thresholds for when to pause, retrain, or tighten guardrails.

Why this survey matters in 2026

AI-assisted performance reviews are spreading fast — with significant legal and cultural consequences. A 2025 Resume Builder survey of more than 1,300 managers found that 91% use AI to assess performance, but only 71% felt confident in its fairness for individual decisions. Without structured measurement, that confidence gap turns into a trust and legal liability problem.

Three regulatory layers create specific obligations for employers using AI in reviews:

  • GDPR Article 22: Prohibits decisions based solely on automated processing that produce legal or similarly significant effects. AI-generated rating suggestions accepted without genuine human review potentially trigger this protection. Employees have the right to human intervention, to express their view, and to contest the decision. Under EDPB guidance, a manager who rubber-stamps AI outputs without substantive review does not satisfy the requirement.
  • EU AI Act: Classifies AI systems used to evaluate employee performance as high-risk (Annex III). From August 2026, deployers (the employer, not the software vendor) must use the system according to the provider's instructions, assign competent human oversight, retain the automatically generated logs, and inform affected workers. Risk management and technical documentation are primarily provider obligations, and a fundamental rights impact assessment is mandatory only for certain deployers, such as public bodies, so check which duties apply to you. Note: a proposed legislative package (Digital Omnibus) may shift the high-risk deadline to December 2027 or August 2028, but the original August 2026 deadline applies until formally amended. Additionally, Art. 86 of the AI Act grants employees the right to request a meaningful explanation of any high-risk AI output that affects them.
  • Works councils and co-determination (EU/DACH): AI systems capable of monitoring employee performance can trigger co-determination rights under national labor law. In Germany, §87(1)(6) BetrVG gives the Betriebsrat (works council) mandatory co-determination over technical systems that can monitor employee behavior or output. Under Art. 26(7) of the AI Act, employers must also inform worker representatives and affected workers before putting a high-risk system into use.

This survey gives you a measurable, documentable baseline across all these dimensions — and concrete inputs for your works council and data protection officer.

How to use this template

Use a 5-point Likert scale for statements (1 = Strongly disagree, 5 = Strongly agree). Numbering is for analysis and follow-up: E = Employees, M = Managers, S = Shared. Select one of the four blueprints below and load only the relevant items. For a first run, 18–22 questions is plenty; keep the total under 8 minutes.

Survey questions: Employees

Employees (E1–E6) — Awareness & transparency

  • (E1) I understand when AI is used in my performance review process.
  • (E2) I know which parts of my review may include AI-generated text (drafts, summaries, suggested wording).
  • (E3) I was told what the AI can and cannot do in performance reviews.
  • (E4) I know who is accountable for the final review content — not the AI.
  • (E5) I was informed if AI influenced ratings, calibration input, or performance labels.
  • (E6) The company explained the reason for using AI in reviews in plain language.

Employees (E7–E12) — Quality & usefulness

  • (E7) AI-assisted feedback in my review was specific to my actual work.
  • (E8) The feedback included concrete examples or evidence, not just generic phrases.
  • (E9) The feedback was consistent with what I heard in 1:1s during the cycle.
  • (E10) The feedback clearly separated facts, interpretations, and expectations.
  • (E11) The feedback helped me understand priorities for the next 3–6 months.
  • (E12) The tone of the feedback felt respectful and professional.

Employees (E13–E18) — Fairness & bias perceptions

  • (E13) AI-assisted feedback made the review feel more fair than fully manual feedback.
  • (E14) I worry the AI may amplify bias (e.g., proximity bias, similarity bias, stereotypes).
  • (E15) The AI-assisted feedback accurately reflected my contributions — not just visible work.
  • (E16) I felt the same performance standard was applied to me as to comparable peers.
  • (E17) I worry AI could misread context (e.g., parental leave, part-time, project changes).
  • (E18) The review language avoided coded or ambiguous terms (e.g., "not assertive enough").

Employees (E19–E24) — Psychological safety & trust

  • (E19) I felt comfortable asking whether AI was used in my review.
  • (E20) I felt comfortable challenging AI-influenced wording during the review conversation.
  • (E21) My manager was open to correcting errors in the review content.
  • (E22) I trust that AI use did not reduce my chance to be heard as a person.
  • (E23) I believe my manager reviewed and owned the final feedback — not "copy-paste."
  • (E24) I know how to escalate concerns if AI-assisted feedback feels wrong or unfair.

Employees (E25–E30) — Data protection & consent

  • (E25) I understand what data sources may be used by AI in the review process.
  • (E26) I understand whether chat inputs, notes, or 360° comments can be used as AI inputs.
  • (E27) I trust that sensitive personal data is not used for AI prompts in reviews.
  • (E28) I understand at a high level where data is processed and stored (e.g., EU/EEA).
  • (E29) I believe AI use in reviews follows GDPR principles (data minimisation, purpose limitation).
  • (E30) I know the retention period for AI-related review artifacts (drafts, logs, summaries).

Employees (E31–E36) — Overall impact & preference

  • (E31) AI-assisted reviews improved the clarity of expectations for me.
  • (E32) AI-assisted feedback made the review feel more consistent across the company.
  • (E33) AI-assisted feedback made the review feel less personal for me.
  • (E34) I would prefer AI to be used only for drafting, not for rating suggestions.
  • (E35) I would prefer AI to be used only with clear human review checkpoints.
  • (E36) Overall, AI use improved my review experience this cycle.

Survey questions: Managers

Managers (M1–M6) — Onboarding & training

  • (M1) I received training on where AI can be used in reviews and where it cannot.
  • (M2) The training covered how to verify AI outputs with evidence (projects, outcomes, behaviors).
  • (M3) The training covered GDPR-safe prompting — what data must not be entered.
  • (M4) I understand how to explain AI use transparently to employees.
  • (M5) I know what to do when an employee challenges AI-influenced wording.
  • (M6) I feel prepared to use AI without weakening psychological safety in my team.

Managers (M7–M12) — Workflow & time impact

  • (M7) AI reduced the time I needed to prepare reviews.
  • (M8) AI helped me structure feedback faster (strengths, gaps, next steps).
  • (M9) AI improved my ability to summarise 360° feedback without missing key points.
  • (M10) AI increased my admin work due to extra checking and rewriting.
  • (M11) AI helped me keep feedback consistent across multiple direct reports.
  • (M12) AI support improved the quality of my review conversations.

Managers (M13–M18) — Quality of drafts & summaries

  • (M13) AI-generated drafts were accurate enough to be a good starting point.
  • (M14) The drafts included measurable outcomes or observable behaviors when prompted.
  • (M15) AI helped avoid vague feedback by pushing for specifics.
  • (M16) AI summaries captured context correctly — scope changes, constraints, dependencies.
  • (M17) AI outputs matched our internal rubric and competency language.
  • (M18) AI outputs avoided biased language without hiding performance issues.

Managers (M19–M24) — Judgment, oversight & accountability

  • (M19) I feel confident editing or rejecting AI suggestions.
  • (M20) I consistently verify AI outputs against evidence before sharing with employees.
  • (M21) I can explain the rationale for my final feedback without referring to AI.
  • (M22) AI never overrides my judgment on ratings or performance outcomes.
  • (M23) I understand the risk of over-relying on AI in sensitive people decisions.
  • (M24) I know how to document decisions in an audit-ready way.

Managers (M25–M30) — Governance & guardrails

  • (M25) The company has clear dos/don'ts for AI in performance reviews.
  • (M26) I know which data must never be used in prompts — health, union status, protected categories.
  • (M27) I know whether a works council agreement applies to this AI use case.
  • (M28) I know who to contact when tool output seems risky or wrong (HR/IT/Data Protection).
  • (M29) Our process includes clear human checkpoints before anything impacts an employee.
  • (M30) AI usage is logged in a way that supports transparency and accountability.

Managers (M31–M36) — Fairness, consistency & calibration support

  • (M31) AI helped me apply our performance standards more consistently.
  • (M32) AI increased the risk of "template feedback" that flattens differences between people.
  • (M33) AI made it easier to spot missing evidence before calibration discussions.
  • (M34) AI made it easier to avoid common review biases (recency, halo/horn, proximity).
  • (M35) I worry AI could introduce new bias through training data or wording patterns.
  • (M36) AI improved the quality of inputs I bring to calibration sessions.

Managers (M37–M42) — Overall confidence & willingness to continue

  • (M37) I trust the tool's outputs when used with careful human review.
  • (M38) I feel comfortable being transparent with employees about AI use.
  • (M39) I would use AI again for drafting feedback in the next cycle.
  • (M40) I would use AI again for summarising 360° feedback in the next cycle.
  • (M41) I would avoid using AI for rating suggestions unless governance improves.
  • (M42) Overall, AI made my review work more effective this cycle.

Survey questions: Shared (Employees + Managers)

  • (S1) AI use in reviews is communicated in a clear, consistent way across teams.
  • (S2) People affected by AI in reviews can give feedback without negative consequences.
  • (S3) The process makes it easy to correct mistakes in AI-assisted review content.
  • (S4) AI use in reviews aligns with our performance review rubric and expectations.
  • (S5) AI improves the quality of review conversations — not just the paperwork.
  • (S6) I trust the organisation's guardrails for AI in performance reviews.

Overall ratings (0–10) and open questions

Overall ratings (0–10)

  • (Employees) How much do you trust AI-assisted feedback in performance reviews? (0–10)
  • (Employees) How much did AI improve the quality of feedback you received this cycle? (0–10)
  • (Employees) How likely are you to recommend AI-assisted feedback to a colleague? (0–10)
  • (Managers) How confident are you using AI in reviews without harming fairness? (0–10)
  • (Managers) How much did AI improve your review preparation efficiency this cycle? (0–10)
  • (Managers) How likely are you to recommend the current AI review workflow to another manager? (0–10)

Open questions (12 total)

  • (Employees) Where did AI-assisted feedback feel most accurate and helpful?
  • (Employees) Where did AI-assisted feedback feel generic, wrong, or out of context?
  • (Employees) Which sentence or section would you rewrite to better reflect your work?
  • (Employees) What would make you feel safer challenging AI-influenced wording in a review conversation?
  • (Employees) What is your biggest concern about data use or privacy in AI-assisted reviews?
  • (Managers) In which part of the review workflow did AI save you the most time?
  • (Managers) Where did AI create extra work — rewrites, verification, back-and-forth?
  • (Managers) What guardrail would prevent your biggest AI risk in reviews?
  • (Managers) What training topic would most improve your AI use in performance feedback?
  • (Shared) What should we stop doing with AI in reviews, starting next cycle?
  • (Shared) What should we keep doing with AI in reviews because it clearly works?
  • (Shared) If you could change one rule about AI in reviews, what would it be?

Survey blueprints: which questions to use when

Pick one blueprint per cycle. Simple flow: select blueprint → load items → agree anonymity threshold → send 3–10 days after reviews close → publish results and actions within 21 days. If you already run a classic post-cycle survey, align wording to keep trends comparable even as AI changes the workflow. For tool selection guidance, see our enterprise performance management software guide.

| Blueprint | Audience | When | Items | Question mix | Decision output |
| --- | --- | --- | --- | --- | --- |
| A) Employee post-cycle (pilot) | Employees in AI-assisted reviews | 3–10 days after reviews | 18–22 | E1–E6, E7–E12, E13–E18, E19–E24, E25–E30 + 2 ratings + 3 open | Keep/adjust AI; fix transparency, safety, and privacy gaps |
| B) Manager post-cycle (pilot) | Managers using AI tools | 3–10 days after calibration | 18–22 | M1–M6, M7–M12, M13–M18, M19–M24, M25–M30 + 2 ratings + 3 open | Training plan; governance updates; workflow changes |
| C) Combined pulse (pilot) | Employees + Managers | Mid-pilot or after first cycle | 12–15 | S1–S6 + E7/E13/E20/E29 + M19/M25 + 2 ratings + 2 open | Early warning: stop/continue before scaling |
| D) Follow-up trend survey | Same populations as A/B | 6–12 months later | 12–18 | Core items: E1/E7/E13/E20/E29/E36, M1/M7/M19/M25/M31/M42, S6 + ratings | Trust/fairness trend; adoption readiness for broader rollout |
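
If your survey tool loads items by ID, the shorthand in the "Question mix" column can be expanded programmatically. The sketch below is only an illustration: `expand_items` and `BLUEPRINT_C` are hypothetical names, and it assumes item IDs always follow the E/M/S plus number pattern used in this template.

```python
import re

def expand_items(spec: str) -> list[str]:
    """Expand shorthand such as 'E1–E6' or 'E7/E13/E20' into single item IDs."""
    items: list[str] = []
    for part in re.split(r"[,/+]", spec):
        part = part.strip()
        if not part:
            continue
        # A range like "E1–E6" (en dash or hyphen), same prefix on both ends.
        match = re.fullmatch(r"([EMS])(\d+)\s*[–-]\s*\1(\d+)", part)
        if match:
            prefix, start, end = match.group(1), int(match.group(2)), int(match.group(3))
            items.extend(f"{prefix}{n}" for n in range(start, end + 1))
        elif re.fullmatch(r"[EMS]\d+", part):
            items.append(part)
        else:
            raise ValueError(f"Unrecognised item spec: {part!r}")
    return items

# Likert items of Blueprint C (ratings and open questions are added separately).
BLUEPRINT_C = expand_items("S1–S6 + E7/E13/E20/E29 + M19/M25")
print(BLUEPRINT_C)
```

For Blueprints A and B the listed blocks contain more statements than the 18–22 item target, so you would still pick a subset by hand after expanding.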

Scoring & thresholds

Use the 5-point scale for statements and 0–10 for overall ratings. Tie decisions to thresholds, not individual anecdotes. Three bands apply to averages: Low (<3.0): fix before scaling. Medium (3.0 to <4.0): improve next cycle. High (≥4.0): standardise and share examples. For 0–10 ratings, treat an average below 6/10 as a stop-and-fix signal in pilots.
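
The bands translate into a few lines of code. This is a minimal sketch with hypothetical function names; it assumes averages between 3.9 and 4.0 count as Medium, which the band labels leave open.

```python
def likert_band(average: float) -> str:
    """Map a 1-5 dimension average to the Low / Medium / High band."""
    if average < 3.0:
        return "low"     # fix before scaling
    if average < 4.0:    # assumption: 3.9-4.0 is still Medium
        return "medium"  # improve next cycle
    return "high"        # standardise and share examples

def rating_needs_fix(average_0_10: float) -> bool:
    """Stop-and-fix signal for the 0-10 overall ratings in pilots."""
    return average_0_10 < 6.0

print(likert_band(3.4), rating_needs_fix(5.5))
```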

| Metric | How to calculate | Threshold | Decision rule |
| --- | --- | --- | --- |
| Dimension average | Mean of items per block (e.g., E13–E18) | Avg <3.0 | Pause expansion; fix root cause before next cycle |
| Favorable rate | % choosing 4–5 (Agree/Strongly agree) | <60% | Targeted improvement plan with owner + deadline |
| Disagree concentration | % choosing 1–2 | ≥20% | Run focus groups; review comms and manager behaviour |
| Group gap | Difference between groups (e.g., remote vs. office) | ≥0.4 points | Bias check + process audit; escalate to HR leadership |
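
As a rough illustration of the first three metrics, here is a small sketch in plain Python. All names and the sample data are made up. It pools every answer in a block before computing the rates, which is one possible reading of the table. The optional `reverse` argument is an assumption, not part of the template: it flips negatively worded items (such as E14 or E17) so that a higher score always means a better result.

```python
from statistics import mean

def pooled_scores(responses, items, reverse=frozenset()):
    """Collect every 1-5 answer to `items`; flip items listed in `reverse`."""
    scores = []
    for answers in responses:
        for item in items:
            if item in answers:  # skipped questions are simply left out
                score = answers[item]
                scores.append(6 - score if item in reverse else score)
    return scores

def dimension_metrics(responses, items, reverse=frozenset()):
    scores = pooled_scores(responses, items, reverse)
    n = len(scores)
    return {
        "average": mean(scores),
        "favorable_pct": 100 * sum(s >= 4 for s in scores) / n,
        "disagree_pct": 100 * sum(s <= 2 for s in scores) / n,
    }

def threshold_flags(metrics):
    """Apply the thresholds from the scoring table."""
    flags = []
    if metrics["average"] < 3.0:
        flags.append("pause expansion")
    if metrics["favorable_pct"] < 60:
        flags.append("targeted improvement plan")
    if metrics["disagree_pct"] >= 20:
        flags.append("run focus groups")
    return flags

# Made-up answers from four respondents to two quality items.
sample = [
    {"E7": 4, "E8": 5},
    {"E7": 2, "E8": 3},
    {"E7": 4, "E8": 4},
    {"E7": 1, "E8": 2},
]
metrics = dimension_metrics(sample, ["E7", "E8"])
print(metrics, threshold_flags(metrics))
```

In this sample the average (3.125) stays above the pause threshold, yet the favorable and disagree rates both raise a flag, which is why the table asks you to look at all three numbers.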

Four-step analysis routine: (1) compute averages per dimension, (2) check dispersion and "disagree" rates, (3) compare groups, (4) map to actions in the decision table.

  • HR computes dimension scores (E, M, S) and flags red thresholds — within 7 days.
  • People Analytics checks group gaps and outliers — within 10 days.
  • Leaders agree on 3 priority fixes — within 14 days.
  • HR publishes a short "what changes next cycle" note — within 21 days.
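
To put the four deadlines on a calendar, a tiny helper is enough. The sketch assumes the day counts are calendar days counted from the survey close date, which the list above does not state explicitly; `MILESTONES` and `schedule` are hypothetical names.

```python
from datetime import date, timedelta

# (task, days after survey close), taken from the list above.
MILESTONES = [
    ("HR: dimension scores and red thresholds", 7),
    ("People Analytics: group gaps and outliers", 10),
    ("Leaders: agree on 3 priority fixes", 14),
    ("HR: publish 'what changes next cycle' note", 21),
]

def schedule(survey_close: date) -> list[tuple[str, date]]:
    """Return each milestone with its due date (assumes calendar days)."""
    return [(task, survey_close + timedelta(days=days)) for task, days in MILESTONES]

for task, due in schedule(date(2026, 3, 2)):
    print(due.isoformat(), task)
```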

Action plan: what to do when scores are low

AI in reviews fails most often not because of the tool, but because nobody owns the messy parts: corrections, escalations, and governance. Set routing rules upfront. Default response times for pilots: ≤24 h for severe privacy concerns, ≤7 days to acknowledge low trust scores, ≤21 days to publish actions. Align calibration follow-ups with your internal calibration process to keep AI from becoming a backdoor to inconsistent standards.

| Signal | Threshold | Recommended action | Owner | Due |
| --- | --- | --- | --- | --- |
| Low transparency | E1–E6 Avg <3.2 or ≥20% "disagree" | Publish a 1-page AI-in-reviews explainer; manager talking points; add "AI used: yes/no" label in forms | HR + Comms | 14 days |
| Low feedback quality | E7–E12 Avg <3.0 | Collect 10 anonymised "bad vs good" examples; refine prompts; require 2 evidence bullets per section | HR + Pilot managers | 21 days |
| Fairness/bias concerns | E13–E18 Avg <3.0 or group gap ≥0.4 | Bias review of AI outputs; sharpen rubric anchors; run calibration refresh | People Analytics + HR | 30 days |
| Low psychological safety | E19–E24 Avg <3.2 | Set a correction right; provide escalation path; train "challenge-safe" conversation scripts | HRBP + Managers | 14 days |
| Privacy concerns | E25–E30 Avg <3.5 or any severe comment | Re-brief GDPR rules; update prompt do-not-enter list; confirm retention period; align with DPO and works council | DPO + IT + HR | 7 days |
| Insufficient training | M1–M6 Avg <3.0 | Mandatory 90-min training + checklist; gate next cycle access on completion | L&D + HR | 30 days |
| Weak manager oversight | M19–M24 Avg <3.2 | Add human sign-off step; require evidence references; peer-review 10% of AI-assisted reviews | HR + Function leaders | Next cycle |
| Low 0–10 ratings | Avg <6/10 or downtrend ≥1.0 | Run 45-min focus groups (separate employee/manager); publish 3 concrete changes | HR | 14 days |
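
The routing table can also live as data, so flagged blocks map to an owner and deadline automatically. The sketch below is deliberately simplified: it covers only the average-based thresholds for the Likert blocks and leaves out the "or" conditions (disagree rate, group gap, severe comments) and the 0–10 ratings row. `Rule`, `RULES`, and `triggered` are hypothetical names, and block keys use a plain hyphen.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Rule:
    signal: str
    block: str
    avg_below: float
    owner: str
    due_days: Optional[int]  # None stands for "next cycle"

RULES = [
    Rule("Low transparency", "E1-E6", 3.2, "HR + Comms", 14),
    Rule("Low feedback quality", "E7-E12", 3.0, "HR + Pilot managers", 21),
    Rule("Fairness/bias concerns", "E13-E18", 3.0, "People Analytics + HR", 30),
    Rule("Low psychological safety", "E19-E24", 3.2, "HRBP + Managers", 14),
    Rule("Privacy concerns", "E25-E30", 3.5, "DPO + IT + HR", 7),
    Rule("Insufficient training", "M1-M6", 3.0, "L&D + HR", 30),
    Rule("Weak manager oversight", "M19-M24", 3.2, "HR + Function leaders", None),
]

def triggered(block_averages: dict[str, float]) -> list[Rule]:
    """Return the rules whose block average is below threshold, most urgent first."""
    hits = [
        rule for rule in RULES
        if rule.block in block_averages and block_averages[rule.block] < rule.avg_below
    ]
    # Rules without a day count ("next cycle") sort last.
    return sorted(hits, key=lambda r: (r.due_days is None, r.due_days or 0))

# Made-up block averages from one pilot.
for rule in triggered({"E1-E6": 2.8, "E7-E12": 3.6, "E25-E30": 3.4, "M19-M24": 3.0}):
    print(rule.signal, "->", rule.owner, rule.due_days)
```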

Fairness & bias checks

Fairness is not a single average score. Break results down by relevant groups and compare both perception and process signals. Research on AI-based performance prediction shows employees judge AI features as fair when they closely align with actual job performance — generic or opaque systems undermine that perception even when output quality is adequate. Practical red flags: group gap ≥0.4, or ≥15 percentage points difference in favorable rates, or repeated open-text mentions of "generic," "copy-paste," "unfair," or "privacy." Align language checks with your internal bias review process for consistent standards.

| Pattern you see | Typical interpretation | What to do next | Owner |
| --- | --- | --- | --- |
| Remote workers: lower E1–E6 | Transparency gaps in distributed communication | Remote-first briefing; add disclosure in review tool UI | HR + Managers |
| Junior staff: higher E14/E17 | Higher uncertainty and power distance | Add "how to challenge" steps; create a safe escalation path | HRBP |
| One team: lower E16 + M31 | Inconsistent standards and calibration drift | Refresh rubric anchors; run targeted calibration workshop | Function leader + HR |
| High E33 + low E12 | Feedback feels impersonal or poorly edited | Set editing minimums; ban copy-paste; require personalised examples | Managers |
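
The practical red flags above (group gap ≥0.4, ≥15 percentage points difference in favorable rates, repeated keywords in open text) can be checked with a short script. This is a sketch with hypothetical names and made-up data. It compares the highest and lowest group, and it counts comments that contain each term without deciding how many mentions count as "repeated".

```python
from statistics import mean

RED_FLAG_TERMS = ("generic", "copy-paste", "unfair", "privacy")

def favorable_pct(scores):
    """Share of answers that are 4 or 5, in percent."""
    return 100 * sum(s >= 4 for s in scores) / len(scores)

def group_red_flags(scores_by_group):
    """Compare the best and worst group on average and favorable rate."""
    averages = [mean(s) for s in scores_by_group.values()]
    favorable = [favorable_pct(s) for s in scores_by_group.values()]
    return {
        "average_gap": max(averages) - min(averages) >= 0.4,
        "favorable_gap": max(favorable) - min(favorable) >= 15,
    }

def term_mentions(comments):
    """Count how many comments mention each red-flag term (case-insensitive)."""
    lowered = [c.lower() for c in comments]
    return {term: sum(term in c for c in lowered) for term in RED_FLAG_TERMS}

# Made-up E16 answers for two groups.
e16 = {"remote": [3, 3, 4, 3, 3], "office": [4, 4, 3, 4, 4]}
print(group_red_flags(e16))
print(term_mentions(["Felt generic and copy-paste.", "Generic wording", "Great"]))
```

Only run group comparisons on groups that meet your anonymity threshold, otherwise a gap can point at individual people.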

Practical scenarios

Scenario 1: Employees distrust AI because disclosure is unclear

You see E1–E6 Avg 2.8 and comments like "I only noticed AI wording after the meeting." HR adds disclosure at the point of use: the review form shows "AI-assisted draft: yes/no" plus a one-paragraph explanation. Managers get a 60-second script for the review conversation. You re-run the combined pulse (Blueprint C) in 60 days to confirm trust moved upward.

Scenario 2: Managers save time, but employees find feedback generic

M7 and M8 score high (≥4.0), but E7–E12 score low (Avg <3.0) and E33 rises. Decision: keep AI for structuring, but require evidence. Each review section must include 2 proof points — project, metric, observable behaviour. HR shares "good vs bad" examples, updates prompts, and spot-checks 10% of reviews for specificity before they reach employees in the next cycle.

Scenario 3: Fairness concerns cluster in one group

E16 is stable overall, but one location shows a gap of 0.5 points and higher E14. HR and the local works council check whether different prompts, rubrics, or data sources were used there. Next steps: calibration refresher, sharpen rubric anchors, re-brief managers on bias patterns. You compare the same group slice in the follow-up trend survey (Blueprint D) to confirm the gap narrowed.

Implementation guide

Start small, then scale. Involve your works council early whenever AI touches performance processes, monitoring questions, or decision support: the EU AI Act requires employers to inform worker representatives before deployment, and national labor law can add consultation or co-determination rights on top. Keep the survey clearly separate from individual outcomes: responses must not influence ratings, pay, or performance labels, or you will kill trust and participation rates.

Recommended rollout rhythm: pilot (6–10 weeks) → first review cycle → survey + fixes (≤30 days) → scale to next area → trend check at 6–12 months. For broader talent and skills planning alongside your AI rollout, our talent management software guide for DACH covers the GDPR and works council checklist in detail.

  • Pilot in 1 function with 20–50 participants; define allowed AI uses — HR + IT, within 14 days.
  • Agree data minimisation, retention, and access rights; document in plain language — DPO + HR, within 30 days.
  • Train managers on verification and challenge-safe conversations — L&D, before the cycle starts.
  • Run Blueprint A and B after the cycle; publish actions within 21 days — HR lead.
  • Review and update the question bank annually or after major tool changes — HR + works council, 1× per year.

| KPI to track | Target | Why it matters | Owner |
| --- | --- | --- | --- |
| Participation rate | ≥70% post-cycle | Low rates usually signal trust or survey fatigue issues | HR Ops |
| Time-to-action | ≤21 days | Speed builds credibility; slow follow-through reduces honesty next round | HR Lead |
| Training completion (managers) | ≥90% | Reduces risky prompts and over-reliance on unreviewed drafts | L&D |
| Fairness group gaps | <0.4 points | Early warning for bias patterns or inconsistent standards | People Analytics |
| Action completion rate | ≥80% | Prevents "survey theatre" and demonstrates follow-through | HRBP + Leaders |
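
If you track these KPIs in a script or notebook, the targets can be encoded once and checked each cycle. A minimal sketch, assuming hypothetical KPI keys and made-up actuals:

```python
import operator

# KPI key -> (comparison, target), mirroring the table above.
KPI_TARGETS = {
    "participation_pct": (operator.ge, 70),
    "time_to_action_days": (operator.le, 21),
    "manager_training_pct": (operator.ge, 90),
    "max_group_gap": (operator.lt, 0.4),
    "action_completion_pct": (operator.ge, 80),
}

def kpi_status(actuals: dict[str, float]) -> dict[str, bool]:
    """Return True for each reported KPI that meets its target."""
    return {
        name: compare(actuals[name], target)
        for name, (compare, target) in KPI_TARGETS.items()
        if name in actuals
    }

# Made-up numbers for one cycle.
print(kpi_status({
    "participation_pct": 74,
    "time_to_action_days": 25,
    "manager_training_pct": 92,
    "max_group_gap": 0.5,
    "action_completion_pct": 80,
}))
```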

For broader skills and competency tracking alongside your AI governance work, a structured skill management software comparison helps you separate "tool problems" from "skill gaps" — and plan targeted fixes faster.

FAQ

How often should we run these AI performance review survey questions?

During a pilot, run it after every AI-assisted review cycle for at least the first 2 cycles — while memories are fresh. After that, a deep-dive once a year plus a short pulse after major AI feature changes works well. Keep 6–8 core items stable across runs (trust, fairness, psychological safety, privacy) so you can track trends reliably over time.

What should we do if scores are very low (Avg <3.0) or comments are harsh?

Start with containment and clarity, not defensiveness. Acknowledge results within 7 days and say what will happen next. Use focus groups to identify root causes such as disclosure gaps, generic wording, privacy fears, or manager behaviour. Then commit to at most 3 fixes, with owners and deadlines. Close the loop publicly within 21 days. For severe privacy concerns, route to DPO/IT with a response time of 24 hours or less.

How do we stop the survey from feeling like monitoring or performance control?

Be explicit about purpose and separation: survey responses must not influence ratings, pay, or individual outcomes. Report aggregated results only, and apply anonymity thresholds (no reporting for groups smaller than 5–7). Involve your works council or employee representatives early and document guardrails clearly — data sources, retention periods, access rights. The EDPB guidelines on automated decision-making are a useful reference point for framing this internally.
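
An anonymity threshold is easy to enforce in the reporting step. The sketch below suppresses any group below a minimum size. The default of 5 is the lower end of the 5–7 range mentioned above, and `reportable_averages` is a hypothetical name.

```python
from statistics import mean

def reportable_averages(scores_by_group, min_group_size=5):
    """Return the average per group, or None where the group is too small to report."""
    return {
        group: (round(mean(scores), 2) if len(scores) >= min_group_size else None)
        for group, scores in scores_by_group.items()
    }

# Made-up scores: team B has only two respondents and is suppressed.
print(reportable_averages({"Team A": [4, 4, 3, 5, 4], "Team B": [2, 3]}))
```

Suppression alone does not stop someone from subtracting reported groups from a published total, so check that your overall numbers do not reveal a suppressed group by difference.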

What do GDPR Art. 22 and the EU AI Act require in practice?

GDPR Art. 22 prohibits decisions with significant effects that are based solely on automated processing. AI-generated rating suggestions or calibration inputs that managers accept without substantive review potentially trigger this protection. Employees have the right to request human intervention, express their view, and contest the outcome. Genuine review requires actual decision-making authority, not rubber-stamping. The EU AI Act adds a second layer: performance evaluation systems are high-risk AI (Annex III), which brings requirements for risk management, documentation, logging, and human oversight, split between the provider and the deploying employer. A fundamental rights impact assessment is mandatory only for certain deployers, such as public bodies. From August 2026 (or later if the Digital Omnibus proposal passes), employees may also request a meaningful explanation of any high-risk AI output under Art. 86 of the AI Act.

Should we tell employees when AI was used in their review?

Yes — if you want trust. Lack of transparency tends to inflate fairness concerns even when output quality is fine. Keep it simple: where was AI used (drafting, summarising, calibration support), what data did it use and not use, and who owns the final content. Add a correction right: employees can challenge wording and request edits without needing to "prove" the AI was wrong.

How do we keep the question bank current as tools and policies change?

Run an annual review with HR, a few managers, People Analytics, and your works council where applicable. Keep your core trend items stable (trust, fairness, psychological safety, privacy) and rotate a small set of feature-specific questions based on what changed — new summarisation features, new rating suggestions, new data sources. Pilot new items with one team first, then scale once wording is unambiguous and action-ready.

Jürgen Ulbrich

CEO & Co-Founder of Sprad

Jürgen Ulbrich has more than a decade of experience in developing and leading high-performing teams and companies. As an expert in employee referral programs as well as feedback and performance processes, Jürgen has helped over 100 organizations optimize their talent acquisition and development strategies.
