When a dental practice runs a parallel 30-day pilot of Voicify against Competitor A (the horizontal voice-AI platform that handles dental as one vertical of many) or Competitor B (the dental-pure AI receptionist with a narrower feature surface and lighter PMS integration depth), the buyer's evaluation will happen with or without a structured rubric. Without one, the pilot collapses into vibes: whichever vendor's setup felt easier, whichever rep replied to Slack faster, whichever live demo happened to script around a sympathetic use case. None of that signals long-term fit. The Voicify "Competitor A or Competitor B" dental AI pilot scorecard is a six-category, 18-criterion buyer-facing rubric that the Voicify rep hands the dental practice at pilot kickoff. It runs through the 30-day window, scores each criterion 0/1/2, applies slot-aware weighting based on whether the buyer is comparing against an A-slot or B-slot competitor, and produces a defensible decision artifact for the practice's procurement or partner committee at the end of the pilot.
TL;DR
Six categories. 18 criteria. Slot-aware 1.5x multiplier. The rep hands the buyer a structured 30-day pilot rubric covering coverage, conversion, integration fidelity, clinical safety, staff lift, and total cost. Each criterion scores 0/1/2 for a 36-point unweighted total. A-slot pilots (vs horizontal voice-AI) apply a 1.5x multiplier on six integration and dental-workflow criteria. B-slot pilots (vs dental-pure receptionist) apply the multiplier on three coverage and three conversion criteria. In both slots the six weighted criteria are worth 3 points each instead of 2, lifting the ceiling from 36 to 42 points. The scorecard runs in the open, not as a Voicify-internal evaluation, and feeds the win-loss debrief and slot-flip log at pilot close. A buyer who refuses the rubric is a discovery-stage problem, not a pilot-stage problem.
Why a Buyer-Facing Pilot Scorecard
The argument against handing the buyer a Voicify-authored rubric is the obvious one: the buyer will assume the rubric is rigged. The argument for doing it anyway is that the alternative is worse. A buyer running a parallel pilot without a rubric will default to whichever criterion is easiest to feel (setup friction, the first 48 hours of UI exposure, rep responsiveness during a single weekend escalation), and that criterion may have nothing to do with what determines 12-month retention or full-practice rollout success. The dental-pure B-slot competitor wins those vibes-based pilots disproportionately because it has one product, no platform overhead, and a setup wizard tuned to a single dental persona. Voicify's PMS integration depth, clinical-grade transcription, and after-hours overflow handling do not surface in the first 48 hours of a pilot. They surface in week three.
The scorecard works because it is framed honestly. The rep does not hand the buyer a rubric that says "Voicify wins on everything." The rep hands the buyer a rubric that names six categories the buyer should evaluate, identifies the slot weighting that applies to that buyer's competitive comparison, and walks the buyer through how to score it. The rep accepts that on some criteria — particularly setup friction and time-to-first-call in B-slot pilots — Voicify will not win the 2-point score. The rubric is defensible because it does not pretend Voicify wins everywhere. It is also a sales instrument because it surfaces the criteria where Voicify's work shows up against whichever competitor the buyer is actually comparing against.
The Six Categories and 18 Criteria
Six categories cover the full evaluation surface. Each category carries three criteria, scored independently 0/1/2 at pilot end. The scoring rubric for each criterion is shared with the buyer at kickoff and does not change mid-pilot.
| Category | Criterion | 0 / 1 / 2 anchor |
|---|---|---|
| Coverage | After-hours uptime | 0: outage during pilot / 1: 95-98% / 2: 99%+ |
| Coverage | Weekend continuity | 0: dropped / 1: handled with delays / 2: indistinguishable from weekday |
| Coverage | Peak-hour overflow handling | 0: rolled to voicemail / 1: handled some / 2: all captured |
| Conversion | Appointment booking completion | 0: under 50% / 1: 50-70% / 2: 70%+ |
| Conversion | No-show prevention messaging | 0: none / 1: basic SMS / 2: tailored multi-channel |
| Conversion | New-patient capture rate | 0: low vs baseline / 1: flat / 2: lifted vs baseline |
| Integration fidelity | PMS write-back accuracy | 0: manual reconcile / 1: occasional errors / 2: clean |
| Integration fidelity | Insurance verification handling | 0: not attempted / 1: partial / 2: clean carrier mapping |
| Integration fidelity | Hygiene recall integration | 0: ignored / 1: surfaced / 2: scheduled correctly |
| Clinical safety | Clinical terminology accuracy | 0: errors caught by staff / 1: 1-2 misses / 2: zero misses |
| Clinical safety | Emergency triage escalation | 0: missed an urgent call / 1: handled some / 2: all escalated correctly |
| Clinical safety | HIPAA / privacy posture in transcripts | 0: PHI leakage / 1: questionable / 2: clean |
| Staff lift | Front-desk hours reclaimed per week | 0: under 5 / 1: 5-10 / 2: 10+ |
| Staff lift | Exception handoff quality | 0: confusing / 1: workable / 2: same as a trained employee |
| Staff lift | Staff sentiment at week 4 | 0: would not continue / 1: mixed / 2: would champion |
| Total cost | Pricing predictability over 12 months | 0: volume cliffs / 1: tiered with surprises / 2: flat or predictable |
| Total cost | Setup and onboarding cost | 0: paid services required / 1: light services / 2: self-serve |
| Total cost | Switching cost from incumbent | 0: high / 1: moderate / 2: minimal |
The 0/1/2 anchors are deliberately concrete. The rep does not score the criterion for the buyer. The buyer's office manager and practice owner score it during a closing-week review meeting the rep facilitates. The rep's job during the 30-day pilot is to make sure the data the buyer needs to score each criterion is collectible — call logs are exported, PMS reconciliation reports are pulled, staff sentiment is surveyed in week 4. The rep does not score and does not dictate. The framework is the contribution.
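For a rep who wants to keep the rubric in a spreadsheet export or a small script, the structure above can be sketched as follows. This is a minimal illustration, not a Voicify tool: the names `RUBRIC`, `ALL_CRITERIA`, and `unweighted_total` are hypothetical, and the only facts taken from the scorecard are the six categories, the 18 criterion names, and the 0/1/2 scale.

```python
# Category -> criteria, transcribed from the scorecard table.
RUBRIC = {
    "Coverage": [
        "After-hours uptime",
        "Weekend continuity",
        "Peak-hour overflow handling",
    ],
    "Conversion": [
        "Appointment booking completion",
        "No-show prevention messaging",
        "New-patient capture rate",
    ],
    "Integration fidelity": [
        "PMS write-back accuracy",
        "Insurance verification handling",
        "Hygiene recall integration",
    ],
    "Clinical safety": [
        "Clinical terminology accuracy",
        "Emergency triage escalation",
        "HIPAA / privacy posture in transcripts",
    ],
    "Staff lift": [
        "Front-desk hours reclaimed per week",
        "Exception handoff quality",
        "Staff sentiment at week 4",
    ],
    "Total cost": [
        "Pricing predictability over 12 months",
        "Setup and onboarding cost",
        "Switching cost from incumbent",
    ],
}

ALL_CRITERIA = [name for names in RUBRIC.values() for name in names]


def unweighted_total(scores: dict[str, int]) -> int:
    """Sum the 0/1/2 scores after checking every criterion is scored once."""
    missing = set(ALL_CRITERIA) - set(scores)
    unknown = set(scores) - set(ALL_CRITERIA)
    if missing or unknown:
        raise ValueError(f"missing={sorted(missing)} unknown={sorted(unknown)}")
    for name, value in scores.items():
        if value not in (0, 1, 2):
            raise ValueError(f"{name}: score must be 0, 1, or 2, got {value!r}")
    return sum(scores.values())


# A vendor scoring 2 on every criterion reaches the 36-point unweighted ceiling.
print(unweighted_total({name: 2 for name in ALL_CRITERIA}))  # 36
```

The validation step matters more than the sum: a scorecard with a skipped criterion or an off-scale score is not comparable across the two vendors, so the sketch refuses to total it.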
Slot-Aware Weighting: A vs B
The slot the buyer assigned the competitor at discovery determines which six criteria carry a 1.5x multiplier in the final total. This is the mechanism that prevents the scorecard from being a generic vendor-evaluation form that happens to live on Voicify letterhead.
A-Slot Pilots: Voicify vs Horizontal Voice-AI Platform
When the buyer is comparing against Competitor A (a horizontal voice-AI platform with broad industry coverage and a dental persona on top of general infrastructure), the 1.5x multiplier hits six criteria: the integration fidelity criteria (PMS write-back accuracy, insurance verification handling, hygiene recall integration) plus the dental-specific workflow criteria in clinical safety, clinical terminology accuracy among them. The logic is that the A-slot competitor's weakness is rarely raw capability; it is the depth of dental-specific work: PMS write-back accuracy, insurance verification carrier mapping, hygiene recall scheduling, dental terminology recognition. Weighting those six criteria 1.5x lifts the ceiling from 36 to 42 points on the slot-weighted total, and the 4-7 point spread is typically where the dental-specific work shows up.
B-Slot Pilots: Voicify vs Dental-Pure AI Receptionist
When the buyer is comparing against Competitor B (a dental-pure AI receptionist with a narrow feature surface and lighter PMS integration depth), the 1.5x multiplier hits the three coverage criteria and the three conversion criteria. The logic is that the B-slot competitor's strength is its dental persona and setup wizard. Its weakness is what happens when call volume scales, when after-hours uptime is tested, when an overflow event happens during peak hours, and when no-show prevention has to be tuned beyond a basic SMS template. Weighting coverage and conversion 1.5x lifts the ceiling from 36 to 42 points (six criteria worth 3 points each instead of 2), and the slot-weighted spread typically shows up in week three, when sustained volume reveals coverage gaps the dental-pure competitor papered over in the demo.
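The weighting arithmetic is simple enough to sketch. The example below uses the B-slot set because it maps cleanly onto two whole categories (coverage and conversion); an A-slot run would pass its own six criterion names instead. The function name `weighted_total`, the constant names, and the two sets of vendor scores are hypothetical and exist only to show the calculation. With six criteria at 1.5x, the ceiling works out to 12 × 2 + 6 × 3 = 42.

```python
# The 18 criteria in table order: coverage, conversion, integration fidelity,
# clinical safety, staff lift, total cost (three each).
CRITERIA = (
    "After-hours uptime", "Weekend continuity", "Peak-hour overflow handling",
    "Appointment booking completion", "No-show prevention messaging",
    "New-patient capture rate",
    "PMS write-back accuracy", "Insurance verification handling",
    "Hygiene recall integration",
    "Clinical terminology accuracy", "Emergency triage escalation",
    "HIPAA / privacy posture in transcripts",
    "Front-desk hours reclaimed per week", "Exception handoff quality",
    "Staff sentiment at week 4",
    "Pricing predictability over 12 months", "Setup and onboarding cost",
    "Switching cost from incumbent",
)

# B-slot: the three coverage and three conversion criteria carry the multiplier.
B_SLOT_WEIGHTED = frozenset(CRITERIA[:6])


def weighted_total(scores, weighted, multiplier=1.5):
    """Apply the slot multiplier to the weighted criteria and sum everything."""
    return sum(
        score * (multiplier if name in weighted else 1.0)
        for name, score in scores.items()
    )


# Ceiling: every criterion scored 2.
print(weighted_total({name: 2 for name in CRITERIA}, B_SLOT_WEIGHTED))  # 42.0

# Hypothetical closing-week scores, for illustration only.
voicify = {name: 2 for name in CRITERIA}
voicify["Setup and onboarding cost"] = 1          # unweighted criterion
competitor = {name: 2 for name in CRITERIA}
for name in CRITERIA[:3]:                         # coverage gaps under load
    competitor[name] = 1

spread = (weighted_total(voicify, B_SLOT_WEIGHTED)
          - weighted_total(competitor, B_SLOT_WEIGHTED))
print(spread)  # 3.5
```

In this made-up case the unweighted spread would be only 2 points (35 vs 33); the slot weighting widens it to 3.5 because the competitor's misses fall on weighted criteria.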
The Closing-Week Review Meeting
The scorecard is not scored asynchronously. At pilot day 26, the rep schedules a 60-minute closing-week review with the practice owner, the office manager, and a clinical lead if available. The rep walks the room through each of the 18 criteria, presents the data needed to score it (call logs, PMS reports, staff survey, financial reconciliation), and the room scores. The rep takes notes; the rep does not vote. Both vendor scores are tabulated side-by-side at the meeting's close, weighted by slot, and the room sees the totals. The recommendation is the buyer's, not the rep's. The rep's role is to make sure the rubric was applied consistently across both vendors — if the buyer scored Voicify a 1 on PMS write-back accuracy, the rep asks what data point that score is based on and whether the same data point was used for the competitor.
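The consistency check the rep performs in the room can be made mechanical if each score is recorded with the evidence it rests on. The sketch below is one assumed way to do that: `ScoreEntry` and `inconsistent_evidence` are hypothetical names, and the evidence labels are free-text placeholders rather than a defined Voicify taxonomy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreEntry:
    criterion: str
    vendor: str
    score: int      # 0, 1, or 2, as scored by the room
    evidence: str   # data point the score is based on


def inconsistent_evidence(entries):
    """Return criteria where the vendors were scored from different evidence."""
    by_criterion = {}
    for entry in entries:
        by_criterion.setdefault(entry.criterion, {})[entry.vendor] = entry.evidence
    return sorted(
        criterion
        for criterion, evidence_by_vendor in by_criterion.items()
        if len(set(evidence_by_vendor.values())) > 1
    )


# Illustrative entries only.
entries = [
    ScoreEntry("PMS write-back accuracy", "Voicify", 1, "PMS reconciliation report"),
    ScoreEntry("PMS write-back accuracy", "Competitor", 2, "front-desk recollection"),
    ScoreEntry("After-hours uptime", "Voicify", 2, "call log export"),
    ScoreEntry("After-hours uptime", "Competitor", 1, "call log export"),
]

print(inconsistent_evidence(entries))  # ['PMS write-back accuracy']
```

A flagged criterion is a prompt for a question, not a correction: the room still decides the score, and the rep only asks whether both vendors were measured against the same data point.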
The closing-week review is the second-most-important meeting of the pilot, behind only kickoff. Kickoff sets the rubric. Closing scores it. Everything in between is data collection. Reps who try to influence the score during the meeting lose pilots they would have won — the rubric works only when the buyer trusts the rep is not gaming it.
Feed Into Slot-Flip Log and Win-Loss Debrief
The scorecard total flows into two existing instruments. If the slot the rep actually ran during the pilot differed from the slot the discovery brief assigned — for example, the brief said B-slot but the buyer kept introducing horizontal-platform comparisons mid-pilot — the rep logs the flip in the slot-flip log with a reason code. The final weighted scorecard total and a one-paragraph context note feed the win-loss debrief stack at close. A high-score loss — Voicify won the rubric but lost the deal — flags a criterion outside the scorecard the buyer weighted heavily (often executive relationship, board recommendation, or sibling-practice influence), and that becomes a quarterly refresh input. A low-score win — Voicify lost the rubric but won the deal — flags either a procurement-process driver (existing vendor relationship, paper-handling friction) or a criterion the rubric should add in the next revision.
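The two flags and the slot-flip check reduce to a few comparisons. The sketch below assumes a hypothetical `PilotClose` record and treats a tied rubric as neither a rubric win nor a rubric loss, which is a choice the scorecard itself does not specify. Reason codes for the slot-flip log are left out because their values are not defined here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PilotClose:
    brief_slot: str           # slot assigned in the discovery brief: "A" or "B"
    run_slot: str             # slot the rep actually ran during the pilot
    voicify_total: float      # final slot-weighted scorecard total
    competitor_total: float
    deal_won: bool


def slot_flipped(pilot: PilotClose) -> bool:
    """True when the pilot needs an entry in the slot-flip log."""
    return pilot.brief_slot != pilot.run_slot


def debrief_flag(pilot: PilotClose) -> Optional[str]:
    """Classify the rubric-vs-deal mismatch for the win-loss debrief."""
    if pilot.voicify_total > pilot.competitor_total and not pilot.deal_won:
        return "high-score loss"   # won the rubric, lost the deal
    if pilot.voicify_total < pilot.competitor_total and pilot.deal_won:
        return "low-score win"     # lost the rubric, won the deal
    return None                    # rubric and outcome agree, or rubric tied


# Illustrative values only.
pilot = PilotClose(brief_slot="B", run_slot="A",
                   voicify_total=38.5, competitor_total=34.0, deal_won=False)
print(slot_flipped(pilot))   # True
print(debrief_flag(pilot))   # high-score loss
```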
What the Scorecard Is Not
It is not a marketing collateral piece. It does not get posted on the Voicify website or sent in cold outbound. It is a rep instrument, handed to a buyer who has already chosen to pilot. It is also not a Voicify-internal evaluation — the rep does not score it privately and then present results. The buyer scores it. The rep's contribution is the framework, the data prep, and the meeting facilitation. The instrument fails the moment the rep tries to use it as a closing tool rather than an evaluation tool. Reps who treat the scorecard as a sales close lose the trust that makes the scorecard work; reps who treat it as a buyer-trust instrument win pilots they would have lost to vibes.