An AI Idea Validator Scored My Live App 66/100

Line-art illustration of a hand holding a scorecard document under a magnifying glass next to a smartphone showing a fitness app, with a gauge dial pointing to the middle of its range

GainFrame has been live on the App Store since March. This morning I took the original pitch — gym selfies in, body fat and physique insights out — and fed it into IdeaGrit, an AI idea-validation tool, as if the app didn't exist yet. I wanted to see what it would have told me on day zero.

It scored the idea 66 out of 100 and stamped the problem "Nice to have."

The report took about a minute to generate and ran on a free credit (you get three at signup, no card). The parts of it that are right took me four months and $5,674 to learn myself. That gap is what this post is about — three things the tool genuinely got right, and the one place it whiffed.

IdeaGrit idea score card showing 66 out of 100, with subscores of 16/25 for problem urgency, 15/25 for market opportunity, 17/25 for feasibility and differentiation, and 18/25 for execution realism — The verdict on the app I'd already built: 66/100, "moderate urgency, crowded market, and trust-sensitive differentiation cap the score."

It didn't blow smoke

This is the thing that made me keep reading instead of closing the tab.

I've pasted the GainFrame pitch into ChatGPT before and asked whether it's a good idea. The answer is always some version of "strong concept, growing market, great differentiation." It feels good and teaches nothing. IdeaGrit opened with 16/25 on problem urgency, called my app "more likely a retention-sensitive convenience product than a must-have," and put "No strong moat yet" in writing under the defensibility section. About my app. The one I've been building for over a year.

None of that was fun to read. All of it is defensible.

The detail I appreciated most: when a section rests on shaky ground, the tool flags it with a red "worth refining before you act on this" banner instead of presenting every paragraph with the same confidence. The urgency assessment and the competitor list both got flagged in my report. An AI product that marks its own weak spots is rare — most of them would rather be confidently wrong than admit the input was thin.

It found my most expensive lessons on its own

This is the part that actually stung. The report had no access to my analytics, my ad spend, or my App Store data. It still landed on three conclusions I paid real money for.

On paid acquisition: the report says "customer acquisition will likely be harder than the concept sounds because this is not an urgent mass-market problem," and later, "pure direct-to-consumer acquisition in fitness is usually noisy and competitive." I spent $5,674 on ads finding that out — $114 per paying customer, each worth about $18. The report gets there in one sentence.
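For anyone who wants to see how lopsided that is, here is the arithmetic as a small Python sketch. The three inputs are the figures quoted above; the implied customer count and the value-to-cost ratio are back-of-envelope derivations from them, and the function and key names are just illustrative.

```python
# Back-of-envelope unit economics from the figures quoted in the post.
AD_SPEND = 5674.0            # total ad spend, in dollars
COST_PER_CUSTOMER = 114.0    # ad cost per paying customer
VALUE_PER_CUSTOMER = 18.0    # approximate value of each paying customer


def unit_economics(ad_spend, cost_per_customer, value_per_customer):
    """Derive the implied customer count and how much of the cost comes back."""
    return {
        # Spend divided by cost per customer gives the implied customer count.
        "implied_customers": ad_spend / cost_per_customer,
        # Dollars of customer value recovered per dollar of acquisition cost.
        "value_to_cost": value_per_customer / cost_per_customer,
        # Dollars lost on each paying customer acquired through ads.
        "loss_per_customer": cost_per_customer - value_per_customer,
    }


result = unit_economics(AD_SPEND, COST_PER_CUSTOMER, VALUE_PER_CUSTOMER)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

That works out to roughly 50 paying customers, about 16 cents of value back per ad dollar, and a $96 loss on each one.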

On what the real constraint is: "The main constraint is not market awareness but retention: if users do not trust the outputs after 4-8 weeks, revenue will likely stall at a low level even if early signups are easy." That's uncomfortably close to my actual dashboard. Retention of paying users is fine — above the category benchmark. The number I've been fighting all summer is trial-to-paid conversion, which runs at roughly half the health-and-fitness median. Signups are easy. Trust-me-enough-to-pay is the hard part. The report called that ordering without seeing a single row of my data.

On positioning: the go-to-market section says, verbatim, "Do not lead with 'AI body fat from selfies' because that will trigger skepticism; lead with weekly physique check-ins with clearer progress explanations." I learned this on TikTok over months of posting. Content that leads with the body-fat estimate gets picked apart in the comments. Content about training myths and progress tracking pulls people in, and then they find the estimate feature on their own. I have the posting history to prove which one works, and the tool just... said it.

I want to be fair about the mechanism here — this isn't magic, it's a model that has read a lot about fitness apps, and "fitness is competitive" is not a shocking take. But there's a difference between generically warning "fitness is hard" and specifically predicting that skepticism about photo-based body fat is the acquisition landmine. It got the specific version right.

It gives you numbers you can be graded on

Most AI advice is unfalsifiable. This report ends with a single metric: "Get 15 conversion-ready beta users to complete 4 weekly check-ins within 8 weeks." Pass or fail. Elsewhere it sets a 50% four-week check-in completion target, says that if fewer than 15-20% of engaged users will pay, the value is too weak, and calls monthly churn above 10-15% fatal for the subscription. Those are lines in the sand, not vibes.
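Because the thresholds are concrete, they can be written down as a checklist and run against real beta numbers later. The sketch below is one way to do that in Python. The thresholds come from the report as quoted above; where the report gives a range (15-20%, 10-15%), the code assumes the more lenient end, which is a choice made here, not something the report specifies. The `BetaResults` fields and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BetaResults:
    users_completing_4_checkins: int   # within the 8-week window
    four_week_completion_rate: float   # share of beta users finishing 4 weeks
    paid_share_of_engaged: float       # share of engaged users who pay
    monthly_churn: float               # share of subscribers lost per month


# Thresholds quoted in the report. For ranges, the lenient end is assumed.
MIN_USERS = 15
MIN_COMPLETION = 0.50
MIN_PAID_SHARE = 0.15   # report says "15-20%"; lower bound assumed
MAX_CHURN = 0.15        # report says "10-15%"; upper bound assumed


def grade(results):
    """Return a pass/fail flag for each threshold."""
    return {
        "beta_target": results.users_completing_4_checkins >= MIN_USERS,
        "completion": results.four_week_completion_rate >= MIN_COMPLETION,
        "willingness_to_pay": results.paid_share_of_engaged >= MIN_PAID_SHARE,
        "churn": results.monthly_churn <= MAX_CHURN,
    }


# Hypothetical beta outcome, for illustration only (not GainFrame data).
sample = BetaResults(12, 0.55, 0.18, 0.08)
report_card = grade(sample)
for check, passed in report_card.items():
    print(f"{check}: {'pass' if passed else 'fail'}")
```

With those made-up inputs, three checks pass and the headline beta target fails, which is exactly the kind of unambiguous answer a chat thread never gives.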

The revenue forecast has the same property, which meant I could do something mildly masochistic: grade my actual business against it.

IdeaGrit growth and revenue forecast section showing a base case of 150-300 paying users and $2,000-4,500 MRR by end of year one at $9-19/month, and a pessimistic case of 30-80 paying users at $300-1,200 MRR — Base case: 150-300 paying users and $2,000-4,500 MRR by end of year one. Pessimistic case: 30-80 paying users, $300-1,200 MRR.

My reality, four months in: 185 active subscriptions and $859 MRR. The subscriber count is already inside the base-case band with eight months to spare. The MRR is sitting in the pessimistic band. Both at once.
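The same grading fits in a few lines of Python. The band edges are the forecast figures quoted above and the actuals are the two numbers from the dashboard; the `BANDS` table and `band_for` helper are illustrative names. Two assumptions are worth flagging: the forecast is for the end of year one while the actuals are four months in, and "active subscriptions" is treated here as equivalent to the report's "paying users."

```python
# Forecast bands from the report: (low, high), inclusive.
BANDS = {
    "paying_users": {"pessimistic": (30, 80), "base": (150, 300)},
    "mrr": {"pessimistic": (300, 1200), "base": (2000, 4500)},
}


def band_for(metric, value):
    """Name the forecast band a value falls in, if any."""
    for name, (low, high) in BANDS[metric].items():
        if low <= value <= high:
            return name
    return "outside both bands"


# Actuals four months in, as stated in the post.
actuals = {"paying_users": 185, "mrr": 859}
placement = {metric: band_for(metric, value) for metric, value in actuals.items()}
print(placement)
```

The subscriber count lands in the base band and the MRR lands in the pessimistic one, which is the split described above.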

The explanation is pricing, and it's the second-order lesson I didn't expect from this exercise. The forecast assumes $9-19 a month. GainFrame Pro Yearly is $39.99 — about $3.33 a month. I priced low at launch because I was scared nobody would pay, and a one-minute report just implied I'm leaving most of the base case on the table. I'm not raising prices this week over an AI's opinion. But it's the first tool that made the pricing question concrete instead of theoretical, and it's been rattling around my head since.
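To make that gap concrete, here is a rough Python sketch of the pricing math. The yearly price, subscription count, MRR, and the $9-19 range are the figures from this post and the report. The "what if" MRR numbers assume every current subscriber would stay at the higher price, which is almost certainly too optimistic, so read them as an upper bound and not a projection.

```python
YEARLY_PRICE = 39.99         # GainFrame Pro Yearly
SUBSCRIPTIONS = 185          # active subscriptions, four months in
CURRENT_MRR = 859.0          # current monthly recurring revenue
FORECAST_PRICES = (9, 19)    # monthly price range the forecast assumes

# Monthly equivalent of the yearly plan.
monthly_equivalent = round(YEARLY_PRICE / 12, 2)

# Average revenue per subscription per month across all current plans.
revenue_per_subscription = round(CURRENT_MRR / SUBSCRIPTIONS, 2)

# Upper-bound MRR if every current subscriber paid the forecast price.
# Assumption: nobody cancels at the higher price, which is unrealistic.
what_if_mrr = tuple(SUBSCRIPTIONS * price for price in FORECAST_PRICES)

print(f"Yearly plan per month: ${monthly_equivalent}")
print(f"Revenue per subscription: ${revenue_per_subscription}")
print(f"MRR at forecast prices: ${what_if_mrr[0]} to ${what_if_mrr[1]}")
```

Even the bottom of that range is about double the current MRR, which is why the subscriber count and the revenue end up in different forecast bands.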

The miss: it pulled the wrong competitors

The one section that didn't survive contact with reality. The report names MacroFactor, MyFitnessPal, and Stronger by the Day as direct competitors.

IdeaGrit competitor analysis section listing MacroFactor, MyFitnessPal, and Stronger by the Day as direct competitors, with a key insight about the crowded fitness-tracking market — Adjacent giants, yes. Direct competitors, no. This section carried the tool's own "worth refining" flag.

Those are adjacent giants — macro trackers and lifting logs. Nobody cross-shops MyFitnessPal against GainFrame. The apps my users actually compare me to are the photo-analysis niche: ZozoFit, MeThreeSixty, Spren, Metamorph, Recomp AI. I know because I've written comparison posts against most of them, and those posts are some of my highest-converting search traffic — people searching "X vs Y" have their credit card half out already.

Two fair caveats. The tool flagged this exact section as "worth refining before you act on this," so it knows the list is shaky. And its description of the gap is right — "few products are known specifically for turning casual physique photos into explanations users trust" is a better articulation of my positioning than I usually manage. It understood the hole in the market; it just filled the competitor list with household names instead of the apps actually standing in the hole. A live App Store or web lookup for the niche would fix this, and it's the one upgrade I'd ask for. For a category-defining idea it matters less. For an idea like mine, entering an existing niche, the named competitors are half the value of the report.

Would it have changed anything?

Honest answer: it wouldn't have stopped me from building GainFrame. I built the app before validating anything, which is exactly backwards, and a 66 with "nice to have" stamped on it probably wouldn't have talked me out of it — I wanted this app to exist so I could use it.

But it would have changed the first four months. The report's launch plan says to validate with a small beta, lead with check-ins instead of body-fat claims, and test coach partnerships before spending on acquisition. I did roughly the opposite: shipped broadly, led with the scanner, and put $5,674 into ads. A one-minute report being directionally right about all three is a little embarrassing, but that's the point of writing these posts.

The report is a starting point, not an oracle. The competitor list needs your own homework, and no tool can tell you whether you specifically can pull an idea off. What it replaces is the messy "is this a good idea?" chat thread that flatters you into building — and it replaces it with subscores, thresholds, and a pessimistic case you can be graded against later. Four months in, I've been graded against mine. Mixed results.

If you've got an app idea sitting in a notes file, running it through something like IdeaGrit before you write code is a lot cheaper than finding out the way I did. And if you've used AI validation tools on an idea you later shipped, I'd genuinely like to hear how the prediction held up.