Methodology
Last updated: 13 April 2026
PredictJEE uses a 10-signal adaptive prediction framework (long-horizon frequency, short-horizon frequency, trend, cycle, gap-boost, over-exposure, recency, short-trend, short-cycle, momentum) trained year-by-year on JEE Main 2015–2025 and validated on the held-out next year. April 2026 validation: 88% precision on the top-25 predicted question types.
PredictJEE is built on a single premise: the JEE Main examination follows statistically identifiable patterns. By analysing historical data systematically, we can estimate which topics and question types are most likely to appear in upcoming sessions.
We currently cover Physics, Mathematics, and Chemistry, each with its own tailored prediction engine. This page explains our general approach, the data we use, how we validate our predictions, and where our limitations lie.
1. Data Foundation
Our models are trained on 15,000+ real JEE Main questions scraped from official sources, spanning every session from 2010 to 2026. Each question is classified into a specific question type (internally called a pattern), a category more granular than a chapter.
For example, the chapter "Waves" in Physics contains distinct question types such as Standing Waves, Doppler Effect, and Superposition of Waves. The chapter "Matrices" in Mathematics contains Determinants, Matrix Inverse, and System of Linear Equations. Predicting at the question-type level is more actionable than predicting at the chapter level because it tells you exactly what type of problem to prepare for.
- 15,000+ questions analysed
- 345 unique question types tracked
- 17 years of exam history (2010-2026)
2. How the Prediction Engine Works
Each question type is scored by a proprietary multi-signal framework that analyses multiple dimensions of exam behaviour simultaneously. The signals are combined into a composite score that determines how likely a question type is to appear in the next session.
At a high level, the engine considers four categories of evidence:
Historical Strength
How consistently has this question type appeared across past exams? Question types that show up reliably year after year are statistically more likely to appear again. The engine measures both raw frequency and consistency over time.
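As a rough illustration of the two measurements mentioned above, the sketch below computes a raw frequency and a consistency figure for one question type. The function name, the definitions chosen (mean questions per year, share of years with at least one appearance), and the counts are assumptions made for this example; they are not PredictJEE's actual signals or data.

```python
def historical_strength(appearances, years):
    """Return (frequency, consistency) for one question type.

    appearances: dict mapping year -> number of questions of this type.
    years: the years to evaluate over.
    Both definitions are illustrative assumptions.
    """
    counts = [appearances.get(year, 0) for year in years]
    # Raw frequency: average number of questions per year.
    frequency = sum(counts) / len(years)
    # Consistency: share of years in which the type appeared at all.
    consistency = sum(1 for c in counts if c > 0) / len(years)
    return frequency, consistency


# Made-up counts for a hypothetical question type, for illustration only.
years = list(range(2015, 2026))
toy_counts = {2015: 1, 2016: 2, 2018: 1, 2019: 1, 2021: 3,
              2022: 1, 2023: 1, 2024: 2, 2025: 1}
freq, cons = historical_strength(toy_counts, years)
```

Separating the two numbers matters because a type that appears three times in one year and never again has the same frequency as one that appears once a year for three years, but a very different consistency.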
Momentum and Trends
Is this question type becoming more or less common over recent years? The engine detects rising and falling trends to distinguish question types that are gaining prominence from those being phased out.
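One common way to detect a rising or falling trend is the slope of a least-squares line fitted to yearly counts. The sketch below assumes that approach; whether the engine uses it is not stated here, and the function name is hypothetical.

```python
def trend_slope(counts):
    """Least-squares slope of yearly counts (oldest first).

    Positive means the type is becoming more common, negative less common.
    Illustrative only; not necessarily the engine's trend signal.
    """
    n = len(counts)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = sum(counts) / n
    numerator = sum((i - x_mean) * (c - y_mean) for i, c in enumerate(counts))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


rising = trend_slope([0, 1, 2, 3])
falling = trend_slope([3, 2, 1, 0])
flat = trend_slope([2, 2, 2])
```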
Gap and Cycle Analysis
When was this question type last tested, and does it follow a cyclical appearance schedule? The National Testing Agency (NTA) tends to rotate question types, and those absent for multiple years often make a comeback. The engine identifies these rotation cycles and predicts when a question type is due to return.
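A minimal sketch of gap and cycle detection, under the assumption that the "cycle" is the median interval between past appearances and that a type is "due" once its current gap reaches that interval. Both rules and all names are hypothetical stand-ins for whatever the engine does internally.

```python
from statistics import median


def gap_and_cycle(years_seen, target_year):
    """Return (gap, cycle) for one question type.

    gap: years between the last appearance and target_year.
    cycle: median interval between consecutive appearances, or None
           if the type has appeared fewer than two times.
    """
    seen = sorted(set(years_seen))
    if not seen:
        return None, None
    gap = target_year - seen[-1]
    intervals = [later - earlier for earlier, later in zip(seen, seen[1:])]
    cycle = median(intervals) if intervals else None
    return gap, cycle


def is_due(gap, cycle):
    """Assumed rule: a type is due when its gap has reached its usual cycle."""
    return gap is not None and cycle is not None and gap >= cycle


gap, cycle = gap_and_cycle([2016, 2019, 2022], target_year=2025)
due = is_due(gap, cycle)
```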
Saturation Detection
Has this question type been over-tested recently? NTA avoids repeating the same question types excessively. Those that have appeared many times in recent sessions are statistically less likely to appear again immediately.
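Saturation can be approximated as the share of recent sessions in which a question type appeared. The window size and the function below are assumptions for illustration, not the engine's actual over-exposure signal.

```python
def saturation(appeared_by_session, window=6):
    """Share of the most recent `window` sessions in which the type appeared.

    appeared_by_session: list of booleans, oldest session first.
    A value near 1.0 suggests the type has been tested heavily of late.
    The window of 6 sessions is an arbitrary illustrative choice.
    """
    recent = appeared_by_session[-window:]
    if not recent:
        return 0.0
    return sum(1 for appeared in recent if appeared) / len(recent)


heavily_tested = saturation([True] * 6)
rarely_tested = saturation([False, True, False, False])
```

A score like this would typically enter a composite with a negative weight, so that heavily tested types are ranked slightly lower for the next session.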
Signal weights are not fixed. They are calibrated through multi-year backtesting, allowing each subject's engine to adapt to the unique statistical behaviour of that discipline. The exact signal composition, weighting methodology, and calibration process are proprietary.
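Because the actual signal composition and weighting are proprietary, the following is only a generic sketch of how several signals could be folded into one composite score with subject-specific weights. The signal names, values, and weights are invented for the example.

```python
def composite_score(signals, weights):
    """Weighted combination of signal values, normalised by total |weight|.

    signals: dict signal name -> value (assumed to be scaled to 0..1).
    weights: dict signal name -> weight; negative weights penalise.
    This is a generic weighted sum, not PredictJEE's formula.
    """
    total_weight = sum(abs(w) for w in weights.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(w * signals.get(name, 0.0) for name, w in weights.items())
    return weighted / total_weight


# Hypothetical weights for one subject; another subject could use different ones.
example_weights = {"frequency": 2.0, "trend": 1.0, "saturation": -1.0}
example_signals = {"frequency": 0.8, "trend": 0.5, "saturation": 1.0}
score = composite_score(example_signals, example_weights)
```

Keeping the weights in a separate dictionary per subject is one simple way to let each engine be calibrated independently.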
3. Data Integrity
Prediction accuracy depends on answer accuracy. We have run 10 independent verification passes on our question bank, including:
- Blind re-solving by 5 independent AI models (cross-verified against each other)
- Web scraping verification against official answer keys and trusted educational sources
- Manual expert review of every disputed answer
- Full re-scrape of all question options to eliminate scraper-introduced errors
After 10 verification passes, our estimated answer accuracy across all subjects is ~99.5%.
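The cross-verification step can be pictured as a vote among independent solvers, with low-agreement questions routed to manual review. The sketch below assumes five answers per question and an agreement threshold of four; the threshold and function name are hypothetical.

```python
from collections import Counter


def cross_verify(answers, min_agreement=4):
    """Return (accepted_answer, disputed) for one question.

    answers: the option chosen by each independent solver, e.g. ["B", "B", ...].
    min_agreement: assumed number of matching answers needed to accept.
    Disputed questions would go to manual expert review.
    """
    if not answers:
        return None, True
    top_answer, votes = Counter(answers).most_common(1)[0]
    if votes >= min_agreement:
        return top_answer, False
    return None, True


agreed = cross_verify(["B", "B", "B", "B", "C"])
disputed = cross_verify(["A", "B", "B", "C", "D"])
```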
4. Tier System
After scoring, all question types are ranked from highest to lowest composite score and assigned to one of three tiers.
Tier 1: Highest Confidence
The most statistically reliable predictions. Expect the vast majority of these question types to appear. These should form the core of your preparation.
Tier 2: Strong Candidates
Strong secondary targets with solid statistical support. These round out a thorough study plan.
Tier 3: Moderate Probability
Supplementary study material. These question types have lower statistical support but can still appear, particularly when NTA introduces rotation surprises.
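Ranking and tier assignment can be sketched as a sort followed by rank cutoffs. The tier sizes used here are placeholders, since the actual cutoffs are not stated on this page, and the question-type scores are invented.

```python
def assign_tiers(scores, tier1_size=25, tier2_size=25):
    """Rank question types by composite score and assign tiers 1 to 3.

    scores: dict question type -> composite score.
    tier1_size / tier2_size: assumed cutoffs, not the platform's real ones.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    tiers = {}
    for rank, (name, _score) in enumerate(ranked):
        if rank < tier1_size:
            tiers[name] = 1
        elif rank < tier1_size + tier2_size:
            tiers[name] = 2
        else:
            tiers[name] = 3
    return tiers


# Toy scores with tiny tier sizes so the result is easy to read.
toy_scores = {"Doppler Effect": 0.91, "Standing Waves": 0.74,
              "Superposition of Waves": 0.40, "Determinants": 0.62}
toy_tiers = assign_tiers(toy_scores, tier1_size=1, tier2_size=2)
```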
5. Backtested Performance
Every prediction engine is validated through rigorous backtesting. We simulate predictions for past exam sessions using only data available before that session, then measure how many of our predictions actually appeared.
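The procedure described above is often called walk-forward backtesting. A minimal sketch follows, assuming one exam per year and using a simple "most frequent so far" baseline as a stand-in predictor; the real engine, data layout, and top-k size are not reproduced here.

```python
from collections import Counter


def most_frequent_baseline(train, top_k):
    """Stand-in predictor: the top_k most frequent types in the training years."""
    counts = Counter(qtype for year in train for qtype in train[year])
    return [qtype for qtype, _count in counts.most_common(top_k)]


def walk_forward_backtest(history, predict, top_k=25, min_train_years=3):
    """For each target year, predict from strictly earlier years only.

    history: dict year -> list of question types that appeared that year.
    Returns dict target year -> share of predicted types that appeared.
    """
    years = sorted(history)
    results = {}
    for i in range(min_train_years, len(years)):
        target = years[i]
        train = {year: history[year] for year in years[:i]}  # no future data
        predicted = set(predict(train, top_k))
        actual = set(history[target])
        results[target] = len(predicted & actual) / len(predicted) if predicted else 0.0
    return results


# Toy history with placeholder type names, for illustration only.
toy_history = {2020: ["a", "b"], 2021: ["a", "b", "c"],
               2022: ["a", "b"], 2023: ["a", "d"]}
toy_results = walk_forward_backtest(toy_history, most_frequent_baseline,
                                    top_k=2, min_train_years=2)
```

The key property is the slice `years[:i]`: the predictor never sees the target year or anything after it, which is what keeps the measured accuracy honest.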
Physics
- Top 25 Coverage: 21/25 question types matched
- Marks Coverage: 84% of 100 Physics marks
- F1 Score: 87.7% (mature year average)
- Hit Rate: 94.6% (holdout accuracy)
Mathematics
- Top 25 Coverage: 24/25 question types matched
- Holdout Hit Rate: 97.9% (46/47 question types)
- F1 Score: 81-84% (mature years, 2022-2025)
- Answer Accuracy: 99.5% (10 verification passes)
Chemistry
- Question Types: 75+ across all chapters
- Top 25 Precision: 88% (backtested accuracy)
- Questions Verified: 464 re-scraped and audited
F1 score is the harmonic mean of precision (the share of predicted question types that actually appeared) and recall (the share of question types that appeared which we had predicted). Hit rate measures the proportion of our predicted question types that actually appeared in the exam.
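The metric definitions above can be made concrete with a short calculation over sets of question types. The function name and the example sets are illustrative; the formulas themselves are the standard ones for precision, recall, and F1.

```python
def precision_recall_f1(predicted, actual):
    """Compute precision, recall, and F1 for sets of question types."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Four predictions, three types actually asked, two in common.
p, r, f1 = precision_recall_f1({"a", "b", "c", "d"}, {"a", "b", "e"})
```

In this toy case precision is 2/4, recall is 2/3, and F1 is 4/7, which shows how F1 sits between the two but closer to the lower value.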
6. Limitations and Disclaimer
Important: Read before relying on predictions
- Past performance does not guarantee future results. Backtested accuracy is measured on historical data. Future sessions may deviate significantly from historical patterns.
- NTA can introduce new question types at any time. Our model can only predict question types it has observed before. Entirely novel question types or topics added to the syllabus will not appear in our rankings.
- Predictions are statistical, not deterministic. A Tier 1 ranking means high probability, not certainty. In any given session, some Tier 1 question types will not appear, and some Tier 3 question types will.
- Use predictions as a supplement, not a replacement. PredictJEE is designed to help you prioritise, not to narrow your syllabus. Comprehensive preparation remains the most reliable strategy.
- We are not affiliated with NTA. PredictJEE is an independent platform. We have no access to unreleased examination content or insider information of any kind.
7. Questions
If you have questions about our methodology or want to understand how a specific question type was scored, contact us at [email protected].