Age & Gender Predictor
Guess age and gender from any photo with AI. Detects multiple faces, runs offline in browser. Powered by face-api.js — no upload, no signup.
About the AI Age & Gender Predictor
This Age & Gender Predictor estimates the apparent age (in years) and predicted gender of every face in an uploaded photo. It runs entirely on your device using face-api.js, an open-source TensorFlow.js port of established deep-learning models for face detection and attribute estimation. No image data is uploaded to any server: detection, embedding, age regression and gender classification all execute as JavaScript in your browser. After the first visit (~5 MB of model weights are cached) the tool works offline.
Use it for casual exploration — guessing how old a face looks, demonstrating computer vision in classrooms, prototyping features for a personal project, or checking that a photo dataset has roughly balanced demographics. It is a fun, fast tool that gives you a number quickly. It is not a biometric authentication system, an identity verifier, an age-gate for adult content, or evidence for legal, medical, employment, or insurance decisions. Treat the output as a probabilistic estimate from a model that was trained on a finite dataset and that inherits the biases of that dataset.
The tool detects multiple faces in a single image and reports each with a bounding box, an estimated age in years, and a predicted gender label (male / female) accompanied by a confidence score. Best results require well-lit, front-facing portraits with the face occupying a meaningful portion of the frame. Heavy makeup, beards, sunglasses, masks, side angles, motion blur, very low resolution, or strong shadows all degrade accuracy. Babies and young children are systematically over-estimated by most public models because training corpora skew toward adults. Older adults (70+) are often under-estimated for the same reason.
On ethics and bias: face-api.js inherits the limitations of its training data — primarily IMDB-WIKI for age and gender — which over-represents lighter-skinned, North-American/European, professionally photographed adults. NIST FRVT, MIT Media Lab Gender Shades, and many academic studies have documented systematically higher error rates for darker skin tones and for non-binary gender expression. Two-class male/female classification is itself a coarse simplification of real human gender. We provide this tool to make face analysis less mysterious, not to authorize sensitive decisions about real people. Do not use the output to allow or deny anyone access to a service, premises, content, or right.
Privacy is by design rather than by promise: because all model code is shipped to the browser and all inference runs locally, your image bytes never travel over the network. The page itself is served over HTTPS; standard analytics record only the URL visited, not photo contents. We do not store, log, sell, or share the images you analyze. Closing the tab clears all in-memory data.
How the prediction works
Inference proceeds in three stages. First, face detection: face-api.js uses an SSD MobileNetV1 detector trained on the WIDER FACE dataset (and optionally a Tiny Face Detector for low-resource devices). The detector outputs a list of bounding boxes with confidence scores; a non-maximum-suppression step removes overlaps. The library also supports MTCNN — a three-stage cascade (P-Net, R-Net, O-Net) introduced by Zhang et al. (2016) — which is more accurate but slower; the default model balances accuracy and speed for in-browser execution.
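The non-maximum-suppression step mentioned above can be sketched as a greedy filter over scored boxes. This is an illustrative sketch, not face-api.js's internal code: the `Detection` shape, the function names, and the 0.5 defaults for `minScore` and `maxIou` are assumptions chosen for the example.

```typescript
// Axis-aligned box in image pixels plus the detector's confidence score.
// (Hypothetical shape for illustration; not the face-api.js type.)
interface Detection {
  x: number;
  y: number;
  width: number;
  height: number;
  score: number;
}

// Intersection-over-union: 0 for disjoint boxes, 1 for identical boxes.
function iou(a: Detection, b: Detection): number {
  const left = Math.max(a.x, b.x);
  const top = Math.max(a.y, b.y);
  const right = Math.min(a.x + a.width, b.x + b.width);
  const bottom = Math.min(a.y + a.height, b.y + b.height);
  const intersection = Math.max(0, right - left) * Math.max(0, bottom - top);
  const union = a.width * a.height + b.width * b.height - intersection;
  return union > 0 ? intersection / union : 0;
}

// Greedy NMS: visit boxes from highest to lowest score and keep a box only
// if it does not overlap an already-kept box by more than maxIou.
function nonMaxSuppression(
  detections: Detection[],
  minScore = 0.5, // assumed confidence threshold
  maxIou = 0.5    // assumed overlap threshold
): Detection[] {
  const candidates = detections
    .filter((d) => d.score >= minScore)
    .sort((a, b) => b.score - a.score);
  const kept: Detection[] = [];
  for (const candidate of candidates) {
    if (kept.every((k) => iou(candidate, k) <= maxIou)) {
      kept.push(candidate);
    }
  }
  return kept;
}
```

Raising `minScore` trades missed faces for fewer false positives; lowering `maxIou` merges near-duplicate boxes more aggressively but can drop one of two faces that genuinely overlap.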
Second, alignment: each detected face is cropped, optionally aligned by predicting 68 facial landmarks (eyes, nose tip, mouth corners, jawline) so that the eyes lie horizontally. Aligning the face improves attribute prediction because the regression network was trained on aligned crops. The landmark detector is a small ConvNet trained on the iBUG 300-W dataset.
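As a rough illustration of the alignment idea, the roll angle of a face can be estimated from the two eye centres and then undone by rotating around one of them. The sketch below assumes the common 0-indexed 68-point layout in which indices 36-41 outline one eye and 42-47 the other; the `Landmark` type and function names are hypothetical, and the real library may align differently.

```typescript
// A 2D landmark in image-pixel coordinates (hypothetical type for this sketch).
interface Landmark {
  x: number;
  y: number;
}

// Average position of a group of landmarks.
function centroid(points: Landmark[]): Landmark {
  const sum = points.reduce(
    (acc, p) => ({ x: acc.x + p.x, y: acc.y + p.y }),
    { x: 0, y: 0 }
  );
  return { x: sum.x / points.length, y: sum.y / points.length };
}

// Angle (radians) of the line joining the two eye centres, measured from the
// horizontal image axis. Assumes indices 36-41 and 42-47 are the two eyes.
function eyeRollRadians(landmarks: Landmark[]): number {
  if (landmarks.length !== 68) {
    throw new Error("expected 68 landmarks");
  }
  const eyeA = centroid(landmarks.slice(36, 42));
  const eyeB = centroid(landmarks.slice(42, 48));
  return Math.atan2(eyeB.y - eyeA.y, eyeB.x - eyeA.x);
}

// Rotate a point around a centre by the given angle (radians).
function rotateAround(p: Landmark, center: Landmark, angle: number): Landmark {
  const cos = Math.cos(angle);
  const sin = Math.sin(angle);
  const dx = p.x - center.x;
  const dy = p.y - center.y;
  return {
    x: center.x + dx * cos - dy * sin,
    y: center.y + dx * sin + dy * cos,
  };
}
```

Rotating every landmark (or the image crop) by `-eyeRollRadians(landmarks)` puts both eye centres on the same horizontal line, which is the condition described above.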
Third, attribute estimation: the aligned crop is fed through a shared feature-extractor backbone followed by two heads. The age head is a regression: it directly outputs a single floating-point number in years, trained with mean-squared error against IMDB-WIKI labels (Rothe, Timofte & Van Gool, 2015–2018). The gender head is a binary classifier that produces a probability for each of the two classes; we report the more probable label and its softmax score as the confidence. The shared backbone is a compact, SSR-Net-inspired network (Yang et al., 2018) that is small enough to run smoothly on phones.
All three networks ship with quantized weights, which the in-browser TensorFlow.js runtime expands to 32-bit floats for inference. They run on WebGL when available (GPU-accelerated) or fall back to CPU via WebAssembly. Total weight size is roughly 5–10 MB; the browser caches the weights, so repeat visits skip the download. Per-face inference takes 50–300 ms on a modern laptop, longer on mobile. The whole pipeline (detection, landmarks, attributes) is sequential, and multiple faces in one image are processed one after another in a tight loop, not in parallel.
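To make the gender head's "more probable label plus softmax score" concrete, here is a minimal sketch for two raw scores (logits). The `[male, female]` output order, the type names, and the function names are assumptions for illustration; the actual model's output layout may differ.

```typescript
type GenderLabel = "male" | "female";

interface GenderResult {
  label: GenderLabel;
  confidence: number; // softmax probability of the reported label, in [0.5, 1]
}

// Numerically stable softmax for exactly two logits.
function softmax2(logits: [number, number]): [number, number] {
  const [a, b] = logits;
  const max = Math.max(a, b);
  const expA = Math.exp(a - max);
  const expB = Math.exp(b - max);
  const total = expA + expB;
  return [expA / total, expB / total];
}

// Pick the more probable class. Assumes logits are ordered [male, female].
function classifyGender(logits: [number, number]): GenderResult {
  const [pMale, pFemale] = softmax2(logits);
  return pMale >= pFemale
    ? { label: "male", confidence: pMale }
    : { label: "female", confidence: pFemale };
}
```

Because the two probabilities sum to 1, the reported confidence can never fall below 0.5, which is why values near 50% signal that the model has essentially no preference.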
The bounding box returned is in the original image coordinates, so we draw it directly on a canvas overlaid on the input. The age regression value is rounded to the nearest integer for display. Gender confidence is reported as a percentage; values close to 50% indicate the model has very low confidence and the label should be ignored or treated as 'unknown'.
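The display rules above (round the age, show gender confidence as a percentage, suppress near-50% labels) could look roughly like the following. The `FacePrediction` shape and `formatPrediction` name are hypothetical, and the 0.7 cut-off is an assumed default that mirrors the recommendation in the FAQ on this page rather than a value taken from the tool's source.

```typescript
// Hypothetical per-face result shape for this sketch.
interface FacePrediction {
  age: number;               // raw regression output in years
  gender: "male" | "female"; // more probable class
  genderProbability: number; // probability of that class, 0.5..1
}

// Build the display string for one face. Labels whose probability is below
// ambiguityThreshold are shown as "unknown" instead of male/female.
function formatPrediction(
  p: FacePrediction,
  ambiguityThreshold = 0.7 // assumed cut-off
): string {
  const years = Math.round(p.age);
  const percent = Math.round(p.genderProbability * 100);
  const label = p.genderProbability >= ambiguityThreshold ? p.gender : "unknown";
  return `${years} years, ${label} (${percent}%)`;
}
```

For example, a raw age of 31.6 with a 0.93 female probability would display as "32 years, female (93%)", while a 0.52 probability would display the label as "unknown".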
Accuracy, limitations, and ethical use
On well-lit front-facing adult portraits at decent resolution, age estimates are typically within ±5 to ±8 years of the true age, and gender classification confidence above 90% is reliable in the male/female sense the model was trained on. These figures degrade markedly outside that operating envelope. The IMDB-WIKI evaluation paper reports a Mean Absolute Error around 3.5 years for the original DEX (Deep EXpectation) network on its in-distribution test set; in-the-wild performance is worse. Treat any single prediction as an estimate, not a measurement.
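For readers unfamiliar with the metric, Mean Absolute Error is simply the average size of the miss, ignoring direction. The sketch below computes it, along with the share of predictions that land within a given tolerance; the function names are hypothetical and any numbers you pass in are your own test data, not benchmark results.

```typescript
// Mean Absolute Error: average of |predicted - actual| over all samples.
function meanAbsoluteError(predicted: number[], actual: number[]): number {
  if (predicted.length !== actual.length || predicted.length === 0) {
    throw new Error("inputs must be non-empty and the same length");
  }
  const total = predicted.reduce(
    (sum, p, i) => sum + Math.abs(p - actual[i]),
    0
  );
  return total / predicted.length;
}

// Fraction of predictions whose absolute error is at most toleranceYears.
function fractionWithin(
  predicted: number[],
  actual: number[],
  toleranceYears: number
): number {
  if (predicted.length !== actual.length || predicted.length === 0) {
    throw new Error("inputs must be non-empty and the same length");
  }
  const hits = predicted.filter(
    (p, i) => Math.abs(p - actual[i]) <= toleranceYears
  ).length;
  return hits / predicted.length;
}
```

An MAE of 3.5 years does not mean every prediction is within 3.5 years: a few large misses can coexist with many small ones, which is why a tolerance-based view is a useful companion.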
Critically, accuracy varies by demographic. Multiple audits, including Buolamwini & Gebru's Gender Shades (2018), NIST FRVT 1:1 (ongoing), and Raji et al. (2020), have shown that face-analysis models trained on Western, light-skinned datasets produce significantly higher error rates for women, darker-skinned subjects, and people whose gender presentation does not match a binary male/female norm. These aren't minor differences: Gender Shades measured error rates as high as 34.7% for darker-skinned women versus at most 0.8% for lighter-skinned men across the commercial gender classifiers it audited. face-api.js is not exempt from these issues.
Do not use this tool for any decision that affects a person's rights, opportunities, money, or safety. That includes — but is not limited to — verifying age for purchasing alcohol, tobacco, or adult content; gating access to age-restricted services; filtering job applicants; verifying identity for financial or legal transactions; medical diagnosis or triage; immigration or border control; surveillance, profiling, or law enforcement; targeted advertising based on inferred gender. For any such use case you need a calibrated, accountable, audited system, not a free demo. The authors of face-api.js, the original model papers, and WuTools all explicitly disclaim suitability for those uses.
- Age estimates are typically ±5 to ±10 years on adults; much wider on children and seniors who are under-represented in training data.
- The model produces a binary male/female label and cannot represent non-binary, intersex, transgender, or fluid gender identities.
- Accuracy degrades on darker skin tones, non-frontal angles, occluded faces (sunglasses, masks, hands), and low-resolution or poorly lit images.
- Heavy makeup, beards, hijabs, cosmetic surgery, or anti-aging treatments can dramatically shift age and gender predictions.
- Children under 5 are often estimated as 8–12 years old; adults over 70 are often under-estimated by 5–15 years.
- Faces that are covered by a VR headset, partially occluded, seen in profile, or showing extreme expressions may not be detected at all.
- The tool cannot match the same person across photos — for that, see our face similarity meter.
- Outputs are unsuitable for legal age verification, biometric identity, employment screening, medical diagnosis, or law-enforcement use.
Glossary
- Face detection: Locating where in an image faces appear, typically reported as axis-aligned bounding boxes with confidence scores. Distinct from face recognition, which would identify whose face it is.
- Bounding box: A rectangle, given as (x, y, width, height), that encloses a detected face in image-pixel coordinates.
- Facial landmark: A specific anatomical point on the face, such as an outer eye corner, the nose tip, a mouth corner, or a jawline point. This tool uses the 68-point iBUG scheme to align faces before attribute prediction.
- Regression model: A neural network that outputs a continuous number (here, age in years) rather than a class label. Trained by minimising mean-squared error against ground-truth ages.
- Classification model: A neural network that outputs a probability over a fixed set of categories (here, two: male and female). Confidence is the softmax score on the predicted class.
- Model inference: Running a trained neural network on new input to produce predictions. Distinct from training, which is the offline learning phase. This tool only does inference; the model was trained elsewhere on IMDB-WIKI.
- ONNX / TensorFlow.js: Technologies for exchanging and executing neural networks. ONNX is an open exchange format for trained models; TensorFlow.js is a runtime that executes models in JavaScript, optionally GPU-accelerated via WebGL or WebGPU. face-api.js uses TensorFlow.js.
- MTCNN: Multi-task Cascaded Convolutional Network. A face-detection algorithm by Zhang et al. (2016) that runs three small networks in sequence (P-Net, R-Net, O-Net) and jointly predicts bounding boxes plus five facial landmarks.
Frequently Asked Questions
How does the AI estimate my age?
It runs face-api.js (TensorFlow.js port) in your browser. After locating your face with an SSD-MobileNet detector, it aligns the crop using 68 facial landmarks and feeds it through a regression network trained on IMDB-WIKI to output a single number — apparent age in years. The whole pipeline runs offline in JavaScript; nothing is uploaded.
How accurate is the age estimate?
On well-lit frontal adult portraits, the published DEX/IMDB-WIKI Mean Absolute Error is around 3.5 years on benchmark sets, and ±5 to ±10 years is realistic in-the-wild. Children, seniors, side angles, low-resolution images, heavy makeup, and people with darker skin tones tend to see larger errors because of training-data bias.
Can it detect multiple faces?
Yes. The detector returns every face above a configurable confidence threshold; each is processed independently and gets its own bounding box, age estimate, and gender label. There is no hard limit, but very small faces may be missed.
Are my photos private?
Yes. All inference happens in your browser via TensorFlow.js. The neural-network weights are downloaded once (~5 MB, cached) and the inference itself runs locally on the image you select. Your image bytes never leave your device. We do not store, log, or share photos.
Why does the model only output 'male' or 'female'?
Because that's how it was trained — IMDB-WIKI labels gender as a binary attribute. We acknowledge this is a crude simplification of real human gender identity and we cannot accurately detect non-binary, transgender, or fluid gender expression. Treat the binary output as the model's guess based on training-set statistics, not a fact about the person.
Is this safe for age verification?
No. Do not use this tool to gate alcohol, tobacco, gambling, or adult content. Even in good conditions the estimate can be off by 5–10 years, and statutory age verification typically requires a calibrated, audited, regulator-approved system. NIST FRVT, ICO/UK and EU AI Act guidance all warn against using off-the-shelf face analysis for compliance use.
Why is the model wrong on my photo?
Common causes: (1) darker skin tones are under-represented in IMDB-WIKI; (2) the photo is non-frontal, blurry, or low resolution; (3) the face is partially occluded by glasses, masks, hands, or hair; (4) heavy makeup, a beard, or plastic surgery changes the apparent age; (5) predictions for children and very old adults are systematically off. Try a different photo and check that the bounding box is on the correct face.
Does it identify who the person is?
No. The model outputs only a numeric age estimate and a male/female label. It does not match the face to a database, look up identity, or recognise specific individuals. For face matching see our Face Similarity Meter, which compares two faces — also entirely offline.
What model architecture is used?
Face detection: SSD MobileNetV1 (or optional Tiny Face Detector / MTCNN). Landmark detection: 68-point ConvNet. Age regression and gender classification: a shared feature backbone in the SSR-Net family, trained on IMDB-WIKI plus UTKFace. All weights are quantized for in-browser TensorFlow.js.

Can I use this commercially?
The tool itself is free, and face-api.js is MIT-licensed, which permits commercial use; the underlying model papers have their own usage notes. More importantly, deploying any face-analysis system in a product can bring you within the scope of privacy and biometric laws such as GDPR (EU), CCPA (California), and Illinois BIPA, even if everything runs locally. Get legal advice before shipping a product based on this.
References & academic sources
- Zhang, K., Zhang, Z., Li, Z., & Qiao, Y. (2016). Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks (MTCNN). IEEE Signal Processing Letters.
- Rothe, R., Timofte, R., & Van Gool, L. (2018). DEX: Deep EXpectation of Apparent Age from a Single Image (IMDB-WIKI dataset). International Journal of Computer Vision.
- Yang, T.-Y., Huang, Y.-H., Lin, Y.-Y., Hsiu, P.-C., & Chuang, Y.-Y. (2018). SSR-Net: A Compact Soft Stagewise Regression Network for Age Estimation. IJCAI.
- Buolamwini, J., & Gebru, T. (2018). Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. Proceedings of Machine Learning Research.
- Grother, P., Ngan, M., & Hanaoka, K. (2024). NIST Face Recognition Vendor Test (FRVT): ongoing benchmark of commercial face systems. U.S. National Institute of Standards and Technology.
- Mühler, V. (2020). face-api.js: JavaScript API for Face Detection and Recognition in the Browser. Open-source library, MIT licence.
Reviewed by WuTools AI Ethics & Engineering Team
Frequently Asked Questions
Does the age and gender prediction run in my browser or are my photos sent to a server?
Everything runs locally in your browser. The face detector (SSD-MobileNet), the 68-point landmark model, and the age and gender heads (face-api.js on TensorFlow.js) are downloaded once, and then every prediction is computed on-device using WebGL, WebGPU, or WebAssembly. Your photos and the predicted age/gender labels never leave your device. This is essential because predicted demographics combined with a photo can be considered sensitive personal data under GDPR. We do not log, store, or transmit any image or prediction; you can verify with DevTools that no POST request fires after the model files have loaded.
What image conditions give the most accurate age estimate?
For the best apparent-age estimate, use a frontal photo with even daytime lighting, the face filling at least a 200x200 pixel region, no sunglasses, no heavy makeup, no filters or beautification, neutral expression, and the head untilted. Side profiles, harsh shadows, masks, hats covering the forehead, smiling broadly, and Snapchat/Instagram beauty filters can shift the predicted age by 5-15 years. The model was trained on IMDB-WIKI, a celebrity photoset that skews towards adults aged 20-60 in posed lighting, so children, very elderly adults, and casual snapshots tend to have higher error.
How accurate is the predicted age compared to my real age?
On the published DEX/IMDB-WIKI benchmark, age regression CNNs achieve a Mean Absolute Error around 3.5-5 years on apparent age across adults aged 20-60 in well-lit frontal photos. Performance drops for children (training data is sparse below age 15) and the elderly (sparse above age 80), where errors of 8-15 years are common. The model predicts apparent age (how old you look), not chronological age, so makeup, lighting, hairstyle, and image quality matter as much as your actual birthday. Two photos of the same person taken minutes apart in different conditions can easily differ by 5+ years in the prediction.
How does the gender prediction work and is it binary?
The gender head is a small two-output softmax classifier returning a probability for "male" and "female" based on the same aligned 64-dimensional face embedding used for age. The output is binary by training-data design (IMDB-WIKI labels) — there is no non-binary or "unknown" class. The classifier expresses uncertainty via the probability: a face the model is unsure about might return 0.52 male / 0.48 female. We recommend treating predictions below about 0.7 confidence as ambiguous and not surfacing them as labels. This model captures apparent gender presentation in the photo, not the subject's self-identified gender.
Is WebGPU faster than WebAssembly for age/gender prediction?
Yes. The detection + landmark + age + gender pipeline involves several convolutional networks. On WebGPU, the full pipeline completes in 50-200 ms per face on a typical laptop; on WebAssembly with SIMD, it takes 300-1500 ms; on plain WebAssembly (older browsers, no SIMD) it can take 2-5 seconds. For batch processing of many photos or live webcam mode, WebGPU is essential to keep the UI responsive. The tool autodetects backend support at startup and uses the fastest available; you can check the active backend in the browser console.
Can I use this in real-time on a webcam stream?
Yes, with caveats. On WebGPU with a small detector input (320x240) the tool sustains 15-30 FPS on a typical laptop, which feels smooth for live preview. On WebAssembly-CPU expect 2-10 FPS — usable as a slideshow but jerky for video. To improve frame rate: reduce the detector input resolution, throttle predictions to every Nth frame, run prediction only when the face moves (motion detection from frame differences), or use a lighter detector like BlazeFace from MediaPipe. Keep in mind that live demographic prediction raises stronger privacy questions than one-shot prediction — even though all of it runs locally.
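Two of the frame-rate tactics listed above, running on every Nth frame and skipping frames without motion, can be combined in a small gate function. This is a sketch under assumptions: frames are taken to be flat arrays of 0-255 pixel values (for example a downscaled grayscale copy), and the function names, `everyNth = 3`, and `motionThreshold = 8` are illustrative values, not settings from the tool.

```typescript
// Mean absolute per-pixel difference between two equally sized frames.
function meanAbsDifference(
  prev: ArrayLike<number>,
  curr: ArrayLike<number>
): number {
  if (prev.length !== curr.length || curr.length === 0) {
    throw new Error("frames must be non-empty and the same size");
  }
  let total = 0;
  for (let i = 0; i < curr.length; i++) {
    total += Math.abs(curr[i] - prev[i]);
  }
  return total / curr.length;
}

// Decide whether to run the (expensive) prediction on this frame.
// Runs only on every Nth frame, and only if there is no previous frame to
// compare against or the frame changed by more than motionThreshold on average.
function shouldRunPrediction(
  frameIndex: number,
  prev: ArrayLike<number> | null,
  curr: ArrayLike<number>,
  everyNth = 3,        // assumed throttle factor
  motionThreshold = 8  // assumed mean pixel change, on a 0-255 scale
): boolean {
  if (frameIndex % everyNth !== 0) {
    return false;
  }
  if (prev === null) {
    return true;
  }
  return meanAbsDifference(prev, curr) > motionThreshold;
}
```

In a webcam loop, the caller would keep the last analysed frame as `prev` and reuse the previous predictions for every frame where this returns `false`.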
Which architecture is used — face-api.js, MediaPipe, or DeepFace?
The default pipeline is face-api.js / @vladmandic/face-api (TensorFlow.js port), combining an SSD-MobileNet v1 face detector, a 68-point landmark regressor, and two small heads on top of a shared face-feature backbone for age and gender. The age head is a single-output regression network fine-tuned from a DEX classifier; the gender head is a softmax with two outputs. MediaPipe Face Mesh plus a custom demographic classifier is an alternative path used by some apps. DeepFace (the Python library) wraps multiple architectures including VGG-Face, Facenet, and OpenFace; most are too large for browsers but exist as research baselines. The face-api.js stack is the de-facto browser standard because of its accuracy/size balance.
Why does the same photo give a different age when I re-run it, and is this a bug?
For age and gender the prediction is deterministic: the same aligned crop and the same weights produce the same output every time. If you see a different number on re-run, the most likely cause is that the face detector picked a slightly different bounding box (detection ends with non-max suppression, and tiny floating-point differences can flip which detection wins). Reload-induced differences in tensor compilation, lossy re-encoding (for example, saving a PNG as a JPEG), or pasting from the clipboard at a different size can also shift the predicted age by 1-3 years. The model is doing the same math; the input is slightly different. For stable comparisons, save the cropped face once and reuse it.
