Clear Sky Science

A Topology Standardized 3D Facial Dataset with Emotion and Action Unit Diversity for East Asians

Why digital faces matter

From video calls to virtual reality, our lives are filled with digital faces. Yet many of the computer systems behind these faces are trained on limited data, often focused on Western populations and a narrow range of expressions. This paper introduces AST-Face, a new 3D facial dataset centered on East Asian young adults that aims to give researchers better building blocks for animation, emotion research, and human–computer interaction.

Figure 1. Many East Asian 3D faces are unified into a common structure so computers can compare expressions fairly.

What the new face collection contains

The AST-Face dataset includes detailed 3D scans from 98 East Asian participants between 18 and 30 years old. For each person, the team captured a neutral face, six common emotions (happiness, anger, sadness, surprise, fear, and disgust), and nine specific muscle-based facial movements. These movements follow a well-known system that breaks expressions into small action units, such as raising the inner eyebrows or pulling the corners of the mouth. A subset of volunteers also agreed to have synchronized color photos taken from three camera angles, creating a richer resource for studies that combine 3D shape with regular images.
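
As a rough illustration of the dataset's scale, the sketch below lists the expression categories recorded per participant and multiplies by the number of participants. Only the counts (one neutral face, six emotions, nine action units, 98 participants) come from the summary. The label strings and the placeholder action unit names are hypothetical, since the summary does not say which nine action units were recorded, and the total assumes every participant completed every expression, which the summary does not state.

```python
# Hypothetical labels; only the counts come from the summary above.
NEUTRAL = ["neutral"]
EMOTIONS = ["happiness", "anger", "sadness", "surprise", "fear", "disgust"]
# Placeholder names: the summary does not say which nine action units were used,
# so the numbers here are slot indices, not standard action unit codes.
ACTION_UNITS = [f"au_placeholder_{i}" for i in range(1, 10)]

NUM_PARTICIPANTS = 98


def expressions_per_participant():
    """Return the full list of expression labels recorded for one person."""
    return NEUTRAL + EMOTIONS + ACTION_UNITS


def upper_bound_scan_count(num_participants=NUM_PARTICIPANTS):
    """Scan count if every participant completed every expression (an assumption)."""
    return num_participants * len(expressions_per_participant())


if __name__ == "__main__":
    print(len(expressions_per_participant()))  # 16
    print(upper_bound_scan_count())            # 1568
```

Under these assumptions there are 16 expressions per person and at most 1,568 scans in total. The actual number of released scans may differ.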

How the faces were captured and cleaned

To make the data reliable and comparable, the researchers built a carefully controlled capture setup. A high-precision 3D scanner recorded fine details of each face while three color cameras filmed from the left, center, and right. Adjustable lighting reduced shadows and glare, and a positioning device helped participants hold a steady pose. Everyone followed the same recording script: first a relaxed neutral face, then the six emotions, and finally the nine action units, each guided by trained staff. Afterward, the raw scans were cleaned by trimming away background and neck areas, aligning head pose, correcting surface properties, and extracting 84 standard landmark points on each face.

Figure 2. A rough 3D face is gradually refined into a smooth shared mesh that keeps expression details while matching structure.

Making every face comparable

A central challenge in 3D face research is that raw scans do not share the same digital structure. They can differ in how many points they contain and how those points are connected, which makes it hard to compare one person’s smile to another’s. AST-Face tackles this by running every scan through a two-step alignment process. First, a flexible face model is fitted to capture large movements such as open mouths and raised brows. Then an advanced matching algorithm gently warps a shared template mesh so that all final faces have identical point counts and connectivity. This unified structure lets researchers compare faces point by point across people and expressions without designing their own complex preprocessing pipeline.
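
To see why a shared structure matters, here is a minimal sketch of a point-by-point comparison. It is not the authors' alignment method. It only shows what becomes possible once two meshes have the same number of vertices in the same order. The function names and the toy three-vertex "faces" are illustrative assumptions.

```python
import math


def per_vertex_distance(mesh_a, mesh_b):
    """Euclidean distance between corresponding vertices of two meshes.

    Each mesh is a list of (x, y, z) tuples. The comparison is only
    meaningful when both meshes share one topology, so that vertex i
    marks the same facial location in both.
    """
    if len(mesh_a) != len(mesh_b):
        raise ValueError("meshes must have the same number of vertices")
    return [math.dist(a, b) for a, b in zip(mesh_a, mesh_b)]


def mean_distance(mesh_a, mesh_b):
    """Average per-vertex distance, a simple summary of how far two faces differ."""
    distances = per_vertex_distance(mesh_a, mesh_b)
    return sum(distances) / len(distances)


# Toy data: three vertices stand in for a full face mesh.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
smile = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0), (0.0, 1.0, 0.0)]

print(per_vertex_distance(neutral, smile))  # [0.0, 5.0, 0.0]
```

With raw scans of different sizes, the length check would fail and there would be no obvious way to pair up points, which is the problem the shared template is meant to remove.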

What the data can be used for

The finished dataset offers several layers of information: standardized 3D meshes, landmark points, detailed maps of how each expression differs from the neutral face, and verified labels for every emotion and action unit. Publicly available files exclude any identifiable textures, while raw scans and color images sit behind a data usage agreement to protect participant privacy. With this structure, AST-Face can support a wide range of work, from more natural facial animation driven by muscle-like controls, to machine learning models that study how expressions vary between individuals, to cross-modal systems that link 3D shape and 2D imagery.
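
One way to picture a map of how an expression differs from the neutral face is as a list of per-vertex offsets. The sketch below computes such offsets and then reapplies them at partial strength, loosely in the spirit of animation controls. The `ExpressionSample` record, its field names, and the `weight` parameter are hypothetical and do not describe the dataset's actual file layout or the authors' procedure.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ExpressionSample:
    """Hypothetical record; field names are not the dataset's file layout."""
    label: str
    vertices: List[Vec3]


def displacement_map(neutral, expression):
    """Per-vertex offset (expression minus neutral) for meshes with shared topology."""
    if len(neutral.vertices) != len(expression.vertices):
        raise ValueError("samples must share the same vertex count")
    return [
        tuple(e - n for n, e in zip(nv, ev))
        for nv, ev in zip(neutral.vertices, expression.vertices)
    ]


def apply_displacement(neutral, offsets, weight=1.0):
    """Move each neutral vertex along its offset, scaled by an illustrative weight."""
    return [
        tuple(n + weight * d for n, d in zip(nv, dv))
        for nv, dv in zip(neutral.vertices, offsets)
    ]


# Toy data: two vertices stand in for a full face mesh.
neutral = ExpressionSample("neutral", [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)])
surprise = ExpressionSample("surprise", [(0.0, 2.0, 0.0), (1.0, 1.0, 1.0)])

offsets = displacement_map(neutral, surprise)
print(offsets)                                    # [(0.0, 2.0, 0.0), (0.0, 0.0, 0.0)]
print(apply_displacement(neutral, offsets, 0.5))  # [(0.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
```

A weight of 1.0 reproduces the recorded expression, and smaller weights give a weaker version of it. This kind of interpolation only works because every mesh has the same vertices in the same order.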

What this means for future digital faces

In simple terms, AST-Face gives researchers a high-quality, well-organized set of East Asian 3D faces that all speak the same digital language. By combining diverse expressions, carefully checked muscle-based labels, and a shared mesh structure, the dataset makes it easier to build and test algorithms that need consistent, realistic facial movements. While it focuses on a specific age group and posed expressions under controlled lighting, it helps close demographic gaps in existing resources and lays a clearer foundation for more inclusive and accurate digital faces in the future.

Citation: Zhao, Y., Gong, G., Li, Y. et al. A Topology Standardized 3D Facial Dataset with Emotion and Action Unit Diversity for East Asians. Sci Data 13, 735 (2026). https://doi.org/10.1038/s41597-026-07098-2

Keywords: 3D facial dataset, facial expression, East Asian faces, action units, topology standardization