GENMAT Project
The Project

Holistic-M3FM Framework

A rigorous, multi-modal foundation model for advanced materials discovery, integrating domain-specific encoders and a physics-informed core.

Multi-Modal Encoders

The core architectural triad: three encoders that map distinct material modalities into a unified latent space.

Vision

Domain-Specific Large Vision Models

Processes high-resolution microstructural images (SEM/TEM), X-ray diffraction (XRD) patterns, and AFM topography data using Vision Transformer (ViT) architectures with ~1B parameters.

Language

Domain-Specific Large Language Models

Parses unstructured scientific literature, patents, synthesis protocols, and property databases using BERT-large/GPT-2 medium architectures.

Audio

Domain-Specific Large Audio Models

Analyzes acoustic emission data and ultrasonic non-destructive testing (NDT) signals using Self-Supervised Audio Spectrogram Transformer (SSAST) architectures with ~800M parameters.
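
As an illustration of what mapping three modalities into a unified latent space can look like, here is a minimal sketch. It assumes each encoder emits a fixed-size embedding of a different width and that a per-modality linear projection head aligns them to a common dimension. The dimensions, the random weights standing in for learned ones, and the averaging fusion step are all hypothetical and are not taken from the GENMAT design.

```python
import math
import random

LATENT_DIM = 4  # hypothetical width of the shared latent space


def make_projection(in_dim, out_dim, rng):
    """Random linear map standing in for a learned projection head."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def project(weights, embedding):
    """Apply the linear map, then L2-normalize so modalities are comparable."""
    z = [sum(w * x for w, x in zip(row, embedding)) for row in weights]
    norm = math.sqrt(sum(v * v for v in z)) or 1.0
    return [v / norm for v in z]


rng = random.Random(0)

# Hypothetical encoder output widths for each modality.
encoder_dims = {"vision": 8, "language": 6, "audio": 5}
heads = {name: make_projection(dim, LATENT_DIM, rng) for name, dim in encoder_dims.items()}

# Stand-in encoder outputs for a single material sample.
embeddings = {
    name: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for name, dim in encoder_dims.items()
}

# Every modality now lives in the same LATENT_DIM-dimensional space.
latents = {name: project(heads[name], emb) for name, emb in embeddings.items()}

# Simplest possible fusion: average the aligned latent vectors.
fused = [sum(latents[m][i] for m in latents) / len(latents) for i in range(LATENT_DIM)]
```

Normalizing each projected vector keeps one modality from dominating the fused representation purely because its encoder produces larger activations.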

Four Pillars of GENMAT

Unified Infrastructure & AI Ecosystem

Seamlessly connects atomic chemistry, advanced microstructural modeling, process optimization, and real-time performance monitoring in one coherent digital ecosystem.

Physics-Informed Predictions

Integrates strict physical constraints, kinetic relationships, thermodynamic principles, and standardized European EMMO ontologies for scientifically valid outputs.
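
One common way to impose physical constraints on a learned model is to add a penalty term to the training loss. The sketch below assumes a hypothetical task in which the model predicts phase fractions, which must be non-negative and sum to one. The constraint, the function names, and the penalty weight are illustrative assumptions, not the project's actual formulation.

```python
def data_loss(pred, target):
    """Mean squared error between predicted and measured values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def physics_penalty(phase_fractions):
    """Penalize negative fractions and any deviation of the total from 1."""
    negativity = sum(min(0.0, f) ** 2 for f in phase_fractions)
    closure = (sum(phase_fractions) - 1.0) ** 2
    return negativity + closure


def total_loss(pred, target, weight=10.0):
    """Data-fit term plus a weighted physics term (weight is a hypothetical choice)."""
    return data_loss(pred, target) + weight * physics_penalty(pred)


# A physically valid prediction pays only the data-fit cost.
valid = total_loss([0.5, 0.5], [0.75, 0.25])

# An unphysical prediction (a negative fraction) is penalized further.
invalid = total_loss([1.25, -0.25], [0.75, 0.25])
```

A soft penalty like this only discourages violations. Hard guarantees would need the constraint built into the model output, for example through a softmax layer over the fractions.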

Overcoming Fragmentation

The Holistic-M3FM backbone learns across material classes and dimensional scales (from atoms to macroscopic components) with a Multi-gate Mixture-of-Experts architecture.
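
A Multi-gate Mixture-of-Experts layer shares a pool of experts across tasks while giving each task its own softmax gate over those experts. The sketch below shows a forward pass only, in plain Python. The task names, layer sizes, and random weights are hypothetical placeholders and say nothing about how the Holistic-M3FM backbone is actually configured.

```python
import math
import random


def linear(weights, x):
    """Matrix-vector product with weights stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


class MMoE:
    """Shared experts, one gate and one output tower per task."""

    def __init__(self, in_dim, expert_dim, n_experts, tasks, seed=0):
        rng = random.Random(seed)
        self.experts = [random_matrix(expert_dim, in_dim, rng) for _ in range(n_experts)]
        self.gates = {t: random_matrix(n_experts, in_dim, rng) for t in tasks}
        self.towers = {t: random_matrix(1, expert_dim, rng) for t in tasks}

    def gate_weights(self, task, x):
        """Task-specific mixing weights over the shared experts."""
        return softmax(linear(self.gates[task], x))

    def forward(self, x):
        # Every expert sees the same input; outputs are shared by all tasks.
        expert_out = [[math.tanh(v) for v in linear(w, x)] for w in self.experts]
        outputs = {}
        for task in self.gates:
            g = self.gate_weights(task, x)
            # Gate-weighted sum of expert outputs, one value per expert dimension.
            mixed = [
                sum(g[i] * expert_out[i][d] for i in range(len(expert_out)))
                for d in range(len(expert_out[0]))
            ]
            outputs[task] = linear(self.towers[task], mixed)[0]
        return outputs


# Hypothetical tasks standing in for different property predictions.
model = MMoE(in_dim=4, expert_dim=3, n_experts=3, tasks=["hardness", "conductivity"])
sample = [0.2, -0.1, 0.7, 0.4]
predictions = model.forward(sample)
```

Because each task has its own gate, related tasks can lean on the same experts while unrelated ones diverge, which is the usual argument for this design when learning across heterogeneous material classes.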

Trustworthy & Sustainable

Built-in explainable AI, systematic uncertainty quantification, and full EU AI Act compliance to ensure reliability, transparency, and adherence to ethical standards.
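
Uncertainty quantification can take many forms. One simple and widely used approach is to compare the predictions of an ensemble of models. The sketch below assumes a hypothetical ensemble of three toy regressors and a hypothetical review threshold. It is meant only to show the idea, not the method GENMAT uses.

```python
import statistics


def ensemble_predict(models, x):
    """Return the ensemble mean and the spread (sample standard deviation)."""
    preds = [m(x) for m in models]
    return statistics.mean(preds), statistics.stdev(preds)


def needs_review(std, threshold=0.5):
    """Flag predictions whose ensemble spread exceeds a hypothetical threshold."""
    return std > threshold


# Toy stand-ins for independently trained models.
models = [
    lambda x: 2.0 * x,
    lambda x: 2.1 * x + 0.1,
    lambda x: 1.9 * x - 0.1,
]

mean_near, std_near = ensemble_predict(models, 1.0)   # members roughly agree
mean_far, std_far = ensemble_predict(models, 10.0)    # members diverge
```

The spread grows as the input moves away from the region where the toy models agree, so `needs_review(std_far)` is true while `needs_review(std_near)` is not.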

Project Timeline & Milestones

Key milestones across the 36-month project.

M3

Kick-off & Risk Management

Official project kick-off meeting and activation of risk control procedures.

M6

Visual Identity & Requirements

Launch of the consortium's visual identity and public website; freeze of user requirements and data governance specifications.

M12

FAIR Data Fabric & Initial DB

FAIR Data Fabric infrastructure operational with ontologies, SHACL validators, and APIs; completion of initial pre-training databases.

M20

Mid-term Review

Technical mid-term review meeting and update of the critical risk register.

M24

Workbench & SDK Release

Official general-availability (GA) release of the GENMAT Workbench with its Python SDK.

M30

Data Generation & Trust Toolkit

Completion of data generation cycles (real and augmented); integration of the Trust Toolkit, license release, and opening of the models.

M36

Final Validation & Audit

Final validation of all five use cases, each reaching TRL 4, and completion of compliance audits.