The "Research Report on AI Foundation Models and Their Applications in Automotive Field, 2024-2025" report has been added to ResearchAndMarkets.com's offering.
Research on AI foundation models and automotive applications: reasoning, cost reduction, and explainability
Reasoning capabilities drive up the performance of foundation models.
Since the second half of 2024, foundation model companies inside and outside China have launched reasoning models that use reasoning frameworks such as Chain-of-Thought (CoT) to improve the ability of foundation models to handle complex tasks and make decisions independently.
This wave of reasoning model releases aims to improve how foundation models handle complex scenarios and to lay the groundwork for agent applications. In the automotive industry, stronger reasoning capabilities can address pain points in AI applications, for example by improving cockpit assistants' intent recognition for semantically complex requests and the accuracy of spatiotemporal prediction in autonomous driving planning and decision-making.
In 2024, the reasoning technologies of mainstream foundation models introduced in vehicles revolved primarily around CoT and its variants (e.g., Tree-of-Thought (ToT), Graph-of-Thought (GoT), and Forest-of-Thought (FoT)). Depending on the scenario, these were combined with generative models (e.g., diffusion models), knowledge graphs, causal reasoning models, cumulative reasoning, and multimodal reasoning chains.
In 2025, the focus of reasoning technology will shift to multimodal reasoning. Common training technologies include instruction fine-tuning, multimodal in-context learning, and multimodal CoT (M-CoT), which are often enabled by combining multimodal fusion and alignment with LLM reasoning technologies.
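The Chain-of-Thought idea referenced in this section can be illustrated with a minimal prompt-construction sketch. It is not taken from the report: the function names, the prompt wording, and the "Answer:" marker are all hypothetical choices made for illustration, and no particular model or API is assumed.

```python
from typing import Optional


def build_direct_prompt(question: str) -> str:
    """Ask for the answer only, with no intermediate reasoning."""
    return f"Question: {question}\nAnswer:"


def build_cot_prompt(question: str, steps_hint: int = 3) -> str:
    """Ask the model to reason step by step before giving a final answer.

    The wording and the 'Answer:' marker are illustrative assumptions.
    """
    return (
        f"Question: {question}\n"
        f"Think through the problem in about {steps_hint} numbered steps, "
        "then give the result on a final line starting with 'Answer:'.\n"
        "Step 1:"
    )


def extract_final_answer(model_output: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker, or None if absent."""
    marker = "Answer:"
    index = model_output.rfind(marker)
    if index == -1:
        return None
    return model_output[index + len(marker):].strip()


question = "The driver says: I'm cold and the windshield is fogging up. Which cabin functions should be activated?"
print(build_cot_prompt(question))
```

The only difference between the two prompts is that the CoT version asks for intermediate steps before the final line, which is what lets the reasoning be inspected separately from the answer.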
Explainability bridges trust between AI and users
Before users can experience the 'usefulness' of AI, they need to trust it. In 2025, the explainability of AI systems therefore becomes a key factor in growing the user base of automotive AI. One way to address this challenge is to show users the model's long CoT. The explainability of AI systems can be achieved at three levels: data explainability, model explainability, and post-hoc explainability.
In Li Auto's case, its L3 autonomous driving system uses 'AI reasoning visualization technology' to present the thinking process of its end-to-end + VLM models intuitively. The visualization covers the entire process, from perception input from the physical world to the driving decision output by the foundation model, which enhances users' trust in the intelligent driving system.
The dialogue interfaces of various reasoning models also use a long CoT to break down the reasoning process. One example is DeepSeek R1, which, in conversations with users, first presents the decision at each node through a CoT and then provides an explanation in natural language.
Additionally, most reasoning models, including Zhipu's GLM-Zero-Preview, Alibaba's QwQ-32B-Preview, and Skywork 4.0 o1, support demonstration of the long CoT reasoning process.
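As a rough illustration of how an interface might display a long CoT separately from the final reply, the sketch below splits a raw model output into a reasoning part and an answer part. It assumes the model wraps its reasoning in `<think>`...`</think>` tags; the report does not specify any output format, so the delimiter, the `ReasonedReply` type, and the `split_reasoning` function are hypothetical.

```python
import re
from typing import NamedTuple


class ReasonedReply(NamedTuple):
    reasoning: str  # the chain of thought shown to the user for transparency
    answer: str     # the final natural-language reply


# Assumed delimiter: reasoning enclosed in <think>...</think> tags.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw_output: str) -> ReasonedReply:
    """Separate the first <think> block from the rest of the output."""
    match = _THINK_BLOCK.search(raw_output)
    if match is None:
        # No reasoning block found: treat the whole output as the answer.
        return ReasonedReply(reasoning="", answer=raw_output.strip())
    reasoning = match.group(1).strip()
    answer = (raw_output[:match.start()] + raw_output[match.end():]).strip()
    return ReasonedReply(reasoning=reasoning, answer=answer)


raw = "<think>\nRoute A has roadworks.\nRoute B is 4 minutes longer.\n</think>\nTake route B."
reply = split_reasoning(raw)
print("Reasoning:", reply.reasoning)
print("Answer:", reply.answer)
```

Keeping the two parts separate lets a cockpit or chat interface decide how much of the reasoning to surface, for example collapsed by default and expandable on request.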
DeepSeek lowers the barrier to introduction of foundation models in vehicles, enabling both performance improvement and cost reduction.
Do improved reasoning capabilities and overall performance mean higher costs? Not necessarily, as DeepSeek's popularity shows. In early 2025, OEMs started connecting to DeepSeek, primarily to enhance the overall capabilities of their vehicle foundation models in specific applications.
In fact, before the DeepSeek models were launched, OEMs had already been developing and iterating their own automotive AI foundation models. For cockpit assistants, some had completed the initial build of their solutions and had either connected to cloud foundation model suppliers for trial operation or provisionally selected suppliers, including cloud service providers such as Alibaba Cloud, Tencent Cloud, and Zhipu. They connected to DeepSeek in early 2025, valuing the following:
Strong reasoning performance: for example, the R1 reasoning model is comparable to OpenAI o1 and even excels in mathematical logic.
Lower costs: performance is maintained while training and inference costs are kept low by industry standards.
By connecting to DeepSeek, OEMs can reduce the costs of hardware procurement, model training, and maintenance while maintaining performance when deploying intelligent driving and cockpit assistants:
Low-computing-overhead technologies facilitate high-level autonomous driving and broader access to the technology: high-performance models can be deployed on low-compute automotive chips (e.g., edge computing units), reducing reliance on expensive GPUs. Combined with the DualPipe algorithm and FP8 mixed-precision training, these technologies optimize computing power utilization, allowing mid- and low-end vehicles to deploy high-level cockpit and autonomous driving features and accelerating the popularization of intelligent cockpits.
Enhanced real-time performance. In driving environments, autonomous driving systems need to process large amounts of sensor data in real time, and cockpit assistants need to respond quickly to user commands, while vehicle computing resources are limited. With lower computing overhead, DeepSeek enables faster processing of sensor data, more efficient use of the computing power of intelligent driving chips (DeepSeek achieves 90% utilization of NVIDIA A100 chips during server-side training), and lower latency (e.g., on the Qualcomm 8650 platform, with 100 TOPS of computing power, DeepSeek reduces the inference response time from 20 milliseconds to 9-10 milliseconds).
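To put the quoted latency figures in perspective, the short calculation below converts the 20 ms and 9-10 ms response times into a relative reduction and a speedup factor. Only the three latency values come from the text; the function name `latency_improvement` is a hypothetical helper introduced for illustration.

```python
from typing import Tuple


def latency_improvement(before_ms: float, after_ms: float) -> Tuple[float, float]:
    """Return (fractional reduction, speedup factor) for a latency change."""
    if before_ms <= 0 or after_ms <= 0:
        raise ValueError("latencies must be positive")
    reduction = (before_ms - after_ms) / before_ms
    speedup = before_ms / after_ms
    return reduction, speedup


# Figures quoted in the text: 20 ms before, 9-10 ms after.
for after_ms in (10.0, 9.0):
    reduction, speedup = latency_improvement(20.0, after_ms)
    print(f"20 ms -> {after_ms:g} ms: {reduction:.0%} lower latency, {speedup:.2f}x faster")
```

On these figures, the response time falls by roughly 50-55%, which corresponds to a speedup of about 2.0x to 2.2x.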
Key Topics Covered:
Overview of AI Foundation Models
- Introduction to AI Models
- Definition and Features of AI Models
- Classification of AI Models
- Application Process of AI Models
- Introduction to Foundation Models
- Classification of Foundation Models
- Current Development of Foundation Models in Automotive Industry
- Application Scenarios of Foundation Models in Automotive Industry
- Application of LLM in Autonomous Driving
- Application of VFM in Autonomous Driving
- Application of MFM in Autonomous Driving
Analysis of AI Foundation Models of Differing Types
- Large Language Models (LLM)
- Development History of LLM
- Key Capabilities of LLM
- Cases of Integration with Other Models
- Multimodal Large Language Models (MLLM)
- Development and Overview of Large Multimodal Models
- Large Multimodal Models vs. Large Single-modal Models
- Technology Panorama of Large Multimodal Models
- Multimodal Information Representation
- Multimodal Large Language Models (MLLM)
- Architecture and Core Components of MLLM
- Status Quo of MLLM
- Dataset Evaluation by Different MLLM Representatives
- Reasoning Capabilities of MLLM
- Synergy between MLLM and Agent
- MLLM in VQA
- MLLM in Autonomous Driving
- Vision-Language Models (VLM) and Vision-Language-Action (VLA) Models
- Development History of VLM
- Application of VLM
- Architecture of VLM
- Evolution of VLM in Intelligent Driving
- End-to-end Autonomous Driving
- Combination with Gaussian Framework
- VLM2VLA
- VLA Models
- Principles of VLA
- Classification of VLA Models
- Application Cases of VLA
- Core Functions of End-to-End Multimodal Model for Autonomous Driving (EMMA)
- World Model Construction
- Improve Vision-Language Navigation Capabilities
- VLA Generalization Enhancement
- Computing Overhead of VLA
- World Models
- Key Definitions of World Models and Application Development
- Basic Architecture of World Models
- Framework Setup and Implementation Challenges of World Models
- Video Generation Methods Based on Transformer and Diffusion Models
- Technical Principle and Path of WorldDreamer
- World Models and End-to-end Intelligent Driving
- Tesla World Model
- NVIDIA
- InfinityDrive
- World Labs Spatial Intelligence
- NIO
- 1X's 'World Model'
Common Technologies in AI Foundation Models
- Common Foundation Model Algorithms and Architectures
- Comparison of Features and Application Scenarios between Foundation Model Algorithms
- Foundation Model Architectures and Related Algorithms
- Transformer
- KAN
- MAMBA
- Applicability of CNN in the Era of Foundation Models
- Applicability of RNN Variants in the Era of Foundation Models
- Visual Processing Algorithms
- Common Vision Algorithms
- ViT
- CLIP Scenarios and Features
- CLIP Workflow
- LLaVA Model
- Training and Fine-Tuning Technologies
- Foundation Model Training Process
- Training Case: Geely's CPT Enhancement Solution
- Instruction Fine-tuning
- Training Case: Geely's Fine-tuning Framework for Multi-round Dialogues
- Reinforcement Learning
- Introduction to Reinforcement Learning
- Reinforcement Learning Process
- Comparison between Some Reinforcement Learning Technology Routes
- Knowledge Graphs
- Optimization Directions for Retrieval-Augmented Generation (RAG)
- Evolution Directions of RAG: KAG, CAG, GraphRAG
- RAG Application Case: Li Auto
- RAG Application Case: Geely
- Comparison between RAG Routes
- Function Call
- Reasoning Technologies
- Reasoning Process of Transformer Models
- Evaluation of Reasoning Capabilities
- Three Optimization Directions for Foundation Model Reasoning
- Reasoning Task Types
- Common Reasoning Algorithms
- Comparison between Common Reasoning Algorithms
- Reasoning Case 1: Geely
- Reasoning Case 2: NVIDIA
- Sparsification
- Characteristics of MoE Architecture
- Principles of MoE Architecture
- MoE Training Strategies
- Advantages and Challenges of MoE
- MoE Models from Different Foundation Model Companies
- Evolution Direction of MoE
- Generation Technologies
- Introduction to Generative Models
- Comparison between Generation Technologies
- Case 1: Li Auto
- Case 2: XPeng
- Case 3: SAIC
AI Foundation Model Companies
- OpenAI
- SORA
- Meta
- Anthropic
- Mistral AI
- Amazon
- Stability AI
- xAI
- SenseTime
- Alibaba Cloud
- Baidu AI Cloud
- Tencent Cloud
- Huawei
- Zhipu AI
- iFLYTEK
- DeepSeek
Application Cases of AI Foundation Models in Automotive
- Cockpit Cases
- Lenovo's AI Vehicle Computing Framework Used in Cockpits
- In-cabin Functions of Thundersoft's Rubik Foundation Model
- LLM Empowers Smart Eye's DMS/OMS Assistance System
- Application of DIT in Voice Processing Scenarios
- Application of Unisound's Shanhai Model in Cockpits
- Phoenix Auto Intelligence's Cockpit Smart Brain
- Intelligent Driving Cases
- Li Auto
- Geely
- Wayve: Generative World Model GAIA-1
- Tesla
- Giga's World Model
Application Trends of AI Foundation Models
- Algorithm
- Computing Power
- Engineering
For more information about this report visit https://www.researchandmarkets.com/r/jutyc7
About ResearchAndMarkets.com
ResearchAndMarkets.com is the world's leading source for international market research reports and market data. We provide you with the latest data on international and regional markets, key industries, the top companies, new products and the latest trends.
View source version on businesswire.com: https://www.businesswire.com/news/home/20250421614414/en/
Contacts
ResearchAndMarkets.com
Laura Wood, Senior Press Manager
press@researchandmarkets.com
For E.S.T Office Hours Call 1-917-300-0470
For U.S./ CAN Toll Free Call 1-800-526-8630
For GMT Office Hours Call +353-1-416-8900