- The dataset passed an evaluation by an institute under the MIIT, based on the national standard TC609-5-2025-04.
- It scored over 90 points in format standardization, content consistency, and content cleanliness.
- The dataset supports AI model training in molecular understanding and generation.
- The dataset supports data processing for nearly 900 users across 86 units.
Evaluation and Certification
Sinopec's "General High-Quality Dataset for the Petrochemical Industry" has passed an evaluation by the Electronics Industry Standardization Institute of the Ministry of Industry and Information Technology (MIIT). It is the first industry dataset in China to meet the national standard.
Evaluation Criteria
The evaluation was based on the national standard "High-Quality Dataset Quality Evaluation Specifications" (TC609-5-2025-04). It used a combined "data + model" evaluation method covering 17 indicators across three dimensions: dataset description documents, data quality, and model application.
Performance and Application
The dataset scored over 90 points on the format standardization, content consistency, and content cleanliness indicators. It supports training of Sinopec's Great Wall large model and of specialized large models for molecular understanding, molecular generation, and audit.
Industry Impact
The dataset's standardized construction method offers a reference model for the petrochemical industry. It supports data processing for nearly 900 users in 86 units within the Sinopec system.