Chinese AI capability evaluation released: a comprehensive benchmark for Chinese general-purpose large models

The Chinese AI Capabilities Evaluation (SuperCLUE) is a comprehensive benchmark for general-purpose large models in Chinese.

It is an evaluation benchmark for general-purpose large models that can be used in Chinese.

The main question it answers is: with general-purpose large models developing rapidly, how well do Chinese large models perform? This includes, but is not limited to, how these models perform on different tasks, how far they have come relative to representative international models, and how their performance compares with that of humans.

Source: https://github.com/CLUEbenchmark/SuperCLUE