
zi-dataset

public
113 stars
17 forks
1 issue

Commits

List of commits on branch master.
Verified
733a402d7803fd0d4767120d8f31c1db6ccc629d

Update README.md

ssecsilm committed 5 years ago
Verified
2fba585f5d7a1352ec65cefd42c3b24261f52928

Add field leaf_component

ssecsilm committed 5 years ago
Verified
73d906c7004d340079c1331a1005aa64762b94a0

Update README.md

ssecsilm committed 5 years ago
Verified
83c83cacea759c961de220666d2758060e87212e

Add dataset info

ssecsilm committed 5 years ago
Verified
803537fd8bf1f9ae249af7c40a6c7d591327b4e0

First version of the dataset

ssecsilm committed 5 years ago
Verified
b7bbaa1f1a8db5de90b95f674a64064ec0b72cbd

Use CC-BY-SA-4.0

ssecsilm committed 5 years ago

README

The README file for this repository.

zi

A Chinese character dataset containing information on about 20,000 Chinese characters. The fields are:

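| Field | Description | Example |
| --- | --- | --- |
| zi | The character itself | |
| stroke_count | Stroke count | 7画 |
| stroke_count_decomposed | Stroke count decomposition | 木 + 3 |
| mandarin_pinyin | Mandarin pinyin | |
| cantonese_pinyin | Cantonese pinyin | lei5 |
| english | English meaning | plum; judge; surname |
| radical | Radical | |
| radical_stroke_count | Stroke count of the radical | 4 |
| radical_pinyin | Pinyin of the radical | |
| radical_english | English meaning of the radical | tree |
| variant | Variant form, usually the corresponding traditional character | NaN |
| fc_code | Four-corner code | 4040.7 |
| cj_code | Cangjie code | DND |
| zis_with_this_component | Characters that contain this character as a component, separated by ASCII commas; the character itself is not included | NaN |
| leaf_component | Leaf components: when the character is decomposed into a tree, the leaf nodes are its leaf components, joined with `/` | 木/子 |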
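
The delimited fields above need a little unpacking before use. The following is a minimal sketch, not part of the dataset itself. It assumes a row is available as a plain Python dict and that missing values appear as `NaN`, as in the examples. The helper names `split_field` and `parse_stroke_count` are hypothetical, and the `;` separator for `english` is inferred from the example value rather than documented.

```python
import math
import re


def split_field(value, sep):
    """Split a delimited field into a list; missing values (None, NaN, '') give []."""
    if value is None:
        return []
    if isinstance(value, float) and math.isnan(value):
        return []
    text = str(value).strip()
    if not text or text == "NaN":
        return []
    return [part.strip() for part in text.split(sep) if part.strip()]


def parse_stroke_count(value):
    """Extract the integer from a value such as '7画'; return None if no digits are found."""
    match = re.search(r"[0-9]+", str(value))
    return int(match.group()) if match else None


# Hypothetical row built from the example values in the field table.
row = {
    "stroke_count": "7画",
    "english": "plum; judge; surname",
    "zis_with_this_component": float("nan"),
    "leaf_component": "木/子",
}

leaves = split_field(row["leaf_component"], "/")
containing = split_field(row["zis_with_this_component"], ",")
meanings = split_field(row["english"], ";")
strokes = parse_stroke_count(row["stroke_count"])
```

Returning an empty list for missing values keeps downstream code free of special cases when iterating over components.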

WIP

  • [ ] Add stroke order