howl-anderson

Graduate student in AI @Duke University; Developer Expert in Machine Learning @Google; Mentor of TensorFlow @Google Summer of Code; SuperHero @Rasa; Book Author

256 repositories
1218 followers
United States

Repositories

public

unlocking-the-power-of-llms

Use Prompts and Chains to turn ChatGPT into an amazing productivity tool! Unlocking the power of LLMs.

Shell
2501
162
2
Updated a day ago
public

Chinese_models_for_SpaCy

Chinese models for SpaCy | Models for SpaCy that support Chinese

Jupyter Notebook
655
110
10
Updated 11 days ago
public

hanzi_chaizi

Hanzi Decomposition Library: breaks Chinese characters down into radicals and components, which can be used as character shape features in machine learning.

Python
354
59
12
Updated 6 hours ago
public

hanzi_char_featurizer

A Chinese character featurizer that extracts pronunciation and glyph features of Chinese characters for use as deep learning features.

Python
287
56
3
Updated 11 days ago
public

tools_for_corpus_of_people_daily

A toolkit for processing the People's Daily corpus

Python
272
57
5
Updated 3 days ago
public

WeatherBot

A Rasa-based Chinese weather inquiry chatbot with a Web UI

236
68
5
Updated 6 days ago
public

ATIS_dataset

The ATIS (Airline Travel Information System) Dataset

Python
154
49
2
Updated a month ago
public

MicroTokenizer

MicroTokenizer: A lightweight, full-featured Chinese tokenizer designed for educational and research purposes, helping students understand how tokenizers work. Provides a practical, hands-on approach to understanding NLP concepts, featuring multiple tokenization algorithms and customizable models. Ideal for students, researchers, and NLP enthusiasts.

Python
147
22
2
Updated a month ago
public

rasa_chinese

rasa_chinese is a Rasa component extension package built specifically for Chinese, providing many Chinese-language components.

Python
144
36
9
Updated 12 days ago
public

seq2annotation

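A general-purpose sequence labeling library based on TensorFlow & PaddlePaddle (currently includes BiLSTM+CRF, Stacked-BiLSTM+CRF, and IDCNN+CRF, with more algorithms being added) for sequence labeling tasks such as Chinese word segmentation (tokenization), part-of-speech (POS) tagging, and named entity recognition (NER).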

Python
84
21
33
Updated a day ago
public

MITIE_Chinese_Wikipedia_corpus

Pre-trained Wikipedia corpus by MITIE

52
9
1
Updated 18 days ago
public

chinese-wikipedia-corpus-creator

Corpus creator for Chinese Wikipedia

Python
42
9
0
Updated 8 months ago