NeMo Text Processing

This repository is under development. For full functionality, please refer to https://github.com/NVIDIA/NeMo/tree/main/nemo_text_processing. See documentation for details.

Introduction

nemo-text-processing is a Python package for text normalization and inverse text normalization.
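
To make the two directions concrete, here is a toy sketch in plain Python. It is not the package's API or implementation. The package's grammars are WFST-based (see the WFST tutorial below) and cover far more than single digits. The function names `toy_normalize` and `toy_inverse_normalize` and the lookup table are hypothetical, chosen only for illustration.

```python
# Toy illustration only: a hand-written lookup table, NOT nemo_text_processing's grammars.
import re

# Hypothetical mapping between single digits and their spoken forms.
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}
WORD_DIGITS = {word: digit for digit, word in DIGIT_WORDS.items()}


def toy_normalize(text: str) -> str:
    """Text normalization: written form -> spoken form (standalone single digits only)."""
    return re.sub(r"\b[0-9]\b", lambda m: DIGIT_WORDS[m.group()], text)


def toy_inverse_normalize(text: str) -> str:
    """Inverse text normalization: spoken form -> written form (single digit words only)."""
    pattern = r"\b(" + "|".join(WORD_DIGITS) + r")\b"
    return re.sub(pattern, lambda m: WORD_DIGITS[m.group()], text)


written = "I have 3 cats and 2 dogs"
spoken = toy_normalize(written)
print(spoken)                         # spoken-style text, e.g. for a TTS front end
print(toy_inverse_normalize(spoken))  # back to written-style text, e.g. for ASR output
```

Text normalization is typically applied before speech synthesis, and inverse text normalization after speech recognition. A real system must also handle dates, currencies, ordinals, and ambiguous cases that a lookup table cannot.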

Documentation

NeMo-text-processing (text normalization and inverse text normalization).

Tutorials

Google Colab Notebook	Description
Text_(Inverse)_Normalization.ipynb	Quick-start guide
WFST_Tutorial	In-depth tutorial on grammar customization

Getting help

If you have a question that is not answered in the GitHub discussions, encounter a bug, or have a feature request, please create a GitHub issue. You are also welcome to open a pull request directly to fix a bug or add a feature.

Installation

Conda virtual environment

We recommend setting up a fresh Conda environment to install NeMo-text-processing.

conda create --name nemo_tn python==3.8
conda activate nemo_tn

(Optional) To use hybrid text normalization, install PyTorch using the PyTorch configurator.

conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

NOTE: The command used to install PyTorch may depend on your system.

Pip

Use this installation mode if you want the latest released version.

pip install nemo_text_processing

Pip from source

Use this installation mode if you want a version from a particular GitHub branch (e.g., main).

pip install Cython
python -m pip install git+https://github.com/NVIDIA/NeMo-text-processing.git@{BRANCH}#egg=nemo_text_processing

From source

Use this installation mode if you are contributing to NeMo-text-processing.

git clone https://github.com/NVIDIA/NeMo-text-processing
cd NeMo-text-processing
./reinstall.sh

NOTE: If you only want the toolkit without the additional conda-based dependencies, replace ./reinstall.sh with pip install -e . and run it with the NeMo-text-processing root directory as your current working directory.

Contributing

We welcome community contributions! Please refer to CONTRIBUTING.md for guidelines.

Citation

@inproceedings{zhang21ja_interspeech,
  author={Yang Zhang and Evelina Bakhturina and Boris Ginsburg},
  title={{NeMo (Inverse) Text Normalization: From Development to Production}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={4857--4859}
}

@inproceedings{bakhturina22_interspeech,
  author={Evelina Bakhturina and Yang Zhang and Boris Ginsburg},
  title={{Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization}},
  year=2022,
  booktitle={Proc. Interspeech 2022}
}

License

NeMo-text-processing is released under the Apache 2.0 license.
