Parallel corpus collection: OPUS

LINK

Project site

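A collection of parallel corpora, which are indispensable for training machine translation systems. A parallel corpus pairs text in one language one-to-one with its translation in another language.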
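The one-to-one correspondence described above can be pictured as a list of sentence pairs. The following is only a minimal sketch, assuming a line-aligned plain-text layout in which line i of the source text is the translation of line i of the target text. The function name `align_lines` and the sample sentences are hypothetical and are not taken from OPUS itself.

```python
from typing import Iterable, List, Tuple


def align_lines(source_lines: Iterable[str],
                target_lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair up two line-aligned texts into (source, target) sentence pairs.

    Assumption: both inputs have the same number of lines and line i of one
    is the translation of line i of the other.
    """
    source = [line.strip() for line in source_lines]
    target = [line.strip() for line in target_lines]
    if len(source) != len(target):
        raise ValueError("source and target must have the same number of lines")
    # Drop pairs where either side is empty, since they carry no translation.
    return [(s, t) for s, t in zip(source, target) if s and t]


# Hypothetical sample data standing in for two line-aligned corpus files.
english = ["Hello.\n", "Thank you.\n"]
japanese = ["こんにちは。\n", "ありがとう。\n"]

pairs = align_lines(english, japanese)
for en, ja in pairs:
    print(en, "<->", ja)
```

With real data, the two lists would typically be replaced by open file handles for the two language sides of a downloaded corpus.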

OPUS

OPUS is a growing collection of translated texts from the web. In the OPUS project we try to convert and align free online data, to add linguistic annotation, and to provide the community with a publicly available parallel corpus. OPUS is based on open source products and the corpus is also delivered as an open content package. We used several tools to compile the current collection. All pre-processing is done automatically. No manual corrections have been carried out.
