sticker-tf-proto

Tensorflow protocol buffer definitions used by sticker

5 releases (breaking)

0.11.0 Mar 28, 2020
0.10.0 Oct 18, 2019
0.7.0 Sep 23, 2019
0.6.0 Aug 17, 2019
0.1.0 Jul 26, 2019

Used in 2 crates (via sticker)

Apache-2.0

1MB
18K SLoC

Warning: sticker is succeeded by SyntaxDot, which supports many new features:

  • Multi-task learning.
  • Pretrained transformer models, such as BERT and XLM-R.
  • Biaffine parsing in addition to parsing as sequence labeling.
  • Lemmatization.

sticker

sticker is a sequence labeler using neural networks.

Introduction

sticker is a sequence labeler that uses either recurrent neural networks, transformers, or dilated convolution networks. In principle, it can be used to perform any sequence labeling task, but so far the focus has been on:

  • Part-of-speech tagging
  • Topological field tagging
  • Dependency parsing
  • Named entity recognition
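
To make concrete how a task such as dependency parsing can be cast as sequence labeling, the sketch below turns each token's head into a single label. It assumes a relative-offset encoding ("+1/nsubj" means "the head is one token to the right, with relation nsubj"); this is one possible scheme chosen for illustration and is not necessarily the encoding sticker uses. The names Dep, encode, and decode_heads are hypothetical.

```rust
/// One token of a dependency tree (hypothetical type for this sketch):
/// a 1-based head index (0 marks the root) and a relation name.
struct Dep {
    head: usize,
    relation: &'static str,
}

/// Encode each token's head as a signed offset from the token's own
/// position, so that the whole tree becomes one label per token.
fn encode(deps: &[Dep]) -> Vec<String> {
    deps.iter()
        .enumerate()
        .map(|(i, dep)| {
            let position = i as isize + 1; // tokens are numbered from 1
            if dep.head == 0 {
                format!("root/{}", dep.relation)
            } else {
                // `{:+}` always prints the sign, e.g. "+1" or "-2".
                format!("{:+}/{}", dep.head as isize - position, dep.relation)
            }
        })
        .collect()
}

/// Decode labels back into 1-based head indices (0 = root).
/// Returns None for malformed labels or heads outside the sentence.
fn decode_heads(labels: &[String]) -> Option<Vec<usize>> {
    labels
        .iter()
        .enumerate()
        .map(|(i, label)| {
            let (offset, _relation) = label.split_once('/')?;
            if offset == "root" {
                return Some(0);
            }
            let head = i as isize + 1 + offset.parse::<isize>().ok()?;
            if head >= 1 && head as usize <= labels.len() {
                Some(head as usize)
            } else {
                None
            }
        })
        .collect()
}
```

With labels of this kind, a tagger only has to predict one label per token, and the tree is recovered by decoding the predicted labels.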

Features

  • Input representations:
    • finalfusion embeddings with subword units
    • Bidirectional byte LSTMs
  • Hidden representations:
    • Bidirectional recurrent neural networks (LSTM or GRU)
    • Transformers
    • Dilated convolutions
  • Classification layers:
    • Softmax (best-N)
    • CRF
  • Deployment:
    • Standalone binary that links against libtensorflow
    • Very liberal license (Blue Oak Model License 1.0.0)
    • Docker containers with models
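
As an illustration of the "Softmax (best-N)" classification layer listed above, the following minimal sketch converts per-label scores for one token into probabilities and keeps the N most probable labels. It is not sticker's implementation (sticker runs its models through Tensorflow); the function names softmax and best_n are hypothetical and the code operates on a plain slice of scores.

```rust
/// Convert raw scores (logits) into probabilities with a numerically
/// stable softmax.
fn softmax(logits: &[f32]) -> Vec<f32> {
    // Subtracting the maximum avoids overflow in exp().
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Return the `n` most probable (label index, probability) pairs,
/// most probable first.
fn best_n(logits: &[f32], n: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> =
        softmax(logits).into_iter().enumerate().collect();
    // Sort by descending probability; total_cmp gives a total order on floats.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(n);
    scored
}
```

Keeping several candidates per token, rather than only the top one, lets a later step choose among alternatives, for example when the best label would produce an invalid structure.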

Status

sticker is almost production-ready and we are preparing for release 1.0.0. Graphs and models created with the current version must work with sticker 1.x.y. There may still be breaking API or configuration file changes until 1.0.0 is released.

Where to go from here

References

sticker uses techniques from or was inspired by the following papers:

Issues

You can report bugs and feature requests in the sticker issue tracker.

License

sticker is licensed under the Blue Oak Model License version 1.0.0. The Tensorflow protocol buffer definitions in tf-proto are licensed under the Apache License version 2.0. The list of contributors is also available.

Credits

  • sticker is developed by Daniël de Kok & Tobias Pütz.
  • The Python precursor to sticker was developed by Erik Schill.
  • Sebastian Pütz and Patricia Fischer reviewed a lot of code across the sticker projects.

Dependencies

~1.3–2.1MB
~36K SLoC