Tonic Suite consists of 7 applications that span 3 domains:

  • Image Processing: Image Classification (IMC), Facial Recognition (FACE), and Digit Recognition (DIG).
  • Speech Recognition: Automatic Speech Recognition (ASR).
  • Natural Language Processing: Part-of-Speech (POS) tagging, Named Entity Recognition (NER), and Word Chunking (CHK).

Download

Visit our Downloads page to get tagged versions of Tonic Suite, or clone the latest from the GitHub repo.

Set up Tonic Suite

Prerequisites

Tonic Suite uses Caffe* for the DNN forward pass computation. Additionally, Tonic Suite depends on the packages installed in the steps below.

*Building Caffe

If you have already built the DjiNN service, Caffe is installed and you can skip this step.

Caffe is under active development, and some of its latest changes may break downstream projects, so we provide a snapshot on our Downloads page that is verified to build with Tonic Suite. Caffe can be built using different libraries, as detailed here, and we recommend reading through its installation instructions to get familiar with the process.

$ tar xzf caffe.tar.gz
$ cd caffe
$ ./get-libs.sh
$ sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libboost-all-dev libhdf5-serial-dev libopenblas-dev

Copy Makefile.config.example to Makefile.config and set the flags to match your intended build environment.
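For example, a CPU-only build using OpenBLAS (matching the libopenblas-dev package installed above) might set the following flags; this is a sketch only, and the exact flags depend on your system and CUDA setup:

```makefile
# Illustrative Makefile.config fragment for a CPU-only OpenBLAS build.
# Uncomment CPU_ONLY to skip the CUDA build entirely.
CPU_ONLY := 1
# Use OpenBLAS instead of the default ATLAS.
BLAS := open
```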

$ make -j 4
$ make distribute

Compile libtonic.a

libtonic.a provides common components shared by all the applications. Build it before attempting to build any Tonic application.

Navigate to DjiNN’s directory and set the Caffe installation path in common/Makefile.config, for example CAFFE=/home/jahausw/tools/caffe/distribute

Build the tonic library (libtonic.a) and download the pretrained models in common/weights:

$ tar xzf djinn-1.0.tar.gz
$ cd common
$ make
$ cd weights
$ ./dl_djinn_weights.sh

Image Processing Applications

Tonic Suite contains three image based applications: Image Classification (IMC), Digit Recognition (DIG) and Facial Recognition (FACE).

  • IMC sends an image to the DjiNN service, which returns a prediction of what the image contains. The network architecture is AlexNet, trained on 1.4 million images from ImageNet.
  • DIG sends an image of a handwritten digit to the DjiNN service, which returns the most likely digit (0 through 9). The network used for DIG is trained on the MNIST handwritten digit dataset.
  • FACE sends an image of a face to the DjiNN service, which returns a prediction of the face's identity. The network architecture replicates Facebook's DeepFace, and the network is trained on the PubFig83+LFW dataset.

Directory structure

The home directory of image applications is ./tonic-suite/img. In the directory:

./data/ contains pretrained data for face alignment, and the list of classes for IMC and FACE.
./input/ contains the input images for the image applications.
./src/ contains all the source files.

Building the applications

First fetch and build Flandmark with the provided script, then run make from the home directory of the image processing applications:

$ cd tonic-suite/img
$ ./get-flandmark.sh
$ make

Running the applications

Execute the IMC application:

$ ./tonic-img --task imc --network imc.prototxt --weight imc.caffemodel --input imc-list.txt --djinn 0

Execute the DIG application:

$ ./tonic-img --task dig --network dig.prototxt --weight dig.caffemodel --input dig-list.txt --djinn 0

Execute the FACE application:

$ ./tonic-img --task face --network face.prototxt --weight face.caffemodel --input face-list.txt --djinn 0

Changing “--djinn 0” to “--djinn 1” makes the DjiNN service execute the forward pass. The number of entries in the “--input” file determines how many images are batched into one query for the forward pass (one image per line).
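For instance, the batch size is simply the line count of the input list (the image names below are illustrative):

```shell
# Each line of the --input file names one image to include in the query.
cat > imc-list.txt <<'EOF'
input/cat.jpg
input/dog.jpg
input/plane.jpg
EOF
# Three lines, so the forward pass batches three images into one query.
wc -l < imc-list.txt
```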

Automatic Speech Recognition (ASR) Application

Tonic Suite contains Automatic Speech Recognition (ASR), which takes in a user’s audio file and generates the most likely transcript. The network architecture is adapted from Kaldi, a state-of-the-art speech recognition toolkit. The model is trained on VoxForge, an open-source large-scale speech corpus.

Directory structure

The home directory of speech application is ./tonic-suite/asr. In the directory:

./egs/voxforge/s5 contains data required for ASR.
./input/ contains the input wav files.
./tools/ contains the installation scripts for dependencies required by Kaldi.
./src/ contains all the source files.

Building the applications

First, Kaldi depends on an installed ATLAS library as well as a specific set of ATLAS headers. If your system does not have ATLAS installed, you can install it from the package manager. On Ubuntu, execute:

$ sudo apt-get install libatlas-dev

You can also install ATLAS from source (CPU throttling must be off for the build). We provide a script, tonic-suite/asr/tools/install_atlas.sh, to help you install ATLAS from source.

After installing ATLAS, download the set of ATLAS headers required by Kaldi:

$ cd tonic-suite/asr/tools
$ ./atlas_header.sh

Next, install the OpenFst library and the FLAC decoder required by Kaldi:

$ ./install_openfst.sh
$ sudo apt-get install flac

Then, configure the build process:

$ cd tonic-suite/asr/src
$ ./configure

configure generates kaldi.mk, which records all the relevant path information, including the OpenFst libraries and ATLAS headers that should reside under tools/. kaldi.mk also specifies the location of the ATLAS libraries on the system. Make sure these paths are set correctly before proceeding.
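As a sketch, the generated kaldi.mk records paths like the following (the variable names are typical for Kaldi of this era but may differ in your version, and the paths are placeholders; check your own file rather than copying these):

```makefile
# Illustrative kaldi.mk fragment: OpenFst and ATLAS headers under tools/,
# plus the ATLAS shared libraries installed on the system.
FSTROOT = /path/to/tonic-suite/asr/tools/openfst
ATLASINC = /path/to/tonic-suite/asr/tools/ATLAS/include
ATLASLIBS = /usr/lib/libatlas.so.3 /usr/lib/libf77blas.so.3 /usr/lib/libcblas.so.3
```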

Finally, build the source code

$ make

For more information on Kaldi’s build process, visit Kaldi’s installation tutorial page. For a set of common errors regarding linking to the external matrix library (ATLAS), visit Kaldi’s documentation page.

As a final step before running the application, untar the graph data:

$ cd tonic-suite/asr/egs/voxforge/s5/exp/
$ tar -xvzf tri3b.tar.gz

Running the applications

Execute the ASR application:

$ ./tonic-asr --network asr.prototxt --weight asr.caffemodel --input asr-list.txt --djinn 0

Changing “--djinn 0” to “--djinn 1” makes the DjiNN service execute the forward pass. The number of entries in the “--input” file determines how many wav files are batched into one query for the forward pass (one wav file per line).
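Because the list length sets the batch size, a long list of recordings can be cut into fixed-size batches with standard tools (the file names are illustrative):

```shell
# Five wav files, one per line.
printf 'input/a.wav\ninput/b.wav\ninput/c.wav\ninput/d.wav\ninput/e.wav\n' > asr-all.txt
# Split into lists of at most two lines each: batches of two wavs per query.
split -l 2 asr-all.txt asr-batch-
# Produces asr-batch-aa (2 lines), asr-batch-ab (2 lines), asr-batch-ac (1 line).
wc -l asr-batch-*
```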


Natural Language Processing (NLP) Applications

Tonic Suite contains three Natural Language Processing applications: Part-of-Speech (POS) Tagging, Word Chunking (CHK) and Named Entity Recognition (NER).

  • POS assigns each word a part of speech, for example noun or verb.
  • CHK tags each segment of a sentence as a noun or verb phrase, labeling each word as a begin-chunk or an inside-chunk.
  • NER labels each word in the sentence with a category, for example location or person.

The neural networks used in the NLP applications are based on SENNA. The pretrained models were trained on Wikipedia for over two months and achieve over 89% accuracy on these tasks.

Directory structure

The home directory of NLP applications is ./tonic-suite/nlp. In the directory:

./data/ and ./hash/ contain data needed by SENNA.
./input/ contains the input text files for the NLP applications.
./src/ contains all the source files.

Building the applications

Running make from the home directory of the NLP applications, ./tonic-suite/nlp, builds a single client that can run all three natural language processing applications:

$ cd tonic-suite/nlp
$ make

Running the applications

Execute the POS application:

$ ./tonic-nlp --task pos --network pos.prototxt --weight pos.caffemodel --input input/small-input.txt --djinn 0

Execute the CHK application:

$ ./tonic-nlp --task chk --network chk.prototxt --weight chk.caffemodel --input input/small-input.txt --djinn 0

Execute the NER application:

$ ./tonic-nlp --task ner --network ner.prototxt --weight ner.caffemodel --input input/small-input.txt --djinn 0

Changing “--djinn 0” to “--djinn 1” makes the DjiNN service execute the forward pass. The number of sentences in the “--input” file determines how many sentences are batched into one query for the forward pass (one sentence per line).
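Since each input line is treated as one sentence, a flat paragraph must first be broken into one sentence per line. A naive split on sentence-ending periods can be sketched with GNU sed (this is not a real sentence tokenizer, and the file name is illustrative):

```shell
# Split a paragraph into one sentence per line on ". " boundaries (GNU sed).
echo "The cat sat. The dog barked. It rained." \
  | sed 's/\. /.\n/g' > small-input.txt
# Three lines, so three sentences are batched into one query.
wc -l < small-input.txt
```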

Citing Tonic suite

If you use Tonic suite in your research, please cite the official publication [1].

[1] Johann Hauswald, Yiping Kang, Michael A. Laurenzano, Quan Chen, Cheng Li, Ronald Dreslinski, Trevor Mudge, Jason Mars, and Lingjia Tang. Djinn and Tonic: DNN as a Service and Its Implications for Future Warehouse Scale Computers. In Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA), ISCA ’15, New York, NY, USA, 2015. ACM. Acceptance Rate: 19%
[Bibtex]
@inproceedings{hauswald15isca,
author = {Hauswald, Johann and Kang, Yiping and Laurenzano, Michael A. and Chen, Quan and Li, Cheng and Dreslinski, Ronald and Mudge, Trevor and Mars, Jason and Tang, Lingjia},
title = {Djinn and Tonic: DNN as a Service and Its Implications for Future Warehouse Scale Computers},
booktitle = {Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA)},
series = {ISCA '15},
year = {2015},
location = {Istanbul, Turkey},
publisher = {ACM},
address = {New York, NY, USA},
note = {Acceptance Rate: 19%},
}