Conversational AI Frameworks
3 BERT
BERT stands for Bidirectional Encoder Representations from Transformers and was developed by Google in 2018. This NLP technique uses pre-trained transformer neural networks. BERT models come in different sizes to suit different use cases and the computing resources available to an application. Model size influences the accuracy of the predictions.
In general, a larger model produces more accurate results. However, larger models consume more processing power and take longer to run. In conversational AI, processing delays can greatly affect the user experience: studies have shown that in a typical natural language conversation the average delay between exchanges is 300ms. This is a very narrow window in which to evaluate the intent of the user, fetch any external data that is required, and prepare a response. When you are running multiple models against a query, you may have only 10ms to evaluate and decode the natural language. [1]
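The latency budget described above can be made concrete with a short sketch. The 300ms turn budget and the 10ms per-model figure come from the text; the 270ms reserved for everything other than inference, the model count of three, and the function names are hypothetical values chosen only so that the arithmetic is easy to follow.

```python
import statistics
import time


def per_model_budget_ms(turn_budget_ms, reserved_ms, n_models):
    """Split what is left of the turn budget evenly across the models."""
    remaining = turn_budget_ms - reserved_ms
    if remaining <= 0 or n_models < 1:
        raise ValueError("no time left for model inference")
    return remaining / n_models


def median_latency_ms(fn, query, runs=20):
    """Median wall-clock time of fn(query), in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(query)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


if __name__ == "__main__":
    # Assumption: 270 ms is reserved for network, data fetch and response
    # preparation, and three models run against each query.
    budget = per_model_budget_ms(turn_budget_ms=300, reserved_ms=270, n_models=3)

    # Stand-in for a real model call; replace with actual inference code.
    def toy_model(query):
        return query.lower().split()

    latency = median_latency_ms(toy_model, "What is the weather today?")
    print(f"budget per model: {budget:.1f} ms, measured median: {latency:.3f} ms")
    print("within budget" if latency <= budget else "over budget")
```

Measuring the median rather than a single run reduces the effect of one-off stalls, although a production system would more likely track a high percentile.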
According to Mohd Sanad Zaki Rizvi [2]:
The TinyBERT model achieves 96% of the performance of its BERT base teacher on the GLUE benchmark while being 7.5x smaller and 9.4x faster. Its performance numbers are impressive even when compared with BERT small, a model of exactly the same size, which TinyBERT outperforms by 9% (76.5 vs 70.2 points average on GLUE).
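The 9% figure in the quotation is a relative improvement, not a difference in points. A minimal check, using only the two GLUE averages quoted above (the function name is illustrative):

```python
def relative_gain_pct(score, baseline):
    """Improvement of score over baseline, as a percentage of baseline."""
    return (score - baseline) / baseline * 100.0


tinybert_glue = 76.5    # average GLUE score quoted for TinyBERT
bert_small_glue = 70.2  # average GLUE score quoted for BERT small

print(f"absolute gain: {tinybert_glue - bert_small_glue:.1f} points")
print(f"relative gain: {relative_gain_pct(tinybert_glue, bert_small_glue):.1f}%")
```

The absolute gap is 6.3 points, which is roughly 9% of the BERT small score.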
Model | Parameters | Layers | Hidden size |
BERT Tiny | 4M | 2 | 128 |
BERT Mini | 11M | 4 | 256 |
BERT Small | 29M | 4 | 512 |
BERT Medium | 42M | 8 | 512 |
BERT Base | 108M | 12 | 768 |
BERT Large | 334M | 24 | 1024 |
BERT xlarge | 1270M | 24 | 2048 |
ALBERT base | 12M | 12 | 768 |
ALBERT large | 18M | 24 | 1024 |
ALBERT xlarge | 59M | 24 | 2048 |
ALBERT xxlarge | 233M | 12 | 4096 |
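The parameter counts for the BERT rows in the table can be approximated from the layer count and hidden size alone. The sketch below is a back-of-the-envelope estimate, not an exact count: it assumes the 30,522-token WordPiece vocabulary of the original English BERT, 512 positions, two segment types, and a feed-forward width of four times the hidden size, and it ignores biases, layer normalization, and the pooler. It does not apply to the ALBERT rows, because ALBERT shares parameters across layers and factorizes its embeddings.

```python
def estimate_bert_params(layers, hidden, vocab_size=30522,
                         max_positions=512, segment_types=2):
    """Rough BERT parameter count (weights only, no biases or LayerNorm)."""
    # Token, position and segment embedding tables.
    embeddings = (vocab_size + max_positions + segment_types) * hidden
    # Per layer: 4*H^2 for attention (Q, K, V, output projection)
    # plus 8*H^2 for the feed-forward block (H -> 4H -> H).
    per_layer = 12 * hidden * hidden
    return embeddings + layers * per_layer


# (layers, hidden size, parameters in millions as listed in the table)
BERT_ROWS = {
    "BERT Tiny": (2, 128, 4),
    "BERT Mini": (4, 256, 11),
    "BERT Small": (4, 512, 29),
    "BERT Medium": (8, 512, 42),
    "BERT Base": (12, 768, 108),
    "BERT Large": (24, 1024, 334),
    "BERT xlarge": (24, 2048, 1270),
}

for name, (layers, hidden, listed_m) in BERT_ROWS.items():
    estimated_m = estimate_bert_params(layers, hidden) / 1e6
    print(f"{name:12s} listed {listed_m:5d}M  estimated {estimated_m:7.1f}M")
```

The estimate shows why the small models are dominated by their embedding tables while the large ones are dominated by the transformer layers, whose cost grows with the square of the hidden size.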