Voice Translation System

Overview

The Voice Translation System aims to bridge communication gaps by providing accurate translations of spoken language. Voice translation systems have become essential tools for communicating across languages. Products from large technology companies, such as Google Translate and Meta's new Ray-Ban smart glasses, highlight the growing importance of voice translation technology and are making it more accessible and practical in everyday situations.

Our project aspires to design an effective and accurate translator that not only competes with these established solutions but also addresses their limitations. By leveraging advanced machine learning techniques, we will develop a robust translation system that can enhance the quality and speed of translations, making them more reliable for users in real-time communication.

Proposal

Introduction/Background:

At the core of many modern voice translation systems is the application of advanced machine learning techniques, notably Long Short-Term Memory (LSTM) networks. Originally, Google Translate relied heavily on LSTMs as part of its neural machine translation (NMT) framework, specifically through the Google Neural Machine Translation (GNMT) model introduced in 2016 [2].

Text-to-text translation advanced further with the development of the Transformer architecture. The Transformer removed the recurrence used in RNN-based models such as LSTMs and instead relies on self-attention mechanisms, combined with positional encoding, to capture relationships between words in a sequence.
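To make the two ingredients named above concrete, here is a minimal sketch of sinusoidal positional encoding and scaled dot-product self-attention in plain Python. It is illustrative only: the toy embeddings, the width of 4, and the choice to use the inputs directly as queries, keys, and values (with no learned projection matrices) are assumptions made for brevity, not part of this project's model.

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal position vectors: sine on even dimensions, cosine on odd ones."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (assumption: no learned projections)."""
    d = len(x[0])
    output = []
    for query in x:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x]
        weights = softmax(scores)
        # Each output row is a weighted average of all input rows.
        output.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return output

# Toy input (hypothetical): three "word" embeddings of width 4.
embeddings = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
pe = positional_encoding(len(embeddings), 4)
# Position information is added to the embeddings before attention.
inputs = [[e + p for e, p in zip(emb, pos)] for emb, pos in zip(embeddings, pe)]
attended = self_attention(inputs)
```

Because every position attends to every other position in a single step, no recurrent state has to be carried along the sequence; word order enters only through the added position vectors.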

The Tatoeba English-Spanish dataset contains more than 265,000 sentence pairs and supports NLP tasks such as machine translation, as well as linguistic research and model training. Each pair consists of a sentence in English (the source language) and its translation in Spanish (the target language), which gives the data a degree of linguistic variety and flexibility.
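As a rough sketch of how such sentence pairs might be loaded, the snippet below parses lines into (English, Spanish) tuples. The tab-separated layout, the `parse_pairs` helper, and the sample sentences are assumptions for illustration; the actual file format used by the project may differ.

```python
def parse_pairs(raw_text):
    """Split tab-separated lines into (english, spanish) tuples, skipping malformed lines."""
    pairs = []
    for line in raw_text.splitlines():
        fields = [field.strip() for field in line.split("\t")]
        # Keep only lines that have a non-empty source and target sentence.
        if len(fields) < 2 or not fields[0] or not fields[1]:
            continue
        pairs.append((fields[0], fields[1]))
    return pairs

# Hypothetical sample in the assumed "English<TAB>Spanish" layout.
sample = "Go.\tVe.\nI am hungry.\tTengo hambre.\nmalformed line without a tab\n"
pairs = parse_pairs(sample)
for english, spanish in pairs:
    print(f"{english} -> {spanish}")
```

Any extra columns after the first two (for example, attribution text) are ignored by this sketch.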

Problem Definition:

We aim to address the need for more accurate and efficient voice translation for people who travel or otherwise communicate with speakers of other languages.

Methods:

Data Preprocessing Methods Identified:

ML Algorithms/Models Identified:

(Potential) Results and Discussion:

Quantitative Metrics:

Project Goals:

Expected Results:

References:

  1. M. H. A. R. Al-Azzeh and H. A. A. Al-Ramahi, "Voice Translation System: A Review," International Journal of Advanced Computer Science and Applications, vol. 10, no. 1, pp. 265-272, 2019. DOI: 10.14569/IJACSA.2019.0100133.
  2. Y. Wu et al., "Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation," Google Research, 2016.
  3. M. G. Zeyer, J. G. von Neumann, and A. J. Spang, "Evaluating the Effectiveness of Voice Translation Systems for Communication in International Business," Journal of Language and Business, vol. 9, no. 2, pp. 1-15, 2020.
  4. D. Bahdanau, K. Cho, and Y. Bengio, "Neural Machine Translation by Jointly Learning to Align and Translate," ICLR, 2015.
  5. "Model Behind Google Translate: Seq2seq in Machine Learning," Analytics Vidhya, Feb. 2023.

Gantt Chart | Contribution Table

Video Presentation

Voice Translation System: GitHub Repository

Midterm Checkpoint

This midterm checkpoint for our Voice-Based Language Translation System covers data preprocessing, machine learning, and model training for English-Spanish translation using a Sequence-to-Sequence (Seq2Seq) model with a GRU-based encoder-decoder architecture.
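For readers unfamiliar with the encoder half of a GRU-based Seq2Seq model, the following is a minimal sketch of a single GRU cell and an encoder loop in plain Python. It is not the project's implementation: the helper names, the random weights, the sizes (3 inputs, 4 hidden units), and the update convention h' = (1 - z) * h + z * candidate are all assumptions chosen for illustration (some libraries swap the roles of z and 1 - z).

```python
import math
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))

def init_params(input_size, hidden_size, seed=0):
    """Random illustrative weights for the update (z), reset (r), and candidate (h) gates."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    params = {}
    for gate in ("z", "r", "h"):
        params["W" + gate] = mat(hidden_size, input_size)   # input weights
        params["U" + gate] = mat(hidden_size, hidden_size)  # recurrent weights
        params["b" + gate] = [0.0] * hidden_size            # bias
    return params

def gru_step(params, x, h):
    """One GRU update: gates decide how much of the old state to keep."""
    def pre_activation(gate, hidden):
        wx = matvec(params["W" + gate], x)
        uh = matvec(params["U" + gate], hidden)
        return [a + b + c for a, b, c in zip(wx, uh, params["b" + gate])]
    z = [sigmoid(v) for v in pre_activation("z", h)]          # update gate
    r = [sigmoid(v) for v in pre_activation("r", h)]          # reset gate
    gated = [ri * hi for ri, hi in zip(r, h)]                 # reset applied to old state
    cand = [math.tanh(v) for v in pre_activation("h", gated)] # candidate state
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def encode(params, sequence, hidden_size):
    """Run the GRU over a sequence and return the final hidden state (the context vector)."""
    h = [0.0] * hidden_size
    for x in sequence:
        h = gru_step(params, x, h)
    return h

params = init_params(input_size=3, hidden_size=4)
# Hypothetical source sentence: three one-hot "token" vectors.
source = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
context = encode(params, source, hidden_size=4)
```

In an encoder-decoder setup, a decoder GRU would typically start from `context` and emit target-language tokens one at a time; a real model would also learn the weights from data rather than sample them randomly.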

Introduction/Background

The Voice Translation System aims to bridge communication gaps by providing accurate translations of spoken language. Voice translation systems have become essential tools for communicating across languages. Products from large technology companies, such as Google Translate and Meta's new Ray-Ban smart glasses, highlight the growing importance of voice translation technology and are making it more accessible and practical in everyday situations.

Our project aspires to design an effective and accurate translator that not only competes with these established solutions but also addresses their limitations. By leveraging advanced machine learning techniques, we will develop a robust translation system that can enhance the quality and speed of translations, making them more reliable for users in real-time communication.

Problem Definition

We aim to address the need for more accurate and efficient voice translation for people who travel or otherwise communicate with speakers of other languages.

Methods

The preprocessing of the dataset is performed using various techniques:

Final Report

This section will present our final report, summarizing the work completed, the results obtained, and the conclusions drawn from the project.

Introduction/Background

Problem Definition

Methods

Data Preprocessing Methods