[1] Word2Vec - Efficient Estimation of Word Representations in Vector Space

Title: Efficient Estimation of Word Representations in Vector Space

Authors & Year: Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean, 2013

Link: https://arxiv.org/abs/1301.3781

Objective: Develop an efficient method to learn continuous word embeddings that capture semantic meaning in a vector space.

Context: Before Word2Vec, NLP techniques mainly relied on sparse, high-dimensional representations like bag-of-words, which did not efficiently capture semantic meaning. Word2Vec was a breakthrough in representing words as dense, low-dimensional vectors.
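
To make the contrast concrete, here is a minimal sketch (not from the paper): a sparse bag-of-words style vector has one dimension per vocabulary word and is mostly zeros, while a dense embedding is a small learned vector. The vocabulary, dimensions, and values below are illustrative assumptions.

```python
import numpy as np

# Toy vocabulary; real vocabularies contain tens of thousands of words.
vocab = ["king", "queen", "man", "woman", "apple"]

# Sparse, high-dimensional representation: one dimension per vocabulary word.
# "king" becomes a one-hot vector of length |V| with a single non-zero entry.
bow_king = np.zeros(len(vocab))
bow_king[vocab.index("king")] = 1.0        # [1, 0, 0, 0, 0]

# Dense, low-dimensional embedding: a learned real-valued vector.
# (These values are made up; Word2Vec would learn them from a corpus.)
embedding_king = np.array([0.21, -0.47, 0.88, 0.05])

print(bow_king)        # length grows with vocabulary size, mostly zeros
print(embedding_king)  # fixed small dimension, every entry carries signal
```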

Key Contributions:

  • Introduced the Word2Vec framework with two learning architectures: Continuous Bag-of-Words (CBOW) and Skip-gram.

  • Demonstrated the ability to capture semantic and syntactic relationships in the vector space.

Methodology:

  • CBOW predicts a target word based on its context words.

  • Skip-gram predicts context words given a target word.

  • Both models drop the expensive non-linear hidden layer used in earlier neural language models, making training computationally efficient and scalable to very large corpora.
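
A minimal sketch of how the two architectures frame the prediction task, assuming a toy sentence and window size; the pair-generation logic below is purely illustrative, not the paper's implementation.

```python
# Generate training pairs for CBOW and Skip-gram from one toy sentence.
sentence = ["the", "quick", "brown", "fox", "jumps"]
window = 2  # number of context words considered on each side of the target

cbow_pairs, skipgram_pairs = [], []
for i, target in enumerate(sentence):
    # Context = words within `window` positions of the target (excluding it).
    context = [sentence[j]
               for j in range(max(0, i - window), min(len(sentence), i + window + 1))
               if j != i]
    # CBOW: predict the target word from its surrounding context.
    cbow_pairs.append((context, target))
    # Skip-gram: predict each context word from the target word.
    skipgram_pairs.extend((target, c) for c in context)

print(cbow_pairs[2])       # (['the', 'quick', 'fox', 'jumps'], 'brown')
print(skipgram_pairs[:3])  # [('the', 'quick'), ('the', 'brown'), ('quick', 'the')]
```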

Results:

  • Word2Vec achieved state-of-the-art performance on word analogy and similarity tasks.

  • Showcased the ability to perform arithmetic with word vectors (e.g., "king" - "man" + "woman" ≈ "queen").
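
A minimal sketch of that analogy arithmetic, using hand-picked 3-dimensional vectors purely for illustration; real Word2Vec embeddings are learned from a corpus and much higher-dimensional.

```python
import numpy as np

# Made-up embeddings chosen so the analogy works; not actual Word2Vec output.
vectors = {
    "king":  np.array([0.80, 0.65, 0.15]),
    "queen": np.array([0.78, 0.10, 0.72]),
    "man":   np.array([0.10, 0.90, 0.05]),
    "woman": np.array([0.08, 0.35, 0.62]),
    "apple": np.array([0.05, 0.05, 0.10]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "king" - "man" + "woman" should land closest to "queen".
query = vectors["king"] - vectors["man"] + vectors["woman"]
best = max((w for w in vectors if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(query, vectors[w]))
print(best)  # queen
```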

Impact:

  • Revolutionized the field of NLP by enabling the use of continuous word embeddings.

  • Laid the foundation for subsequent developments in word representation learning and NLP models.

Takeaways:

  • Word2Vec learns meaningful word embeddings by predicting surrounding words in a sentence, using either CBOW or Skip-gram architectures.

  • Continuous word embeddings enable arithmetic with word vectors, capturing semantic relationships, such as finding synonyms or solving analogies.

  • Word2Vec has significantly influenced the field of NLP and the development of more advanced models, paving the way for innovations like transformer-based architectures.
