# \[1] Word2Vec - Efficient Estimation of Word Representations in Vector Space

**Title**: Efficient Estimation of Word Representations in Vector Space

**Authors & Year**: Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean, 2013

**Link**: <https://arxiv.org/abs/1301.3781>

**Objective**: Develop an efficient method to learn continuous word embeddings that capture semantic meaning in a vector space.

**Context**: Before Word2Vec, NLP systems mainly relied on sparse, high-dimensional representations such as bag-of-words and one-hot encodings, which treat words as atomic symbols and capture little of the semantic similarity between them. Word2Vec was a breakthrough in representing words as dense, low-dimensional vectors.
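To make the contrast concrete, here is a small Python illustration (the embedding numbers are made up purely for demonstration): one-hot vectors make every pair of distinct words equally dissimilar, while dense embeddings give graded similarity.

```python
import numpy as np

vocab = ["king", "queen", "man", "woman"]  # toy vocabulary
V = len(vocab)                             # real models: V ~ 10^5-10^6, dims ~ 100-300

# Sparse one-hot: every pair of distinct words has dot product 0,
# so "king" is no closer to "queen" than to "man".
one_hot = np.eye(V)

# Dense embeddings (made-up numbers for illustration): similarity is graded.
embeddings = np.array([
    [0.90, 0.80, 0.1],   # king
    [0.85, 0.75, 0.2],   # queen
    [0.90, -0.70, 0.0],  # man
    [0.85, -0.60, 0.1],  # woman
])

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cos(one_hot[0], one_hot[1]))        # 0.0 for any distinct pair
print(cos(embeddings[0], embeddings[1]))  # high: king ~ queen
print(cos(embeddings[0], embeddings[2]))  # lower: king vs. man
```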

**Key Contributions:**

* Introduced the Word2Vec framework with two learning architectures: Continuous Bag-of-Words (CBOW) and Skip-gram.
* Demonstrated the ability to capture semantic and syntactic relationships in the vector space.

**Methodology:**

* CBOW predicts a target word from its surrounding context words; the context embeddings are averaged, so word order within the window is ignored.
* Skip-gram does the reverse: given a target word, it predicts each of the words in its context window.
* Both are shallow log-linear models with no non-linear hidden layer, which makes them far cheaper to train than earlier feed-forward and recurrent neural language models (see the sketch below).
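To make the two training objectives concrete, here is a minimal Skip-gram sketch on a toy corpus; CBOW differs only in averaging the context rows of the input matrix and predicting the target. The corpus, dimensions, and full-softmax loss are illustrative choices for readability, not the paper's settings: the paper uses hierarchical softmax so training scales to large vocabularies.

```python
import numpy as np

# Toy corpus and hyperparameters chosen for readability.
corpus = "the king rules the realm and the queen rules the realm".split()
vocab = sorted(set(corpus))
word2id = {w: i for i, w in enumerate(vocab)}
V, D, WINDOW = len(vocab), 16, 2

def skipgram_pairs(tokens, window):
    """Yield (target, context) id pairs: Skip-gram predicts context from target."""
    for i in range(len(tokens)):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                yield word2id[tokens[i]], word2id[tokens[j]]

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, D))    # input ("projection") vectors
W_out = rng.normal(scale=0.1, size=(V, D))   # output vectors

# Full softmax for clarity; the paper uses hierarchical softmax for efficiency.
lr = 0.05
for epoch in range(200):
    for t, c in skipgram_pairs(corpus, WINDOW):
        h = W_in[t].copy()                   # target word's embedding
        scores = W_out @ h                   # one score per vocabulary word
        p = np.exp(scores - scores.max())
        p /= p.sum()                         # softmax probabilities
        grad = p
        grad[c] -= 1.0                       # d(cross-entropy)/d(scores)
        W_in[t] -= lr * (W_out.T @ grad)     # update target embedding
        W_out -= lr * np.outer(grad, h)      # update output vectors

# After training, the rows of W_in serve as the word embeddings.
print(np.round(W_in[word2id["king"]][:4], 3))
```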

**Results:**

* Word2Vec achieved state-of-the-art performance on word analogy and similarity tasks while training far faster than earlier neural language models.
* Showcased arithmetic with word vectors: e.g., vector("king") - vector("man") + vector("woman") lands closest to vector("queen") (see the sketch below).
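The analogy result is easy to reproduce with pretrained vectors. Here is a minimal sketch using the third-party gensim library (not the paper's original C code), assuming gensim is installed and its downloader can fetch the pretrained Google News vectors (a large download, roughly 1.6 GB):

```python
import gensim.downloader as api

# Load pretrained 300-dimensional Word2Vec vectors as gensim KeyedVectors.
vectors = api.load("word2vec-google-news-300")

# vector("king") - vector("man") + vector("woman") -> nearest neighbors.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
# Expected top result: ("queen", ...), as reported in the paper.

# Plain similarity queries work the same way.
print(vectors.similarity("car", "automobile"))
```

Note that `most_similar` compares unit-normalized vectors and excludes the query words themselves from the returned neighbors.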

**Impact:**

* Revolutionized the field of NLP by enabling the use of continuous word embeddings.
* Laid the foundation for subsequent developments in word representation learning and NLP models.

**Takeaways:**

* Word2Vec learns meaningful word embeddings by predicting surrounding words in a sentence, using either CBOW or Skip-gram architectures.
* Continuous word embeddings support vector arithmetic that captures semantic relationships, e.g., finding similar words or solving analogies.
* Word2Vec has significantly influenced the field of NLP and the development of more advanced models, paving the way for innovations like transformer-based architectures.
