Introduction to Transformer models

Introduction The “Attention Is All You Need” paper by Vaswani et al. introduced the Transformer architecture, which has become a foundational model for many natural language processing (NLP) and other sequence-to-sequence tasks. Transformers are neural networks that learn context and meaning by analyzing sequential data. They rely on an evolving set of mathematical techniques generally known as attention or self-attention. This self-attention mechanism lets each word in a sequence consider the entire context of the sentence, rather than just the words that came before it....
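As a rough illustration of the self-attention idea described in this excerpt, here is a minimal NumPy sketch of single-head scaled dot-product attention. The projection matrices are random stand-ins for learned weights, and the sizes (4 tokens, 8 dimensions) are arbitrary assumptions chosen for readability, not values taken from the post.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    # Project every token embedding into query, key, and value vectors.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Each row holds one token's similarity to every token in the sequence.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Normalize each row into attention weights that sum to 1.
    weights = softmax(scores, axis=-1)
    # Each output row is a weighted mix of all value vectors.
    return weights @ v, weights


rng = np.random.default_rng(0)
seq_len, d_model = 4, 8  # hypothetical sizes for illustration
x = rng.normal(size=(seq_len, d_model))  # stand-in token embeddings
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

output, weights = self_attention(x, w_q, w_k, w_v)
print(output.shape, weights.shape)
```

Every row of `weights` spans the whole sequence, which is what lets each token draw on the full sentence rather than only the tokens before it.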

May 16, 2024 · 5 min · Pravi Devineni, PhD

Efficient Fine-Tuning of LLMs with LoRA, QLoRA and DoRA

Fine-Tuning LLMs Fine-tuning is the process of adapting a pre-trained large language model (LLM) to a specific task or dataset. It involves further training the model on a smaller, specialized dataset to tailor its responses to be more relevant to particular domains or applications. The key benefits of fine-tuning LLMs include: improving performance on specific tasks by making the model more domain-specific; reducing the amount of data required to train the model effectively; and enhancing the model’s efficiency and making it more suitable for production use cases. Standard Fine-tuning vs....

May 8, 2024 · 10 min · Pravi Devineni, PhD

RAG: Easy to use but hard to master

Introduction The Retrieval-Augmented Generation (RAG) framework combines the benefits of information retrieval systems with the generative capability of large language models. RAG is particularly useful in tasks that require a deep understanding of the query to generate contextually relevant responses. RAG workflow RAG involves two main components: a document retriever and a large language model (LLM). The retriever is responsible for finding relevant documents based on the input query, and the LLM, acting as the generator, uses the retrieved documents and the original query to generate a response....
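The two-component workflow in this excerpt can be sketched in a few lines of plain Python. This is a toy under stated assumptions: the retriever ranks documents by bag-of-words cosine similarity, where real systems typically use learned embeddings, and the generator step is reduced to building a prompt because no particular LLM API is named here. The function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    # Lowercased whitespace tokens with their counts.
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    # Cosine similarity between two token-count vectors.
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=2):
    # Retriever: rank documents by similarity to the query, keep the best.
    query_vec = bag_of_words(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, bag_of_words(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, retrieved):
    # Generator input: the retrieved documents plus the original query.
    # A real pipeline would send this prompt to an LLM.
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical toy corpus.
documents = [
    "LoRA adds low-rank adapter matrices to a frozen model",
    "Pandas provides Series and DataFrame data structures",
    "Git records every change to the code",
]
query = "Which data structures does Pandas provide"
retrieved = retrieve(query, documents, top_k=1)
prompt = build_prompt(query, retrieved)
print(prompt)
```

Keeping retrieval and prompt construction in separate functions mirrors the retriever/generator split, so either side can be replaced independently.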

April 29, 2024 · 6 min · Pravi Devineni, PhD
Pandas Cheat Sheet

Data Exploration with Python using Pandas

Install and import Pandas

    pip install pandas
    import numpy as np
    import pandas as pd

Pandas Data Structures
The core value of Pandas comes from the data structures it provides, primarily Series (labeled, homogeneously typed, one-dimensional arrays) and DataFrames (labeled, potentially heterogeneously typed, two-dimensional arrays).

Pandas Series
Create Series

Create empty Series

    s = pd.Series(dtype='float64')

Create Series from dictionary

    d = {'a': 1, 'b': 2, 'c': 3}
    s = pd.Series(d)

Create Series from NumPy array...
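To make the distinction between the two structures concrete, the sketch below builds one Series and one DataFrame and inspects their dtypes. The labels and values are made up for illustration and do not come from the cheat sheet.

```python
import numpy as np
import pandas as pd

# A Series is one-dimensional and holds a single dtype for all its values.
s = pd.Series(np.array([1.5, 2.5, 3.5]), index=["a", "b", "c"])

# A DataFrame is two-dimensional; each column can have its own dtype.
df = pd.DataFrame(
    {
        "label": ["x", "y", "z"],   # text column
        "count": [1, 2, 3],         # integer column
        "score": [0.1, 0.2, 0.3],   # float column
    }
)

print(s.dtype)    # one dtype for the whole Series
print(df.dtypes)  # one dtype per column
```

Values are looked up by label in both cases, for example `s["b"]` or `df["score"]`.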

April 30, 2023 · 2 min · Pravi Devineni, PhD
GitHub Cheat Sheet

GitHub Cheatsheet for Data Scientists

Git is an open-source tool for managing code, and it is very helpful for code development and collaboration. Git provides version control, which means every change to the code is recorded in a database. If a mistake is made, version control lets us go back in time, compare the current code with prior versions, and fix the error with minimal disruption to the other people working on that code....

February 28, 2023 · 4 min · Pravi Devineni, PhD