LEARN COMPLETE PYTHON IN 24 HOURS

Pandas is one of the most widely used Python libraries for data wrangling, cleaning, exploration, and analysis. It is built on top of NumPy and provides high-level data structures (Series and DataFrame) that make working with tabular data feel like using Excel or SQL, with the full flexibility of Python.

Install Pandas (if not using Anaconda)

Bash

pip install pandas

Standard import (always use this)

Python

import pandas as pd

3.1 Series and DataFrame – Core Data Structures

Series: a one-dimensional labeled array (like a column in Excel or a vector with labels).

Python

# Create Series from list
s1 = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s1)
# a    10
# b    20
# c    30
# d    40
# dtype: int64

# Access by label
print(s1['b'])  # 20

# From dictionary
s2 = pd.Series({'Math': 85, 'Science': 92, 'English': 78})
print(s2['Science'])  # 92
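Because a Series carries labels, arithmetic between two Series matches values by label rather than by position. The sketch below illustrates this with made-up marks for two terms; the names `term1`, `term2`, `total`, and `high` are hypothetical and chosen only for this example.

```python
import pandas as pd

# Hypothetical marks for two terms; the subject labels only partly overlap.
term1 = pd.Series({'Math': 85, 'Science': 92, 'English': 78})
term2 = pd.Series({'Science': 88, 'Math': 90, 'History': 70})

# Addition pairs values by label, not by position.
# Labels found in only one Series come out as NaN.
total = term1 + term2
print(total)

# Boolean filtering keeps the labels of the entries that match.
high = term1[term1 > 80]
print(high.index.tolist())
```

If NaN for non-overlapping labels is not wanted, `term1.add(term2, fill_value=0)` treats a missing entry as zero instead.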

DataFrame: a two-dimensional labeled data structure (like a spreadsheet or SQL table). It is the core of Pandas.

Python

# Create DataFrame from dictionary
data = {
    'Name': ['Anshuman', 'Priya', 'Rahul', 'Sneha'],
    'Age': [25, 23, 24, 22],
    'City': ['Ranchi', 'Delhi', 'Patna', 'Kolkata'],
    'Marks': [92, 88, 85, 90]
}
df = pd.DataFrame(data)
print(df)
#        Name  Age     City  Marks
# 0  Anshuman   25   Ranchi     92
# 1     Priya   23    Delhi     88
# 2     Rahul   24    Patna     85
# 3     Sneha   22  Kolkata     90

# Basic inspection
print(df.head(2))     # first 2 rows
df.info()             # data types, non-null counts (prints directly and returns None)
print(df.describe())  # summary statistics
print(df.shape)       # (rows, columns) → (4, 4)

Quick access

Python

df['Name']             # Series: one column
df[['Name', 'Marks']]  # DataFrame: multiple columns
df.iloc[0]             # first row (position-based)
df.loc[0, 'Name']      # label-based access (row label 0, column 'Name')
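The difference between position-based and label-based access is easier to see when the row labels are not 0, 1, 2. The following sketch assumes a small table indexed by hypothetical roll numbers (101 to 103); these values are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Anshuman', 'Priya', 'Rahul'], 'Marks': [92, 88, 85]},
    index=[101, 102, 103],  # hypothetical roll numbers used as row labels
)

first_by_position = df.iloc[0]['Name']  # row at position 0
first_by_label = df.loc[101, 'Name']    # row whose label is 101

# Slices behave differently: iloc excludes the end position,
# loc includes the end label.
by_position = df.iloc[0:2]   # positions 0 and 1
by_label = df.loc[101:103]   # labels 101, 102 and 103
```

Here `df.loc[0, 'Name']` would raise a `KeyError`, because no row carries the label 0.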

3.2 Data Loading (CSV, Excel, JSON, SQL)

Pandas provides a reader function for most common data sources, including CSV, Excel, JSON, and SQL databases.

CSV

Python

df = pd.read_csv("sales_data.csv")
# Options: skiprows=2, usecols=['date', 'sales'], dtype={'sales': float}
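To see these options in action without a file on disk, the sketch below reads CSV text from an in-memory buffer. The file contents (two junk lines followed by a header) are invented for illustration, and `parse_dates` is an extra option not mentioned above.

```python
import io

import pandas as pd

# Hypothetical file contents: two junk lines, then the real header row.
raw = """report generated by export tool
internal use only
date,region,sales
2024-01-01,North,100
2024-01-02,South,250
"""

df = pd.read_csv(
    io.StringIO(raw),           # any file path or file-like object works here
    skiprows=2,                 # skip the two junk lines
    usecols=['date', 'sales'],  # load only these columns
    dtype={'sales': float},     # force the column type
    parse_dates=['date'],       # convert this column to datetime
)
print(df.dtypes)
```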

Excel

Python

df = pd.read_excel("report.xlsx", sheet_name="Sales", skiprows=1)
# Requires: pip install openpyxl (for .xlsx) or xlrd (for legacy .xls)

JSON

Python

df = pd.read_json("data.json")
# or pd.json_normalize() for nested JSON
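As a rough illustration of what `pd.json_normalize()` does with nested JSON, the sketch below flattens a list of records that each contain a nested `address` object. The records are made up for this example.

```python
import pandas as pd

# Hypothetical nested records, as they might come from json.load().
records = [
    {'id': 1, 'name': 'Priya', 'address': {'city': 'Delhi', 'pin': '110001'}},
    {'id': 2, 'name': 'Rahul', 'address': {'city': 'Patna', 'pin': '800001'}},
]

# Nested keys become dotted column names such as 'address.city'.
flat = pd.json_normalize(records)
print(flat.columns.tolist())
```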

SQL (with database connection)

Python

import sqlalchemy as sa

engine = sa.create_engine("sqlite:///mydb.db")
df = pd.read_sql("SELECT * FROM customers", engine)
# or pd.read_sql_query(query, engine)
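For a quick experiment that needs neither SQLAlchemy nor a database file, Pandas also accepts a standard-library `sqlite3` connection. The sketch below uses an in-memory database; the `customers` table and its rows are invented for illustration.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database, discarded when the connection closes.
conn = sqlite3.connect(":memory:")

# Hypothetical table contents, written with DataFrame.to_sql.
customers = pd.DataFrame({
    'name': ['Priya', 'Rahul', 'Sneha'],
    'city': ['Delhi', 'Patna', 'Delhi'],
})
customers.to_sql('customers', conn, index=False)

# Pass values through params instead of formatting them into the SQL string.
delhi = pd.read_sql_query(
    "SELECT name, city FROM customers WHERE city = ? ORDER BY name",
    conn,
    params=('Delhi',),
)
conn.close()
print(delhi)
```

Passing values through `params` keeps user input out of the SQL text, which helps avoid SQL injection.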

Quick save

Python

df.to_csv("cleaned_data.csv", index=False)
df.to_excel("report.xlsx", index=False)
df.to_json("data.json", orient="records")
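When no path is given, `to_csv` and `to_json` return the output as a string, which is a convenient way to check what `index=False` and `orient="records"` produce. The small table below is a made-up example.

```python
import io
import json

import pandas as pd

df = pd.DataFrame({'Name': ['Priya', 'Rahul'], 'Marks': [88, 85]})

# With no path argument, the writers return text instead of creating a file.
csv_text = df.to_csv(index=False)          # no extra index column
json_text = df.to_json(orient='records')   # one JSON object per row

records = json.loads(json_text)
round_trip = pd.read_csv(io.StringIO(csv_text))
print(csv_text)
print(records)
```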

3.3 Data Cleaning, Filtering & Transformation

Real data is messy — Pandas excels at cleaning it.

Basic cleaning

Python

df = df.drop_duplicates()                         # remove duplicate rows
df = df.dropna(subset=['age', 'city'])            # drop rows with missing values
df['age'] = df['age'].fillna(df['age'].median())  # fill missing with median
df['salary'] = df['salary'].astype(float)         # change data type
df['date'] = pd.to_datetime(df['date'])           # convert to datetime
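The steps above can be tried on a small, deliberately messy table. This is a sketch on invented data; it drops rows only where `city` is missing, so that the median fill for `age` has something to do.

```python
import pandas as pd

# Hypothetical messy data: a duplicate row, missing values, numbers stored as text.
raw = pd.DataFrame({
    'age': [25, None, 31, 25],
    'city': ['Delhi', 'Patna', None, 'Delhi'],
    'salary': ['50000', '62000', '58000', '50000'],
    'date': ['2024-01-05', '2024-02-10', '2024-03-15', '2024-01-05'],
})

clean = raw.drop_duplicates()          # the last row repeats the first
clean = clean.dropna(subset=['city'])  # drop rows with no city
clean = clean.assign(
    age=clean['age'].fillna(clean['age'].median()),  # fill missing age
    salary=clean['salary'].astype(float),            # text to float
    date=pd.to_datetime(clean['date']),              # text to datetime
)
print(clean)
```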

Filtering

Python

high_salary = df[df['salary'] > 80000]
young_delhi = df[(df['age'] < 30) & (df['city'] == 'Delhi')]
top_10 = df.nlargest(10, 'marks')
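A small runnable version of these filters, on invented data, is sketched below. It also shows `DataFrame.query` as an alternative spelling of the same condition; that method is not used in the text above.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'age': [25, 35, 28, 22],
    'city': ['Delhi', 'Delhi', 'Patna', 'Delhi'],
    'marks': [92, 95, 85, 90],
})

# Each condition needs its own parentheses, and & replaces Python's "and".
young_delhi = df[(df['age'] < 30) & (df['city'] == 'Delhi')]

# The same filter written as a query string.
same_with_query = df.query("age < 30 and city == 'Delhi'")

# Rows with the two highest marks, highest first.
top_2 = df.nlargest(2, 'marks')
```

Leaving out the parentheses around each comparison is a common mistake, because `&` binds more tightly than `<` and `==`.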

Transformations

Python

import numpy as np  # needed for np.inf below

df['tax'] = df['salary'] * 0.18  # new column
df['full_name'] = df['first'] + " " + df['last']
df['salary_category'] = pd.cut(df['salary'],
                               bins=[0, 50000, 100000, np.inf],
                               labels=['Low', 'Medium', 'High'])
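One detail of `pd.cut` that is easy to miss is that, by default, each bin includes its right edge. The sketch below shows this on a few invented salaries, where 50000 falls into "Low" rather than "Medium".

```python
import numpy as np
import pandas as pd

# Hypothetical salaries, including one exactly on a bin edge.
df = pd.DataFrame({'salary': [30000, 50000, 75000, 120000]})

df['tax'] = df['salary'] * 0.18

# Default bins are (0, 50000], (50000, 100000], (100000, inf).
df['salary_category'] = pd.cut(
    df['salary'],
    bins=[0, 50000, 100000, np.inf],
    labels=['Low', 'Medium', 'High'],
)
print(df)
```

Passing `right=False` makes the bins include their left edge instead.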

3.4 Grouping, Aggregation & Pivot Tables

Groupby – most powerful feature

Python

# Group by city and calculate mean salary
df.groupby('city')['salary'].mean()

# Multiple aggregations
df.groupby('city').agg({
    'salary': ['mean', 'max', 'count'],
    'age': 'median'
})
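The dictionary form of `agg` returns two-level column names. An alternative is named aggregation, which yields flat, readable column names. The sketch below uses invented data, and the output column names (`mean_salary` and so on) are arbitrary choices for this example.

```python
import pandas as pd

# Hypothetical employee data.
df = pd.DataFrame({
    'city': ['Delhi', 'Delhi', 'Patna', 'Patna', 'Ranchi'],
    'salary': [50000, 70000, 40000, 60000, 55000],
    'age': [25, 35, 28, 32, 30],
})

# Named aggregation: output_column=(input_column, function).
summary = df.groupby('city').agg(
    mean_salary=('salary', 'mean'),
    max_salary=('salary', 'max'),
    headcount=('salary', 'count'),
    median_age=('age', 'median'),
)
print(summary)
```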

Pivot Tables

Python

pd.pivot_table(df,
               values='salary',
               index='city',
               columns='department',
               aggfunc='mean',
               fill_value=0)
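To make the role of each argument concrete, the sketch below runs the same call on a tiny invented table in which Patna has no HR employees, so `fill_value=0` fills that cell.

```python
import pandas as pd

# Hypothetical data: Patna has no HR rows.
df = pd.DataFrame({
    'city': ['Delhi', 'Delhi', 'Patna', 'Patna'],
    'department': ['IT', 'HR', 'IT', 'IT'],
    'salary': [70000, 50000, 60000, 80000],
})

# Rows come from 'city', columns from 'department', cells hold the mean salary.
pivot = pd.pivot_table(
    df,
    values='salary',
    index='city',
    columns='department',
    aggfunc='mean',
    fill_value=0,  # used where a city/department combination has no rows
)
print(pivot)
```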

Crosstab

Python

pd.crosstab(df['city'], df['gender'], margins=True)
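A crosstab counts how often each combination of two categorical columns occurs. The sketch below uses invented data; with `margins=True`, Pandas adds a row and a column labelled `All` holding the totals.

```python
import pandas as pd

# Hypothetical categorical data.
df = pd.DataFrame({
    'city': ['Delhi', 'Delhi', 'Patna', 'Patna', 'Patna'],
    'gender': ['F', 'M', 'F', 'F', 'M'],
})

# Frequency table of city against gender, with row and column totals.
counts = pd.crosstab(df['city'], df['gender'], margins=True)
print(counts)
```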

3.5 Handling Missing Values & Outliers

Missing values

Python

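# Check missing
df.isnull().sum()

# Fill missing (assign the result back; inplace=True on a selected column
# is deprecated in recent Pandas versions and may not update df)
df['age'] = df['age'].fillna(df['age'].median())
df['city'] = df['city'].fillna('Unknown')

# Drop missing
df.dropna(subset=['salary'], inplace=True)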
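Several columns can also be filled in one call by passing a dictionary of per-column fill values to `DataFrame.fillna`. The sketch below shows this on a small invented table.

```python
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({
    'age': [25, None, 35],
    'city': ['Delhi', None, 'Patna'],
    'salary': [50000, 60000, None],
})

missing_counts = df.isnull().sum()  # missing values per column

# One call, a different fill value for each column.
filled = df.fillna({'age': df['age'].median(), 'city': 'Unknown'})

# Rows that still lack a salary are dropped.
complete = filled.dropna(subset=['salary'])
print(complete)
```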

Detect & handle outliers

Python

# Using IQR method
Q1 = df['salary'].quantile(0.25)
Q3 = df['salary'].quantile(0.75)
IQR = Q3 - Q1
lower = Q1 - 1.5 * IQR
upper = Q3 + 1.5 * IQR

# Remove outliers
df_clean = df[(df['salary'] >= lower) & (df['salary'] <= upper)]

# Or cap them
df['salary'] = df['salary'].clip(lower=lower, upper=upper)
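The IQR rule can be checked on a small Series with one obvious outlier. The salaries below are invented, and the factor 1.5 is the conventional choice rather than a fixed requirement.

```python
import pandas as pd

# Hypothetical salaries; the last value is far from the rest.
salaries = pd.Series([40000, 42000, 45000, 47000, 50000, 52000, 55000, 300000])

q1 = salaries.quantile(0.25)
q3 = salaries.quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Option 1: drop values outside the fences.
filtered = salaries[(salaries >= lower) & (salaries <= upper)]

# Option 2: keep every row but cap extreme values at the fences.
capped = salaries.clip(lower=lower, upper=upper)

print(len(salaries), len(filtered), capped.max())
```

Dropping changes the number of rows, while capping keeps the row count and only limits the extreme values.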

Mini Summary Project – Quick EDA on Sample Dataset

Python

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load sample (or your own CSV)
df = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv")

# Quick look
print(df.head())
df.info()  # prints directly and returns None
print(df.describe())

# Missing values
print(df.isnull().sum())

# Group analysis
print(df.groupby('day')['total_bill'].mean())

# Visualization
sns.boxplot(x='day', y='total_bill', data=df)
plt.title("Total Bill by Day")
plt.show()

This completes the Pandas (Data Manipulation & Analysis) section, which covers one of the most important tools for practical data science work in Python.

