LEARN COMPLETE PYTHON IN 24 HOURS

🟦 Table of Contents – Master Data Science with Python

🔹 1. Introduction to Data Science & Python Setup

  • 1.1 What is Data Science and Why Python

  • 1.2 Data Science Career Paths

  • 1.3 Python Environment Setup

  • 1.4 Essential Libraries Overview

🔹 2. NumPy – Foundation of Numerical Computing

  • 2.1 NumPy Arrays vs Python Lists

  • 2.2 Array Operations, Broadcasting & Vectorization

  • 2.3 Indexing, Slicing & Array Manipulation

  • 2.4 Mathematical & Statistical Functions

🔹 3. Pandas – Data Manipulation & Analysis

  • 3.1 Series and DataFrame

  • 3.2 Data Loading

  • 3.3 Data Cleaning & Transformation

  • 3.4 Grouping & Aggregation

  • 3.5 Handling Missing Values & Outliers

🔹 4. Data Visualization with Matplotlib & Seaborn

  • 4.1 Matplotlib Basics

  • 4.2 Seaborn Visualization

  • 4.3 Advanced Plots

  • 4.4 Publication-Ready Visualizations

🔹 5. Exploratory Data Analysis (EDA)

  • 5.1 Data Distribution & Summary Statistics

  • 5.2 Univariate, Bivariate & Multivariate Analysis

  • 5.3 Correlation Analysis

  • 5.4 EDA Case Study

🔹 6. Data Preprocessing & Feature Engineering

  • 6.1 Data Scaling & Normalization

  • 6.2 Encoding Categorical Variables

  • 6.3 Feature Selection

  • 6.4 Handling Imbalanced Data

🔹 7. Statistics & Probability for Data Science

  • 7.1 Descriptive vs Inferential Statistics

  • 7.2 Hypothesis Testing

  • 7.3 Probability Distributions

  • 7.4 Correlation & Regression

🔹 8. Machine Learning with Scikit-learn

  • 8.1 Supervised Learning

  • 8.2 Model Training & Evaluation

  • 8.3 Cross-Validation

  • 8.4 Unsupervised Learning

🔹 9. Advanced Data Science Topics

  • 9.1 Time Series Analysis

  • 9.2 NLP Basics

  • 9.3 Deep Learning Introduction

  • 9.4 Model Deployment

🔹 10. Real-World Projects & Case Studies

  • 10.1 House Price Prediction

  • 10.2 Customer Churn Prediction

  • 10.3 Sentiment Analysis

  • 10.4 Sales Dashboard

🔹 11. Best Practices, Portfolio & Career Guidance

  • 11.1 Clean Code Practices

  • 11.2 Portfolio Building

  • 11.3 Git & Resume Tips

  • 11.4 Interview Preparation

🔹 12. Next Steps & Learning Roadmap

  • 12.1 Advanced Topics

  • 12.2 Books & Resources

  • 12.3 Career Opportunities

3. Pandas – Data Manipulation & Analysis

Pandas is one of the most widely used Python libraries for data wrangling, cleaning, exploration, and analysis. It is built on top of NumPy and provides two high-level data structures, Series and DataFrame, that make working with tabular data feel like using Excel or SQL, with the full flexibility of Python code.

Install Pandas (if not using Anaconda)

Bash

pip install pandas

Standard import (always use this)

Python

import pandas as pd

3.1 Series and DataFrame – Core Data Structures

Series: a one-dimensional labeled array (like a column in Excel or a vector with labels).

Python

# Create Series from list
s1 = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s1)
# a    10
# b    20
# c    30
# d    40
# dtype: int64

# Access by label
print(s1['b'])  # 20

# From dictionary
s2 = pd.Series({'Math': 85, 'Science': 92, 'English': 78})
print(s2['Science'])  # 92
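Because a Series carries labels, arithmetic between two Series matches values by label rather than by position. The following is a minimal sketch of that behavior; the names term1 and term2 and their values are made up for illustration.

```python
import pandas as pd

# Two hypothetical sets of marks, deliberately listed in a different order
term1 = pd.Series({'Math': 85, 'Science': 92, 'English': 78})
term2 = pd.Series({'Science': 88, 'English': 80, 'Math': 90})

# Values are paired up by label, not by position
average = (term1 + term2) / 2

# A comparison produces a boolean Series that can be used as a filter
high = term1[term1 > 80]

print(average['Math'])  # 87.5
print(len(high))        # 2
```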

DataFrame: a two-dimensional labeled data structure (like a spreadsheet or SQL table). It is the core object in Pandas.

Python

# Create DataFrame from dictionary
data = {
    'Name': ['Anshuman', 'Priya', 'Rahul', 'Sneha'],
    'Age': [25, 23, 24, 22],
    'City': ['Ranchi', 'Delhi', 'Patna', 'Kolkata'],
    'Marks': [92, 88, 85, 90]
}
df = pd.DataFrame(data)
print(df)
#        Name  Age     City  Marks
# 0  Anshuman   25   Ranchi     92
# 1     Priya   23    Delhi     88
# 2     Rahul   24    Patna     85
# 3     Sneha   22  Kolkata     90

# Basic inspection
print(df.head(2))     # first 2 rows
df.info()             # data types, non-null counts (prints directly and returns None)
print(df.describe())  # summary statistics
print(df.shape)       # (rows, columns) → (4, 4)

Quick access

Python

df['Name']             # Series – one column
df[['Name', 'Marks']]  # DataFrame – multiple columns
df.iloc[0]             # first row (position-based)
df.loc[0, 'Name']      # label-based access
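The difference between position-based (iloc) and label-based (loc) access is easiest to see once the index is no longer 0, 1, 2. The sketch below reuses part of the sample data and sets Name as the index; that choice, and the name by_name, are assumptions made only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anshuman', 'Priya', 'Rahul', 'Sneha'],
    'Age': [25, 23, 24, 22],
    'Marks': [92, 88, 85, 90],
})

# Use Name as the row label (illustrative assumption)
by_name = df.set_index('Name')

first_row_marks = by_name.iloc[0]['Marks']   # position 0 is Anshuman's row
priya_marks = by_name.loc['Priya', 'Marks']  # row labeled 'Priya'

# loc slices include the end label; iloc slices exclude the end position
loc_slice = by_name.loc['Priya':'Rahul']     # 2 rows
iloc_slice = by_name.iloc[1:2]               # 1 row

print(first_row_marks, priya_marks)          # 92 88
print(len(loc_slice), len(iloc_slice))       # 2 1
```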

3.2 Data Loading (CSV, Excel, JSON, SQL)

Pandas has a reader function for most common data sources: read_csv, read_excel, read_json, and read_sql.

CSV

Python

df = pd.read_csv("sales_data.csv")
# Options: skiprows=2, usecols=['date','sales'], dtype={'sales':float}
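To see what those options do without needing a file on disk, the sketch below reads CSV text from an in-memory buffer. The file contents are invented for illustration, and parse_dates is an extra option not listed above.

```python
import io
import pandas as pd

# Hypothetical export with two junk lines before the real header
raw = io.StringIO(
    "Report generated by export tool\n"
    "Region: North\n"
    "date,sales,notes\n"
    "2024-01-01,100,ok\n"
    "2024-01-02,250,promo\n"
)

sales = pd.read_csv(
    raw,
    skiprows=2,                 # skip the two non-data lines at the top
    usecols=['date', 'sales'],  # load only these columns
    dtype={'sales': float},     # force sales to float64
    parse_dates=['date'],       # parse the date column as datetime
)

print(sales.dtypes)
print(sales['sales'].sum())     # 350.0
```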

Excel

Python

df = pd.read_excel("report.xlsx", sheet_name="Sales", skiprows=1)
# Requires an Excel engine: pip install openpyxl (for .xlsx) or xlrd (for legacy .xls)

JSON

Python

df = pd.read_json("data.json")
# or pd.json_normalize() for nested JSON
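For nested JSON, pd.json_normalize flattens inner objects into dotted column names. A small sketch, using made-up records that stand in for already-parsed JSON:

```python
import pandas as pd

# Hypothetical nested records, e.g. the result of json.load() on an API response
records = [
    {'id': 1, 'customer': {'name': 'Priya', 'city': 'Delhi'}},
    {'id': 2, 'customer': {'name': 'Rahul', 'city': 'Patna'}},
]

# Nested keys become columns named 'customer.name' and 'customer.city'
flat = pd.json_normalize(records)

print(sorted(flat.columns))
print(flat.loc[0, 'customer.city'])  # Delhi
```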

SQL (with database connection)

Python

import sqlalchemy as sa

engine = sa.create_engine("sqlite:///mydb.db")
df = pd.read_sql("SELECT * FROM customers", engine)
# or pd.read_sql_query(query, engine)
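If you want to try SQL loading without an existing database, the sketch below builds a throwaway in-memory SQLite database with the standard-library sqlite3 module instead of SQLAlchemy. The table contents are invented for illustration.

```python
import sqlite3
import pandas as pd

# Temporary in-memory database (nothing is written to disk)
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE customers (name TEXT, city TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [('Priya', 'Delhi', 23), ('Rahul', 'Patna', 24), ('Sneha', 'Kolkata', 22)],
)
conn.commit()

# Load a whole table
customers = pd.read_sql("SELECT * FROM customers", conn)

# Pass values through params instead of formatting them into the query string
young = pd.read_sql_query(
    "SELECT name FROM customers WHERE age < ?", conn, params=(24,)
)
conn.close()

print(customers.shape)  # (3, 3)
```

Passing values through params keeps user input out of the SQL text, which avoids SQL injection.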

Quick save

Python

df.to_csv("cleaned_data.csv", index=False)
df.to_excel("report.xlsx", index=False)
df.to_json("data.json", orient="records")

3.3 Data Cleaning, Filtering & Transformation

Real data is messy, and Pandas provides the tools to clean it.

Basic cleaning

Python

df = df.drop_duplicates()                         # remove duplicate rows
df = df.dropna(subset=['age', 'city'])            # drop rows missing age or city
df['age'] = df['age'].fillna(df['age'].median())  # alternative to dropping: fill missing ages with the median
df['salary'] = df['salary'].astype(float)         # change data type
df['date'] = pd.to_datetime(df['date'])           # convert to datetime
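The cleaning steps can be traced on a tiny frame. This is a sketch on invented data (the frame raw and its values are assumptions); here rows with a missing city are dropped while missing ages are filled, to show both strategies side by side.

```python
import numpy as np
import pandas as pd

# Hypothetical messy input: one duplicate row, one missing age, one missing city,
# salaries stored as text, dates stored as text
raw = pd.DataFrame({
    'age': [25, 25, np.nan, 31, 40],
    'city': ['Delhi', 'Delhi', 'Patna', None, 'Ranchi'],
    'salary': ['50000', '50000', '62000', '58000', '91000'],
    'date': ['2024-01-05', '2024-01-05', '2024-02-10', '2024-03-15', '2024-04-20'],
})

clean = raw.drop_duplicates()                    # rows 0 and 1 are identical: 4 rows remain
clean = clean.dropna(subset=['city']).copy()     # drop the row with no city: 3 rows remain
clean['age'] = clean['age'].fillna(clean['age'].median())  # median of 25 and 40 is 32.5
clean['salary'] = clean['salary'].astype(float)  # text to float64
clean['date'] = pd.to_datetime(clean['date'])    # text to datetime

print(len(clean))             # 3
print(clean['age'].tolist())  # [25.0, 32.5, 40.0]
```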

Filtering

Python

high_salary = df[df['salary'] > 80000]
young_delhi = df[(df['age'] < 30) & (df['city'] == 'Delhi')]
top_10 = df.nlargest(10, 'marks')
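Each filter above is a boolean Series (a mask) with one True/False value per row. Masks are combined with & (and), | (or), and ~ (not), and each comparison needs its own parentheses when written inline. A short sketch on invented data:

```python
import pandas as pd

# Hypothetical sample data
people = pd.DataFrame({
    'name': ['Anshuman', 'Priya', 'Rahul', 'Sneha'],
    'age': [25, 23, 34, 22],
    'city': ['Ranchi', 'Delhi', 'Delhi', 'Kolkata'],
    'marks': [92, 88, 85, 90],
})

is_young = people['age'] < 30          # True, True, False, True
in_delhi = people['city'] == 'Delhi'   # False, True, True, False

young_delhi = people[is_young & in_delhi]     # both conditions hold
young_or_delhi = people[is_young | in_delhi]  # at least one condition holds
not_delhi = people[~in_delhi]                 # condition negated
top_2 = people.nlargest(2, 'marks')           # two highest marks

print(young_delhi['name'].tolist())  # ['Priya']
print(top_2['name'].tolist())        # ['Anshuman', 'Sneha']
```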

Transformations

Python

import numpy as np  # needed for np.inf below

df['tax'] = df['salary'] * 0.18  # new column
df['full_name'] = df['first'] + " " + df['last']
df['salary_category'] = pd.cut(df['salary'],
                               bins=[0, 50000, 100000, np.inf],
                               labels=['Low', 'Medium', 'High'])
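It helps to check how pd.cut treats values that sit exactly on a bin edge. By default each bin includes its right edge and excludes its left edge, so 50000 lands in 'Low' and 100000 in 'Medium'. A sketch with invented salaries:

```python
import numpy as np
import pandas as pd

# Hypothetical salaries, including values exactly on the bin edges
pay = pd.DataFrame({'salary': [30000, 50000, 75000, 100000, 250000]})

pay['tax'] = pay['salary'] * 0.18
pay['salary_category'] = pd.cut(
    pay['salary'],
    bins=[0, 50000, 100000, np.inf],   # (0, 50000], (50000, 100000], (100000, inf]
    labels=['Low', 'Medium', 'High'],
)

print(pay['salary_category'].astype(str).tolist())
# ['Low', 'Low', 'Medium', 'Medium', 'High']
```

With these bins a salary of exactly 0 falls outside the first interval and becomes NaN; passing include_lowest=True to pd.cut includes the lowest edge.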

3.4 Grouping, Aggregation & Pivot Tables

Groupby – split rows into groups and aggregate each one

Python

# Group by city and calculate mean salary
df.groupby('city')['salary'].mean()

# Multiple aggregations
df.groupby('city').agg({
    'salary': ['mean', 'max', 'count'],
    'age': 'median'
})
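The dictionary form of agg returns two-level column names such as ('salary', 'mean'). If flat, readable column names are preferred, named aggregation is an alternative spelling of the same calculation. A sketch on invented data (the frame staff and the output column names are assumptions):

```python
import pandas as pd

# Hypothetical staff data
staff = pd.DataFrame({
    'city': ['Delhi', 'Delhi', 'Patna', 'Patna', 'Patna'],
    'salary': [60000, 80000, 40000, 50000, 90000],
    'age': [25, 35, 22, 30, 41],
})

# Named aggregation: output_column=(input_column, function)
summary = staff.groupby('city').agg(
    mean_salary=('salary', 'mean'),
    max_salary=('salary', 'max'),
    headcount=('salary', 'count'),
    median_age=('age', 'median'),
)

print(summary)
print(summary.loc['Delhi', 'mean_salary'])  # 70000.0
```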

Pivot Tables

Python

pd.pivot_table(df, values='salary', index='city',
               columns='department', aggfunc='mean', fill_value=0)

Crosstab

Python

pd.crosstab(df['city'], df['gender'], margins=True)
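Both reshaping calls can be checked on a small frame: pivot_table aggregates a value column into a city-by-department grid, while crosstab counts how often each combination occurs. The data below is invented for illustration.

```python
import pandas as pd

# Hypothetical staff data; note there is no HR employee in Patna
staff = pd.DataFrame({
    'city': ['Delhi', 'Delhi', 'Patna', 'Patna'],
    'department': ['IT', 'HR', 'IT', 'IT'],
    'gender': ['F', 'M', 'F', 'M'],
    'salary': [80000, 60000, 50000, 70000],
})

# Mean salary per city and department; empty combinations become 0
pivot = pd.pivot_table(staff, values='salary', index='city',
                       columns='department', aggfunc='mean', fill_value=0)

# Row counts per city and gender; margins=True adds 'All' totals
counts = pd.crosstab(staff['city'], staff['gender'], margins=True)

print(pivot.loc['Patna', 'IT'])   # 60000.0 (mean of 50000 and 70000)
print(pivot.loc['Patna', 'HR'])   # 0 (filled, no such rows)
print(counts.loc['All', 'All'])   # 4
```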

3.5 Handling Missing Values & Outliers

Missing values

Python

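# Check missing
df.isnull().sum()

# Fill missing (assign the result back: fillna(..., inplace=True) on a selected
# column is chained assignment, which recent pandas versions warn about and
# which does not update df under Copy-on-Write)
df['age'] = df['age'].fillna(df['age'].median())
df['city'] = df['city'].fillna('Unknown')

# Drop missing
df = df.dropna(subset=['salary'])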

Detect & handle outliers

Python

# Using IQR method
Q1 = df['salary'].quantile(0.25)
Q3 = df['salary'].quantile(0.75)
IQR = Q3 - Q1
lower = Q1 - 1.5 * IQR
upper = Q3 + 1.5 * IQR

# Remove outliers
df_clean = df[(df['salary'] >= lower) & (df['salary'] <= upper)]

# Or cap them
df['salary'] = df['salary'].clip(lower=lower, upper=upper)
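Working the IQR rule through with concrete numbers makes the fences easier to trust. The six salaries below are invented; with pandas' default linear interpolation, Q1 is 46250 and Q3 is 58750, so the fences land at 27500 and 77500.

```python
import pandas as pd

# Hypothetical salaries with one obvious outlier
salaries = pd.Series([40000, 45000, 50000, 55000, 60000, 300000], name='salary')

q1 = salaries.quantile(0.25)   # 46250.0
q3 = salaries.quantile(0.75)   # 58750.0
iqr = q3 - q1                  # 12500.0
lower = q1 - 1.5 * iqr         # 27500.0
upper = q3 + 1.5 * iqr         # 77500.0

is_outlier = (salaries < lower) | (salaries > upper)

trimmed = salaries[~is_outlier]                   # option 1: drop outliers
capped = salaries.clip(lower=lower, upper=upper)  # option 2: cap at the fences

print(lower, upper)   # 27500.0 77500.0
print(len(trimmed))   # 5
print(capped.max())   # 77500.0
```

Dropping changes the row count, while capping keeps every row but compresses extreme values; which one is appropriate depends on whether the outliers are errors or genuine observations.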

Mini Summary Project – Quick EDA on Sample Dataset

Python

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load sample (or your own CSV)
df = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv")

# Quick look
print(df.head())
df.info()  # prints directly; wrapping it in print() would also print None
print(df.describe())

# Missing values
print(df.isnull().sum())

# Group analysis
print(df.groupby('day')['total_bill'].mean())

# Visualization
sns.boxplot(x='day', y='total_bill', data=df)
plt.title("Total Bill by Day")
plt.show()

This completes the Pandas – Data Manipulation & Analysis section, which covers the core toolkit for day-to-day data work in Python.
