Roll No: ____________

Sonopant Dandekar Shikshan Mandali's
Sonopant Dandekar Arts, V. S. Apte Commerce, M. H. Mehta Science College
DEPARTMENT OF COMPUTER SCIENCE

CERTIFICATE

Certified that Mr./Miss. _________________________________________ of ___________________________ has satisfactorily completed the course of necessary experiments in ____________________________________ under my supervision for S.Y.B.Sc. Computer Science in the year 2025-2026.

Head of Department                         Subject Teacher

Date: ___ / ___ / 2026

INDEX

MODULE - I

Sr. No | Aim of Practical | Practical Date | Submission Date | Remarks

1. Using Pandas, load a dataset and perform column-wise operations such as adding a new calculated column, renaming columns, changing data types and dropping unnecessary columns.
2. Load a CSV dataset into a pandas DataFrame and perform basic data inspection such as displaying the first few rows, checking data types, handling missing values and removing duplicate rows.
3. Using NumPy, generate a random dataset of 1000 values. Calculate basic statistical measures such as mean, median, variance, standard deviation, minimum and maximum using NumPy functions.
4. Write a Pandas program to group a dataset by one or more categorical columns and calculate summary statistics such as count, mean and standard deviation for each group.
5. Load a dataset into pandas and filter rows based on complex conditions using .loc and .query(). For example, filter rows where sales are above a threshold and region equals "North".
6. Using NumPy, create two matrices (3x3) filled with random integers. Perform matrix addition, subtraction, multiplication, element-wise division, and calculate the determinant and inverse.
7. Write a pandas program to merge DataFrames using different types of joins (inner, outer, left, right). Use sample data representing customer details and order details.
8. Load a dataset and perform time series analysis using Pandas datetime features. Extract year, month, day and weekday from a date column, and group data by month to calculate monthly sales.
9. Using NumPy, generate a 1D array of 100 random integers between 1 and 1000. Use Boolean indexing to filter all values greater than 500 and less than 800, and calculate the mean of the filtered values.
10. Write a pandas program to pivot a DataFrame to create a pivot table summarizing the data. Use sample sales data to show product-wise sales for each region.

MODULE - II

Sr. No | Aim of Practical | Practical Date | Submission Date | Remarks

11. Load a dataset into Pandas and use Seaborn to plot a histogram showing the distribution of a numerical column. Customize the bin size and color, and add a title.
12. Using Seaborn, plot a boxplot for a numerical column grouped by a categorical column (e.g., salary distribution across different departments) from a given dataset.
13. Load a dataset into Pandas and create a pairplot using Seaborn to visualize pairwise relationships between all numerical columns. Add hue to differentiate categories.
14. Create a Seaborn heatmap using a correlation matrix generated from a DataFrame. Customize the color palette, annotations and title.
15. Using Seaborn, create a barplot comparing the average values of a numerical column for different categories of a categorical column. Customize axes, labels and titles.
16. Load a dataset and create a Seaborn scatter plot between two numerical columns. Add hue to differentiate categories, and customize markers and plot size.
17. Using Seaborn, create a lineplot to visualize trends in a time series dataset. Customize the plot with appropriate labels, grid lines and a title.
18. Load a dataset with multiple numerical columns and create a Seaborn violin plot to show the distribution of values for each column grouped by a categorical column.
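Before starting the practicals, it helps to confirm the required libraries are installed. A minimal check (the printed versions will of course vary by machine):

```python
# Quick environment check before starting the practicals: print the
# versions of the libraries used in Modules I and II.
import numpy as np
import pandas as pd

print("NumPy version:", np.__version__)
print("pandas version:", pd.__version__)

# Seaborn is only needed for Module II; report if it is missing.
try:
    import seaborn as sns
    print("Seaborn version:", sns.__version__)
except ImportError:
    print("Seaborn not installed (required for Module II)")
```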
Date: __________

Practical No. 1

Aim: Using Pandas, load a dataset and perform column-wise operations such as adding a new calculated column, renaming columns, changing data types and dropping unnecessary columns.

INPUT:

import pandas as pd

df = pd.read_csv("C:/Users/student/Desktop/65015/Python/Data1.csv")
print(df)

# Add a new calculated column
df["Total_Salary"] = df["Salary"] + df["Bonus"]
print(df)

# Rename columns
df = df.rename(columns={
    "ID": "Emp_ID",
    "Name": "Emp_Name"
})
print(df)

# Change data types
df["Emp_ID"] = df["Emp_ID"].astype(int)

# Drop an unnecessary column
df = df.drop(columns=["Department"])
print(df.info())
print(df.head())

OUTPUT:

Date: __________

Practical No. 2

Aim: Load a CSV dataset into a pandas DataFrame and perform basic data inspection such as displaying the first few rows, checking data types, handling missing values and removing duplicate rows.

INPUT:

import pandas as pd

df = pd.read_csv("C:/Users/student/Desktop/65015/Python/Data1.csv")
print(df)

# Show first few rows
print("\nFirst 5 rows:")
print(df.head())

# Show DataFrame shape
print("\nShape:", df.shape)

# Check data types
print("\nData Types:")
print(df.dtypes)

# Summary statistics
print("\nSummary Statistics:")
print(df.describe())

# Check missing values
print("\nMissing Values:")
print(df.isnull().sum())

# Drop rows with missing values
df_clean = df.dropna()

# Remove duplicate rows
df_clean = df_clean.drop_duplicates()

# Show final shape
print("\nShape after Cleaning:", df_clean.shape)

# Show cleaned data
print("\nFirst 5 rows of cleaned data:")
print(df_clean.head())

OUTPUT:

Date: __________

Practical No. 3

Aim: Using NumPy, generate a random dataset of 1000 values. Calculate basic statistical measures such as mean, median, variance, standard deviation, minimum and maximum using NumPy functions.
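One caveat before the program below: `np.random.randn` produces different values on every run, so the printed statistics change each time. A minimal sketch of making the dataset reproducible with NumPy's `default_rng` generator (the seed value 42 is an arbitrary choice, not part of the practical):

```python
import numpy as np

# Seeding a Generator makes the "random" dataset identical between runs.
rng = np.random.default_rng(42)
data = rng.standard_normal(1000)

# The same kind of statistics as the program below, now reproducible
print("Mean:", np.mean(data))
print("Std Dev:", np.std(data))
```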
INPUT:

import numpy as np

random_dataset = np.random.randn(1000)
print(random_dataset)

# Calculate statistical measures
mean_value = np.mean(random_dataset)
median_value = np.median(random_dataset)
var_value = np.var(random_dataset)
std_dev_value = np.std(random_dataset)
min_value = np.min(random_dataset)
max_value = np.max(random_dataset)

# Print results
print("Mean value:", mean_value)
print("Median value:", median_value)
print("Variance value:", var_value)
print("Standard Deviation:", std_dev_value)
print("Minimum value:", min_value)
print("Maximum value:", max_value)

OUTPUT:

Date: __________

Practical No. 4

Aim: Write a Pandas program to group a dataset by one or more categorical columns and calculate summary statistics such as count, mean and standard deviation for each group.

INPUT:

import pandas as pd

data = [
    [23, 67, 34, 67, 75],
    [83, 56, 53, 37, 62],
    [98, 41, 79, 87, 65]
]
df = pd.DataFrame(data, columns=['A', 'B', 'C', 'D', 'E'])
print(df)
print("Summary Statistics")
print(df.describe())

OUTPUT:

INPUT:

import pandas as pd
import numpy as np

np.random.seed(0)
df = pd.DataFrame({
    "Category": np.random.choice(["A", "B", "C"], size=20),
    "Subgroup": np.random.choice(["X", "Y"], size=20),
    "Value": np.random.randn(20) * 10 + 50
})
print("Original DataFrame:")
print(df)

# Group by one or more categorical columns and calculate summary statistics
grouped = df.groupby(["Category", "Subgroup"]).agg(
    count=("Value", "count"),
    mean=("Value", "mean"),
    std=("Value", "std")
)
print("\nGrouped Summary Statistics:")
print(grouped)

OUTPUT:

Date: __________

Practical No. 5

Aim: Load a dataset into pandas and filter rows based on complex conditions using .loc and .query(). For example, filter rows where sales are above a threshold and region equals "North".
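Alongside `.loc` and `.query()` used in the program below, two other filtering idioms often come up: membership tests with `Series.isin` and range tests with `Series.between`. A short sketch (the data here is made up; only the column names sales/region mirror the practical):

```python
import pandas as pd

df = pd.DataFrame({
    "sales": [5000, 15000, 20000, 8000],
    "region": ["North", "South", "North", "East"],
})

# Membership test: rows whose region is in a given set
print(df[df["region"].isin(["North", "East"])])

# Range test: sales between two thresholds (inclusive on both ends)
print(df[df["sales"].between(10000, 20000)])
```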
INPUT:

import pandas as pd

# Create a simple dataset
data = {
    "sales": [5000, 15000, 20000],
    "region": ["North", "South", "North"],
}
df = pd.DataFrame(data)
print("Original Data:")
print(df)

# Filter using .loc
filtered_loc = df.loc[
    (df["sales"] > 10000) & (df["region"] == "North")
]
print("\nFiltered using .loc:")
print(filtered_loc)

# Filter using .query (the same condition expressed as a string)
filtered_query = df.query(
    "sales > 10000 and region == 'North'"
)
print("\nFiltered using .query:")
print(filtered_query)

OUTPUT:

Date: __________

Practical No. 6

Aim: Using NumPy, create two matrices (3x3) filled with random integers. Perform matrix addition, subtraction, multiplication, element-wise division, and calculate the determinant and inverse.

INPUT:

import numpy as np

A = np.random.randint(1, 10, size=(3, 3))
print("Matrix A:")
print(A)
B = np.random.randint(1, 10, size=(3, 3))
print("Matrix B:")
print(B)

# Addition
print("Addition of Matrix A and B:")
print(A + B)

# Subtraction
print("Subtraction of Matrix A and B:")
print(A - B)

# Matrix multiplication (note: A * B would be element-wise)
print("Matrix multiplication of A and B:")
print(A @ B)

# Element-wise division
print("Element-wise division of Matrix A and B:")
print(A / B)

# Determinant and inverse
# (np.linalg.inv raises LinAlgError if a matrix happens to be singular)
det_A = np.linalg.det(A)
print("Determinant of matrix A:", det_A)
det_B = np.linalg.det(B)
print("Determinant of matrix B:", det_B)
inv_A = np.linalg.inv(A)
print("Inverse of matrix A:", inv_A)
inv_B = np.linalg.inv(B)
print("Inverse of matrix B:", inv_B)

OUTPUT:

Date: __________

Practical No. 7

Aim: Write a pandas program to merge DataFrames using different types of joins (inner, outer, left, right). Use sample data representing customer details and order details.
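A useful companion to the joins in the program below is `merge`'s `indicator` flag, which adds a column recording where each row came from. A minimal sketch on a shortened version of the sample data:

```python
import pandas as pd

customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "customer_name": ["ABC", "PQR", "LMN"],
})
orders = pd.DataFrame({
    "customer_id": [2, 5],
    "order_amount": [150, 400],
})

# indicator=True adds a _merge column: "left_only", "right_only" or "both"
merged = pd.merge(customers, orders, on="customer_id",
                  how="outer", indicator=True)
print(merged)
```

Reading the `_merge` column makes it immediately visible which customers matched an order and which rows exist on only one side of the join.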
INPUT:

import pandas as pd

customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "customer_name": ["ABC", "PQR", "LMN", "XYZ"],
    "city": ["New York", "London", "Paris", "Tokyo"]
})
orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104],
    "customer_id": [1, 2, 2, 5],
    "order_amount": [250, 150, 300, 400]
})
print("Customers DataFrame:")
print(customers)
print("Orders DataFrame:")
print(orders)

# Inner Join
Inner_Join = pd.merge(customers, orders, on="customer_id", how="inner")
print("Inner Join:")
print(Inner_Join)

# Outer Join
Outer_Join = pd.merge(customers, orders, on="customer_id", how="outer")
print("Outer Join:")
print(Outer_Join)

# Left Join
Left_Join = pd.merge(customers, orders, on="customer_id", how="left")
print("Left Join:")
print(Left_Join)

# Right Join
Right_Join = pd.merge(customers, orders, on="customer_id", how="right")
print("Right Join:")
print(Right_Join)

OUTPUT:

INPUT:

import pandas as pd

employee = pd.DataFrame({
    "emp_id": [1, 2, 3, 4],
    "emp_name": ["Aarya", "Khanak", "Pratiksha", "Mansi"],
    "salary": [20000, 25000, 30000, 35000]
})
customer = pd.DataFrame({
    "customer_id": [101, 102, 103, 104],
    "emp_id": [2, 2, 3, 5],
    "city": ["New York", "London", "Paris", "Tokyo"]
})
print("Employee DataFrame:")
print(employee)
print("Customer DataFrame:")
print(customer)

# Inner Join
Inner_Join = pd.merge(employee, customer, on="emp_id", how="inner")
print("Inner Join:")
print(Inner_Join)

# Outer Join
Outer_Join = pd.merge(employee, customer, on="emp_id", how="outer")
print("Outer Join:")
print(Outer_Join)

# Left Join
Left_Join = pd.merge(employee, customer, on="emp_id", how="left")
print("Left Join:")
print(Left_Join)

# Right Join
Right_Join = pd.merge(employee, customer, on="emp_id", how="right")
print("Right Join:")
print(Right_Join)

OUTPUT:
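A related merge pattern not covered above is the anti-join: finding customers with no matching order. A sketch reusing the same customers/orders sample data as the first program (the indicator column makes the unmatched rows easy to isolate):

```python
import pandas as pd

customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "customer_name": ["ABC", "PQR", "LMN", "XYZ"],
})
orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104],
    "customer_id": [1, 2, 2, 5],
    "order_amount": [250, 150, 300, 400],
})

# Left join with indicator, then keep only the rows that matched nothing
merged = pd.merge(customers, orders, on="customer_id",
                  how="left", indicator=True)
no_orders = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
print(no_orders)
```

With this data, customers 3 (LMN) and 4 (XYZ) have no orders, so only their rows survive the filter.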