Business Benefits
Use Pandas to manipulate data and create data visualizations.
Load data from CSV files to the Pandas dataframe using the pandas.read_csv()
function.
For example, if you have a CSV file with headers of name
, physics
, chemistry
and algebra
, and names and exam scores filled in as rows, you can call the CSV file using the pandas library and output the sheet using pandas:
import pandas as pd
import plotly.offline as pyo
import plotly.graph_objs as go
df = pd.read_csv(r’ /
name physics chemistry algebra
0 Somu 68 84 78
1 Kiku 74 56 88
2 Amol 77 73 82
3 Lini 78 69 87
## Read and output data from a CSV file using the `read_csv()` function with a delimiter argument.
For example, if you have a CSV file with headers `name`, `physics`, `chemistry` and `algebra`, student names and scores as rows, and each piece of data separated by a space, you can access its data using:
import pandas as pd
#load dataframe from csv
df = pd.read_csv('data.csv', delimiter=' ')
#print dataframe
print(df)
name physics chemistry algebra
0 Somu 68 84 78
1 Kiku 74 56 88
2 Amol 77 73 82
3 Lini 78 69 87
## Use the `header=None` method to output data when the CSV file you are using does not have column header.
`header=None` instructs the `pandas.read_csv()` function to ignore the first row in the CSV file - where the header would usually be - and print all the rows as data. For example, this code allows the data in `data.csv` to be printed, despite the file not containing a header:
import pandas as pd
# using header argument
df = pd.read_csv('data.csv', header=None)
## Output only the data from certain columns in a CSV file using the `names=` method.
For example, if the `data.csv` file has columns `first_name`, `last_name`, and `email_address`, and you need to use the `email_address` column data for processing, use this code to output the `email_address` column data into the pandas dataframe:
## Use the DataFrame library command `pd.read_excel` to access the contents of an Excel file and display the data as an array.
For example, if you want to access the contents of the file `tmp.xlsx`, which contains columns ***Name*** and ***Value***, use the `pd.read_excel` function to display the data as an array:
import pandas as pd
pd.read_excel('tmp.xlsx', index_col=0)
Output:
Name Value
0 string1 1
1 string2 2
## Access specific sheets from multiple sheets within an Excel file using the syntax `pd.read_excel('[filename]', sheet="[sheet]")`.
For example, if you want to output only the data from a specific sheet `cxl` within an Excel file named `tmp.xlsx`, use this code to read it:
import pandas as pd
pd.read_excel(‘tmp.xlsx’, sheet=“cxl”)
## Write data into an Excel file using the `to_excel` function.
For example, to create an Excel file named `cxl-lesson.xlsx` with two sheets named `Sheet1` and `Sheet2`, then reopen the file to add another sheet named `Sheet3`:
>>> with ExcelWriter('path_to_file.xlsx') as writer:
... df.to_excel(writer)
To write to separate sheets in a single file:
>>> with ExcelWriter('cxl-lesson.xlsx') as writer:
... df1.to_excel(writer, sheet_name='Sheet1')
... df2.to_excel(writer, sheet_name='Sheet2')
You can set the date format or datetime format:
>>> with ExcelWriter('cxl-lesson.xlsx',
... date_format='YYYY-MM-DD',
... datetime_format='YYYY-MM-DD HH:MM:SS') as writer:
... df.to_excel(writer)
You can also append to an existing Excel file:
>>> with ExcelWriter('cxl-lesson.xlsx', mode='a') as writer:
... df.to_excel(writer, sheet_name='Sheet3')