Pandas is a popular data analysis tool.

# Install and import Pandas

```shell
pip install pandas
```

```python
import numpy as np
import pandas as pd
```

# Pandas Data Structures

The core value of Pandas comes from the data structures it provides, primarily:

1. Series (labeled, homogeneously typed, one-dimensional arrays)
2. DataFrames (labeled, potentially heterogeneously typed, two-dimensional arrays)
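The difference between the two structures can be sketched with a small example. The variable names and values here (`prices`, `items`) are made up for illustration.

```python
import pandas as pd

# A Series is one labeled column of values that share a single dtype.
prices = pd.Series([1.5, 2.0, 3.25], index=["a", "b", "c"])

# A DataFrame is a table of labeled columns; each column has its own dtype.
items = pd.DataFrame({"name": ["x", "y", "z"], "qty": [1, 2, 3]})

print(prices.ndim, prices.dtype)  # one dimension, one dtype for all values
print(items.ndim)                 # two dimensions
print(items.dtypes)               # one dtype per column
```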

## Pandas Series

### Create Series

Create empty Series

```python
s = pd.Series(dtype='float64')
```

Create Series from dictionary

```python
d = {'a': 1, 'b': 2, 'c': 3}
s = pd.Series(d)
```

Create Series from NumPy array

```python
a = np.array([1, 2, 3, 4])
s = pd.Series(a, copy=False, dtype=float)
```

Create Series from NumPy array with a defined index

```python
data = np.array(['a', 'b', 'c', 'd'])
s = pd.Series(data, index=[10, 11, 12, 13])
```

Create Series from a scalar

```python
s = pd.Series(5, index=[0, 1, 2, 3])
```

### Select Series data

Retrieve first element

```python
s.iloc[0]  # positional access; works whatever the index labels are
```

Retrieve first 3 elements

```python
s[:3]
```

Retrieve last 3 elements

```python
s[-3:]
```

Retrieve data via index label

```python
s['a']
```

Retrieve multiple elements via index

```python
s[['a', 'c', 'd']]
```
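The selections above can also be written with the explicit `.iloc` (position-based) and `.loc` (label-based) accessors, which avoids ambiguity when the index itself contains integers. This is a minimal sketch; the Series contents are assumed for illustration.

```python
import pandas as pd

# Illustrative Series with string labels.
s = pd.Series({"a": 1, "b": 2, "c": 3, "d": 4})

first = s.iloc[0]                 # first element, by position
head3 = s.iloc[:3]                # first three elements, by position
tail3 = s.iloc[-3:]               # last three elements, by position
by_label = s.loc["a"]             # one element, by label
several = s.loc[["a", "c", "d"]]  # several elements, by label

print(first, by_label)
print(head3.tolist(), tail3.tolist(), several.tolist())
```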

### Series Functions

Return Series as an array

```python
s.values
```

Return the shape and size of the Series

```python
s.shape
s.size
```

Cast Series as another data type

```python
s.astype('int32')
```

Count non-null values in Series

```python
s.count()
```

Cumulative sum

```python
s.cumsum()
```

Drop missing values

```python
s.dropna()
```
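To see how these functions treat missing data, here is a small sketch on a Series that contains one `NaN`. The values are assumed for illustration. Note that casting to an integer type fails while `NaN` is present, so the example drops missing values first.

```python
import numpy as np
import pandas as pd

# Illustrative Series with one missing value.
s = pd.Series([1.0, np.nan, 3.0, 4.0])

n_total = s.size        # counts every element, including NaN
n_valid = s.count()     # counts only non-null elements
running = s.cumsum()    # NaN is skipped in the sum but stays NaN in its slot
cleaned = s.dropna()    # new Series without the missing value
as_int = cleaned.astype("int32")  # safe to cast once NaN is gone

print(n_total, n_valid)
print(running.tolist())
print(as_int.tolist())
```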

## Pandas DataFrame

A DataFrame is a two-dimensional structure, where data is aligned in a tabular fashion in rows and columns.

The columns of a DataFrame are potentially heterogeneously typed, and its size is mutable. The axes are labeled, which allows arithmetic operations to be performed on the rows and columns.
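These properties can be illustrated with a short sketch. The column names, row labels, and values (`name`, `score`, `passed`, `bonus`) are hypothetical, and the pass threshold of 75 is arbitrary.

```python
import pandas as pd

# Columns of different types, with explicit row labels.
df = pd.DataFrame(
    {"name": ["ann", "bob"], "score": [90.5, 72.0]},
    index=["r1", "r2"],
)

# Size is mutable: adding a column changes the shape from (2, 2) to (2, 3).
df["passed"] = df["score"] >= 75

# Arithmetic aligns on labels, not on position: bonus lists r2 before r1.
bonus = pd.Series({"r2": 5.0, "r1": 1.0})
adjusted = df["score"] + bonus

print(df.shape)
print(adjusted)
```

Because the addition aligns on the row labels, `r1` receives the bonus stored under `r1` even though it is listed second in `bonus`.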

### Create DataFrame

Create empty DataFrame

```python
df = pd.DataFrame()
```