Python Cheat Sheet

This is my first time learning Python. I mostly use R in my projects and, as far as I know, whatever can be done in Python can also be done in R. But as an aspiring data scientist, I shouldn’t stick with R all the time. I enrolled in this course by Microsoft because it is centered on data science, though it suits anyone interested in Python, not only data scientists. Join me as I write down my takeaways here.

Version 3.x – https://www.python.org/downloads/

Python Script – Text Files .py

Basics
print(3 + 4) # add
print(4 - 3) # subtract
print(4 * 3) # multiply
print(4 / 2) # divide
print(4 ** 2) # exponent, 4²
print(4 % 2) # modulo

Variables
height = 1.79
weight = 68.7
bmi = weight / height ** 2
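
A minimal sketch of the BMI line above, mainly to show that ** is evaluated before /, so the height is squared first and then divided into the weight. The helper name bmi_from is hypothetical and not part of the course material.

```python
# Hypothetical helper (made up for illustration): wraps the BMI formula above.
def bmi_from(weight_kg, height_m):
    # ** binds tighter than /, so this is weight_kg / (height_m ** 2)
    return weight_kg / height_m ** 2

height = 1.79
weight = 68.7
bmi = bmi_from(weight, height)

# Writing the parentheses out gives the same value
same = weight / (height ** 2)
print(bmi == same)    # True
print(round(bmi, 1))  # 21.4
```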

Types
type(bmi) # float
type(5) # int
type("body mass index") # str
type('this works too') # str
type(True) # bool
print(2 + 3) # 5
print('ab' + 'cd') # 'abcd'
"I said " + ("Hey " * 2) + "Hey!" # 'I said Hey Hey Hey!'
str(5) # convert 5 to the string "5"
int(True) # convert True to 1
bool("True") # True, but only because the string is non-empty; bool("False") is also True
float(1) # convert 1 to 1.0
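
The conversions above have a couple of gotchas worth seeing side by side. This is a small sketch, not from the course; the variable name label is only for illustration.

```python
# bool() on a string checks for emptiness, not for the word it spells
print(bool("True"))   # True
print(bool("False"))  # True
print(bool(""))       # False

# int() parses digit strings and truncates floats toward zero
print(int("5"))       # 5
print(int(3.9))       # 3

# "height: " + 1.79 would raise TypeError; convert the number first
label = "height: " + str(1.79)
print(label)          # height: 1.79
```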

Lists
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89, ["a", "b"], ["c", "d"]] # can contain different types, even other lists
type(fam) # list
fam[3] # 1.68, zero-based indexing
fam[-1] # ["c", "d"], negative indexes count from the end
fam[-3] # 1.89
fam[3:5] # [1.68, "mom"], [start:end] with start inclusive and end exclusive
fam[:4] # indexes 0 to 3: ["liz", 1.73, "emma", 1.68]
fam[5:] # index 5 to the last: [1.71, "dad", 1.89, ["a", "b"], ["c", "d"]]
fam[0:2] = ["lisa", 1.74] # fam = ["lisa", 1.74, "emma", 1.68, "mom", 1.71, "dad", 1.89, ["a", "b"], ["c", "d"]]
fam + ["me", 1.79] # returns a new list ["lisa", 1.74, "emma", 1.68, "mom", 1.71, "dad", 1.89, ["a", "b"], ["c", "d"], "me", 1.79]; fam itself is unchanged
del(fam[2]) # removes "emma": fam = ["lisa", 1.74, 1.68, "mom", 1.71, "dad", 1.89, ["a", "b"], ["c", "d"]]
x = ["a", "b", "c"]
y = x
y[1] = "z" # x[1] is also "z" because y = x copies the reference to the list, not the values themselves
y = list(x) # makes a real copy; y = x[:] (a slice of all elements) does the same
fam.index("mom") # finds "mom" and returns its index: 3 (after "emma" was deleted above)
fam.count(1.74) # counts the number of times 1.74 occurs in the list; returns 1
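
The reference-versus-copy behaviour above is easy to check for yourself. A small sketch, with the names alias, clone, nested and shallow made up for illustration:

```python
x = ["a", "b", "c"]

alias = x            # a second name for the same list object
alias[1] = "z"
print(x)             # ['a', 'z', 'c']
print(alias is x)    # True

clone = list(x)      # a new list with the same elements (x[:] works too)
clone[0] = "q"
print(x)             # ['a', 'z', 'c']
print(clone)         # ['q', 'z', 'c']

# list() and [:] copy only the outer list; nested lists are still shared
nested = [[1, 2], [3, 4]]
shallow = list(nested)
shallow[0][0] = 99
print(nested)        # [[99, 2], [3, 4]]
```

If the nested lists need to be independent as well, copy.deepcopy from the standard library is the usual tool.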

Functions
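max([1.73, 1.68, 1.71, 1.89]) # maximum value in the list: 1.89; max(fam) would raise a TypeError because fam mixes strings and numbers
round(1.68, 1) # round 1.68 to 1 decimal place: 1.7
round(1.68) # round to the nearest whole number: 2
help(round) # prints the documentation of the round function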
len(fam) # length of list

Methods
Methods are functions that belong to an object. You call them on the object itself with dot notation, and which methods are available depends on the object's type.
sister = 'liz'
sister.capitalize() # 'Liz'
sister.replace("z", "sa") # 'lisa'
sister.index("z") # 2
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
fam.index("mom") # 4
fam.append("me") # fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89, "me"]; fam is updated in place, no reassignment needed
sister.upper() # 'LIZ'
sister.count("i") # 1
fam.reverse() # fam = ["me", 1.89, "dad", 1.71, "mom", 1.68, "emma", 1.73, "liz"]; fam is updated in place, no reassignment needed
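
The methods above split into two groups: string methods hand back a new value, while list methods such as append and reverse change the list itself and return None. A minimal sketch, with the names upper and result made up for illustration:

```python
sister = "liz"
upper = sister.upper()       # strings are immutable, so a new string comes back
print(sister)                # liz
print(upper)                 # LIZ

fam = ["liz", 1.73, "emma", 1.68]
result = fam.append("me")    # lists are mutable, so append changes fam itself
print(fam)                   # ['liz', 1.73, 'emma', 1.68, 'me']
print(result)                # None
```

This is why writing fam = fam.append("me") is a common mistake: it replaces the list with None.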

Numpy
NumPy (Numerical Python) works efficiently with arrays. Once installed…
import numpy as np # the np alias is a convention; with a plain "import numpy" you call functions by the full name, e.g. numpy.array
np.array([1, 2, 3])
a = [1, 2, 3]
b = [4, 5, 6]
np_a = np.array(a)
np_b = np.array(b)
np_a / np_b ** 2 # operations are performed element-wise
np.array([1.0, "is", True]) # every element becomes a string because a NumPy array holds only one type
python_list = [1, 2, 3]
python_list + python_list # [1, 2, 3, 1, 2, 3]
numpy_array = np.array([1, 2, 3])
numpy_array + numpy_array # array([2, 4, 6])
np_a[1] # 2
np_a > 1 # array([False, True, True]); use the NumPy array here, since a > 1 on the plain list raises a TypeError
np_a[np_a > 1] # array([2, 3])
np_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8]]) # 2D array
np_2d.shape # returns the dimensions of the array; (2, 4) since 2 rows and 4 columns
np_2d[0] # array([1, 2, 3, 4])
np_2d[0][1] # 2
np_2d[0, 1] # 2
np_2d[:, 1:3] # all rows, columns 1 and 2: array([[2, 3], [6, 7]])
np_2d[1, :] # array([5, 6, 7, 8])
np_2d_another = np.array([[1, 1, 1, 1], [1, 1, 1, 1]])
np_2d + np_2d_another
# array([[2, 3, 4, 5], [6, 7, 8, 9]])
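
To tie the NumPy pieces together, here is a small sketch that combines element-wise arithmetic, a boolean mask, and 2D slicing. The heights are the ones from fam; the weights are made-up values for illustration only, and .tolist() is used just to print plain Python lists.

```python
import numpy as np

heights = np.array([1.73, 1.68, 1.71, 1.89])  # heights taken from fam
weights = np.array([65.4, 59.2, 63.6, 88.4])  # hypothetical weights in kg

bmi = weights / heights ** 2        # element-wise: one BMI per person
tall = heights > 1.70               # boolean mask, one True/False per element
print(tall.tolist())                # [True, False, True, True]
print(heights[tall].tolist())       # [1.73, 1.71, 1.89]

np_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(np_2d.shape)                  # (2, 4)
print(np_2d[:, 1:3].tolist())       # [[2, 3], [6, 7]]
print(np_2d[1, :].tolist())         # [5, 6, 7, 8]
```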