Before introducing the specialized Python constructs used heavily in data science, this section reviews core Python concepts. These concepts are the building blocks for all subsequent topics. A thorough understanding of them is essential for daily work with data in Python and provides a solid foundation for advanced applications.
At its core, programming involves manipulating data stored in variables. Python is dynamically typed, meaning you don't need to declare a variable's type explicitly. However, understanding the primary data types is important for effective data handling.
Integers (int): Whole numbers, like 10, -5, 0.
Floating-point numbers (float): Numbers with a decimal point, like 3.14, -0.001, 2.7e5 (scientific notation). Be mindful of potential precision issues inherent in floating-point arithmetic.
Strings (str): Sequences of characters, enclosed in single ('...') or double ("...") quotes. Used for textual data. Operations include concatenation (+) and slicing.
Booleans (bool): Represent truth values, either True or False. Essential for conditional logic.
# Variable assignment
count = 100
temperature = 23.5
city_name = "San Francisco"
is_valid = True
# Checking types (useful for debugging)
# print(type(count))
# print(type(temperature))
# print(type(city_name))
# print(type(is_valid))
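To make the note on floating-point precision and the string operations concrete, here is a minimal sketch. The variable names are illustrative only; `math.isclose` is the standard-library helper for tolerant comparison.

```python
import math

# Floats are binary approximations, so exact equality can be surprising.
total = 0.1 + 0.2
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True: compare with a tolerance instead

# String concatenation (+) and slicing
city = "San" + " " + "Francisco"
print(city[:3])   # San
print(city[-9:])  # Francisco
```

When comparing computed floats, a tolerance-based check is usually the safer choice than `==`.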
While dynamic typing offers flexibility, explicitly indicating expected types using type hints (e.g., count: int = 100) is becoming increasingly common, especially in larger projects, as it improves code readability and allows for static analysis.
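A short sketch of what type hints look like in practice. The `describe` function is a hypothetical helper invented for this illustration, and mypy is named only as one example of a static checker.

```python
# Annotated variable assignments
count: int = 100
temperature: float = 23.5
city_name: str = "San Francisco"
is_valid: bool = True


def describe(name: str, reading: float) -> str:
    """Formats a name and a reading (hypothetical helper for illustration)."""
    return f"{name}: {reading:.1f}"


print(describe(city_name, temperature))  # San Francisco: 23.5

# Hints are not enforced at runtime: this assignment runs without error,
# although a static checker such as mypy would report a type mismatch.
count = "one hundred"
```

The hints document intent for readers and tools; Python itself still behaves dynamically.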
Python provides several built-in data structures for organizing collections of data. Choosing the right structure is often tied to performance and the specific task.
Lists (list): Ordered, mutable (changeable) sequences of items. Defined with square brackets []. Lists are versatile and commonly used for storing sequences of data points or observations.
features = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
measurements = [5.1, 3.5, 1.4, 0.2]
measurements.append(0.3) # Lists are mutable
# print(features[0]) # Access by index
# print(measurements)
Tuples (tuple): Ordered, immutable (unchangeable) sequences of items. Defined with parentheses (). Because they are immutable, tuples are often used for data that shouldn't change, like coordinates or fixed configuration settings. They can also be used as keys in dictionaries, provided every element they contain is itself hashable.
point = (10, 20)
# point[0] = 15 # This would raise a TypeError
# print(point)
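The following sketch illustrates tuple unpacking, the use of tuples as dictionary keys, and the error raised on attempted modification. The `grid_labels` mapping and its values are made up for the example.

```python
point = (10, 20)

# Unpack a tuple into separate variables
x, y = point
print(x, y)  # 10 20

# Tuples of hashable values can serve as dictionary keys
grid_labels = {(0, 0): "origin", (10, 20): "sensor A"}
print(grid_labels[point])  # sensor A

# Item assignment on a tuple raises TypeError
try:
    point[0] = 15
except TypeError as exc:
    print(f"TypeError: {exc}")
```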
Dictionaries (dict): Collections of key-value pairs. Since Python 3.7 dictionaries preserve insertion order; in earlier versions they were unordered. Defined with curly braces {}. Keys must be unique and hashable (strings, numbers, and tuples of immutable values are common keys). Dictionaries are extremely useful for mapping information, like feature names to values, or for storing configuration parameters.
sample = {'sepal_length': 5.1, 'sepal_width': 3.5, 'species': 'setosa'}
# print(sample['sepal_length']) # Access by key
sample['petal_length'] = 1.4 # Add new key-value pairs
# print(sample)
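A brief sketch of a few common dictionary operations, reusing the sample record from above. It assumes Python 3.7 or later, where key order follows insertion order; the default value passed to `get` is an arbitrary choice for the example.

```python
sample = {'sepal_length': 5.1, 'sepal_width': 3.5, 'species': 'setosa'}
sample['petal_length'] = 1.4

# Keys come back in insertion order (Python 3.7+)
print(list(sample))

# get() returns a default instead of raising KeyError for a missing key
petal_width = sample.get('petal_width', 0.0)
print(petal_width)  # 0.0

# items() yields key-value pairs together
for key, value in sample.items():
    print(f"{key}: {value}")
```

Using `get` with a default avoids a `KeyError` when a key may be absent, at the cost of hiding the fact that the value was missing.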
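Sets (set): Unordered collections of unique, hashable items. Defined with curly braces {} around the items or with the set() function. Note that empty braces {} create a dictionary, so an empty set must be written as set(). Sets are highly optimized for membership testing (in operator) and removing duplicates from sequences.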
unique_species = {'setosa', 'versicolor', 'virginica', 'setosa'}
# print(unique_species) # e.g. {'setosa', 'versicolor', 'virginica'} (duplicate dropped; display order may vary)
# print('setosa' in unique_species) # Fast membership testing
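As a small illustration of deduplication with sets, the sketch below uses a made-up `observations` list. It also shows `dict.fromkeys` as one way to remove duplicates while keeping first-seen order, since sets themselves do not guarantee any order.

```python
observations = ['setosa', 'versicolor', 'setosa', 'virginica', 'versicolor']

# Converting to a set drops duplicates; sort for a stable display order
unique_species = set(observations)
print(sorted(unique_species))  # ['setosa', 'versicolor', 'virginica']

# Empty braces create a dict, not a set
empty_set = set()
empty_dict = {}
print(type(empty_set).__name__, type(empty_dict).__name__)  # set dict

# Order-preserving deduplication via dictionary keys
deduped = list(dict.fromkeys(observations))
print(deduped)  # ['setosa', 'versicolor', 'virginica']
```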
Control flow statements direct the order in which code is executed.
Conditional statements (if, elif, else): Execute blocks of code based on whether conditions evaluate to True or False.
value = 75
if value > 90:
    grade = 'A'
elif value > 70:
    grade = 'B'
else:
    grade = 'C'
# print(f"Grade: {grade}")
Loops (for, while): Repeat blocks of code.
for loops iterate over sequences (like lists, tuples, strings, dictionaries, or generator outputs).
# Iterate over list elements
total = 0
numbers = [1, 2, 3, 4, 5]
for num in numbers:
    total += num
# print(f"Sum: {total}")
# Iterate over dictionary keys
# for key in sample:
#     print(f"{key}: {sample[key]}")
while loops continue as long as a condition remains True. Be careful to ensure the condition eventually becomes False to avoid infinite loops.
count = 0
while count < 3:
    # print(f"Count is {count}")
    count += 1
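Loops can also be cut short or made to skip individual iterations using break and continue. The sketch below is one way this might look; the `readings` data and the use of 999 as an end-of-data sentinel are assumptions made purely for illustration.

```python
readings = [12, -1, 7, 0, 9, 999, 4]
valid = []

for reading in readings:
    if reading == 999:  # hypothetical sentinel: stop the loop entirely
        break
    if reading <= 0:    # skip non-positive values, move to the next item
        continue
    valid.append(reading)

print(valid)  # [12, 7, 9]
```

The value 4 is never processed because break ends the loop as soon as the sentinel is reached.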
Loop control (break, continue): break exits the current loop entirely, while continue skips the rest of the current iteration and proceeds to the next one.
Functions are reusable blocks of code that perform a specific task. They are fundamental to writing organized, modular, and maintainable programs.
Definition (def): Use the def keyword to define a function, followed by the function name, parentheses () for parameters, and a colon :. The indented block below constitutes the function body.
Return values (return): Functions can optionally return a value using the return statement. If omitted, the function returns None.
def calculate_mean(data_list):
    """Calculates the arithmetic mean of a list of numbers."""
    if not data_list:  # Handle empty list case
        return 0.0
    return sum(data_list) / len(data_list)
# Calling the function
scores = [88, 92, 75, 98, 85]
average_score = calculate_mean(scores)
# print(f"Average score: {average_score}")
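To illustrate the point that a function without a return value yields None, here is a minimal sketch. The `report_mean` function is a hypothetical variant written for this example, not part of the original code.

```python
def report_mean(data_list):
    """Prints the mean of a list of numbers; has no return value."""
    if not data_list:
        print("No data")
        return  # a bare return also yields None
    print(f"Mean: {sum(data_list) / len(data_list)}")


result = report_mean([88, 92, 75, 98, 85])  # prints "Mean: 87.6"
print(result)  # None
```

A function that only prints cannot feed its result into later calculations, which is why returning the value, as calculate_mean does, is usually the more reusable design.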
Functions help break down complex problems into smaller, manageable pieces. They also promote code reuse, reducing redundancy and making updates easier.
This brief review covers the absolute essentials. As we move forward, we'll build upon these concepts, introducing more efficient ways to work with sequences (comprehensions, generators), techniques for writing more flexible functions (advanced arguments, decorators), and methods for structuring larger applications (OOP, context managers). A solid grasp of these fundamentals will make mastering the subsequent topics significantly smoother.