
Learn Python with Baseball (part 2)
This is part 2 of the Learn Python with Baseball beginner series. In this post, we cover the namespace, built-in objects, what libraries are, and user-defined functions. We learn about all of this cool stuff and then use it to make our code from last time slightly more Pythonic.
Recap
In this part of the beginner series, we will build upon the code from the last post, introducing more layers and moving parts. We will discuss the Python namespace and how Python keeps track of all the objects we have defined in our program. Additionally, we will learn a little bit about libraries and how we can use them to import code into our namespace (although we won't be importing anything just yet) and built-in objects (which we've actually already used). We will also cover user-defined functions and functions in general.
Built-in objects and the namespace
Built-in functions are functions (we touched briefly on this last time) that exist naturally in the Python namespace. The namespace sounds very arcane, but essentially it is a collection of already defined names for things that we can use in our code.
Namespaces
I asked ChatGPT to explain the namespace, to which it replied:
Python namespace is a collection of objects (variable, function, class, etc.) that can be accessed by a unique name. It's basically a way for Python to keep track of all the different objects that you have defined in your program, and make sure that each object has a unique name. This helps to prevent confusion and name collisions between different objects in your program.
We can import things into our namespace, as we'll see in later posts, to bring in more items (functions and classes) to use in our code. Importing external code will be useful when we start using libraries like pandas, Python's data manipulation library. A library is a bunch of code already written for you so you don't have to reinvent the wheel. Once you import a library, its code is available in your namespace and you can use it.
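To make the idea concrete, here is a minimal sketch of how a name gets into the namespace. The variable names and values are made up for illustration, and the try/except syntax is something we haven't covered yet; it is only there so the example keeps running after the error.

```python
# Assigning a value adds the name earned_runs to our namespace.
earned_runs = 34
print(earned_runs)

# strikeouts was never defined, so Python cannot find it in the namespace
# and raises a NameError.
try:
    print(strikeouts)
    found = True
except NameError as error:
    found = False
    print('Not in the namespace:', error)
```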
But certain objects already exist without needing to import anything. These are called "built-in" objects.
One such built-in object is the print function, which we used in the last post.
print('Hello World!')
Hello World!
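print is not the only one. As a rough sketch, here are a few other built-in functions (len, sum, max and round) being used without importing anything. The hit totals are made up for illustration.

```python
# Made-up hit totals for four games, for illustration only.
hits_per_game = [2, 0, 1, 3]

games_played = len(hits_per_game)   # number of items in the list
total_hits = sum(hits_per_game)     # adds the items together
best_game = max(hits_per_game)      # largest item
average_hits = round(total_hits / games_played, 2)

print(games_played, total_hits, best_game, average_hits)
# Output: 4 6 3 1.5
```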
We touched only briefly on functions last time, but essentially in Python, functions are blocks of code that are defined once and can be used repeatedly throughout a program. Functions allow for efficient and organized code by breaking down tasks into smaller, reusable chunks. They can accept arguments (inputs) and return values (outputs). Functions can also be used to create modules, which are collections of related functions that can be imported into other Python scripts (as discussed, we'll be exploring some of these modules/libraries in later posts like pandas). Functions are a key part of Python programming and are used to make code more modular, maintainable, and readable.
User-defined Functions
User-defined functions in Python are functions that are created and defined by the user, rather than being pre-defined in the language itself. Using a baseball example, a user may define a function that calculates a player's batting average. This function would take in parameters such as the player's number of hits and their number of at bats, and it would return the batting average for the player. This user-defined function could then be called multiple times throughout the program to calculate batting averages for different players.
To write a user-defined function in Python using an MLB example, you could follow these steps:
1. Define the function by using the keyword def followed by the function name and any parameters that will be passed to the function. For example, a function that calculates a player's batting average might be defined like this:
def batting_average(hits, at_bats):
2. Inside the function, write the code that will perform the desired calculation. In the example above, this would be the code that divides the player's total hits by their total at bats.
    avg = hits / at_bats
3. Return the result of the calculation by using the return keyword. In the example above, this would be the player's batting average.
    return avg
4. Call the function in your code by using the function name followed by any necessary arguments. For example, to calculate the batting average for a player who has 33 hits in 100 at bats, store the result under a new name (reusing the name batting_average would overwrite the function itself):
average = batting_average(33, 100)
5. Print the result of the function call to see the result of the calculation. In the example above, this would be the player's batting average:
print(average) # Output: 0.33
Overall, a user-defined function in Python using an MLB example might look like this:
def batting_average(hits, at_bats):
    avg = hits / at_bats
    return avg

# Store the result under a different name so the function is not overwritten
average = batting_average(33, 100)
print(average)
0.33
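Because the function is defined once, it can be reused for as many players as we like. Here is a small sketch of that idea; the two stat lines are made up for illustration and don't belong to any real player.

```python
def batting_average(hits, at_bats):
    avg = hits / at_bats
    return avg

# Made-up stat lines, for illustration only.
first_player_avg = batting_average(33, 100)
second_player_avg = batting_average(45, 150)

print(round(first_player_avg, 3))   # Output: 0.33
print(round(second_player_avg, 3))  # Output: 0.3
```

One thing to keep in mind: calling this function with 0 at bats raises a ZeroDivisionError, since Python can't divide by zero.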
Now let's take all of this new information and use it to make our example from the last post even better. How can we modularize our code in a way that makes it better? The answer is that we can write a function that takes in a player's name, number of earned runs and innings pitched and outputs a human-readable string that tells us that player's earned run average.
def calculate_era(player_name, earned_runs, innings_pitched):
    # ERA = earned runs allowed per nine innings pitched
    earned_run_average = earned_runs / innings_pitched * 9
    earned_run_average_rounded = round(earned_run_average, 2)
    # print puts a space between its arguments, so no trailing space is needed
    print(player_name, 'had an earned run average of', earned_run_average_rounded)

calculate_era('Justin Verlander', earned_runs=34, innings_pitched=175)
Justin Verlander had an earned run average of 1.75
This is our simple Python function that calculates the earned run average (ERA) of a player. The function takes in three arguments: player_name, earned_runs, and innings_pitched.
The function first calculates the earned run average by dividing the number of earned runs by the number of innings pitched. Next it multiplies the value by 9 so that the value represents the ERA for that player for a full nine-inning game. It then rounds the earned run average to two decimal places and prints out a string that includes the player's name and their earned run average.
In the example provided, the function is called with the player name "Justin Verlander", 34 earned runs, and 175 innings pitched. This means that Justin Verlander had an earned run average of 1.7485714285714287, which gets rounded to 1.75 and gets printed out as part of the string.
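As a possible variation, the function could return the string instead of printing it, which lets the caller decide what to do with it. This is only a sketch: era_summary is a hypothetical name, and it uses an f-string (a way of placing values inside a string) that we haven't covered yet.

```python
def era_summary(player_name, earned_runs, innings_pitched):
    # ERA = earned runs allowed per nine innings pitched, rounded to 2 places
    earned_run_average = round(earned_runs / innings_pitched * 9, 2)
    # Return the sentence rather than printing it
    return f'{player_name} had an earned run average of {earned_run_average}'

summary = era_summary('Justin Verlander', earned_runs=34, innings_pitched=175)
print(summary)
# Output: Justin Verlander had an earned run average of 1.75
```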
Positional and Keyword Arguments
Here you can also see that we explicitly named some of the arguments when passing them in. "Justin Verlander" was a positional argument and earned_runs and innings_pitched were keyword arguments.
In programming, a keyword argument is a type of argument that is passed to a function or method in which the name of the argument is specified in the function call. This allows for greater clarity and readability of the code, as the name of the argument clearly indicates its purpose.
As an example, consider a function that calculates the batting average for a given MLB player. This function might have a keyword argument called "player_name" that specifies the name of the player for whom the batting average should be calculated. The function could then be called like this:
calculate_batting_average(player_name="Bryce Harper")
In this example, the "player_name" keyword argument is used to specify the player for whom batting average should be calculated. The use of a keyword argument makes the code more readable and easier to understand, as it is clear that the "player_name" argument is used to specify the player in question. Our example above is no different.
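To see the difference in practice, here is a short sketch. The helper era_value is a hypothetical function written just for this comparison; it returns the number instead of printing a sentence.

```python
def era_value(earned_runs, innings_pitched):
    return round(earned_runs / innings_pitched * 9, 2)

# Positional: Python matches arguments to parameters by order.
positional = era_value(34, 175)

# Keyword: Python matches arguments by name, so the order doesn't matter.
keyword = era_value(earned_runs=34, innings_pitched=175)
reordered = era_value(innings_pitched=175, earned_runs=34)

# Positional arguments in the wrong order run without an error
# but silently give the wrong answer.
swapped = era_value(175, 34)

print(positional, keyword, reordered, swapped)
# Output: 1.75 1.75 1.75 46.32
```

Naming the arguments guards against the kind of mix-up shown in the last call.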
Let's circle back to introducing this new idea of user-defined functions into our code.
# 2022 stats
players = [
    {
        "name": "Justin Verlander",
        "earned_runs": 34,
        "innings_pitched": 175
    },
    {
        "name": "Julio Urias",
        "earned_runs": 42,
        "innings_pitched": 175
    },
    {
        "name": "Dylan Cease",
        "earned_runs": 45,
        "innings_pitched": 184
    }
]

for player in players:
    calculate_era(player["name"], earned_runs=player["earned_runs"], innings_pitched=player["innings_pitched"])
Justin Verlander had an earned run average of 1.75
Julio Urias had an earned run average of 2.16
Dylan Cease had an earned run average of 2.2
As you can see here, we bring in our list of dictionary objects, one for each of three MLB pitchers: Justin Verlander, Julio Urias, and Dylan Cease.
We then iterate through each of our player objects and pass their data to the function, which provides the desired output! Compare this to our previous post and you'll probably notice that the code is a lot cleaner now.
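Once a function returns a value rather than printing it, the same loop can do a little more work. The sketch below is one possible extension, not something from the previous post: era_value is a hypothetical helper, and the loop keeps track of the lowest ERA it has seen, using the 2022 numbers from above.

```python
def era_value(earned_runs, innings_pitched):
    return round(earned_runs / innings_pitched * 9, 2)

# 2022 stats, same numbers as in the list above
players = [
    {"name": "Justin Verlander", "earned_runs": 34, "innings_pitched": 175},
    {"name": "Julio Urias", "earned_runs": 42, "innings_pitched": 175},
    {"name": "Dylan Cease", "earned_runs": 45, "innings_pitched": 184},
]

best_name = None
best_era = None
for player in players:
    player_era = era_value(player["earned_runs"], player["innings_pitched"])
    # Keep this player if it is the first one or has a lower ERA so far
    if best_era is None or player_era < best_era:
        best_name = player["name"]
        best_era = player_era

print(best_name, 'had the lowest ERA:', best_era)
# Output: Justin Verlander had the lowest ERA: 1.75
```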
Concluding Thoughts
We covered a decent amount of theory in this post. To recap, we covered the namespace, built-in objects, what libraries are, and user-defined functions.
We also covered in depth what functions are and some other details like positional and keyword arguments.
Using all of this, we took our code from last time and made it slightly more Pythonic. "Pythonic" code is essentially Python code that is written with good style and convention. The emphasis is on slightly because our code isn't all that much more Pythonic yet, but we'll touch on this topic more in future posts.
No groundbreaking analysis yet, but we're one step closer to doing some real analysis in Python!
Thanks for reading - You guys are all awesome and happy coding!