Working With Lists

I imagine that for most people, this topic might seem too simple to be worth reading. However, I've spent the past week trying to become more comfortable working with lists in Python. By working with them, I mean creating, modifying, and leveraging lists to accomplish various objectives.

I've done several exercises working with lists, but I learned the most from a practice project in Automate the Boring Stuff with Python by Al Sweigart. The exercise prompt was to create a program that simulates flipping a coin 100 times to see if there are any streaks of 6 "heads" or 6 "tails" in a row. Each set of 100 flips is one experiment, and the goal is to repeat the experiment 10,000 times to find out the percentage chance of a six-streak occurring in any experiment. This was interesting to me because, as an engineer, understanding probabilities is important for various reasons in my job.

The starting code given was:

import random
numberOfStreaks = 0

for experimentNumber in range(10000):

    # Code that creates a list of 100 'heads' or 'tails' values.

    # Code that checks if there is a streak of 6 heads or tails in a row.

print('Chance of streak: %s%%' % (numberOfStreaks / 100))

The first part was easy to start with. There might be a better way to do this, but using the tools I knew, I decided to create an empty list. Then, I used a "while" loop with the random.randint() function and the .append() method to fill the list with 0s and 1s.

flipsList = []
while len(flipsList) < 100:
    flipsList.append(random.randint(0, 1))

Now, according to the prompt, the list should have been created using the string values 'H' and 'T'. I wasn't sure how to make a function for random letters, so I used 0s and 1s as placeholders, knowing I would need to find a way to convert them to 'H' and 'T'. I thought this would be easy, but it turned out to be more challenging, and I eventually had to ask ChatGPT for advice.
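For comparison, here is a minimal sketch of one way to skip the 0/1 placeholders entirely. It assumes random.choice() from the standard library, which picks one element at random from a sequence; the list name flipsHT simply mirrors the name used later in this post.

```python
import random

# Build the list of flips directly from the letters 'H' and 'T'.
flipsHT = []
while len(flipsHT) < 100:
    # random.choice() returns one randomly selected item from the list it is given.
    flipsHT.append(random.choice(['H', 'T']))

print(flipsHT[:10])  # the first ten flips, different on every run
```

The rest of the post sticks with the 0s and 1s, since converting them is what led to the list syntax discussed next.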

Side Note: Is using AI assistance considered cheating in the real world of coding? I genuinely want to know because it seems like a great tool to quickly overcome obstacles. While I would love to solve everything on my own, it seems unwise not to use available resources to complete the task.

Based on the AI's suggestion, I ended up creating a second list using if/else statements, which made me feel a bit silly for not thinking of it myself (hopefully, that will come with time). What was interesting, though, was the syntax for how it was done. My initial thought was to set it up like this:

flipsHT = []         # declared above the 'while' loop that creates the numerical flipsList
for item in flipsList:
    if item == 0:
        flipsHT.append('H')
    else:
        flipsHT.append('T')

which I tested, and it did work. However, the syntax that was recommended was:

flipsHT = ['H' if item == 0 else 'T' for item in flipsList]

I always thought the word 'if' had to come before the Boolean, which had to come before the action. Here the value comes first ('H' if item == 0 else 'T'), and the 'for' moves inside the list brackets, which is what Python calls a list comprehension. The loop is still there, but we completely avoid the separate multi-line 'for' block and the .append() call. It's definitely cleaner code, but it has me rethinking what I understood up to this point. I want to experiment with more use cases to better understand how it works and can be used.
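A small sketch of the two pieces at work in that one line. The four-item flipsList is made up for illustration; everything else is standard Python syntax.

```python
flipsList = [0, 1, 1, 0]  # made-up sample data

# Piece 1: a conditional expression picks one of two values in a single line.
# Pattern: value_if_true if condition else value_if_false
label = 'H' if flipsList[0] == 0 else 'T'

# Piece 2: the same expression used inside an ordinary loop...
flipsLoop = []
for item in flipsList:
    flipsLoop.append('H' if item == 0 else 'T')

# ...and inside a list comprehension, which builds the list in one step.
flipsHT = ['H' if item == 0 else 'T' for item in flipsList]

print(label)      # H
print(flipsHT)    # ['H', 'T', 'T', 'H']
```

Both versions produce the same list; the comprehension just folds the loop and the append into the brackets.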

Moving on to the next part, I needed to figure out how to go through my newly created flipsHT[] list and find sequences of 6 or more repeated items. I knew this would require a for loop to go through the list, checking if each item was the same as the one before it, and either add to a counter or reset the counter based on the result.

I initially tried writing the loop as:

numberOfStreaks = 0  # Global variable
currentStreak = 1    # Global variable

for item in flipsHT:
    if flipsHT[item] == flipsHT[item - 1]:
        currentStreak += 1
        if currentStreak == 6:
            numberOfStreaks += 1
            break
    else:
        currentStreak = 1

This gave me an error, though. After fiddling around with it for a little bit, I got frustrated, went back to ChatGPT, and learned that I had to incorporate the range function into the first line of the for loop:

for item in range(1, len(flipsHT)):
    if flipsHT[item] == flipsHT[item - 1]:
        currentStreak += 1
        if currentStreak == 6:
            numberOfStreaks += 1
            break
    else:
        currentStreak = 1

I guess I still don’t totally understand the reason why I had to do this to count streaks, but not for converting 0s and 1s to ‘H’ and ‘T’. But hey, the program worked now!
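One way to see the difference, as a minimal sketch with a made-up three-item list: looping over the list itself hands back the values ('H' or 'T'), while looping over range() hands back positions. Converting 0s and 1s only needs each value, but comparing a flip with the one before it needs positions, because flipsHT[...] expects an integer inside the brackets.

```python
flipsHT = ['H', 'H', 'T']  # made-up sample data

# Looping over the list gives the values themselves.
for item in flipsHT:
    print(item)

# Looping over range() gives positions, so neighbours can be compared.
pairs = []
for index in range(1, len(flipsHT)):
    pairs.append((flipsHT[index - 1], flipsHT[index]))
print(pairs)  # [('H', 'H'), ('H', 'T')]

# Using a value as a position is what fails: 'H' is not an integer index.
errorRaised = False
try:
    flipsHT['H']
except TypeError as error:
    errorRaised = True
    print('TypeError:', error)
```

Starting the range at 1 rather than 0 matters too: at position 0, flipsHT[0 - 1] would quietly compare the first flip with the last one, since -1 means "last item" in Python.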

The full program looked like this:

import random
numberOfStreaks = 0
currentStreak = 1  # Set once here, so it is not reset between experiments

for experimentNumber in range(10000):
    # Code that creates a list of 100 'heads' or 'tails' values.
    flipsList = []
    while len(flipsList) < 100:
        flipsList.append(random.randint(0, 1))
    flipsHT = ['H' if item == 0 else 'T' for item in flipsList]

    # Code that checks if there is a streak of 6 heads or tails in a row
    for item in range(1, len(flipsHT)):
        if flipsHT[item] == flipsHT[item - 1]:
            currentStreak += 1
            if currentStreak == 6:
                numberOfStreaks += 1
                break
        else:
            currentStreak = 1

print('Chance of streak: %s%%' % (numberOfStreaks / 100))

Every time you run the program, it creates 10,000 lists of 100 simulated coin flips and checks each list for a streak of six 'H's or six 'T's in a row. It consistently reports a 79-84% chance of this happening over 10,000 experiments.

I started getting curious, though. What if the number of experiments changes? I found that if I reduced the number of experiments from 10,000 to 1,000, the program incorrectly reported the percentage as ~8.2% instead of staying around ~82%.

I realized this happens because the code isn't designed to handle changes in the number of experiments. The calculation "numberOfStreaks / 100" only gives the correct percentage if there are exactly 10,000 experiments, so this line breaks as soon as the number of experiments changes:

print('Chance of streak: %s%%' % (numberOfStreaks / 100))

A better approach would be to create a variable for numberOfExperiments and use it in both the main loop of the program and the print function at the end. This involves changing the top line of the main loop from:

for experimentNumber in range(10000):

to:

for experimentNumber in range(numberOfExperiments):

and the print function from:

print('Chance of streak: %s%%' % (numberOfStreaks / 100))

to:

print('Chance of streak: %s%%' % (numberOfStreaks / numberOfExperiments * 100))

By making these changes, I was able to adjust the number of experiments in just one place and still get accurate outputs for any number of experiments.
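A quick worked check of the two formulas, using a made-up count of 820 streaks out of 1,000 experiments (roughly what the ~82% figure above would produce). The helper name streakPercentage is hypothetical and only here for illustration.

```python
def streakPercentage(numberOfStreaks, numberOfExperiments):
    # Share of experiments with a streak, scaled to a percentage.
    return numberOfStreaks / numberOfExperiments * 100

numberOfStreaks = 820        # made-up count for illustration
numberOfExperiments = 1000

# Hard-coded divisor: only right when there are exactly 10,000 experiments.
oldResult = numberOfStreaks / 100
# Divisor tied to the experiment count: right for any number of experiments.
newResult = streakPercentage(numberOfStreaks, numberOfExperiments)

print('Old formula: %s%%' % round(oldResult, 1))  # about 8.2
print('New formula: %s%%' % round(newResult, 1))  # about 82.0
```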

The next question I was curious about was: "What happens to the probability if the number of flips in an experiment changes?" A lower number of flips would likely reduce the chance of getting a 6-streak, while increasing the number of flips would have the opposite effect. But how significant would the impact be?

It was easy enough to simply adjust the value in the ‘while’ loop that creates the flips list:

while len(flipsList) < 100:

But I decided I didn't want to have to dig through the code to find that value and adjust it. I swapped the integer out for a global variable I named "flipsCount" and rewrote the line as:

while len(flipsList) < flipsCount:

My final product looked like this:

For professional programmers, the number of comments in my code might seem excessive, but as someone learning, I find it helpful to take the time to explain what each line does, especially when I encounter something new.
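Putting the two changes together, the program could look roughly like the sketch below. This is a reconstruction from the steps described above, not the original listing. Two things are assumptions: wrapping the logic in a function (the name runExperiments is hypothetical), and resetting currentStreak at the start of every experiment so that a streak cannot carry over from one list to the next.

```python
import random

def runExperiments(numberOfExperiments=10000, flipsCount=100):
    """Return the percentage of experiments containing a streak of 6."""
    numberOfStreaks = 0

    for experimentNumber in range(numberOfExperiments):
        # Create a list of flipsCount 'H' or 'T' values.
        flipsList = []
        while len(flipsList) < flipsCount:
            flipsList.append(random.randint(0, 1))
        flipsHT = ['H' if item == 0 else 'T' for item in flipsList]

        # Check for a streak of 6 heads or tails in a row.
        currentStreak = 1  # assumption: reset for each experiment
        for item in range(1, len(flipsHT)):
            if flipsHT[item] == flipsHT[item - 1]:
                currentStreak += 1
                if currentStreak == 6:
                    numberOfStreaks += 1
                    break  # one streak is enough for this experiment
            else:
                currentStreak = 1

    return numberOfStreaks / numberOfExperiments * 100

print('Chance of streak: %s%%' % runExperiments())
```

With both values as parameters, trying a different setup is a one-line call such as runExperiments(1000, 50).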

By modifying the program, I learned that increasing the number of experiments doesn't greatly affect the average output value when the program is run multiple times. However, it does significantly narrow the range of output values.

For example, when I ran the program with the flipsCount variable set to 100 and the numberOfExperiments variable set to 100, the output ranged from 75% to 86%, and the average result from multiple runs was about 80.5%. Changing the numberOfExperiments variable to 20,000 narrowed the range to between 79.56% and 80.75%. The average output value stayed the same (around 80.5%), but the standard deviation of the results was much smaller.
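The narrowing matches what the standard formula for the spread of a measured proportion predicts: sqrt(p * (1 - p) / n), where p is the true chance and n is the number of experiments. A minimal sketch, assuming the experiments are independent and taking p = 0.805 from the average reported above; the function name standardError is hypothetical.

```python
import math

def standardError(probability, numberOfExperiments):
    # Typical run-to-run spread of the measured proportion.
    return math.sqrt(probability * (1 - probability) / numberOfExperiments)

probability = 0.805  # assumption: taken from the ~80.5% average above

for numberOfExperiments in (100, 20000):
    spread = standardError(probability, numberOfExperiments) * 100
    print('%s experiments: about +/- %.2f percentage points'
          % (numberOfExperiments, spread))
```

Going from 100 to 20,000 experiments is a 200-fold increase, so the spread shrinks by a factor of sqrt(200), or about 14. That is in line with the range tightening from several percentage points to well under one.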

If I changed the flipsCount variable to only 50 flips per experiment, the probability of getting a six-streak dropped to an average of about 54.92%.
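As a cross-check on the simulated numbers, the chance can also be computed exactly rather than estimated. The sketch below is not part of the original program; it is one possible approach, assuming a fair coin, that tracks the probability of each current run length flip by flip. The function name chanceOfStreak is hypothetical, and it assumes streakLength is at least 2.

```python
def chanceOfStreak(flipsCount, streakLength=6):
    """Exact chance of at least one streak, for a fair coin."""
    if flipsCount < streakLength:
        return 0.0

    # runProbs[k] = chance that no streak has occurred yet AND the
    # current run of identical flips is exactly k long.
    runProbs = [0.0] * streakLength
    runProbs[1] = 1.0  # after the first flip, the run length is 1

    for flip in range(flipsCount - 1):
        newProbs = [0.0] * streakLength
        for runLength in range(1, streakLength):
            half = runProbs[runLength] / 2
            newProbs[1] += half                  # next flip differs: run restarts
            if runLength + 1 < streakLength:
                newProbs[runLength + 1] += half  # next flip matches: run grows
            # Otherwise the run reaches streakLength and that probability
            # leaves the "no streak yet" pool for good.
        runProbs = newProbs

    return 1 - sum(runProbs)

for flipsCount in (50, 100):
    print('%s flips: %.2f%%' % (flipsCount, chanceOfStreak(flipsCount) * 100))
```

A sanity check that can be done by hand: with exactly 6 flips, only HHHHHH and TTTTTT qualify, so the chance is 2 out of 64. The values for 50 and 100 flips should land close to the simulated figures.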

After experimenting with these numbers and being eager to learn about the relationships between the inputs and outputs, I began to consider other information I could extract from the lists. This led me to create modified versions of the program.

The first modified version was created to answer the question: What is the probability that a given experiment with x number of coin flips will have multiple six-streaks of heads or tails? I used tools I was already familiar with to make this adjustment confidently.

  1. Added global variables to count lists with 2 streaks of 6 in a row and 3 streaks of 6 in a row.

  2. Moved the currentStreak variable inside the main loop and added a streaksCount variable, which I then substituted for numberOfStreaks in the inner for loop.

  3. Added a line to reset the current streak after it reaches 6.

    1. Without doing this, a streak of, say, 7 could be counted as 2 separate 6-streaks.

  4. Added additional if statements creating the logic that updates the variables for numberOfStreaks (experiment has at least one 6-streak), numberDoubleStreaks (experiment has at least two 6-streaks) and numberTrippleStreaks (experiment has at least three 6-streaks).

  5. Added print statements to report the probability of each additional count of 6-streaks.
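The steps above can be sketched as follows. This is an illustration under assumptions rather than the original code: the helper name countSixStreaks is hypothetical, the flips are generated with random.choice() for brevity, and the streak counter is reset to 0 after a 6-streak so that the next streak needs six fresh flips.

```python
import random

def countSixStreaks(flipsHT):
    """Count non-overlapping streaks of 6 identical flips in one list."""
    streaksCount = 0
    currentStreak = 1
    for item in range(1, len(flipsHT)):
        if flipsHT[item] == flipsHT[item - 1]:
            currentStreak += 1
        else:
            currentStreak = 1
        if currentStreak == 6:
            streaksCount += 1
            currentStreak = 0  # assumption: start counting again from scratch
    return streaksCount

flipsCount = 100
numberOfExperiments = 2000
numberOfStreaks = 0        # experiments with at least one 6-streak
numberDoubleStreaks = 0    # experiments with at least two 6-streaks
numberTrippleStreaks = 0   # experiments with at least three 6-streaks

for experimentNumber in range(numberOfExperiments):
    flipsHT = [random.choice('HT') for flip in range(flipsCount)]
    streaksCount = countSixStreaks(flipsHT)
    if streaksCount >= 1:
        numberOfStreaks += 1
    if streaksCount >= 2:
        numberDoubleStreaks += 1
    if streaksCount >= 3:
        numberTrippleStreaks += 1

print('At least one streak: %s%%' % (numberOfStreaks / numberOfExperiments * 100))
print('At least two streaks: %s%%' % (numberDoubleStreaks / numberOfExperiments * 100))
print('At least three streaks: %s%%' % (numberTrippleStreaks / numberOfExperiments * 100))
```

Under this reset rule, twelve identical flips in a row count as two 6-streaks, while seven in a row count as one.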

The output looks something like this:

I had a lot of fun experimenting with the flipsCount and numberOfExperiments variables to see how they affected the results.

Then I had another idea: instead of just looking for 6-streaks of heads or tails, what if the program could also report the probability of longer or shorter streaks using the same experiment inputs? The first part of updating the code was straightforward. I just needed to add some extra variables to count streaks of different lengths.

import random
flipsCount = 30
numberOfExperiments = 5000
streakCount4 = 0  # Tracks the number of lists containing a streak of 4 or more
streakCount5 = 0  # Tracks the number of lists containing a streak of 5 or more
streakCount6 = 0  # Tracks the number of lists containing a streak of 6 or more
streakCount7 = 0  # Tracks the number of lists containing a streak of 7 or more
streakCount8 = 0  # Tracks the number of lists containing a streak of 8 or more

I'll admit that the logic needed to count this way completely stumped me. ChatGPT came to my rescue and reminded me about using True and False as variable values. With the help of our soon-to-be overlord, I implemented this logic within the main for loop and after the list creation while loop to update the streak counter variables:


    # initial conditions for variables in the for loop below
    currentStreak = 1
    has4Streak = False
    has5Streak = False
    has6Streak = False
    has7Streak = False
    has8Streak = False

    for item in range(1, len(flipsHT)):         
        if flipsHT[item] == flipsHT[item - 1]:  
            currentStreak += 1                  
        else:
            currentStreak = 1                   
        # Booleans added     
        if currentStreak >= 4:
            has4Streak = True
        if currentStreak >= 5:
            has5Streak = True
        if currentStreak >= 6:
            has6Streak = True
        if currentStreak >= 7:
            has7Streak = True
        if currentStreak >= 8:
            has8Streak = True

    # Logic to update counts of different streaks 
    if has4Streak:
        streakCount4 += 1
    if has5Streak:
        streakCount5 += 1
    if has6Streak:
        streakCount6 += 1
    if has7Streak:
        streakCount7 += 1
    if has8Streak:
        streakCount8 += 1

For an output like this:

I know there's more I could do with this program to practice extracting different types of information from lists, but this is where I decided to stop for this week. I feel a bit more knowledgeable and comfortable working with lists than when I started.
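One possible next step, sketched under assumptions: instead of five separate True/False variables, track only the longest streak in each list and compare it against each length of interest. The helper name longestStreak and the streakCounts dictionary are hypothetical, and the flips are generated with random.choice() for brevity.

```python
import random

def longestStreak(flips):
    """Return the length of the longest run of identical items."""
    if not flips:
        return 0
    longest = 1
    currentStreak = 1
    for index in range(1, len(flips)):
        if flips[index] == flips[index - 1]:
            currentStreak += 1
        else:
            currentStreak = 1
        longest = max(longest, currentStreak)
    return longest

flipsCount = 30
numberOfExperiments = 5000
# One counter per streak length, 4 through 8, all starting at zero.
streakCounts = {length: 0 for length in range(4, 9)}

for experimentNumber in range(numberOfExperiments):
    flipsHT = [random.choice('HT') for flip in range(flipsCount)]
    longest = longestStreak(flipsHT)
    for length in streakCounts:
        if longest >= length:
            streakCounts[length] += 1

for length, count in streakCounts.items():
    print('Chance of a streak of %s or more: %s%%'
          % (length, count / numberOfExperiments * 100))
```

Adding another streak length then means changing only the range(4, 9) line.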

Working with lists in Python might seem easy, but it involves challenges in creating, modifying, and using them effectively. Exercises like simulating coin flips to study streak probabilities help you understand list manipulation and statistical analysis. Learning means overcoming challenges, asking for help, and trying different methods. By improving your code and exploring various scenarios, you enhance your technical skills and appreciate the power of lists in programming. Continued practice boosts confidence and skill, preparing you for more advanced programming tasks.