Code coaching: session 2
(Last session)
My friend's "homework" from last time was to turn the code snippet we had (list filtering) into a function because modular code is easy to read and functions are reusable (which means less repetition).
She did this! And the function worked as expected, outputting a list of missing items that exist on one list (the first parameter) but not another (the second parameter). I was really happy about this success!
I had a couple of notes for her about how she accomplished this:
1. The first thing that caught my eye was the return statement. She wrote it to return the result of a `print()` call, as in:

    return print(missing_items)
I was of two minds about this! It's not good practice to return the result of a `print()` call, because `print()` always returns `None`. So, if one were to assign the result of her function to a variable, it would look something like this:

    missing_items = list_filtering_function(list_one, list_two)

and the value of `missing_items`, instead of being a list of items that are present in the first list but missing from the second, would be `None`. This could lead to confusion and problems (bugs!) down the line, if we were to use the `missing_items` variable for some other purpose (like populating a new CSV or something).
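To make the `None` problem concrete, here's a minimal sketch comparing the two styles side by side. The function names `print_missing_items` and `return_missing_items` and the tiny letter lists are made up for illustration; they aren't from her actual code.

```python
def print_missing_items(outer_list, inner_list):
    """Hypothetical version that returns the result of print()."""
    missing_items = []
    for item in outer_list:
        if item not in inner_list:
            missing_items.append(item)
    # print() displays the list on screen, but its own return value is None
    return print(missing_items)


def return_missing_items(outer_list, inner_list):
    """Hypothetical version that returns the list itself."""
    missing_items = []
    for item in outer_list:
        if item not in inner_list:
            missing_items.append(item)
    return missing_items


printed_result = print_missing_items(["a", "b", "c"], ["b"])
returned_result = return_missing_items(["a", "b", "c"], ["b"])

print(printed_result is None)  # True: nothing useful was handed back
print(returned_result)         # ['a', 'c']
```

The first function does show the list while it runs, which is why it looks like it works. The catch only shows up when you try to use the variable afterwards.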
On the other hand, it was a good instinct to notice that nothing displayed for her if she simply returned the list of missing items! I suggested that she return the list, and print it outside of the function instead, to make the program display the result of the function.
    def find_missing_items(outer_list, inner_list):
        """
        compares two lists, and returns the items contained in outer_list
        that are not found in inner_list
        """
        # initialize result
        missing_items = []
        for item in outer_list:
            if item not in inner_list:
                missing_items.append(item)
        return missing_items

    # outside scope of function, create some simple lists to test it
    all_fruits = ["strawberry", "watermelon", "blueberry", "kiwi", "rambutan", "durian", "mango", "tomato", "cucumber"]
    available_fruits = ["strawberry", "rambutan", "mango"]

    # call the function and assign the result to a new variable
    unavailable_fruits = find_missing_items(all_fruits, available_fruits)

    # print the result!
    print(unavailable_fruits)

    # alternatively, print and call in the same line (more concise, less readable to a beginner,
    # and the result can't be reused without calling the function again)
    print(find_missing_items(all_fruits, available_fruits))
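Since I mentioned populating a CSV as one reason to return the list, here's a rough sketch of what that reuse could look like. The `unavailable_fruit` header and the shortened fruit lists are assumptions for illustration, and `io.StringIO` stands in for a real file so nothing gets written to disk.

```python
import csv
import io


def find_missing_items(outer_list, inner_list):
    """Return the items in outer_list that are not found in inner_list."""
    missing_items = []
    for item in outer_list:
        if item not in inner_list:
            missing_items.append(item)
    return missing_items


# shortened, made-up lists just for this sketch
all_fruits = ["strawberry", "kiwi", "mango"]
available_fruits = ["mango"]
unavailable_fruits = find_missing_items(all_fruits, available_fruits)

# io.StringIO behaves like an open file, but keeps everything in memory
csv_buffer = io.StringIO()
writer = csv.writer(csv_buffer)
writer.writerow(["unavailable_fruit"])  # hypothetical column header
for fruit in unavailable_fruits:
    writer.writerow([fruit])

csv_text = csv_buffer.getvalue()
print(csv_text)
```

If the function had returned `print(...)` instead, `unavailable_fruits` would be `None` and the `for` loop would raise a `TypeError`.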
2. The other notes were mostly minor, but the main one was about choosing descriptive names for functions and variables. I emphasized clarity and specificity over shorter names, because any worthwhile IDE (we're using VS Code) will autocomplete long function/variable names, and clarity/specificity in naming will help you remember what the function does and avoid confusing it with other, similar functions. We came up with `find_missing_items` for the function, and `big_list` and `small_list` for the params. I now think that `outer_list` and `inner_list` would be more appropriate, to reflect SQL outer/inner join language, or to evoke a Venn diagram of the two lists, but `big_list` and `small_list` get the job done: she will always know which param is which!
We committed these changes to her Git branch (have I talked about Git branches here?) and pushed them up to the remote repo. My next task is to clone her repo on my own machine and play around with SQLAlchemy and a simple database structure. We also discussed her existing codebase at work and what it's trying to accomplish, and I think that querying a persistent database would be wayyyyy more efficient than literally building a new database from CSVs on every run, then outputting new CSVs and saving them to a folder. The program took like 4 hours to finish running. Yikes!
[infomercial voice]: There's gotta be a better way!
So, next time we'll discuss classes and object-oriented programming, because there are a lot of conceptual steps between what we have right now and ORMs/databases....













