Baby Steps with Python

Chief amongst the many skills you need for any kind of AI or machine learning work is the programming language Python.

I started studying it ten days ago with a series of excellent courses from the University of Michigan called Python for Everybody on Coursera. I binged them, getting through about five weeks' worth of material a day on average, and had completed them all a week later. This post showcases what I've learned so far.

Please note that this has nothing to do with AI or ML, and is very basic stuff if you know Python. But it was all new to me, and a great learning experience.

The Objective

The course has an assignment where you adapt some code that takes a list of places and converts it to pins on a map. I decided to do something similar, but instead of reusing the code, I would write it from scratch to prove that I really understood it.

 

This is the final output image. The pins represent places I visited on my year abroad in 2023. (I travelled for 15 months, starting in Asia and ending in South America.)

System Overview

Here is a PowerPoint diagram I made to visualise what I was going to do.

 

Based on this diagram, there were two Python scripts I needed to write: loaddata.py and extractdata.py.

OpenStreetMap is an open-source alternative to Google Maps. There's a website called Geoapify.com which provides an API to interact with it. This means that you can write code that requests information from Geoapify about any given place in the world, and it sends back data about that place, including its latitude and longitude.
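
As a rough sketch of what such a request could look like (this is not the actual loaddata.py): the code below assumes Geoapify's geocoding endpoint lives at the URL shown and accepts text and apiKey query parameters, so check the Geoapify documentation before relying on it. The function names are made up for illustration.

```python
import urllib.parse
import urllib.request

# Assumed endpoint for Geoapify's forward geocoding API; verify against their docs.
GEOCODE_URL = "https://api.geoapify.com/v1/geocode/search"


def build_geocode_url(place, api_key):
    """Return the request URL for one place name (hypothetical helper)."""
    query = urllib.parse.urlencode({"text": place, "apiKey": api_key})
    return f"{GEOCODE_URL}?{query}"


def fetch_place_json(place, api_key):
    """Query the service and return the raw JSON response as text."""
    url = build_geocode_url(place, api_key)
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")
```

Building the URL with urlencode means spaces and commas in a name like "Siem Reap, Cambodia" are escaped automatically.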

Because free access to Geoapify is limited and because querying a website a bunch of times is relatively slow, the plan was to split the process into two phases, hence the two Python files.

In stage 1 (the green parts of the diagram), the code takes a list of my travel destinations from a text file, things like "Siem Reap, Cambodia" and "Medellin, Colombia". For each one, it queries Geoapify and is given data about that place in JSON form. This is then stored in a SQLite database so that there's no need to query Geoapify any more than necessary. The data hasn't been processed or validated in any way at this stage.
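
Storing the raw response might look something like the sketch below. The table name Locations and the column names address and geodata are assumptions for illustration, not necessarily what loaddata.py uses.

```python
import sqlite3


def open_database(path):
    """Open (or create) the database and make sure the table exists."""
    connection = sqlite3.connect(path)
    # Hypothetical schema: the place name plus the untouched JSON text.
    connection.execute(
        "CREATE TABLE IF NOT EXISTS Locations (address TEXT, geodata TEXT)"
    )
    return connection


def store_raw_json(connection, address, raw_json):
    """Save the unprocessed JSON text for one destination."""
    connection.execute(
        "INSERT INTO Locations (address, geodata) VALUES (?, ?)",
        (address, raw_json),
    )
    connection.commit()
```

The ? placeholders let sqlite3 handle quoting, which matters because the JSON text is full of quotation marks.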

So the end result of loaddata.py is records in a database table:


On to stage 2, the purple part of the earlier diagram.

extractdata.py reads each record from the database and processes the raw JSON data about each place, converting it into a Python nested dictionary object and drilling down to extract the latitude and longitude.
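
To illustrate the drilling down, here is a minimal sketch. It assumes the response is shaped like a list of features, each with a properties dictionary holding lat and lon; the sample JSON is trimmed and its values are illustrative only, so compare it with a real response before trusting the key names.

```python
import json

# A made-up, trimmed response in the shape assumed above (values are illustrative).
SAMPLE_JSON = """
{
  "features": [
    {"properties": {"formatted": "Siem Reap, Cambodia", "lat": 13.36, "lon": 103.86}}
  ]
}
"""


def extract_coordinates(raw_json):
    """Return (latitude, longitude) from the first result, or None if there is none."""
    data = json.loads(raw_json)  # JSON text becomes nested dicts and lists
    features = data.get("features", [])
    if not features:
        return None
    properties = features[0]["properties"]
    return properties["lat"], properties["lon"]


print(extract_coordinates(SAMPLE_JSON))
```

Returning None for an empty result gives the caller a simple way to skip places the service could not find.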

For each destination, the name, latitude and longitude were then written into a simple JavaScript file. Finally, I used an HTML file provided by the course to display that data as pins on a map. I didn't attempt to rewrite the HTML from scratch because it would involve learning a lot of things that aren't Python, and the point of this little project was to practise Python.
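
Writing that JavaScript file could be sketched as follows. The variable name myData and the [latitude, longitude, name] row layout are assumptions here; the real layout has to match whatever the course's HTML file expects.

```python
import json


def format_js_data(places):
    """Build JavaScript source defining one array of [lat, lon, name] rows.

    places is a list of (name, latitude, longitude) tuples.
    """
    # json.dumps produces text that is also valid JavaScript, quotes included.
    rows = [json.dumps([lat, lon, name]) for name, lat, lon in places]
    return "myData = [\n" + ",\n".join(rows) + "\n];\n"


def write_js_file(path, places):
    """Write the generated JavaScript to a file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_js_data(places))
```

Using json.dumps for each row avoids hand-escaping place names that contain apostrophes or accented characters.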

Challenges and Conclusions

As well as not knowing any Python ten days ago, I didn't know anything about JSON and hadn't done anything with SQL or databases for twenty years.

Writing the Python code from scratch without referring to the course materials was immensely useful, because I kept running into the limits of what I knew, figuring things out and then expanding those limits. In a professional programming environment you don't have to know everything there is to know about the language and its libraries; a key meta-skill is being able to find out and fill in the gaps whenever you don't know how something is done.

Back when I was a software developer working mainly with C#, this involved trawling websites for clues, which was time-consuming and often frustrating. This time around I used AI (Claude), not to write code but to answer questions. I could ask things like:

"Using python, I am inserting a record into a database using a sql command. That works but I want to check first whether it already exists, in order to avoid duplication. How do I do it?"

Claude provided the information each time in a wonderfully clear way. It was invaluable. Already I can't imagine any kind of programming workflow without the help of AI.
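
For anyone curious, one common way to do the check described in that question is to look the record up first and insert only if nothing comes back. This is a sketch, not a transcript of Claude's answer, and the table and column names (Locations, address, geodata) are hypothetical.

```python
import sqlite3


def record_exists(connection, address):
    """Return True if a row for this address is already in the table."""
    cursor = connection.execute(
        "SELECT 1 FROM Locations WHERE address = ? LIMIT 1", (address,)
    )
    return cursor.fetchone() is not None


def insert_if_new(connection, address, raw_json):
    """Insert the record unless it is already there. Return True if inserted."""
    if record_exists(connection, address):
        return False
    connection.execute(
        "INSERT INTO Locations (address, geodata) VALUES (?, ?)",
        (address, raw_json),
    )
    connection.commit()
    return True
```

An alternative is to declare the column UNIQUE and use SQLite's INSERT OR IGNORE, which leaves the duplicate check to the database.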

Clean Code

I've been reading Clean Code by Robert C. Martin, a hugely influential book about programming in a maintainable way. I applied what I was learning to the two Python scripts I wrote. Specifically, I followed these principles:

  • Functions should be as short as possible.
  • Functions should do one thing and one thing only.
  • Function and variable names should be self-explanatory and clear, eliminating the need for most comments.
  • Spacing should be liberally applied to enhance readability.

The Python examples in the course materials, incidentally, had none of the above! Each was a single dense block of code.

If you want to see my code, I've converted the files to text files so you can see them here in your browser:

loaddata.py.txt

extractdata.py.txt

Thanks for reading!


 


