Session 20: Looking Forward

Throughout this module, you’ve learned the fundamentals of Python programming with a focus on applying it to scientific problems. As you continue your programming journey beyond this course, here are some important concepts that can help you write more robust and maintainable code that is easier to share and work on with others.

Version Control

Version control systems (like Git) allow you to track changes in your code over time. This becomes invaluable when:

  • You need to recover previous versions of your code

  • You’re collaborating with others on the same codebase

  • You want to try experimental features without breaking working code

Real-world example: Imagine you’re developing a script that analyzes molecular structures. With Git, you can:

# 1. Save your working version (stage the changes, then commit them)
git add .
git commit -m "Working molecular weight calculator"

# 2. Create a branch to experiment with a new feature
git checkout -b new-feature

# 3. If the experiment fails, easily return to your working version
git checkout main

# 4. If it works, switch back to main and merge it into your main code
git checkout main
git merge new-feature

These terminal commands demonstrate a basic Git workflow for experimenting with new features without risking your working code. Here’s what’s happening in each numbered step:

  1. This saves your current code in Git with a descriptive message. Think of it like creating a checkpoint in a game that you can return to.

  2. This creates and switches to a new “branch” called new-feature - a separate copy of your code where you can make changes without affecting your original version.

  3. If your experiment isn’t working out, this command takes you back to your original working code (on the main branch).

  4. If your experiment worked well, this combines your experimental code with your original code. You are “merging” the new-feature branch into your current (main) branch.

Tests

Writing tests helps ensure your code works as expected, even after making changes. As your program grows, you might accidentally break some part of your existing functionality without noticing - this is where tests come in. Tests are special pieces of code that check if your other code is working correctly. They act like a safety net to catch mistakes.

Real-world example: Testing a function that calculates molecular weight:

def test_molecular_weight():
    # Test with water
    assert round(calculate_molecular_weight("H2O"), 2) == 18.02
    
    # Test with aspirin
    assert round(calculate_molecular_weight("C9H8O4"), 2) == 180.16

This code checks that our calculate_molecular_weight function gives the correct answers:

  1. It tries calculating the molecular weight of water (H2O) and makes sure the result, rounded to two decimal places, is 18.02.

  2. It tries calculating the molecular weight of aspirin (C9H8O4) and checks that the rounded result is 180.16.

The keyword assert here means “make sure this is true”: if the condition is false, the test fails and warns you that something is wrong.

Imagine you’re working on a more advanced version of your molecular weight calculator program and accidentally change something that breaks your molecular weight calculations. Without tests, you might not notice the error until it causes serious problems. With tests, you’ll know immediately when something has broken.
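To run a test like this, calculate_molecular_weight has to exist somewhere. The sketch below is one minimal, hypothetical implementation, followed by a version of the test that compares with a tolerance using math.isclose instead of rounding. The ATOMIC_MASSES table and the regular-expression parsing are assumptions made for illustration, and only simple formulas without brackets are handled.

```python
import math
import re

# Standard atomic masses (g/mol) for the few elements this sketch needs.
ATOMIC_MASSES = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def calculate_molecular_weight(formula: str) -> float:
    """Sum atomic masses for a simple formula such as "H2O" (no brackets)."""
    total = 0.0
    # Each match is an element symbol followed by an optional count.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # An unknown element raises KeyError, which is acceptable for a sketch.
        total += ATOMIC_MASSES[symbol] * (int(count) if count else 1)
    return total


def test_molecular_weight():
    # Water: 2 * 1.008 + 15.999
    assert math.isclose(calculate_molecular_weight("H2O"), 18.015, abs_tol=0.01)
    # Aspirin: 9 * 12.011 + 8 * 1.008 + 4 * 15.999
    assert math.isclose(calculate_molecular_weight("C9H8O4"), 180.159, abs_tol=0.01)


test_molecular_weight()
print("All tests passed")
```

If this were saved in a file whose name starts with test_, pytest would find and run test_molecular_weight automatically. Comparing with a tolerance is a design choice: a value such as 18.015 sits right on a rounding boundary, so whether it rounds to 18.01 or 18.02 can depend on tiny floating-point differences.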

Type Hints

Type hints are labels that state what kind of data should go into and come out of a function, or what type a variable should be. They make your code more readable and let tools such as IDEs and type checkers catch errors before the code runs.

Real-world example: A function to calculate the volume of an ideal gas:

def calculate_ideal_gas_volume(pressure: float, mol: float, temp: float) -> float:
    """
    Calculate the volume of an ideal gas using the ideal gas equation: PV = nRT
    
    Args:
        pressure: Pressure in pascals
        mol: Amount of substance in moles
        temp: Temperature in kelvin
        
    Returns:
        Volume in cubic meters
    """
    R = 8.314  # Molar gas constant in J/(mol·K)
    return (mol * R * temp) / pressure

The key things to notice here are the float type hints after the colon for each parameter, as well as the float after the -> arrow, which is the type hint for the function’s return value.

Type hints clearly communicate what values the function expects and returns. They generally help prevent mistakes like trying to calculate something using text instead of numbers. They also help other people understand your code more easily. When you or someone else reads this code later, it’s immediately clear what types of data to use. IDEs can also check your code for errors by checking that you are using the correct argument type in a function call.
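Python itself does not check type hints while the program runs, which is easy to confirm with a short sketch. The example below repeats the function in shortened form and calls it once with numbers and once with a string; the argument values are illustrative assumptions.

```python
def calculate_ideal_gas_volume(pressure: float, mol: float, temp: float) -> float:
    """Volume of an ideal gas in cubic meters, from PV = nRT."""
    R = 8.314  # Molar gas constant in J/(mol*K)
    return (mol * R * temp) / pressure


# Correct usage: one mole at 0 degrees Celsius and atmospheric pressure
volume = calculate_ideal_gas_volume(pressure=101325.0, mol=1.0, temp=273.15)
print(f"Volume: {volume:.4f} m^3")  # about 0.0224 m^3, i.e. roughly 22.4 litres

# Incorrect usage: the pressure is text. The hint does not stop the call;
# the error only appears when the division is attempted.
try:
    calculate_ideal_gas_volume(pressure="101325", mol=1.0, temp=273.15)
    failed = False
except TypeError as error:
    failed = True
    print(f"TypeError at runtime: {error}")
```

A static type checker such as mypy, or the checker built into many IDEs, would be expected to flag the second call before the program is run, because a str is passed where a float is expected.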

APIs

Application Programming Interfaces (APIs) allow your code to interact with external services or databases. They are provided by an external resource, and describe a way for others to retrieve (or even store) information remotely. In scientific research, this may involve accessing chemical databases, molecular structures, or literature.

Real-world example: Using the PubChem API to retrieve compound information:

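import requests
from urllib.parse import quote

def get_compound_info(compound_name: str) -> dict:
    """Retrieve information about a chemical compound from PubChem."""
    # Percent-encode the name so that spaces and special characters are safe in a URL
    url = f"https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/{quote(compound_name)}/JSON"
    # A timeout stops the program from waiting forever if the server does not respond
    response = requests.get(url, timeout=10)

    if response.status_code == 200:
        return response.json()
    else:
        return {"error": f"Could not find information for {compound_name}"}

# Example usage
aspirin_info = get_compound_info("aspirin")
if "error" in aspirin_info:
    print(aspirin_info["error"])
else:
    # Index 16 is assumed to hold the molecular formula; the position of a
    # property in 'props' is not guaranteed to be the same for every compound
    print(f"Molecular formula: {aspirin_info['PC_Compounds'][0]['props'][16]['value']['sval']}")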

So what is happening here?

  1. The function creates a special web address (URL) that includes the compound name.

  2. It sends a request to PubChem (a chemical database website).

  3. If successful (status code 200), it returns the chemical information as a nested data set. You can see the full data it returns by opening the same URL in a web browser.

  4. If unsuccessful, it returns an error message.

To achieve this, we use the third-party requests package, which lets Python code communicate with websites over the Internet.

Instead of manually looking up chemical information or maintaining your own database, you can access comprehensive information from established resources. This lets you focus on analyzing the data directly.
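Because the response is deeply nested, picking a value out by position (such as index 16) is fragile. One alternative is to search the property list by its label. The sketch below does this on a small hand-written stand-in for the response, so it runs without a network connection. The find_property helper is hypothetical, and the assumed layout (each entry in props carrying a urn with a label, plus a value) should be checked against a real response.

```python
from typing import Optional

# Hand-written, heavily trimmed stand-in for a PubChem response (an assumption
# for illustration; a real response contains many more fields).
sample_response = {
    "PC_Compounds": [
        {
            "props": [
                {"urn": {"label": "IUPAC Name", "name": "Preferred"},
                 "value": {"sval": "2-acetyloxybenzoic acid"}},
                {"urn": {"label": "Molecular Formula"},
                 "value": {"sval": "C9H8O4"}},
            ]
        }
    ]
}


def find_property(compound_info: dict, label: str) -> Optional[str]:
    """Return the text value of the first property with the given label, or None."""
    compounds = compound_info.get("PC_Compounds", [])
    if not compounds:
        return None
    for prop in compounds[0].get("props", []):
        if prop.get("urn", {}).get("label") == label:
            return prop.get("value", {}).get("sval")
    return None


print(find_property(sample_response, "Molecular Formula"))  # C9H8O4
print(find_property({"error": "not found"}, "Molecular Formula"))  # None
```

Returning None when the label is missing also covers the error dictionary that get_compound_info returns when the lookup fails.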

Where next?

If you’re interested in pursuing these topics further:

  • For version control: Learn Git basics at Git-SCM

  • For testing: Explore the pytest framework

  • For type hints: Review the typing module in Python’s documentation

  • For APIs: Practice with scientific APIs like PubChem, ChEMBL, or the Protein Data Bank

Remember that programming is a skill that improves with practice. The foundations you’ve learned in this module will serve you well as you tackle increasingly complex problems in your scientific career.