Anthony's Blog Site

Student Club Explorer

May 29, 2025

U of T Student Club

Photo Credit: University of Toronto


🧭 Student Club Explorer

Student Club Explorer is a Python-based tool that scrapes information about student clubs from the University of Toronto’s club directory. It extracts club details, contact info, interest tags, and more, with support for keyword-based filtering.

💡 Motivation

While working in an administrative role at the University of Toronto, my partner encountered a challenge: she needed to extract student club information, such as primary contact emails, origin campuses, and official website links, from the Student Organization Portal (SOP). She asked if there were any existing tools that could help with this task.

After understanding the requirements and evaluating the available options, it became clear that building a lightweight, customized script would be faster and more flexible than relying on a general-purpose tool. That's how Student Club Explorer came to life: a tool built to automate and streamline this data collection process.

πŸ” Intuition

The SOP provides a directory of student organizations, where each club is listed with a clickable name linking to its individual profile page. While the homepage displays basic information like the club name and campus, key details, such as the primary email address or external website, are only available on each club’s individual page.
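The first step described above, gathering each club's name and profile link from the directory listing, might look roughly like the sketch below. It uses only the standard library so it runs without extra installs (the actual tool uses beautifulsoup4). The function name `collect_club_links`, the `/groups/` path marker, and the sample URL are all assumptions made for illustration, not details taken from the SOP.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ClubLinkCollector(HTMLParser):
    """Collects (link text, absolute URL) pairs for anchors that look like club profiles."""

    def __init__(self, base_url, path_marker):
        super().__init__()
        self.base_url = base_url
        self.path_marker = path_marker  # hypothetical marker for profile links
        self.links = []
        self._href = None   # href of the anchor currently being read, if any
        self._text = []     # text fragments inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and self.path_marker in href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the link text.
            name = " ".join("".join(self._text).split())
            self.links.append((name, urljoin(self.base_url, self._href)))
            self._href = None


def collect_club_links(html, base_url, path_marker="/groups/"):
    """Return a list of (club name, profile URL) tuples found in the listing HTML."""
    collector = ClubLinkCollector(base_url, path_marker)
    collector.feed(html)
    collector.close()
    return collector.links


# Made-up listing fragment for demonstration only.
SAMPLE_LISTING = (
    '<ul>'
    '<li><a href="/groups/chess-club">Chess  Club</a> St. George</li>'
    '<li><a href="/about">About</a></li>'
    '</ul>'
)
links = collect_club_links(SAMPLE_LISTING, "https://sop.example.org/directory")
```

Each URL in `links` can then be fetched in turn to reach the profile page that holds the email and website.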

Upon inspection, these profile pages follow a recognizable structure: for instance, the contact email consistently appears under a "Contact" heading and before the next section, such as "Address". These structural patterns make the site well suited to HTML parsing, so a custom scraper is both practical and efficient for collecting structured data at scale.
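To make the "between Contact and Address" idea concrete, here is a minimal sketch that walks the visible text of a profile page and returns the first email found after a "Contact" heading and before an "Address" heading. It is written with the standard library rather than beautifulsoup4, and the function name `extract_contact_email`, the email pattern, and the sample HTML are assumptions for illustration; the real pages may need looser heading matching.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the non-empty text nodes of an HTML document in order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


# Deliberately simple pattern; it is not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_contact_email(html, start="Contact", stop="Address"):
    """Return the first email between the start and stop headings, or None."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    chunks = collector.chunks
    if start not in chunks:
        return None
    for chunk in chunks[chunks.index(start) + 1:]:
        if chunk == stop:
            break  # left the Contact section without finding an email
        match = EMAIL_RE.search(chunk)
        if match:
            return match.group(0)
    return None


# Made-up profile fragment for demonstration only.
SAMPLE_PROFILE = """
<h2>Contact</h2>
<p>Primary contact: <a href="mailto:chess.club@example.com">chess.club@example.com</a></p>
<h2>Address</h2>
<p>other@example.com</p>
"""
email = extract_contact_email(SAMPLE_PROFILE)
```

Stopping at the next heading keeps emails from later sections from being mistaken for the primary contact.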

📌 Features

  • πŸ” Scrapes club names, campuses, websites, emails, and interest tags
  • 🌐 Follows each club’s individual profile page for deeper details
  • 🧠 Filters clubs by interest keywords
  • 📤 Exports clean, UTF-8 encoded CSV files
  • ⏳ Displays a real-time progress bar using tqdm
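The keyword filtering and CSV export features could be sketched as follows. This is an illustrative outline, not the tool's actual code: the record layout (a dict per club with an `interests` list), the `FIELDS` column order, and the function names are assumptions. The progress bar is left out here so the example needs no third-party packages.

```python
import csv
import io

# Hypothetical column order for the exported file.
FIELDS = ["name", "campus", "website", "email", "interests"]


def filter_by_keywords(clubs, keywords):
    """Keep clubs with at least one interest tag containing any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [
        club for club in clubs
        if any(k in tag.lower() for k in wanted for tag in club["interests"])
    ]


def write_clubs_csv(clubs, handle):
    """Write club records to an open text file, joining interest tags with '; '."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for club in clubs:
        writer.writerow(dict(club, interests="; ".join(club["interests"])))


# Made-up records for demonstration only.
CLUBS = [
    {"name": "Jazz Ensemble", "campus": "St. George",
     "website": "https://example.com/jazz", "email": "jazz@example.com",
     "interests": ["Music", "Performance"]},
    {"name": "Robotics Team", "campus": "Scarborough",
     "website": "https://example.com/robots", "email": "robots@example.com",
     "interests": ["Engineering", "Technology"]},
]

selected = filter_by_keywords(CLUBS, ["music"])
buffer = io.StringIO()  # stands in for a real file in this sketch
write_clubs_csv(selected, buffer)
```

When writing to disk, opening the file with `open(path, "w", newline="", encoding="utf-8")` gives UTF-8 output and avoids blank lines between rows on Windows.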

πŸ› οΈ Technologies

  • Python 3.11
  • requests for HTTP requests
  • beautifulsoup4 for HTML parsing
  • tqdm for the progress bar
  • csv for output formatting

To see exactly how these libraries are used, feel free to browse the source code on GitHub.