go-event-alerts

GO Event Alerts - Backend

This directory contains the Python backend for the Pokémon GO Event Notifier project. The backend is responsible for scraping event information, processing it with a series of LLMs, and storing it in DynamoDB.

Overview

The backend is a data processing pipeline that performs the following steps:

  1. Scrapes event data from pokemongolive.com and leekduck.com.
  2. Processes the raw HTML content using a multi-stage LLM pipeline to extract, refine, and deduplicate structured data.
  3. Stores the final, clean event data in an AWS DynamoDB table.
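
The three steps above could be wired together roughly as follows. This is a minimal sketch, not the project's actual code: `run_pipeline`, `Event`, and the three callables are hypothetical names, and the scraper, LLM processing, and storage are injected as plain functions so the flow can be shown without network or AWS access.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Event:
    # Hypothetical minimal record; the real schema is not shown in this README.
    title: str
    start: str  # ISO 8601 timestamp
    source: str


def run_pipeline(
    sources: Iterable[str],
    scrape: Callable[[str], str],
    process: Callable[[str, str], list[Event]],
    store: Callable[[list[Event]], None],
) -> int:
    """Scrape each source, extract events from the HTML, then store them all."""
    events: list[Event] = []
    for source in sources:
        html = scrape(source)                 # step 1: fetch raw HTML
        events.extend(process(source, html))  # step 2: LLM extraction (stubbed here)
    store(events)                             # step 3: persist, e.g. to DynamoDB
    return len(events)


if __name__ == "__main__":
    saved: list[Event] = []
    count = run_pipeline(
        ["pokemongolive.com", "leekduck.com"],
        scrape=lambda src: f"<html>{src}</html>",
        process=lambda src, html: [Event("Example Event", "2025-01-01T10:00:00", src)],
        store=saved.extend,
    )
    print(count, [e.source for e in saved])
```

Passing the stages in as functions keeps each one replaceable in tests, which may or may not match how the real `lambda_function.py` is organized.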

Features

LLM Processing Pipeline

A key feature of this backend is its multi-stage approach to data refinement: it makes several targeted LLM calls, each to a smaller model, instead of a single call to a large one. The aim is to keep data quality high while keeping costs low.

1. Initial Structuring

2. Quality Assurance & Classification

3. Semantic Deduplication

The same event is often announced on multiple sites with slightly different wording and timing, so exact matching on titles or dates is not enough. Deduplication therefore has to compare events by meaning.
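
To illustrate the problem, here is a minimal sketch of a candidate-pairing step. It is not the LLM-based method this backend uses: it swaps in a plain string-similarity heuristic (`difflib.SequenceMatcher`) plus a start-time tolerance. The names `Event` and `likely_duplicates` and both thresholds are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from difflib import SequenceMatcher
from itertools import combinations


@dataclass(frozen=True)
class Event:
    # Hypothetical record; the project's real schema is not shown here.
    title: str
    start: datetime
    source: str


def likely_duplicates(
    events: list[Event],
    min_title_ratio: float = 0.6,                    # assumed threshold
    max_start_gap: timedelta = timedelta(hours=24),  # assumed tolerance
) -> list[tuple[Event, Event]]:
    """Return cross-source pairs with similar titles and nearby start times."""
    pairs = []
    for a, b in combinations(events, 2):
        if a.source == b.source:
            continue  # only compare announcements from different sites
        ratio = SequenceMatcher(None, a.title.lower(), b.title.lower()).ratio()
        if ratio >= min_title_ratio and abs(a.start - b.start) <= max_start_gap:
            pairs.append((a, b))
    return pairs
```

A character-level heuristic like this misses reworded titles, which is the kind of case a semantic comparison is meant to catch. At most it could serve as a cheap pre-filter before a model call.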

Setup & Installation

  1. Clone the repository:
    git clone https://github.com/sthoresen/go-event-alerts.git
    cd go-event-alerts/backend
    
  2. Create a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
    
  3. Install the dependencies:
    pip install -r requirements.txt
    

Configuration

The backend requires the following environment variables to be set in a .env file:
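
A minimal sketch of how such a file could be read and validated at startup, assuming simple `KEY=value` lines. It uses only the standard library, and the project itself may load its settings differently. `load_env_file` and `require_env` are hypothetical helpers, and the caller supplies the variable names, since this sketch does not assume any.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")  # drop optional quotes
    return values


def require_env(names: list[str], path: str = ".env") -> dict[str, str]:
    """Return the requested settings, preferring real environment variables."""
    file_values = load_env_file(path) if Path(path).exists() else {}
    merged = {**file_values, **os.environ}  # process environment wins
    missing = [name for name in names if not merged.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: merged[name] for name in names}
```

Failing fast with the full list of missing names tends to be easier to debug than an error deep inside a scraper or an AWS call.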

Running the Pipeline

To run the data processing pipeline manually, execute the lambda_function.py script:

```bash
python lambda_function.py
```