Wright-Way Rescue Tracker#
Background and Motivation#
Wright-Way Rescue is a pet adoption center that pulls cats and dogs from rural areas that are under-resourced and overpopulated with homeless animals. They do admirable work, and have a continual supply of some of the cutest puppies you will ever see. Their adoption process is relatively straightforward: you first apply and get approved to be an adopter, then you can reserve any of their available pets online, and finally you can visit them and take your new pet home.[1]
Poor Website Interface#
One of the main issues with the adoption process is the poor web interface:
- There is no search functionality beyond your browser's built-in page search.
- There is no filtering functionality for attributes that are important, like age, gender, breed, reserved status, etc.
- There is no ordering functionality, so you cannot sort by most recently added, or anything else.
There is only a cursory amount of information on the main page, and you have to click on every puppy you are interested in to see all relevant information, including reserved status.
High Demand#
The most desirable puppies are often reserved within a few hours of being posted, so if you are not constantly refreshing the page, you will most likely miss out on the puppy you want. As of writing, there were ~200 puppies available and ~50 puppies reserved. Every couple of days, a handful of puppies are added and/or adopted. Even if you are constantly watching the page, it is extremely difficult to tell recently added puppies from the rest. It becomes a game of luck, and that is not a fun game to play when adopting a pet.
Goals and Non-Goals#
Goals#
- Create a client that can pull all the puppies from the Wright-Way Rescue website.
- Stand up a database that can store all the puppies that have been extracted. This will help us keep track of which puppies are new to the system and which have already been seen by the system.
- Create an alerting system that can notify users when a new puppy has been added. Sending push-based notifications instead of pull-based will get the information to the user in the fastest way possible.
- Leverage an orchestrator that can run the client, update the database, and send alerts on a regular basis.
Non-Goals#
- A Web GUI: Creating a visual interface for users would be nice, but is not required. The most important aspect is speed of information to a user; visualization of all the puppies is secondary.
- User-Specified Filters for Alerts: Allowing users to filter on the attributes of a puppy that they care about would be nice, but it is not required. This would require keeping track of preferences on a per-user basis, which would add a fair amount of complexity. Users can and should be able to filter the puppies themselves once they have been alerted.
Design#
Client#
The Wright-Way website [2] embeds a Petango widget [3], which seems to be their platform for managing their pets. This is good news, because Petango has a semi-public API for accessing its data. We need to take the authkey left in the code of Wright-Way's website. With that, we can access AdoptableSearch to get all available puppies, and AdoptableDetails to get the complete information on a puppy. We can use the standard requests library to pull the XML data from Petango, then extract the elements we need and validate them against a Pydantic schema. The schema could look something like this:
from datetime import datetime
from typing import Literal

from pydantic import BaseModel


class Animal(BaseModel):
    """One adoptable pet, as extracted from the Petango XML."""

    id: str  # Petango / Wright-Way identifier
    Name: str
    Species: Literal["Cat", "Dog"]
    Breed: str
    Birthdate: datetime
    Gender: Literal["Male", "Female"]
    Size: str
    Color: str
    Declawed: bool
    Housetrained: bool
    Location: str
    IntakeDate: datetime
    Stage: Literal["Reserved", "Available"]
    Profile: str
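As a rough illustration of the extraction step, the sketch below parses one animal's XML into a dict keyed by the schema's field names, using only the standard library. The sample payload, its tag names, the date format, and the "Yes"/"No" convention for booleans are all assumptions made for the example; the real Petango response may be laid out differently.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical payload: the tag names simply mirror the Animal fields above.
# This is an assumption, not the documented Petango response format.
SAMPLE_XML = """
<Animal>
  <id>12345</id>
  <Name>Sadie</Name>
  <Species>Dog</Species>
  <Breed>Golden Retriever</Breed>
  <Birthdate>2024-12-01</Birthdate>
  <Gender>Female</Gender>
  <Declawed>No</Declawed>
  <Housetrained>Unknown</Housetrained>
  <IntakeDate>2024-12-25</IntakeDate>
  <Stage>Available</Stage>
</Animal>
"""

DATE_FIELDS = {"Birthdate", "IntakeDate"}
BOOL_FIELDS = {"Declawed", "Housetrained"}


def parse_animal(xml_text: str) -> dict:
    """Turn one animal's XML into a dict keyed by the schema's field names."""
    root = ET.fromstring(xml_text.strip())
    fields: dict = {}
    for child in root:
        text = (child.text or "").strip()
        if child.tag in DATE_FIELDS:
            # Assumed date format; adjust to whatever the API actually returns.
            fields[child.tag] = datetime.strptime(text, "%Y-%m-%d")
        elif child.tag in BOOL_FIELDS:
            # Treat anything other than an explicit "Yes" as False.
            fields[child.tag] = text.lower() == "yes"
        else:
            fields[child.tag] = text
    return fields


animal_fields = parse_animal(SAMPLE_XML)
```

Once every field of the schema is present, a dict like this could be passed to Animal(**animal_fields) so that Pydantic rejects anything unexpected, such as a Stage other than "Reserved" or "Available".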
Database#
A Lightweight Framework#
The database should be something simple, since the amount of data we are storing is on the order of megabytes, if not just kilobytes. SQLite is a good choice, since it is lightweight and easy to use. SQLModel is a good choice for interacting with SQLite, since it is extremely developer-friendly and maintainable.
Alternative DB Frameworks#
Depending on how we run the end-to-end workflow, we may need to have the database in a shared location, like a cloud database, as SQLite is primarily for single-machine, local use. This would be a good time to consider using a more robust database, like PostgreSQL. PostgreSQL might be overkill, but it is easy to connect to over the internet and very straightforward to manage.
Schema#
The schema should basically mirror the client's Pydantic schema. We may want to add a few more fields, like an id field (separate from the Petango / Wright-Way id), a created_at field, and an updated_at field. This is basic overhead for any database, and will help us keep track of the data we are storing.
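To make the extra columns concrete, here is a minimal sketch of such a table. It uses the standard-library sqlite3 module rather than SQLModel so that it runs without dependencies. The table name, the petango_id column name, and the upsert_animal helper are hypothetical, and only a few of the animal columns are shown.

```python
import sqlite3
from datetime import datetime, timezone

# Illustrative schema: petango_id, created_at and updated_at are assumed names.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS animal (
    id INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal surrogate key
    petango_id TEXT NOT NULL UNIQUE,       -- the Petango / Wright-Way id
    name TEXT NOT NULL,
    stage TEXT NOT NULL,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
)
"""


def upsert_animal(conn: sqlite3.Connection, petango_id: str, name: str, stage: str) -> None:
    """Insert a new animal, or refresh name, stage and updated_at if it exists."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        """
        INSERT INTO animal (petango_id, name, stage, created_at, updated_at)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT(petango_id) DO UPDATE SET
            name = excluded.name,
            stage = excluded.stage,
            updated_at = excluded.updated_at
        """,
        (petango_id, name, stage, now, now),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(CREATE_TABLE)
upsert_animal(conn, "12345", "Sadie", "Available")
upsert_animal(conn, "12345", "Sadie", "Reserved")  # same puppy, new stage
rows = conn.execute("SELECT petango_id, stage FROM animal").fetchall()
```

Because created_at is left out of the update clause, it keeps the time the puppy was first seen, while updated_at moves with every change.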
Alerting System#
The alerting system should be able to send a notification to a user with the appropriate information about a puppy for the user to make a decision on whether to reserve that puppy. The user should not have to take the extra step of navigating to the Wright-Way website after seeing the alert, although that option should exist.
While any messaging service could be used, Slack is one of the easiest to set up and the one I am most familiar with. The code should be set up in a way that another messaging service could easily replace or supplement the Slack functionality. The message should be a Slack block that looks something like this:
{
  "type": "section",
  "text": {
    "type": "mrkdwn",
    "text": "\n*<https://wright-wayrescue.org/adoptable-pets|Sadie>*\n*Species:* Dog\n*Breed:* Golden Retriever\n*Gender:* Female\n*Birthdate:* 2024-12-1 (0 years, 1 months, 3 days)\n*Intake Date:* 2024-12-25\n*Stage:* Available\n"
  },
  "accessory": {
    "type": "image",
    "image_url": "https://www.stockvault.net/data/2016/05/13/197535/preview16.jpg",
    "alt_text": "puppy"
  }
}
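A block like the one above could be assembled from an animal record along the following lines. This is only a sketch: build_puppy_block and format_age are hypothetical helper names, the record is a plain dict with date values rather than the Pydantic model, and the image URL is passed in by the caller because the schema above does not include one.

```python
import calendar
from datetime import date

LISTING_URL = "https://wright-wayrescue.org/adoptable-pets"  # link used in the example block


def format_age(birthdate: date, today: date) -> str:
    """Age as whole years, months and leftover days (assumes today >= birthdate)."""
    months = (today.year - birthdate.year) * 12 + today.month - birthdate.month
    if today.day < birthdate.day:
        months -= 1  # the current month is not complete yet
    # Move the birthdate forward by `months`, clamping the day to the month length.
    year, month_index = divmod(birthdate.year * 12 + birthdate.month - 1 + months, 12)
    day = min(birthdate.day, calendar.monthrange(year, month_index + 1)[1])
    days = (today - date(year, month_index + 1, day)).days
    return f"{months // 12} years, {months % 12} months, {days} days"


def build_puppy_block(animal: dict, image_url: str, today: date) -> dict:
    """Build a Slack section block describing one puppy."""
    birthdate = animal["Birthdate"]
    text = (
        f"\n*<{LISTING_URL}|{animal['Name']}>*\n"
        f"*Species:* {animal['Species']}\n"
        f"*Breed:* {animal['Breed']}\n"
        f"*Gender:* {animal['Gender']}\n"
        f"*Birthdate:* {birthdate.isoformat()} ({format_age(birthdate, today)})\n"
        f"*Intake Date:* {animal['IntakeDate'].isoformat()}\n"
        f"*Stage:* {animal['Stage']}\n"
    )
    return {
        "type": "section",
        "text": {"type": "mrkdwn", "text": text},
        "accessory": {"type": "image", "image_url": image_url, "alt_text": "puppy"},
    }


sadie = {
    "Name": "Sadie",
    "Species": "Dog",
    "Breed": "Golden Retriever",
    "Gender": "Female",
    "Birthdate": date(2024, 12, 1),
    "IntakeDate": date(2024, 12, 25),
    "Stage": "Available",
}
block = build_puppy_block(sadie, "https://example.com/sadie.jpg", today=date(2025, 1, 4))
```

Keeping the formatting in a function that returns a plain dict means the Slack-specific part stays small, which should make it easier to add a second messaging service later.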
Orchestrator#
All the above components need to be wrapped up in an end-to-end workflow that can be run on a regular basis. A simple cron schedule will be used to launch the workflow, as anything more complex would be excessive. The workflow should run as often as possible without overloading the API or getting us banned. We will start with a 5-minute interval and adjust as necessary.
Workflow Logic#
The workflow should do the following:
- Pull all the active puppies from the Wright-Way website using the client.
- Add any new puppies to the database.
- Delete any puppies that have been adopted from the database.
- Publish any new puppies to the Slack channel.
You can see the flow in graph form below:
flowchart LR
    petango[(Petango API)]
    AdoptableSearch("AdoptableSearch")
    AdoptableDetails("AdoptableDetails")
    client("Wright-Way client")
    db[(SQLite)]
    flow("Prefect Flow")
    slack("Slack")
    petango --> AdoptableSearch
    petango --> AdoptableDetails
    AdoptableSearch --> |"GET"| client
    AdoptableDetails --> |"GET"| client
    client -->|"find active puppies (with retry)"| flow
    flow --> |"add new puppies"| db
    flow --> |"delete newly adopted puppies"| db
    flow --> |"publish new puppy alerts (with retry)"| slack
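The add, delete, and publish steps above come down to comparing two sets of ids: the ones currently listed and the ones already stored. The sketch below shows that comparison with an in-memory dict standing in for the database and a list standing in for Slack. The function names are hypothetical, and it assumes that a puppy missing from the active listing has been adopted.

```python
from typing import Callable, Dict, Set, Tuple


def diff_puppies(active_ids: Set[str], stored_ids: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (new ids, adopted ids) given the active listing and what is stored."""
    new_ids = active_ids - stored_ids      # listed now, never seen before
    adopted_ids = stored_ids - active_ids  # seen before, no longer listed
    return new_ids, adopted_ids


def run_once(
    active: Dict[str, dict],
    stored: Dict[str, dict],
    publish: Callable[[dict], None],
) -> None:
    """One pass of the workflow: add new puppies, drop adopted ones, alert on new."""
    new_ids, adopted_ids = diff_puppies(set(active), set(stored))
    for puppy_id in sorted(new_ids):
        stored[puppy_id] = active[puppy_id]
        publish(active[puppy_id])
    for puppy_id in adopted_ids:
        del stored[puppy_id]


# Stand-ins for the database and the Slack channel.
stored = {"1": {"Name": "Ace"}, "2": {"Name": "Bo"}}
active = {"2": {"Name": "Bo"}, "3": {"Name": "Cy"}}
alerts: list = []
run_once(active, stored, alerts.append)
```

After this pass, the store holds puppies 2 and 3, and a single alert has been published for the new puppy, Cy.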
Proposed Orchestrator#
Prefect is a good choice for this, as it is lightweight, flexible, and easy to maintain. For our workflow, it can handle retries of failed Wright-Way / Petango requests, can easily handle our CRUD operations on the SQLite database, and has some Slack integration built-in. From an operational perspective, it is easy to deploy and monitor, and can send alerts if a workflow fails. This will allow us to be hands-off with the system, and only have to intervene if something goes wrong.
Alternative Orchestrators#
There are many other orchestration systems that do what we require; Prefect is not the only choice.[4] Suggestions are encouraged for alternatives that are either easier to operate or more robust, without being overly complex. Dagster looked the most promising, but seemed to focus more on the data asset side of things, which is not necessary for a schema as basic as ours.