Understanding the life cycle of a Green Check Request
This page outlines the life cycle of the majority of the traffic the Green Web Platform serves - API requests to its greencheck endpoints, most commonly at the following URL:
https://api.thegreenwebfoundation.org/greencheck/
It exists to inform system design discussions, and help developers new to the system trace a path through the code for common operations.
When a result has been cached for a domain
When a user carries out a greencheck for a specific site, or a third party uses our greencheck API, you can trace the flow of a request through the system like so:
sequenceDiagram
Browser client->>+Caddy: Look up domain
Caddy->>+Django Web: Forward to a django web process
Django Web->>+Database: Look up domain
Database->>-Django Web: Return lookup result
Django Web->>+Caddy: Return rendered result
Caddy->>+Browser client: Present domain result
Django Web->>+RabbitMQ: Queue domain for logging
In most cases we try to find a result we can return quickly, so we first check a local cache table called greendomains, described by the GreenDomain model. If the domain is found there, we return the cached result and add the check to a RabbitMQ queue, so that a separate worker process can write the check result to a logging table, currently named greencheck and represented by the Greencheck model.
Note: See the model definitions Greencheck and GreenDomain for the definitive listing of the names of the tables we write to - they have changed over time.
Once we have the domain queued, another worker takes the domain off the queue, and logs the checked domain, for later aggregate analysis.
sequenceDiagram
Django Dramatiq->>+RabbitMQ: Check for any domains to log
RabbitMQ->>+Django Dramatiq: Return domain to log
Django Dramatiq->>+Database: Log domain to greencheck table
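The split between the web process and the logging worker can be sketched in plain Python. This is a minimal illustration, not the platform's code: the dictionary, deque and list are in-memory stand-ins for the cache table, the RabbitMQ queue and the logging table, and the names `CachedResult`, `cached_lookup` and `drain_log_queue` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class CachedResult:
    """Stand-in for a row in the cache table (hypothetical fields)."""
    domain: str
    green: bool


# In-memory stand-ins for the cache table, the message queue and the log table.
cache_table = {"example.org": CachedResult("example.org", green=True)}
log_queue = deque()
log_table = []


def cached_lookup(domain):
    """Web process side: return the cached result and queue the check for logging."""
    result = cache_table.get(domain)
    if result is not None:
        # The web process only enqueues; it never writes the log row itself.
        log_queue.append(result)
    return result


def drain_log_queue():
    """Worker side: take queued checks off the queue and write them to the log."""
    written = 0
    while log_queue:
        log_table.append(log_queue.popleft())
        written += 1
    return written
```

Keeping the log write out of the request path is what lets the cached response return quickly; the worker can fall behind without slowing down lookups.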
The greendomains cache table has a TTL of six months - domains which have not been updated in six months or so are deleted, so that a full check is then carried out on the next lookup.
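As a rough sketch of that expiry rule, the function below deletes entries whose last update is older than the TTL. It assumes the cache can be modelled as a mapping from domain to last-updated timestamp, and it approximates "six months" as 180 days; both are illustrative assumptions, as is the name `expire_stale_domains`.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "six months" is approximated as 180 days for illustration.
CACHE_TTL = timedelta(days=180)


def expire_stale_domains(cache, now):
    """Delete entries last updated more than CACHE_TTL ago; return the removed domains."""
    stale = [domain for domain, modified in cache.items() if now - modified > CACHE_TTL]
    for domain in stale:
        del cache[domain]
    return sorted(stale)


now = datetime(2024, 7, 1, tzinfo=timezone.utc)
cache = {
    "fresh.example": now - timedelta(days=10),
    "stale.example": now - timedelta(days=200),
}
removed = expire_stale_domains(cache, now)  # only "stale.example" is removed
```

Once an entry is gone, the next lookup for that domain is a cache miss and goes through the full check.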
When a result has not been cached for a domain (or the cache is requested to be refreshed)
When a domain does not exist in the greendomains table, or we are explicitly refreshing the cache (see below), a full check is carried out, and the result is cached to the greendomains table. As above, the check is also added to the RabbitMQ queue, and logged asynchronously in the greencheck table.
The sequence flow diagram looks like so:
sequenceDiagram
Browser client->>+Caddy: Send a request to check a website domain
Caddy->>+Django Web: Forward to a django web process
Django Web->>+External Network: Look up domain
External Network->>+Django Web: Return domain lookup
Django Web->>+Database: Clear old cached domain lookup from database
Django Web->>+Caddy: Return rendered result
Caddy->>+Browser client: Present result for website check
Django Web->>+RabbitMQ: Queue domain for logging
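The decision between the cached path and the full check can be summarised in a few lines. This is a sketch under stated assumptions: `check_domain` and its parameters are hypothetical, the cache is a plain dictionary, the queue is a deque standing in for RabbitMQ, and `full_check` is any callable that performs the external lookup.

```python
from collections import deque


def check_domain(domain, cache, queue, full_check, refresh=False):
    """Serve from cache when possible; otherwise run a full check and re-cache."""
    result = None if refresh else cache.get(domain)
    if result is None:
        # Cache miss or explicit refresh: drop any old entry, then look up afresh.
        cache.pop(domain, None)
        result = full_check(domain)
        cache[domain] = result
    # Either way, the check is queued so a worker can log it asynchronously.
    queue.append((domain, result))
    return result
```

Passing `refresh=True` forces the full check even when a cached entry exists, which mirrors the explicit cache refresh described in this section.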
There are two places where a cache refresh is always performed:
When a cache refresh is manually requested - either by clicking the “update result” link on the green web checker result page, or by passing the nocache=true query parameter on a greencheck API lookup.
In our own “detail” view for troubleshooting with support - this is visible at https://admin.thegreenwebfoundation.org/admin/extended-greencheck, and is used to show more detail about how we arrive at a given result.
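For API clients, a cache-refreshing lookup URL might be built like this. The base URL and the nocache=true parameter come from this page; appending the domain as the final path segment is an assumption made for illustration, and `greencheck_url` is a hypothetical helper, so check the API documentation for the exact URL shape.

```python
from urllib.parse import quote, urlencode

GREENCHECK_URL = "https://api.thegreenwebfoundation.org/greencheck/"


def greencheck_url(domain, nocache=False):
    """Build a lookup URL; appending the domain to the path is an assumption."""
    url = GREENCHECK_URL + quote(domain, safe="")
    if nocache:
        # nocache=true asks for a fresh check instead of the cached result.
        url += "?" + urlencode({"nocache": "true"})
    return url


refresh_url = greencheck_url("example.org", nocache=True)
```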
Greencheck images
Greencheck badge images are generated on first request and cached in object storage; subsequent requests redirect to this existing file rather than regenerating the image. The greencheckbadge endpoint does not respect the nocache flag, but a cache-busting request to the main greencheck API endpoint will also delete the cached image, allowing it to be updated.
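The generate-once, redirect-afterwards behaviour can be sketched as follows. Everything here is a stand-in: the dictionary plays the role of object storage, and `render_badge`, `badge_location`, `bust_badge_cache` and the `badges/<domain>.png` key layout are hypothetical names chosen for the example.

```python
object_storage = {}       # stand-in for the object store: key -> image bytes
stats = {"renders": 0}    # counts renders so regeneration is observable


def render_badge(domain):
    """Hypothetical renderer that returns placeholder image bytes."""
    stats["renders"] += 1
    return f"badge for {domain}".encode()


def badge_location(domain):
    """Return the stored badge's key, generating the image only on first request."""
    key = f"badges/{domain}.png"
    if key not in object_storage:
        object_storage[key] = render_badge(domain)
    return key  # a real view would redirect the client to this stored file


def bust_badge_cache(domain):
    """Called from a cache-busting greencheck lookup: drop the stored badge."""
    object_storage.pop(f"badges/{domain}.png", None)
```

Note that `badge_location` takes no refresh flag of its own: in this sketch, as in the description above, the only way to regenerate a badge is through the cache-busting path on the main lookup.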