A week in Openverse: 2024-04-15 – 2024-04-22

openverse

Merged PRs

API

  • #4077: Add (or update) robots.txt and ai.txt to block AI crawlers
  • #4085: Show the unstable params and improve collection docs
  • #4120: Publish changelog for api-2024.04.15.17.18.29
  • #4123: Bump sqlparse from 0.4.4 to 0.5.0 in /api
  • #4128: Add implementation plan for new Python package manager PDM

Catalog

  • #4083: Add instructions how to run ingestion script from command line
  • #4093: Fix date range sent to Slack from reingestion workflows
  • #4102: Fill out query params for Stocksnap DAG, allow restarts
  • #4104: Wait for iNaturalist load completion before compiling statistics
  • #4105: Update Science Museum ingester with API changes
  • #4159: Update dependency apache-airflow to v2.9.0, Python to 3.12
  • #4161: Publish changelog for catalog-2024.04.18.23.30.21

Documentation

  • #4077: Add (or update) robots.txt and ai.txt to block AI crawlers
  • #4089: Add useful Log Insights query to Nuxt avg response time runbook
  • #4128: Add implementation plan for new Python package manager PDM
  • #4162: Add horizontal lines to DAG doc for better visual separation

Frontend

  • #4077: Add (or update) robots.txt and ai.txt to block AI crawlers
  • #4099: Add a redirect on 503 to frontend nginx
  • #4112: Publish changelog for frontend-2024.04.13.16.19.50
  • #4116: Fix the back to results button for pages opened from collection pages
  • #4117: Set additional search views to enabled
  • #4118: Publish changelog for frontend-2024.04.15.15.20.23
  • #4129: Update dependency prettier-plugin-tailwindcss to v0.5.14
  • #4130: Update dependency typescript to v5.4.5
  • #4131: Update pnpm to v8.15.7
  • #4132: Update dependency @nuxtjs/composition-api to ^0.34.0
  • #4133: Update dependency @playwright/test to v1.43.1
  • #4136: Update dependency rimraf to v5

Ingestion Server

  • #4140: Bump gunicorn from 21.2.0 to 22.0.0 in /ingestion_server
  • #4160: Publish changelog for ingestion_server-2024.04.18.16.49.08

Management

  • #4114: Break `docker-compose.yml` into smaller `compose.yml` files
  • #4170: Remove unnecessary params from healthcheck
  • #4171: Always remove containers when using `just run`

Closed issues

API

  • #286: Implementation Plan: Switch Python package management away from Pipenv
  • #687: Prevent divergent migrations from being merged by concurrent, not-rebased PRs
  • #3634: Add content moderator user preferences admin view
  • #4005: Set cache control header on API responses
  • #4080: Show unstable parameters in the API documentation, adding a disclaimer about why they are unstable
  • #4081: Add `unstable__` prefix to the additional search views `collection` parameter

Catalog

  • #737: Unify environment names
  • #1379: Aggregate ingestion errors over reingestion
  • #1404: Report data refresh count change by provider
  • #2799: Add instructions for running ingestion via the CLI
  • #3964: Reingestion workflows report misleading date range to Slack
  • #3981: Implementation Plan for Ingestion Server removal
  • #4092: Update Science Museum DAG to use new API response format
  • #4103: Investigate recent iNaturalist failures

Documentation

  • #3207: Do not produce reingestion workflow documentation if it matches regular workflow documentation

Frontend

  • #2996: Handle Plausible outages
  • #3473: Proxy frontend API requests through Nuxt
  • #3900: robots.txt for AI-related crawlers and bots
  • #3959: Dark Mode Design Proposal
  • #4115: Single result pages opened from collection views do not have "Back to results" link

Ingestion Server

  • #286: Implementation Plan: Switch Python package management away from Pipenv
  • #3981: Implementation Plan for Ingestion Server removal

Management

  • #286: Implementation Plan: Switch Python package management away from Pipenv
  • #4111: Break down `docker-compose.yml` into smaller more-maintainable chunks

openverse-infrastructure

Merged PRs

Catalog

  • #854: Bump catalog to rel-2024.04.18.23.30.21

Infra

  • #833: Import existing openverse.org Cloudflare rulesets and bring over API and Airflow rules from .engineering zone
  • #848: Remove legacy deployment environment from Airflow compose

Ingestion Server

  • #852: Bump ing server to rel-2024.04.18

Closed issues

Infra

  • #777: Add API domains for `openverse.org` to Cloudflare access and port all API `openverse.engineering` rules to `openverse.org`

#openverse, #week-in-openverse