Introducing Collection views for Tags, Creators, and Sources

Openverse now offers new ways to explore our collection of over 800 million images and audio files. The new Collection search views allow you to view works belonging to an individual tag, creator, or source.

On single search results, the creator and source names now link to the new views:

On this illustration result, clicking on “Alfred Chandler” or “Cleveland Museum of Art” now leads to a dedicated collection page. Here’s the page for the Cleveland Museum of Art, for example:

https://openverse.org/image/collection?source=clevelandmuseum

These new pages also exist for tags. All image and audio tags now link to dedicated collection pages. In this example, clicking the “bluebird” tag:

leads to a new collection of more bluebird images!

https://openverse.org/image/collection?tag=bluebird
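If you want to link to these pages programmatically, here is a minimal Python sketch of how the two collection URLs shown in this post are put together. The `collection_url` helper is hypothetical and not part of Openverse. The `/audio/collection` path is an assumption inferred from the image URLs above, and only the `tag` and `source` parameters are taken from this post.

```python
from urllib.parse import urlencode

BASE_URL = "https://openverse.org"

# Only these query parameters appear in the URLs shown in this post.
KNOWN_PARAMS = ("tag", "source")


def collection_url(media_type: str, param: str, value: str) -> str:
    """Build a collection page URL (hypothetical helper, for illustration)."""
    # "audio" is an assumption: the post only shows image collection URLs.
    if media_type not in ("image", "audio"):
        raise ValueError(f"unsupported media type: {media_type}")
    if param not in KNOWN_PARAMS:
        raise ValueError(f"unsupported collection parameter: {param}")
    # urlencode percent-encodes the value so spaces and symbols are URL-safe.
    return f"{BASE_URL}/{media_type}/collection?{urlencode({param: value})}"


print(collection_url("image", "tag", "bluebird"))
print(collection_url("image", "source", "clevelandmuseum"))
```

The two printed URLs match the bluebird tag page and the Cleveland Museum of Art source page linked above.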

If you notice any bugs or functionality you would like to see in these new views, please report an issue on the Openverse GitHub repository or reach out to our maintainers in the #openverse channel of the Make WP Chat. Thank you! Special thanks to @olgabulat and @fcoveram for implementing and designing the new functionality.

Openverse Monthly Priorities Meeting 2024-04-03

Openverse contributors will host a community meeting to discuss priorities for April at 1500 UTC on April 3rd, 2024.

A sync video chat link will be provided in the #openverse channel of the Making WordPress Chat. We hope to see you there!

You can read the ongoing notes document for these meetings here.

#openverse-priorities, #priorities

Community Meeting Recap (2024-04-15)

[Meeting start]

We had 1 item on our agenda:

Request to check the Additional Search Views

Meeting attendees were invited to take a look at https://openverse.org and see the recently launched Additional Search Views. Huge congratulations to those involved in launching this feature! There was also some feedback and discussion on the creator collections, with the creation of an issue to re-enable searching by creator.

[Meeting end]

#openverse-weekly-community-meeting

A week in Openverse: 2024-04-08 – 2024-04-15

openverse

Merged PRs

API

  • #4031: Add warning to search response when source parameter has mixed validity
  • #4056: Show the additional search views documentation on the API docs site
  • #4062: Bugfix to ensure image type is correctly extracted from content type
  • #4078: Publish changelog for api-2024.04.09.03.50.11
  • #4098: Bump idna from 3.6 to 3.7 in /api

Catalog

  • #4060: Fix: Escape space in `just catalog/test` directory injection
  • #4061: Refactor: Remove `get_media_type()` redundant override in providers with a single media type
  • #4071: Reduce visual footprint of DAG Alerts in Slack

Documentation

  • #4084: Fix the feature flags in production on hydration
  • #4095: Bump idna from 3.4 to 3.7 in /documentation

Frontend

  • #4009: Related endpoint error is still sent to Sentry
  • #4011: Handle global audio play errors
  • #4043: Turn on additional search views frontend
  • #4044: Add `SEARCH_RESPONSE_TIME` event
  • #4048: Update jest packages
  • #4049: Update dependency typescript to v5.4.4
  • #4050: Update pnpm to v8.15.6
  • #4052: Update dependency postcss-focus-visible to v9
  • #4054: Replace `focus:` with `focus-visible:`
  • #4057: Update Node.js to v18.20.2
  • #4058: Update dependency node-html-parser to v6
  • #4066: Fix: Remove unwanted leading/trailing whitespaces in attributions
  • #4079: Publish changelog for frontend-2024.04.09.03.50.12
  • #4084: Fix the feature flags in production on hydration

Infra

  • #4051: Add logging to all dag-sync exits

Ingestion Server

  • #4097: Bump idna from 3.6 to 3.7 in /ingestion_server

Management

  • #4059: Update workflows to v6 (major)
  • #4069: Update print_ps() to use Docker Compose ps JSON output
  • #4090: Add emojis to the project thread reminder bot acceptable statuses
  • #4094: Bump idna from 3.4 to 3.7 in /automations/python
  • #4096: Bump idna from 3.6 to 3.7 in /utilities/project_planning

Closed issues

API

  • #723: Drop `_package` suffix from module names in API
  • #3750: Image type used for thumbnails is sometimes incorrectly extracted from the content type
  • #4023: Create a script to automatically generate media properties in the API and frontend
  • #4030: Improve invalid source parameter handling
  • #4055: Switch on the display of the API additional search views documentation

Catalog

  • #1370: Upgrade to Python 3.11
  • #1385: Remove unnecessary boilerplate implementations of `get_media_type`
  • #1566: Remove duplicated tags
  • #2186: Create a script to automatically generate media properties
  • #3987: Reduce/Improve visual footprint of DAG Alerts in Slack
  • #4013: Science Museum queries may occasionally fail due to upstream bug
  • #4091: Use batched update to clean up empty JSON objects in tags fields

Documentation

  • #3877: Add docs for including machine-generated Arabic translations in e2e tests

Frontend

  • #553: Trim leading/trailing whitespace from attribution copy button result
  • #2471: Add `SEARCH_RESPONSE_TIME` analytics event
  • #3487: `NotSupportedError` when trying to play audio
  • #3775: `SEARCH_TIME_EVENT` is unusable due to plausible and CORS limitations
  • #4006: Related endpoint error is still sent to Sentry
  • #4023: Create a script to automatically generate media properties in the API and frontend
  • #4035: Switch the `additional_search_views` flag on in staging and prod
  • #4047: Update jest to the newest version
  • #4082: Switchable flags don't work in production
  • #4087: VCheckbox element logical state not updated to match the state stored in the application

Management

  • #2340: Add maintainer oriented documentation with guidelines and expectations for "good first issue" and "help wanted" issues
  • #3876: Switch to Docker Compose v2

openverse-infrastructure

Merged PRs

Infra

  • #837: Roll out Airflow in next/production with Ansible

Management

  • #845: 🔄 synced file(s) with WordPress/openverse

Closed issues

API

  • #468: Manage Sentry projects and settings in Terraform

Frontend

  • #468: Manage Sentry projects and settings in Terraform

Infra

  • #775: Port all Airflow Cloudflare rules from openverse.engineering to openverse.org
  • #776: Run a data refresh and provider dags on the new Airflow instance

#openverse, #week-in-openverse

Community Meeting Recap (2024-04-08)

[Meeting start]

We had 1 item on our agenda:

Prioritization of Implementation Plan: Switch Python package management away from Pipenv

The maintainers discussed which Python package managers could replace Pipenv and how best to proceed. Community members provided feedback on the different tools. The team decided to try out a few options before drafting the implementation plan (IP), and to have the IP propose an opinionated approach based on the advantages and disadvantages identified in the exploration.

[Meeting end]

#openverse-weekly-community-meeting

A week in Openverse: 2024-04-01 – 2024-04-08

openverse

Merged PRs

API

  • #3989: Create the moderation decision model
  • #3991: Update dependency fakeredis to v2.21.3
  • #3996: Update dependency elasticsearch to v8.13.0
  • #4000: Publish changelog for api-2024.04.01.17.07.10
  • #4002: Selectively update API deps and undo unrelated updates
  • #4008: Publish changelog for api-2024.04.02.05.06.52
  • #4027: Remove provision for missing fields on `Hit`
  • #4032: Remove potentially problematic `do_not_wait_for`

Catalog

  • #3997: Update dependency flaky to v3.8.1
  • #4004: Increase Wikimedia request timeout
  • #4010: Update dependency tldextract to v5.1.2
  • #4014: Filter out duplicates from `raw_tags` in the catalog v2
  • #4029: Improve testing import behavior for the catalog
  • #4041: Clarify Batched Update DAG docs with use cases for failure recovery

Documentation

  • #4012: Add log insights querying information for Nuxt 5XX errors
  • #4017: Replace docker-compose with docker compose in just scripts and docs

Frontend

  • #3975: VTag improvements
  • #3988: Add context comments to i18n key
  • #3990: Update dependency @playwright/test to v1.42.1
  • #3992: Update dependency prettier-plugin-tailwindcss to v0.5.13
  • #3994: Update Node.js to v18.20.0
  • #3995: Update dependency async-mutex to ^0.5.0
  • #3999: Add SEO properties to collection pages
  • #4001: Publish changelog for frontend-2024.04.01.17.07.11
  • #4018: Replace implicit getBy* assertion in `v-modal` test

Ingestion Server

  • #3996: Update dependency elasticsearch to v8.13.0
  • #4017: Replace docker-compose with docker compose in just scripts and docs
  • #4042: Publish changelog for ingestion_server-2024.04.04.14.33.24

Management

  • #3993: Update workflows
  • #4017: Replace docker-compose with docker compose in just scripts and docs
  • #4021: Bump pillow from 10.2.0 to 10.3.0 in /utilities/project_planning
  • #4022: Bump pillow from 10.2.0 to 10.3.0 in /utilities/provider_tallies
  • #4028: Handle PR automations when quick succession of PR approved and merged

Closed issues

API

  • #1996: Implementation Plan: Clearly document all media properties in catalog in API & Frontend
  • #3636: Create `ModerationDecision` table
  • #3945: Log when source query parameter contains invalid values

Catalog

  • #3926: Update `raw_tags` to avoid duplicates in the catalog
  • #4003: Increase Wikimedia request timeout

Documentation

  • #3896: Project Proposal: Incorporate Rekognition data into the catalog

Frontend

  • #617: Translation strings partials should be linked with the whole sentence.
  • #790: More descriptive screen reader text for search page headings
  • #1996: Implementation Plan: Clearly document all media properties in catalog in API & Frontend
  • #2321: Remove implicit `@testing-library` `get*` assertions: `v-modal.spec.js`
  • #3190: Refactor and improve `VTag` component
  • #3917: Add SEO properties to the collection pages

Management

  • #3973: Set expectation of Docker compose v2 and update references and compose file appropriately

openverse-infrastructure

Merged PRs

Infra

  • #829: Deploy Airflow with Ansible
  • #830: Add min/max values to CPU and Memory ECS graphs in Cloudwatch
  • #831: Explicitly declare HTTPS always for cloudflare

Ingestion Server

  • #836: Bump ingestion server

Management

  • #832: 🔄 synced file(s) with WordPress/openverse
  • #834: 🔄 synced file(s) with WordPress/openverse

Closed issues

Infra

  • #356: Manage "HTTPS Everywhere" filter for domains
  • #666: Configure monitoring index lifecycle policy
  • #774: Create new `concrete/airflow` module in `next` modules; create Ansible playbook for spinning up Airflow on the EC2 instance

#openverse, #week-in-openverse

A week in Openverse: 2024-03-25 – 2024-04-01

openverse

Merged PRs

API

  • #3887: Use search query parameters for additional search views in the API
  • #3962: Publish changelog for api-2024.03.25.15.22.26
  • #3971: Fix load sample data script provider insertion

Catalog

  • #3921: Add elasticsearch concurrency tags for Airflow
  • #3936: Add Project Proposal for ingestion server removal project
  • #3974: Update dependency apache-airflow to v2.8.4 [SECURITY]
  • #3983: Use a `make_insert_query` function in test_sql.py

Documentation

  • #3913: Project proposal for dark mode project
  • #3921: Add elasticsearch concurrency tags for Airflow
  • #3980: Add documentation guidelines, update API docs guidelines links

Frontend

  • #3725: Add ESLint rule to cap the length of translation strings
  • #3957: Update payload of collection search analytics events
  • #3960: Publish changelog for frontend-2024.03.25.15.22.24
  • #3961: Remove unused colors from the tailwind config
  • #3967: Fix logo height
  • #3978: Update additional search views API params in frontend
  • #3979: Reset the media store state if collection state changes
  • #3985: Check for emptiness of case.split function input

Ingestion Server

  • #3936: Add Project Proposal for ingestion server removal project

Management

  • #3955: Group "week in Openverse" items by stack label
  • #3982: Automatically apply the project proposal and IP labels to PRs

Closed issues

API

  • #3869: Replace API collection paths with search parameters

Catalog

  • #1835: [Quality] Use a `make_insert_query` function in test_sql.py
  • #3891: Improve support for "concurrency pools" in Elasticsearch DAGs
  • #3935: Write Project Proposal for ingestion server removal

Documentation

  • #3504: Update contribution references in documentation quickstart
  • #3894: Project Proposal: Dark Mode

Frontend

  • #476: Updating logo nav component in header
  • #3378: Don't use very long strings to be translated
  • #3619: Update `REACH_RESULT_END`, `LOAD_MORE`, `SELECT_SEARCH_RESULT` analytics events for additional search views
  • #3894: Project Proposal: Dark Mode
  • #3976: Staging: Provider filter is preserved on collection tag views
  • #3984: TypeError: Cannot read properties of undefined (reading 'trim')

Ingestion Server

  • #3935: Write Project Proposal for ingestion server removal

Management

  • #3445: Categorise weekly updates by stack

#openverse, #week-in-openverse

Community Meeting Recap (2024-03-25)

[Meeting start]

We had 1 item on our agenda:

What’s the plan for launching additional search views? Can we launch this week? Should we draft a Make post? Do we need a rollback plan (in case the API changes cause trouble in prod, for example)?

There is still work to be done before launching the project, specifically:

  1. Use search query parameters for additional search views in the API #3887
  2. Add SEO properties to the collection pages #3917 (needs discussion)

After completing the issues, we can discuss the details of the launch.

[Meeting end]

#openverse-weekly-community-meeting

A week in Openverse: 2024-03-18 – 2024-03-25

openverse

Merged PRs

  • #3956: Publish changelog for catalog-2024.03.22.17.45.11
  • #3954: Add debug logs to renovate
  • #3953: Bump jwcrypto from 1.5.4 to 1.5.6 in /api
  • #3952: Update pinia and vue-demi
  • #3951: Adds locale to the locale kebab-case warnings
  • #3950: Link to Openverse in Documentation site root
  • #3942: Fix the skip-to-content link reloading the results
  • #3933: Publish changelog for frontend-2024.03.18.15.51.41
  • #3932: Publish changelog for api-2024.03.18.15.51.25
  • #3931: Fix audio alt files missing bit rate
  • #3928: Use DAG_DEFAULT_ARGS for all DAGs
  • #3850: Centralise frontend error reporting (and suppress unactionable Sentry errors)
  • #3836: Add accesstoken and ThrottledApplication to admin panel
  • #3835: Use the `VMediaCollection` for search and collection results
  • #3808: Cleanup tag display for long lists of tags
  • #3760: Implementation Plan: Content moderation metrics

Closed issues

  • #3940: Using "skip to content" button on search results page clears result counts (staging only)
  • #3939: Unmet `pinia` peer dependency (version issue)
  • #3938: Kebab-cased translation key warning when running `just p frontend i18n`
  • #3930: KeyError: "Got KeyError when attempting to get a value for field `bit_rate` on serializer `AudioAltFileSeri…
  • #3884: Search results are not updated when filters are unchecked but the search term is the same
  • #3830: Navigating between additional search views and single result pages and back does not update the single image
  • #3821: Clean up API URL environment variables
  • #3803: Not all DAGs have `DAG_DEFAULT_ARGS` applied
  • #3711: Add access token and throttled application models to Django admin
  • #3576: Returning to results from the new content views does not load previously-loaded pages
  • #3468: Avoid `AxiosError` when requesting bad image links
  • #2589: Cleanup tag display with long lists of tags
  • #1970: Implementation Plan: Moderation queue metrification
  • #1163: Update the Priority custom field when the issue priority label changes
  • #1126: Implementation Plan: Rekognition Data Evaluation
  • #1459: Surface materialized views in view names
  • #1473: Investigate Data Refreshes blocking during popularity steps
  • #1662: Catalog database/ingestion overhaul
  • #1667: No descriptions for audio files
  • #744: Add Rawpixel to authority data as `CURATED`
  • #1765: Come up with a solution for consuming crawler events (original #457)
  • #1791: Scrape CC REL data to identify CC-licensed images (original #182)
  • #1790: Feed new images to the crawler (original #456)

openverse-infrastructure

Merged PRs

  • #828: Bump catalog airflow version to rel-2024.03.22.17.45.11
  • #827: Update Nuxt HTTP 5xx responses runbook link
  • #826: Restrict SSH ingress on all non-bastion services to within the VPC
  • #825: Remove sudo calls from ingestion server init
  • #824: Remove API environment variables that are no longer used
  • #819: Remove Airflow email settings

Closed issues

  • #789: Drop SSH ingress from outside of the VPC on all EC2 instances (except the SSH bastion)
  • #548: Remove unnecessary calls to `sudo` from ingestion server user-data script
  • #258: Remove Airflow SMTP settings

#openverse, #week-in-openverse

Community Meeting Recap (2024-03-18)

[Meeting start]

We had 1 item on our agenda:

When should we add ccMixter to the Django Admin content provider list? And should we make a public announcement about their addition?

We decided to keep this item on the agenda for next week and to clarify some problems with it in the meantime:

  1. The preview URLs don’t work on the frontend, so users cannot play them. @aetherunbound offered to contact ccMixter to clarify whether we are using their API correctly and how we can fix the preview URLs.
  2. The API returns 500 due to the AudioAltFile serializer.

[Meeting end]

#openverse-weekly-community-meeting