Welcome to the official blog of the translator team for the WordPress open source project. This is where we discuss all things related to translating WordPress. Follow our progress for general updates, status reports, and debates.
We’d love for you to help out!
You can help translate WordPress to your language by logging in to the translation platform with your WordPress.org account and suggesting translations (more details).
We have meetings every week on Slack in #polyglots (the schedule is on the sidebar of this page). You are also welcome to ask questions on the same channel at any time!
Captions are a text version of the speech and non-speech audio information needed to understand the content. They are synchronized with the audio and are usually shown in a media player when users turn them on.
Web accessibility is essential for people with disabilities and useful for all, so adding captions to the WordPress.tv videos would improve the usability of this platform.
At wordpress.tv, we currently use Otter.ai (English) and Sonix.ai (multilingual) to generate caption text files, then edit and upload them manually. These are paid services, so we use them for only a few videos.
In this post, I am going to explain how to test Whisper, a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification. Whisper is open-source.
You can get all the information about the installation here.
You need to have Python 3 and pip installed. If you have a Mac with an M1 chip, take a look at this link before installing the tool.
Be aware that wordpress.tv serves the video without audio (1) and the audio (2) as separate files. You have to give Whisper the audio file (or a video that includes audio); the video-only file has nothing to transcribe and breaks the extraction. The next screenshot was taken from the Google Chrome inspector for A chat with Matt Mullenweg: WordCamp US 2022 Q&A.
This command only generates the translation, not the text files in the original language, so I had to run the command without the --task translate parameter. I have to research if it is possible to do both actions at the same time.
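Until that is clarified, one workaround is simply to run Whisper twice per file, once per task. The sketch below only builds the two command lines. The helper name `whisper_commands` and the folder names `original` and `english` are made up for illustration, and it assumes the `--task` and `--output_dir` options of the `whisper` command-line tool. Each run writes to its own folder so the identically named output files do not overwrite each other.

```python
from pathlib import Path


def whisper_commands(media: Path, language: str) -> list[list[str]]:
    """Build one transcription and one translation command for a media file.

    Hypothetical helper: the folder names "original" and "english" are
    placeholders chosen for this example.
    """
    base = ["whisper", str(media), "--language", language]
    return [
        # Captions in the language spoken in the recording.
        base + ["--task", "transcribe", "--output_dir", "original"],
        # Captions translated into English.
        base + ["--task", "translate", "--output_dir", "english"],
    ]


# Print the two commands instead of running them.
for cmd in whisper_commands(Path("RocioIsotta.mp4"), "Spanish"):
    print(" ".join(cmd))
```

Note that this still processes the audio twice, so the total time is roughly the sum of both runs.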
To get some reference values for how long these processes take, I used the time command. My laptop is a 2020 MacBook Pro M1 with 16 GB of RAM. Strictly speaking, time is handled here by the zsh shell itself rather than by a separate program.
time whisper matt.mp4 --language English
10409.95s user 3461.08s system 217% cpu 1:46:22.95 total
time whisper RocioIsotta.mp4 --language Spanish
4222.15s user 1253.41s system 164% cpu 55:31.56 total
time whisper NuriaMiriam.mp4 --language Galician
7320.41s user 2412.34s system 204% cpu 1:19:10.39 total
time whisper RocioIsotta.mp4 --language Spanish --task translate
3813.41s user 1197.22s system 221% cpu 37:42.37 total
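To compare these runs more easily, the timing lines can be converted to plain seconds. This is a minimal sketch that assumes the zsh output format shown above; the function name `parse_time_line` is made up for illustration. It also shows where the cpu percentage comes from: user plus system time divided by the wall-clock total.

```python
import re

# Matches lines such as "4222.15s user 1253.41s system 164% cpu 55:31.56 total".
TIME_LINE = re.compile(
    r"(?P<user>[\d.]+)s user (?P<system>[\d.]+)s system "
    r"(?P<cpu>\d+)% cpu (?P<total>[\d:.]+) total"
)


def parse_time_line(line: str) -> dict[str, float]:
    """Return user, system and total wall-clock time in seconds, plus cpu %."""
    match = TIME_LINE.search(line)
    if match is None:
        raise ValueError(f"not a zsh time line: {line!r}")
    total = 0.0
    # "1:46:22.95" is hours:minutes:seconds, "55:31.56" is minutes:seconds.
    for part in match["total"].split(":"):
        total = total * 60 + float(part)
    return {
        "user": float(match["user"]),
        "system": float(match["system"]),
        "cpu": float(match["cpu"]),
        "total": total,
    }


runs = {
    "matt.mp4 (English)": "10409.95s user 3461.08s system 217% cpu 1:46:22.95 total",
    "RocioIsotta.mp4 (Spanish)": "4222.15s user 1253.41s system 164% cpu 55:31.56 total",
    "NuriaMiriam.mp4 (Galician)": "7320.41s user 2412.34s system 204% cpu 1:19:10.39 total",
    "RocioIsotta.mp4 (translate)": "3813.41s user 1197.22s system 221% cpu 37:42.37 total",
}
for name, line in runs.items():
    parsed = parse_time_line(line)
    print(f"{name}: {parsed['total'] / 60:.1f} minutes of wall-clock time")
```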
You can see the process takes some time, so if we are going to use these files at a WordCamp, we need to process them beforehand, maybe with a script that runs the night before and extracts the subtitles from all the videos inside a folder.
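Such a script could look roughly like the sketch below. It is not a tested tool: the function names, the list of file extensions, and the `captions` output folder are assumptions made for illustration, and it relies on the `--output_dir` option of the `whisper` command. By default it only returns the commands it would run; pass `dry_run=False` to actually call Whisper.

```python
import subprocess
from pathlib import Path

# Assumed set of extensions; adjust to whatever the folder really contains.
MEDIA_SUFFIXES = {".mp4", ".mp3", ".m4a", ".wav"}


def find_media(folder: Path) -> list[Path]:
    """Return the media files directly inside folder, sorted by name."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_SUFFIXES
    )


def transcribe_folder(folder: Path, language: str, dry_run: bool = True) -> list[list[str]]:
    """Build (and optionally run) one whisper command per media file."""
    commands = []
    for media in find_media(folder):
        cmd = [
            "whisper", str(media),
            "--language", language,
            "--output_dir", str(folder / "captions"),
        ]
        commands.append(cmd)
        if not dry_run:
            # Files are processed one at a time; check=True stops at the first failure.
            subprocess.run(cmd, check=True)
    return commands
```

For example, `transcribe_folder(Path("videos"), "Spanish", dry_run=False)` (with `videos` as a placeholder folder name) could be started the evening before, given that a single talk took between roughly 40 minutes and almost two hours in the runs above.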
These are the files if you want to review the result:
Be advised that the names of the files inside RocioIsotta-en.zip and RocioIsotta.zip are the same, but their content differs: one has the subtitles in English and the other has the subtitles in Spanish.
Usages and conclusion
The captions are not perfect, but their quality is good enough to be a starting point, either for work at WordCamps (TV table in the Translation Day) or for the community members who upload videos to WordPress.tv. They can edit the caption files and get subtitles in the original language and in English, so we can make videos more accessible to the community with this open-source tool.
The WordPress 4.7 release video is finalized and will be presented tonight during the State of the Word, so let’s get this going and translate it to as many languages as we can.
The WordPress 4.6 release video is being finalised as we speak but the voice over is already available for translation, so let’s get this going and translate it to as many languages as we can. The 4.5 release video was subtitled into 31 languages but I know we can do better than that 🙂
Time for subtitling 🙂 This time the video isn’t quite there – I have a few more things to do; but it’s possible to get it subtitled anyway as the timestamps won’t be changing on the voice over track. Here’s what needs to be done:
1. Jazzer to be added
2. Tagline to be added – I’ve added the tagline in the English subtitles so you can just translate that.
3. I also need to add the opening and closing tags.
You can find the video here. I’ve added the English subtitles so please just add the ones in your language. I need them done by noon GMT on Monday 7th December.
As promised, it’s time to subtitle the WordPress 4.3 release video. We’re going to do it a bit differently this time. When you’re done, please do provide feedback on the process as it’ll help us to refine it again next time.
I’ll hang out in the polyglots Slack channel for the next few days so feel free to ping me with any questions.
The WordPress 4.2 video was the first one that we created subtitles for. It was great to be able to provide subtitles, although I think we had varying degrees of success with the overall workflow. We’d like to provide subtitles again for WordPress 4.3.
In brief, what we did was either:
1. Translate through a web interface, or
2. Translate the XML file directly
The files were then emailed to me.
I would be okay with using the same approach again this time but not everyone found the experience to be smooth. So if you have feedback and a suggestion for a better workflow that we can implement, please let me know on this thread.
Some dates for your diary:
the final cut of the video will be made available by 12th August for translation