Here’s the WP-CLI command I used to regenerate thumbnails for featured images:-
wp db query "SELECT meta_value FROM wp_postmeta WHERE meta_key='_thumbnail_id'" --skip-column-names --silent | tr '\n' ' ' | xargs wp media regenerate
Explaining The WP CLI Command
There are three parts to this command, first we have
wp db query "SELECT meta_value FROM wp_postmeta WHERE meta_key='_thumbnail_id'" --skip-column-names --silent
[Sample Output]
99
98
81
77
wp db query sends a SQL query to the database used by the WordPress instance.
SELECT meta_value FROM wp_postmeta WHERE meta_key=’_thumbnail_id is a SQL query that retrieves media attachment IDs for featured images from the postmeta table.
--skip-column-names
instructs the database to not return the column name, i.e. meta_value.
--silent
gets rid of the borders
In essence, --skip-column-names
and --silent
are used in combination such that only media IDs are return by the command (see sample output above).
Next, we pipe the media IDs to tr to join the IDs into a single line.
tr '\n' ' '
[Sample Output]
99 98 81 77
tr '\n' ' '
replaces ‘\n’ (newline character) with a single whitespace.
Next the single-lined IDs are piped to xargs.
xargs wp media regenerate
xargs will then call wp media regenerate on those IDs. To illustrate using the above sample output, xargs will execute wp media regenerate 99 98 81 77.
Post-Ops
If you’re running these commands as the direct owner of the files, you’re done here. However, if you’re like me who uses a user that is different from the user that owns the WordPress files, make sure you transfer the file ownership back to the righteous owner.
For my servers, user www-data owns the files and operates the web servers. But I’m using user manager to run the command, this means that the generated images will be owned by manager. To fix this just do a chown:-
/var/www/html/wp-content$ sudo chown www-data:www-data -R uploads
Performance
With WebP Express Enabled
On my DigitalOcean 2 CPU cores 2GB server, it takes about 16 mins to regenerate thumbnails for 727 images which is about 45 images per minute.
Success: Regenerated 727 of 727 images.
real 16m32.172s
user 14m42.408s
sys 1m30.963s
It’s slow mainly because of the WebP Express plugin I have installed. The plugin helps to convert regenerated thumbnails to WebP.
With WebP Express Disabled
With the plugin disabled, the processing time for me drops to about 2 minutes
Success: Regenerated 727 of 727 images.
real 1m48.026s
user 1m26.915s
sys 0m5.923s
But if you plan to keep the WebP support, you will have to convert to WebP in other ways after this.
Command line tools like cwebp can get this done really fast. Previously I was getting 16 conversions per second on a single-core 2 GB server. You just have to find a way to get the paths the newly generated images and feed them to cwebp. Or look at all image files one by one and only generate a new WebP if it doesn’t already exist. I didn’t do this because I decided that the productivity vs runtime trade-off isn’t really worth it for me.
Running in Parallel: Putting CPU Cores to Work
If you have free CPU cores to use on your server, you can speed up this up by spawning multiple processes. Note that while I have tested these commands below, I ended up not using them because most of the websites I’m hosting are hosted on single-core or dual-core DigitalOcean droplets. For single-core servers, spawning multiple CPU-intensive processes doesn’t really make much sense. For dual-core servers, I like to keep one free core so that the website doesn’t slow to a crawl when the images are being processed.
Test environment: # of CPU cores: 2 Ram: 2GB
Number of images: 727
WebP Plugin: Disabled
1. Baseline: No parallelization.
Run time (wall clock): 1m 48s or 108s.
time wp db query "SELECT meta_value FROM wp_postmeta WHERE meta_key='_thumbnail_id'" --skip-column-names --silent | tr '\n' ' ' | xargs wp media regenerate
real 1m48.026s
user 1m26.915s
sys 0m5.923s
2. Greedy: Spawn as many processes as possible (-P 0). Each process handles one regeneration task (-n 1).
Unfortunately, this creates a situation where the processes are fighting for resources which results in lower overall performance.
It is about 178% slower than baseline.
time wp db query "SELECT meta_value FROM wp_postmeta WHERE meta_key='_thumbnail_id'" --skip-column-names --silent | tr '\n' 'time wp db query "SELECT meta_value FROM wp_postmeta WHERE meta_key='_thumbnail_id'" --skip-column-names --silent | tr '\n' '\0' | xargs -0 -I '{}' -n 1 -P 0 wp media regenerate {}
real 4m59.771s
user 8m3.722s
sys 1m19.691s
' | xargs -0 -I '{}' -n 1 -P 0 wp media regenerate {}
real 4m59.771s
user 8m3.722s
sys 1m19.691s
3. Less Greedy: Operate two processes at most (-P 2). Each process handles one regeneration task (-n 1)
Limiting the number of processes to the number of cores doesn’t help either. The reason is most likely due to the overhead of wp media regenerate.
This approach is about 177% slower than baseline.
time wp db query "SELECT meta_value FROM wp_postmeta WHERE meta_key='_thumbnail_id'" --skip-column-names --silent | tr '\n' 'time wp db query "SELECT meta_value FROM wp_postmeta WHERE meta_key='_thumbnail_id'" --skip-column-names --silent | tr '\n' '\0' | xargs -0 -I '{}' -n 1 -P 2 wp media regenerate {}
real 4m58.887s
user 7m54.432s
sys 1m19.672s
' | xargs -0 -I '{}' -n 1 -P 2 wp media regenerate {}
real 4m58.887s
user 7m54.432s
sys 1m19.672s
4. Optimal
Spawn processes according to number of CPU cores (-P 2 for two cores, -P 8 for 8 cores so on and so forth).
Number of images each process will process, n = Total Number of Images / Number of CPU Cores.
For my experiment, Total Number of Images = 727, Number of CPU Cores = 2
So n should be 727 / 2 =363, but I just round it up to 400.
And the result turns out to be 37% faster than baseline.
time wp db query "SELECT meta_value FROM wp_postmeta WHERE meta_key='_thumbnail_id'" --skip-column-names --silent | tr '\n' 'time wp db query "SELECT meta_value FROM wp_postmeta WHERE meta_key='_thumbnail_id'" --skip-column-names --silent | tr '\n' '\0' | xargs -0 -I '{}' -n<strong>400</strong> -P 2 wp media regenerate {}
real 1m8.819s
user 1m28.113s
sys 0m6.100s
' | xargs -0 -I '{}' -n<strong>400</strong> -P 2 wp media regenerate {}
real 1m8.819s
user 1m28.113s
sys 0m6.100s
Summary
Approach | Real time | Vs. Baseline | n | P |
---|---|---|---|---|
Baseline | 108s | – | – | – |
Greedy | 300s | 178% | 1 | 0 |
Less Greedy | 299s | 177% | 1 | 2 |
Optimal | 68s | -37% | 400 | 2 |
As I mentioned earlier, I went with Baseline for my websites because of limited CPU resources and I don’t want the entire server to dedicate all its CPU resources to regenerating images. There are other tasks that the server has to perform.
I suspect the Optimal settings for xargs also work for other CPU intensive tasks. I suppose someone must have written about it.
Troubleshooting Errors
Here are some errors I ran into and how I fixed them. Hopefully this will be helpful to you.
1. Table ‘wordpress.wp_postmeta’ doesn’t exist
ERROR 1146 (42S02) at line 1: Table 'wordpress.wp_postmeta' doesn't exist
This error came from the SQL query in wp db query and it happens when the WordPress instance uses a different table prefix other than wp_.
To fix this error, find out what is the correct database prefix and update the query accordingly.
Here are two ways to do determine the DB prefix.
A. Look at wp-config.php
table prefix is defined by a variable name $table_prefix in wp-config.php
$table_prefix = 'jazz_';
In this case the prefix is jazz_ and all you have to do is change the table name from wp_postmeta to jazz_postmeta.
B. Look at the tables directly
This can be done through phpMyAdmin, login to MySQL interactive shell, or run this WP-CLI command
wp db query 'SHOW TABLES'
2. Permission denied / Operation not permitted
If your command can start just fine but you get these pesky warnings when the thumbnails are generating, it is likely a file ownership and permission issue.
PHP Warning: unlink(/var/www/html/wp-content/uploads/2021/07/lavarel-oauth2-300x214.jpg): Permission denied in phar:///usr/local/bin/wp/vendor/wp-cli/media-command/src/Media_Command.php on line 645
Warning: unlink(/var/www/html/wp-content/uploads/2021/07/lavarel-oauth2-300x214.jpg): Permission denied in phar:///usr/local/bin/wp/vendor/wp-cli/media-command/src/Media_Command.php on line 645
PHP Warning: chmod(): Operation not permitted in /var/www/html/wp-includes/class-wp-image-editor-imagick.php on line 723
Warning: chmod(): Operation not permitted in /var/www/html/wp-includes/class-wp-image-editor-imagick.php on line 723
Solution
First find out your user groups:-
manager@codedodle:/var/www/html$ groups
manager sudo www-data
Next, find out the file permission and owner of the images and the uploads directory.
manager@codedodle:/var/www/html$ ls -l /var/www/html/wp-content/uploads/2021/07/lavarel-oauth2-300x214.jpg
-rw-r--r-- 1 www-data www-data 10548 Aug 2 2021 /var/www/html/wp-content/uploads/2021/07/lavarel-oauth2-300x214.jpg
It appears that owner and group are both set correctly to www-data. The file permission is -rw-r–r– which means that only the direct owner (www-data) can read and write to the file (-rw), the group (www-data) and other users can only read (r–).
This means that my user, manager, who’s only a member of the www-data group, is only allowed to read the file. That explains the warning.
Anyway, to prevent other issues, I decided to just set the ownership and permission again.
manager@codedodle:/var/www/html/wp-content$ sudo find uploads \( -iname *.jpg -o -iname *.jpeg -o -iname *.png \) -exec chmod 664 {} +
manager@codedodle:/var/www/html/wp-content$ sudo find uploads -type d -exec chmod 775 {} +
manager@codedodle:/var/www/html/wp-content$ sudo chown www-data:www-data -R uploads
These settings for my particular setup, but if you’re interested to read more about file permission in WordPress you can check out this page.
3. Error: Strange wp-config.php
Similar to Error #2, this error is also related to file ownership and permission. I have written about the solution to the problem previously.
Discussions
Topics that don’t really fit in with the main flow of the article.
Find Featured Image IDs by Using wp post list
There’s another way to get IDs for featured images. And it is by combining wp post list AND wp post meta get.
wp post list --post_type=post --format=ids | tr ' ' 'wp post list --post_type=post --format=ids | tr ' ' '\0' | xargs -0 -I '{}' wp post meta get {} _thumbnail_id
' | xargs -0 -I '{}' wp post meta get {} _thumbnail_id
First we use wp post list to retrieve post IDs. Next the IDs get delimited with null characters using tr. The transformed IDs are fed to xargs which is expecting null-terminated input because of the -0 option. xargs then call wp post meta get on each of the post IDs to retrieve its corresponding _thumbnail_id.
I’m sharing this here for completeness. However, DON’T use this. it’s REALLY slow. Like 100x slower than the wp db query approach. See the comparison below:-
manager@codedodle:/var/www/html$ time wp post list --post_type=post --format=ids | tr ' ' 'manager@codedodle:/var/www/html$ time wp post list --post_type=post --format=ids | tr ' ' '\0' | xargs -0 -I '{}' wp post meta get {} _thumbnail_id
<snipped>
real 0m32.492s
user 0m26.817s
sys 0m4.557s
manager@codedodle:/var/www/html$ time wp db query "SELECT meta_value FROM wp_postmeta WHERE meta_key='_thumbnail_id'" --skip-column-names --silent
<snipped>
real 0m0.291s
user 0m0.205s
sys 0m0.081s
' | xargs -0 -I '{}' wp post meta get {} _thumbnail_id
<snipped>
real 0m32.492s
user 0m26.817s
sys 0m4.557s
manager@codedodle:/var/www/html$ time wp db query "SELECT meta_value FROM wp_postmeta WHERE meta_key='_thumbnail_id'" --skip-column-names --silent
<snipped>
real 0m0.291s
user 0m0.205s
sys 0m0.081s
The wp db query command took less than a second to complete. The wp post list approach took slightly more than 30 seconds.
HG