Today has been spent researching how to rescue my content from the spam attack that is poisoning a lot of the older articles with unwanted links.
It’s becoming clear that WordPress’ built in export and import are barely useful; and are NOT used by professionals who run websites based around it. What they do is use professional (and very expensive) tools to migrate the data from a “staging” copy of WordPress, where everything is developed and written, across to the “live” site. When they do work with the content, they work directly with the database behind WordPress. They export and import whole databases, when they’re not using migration tools.
For them, this means that an attack on the “live” site is meaningless, a minor interruption. They just erase it and push a new copy across from their local site to the server.
This is excellent practice, and commonplace in IT in general.
So the way forward is to do all my content creation on a WordPress Instance running on my PC, and then migrate it to live, if I can. Rather a come-down from entering my posts online directly, but it would work.
But that doesn’t help me with retrieving my data. So I have been burrowing into the database underneath. The very simple, very obvious database, if you are a retired professional database developer, as I am.
Today I have been setting up LocalWP on my PC. This has gone reasonably well except that I have to pause my antivirus when creating a new WordPress instance. And then remember to reenable it. This is because it locks the hosts file, which LocalWP edits really rather often. Daft design, really. I’ve also been working with the command-line interface, WP-CLI, locally. This also has a bug, where the DB_HOST variable in wp-config.php does not include the port number. Everything works, other than WP-CLI doing database stuff. A nuisance.
So I created an empty WordPress locally, and then tried to import the last valid backup. The import failed. It’s simply not designed to take a 70mb file. That’s really wretched. Come on WordPress, this is basic functionality! I then ran the import using WP-CLI where – to my astonishment – it took a couple of hours to load 7,400 blog posts. I learned from this that the professionals simply don’t use the WordPress Export/Import in any way.
But it did load. Which gives me a WordPress instance with a clean copy of all the corrupted posts.
In theory, one should be able to connect to the live system database, and run a cross database update to restore the correct content fields. I have some doubts that little old MySQL databases can handle that, unlike the Oracle monsters that I knew! I imagine it would all time out.
But possibly I could simply start my own MySQL database, independently of WordPress; then import into it the clean database file that I have created, import it as a whole; then rename all the tables imported from WP_xyz to something else – maybe VALID_xyz. Then I could import an export of the “live” system into the same database, somehow; and then do the update from one to the other locally? UPDATE WP-POSTS from VALID_POSTS or something – I don’t know what the syntax would be as yet. Then drop the VALID tables, export the whole database, now fixed; and create a new instance on the server using the new cleaned up database. Or something like that.
Um yes. I’m sure most of you swallowed. I’m an old database programmer. I think in these terms when obliged to.
It is pathetic and ridiculous that such maneouvres should be necessary. Not one blogger in ten thousand would think in such terms. Blogging is for convenience. How is all this “convenient”? Tools that do not work, software that is insecure, timeouts all over the place?
Still, it’s clearly possible. It will just take a ridiculous amount of time.