How to reduce the size of your Magento 1 database using nucleanup

Time to Read: 15 minutes Difficulty Level: Intermediate
Tools Needed: text editor, Plesk control panel access Last Updated: 13/03/2019

Why use Nucleanup?

Whilst Magento is a powerful and flexible eCommerce platform, it has the unhealthy disadvantage of not self-maintaining its own log records.

These log files are found within your Magento database and they record information about every customer who visits your store and every link that is visited.

Over time, and especially if your store receives a lot of traffic, these log files can become very large in size. Quite often we notice bulging log files taking up over 90% of the size of a database! As you can see in the screenshot below, here is an example of a Magento store holding around 6 million records in its log tables, equating to around 750MB of disk space!

Therefore, it is important to maintain your Magento database in order to keep it running efficiently. By cleaning out your log files, you not only speed up your Magento store, you also help dramatically reduce your overall disk space usage which can often negate the need to upgrade your existing hosting package.

To help you maintain your log files and reduce your overall disk space usage, we have modified a popular script to be compatible with our hosting that does the hard work for you! The script will also save you disk space elsewhere by clearing error reports generated by Magento and other cached files relating to the Magento Connect process.

This guide will walk you through the installation and configuration of that script.

As always, we recommend taking a full backup of your site before carrying out any work on your web hosting account. Nublue take no responsibility for any damage to your Magento store caused by the use of this script. We advise that all testing of this script should be undertaken in a development environment to ensure it works as expected before deployment in a production environment. We also cannot guarantee that the script won’t do odd things to any modules you have installed within Magento (the script empties a lot of the log database tables which a ‘recently viewed products’ type extension might rely on), although you can of course revert to your backed up solution if it does.

Method

Method

Step 1

Create a new file called nucleanup.php using your favourite text editor and paste the following script into the new nucleanup.php file:

<?php
/* Clean up script for Magento sites hosted in a Plesk environment.

  Adapted from : http://www.magentocommerce.com/wiki/groups/227/maintenance_script

  Changes :
  Added configuration values (domain, subdirectory, cronOnly)
  Tests for the existence of local.xml before running either command (tests path and tests existence of db details)
  Removed the exec command in favour of a recursive PHP function to delete directory contents (much better!)
  Added a security setting that stops the script being run unless by cron/shell_user of the site.
  Removed some of the directory entries (specifically var/cache and var/log (the former shouldn't be emptied frequently and the latter should be managed within System ---------------> Configuration -> Developer)
  */

//Edit the next three lines :
$domain = "your_domain_goes_here"; //Put your domain in here as it looks in Plesk (ie no www.)
$subdirectory = "your_subdirectory_goes_here"; //If your site is in a subdirectory put the subdirectory here otherwise leave empty
$cronOnly =false; //Set to false for testing, or if inexplicably you get the security notice when running as a cron.

//Do not edit below this line :
if ($cronOnly)
{
    $dnsArray = dns_get_record($domain,DNS_A);
    $inboundIP = $dnsArray[0]['ip'];
    $callingIP = $_SERVER['REMOTE_ADDR'];
    if ($inboundIP != $_SERVER['REMOTE_ADDR'])
    {
        die("For security this script can only be called via a cron from the same server as its hosted.  Your IP : $callingIP . Required IP : $inboundIP ");
    }
}

$path = "/var/www/vhosts/".$domain."/httpdocs/".$subdirectory;
$localXML = "/app/etc/local.xml";

//Check that the path is correct and accessible

if (!file_exists($path.$localXML))
{
    echo "I have path : ".$path." but I cannot find Magento here";
    echo "One of the following is incorrect :";
    echo "Domain : ".$domain."";
    echo "Subdirectory : ".$subdirectory."";
    die();
}

$xml = simplexml_load_file($path.$localXML, NULL, LIBXML_NOCDATA);

$db['host'] = $xml->global->resources->default_setup->connection->host;
$db['name'] = $xml->global->resources->default_setup->connection->dbname;
$db['user'] = $xml->global->resources->default_setup->connection->username;
$db['pass'] = $xml->global->resources->default_setup->connection->password;
$db['pref'] = $xml->global->resources->db->table_prefix;

if($_GET['clean'] == 'log') clean_log_tables();
if($_GET['clean'] == 'var') clean_var_directory();

function clean_log_tables() {
    global $db;

    $tables = array(
        'dataflow_batch_export',   //
        'dataflow_batch_import',
        'log_customer',
        'log_quote',
        'log_summary',
        'log_summary_type',
        'log_url',
        'log_url_info',
        'log_visitor',
        'log_visitor_info',
        'log_visitor_online',
        'report_event'
    );

  $conn = mysqli_connect($db['host'], $db['user'], $db['pass'], $db['name']);
    if($conn) {
    foreach($tables as $v => $k) {
        mysqli_query($conn, 'TRUNCATE `'.$db['pref'].$k.'`') or die(mysqli_error());
    } 
  }
}
function clean_var_directory() {

    global $path;

    $dirs = array(
        'downloader/pearlib/cache/',  //Cached files relating to the Magento Connect process
        'downloader/pearlib/download/', //Downloaded copies of installed Magento modules and extensions
        'var/report/', //Copies of Magento error reports
        'var/session/', //Cart sessions
        'var/tmp/' //Used in old version of Magento (I think)
    );

    foreach($dirs as $v => $k) {
       destroy_contents($path.$k);
    }
}


function destroy_contents($dir) {
   $mydir = opendir($dir);
   if ($mydir == false)
        return;
   while(false !== ($file = readdir($mydir))) {

    if($file != "." && $file != "..") {
          chmod($dir.$file, 0777);
          if(is_dir($dir.$file)) {
                chdir('.');
                destroy_contents($dir.$file.'/');
                 if(!rmdir($dir.$file))
                     echo "Could not delete $dir/$file";
            }
            else
                  if(!unlink($dir.$file))
                      echo "Could not delete $dir/$file";
            }
      }
      closedir($mydir);
}
?>

Step 2

Upload your nucleanup.php file to your Magento root folder either by logging into your Plesk Control Panel with your credentials, or though your FTP client of choice.

Step 3

Set the value of the $domain variable to your domain name. For example, if your website URL is mydomain.co.uk you would set the value of the $domain variable to mydomain.co.uk.

Step 4

The value of the $subdirectory variable should be set to the path of your Magento store. For example, if your Magento store is located at mydomain.co.uk/mystore the value of the $subdirectory variable would be set to mystore.

Step 5

The value of $cronOnly should be set to false if you are simply testing the script or if you inexplicably receive a security notice when running as a cron. By setting the value of $cronOnly to false you will be able to run and test the script directly from your browser.

Step 6

From your Plesk Control Panel, select your Magento website. Now click the Scheduled Tasks button. (A Scheduled Task is Plesk’s user-friendly term for cron-job!)

Step 7

Next, hit Add Task to create a new task. Enter the following into the Command field:

curl  --silent  --output  /dev/null  http://www.yourdomain.co.uk/nucleanup.php?clean=log

(You’ll need to replace “yourdomain.co.uk” with your actual domain name and update the /path/to/ part so that it reflects the path to nucleanup.php for your domain).

Step 8

Now we need to schedule when and how often the script is run. We recommend that the log files in your Magento database be cleaned daily, at the quietest time for your Magento store (often in the early hours of the morning). To schedule the script to run daily at 04:30 hours every morning, in the Run field, ensure the Daily option is selected from the drop-down menu and input 04:30 as the time. Click OK to create the cron.

Step 9

We can now schedule an additional task for purging the var files (such as error reports) present in your Magento store. To do this, we need to create a new task. Select Add New Task again and enter the following in the Command text field:

curl --silent --output /dev/null  http://www.yourdomain.co.uk/path/to/nucleanup.php?clean=var

Step 10

To configure the script to run and clean the var files within your Magento store every three days at 03:00, select Cron Style from the Run field and then enter the following into the box that will appear:

0 3 */3 * *

Step 11

Your Scheduled Tasks screen should look something like the screenshot below:

Step 12

You can now test the nucleanup.php script by running it directly in your web browser. Depending on which function of the script you wish to test, browse to:

http://www.yourdomain.co.uk/Path-To-Magento-Store/nucleanup.php?clean=log

or

http://www.yourdomain.co.uk/Path-To-Magento-Store/nucleanup.php?clean=var

Step 13

The log and var folders in your Magento store should now be purged! To check your log files are cleaned out you can log into phpMyAdmin from within your Plesk Control Panel. To do this select your domain from within Plesk and select your database from the dropdown menu next to Databases, then click Open. A new tab should appear displaying phpMyAdmin. To analyse the size of the tables in your database click the name of your database in the top left hand corner of this popup. Now browse down to log_url onwards and you should see that the number of records is reduced to 0 (or near 0 if someone has visited your Magento store since you ran the script!). You can see from the screenshot below, our initial example now has 0 log records:

Step 14

Similarly, you can check that the relevant files have also been cleared from your Magento folder by browsing to httpdocs and checking the file size of the downloader/pearlib/cache, downloader/pearlib/download/, var/report/, var/session/, var/tmp/ directories.

Step 15

If you are happy with the way the script is running, you can now update the value of $cronOnly within the script to true. This final step is crucial for the script to be called successfully via a cron.

And that’s it! You can now enjoy the benefits of a cleaner and faster Magento store without worrying too much about the size of your database spiralling out of control!