Custom Database Tables: Safety First

This is part two of a series about custom database tables in WordPress. In part one we covered the reasons for, and against, using custom tables. We looked at some of the details that would need to be considered – column naming, column types – as well as how to create the table. Before we go any further we need to cover how to interact with this new table safely. In a previous article I covered general sanitisation and validation – in this tutorial we’ll look at this in more detail in the context of databases.

Safety when interacting with a database table is paramount – which is why we’re covering it early on in the series. If not done correctly then you can leave your table open to manipulation via SQL injection. It could allow a hacker to extract information, replace content or even alter the way your site behaves – and the damage they could do is not restricted to your custom table.

Let’s suppose we want allow administrators to delete records from our activity log. A common mistake I’ve seen is the following:

     if ( !empty($_GET['action'])
          && 'delete-activity-log' == $_GET['action']
          && isset($_GET['log_id']) ) {
               global $wpdb;
               unsafe_delete_log($_GET['log_id']);
     }
     function unsafe_delete_log( $log_id ){
          global $wpdb;
          $sql = "DELETE FROM {$wpdb->wptuts_activity_log} WHERE log_id = $log_id";
          $deleted = $wpdb->query( $sql );
     }

So what’s wrong here? Plenty: They’ve not checked permissions, so anyone can delete an activity log. Nor have they checked nonces, so even with permission checks, an admin user could be tricked into deleting a log. This was all covered in this tutorial. But their third mistake compounds the first two: the unsafe_delete_log() function uses the passed value in an SQL command without escaping it first. This leaves it wide open to manipulation.

Let’s suppose its intended use is

  www.unsafe-site.com?action=delete-activity-log&log_id=7

What if an attacker visited (or tricked an admin into visiting): http://www.unsafe-site.com?action=delete-activity-log&log_id=1;%20DROP%20TABLE%20wp_posts. The log_id contains an SQL command, which is subsequently injected into $sql and would be executed as:

    DELETE from wp_wptuts_activity_log WHERE log_id=1; DROP TABLE wp_posts

The result: the entire wp_posts table is deleted. I’ve seen code like this on forums – and the result is that anyone visiting their site can update or delete any table in their database.

If the first two mistakes were corrected, then it does make it harder for this type of attack to work – but not impossible, and it wouldn’t protect against an ‘attacker’ who has permission to delete activity logs. It’s incredibly important to protect your site against SQL injections. It’s also incredibly simple: WordPress provides the prepare method. In this particular example:

     function safe_delete_log( $log_id ){
         global $wpdb;
         $sql = $wpdb->prepare("DELETE from {$wpdb->wptuts_activity_log} WHERE log_id = %d", $log_id);
         $deleted = $wpdb->query( $sql )
     }

The SQL command would now execute as

    DELETE from wp_wptuts_activity_log WHERE log_id=1;

Sanitising Database Queries

Most sanitisation can be performed solely using the $wpdb global – notably through its prepare method. It also provides methods for inserting and updating data into tables safely. These usually work by replacing an unknown input, or associating an input, with a format placeholder. This format tells WordPress what data it is to expect:

  • %s denotes a string
  • %d denotes an integer
  • %f denotes a float

We start by looking at three methods which not only sanitize queries – but build them for you too.

Inserting Data

WordPress provides the method $wpdb->insert(). It’s a wrapper for inserting data into the database and handles the sanitisation. It takes three parameters:

  • Table name – the name of the table
  • Data – array of data to insert as column->value pairs
  • Formats – array of formats for the corresponding value in the data array (e.g. %s, %d,%f)

Note that the keys of the data should be columns: if there is a key that does not match a column an error may be thrown.

In the examples that follow we have explicitly set the data – but of course, in general, this data would have come from user input – so it could be anything. As discussed in this article the data should have been validated first, so as to return any errors to the user – but we still need to sanitize the data before adding it to our table. We’ll be looking at validation in the next article of this series.

global $wpdb;
//
$user_id = 1;
$activity = 1;
$object_id = 1479;
$activity_date = date_i18n('Y-m-d H:i:s', false, true);
$inserted = $wpdb->insert(
     $wpdb->wptuts_activity_log,
     array(
        'user_id'=>$user_id,
        'activity'=>$activity,
        'object_id'=>$object_id,
        'activity_date'=> $activity_date,
      ),
     array (
        '%d',
        '%s',
        '%d',
        '%s',
     )
 );
 if( $inserted ){
    $insert_id = $wpdb->insert_id;
 }else{
    //Insert failed
 }

Updating Data

For updating data in the database we have $wpdb->update(). This method accepts five arguments:

  • Table name – the name of the table
  • Data – array of data to update as column->value pairs
  • Where – array of data to match as column->value pairs
  • Data Format – array of formats for the corresponding ‘data’ values
  • Where Format – array of formats for the corresponding ‘where’ values

This updates any rows that matching the where array with values from the data array. Again, as with $wpdb->insert() the keys of the data array must match a column. It returns false on error, or the number of rows updated.

In the following example we update any records with log ID ’14′ (which should be at most one record, as this is our primary key). It updates the user ID to 2 and activity to ‘edited’.

global $wpdb;
$user_id=2;
$activity='edited';
$log_id = 14;
$updated = $wpdb->update(
     $wpdb->wptuts_activity_log,
     array(
        'user_id'=>$user_id,
        'activity'=>$activity,
     ),
     array('log_id'=>$log_id,),
     array( '%d', '%s'),
     array( '%d'),
 );
 if( $updated ){
    //Number of rows updated = $updated
 }

Deleting

Since 3.4 WordPress has also provided the $wpdb->delete() method for easily (and safely) deleting row(s). This method takes three parameters:

  • Table name – the name of the table
  • Where – array of data to match as column->value pairs
  • Formats – array of formats for the corresponding value type (e.g. %s, %d,%f)

If you want your code to be compatible with WordPress pre-3.4, then you will need to use the $wpdb->prepare method to sanitise the appropriate SQL statement. An example of this was given above. The $wpdb->delete method returns the number of rows delete, or false otherwise – so you can determine if the deletion was successful.

global $wpdb;
$deleted = $wpdb->delete(
     $wpdb->wptuts_activity_log,
     array('log_id'=>14,),
     array( '%d'),
 );
 if( $deleted ){
    //Number of rows deleted = $deleted
 }

esc_sql

In light of the above methods, and the more general $wpdb->prepare() method discussed next, this function is a bit redundant. Its provided as a useful wrapper for the $wpdb->escape() method, itself a glorified addslashes. Since its usually more appropriate, and advisable, to use the above three methods, or $wpdb->prepare(), you will probably find that you rarely need to use esc_sql().

As a simple example:

$activity = 'commented';
$sql = "DELETE FROM {$wpdb->wptuts_activity_log} WHERE activity='".esc_sql($activity)."';";

General Queries

For general SQL commands where (i.e. those not inserting, removing or updating rows) we have to use method $wpdb->prepare(). It accepts a variable number of arguments. The first is the SQL query we wish to execute with all ‘unknown’ data replaced by their appropriate format placeholder. These values are passed as additional arguments, in the order that they appear.

For instance instead of:

$sql = "SELECT* FROM {$wpdb->wptuts_activity_log}
         WHERE user_id = $user_id
         AND object_id = $object_id
         AND activity = $activity
         ORDER BY activity_date $order";
$logs = $wpdb->get_results($sql);

we have

$sql = $wpdb->prepare("SELECT* FROM {$wpdb->wptuts_activity_log}
                WHERE user_id = %d
                AND object_id = %d
                AND activity = %s
                ORDER BY activity_date %s",
               $user_id,$object_id,$activity, $order  );
$logs = $wpdb->get_results($sql);

The prepare method does two things.

  1. It applies mysql_real_escape_string() (or addslashes()) to the values being inserted. In particular this will prevent values containing quotation marks from jumping out of the query.
  2. It applies vsprintf() when adding the values to the query to ensure they are formatted appropriately (so integers are integers, floats are floats etc). This is why our example at the very beginning of the article stripped out everything but the ’1′.

More Complicated Queries

You should find that $wpdb->prepare, along with the insert, update and delete methods are all you really need. Sometimes though there are circumstances where a more ‘manual’ approach is desired – sometimes just from readability point of view. For example, suppose we have an unknown array of activities for which we want all the logs. We *could* dynamically add the %s placeholders to the SQL query, but a more direct approach seems easier:

 //An unknown array that should contain strings being queried for
 $activities = array( ... );
 //Sanitize the contents of the array
 $activities = array_map('esc_sql',$activities);
 $activities = array_map('sanitize_title_for_query',$activities);
 //Create a string from the sanitised array forming the inner part of the IN( ... ) statement
 $in_sql = "'". implode( "','", $activities ) . "'";
 //Add this to the query
 $sql = "SELECT* FROM $wpdb->wptuts_activity_log WHERE activity IN({$in_sql});"
 //Perform the query
 $logs = $wpdb->get_results($sql);

The idea is to apply esc_sql and sanitize_title_for_query to each element in the array. The first adds slashes to escape the terms – similar to what $wpdb->prepare() does. The second simply applies sanitize_title_with_dashes() – though the behaviour can be completely modified through filters. The actual SQL statement is formed by imploding the now sanitised array into a comma separated string, which is added into the IN(...) part of the query.

If the array is expected to contain integers then it suffices to use intval() or absint() to sanitize each element in the array.

Whitelisting

In other cases whitelisting may be appropriate. For instance the unknown input may be an array of columns that are to be returned in the query. Since we know what the columns of the database are we can simply whitelist them – removing any fields which we don’t recognise. However, to make our code human friendly we should be case-insensitive. To do this we’ll convert anything we receive to lowercase – since in part one we specifically used lowercase column names.

 //An unknown array that should contain columns to be included in the query
 $fields = array( ... );
 //A whitelist of allowed fields
 $allowed_fields = array( ... );
 //Convert fields to lowercase (as our column names are all lower case - see part 1)
 $fields = array_map('strtolower',$fields);
 //Sanitize by white listing
 $fields = array_intersect($fields, $allowed_fields);
 //Return only selected fields. Empty $fields is interpreted as all
 if( empty($fields) ){
     $sql = "SELECT* FROM {$wpdb->wptuts_activity_log}";
 }else{
     $sql = "SELECT ".implode(',',$fields)." FROM {$wpdb->wptuts_activity_log}";
 }
 //Perform the query
 $logs = $wpdb->get_results($sql);

Whitelisting is also convenient when setting the ORDER BY part of the query (if this is set by user input): data can be ordered as DESC or ASC only.

     //Unknown user input (expected to be asc or desc)
     $order = $_GET['order'];
     //Allow input to be any, or mixed, case
     $order = strtoupper($order);
     //Sanitised order value
     $order = ( 'ASC' == $order ? 'ASC' : 'DEC' );

LIKE Queries

SQL LIKE statements support the use of wildcards such as % (zero or more characters) and _ (exactly one character) when matching values to the query. For instance the value foobar would match any of the queries:

 SELECT * FROM $wpdb->wptuts_activity_log WHERE activity LIKE 'foo%'
 SELECT * FROM $wpdb->wptuts_activity_log WHERE activity LIKE '%bar'
 SELECT * FROM $wpdb->wptuts_activity_log WHERE activity LIKE '%oba%'
 SELECT * FROM $wpdb->wptuts_activity_log WHERE activity LIKE 'fo_bar%'

However, these special characters may actually be present in the term being searched for – and so to prevent them from being interpreted as wildcards – we need to escape them. For this WordPress provides the like_escape() function. Note that this does not prevent SQL injection – but only escapes the % and _ characters: you still need to use esc_sql() or $wpdb->prepare().

 //Collect term
 $term = $_GET['activity'];
 //Escape any wildcards
 $term = like_escape($term);
 $sql = $wpdb->prepare("SELECT* FROM $wpdb->wptuts_activity_log WHERE activity LIKE %s", '%'.$term.'%');
 $logs = $wpdb->get_results($sql);

Query Wrapper Functions

In the examples we’ve looked we’ve used two other methods of $wpdb:

  • $wpdb->query( $sql ) – This performs any query given to it and returns the number of affected rows.
  • $wpdb->get_results( $sql, $ouput) – This performs the query given to it and returns the matching result set (i.e. the matching rows). $output sets the format of the returned results:
    • ARRAY_A – numerical array of rows, where each row is an associative array, keyed by the columns.
    • ARRAY_N – numerical array of rows, where each row is a numerical array.
    • OBJECT – numerical array of rows, where each row is a row object. Default.
    • OBJECT_K – associative array of rows (keyed by the value of the first column), where each row is an associative array.

There are others we haven’t mentioned too:

  • $wpdb->get_row( $sql, $ouput, $row) – This performs the query and returns one row. $row sets which row is to be returned, by default this is 0, the first matching row. $output sets the format of the row:
    • ARRAY_A – Row is a column=>value pair.
    • ARRAY_N – Row is numerical array of values.
    • OBJECT – Row is returned as an object. Default.
  • $wpdb->get_col( $sql, $column) – This performs the query and returns a numerical array of values from the specified column. $column specifies which column to return as integer. By default this is 0, the first column.
  • $wpdb->get_var( $sql, $column, $row) – This performs the query and returns a particular value. $row and $column are as above, and specify which value to return. For example,
    $activities_by_user_1 = $wpdb->get_var("SELECT COUNT(*) FROM {$wpdb->wptuts_activity_log} WHERE user_id = 1");
    

It’s important to note that these methods are just wrappers for performing an SQL query and formatting the result. They do not sanitize the query – so you should not be using them alone when the query contains some ‘unknown’ data.


Summary

We’ve covered quite a lot in this tutorial – and data sanitisation is an important topic to understand. In the next article we’ll be applying it to our plug-in. We’ll be looking at developing a set of wrapper functions (similar to functions like wp_insert_post(), wp_delete_post() etc.) that will add a layer of abstraction between our plug-in and the database.

from Wptuts+ http://wp.tutsplus.com/tutorials/plugins/custom-database-tables-safety-first/

Kommentar verfassen

Bitte logge dich mit einer dieser Methoden ein, um deinen Kommentar zu veröffentlichen:

WordPress.com-Logo

Du kommentierst mit Deinem WordPress.com-Konto. Abmelden / Ändern )

Twitter-Bild

Du kommentierst mit Deinem Twitter-Konto. Abmelden / Ändern )

Facebook-Foto

Du kommentierst mit Deinem Facebook-Konto. Abmelden / Ändern )

Google+ photo

Du kommentierst mit Deinem Google+-Konto. Abmelden / Ändern )

Verbinde mit %s