Showing posts with label php. Show all posts
Showing posts with label php. Show all posts

Tuesday, February 26, 2019

MySQL extra inner join taking too long: optimizer_search_depth

MySQL has a query optimiser that takes in SQL queries and tries to find the best way to execute them. Then the query gets executed using the best execution plan found. When you use joins in your query, the time the optimiser takes to finish optimising is exponential to the number of joins. This means that for a few joins it only takes a negligible amount of time to finish but after a point, which is somewhere between 7 and 10 tables, the optimisation time will shoot up. In my case I went from 0.1 seconds to 17 seconds but just adding another table in my joins.

To check if this is happening to you, you can profile your query by executing the following:

SET profiling = 1;

*your query*

SHOW PROFILE FOR QUERY 1;

(If you using phpMyAdmin you might need to add "SHOW PROFILE;" as well at the end)

This will show you a table of execution times for each phase in the execution of the query. The time taken by the query optimiser is in the "statistics" row. If this is too high, then you need to stop your optimiser from spending so much time.

Controlling how much time the optimiser should spend doing its job can be accomplished using the optimizer_search_depth variable. This is the maximum depth to search (presumably in some tree search algorithm) when optimising the query and is unfortunately set to the highest value by default. Being set to the highest value makes it the most likely to find the best execution plan but may also be unnecessarily time consuming. Thankfully there is a better default value to use: 0. When this variable is set to zero, the optimiser automatically picks a value based on the query and that seems to work well for me. To do this just execute the following query:

SET optimizer_search_depth = 0;

Friday, April 20, 2012

Finding the median value using MySQL sql

The median of a finite list of numbers is defined as the number in the middle of the list when the list is sorted, or the average (arithmetic mean) of the two numbers in the middle if the size of the list is even.

Since, as I described in my previous post, averages are meaningless when it comes to something like marks, I've decided to start using medians instead since it would be more useful to know in which half of the class you stand with your mark, and medians are much more intuitive than means.

Here is the SQL to find the median in MySQL (adapted from http://stackoverflow.com/a/7263925)

SET @rownum := -1;

SELECT
   AVG(t.mark)
FROM
(
   SELECT
      @rownum := @rownum + 1 AS rownum,
      results.mark AS mark
   FROM
      results
   ORDER BY results.mark
) AS t
WHERE
   t.rownum IN (
      CEIL(@rownum/2),
      FLOOR(@rownum/2)
   )
;

Here's a brief explanation:

In MySQL we cannot use expressions in LIMIT, so instead we have to simulate a LIMIT by using a counter with user defined variables. Each row in the table will have a row number associated with it starting from 0 in the variable @rownum.

We then choose from these numbered rows the ceiling and floor of "(number of rows - 1)/2" which will be the same middle value if the number of rows is odd or the middle two values if the number of rows is even. The variable will still retain its value after all the rows are numbered so we can use it to know the number of rows.

Since the average of one number is the number itself, we can simply calculate the average of whatever is returned without checking. If the number of rows is odd then we shall find the average of the middle value which will remain the same and if the number of rows is even then we shall find the average of the middle two values will will be the correct median.

Don't forget that if you're using this in PHP you have to use two "mysql_query" calls, one for the first SET and another for the SELECT as you cannot use ";" in queries.

Monday, March 14, 2011

Reading Related Tables As Nested Rows

Problem
A problem I used to face with relational databases is how to handle 1-to-many or many-to-many rows in SQL select statements. Let's say you have 2 tables with a many-to-many relationship between them, such as a Movies table and an Actors table. How do I present a list of movies together with their associated actors?

The naive way to do this is by using nested loops with a query for movies in the top loop and a query for actors in the second loop. In PHP it would look something like...

$moviesResult = mysql_query("SELECT id, title FROM movies"); 
while($moviesRow = mysql_fetch_assoc($result)) 
{ 
    echo("<h1>" . $moviesRow['title'] . "</h1>"); 

    echo("<ul>"); 
    $actorsResult = mysql_query("SELECT actors.name FROM actors INNER JOIN movies_actors ON actors.id = movies_actors.actorid WHERE movies_actors.movieid = " . $moviesRow['id'] . ";"); 
    while($actorsRow = mysql_fetch_assoc($actorsResult)) 
    { 
        echo("<li>" . $actorsRow['name'] . "</li>"); 
    } 
    echo("</ul>"); 
}

This approach however will kill your database. Ideally you should minimize the number of queries sent.

Another approach would be to join the movies table with the actors table and then read it with nested loops.

$result = mysql_query("SELECT movies.id AS id movies.title AS title, actors.name AS actor FROM movies INNER JOIN movies_actors ON movies.id = movies_actors.movie INNER JOIN actors ON actors.id = movies_actors.actor"); 
$row = mysql_fetch_assoc($result); 
while($row) 
{ 
    $currMovieId = $row['id']; 
    echo("<h1>" . $row['title'] . "</h1>"); 
    echo("<ul>"); 
    do 
    { 
        echo("<li>" . $row['name'] . "</li>"); 
    } while ($row = mysql_fetch_assoc($result) && $row['id'] == $currMovieId);
    echo("</ul>"); 
}

This works but as soon as you add another table to the join, determining which rows contain new information can be a nightmare, not to mention all the rows which just contain repeated information due to the cartesian product. For example:

IdTitleActorGenre
1The MatrixKeanu ReevesAction
1The MatrixKeanu ReevesAdventure
1The MatrixKeanu ReevesSci-Fi
1The MatrixLaurence FishburneAction
1The MatrixLaurence FishburneAdventure
1The MatrixLaurence FishburneSci-Fi
1The MatrixCarrie-Anne MossAction
1The MatrixCarrie-Anne MossAdventure
1The MatrixCarrie-Anne MossSci-Fi
2The Matrix ReloadedKeanu ReevesAction
2The Matrix ReloadedKeanu ReevesAdventure
2The Matrix ReloadedKeanu ReevesSci-Fi
2The Matrix ReloadedLaurence FishburneAction
2The Matrix ReloadedLaurence FishburneAdventure
2The Matrix ReloadedLaurence FishburneSci-Fi
2The Matrix ReloadedCarrie-Anne MossAction
2The Matrix ReloadedCarrie-Anne MossAdventure
2The Matrix ReloadedCarrie-Anne MossSci-Fi
3The Matrix RevolutionsKeanu ReevesAction
3The Matrix RevolutionsKeanu ReevesAdventure
3The Matrix RevolutionsKeanu ReevesSci-Fi
3The Matrix RevolutionsLaurence FishburneAction
3The Matrix RevolutionsLaurence FishburneAdventure
3The Matrix RevolutionsLaurence FishburneSci-Fi
3The Matrix RevolutionsCarrie-Anne MossAction
3The Matrix RevolutionsCarrie-Anne MossAdventure
3The Matrix RevolutionsCarrie-Anne MossSci-Fi

Are we supposed to receive all those rows just to display the below?

1The MatrixKeanu Reeves, Laurence Fishburne, Carrie-Anne MossAction, Adventure, Sci-Fi
2The Matrix ReloadedKeanu Reeves, Laurence Fishburne, Carrie-Anne MossAction, Adventure, Sci-Fi
3The Matrix RevolutionsKeanu Reeves, Laurence Fishburne, Carrie-Anne MossAction, Adventure, Sci-Fi

Solution
The best solution I found is a hybrid of the above 2 solutions. You make a different query for each table and then read all the rows returned by each query, placing the information in the list it belongs to.

$moviesResult = mysql_query("SELECT id, title FROM movies;"); 
$actorsResult = mysql_query("SELECT movies_actors.movieid AS movieid, actors.name AS name FROM actors INNER JOIN movies_actors ON actors.id = movies_actors.actorid;");
$genresResult = mysql_query("SELECT movies_genres.movieid AS movieid, genres.name AS name FROM genres INNER JOIN movies_genres ON genres.id = movies_genres.genreid;"); 

$actorRow = mysql_fetch_assoc($actorsResult); 
$genreRow = mysql_fetch_assoc($genresResult); 
while($movieRow = mysql_fetch_assoc($moviesResult)) 
{ 
    $currMovieId = $movieRow['id']; 
    echo("<h1>" . $movieRow['title'] . "</h1>"); 

    echo("<ul>"); 
    while ($actorRow && $actorRow['movieid'] == $currMovieId) 
    { 
        echo("<li>" . $row['name'] . "</li>"); 
        $actorRow = mysql_fetch_assoc($actorsResult); 
    } 
    echo("</ul>"); 

    echo("<ul>"); 
    while ($genreRow && $genreRow['movieid'] == $currMovieId) 
    { 
        echo("<li>" . $row['name'] . "</li>"); 
        $genreRow = mysql_fetch_assoc($genresResult); 
    } 
    echo("</ul>"); 
}

In this way we only make 3 queries, equal to the number of tables, and we only read as many rows as needed.