Commit 41b58442 authored by David Négrier

Improving article

parent e1bb9b4b
Pipeline #85717 passed with stages
in 6 minutes and 54 seconds
......@@ -7,7 +7,7 @@ theme: main
#menu: Foo/Bar
context:
post_date: 2019-09-06
introduction: In this tutorial, I would like to present a novel day to deal with the N+1 issue that we are facing when using most ORMs.
introduction: In this article, I present a novel way to deal with the N+1 issue that we are facing when using most ORMs.
related:
- /tdbm5.1-symfony-tutorial
- /tdbm5.1-rich-model
......@@ -39,6 +39,8 @@ foreach ($users as $user) {
}
```
<small>This is PHP but this could really be any other programming language.</small>
What will happen here?
On the first line, the ORM will probably fetch all the users. Something like:
......@@ -68,7 +70,7 @@ Of course, if we were to write pure SQL, we could solve this in exactly one quer
SELECT * FROM users JOIN countries ON users.country_id = countries.id
```
Needless to say the performance of raw SQL are way better than the performance of this hypothetical ORM.
Needless to say the performance of raw SQL is way better than the performance of this hypothetical ORM.
## Eager loading to the rescue
......@@ -77,8 +79,8 @@ Hopefully, ORM developers have known the issue for quite some time and they alre
The idea is always the same: the developer should tell the ORM in advance that additional data will be needed.
The ORM can then fetch that data in advance (we call that "eager loading").
With eager loading, the relationship is fetched along with the parent object. This is more efficient in loading data
but will load data irrespective of the data being used or not.
With eager loading, the related data is fetched along with the parent object. This is more efficient in loading data
but it will load data irrespective of the data being used or not.
### Eager loading in Eloquent
......@@ -132,9 +134,20 @@ There are actually more ways to work with JOINs in Doctrine and if you want to l
Currently available solutions share a common issue: **they rely on the developer** to tell the ORM when to perform
eager loading.
It is tremendously easy to write code that is not optimal and it requires *knowledge* to perform eager loading only
But it is far too easy to forget to set up eager loading, and it requires *knowledge* to perform eager loading only
when needed.
It becomes even harder with several developers! Look at this Twig template:
```twig
{% for user in users %}
<li>{{ user.name }} lives in {{ user.country.label }}</li>
{% endfor %}
```
What are the chances that the designer working on Twig understands the implication of the "for" loop in terms of performance?
Literally none.
Even worse, sub-optimal code can go undetected for quite some time. If N is small enough, you might not notice the issue,
but when your database starts growing, N will become large and you will face a problem after a few months/years
in production!
......@@ -160,41 +173,212 @@ The idea is simple. For each entity fetched, we try to remember where the entity
```php
$users = $userRepository->findAll();
// $users is an instance of what we call a "ResultIterator"
// $users is an instance of a "ResultIterator"
foreach ($users as $user) {
// At this point, if you give me a $user instance
// TDBM can go back to the ResultIterator
// At this point, if you give TDBM a $user instance
// TDBM can go back to the ResultIterator that generated it
}
```
![From any entity, we can go back to the original query](images/orm-n-plus-one/smart-eager-load.svg)
Now, here is what is going on when we call `$user->getCountry()`.
Now, here is a sneak peek of what is going on in TDBM's mind when we call `$user->getCountry()`.
On the **first iteration of the loop**, TDBM will ask:
\- "Where does `$user` comes from?"
\- "It comes from a ResultIterator"
\- "So it is very very likely that we are currently in a foreach loop and that the `getCountry` method will be called in a loop for every user of the ResultIterator, right?"
\- "Yup..."
\- "Ok, then rather than only fetch a single country, let me prefetch **all** the countries that I'll need"
\- "What was the query that generated the ResultIterator?"
\- "`SELECT * FROM users WHERE status="ON"`"
\- "And we want the list of countries attached to this list of users, right?"
\- "Yup..."
\- "So what about this? `SELECT DISTINCT * FROM countries WHERE country_id IN (SELECT country_id FROM users WHERE status="ON")`"
\- "Excellent!"
\- "So let's cache all this data for the next loop iteration, shall we?"
<style>
.chatwrapper {
width: 100%;
display: flex;
flex-direction: column;
align-items: center;
}
.chat {
width: 80%;
border: solid 1px #EEE;
display: flex;
flex-direction: column;
padding: 10px;
}
.messages {
margin-top: 30px;
display: flex;
flex-direction: column;
}
.message {
border-radius: 20px;
padding: 8px 15px;
margin-top: 5px;
margin-bottom: 5px;
display: inline-block;
}
.yours {
align-items: flex-start;
}
.yours .message {
margin-right: 25%;
background-color: #EEE;
position: relative;
}
.yours .message.last:before {
content: "";
position: absolute;
z-index: 0;
bottom: 0;
left: -7px;
height: 20px;
width: 20px;
background: #EEE;
border-bottom-right-radius: 15px;
}
.yours .message.last:after {
content: "";
position: absolute;
z-index: 1;
bottom: 0;
left: -10px;
width: 10px;
height: 20px;
background: white;
border-bottom-right-radius: 10px;
}
.mine {
align-items: flex-end;
}
.mine .message {
color: white;
margin-left: 25%;
background: rgb(0, 120, 254);
position: relative;
}
.mine .message.last:before {
content: "";
position: absolute;
z-index: 0;
bottom: 0;
right: -8px;
height: 20px;
width: 20px;
background: rgb(0, 120, 254);
border-bottom-left-radius: 15px;
}
.mine .message.last:after {
content: "";
position: absolute;
z-index: 1;
bottom: 0;
right: -10px;
width: 10px;
height: 20px;
background: white;
border-bottom-left-radius: 10px;
}
</style>
<div class="chatwrapper">
<div class="chat">
<div class="yours messages">
<div class="message last">
Where does <code>$user</code> come from?
</div>
</div>
<div class="mine messages">
<div class="message last">
It comes from a ResultIterator
</div>
</div>
<div class="yours messages">
<div class="message last">
So it is very very likely that we are currently in a foreach loop and that the <code>getCountry</code> method will be called in a loop for every user of the ResultIterator, right?
</div>
</div>
<div class="mine messages">
<div class="message last">
Yup...
</div>
</div>
<div class="yours messages">
<div class="message last">
What was the query that generated the ResultIterator?
</div>
</div>
<div class="mine messages">
<div class="message last">
<code>SELECT * FROM users WHERE status="ON"</code>
</div>
</div>
<div class="yours messages">
<div class="message last">
And we want the list of countries attached to this list of users, right?
</div>
</div>
<div class="mine messages">
<div class="message last">
Yup...
</div>
</div>
<div class="yours messages">
<div class="message last">
So what about this?<br/>
<code>SELECT DISTINCT * FROM countries WHERE id IN (SELECT country_id FROM users WHERE status="ON")</code>
</div>
</div>
<div class="mine messages">
<div class="message last">
Excellent!
</div>
</div>
<div class="yours messages">
<div class="message last">
So let's cache all this data for the next loop iteration, shall we?
</div>
</div>
</div>
</div>
On the **subsequent iterations of the loop**, here is what will happen:
\- "Where does `$user` comes from?"
\- "It comes from a ResultIterator"
\- "Did we already fetched some data related to countries for this result iterator?"
\- "Yup..."
\- "Excellent! Give me the data! No queries required."
<div class="chatwrapper">
<div class="chat">
<div class="yours messages">
<div class="message last">
Where does <code>$user</code> come from?
</div>
</div>
<div class="mine messages">
<div class="message last">
It comes from a ResultIterator
</div>
</div>
<div class="yours messages">
<div class="message last">
Did we already fetch some data related to countries for this result iterator?
</div>
</div>
<div class="mine messages">
<div class="message last">
Yup...
</div>
</div>
<div class="yours messages">
<div class="message last">
Excellent! Give me the data! No queries required.
</div>
</div>
</div>
</div>
In the end, we managed to fetch all the required data in only **2 queries** (instead of N+1 queries!).
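
To make the dialogue above more concrete, here is a minimal, self-contained sketch of the idea. It is not TDBM's actual code: `FakeDb`, `countryFor()` and the public `$row` property are hypothetical names invented for this illustration, and the "database" is a plain PHP array filtered in memory, whereas a real implementation would more likely reuse the original SQL as a sub-query. The only point is to show an entity that remembers the iterator it came from, so that related rows can be prefetched once and then served from a cache.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory "database": each call to query() stands for one SQL query.
final class FakeDb
{
    public int $queryCount = 0;

    public function __construct(private array $tables) {}

    // Returns the rows of $table accepted by $filter and counts the call as one query.
    public function query(string $table, callable $filter): array
    {
        $this->queryCount++;
        return array_values(array_filter($this->tables[$table], $filter));
    }
}

// Hypothetical stand-in for a result iterator: it owns its users and a country cache.
final class ResultIterator implements IteratorAggregate
{
    private array $users = [];
    private ?array $countries = null; // countries indexed by id, filled on first access

    public function __construct(private FakeDb $db, callable $userFilter)
    {
        foreach ($db->query('users', $userFilter) as $row) {
            // Each entity keeps a reference to the iterator that generated it.
            $this->users[] = new User($row, $this);
        }
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->users);
    }

    public function countryFor(int $countryId): array
    {
        if ($this->countries === null) {
            // First call: prefetch the countries of ALL users in this result set.
            $ids = array_map(fn (User $u) => $u->row['country_id'], $this->users);
            $rows = $this->db->query('countries', fn (array $c) => in_array($c['id'], $ids, true));
            $this->countries = array_column($rows, null, 'id');
        }
        // Subsequent calls: answered from the cache, no query required.
        return $this->countries[$countryId];
    }
}

final class User
{
    public function __construct(public array $row, private ResultIterator $origin) {}

    public function getCountry(): array
    {
        return $this->origin->countryFor($this->row['country_id']);
    }
}

$db = new FakeDb([
    'users' => [
        ['id' => 1, 'name' => 'Alice', 'status' => 'ON', 'country_id' => 1],
        ['id' => 2, 'name' => 'Bob', 'status' => 'ON', 'country_id' => 2],
        ['id' => 3, 'name' => 'Carol', 'status' => 'ON', 'country_id' => 1],
    ],
    'countries' => [
        ['id' => 1, 'label' => 'France'],
        ['id' => 2, 'label' => 'Japan'],
    ],
]);

$users = new ResultIterator($db, fn (array $u) => $u['status'] === 'ON');
foreach ($users as $user) {
    echo $user->row['name'], ' lives in ', $user->getCountry()['label'], PHP_EOL;
}
echo 'Queries: ', $db->queryCount, PHP_EOL; // 2, whatever the number of users
```

In this sketch the calling code is an ordinary `foreach` loop with no eager-loading hint, yet the query counter stays at 2 because the prefetch decision is taken inside `getCountry()`.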
......@@ -260,3 +444,6 @@ If I wrote this article, it is also in the spirit to share this idea with my fel
So if you are an ORM developer, maybe you will consider adding a similar feature to your library?
Want to stay tuned on the latest TDBM releases or PHP related news? [Follow me on Twitter!](https://twitter.com/david_negrier/)
Also, now is a good time to thank my wonderful teammates [Arthmael](https://github.com/Kharhamel) and [Guillaume](https://github.com/homersimpsons),
who made these innovations possible!