How to avoid top 5 mistakes in SQL
Learn what are the top 5 mistakes in SQL and how to avoid them.
Let’s discuss some tips and tricks to avoid major mistakes in SQL
Never use Select *
select *, gets the data from all the columns, which increases the latency and is expensive for huge data.
Instead, get only the required fields, which limits the size of each record
# wrong way
select * from employees
# optimal way
select id, name from employees
Use EXISTS() Instead of COUNT()
If you want to know whether a record/result exists or not, it is better to use exists(), rather than count()
count(), will browse the entire table to get you the number of records.
Whereas, exists(), will get back to you, when it finds the first record for the query, saving you time and computing.
Functions on indexed columns is useless
Using functions on indexed columns, while querying, will make the indexes remain useless.
In order to use indexes, we need to avoid adding functions on the indexed columns.
# wrong way
select * from users where DATEPART(YEAR, birth_date) = 1996 and DATEPART(MONTH, birth_date) = 8
# optimal way
select * from users where birth_date >= '19960101' and birth_date < '19960201'
LIMIT Statements
Most common used pagination is LIMIT, OFFSET, which is not optimal.
It is fast for smaller and immediate sets like “LIMIT 0, 10”
But when OFFSET is changed to 1000000, it takes too long, since database doesn't know where 1000000th record exists, it starts from scratch till it finds 1000000th row.
Better approach would be to add another filter, mostly indexed column, like below
# wrong way
select * from messages where user_id = 'abc' limit 10 offset 100000;
# optimal way
select * from messages where user_id = 'abc' and id >= 100000
limit 10;
Use GROUP BY Instead of DISTINCT
The distinct is an expensive operation, and doesn’t use indexes if available.
Faster and easier way to do the same is to use group by.
# wrong way
EXPLAIN select count(DISTINCT user_id) from invoices;
# optimal way
select count(*) from (select user_id from invoices group by user_id) as a
In Conclusion…
I haven't covered many major SQL pitfalls. But these are some of the top mistakes, even made by experienced developers. Now you get to avoid them 😁
That’s it! Please let me know about your views and comment below for any clarifications.
If you found value in reading this, please consider sharing it with your friends and also on social media 🙏
Also, to be notified about my upcoming articles, subscribe to my newsletter (I’ll not spam you 😂)