Skip to main content

15 posts tagged with "SQL优化"

View All Tags

When OR Just Won't Do: Using UNION for Higher SQL Performance

· 4 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

Copyright © 2024 PawSQL

Problem Definition

If two query conditions are connected by OR, the database optimizer may not be able to use indexes to execute the query. For example, for the following SQL statement:

select * from customer where c_phone like '139%' or c_name = 'Ray' 

Even if both fields have indexes, they can be rewritten as a UNION query to leverage the indexes and improve query performance:

Join Elimination

· 5 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

Copyright © 2024 PawSQL

Definition

Join Elimination is a rewriting optimization in SQL that simplifies queries and improves query performance by removing joins from the query without affecting the final result. Typically, this optimization is used when a query contains a primary-foreign key join and only references the primary key columns of the main table.

Consider the following example:

select o.* from orders o inner join customer c on c.c_custkey=o.o_custkey

The orders table is joined with the customer table, and c_custkey is the primary key of the customer table. In this case, the customer table can be eliminated, and the rewritten SQL would be:

select * from orders where o_custkey

IN Subquery Optimization

· 6 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

Problem Definition

An IN subquery is a type of subquery that takes the following form.

(expr1, expr2...) [NOT] IN (SELECT expr3, expr4, ...)

An IN subquery can be rewritten as an equivalent correlated EXISTS subquery or inner join, which can create a extra filtering condition. If the filtering condition has an appropriate index or is recommended by the PawSQL index recommendation engine, better performance can be achieved.

  • IN Subquery to EXISTS conversion

Implicit Data Type Conversion

· 6 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

Copyright © 2024 PawSQL

Problem

Let's say the O_ORDERDATE column in TPCH.ORDERS table is defined as VARCHAR(16), and there is an index ORDDATE_IDX on ORDERS(O_ORDERDATE) to accelerate the queries against this table if there is a condition on O_ORDERDATE. If we want to count the number of orders of today using following SQL query.

select count(*) from ORDERS 
where O_ORDERDATE = current_date();

Use explain to get the query plan:

-> Filter: (orders.O_ORDERDATE = <cache>(curdate()))  (cost=2621.05 rows=19938)
-> Table scan on ORDERS (cost=2621.05 rows=199378)

From the query plan, we can see the database optimizer uses a table scan instead of the index lookup as we expected. The query cost is high.

Counting Optimization

· 2 min read
PawSQL Team
Optimize your SQL Queries by Clicks!

Copyright © 2024 PawSQL

DDL Definition

-- tpch.nation definition
CREATE TABLE `nation` (
`N_NATIONKEY` int NOT NULL,
`N_NAME` char(25) NOT NULL,
`N_REGIONKEY` int NOT NULL,
`N_COMMENT` varchar(152) DEFAULT NULL,
KEY `PAW_IDX1831397382` (`N_NATIONKEY`,`N_NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;