【LeetCode】学习计划——SQL入门

Day1 选择595. 大的国家

World表：

+-------------+---------+| Column Name | Type    |+-------------+---------+| name        | varchar || continent   | varchar || area        | int     || population  | int     || gdp         | int     |+-------------+---------+name 是这张表的主键。这张表的每一行提供：国家名称、所属大陆、面积、人口和 GDP 值

选择出：

面积至少为 300 万平方公里（即，\(3000000\ km^2\)），或者
人口至少为 2500 万（即 \(25000000\)）

方法一

两个条件一起查询：

select name, population, area from World where area >= 3000000 or population >= 25000000;

方法二

使用union连接两个查询条件：

select name, population, areafrom worldwhere area >= 3000000unionselect name, population, areafrom worldwhere population >= 25000000;

Union：对两个结果集进行并集操作，不包括重复行，同时进行默认规则的排序；即：去重+排序

Union All：对两个结果集进行并集操作，包括重复行，不进行排序；即：不去重+不排序

1757. 可回收且低脂的产品

Products表：

+-------------+---------+| Column Name | Type    |+-------------+---------+| product_id  | int     || low_fats    | enum    || recyclable  | enum    |+-------------+---------+product_id 是这个表的主键。low_fats 是枚举类型，取值为以下两种 ('Y', 'N')，其中 'Y' 表示该产品是低脂产品，'N' 表示不是低脂产品。recyclable 是枚举类型，取值为以下两种 ('Y', 'N')，其中 'Y' 表示该产品可回收，而 'N' 表示不可回收。

select product_idfrom productswhere low_fats = 'Y' and recyclable = 'Y';

584. 寻找用户推荐人

+------+------+-----------+| id   | name | referee_id|+------+------+-----------+|    1 | Will |      NULL ||    2 | Jane |      NULL ||    3 | Alex |         2 ||    4 | Bill |      NULL ||    5 | Zack |         1 ||    6 | Mark |         2 |+------+------+-----------+

MySQL有三个逻辑值：TRUE, FALSE, NULL

所以这个题如果直接选择 referee_id != 2，则会导致 referee_id = NULL的数据没有被选择出来，所以要加上 referee_id is null

select namefrom customerwhere referee_id != 2 or referee_id is null;

还有一种方法是先选出来编号为2的元素然后取反：

运算符相当于封装了= 和 is ，既可以判断非NULL值，也可以用来判断NULL值。只能在MySQL中使用

select namefrom customerwhere not referee_id  2;

或者使用 not in：

select namefrom customerwhere id not in (select id from customer where referee_id = 2);# id是主键，所以选择referee_id等于2的id然后取反

183. 从不订购的客户

Customers 表：

+----+-------+| Id | Name  |+----+-------+| 1  | Joe   || 2  | Henry || 3  | Sam   || 4  | Max   |+----+-------+

Orders 表：

+----+------------+| Id | CustomerId |+----+------------+| 1  | 3          || 2  | 1          |+----+------------+

注意要对name重命名为Customers

select customers.name as 'Customers'from customerswhere customers.id not in(select CustomerId from orders);

使用左连接

select Customers.name as 'Customers'from Customersleft join orderson Customers.id = orders.CustomerIdwhere orders.CustomerId is null;

Day2 排序&修改1873. 计算特殊奖金

Employees表：

+-------------+---------+| 列名        | 类型     |+-------------+---------+| employee_id | int     || name        | varchar || salary      | int     |+-------------+---------+employee_id 是这个表的主键。此表的每一行给出了雇员id ，名字和薪水。

使用CASE

case配合when，then

when后面是条件，then后面是返回的结果

select employee_id,(    case         when mod(employee_id, 2) != 0 and left(name, 1) != 'M' then salary        else 0    end) as bonusfrom Employeesorder by employee_id;

使用IF

IF有三个参数，第一个是判断条件，第二个是条件成立的返回值，第三个是条件不成立的返回值

select employee_id,if(mod(employee_id, 2) != 0 and left(name, 1) != 'M', salary, 0) as bonusfrom Employeesorder by employee_id;

使用LIKE

使用LIKE进行匹配：

'%a'//以a结尾的数据'a%'//以a开头的数据'%a%'//含有a的数据'_a_'//三位且中间字母是a的'_a'//两位且结尾字母是a的'a_'//两位且开头字母是a的

select employee_id,if(mod(employee_id, 2) = 0 or name like 'M%', 0, salary) as bonusfrom Employeesorder by employee_id;

627. 变更性别

要求只使用单个 update 语句 ，且不产生中间临时表。

Salary 表：

+-------------+----------+| Column Name | Type     |+-------------+----------+| id          | int      || name        | varchar  || sex         | ENUM     || salary      | int      |+-------------+----------+id 是这个表的主键。sex 这一列的值是 ENUM 类型，只能从 ('m', 'f') 中取。本表包含公司雇员的信息。

使用IF

update Salary set sex = if(sex = 'f', 'm', 'f');

使用CASE

update Salary set sex = case    when sex = 'f' then 'm'    else 'f'end;

196. 删除重复的电子邮箱

题目要求不使用SELECT语句

Person表：

+-------------+---------+| Column Name | Type    |+-------------+---------+| id          | int     || email       | varchar |+-------------+---------+id是该表的主键列。该表的每一行包含一封电子邮件。电子邮件将不包含大写字母。

题解链接

delete p1表示从p1表中删除满足where条件的记录

# Please write a DELETE statement and DO NOT write a SELECT statement.# Write your MySQL query statement belowdelete p1from Person p1, Person p2where p1.email = p2.email and p1.id > p2.id;

使用SELECT和GROUP BY：

delete from Personwhere id not in(    select * from(select min(id) from Person group by email) t);

Day3 字符串处理函数/正则1667. 修复表中的名字

Users表：

+----------------+---------+| Column Name    | Type    |+----------------+---------+| user_id        | int     || name           | varchar |+----------------+---------+user_id 是该表的主键。该表包含用户的 ID 和名字。名字仅由小写和大写字符组成。

CONCAT函数用来拼接两个字符串，使用 UPPER和 LOWER来对name进行变换，然后拼接起来

select user_id, concat(upper(left(name, 1)), lower(substring(name, 2))) as namefrom Usersorder by user_id;

1484. 按日期分组销售产品

Activities表：

+-------------+---------+| 列名         | 类型    |+-------------+---------+| sell_date   | date    || product     | varchar |+-------------+---------+此表没有主键，它可能包含重复项。此表的每一行都包含产品名称和在市场上销售的日期。

用 group by sell_date将产品按日期统计起来，然后使用 count进行计数，使用 group_concat将产品名拼接起来

select sell_date,count(distinct(product)) as num_sold,group_concat(distinct product order by product asc separator ',') as productsfrom Activitiesgroup by sell_date;

1527. 患某种疾病的患者

患者信息表： Patients

+--------------+---------+| Column Name  | Type    |+--------------+---------+| patient_id   | int     || patient_name | varchar || conditions   | varchar |+--------------+---------+patient_id （患者 ID）是该表的主键。'conditions' （疾病）包含 0 个或以上的疾病代码，以空格分隔。这个表包含医院中患者的信息。

用like匹配，注意两种情况：

DIAB1在第一个，这时候用 DIAN1%匹配
DIAB1不在第一个，此时要在用 % DIAB1%匹配，注意前面有个空格

select patient_id, patient_name, conditionsfrom Patientswhere conditions like 'DIAB1%' or conditions like '% DIAB1%';

Day4 组合查询 & 指定选取1965. 丢失信息的雇员

表: Employees

+-------------+---------+| Column Name | Type    |+-------------+---------+| employee_id | int     || name        | varchar |+-------------+---------+employee_id 是这个表的主键。每一行表示雇员的id 和他的姓名。

表: Salaries

+-------------+---------+| Column Name | Type    |+-------------+---------+| employee_id | int     || salary      | int     |+-------------+---------+employee_id is 这个表的主键。每一行表示雇员的id 和他的薪水。

使用 union all来连接两个查询结果，通过 group by进行将employee_id进行聚合，使用 having count()选择仅出现一次的id

UNION和 UNION ALL的区别：前者会在连接后进行去重操作；后者不会去重，把查询出来的所有结果一起返回

select employee_idfrom(    select employee_id from Employees    union all    select employee_id from Salaries) as tgroup by employee_idhaving count(*) = 1order by employee_id asc;

1795. 每个产品在不同商店的价格

表：Products

+-------------+---------+| Column Name | Type    |+-------------+---------+| product_id  | int     || store1      | int     || store2      | int     || store3      | int     |+-------------+---------+这张表的主键是product_id（产品Id）。每行存储了这一产品在不同商店store1, store2, store3的价格。如果这一产品在商店里没有出售，则值将为null。

将查询出来的 store{1,2,3}都重命名为 store，然后使用 union将三个查询连接起来

select product_id, 'store1' as store, store1 price from Products where store1 is not nullunionselect product_id, 'store2' as store, store2 price from Products where store2 is not nullunionselect product_id, 'store3' as store, store3 price from Products where store3 is not null;

608. 树节点

给定一个表 tree，id 是树节点的编号， p_id 是它父节点的 id 。

+----+------+| id | p_id |+----+------+| 1  | null || 2  | 1    || 3  | 1    || 4  | 2    || 5  | 2    |+----+------+

使用CASE

如果 p_id为null，则节点为根节点

如果 id在 p_id里出现过，则为内部节点

没出现过的为叶子

select t.id, (    case    when t.p_id is null then 'Root'    when (        select count(*)        from tree t1 where t1.p_id = t.id    ) > 0 then 'Inner'    else 'Leaf'    end) as typefrom tree as t;

使用LEFT JOIN

将 id与 p_id进行左连接

如果 t1.p_id是空，则该节点是根节点

如果 t2.p_id是空，则说明 id没有在 p_id中出现过，即该节点是叶子

否则，是内部节点

select distinct t1.id, (    if(isnull(t1.p_id), 'Root', if(isnull(t2.p_id), 'Leaf', 'Inner'))) as typefrom tree t1left jointree t2 on t1.id = t2.p_id;

176. 第二高的薪水

Employee 表：

+-------------+------+| Column Name | Type |+-------------+------+| id          | int  || salary      | int  |+-------------+------+id 是这个表的主键。表的每一行包含员工的工资信息。

方法二和方法三注意使用 DISTINCT去重，因为最高的薪水可能不止一个

方法一

从去除掉最大薪水的剩余表中查询最大薪水

select max(salary) as SecondHighestSalary from Employeewhere salary not in (select max(salary) from Employee);

方法二

使用 limit 和 offset

offset表示要跳过的数据的数量

如果查询到的数据为空，用 ifnull将空数据变为null

select ifnull(    (        select distinct salary        from Employee        order by salary desc        limit 1 offset 1    ), null) as SecondHighestSalary;

方法三

使用临时表解决没有第二高工资的情况，对临时表进行选择，如果临时表是空表的话会返回null

select(    select distinct salary    from Employee    order by salary desc    limit 1 offset 1) as SecondHighestSalary;

Day5 合并

LEFT JOIN从左表（table1）返回所有的行，即使右表（table2）中没有匹配。如果右表中没有匹配，则结果为 NULL。

语法示例：

SELECT column_name(s)FROM table1LEFT JOIN table2ON table1.column_name=table2.column_name;

175. 组合两个表

表: Person

+-------------+---------+| 列名         | 类型     |+-------------+---------+| PersonId    | int     || FirstName   | varchar || LastName    | varchar |+-------------+---------+personId 是该表的主键列。该表包含一些人的 ID 和他们的姓和名的信息。

表: Address

+-------------+---------+| 列名         | 类型    |+-------------+---------+| AddressId   | int     || PersonId    | int     || City        | varchar || State       | varchar |+-------------+---------+addressId 是该表的主键列。该表的每一行都包含一个 ID = PersonId 的人的城市和州的信息。

直接使用左连接即可

select firstName, lastName, city, statefrom Personleft join Addresson Person.personId = Address.personId;

1581. 进店却未进行过交易的顾客

表：Visits

+-------------+---------+| Column Name | Type    |+-------------+---------+| visit_id    | int     || customer_id | int     |+-------------+---------+visit_id 是该表的主键。该表包含有关光临过购物中心的顾客的信息。

表：Transactions

+----------------+---------+| Column Name    | Type    |+----------------+---------+| transaction_id | int     || visit_id       | int     || amount         | int     |+----------------+---------+transaction_id 是此表的主键。此表包含 visit_id 期间进行的交易的信息。

使用左连接将 Visits表和 Transactions表连接，然后查询连接后的表里有多少个null

select customer_id, count(*) count_no_transfrom Visits vleft joinTransactions t on v.visit_id = t.visit_idwhere amount is nullgroup by customer_id;

1148. 文章浏览 I

Views 表：

+---------------+---------+| Column Name   | Type    |+---------------+---------+| article_id    | int     || author_id     | int     || viewer_id     | int     || view_date     | date    |+---------------+---------+此表无主键，因此可能会存在重复行。此表的每一行都表示某人在某天浏览了某位作者的某篇文章。请注意，同一人的 author_id 和 viewer_id 是相同的。

使用 DISTINCT或 GROUP BY均可

select distinct author_id as idfrom Viewswhere author_id = viewer_id# group by idorder by id asc;

Day6 合并197. 上升的温度

表： Weather

+---------------+---------+| Column Name   | Type    |+---------------+---------+| id            | int     || recordDate    | date    || temperature   | int     |+---------------+---------+id 是这个表的主键该表包含特定日期的温度信息

使用 DATEDIFF函数来判断两个日期的差值

可以使用 INNER JOIN连接，也可以直接select两个表：

select today.idfrom Weather today,Weather yesterday# 或者：# Weather today# inner join Weather yesterdaywhere datediff(today.recordDate, yesterday.recordDate) = 1 and today.Temperature > yesterday.Temperature;

607. 销售员

表: SalesPerson

+-----------------+---------+| Column Name     | Type    |+-----------------+---------+| sales_id        | int     || name            | varchar || salary          | int     || commission_rate | int     || hire_date       | date    |+-----------------+---------+sales_id 是该表的主键列。该表的每一行都显示了销售人员的姓名和 ID ，以及他们的工资、佣金率和雇佣日期。

使用 WHERE一直嵌套

select S.name as namefrom SalesPerson Swhere S.sales_id not in (    select O.sales_id    from Orders O    where O.com_id in    (        select C.com_id        from Company C        where C.name = 'RED'    ));

1141. 查询近30天活跃用户数

活动记录表：Activity

+---------------+---------+| Column Name   | Type    |+---------------+---------+| user_id       | int     || session_id    | int     || activity_date | date    || activity_type | enum    |+---------------+---------+该表是用户在社交网站的活动记录。该表没有主键，可能包含重复数据。activity_type 字段为以下四种值 ('open_session', 'end_session', 'scroll_down', 'send_message')。每个 session_id 只属于一个用户。

注意是 distinct user_id，因为一个用户可能会对应多个 session_id，datediff的时候要注意不小于0

select activity_date as day, count(distinct user_id) as active_usersfrom Activitywhere datediff('2019-07-27', activity_date) = 0group by activity_date;

Day7 统计去重1141. 查询近30天活跃用户数

活动记录表：Activity

+---------------+---------+| Column Name   | Type    |+---------------+---------+| user_id       | int     || session_id    | int     || activity_date | date    || activity_type | enum    |+---------------+---------+该表是用户在社交网站的活动记录。该表没有主键，可能包含重复数据。activity_type 字段为以下四种值 ('open_session', 'end_session', 'scroll_down', 'send_message')。每个 session_id 只属于一个用户。

注意是 distinct user_id，因为一个用户可能会对应多个 session_id，datediff的时候要注意不小于0

select activity_date as day, count(distinct user_id) as active_usersfrom Activitywhere datediff('2019-07-27', activity_date) = 0group by activity_date;

1693. 每天的领导和合伙人

表：DailySales

+-------------+---------+| Column Name | Type    |+-------------+---------+| date_id     | date    || make_name   | varchar || lead_id     | int     || partner_id  | int     |+-------------+---------+该表没有主键。该表包含日期、产品的名称，以及售给的领导和合伙人的编号。名称只包含小写英文字母。

select date_id, make_name, count(distinct lead_id) as unique_leads, count(distinct partner_id) as unique_partnersfrom DailySalesgroup by date_id, make_name;

1729. 求关注者的数量

表： Followers

+-------------+------+| Column Name | Type |+-------------+------+| user_id     | int  || follower_id | int  |+-------------+------+(user_id, follower_id) 是这个表的主键。该表包含一个关注关系中关注者和用户的编号，其中关注者关注用户。

select user_id, count(follower_id) as followers_countfrom Followersgroup by user_idorder by user_id asc;

Day8 计算函数586. 订单最多的客户

表: Orders

+-----------------+----------+| Column Name     | Type     |+-----------------+----------+| order_number    | int      || customer_number | int      |+-----------------+----------+Order_number是该表的主键。此表包含关于订单ID和客户ID的信息。

降序排序后用 limit 1选择出来第一个值，就是订单最多的用户

select customer_numberfrom Ordersgroup by customer_numberorder by count(*) desclimit 1;

511. 游戏玩法分析 I

活动表 Activity：

+--------------+---------+| Column Name  | Type    |+--------------+---------+| player_id    | int     || device_id    | int     || event_date   | date    || games_played | int     |+--------------+---------+表的主键是 (player_id, event_date)。这张表展示了一些游戏玩家在游戏平台上的行为活动。每行数据记录了一名玩家在退出平台之前，当天使用同一台设备登录平台后打开的游戏的数目（可能是 0 个）。

对 player_id进行分组，然后选择出每个id的最小的 event_date

使用排序后时间会比不用排序直接去最小值快将近100ms，chatgpt给出的解释是：

MySQL 中先排序再取最小值可能会变快的原因是，在数据表中有索引的情况下，如果在排序之前就取最小值，MySQL 的引擎会扫描整个表并在内存中对所有行进行排序，而如果先排序再取最小值，MySQL 的引擎只需要扫描索引并返回第一个索引值即可。
这个表现差异更明显的是在排序字段上有索引的情况下.
在这种情况下，MySQL 的引擎可以使用索引进行排序，而无需在内存中对所有行进行排序。因此查询速度会显著加快.

select player_id, min(event_date) as first_loginfrom Activitygroup by player_idorder by event_date asc;

1890. 2020年最后一次登录

表: Logins

+----------------+----------+| 列名           | 类型      |+----------------+----------+| user_id        | int      || time_stamp     | datetime |+----------------+----------+(user_id, time_stamp) 是这个表的主键。每一行包含的信息是user_id 这个用户的登录时间。

对 user_id进行分组，选出在2020年的最大登录时间

select user_id, max(time_stamp) as last_stampfrom Loginswhere time_stamp between '2020-01-01 0:0:0' and '2020-12-31 23:59:59'group by user_id;

1741. 查找每个员工花费的总时间

表: Employees

+-------------+------+| Column Name | Type |+-------------+------+| emp_id      | int  || event_day   | date || in_time     | int  || out_time    | int  |+-------------+------+(emp_id, event_day, in_time) 是这个表的主键。该表显示了员工在办公室的出入情况。event_day 是此事件发生的日期，in_time 是员工进入办公室的时间，而 out_time 是他们离开办公室的时间。in_time 和 out_time 的取值在1到1440之间。题目保证同一天没有两个事件在时间上是相交的，并且保证 in_time 小于 out_time。

select event_day as day, emp_id, sum(out_time - in_time) as total_timefrom Employeesgroup by emp_id, event_day;

Day9 控制流1393. 股票的资本损益

Stocks 表：

+---------------+---------+| Column Name   | Type    |+---------------+---------+| stock_name    | varchar || operation     | enum    || operation_day | int     || price         | int     |+---------------+---------+(stock_name, day) 是这张表的主键operation 列使用的是一种枚举类型，包括：('Sell','Buy')此表的每一行代表了名为 stock_name 的某支股票在 operation_day 这一天的操作价格。保证股票的每次'Sell'操作前，都有相应的'Buy'操作。

用 if判断一下，用 case也可以

select stock_name, sum(    if(operation = 'Buy', -1 * price, price)) as capital_gain_lossfrom Stocksgroup by stock_name;

1407. 排名靠前的旅行者

表：Users

+---------------+---------+| Column Name   | Type    |+---------------+---------+| id            | int     || name          | varchar |+---------------+---------+id 是该表单主键。name 是用户名字。

用 ifnull来将null变为0，order by可以排序多个字段

select name, ifnull(sum(distance), 0) as travelled_distancefrom Usersleft joinRides on Users.id = Rides.user_idgroup by user_idorder by travelled_distance desc, name asc;

1158. 市场分析 I

Table: Users

+----------------+---------+| Column Name    | Type    |+----------------+---------+| user_id        | int     || join_date      | date    || favorite_brand | varchar |+----------------+---------+此表主键是 user_id。表中描述了购物网站的用户信息，用户可以在此网站上进行商品买卖。

Table: Orders

+---------------+---------+| Column Name   | Type    |+---------------+---------+| order_id      | int     || order_date    | date    || item_id       | int     || buyer_id      | int     || seller_id     | int     |+---------------+---------+此表主键是 order_id。外键是 item_id 和（buyer_id，seller_id）。

Table: Items

+---------------+---------+| Column Name   | Type    |+---------------+---------+| item_id       | int     || item_brand    | varchar |+---------------+---------+此表主键是 item_id。

Items表是没有用的。

首先从 Orders表中选出在2019年买过商品的 buyer_id，然后用 group by分组，统计出来每个人买的次数，然后和 Users表进行连接

select user_id as buyer_id, join_date, ifnull(orders_in_2019, 0) as orders_in_2019from Users as Uleft join(    select buyer_id, count(*) as orders_in_2019    from Orders as O    where O.order_date between '2019-01-01' and '2019-12-31'    group by buyer_id) as ton t.buyer_id = U.user_id;

Day10 过滤182. 查找重复的电子邮箱

编写一个 SQL 查询，查找 Person 表中所有重复的电子邮箱。

示例：

+----+---------+| Id | Email   |+----+---------+| 1  | a@b.com || 2  | c@d.com || 3  | a@b.com |+----+---------+

使用GROUP BY

select Emailfrom(    select Email, count(*) as cnt    from Person    group by Email) as twhere t.cnt > 1;

使用GROUP BY和HAVING

select Emailfrom Persongroup by Emailhaving count(*) > 1;

1050. 合作过至少三次的演员和导演

ActorDirector 表：

+-------------+---------+| Column Name | Type    |+-------------+---------+| actor_id    | int     || director_id | int     || timestamp   | int     |+-------------+---------+timestamp 是这张表的主键.

使用GROUP BY和HAVING

select actor_id, director_idfrom ActorDirectorgroup by actor_id, director_idhaving count(*) >= 3;

1587. 银行账户概要 II

表: Users

+--------------+---------+| Column Name  | Type    |+--------------+---------+| account      | int     || name         | varchar |+--------------+---------+account 是该表的主键.表中的每一行包含银行里中每一个用户的账号.

表: Transactions

+---------------+---------+| Column Name   | Type    |+---------------+---------+| trans_id      | int     || account       | int     || amount        | int     || transacted_on | date    |+---------------+---------+trans_id 是该表主键.该表的每一行包含了所有账户的交易改变情况.如果用户收到了钱, 那么金额是正的; 如果用户转了钱, 那么金额是负的.所有账户的起始余额为 0.

使用左连接将两个表连接起来，然后对 account进行分组，计算账户余额，最后用 having选出余额大于一万的账户

select name,sum(amount) as balancefrom Users as Uleft joinTransactions as Ton U.account = T.accountgroup by T.accounthaving balance > 10000;

1084. 销售分析III

Table: Product

+--------------+---------+| Column Name  | Type    |+--------------+---------+| product_id   | int     || product_name | varchar || unit_price   | int     |+--------------+---------+Product_id是该表的主键。该表的每一行显示每个产品的名称和价格。

Table: Sales

+-------------+---------+| Column Name | Type    |+-------------+---------+| seller_id   | int     || product_id  | int     || buyer_id    | int     || sale_date   | date    || quantity    | int     || price       | int     |+------ ------+---------+这个表没有主键，它可以有重复的行。product_id 是 Product 表的外键。该表的每一行包含关于一个销售的一些信息。

注意是产品的所有销售时间都在第一个季度，所以要判断销售时间的最大值和最小值均在第一季度

select P.product_id, P.product_namefrom Product as Pleft joinSales as S on S.product_id = P.product_idgroup by S.product_idhaving (min(S.sale_date) between '2019-01-01' and '2019-03-31') and  (max(S.sale_date) between '2019-01-01' and '2019-03-31')

文章版权归作者所有，未经允许请勿转载。

THE END