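Tags
PostgreSQL, DDoS, denial of service, locks, connection slots
Background
To connect to a database, a client needs a free connection slot (the number of slots is set by max_connections) and must then pass authentication. If all the connection slots are occupied, or if a lock blocks the authentication step, new connections cannot get in, and the effect is similar to a DDoS attack.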
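As a rough illustration of the slot limit described above, the following query compares the number of client sessions with max_connections. It is a minimal sketch, assuming PostgreSQL 10 or later (the backend_type column does not exist in older versions). It should be run in autocommit mode, because an open transaction that has read pg_stat_activity is exactly what the bug report in this article describes as the trigger.

```sql
-- Sketch: how many connection slots are in use right now?
-- Run in autocommit mode (no explicit BEGIN), so that no lock outlives the statement.
SELECT count(*) AS client_backends,
       current_setting('max_connections')::int AS max_connections,
       current_setting('superuser_reserved_connections')::int AS reserved_for_superuser
FROM pg_stat_activity
WHERE backend_type = 'client backend';  -- column available in PostgreSQL 10 and later
```

When client_backends approaches max_connections minus reserved_for_superuser, ordinary users can no longer connect.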
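For attacks that exhaust the connection slots, and how to defend against them, see:
"PostgreSQL connection attacks (similar to DDoS)"
This article covers the other attack, in which locks block the authentication step. It is described in the following bug report: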
- BUG #15182: Canceling authentication due to timeout aka Denial of Service Attack
- From: PG Bug reporting form <noreply(at)PostgreSQL(dot)org>
- To: pgsql-bugs(at)lists(dot)PostgreSQL(dot)org
- Cc: lalbin(at)scharp(dot)org
- Subject: BUG #15182: Canceling authentication due to timeout aka Denial of Service Attack
- Date: 2018-04-30 20:41:11
- Message-ID: 152512087100.19803.12733865831237526317@wrigleys.PostgreSQL.org
- Lists: pgsql-bugs pgsql-hackers
- The following bug has been logged on the website:
- Bug reference: 15182
- Logged by: Lloyd Albin
- Email address: lalbin(at)scharp(dot)org
- PostgreSQL version: 10.3
- Operating system: OpenSUSE
- Description:
- Over the last several weeks our developers caused a Denial of Service Attack
- against ourselves by accident. When looking at the log files, I noticed that
- we had authentication timeouts during these time periods. In researching the
- problem I found this is due to locks being held on shared system catalog
- items, aka system catalog items that are shared between all databases on the
- same cluster/server. This can be caused by beginning a long running
- transaction that queries pg_stat_activity, pg_roles, pg_database, etc and
- then another connection that runs either a REINDEX DATABASE, REINDEX SYSTEM,
- or VACUUM FULL. This issue is of particular importance to database resellers
- who use the same cluster/server for multiple clients, as two clients can
- cause this issue to happen inadvertently or a single client can either cause
- it to happen maliciously or inadvertently. Note: The large cloud providers
- give each of their clients their own cluster/server so this will not affect
- across cloud clients but can affect an individual client. The problem is
- that traditional hosting companies will have all clients from one or more
- Web servers share the same PostgreSQL cluster/server. This means that one or
- two clients could inadvertently stop all the other clients from being able
- to connect to their databases until the first client does either a COMMIT or
- ROLLBACK of their transaction which they could hold open for hours, which is
- what happened to us internally.
- In Connection 1 we need to BEGIN a transaction and then query a shared
- system item; pg_authid, pg_database, etc; or a view that depends on a shared
- system item; pg_stat_activity, pg_roles, etc. Our developers were accessing
- pg_roles.
- Connection 1 (Any database, Any User)
- BEGIN;
- SELECT * FROM pg_stat_activity;
- Connection 2 (Any database will do as long as you are the database owner)
- REINDEX DATABASE postgres;
- Connection 3 (Any Database, Any User)
- psql -h sqltest-alt -d sandbox
- All future Connection 3's will hang for however long the transaction in
- Connection 1 runs. In our case this was hours and denied everybody else the
- ability to log into the server until Connection 1 was committed. psql will
- just hang for hours, even overnight in my testing, but our apps would get
- the "Canceling authentication due to timeout" after 1 minute.
- Connection 2 can also do any of these commands to also cause the same
- issue:
- REINDEX SYSTEM postgres;
- VACUUM FULL pg_authid;
- vacuumdb -f -h sqltest-alt -d lloyd -U lalbin
- Even worse is that the VACUUM FULL pg_authid; can be started by an
- unprivileged user and it will wait for the AccessShareLock by connection 1
- to be released before returning the error that you don't have permission to
- perform this action, so even an unprivileged user can cause this to happen.
- The privilege check needs to happen before the waiting for the
- AccessExclusiveLock happens.
- This bug report has been simplified and shorted drastically. To read the
- full information about this issue please see my blog post:
- http://lloyd.thealbins.com/Canceling authentication due to timeout
- Lloyd Albin
- Database Administrator
- Statistical Center for HIV/AIDS Research and Prevention (SCHARP)
- Fred Hutchinson Cancer Research Center
The steps in the report above reproduce the problem.
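While the problem is being reproduced, a session that is already connected can look for the lock chain. The query below is a sketch, assuming PostgreSQL 9.6 or later (pg_blocking_pids() was added in 9.6). It lists each session that is waiting for a lock together with the PIDs blocking it. New connections stuck in authentication may not appear in pg_stat_activity, but the waiting REINDEX or VACUUM FULL session and the transaction it is waiting for should. The PID in the last comment is a hypothetical placeholder.

```sql
-- Sketch: who is waiting for a lock, and who is blocking them?
-- Run in autocommit mode; do not leave this query inside an open transaction.
SELECT a.pid,
       a.usename,
       a.state,
       a.xact_start,
       pg_blocking_pids(a.pid) AS blocked_by,   -- PIDs this session is waiting for
       left(a.query, 60)       AS query
FROM pg_stat_activity AS a
WHERE cardinality(pg_blocking_pids(a.pid)) > 0
ORDER BY a.xact_start;

-- Once the blocking session has been identified (12345 is a placeholder PID),
-- a superuser can end it, which rolls back its open transaction:
-- SELECT pg_terminate_backend(12345);
```

In the scenario from the report, blocked_by for the REINDEX session would point at Connection 1, the long-running transaction that queried pg_stat_activity.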
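Prevention
1. To defend against attacks that exhaust the connection slots: (1) set the authentication timeout parameter (authentication_timeout); (2) do not listen on a public network interface; (3) put a network-layer firewall in front of the server.
2. To defend against lock-based attacks (which are usually unintentional), set a lock timeout or a statement timeout before running any SQL that takes a heavy lock, so that it waits as briefly as possible. Either lock_timeout or statement_timeout works.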
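The two recommendations above can be sketched as follows. The timeout values and the role name maintenance_user are illustrative assumptions, not values from the original article. VACUUM FULL, REINDEX DATABASE and ALTER SYSTEM cannot run inside a transaction block, so the sketch uses session-level SET rather than SET LOCAL.

```sql
-- Item 2: bound the lock wait before a statement that needs a heavy lock.
SET lock_timeout = '5s';          -- give up if the lock is not granted within 5 seconds
SET statement_timeout = '10min';  -- upper bound on total run time, including lock waits
VACUUM FULL pg_authid;            -- fails fast with a lock timeout error instead of queueing for hours
RESET lock_timeout;
RESET statement_timeout;

-- Optionally make a lock timeout the default for a maintenance role
-- (maintenance_user is a hypothetical role name).
ALTER ROLE maintenance_user SET lock_timeout = '5s';

-- Item 1: shorten the authentication timeout (the default is 1 minute).
-- Requires superuser; takes effect after a configuration reload.
ALTER SYSTEM SET authentication_timeout = '10s';
SELECT pg_reload_conf();
```

With lock_timeout set, the heavy-lock statement still blocks new logins while it waits, but only for the few seconds of the timeout rather than for the lifetime of the long-running transaction.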
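References
"PostgreSQL lock-wait monitoring: a keeper SQL query for finding who is blocking whom"
"Setting an execution timeout for a single SQL statement in PostgreSQL to prevent an avalanche"
"How to prevent a database avalanche (flood)"
"PostgreSQL connection attacks (similar to DDoS)"
Source: https://yq.aliyun.com/articles/700362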