add tests for h3Line
Activity
Felixoid push ClickHouse/ClickHouse
commit sha: c6e2dd1c431c0d6dfc086e651d4deed9cd33c260
push time: 1 minute ago
gingerwizard push ClickHouse/clickhouse-go
commit sha: e14539c9615b764111ea6925e804aa1cec9ccd08
push time: 6 minutes ago
leegean issue comment ClickHouse/ClickHouse
About ClickHouse's memory management mechanism
Describe the issue: My server has 32 GB of memory, and for experimental purposes I set max_server_memory_usage_to_ram_ratio to 0.2. Everything else is default, but with only occasional ClickHouse queries, memory keeps growing; old data in memory is not cleaned up to free space for more urgent query processing requests.
Questions: 1. What is ClickHouse's memory reclamation mechanism? 2. When is memory reclaimed? 3. How do I solve this problem?
Error message and/or stacktrace
2022.05.12 15:23:26.745074 [ 1106767 ] {} <Error> void DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(DB::TaskRuntimeDataPtr) [Queue = DB::MergeMutateRuntimeQueue]: Code: 241. DB::Exception: Memory limit (total) exceeded: would use 6.25 GiB (attempt to allocate chunk of 4200879 bytes), maximum: 6.25 GiB. (MEMORY_LIMIT_EXCEEDED), Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xaebed1a in /usr/bin/clickhouse
1. DB::Exception::Exception<char const*, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, long&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >(int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*&&, char const*&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&, long&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&) @ 0xaed6d0c in /usr/bin/clickhouse
2. MemoryTracker::allocImpl(long, bool, MemoryTracker*) @ 0xaed6904 in /usr/bin/clickhouse
3. DB::MarksInCompressedFile::MarksInCompressedFile(unsigned long) @ 0x155828a5 in /usr/bin/clickhouse
4. DB::MergeTreeMarksLoader::loadMarksImpl() @ 0x15581865 in /usr/bin/clickhouse
5. DB::MergeTreeMarksLoader::loadMarks() @ 0x15580de8 in /usr/bin/clickhouse
6. DB::MergeTreeReaderCompact::getReadBufferSize(std::__1::shared_ptr<DB::IMergeTreeDataPart const> const&, DB::MergeTreeMarksLoader&, std::__1::vector<std::__1::optional<unsigned long>, std::__1::allocator<std::__1::optional<unsigned long> > > const&, std::__1::deque<DB::MarkRange, std::__1::allocator<DB::MarkRange> > const&) @ 0x1557970e in /usr/bin/clickhouse
7. DB::MergeTreeReaderCompact::MergeTreeReaderCompact(std::__1::shared_ptr<DB::MergeTreeDataPartCompact const>, DB::NamesAndTypesList, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::UncompressedCache*, DB::MarkCache*, std::__1::deque<DB::MarkRange, std::__1::allocator<DB::MarkRange> >, DB::MergeTreeReaderSettings, std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, double, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, double> > >, std::__1::function<void (DB::ReadBufferFromFileBase::ProfileInfo)> const&, int) @ 0x1557877d in /usr/bin/clickhouse
8. DB::MergeTreeDataPartCompact::getReader(DB::NamesAndTypesList const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::__1::deque<DB::MarkRange, std::__1::allocator<DB::MarkRange> > const&, DB::UncompressedCache*, DB::MarkCache*, DB::MergeTreeReaderSettings const&, std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, double, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, double> > > const&, std::__1::function<void (DB::ReadBufferFromFileBase::ProfileInfo)> const&) const @ 0x154e4e6c in /usr/bin/clickhouse
9. DB::MergeTreeSequentialSource::MergeTreeSequentialSource(DB::MergeTreeData const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::__1::shared_ptr<DB::IMergeTreeDataPart const>, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, bool, bool, bool) @ 0x1559017c in /usr/bin/clickhouse
10. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::createMergedStream() @ 0x15436af6 in /usr/bin/clickhouse
11. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::prepare() @ 0x15434996 in /usr/bin/clickhouse
12. bool std::__1::__function::__policy_invoker<bool ()>::__call_impl<std::__1::__function::__default_alloc_func<DB::MergeTask::ExecuteAndFinalizeHorizontalPart::subtasks::'lambda'(), bool ()> >(std::__1::__function::__policy_storage const*) @ 0x15442bc9 in /usr/bin/clickhouse
13. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::execute() @ 0x1543940b in /usr/bin/clickhouse
14. DB::MergeTask::execute() @ 0x1543e3ba in /usr/bin/clickhouse
15. DB::MergePlainMergeTreeTask::executeStep() @ 0x1542ffac in /usr/bin/clickhouse
16. DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(std::__1::shared_ptr<DB::TaskRuntimeData>) @ 0xae95feb in /usr/bin/clickhouse
17. DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::threadFunction() @ 0xae95c39 in /usr/bin/clickhouse
18. ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) @ 0xaf6546a in /usr/bin/clickhouse
19. ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()&&...)::'lambda'()::operator()() @ 0xaf674a4 in /usr/bin/clickhouse
20. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0xaf62837 in /usr/bin/clickhouse
21. ? @ 0xaf662fd in /usr/bin/clickhouse
22. ? @ 0x7ff75cf08609 in ?
23. clone @ 0x7ff75ce2d163 in ?
(version 22.2.2.1)
In the attached memory-usage chart, the overall memory footprint shows a linear upward trend; the drops to zero are points where I rebooted the server.
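A minimal SQL sketch (not part of the original report) for checking how much memory the server is tracking and for freeing the mark/uncompressed caches that appear in the stack trace above; whether this helps depends on the workload:
-- total memory currently tracked by the server
SELECT formatReadableSize(value) AS memory_tracking FROM system.metrics WHERE metric = 'MemoryTracking';
-- mark cache and uncompressed cache sizes (these are kept in memory by design)
SELECT metric, formatReadableSize(value) FROM system.asynchronous_metrics WHERE metric IN ('MarkCacheBytes', 'UncompressedCacheBytes');
-- both caches can be dropped manually; subsequent queries will refill them
SYSTEM DROP MARK CACHE;
SYSTEM DROP UNCOMPRESSED CACHE;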
den-crane issue ClickHouse/ClickHouse
New record was ignored with ReplicatedCollapsingMergeTree engine
With the ReplicatedCollapsingMergeTree engine, I insert one new record with the same key to update an old record in the database, but the new record is ignored (this is recorded in the log). It works fine with the CollapsingMergeTree engine, without Replicated.
What is the suggested way to avoid this? Thanks!
We use ClickHouse Keeper.
ClickHouse version: 22.1.3.7
{54dd3654-493e-477a-8c35-7b7d07e3774e}
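A likely explanation (an assumption, not a confirmed diagnosis) is the insert deduplication that Replicated* tables apply to byte-identical inserted blocks; a minimal SQL sketch of the usual workaround, using a hypothetical table name:
-- check whether insert deduplication is enabled in this session
SELECT name, value FROM system.settings WHERE name = 'insert_deduplicate';
-- disable it before re-inserting data that is byte-identical to an earlier insert block
SET insert_deduplicate = 0;
INSERT INTO my_replicated_table VALUES (1, -1); -- my_replicated_table is a hypothetical name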
den-crane issue comment ClickHouse/ClickHouse
New record was ignored with ReplicatedCollapsingMergeTree engine
rvasin wants to merge ClickHouse/ClickHouse
Add total_max_threads parameter
Changelog category:
- New Feature
Changelog entry
Add the total_max_threads parameter to improve performance under high RPS by limiting the total number of threads used by all queries.
Closes #36551
See the attached article for details: article.pdf
I did not understand: should I rename the parameter total_max_threads to max_threads_for_all_queries or not (and in which places)? If we decide to rename, I would rename it everywhere to keep the naming consistent.
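For context, the existing knob is the per-query max_threads setting; the parameter in this PR would cap the total across all queries. A minimal SQL sketch of the current setting (not the proposed one):
-- current per-query thread limit
SELECT name, value FROM system.settings WHERE name = 'max_threads';
-- lower it for a single query, e.g. under a high request rate
SELECT sum(number) FROM numbers_mt(10000000) SETTINGS max_threads = 2;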
rvasin merge to ClickHouse/ClickHouse
Add total_max_threads parameter
taotaizhu-pw issue ClickHouse/ClickHouse
New record was ignored with ReplicatedCollapsingMergeTree engine
With the ReplicatedCollapsingMergeTree engine, I insert one new record with the same key as an old record in the database, but the new record is ignored (this is recorded in the log). It works fine with the CollapsingMergeTree engine, without Replicated.
What is the suggested way to avoid this? Thanks!
We use ClickHouse Keeper.
ClickHouse version: 22.1.3.7
{54dd3654-493e-477a-8c35-7b7d07e3774e}
dependabot[bot] in ClickHouse/clickhouse-go delete branch dependabot/go_modules/github.com/paulmach/orb-0.7.1
gingerwizard push ClickHouse/clickhouse-go
commit sha: 08fd610ece23b266a833a01cd5730e5b2c4d1a15
push time: 39 minutes ago
gingerwizard pull request ClickHouse/clickhouse-go
Bump github.com/paulmach/orb from 0.7.0 to 0.7.1
Bumps github.com/paulmach/orb from 0.7.0 to 0.7.1.
Release notes
Sourced from github.com/paulmach/orb's releases.
v0.7.1
v0.7.0 initially pointed to the wrong commit. After moving the tag there are some caching issues in GitHub actions. I hope this clears up the issue.
Changelog
Sourced from github.com/paulmach/orb's changelog.
v0.7.1 - 2022-05-16
No changes
The v0.7.0 tag was updated since it initially pointed to the wrong commit. This is causing caching issues.
Commits
6f098c1 Update change log for v0.7.1
See full diff in compare view
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- @dependabot rebase will rebase this PR
- @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
- @dependabot merge will merge this PR after your CI passes on it
- @dependabot squash and merge will squash and merge this PR after your CI passes on it
- @dependabot cancel merge will cancel a previously requested merge and block automerging
- @dependabot reopen will reopen this PR if it is closed
- @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
gingerwizard push ClickHouse/clickhouse-go
commit sha: 1e74d356c65da6e6991742480ed6d888d5988da9
push time: 39 minutes ago
azat wants to merge ClickHouse/ClickHouse
Add total_max_threads parameter
Not only disk-bound: a thread may just poll something once per millisecond or so.
azat merge to ClickHouse/ClickHouse
Add total_max_threads parameter
azat wants to merge ClickHouse/ClickHouse
Add total_max_threads parameter
To me, functions can be left as they are now, since this is already attached to processes, so simply total_max_threads is fine.
azat merge to ClickHouse/ClickHouse
Add total_max_threads parameter
mergify[bot] push ClickHouse/ClickHouse
commit sha: d5f870eac8e459362332f5da56aa3e485d924c19
push time: 41 minutes ago
mergify[bot] issue comment ClickHouse/ClickHouse
Multiple client connection attempts if hostname resolves to multiple addresses
Changelog category (leave one):
- Bug Fix (user-visible misbehavior in official stable or prestable release)
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
The client will try every IP address returned by DNS resolution until a connection succeeds.
closes #6698
update
✅ Branch has been successfully updated
yakov-olkhovskiy issue comment ClickHouse/ClickHouse
Multiple client connection attempts if hostname resolves to multiple addresses
KochetovNicolai merge to ClickHouse/ClickHouse
Speed up test 00157_cache_dictionary
Changelog category (leave one):
- Not for changelog (changelog entry is not required)
KochetovNicolai push ClickHouse/ClickHouse
commit sha: a19d4c6f1fc49bfb4964c7288ba9373ff908306a
push time: 46 minutes ago
KochetovNicolai pull request ClickHouse/ClickHouse
tests/integration: fix possible race for iptables user rules inside containers
Changelog category (leave one):
- Not for changelog (changelog entry is not required)
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
tests/integration: fix possible race for iptables user rules inside containers
TL;DR:
It is possible for the network PartitionManager to work incorrectly because of how docker sets up forwarding to the DOCKER-USER chain: it first removes the forward rule and then adds it back (see 1 and 2). This introduces a race for a short period of time, which is enough for TCP to retransmit packets and break the network PartitionManager.
Here are some details from logs for 3:
2022-04-27 03:01:00 [ 621 ] DEBUG : Executing query SELECT node FROM distributed_table ORDER BY node on node2 (cluster.py:2879, query_and_get_error)
This query fails, from the server logs:
2022.04.27 03:01:00.213101 [ 10 ] {19b1719f-8c39-4e3e-b782-aa4c933650f2} <Debug> executeQuery: (from 172.16.5.1:59008) SELECT node FROM distributed_table ORDER BY node
...
2022.04.27 03:01:03.578439 [ 223 ] {19b1719f-8c39-4e3e-b782-aa4c933650f2} <Debug> Connection (node1:9000): Sent data for 2 scalars, total 2 rows in 0.000284672 sec., 6993 rows/sec., 68.00 B (232.15 KiB/sec.), compressed 0.4594594594594595 times to 148.00 B (505.16 KiB/sec.)
2022.04.27 03:01:03.590637 [ 223 ] {19b1719f-8c39-4e3e-b782-aa4c933650f2} <Debug> MergingSortedTransform: Merge sorted 3 blocks, 2 rows in 3.371592744 sec., 0.5931914533744174 rows/sec., 94.61 B/sec
2022.04.27 03:01:03.601256 [ 10 ] {19b1719f-8c39-4e3e-b782-aa4c933650f2} <Information> executeQuery: Read 2 rows, 28.00 B in 3.387950542 sec., 0 rows/sec., 8.26 B/sec.
2022.04.27 03:01:03.601894 [ 10 ] {19b1719f-8c39-4e3e-b782-aa4c933650f2} <Debug> MemoryTracker: Peak memory usage (for query): 334.38 KiB.
And from docker daemon log:
time="2022-04-27T03:00:59.916693113Z" level=debug msg="form data: {\"AttachStderr\":true,\"AttachStdin\":false,\"AttachStdout\":true,\"Cmd\":[\"iptables\",\"--wait\",\"-I\",\"DOCKER-USER\",\"1\",\"-p\",\"tcp\",\"-s\",\"172.16.5.2\",\"-d\",\"172.16.5.3\",\"-j\",\"DROP\"],\"Container\":\"b75f3b68cda51386bfbb9cceb67e92c4d217a5a1660bde2470b583cb1f4c7fc4\",\"Privileged\":true,\"Tty\":false,\"User\":\"\"}"
time="2022-04-27T03:01:00.030654116Z" level=debug msg="form data: {\"AttachStderr\":true,\"AttachStdin\":false,\"AttachStdout\":true,\"Cmd\":[\"iptables\",\"--wait\",\"-I\",\"DOCKER-USER\",\"1\",\"-p\",\"tcp\",\"-s\",\"172.16.5.3\",\"-d\",\"172.16.5.2\",\"-j\",\"DROP\"],\"Container\":\"b75f3b68cda51386bfbb9cceb67e92c4d217a5a1660bde2470b583cb1f4c7fc4\",\"Privileged\":true,\"Tty\":false,\"User\":\"\"}"
...
time="2022-04-27T03:01:03.515813984Z" level=debug msg="/usr/sbin/iptables, [--wait -t filter -n -L DOCKER-USER]"
time="2022-04-27T03:01:03.531106486Z" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C DOCKER-USER -j RETURN]"
time="2022-04-27T03:01:03.535442346Z" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -j DOCKER-USER]"
time="2022-04-27T03:01:03.555856911Z" level=debug msg="/usr/sbin/iptables, [--wait -D FORWARD -j DOCKER-USER]"
time="2022-04-27T03:01:03.564905764Z" level=debug msg="/usr/sbin/iptables, [--wait -I FORWARD -j DOCKER-USER]"
...
time="2022-04-27T03:01:03.706374466Z" level=debug msg="form data: {\"AttachStderr\":true,\"AttachStdin\":false,\"AttachStdout\":true,\"Cmd\":[\"iptables\",\"--wait\",\"-D\",\"DOCKER-USER\",\"-p\",\"tcp\",\"-s\",\"172.16.5.3\",\"-d\",\"172.16.5.2\",\"-j\",\"DROP\"],\"Container\":\"b75f3b68cda51386bfbb9cceb67e92c4d217a5a1660bde2470b583cb1f4c7fc4\",\"Privileged\":true,\"Tty\":false,\"User\":\"\"}"
time="2022-04-27T03:01:03.968077970Z" level=debug msg="form data: {\"AttachStderr\":true,\"AttachStdin\":false,\"AttachStdout\":true,\"Cmd\":[\"iptables\",\"--wait\",\"-D\",\"DOCKER-USER\",\"-p\",\"tcp\",\"-s\",\"172.16.5.2\",\"-d\",\"172.16.5.3\",\"-j\",\"DROP\"],\"Container\":\"b75f3b68cda51386bfbb9cceb67e92c4d217a5a1660bde2470b583cb1f4c7fc4\",\"Privileged\":true,\"Tty\":false,\"User\":\"\"}"
I've tried multiple ways of fixing this:
- Creating a separate chain for the rules from PartitionManager (DOCKER-USER-CLICKHOUSE). But it is created only once, and docker places new rules on top of the FORWARD chain, so it will not work, since the separate chain will not receive any packets.
- Using DOCKER-USER, but replacing iptables with a wrapper ([script]) that ignores recreation of the rule that forwards to DOCKER-USER. But this will not work either, since new docker rules will be created on top of the FORWARD chain, and so DOCKER-USER will not receive any packets.
[script]:
if [[ "$*" =~ "-D FORWARD -j DOCKER-USER" ]]; then
exit 0
fi
if [[ "$*" =~ "-I FORWARD -j DOCKER-USER" ]]; then
if iptables.real iptables -C FORWARD -j DOCKER-USER; then
exit 0
fi
fi
- In the end, the only way to avoid flakiness for this case is to forbid parallel execution for tests that use PartitionManager.
Fixes: #36541 (fixes the first problem; everything else had already been fixed). Refs: https://github.com/moby/moby/pull/43585
More CI:
- https://s3.amazonaws.com/clickhouse-test-reports/36979/02bd5f6542e1f7a6dfec955322b290d681837bcf/integration_tests__asan__actions__[2/3].html
- https://s3.amazonaws.com/clickhouse-test-reports/36295/314d553ab14d30df7508814513506ec09c7c7061/integration_tests__asan__actions__[2/3]/integration_run_parallel1_0.log
KochetovNicolai issue ClickHouse/ClickHouse
Weirdness with timeouts in StorageDistributed
The test simply breaks connectivity between two nodes and checks that the connection timeout in StorageDistributed works:
https://github.com/ClickHouse/ClickHouse/blob/3246261da8a3152bc0ecf0c8855120454d76877f/tests/integration/test_distributed_respect_user_timeouts/test.py#L174-L187
But it seems like something went wrong and PartitionManager did not break connectivity completely, so node2 successfully connected to node1 and sent some data (the following log is from node2):
2022.04.20 18:56:39.566138 [ 224 ] {224b5ee7-ad99-4575-98bb-f7dab9bbfb87} <Trace> Connection (node1:9000): Connecting. Database: (not specified). User: default
2022.04.20 18:56:42.569127 [ 224 ] {224b5ee7-ad99-4575-98bb-f7dab9bbfb87} <Warning> HedgedConnectionsFactory: Connection failed at try №1, reason: Code: 209. DB::NetException: Timeout: connect timed out: 172.16.5.3:9000 (node1:9000, receive timeout 0 ms, send timeout 0 ms). (SOCKET_TIMEOUT) (version 22.4.1.2246)
2022.04.20 18:56:42.569275 [ 224 ] {224b5ee7-ad99-4575-98bb-f7dab9bbfb87} <Trace> Connection (node1:9000): Connecting. Database: (not specified). User: default
2022.04.20 18:56:42.571173 [ 224 ] {224b5ee7-ad99-4575-98bb-f7dab9bbfb87} <Trace> Connection (node1:9000): Connected to ClickHouse server version 22.4.1.
2022.04.20 18:56:42.572354 [ 224 ] {224b5ee7-ad99-4575-98bb-f7dab9bbfb87} <Debug> Connection (node1:9000): Sent data for 2 scalars, total 2 rows in 0.000386693 sec., 5157 rows/sec., 68.00 B (171.22 KiB/sec.), compressed 0.4594594594594595 times to 148.00 B (372.59 KiB/sec.)
Then it hung for ~17 minutes (with no log messages) and finally failed:
2022.04.20 19:13:09.901008 [ 225 ] {224b5ee7-ad99-4575-98bb-f7dab9bbfb87} <Trace> StorageDistributed (distributed_table): () Cancelling query
2022.04.20 19:13:09.919001 [ 10 ] {224b5ee7-ad99-4575-98bb-f7dab9bbfb87} <Error> executeQuery: Code: 209. DB::NetException: Timeout exceeded while reading from socket (172.16.5.3:9000, 300000 ms): while receiving packet from node1:9000: While executing Remote. (SOCKET_TIMEOUT) (version 22.4.1.2246) (from 172.16.5.1:35710) (in query: SELECT node FROM distributed_table ORDER BY node), Stack trace (when copying this message, always include the lines below):
0. ./build_docker/../contrib/libcxx/include/exception:133: Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x391486e9 in /usr/bin/clickhouse
1. ./build_docker/../src/Common/Exception.cpp:58: DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0xd5890e8 in /usr/bin/clickhouse
2. ./build_docker/../src/Common/NetException.h:12: DB::ReadBufferFromPocoSocket::nextImpl() @ 0x2b925dc9 in /usr/bin/clickhouse
3. ./build_docker/../src/IO/ReadBuffer.h:86: void DB::readVarUIntImpl<false>(unsigned long&, DB::ReadBuffer&) @ 0xd65a87e in /usr/bin/clickhouse
4. ./build_docker/../src/IO/VarInt.h:0: DB::Connection::receivePacket() @ 0x2e10b03b in /usr/bin/clickhouse
5. ./build_docker/../src/Client/PacketReceiver.h:0: DB::PacketReceiver::Routine::operator()(boost::context::fiber&&) @ 0x2e157165 in /usr/bin/clickhouse
6. ./build_docker/../contrib/libcxx/include/__utility/swap.h:36: boost::context::detail::fiber_capture_record<boost::context::fiber, FiberStack&, DB::PacketReceiver::Routine>::run() @ 0x2e156321 in /usr/bin/clickhouse
7. ./build_docker/../contrib/boost/boost/context/fiber_ucontext.hpp:74: void boost::context::detail::fiber_entry_func<boost::context::detail::fiber_capture_record<boost::context::fiber, FiberStack&, DB::PacketReceiver::Routine> >(void*) @ 0x2e150eab in /usr/bin/clickhouse
The weird things are:
- How did node2 connect to node1 when PartitionManager was active? OK, it adds only one rule that drops only packets from node1 to node2, but I thought TCP requires some acknowledgment to establish a connection. Maybe something is wrong with the PartitionManager/iptables rules in our integration tests environment.
- Why did it take ~17 minutes to get "Timeout exceeded while reading from socket" when the timeout is 300000 ms (5 minutes)? Also, 17 is not divisible by 5.
- Why did IConnections::dumpAddresses return an empty string? (StorageDistributed (distributed_table): () Cancelling query)
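For reference, the user-level timeouts this test exercises can be inspected and tightened per query; a minimal SQL sketch (setting names as in current releases, not taken from the test itself):
-- receive_timeout defaults to 300 seconds, matching the 300000 ms in the error above
SELECT name, value FROM system.settings WHERE name IN ('connect_timeout_with_failover_ms', 'receive_timeout', 'send_timeout');
-- tighten them for a single query so an unreachable replica fails fast
SELECT node FROM distributed_table ORDER BY node SETTINGS connect_timeout_with_failover_ms = 1000, receive_timeout = 10;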
KochetovNicolai merge to ClickHouse/ClickHouse
tests/integration: fix possible race for iptables user rules inside containers
LGTM, let's execute those tests in-order
vdimir wants to merge ClickHouse/ClickHouse
Release without prestable
Changelog category (leave one):
- Not for changelog (changelog entry is not required)
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
- Do not create prestable release
- to be continued
Relates to #34900
Seems self._create_gh_release(True) is not called anymore, so can we remove the argument?
vdimir merge to ClickHouse/ClickHouse
Release without prestable
So we want to get rid of prestable and have testing and stable. Did I get it right?
zhicwu pull request ClickHouse/clickhouse-jdbc
Add CLI client
clickhouse-cli-client is a wrapper of the ClickHouse native command-line client.
alesapin push ClickHouse/ClickHouse
commit sha: 19462bdf9e96fd1271a96e827f683c656d907a56
push time: 51 minutes ago
arthurpassos pull request ClickHouse/clickhouse-cpp
Add empty arrays to LC(Array) existing unit test
In https://github.com/ClickHouse/clickhouse-cpp/issues/178 it was reported that empty arrays in LC(Array) were crashing the client. This PR adds empty arrays to the existing unit test.
Not closing the issue because I am waiting for more information from the OP.
add h3Line func