Search before asking
- I had searched in the issues and found no similar issues.
Version
doris-3.1.4-rc02-7f5ba43de6
What's Wrong?
When a query touches more partitions than num_partitions_in_batch_mode, it fails after waiting 30 seconds with the following error:
ERROR 1105 (HY000): errCode = 2, detailMessage = Failed to get first split after waiting for 30 seconds.
The executor returned by Env.getCurrentEnv().getExtMetaCacheMgr().getScheduleExecutor() contains a number of threads (more than max_external_cache_loader_thread_pool_size, accumulated from historical runs). Each of them is stuck in an infinite loop in org.apache.doris.datasource.SplitAssignment#appendBatch, repeatedly calling queue.offer on a queue that is already full. This may be caused by a query that terminated abnormally, but I could not determine which query terminated or why.
private void appendBatch(Multimap<Backend, Split> batch) throws UserException {
    for (Backend backend : batch.keySet()) {
        // ...
        while (needMoreSplit()) {
            BlockingQueue<Collection<TScanRangeLocations>> queue =
                    assignment.computeIfAbsent(backend, be -> new LinkedBlockingQueue<>(10000));
            try {
                if (queue.offer(locations, 100, TimeUnit.MILLISECONDS)) {
                    break;
                }
            } catch (InterruptedException e) {
                addUserException(new UserException("Failed to offer batch split by interrupted", e));
            }
        }
    }
}
"NotCheckpointscheduleExecutor-0" Id=4862 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@164e81e
    at java.base@17.0.15/jdk.internal.misc.Unsafe.park(Native Method)
    - waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@164e81e
    at java.base@17.0.15/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:252)
    at java.base@17.0.15/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1679)
    at java.base@17.0.15/java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:378)
    at app//org.apache.doris.datasource.SplitAssignment.appendBatch(Unknown Source)
    at app//org.apache.doris.datasource.SplitAssignment.addToQueue(Unknown Source)
    at app//org.apache.doris.datasource.hive.source.HiveScanNode.lambda$startSplit$0(Unknown Source)
    at app//org.apache.doris.datasource.hive.source.HiveScanNode$$Lambda$4683/0x00007f1235a97240.run(Unknown Source)
    at java.base@17.0.15/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
    at java.base@17.0.15/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base@17.0.15/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base@17.0.15/java.lang.Thread.run(Thread.java:840)
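The pattern behind this stack trace can be shown in isolation. The following is only a sketch and not Doris code: StuckOfferDemo, needMore, and startProducer are hypothetical names, and needMore stands in for needMoreSplit(). It assumes the queue is full and that no consumer ever drains it, which is what seems to happen once the query is gone. The producer then keeps retrying a timed offer until the flag changes.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class StuckOfferDemo {
    // Hypothetical stand-in for needMoreSplit(): stays true until someone clears it.
    static final AtomicBoolean needMore = new AtomicBoolean(true);

    // Starts a thread that retries a timed offer for as long as needMore is true.
    static Thread startProducer(BlockingQueue<String> queue) {
        Thread t = new Thread(() -> {
            while (needMore.get()) {
                try {
                    if (queue.offer("split", 100, TimeUnit.MILLISECONDS)) {
                        break; // enqueued, done
                    }
                } catch (InterruptedException e) {
                    // Like the loop in the report, this does not exit on interrupt.
                }
            }
        }, "producer");
        t.setDaemon(true); // do not keep the JVM alive if the loop never ends
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1);
        queue.add("occupied"); // queue is now full and nobody consumes from it

        Thread producer = startProducer(queue);
        Thread.sleep(500);
        System.out.println("alive while flag is set: " + producer.isAlive());

        needMore.set(false); // what a finished query would be expected to trigger
        producer.join(1000);
        System.out.println("alive after flag is cleared: " + producer.isAlive());
    }
}
```

In this sketch the producer should still be alive after the first sleep and should exit shortly after the flag is cleared, which matches the first expectation listed under "What You Expected?".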
What You Expected?
After the query completes (normally or abnormally), needMoreSplit() should return false so that the loop in appendBatch exits.
Alternatively, appendBatch should enforce an overall timeout.
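One possible shape for the timeout variant is sketched below. This is not the Doris implementation: BoundedOffer, offerWithDeadline, stillNeeded, and timeoutMs are hypothetical names chosen for illustration, and the 100 ms retry slice is simply copied from the snippet above. The idea is to keep the periodic re-check of the "still needed" condition, cap the total wait with a deadline, and let InterruptedException propagate instead of continuing the loop.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

class BoundedOffer {
    /**
     * Retries a timed offer until the item is enqueued, stillNeeded returns false,
     * or timeoutMs has elapsed. Returns true only if the item was enqueued.
     */
    static <T> boolean offerWithDeadline(BlockingQueue<T> queue, T item,
            BooleanSupplier stillNeeded, long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        long sliceNanos = TimeUnit.MILLISECONDS.toNanos(100);
        while (stillNeeded.getAsBoolean()) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return false; // overall timeout reached, give up
            }
            // Wait in short slices so stillNeeded is re-checked regularly.
            if (queue.offer(item, Math.min(remaining, sliceNanos), TimeUnit.NANOSECONDS)) {
                return true;
            }
        }
        return false; // the consumer no longer needs splits
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> full = new LinkedBlockingQueue<>(1);
        full.add("occupied"); // full queue with no consumer

        boolean enqueued = offerWithDeadline(full, "split", () -> true, 300);
        System.out.println("enqueued=" + enqueued); // prints "enqueued=false"
    }
}
```

A caller in the position of appendBatch could treat a false return as a failure and report it, so the worker thread is released instead of spinning. Which timeout value is appropriate, and whether it should be configurable, is left open here.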
How to Reproduce?
No response
Anything Else?
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct