Hello,

When running the HdfsTest example from Spark 3.3/3.4/3.5 on TDP1.1/HDFS, we are hitting the following error: "No live nodes contain current block".
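For context, HdfsTest is the stock example that ships with the Spark distribution: it just reads a text file from HDFS and iterates over it. A minimal Scala sketch of the workload (simplified from memory, not the exact example source; the HDFS path is a placeholder):

import org.apache.spark.sql.SparkSession

object HdfsTestSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("HdfsTest").getOrCreate()
    // Read the file from HDFS; the executor-side DFSInputStream reads
    // behind this are what fail with BlockMissingException in our runs.
    val lengths = spark.read.textFile("hdfs://path/to/monfichier").rdd.map(_.length).cache()
    // Force the read several times, as the example does.
    for (iter <- 1 to 3) {
      val start = System.nanoTime()
      lengths.foreach(_ => ())
      println(s"Iteration $iter took ${(System.nanoTime() - start) / 1000000} ms")
    }
    spark.stop()
  }
}

Nothing exotic here: a plain text read through FileScanRDD, which matches the stack trace below.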
Here is an excerpt of the stack trace:
24/02/10 10:55:35 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) (XXX.XX.X.XXX executor 1): org.apache.spark.SparkException: Encountered error while reading file hdfs://path/to/monfichier. Details:
Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1040252842-XX.XXX.XXX.XX-1687882122273:blk_1074022325_282875 file=/path/to/monfichier No live nodes contain current block Block locations: DatanodeInfoWithStorage[XX.XXX.XXX.XX:9866,DS-67d022d0-c1db-4d0e-9604-0573a9e83e0f,DISK] DatanodeInfoWithStorage[XXX.XXX.XXX.XXX:9866,DS-2a18a93d-9ffe-454d-b106-f02680cbebdd,DISK] DatanodeInfoWithStorage[XX.XXX.XXX.XX:9866,DS-4deeb2d6-1a39-45a9-a8a4-520026026552,DISK] Dead nodes: DatanodeInfoWithStorage[XX.XXX.XXX.XX:9866,DS-4deeb2d6-1a39-45a9-a8a4-520026026552,DISK] DatanodeInfoWithStorage[XX.XXX.XXX.XX:9866,DS-67d022d0-c1db-4d0e-9604-0573a9e83e0f,DISK] DatanodeInfoWithStorage[XXX.XXX.XXX.XXX:9866,DS-2a18a93d-9ffe-454d-b106-f02680cbebdd,DISK]
    at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1007)
    at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:990)
    at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:969)
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:677)
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:884)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:957)
    at java.base/java.io.DataInputStream.read(DataInputStream.java:151)
    at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.fillBuffer(UncompressedSplitLineReader.java:62)
    at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:227)
    at org.apache.hadoop.util.LineReader.readLine(LineReader.java:185)
    at org.apache.hadoop.mapreduce.lib.input.UncompressedSplitLineReader.readLine(UncompressedSplitLineReader.java:94)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.skipUtfByteOrderMark(LineRecordReader.java:158)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:198)
    at org.apache.spark.sql.execution.datasources.RecordReaderIterator.hasNext(RecordReaderIterator.scala:39)
    at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.hasNext(HadoopFileLinesReader.scala:67)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:125)
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:297)
    ... 22 more
24/02/10 10:55:35 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 1) (XXX.XX.X.XXX, executor 1, partition 0, ANY, 7962 bytes)
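As we read it, a BlockMissingException with all three replicas listed first under "Block locations" and then under "Dead nodes" means the client reached the NameNode fine and got the block locations, but could not read from any of the DataNodes, marking each one dead after a failed attempt and giving up once refetchLocations ran out of retries. A direct read through the Hadoop FileSystem API from the same pod would isolate the problem from Spark itself. Below is a minimal sketch; the NameNode URI and path are placeholders, and the dfs.client.use.datanode.hostname setting is only an assumption we think is worth testing for clients running outside the DataNodes' network (such as pods on K8s), not a confirmed fix:

import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HdfsProbe {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    // Assumption to test, not a confirmed fix: clients outside the
    // DataNodes' network (e.g. K8s pods) may need to connect by
    // hostname rather than the IPs the NameNode hands back.
    conf.setBoolean("dfs.client.use.datanode.hostname", true)
    val fs = FileSystem.get(new URI("hdfs://<namenode>:8020"), conf)
    val in = fs.open(new Path("/path/to/monfichier"))
    try {
      val buf = new Array[Byte](8192)
      var total = 0L
      var n = in.read(buf)
      while (n >= 0) { total += n; n = in.read(buf) } // drain the stream
      println(s"Read $total bytes without error")
    } finally {
      in.close()
      fs.close()
    }
  }
}

If the hostname setting did turn out to matter, the equivalent for the Spark job would be passing spark.hadoop.dfs.client.use.datanode.hostname=true, since Spark forwards spark.hadoop.* properties into the Hadoop configuration.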
Could you please investigate this issue?
Regards,
BEFTEAM2022
PS: The manifest YAML file is attached to help you reproduce the issue; just replace everything between <>.
spark_3_3_k8s_on_tdp.yaml.txt