How to define field delimiter for text file on HDFS? #1440
Use something like this:

```sql
CREATE EXTERNAL TABLE xxx USING csv
OPTIONS (delimiter '..', path '....');
```

For the precise syntax, look at how CSV data loading is done in Spark.

-----
Jags
SnappyData acquired by TIBCO <http://snappydata.io>
Download binary, source <https://www.snappydata.io/download>
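Applied to the table described in the question below, a sketch might look like the following. This is only an illustration, not verified against a cluster: it assumes SnappyData's `csv` data source accepts a Spark-style `delimiter` option, and the schema is copied from the Hive DDL in the question; check the option names against your SnappyData version's documentation.

```sql
-- Sketch only (assumption): SnappyData's csv source is assumed to take a
-- 'delimiter' option like Spark's CSV reader, here set to tab to match
-- the Hive table's 'field.delim'='\t'.
CREATE EXTERNAL TABLE crm_customer_ext (
  id BIGINT,
  ctmname STRING,
  areacode STRING,
  addr STRING,
  addtime STRING,
  isdelete STRING
) USING csv
OPTIONS (
  delimiter '\t',
  path 'hdfs://hadoop01:8020/user/hive/warehouse/ewt_ods.db/crm_customer_f_1d/'
);
```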
On Thu, Sep 5, 2019 at 12:02 AM foxgarden ***@***.***> wrote:
I have a Hive table stored on HDFS:

```sql
CREATE TABLE `ewt_ods.crm_customer_f_1d` (
  `id` bigint,
  `ctmname` string,
  `areacode` string,
  `addr` string,
  `addtime` string,
  `isdelete` string)
PARTITIONED BY (
  `day` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim'='\t',
  'line.delim'='\n',
  'serialization.format'='\t')
STORED AS TEXTFILE;
```
Now I want to use it as a SnappyData external table, so I use:

```sql
snappy> CREATE EXTERNAL TABLE crm_customer_ext
USING text
OPTIONS (path 'hdfs://hadoop01:8020/user/hive/warehouse/ewt_ods.db/crm_customer_f_1d/');
```
When I select from this table, I get only two columns (one is the partition column `day`):

```
snappy> select * from crm_customer_ext limit 1;
value                                                      |day
-------------------------------------------------------------------------------------------
12660 testuser 513231 513200 2013-07-24 14:45:43.96 false  |2019-08-20
```

If I define the columns when creating the table, the CREATE TABLE statement executes successfully, but selecting from it raises an exception:
```sql
snappy> CREATE EXTERNAL TABLE crm_customer_ext (
  ID BIGINT,
  CTMNAME STRING,
  AREACODE STRING,
  ADDR STRING,
  ADDTIME STRING,
  ISDELETE STRING
) USING text
OPTIONS (path 'hdfs://hadoop01:8020/user/hive/warehouse/ewt_ods.db/crm_customer_f_1d/');
```
```
snappy> select * from crm_customer_ext limit 1;
ERROR 38000: (SQLState=38000 Severity=20000) (Server=test-spark03/10.0.11.111[1527]
Thread=ThriftProcessor-1) The exception
'com.pivotal.gemfirexd.internal.engine.jdbc.GemFireXDRuntimeException:
myID: 10.0.11.111(21792)<v3>:12486, caused by java.lang.AssertionError:
assertion failed: Text data source only produces a single data column named
"value".' was thrown while evaluating an expression.
```
I want to know how to define the field delimiter when creating a text external table. (Parquet tables don't have this issue, but converting all the tables to Parquet would be a huge workload.)
Version: 1.1.0 & 1.1.1