I am trying to create a table on top of an S3 bucket that has the following structure:
s3://mybucket/productA/${model}/${qa}/${year}/${month}/${day}/
I created the table using the following SQL query:
CREATE EXTERNAL TABLE `sampletable` (
`timestamp` date COMMENT 'from deserializer'
)
PARTITIONED BY (
`model` int,
`year` int,
`month` int,
`date` int
)
ROW FORMAT SERDE
'.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'case.insensitive'='TRUE',
'dots.in.keys'='FALSE',
'ignore.malformed.json'='FALSE',
'mapping'='TRUE'
)
STORED AS INPUTFORMAT
'.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
's3://mybucket/productA'
TBLPROPERTIES (
'classification'='json',
'projection.Date.range'='1,31',
'projection.Date.type'='integer',
'projection.Model.range'='10000,99999',
'projection.Model.type'='integer',
'projection.Month.range'='1,12',
'projection.Month.type'='integer',
'projection.Year.range'='2022,2042',
'projection.Year.type'='integer',
'projection.enabled'='true',
'storage.location.template'='s3://mybucket/productA/Model=${Model}/Year=${Year}/Month=${Month}/Date=${Date}/'
);
Problem
- The table creation query runs successfully.
- When I run the query:
SELECT * FROM DATABASE.sampletable
WHERE model = 47317 AND year = 2025 AND month = 03 AND date = 11
LIMIT 10;
- It runs forever without returning results and there is no error message, but it never completes.
- Even SELECT * FROM DATABASE.sampletable; runs indefinitely.
- Running SHOW PARTITIONS DATABASE.sampletable; does not return any partitions.
Manually adding a partition works:
ALTER TABLE DATABASE.sampletable
ADD PARTITION (model=47317, year=2025, month=3, day=11)
LOCATION 's3://mybucket/productA/47317/2025/03/11/';
The partition is successfully added, but Partition Projection is not working.