@kyreddy let me reorganize and annotate a few things here so that it's easier to follow.
- nightly truncate and load of the Oracle source into HDFS using Sqoop, with Hive-driven metadata
- invalidate hive tables
This drops all metadata for the table in the Catalog.
- do stats gathering for source tables
This is optional. Ideally you would do this in Impala, and Arcadia will pick up the stats once you execute a query against the table (see next step).
- select * from eachtable limit 1
This forces the table to be loaded as part of your SELECT query. It helps reduce metadata load times when users start running queries, but it may still take a little time depending on how large the table metadata is, which is a function of the number of partitions, columns, and files, and of whether stats were computed.
- drop logical and analytical views
- create logical and analytical views
- refresh analytical views
Analytical Views should always be refreshed after table data and metadata have finished loading. They will be in a STALE state until the Refresh is complete because the table data and metadata have changed.
- execute select * from analytical view limit 1 for all AVs
This is not necessary. Analytical View metadata will already be present in the Arcadia Catalog.
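To make the ordering concrete, here is a minimal Python sketch that only builds the list of statements to run after the Sqoop load finishes. It is an illustration, not your actual job: the function name, the table and view names, and the `compute_stats` flag are all hypothetical, and the `REFRESH ANALYTICAL VIEW` text is my assumption about the syntax, so check it against your Arcadia version. `INVALIDATE METADATA` and `COMPUTE STATS` are the standard Impala statements. The drop/create of views is left as a comment because it depends on your own view definitions.

```python
def build_nightly_statements(tables, analytical_views, compute_stats=True):
    """Return the post-load statements in the order described in the steps above.

    All names passed in are placeholders; nothing is executed here.
    """
    statements = []

    # 1. Invalidate each source table so stale metadata is dropped from the Catalog.
    for table in tables:
        statements.append(f"INVALIDATE METADATA {table}")

    # 2. Optional: gather stats for the source tables.
    if compute_stats:
        for table in tables:
            statements.append(f"COMPUTE STATS {table}")

    # 3. Touch each table once so it is loaded before users start querying.
    for table in tables:
        statements.append(f"SELECT * FROM {table} LIMIT 1")

    # 4. Drop and re-create logical and analytical views here
    #    (site-specific DDL, intentionally omitted).

    # 5. Refresh analytical views only after all table data and metadata are loaded.
    #    Assumed syntax; verify against your Arcadia documentation.
    for view in analytical_views:
        statements.append(f"REFRESH ANALYTICAL VIEW {view}")

    # No SELECT ... LIMIT 1 against the analytical views: that step is unnecessary.
    return statements


if __name__ == "__main__":
    for stmt in build_nightly_statements(["sales.orders"], ["sales.orders_av"]):
        print(stmt + ";")
```

The point of building the list first is that the order is easy to review: every table step completes before any analytical view is refreshed.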
Your full Refreshes are failing for a different reason, but I believe that's being followed up on with support.