Is it possible to query data from two different YARN clusters?

Team,

Is it possible to query data from two different YARN clusters using one instance of Arcadia?

Use Case:

Hive Table1 is present in Cluster1 and Table2 is present in Cluster2. I want to join both tables and generate a report from the result.

Thanks,
Dhaval

Dhaval,

There isn’t a way to create a join between 2 tables that are located in 2 different environments from within an Arcadia Dataset Data Model. However, you can make both tables interact with each other from within a Dashboard.

For example, below I have a Visual from a Table in Environment #1:

And I also have a Visual from a Table in Environment #2:

Within my Dashboard, I can make my first Visual perform a lookup into the Table in Environment #2 using Parameters within the filter shelf of the second Visual. In our example, we want to do a lookup in Table #2 from Table #1 using the “uid” column:

The highlighted part of the screenshot above is an example of Parameter syntax. This syntax allows you to dynamically receive values in a Visual (query) from a column that was clicked or filtered within the Dashboard.
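To make the idea concrete outside of Arcadia, here is a minimal Python sketch of the same pattern: a value picked from a query against the first source is passed as a parameter into a filter on the second source. This is not Arcadia's API or its Parameter syntax. The two in-memory SQLite databases only stand in for the two environments, and the names `table1`, `table2`, `name`, `amount`, and `lookup_in_env2` are hypothetical, chosen for illustration.

```python
import sqlite3

# Two independent in-memory databases stand in for the two environments
# (assumption: in practice these would be separate Hive/Impala/etc. connections).
env1 = sqlite3.connect(":memory:")
env2 = sqlite3.connect(":memory:")

# Hypothetical sample data; only the "uid" column comes from the thread.
env1.execute("CREATE TABLE table1 (uid INTEGER, name TEXT)")
env1.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "alice"), (2, "bob")])

env2.execute("CREATE TABLE table2 (uid INTEGER, amount REAL)")
env2.executemany(
    "INSERT INTO table2 VALUES (?, ?)", [(1, 10.0), (1, 5.5), (2, 7.0)]
)


def lookup_in_env2(conn, uid):
    """Filter table2 by the uid value 'clicked' in the first visual."""
    cur = conn.execute(
        "SELECT uid, amount FROM table2 WHERE uid = ? ORDER BY amount", (uid,)
    )
    return cur.fetchall()


# Simulate a click on a row of the first visual, then broadcast its uid
# as the filter parameter for the second visual's query.
clicked_uid = env1.execute(
    "SELECT uid FROM table1 WHERE name = ?", ("alice",)
).fetchone()[0]
rows = lookup_in_env2(env2, clicked_uid)
print(rows)
```

Note that no join happens across the two sources here: each query runs against its own connection, and only the selected "uid" value travels between them, which is the same division of work as in the Dashboard approach.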

Once you’ve set up the Parameter syntax, in “Edit” mode on the Dashboard you should be able to “Enable Filter on click” from the first Visual, and then “Test click behavior”:

You should then see the clicked “uid” value broadcast within the Dashboard:

Thank you,
Tadd Wood

Hi Tadd,

Thanks for the reply.

When you say Environment #1, is that an Arcadia instance running on Cluster1, with Environment #2 being another instance on Cluster2?

Dhaval,

Yes, that’s correct. Sorry for the confusion.

Thanks for the quick reply.

In the example you mentioned, we want to do a lookup in Table #2 from Table #1 using the “uid” column. To enable access from one instance to the other, what do we need to configure, and where, to access tables across Arcadia instances?

There might be a case where the two Arcadia instances are running in different clusters and different data centers.

@Dhaval_Patel You don’t need Arcadia running on cluster/environment #2. The example is more generic: you could be connecting to Hive, Impala, Oracle, etc. running in a different environment.

You just need to create a new connection to that data source on the Data page of the Arcadia UI, using the correct IP/port connection details, and then build a dataset pointing to its tables.