Using email thresholds to send reports after your data is updated

Scenario

I have a COVID-19 tracking Dashboard that gets updated daily with the latest data from the CDC and other sources, but the updates don’t always arrive at the same time each day. Sometimes these updates happen at 4pm EST and sometimes they’re closer to 8pm EST. Ideally I would like to get an email alert along with an attachment of my Dashboard when the data is updated so I know when to explore the data further.

Walkthrough

Step 1: Create an Email based on threshold job from your Dashboard

The threshold for the job is used as a mechanism/trigger to determine whether email is sent or not. To
create the calculation for our threshold trigger we need to click the “Create Threshold” button at the top right of the form, which will then add the Visual ID for us once we’re navigated to the Threshold Visual to create the calculation.

Screen Shot 2020-10-05 at 4.38.09 PM

Step 2: Create threshold calculation

The way the Email threshold calculations work is that we evaluate only one Measure column, and look for each row return in the result set to determine whether the threshold was met. For example, if we were looking to send our report only when the Record Count of the table is10,000 total rows or greater, our calculation would have to show at least that value when the Trigger Visual calculation runs.

In this scenario we want to monitor the Record Count of the table to determine whether the data has been updated, which means we don’t necessarily care about the number of total rows in our data but that it increases each time that we add data so that we can detect the table has changed. Each day we’re appending data to our table so this should accurately reflect that an update was made to our data.

Step 3: Configure Email job to evaluate the threshold calculation

Back in our Email job form, we also need to specify that we’re not evaluating the calculated value itself, but the fact that the value has changed since the last assessment (option 2). We also need to indicate that we only want the trigger to occur if the difference of the total Record Count is greater than 0 between table updates ( Value > 0 ).

Step 4: Configure a schedule for the email to run and a trigger pattern for sending the emails

Most of the time the COVID-19 data should be available before 8pm EST, so I’ll aim for 7pm EST (4pm PST) to run this job once daily. Ideally we’re trying to evaluate our threshold after the data updates so we could also adjust this job to run on an hourly basis or every few hours depending on what makes most sense for the use-case. This can be very convenient when there’s little to no guarantee when your data will be updated on an hourly basis.

I also only want one email to be sent out when the threshold is triggered as well so I will choose “When trigger occurs” for the Email limit setting.

Step 5: Specify recipients, email message, and attachments

My email job will only be sent to me, and I don’t have any fancy requirements related to the email message itself or any template design for the email. However, I do want the email to send the Dashboard has a PDF attachment so it’s easy to share.

Step 6: Manually run email job to set the initial value threshold

After I’ve created the email job, it needs to run one time in order to set an initial threshold value that can be evaluated before the next run time. I can start my email job manually by going to the Job Status page from the Settings menu in the top right of Arcadia, and then going to the Scheduled Jobs tab.

Once the jobs run successfully, there should be a green indicator, and we can wait till the job schedule naturally runs to see what happens next.


NOTE: If you’re curious and want to examine what’s happening behind the scenes, you can see in the Webserver logs from the Activity Log page that the initial job ran, but didn’t send an email (Conditional Not Triggered (First run)) because there’s not a calculated value to evaluate agianst during the first run (This is why you see “Conditional Not Triggered (First run)” in the logs below):

2020-10-02 15:19:39,669 INFO P1 18007 Thread-29 {"run_parameters": {"skip_next_targets": "Conditional Not Triggered (First run)", "force_run": true, "creator_id": 1, "creator_username": "admin", "editor_id": 1, "editor_username": "admin", "trigger_id": 20351, "trigger_name": "Trigger: COVID-19 Cases by State", "trigger_data": {"data": {"colnames": ["Record Count"], "coltypes": ["BIGINT"], "info": ["Tables: default.us_state_latitude_and_longitude:0+1584424076,default.covidtracking_statesdaily:0xcedc68fd+1585693461,default.us_zip_code_latitude_and_longitude:0x9a9781d9+1572908828,default.covidtracking_statesdaily_max_date_lv:0+1585693423", "Sequence:99", "Query:SELECT sum(1) as `Record Count`\n FROM `default`.`covidtracking_statesdaily` TA_0 left outer join `default`.`us_state_latitude_and_longitude` TA_1 ON ( TA_0.`state` = TA_1.`state` ) left outer join `default`.`covidtracking_statesdaily_max_date_lv` TA_2 ON ( TA_0.`date` = TA_2.`date` ) \n LIMIT 5000"], "rows": [["10850"]]}, "profile": [["start", "2020-10-02 15:19:39.441"], ["build.start", "0.006621"], ["build.complete", "0.003561"], ["query", "SELECT sum(1) as `Record Count`\n FROM `default`.`covidtracking_statesdaily` TA_0 left outer join `default`.`us_state_latitude_and_longitude` TA_1 ON ( TA_0.`state` = TA_1.`state` ) left outer join `default`.`covidtracking_statesdaily_max_date_lv` TA_2 ON ( TA_0.`date` = TA_2.`date` ) \n LIMIT 5000"], ["execute.pre", "0.000760"], ["execute.post", "0.144642"], ["cache.post", "0.004306"], ["cache.tsupdate.3", "0.010736"], ["finish", "2020-10-02 15:19:39.611"]]}, "trigger_data_value": "10850", "trigger_fullconditional": "Change in Record Count > 0"}, "0": {"trigger_id": 20351, "trigger_type": "D", "trigger_conditional": ">", "trigger_threshold": 0}}

Step 7: Wait and watch for the Email to be naturally…

In a few hours my email was sent out successfully when the COVID-19 table was updated, and I received this PDF in my inbox.

And if I revisit the Job Status page I can confirm that my email job ran and was triggered at run-time (4pm PST / 7pm EST).


NOTE: If I wanted to further verify the data has changed I could also revisit my Threshold Visual to check the calculation. Below you can see that my value has indeed increased (from 10850 to 11914).

Screen Shot 2020-10-02 at 4.01.08 PM

If I also look at the Webserver logs again I can that the “trigger_data_value” key for this job has been updated to 11914 (Previously this was “trigger_data_value”: “10850” if you look at the previous log snippet above) so that the next times the job runs it will evaluate the previous trigger_data_value properly to determine whether the table has been updated.

2020-10-02 16:04:54,070 INFO P1 18007 Thread-28 {"run_parameters": {"skip_next_targets": false, "force_run": null, "creator_id": 1, "creator_username": "admin", "trigger_id": 20351, "trigger_name": "Trigger: COVID-19 Cases by State", "trigger_data": {"data": {"colnames": ["Record Count"], "coltypes": ["BIGINT"], "info": ["Tables: default.us_state_latitude_and_longitude:0+1584424076,default.covidtracking_statesdaily:0x75dacdf4+1585693461,default.us_zip_code_latitude_and_longitude:0x9a9781d9+1572908828,default.covidtracking_statesdaily_max_date_lv:0+1585693423", "Sequence:132", "Query:SELECT sum(1) as `Record Count`\n FROM `default`.`covidtracking_statesdaily` TA_0 left outer join `default`.`us_state_latitude_and_longitude` TA_1 ON ( TA_0.`state` = TA_1.`state` ) left outer join `default`.`covidtracking_statesdaily_max_date_lv` TA_2 ON ( TA_0.`date` = TA_2.`date` ) \n LIMIT 5000"], "rows": [["11914"]]}, "profile": [["start", "2020-10-02 16:04:37.851"], ["build.start", "0.021278"], ["build.complete", "0.006170"], ["query", "SELECT sum(1) as `Record Count`\n FROM `default`.`covidtracking_statesdaily` TA_0 left outer join `default`.`us_state_latitude_and_longitude` TA_1 ON ( TA_0.`state` = TA_1.`state` ) left outer join `default`.`covidtracking_statesdaily_max_date_lv` TA_2 ON ( TA_0.`date` = TA_2.`date` ) \n LIMIT 5000"], ["execute.pre", "0.001354"], ["execute.post", "0.162127"], ["cache.post", "0.007192"], ["cache.tsupdate.3", "0.005218"], ["finish", "2020-10-02 16:04:38.055"]]}, "trigger_data_value": "11914", "trigger_fullconditional": "Change in Record Count > 0", "trigger_difference": 1064.0, "trigger_value": 1064.0, "trigger_type": "D", "trigger_conditional": ">", "trigger_threshold": 0, "triggered_time": "10/02/20 23:04:38 UTC", "image": "/tmp/tmprn05c8o8", "csv_names_to_filepath": {}, "error": null, "email_subject": "[ArcadiaData] COVID-19 Cases by State", "message": "", "app_id": 18864, "url_params": "embed=true&sheet=1&param.measure=%5Bpositive%5D&param.measure.alias=Positive%20Cases&param.date_formatted.start=2020-01-22&param.date_formatted.end=2020-09-13&scope=%5B%7B%22filterId%22%3A18993%2C%22datasetId%22%3A1616%2C%22filterName%22%3A%22date_formatted%22%7D%5D", "to_addresses": ["tadd@arcadiadata.com"], "cc_addresses": [], "image_type": "D", "visual_url": "http://<hostname>:443/arc/apps/app/18864", "location_pathname": "/arc/apps/builder/18864", "email_template": "", "from_address": "Arcadia Data &ltno-reply@arcadiadata.com>", "name": "COVID-19_Cases_by_State", "start_time": "10/02/20 23:04:37 UTC", "run_time": "0:00:12.951359", "run_interval": "4pm (PST) Daily", "run_interval_cron": "0 23 * * *", "next_run_time": "10/03/20 23:00:00 UTC"}, "2": {"email_subject": "[ArcadiaData] COVID-19 Cases by State", "message": "", "app_id": 18864, "url_params": "embed=true&sheet=1&param.measure=%5Bpositive%5D&param.measure.alias=Positive%20Cases&param.date_formatted.start=2020-01-22&param.date_formatted.end=2020-09-13&scope=%5B%7B%22filterId%22%3A18993%2C%22datasetId%22%3A1616%2C%22filterName%22%3A%22date_formatted%22%7D%5D", "to_addresses": ["tadd@arcadiadata.com"], "cc_addresses": [], "image_type": "D", "visual_url": "<hostname>443/arc/apps/app/18864", "location_pathname": "/arc/apps/builder/18864", "email_template": ""}}

1 Like