Why is my Import/ Transform job stuck at Hotswapping step?

The problem: Your Import or Transform job, when executed in Full mode, seems to be running forever. Suspecting that it is stuck, you go to the Job Monitoring Dashboard and check its job logs. And indeed, the job appears to be running “Performing hotswap table” forever. What is going on?

:question: First, what is Hotswap anyway?

The Hotswap process involves 2 tables: main table and temporary table. Holistics jobs would write results on the temporary table. After the job has finished, the temporary table and main table would be swapped.

This process could be expressed using pseudocode as below:

[Pseudocode] Hotswap mechanism

- CREATE TABLE temp_data; 
- INSERT DATA INTO temp_table;
-  transaction {
     DROP TABLE main_table;
     RENAME temp_table TO main_table;
 } ensure {
     DROP TABLE IF EXISTS temp_table;
}

:clock1: Why Hotswap sometimes takes a long time to finish?
DROP TABLE command requires main table to be free. In other words, it must wait for other operations on this table to finish. If your Hotswap job is taking exceedingly slow to execute, it is most likely due to some long-running processes, such as:

  • Analytical queries from your reports being executed on this main table

  • Other Import/ Transform jobs running in Full mode in the same table.

  • An ETL job running in this table.

:nerd_face: Why Hotswap is necessary

Holistics has adopted Hotswap strategy because it minimizes downtime for the (main) destination table.

:sos: My job is stuck at Hotswap step. How to resolve this?

  • Check if there are other import/ transform jobs running in the same main table alongside with your current job.

    Cancel/ Reschedule them if need be.

:mag_right: How to determine if my main table is being locked?

There are plenty of ways that you can check. Here are some suggestions to get you started:

  • Verify if your job is really stuck or it is just slow: Head over to your Job Monitoring dashboard, and check the job log to see how long it has been running. Compare this runtime to that of your past job(s) to determine if there is a spike in processing time.

    If you are not sure how to find those jobs, it is worthwhile to check this document.

  • (Recommended) If you are familiar with your database console/ admin tool, you can use it to detect long-running queries/ database locks and cancel them.

    • If you are using postgreSQL, this article details the queries that you can run to detect and terminate long running queries.

:pencil2: How to prevent this issue from happening in the future?

  • Schedule your Import/ Transform jobs to avoid conflicts.

  • Run your Import/ Transform job in Append mode instead. To find out the differences
    between the writing modes, see this document.

  • Running in off-hours when there are fewer analytical queries.

2 Likes