Streamlining Data Migration: Why Sling Might Be Your Easiest Option

Albert Wong
4 min read · Feb 13, 2024

Sling is a CLI-based Extract, Load, Transform (ELT) tool that helps you move data from one system or database to another.

Migrating data from one system or database to another can be a daunting task. Fear not, weary warrior of data transfer! There’s a solution designed to make your life easier: Sling.

By the way, I’m Albert, and I’m Head of Community and Developer Relations at CelerData. CelerData is a $60 million VC-funded startup that is building StarRocks, an open-source replacement for Snowflake, BigQuery, and Redshift.

What is Sling?

Sling isn’t your average jack-of-all-trades tool. It’s a dedicated champion of the data migration arena, focusing solely on the art of Extract, Load, and Transform (ELT). This singular focus translates into several advantages for you:

1. Built for Speed and Efficiency:

Sling tries to use implementations specific to each database and its version to get the best performance.

2. Command Line Commando:

Sling operates through the command line, making it highly versatile. You can use it as a standalone tool or embed it within other workflows, offering the flexibility you crave. Currently, it’s the default embedded ELT tool in Dagster. I can also see how this could work in Apache Airflow, Temporal, Orkes (Netflix Conductor), n8n, or any other workflow tool.

3. Connector King:

Sling boasts an extensive library of connectors for all the major databases and systems, so you spend less time fighting compatibility issues. For StarRocks users, that means it can load data formats such as CSV, Avro, JSON, and Parquet into StarRocks. It also helps with our popular migrations, such as MySQL and PostgreSQL to StarRocks.
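
To make the "embed it within other workflows" point a bit more concrete, here is a minimal sketch of a shell wrapper that a scheduler task could call. The function name `load_file` is hypothetical, and the connection name `STARROCKSLOCAL` is the one created in the Setup section of this post. The `sling run` flags are the same ones used in the import runs in this post; treat the rest as an assumption to adapt, not as the way Dagster or any other tool actually integrates Sling.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: load one local file into one StarRocks table.
# Usage: load_file <absolute-file-path> <schema.table>
load_file() {
  local src_file="$1"
  local tgt_table="$2"

  # Fail early with a clear message if the CLI is not installed.
  if ! command -v sling >/dev/null 2>&1; then
    echo "sling not found on PATH" >&2
    return 127
  fi

  # Same flags as the manual runs in this post; the exit status of
  # sling becomes the exit status of the function, so a workflow
  # engine can mark the task as failed or succeeded.
  sling run \
    --src-stream "file://${src_file}" \
    --tgt-conn STARROCKSLOCAL \
    --tgt-object "${tgt_table}" \
    --mode full-refresh
}

# Example call (commented out so sourcing this file has no side effects):
# load_file /Users/atwong/Downloads/example1.csv albert.call_center2
```

Because the function only passes through Sling's exit status, any workflow tool that can run a shell command and check its return code should be able to use it.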

Setup

atwong@Albert-CelerData Downloads % sling conns set STARROCKSLOCAL url=starrocks://root:@localhost:9030/albert
11:55AM INF connection `STARROCKSLOCAL` has been set in /Users/atwong/.sling/env.yaml. Please test with `sling conns test STARROCKSLOCAL`
atwong@Albert-CelerData Downloads % sling conns test STARROCKSLOCAL
11:55AM INF success!

Importing CSV: example1.csv

atwong@Albert-CelerData auxjars % sling run --src-stream 'file:///Users/atwong/Downloads/example1.csv' --tgt-conn STARROCKSLOCAL --tgt-object 'albert.call_center2' --mode full-refresh --src-options '{"header":false}' --primary-key col_001
11:24AM INF connecting to target database (starrocks)
11:24AM INF reading from source file system (file)
11:24AM INF writing to target database [mode: full-refresh]
11:24AM INF streaming data
11:24AM WRN Using INSERT mode which is meant for small datasets. Please set the `fe_url` for loading large datasets via Stream Load mode. See https://docs.slingdata.io/connections/database-connections/starrocks
11:24AM INF dropped table `albert`.`call_center2`
11:24AM INF created table `albert`.`call_center2`
11:24AM INF inserted 4 rows into `albert`.`call_center2` in 0 secs [7 r/s]
11:24AM INF execution succeeded

Importing JSON: example2.json

atwong@Albert-CelerData auxjars % sling run --src-stream 'file:///Users/atwong/Downloads/example2.json' --tgt-conn STARROCKSLOCAL --tgt-object 'albert.call_center4' --mode full-refresh --src-options '{"flatten":true}' --primary-key code
11:30AM INF connecting to target database (starrocks)
11:30AM INF reading from source file system (file)
11:30AM INF writing to target database [mode: full-refresh]
11:30AM INF streaming data
11:30AM WRN Using INSERT mode which is meant for small datasets. Please set the `fe_url` for loading large datasets via Stream Load mode. See https://docs.slingdata.io/connections/database-connections/starrocks
11:30AM INF dropped table `albert`.`call_center4`
11:30AM INF created table `albert`.`call_center4`
11:30AM INF inserted 1 rows into `albert`.`call_center4` in 0 secs [2 r/s]
11:30AM INF execution succeeded

Importing Parquet: call_center.parquet

atwong@Albert-CelerData auxjars % sling run --src-stream 'file:///Users/atwong/sandbox/tpcds-parquet/call_center.parquet' --tgt-conn STARROCKSLOCAL --tgt-object 'albert.call_center' --mode full-refresh --primary-key cc_call_center_sk
1:28PM INF connecting to target database (starrocks)
1:28PM INF reading from source file system (file)
1:28PM INF writing to target database [mode: full-refresh]
1:28PM INF streaming data
1:28PM WRN Using INSERT mode which is meant for small datasets. Please set the `fe_url` for loading large datasets via Stream Load mode. See https://docs.slingdata.io/connections/database-connections/starrocks
1:28PM INF dropped table `albert`.`call_center`
1:28PM INF created table `albert`.`call_center`
1:28PM INF inserted 6 rows into `albert`.`call_center` in 1 secs [6 r/s]
1:28PM INF execution succeeded
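
Each run above logs a warning that INSERT mode is meant for small datasets and suggests setting `fe_url` to load larger datasets via Stream Load. Here is a minimal sketch of how that might look, assuming `fe_url` is passed to `sling conns set` as another `key=value` pair next to `url`, and assuming the StarRocks frontend HTTP endpoint is reachable at `localhost:8030` with the same `root` user and empty password. Both the port and the helper name `set_starrocks_conn` are my assumptions; check the Sling docs linked in the warning for the exact format.

```shell
# Assumed values: adjust host, port, and credentials for your cluster.
SR_URL='starrocks://root:@localhost:9030/albert'   # same url as in Setup
SR_FE_URL='http://root:@localhost:8030'            # assumed FE HTTP endpoint

# Hypothetical helper: re-create the connection with fe_url, then test it.
set_starrocks_conn() {
  sling conns set STARROCKSLOCAL url="$SR_URL" fe_url="$SR_FE_URL" &&
    sling conns test STARROCKSLOCAL
}

# Example call (commented out so sourcing this file changes nothing):
# set_starrocks_conn
```

If the connection test succeeds, rerunning one of the imports above is a quick way to see whether the warning goes away.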

There are also examples of migrating from MySQL and PostgreSQL here: https://github.com/slingdata-io/sling-cli/issues/140.
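
For a rough idea of what a database-to-StarRocks run looks like, here is a sketch that is not taken from the linked issue. It assumes a source connection named `MYSQL_SRC` has already been created with `sling conns set`, and it uses the `--src-conn` flag to point at it. The connection name, the example URL, the table names, and the helper `migrate_table` are all hypothetical placeholders.

```shell
# Hypothetical helper: copy one table from a source database connection
# into StarRocks with a full refresh.
# Usage: migrate_table <source schema.table> <target schema.table>
migrate_table() {
  sling run \
    --src-conn MYSQL_SRC \
    --src-stream "$1" \
    --tgt-conn STARROCKSLOCAL \
    --tgt-object "$2" \
    --mode full-refresh
}

# One-time setup and example call (placeholders, commented out):
# sling conns set MYSQL_SRC url='mysql://user:password@localhost:3306/mydb'
# migrate_table mydb.customers albert.customers
```

The only difference from the file imports above is that the source is a named connection plus a table name instead of a `file://` path.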

Why Choose Sling Over the Rest?

Here’s how Sling differentiates itself from the data migration crowd:

  • Simplicity: Say goodbye to the complex interfaces and overwhelming feature sets of tools like Fivetran and Airbyte. Sling focuses on making your job intuitive and straightforward.
  • Customization: Need some fine-tuning? Sling’s parameters and options let you tailor the migration process to your specific needs.
  • Open Source Advantage: Enjoy the benefits of an open-source community, with access to ongoing development and support.

Is Sling Right for You?

If you’re looking for a tool that makes data migration:

  • Faster: Skip the bloatware and get straight to the data movement.
  • Easier: Navigate the process with a CLI.
  • More Flexible: Integrate seamlessly with your existing workflows and systems.

Then Sling might be your champion. Give it a try and experience the smooth flow of data migration!
