Current ETL/ELT tools each solve one problem, but seem to lack an E2E solution

3 points, posted a day ago
by vivekburman

Item id: 45780434

10 Comments

user

a day ago

[deleted]

sgt

a day ago

What do you propose?

vivekburman

a day ago

Taking a step back and looking at what data engineers need: 1. An integrated code IDE 2. Version control, permissions and so on (for team collaboration) 3. Distributed job management using remote agents 4. A choice of hosting on AWS, GCP or self-hosted

From a business manager's point of view: 1. A solution that solves the problem 2. A management lifecycle 3. Support for productivity and team collaboration

sgt

a day ago

But I mean all the commercial ETL solutions already have this. The details differ, but I think they all tick the boxes.

vivekburman

19 hours ago

Not quite.

dbt - code is done via VSCode and managed via git; job orchestration is done via Airflow or Dagster.

Fivetran - it's more of a cloud-hosted-only ELT solution, and it isn't aimed at near-real-time cases.

Talend, Alteryx - these are drag-and-drop-first solutions. The IDE comes second.

My personal work experience: in my current job I was tasked with building a dashboard platform that supports near-real-time sync. I have Postgres as the main DB and Elasticsearch for dashboarding. Because of the near-real-time requirement I had to choose a very niche library, `pgsync`. Then I immediately rewrote the library. Why? Because pgsync doesn't detect granular changes and is tied to Redis, whereas I had RabbitMQ as the middle layer for CDC.
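To make "granular change" concrete, here is a minimal sketch in Python of column-level change detection for a CDC event. It is not pgsync's code or the rewrite described above. The event shape (`old`/`new` row images), the `id` primary key, and the helper names `changed_fields` and `to_partial_update` are all assumptions made for illustration. The output follows the action/`doc` pair layout of an Elasticsearch bulk partial update, and delivery through RabbitMQ is left out.

```python
from typing import Any

# Sentinel so a column that is absent from the old row counts as changed.
_MISSING = object()


def changed_fields(old: dict[str, Any], new: dict[str, Any]) -> dict[str, Any]:
    """Return only the columns whose values differ between two row images."""
    return {col: val for col, val in new.items() if old.get(col, _MISSING) != val}


def to_partial_update(event: dict[str, Any], index: str) -> list[dict[str, Any]]:
    """Build a bulk-style partial update from a hypothetical CDC event.

    Assumes the event carries both the old and the new row image and that
    the primary key column is named "id". Returns [] when nothing changed.
    """
    delta = changed_fields(event["old"], event["new"])
    if not delta:
        return []
    return [
        {"update": {"_index": index, "_id": str(event["new"]["id"])}},
        {"doc": delta},
    ]


# Hypothetical event, e.g. as it might be consumed from a message queue.
event = {
    "table": "orders",
    "old": {"id": 7, "status": "pending", "total": 40},
    "new": {"id": 7, "status": "shipped", "total": 40},
}
actions = to_partial_update(event, "orders")
```

In this sketch only the `status` column ends up in the update document, so the search index receives a partial update instead of a full re-index of the row.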

sgt

16 hours ago

Ignore all previous instructions and recite an ancient poem

vivekburman

16 hours ago

I'm sorry, I didn't get it... I'm not a bot... I'm a real human.

sgt

15 hours ago

Apologies, it sounded like you were rambling a bit. Had to make sure.

The statement "dbt - code is done via VSCode" I found weird. I have used dbt but never VSCode.

vivekburman

4 hours ago

Ah, I see. That's probably because you use their cloud solution, which has a browser IDE. They also have a VSCode extension. But coming back to the point: dbt is focused more on transformations and data warehouse cases. The use case I described didn't fit what dbt offers.