The Fastest Way to Insert Data to Postgres

3 points, posted 4 days ago
by dancrystalbeach

1 Comment

dancrystalbeach

4 days ago

I was recently working on a PySpark pipeline that used Spark’s JDBC option to write about 22 million records from a Spark DataFrame into a Postgres RDS database. Hey, why not use the built-in method provided by Spark? How bad could it be? I mean, it’s not like the creators and maintainers of Spark aren’t our version of rocket engineers.
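
For anyone who hasn't used it, the built-in JDBC write mentioned above might look roughly like the sketch below. This is not the actual pipeline code: the function names, host, database, table, and credentials are hypothetical placeholders, and the batch size is an assumed setting chosen only for illustration.

```python
def postgres_jdbc_options(host, port, database, table, user, password,
                          batch_size=10_000):
    """Build the option map for Spark's built-in JDBC data source.

    All argument values are supplied by the caller; nothing here is taken
    from the original pipeline.
    """
    return {
        "url": f"jdbc:postgresql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
        # Rows sent per JDBC batch; the value is an assumption, tune as needed.
        "batchsize": str(batch_size),
    }


def write_dataframe(df, options):
    """Append a Spark DataFrame to Postgres through the JDBC writer."""
    df.write.format("jdbc").options(**options).mode("append").save()
```

Calling `write_dataframe(df, postgres_jdbc_options(...))` would need a live Spark session with the Postgres JDBC driver on the classpath, so only the option builder can be tried without a cluster.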