The Data Samurai

Showing posts from 2017

AWS Aurora Bulk Load Performance Issues - Resolved

We had performance issues when bulk loading data into AWS Aurora. Throughput was so poor that pushing around 2 million rows into Aurora was nearly impractical: we were inserting about 1,000 records per second, much worse than other MySQL-family databases such as MySQL and MariaDB. However, a few tweaks to the connection parameters resolved most of the performance issues we faced in the bulk load. The solution is to add two parameters to the JDBC connection string when you connect to AWS Aurora for a bulk load: useServerPrepStmts=false and rewriteBatchedStatements=true. Your full JDBC connection arguments should look like "jdbc:mysql://host:3306/db?useServerPrepStmts=false&rewriteBatchedStatements=true", "username", "password". Once we changed these parameters, performance was blazing fast: we were able to load the 2 million rows in 3 minutes flat (roughly 11,000 rows per second). The Aurora server used in the benchmark was r3....
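As a rough illustration of how these two parameters fit into a batched load, here is a minimal Java sketch. It is not the code used in the benchmark: the class name BulkLoadSketch, the helper methods, the table demo_table, and its columns are all hypothetical, and the batch size is an arbitrary choice. With MySQL Connector/J, rewriteBatchedStatements=true lets the driver rewrite a batch of single-row INSERTs into multi-row INSERT statements, which generally only pays off when the rows are sent with addBatch() and executeBatch() as shown.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BulkLoadSketch {

    // Builds a JDBC URL carrying the two parameters named in the post.
    // Host, port, and database name are placeholders supplied by the caller.
    static String buildUrl(String host, int port, String db) {
        return "jdbc:mysql://" + host + ":" + port + "/" + db
                + "?useServerPrepStmts=false"
                + "&rewriteBatchedStatements=true";
    }

    // Inserts rows in batches inside one transaction.
    // ASSUMPTION: demo_table(id, name) is a hypothetical table for illustration.
    static void bulkInsert(Connection conn, List<String[]> rows, int batchSize)
            throws SQLException {
        String sql = "INSERT INTO demo_table (id, name) VALUES (?, ?)";
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();               // queue the row client-side
                if (++pending == batchSize) {
                    ps.executeBatch();       // send the queued rows together
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch();           // flush the final partial batch
            }
            conn.commit();
        }
    }
}
```

To use it, open a connection with DriverManager.getConnection(BulkLoadSketch.buildUrl("host", 3306, "db"), "username", "password") and pass it to bulkInsert along with the rows to load; a batch size in the low thousands is a reasonable starting point to experiment with.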

Google Cloud Spanner

Google recently released Cloud Spanner. Cloud Spanner promises to be the first and only relational database service that is both strongly consistent and horizontally scalable. It promises the traditional benefits of a relational database: ACID transactions, relational schemas (and schema changes without downtime), SQL queries, high performance, and high availability. But unlike any other relational database service, Cloud Spanner scales horizontally to hundreds or thousands of servers, so it can handle the most demanding transactional workloads. With automatic scaling, synchronous data replication, and node redundancy, Cloud Spanner delivers up to 99.999% (five 9s) availability for your mission-critical applications. You can get more details about Cloud Spanner at https://cloud.google.com/spanner/ .