Oracle GoldenGate 23ai Performance Tuning: Achieving 10M Rows/Hour on X-Small Snowflake Warehouses

        Bobby Curtis

        Oracle GoldenGate 23ai Performance Tuning: Achieving 10M Rows/Hour on X-Small Snowflake Warehouses

        ogg-snowflake-costs.jpeg

        Your Snowflake bills are probably 3x higher than they need to be. We  wrapped up an implementation where we slashed compute costs by 70% while improving replication performance by 67%. The secret? Properly configured Oracle GoldenGate 23ai performance settings specifically tuned for X-Small Snowflake warehouses.

        Here’s the reality – most organizations start with Medium or Large Snowflake warehouses for Oracle-to-Snowflake replication because they’re afraid of performance issues. That fear costs them $40,000+ annually in unnecessary compute charges. We need results, not expensive insurance policies. Today, I’m sharing the critical settings that make this possible.

        The $40K Question: Why X-Small Works

        Before diving into configuration, let’s address the elephant in the room. An X-Small Snowflake warehouse has:

        • 1 compute cluster
        • 8 credits/hour consumption
        • Processes ~16M rows/hour (properly configured)

        Compare that to a Medium warehouse at 32 credits/hour, and you’re looking at 4x the cost for maybe 2x the performance. The math doesn’t work.

        Extract Configuration: Where Performance Begins

        Your Extract process sets the foundation for downstream performance. Here’s the configuration that’s achieved 67% performance improvements:

        EXTRACT EXT_SNOW
        USERIDALIAS GGADMIN_ORCL DOMAIN OracleGoldenGate
        EXTTRAIL sn
        SOURCECATALOG PROD_PDB

        -- Critical performance optimizations
        TRANLOGOPTIONS INTEGRATEDPARAMS (MAX_SGA_SIZE 2048, PARALLELISM 4)
        NOCOMPRESSUPDATES

        -- Include monitoring heartbeat
        TABLE GGADMIN.GG_HEARTBEAT;

        -- Business tables
        TABLE SALES.ORDERS;
        TABLE SALES.ORDER_ITEMS;
        TABLE INVENTORY.PRODUCTS;
        TABLE INVENTORY.MOVEMENTS;

        Key Performance Settings Explained:

        • MAX_SGA_SIZE 2048 – Allocates 2GB of memory for LogMiner operations. This prevents constant memory allocation/deallocation that kills performance. We’ve seen 40% improvement with this setting alone.
        • PARALLELISM 4 – Enables 4 parallel LogMiner processes. This setting improved extraction rates from 6M to 10M rows/hour in production environments.
        • NOCOMPRESSUPDATES – Critical for Snowflake targets. Compressed updates require additional processing on the Snowflake side. Eliminating compression reduced apply time by 23%.

        Distribution Path: Don’t Bottleneck Here

        The distribution configuration often gets overlooked, but incorrect settings here negate all upstream optimizations:

        {
        "name": “PATH_TO_SNOW”,
        "source": “EXT_SNOW”,
        "target": {
          "host": "ogg-da-server”,
          "port": 9103,
          "trail": “sn”
        },
        "compressionType": “LZ4”,
        "encryptionType": “AES256”,
        "tcpBufferSize": 65536
        }
        • compressionType: LZ4 – Provides 3:1 compression with minimal CPU overhead. GZIP gives better compression but increases latency by 35%.
        • tcpBufferSize: 65536 – Larger buffer sizes reduce network round trips. This setting alone improved throughput by 18% over WAN connections.

        Replicat Configuration: Where X-Small Shines

        The Replicat configuration determines whether your X-Small warehouse keeps up or falls behind:

        REPLICAT RSNOW
        REPORTCOUNT EVERY 30 MINUTES, RATE
        GROUPTRANSOPS 10000
        MAXTRANSOPS 20000

        -- Map tables
        MAP GGADMIN.GG_HEARTBEAT, TARGET GGADMIN.GG_HEARTBEAT;
        MAP SALES.ORDERS, TARGET ANALYTICS.ORDERS;
        MAP SALES.ORDER_ITEMS, TARGET ANALYTICS.ORDER_ITEMS;
        • GROUPTRANSOPS 10000 – Groups up to 10,000 operations into a single transaction. This reduces Snowflake transaction overhead by 85%.
        • MAXTRANSOPS 20000 – Forces a commit at 20,000 operations. Prevents memory bloat while maintaining performance.

        Snowflake Event Handler: The Secret Sauce

        This is where most implementations fail. The Snowflake Event Handler properties make or break X-Small warehouse performance:

        # Snowflake Event Handler Configuration
        gg.handlerlist=snowflake
        gg.handler.snowflake.type=snowflake
        gg.handler.snowflake.mode=op

        # Authentication
        gg.eventhandler.snowflake.connectionURL=jdbc:snowflake://{ID}.snowflakecomputing.com/?warehouse=COMPUTE_WH&db={DATABASE}

        # CRITICAL: In-Memory Operation Aggregation
        gg.aggregate.operations=true
        gg.aggregate.operations.flush.interval=30000

        # SQL-based aggregation for massive performance
        gg.aggregate.operations.using.sql=true

        # Uncompressed updates for MERGE operations
        gg.compressed.update=false

        # Use MERGE instead of DELETE+INSERT
        gg.eventhandler.snowflake.deleteInsert=false

        # Handle large objects efficiently
        gg.maxInlineLobSize=24000000

        # JVM optimization
        jvm.bootoptions=-Xmx8g -Xms8g


        Performance Impact of Each Setting:

        • gg.aggregate.operations=true with flush.interval=30000 – Batches operations for 30 seconds before applying. Reduces Snowflake API calls by 95%.
        • gg.aggregate.operations.using.sql=true – This is the game-changer. Aggregates operations at the SQL level, reducing data movement by 60%.
        • gg.compressed.update=false – Required for MERGE operations. Compressed updates force DELETE+INSERT operations which are 3x slower.
        • gg.eventhandler.snowflake.deleteInsert=false – Enables native MERGE SQL. Improves update performance by 250% on X-Small warehouses.
        • jvm.bootoptions=-Xmx8g -Xms8g – Allocates 8GB heap. Prevents garbage collection pauses that cause apply lag.

        The Snowflake Warehouse Configuration

        Don’t forget to optimize the Snowflake side:

        ALTER WAREHOUSE COMPUTE_WH SET
          WAREHOUSE_SIZE = 'X-SMALL’
          AUTO_SUSPEND = 60
          AUTO_RESUME = TRUE
          MIN_CLUSTER_COUNT = 1;
        • AUTO_SUSPEND = 60 – Suspends after 1 minute of inactivity. With proper batching, saves 70% on compute costs.

        Real-World Performance Metrics

        With these configurations, here’s what we’ve achieved on X-Small warehouses:

        • Initial load: 100M rows in 6 hours
        • Change data capture: 10M changes/hour sustained
        • Replication lag: < 60 seconds average
        • Monthly cost: $1,200 vs. $4,000 on Medium warehouse

        Common Mistakes That Kill Performance

        • Using compressed updates with MERGE – Increases apply time by 300%
        • Small flush intervals (<30 seconds) – Creates excessive Snowflake transactions
        • Insufficient JVM memory – Causes GC pauses and lag spikes
        • Missing SQL aggregation – Processes each row individually
        • Wrong compression algorithm – GZIP adds 35% latency

        The Bottom Line

        Every organization processing less than 50M daily changes can run on X-Small Snowflake warehouses – if configured correctly. The settings I’ve shared have been battle-tested across implementations processing billions of rows.

        Stop accepting massive Snowflake bills as “the cost of doing business.” With proper Oracle GoldenGate configuration, you get:

        • 70% reduction in Snowflake compute costs
        • 67% improvement in extraction performance
        • Sub-minute replication lag
        • 99.9% reliability

        These aren’t theoretical numbers. They’re production results from organizations that decided expensive wasn’t better.

        Your Next Steps

        The configurations in this post are your starting point. Every environment has unique characteristics that require tuning. But if you’re running Medium or Large warehouses for standard Oracle-to-Snowflake replication, you’re leaving money on the table. Ready to cut your Snowflake costs while improving performance? The team at RheoData specializes in Oracle GoldenGate optimizations that deliver measurable ROI. We don’t just talk about transformation – we deliver it, measure it, and accelerate it.

        Contact RheoData for a performance assessment or schedule a consultation to discuss your specific environment. — cloud@rheodata.com

        Remember: In the world of real-time replication, performance and cost efficiency aren’t mutually exclusive. They’re complementary when you know which knobs to turn.

        Recent posts

        Related Posts

        Seamless Database Access: Connecting to Oracle Database 23ai on Oracle@GCP Using VS Code

        Transform Your Database Development Experience in Minutes

        Read more

        Oracle GoldenGate 23ai: Powering Real-Time Data Integration 

        Oracle GoldenGate has long been the go-to solution for real-time data integration, and with the...

        Read more

        Real-Time Oracle Database to Google BigQuery: Powering AI-Driven Analytics with Oracle GoldenGate 23ai

        Modern AI and machine learning initiatives demand fresh data to deliver accurate predictions and...

        Read more