<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The Last Dev]]></title><description><![CDATA[Writing about DevOps, Cloud Engineering, Data Engineering and more!]]></description><link>https://www.thelastdev.com</link><image><url>https://substackcdn.com/image/fetch/$s_!7BdZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9a09b4e-a465-40db-b7f5-11f2a830d1c2_261x261.png</url><title>The Last Dev</title><link>https://www.thelastdev.com</link></image><generator>Substack</generator><lastBuildDate>Fri, 15 May 2026 22:14:26 GMT</lastBuildDate><atom:link href="https://www.thelastdev.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Konstantinos Siaterlis]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[thelastdev@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[thelastdev@substack.com]]></itunes:email><itunes:name><![CDATA[Konstantinos Siaterlis]]></itunes:name></itunes:owner><itunes:author><![CDATA[Konstantinos Siaterlis]]></itunes:author><googleplay:owner><![CDATA[thelastdev@substack.com]]></googleplay:owner><googleplay:email><![CDATA[thelastdev@substack.com]]></googleplay:email><googleplay:author><![CDATA[Konstantinos Siaterlis]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[AWS re:Invent 2025 - somewhat non-GenAI recap]]></title><description><![CDATA[Released during re:Invent 2025 that you might have missed]]></description><link>https://www.thelastdev.com/p/aws-reinvent-2025-somewhat-non-genai</link><guid 
isPermaLink="false">https://www.thelastdev.com/p/aws-reinvent-2025-somewhat-non-genai</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Thu, 11 Dec 2025 08:13:45 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b1d9772e-8ee4-4c24-b881-e881384d804a_2160x2160.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Every re:Invent comes with a wave of announcements, some big, some subtle, and some that quietly make day-to-day engineering a bit better.<br>This is my quick recap of the non-GenAI updates that stood out to me this year: improvements in observability, new cost-optimization levers, updates to Lambda&#8217;s capabilities, and a notable step forward in multicloud networking.</p><p>Feel free to check more updates from re:Invent and pre:Invent 2025 here:  https://aws-news.com/</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thelastdev.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Last Dev! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2>CloudTrail events in CloudWatch: fewer moving parts, more visibility</h2><p>AWS added a more straightforward way to enable <strong><a href="https://aws.amazon.com/about-aws/whats-new/2025/12/key-enhancements-cloudtrail-events-cloudwatch">CloudTrail events in CloudWatch</a></strong> using <strong>service-linked channels (SLCs)</strong>.</p><h3>Why I care</h3><p>Because half the time you want CloudTrail events in CloudWatch, you&#8217;re not trying to build an archival strategy. You&#8217;re trying to answer:</p><ul><li><p>&#8220;Who changed this security group?&#8221;</p></li><li><p>&#8220;Why did this API call spike?&#8221;</p></li><li><p>&#8220;Can I alert on this before it becomes a Slack incident?&#8221;</p></li></ul><p>SLCs also include features such as <strong>safety checks</strong> and <strong>termination protection</strong>.</p><h3>The &#8220;AWS fine print&#8221;</h3><p>You still pay:</p><ul><li><p>CloudTrail event delivery charges <strong>and</strong></p></li><li><p>CloudWatch Logs ingestion (custom logs pricing).</p></li></ul><p>Yes, it&#8217;s simpler, but don&#8217;t turn everything on everywhere and then act surprised.</p><h2>Database Savings Plans</h2><p>AWS introduced <strong><a href="https://aws.amazon.com/blogs/aws/introducing-database-savings-plans-for-aws-databases/">Database Savings Plans</a></strong> (up to <strong>35%</strong> savings).<br>This is essentially AWS acknowledging that databases are&nbsp;<em>expensive</em>&nbsp;and&nbsp;that &#8220;please right-size&#8221; is not a strategy.</p><h3>What to do with it</h3><p>If you have steady-state 
usage (Aurora/RDS/others in the eligible set), you can treat it like:</p><ul><li><p>commit for the baseline,</p></li><li><p>keep spikes on-demand.</p></li></ul><p>AWS also integrated this into the billing console recommendations flow (Savings Plans recommendations + purchase analyzer).</p><h2>Lambda Managed Instances: serverless DX, EC2-shaped compute</h2><p>This one is spicy: <strong><a href="https://aws.amazon.com/about-aws/whats-new/2025/11/aws-lambda-managed-instances/">Lambda Managed Instances</a></strong> lets you run Lambda functions <strong>on your Amazon EC2 instances</strong> via a <strong>capacity provider</strong> model. This update is very controversial, since I believe in Lambda being serverless, but you can still use Lambda (as it was initially intended) and ignore this release &#128514;</p><p>You define:</p><ul><li><p>VPC config</p></li><li><p>optional instance requirements</p></li><li><p>scaling policies<br>&#8230;and then attach Lambda functions to that capacity provider via console, API, or IaC.</p></li></ul><h3>Why this matters</h3><p>This is a new middle layer for teams that:</p><ul><li><p>love Lambda event sources + tooling (CloudWatch, X-Ray, Config&#8230;)</p></li><li><p>but want more control over compute shape or cost for steady workloads.</p></li></ul><p>Also, supported runtimes include the latest Java, Node.js, Python, and .NET.</p><h3>The &#8220;this will come up in architecture review&#8221; part</h3><p>Third-party reporting states that pricing is&nbsp;<strong>standard EC2 + a compute management fee + request pricing</strong>, and that this eliminates the usual Lambda duration charge (since you&#8217;re paying for EC2).</p><h2>API Gateway adds MCP proxy support (AI-adjacent, but it&#8217;s really &#8220;API-as-a-tool&#8221;)</h2><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/12/api-gateway-mcp-proxy-support/">API Gateway now supports </a><strong><a 
href="https://aws.amazon.com/about-aws/whats-new/2025/12/api-gateway-mcp-proxy-support/">MCP proxy</a></strong> capability, which provides&nbsp;<strong>protocol translation</strong>&nbsp;so a REST API can communicate with MCP clients/agents without requiring app changes or additional infrastructure.</p><p>AWS frames it alongside Bedrock AgentCore Gateway services, but the &#8220;boring&#8221; value is:</p><ul><li><p>governance,</p></li><li><p>auth,</p></li><li><p>throttling,</p></li><li><p>making APIs discoverable/usable as tools</p></li></ul><p>If you&#8217;re not building &#8220;agents&#8221;, you can still read this as: API Gateway now supports another integration pattern that used to require custom glue (not the service).</p><h2>AWS Interconnect (multicloud) preview: private connectivity to other clouds</h2><p>AWS announced a preview of <strong><a href="https://aws.amazon.com/about-aws/whats-new/2025/11/preview-aws-interconnect-multicloud/">AWS Interconnect &#8211; multicloud</a></strong>: &#8220;simple, resilient, high-speed private connections&#8221; to other cloud providers.</p><p>It starts with <strong>Google Cloud</strong> as the first partner, and AWS says <strong>Azure comes later in 2026</strong>.</p><h3>Why this matters</h3><p>Multicloud connectivity is usually:</p><ul><li><p>slow to procure,</p></li><li><p>annoying to operate,</p></li><li><p>and &#8220;fun&#8221; during incident response.</p></li></ul><p>The promise here is: private links between clouds without weeks of paperwork and waiting.</p><p>We will see how this evolves over time.</p><h2>Amazon S3 Vectors is GA: S3 continues to absorb the universe</h2><p><strong><a href="https://aws.amazon.com/about-aws/whats-new/2025/12/amazon-s3-vectors-generally-available/">S3 Vectors</a></strong> is now <strong>generally available</strong>, and AWS says it&#8217;s available in <strong>14 Regions</strong> (up from 5 in preview).</p><p>Even if you don&#8217;t want to say &#8220;embeddings&#8221; out loud, 
vector storage shows up in:</p><ul><li><p>similarity search,</p></li><li><p>dedupe,</p></li><li><p>recommendations,</p></li><li><p>anomaly detection,</p></li><li><p>and hybrid search patterns.</p></li></ul><p>AWS also highlights integration patterns in which OpenSearch can manage vector storage in S3 to optimize hybrid search costs.</p><p>I plan to write a showcase blog post about this update in the coming weeks.</p><h2>Lambda Durable Functions: long-running workflows without paying for &#8220;waiting&#8221;</h2><p>AWS added <strong><a href="https://aws.amazon.com/about-aws/whats-new/2025/12/lambda-durable-multi-step-applications-ai-workflows/">durable functions for Lambda</a></strong> to build multi-step workflows that can run for seconds to&nbsp;<strong>one year</strong>, without incurring idle compute costs while waiting for humans/external systems.</p><p>I am still a fan of Step Functions, and TBH, by using Step Functions&#8217; native integrations, there is no need for Lambda functions for everything (e.g., putting a DynamoDB item). I see the value here, but for me, Step Functions still win in that case.</p><h3>Why this matters</h3><p>Because today, teams tend to choose between:</p><ul><li><p>Step Functions (great, but can become &#8220;JSON orchestration art&#8221;)</p></li><li><p>DIY state in DynamoDB + retries + idempotency + sadness</p></li></ul><p>Durable functions are AWS&#8217;s way of saying: &#8220;What if the workflow <em>was code</em>, but also reliable?&#8221;</p><p>The AWS News Blog post explicitly positions it for long-running multi-step coordination and not paying while waiting.</p>
<h2><strong>Conclusion</strong></h2><p>It&#8217;s easy to get swept up in the big re:Invent moments, but these quieter updates are the ones that tend to stick. They refine the everyday experience of building on AWS, and that&#8217;s often where the real gains show up.</p><p>Feel free to reach out if you have suggestions for my next blog post.</p><p>Till the next time, stay safe and have fun! &#10084;&#65039;</p>]]></content:encoded></item><item><title><![CDATA[Amazon S3 Tables using Glue Jobs and Terraform]]></title><description><![CDATA[Enable managed iceberg tables in AWS and ingest data with Glue Jobs]]></description><link>https://www.thelastdev.com/p/amazon-s3-tables-using-glue-jobs</link><guid isPermaLink="false">https://www.thelastdev.com/p/amazon-s3-tables-using-glue-jobs</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Fri, 06 Jun 2025 06:57:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/e9a920e5-afb0-4d9a-8a29-9aa704eb16d8_2075x1051.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this &#8220;unofficial&#8221; series showcasing Amazon&#8217;s data stack, we will introduce <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables.html">Amazon S3 Tables</a>. Amazon S3 Tables is a new AWS feature that makes Apache Iceberg-based tables natively available in Amazon S3, without needing to manage Iceberg yourself.
So far, we have seen <a href="https://www.thelastdev.com/p/showcasing-aws-athena">Amazon Athena</a>, how to <a href="https://www.thelastdev.com/p/managing-cost-and-usage-reports-data">query data, such as CUR reports</a>, and <a href="https://www.thelastdev.com/p/using-aws-glue-jobs-with-terraform">how to use Glue Jobs</a>. We will build on that knowledge to take a short dive into S3 Tables.</p><p>What we&#8217;ll see:</p><ul><li><p>How to provision S3 Tables with Terraform, including table buckets, namespaces, and Iceberg tables.</p></li><li><p>How to set up a workflow with Step Functions and Glue Jobs to process CSV files and ingest the data into our Iceberg tables.</p></li><li><p>How to manage access with Lake Formation (basic).</p></li><li><p>How to query our Iceberg tables from Athena, which makes provisioning Lake Formation mandatory&#129394;</p></li></ul><p>We will not cover S3 Tables table maintenance in this post. You can read more about table maintenance <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-maintenance-overview.html">here</a>.</p><p>This is a <strong>Level 300</strong> post; following along and deploying the infrastructure to your AWS account will <strong>cost approximately $3.50</strong>, assuming you run the Glue Job around 10 times and have around 1 GB of files in S3 buckets. You can follow along with the code in my repo.</p>
<h2>What are S3 Tables</h2><p><a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables.html">Amazon S3 Tables</a> is a purpose-built storage solution within Amazon S3, explicitly optimized for analytics workloads that use tabular data. Unlike general-purpose S3 buckets, S3 Tables introduces a new bucket type, table buckets, designed to efficiently store and manage data in a table-like structure (rows and columns), such as transaction logs, sensor data, or event streams.</p><p>S3 Tables natively supports the Apache Iceberg format, enabling advanced features like schema evolution, partition evolution, and time travel queries.
This means you can query your data using standard SQL with analytics engines that support Iceberg, such as Amazon Athena, Amazon Redshift, and Apache Spark.</p><p>Key features of S3 Tables include:</p><ul><li><p><strong>Optimized performance</strong> for high-throughput analytics queries.</p></li></ul><ul><li><p><strong>Automated table maintenance</strong> (compaction, snapshot management, and cleanup of unused files).</p></li></ul><ul><li><p><strong>Fine-grained access control</strong> using IAM and Lake Formation </p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!REhb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b57bb86-9cc1-489c-9fbf-a4cdbf21f62e_735x490.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!REhb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b57bb86-9cc1-489c-9fbf-a4cdbf21f62e_735x490.jpeg 424w, https://substackcdn.com/image/fetch/$s_!REhb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b57bb86-9cc1-489c-9fbf-a4cdbf21f62e_735x490.jpeg 848w, https://substackcdn.com/image/fetch/$s_!REhb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b57bb86-9cc1-489c-9fbf-a4cdbf21f62e_735x490.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!REhb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b57bb86-9cc1-489c-9fbf-a4cdbf21f62e_735x490.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!REhb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b57bb86-9cc1-489c-9fbf-a4cdbf21f62e_735x490.jpeg" width="324" height="216" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b57bb86-9cc1-489c-9fbf-a4cdbf21f62e_735x490.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:490,&quot;width&quot;:735,&quot;resizeWidth&quot;:324,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Pin de Jach en Memes ( &#865;&#176; &#860;&#662; &#865;&#176;) todo tipo | Memes, Viejitos, Mejores memes&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Pin de Jach en Memes ( &#865;&#176; &#860;&#662; &#865;&#176;) todo tipo | Memes, Viejitos, Mejores memes" title="Pin de Jach en Memes ( &#865;&#176; &#860;&#662; &#865;&#176;) todo tipo | Memes, Viejitos, Mejores memes" srcset="https://substackcdn.com/image/fetch/$s_!REhb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b57bb86-9cc1-489c-9fbf-a4cdbf21f62e_735x490.jpeg 424w, https://substackcdn.com/image/fetch/$s_!REhb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b57bb86-9cc1-489c-9fbf-a4cdbf21f62e_735x490.jpeg 848w, https://substackcdn.com/image/fetch/$s_!REhb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b57bb86-9cc1-489c-9fbf-a4cdbf21f62e_735x490.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!REhb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b57bb86-9cc1-489c-9fbf-a4cdbf21f62e_735x490.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div></li></ul><ul><li><p><strong>Seamless integration</strong> with AWS analytics services, allowing easy discovery and querying of your tabular data.</p></li></ul><h2>What You&#8217;ll Learn</h2><p>In this post, you&#8217;ll build an automated way of processing CSV files into Iceberg using Step Functions, Lambdas, Glue Jobs, and most importantly, S3 Tables.</p><p><strong>Key takeaways:</strong></p><ul><li><p>Lake Formation is mandatory when you want to expose your data to Athena and other analytical services within AWS.</p></li><li><p>Unfortunately, there will be some manual work in the Console for Lake Formation.</p></li><li><p>There is no need to enable Lake Formation if the Iceberg tables are only used by your Spark jobs.</p></li></ul><h2><strong>Setting the Stage - Terraform</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!itLX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ba47d47-30c9-45b4-83fe-73d9919e42b5_2075x1051.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!itLX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ba47d47-30c9-45b4-83fe-73d9919e42b5_2075x1051.png 424w, https://substackcdn.com/image/fetch/$s_!itLX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ba47d47-30c9-45b4-83fe-73d9919e42b5_2075x1051.png 848w, 
https://substackcdn.com/image/fetch/$s_!itLX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ba47d47-30c9-45b4-83fe-73d9919e42b5_2075x1051.png 1272w, https://substackcdn.com/image/fetch/$s_!itLX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ba47d47-30c9-45b4-83fe-73d9919e42b5_2075x1051.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!itLX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ba47d47-30c9-45b4-83fe-73d9919e42b5_2075x1051.png" width="1456" height="737" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1ba47d47-30c9-45b4-83fe-73d9919e42b5_2075x1051.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:737,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:201536,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/161945684?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ba47d47-30c9-45b4-83fe-73d9919e42b5_2075x1051.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!itLX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ba47d47-30c9-45b4-83fe-73d9919e42b5_2075x1051.png 424w, 
https://substackcdn.com/image/fetch/$s_!itLX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ba47d47-30c9-45b4-83fe-73d9919e42b5_2075x1051.png 848w, https://substackcdn.com/image/fetch/$s_!itLX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ba47d47-30c9-45b4-83fe-73d9919e42b5_2075x1051.png 1272w, https://substackcdn.com/image/fetch/$s_!itLX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ba47d47-30c9-45b4-83fe-73d9919e42b5_2075x1051.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>As you can see in the diagram above, we will set up a landing bucket for our data, and when we upload a CSV file, a Lambda will be triggered. This Lambda starts a Step Function (a deliberate extension point in case you want to expand the functionality), which runs the Glue Job to process the CSV file. The data will land in the S3 Table, and we will be able (with proper access) to query the data in Athena as well.</p><h3>Terraform</h3><p>In the Terraform code below, I will leave out the IAM statements and other things we have covered multiple times in our previous posts, and focus on the essential parts of our pipeline. You can find the full code <a href="https://github.com/siakon89/aws-s3-tables/tree/main">here</a>.</p><p>As always, we will start from the S3 buckets listed in <a href="https://github.com/siakon89/aws-s3-tables/blob/main/s3.tf">s3.tf</a>. We create two buckets: one named raw_data_bucket for landing our CSV files, and one for the Glue Job artifacts.</p><pre><code><code>module "artifacts_bucket" {
  source  = "terraform-aws-modules/s3-bucket/aws"
  version = "~&gt; 4.8"

  bucket = "${local.project_name}-glue-artifacts-${local.environment}"

  force_destroy = true
  acl           = "private"

  # Add ownership controls
  control_object_ownership = true
  object_ownership         = "ObjectWriter"


  tags = local.tags
}

module "raw_data_bucket" {
  source  = "terraform-aws-modules/s3-bucket/aws"
  version = "~&gt; 4.10"

  bucket = local.raw_data_bucket_name

  force_destroy = true
  acl           = "private"

  # Add ownership controls
  control_object_ownership = true
  object_ownership         = "ObjectWriter"

  tags = local.tags
}</code></code></pre><p>Now let&#8217;s create the Glue Job, which will take the file from the raw bucket and convert it to Iceberg format. File: <a href="https://github.com/siakon89/aws-s3-tables/blob/main/glue.tf">glue.tf</a></p><pre><code># Upload the S3 Tables Iceberg connector JAR
resource "aws_s3_object" "s3_tables_connector" {
  bucket = module.artifacts_bucket.s3_bucket_id
  key    = "jars/s3-tables-catalog-for-iceberg-runtime-0.1.5.jar"
  source = "${path.module}/jars/s3-tables-catalog-for-iceberg-runtime-0.1.5.jar"
  etag   = filemd5("${path.module}/jars/s3-tables-catalog-for-iceberg-runtime-0.1.5.jar")
}

# Upload the Glue job script to S3
resource "aws_s3_object" "glue_job_script" {
  depends_on = [aws_s3_object.s3_tables_connector]
  bucket     = module.artifacts_bucket.s3_bucket_id
  key        = "scripts/csv_to_iceberg.py"
  source     = "${path.module}/scripts/csv_to_iceberg.py"
  etag       = filemd5("${path.module}/scripts/csv_to_iceberg.py")
}

# Glue job definition
resource "aws_glue_job" "csv_to_iceberg" {
  depends_on = [aws_s3_object.glue_job_script]
  name       = "${local.project_name}-csv-to-iceberg"
  role_arn   = aws_iam_role.glue_job_role.arn

  command {
    name            = "glueetl"
    script_location = "s3://${module.artifacts_bucket.s3_bucket_id}/${aws_s3_object.glue_job_script.key}"
    python_version  = "3"
  }

  default_arguments = {
    "--job-language"                     = "python"
    "--job-bookmark-option"              = "job-bookmark-enable"
    "--enable-metrics"                   = "true"
    "--enable-continuous-cloudwatch-log" = "true"
    "--TempDir"                          = "s3://${module.raw_data_bucket.s3_bucket_id}/temp/"
    "--extra-jars"                       = "s3://${module.artifacts_bucket.s3_bucket_id}/jars/s3-tables-catalog-for-iceberg-runtime-0.1.5.jar"
  }

  execution_property {
    max_concurrent_runs = 2
  }

  glue_version      = "5.0"
  worker_type       = "G.1X"
  number_of_workers = 2
  timeout           = 15
}

resource "aws_iam_role" "glue_job_role" {
  name = "${local.project_name}-glue-job-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "glue.amazonaws.com"
        }
      }
    ]
  })
}
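
# Hypothetical sketch, not taken from the original repo (the full repo may already
# define equivalent permissions). To my knowledge, AWSGlueServiceRole alone does not
# grant any s3tables:* actions, so the job role likely needs an inline policy along
# these lines. var.table_bucket_arn and the resource name are assumptions made for
# illustration; trim the action list to what your job actually does. The role also
# needs read access to the raw data and artifacts buckets (omitted here).
variable "table_bucket_arn" {
  description = "ARN of the S3 table bucket the Glue Job writes to (assumed input)"
  type        = string
}

resource "aws_iam_role_policy" "glue_s3_tables_access" {
  name = "${local.project_name}-glue-s3-tables-access"
  role = aws_iam_role.glue_job_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "s3tables:GetTableBucket",
          "s3tables:ListNamespaces",
          "s3tables:GetNamespace",
          "s3tables:CreateNamespace",
          "s3tables:ListTables",
          "s3tables:GetTable",
          "s3tables:CreateTable",
          "s3tables:GetTableMetadataLocation",
          "s3tables:UpdateTableMetadataLocation",
          "s3tables:GetTableData",
          "s3tables:PutTableData"
        ]
        # The table bucket itself plus every table inside it
        Resource = [
          var.table_bucket_arn,
          "${var.table_bucket_arn}/table/*"
        ]
      }
    ]
  })
}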

resource "aws_iam_role_policy_attachment" "glue_service" {
  role       = aws_iam_role.glue_job_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole"
}</code></pre><p>This is pretty much the same as what we used in our <a href="https://www.thelastdev.com/p/using-aws-glue-jobs-with-terraform">previous post</a>; the only difference is that we upload the JAR file for managing the Iceberg tables.</p><p>Regarding the Step Function, it is as simple as it gets; it is composed of a single step that runs the Glue Job. Feel free to extend this and add more capabilities, and let me know in the comments below what you did. File: <a href="https://github.com/siakon89/aws-s3-tables/blob/main/state_machine.tf">state_machine.tf</a></p><pre><code># Step Functions state machine definition
module "etl_state_machine" {
  source  = "terraform-aws-modules/step-functions/aws"
  version = "~&gt; 4.2.1"

  name = "${local.project_name}-etl-workflow"

  attach_policy_json = true
  policy_json        = data.aws_iam_policy_document.step_functions_glue_policy.json

  definition = jsonencode({
    Comment = "ETL workflow to convert CSV to Iceberg",
    StartAt = "StartGlueJob",
    States = {
      "StartGlueJob" = {
        Type     = "Task",
        Resource = "arn:aws:states:::glue:startJobRun.sync",
        Parameters = {
          JobName = aws_glue_job.csv_to_iceberg.name,
          Arguments = {
            "--source_s3_path.$"   = "$.source_s3_path",
            "--table_namespace.$"  = "$.table_namespace",
            "--table_name.$"       = "$.table_name",
            "--table_bucket_arn.$" = "$.table_bucket_arn"
          }
        },
        ResultPath = "$.glueJobResult",
        End        = true
      }
    }
  })
}</code></pre><p>Now let&#8217;s put everything together. We will add a notification on the S3 bucket that invokes a Lambda every time we upload a CSV, and the Lambda in turn starts the Step Function. File: <a href="https://github.com/siakon89/aws-s3-tables/blob/main/lambdas.tf">lambdas.tf</a></p><pre><code># ECR Docker image for Lambda
module "docker_image" {
  source = "terraform-aws-modules/lambda/aws//modules/docker-build"

  ecr_repo    = module.ecr.repository_name
  source_path = "${path.module}/lambdas"

  use_image_tag = true
}

module "ecr" {
  source = "terraform-aws-modules/ecr/aws"

  repository_name         = "${local.project_name}-ecr"
  repository_force_delete = true

  create_lifecycle_policy = false

  repository_lambda_read_access_arns = [module.trigger_step_function.lambda_function_arn]
}


module "trigger_step_function" {
  source  = "terraform-aws-modules/lambda/aws"
  version = "~&gt; 7.20"

  function_name = "${local.project_name}-trigger-step-function"
  description   = "Lambda function to trigger Step Function when a file is uploaded to S3"

  create_package = false
  image_uri      = module.docker_image.image_uri
  package_type   = "Image"

  timeout     = 300
  memory_size = 512

  environment_variables = {
    GLUE_JOB_NAME     = aws_glue_job.csv_to_iceberg.name
    STATE_MACHINE_ARN = module.etl_state_machine.state_machine_arn
    TABLE_NAMESPACE   = aws_s3tables_namespace.iceberg_namespace.namespace
    TABLE_NAME        = local.table_name
    TABLE_BUCKET_ARN  = module.s3_tables_bucket.s3_table_bucket_arn
  }

  image_config_command = ["trigger_step_function.handler"]

  attach_policies = true
  policies = [
    "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole",
    aws_iam_policy.lambda_glue_access.arn,
    aws_iam_policy.lambda_step_functions_policy.arn
  ]
  number_of_policies = 3

  tags = local.tags
}

resource "aws_s3_bucket_notification" "bucket_notification" {
  bucket = module.raw_data_bucket.s3_bucket_id

  lambda_function {
    lambda_function_arn = module.trigger_step_function.lambda_function_arn
    events              = ["s3:ObjectCreated:*"]
    filter_prefix       = "input/"
    filter_suffix       = ".csv"
  }

  depends_on = [aws_lambda_permission.allow_bucket]
}

resource "aws_lambda_permission" "allow_bucket" {
  statement_id  = "AllowExecutionFromS3Bucket"
  action        = "lambda:InvokeFunction"
  function_name = module.trigger_step_function.lambda_function_arn
  principal     = "s3.amazonaws.com"
  source_arn    = "arn:aws:s3:::${module.raw_data_bucket.s3_bucket_id}"
}</code></pre><p>Again, make sure you check the files in the repository, because I am omitting the IAM resources here.</p><p>Now let&#8217;s see the protagonist of our post, the S3 Table. File: <a href="https://github.com/siakon89/aws-s3-tables/blob/main/main.tf">main.tf</a></p><pre><code>module "s3_tables_bucket" {
  source  = "terraform-aws-modules/s3-bucket/aws//modules/table-bucket"
  version = "~&gt; 4.10"

  table_bucket_name = local.s3_tables_bucket_name
  encryption_configuration = {
    kms_key_arn   = module.kms.key_arn
    sse_algorithm = "aws:kms"
  }

  maintenance_configuration = {
    iceberg_unreferenced_file_removal = {
      status = "enabled"

      settings = {
        non_current_days  = 7
        unreferenced_days = 3
      }
    }
  }

  create_table_bucket_policy = true
  table_bucket_policy        = data.aws_iam_policy_document.s3_tables_bucket_policy.json
}

# S3 Tables Namespace - requires a table bucket
resource "aws_s3tables_namespace" "iceberg_namespace" {
  namespace        = local.namespace_name
  table_bucket_arn = module.s3_tables_bucket.s3_table_bucket_arn
}</code></pre><p>An S3 table bucket is a different kind of S3 bucket, listed under the Table buckets section of the S3 console. Once the table bucket is created, you will find it there. As you can see, I have added some simple maintenance (data housekeeping); you can add more functionality if you want to.</p><p>Since we want to query our Iceberg table with Athena, we must enable the integration with the <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-integrating-aws.html">AWS analytics services</a>. The integration uses Lake Formation for access management and cataloging.</p><p>To convert the CSV file from the S3 bucket to an Iceberg table, we will use the following code (part of <a href="https://github.com/siakon89/aws-s3-tables/blob/main/scripts/csv_to_iceberg.py">csv_to_iceberg.py</a>):</p><pre><code><code># Initialize spark session with integration to analytics services
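# Sketch, not taken from the original script: ACCOUNT_ID and TABLE_BUCKET_NAME
# are not defined in this excerpt. One way to obtain them, assuming the job
# only receives the table bucket ARN, is to split the ARN. The helper name
# parse_table_bucket_arn is hypothetical.
def parse_table_bucket_arn(table_bucket_arn):
    # Expected shape: arn:aws:s3tables:REGION:ACCOUNT_ID:bucket/BUCKET_NAME
    parts = table_bucket_arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "s3tables" or not parts[5].startswith("bucket/"):
        raise ValueError(f"Not an S3 table bucket ARN: {table_bucket_arn}")
    account_id = parts[4]
    bucket_name = parts[5][len("bucket/"):]
    return account_id, bucket_name

# Hypothetical usage: ACCOUNT_ID, TABLE_BUCKET_NAME = parse_table_bucket_arn(TABLE_BUCKET_ARN)
# The session below reads both values.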
spark = SparkSession.builder.appName("SparkIcebergSQL") \
       .config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions") \
       .config("spark.sql.defaultCatalog", "s3tables") \
       .config("spark.sql.catalog.s3tables", "org.apache.iceberg.spark.SparkCatalog") \
       .config("spark.sql.catalog.s3tables.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog") \
       .config("spark.sql.catalog.s3tables.glue.id", f"{ACCOUNT_ID}:s3tablescatalog/{TABLE_BUCKET_NAME}") \
       .config("spark.sql.catalog.s3tables.warehouse", f"s3://{TABLE_BUCKET_NAME}/warehouse/") \
       .getOrCreate()</code></code></pre><p>First, we initialize the Spark session with the <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-integrating-glue.html#glue-etl-script">integration with the analytics services</a>.</p><pre><code>dynamic_frame = glueContext.create_dynamic_frame.from_options(
        format_options={
            "quoteChar": '"',
            "withHeader": True,
            "separator": ",",
            "optimizePerformance": False,
        },
        connection_type="s3",
        format="csv",
        connection_options={
            "paths": [SOURCE_S3_PATH],
            "recurse": True
        },
        transformation_ctx="read_csv"
    )</code></pre><p>Then we read the CSV file from the S3 bucket and place it into a dynamic frame.</p><pre><code>columns = df.dtypes
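# Sketch, not taken from the original script: slugify is used on the next line
# but is not shown in this excerpt. A minimal stand-in, assuming column names
# should become lowercase snake_case identifiers, could look like this.
import re

def slugify(name):
    # Replace every run of non-alphanumeric characters with one underscore
    slug = re.sub(r"[^0-9a-zA-Z]+", "_", name).strip("_").lower()
    # Fall back to a placeholder when nothing usable remains
    return slug or "column"
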
columns_sql = ", ".join([f"{slugify(name)} {dtype.upper()}" for name, dtype in columns])

table_identifier = f"{TABLE_NAMESPACE}.{TABLE_NAME}"
    
create_table_sql = f"""
    CREATE TABLE IF NOT EXISTS {table_identifier} (
        {columns_sql}
    )
"""

spark.sql(create_table_sql)</code></pre><p>Then we infer the schema of the CSV and create the table in the Table Bucket.</p><pre><code>df.createOrReplaceTempView("temp_data_to_insert")
insert_sql = f"""
    INSERT INTO {table_identifier}
    SELECT * FROM temp_data_to_insert
"""
spark.sql(insert_sql)</code></pre><p>Last but not least, we insert the data into the table.</p><h3>Set up the environment</h3><p>Now, to put everything in place, we will use Terraform. As I said in the beginning, there are some manual steps.</p><p>First of all, run:</p><pre><code>terraform apply</code></pre><p>This will create the resources, but you will not be able to run the script yet, since we still need to enable the Lake Formation integration and grant the necessary access to our Glue role.</p><p>Next, enable the integration:</p><ol><li><p>Open the Amazon S3 console at <a href="https://console.aws.amazon.com/s3/">https://console.aws.amazon.com/s3/</a>.</p></li><li><p>In the left navigation pane, choose <strong>Table buckets</strong>.</p></li><li><p>Click <strong>Enable integration</strong>.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ls_M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cec0d-97dc-43dd-994f-9b526a45d8ea_1642x228.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><img src="https://substackcdn.com/image/fetch/$s_!Ls_M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c5cec0d-97dc-43dd-994f-9b526a45d8ea_1642x228.png" width="1456" height="202" class="sizing-normal" alt="" loading="lazy"></picture></div></a></figure></div><p>Once you do that, your bucket will be registered as a catalog in Lake Formation under the name s3tablescatalog. Navigate to the Lake Formation console and do the following:</p><p>Go to <strong>Permissions</strong>, click <strong>Data permissions</strong>, and then click <strong>Grant</strong>.</p><ul><li><p>Under Principals, select IAM users and roles, and then select your Glue role.</p></li><li><p>Then select Named Data Catalog resources and</p><ul><li><p>Select your catalog, the one with the bucket name at the end</p></li><li><p>Select your database</p></li></ul></li><li><p>At the bottom, select <strong>Super</strong> under both Database permissions and Grantable permissions. (You can narrow the scope if you want to by granting only Describe, Create table, and Alter.)</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6PfH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5845828-7067-48bb-b0d3-47c1802ccdfb_819x1109.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><img src="https://substackcdn.com/image/fetch/$s_!6PfH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5845828-7067-48bb-b0d3-47c1802ccdfb_819x1109.png" width="819" height="1109" class="sizing-normal" alt="" loading="lazy"></picture></div></a></figure></div><p>Then repeat the same grant, but this time select all tables as well.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wrNM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde500e43-468f-4ff1-8323-6258ed0e0fd3_799x1118.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><img src="https://substackcdn.com/image/fetch/$s_!wrNM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde500e43-468f-4ff1-8323-6258ed0e0fd3_799x1118.png" width="799" height="1118" class="sizing-normal" alt="" loading="lazy"></picture></div></a></figure></div><p>I&#8217;ve tried to do this with Terraform, but there seems to be a bug when referencing an S3 Tables catalog.</p><p>Last but not least, you will need to uncomment the resource <a href="https://github.com/siakon89/aws-s3-tables/blob/main/main.tf#L95-L102">here</a>.</p><pre><code>resource "aws_lakeformation_permissions" "data_location" {
  principal   = aws_iam_role.glue_job_role.arn
  permissions = ["DATA_LOCATION_ACCESS"]

  data_location {
    arn = module.s3_tables_bucket.s3_table_bucket_arn
  }
}</code></pre><p>Then run Terraform again:</p><pre><code>terraform apply</code></pre><p>And you are set. Upload a CSV file to your raw bucket under the <code>input/</code> prefix and watch it run &#128516;</p><h2><strong>Conclusion</strong></h2><p>Exploring S3 Tables has been a game-changer in how I think about storing and analyzing tabular data in the cloud. The performance, built-in Iceberg support, and &#8220;seamless integration&#8221; (besides Lake Formation) with AWS analytics tools made it easy to set up a modern, scalable data lake.</p><p>If you&#8217;re curious about the next generation of data lake storage, I definitely recommend giving S3 Tables a try. It&#8217;s been a fun and eye-opening experience, and I&#8217;m excited to see how this new approach will shape future analytics projects!</p><p>To destroy what we have created today, simply run:</p><pre><code><code>terraform destroy</code></code></pre><p>Feel free to reach out if you encounter any problems or have suggestions.</p><p>Till the next time, stay safe and have fun! 
&#10084;&#65039;</p>]]></content:encoded></item><item><title><![CDATA[Using AWS Glue Jobs with Terraform]]></title><description><![CDATA[Discover how AWS Glue Jobs quietly binds your data journey with the agility of serverless Spark]]></description><link>https://www.thelastdev.com/p/using-aws-glue-jobs-with-terraform</link><guid isPermaLink="false">https://www.thelastdev.com/p/using-aws-glue-jobs-with-terraform</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Fri, 30 May 2025 10:31:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!XB-f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The road so far&#8230; *Carry on, my wayward son is playing in my mind*. In previous posts, we <a href="https://www.thelastdev.com/p/showcasing-aws-athena">saw Amazon Athena</a> and how to <a href="https://www.thelastdev.com/p/managing-cost-and-usage-reports-data">query data, such as CUR reports</a>. Amazon Athena is a fantastic tool provided by AWS, a serverless managed Presto that lets you query your unstructured and semi-structured data at a very low cost. In this post, we will continue our journey through the AWS data stack and see the famous Glue Jobs in practice by converting CSV files to Parquet.</p><p>We will deploy the following services with Terraform, most of them through the <a href="https://registry.terraform.io/namespaces/terraform-aws-modules">Terraform AWS Modules</a>:</p><ul><li><p>Amazon S3: for storing our raw and processed data</p></li><li><p>AWS Glue:</p><ul><li><p>Glue Job: for processing files</p></li><li><p>Glue Crawler: for cataloging the data</p></li></ul></li><li><p>Step Functions: for executing the whole workflow</p></li><li><p>Lambda: for triggering the Step Function upon file upload to the S3 bucket</p></li></ul><p>This is a <strong>Level 200</strong> post; following along and deploying the infrastructure to your AWS account <strong>will cost approximately $1.5 per month</strong>, assuming you run the Glue job and the crawler about 10 times a month. <a href="https://github.com/siakon89/AWS-examples/tree/master/data-services/AWS%20Glue/Glue%20Jobs">You can follow along with the code in my repo</a>.</p><h2><strong>What is AWS Glue</strong></h2><p><strong><a href="https://docs.aws.amazon.com/glue/latest/dg/what-is-glue.html">AWS Glue</a></strong> is a fully managed, serverless data integration service that eliminates the complexity of building and managing data infrastructure. It automatically provisions, scales, and manages the compute resources needed for your data workloads, allowing you to focus on extracting value from your data rather than managing servers.</p><p><strong>&#128375;&#65039; <a href="https://docs.aws.amazon.com/glue/latest/dg/add-crawler.html">Crawlers</a></strong> - Automatically discover and catalog your data across various sources (S3, databases, data lakes), inferring schemas and populating metadata without manual intervention.</p><p><strong>&#128218; <a href="https://docs.aws.amazon.com/glue/latest/dg/catalog-and-crawler.html">Data Catalog</a></strong> - A centralized metadata repository that serves as a persistent store for table definitions, schema information, and data location details, making your data discoverable and queryable.</p><p><strong>&#9881;&#65039; <a href="https://docs.aws.amazon.com/glue/latest/dg/author-glue-job.html">Jobs</a></strong> - Serverless ETL (Extract, Transform, Load) and ELT workflows that process your data using familiar programming languages like Python and Scala, with built-in monitoring and error handling.</p><div class="poll-embed" data-attrs="{&quot;id&quot;:324835}" 
data-component-name="PollToDOM"></div><p></p><h2><strong>What You'll Learn</strong></h2><p>In this post, you'll build an automated AWS Glue ETL pipeline that transforms CSV files to Parquet format and catalogs them for analytics. </p><p><strong>Key takeaways:</strong></p><ul><li><p><strong>Glue Jobs</strong> - Create serverless ETL scripts and use Glue Jobs to process your data</p></li></ul><ul><li><p><strong>Glue Crawlers</strong> - Set up automatic schema discovery and data cataloging for your processed datasets</p></li></ul><ul><li><p><strong>Event-Driven Processing</strong> - Trigger Glue jobs automatically when new files arrive in S3, creating a seamless data ingestion pipeline</p></li></ul><ul><li><p><strong>Serverless Spark</strong> - Leverage AWS Glue's managed Spark environment for scalable data transformations without cluster management</p></li></ul><h2><strong>Setting the Stage - Terraform</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XB-f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XB-f!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png 424w, https://substackcdn.com/image/fetch/$s_!XB-f!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png 848w, 
https://substackcdn.com/image/fetch/$s_!XB-f!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png 1272w, https://substackcdn.com/image/fetch/$s_!XB-f!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XB-f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png" width="724" height="373.9340659340659" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:752,&quot;width&quot;:1456,&quot;resizeWidth&quot;:724,&quot;bytes&quot;:151977,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/163619650?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!XB-f!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png 424w, 
https://substackcdn.com/image/fetch/$s_!XB-f!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png 848w, https://substackcdn.com/image/fetch/$s_!XB-f!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png 1272w, https://substackcdn.com/image/fetch/$s_!XB-f!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F676baf69-9a94-46e1-bffe-614d06190e59_2107x1088.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Architectural Diagram of the solution</figcaption></figure></div><p>As we can see in the diagram above, we are going to use Glue Jobs to convert a file from CSV to Parquet. You can replace the Glue code with whatever you like; I am just using this example to showcase the Job. </p><p>We will start by creating the buckets for our use case. File: <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/AWS%20Glue/Glue%20Jobs/s3.tf">s3.tf</a>. We create multiple buckets using <a href="https://www.linkedin.com/in/antonbabenko/">Anton&#8217;s</a> module, so I will only show you the definition of the raw bucket. Later, I will also show you how to create the trigger on it.</p><pre><code>module "raw_bucket" {
  source  = "terraform-aws-modules/s3-bucket/aws"
  version = "~&gt; 4.8"

  bucket = "${local.project_name}-raw-data-${local.environment}"

  force_destroy = true
  acl           = "private"

  # Add ownership controls
  control_object_ownership = true
  object_ownership         = "ObjectWriter"


  tags = local.tags
}</code></pre><p>Now, let&#8217;s create our Glue Job, which will take the files from the raw bucket and place them as Parquet in a processed bucket. File: <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/AWS%20Glue/Glue%20Jobs/glue.tf">glue.tf</a></p><pre><code># Upload the Glue job script to S3
resource "aws_s3_object" "glue_job_script" {
  bucket = module.artifacts_bucket.s3_bucket_id
  key    = "scripts/csv_to_parquet.py"
  source = "${path.module}/scripts/csv_to_parquet.py"
  etag   = filemd5("${path.module}/scripts/csv_to_parquet.py")
}

# Glue job definition
resource "aws_glue_job" "csv_to_parquet" {
  depends_on = [aws_s3_object.glue_job_script]
  name       = "${local.project_name}-csv-to-parquet"
  role_arn   = aws_iam_role.glue_job_role.arn

  command {
    name            = "glueetl"
    script_location = "s3://${module.artifacts_bucket.s3_bucket_id}/${aws_s3_object.glue_job_script.key}"
    python_version  = "3"
  }

  default_arguments = {
    "--job-language"                     = "python"
    "--job-bookmark-option"              = "job-bookmark-enable"
    "--enable-metrics"                   = "true"
    "--enable-continuous-cloudwatch-log" = "true"
    "--TempDir"                          = "s3://${module.parquet_bucket.s3_bucket_id}/temp/"
    "--input_path"                       = "s3://${module.raw_bucket.s3_bucket_id}/input/"
    "--output_path"                      = "s3://${module.parquet_bucket.s3_bucket_id}/data/"
  }

  execution_property {
    max_concurrent_runs = 1
  }

  glue_version      = "5.0"
  worker_type       = "G.1X"
  number_of_workers = 2
  timeout           = 10 # minutes
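
  # Optional, not in the original repo: automatically retry a failed run once.
  # max_retries is a standard aws_glue_job argument; the value 1 is an assumption.
  # max_retries = 1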
}</code></pre><p>A lot to digest here &#128517;, so let&#8217;s go block by block through the configuration of our Glue Job.</p><pre><code>resource "aws_s3_object" "glue_job_script" {
  bucket = module.artifacts_bucket.s3_bucket_id
  key    = "scripts/csv_to_parquet.py"
  source = "${path.module}/scripts/csv_to_parquet.py"
  etag   = filemd5("${path.module}/scripts/csv_to_parquet.py")
}</code></pre><p>This uploads the script that Glue will run. In this <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/AWS%20Glue/Glue%20Jobs/scripts/csv_to_parquet.py">script</a>, I have used both the Spark context and the Glue context.</p><pre><code>command {
    name            = "glueetl"
    script_location = "s3://${module.artifacts_bucket.s3_bucket_id}/${aws_s3_object.glue_job_script.key}"
    python_version  = "3"
}</code></pre><p>Let&#8217;s define the execution environment and script location. We specify where Glue can find the script and which Python version to use.</p><pre><code>default_arguments = {
    "--job-language"                     = "python"
    "--job-bookmark-option"              = "job-bookmark-enable"
    "--enable-metrics"                   = "true"
    "--enable-continuous-cloudwatch-log" = "true"
    "--TempDir"                          = "s3://${module.parquet_bucket.s3_bucket_id}/temp/"
    "--input_path"                       = "s3://${module.raw_bucket.s3_bucket_id}/input/"
    "--output_path"                      = "s3://${module.parquet_bucket.s3_bucket_id}/data/"
}</code></pre><p>Now for the job behavior and data flow paths. We enable the observability features (metrics and continuous CloudWatch logging) and <a href="https://docs.aws.amazon.com/glue/latest/dg/monitor-continuations.html">bookmarks</a>. Last but not least, we define the S3 locations the job works with: a temporary directory, the input path, and the output path.</p><pre><code>execution_property {
  max_concurrent_runs = 1
}</code></pre><p>This prevents the same job from running more than once at the same time. You can set this to whatever value you need; I&#8217;ve used 1 to ensure the cost estimate above is accurate. </p><pre><code>glue_version      = "5.0"
worker_type       = "G.1X"
number_of_workers = 2
timeout           = 10 # minutes</code></pre><p>We define the compute environment and resource allocation where we set the Glue runtime version (determines Spark version, features available), the allocated compute resources (worker type and count), and some safety limits (timeout) to prevent runaway costs.</p><p>After the Glue Job, we create a Glue Crawler to catalog our data from the destination bucket. You can find the definition <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/AWS%20Glue/Glue%20Jobs/glue.tf#L41-L63">here</a>; no need to describe this again, since we have seen it multiple times in our previous posts.</p><p>Now we need to define our Step Function. It will operate as shown in the definition below. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BqSw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29255af-a38f-46d8-918c-4f34a97b2233_2103x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BqSw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29255af-a38f-46d8-918c-4f34a97b2233_2103x1024.png 424w, https://substackcdn.com/image/fetch/$s_!BqSw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29255af-a38f-46d8-918c-4f34a97b2233_2103x1024.png 848w, https://substackcdn.com/image/fetch/$s_!BqSw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29255af-a38f-46d8-918c-4f34a97b2233_2103x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!BqSw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29255af-a38f-46d8-918c-4f34a97b2233_2103x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BqSw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29255af-a38f-46d8-918c-4f34a97b2233_2103x1024.png" width="1456" height="709" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a29255af-a38f-46d8-918c-4f34a97b2233_2103x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:709,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:132873,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/163619650?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29255af-a38f-46d8-918c-4f34a97b2233_2103x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!BqSw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29255af-a38f-46d8-918c-4f34a97b2233_2103x1024.png 424w, https://substackcdn.com/image/fetch/$s_!BqSw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29255af-a38f-46d8-918c-4f34a97b2233_2103x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!BqSw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29255af-a38f-46d8-918c-4f34a97b2233_2103x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!BqSw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29255af-a38f-46d8-918c-4f34a97b2233_2103x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Feel free to copy the definition from <a 
href="https://github.com/siakon89/AWS-examples/blob/master/data-services/AWS%20Glue/Glue%20Jobs/state_machine.tf#L11-L87">here</a>. File: <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/AWS%20Glue/Glue%20Jobs/state_machine.tf">state_machine.tf</a></p><pre><code># Step Functions state machine definition
module "etl_state_machine" {
  source  = "terraform-aws-modules/step-functions/aws"
  version = "~&gt; 4.2.1"

  name = "${local.project_name}-etl-workflow"

  attach_policy_json = true
  policy_json        = data.aws_iam_policy_document.step_functions_glue_policy.json

  definition = jsonencode({
    Comment = "ETL workflow to process CSV to Parquet and crawl the data",
    StartAt = "StartGlueJob",
    States = {
      "StartGlueJob" = {
        Type     = "Task",
        Resource = "arn:aws:states:::glue:startJobRun.sync",
        Parameters = {
          JobName = aws_glue_job.csv_to_parquet.name,
          Arguments = {
            "--input_path.$"  = "$.input_path",
            "--output_path.$" = "$.output_path"
          }
        },
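        # Optional, not part of the original definition: retry the Glue task
        # if the job run fails. The interval, attempts, and backoff values are
        # assumptions; Retry itself is a standard Amazon States Language field.
        # Retry = [{
        #   ErrorEquals     = ["States.TaskFailed"]
        #   IntervalSeconds = 60
        #   MaxAttempts     = 2
        #   BackoffRate     = 2.0
        # }],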
        ResultPath = "$.glueJobResult",
        Next       = "StartGlueCrawler"
      },
      ...
      "Success" = {
        Type = "Pass",
        End  = true
      }
    }
  })
}</code></pre><p>Let&#8217;s now see how we create the trigger along with the Lambda. The Lambda will be Dockerized. File: <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/AWS%20Glue/Glue%20Jobs/lambda.tf">lambda.tf</a>. This is the Lambda that will trigger the Step Function above.</p><p>We first define the Docker image and the ECR registry.</p><pre><code># ECR Docker image for Lambda
module "docker_image" {
  source = "terraform-aws-modules/lambda/aws//modules/docker-build"

  ecr_repo = module.ecr.repository_name
  # image_tag       = "latest"
  source_path = "${path.module}/lambdas"

  # cache_from = ["${module.ecr.repository_url}:latest"]
  # Use the pre-built image from ECR
  use_image_tag = true
}

module "ecr" {
  source = "terraform-aws-modules/ecr/aws"

  repository_name         = "${local.project_name}-ecr"
  repository_force_delete = true

  create_lifecycle_policy = false

  repository_lambda_read_access_arns = [module.trigger_step_function.lambda_function_arn]
}</code></pre><p>Then we create the Lambda.</p><pre><code>module "trigger_step_function" {
  source  = "terraform-aws-modules/lambda/aws"
  version = "~&gt; 7.20"

  function_name = "${local.project_name}-trigger-step-function"
  description   = "Lambda function to trigger Step Function when a file is uploaded to S3"

  # Docker image config
  create_package = false
  image_uri      = module.docker_image.image_uri
  package_type   = "Image"

  # Lambda settings
  timeout     = 300
  memory_size = 512

  # Environment variables
  environment_variables = {
    GLUE_JOB_NAME     = aws_glue_job.csv_to_parquet.name
    OUTPUT_BUCKET     = module.parquet_bucket.s3_bucket_id
    STATE_MACHINE_ARN = module.etl_state_machine.state_machine_arn
  }

  image_config_command = ["trigger_step_function.handler"]

  # IAM policy statements
  attach_policies = true
  policies = [
    "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole",
    aws_iam_policy.lambda_glue_access.arn,
    aws_iam_policy.lambda_step_functions_policy.arn
  ]
  number_of_policies = 3

  tags = local.tags
}</code></pre><p>Last but not least, we create the S3 event notification on the raw bucket we created above.</p><pre><code># S3 event notification to trigger Lambda
resource "aws_s3_bucket_notification" "bucket_notification" {
  bucket = module.raw_bucket.s3_bucket_id

  lambda_function {
    lambda_function_arn = module.trigger_step_function.lambda_function_arn
    events              = ["s3:ObjectCreated:*"]
    filter_prefix       = "input/"
    filter_suffix       = ".csv"
  }

  depends_on = [aws_lambda_permission.allow_bucket]
}

# Permission for S3 to invoke Lambda
resource "aws_lambda_permission" "allow_bucket" {
  statement_id  = "AllowExecutionFromS3Bucket"
  action        = "lambda:InvokeFunction"
  function_name = module.trigger_step_function.lambda_function_arn
  principal     = "s3.amazonaws.com"
  source_arn    = "arn:aws:s3:::${module.raw_bucket.s3_bucket_id}"
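
  # Optional hardening, not in the original repo: S3 bucket ARNs carry no
  # account ID, so also pinning the bucket owner's account narrows who can invoke.
  # source_account is a standard aws_lambda_permission argument; the
  # aws_caller_identity data source is an assumption and must be declared elsewhere.
  # source_account = data.aws_caller_identity.current.account_id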
}</code></pre><p>You can edit your deployment by configuring the variables in <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/AWS%20Glue/Glue%20Jobs/locals.tf">locals.tf</a>. Additional files, not included in the post, are:</p><ul><li><p><a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/AWS%20Glue/Glue%20Jobs/iam.tf">iam.tf</a>: to define the IAM access of our resources</p></li><li><p><a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/AWS%20Glue/Glue%20Jobs/lambdas/trigger_step_function.py">trigger_step_function.py</a>: the Lambda code</p></li><li><p><a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/AWS%20Glue/Glue%20Jobs/lambdas/Dockerfile">Dockerfile</a>: the Lambda image</p></li></ul><p>And that&#8217;s it. You are now ready to deploy the infrastructure.</p><pre><code>terraform plan</code></pre><p>Once we have validated the plan, we can run</p><pre><code>terraform apply</code></pre><h2><strong>Using Glue Jobs</strong></h2><h4>How Glue Jobs Execute Spark Code Serverlessly</h4><p>AWS Glue Jobs provide a <strong>serverless Apache Spark environment</strong> that provisions the compute resources you configure, executes your PySpark transformations across distributed workers, and cleans up when complete. You simply upload your script to S3 and define job parameters - AWS handles cluster management, scaling, and infrastructure entirely behind the scenes, charging you only for the compute time actually used.</p><h4>Job Configuration Essentials</h4><p><strong>Data Processing Units (DPUs)</strong> are the core compute building blocks, where each DPU provides 4 vCPUs and 16 GB of RAM. You can choose from different worker types (Standard, G.1X, G.2X) and scale from 2 to 100 DPUs based on your data volume. 
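</p><p>To make the relationship between worker type, worker count, and cost concrete, here is a minimal Terraform sketch. It is not part of the repository: the local and output names are hypothetical, the values mirror the job defined in glue.tf, and the price per DPU-hour is an assumption that you should check against the AWS Glue pricing page for your region. It relies on a G.1X worker providing 1 DPU and a G.2X worker providing 2 DPUs.</p><pre><code># Sketch only: every name and value here is an assumption for illustration.
locals {
  # DPUs provided by one worker of each type.
  dpu_per_worker = {
    "G.1X" = 1
    "G.2X" = 2
  }

  # Values mirrored from the aws_glue_job resource in glue.tf.
  worker_type       = "G.1X"
  number_of_workers = 2
  timeout_minutes   = 10

  # Assumed price in USD per DPU-hour; verify it for your region.
  assumed_price_per_dpu_hour = 0.44

  # Total DPUs allocated to a single job run.
  job_dpus = local.dpu_per_worker[local.worker_type] * local.number_of_workers

  # Upper bound for one run that uses the whole timeout.
  max_cost_per_run_usd = local.job_dpus * local.assumed_price_per_dpu_hour * local.timeout_minutes / 60
}

output "glue_job_dpus" {
  value = local.job_dpus
}

output "glue_job_max_cost_per_run_usd" {
  value = local.max_cost_per_run_usd
}</code></pre><p>With these assumed values, the job gets 2 DPUs, and a run that uses the full 10-minute timeout would cost roughly $0.15.</p><p>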
<strong>Job bookmarks</strong> intelligently track what data has been processed to avoid reprocessing on subsequent runs - essential for incremental data pipelines. <strong>Retry mechanisms</strong> and <strong>timeout controls</strong> provide operational safety, while <strong>custom arguments</strong> let you parameterize your jobs for different environments and datasets.</p><h2><strong>Conclusion</strong></h2><p>AWS Glue transforms what once required complex cluster management, infrastructure planning, and operational overhead into a simple, declarative experience. </p><p>It is a very powerful tool to know, but it still has its limits. You need to evaluate and validate your configuration carefully.</p><p>To destroy what we have created today, simply run</p><pre><code>terraform destroy</code></pre><p>Feel free to reach out if you encounter any problems or have suggestions.</p><p>Till the next time, stay safe and have fun! 
&#10084;&#65039;</p>]]></content:encoded></item><item><title><![CDATA[Managing Cost and Usage Reports (Data Exports) in AWS]]></title><description><![CDATA[Learn how to set up and query your AWS Cost and Usage Reports (Data Exports)]]></description><link>https://www.thelastdev.com/p/managing-cost-and-usage-reports-data</link><guid isPermaLink="false">https://www.thelastdev.com/p/managing-cost-and-usage-reports-data</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Tue, 20 May 2025 12:43:45 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/809d38f7-8a4e-48ad-946e-b6215f00dff1_1188x1461.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In today&#8217;s post, I am going to show you how to enable <a href="https://docs.aws.amazon.com/cur/latest/userguide/what-is-data-exports.html">Cost and Usage reports</a> for your AWS account/organization and how to query them to get valuable insights. </p><p>We are going to enable <a href="https://docs.aws.amazon.com/cur/latest/userguide/table-dictionary-cur2.html">CUR 2.0</a> and set up an Infrastructure via Terraform to do the following:</p><ul><li><p>Set up Amazon Athena</p></li><li><p>Create a Glue Crawler to catalog our data</p></li><li><p>Create a Lambda to send us monthly reports based on some SQL queries</p></li></ul><p><strong>This is a Level 200 post, following along with the post and deploying the infrastructure to your AWS account will cost approximately ~$1 per month (not including the CUR reports in S3)</strong></p><p>If you haven&#8217;t seen my previous post about Amazon Athena, I would recommend doing so, since I will not dive deep into that section.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;0a3e0cc6-bff6-4b8e-b7b6-0abe27f2c8d8&quot;,&quot;caption&quot;:&quot;In today&#8217;s post, I am going to showcase Amazon Athena. Amazon Athena is one of my favourite services within AWS. 
It has such nice capabilities at a very low cost!&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Showcasing AWS Athena&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:1234969,&quot;name&quot;:&quot;Konstantinos Siaterlis&quot;,&quot;bio&quot;:&quot;Cloud Engineer, AWS Hero, AWS User Group Athens co-organizer, and blogger. Passionate about Data and DevOps, with extensive experience in architecting and implementing robust data platforms. &quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/373847a0-448b-4cc3-8220-698fb8c74a75_300x300.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-05-13T15:19:40.425Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thelastdev.com/p/showcasing-aws-athena&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:162598264,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;The Last Dev&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9a09b4e-a465-40db-b7f5-11f2a830d1c2_261x261.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="subscription-widget-wrap-editor" 
data-attrs="{&quot;url&quot;:&quot;https://www.thelastdev.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Last Dev! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2>What is CUR, and what is CUR 2.0</h2><p>CUR is AWS's Cost and Usage Reports, and this name is now deprecated. The official name of the service is Data Exports. With Data Exports, you can extract AWS accounts&#8217;/orgs&#8217; data to S3 in a readable and queryable format. There are many flavors of exports as mentioned below:</p><ul><li><p><strong>Cost and Usage Report 2.0 (CUR 2.0)</strong><br>Enhanced, detailed cost and usage data. 
Recommended over legacy CUR.</p></li><li><p><strong>Cost Optimization Recommendations</strong><br>Data from the Cost Optimization Hub to identify savings opportunities.</p></li><li><p><strong>FOCUS 1.0 with AWS Columns</strong><br>Structured cost data export aligned with the FOCUS standard.</p></li><li><p><strong>Carbon Emissions</strong><br>Track AWS carbon footprint data for sustainability reporting.</p></li><li><p><strong>Cost and Usage Dashboard (QuickSight)</strong><br>Pre-built dashboard export for cost visualization in Amazon QuickSight.</p></li><li><p><strong>Legacy Cost and Usage Report (Legacy CUR)</strong><br>Older CUR format, supported with different API actions.</p></li></ul><p>We will focus today on the CUR 2.0 reports, but after this post, you can enable whichever data export, follow the same process, use the same infrastructure, and query your data as you did with CUR 2.0.</p><p><a href="https://docs.aws.amazon.com/cur/latest/userguide/table-dictionary-cur2.html">Cost and Usage Reports 2.0</a> provides the following improvements over Cost and Usage Reports (Legacy):</p><ul><li><p><strong>Consistent schema</strong>: CUR 2.0 contains a fixed set of columns, whereas the columns included for CUR can vary monthly depending on your usage of AWS services, cost categories, and resource tags.</p></li><li><p><strong>Nested data</strong>: CUR 2.0 reduces data sparsity by collapsing certain columns from CUR into individual columns with key-value pairs of the collapsed columns. The nested keys can optionally be queried in Data Exports as separate columns to match the original CUR schema and data.</p></li></ul><h2>Enabling CUR 2.0 ~ Data Exports</h2><p>Now let&#8217;s enable our CUR 2.0 in our AWS account. 
<a href="https://docs.aws.amazon.com/cur/latest/userguide/dataexports-create.html">You can follow this guide</a> from AWS, which shows how to do that via the console, but you can always use the CLI.</p><p>First, we need to create a bucket. Make sure you use your own, globally unique bucket name here:</p><pre><code>aws s3 mb s3://my-cur-bucket-example --region eu-central-1</code></pre><p>Then we need to attach a bucket policy so that Data Exports can write to the bucket. Save the following to a JSON file; let&#8217;s call it bucket-policy.json.</p><pre><code>{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EnableAWSDataExportsToWriteToS3AndCheckPolicy",
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "billingreports.amazonaws.com",
                    "bcm-data-exports.amazonaws.com"
                ]
            },
            "Action": [
                "s3:PutObject",
                "s3:GetBucketPolicy"
            ],
            "Resource": [
                "arn:aws:s3:::${bucket_name}/*",
                "arn:aws:s3:::${bucket_name}"
            ],
            "Condition": {
                "StringLike": {
                    "aws:SourceAccount": "${accountId}",
                    "aws:SourceArn": [
                        "arn:aws:cur:us-east-1:${accountId}:definition/*",
                        "arn:aws:bcm-data-exports:us-east-1:${accountId}:export/*"
                    ]
                }
            }
        }
    ]
}</code></pre><p>Replace <code>${bucket_name}</code> with the name you used above and <code>${accountId}</code> with your AWS account ID.</p><p>Then we want to attach this policy to our newly created bucket. Make sure you put the correct bucket name in the command.</p><pre><code>aws s3api put-bucket-policy --bucket my-cur-bucket-example --policy file://bucket-policy.json</code></pre><p>After we have our bucket, we can set up the data export. <a href="https://docs.aws.amazon.com/cur/latest/userguide/data-exports-migrate-two.html">Follow this guide</a>. At this point, you should have a CUR 2.0 data export, and within 24 hours, it will start generating data. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Oqa5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa43f6b24-dae0-4459-bdfb-90e15a996036_1657x1050.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Oqa5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa43f6b24-dae0-4459-bdfb-90e15a996036_1657x1050.png 424w, https://substackcdn.com/image/fetch/$s_!Oqa5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa43f6b24-dae0-4459-bdfb-90e15a996036_1657x1050.png 848w, https://substackcdn.com/image/fetch/$s_!Oqa5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa43f6b24-dae0-4459-bdfb-90e15a996036_1657x1050.png 1272w, https://substackcdn.com/image/fetch/$s_!Oqa5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa43f6b24-dae0-4459-bdfb-90e15a996036_1657x1050.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Oqa5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa43f6b24-dae0-4459-bdfb-90e15a996036_1657x1050.png" width="1456" height="923" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a43f6b24-dae0-4459-bdfb-90e15a996036_1657x1050.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:923,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:201029,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/163612253?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa43f6b24-dae0-4459-bdfb-90e15a996036_1657x1050.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Oqa5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa43f6b24-dae0-4459-bdfb-90e15a996036_1657x1050.png 424w, https://substackcdn.com/image/fetch/$s_!Oqa5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa43f6b24-dae0-4459-bdfb-90e15a996036_1657x1050.png 848w, https://substackcdn.com/image/fetch/$s_!Oqa5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa43f6b24-dae0-4459-bdfb-90e15a996036_1657x1050.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Oqa5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa43f6b24-dae0-4459-bdfb-90e15a996036_1657x1050.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>Disclaimer:</strong> There is no backfill option, meaning you will start seeing data from the day you enabled the data export.</p><h2>Setting the stage - Terraform</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!Y32V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29660c3-4f63-45f6-bbd8-2dae6ff19576_889x1137.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y32V!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29660c3-4f63-45f6-bbd8-2dae6ff19576_889x1137.png 424w, https://substackcdn.com/image/fetch/$s_!Y32V!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29660c3-4f63-45f6-bbd8-2dae6ff19576_889x1137.png 848w, https://substackcdn.com/image/fetch/$s_!Y32V!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29660c3-4f63-45f6-bbd8-2dae6ff19576_889x1137.png 1272w, https://substackcdn.com/image/fetch/$s_!Y32V!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29660c3-4f63-45f6-bbd8-2dae6ff19576_889x1137.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y32V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29660c3-4f63-45f6-bbd8-2dae6ff19576_889x1137.png" width="889" height="1137" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c29660c3-4f63-45f6-bbd8-2dae6ff19576_889x1137.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1137,&quot;width&quot;:889,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:252975,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/163612253?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3d63e5f6-4d62-49ba-9e2e-4347b3331aa0_1188x1461.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y32V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29660c3-4f63-45f6-bbd8-2dae6ff19576_889x1137.png 424w, https://substackcdn.com/image/fetch/$s_!Y32V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29660c3-4f63-45f6-bbd8-2dae6ff19576_889x1137.png 848w, https://substackcdn.com/image/fetch/$s_!Y32V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29660c3-4f63-45f6-bbd8-2dae6ff19576_889x1137.png 1272w, https://substackcdn.com/image/fetch/$s_!Y32V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29660c3-4f63-45f6-bbd8-2dae6ff19576_889x1137.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>As we can see, we will create the following resources. The Terraform files and sample queries can be found in the <a href="https://github.com/siakon89/AWS-examples/tree/master/data-services/Amazon%20Athena/infrastructure_cur">GitHub repo</a>.</p><ul><li><p>2 S3 buckets: one for Athena to save the query results and one for Athena data in case we want to run CTAS queries. The third bucket is the one we created in the previous section.</p></li><li><p>A Glue Crawler to catalog the CUR 2.0 data in the Glue Data Catalog</p></li><li><p>Amazon Athena workgroup and database for the CUR table</p></li><li><p>And a Lambda that will be triggered monthly and send us an email report with the untagged resources. 
The Lambda is packaged as a Docker image.</p></li></ul><p>You can configure the infrastructure by editing the <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/Amazon%20Athena/infrastructure_cur/locals.tf">locals.tf</a> file.</p><ul><li><p>Update <code>aws_region</code> to your preferred region</p></li><li><p>Set <code>name</code> to your project's name</p></li><li><p>Set <code>cur_bucket_name</code> to your CUR bucket name</p></li><li><p>Set <code>cur_prefix</code> to your CUR data prefix</p></li><li><p>Make sure <code>table_name</code> follows the pattern <code>cur_{data}</code>, where <code>data</code> is the folder the Glue crawler reads</p></li><li><p>Set <code>sender_email</code> and <code>recipient_emails</code> so the Lambda can send the CUR report</p></li><li><p>Set <code>tag_key_to_analyze</code> to the tag key you want the report to check for</p></li></ul><p>I will skip the creation of the S3 buckets, AWS Glue Crawler, and Amazon Athena resources, since it is the same as in my previous post. <a href="https://www.thelastdev.com/p/showcasing-aws-athena">Feel free to follow the guide there</a>.</p><p>Let&#8217;s first add an identity to SES to send our email reports.</p><pre><code>resource "aws_ses_email_identity" "sender" {
  email = local.sender_email
}</code></pre><p>SES sends a verification email to this address, and we need to confirm it before the identity can send emails. If your account is still in the SES sandbox, the recipient addresses must be verified as well.</p><p>Then we will create the Lambda function and the Docker image. We use <a href="https://www.linkedin.com/in/antonbabenko/">Anton&#8217;s</a> <a href="https://registry.terraform.io/namespaces/terraform-aws-modules">Terraform modules</a> to make our lives easier.</p><pre><code># ECR Docker image for Lambda
module "docker_image" {
  source = "terraform-aws-modules/lambda/aws//modules/docker-build"

  ecr_repo        = module.ecr.repository_name
  source_path     = "${path.module}/lambdas"

  use_image_tag = true
}

module "ecr" {
  source = "terraform-aws-modules/ecr/aws"

  repository_name         = local.ecr_repository_name
  repository_force_delete = true

  create_lifecycle_policy = false

  repository_lambda_read_access_arns = [module.lambda_function.lambda_function_arn]
}

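# --- Illustrative sketch, not part of the original repo ---
# The modules in this file read their settings from locals.tf. Assuming the
# local names referenced in this post, a minimal locals block might look like
# the one below. Every value is a placeholder, and the naming scheme for the
# ECR repository and the Lambda is an assumption. If you use the repo, edit
# its locals.tf instead of pasting this next to it, because each local may be
# defined only once per module.
locals {
  aws_region          = "eu-west-1"             # placeholder region
  name                = "cur-report"            # placeholder project name
  cur_bucket_name     = "my-cur-bucket-example" # the CUR bucket created earlier
  cur_prefix          = "cur"                   # placeholder CUR data prefix
  table_name          = "cur_data"              # table name used in the sample queries
  sender_email        = "sender@example.com"    # must be verified in SES
  recipient_emails    = "team@example.com"      # one string, since Lambda environment variables are strings
  tag_key_to_analyze  = "user_creator"          # tag key the report checks for
  ecr_repository_name = "${local.name}-lambda"  # assumed naming scheme
  lambda_name         = "${local.name}-untagged-reporter"
  lambda_description  = "Emails a monthly report of untagged resources"

  tags = {
    Project = local.name
  }
}
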

# Lambda function using terraform-aws-modules/lambda/aws module
module "lambda_function" {
  source  = "terraform-aws-modules/lambda/aws"
  version = "~&gt; 7.20"

  function_name = local.lambda_name
  description   = local.lambda_description

  # Docker image config
  create_package = false
  image_uri      = module.docker_image.image_uri
  package_type   = "Image"

  # Lambda settings
  timeout     = 300
  memory_size = 512

  # Environment variables
  environment_variables = {
    DATABASE_NAME      = aws_athena_database.athena_database.name
    TABLE_NAME         = local.table_name
    WORKGROUP          = aws_athena_workgroup.athena_workgroup.name
    OUTPUT_BUCKET      = module.athena_results_bucket.s3_bucket_id
    SENDER_EMAIL       = local.sender_email
    RECIPIENT_EMAILS   = local.recipient_emails
    TAG_KEY_TO_ANALYZE = local.tag_key_to_analyze
  }

  # IAM policy statements
  attach_policy_statements = true
  policy_statements = {
    cloudwatch_logs = {
      effect = "Allow",
      actions = [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      resources = ["arn:aws:logs:*:*:*"]
    },
    athena = {
      effect = "Allow",
      actions = [
        "athena:StartQueryExecution",
        "athena:GetQueryExecution",
        "athena:GetQueryResults"
      ],
      resources = ["*"]
    },
    glue = {
      effect = "Allow",
      actions = [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:BatchGetPartition"
      ],
      resources = ["*"]
    },
    s3 = {
      effect = "Allow",
      actions = [
        "s3:GetObject",
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:PutObject"
      ],
      resources = [
        "arn:aws:s3:::${module.athena_results_bucket.s3_bucket_id}",
        "arn:aws:s3:::${module.athena_results_bucket.s3_bucket_id}/*"
      ]
    },
     s3_cur = {
      effect = "Allow",
      actions = [
        "s3:GetObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      resources = [
        "arn:aws:s3:::${local.cur_bucket_name}",
        "arn:aws:s3:::${local.cur_bucket_name}/*"
      ]
    },
    ses = {
      effect = "Allow",
      actions = [
        "ses:SendEmail",
        "ses:SendRawEmail"
      ],
      resources = ["*"]
    }
  }

  tags = local.tags
}</code></pre><p>In the code above, we first build the Docker image and push it to ECR. Then we create the Lambda function along with all the permissions it needs to do its job.</p><p>Last but not least, we add the scheduling part of our architecture using a CloudWatch Events rule (the service is now called Amazon EventBridge).</p><pre><code># CloudWatch Event Rule to trigger Lambda on a schedule (monthly)
resource "aws_cloudwatch_event_rule" "monthly_trigger" {
  name                = "lambda-monthly-trigger"
  description         = "Triggers the untagged resources reporter Lambda function on the 3rd day of each month"
  schedule_expression = "cron(0 8 3 * ? *)" # 8:00 AM UTC on the 3rd day of each month

  tags = local.tags
}

# CloudWatch Event Target
resource "aws_cloudwatch_event_target" "lambda_target" {
  rule      = aws_cloudwatch_event_rule.monthly_trigger.name
  target_id = "TriggerLambda"
  arn       = module.lambda_function.lambda_function_arn
}

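# --- Illustrative addition, not part of the original repo ---
# Exposing the function name as an output makes a manual test run easier.
# Assuming the AWS CLI is configured for the same account and region, the
# function could then be invoked without waiting for the schedule:
#   aws lambda invoke --function-name "$(terraform output -raw lambda_function_name)" response.json
output "lambda_function_name" {
  description = "Name of the untagged resources reporter Lambda"
  value       = module.lambda_function.lambda_function_name
}
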
# Lambda permission to allow CloudWatch Events to invoke the function
resource "aws_lambda_permission" "allow_cloudwatch" {
  statement_id  = "AllowExecutionFromCloudWatch"
  action        = "lambda:InvokeFunction"
  function_name = module.lambda_function.lambda_function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.monthly_trigger.arn
}</code></pre><p>In the code above, we create a CloudWatch Events rule that fires at 8:00 AM UTC on the 3rd day of each month. We pick the 3rd day to make sure all of the previous month&#8217;s CUR data points are available, since they usually take about 24 hours to arrive. Then, we create the target and grant CloudWatch Events permission to invoke the Lambda function.</p><p>And that&#8217;s it. You are now ready to deploy the infrastructure.</p><pre><code>terraform plan</code></pre><p>and once we validate the plan, we can run</p><pre><code>terraform apply</code></pre><p>If we are impatient and want to see the work we&#8217;ve done, we can trigger the Lambda function with a test and see the results.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!htFj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd45593-c620-4c87-9048-7e8e36c2dd05_1391x1109.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!htFj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd45593-c620-4c87-9048-7e8e36c2dd05_1391x1109.png 424w, https://substackcdn.com/image/fetch/$s_!htFj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd45593-c620-4c87-9048-7e8e36c2dd05_1391x1109.png 848w, https://substackcdn.com/image/fetch/$s_!htFj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd45593-c620-4c87-9048-7e8e36c2dd05_1391x1109.png 1272w, 
https://substackcdn.com/image/fetch/$s_!htFj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd45593-c620-4c87-9048-7e8e36c2dd05_1391x1109.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!htFj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd45593-c620-4c87-9048-7e8e36c2dd05_1391x1109.png" width="1391" height="1109" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3dd45593-c620-4c87-9048-7e8e36c2dd05_1391x1109.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1109,&quot;width&quot;:1391,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:104873,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/163612253?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd45593-c620-4c87-9048-7e8e36c2dd05_1391x1109.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!htFj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd45593-c620-4c87-9048-7e8e36c2dd05_1391x1109.png 424w, https://substackcdn.com/image/fetch/$s_!htFj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd45593-c620-4c87-9048-7e8e36c2dd05_1391x1109.png 848w, 
https://substackcdn.com/image/fetch/$s_!htFj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd45593-c620-4c87-9048-7e8e36c2dd05_1391x1109.png 1272w, https://substackcdn.com/image/fetch/$s_!htFj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3dd45593-c620-4c87-9048-7e8e36c2dd05_1391x1109.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>As you can see, I vibe-coded the sh*t out of it for the report visuals &#128517;&#128517;&#128517; Who likes to write HTML in Python 
for emails? Let me know in the comments below!</p><p>You can find the Lambda code inside the <a href="https://github.com/siakon89/AWS-examples/tree/master/data-services/Amazon%20Athena/infrastructure_cur/lambdas">lambdas folder</a>.</p><p>You can modify the Lambda function to do whatever you like. You now have all of the CUR data at your fingertips &#129316;</p><h2>Useful queries</h2><p>In this section, I will showcase some Athena queries on the CUR 2.0 data that I found helpful. The complete collection of the CUR queries can be found <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/Amazon%20Athena/infrastructure_cur/sample_queries.sql">here</a>.</p><p>You will notice the following lines in some of the queries. They exclude credits and refunds, so the results show the charges before any credits are applied &#128517; If you want credits and refunds to be included in your queries, remove those lines.</p><pre><code>line_item_line_item_type != 'Credit' AND
line_item_line_item_type != 'Refund'</code></pre><h3>S3 Costs by bucket and usage type</h3><pre><code>SELECT
    line_item_resource_id AS bucket_name,
    line_item_usage_type AS usage_type,
    SUM(line_item_unblended_cost) AS cost
FROM cur_data
WHERE
    line_item_product_code = 'AmazonS3' AND
    line_item_resource_id &lt;&gt; '' AND
    bill_billing_period_start_date = DATE '2025-05-01' AND
    line_item_line_item_type != 'Credit' AND
    line_item_line_item_type != 'Refund'
GROUP BY 1, 2
ORDER BY 3 DESC
LIMIT 20;</code></pre><h3>Savings from Reserved Instances</h3><pre><code>SELECT
    bill_billing_period_start_date AS billing_period,
    reservation_reservation_a_r_n AS reservation_arn,
    SUM(reservation_effective_cost) AS effective_cost,
    SUM(reservation_unused_amortized_upfront_fee_for_billing_period) AS unused_upfront_fee,
    SUM(reservation_unused_recurring_fee) AS unused_recurring_fee
FROM cur_data
WHERE
    reservation_reservation_a_r_n &lt;&gt; '' AND
    bill_billing_period_start_date = DATE '2025-05-01'
GROUP BY 1, 2
ORDER BY 3 DESC;</code></pre><h3>Identify untagged resources</h3><pre><code>SELECT
    line_item_product_code AS service,
    line_item_resource_id AS resource_id,
    product_region_code AS region,
    SUM(line_item_unblended_cost) AS cost
FROM cur_data
WHERE
    CARDINALITY(MAP_KEYS(resource_tags)) = 0 AND
    line_item_resource_id &lt;&gt; '' AND
    line_item_line_item_type = 'Usage' AND
    bill_billing_period_start_date = DATE '2025-05-01'
GROUP BY 1, 2, 3
HAVING SUM(line_item_unblended_cost) &gt; 0
ORDER BY 4 DESC
LIMIT 50;</code></pre><h3>Cost trend current vs previous month</h3><pre><code>SELECT
    '2025-05' AS current_month,
    '2025-04' AS previous_month,
    SUM(CASE WHEN bill_billing_period_start_date = DATE '2025-05-01' AND line_item_line_item_type NOT IN ('Credit', 'Refund') THEN line_item_unblended_cost ELSE 0 END) AS may_cost,
    SUM(CASE WHEN bill_billing_period_start_date = DATE '2025-04-01' AND line_item_line_item_type NOT IN ('Credit', 'Refund') THEN line_item_unblended_cost ELSE 0 END) AS april_cost,
    SUM(CASE WHEN bill_billing_period_start_date = DATE '2025-05-01' AND line_item_line_item_type NOT IN ('Credit', 'Refund') THEN line_item_unblended_cost ELSE 0 END) - 
    SUM(CASE WHEN bill_billing_period_start_date = DATE '2025-04-01' AND line_item_line_item_type NOT IN ('Credit', 'Refund') THEN line_item_unblended_cost ELSE 0 END) AS cost_difference
FROM cur_data
WHERE
    bill_billing_period_start_date IN (DATE '2025-05-01', DATE '2025-04-01');</code></pre><h3>Identify resources with the highest cost increase </h3><p>Weekly analysis for the last 2 weeks</p><pre><code>WITH current_week_costs AS (
    SELECT
        line_item_resource_id AS resource_id,
        SUM(line_item_unblended_cost) AS cost
    FROM cur_data
    WHERE
        line_item_usage_start_date &gt;= DATE '2025-05-12' AND line_item_usage_start_date &lt; DATE '2025-05-19' AND
        line_item_resource_id &lt;&gt; ''
    GROUP BY 1
),
prev_week_costs AS (
    SELECT
        line_item_resource_id AS resource_id,
        SUM(line_item_unblended_cost) AS cost
    FROM cur_data
    WHERE
        line_item_usage_start_date &gt;= DATE '2025-05-05' AND line_item_usage_start_date &lt; DATE '2025-05-12' AND
        line_item_resource_id &lt;&gt; ''
    GROUP BY 1
),
resource_details AS (
    SELECT DISTINCT
        line_item_resource_id AS resource_id,
        line_item_product_code AS service,
        product_region_code AS region
    FROM cur_data
    WHERE
        line_item_resource_id &lt;&gt; '' AND
        line_item_usage_start_date &gt;= DATE '2025-05-05' AND line_item_usage_start_date &lt; DATE '2025-05-19'
)
SELECT
    r.resource_id,
    r.service,
    r.region,
    -- Week costs
    COALESCE(p.cost, 0) AS prev_week_cost,
    COALESCE(c.cost, 0) AS current_week_cost,
    COALESCE(c.cost, 0) - COALESCE(p.cost, 0) AS week_over_week_change,
    
    -- Percentage change
    CASE 
        WHEN COALESCE(p.cost, 0) &gt; 0 
        THEN ROUND(((COALESCE(c.cost, 0) - COALESCE(p.cost, 0)) / COALESCE(p.cost, 0)) * 100, 2)
        WHEN COALESCE(p.cost, 0) = 0 AND COALESCE(c.cost, 0) &gt; 0
        THEN NULL
        ELSE 0
    END AS percentage_change,
    
    -- Growth classification
    CASE
        WHEN COALESCE(p.cost, 0) = 0 AND COALESCE(c.cost, 0) &gt; 0 
        THEN 'New Resource'
        WHEN COALESCE(c.cost, 0) - COALESCE(p.cost, 0) &gt; 0 
        THEN 'Increasing'
        WHEN COALESCE(c.cost, 0) - COALESCE(p.cost, 0) &lt; 0 
        THEN 'Decreasing'
        ELSE 'Stable'
    END AS cost_trend
FROM resource_details r
LEFT JOIN current_week_costs c ON r.resource_id = c.resource_id
LEFT JOIN prev_week_costs p ON r.resource_id = p.resource_id
WHERE 
    -- Show resources with costs in either week
    COALESCE(c.cost, 0) &gt; 0 OR COALESCE(p.cost, 0) &gt; 0
ORDER BY week_over_week_change DESC
LIMIT 20;</code></pre><h3>Athena usage and cost by workgroup and operation</h3><pre><code>SELECT
    CAST(line_item_usage_start_date AS DATE) AS usage_date,
    line_item_operation AS operation,
    line_item_resource_id AS workgroup,
    SUM(line_item_usage_amount) AS data_scanned_bytes,
    SUM(line_item_unblended_cost) AS cost
FROM cur_data
WHERE
    line_item_product_code = 'AmazonAthena' AND
    bill_billing_period_start_date = DATE '2025-05-01'
GROUP BY 1, 2, 3
ORDER BY 1 DESC, 5 DESC;</code></pre><h3>List all tags with their associated cost</h3><pre><code>WITH flattened_tags AS (
  SELECT
    line_item_resource_id,
    line_item_unblended_cost,
    k AS tag_key,
    resource_tags[k] AS tag_value
  FROM cur_data
  CROSS JOIN UNNEST(MAP_KEYS(resource_tags)) AS t(k)
  WHERE
    bill_billing_period_start_date = DATE '2025-05-01' AND
    line_item_resource_id &lt;&gt; '' AND
    resource_tags[k] &lt;&gt; ''
)
SELECT
  tag_key,
  tag_value,
  COUNT(DISTINCT line_item_resource_id) AS resource_count,
  SUM(line_item_unblended_cost) AS total_cost
FROM flattened_tags
GROUP BY 1, 2
ORDER BY 1, 4 DESC;</code></pre><h3>Tagged vs Untagged resources and their cost</h3><p>This is one of the queries used in the Lambda. Edit the key in <code>tag_key_to_search</code> to see how many resources carry that particular tag and how many are missing it.</p><pre><code>WITH tag_key_to_search AS (
    SELECT 'user_creator' AS key -- Change this to your desired tag key
),
resource_tagging AS (
    SELECT
        line_item_product_code AS service,
        line_item_resource_id,
        line_item_unblended_cost,
        CASE 
            WHEN CARDINALITY(MAP_KEYS(resource_tags)) &gt; 0 THEN 'Tagged'
            ELSE 'Untagged'
        END AS general_tagging_status,
        CASE
            WHEN resource_tags[(SELECT key FROM tag_key_to_search)] IS NOT NULL 
                 AND resource_tags[(SELECT key FROM tag_key_to_search)] &lt;&gt; '' THEN 'Has Tag Key'
            ELSE 'Missing Tag Key'
        END AS specific_tag_status
    FROM cur_data
    WHERE
        line_item_resource_id &lt;&gt; '' AND
        bill_billing_period_start_date = DATE '2025-05-01' AND
        line_item_line_item_type != 'Credit' AND
        line_item_line_item_type != 'Refund' AND
        line_item_line_item_type = 'Usage'
)
SELECT
    service,
    general_tagging_status,
    specific_tag_status,
    COUNT(DISTINCT line_item_resource_id) AS resource_count,
    SUM(line_item_unblended_cost) AS total_cost
FROM resource_tagging
GROUP BY 1, 2, 3
HAVING COUNT(DISTINCT line_item_resource_id) &gt; 0
ORDER BY 1, 2, 3;</code></pre><p>The other query used for the Lambda, listing the untagged services, is query 15 in the <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/Amazon%20Athena/infrastructure_cur/sample_queries.sql">sample_queries.sql</a> file.</p><p>To generate some of the queries, I&#8217;ve used an LLM. Given the right context (and some editing of the proposed query), you can create queries on top of CUR 2.0 with very little effort. It is much nicer than searching through 300+ (legacy CUR) columns and their definitions before you can start using the data.</p><p><strong>Disclaimer:</strong> While vibe-coding is trendy and can bootstrap your code, please do not forget to take a closer look at your query and verify that <a href="https://docs.aws.amazon.com/cur/latest/userguide/table-dictionary-cur2.html">the fields being queried</a>&nbsp;are correct.</p><p>In my queries, I&#8217;ve used the unblended cost, which is the usage amount multiplied by the rate charged for each line item.</p><p>CUR 2.0 dictionary: https://docs.aws.amazon.com/cur/latest/userguide/table-dictionary-cur2.html</p><p>Here is a very <a href="https://aws.amazon.com/blogs/aws-cloud-financial-management/understanding-your-aws-cost-datasets-a-cheat-sheet/">nice cheat sheet</a> to help you better understand the AWS Cost Dataset.</p><h2>Conclusion</h2><p>Well, that&#8217;s all, folks. I hope you found this post helpful and that it leaves you with a bit more knowledge and confidence when navigating your AWS cost reports.</p><p>It would be a shame not to mention the existing automation and dashboards, a considerable effort from the AWS team to streamline and simplify FinOps (to an extent, of course).</p><p><a href="https://catalog.workshops.aws/awscid/en-US/dashboards">Here</a> you can find the Cloud Intelligence dashboards.</p><p><a href="https://catalog.workshops.aws/well-architected-cost-optimization/en-US">Here</a> you can find a cost optimization workshop.</p><p>Feel free to reach out if you encounter any problems or have suggestions.</p><p>Until next time, stay safe and have fun! 
&#10084;&#65039;</p>]]></content:encoded></item><item><title><![CDATA[Showcasing AWS Athena]]></title><description><![CDATA[Process your analytical data at very low cost!]]></description><link>https://www.thelastdev.com/p/showcasing-aws-athena</link><guid isPermaLink="false">https://www.thelastdev.com/p/showcasing-aws-athena</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Tue, 13 May 2025 15:19:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!EtA7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In today&#8217;s post, I am going to showcase <a href="https://aws.amazon.com/athena/">Amazon Athena</a>. Amazon Athena is one of my favourite services within AWS. It offers great capabilities at a very low cost! </p><p>Whether you're a data engineer looking to implement a new analytics solution or a developer seeking to optimize existing Athena workloads, in this post, I will provide insights and examples to help you leverage Athena's full potential.</p><p>I hope that after you read this, you have a better understanding of how Athena works. 
So without further ado, let&#8217;s dive in!</p><p>If you enjoy this content, take a moment to subscribe; it means a lot to me.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thelastdev.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.thelastdev.com/subscribe?"><span>Subscribe now</span></a></p><div class="poll-embed" data-attrs="{&quot;id&quot;:316906}" data-component-name="PollToDOM"></div><p><em>We will not discuss Iceberg in this blog post because I want to cover this topic in detail in a future post about S3 Tables.</em></p><h2>What is Amazon Athena</h2><blockquote><p>Amazon Athena is an interactive query service that simplifies data analysis in <a href="https://aws.amazon.com/s3/">Amazon S3</a> using standard SQL. Athena is serverless, so there is no infrastructure to set up or manage, and you only pay for the resources your query needs to run.</p></blockquote><p>The above definition is from AWS and, believe me, no additional words are needed to describe Athena. 
You store your data, catalog it, and query it, and that is it. It is completely serverless (there is a provisioned flavor, but we are not going to dive into it today), without the hassle of spinning up resources, maintaining a cluster, etc.</p><p>Athena shines in the following use cases:</p><ul><li><p>Log analysis and monitoring</p></li><li><p>Cost and Usage Analysis</p></li><li><p>Data Lake analytics</p><ul><li><p>Ad-hoc analysis on a massive amount of data</p></li><li><p>Data processing and cleaning</p></li></ul></li><li><p>Security and compliance</p><ul><li><p>Security logs</p></li><li><p>Audit trails</p></li></ul></li><li><p>ML Data Preparation</p></li></ul><p><a href="https://aws.amazon.com/athena/pricing/">Athena&#8217;s pricing</a> is very straightforward: $5 per TB scanned, meaning you pay not for compute but for the amount of data your query scans to produce its result. You can always use provisioned capacity, but it usually only makes sense if you have a large data lake and constantly scan TBs of data. 
The minimum commitment is 24 DPUs, and running them 24/7 costs around $5,000 per month. At $5 per TB, that is the price of scanning about 1 PB (5,000 / 5 = 1,000 TB), so provisioned capacity starts to make sense only if you scan more than roughly 1 PB per month.</p><h2>What you&#8217;ll learn</h2><p>This post will show a simple use case of loading, cataloging, and querying data.</p><p>We are going to take a look at the following:</p><ul><li><p>Setting up the basics of Amazon Athena</p></li><li><p>Configuring Athena properly</p></li><li><p>Creating a Glue crawler for cataloging our data</p></li><li><p>Deploying our IaC in Terraform</p></li></ul><p>This is a Level 200 post, meaning you will need to know some basic concepts of AWS and Terraform.</p><p>You will need the following:</p><ul><li><p>An AWS account and credentials for it (either a service account or configured SSO)</p></li><li><p><a href="https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli">Terraform installed</a></p></li></ul><p>The whole project will cost approximately: <strong>$2-3</strong></p><h2>Terraform into Play</h2><p>For the infrastructure setup, we will create the following resources:</p><ul><li><p>2 S3 buckets, one for Athena to save the results and one for sample data</p></li><li><p>An Athena workgroup and a database where we will run our queries</p></li><li><p>A Glue crawler to catalog our sample data</p></li><li><p>IAM policies to provide access</p></li></ul><p>You can find the whole code for this post <a href="https://github.com/siakon89/AWS-examples/tree/master/data-services/Amazon%20Athena/infrastructure">here</a>, where you can edit the <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/Amazon%20Athena/infrastructure/locals.tf">locals.tf</a> file to configure it your way and follow along. 
So let&#8217;s begin, I will showcase some Terraform and explain its usage to make it easier to follow.</p><p>First, we need to take a look at the <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/Amazon%20Athena/infrastructure/locals.tf">locals.tf</a> file and adjust <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/Amazon%20Athena/infrastructure/locals.tf#L6">line 6</a> and <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/Amazon%20Athena/infrastructure/locals.tf#L17">line 17</a>:</p><ul><li><p>name: add a name for your project, which will be propagated in the resource names</p></li><li><p>sample_data_prefix: is the &#8220;folder&#8221; where we will put the data in S3</p></li></ul><p>Setting up the buckets (<a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/Amazon%20Athena/infrastructure/main.tf">main.tf</a>).</p><pre><code>module "athena_results_bucket" {
  source  = "terraform-aws-modules/s3-bucket/aws"
  version = "~&gt; 4.8"

  bucket        = "${local.name}-query-results"
  force_destroy = true
  acl           = "private"

  # Add ownership controls
  control_object_ownership = true
  object_ownership         = "ObjectWriter"

  tags = local.tags
}

# Create a bucket for sample data
module "athena_data_bucket" {
  source  = "terraform-aws-modules/s3-bucket/aws"
  version = "~&gt; 4.8"

  bucket        = "${local.name}-data"
  force_destroy = true
  acl           = "private"

  # Add ownership controls
  control_object_ownership = true
  object_ownership         = "ObjectWriter"

  tags = local.tags
}
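
# Optional sketch, not part of the original repo: to expire old query results,
# the s3-bucket module accepts a lifecycle_rule input. The rule id and the
# 30-day window below are illustrative assumptions; check the module docs for
# the version you pin. It would go inside module "athena_results_bucket":
#
#   lifecycle_rule = [
#     {
#       id         = "expire-query-results"
#       enabled    = true
#       expiration = { days = 30 }
#     }
#   ]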
</code></pre><p>In the code above, we are creating two buckets:</p><ul><li><p><strong>query-results</strong>: Athena needs a bucket to save the generated results in a CSV format and serve them to the user. In this bucket, <strong>usually we have a <a href="https://registry.terraform.io/modules/terraform-aws-modules/s3-bucket/aws/latest#input_lifecycle_rule">lifecycle policy</a> where we discard the contents after an X number of days</strong>. </p></li><li><p><strong>data</strong>: Athena also needs a bucket where the data is saved and ready for querying. Additionally, Athena can use this bucket to generate data with <a href="https://docs.aws.amazon.com/athena/latest/ug/ctas.html">CTAS</a> statements.</p></li></ul><p>Next, we will create an Athena workgroup and a database to save the data (<a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/Amazon%20Athena/infrastructure/athena.tf">athena.tf</a>).</p><pre><code>resource "aws_athena_workgroup" "athena_workgroup" {
  name        = local.name
  description = "Athena workgroup for ${local.name}"

  configuration {
    enforce_workgroup_configuration    = local.athena_workgroup.enforce_workgroup_configuration
    publish_cloudwatch_metrics_enabled = local.athena_workgroup.publish_cloudwatch_metrics_enabled
    bytes_scanned_cutoff_per_query     = local.athena_workgroup.bytes_scanned_cutoff_per_query

    engine_version {
      selected_engine_version = local.athena_workgroup.engine_version
    }

    result_configuration {
      output_location = "s3://${module.athena_results_bucket.s3_bucket_id}/"

      encryption_configuration {
        encryption_option = local.athena_workgroup.encryption_option
      }
    }
  }

  tags = local.tags
}
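
# For orientation only: a hypothetical shape for local.athena_workgroup.
# The real values live in locals.tf in the repo and may differ.
#
# locals {
#   athena_workgroup = {
#     enforce_workgroup_configuration    = true
#     publish_cloudwatch_metrics_enabled = true
#     bytes_scanned_cutoff_per_query     = 10737418240 # 10 GiB per query
#     engine_version                     = "Athena engine version 3"
#     encryption_option                  = "SSE_S3"
#   }
# }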

resource "aws_athena_database" "athena_database" {
  name          = replace(local.name, "-", "_")
  bucket        = module.athena_data_bucket.s3_bucket_id
  force_destroy = true

  properties = {
    location = "s3://${module.athena_data_bucket.s3_bucket_id}/tables/"
  }
} </code></pre><p>You can use <a href="https://docs.aws.amazon.com/athena/latest/ug/workgroups-manage-queries-control-costs.html">Athena workgroups</a> to separate workloads, control team access, enforce configuration, track query metrics, and control costs.</p><p>As for the database, we pointed it at our data bucket, but this is primarily for the database metadata. We also defined a location where the data (CTAS) will be saved. We are using force_destroy so that the database can be deleted even if it still contains tables; avoid this in production workloads.</p><p>It is time to create the Glue Crawler to catalog our new data (<a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/Amazon%20Athena/infrastructure/crawler.tf">crawler.tf</a>).</p><pre><code># Create Glue crawler for the sample data
resource "aws_glue_crawler" "demo_crawler" {
  name          = "${local.name}-demo-crawler"
  database_name = aws_athena_database.athena_database.name
  role          = aws_iam_role.glue_crawler.arn
  table_prefix  = "demo_"  # This will prefix all tables created by this crawler

  s3_target {
    path = "s3://${module.athena_data_bucket.s3_bucket_id}/${local.sample_data_prefix}"
  }

  schema_change_policy {
    delete_behavior = "LOG"
    update_behavior = "UPDATE_IN_DATABASE"
  }

  configuration = jsonencode({
    Version = 1.0
    CrawlerOutput = {
      Partitions = { AddOrUpdateBehavior = "InheritFromTable" }
      Tables     = { AddOrUpdateBehavior = "MergeNewColumns" }
    }
  })

  tags = local.tags
} </code></pre><p>The above Terraform code is pretty standard for a <a href="https://docs.aws.amazon.com/glue/latest/dg/add-crawler.html">Glue Crawler</a>. The most important parts here are:</p><ul><li><p>s3_target: points to the data we are going to upload (we have not done that yet)</p></li><li><p>schema_change_policy: defines the <a href="https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-glue-crawler-schemachangepolicy.html#aws-properties-glue-crawler-schemachangepolicy-properties">behaviour of the catalog</a> when the crawler detects changes. Here, update_behavior = UPDATE_IN_DATABASE updates the table in the catalog when the schema changes, while delete_behavior = LOG only logs deleted objects and leaves the existing catalog entries in place</p></li></ul><p>Last but not least, we have the IAM policies and roles (<a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/Amazon%20Athena/infrastructure/iam.tf">iam.tf</a>). There is not much to say here.</p><p>Once we are ready, we can initialize the working directory and review the plan with </p><pre><code>terraform init
terraform plan</code></pre><p>and once we validate the plan, we can run</p><pre><code>terraform apply</code></pre><p>And voil&#224;! You have a working environment where you can start loading data and experimenting!</p><h2>Using Athena</h2><p>Now that everything is ready, we can populate our S3 bucket with some data. We will use the Parquet format and upload the iconic Titanic Dataset, which you can find in the <a href="https://github.com/siakon89/AWS-examples/blob/master/data-services/Amazon%20Athena/infrastructure/titanic.parquet">repo</a>.</p><p>So, navigate to your S3 bucket and upload the dataset under the prefix (data in my case) that you set as sample_data_prefix for the Glue Crawler. I will use a CLI command, but you are more than welcome to use the console.</p><pre><code>aws s3 cp titanic.parquet s3://&lt;bucket_name&gt;/data/titanic.parquet</code></pre><p>Then we go to our newly generated crawler and run it to catalog our file. 
Again, I will use a CLI command.</p><pre><code>aws glue start-crawler --name &lt;crawler-name&gt;</code></pre><p>This will create a table within our database in Amazon Athena.</p><p>We can now run a simple query on our Athena Interface. Use your database name in your case.</p><pre><code>SELECT * FROM "thelastdev_athena_demo"."demo_data" limit 10;</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EtA7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EtA7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png 424w, https://substackcdn.com/image/fetch/$s_!EtA7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png 848w, https://substackcdn.com/image/fetch/$s_!EtA7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png 1272w, https://substackcdn.com/image/fetch/$s_!EtA7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EtA7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png" width="1404" height="967" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:967,&quot;width&quot;:1404,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:97975,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/162598264?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EtA7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png 424w, https://substackcdn.com/image/fetch/$s_!EtA7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png 848w, https://substackcdn.com/image/fetch/$s_!EtA7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png 1272w, https://substackcdn.com/image/fetch/$s_!EtA7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d7350ef-68a4-4116-9c44-173d0a32dcea_1404x967.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>And, believe it or not, this is pretty much it&#8230;. This is how Athena works. 
You can expand your knowledge with the <a href="https://docs.aws.amazon.com/athena/latest/ug/what-is.html">AWS Documentation</a> regarding Athena or by asking Q Developer for more complex implementations.</p><p>To close this post, I will share my notes on optimizing Athena.</p><h2>Athena Optimizations</h2><p>When it comes to optimizations, we are going to talk about:</p><ul><li><p>Cost Optimizations</p></li><li><p>Performance Optimizations</p></li></ul><p>So, let&#8217;s begin with the low-hanging fruit:</p><p><strong>Partitioning strategies</strong></p><p>By properly <a href="https://docs.aws.amazon.com/athena/latest/ug/ctas-partitioning-and-bucketing-what-is-partitioning.html">partitioning your data</a>, you will benefit significantly in both cost and performance. If you filter on a partition column in your query (a WHERE clause on that field), Athena will only scan the data in the matching partitions.</p><p>For example (let&#8217;s exaggerate), say you have a bucket with log data going back 3 years, totaling 970 TB, and the bucket is partitioned by week. If you search for a specific week or a collection of weeks, Athena will only scan the data in those partitions and not the whole 970 TB to find your results.</p><p>To utilize partitioning, you can split your data into prefixes (&#8220;folders&#8221;) within your bucket:</p><pre><code># with one partition
data/week=1/test.parquet
data/week=2/test2.parquet

# or

# with multiple partitions
data/year=2025/week=1/day=1/test.parquet</code></pre><p><strong>File format selection (Parquet, ORC)</strong></p><p>Using a proper format for your data is crucial. I personally prefer Parquet due to its columnar layout, which pays off when a query selects only the columns it needs (rather than SELECT *). The performance difference between CSV and Parquet is significant! Additionally, with CSV, Athena has to read the entire file, while with Parquet it reads only the columns the query references, which also lowers the scan cost.</p><p><strong>Compression techniques</strong></p><p>Because Athena charges by the amount of data scanned, compressing your data will result in lower costs, but not necessarily better performance. My favourite is Snappy: fast and lightweight.</p><p><strong>Result caching</strong></p><p>You can <a href="https://docs.aws.amazon.com/athena/latest/ug/reusing-query-results.html">reuse query results in Athena</a>. When you enable result reuse for a query, Athena looks for a previous query execution within the same workgroup. If Athena finds corresponding stored query results, it does not rerun the query, but points to the previous result location or fetches data from it.</p><p>Now, let&#8217;s move to more complex optimizations and focus on the 20% of the 80/20 rule. 
This means getting closer to optimal performance and cost, but with more effort than the low-hanging fruit we covered above.</p><p><strong>Query planning - Explain</strong></p><p><a href="https://docs.aws.amazon.com/athena/latest/ug/athena-explain-statement.html">You can analyze your query execution</a> with the EXPLAIN and EXPLAIN ANALYZE statements and discover how Athena scans your data for further optimization.</p><p><strong>Workgroup configurations</strong></p><p>Cost control settings: You can limit the amount of data that can be scanned in a single query (bytes_scanned_cutoff_per_query in the workgroup above).</p><p>Performance settings: Use &#8220;Athena engine version 3&#8221;, which is currently the latest, but make sure you check the benefits when a new engine version comes out.</p><p>Publish metrics in CloudWatch (publish_cloudwatch_metrics_enabled) to properly monitor Athena usage.</p><p>You can see Athena in action with real data in my AWS CUR processing post!</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;f4c1009a-642d-46e6-aa0e-22e62229073c&quot;,&quot;caption&quot;:&quot;In today&#8217;s post, I am going to show you how to enable Cost and Usage reports for your AWS account/organization and how to query them to get valuable insights.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Managing Cost and Usage Reports (Data Exports) in AWS&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:1234969,&quot;name&quot;:&quot;Konstantinos Siaterlis&quot;,&quot;bio&quot;:&quot;Cloud Engineer, AWS Hero, AWS User Group Athens co-organizer, and blogger. Passionate about Data and DevOps, with extensive experience in architecting and implementing robust data platforms. 
&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/373847a0-448b-4cc3-8220-698fb8c74a75_300x300.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-05-20T12:43:45.437Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/809d38f7-8a4e-48ad-946e-b6215f00dff1_1188x1461.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thelastdev.com/p/managing-cost-and-usage-reports-data&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:163612253,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:3,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;The Last Dev&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9a09b4e-a465-40db-b7f5-11f2a830d1c2_261x261.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>Conclusion</h2><p>And this is it, this is Amazon Athena. A very powerful and cheap query service for your structured and semi-structured data. 
As a next step, I will go through the S3 Tables to see Athena in action with Iceberg, and move on to the remaining data stack of AWS, like AWS Glue Jobs, Lake Formation, Redshift, and many more.</p><p>Feel free to reach out if you encounter any problems or have suggestions.</p><p>Till the next time, stay safe and have fun!</p><p>If you enjoy this content, take a moment to subscribe; it means a lot to me.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thelastdev.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.thelastdev.com/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item><item><title><![CDATA[The Last Dev ~ Cloud Updates #8]]></title><description><![CDATA[Alongside the format, I am also changing the name of this series.]]></description><link>https://www.thelastdev.com/p/the-last-dev-cloud-updates-8</link><guid isPermaLink="false">https://www.thelastdev.com/p/the-last-dev-cloud-updates-8</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Mon, 05 May 2025 07:51:01 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7BdZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9a09b4e-a465-40db-b7f5-11f2a830d1c2_261x261.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Alongside the format, I am also changing the name of this series. The idea is to express my personal opinion on some updates that I found interesting, and showcase some local events and articles.</p><p></p><h2><strong>News</strong></h2><p>There have been two exciting announcements about Amazon Q Developer in the past week. It seems that AWS properly focuses on LLMs and is starting to catch up with the competition. 
First of all, Amazon Q Developer CLI now <a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-q-developer-cli-model-context-protocol/">supports MCP</a>, allowing you to have a customized response regarding code generation. Additionally, Amazon Q Developer offers an <a href="https://aws.amazon.com/about-aws/whats-new/2025/05/amazon-q-developer-agentic-coding-experience-ide/">agentic experience in IDEs</a>. The new coding experience provides intelligent task execution, enabling Q Developer to perform actions beyond code suggestions, such as modifying files, generating code diffs, and running commands based on your natural language instructions. Available in VS Code and JetBrains.</p><p>Last but not least, a massive quality of life upgrade in my opinion, is that <a href="https://aws.amazon.com/about-aws/whats-new/2025/04/ec2-image-builder-integrates-ssm-parameter-store/">EC2 Image Builder now integrates with SSM Parameter Store</a>. This streamlines the way you build your custom images and their maintenance. 
This new feature comes at no additional cost.</p><h2><strong>Events</strong></h2><p>20th of May: The Mondelez International Cloud Engineering team will speak at the <a href="https://www.meetup.com/aws-user-group-athens/">AWS User Group in Athens</a>. <a href="https://www.meetup.com/aws-user-group-athens/events/306767439/">FinOps in Practice: Managing AWS Costs at Scale</a></p><p>7-9 May: Panathenea event. You can get tickets from <a href="https://www.panathenea.org/tickets/">here</a>. The agenda can be found <a href="https://www.panathenea.org/agenda/program/">here</a>.</p><p>AWS Community Day Adria: <a href="https://awscommunityadria.com/">Registration has opened</a>. This event will be on the 5th of September, and it will have excellent speakers! Jeff Barr will be doing the keynote. <a href="https://sessionize.com/aws-community-day-adria-2025/">There is also a call for speakers</a>.</p><h2>Interesting Posts</h2><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:162393260,&quot;url&quot;:&quot;https://blog.thecloudengineers.com/p/cloud-migration-strategies-choosing&quot;,&quot;publication_id&quot;:3955850,&quot;publication_name&quot;:&quot;The Cloud Engineers&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cd05ee5-ead2-4457-b11d-1a1ba68f39b2_759x759.png&quot;,&quot;title&quot;:&quot;Cloud Migration Strategies: Choosing the Right Path for Your Organization&quot;,&quot;truncated_body_text&quot;:&quot;Moving to the cloud is no longer a question of \&quot;if\&quot; but \&quot;when\&quot; and \&quot;how.\&quot; As organizations continue to embrace digital transformation, understanding different cloud migration strategies is crucial for a successful transition. 
In this article, we'll explore the \&quot;6 R's\&quot; of cloud migration and help you determine which strategy might work best for your orga&#8230;&quot;,&quot;date&quot;:&quot;2025-04-30T08:01:41.005Z&quot;,&quot;like_count&quot;:1,&quot;comment_count&quot;:2,&quot;bylines&quot;:[{&quot;id&quot;:314296885,&quot;name&quot;:&quot;Lefteris Karageorgiou&quot;,&quot;handle&quot;:&quot;lefteriskarageorgiou&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3c96d85-f3e8-4107-bb93-5c928c58c466_446x446.png&quot;,&quot;bio&quot;:&quot;Solutions Architect @ AWS | Author of Mastering Event Driven Microservices in AWS | Serverless Expert | Software Engineer | Java Expert&quot;,&quot;profile_set_up_at&quot;:&quot;2025-01-30T07:55:44.797Z&quot;,&quot;reader_installed_at&quot;:&quot;2025-02-13T04:22:43.628Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:4033407,&quot;user_id&quot;:314296885,&quot;publication_id&quot;:3955850,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:3955850,&quot;name&quot;:&quot;The Cloud Engineers&quot;,&quot;subdomain&quot;:&quot;thecloudengineers&quot;,&quot;custom_domain&quot;:&quot;blog.thecloudengineers.com&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Level up your Cloud engineering knowledge with this weekly newsletter. 
Perfect for Software Engineers, Architects, DevOps engineers, and other tech enthusiasts looking to advance their careers while mastering the latest Cloud technologies.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3cd05ee5-ead2-4457-b11d-1a1ba68f39b2_759x759.png&quot;,&quot;author_id&quot;:314296885,&quot;primary_user_id&quot;:314296885,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-01-30T07:56:35.848Z&quot;,&quot;email_from_name&quot;:null,&quot;copyright&quot;:&quot;Lefteris Karageorgiou&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;,&quot;source&quot;:null}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://blog.thecloudengineers.com/p/cloud-migration-strategies-choosing?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!SP6x!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3cd05ee5-ead2-4457-b11d-1a1ba68f39b2_759x759.png" loading="lazy"><span class="embedded-post-publication-name">The Cloud Engineers</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Cloud Migration Strategies: Choosing the Right Path for Your Organization</div></div><div class="embedded-post-body">Moving to the cloud is no longer a question of "if" but "when" and "how." 
As organizations continue to embrace digital transformation, understanding different cloud migration strategies is crucial for a successful transition. In this article, we'll explore the "6 R's" of cloud migration and help you determine which strategy might work best for your orga&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">a year ago &#183; 1 like &#183; 2 comments &#183; Lefteris Karageorgiou</div></a></div><p><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Lefteris Karageorgiou&quot;,&quot;id&quot;:314296885,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3c96d85-f3e8-4107-bb93-5c928c58c466_446x446.png&quot;,&quot;uuid&quot;:&quot;fd539a53-0b90-487d-9b7f-6a9d31ce7c75&quot;}" data-component-name="MentionToDOM"></span> is exploring the 6 R&#8217;s and showcasing the different approaches to migrating your applications to the cloud. This is something that anyone who aspires to migrate workloads to the cloud (any cloud) should know.</p><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:162247183,&quot;url&quot;:&quot;https://antonisangelakis.substack.com/p/delta-lake-vs-apache-iceberg-a-mindful&quot;,&quot;publication_id&quot;:4147723,&quot;publication_name&quot;:&quot;DataConscious &#8211; A mindful approach to analytics&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e03e52-8550-48c8-91d2-13d11a06d394_1024x1024.png&quot;,&quot;title&quot;:&quot;Delta Lake vs. Apache Iceberg: A Mindful Choice for Your Data Lakehouse&quot;,&quot;truncated_body_text&quot;:&quot;Thanks for reading DataConscious &#8211; A mindful approach to analytics! 
Subscribe for free to receive new posts and support my work.&quot;,&quot;date&quot;:&quot;2025-04-30T07:30:40.266Z&quot;,&quot;like_count&quot;:1,&quot;comment_count&quot;:0,&quot;bylines&quot;:[{&quot;id&quot;:313458359,&quot;name&quot;:&quot;Antonios Angelakis&quot;,&quot;handle&quot;:&quot;antoniosangelakis&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98207dfd-f157-43ec-a81d-7bcce967aaf0_489x489.jpeg&quot;,&quot;bio&quot;:&quot;Data Professional, Ex Athens #Tableau User Group Leader, #Data #Analytics #Mentoring &amp; Instructor #payments #e-commerce #cybersecurity #insurance #DataAnalyticsInModernCorporateBusiness https://kedivim-apply.ihu.gr/en/progs/prog-350&quot;,&quot;profile_set_up_at&quot;:&quot;2025-02-06T09:19:36.142Z&quot;,&quot;reader_installed_at&quot;:&quot;2025-02-18T15:20:30.447Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:4229954,&quot;user_id&quot;:313458359,&quot;publication_id&quot;:4147723,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:4147723,&quot;name&quot;:&quot;DataConscious &#8211; A mindful approach to analytics&quot;,&quot;subdomain&quot;:&quot;antonisangelakis&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;DataConscious delivers mindful insights, tips and stuff on data analytics and engineering, filling gaps with purpose to drive real value. Subscribe to stay ahead! 
&#128640;&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85e03e52-8550-48c8-91d2-13d11a06d394_1024x1024.png&quot;,&quot;author_id&quot;:313458359,&quot;primary_user_id&quot;:313458359,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-02-18T10:22:20.392Z&quot;,&quot;email_from_name&quot;:&quot;DataConscious &#8211; A mindful approach to analytics&quot;,&quot;copyright&quot;:&quot;Antonios Angelakis&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;magaziney&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;,&quot;source&quot;:null}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://antonisangelakis.substack.com/p/delta-lake-vs-apache-iceberg-a-mindful?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!WeEp!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e03e52-8550-48c8-91d2-13d11a06d394_1024x1024.png" loading="lazy"><span class="embedded-post-publication-name">DataConscious &#8211; A mindful approach to analytics</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Delta Lake vs. Apache Iceberg: A Mindful Choice for Your Data Lakehouse</div></div><div class="embedded-post-body">Thanks for reading DataConscious &#8211; A mindful approach to analytics! 
Subscribe for free to receive new posts and support my work&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">a year ago &#183; 1 like &#183; Antonios Angelakis</div></a></div><p><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Antonios Angelakis&quot;,&quot;id&quot;:313458359,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98207dfd-f157-43ec-a81d-7bcce967aaf0_489x489.jpeg&quot;,&quot;uuid&quot;:&quot;b1cecf1f-0ab1-4aad-bad7-6b1361715ef1&quot;}" data-component-name="MentionToDOM"></span> compares Delta Lake and Apache Iceberg. Understanding how these two formats work is valuable when you build a Data Lake. My favourite is Apache Iceberg, which is very easy to manage with Amazon S3 Tables.</p><p>Last but not least, this is my first official blog post after a looooong time. I hope you like it!</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;e58ea103-34c5-4587-8d36-3e6790251769&quot;,&quot;caption&quot;:&quot;In this post, I will create a Minecraft server, but not from a paid service that provides servers, not locally on my computer, and certainly not by hand in a virtual machine. I will try to create the server using Infrastructure as Code and a Cloud provider.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Learning ECS the fun way ~ Hosting a Minecraft Server&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:1234969,&quot;name&quot;:&quot;Konstantinos Siaterlis&quot;,&quot;bio&quot;:&quot;Cloud Engineer, AWS Hero, AWS User Group Athens co-organizer, and blogger. Passionate about Data and DevOps, with extensive experience in architecting and implementing robust data platforms. 
&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/373847a0-448b-4cc3-8220-698fb8c74a75_300x300.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-04-30T10:55:44.697Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thelastdev.com/p/learning-ecs-the-fun-way-hosting&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:160762302,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:1,&quot;comment_count&quot;:3,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;The Last Dev&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9a09b4e-a465-40db-b7f5-11f2a830d1c2_261x261.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>A sneak peek of my next posts:</p><ul><li><p>Managing data at scale with a fraction of the cost</p></li><li><p>Optimizing your data format</p></li><li><p>Reduce operational burden for managing Apache Iceberg</p></li></ul><p>Till the next time, stay safe and have fun!</p>]]></content:encoded></item><item><title><![CDATA[Learning ECS the fun way ~ Hosting a Minecraft Server]]></title><description><![CDATA[How to host a Minecraft Server on ECS using Terraform]]></description><link>https://www.thelastdev.com/p/learning-ecs-the-fun-way-hosting</link><guid isPermaLink="false">https://www.thelastdev.com/p/learning-ecs-the-fun-way-hosting</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Wed, 30 Apr 
2025 10:55:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!8aeP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this post, I will create a Minecraft server, but not from a paid service that provides servers, not locally on my computer, and certainly not by hand in a virtual machine. I will try to create the server using Infrastructure as Code and a Cloud provider.</p><p>The purpose of this blog post is to showcase some capabilities of AWS ECS and Terraform modules, and by doing so, why not end up with a functional Vanilla Minecraft server way overpriced, where we can also play &#128513;</p><p>Disclaimer: While coming from an FPS background as a kid, I chose Minecraft for two reasons: first, it is one of the greatest games, and second, I play with my daughter. </p><h2>What You&#8217;ll Learn</h2><p>We are going to take a look at the following:</p><ul><li><p>Basics of ECS (Fargate, Task Definition, Services)</p></li><li><p>Configuring our container in ECS</p></li><li><p>Utilizing Terraform modules<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> (not resources) by <a href="https://www.linkedin.com/in/antonbabenko/">Anton Babenko</a> </p></li><li><p>Deploying our IaC in Terraform</p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thelastdev.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Last Dev! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>This is a Level 200 difficulty, meaning you will need to know some basic concepts of AWS, Docker, and VPCs </p><p>You will need the following:</p><ul><li><p>An AWS Account and access to credentials of your AWS (either a service account or configure SSO)</p></li><li><p><a href="https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli">Terraform installed</a></p></li></ul><h2>Setting the Stage</h2><p>This is the architecture we are going to implement</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8aeP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8aeP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png 424w, https://substackcdn.com/image/fetch/$s_!8aeP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png 848w, https://substackcdn.com/image/fetch/$s_!8aeP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png 1272w, 
https://substackcdn.com/image/fetch/$s_!8aeP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8aeP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png" width="611" height="733" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:733,&quot;width&quot;:611,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:42489,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/160762302?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8aeP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png 424w, https://substackcdn.com/image/fetch/$s_!8aeP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png 848w, https://substackcdn.com/image/fetch/$s_!8aeP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png 1272w, 
https://substackcdn.com/image/fetch/$s_!8aeP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff99a40a2-343c-43cd-b0e8-59fac78259a9_611x733.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As shown above, we have created a VPC with both public and private subnets. We have placed our Network Load Balancer in the public subnets, and in the private subnet, we have our Minecraft server and the EFS storage (a type of persistent storage). </p><p>This will <strong>cost approximately 65$ per month</strong>! 
&#128552;&#128552;Obviously, we could reduce the cost to about 25$ per month, but we are not trying to build a cheap Minecraft server; <strong>we are trying to learn ECS and how it works</strong>!</p><p>The full code for this project is in this <a href="https://github.com/siakon89/minecraft-server">GitHub Repo</a>. I have parameterized the infrastructure and added some locals in the file locals.tf.</p><h2>Terraform into play</h2><p>DISCLAIMER: I will explain the different modules I have used; however, for a complete example, please visit the GitHub repository. Some files are not shown in this post, such as:</p><ul><li><p>locals.tf</p></li><li><p>data.tf</p></li><li><p>etc</p></li></ul><p>We start by creating a VPC for our server. We are using this <a href="https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws/latest">module</a>.</p><pre><code>module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~&gt; 5.19.0"

  name = "minecraft-vpc"
  cidr = local.cidr

  azs             = local.azs
  private_subnets = local.private_subnets
  public_subnets  = local.public_subnets

  # Enable NAT Gateway for private subnets
  enable_nat_gateway = true
  single_nat_gateway = true # Use a single NAT Gateway to save costs

  tags = {
    Terraform   = "true"
    Environment = "dev"
  }
}</code></pre><p>With the code above, we create a VPC with two private and two public subnets spanning two availability zones. Additionally, we enable the NAT gateway because we are going to place our container in a private subnet, and it will need internet access to download the Minecraft Docker image.</p><p>Once we have the VPC in place, we need to add the EFS (Elastic File System) for persistent storage. We will use this <a href="https://registry.terraform.io/modules/terraform-aws-modules/efs/aws/latest">module</a>.</p><pre><code>module "efs" {
  source  = "terraform-aws-modules/efs/aws"
  version = "~&gt; 1.8"

  # File system
  name           = "minecraft-volume"
  creation_token = "minecraft-volume"
  encrypted      = true
  kms_key_arn    = module.kms.key_arn

  # File system policy
  attach_policy                      = true
  bypass_policy_lockout_safety_check = false

  # policy statements will be added after the ECS Service
  # &lt;THIS IS WHERE WE WILL PUT LATER THE STATEMENT&gt;
  
  # Mount targets / security group
  mount_targets              = { for k, v in zipmap(module.vpc.azs, module.vpc.private_subnets) : k =&gt; { subnet_id = v } }
  security_group_description = "EFS security group for minecraft server"
  security_group_vpc_id      = module.vpc.vpc_id

  security_group_rules = {
    vpc = {
      # relying on the defaults provided for EFS/NFS (2049/TCP + ingress)
      description = "NFS ingress from VPC private subnets"
      cidr_blocks = module.vpc.private_subnets_cidr_blocks
    }
  }

  access_points = {
    vanilla_minecraft = {
      posix_user = {
        gid = 1000
        uid = 1000
      }
      root_directory = {
        path = "/vanilla"
        creation_info = {
          owner_gid   = 1000
          owner_uid   = 1000
          permissions = "755"
        }
      }
    }
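    # A hedged sketch, not part of the original repository: a second access point
    # would give another server its own directory on the same file system. The key
    # "modded_minecraft" and the path "/modded" are hypothetical names; the block is
    # left commented out so it does not change what this module creates.
    # modded_minecraft = {
    #   posix_user = {
    #     gid = 1000
    #     uid = 1000
    #   }
    #   root_directory = {
    #     path = "/modded"
    #     creation_info = {
    #       owner_gid   = 1000
    #       owner_uid   = 1000
    #       permissions = "755"
    #     }
    #   }
    # }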
  }

  # Backup policy
  enable_backup_policy = false
  # Replication configuration
  create_replication_configuration = false

  tags = {
    Terraform   = "true"
    Environment = "dev"
  }
}


module "kms" {
  source  = "terraform-aws-modules/kms/aws"
  version = "~&gt; 1.0"

  aliases               = ["efs/minecraft-volume"]
  description           = "EFS customer managed key"
  enable_default_policy = true
}
</code></pre><p>In the code above, we are creating our EFS. First of all, we encrypt our storage. We do not really need to for a personal Minecraft server, but again, we are here to learn. We have omitted the policy statements for now, as the ECS task has not been created yet. We create the mount targets in our private subnets so that resources in those subnets can attach the EFS, then we create the security group rule that allows NFS traffic (2049/TCP) from the private subnets. Lastly, we create the access point for the folder where Minecraft will store its data. You do not necessarily need the access point, but with access points you can create multiple servers within the same cluster, all of which use the same EFS filesystem (spoilers).</p><p>Now that we have our EFS, we need to create the Network Load Balancer to forward the traffic to our ECS service. We are using this <a href="https://registry.terraform.io/modules/terraform-aws-modules/alb/aws/latest">module</a>, which also supports network load balancers through load_balancer_type.</p><pre><code>module "nlb" {
  source  = "terraform-aws-modules/alb/aws"
  version = "~&gt; 9.16"

  name = "minecraft-nlb"

  load_balancer_type = "network"

  vpc_id                     = module.vpc.vpc_id
  subnets                    = module.vpc.public_subnets
  enable_deletion_protection = false
  create_security_group      = false
  security_groups            = [aws_security_group.nlb.id]

  listeners = {
    minecraft = {
      port     = local.container_port
      protocol = "TCP"
      forward = {
        target_group_key = "minecraft-vanilla"
      }
    }
  }

  target_groups = {
    minecraft-vanilla = {
      name              = "minecraft-vanilla"
      protocol          = "TCP"
      port              = local.container_port
      target_type       = "ip"
      create_attachment = false
      health_check = {
        enabled             = true
        interval            = 30
        healthy_threshold   = 3
        unhealthy_threshold = 3
        protocol            = "TCP"
        timeout             = 10
      }
    }
  }


  tags = {
    Environment = "Development"
    Project     = "Example"
  }
}
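
# A minimal sketch, not part of the original repository: exposing the NLB DNS name
# as a Terraform output so it is printed after "terraform apply". The output name
# "minecraft_nlb_dns_name" is hypothetical, and "dns_name" is assumed to be an
# output of the terraform-aws-modules/alb module version in use.
output "minecraft_nlb_dns_name" {
  description = "Public DNS name that Minecraft clients connect to"
  value       = module.nlb.dns_name
}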

resource "aws_security_group" "nlb" {
  name        = "minecraft-nlb-sg"
  description = "Security group for Minecraft NLB"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port   = local.container_port
    to_port     = local.container_port
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    description = "Minecraft server port"
  }

  egress {
    from_port   = local.container_port
    to_port     = local.container_port
    protocol    = "tcp"
    description = "Minecraft port to VPC"
    cidr_blocks = [module.vpc.vpc_cidr_block]
  }

  tags = {
    Name        = "minecraft-nlb-sg"
    Environment = "Development"
    Project     = "Example"
  }
}</code></pre><p>This is a very straightforward approach. We create a network load balancer in the public subnets and attach a security group that allows connections from anywhere to the Minecraft server port. The egress rule only lets the load balancer forward those connections to addresses within our VPC. Then, we create the listener and a target group. We set create_attachment to false because the target group will be connected to the ECS module, and ECS registers the task IPs itself.</p><p>Now, the final part, let&#8217;s create the ECS Cluster with our Minecraft server! &#128170; We are going to use this <a href="https://registry.terraform.io/modules/terraform-aws-modules/ecs/aws/latest">module</a>.</p><pre><code>module "ecs" {
  source  = "terraform-aws-modules/ecs/aws"
  version = "~&gt; 5.12"

  cluster_name = "minecraft-servers"

  cluster_configuration = {
    execute_command_configuration = {
      logging = "OVERRIDE"
      log_configuration = {
        cloud_watch_log_group_name = "/aws/ecs/minecraft-servers"
      }
    }
  }

  fargate_capacity_providers = {
    FARGATE_SPOT = {
      default_capacity_provider_strategy = {
        weight = 100
      }
    }
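    # A hedged sketch, not part of the original repository: for workloads that
    # cannot tolerate Spot interruptions, an on-demand FARGATE provider could be
    # declared instead of FARGATE_SPOT. It is left commented out so the cluster
    # above keeps using Spot only; the weight value is an assumption.
    # FARGATE = {
    #   default_capacity_provider_strategy = {
    #     weight = 100
    #   }
    # }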
  }

  services = {
    minecraft-vanilla = {
      cpu    = 4096
      memory = 8192

      volume = [
        {
          name = "minecraft-storage"

          efs_volume_configuration = {
            file_system_id     = module.efs.id
            transit_encryption = "ENABLED"
            root_directory     = "/"
            authorization_config = {
              iam             = "ENABLED"
              access_point_id = module.efs.access_points["vanilla_minecraft"].id
            }
          }
        }
      ]

      # Move to private subnets and remove public IP
      assign_public_ip = false
      subnet_ids       = module.vpc.private_subnets

      # Configure load balancer
      load_balancer = {
        service = {
          target_group_arn = module.nlb.target_groups["minecraft-vanilla"].arn
          container_name   = "minecraft-vanilla-task"
          container_port   = local.container_port
        }
      }

      # Use the dedicated security group
      security_group_ids = [aws_security_group.ecs_service.id]

      # add access to EFS and KMS in the task role
      tasks_iam_role_policies = {
        efs_access = aws_iam_policy.efs_access_policy.arn
        kms_access = aws_iam_policy.efs_kms_access_policy.arn
        ssm_access = aws_iam_policy.ssm_session_manager_policy.arn
      }

      # Container definition(s)
      container_definitions = {

        minecraft-vanilla-task = {
          cpu    = 4096
          memory = 8192
          image  = "itzg/minecraft-server"

          port_mappings = [
            {
              name          = "minecraft-vanilla-container"
              containerPort = local.container_port
              hostPort      = local.container_port
              protocol      = "tcp"
            }
          ]

          environment = [
            {
              name  = "EULA"
              value = "TRUE"
            },
            {
              name  = "WHITELIST"
              value = local.whitelist_list
            },
            {
              name  = "DIFFICULTY"
              value = local.difficulty
            }
          ]

          mount_points = [
            {
              sourceVolume  = "minecraft-storage"
              containerPath = "/data"
              readOnly      = false
            }
          ]

          # Example image used requires access to write to root filesystem
          readonly_root_filesystem = false
          memory_reservation       = 100

          # Enable SSM Session Manager
          enable_execute_command = true
        }
      }

    }
  }


  # Create task execution role and attach policies for EFS
  create_task_exec_iam_role = true
  task_exec_iam_role_name   = "minecraft-exec-role"
  task_exec_iam_role_policies = {
    efs_access = aws_iam_policy.efs_access_policy.arn
    kms_access = aws_iam_policy.efs_kms_access_policy.arn
  }

  tags = {
    Environment = "Development"
    Project     = "Example"
  }
}
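
# A hedged sketch of what an EFS access policy for the task could look like. The
# policies referenced above live in iam.tf in the repository and may differ. The
# names "efs_access_sketch" and "minecraft-efs-access-sketch" are hypothetical, and
# "module.efs.arn" is assumed to be an output of the EFS module.
data "aws_iam_policy_document" "efs_access_sketch" {
  statement {
    effect = "Allow"
    actions = [
      "elasticfilesystem:ClientMount",
      "elasticfilesystem:ClientWrite",
      "elasticfilesystem:ClientRootAccess",
    ]
    resources = [module.efs.arn]
  }
}

resource "aws_iam_policy" "efs_access_sketch" {
  name   = "minecraft-efs-access-sketch"
  policy = data.aws_iam_policy_document.efs_access_sketch.json
}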

# Create a dedicated security group for the ECS service
resource "aws_security_group" "ecs_service" {
  name        = "minecraft-ecs-service-sg"
  description = "Security group for Minecraft ECS service"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port       = local.container_port
    to_port         = local.container_port
    protocol        = "tcp"
    description     = "Minecraft port from NLB"
    security_groups = [aws_security_group.nlb.id]
  }

  ingress {
    from_port       = 2049
    to_port         = 2049
    protocol        = "tcp"
    description     = "NFS Port"
    security_groups = [module.efs.security_group_id]
  }

  egress {
    from_port       = local.container_port
    to_port         = local.container_port
    protocol        = "tcp"
    description     = "Minecraft port to NLB"
    security_groups = [aws_security_group.nlb.id]
  }

  egress {
    from_port       = 2049
    to_port         = 2049
    protocol        = "tcp"
    description     = "NFS Port"
    security_groups = [module.efs.security_group_id]
  }

  egress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    description = "SSM Session Manager"
    cidr_blocks = ["0.0.0.0/0"] # SSM endpoints are AWS managed
  }

  tags = {
    Name        = "minecraft-ecs-service-sg"
    Environment = "Development"
    Project     = "Example"
  }
}
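
# A hedged sketch of the locals used throughout this post. The real definitions
# live in locals.tf in the repository; every value below is an assumption chosen
# for illustration. Skip this block if you already use the repository's locals.tf.
locals {
  cidr            = "10.0.0.0/16"
  azs             = ["eu-west-1a", "eu-west-1b"] # assumed region
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  container_port = 25565                 # default Minecraft Java Edition port
  whitelist_list = "PlayerOne,PlayerTwo" # hypothetical player names, comma-separated
  difficulty     = "normal"
}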


</code></pre><p>The code above may seem intimidating, but it becomes clearer once we go through it. Let&#8217;s dive in.</p><p>First, we define the logs and the capacity providers for our cluster. We select FARGATE_SPOT to save a few bucks and because we can tolerate an occasional interruption. If you run business-critical applications that are not self-healing and resumable, use FARGATE.</p><p>Then, we define the service in which our Minecraft Docker container will run. We set the CPU and RAM available for this service, and we define the attachment to our EFS. Note that the service will fail the first time it is deployed: we still need to add the policy to our EFS (remember the comment in the EFS module).</p><pre><code>volume = [
        {
          name = "minecraft-storage"

          efs_volume_configuration = {
            file_system_id     = module.efs.id
            transit_encryption = "ENABLED"
            root_directory     = "/"
            authorization_config = {
              iam             = "ENABLED"
              access_point_id = module.efs.access_points["vanilla_minecraft"].id
            }
          }
        }
      ]</code></pre><p>We disable public access to the service, which is why we have the NLB, and we also place everything that will be spawned within the service in private subnets. </p><pre><code>assign_public_ip = false
subnet_ids       = module.vpc.private_subnets</code></pre><p>We attach the load balancer to our service and ensure that we reference the target group and the container to which we will forward the traffic.</p><pre><code>load_balancer = {
  service = {
    target_group_arn = module.nlb.target_groups["minecraft-vanilla"].arn
    container_name   = "minecraft-vanilla-task"
    container_port   = local.container_port
  }
}</code></pre><p>Lastly, for the service, we will define the policies and security groups. You can find the IAM roles <a href="https://github.com/siakon89/minecraft-server">here</a> in the iam.tf file. The security group is mentioned at the end.</p><pre><code># Use the dedicated security group
security_group_ids = [aws_security_group.ecs_service.id]

# add access to EFS and KMS in the task role
tasks_iam_role_policies = {
  efs_access = aws_iam_policy.efs_access_policy.arn
  kms_access = aws_iam_policy.efs_kms_access_policy.arn
  ssm_access = aws_iam_policy.ssm_session_manager_policy.arn
}</code></pre><p>Now let&#8217;s move to the container definitions. We create our container with the name minecraft-vanilla-task and set up the port mappings, the image, and the CPU and RAM.</p><pre><code>minecraft-vanilla-task = {
  cpu    = 4096
  memory = 8192
  image  = "itzg/minecraft-server"

  port_mappings = [
    {
      name          = "minecraft-vanilla-container"
      containerPort = local.container_port
      hostPort      = local.container_port
      protocol      = "tcp"
    }
  ]
...</code></pre><p>Then we set the environment variables and the mount point for our EFS volume, and we also enable the execute command, in case we get stuck in a cave, mining and nearly dead &#128521;</p><pre><code>...
environment = [
  {
    name  = "EULA"
    value = "TRUE"
  },
  {
    name  = "WHITELIST"
    value = local.whitelist_list
  },
  {
    name  = "DIFFICULTY"
    value = local.difficulty
  }
]

mount_points = [
  {
    sourceVolume  = "minecraft-storage"
    containerPath = "/data"
    readOnly      = false
  }
]

# Example image used requires access to write to root filesystem
readonly_root_filesystem = false
memory_reservation       = 100

# Enable SSM Session Manager
enable_execute_command = true
</code></pre><p>Now that we are done with the container definitions, we set up the task execution role, so our container will have access to the EFS and the encryption key.</p><pre><code>  create_task_exec_iam_role = true
  task_exec_iam_role_name   = "minecraft-exec-role"
  task_exec_iam_role_policies = {
    efs_access = aws_iam_policy.efs_access_policy.arn
    kms_access = aws_iam_policy.efs_kms_access_policy.arn
  }</code></pre><p>And that is all! &#127881;&#127881;&#127881;</p><p>Let&#8217;s deploy this thing now.</p><h2>Deploy and Test</h2><p>I assume you already have Terraform installed and connected to your AWS account.</p><p>First, we run init to download our modules and providers.</p><pre><code>terraform init</code></pre><p>Then we plan to see what will be created.</p><pre><code>terraform plan</code></pre><p>Then we deploy.</p><pre><code>terraform apply</code></pre><p>Once everything is deployed, we need to add the policy to our EFS module, in place of the placeholder comment we left there.</p><pre><code>policy_statements = [
    {
      sid = "Example"
      actions = [
        "elasticfilesystem:ClientMount",
        "elasticfilesystem:ClientWrite",
        "elasticfilesystem:ClientRootAccess",
        "elasticfilesystem:DescribeFileSystems"
      ]
      principals = [
        {
          type        = "AWS"
          identifiers = [module.ecs.task_exec_iam_role_arn]
        }
      ]
    }
  ]</code></pre><p>And then deploy again.</p><p>By that point, you should have everything provisioned, with your service running smoothly and healthily.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QX5_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94244b6b-cefb-4de8-87ad-526373d25357_1596x728.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QX5_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94244b6b-cefb-4de8-87ad-526373d25357_1596x728.png 424w, https://substackcdn.com/image/fetch/$s_!QX5_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94244b6b-cefb-4de8-87ad-526373d25357_1596x728.png 848w, https://substackcdn.com/image/fetch/$s_!QX5_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94244b6b-cefb-4de8-87ad-526373d25357_1596x728.png 1272w, https://substackcdn.com/image/fetch/$s_!QX5_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94244b6b-cefb-4de8-87ad-526373d25357_1596x728.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QX5_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94244b6b-cefb-4de8-87ad-526373d25357_1596x728.png" width="1456" height="664" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/94244b6b-cefb-4de8-87ad-526373d25357_1596x728.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:664,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:114409,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/160762302?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94244b6b-cefb-4de8-87ad-526373d25357_1596x728.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QX5_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94244b6b-cefb-4de8-87ad-526373d25357_1596x728.png 424w, https://substackcdn.com/image/fetch/$s_!QX5_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94244b6b-cefb-4de8-87ad-526373d25357_1596x728.png 848w, https://substackcdn.com/image/fetch/$s_!QX5_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94244b6b-cefb-4de8-87ad-526373d25357_1596x728.png 1272w, https://substackcdn.com/image/fetch/$s_!QX5_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94244b6b-cefb-4de8-87ad-526373d25357_1596x728.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Log in to your AWS account and go to the ECS service to check your cluster.</p><p>If everything is healthy, run the following AWS CLI command from your terminal (you need to be authenticated to AWS in the CLI first) to get the public DNS name of your NLB.</p><pre><code>aws elbv2 describe-load-balancers --names minecraft-nlb --query 'LoadBalancers[0].DNSName' --output text</code></pre><p>Alternatively, go to the EC2 service, search for the load balancer, and copy the DNS name from there.</p><p>Now the only thing left is to launch Minecraft and connect to the server using the NLB DNS name.</p><p>&#127881;<strong>Congratulations!</strong>&#127881;</p><p>You have a working Minecraft server that is way overcomplicated and way more expensive than it needs to be, but you learned ECS along the way!</p><h2>Clean up</h2><p>DO NOT FORGET TO CLEAN UP YOUR ENVIRONMENT!!! 
It is expensive!</p><p>Run the command:</p><pre><code>terraform destroy</code></pre><h2>What comes next?</h2><p>Since this is an expensive solution, I would like to create a mechanism that allows me to start and stop the Minecraft server on demand. In the future, I will place the Minecraft server on public subnets to avoid the NAT gateway cost and create a hosted zone for my vanilla Minecraft server.</p><p>Feel free to reach out if you encounter any problems or have suggestions. </p><p>Till the next time, stay safe and have fun!</p><h2>Appendix</h2><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>https://github.com/terraform-aws-modules</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[AWS News ~ Issue #7]]></title><description><![CDATA[AWS news for week 17 2025]]></description><link>https://www.thelastdev.com/p/aws-news-issue-7</link><guid isPermaLink="false">https://www.thelastdev.com/p/aws-news-issue-7</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Mon, 28 Apr 2025 07:32:54 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7BdZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9a09b4e-a465-40db-b7f5-11f2a830d1c2_261x261.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We are changing the format of the AWS News. I want to provide you with some insights into new AWS releases and updates. Each issue will be split into three sections:</p><ul><li><p><strong>News</strong>: Any interesting news I find and its impact on the landscape.</p></li><li><p><strong>Events</strong>: Any upcoming events.</p></li><li><p><strong>Posts</strong>: Any interesting posts from my fellow peers.</p></li></ul><h2><strong>News</strong></h2><p>&#127878;This is huge! 
Amazon Redshift <a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-redshift-history-mode-third-party-saas-applications/">introduces history mode support for eight third-party SaaS applications</a>. Having history mode on your data lets you run analyses that take historical changes into account (e.g., churn). This is pretty much a fundamental property of analytical data &#128513; </p><p>AWS DMS is one of my favourite services and, at the same time, my least favourite (don&#8217;t tell me this makes no sense; for me it does). It&#8217;s always nice to see features that give you one less thing to worry about when you use DMS. AWS DMS Serverless now supports <a href="https://aws.amazon.com/about-aws/whats-new/2025/04/aws-dms-serverless-automatic-storage-scaling/">automatic storage scaling</a>. &#127881;</p><p>And for those who are trying to save a &#8220;few&#8221; bucks &#128184; (few and Redshift do not always go together), Amazon Redshift Serverless has announced <a href="https://aws.amazon.com/about-aws/whats-new/2025/04/serverless-reservations-discounted-pricing-option-amazon-redshift-serverless/">a new discounted pricing option</a>. 
AWS offers up to a 24% discount for the all-upfront option and a 20% discount for the no-upfront option. All the options are for a one-year term.</p><h2><strong>Events</strong></h2><p>29th of April: <a href="https://www.meetup.com/aws-user-group-athens/events/306767392/?notificationId=1495629187737280512&amp;eventOrigin=notifications">Ctrl+Alt+Migrate: Rebooting to the Cloud</a> from <a href="https://www.meetup.com/aws-user-group-athens/">AWS User Group Athens</a>. Do not miss this if you live in Athens!</p><p>AWS Community Day Adria: <a href="https://awscommunityadria.com/">Registration has opened</a>. This event will be on the 5th of September, and it is going to have excellent speakers! Jeff Barr will be doing the keynote!</p><p><em>Feel free to reach out if you want to add your upcoming event here.</em></p><p></p><h2>Posts</h2><p><a href="https://gitpushleadership.substack.com/p/are-we-really-a-team">Are We Really a Team?</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Christos Chatzis&quot;,&quot;id&quot;:293000154,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/394dae05-e34b-4d9f-bae9-69f85d070255_957x957.jpeg&quot;,&quot;uuid&quot;:&quot;137850aa-9e27-4b92-aa1b-893a2aec5664&quot;}" data-component-name="MentionToDOM">Christos Chatzis</span>. A really nice post about what it means to be a team. 
I agree that a team is so much more than just being together:</p><blockquote><p>Being a team isn&#8217;t about how often we&#8217;re together.</p><p>It&#8217;s about how deeply we&#8217;ve got each other&#8217;s backs&#8212;even when we&#8217;re apart.</p></blockquote><p></p><p><a href="https://blog.thecloudengineers.com/p/scaling-global-8-use-cases-where">Scaling Global: 8 Use Cases Where CloudFront Performs Best</a> by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Lefteris Karageorgiou&quot;,&quot;id&quot;:314296885,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3c96d85-f3e8-4107-bb93-5c928c58c466_446x446.png&quot;,&quot;uuid&quot;:&quot;ff30ca0c-5f3c-4087-a7b9-a1bd9d38ac17&quot;}" data-component-name="MentionToDOM">Lefteris Karageorgiou</span>. An informative post about Amazon&#8217;s CDN solution, CloudFront. Lefteris presents eight use cases where CloudFront shines.</p><blockquote><p>Remember that effective CDN implementation is not just about deployment&#8212;it's about continuous optimization.</p></blockquote><p></p><p>Till the next time, stay safe and have fun!</p>]]></content:encoded></item><item><title><![CDATA[AWS News ~ Issue #5-6]]></title><description><![CDATA[AWS news for week 15-16 2025]]></description><link>https://www.thelastdev.com/p/aws-news-issue-5-6</link><guid isPermaLink="false">https://www.thelastdev.com/p/aws-news-issue-5-6</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Wed, 23 Apr 2025 07:05:57 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7BdZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9a09b4e-a465-40db-b7f5-11f2a830d1c2_261x261.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I hope you all had a nice Easter. 
This issue is a double, covering two weeks.</p><h2><strong>News</strong></h2><ul><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-ecs-set-default-log-driver-blocking-mode/">Amazon ECS adds the ability to set a default log driver blocking mode</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-s3-tables-server-side-encryption-aws-kms-customer-managed-keys/">Amazon S3 Tables now support server-side encryption using AWS KMS with customer-managed keys</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-q-business-hallucination-mitigation-chat-responses/">Amazon Q Business launches support for hallucination mitigation in chat responses</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-vpc-peering-billing/">AWS simplifies Amazon VPC Peering billing</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/load-balancer-capacity-unit-reservation-gateway-load-balancers/">Load Balancer Capacity Unit Reservation for Gateway Load Balancers</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-s3-express-one-zone-reduces-storage-request-prices/">Amazon S3 Express One Zone reduces storage and request prices</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/aws-compute-optimizer-new-amazon-ec2-instance-types/">AWS Compute Optimizer now supports 57 new Amazon EC2 instance types</a></p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thelastdev.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Last Dev! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2><strong>Blog Posts</strong></h2><ul><li><p><a href="https://aws.amazon.com/blogs/machine-learning/build-a-finops-agent-using-amazon-bedrock-with-multi-agent-capability-and-amazon-nova-as-the-foundation-model/">Build a FinOps agent using Amazon Bedrock with multi-agent capability and Amazon Nova as the foundation model</a></p></li><li><p><a href="https://aws.amazon.com/blogs/big-data/accelerate-your-analytics-with-amazon-s3-tables-and-amazon-sagemaker-lakehouse/">Accelerate your analytics with Amazon S3 Tables and Amazon SageMaker Lakehouse</a></p></li><li><p><a href="https://aws.amazon.com/blogs/aws-insights/2025-aws-summit-london-top-5-recommended-sessions-for-business-leaders/">2025 AWS Summit London: Top 5 recommended sessions for business leaders</a></p></li><li><p><a href="https://aws.amazon.com/blogs/storage/migrating-files-to-fsx-for-windows-file-server-using-robocopy/">Migrating files to Amazon FSx for Windows File Server using Robocopy</a></p></li><li><p><a href="https://aws.amazon.com/blogs/storage/validate-recovery-readiness-with-aws-backup-restore-testing/">Validate recovery readiness with AWS Backup restore testing</a></p></li></ul><p></p><h2><strong>Events</strong></h2><ul><li><p>29th of April: <a href="https://www.meetup.com/aws-user-group-athens/events/306767392/?notificationId=1495629187737280512&amp;eventOrigin=notifications">Ctrl+Alt+Migrate: Rebooting to the Cloud</a></p></li><li><p>AWS Community Day Adria: <a href="https://awscommunityadria.com/">Registration has opened</a></p></li></ul><p>Till the next time, stay safe and have 
fun!</p>]]></content:encoded></item><item><title><![CDATA[AWS News ~ Issue #4]]></title><description><![CDATA[AWS news for week 14 2025]]></description><link>https://www.thelastdev.com/p/aws-news-issue-4</link><guid isPermaLink="false">https://www.thelastdev.com/p/aws-news-issue-4</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Mon, 07 Apr 2025 07:28:22 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7BdZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9a09b4e-a465-40db-b7f5-11f2a830d1c2_261x261.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>News</h2><ul><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-eks-bottlerocket-fips-amis-node-groups/">Amazon EKS Adds Support for Bottlerocket FIPS AMIs in Managed Node Groups</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-security-lake-internet-protocol-version-6/">Amazon Security Lake now supports Internet Protocol Version 6 (IPv6)</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-sns-internet-protocol-version-6/">Amazon SNS now supports Internet Protocol Version 6 (IPv6)</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-quicksight-dashboard-versioning-publish-analysis-dashboard/">Amazon QuickSight launches dashboard versioning and publish any analysis to any dashboard</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-quicksight-highlighting/">Amazon QuickSight now supports Highlighting</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/04/amazon-quicksight-q-embedded/">Amazon QuickSight launches Amazon Q in embedded QuickSight</a></p></li></ul><h2>Blog Posts</h2><ul><li><p><a href="https://aws.amazon.com/blogs/devops/validate-your-lambda-runtime-with-cloudformation-lambda-hooks/">Validate 
Your Lambda Runtime with CloudFormation Lambda Hooks</a></p></li><li><p><a href="https://aws.amazon.com/blogs/mt/simplify-aws-cost-data-analysis-with-amazon-q-and-quicksight/">Simplify AWS Cost Data Analysis with Amazon Q in QuickSight</a></p></li><li><p><a href="https://aws.amazon.com/blogs/database/build-low-latency-resilient-applications-with-amazon-memorydb-multi-region/">Build low-latency, resilient applications with Amazon MemoryDB Multi-Region</a></p></li><li><p><a href="https://aws.amazon.com/blogs/media/streamlining-content-compliance-automating-media-analysis-with-amazon-nova/">Streamlining content compliance: Automating media analysis with Amazon Nova</a></p></li><li><p><a href="https://aws.amazon.com/blogs/training-and-certification/beyond-the-server-room/">Beyond the server room: Transform your IT career with AWS</a></p></li><li><p><a href="https://aws.amazon.com/blogs/compute/simplifying-private-api-integrations-with-amazon-eventbridge-and-aws-step-functions-2/">Simplifying private API integrations with Amazon EventBridge and AWS Step Functions</a></p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thelastdev.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Last Dev! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><h2>Events in Greece</h2><ul><li><p>10-12 April: <a href="https://devoxx.gr/">DEVOXX GREECE</a></p></li></ul><p></p><p>Till the next time, stay safe and have fun!</p><p></p>]]></content:encoded></item><item><title><![CDATA[AWS News ~ Issue #3]]></title><description><![CDATA[AWS news for week 13 2025]]></description><link>https://www.thelastdev.com/p/aws-news-issue-3</link><guid isPermaLink="false">https://www.thelastdev.com/p/aws-news-issue-3</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Mon, 31 Mar 2025 19:33:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7BdZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9a09b4e-a465-40db-b7f5-11f2a830d1c2_261x261.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>News</h2><ul><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-eventbridge-scheduler-privatelink/">Amazon EventBridge Scheduler now supports AWS PrivateLink</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/03/aws-cloudformation-targeted-resource-scans-iac-generator/">AWS CloudFormation now supports targeted resource scans in the IaC generator</a></p></li><li><p><a href="http://aws.amazon.com/about-aws/whats-new/2025/03/deploy-storage-browser-amazon-s3-sample-applications/">Deploy Storage Browser for Amazon S3 quickly with AWS sample application</a></p></li><li><p><a 
href="https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-sagemaker-metadata-rules-standards-improve-data-governance/">Amazon SageMaker introduces metadata rules to enforce standards and improve data governance</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-datazone-metadata-rules-publishing/">Amazon DataZone now supports metadata rules for publishing</a></p><p></p></li></ul><h2>Blog Posts</h2><ul><li><p><a href="https://aws.amazon.com/blogs/compute/simplifying-private-api-integrations-with-amazon-eventbridge-and-aws-step-functions-2/">Simplifying private API integrations with Amazon EventBridge and AWS Step Functions</a></p></li></ul><h2>Events in Greece</h2><ul><li><p><a href="https://www.meetup.com/dddgreece/events/306804737/?notificationId=%3Cinbox%3E%21226827742-1742553171035&amp;eventOrigin=notifications">DDD Greece: #37: Beyond the Hype: Practical Insights on Data Mesh Adoption - April 2nd at Orfium</a> </p></li><li><p><a href="https://www.meetup.com/leadership-conversations-in-tech/events/306478349/">Ask the CTO - April 3rd at Kaizen Campus</a></p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thelastdev.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Last Dev! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[AWS News ~ Issue #2]]></title><description><![CDATA[AWS news for week 12 2025]]></description><link>https://www.thelastdev.com/p/aws-news-issue-2</link><guid isPermaLink="false">https://www.thelastdev.com/p/aws-news-issue-2</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Mon, 24 Mar 2025 17:36:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7BdZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9a09b4e-a465-40db-b7f5-11f2a830d1c2_261x261.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Week 12, 2025. I have not followed many updates, but here are some that caught my interest</p><h2>News</h2><ul><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-bedrock-rag-evaluation-generally-available/">Amazon Bedrock now supports RAG Evaluation (generally available)</a></p><p></p></li></ul><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thelastdev.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading The Last Dev! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p><h2>Blog Posts</h2><ul><li><p><a href="https://aws.amazon.com/blogs/big-data/using-amazon-s3-tables-with-amazon-redshift-to-query-apache-iceberg-tables/">Using Amazon S3 Tables with Amazon Redshift to query Apache Iceberg tables</a></p></li><li><p><a href="https://aws.amazon.com/blogs/business-intelligence/optimize-your-amazon-quicksight-implementation-a-guide-to-usage-analytics-and-cost-management/">Optimize your Amazon QuickSight implementation: a guide to usage analytics and cost management</a></p></li><li><p><a href="https://aws.amazon.com/blogs/compute/optimizing-network-footprint-in-serverless-applications/">Optimizing network footprint in serverless applications</a></p></li></ul><h2>DEVOXX Greece 2025</h2><p>Register: https://devoxx.gr/</p><p>DDD Meetup 2nd of April: <a href="https://www.meetup.com/dddgreece/events/306804737/?eventOrigin=group_upcoming_events">#37: Beyond the Hype: Practical Insights on Data Mesh Adoption</a></p>]]></content:encoded></item><item><title><![CDATA[How SageMaker Debugger works]]></title><description><![CDATA[Archive of my old post from my blog. 
Explaining how SageMaker Debugger works.]]></description><link>https://www.thelastdev.com/p/how-sagemaker-debugger-works</link><guid isPermaLink="false">https://www.thelastdev.com/p/how-sagemaker-debugger-works</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Mon, 24 Mar 2025 08:02:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!qHmf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This is an old post from the old blog thelastdev.com. I still remember how excited I was when this SageMaker feature was released! Well, I will try to revive that post here. I updated the link references (it&#8217;s been 5 years, after all &#128552;) and made sure the code here works as intended. I hope you find this post useful! Additionally, I have hidden some edits throughout the post that hint at how long ago I wrote it &#128064;</p><p>Now before we begin, please let me know what you want to see (obviously new posts) in this Substack by answering the following poll.</p><div class="poll-embed" data-attrs="{&quot;id&quot;:288320}" data-component-name="PollToDOM"></div><div><hr></div><p>I have been using SageMaker for some time now, both for personal and professional use. SageMaker provides a series of tools that every data scientist needs, but you are not obliged to use all of them to produce a complete result. For example, SageMaker has a tool for tuning your model&#8217;s hyperparameters called <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html">Automatic Model Tuning</a>. You can use the model tuner if you want, but if you skip it, you still get a complete result. 
In other words, you can go and use whatever tool you may need, even in your local machine (using the SageMaker SDK) and create, train, and deploy your model seamlessly.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qHmf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qHmf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png 424w, https://substackcdn.com/image/fetch/$s_!qHmf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png 848w, https://substackcdn.com/image/fetch/$s_!qHmf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png 1272w, https://substackcdn.com/image/fetch/$s_!qHmf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qHmf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png" width="724" height="283.19727891156464" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:460,&quot;width&quot;:1176,&quot;resizeWidth&quot;:724,&quot;bytes&quot;:284311,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/159166681?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qHmf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png 424w, https://substackcdn.com/image/fetch/$s_!qHmf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png 848w, https://substackcdn.com/image/fetch/$s_!qHmf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png 1272w, https://substackcdn.com/image/fetch/$s_!qHmf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00ca3d75-7f50-4821-8b36-c011fcc0f1a5_1176x460.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">OH MY GOD this is an old image, but still valid. SageMaker now has more tools and features.</figcaption></figure></div><p>When I create and train a model in SageMaker, I want a complete overview of the progress so that I can spot problems in the training process. For example, I want to know if my model has hit a plateau or if I have an exploding tensor. When you train the model on your own PC with your own GPU, you can monitor everything, but when you train it in the cloud on a cluster of machines, it is harder to monitor and debug everything. If something goes wrong, you need to know why and how to correct it. 
SageMaker now provides a tool called <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/train-debugger.html">Amazon SageMaker Debugger</a> that lets you follow the model&#8217;s training in detail. I will apply the Debugger to a previous post of mine on &#8220;classifying the Fashion-MNIST dataset&#8221; (I will update this post and repost it here).</p><p>Difficulty and costs</p><ul><li><p>Level: 300</p></li><li><p>Total Cost: Let&#8217;s say 1$ (with no Endpoint deployed)</p><ul><li><p>Amazon SageMaker Notebook: ml.system for 24 hours -&gt; 0.00364 x 24 -&gt; 0.087$ + ml.t3.medium for 12 hours -&gt; 0.7$ = 0.8$</p></li><li><p>Amazon SageMaker Training instance: 122 seconds in ml.c5.2xlarge -&gt; 0.0019 x 122 = 0.14</p></li><li><p>Debugger Instance: 244 seconds in ml.t3.medium -&gt; 0.003$</p></li></ul></li></ul><h1>Amazon SageMaker Debugger</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Sren!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2e01990-fce6-4b8c-83c2-f85a3feb474b_643x731.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Sren!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2e01990-fce6-4b8c-83c2-f85a3feb474b_643x731.png 424w, https://substackcdn.com/image/fetch/$s_!Sren!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2e01990-fce6-4b8c-83c2-f85a3feb474b_643x731.png 848w, https://substackcdn.com/image/fetch/$s_!Sren!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2e01990-fce6-4b8c-83c2-f85a3feb474b_643x731.png
1272w, https://substackcdn.com/image/fetch/$s_!Sren!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2e01990-fce6-4b8c-83c2-f85a3feb474b_643x731.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Sren!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2e01990-fce6-4b8c-83c2-f85a3feb474b_643x731.png" width="643" height="731" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d2e01990-fce6-4b8c-83c2-f85a3feb474b_643x731.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:731,&quot;width&quot;:643,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74849,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/159166681?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2e01990-fce6-4b8c-83c2-f85a3feb474b_643x731.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Sren!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2e01990-fce6-4b8c-83c2-f85a3feb474b_643x731.png 424w, https://substackcdn.com/image/fetch/$s_!Sren!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2e01990-fce6-4b8c-83c2-f85a3feb474b_643x731.png 848w, https://substackcdn.com/image/fetch/$s_!Sren!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2e01990-fce6-4b8c-83c2-f85a3feb474b_643x731.png 
1272w, https://substackcdn.com/image/fetch/$s_!Sren!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2e01990-fce6-4b8c-83c2-f85a3feb474b_643x731.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Fig 1. Training Job in SageMaker with Debugger</figcaption></figure></div><p>The Debugger is the fourth component in the equation, and it monitors the model, saves the metrics to S3, and can evaluate the metrics using something called Rules. Let&#8217;s take one piece at a time. 
You add a Debugger Hook to your model at the circled number 1 in Fig 1. Once the job starts, that Hook saves metrics from your model as <a href="https://github.com/awslabs/sagemaker-debugger/blob/master/docs/analysis.md#Tensor-1">tensors</a>: (2) it listens for the requested metrics, creates the tensors, and saves them to S3. The Debugger (3) can then evaluate the model in real time against the provided (built-in or custom) Rules and decide whether the training process was successful.</p><h2>Add a Debugging Hook</h2><p>Amazon SageMaker provides a series of built-in Collections for monitoring and saving the metrics of our model (in tensor format). <a href="https://github.com/awslabs/sagemaker-debugger/blob/master/docs/api.md#collection">Here is the list that we can use</a>. For this example, we save the biases, the weights, and the metrics of our model every 100 steps.</p><pre><code>from sagemaker.debugger import CollectionConfig, DebuggerHookConfig

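import sagemaker

# `sess` was not defined in the original snippet; a default SageMaker session is assumed here
sess = sagemaker.Session()
bucket = sess.default_bucket()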

collection_config_biases = CollectionConfig(name='biases')
collection_config_weights = CollectionConfig(name='weights')
collection_config_metrics = CollectionConfig(name='metrics')
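
# Hedged sketch (my own helper, not part of the SageMaker SDK). It assumes the
# hook saves every step that is a multiple of `save_interval`, starting at
# step 0, and lists the steps a run of `total_steps` steps would save.
def saved_steps(total_steps, save_interval):
    return list(range(0, total_steps, save_interval))

# For example, saved_steps(350, 100) gives [0, 100, 200, 300].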

debugger_hook_config = DebuggerHookConfig(
    s3_output_path=f"s3://{bucket}/fashion-mnist/hook",
    hook_parameters={
        'save_interval': '100'
    },
    collection_configs=[
        collection_config_biases,
        collection_config_weights,
        collection_config_metrics
    ]
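)</code></pre><p>To attach the Debugger Hook to your training job, pass it to the estimator through the <code>debugger_hook_config</code> argument. The <code>train_instance_type</code> and <code>train_instance_count</code> arguments below are the names used by version 1 of the SageMaker Python SDK; version 2 renamed them to <code>instance_type</code> and <code>instance_count</code>.</p><pre><code>import sagemaker
from sagemaker.tensorflow import TensorFlow

hyperparameters = {'epochs': 10, 'batch_size': 256, 'learning_rate': 0.001}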

estimator = TensorFlow(
    entry_point='model.py',
    train_instance_type='ml.p3.2xlarge',
    train_instance_count=1,
    model_dir='/opt/ml/model',
    hyperparameters=hyperparameters,
    role=sagemaker.get_execution_role(),
    base_job_name='tf-fashion-mnist',
    framework_version='1.15',
    py_version='py3',
    script_mode=True,
    debugger_hook_config=debugger_hook_config,
)</code></pre><p>In addition, you can create custom Debugger Hooks as shown in the following snippet. This snippet is from an AWS example on Debugger called <a href="https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-debugger/model_specific_realtime_analysis/autoencoder_mnist/autoencoder_mnist.ipynb">Visualizing Debugging Tensors</a>.</p><pre><code>debugger_hook_config = DebuggerHookConfig(
    s3_output_path=f"s3://{bucket}/fashion-mnist/hook",
    collection_configs=[
        CollectionConfig(
            name="all_tensors",
            parameters={
                "include_regex": ".*",
                "save_steps": "1, 2, 3"
            }
        )
    ]
)</code></pre><p>In this example, we create a Hook that saves all the tensors of our model (the <code>include_regex</code> value <code>.*</code> matches every tensor name) at steps 1, 2, and 3. The AWS example has an amazing result, and I urge you to go and read it! More on custom Hooks here.</p><h2>Analyze the Tensors</h2><p>To analyze the tensors you need the <code>smdebug</code> package; its documentation can be <a href="https://github.com/awslabs/sagemaker-debugger">found here</a>. With <code>smdebug</code> we fetch the tensors and explore them.</p><pre><code>from smdebug.trials import create_trial
from smdebug import modes
import numpy as np
import matplotlib.pyplot as plt


# Get the tensors from S3
s3_output_path = estimator.latest_job_debugger_artifacts_path()

# Create a Trial https://github.com/awslabs/sagemaker-debugger/blob/master/docs/analysis.md#Trial
trial = create_trial(s3_output_path)

# Get all the tensor names
trial.tensor_names()

# Get the values of the tensor `val_acc` for mode GLOBAL (validation accuracy)
values = trial.tensor("val_acc").values(modes.GLOBAL)
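
# Hedged sketch (my own heuristic, not a Debugger rule): treat the metric as
# being on a plateau when its last `window` values differ by less than
# `tolerance`. Both defaults are assumptions to tune for your model.
def is_plateau(series, window=5, tolerance=1e-3):
    recent = list(series)[-window:]
    if len(recent) != window:  # not enough samples yet
        return False
    return tolerance > max(recent) - min(recent)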

# Convert it to numpy array
values_eval = np.array(list(values.items()))
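
# Hedged alternative to the conversion above (my own helper, not part of
# smdebug): `values` is a dict keyed by step, so this splits it into two lists
# sorted by step, which lets a plot show real step numbers on the x-axis.
def split_steps_and_values(step_to_value):
    steps = sorted(step_to_value)
    return steps, [step_to_value[step] for step in steps]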

fig = plt.figure()
plt.plot(values_eval[:, 1])
fig.suptitle('Validation Accuracy', fontsize=20)
plt.xlabel('Intervals of sampling', fontsize=18)
plt.ylabel('Accuracy', fontsize=16)
fig.savefig('temp.jpg')</code></pre><p>First things first, let&#8217;s talk about what we just did. At line 8, we fetch the location of the tensors in S3, then we create a <a href="https://github.com/awslabs/sagemaker-debugger/blob/master/docs/analysis.md#Trial">trial</a> so that we can query the tensors (line 11). Once we have the trial, we can fetch all the tensor names (line 14). With a specific name, in our case <code>val_acc</code>, we fetch the values for the validation accuracy (line 17); we also pass a mode (here <code>modes.GLOBAL</code>) that controls which steps are read. Finally, we convert the values to a NumPy array and plot them.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!N0UP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd2ceb5-9367-4712-9984-5499b7e258fe_429x287.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!N0UP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd2ceb5-9367-4712-9984-5499b7e258fe_429x287.png 424w, https://substackcdn.com/image/fetch/$s_!N0UP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd2ceb5-9367-4712-9984-5499b7e258fe_429x287.png 848w, https://substackcdn.com/image/fetch/$s_!N0UP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd2ceb5-9367-4712-9984-5499b7e258fe_429x287.png 1272w, https://substackcdn.com/image/fetch/$s_!N0UP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd2ceb5-9367-4712-9984-5499b7e258fe_429x287.png 1456w"
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!N0UP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd2ceb5-9367-4712-9984-5499b7e258fe_429x287.png" width="429" height="287" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7bd2ceb5-9367-4712-9984-5499b7e258fe_429x287.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:287,&quot;width&quot;:429,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:37439,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/159166681?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd2ceb5-9367-4712-9984-5499b7e258fe_429x287.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!N0UP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd2ceb5-9367-4712-9984-5499b7e258fe_429x287.png 424w, https://substackcdn.com/image/fetch/$s_!N0UP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd2ceb5-9367-4712-9984-5499b7e258fe_429x287.png 848w, https://substackcdn.com/image/fetch/$s_!N0UP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd2ceb5-9367-4712-9984-5499b7e258fe_429x287.png 1272w, https://substackcdn.com/image/fetch/$s_!N0UP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd2ceb5-9367-4712-9984-5499b7e258fe_429x287.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>There are many ways to use the tensors. Once you define a trial, you can use the SageMaker trials page to create all of your plots. This feature is available only in SageMaker Studio.</p><h2>Add a Debugging Rule</h2><p>Now that we have a Hook that creates tensors from our model, we need to evaluate the model to see whether the training was successful. We will set a <a href="https://github.com/awslabs/sagemaker-debugger/blob/master/docs/analysis.md#Rules">SageMaker Rule</a> that reads the tensors from our Debugger Hook and evaluates them. Be careful: a Rule needs a Hook in place, because the Hook is what extracts the tensors that the Rule evaluates. There are some amazing built-in rules that will really help you evaluate your model. You can find the built-in rules <a href="https://github.com/awslabs/sagemaker-debugger/blob/master/docs/sagemaker.md#built-in-rules">here</a>. I have only used built-in rules so far, so we will not dive into the custom ones. I will leave that for a future post; for now, you can take a look at the awslabs examples <a href="https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-debugger/tensorflow_keras_custom_rule">here</a>.</p><pre><code>from sagemaker.debugger import Rule, rule_configs
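
# Hedged sketch (hypothetical helper, not part of the SageMaker SDK): after
# the job defined below has been fitted, the list returned by
# estimator.latest_training_job.rule_job_summary() can be passed in to get the
# names of the rules whose evaluation status is "IssuesFound".
def rules_with_issues(rule_summaries):
    return [
        summary["RuleConfigurationName"]
        for summary in rule_summaries
        if summary.get("RuleEvaluationStatus") == "IssuesFound"
    ]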

estimator = TensorFlow(
    entry_point='model.py',
    train_instance_type=train_instance_type,
    train_instance_count=1,
    model_dir=model_dir,
    hyperparameters=hyperparameters,
    role=sagemaker.get_execution_role(),
    base_job_name='tf-fashion-mnist',
    framework_version='1.15',
    py_version='py3',
    script_mode=True,
    debugger_hook_config=debugger_hook_config,
    rules=[
        Rule.sagemaker(rule_configs.overfit()),
        Rule.sagemaker(rule_configs.loss_not_decreasing())
    ],
)</code></pre><p>We have created two separate rules (more examples <a href="https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-debugger/tensorflow_builtin_rule/tf-mnist-builtin-rule.ipynb">here</a>).</p><ol><li><p>The <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/debugger-built-in-rules.html#overfit">overfit rule</a>, where SageMaker detects when the model is overfitting and reports it as <code>IssuesFound</code> in the rule&#8217;s evaluation status</p></li><li><p>The <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/debugger-built-in-rules.html#loss-not-decreasing">loss_not_decreasing rule</a>, where SageMaker monitors the loss and, if it stops decreasing, reports it in the rule&#8217;s evaluation status in the same way</p></li></ol><p>Furthermore, we can go to the tensors, explore them, and find out at what step this problem occurred, why, and how we can prevent it.</p><p>I have deliberately increased the epochs of my example to 100, which may cause the model to overfit. Let&#8217;s see how SageMaker reacts to that.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DpPm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc33870f6-6523-425f-beb5-1198bde07d00_1021x282.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DpPm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc33870f6-6523-425f-beb5-1198bde07d00_1021x282.png 424w, https://substackcdn.com/image/fetch/$s_!DpPm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc33870f6-6523-425f-beb5-1198bde07d00_1021x282.png 848w,
https://substackcdn.com/image/fetch/$s_!DpPm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc33870f6-6523-425f-beb5-1198bde07d00_1021x282.png 1272w, https://substackcdn.com/image/fetch/$s_!DpPm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc33870f6-6523-425f-beb5-1198bde07d00_1021x282.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DpPm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc33870f6-6523-425f-beb5-1198bde07d00_1021x282.png" width="1021" height="282" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c33870f6-6523-425f-beb5-1198bde07d00_1021x282.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:282,&quot;width&quot;:1021,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:134692,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/159166681?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc33870f6-6523-425f-beb5-1198bde07d00_1021x282.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DpPm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc33870f6-6523-425f-beb5-1198bde07d00_1021x282.png 424w, 
https://substackcdn.com/image/fetch/$s_!DpPm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc33870f6-6523-425f-beb5-1198bde07d00_1021x282.png 848w, https://substackcdn.com/image/fetch/$s_!DpPm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc33870f6-6523-425f-beb5-1198bde07d00_1021x282.png 1272w, https://substackcdn.com/image/fetch/$s_!DpPm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc33870f6-6523-425f-beb5-1198bde07d00_1021x282.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We can see that once the Training Job has started, we can go to Experiments, search the trials, and find our Job. The last tab, Debugger, holds all the rules, both built-in and custom. We will see the verdict of the Debugger once the job is done. SageMaker correctly captured the overfit. If we go to the tensors, we can see that the training accuracy is very close to 1 while the validation accuracy stays the same.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Vqwv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F376a6102-e318-4da0-ad40-3918e145c1ff_1016x181.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Vqwv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F376a6102-e318-4da0-ad40-3918e145c1ff_1016x181.png 424w, https://substackcdn.com/image/fetch/$s_!Vqwv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F376a6102-e318-4da0-ad40-3918e145c1ff_1016x181.png 848w, https://substackcdn.com/image/fetch/$s_!Vqwv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F376a6102-e318-4da0-ad40-3918e145c1ff_1016x181.png 1272w, https://substackcdn.com/image/fetch/$s_!Vqwv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F376a6102-e318-4da0-ad40-3918e145c1ff_1016x181.png 1456w" sizes="100vw"><img
src="https://substackcdn.com/image/fetch/$s_!Vqwv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F376a6102-e318-4da0-ad40-3918e145c1ff_1016x181.png" width="1016" height="181" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/376a6102-e318-4da0-ad40-3918e145c1ff_1016x181.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:181,&quot;width&quot;:1016,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:90031,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/159166681?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F376a6102-e318-4da0-ad40-3918e145c1ff_1016x181.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Vqwv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F376a6102-e318-4da0-ad40-3918e145c1ff_1016x181.png 424w, https://substackcdn.com/image/fetch/$s_!Vqwv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F376a6102-e318-4da0-ad40-3918e145c1ff_1016x181.png 848w, https://substackcdn.com/image/fetch/$s_!Vqwv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F376a6102-e318-4da0-ad40-3918e145c1ff_1016x181.png 1272w, https://substackcdn.com/image/fetch/$s_!Vqwv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F376a6102-e318-4da0-ad40-3918e145c1ff_1016x181.png 1456w" 
sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!N1-N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fdd595f-8574-4460-95a9-998607afce7c_428x286.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!N1-N!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fdd595f-8574-4460-95a9-998607afce7c_428x286.png 424w, https://substackcdn.com/image/fetch/$s_!N1-N!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fdd595f-8574-4460-95a9-998607afce7c_428x286.png 848w, https://substackcdn.com/image/fetch/$s_!N1-N!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fdd595f-8574-4460-95a9-998607afce7c_428x286.png 1272w, https://substackcdn.com/image/fetch/$s_!N1-N!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fdd595f-8574-4460-95a9-998607afce7c_428x286.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!N1-N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fdd595f-8574-4460-95a9-998607afce7c_428x286.png" width="428" height="286" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1fdd595f-8574-4460-95a9-998607afce7c_428x286.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:286,&quot;width&quot;:428,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:31983,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/159166681?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fdd595f-8574-4460-95a9-998607afce7c_428x286.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!N1-N!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fdd595f-8574-4460-95a9-998607afce7c_428x286.png 424w, https://substackcdn.com/image/fetch/$s_!N1-N!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fdd595f-8574-4460-95a9-998607afce7c_428x286.png 848w, https://substackcdn.com/image/fetch/$s_!N1-N!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fdd595f-8574-4460-95a9-998607afce7c_428x286.png 1272w, https://substackcdn.com/image/fetch/$s_!N1-N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1fdd595f-8574-4460-95a9-998607afce7c_428x286.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!27Fp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8b7a1d-39db-4237-b53d-a2c7dba105f9_427x285.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!27Fp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8b7a1d-39db-4237-b53d-a2c7dba105f9_427x285.png 424w, 
https://substackcdn.com/image/fetch/$s_!27Fp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8b7a1d-39db-4237-b53d-a2c7dba105f9_427x285.png 848w, https://substackcdn.com/image/fetch/$s_!27Fp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8b7a1d-39db-4237-b53d-a2c7dba105f9_427x285.png 1272w, https://substackcdn.com/image/fetch/$s_!27Fp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8b7a1d-39db-4237-b53d-a2c7dba105f9_427x285.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!27Fp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8b7a1d-39db-4237-b53d-a2c7dba105f9_427x285.png" width="427" height="285" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7e8b7a1d-39db-4237-b53d-a2c7dba105f9_427x285.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:285,&quot;width&quot;:427,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54714,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thelastdev.com/i/159166681?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8b7a1d-39db-4237-b53d-a2c7dba105f9_427x285.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!27Fp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8b7a1d-39db-4237-b53d-a2c7dba105f9_427x285.png 424w, 
https://substackcdn.com/image/fetch/$s_!27Fp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8b7a1d-39db-4237-b53d-a2c7dba105f9_427x285.png 848w, https://substackcdn.com/image/fetch/$s_!27Fp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8b7a1d-39db-4237-b53d-a2c7dba105f9_427x285.png 1272w, https://substackcdn.com/image/fetch/$s_!27Fp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e8b7a1d-39db-4237-b53d-a2c7dba105f9_427x285.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>You can also create alerts that fire when a rule condition is met: for example, you can receive a message or stop the training job. All of this will be covered in a future post. For now, you can follow the awslabs examples <a href="https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-debugger/tensorflow_action_on_rule">here</a>.</p><h1>Conclusion</h1><p>In this post, we saw how to debug a machine learning model in SageMaker. As you can see, once you get a grip on it, finding what you are looking for becomes easy. Personally, I am very excited about this addition to SageMaker (lol, blast from the past, it&#8217;s been 5 years!). Debugger is part of the AWS ecosystem and can communicate with several other AWS services, such as Lambda. I hope this post has shed some light on SageMaker Debugger. If you have any questions, suggestions, or requests, please let me know in the comments section below, or message me on BlueSky at <a href="https://bsky.app/profile/thelastdev.com">@thelastdev.com</a> or on <a href="https://www.linkedin.com/in/siaterliskonstantinos/">LinkedIn</a>.</p><p>Till the next time, stay safe and have fun!</p>]]></content:encoded></item><item><title><![CDATA[AWS News ~ Issue #1]]></title><description><![CDATA[AWS news for week 11 2025]]></description><link>https://www.thelastdev.com/p/aws-news-1</link><guid isPermaLink="false">https://www.thelastdev.com/p/aws-news-1</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Mon, 17 Mar 2025 16:29:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7BdZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9a09b4e-a465-40db-b7f5-11f2a830d1c2_261x261.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In week 11 of 2025, AWS had some nice updates and blog posts. Here is a list of what I found interesting. 
I know that these updates and posts lean heavily toward data; as we progress, we will cover other topics as well, such as networking and application development.</p><h2>News</h2><p>The most exciting update is S3 Tables, which I want to cover in a later blog post that explains how they work and builds a simple use case.</p><ul><li><p><a href="http://aws.amazon.com/about-aws/whats-new/2025/03/amazon-s3-tables-create-query-table-s3-console/">Amazon S3 Tables add create and query table support in the S3 console</a></p></li><li><p><a href="http://aws.amazon.com/about-aws/whats-new/2025/03/amazon-sagemaker-lakehouse-integration-s3-tables-generally-available/">Amazon S3 Tables integration with SageMaker Lakehouse is now generally available</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/03/ready-to-use-serverless-land-patterns-vs-code-ide/">Accelerate serverless development with ready-to-use Serverless Land Patterns in Visual Studio Code</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/03/aws-glue-data-catalog-views-glue-5-0/">Announcing support of AWS Glue Data Catalog views with AWS Glue 5.0</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-s3-access-grants-authentication-provider-permissions/">Amazon S3 Access Grants simplify authentication when using both IAM and Identity Provider permissions</a></p></li><li><p><a href="https://aws.amazon.com/about-aws/whats-new/2025/03/amazon-data-firehose-real-time-streaming-data-s3-tables/">Amazon Data Firehose now delivers real-time streaming data into Amazon S3 Tables</a></p></li></ul>
<h2>Blog Posts</h2><ul><li><p><a href="https://aws.amazon.com/blogs/storage/connect-snowflake-to-s3-tables-using-the-sagemaker-lakehouse-iceberg-rest-endpoint/">Connect Snowflake to S3 Tables using the SageMaker Lakehouse Iceberg REST endpoint</a></p></li><li><p><a href="https://aws.amazon.com/blogs/containers/building-multi-arch-containers-with-github-actions-in-aws/">Building multi-arch containers with GitHub Actions in AWS</a></p></li><li><p><a href="https://aws.amazon.com/blogs/aws/amazon-s3-tables-integration-with-amazon-sagemaker-lakehouse-is-now-generally-available/">Amazon S3 Tables integration with Amazon SageMaker Lakehouse is now generally available</a></p></li></ul><h2>AWS Developer Day 2025</h2><p>Recordings: https://aws.amazon.com/blogs/devops/watch-the-recordings-from-aws-developer-day-2025/</p>]]></content:encoded></item><item><title><![CDATA[Coming Soon: The Last Dev Substack!]]></title><description><![CDATA[I&#8217;m launching a brand-new Substack where I&#8217;ll be diving into DevOps, Cloud, Data Mesh, and more!]]></description><link>https://www.thelastdev.com/p/coming-soon-the-last-dev-substack</link><guid isPermaLink="false">https://www.thelastdev.com/p/coming-soon-the-last-dev-substack</guid><dc:creator><![CDATA[Konstantinos Siaterlis]]></dc:creator><pubDate>Sat, 15 Mar 2025 16:20:50 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!7BdZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9a09b4e-a465-40db-b7f5-11f2a830d1c2_261x261.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Exciting things are on the way! &#127881;</p><p>Oh gosh&#8230; it&#8217;s been some time&#8230; some time&#8230; but now I am back!</p><p>I&#8217;m launching a brand-new Substack where I&#8217;ll be diving into DevOps, Cloud, Data Mesh, and more! Whether you're a <strong>tech enthusiast, builder, or curious mind</strong>, this space will be packed with <strong>insights, strategies, and real-world lessons</strong> you won&#8217;t want to miss. I will start by re-uploading my old (OLD) posts from thelastdev.com, both to familiarize myself with posting here and because I do not want to lose them :)</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.thelastdev.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.thelastdev.com/subscribe?"><span>Subscribe now</span></a></p><p>&#128197; <strong>Stay tuned for the first post!</strong> In the meantime, hit <strong>Subscribe</strong> so you don&#8217;t miss a thing.</p><p>Drop a &#128293; in the comments if you're excited! Let me know what topics you'd love to see covered.</p><p>Till the next time, stay safe and have fun!</p>]]></content:encoded></item></channel></rss>