
AWS Pi Day 2025: Data Foundation for Analytics and AI

Every year on March 14 (3.14), AWS Pi Day highlights AWS innovations that help you manage and work with your data. What started in 2021 as a celebration of the 15th anniversary of the launch of Amazon Simple Storage Service (Amazon S3) has grown into an event that showcases how cloud technologies are transforming data management, analytics, and AI.

This year, AWS Pi Day returns with a focus on accelerating analytics and AI innovation through a unified data foundation on AWS. Across most enterprises, the data landscape is undergoing a profound transformation, with analytics and AI workloads increasingly converging around much of the same data and workflows. You need a simple way to access all your data and to use all your preferred analytics and AI tools in a single, integrated experience. On this AWS Pi Day, we are introducing a number of new capabilities that help you build unified and integrated data experiences.

The next generation of Amazon SageMaker: the center for all your data, analytics, and AI
At re:Invent 2024, we introduced the next generation of Amazon SageMaker, the center for all your data, analytics, and AI. SageMaker includes virtually all the components you need for data exploration, preparation, and integration, big data processing, fast SQL analytics, machine learning (ML) model development and training, and generative AI application development. With this new generation of Amazon SageMaker, SageMaker Lakehouse provides unified access to your data, and the SageMaker Catalog helps you meet your governance and security requirements. You can read the launch blog post written by my colleague Antje to learn more.

At the core of the next generation of Amazon SageMaker is SageMaker Unified Studio, a single data and AI development environment where you can use all your data and tools for analytics and AI. SageMaker Unified Studio is now generally available.

SageMaker Unified Studio enables collaboration between data scientists, analysts, engineers, and developers as they work on data, analytics, AI workflows, and applications. It brings familiar tools from AWS analytics and artificial intelligence and machine learning (AI/ML) services, including data processing, SQL analytics, ML model development, and generative AI application development, into a single user experience.

SageMaker Unified Studio

SageMaker Unified Studio also brings selected Amazon Bedrock capabilities into SageMaker. You can now build generative AI applications with foundation models (FMs) and advanced features such as Amazon Bedrock Knowledge Bases, Amazon Bedrock Guardrails, Amazon Bedrock Agents, and Amazon Bedrock Flows to create tailored solutions.

Last but not least, Amazon Q Developer is now generally available in SageMaker Unified Studio. Amazon Q Developer provides generative AI-powered assistance for data and AI development. It helps you with tasks such as writing SQL queries, building extract, transform, and load (ETL) jobs, and troubleshooting, and it is available in the Free Tier and in the Pro Tier for existing subscribers.

To learn more about SageMaker Unified Studio, read this recent blog post written by my colleague Donnie.

At re:Invent 2024, we launched Amazon SageMaker Lakehouse as part of the next generation of SageMaker. SageMaker Lakehouse unifies all your data across Amazon S3 data lakes, Amazon Redshift data warehouses, and third-party data sources. It helps you build powerful analytics and AI/ML applications on a single copy of your data. SageMaker Lakehouse gives you the flexibility to access and query your data with Apache Iceberg-compatible tools and engines. In addition, zero-ETL integrations automate the process of bringing data into SageMaker Lakehouse from AWS data sources such as Amazon Aurora or Amazon DynamoDB and from applications such as Salesforce, Facebook Ads, Instagram Ads, ServiceNow, SAP, Zendesk, and Zoho CRM. You can find the full list of integrations in the SageMaker Lakehouse FAQ.

Building a data foundation with Amazon S3
Building a data foundation is the cornerstone of accelerating analytics and AI workloads, enabling organizations to seamlessly manage, discover, and use their data assets at any scale. Amazon S3 is the world's best place to build a data lake, with virtually unlimited scale, and it provides the essential foundation for this transformation.

I am always amazed by the scale at which we operate Amazon S3: it currently holds over 400 trillion objects and exabytes of data, and it processes a staggering 150 million requests per second. A decade ago, few customers stored even a petabyte (PB) of data on S3. Today, thousands of customers have surpassed the 1 PB milestone.

Amazon S3 stores exabytes of tabular data and averages over 15 million requests per second to that tabular data. To reduce the undifferentiated heavy lifting of managing your tabular data in S3 buckets, we announced Amazon S3 Tables at AWS re:Invent 2024. S3 Tables are specifically optimized for analytics workloads, delivering up to 3x faster query throughput and up to 10x higher transactions per second compared with self-managed tables.

Today, we are announcing the general availability of the Amazon S3 Tables integration with Amazon SageMaker Lakehouse. S3 Tables now integrate with SageMaker Lakehouse, making it easy for you to access S3 tables from AWS analytics services such as Amazon Redshift, Amazon Athena, Amazon EMR, and AWS Glue, as well as from Apache Iceberg-compatible engines such as Apache Spark or PyIceberg. SageMaker Lakehouse enables centralized management of fine-grained data access permissions for S3 tables and other sources, and it enforces those permissions consistently across all engines.
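To illustrate how an engine addresses an S3 table through this integration, here is a minimal sketch. The table bucket, namespace, and table names are hypothetical, and the `s3tablescatalog` federated catalog prefix is an assumption you should verify against the S3 Tables documentation for your account:

```python
# Minimal sketch (hypothetical names): build the fully qualified Athena
# identifier for a table stored in an S3 table bucket, as surfaced through
# the SageMaker Lakehouse catalog hierarchy.

def athena_table_ref(table_bucket: str, namespace: str, table: str) -> str:
    """Return a fully qualified Athena identifier for an S3 table."""
    return f'"s3tablescatalog/{table_bucket}"."{namespace}"."{table}"'

query = f"SELECT COUNT(*) FROM {athena_table_ref('analytics-bucket', 'sales', 'orders')}"
print(query)

# The resulting string can then be submitted with, for example:
# boto3.client("athena").start_query_execution(QueryString=query, ...)
```

The same three-level naming (catalog, namespace, table) applies when you query from Redshift, EMR, or Spark, with each engine's own identifier syntax.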

For those of you who use a third-party catalog, have a custom catalog implementation, or only need basic read and write access to tabular data in a single table bucket, we have added new APIs that are compatible with the Iceberg REST Catalog standard. This lets any Iceberg-compatible application seamlessly create, update, list, and delete tables in an S3 table bucket. You can also use S3 Tables with SageMaker Lakehouse for unified management of all your tabular data, along with data governance and fine-grained access controls.
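As a sketch of how a client might point at that REST endpoint, the snippet below builds a configuration dictionary in the shape the Iceberg REST catalog convention uses. The endpoint URL pattern, SigV4 property names, and the table bucket ARN are assumptions to verify against the S3 Tables documentation:

```python
# Sketch (property names and endpoint shape are assumptions): configuration
# for an Iceberg REST catalog client targeting the S3 Tables REST endpoint,
# signed with SigV4 against the "s3tables" service.

def rest_catalog_config(region: str, table_bucket_arn: str) -> dict:
    """Build an Iceberg REST catalog configuration for S3 Tables."""
    return {
        "type": "rest",
        "uri": f"https://s3tables.{region}.amazonaws.com/iceberg",
        "warehouse": table_bucket_arn,
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": region,
    }

config = rest_catalog_config(
    "us-east-1",
    "arn:aws:s3tables:us-east-1:111122223333:bucket/analytics-bucket",  # hypothetical ARN
)

# With PyIceberg installed, a dict like this can be passed to
# pyiceberg.catalog.load_catalog("s3tables", **config)
```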

To make S3 tables easier to access, we have released updates to the AWS Management Console. You can now create a table, populate it with data, and query it with Amazon Athena directly from the S3 console, making it simpler to get started and analyze data in S3 table buckets.

The following screenshot shows how to access Athena directly from the S3 console.

S3 console: Create table with Athena

When I choose Query tables with Athena or Create table with Athena, the Athena console opens with the correct data source, catalog, and database preselected.

S3 tables in Athena

Since re:Invent 2024, we have added new capabilities to S3 Tables at a rapid pace. For example, we added schema definition support to the CreateTable API, and you can now create up to 10,000 tables in an S3 table bucket. We also launched S3 Tables in eight additional AWS Regions, most recently Asia Pacific (Seoul, Singapore, Sydney) on March 4, with more to come. You can refer to the S3 Tables AWS Regions page of the documentation for the list of eleven Regions where S3 Tables are available today.

Amazon S3 Metadata, announced at re:Invent 2024, has been generally available since January 27. It is the fastest and easiest way to discover and understand your S3 data, with automated, easily queried metadata that is updated in near real time. S3 Metadata works together with S3 object tags. Tags help you logically group data for a variety of purposes, such as applying IAM policies for fine-grained access, specifying tag-based filters to manage object lifecycle rules, and selectively replicating data to another Region. In Regions where S3 Metadata is available, you can capture and query custom metadata stored as object tags. To reduce the cost of using object tags with S3 Metadata, Amazon S3 has reduced the price of S3 object tagging by 35 percent in all Regions, making custom metadata more affordable.
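To show the shape of the object tags that S3 Metadata can surface, here is a small sketch that builds the `Tagging` structure the S3 `put_object_tagging` API expects. The bucket, key, and tag names are hypothetical:

```python
# Sketch (hypothetical bucket, key, and tags): shape the TagSet structure
# used by S3's put_object_tagging API. Custom metadata stored this way can
# then be captured and queried through S3 Metadata in supported Regions.

def tag_set(tags: dict) -> dict:
    """Build the Tagging parameter for put_object_tagging from a plain dict."""
    return {"TagSet": [{"Key": k, "Value": v} for k, v in sorted(tags.items())]}

tagging = tag_set({"team": "analytics", "classification": "internal"})
print(tagging)

# The structure can then be applied with, for example:
# boto3.client("s3").put_object_tagging(
#     Bucket="my-bucket", Key="data/part-0000.parquet", Tagging=tagging
# )
```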

AWS Pi Day 2025
Over the years, AWS Pi Day has showcased major milestones in cloud storage and data analytics. This year, the AWS Pi Day virtual event offers a range of sessions designed for developers, technical decision-makers, data engineers, AI/ML practitioners, and IT leaders. Key highlights include deep dives, live demos, and expert sessions covering all the services and capabilities discussed in this post.

By joining this event, you will learn how to accelerate your analytics and AI innovation. You will see how to use S3 Tables, with native Apache Iceberg support, and S3 Metadata to build scalable data lakes that serve both traditional analytics and emerging AI/ML workloads. You will also discover how the next generation of Amazon SageMaker, the center for all your data, analytics, and AI, helps your teams collaborate and build faster from a unified studio, using familiar AWS tools to access all your data, whether it is stored in data lakes, data warehouses, or third-party and federated data sources.

For those who want to stay on top of the latest cloud trends, AWS Pi Day 2025 is an event you cannot miss. Whether you are building data lakehouses, training AI models, creating generative AI applications, or optimizing analytics workloads, the insights shared will help you maximize the value of your data.

Tune in today and explore the latest innovations in cloud data. Don't miss the opportunity to engage with AWS experts, partners, and customers who are shaping the future of data, analytics, and AI.

If you miss the virtual event on March 14, you can visit the event page at any time; we will keep all the content available there on demand.

– Seb

